Heart energy signature spectrogram for cardiovascular diagnosis
© Kudriavtsev et al; licensee BioMed Central Ltd. 2007
Received: 24 October 2006
Accepted: 04 May 2007
Published: 04 May 2007
A new method and application is proposed to characterize intensity and pitch of human heart sounds and murmurs. Using recorded heart sounds from the library of one of the authors, a visual map of heart sound energy was established. Both normal and abnormal heart sound recordings were studied. Representation is based on Wigner-Ville joint time-frequency transformations. The proposed methodology separates acoustic contributions of cardiac events simultaneously in pitch, time and energy. The resolution accuracy is superior to any other existing spectrogram method. The characteristic energy signature of the innocent heart murmur in a child with the S3 sound is presented. It allows clear detection of S1, S2 and S3 sounds, S2 split, systolic murmur, and intensity of these components. The original signal, heart sound power change with time, time-averaged frequency, energy density spectra and instantaneous variations of power and frequency/pitch with time, are presented. These data allow full quantitative characterization of heart sounds and murmurs. High accuracy in both time and pitch resolution is demonstrated. Resulting visual images have self-referencing quality, whereby individual features and their changes become immediately obvious.
1.1 Cardiac auscultation
Cardiac auscultation is a difficult skill to acquire and today most medical students graduate without the ability to determine whether a heart sound or murmur is normal or abnormal [1, 2]. Evidence also indicates that this skill is not acquired later in practice [3, 4]. There is question that despite improved heart sound teaching methods  whether improvement in this clinical skill has occurred. This diagnostic deficit results in, (a) certain patients with an undiagnosed organic cardiac lesion will suffer ill health or possible death at a later date, or (b) in the case of the innocent murmur, present in at least 72% of normal children [6, 7], expensive cardiac investigation must be carried out to reach this diagnosis. The availability of a new quantitative digitally based computer method, such as herein described, and which with high accuracy can determine and quantify key heart sound variations (i.e. frequency/pitch, intensity, timing, energy, sound split and ejection click), will present a valuable asset to the delivery of health care.
The approach used is based on spectrograms representing a dynamic graphic image of heart sound intensity in time and frequency. At present, current methods of spectral display are not generally understood or even employed in clinical medicine. We propose new method and format that will enable better characterization of heart sounds and hopefully will present a new foundation for subsequent clinical implementation and testing.
1.2 Phonocardiograms (PCG)
The PCG is a display of the heart sound signal showing that heart sounds and murmurs can provide useful information to the physician by complementing cardiac auscultation. Basic methodology of distinguishing cardiac murmurs from the PCG is the same as interpreting murmurs from auscultation. However, it provides additional information about timing of cardiac phases and events as well as serving as a digital record that can be utilized to characterize dynamic changes associated with therapy and course of the disease. PCG complements auscultation.
The major PCG clinical drawback is that it does not present information on frequency (pitch) of heart sounds and their components, one of the major deciding factors for murmur clinical interpretation. It does not have the ability to differentiate separate multiple (folded) frequencies of various sounds and presents no information concerning dynamic changes of energy (power) stored in the sound. Other deficiencies arguably include signal filtration effects (change of visual representation due to filtration) and presence of artifacts and noises that can visually mask weak sounds. Challenges in pinpointing start and end points of certain sounds have been reported. End point positions will also depend on the applied filter, which add additional uncertainty. Manual segmentation (separation of heart sound components) may be another problem as well.
PCG never achieved acceptance as a routine clinical investigative method , but did find a valuable place in clinical investigation and research. However current newly developed "system science"  and signal processing computational technologies in combination with a digital sound recording technologies, electronic recording stethoscopes, advanced new vibration sensors  and finally extraordinary computing power now afforded to PDA's, tablet PCs, palm PCs, laptops and MP3 players/recorders make it now possible to completely revitalize old PCG-based approaches.
1.3 Heart sound spectrography
The concept of heart sound spectral display was first introduced by McKusick in 1955  and in a subsequent series of his clinical publications [12–14]. (Victor A. McKusick, M.D., Professor of Medical Genetics and Cardiologist by training, The Johns Hopkins University School of Medicine and a physician-scientist widely acknowledged as the father of genetic medicine, is a recipient of the National Medal of Science, 2002.) This display provides added frequency (pitch) dimension to the PCG signal display. While this work did not receive significant reception by clinicians, there is now a renewed interest in this approach both in clinical medicine and in biomedical engineering research. Using spectrograms obtained by McKusick, Don Michael  illustrated the intrinsic properties of various heart lesions in his monograph "Auscultation of the Heart". Similar works has been recently reported by Balster et al. , Nopponen and Lukkarinen [18, 19]. Tovar-Corona et al. [20, 21], Bhatikar et al. , Tuchinda and Thompson [26, 27] utilized wavelet-based transform to obtain time varying scalogram maps. The spectrogram offers additional insight into time dependent change of murmur frequency. Donnerstein  correlated spectrogram frequency characteristics with Doppler echo velocity. Tavel & Katz [23, 24] reported a method of clinical differentiation of aortic stenosis from innocent murmur using spectrogram measurements. Finally, Tavel  indicated great promise for this approach for clinical diagnosis.
Unfortunately methods presented in [9–19, 23–25] use various forms of the Short Term Fast Fourier Transform (STFT) to obtain instantaneous frequency characteristics of signals, and all these methods are subject to the "quantum uncertainty" theorem, which states that a signal and its Fourier transform can not both have small support , and that frequency and time and both can not be determined to arbitrary precision [45, 47]. The resulting outcome of this drawback is a non-unique, low fidelity image, which changes depending on its frequency resolution [29, 30]. Also, heart sounds are nonlinear, non-sinusoidal and exponential signals, and it has been demonstrated in signal processing literature  that Fourier transform is not mathematically appropriate method to study such signals.
Tuchinda and Thompson , Tovar-Corona et al. [20, 21], Bhatikar et al.  utilized continuous wavelet based transformation (CWT) to develop spectrogram looking maps that present wavelet scale variation in time (scaleograms). CWT approach is not as well established in clinical studies as traditional spectrogram approach [8–19] and is presently emerging. Unlike for STFT spectrograms, time and frequency resolution of the CWT is non-uniform in the entire time-frequency domain . At high frequencies, there is good time resolution and bad frequency resolution. At low frequencies, we have better frequency resolution and bad time resolution. Accordingly, this results in smearing the time-frequency representation of the signal in time at low frequencies. The speed of wavelet transform computation and improved resolution over the STFT are the primary reasons that the wavelet transforms have become a popular analysis tool . Graphic results presented by Tuchinda and Thompson  also fail to provide sufficient qualitative resolution and have a strong visual "skewness" as compared to traditional spectrograms.
There are numerous recent publications on the subject of digital recording and analysis of heart sounds. Green et al.  discuss optimal ways of recording heart murmur findings using SNOMED templates, DeGroff et al.  suggest a potential for computerized frequency analysis to further improve the accuracy of murmur assessment and Nigam et al. [34, 35] introduce new ways of segmenting heart sound signal. Finley et al.  demonstrated that email digital recordings of children's heart sounds are of diagnostic quality and allow accurate distinction between innocent and pathologic murmurs in >90% of cases. Marcus et al.  and Collins et al.  use heart sound recordings to correlate S3 estimates, BNP levels and CHF diagnosis, demonstrating very high specificity (85–90%) of digital heart sound recordings for CHF diagnosis in patients over 50 years old. Kudriavtsev et al.  demonstrated that Still's murmurs have narrow spectral bandwidth, with this being a significant feature differentiating them from abnormal murmurs.
We conclude that there is a clear upsurge of clinical interest in spectrographic representation of heart sounds. However existing signal processing methods lack in accuracy and resolution. Unlike other short term Fourier transform based approaches [15–19] and Gabor's transformation [32, 48, 49], which offer approximation to instantaneous energy distribution of a signal, the Wigner-Ville distribution [45, 46] has been derived to compute the signal energy at each time instant, exactly utilizing knowledge of the entire signal to compute time-frequency properties for each moment in time. The method used in this study is based on the Wigner-Ville distribution and is called Heart Energy Signature (HES).
HES represents a unique state of dynamically changing multi-component signal of the heart beat. It can be visualized as an image of the instantaneous heart pulsation energy distribution in both frequency and time domains. It is intended for use as an individual biometric characteristic for heart sound interpretation.
2.1 Wigner-Ville distribution function
The pseudo Wigner Ville Distribution [40–42] is a form of the spectrogram, which is based on joint time-frequency distribution (Eugene Paul Wigner (1902–1995), Nobel prize laureate in Physics, introduced quasi-probability distribution in 1932 study quantum corrections to classical statistical mechanics). Wigner's probability function [44, 45] addresses a question of the Heisenberg quantum uncertainty  – that momentum and position of the particle can not be determined to arbitrary precision (quantum physics theory). For a quantum particle described by its probability function of coordinates, Wigner has developed a probability distribution of the particle to simultaneously have particular coordinates and momentum . Ville  further developed Wigner's function to compute the instantaneous frequency of the signal at each time instant. The resulting Wigner-Ville distribution of time and frequency [32, 40, 41] attempts to express frequency as a function of time. Since signal frequency is related to signal energy, one can interpret Wigner-Ville distribution as the energy map of a signal in time and frequency.
2.2 Heart energy signature (HES)
Murmur frequency resolution for STFT (Fig. 4A, and Figs. 7(A-D))
Window size, w
Mean frequency, Hz
Peak frequency, Hz
Low Frequency, Hz
High frequency, Hz
Quantitative comparisons of murmur frequency resolution as presented on Fig. 2B
Half-bandwidth HB, Hz
Low frequency LF, Hz
High frequency HF, Hz
Tovar-Corona et al. [20, 21] described similarly appearing contour graphs obtained using the continuous wavelet (CWT) based method. This method is now gaining acceptance in signal processing and mathematical details of this method are discussed elsewhere [30–32]. The CWT method presents results in wavelet scale – time map, and not on frequency (pitch) time map. Thus, accurately correlating wavelet scales with frequencies is difficult and for this reason is beyond the scope of this study.
2.3 Mathematical method and computational implementation
find the minimum x min and the maximum x max values of the signal
divide the signal by 0.5*|x max - x min|
Presentation of the time signal in the normalized form is important, since the same signal can appear differently at different amplitude scales. Furthermore, normalization of the heart signal creates the signal presentation with easily computed proportionality relationships between the amplitudes of the signal at various time instances.
After normalization, the next step in the heart sound processing is computation of heart sound energy, as described in the following section.
2.3.3 HES derivation using joint time frequency transformation
The time plot of the heart sound PCG displays the amplitude of the sound at each instant, i.e. no information about the energy is displayed. An accepted principle in acoustics is that the energy of the single frequency acoustic signal at each instant is proportional to the squared amplitude of the signal and the squared frequency of the signal. Computation of the acoustic energy is particularly difficult where the acoustic signal consists of many signals with fast changing frequency components. In this case, the acoustic energy must be presented in the form reflecting its energy content at each instant at the various frequencies contained in the signal. Thus, one must compute acoustic energy as a function of both time and frequency: E = E(t, f).
The best method to compute heart sound energy is to utilize joint time-frequency distribution (JTFD). JTFD reflects the distribution of the signal energy in the time-frequency plane [51, 52]. However, JTFD may not mathematically satisfy energy properties, i.e. to be positive throughout time-frequency plane. In order for distribution to have the same properties as energy, the chosen distribution has been modified to be a real positive value at each point of the time-frequency plane. Steps to obtain such distribution are outlined below.
The WVD can be regarded as theoretically optimal in that a maximum number of desirable mathematical properties are satisfied . In the field of the signal processing all time-frequency distributions of Cohen's class can be computed by means of convolution of the Wigner distribution with a two-dimensional impulse response function .
where A and σ are real positive constants.
The WVD and PWVD may not necessary be positive functions at each point on the time-frequency domain for general signals. From the energy concept, it would be more convenient to work with a positive function, as in the case of magnitude of FFT. The WVD can artificially be made positive by simply calculating its absolute value at each point. The common interpretation of WVD as an energy density can thus be allowed, or the intensity of a signal, to be simultaneous in time and frequency.
Since for general signals, the WVD takes on negative values, the absolute positive form |PWVD xx (t,f)| of the PWVD is used in the format for the HES. This guarantees the distribution to be positive in the time-frequency plane and makes straightforward interpretation of the distribution as the signal energy in the time-frequency.
The absolute positive form of the PWVD is used for computation of the HES. Thus, the preferred method to compute heart sound energy distribution is as follows: E(t, f) = |PWVD xx (t, f, A, σ)|,
where A = 1.0, σ2 = 10-5, t ∈ [τ, τ + T], f ∈ [f 1, f 2].
Accordingly the integral of the WVD over the frequency variable at a certain time t yields the instantaneous signal power at that time. Using the energy density interpretation of the PWVD, the signal energy at time t and frequency f contained in a cell dt by df can be found as |PWVD xx (t, f)|dtdf . Other important signal characteristics that can be defined from the PWVD include the instantaneous energy of the signal, or signal power  P(t) = ∫|PWVD xx (t, f)|df. (11a)
2.4 Heart energy signature (HES) format
The HES format utilizes several additional averaged and instantaneous characteristics which may be extracted from the HES image and source signal. The components of the format to present heart sound energy are schematically illustrated in Figs. 1(A,B) and in the graphic form of Figs. 2(A,B). Figs. 1(A,B) illustrate schematically generic PCG (oscillographic heart sound display) of two heart beats and its HES spectrogram reflection. It includes S1, S2, S3 sounds and also illustrate S2 split as well as systolic and diastolic intervals. Fig. 1B demonstrates (maps) heart sound components as energy contours, with shape depending upon energy distribution in time and frequency. Thus, additional heart sounds or murmurs as well as normal S1 and S2 will be manifested as additional energy contours and missing sounds will lead to significant reduction or complete disappearance of these contours. Contour values inside the dark zones always exceed certain predefined energy threshold, for example 20% of maximum energy or 50% of maximum energy.
A1. The distribution of the heart sound energy simultaneously in time and frequency E = E(t, f), t ∈ [τ, τ + T], f ∈ [f 1, f 2] (13)
where E is heart sound energy distribution, t is time, f is frequency.
B1. The normalized heart sound corresponding to the heart sound energy x = x(t), t ∈ [τ, τ + T], x(t) ∈ [- 1, +1] (14)
C1. The instantaneous energy of the heart sound signal, or heart sound power, corresponding to x(t) P = P(t) t ∈ [τ, τ + T] (15a)
D1. Instantaneous heart sound power, corresponding to x(t) at a particular frequency f P = P(t) t ∈ [τ, τ + T] (15b)
E1. The energy density spectrum of the heart sound, corresponding to x(t) D = |X(f)|2, f ∈ [f 1, f 2] (16)
F1. Instantaneous mean and peak frequency of the heart sound signal, corresponding to E(t,f). Instant peak frequency (IPF) = frequency f* for given t*, for which E(t*,f*) = Max(E(t*,f)) (17)
a two-dimensional image (2D) representing the distribution of the heart sound energy simultaneously in time and frequency as defined in A1 (Fig. 3A)
a time plot of the normalized heart sound corresponding to the heart sound energy as defined in B1 (Fig. 3B)
a plot of the instantaneous energy of the heart sound, or heart sound power, as defined in C1 (Fig. 3C)
a plot of the instantaneous energy of the heart sound at a given frequency, or heart sound power at a given frequency, as defined in D1 (Fig. 3D)
a plot of the energy density spectrum of the heart sound, as defined in E1 (Fig. 3E).
a plot of the time averaged energy density spectrum computed by Fast Fourier Transform FFT (Fig. 3E2)
These plots help to provide a comprehensive description and quantitative differentiation of heart abnormalities. For example, Fig. 3A provides qualitative visualization of every cardiac sound event, including S2 split and gives instantly visual ranges of change in frequency and time. Fig. 3C provides us with precise estimation of S2 split duration (time distance between peaks is equal to 30 ms) and S2 duration (75 ms) and provides relative scale of intensity for systolic murmur (30% of maximum power). Fig. 3D provides the same measure, but at the dominant murmur frequency of 125 Hz (horizontal line on Fig. 3A). On that frequency murmur power is increased to 45% of maximum. Fig. 3E1 shows the energy density spectrum of murmur obtained from HES for entire murmur time duration (murmur peak is at 125 Hz and half-bandwidth at 50% magnitude is 21 Hz), same spectrum obtained by FFT is shown on Fig. 3E2 (peak frequency 124 Hz, half-bandwidth 11 Hz and on Fig. 3E3 we show instant energy density spectrum obtained at murmur intensity peak (vertical line on Fig. 3A), showing peak at 128 Hz and half-bandwidth of 23 Hz. This instantaneous frequency distribution can be obtained at any time instant for murmur or any other sound component. We also measure effective frequency bandwidth (lower frequency LF and high frequency HF) at 50% of the magnitude on the spectrum distribution. Instantaneous peak and mean frequencies extracted from HES are shown on Figs. 4(A,B) where we can clearly see dominant frequencies of S1, S2 and murmur. Murmur frequency varies in time between 100 to 150 Hz; S2 frequency varies between 160 to 100 Hz and S1 changes between 70 to 145 Hz. Instantaneous jumps in frequency can be visualized, measured and correlated with extra sounds(clicks, snaps, splits, etc). To further illustrate HES format ability to characterize heart sound we extracted murmur signal from the heart sound (Fig. 4C) and analyzed it separately. Murmur energy signature is depicted in Fig. 4D showing narrow frequency width, with data extraction performed at vertical and horizontal lines. From this image mean frequency of 125 Hz, frequency half-bandwidth 20 Hz, lower frequency 102 Hz and high frequency 142 Hz are measured. On Fig. 4E the murmur instantaneous power change at dominant frequency is shown. The horizontal line drawn at 10% power threshold shows murmur start, end points and timing of murmur energy peak (0.262 sec). Fig. 4F shows instantaneous murmur frequency variation, with frequency being 125 Hz at murmur energy peak.
2.5 Heart sound data
In developing this method heart sounds that were recorded between 1980–91 by one of the authors at the Department of Pediatric Cardiology, IWK Center at Dalhousie University, Canada were utilized. Heart sounds were collected at four major auscultation positions, and subsequently documented as a database which included auscultation position, binary wave sound track (recorded at 11 kHz sampling rate), clinical diagnosis and auscultatory diagnosis. Most of this data was subsequently included in the educational teaching system, EarsOn  which included 260 clinical recordings and which was successfully utilized in the teaching of medical students and physicians . All heart sound recordings used in this paper are from that database (see Additional files 1, 2, 5, 6, 7, 8, 9), and no artificial or simulated sounds have been used.
3.0 Results and Discussion
3.1 Overview – what can we do with HES
New hardware and digital sound recording technologies can not completely revitalize old approaches (i.e. PCG or spectrograms) unless new methods present new advantages. The HES spectrogram offers a variety of new tools that may greatly reduce all existing disadvantages of PCG (see Sect. 1.2). By allowing dynamic separation of signal contributions (tones) in frequency and space intrinsic details of signal morphology are demonstrated. The HES spectrogram does not require special filtration. Background and line noise equally contribute to the entire spectrogram and thus are effectively eliminated from the visual image. The time dependent heart power plot deciphered from the HES spectrogram allows easy identification of start and end points of various heart sound components, showing as well separate peak components when signals are split. The method allows objective characterization of both time-averaged and instantaneous murmur's mean and peak frequencies and effective bandwidth, thus aiding in subsequent murmur classification. Folded multi-component frequencies are immediately visualized and documented on the HES image (Eq. 13). Integral HES energy (power plot, Eq. 15a) allows provision of a very detailed characterization of murmur and heart sound intensity (as it changes with time), offering unlimited possibilities to introduce precise digital scores to replace subjective Freeman-Levine system  of murmur intensity grades. Finally, the combination of PCG, HES image, power and frequency plots (Eqs. 13–18), allow much easier separation and identification of heart sounds, without the necessity of additional synchronized ECG tracings. These data plots also offer precise ability to measure timing of cardiac sound events, durations, delays and documentation of arrhythmias (variation in distances between energy peaks) if present. Additionally, HES offers the possibility of correlating time varying frequency (Eqs. 17–18) and intensity (Eqs. 15 (a,b)) of heart sounds with time dependent C-Doppler velocity profiles or M-mode ultrasound. This is done by a virtue of providing precisely calculated time varying mean frequency, instantaneous frequency and signal energy derived from the HES image. Rabin  directly correlates murmur intensity and murmur frequency with velocity and pressure gradient through an obstruction.
3.2 Examples of innocent murmurs using HES spectrograms
An example of a HES spectrogram for the innocent Still's murmur is presented in Fig. 2A with additional details shown in Figs. 2(B,C). The key characteristic of this image is the sharp resolution, making it possible to visually identify every component of the heart sound in a manner schematically described in Figs. 1(A,B). This heart sound is recorded at the apex. S1 is followed shortly by a 2+/6 systolic murmur of distinct musical quality. S1, S2 and S3 sounds are easily heard, and S2 sound is single to the experienced auscultator. However HES indicates narrow splitting of A2 and P2, illustrating the capability of the method. To illustrate the quantitative side of this method, instantaneous mean frequency characteristics were measured for murmur at times t = 0.4 s; 0.43 s; 0.45 s and 0.47 s. Murmur mean frequency and half-bandwidth were correspondingly 112.9/13.4; 106.2/14.8; 99.5/10.7; 94/13.4 (all measured in Hz) all consistently demonstrating narrow frequency bandwidth.
A second example of an innocent murmur recording of a pediatric patient is shown in Figs. 3(A–E) and 4(A–F). Sound recorded at 2nd LSB. These figures contain 13 components and demonstrates full graphic representation of HES image and format. The first five key elements are shown one below the other. The HES image shows corresponding flooded energy contours (Fig. 3A). The PCG of a single heart beat shows S1, murmur and split S2 sounds in a consecutive order (Fig. 3B). Power plots of Figs. 3C and 3D illustrate the most significant energy components and their gradients. Fig. 3C demonstrates the integral power plot (energy integrated across all frequencies) and in Fig. 3D power is extracted at a pre-selected murmur frequency (shown as a horizontal line at the cross-hair, Fig. 3D), The power plot clearly shows the maximum ratio of murmur to S2 intensity to be equal to 32%, which is consistent with clinical impression of the murmur being of 2+/6 intensity . The instantaneous frequency plot (Figs. 4A and 4B) illustrates the start and end of each heart sound. The variation of frequency of these components is also seen. Clearly demonstrated is that the murmur frequency decreases, that S3 frequency is the lowest; and that S2 has two frequency components separated in time.
energy density spectrum of the heart murmur (E1),
murmur time averaged frequency spectrum, obtained using FFT (E2), and
instantaneous energy density spectrum of heart murmur at its peak intensity (vertical cross-hair at the Fig. 3A).
Statistical characteristics of murmur frequency spectrum (peak frequency, mean value, bandwidth around mean value) and all numerical frequency criteria are obtained using equations (17–18) by integrating the energy signature image. Corresponding murmur detail and its HES are shown in Figs. 4C and 4D. Murmur oscillations are evident (basically non-musical), the frequency band of which is shown in detail, confirming the clinical diagnosis of innocent flow murmur [39, 50].
In Figs. 4D and 4E, "zoomed in" details of the HES are demonstrated, as well as detail of Doppler echo-like murmur intensity variation plot. The detail of the instantaneous murmur frequency plot (Fig. 4F), showing the murmur gradually increasing its frequency from 80 Hz to 125 Hz and then reducing it to 70 Hz, is also shown. Easily seen is that while time averaged frequency and instantaneous time frequency bands are narrow, the time dependent variation of the frequency spectrum of the murmur is significant. Abrupt start and end points of the murmur are clearly evident in these plots.
3.3 STFT spectrogram accuracy vs. new method
Classic spectrograms display frequency on the vertical axis and time on the horizontal, and plot sound intensity (measured in decibel) as a color map. They utilize Short Term Window Fourier Transform (STFT), which is a first order method and is subject to the uncertainty principle, whereby one cannot achieve simultaneous resolution in both frequency and time . Publications concerning signal processing also point to the inadequacy of the STFT method [28–30].
3.3.1 Comparison using model signals
In Figs. 5(A–E) comparison of the accuracy of newly developed HES method vs. STFT spectrogram method (Eqs. 12a and 12b) is shown. Labview V7.0 (National Instruments, Austin, TX) was utilized for STFT analysis with Hanning time window (w). In all cases frequency resolution window was set to be 2048 points. On Fig. 5A we present our test signal – basic chirp (6 kHz sampling rate) that emulates linear change in frequency between 60 Hz and 110 Hz (Eq.19). Similar test results are also presented in . Shown also are: HES energy signature spectrogram in Fig. 5B; two examples of STFT spectrograms with time windows w = 32 and 16 data points in Figs. 5(C,D) and quantitative frequency resolution comparison in Fig. 5E. We conclude that HES frequency resolution at signal peak is +-9.5 Hz and is 3.7 to 6.78 times better than STFT resolution. At signal peak HES mean frequency estimate 82.11 Hz matches very well (3%) with analytical chirp frequency of 84.9 Hz. Accuracy of temporal resolution was studied using model function derived from real heart sounds (Fig. 6A, sampling frequency 11 kHz). This function [see Additional file 3] includes two events (DTS1 and DTS2), each event presenting a narrow time split cardiac sound. DTS1 has 20 ms split and DTS2 has 41 ms split. HES results are presented on Fig. 6B and STFT (w = 256) results are presented on Fig. 6C. Matching power plots are presented on Figs. 6D(HES) and 6E (STFT). Clearly seen is that HES provides excellent visual separation of both temporal events, and time split is evident and measurable on energy contour plots and integral power plot. STFT spectrogram images are smeared, there is no visual separation and split is lost on the power plot. HES temporal accuracy at split mid point is 14.6% for DTS1 and 2.9% for DTS2, while correspondingly STFT temporal error is 78.5% and 66%.
In Figs. 7(A,B,C) STFT spectrograms that correspond to the HES image of Fig. 3A are shown. Clearly seen is that short window (Fig. 7A) overstretches frequency range, medium size window (Fig. 7B) provides very crude visual resolution and large size time window (Fig. 7C) completely smears all important components of the heart sound. Matching integral power plots obtained using Eq. 15a are presented in Figs. 8(A,B,C,D). Shown also is the excellent HES resolution of S2 time split (Fig. 8A), excellent time resolution of short window STFT (Fig. 8B), the bad time resolution for medium size STFT window (S2 split is lost, Fig. 8C) and the poor time resolution for large size STFT window (S1,S2 and murmur are smeared together, Fig. 8D). Corresponding quantitative frequency resolution measures are presented in Table 1. Again seen is that frequency resolution for STFT time windows w = 256 & 1024 is good, but image quality is pixilated, and time resolution for these windows is grossly insufficient (Figs. 8C and 8D). Time resolution for STFT window w = 16 is excellent, but frequency resolution is grossly overestimated, lower bound of frequency is completely lost (zero Hz), upper bound is exaggerated by 41% and frequency bandwidth is exaggerated by 215%.
3.3.2. Comparison using clinical heart sound and third party software
In this section comparison of the STFT spectrogram method with our newly developed method using example of a clinical heart sound is carried out. The original recording is also presented in Figs. 3(A–E) and 4(A–F) (using new method), the heart sound recording being that of an innocent murmur (musical) of a child, the sound track being recorded with 11 kHz sampling rate. To enable further research and comparisons the sound track is attached [see Additional file 2]. The STFT spectrograms shown were obtained using state-of-the art Meditron Analyzer software (FDA cleared clinical product distributed by Welch Allyn, NY) using its default (best) transformation option – Hanning window with 1024, 2048, 4096 data points correspondingly. A typical clinical complaint concerning this method is that clear separation of heart sounds is not allowed, and that precise identification of A2 and P2 components and their split is difficult. In Figs. 9(A,B) and 10(A,B,C), spectrograms obtained using STFT are shown, which currently have been the only means of obtaining dynamic content of the heart sound spectral signal. Quantitative data extraction is not possible as it is not possible using current third party software.
Recent studies [16–19, 23–25] show similar spectrograms, also without presenting quantitative details. Certain researchers (i.e. Bentley et al ) simply select window function and its parameters (i.e. length) for a particular heart signal by trial and error. This leads to ambiguity in time-frequency resolution to the point that two STFT computed for the same signal, but with different window function parameters, could hardly be identified as computed for the same signal This is clearly demonstrated by comparison of Fig. 7A and Fig. 7C). Results presented in Figs. 10(A,B,C) illustrate this ambiguity as it is not clear which resolution is correct. Window sizes w = 1024, 2048, 4096 data points with sampling rate of 11 kHz were employed. Small size window will improve time resolution, but will have poor frequency resolution and vise versa.
A recording of an innocent murmur of a child (Fig. 9A) is used to demonstrate the drawbacks of the wavelet method. The spectrogram detail is shown in Fig. 9B, while Fig. 9C illustrates the wavelet transform based scaleogram approach similar to one used by Kim and Tavel . Wavelet results were obtained using commercial music editing software Nero Wave Editor and are presented here for illustration purposes. While some correspondence between waveforms and wavelet scaleogram "splashes" is seen, detailed resolution remains insufficient and the image becomes highly smeared, especially at lower frequency range. This behavior is typical for wavelet transform and similar appearing visual images have been reported [27, 30]. It has been indicated  that STFT and wavelet transform have similar resolution and that improved localization of acoustic events will be useful. Wavelet based visual images (contours) were also reported by Tovar Corona et al. [20, 21], however the article presents very sketchy results and method description. It is well established in the signal processing literature [28, 30, 54] that the STFT spectrogram image is window size dependent, as is clearly seen in the Figs. 10A,B, and 10C. As window size (number of computational points in the sample) increases, the time resolution of the spectrogram decreases (increased horizontal smearing), resulting in improved frequency resolution. These trends are mutually exclusive. Separating the murmur from S1 is very difficult, and detecting S3 with certainty is virtually impossible, as in Fig. 10A, which initially appeared satisfactory, Figs. 10B and 10C become unusable. Close analysis of spectrograms (Figs. 10A, 10B, 10C)) also indicates the tendency to overestimate the upper frequency bound of the signal, especially when improved time resolution is sought. This in turn leads to unrealistic and significantly overestimated conclusions about murmur frequencies. The HES method based spectrogram of the same recording is shown in Fig. 10D and in greater detail in Figs. 2A and 2B. HES is the second order method and simultaneously resolves both time and frequency and separate all key components of the heart sound. This can be seen from Fig. 2A. Fig. 2B clearly demonstrates the detail of Fig. 2A, focusing on the murmur at the end of S1, and on S2 and S3 sounds. Note a clear separation of the heart murmur, and that aortic and pulmonary components of P2 are clearly visible. One also sees in S1 clear separation of mitral and tricuspid components in the power (energy) vs. time plot (Fig. 2C). We can also identify from the plot time the delay between S2 and S3 as well as their duration. In Table 2 we provide further quantitative comparison of murmur, S1 and S2 time averaged frequency resolution using HES and STFT (obtained by the authors using method described in Sect 3.3.1 and Eqs. 12a and 12b, time window w = 16 data points, frequency window 2048 data points). STFT exceeds upper frequency estimation by 60 to 66% and frequency bandwidth estimation between 2.2 to 5 times.
Several other relevant issues and especially end point detection using HES, PCG, STFT and effects of filtration are discussed in the Additional file 4.
3.4 Analyzing heart energy signature
We conclude that abnormalities may be detected using overall and immediate visual contrast i.e. self-referencing property of HES image. By also utilizing the quantitative nature of the above displays, the ability to demonstrate murmur pitch, for example, is shown.
4.0 Future Work
Future work will focus on documenting various important characteristics of HES spectrograms and specifically on their ability to characterize heart murmur frequency, S2 heart sound frequency and its split, murmur timing and duration, murmur intensity, S3 sound presence and intensity, S1 presence and intensity, presence of ejection sounds and arrhythmias. The library of heart sounds of one of the authors [3, 50] will be utilized to conduct these studies. The systolic ejection click is a very difficult auscultatory event for the physician to elicit, yet may be the single abnormal finding. One study currently in progress is to study a cohort of patients with a heart murmur to determine blindly whether the murmur is innocent or abnormal. A study of patients with systolic ejection clicks by HES is in process. The ultimate goal is to have available a low price bedside automated diagnostic tool to supplement the existing problem of low auscultatory skill of the family physician.
In this paper a new method and format of analyzing heart sounds using Cohen class joint time-frequency transformation is presented. Initial results of applying these methods to normal and abnormal heart sounds are demonstrated. This method allows detailed quantitative characterization of heart sounds and uses both visual images and a system of integral plots that characterize time averaged and instantaneous sound intensity and frequency variations. Examples of the innocent murmur, mitral valve prolapse, pulmonary and aortic stenosis, and Tetralogy of Fallot are presented. All heart sound components, i.e. S1, S2, S3, murmurs and sound splits, were clearly separated in time and frequency. High resolution of generated heart sound images, in both time and pitch, are demonstrated, presenting a distinctly improved quality to traditional spectrogram images (based on SFFT). The resulting visual images have self-referencing quality, whereby individual features and their changes become immediately obvious.The procedure also offers a new parameter for cardiac research, which, for example by virtue of its ability to portray third heart sounds , may well play a valuable part in the management of patients with coronary artery disease. Current ACC/AHA Guidelines recommend that patients with unstable angina and a concurrent auscultated S3 be classified in the group at highest risk for adverse outcomes and considered candidates for an early invasive strategy .
- Mangione S, Nieman LZ: Cardiac auscultatory skills in internal medicine and family practice trainees. JAMA 1997,278(9):717–28. 10.1001/jama.278.9.717View ArticleGoogle Scholar
- Mangione S: The teaching of cardiac auscultation during internal medicine and family medicine training-a nation wide comparison. Acad Med 1998,73(10 suppl):S10–2.View ArticleGoogle Scholar
- Roy DL, Sargent J, Gray J, Hoyt B, Allen M, Fleming M: Helping family physicians improve their cardiac auscultatory skills with an inter-active CD Rom. J Contin Educ Health Prof 2002, 22: 152–9. 10.1002/chp.1340220304View ArticleGoogle Scholar
- Haney I, Ipp M, Feldman W, McCrindle BW: Accuracy ofclinical assessment of heart murmurs by office based (general practice) pediatricians. Arch Dis Child 1999, 81: 409–412.View ArticleGoogle Scholar
- Favrat B, Pecoud A, Jaussi A: Teaching cardiac auscultation to trainees in internal medicine and family practice: does it work? BMC Med Educ 4: 5. 2004 March 31 10.1186/1472-6920-4-5
- Fogel DH: The innocent heat murmur in children; a clinical study of its incidence and characteristics. Am Heart J 1960, 59: 844–55. 10.1016/0002-8703(60)90408-7View ArticleGoogle Scholar
- MacLaren MJ, Lachman AS, Pocock WA, Barlow JB: Innocent murmurs and third heart sounds in black school children. Br Heart J 1980, 43: 67–73.View ArticleGoogle Scholar
- U.S. Department of Health and Human Services, Center for Medicare and Medicaid Services (CMS). Obsolete and unreliable diagnostic tests Medicare Coverage Issues Manual Sec. 50–34. CMS Pub. No. 6. Baltimore, MD: CMS; effective January 1, 1997 [http://www.cms.hhs.gov/manuals/]
- Baura GD: The Business of innovation. MX: The Online Information Source for the Medical Device Industry 2005. [http://www.devicelink.com/mx/archive/05/01/baura.html]Google Scholar
- Kassal J, Reeves W, Donnerstein R: Polymer-based adhered differential output sensor for cardiac auscultation. Med Electronics 1994, 54–63.Google Scholar
- McKusick VA, Webb GN, Humphries JN, Reid JA: On cardiovascular sound: further observations by means of spectral phonocardiography. Circulation 1955, 11: 849–870. [http://circ.ahajournals.org/cgi/reprint/11/6/849]View ArticleGoogle Scholar
- McKusick VA, Massengale ON Jr, Wigod M, Webb GN: Spectral phonocardiographic studies in congenital heart disease. Brit Heart J 1956, 18: 403–416.View ArticleGoogle Scholar
- McKusick VA: Cardiovascular sound in health and disease. Baltimore: Williams & Wilkins; 1958.Google Scholar
- McKusick VA: Spectral phonocardiography. Am J Cardiol 1959, 4: 200–206. 10.1016/0002-9149(59)90248-6View ArticleGoogle Scholar
- Don Michael TA: Auscultation of the Heart: A Cardiophonetic Approach. McGraw Hill 1998, 392.Google Scholar
- Balster DA, Chan DP, Rowland DG, Allen HD: Digital acoustic analysis of precordial innocent versus ventricular septal defect murmurs in children. Am J Cardiol 1997,79(11):1552–55. 10.1016/S0002-9149(97)00194-XView ArticleGoogle Scholar
- Donnerstein R, Thomsen VV: Hemodynamic and anatomic factors affecting the frequency content of Still's innocent murmur. Am J Cardiol 1994, 74: 508–510. 10.1016/0002-9149(94)90917-2View ArticleGoogle Scholar
- Noponen AL, Lukkarinen S, Angela A, Sikio K, Sepponen R: How to recognize innocent vibratory murmur. Comp in Cardiol 2000, 27: 561–564.Google Scholar
- Noponen AL: 4 th Congress on Pediatric Cardiology. Poster Presentation 2005.Google Scholar
- Tovar-Corona B, Hind MD, Torry JN, Vincent R: Effects of respiration on heart sounds using time-frequency analysis. Comp in Cardiol 2001, 457–460.Google Scholar
- Tovar-Corona B, Torry JN: Time-frequency representation of systolic murmurs using wavelets. Comput in Cardiol 1998, 601–604.Google Scholar
- DeGroff CG, Bhatikar S, Herzburg J, Shandas R, Vlades-Cruz L, Mahajan RL: Artificial neural network-based method of screening heart murmurs in children. Circulation 2001, 103: 2711–2716.View ArticleGoogle Scholar
- Tavel ME, Katz H: Usefulness of a new sound spectral averaging technique to distinguish an innocent systolic murmur from that of aortic stenosis. Am J Cardiol 2005, 95: 902–904. 10.1016/j.amjcard.2004.12.018View ArticleGoogle Scholar
- Kim D, Tavel ME: Assessment of severity of aortic stenosis through time-frequency analysis of murmur. Chest 2003, 124: 1638–1644. 10.1378/chest.124.5.1638View ArticleGoogle Scholar
- Tavel ME: Cardiac auscultation: a glorious past: and it does have a future. Circulation 2006, 113: 1255–1259. 10.1161/CIRCULATIONAHA.105.591149View ArticleGoogle Scholar
- Tuchinda C, Thompson WR: Cardiac auscultatory recording database: delivering heart sounds through the internet. Proc AMIA Symp 2001, 716–20.Google Scholar
- Hayek CS, Thompson WR: Wavelet processing of systolic murmurs to assist with clinical diagnosis of heart disease. Biomed Instrum Technol (Biomedical instrumentation & technology/Association for the Advancement of Medical Instrumentation) 2003,37(4):263–70.Google Scholar
- Cohen L: The uncertainty principle for the short time fourier transform and wavelet transform. In Wavelet transforms and time frequency signal analysis Edited by: Debnath L. Birkhäuser Boston; 2001, 1–428. [http://en.wikipedia.org/wiki/Short-time_Fourier_transform]Google Scholar
- Rein S, Reisslein M: Identifying the classical music composition of an unknown performance with wavelet dispersion vector neural nets (extended version). Information Sciences 2006,176(12):1629–55. [http://www.fulton.asu.edu/~mre/audio_wav_ext_Dec04.pdf] 10.1016/j.ins.2005.06.002View ArticleMATHGoogle Scholar
- Signal processing toolbox for use with Matlab, Version 5 Mathworks 2002, 1–800.
- Qian S, Dapang CD: Joint time-frequency analysis: method and applications. Prentice Hall, NJ 1996, 78–79.Google Scholar
- Lang CW, Forinash K: Time-frequency analysis with the continuos wavelet transform. Am J Phys 1998,66(9):794–797. 10.1119/1.18959View ArticleGoogle Scholar
- Green JM, Wilcke JR, Abbott J, Rees LP: Development and evaluation of methods for structured recording of heart murmur findings using SNOMED-CT post-coordination. J Am Med Inform Assoc 2006,13(3):321–33. 10.1197/jamia.M1973View ArticleGoogle Scholar
- Nigam V, Priemer R: A dynamic method to estimate the time split between the A2 and P2 components of the S2 heart sound. Physiol Meas 2006,27(7):553–67. 10.1088/0967-3334/27/7/001View ArticleGoogle Scholar
- Nigam V, Priemer R: Accessing heart dynamics to estimate durations of heart sounds. Physiol Meas 2005,26(6):1005–18. 10.1088/0967-3334/26/6/010View ArticleGoogle Scholar
- Finley P, Warren AE, Sharratt GP, Amit M: Assessing children's heart sounds at a distance with digital recordings. Pediatrics 2006, 118: 2322–2325. 10.1542/peds.2006-1557View ArticleGoogle Scholar
- Marcus GM, Gerber IL, McKeown BH, Vessey JC, Jordan MV, Huddleston M, McCulloch CE, Foster E, Chatterjee K, Michaels AD: Association between phonocardiographic third and fourth heart sounds and objective measures of left ventricular function. JAMA 2005,293(18):2238–44. 10.1001/jama.293.18.2238View ArticleGoogle Scholar
- Collins SP, Lindsell CJ, Peacock WF, Hedger VD, Askew J, Eckert DC, Storrow AB: The combined utility of an S3 heart sound and B-type natriuretic peptide levels in emergency department patients with dyspnea. J Card Fail 2006,12(4):286–92. 10.1016/j.cardfail.2006.01.012View ArticleGoogle Scholar
- Kudriavtsev VV, Kaelber D, Lazbin M, Polyshchuk VV, Roy DL: New tool to identify Still's murmurs. Pediatric Academic Societies Annual Meeting 2006 April 29–May 2 [http://www.bsignetics.com/papers/PASStillsMurmur.pdf]
- Polyshchuk VV, Kudriavtsev VV: Cardiovascular sound signature: method, process and format. USPTO Application, 2005; 0222515. Assignee: Biosignetics Corp; 1–41.
- Polyshchuk V, Choy FK, Braun MJ: New gear-fault-detection parameter by use of joint time-frequency distribution. AIAA J of Prop and Power 2002,16(2):340–346.View ArticleGoogle Scholar
- Polyshchuk V: Detection and quantification of the gear tooth damage from the vibration and acoustic signatures. In PhD Dissertation. University of Akron; 1999.Google Scholar
- Reed TR, Reed NE, Fritzson P: Heart Sound analysis for symptom detection and computer-aided diagnosis. Sim Mod Practice and Theory 2004,12(2):129–46. [http://www.bearcave.com/misl/misl_tech/signal/nonstat/index.html] 10.1016/j.simpat.2003.11.005View ArticleGoogle Scholar
- de Bruijn NG: A theory of generalized functions, with applications to Wigner distribution and Wyel correspondence. Nieuw Archief voor Wiskunde 1973,XX1(3):205–80.Google Scholar
- Wigner E: On the quantum correction for thermodynamic equilibrium. Phys Rev 1932, 40: 749–59. [http://en.wikipedia.org/wiki/Wigner_quasi-probability_distribution] 10.1103/PhysRev.40.749View ArticleGoogle Scholar
- Ville J: Theorie et applications de la notion de signal analytique. Cables et Transmission 1948, 1: 61–74. 2eA. (in French)Google Scholar
- Messiah A: Quantum mechanics. translated by G.M. Temmer, North Holland, Amsterdam 1970, 1: 50.Google Scholar
- Cable CS, Ducharme NG, Hackett RP, Erb HN, Mitchell LM, Soderholm LV: Sound signature for identification and quantification of upper airway disease in horses. Am J of Vet Res 2002,63(12):1707–13. 10.2460/ajvr.2002.63.1707View ArticleGoogle Scholar
- Cohen L: Time frequency analysis. Prentice Hall, Englewood Cliffs, NJ 1995.Google Scholar
- Roy DL, Hoyt B: Collection of heart sounds: EarsOn. CD Rom, CorSonics Corp, Halifax, N.S [http://www.corsonics.com]
- Cohen L: Time-frequency distributions – a review. Proceedings of the IEEE 1989,77(7):941–81. 10.1109/5.30749View ArticleGoogle Scholar
- Mertins A: Signal analysis. In Wavelets, Filter banks, Time-frequency transforms and applications. John Wiley & Sons; 1999.Google Scholar
- Claasen TACM, Mecklenbrauker VFG: The Wigner distribution – a tool for time-frequency signal analysis, Part I: Continuous time signals. Philips J of Res 1980,35(3):217–250.MATHGoogle Scholar
- Chen VC, Ling H: Time frequency transforms for radar imaging and signal analysis. Artech House 2003, 264. (page 37 for Wigner Ville, page 28 for STFT) [http://airborne.nrl.navy.mil/~vchen/tftsa.html]Google Scholar
- Bentley PM: Time-frequency and time-scale techniques for the classification of native and bioprosthetic heart valve sounds. IEEE Trans Biomed Eng 1998,45(1):125–8. 10.1109/10.650366View ArticleGoogle Scholar
- Bsignal – Heart energy signature visualization system, Version 3.5 Product manual 2006, 1–120. [http://www.biosignetics.com]
- Wigner quasi-probability distribution [http://en.wikipedia.org/wiki/Wigner_quasi-probability_distribution]
- Braunwald E, Antman EM, Beasley JW, et al.: ACC/AHA guideline update for the management of patients with unstable angina and non-ST-segment elevation myocardial infarction -2002: summary article: a report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines (Committee on the Management of Patients With Unstable Angina). Circulation 2002,106(14):1893–900. 10.1161/01.CIR.0000037106.76139.53View ArticleGoogle Scholar
- Freeman AR, Levine SA: The clinical significance of the systolic murmur: A study of 1,000 consecutive "non-cardiac" cases. Ann Intern Med 1933, 6: 1371–85.View ArticleGoogle Scholar
- Nygaard H, Thuesen L, Terp K, Hasenkam JM, Paulsens PK: Assessing the severity of aortic valve Stenosis by spectral analysis of cardiac murmurs (spectral vibrography). Part II: Clinical Aspects. J Heart Valve Dis 1993,2(4):468–75.Google Scholar
- Kvart C, French AT, Fuentes VL, Haggstorm J, McEwan JD, Schober KE: Analysis of murmur intensity, duration and frequency components in dogs with aortic stenosis. J Small Anim Pract 1998,39(7):318–24.View ArticleGoogle Scholar
- Bulgrin JR, Rubal BJ, Thompson CR, Moody JM: Comparison of short-time Fourier, wavelet and time-domain analyses of intracardiac sounds. Biomed Sci Instrum 1993, 29: 465–72.Google Scholar
- Rabin A: Auscultation of the Heart. Volume 6. 2nd edition. Year Book Medical Publishers; 1967:1–240.Google Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.