Research  Open  Published:
From wavelets to adaptive approximations: timefrequency parametrization of EEG
BioMedical Engineering OnLinevolume 2, Article number: 1 (2003)
Abstract
This paper presents a summary of timefrequency analysis of the electrical activity of the brain (EEG). It covers in details two major steps: introduction of wavelets and adaptive approximations. Presented studies include timefrequency solutions to several standard research and clinical problems, encountered in analysis of evoked potentials, sleep EEG, epileptic activities, ERD/ERS and pharmacoEEG. Based upon these results we conclude that the matching pursuit algorithm provides a unified parametrization of EEG, applicable in a variety of experimental and clinical setups. This conclusion is followed by a brief discussion of the current state of the mathematical and algorithmical aspects of adaptive timefrequency approximations of signals.
EEG
The electroencephalogram (EEG), recorded from electrodes placed on the scalp, reflects the electrical activity of the brain.
First registered from human scalp in 1929 [1], until today EEG remains an important tool in neuroscience and clinical neurophysiology. For a long time it was the only objective parameter providing information on brain's function. Recently emerging dynamic imaging techniques like fMRI and PET offer a complementary information on brains functioning. Their drawbacks include significantly lower time resolution, high cost and invasiveness. Nevertheless, in spite of those drawbacks, they are often preferred for the straightforward interpretability. On the contrary, visual interpretation of EEG is a difficult, tedious and complicated task, requiring many years of experience. And even then it contains a significant subjective factor:
Every experienced electroencephalographer has his or her personal approach to EEG interpretation. (...) there is an element of science and element of art in a good EEG interpretation; it is the latter that defies standardization.
writes Prof. Ernst Niedermayer in the recent edition of a fundamental reference [2].
In spite of this discouraging opinion, application of various signal processing methods in this field is still very popular. First introduction of Fourier analysis to EEG is dated 1932 [3]. Since then spectral analysis has become a standard tool in this field. But until today, basically no other method gained a general acceptance – the few widely accepted and applied methods of EEG analysis still amount to [4]:

1.
visual analysis of raw EEG traces,

2.
Fourier estimation of spectral power in selected frequency bands,

3.
description of evoked potentials, averaged in the time domain.
One of the reasons of this unsatisfactory situation is that among the variety of new methods, proposed each year for the analysis of EEG, very few have any direct relationship to the traditional, visual analysis. That means that their results usually cannot be directly related to the most valuable knowledge base of 70 years of experience, collected by means of the visual analysis of EEG. On the contrary, each new method needs to be evaluated from the scratch, in many laboratories and over many repeated studies, before it can be considered as a clinically usable parameter. And since there is a multitude of new methods being proposed, with few a priori hints for their possible performance, this process seems to be hardly convergent.
Wavelets: the first step towards timefrequency analysis
EEG as a signal is extremely complex and reveals both rhythmical and transient features. Therefore, when orthogonal wavelets triggered in eighties the timefrequency revolution in signal processing, analysis of EEG was one of the fields where hopes for a more "morphological" description of signal arose.
First successful applications of wavelets addressed the issue of evoked potentials (EP). Eventrelated potentials are small changes of EEG activity in response to an external (evoked) or internal stimulus. They are usually an order of magnitude weaker than the ongoing EEG activity, so traditionally they were observed only in time averages of stimulussynchronized repetitions. Owing to the property of the time synchronization, they can be studied in the space of orthogonal wavelet coefficients. Figures 2,3,4 present the first wavelet study of evoked potentials [5]. Randomly chosen epochs of ongoing EEG were statistically compared – in the space of wavelet coefficients – to those synchronized by an auditory stimulus. As a result, only 9 out of 512 coefficients were found to discriminate between those conditions. They were enough to reproduce the basic morphology of the average EP. Figure 4 presents reconstruction of single evoked potentials using those coefficients.
Limitations of linear and quadratic timefrequency representations
Further attempts to apply wavelets to the analysis of ongoing EEG activity encountered a major drawback. Representation of the signal, obtained from orthogonal wavelet representation, is not timeshift invariant. This property is illustrated in Figure 5.
This limitation is not present in the continuous wavelet transform, as well as several quadratic representations of signal's energy density in the timefrequency plane (c.f. [6] and Figure 8). However, in such case we loose the compact description offered by orthogonal wavelets – the representation is highly redundant. Another problem relates to the occurence of cross terms (Figure 6).
Both these problems can be solved (or at least significantly reduced) by the solution proposed in the next section.
Adaptive approximations and matching pursuit
Given a set of functions (dictionary) D = {g _{1},g _{2}...g _{ n }} such that g _{ i } = 1, we can define an optimal Mapproximation as an expansion, minimizing the error ε of an approximation of signal f(t) by M waveforms:
where {γ_{ i }}_{ i = 1..M } represents the indices of the chosen functions g _{γi}. Finding such an optimal approximation is an NPhard problem. A suboptimal expansion can be found by means of an iterative procedure, such as the matching pursuit algorithm (MP) proposed by Mallat and Zhang [7].
In the first step of MP, the waveform g _{γ0} which best matches the signal f(t) is chosen. In each of the consecutive steps, the waveform g _{γn } is matched to the signal R^{n} f, which is the residual left after subtracting results of previous iterations:
Orthogonality of R^{n+1} f and g _{γn } in each step implies energy conservation:
For a complete dictionary the procedure converges to f:
From this equation we can derive a timefrequency distribution of the signal's energy, that is free of crossterms, by adding Wigner distributions of selected functions:
This magnitude is presented in Figure 7, calculated from MP decomposition of a simulated signal with known and simple content. We observe that most of the structures are represented compactly and with high resolution, except for the structure of changing frequency (linear chirp). It is represented by a series of structures of constant frequency, since in the applied Gabor dictionary (section 5) we have only constant frequency modulations. Section 7 presents an alternative approach to this issue.
Figure 8 presents estimates of the timefrequency density of the same signal's energy, obtained from: spectrograms with different window widths, continuous wavelet transform and smoothed pseudo WignerVille distribution. Only in the last case representation of the chirp looks better than on the plot obtained from MP decomposition, but we must take into account that in this case the parameters of the kernel of the distribution were optimized for this particular signal.
Except for the lack of cross terms and high resolution, adaptive timefrequency parametrizations exhibit one more basic and important advantage over the continuous timefrequency representations. Unlike the maps from Figure 8, for all the structures presented in Figure 7 we have a priori the exact values of their time and frequency centers, widths, amplitudes and phases. This property will be thoroughly explored in the following studies.
First application in EEG analysis: sleep spindles
The presence of sleep spindle should not be defined unless it is of at least 0.5 sec duration, i.e., one should be able to count 6 or 7 distinct waves within the halfsecond period. (...) The term should be used only to describe activity between 12 and 14 cps.

says the definition from the basic reference [8] – "A manual of standardized terminology, techniques and scoring system for sleep stages in human subjects". It can be directly translated into the language of parameters of the structures fitted to the signal by the algorithm discussed in the previous section.
By choosing from the timefrequency atoms, fitted to EEG by the MP algorithm, those conforming to the above criteria, we obtain a detailed, automatic and highresolution parametrization of the relevant structures, which correspond to sleep spindles [9, 10]. Figures 10 and 11 present results of such a procedure carried out for several derivations of an overnight sleep EEG recording. This parametrization has proven to be consistent with visual detection, especially for the structures of higher amplitudes [11]. For lower amplitudes the algorithm detects also spindles elusive to a human expert.
Results presented in these figures conform also the previously observed effect of predominance of lower frequencies of sleep spindles in frontal derivations and higher frequencies in parietal derivations. We also notice that some frequencies on these plots are "prefereed", i.e. exhibit regular maxima in the histograms. But are they preferred by the brain or by the analysis algorithm? This question will be resolved in the next section.
Discrete Gabor dictionary
Gabor functions (sinemodulated Gaussian functions) provide optimal joint timefrequency localization. A real Gabor function can be expressed as:
N is the size of the signal for which the dictionary is constructed, K(γ) is such that g _{γ} = 1. γ = {u,ω,s,φ } denotes parameters of the dictionary's functions (timefrequency atoms). These parameters generate a threedimensional continuous space; this results in an infinite dictionary size. Therefore, in practical implementations, we use subsets of the possible dictionary's functions.
In the dictionary implemented originally by Mallat and Zhang in [7], parameters of these functions (timefrequency atoms) are chosen from dyadic sequences of integers. Their sampling is governed by an extra parameter – octave j (integer). Scale s, which corresponds to an atom's width in time, is derived from the dyadic sequence s = 2^{j}, 0 ≤ j ≤ L (signal size N = 2^{L}). Parameters u and ω, which correspond to position in time and frequency, are sampled for each octave j with interval s = 2^{j}, or, if oversampling by l is introduced, with interval 2^{jl}.
This specific waveletlike structure of the dictionary is responsible for the fact that decomposition over such a dictionary "prefers" certain positions in frequency or time (Figures 12, 13). In those positions more atoms are available for decomposition due to the particular structure of the dictionary, i.e. scheme of subsampling the parameter space.
Stochastic dictionaries
In forming the dictionary used for MP decomposition, using any fixed scheme of subsampling the parameter space introduces statistical bias in the resulting parametrization. This bias becomes apparent only when statistics comes into play, like in parameterization of large amounts of data. Unbiased MP decompositions can be obtained by an analogue of Monte Carlo methods.
In [12] we proposed MP with stochastic dictionaries, where the parameters of a dictionary's atoms are randomized before each decomposition. A stochastic dictionary D is constructed for a given signal length N and chosen "resolutions"^{1} in time, frequency and scale (Δt, Δω and Δs). The space of parameters t, ω and s is thereby divided into bricks of size Δt by Δω by Δs each, where ω ∈ (0,π). In each of those bricks, one timefrequency atom is chosen by drawing its parameters from flat distributions within the given ranges of continuous parameters.
Figures 14 and 15 present results for the same data as Figures 10 and 11. These results, owing to the application of the above described stochastic dictionaries, are free from the statistical bias.
^{1}The resolution of the matching pursuit is hard to define in general, since the procedure is nonlinear and signaldependent. It should be related to the distance between neighboring dictionary waveforms available for decomposition. In the described procedure, this distance does not exceed twice the value of the corresponding parameter (Δt, Δω or Δs).
Pharmaco EEG
This section presents an application of the proposed methodology to the standard clinical problem of testing influence of sleep inducing drugs on the sleep EEG.
The effects of zolpidem, midazolam and placebo, administered at bed time, were studied in 7 paid volunteers. From each of the 21 allnight recordings (3 nights for each subject) we extracted artifactfree epochs from derivation C3A2 of the 10–20 system. For the analysis of slow wave activity (SWA) we used data from stages III and IV and for sleep spindles only data from sleep stage II.
Figures 16 and 18 present the classical approach to the analysis of these data, that is averaging the spectral integrals in the above mentioned ranges, for each of the analyzed recordings.
Figures 17 and 19 present each of the structures, classified as sleep spindles (Figure 17) or SWA (Figure 19) as a dot in the frequencyamplitude plane. These structures were selected using not only the frequency information, as in the "classical" approach, but also information on time duration and amplitude from Table 1. Total power carried by these structures, summed and normalized per time unit, is indicated on the right of each plot, together with the average number of occurrences per minute, average amplitude and frequency.
We observe that selective MPbased estimates of power are more sensitive to the effect of influence of both drugs than the spectral integrals. This sensitivity becomes crucial in two of the analyzed cases:

1.
In the recording of patient ZCB09, which revealed a very low spindling activity (last row in Figures 16 and 17), fluctuations in the background masked the power of sleep spindles to such an extent, that spectral integrals indicate a partially reverse trend, i.e. decrease of power of sleep spindles under the influence of midazolam. Selective estimates indicate the expected increase.

2.
For patient ZCB08 (second row from the bottom) spectral integrals indicate increase of power of SWA (δ), while MP estimates reveal a decrease coherent with all the other cases.
Finally, as an example of an effect elusive to the classical methodology, these results indicate also a statistically significant decrease of the frequency of the slow wave activity. Further discussion and statistical evaluation of results can be found in [13].
Figure 20 partially explains the increased sensitivity of the proposed approach as compared to the classical paradigm of bandlimited spectral power integrals.
Representation of changing frequency
Another interesting property of the stochastic timefrequency dictionaries, discussed in section 5.1, relates to the representation of structures of changing frequency. Such structures are absent in the dictionaries usually applied for the decomposition, and therefore are represented as a series od fixedfrequency Gabor functions (Figure 22 (f) and Figure 21 (b)).
However, by averaging several decompositions of the same signal in different realizations of smaller stochastic dictionaries, we obtain a more continuous representation like in Figures 22 (g) and Figure 21 (c). The representation is no more compact, but retains the general property of absence of cross terms.
Eventrelated desynchronization and synchronization
Advantages of the MP algorithm with stochastic dictionaries can be also combined with the stochastic element present inherently in the data, like e.g. in the case of analyzing repetitions of eventrelated potentials. This relates especially to the nonphase locked activity, i.e. such that would not be enhanced in the stimulussynchronized time averages. Its detection requires a different analysis technique, allowing for averaging signal's energy irrelevant of the phase^{2}. Classically it was achieved by squaring the values of signals, bandpass filtered in a priori chosen frequency bands, before averaging ([14], Figure 23).
This methodology has several drawbacks, including tha arbitrary choice of frequency bands and limited sensitivity (c.f. Figure 20). Averaging the timefrequency densities of energy, obtained from the MP decomposition, allows for an instantaneous evaluation of the complete picture of changes in the timefrequency plane, with high resolution. Figure 25 presents this picture for the same data as in Figure 23. Another study in Figure 26 reveals clearly splitting of the alpha band in the upper and lower frequencies, with different behaviour related to the finger movement. This phenomenon would be very difficult to find using the classical methodology.
^{2}If we want to analyze phasic synchronization in higher frequencies, like e.g. gamma band around 40 Hz, the accuracy of the triger's determination would have to be much better than e.g. 1 ms, which is rather beyond the accuracy of determining e.g. the moment of a finger movement. But nonphasic effects should be visible in the presented approach.
Epileptic seizures
Discussed advantages of timefrequency estimates of energy density, obtained from MP decomposition, allowed for identification of different dynamic states during evolution and propagation of epileptic seizures in [15].
Figure 27 presents an example of timefrequency density of energy, calculated from the MP expansion of an intracranial recording of epileptic seizure. Sensitivity of the algorithm to the phase changes clearly distinguishes the initial less synchronized period of transitional rhytmic activity (starting near 15th second) from the organized rhytmic activity. In this period (starting about 30th second) all the harmonics, up to the Nyquist frequency, clearly follow the pattern of decreasing frequency. Such a clear representation of changing frequency was possible using the technique discussed in section 7.
Problems
As presented in figure 28, greedy MP algorithm in certain cases can fail to properly decompose signal containing even a simple combination of dictionary's functions. This counterexample was simulated especially for discussion of practical issues and as such may look scary, but on the other hand we never encountered such a failure in applications to real world signals. The trap in this case lies in the fact that both these structures have exactly the same phase, and it might be even dicussable if they should not be treated as one in the first approximation. If they were produced by two different biological generators, such a coincidence of phases would not be possible and they would be parametrized separately.
Other, more theoretical examples of MP failures are given in [16] and [17]. Some of these cases can be properly solved by the orthogonalized matching pursuit [18], at a cost of increased computational requirements and a possibility of introducing numerical instabilities [19]. Another modification of the MP algorithm, discussed in [20], relies on a modification of the similarity function used in each step to choosed the "best fit". Other works [21] indicate that global optimalization of the l^{1} norm of expansion's coefficients might be the best choice [17], but, in spite of recent advances in linear programming, computational complexity is still very high.
Other problems, awaiting better solutions, include: efficient and (at the same time) biasfree implementations, robust estimation of resolution, proper addressing of the tradeoff between resolution (increasing dictionary size) and computation cost, and extension of the algorithm to the multichannel case. Such research will improve effectivity and understanding of mathematical foundations of these higly nonlinear procedures. However, current implementations and knowledge of algorithm's properties are already sufficient for large scale applications in EEG research and clinical practice. They may serve as a basis for a unification of parametrization and description of EEG.
Implementations of MP with stochastic dictionaries is available from http://brain.fuw.edu.pl/mp.
Authors contributions
Parts of presented research, according to the quoted references, were coauthored by:

Prof. Katarzyna J. Blinowska, head of the Laboratory of Medical Physics, Warsaw University, and my colleagues from the Lab: Józef Ginter Jr, Dobiesław Ircha, Jarosław Zygierewicz.

Prof. Waldemar Szelenberger, Medical University of Warsaw

Prof Ernest A. Bartnik, Inst. of Theoretical Physics, Warsaw University

Prof. Piotr Franaszczuk, Johns Hopkins Medical University

Prof. Gert Pfurtscheller and dr Christa Neuper, Graz Technical University
References
 1.
Hans Berger: Über das elektrenkephalogramm des menschen. Arch f Psychiat 1929, 87: 527–570.
 2.
Ernst Niedermayer, Lopes Da Silva F: Electroencephalography: Basic Principles, Clinical Applications and Related Fields. Williams & Wilkins fourth Edition 1999, 1154.
 3.
Dietsch G: Fourier Analyse von Elektroenzephalogrammen des Menschen. Pflügers Arch Ges Physiol 1932, 230: 106–112.
 4.
Durka PJ, Blinowska KJ: A unified timefrequency parametrization of EEG. IEEE Engineering in Medicine and Biology Magazine 2001,20(5):47–53. 10.1109/51.956819
 5.
Bartnik EA, Blinowska KJ, Durka PJ: Single evoked potential reconstruction by means of wavelet transform. Biol Cybern 1992, 67: 175–181.
 6.
Metin Akay, editor: Time Frequency and Wavelets in Biomedical Signal Processing. IEEE Press Series in Biomedical Engineering IEEE Press 1997.
 7.
Stéphane Mallat, Zhifeng Zhang: Matching pursuit with timefrequency dictionaries. IEEE Transactions on Signal Processing 1993, 41: 3397–3415. 10.1109/78.258082
 8.
Rechtschaffen A, Kales A, editors: A manual of standardized terminology, techniques and scoring system for sleep stages in human subjects. Number 204 in National Institutes of Health Publications. US Government Printing Office, Washington DC 1968.
 9.
Blinowska KJ, Durka PJ: The application of Wavelet Transform and Matching Pursuit to the timevarying EEG signals. In, Intelligent Engineering Systems through Artificial Neural Networks (Edited by: Dagli CH, Fernandez BR). ASME Press, New York 1994, 4: 535–540.
 10.
Durka PJ, Blinowska KJ: Analysis of EEG transients by means of matching pursuit. Ann Biomed Eng 1995, 23: 608–611.
 11.
Żygierewicz J, Blinowska KJ, Durka PJ, Szelenberger W, Niemcewicz Sz, Androsiuk W: High resolution study of sleep spindles. Clin Neurophysiol 1999,110(12):2136–2147. 10.1016/S13882457(99)001753
 12.
Piotr Durka J, Ircha D, Blinowska KJ: Stochastic timefrequency dictionaries for matching pursuit. IEEE Tran Signal Process 2001,49(3):507–510. 10.1109/78.905866
 13.
Durka PJ, Szelenberger W, Blinowska KJ, Androsiuk W, Myszka M: Adaptive timefrequency parametrization in pharmaco EEG. Journal of Neuroscience Methods 2002, 117: 65–71. 10.1016/S01650270(02)000754
 14.
Pfurtscheller G, Lopes da Silva FH: Eventrelated EEG/MEG synchronization and desynchronization: Basic principles. Clin Neurophysiol 1999, 110: 1842–1857. 10.1016/S13882457(99)001418
 15.
Franaszczuk PJ, Bergey GK, Durka PJ, Eisenberg HM: Timefrequency analysis using the Matching Pursuit algorithm applied to seizures originating from the mesial temporal lobe. Electroencephalogr Clin Neurophysiol 1998, 106: 513–521. 10.1016/S00134694(98)000248
 16.
DeVore RA, Temlyakov VN: Some remarks on greedy algorithms. Advances in Computational Mathematics 1996, 5: 173–187.
 17.
Chen SS, Donoho DL, Saunders MA: Atomic decomposition by basis pursuit. SIAM Review 2001,43(1):129–159. 10.1137/S003614450037906X
 18.
Pati YC, Rezaiifar R, Krishnaprasad PS: Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition. In Proceedings of the 27th Annual Asilomar Conference on Signals, Systems and Computers November 1993
 19.
Davis G, Mallat S, Zhang Z: Adaptive timefrequency decompositions. SPIE Journal of Optical Engineering 1994,33(7):2183–2191.
 20.
Jaggi S, Karl WC, Mallat S, Willsky AS: High resolution pursuit for feature extraction. 1998.
 21.
Donoho DL, Huo X: Uncertainty principles and ideal atomic decomposition. IEEE Transactions on Information Theory 2001.,47(7):
 22.
Piotr Durka J, Ircha D, Neuper Ch, Pfurtscheller G: Timefrequency microstructure of eventrelated desynchronization and synchronization. Med Biol Eng Comput 2001,39(3):315–321.
 23.
Piotr Jerzy Durka: Timefrequency analyses of EEG. PhD thesis, Warsaw University 1996. [http://durka.info/dissert.html]
Author information
Authors’ original submitted files for images
Rights and permissions
About this article
Received
Accepted
Published
DOI
Keywords
 Matching Pursuit
 Slow Wave Activity
 Sleep Spindle
 Orthogonal Wavelet
 Gabor Function