Spectral subtraction denoising preprocessing block to improve P300-based brain-computer interfacing
 Mohammed J Alhaddad^{1},
 Mahmoud I Kamel^{1},
 Meena M Makary^{2},
 Hani Hargas^{1} and
 Yasser M Kadah^{2}
https://doi.org/10.1186/1475925X1336
© Alhaddad et al.; licensee BioMed Central Ltd. 2014
Received: 10 September 2013
Accepted: 28 March 2014
Published: 4 April 2014
Abstract
Background
The signals acquired in brain-computer interface (BCI) experiments are usually affected by complicated sampling, artifact and noise conditions. This has mandated the use of several preprocessing strategies to extract the meaningful components of the measured signals before they are passed along to further processing steps. Despite the success of present preprocessing methods in improving the reliability of BCI, there is still room for further improvement in performance.
Methods
A new preprocessing method for denoising P300-based brain-computer interface data is presented that allows better performance with a lower number of channels and blocks. The new denoising technique is based on a modified version of spectral subtraction denoising and works on each temporal signal channel independently, thus offering seamless integration with existing preprocessing and allowing low channel counts to be used.
Results
The new method is verified using experimental data and compared to the classification results of the same data without denoising and with denoising using the present wavelet shrinkage based technique. Enhanced performance was confirmed in different experiments, as quantitatively assessed using classification block accuracy as well as bit rate estimates.
Conclusion
The new preprocessing method based on spectral subtraction denoising offers superior performance to existing methods and has potential for practical utility as a new standard preprocessing block in BCI signal processing.
Keywords
Brain-computer interface, Spectral subtraction, Wavelet shrinkage, Signal denoising
Introduction
Brain-computer interfacing (BCI) is an important tool that allows direct reading of information from the subject’s brain activity by a computer. Such information can be used to perform actions controlled by the subject and hence provides an additional means of communication besides the normal communication channels present in healthy subjects. For patients with conditions such as muscular dystrophy, such means can be the only way of communication, and therefore its development and enhancement have been the focus of many research groups in the past decade.
The brain activity at different locations can be measured using different methods that include electroencephalography (EEG), magnetoencephalography (MEG), and some functional imaging modalities such as functional magnetic resonance imaging (fMRI). These techniques offer brain activity signal time courses that come from a particular location in the brain, with the resolution of such spatial localization ranging from a few signals for the whole brain (as with EEG) to a signal for each 1 mm^{3} voxel within the subject’s brain (as with fMRI). The complexity of such systems also ranges from a simple, relatively inexpensive electrode cap worn by the subject and attached to a relatively small processing unit, which provides very noisy signals while allowing subject mobility (as with EEG), to large, expensive high-field fMRI systems that offer excellent signal-to-noise ratio while restricting even the slightest subject motion during data acquisition. So, there is a clear tradeoff between the quality of the collected signals on one side and the mobility of the subject and the cost of the system on the other. Approaches that improve the quality of information from EEG-based systems through noise/artifact removal as well as more sophisticated analysis techniques would therefore allow this low-cost, mobile technology to achieve better practical utility.
Several research articles have addressed the problem of achieving higher quality EEG signals, for BCI applications and otherwise, with the aim of improving the signal-to-noise ratio (SNR). Two broad categories can be immediately recognized, namely spatial domain techniques and temporal domain techniques. In spatial domain techniques, the data from multiple spatially distinct channels are used to separate the true signal projected onto all channels from the noise, which is generally assumed to be independent among such channels. Such methods range from simple local spatial averaging to sophisticated variants of blind source separation methods such as independent component analysis [1–6]. On the other hand, temporal domain techniques attempt to find similarities within the time course of a single channel signal that can be used to identify and suppress the noise components in that signal. This can be done by many methods, from simple averaging of consecutive epochs to transform domain based filtering techniques that range from basic band-pass filtering [7, 8] to different variants of the wavelet shrinkage method [9–18]. A hybrid of spatial and temporal methods has also been recently proposed to take advantage of available channels and redundant signal epochs [19]. The predominant filtering method used in BCI today is basic band-pass filtering, which has become an essential part of the conventional BCI preprocessing chain.
Even though previous denoising methods have contributed significant improvements, there are still limitations that require further research. For example, spatial domain methods rely on the availability of many channels (or electrodes), which increases the cost and weight and causes loss of localization of EEG signals from the brain. Also, the integration of temporal domain methods into the preprocessing chain of BCI signals is yet to be done and is bound to increase the computational complexity, requiring more expensive digital backend hardware. Both approaches increase the power consumption of a portable BCI system because of additional channel front-ends or the heavier processing needed in the digital backend. Therefore, a technique that allows the use of a small set of channels and improves the performance of the BCI system beyond the present methods at a reasonable computational cost would be highly desirable.
The aim of this work is to develop a denoising method for P300-based brain-computer interface data that allows better performance to be obtained with a lower number of channels and blocks. The new method will be applied to experimental data and compared to the classification results of the same data using the same preprocessing and classification steps to allow direct comparison of results. Also, the new method will be compared to band-pass filtering and wavelet shrinkage based denoising as the relevant and widely used denoising methods at present. Performance in different experiments will be quantitatively assessed using classification block accuracy as well as bit rate. The computational complexity of the new method is also described and compared to previous methods.
Methods
The methodological approach followed in this work is to adopt spectral subtraction based signal denoising, an effective speech signal denoising method that was previously applied to fMRI signal denoising [20]. This method uses adaptive estimation of noise and does not assume a model for the true signal, and thus matches our problem well. Here, we derive the spectral subtraction method for EEG applications and point out the modifications to the previous work needed to meet our unique application requirements.
To solve the above problem and allow artifact-free use of spectral subtraction, we here propose a modified version of the spectral subtraction method in which the original signal is converted to an even-symmetric signal by concatenating the signal with its time-reversed version before using the discrete Fourier transform to estimate the power spectrum. This bears similarity to what is done in the widely used discrete cosine transform. This has two important implications that address the above two issues in the original method. First, the phase of this even-symmetric signal is expected to be zero for positive frequency amplitudes or π for negative ones. However, we observe a deterministic linear phase corresponding to a shift of ½ point, since the origin of symmetry of this signal lies in between the two middle points. This changes the role of the phase estimation in the original method to merely sign detection and compensation for the deterministic ½ point shift, yielding very high noise immunity. Second, the even-symmetric signal form ensures that continuity at both ends of the signal is preserved, thus eliminating edge artifacts. The block diagram of the modified version of spectral subtraction is presented in Figure 3. The result of using the modified spectral subtraction on the same signal in Figure 2 is shown in the bottom plot, where the artifact present in the old spectral subtraction method is completely absent in the new method. The detailed steps of implementation of the new method are given as follows:

Step 1: Read in the raw epoch data s(t) and convert it to a symmetric signal by concatenation with its reflected version s(−t).

Step 2: Compute the fast Fourier transform of the symmetric raw epoch data. Estimate and keep the linear phase of the result.

Step 3: Compute the periodogram-based estimate of the power spectrum as the squared magnitude of the fast Fourier transform computed in Step 2.

Step 4: Estimate the noise level by computing the average of the power spectrum values in the upper 20% of the frequency range, which contains no signal components.

Step 5: Use Equation (3) to compute the power spectrum of the denoised signal. If the subtraction result at any frequency is negative, it is clipped to zero.

Step 6: Compute the discrete Fourier transform of the denoised signal as the square root of the denoised signal power spectrum, and transform it back to the time-domain denoised signal after adding the deterministic linear phase estimated in Step 2.
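Steps 1 to 6 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names (`dft`, `modified_spectral_subtraction`, `mean_squared_error`) and the `noise_band` parameter are hypothetical, a direct O(n²) DFT stands in for an FFT routine so that the example needs only the standard library, and the always-zero Nyquist bin of the symmetric signal is left out of the noise average.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Direct O(n^2) DFT, standing in for an FFT routine to stay dependency-free."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def modified_spectral_subtraction(epoch, noise_band=0.2):
    """Denoise one single-channel epoch following Steps 1-6 (illustrative sketch)."""
    n = len(epoch)
    m = 2 * n
    # Step 1: even-symmetric signal, s(t) followed by its reflection.
    symmetric = list(epoch) + list(reversed(epoch))
    # Step 2: transform; the half-sample symmetry offset gives a known linear phase.
    spectrum = dft(symmetric)
    linear_phase = [cmath.exp(1j * math.pi * k / m) for k in range(m)]
    # With the linear phase removed the amplitude is real, so only its sign is kept.
    signs = [1.0 if (spectrum[k] / linear_phase[k]).real >= 0.0 else -1.0
             for k in range(m)]
    # Step 3: periodogram estimate of the power spectrum.
    power = [abs(c) ** 2 for c in spectrum]
    # Step 4: noise level = mean power over the upper part of the band 0..Nyquist.
    start = min(int(math.ceil((1.0 - noise_band) * n)), n - 1)
    noise_level = sum(power[start:n]) / (n - start)
    # Step 5: subtract the noise level and clip negative results to zero.
    clean_power = [max(p - noise_level, 0.0) for p in power]
    # Step 6: square root, restore sign and linear phase, invert, keep first half.
    clean_spectrum = [signs[k] * math.sqrt(clean_power[k]) * linear_phase[k]
                      for k in range(m)]
    restored = dft(clean_spectrum, inverse=True)
    return [restored[t].real for t in range(n)]


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic smooth bump plus white Gaussian noise (not real EEG data).
    clean = [math.exp(-((t - 30) / 8.0) ** 2) for t in range(64)]
    noisy = [c + rng.gauss(0.0, 0.3) for c in clean]
    denoised = modified_spectral_subtraction(noisy)
    print("MSE before:", mean_squared_error(noisy, clean))
    print("MSE after: ", mean_squared_error(denoised, clean))
```

Because each epoch is handled on its own, the same function would simply be called once per channel; in practice an FFT library routine would replace `dft`.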
EEG signal noise power spectrum estimation
In order to implement the above denoising strategy, the noise power spectrum has to be estimated. Given that the noise model is Gaussian white noise, its power spectrum is well known to be constant over all frequencies and directly proportional to the noise variance. Hence, it is sufficient to estimate a single parameter in order to completely determine the noise power spectrum.
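For reference, the relations behind this single-parameter estimate can be written out as follows. This is the standard textbook form of power spectral subtraction under an additive, uncorrelated white noise model; the notation is an assumption of this sketch (d(t) for the underlying signal, n(t) for the noise, N_0 for the flat noise level, Ω_h for the upper 20% of the frequency range) and may differ from the paper's own Equation (3), which is not reproduced in this text.

```latex
% Assumed notation: s(t) measured epoch, d(t) underlying signal, n(t) white Gaussian noise
s(t) = d(t) + n(t), \qquad
P_s(\omega) = P_d(\omega) + P_n(\omega), \qquad
P_n(\omega) = N_0 \propto \sigma_n^2

% Single-parameter noise estimate over the signal-free band \Omega_h, then subtraction with clipping
\hat{N}_0 = \frac{1}{|\Omega_h|} \sum_{\omega \in \Omega_h} P_s(\omega), \qquad
\hat{P}_d(\omega) = \max\bigl( P_s(\omega) - \hat{N}_0,\; 0 \bigr)
```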
Experimental verification
In this work, the data of Hoffmann et al. [21] were used to test the developed denoising method and to compare it to both the case of no denoising and the case of wavelet shrinkage denoising [11, 15]. We followed the exact same sequence of preprocessing and classification as in [21] to allow direct comparison between preprocessing with and without the denoising step. The data set is described in detail in [21], but a summary is provided here. The duration of one run was approximately one minute, and the duration of one session, including setup of electrodes and short breaks between runs, was approximately 30 min. One session comprised on average 810 trials, and the whole data for one subject consisted on average of 3240 trials. The impact of different electrode configurations and machine learning algorithms on classification accuracy was tested in an offline procedure. For each subject, four-fold cross-validation was used to estimate average classification accuracy. The preprocessing operations applied were: referencing, band-pass filtering with cutoff frequencies set to 1.0 Hz and 12.0 Hz, downsampling by a factor of 64, single-trial extraction, winsorizing and finally amplitude normalization. The number of electrodes was selected as 4, 8, 16 or 32 depending on the experiment, with the same electrode configurations as in [21]. Then, feature vectors were constructed by concatenating the samples from the selected electrodes. The dimensionality of the feature vectors was N_{e} × N_{t}, where N_{e} denotes the number of electrodes (selected as 4, 8, 16, or 32) and N_{t} denotes the number of temporal samples in one trial (32 samples in our experiments). Classification of data was performed using Bayesian linear discriminant analysis (BLDA), and the software developed by [21] was used to perform this step.
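The last stages of this chain (winsorizing, amplitude normalization and feature vector construction) can be illustrated with a small sketch. It is a simplified illustration rather than the software of [21]: the function names are hypothetical, and the 10th/90th percentile clipping limits and the [-1, 1] target interval are assumptions, since the text above does not state them.

```python
def winsorize(samples, lower_pct=10.0, upper_pct=90.0):
    """Clip values outside the given percentiles (the percentile choice is an assumption)."""
    ordered = sorted(samples)
    last = len(ordered) - 1
    low = ordered[int(round(last * lower_pct / 100.0))]
    high = ordered[int(round(last * upper_pct / 100.0))]
    return [min(max(v, low), high) for v in samples]


def normalize_amplitude(samples):
    """Scale values linearly into [-1, 1] (the target interval is an assumption)."""
    low, high = min(samples), max(samples)
    if high == low:
        return [0.0 for _ in samples]
    return [2.0 * (v - low) / (high - low) - 1.0 for v in samples]


def build_feature_vector(trial):
    """Concatenate N_e channels of N_t samples each into one vector of length N_e * N_t."""
    return [value for channel in trial for value in channel]


# Synthetic trial with N_e = 4 electrodes and N_t = 32 samples (not real EEG data).
trial = [[float(e * 100 + t) for t in range(32)] for e in range(4)]
processed = [normalize_amplitude(winsorize(channel)) for channel in trial]
features = build_feature_vector(processed)
print(len(features))  # N_e * N_t
```

With 4 electrodes and 32 samples per trial the feature vector has 4 × 32 = 128 entries; with 32 electrodes it has 1024.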
Given that the original signal passed through the standard preprocessing chain including the band-pass filter, comparing the results of the different methods to it includes band-pass filter based denoising in the comparison. For the wavelet denoising, standard wavelet shrinkage denoising was performed in Matlab with the basic wavelet chosen as “Coiflet3”, as suggested by [15] for direct comparison, noting that we were able to get similar results using other basic wavelet functions (e.g., Daubechies8). The universal threshold was selected with no multiplicative threshold rescaling [15].
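The thresholding rule used in the wavelet comparison can be sketched as follows. This covers only the shrinkage step, not the wavelet transform itself, which would come from a wavelet library. The function names are hypothetical, the universal threshold is written in its usual form sqrt(2 ln N), "no multiplicative rescaling" is read here as a noise scale of 1, and the choice of soft (rather than hard) thresholding is an assumption, since the text does not state it.

```python
import math


def universal_threshold(n_samples, noise_scale=1.0):
    """Universal threshold sqrt(2 ln N); noise_scale=1.0 means no rescaling (assumed reading)."""
    return noise_scale * math.sqrt(2.0 * math.log(n_samples))


def soft_threshold(coefficients, threshold):
    """Shrink each coefficient toward zero by the threshold; small ones become zero."""
    shrunk = []
    for c in coefficients:
        if c > threshold:
            shrunk.append(c - threshold)
        elif c < -threshold:
            shrunk.append(c + threshold)
        else:
            shrunk.append(0.0)
    return shrunk


# Hypothetical detail coefficients for a 32-sample trial.
threshold = universal_threshold(32)
print(soft_threshold([3.0, -0.5, -4.0, 1.0], threshold))
```

In a full pipeline these two functions would be applied to the detail coefficients of each decomposition level before the inverse wavelet transform.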
Results and discussion
It can be observed that the block accuracy results for 4-channel data (plotted in red) show a significant improvement over the original data for both the spectral subtraction and wavelet shrinkage methods when the number of blocks is low. This is also reflected as higher bit rates in the same range. Even though the effect of denoising in general is more apparent in experiments with a lower number of channels and a low number of blocks, there is still evident improvement in experiments with a high number of channels, where 100% accuracy is reached earlier in all cases. This indicates that the inherent spatial compounding from the many electrodes can still take advantage of temporal denoising methods and that a combination of the two yields the best results.
By inspecting the results further, we observe that the spectral subtraction method offers better results than wavelet shrinkage based denoising in most experiments, with the exception of a few cases such as the 4-channel data of Subject 2, where the 100% accuracy is maintained once reached with wavelet denoising but not with spectral subtraction. Nevertheless, in all other cases the spectral subtraction results are superior, as evident in the achieved block accuracy and bit rate for any given experiment. As a general observation, the results of the spectral subtraction and wavelet denoising methods show a clear advantage over the results with only band-pass filtering in the original signal. Since such a denoising step can be inserted within the conventional preprocessing of BCI data, this study shows clear evidence that these more sophisticated denoising methods should be integrated as a standard step in the preprocessing chain to improve the SNR of the collected signals.
Assuming a data set of M channels with N points each, the computational complexity of spectral subtraction is O(M N log_{2} N). On the other hand, the computational complexity of the wavelet shrinkage method varies with the implementation, with a minimum complexity of O(M N^{2}), which is significantly higher. For example, for N = 100,000 points and the same number of channels, the wavelet shrinkage method will require N/log_{2}(N) times the computations of spectral subtraction, which is more than 3 orders of magnitude higher. Therefore, spectral subtraction is the more computationally efficient choice for applications requiring embedded implementations or real-time processing.
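The ratio quoted above is easy to check numerically. The short sketch below takes the two complexity figures exactly as stated in the text (constant factors are ignored, and the function names are hypothetical) and evaluates their ratio for N = 100,000.

```python
import math


def spectral_subtraction_cost(m_channels, n_points):
    """Operation count proportional to M * N * log2(N), as stated in the text."""
    return m_channels * n_points * math.log2(n_points)


def wavelet_shrinkage_cost(m_channels, n_points):
    """Operation count proportional to M * N^2, the minimum figure stated in the text."""
    return m_channels * n_points ** 2


# The channel count cancels in the ratio, so any M gives the same result.
ratio = wavelet_shrinkage_cost(4, 100_000) / spectral_subtraction_cost(4, 100_000)
print(ratio, math.log10(ratio))
```

The ratio is about 6.0 × 10^3, that is, between 3 and 4 orders of magnitude, consistent with the statement above.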
The model used in data processing amounts to subtracting the noise component uniformly across all frequencies. This is different from conventional frequency selective filters, which are equivalent to a convolution in the time domain that causes the noise components at different time points to be correlated in the output signal. Hence, a theoretical advantage of this method is that it preserves the independence of random components within the time points processed, which makes it well suited for use with standard statistical analysis methods that require statistical independence of samples. An example is improving statistical estimation by using data from multiple blocks, where the presence of correlated rather than independent noise across blocks degrades the achievable improvement. Given that the wavelet shrinkage based methods involve frequency selective filters to compute their coefficients, the same advantage cannot be claimed for them. This explains the overwhelmingly better performance of the spectral subtraction method relative to the wavelet shrinkage based method when the number of blocks is higher.
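The statement about frequency selective filters can be illustrated with synthetic data. The sketch below is only an illustration of the convolution effect, not an analysis from the paper: it passes white Gaussian noise through a 3-point moving average (a simple low-pass filter chosen here as an assumption) and compares the lag-1 autocorrelation before and after filtering. The function names are hypothetical.

```python
import random


def lag1_autocorrelation(x):
    """Sample autocorrelation at lag 1 of a sequence."""
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    numerator = sum(a * b for a, b in zip(centered, centered[1:]))
    denominator = sum(v * v for v in centered)
    return numerator / denominator


def moving_average(x, width=3):
    """Simple FIR low-pass filter: convolution with a rectangular window."""
    return [sum(x[i:i + width]) / width for i in range(len(x) - width + 1)]


rng = random.Random(0)
white = [rng.gauss(0.0, 1.0) for _ in range(5000)]
smoothed = moving_average(white)
print("white noise lag-1:   ", lag1_autocorrelation(white))
print("filtered noise lag-1:", lag1_autocorrelation(smoothed))
```

For independent input samples, a 3-point moving average gives a theoretical lag-1 autocorrelation of 2/3, so neighbouring output samples are no longer independent even though the input samples were.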
Conclusions
In this work, a new denoising method for P300-based brain-computer interface data that allows better performance to be obtained with a lower number of channels and blocks was developed. The new method was verified using experimental data, and promising improved results were obtained. The new method compared favorably to band-pass filtering and wavelet shrinkage based denoising, the relevant and widely used denoising methods at present. Performance in different experiments, assessed using classification block accuracy as well as bit rate, shows significant improvement with a clear advantage in computational complexity. The results highlight the potential for including the new method as a standard preprocessing block for BCI data.
Declarations
Acknowledgements
Many thanks go to all the subjects who volunteered to participate in the experiments described in this paper. We would like to thank our team for their efforts in the BCI project. Special thanks to Prof. Aravinda Prasad Sistla, Computer Science Department, College of Engineering, The University of Illinois at Chicago. This research was funded by the Deanship of Scientific Research (DSR), King Abdulaziz University, under grant No. (16151432 HiCi). The authors, therefore, acknowledge with thanks DSR technical and financial support.
References
1. Ramirez RR, Kopell BH, Butson CR, Hiner BC, Baillet S: Spectral signal space projection algorithm for frequency domain MEG and EEG denoising, whitening, and source imaging. NeuroImage 2011, 56: 78–92. 10.1016/j.neuroimage.2011.02.002
2. de Cheveigné A, Simon JZ: Denoising based on spatial filtering. J Neurosci Methods 2008, 171: 331–339. 10.1016/j.jneumeth.2008.03.015
3. Piresa G: Statistical spatial filtering for a P300-based BCI: tests in able-bodied, and patients with cerebral palsy and amyotrophic lateral sclerosis. J Neurosci Methods 2011, 195: 270–281. 10.1016/j.jneumeth.2010.11.016
4. Vorobyov S, Cichocki A: Blind noise reduction for multisensory signals using ICA and subspace filtering, with application to EEG analysis. Biol Cybern 2002, 86: 293–303. 10.1007/s0042200102986
5. Akhtar MT, Mitsuhashi W, James CJ: Employing spatially constrained ICA and wavelet denoising, for automatic removal of artifacts from multichannel EEG data. Signal Process 2012, 92: 401–416. 10.1016/j.sigpro.2011.08.005
6. Geetha G, Geethalakshmi SN: Artifact removal from EEG using spatially constrained independent component analysis and wavelet denoising with Otsu’s thresholding technique. Procedia Eng 2012, 30: 1064–1071.
7. Hammon PS, de Sa VR: Assessment of Preprocessing on Classifiers Used in the P300 Speller Paradigm. In Proceedings of the 28th IEEE EMBS Annual International Conference of IEEE ISBN: 1–4244–0032–5, Aug 30-Sept 3 2006. New York City, USA: IEEE Press; 2006:1319–1322.
8. Hammon PS, de Sa VR: Preprocessing and meta-classification for brain-computer interfaces. IEEE Trans Biomed Eng 2007, 54: 518–525.
9. Romo Vázquez R, Vélez-Pérez R, Ranta R, Louis Dorr V, Maquin D, Maillard K: Blind source separation, wavelet denoising and discriminant analysis for EEG artefacts and noise cancelling. Biomed Signal Process Control 2012, 7: 389–400. 10.1016/j.bspc.2011.06.005
10. Ahmadi M, Quian Quiroga R: Automatic denoising of single-trial evoked potentials. NeuroImage 2013, 66: 672–680.
11. Donoho DL: Denoising by soft-thresholding. IEEE Trans Inf Theory 1995, 42: 613–627.
12. Quian Quiroga R, Garcia H: Single-trial event-related potentials with wavelet denoising. Clin Neurophysiol 2003, 114: 376–390. 10.1016/S13882457(02)003656
13. Effern A, Lehnertz K, Grunwald T, Fernandez G, David P, Elger CE: Time adaptive denoising of single trial event-related potentials in the wavelet domain. Psychophysiology 2000, 37: 859–865. 10.1111/14698986.3760859
14. Gao J, Sultan H, Hu J, Tung WW: Denoising nonlinear time series by adaptive filtering and wavelet shrinkage: a comparison. IEEE Signal Process Lett 2010, 17: 237–240.
15. Saavedra C, Bougrain L: Wavelet denoising for P300 single-trial detection. In Proceedings of the 5th French conference on computational neuroscience (Neurocomp’10). Lyon, France: NeuroComp; 2010:227–231.
16. Hammad S, Corazzol M, Kamavuako EN, Jensen W: Wavelet denoising and ANN/SVM decoding of a self-paced forelimb movement based on multi-unit intracortical signals in rats. In Proceedings of 2012 IEEE EMBS Conference on Biomedical Engineering and Sciences (IECBES). Langkawi, Malaysia: IEEE Press; 2012:990–994.
17. Sammaiah A, Narsimha B, Suresh E, Reddy M: On the performance of wavelet transform improving Eye blink detections for BCI. In Proceedings of 2011 International Conference on Emerging Trends in Electrical and Computer Technology (ICETECT). Tamil Nadu, India: IEEE Press; 2011:800–804.
18. Estrada E, Nazeran H, Sierra G, Ebrahimi F, Setarehdan SK: Wavelet-based EEG denoising for automatic sleep stage classification. In Proceedings of 21st International Conference on Electrical Communications and Computers (CONIELECOMP). Puebla, Mexico: IEEE Press; 2011:295–298.
19. Tu Y, Huang G, Hung YS, Hu L, Hu Y, Zhang Z: Single-trial Detection of Visual Evoked Potentials by Common Spatial Patterns and Wavelet Filtering for Brain-computer Interface. In Proceedings of 35th Annual International Conference of the IEEE EMBS, ISBN: 978–14577–0216–7, 3-7 July 2013. Osaka, Japan: IEEE Press; 2013:2882–2885.
20. Kadah Y: Adaptive denoising of event-related functional magnetic resonance imaging data using spectrum subtraction. IEEE Trans Biomed Eng 2004, 51: 1944–1953. 10.1109/TBME.2004.831525
21. Hoffmann U, Vesin JM, Ebrahimi T, Diserens K: An efficient P300-based brain–computer interface for disabled subjects. J Neurosci Methods 2008, 167: 115–125. 10.1016/j.jneumeth.2007.03.005
22. Thompson DE, Blain-Moraes S, Huggins JE: Performance assessment in brain-computer interface-based augmentative and alternative communication. BioMed Eng OnLine 2013, 12: 43. 10.1186/1475925X1243
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.