Real-time feature extraction of P300 component using adaptive nonlinear principal component analysis

Turnip, Arjon; Hong, Keum-Shik; Jeong, Myung-Yung

doi:10.1186/1475-925X-10-83

Research
Open access
Published: 23 September 2011

BioMedical Engineering OnLine volume 10, Article number: 83 (2011)


Abstract

Background

The electroencephalography (EEG) signals are known to involve the firings of neurons in the brain. The P300 wave is a high potential caused by an event-related stimulus. The detection of P300s included in the measured EEG signals is widely investigated. The difficulties in detecting them are that they are mixed with other signals generated over a large brain area and their amplitudes are very small due to the distance and resistivity differences in their transmittance.

Methods

A novel real-time feature extraction method for detecting P300 waves by combining an adaptive nonlinear principal component analysis (ANPCA) and a multilayer neural network is proposed. The measured EEG signals are first filtered using a sixth-order band-pass filter with cut-off frequencies of 1 Hz and 12 Hz. The proposed ANPCA scheme consists of four steps: pre-separation, whitening, separation, and estimation. In the experiment, four different inter-stimulus intervals (ISIs) are utilized: 325 ms, 350 ms, 375 ms, and 400 ms.

Results

The multi-stage principal component analysis method applied at the pre-separation step reduced external noise and artifacts significantly. The adaptive law introduced in the whitening step made the subsequent algorithm in the separation step converge quickly. The separation performance index varied from -20 dB to -33 dB owing to the randomness of the source signals. The robustness of the ANPCA against background noise was evaluated by comparing its separation performance indices with those of four other algorithms (NPCA, NSS-JD, JADE, and SOBI); the ANPCA algorithm demonstrated the shortest iteration time, with a performance index of about 0.03. On this basis, it is asserted that the ANPCA algorithm successfully separates mixed source signals.

Conclusions

The independent components produced from the observed data using the proposed method illustrated that the extracted signals were clearly the P300 components elicited by task-related stimuli. The experiment using 350 ms ISI showed the best performance. Since the proposed method does not use down-sampling and averaging, it can be used as a viable tool for real-time clinical applications.

Background

The first recording of the electric field of a human brain was made by the German psychiatrist Hans Berger in Jena, Germany, in 1924. He named the recorded signals electroencephalograms (EEGs) [1]. Over the past few decades, this signal has attracted considerable interest in the study of cognitive processes in both clinical [2–9] and research areas [10–16]. Its main advantages are non-invasive measurement, superior temporal resolution, easy implementation, and low cost [17, 18]. An event-related potential (ERP), as a derivative of the EEG, is a measured brain response resulting directly from a thought or perception. In 1964 and 1965, respectively, two groups (Chapman and Bragdon [19] and Sutton et al. [20]) independently discovered the P300 component (a wave peak approximately 300 milliseconds (ms) after a task-relevant stimulus). Recently, a great variety of potential applications of the ERP-based P300 component have been studied [21–26].

Ideally, the EEG machine records, along the scalp, the electrical activity generated by the firing of neurons within the brain. The problem is that EEG signals contain the activity of neurons located at significant distances from the sensors (electrodes). Therefore, given the distance between the electrode and the neuronal activity, the EEG signal collected at any point on a person's scalp is a nonlinear mixture of the activities generated over a large brain area. In this paper, the recorded EEG data are assumed to be a linear mixture of neuronal activities for simplicity. Because the potentials of interest typically have a low amplitude and a low signal-to-noise ratio (SNR), the removal of other biological signals is one of the major challenges in the study of ERPs. To resolve this problem, down-sampling and averaging of EEG data over multiple trials are usually required. However, down-sampling can cause some signals to become indistinguishable and distorted, which alters the original characteristics of the waveform. Also, the averaging method assumes that the signals are long-time stationary and deterministic relative to the stimulus onset. This assumption might cause a loss of time resolution, specifically for dissimilar trials. Moreover, the stationarity and determinacy assumption on EEG signals might not hold, because one must consider other factors such as maturation, age, sex, state of consciousness, and psychiatric and neurological disorders [27].

In this paper, a more efficient means of feature extraction is developed to cope with the drawbacks of the down-sampling and averaging methods. Previous research has shown that several aspects of the ERP (especially the latency, magnitude, and topography) are highly variable across trials [27, 28]. Many techniques [29–33] that have appeared in the research literature for EEG analysis (specifically, for obtaining P300 components) are not sufficiently standardized for clinical usage. Moreover, those techniques have usually been performed off-line. In this paper, a real-time feature extraction method for P300 components using an adaptive nonlinear principal component analysis (ANPCA) incorporating a multilayer neural network (MNN) is proposed. The MNN technique has been widely adopted in the fields of information and neural sciences (e.g., feature extraction, classification, and modeling) [34–39]. The experimental results in this paper show that the proposed method achieves a very significant improvement in extracting P300 components.

The main contributions of this paper are the following. (i) The multi-stage principal component analysis (PCA) applied at the pre-separation step reduces external noise and artifacts significantly, and separates the colored sources in the measured EEG signals. (ii) The adaptive rule designed for the whitening step makes the subsequent separation algorithm converge quickly. (iii) The combination of the proposed ANPCA method and the MNN for feature extraction can identify the P300 components in real-time (i.e., without down-sampling and averaging). (iv) Furthermore, the proposed method can become a viable tool in both research and clinical applications.

Methods

Data acquisition

Figures 1(a) and 1(b) show the overall schematic and block diagram, respectively, of the proposed real-time feature extraction method. Two master's students and five Ph.D. students (all male, age 32 ± 5 years, none of whom had any known neurological deficits) participated in the experiment. A seven-choice signal paradigm (i.e., forward, turn right, turn left, backward, backward right, backward left, and stop) is used to stimulate the seven subjects. They sit in a comfortable chair in front of a computer monitor located 60 cm from their eyes. The subjects are asked to count silently the number of flashes of a preselected image on the screen while imagining a car moving in the direction of the flashed signal. Four seconds after a starting tone, seven different images flash in random order, one image at a time. A software program (E-Prime 2.0, developer: Schneider, Sharpsburg-USA) is employed for presenting stimuli.

The left-hand side box in Figure 1(a) shows a g-MOBIlab+ biosignal acquisition device (Christoph Guger, Austria), with which the EEG signals are recorded continuously and digitized at a 256 Hz sampling rate. Figure 2 depicts the positioning of the eight electrodes (channels) at Fz, Cz, Pz, Oz, P7, P3, P4, and P8, following the 10-20 International System [40] with a linked-ears reference. The ground electrode is placed at the center of the forehead. The impedance at each location is kept below 5 kΩ. The participants are instructed to avoid eye and head movements during the EEG recording. Each subject records four sessions, one for each of four image-flash durations (i.e., 25 ms, 50 ms, 75 ms, and 100 ms), with each flash followed by a 300 ms blank screen. Hence, the inter-stimulus intervals (ISIs) in this work range from 325 ms to 400 ms.

Real-time feature extraction

Let M be the number of measured EEG signals and N be the number of unknown input sources. Then, the measured signal at channel i, x_i(k), can be represented as a linear combination of N unknown, mutually statistically independent source signals s_j(k), j = 1, 2, ..., N, as follows (typically M ≥ N) [41, 42].

x_{i} (k) = \sum_{j = 1}^{N} a_{i j} s_{j} (k) + n_{i} (k),

(1)

or in matrix form,

x (k) = A s (k) + n (k),

(2)

where x(k) = [x₁(k), x₂(k), ..., x_M(k)]^T ∈ R^M is the vector of EEG signals, A ∈ R^{M × N} with entries a_ij is the unknown M × N mixing matrix, s(k) = [s₁(k), s₂(k), ..., s_N(k)]^T ∈ R^N is the unknown vector of colored source signals, and n(k) ∈ R^M is the vector of additive noise. The objective of this work is to estimate both A and s(k). The following assumptions are made: the individual components of the source vector s(k) are statistically independent of one another; the matrix A is invertible and has full rank; each component of s(k) is stationary; and the noise vector n(k) is white with a Gaussian distribution. The P300 extraction proceeds in the following steps: pre-separation, whitening, separation, and estimation, without ignoring the additive noise signal n(k).
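The mixing model in (2) can be illustrated with synthetic data. The following is a minimal sketch, assuming Gaussian test sources and a random mixing matrix; the function name `mix_sources` and all numerical values are illustrative and do not come from the experiment.

```python
import numpy as np

def mix_sources(A, s, noise_std=0.0, rng=None):
    """Return x(k) = A s(k) + n(k) for every sample k (columns of s)."""
    rng = np.random.default_rng() if rng is None else rng
    M = A.shape[0]
    n = noise_std * rng.standard_normal((M, s.shape[1]))  # white Gaussian noise n(k)
    return A @ s + n

rng = np.random.default_rng(0)
N, M, K = 3, 8, 1000                  # sources, channels, samples (illustrative values)
s = rng.standard_normal((N, K))       # hypothetical source signals s(k)
A = rng.standard_normal((M, N))       # hypothetical mixing matrix
x = mix_sources(A, s, noise_std=0.1, rng=rng)
```

Each row of `x` plays the role of one EEG channel; setting `noise_std=0` reduces the model to the noise-free mixture A s(k).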

Pre-separation step

The pre-separation step uses a multi-stage PCA to separate the sources and also to reduce external noises and artefacts from the measured signal vector. The eigenvalue decomposition of the correlation matrix R _xx of the measured signal x(k) is given by [42]

R_{x x} = E {x (k) x^{T} (k)} = V Λ V^{T},

(3)

where Λ ∈ R^{M × N} is a pseudo-diagonal matrix. On the basis of the largest eigenvalues, the spatial whitening procedure can be written as

\bar{x} (k) = B x (k) = Λ_{j}^{- 1 ∕ 2} V_{j}^{T} x (k),

(4)

where Λ_j = diag{λ₁, λ₂, ..., λ_N} with λ₁ ≥ λ₂ ≥ ... ≥ λ_N and V_j = {v₁, v₂, ..., v_N} ∈ R^{N × M}. The PCA is then performed on a new vector of signals, which is defined as [41, 42]

\tilde{x} (k) = \bar{x} (k) + \bar{x} (k - τ),

(5)

where τ is an arbitrary time delay. The covariance matrix of the vector $\tilde{x} (k)$ is expressed as

\begin{gathered} R_{\tilde{x} \tilde{x}} = R_{\tilde{x}} (0) = E {\tilde{x} (k) {\tilde{x}}^{T} (k)} \\ = 2 R_{\bar{x}} (0) + R_{\bar{x}} (τ) + R_{\bar{x}}^{T} (τ), \end{gathered}

(6)

where $R_{\bar{x} \bar{x}} = R_{\bar{x}} (0) = E {\bar{x} (k) {\bar{x}}^{T} (k)} = H R_{s s} H^{T} = I$, under the assumption that H = BA is orthogonal and R_ss = I, and

R_{\bar{x}} (τ) = E {\bar{x} (k) {\bar{x}}^{T} (k - τ)} = H R_{s} (τ) H^{T} .

(7)

Hence, the matrix decomposition can be written

R_{\tilde{x} \tilde{x}} = H D (τ) H^{T} = V_{\tilde{x}} Λ_{\tilde{x}} V_{\tilde{x}}^{T},

(8)

where D(τ) is a diagonal matrix expressed as

D (τ) = 2 I + R_{s} (τ) + R_{s}^{T} (τ),

(9)

with diagonal elements d_ii(τ) = 2(1 + E{s_i(k) s_i(k-τ)}). If the diagonal elements are distinct, the eigenvalue decomposition is unique. Thus, the mixing matrix and the input vector $\overset{⌣}{x} (k)$, respectively, can be estimated as $A = B^{+} V_{\tilde{x}}$ and

\overset{⌣}{x} (k) = V_{\tilde{x}}^{T} \bar{x} (k) = V_{\tilde{x}}^{T} B x (k) .

(10)
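The batch form of (3)-(10) can be sketched as follows, under the assumptions that the data are noise-free, that M = N, and that the sources are unit-variance AR(1) processes with distinct lag-τ autocorrelations (so that the diagonal elements of D(τ) are distinct). The helper names `ar1` and `pre_separate` and the test mixing matrix are hypothetical.

```python
import numpy as np

def ar1(a, n, rng):
    """Unit-variance AR(1) test source with lag-1 autocorrelation a."""
    e = rng.standard_normal(n) * np.sqrt(1.0 - a**2)
    s = np.zeros(n)
    for k in range(1, n):
        s[k] = a * s[k - 1] + e[k]
    return s

def pre_separate(x, tau=1):
    """Whiten x (eq. 4), then diagonalise the covariance of x_bar(k) + x_bar(k - tau)."""
    x = x - x.mean(axis=1, keepdims=True)
    lam, V = np.linalg.eigh(x @ x.T / x.shape[1])                  # eq. (3)
    B = np.diag(lam ** -0.5) @ V.T                                 # eq. (4)
    x_bar = B @ x
    x_til = x_bar[:, tau:] + x_bar[:, :-tau]                       # eq. (5)
    _, V_til = np.linalg.eigh(x_til @ x_til.T / x_til.shape[1])    # eq. (8)
    return V_til.T @ x_bar, V_til.T @ B                            # eq. (10), demixing matrix

rng = np.random.default_rng(1)
S = np.vstack([ar1(a, 20000, rng) for a in (0.9, 0.5, -0.5)])
A = np.array([[1.0, 0.5, 0.2], [0.3, 1.0, 0.4], [0.2, 0.3, 1.0]])
x_sep, C = pre_separate(A @ S)
G = np.abs(C @ A)   # close to a scaled permutation matrix when separation succeeds
```

Each row of `G` should contain one dominant entry; when two sources share the same autocorrelation at lag τ, the corresponding eigenvalues coincide and this step alone cannot separate them.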

Assume that the process $\overset{⌣}{x} (k) \in C^{M}$ is a zero-mean sequence whose covariance matrix is defined as in (3), and that its complex-valued eigenvectors v_i and the corresponding principal components (PCs) are to be extracted in real-time. Employing a self-supervising principle and a hierarchical neural network architecture, the PCs (${\overset{⌣}{x}}_{i}$) are extracted sequentially as

{\overset{⌣}{x}}_{i} = v_{i}^{T} x = \sum_{p = 1}^{M} v_{i p} x_{p} (t) .

(11)

The vector v _i should be determined in such a way that the reconstructed vector $\bar{x} = v_{i}^{*} {\overset{⌣}{x}}_{i}$ will reproduce the input vector $\overset{⌣}{x} (t)$ according to a suitable optimization. For this purpose, let us define a complex-valued instantaneous error vector as

\begin{gathered} e_{i} (t) = {[e_{i 1} (t), e_{i 2} (t), \dots, e_{i M} (t)]}^{T} \\ = x (t) - \bar{x} (t) = x (t) - v_{i}^{*} \overset{⌣}{x} (t) \\ = (I - v_{i} v_{i}^{H}) x (t) = e_{i}^{R} (t) + j e_{i}^{I} (t), \end{gathered}

(12)

where I is the identity matrix, $e_{i}^{R} (t)$ and $e_{i}^{I} (t)$ are the real and imaginary parts of the error vector e_i(t), respectively, and $j = \sqrt{- 1}$. In order to find the optimal value of the vector v_i, the following standard 2-norm cost function is defined.

\begin{gathered} E_{i} (v_{i}) = \frac{1}{2} ({∥e_{i}^{R}∥}_{2}^{2} + {∥e_{i}^{I}∥}_{2}^{2}) \\ = \frac{1}{2} (\sum_{p = 1}^{M} {(e_{i p}^{R})}^{2} + \sum_{p = 1}^{M} {(e_{i p}^{I})}^{2}), \end{gathered}

(13)

where $e_{i p}^{R}$ is the p th element of $e_{i}^{R}$ . The minimization of the cost function (13), according to the standard gradient descent approach for the real and imaginary parts of the vector $v_{i} = v_{i}^{R} + j v_{i}^{I}$ , leads to a set of differential equations as

\begin{gathered} \frac{d v_{i p}^{R}}{d t} = - β_{i} \frac{\partial E_{i} (v_{i})}{\partial v_{i p}^{R}} \\ = β_{i} (E_{1} + x_{p}^{R} \sum_{h = 1}^{M} E_{2} + x_{p}^{I} \sum_{h = 1}^{M} E_{3}), \end{gathered}

(14)

\begin{gathered} \frac{d v_{i p}^{I}}{d t} = - β_{i} \frac{\partial E_{i} (v_{i})}{\partial v_{i p}^{I}} \\ = β_{i} (E_{4} + x_{p}^{R} \sum_{h = 1}^{M} E_{3} - x_{p}^{I} \sum_{h = 1}^{M} E_{2}), \end{gathered}

(15)

where β _i > 0 is the learning rate, $E_{1} = e_{i p}^{R} {\overset{⌣}{x}}_{i}^{R} + e_{i p}^{I} {\overset{⌣}{x}}_{i}^{I}$ , $E_{2} = e_{i h}^{R} v_{i h}^{R} - e_{i h}^{I} v_{i h}^{I}$ , $E_{3} = e_{i h}^{R} v_{i h}^{I} + e_{i h}^{I} v_{i h}^{R}$ , $E_{4} = e_{i p}^{R} {\overset{⌣}{x}}_{i}^{I} - e_{i p}^{I} {\overset{⌣}{x}}_{i}^{R}$ , ${\overset{⌣}{x}}_{i} \overset{Δ}{=} {\overset{⌣}{x}}_{i}^{R} + j {\overset{⌣}{x}}_{i}^{I}$ , and $e_{i p} \overset{Δ}{=} e_{i p}^{R} + j e_{i p}^{I}$ . Combining (14) and (15) and taking into account that $v_{i p} \overset{Δ}{=} v_{i p}^{R} + j v_{i p}^{I}$ , the adaptation law for updating the parameters is obtained as

\begin{gathered} \frac{d v_{i p} (t)}{d t} = β_{i} (t) ({\overset{⌣}{x}}_{i} (t) e_{i p}^{*} (t) + \\ x_{p}^{*} (t) \sum_{h = 1}^{M} v_{i h} (t) e_{i h} (t)), \end{gathered}

(16)

which can be written in matrix form as

\frac{d v_{i}}{d t} = β_{i} [{\overset{⌣}{x}}_{i} e_{i}^{*} + x^{*} v_{i}^{T} e_{i}],

(17)

for any v_i(0) ≠ 0, β_i(t) > 0. The second term in (17), which can be written $x^{*} v_{i}^{T} e_{i} = x^{*} (1 - v_{i}^{H} v_{i}) {\overset{⌣}{x}}_{i}$, tends quickly to zero as $v_{i}^{H} v_{i}$ tends to 1 with t → ∞, so it can be neglected. The adaptation law in (17) can therefore be simplified to

\begin{gathered} \frac{d v_{i}}{d t} = β_{i} {\overset{⌣}{x}}_{i} e_{i}^{*} = β_{i} {\overset{⌣}{x}}_{i} {[x - v_{i}^{*} {\overset{⌣}{x}}_{i}]}^{*} \\ = β_{i} v_{i}^{T} x [I - v_{i} v_{i}^{H}] x^{*}, \end{gathered}

(18)

where (.)* denotes a complex conjugate. In discrete time, the adaptation law in (18) can be written

v_{i} (k + 1) = v_{i} (k) + β_{i} (k) {\overset{⌣}{x}}_{i} (k) E_{5},

(19)

where $E_{5} = [x^{*} (k) - v_{i} (k) {\overset{⌣}{x}}_{i}^{*} (k)]$ .
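For real-valued data the conjugates in (19) drop out and the update reduces to an Oja-type rule. The sketch below shows only this real-valued special case for the first principal direction; the constant learning rate and the two-dimensional Gaussian test data are assumptions made for illustration.

```python
import numpy as np

def adaptive_pc(x, v0, beta=0.002):
    """Iterate v(k+1) = v(k) + beta * y(k) * (x(k) - v(k) y(k)) over the columns of x."""
    v = np.asarray(v0, dtype=float).copy()
    for xk in x.T:
        y = v @ xk                         # principal component, eq. (11)
        v = v + beta * y * (xk - v * y)    # eq. (19) for real-valued data
    return v

rng = np.random.default_rng(2)
theta = np.pi / 6
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
# Hypothetical data: standard deviations 2.0 and 0.5 along the rotated axes
x = R @ (np.array([[2.0], [0.5]]) * rng.standard_normal((2, 20000)))
v = adaptive_pc(x, v0=[0.0, 1.0])
v_true = R[:, 0]                           # direction of largest variance
```

The estimate `v` converges to `v_true` up to sign, and its norm approaches one, which is consistent with the argument used to drop the second term of (17).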

Whitening step

The whitening step uses the PCA to transform the data into an appropriate space and to reduce the redundancy of the observed data. The separated input vector $\overset{⌣}{x} (k)$ is whitened in the second step by applying the following transformation.

u (k) = P (k) \overset{⌣}{x} (k),

(20)

where u(k) is the whitened vector and P is the whitening matrix, which is determined using the neural learning approach. The objective is to find a simple adaptive algorithm for estimating the whitening matrix P such that the covariance matrix of the whitened signals u(k) is diagonal, that is, R_uu = E{uu^T} = diag{λ₁, λ₂, ..., λ_N} = I_N. The whitened signals are mutually uncorrelated if all of the cross-correlations are zero, that is, r_ij = E{u_i u_j} = 0 for all i ≠ j, with non-zero autocorrelations $r_{i i} = E {u_{i}^{2}} = λ_{i} > 0$. Therefore, the minimization function can be formulated in the following 2-norm.

\begin{gathered} J_{2} (W) = \frac{1}{4} \sum_{i = 1}^{N} \sum_{j = 1}^{M} {(E {u_{i} u_{j}} - λ_{i} δ_{i j})}^{2} \\ = \frac{1}{4} {∥E {u u^{T}} - I_{N}∥}^{2} . \end{gathered}

(21)

To derive an adaptive learning algorithm, the following transformation

\begin{gathered} E {u u^{T}} = E {P \overset{⌣}{x} {\overset{⌣}{x}}^{T} P^{T}} \\ = E {P A s s^{T} (P A)^{T}} \\ = B R_{s s} B^{T} = B B^{T}, \end{gathered}

(22)

is used, where B = PA is the global transformation matrix from s to u. Without loss of generality, R _ss = E{ss^T } = I _N is assumed. By substituting (22) into (21), the optimization criterion can be written as

\begin{gathered} J_{2} (W) = \frac{1}{4} {∥B B^{T} - I_{N}∥}^{2} \\ = \frac{1}{4} t r [(B B^{T} - I_{N}) (B B^{T} - I_{N})] . \end{gathered}

(23)

Applying the standard gradient descent approach and the chain rule, the derivative of (23) is obtained as

\frac{d B}{d t} = η (I_{N} - B B^{T}) B = η (I_{N} - R_{u u}) B .

(24)

Taking into account that B = PA and assuming that A varies very slowly in time (i.e., dA/dt≈0), we have

\frac{d P}{d t} = η (I_{N} - R_{u u}) P .

(25)

Using the simple Euler formula, the corresponding discrete-time adaptive learning algorithm can be written as

P (k + 1) = P (k) + η (k) (I_{N} - R_{u u}^{k}) P (k),

(26)

where η(k) is the learning parameter to be adjusted according to $η (k) = 1 ∕ \{ξ ∕ (η (k - 1)) + {∥u (k)∥}_{2}^{2}\}$ , and ξ is the forgetting factor (i.e., 0 < ξ < 1). The covariance matrix R _uu can be estimated as

{\hat{R}}_{u u}^{k} = 〈u u^{T}〉 = \frac{1}{N} \sum_{k = 0}^{N - 1} u (k) u^{T} (k),

(27)

where $u (k) = P (k) \overset{⌣}{x} (k)$ .
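The recursion (26) with the step-size rule for η(k) can be sketched as below. Two simplifications are assumed: the block average in (27) is replaced by the instantaneous estimate u(k)u^T(k), and the values ξ = 0.999 and η(0) = 0.01 as well as the test mixing matrix are chosen for illustration only.

```python
import numpy as np

def adaptive_whitening(x, xi=0.999, eta0=0.01):
    """Run P(k+1) = P(k) + eta(k) (I - u u^T) P(k) over the columns of x."""
    n = x.shape[0]
    P = np.eye(n)
    inv_eta = 1.0 / eta0
    for xk in x.T:
        u = P @ xk                                            # eq. (20)
        inv_eta = xi * inv_eta + u @ u                        # 1/eta(k) = xi/eta(k-1) + ||u(k)||^2
        P = P + (np.eye(n) - np.outer(u, u)) @ P / inv_eta    # eq. (26), instantaneous R_uu
    return P

rng = np.random.default_rng(3)
A = np.array([[1.0, 0.5, 0.2], [0.3, 1.0, 0.4], [0.2, 0.3, 1.0]])
x = A @ rng.standard_normal((3, 40000))    # hypothetical correlated input
P = adaptive_whitening(x)
R_uu = (P @ x) @ (P @ x).T / x.shape[1]    # should be close to the identity
```

Because 1/η(k) is never smaller than ||u(k)||², the normalised step η(k)||u(k)||² stays below one, which keeps the recursion well behaved for inputs of arbitrary scale.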

Separation step

The separation of the whitened signals u(k) is the third step of the proposed algorithm, which is accomplished by applying the nonlinear principal component analysis (NPCA) learning rule. The multichannel linear separation transformation is given in the following form.

y (k) = W^{T} (k) u (k),

(28)

where W(k) is the separation matrix, whose values are updated through the NPCA learning rule. If the independent signals are zero-mean, the generalized covariance matrix of f(y _i ) and g(y _j ) (f(y _i ) and g(y _j ) are different and odd nonlinear activation functions such that f(y) = y³ and g(y) = tanh(y)) is a non-singular diagonal matrix R _fg = E{f(y)g^T (y)}-E{f(y)}E{g^T (y)}. On the basis of the independence criterion, the nonlinear covariance matrix is given as [41, 43]

R_{f g} = 〈f (y) g^{T} (y)〉 + I,

(29)

where f(y) = [f(y ₁), f(y ₂), ..., f(y _N )] ^T and g(y) = [g(y ₁), g(y ₂), ..., g(y _N )] ^T , provided that E{f(y _i )} = 0 or E{g(y _i )} = 0. To satisfy these conditions for arbitrary distributed sources, the nonlinearities are selected as f _i (y _i ) = φ _i (y _i ), g _i (y _i ) = y _i or f _i (y _i ) = y _i , g _i (y _i ) = φ _i (y _i ), where φ _i (y _i ) are suitably designed nonlinear functions, defining g(y) as an odd function and f(y) = g(y)-y. Therefore, similarly to (21)-(26), a real-time implementation algorithm can be derived as

W (k + 1) = W (k) - μ (k) f (y (k)) g^{T} y W (k),

(30)

where g^T y = (f^T y(k)-y^T (k)). Since the separation matrix W(k) is assumed to be orthogonal (i.e., W^T (k)W(k) = I), the real-time adaptation rule can be rewritten as

W (k + 1) = W (k) + μ (k) f (y (k)) W_{b} W (k),

(31)

where y(k) is the separated signal and the output of the second step, W_b = (u^T(k) - f^T(y(k))), μ(k) is the learning parameter (it is adjusted according to $μ (k) = 1 ∕ \{γ ∕ (μ (k - 1)) + {∥y (k)∥}_{2}^{2}\}$ with the forgetting factor 0 < γ < 1), and f(·) is a suitably chosen nonlinear function that is usually selected to be odd in order to ensure both stability and signal separation. These nonlinear functions require the use of high-order statistics (HOS). In the present study, f(·) was chosen as f(t) = tanh(t). Finally, since f(t) = dg(t)/dt, g(t) = ln[cosh(t)].
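A single update of the separation step can be sketched as follows. The code uses the common form of the nonlinear PCA rule, W ← W + μ[u − W f(y)] f^T(y) with f = tanh; reading (31) in this form is an assumption, as are the value γ = 0.998, the initial 1/μ, and the uniform test signals. The sketch illustrates the mechanics of one iteration and makes no claim about convergence on EEG data.

```python
import numpy as np

def npca_step(W, u, inv_mu, gamma=0.998):
    """One nonlinear PCA update with f = tanh and the normalised step size mu(k)."""
    y = W.T @ u                                # eq. (28)
    f = np.tanh(y)                             # odd nonlinearity f(y)
    inv_mu = gamma * inv_mu + y @ y            # 1/mu(k) = gamma/mu(k-1) + ||y(k)||^2
    W = W + np.outer(u - W @ f, f) / inv_mu    # W + mu [u - W f(y)] f(y)^T
    return W, inv_mu

rng = np.random.default_rng(4)
W, inv_mu = np.eye(3), 100.0
# Hypothetical whitened input: unit-variance, sub-Gaussian (uniform) samples
for u in rng.uniform(-np.sqrt(3), np.sqrt(3), size=(5000, 3)):
    W, inv_mu = npca_step(W, u, inv_mu)
```

The function returns the updated 1/μ(k) together with W so that the step-size recursion carries over from one sample to the next.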

Estimation step

The final step is the estimation of the independent component basis vector of the mixing matrix A(k). The estimate of the observed data is given by

\hat{x} (k) = Q (k) y (k) .

(32)

Comparing (32) with (2), and since $ŝ (k) ≅ y (k)$ (i.e., $ŝ (k)$ is the estimated source signal s(k)), it can be concluded that $Â (k) = Q (k)$ . Therefore, the columns of the matrix Q(k) are the estimates of the columns of the matrix $Â (k)$ . Since Q(k) is the estimation matrix, its values (similarly to (26)) are updated through the adaptation law as

Q (k + 1) = Q (k) + α (k) Q_{e} y^{T} (k),

(33)

where $Q_{e} = [\hat{x} (k) - Q (k) y (k)]$ . The quality of the source estimate in y(k) can be measured using the zero-forcing solution. Such a solution attempts to adapt the demixing matrix such that

lim_{k \to \infty} C (k) Â (k) = Φ D,

(34)

where C(k) = W(k)P(k)V(k), Φ is a (M × M) permutation matrix with one unity entry in any row or column, and D is a diagonal nonsingular scaling matrix. In this case, it becomes

y_{i} (k) = d_{j j} s_{j} (k) + \sum_{l = 1}^{N} b_{i l} (k) n_{l} (k),

(35)

for some non-replicative assignment j→i for 1 ≤ i ≤ N and 1 ≤ j ≤ M. Thus, each element of y(k) is the sum of a single unique source in s(k) and a noise term. In each simulation run, the performance index (PI) is evaluated using the following equation [44].

P I (k) = \frac{1}{M - 1} (M - \frac{1}{2} \sum_{i = 1}^{N} (C_{a} + C_{b})),

(36)

where $C_{a} = \frac{max_{1 \leq j \leq M} {|c_{i j} (k)|}^{2}}{\sum_{j = 1}^{M} {|c_{i j} (k)|}^{2}}$, $C_{b} = \frac{max_{1 \leq j \leq M} {|c_{j i} (k)|}^{2}}{\sum_{j = 1}^{M} {|c_{j i} (k)|}^{2}}$, and c_ij denotes the (i, j)th element of the matrix C(k), corresponding to the j th independent component (IC) in the desired subset of sources. This dimensionless performance metric measures the deviation of the combined system from a diagonally scaled permutation matrix: 0 ≤ PI(k) ≤ 1 for all matrices C(k), PI(k) is one when the sources are maximally mixed in the outputs, and PI(k) is zero when the desired subset of the ICs is perfectly separated. The first term in (36) gives the error of the separation of the output component y_i(k) in (35) with respect to the sources, and the second term measures the degree to which the desired IC, c_j, appears multiple times at the output. The integration of the four steps is called the adaptive nonlinear principal component analysis (ANPCA) method. In order to improve the flexibility, efficiency, and performance of blind signal separation or extraction, the proposed ANPCA scheme is run on a multilayer neural network. The multiple layers of neurons with nonlinear transfer functions allow the network to learn both linear and nonlinear relationships between input and output vectors. Furthermore, this allows second-order statistics (SOS) and the HOS algorithm to be combined to extract features having different statistical properties, existing at various layers, and originating from various sources. The synaptic weights in each layer are updated by employing the algorithm described above.
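The performance index in (36) can be computed directly from the global matrix C(k). The sketch below assumes a square C (M = N); the function name and the two test matrices are illustrative.

```python
import numpy as np

def performance_index(C):
    """PI of eq. (36) for a square global matrix C: 0 = separated, 1 = fully mixed."""
    P = np.abs(np.asarray(C)) ** 2
    M = P.shape[0]
    c_a = P.max(axis=1) / P.sum(axis=1)    # C_a: dominance of one entry in each row
    c_b = P.max(axis=0) / P.sum(axis=0)    # C_b: dominance of one entry in each column
    return (M - 0.5 * np.sum(c_a + c_b)) / (M - 1)

# A diagonally scaled permutation matrix (perfect separation)
perm_scaled = np.array([[0.0, 2.0, 0.0],
                        [0.0, 0.0, -0.5],
                        [3.0, 0.0, 0.0]])
pi_perfect = performance_index(perm_scaled)      # evaluates to 0
pi_mixed = performance_index(np.ones((3, 3)))    # evaluates to 1 (up to rounding)
```

The two test cases reproduce the limits stated above: the index is zero for a scaled permutation matrix and one when every source contributes equally to every output.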

Results

Preparatory to an analysis of the features of P300 components from EEG signals in real-time, actual signals were recorded in an eight-channel (Fz, Cz, Pz, Oz, P7, P3, P4, and P8) configuration. Figure 3 shows the observed EEG signals, with background signal amplitudes of around 300 microvolts. Figure 4 shows the pre-processed signals, with amplitudes of around 25 microvolts, which were filtered using a sixth-order band-pass filter (BPF) with cut-off frequencies of 1 Hz and 12 Hz. One way of gaining further insight into EEG signals is by introducing ANPCA techniques. The present model of EEG analysis consists of four main steps: pre-separation (learning rate β of 0.6), whitening (forgetting factor η of 0.01), separation (forgetting factor γ of 0.002), and estimation (learning rate α of 0.3). In this algorithm, the pre-separation and the whitening steps enable faster adaptation at the separation step. The performance of the component separation of the ANPCA algorithm in the output was evaluated using (36). The evolution of PI(Ni) for six different runs of the proposed method, generated from the data with the 350 ms ISI, is given in Figure 5. It can be seen that the algorithm takes between four and ten epochs to converge. Depending on the simulation run, the performance factor varies from -22 dB to -33 dB, due to random differences in the source signals. The robustness of the ANPCA was evaluated by comparing its separation performance with that of other suggested algorithms (i.e., NPCA [45], Nonstationary Source Separation-Joint Diagonalization (NSS-JD) [42], Joint Approximate Diagonalization of Eigen-matrices (JADE) [46], and Second-Order Blind Identification (SOBI) [47]), as shown in Figure 6. Figures 7, 8, 9, and 10 show the signals of the P300 component extracted in real-time from the eight electrodes using the ANPCA algorithm with ISIs of 325 ms, 350 ms, 375 ms, and 400 ms, respectively. The P300 amplitudes of an individual subject, taken from the Fz electrode, for ISIs of 325 ms, 350 ms, 375 ms, and 400 ms, respectively, are shown in Figure 11: (a) P300 amplitude upon a single stimulus and (b) P300 amplitude upon multiple stimuli. When the eight extracted signals from the eight electrodes were averaged, the P300 components were not detected in some periods, as indicated in Figure 12. This signal was averaged using the 350 ms ISI data. Comparative plots of the classification accuracies along seven stimuli for all subjects (subjects 1-7) are provided in Figure 13. The best classification accuracy was achieved using the 350 ms ISI. The average value of the classification accuracies upon seven block stimuli for all of the subjects is given in Table 1. The classification using the 350 ms ISI gave the highest average value with the smallest standard deviation.
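The pre-processing filter can be sketched with SciPy. The text specifies only a sixth-order band-pass filter with 1 Hz and 12 Hz cut-offs at a 256 Hz sampling rate; the Butterworth design and the causal `sosfilt` implementation used here are assumptions, and the sinusoidal test signals are illustrative.

```python
import numpy as np
from scipy.signal import butter, sosfilt

FS = 256.0  # sampling rate in Hz, as stated in the data-acquisition section

def bandpass_1_12(x, fs=FS):
    """Causal sixth-order Butterworth band-pass, 1-12 Hz, applied along the last axis."""
    # butter() doubles the order for band-pass designs, so N=3 yields a sixth-order filter.
    sos = butter(3, [1.0, 12.0], btype="bandpass", fs=fs, output="sos")
    return sosfilt(sos, x, axis=-1)

t = np.arange(0, 8.0, 1.0 / FS)
in_band = bandpass_1_12(np.sin(2 * np.pi * 5.0 * t))    # 5 Hz: inside the pass band
mains = bandpass_1_12(np.sin(2 * np.pi * 50.0 * t))     # 50 Hz: outside the pass band

def rms_tail(v):
    """RMS over the second half of the record, skipping the start-up transient."""
    return np.sqrt(np.mean(v[len(v) // 2:] ** 2))
```

A causal filter is used because it can run sample by sample, which matches the real-time setting; a zero-phase alternative such as `sosfiltfilt` would need the whole record in advance.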

Table 1 Average value of the classification accuracies upon seven stimuli


Discussion

The ability to measure and classify single-trial responses in real-time from specific brain regions has important theoretical and practical implications for both clinical and research applications. In this study, the amplitude of the background signal was around 300 microvolts, as shown in Figure 3. Since the amplitude of the P300 component is very small (around 1.5 microvolts) compared with that of the background, pre-processing filtering is required. These EEG signals were filtered using a sixth-order BPF with cut-off frequencies of 1 Hz (i.e., to remove the trend from low frequency bands) and 12 Hz (i.e., to remove unimportant information from high frequency bands). However, as shown in Figure 4, the signals were nonetheless corrupted by noise, with background signal amplitudes of around 25 microvolts. Although there were some noticeable improvements, classification of the signals with respect to the given stimulus remained difficult. Therefore, an ANPCA-algorithm-based multilayer neural network model that can be used to analyze the complex P300 component from EEG signals in real-time is proposed. The MNN model with the back-propagation training algorithm has five layers: the input and output layers have the same number of units N; the first and third layers are nonlinear (a sigmoid function as a universal approximator), and the second and fourth layers are linear. Layer 2 contains M units, that is, as many as there are nonlinear PCs. The activations of the neurons in Layer 2 are the nonlinear PCs of the input data. The back-propagation algorithm with an adaptive learning rate and momentum was used to train the neural networks. The values of the learning rate and the momentum were estimated by trial and error until no further improvement in the performance index could be obtained. The parameter values chosen were 0.3 and 0.8, respectively. The networks were trained before the EEG signals were recorded for one session. The training time ranged from 15.925 s to 19.6 s for each ISI.
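The reported training times are numerically consistent with 49 flashes per training run at each ISI. The factorisation into 7 images × 7 blocks used below is an assumption; the text does not state how the training sequence was composed.

```python
ISI_MS = (325, 350, 375, 400)   # inter-stimulus intervals from the data-acquisition section
N_FLASHES = 7 * 7               # assumption: 7 images x 7 blocks of stimuli

# Training duration in milliseconds for each ISI
train_ms = {isi: N_FLASHES * isi for isi in ISI_MS}
# train_ms == {325: 15925, 350: 17150, 375: 18375, 400: 19600}
```

The end points, 15925 ms and 19600 ms, match the stated range of 15.925 s to 19.6 s.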

Figure 5 shows the evolution of PI(Ni) for six different simulation runs in one implementation of the proposed method. The performance of the ANPCA algorithm in (30) was evaluated using (36) with W(0) = I. A single block of N = 7000 samples was used to compute all coefficient updates for the six runs, where $\hat{x} (k + N i) = \hat{x} (k)$ for all integer values i ≥ 0 and i ≤ k ≤ N-1. As can be seen, the algorithm took between four and ten epochs to converge. Depending on the simulation run, the performance factor varies from -22 dB to -33 dB, due to random differences in the source signals. The accuracy of the method generally improves with increasing block length N. It can be confirmed that the ANPCA algorithm successfully separates the mixture of source signals. To evaluate the robustness of the ANPCA against background noise, the separation performance indices of the ANPCA were compared with those of the suggested algorithms (i.e., NPCA, JADE, NSS-JD, and SOBI). The accuracy of the recovered independent components compared to the sources was measured according to the performance function specified in (36). Figure 6 shows the overall performance of all algorithms. Beyond 5000 iterations, the performance index did not improve much, while the computation became increasingly time-consuming. The quality of separation increases dramatically after 1500 iterations for the proposed method (ANPCA) and after 4000 iterations for the other algorithms. It is clear that the proposed method presents the shortest iteration time, with a performance index a little over 0.03 (an acceptable level for separation). On this basis, it is asserted that the ANPCA algorithm successfully separates mixed source signals. The same level of separation accuracy was achieved by the other algorithms only after 4000 iterations.

The ICs produced from the observed data using the ANPCA algorithm (for ISIs of about 325 ms, 350 ms, 375 ms, and 400 ms) are shown in Figures 7, 8, 9, and 10. Although the signals were still corrupted by noise (manifested as the high amplitudes of non-targets in some sessions), the behaviours of the extracted signals clearly represented the P300 components, and the observed signals had the form of the P300 event-related potential. For the ISI of about 325 ms (Figure 7), the amplitude of the P300 component was higher than for the other ISIs, as shown in Figure 11(a), but the signal was noisier than for the longer ISIs. As noted in Figure 7, the non-target amplitudes were roughly similar to the target amplitudes. For the ISI of about 350 ms (Figure 8), the target and non-target amplitudes were clearer and easier to distinguish than for the other ISIs. For the ISI of about 375 ms (Figure 9), the non-target amplitudes were higher than the target ones in some sessions. For the ISI of about 400 ms, none of the channels showed similar behavior, as indicated in Figure 10. In this case, the assumption of a long stationary segment required by the averaging method would cause a loss of time resolution. Figures 7, 8, 9, and 10 show that the extracted signal amplitudes decreased (i.e., from the Fz to the P8 channel) as the distance of the electrodes increased. Figure 11(a) plots the amplitudes of the P300 component for the four different ISIs (Fz channel) upon a single stimulus (scale of 700 ms) and indicates that a short ISI could increase both the target and the non-target amplitudes. Figure 11(b) plots the amplitudes of the P300 component for the four different ISIs upon multiple stimuli (scale of 60 s) and indicates the peak shifting of the P300 component with respect to the various ISIs. The experiment using the 350 ms ISI showed the best performance. Figure 12 displays the averages of the signals extracted from the eight channels with an ISI of 350 ms.
With averaging, the amplitude of a target becomes larger than that of a non-target, provided that the signal is stationary over a long time. However, this fails for dissimilar trials, as indicated in Figure 12 (i.e., solid circle for the target and dashed circle for the non-target). This is one of the main reasons why the proposed method does not use the averaging scheme.
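The effect of dissimilar trials on averaging can be illustrated with a small sketch. It assumes, purely for illustration, that each trial contains a Gaussian-shaped peak standing in for the P300 and that the dissimilarity between trials takes the form of latency jitter. The waveform shape, the latencies, and the function names are hypothetical and are not taken from the recorded data.

```python
import math


def p300_like(t_ms, latency_ms, width_ms=50.0):
    """Gaussian bump of unit height standing in for a P300 peak (illustrative)."""
    return math.exp(-0.5 * ((t_ms - latency_ms) / width_ms) ** 2)


def average_trials(latencies_ms, t_axis_ms):
    """Point-by-point average of one synthetic trial per latency."""
    n = len(latencies_ms)
    return [sum(p300_like(t, lat) for lat in latencies_ms) / n
            for t in t_axis_ms]


# 700 ms window sampled every 5 ms
t_axis = list(range(0, 700, 5))

# Similar trials: identical latency, so the average keeps the full peak height.
aligned = average_trials([300, 300, 300, 300], t_axis)

# Dissimilar trials: jittered latency, so the average is smeared and lower.
jittered = average_trials([240, 280, 340, 380], t_axis)

if __name__ == "__main__":
    print("peak of aligned average :", max(aligned))
    print("peak of jittered average:", max(jittered))
```

In this toy setting the averaged peak of the jittered trials is both lower and broader than any single-trial peak, which is consistent with the loss of amplitude and time resolution described above.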

Comparative plots of the classification accuracies for the seven subjects are provided in Figure 13. All subjects achieved an average classification accuracy of 100% after three blocks of stimulus presentations were averaged (i.e., 8 s). In other words, the subject's intention was recognized eight seconds after the first stimulus was given. Table 1 lists the average classification accuracies over seven block stimuli for all of the subjects, together with the corresponding 85% confidence intervals. According to Table 1, the experiment with an ISI of 350 ms provided the highest average classification accuracy (88.921%) and the smallest standard deviation (1.807) over all subjects. By contrast, the ISI of 400 ms gave the lowest average classification accuracy (84.839%), whereas the largest standard deviation (4.959) was obtained with the ISI of 325 ms. These results indicate that the best performance was obtained with the ISI of 350 ms.
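Summary statistics of this kind (mean, standard deviation, and a confidence interval over subjects) can be computed as in the following sketch. The accuracy values are hypothetical placeholders, not the values of Table 1, and the interval uses a normal approximation as an assumption; the way the intervals in Table 1 were actually obtained is not stated here, and for only seven subjects a Student-t interval would be somewhat wider.

```python
import math
from statistics import NormalDist, mean, stdev


def summarize(accuracies, level=0.85):
    """Return (mean, sample std, (low, high)) for a list of accuracies in %.

    The confidence interval uses a normal approximation:
    mean +/- z * s / sqrt(n), with z the two-sided quantile for `level`.
    """
    m = mean(accuracies)
    s = stdev(accuracies)  # sample standard deviation (n - 1 denominator)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * s / math.sqrt(len(accuracies))
    return m, s, (m - half_width, m + half_width)


if __name__ == "__main__":
    # Hypothetical per-subject accuracies (%), for illustration only.
    hypothetical = [86.0, 88.0, 90.0, 89.0, 87.0, 91.0, 88.0]
    m, s, (low, high) = summarize(hypothetical)
    print(f"mean = {m:.3f}, std = {s:.3f}, 85% CI = [{low:.3f}, {high:.3f}]")
```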

The P300 component of EEG signals has been widely used in routine clinical settings [21–26]. In this context, the use of physiological signals rather than the behavioral responses of a patient is often advisable, albeit challenging. Overall, the P300 component has sparked considerable interest as a clinical diagnostic tool. The most efficient way of implementing such a diagnostic tool is through real-time detection. The amplitude of different waveforms at a single point can also be displayed in a similar format. This type of display provides a more objective analysis of the EEG activity than a subjective visual analysis by a physician. Simultaneous video monitoring of the patient during the EEG recording is becoming more popular. It allows the physician to closely correlate EEG waveforms with the patient's activity and may help produce a more accurate diagnosis.

Conclusions

The applicability of the proposed ANPCA method for extracting the P300 waves included in the EEG signals in real time, without down-sampling or averaging of the original signals, was demonstrated. The separation performance factor of the ANPCA varied from -22 dB to -33 dB due to the randomness of the source signals. In comparison with other algorithms (i.e., NPCA, NSS-JD, JADE, and SOBI), the ANPCA presented the shortest iteration time with a performance index of about 0.03. Since all the computations are done in real time, the ANPCA can be used as a viable tool for clinical applications.

References

Berger H: Uber das elektroenkephalogram des menschen. Archiv fur Psychiatrie und Nervenkrankheiten 1929, 87: 527–570. 10.1007/BF01797193
Niedermeyer E, da Silva FL: Electroencephalography: Basic principles, clinical applications, and related fields. 5th edition. Lippincott Williams & Wilkins; 2004.
Wang JT, Young GB, Connolly JF: Prognostic value of evoked responses and event-related brain potentials in coma. Can J Neurol Sci 2004, 31: 438–450.
Rousseff RT, Tzvetanov P, Atanassova PA, Volkov I, Hristova I: Correlation between cognitive P300 changes and the grade of closed head injury. Electromyogr Clin Neurophysiol 2006, 46: 275–282.
Pritchett S, Zilberg E, Xu ZM, Myles P, Brown I, Burton D: Peak and averaged bicoherence for different EEG patterns during general anaesthesia. Biomedical Engineering Online 2010, 9: 76. 10.1186/1475-925X-9-76
Lorenz J, Kunze K, Bromm B: Differentiation of conversive sensory loss and malingering by P300 in a modified oddball task. Neuroreport 1998, 9: 187–191. 10.1097/00001756-199801260-00003
Towle VL, Sutcliffe E, Sokol S: Diagnosing functional visual deficits with the P300 component of the visual evoked potential. Arch Ophthalmol 1985, 103: 47–50.
Rosenfeld JP, Cantwell B, Nasman VT, Wojdac V, Ivanov S, Mazzeri L: A modified, event-related potential-based guilty knowledge test. Int J Neurosci 1988, 42: 157–161. 10.3109/00207458808985770
Abootalebi V, Moradi MH, Khalilzadeh MA: A new approach for EEG feature extraction in P300-based lie detection. Computer Methods and Programs in Biomedicine 2009, 94: 48–57. 10.1016/j.cmpb.2008.10.001
Roberts SJ, Penny WD: Real-time brain computer interfacing: A preliminary study using bayesian learning. Med Biol Eng Comput 2000, 38: 56–61. 10.1007/BF02344689
Takano K, Kamatsu T, Hata N, Nakajima Y, Kansaku K: Visual stimuli for the P300 brain-computer interface: A comparison of white/gray and green/blue flicker matrices. Clinical Neurophysiology 2009, 120: 1562–1566. 10.1016/j.clinph.2009.06.002
Hazrati MKh, Erfanian A: An online EEG-based brain-computer interface for controlling hand grasp using an adaptive probabilistic neural network. Medical Engineering & Physics 2010, 32: 730–739. 10.1016/j.medengphy.2010.04.016
Lv J, Li Y, Gu Z: Decoding hand movement velocity from electroencephalogram signals during a drawing task. Biomedical Engineering Online 2010, 9: 64. 10.1186/1475-925X-9-64
Lee Y, Lee H, Kim J, Shin HC, Lee M: Classification of BMI control commands from rat's neural signals using extreme learning machine. Biomedical Engineering Online 2009, 9: 29.
Shen TW, Tompkins WJ, Hu YH: Implementation of a one-lead ECG human identification system on a normal population. Journal of Engineering and Computer Innovations 2011, 2: 12–21.
Lewis D, Brigder D: Market researchers make increasing use of brain imaging. Advances in Clinical Neuroscience & Rehabilitation 2005, 5: 35–36.
Etevenon P, Lebrun N, Clochon P, Perchey G, Eustache F, Baron JC: High temporal resolution dynamic mapping of instantaneous EEG amplitude modulation after tone-burst auditory stimulation. Brain Topography 1999, 12: 129–137. 10.1023/A:1023466312686
Davidson PR, Jones RD, Peiris MT: EEG-based lapse detection with high temporal resolution. IEEE Trans Biomed Eng 2007, 4: 832–841.
Chapman RM, Bragdon HR: Evoked responses to numerical and non-numerical visual stimuli while problem solving. Nature 1964, 203: 1155–1157. 10.1038/2031155a0
Sutton S, Braren M, John ER, Zubin J: Evoked potential correlates of stimulus uncertainty. Science 1965, 150: 1187–1188. 10.1126/science.150.3700.1187
Ma Q, Shen Q, Xu Q, Li D, Shu L, Weber B: Empathic responses to others' gains and losses: An electrophysiological investigation. Neuroimage 2011, 54: 2472–2480. 10.1016/j.neuroimage.2010.10.045
Bauer LO: Interactive effects of HIV/AIDS, body mass, and substance abuse on the frontal brain: A P300 study. Psychiatry Research 2011, 185: 232–237. 10.1016/j.psychres.2009.08.020
Kessels LT, Ruiter RA, Brug J, Jansma BM: The effects of tailored and threatening nutrition information on message attention. Evidence from an event-related potential study. Appetite 2011, 56: 32–38. 10.1016/j.appet.2010.11.139
Lahteenmaki PM, Holopainen I, Krause CM, Helenius H, Salmi TT, Heikki LA: Cognitive functions of adolescent childhood cancer survivors assessed by event related potentials. Med Pediatr Oncol 2001, 36: 442–50. 10.1002/mpo.1108
Luijten M, van Meel CS, Franken IHA: Diminished error processing in smokers during smoking cue exposure. Pharmacology Biochemistry and Behavior 2011, 97: 514–520. 10.1016/j.pbb.2010.10.012
Heinrich SP, Marhofer D, Bach M: Cognitive visual acuity estimation based on the event-related potential P300 component. Clinical Neurophysiology 2010, 121: 1464–1472. 10.1016/j.clinph.2010.03.030
Polich J, Howard L, Starr A: Effects of age on the P300 component of the event-related potential from auditory stimuli: Peak definition, variation, and measurement. The Journal of Gerontology 1985, 40: 721–726.
Brazier MAB: Evoked responses recorded from the depths of the human brain. Annals of the New York Academy of Sciences 1964, 112: 33–59.
Yeh CL, Chang HC, Wu CH, Lee PL: Extraction of single-trial cortical beta oscillatory activities in EEG signals using empirical mode decomposition. Biomedical Engineering Online 2009, 9: 25.
Graichen U, Witte H, Haueisen J: Analysis of induced components in electroencephalograms using a multiple correlation method. Biomedical Engineering Online 2009, 9: 21.
LeVan P, Gotman J: Independent component analysis as a model-free approach for the detection of bold changes related to epileptic spikes: A simulation study. Human Brain Mapping 2009, 30: 2021–2031. 10.1002/hbm.20647
Sabeti M, Katebi SD, Boostani R, Price GW: A new approach for EEG signal classification of schizophrenic and control participants. Expert Systems with Applications 2011, 38: 2063–2071. 10.1016/j.eswa.2010.07.145
Wessel JR, Ullsperger M: Selection of independent components representing event-related brain potentials: A data-driven approach for greater objectivity. NeuroImage 2011, 54: 2105–2115. 10.1016/j.neuroimage.2010.10.033
Mahmoudi Z, Rahati S, Chasemi MM, Asadpour V, Tayarani H, Rajati M: Classification of voice disorder in children with cochlear implantation and hearing aid using multiple classifier fusion. Biomedical Engineering Online 2011, 10: 3. 10.1186/1475-925X-10-3
Kim J, Shin HS, Shin K, Lee M: Robust algorithm for arrhythmia classification in ECG using extreme learning machine. Biomedical Engineering Online 2009, 8: 31. 10.1186/1475-925X-8-31
Yuenyong S, Nishihara A, Kongprawechnon W, Tungpimolrut K: A framework for automatic heart sound analysis without segmentation. Biomedical Engineering Online 2011, 10: 13. 10.1186/1475-925X-10-13
Kulkarni S, Reddy NP, Hariharan SI: Facial expression (mood) recognition from facial images using committee neural networks. Biomedical Engineering Online 2009, 10: 16.
Shrirao NA, Reddy NP, Kosuri DR: Neural network committees for finger joint angle estimation from surface EMG signals. Biomedical Engineering Online 2009, 8: 2. 10.1186/1475-925X-8-2
Hu XS, Hong KS, Ge SS, Jeong MY: Kalman estimator- and general linear model-based on-line brain activation mapping by near-infrared spectroscopy. BioMedical Engineering OnLine 2010, 9: 82. 10.1186/1475-925X-9-82
Jasper HH: Report of the committee on methods of clinical examination in electroencephalography. Electroenceph Clin Neurophysiol 1958, 10: 370–375.
Choi S, Cichocki A: Blind separation of nonstationary sources in noisy mixtures. Electronics Letters 2000, 36: 848–849. 10.1049/el:20000623
Cichocki A, Amari SI: Adaptive Blind Signal and Image Processing: Learning Algorithms and Applications. John Wiley & Sons, LTD, England; 2002.
Oja E: The nonlinear PCA learning rule in independent component analysis. Neurocomputing 1997, 17: 25–45. 10.1016/S0925-2312(97)00045-3
Hu YH, Hwang JN: Handbook of Neural Network Signal Processing. CRC Press, New York, Washington, D.C.; 2002.
Turnip A, Hong KS, Jeong MY: Real time feature extraction of EEG-based P300 using nonlinear principal component analysis. In 17th Annual Meeting of the Organization for Human Brain Mapping, June 26–30, 2011. Quebec City, Canada;
Naraharisetti KVP: Removal of ocular artifacts from EEG signal using joint approximate diagonalization of eigen matrices (JADE) and wavelet transform. Canadian Journal on Biomedical Engineering & Technology 2010, 1: 56–60.
Gharieb RR, Cichocki S: Second-order statistics based blind source separation using a bank of subband filters. Digital Signal Processing 2003, 13: 252–274. 10.1016/S1051-2004(02)00034-9

Acknowledgements

This research was supported by the World Class University program funded by the Ministry of Education, Science and Technology through the National Research Foundation of Korea (grant no. R31-20004).

Author information

Authors and Affiliations

Department of Cogno-Mechatronics Engineering, Pusan National University, 30 Jangjeon-dong, Geumjeong-gu, Busan, 609-735, Korea
Arjon Turnip, Keum-Shik Hong & Myung-Yung Jeong
School of Mechanical Engineering, Pusan National University, 30 Jangjeon-dong, Geumjeong-gu, Busan, 609-735, Korea
Keum-Shik Hong

Authors

Arjon Turnip
Keum-Shik Hong
Myung-Yung Jeong

Corresponding author

Correspondence to Keum-Shik Hong.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

AT carried out data acquisition and processing and drafted the manuscript. KSH supervised the project and corrected the manuscript. MYJ provided suggestions to improve the manuscript. All of the authors read and approved the final manuscript.

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Turnip, A., Hong, KS. & Jeong, MY. Real-time feature extraction of P300 component using adaptive nonlinear principal component analysis. BioMed Eng OnLine 10, 83 (2011). https://doi.org/10.1186/1475-925X-10-83

Received: 11 May 2011
Accepted: 23 September 2011
Published: 23 September 2011
DOI: https://doi.org/10.1186/1475-925X-10-83
