- Research
- Open Access
- Published:

# Wavelet-based Gaussian-mixture hidden Markov model for the detection of multistage seizure dynamics: A proof-of-concept study

*BioMedical Engineering OnLine*
**volume 10**, Article number: 29 (2011)

## Abstract

### Background

Epilepsy is a common neurological disorder characterized by recurrent electrophysiological activities, known as seizures. Without the appropriate detection strategies, these seizure episodes can dramatically affect the quality of life for those afflicted. The rationale of this study is to develop an unsupervised algorithm for the detection of seizure states so that it may be implemented along with potential intervention strategies.

### Methods

Hidden Markov model (HMM) was developed to interpret the state transitions of the *in vitro* rat hippocampal slice local field potentials (LFPs) during seizure episodes. It can be used to estimate the probability of state transitions and the corresponding characteristics of each state. Wavelet features were clustered and used to differentiate the electrophysiological characteristics at each corresponding HMM states. Using unsupervised training method, the HMM and the clustering parameters were obtained simultaneously. The HMM states were then assigned to the electrophysiological data using expert guided technique. Minimum redundancy maximum relevance (mRMR) analysis and Akaike Information Criterion (AICc) were applied to reduce the effect of over-fitting. The sensitivity, specificity and optimality index of chronic seizure detection were compared for various HMM topologies. The ability of distinguishing early and late tonic firing patterns prior to chronic seizures were also evaluated.

### Results

Significant improvement in state detection performance was achieved when additional wavelet coefficient rates of change information were used as features. The final HMM topology obtained using mRMR and AICc was able to detect non-ictal (interictal), early and late tonic firing, chronic seizures and postictal activities. A mean sensitivity of 95.7%, mean specificity of 98.9% and optimality index of 0.995 in the detection of chronic seizures was achieved. The detection of early and late tonic firing was validated with experimental intracellular electrical recordings of seizures.

### Conclusions

The HMM implementation of a seizure dynamics detector is an improvement over existing approaches using visual detection and complexity measures. The subjectivity involved in partitioning the observed data prior to training can be eliminated. It can also decipher the probabilities of seizure state transitions using the magnitude and rate of change wavelet information of the LFPs.

## Background

Epilepsy is one of the most common neurological disorders, affecting over 50 million people worldwide. The disorder is characterized by spontaneous, recurrent, seemingly unpredictable symptoms called seizures [1, 2]. A seizure can be defined as the sudden manifestation of lowered complexity synchronized rhythmic activities across populations of neurons [3]. Epilepsy affects motor and speech, as well as other cognitive functions that, if untreated, can lead to permanent damage to the brain. Time-frequency representation (TFR) is an important marker for understanding the progression into seizure onsets [4]. The progression to the seizure onset is associated with the entrainment of neuronal population firing. In particular, the TFR of epileptiform oscillation has been suggested as a useful tool in localizing regions of seizure onsets and in understanding the mechanisms behind seizure generation [5]. The current methods for detecting the state transitions of seizure episodes based on TFR usually involve the computation of state-specific features through supervised learning techniques [6–10]. In general, supervised learning strategies require certain knowledge of the system so that the data can be separated into different states based on their known dynamics before the training process. To date, a great deal of subjectivity is required for the implementation of these algorithms since the desired solution for detection must be defined by the experimenter.

Seizure detection refers to the identification of seizure onsets a few seconds before the observable behavioral changes [11, 12]. Several effective supervised pattern recognition strategies have been developed for seizure detection. Multi-layered networks were first introduced to analyze EEG data pertaining to seizure phenomenon in the mid-1990s [13, 14]. Various methods such as autoregressive models [15, 16] to more advanced techniques such as support vector machines [17, 18] were also proposed. The feature space for these approaches ranged from spectrogram [19], dominant frequency, power and amplitude [20] to time frequency distribution [21] such as wavelet transform based approximate entropy [22, 23]. It has been reported that frequency content of neuronal electrical activity changed significantly during the progression of a seizure, both in the interspike interval and in the intraburst dynamics [8, 24]. The major disadvantage of these supervised methods is that their performances can only be as good as the initial data separation criteria. Therefore, it is essential to utilize an unsupervised learning paradigm that transcends these restrictions.

An unsupervised probabilistic approach for the detection of seizure-like events (SLEs) *in vitro* extracellular local field potentials (LFPs) seizures using hidden Markov model (HMM) [25, 26] along with clustering of wavelet features is proposed. The *in vitro* model, emulating human epilepsy, provides a platform for testing the seizure dynamics detection algorithm. The HMM is not meant to reproduce the exact electrophysiological recordings of the brain. Instead, it is used to capture the essential TFR characteristics in the progression of SLEs and to estimate the state transitions as a multi-stage process. The HMM has an advantage over the supervised approaches because it does not require prior manual separation of data into different dynamics. The current approach of detecting seizure events using Markov models involves the estimation of either two (seizure and interictal) [27] or three (baseline, detected and seizure) [28] distinct states. Even though these proposed methods appear to detect seizure onsets, they failed to address the possibility of having multiple distinctive dynamics between non-ictal (interictal) and chronic seizure events, which may be an important aspect for the development of seizure therapy techniques. The training process of the HMM is an unsupervised approach. However, a certain amount of bias can be achieved by making an informed choice on the assignment of the model states to the electrographical activities after the unsupervised training process is completed.

To determine the optimal HMM topology for seizure detection, two methods were proposed and compared. First, the performance of the trained HMM was evaluated on the validation set to determine the suitable HMM topology. Second, minimum redundancy maximum relevance (mRMR) analysis [29] and Akaike Information Criteria (AICc) [30, 31] were used to find a suitable feature space and optimum model by balancing the log-likelihood (LL) against the number of model parameters. Furthermore, the performances of HMMs were compared with the wavelet-based supervised machine learning techniques [6, 7] based on the sensitivity and specificity of chronic seizure detection, the detection delay and the optimality measure [32]. We hypothesized that using appropriate wavelet features, the HMM can detect the various stages of SLEs at least as well as, if not better than, the supervised machine learning algorithms. We also hypothesized that the optimal HMM topology can illuminate multiple transitional characteristics in the tonic firing phrase leading to the onsets of chronic seizure activities.

## Methods

### A. Tissue preparation and data acquisition

Hippocampal slice recordings were obtained from eight Wistar rats (17-25 days old). The animals were anaesthetized with halothane and decapitated in accordance with the Canadian Animal Care Guidelines. The brains were promptly dissected and maintained in ice-cold (4°C) artificial cerebrospinal fluid (aCSF) for four to five minutes. Each brain was incised in a horizontal manner in accordance with the procedure outlined by [33, 34]. The dorsal cortex of each hemisphere was cut parallel to the rostral/caudal axis and glued dorsal side down to an aluminium block, with caudal end towards the blade. The block was secured at a 12-14° angle, and brain slices of 400 μm thick were sectioned using a vibratome. Next, slices were maintained at room temperature in oxygenated "standard" aCSF (95% O_{2}, 5% CO_{2}) for a minimum of one hour prior to recording. The composition of the "standard" aCSF was as follows (in mM), NaCl (125), KCl (5), NaH_{2}PO_{4} (1.25), MgSO_{4} (2), CaCl_{2} (2), NaHCO_{3} (25) and glucose (10). The pH was approximately 7.4, with osmolarity in the range of 300 ± 5 mOsm. During the data acquisition stage, slices were transferred to the fusion chamber maintained at 30°C (Medical Systems Corp., Model PDMI-2, Harvard Apparatus, St. Laurent, Quebec, Canada). The pyramidal cells were visualized with an upright microscope (BX51, Olympus, Melville, NY, USA) using infrared imaging with differential interference contrast (IR-DIC) under 40 × magnification (water-immersion objective) with an OLY-150IR camera-video monitor unit (Olympus) [35]. Pyramidal cells were identified based on their characteristic spike frequency of 15.0 ± 5 Hz, their morphological features and the general location of the electrode placement. At the time of recording, spontaneous SLEs were induced by perfusing the slice with low-Mg^{2+} ACSF (containing in mM: 123 NaCl, 5 KCl, 1.5 CaCl2, 0.25 MgSO4, 25 NaHCO3, 1.2 NaH2PO4 and 15 glucose), or by tetanic stimulation of the CA3 region in "standard" aCSF, once every 10 min (80 Hz, 1 second duration), using a Grass S44 stimulator (Grass Medical Instruments, West Warwick, Rhode Island). The reduction of extracellular Mg^{2+} concentration has long been known to enhance neuronal excitability by decreasing membrane surface charge screening and, thereby, facilitating the activation of inward currents in addition to increasing the synaptic excitation by unblocking the NMDA receptor [36]. The population dynamic of the LFPs were measured using an aCSF-filled borosilicate glass pipette located in stratus pyramidal of the CA1 region of the hippocampus. Data was acquired using a custom-made DC differential amplifier with a lowpass filter (corner frequency 400 Hz), digitized at 1 kHz by a Digidata 1322 (Axon Instruments, Union City, California). The whole-cell patch-clamp recordings were performed in the current clamp configuration using an Axopatch 200B amplifier (Axon Instruments, Union City, CA, USA). The whole-cell patch pipette solutions contained (contained in mM: 8 NaCl, 0.001 CaCl_{2}, 10 Na-Hepes, 5 KCl, 140 potassium gluconate, 1 MgCl_{2}, 0.3 Na-GTP and 2 Na-ATP). The perforated patch pipette solution (containing in mM: 50 KCl, 2 Hepes, 0.1 EGTA and ≤50 μg/ml gramicidin).

Overall, this study utilized 50 SLEs from 20 hippocampal slices in 8 rats with at least two SLEs recordings per rat. Each SLE was normalized in amplitude between -1 and +1. The DC components and the 60 Hz noise along with their harmonics were removed using FIR notch filtering. All of the analyses were performed using MATLAB (MathWorks, Natick, MA). The entire date set contained SLEs 50-379 s in duration (mean ± standard deviation of 145 ± 70 s). A typical SLE was made up of the interictal period, followed by tonic firing and then chronic seizure, before returning to the interictal period through the postictal activity. The mean duration of the chronic seizure was 74 s with standard deviation of 32 s. The duration of the tonic firing pattern lasted 0-301 s (mean ± standard deviation of 49 ± 48 s). We also included over 30 min of non-ictal/interictal bursting (IB) activities to determine the false positive rates of the seizure detectors. Details of the dataset for analysis are given in Table 1.

### B. Wavelet-based hidden Markov model seizure detector

The HMM is a powerful technique for the estimation and analysis of state transitions in any potentially multi-stage process. It is particularly useful in describing the progression of time-varying phenomenon in which the observed signals are emitted from the underlying dynamical states whose detailed generating mechanism is unknown or hidden. This is also important because the underlying dynamics of the brain is still relatively unknown.

#### Feature space

The LFPs recorded near the stratus pyramidal of the CA1 region of *in vitro* rat hippocampal slices were used to train the HMM. The continuous wavelet transform (CWT) [37–39] using Morlet mother wavelet *ψ(t)* was performed to extract the features. We have noted from our previous study that the choice of mother wavelet does not affect the pattern recognition system [6]. The modified wavelets *ψ*
_{
s,l
}(*t*), derived from a mother wavelet by a scaling factor *s* and a translation factor *l*, is defined as:

The wavelet coefficients (*c*) can be computed at each non-overlapping 1-second moving time window, as the correlation between the modified wavelet and the input data. For this study, seven frequency components (as illustrated in Table 2) denoting important physiologically-relevant frequency bands were obtained. We evaluated two feature spaces. The first feature set was made up of 7-D wavelet coefficient data, which was identical to the feature set used in the previous supervised seizure detection algorithm called the wavelet artificial neural networks (WANN) [6, 7]. The second feature set was a 14-D vector consisted of the information from the first feature set and the rate of change information *Δc*, defined as the numerical difference of *c* from the previous time window at each frequency band.

#### HMM topology

The HMM topology is defined by the number of states (Q) in the HMM and the number of basis functions (M) used to represent the feature space. The value Q can be interpreted as the potential number of distinct dynamics that may exist within a SLE. The value M can loosely be interpreted as the complexity of the feature for each possible model state.

The model parameters of the HMM help create two probability functions: the state transition probability and the emission probability. The parameter *a*
_{
ij
}is the probability of state transition from state *i* to state *j* where *i* and *j* can range from 1 to Q. The model state *j* at time *t* is denoted as (*S*
_{
t
}= *j*).

In this study, Q could go up to 10, allowing for the detection of potential multi-stage SLE processes. The transition of the model states was also assumed to follow a static probability distribution such that the transition probability was independent of time.

For the representation of the features space, an unsupervised Mixture of Gaussians (MoGs) clustering technique was used. The feature vectors in the D-dimensional space is denoted as ** x**, where D = 7 or 14 depending of the features. The multivariate Gaussian probability density measure is a function of the center

*μ*

_{ k }(D by 1 in size) and covariance matrix

*Σ*

_{ k }(D by D in size). It is defined as:

The HMM was set up such that each state *j* would have M number of clusters in the feature space. Initially, the Gaussian clusters were randomly scattered and a subset of the feature ** x** was assigned to the k

^{th}cluster in the hidden state

*j*. The output probability function, called emission probability

*b*

_{ j }

*(*

*x**)*can then be defined as:

Here, the weighting factor for each cluster is denoted as *w*
_{
jk
}. In order to interpret *b*
_{
j
}
*(* *x**)* as true probability, the weighting factor needs to satisfy the following criteria:

In this paper, the number of clusters was allowed to vary from 1 to 5. A schematic representation of a 3-state HMM based on a 2-D feature space is shown in Figure 1, illustrating how the transition probabilities and the emission probabilities can tell us about the progression of the model states and the representation of Gaussian clusters in the state space.

#### Training and validation

The training of HMM involved the process of maximizing the probability of fitting the distribution of the feature space through an iterative process of updating the model parameters [40]. Forty percent of the SLEs were randomly selected as the training set. For each HMM topology (Q = 1 to 10 and M = 1 to 5), the model parameters (*a*
_{
ij
}
*, w*
_{
jk
}
*, μ*
_{
jk
}and *Σ*
_{
jk
}) related to HMM transitions and clustering were randomly initialized and estimated in an iterative manner simultaneously. These parameters were updated according to the Expectation-Maximization (EM) algorithm [25]. At the Expectation step, the responsibility of the model in representing the observed data was evaluated. The joint probability of observing all the data up to time *t* at state *j* was given as *α*
_{
j
}
*(t)* and the conditional probability of all the future data from time *t+1* onward at state *i* was given as *β*
_{
i
}
*(t)*. The joint probability and conditional probability can be updated in an iterative manner:

Using α and β, the probability of *S*
_{
t
}= *i* is given by *γ*
_{
i
}
*(t)*. The probability of having *S*
_{
t
}= *i* and *S*
_{
t+1
}= *j* is denoted as *ζ*
_{
ij
}
*(t)*.

For the clustering of feature space using MoGs, the probability that the k^{th} cluster at state *j* can generate a particular observation *x*_{
t
}is given by:

From these estimates, the model parameters can be updated in the Maximization step.

The EM algorithm was repeated until the log-likelihood (LL) measure was no longer showing significant improvement. The LL measures quantified the goodness of fit between the feature ** x**:

The EM algorithm would typically take only a few seconds to complete. From the learning process described above, it should be obvious that if the number of clusters (M) and the number of possible HMM states (Q) were allowed to increase unrestricted, the LL would continue to improve at the expense of over-fitting the data.

Two methods were used to select the appropriate HMM model topology (Q and M) for each feature set. The first method involved the use of the validation data set. The performance of the trained HMMs at different Q and M combinations were examined on the validation sets consisting of 20% of the overall SLEs data. The HMM associated with the best overall balance between ictal and non-seizure data detection was denoted as the optimal HMMs. The optimal HMM obtained using 7-D wavelet coefficient features was denoted HMM_{opt7D} and the optimal HMM obtained using 14-D wavelet and rate of change features was denoted HMM_{opt14D}. These models were then tested to evaluate the state transitions on the test data, consisting of the remaining 40% of the SLEs. The second method to find the HMM topology involved the use of minimum redundancy maximum relevance (mRMR) analysis [41] to reducing the dimensionality of the feature space, and the use of AICc [30]. The mRMR method, which has been frequently used in gene expression research, can identify subsets of feature that were relevant to the classification tasks. It selected the features that were mutually far apart from each other (small *W*
_{
c
}) while still having a strong correlation (large *V*
_{
F
}) to the target state.

Here, *Π* represents the set of features, *C(f*
_{
m
}
*,f*
_{
n
}
*)* is the correlation between the two wavelet features *f*
_{
m
}and *f*
_{
n
}and *F(f*
_{
m
}
*,S)* is the F-statistics between the feature *f*
_{
m
}and the target chronic seizure state *S*. The wavelet features associated with the largest mutual information quotient *V*
_{
F
}
*/W*
_{
c
}were selected to construct the reduced feature set. This reduced feature set was then used to build other HMMs based on AICc. The AICc rewards goodness of fit based on the LL information, but also includes a penalty term that is proportional to the number of parameters (K = 3QM + Q^{2}) to reduce over-fitting.

The size of the test set (*n*) also plays a role in determining AICc if *n/K < 40*. For each HMM topology (Q and M), an AICc value was computed. These AICc values were then rescaled with respect to the minimum AICc within the group such that the lowest AICc value was set to zero [31].

The model with ΔAICc < 0.25 while having the least number of model parameters (K) was then denoted as the optimal HMM (HMM_{AIC}).

### C. Statistical test and optimality index

The HMM topologies were evaluated according to their abilities to detect non-seizure events, different stages in the tonic activities as well as chronic seizures. After the unsupervised training, the marginal posterior distribution *γ*
_{
i
}
*(t)* for each state *i* was computed by evaluating the HMMs on the test data. The detection of distinct initiation and termination of seizure dynamics as they evolved with time [42] was also considered in the form of early and late tonic spikes. An expert-guided state assignment procedure was used to identify non-ictal to ictal transitions using short time maximum Lyapunov exponent estimator derived from Rosenstein's algorithm [43]. The dynamics of chronic seizure period was first assigned to the most probable HMM state after training. Based on the state progression in the transition matrix, the tonic firing and non-ictal states were assigned in reverse order to HMM states before chronic activities; the postictal activity was assigned to the appropriate state forward-in-time to the HMM state after chronic activities.

Two types of spiking activities can be observed from the data set: tonic firing and chronic seizure activity. To evaluate the statistical robustness of the HMM, the sensitivity (TP) and specificity (TN) measures were determined. TP is defined by the percentage of correct chronic state detection within 30 s after the start of the chronic events, denoted as electrographical chronic seizure onset time (EcSOT) [44]. The percentage of correctly identified non-seizure/interictal activities at least 30 s prior to EcSOT is denoted as the TN. The ability of the HMM to detect chronic seizure onset early enough was determined. The time delay (*ΔT*) is defined as the time difference between the EcSOT and the approximated chronic seizure onset time (AcSOT) for each HMM topology. The AcSOT is defined as the time instance at which *γ*
_{
i
}
*(t)* first identified the chronic seizure state within a detection horizon of 30 s around the EcSOT such that the EcSOT is located in the middle of a 1 min time window. Here, a positive *ΔT* would imply early chronic seizure detection while a negative *ΔT* would mean that the detection happened after the chronic seizure onset. Finally, the optimality index measure (*O*) [32], used to evaluate the overall performance of the HMM, is defined as:

Here, the value *d* is the chronic seizure duration for each SLE. A larger *O* indicates a better overall performance for a given detector. These measures (TP, TN, *ΔT* and *O*) from the HMMs were also compared with other supervised wavelet artificial neural network (WANN) [6] built on identical features. Figure 2 provides a graphical illustration on the overall training and evaluation strategies.

## Results

Comparison of the performance of HMM_{opt7D} and HMM_{opt14D} was done using a five-fold cross-validation technique. The model parameters of the HMMs were identified such that the best statistics for seizure detection in terms of optimality index (*O*) can be produced in the validation data set.

After the unsupervised training and before the assignment of electrophysiological states to the model states, the LL for different combinations of Q and M values were compared to give the initial estimate on the goodness of fit between the model and the recorded data. Not surprisingly, the LL on the training data would increase and gradually reach a plateau for large Q and M values. However, using a large number of states and clusters would cause over-fitting due to the curse of dimensionality, hence severely jeopardizing the generalization ability of the detector. It was also not surprising that at identical Q and M combination, the LL for the 14-D HMM using both *c* and *Δc* was always larger than the LL for the 7-D HMM using *c* only (see Figure 3a and 3b). This implies that the feature space containing *Δc* may more accurately represent the underlying dynamics of SLEs.

After assessing each HMM using the validation data, the topologies of the HMM_{opt7D} and HMM_{opt14D} were obtained with the expert-guided state assignment. For HMM with wavelet features (*c*) only, the maximum sensitivity of 68.4 ± 21.1% and specificity of 85.8 ± 13.7% on the validation set was obtained for the HMM topology of Q = 5 and M = 4. The *a*
_{
ij
}of the HMM_{opt7D} also exhibited a unidirectional sequence of state transitions (as shown in Figure 4a). The state *S*
_{
5
}was assigned as the chronic seizure state, S_{5} was preceded by *S*
_{
4
}as the tonic firing state and *S*
_{
2
}as the interictal state. The *S*
_{
1
}and *S*
_{
3
}can then be consolidated to represent the postictal activity. An illustration of the LFP in the test set along with the posterior probabilities of the model states is shown in Figure 4b. The HMM_{opt14D} was found to contain Q = 8 and M = 3, with the state transition diagram shown in Figure 5a. The application of HMM_{opt14D} on the validation set gave the maximum sensitivity of 80.1 ± 15.7% and maximum specificity of 95.3 ± 4.4%. The state *S*
_{
6
}can be assigned as chronic seizure, *S*
_{
3
}and *S*
_{
8
}can be assigned as the postictal and interictal respectively. The three interconnecting states of *S*
_{
1
}, *S*
_{
4
}, and *S*
_{
7
}before chronic state *S*
_{
6
}suggested that they can be assigned as the late tonic firing activities. The state transition followed a somewhat unidirectional evolution except for the interconnecting states of the early tonic state (*S*
_{
2
}and *S*
_{
5
}) back to the interictal state (*S*
_{
8
}), which appeared to suggest the possibility of having seizure permissive tonic firing not leading to chronic seizures. Figure 5b gives a graphical illustration of the LFP along with the posterior probabilities of the model states using HMM_{opt14D}.

Next, the HMM approach was also compared against the supervised approach using identical wavelet coefficient feature set, in the form of a fully-supervised wavelet artificial neural network (WANN) seizure detection method designed by our group [6, 7]. The training data for the WANN was created by first identifying the EcSOT, the data prior to the EcSOT was separated into 30 s intervals based on the assumption that there may be distinct changes in LFP leading to the EcSOT. Based on one-way ANOVA statistical analysis, the WANN and HMMopt7D did not show significant difference in their optimality index (0.756 ± 0.059 and 0.665 ± 0.260, respectively).

When the wavelet rate of change information was included in the feature space, the HMM_{opt14D} gave the best overall performance out of the three approaches, with *O* = 0.915 ± 0.302 (p < 0.005). Table 3 summarizes the result of the chronic seizure detection for these three models.

The next question we asked was whether the state transitions of the HMM would hinder the performance of the state detection on seizure-free data as well as seizure permissive states that could translate back into interictal without generating seizures. Using 30 min of interictal bursts and non-seizure data that were not part of the training set, mean detection accuracies of 90.1% and 97.1% were achieved using HMM_{opt7D} and HMM_{opt14D} respectively. Since the state transition of HMM_{opt7D} is strictly unidirectional (Figure 4a), any missed false detection of the interictal state would trigger a cascade of errors, resulting in significantly lower accuracy in interictal state detection. HMM_{opt14D} on the other hand allowed possible reversal in state transition back to interictal. In Figure 6, a sample LFP recording along with the posterior probabilities of the HMM_{opt14D} is shown, illustrating that the model was able to classify the IB activities as non-ictal/interictal.

Another method for selecting the appropriate HMM topology was investigated by reducing the feature space dimension and the number of free parameters in the model. Minimum redundancy maximum relevance (mRMR) technique was used to select a subset of relevant wavelet features. It can alleviate the effect of over-fitting caused by the curse of dimensionality and improve the model's ability to generalize. In conjunction with the AICc, which helped balance the LL against the number of model parameters, suitable optimum labelled HMM_{AIC} can be found. A summary of the mRMR analysis is shown in Table 4. The wavelet coefficients associated with the alpha (8 - 15 Hz) and beta (15 - 40 Hz) bands had the largest mutual information quotient (*V*
_{
F
}
*/W*
_{
c
}= 0.497 and 0.436 respectively). The wavelet coefficients at these two frequency bands then constituted the reduced features space. For the feature space consisted of the wavelet coefficients (*c*) only, the simplest 2-D HMM topology with ΔAICc < 0.25 was found at Q = 3, M = 2 (Figure 7a). The state transition diagram of the corresponding HMM is shown in Figure 8a. It contained three bidirectionally connected states. When the posterior probabilities of this model were matched against the LFP after assigning the model states (Figure 8b), no distinction between the tonic firing and the postictal events was found. There were also cases in which the state transitions jumped from interictal to chronic seizure directly and back. ON average, the performance of this HMM was slightly better than the HMM_{opt7D}, with TP = 80.9 ± 34.4%, TN = 94.8 ± 15.6%, *ΔT* = 3.79 ± 8.67 s and *O* = 0.813 ± 0.247, even though student T-test analysis did not reveal any statistically significant improvement. When the wavelet rate of change information (*Δc*) was added as feature, a 4-D reduced feature space was created. The simplest topology with ΔAICc < 0.25 for this feature set was found at Q = 5 and M = 3 (Figure 7b). Figure 9a shows the state transition diagram for this HMM (called HMM_{AIC}). Similar to the HMM_{opt14D}, the state progression was mainly unidirectional with a non-zero state transition probability from the early tonic state back to the interictal state. This suggests that it is possible to have seizure permissive early tonic activity that can be reverted back to the interictal state. The marginal posterior probability *γ*
_{
i
}
*(t)* for each state *i* of HMM_{AIC} is plotted against a sample test data in Figure 9b. Even though no significant improvement over HMM_{opt14D} was revealed using the student T-test, the HMM_{AIC} gives the best overall performance out of all the HMMs created in this study with TP = 95.7 ± 14.0%, TN = 98.9 ± 6.5%, *ΔT* = -2.03 ± 7.10 s and *O* = 0.995 ± 0.129. A summary of the performance measures for the HMM_{AIC} using mRMR and ΔAICc is presented in Table 5.

Finally, we evaluated the correlations between the model tonic firing state from the HMM_{AIC} with the phasic inhibition or excitation in the intracellular activities [35]. It has been suggested that the intracellular whole-cell recordings exhibited a switch from a dominant phasic inhibition (pre_{h}) to a dominant phasic excitation (pre_{d}) mode in the state transition leading to the chronic seizure onset [35]. An intermediate state (pre_{m}) was also reported to compose of a mixture of pre_{h} and pre_{d} mode. None of the HMM created in this study was able to detect early pre_{h} mode using the LFP data. In the HMM_{opt14D}, the states *S*
_{
2
}and *S*
_{
5
}(Figure 5a) can be considered as a combination of late pre_{h} and pre_{m}. Figure 10a summarizes the temporal relationship between the HMM-identified early and late tonic firing activities in the LFP with the identified pre_{m} and pre_{d} modes in the whole-cell recording. Most of the late tonic firing activities identified by the model started earlier than the pre_{d} intracellular activities. Out of the 20 test cases, 85.9% of the pre_{d} activity was identified as the late tonic firing phase. A paired T-test did not indicate a statistically significant difference between the start of the whole-cell pre_{d} activity and the onset of the late tonic HMM state (p > 0.49), as illustrated in Figure 10b.

## Discussion

The administration of a therapeutic intervention, such as electrical stimulation, may be effective in preventing seizures before and during the early stages of the seizure onset [45]. The success of a real-time closed-loop seizure prevention method depends on the time available between the early seizure detection and the manifestation of ictal onset. Many early seizure detection algorithms have been proposed [9, 15, 23, 32, 46–50]. The application of HMM for seizure dynamics detection was inspired by the relatively poor objective criteria for identifying the precise period of preictal interval in supervised learning. Here, an unsupervised training strategy with expert guided state assignment of HMM is proposed where the best rule to represent the wavelet features of seizure progression can be identified. One of the major criteria for selecting an appropriate seizure detection algorithm for a real-time seizure intervention system is the available time for the detection of the impending ictal onset. In this paper, HMMs were created utilizing different feature vectors to characterize the dynamics of SLEs. A motivation behind this work is to determine whether an unsupervised method can produce an accurate seizure detector with a high optimality index. Furthermore, feature selection based on the mRMR criteria and topology selection based on ΔAICc were implemented to evaluate the ability of the HMM to detect multi-stage dynamics leading to chronic seizure onset by measuring the sensitivity, specificity, detection delay and the optimality index.

The unsupervised learning strategy of the HMM involves the estimation of the model parameters through the maximization of the LL function. The training data consists of the wavelet-based features from 20 SLEs. Unlike other recent approaches that assumed a fixed number of states (such as interictal, preictal and ictal) in the model [28], the total number of hidden states in the HMM in this study was allowed to vary in an unsupervised manner. It is logical to expect that if more states are allowed, the better the model would match the observations. However, increasing the possible number of hidden states would lead to over-fitting, hence reducing its ability to generalize. To alleviate this problem, we need to find out the optimal HMM topology defined by the number of states (Q) and the number clusters in the feature space (M) that would be able to generalize well.

The first method we tried was to set aside a portion of the data for validation purpose. The objective was to get the Q and M combination that would give the highest optimality index in the validation set. The optimal HMM (HMM_{opt7D}) created using only wavelet coefficient features had five states (Q = 5) with each feature space modeled by four clusters (M = 4). If the rate of change of the wavelet coefficients was also included in the feature space, the optimal HMM (HMM_{opt14D}) would consist of eight states (Q = 8) with the feature space modeled using three clusters (M = 3). The HMM_{opt14D} is superior to the HMM_{opt7D} not only because it has a higher optimality index (0.915 ± 0.129, compared to 0.665 ± 0.260), but also because it is more robust in identifying multiple distinct dynamics between non-ictal and ictal events. The detection of early and late tonic firing activities prior to the chronic onset *in vitro* also became possible. While the training process of the HMM was not constrained to follow a unidirectional state transition, the state transitions leading to the ictal onset often possessed some unidirectional progress. Once the HMM_{opt7D} output moved away from the interictal state, it must go through the whole seizure progression before returning to interictal. This would then increase the number of false detection in HMM_{opt7D}, as indicated by the results shown in Table 3. On the other hand, some early tonic firing state in the HMM_{opt14D} was allowed to revert back to interictal directly. However, once it had advanced past the early tonic state, it could not revert back to non-ictal activities. The HMM_{opt14D} was successfully tested on non-seizure or interictal burst data, demonstrating a mean accuracy of over 97%. Next, the unsupervised HMM and supervised WANN approaches [6] were compared. The two methods differ in the way that the parameters are obtained. The parameters of the HMM were updated in an iterative manner until no significant improvement in LL was achieved. The supervised WANN approach, on the other hand, required that the human user separated the data into different training groups. The WANN parameters were updated iteratively based on the partial derivative of error with respect to the weights. By definition, the performance of any supervised learning algorithm can be no better than the initial separation of seizure states by the human expert's "Gold Standard". Since the human user has access to the entire observation before marking the chronic onsets, retrospective bias exists in any supervised learning algorithm [28]. This retrospective bias can be eliminated with an unsupervised algorithm such as HMM. The HMM implementation is superior to the WANN because it can distinguish between the early and late tonic firing without having to define them before training.

The second method to reduce the complexity of the HMM based on mRMR feature selection criteria and ΔAICc topology selection was evaluated. The mRMR analysis showed that activity in the alpha (8 - 15 Hz) and beta (15 - 40 Hz) range has the largest mutual information quotient (*V*
_{
F
}
*/W*
_{
c
}) of 0.497 and 0.436 respectively. This result is consistent with the existing literature showing that the alpha and beta frequency bands exhibit considerable difference in the signal complexity between healthy subjects, epileptic subjects during a seizure-free interval, and during seizure [8]. Two new feature spaces were constructed: One consisted of 2-D wavelet coefficients and one consisted of 4-D wavelet coefficient with rate of change information. Using these new feature spaces, HMMs with different topologies were created using unsupervised learning. Since having large number of Q and M often lead to the curse of dimensionality resulting in over-fitting, AICc was used to select the appropriate model topology by balancing the goodness of fit with the number of parameters used. We found that many Q and M combinations were able to achieve similar levels of AICc; the optimal HMM was selected as the model with the smallest number of parameters while maintaining a difference of <0.25 from the minimum AICc. The HMM of 2-D reduced features consists of three states (Q = 3) with two clusters (M = 2). One of these three states reflects the dynamics of both tonic firing and postictal activities (Figure 8a). Because of the small transition probability (*a*
_{
ij
}= 0.001) from the tonic/postictal back to interictal state, it typically would take a much longer time for the model to return to the proper interictal state after chronic seizure. This has been the source for most of the misclassification in the non-ictal activity using this model. The HMM model of 4-D reduced features consists of five states (Q = 5) with three clusters (M = 3). Again, by including the wavelet rate of change information, this model gave the best overall performance with *O* = 0.995 ± 0.129, even though the use of AICc does not warrant an improvement in seizure detection performance. The AICc simply offers a much simpler alternative to selecting the optimal HMM topology. We were able to more easily and more quickly create a HMM seizure detector with fewer number of model states and feature clusters with similar performance level.

The interpretation of early and late tonic firing state detected using HMM_{AIC} in terms of the underlying intracellular whole-cell measurement was investigated. When the model posterior probabilities were plotted against the corresponding intracellular whole-cell recording, we found a significant overlap (>85%) between the late tonic state and the intracellular pre_{d} activity. It was also observed that the early hyperpolarizing activity (pre_{h}) could not be distinguished from the interictal activities.

The application of HMM is based on the assumption that state transitions in seizure generation follow first-order Markov processes. Depending on the sampling period of the signal and the available history of HMM output, it is likely that neurodynamic would depend on more than one previous sample in time. This work can be extended to incorporate additional states in the past for the estimation of state transitions, which can be achieved through modification of the conditional state transition probability equation such as the hidden semi-Markov model [51] where state transition depends on the lapsed time since entering the current state. This model is also appropriate because the underlying process of seizure generation may not have a geometrically distributed duration. Another possible improvement is to include other types of features such as multi-site coherence in the feature space, since the manifestation of epilepsy typically involved progressive global entrainment. Preliminary analysis on the analysis of clinical seizure data from intracranial EEG measurement using HMM also showed promising results [52].

## Conclusions

Based on the analysis of optimality index, the implementation of hidden Markov model as a seizure dynamics detector offers significant improvement over existing approaches based on human visual classification and supervised connectionist perspectives. The subjectivity involved in partitioning the observed data into target states prior to training is eliminated. This model is able to estimate the parameters needed to best fit the observed data depending on its specified topology. Once the unsupervised training is completed, the assignment of the model state to the electrophysiology data is guided by a human expert. To alleviate the curse of dimensionality, feature selection based on minimum redundancy maximum relevance and topology selection based on Akaike information criteria are implemented without jeopardizing the sensitivity, specificity and early detection time of the final model. A five-state hidden Markov model was created, capable of detecting chronic seizures with 95.7% sensitivity and 98.9% specificity. It was also able to detect early and late tonic firing activities that correlate with the intracellular whole-cell dynamics prior to chronic seizure onset. Such a model would potentially allow the researchers to decipher possible sequence of distinct dynamic modes leading to seizure onsets.

## References

Milton J, Black D:

**Dynamic diseases in neurology and psychiatry.***Chaos*1995,**5:**8–13. 10.1063/1.166103Shin C, McNamara J:

**Mechanism of epilepsy.***Ann Rev Med*1994,**45:**379–389. 10.1146/annurev.med.45.1.379Chiu AW, Jahromi SS, Khosravani H, Carlen PL, Bardakjian BL:

**The effects of high-frequency oscillations in hippocampal electrical activities on the classification of epileptiform events using artificial neural networks.***J Neural Eng*2006,**3:**9–20. 10.1088/1741-2560/3/1/002Lasztoczi B, Antal K, Nyikos L, Emri Z, Kardos J:

**High-frequency synaptic input contributes to seizure initiation in the low-[Mg2+] model of epilepsy.***European Journal of Neuroscience*2004,**19:**1361–1372. 10.1111/j.1460-9568.2004.03231.xWorrell GA, Parish L, Cranstoun SD, Jonas R, Baltuch G, Litt B:

**High-frequency oscillations and seizure generation in neocortical epilepsy.***Brain*2004,**127:**1496–1506. 10.1093/brain/awh149Chiu AW, Daniel S, Khosravani H, Carlen PL, Bardakjian BL:

**Prediction of seizure onset in an in-vitro hippocampal slice model of epilepsy using Gaussian-based and wavelet-based artificial neural networks.***Ann Biomed Eng*2005,**33:**798–810. 10.1007/s10439-005-2346-1Chiu AW, Kang EE, Derchansky M, Carlen PL, Bardakjian BL:

**Online prediction of onsets of seizure-like events in hippocampal neural networks using wavelet artificial neural networks.***Ann Biomed Eng*2006,**34:**282–294. 10.1007/s10439-005-9029-9Adeli H, Ghosh-Dastidar S, Dadmehr N:

**A wavelet-chaos methodology for analysis of EEGs and EEG subbands to detect seizure and epilepsy.***IEEE Transactions on Biomedical Engineering*2007,**54:**205–211.Khan YU, Gotman J:

**Wavelet based automatic seizure detection in intracerebral electroencephalogram.***Clin Neurophysiol*2003,**114:**898–908. 10.1016/S1388-2457(03)00035-XSubasi A, Alkan A, Koklukaya E, Kiymik MK:

**Wavelet neural network classification of EEG signals by using AR model with MLE preprocessing.***Neural Netw*2005,**18:**985–997. 10.1016/j.neunet.2005.01.006Osorio I, Frei MG, Wilkinson SB:

**Real-time automated detection and quantitative analysis of seizures and short-term prediction of clinical onset.***Epilepsia*1998,**39:**615–627. 10.1111/j.1528-1157.1998.tb01430.xGrewal S, Gotman J:

**An automatic warning system for epileptic seizures recorded on intracerebral EEGs.***Clinical Neurophysiology*2005,**116:**2460–2472. 10.1016/j.clinph.2005.05.020Webber WR, Lesser RP, Richardson RT, Wilson K:

**An approach to seizure detection using an artificial neural network (ANN).***Electroencephalogr Clin Neurophysiol*1996,**98:**250–272. 10.1016/0013-4694(95)00277-4Pradhan N, Sadasivan PK, Arunodaya GR:

**Detection of seizure activity in EEG by an artificial neural network: a preliminary study.***Comput Biomed Res*1996,**29:**303–313. 10.1006/cbmr.1996.0022Kiymik MK, Subasi A, Ozcalik HR:

**Neural networks with periodogram and autoregressive spectral analysis methods in detection of epileptic seizure.***J Med Syst*2004,**28:**511–522.Kim H, Rosen J:

**Epileptic seizure detection - an AR model based algorithm for implantable device.***Conf Proc IEEE Eng Med Biol Soc***1:**5541–5544.Nandan M, Talathi SS, Myers SM, Ditto WL, Khargonekar PP, Carney PR:

**Support vector machines for seizure detection in an animal model of chronic epilepsy.***Journal of Neural Engineering*2010,**7:**036001. 10.1088/1741-2560/7/3/036001Shoeb A, Edwards H, Connolly J, Bourgeois B, Treves T, Guttag J:

**Patient-specific seizure onset detection.***Epilepsy & Behavior*2004,**5:**483–498. 10.1016/j.yebeh.2004.05.005Gabor AJ:

**Seizure detection using a self-organizing neural network: Validation and comparison with other detection strategies.***Electroencephalography and Clinical Neurophysiology*1998,**107:**27–32. 10.1016/S0013-4694(98)00043-1Karayiannis NB, Mukherjee A, Glover JR, Ktonas PY, Frost JD Jr, Hrachovy RA, Mizrahi EM:

**Detection of pseudosinusoidal epileptic seizure segments in the neonatal EEG by cascading a rule-based algorithm with a neural network.***IEEE Trans Biomed Eng*2006,**53:**633–641. 10.1109/TBME.2006.870249Guerrero-Mosquera C, Trigueros AM, Franco JI, Navia-Vazquez A:

**New feature extraction approach for epileptic EEG signal detection using time-frequency distributions.***Med Biol Eng Comput***48:**321–330.Guo L, Rivero D, Pazos A:

**Epileptic seizure detection using multiwavelet transform based approximate entropy and artificial neural networks.***J Neurosci Methods***193:**156–163.Ocak H:

**Automatic detection of epileptic seizures in EEG using discrete wavelet transform and approximate entropy.***Expert Systems with Applications*2009,**36:**2027–2036. 10.1016/j.eswa.2007.12.065Perez Velazquez JL, Khosravani H, Lozano A, Bardakjian BL, Carlen PL, Wennberg R:

**Type III intermittency in human partial epilepsy.***European Journal of Neuroscience*1999,**11:**2571–2576. 10.1046/j.1460-9568.1999.00688.xRabiner LR:

**A tutorial on hidden Markov models and selected applications in speech recognition.***Proceedings of the IEEE*1989,**77:**257–286. 10.1109/5.18626Bishop CM:

*Pattern recognition and machine learning*. New York: Springer; 2006.Sunderam S, Osorio I, Frei MG, JF W:

**Stochastic modeling and prediction of experimental seizures in Sprague-Dawley rats.***J Clin Neurophysiol*2001,**18:**275–282. 10.1097/00004691-200105000-00007Wong S, Gardner AB, Krieger AM, Litt B:

**A stochastic framework for evaluating seizure prediction algorithms using hidden Markov models.***J Neurophysiol*2007,**97:**2525–2532. 10.1152/jn.00190.2006MacKay D:

*Bayesian methods for backpropagation networks*. Springer; 1994.Akaike H:

**A new look at the statistical model identification.***IEEE Transactions on Automatic Control*1974,**AC-19:**716–723.Burnham KP, Anderson DR:

*Model Selection and Multi-Model Inference: A Practical Information-Theoretic Approach*. Springer; 2002.Talathi SS, Hwang DU, Spano ML, Simonotto J, Furman MD, Myers SM, Winters JT, Ditto WL, Carney PR:

**Non-parametric early seizure detection in an animal model of temporal lobe epilepsy.***Journal of Neural Engineering*2008,**5:**85–98. 10.1088/1741-2560/5/1/009Rafiq A, DeLorenzo RJ, Coulter DA:

**Generation and propagation of epileptiform discharges in a combined entorhinal cortex/hippocampal slice.***J Neurophysiol*1993,**70:**1962–1974.Rafiq A, Zhang YF, DeLorenzo RJ, Coulter DA:

**Long-duration self-sustained epileptiform activity in the hippocampal-parahippocampal slice: a model of status epilepticus.***J Neurophysiol*1995,**74:**2028–2042.Derchansky M, Jahromi SS, Mamani M, Shin DS, Sik A, Carlen PL:

**Transition to seizures in the isolated immature mouse hippocampus: a switch from dominant phasic inhibition to dominant phasic excitation.***J Physiol*2008,**586:**477–494.Mody I, Lambert JD, Heinemann U:

**Low extracellular magnesium induces epileptiform activity and spreading depression in rat hippocampal slices.***J Neurophysiol*1987,**57:**869–888.Blanco S, D'Attellis CE, Isaacson SI, Rosso OA, Sirae RO:

**Time-frequency analysis of electroencephalogram series. II. Gabor and wavelet transforms.***Physical Review E - Statistical Physics, Plasmas, Fluids, and Related Interdisciplinary Topics*1996,**54:**6661–6672.Mallat S:

**A theory for multiresolution signal decomposition: the wavelet representation.***IEEE transactions on pattern analysis and machine intelligence*1989,**11:**674–693. 10.1109/34.192463Rosso OA, Martin MT, Figliola A, Keller K, Plastino A:

**EEG analysis using wavelet-based information tools.***J Neurosci Methods*2006,**153:**163–182. 10.1016/j.jneumeth.2005.10.009Bilmes JA:

**A gentle tutorial on the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models.**In*Int Comput Sci Inst (ICSI) Tech Rep ICSI-TR-97–021*. Berkeley, CA; 1997.Peng H, Long F, Ding C:

**Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy.***IEEE Transactions on Pattern Analysis and Machine Intelligence*2005,**27:**1226–1238.Schiff SJ, Sauer T, Kumar R, Weinstein SL:

**Neuronal spatiotemporal pattern discrimination: the dynamical evolution of seizures.***Neuroimage*2005,**28:**1043–1055. 10.1016/j.neuroimage.2005.06.059Rosenstein MT, Collins JJ, Luca CJD:

**A practical method for calculating largest Lyapunov exponents from small data sets.***Physica D*1993,**65:**117–134. 10.1016/0167-2789(93)90009-PLasztoczi B, Kardos J:

**Cyclothiazide prolongs low [Mg2+]-induced seizure-like events.***J Neurophysiol*2006,**96:**3538–3544. 10.1152/jn.00287.2006Khosravani H, Carlen PL, Velazquez JL:

**The control of seizure-like activity in the rat hippocampal slice.***Biophys J*2003,**84:**687–695. 10.1016/S0006-3495(03)74888-7Mormann F, Andrzejak RG, Kreuz T, Rieke C, David P, Elger CE, Lehnertz K:

**Automated detection of a preseizure state based on a decrease in synchronization in intracranial electroencephalogram recordings from epilepsy patients.***Phys Rev E Stat Nonlin Soft Matter Phys*2003,**67:**021912.Alkan A, Koklukaya E, Subasi A:

**Automatic seizure detection in EEG using logistic regression and artificial neural network.***Journal of Neuroscience Methods*2005,**148:**167–176. 10.1016/j.jneumeth.2005.04.009Kuhlmann L, Burkitt AN, Cook MJ, Fuller K, Grayden DB, Seiderer L, Mareels IM:

**Seizure detection using seizure probability estimation: comparison of features used to detect seizures.***Ann Biomed Eng*2009,**37:**2129–2145. 10.1007/s10439-009-9755-5Tezel G, özbay Y:

**A new approach for epileptic seizure detection using adaptive neural network.***Expert Systems with Applications*2009,**36:**172–180. 10.1016/j.eswa.2007.09.007Aarabi A, Fazel-Rezai R, Aghakhani Y:

**A fuzzy rule-based system for epileptic seizure detection in intracranial EEG.***Clinical Neurophysiology*2009,**120:**1648–1657. 10.1016/j.clinph.2009.07.002Russell M, Moore R:

**Explicit variable duration hidden markov models for automatic speech recognition.***Proc of ICASSP*1985, 5–8.Gadi H, Moller DW, Valiante TA, Andrade D, Chiu AW:

**Spatial time-Frequency Analysis and Non-parametric Classification of Human IEEG Recordings Using HMM.***BMES Annual Fall Meeting*2008.

## Acknowledgements

The project described was supported by Grant Number P20RR016456 from the National Center for Research Resources. The authors would also like to acknowledge the Canadian Institutes of Health Research (CIHR), the Natural Science and Engineering Research Council (NSERC) of Canada and the Louisiana Board of Regents (LEQSF(2007-10)-RD-A-20) for their financial support, as well as Dr. Geoffrey Hinton from the University of Toronto for his insightful feedbacks.

## Author information

### Authors and Affiliations

### Corresponding author

## Additional information

### Competing interests

The authors declare that they have no competing interests.

### Authors' contributions

AWLC and SOT carried out the data analysis and drafted the manuscript. PLC supervised and MD and MC performed the data acquisition. AWLC and BLB participated in the design of the study. All authors read and approved the current manuscript.

## Authors’ original submitted files for images

Below are the links to the authors’ original submitted files for images.

## Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

## About this article

### Cite this article

Chiu, A.W., Derchansky, M., Cotic, M. *et al.* Wavelet-based Gaussian-mixture hidden Markov model for the detection of multistage seizure dynamics: A proof-of-concept study.
*BioMed Eng OnLine* **10**, 29 (2011). https://doi.org/10.1186/1475-925X-10-29

Received:

Accepted:

Published:

DOI: https://doi.org/10.1186/1475-925X-10-29

### Keywords

- Feature Space
- Hide Markov Model
- Wavelet Coefficient
- Seizure Detection
- Wavelet Feature