
Automatic detection of epilepsy from EEGs using a temporal convolutional network with a self-attention layer



Over 60% of epilepsy patients globally are children, whose early diagnosis and treatment are critical for their development and can substantially reduce the disease’s burden on both families and society. Numerous algorithms for automated epilepsy detection from EEGs have been proposed. Yet, the occurrence of epileptic seizures during an EEG exam cannot always be guaranteed in clinical practice. Models that exclusively use seizure EEGs for detection risk artificially enhanced performance metrics. Therefore, there is a pressing need for a universally applicable model that can perform automatic epilepsy detection in a variety of complex real-world scenarios.


To address this problem, we have devised a novel technique employing a temporal convolutional neural network with self-attention (TCN-SA). Our model comprises two primary components: a TCN for extracting time-variant features from EEG signals, followed by a self-attention (SA) layer that assigns importance to these features. By focusing on key features, our model achieves heightened classification accuracy for epilepsy detection.


The efficacy of our model was validated on a pediatric epilepsy dataset we collected and on the Bonn dataset, attaining an accuracy of 95.50% on our dataset and accuracies of 97.37% (A vs. E) and 93.50% (B vs. E) on the Bonn dataset. When compared with other deep learning architectures (temporal convolutional neural network, self-attention network, and standardized convolutional neural network) using the same datasets, our TCN-SA model demonstrated superior performance in the automated detection of epilepsy.


The proven effectiveness of the TCN-SA approach substantiates its potential as a valuable tool for the automated detection of epilepsy, offering significant benefits in diverse and complex real-world clinical settings.


Epilepsy (EP), a prevalent chronic condition of the nervous system [1], is characterized by irregular neuronal activity and transient cerebral dysfunction resulting from hyper-synchronous discharges. Around 65 million [2] people worldwide suffer from EP, the majority of whom live in low- and middle-income countries. The prevalence of active EP is 6.38‰, and the annual incidence is 614.4 per \(10^{6}\) population [3]. Although the majority of EP patients have a good prognosis and can live ordinary lives, about 35% of them develop refractory EP due to ineffective drug treatments [4]. Childhood EP incidence is notably high [4], with youths representing over 60% of the patient demographics [5, 6]. Moreover, severe EP may hinder a child’s growth and cognitive development, intensifying the familial and social impacts of the disease [7]. Thus, early detection of EP in children is imperative in expediting treatment and minimizing the development of refractory EP.

EEG waveforms indicative of EP are traditionally diagnosed visually by physicians, a method subject to intra-observer variability, thereby diminishing accuracy. As EEGs are multi-channel one-dimensional sequences, it is inefficient to rely on physicians to mark abnormal EEG segments. Given that classification and prediction tasks based on EEG signals are popular multivariable time-series tasks, the automatic identification of EP from EEG signals has long been a research topic of interest to clinical physicians. The advent of machine learning in computing has enhanced the automated analysis of EP [8, 9], demonstrating promising classification capabilities across time [10,11,12], frequency [13, 14], and time–frequency domains [15], as well as measures of complexity and synchrony [16,17,18,19,20,21,22].

Methods based on machine learning have performed well but have the following limitations. First, feature extraction is highly operator-dependent, introducing subjectivity [23]. Second, EEGs inherently feature low signal-to-noise ratios and are susceptible to artifacts from both environmental noise and patient movement, complicating the analysis [23]. Third, the normal EEGs of neurologically asymptomatic individuals can exhibit minor variations [24]. Therefore, automatic identification models based on EEG signals should have high stability that can accommodate person-to-person and temporal differences.

The rapid increase in computing power has made deep learning the fastest-developing branch of machine learning. Its good generalisability, high accuracy, and high stability, coupled with its network architecture, make the technique suitable for the discovery of high-dimensional features and potential associations [23]. Three types of deep learning are currently used for the automatic identification of EEG signals. The first is a convolutional neural network (CNN) [25,26,27,28]. Acharya et al. [26] proposed a 13-layer deep CNN for identifying targets from normal, interictal, and seizure EEGs. A system was proposed by Thomas et al. [29] to classify EP based on the interictal EEG, which consists of a Convolutional Neural Network (CNN)-based IED detector, a Template Matching-based IED detector, and a spectral feature-based classifier, and which yielded a mean Leave-One-Institution-Out cross-validation area under curve (AUC) of 0.826 on datasets from six centers. The second is a recurrent neural network (RNN) [30,31,32,33]. Li et al. [30] developed a fully convolutional long-term memory, which showed 97.62% sensitivity with the Freiburg hospital database and 94.07% sensitivity with the Children’s Hospital Boston-Massachusetts Institute of Technology scalp EEG database. The third combines two or more neural networks to obtain the main architecture and constructs a deep learning network; for instance, any neural network [34]. Besides, some recent innovations in deep learning have been utilized for the automatic detection task of EP, such as the attention mechanism [35]. Currently, little research has been done using attention mechanisms for EP detection using EEG signals [36, 37].

Although CNNs and RNNs have shown good performance in the automatic detection of EP, they have inherent limitations. In sequential tasks, CNNs, which simulate human visual perception through their receptive fields, must restrict those receptive fields, because current and historical information cannot be predicted from future information. Therefore, when employing EEG sequence data, it is imperative to constrain the receptive field’s orientation appropriately. In addition, the choice of hyper-parameters can drastically affect the performance of CNNs. RNNs are the preferred network architectures for multivariate time-series-related tasks. However, RNNs may suffer from exploding or vanishing gradients during gradient descent when applied to long sequences. For the automatic detection of EP, RNNs still struggle to capture sufficient temporal-context information about abnormal activity from sequences of finite length for highly accurate predictions of EP. Presently, this problem can only be lessened, not solved, by improved RNN variants such as long short-term memory (LSTM) and gated recurrent unit (GRU) networks. Besides, RNNs cannot support parallel computing and require a long computing time, because they compute the output at each moment from that of the previous moment. They also face difficulties when analyzing sequences with long-term associations [38,39,40], further limiting their development for sequence tasks.

Combining the advantages of CNNs with those of RNNs, Lea et al. [41, 42] proposed a new convolutional method and constructed temporal convolutional neural networks (TCNs). These TCNs capture the long-term associations of sequences with variable-size receptive fields by flexibly changing the dilation values [43, 44]. Their residual connections not only ensure that the input and output lengths are the same but also address the problem of exploding or vanishing gradients faced by RNNs. Like CNNs, TCNs support parallel computing; they can handle different types of time-series tasks and perform better than traditional sequential modeling networks such as RNNs [45,46,47,48,49]. For the automatic detection of EP, Zhang et al. [50] utilized a TCN to classify the Bonn dataset [51] and achieved excellent performance.

However, the TCN has some shortcomings for the automatic detection of EP, especially when the EEG recordings contain interictal and seizure periods at the same time. Interictal and seizure EEGs have different patterns. In real-world EEG examinations, the proportions of the interictal and ictal periods in EP are highly imbalanced. Sometimes, seizures may not manifest during an entire EEG examination. When it extracts feature information from EEG sequences with causal and dilated convolution at equal intervals, the TCN learns the interictal and seizure EEGs with the same weight, which leads to inflated performance metrics. Therefore, it is not suitable for the EP detection task in the real world. Consequently, a more adaptable model suited to the complexity of real-world EP detection is essential, one that can handle various conditions, including recordings capturing solely interictal activity.

This problem can be elegantly addressed by combining TCNs with self-attention (SA) layers. The TCN-SA model was first proposed by Dai et al. [53] for the detection of daily living activities in long-term untrimmed videos. Our study is the first to use a TCN-SA model for the automatic detection of EP in the real world. For the automatic detection task of EP, by learning the interictal abnormal activities and epileptic seizures, our deep learning model extracted their related information as potential features. In the TCN-SA model, the TCN block not only ensures computation at the current moment to circumvent the influence of future information but also captures sufficient abnormal activity information with flexible size in the receptive fields for highly accurate predictions of EP. Additionally, the SA block adapts the learning weights during network training to recognize seizures, reducing computational demands and enhancing accuracy. We evaluated the TCN-SA model’s performance against other deep learning models (TCN, SA and standard CNN models), using cross-validation to confirm its stability and reliability.

The rest of this article is organized as follows. Sect. “Results” lists the experimental results, where we also describe the use of cross-validation to evaluate the stability and reliability of the TCN-SA model. Sect. “Discussion” comprises the discussion, and Sect. “Conclusions” concludes the study. Sect. “Methods” describes the datasets and experimental settings and introduces the TCN-SA model and related theory.


Our EEG dataset

Our EEG dataset was randomly divided into training and testing sets at a ratio of 7:3. We then used the TCN-SA model to identify patients with EP based on our EEG data. The training accuracy, test accuracy, training loss, and test loss were obtained, as shown in Fig. 1a, b. Additionally, to compare with our model, we implemented three deep learning models on the same dataset: a standard CNN, an SA neural network, and a TCN. The specific training and test results are shown in Fig. 1c–h.

Fig. 1 Accuracy and loss of the training, validation, and test results of the four models. a, b TCN-SA model; c, d TCN model; e, f CNN model; and g, h SA model

Figure 1 shows that the TCN-SA model had the highest training and testing accuracy and the lowest training and testing loss among the four models. The training and test losses of the TCN-SA model decreased fastest; the model also had lower final training and test losses than the TCN model. This result indicated that adding the SA layer improved the training and test accuracy of the model.

At the performance testing stage, the four models were subject to fivefold cross-validation. The participants in each fold are listed in Table 1. The average accuracy, sensitivity, specificity, precision, and F1-score (with standard errors) of the four models are shown in Table 2. The average receiver-operating characteristic (ROC) curve and average area under the curve (AUC; with standard errors) are shown in Fig. 2. The accuracy, sensitivity, specificity, precision, and F1-score of the TCN-SA model were 95.50%, 91.22%, 98.72%, 98.20%, and 0.94699, respectively; these were the highest of all the models. Compared with the CNN and TCN models, our method showed average accuracy improvements of 5.86 and 5.26%, respectively, sensitivity improvements of 8.25 and 7.23%, respectively, specificity improvements of 3.96 and 3.69%, respectively, precision improvements of 5.52 and 5.00%, respectively, and F1-score improvements of 0.0719 and 0.0639, respectively. Compared with the CNN model, the TCN model showed improvements in the average accuracy, sensitivity, specificity, precision, and F1-score, which were 0.60%, 1.03%, 0.27%, 0.52%, and 0.0080, respectively.
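The five reported metrics all follow from the four cells of a binary confusion matrix. The snippet below is a minimal sketch of those standard definitions, assuming EP is the positive class; the function name and the example counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the five metrics reported in the text from confusion-matrix counts.

    tp/fn: EP segments classified correctly/incorrectly (EP assumed positive).
    tn/fp: healthy segments classified correctly/incorrectly.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # true-positive rate (recall)
    specificity = tn / (tn + fp)   # true-negative rate
    precision = tp / (tp + fp)     # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=90, fp=2, tn=98, fn=10)
```

Note that accuracy, sensitivity, specificity, and precision are usually quoted as percentages, whereas the F1-score is quoted on a 0 to 1 scale, as in the results above.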

Table 1 Participants in each fold
Table 2 Performance of TCN-SA, TCN, SA, and CNN models
Fig. 2 Average ROC curve, confidence interval, and average AUC of the four models in the fivefold cross-validation

Figure 2 illustrates that the ROC curve of the TCN-SA model is closest to the optimal position in the upper left corner and has the narrowest confidence interval among the models studied. This signifies that the model conferred the most favorable critical value. Utilizing this point for classification yielded high sensitivity and specificity while ensuring the combined false-positive and false-negative rates remained low. The AUC of the TCN-SA model was 0.95 ± 0.01, outperforming the other models by showing the highest mean and least variability. The AUCs of the TCN, SA, and CNN models were 0.89 ± 0.03, 0.92 ± 0.02, and 0.89 ± 0.03, respectively.
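As an illustration of how a per-fold AUC and its mean ± standard error across folds can be obtained, the sketch below uses the rank interpretation of the AUC (the probability that a randomly chosen positive segment scores higher than a randomly chosen negative one). It is a generic sketch, not the authors' evaluation code; the function names, scores, and fold values are hypothetical.

```python
import statistics


def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count as 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def mean_and_standard_error(values):
    """Mean and standard error of the mean (sample SD divided by sqrt(n))."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / len(values) ** 0.5
    return mean, sem


# Hypothetical scores for four segments (1 = EP, 0 = healthy).
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# Hypothetical per-fold AUCs from a fivefold cross-validation.
fold_mean, fold_sem = mean_and_standard_error([0.94, 0.95, 0.96, 0.95, 0.95])
```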

In a comprehensive assessment using the F1-score to rank the fivefold cross-validation outcomes, the TCN-SA model maintains superiority in both training and testing accuracy and loss, eclipsing the best performances of three models (TCN, SA, and CNN) (See Appendix: Fig. 11).

To further illustrate the performance of our model, we constructed a confusion matrix (Fig. 3). The color depth in the confusion matrix reflects accuracy and the values are marked in white within each color block. The matrix reveals specificity and sensitivity in the top left and lower right squares, while the upper right and lower left squares represent false-positive and false-negative rates, respectively. The sensitivity and specificity of the TCN-SA model with the best performance were 94.12% and 99.49%, respectively, and those of the worst performances were 89.41% and 98.12%, respectively.

Fig. 3 The best and worst confusion matrices for the fivefold cross-validations of the model

We further compared the overall accuracy for each participant across the four models using segment-based evaluation criteria. After obtaining the predictions for every segment of one participant, we defined the overall accuracy as the percentage of that participant’s segments that were classified correctly. In Fig. 4, the vertical axis labels the participants, with blue and yellow bars representing the percentages of correctly and incorrectly classified segments, respectively, and red and green lines marking thresholds at 0.2 and 0.1. The same notation holds for the other models (see Appendix: Fig. 15).

Fig. 4 Accuracy of the TCN-SA model for participants following segment-based evaluation criteria

In Fig. 4, the yellow bar is shorter than 0.2 (red line) for the majority of participants, indicating an overall accuracy above 80% for almost all participants. Of these, five participants had a yellow bar longer than 0.1 (green line); for the remaining 29 participants, the overall accuracy surpassed 90%, and 12 participants reached 100%. With the other models, some participants display overall accuracies below 80% (see Appendix: Fig. 15), with a few approaching 50%.
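The segment-based, per-participant evaluation described above can be sketched as follows: segment predictions are grouped by participant, and the fraction of misclassified segments is compared with the 0.2 and 0.1 thresholds. This is a minimal illustration; the function name, participant identifiers, and labels are hypothetical.

```python
from collections import defaultdict


def per_participant_accuracy(participant_ids, y_true, y_pred):
    """Fraction of correctly classified segments for each participant."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for pid, truth, pred in zip(participant_ids, y_true, y_pred):
        total[pid] += 1
        correct[pid] += int(truth == pred)
    return {pid: correct[pid] / total[pid] for pid in total}


# Hypothetical segments: four from participant P01 (EP), two from P02 (healthy).
ids = ["P01", "P01", "P01", "P01", "P02", "P02"]
truth = [1, 1, 1, 1, 0, 0]
preds = [1, 1, 0, 1, 0, 0]

accuracy = per_participant_accuracy(ids, truth, preds)
error_fraction = {pid: 1.0 - acc for pid, acc in accuracy.items()}

# Participants whose error fraction exceeds the 0.2 threshold (accuracy below 80%).
below_80 = [pid for pid, err in error_fraction.items() if err > 0.2]
```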

The Bonn EEG dataset

We conducted two experiments for the classification of healthy volunteers and patients with EP by combining different subsets, A-E and B-E, and adopting threefold cross-validation of the four models. The average accuracy, sensitivity, specificity, precision, and F1-score (with standard errors) of the four models are shown in Table 3. The average ROC curve and AUC are shown in Figs. 5 and 6. The accuracy, sensitivity, specificity, precision, and F1-score of the TCN-SA model for A-E were 97.37%, 94.88%, 99.91%, 99.91%, and 0.9730, respectively, whereas those for B-E were 93.50%, 88.07%, 99.00%, 98.86%, and 0.9311, respectively. For A-E, compared with the CNN and TCN models, our method showed average accuracy improvements of 3.57% and 2.48%, respectively, sensitivity improvements of 6.87% and 5.00%, respectively, and F1-score improvements of 0.0387 and 0.0267, respectively. For B-E, our method showed average accuracy improvements of 4.37% and 1.85%, respectively, sensitivity improvements of 6.81% and 2.69%, respectively, and F1-score improvements of 0.0491 and 0.0203, respectively, over the CNN and TCN models. Although the performances of the SA and TCN-SA models were roughly the same in the two experiments with the Bonn dataset [51], with the SA model even performing slightly better than the TCN-SA model, the performance of the SA model was the worst of the four models in experiments with our EEG data.

Table 3 Performances of the TCN-SA, TCN, SA, and CNN models with the Bonn dataset
Fig. 5 Average ROC curve, confidence interval, and average AUC of the four models in the threefold cross-validation for A-E

Fig. 6 Average ROC curve, confidence interval, and average AUC of the four models in the threefold cross-validation for B-E

Figures 5 and 6 show that the AUC for A-E with the TCN-SA, TCN, SA, and CNN models is 0.97 ± 0.02, 0.94 ± 0.00, 0.97 ± 0.02, and 0.94 ± 0.01, respectively, whereas that for B-E is 0.93 ± 0.02, 0.92 ± 0.01, 0.94 ± 0.01, and 0.89 ± 0.01, respectively. For A-E, the ROC curve of the TCN-SA model is closer to the upper left corner, with the TCN-SA and SA models having the largest AUC. For B-E, the ROC curve of the SA model is closer to the upper left corner, with the narrowest confidence interval of the curve.

In addition, the F1-score was used to sort the cross-validation models and compare the best performances of the four models with the worst performance of our model in the threefold cross-validation (See Appendix: Figs. 11, 12). The result shows that our model had the highest accuracy among the best performances of the four models. However, the worst performance of our model was at an intermediate level.

Finally, Figs. 7 and 8 show confusion matrices. In both experiments, the specificity of the TCN-SA model exceeded 98%. The sensitivity of the TCN-SA model at peak performance was 97.31% and 90.93%, respectively, for the A-E and B-E subsets. Its sensitivity at worst performance was 90.18% and 82.27%, respectively, for the A-E and B-E subsets.

Fig. 7 The best and worst confusion matrices for the threefold cross-validation of the TCN-SA model with the A-E subset

Fig. 8 The best and worst confusion matrices for the threefold cross-validation of the TCN-SA model with the B-E subset


In this paper, we propose a deep learning model (TCN-SA) for the automatic detection of EP that can adapt to the complex conditions of real-world scenarios. By combining TCN with SA, our method provides a general model that can simultaneously handle interictal and seizure EEGs; accordingly, our model is highly suitable for clinical applications. When the EEG recordings contained only interictal EEG, the TCN module extracted effective features with the same weighting from abnormal and normal activity in the EEG sequences, and the SA module focused on the characteristic information of abnormal activity and increased its learned weight to identify patients with EP during training. When the EEG recordings contained both interictal and seizure EEG, the TCN module again extracted effective features with the same weighting from abnormal and normal activity, and the SA module focused on the interictal abnormal activity and the seizures, adjusting their learned weights according to their difference in intensity. Patients with EP can be identified efficiently if the model preferentially learns seizure information when the EEG recordings contain interictal and seizure EEG simultaneously. This method was evaluated using two experiments performed with the pediatric EP dataset from the Shenzhen People’s Hospital and the Bonn dataset [51]. The results showed that our method performed well at the participant level in the EP detection task. The results on the two datasets showed that our model can adapt to complex real-world scenarios and can be used as a clinically useful model for the automatic detection of EP.

For our EEG dataset, the TCN-SA model showed the highest accuracy, sensitivity, specificity, precision, and F1-score of all four tested models. The performance of the TCN model was also better than that of the CNN model. This indicated that adding the SA layer enhanced the ability of the model to identify patients by focusing on key information, improving its overall performance. The ROC curve of the TCN-SA model had the best performance and narrowest confidence interval, indicating that our model had the best performance and stability. Regarding the overall accuracy for each participant in terms of segment-based evaluation criteria, the overall accuracy with our model was > 80% for almost all of the participants; for the 29 remaining participants, the overall accuracy exceeded 90%, and 14 participants had accuracies that reached 100%. For some participants, the overall accuracy was about 50% when the other three models were used. Despite the negative influence of individual differences, our model had the best performance in detection, reflecting its high stability. Compared with the CNN model, the TCN model had better performance in terms of the overall accuracy of participants in the healthy control group; it was also superior in the EP detection task.

For the Bonn dataset [51], the performances of the SA and TCN-SA models were roughly the same in the two experiments (A-E and B-E). However, there are some clear differences in EP status between the Bonn [51] and our EEG datasets. The E subset consisted mainly of epileptic seizures, whereas our EEG dataset contained the interictal and ictal states, with the interictal state being predominant. Upon comparing the data status of our EEG and the Bonn datasets, we found that the SA model performed better in identifying epileptic seizures. However, when comparing the best performances of all four models, our model had the highest accuracy; its worst performance was at an intermediate level. This also confirmed that, compared with identifying EP from the interictal state, a general deep learning model can more easily achieve good performance when identifying epileptic seizures.

Although the results of the TCN-SA model with the Bonn and our EEG datasets were roughly the same, the degree of difficulty of experiments with our EEG dataset was higher than that with the Bonn dataset [51]. First, the Bonn dataset [51] consisted of single-channel signals, whereas our EEG dataset comprised multi-channel signals. Compared with single-channel signals, multi-channel signals are more complex and redundant, while also containing more useful information regarding epileptic seizures. Second, because the experiment with our EEG dataset was evaluated on a per-subject basis, EEG fragments from the same subject appeared only in either the training or the testing set, which avoided inflating the apparent generalization ability. Third, the E subset in the Bonn dataset [51] comprised epileptic seizures, whereas our EEG dataset contained the interictal and ictal states, with the interictal state being predominant; specifically, three subjects exhibited no seizures during the EEG recordings. Our model showed high classification accuracy in the experiment with the Bonn dataset [51], verifying that it could handle the task of automatically detecting EP from a general epileptic EEG dataset.
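The subject-based evaluation mentioned above, in which fragments from one subject never appear in both the training and testing sets, can be illustrated with a simple subject-wise fold assignment. This is a sketch under stated assumptions: the function name, the random assignment of subjects to folds, and the subject identifiers are hypothetical, and the actual fold composition used in the study is the one given in Table 1.

```python
import random


def subject_wise_folds(subject_ids, n_folds=5, seed=0):
    """Split fragment indices into folds so that each subject lies in exactly one test fold.

    subject_ids: one subject identifier per EEG fragment.
    Returns a list of (train_indices, test_indices) pairs.
    """
    subjects = sorted(set(subject_ids))
    rng = random.Random(seed)
    rng.shuffle(subjects)
    # Assign whole subjects (not individual fragments) to folds in round-robin order.
    fold_of = {subject: i % n_folds for i, subject in enumerate(subjects)}
    folds = []
    for k in range(n_folds):
        test_idx = [i for i, s in enumerate(subject_ids) if fold_of[s] == k]
        train_idx = [i for i, s in enumerate(subject_ids) if fold_of[s] != k]
        folds.append((train_idx, test_idx))
    return folds


# Hypothetical fragment-to-subject mapping (ten fragments from seven subjects).
subject_ids = ["s1", "s1", "s2", "s3", "s3", "s4", "s5", "s5", "s6", "s7"]
folds = subject_wise_folds(subject_ids, n_folds=5, seed=0)
```

Splitting at the fragment level instead would let fragments of the same child appear on both sides of the split, which is the source of inflated performance that the subject-based design avoids.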

To further evaluate the effectiveness of our model, we compared it with other works for the automatic detection of EP from EEG signals. As shown in Table 4, the results of our method and those of other methods were evaluated using the Bonn dataset [51]. Our method appeared to perform equivalently to others. For the A-E subset, our method was second best but differed from the best method by only 0.63%. For the B-E subset, our method was the best.

Table 4 Comparing the performance of the TCN-SA model with that of other models

Although our model showed high classification accuracy, it has some limitations. First, we only verified our model using a dataset from children with EP. Future research will aim to acquire adult EP patient data to broaden the model’s applicability. Moreover, although our model achieved good classification performance on the EEG dataset we collected, it cannot yet be utilized to locate seizures for online detection. In our EEG dataset, abnormal discharge segments and normal segments were extracted manually from the raw EEGs under the guidance of professional neurologists before data preprocessing. In future work, the TCN-SA model could be combined with automatic data preprocessing to locate seizures for online detection, so that it can support pre-consultation screening for neurologists in the Outpatient Department [69,70,71]. Finally, our model also lacks interpretability. Although this is an effect of the general black-box nature of deep learning approaches, it is necessary to interpret models in the medical field [72]. We aim to improve the interpretability of our model by taking advantage of machine learning algorithms that have performed well in the automatic detection of epilepsy [73,74,75].


In this study, the TCN-SA model was used for the first time for the automatic detection of EP from EEG data. The TCN extracts EEG features and the SA layer enhances the identification of key features, thereby lowering the computational cost and time. The TCN-SA model achieved 95.40% accuracy in the classification of EP among children; compared with the TCN, SA, and CNN models, its accuracy was improved by 5.33%, 6.79%, and 6.24%, respectively. In addition, our method achieved high classification accuracies with the Bonn dataset [51] (A-E and B-E subsets). The validity of the TCN-SA model shows that it is worthy of implementation for the automatic detection of EP from EEG data.


Data description

A new dataset that we generated ourselves was used to verify our model, and the Bonn dataset [51] was used as the external validation.

Our EEG data

We obtained EEG data from the Department of Pediatrics, Shenzhen People’s Hospital, China, between January 2019 and June 2021. The raw data were anonymized before analysis. This study was approved by the Ethics Committee of the School of Public Health, Sun Yat-sen University (No.2021–081), and informed consent was obtained from the research participants. In accordance with the international 10–20 system, the EEG instrument had 19 electrodes (FP1, FP2, F3, F4, C3, C4, P3, P4, O1, O2, F7, F8, T3, T4, T5, T6, Fz, Cz, and Pz) and two reference electrodes (A1 and A2). Resting-state EEGs were recorded at 500 Hz with a Nicolet recording system (Thermo Nicolet Corporation, USA). Based on the guidelines of the International League Against Epilepsy [54], the inclusion and exclusion criteria were set as follows:

1. Inclusion criteria:
   (1) At least two unprovoked (or reflex) seizures occurring > 24 h apart
   (2) One unprovoked (or reflex) seizure and probability of further seizures similar to the general recurrence risk (at least 60%) after two unprovoked seizures
   (3) Diagnosis of an EP syndrome.

2. Exclusion criteria:
   (1) Other neurological diseases in addition to EP
   (2) Significant progressive disorders or unstable medical conditions requiring acute intervention
   (3) Cognitive impairments precluding psychiatric and clinical evaluations
   (4) Any history of anti-seizure medication use.

The EEG data from healthy children were normal, whereas those from children with EP covered their interictal and ictal states. In our dataset, the abnormal discharge segments were of two types: seizures in their ictal state, and spike and wave complexes in their interictal and ictal states. However, seizures were not recorded in the EEG data of every child with EP; three children did not have seizures when recording EEGs.

After selection, the dataset consisted of 35 children divided into two groups: one group of healthy children (n = 21, average age: 6.9 ± 3.6 years, male = 12, female = 9) and one group of children with EP (n = 14, average age: 7.6 ± 3.7 years, male = 8, female = 6). The two groups were homogeneous in terms of sex and age; the Chi-square test showed no significant difference in sex (\(\chi^{2} < 0.001\), \(P=1.000\)), and a two-tailed Mann–Whitney U test showed no significant difference in age between the groups (\(U=125.500\), \(P=0.466\)).

The Bonn dataset

The Bonn dataset [51] has five subsets (A, B, C, D and E). Each contains 100 single-channel segments; each segment lasts 23.6 s and was obtained at a sampling rate of 173.61 Hz. The EEG segments of subsets A and B were collected from five healthy volunteers, whereas those of subsets C, D and E were collected from five patients with EP. Table 5 shows details of the Bonn dataset [51]. All the segments were selected and cut out from the continuous multi-channel EEG recordings after visual inspection for artifacts such as muscle activity and eye movements.

Table 5 Characteristics of the Bonn dataset

Data preprocessing

Under the guidance of professional neurologists, we extracted abnormal discharge segments from the raw EEGs of the children with EP and then extracted normal segments, free of interference, from the raw EEGs of the healthy children. The raw data were then preprocessed before formal EEG analysis. To reduce the computational burden, we downsampled the data to 100 Hz. First, high-pass filtering was carried out at a frequency of 1.6 Hz, after which low-pass filtering was carried out at 70 Hz. Then, band-stop (notch) filtering was used to remove the power-line interference (50 Hz). Finally, the data were divided into non-overlapping fragments with lengths of 2 s using a sliding time window. The preprocessed fragment set of each patient was thus obtained. Each fragment size was 19 (channels) × 200 (sampling points).
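The final segmentation step can be sketched as follows: a recording at 100 Hz is cut into non-overlapping 2 s windows, giving fragments of 19 × 200 samples. This is an illustrative sketch only (the study performed preprocessing with EEGLab in MATLAB); the function name is hypothetical, and discarding an incomplete trailing window is an assumption not stated in the text.

```python
def segment_recording(recording, sampling_rate_hz=100, window_s=2):
    """Cut a multi-channel recording into non-overlapping fixed-length fragments.

    recording: list of channels, each a list of samples of equal length.
    Returns a list of fragments, each shaped channels x (sampling_rate_hz * window_s).
    Assumption: samples left over after the last full window are discarded.
    """
    window = int(sampling_rate_hz * window_s)
    n_samples = min(len(channel) for channel in recording)
    fragments = []
    for start in range(0, n_samples - window + 1, window):
        fragments.append([channel[start:start + window] for channel in recording])
    return fragments


# Hypothetical 5 s recording: 19 channels x 500 samples at 100 Hz.
recording = [[float(t) for t in range(500)] for _ in range(19)]
fragments = segment_recording(recording)
```

With these settings the 5 s example yields two full fragments, and the last 1 s of data is dropped.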

For the Bonn dataset [51], we also downsampled the data to 100 Hz and applied the same filtering steps. The EEG signals were divided into non-overlapping fragments of equal size, with each fragment size being 1 (channel) × 100 (sampling points).

The data were standardized to the range [0, 1] to prevent numerical overflows and improve prediction accuracy. These preprocessing steps were performed using the EEGLab toolbox [55] in MATLAB (MathWorks). The standardized, segmented data were then input into the neural network, with input sizes of 19 (channels) × 200 (sampling points) for our EEG dataset and 1 (channel) × 100 (sampling points) for the Bonn dataset.
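The exact scaling formula is not reproduced in this text. A common way to map data onto [0, 1] is min-max scaling, \(x' = (x - x_{\min}) / (x_{\max} - x_{\min})\); the sketch below assumes that form and applies it to a single sequence of samples. Both the choice of min-max scaling and the scope over which the minimum and maximum are taken (per channel, per fragment, or per recording) are assumptions, and the function name is hypothetical.

```python
def min_max_scale(values):
    """Scale a sequence of samples linearly onto [0, 1] (assumed min-max form)."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant signal has no range; map it to zeros to avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


# Hypothetical amplitudes in microvolts.
scaled = min_max_scale([-50.0, 0.0, 150.0])
```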


Model architecture

The architecture of the TCN-SA model primarily consists of the TCN and SA blocks, as shown in Fig. 9. In the model, the TCN block is utilized to learn sequences of EEGs in each sample after preprocessing, capture the long-term features of EEG signals, and then output feature sequences. The SA layer, placed after the TCN block, is used to obtain the inner links of feature sequences and compute associations between pairs of features to discriminate interictal and seizure EEGs. We used attention weights to increase the effectiveness of neural network training and obtained classification prediction outputs through a fully connected layer.

Fig. 9 Overview of the proposed TCN-SA model

The details of the proposed model are described in the following subsection.

Temporal convolutional neural network

The temporal convolutional neural network (TCN) is a type of CNN originally proposed by Lea et al. [41, 42]. It analyzes input data by combining causal and dilated convolutions and adopts a residual structure to generate the output.

Causal and dilated convolution

A key characteristic of a sequence model is that the prediction at each moment depends only on observations from historical moments and not on future observations. Causal convolution [41, 42] requires that the output at the current moment be obtained only via a convolution over the features of historical moments. This implies that causal convolution is a one-way convolution from the historical moments to the current moment.

Causal convolution has the important advantage of supporting parallel computation, but it requires a very large number of convolutions when applied to very long sequences. To overcome this, dilated convolution was combined with causal convolution so that the receptive field size can be changed by adjusting the dilation value, which reduces the number of convolutions required. When the network contains more than one causal convolution layer, the dilation value can increase in the form \({2}^{i}\) (where \(i\) is the index of the convolution layer), and the length of the historical sequence covered by a dilated convolution is determined by the following formula:

$$length=d \times \left(k-1\right)$$

where d represents the dilation factor, k represents the filter size and \(length\) is the length of the historical sequence calculated.
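As a worked illustration of the formula above, the following sketch computes the history length of a single layer and the receptive field of a stack of causal layers whose dilation doubles per layer. Indexing the layers from 0 and using one convolution per layer are assumptions made for the example; the function names are hypothetical.

```python
def history_length(dilation, kernel_size):
    """Length of the history covered by one layer: d * (k - 1)."""
    return dilation * (kernel_size - 1)

def receptive_field(num_layers, kernel_size):
    """Receptive field of stacked causal layers with dilation 2**i (i = 0, 1, ...).

    Assumes one convolution per layer; the current sample contributes the extra 1.
    """
    return 1 + sum(history_length(2 ** i, kernel_size) for i in range(num_layers))

print(history_length(dilation=4, kernel_size=3))     # 8
print(receptive_field(num_layers=3, kernel_size=2))  # 8
```

With a kernel size of 2, three layers already cover 8 time steps, whereas three undilated layers would cover only 4.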

Figure 10 shows an example of causal convolution combined with dilated convolution.

Fig. 10 Causal and dilated convolution: a represents the causal convolution with a convolution kernel size of 2, b represents the dilated convolution with a convolution kernel size of 3, and c represents the causal convolution with a dilation value of 3 and a convolution kernel size of 2
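A causal dilated convolution can be sketched for a single channel in NumPy as follows. This is an illustrative implementation, not the code used in the study: the function name and the example weights are hypothetical, and left zero-padding is assumed so that the output keeps the input length.

```python
import numpy as np

def causal_dilated_conv1d(x, weights, dilation=1):
    """y[t] = sum_j weights[j] * x[t - j * dilation], with zeros before t = 0."""
    x = np.asarray(x, dtype=float)
    k = len(weights)
    pad = dilation * (k - 1)  # left padding only, so no future samples are used
    padded = np.concatenate([np.zeros(pad), x])
    y = np.zeros_like(x)
    for j, w in enumerate(weights):
        start = pad - j * dilation
        y += w * padded[start : start + len(x)]
    return y

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
print(causal_dilated_conv1d(x, weights=[1.0, 1.0], dilation=2))  # [1. 2. 4. 6. 8.]
```

Each output equals the current sample plus the sample two steps earlier, so changing a later sample never alters an earlier output.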

Residual module

In a residual module, the input is processed by a causal dilated convolution combined with a non-linear mapping, and the output is produced after a fully connected layer. Because each residual module contains both a dropout layer and a weight normalization layer, the fully connected layer not only improves the stability of the neural network but also ensures that the lengths of the input and output remain consistent. The formula is as follows:

$$o=activation\left(x+F\left(x\right)\right)$$

where \(o\) and \(x\) are the output and input of the residual module, respectively, \(F\left(x\right)\) denotes the causal dilated convolution mapping, and \(activation\) represents the activation function.
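A residual connection of this kind is commonly written as o = activation(x + F(x)). The single-channel NumPy sketch below assumes that form, with F taken to be two causal dilated convolutions separated by a ReLU. Weight normalization and dropout are omitted, and all names and weights are hypothetical.

```python
import numpy as np

def causal_conv(x, w, d):
    """Single-channel causal convolution with dilation d (left zero padding)."""
    pad = d * (len(w) - 1)
    xp = np.concatenate([np.zeros(pad), x])
    return sum(wj * xp[pad - j * d : pad - j * d + len(x)] for j, wj in enumerate(w))

def relu(z):
    return np.maximum(z, 0.0)

def residual_block(x, w1, w2, dilation):
    """o = activation(x + F(x)), where F is two causal dilated convolutions.

    Weight normalization and dropout are omitted in this sketch.
    """
    x = np.asarray(x, dtype=float)
    f = causal_conv(relu(causal_conv(x, w1, dilation)), w2, dilation)
    return relu(x + f)  # output has the same length as the input

x = np.array([1.0, -2.0, 3.0, 0.5])
out = residual_block(x, w1=[0.5, 0.5], w2=[1.0, -1.0], dilation=1)
print(out.shape)  # (4,)
```

Because the skip path adds the input directly, the block can fall back to an identity-like mapping when the convolutional branch contributes little.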

SA mechanism

Attention mechanisms are inspired by human attention, the underlying process of which is similar to that of vision. An attention mechanism can improve the performance of a model in a stepwise manner [56] by focusing on key information. Attention mechanisms can be categorized according to four criteria: the softness of attention, forms of input features, input representations, and output representations [57]. The SA mechanism belongs to the category of input representations. In time-series models, the SA mechanism can weight the observations at each moment according to the correlations between them. For multiple convolutional layers, the SA mechanism significantly compresses the feature matrix of the convolution output while retaining important information. In addition, compared with traditional sequence models, SA performs well in identifying long-term associations and has been widely applied in various fields [58,59,60,61].

To compute self-attention, a data sequence of length \(n\) was first encoded into the key \(M=\left\{{m}_{1},{m}_{2},\dots ,{m}_{n-1},{m}_{n}\right\}\) and expressed as key-value pairs in the form \(\left(M,V\right)=\left[\left({m}_{1},{v}_{1}\right),\left({m}_{2},{v}_{2}\right),\dots ,\left({m}_{n-1},{v}_{n-1}\right),\left({m}_{n},{v}_{n}\right)\right]\). Note that each \({m}_{i}\) corresponds uniquely to one \({v}_{i}\) in \(V\). \(M\) and \(V\) are different representations of a data sequence that indicate the attention distribution and contextual information [55], respectively. For each query \(Q\) from \(M\), the similarity with all keys in \(M\) was calculated using a score function known as the scaled dot product [34]; the normalized attention score of each key was then obtained using the \(softmax\) function. Finally, the sequence \(V\) was weighted by the normalized attention scores, and the result was named the SA value. The formula is as follows:

$$SA\left(Q,M,V\right)=softmax\left(\frac{Q{M}^{T}}{\sqrt{d}}\right)V$$

where \(d\) represents the dimensions of an input sequence.
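The SA computation can be sketched in NumPy, assuming the usual scaled dot-product form softmax(QM^T/√d)V. The projection matrices `Wq`, `Wm`, and `Wv` and the random feature sequence are hypothetical stand-ins for learned weights and TCN outputs.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(Q, M, V):
    """Scaled dot-product attention: softmax(Q M^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ M.T / np.sqrt(d)       # similarity of every query with every key
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
features = rng.standard_normal((6, 4))  # hypothetical sequence: 6 time steps, 4 features
Wq, Wm, Wv = (rng.standard_normal((4, 4)) for _ in range(3))
out, weights = self_attention(features @ Wq, features @ Wm, features @ Wv)
print(out.shape, weights.shape)  # (6, 4) (6, 6)
```

Each row of `weights` shows how strongly one time step attends to every other time step in the sequence.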

Training and testing

At the model comparison stage, we randomly selected 70% of the subjects from whom we obtained EEG data to form the training set; the remaining subjects formed the testing set, and 20% of the training set was randomly selected to serve as a validation set. The training and testing sets comprised the fragment sets of the corresponding subjects. To avoid overestimating the generalization ability, we ensured that no fragments from the same subject appeared in both the training and testing sets. At the performance testing stage, we used fivefold cross-validation to evaluate the stability and reliability of the TCN-SA model for the task of automatically detecting EP. For this cross-validation, the training set was split into five groups, each containing 6–8 participants. The process was repeated five times, with one group in each run serving as the validation data and the remaining groups serving as the training set. In other words, we randomly selected 20% of the subjects from each training set to serve as a validation set. The validation sets were then used to adjust the hyper-parameters during model training to avoid overfitting.
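A subject-wise split of this kind can be sketched with the standard library alone. The function name, the seed, and the rounding rule are assumptions; the point of the example is that splitting is done on subject identifiers, so fragments from one subject never fall on both sides.

```python
import random

def split_subjects(subject_ids, train_frac=0.7, val_frac=0.2, seed=0):
    """Split subjects (not fragments) into train, validation, and test groups."""
    ids = sorted(set(subject_ids))
    random.Random(seed).shuffle(ids)
    n_train = round(train_frac * len(ids))
    train_all, test = ids[:n_train], ids[n_train:]
    n_val = round(val_frac * len(train_all))  # validation subjects come out of the training group
    val, train = train_all[:n_val], train_all[n_val:]
    return train, val, test

# Hypothetical cohort of 50 subject identifiers
train, val, test = split_subjects(range(50))
print(len(train), len(val), len(test))  # 28 7 15
```

Fragments are then assigned to a set according to the subject they came from.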

To enable comparison with our previous study, we also evaluated our model on the Bonn dataset [51] using threefold cross-validation. The EEG data were divided into three subsets, of which two served as the training set and one served as the testing set. This process was repeated three times, and the average value of each evaluation measure over the three runs was computed. Each validation set consisted of 20% of the corresponding training set.
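The k-fold procedure can be sketched as a small generator. This is a generic illustration using only the standard library; the name `k_fold_indices` and the shuffling seed are hypothetical, and the study's actual fold assignment is not specified beyond what is described above.

```python
import random

def k_fold_indices(n_items, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k groups of nearly equal size
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test

# Threefold example over 10 hypothetical items
for run, (train_idx, test_idx) in enumerate(k_fold_indices(10, k=3), start=1):
    print(run, len(train_idx), len(test_idx))
```

An evaluation measure would be computed on the held-out fold in each run and then averaged over the runs.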

Experiment settings

The experiments were performed on the high-performance computing cluster platform at the School of Public Health, Sun Yat-sen University. We used Python 3.8 and the PyTorch deep learning framework, running on an Intel Xeon E5-2682 v4 CPU and a GTX 1080 Ti GPU with a CUDA 11.0 acceleration environment. Table 6 shows the parameter settings of our model.

Table 6 Parameter settings of the TCN-SA model

The computational complexity of our model was measured by the number of floating-point operations (FLOPs) and the number of model parameters. Our model had 0.42 M parameters and required 4.04 GFLOPs.
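As a rough illustration of how such figures arise, the sketch below counts the parameters and multiply-accumulate operations (MACs) of one 1D convolution layer. The layer sizes are hypothetical and do not reproduce the reported totals. The tool and FLOP convention used in the study are not stated, and some tools count one MAC as two FLOPs.

```python
def conv1d_params(in_channels, out_channels, kernel_size, bias=True):
    """Number of learnable parameters in a 1D convolution layer."""
    return out_channels * in_channels * kernel_size + (out_channels if bias else 0)

def conv1d_macs(in_channels, out_channels, kernel_size, out_length):
    """Multiply-accumulate operations for one forward pass (bias additions ignored)."""
    return out_channels * in_channels * kernel_size * out_length

# Hypothetical layer: 19 input channels, 32 filters, kernel size 4, 200 output samples
print(conv1d_params(19, 32, 4))     # 2464
print(conv1d_macs(19, 32, 4, 200))  # 486400
```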

Tables 7, 8 and 9 present a comparative analysis of our model across various configurations of layers, kernel sizes, and activation functions. For the number of layers and the kernel size, we compared our model with the TCN model; the SA block shares the layers and kernel size of the TCN block because the two blocks form a single model. As shown in Table 7, the accuracy of the TCN model increased with the number of layers, whereas the accuracy of the TCN-SA model was stable (layers ≤ 3); the same pattern can be seen in Table 8 (kernel size ≤ 8). Therefore, two layers with a kernel size of 4 would be the optimal choice, offering high efficiency and low energy consumption. The different trends of the TCN and TCN-SA models indicate that the SA layer can effectively contribute to the output accuracy by focusing on the abnormal activities of interictal and seizure EEGs and changing the learning weights of the TCN during neural network training. For the activation function, we compared three models (TCN, SA, and TCN-SA). As shown in Table 9, the softmax activation function yielded high and stable output accuracy across the three models; the loss function corresponding to the softmax activation function is cross-entropy.

Table 7 Accuracy (%) of the TCN-SA and TCN models with different numbers of layers
Table 8 Accuracy (%) of the TCN-SA and TCN models with different kernel sizes
Table 9 Accuracy (%) of the TCN-SA, TCN, and SA models with different activation functions

Evaluation criteria

The average performance over the cross-validation folds was used to obtain stable results for our network model. The performance of the model was measured by five indicators: precision, sensitivity, specificity, F1-score, and accuracy.
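The five indicators follow from the counts of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN) under their standard definitions. The sketch below assumes those definitions and non-zero denominators; the counts in the example are invented for illustration and are not results from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "accuracy": accuracy,
    }

# Invented counts for illustration only
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
print({name: round(value, 4) for name, value in metrics.items()})
```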

Availability of data and materials

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to issues of participant confidentiality.



Abbreviations

AUC: Area under curve
CNN: Convolutional neural network
GRU: Gated recurrent unit
LSTM: Long short-term memory
RNN: Recurrent neural network
ROC: Receiver-operating characteristic
TCN: Temporal convolutional neural network


  1. Paul F, David H, Abir H, Dhiya AJ, Khaled AA. Automatic epileptic seizure detection using scalp EEG and advanced artificial intelligence techniques. J Biomed Biotechnol. 2015.

  2. Ngugi AK, Bottomley C, Kleinschmidt I, Sander JW, Newton CR. Estimation of the burden of active and life-time epilepsy: a meta-analytic approach. Epilepsia. 2010;51(5):883–90.

  3. Fiest KM, Sauro KM, Wiebe S, Patten SB, Kwon C-S, Dykeman J, Pringsheim T, Lorenzetti DL, Jetté N. Prevalence and incidence of epilepsy: a systematic review and meta-analysis of international studies. Neurology. 2017;88(3):296–303.

  4. Asadi-Pooya AA, Sperling MR. Strategies for surgical treatment of epilepsies in developing countries. Epilepsia. 2008;49(3):381–5.

  5. Mac TL, Tran DS, Quet F, Odermatt P, Preux PM, Tan CT. Epidemiology, aetiology, and clinical management of epilepsy in Asia: a systematic review. Lancet Neurol. 2007;6(6):533–43.

  6. Ba-Diop A, Marin B, Druet-Cabanac M, Ngoungou EB, Newton CR, Preux PM. Epidemiology, causes, and treatment of epilepsy in sub-Saharan Africa. Lancet Neurol. 2014;13(10):1029–44.

  7. Eugen T, Patrick K, Byungin L, Amitabh D. Epilepsy in Asia: disease burden, management barriers, and challenges. Epilepsia. 2018.

  8. Supriya S, Siuly S, Wang H, Zhang YC. Automated epilepsy detection techniques from electroencephalogram signals: a review study. Health Inform Sci Syst. 2020;8(1):33.

  9. Lourenço CDS, Tjepkema-Cloostermans MC, Putten MJAMV. Machine learning for detection of interictal epileptiform discharges. Clin Neurophysiol. 2021;132(7):1433–43.

  10. Abdellatef E, Emara HM, Shoaib MR, Ibrahim FE, Elwekeil M, El-Shafai W, Taha TE, El-Fishawy AS, El-Rabaie EM, Eldokany IM, Abd El-Samie FE. Automated diagnosis of EEG abnormalities with different classification techniques. Med Biol Eng Comput. 2023.

  11. Ibrahim FE, Emara HM, El-Shafai W, Elwekeil M, Rihan M, Eldokany IM, Taha TE, El-Fishawy AS, El-Rabaie EM, Abdellatef E, Abd El-Samie FE. Deep-learning-based seizure detection and prediction from electroencephalography signals. Int J Numer Methods Biomed Eng. 2022;38(6):e3573.

  12. Chakrabarti S, Swetapadma A, Ranjan A, Pattnaik PK. Time domain implementation of pediatric epileptic seizure detection system for enhancing the performance of detection and easy monitoring of pediatric patients. Biomed Signal Process Control. 2023.

  13. Mahmoodian N, Haddadnia J, Illanes A, Boese A, Friebe M. Seizure prediction with cross-higher-order spectral analysis of EEG signals. 2020; 14: 821–828.

  14. de Borman A, Vespa S, Tahry RE, Absil PA. Estimation of seizure onset zone from ictal scalp EEG using independent component analysis in extratemporal lobe epilepsy. J Neural Eng. 2022;19(2):026005.

  15. Madhavan S, Tripathy RK, Pachori RB. Time-frequency domain deep convolutional neural network for the classification of focal and non-focal EEG signals. IEEE Sens J. 2020.

  16. Sharma R, Sircar P, Pachori RB. Automated focal EEG signal detection based on third order cumulant function. Biomed Signal Process Control. 2020.

  17. Detti P, de Lara GZM, Bruni R, Pranzo M, Sarnari F, Vatti G. A patient-specific approach for short-term epileptic seizures prediction through the analysis of EEG synchronization. IEEE Trans Biomed Eng. 2019;66(6):1494–504.

  18. Gao X, Yan X, Gao P, Gao X, Zhang S. Automatic detection of epileptic seizure based on approximate entropy, recurrence quantification analysis and convolutional neural networks. Artif Intell Med. 2020;102:101711.

  19. Li Z, Wang X, Xing Y, Zhang X, Yu T, Li X. Measuring multivariate phase synchronization with symbolization and permutation. Neural Netw. 2023;167:838–46.

  20. Jiang L, Fan Q, Ren J, Dong F, Jiang T, Liu J. An improved BECT spike detection method with functional brain network features based on PLV. Front Neurosci. 2023;17:1150668.

  21. Zheng S, Zhang X, Song P, Hu Y, Gong X, Peng X. Complexity-based graph convolutional neural network for epilepsy diagnosis in normal, acute, and chronic stages. Front Comput Neurosci. 2023;17:1211096.

  22. Zarei R, He J, Siuly S, Huang GY, Zhang YC. Exploring Douglas-Peucker algorithm in the detection of epileptic seizure from multicategory EEG signals. Biomed Res Int. 2019;2019:1–19.

  23. Zhang X, Yao L, Wang XZ, Monaghan JJM, Zhang Y. A survey on deep learning-based non-invasive brain signals: recent advances and new frontiers. J Neural Eng. 2020.

  24. Roy Y, Banville H, Albuquerque I, Gramfort A, Faubert J. Deep learning-based electroencephalography analysis: a systematic review. J Neural Eng. 2019.

  25. Ansari AH, Cherian PJ, Caicedo A, Naulaers G, Vos MD, Huffel SV. Neonatal seizure detection using deep convolutional neural networks. Int J Neural Syst. 2019.

  26. Acharya UR, Oh SL, Hagiwara Y, Tan JH, Adeli H. Deep convolutional neural network for the automated detection and diagnosis of seizure using EEG signals. Comput Biol Med. 2017;100:270–8.

  27. Raghu S, Sriraam N, Temel Y, Rao SV, Kubben PL. EEG based multi-class seizure type classification using convolutional neural network and transfer learning. Neural Netw. 2020;124:202–12.

  28. Tveit J, Aurlien H, Plis S, Calhoun VD, Tatum WO, Schomer DL, Arntsen V, Cox F, Fahoum F, Gallentine WB, Gardella E, Hahn CD, Husain AM, Kessler S, Kural MA, Nascimento FA, Tankisi H, Ulvin LB, Wennberg R, Beniczky S. Automated interpretation of clinical electro-encephalograms using artificial intelligence. JAMA Neurol. 2023;80(8):805–12.

  29. Thomas J, Thangavel P, Peh WY, Dauwels J. Automated adult epilepsy diagnostic tool based on interictal scalp electroencephalogram characteristics: a six-center study. Int J Neural Syst. 2021;31(5):2050074.

  30. Li Y, Yu Z, Chen Y, Yang CF, Li Y, Li XA, Li B. Automatic seizure detection using fully convolutional nested LSTM. Int J Neural Syst. 2020.

  31. Craley J, Johnson E, Jouny C, Venkataraman A. Automated inter-patient seizure detection using multichannel convolutional and recurrent neural networks. Biomed Signal Process Control. 2021;64:102360.

  32. Abdelhameed AM, Bayoumi M. A deep learning approach for automatic seizure detection in children with epilepsy. Front Comput Neurosci. 2021;15:19.

  33. Nasseri M, Pal Attia T, Joseph B, Gregg NM, Nurse ES, Viana PF, Schulze-Bonhage A, Dümpelmann M, Worrell G, Freestone DR, Richardson MP, Brinkmann BH. Non-invasive wearable seizure detection using long-short-term memory networks with transfer learning. J Neural Eng. 2021.

  34. Kaur A, Puri V, Shashvat K, Maurya AK. Automated identification of inter-ictal discharges using residual deep learning neural network amidst of various artifacts. Chaos, Solitons Fractals. 2022;156:111886.

  35. Vaswani A, Shazeer N, Parmar N, Jakob U, Llion J, Gomez AN, Lukasz K, Illia P. Attention is all you need. Conf Neural Inform Process Syst. 2017.

  36. Chatzichristos C, Dan J, Narayanan AM, Seeuws N, Huffel SV. Epileptic seizure detection in EEG via fusion of multi-view attention-gated U-net deep neural networks. In: 2020 IEEE SPMB. 2020.

  37. Wang Z, Hou S, Xiao T, Zhang Y, Lv H, Li J, Zhao S, Zhao Y. Lightweight seizure detection based on multi-scale channel attention. Int J Neural Syst. 2023.

  38. Fan J, Zhang K, Huang YP, Zhu YF, Chen BP. Parallel spatio-temporal attention-based TCN for multivariate time series prediction. Neural Comput Appl. 2021.

  39. Alzubaidi L, Zhang J, Humaidi AJ, Ayad AD, Farhan L. Review of deep learning: concepts, CNN architectures, challenges, applications, future directions. J Big Data. 2021.

  40. Glorot X, Bengio Y. Understanding the difficulty of training deep feedforward neural networks. J Mach Learn Res. 2010;9:249–56.

  41. Lea C, Vidal R, Reiter A, Hager GD. Temporal convolutional networks: a unified approach to action segmentation. Springer International Publishing. 2016.

  42. Lea C, Flynn MD, Vidal R, Reiter A, Hager GD. Temporal convolutional networks for action segmentation and detection. IEEE Computer Society. 2017.

  43. Cheng W, Wang Y, Peng Z, Ren X, Shuai Y, Zang S, Liu H, Cheng H, Wu J. High-efficiency chaotic time series prediction based on time convolution neural network. Chaos Solitons Fractals. 2021.

  44. Teng F, Song Y, Guo X. Attention-TCN-BiGRU: an air target combat intention recognition model. Mathematics. 2021.

  45. Guirguis K, Schorn C, Guntoro A, Abdulatif S, Yang B. SELD-TCN: Sound Event Localization & Detection via Temporal Convolutional Networks. European Signal Processing Conference. IEEE. 2021.

  46. Chen YT, Kang YF, Chen YX, Wang ZZ. Probabilistic forecasting with temporal convolutional neural network. Neurocomputing. 2020;399:491–501.

  47. Yan J, Mu L, Wang LZ, Ranjan R, Albert YZ. Temporal convolutional networks for the advance prediction of ENSO. Sci Rep. 2020;10(1):8055.

  48. Li Y, Yu R, Shahabi C, Liu Y. Diffusion convolutional recurrent neural network: data-driven traffic forecasting. Int Conf Learn Represent. 2018.

  49. Bai SJ, Kolter JZ, Koltun V. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv e-prints. 2018.

  50. Zhang JW, Wu HB, Su W, Wang X, Wu J. A new approach for classification of epilepsy EEG signals based on temporal convolutional neural networks. In: 2018 11th ISCID. 2018.

  51. Andrzejak RG, Lehnertz K, Mormann F, Rieke C, David P, Elger CE. Indications of non-linear deterministic and finite-dimensional structures in time series of brain electrical activity: dependence on recording region and brain state. Phys Rev E. 2001;64(6):061907.

  52. Tao Z, Chen W, Li M. AR based quadratic feature extraction in the VMD domain for the automated seizure detection of EEG using random forest classifier. Biomed Signal Process Control. 2017;31:550–9.

  53. Dai R, Minciullo L, Garattoni L, Francesca G, Bremond F. Self-attention temporal convolutional network for long-term daily living activity detection. IEEE. 2019.

  54. Fisher RS, Acevedo C, Arzimanoglou A, Bogacz A, Cross JH, Elger CE, Engel J Jr, Forsgren L, French JA, Glynn M, Hesdorffer DC, Lee BI, Mathern GW, Moshé SL, Perucca E, Scheffer IE, Tomson T, Watanabe M, Wiebe S. ILAE official report: a practical clinical definition of epilepsy. Epilepsia. 2014;55(4):475–82.

  55. Delorme A, Makeig S. EEGLAB: an open-source toolbox for analysis of single-trial EEG dynamics including independent component analysis. J Neurosci Methods. 2004;134(1):9–21.

  56. Huang JY, Lu CH, Pin GL, Sun L, Ye XJ. TCN-ATT: a non-recurrent model for sequence-based malware detection. In: Lauw HW, Wong RC-W, Ntoulas A, Lim E-P, Ng S-K, Pan SJ, editors. Advances in knowledge discovery and data mining. PAKDD 2020. Lecture notes in computer science. Cham: Springer International Publishing; 2020.

  57. Niu ZY, Zhong GQ, Yu H. A review on the attention mechanism of deep learning. Neurocomputing. 2021;452:48–62.

  58. Huang SH, Lingjie X, Congwei J. Residual attention net for superior cross-domain time sequence modeling. arXiv e-prints. 2020.

  59. Yao QH, Wang RX, Fan XM, Liu JK, Li Y. Multi-class Arrhythmia detection from 12-lead varied-length ECG using attention-based time-incremental convolutional neural network. Inform Fusion. 2020;53:174–82.

  60. Su JS, Wu S, Xiong DY, Lu YJ, Han XP, Zhang B. Variational recurrent neural machine translation. In: Thirty-second AAAI conference on artificial intelligence. 2018; pp 5488–95.

  61. Zhang XW, Su JS, Qin Y, Liu Y, Ji RR, Wang HJ. Asynchronous bidirectional decoding for neural machine translation. In: Thirty-second AAAI conference on artificial intelligence. 2018; pp 5698–705.

  62. Ilias L, Askounis D, Psarras J. Multimodal detection of epilepsy with deep neural networks. Expert Syst Appl. 2023.

  63. Guo L, Rivero D, Pazos A. Epileptic seizure detection using multiwavelet transform based approximate entropy and artificial neural networks. J Neurosci Methods. 2010;193(1):156–63.

  64. Nigam VP, Graupe D. A neural-network-based detection of epilepsy. Neurol Res. 2004;26(1):55–60.

  65. Subasi A. EEG signal classification using wavelet feature extraction and a mixture of expert model. Expert Syst Appl. 2007;32(4):1084–93.

  66. Kaya Y, Ertugrul OF. A stable feature extraction method in classification epileptic EEG signals. Australas Phys Eng Sci Med. 2018;41(3):721–30.

  67. Ahmedt-Aristizabal D, Fookes C, Nguyen K, Sridharan S. Deep classification of epileptic signals. In: IEEE engineering in medicine and biology society. Annual international conference. 2018. pp 332–5.

  68. Ullah I, Muhammad H, Hatim A, et al. An automated system for epilepsy detection using EEG brain signals based on deep learning approach. Expert Syst Appl. 2018;107:61–71.

  69. Siddiqui MK, Huang X, Morales-Menendez R, et al. Machine learning based novel cost-sensitive seizure detection classifier for imbalanced EEG data sets. Int J Interactive Des Manuf. 2020;14:1491–509.

  70. Siddiqui MK, Islam MZ, Kabir MA. Analyzing performance of classification techniques in detecting epileptic seizure. In: Cong G, Peng WC, Zhang W, Li C, Sun A, editors. Advanced data mining and applications. ADMA 2017. Lecture notes in computer science. Cham: Springer; 2017.

  71. Siddiqui MK, Islam MZ. Data mining approach in seizure detection. In: 2016 IEEE region 10 conference (TENCON)-proceedings of the international conference. 2016. pp 3579–83.

  72. Saint-Esteven ALG, Bogowicz M, Konukoglu E, Riesterer O, Balermpas P, Guckenberger M, Tanadini-Lang S, Timmeren JEV. A 2.5D convolutional neural network for HPV prediction in advanced oropharyngeal cancer. Comput Biol Med. 2022;142:105215.

  73. Siddiqui MK, Islam MZ, Kabir MA. A novel quick seizure detection and localization through brain data mining on ECoG dataset. Neural Comput Appl. 2019;31:5595–608.

  74. Karpov OE, Grubov VV, Maksimenko VA, et al. Extreme value theory inspires explainable machine learning approach for seizure detection. Sci Rep. 2022;12(1):11474.

  75. Siddiqui MK, Morales-Menendez R, Huang X, et al. A review of epileptic seizure detection using machine learning classifiers. Brain Inf. 2020;7:5.



Professional English language editing support was provided by AsiaEdit.


This research was funded by the Natural Science Foundation of Guangdong Province, China, Grant No. 2022A1515011237.

Author information

Authors and Affiliations



Conceptualization, L.H., K.Z., S.C. and J.Z.; data curation, L.H., S.C., and Y.C.; formal analysis, L.H. and K.Z.; methodology, L.H.; resources, K.Z. and Y.C.; software, L.H.; validation, L.H.; visualization, L.H.; writing (original draft), L.H.; writing (review and editing), K.Z. and J.Z.

Corresponding author

Correspondence to Jinxin Zhang.

Ethics declarations

Ethics approval and consent to participate

The study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of the School of Public Health in Sun Yat-sen University (2021-No.081). Informed consent was obtained from all subjects involved in the study.

Consent for publication

Persons depicted in illustrations gave consent for publication.

Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.



A. Comparison of the five-fold cross-validations of the four models.

B. Accuracy of the three models for participants following segment-based evaluation

See Figs. 11, 12, 13, 14, 15, 16

Fig. 11 Accuracy and loss of the best and worst performances of the four models in the fivefold cross-validation. a Accuracy, b loss

Fig. 12 Accuracy and loss of the best and worst performances of the four models in the threefold cross-validation for A-E. a Accuracy, b loss

Fig. 13 Accuracy and loss of the best and worst performances of the four models in the threefold cross-validation for B-E. a Accuracy, b loss

Fig. 14 Accuracy of the CNN model for participants following segment-based evaluation criteria

Fig. 15 Accuracy of the TCN model for participants following segment-based evaluation criteria

Fig. 16 Accuracy of the SA model for participants following segment-based evaluation criteria

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Huang, L., Zhou, K., Chen, S. et al. Automatic detection of epilepsy from EEGs using a temporal convolutional network with a self-attention layer. BioMed Eng OnLine 23, 50 (2024).
