Cardiac arrhythmia classification using autoregressive modeling

Background Computer-assisted arrhythmia recognition is critical for the management of cardiac disorders. Various techniques have been utilized to classify arrhythmias. Generally, these techniques classify only two or three arrhythmias or have significantly large processing times. A simpler autoregressive (AR) modeling technique is proposed to classify normal sinus rhythm (NSR) and various cardiac arrhythmias including atrial premature contraction (APC), premature ventricular contraction (PVC), supraventricular tachycardia (SVT), ventricular tachycardia (VT) and ventricular fibrillation (VF). Methods AR modeling was performed on ECG data from normal sinus rhythm as well as various arrhythmias. The AR coefficients were computed using Burg's algorithm and classified using a generalized linear model (GLM) based algorithm in various stages. Results AR modeling results showed that an order of four was sufficient for modeling the ECG signals. The accuracy of detecting NSR, APC, PVC, SVT, VT and VF ranged from 93.2% to 100% using the GLM based classification algorithm. Conclusion The results show that AR modeling is useful for the classification of cardiac arrhythmias, with reasonably high accuracies. With further validation, the proposed technique may yield results acceptable for clinical implementation.


Background
Automatic ECG analysis is critical for diagnosis and treatment of critically ill patients. Modeling and simulation of ECG under various conditions are very important in understanding the functioning of the cardiovascular system as well as in the diagnosis of heart diseases. Arrhythmias represent a serious threat to the patient recovering from acute myocardial infarction, especially ventricular arrhythmias like ventricular tachycardia (VT) and ventricular fibrillation (VF). In particular, VT and VF are life-threatening conditions and produce significant haemodynamic deterioration [1]. There is a need for quick identification of these conditions. Other arrhythmias like atrial premature contraction (APC), premature ventricular contraction (PVC) and supraventricular tachycardia (SVT) are not as lethal as VF, but are important in diagnosing disorders of the heart. The reliable detection of these arrhythmias constitutes a challenge for a cardiovascular diagnostic system. Consequently, a significant amount of research has focused on the development of algorithms for accurate diagnosis of ventricular arrhythmias.
Cardiac arrhythmias can be detected from two intracardiac channels using correlation waveform analysis (CWA) [5]. CWA was used to detect morphologic changes in the intracardiac electrogram when compared to electrograms during sinus rhythm. Each electrogram had its respective template, and the templates were obtained by signal averaging the waveform from a passage of sinus rhythm. A software trigger was used to align the template with the cycle being tested. A window size of N points defined each cycle.
Methods such as direct ECG feature detection [6], Fourier transform [12], time-frequency analysis [7] and complexity analysis [13] have also been employed. The main objective of the direct ECG feature detection study was to investigate how many different ventricular conduction defect (VCD) categories can be formed by advanced cluster analysis methods, which reduce the number of classification parameters to a reasonably small set for a meaningful classification. The second objective was to investigate to what extent a select set of repolarization parameters would help in the identification of distinct VCD subgroups. The study shows that the key features of the ECG are QRS duration, T axis angle, T amplitude, QRS axis angle and spatial angle. Typically, morphological features related to the P, QRS and ST waves are used for ECG signal analysis [1].
A technique based on averaged threshold crossing intervals was proposed for the detection of VT and VF based on heart rate measurements [4]. A modified sequential detection algorithm was further proposed to improve the accuracy of detecting VT and VF [10]. A Fourier transform based algorithm has been proposed for discriminating supraventricular rhythms from ventricular rhythms [12]. Power spectra computed from the QRS complexes extracted from ECG signals were classified using a neural network. Sensitivity and specificity values greater than 98% have been reported for discriminating supraventricular rhythms from ventricular rhythms. However, SVT and VT were grouped together as ventricular rhythms. A new algorithm based on complexity measures was proposed for the detection of NSR, VT and VF [13]. The algorithm was tested for varying lengths of data, and very high accuracy values were achieved for data lengths of 7 sec for classifying NSR, VT and VF. The algorithm was suggested for real-time implementation in automatic external defibrillators.
Recently, a new approach for the discrimination among VF, VT and SVT has been developed using a total least squares-based Prony modeling algorithm [16]. Two features, energy fractional factor (EFF) and predominant frequency (PF), were derived from the total least squares-based Prony model. A two-stage classification method is used in which the EFF discriminates SVT from VT and VF in the first stage, and PF then separates VF from VT in the second stage. Classification accuracies of 95.24%, 96.00% and 97.78% were reported for SVT, VF and VT respectively for the Prony modeling algorithm. However, the total least squares-based Prony modeling technique did not consider NSR, APC and PVC for feature extraction and discrimination.
AR modeling has been used extensively to model heart rate variability (HRV) and for power spectrum estimation of ECG and HRV signals [17-21]. An amplitude modulated sinusoidal signal model, which is a special case of the time-dependent AR model, has been applied to modeling ECG signals [17]. Adaptive AR modeling with Kalman filtering has been used [20]. Parameters extracted from AR modeling have been used for arrhythmia classification in conjunction with other features [8]. For example, two AR coefficients (a prediction order of only 2), along with the mean-square value of the QRS complex segments, were utilized as features for classification of normal and abnormal PVC, and a fuzzy adaptive resonance theory mapping (ARTMAP) was used for classification. The best correct detection rate for PVC was 92%, obtained with a training-to-testing data size ratio of 2 to 4 [8]. It has been suggested that increasing the model order would not reduce the prediction error, implying that a linear predictor order of two is sufficient for fast cardiac arrhythmia detection [22].
The objective of the present study is to model the ECG signal and classify certain cardiac arrhythmias in the ICU. Most existing techniques involve significant amounts of computation and processing time for extraction of features and classification. The other disadvantage is the small number of arrhythmias classified using a given technique, with most techniques being used to classify two to three arrhythmias [2,4-13,15,16]. There is a need for extending a particular technique to a larger number of arrhythmias. In addition, the proposed technique should be amenable to real-time implementation so that it can be used in intensive care units.
In this study, the ECG signals were modeled using AR analysis for classifying cardiac arrhythmias. The advantage of AR modeling is its simplicity, which makes it suitable for real-time classification in the ICU or in ambulatory monitoring. AR models are popular due to the linear form of the system of simultaneous equations involving the unknown AR model parameters and the availability of efficient algorithms for computing the solution [23,24]. AR modeling has been used in various applications including classification of physiological signals like electroencephalograms [25]. Here, AR modeling is adapted for extracting good features from ECG signals, thus enabling the discrimination of certain ECG arrhythmias. The computed AR coefficients were checked for modeling accuracy for the various types of ECG signals. Various pattern classification techniques have been applied to the classification of arrhythmias [3,12,14,16,26-28]. In the current study, the AR coefficients computed from the ECG signals were classified using a generalized linear model (GLM) [29]. Normal sinus rhythm (NSR) and various arrhythmias including atrial premature contraction (APC), premature ventricular contraction (PVC), supraventricular tachycardia (SVT), ventricular tachycardia (VT) and ventricular fibrillation (VF) were classified.
http://www.biomedical-engineering-online.com/content/1/1/5

Preprocessing
ECG data for the analysis and classification were obtained from the MIT-BIH Arrhythmia database, the MIT-BIH Ventricular Arrhythmia database and the MIT-BIH Supraventricular Arrhythmia database. Various ECG segments were selected from the databases for modeling and classification. The data set included around 200 segments each of normal ECGs, APCs, PVCs, SVTs, VTs and VFs. The sampling frequencies were 360 Hz for the MIT-BIH Arrhythmia database, 250 Hz for the MIT-BIH Ventricular Arrhythmia database and 128 Hz for the MIT-BIH Supraventricular Arrhythmia database. The data from the MIT-BIH Arrhythmia and Supraventricular Arrhythmia databases were re-sampled so that all the data used in the analysis had a sampling frequency of 250 Hz.
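The text does not state which resampling method was used, so the following is only a minimal sketch of bringing a record to a common 250 Hz rate. It assumes plain linear interpolation, and the function name `resample_linear` is hypothetical.

```python
def resample_linear(signal, fs_in, fs_out):
    """Resample a 1-D sequence from fs_in to fs_out (Hz) by linear interpolation.

    Assumption: linear interpolation is used only for illustration; the paper
    does not say how its records were re-sampled.
    """
    if len(signal) < 2:
        return list(signal)
    # Number of output samples that fit in the time span of the input.
    n_out = int((len(signal) - 1) * fs_out / fs_in + 1e-9) + 1
    out = []
    for k in range(n_out):
        pos = k * fs_in / fs_out      # fractional index into the input
        i = int(pos)
        frac = pos - i
        if i + 1 < len(signal):
            out.append(signal[i] * (1 - frac) + signal[i + 1] * frac)
        else:
            out.append(signal[-1])    # last sample: nothing to interpolate towards
    return out
```

For example, `resample_linear(record, 360, 250)` would convert a 360 Hz record and `resample_linear(record, 128, 250)` a 128 Hz record. When downsampling real data, a polyphase resampler with an anti-aliasing filter (for instance SciPy's `scipy.signal.resample_poly`) is usually a safer choice than bare interpolation.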
Prior to modeling, the ECG signals were preprocessed to remove noise due to power line interference, respiration, muscle tremors, spikes etc., and to detect the R peaks in the ECG signals. The R peaks of the ECG were detected using Tompkins's algorithm [30].

Figure 3
A patient ECG and simulated ECG having NSR

The sample size affects the segment selected for AR modeling, and care must be taken to pick at least one cardiac cycle so that the signal can be accurately modeled and can be useful in diagnosis. Cardiac cycle length, or RR interval, differs between normal sinus rhythm and the different arrhythmias. A normal sinus rhythm refers to the usual case in healthy adults where the heart rate is 60-100 beats per minute. In APC, an RR interval is shorter than normal, and the subsequent interval is no longer than normal. In VF, the RR intervals are much shorter than in a normal sinus rhythm. In the current study, a sample size of 300 (1.2 seconds at 250 Hz) was used, which consists of 100 samples before the R peak and 200 samples after the R peak. This is adequate to capture most if not all of the information from a particular cardiac cycle.
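The fixed-length segmentation described above can be sketched as follows. This is an illustration rather than the authors' code: the function name `extract_segments` is hypothetical, the R-peak indices are assumed to come from a separate detector, and the R peak itself is assumed to be counted as the first of the 200 "after" samples so that each window has exactly 300 samples.

```python
def extract_segments(ecg, r_peaks, before=100, after=200):
    """Cut one fixed-length window per R peak from a 250 Hz ECG record.

    Each window holds `before` samples preceding the R peak and `after`
    samples starting at the peak (300 samples = 1.2 s with the defaults).
    """
    segments = []
    for r in r_peaks:
        start, stop = r - before, r + after
        if start < 0 or stop > len(ecg):
            continue  # skip peaks too close to the edges of the record
        segments.append(ecg[start:stop])
    return segments
```

Peaks that do not leave room for a full window are dropped here; padding them would be another reasonable choice.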

AR Modeling
AR analysis models the ECG signal as the output of a linear system driven by white noise of zero mean and unknown variance [23,24]. AR models have the form
$$v[k] = \sum_{i=1}^{P} a_i\, v[k-i] + n[k]$$
where $v[k]$ is the ECG time series, $n[k]$ is zero mean white noise, the $a_i$ are the AR coefficients, and $P$ is the AR order.
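A critical issue in AR modeling is the AR order used to model a signal. It is necessary to select an appropriate AR order so that the signal is modeled with sufficient accuracy to be useful for classification. Various model orders were used to estimate the accuracy of the model, measured by the correlation coefficient $\rho$ and the signal-to-noise ratio (SNR) between the original and simulated signals. The correlation coefficient is given by
$$\rho = \frac{\sum_{i=1}^{N} \left(v(i) - m\right)\left(\hat{v}(i) - \hat{m}\right)}{\sqrt{\sum_{i=1}^{N} \left(v(i) - m\right)^2 \, \sum_{i=1}^{N} \left(\hat{v}(i) - \hat{m}\right)^2}}$$
where $v(i)$ and $\hat{v}(i)$ are the original and the simulated signals at the $i$th instant, $m$ and $\hat{m}$ are the means of the original and simulated signals respectively, and $N$ is the length of the modeled signal. The signal-to-noise ratio, in decibels, is given by
$$\mathrm{SNR} = 10 \log_{10} \frac{\sum_{i=1}^{N} v(i)^2}{\sum_{i=1}^{N} \left(v(i) - \hat{v}(i)\right)^2}$$
Burg's algorithm was used to estimate the AR coefficients with a pre-selected model order $P$ [23,24].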
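As an illustration of the coefficient estimation step, here is a minimal pure-Python sketch of Burg's method. It is a textbook form of the recursion, not the authors' implementation. The function name `burg_ar` is hypothetical, and the returned coefficients assume the sign convention v[k] = sum_i a_i v[k-i] + n[k].

```python
def burg_ar(x, order):
    """Estimate AR coefficients a_1..a_P and the driving-noise variance with Burg's method.

    Returns (a, noise_var), with a in the convention
    x[k] = a[0]*x[k-1] + ... + a[P-1]*x[k-P] + n[k]  (assumed convention).
    """
    n = len(x)
    f = [float(v) for v in x[1:]]    # forward prediction errors
    b = [float(v) for v in x[:-1]]   # backward prediction errors, delayed by one sample
    err_filter = [1.0]               # prediction-error filter 1 + c_1 z^-1 + ...
    noise_var = sum(float(v) ** 2 for v in x) / n
    for _ in range(order):
        # Reflection coefficient minimising forward plus backward error power.
        num = -2.0 * sum(fi * bi for fi, bi in zip(f, b))
        den = sum(fi * fi for fi in f) + sum(bi * bi for bi in b)
        k = num / den
        # Levinson-style update of the prediction-error filter.
        padded = err_filter + [0.0]
        err_filter = [padded[i] + k * padded[-1 - i] for i in range(len(padded))]
        noise_var *= 1.0 - k * k
        # Update the error sequences and shrink them by one sample for the next order.
        f, b = ([fi + k * bi for fi, bi in zip(f, b)][1:],
                [bi + k * fi for fi, bi in zip(f, b)][:-1])
    # Flip signs to move from the error-filter form to the predictor form.
    return [-c for c in err_filter[1:]], noise_var
```

Calling `burg_ar(segment, 4)` on a 300-sample segment would give the four coefficients used as a feature vector. Whether the segment mean should be removed first is not stated in the text and is left to the caller.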

GLM-based classification
In the current study, AR coefficients were used to classify cardiac arrhythmias. A stage-by-stage GLM-based classification model was used to classify the various cardiac arrhythmias. A GLM is given by [29]
$$Y = A\beta + \epsilon$$
where $Y = [y_1, y_2, \ldots, y_N]^T$ is an $N$-dimensional vector of observed responses, $\beta = [\beta_0, \beta_1, \ldots, \beta_P]^T$ is a $(P+1)$-dimensional vector of unknown parameters, $A$ is an $N \times (P+1)$ matrix of known predictors (AR coefficients) and $\epsilon = [\epsilon_1, \epsilon_2, \ldots, \epsilon_N]^T$ is an $N$-dimensional error vector.
The least squares estimator is given by
$$\hat{\beta} = (A^T A)^{-1} A^T Y$$
Generalized linear model based classification was performed in stages to differentiate between the normal ECG signals and the various cardiac arrhythmias. The various stages of classification and the groupings in every stage are shown in Fig 1. During the training phase, the estimator $\hat{\beta}$ was computed based on the known classes of the ECG segments that form the training set. During the testing phase, the AR coefficients and the previously estimated $\hat{\beta}$ were used to compute the response at a particular stage of classification. To perform the stage-by-stage classification, the Euclidean distance between the AR coefficients of different classes was used to determine the groupings of classes at each stage. The AR coefficients of the training segments formed the predictor matrix $A$, and the elements of $Y$ were assigned values 1 or -1 depending on the membership of an ECG segment in a corresponding class or group. The number of elements in $Y$ was the number of examples in the training set. The estimator $\hat{\beta}$ was computed at each stage of classification based on the selected training sets. During testing, the output response (Y1 in stage 1, Y2 in stage 2, etc.) was computed using the AR coefficients and the previously estimated $\hat{\beta}$ at each stage. A threshold value of zero was used to classify the output response as belonging to a group at a particular stage.

Figure 7
A patient ECG and simulated ECG with VT

Sixty samples from each class were used for training and the remaining samples were used for testing in the classification phase. The training sets were picked randomly, and the sensitivity and specificity were measured multiple times for NSR and the arrhythmias. The sensitivity and specificity were computed for all the classes as given by
$$\text{Sensitivity} = \frac{TE - FN}{TE} \times 100\%, \qquad \text{Specificity} = \frac{TE - FP}{TE} \times 100\%$$
where TE represents the total number of events, FN represents the number of false negatives, and FP represents the number of false positives [2]. The average sensitivity and specificity values were computed for NSR and the cardiac arrhythmias.
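A single classification stage, as described in this section, amounts to a least-squares fit followed by a sign test. The sketch below is illustrative only. All function names are hypothetical, the normal equations are solved with a small hand-written elimination routine to stay dependency-free, and the sensitivity and specificity helper assumes the definitions (TE - FN)/TE and (TE - FP)/TE.

```python
def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_stage(ar_features, labels):
    """Least-squares estimate of beta in Y = A beta + eps, with labels in {+1, -1}.

    A is built by prepending a column of ones (for beta_0) to the AR coefficients.
    """
    a_mat = [[1.0] + list(row) for row in ar_features]
    p = len(a_mat[0])
    ata = [[sum(r[i] * r[j] for r in a_mat) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * y for r, y in zip(a_mat, labels)) for i in range(p)]
    return solve_linear(ata, aty)


def predict_stage(beta, ar_row):
    """Compute the stage response and threshold it at zero."""
    response = beta[0] + sum(b * f for b, f in zip(beta[1:], ar_row))
    return 1 if response >= 0 else -1


def sensitivity_specificity(total_events, false_negatives, false_positives):
    """Percent sensitivity and specificity under the assumed TE/FN/FP definitions."""
    sens = 100.0 * (total_events - false_negatives) / total_events
    spec = 100.0 * (total_events - false_positives) / total_events
    return sens, spec
```

In a staged scheme, one `beta` would be fitted per stage on the training segments relevant to that stage, and a test segment would be passed from stage to stage according to the sign returned by `predict_stage`.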

Results
The AR modeling was applied to six different types of ECG signals from the MIT-BIH database. Classification was performed using a GLM-based classification algorithm.

Figure 8
A patient ECG and simulated ECG with VF

AR modeling results
Two main criteria, SNR and ρ, were used to evaluate the performance of the AR model with different model orders. The correlation coefficients for all ECG signals were 0.99. The SNR ranged from 15.7 dB to 29.43 dB. Figure 2 shows the variation of SNR as a function of model order P. The SNR increased initially with model order P, but remained almost constant for model orders greater than or equal to four. In addition, computing AR coefficients of higher orders would increase the number of computations. Hence, an AR model of order four was used for further classification. The parameters computed using this model order were good enough to achieve a good SNR and correlation coefficient ρ and were found to be sensitive enough to differentiate the six types of ECG signals. The original NSR, APC, PVC, SVT, VT and VF segments as well as the modeled segments are shown in Figs 3, 4, 5, 6, 7 and 8.
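The two fit criteria can be computed directly from an original segment and its modeled counterpart. The sketch below is a hedged illustration. It assumes ρ is the usual Pearson correlation and SNR is 10 log10 of signal energy over error energy. It also assumes the simulated signal is a one-step-ahead AR prediction, since the text does not say how the simulated ECG was generated. All function names are hypothetical.

```python
import math


def ar_predict(v, a):
    """One-step-ahead prediction v_hat[k] = sum_i a[i] * v[k-1-i].

    Assumption: the first len(a) samples are copied from the original signal.
    """
    p = len(a)
    v_hat = list(v[:p])
    for k in range(p, len(v)):
        v_hat.append(sum(a[i] * v[k - 1 - i] for i in range(p)))
    return v_hat


def correlation_coefficient(v, v_hat):
    """Pearson correlation between the original and simulated signals."""
    n = len(v)
    m = sum(v) / n
    m_hat = sum(v_hat) / n
    num = sum((x - m) * (y - m_hat) for x, y in zip(v, v_hat))
    den = math.sqrt(sum((x - m) ** 2 for x in v) * sum((y - m_hat) ** 2 for y in v_hat))
    return num / den


def snr_db(v, v_hat):
    """Signal energy over modeling-error energy, in decibels (undefined for a perfect fit)."""
    signal_energy = sum(x * x for x in v)
    error_energy = sum((x - y) ** 2 for x, y in zip(v, v_hat))
    return 10.0 * math.log10(signal_energy / error_energy)
```

Evaluating `snr_db` for coefficients estimated at orders 1, 2, 3 and so on, and keeping the smallest order after which the value stops improving, mirrors the order selection described above.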
The results were consistent with other studies on the selection of model order for AR modeling [22]. AR modeling has been used for compression, and it has been found that the gain in accuracy from increasing the order of the predictor is negligible for predictors of order higher than 3 [22].
The mean AR coefficients for all the ECG types used in the current study are shown in Table 1.

Classification results
AR coefficients computed with order four were used for classification. Six types of ECG signals, namely NSR, APC, PVC, SVT, VT and VF, were considered for classification. Classification was performed using a generalized linear model, which was applied in various stages. Figure 1 shows the stage-by-stage GLM-based classification algorithm for classifying the various ECG signals. At stage one, the six classes were separated into two groups, with Normal, APC, PVC and SVT signals forming one group and VT and VF forming the other. This grouping was evident from the Euclidean distances between the mean AR coefficients of the various classes. The Euclidean distance between classes VF and VT was small. Similarly, the Euclidean distances among classes Normal, APC, PVC and SVT were small. The distance between VF/VT and Normal/APC/PVC/SVT was large, and hence in the first stage classes VT and VF were grouped together. The other group consisted of classes Normal, APC, PVC and SVT. In the second stage, VT and VF were differentiated. Stages three, four, five and six were used to differentiate between NSR, APC, PVC and SVT as shown in Fig 1. The least squares estimator β was computed for the various stages and the value Y was used to determine the classes in each stage. In the first stage (Y1), the AR coefficients from an ECG signal were separated into two groups, one consisting of NSR, APC, PVC and SVT and the other consisting of VT and VF. In the second stage (Y2), VT and VF were differentiated. In the third stage (Y3), SVT was distinguished from NSR, APC and PVC. In the later stages (Y4, Y5 and Y6), NSR, APC and PVC were distinguished from each other and classified.
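The grouping rule described above compares the Euclidean distances between the class-mean AR coefficient vectors. The following is a minimal sketch of that comparison. The function names are hypothetical and any numbers passed in would be the user's own, since the mean coefficients of Table 1 are not reproduced here.

```python
import math


def mean_vector(rows):
    """Element-wise mean of a list of equal-length coefficient vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def euclidean(u, v):
    """Euclidean distance between two coefficient vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_class_distances(coeffs_by_class):
    """Distances between class-mean AR vectors, keyed by alphabetically ordered class pairs.

    coeffs_by_class maps a class name (e.g. "VT") to a list of AR coefficient vectors.
    """
    means = {name: mean_vector(rows) for name, rows in coeffs_by_class.items()}
    names = sorted(means)
    return {
        (p, q): euclidean(means[p], means[q])
        for i, p in enumerate(names)
        for q in names[i + 1:]
    }
```

Classes whose mutual distances are small would be kept in the same group at an early stage and separated only in a later stage, which is the reasoning given for grouping VT with VF at stage one.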

Discussion
Different AR model orders were tested for the ECG signals, and the results showed that an AR order of four is sufficient to model the ECG signal for the purpose of classifying the selected arrhythmias. AR coefficients were used to classify the ECG beats into normal and five selected abnormal conditions. A stage-by-stage generalized linear model classification algorithm was used to distinguish between the different types of arrhythmias under consideration in the current study.
The classification results show that AR modeling can be used to discriminate between different arrhythmias. The classification results achieved using AR modeling are comparable to recently published results on the classification of cardiac arrhythmias [8-16]. Normal and abnormal PVC conditions have been classified from LPC coefficients using a fuzzy ARTMAP classifier with a sensitivity of 97% and a specificity of 99% [8]. Accuracies of 93% and 96% have been reported for VT and VF respectively using a modified sequential probability ratio test algorithm [10]. An overall accuracy of 93% to 99% was achieved with decimated ECG data and artificial neural networks [14]. However, the data set consisted of a high number of NSR and PVC beats, and the performance on beats including atrial premature beats was not very high. The total least squares-based Prony modeling technique produced an accuracy of 95.24%, 96% and 97.78% for SVT, VT and VF respectively [16].
The AR modeling based classification algorithm has demonstrated good performance in classification. The algorithms are easy to implement and the AR coefficients can be easily computed. Preprocessing involves the detection of R peaks, for which a number of techniques are available that can be implemented for real-time processing. A detailed comparison of computation times has not been performed; however, it is noted that computing the AR coefficients is simpler than most proposed measures for arrhythmia recognition. In addition, the computations were performed on only 1.2 seconds of data, compared to 3 to 7 seconds for the complexity measure-based technique [13] and 5 to 9 seconds for the Prony modeling technique [16].
Some of the proposed techniques classify fewer arrhythmias (two to three) than the current study [2,4-13,15,16]. The fuzzy ARTMAP technique has been used to classify normal and abnormal PVC conditions only [8]. A time sequenced adaptive filter has been proposed for VT and VF alone [9]. A real-time discrimination algorithm with a Fourier-transform neural network has been proposed to distinguish between supraventricular rhythms and ventricular rhythms, in which PVC and VT were lumped together as belonging to a single class of ventricular rhythms [12]. The complexity measure-based technique has been used to classify NSR, VT and VF [13]. A QRS feature-based algorithm for decimated ECG data using artificial neural networks has been proposed that includes various types of beats including APC and PVC, but does not include life-threatening conditions like VT and VF [14]. The Prony modeling technique has been used to classify SVT, VT and VF, but that study does not include episodes of normal rhythm, APC or PVC [16]. The current study classifies six different ECG classes, and the performance is comparable to that of studies that involve fewer classes.
In the current study, a fixed sample size has been used for AR modeling. A variable sample size based on the estimation of the R-R interval might yield better results independent of the heart rate of the subjects. The generalization capabilities of the AR model and the classification algorithms can be refined by applying the proposed approach to a larger data set. Further work is in progress to extend the proposed approach for classification of other types of cardiac arrhythmias as well as applying it to other signals of the cardiovascular system such as the hemodynamic signals, particularly for real-time applications. AR modeling is a linear modeling technique and might not necessarily be suitable for ECG signals under all conditions. Further work can be done to extend the current work to nonlinear parametric models that can better capture the non-linear and non-stationary nature of the ECG.
In addition to their utility in classification and diagnosis, AR coefficients can also be used for compression. AR modeling can lead to a low cost, high performance, simple to use portable telemedicine system for ECG offering a combination of diagnostic capability with compression.

Conclusions
The proposed AR modeling and GLM for classification have been shown to be effective for the classification of cardiac arrhythmias in critically ill patients and aid in the diagnosis of heart disease. AR modeling and GLM models