ECG data
We used two PhysioNet databases, those of the Computers in Cardiology Challenges 2001 and 2004 (CinC 2001, CinC 2004) [17, 18]. The CinC 2001 database includes both AFib and non-AFib data files. These files were made by cutting appropriate segments from 24-hour ECG recordings and came from 48 different people. Files whose names begin with 'n' contain ECG data from people without AFib; these people were either normal or had diseases other than AFib. Even-numbered files whose names begin with 'p' and end with 'c' contain AFib ECG data. To simplify the analysis, we divided each record into one-minute segments and used only the first minute. The CinC 2004 database includes only AFib data files, each containing one minute of data. The sampling frequency of each file was 128 Hz. Each ECG record has two simultaneous components, which record two different ECG leads; we chose the first component.
ECG files 'n27' and 'n27c' from the CinC 2001 database were too noisy for reliable heartbeat detection, so we omitted these two files. In total, we used 25 AFib and 98 non-AFib data files from the CinC 2001 database and 80 AFib data files from the CinC 2004 database. About two non-AFib data files were obtained from each person, and only one AFib data file was obtained from each patient. The CinC 2004 database gives no information about the number of patients.
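The segment selection described above can be sketched as follows. This is a minimal sketch, assuming the record is already loaded in memory as a list of (lead 1, lead 2) sample pairs; the function names and that in-memory layout are hypothetical and are not part of any PhysioNet tool.

```python
FS_HZ = 128           # sampling frequency stated in the text
SEGMENT_SECONDS = 60  # one-minute analysis window


def first_lead(two_lead_samples):
    """Keep the first of the two simultaneously recorded leads.

    Assumes each sample is a (lead1, lead2) pair (hypothetical layout).
    """
    return [pair[0] for pair in two_lead_samples]


def first_minute(samples, fs_hz=FS_HZ, seconds=SEGMENT_SECONDS):
    """Return the first one-minute segment of a single-lead record."""
    n = fs_hz * seconds
    if len(samples) < n:
        raise ValueError("record is shorter than one minute")
    return samples[:n]


# A synthetic 90-second, two-lead record stands in for a real file.
record = [(i, -i) for i in range(FS_HZ * 90)]
segment = first_minute(first_lead(record))
```

At 128 Hz, the one-minute segment holds 7680 samples.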
Inter-beat intervals
We obtained inter-beat intervals from the input ECG data by using the wavelet method [19]; we present an overview here. First, we applied a discrete wavelet transform to the input ECG data to find the transform coefficient vectors
$$[A_N, D_N, D_{N-1}, \ldots, D_1],$$
where $A_N$ is the approximation coefficient vector and $D_i$ ($i = 1, \ldots, N$) are the detail coefficient vectors. We chose one detail coefficient vector $D_i$ by a criterion [19] and assigned zeros to the detail coefficient vectors $D_{i-1}, D_{i-2}, \ldots, D_1$. Figure 1(b) shows the waveform obtained by applying the inverse wavelet transform to the coefficient vectors
$$[A_N, D_N, \ldots, D_i, \mathbf{0}, \ldots, \mathbf{0}],$$
where $\mathbf{0}$ denotes a zero vector. Figure 1(c) shows the result of subtracting the waveform of Figure 1(b) from the waveform of Figure 1(a). The waveform of Figure 1(c) is leveled out, while the details of the waveform of Figure 1(a) are preserved.
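The baseline-removal step can be illustrated with a small, self-contained sketch. It assumes a Haar wavelet (the text does not name the wavelet family) and takes the index of the chosen detail vector as a plain parameter `i` rather than applying the selection criterion of [19]; all function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_dwt(signal, levels):
    """Multi-level Haar DWT; returns (A_N, [D_1, ..., D_N]) with N = levels."""
    approx = [float(v) for v in signal]
    details = []
    for _ in range(levels):
        if len(approx) % 2:
            raise ValueError("signal length must be divisible by 2**levels")
        evens, odds = approx[0::2], approx[1::2]
        details.append([(a - b) / SQRT2 for a, b in zip(evens, odds)])
        approx = [(a + b) / SQRT2 for a, b in zip(evens, odds)]
    return approx, details


def haar_idwt(approx, details):
    """Inverse of haar_dwt: rebuild the signal from A_N and [D_1, ..., D_N]."""
    out = list(approx)
    for detail in reversed(details):  # start from the coarsest level
        rebuilt = []
        for a, d in zip(out, detail):
            rebuilt.extend(((a + d) / SQRT2, (a - d) / SQRT2))
        out = rebuilt
    return out


def remove_baseline(signal, levels, i):
    """Zero D_1..D_{i-1}, invert to estimate the baseline, then subtract it."""
    approx, details = haar_dwt(signal, levels)
    kept = [d if level >= i else [0.0] * len(d)
            for level, d in enumerate(details, start=1)]
    baseline = haar_idwt(approx, kept)
    flattened = [s - b for s, b in zip(signal, baseline)]
    return baseline, flattened


# Toy signal: a step-like drift with one sharp spike standing in for a beat.
signal = [1.0] * 8 + [3.0] * 8
signal[3] += 8.0
baseline, flattened = remove_baseline(signal, levels=3, i=4)
```

In this toy case all three detail vectors are zeroed, so the estimated baseline is the mean of each block of eight samples, and the spike survives in `flattened` while the drift is removed.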
Next, we located the time position of each QRS complex, which protrudes substantially above the baseline. The QRS complexes mark the heart beats. We calculated the approximation and detail coefficient vectors
$$[A_N, D_N, D_{N-1}, \ldots, D_1]$$
by applying the discrete wavelet transform to the waveform obtained after removing the baseline. Choosing one detail coefficient vector $D_i$, we made a new vector $D_i'$ by applying some treatments to the vector $D_i$ [19]. We assigned zero vectors to the other vectors and applied the inverse wavelet transform to
$$[\mathbf{0}, \mathbf{0}, \ldots, D_i', \ldots, \mathbf{0}].$$
We determined the most adequate wavelet scale by comparing the Pearson correlation coefficients [19]. Figure 2 shows that the waveform obtained by the inverse wavelet transform indicates the time positions of the QRS complexes.
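Once the QRS time positions are known, the inter-beat intervals follow directly. The sketch below assumes the detector output is a list of sample indices and uses the 128 Hz sampling frequency of the data files; the function name is hypothetical.

```python
def inter_beat_intervals(qrs_sample_indices, fs_hz=128):
    """Convert QRS sample positions into inter-beat intervals in seconds."""
    positions = sorted(qrs_sample_indices)
    return [(later - earlier) / fs_hz
            for earlier, later in zip(positions, positions[1:])]


# Four detected beats give three intervals.
intervals = inter_beat_intervals([0, 128, 256, 448])
```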
Poincaré plots
If we represent the inter-beat intervals as a sequence $I_1, I_2, I_3, I_4, I_5, \ldots, I_n$, as in Figure 3(a), we can make a Poincaré plot that is composed of the points $(I_1, I_2), (I_2, I_3), (I_3, I_4), (I_4, I_5), \ldots, (I_{n-1}, I_n)$. We connected the consecutive points with lines to observe the dynamics of the inter-beat intervals.
The Poincaré plot, which applies to discrete data, is closely related to the conventional phase plane of continuous data. If the x-coordinate of a point in the Poincaré plot is $x_1$, the y-coordinate of the point is mathematically related to $\left.\frac{dx}{dt}\right|_{x = x_1}$ [20]. If the x axis of the phase plane is $x$, the y axis corresponds to $\frac{dx}{dt}$.
Figure 3 illustrates the procedure for building a Poincaré plot. Figure 3(a) shows ECG data containing a premature ventricular contraction (PVC), together with the inter-beat intervals $I_1, I_2, I_3, I_4, I_5, I_6$. Figure 3(b) shows the Poincaré plot made from these inter-beat intervals. This Poincaré plot has the points $(I_1, I_2), (I_2, I_3), (I_3, I_4), (I_4, I_5), (I_5, I_6)$, and we drew lines between the consecutive points to observe the dynamics more easily. The points revolve clockwise and make a wedge-shaped diagram, because the inter-beat intervals change around the PVC.
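The construction of the plot can be sketched in a few lines. The interval values below are invented for illustration (a short interval followed by a long one, as happens around a PVC) and are not taken from the databases; the function names are hypothetical.

```python
def poincare_points(intervals):
    """Return the points (I_1, I_2), (I_2, I_3), ..., (I_{n-1}, I_n)."""
    return list(zip(intervals, intervals[1:]))


def poincare_segments(points):
    """Return the line segments joining consecutive points of the plot."""
    return list(zip(points, points[1:]))


# Six illustrative intervals in seconds.
intervals = [0.8, 0.8, 0.5, 1.1, 0.8, 0.8]
points = poincare_points(intervals)
segments = poincare_segments(points)
```

Six intervals give five points and four connecting segments; the points leave the diagonal at (0.8, 0.5), pass through (0.5, 1.1) and (1.1, 0.8), and return to (0.8, 0.8).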
Typical patterns of Poincaré plots
The Poincaré plots from non-AFib data show several typical patterns. Figure 4(a) shows ECG data whose inter-beat intervals are almost uniform. The Poincaré plot in Figure 4(b) shows a pattern in which the points congregate around one central point, because each inter-beat interval is almost the same as the one before it. The mark O indicates a time position at which the QRS complex detector found ventricular activity. Figure 5(a) shows ECG data containing several PVCs. The inter-beat intervals change around the PVCs, which appears in Figure 5(b) as a wedge-shaped Poincaré plot. This type of Poincaré plot is also reported by Zemaityte et al. [21]. The difference between the Poincaré plot in this paper and the plot of Zemaityte et al. is whether lines are drawn between the consecutive points.
Poincaré plot in case of AFib
Figure 6 shows that the Poincaré plot has no specific pattern in the case of AFib: the points in the plot move irregularly. This reflects the fact that, during AFib, the inter-beat intervals are statistically independent of each other, except for a slight correlation between immediately successive beats [20]. The points in the plot often cross the diagonal line. We drew lines between the consecutive points in the Poincaré plot to make their movements easier to follow. This plot is similar to many AFib plots in other papers [20, 21].
Feature selection
Mean stepping increment of inter-beat intervals
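Let us assume that we are given the inter-beat intervals $I_1, I_2, I_3, I_4, I_5, \ldots, I_n$. The points in the Poincaré plot are then $(I_1, I_2), (I_2, I_3), \ldots, (I_{n-1}, I_n)$, in order. If we designate two consecutive points as $(I_j, I_{j+1})$ and $(I_{j+1}, I_{j+2})$, the distance between the two points in the Poincaré plot is
$$\sqrt{(I_{j+1} - I_j)^2 + (I_{j+2} - I_{j+1})^2}.$$
We calculated the mean value of these quantities as
$$\frac{1}{n-2} \sum_{j=1}^{n-2} \sqrt{(I_{j+1} - I_j)^2 + (I_{j+2} - I_{j+1})^2}.$$
This indicates the rate of change of the inter-beat intervals in the Poincaré plot. To normalize this quantity and make it dimensionless, we divided it by the mean inter-beat interval,
$$\frac{1}{n} \sum_{j=1}^{n} I_j.$$
We defined the following quantity as the mean stepping increment of the inter-beat intervals:
$$\text{mean stepping increment} = \frac{\dfrac{1}{n-2} \displaystyle\sum_{j=1}^{n-2} \sqrt{(I_{j+1} - I_j)^2 + (I_{j+2} - I_{j+1})^2}}{\dfrac{1}{n} \displaystyle\sum_{j=1}^{n} I_j}.$$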
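The mean stepping increment can be computed directly from the interval sequence. The sketch below follows the definition in the text: the mean distance between consecutive Poincaré points divided by the mean inter-beat interval. The function name and the sample values are illustrative assumptions.

```python
import math


def mean_stepping_increment(intervals):
    """Mean step length between consecutive Poincaré points,
    normalized by the mean inter-beat interval (dimensionless)."""
    n = len(intervals)
    if n < 3:
        raise ValueError("at least three intervals are needed")
    steps = [
        math.hypot(intervals[j + 1] - intervals[j],
                   intervals[j + 2] - intervals[j + 1])
        for j in range(n - 2)
    ]
    mean_step = sum(steps) / (n - 2)
    mean_interval = sum(intervals) / n
    return mean_step / mean_interval


regular = mean_stepping_increment([1.0, 1.0, 1.0, 1.0])
alternating = mean_stepping_increment([1.0, 2.0, 1.0, 2.0])
```

A perfectly regular rhythm gives zero, while strongly varying intervals give a larger value.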
Dispersion of points around diagonal line in Poincaré plot
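Let us calculate the coordinates of a central point on the diagonal line in the Poincaré plot. If the inter-beat intervals are $I_1, I_2, I_3, I_4, I_5, \ldots, I_n$, the points of the Poincaré plot are $(I_1, I_2), (I_2, I_3), (I_3, I_4), \ldots, (I_{n-1}, I_n)$. We sought the central point $(x, x)$ that minimizes the sum of squared distances from this point to all the points in the Poincaré plot. Designating this sum as $E(x)$, we have
$$E(x) = \sum_{j=1}^{n-1} \left[ (I_j - x)^2 + (I_{j+1} - x)^2 \right].$$
To find the point minimizing this sum, we calculated the derivative with respect to the variable $x$. From
$$\frac{dE(x)}{dx} = -2 \sum_{j=1}^{n-1} \left[ (I_j - x) + (I_{j+1} - x) \right] = 0,$$
we found the central point $(a, a)$ as
$$a = \frac{1}{2(n-1)} \sum_{j=1}^{n-1} (I_j + I_{j+1}) = \frac{I_1 + 2I_2 + \cdots + 2I_{n-1} + I_n}{2(n-1)}.$$
The distance from a point $(I_j, I_{j+1})$ to the diagonal line $y = x$ is
$$d_j = \frac{|I_j - I_{j+1}|}{\sqrt{2}}.$$
The standard deviation of these terms is
$$\sigma_d = \sqrt{\frac{1}{n-1} \sum_{j=1}^{n-1} d_j^2 - \left( \frac{1}{n-1} \sum_{j=1}^{n-1} d_j \right)^2}.$$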
This term indicates how widely the points in the Poincaré plot are spread around the diagonal line. We chose the ratio of this standard deviation, $\sigma_d$, to the central-point coordinate $a$ as a distinguishing feature:
$$\text{dispersion} = \frac{\sigma_d}{a}.$$
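The quantities in this subsection can be sketched as follows. Two points are assumptions rather than statements from the text: the standard deviation is taken in its population form (dividing by the number of points), and the feature is taken to be the standard deviation of the distances divided by the central coordinate a. The function names are hypothetical.

```python
import math


def central_coordinate(intervals):
    """Coordinate a of the point (a, a) on the diagonal that minimizes
    the sum of squared distances to all Poincaré points."""
    n = len(intervals)
    total = sum(intervals[j] + intervals[j + 1] for j in range(n - 1))
    return total / (2 * (n - 1))


def diagonal_distances(intervals):
    """Distance of each point (I_j, I_{j+1}) from the line y = x."""
    return [abs(a - b) / math.sqrt(2.0)
            for a, b in zip(intervals, intervals[1:])]


def dispersion(intervals):
    """Assumed feature: population std of the distances divided by a."""
    d = diagonal_distances(intervals)
    mean_d = sum(d) / len(d)
    variance = sum((x - mean_d) ** 2 for x in d) / len(d)
    return math.sqrt(variance) / central_coordinate(intervals)


a_example = central_coordinate([1.0, 1.0, 3.0])
dispersion_example = dispersion([1.0, 1.0, 3.0])
```

For the intervals 1, 1, 3 the plot has the points (1, 1) and (1, 3), so a = 1.5 and the two distances from the diagonal are 0 and the square root of 2.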
Number of clusters in Poincaré plot
To determine the number of clusters in a Poincaré plot, we developed a clustering method based on spectral graph theory [22]. The Poincaré plots in Figures 4(b) and 5(b) show that non-AFib data sets have a limited number of clusters. In contrast, Figure 6(b) shows that AFib data sets can have many clusters or a single large, merged cluster.
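The clustering method itself is not detailed here, so the following is only a simplified stand-in and not the authors' algorithm. It relies on one standard fact of spectral graph theory: the number of zero eigenvalues of a graph Laplacian equals the number of connected components of the graph. The sketch therefore links points closer than a hypothetical `radius` and counts the connected components with a union-find structure instead of computing eigenvalues.

```python
import math


def count_clusters(points, radius):
    """Count connected components of the graph that links every pair of
    points lying within `radius` of each other (hypothetical parameter)."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= radius:
                parent[find(i)] = find(j)

    return len({find(i) for i in range(len(points))})


# Illustrative points: a tight pair near the diagonal plus three outliers.
points = [(0.80, 0.80), (0.81, 0.79), (0.80, 0.50), (0.50, 1.10), (1.10, 0.80)]
n_clusters = count_clusters(points, radius=0.05)
```

The result depends strongly on the chosen radius, which would have to be tuned to the scale of the inter-beat intervals.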
Correcting faults in QRS complex detection
The Poincaré plot helped us catch faults of our QRS complex detector. If the detector misses one QRS complex, as in Figure 7(a), the inter-beat interval spanning the missed beat is longer than the other intervals. This appears as a triangle in Figure 7(b), because the Poincaré plot contains the points $(I_1, I_2), (I_2, I_3), (I_3, I_4), (I_4, I_5)$, where $I_1$, $I_2$, $I_4$ and $I_5$ are almost the same but smaller than $I_3$.
We tried to correct this by identifying the triangle in the Poincaré plot as follows. Let us designate two consecutive points as $(I_j, I_{j+1})$ and $(I_{j+1}, I_{j+2})$. In the section above, we calculated the coordinates of the central point $(a, a)$ on the diagonal line of the Poincaré plot. If we take the central point $(a, a)$ as a new origin, the coordinates of the two consecutive points become $(I_j - a, I_{j+1} - a)$ and $(I_{j+1} - a, I_{j+2} - a)$. We identified a mistake made by the QRS complex detector when all of the following held: the distances
$$\sqrt{(I_j - a)^2 + (I_{j+1} - a)^2}, \quad \sqrt{(I_{j+1} - a)^2 + (I_{j+2} - a)^2}$$
are similar; the coordinates of the middle point
$$\left( \frac{I_j + I_{j+1}}{2} - a, \; \frac{I_{j+1} + I_{j+2}}{2} - a \right)$$
are positive; this middle point is located near the diagonal line; and the smaller of the two distances is substantially large.
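The four conditions can be turned into a simple check. The text does not give numerical thresholds, so the three tolerance parameters below (`similar_tol`, `diagonal_tol`, `min_distance`) and their default values are purely hypothetical, as is the function name.

```python
import math


def find_missed_beats(intervals, a, similar_tol=0.2, diagonal_tol=0.2,
                      min_distance=0.5):
    """Return indices of intervals suspected to span a missed QRS complex.

    `a` is the central coordinate of the Poincaré plot. All thresholds are
    illustrative assumptions, expressed as fractions of other quantities.
    """
    suspects = []
    for j in range(len(intervals) - 2):
        # Two consecutive points, expressed relative to the new origin (a, a).
        p = (intervals[j] - a, intervals[j + 1] - a)
        q = (intervals[j + 1] - a, intervals[j + 2] - a)
        dist_p, dist_q = math.hypot(*p), math.hypot(*q)
        smaller = min(dist_p, dist_q)
        mid = ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)

        similar = abs(dist_p - dist_q) <= similar_tol * max(dist_p, dist_q)
        positive = mid[0] > 0 and mid[1] > 0
        near_diagonal = (abs(mid[0] - mid[1]) / math.sqrt(2.0)
                         <= diagonal_tol * smaller)
        large = smaller >= min_distance * a

        if similar and positive and near_diagonal and large:
            suspects.append(j + 1)  # the shared, overly long interval
    return suspects


# One interval is twice as long as its neighbours, as if a beat was missed.
suspects = find_missed_beats([0.8, 0.8, 1.6, 0.8, 0.8], a=1.0)
```

In the example, only the pair of points that share the long interval satisfies all four conditions, so index 2 is flagged.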
Support vector machine classifier
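We employed a 1-norm support vector machine with a radial basis function kernel. The forms of the support vector machine and the radial basis function kernel were as follows:
$$\min_{w, b, \xi} \; \frac{1}{2} w^T w + C \sum_{i=1}^{l} \xi_i \quad \text{subject to} \quad y_i \left( w^T \phi(x_i) + b \right) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \ldots, l,$$
$$K(x_i, x_j) = \phi(x_i)^T \phi(x_j) = \exp\left( -\gamma \| x_i - x_j \|^2 \right), \quad \gamma > 0,$$
where $x_i$ are the feature vectors, $y_i \in \{-1, +1\}$ are the class labels, and $\xi_i$ are the slack variables.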
The parameters $C$ and $\gamma$ were selected by an automatic tool provided with a support vector machine program [23]. We passed the option '-v 2' to this tool.
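To make the role of γ concrete, the sketch below evaluates the radial basis function kernel and the standard kernel decision function f(x) = Σ αᵢ yᵢ K(xᵢ, x) + b on three-component feature vectors. The support vectors, coefficients and bias are invented for illustration and are not trained values; the function names are hypothetical.

```python
import math


def rbf_kernel(x, z, gamma):
    """K(x, z) = exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * squared_distance)


def svm_decision(x, support_vectors, dual_coefs, bias, gamma):
    """Decision value; each dual coefficient stands for alpha_i * y_i."""
    return sum(coef * rbf_kernel(sv, x, gamma)
               for sv, coef in zip(support_vectors, dual_coefs)) + bias


# Hypothetical model with one support vector per class.
support_vectors = [(0.0, 0.0, 1.0), (1.0, 1.0, 5.0)]
dual_coefs = [-1.0, 1.0]
score_first = svm_decision(support_vectors[0], support_vectors, dual_coefs,
                           bias=0.0, gamma=1.0)
score_second = svm_decision(support_vectors[1], support_vectors, dual_coefs,
                            bias=0.0, gamma=1.0)
```

A larger γ makes the kernel fall off faster with distance, so each support vector influences a smaller neighbourhood; the sign of the decision value gives the predicted class.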