### Pre-processing of *RR*_{n} series

#### A. Median filter

A median filter is implemented by windowing the acquired data, ranking the samples in the window, and outputting the median of the sorted samples. Considering an RR interval (*RR*_{n}) sequence *x*_{n}, as shown in Figure 1(a), the output *y*_{n} of this nonlinear filter is given by,

{y}_{n}=\text{median}\{{x}_{n-w},\cdots ,{x}_{n},\cdots ,{x}_{n+w}\}

(1)

where the window has a fixed width of 2*w*+1. From the perspective of signal processing, the time delay of the median filter is *w* samples. A window size of 17 (*w*=8) is used herein, giving a delay of 8 samples. The median filter brings two advantages: (*i*) it suppresses unwanted outliers, which are mostly caused by erroneously detected (or missed) R-wave peaks; and (*ii*) it preserves sharp edges (i.e., onsets and terminations of AF episodes) without extensively blurring the context.
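As an illustration of Eq. (1), the following minimal sketch applies the windowed median directly. The function name `median_filter` and the toy RR values are assumptions made for this example, and only samples with a full window are returned.

```python
from statistics import median


def median_filter(x, w=8):
    """Direct form of Eq. (1): y[n] = median(x[n-w], ..., x[n+w]).

    Only positions with a complete window of 2*w + 1 samples are returned.
    """
    return [median(x[n - w:n + w + 1]) for n in range(w, len(x) - w)]


# Toy RR intervals in ms (assumed values): one outlier, then a step change.
outlier = [800] * 10 + [400] + [800] * 10
step = [800] * 5 + [400] * 5

print(median_filter(outlier, w=2))  # the isolated 400 ms sample is suppressed
print(median_filter(step, w=2))     # the step edge stays sharp
```

The two toy sequences mirror the two advantages listed above: an isolated outlier never reaches the centre of the sorted window, while a sustained change in level passes through as an abrupt edge.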

#### B. Integer filter for low scale reference

Subsequently, we filter the output *y*_{n} of the median filter with a low-pass filter of the form

{H}_{l}(z)=\frac{1-{z}^{-16}}{1-{z}^{-1}}

(2)

where the gain is *Gain1*=16=2^{4}, and the intrinsic delay of *H*_{l}(*z*) is 7.5 samples. This low-pass filter smooths the *y*_{n} resulting from the previous median filtering. Another benefit of the low-pass filter is the removal of fluctuations possibly caused by respiratory sinus arrhythmia (RSA) around the current sample. Let *xl*_{n} be the output of this filter, as illustrated in Figure 1(b).

#### C. Integer filter for high scale reference

Another low-pass filter *H*_{h}(*z*) is then applied to the output *xl*_{n} of the previous low-pass filter *H*_{l}(*z*),

{H}_{h}(z)=\frac{1-{z}^{-32}-{z}^{-64}+{z}^{-96}}{1-2{z}^{-1}+{z}^{-2}}

(3)

where the gain is *Gain2*=2048=2^{11}, and the delay of *H*_{h}(*z*) is 47 samples. This low-pass filter is introduced to generate a reference RR sequence of a larger scale, which is needed in the definition of the symbolic series, as explained in the following subsection. The resulting output, denoted by *xh*_{n}, is shown in Figure 1(c).

As we have seen, the time delays of *x*_{n} and *xl*_{n} are −62.5 and −47 samples with respect to *xh*_{n}, respectively. To ensure synchronization of the filtered data, let {x}_{n}^{\prime} and {\mathit{\text{xl}}}_{n}^{\prime} denote the corresponding time-delay corrected sequences of *x*_{n} and *xl*_{n}, respectively. Then, \Delta {\mathit{\text{RR}}}_{n}={x}_{n}^{\prime}-{\mathit{\text{xl}}}_{n}^{\prime} can be defined as the difference between the two delay-corrected sequences, as seen in Figure 1(d).

### Symbolic dynamics of *ΔRR*_{n}

The purpose of employing symbolic dynamics is to describe the dynamic behavior of *ΔRR*_{n} with respect to *xh*_{n}. Symbolic dynamics encodes the variation of *RR*_{n} into a series with fewer symbols, with each symbol representing an instantaneous state. The implemented thresholds are defined as: *thre1*=*xh*_{n}×2^{−4} (implemented as *thre1*=*xh*_{n}>>4), *thre2*=*xh*_{n}×2^{−3} (implemented as *thre2*=*xh*_{n}>>3), *thre3*=*thre1*+*thre2*, *thre4*=*xh*_{n}×2^{−2} (implemented as *thre4*=*xh*_{n}>>2) and *thre5*=*thre4*+*thre1*. The mapping function of the symbol transform can therefore be defined as,

{\mathit{\text{sy}}}_{n}=\left\{\begin{array}{ll}0& \text{if}\ \Delta {\mathit{\text{RR}}}_{n}<-\mathit{\text{thre}}4\\ 1& \text{else if}\ \Delta {\mathit{\text{RR}}}_{n}<-\mathit{\text{thre}}3\\ 2& \text{else if}\ \Delta {\mathit{\text{RR}}}_{n}<-\mathit{\text{thre}}2\\ 3& \text{else if}\ \Delta {\mathit{\text{RR}}}_{n}<-\mathit{\text{thre}}1\\ 4& \text{else if}\ \Delta {\mathit{\text{RR}}}_{n}<\mathit{\text{thre}}1\\ 5& \text{else if}\ \Delta {\mathit{\text{RR}}}_{n}<\mathit{\text{thre}}2\\ 6& \text{else if}\ \Delta {\mathit{\text{RR}}}_{n}<\mathit{\text{thre}}3\\ 7& \text{else if}\ \Delta {\mathit{\text{RR}}}_{n}<\mathit{\text{thre}}4\\ 8& \text{else if}\ \Delta {\mathit{\text{RR}}}_{n}<\mathit{\text{thre}}5\\ 9& \text{other cases}\end{array}\right.

(4)

The raw RR sequence *x*_{n} is then quantified into the symbol sequence *sy*_{n} with specific symbols from the predefined “alphabet” in Eq. (4) (i.e., 0 to 9). Recalling Figure 1(a)-(d) and scanning the distribution of calculated symbols in Figure 1(e), we confirm that most normal beats are defined as zero symbols, and possible abnormal beats (arrhythmias, e.g., AF) are defined as non-zero symbols by the transform in Eq. (4).
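The threshold cascade of Eq. (4) can be sketched with integer shifts only. In this minimal example the function name `symbol` is hypothetical, and `xh` is assumed to be the gain-corrected, non-negative integer reference in the same unit as `delta_rr` (for instance milliseconds); the mapping follows Eq. (4) literally.

```python
def symbol(delta_rr, xh):
    """Map one delta-RR sample to a symbol 0..9 following Eq. (4).

    Assumption: xh is the gain-corrected integer reference, so that the
    thresholds can be formed with right shifts as described in the text.
    """
    thre1 = xh >> 4          # xh * 2**-4
    thre2 = xh >> 3          # xh * 2**-3
    thre3 = thre1 + thre2
    thre4 = xh >> 2          # xh * 2**-2
    thre5 = thre4 + thre1
    # Upper bounds of symbols 0..8, in the order tested by Eq. (4).
    bounds = (-thre4, -thre3, -thre2, -thre1,
              thre1, thre2, thre3, thre4, thre5)
    for sy, bound in enumerate(bounds):
        if delta_rr < bound:
            return sy
    return 9                 # "other cases"


# With xh = 800 the thresholds are 50, 100, 150, 200 and 250.
print([symbol(d, 800) for d in (-300, -120, 0, 60, 300)])
```

Because every threshold is derived from `xh`, the symbol boundaries scale with the slowly varying reference instead of being fixed in absolute time.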

To facilitate the analysis of *sy*_{n}, the widely used 3-symbol template (i.e., a word consisting of 3 successive symbols) is applied to examine entropic properties. The word value can then be calculated by a novel operator as defined below,

{\mathit{\text{wv}}}_{n}=({\mathit{\text{sy}}}_{n-2}\times {2}^{8})+({\mathit{\text{sy}}}_{n-1}\times {2}^{4})+{\mathit{\text{sy}}}_{n}

(5)

where *sy*_{n−2}×2^{8} and *sy*_{n−1}×2^{4} are implemented with *sy*_{n−2}<<8 and *sy*_{n−1}<<4, and 0≤*wv*_{n}≤2457. Figure 2 briefly illustrates the transformation of the symbol sequence with the template and the corresponding word, while Figure 1(f) depicts the word sequence of *sy*_{n} shown in Figure 1(e).
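A short sketch of Eq. (5) makes the bound 0≤*wv*_{n}≤2457 easy to check: three symbols in the range 0 to 9 are packed into one integer with two left shifts. The function name `word_values` is an assumption of this example.

```python
def word_values(sy):
    """Pack each run of 3 successive symbols into a word value, Eq. (5)."""
    return [(sy[n - 2] << 8) + (sy[n - 1] << 4) + sy[n]
            for n in range(2, len(sy))]


print(word_values([9, 9, 9]))      # largest word: 9*256 + 9*16 + 9
print(word_values([4, 4, 4, 5]))   # two overlapping words
```

Since each symbol fits into 4 bits, distinct symbol triples always produce distinct word values, which is what allows the word value to index a counting array later on.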

### Shannon entropy

Shannon entropy (SE) is a statistical tool that quantifies a time series in terms of the information size. For the sake of completeness, we define the discrete probability space of a dynamic system as **A**=(*A*|*P*). The total number of elements in **A** is *N*. The characteristic elements can then be defined as *A*={*a*_{1},⋯,*a*_{k}}, with the relevant probability set *P*={*p*_{1},⋯,*p*_{k}} (1≤*k*≤*N*). Each element *a*_{i} has probability *p*_{i}=*N*_{i}/*N* (0<{p}_{i}\le 1,\sum _{i=1}^{k}{p}_{i}=1), where *N*_{i} is the number of occurrences of the element *a*_{i} in **A**. Thus, the SE of **A** is defined as [26],

\mathcal{\mathscr{H}}(\mathbf{A})=-\sum _{i=1}^{k}{p}_{i}{log}_{2}{p}_{i}

(6)

By Jensen’s inequality, we can prove \mathcal{\mathscr{H}}(\mathbf{A})\le {log}_{2}k\le {log}_{2}N, with equality if *p*_{i}≡1/*N* for all *i* and *k*≡*N*. Then, a uniformly scaled version of \mathcal{\mathscr{H}}(\mathbf{A}) can be expressed as,

{\mathcal{\mathscr{H}}}^{\prime}(\mathbf{A})=-\frac{1}{{log}_{2}N}\sum _{i=1}^{k}{p}_{i}{log}_{2}{p}_{i}

(7)

where, if *N*≡1, we set {log}_{2}N=1. Eq. (7) is also referred to as the normalized entropy, since the entropy is divided by the maximum entropy {log}_{2}N. A coarser version of {\mathcal{\mathscr{H}}}^{\prime}(\mathbf{A}) can be defined as,

{\mathcal{\mathscr{H}}}^{\mathrm{\prime \prime}}(\mathbf{A})=-\frac{k}{N{log}_{2}N}\sum _{i=1}^{k}{p}_{i}{log}_{2}{p}_{i}

(8)

In the present work, the dynamic **A** consists of the 127 consecutive word elements from *wv*_{n−126} to *wv*_{n} (the bin size in this case is *N*=127). By determining the characteristic set *A* and the relevant probability set *P* with these elements, we can thus calculate the SE {\mathcal{\mathscr{H}}}^{\mathrm{\prime \prime}}(\mathbf{A}). The presence of AF is then detectable: the rhythm is labeled AF if {\mathcal{\mathscr{H}}}^{\mathrm{\prime \prime}}(\mathbf{A}) exceeds a discrimination threshold, and non-AF otherwise, as can be seen in Figure 1(g). We use the training database to determine the optimal discrimination threshold by investigating threshold settings within the range [0.0, 1.0]; the best performing threshold of 0.353 is thus derived and employed for the performance assessment using different testing databases.
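For reference, Eq. (8) and the threshold decision can be written directly in floating point. This is a minimal, non-recursive sketch; the names `coarse_entropy` and `is_af` are hypothetical, and the default threshold is the value of 0.353 quoted above.

```python
from collections import Counter
from math import log2


def coarse_entropy(words):
    """Direct evaluation of Eq. (8) over one window of word values."""
    n = len(words)
    if n == 0:
        raise ValueError("window must not be empty")
    counts = Counter(words)              # N_i for each characteristic element
    k = len(counts)                      # number of characteristic elements
    log_n = log2(n) if n > 1 else 1.0    # convention of the text for N = 1
    h = -sum((c / n) * log2(c / n) for c in counts.values())
    return (k / n) * h / log_n


def is_af(words, threshold=0.353):
    """Label a window as AF when the coarse entropy exceeds the threshold."""
    return coarse_entropy(words) > threshold


print(is_af([1092] * 127))        # a single repeated word: lowest entropy
print(is_af(list(range(127))))    # 127 distinct words: highest entropy
```

The two extreme windows bracket the range [0.0, 1.0] over which the discrimination threshold was searched.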

### Key issues of online processing

At first glance, Eqs. (1)–(5) and (8) suggest that this AF detection technique is computationally demanding. However, these challenges can be overcome with recursive algorithms that support beat-by-beat, real-time processing.

#### A. Pseudo-recursive median filtering

The median filter in Eq. (1) can be implemented with a so-called pseudo-recursive method: for input *x*_{i}, we define *S*={*s*_{r}*↑*:1≤*r*≤2*w*+1} as a sorted array of the successive elements from *x*_{i−2w−1} to *x*_{i−1}, where the output *y*_{i} is obtained by following steps ➊-➎ below,

➊ A binary search is used to find the position *m* of the sample *x*_{i−2w−1}, which will depart from the window (i.e., *s*_{m}=*x*_{i−2w−1}); simultaneously, *x*_{i} will enter the window;

➋ The binary search is applied again to find the position *t* at which the input *x*_{i} needs to be placed (i.e., *s*_{t}<*x*_{i}≤*s*_{t+1});

➌ From positions *m* to *t*, the current *s*_{r} is replaced with the adjacent *s*_{r ± 1} (the ’±’ indicates whether the element is taken from the right or the left, with ’+’ and ’−’ denoting the element to the right and left, respectively);

➍ Replace the element *s*_{t} with *x*_{i};

➎ The median *s*_{w+1} of the updated *S* becomes the output *y*_{i}.

For the following input *x*_{i+1}, we repeat steps ➊ to ➎ and obtain the new output *s*_{w+1} (i.e., *y*_{i+1}), as shown in Figure 3, where the sorting utilizes the binary search twice. Compared with the traditional median filter, the computational complexity can be decreased from approximately *O*(*n*^{2}) to *O*(*n*).
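The steps above can be sketched with Python's `bisect` module. This is an approximation of the described scheme rather than a line-by-line transcription: the class name `SlidingMedian` is hypothetical, and the single shift between positions *m* and *t* is replaced by a delete followed by an insert on the sorted list, which gives the same sorted window.

```python
from bisect import bisect_left, insort
from collections import deque


class SlidingMedian:
    """Beat-by-beat median over the last 2*w + 1 samples."""

    def __init__(self, w=8):
        self.w = w
        self.size = 2 * w + 1
        self.window = deque()   # samples in arrival order
        self.sorted = []        # the same samples in ascending order (S)

    def push(self, x):
        """Add one sample; return the median, or None until the window fills."""
        if len(self.window) == self.size:
            oldest = self.window.popleft()
            # Step 1: binary search for the departing sample, then drop it.
            del self.sorted[bisect_left(self.sorted, oldest)]
        self.window.append(x)
        # Steps 2 to 4: binary search for the insertion point and insert.
        insort(self.sorted, x)
        if len(self.sorted) < self.size:
            return None
        # Step 5: the middle element of the sorted window is the output.
        return self.sorted[self.w]


f = SlidingMedian(w=2)
print([f.push(v) for v in [800, 810, 400, 805, 795, 790, 1500, 800]])
```

Each call keeps the window sorted incrementally, so no full sort is needed per beat; the output for the newest sample corresponds to the window centred *w* samples earlier, which is the filter delay mentioned for Eq. (1).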

#### B. Recursive implementation of integer filters

The recursive implementation (also referred to as the “difference equation”) of the filter *H*_{l}(*z*) can be expressed as,

{\mathit{\text{xl}}}_{n}={\mathit{\text{xl}}}_{n-1}+{y}_{n}-{y}_{n-16}

(9)

Eq. (9) includes 1 integer addition, 3 integer subtractions and 1 integer right-shift operation, the latter being *xl*_{n}>>4 (as *Gain1*=2^{4}), which offsets the gain of *H*_{l}(*z*).

The filter *H*_{h}(*z*) can then be computed recursively using

{\mathit{\text{xh}}}_{n}=({\mathit{\text{xh}}}_{n-1}\times 2)-{\mathit{\text{xh}}}_{n-2}+{\mathit{\text{xl}}}_{n}-{\mathit{\text{xl}}}_{n-32}-{\mathit{\text{xl}}}_{n-64}+{\mathit{\text{xl}}}_{n-96}

(10)

where *xh*_{n−1}×2 is implemented with *xh*_{n−1}<<1. Eq. (10) consists of 2 integer additions, 8 integer subtractions, 1 integer left-shift operation and 1 integer right-shift operation, the latter being *xh*_{n}>>11 (as *Gain2*=2^{11}), which offsets the gain of *H*_{h}(*z*).
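Eqs. (9) and (10) can be sketched together as one streaming stage. Several details here are assumptions of the example and not statements of the text: the class name `IntegerFilters`, zero initial conditions, keeping the recursion states unshifted while emitting gain-corrected outputs, and feeding the gain-corrected *xl*_{n} into Eq. (10).

```python
from collections import deque


class IntegerFilters:
    """Streaming integer low-pass filters following Eqs. (9) and (10)."""

    def __init__(self):
        self.y = deque([0] * 16, maxlen=16)    # y[n-16] .. y[n-1]
        self.xl = deque([0] * 96, maxlen=96)   # xl[n-96] .. xl[n-1]
        self.acc_l = 0                         # unshifted state of Eq. (9)
        self.acc_h1 = 0                        # unshifted xh[n-1]
        self.acc_h2 = 0                        # unshifted xh[n-2]

    def push(self, y_n):
        """Consume one median-filtered sample; return (xl_n, xh_n)."""
        # Eq. (9): xl[n] = xl[n-1] + y[n] - y[n-16]
        self.acc_l += y_n - self.y[0]
        self.y.append(y_n)
        xl_n = self.acc_l >> 4                 # offset Gain1 = 2**4
        # Eq. (10): xh[n] = 2*xh[n-1] - xh[n-2]
        #                   + xl[n] - xl[n-32] - xl[n-64] + xl[n-96]
        acc_h = ((self.acc_h1 << 1) - self.acc_h2
                 + xl_n - self.xl[-32] - self.xl[-64] + self.xl[-96])
        self.xl.append(xl_n)
        self.acc_h2, self.acc_h1 = self.acc_h1, acc_h
        return xl_n, acc_h >> 11               # offset Gain2 = 2**11


flt = IntegerFilters()
for _ in range(200):
    xl_n, xh_n = flt.push(800)   # constant RR interval of 800 ms
print(xl_n, xh_n)
```

Keeping the accumulators unshifted means the integer recursions stay exact, and only the values handed to the next stage are divided by the filter gain.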

#### C. Mapping the definition of -\frac{1}{{log}_{2}N}{p}_{i}{log}_{2}{p}_{i}

Investigating the dynamic **A**, we immediately see that each characteristic symbol within a bin of size *N* can only have a probability *p*_{i}=*i*/*N* (1≤*i*≤*N*, i.e., 1/*N*≤*p*_{i}≤1). Along these lines, a probability array *PiMap* can be pre-calculated,

\mathit{\text{PiMap}}\left[127\right]=-\frac{\mathit{\text{Cons}}}{{log}_{2}N}\left\{{p}_{1}{log}_{2}{p}_{1},\cdots ,{p}_{63}{log}_{2}{p}_{63},{p}_{64}{log}_{2}{p}_{64},\cdots ,{p}_{127}{log}_{2}{p}_{127}\right\}\stackrel{\lfloor \cdot \rfloor }{=}\left\{7874,\cdots ,71790,71291,\cdots ,0\right\}

(11)

where *Cons*=1000000 is a constant chosen so that decimal fractions can be converted into integers, *N*=127, and \stackrel{\lfloor \cdot \rfloor }{=} indicates taking the integer part of each -\frac{\mathit{\text{Cons}}}{{log}_{2}N}{p}_{i}{log}_{2}{p}_{i}.

Notably, for each cardiac cycle screened, this predefined *PiMap* reduces the operation to picking the integer *PiMap*[*i*] from the set *PiMap* according to the index *i*, rather than calculating -\frac{1}{{log}_{2}N}{p}_{i}{log}_{2}{p}_{i} using arithmetic and logarithmic operations. The use of this precalculated table significantly decreases calculation times.
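The table in Eq. (11) can be reproduced offline with a few lines. In this sketch the function name `build_pi_map` is hypothetical, and an extra entry at index 0 (fixed to 0) is added as a convenience so that the occurrence count can be used directly as the index.

```python
from math import floor, log2

N = 127            # bin size
CONS = 1_000_000   # scaling constant Cons from Eq. (11)


def build_pi_map(n=N, cons=CONS):
    """Precompute floor(-Cons * p_i * log2(p_i) / log2(N)) for p_i = i / N."""
    table = [0] * (n + 1)            # table[0] stays 0 (assumed convention)
    for i in range(1, n + 1):
        p = i / n
        table[i] = floor(-cons * p * log2(p) / log2(n))
    return table


pi_map = build_pi_map()
print(pi_map[1], pi_map[63], pi_map[64], pi_map[127])
```

The four printed entries are the ones quoted explicitly in Eq. (11), which gives a quick consistency check on the table.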

#### D. Recursive implementation of {\mathcal{\mathscr{H}}}^{\mathrm{\prime \prime}}(\mathbf{A})

We define a buffer array {\mathit{\text{nu}}}_{w{v}_{i}} (*wv*_{i}≤2457) to store the number of occurrences of the *i*th characteristic element *wv*_{i} in space **A**. When the input *wv*_{n} enters **A** (i.e., *wv*_{n} becomes the rightmost element), the leftmost element *wv*_{n−127} simultaneously departs from **A**; see Figure 2 for clarity. A variation of the SE {\mathcal{\mathscr{H}}}^{\prime}(\mathbf{A}) is therefore purely determined by {\mathit{\text{nu}}}_{w{v}_{n}} and {\mathit{\text{nu}}}_{w{v}_{n-127}} in the dynamic **A**. Hence, {\mathcal{\mathscr{H}}}^{\mathrm{\prime \prime}}(\mathbf{A}) is calculated recursively by the algorithm below,

where s{h}_{n}^{\prime} and s{h}_{n}^{\mathrm{\prime \prime}} represent {\mathcal{\mathscr{H}}}^{\prime}(\mathbf{A}) and {\mathcal{\mathscr{H}}}^{\mathrm{\prime \prime}}(\mathbf{A}), respectively; ^{(∗} indicates that *PiMap*[*i*]=0 is fixed for the case *i*≡0; and 127000000=*N*×*Cons*=127×1000000. For the next input *wv*_{n+1}, steps ➀-➂ are again executed to obtain s{h}_{n+1}^{\mathrm{\prime \prime}}. From an online processing perspective, the time delays of s{h}_{n}^{\mathrm{\prime \prime}} are 64 and 126.5 samples with respect to *xh*_{n} and *x*_{n}, respectively.
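One plausible reading of this recursive update is sketched below; it is an assumption based on the surrounding description, not a reproduction of the original algorithm listing. The class name `RecursiveEntropy`, the attribute names, and the order of the update (departing word first, then entering word) are all choices made for the example.

```python
from collections import deque
from math import floor, log2

N, CONS = 127, 1_000_000
# Integer lookup table; index 0 is fixed to 0 as noted in the text.
PI_MAP = [0] + [floor(-CONS * (i / N) * log2(i / N) / log2(N))
                for i in range(1, N + 1)]


class RecursiveEntropy:
    """Sliding-window estimate of H''(A) using integer table lookups."""

    def __init__(self):
        self.window = deque()     # the last N word values
        self.nu = [0] * 2458      # occurrence count per word value 0..2457
        self.k = 0                # number of distinct words in the window
        self.sh1 = 0              # integer sh' (Cons times H', truncated)

    def _change(self, wv, step):
        old = self.nu[wv]
        new = old + step
        self.sh1 += PI_MAP[new] - PI_MAP[old]   # swap one entropy term
        self.k += (old == 0) - (new == 0)       # word appears or vanishes
        self.nu[wv] = new

    def push(self, wv):
        """Add one word value; return sh'' once the window holds N words."""
        if len(self.window) == N:
            self._change(self.window.popleft(), -1)   # leftmost word departs
        self.window.append(wv)
        self._change(wv, +1)                          # new word enters
        if len(self.window) < N:
            return None
        return self.k * self.sh1 / (N * CONS)         # k * sh' / 127000000


det = RecursiveEntropy()
last = None
for n in range(300):
    last = det.push((n * 7) % 50)    # assumed toy word sequence
print(last is not None and last > 0.353)
```

Only two table entries change per beat, so the per-beat cost does not depend on the window length; the small truncation error comes from taking the integer part of each table entry.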

An architecture of the overall logic of the recursive realization can be seen in Figure 4. By using recursive algorithms, this AF detector consists of several basic operations, such as integer addition/subtraction, integer comparison and integer shifting. In effect, calculating s{h}_{n}^{\mathrm{\prime \prime}} and classifying the current beat *x*_{n} only requires 1 multiplication and 1 division, lying within \frac{k}{127000000}\cdot, together with 1 floating-point comparison between s{h}_{n}^{\mathrm{\prime \prime}} and a threshold. Consequently, the detector is computationally efficient.