Compressive sensing (CS) theory was proposed by Donoho, and its basic idea is that the best CS performance is achieved by exploiting data sparsity. A conventional CS scheme mainly comprises the sparse representation of data, the design of the measurement matrix, and the reconstruction algorithm [11]. In our proposed CS scheme for gait telemonitoring, as shown in Fig. 1, a sparse binary matrix is optimally designed as the measurement matrix for compressing acceleration data on the body-worn device, and the BSBL algorithm is introduced to reconstruct the acceleration data with high fidelity at the remote terminal. This in turn supports high-quality gait classification in gait telemonitoring applications. The details of our CS scheme are presented as follows.
Design of the optimal measurement matrix
According to CS theory, the raw acceleration data \(X \in R^{N \times 1}\) (N denotes the length of the data) on the body-worn device can be greatly compressed by
$$Y = \Phi X$$
(1)
where \(\Phi \in R^{M \times N}\) (M ≤ N) is the measurement matrix and \(Y \in R^{M \times 1}\) is the compressed data (M is the length of the compressed data). Here, a sparse binary matrix, as shown in Eq. (2), is selected and optimally designed as the measurement matrix. It contains very few nonzero entries, which greatly reduces the computational resources required on the body-worn device. For comparison, a Gaussian random matrix and a Bernoulli random matrix are also selected to validate the CS performance of our scheme on acceleration data.
$$\Phi = \begin{pmatrix} 1 & 0 & 1 & 0 & \cdots & 1 \\ 0 & 1 & 1 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \vdots & \cdots & \vdots \\ 1 & 0 & 0 & 1 & \cdots & 0 \end{pmatrix}$$
(2)
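The compression step Y = ΦX with a sparse binary Φ can be sketched as follows. This is a minimal illustration, not the optimized design used in this study: the function names, the fixed number of ones per column (`ones_per_column`), and the random placement of the ones are all assumptions made for the example.

```python
import random


def sparse_binary_matrix(m, n, ones_per_column=2, seed=0):
    """Return an m x n matrix of 0/1 entries with a few ones in each column.

    Assumption: a fixed column weight with randomly placed ones. The text only
    states that the matrix contains very few nonzero entries.
    """
    rng = random.Random(seed)
    phi = [[0] * n for _ in range(m)]
    for col in range(n):
        for row in rng.sample(range(m), ones_per_column):
            phi[row][col] = 1
    return phi


def compress(phi, x):
    """Compute Y = Phi X. Every nonzero entry is 1, so only additions are needed."""
    return [sum(x[j] for j, bit in enumerate(row) if bit) for row in phi]


# Hypothetical 8-sample signal compressed to 4 measurements.
x = [0.1 * i for i in range(8)]
phi = sparse_binary_matrix(4, 8)
y = compress(phi, x)
```

Because each measurement is a sum of a few selected samples, no multiplications are required, which is the property that makes such a matrix attractive on a resource-limited device.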
BSBL algorithm for reconstructing acceleration data
In conventional CS reconstruction algorithms, the data must be sufficiently sparse in the time domain or in a transformed domain [11]. That is, the data X are sparsely represented as X = ψα, where \(\uppsi \in R^{N \times N}\) is the sparse basis and α is the corresponding vector of sparse representation coefficients. According to Eq. (1), the compressed data are Y = ΦX = Φψα. In CS theory, an estimate of the sparse coefficient vector α can be obtained by solving the following \(l_{1}\) optimization problem
$$\min \left\| \upalpha \right\|_{1} \quad {\text{subject}}\,{\text{to}}\quad Y = \Phi \uppsi \upalpha$$
(3)
Therefore, the reconstructed data \(\hat{X}\) are obtained from the estimated coefficients \(\hat{\upalpha}\), i.e. \(\hat{X} = \uppsi \hat{\upalpha} \approx X\). The detailed solution procedure can be found in [11].
Unlike the above reconstruction algorithms, the BSBL algorithm accurately reconstructs nonsparse data by exploiting block sparsity [17–19]. In this study, the acceleration data X are considered as a concatenation of a number of blocks, that is, \(X = [\underbrace{x_{1} , \ldots ,x_{h_{1}}}_{X_{1}^{T}}, \ldots ,\underbrace{x_{h_{l - 1} + 1} , \ldots ,x_{h_{l}}}_{X_{l}^{T}}]^{T}\), where l is the number of randomly partitioned blocks and \(h_{i}\) denotes the index of the last element of the ith block, so that the jth block has size \(d_{j} = h_{j} - h_{j - 1}\) (with \(h_{0} = 0\)). In the BSBL algorithm for reconstructing acceleration data, each block \(X_{j} \in R^{d_{j} \times 1}\) is assumed to follow a parameterized multivariate Gaussian distribution:
$$p\left( {X_{j} ;\uplambda_{j} ,b_{j} } \right) \sim N\left( {0,\uplambda_{j} b_{j} } \right),\quad j = 1,2, \ldots ,l$$
(4)
where the unknown nonnegative parameter \(\uplambda_{j}\) captures block sparsity, and \(\uplambda_{j} = 0\) means that the jth block is zero. The unknown parameter \(b_{j} \in R^{d_{j} \times d_{j}}\) is a positive definite matrix that describes the correlation among the elements within the jth block. Here, all blocks are assumed to be mutually uncorrelated, and the prior density of X is defined as
$$p\left( X;\left\{ \uplambda_{j} ,b_{j} \right\}_{j = 1}^{l} \right) \sim N\left( 0,\Sigma_{x} \right),\quad {\text{where}}\quad \Sigma_{x} = \begin{pmatrix} \uplambda_{1} b_{1} & 0 & \cdots & 0 \\ 0 & \uplambda_{2} b_{2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \uplambda_{l} b_{l} \end{pmatrix}$$
(5)
Since the acceleration data are contaminated by noise, the compressed data \(Y \in R^{M \times 1}\) are defined as
$$Y = \Phi X + Z$$
(6)
where \(Z \in R^{M \times 1}\) denotes the noise, which follows the Gaussian distribution N(0, ρI), where ρ is a positive scalar and \(I \in R^{M \times M}\) is the identity matrix. Given Y, the posterior density of X is the Gaussian distribution
$$p\left( X \mid Y;\uprho ,\left\{ \uplambda_{j} ,b_{j} \right\}_{j = 1}^{l} \right) = N\left( \upmu_{\Uptheta } ,\Sigma_{\Uptheta } \right)$$
(7)
where the mean is \(\upmu_{\Uptheta } = \Sigma_{x} \Phi^{T} \left( \uprho I + \Phi \Sigma_{x} \Phi^{T} \right)^{ - 1} Y\), and the covariance matrix is \(\Sigma_{\Uptheta } = \left( \Sigma_{x}^{ - 1} + \frac{1}{\uprho }\Phi^{T} \Phi \right)^{ - 1}\).
Thus, the acceleration data are reconstructed as \(\hat{X} \approx \upmu_{\Uptheta }\) once all parameters \((\uprho ,\uplambda_{j} ,b_{j})\) are available. In the BSBL framework, these parameters are obtained by learning algorithms [18, 19]. In this study, we select the bound-optimization BSBL algorithm (i.e. BSBL-BO) because of its faster convergence. The detailed solution procedure is presented in Appendix 1.
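To make the role of the parameters concrete, the sketch below evaluates the posterior mean for parameters that are assumed to be already known. It is not the BSBL-BO learning procedure of Appendix 1. The helper names are hypothetical, and a plain Gaussian elimination stands in for whatever linear solver an actual implementation would use.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def solve(a, rhs):
    """Solve a z = rhs by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - tail) / aug[i][i]
    return z


def block_diag_prior(lambdas, blocks):
    """Build the block-diagonal prior covariance diag(lambda_j * b_j)."""
    n = sum(len(b) for b in blocks)
    sigma = [[0.0] * n for _ in range(n)]
    offset = 0
    for lam, b in zip(lambdas, blocks):
        for i in range(len(b)):
            for j in range(len(b)):
                sigma[offset + i][offset + j] = lam * b[i][j]
        offset += len(b)
    return sigma


def posterior_mean(phi, y, sigma_x, rho):
    """Evaluate Sigma_x Phi^T (rho I + Phi Sigma_x Phi^T)^{-1} Y."""
    sigma_phi_t = matmul(sigma_x, transpose(phi))   # N x M
    gram = matmul(phi, sigma_phi_t)                 # M x M
    for i in range(len(gram)):
        gram[i][i] += rho                           # rho > 0 keeps this invertible
    z = solve(gram, y)
    return [sum(row[k] * z[k] for k in range(len(z))) for row in sigma_phi_t]
```

With two one-element blocks, `posterior_mean([[1.0, 1.0]], [3.0], block_diag_prior([1.0, 0.0], [[[1.0]], [[1.0]]]), 0.5)` returns `[2.0, 0.0]`: the block whose \(\uplambda_{j}\) is zero is forced to zero, which is how the parameter encodes block sparsity.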
For comparison, several conventional CS reconstruction algorithms were selected to validate the superior ability of the BSBL algorithm to reconstruct acceleration data. These algorithms include: (1) algorithms that do not use block structure, namely orthogonal matching pursuit (OMP) [28], basis pursuit (BP) [29], subspace pursuit (SP) [30], and smoothed \(l_{0}\) norm (SL0) [31]; and (2) algorithms that use block structure, namely dynamic group sparsity (DGS) [32], structured orthogonal matching pursuit (SOMP) [33], and Group Lasso [34], which consider only the prior knowledge of the block partition.
Evaluation criteria for the proposed CS scheme
In this study, the performance of our scheme is assessed according to the following common criteria [18, 20, 22, 23]:

1. Compression ratio (CR): it is used to quantitatively evaluate how strongly the acceleration data are compressed, and is defined as
$$CR = \frac{N}{M} \times 100 \, \%$$
(8)

2. Normalized mean square error (NMSE): it is employed to assess the reconstruction performance for acceleration data, and is defined as
$$NMSE(X,\hat{X}) = \frac{\left\| X - \hat{X} \right\|_{2}^{2}}{\left\| X \right\|_{2}^{2}}$$
(9)

3. Signal-to-noise ratio (SNR): it is used to measure the reconstruction quality of acceleration data, and is defined as
$$SNR = 10\log_{10} \frac{\sum\nolimits_{n = 1}^{N} X^{2}(n)}{\sum\nolimits_{n = 1}^{N} \left( X(n) - \hat{X}(n) \right)^{2}}$$
(10)

4. Pearson correlation coefficient: it is also used to evaluate the reconstruction performance by measuring the similarity between the raw data and the reconstructed data, and is defined as
$$R = \frac{\sum\nolimits_{i = 1}^{N} \left( X_{i} - \bar{X} \right)\left( \hat{X}_{i} - \bar{\hat{X}} \right)}{\sqrt{\sum\nolimits_{i = 1}^{N} \left( X_{i} - \bar{X} \right)^{2}} \sqrt{\sum\nolimits_{i = 1}^{N} \left( \hat{X}_{i} - \bar{\hat{X}} \right)^{2}}} \quad i = 1,2, \ldots ,N$$
(11)
where \(\bar{X}\) and \(\bar{\hat{X}}\) denote the mean values of the raw data and the reconstructed data, respectively.
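The three reconstruction criteria (NMSE, SNR and the Pearson correlation coefficient) can be computed directly from a raw signal and its reconstruction. The following is a minimal sketch with hypothetical function names. It assumes that the raw signal is not all zero, that the reconstruction is not exact (otherwise the SNR is unbounded), and that neither signal is constant.

```python
import math


def nmse(x, x_hat):
    """Normalized mean square error: ||x - x_hat||^2 / ||x||^2."""
    err = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return err / sum(a ** 2 for a in x)


def snr_db(x, x_hat):
    """Signal-to-noise ratio in dB: signal energy over error energy."""
    signal = sum(a ** 2 for a in x)
    noise = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return 10.0 * math.log10(signal / noise)


def pearson(x, x_hat):
    """Pearson correlation coefficient between raw and reconstructed data."""
    mean_x = sum(x) / len(x)
    mean_h = sum(x_hat) / len(x_hat)
    num = sum((a - mean_x) * (b - mean_h) for a, b in zip(x, x_hat))
    den_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    den_h = math.sqrt(sum((b - mean_h) ** 2 for b in x_hat))
    return num / (den_x * den_h)
```

Note that, from the two definitions, SNR = −10 log10(NMSE), so the two criteria rank reconstructions identically and differ only in scale.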
Gait classification models based on the reconstructed acceleration data
Three commonly used gait classification models, SVM, MLP, and KStar, were selected to further test whether the acceleration data reconstructed with high fidelity are suitable for gait monitoring. The SVM classification model is derived from Vapnik–Chervonenkis theory and structural risk minimization [25]. Its basic idea is that the data to be classified are first mapped into a high-dimensional feature space via a kernel function, and an optimal separating hyperplane is then constructed between the classes in the mapped space. The multilayer perceptron (MLP) is an effective classification model based on artificial neural networks, and it usually uses the back-propagation algorithm to train the classification model [26]. KStar is a common classification model based on the k-nearest neighbors framework, and its basic idea is to measure the difference between input samples using an entropy-based distance [27]. Detailed descriptions of these classification algorithms can be found in [25–27], respectively.
In order to improve classification performance, gait features carrying more discriminative information are extracted from the reconstructed acceleration data using some common statistical parameters [5, 9, 10, 13, 14]: standard deviation (SD), skewness, kurtosis and Pearson correlation coefficient. These parameters are defined as follows:
$$SD = \sqrt{\frac{1}{N}\sum\limits_{i = 1}^{N} \left( \hat{X}_{i} - \bar{\hat{X}} \right)^{2}}$$
(12)
$$Skewness = \frac{N\sum\nolimits_{i = 1}^{N} \left( \hat{X}_{i} - \bar{\hat{X}} \right)^{3}}{\left( N - 1 \right)\left( N - 2 \right)SD^{3}}$$
(13)
$$Kurtosis = \left( \frac{N\left( N + 1 \right)}{\left( N - 1 \right)\left( N - 2 \right)\left( N - 3 \right)}\frac{\sum\nolimits_{i = 1}^{N} \left( \hat{X}_{i} - \bar{\hat{X}} \right)^{4}}{SD^{4}} \right) - \frac{3\left( N - 1 \right)^{2}}{\left( N - 2 \right)\left( N - 3 \right)}$$
(14)
In addition, the Pearson correlation coefficient is defined in Eq. (11).
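The statistical features of Eqs. (12)–(14) can be computed from one window of reconstructed data as sketched below. The function name and the idea of passing a single window as a list are assumptions for illustration. The formulas need at least four samples and a non-constant window, since N − 3 and SD appear in denominators.

```python
import math


def gait_features(x_hat):
    """Return (SD, skewness, kurtosis) of one window of reconstructed data."""
    n = len(x_hat)
    if n < 4:
        raise ValueError("at least four samples are required")
    mean = sum(x_hat) / n
    dev = [v - mean for v in x_hat]

    # Eq. (12): standard deviation with a 1/N normalization.
    sd = math.sqrt(sum(d ** 2 for d in dev) / n)

    # Eq. (13): skewness.
    skewness = n * sum(d ** 3 for d in dev) / ((n - 1) * (n - 2) * sd ** 3)

    # Eq. (14): kurtosis with its correction term.
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    kurtosis = scale * sum(d ** 4 for d in dev) / sd ** 4 - correction

    return sd, skewness, kurtosis
```

For a symmetric window such as `[1, 2, 3, 4, 5]` the skewness is zero, which gives a quick sanity check when wiring such features into a classifier.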
Also, the prediction ability of gait classification model is evaluated according to three statistical measures [16, 35]: Accuracy (Acc), Sensitivity (Sen), and Specificity (Spe). The above measures are defined as follows:
$$Accuracy = \frac{TP + TN}{TP + FP + TN + FN} \times 100 \, \%$$
(15)
$$Sensitivity = \frac{TP}{TP + FN} \times 100 \, \%$$
(16)
$$Specificity = \frac{TN}{TN + FP} \times 100 \, \%$$
(17)
where TP is the number of true positives, FP is the number of false positives, TN is the number of true negatives, and FN is the number of false negatives.
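As a small worked illustration of Eqs. (15)–(17), the function below computes the three measures from confusion-matrix counts. The function name and the counts in the usage note are hypothetical and are not results from this study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) as percentages."""
    accuracy = (tp + tn) / (tp + fp + tn + fn) * 100.0
    sensitivity = tp / (tp + fn) * 100.0
    specificity = tn / (tn + fp) * 100.0
    return accuracy, sensitivity, specificity
```

For example, with hypothetical counts TP = 40, FP = 5, TN = 45 and FN = 10, the accuracy is 85 %, the sensitivity is 80 % and the specificity is 90 %.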