Estimating metabolic equivalents for activities in daily life using acceleration and heart rate in wearable devices

Background Herein, we propose an algorithm that can be used in wearable health monitoring devices to estimate metabolic equivalents (METs) based on physical activity intensity data, particularly for certain activities in daily life for which MET estimation is difficult. Results Energy expenditure data were obtained from 42 volunteers using indirect calorimetry, together with triaxial accelerations and heart rates. The proposed algorithm used the percentage of heart rate reserve (%HRR) and the acceleration signal from the wearable device to divide the data into a middle-intensity group (MIG) and a high-intensity group (HIG). The two groups were defined in terms of estimated METs. Evaluation results revealed that the classification accuracy for both groups was higher than 91%. To further facilitate MET estimation, five multiple-regression models using different features were evaluated via leave-one-out cross-validation. Using this approach, all models showed significant improvements in mean absolute percentage error (MAPE) of METs in the HIG, which included stair ascent, and the maximum reduction in MAPE for HIG was 24% compared to the previous model (HJA-750), which demonstrated a 70.7% improvement ratio. The most suitable model for our purpose, which utilized heart rate and filtered synthetic acceleration, was selected and its estimation error trend was confirmed. Conclusion For HIG, the MAPE recalculated by the most suitable model was 10.5%. The improvement ratio was 71.6% as compared to the previous model (HJA-750C). This result was almost identical to that obtained from leave-one-out cross-validation. The proposed algorithm improved the estimation accuracy for activities in daily life; in particular, the results included estimated values associated with stair ascent, which has been a difficult activity to evaluate so far.


Background
Increasing physical activity is particularly important to prevent lifestyle diseases. Accurate monitoring of physical activity intensity (PAI) during daily lifestyle activities has gained popularity. To obtain accurate measurements, the collection of continuous and long-term PAI data is crucial. Coleman et al. [1] found that variation in long-term step monitoring can quantify differences resulting from changes in health status, and that long-term continuous PAI data collection and step counting is an effective method for monitoring and evaluating improvements in lifestyle. Such data are also useful to improve health guidance to facilitate the prevention of lifestyle diseases. Accordingly, a strong ongoing demand exists for devices that can monitor lifestyle activities in terms of PAI data.
Several previous studies have examined specific methods for monitoring PAI data, which have led to the introduction of various devices to collect related information [2][3][4][5][6]. Most of these studies, including our previous work [7,8] that specifically proposed an estimation algorithm for PAIs for household activities, have explored monitoring devices and algorithms by examining the use of acceleration data. This previous research laid the groundwork for monitoring daily activities using PAI data. However, important shortcomings related specifically to small wearable monitoring devices must still be addressed.
A significant difficulty is that the energy expenditure (EE) of certain activities cannot be estimated from acceleration data alone, which makes it difficult to calculate the PAI of such activities accurately. There are two major types of such activities. The first type includes activities with small body movements but a large PAI, such as cycling or muscle training. The other type includes activities accompanied by an elevation change, such as hiking or stair ascent. The PAIs of these activities tend to be underestimated. An example is 'stair ascent', which produced an error rate of − 60.6% in our previous study [8]. Crouter et al. reported that the error rate of one device (ActiGraph Model 7164; ActiGraph LLC, Pensacola, FL) that uses the Freedson MET Equation is − 38.3% for stair ascent and descent [9]. As previously mentioned, monitoring the PAI during daily-life activities with high accuracy is important in the prevention of many lifestyle diseases, and the EE estimation accuracy of these activities must be improved, especially with respect to wearable monitoring devices.
Another major objective of PAI monitoring is to collect data continuously over long periods. Wearable monitoring devices must therefore be suitable for this purpose. When monitoring PAI during daily activities, it will eventually be possible to confirm the results in real time. Hence, wearable devices must independently obtain signals from sensors, process those signals and estimate PAI. Therefore, estimating PAI using a simple approach with minimal cost and power consumption is important if the estimation algorithm is to be integrated into wearable devices.
Various methods have been proposed to improve the PAI accuracy for activities that are difficult to estimate. In previous studies [10][11][12][13][14], the number of PA classifications was increased using multiple sensors and machine learning algorithms such as support vector machine (SVM), and as a result, the accuracy of PAI estimation was improved. For example, Cvetković et al. [10] reported that 30 parameters were obtained from the accelerometer and used in their algorithm. By doing so, the classification numbers and estimation accuracy could be increased. However, the SVM approach is difficult to integrate into wearable devices without support from cloud computing. Therefore, it is apparently not suitable for our purpose because wireless communication with cloud-based analytical software requires a significant amount of additional power consumption.
Another approach is to combine an accelerometer with other sensors, such as a barometer [15][16][17][18][19][20][21][22][23]. Algorithms that use a barometer to measure changes in elevation (i.e. altitude) have been reported by Ohtaki and Voleno [18,19] and others [20,21]. By using elevation information to refine the estimation formula and the PA classification, these algorithms can improve the accuracy of EE estimation for activities that involve a change in elevation, such as stair ascent. However, a barometer typically consumes considerable power, and the method does not work for other types of activities, such as cycling, because the change in elevation is gradual. Therefore, these conventional methods are not practical for accurate PAI monitoring in daily life, especially in wearable monitoring devices.
Another method uses heart rate data to improve accuracy, as reported by Crouter and Li [22,23]. A strong correlation exists between PAI and heart rate, and consequently, EE estimation accuracy can be increased. Additionally, the power consumption required to monitor heart rate is very small. For example, Izumi et al. proposed a system-on-a-chip (SoC) sensor which enables heart rate inter-beat interval (RRI) measurements with very low power consumption [24]. Other groups have also proposed low-power-consumption SoCs for measuring heart rate [25,26]. These studies have confirmed that the marginal increase in power consumption from the addition of a heart rate sensor to a wearable device can be minimized. Furthermore, heart rate data provide valuable biological information related to a person's health levels, with measurements providing a wide range of useful information beyond PAI estimates, and the addition of a heart rate sensor to a wearable device can thus provide a wide range of benefits. Given this range of benefits, we developed a simple algorithm that combines heart rate and acceleration data.
Crouter et al. reported an error of − 20.5% [22] in the estimation of the EE related to stair ascent by using acceleration and heart rate data. This error is relatively large for actual application. Therefore, this paper addresses the development of the PAI monitoring algorithm, which can improve the estimation accuracy for activities that have previously shown EE estimation difficulty, such as stair ascent. For continuous and long-term monitoring, the proposed algorithm must be embedded in wearable devices. Therefore, we propose a simple algorithm that combines heart rate and acceleration data with a decision tree and multiple regression analysis. The algorithm proposed in this paper is expanded from the algorithm using only accelerations described previously in the literature [7,8]. The heart rate information is used to resolve difficulties associated with applications that are based solely on acceleration data.

Methods
This study represents two major improvements over our previous study [8]. First, the number of locomotive activity classification groups has been increased from one to two. Second, the number of parameters used in our multiple-regression models for estimating PAI data has also been increased. We also conducted experiments for the development and evaluation of the estimation algorithms proposed in the present study.

Signal processing
This paper presents a classification algorithm for increasing the number of classification groups used in locomotive activities. The algorithm uses the following three indices: filtered synthetic acceleration (ACC fil ), ratio of unfiltered synthetic acceleration to filtered synthetic acceleration (RUF) and the percentage of heart rate reserve (%HRR). This section describes the method used for signal processing of acceleration data and heart rates to calculate these three indices.

Triaxial acceleration
The measured triaxial acceleration is processed in a manner similar to that described in our previous study [8]. First, signals from a triaxial accelerometer were run through a high-pass filter with a 0.7 Hz cut-off frequency to remove the gravitational acceleration component, for reasons presented in our earlier report [7]. Fast Fourier transform analysis revealed that for locomotive activities, peak power appeared at a frequency of 1.0 Hz or higher. The peak frequency and walking pace also increased proportionally. For household activities, the peak power appeared at 1.0 Hz or less. The mean frequency of the peak was 0.29 ± 0.19 Hz. The peak value of household activities is strongly influenced by the gravitational acceleration component because of a change in body position. If this influence is removed, the peak value in household activities becomes 1 Hz or higher [7]. Therefore, the cut-off frequency was set as 0.7 Hz (mean + 2SD) to remove the influence of gravitational acceleration on household activities and ensure that acceleration signals during locomotive activities were not affected. Subsequently, the synthetic acceleration along three axes, the anteroposterior axis (X), mediolateral axis (Y) and vertical axis (Z), with the vector magnitude equal to $\sqrt{X^2 + Y^2 + Z^2}$, was calculated using raw (unfiltered) acceleration signals and the values were then run through a high-pass filter. Finally, the ratio of unfiltered to filtered signals was calculated to classify the activities as household or locomotive activities. ACC fil is defined as the mean value of the synthetic acceleration obtained from the filtered signal during each activity, and is calculated by averaging the mean values of the synthetic acceleration every 10 s.
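The processing chain described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the first-order high-pass filter, the sampling-rate argument `fs_hz`, the order of operations (filter each axis, then take the vector magnitude) and all function names are hypothetical, because the text does not specify them.

```python
import math


def high_pass(samples, fs_hz, cutoff_hz=0.7):
    """First-order high-pass filter (assumed filter type; only the 0.7 Hz cut-off is from the text)."""
    if not samples:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / fs_hz
    alpha = rc / (rc + dt)
    out = [0.0]  # start at zero so a constant (gravity) offset is rejected from the first sample
    for i in range(1, len(samples)):
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out


def magnitude(x, y, z):
    """Synthetic acceleration: sqrt(X^2 + Y^2 + Z^2) per sample."""
    return [math.sqrt(a * a + b * b + c * c) for a, b, c in zip(x, y, z)]


def window_means(values, fs_hz, window_s=10):
    """Mean of each complete, non-overlapping 10 s window."""
    n = int(fs_hz * window_s)
    if n <= 0 or len(values) < n:
        raise ValueError("need at least one complete window of samples")
    return [sum(values[i:i + n]) / n for i in range(0, len(values) - n + 1, n)]


def acc_features(x, y, z, fs_hz):
    """Return (ACC_fil, RUF) for one activity recorded on three axes."""
    unfiltered = magnitude(x, y, z)
    filtered = magnitude(high_pass(x, fs_hz), high_pass(y, fs_hz), high_pass(z, fs_hz))
    unf_means = window_means(unfiltered, fs_hz)
    fil_means = window_means(filtered, fs_hz)
    acc_unf = sum(unf_means) / len(unf_means)
    acc_fil = sum(fil_means) / len(fil_means)
    ruf = acc_unf / acc_fil if acc_fil > 0 else float("inf")
    return acc_fil, ruf
```

With a signal that contains only a constant gravity component, the filtered magnitude is zero and RUF is reported as infinite; a real device would need to decide how to treat that case.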

Percentage heart rate reserve (%HRR)
Our methodology uses %HRR, as defined in Eq. (1), which was obtained from the available literature [27].
$$\%\mathrm{HRR} = \frac{\mathrm{HR}_{\mathrm{act}} - \mathrm{HR}_{\mathrm{rest}}}{\mathrm{HR}_{\mathrm{max}} - \mathrm{HR}_{\mathrm{rest}}} \times 100 \qquad (1)$$
The heart rate during activity (HR act ) represents the mean value of the average heart rate every 10 s during activities, while the heart rate at rest (HR rest ) is defined as the mean value of the average heart rate over a resting period of 7 min. The maximum heart rate (HR max ) is calculated based on the Karvonen formula as
$$\mathrm{HR}_{\mathrm{max}} = 220 - \mathrm{Age} \qquad (2)$$
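The %HRR calculation can be illustrated with a short sketch. It assumes the standard heart-rate-reserve definition (activity heart rate minus resting heart rate, divided by maximum heart rate minus resting heart rate, times 100) together with the age-based maximum heart rate given in the text; the function names are hypothetical.

```python
def max_heart_rate(age_years):
    """Age-predicted maximum heart rate, HR_max = 220 - Age."""
    return 220.0 - age_years


def percent_hrr(hr_act, hr_rest, age_years):
    """Percentage of heart rate reserve for one activity."""
    reserve = max_heart_rate(age_years) - hr_rest
    if reserve <= 0:
        raise ValueError("resting heart rate must be below the predicted maximum")
    return (hr_act - hr_rest) / reserve * 100.0
```

For example, a 40-year-old subject with a resting heart rate of 60 bpm and an activity heart rate of 120 bpm has a reserve of 120 bpm and a %HRR of 50%.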

Algorithm for physical activity classification
The first major improvement is to increase the number of classification groups. The previous decision tree classifies the physical activities into three groups, namely sedentary, household and locomotive [8]. In this study, a new node that classifies locomotive activities into a middle-intensity group (MIG) or a high-intensity group (HIG) was appended to the previous decision tree. Sedentary is defined as an activity which has a near-resting energy expenditure, such as sitting. Household activity is defined as a PA excluding locomotion, which nevertheless shows over one MET, such as vacuuming and washing dishes. Although our pioneering study assessed the possibility of increasing the number of classifications within the locomotive activity group [28], this study expanded that concept by further classifying locomotive activities into MIG and HIG activities. These groups were classified using %HRR in the proposed decision tree to utilize the correlation between PAI and %HRR.
We define MIG activities as activities involving six METs or fewer, such as walking at a normal speed. HIG activities are defined as activities involving more than six METs, such as jogging and stair ascent. This value (six METs) reflects a cut-off value in the guidelines established by the American College of Sports Medicine [29]. Figure 1 shows the decision tree used for our PA group classifications.
In the previous study, locomotive activity was separated from the other activity groups using ACC fil and RUF. In this study, locomotive activity is further divided into two groups by %HRR.

Estimation of METs as EE
The second improvement involves an increase in the number of independent variables used in the multiple-regression model for PAI estimation. This paper proposes five multiple-regression models using software (SPSS Statistics 24, SPSS24; IBM Corp., Armonk, NY). The forced entry method was employed to confirm the influence of quantities of certain parameters used in the respective regression models to obtain the estimation results. Table 1 presents the parameters used in each of the proposed models. In this study, four parameters were selected, including acceleration, %HRR, body mass index (BMI) and weight. BMI was used not only as the weight value but also as a characteristic representing body shape.
The accuracies of the proposed models were compared based on the mean absolute percentage error (MAPE) of estimated MET values produced by the leave-one-out cross-validation. In this process, the data from one subject serve as the validation set and the data from the remaining subjects serve as the training set. The cross-validation process was repeated for every subject, with each subject used only once as the validation set. The results from the respective subjects were averaged to produce a single estimate. Based on a comparison of the results, the most suitable multiple-regression model was selected for our study.
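The validation procedure can be sketched as below. This is an illustrative, dependency-free version under assumptions: the study fitted its models in SPSS, whereas here an ordinary least-squares fit is solved through the normal equations, and the record layout `(subject_id, features, measured_mets)` and all function names are hypothetical.

```python
def fit_ols(rows, targets):
    """Least-squares fit of targets ~ intercept + features via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y]
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * t for x, t in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coefs, row):
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


def loso_mape(records):
    """Leave-one-subject-out MAPE (%), averaged over subjects.

    records: list of (subject_id, features_tuple, measured_mets).
    """
    subjects = sorted({s for s, _, _ in records})
    per_subject = []
    for held_out in subjects:
        train = [(f, m) for s, f, m in records if s != held_out]
        test = [(f, m) for s, f, m in records if s == held_out]
        coefs = fit_ols([f for f, _ in train], [m for _, m in train])
        errors = [abs(predict(coefs, f) - m) / m * 100.0 for f, m in test]
        per_subject.append(sum(errors) / len(errors))
    return sum(per_subject) / len(per_subject)
```

Holding out all observations of a subject at once keeps that subject's data out of the training set, which matches the description of each subject serving once as the validation set.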

Experimental methods
To develop and evaluate the proposed classification algorithm and the EE estimation model, we measured the METs, triaxial acceleration and RRIs of volunteer test subjects as they performed various activities (Fig. 2a).

Subjects
A total of 42 volunteers participated in the experiments for this study, which were conducted at the National Institute of Health and Nutrition (NIHN) in Tokyo, Japan, following the guidelines laid down in the Declaration of Helsinki. All procedures involving human subjects were approved by the Ethical Committees of NIHN and by Omron Healthcare Co., Ltd. Subjects were excluded from the study if they showed any contraindication to exercise, or if they were physically unable to complete an activity. Approximately five subjects were chosen from each 10-year age range and gender (Table 2).
The weights and heights of all subjects were measured to produce a BMI for each. The age, gender, height, weight and BMI distributions of the test subjects are listed in Table 2. Details related to the purposes and procedures involved in this study were explained to subjects before measurements were taken, with prior written informed consent obtained from all subjects.

Experimental setup for data measurement
The study required the monitoring of triaxial acceleration, RRI and EE for various physical activities conducted for a set period. Test subjects performed 23 distinct activities (including resting in a seated position), during which their triaxial acceleration and RRI values were recorded using the Health Patch MD (Vital Connect Inc., San Jose, CA), a device developed for 24-h monitoring. In this study, the RRIs recorded by the Health Patch MD were converted into heart rates (beats per minute), which were used to calculate %HRR.
A clinical validation of this device has been provided in a previous report [30]. In addition, an activity monitor (HJA-750C; Omron Healthcare Co., Ltd., Kyoto, Japan) was positioned at each subject's waist. Figure 2b shows the location of these devices. During each activity, the subjects' exhaled respiration was collected in a Douglas bag. The EE values were estimated from the volumes of oxygen and carbon dioxide, reported as VO2 and VCO2, respectively, using Weir's equation [31]. For reference, the MET values were calculated by dividing the EE during the activities by the measured resting metabolic rate. This study specifically addressed eight activities, namely stair ascent/descent, walking (three speeds), walking with load (two patterns) and jogging, all of which were identified as locomotive activities. These experiments were conducted in a controlled laboratory setting. HR rest was calculated as the average heart rate over 10 min of sitting, excluding the first 3 min. For six of the activities (excluding stair ascent and descent), the subjects were instructed to walk at a speed determined using a pace leader. Stair ascent and descent were evaluated with the subjects selecting their own speed. Table 3 shows the speed and time for each of the eight activities. Table 4 shows averages and standard deviations (SDs) of METs and %HRR for each activity; the respective numbers of test subjects for the eight activities are also listed. Measurements for two of the subjects could not be conducted for any activity. Measurement failures occurred during a number of activities for several subjects; these results were excluded from the study. Stair ascent and jogging, which had MET values of more than six, were defined as HIG activities. Because the other six activities always have MET values less than six, they were defined as MIG activities.
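The reference MET calculation can be sketched as follows. The coefficients are those of the commonly cited abbreviated Weir equation, which neglects urinary nitrogen; the text does not state which variant was used, so this choice, the units (litres per minute in, kcal per minute out) and the function names are assumptions.

```python
def weir_ee_kcal_per_min(vo2_l_min, vco2_l_min):
    """Abbreviated Weir equation (assumed variant): EE = 3.941*VO2 + 1.106*VCO2."""
    return 3.941 * vo2_l_min + 1.106 * vco2_l_min


def reference_mets(ee_activity, ee_rest):
    """METs as activity EE divided by the measured resting metabolic rate."""
    if ee_rest <= 0:
        raise ValueError("resting energy expenditure must be positive")
    return ee_activity / ee_rest
```

Because the MET value is a ratio, any consistent unit for the two EE values gives the same result.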

Measurement results
The ACC fil and %HRR measurement results for the eight locomotive activities are presented in Fig. 3. Table 5 presents the statistical results of the METs, ACC fil and %HRR analyses. The three indices were analysed using a linear mixed-effects model in SPSS Statistics 24. All three indices differed significantly between MIG and HIG (pairwise comparisons, p < 0.05). Figure 3a shows the relation between MET and ACC fil results for each activity. In MIG, the mean − 3SD of ACC fil was 25.3 [mG] and the mean + 3SD was 562.8 [mG]. The stair ascent results followed the same distribution trends as those reported in our previous study [8]. The distribution of the latter is depicted in Fig. 3a.
The ACC fil of MIG correlates with the METs (r = 0.680), as the ACC fil of HIG does (r = 0.688). Figure 3b presents the relation between METs and %HRR. The mean + 3SD of the %HRR for HIG was 102.15%, whereas the mean − 3SD of %HRR for HIG was 13.11%. The %HRR result of HIG activity had a relatively large SD (= 14.34), as shown in Table 5. However, the HIG average %HRR was more than twice as large as the average %HRR for MIG. Additionally, there were two distributions separated at around 40%, and correlation existed between %HRR and METs of the MIG (r = 0.756) and HIG (r = 0.573).

Classification and MET Estimation Result
We used a decision tree to further classify the locomotive activity into two groups according to %HRR. For PA classification, this study first identified the %HRR values for which the classification accuracy of MIG and HIG was the highest. Figure 4 shows the relation between %HRR and the classification accuracy of MIG and HIG. As a result, the %HRR with the highest observed classification accuracy was 40.15%. This was close to the boundary value between Light and Moderate (%HRR = 40%) for classifying activities by %HRR established in the ACSM guidelines. For this reason and for simplification, the threshold for HIG and MIG was set at 40% instead of 40.15%. Table 6 shows the classification accuracy for each activity. For activities in MIG, 91.6% were correctly classified. For activities in HIG, 94.3% were correctly classified, while 5.7% were misclassified into MIG.
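The threshold search and the resulting %HRR node can be sketched as below. The text does not state which accuracy measure was maximised, so this sketch assumes the mean of the per-group accuracies; the comparison direction at exactly the threshold (`>=` assigns HIG), the sample layout and the function names are also assumptions.

```python
def classify_locomotive(hrr_percent, threshold=40.0):
    """Split a locomotive activity into MIG or HIG by %HRR (40% threshold from the text)."""
    return "HIG" if hrr_percent >= threshold else "MIG"


def group_accuracies(samples, threshold):
    """Per-group classification accuracy.

    samples: list of (hrr_percent, true_group), true_group in {"MIG", "HIG"};
    both groups must be present.
    """
    correct = {"MIG": 0, "HIG": 0}
    total = {"MIG": 0, "HIG": 0}
    for hrr, group in samples:
        total[group] += 1
        if classify_locomotive(hrr, threshold) == group:
            correct[group] += 1
    return {g: correct[g] / total[g] for g in total}


def best_threshold(samples, candidates):
    """Candidate threshold with the highest mean per-group accuracy (first one on ties)."""
    def score(t):
        acc = group_accuracies(samples, t)
        return (acc["MIG"] + acc["HIG"]) / 2.0
    return max(candidates, key=score)
```

Averaging the two per-group accuracies keeps the larger group from dominating the choice of threshold; a different criterion could move the optimum slightly.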
As described in the "Estimation of METs as EE" section, validation results were produced by the leave-one-out cross-validation method. This approach involves utilizing one part of the observed data as the validation set and the remaining observations as the training set.
Fig. 3 Measurement results from the eight activities. a Relation between ACC fil and METs. r HACC is the correlation coefficient of the relation between ACC fil and METs in HIG, and r MACC is the correlation coefficient of MIG. b Relation between %HRR and METs. The correlation coefficient of the relation between %HRR and METs is r HHR for HIG and r MHR for MIG
Based on the results of the leave-one-out cross-validation, Prop. 03 was selected as the most suitable model for the purpose of this paper. Prop. 03 uses the smallest number of features and therefore requires the least computation, although there was no clear difference between the MAPE values for Props. 04 and 05. To confirm the error trends of Prop. 03, the multiple-regression models for MIG and HIG were recalculated using all measured data as the training data. The resulting regression equations for MIG and HIG are shown, respectively, in Eqs. (3) and (4) below. Figure 5 shows the relation between the estimation error and the METs, which was estimated using Prop. 03 and the previous model (HJA-750). The solid lines depict the mean values of MIG and HIG, while the dashed lines represent 95% prediction intervals (PI) of error. Table 8 presents the statistical results of the estimation error. The MIG result using the proposed model showed a small fixed bias because the 95% confidence interval (CI) of the average, which ranged from 0.07 to 0.31, did not include zero. However, there was no proportional error (r = 0.005). The HIG results obtained using the proposed model showed no fixed bias because the 95% CI of the average, which ranged from − 0.48 to 0.21, included zero. However, there was a proportional error (r = − 0.543).
Some stair ascent results showed errors that were less than − 3.0 METs, although all errors corresponding to the results of the previous model were less than − 3.0 METs.
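The checks for fixed bias and proportional error can be sketched as follows. This is a hedged illustration: it assumes that the error is the estimated minus the measured METs, that proportional error is judged from the Pearson correlation between the error and the measured METs, and that a normal approximation (z = 1.96) is adequate for the 95% CI of the mean; none of these details is stated in the text, and the function names are hypothetical.

```python
import math
import statistics


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def error_summary(estimated, measured, z=1.96):
    """Mean error, its approximate 95% CI, a fixed-bias flag and the error-vs-METs correlation."""
    errors = [e - m for e, m in zip(estimated, measured)]
    mean_error = statistics.fmean(errors)
    half_width = z * statistics.stdev(errors) / math.sqrt(len(errors))
    low, high = mean_error - half_width, mean_error + half_width
    return {
        "mean_error": mean_error,
        "ci95": (low, high),
        "fixed_bias": not (low <= 0.0 <= high),  # CI excluding zero indicates a fixed bias
        "r_error_vs_mets": pearson_r(measured, errors),
    }
```

A CI that excludes zero flags a fixed bias, while a correlation clearly different from zero indicates that the error changes with intensity, which is the pattern reported for HIG.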
The results obtained using our previous model and the proposed model are compared in Table 9. The MAPE values of MIG and HIG in Table 9 were calculated from all data defined in their groups. The results showed that the MAPE and MPE obtained using the proposed model were improved compared to the previous algorithm for the MIG.
Additionally, we confirmed the error distribution based on the classification results presented in Fig. 6. The PI of the correctly classified data for the MIG ranged from − 1.036 to 1.033 METs, whereas the PI for the HIG ranged from − 2.32 to 2.28 METs. Large errors were found for both groups when the results were incorrectly classified. The MIG errors had a mean value of 2.29 METs, whereas the HIG errors had a mean of approximately − 3 METs.
In this paper, MAPE (i.e. the estimation accuracy) was compared with that of other algorithms that use a random forest and REPTree. These are machine learning methods which use decision trees. The results are shown in Table 10. Although the data used for estimation and the number of activities were different, the results showed no clear differences between the proposed model and the others. The result of Luštrek [11] is better than the proposed result. However, as described in Table 9, the MAPE of jogging using the proposed model is 11.66% and can be confirmed to be lower than the result of Luštrek [11].
In addition, the estimation accuracy during stair ascent was compared with other algorithms, as an example of activities difficult to estimate. Table 11 lists the results compared with those obtained using other algorithms, which indicates that the proposed model has an advantage over other estimation methods, excluding the method of Wang. Note that the proposed method was realized with fewer features, and the difference in accuracy was only 0.3%.

Discussion
This paper proposes an algorithm to estimate METs for daily-life activities, including those for which MET estimation is difficult, such as stair ascent. Table 9 shows the MAPE and MPE of the respective activities. The proposed algorithm achieved higher accuracy in both MIG and HIG compared to the previous algorithm available in the commercial product HJA-750. MAPE decreased by 0.43% in MIG and 26.45% in HIG, in comparison with the previous model. Specifically, the estimation accuracy of stair ascent activity was clearly increased. In terms of the stair ascent activity, MAPE was 9.61%. However, the MAPE of METs with the previous model (HJA-750C) was 58.70%. The MAPE of the proposed model was estimated to be 49.09% smaller than that of the previous model, which is an improvement of approximately 84%. These results demonstrate the superiority of the proposed model for accurate MET estimation for HIG, including that for the stair ascent activity. We confirmed the error derived from comparing the proposed model and our previous model (HJA-750C) for MET estimation. The relation between METs and error rates was confirmed when all data were used as training data. Figure 5c, d show the relation between the error rate and METs for HIG. These figures clarify that the use of the proposed model results in a significant improvement over the previous model with respect to estimation accuracy. This improvement is supported by the data in Table 8, where 95% PI of the proposed model ranged from − 2.68 to 2.41, whereas the PI obtained using the previous model ranged from − 7.44 to 2.73. These results indicated a clear improvement over the previously used algorithm.
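The error measures and the improvement ratio quoted above can be reproduced with a short sketch. The definitions follow the usual formulas for MAPE and MPE, and the improvement ratio is taken as the relative reduction in MAPE, which is consistent with the figures reported for stair ascent; the function names are hypothetical.

```python
def mape(estimated, measured):
    """Mean absolute percentage error (%)."""
    terms = [abs(e - m) / m for e, m in zip(estimated, measured)]
    return sum(terms) / len(terms) * 100.0


def mpe(estimated, measured):
    """Mean (signed) percentage error (%); negative values indicate underestimation."""
    terms = [(e - m) / m for e, m in zip(estimated, measured)]
    return sum(terms) / len(terms) * 100.0


def improvement_ratio(previous_mape, proposed_mape):
    """Relative reduction in MAPE (%) with respect to the previous model."""
    return (previous_mape - proposed_mape) / previous_mape * 100.0
```

For stair ascent, `improvement_ratio(58.70, 9.61)` gives about 83.6%, in line with the improvement of approximately 84% stated in the text.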
The results obtained in this study were evaluated through comparisons with previous algorithms [9,11,12,19,22,32]. Table 10 shows the results of our comparison with other algorithms using machine learning. There were no clear differences using the proposed model. Although the comparison involves different datasets, it was possible to obtain an estimation accuracy similar to that of machine learning methods, and we accomplished this with a very simple algorithm. These results strongly suggest that the proposed algorithm represents a significant improvement over other versions and methods. Table 11 presents the MPE results for stair ascent and descent. An improvement in accuracy of about 10% was observed compared to the result of the Actiheart combined activity and HR algorithm, and the MPE of the proposed model was lower by about 2% than that of the ActiGraph 2-regression model. Therefore, the proposed model demonstrates better estimation accuracy than other algorithms that use acceleration and heart rate data. Similar results were found for the MPE in stair ascent compared to other algorithms that use an accelerometer and a barometer. The result reported by Voleno et al. [19] was 6.6%. The MPE of the proposed model was lower. The algorithm reported by Wang et al. [32] was slightly better than our proposed model, by about 0.3%. According to Table 4, the METs during stair ascent were about seven and the difference between the proposed model and Wang's [32] amounted to about 0.08. Thus, there was very little difference between the proposed algorithm and Wang's algorithm [32]. In addition, Wang's algorithm [32] used 13 parameters, while our proposed algorithm used only two parameters and a simple decision tree. Therefore, the processing overhead required by our proposed algorithm was less than that of the algorithm put forward by Wang [32]. In addition to these advantages, our proposed algorithm can be implemented with relatively low power consumption.
Considering the causes of error, this study confirmed the relation between error rate and classification results. As depicted in Fig. 6, a large error rate in MIG (over two METs) was traceable to misclassification issues. Assuming that these can be eliminated, the revised 95% prediction interval would be − 1.04 to 1.03, which is better than that obtained with HJA-750C (− 1.38 to 1.40). Therefore, it is clear that the most effective means of improving error rates using the proposed model is to suppress misclassification. To achieve this, it is necessary to revise the equation of %HRR calculation and to add individual adjustments.
The result of the classification accuracy is described in Table 6. The average classification accuracy was higher than 91% for both MIG and HIG. Thus, one can reasonably conclude that our proposed decision tree is appropriate. However, the classification result of brisk walking was relatively low (76.7%) compared to those of other activities. The results showed that the %HRR trends for misclassified subjects differed from other results that were classified correctly. To ascertain the cause of these %HRR trend differences, physical information related to misclassification of subjects who were walking briskly was examined more closely. It was found that five out of the seven misclassified subjects were female, even though the male/female ratio for all test subjects was 50%. These results suggest that sex has a strong effect on %HRR.
To address this issue, it is necessary to consider sex in the %HRR calculation, as reported by Whyte et al., who used equations for calculating the maximum heart rate that considered sex and age [33]. For example, in the case of sedentary females, Eq. 5 below was used.
The adoption of this equation makes it possible to improve the maximum heart rate accuracy estimation utilized in %HRR calculation.
In addition, subjects with a BMI of over 25 were 30% of all subjects, while those with a BMI over 25 (for brisk walking) were misclassified 57% of the time. This result suggests that BMI also affects %HRR accuracy and that adjusting %HRR to reflect individual BMI values would also improve the classification accuracy.