Dependency criterion based brain pathological age estimation of Alzheimer’s disease patients with MR scans
BioMedical Engineering OnLine volume 16, Article number: 50 (2017)
Abstract
Objectives
Traditional brain age estimation methods use the real age as the training label. However, these methods ignore the deviation between the real age and the brain age that results from accelerated brain aging.
Methods
This paper takes this deviation into account and obtains it by maximizing the correlation between the estimated brain age and the class label rather than by minimizing the difference between the estimated brain age and the real age. Firstly, the search range of the deviation is set according to prior knowledge, and the values in this range are the deviation candidates. Secondly, support vector regression is used as the age estimation model and is trained to minimize the difference between the estimated age and the real age plus deviation, rather than the real age itself. Thirdly, the fitness function is designed based on the correlation criterion. Fourthly, age estimation is conducted on the validation dataset using the trained model, and the estimated ages are put into the fitness function to obtain the fitness value of the deviation candidate. Fifthly, the procedure is repeated until all deviation candidates have been evaluated, and the deviation with the maximum fitness value is taken as optimal. The real age plus the optimal deviation is taken as the brain pathological age.
Results
The experimental results showed that the separability of the samples was clearly improved. For normal control versus Alzheimer’s disease (NCAD), normal control versus mild cognitive impairment (NCMCI), and mild cognitive impairment versus Alzheimer’s disease (MCIAD), the average improvements in correlation were 0.164 (31.66%), 0.1284 (34.29%), and 0.0206 (7.1%), respectively. For the three-class problem (NCMCIAD), the average improvement was 0.2002 (50.39%). The estimated brain pathological age was not only more helpful for the classification of AD but also reflected accelerated brain aging more precisely.
Conclusion
In conclusion, this paper proposes a new kind of brain age, the brain pathological age, and offers an estimation method for it that can distinguish different states of AD, thereby better reflecting accelerated brain aging. In addition, the brain pathological age is helpful for feature reduction, thereby simplifying the relevant classification algorithm.
Background
Alzheimer’s disease (AD) is a common neurodegenerative disease. The key to prevention and treatment is early diagnosis [1]. Magnetic resonance imaging (MRI) is a medical imaging technique used in radiology to visualize the anatomy and the physiological processes of the body in both healthy and diseased states. It is noninvasive, nonradioactive, and highly cost-effective, and it can quantitatively reflect changes in anatomical structures and functions in different biological tissues, so it has been applied in the early diagnosis of AD with positive results [2, 3]. Research on AD based on MRI has been conducted according to the visible changes for diagnosis [4–7]. Although this research has obtained positive results, the classification accuracy, stability and the number of biomarkers are still not sufficient for clinical applications.
Brain MR Images include some changes invisible to the naked eye, such as \(A\beta\) plaque deposition, asymmetry, age, and so on [8–11]. These changes usually represent more essential information about the evolutionary process of AD [12–17]. MRI could be helpful for a deeper understanding of the development process of the disease and for providing better image biomarkers, thereby realizing better classification accuracy with fewer features.
Among these features, brain age is a representative biomarker [16–18]. Pfefferbaum et al. found that the volumes of the major anatomical structures changed as age increased. The anatomical structures include gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF) [19]. It was found that there are complex relationships between changes in the anatomical structures and normal aging [20]. Good et al. found that a linear decrease in GM was predominant in normal aging, as well as a decrease in CSF, according to a cross-sectional voxel-based morphometry (VBM) study [20]. In addition, local areas of GM decrease with age, and cross-sectional morphometric analysis suggested that there are nonlinear patterns of neurodegenerative age-related changes in GM volume [21]. Cole et al. found that the brain age can reflect the accelerated atrophy after traumatic brain injury [22]. Rzezak found that the age-related changes in gray matter relate to educational attainment [23]. Duchesne et al. estimated the brain age across the life span using MRI [24].
There is a strong relationship between age-related changes in the anatomical structures and neurodegenerative diseases such as Alzheimer’s disease (AD), vascular dementia (VD) and schizophrenia [25–27]. Even before the onset of clinical symptoms, some anatomical structures begin to undergo accelerated changes, including volume decreases compared with normal aging. In other words, the age feature has become very important for neurodegenerative diseases, especially AD, and it has received much attention [20, 21, 26]. Most of the results have been positive and have shown the feasibility of early diagnosis. In addition, these results have supported the view that AD is a form of accelerated aging, indicating accelerated brain atrophy [26–29, 36].
Due to the importance of the age feature, some studies of age estimation have been conducted using MRI scans in recent years [30–32]. This research has shown that it is feasible and effective to estimate brain age noninvasively using MRI scans. Other research has further studied how to estimate the brain age from MRI scans in age-related diseases, including AD [33–40, 45], with similarly positive results. Some of these studies used only NC samples (healthy people) for training age estimation models in order to estimate a distinguishable brain age for the diagnosis of AD. The estimated distinguishable brain age could be determined in different classes of samples [36–39].
Encouraged by the role of the estimated brain age in the diagnosis of AD, researchers further studied improvements and applications of the brain age estimation method [40–46]. Irimia et al. combined the structural and connectome information in sMRI and DTI images for brain age estimation [40]. Kondo et al. parcellated brain tissues into local regions defined by the automated anatomical labeling atlas and extracted the features of the local regions for brain age estimation [41]. Nakano et al. conducted brain age estimation by using manifold learning, principal component analysis, and multiple regression models [42]. Apart from the improvement studies above, some application studies based on brain age estimation have recently been conducted [43, 44]. Loewe et al. further combined estimated brain age and APOE status for classification of AD and MCI patients [43]. Luders et al. studied the difference between the estimated brain ages of long-term meditators and control subjects [44]. Franke et al. studied the effect of the APOE genotype on individual brain age in normal aging, mild cognitive impairment, and Alzheimer’s disease. They found that the brain age can be a useful and accurate tool for predicting conversion from MCI to AD even if the information on the patient’s APOE status is missing [45].
It is worth noting that the studies above were based on the same idea: estimating the brain age by minimizing the difference between the estimated age and the real age. Firstly, a regression model was selected to estimate the age; the input consisted of the MR image features, and the output was the estimated age. Secondly, an error function, such as the mean absolute error (MAE) between the estimated age and the real age, was designed to train the model. Thirdly, the best estimated age was found by minimizing the error.
This idea raises some problems. Firstly, because AD is a type of accelerated aging, the deviation between the real age and the brain age differs among the states NC, MCI, and AD. Therefore, it is not suitable to use the real age as the training label. Secondly, the traditional methods aimed to estimate an age close to the real age by minimizing the error function (distance). Because the real age is not suitable as the training label, this minimization is not meaningful for classification. Thirdly, some studies have trained regression models with NC samples and tested subjects in all three states (NC, MCI, and AD), which presumes that the NC samples contain information about the difference between NC, MCI and AD. However, evidence or proof of this has not been provided in these papers. The age estimation was based on regression (a machine learning method), but the training samples (NC) were quite different from the test samples (NC, MCI, and AD). According to machine learning theory, such a training process is neither reasonable nor reliable.
Because the AD process is a form of accelerated aging, the deviation between the real age and the estimated brain age should be considered. The training label should not be the real age but the real age plus deviation. Because the deviation is related to the evolutionary process of AD, it can characterize the three states of AD. Therefore, it would be reasonable to determine the suitable deviation by maximizing the classification accuracy of the three states of AD. These deviations could quantitatively and directly estimate the extent of accelerated aging, improve the classification ability of the estimated age and be helpful for the early diagnosis of AD and for understanding neurodegenerative diseases. Because the age deviations are related to the diagnosis of AD, the age plus deviation is called the brain pathological age here.
Methods
Subjects/database
To validate the proposed algorithm, the publicly accessible ADNI database (http://adni.loni.usc.edu/) was used. The selected samples had already undergone preprocessing and feature extraction. To emphasize the role of age and to avoid the fluctuation introduced by multiple features, each sample had only two image features, and no feature selection was applied. The two features were the volumes of the left and right hippocampus. The data set contained 1485 samples in three classes: 540 NC, 534 MCI and 411 AD. The ages in all three classes ranged from 65 to 85 years. The MRI sequence was a T2 dual echo sequence at 1.5 T; the image size was about \(256 \times 256 \times 170\) voxels, with a voxel size of approximately 1 mm × 1 mm × 1.2 mm. The scanner was a GE Medical Systems scanner. The two hippocampal volumes were extracted from the MRI images with the SPM8 package and the VBM8 toolbox. The feature data are stored in Excel format in ADNI. Because the same images processed with different image processing methods yield different feature data, which would affect the comparison of brain age estimation methods, this study uses the feature data directly rather than the underlying images.
To simplify the analysis, the samples were divided into three classes: NC, MCI and AD. Moreover, the same number of samples was used from each class in order to eliminate the effects of unbalanced samples. The AD class was the smallest, with 411 samples, so 411 samples were used from each class. The three classes of samples had similar age ranges of 65–85 years. To facilitate description, the data set is called the “hippocampus dataset” in subsequent sections. Brief information about the hippocampus dataset is shown in Table 1.
The difference between Path_brainAge_estima and BrainAge_estima
BrainAge_estima denotes the traditional method for brain age estimation, and Path_brainAge_estima denotes the method proposed in this paper. The fitness function used to train BrainAge_estima was the error between the estimated age and the real age, and the algorithm estimated the age by minimizing this error. The purpose was to train the age estimation model to approximate the real age. In this paper, the fitness function used to train Path_brainAge_estima was based on the correlation criterion, namely the correlation between the estimated age and the class label. The proposed algorithm estimated the brain age by maximizing this correlation value, which indirectly reflects the classification capability. Its purpose was to train the age estimation model to approach the optimal classification accuracy. Compared with BrainAge_estima, Path_brainAge_estima was not only based on the optimization of classification accuracy but also reflected the fact, mentioned in the Background section, that AD is a form of accelerated aging. The estimated brain pathological age was therefore more beneficial for improving the classification accuracy in the diagnosis of AD.
The fitness functions of the two types of algorithm are briefly described as follows. \(F_{1}\), the fitness function of BrainAge_estima, is the error between the real age \(y\) and the age \(\hat{y}\) estimated by the regression model, and it is minimized. \(F_{2}\), the fitness function of Path_brainAge_estima in this paper, is the correlation between the age \(\hat{y}\) estimated by the regression model and the class label \(y_{label}\) of the samples, and it is maximized.
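One way to write the two fitness functions consistently with these definitions is sketched below. It assumes the mean absolute error named in the Background for \(F_{1}\) and the Pearson correlation coefficient for \(F_{2}\); the sample count \(N\), the index \(i\) and the overbars (sample means) are notation introduced here for illustration.

```latex
\[
F_{1} = \frac{1}{N}\sum_{i=1}^{N}\left| y_{i} - \hat{y}_{i} \right|
\quad \text{(to be minimized)}
\]
\[
F_{2} = \frac{\sum_{i=1}^{N}\left(\hat{y}_{i}-\bar{\hat{y}}\right)\left(y_{label,i}-\bar{y}_{label}\right)}
{\sqrt{\sum_{i=1}^{N}\left(\hat{y}_{i}-\bar{\hat{y}}\right)^{2}}\,
 \sqrt{\sum_{i=1}^{N}\left(y_{label,i}-\bar{y}_{label}\right)^{2}}}
\quad \text{(to be maximized)}
\]
```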
The Path_brainAge_estima
In this paper, we present an idea for automatically estimating the brain pathological ages of subjects with different states of AD using MRI images (scans) for the diagnosis of AD. Firstly, the deviation is considered in order to characterize accelerated aging directly. Secondly, the training label is the real age plus deviation, so the objective of training the age estimation model becomes more reasonable. Thirdly, a fitness function is designed with the correlation criterion so that the deviation can contribute to improving the diagnosis of AD. Because the aim of estimating the brain age is to diagnose AD, the estimation of brain age can be transformed into a maximization problem. Fourthly, the training samples include subjects with the three states of AD, so the whole process of estimating the brain age uses information not only from NC samples but also from MCI and AD subjects. The information for training the age estimation model is therefore more abundant and helps to improve the quality of the estimated deviation. The real age plus the estimated deviation is called the brain pathological age. While optimizing the brain age estimation algorithm, different kernel functions and unbalanced/balanced datasets are studied to choose the best brain age estimation model.
To verify the performance of the proposed algorithm, subjects from a public dataset and cross validation (CV) testing methods were used. The dataset came from a popular public dataset for AD-related research, the Alzheimer’s Disease Neuroimaging Initiative (ADNI). In total, data from more than 1200 subjects were included. Two-class classification experiments (NCMCI, NCAD, and MCIAD) were performed. In addition, the three-class classification problem was also considered. Besides the classification problems, the deviation was examined to show that the estimated age has strong separability for different states of AD. Each experiment was repeated several times to demonstrate the stability, statistical characteristics and significance level of the estimated age. Because the traditional age estimation methods for the diagnosis of AD are based on the same idea, the comparison was made against the idea rather than against one concrete algorithm. Therefore, in the experimental part, only one representative algorithm, that in [36], was selected and compared.
To facilitate description, the proposed age estimation idea (algorithm) and estimated age are called Path_brainAge_estima and brain pathological age respectively. The traditional brain age idea (algorithm) and the estimated age are called BrainAge_estima and traditional brain age respectively.
The proposed algorithm in this paper (Path_brainAge_estima) was mainly based on a hybrid integrated age deviation selection model, which searches the age deviations to maximize the fitness function (2) and thereby obtains the brain pathological age. It mainly included the following parts: (1) the regression model, support vector regression (SVR); and (2) a fitness function (evaluation criterion), the correlation criterion. Because the deviation between the brain pathological age and the real age changes with the state of AD, the deviation is a variable.
For class 1, the deviation is set to \(w\), which ranges from \(w_{\text{min} }\) to \(w_{\text{max} }\); the deviation for class 2 is set to \(q\), which ranges from \(q_{\text{min} }\) to \(q_{\text{max} }\); the deviation for class 3 is set to \(r\), which ranges from \(r_{\text{min} }\) to \(r_{\text{max} }\),…; the deviation for class n is set to \(s\), which ranges from \(s_{\text{min} }\) to \(s_{\text{max} }\). Let the real age of the \(i1\)-th sample in class 1 be \(A_{ge\_class1\_i1}\), that of the \(i2\)-th sample in class 2 be \(A_{ge\_class2\_i2}\), that of the \(i3\)-th sample in class 3 be \(A_{ge\_class3\_i3},\ldots,\) and that of the \(in\)-th sample in class n be \(A_{{ge\_class{\text{n}}\_in}}\). The training labels of the SVR are then not \(A_{ge\_class1\_i1}\), \(A_{ge\_class2\_i2}\), \(A_{ge\_class3\_i3}, \ldots,\) \(A_{{ge\_class{\text{n}}\_in}}\) but \(A_{ge\_class1\_i1} + w\), \(A_{ge\_class2\_i2} + q\), \(A_{ge\_class3\_i3} + r, \ldots\), \(A_{{ge\_class{\text{n}}\_in}} + s\), respectively.
Firstly, the samples are randomly divided into a training set, a validation set and a test set. Secondly, the SVR model is trained using the training set with labels based on the current combination of deviations: \(w\), \(q\), \(r\),…, \(s\). Then, the validation set is fed into the trained SVR model to obtain the estimated ages, from which the fitness value is calculated with the fitness function.
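The label construction described above can be illustrated with a short Python sketch. The function name `shifted_labels`, the dictionary of per-class deviations and the example values are hypothetical and only serve to show the arithmetic.

```python
def shifted_labels(real_ages, class_ids, deviations):
    """Return training labels: real age plus the deviation of each sample's class."""
    return [age + deviations[c] for age, c in zip(real_ages, class_ids)]


# Hypothetical example: class 0 (e.g. NC) uses w = -5, class 1 (e.g. AD) uses q = 7.
ages = [70.0, 72.5, 68.0, 81.0]
classes = [0, 1, 0, 1]
labels = shifted_labels(ages, classes, {0: -5.0, 1: 7.0})
print(labels)  # [65.0, 79.5, 63.0, 88.0]
```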
The deviations \(w\), \(q\), \(r\),…, \(s\) are within \(\left[ w_{\text{min}}, w_{\text{max}} \right]\), \(\left[ q_{\text{min}}, q_{\text{max}} \right]\), \(\left[ r_{\text{min}}, r_{\text{max}} \right]\),…, \(\left[ s_{\text{min}}, s_{\text{max}} \right]\), respectively, and the fitness values of all the candidate deviations form the set \(A_{F_{2}}\), defined as follows: \(A_{F_{2}} = \left\{ F_{2}|_{w,q,r,\ldots,s} : w = w_{\text{min}}:w_{\text{max}},\; q = q_{\text{min}}:q_{\text{max}},\; r = r_{\text{min}}:r_{\text{max}},\; \ldots,\; s = s_{\text{min}}:s_{\text{max}} \right\}\). In this set, the maximum fitness value \(F_{2\_\text{max}}\) is obtained, and the corresponding optimal deviations are \(w_{ma}\), \(q_{ma}\), \(r_{ma}\),…, \(s_{ma}\). They are calculated by the following formula:
\[ \left( w_{ma}, q_{ma}, r_{ma}, \ldots, s_{ma} \right) = \mathop{\arg\max}\limits_{w,q,r,\ldots,s} F_{2}|_{w,q,r,\ldots,s}, \qquad F_{2\_\text{max}} = F_{2}|_{w_{ma},q_{ma},r_{ma},\ldots,s_{ma}} \]
The main process of the Path_brainAge_estima is described by the following flowchart in Fig. 1.
As seen from the flowchart, Path_brainAge_estima uses the SVR model to estimate the brain pathological age and uses the correlation criterion to design the fitness function. The algorithm fully considers that there are different deviations between the real age and the brain age in different states of AD. The algorithm calculates the fitness value for each combination of deviations \(\left( {w,q,r,\ldots,s} \right)\). Therefore, it is necessary to loop over the ranges \(\left[ w_{\text{min}}, w_{\text{max}} \right]\), \(\left[ q_{\text{min}}, q_{\text{max}} \right]\), \(\left[ r_{\text{min}}, r_{\text{max}} \right]\),…, \(\left[ s_{\text{min}}, s_{\text{max}} \right]\), computing and storing, for every possible combination of \(w,q,r,\ldots,s\), the fitness value and the trained SVR model.
The pseudo code is described as follows:
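The search loop can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: a one-feature least-squares line (`fit_line`) stands in for the SVR model, and the function names, the toy data and the integer step of 1 year are assumptions made here.

```python
import itertools


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va > 0 and vb > 0 else 0.0


def fit_line(x, y):
    """Least-squares line y = a + b*x (stand-in for the SVR model)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((v - mx) ** 2 for v in x)
    b = sum((v - mx) * (t - my) for v, t in zip(x, y)) / sxx
    return my - b * mx, b


def search_deviations(train, valid, ranges):
    """Try every deviation combination and keep the one with the largest fitness.

    train/valid: lists of (feature, real_age, class_id) tuples.
    ranges: one iterable of candidate deviations per class, indexed by class_id.
    """
    best_fit, best_dev = float("-inf"), None
    x_train = [f for f, _, _ in train]
    valid_labels = [c for _, _, c in valid]
    for dev in itertools.product(*ranges):
        y_train = [age + dev[c] for _, age, c in train]   # real age + deviation
        a, b = fit_line(x_train, y_train)
        estimated = [a + b * f for f, _, _ in valid]       # ages on validation set
        fitness = pearson(estimated, valid_labels)         # correlation criterion
        if fitness > best_fit:
            best_fit, best_dev = fitness, dev
    return best_dev, best_fit


# Hypothetical toy data: (volume in arbitrary units, real age, class id).
toy = [(30, 70, 0), (28, 75, 0), (32, 80, 0),   # class 0, e.g. NC
       (20, 70, 1), (22, 75, 1), (18, 80, 1)]   # class 1, e.g. AD
# The toy set is reused for validation only to keep the example short.
best_dev, best_fit = search_deviations(toy, toy, [range(-10, 11)] * 2)
print(best_dev, round(best_fit, 3))
```

Because the stand-in model is a one-feature line, many deviation combinations tie on fitness and the first best one found is kept; a nonlinear regressor such as a kernel SVR would not tie in the same way.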
Support vector regression (SVR)
Support vector machine (SVM) is a highly efficient type of machine learning algorithm described by Vapnik. SVR is the regression variant of SVM, with outstanding nonlinear mapping performance [47]. The purpose of SVR is to determine the hyperplane that can accurately predict the distribution of the data. If the problem is linear, the equation for the hyperplane is provided by expression (3):
where \(\lambda^{*}\) denotes the Lagrange multipliers and \(b^{*}\) the offset (bias) term.
If the problem is nonlinear, there are two methods to obtain a linear case. The first idea is that the data are projected in a space with a greater dimension; the other idea is the introduction of a kernel function. The SVR is used here for its robustness against noise and the possibility of processing data that are nonlinear. In this paper, the outputs of SVR are the estimated ages.
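For reference, a common textbook form of the SVR prediction function is given below. It is a sketch and not necessarily identical to expression (3): the training samples \(x_{i}\), their optimal dual coefficients \(\lambda_{i}^{*}\), the offset \(b^{*}\), the kernel \(K\) and the width parameter \(\gamma\) are notation assumed here.

```latex
\[
f(x) = \sum_{i=1}^{n} \lambda_{i}^{*} \langle x_{i}, x \rangle + b^{*}
\quad \text{(linear case)}
\]
\[
f(x) = \sum_{i=1}^{n} \lambda_{i}^{*} K(x_{i}, x) + b^{*},
\qquad
K(x_{i}, x) = \exp\left(-\gamma \lVert x_{i} - x \rVert^{2}\right)
\quad \text{(Gaussian kernel)}
\]
```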
Fitness function based on dependency criterion
Correlation quantifies the relationship between features in order to identify feature candidates that may be the best to achieve desired effects [48, 49]. Linear correlation methods are robust and computationally efficient, but only detect linear correlations. Nonlinear correlation methods can detect nonlinear correlations, but require careful parameterization. Nonlinear correlations can also be quantified by regression validation errors. Correlation does not imply causality, so correlation analysis may reveal spurious correlations. If the underlying features are known, spurious correlations can be handled by partial correlation. Consider the covariance matrix \(C\) of a data set \(X \subset R^{p}\), where each matrix element \(c_{ij}\) denotes the covariance between the features \(x^{(i)}\) and \(x^{(j)}\), i, j = 1,…, p.
If \(c_{ij}\) is large and positive, then there is a strong positive dependency between \(x^{(i)}\) and \(x^{(j)}\), i.e. high values of \(x^{(i)}\) coincide with high values of \(x^{(j)}\), and low values of \(x^{(i)}\) coincide with low values of \(x^{(j)}\). If \(c_{ij}\) is large and negative, then there is a strong negative dependency, i.e. high values of \(x^{(i)}\) coincide with low values of \(x^{(j)}\) and vice versa. If \(c_{ij}\) is close to zero, then there is a weak dependency between \(x^{(i)}\) and \(x^{(j)}\). If a feature is multiplied by a constant factor α, then the covariance between this feature and any other feature will also be multiplied by α, although we do not expect this feature to make a more useful contribution to data analysis. The correlation coefficient compensates for the effect of constant scaling by dividing the covariance by the product of the standard deviations of both features.
The standard deviations are the square roots of the variances, i.e. the square roots of the diagonal elements of the covariance matrix, \(s^{(i)} = \sqrt {c_{ii} }\), so the correlation matrix can be directly computed from the covariance matrix:
\[ s_{ij} = \frac{c_{ij}}{s^{(i)} s^{(j)}} = \frac{c_{ij}}{\sqrt{c_{ii}\, c_{jj}}} \]
where \(s_{ij}\) ∈ [−1,1]. If \(s_{ij}\) ≈ 1, then there is a strong positive correlation between \(x^{(i)}\) and \(x^{(j)}\). If \(s_{ij}\) ≈ −1, then there is a strong negative correlation. If \(s_{ij}\) ≈ 0, then \(x^{(i)}\) and \(x^{(j)}\) are (almost) independent, so correlation can be interpreted as the opposite of independence. Notice that for μ-σ-standardized data, covariance and correlation are equal.
This paper uses the correlation criterion as the fitness function: the correlation between the ages predicted on the validation set and the class labels is taken as the fitness value \(\lambda\), where \(\hat{y}\) is the estimated brain age and \(y_{label}\) is the class label of the samples.
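A minimal Python sketch of this fitness computation follows, built from the covariance as described above. The function names, the numeric coding of the class labels (0 and 1) and the sample values are assumptions for illustration.

```python
from math import sqrt


def covariance(a, b):
    """Sample covariance (divisor n - 1) of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (n - 1)


def correlation(a, b):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(a, b) / sqrt(covariance(a, a) * covariance(b, b))


def fitness(estimated_ages, class_labels):
    """Fitness value: correlation of estimated ages with numeric class labels."""
    return correlation(estimated_ages, class_labels)


# Hypothetical validation output; labels coded 0 = NC, 1 = AD.
est = [66.0, 64.0, 70.0, 78.0, 80.0, 76.0]
lab = [0, 0, 0, 1, 1, 1]
print(round(fitness(est, lab), 3))

# Scaling a variable by a positive constant leaves the correlation unchanged.
scaled = [2 * v for v in est]
print(abs(fitness(scaled, lab) - fitness(est, lab)) < 1e-12)  # True
```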
Experimental conditions
In order to demonstrate the advantages of the proposed brain pathological age estimation algorithm, two fitness functions were used: \(\lambda_{1}\) and \(\lambda_{2}\). Three two-class experiments (NCAD, NCMCI and MCIAD) and one three-class experiment (NCMCIAD) were conducted. The samples were randomly divided into a training set, a validation set and a test set 100 times, yielding 100 groups of samples.
The experiments were run on a 64-bit Windows 7 operating system with 128 GB of memory. The algorithm was implemented in MATLAB, version 2014a. Because Path_brainAge_estima differs from the traditional age estimation idea (BrainAge_estima) rather than from one concrete algorithm, only one representative algorithm based on the traditional idea [36] was selected, implemented and compared with Path_brainAge_estima (the proposed algorithm). The kernel functions of SVR include the linear kernel, the polynomial kernel and the Gaussian kernel; the parameters are set to their default values.
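The repeated random partition can be sketched as below. The 60/20/20 split fractions, the seed and the function name `random_split` are assumptions, since the split proportions are not stated here; the sample count of 1233 corresponds to three balanced classes of 411 samples.

```python
import random


def random_split(samples, fractions=(0.6, 0.2, 0.2), rng=random):
    """Shuffle a copy of samples and cut it into training, validation and test sets."""
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = int(fractions[0] * len(shuffled))
    n_valid = int(fractions[1] * len(shuffled))
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])   # the remainder forms the test set


rng = random.Random(0)              # fixed seed so the groups are reproducible
samples = list(range(1233))         # stand-in indices for 3 x 411 samples
groups = [random_split(samples, rng=rng) for _ in range(100)]
print(len(groups), [len(part) for part in groups[0]])
```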
Results
Estimation of brain pathological age
Study of kernel function of SVR
For the two classes of samples, the range of the age deviation is set according to prior knowledge. The deviation is usually within about 10 years. Therefore, \(\left[ w_{\text{min}}, w_{\text{max}} \right] = [-10, 10]\) and \(\left[ q_{\text{min}}, q_{\text{max}} \right] = [-10, 10]\). In order to compare the results of different kernel functions, the experiment was repeated 10 times with each kernel function for NCAD, with the kernel parameters left at their defaults. The average estimated age deviations \(\left( {w,q} \right)\) are shown in Table 2.
From Table 2, for NCAD, it can be seen that the average value of \(w\) was always less than that of \(q\), and the difference between \(w\) and \(q\) was largest for the linear kernel function among the three kernel functions. With the linear kernel function, the average value of \(w\) was −5.1 and the average value of \(q\) was 7. \(w\) was usually less than zero, and \(q\) was normally greater than zero. So the pathological age obtained by the linear kernel function could best distinguish between healthy people and AD patients.
The age estimation was based on the training samples and validation samples. The training samples were used for training the age estimation model. The validation samples were used to calculate the fitness value of the deviation candidate and to determine the optimal brain pathological age and the optimal age estimation model. To further verify the performance of the optimal age estimation model, it is necessary to apply the model to the test samples.
In this section, experiments on NCAD with different kernel functions are conducted. The correlation coefficient is used as the dependency criterion here. As discussed above, a higher value of the dependency criterion on the test samples indicates better separability, which in turn indirectly indicates higher classification accuracy. The mean and standard deviation of the correlation values obtained with different kernel functions are shown in Table 3.
From Table 3, for the different kernel functions, Path_brainAge_estima showed better correlation than the case ‘without age estimation’, which indicates that real age alone was not sufficient. In addition, our algorithm had better correlation (see the boldface type) than the current popular idea (BrainAge_estima). For normal control versus Alzheimer’s disease (NCAD), normal control versus mild cognitive impairment (NCMCI), and mild cognitive impairment versus Alzheimer’s disease (MCIAD), the average improvements were 0.164 (31.66%), 0.1284 (34.29%), and 0.0206 (7.1%), respectively.
The improvements are apparent, especially with the linear kernel function. The mean fitness value with the linear kernel was the largest among the three kernel functions, which shows that the brain pathological age from our algorithm with the linear kernel function was more helpful for the classification of AD. Therefore, the linear kernel is applied in the subsequent experiments.
Estimation of brain pathological age
For the two classes of samples, experiments on NCMCI and MCIAD were also conducted 10 times each; the range of the deviation was the same as for NCAD; the average estimated age deviations \(\left( {w,q} \right)\) are shown in Table 4. Considering the time cost, for three classes of samples, the range of the age deviation was set as follows: \(\left[ w_{\text{min}}, w_{\text{max}} \right] = [-8, 8]\), \(\left[ q_{\text{min}}, q_{\text{max}} \right] = [-8, 8]\), \(\left[ r_{\text{min}}, r_{\text{max}} \right] = [-8, 8]\). The same experiment was repeated 10 times. The average estimated deviations \(\left( {w,q,r} \right)\) are shown in Table 5.
From Table 4, for NCAD, it can be seen that the average value of \(w\) was −5.1 and the average value of \(q\) was 7. \(w\) was always less than \(q\). In addition, \(w\) was usually less than zero, and \(q\) was normally greater than zero. There was a difference between healthy people’s pathological age and AD patients’ pathological age. In other words, the pathological age could distinguish between healthy people and AD patients while the real age could not. In order to show the significance of the difference between the pathological age and the real age, p-values were computed. According to the p-values, two of the estimated pathological ages were significantly different from the real age (p < 0.05). The case was similar for NCMCI and MCIAD, and \(w\) was always less than \(q\) (see Fig. 2).
From Table 5, for the three classes of samples (NCMCIAD), it can be seen that the average value of \(w\) was −4, that of \(q\) was −0.7, and that of \(r\) was 2.7, and they satisfy the inequality \(w < q < r\). The results showed that the deviation for healthy people (NC) was usually lower than that for MCI subjects, which in turn was lower than that for AD patients. In other words, the deviation between the pathological age and the real age could distinguish NC, MCI and AD, while the real age alone could not. Please see Fig. 3 for more information.
Verification of effectiveness of the estimated brain pathological age
In this section, the two-class and three-class problems are examined: NCMCI, MCIAD, and NCMCIAD. As in Table 3, the mean and standard deviation of the correlation values are shown in Tables 6 and 7 for the two-class problems and the three-class problem, respectively.
From Table 6, for NCAD, NCMCI and MCIAD, Path_brainAge_estima showed better correlation than the case ‘without age estimation’, which indicates that real age alone was not sufficient and that it was necessary to estimate the brain age. In addition, our algorithm had a better correlation value (see the boldface type) than the other algorithms, showing that the pathological age from our algorithm was more helpful for the classification of AD. Compared with the case ‘without age estimation’, both Path_brainAge_estima and BrainAge_estima showed apparent improvements, but the size of the improvement differed across the two-class problems. For NCAD, the improvement was most apparent, possibly because NC is quite different from AD. In terms of standard deviation, our algorithm was better than BrainAge_estima, indicating that our algorithm was more stable than the traditional age estimation algorithm. For MCIAD, all the correlation values were lower than 0.5, which means that the correlation was not strong. One possible reason is that the difference between MCI and AD is small, so the two classes are difficult to separate. Another possible reason is that the step size of the deviation search was not small enough or that the search range of the brain age deviation was not appropriate.
The case is similar in Table 7. For NCMCIAD, BrainAge_estima had better correlation than the case ‘without age estimation’, indicating that real age alone was not sufficient and that it was necessary to estimate the brain age. In addition, our algorithm had better correlation (see the boldface type) than the traditional age estimation algorithm, demonstrating that the brain pathological age from our algorithm was more helpful for the classification of AD. Compared with the case ‘without age estimation’, both BrainAge_estima and Path_brainAge_estima showed apparent improvements. The difference between healthy people and AD patients was amplified as much as possible. Nevertheless, our algorithm still had the best correlation. In terms of standard deviation, our algorithm was better than BrainAge_estima, indicating that our algorithm was more stable than the traditional age estimation algorithm.
Figure 4 is a graphical representation of Tables 6 and 7, showing that the correlation value with these algorithms had a trend of gradual increase.
As the analysis above shows, the brain pathological age has the highest correlation with the class label and therefore the best classification capability. In other words, the brain pathological age has the highest dependency on the class label. According to the principle of feature optimization, a good feature subset has two characteristics: high classification capability and small feature size. High dependency on the class label indirectly supports high classification capability. This section also studies the dependency of the brain age on the MR features: high dependency on the MR features indicates high redundancy with them, which is helpful for reducing the feature size. Table 8 shows the dependency of the brain age on the class label and the MR features. ‘CwC’ means the correlation of age with the class label; ‘ACwF’ means the average correlation of age with the MR features; ‘CwF1’ means the correlation of age with the 1st feature; ‘CwF2’ means the correlation of age with the 2nd feature. Each entry is given as the mean and standard deviation of the correlation value.
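The quantities in Table 8 can be illustrated with a minimal sketch. It assumes that the dependency criterion is the Pearson correlation and that ACwF averages the absolute per-feature correlations; the text does not state whether signed or absolute values are averaged, so this is an assumption. The function names and the toy data are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either sequence is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def dependency_summary(ages, labels, feature_columns):
    """Return (CwC, ACwF, per-feature correlations) for one age type."""
    cwc = pearson(ages, labels)
    per_feature = [pearson(ages, column) for column in feature_columns]
    # Assumption: ACwF is the mean of the absolute correlations.
    acwf = sum(abs(c) for c in per_feature) / len(per_feature)
    return cwc, acwf, per_feature


# Toy data: four subjects, class label 0 = NC and 1 = AD, two MR features.
ages = [70, 72, 74, 76]
labels = [0, 0, 1, 1]
features = [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]]
cwc, acwf, per_feature = dependency_summary(ages, labels, features)
print(round(cwc, 3), round(acwf, 3), [round(c, 3) for c in per_feature])
```

Running the same summary on the real age, the traditional brain age and the brain pathological age would give one row of such values per age type.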
As Table 8 shows, estimating the brain age improves the dependency on the class label, and the brain pathological age obtains the highest correlation (dependency) with the class label. This means that the brain pathological age is the most helpful for classification of AD. For example, for CwC of NC_MCI_AD, the correlation of the real age with the class label is 0.045, that of the traditional brain age is 0.397, and that of the brain pathological age is 0.598. The cases of NC_AD, NC_MCI and MCI_AD are similar. More importantly, the correlation of the brain pathological age with the MR features (redundancy) is also the highest. This means that the brain pathological age is the most helpful for feature reduction, thereby reducing the complexity of the classification model. For example, for ACwF of NC_MCI_AD, the correlation of the real age with the MR features is 0.141, that of the traditional brain age is 0.691, and that of the brain pathological age is 0.961. The cases of NC_AD, NC_MCI and MCI_AD are similar.
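The size of these gains can be checked with simple arithmetic. The sketch below uses only the NC_MCI_AD values quoted from Table 8; the helper name `improvement` is hypothetical.

```python
def improvement(baseline, proposed):
    """Return the absolute gain and the gain relative to the baseline in percent."""
    gain = proposed - baseline
    return gain, 100.0 * gain / baseline


# NC_MCI_AD values quoted in the text (Table 8).
cwc = {"real": 0.045, "traditional": 0.397, "pathological": 0.598}
acwf = {"real": 0.141, "traditional": 0.691, "pathological": 0.961}

for name, row in (("CwC", cwc), ("ACwF", acwf)):
    gain, percent = improvement(row["traditional"], row["pathological"])
    print(f"{name}: +{gain:.3f} over the traditional brain age ({percent:.1f}%)")
```

For CwC this gives a gain of about 0.201, roughly 50.6% relative to the traditional brain age; for ACwF the gain is 0.270, roughly 39.1%.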
Discussion
The brain age estimated from MR images by different methods can distinguish the different states of AD and is helpful for improving classification accuracy. Some methods use all classes of samples for training, while others use only normal controls (NC), but all of them are based on the same idea: estimating the age by minimizing the distance between the estimated age and the real age. This idea is not in accordance with the fact that the AD process is a form of accelerated aging.
This paper addressed this problem with the brain pathological age, which is obtained by maximizing the classification accuracy of AD. Firstly, the samples are divided into three sets: a training set, a validation set and a test set. Secondly, an age deviation is introduced. Thirdly, the dependency criterion of correlation is used as the fitness function. Fourthly, for each age deviation candidate, the SVR is trained on the training set and the corresponding fitness value is obtained on the validation set. Fifthly, the age deviation is optimized by maximizing the fitness value: the candidate with the best fitness value is the optimal age deviation. The real age plus the optimal age deviation is called the brain pathological age.
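The five steps can be sketched for the two-class case as follows. This is only an illustration under stated assumptions, not the authors' implementation: a 1-nearest-neighbour regressor on a single feature stands in for the SVR so that the example needs no external library, the fitness is taken to be the absolute Pearson correlation, and the search range of −5 to 5 years with a step of 1, the toy data and all names are hypothetical. The test set is omitted.

```python
from itertools import product
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either sequence is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def fit_predict(train_x, train_y, query_x):
    """Stand-in regressor (1-nearest neighbour); the paper uses SVR here."""
    predictions = []
    for q in query_x:
        nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - q))
        predictions.append(train_y[nearest])
    return predictions


def search_deviation(train, val, candidates):
    """Samples are (feature, real_age, label) with label 0 = NC and 1 = AD."""
    best, best_fitness = None, -1.0
    for w, r in candidates:
        # Training label: real age plus the deviation of the sample's class.
        targets = [age + (r if label == 1 else w) for _, age, label in train]
        estimated = fit_predict([f for f, _, _ in train], targets,
                                [f for f, _, _ in val])
        # Fitness: dependency of the estimated age on the class label.
        fitness = abs(pearson(estimated, [label for _, _, label in val]))
        if fitness > best_fitness:
            best, best_fitness = (w, r), fitness
    return best, best_fitness


train = [(0.10, 70, 0), (0.20, 76, 0), (0.30, 72, 0),
         (0.60, 71, 1), (0.70, 75, 1), (0.80, 73, 1)]
val = [(0.12, 74, 0), (0.28, 71, 0), (0.62, 72, 1), (0.78, 74, 1)]
# Candidates over the assumed range, keeping only pairs with w < r.
candidates = [(w, r) for w, r in product(range(-5, 6), repeat=2) if w < r]
best, fitness = search_deviation(train, val, candidates)
print(best, round(fitness, 3))
```

On this toy data the fitness grows with the gap between the two deviations, so the optimum lies at the edge of the assumed search range, which illustrates why the range has to be set from prior knowledge.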
The popular regression method SVR is used as the age estimation model. Several kernel functions and dependency criteria are compared in the case of NC_AD. The experimental results show that the age deviation of NC is lower than that of AD, which supports the proposed idea and quantitatively confirms that the AD process is a form of accelerated aging. The difference between the age deviations of NC and AD is largest with the linear kernel function, which means that the linear kernel is the most helpful for maximizing the classification accuracy of NC_AD; it is therefore used for subsequent age estimation. A possible reason is that the linear kernel is the most suitable for the data. Across the correlation values of the different kernel functions, the brain pathological age from the proposed algorithm is the best, which shows that the proposed age estimation idea works best.
Based on the SVR with the linear kernel function and the dependency criterion, the proposed age estimation algorithm is applied to the cases of NC-AD, NC-MCI, MCI-AD, and NC-MCI-AD. The accelerated aging is quantified by the estimated brain age deviations: the age deviation of NC is lower than that of MCI, which is lower than that of AD. To show the advantage of the proposed algorithm, the correlation values are calculated for the different age types. The correlation value of the brain age from the existing age estimation method is better than that of the real age, which means that brain age estimation is necessary. The correlation value of the brain age from the proposed algorithm is better than that from the existing method, which means that the proposed algorithm is better than the existing brain age estimation method in terms of classification of AD. The reason is that the age deviations from the proposed algorithm are in accordance with the fact that the AD process is a form of accelerated aging.
Based on the estimated brain ages, the differences between the samples belonging to different classes are calculated. According to the results, the differences obtained by the proposed algorithm not only vary monotonically but can also distinguish the different states of AD. The bar graphs also support this point. The results once again show that the brain pathological age can quantitatively measure the extent of the accelerated aging in the AD process.
The correlations of the brain pathological age and of the traditional brain age with the class label and with the MR features are studied. The experimental results show that the brain pathological age is the most helpful for classification of AD and for feature reduction.
At present, all the existing brain age estimation methods are based on the same idea, which is to minimize the error between the predicted age and the actual age; this is inconsistent with the accelerated brain aging of AD. In this paper, a new brain age estimation idea (brain pathological age estimation) is proposed that is quite different from the existing one. According to the experimental results, for the same public feature data, the brain pathological age has higher classification accuracy and is more helpful for reducing the feature size than the existing brain age. The main advantage of the proposed algorithm is that it can improve the accuracy of classification and effectively reduce the feature size. Its main limitation is that when the number of features is large and the step size of the deviation search is very small, the time cost of the brain pathological age estimation becomes high. The potential significance of this paper is that it proposes a new brain pathological age estimation idea rather than only a concrete method, thereby obtaining a new and better brain age type (biomarker). This idea can lead to many new concrete methods by introducing different regression models, optimization algorithms, classification criteria, and so on.
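The time-cost limitation can be made concrete by counting deviation candidates, since each candidate requires one round of model training. The sketch below assumes an exhaustive grid search over triples \(w < q < r\) for the three-class case; the search range of −5 to 5 years and both step sizes are hypothetical.

```python
from itertools import combinations


def deviation_grid(low, high, step):
    """Evenly spaced deviation values from low to high, both inclusive."""
    count = int(round((high - low) / step)) + 1
    return [round(low + i * step, 10) for i in range(count)]


def ordered_triples(values):
    """All (w, q, r) with w < q < r; values must be sorted and distinct."""
    return list(combinations(values, 3))


for step in (1.0, 0.5):
    triples = ordered_triples(deviation_grid(-5.0, 5.0, step))
    print(f"step {step}: {len(triples)} candidate triples")
```

Halving the step from 1 to 0.5 raises the number of candidates from 165 to 1330, so the number of training runs grows roughly with the cube of the grid resolution.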
Highlights
This paper proposed a new kind of brain age, the brain pathological age, and realized a concrete method for estimating it, which is helpful for the diagnosis of AD. The main contributions of this paper are as follows.
(1) The current age estimation methods for the diagnosis of AD are based on the same idea. This paper proposed a new idea to replace it rather than another concrete method.
(2) The proposed idea considers the deviation directly, so it can help to distinguish the different states of AD and thereby estimate the extent of accelerated aging. The age estimation was conducted by maximizing classification accuracy rather than by minimizing the distance between the estimated age and the real age, which makes the estimation helpful for the diagnosis of AD.
(3) The idea uses the real age plus the deviation as the training label rather than the real age, thereby making the training process more reasonable for the classification of AD.
(4) Both two states and three states of AD were involved in the brain age estimation in this paper.
(5) The dependency criterion of correlation was used for algorithm design and for verifying the quality of the estimated age. The criterion is a measure of classification capability. It has low computational complexity and good generalization capability, so the brain pathological age can be widely applied to individuals from different regions.
(6) The brain pathological age is the most helpful for feature reduction, thereby reducing the complexity of the classification model.
Conclusions
Real age has been shown to be related to the classification of AD, but its classification capability is poor. Existing age estimation methods can offer a brain age estimated from brain MR images for classification of AD, but they are all based on the same idea, which is to estimate the age by minimizing the distance between the estimated age and the real age. This idea is not in accordance with the AD process. To address this limitation, this paper proposed a new brain age estimation idea: brain pathological age estimation. The experimental results showed that the estimated brain pathological age could reflect the differences between the real age and the brain pathological age at a significant level. The difference could distinguish the different states of AD and was more helpful for the classification of AD, reflecting the extent of accelerated aging better than the traditional brain age estimation idea. In addition, the brain pathological age is the most helpful for feature reduction in the subsequent classification model.
Abbreviations
SVR: support vector regression
NC: normal control
MCI: mild cognitive impairment
AD: Alzheimer’s disease
MRI: magnetic resonance imaging
GM: gray matter
WM: white matter
CSF: cerebrospinal fluid
VBM: voxel-based morphometry
VD: vascular dementia
DTI: diffusion tensor imaging
MAE: mean absolute error
SVM: support vector machine
References
 1.
Selkoe DJ. Preventing Alzheimer’s disease. Science. 2012;337(6101):1488–92.
 2.
Geert Jan Biessels. Diagnosis and treatment of vascular damage in dementia. Biochimica Biophysica Acta. 2016;1862(5):869–77.
 3.
Tondelli M, Wilcock GK, Nichelli P, et al. Structural MRI changes detectable up to ten years before clinical Alzheimer’s disease. Neurobiol Aging. 2012;33(4):825–36.
 4.
Tosun Duygu, Mojabi Pouria, Weiner Michael W, Schuff Norbert. Joint analysis of structural and perfusion MRI for cognitive assessment and classification of Alzheimer’s disease and normal aging. NeuroImage. 2010;52(1):186–97.
 5.
Diciotti S, Ciulli S, Ginestroni A, Salvadori E, et al (2015) Multimodal MRI classification in vascular mild cognitive impairment. Conference Proceeding of IEEE Engineering Medical Biology Society. p 4278–81.
 6.
Ortiz Andres, Gorriz Juan M, Ramirez Javier, et al. LVQSVM based CAD tool applied to structural MRI for the diagnosis. Pattern Recognit Lett. 2013;34(14):1725–33.
 7.
Apostolova LG, Hwang KS, Kohannim O, et al. ApoE4 effects on automated diagnostic classifiers for mild cognitive impairment and Alzheimer’s disease. NeuroImage Clin. 2014;24(4):461–72.
 8.
Alafuzoff I, Thal DT, Bogdanovic N, AlSarraj S, Bodi I, Boluda S, Bugiani O, Duyckaerts C, Gelpi E, Gentleman S. Assessment of βamyloid deposits in human brain: a study of the BrainNet Europe Consortium. Acta Neuropathol. 2009;117:309–20.
 9.
Pepe A, Dinov I, Tohka J. An automatic framework for quantitative validation of voxel based morphometry measures of anatomical brain asymmetry. NeuroImage. 2014;100(15):444–59.
 10.
Takao H, Hayashi N, Ohtomo K. White matter microstructure asymmetry: effects of volume asymmetry on fractional anisotropy asymmetry. Neuroscience. 2013;231:1–2.
 11.
BekiesińskaFigatowska Monika, Sawicka Ewa, Żak Klaudia, Szczygielski Orest. Age related changes in brain MR appearance in the course of neurocutaneous melanosis. Eur J Radiol. 2016;85(8):1427–31.
 12.
Coppus AM, Schuur M, Vergeer J, Janssens AC, Oostra BA, Verbeek MM, van Duijn CM. Plasma β amyloid and the risk of Alzheimer’s disease in down syndrome. Neurobiol Aging. 2012;33(9):1988–94.
 13.
ScherzerAttali R, Farfara D, Cooper I, Levin A, BenRomano T, Trudler D, Vientrov M, ShaltielKaryo R, Shalev DE, SegevAmzaleg N, Gazit E. Naphthoquinonetyrptophan reduces neurotoxic Aβ* 56 levels and improves cognition in Alzheimer’s disease animal model. Neurobiol Dis. 2012;46(3):663–72.
 14.
Kim JH, Lee JW, Kim GH, Roh JH, Kim MJ, Seo SW, Kim ST, Jeon S, Lee JM, Heilman KM, Na DL. Cortical asymmetries in normal, mild cognitive impairment, and Alzheimer’s disease. Neurobiol Aging. 2012;33(9):1959–66.
 15.
Capitani E, Rosci C, Saetti MC, Laiacona M. Mirror asymmetry of category and letter fluency in traumatic brain injury and Alzheimer’s patients. Neuropsychologia. 2009;47(2):423–9.
 16.
Tokuchi R, Hishikawa N, Sato K, et al. Agedependent cognitive and affective differences in Alzheimer’s and Parkinson’s diseases in relation to MRI findings. J Neurol Sci. 2016;365:3–8.
 17.
Riedel BC, Thompson PM, Brinton RD. Age, APOE and sex: triad of risk of Alzheimer’s disease. J Steroid Biochem Mol Biol. 2016;160:134–47.
 18.
Hoyer D, Schneider U, Kowalski EM, Schmidt A, Witte OW, Schleußner E, Hatzmann W, Grönemeyer DH, van Leeuwen P. Validation of functional fetal autonomic brain age score fABAS in 5 min short recordings. Physiol Measurement. 2015;36(11):2369–78.
 19.
Pfefferbaum A, Mathalon DH, Sullivan EV, Rawles JM, Zipursky RB, Lim KO. A quantitative magnetic resonance imaging study of changes in brain morphology from infancy to late adulthood. Arch Neurol. 1994;51(9):874–87.
 20.
Good CD, Johnsrude IS, Ashburner J, Henson RN, Friston KJ, Frackowiak RS. A voxelbased morphometric study of ageing in 465 normal adult human brains. NeuroImage. 2001;14(1):21–36.
 21.
Terribilli D, Schaufelberger MS, Duran FLS, Zanetti MV, Curiati PK, Menezes PR, Scazufca M, Amaro E, Leite CC, Busatto GF. Agerelated gray matter volume changes in the brain during nonelderly adulthood. Neurobiol Aging. 2011;32(2):354–68.
 22.
Cole JH, Leech R, Sharp DJ. Prediction of brain age suggests accelerated atrophy after traumatic brain injury. Ann Neurol. 2015;77(4):571–81.
 23.
Rzezak P. Relationship between brain agerelated reduction in gray matter and educational attainment. PLoS ONE. 2015;10(10):e0140945.
 24.
Duchesne S, Gravel P. Estimating ‘Brain Age’ across the life span using MRI appearance. Alzheimers Dementia. 2016;12(7):111–P111.
 25.
Fratiglioni L, Grut M, Forsell Y, Viitanen M, Grafström M, et al. Prevalence of Alzheimer’s disease and other dementias in an elderly urban population Relationship with age, sex, and education. Neurology. 1991;41(12):1886–92.
 26.
Hullinger R, Puglielli L. Molecular and cellular aspects of agerelated cognitive decline and Alzheimer’s disease. Behav Brain Res. 2017;322(Part B):191–205.
 27.
Pini L, Pievani M, Bocchetta M, Altomare D, Bosco P, Cavedo E, Galluzzi S, Marizzoni M, Frisoni GB. Brain atrophy in Alzheimer’s disease and aging. Ageing Res Rev. 2016;30:2548.
 28.
Lorenzi M, Pennec X, Frisoni GB, Ayache N. Disentangling normal aging from Alzheimer’s disease in structural magnetic resonance images. Neurobiol Aging. 2015;36(Suppl 1):S42–52.
 29.
Rieckmann A, Van Dijk KR, Sperling RA, Johnson KA, Buckner RL, et al. Accelerated decline in white matter integrity in clinically normal individuals at risk for Alzheimer’s disease. Neurobiol Aging. 2016;42:177–88.
 30.
Baumann Pia, Widek Thomas, Merkens Heiko, Boldt Julian, Petrovic Andreas, Urschler Martin, Kirnbauer Barbara, Jakse Norbert, Scheurer Eva. Dental age estimation of living persons: comparison of MRI with OPG. Forensic Sci Int. 2015;253:76–80.
 31.
Vieth Volker, Schulz Ronald, Brinkmeier Paul, Dvorak Jiri, Schmeling Andreas. Age estimation in U20 football players using 3.0 tesla MRI of the clavicle. Forensic Sci Int. 2014;241:118–22.
 32.
Ekizoglu O, Hocaoglu E, Inci E, Can IO, Aksoy S, Kazimoglu C. Forensic age estimation via 3T magnetic resonance imaging of ossification of the proximal tibial and distal femoral epiphyses: use of a T2weighted fast spinecho technique. Forensic Sci Int. 2016;260:102.e1–7.
 33.
Davatzikos C, Fan Y, Wu X, Shen D, Resnick SM. Detection of prodromal Alzheimer’s disease via pattern classification of magnetic resonance imaging. Neurobiol Aging. 2008;29(4):514–23.
 34.
Bortolon Catherine, Louche Aurore, GélyNargeot MarieChristine, Raffard Stéphane. Do patients suffering from Alzheimer’s disease present an ownage bias in face recognition? Exp Gerontol. 2015;70:46–53.
 35.
Hirano S, Shinotoh H, Shimada H, et al. Age correlates with cortical acetylcholinesterase decline in Alzheimer’s disease patients: a PET study. Alzheimers Dementia. 2012;8(4):531–2.
 36.
Franke K, Ziegler G, Klöppel S, et al. Estimating the age of healthy subjects from T1weighted MRI scans using kernel methods: exploring the influence of various parameters. Neuroimage. 2010;50(3):883–92.
 37.
Franke K, Hagemann G, Schleussner E, et al. Changes of individual BrainAGE during the course of the menstrual cycle. Neuroimage. 2015;15(115):1–6.
 38.
Franke K, Gaser C, Manor B, et al. Advanced BrainAGE in older adults with type 2 diabetes mellitus. Front Aging Neurosci. 2013;5(1):90–90.
 39.
Teverovskiy LA, Becker JT, Lopez OL, Liu Y. Quantified brain asymmetry for age estimation of normal and AD/MCI subjects. IEEE Int SympBiomed Imaging. 2008;5(1):1509–12.
 40.
Irimia A, Torgerson CM, Goh SYM, et al. Statistical estimation of physiological brain age as a descriptor of senescence rate during adulthood. Brain Imaging Behav. 2015;9(4):678–89.
 41.
Kondo C, Ito K, Wu K, et al (2015) An age estimation method using brain local features for T1weighted images. 37th Annual international conference of the IEEE engineering in medicine and biology society (EMBC), Milan, pp 666669.
 42.
Nakano R, Kobashi S, Alam SB, et al (2015) Neonatal brain age estimation using manifold learning regression analysis. IEEE international conference on systems man and cybernetics conference proceedings (SMC 2015). pp 22732276.
 43.
Loewe LC, Gaser C, Franke K. The effect of the APOE genotype on individual BrainAGE in normal aging, mild cognitive impairment, and Alzheimer’s disease. PLoS ONE. 2016;11(7):1–25. doi:10.1371/journal.pone.0157514.
 44.
Luders E, Cherbuin N, Gaser C. Estimating brain age using highresolution pattern recognition: younger brains in longterm meditation practitioners. Neuroimage. 2016;134:508–13.
 45.
Löwe LC, Gaser C, Franke K, Christine L. The Effect of the APOE genotype on individual BrainAGE in normal aging, mild cognitive impairment, and Alzheimer’s disease. PLoS ONE. 2016;11(7):e0157514.
 46.
Moradi Elaheh, Pepe Antonietta, Gaser Christian, Huttunen Heikki, Tohka Jussi. Machine learning framework for early MRIbased Alzheimer’s conversion prediction in MCI subjects. NeuroImage. 2015;104:398–412.
 47.
Zhang Q, Hu X, Zhang B. Comparison of lnorm SVR and sparse coding algorithms for linear regression. IEEE Trans Neural Netw Learn Syst. 2015;26(8):1828–33.
 48.
Li W, Zhang F, Li C, Song H. Observation of nonhermitian quantum dependency criterion in mesoscopic optomechanical system. Int J Theor Phys. 2016;55(4):2097–109.
 49.
Runkler Thomas A. Data analytics: models and algorithms for intelligent data analysis. Berlin: Springer; 2012.
Authors’ contributions
YL conceived of the whole study, and participated in its design and coordination and helped to draft the manuscript. YL, JW and SX participated in the measurements of all subjects and drafted the complete manuscript. PW and MQ managed the trials and assisted with writing discussions in the manuscript. All authors read and approved the final manuscript.
Acknowledgements
The authors thank Professor Jingna Zhang, Li Wang, Linqiong Sang and Ye Zhang for their valuable suggestions.
Competing interests
The authors declare that they have no competing interests.
Availability of data and supporting materials
The data used in this study were obtained from the ADNI (Alzheimer’s Disease Neuroimaging Initiative) database (http://adni.loni.usc.edu/).
Funding
This research is funded by the National Natural Science Foundation of China (NSFC; Nos. 61108086, 91438104, 61571069, 81601970), Basic and Advanced Research Project in Chongqing (cstc2016jcyjA0043, cstc2016jcyjA0134, cstc2016jcyjA0064), Chongqing Social Undertaking and People’s Livelihood Guarantee Science and Technology innovation Special Foundation (cstc2016shmszx40002), Southwest Hospital science and technology innovation program (SWH2016LHYS11), The Ministry of education to return personnel research start fund, the Fundamental Research Funds for the Central Universities (10611CDJXZ238826), and Scientific and Technological Research Program of Chongqing Municipal Education Commission (KJ1603805).
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Li, Y., Liu, Y., Wang, P. et al. Dependency criterion based brain pathological age estimation of Alzheimer’s disease patients with MR scans. BioMed Eng OnLine 16, 50 (2017). https://doi.org/10.1186/s12938-017-0342-y
Keywords
 Brain age estimation
 Brain pathological age
 Alzheimer’s disease
 Classification
 Correlation criterion
 Magnetic resonance imaging
 Support vector regression