Analyzing brain structural differences associated with categories of blood pressure in adults using empirical kernel mapping-based kernel ELM+

Background Hypertension increases the risk of angiocardiopathy and cognitive disorder. Blood pressure has four categories: normal, elevated, hypertension stage 1 and hypertension stage 2. The quantitative analysis of hypertension helps determine disease status, prognosis assessment, guidance and management, but is not well studied in the framework of machine learning. Methods We proposed empirical kernel mapping-based kernel extreme learning machine plus (EKM–KELM+) classifier to discriminate different blood pressure grades in adults from structural brain MR images. ELM+ is the extended version of ELM, which integrates the additional privileged information about training samples in ELM to help train a more effective classifier. In this work, we extracted gray matter volume (GMV), white matter volume, cerebrospinal fluid volume, cortical surface area, cortical thickness from structural brain MR images, and constructed brain network features based on thickness. After feature selection and EKM, the enhanced features are obtained. Then, we select one feature type as the main feature to feed into KELM+, and the rest of the feature types are PI to assist the main feature to train 5 KELM+ classifiers. Finally, the 5 KELM+ classifiers are ensemble to predict classification result in the test stage, while PI is not used during testing. Results We evaluated the performance of the proposed EKM–KELM+ method using four grades of hypertension data (73 samples for each grade). The experimental results show that the GMV performs observably better than any other feature types with a comparatively higher classification accuracy of 77.37% (Grade 1 vs. Grade 2), 93.19% (Grade 1 vs. Grade 3), and 95.15% (Grade 1 vs. Grade 4). The most discriminative brain regions found using our method are olfactory, orbitofrontal cortex (inferior), supplementary motor area, etc. Conclusions Using region of interest features and brain network features, EKM–KELM+ is proposed to study the most discriminative regions that have obvious structural changes in different blood pressure grades. The discriminative features that are selected using our method are consistent with the existing neuroimaging studies. Moreover, our study provides a potential approach to take effective interventions in the early period, when the blood pressure makes minor impacts on the brain structure and function.


Background
Hypertension is one of the risk factors for cognitive dysfunction. According to the epidemiological survey, the global incidence of hypertension in 2000 was about 26.4%, affecting 972 million people worldwide. By 2025, the number of people affected by hypertension is to increase by 60% to 1.56 billion [1]. A long-term follow-up of elderly patients at risk for cardiovascular disease found that the patient's blood pressure (BP) variability affects the patient's cognitive function [2]. A latitudinal investigation demonstrates that high systolic blood pressure (SBP), high diastolic blood pressure (DBP) and persistent hypertension can accelerate the decline of cognitive function, as well as increase the incidence of dementia [3]. Longitudinal studies have found that antihypertensive therapy can effectively reduce the incidence of cognitive dysfunction [4]. Excessive BP can cause cerebral vascular damage, which in turn causes white matter and gray matter ischemic or hemorrhagic damage [5], while white matter and gray matter ischemia can cause brain atrophy and leukoaraiosis. All these studies indicate that high BP may affect cognitive function.
Hypertension can be classified by severity. The classification scheme for hypertension helps determine the condition, quantify the risk, evaluate the prognosis and guide the management [6]. The "2017 American College of Cardiology/American Heart Association (2017 ACC/AHA) Guideline for the Prevention, Detection, Evaluation, and Management of High Blood Pressure in Adults" recently recommended a new categorization for BP grades. This new guideline commends that BP should be classified in four categories: normal (Grade 1), elevated (Grade 2), hypertension stage 1 (Grade 3) and 2 (Grade 4). And defined hypertension as a SBP of ≥ 130 mmHg and/or a DBP of ≥ 80 mmHg, reducing the former SBP and DBP by 10 mmHg (a SBP of ≥ 140 mmHg and/or DBP of ≥ 90 mmHg [7]). The research of Ettehad [8] and Xie et al. [9] also supported this BP ≥ 130/80 mmHg as critical value of hypertension intervention.
The overall situation of prevention and control of hypertension in China is severe. At present, Chinese diagnostic criteria of hypertension is still BP ≥ 140/90 mmHg. According to the 2017 ACC/AHA new diagnostic criteria of hypertension, China will add another 100 million hypertensive patients. Treatment in the early stages of disease development may help prevent the development of cardiovascular disease and reduce the risk and complications of hypertension [10,11]. It is necessary for us to learn from the 2017 ACC/AHA guidelines, which is of great significance for the prevention and control of hypertension as well as the entire chronic patient population in China.
The purpose of this study is using machine learning to explore the relationship between BP grades and brain structural changes. Magnetic resonance (MR) imaging, a safe and effective means, plays an important role in revealing brain abnormalities. ROI-based analysis has been widely used [12]. Maaike et al. [13] used voxel-based morphometry to study the gray matter and white matter volume of hypertension, revealing the relationship between hypertension and anterior cingulate cortex (ACC), lower forehead (IFG) and hippocampal volume. Studies of structural abnormalities in the brain based on MR images of hypertensive patients have shown that brain atrophy and brain tissue lesions often occurred in gray matter and white matter [14,15], affecting the transport of nutrients to neurons and leading to the decline of cognitive function [16]. From MR-related studies, it is known that gray matter damages appeared in the prefrontal cortex, hippocampus, lower jaw, and inferior parietal lobe, white matter lesions mainly occurs in the frontal area [17,18]. Peter et al. [19] demonstrated that atrophy of the auxiliary motor areas, superior frontal gyrus, anterior cingulate cortex and middle temporal lobe is associated with hypertension. In addition, high BP gives rise to atrophy of the medial temporal lobe, which plays an important role in cognitive development [20]. Detection of hypertension-related brain regions is of great value in clinical and academic studies. Those researches above have only studied hypertension brain morphometry. Their subjects consist of normal group and hypertension group whose diagnostic criterion is BP ≥ 140/90 mmHg. And less use automated classification to extract hypertension-related brain regions. Therefore, more studies are needed to further explain the relationship between BP grades and brain morphometry.
In this paper, we examined the hypertension-related brain morphometry in regions of interest (ROIs) using features, which consist of ROI features and brain network features. ROI features were extracted from the brain structural MR images including gray matter volume (GMV), white matter volume (WMV), cerebrospinal fluid volume (CSFV), cortical thickness (Thickness), and cortical surface area (Area). Brain network features were constructed by computing the correlation index of cortical thickness values between ROIs. The two feature types complement each other in revealing neuroanatomical information about hypertension.
Due to the complexity of brain diseases, the use of single information cannot fully represent the disease characteristics in process of the diagnosis. For this reason, comprehensive consideration of multiple information is required. Learning Using Privileged Information (LUPI), a new learning paradigm for classifier proposed by Vapnik and Vashist, can be a good way to solve this problem. The privileged information (PI) is only available during the training phase of model, but unavailable during the testing phase [21]. PI can help establish better prediction rules by providing additional information to training samples. It has become a trend for researchers to embed LUPI paradigm in different classifiers, such as the support vector machine plus (SVM+) and random vector functional link network plus (RVFL+) [22], which usually achieves improved classification performance [21].
The proposed kernel-based ELM+ (KELM+) is developed based on kernel-based RVFL+ (KRVFL+) [22]. ELM and RVFL, two kinds of classifiers based on single-layer feed-forward neural network (SLFN) [23], have received extensive attention in recent years. With high approximation ability, good generalization performance and very fast training time, ELM is widely used for a variety of classification tasks [24]. However, random affine transformation in ELM+ usually causes prediction instability. To this end, we propose a KELM+ algorithm to overcome this problem and improve performance. KRVFL+ outperforms SVM+ on several benchmark datasets [22]. In view of the nuances of ELM and RVFL, we also consider that KELM+ outperforms SVM+ in the network structure.
Empirical kernel mapping (EKM), one of the kernel methods, can map raw data to a high-dimensional data space via the inner-product forms [25], which works as the implicit kernel mapping (IKM) [25]. EKM overcomes the limitations of traditional IKM on inner-product calculation, and can explicitly map samples to feature space. In the meanwhile, it can fully retain the structural characteristics of data [26].
In this study, we proposed an EKM-based KELM+ (EKM-KELM+) method, which can be used to investigate brain structural differences in different grades of BP. Specifically, first EKM performed on six types of feature to generate six enhanced features. Then, one type of feature is selected as the main feature, and the other five features are used as PI, together with the main feature to form five feature pairs, which are built to train five individual KELM+ classifiers. Finally, ensemble learning is performed on the KELM+ classifiers to give the classification result.
The main contributions of the method are twofold: (1) by transforming the original features to high-dimensional to form enhancement features through EKM, EKM-KELM+ has a more meaningful input layer in the neural network, which help improving classification performance; (2) instead of using simple multi-level ROI for mixed feature selection, one soft tissue feature is selected as main feature, and the other five features are used as PI to assist the classifiers training. Only the main feature is used in the testing. The most discriminative brain regions, which have structural changes affected by hypertension, can be found using our method. This can also help us to analyze the changes of specific brain regions in BP from grade 2 to grade 4. Moreover, our study provides a potential approach to take effective interventions in the early period, when the BP has minor impacts on the brain structure and function.

Results
The proposed EKM-KELM+ algorithm is compared with the following algorithms: (1) SVM classifier with Radial Basis Function (RBF) kernel is used for every ROI feature; (2) KELM classifier is used for every ROI feature; (3) KELM+ without EKM.
In this experiment, the fivefold cross-validation (CV) strategy was conducted; for each round of CV, the performance of the model can be calculated separately, which reduces the variance of the evaluation. The classification accuracy (ACC), sensitivity (SEN), specificity (SPC), Youden index (YI), positive predictive value (PPV), negative predictive value (NPV) and F1-score (F1) are used as evaluation indices. Our classification results were presented in the form of mean ± SD. Table 1 gives the classification performance using different feature types between Grade 1 and Grade 2, Grade 1 and Grade 3 and Grade 1 and Grade 4. For Grade 1 and Grade 2; in the comparison of different feature types, the cortical thickness performs worst in all feature types. It is found that the GMV performs observably better than any other volumetric features (i.e., WMV and CSFV) with a comparatively higher classification accuracy of 76.73%, sensitivity of 78.73%, and specificity of 75.14%. Similarly, cortical thickness performs worst and GMV performs best with an accuracy of 93.19%, sensitivity of 93.14%, and specificity of 93.23% in Grade 1 and Grade 3. In Grade 1 and Grade 4 group, GMV has the highest classification accuracy of 95.15%, sensitivity of 97.14%, and specificity of 93.14%, while WMV performs worst.

Classification performance
It can be seen from Table 1 that all the best results are achieved on GMV. It means that the high BP group and the normal BP group have more differences in GMV than in others. On every type of feature, the classification accuracy increases with the increase of BP grade, which indicates that higher BP will aggravate the change of ROI feature. Table 2 gives the classification results of different algorithms on the different feature types. It can be found that the proposed EKM-KELM+ outperforms all the compared algorithms.

Experiment on kernel type
Different kernel function types represent different ways of data mapping. Polynomial kernel, RBF kernel, and linear kernel are mostly used kernel types. In this study, we used RBF kernel and linear kernel. We chose the most suitable kernel function type through experiments to achieve the best classification performance. Classification results of Grade 1 vs. Grade 4, using EKM-KELM+ with different kernel types (RBF kernel or linear kernel of EKM & KELM+) on the GMV feature are shown in Fig. 1. Experimental

Table 1 Classification performance using different feature types between Grade 1 and Grade 2, Grade 1 and Grade 3 and Grade 1 and Grade 4 (mean ± std, UNIT: %)
GMV gray matter volume, WMV white matter volume, CSFV cerebrospinal volume, thickness, cortical thickness, Area cortical surface area, ACC accuracy, SEN sensitivity, SPC specificity, PPV positive predictive value, NPV negative predictive value, YI Youden's index, F1 F1-score results show that the kernel function has an important impact on the performance of the classification. Using RBF kernel for EKM and KELM+ can achieve the best classification performance, which reflects the robustness of our method. The RBF kernel function is commonly used as the kernel functions for the reason that is has good anti-interference ability for noise in the data.

The most discriminative features
The most discriminative features are selected from ROI features and brain network features, respectively. The top 10 of the most discriminative ROI features and brain network features for Grade 2, Grade 3 and Grade 4 compared with Grade 1 are listed in Table 3. For Grade 2 compared with Grade 1, the top 10 of the most discriminative ROI features are mainly distributed in frontal lobe [inferior frontal gyrus (opercular) right, olfactory right], temporal lobe (bilateral superior temporal gyrus, middle temporal gyrus  As for Grade 4, the top 10 of the most discriminative ROI features are found in frontal lobe (superior frontal gyrus (dorsal) left, bilateral orbitofrontal cortex (superior), bilateral orbitofrontal cortex (inferior), bilateral supplementary motor area, inferior frontal gyrus (triangular) left, bilateral middle frontal gyrus, rectus gyrus right), and temporal lobe (bilateral superior temporal gyrus). Figure 2 shows the results of projecting the most discriminative ROI features (top-10) onto the cortical surface. Three connection graphs of the most discriminative brain network features for three groups are shown in Fig. 3 (top-20), which are generated by Circos software [27]. Thicker line in the connection graph indicates stronger connection between ROIs, while thinner line implies weaker connection. The red lines represent brain connections in the same hemisphere, while the gray lines represent brain connections in different hemispheres of the brain. As we can see in lower grade of BP, the most discriminative brain network features are mainly distributed in left hemisphere. As the BP increases, the features will be gradually distributed in the right hemisphere and finally across both the right and left sides of the brain and almost across all brain regions, including frontal lobe, occipital lobe, limbic lobe, parietal lobe, sub-cortical gray nuclei, and central region. Moreover, regions in the bilateral frontal lobes and limbic lobes show close internal relation. That is, the most sensitive biomarkers of hypertension are mainly distributed in frontal lobe and limbic region.

Discussion
In this work, the proposed EKM-KELM+ algorithm can help study the brain structural differences associated with BP grades and achieve effective classification results. Its effectiveness is demonstrated on datasets of different BP grades.

Improvement of the proposed method
Due to the complexity of brain diseases, the use of multiple anatomical MRI measures can provide more information to help research the disease. Although the proposed EKM-KELM+ algorithm is based on the LUPI paradigm that required additional modality for PI in previous work, we successfully performed EKM-KELM+ on multiparameter information of single-modality neuroimaging data in this work. In fact, GMV, WMV, CSFV, thickness and area are extracted from structural brain MRI, brain network features are computed based on cortical thickness between ROIs. During the training phase, the five feature pairs are built to train five individual KELM+ models. While in testing phase, only one type of feature, extracting from structural brain MR images, will be directly fed to the well-trained KELM+ models to give the final classification result, which is flexible and convenient. The use of EKM before KELM+ results in data obtaining a more powerful expression, which improves the classification performance. A well-classified performance and discriminative features reported in our study are important in clinical studies. By using our model, we can classify hypertension patients as with and without structural brain changes. Clinicians can give the targeted recommendations for initiation of treatment for these two types of patients. It conforms more with the principles of hypertension treatment.
The current studies on hypertension are all in the population with SBP ≥ 140 mmHg or DBP ≥ 90 mmHg (Grade 4), to find specific brain regions related to hypertension. However, these studies have some shortcomings. They only explain the relationship between hypertension and the relevant brain regions in a general way, which has not considered the network activity of specific brain regions. We have fixed the deficiency of these existing methods by using quantitative analysis. This can provide information of both isolated ROI and brain connectivity between pairs ROIs, and help us understand the change pattern of brain morphological in different BP grades.

Analysis of discriminative ROIs
We performed t test between different groups and counted the number of ROIs with significant changes (p value < 0.05) of each feature type. Figure 2 shows the results of projecting the most discriminative ROI features (top 10) onto the volumetric and cortical. The GMV, cortical thickness, and surface area encoded by the color from yellow (larger, thicker) to red (smaller, thinner).
For all groups, the most discriminative ROI features include GMV, WMV, CSFV, Thickness, and Area. The most conspicuous regions of GMV reduction are found in frontal lobe, limbic lobe, temporal lobe, parietal lobe, central region, and occipital lobe. The most obvious regions of WMV reduction are in frontal lobe, parietal lobe, occipital lobe, sub-cortical gray nuclei, and limbic lobe. The most evident regions of Thickness volume reduction are frontal lobe, occipital lobe, limbic lobe, parietal lobe, and temporal lobe. The higher the BP, the more reduction of brain tissue occurred. In insula and sub-cortical gray nuclei, the CSFV has positive correlation with the increase of BP. All critical regions are known to be strongly involved in the pathophysiological mechanisms of hypertension.

Comparison with other methods
Studies have shown that high SBP, high DBP and persistent high BP will lead to cognitive impairment [28]. Morphological studies have shown that different cognitive dysfunction manifestations (such as overall cognitive function, executive ability, memory impairment) are associated with structural changes in specific brain regions. Researchers [29] found that hypertension patients showed atrophy of the prefrontal and hippocampus, while the prefrontal cortex was closely related to executive ability, emotional processing ability, and social cognition. Blood flow in the posterior parietal region of hypertensive patients increased less than that of nonhypertensive patients when they completed the memory task, which indicates that hypertension may damage cognitive function by reducing local cerebral blood flow [30]. Elevated BP is associated with more executive function impairment than memory, which shows a significant decrease compared with the executive function of the non-hypertensive group [31]. Functional magnetic resonance imaging (fMRI) and diffusion tensor imaging (DTI) on 1007 elderly populations (including 405 hypertensive patients) are used to find that impaired executive function and decreased attention caused by hypertension may be associated with decreased white matter integrity and decreased functional connectivity of the frontotemporal lobe. In addition, cortical gray matter atrophy is closely related to executive dysfunction [32]. Hypertension can also cause atrophy of the medial temporal lobe, which plays an important role in cognitive formation [20].
Since there have been few reports on the automatic classification of hypertension grades, we only compared the brain regions that are differentiated in our results with existing hypertension-related morphological studies. Our results also examined the frontal lobe (bilateral orbitofrontal cortex (superior), superior frontal gyrus (dorsal) left, rectus gyrus right), temporal lobe (bilateral superior temporal gyrus, middle temporal gyrus left), central region (rolandic operculum right), insula right, limbic lobe (hippocampus), sub-cortical gray nuclei (thalamus), and parietal lobe (precuneus right) associated with elevated BP. It is consistent with current morphological studies, demonstrating the effectiveness of our classification method in revealing hypertension-related brains. Meanwhile, the central region and insula, which have not been reported in previous hypertension-related studies, were found in our study. Further research is needed to rule out false positives in our results. It can be found that the discriminative ROIs are mostly located in frontal lobe, which is mainly responsible for planning, sequencing and organizing attention, moral judgment and self-control behaviors. This is consistent with the fact that high blood pressure can cause cognitive damages.

Limitations
Despite the excellent classification performance, our method still has some limitations. First, as a pilot study, we use a relatively small amount of data during machine learning. Second, since our study is based on a universality, the age of subjects is not limited to a specific range. We can take these elements into consideration for further improving the experiment in the future research.

Conclusion
In summary, the proposed Empirical Kernel Mapping-Based Kernel ELM+ framework can be used in studying the changes of brain structure associated with blood pressure by a quantitative way. One type of feature is used as the main feature, and other different feature types are used as PI. Finally, the result is obtained by ensemble learning. Compared with other algorithms, our method has the best classification accuracy, which can provide more accurate early intervention identification methods and potential guiding significance for the treatment of hypertension patients. The ROI features and the brain network features can be used to locate specific brain regions that process hypertension. The discriminative features selection by EKM-KELM+ is consistent with existing structural studies. Moreover, our study provides an important step in investigating brain structure and brain connective changes associated with hypertension, which offers a potential research direction to further study the mechanisms basis of the cognitive neuroscience of hypertension.

Participants
The structural MRI data utilized in this study were obtained from the Suzhou Science and Technology town hospital that consist 292 adults, aged from 25 to 76 years. The study is approved by the Ethics Committee of the Third Affiliated Hospital of Soochow University. According to the "2017 American College of Cardiology/American Heart Association (2017 ACC/AHA) Guideline for the Prevention, Detection, Evaluation, and Management of High Blood Pressure in Adults", we classified the data as four grades: Grade 1, Grade 2, Grade 3, and Grade 4 (more details in Table 4). Each grade includes 73 subjects. Each participant received a structured clinical interview by a psychiatrist to rule out smoking, secondary hypertension, traumatic head injury, diabetes, and congestive heart failure or pulmonary disease. Characteristics of all subjects are shown in Table 5. All images were collected on a Ingenia 3.0T PHILIPS Medical Systems equipment with a standard head coil. The scanning parameters are as follows: repetition time (TR) = 7.90 ms, echo time (TE) = 3.50 ms, flip angle (FA) = 8°, slice thickness = 1 mm, field of view (FOV) = 250 mm and voxel dimensions 1.0 mm isotropic.

Image process
All structural brain MR images were processed using BrainLab software [33], running automatically on Linux platform: (1) the original brain MR images were re-sampled in terms of direction, voxel size and volume according to right-hand rules. N3 bias field correction is to eliminate intensity non-uniformity [34].
(3) Level-set-based tissue segmentation algorithm [36] was used to separate GMV, WMV, CSFV, and background by limiting thickness to a biologically reasonable range with 1-6.5 mm. (4) Then, the tissue segmented images are registered to the brain atlas using a non-rigid matching algorithms derived from a concept of diffusing models [37]. The brain atlas is based on the Automated Anatomical Labeling (AAL) template with 45 labeled ROIs for each hemisphere [38]. (5) A deformable surface method accurately reconstructs inner, central, and outer cortical surfaces [39]. (6) ROI volume and cortical thickness were measured, respectively, according to the amount of voxels. Finally, we obtained 90 cortical ROIs [40]. We computed the GMV, WMV, CSFV, Thickness, and Area for each ROI.

Feature extraction and selection
Two types of features are used in this paper: ROI features and brain network features. The ROI features are extracted from the brain structural MR images including GMV, WMV, CSFV, Thickness and Area. Considering individual differences, the GMV, WMV, CSFV of each ROI are normalized according to the total brain volume of each subject [41], and the cortical thickness and cortical surface area of each ROI are normalized according to the standard deviation and the total cortical surface area of each subject.
Brain network features have been widely used in recent years for neuroimaging-based analysis of brain disease. The brain network features consist of Pearson correlation coefficient which are computed based on cortical thickness between ROIs. Because subcortical regions are not researched in this study, we neglected 12 sub-cortical ROIs of 90 cortical ROIs in the calculation [35], and finally got the 78 × 78 correlation matrix. The upper triangular elements of the matrix are used to construct the feature vector (3003-dimensional) for each subject.
Furthermore, statistical t test is first adopted to select the features with their p values less than 0.05. Then, on the basis of t test, mutual information method is further used to reduce feature dimensionality and improve feature representation. After the two feature selection steps, we obtained the optimal feature subsets for each feature type, respectively.

Classification
We proposed empirical kernel mapping-based kernel extreme learning machine plus (EKM-KELM+) classifier for classification. The EKM-KELM+ algorithm has 5 parts: ROI features and brain network features, feature selection (FS), features after FS, EKM, and KELM+ classifiers. FS is used for feature reduction. EKM solves the problem of data linear indivisibility and improves the performance of classifier. KELM+ is for classification. Ensemble learning is used to get the final classification label by voting on 5 classification results. In the following parts, we will further elaborate the algorithm. Figure 4 shows the flowchart of the proposed EKM-KELM+ algorithm with the following steps (GMV as the main feature as an example):

Empirical kernel mapping-based KELM+
1. Six kinds of features are extracted from the brain MR images after image preprocessing, and feature selection is performed, respectively, to obtain optimal feature subsets. 2. EKM is then performed on six optimal feature subsets to generate six new enhanced feature subsets. 3. The enhanced feature subsets are then sent to KELM+ classifier. During the training stage, GMV is selected as the main feature sending to 5 KELM+ classifiers (KELM + 1 -KELM+ 5 ). The other five features (CSFV, WMV, Thickness, Area and brain network feature) are used as privileged information sending to KELM+1-5, respectively, which provide additional information for the main feature GMV to train 5 KELM+ classifiers. 4. The ensemble learning algorithm is finally applied to the 5 KELM+ classifiers for classification. In this work, the final classification label is decided by voting on 5 classification results.  5. During the testing stage, the GMV features extracted from structural MR images will be directly input to the 5 KELM+ classifiers (in the purple box), which then give the final classification result with the ensemble learning algorithm.

Empirical kernel mapping
The EKM algorithm maps original data to a given empirical feature space incrementally with explicit feature representation. Here is a brief introduction to EKM [42].
be a d-dimensional training samples set. The input samples space is mapped to an r-dimensional empirical feature space is by a particular kernel function Φ e . The kernel mapping of paired x i and x j is calculated as follows: where ker(·, ·) is a particular kernel function, leading to a kernel matrix K = (K i,j ) m×m , and K is a symmetrical positive semi-definite matrix with size of m × m . K can be decomposed as where Λ is a diagonal matrix containing r positive eigenvalues of K in decreasing order, and P consists of the eigenvectors corresponding to the positive eigenvalues.
The EKM to an r-dimension Euclidean space Φ e r is then can be given as Thus a sample x can be mapped into empirical feature space incrementally with Φ e r (x).

KELM
The ELM performs a classification decision by nonlinearly expanding the original features (enhancement nodes) through a single hidden layer [43]. In ELM, the output weight β can be calculated by ridge regression as where T is a label matrix, C is the regularization parameter, which represents the tradeoff between the minimization of training errors and the maximization of the marginal distance and H is the enhanced matrix.
To overcome the problem of randomness in ELM, the kernel trick is then introduced into ELM as shown in Fig. 4. For KELM [23], we define the kernel matrices as where K is a linear kernel function and K represents a nonlinear kernel function.
The output of KELM is then given by  with the output weights calculated by the ridge regression as KELM+ ELM+ successfully integrates the LUPI paradigm to ELM, which has simpler optimization constraint than the commonly used SVM+. Define a set of training data {(x i , P i , t i ) |x i ∈ R d 1 , P i ∈ R d 2 , t i ∈ R m , i = 1 . . . n} , where {P i ∈ R d 2 , i = 1 . . . n} is a set of PI. In LUPI paradigm, ELM+ is formulated as where ɛ is a regularization coefficient, h(x i ) and h (P i ) are concatenated vector, and β is an output weight vector in the privileged feature space.
The Lagrangian function is then constructed to solve the optimization problem in Eq. (8) by where = [ 1 , . . . , n ] T are Lagrange multipliers.
After using the Karush-Kuhn-Tucker (KKT) condition to calculate the saddle points of the Lagrangian function, we have By substituting Eqs. (10) and (11) into (12), we have After combining Eqs. (10) and (13), the closed-form solution to the ELM+ is given by (6) f . . . h(x k )β − t k +h(P k )β , Moreover, 1 C is added to Eq. (13) so as to avoid singularity and guarantee the stability for ELM+, which leads to the following closed-form solution: The output function of the ELM+ is defined as Although ELM+ can implement the LUPI-based classification task, it also suffers from the same problem of randomness as ELM. Therefore, the kernel-based ELM+ algorithm is then proposed.
For the KELM+, we define the kernel matrices with same structure as Eqs. (4) and (5), the output weight vector is then given by The output of KELM+ is finally calculated as For multiclass cases, the predicted class label of a testing point is the index number of the output node, which has the highest output value for the given testing samples