Improving cardiovascular risk prediction with machine learning: a focus on perivascular adipose tissue characteristics

Abstract

Background

Timely prevention of major adverse cardiovascular events (MACEs) is imperative for reducing cardiovascular disease-related mortality. Perivascular adipose tissue (PVAT), the adipose tissue surrounding coronary arteries, has attracted increasing attention. A machine learning (ML) model that integrates clinical and PVAT features to predict the incidence of MACEs may facilitate targeted preventive interventions and improve patient outcomes.

Methods

From January 2017 to December 2019, we analyzed a cohort of 1077 individuals who underwent coronary CT scanning at our facility. Clinical features were collected alongside imaging features, such as coronary artery calcium (CAC) scores and perivascular adipose tissue (PVAT) characteristics. Logistic regression (LR), Framingham Risk Score, and ML algorithms were employed for MACE prediction.

Results

We selected seven key features to keep the model practical. Patients with MACEs tended to be older, smokers, and hypertensive. Imaging biomarkers such as CAC scores and PVAT characteristics differed significantly between patients with and without a MACE within 3 years, in a population whose laboratory results showed no such differences. The ensemble model, which leverages multiple ML algorithms, demonstrated superior predictive performance compared with the other models. Finally, the ensemble model was used for risk stratification to explore its clinical application value.

Conclusions

The developed ensemble model effectively predicted MACE incidence based on clinical and imaging features, highlighting the potential of ML algorithms in cardiovascular risk prediction and personalized medicine. Early identification of high-risk patients may facilitate targeted preventive interventions and improve patient outcomes.

Introduction

Cardiovascular diseases (CVDs), including myocardial infarction, acute coronary syndrome and stroke, are the primary contributors to mortality on a global scale. Early prevention and treatment of major adverse cardiovascular events (MACEs) are crucial for reducing the morbidity and mortality associated with CVDs [1]. Computed tomography angiography (CTA) has emerged as a noninvasive tool for screening major cardiovascular events, providing valuable insights into atherosclerosis and risk prediction [2].

Machine learning (ML), a branch of computer science, uses algorithms that incorporate numerous variables and identify their nonlinear relationships and complex interactions [3]. ML algorithms have been explored for their ability to predict heart failure [4], arrhythmia [5], postsurgical mortality [6], and composite CVD risk [7]. A prior investigation demonstrated that ML outperformed logistic regression (LR), a traditional statistical technique, in forecasting patient outcomes [8]. The reason may stem from the inherent limitations of LR, which is known to be susceptible to multicollinearity. While some ML models, such as linear discriminant analysis (LDA) and the linear support vector machine (SVM), are limited in capturing nonlinear relationships, others such as adaptive boosting (AdaBoost) and ensemble methods are effective in modeling complex interactions among features. Additionally, multicollinearity, which poses challenges for traditional models such as LR and LDA, is better managed by advanced ML models.

In the field of MACE prediction, the use of ML could improve CVD risk prediction by agnostically discovering new predictors of risk and learning about their complex interactions [9]. Previous studies have integrated myocardial perfusion imaging into the prediction of MACEs using ML techniques, demonstrating a moderate level of accuracy with area under the curve (AUC) values ranging from 0.73 to 0.79 [10, 11]. However, the clinical utilization of myocardial perfusion imaging remains limited. Recently, perivascular adipose tissue (PVAT), the adipose tissue surrounding the coronary artery, has attracted increasing attention due to its hypothesized role in the pathogenesis of CVDs [12]. Using CTA images, PVAT characteristics can potentially be evaluated to determine whether individuals are at increased risk of adverse cardiovascular outcomes [13, 14]. In addition, ensemble learning has been proposed because it is more effective and generalizes better than single algorithms [15]. Coupling PVAT imaging features with ML algorithms may hold promise for improving cardiovascular prevention by enhancing early detection capabilities [16]. However, the effectiveness of ensemble ML algorithms that use PVAT features to identify patients at risk of MACEs remains uncertain.

Hence, we developed a novel ensemble model to predict MACE incidence utilizing ML algorithms. Using a stacking algorithm, we established an ensemble model that incorporates both clinical and imaging features. The model's ability to stratify MACE risk may facilitate optimal preventive treatment in the future.

Methods

Participants and data collection

From January 2017 to December 2019, we analyzed a cohort of 1077 individuals who underwent coronary CT scanning at our facility. The exclusion criteria were as follows: (1) coronary artery stenosis greater than 20% in at least one major coronary vessel; (2) history of stroke, serious heart disease or tumor; (3) incomplete clinical and imaging data; and (4) loss to follow-up. After exclusion, 228 subjects were included. Figure 1 illustrates the flowchart of the subject selection.

Fig. 1

Patient selection flowchart

The baseline features of the subjects, including age and sex, were collected. Risk factors, including hypertension, smoking history, alcohol consumption, diabetes, body mass index (BMI) and laboratory findings, were also recorded. The BMI was calculated as weight (kg) divided by squared height (m²). The TyG index was calculated from fasting triglyceride (TG) and fasting blood glucose (FBG). Hypertension was defined as a systolic blood pressure (SBP) higher than 140 mmHg or a diastolic blood pressure (DBP) higher than 90 mmHg on two separate occasions, or the use of antihypertensive medication. Diabetes was defined as an FBG ≥ 126 mg/dL or the use of any antidiabetic medication.
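The derived variables and definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the function names are hypothetical, and the TyG formula shown, ln(TG × FBG / 2) with both values in mg/dL, is the commonly used form and is assumed here because the text does not spell out the equation.

```python
import math


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by squared height (m^2)."""
    return weight_kg / height_m ** 2


def tyg_index(tg_mg_dl: float, fbg_mg_dl: float) -> float:
    """TyG index. ASSUMPTION: the common form ln(TG * FBG / 2), both in mg/dL."""
    return math.log(tg_mg_dl * fbg_mg_dl / 2)


def has_hypertension(sbp_readings, dbp_readings, on_medication=False) -> bool:
    """SBP > 140 mmHg or DBP > 90 mmHg on two separate occasions, or treated."""
    elevated = sum(
        1 for sbp, dbp in zip(sbp_readings, dbp_readings) if sbp > 140 or dbp > 90
    )
    return on_medication or elevated >= 2


def has_diabetes(fbg_mg_dl: float, on_medication=False) -> bool:
    """FBG >= 126 mg/dL, or receiving any antidiabetic medication."""
    return on_medication or fbg_mg_dl >= 126
```

Each reading pair passed to `has_hypertension` is taken to represent one separate occasion, which is an interpretation of the definition rather than something the text specifies.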

Follow-up and end points

Follow-ups through clinical visits were conducted for up to three years. The endpoint was the occurrence of MACEs [17], such as myocardial infarction, unstable angina, or stroke, which were confirmed through clinical presentation, imaging data, or other pertinent data collected during each follow-up session and diagnosed by specialized cardiology physicians or neurologists.

CTA multiparameter analysis

The CTA scan was performed on the Siemens SOMATOM Definition AS (Siemens Medical Systems, Erlangen, Germany). The acquired images were postprocessed by Deepwise Coronary Artery Analysis Software (Deepwise Inc., Beijing, China). The coronary artery calcium (CAC) scores were calculated based on a sequence within the CTA examination. Prior to the injection of the contrast agent, a sequence was scanned during diastole using prospective gated acquisition. This non-contrast sequence was used to extract the CAC scores, which were automatically calculated by the software and confirmed manually.

The segmentation of PVAT was performed by manual adjustment on the basis of automatic segmentation. Specifically, if non-adipose tissues surrounding the coronary artery were included in the region of interest during automatic segmentation, the radiologist removed these tissues manually to ensure that only PVAT was included.

For the analysis of PVAT, the regions of interest (ROIs) were automatically tracked by the software based on the fat threshold (from −190 to −30 HU). The ROI extended from 10 to 40 mm proximal to the coronary artery, with the radial distance of the ROI equal to the diameter of the coronary artery. Two experienced radiologists confirmed the results of the segmentation. The fat attenuation index (FAI) and fat volume (FV) of PCAT were then generated automatically by the software. CAC scores, PCAT-FAI, and PCAT-FV were recorded as coronary artery imaging features. PVAT was visualized with an adipose tissue Hounsfield unit color table.

Model development

Feature selection

The features screened for model development included both clinical features and coronary artery imaging features. First, subjects with 80% or more missing values were excluded; these comprised 186 subjects with a high percentage of missing values and 24 subjects with missing critical images. The remaining missing values were then imputed by multiple imputation. After the continuous variables were normalized using z score normalization, 23 features were considered for selection. Least absolute shrinkage and selection operator (LASSO) regression was employed for feature selection to automatically remove redundant and irrelevant collinear features. The correlation between the selected features was determined by Spearman's correlation coefficient. Finally, the most predictive features were confirmed for further model development.
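Two of the steps in this paragraph, z score normalization and Spearman's correlation, are simple enough to sketch with the Python standard library. The sketch below is illustrative only: the function names are hypothetical, the population standard deviation is assumed for the z score (the text does not say which variant was used), and the LASSO step itself is not shown.

```python
from statistics import mean, pstdev


def z_score(values):
    """Center to mean 0 and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because Spearman's coefficient depends only on ranks, it is unaffected by the z score step, so the two operations can be applied in either order.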

ML algorithm

We employed five ML algorithms to predict MACEs in our study based on the selected features. In addition to four commonly used models, namely, AdaBoost, Gaussian naive Bayes (GNB), LDA, and SVM [18], we introduced a stacking ensemble model. The ensemble model was built by stacking the classification probabilities output by the other four ML models. Compared with other ML models, this approach can enhance prediction accuracy by combining diverse models, reducing the possibility of overfitting, and improving model stability and generalizability [19]. In addition, a traditional LR model was developed. Significant variables identified in the univariate analysis were incorporated into the multivariable LR analysis.
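The stacking design described above can be sketched with scikit-learn. This is a minimal sketch under several assumptions, not the authors' implementation: the text does not name the ML software, the data here are synthetic stand-ins rather than the study cohort, the logistic regression meta-learner is an illustrative choice, and all hyperparameters are library defaults.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import AdaBoostClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Synthetic stand-in with the same shape as the study (228 subjects, 7 features).
X, y = make_classification(
    n_samples=228, n_features=7, n_informative=5, n_redundant=0, random_state=0
)

# Level-0 base models: the four single algorithms named in the text.
base_models = [
    ("adaboost", AdaBoostClassifier(random_state=0)),
    ("gnb", GaussianNB()),
    ("lda", LinearDiscriminantAnalysis()),
    ("svm", SVC(kernel="linear", probability=True, random_state=0)),
]

# Level-1 model trained on the base models' class probabilities.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(),  # ASSUMED meta-learner
    stack_method="predict_proba",
    cv=5,
)

# Fivefold cross-validated AUC with a fixed random seed.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
auc_scores = cross_val_score(stack, X, y, cv=cv, scoring="roc_auc")
mean_auc = auc_scores.mean()
```

Setting `stack_method="predict_proba"` mirrors the description that classification probabilities, rather than hard labels, are passed to the second-level model.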

Model fitting

A fivefold cross-validation method was used to assign the training and validation cohorts using the same random seed across all splits to ensure consistency in groupings. Herein, based on the occurrence of MACEs within the first three years of initial examination, the MACE prediction task was conceptualized as a binary classification problem, with all machine learning models generating a normalized risk probability within the range of 0 to 1. The combination of hyperparameters that yielded the best average AUC was selected for model fitting.
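The fold assignment and AUC-based selection described above can be illustrated with the standard library alone. The sketch below is hypothetical: the seed value, the function names and the example hyperparameter labels are assumptions, and the folds are not stratified by outcome, which the text does not specify either way.

```python
import random


def kfold_indices(n_samples, k=5, seed=42):
    """Shuffle indices once with a fixed seed, then deal them into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def auc(labels, scores):
    """AUC as the probability that a positive case outranks a negative one.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_setting(mean_auc_by_setting):
    """Return the hyperparameter setting with the highest average AUC."""
    return max(mean_auc_by_setting, key=mean_auc_by_setting.get)
```

Reusing the same seed for every model means each algorithm is trained and validated on identical partitions, so differences in AUC reflect the models rather than the split.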

Model evaluation

Model performance was evaluated using the receiver operating characteristic (ROC) curves. Calibration curves were employed for qualitative evaluation of model calibration, while decision curve analysis (DCA) was utilized to assess the utility of the models across various threshold probabilities. Confusion matrices were employed for comparative analysis of model performances. Furthermore, model efficiency metrics were computed to quantitatively assess the predictive capabilities of each model.
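For reference, the efficiency metrics and the DCA net benefit can all be derived from the four cells of a confusion matrix. The sketch below uses standard definitions; the function names and the counts in the usage note are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts.

    Precision is the same quantity as PPV, and recall is the same as sensitivity.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": ppv,
        "recall": sensitivity,
        "f1": f1,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
    }


def net_benefit(tp, fp, n, threshold):
    """Decision-curve net benefit at a given threshold probability."""
    return tp / n - (fp / n) * threshold / (1 - threshold)
```

With hypothetical counts of 40 true positives, 10 false positives, 90 true negatives and 10 false negatives, `classification_metrics(40, 10, 90, 10)` gives a sensitivity of 0.8 and a specificity of 0.9, and `net_benefit(40, 10, 150, 0.2)` evaluates to 0.25.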

Additionally, to further validate the model's clinical applicability, the model with the best performance was used to predict 3-year MACE incidence using Kaplan–Meier survival curves. By applying the cut-off values determined by the X-tile software [20], the subjects were divided into high, middle, and low ML risk groups. The overall log-rank test was applied to compare the differences between Kaplan–Meier survival curves. Figure 2 shows the study flowchart.
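The Kaplan–Meier estimate underlying these curves is a running product over event times. The following is a minimal standard-library sketch of the estimator for a single group; the function name and the follow-up data in the usage note are hypothetical, and the log-rank comparison between groups is not shown.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate for one group.

    times:  follow-up time for each subject (for example, months)
    events: 1 if a MACE occurred at that time, 0 if the subject was censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects who leave the risk set at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve
```

For six hypothetical subjects with times `[6, 12, 12, 24, 36, 36]` and events `[1, 1, 0, 1, 0, 0]`, the estimate steps down to 5/6 at 6 months, 2/3 at 12 months and 4/9 at 24 months; censored subjects lower the number at risk without producing a step.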

Fig. 2

Schematic of the workflow of the study

Statistical analysis

SPSS (version 26.0), R software (version 4.0.2), X-tile (version 3.6.1), and MedCalc (version 19.5.6) were used to analyze the data. Variables were compared between patients with and without MACE based on the results of the follow-up. Normality was assessed with the Kolmogorov–Smirnov test. Continuous variables are expressed as the mean ± standard deviation for normal distributions and as the median for nonnormal distributions. Categorical variables are expressed as frequency counts and percentages. The t test and the Chi-square test were used to analyze differences between groups. The AUCs of the different models were compared using DeLong tests. Two-tailed P values no greater than 0.05 were considered statistically significant. We set the P-value threshold at 0.05 because it is a widely accepted standard in many scientific fields and balances statistical rigor with the practical significance of our findings.

Results

Clinical features

Out of the initial dataset of 1077 patients, 210 patients were excluded due to a high proportion of missing values or missing critical images. A bias analysis was conducted to compare the demographic characteristics of patients with complete data with those of patients with missing data. The results indicated no significant differences between the two groups, suggesting that the missing data were likely random and did not introduce systematic biases (Supplementary materials, Table S1).

Our analysis included 228 participants with a mean age of 62.58 ± 11.63 years, of whom 146 (64.0%) were men. Regarding factors associated with arteriosclerosis, 56.6% of patients had hypertension, 25.9% were current smokers, 27.2% currently consumed alcohol, and 28.9% had diabetes.

Patients who experienced MACEs were older (68.13 ± 10.23 vs. 60.26 ± 11.42 years, P < 0.001) compared to those who did not experience MACEs. Besides, there was a higher proportion of males (74.6% vs. 59.6%, P = 0.032) among patients who experienced MACEs compared to those who did not. Furthermore, individuals in the MACE group were more inclined to engage in unhealthy lifestyle behaviors (smoking and alcohol use, 35.8% vs. 21.7%, P = 0.027 and 38.8% vs. 22.4%, P = 0.011, respectively), and the prevalence of hypertension was notably higher in the MACE group compared to the non-MACE group (76.1% vs. 48.4%, P < 0.001). The clinical characteristics of the patients are outlined in Table 1, and are in line with known literature [17].

Table 1 Clinical features of the study participants

Coronary artery imaging features

As presented in Table 2, compared with patients without MACE, those with MACE outcomes had a much greater CAC score (78.76 [5.43–371.10] vs. 0 [0–47.38], P < 0.001). With regard to PVAT features, the median PVAT-FAI was greater in the MACE group than in the non-MACE group (−71.56 [−76.87 to −66.51] vs. −80.58 [−87.34 to −75.05] HU for the RCA, −65.25 [−71.70 to −61.38] vs. −71.94 [−76.87 to −66.67] HU for the LCX, and −67.68 [−72.59 to −63.09] vs. −76.17 [−81.37 to −71.31] HU for the LAD, respectively, all P < 0.001). In addition, the PVAT volume was smaller in the MACE group (2.13 [1.62–2.94] mm³ for the RCA, 1.27 [0.95–1.65] mm³ for the LCX, and 1.65 [1.32–2.34] mm³ for the LAD) than in the non-MACE group (2.84 [2.38–3.37] mm³ for the RCA, 1.59 [1.19–1.94] mm³ for the LCX, and 2.24 [1.86–2.82] mm³ for the LAD, all P < 0.001). Figure 3 shows typical CTA images obtained during the analysis.

Table 2 Coronary artery imaging features of the study participants
Fig. 3

PVAT analysis of the RCA. A Longitudinal view of RCA PVAT measurements. B Cross-sectional view of RCA PVAT measurements. C Curved multiplane review of RCA PVAT measurements. An adipose tissue HU color table is shown with a color bar to visualize PVAT. Based on the fat threshold (from − 190 to − 30 HU), the ROI was automatically tracked using the software. The length of the ROI is 10–40 mm proximal to the coronary artery. The radial distance of the ROI is equal to the diameter of the coronary artery. The PVAT-FAI was − 109 HU. The PVAT volume was the total volume of adipose voxels within the ROI. RCA, right coronary artery; PVAT, perivascular adipose tissue; HU, Hounsfield unit; ROI, region of interest; FAI, fat attenuation index

Development and performance of models

Irrelevant or redundant features were eliminated using LASSO. Seven features were ultimately chosen for ML model development, including 3 clinical features (age, hypertension status, and smoking status) and 4 coronary artery imaging features (CAC scores, PCAT-FAI_RCA, PCAT-FAI_LAD, and PCAT-FV_RCA). Feature importance was determined using LASSO regression (Supplementary materials, Figure S1). LASSO was used for all the ML models because a model-agnostic feature selection process was needed that could be applied consistently across different ML algorithms, ensuring a fair comparison. Correlations among the selected features were evaluated using Spearman’s correlation coefficients (Supplementary materials, Figure S2). Additionally, in the LR analysis, variables that were statistically significant in the univariate analysis were subsequently incorporated into the multivariate logistic regression analysis, which is a common approach to developing a classical LR model. Age, hypertension, CAC scores, PCAT-FAI_LAD, and PCAT-FAI_RCA were independent risk factors for MACE prediction (Supplementary materials, Table S2).

As AdaBoost inherently calculates feature importance during its training process, a supplementary analysis was conducted using AdaBoost for feature selection. There was no significant difference in the AUCs between LASSO and AdaBoost selections (Supplementary materials, Table S3). In addition, a supplementary analysis using LASSO for LR feature selection was conducted. No significant difference in the AUCs was found between LASSO and univariate-multivariate analysis (Supplementary materials, Table S4).

We compared the performance of the ML models with that of classical LR and a well-established model, the Framingham Risk Score [21]. Table 3 summarizes commonly used metrics for the model evaluation. Among LR, the Framingham Risk Score, AdaBoost, GNB, LDA, SVM, and the ensemble algorithm, the ensemble algorithm exhibited the best predictive performance metrics. Specifically, the ensemble algorithm demonstrated an AUC of 0.94, accuracy of 0.87, precision of 0.74, recall of 0.87, F1-score of 0.80, sensitivity of 0.87, specificity of 0.88, PPV of 0.74, and NPV of 0.94 in the training cohort. In the validation cohort, the ensemble algorithm maintained strong performance with an AUC of 0.93, accuracy of 0.88, precision of 0.77, recall of 0.85, F1-score of 0.81, sensitivity of 0.85, specificity of 0.89, PPV of 0.77, and NPV of 0.94 (Fig. 4 and Figure S3). The model parameters and settings are provided in detail in Supplementary Appendix S1. The formulas of the ensemble model are available in Supplementary Appendix S2.

Table 3 Performance comparisons of logistic regression model and machine learning models
Fig. 4

Heatmap comparing the performance of different ML algorithm models in the training cohort (A) and validation cohort (B). AUC, area under the curve; AdaBoost, adaptive boosting; GNB, Gaussian naive Bayes; LDA, linear discriminant analysis; SVM, linear support vector machine; PPV, positive predictive value; NPV, negative predictive value

As an additional demonstration of the effectiveness of the models, the ROC curves are plotted in Fig. 5A and B. Although the AUC of the ensemble model did not exhibit the most substantial advantage in the training cohort when compared with all the other models, it demonstrated superior performance in the validation cohort (Table 4). According to the calibration curves, which describe how well the predictions agree with reality, the ensemble model's predicted probability was close to the actual probability (Supplementary materials online, Figure S4). DCA calculates the clinical net benefit to determine whether using a model is beneficial. In our study, the ensemble model demonstrated superior decision-making capabilities when compared with the other four algorithms, as illustrated in Fig. 5C and D. Consequently, the ensemble algorithm was chosen for further model development.

Fig. 5

Performances of different ML algorithm models. Plots showing the ROC curves of different models in the training cohort (A) and validation cohort (B). Decision curve analysis of different models in the training cohort (C) and validation cohort (D). The X-axis represents the threshold probability; the Y-axis represents the net benefit. AUC, area under the curve; AdaBoost, adaptive boosting; GNB, Gaussian naive Bayes; LDA, linear discriminant analysis; SVM, linear support vector machine

Table 4 DeLong tests to compare the AUCs of logistic regression model and machine learning models

Subsequently, the ensemble model was used for risk stratification to investigate its potential clinical utility. Patients were categorized into low-risk, middle-risk, and high-risk groups for the prediction of MACE. The cut-off values separating the three risk groups were 0.14 and 0.92. As presented in Fig. 6, the risk probability differed significantly among the three groups (P < 0.0001). The 3-year MACE risk for the middle-risk group was 10.05 times (95% CI 8.46–23.33) higher than that of the low-risk group. Similarly, the 3-year MACE risk for the high-risk group was 51.21 times (95% CI 22.15–118.42) higher than that of the low-risk group. These findings suggest that the ensemble model could offer more valuable information and thus be more applicable in clinical settings.
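Applying the two cut-off values to a predicted probability amounts to a simple three-way rule, sketched below. The function name is hypothetical, and the handling of probabilities that fall exactly on a cut-off (assigned to the higher group here) is an assumption, since the text does not state which side is inclusive.

```python
LOW_CUTOFF = 0.14   # cut-off between the low-risk and middle-risk groups
HIGH_CUTOFF = 0.92  # cut-off between the middle-risk and high-risk groups


def risk_group(probability, low=LOW_CUTOFF, high=HIGH_CUTOFF):
    """Map an ensemble-model risk probability (0 to 1) to a risk group.

    ASSUMPTION: values equal to a cut-off fall into the higher group.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if probability < low:
        return "low"
    if probability < high:
        return "middle"
    return "high"
```

For example, a predicted probability of 0.05 maps to the low-risk group, 0.50 to the middle-risk group and 0.95 to the high-risk group.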

Fig. 6

Survival curves by risk group for the validation cohort stratified into low-risk, middle-risk and high-risk groups according to the optimal cut-off points

Discussion

Our study developed an ensemble model based on clinical and imaging features to predict MACEs accurately. Older age, current smoking, and hypertension were identified as significant risk factors for MACEs, consistent with previous research. Additionally, patients with MACEs exhibited significantly higher CAC scores and higher median PVAT-FAI values, suggesting that these imaging biomarkers may play a role in risk stratification and prognosis assessment. Compared with the other algorithms, the ensemble model performed best in terms of prediction accuracy.

Seven common features were used to construct the predictive ML models in our study. The features can be classified into three categories: general patient information (age); risk factors (hypertension and smoking); and CTA imaging data (CAC scores, PCAT-FAI_RCA, PCAT-FAI_LAD, and PCAT-FV_RCA). For comparison, the LR model was limited to only five features selected by univariate and multivariate regression analysis. As presented in our study, the inclusion of additional features has the potential to enhance prediction accuracy by capturing more complex relationships and interactions within the data. Furthermore, the incorporation of more features may enhance the model's resilience to data fluctuations and its applicability to novel datasets, thereby mitigating the risk of overfitting. Additionally, laboratory results were not incorporated into the models, possibly because the disease was at an early stage, when laboratory findings do not yet vary significantly. This highlights a strength of our ML-based approach to dimensionality reduction. By preserving features that remain unaffected by laboratory findings, our model demonstrates robustness and relevance, even at early stages of the disease when laboratory variations are minimal. The clinical features observed in our study align with established risk factors for cardiovascular events. Older age and smoking have long been recognized as significant contributors to cardiovascular morbidity and mortality [22, 23]. Moreover, the higher prevalence of hypertension among patients with MACE highlights the continued importance of blood pressure management in mitigating cardiovascular risk [24]. Our findings reinforce the need for targeted interventions aimed at modifiable risk factors, such as smoking and alcohol consumption, to reduce the burden of MACE in high-risk populations [25, 26].

Coronary artery imaging, particularly CTA, can contribute significantly to risk stratification and prognosis. Our analysis revealed distinct patterns in CAC scores and PVAT characteristics between patients with and without a MACE within 3 years, in a population whose laboratory results showed no such differences. The heightened CAC burden observed in the MACE group signifies advanced atherosclerotic processes and plaque vulnerability, consistent with previous studies linking CAC scores to adverse cardiovascular outcomes [27, 28]. Additionally, the alterations in PVAT composition reflected in our findings underscore PVAT’s emerging role as a marker of metabolic disorders and inflammation [29, 30]. It has been reported that increased LAD and RCA PVAT attenuation is associated with an increased CVD risk [31]. In our research, we substantiated the additional predictive significance of PCAT-FAI_RCA, PCAT-FAI_LAD, and PCAT-FV_RCA in patients experiencing MACEs. The comparative analysis of our models with the Framingham Risk Score highlights the added value of incorporating CTA characteristics into cardiovascular risk prediction models. The attenuation of PVAT and alterations in intracellular lipid deposition are attributed to early and persistent inflammation [32]. Thus, PVAT features may provide important complementary information about coronary artery disease [33].

The development of a predictive model integrating clinical and imaging features represents a significant advancement in personalized risk assessment. Recently, ML algorithms have been used to develop regression and classification models for clinical prediction [34, 35]. In this study, we compared LR, a classical statistical method, with ML algorithms including ensemble methods (AdaBoost), probabilistic models (Gaussian naive Bayes), and linear classifiers (LDA and SVM). ML algorithms, particularly those using ensemble techniques, are generally less affected by multicollinearity due to their ability to combine multiple models and their inherent design to handle high-dimensional data. While the traditional LR model proved to be effective in predicting MACEs, the use of ML models, specifically the ensemble model, showcases the potential of data-driven approaches in health care decision support systems. Clinical data are inherently diverse and often suffer from imbalances among classes or outcomes. Although the cross-validation in our study attempts to mitigate this issue by partitioning the data into multiple folds, the data in each fold may still not adequately represent the full spectrum of clinical scenarios. This may explain why the four commonly used ML models appeared to perform worse in the validation cohort in our study [36]. The ensemble model, on the other hand, could mitigate this issue by combining multiple base models to enhance overall performance. In our study, multiple base classifiers were employed as level-0 models. The predictions of these base classifiers were then used as input features for a level-1 metamodel, which generated the final prediction. By aggregating predictions from multiple models, the ensemble model effectively smooths out noise and leads to consistent performance across both training and validation cohorts [37]. The model’s high accuracy, precision, and sensitivity underscore its potential clinical utility in guiding risk management strategies and optimizing patient outcomes [19, 38].

In our study, we utilized fivefold cross-validation to evaluate the efficacy and reliability of our predictive model. Cross-validation involves splitting the dataset into multiple folds and using each fold in turn for validation while the remaining folds are used for training. In each iteration, the model was trained on four folds and validated on one fold. The scores from the five validation folds were averaged to obtain a single representative score, which was used for final model prediction. Afterwards, the performance metrics were derived from the final model. This approach enables more efficient utilization of the available data.

In light of MACEs’ high mortality risk, early detection of MACEs may facilitate risk stratification, clinical decisions, and improved patient outcomes. For instance, implementing stringent lifestyle management practices, such as controlling blood pressure and cessation of smoking, adhering to a nutritious and well-rounded diet, engaging in regular physical activity to decrease overall body fat, and utilizing appropriate medication therapies (such as GLP-1 agonists) that focus on lipid metabolism and diminish fat build-up, may be efficaciously employed in the prevention of MACEs in patients at high risk. GLP-1 agonists, traditionally used for glycemic control in type 2 diabetes, have shown significant cardiovascular benefits. Combining the pharmacological advancements with stringent lifestyle interventions may offer a synergistic approach to significantly mitigate the risk of MACEs in high-risk patients.

Our study has certain limitations. First, since the individuals under study were sourced exclusively from a single medical facility, there is a potential for bias in the evaluation of predictive efficacy. To reduce the likelihood of overfitting and to provide robust estimates of predictive accuracy, we used cross-validation to evaluate the efficacy and reliability of our predictive model. Second, directly comparing the ensemble ML model with individual variables can be challenging due to its inherent complexity and nonlinear relationships. The ensemble model combines the predictions of multiple base ML models, which obscures the contribution of each individual variable to the overall model performance. Third, the performance/validation metrics for the automatic segmentation tool of PVAT were not calculated. However, in our study the automatic segmentation is based on well-established rules of PVAT segmentation [39]. Given that the segmentation is rule-based, there was no need to independently verify its accuracy. Besides, in our research process, there were no instances requiring manual adjustments, which further attests to its robustness and reliability. Finally, although the ensemble model demonstrated promising performance in risk stratification, the translation of the model’s predictions into meaningful clinical decisions and interventions needs to be further explored.

Conclusion

Our study established an ensemble model for the prediction of the occurrence of MACEs using clinical and PVAT features. By integrating clinical data, advanced imaging modalities such as CTA, and machine learning techniques, we advance towards a more nuanced understanding of cardiovascular risk prediction. These findings may improve patient outcomes through early disease prevention and therapeutic interventions.

Availability of data and materials

The datasets used during the current study are available from the corresponding author (CH) on reasonable request.

References

  1. Chen W, Li R, Yin K, Liang J, Li H, Chen X, Sheng Z, Yu H, Mu D. Clinical feasibility of using effective atomic number maps derived from non-contrast spectral computed tomography to identify non-calcified atherosclerotic plaques: a preliminary study. Quant Imag Med Surg. 2022;12:2280–7.

  2. Chen Q, Pan T, Wang YN, Schoepf UJ, Bidwell SL, Qiao H, Feng Y, Xu C, Xu H, Xie G, et al. A coronary CT angiography radiomics model to identify vulnerable plaque and predict cardiovascular events. Radiology. 2023;307:e221693.

  3. Liu W, Laranjo L, Klimis H, Chiang J, Yue J, Marschner S, Quiroz JC, Jorm L, Chow CK. Machine-learning versus traditional approaches for atherosclerotic cardiovascular risk prognostication in primary prevention cohorts: a systematic review and meta-analysis. Eur Heart J Qual Care Clin Outcomes. 2023;9:310–22.

  4. Hamatani Y, Nishi H, Iguchi M, Esato M, Tsuji H, Wada H, Hasegawa K, Ogawa H, Abe M, Fukuda S, Akao M. Machine learning risk prediction for incident heart failure in patients with atrial fibrillation. JACC Asia. 2022;2:706–16.

  5. Li L, Zhang Z, Zhou L, Zhang Z, Xiong Y, Hu Z, Yao Y. Use of machine learning algorithms to predict life-threatening ventricular arrhythmia in sepsis. Eur Heart J Digit Health. 2023;4:245–53.

  6. Cho SM, Austin PC, Ross HJ, Abdel-Qadir H, Chicco D, Tomlinson G, Taheri C, Foroutan F, Lawler PR, Billia F, et al. Machine learning compared with conventional statistical models for predicting myocardial infarction readmission and mortality: a systematic review. Can J Cardiol. 2021;37:1207–14.

  7. Gautam N, Mueller J, Alqaisi O, Gandhi T, Malkawi A, Tarun T, Alturkmani HJ, Zulqarnain MA, Pontone G, Al’Aref SJ. Machine learning in cardiovascular risk prediction and precision preventive approaches. Curr Atheroscler Rep. 2023;25:1069–81.

  8. Leonard G, South C, Balentine C, Porembka M, Mansour J, Wang S, Yopp A, Polanco P, Zeh H, Augustine M. Machine learning improves prediction over logistic regression on resected colon cancer patients. J Surg Res. 2022;275:181–93.

  9. Alaa AM, Bolton T, Di Angelantonio E, Rudd JHF, van der Schaar M. Cardiovascular disease risk prediction using automated machine learning: a prospective study of 423,604 UK biobank participants. PLoS ONE. 2019;14:e0213653.

  10. Singh A, Miller RJH, Otaki Y, Kavanagh P, Hauser MT, Tzolos E, Kwiecinski J, Van Kriekinge S, Wei CC, Sharir T, et al. Direct risk assessment from myocardial perfusion imaging using explainable deep learning. JACC Cardiovasc Imag. 2023;16:209–20.

  11. Rios R, Miller RJH, Hu LH, Otaki Y, Singh A, Diniz M, Sharir T, Einstein AJ, Fish MB, Ruddy TD, et al. Determining a minimum set of variables for machine learning cardiovascular event prediction: results from REFINE SPECT registry. Cardiovasc Res. 2022;118:2152–64.

  12. Rachwalik M, Matusiewicz M, Jasiński M, Hurkacz M. Evaluation of the usefulness of determining the level of selected inflammatory biomarkers and resistin concentration in perivascular adipose tissue and plasma for predicting postoperative atrial fibrillation in patients who underwent myocardial revascularisation. Lipids Health Dis. 2023;22:2.

  13. Kurata A. Deep learning-based CT noise reduction for perivascular adipose tissue evaluation. Acad Radiol. 2024;31:446–7.

  14. Mancio J, Oikonomou EK, Antoniades C. Perivascular adipose tissue and coronary atherosclerosis. Heart. 2018;104:1654–62.

  15. Zhang Z, Liu J, Xi J, Gong Y, Zeng L, Ma P. Derivation and validation of an ensemble model for the prediction of agitation in mechanically ventilated patients maintained under light sedation. Crit Care Med. 2021;49:e279–90.

  16. Oikonomou EK, Williams MC, Kotanidis CP, Desai MY, Marwan M, Antonopoulos AS, Thomas KE, Thomas S, Akoumianakis I, Fan LM, et al. A novel machine learning-derived radiotranscriptomic signature of perivascular fat improves cardiac risk prediction using coronary CT angiography. Eur Heart J. 2019;40:3529–43.

  17. Heo J, Yoo J, Lee H, Lee IH, Kim JS, Park E, Kim YD, Nam HS. Prediction of hidden coronary artery disease using machine learning in patients with acute ischemic stroke. Neurology. 2022;99:e55–65.

  18. Wang K, Tian J, Zheng C, Yang H, Ren J, Liu Y, Han Q, Zhang Y. Interpretable prediction of 3-year all-cause mortality in patients with heart failure caused by coronary heart disease based on machine learning and SHAP. Comput Biol Med. 2021;137:104813.

  19. Zhou J, Hu B, Feng W, Zhang Z, Fu X, Shao H, Wang H, Jin L, Ai S, Ji Y. An ensemble deep learning model for risk stratification of invasive lung adenocarcinoma using thin-slice CT. NPJ Digit Med. 2023;6:119.

  20. Camp RL, Dolled-Filhart M, Rimm DL. X-tile: a new bio-informatics tool for biomarker assessment and outcome-based cut-point optimization. Clin Cancer Res. 2004;10:7252–9.

  21. Atehortúa A, Gkontra P, Camacho M, Diaz O, Bulgheroni M, Simonetti V, Chadeau-Hyam M, Felix JF, Sebert S, Lekadir K. Cardiometabolic risk estimation using exposome data and machine learning. Int J Med Inform. 2023;179:105209.

  22. Jiang ZZ, Zhu JB, Shen HL, Zhao SS, Tang YY, Tang SQ, Liu XT, Jiang TA. A high triglyceride-glucose index value is associated with an increased risk of carotid plaque burden in subjects with prediabetes and new-onset type 2 diabetes: a real-world study. Front Cardiovasc Med. 2022;9:832491.

  23. Gomes D, Wang S, Goodspeed L, Turk KE, Wietecha T, Liu Y, Bornfeldt KE, O’Brien KD, Chait A, den Hartigh LJ. Comparison between genetic and pharmaceutical disruption of Ldlr expression for the development of atherosclerosis. J Lipid Res. 2022;63:100174.

  24. Kuwabara M, Kodama T, Ae R, Kanbay M, Andres-Hernando A, Borghi C, Hisatome I, Lanaspa MA. Update in uric acid, hypertension, and cardiovascular diseases. Hypertens Res. 2023;46:1714–26.

  25. Møller AL, Andersson C. Importance of smoking cessation for cardiovascular risk reduction. Eur Heart J. 2021;42:4154–6.

  26. Day E, Rudd JHF. Alcohol use disorders and the heart. Addiction. 2019;114:1670–8.

  27. Bell KJL, White S, Hassan O, Zhu L, Scott AM, Clark J, Glasziou P. Evaluation of the incremental value of a coronary artery calcium score beyond traditional cardiovascular risk assessment: a systematic review and meta-analysis. JAMA Intern Med. 2022;182:634–42.

  28. Hussain B, Mahmood A, Flynn MG, Alexander T. Coronary artery calcium scoring in asymptomatic patients. HCA Healthc J Med. 2023;4:341–52.

  29. Adachi Y, Ueda K, Nomura S, Ito K, Katoh M, Katagiri M, Yamada S, Hashimoto M, Zhai B, Numata G, et al. Beiging of perivascular adipose tissue regulates its inflammation and vascular remodeling. Nat Commun. 2022;13:5117.

  30. Koenen M, Hill MA, Cohen P, Sowers JR. Obesity, adipose tissue and vascular dysfunction. Circ Res. 2021;128:951–68.

  31. Oikonomou EK, Marwan M, Desai MY, Mancio J, Alashi A, Hutt Centeno E, Thomas S, Herdman L, Kotanidis CP, Thomas KE, et al. Non-invasive detection of coronary inflammation using computed tomography and prediction of residual cardiovascular risk (the CRISP CT study): a post-hoc analysis of prospective outcome data. Lancet. 2018;392:929–39.

  32. Antoniades C, Antonopoulos AS, Deanfield J. Imaging residual inflammatory cardiovascular risk. Eur Heart J. 2020;41:748–58.

  33. Ichikawa K, Miyoshi T, Osawa K, Nakashima M, Miki T, Nishihara T, Toda H, Yoshida M, Ito H. High pericoronary adipose tissue attenuation on computed tomography angiography predicts cardiovascular events in patients with type 2 diabetes mellitus: post-hoc analysis from a prospective cohort study. Cardiovasc Diabetol. 2022;21:44.

  34. Silva GFS, Fagundes TP, Teixeira BC, Chiavegatto Filho ADP. Machine learning for hypertension prediction: a systematic review. Curr Hypertens Rep. 2022;24:523–33.

  35. Jiang Z, Yuan F, Zhang Q, Zhu J, Xu M, Hu Y, Hou C, Liu X. Classification of superficial suspected lymph nodes: non-invasive radiomic model based on multiphase contrast-enhanced ultrasound for therapeutic options of lymphadenopathy. Quant Imag Med Surg. 2024;14:1507–25.

  36. Xu Y, Goodacre R. On splitting training and validation set: a comparative study of cross-validation, bootstrap and systematic sampling for estimating the generalization performance of supervised learning. J Anal Test. 2018;2:249–62.

  37. Ju C, Bibaut A, van der Laan M. The relative performance of ensemble methods with deep convolutional neural networks for image classification. J Appl Stat. 2018;45:2800–18.

  38. Zhang L, Wang Z, Zhou Z, Li S, Huang T, Yin H, Lyu J. Developing an ensemble machine learning model for early prediction of sepsis-associated acute kidney injury. iScience. 2022;25:104932.

  39. Goeller M, Achenbach S, Cadet S, Kwan AC, Commandeur F, Slomka PJ, Gransar H, Albrecht MH, Tamarappoo BK, Berman DS, et al. Pericoronary adipose tissue computed tomography attenuation and high-risk plaque characteristics in acute coronary syndrome compared with stable coronary artery disease. JAMA Cardiol. 2018;3:858–63.

Acknowledgements

Not applicable.

Funding

This work was supported by Zhejiang Medical Health Science and Technology Program (2022KY1317).

Author information

Contributions

All authors contributed to writing the article. CH was responsible for conceptualization, investigation, and writing the original draft. FYW and LFF contributed to image interpretation and data collection. LTK, ZFL, and YPQ contributed to the collection of clinical cases. HWX critically revised the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Hongwei Xu.

Ethics declarations

Ethics approval and consent to participate

This study was approved by the Ethics Committee of Shaoxing Second Hospital. Because the data were anonymized, the requirement for informed consent was waived.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Cite this article

He, C., Wu, F., Fu, L. et al. Improving cardiovascular risk prediction with machine learning: a focus on perivascular adipose tissue characteristics. BioMed Eng OnLine 23, 77 (2024). https://doi.org/10.1186/s12938-024-01273-5

Keywords