Predicting carotid plaques in metabolic dysfunction-associated steatotic liver disease using machine learning and SHAP interpretation

Zhai, Shu-Mei; Wang, Xiao-Long; Zhang, Han; Zuo, Yu-Qiang

doi:10.1038/s41598-025-19959-8

Published: 15 October 2025


Scientific Reports volume 15, Article number: 36074 (2025)


Abstract

Cardiovascular disease (CVD) remains the most common cause of death worldwide. Carotid plaque is an indicator of subclinical CVD. Metabolic dysfunction-associated steatotic liver disease (MASLD) is a risk factor for atherosclerotic CVD. We aimed to develop and validate a predictive model for carotid plaque occurrence in annual health check-up populations by integrating health check-up indicators with machine learning (ML) algorithms and LASSO-based feature selection, and to use an interpretability framework to elucidate the contribution of individual risk factors. In this retrospective cohort study, we enrolled 4,973 MASLD patients, among whom 1,178 were diagnosed with carotid plaques using carotid ultrasound. Collected baseline data included demographic indicators, clinical histories, blood biochemical parameters, and liver function test indicators. A predictive model for carotid plaques was developed and validated using five ML algorithms. Model performance was evaluated based on the area under the curve, sensitivity, specificity, accuracy, and F1 score. For model interpretability, we adopted the Shapley Additive Explanations (SHAP) framework to quantify the contribution of individual features to the prediction outcomes. Among the five ML models, the support vector machine model demonstrated superior discriminative capability, higher goodness-of-fit, and greater clinical utility compared with the other models. Moreover, age, systolic blood pressure, total cholesterol, sex, and fasting plasma glucose were the most important risk factors associated with carotid plaques in the MASLD population. This study demonstrated the feasibility of constructing a predictive model for carotid plaques in MASLD populations using health check-up indicators combined with ML algorithms. The application of SHAP enhanced model interpretability by quantifying the contribution of individual risk factors to prediction outcomes, enabling clinicians to identify high-risk MASLD patients prone to carotid plaque development and to adjust interventions accordingly.

Introduction

Cardiovascular diseases (CVDs), principally ischemic heart disease and stroke, are the leading causes of global mortality and major contributors to disability¹. Carotid plaques serve as both a critical subclinical marker of atherosclerosis and a predictor of adverse cardiovascular events^2,3. Metabolic dysfunction-associated steatotic liver disease (MASLD), affecting 25–30% of adults globally, is increasingly recognized as an independent risk factor for CVDs, with studies reporting a 40–60% prevalence of carotid plaques in MASLD populations⁴. Despite this strong association, current cardiovascular risk stratification tools, such as the Framingham Risk Score, neither incorporate MASLD-specific biomarkers nor leverage advanced predictive analytics, leading to suboptimal risk discrimination in this high-risk cohort⁵.

Recent advances in machine learning (ML) algorithms have demonstrated promising results in clinical prediction models, but their adoption has been hindered by the “black box” nature of the algorithms, which limits clinical interpretability and actionable insights for personalized interventions⁶. Shapley Additive Explanations (SHAP) is a model-agnostic interpretability method based on the Shapley value concept from cooperative game theory. It is designed to quantify the contribution of each input feature to an ML prediction and thereby enhance model transparency. By decomposing prediction results into feature contributions, SHAP transforms complex “black-box” models into explainable frameworks, providing both global and local interpretability^7,8.

However, existing MASLD-focused studies predominantly rely on traditional regression models, which inadequately capture nonlinear interactions among metabolic, inflammatory, and hemodynamic risk factors⁹. Furthermore, while SHAP has been validated in other medical applications to improve model transparency, its application in MASLD-related cardiovascular risk prediction remains unexplored. In this study, we developed and validated a prediction model for the occurrence of carotid plaques in the MASLD population using ML algorithms based on health check-up indicators. SHAP values were subsequently used to interpret the model’s predictions, revealing the marginal contributions of individual features and their interactions to the risk of carotid plaque development.

Methods

Study participants

Participants were enrolled from the annual health check-up population at the Second Hospital of Hebei Medical University between January 2024 and December 2024. The inclusion criteria were: (1) age ≥ 18 years, and (2) liver ultrasound and carotid ultrasound results with clear diagnostic outcomes. The exclusion criteria were: (1) age < 18 years; (2) history of cardiovascular and cerebrovascular diseases or malignant tumors; (3) coexisting etiologies for chronic liver diseases, including hemochromatosis, autoimmune liver disease, chronic viral hepatitis, alpha-1 antitrypsin deficiency, Wilson’s disease, and drug-induced liver injury¹⁰; (4) missing essential clinical examination indicators (blood tests, biochemistry, anthropometrics); and (5) absence of MASLD. Finally, 4,973 participants were enrolled in this study (Fig. 1).

This study was approved by the Institutional Review Board (IRB) of the Second Hospital of Hebei Medical University (Approval No. 2022-R341), which waived the requirement for informed consent owing to the study’s retrospective nature. Identifying information was anonymized, and the study adhered to the ethical principles of the Declaration of Helsinki.

Potential risk features and outcomes

All potential risk factors for carotid plaque reported in recent literature were systematically evaluated. Based on data availability in the study cohort, 26 variables were selected and categorized as follows: demographic characteristics (sex and age); anthropometric measurements [height, weight, body mass index, systolic blood pressure (SBP), diastolic blood pressure (DBP), and pulse]; blood biochemical indicators [total cholesterol (TC), triglycerides (TG), high-density lipoprotein cholesterol (HDL-C), and low-density lipoprotein cholesterol (LDL-C)]; metabolic markers [fasting blood glucose (FBG) and uric acid]; hepatic function indicators [alanine transaminase (ALT), aspartate aminotransferase (AST), total protein (TP), albumin (ALB), globulin (GLB), A/G ratio, total bilirubin (TBIL), direct bilirubin (DBIL), and indirect bilirubin (IBIL)]; renal function (serum creatinine); and comorbid conditions [hypertension, diabetes mellitus (DM), and hyperlipidemia].

The outcome was the presence of carotid plaques in participants with MASLD. On ultrasonography, MASLD was identified primarily by the basic sign of steatosis, namely increased echogenicity of the liver parenchyma compared with the cortex of the right kidney, which arises because intracellular fat vacuoles reflect the ultrasound beam¹¹. The diagnosis of MASLD also met the requirements of the multi-society Delphi consensus statement on new fatty liver disease nomenclature¹². Carotid plaque was diagnosed by ultrasound according to the 2020 American Society of Echocardiography recommendations¹³.

Data processing and feature selection

The dataset was randomly divided into a training cohort (n = 3,480) and validation cohort (n = 1,493), using a 7:3 ratio for model training and internal validation, respectively. All continuous variables underwent Z-score normalization to standardize feature scales, ensuring comparability across parameters. Variables exhibiting significant skewness (Shapiro-Wilk P < 0.05) were retained without transformation, leveraging the inherent robustness of tree-based methods to non-normality. The feature selection process and model development were conducted using the training cohort, while the independent validation cohort was used for evaluating the model’s performance.
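The split-then-normalize step can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' pipeline (which was run in R). Two details are assumptions rather than facts from the text: the split is stratified by outcome, and the Z-score parameters are estimated on the training cohort only and then reused for the validation cohort. The toy values and function names are hypothetical.

```python
import random
from statistics import mean, pstdev

def stratified_split(labels, train_frac=0.7, seed=0):
    """Return (train_idx, valid_idx), splitting each outcome class separately."""
    rng = random.Random(seed)
    train_idx, valid_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_frac * len(idx))
        train_idx += idx[:cut]
        valid_idx += idx[cut:]
    return sorted(train_idx), sorted(valid_idx)

def fit_zscore(values):
    """Estimate mean and SD (assumed here: on the training cohort only)."""
    mu, sd = mean(values), pstdev(values)
    return mu, (sd if sd > 0 else 1.0)

def apply_zscore(values, mu, sd):
    """Standardize values with previously fitted parameters."""
    return [(v - mu) / sd for v in values]

# Toy data: 10 participants, 3 with plaques (hypothetical values).
labels = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0]
sbp = [150, 120, 118, 130, 145, 125, 110, 160, 122, 128]

train_idx, valid_idx = stratified_split(labels)
mu, sd = fit_zscore([sbp[i] for i in train_idx])
sbp_train = apply_zscore([sbp[i] for i in train_idx], mu, sd)
sbp_valid = apply_zscore([sbp[i] for i in valid_idx], mu, sd)
```

Reusing the training mean and SD for the validation cohort keeps information from the held-out data out of model development.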

Correlation between variables was evaluated using Pearson’s (parametric) and Spearman’s (nonparametric) tests. Variables exhibiting absolute correlation coefficients ≥ 0.8 were excluded to mitigate multicollinearity risks. To identify the most predictive features and improve analytical robustness, the least absolute shrinkage and selection operator (LASSO) regression was used for feature selection. This regularization technique applies L1 penalty to shrink less important coefficients toward zero, effectively selecting key variables while mitigating overfitting. The selected features preserved clinically relevant signal strengths, thereby enhancing comparability of latent patterns across datasets.
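The two filtering ideas in this paragraph can be illustrated with a short sketch. It assumes a greedy rule for the correlation filter (keep the first feature of a highly correlated pair), which the text does not specify, and it shows only the soft-thresholding operator that underlies the L1 penalty, not a full LASSO fit. The feature values and coefficients are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def drop_collinear(features, threshold=0.8):
    """Greedy filter: keep a feature only if |r| < threshold with every kept one."""
    kept = []
    for name in features:
        if all(abs(pearson(features[name], features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

def soft_threshold(beta, lam):
    """L1 shrinkage: move beta toward zero by lam, and to exactly zero if |beta| <= lam."""
    if beta > lam:
        return beta - lam
    if beta < -lam:
        return beta + lam
    return 0.0

# Hypothetical measurements for five participants.
features = {
    "TBIL": [10.0, 12.0, 15.0, 20.0, 18.0],
    "IBIL": [7.0, 8.5, 11.0, 15.0, 13.0],  # nearly collinear with TBIL
    "age": [60.0, 35.0, 52.0, 41.0, 48.0],
}
kept = drop_collinear(features)

# Hypothetical unpenalized coefficients shrunk with penalty 0.1.
shrunk = [soft_threshold(b, 0.1) for b in (0.42, -0.05, 0.08, -0.30)]
```

In this toy example IBIL is dropped because it is almost perfectly correlated with TBIL, and the two small coefficients are set exactly to zero, which is how an L1 penalty performs variable selection.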

Model construction and evaluation

We used five ML algorithms [support vector machine (SVM), logistic regression (LR), decision tree (DT), random forest (RF), and eXtreme Gradient Boosting (XGBoost)] to construct the model based on the features selected by LASSO regression. Hyperparameter optimization was performed via 5-fold cross-validated grid search to ensure robust model performance and reproducibility. The search spaces were as follows. SVM: regularization parameter (C: [0.1, 1, 10]) and kernel coefficient (gamma: [0.01, 0.1, 1]); LR: penalty strength (L2 regularization, C: [0.01, 0.1, 1]); DT: maximum depth ([3, 5, 10]) and minimum samples per leaf ([5, 10]); RF: number of trees ([100, 200, 300, 400, 500]), features per split ([sqrt(p), log2(p)], where p = number of features), and node splitting criteria ([Gini impurity, entropy]); XGBoost: learning rate (η: [0.01, 0.05, 0.1, 0.2]), maximum depth ([2, 3, 4, 5, 6]), subsample ratio ([0.6, 0.7, 0.8, 0.9, 1.0]), column sampling ([0.6, 0.7, 0.8, 0.9, 1.0]), and minimum loss reduction ([0, 1, 3]). Final configurations were selected by maximizing the AUC on the training cohort through systematic evaluation of all hyperparameter combinations. The optimal parameters were: SVM: C = 10, gamma = 0.1; LR: C = 0.1; DT: max_depth = 5, min_samples_leaf = 10; RF: ntree = 260, mtry = sqrt(p); XGBoost: eta = 0.1, max_depth = 3. The area under the curve (AUC) of the receiver operating characteristic (ROC), sensitivity, specificity, accuracy, and F1 score were used to provide a comprehensive assessment of model predictive performance. Furthermore, we used calibration curves and decision curve analysis (DCA) to evaluate model performance. Calibration curves assess predictive performance by plotting predicted probabilities against observed outcomes, thereby checking the model’s alignment with real-world risk profiles.
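The mechanics of a cross-validated grid search can be sketched in a few lines. The example below enumerates the XGBoost search space listed in the text and pairs it with a generic 5-fold loop. The dictionary keys, the `score_fn` callback, and the toy scoring function are hypothetical stand-ins. In a real analysis `score_fn` would fit a model on the training folds and return the AUC on the held-out fold.

```python
from itertools import product

# XGBoost search space as listed in the text; key names are illustrative.
xgb_grid = {
    "eta": [0.01, 0.05, 0.1, 0.2],
    "max_depth": [2, 3, 4, 5, 6],
    "subsample": [0.6, 0.7, 0.8, 0.9, 1.0],
    "colsample": [0.6, 0.7, 0.8, 0.9, 1.0],
    "gamma": [0, 1, 3],
}

def expand_grid(grid):
    """List every combination of hyperparameter values as a dict."""
    keys = list(grid)
    return [dict(zip(keys, combo)) for combo in product(*(grid[k] for k in keys))]

def kfold_indices(n, k=5):
    """Yield (train_idx, test_idx) for k contiguous folds; shuffle rows beforehand."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

def grid_search(grid, n, score_fn, k=5):
    """Return the candidate with the best mean cross-validated score."""
    best, best_score = None, float("-inf")
    for params in expand_grid(grid):
        scores = [score_fn(params, tr, te) for tr, te in kfold_indices(n, k)]
        avg = sum(scores) / len(scores)
        if avg > best_score:
            best, best_score = params, avg
    return best, best_score

candidates = expand_grid(xgb_grid)
folds = list(kfold_indices(3480, k=5))

# Stand-in score that peaks at eta = 0.1 and max_depth = 3 (not a real AUC).
def toy_score(params, train_idx, test_idx):
    return -abs(params["eta"] - 0.1) - abs(params["max_depth"] - 3)

best, best_score = grid_search(xgb_grid, n=100, score_fn=toy_score)
```

The XGBoost grid alone expands to 4 × 5 × 5 × 5 × 3 = 1,500 candidates, each evaluated on five folds, which is why exhaustive grids are usually kept coarse.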
DCA, which combines predictive efficacy with clinical utility, was used to evaluate each model’s practical value across decision thresholds. By quantifying the net benefit (the trade-off between true positives and false positives), DCA identifies thresholds at which a model outperforms the default strategies of “universal treatment” or “no treatment”^14,15. Confidence intervals (CIs) for performance metrics were derived via 1,000 bootstrap iterations using the percentile method. Stratified sampling maintained the original class distributions during resampling.
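Both quantities described here have compact definitions. The sketch below computes the net benefit at a threshold probability pt as TP/n − (FP/n) × pt/(1 − pt), the standard decision-curve formula, and a stratified percentile bootstrap interval. It is an illustration on hypothetical data, not the authors' R code, and the percentile bounds use a simple order-statistic approximation.

```python
import random

def net_benefit(y_true, y_prob, threshold):
    """Net benefit = TP/n - FP/n * pt / (1 - pt) at threshold probability pt."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: classify everyone as positive."""
    return net_benefit(y_true, [1.0] * len(y_true), threshold)

def stratified_bootstrap_ci(y_true, y_prob, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile CI; positives and negatives are resampled separately."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(y_true) if y == 1]
    neg = [i for i, y in enumerate(y_true) if y == 0]
    stats = []
    for _ in range(n_boot):
        idx = rng.choices(pos, k=len(pos)) + rng.choices(neg, k=len(neg))
        stats.append(metric([y_true[i] for i in idx], [y_prob[i] for i in idx]))
    stats.sort()
    lo_i = int(n_boot * alpha / 2)
    return stats[lo_i], stats[n_boot - 1 - lo_i]

def sensitivity_at_half(y_true, y_prob):
    """Sensitivity when predictions >= 0.5 are called positive."""
    hits = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= 0.5)
    return hits / sum(y_true)

# Hypothetical outcomes and predicted probabilities for 10 participants.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.7, 0.3, 0.6, 0.2, 0.1, 0.4, 0.2, 0.1, 0.3]

nb_model = net_benefit(y_true, y_prob, 0.5)
nb_all = net_benefit_treat_all(y_true, 0.5)
ci_low, ci_high = stratified_bootstrap_ci(y_true, y_prob, sensitivity_at_half)
```

The "no treatment" strategy always has a net benefit of zero, so a model is useful at a given threshold only if its net benefit exceeds both zero and the treat-all value.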

Model interpretation

Interpreting predictive outcomes and elucidating feature importance remain critical challenges in machine learning, particularly for complex models whose decision-making processes are often opaque. To address these challenges, we employed the SHAP framework, a theoretically grounded post-hoc explanation technique rooted in cooperative game theory, to quantify the contribution of each input feature to model predictions. Shapley values ensure fairness and consistency in feature attributions, addressing limitations associated with traditional feature importance metrics.
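To make the game-theoretic idea concrete, the following sketch computes exact Shapley values for a single prediction by enumerating every feature coalition. It is a simplified illustration, not the SHAP library's algorithm: features outside a coalition are replaced by a single baseline value (practical implementations average over a background sample and use approximations), and the risk function is a hypothetical toy model, not the fitted SVM.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for instance x relative to a baseline instance.

    Features outside a coalition are set to their baseline value.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

# Hypothetical risk score with an age x SBP interaction (not the paper's model).
def toy_model(z):
    age, sbp, tc = z
    return 0.02 * age + 0.01 * sbp + 0.1 * tc + 0.0005 * age * sbp

x = [65, 150, 6.0]          # participant to explain: age, SBP, TC
baseline = [45, 120, 5.0]   # reference participant
phi = shapley_values(toy_model, x, baseline)
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, which is the additivity property that lets SHAP decompose an individual prediction into per-feature contributions. The interaction term is shared between age and SBP rather than assigned to either one alone.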

Statistical analysis

Statistical analysis was conducted using R software version 3.4.3 (http://www.r-project.org). Continuous variables were tested for normality using the Shapiro-Wilk test. If the data conformed to a normal distribution, means ± standard deviations (x̄ ± s) were reported, and group comparisons were performed using independent-samples t-tests. If the data were non-normally distributed, medians and interquartile ranges (M, IQR) were presented, with group comparisons conducted using the Mann-Whitney U test. Categorical variables were expressed as counts and percentages (n, %), and group differences were analyzed using the chi-square test or Fisher’s exact test.

Results

Baseline characteristics

A total of 4,973 participants were enrolled in this study, comprising 3,462 males (69.62%) and 1,511 females (30.38%). Among the participants, 1,178 (23.69%) were diagnosed with carotid plaques. Compared with the negative group, the positive group had higher age, SBP, DBP, LDL-C, TC, FPG, urea, UA, TP, ALB, A/G ratio, and AST levels. In addition, the positive group had a higher prevalence of hypertension, DM, and hyperlipidemia, but a lower proportion of males (all p ≤ 0.05) (Table 1).

Table 1 Characteristics of the study population.


Comparison of model performance

Table 2 compares the five ML algorithms (SVM, DT, LR, RF, and XGBoost) across the training and validation cohorts. Evaluation metrics included the AUC, sensitivity, specificity, accuracy, and F1 score. SVM demonstrated strong discriminative capability (validation AUC = 0.813); XGBoost achieved a marginally higher AUC (0.829) but lower specificity (0.744 vs. 0.785) (Fig. 2A and B). SVM maintained robust performance across sensitivity (0.674), specificity (0.785), accuracy (0.737), and F1 score (0.773), exhibiting the most balanced clinical utility profile. Calibration curves showed that SVM’s training and validation predictions closely aligned with observed outcomes (Fig. 2C and D); this concordance with the diagonal reference line indicates good probability calibration. Decision curve analysis further supported SVM’s clinical utility, showing the highest net benefit across decision thresholds (Fig. 2E and F). Notably, RF achieved perfect training metrics (AUC = 1, sensitivity = 1, specificity = 1) but showed reduced validation performance (AUC = 0.828), indicating potential overfitting. Despite XGBoost’s strong AUC (0.829), its lower specificity reduced its clinical utility for preventive applications that require minimal false positives. Consequently, SVM was selected as the optimal model for carotid plaque prediction in MASLD populations, balancing discrimination (AUC), calibration reliability, and clinical applicability.
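The threshold-dependent metrics compared above are all functions of one confusion matrix. The sketch below shows how they relate, using hypothetical labels and predictions rather than the study data.

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, accuracy, and F1 from binary labels and predictions."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "accuracy": (tp + tn) / len(y_true),
        # F1 is the harmonic mean of precision and sensitivity.
        "f1": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0,
    }

# Hypothetical example: 4 participants with plaques, 6 without.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

Because specificity depends only on the plaque-free participants, a model can trade a slightly lower AUC for fewer false positives at the chosen threshold, which is the trade-off weighed between SVM and XGBoost.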

Based on the SVM model, feature importance was quantified using absolute SHAP values. The negative and positive contributions of each feature are represented by purple and yellow markers, respectively. The horizontal position of each data point reflects its SHAP value: higher values indicate stronger contributions to an increased predicted probability of carotid plaque, while lower values correspond to reduced risk predictions. Figure 3A and B show the top 15 features ranked by contribution weight, displayed to simplify the interpretation of complex model outputs. This SHAP-based visualization enhanced clinical translatability by explicitly mapping the quantitative relationships between key predictors (e.g., age and SBP) and atherosclerotic risk stratification.
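A global ranking of this kind is typically obtained by averaging the absolute SHAP value of each feature over all participants. The sketch below assumes that convention and uses a hypothetical SHAP matrix (rows are participants, columns are features), not values from the study.

```python
def rank_by_mean_abs_shap(shap_matrix, feature_names):
    """Rank features by mean absolute SHAP value across participants."""
    n = len(shap_matrix)
    importance = {
        name: sum(abs(row[j]) for row in shap_matrix) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(importance.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical SHAP values for three participants.
feature_names = ["age", "SBP", "TC"]
shap_matrix = [
    [0.30, -0.10, 0.05],
    [-0.20, 0.15, -0.02],
    [0.25, 0.05, 0.01],
]
ranking = rank_by_mean_abs_shap(shap_matrix, feature_names)
top_features = [name for name, _ in ranking]
```

Taking absolute values before averaging prevents positive and negative contributions from cancelling, so a feature that raises risk for some participants and lowers it for others still ranks as influential.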

Table 2 Comparison of the performance of five machine learning methods.


Discussion

In recent years, numerous studies have reported a significant association between MASLD and carotid plaque formation^16,17,18. This growing body of evidence underscores the critical role of MASLD as a predictor of subclinical atherosclerosis and cardiovascular risk, particularly through mechanisms involving metabolic dysregulation and systemic inflammation¹⁶. Notably, longitudinal cohort studies have reported that both the presence and progression of MASLD, especially in advanced fibrosis stages, are independently associated with increased carotid plaque burden¹⁹. These findings highlight the need for integrated cardiovascular risk assessment in MASLD patients, to mitigate the burden of ischemic cerebrovascular events.

ML algorithms have revolutionized clinical prediction model development by facilitating the integration of complex, high-dimensional data to improve diagnostic accuracy, risk stratification, and therapeutic decision-making^20,21. However, predictive models for carotid plaques in the MASLD population that integrate health check-up indicators with ML algorithms remain underexplored. Deng et al.⁹ developed a prediction model for carotid plaques based on the health check-up indicators of 5.4 million adults with fatty liver disease, combined with ML algorithms. The predictive model showed good performance in an internal validation set (AUC = 0.831) and an external validation set (AUC = 0.801), and its calibration plots indicated good calibration. However, that model was not evaluated with DCA to assess its clinical utility, nor was it subjected to SHAP analysis for interpretability. In addition, the study population comprised individuals with fatty liver disease rather than MASLD.

In the present study, we constructed a predictive model for carotid plaques in the MASLD population based on health check-up indicators, which showed efficiency, straightforwardness, and practicality in clinical applications. Among the five ML algorithms, SVM showed the best overall performance, achieving an AUC of 0.813 in the validation cohort. The model exhibited favorable calibration characteristics, goodness-of-fit properties, and clinical usefulness, as shown by the calibration and DCA curves. Notably, while SVM demonstrated strong discriminative capability (AUC = 0.813), XGBoost achieved a marginally higher AUC in the validation cohort (0.829). This aligns with prior studies in which boosting algorithms outperformed kernel-based methods in complex biomedical prediction tasks²². However, SVM maintained higher specificity (0.785 vs. 0.744), suggesting greater reliability in identifying low-risk individuals. Given the clinical priority of minimizing false positives in preventive cardiology, SVM’s balanced performance profile, coupled with its interpretability via SHAP, justified its selection as the primary predictive tool. To address persistent challenges in ML interpretability and to enable transparent visualization of prediction determinants, we applied SHAP to our SVM model to systematically assess both global (feature importance rankings) and local (individual prediction analyses) interpretability. This integration enabled us to identify the critical features driving predictions of carotid plaques in the MASLD population, such as age, SBP, TC, sex, and FBG, while also revealing nonlinear relationships and interaction effects that may be masked by simpler methods. The computed SHAP values quantified directional feature contributions, with positive values indicating elevated carotid plaque risk associated with specific feature values in the MASLD population and negative values indicating protective effects. Through SHAP’s additive feature attribution framework, our methodology enabled granular visualization of nonlinear decision pathways, bridging the gap between algorithmic complexity and clinical reasoning.

The results in the present study, which showed that age, SBP, TC, sex, and FBG were the most important features of carotid plaques in MASLD populations, were consistent with previous studies^9,23. Aging induces cumulative metabolic and vascular stress, fostering carotid plaque formation through three principal mechanisms: (1) age-related reactive oxygen species accumulation and elevated pro-inflammatory cytokines (TNF-α and IL-6), which disrupt endothelial integrity, promote lipid oxidation, and drive foam cell formation, initiating atherosclerotic plaque development²⁴; (2) declining nitric oxide bioavailability and arterial stiffening impair vasodilation, enhance platelet aggregation, and potentiate thrombotic events²⁵; and (3) reduced insulin sensitivity exacerbates hepatic steatosis and systemic metabolic imbalance, accelerating atherosclerosis progression²⁶. Hypertension is a well-established risk factor for carotid atherosclerosis^27,28. Elevated SBP accelerates vascular endothelial injury and promotes LDL-C infiltration into the arterial wall, leading to plaque initiation and progression²⁷. In addition, elevated SBP results in arterial stiffness and promotes cytokine release (IL-6, TNF-α), enhancing monocyte adhesion and intraplaque inflammation²⁹. Furthermore, high cholesterol is a major risk factor for carotid artery disease, which can lead to narrowing or blockage of the carotid arteries supplying blood to the brain; and which also plays a role in the progression of carotid artery stenosis and overall cardiovascular risk³⁰. The mechanism involves: (1) small, dense LDL particles infiltrating arterial intima, undergoing oxidation, and being phagocytosed by macrophages to form foam cells, which are the core of atherosclerotic plaques³¹; and (2) reduced HDL-C, impairing reverse cholesterol transport, and failing to clear lipid-laden macrophages from plaque sites³². 
In addition, the higher occurrence of carotid plaques in males may be explained by the following: (1) higher testosterone levels in males promote insulin resistance and accelerate atherosclerosis, whereas higher estrogen levels in females can enhance reverse cholesterol transport and reduce vascular inflammation³³; and (2) in male MASLD patients, oxidized LDL preferentially infiltrates the arterial wall, where it is engulfed by macrophages via scavenger receptors, leading to foam cell formation and the release of pro-inflammatory chemokines (e.g., MCP-1)³⁴. This process recruits monocytes to the arterial intima, causing plaque initiation and progression, while reduced paraoxonase-1 activity impairs HDL’s antioxidant capacity, diminishing its ability to detoxify oxidized LDL and mitigate endothelial dysfunction. This further exacerbates vascular inflammation and plaque vulnerability³⁵. Some studies have reported that elevated FBG increases advanced glycation end-products (AGEs), which bind to the receptor for AGEs on endothelial cells, activating the NF-κB pathway and promoting inflammatory cytokines (e.g., TNF-α and IL-6). This accelerates LDL oxidation and foam cell formation³⁶.

Limitations

This study has several limitations. First, the single-center retrospective design and cross-sectional nature preclude establishing causal or temporal relationships between MASLD and carotid plaque occurrence, and the restricted one-year temporal scope constrains longitudinal assessment. Potential selection bias may affect population representativeness, and residual confounding cannot be excluded. Model generalizability requires validation in multi-ethnic cohorts and prospective settings. Future research should prioritize multicenter prospective cohorts to validate prediction performance over time, establish causal mechanisms, and explore advanced architectures (e.g., transformer networks) for feature representation learning. Second, the diagnosis of MASLD was defined by ultrasonography rather than liver biopsy. However, studies have reported a strong correlation between ultrasonographic findings and histopathological results from liver biopsies, particularly in detecting hepatic steatosis³⁷. Finally, SHAP analysis does not quantify the importance of predictors in the real world, but rather their importance to the model’s predictions³⁸.

Conclusions

This study developed and validated predictive models for carotid plaque occurrence in patients with MASLD by integrating demographic data, blood biochemical indices, and clinical parameters from annual health check-up populations, using ML algorithms. The SVM-based model showed high accuracy and robust reliability in predicting carotid plaque development. Furthermore, we used SHAP for model interpretation and visualization to characterize risk factors for carotid plaques in MASLD populations. By combining SHAP with ML algorithms, we quantified the contributions of key risk factors such as age, SBP, TC, sex, and FBG to carotid plaque risk, which may provide clinical insights into the management of carotid plaques in patients with MASLD.

Data availability

The datasets generated and/or analyzed during this study are available from the corresponding author upon reasonable request.

References

Roth, G. A. et al. Global burden of cardiovascular diseases and risk factors, 1990–2019: update from the GBD 2019 Study. J. Am. Coll. Cardiol. 76(25), 2982–3021 (2020).
Johri, A. M. et al. Maximum plaque height in carotid ultrasound predicts cardiovascular disease outcomes: a population-based validation study of the American society of echocardiography’s grade II-III plaque characterization and protocol. Int. J. Cardiovasc. Imaging 37(5), 1601–1610 (2021).
Genkel, V. et al. Carotid total plaque area as an independent predictor of short-term subclinical polyvascular atherosclerosis progression and major adverse cardiac and cerebrovascular events. Ther. Adv. Cardiovasc. Dis. 17, 17539447231194860 (2023).
Teng, M. L. et al. Global incidence and prevalence of nonalcoholic fatty liver disease. Clin. Mol. Hepatol. 29(Suppl), S32–S42 (2023).
Lee, T. B. Jr. et al. Biomarkers of hepatic dysfunction and cardiovascular risk. Curr. Cardiol. Rep. 25(12), 1783–1795 (2023).
Zihni, E. et al. Opening the black box of artificial intelligence for clinical decision support: A study predicting stroke outcome. PLoS One. 15(4), e0231166 (2020).
Loh, H. W. et al. Application of explainable artificial intelligence for healthcare: A systematic review of the last decade (2011–2022). Comput. Methods Programs Biomed. 226, 107161 (2022).
Ali, S. et al. The enlightening role of explainable artificial intelligence in medical & healthcare domains: A systematic literature review. Comput. Biol. Med. 166, 107555 (2023).
Deng, Y. et al. Combinatorial use of machine learning and logistic regression for predicting carotid plaque risk among 5.4 million adults with fatty liver disease receiving health check-ups: population-based cross-sectional study. JMIR Public. Health Surveill 9, e47095 (2023).
Chalasani, N. et al. The diagnosis and management of nonalcoholic fatty liver disease: Practice guidance from the American Association for the Study of Liver Diseases. Hepatology 67(1), 328–357 (2018).
Petzold, G. Role of ultrasound methods for the assessment of NAFLD. J. Clin. Med. 11(15), 4581 (2022).
Rinella, M. E. et al. A multi-society Delphi consensus statement on new fatty liver disease nomenclature. Hepatology 78(6), 1966–1986 (2023).
Johri, A. M. et al. Recommendations for the assessment of carotid arterial plaque by ultrasound for the characterization of atherosclerosis and evaluation of cardiovascular risk: from the American Society of Echocardiography. J. Am. Soc. Echocardiogr. 33(8), 917–933 (2020).
Zhao, L. et al. Understanding decision curve analysis in clinical prediction model research. Postgrad. Med. J. 100(1185), 512–515 (2024).
Piovani, D. et al. Optimizing clinical decision making with decision curve analysis: insights for clinical investigators. Healthcare 11(16), 2244 (2023).
Yu, X. et al. High NAFLD fibrosis score in non-alcoholic fatty liver disease as a predictor of carotid plaque development: a retrospective cohort study based on regular health check-up data in China. Ann. Med. 53(1), 1621–1631 (2021).
Zhang, Y. et al. Association between non-alcoholic fatty liver disease and silent carotid plaque in Chinese aged population: a cross-sectional study. Ann. Palliat. Med. 9(2), 182–189 (2020).
Xu, T. et al. CT-diagnosed non-alcoholic fatty liver disease as a risk predictor of symptomatic carotid plaque and cerebrovascular symptoms. Angiology. 33197241227501 (2024).
Tan, S. H. & Zhou, X. L. Early-stage non-alcoholic fatty liver disease in relation to atherosclerosis and inflammation. Clinics 78, 100301 (2023).
Adlung, L. et al. Machine learning in clinical decision-making. Medicine 2(6), 642–665 (2021).
Haug, C. J. & Drazen, J. M. Artificial intelligence and machine learning in clinical medicine, 2023. N. Engl. J. Med. 388(13), 1201–1208 (2023).
Aravkin, A. Y., Bottegal, G. & Pillonetto, G. Boosting as a kernel-based method. Mach. Learn. 108, 1951–1974 (2019).
Wei, Y. et al. Application of machine learning algorithms in predicting carotid artery plaques using routine health assessments. Front. Cardiovasc. Med. 11, 1454642 (2024).
Anik, M. I. et al. Role of reactive oxygen species in aging and age-related diseases: a review. ACS Appl. Bio Mater. https://doi.org/10.1021/acsabm.2c00411 (2022) (Epub ahead of print.).
Chen, J. Y. et al. Nitric oxide bioavailability dysfunction involves in atherosclerosis. Biomed. Pharmacother. 423–428. (2018).
Vesković, M. et al. The interconnection between hepatic insulin resistance and metabolic Dysfunction-associated steatotic liver disease-The transition from an adipocentric to liver-centric approach. Curr. Issues Mol. Biol. 45(11), 9084–9102 (2023).
Zhang, Y. et al. Endothelial function and arterial stiffness indexes in subjects with carotid plaque and carotid plaque length: A subgroup analysis showing the relationship with hypertension and diabetes. J. Stroke Cerebrovasc. Dis. 32(3), 106986 (2023).
Chen, J. et al. Risk factors for carotid plaque formation in type 2 diabetes mellitus. J. Transl. Med. 22(1), 18 (2024).
Loperena, R. et al. Hypertension and increased endothelial mechanical stretch promote monocyte differentiation and activation: roles of STAT3, Interleukin 6 and hydrogen peroxide. Cardiovasc. Res. 114(11), 1547–1563 (2018).
Paraskevas, K. I. et al. Cholesterol, carotid artery disease and stroke: what the vascular specialist needs to know. Ann. Transl. Med. 8(19), 1265 (2020).
Jin, X. et al. Small, dense low-density lipoprotein-cholesterol and atherosclerosis: relationship and therapeutic strategies. Front. Cardiovasc. Med. 8, 804214 (2022).
Ouimet, M., Barrett, T. J. & Fisher, E. A. HDL and reverse cholesterol transport. Circ. Res. 124(10), 1505–1518 (2019).
Song, M. J. & Choi, J. Y. Androgen dysfunction in non-alcoholic fatty liver disease: role of sex hormone binding globulin. Front. Endocrinol. 13, 1053709 (2022).
Cupido, A. J. et al. Low-density lipoprotein cholesterol attributable cardiovascular disease risk is sex specific. J. Am. Heart Assoc. 11(12), e024248 (2022).
Kang, H. et al. The entry and egress of monocytes in atherosclerosis: A biochemical and biomechanical driven process. Cardiovasc. Ther. 2021, 6642927 (2021).
Chen, J. et al. Advanced glycation end products measured by skin autofluorescence and subclinical cardiovascular disease: the Rotterdam Study. Cardiovasc. Diabetol. 22(1), 326 (2023).
Liu, F. et al. Multiparametric US for identifying metabolic dysfunction-associated steatohepatitis: A prospective multicenter study. Radiology 310(3), e232416 (2024).
Ponce-Bobadilla, A. V. et al. Practical guide to SHAP analysis: Explaining supervised machine learning model predictions in drug development. Clin. Transl. Sci. 17(11), e70056 (2024).

Acknowledgements

We wish to thank the study participants for their cooperation and participation.

Funding

This study was supported by the Medical Science Research Project of Hebei (No. 20230505).

Author information

Authors and Affiliations

Department of Ultrasound, The 2nd Hospital of Hebei Medical University, Shijiazhuang, Hebei Province, China
Shu-Mei Zhai
Department of Information Center, The 2nd Hospital of Hebei Medical University, Shijiazhuang, Hebei Province, China
Xiao-Long Wang
Department of Physical Examination Center, The 2nd Hospital of Hebei Medical University, Shijiazhuang, Hebei Province, China
Han Zhang & Yu-Qiang Zuo

Contributions

Shu-Mei Zhai: conceptualization, methodology, investigation, and writing of the original draft. Han Zhang: collection of clinical data and laboratory indicators. Xiao-Long Wang: data analysis and visualization. Yu-Qiang Zuo: conceptualization, formal analysis, and review and editing of the manuscript. We confirm that this manuscript has not been published elsewhere and is not under consideration, in whole or in part, by another journal. All authors have approved the manuscript and agree with its submission.

Corresponding author

Correspondence to Yu-Qiang Zuo.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Zhai, SM., Wang, XL., Zhang, H. et al. Predicting carotid plaques in metabolic dysfunction-associated steatotic liver disease using machine learning and SHAP interpretation. Sci Rep 15, 36074 (2025). https://doi.org/10.1038/s41598-025-19959-8

Received: 08 April 2025
Accepted: 11 September 2025
Published: 15 October 2025
DOI: https://doi.org/10.1038/s41598-025-19959-8