Abstract
Metabolic dysfunction-associated fatty liver disease (MAFLD) is a highly prevalent liver condition closely linked to obesity, insulin resistance, and metabolic syndrome. Early identification of MAFLD remains challenging, especially in routine health examination settings where conventional indicators often fail to capture deeper metabolic disturbances. This study aimed to evaluate the predictive value of body composition parameters and to develop and validate a non-invasive, machine learning-based classification model for MAFLD. A retrospective study was conducted using data from 23,348 adults who underwent health check-ups between 2017 and 2021 at a tertiary hospital in China. Body composition was assessed via bioelectrical impedance analysis, and MAFLD was diagnosed based on hepatic steatosis plus metabolic risk criteria. A total of 13 features, including body composition indicators and basic demographics, were initially considered. Feature selection was guided by multicollinearity diagnostics and model-based importance analysis. Eight machine learning models were constructed and evaluated using tenfold cross-validation. An independent external validation cohort of 3,357 participants from 2022 to 2023 was used to assess generalizability. Performance was evaluated using area under the receiver operating characteristic curve, accuracy, recall, F1 score, and calibration metrics. Among all models, tree-based algorithms including extreme gradient boosting, gradient boosting decision tree, and LightGBM achieved the highest discriminative performance, with internal validation area under the curve values exceeding 0.96 and external validation area under the curve values above 0.95. Visceral fat rating consistently emerged as the most important predictor, followed by waist circumference and body mass index. Logistic regression confirmed their independent associations with MAFLD after adjustment for key confounders.
Stratified analyses revealed variable patterns across sex, age, and body mass index groups, with visceral fat remaining a robust predictor in all subgroups. Body composition analysis, particularly visceral fat estimation, demonstrates strong diagnostic discrimination for MAFLD using non-invasive measurements. Integrating these parameters with machine learning enables accurate identification, supporting scalable screening and aiding diagnostic assessment in routine health examination, clinical, and public health settings.
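To make the evaluation metrics named in the abstract concrete, the following is a minimal, standard-library sketch of how area under the ROC curve, accuracy, recall, and F1 score can be computed, together with a plain tenfold index split. It is an illustration only and not the authors' code, which is in the repository listed under Data availability. The function names (`roc_auc`, `threshold_metrics`, `kfold_indices`), the 0.5 decision threshold, and the toy labels and scores are all assumptions made for this example; calibration metrics are not covered.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise version is O(n_pos * n_neg),
    which is fine for a demonstration but slow for large cohorts.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def threshold_metrics(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Accuracy, recall, and F1 after thresholding scores (threshold is an assumption)."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    total = tp + fp + tn + fn
    f1_denominator = 2 * tp + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "f1": 2 * tp / f1_denominator if f1_denominator else 0.0,
    }


def kfold_indices(n_samples: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k contiguous folds.

    No shuffling or stratification is done here; a real pipeline would
    normally shuffle and stratify by the outcome label.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    start = 0
    for fold in range(k):
        size = n_samples // k + (1 if fold < n_samples % k else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Toy data, invented for illustration (1 = MAFLD, 0 = no MAFLD).
y_true = [0, 0, 1, 1, 0, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90]

print(roc_auc(y_true, scores))
print(threshold_metrics(y_true, scores))
print([len(test) for _, test in kfold_indices(23, k=10)])
```

AUC is threshold-free, while accuracy, recall, and F1 depend on the chosen cut-off, which is one reason studies of this kind typically report both kinds of metric.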
Data availability
Due to privacy and ethical restrictions associated with the hospital-based health examination data, the raw data are not publicly available. The code used for data preprocessing, model training, evaluation, and interpretation is available at: https://github.com/hyaxuan23-lab/ML-for-Non-Invasive-MAFLD-Identification. Additional documentation is provided in the Supplementary Materials.
Abbreviations
- MAFLD: Metabolic dysfunction-associated fatty liver disease
- NAFLD: Nonalcoholic fatty liver disease
- T2DM: Type 2 diabetes mellitus
- BIA: Bioelectrical impedance analysis
- VFR: Visceral fat rating
- BMI: Body mass index
- WC: Waist circumference
- ALT: Alanine aminotransferase
- ALB: Albumin
- TBIL: Total bilirubin
- GLB: Serum globulin
- FPG: Fasting plasma glucose
- TG: Triglycerides
- TC: Total cholesterol
- HDL-C: High-density lipoprotein cholesterol
- LDL-C: Low-density lipoprotein cholesterol
- ECW%: Extracellular water ratio
- FatM: Fat mass
- LeanM: Lean mass
- Water: Total body water
- Water%: Body water percentage
- Muscle: Muscle mass
- Bone: Estimated bone mass
- BMR: Basal metabolic rate
- SHAP: SHapley Additive exPlanations
- VIF: Variance inflation factor
- AUC: Area under the receiver operating characteristic curve
- ROC: Receiver operating characteristic
- SVM: Support vector machine
- GBDT: Gradient boosting decision tree
- KNN: K-Nearest Neighbors
Acknowledgements
The authors thank the staff at the Health Management Center of The Third Xiangya Hospital for their assistance in data acquisition.
Funding
This study was supported by the Natural Science Foundation of Hunan Province (Grant No. 2024JJ5520), the Changsha Municipal Natural Science Foundation (Grant No. kq2403054), and the Hunan Provincial Program for Young Key Teachers in Universities (Grant No. 20240101–20261230).
Author information
Contributions
YH: Performed data analysis, constructed models, and drafted the manuscript. YC: Assisted with data analysis, model development, and manuscript preparation. ZC: Contributed to machine learning model validation. RX: Supported external data processing and validation. FW: Conceived and supervised the study, interpreted the results, and critically revised the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethical approval
This study was approved by the Ethics Committee of The Third Xiangya Hospital, Central South University (Approval Number: 225546). As this was a retrospective study using anonymized health examination data, the requirement for informed consent was waived by the ethics committee.
Consent for publication
Not applicable.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
He, Y., Cao, Y., Chen, Z. et al. Integrating body composition analysis and machine learning for non-invasive identification of metabolic dysfunction-associated fatty liver disease: a large-scale health examination-based study. Sci Rep (2026). https://doi.org/10.1038/s41598-026-37852-w
DOI: https://doi.org/10.1038/s41598-026-37852-w


