Scientific Reports
Integrating body composition analysis and machine learning for non-invasive identification of metabolic dysfunction-associated fatty liver disease: a large-scale health examination-based study
  • Article
  • Open access
  • Published: 03 February 2026

  • Yaxuan He1,
  • Yu Cao1,
  • Zekai Chen2,
  • Rong Xiang1,3 &
  • …
  • Fang Wang1 

Scientific Reports (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Biomarkers
  • Diseases
  • Endocrinology
  • Gastroenterology
  • Health care
  • Medical research
  • Risk factors

Abstract

Metabolic dysfunction-associated fatty liver disease (MAFLD) is a highly prevalent liver condition closely linked to obesity, insulin resistance, and metabolic syndrome. Early identification of MAFLD remains challenging, especially in routine health examination settings where conventional indicators often fail to capture deeper metabolic disturbances. This study aimed to evaluate the predictive value of body composition parameters and to develop and validate a non-invasive, machine learning-based classification model for MAFLD. A retrospective study was conducted using data from 23,348 adults who underwent health check-ups between 2017 and 2021 at a tertiary hospital in China. Body composition was assessed via bioelectrical impedance analysis, and MAFLD was diagnosed based on hepatic steatosis plus metabolic risk criteria. A total of 13 features, including body composition indicators and basic demographics, were initially considered. Feature selection was guided by multicollinearity diagnostics and model-based importance analysis. Eight machine learning models were constructed and evaluated using tenfold cross-validation. An independent external validation cohort of 3,357 participants from 2022 to 2023 was used to assess generalizability. Performance was evaluated using area under the receiver operating characteristic curve, accuracy, recall, F1 score, and calibration metrics. Among all models, tree-based algorithms including extreme gradient boosting, gradient boosting decision tree, and LightGBM achieved the highest discriminative performance, with internal validation area under the curve values exceeding 0.96 and external validation area under the curve values above 0.95. Visceral fat rating consistently emerged as the most important predictor, followed by waist circumference and body mass index. Logistic regression confirmed their independent associations with MAFLD after adjustment for key confounders.
Stratified analyses revealed variable patterns across sex, age, and body mass index groups, with visceral fat remaining a robust predictor in all subgroups. Body composition analysis, particularly visceral fat estimation, demonstrates strong diagnostic discrimination for MAFLD using non-invasive measurements. Integrating these parameters with machine learning enables accurate identification, supporting scalable screening and aiding diagnostic assessment in routine health examination, clinical, and public health settings.
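As a rough illustration of the evaluation metrics named in the abstract (area under the ROC curve, accuracy, recall, F1 score) and of a tenfold split, the following minimal sketch uses only the Python standard library and a tiny invented set of labels and scores. It is not the authors' pipeline, which is available in the repository cited under Data availability; the function names `roc_auc`, `threshold_metrics`, and `kfold_indices`, the 0.5 decision threshold, and the toy data are all assumptions made for this example.

```python
import random


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def threshold_metrics(y_true, scores, threshold=0.5):
    """Accuracy, recall, and F1 after thresholding scores (threshold is an assumption)."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "recall": recall, "f1": f1}


def kfold_indices(n_samples, k=10, seed=0):
    """Shuffle indices once, then yield (train, test) index lists for each of k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


# Invented toy data: 1 = MAFLD, 0 = no MAFLD; scores are hypothetical model outputs.
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7]

print("AUC:", roc_auc(y_true, scores))
print("Metrics at 0.5:", threshold_metrics(y_true, scores))
print("Fold sizes:", [len(test) for _, test in kfold_indices(25, k=10)])
```

In a cross-validated setting, a model would be fitted on each `train` index set and the metrics computed on the matching `test` set, then averaged across the ten folds; the sketch leaves the model itself out because the abstract does not specify hyperparameters.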

Data availability

Due to privacy and ethical restrictions associated with the hospital-based health examination data, the raw data are not publicly available. The code used for data preprocessing, model training, evaluation, and interpretation is available at: https://github.com/hyaxuan23-lab/ML-for-Non-Invasive-MAFLD-Identification. Additional documentation is provided in the Supplementary Materials.

Abbreviations

MAFLD: Metabolic dysfunction-associated fatty liver disease
NAFLD: Nonalcoholic fatty liver disease
T2DM: Type 2 diabetes mellitus
BIA: Bioelectrical impedance analysis
VFR: Visceral fat rating
BMI: Body mass index
WC: Waist circumference
ALT: Alanine aminotransferase
ALB: Albumin
TBIL: Total bilirubin
GLB: Serum globulin
FPG: Fasting plasma glucose
TG: Triglycerides
TC: Total cholesterol
HDL-C: High-density lipoprotein cholesterol
LDL-C: Low-density lipoprotein cholesterol
ECW%: Extracellular water ratio
FatM: Fat mass
LeanM: Lean mass
Water: Total body water
Water%: Body water percentage
Muscle: Muscle mass
Bone: Estimated bone mass
BMR: Basal metabolic rate
SHAP: SHapley Additive exPlanations
VIF: Variance inflation factor
AUC: Area under the receiver operating characteristic curve
ROC: Receiver operating characteristic
SVM: Support vector machine
GBDT: Gradient boosting decision tree
KNN: K-Nearest Neighbors

Acknowledgements

The authors thank the staff at the Health Management Center of The Third Xiangya Hospital for their assistance in data acquisition.

Funding

This study was supported by the Natural Science Foundation of Hunan Province (Grant No. 2024JJ5520), the Changsha Municipal Natural Science Foundation (Grant No. kq2403054), and the Hunan Provincial Program for Young Key Teachers in Universities (Grant No. 20240101–20261230).

Author information

Authors and Affiliations

  1. Department of Endocrinology, The Third Xiangya Hospital of Central South University, 138 Tongzipo Road, Science and Education Building, Yuelu District, Changsha, 410013, Hunan Province, China

    Yaxuan He, Yu Cao, Rong Xiang & Fang Wang

  2. School of Automation, Central South University, Changsha City, 410083, Hunan Province, China

    Zekai Chen

  3. Key Laboratory of Pediatric Rare Diseases, Ministry of Education, Department of Cellular Biology, School of Life Sciences, Central South University, Changsha, 410013, Hunan Province, China

    Rong Xiang

Contributions

YH: Performed data analysis, constructed models, and drafted the manuscript. YC: Assisted with data analysis, model development, and manuscript preparation. ZC: Contributed to machine learning model validation. RX: Supported external data processing and validation. FW: Conceived and supervised the study, interpreted the results, and critically revised the manuscript.

Corresponding author

Correspondence to Fang Wang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Ethical approval

This study was approved by the Ethics Committee of The Third Xiangya Hospital, Central South University (Approval Number: 225546). As this was a retrospective study using anonymized health examination data, the requirement for informed consent was waived by the ethics committee.

Consent for publication

Not applicable.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

He, Y., Cao, Y., Chen, Z. et al. Integrating body composition analysis and machine learning for non-invasive identification of metabolic dysfunction-associated fatty liver disease: a large-scale health examination-based study. Sci Rep (2026). https://doi.org/10.1038/s41598-026-37852-w

  • Received: 06 October 2025

  • Accepted: 27 January 2026

  • Published: 03 February 2026

  • DOI: https://doi.org/10.1038/s41598-026-37852-w

Keywords

  • MAFLD
  • Body composition analysis
  • Visceral fat
  • Bioelectrical impedance analysis
  • Machine learning
  • Identification
  • Non-invasive screening
  • Metabolic syndrome