Abstract
To address the need for a simple model to predict ≥ F2 fibrosis in patients with metabolic dysfunction-associated steatotic liver disease (MASLD), this study used data from 791 biopsy-proven MASLD patients from the NASH Clinical Research Network and Jinan University First Affiliated Hospital. The data were divided into training and internal testing sets by randomized stratified sampling. A multivariable logistic regression model using key categorical variables was developed to identify ≥ F2 fibrosis. External validation was performed using data from the FLINT trial and multiple centers in China. The DA-GAG score, which incorporates diabetes, age, GGT, the aspartate aminotransferase/platelet ratio, and the globulin/total protein ratio, demonstrated superior performance in distinguishing ≥ F2 fibrosis, with an area under the receiver operating characteristic curve of 0.79 in the training dataset and at least 0.80 in the testing datasets. The DA-GAG score efficiently identifies MASLD patients with ≥ F2 fibrosis and may reduce the medical burden.
Introduction
With the rapid increase in the prevalence of obesity and diabetes, nonalcoholic fatty liver disease (NAFLD), now referred to as metabolic dysfunction-associated steatotic liver disease (MASLD), has become a global health threat in both adults and children and is poised to become the leading cause of end-stage liver disease1,4. MASLD encompasses two pathologically distinct conditions with different prognoses: metabolic dysfunction-associated steatotic liver (MAFL) and metabolic dysfunction-associated steatohepatitis (MASH); the latter covers a spectrum of disease, including early MASH with no or mild fibrosis (F0/1) and at-risk MASH (≥ F2)5,6. The progression of liver fibrosis has been reported to be the most significant predictor of disease outcomes in MASLD, with ≥ F2 fibrosis posing a high risk of progression to end-stage liver disease7. The recent approval of resmetirom by the Food and Drug Administration (FDA) permits its use in patients with moderate (F2) to advanced (F3) liver fibrosis, marking a significant milestone in the field of MASH after more than two decades of research8. It is therefore imperative to distinguish patients with ≥ F2 fibrosis from those with F0/1 fibrosis in order to implement more effective interventions and ultimately improve prognosis9.
Currently, liver biopsy with histopathologic assessment remains the “gold standard” for diagnosing hepatic fibrosis (HF) and monitoring its progression10. Because liver biopsy is costly and invasive and carries a risk of severe complications, it is poorly accepted by patients and unsuitable for long-term monitoring, especially in primary care11. In addition, magnetic resonance elastography and vibration-controlled transient elastography are compromised by the intricate interplay between mechanical waves and various tissues, which can lead to inaccurate measurements12,13. Moreover, the substantial cost of the equipment and the need for specialized hepatologists significantly impede their widespread implementation in underserved medical areas and primary care.
Serum-based noninvasive diagnostic models, which combine common laboratory and demographic data, can rule liver fibrosis in or out to a certain degree in MASLD patients and may thereby reduce the medical and economic burden14. The NAFLD fibrosis score (NFS) and Fibrosis-4 (FIB-4) are widely used in clinical practice to identify ≥ F3 fibrosis in MASLD patients, yet they lack the precision to recognize ≥ F2 fibrosis accurately15,17. Additionally, the accuracy of previous models has been reported to vary by race, indicating that their diagnostic value differs among ethnicities18. Including multiracial MASLD patients in model development is therefore crucial for achieving more accurate results.
In this study, we developed and validated a practical and easily applicable scoring system to distinguish MASLD patients with ≥ F2 fibrosis from those with F0/1 fibrosis. Because it is simple and efficient, it could be widely used in primary care or by non-hepatologists to screen early for at-risk MASH patients who may be candidates for resmetirom treatment, thereby improving the long-term outcomes of MASLD patients.
Methods
Inclusion and exclusion criteria of study patients
The study included patients with biopsy-confirmed MASLD and excluded patients with increased alcohol intake (men, ≥ 30 g/day; women, ≥ 20 g/day), other chronic liver diseases [e.g., viral hepatitis (HBV, HCV, and HDV), drug-induced hepatitis, autoimmune hepatitis, and human immunodeficiency virus infection], significant comorbidities, or acute infection19.
Diabetes was defined by a 2-hour glucose ≥ 11.1 mmol/L during an oral glucose tolerance test, hemoglobin A1c (HbA1c) ≥ 6.5%, a history of diabetes, or the use of antidiabetic agents20.
Data sources of the MASLD patients in training dataset
To establish a broadly applicable scoring system, the training dataset was built from two sub-datasets. Figure S1 shows the data sources and the screening process of this study. The first sub-dataset was obtained from the Nonalcoholic Steatohepatitis (NASH) Clinical Research Network (CRN); it consisted mainly of White individuals and encompassed the entire spectrum of MASLD but contained few morbidly obese, young, or Asian patients (Table S1). Detailed information on the CRN data can be found in the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK, https://repository.niddk.nih.gov) database. The second sub-dataset consisted mostly of morbidly obese young Asian patients from the Jinan University First Affiliated Hospital (JNUFAH) and was collected from January 2019 to 2022 (Table S2). The study was approved by the ethics committee of the Jinan University First Affiliated Hospital. Combining the two sub-datasets compensated for the deficiencies of each and made the scoring system more robust (Table 1). Only patients whose liver biopsy data were available within 6 months of the clinical data were eligible for inclusion. We used only data collected before any therapy or surgery. For the JNUFAH data, patients without Masson Trichrome or Gordon and Sweet’s silver staining (well-established stains for fibrosis) for the determination of HF were excluded.
Data origin of the external testing datasets
The external testing 1# dataset was obtained from the Farnesoid X Nuclear Receptor Ligand Obeticholic Acid for Non-cirrhotic NASH (FLINT) trial, which represented a different population of liver fibrosis patients with higher rates of diabetes (Table 2). All protocols and consent forms for the NASH CRN and FLINT can be found in the NIDDK database. The external testing 2# dataset, representing a population of young people with lower rates of diabetes and high liver transaminase levels, was retrospectively collected from three hospitals: Guangzhou Medical University Eighth Affiliated Hospital, Guangzhou Medical University Fifth Affiliated Hospital, and Shenzhen Integrated Traditional Chinese and Western Medicine Hospital (Table 2). The testing analysis used only baseline data, and samples lacking the complete data necessary for the analysis were excluded (Figure S1). Blood samples were collected under fasting conditions. All patients provided written informed consent. This study was conducted in accordance with the ethical principles stated in the Declaration of Helsinki.
Liver histology
The pathological diagnoses of enrolled patients from China were confirmed by two pathologists in each participating center. Hematoxylin and eosin, Masson Trichrome, or Gordon and Sweet’s silver staining were used for fibrosis detection. The stage of fibrosis was scored based on the modified Brunt criteria: 0 = no fibrosis; 1 = perisinusoidal or portal fibrosis; 2 = perisinusoidal and portal/periportal fibrosis; 3 = bridging fibrosis; and 4 = cirrhosis21.
Selection of dependent and independent variables
To devise a scoring system for discriminating ≥ F2 fibrosis, fibrosis stage (stage 0/1 versus stage 2–4) was used as the binary dependent variable. The variables common to the three datasets were then selected for further analysis. In total, 28 variables were extracted (Table 1), including demographic data (age, gender, and race), body mass index (BMI), diabetes status, white blood cell (WBC), hemoglobin (HGB), platelet (PLT), HbA1c, aspartate aminotransferase (AST), alanine aminotransferase (ALT), gamma glutamyl-transpeptidase (GGT), alkaline phosphatase (ALP), total bilirubin (TBIL), direct bilirubin (DBIL), albumin (ALB), globulin (GLB), total protein (TP), uric acid (UA), creatinine (CREA), blood urea nitrogen (BUN), total cholesterol (TC), triglyceride (TG), low-density lipoprotein (LDL), and high-density lipoprotein (HDL). We also calculated the AST/ALT ratio (AAR), AST/PLT ratio (APR), and GLB/TP ratio (GTR).
Model training and evaluation
In total, we extracted 791 samples from the CRN and JNUFAH datasets for model training. To ensure comprehensive training on both Asian and other populations, we used randomized stratified sampling (CRN: JNUFAH = 2:1) to draw 660 samples for training and 131 for internal testing (training: testing ≈ 8:2) (Table 1). The model was optimized through 1000 iterations of the sampling and learning procedure. The Kolmogorov-Smirnov test (KS-test) was used to verify the similarity of the probability distributions between the training and internal testing sets. Additionally, the data size and similarity of the external validation data were measured using the tool described by Cabitza F et al.22. Missing values were considered missing at random and imputed using the quantile k-nearest neighbor (KNN) method in the “imputeLCMD” R package23,24. Figure S2 indicates that missing values were rare overall and that the data distribution remained consistent after imputation. To filter the variables, we first identified the variables that differed significantly between stage 0/1 and stage 2–4 fibrosis in continuous format. The linearity assumption for continuous variables was then assessed by restricted cubic spline (RCS) analysis with 5 knots, which examined the relationship between each significantly different variable and the log odds ratio of significant HF25. Afterward, these continuous variables were transformed into categorical format according to the breakpoints of the RCS using the “strucchange” R package26,27. Subsequently, multivariable logistic regression analysis was performed, with variables having a univariate logistic regression p-value < 0.05 included as inputs. Finally, the score was weighted according to the β coefficients derived from the logistic regression analysis to establish a straightforward and user-friendly model. The smallest absolute β coefficient was defined as 1 point, and points for the other variables were assigned as multiples of it according to their respective β coefficients28. The final model was designated DA-GAG (Diabetes, Age, GGT, APR, and GTR).
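The point-assignment rule described above can be sketched as follows. This is only an illustration under stated assumptions: the β values are hypothetical placeholders (the fitted coefficients are reported in Table 3, not here), the function name `betas_to_points` is invented for this example, and rounding to the nearest 0.5 point is an assumption made because the published score contains a 1.5-point item.

```python
def betas_to_points(betas, step=0.5):
    """Convert logistic-regression coefficients into integer-like points.

    The smallest absolute coefficient is worth 1 point; every other
    coefficient is expressed as a multiple of it and rounded to the
    nearest `step` (assumed to be 0.5 here).
    """
    smallest = min(abs(b) for b in betas.values())
    points = {}
    for name, beta in betas.items():
        ratio = abs(beta) / smallest          # multiple of the smallest beta
        points[name] = round(ratio / step) * step
    return points


# Hypothetical coefficients for illustration only; NOT the values in Table 3.
example_betas = {"diabetes": 0.90, "age": 0.62, "GGT": 0.60, "APR": 1.15, "GTR": 0.58}
example_points = betas_to_points(example_betas)
print(example_points)
```

With these placeholder coefficients the sketch yields 1.5 points for diabetes, 2 points per grade for APR, and 1 point per grade for the remaining variables, which mirrors the structure of the published score without reproducing its actual coefficients.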
The diagnostic performance of DA-GAG was evaluated by the area under the receiver operating characteristic curve (AUROC) and compared with that of FIB-4 and NFS using DeLong’s test. Calibration plots of predicted probability versus observed proportion were generated using the “rms” R package, with mean absolute errors estimated from 500 bootstrap resamples. To rule out significant liver fibrosis, a cutoff with a sensitivity of 90% was used, while a cutoff with a specificity of at least 85% was used to rule it in.
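For readers unfamiliar with the metric, the AUROC equals the probability that a randomly chosen patient with ≥ F2 fibrosis receives a higher score than a randomly chosen patient with F0/1 fibrosis, with ties counted as one half. The sketch below computes it directly from that definition. It is not the R workflow used in the study and does not implement DeLong’s test; the function name `auroc` and the six example scores and labels are made up for illustration.

```python
def auroc(scores, labels):
    """AUROC from its pairwise definition (ties count as 0.5).

    labels: 1 for >= F2 fibrosis, 0 for F0/1 fibrosis.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up scores and fibrosis labels, for illustration only.
demo_scores = [2.0, 4.5, 7.0, 5.0, 8.5, 4.5]
demo_labels = [0, 0, 1, 0, 1, 1]
print(round(auroc(demo_scores, demo_labels), 3))
```

The pairwise loop is quadratic in the number of patients, which is acceptable for cohorts of this size; rank-based formulas give the same value more efficiently.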
The following scores were calculated for comparison:
1. FIB-4 score: FIB-4 = [Age (years) × AST] / [PLT (×10⁹/L) × √(ALT)]15;
2. NFS score: NFS = −1.675 + [0.037 × Age (years)] + [0.094 × BMI (kg/m²)] + [1.13 × diabetes (yes = 1, no = 0)] + [0.99 × AST/ALT] − [0.013 × PLT (×10⁹/L)] − [0.66 × ALB (g/dL)]16.
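The two comparator formulas above can be written as small functions. This is a minimal sketch: the function and parameter names are chosen for this example, AST and ALT are assumed to be in U/L, and albumin must be supplied in g/dL (values reported in g/L, as elsewhere in this paper, would first need to be divided by 10).

```python
import math


def fib4(age_years, ast_u_l, alt_u_l, plt_1e9_l):
    """FIB-4 = (age x AST) / (PLT x sqrt(ALT))."""
    return (age_years * ast_u_l) / (plt_1e9_l * math.sqrt(alt_u_l))


def nfs(age_years, bmi_kg_m2, diabetes, ast_u_l, alt_u_l, plt_1e9_l, alb_g_dl):
    """NAFLD fibrosis score; `diabetes` is True/False, albumin in g/dL."""
    return (
        -1.675
        + 0.037 * age_years
        + 0.094 * bmi_kg_m2
        + 1.13 * (1 if diabetes else 0)
        + 0.99 * (ast_u_l / alt_u_l)
        - 0.013 * plt_1e9_l
        - 0.66 * alb_g_dl
    )


# Illustrative inputs, not patient data from the study.
print(fib4(50, 40, 64, 200))                      # 50*40 / (200*8)
print(round(nfs(50, 30, True, 40, 80, 200, 4.0), 2))
```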
Statistical analysis
Statistical analysis was conducted in R software (Version 4.1.3, www.r-project.org). Continuous data were compared between F0/1 and ≥ F2 using the t-test or the Wilcoxon test, depending on whether the data were normally distributed, and the χ² test was used for categorical data. Statistical significance was set at a p-value < 0.05.
Results
Clinical portraits of the MASLD patients
Table 1 summarizes the clinical characteristics of the training and internal testing datasets. In the training dataset, the median age was 38.00 [27.00, 53.00] years and males accounted for 40.2%. Of these individuals, 361 (54.7%) identified as White and 238 (36.0%) as Asian. The median BMI was 35.17 [30.68, 39.59] kg/m², and the prevalence of diabetes was 34.2%. Compared with those with F0/1, patients with ≥ F2 exhibited significantly lower levels of PLT {239.55 [191.75, 294.05] ×10⁹/L, p < 0.001} and HDL {0.98 [0.88, 1.19] mmol/L, p = 0.043}, as well as significantly higher levels of HbA1c {6.00 [5.50, 7.06] %, p < 0.001}, CREA {70.72 [61.66, 79.67] µmol/L, p = 0.038}, ALT {68.00 [45.75, 106.50] U/L, p < 0.001}, AST {51.00 [35.00, 75.00] U/L, p < 0.001}, AAR {0.76 [0.58, 0.98], p < 0.001}, APR {0.23 [0.14, 0.37], p < 0.001}, ALP {88.50 [72.00, 115.25] U/L, p = 0.007}, GGT {64.00 [40.75, 105.00] U/L, p < 0.001}, TP {73.45 [70.00, 76.90] g/L, p < 0.001}, GLB {31.00 [27.60, 34.00] g/L, p < 0.001}, and GTR {0.42 [0.39, 0.45], p < 0.001}. The KS test showed that the distributions of variables in the training and internal testing datasets were comparable (Table 1).
The external testing 1# dataset represented another cohort of the MASLD population with a high likelihood of developing HF (Table 2). It was composed predominantly of White people (87.1%) and older people {53.00 [43.00, 59.00] years}, with a high prevalence of diabetes (53.2%) and higher levels of liver enzymes (ALT, AST, ALP, and GGT). The overall BMI {33.58 [30.26, 37.58] kg/m²} and male proportion (34.3%) were lower than in the training dataset. In contrast, the external testing 2# dataset represented a markedly different MASLD population with a high rate of ≥ F2 fibrosis (57.9%). Its prevalence of diabetes (20.2%), age {34.00 [27.75, 44.00] years}, and BMI {27.90 [25.07, 33.03] kg/m²} were the lowest, but its liver enzymes (ALT, ALP, and GGT) were the highest among all datasets (Table 2). Including such diverse MASLD populations in the external testing datasets strengthened the validation process.
Model construction for identification of ≥ F2 fibrosis
Our objective was to devise a scoring system that could be easily applied and calculated in clinical practice without tedious calculations. Hence, we chose to convert the variables that differed significantly in the training dataset into categorical format. We used RCS analysis to explore the relationship and linearity between these variables and the log odds ratio of ≥ F2 in the overall training dataset (Fig. 1A-D). Age, GTR, GLB, TP, CREA, and HDL showed linear associations (p for nonlinearity > 0.05) with significant HF, while nonlinear associations (p for nonlinearity < 0.05) were observed for GGT, APR, ALT, AST, AAR, PLT, HbA1c, and ALP. AAR, HbA1c, ALP, CREA, and HDL were excluded because they exhibited a complex “N”-shaped or inverted “U”-shaped relationship with ≥ F2, which made them unsuitable for transformation into categorical variables (Figure S3). Of note, MASLD patients < 40 years showed a consistently low risk of significant fibrosis (log odds ratio < −1), the risk rose sharply in patients aged 40 to 55, and those ≥ 55 years exhibited a significant risk (log odds ratio > 0). MASLD patients with GGT levels below 35 U/L exhibited a low-risk profile, and the effect of GGT on the log odds ratio appeared to plateau beyond 55 U/L. In addition, MASLD patients with APR levels < 0.1 had a low-risk profile, and those with APR ≥ 0.3 exhibited a significant risk. Because the relationship between GTR and significant fibrosis was linear, we used the quartiles of GTR for categorization and set its cut-off points at 0.35 and 0.45, which were close to the bounds of the interquartile range (IQR) of GTR.
Fig. 1. Restricted cubic spline (RCS) analysis was performed to assess the linearity and explore the relationship between significant fibrosis (≥ F2) and (A) age, (B) gamma glutamyl-transpeptidase (GGT), (C) aspartate aminotransferase/platelet ratio (APR), and (D) globulin/total protein ratio (GTR). The red nodes represent the breakpoints of the RCS; (E) the forest plot and bar plot show the odds ratio and the area under the receiver operating characteristic curve (AUROC) of the categorical variables in the training dataset. PLT, platelets; ALT, alanine aminotransferase; AST, aspartate aminotransferase; APR, AST/PLT ratio; GGT, gamma glutamyl-transpeptidase; TP, total protein; GLB, globulin; GTR, GLB/TP ratio.
Then, we transformed these continuous variables (APR, GGT, AST, age, ALT, GTR, PLT, GLB, and TP) into categorical format according to the breakpoints (Table S3). Afterward, univariate logistic regression and ROC analysis were performed to calculate the odds ratio (OR) and the diagnostic performance of each categorical variable, respectively. APR [OR = 3.76 (2.82–5.00), p < 0.001, AUROC = 0.701] performed better than AST [OR = 3.68 (2.71–4.99), p < 0.001, AUROC = 0.683] or PLT [OR = 2.36 (1.66–3.34), p < 0.001, AUROC = 0.584] alone, while GTR [OR = 2.02 (1.49–2.75), p < 0.001, AUROC = 0.586] performed better than GLB [OR = 2.01 (1.48–2.75), p < 0.001, AUROC = 0.584] or TP [OR = 1.52 (1.17–1.98), p = 0.002, AUROC = 0.561] alone. Therefore, we selected APR and GTR for model construction rather than AST, PLT, GLB, and TP. Notable performance was also observed for age [OR = 2.02 (1.65–2.47), p < 0.001, AUROC = 0.645], diabetes [OR = 3.36 (2.40–4.71), p < 0.001, AUROC = 0.637], and ALT [OR = 2.47 (1.83–3.36), p < 0.001, AUROC = 0.619] (Fig. 1E).
To enhance the simplicity and interpretability of the model, ALT was further eliminated by stepwise logistic regression based on the improvement in AIC (Table S4). Table 3 summarizes the final multivariate logistic regression model (DA-GAG), which includes diabetes, age, GGT, APR, and GTR. According to the β coefficients derived from the multivariate logistic regression analysis, diabetes was given 1.5 points, age 1 point for each grade (up to 2 points), GGT 1 point for each grade (up to 2 points), APR 2 points for each grade (up to 4 points), and GTR 1 point for each grade (up to 2 points). The resulting DA-GAG score, defined by these clinical and laboratory parameters, ranges from 0 to 11.5.
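A minimal sketch of how the score could be computed from routine values is given below. The point weights come from the paragraph above, but the grade boundaries are assumptions read from the RCS description in the text (age 40 and 55 years, GGT 35 and 55 U/L, APR 0.1 and 0.3, GTR 0.35 and 0.45), as is the handling of values that fall exactly on a boundary; the authoritative breakpoints are those in Table S3. All function and variable names are invented for this example.

```python
from bisect import bisect_right

# Variable -> (assumed cut-offs, points per grade). A value below the first
# cut-off is grade 0, between the two cut-offs grade 1, at or above the
# second cut-off grade 2. Cut-offs are assumptions; see Table S3.
GRADING = {
    "age": ((40, 55), 1.0),      # years
    "ggt": ((35, 55), 1.0),      # U/L
    "apr": ((0.1, 0.3), 2.0),    # AST (U/L) / PLT (x10^9/L)
    "gtr": ((0.35, 0.45), 1.0),  # GLB / TP, same units for both
}
DIABETES_POINTS = 1.5


def grade(value, cutoffs):
    """Return 0, 1 or 2 depending on how many cut-offs the value reaches."""
    return bisect_right(cutoffs, value)


def da_gag_score(diabetes, age_years, ggt_u_l, ast_u_l, plt_1e9_l, glb_g_l, tp_g_l):
    """Sum the DA-GAG points (0 to 11.5) under the assumptions above."""
    values = {
        "age": age_years,
        "ggt": ggt_u_l,
        "apr": ast_u_l / plt_1e9_l,
        "gtr": glb_g_l / tp_g_l,
    }
    score = DIABETES_POINTS if diabetes else 0.0
    for name, (cutoffs, points_per_grade) in GRADING.items():
        score += grade(values[name], cutoffs) * points_per_grade
    return score


# Illustrative patient: diabetic, 58 years, GGT 64 U/L, AST 51 U/L,
# PLT 240 x10^9/L, GLB 31 g/L, TP 73.45 g/L.
print(da_gag_score(True, 58, 64, 51, 240, 31, 73.45))
```

Keeping the cut-offs in a single table makes it easy to replace the assumed boundaries with the published ones without touching the scoring logic.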
Model evaluation
Finally, we evaluated the AUROC of DA-GAG in the training and testing datasets. DA-GAG outperformed NFS and FIB-4 in both the training dataset (AUROC = 0.79 vs. 0.69 vs. 0.73, p < 0.001, Fig. 2A) and the internal testing dataset (AUROC = 0.81 vs. 0.64 vs. 0.71, p < 0.001, Fig. 2B). To further validate the performance of DA-GAG, we compared the diagnostic capability of the three scores in the external testing 1# dataset. Consistent with the findings above, DA-GAG (AUROC = 0.80) performed better than NFS (AUROC = 0.71, p < 0.001) and FIB-4 (AUROC = 0.73, p < 0.001) (Fig. 2C). Even in the population with low rates of diabetes and obesity (external testing 2# dataset), DA-GAG performed better than NFS and FIB-4 (AUROC = 0.80 vs. 0.64 vs. 0.67, p < 0.001, Fig. 2D). Next, the calibration plots across all datasets showed that the DA-GAG model was well calibrated for distinguishing ≥ F2 fibrosis from F0/1 fibrosis, with low mean absolute errors ranging from 0.018 to 0.035 (Figure S4A). We then evaluated the data size and the similarity between the training dataset and the external testing datasets to assess the robustness of our findings. The results showed an adequate data size and only slight similarity between the training dataset and the external testing datasets (Fig. 2E). These results indicated that the performance of DA-GAG was robust and superior to that of previous models.
Fig. 2. The area under the receiver operating characteristic curve (AUROC) demonstrated that the DA-GAG score outperformed FIB-4 and NFS in the (A) training, (B) internal testing, (C) external testing 1#, and (D) external testing 2# datasets; (E) the data size and similarity of the external testing datasets were evaluated; (F) the probabilities of F0/1 and ≥ F2 at each DA-GAG score, derived from the combined training and testing datasets.
Subsequently, we determined the diagnostic thresholds of DA-GAG. The low threshold (DA-GAG score ≤ 4.0), with a sensitivity of 90%, was selected to rule out ≥ F2 fibrosis. In the training dataset, the sensitivity and negative predictive value (NPV) were 90% and 88%, respectively; in the internal testing dataset, they were 89% and 90%; in the external testing 1# dataset, 94% and 73%; and in the external testing 2# dataset, 79% and 73%. The high threshold to rule in ≥ F2 fibrosis was set at a DA-GAG score ≥ 7.0, with a specificity and positive predictive value (PPV) of 88% and 73%, respectively. In the internal testing dataset, the specificity and PPV were 92% and 73%, respectively; in the external testing 1# dataset, 76% and 82%; and in the external testing 2# dataset, 92% and 87% (Table S5). By applying the low and high thresholds, liver biopsy could be avoided in 774 (67%) of the 1155 patients and would be needed only in the 381 (33%) patients who fell in the “grey zone” (scores 4.5–6.5).
Given the impact of baseline features such as age, diabetes, and BMI on prediction accuracy, we investigated the performance of DA-GAG across baseline subgroups in the overall data. DA-GAG performed better than FIB-4 and NFS among MASLD patients aged < 40 years, between 40 and 55 years, and ≥ 55 years, with AUROCs of 0.76, 0.78, and 0.80, respectively (Figure S4B). In addition, the performance of DA-GAG remained stable and better than that of FIB-4 and NFS in MASLD patients both with and without diabetes, with a consistent AUROC of 0.76 (Figure S4C). Moreover, DA-GAG exhibited superior performance across all BMI categories, including BMI < 30 kg/m², 30–35 kg/m², and ≥ 35 kg/m² (AUROC = 0.76, 0.82, and 0.81, respectively) (Figure S4D). These results indicated that baseline conditions have a relatively minor impact on DA-GAG.
Additionally, we calculated the probabilities of F0/1 and ≥ F2 at each score from the standard logit-transformed probability in all datasets. At a DA-GAG score of 4.0, the probability of F0/1 was estimated at 74% and that of ≥ F2 at 26%, while at a score of 7.0, the probability of F0/1 was estimated at 39% and that of ≥ F2 increased to 61% (Fig. 2F). These results suggested that the low and high thresholds were effective in identifying MASLD patients with ≥ F2 fibrosis. In addition, clinicians can use these probabilities to devise a tailored management plan for patients with indeterminate results (scores 4.5–6.5).
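The two-threshold rule and the accuracy measures reported above can be sketched in a few lines. The thresholds (≤ 4.0 and ≥ 7.0) and the patient counts (774, 381, 1155) are taken from the text; the function names and the confusion-matrix counts in the last example are hypothetical and serve only to show how sensitivity, specificity, PPV, and NPV are derived.

```python
def triage(score, low=4.0, high=7.0):
    """Map a DA-GAG score to a decision using the reported thresholds."""
    if score <= low:
        return "rule out"    # >= F2 fibrosis unlikely
    if score >= high:
        return "rule in"     # >= F2 fibrosis likely
    return "grey zone"       # scores 4.5 to 6.5: individualised decision


def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Check of the proportions quoted in the text: 774 + 381 = 1155 patients.
avoided, grey, total = 774, 381, 1155
print(round(100 * avoided / total), round(100 * grey / total))

# Hypothetical counts, for illustration only (not from Table S5).
print(diagnostic_metrics(tp=90, fp=30, tn=70, fn=10))
```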
Discussion
MASLD is the most prevalent chronic liver disease worldwide, affecting between 30% and 44% of adults and obese children29. It has been reported that ≥ F2 fibrosis in patients with MASH is related to the all-cause mortality of MASLD7. The primary responsibility for managing the significant burden of MASLD lies with primary care physicians, and the economic impact is substantial given the large population affected by this condition30. However, fewer than 5% of primary care physicians report using noninvasive diagnostic markers to stratify MASLD patients31. At present, a scoring system that predicts ≥ F2 fibrosis in MASLD patients, especially in a simple and straightforward manner, is still lacking. The development of precise, simple, and effective models for identifying MASLD patients with ≥ F2 fibrosis is therefore crucial. In this study, we developed a novel grading system to detect MASLD patients with ≥ F2 fibrosis, named DA-GAG, which incorporates diabetes, age, GGT, APR, and GTR. Our results demonstrated that DA-GAG showed better performance and robustness than NFS and FIB-4. We also determined low and high thresholds with high NPV and PPV for clinical decisions. Additionally, DA-GAG was less susceptible to baseline conditions including age, diabetes, and BMI. Moreover, owing to its precision, widely available variables, and straightforward computation, the DA-GAG score can be readily used by physicians to detect MASLD patients with ≥ F2 fibrosis, which may substantially relieve the economic and medical burden that the high prevalence of MASLD places on society. Furthermore, given its robustness and accuracy, the DA-GAG score can identify MASLD patients who may be suitable candidates for resmetirom treatment, thereby improving their prognosis.
Predictors of significant fibrosis among MASLD patients have been identified in previous studies. It is well recognized that diabetes and age are risk factors for NASH and that diabetes is the strongest predictor of MASLD-related liver fibrosis and cirrhosis32,34. In addition, GGT is associated with metabolic syndrome and liver fibrosis35. The APR index (APRI) was initially used to detect advanced fibrosis in patients with chronic hepatitis C virus infection and has subsequently been applied in MASLD, but it is unsuitable for identifying significant fibrosis on its own36,37. Our results reinforce these findings. We also observed a significant correlation between the GTR (GLB/TP ratio) and significant fibrosis, which has not been previously reported. Notably, GLB is positively associated with the stage of liver fibrosis, and animal studies have demonstrated that immunoglobulins can directly activate hepatic stellate cells, thereby promoting the development of liver fibrosis38. In addition, TP reflects the nutritional status and the hepatic synthetic capacity of MASLD patients.
The present study has several strengths. We believe that the DA-GAG score, with its common indicators and simple computation, can enhance clinicians’ ability to screen MASLD patients for ≥ F2 fibrosis without increasing the clinical burden. In addition, our study included a total of 1155 MASLD patients from multiple centers in America and Asia, which strengthened the robustness of DA-GAG. Moreover, the low and high thresholds had considerable NPV and PPV, respectively, indicating that the DA-GAG score could be used effectively in primary care settings or in patients who are unwilling to undergo liver biopsy.
However, our study also has some limitations. As with previous scoring systems, the current algorithm still has a grey zone in which decisions have to be tailored to individual circumstances and preferences. We therefore provide the probability of ≥ F2 fibrosis at each score for reference. Although the low and high thresholds have high predictive values for identifying MASLD patients with ≥ F2 fibrosis, DA-GAG is still far from discriminating perfectly. Future studies should enlarge the training and testing data to establish a more accurate model. Additionally, most MASLD patients in our study were overweight or obese, so the precision of DA-GAG needs further validation in the normal-weight population. Moreover, the generalizability of the DA-GAG score to other ethnicities remains understudied, although it appears to be applicable to White and Asian populations. Given the retrospective nature of this study, it was difficult to ensure consistent testing methods or instrumentation across all samples in the datasets, which could introduce variability and potentially affect the results.
In conclusion, the DA-GAG score, a novel, simple, and straightforward model, has been developed to identify MASLD patients with ≥ F2 fibrosis. DA-GAG incorporates widely available graded indicators, and its robustness was confirmed in the external testing data. We believe that implementing the DA-GAG score would improve clinicians’ capacity to screen for ≥ F2 fibrosis in MASLD without imposing additional clinical burden. However, more prospective and cross-sectional studies should be performed to further confirm the robustness and accuracy of the current model and to evaluate its relevance to the long-term outcomes of MASLD.
Data availability
The CRN/FLINT datasets generated and/or analysed during the current study are available in the NIDDK Central Repository, https://repository.niddk.nih.gov. The JNUFAH datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
Abbreviations
- MASLD: Metabolic dysfunction-associated steatotic liver disease
- NAFLD: Nonalcoholic fatty liver disease
- MAFL: Metabolic dysfunction-associated steatotic liver
- MASH: Metabolic dysfunction-associated steatohepatitis
- NASH: Nonalcoholic steatohepatitis
- HF: Hepatic fibrosis
- NFS: NAFLD fibrosis score
- FIB-4: Fibrosis-4
- HbA1c: Hemoglobin A1c
- NASH CRN: NASH Clinical Research Network
- NIDDK: The National Institute of Diabetes and Digestive and Kidney Diseases
- JNUFAH: Jinan University First Affiliated Hospital
- FLINT: The farnesoid X nuclear receptor ligand obeticholic acid for non-cirrhotic NASH
- BMI: Body mass index
- WBC: White blood cell
- HGB: Hemoglobin
- PLT: Platelet
- AST: Aspartate aminotransferase
- ALT: Alanine aminotransferase
- GGT: Gamma glutamyl-transpeptidase
- ALP: Alkaline phosphatase
- TBIL: Total bilirubin
- DBIL: Direct bilirubin
- ALB: Albumin
- GLB: Globulin
- TP: Total protein
- UA: Uric acid
- CREA: Creatinine
- BUN: Blood urea nitrogen
- TC: Total cholesterol
- TG: Triglyceride
- LDL: Low-density lipoprotein
- HDL: High-density lipoprotein
- AAR: AST/ALT ratio
- APR: AST/PLT ratio
- GTR: GLB/TP ratio
- KS-test: Kolmogorov-Smirnov test
- KNN: K-nearest neighbor
- RCS: Restricted cubic spline
- AUROC: The area under the receiver operating characteristic curve
- IQR: Interquartile range
- OR: Odds ratio
References
Weiss, J. M. et al. Itaconic Acid underpins hepatocyte lipid metabolism in non-alcoholic fatty liver disease in male mice. Nat. Metab. 5, 981–995 (2023).
Baselli, G. A. et al. Liver transcriptomics highlights Interleukin-32 as Novel Nafld-related cytokine and candidate biomarker. Gut 69, 1855–1866 (2020).
Rinella, M. E. et al. A Multisociety Delphi Consensus Statement on new fatty liver Disease nomenclature. J. Hepatol. 79, 1542–1556 (2023).
Crudele, L. et al. Fatty liver index (Fli) is the best score to Predict Masld with 50% lower cut-off value in women than in men. Biol. Sex. Differ. 15, 43 (2024).
Tao, L. et al. Integrative clinical and preclinical studies identify Ferroterminator1 as a potent therapeutic drug for Mash. Cell. Metab. 36, 2190–2206 (2024).
Kanwal, F., Neuschwander-Tetri, B. A., Loomba, R. & Rinella, M. E. Metabolic dysfunction-Associated Steatotic Liver Disease: Update and Impact of New nomenclature on the American Association for the study of Liver diseases Practice Guidance on nonalcoholic fatty liver disease. Hepatology 79, 1212–1219 (2024).
Taylor, R. S. et al. Association between Fibrosis Stage and outcomes of patients with nonalcoholic fatty liver disease: a systematic review and Meta-analysis. Gastroenterology 158, 1611–1625 (2020).
Harrison, S. A. et al. A phase 3, Randomized, Controlled Trial of Resmetirom in Nash with Liver Fibrosis. N Engl. J. Med. 390, 497–509 (2024).
Sanai, F. M. et al. Management of nonalcoholic fatty liver disease in the Middle East. World J. Gastroenterol. 26, 3528–3541 (2020).
Shen, Y., Wu, Y., Fu, M., Zhu, K. & Wang, J. Association between Weight-adjusted-Waist Index with hepatic steatosis and liver fibrosis: a nationally Representative Cross-sectional Study from NHANES 2017 to 2020. Front. Endocrinol. 14, 1159055 (2023).
Sun, X. et al. Neutralization of oxidized phospholipids ameliorates non-alcoholic Steatohepatitis. Cell. Metab. 31, 189–206 (2020).
Payen, T. et al. Harmonic Motion Imaging of Pancreatic Tumor Stiffness Indicates Disease State and Treatment Response. Clin. Cancer Res. 26, 1297–1308 (2020).
Abenavoli, L. et al. Metabolic dysfunction-Associated Steatotic Liver Disease in patients with inflammatory Bowel diseases: a pilot study. Life-Basel 14, 1226 (2024).
Ajmera, V. H. et al. Clinical utility of an increase in magnetic resonance Elastography in Predicting Fibrosis Progression in nonalcoholic fatty liver disease. Hepatology 71, 849–860 (2020).
Sterling, R. K. et al. Development of a simple Noninvasive Index to predict significant fibrosis in patients with HIV/HCV coinfection. Hepatology 43, 1317–1325 (2006).
Angulo, P. et al. The NAFLD Fibrosis score: a noninvasive system that identifies liver fibrosis in patients with NAFLD. Hepatology 45, 846–854 (2007).
Vali, Y. et al. Biomarkers for staging fibrosis and non-alcoholic steatohepatitis in non-alcoholic fatty liver disease (the LITMUS Project): a comparative diagnostic accuracy study. Lancet Gastroenterol. Hepatol. 8, 714–725 (2023).
Marella, H. K. et al. Accuracy of Noninvasive Fibrosis Scoring systems in African American and white patients with nonalcoholic fatty liver disease. Clin. Transl Gastroenterol. 11, e165 (2020).
EASL-EASD-EASO. Clinical practice guidelines for the management of non-alcoholic fatty liver disease. Diabetologia 59, 1121–1140 (2016).
Molinaro, A. et al. Imidazole propionate is increased in diabetes and Associated with dietary patterns and altered Microbial Ecology. Nat. Commun. 11, 5881 (2020).
Kleiner, D. E. et al. Design and validation of a histological Scoring System for nonalcoholic fatty liver disease. Hepatology 41, 1313–1321 (2005).
Cabitza, F. et al. The importance of being External. Methodological insights for the External Validation of Machine Learning models in Medicine. Comput. Meth Programs Biomed. 208, 106288 (2021).
Albig, C. et al. JASPer Controls Interphase histone H3S10 phosphorylation by chromosomal kinase JIL-1 in Drosophila. Nat. Commun. 10, 5343 (2019).
Shah, J. S. et al. Distribution based nearest neighbor imputation for truncated high Dimensional data with applications to pre-clinical and clinical Metabolomics studies. BMC Bioinform. 18, 114 (2017).
Harrell, F. E. Regression Modeling Strategies: with Applications to Linear Models, Logistic Regression, and Survival Analysis Vol. 608 (Springer, 2001).
Zeileis, A., Leisch, F., Hornik, K. & Kleiber, C. strucchange: An R Package for Testing for Structural Change in Linear Regression models. J. Stat. Softw. 7, 1–38 (2002).
Yu, L. et al. Survival of Del17p CLL depends on genomic complexity and somatic mutation. Clin. Cancer Res. 23, 735–745 (2017).
Harrison, S. A., Oliver, D., Arnold, H. L., Gogia, S. & Neuschwander-Tetri, B. A. Development and validation of a simple NAFLD Clinical Scoring System for identifying patients without Advanced Disease. Gut 57, 1441–1447 (2008).
Wu, Z., Wang, W., Zhang, K., Fan, M. & Lin, R. Trends in the incidence of cirrhosis in global from 1990 to 2019: a joinpoint and age-period-cohort analysis. J. Med. Virol. 95, e28858 (2023).
Allen, A. M., Van Houten, H. K., Sangaralingham, L. R., Talwalkar, J. A. & McCoy, R. G. Healthcare Cost and utilization in nonalcoholic fatty liver disease: real-World Data from a large U.S. Claims Database. Hepatology 68, 2230–2238 (2018).
Sripongpun, P. et al. The steatosis-Associated Fibrosis Estimator (SAFE) score: a Tool to detect low-risk NAFLD in Primary Care. Hepatology 77, 256–267 (2023).
Jensen, T. et al. Fructose and Sugar: a major mediator of non-alcoholic fatty liver disease. J. Hepatol. 68, 1063–1075 (2018).
Lee, J. I., Lee, H. W. & Lee, K. S. Value of controlled attenuation parameter in Fibrosis Prediction in Nonalcoholic Steatohepatitis. World J. Gastroenterol. 25, 4959–4969 (2019).
Noureddin, M. et al. Clinical and histological determinants of nonalcoholic steatohepatitis and Advanced Fibrosis in Elderly patients. Hepatology 58, 1644–1654 (2013).
Andrade, T. G., Xavier, L., Souza, F. F. & Araujo, R. C. Risk predictors of advanced hepatic fibrosis in patients with nonalcoholic fatty liver Disease - A Survey in a University Hospital in Brazil. Arch. Endocrinol. Metab. 66, 823–830 (2022).
Unalp-Arida, A. & Ruhl, C. E. Liver Fibrosis scores Predict Liver Disease Mortality in the United States Population. Hepatology 66, 84–95 (2017).
Wai, C. T. et al. A simple Noninvasive Index can predict both significant fibrosis and cirrhosis in patients with chronic Hepatitis C. Hepatology 38, 518–526 (2003).
Wang, J. et al. A novel non-invasive model for the prediction of Advanced Liver Fibrosis in Chronic Hepatitis B patients with NAFLD. J. Viral Hepat. 30, 287–296 (2023).
Acknowledgements
The authors appreciate the study investigators and staff who participated in this study. The NAFLD adult and FLINT were conducted by the NAFLD adult and FLINT Investigators and supported by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK). The data from the NAFLD adult and FLINT reported here were supplied by the NIDDK Central Repository. This manuscript was not prepared in collaboration with Investigators of the NAFLD adult and FLINT study and does not necessarily reflect the opinions or views of the NAFLD adult and FLINT study, the NIDDK Central Repository, or the NIDDK.
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Author information
Contributions
Linjing Long, Yue Wu, Huijun Tang, and Yanhua Xiao contributed equally to this paper. All authors have read and approved the final version of the manuscript. Linjing Long: Writing - original draft, Data curation, Formal analysis, Investigation, Methodology, Software, Supervision, Validation, Visualization, Conceptualization. Yue Wu: Writing - original draft, Resources, Investigation, Supervision, Validation, Formal analysis. Huijun Tang: Writing - original draft, Resources, Investigation, Supervision, Validation, Formal analysis. Yanhua Xiao: Writing - original draft, Formal analysis, Resources, Investigation, Supervision. Min Wang: Investigation, Methodology. Lianli Shen: Investigation, Visualization. Ying Shi: Investigation. Shufen Feng: Investigation. Chujing Li: Investigation. Jiaheng Lin: Investigation. Chutian Wu and Shaohui Tang: Conceptualization, Funding acquisition, Project administration, Resources, Formal analysis, Methodology, Supervision, Validation, Writing - original draft, Writing - review & editing.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethics approval
The study was approved by the Ethics Committee of Jinan University First Affiliated Hospital (Approval number: KY-2022-231).
Patient Consent Statement
All patients provided written informed consent.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Long, L., Wu, Y., Tang, H. et al. Development and validation of a scoring system to predict MASLD patients with significant hepatic fibrosis. Sci Rep 15, 9639 (2025). https://doi.org/10.1038/s41598-025-91013-z




