Fig. 4

Machine learning identification of diagnostic marker genes for NAFLD. (A, B) Feature selection with the SVM algorithm: the cross-validation error reached its minimum of 0.02 (A) and the accuracy peaked at 0.98 (B) when 2 genes were selected. (C) LASSO coefficient analysis. (D) Diagnostic performance of the LASSO model. (E) Random forest analysis of 10 DE-PRGs; 5 genes were selected, with an accuracy of 0.99. (F) Venn diagram showing the overlapping genes obtained with the three machine learning algorithms (SVM, LASSO, and RF). SVM, support vector machine; LASSO, least absolute shrinkage and selection operator; RF, random forest; NAFLD, nonalcoholic fatty liver disease.
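
The overlap in panel F amounts to a set intersection across the gene lists selected by the three algorithms. The following is a minimal Python sketch of that step only. The gene identifiers (`GENE_1` to `GENE_6`) are hypothetical placeholders, not genes from the study, and the size of the LASSO list is an assumption, since the caption does not report it.

```python
# Hypothetical identifiers: the caption does not name the selected genes.
svm_genes = {"GENE_1", "GENE_2"}                                # 2 genes, per the caption
lasso_genes = {"GENE_1", "GENE_2", "GENE_3", "GENE_4"}          # list size is assumed
rf_genes = {"GENE_1", "GENE_2", "GENE_3", "GENE_5", "GENE_6"}   # 5 genes, per the caption


def overlapping_genes(*gene_sets):
    """Return, in sorted order, the genes present in every input set."""
    if not gene_sets:
        return []
    return sorted(set.intersection(*map(set, gene_sets)))


# Genes selected by all three algorithms (the central region of a Venn diagram).
shared = overlapping_genes(svm_genes, lasso_genes, rf_genes)
print(shared)
```

With these placeholder sets, the shared genes are the two that appear in all three lists; with real data, the inputs would be the gene lists returned by each feature-selection step.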