Fig. 2

Differential Gene Expression and PE.ISG Analysis using Machine Learning Algorithms. (A) Volcano plot of differentially expressed genes (DEGs) between the PE and control groups in dataset GSE60438. (B) Venn diagram of DEGs and IRGs. (C) Frequency distribution of LASSO algorithm results over 1,000 repeated runs. The x-axis represents the different gene sets, and the y-axis represents frequency. The value on each bar indicates how often that gene set was selected. (D) Distribution of precision values for the bagged trees algorithm with different variables. The x-axis represents the source of preeclampsia immune source–related genes (PE.ISGs), and the y-axis represents the precision model’s optimal value. (E) Distribution of precision values for the RF algorithm with different numbers of variables. The x-axis represents the source of PE.ISG, and the y-axis represents the precision model’s optimal value. (F) Distribution of precision values for the Bayesian algorithm with different numbers of variables. The x-axis represents the source of PE.ISG, and the y-axis represents the precision model’s optimal value. (G) Wrapper feature selection (Boruta) algorithm. The x-axis shows the name of each variable, and the y-axis shows the Z-score of each variable. The box plot displays the Z-scores calculated during the model analysis: green boxes represent important variables and yellow boxes represent tentative attributes. Genes in the green and yellow boxes were selected for inclusion in the model. (H) Importance values of key genes identified using the learning vector quantization (LVQ) algorithm. PE, preeclampsia; IRGs, immune-related genes; IRDEGs, immune-related differentially expressed genes; DEGs, differentially expressed genes; PE.ISG, preeclampsia immune source–related gene; LASSO, least absolute shrinkage and selection operator; RF, random forest; LVQ, learning vector quantization; ML, machine learning.