Fig. 5 | Scientific Reports

Fig. 5

From: HLA-DPA1 as a diagnostic biomarker differentiating early- and late-onset preeclampsia

Supplementary Analysis of Three Key DIRGs.

(A) Chromosomal locations of the three key DIRGs, mapped to their positions on the respective chromosomes to provide genomic context for exploring their potential regulatory relationships. The circos plot was generated using the circlize package (version 0.4.17; https://cran.r-project.org/web/packages/circlize/) in R.

(B) PCA plot of the sample distribution based on the expression profiles of the three key DIRGs. Samples are color-coded by group (blue for the LOPE group and red for the EOPE group), and dashed ellipses enclose the samples of each group, representing the 95% confidence intervals for group-wise sample clustering. The x-axis and y-axis correspond to the first two principal components (PC1 and PC2), and the percentage of total variance explained by each component is given in parentheses next to the axis label, which allows the effect of the dimensionality reduction to be assessed. Principal components were computed with the prcomp function, and the plot was generated using the ggplot2 package (version 4.0.1) in R.

(C) Expression levels of the three key DIRGs in the EOPE and LOPE groups of the training group, which integrates two datasets (GSE75010 and GSE60438) so that the observed expression differences are supported by a larger combined sample. The box plot was generated using the ggpubr (version 0.6.2; https://cran.r-project.org/web/packages/ggpubr/) and ggplot2 packages in R.

(D) ROC curves validating the predictive efficacy of the three key DIRGs for EOPE within the training group. The AUC quantifies how well each gene distinguishes EOPE from LOPE samples. The ROC analysis and curves were generated using the pROC package (version 1.18.5; https://cran.r-project.org/web/packages/pROC/) in R.

(E) Nomogram for predicting the risk of EOPE from the expression levels of the three key DIRGs: HLA-DPA1, PROK2, and LEP. Each gene is assigned a number of points according to its expression level in a given sample; the points from the three genes are summed, and the total is converted to the corresponding risk of EOPE. The nomogram was generated using the rms package (version 8.0.0; https://cran.r-project.org/web/packages/rms/) in R (v.4.4.0; https://www.r-project.org/).

(F) Calibration curve for the nomogram, assessing the agreement between the predicted risk of EOPE (x-axis) and the observed risk (y-axis). The diagonal line represents ideal prediction, where predicted risk matches observed risk exactly. The solid line (labeled “Apparent”) shows the model’s performance before bias correction, and the dashed line (labeled “Bias-corrected”) shows the performance after bias correction by bootstrapping with 1000 repetitions, which allows the model’s calibration accuracy and potential overfitting to be evaluated. The calibration plot was generated using the rms package in R.
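As a rough illustration of what the AUC in panel (D) measures, the following Python sketch computes it directly as the probability that a randomly chosen EOPE sample scores higher than a randomly chosen LOPE sample, with ties counted as one half. This quantity equals the area under the empirical ROC curve. It is not the authors' code (they used the pROC package in R), the expression values are invented toy numbers, and the function name `auc` is hypothetical. The sketch also assumes that higher expression indicates EOPE; for a gene that is lower in EOPE the value falls below 0.5 unless the direction is flipped, which pROC handles automatically by default.

```python
def auc(pos_scores, neg_scores):
    """Rank-based AUC: P(random positive > random negative), ties count 1/2."""
    if not pos_scores or not neg_scores:
        raise ValueError("both groups need at least one sample")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0   # positive sample ranked above negative sample
            elif p == n:
                wins += 0.5   # a tie contributes half a win
    return wins / (len(pos_scores) * len(neg_scores))


# Toy expression values, invented for illustration only (not data from the paper).
eope = [3.0, 4.0, 5.0]   # treated as the positive class
lope = [1.0, 2.0, 3.0]   # treated as the negative class
toy_auc = auc(eope, lope)
```

An AUC of 1.0 would mean the gene separates the two groups perfectly, and 0.5 would mean it does no better than chance.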
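The point-summing logic of the nomogram in panel (E) can be sketched in Python, assuming the nomogram is derived from a logistic regression model, which is the usual setup for a nomogram built with rms. Every number below (intercept, coefficients, and expression ranges) is a made-up placeholder, including the signs of the coefficients; none of it comes from the paper. The names `gene_points` and `risk_from_points` are hypothetical. The predictor with the widest effect on the log-odds spans 0 to 100 points, and the other predictors are scaled relative to it.

```python
import math

# Hypothetical logistic-regression fit; NOT the coefficients from the paper.
INTERCEPT = -1.0
COEFS = {"HLA-DPA1": -0.8, "PROK2": 0.5, "LEP": 0.3}
# Hypothetical (min, max) expression range covered by each nomogram axis.
RANGES = {"HLA-DPA1": (6.0, 10.0), "PROK2": (4.0, 9.0), "LEP": (8.0, 14.0)}


def reference_value(gene):
    """Expression value at which the gene contributes least to the log-odds."""
    lo, hi = RANGES[gene]
    return lo if COEFS[gene] >= 0 else hi


# Largest log-odds span of any single gene; this span is mapped to 100 points.
MAX_SPAN = max(abs(COEFS[g]) * (RANGES[g][1] - RANGES[g][0]) for g in COEFS)


def gene_points(gene, x):
    """Points for one gene at expression level x (0 at the reference value)."""
    return 100.0 * (COEFS[gene] * (x - reference_value(gene)) / MAX_SPAN)


def risk_from_points(total_points):
    """Convert total points back to a predicted probability of EOPE."""
    baseline = INTERCEPT + sum(COEFS[g] * reference_value(g) for g in COEFS)
    log_odds = baseline + total_points * MAX_SPAN / 100.0
    return 1.0 / (1.0 + math.exp(-log_odds))


# Toy sample, invented for illustration only.
sample = {"HLA-DPA1": 7.0, "PROK2": 6.0, "LEP": 11.0}
total = sum(gene_points(g, x) for g, x in sample.items())
risk = risk_from_points(total)
```

Because the points are only a rescaling of each gene's contribution to the log-odds, the risk read from the total-points axis matches the probability obtained by evaluating the logistic model directly.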
