Abstract
Preeclampsia (PE) is a hypertensive disorder that affects approximately 5% of pregnancies. It is characterized by new-onset hypertension and proteinuria after 20 weeks of gestation. We investigated maternal second-trimester peripheral blood gene expression profiles to identify potential biomarkers for predicting late-onset PE. We found significant associations between PE and several inflammatory and immunological pathways, including the JAK-STAT signaling, leukocyte transendothelial migration, systemic lupus erythematosus (SLE), and graft versus host disease (GVHD) pathways. Twelve candidate biomarkers were identified, with high predictive power (AUC 0.74–0.87) for late-onset PE. These biomarkers were also associated with various blood cell populations, particularly intermediate monocytes. Our study highlights the role of maternal immune dysregulation in PE and the potential of second-trimester blood RNA profiles in disease prediction.
Introduction
Preeclampsia (PE) is a hypertensive pregnancy complication, characterized by new-onset hypertension and proteinuria or other related symptoms after 20 weeks of gestation1. PE affects approximately 5% of all pregnancies, with higher prevalences in developing countries1,2. This complex disorder mostly affects first-time mothers and is influenced by several maternal risk factors including obesity, cardiovascular disease, or systemic lupus erythematosus (SLE)3. Approximately 70% of PE develops after 34 weeks of gestation, defined as late-onset PE4.
Although only the delivery of the placenta cures PE, its underlying placental etiology remains incompletely understood. PE involves pre-clinical and clinical stages5. First, poor placentation and/or placental malperfusion causes syncytiotrophoblast stress, leading to the release of pro-inflammatory and anti-angiogenic agents into the maternal circulation3,6. Second, placental stress promotes generalized vascular inflammation and new-onset hypertension and proteinuria or other signs of end-organ dysfunction in the mother5. While early-onset PE is linked to poor placentation and fetal growth restriction, late-onset PE is more related to maternal risk factors.
Prediction of PE before the onset of clinical syndrome could improve maternal and fetal outcomes by enabling timely interventions, reducing complications, and lowering healthcare costs7. Current risk assessment involves maternal history and combinations of biomarkers such as mean arterial blood pressure, uterine artery pulsatility index, and placental growth factor (PlGF)8,9. These are, however, only useful in predicting early-onset PE. To discover potential biomarkers for presymptomatic prediction of late-onset PE, we studied gene expression profiles of maternal peripheral blood samples taken at 20–24 weeks of gestation. We found that several inflammatory and immunological pathways were associated with late-onset PE, weeks before the onset of clinical syndrome. These included the JAK-STAT signaling, leukocyte transendothelial migration, SLE, and graft versus host disease (GVHD) pathways. Moreover, we identified 12 candidate second-trimester biomarkers to predict late-onset PE.
Materials and methods
Sample collection and clinical characteristics
Samples were collected as part of the InTraUterine sampling in early pregnancy (ITU) cohort study at the Helsinki University Hospital, Finland. Data on cohort collection and sampling have been described10. The study protocol was approved by the Coordinating Ethics Committee of the Helsinki and Uusimaa Hospital District (269/13/03/00/09). Register data had been merged with approval from the register authority (THL/887/14.02.00/2021). All participants gave their signed informed consent during the recruitment. All methods were carried out in accordance with relevant guidelines and regulations.
Peripheral blood samples were collected between gestational weeks 20 and 24. A total of 64 samples, including 12 PE and 52 normotensive pregnancies (NP), were selected. All women with PE had late-onset disease (≥ 34 weeks), diagnosed using the most recent criteria11. The following maternal demographic and clinical data were recorded: maternal age at delivery, parity, hypertension, diabetes, pre-pregnancy body mass index (BMI), obesity status (BMI ≥ 30 kg/m2 denoting obesity), gestational age at birth, PE status (PE or NP), sex of the baby, birth weight, and placental weight. Although diabetes and obesity status were recorded, they were not included in the analyses because none of the normotensive women were obese or had diabetes. Clinical characteristics were tabulated using the table1 package (version 1.4.3)12 in R (version 4.2.2)13. Group comparisons were performed using the t-test for normally distributed continuous variables (Shapiro-Wilk test, p > 0.05), the Mann-Whitney U test for non-normally distributed continuous variables (Shapiro-Wilk test, p ≤ 0.05), and the Chi-square test for categorical variables.
STRT library preparation and the sequencing
RNA was extracted from maternal peripheral blood samples, preserved in PAXgene Blood RNA Tubes (QIAGEN), using PAXgene Blood RNA kit (QIAGEN). A modified 5’-end RNA-seq method, STRT, with 8-bp unique molecular identifiers (UMIs)14 was applied by combining GlobinLock technique15. For the sample identification, 6-bp barcode sequences were included in the primers. Two STRT libraries, each having 48 samples and water as non-template controls (NTC), were sequenced with Illumina NextSeq 500 System, High Output (75 cycles, single-end). The sample layout was balanced for PE and NP status between the two libraries. In total, 97 spike-ins from the ERCC RNA Spike-in mix were tested.
Preprocessing and the quality control of the STRT libraries
The STRTN pipeline (https://github.com/gyazgeldi/STRTN, commit e16a6d1)16 was employed for this step. The raw data were aligned to the hg38 reference genome with GENCODE v43 basic annotation (wgEncodeGencodeBasicV43), which was downloaded from the UCSC Genome Browser (https://genome.ucsc.edu/)17; then a count matrix for protein-coding genes and several quality control metrics were obtained. Outlier samples were assessed based on quality metrics, including mapped reads, spike-in reads, spike-in 5’-end rate, mapped rate, mapped/spike-in ratio, and coding 5’-end rate. Samples were identified as outliers when a metric fell outside the interquartile range-based thresholds in boxplots (values below Q1 − 1.5×IQR or above Q3 + 1.5×IQR).
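The interquartile range rule described above can be illustrated with a minimal sketch. It is not the STRTN pipeline's code; the function name iqr_outliers and the mapped-rate values are hypothetical, and the quartile interpolation used here (Python's "inclusive" method) may differ slightly from the one used by R boxplots.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # Quartiles by linear interpolation; an assumption, R boxplots use hinges.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]

# Hypothetical mapped-rate values for eight samples (not study data).
mapped_rate = [0.81, 0.83, 0.80, 0.82, 0.84, 0.79, 0.45, 0.82]
flagged = iqr_outliers(mapped_rate)
print(flagged)
```

In this toy example only the sample with a mapped rate of 0.45 falls below the lower threshold. In practice the rule would be applied to each quality metric separately.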
Reduction of technical bias and noise in the gene expression profiles
Genes with counts below five were filtered out, and library-specific biases between the two STRT libraries were corrected using the library_bias_correction function from the NBGLM-LBC (Negative Binomial Generalized Linear Model – Library Bias Correction) package18 in R (version 4.2.2). Depth files were generated from BAM files using samtools19, followed by processing and classification by library. Thereafter, a feature selection test20 was applied to identify fluctuating (highly variable or informative) genes to reduce technical noise. This test calculated the squared coefficient of variation for each gene and spike-in in the library bias corrected and spike-in normalized count matrix. Technical variation was evaluated by analyzing the fluctuations in spike-in RNA levels across the samples, allowing each gene’s expression changes to be compared with the variation predicted from the spike-in controls. Genes with adjusted p-values < 0.05 were considered fluctuating, indicating a higher gene-to-spike-in ratio in the squared coefficient of variation. These genes were kept, while those with adjusted p-values > 0.05 were filtered out. All spike-ins were kept regardless of fluctuation, as their minor fluctuations reflected technical noise rather than biology.
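As a rough illustration of the quantity behind this test, the squared coefficient of variation of a gene can be compared with that of the spike-in controls. The sketch below is a simplification under stated assumptions: it computes only a ratio against the median spike-in value, whereas the published test models the dependence on expression level and returns adjusted p-values. All names and counts are hypothetical.

```python
import statistics

def cv2(counts):
    """Squared coefficient of variation: sample variance / mean**2."""
    mean = statistics.fmean(counts)
    if mean == 0:
        return float("nan")
    return statistics.variance(counts) / mean ** 2

# Hypothetical normalized counts across five samples (not study data).
spike_ins = {
    "spike_A": [100, 104, 98, 101, 97],
    "spike_B": [50, 52, 49, 51, 48],
}
genes = {
    "gene_stable": [200, 205, 198, 202, 195],
    "gene_variable": [20, 180, 35, 150, 15],
}

# Technical noise level, summarized here as the median spike-in CV^2.
technical_cv2 = statistics.median(cv2(c) for c in spike_ins.values())

# A ratio well above 1 suggests variation beyond technical noise.
ratio = {name: cv2(c) / technical_cv2 for name, c in genes.items()}
print(ratio)
```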
Dimensionality reduction and visualization of gene expression profiles
Principal Component Analysis (PCA) was applied for dimensionality reduction and clustering and Uniform Manifold Approximation and Projection (UMAP) was used for two-dimensional visualization of the library bias corrected, fluctuating genes expression data using the Seurat package (version 5.0.1)21 in R, followed by spike-in normalization with offset and log transformation. PCAtools package (version 2.10.0)22 in R was then used, by removing the lower 20% of variables that explained minimal variance, to extract the most informative features and visualize the Pearson’s correlations between principal components (PCs) and clinical characteristics using eigencorplot and pairplots.
Identification of differentially expressed genes
Differential gene expression (DGE) analysis was performed using the DESeq2 package (version 1.38.3)23. The analysis was based on the library bias corrected count matrix. Genes with fewer than 5 counts across the smallest group size (n = 10) were filtered out prior to analysis. Spike-in normalization was applied using all 88 non-zero expressed spike-ins as control genes in the DESeq2 function estimateSizeFactors(), where a logical vector was provided to specify the spike-ins. The design formula was ~ Condition (PE vs. NP). The Wald test was used, and p-values were adjusted for multiple testing using the Benjamini-Hochberg method (α = 0.05). Log2 fold-change (log2FC) values were then used to show how the genes in the enriched pathways change between PE and NP, using the Pathview package (version 1.38.0)24.
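The effect of the Benjamini-Hochberg adjustment can be seen in a small stand-alone sketch. It shows only the textbook step-up procedure, not DESeq2 itself (which additionally applies independent filtering by default); the function name and the raw p-values are illustrative assumptions.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Illustrative raw p-values (not study data).
raw = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
padj = benjamini_hochberg(raw)
significant = [p <= 0.05 for p in padj]
print(padj, significant)
```

Five of the eight raw p-values are below 0.05, but only two remain so after adjustment, which is the same mechanism by which nominally significant genes can lose significance in a genome-wide test.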
Pathway and cell type enrichment analysis
Gene set enrichment analysis (GSEA)25 was performed using the curated pathway collection CP: KEGG_LEGACY26,27,28,29 to identify significant pathways associated with PE. Using the fgsea package (version 1.24.0)30 in R, genes were ranked based on PC loadings, reflecting their contributions to each principal component (PC). Pathways enriched at the top or bottom of the ranked gene list were identified using normalized enrichment scores (NES) degree, and expression trend. NES values were then calculated to determine the association between gene sets and PCs. Leading genes, which are the most influential in the biological processes and contribute to the observed enrichment signal, were also detected to highlight key genes associated with PE or distinguishing PE samples. For each enriched pathway, spike-in normalized gene expression values of member genes were transformed into Z-score, scaled, and visualized as heatmaps with ComplexHeatmap package (version 2.14.0)31. Additionally, GSEA was performed using only the fluctuating genes to identify enriched cell types in the PE samples compared to NP samples. For this, a custom gene set was prepared using the human blood cells GSE149938 dataset32. This dataset originally contained 32 cell types. However, we focused on circulating blood cells and excluded hematopoietic stem and progenitor cells, including hematopoietic stem cells, granulocyte-monocyte progenitors, B cell and natural killer cell progenitors, multi-lymphoid progenitors, lymphoid-primed multipotent progenitors, megakaryocyte-erythroid progenitors, common myeloid progenitors, and multipotent progenitors. To identify cell-type-specific markers, differentially expressed genes (log2FC > 2.32, padj < 0.05) were detected using Seurat (version 5.0.1). Marker genes were then compiled into a GMT file for GSEA, enabling enrichment analysis of the immune cell types associated with PE.
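To make the ranking-based enrichment concrete, the following sketch computes the classical weighted running-sum enrichment score for a gene list ranked by PC loading. It is a simplified illustration, not the fgsea implementation: it omits the permutation-based normalization that yields NES and p-values, and the gene names, loadings, and gene sets are hypothetical.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Running-sum enrichment score for a ranked gene list.

    ranked_genes: gene names sorted by decreasing score (e.g. PC loading).
    scores: dict mapping gene name to its ranking score.
    Returns the running-sum deviation from zero with the largest magnitude.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_total = sum(abs(scores[g]) ** weight
                    for g, h in zip(ranked_genes, hits) if h)
    if hit_total == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the list without covering it")
    running, best = 0.0, 0.0
    for g, h in zip(ranked_genes, hits):
        if h:
            running += abs(scores[g]) ** weight / hit_total  # step up on a hit
        else:
            running -= 1.0 / n_miss                          # step down on a miss
        if abs(running) > abs(best):
            best = running
    return best

# Hypothetical PC loadings for six genes (not study data).
loadings = {"G1": 0.9, "G2": 0.7, "G3": 0.2, "G4": -0.1, "G5": -0.6, "G6": -0.8}
ranked = sorted(loadings, key=loadings.get, reverse=True)

es_top = enrichment_score(ranked, loadings, {"G1", "G2"})     # positive score
es_bottom = enrichment_score(ranked, loadings, {"G5", "G6"})  # negative score
print(es_top, es_bottom)
```

A gene set concentrated at the top of the ranking gives a positive score and one concentrated at the bottom gives a negative score, which corresponds to the positive and negative enrichment reported per PC.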
Prediction of PE status using logistic regression analysis
Spike-in normalized fluctuated data and sample metadata were combined, resulting in 6,982 variables. Logistic regression models were then fitted to predict PE status for each variable. Key metrics, including Area Under the Curve (AUC), sensitivity, specificity, and accuracy along with their confidence intervals at the fixed 10% false-positive rate, were calculated using pROC (version 1.18.5)33 and ROCR (version 1.0.11)34 packages in R. Confidence intervals (CI) were calculated at the 95% confidence level. Statistical significance of each ROC curve was evaluated by testing whether the AUC was greater than 0.5, with p-values calculated using the Mann–Whitney U test, which is equivalent to the DeLong test. Results were filtered to highlight variables with sensitivity above 0.65, focusing on those with the highest diagnostic potential. The gene expression levels of these highlighted variables were visualized in PE and NP using violin plots with the VlnPlot function from the Seurat package. Differences in gene expression between these two conditions were assessed using the two-sided Wilcoxon test.
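The two key metrics can be sketched without any package. The example below assumes that a higher score means a higher predicted probability of PE; for a single-predictor logistic model the fitted probability is monotone in the predictor, so the AUC can be computed from pairwise comparisons of cases and controls, which is the Mann-Whitney U statistic divided by the number of case-control pairs. The function names and scores are hypothetical, and this is not the pROC or ROCR implementation.

```python
def auc_mann_whitney(case_scores, control_scores):
    """AUC as the fraction of case-control pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for c in case_scores:
        for n in control_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

def sensitivity_at_fpr(case_scores, control_scores, max_fpr=0.10):
    """Highest sensitivity among thresholds with false-positive rate <= max_fpr."""
    best = 0.0
    for t in sorted(set(case_scores) | set(control_scores)):
        # A sample is called positive when its score is >= the threshold.
        fpr = sum(n >= t for n in control_scores) / len(control_scores)
        if fpr <= max_fpr:
            tpr = sum(c >= t for c in case_scores) / len(case_scores)
            best = max(best, tpr)
    return best

# Hypothetical predicted probabilities (not study data).
cases = [0.9, 0.8, 0.7, 0.4]
controls = [0.85, 0.6, 0.5, 0.45, 0.3, 0.3, 0.2, 0.2, 0.1, 0.1]

auc = auc_mann_whitney(cases, controls)
sens = sensitivity_at_fpr(cases, controls)
print(auc, sens)
```

With only a handful of cases, each case changes the sensitivity by a large step (0.25 here), which illustrates why confidence intervals for sensitivity are wide in small samples.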
Results
An overview of the study design is shown in Fig. 1, and demographics of the study population in Table 1.
For this study, we specifically selected late PE cases and uneventful pregnancies, for which we had biological samples available and taken at the same gestational age. While the overall prevalence of gestational hypertension was 7% in the whole ITU cohort10, our cohort included 12 (19%) PE cases among the total of 64 women. Mean maternal ages at delivery, parity, and pre-pregnancy BMIs in early pregnancy were similar between the groups. However, 3 (30%) of PE women were obese while all controls had normal weight. Moreover, PE women delivered at earlier gestational age than the NP group, and as expected, their neonates had lower mean birth weights35. One patient with PE delivered at 35 weeks, while all other deliveries occurred at or after 37 weeks. After the quality control (Supplementary Fig. 1), two PE and 13 NP samples were excluded as outliers, leaving 49 (10 PE and 39 NP) samples from two unbiased libraries (Supplementary Fig. 1) for downstream analyses.
Immune activity and inflammation pathways associated with late-onset PE at the presymptomatic phase
To identify transcriptomic alterations and pathway-level changes associated with late-onset PE at the presymptomatic phase, we started with the initial dataset detecting signals for 19,670 genes, including 97 spike-ins. After filtering steps, 12,670 genes and 91 spike-ins remained in the analysis. After the library bias correction and feature selection, we obtained the final dataset of 6,804 genes, including the spike-ins, for downstream analyses.
Conventional DEG analysis did not reveal differentially expressed genes between PE and NP women (adjusted p-values were ~ 0.9). Using standard DESeq2 settings with Benjamini-Hochberg correction (α = 0.05), 771 genes showed nominal significance (p < 0.05), but none remained significant after multiple testing adjustment. Since no clear signals were detected at the single-gene level, we therefore applied a PCA-GSEA approach to improve sensitivity36. This approach reduces the effect of individual variation by grouping correlated genes into principal components and then testing for enrichment at the pathway level, rather than focusing on single-genes differences as done in DEG. Using this approach, we identified significant associations between clinical characteristics and four PCs that strongly correlated with PE (Fig. 2a). Of them, PC1 explained 51.53% of the total variation and was negatively correlated with PE (Pearson’s correlation coefficient: −0.25). PC6, PC8, and PC18 explained 0.95 to 1.95% of the variation and were positively correlated with PE (Pearson’s correlation coefficients ≥ 0.25) (Fig. 2a). PE and NP samples were only partially separated by the PCs: PE samples were in the negative area of PC1 and positive area of other PCs (Fig. 2b).
Transcriptomic alterations in second-trimester maternal blood samples from women with late-onset preeclampsia during the presymptomatic phase. (a) PCA-correlation plot demonstrates that PC1, PC6, PC8, and PC18 are correlated (Pearson’s r ≥ 0.25) with preeclampsia (PE). Percentages of variation that each component represents are shown at the bottom line. (b) Biplot shows that PC1, PC6, PC8, and PC18 only partially distinguish PE and normotensive (NP) samples. PE samples are shown in red and NP in blue.
Using the KEGG Legacy dataset, we found that the genes highly contributing to the PE-correlated PC1, PC6, and PC18 were significantly enriched in several pathways (Table 2), whereas no pathways were enriched for PC8. For PC1, the leukocyte transendothelial migration pathway, JAK-STAT signaling pathway, and Vibrio cholerae infection pathway were significantly negatively enriched (Table 2). As these pathways contributed negatively to PC1, and PC1 was negatively correlated with PE (Fig. 2a), these pathway genes were upregulated in PE (Fig. 3a-b, Supplementary Figs. 2–4). As for PC6, being positively correlated with PE, the antigen processing and presentation pathway, and asthma pathway were negatively enriched (Table 2), indicating that their member genes were downregulated in PE (Supplementary Figs. 5 and 6). As for PC18, being also positively correlated with PE, the SLE pathway, GVHD and antigen processing and presentation pathways were positively enriched (Table 2), indicating that their member genes were upregulated in PE (Fig. 3c-d, Supplementary Figs. 7–9). Interestingly, the antigen processing and presentation pathway was negatively enriched in PC6 and positively enriched in PC18, reflecting variations captured by each PC and the influence of other pathway-specific genes, detailed in Fig. 3 and Supplementary Figs. 5 and 7.
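The direction of change inferred for each pathway follows from combining two signs: the sign of the correlation between the PC and PE, and the sign of the pathway enrichment on that PC. A minimal sketch of this reasoning is given below; the function name is hypothetical, and the numeric values are placeholders that carry only the signs reported above, not the actual correlations or NES values.

```python
def direction_in_pe(pc_correlation_with_pe, nes):
    """Infer pathway direction in PE from the product of the two signs."""
    product = pc_correlation_with_pe * nes
    if product > 0:
        return "up in PE"
    if product < 0:
        return "down in PE"
    return "undetermined"

# Placeholder magnitudes; only the signs follow the text.
pc1 = direction_in_pe(-0.25, -1.0)   # negative correlation, negative enrichment
pc6 = direction_in_pe(0.25, -1.0)    # positive correlation, negative enrichment
pc18 = direction_in_pe(0.25, 1.0)    # positive correlation, positive enrichment
print(pc1, pc6, pc18)
```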
Heatmaps showing differential second-trimester pathway activation in women with late-onset preeclampsia and normotensive women. Shown are the major upregulated pathways in preeclampsia (PE) related to (a) leukocyte transendothelial migration, (b) JAK-STAT signaling, (c) systemic lupus erythematosus (SLE) and (d) graft versus host disease pathways. Columns represent samples (ordered by rotated values in each principal component, PC), and rows represent pathway genes (ordered by loading values in each PC), with the left indicating leading genes. Red color indicates upregulation, and blue indicates downregulation, normalized across preeclampsia (red) and normotensive pregnancies (blue).
High-AUC biomarkers for late-onset PE prediction suggesting intermediate monocytes enrichment in maternal second-trimester blood RNA profiles
After observing pathway alterations associated with late-onset PE in the second-trimester maternal blood samples, we searched for potential marker genes for PE prediction. We performed logistic regression analysis followed by ROC analysis. We found that several genes distinguished PE and NP with AUC values of 0.74 to 0.87 (Table 3), as visualized in Fig. 4a and Supplementary Fig. 10. Their expression levels in PE and NP samples are shown in Fig. 4b and Supplementary Fig. 10. As shown in these figures, the identified biomarkers exhibited higher expression in NP compared with PE samples. Despite the high AUC and sensitivity values, the small sample size resulted in wide confidence intervals for sensitivity estimates (Table 3). Moreover, the genes identified through the ROC analysis, with a sensitivity threshold set at 0.65, were not directly involved in the pathways detected by the PCA analysis (Table 3).
Performances of three candidate genes for late-onset preeclampsia prediction. (a) ROC curves show the performance of selected biomarkers in distinguishing preeclampsia (PE) and normotensive pregnancies (NP) with false positive rate (FPR) on the x-axis and true positive rate (TPR) on the y-axis. The area under curve (AUC) values and corresponding p-values (testing whether AUC > 0.5) provide a measure of the diagnostic power of each biomarker. (b) Violin plots showing variation, with p-value from the two-sided Wilcoxon test shown above, in gene expression of the same biomarkers between PE and NP. Horizontal lines indicate median expression.
Since the candidate markers (Table 3) were not part of known pathways, we further tested whether they were associated with different blood cell populations. Specifically, we analyzed a total of 24 circulating blood cell types, ranking biomarkers separately based on AUC to assess their ability to distinguish PE from NP, and on sensitivity to assess their ability to detect PE. We found that the sensitivity-based ranking did not identify any significantly enriched cell types associated with PE samples. In contrast, the AUC-based ranking identified significant enrichment in human blood cell marker genes for erythrocytes, progenitor neutrophils, myelocytes, intermediate monocytes, metamyelocytes, mature neutrophils, and regulatory B cells (Table 4). Among them, intermediate monocytes had a positive NES, indicating that these cell-type marker genes were most effective in distinguishing PE samples. Conversely, other cell types had negative NES values, suggesting a lower ability to distinguish PE from NP.
Discussion
Tools for identifying late-onset PE at the presymptomatic phase, before the onset of maternal syndrome, are currently lacking. To address this challenge, we utilized maternal second-trimester blood samples to explore early alterations in RNA expression profiles before late-onset PE. We found that subtle changes in RNA expression profiles arise weeks before the onset of the clinical syndrome. Notably, PE-associated changes were related to leukocytic immune surveillance and maternal immunity homeostasis. Based on these observations, we found 12 potential candidate genes for PE prediction.
In our study, second-trimester maternal RNA expression profiles revealed alterations in several pathways among women with subsequent development of late-onset PE. These included upregulation of genes of the leukocyte transendothelial migration pathway that regulates immune surveillance through the movement of immune cells across the endothelium37. This pathway has previously been linked to immune activation during pregnancy, and additionally, to infection-related and immunological PE subtypes in placental transcriptome36. Upregulation of the leukocyte transendothelial migration pathway is in agreement with prior findings that indicate maternal endothelial and leukocyte activation in PE38. In support of the biological role of this pathway in pregnancy, leukocyte migration genes are even candidate predictors of gestational age in the peripheral blood of normal pregnancies39. Similarly, genes in the JAK/STAT pathway, a critical mediator of cytokine and growth factor signaling, were upregulated in late-onset PE, further supporting alterations in immune responses or homeostasis40. Thus, there are two possible interpretations: one involves the activation of maternal inflammatory responses already at the presymptomatic phase of PE, and the other suggests primary maternal immune dysregulation in pregnancies predisposed to subsequent development of PE. In support of the role of primary immune aberration, genes in the SLE pathway were upregulated among women with subsequent PE, even though none of the participants had SLE. Among other connective tissue diseases, SLE is a chronic inflammatory disease with altered immune homeostasis41, and a considerable risk factor for PE42. Interestingly, one of the leading genes of the SLE pathway, cathepsin G (CTSG), shows upregulation in normal pregnancies versus non-pregnant controls43 of first-trimester maternal plasma, and contributes to the processing of antigens and autoantigens44, suggesting a primary role during human pregnancy. As a potential sign of immune rejection, genes in the GVHD pathway were among those upregulated in PE, supporting the role of enhanced maternal immune responses in the PE pathogenesis45.
The candidate biomarkers that we identified have not been previously reported in association with PE. Among the top genes, some are directly linked to PE or placental processes. PI4KB (Phosphatidylinositol 4-Kinase Beta) is a lipid metabolism enzyme involved in vascular morphogenesis through the control of fibronectin secretion in endothelial cells46. As vascular remodeling is essential for placentation, and maternal endothelial dysfunction is a hallmark of PE47, its upregulation in late-onset PE may reflect early signs of endothelial dysfunction. KICS2 (KICSTOR subunit 2) regulates autophagy via mTORC1 signaling48. Autophagy is an important process for placentation, especially for trophoblast invasion49, and impaired placental autophagy has been observed in PE50. Autophagy also contributes to monocyte-macrophage differentiation51, suggesting that the PE-related changes in monocytes populations observed previously52,53,54 and in our study might be at least partially related to these alterations. HIPK3 (particularly its circular isoform circHIPK3) is also to be noted, as it is downregulated in preeclamptic placentas, with functional studies showing reduced trophoblast proliferation, migration and invasion55,56.
Among the other identified genes, many link to immune or blood cell functions. PPM1G is a phosphatase that regulates multiple cellular processes, including mRNA splicing, DNA damage response, and cell cycle progression. Its elevated expression has been associated with poor prognosis in hepatocellular carcinoma and shown to correlate with immune cell infiltration and NF-κB–mediated inflammatory signaling57. RAB11B, a Ras-like small GTPase, is known to regulate intracellular trafficking processes and immune receptor signaling58. It has also been linked to autophagy regulation and macrophage polarization, thereby influencing inflammatory responses59. FAM117A, also known as C/EBP induced protein, contributes to B-cell to myeloid differentiation, macrophages, and granulocytes in blood cell development60 and NK cell metabolism in cancer61. APOL6 has been associated with ferroptosis-related pathways and increased infiltration of immune cells such as CD8 + T cells, macrophages, and dendritic cells in cancer62, and exosomal regulation of APOL6 was shown to affect endometrial stromal cell survival63. TRIM68 is an E3 ubiquitin ligase involved in immune regulation and apoptosis in cancer64,65. In osteosarcoma, its expression correlated with increased memory CD4 T cells and Tregs, and decreased mast cells66, suggesting a role in adaptive immune responses. Although direct evidence from blood cells or immune cells is limited, these findings support a potential role for immune regulation.
Other identified genes have been studied mainly in placental or general cellular processes. SFI1 is a centriolar protein essential for centriole stability and ciliogenesis67. ARSG encodes a lysosomal sulfatase linked to metabolic stress and proinsulin levels, a metabolic trait associated with obesity68 and a known risk factor for PE. PITRM1, a mitochondrial metallopeptidase, is expressed in human placenta69 and has been reported to be abnormally upregulated in mouse nuclear transfer placentas70. USO1 is a vesicle transport factor that mediates ER-to-Golgi trafficking and is part of the phosphatidylinositol 3-kinase (PtdIns3K) complex, essential for autophagy and autophagosome assembly in intestinal epithelial cells71. While these genes are involved in placental or cellular pathways, their specific relevance in maternal blood and late-onset PE remains to be clarified.
While we investigated the cellular mRNAs in peripheral blood, many previous studies used cell-free molecules, not only RNAs but also DNAs7. Those approaches measure cell content primarily leaked from damaged cells. For example, Farina et al.72 could predict (both early- and late-onset) PE using cell-free mRNAs of Fms Related Receptor Tyrosine Kinase 1 (FLT1) at 10–14 weeks of gestation; the detection rate at a 5% false positive rate was 72.3%. Because FLT1 gene is highly expressed in placental trophoblasts and endothelial cells throughout the body73, and secreted in soluble forms, its high circulating levels make it an ideal marker for disease detection when the tissue breakdown occurs. In contrast, our approach measures cellular mRNAs originating primarily from intact cells of the peripheral blood. Therefore, the placental status cannot be assessed by our method but instead, our study provides data on maternal immune status and dysregulation based on leukocyte-derived mRNA alterations.
The main limitation of our study was the moderate number of samples. A larger series would have been ideal due to the heterogeneous nature of late-onset PE6, and the complexity of PE pathogenesis1. Furthermore, we had neither data on PE subtypes nor the severity of PE. These limitations may also explain why conventional DEG analysis failed to identify significant single-gene changes, partly due to the subtle variation observed at the presymptomatic phase of late-onset PE. Therefore, pathway-level approaches such as PCA-GSEA are more suitable in such cases. Genes that interact with each other for a function tend to co-regulate, and PCA groups co-regulated genes into principal components, so that genes in the same pathway could be enriched by PCA. Unlike DEG, Broad GSEA evaluates all genes, including those without significant changes at the single-gene level. Therefore, if the pathway-level alteration is consistent across patients with the same disease, the combination of PCA and GSEA can detect it. The present study detected several pathways as early signals of late-onset PE using this approach, and the robustness must be validated in other cohorts. However, our study utilized several variables that affected the expression profiles, including maternal pre-pregnancy BMI (PC1 of Fig. 2), fetal gender (PC2), or gestational diabetes (PC3), suggesting that future studies should not only include larger sample sizes but also account for these characteristics. Unfortunately, we had no longitudinal samples or samples at the diagnosis, limiting the conclusions of this study. As a strength of the study, however, the wide coverage of protein-coding genes allowed us to search for potential biomarkers for disease prediction.
Collectively, these results provide novel insights into the complex pathogenesis of PE and its prediction at the presymptomatic phase. Our study suggests that maternal immune dysregulation is an early sign of the disease and is measurable by maternal blood RNA expression profiles. Future studies are needed to characterize and stratify these findings in larger series and clinical settings.
Data availability
The datasets generated and/or analyzed during the current study are not publicly available due to restrictions imposed by the Finnish Data Protection Act and EU GDPR legislation. Data are stored at HUS Academic closed access computing environment. An interested researcher can obtain a de-identified dataset after having obtained an approval from the ITU study board and data controller. Data requests may be subject to further review by the national register authority and by the ethical committees. Any requests for data use should be addressed to the corresponding author.
References
Dimitriadis, E. et al. Pre-eclampsia. Nat. Reviews Disease Primers. 9, 1–22 (2023).
Abalos, E., Cuesta, C., Grosso, A. L., Chou, D. & Say, L. Global and regional estimates of preeclampsia and eclampsia: a systematic review. Eur. J. Obstet. Gynecol. Reproductive Biology. 170, 1–7 (2013).
Steegers, E. A., von Dadelszen, P., Duvekot, J. J. & Pijnenborg, R. Pre-eclampsia. Lancet (British Edition). 376, 631–644 (2010).
Robillard, P. Y. et al. Validation of the 34-week gestation as definition of late onset preeclampsia: testing different cutoffs from 30 to 37 weeks on a population-based cohort of 1700 preeclamptics. Acta Obstet. Gynecol. Scand. 99, 1181–1190 (2020).
Staff, A. C. The two-stage placental model of preeclampsia: an update. J. Reprod. Immunol. 134–135, 1–10 (2019).
Tranquilli, A. L., Brown, M. A., Zeeman, G. G., Dekker, G. & Sibai, B. M. The definition of severe and early-onset preeclampsia. Statements from the international society for the study of hypertension in pregnancy (ISSHP). Pregnancy Hypertension: Int. J. Women’s Cardiovasc. Health. 3, 44–47 (2013).
Ng, K. W., Chaturvedi, N., Coté, G. L., Fisher, S. A. & Mabbott, S. Biomarkers and point of care screening approaches for the management of preeclampsia. Commun. Med. 4, 208 (2024).
Tan, M. Y. et al. Screening for pre-eclampsia by maternal factors and biomarkers at 11–13 weeks’ gestation. Ultrasound Obstet. Gynecol. 52, 186–195 (2018).
Litwinska, M., Syngelaki, A., Wright, A., Wright, D. & Nicolaides, K. H. Management of pregnancies after combined screening for pre-eclampsia at 19–24 weeks’ gestation. Ultrasound Obstet. Gynecol. 52, 365–372 (2018).
Kvist, T. et al. Cohort profile: intrauterine sampling in early pregnancy (ITU), a prospective pregnancy cohort study in Finland: study design and baseline characteristics. BMJ Open 12, e049231 (2022).
Magee, L. A. et al. The 2021 international society for the study of hypertension in pregnancy classification, diagnosis & management recommendations for international practice. Pregnancy Hypertens. 27, 148–169 (2022).
Using the Table 1 Package to Create HTML Tables of Descriptive Statistics. https://benjaminrich.github.io/table1/vignettes/table1-examples.html
R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2022).
Ezer, S. et al. Generation of RNA sequencing libraries for transcriptome analysis of globin-rich tissues of the domestic dog. STAR Protoc. 2, 100995 (2021).
Krjutškov, K. et al. Globin mRNA reduction for whole-blood transcriptome sequencing. Sci. Rep. 6, 31584 (2016).
Boskovic, N. et al. Optimized single-cell RNA sequencing protocol to study early genome activation in mammalian preimplantation development. STAR Protoc. 4, 102357 (2023).
Frankish, A. et al. GENCODE 2021. Nucleic Acids Res. 49, D916–D923 (2021).
Katayama, S. et al. Guide for library design and bias correction for large-scale transcriptome studies using highly multiplexed RNAseq methods. BMC Bioinform. 20, 418 (2019).
Danecek, P. et al. Twelve years of SAMtools and BCFtools. GigaScience 10, giab008 (2021).
Krjutškov, K. et al. Single-cell transcriptome analysis of endometrial tissue. Hum. Reprod. 31, 844–853 (2016).
Hao, Y. et al. Dictionary learning for integrative, multimodal and scalable single-cell analysis. Nat. Biotechnol. 42, 293–304 (2024).
Blighe, K. kevinblighe/PCAtools. (2024).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Luo, W. & Brouwer, C. Pathview: an R/Bioconductor package for pathway-based data integration and visualization. Bioinformatics 29, 1830–1831 (2013).
Subramanian, A. et al. Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl. Acad. Sci. 102, 15545–15550 (2005).
Liberzon, A. et al. The molecular signatures database hallmark gene set collection. Cell Syst. 1, 417–425 (2015).
Liberzon, A. et al. Molecular signatures database (MSigDB) 3.0. Bioinformatics 27, 1739–1740 (2011).
Kanehisa, M., Furumichi, M., Sato, Y., Matsuura, Y. & Ishiguro-Watanabe, M. KEGG: biological systems database as a model of the real world. Nucleic Acids Res. 53, D672–D677 (2025).
Kanehisa, M. & Goto, S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28, 27–30 (2000).
Korotkevich, G. et al. Fast gene set enrichment analysis. bioRxiv 060012 https://doi.org/10.1101/060012 (2021).
Gu, Z. Complex heatmap visualization. iMeta 1, e43 (2022).
Xie, X. et al. Single-cell transcriptomic landscape of human blood cells. Natl. Sci. Rev. 8, nwaa180 (2021).
Robin, X. et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinform. 12, 77 (2011).
Sing, T., Sander, O., Beerenwinkel, N. & Lengauer, T. ROCR: visualizing classifier performance in R. Bioinformatics 21, 3940–3941 (2005).
Hocquette, A. et al. Association between risk of infant death and birth-weight z scores according to gestational age: A nationwide study using the Finnish medical birth register. Int. J. Gynecol. Obstet. 167, 1138–1145 (2024).
Katayama, S. et al. Gene expression analysis of skin grafts and cultured keratinocytes using synthetic RNA normalization reveals insights into differentiation and growth control. BMC Genom. 16, 1–14 (2015).
Bosinger, S. E. et al. Molecular control of leukocyte trafficking internal regulatory circuits of the immune system: leukocyte circulation and homing. In NeuroImmune Biology, eds. Bertók, L. & Chow, D. A., vol. 5, 185–214 (Elsevier, 2005).
Chaiworapongsa, T. et al. Soluble adhesion molecule profile in normal pregnancy and pre-eclampsia. J. Maternal-Fetal Neonatal Med. 12, 19–27 (2002).
Tarca, A. L. et al. Crowdsourcing assessment of maternal blood multi-omics for predicting gestational age and preterm birth. Cell Rep. Med. 2, 100323 (2021).
Fortelny, N. et al. JAK-STAT signaling maintains homeostasis in T cells and macrophages. Nat. Immunol. 25, 847–859 (2024).
Pazhouhandeh, M. & Yu, D. Interferon disrupts immune and tissue homeostasis in SLE via CXCL13. Nat. Rev. Rheumatol. 20, 745–746 (2024).
Clowse, M. E. B., Jamison, M., Myers, E. & James, A. H. A national study of the complications of lupus in pregnancy. Am. J. Obstet. Gynecol. 199, 127.e1–127.e6 (2008).
Manders, V. et al. The bivariate NRIP1/ZEB2 RNA marker permits non-invasive presymptomatic screening of pre-eclampsia. Sci. Rep. 10, 21857 (2020).
Burster, T. et al. Cathepsin G: roles in antigen presentation and beyond. Mol. Immunol. 47, 658–665 (2010).
Gleicher, N. Why much of the pathophysiology of preeclampsia-eclampsia must be of an autoimmune nature. Am. J. Obstet. Gynecol. 196, 5.e1–5.e7 (2007).
Mana, G. et al. PPFIA1 drives active α5β1 integrin recycling and controls fibronectin fibrillogenesis and vascular morphogenesis. Nat. Commun. 7, 13546 (2016).
Goulopoulou, S. Maternal vascular physiology in preeclampsia. Hypertension 70, 1066–1073 (2017).
Wolfson, R. L. et al. KICSTOR recruits GATOR1 to the lysosome and is necessary for nutrients to regulate mTORC1. Nature 543, 438–442 (2017).
Nakashima, A. et al. Role of autophagy in oocytogenesis, embryogenesis, implantation, and pathophysiology of pre-eclampsia. J. Obstet. Gynaecol. Res. 43, 633–643 (2017).
Nakashima, A. et al. Evidence for lysosomal biogenesis proteome defect and impaired autophagy in preeclampsia. Autophagy 16, 1771–1785 (2020).
Zhang, Y., Morgan, M. J., Chen, K., Choksi, S. & Liu, Z. Induction of autophagy is essential for monocyte-macrophage differentiation. Blood 119, 2895–2905 (2012).
Alahakoon, T. I. et al. Distribution of monocyte subsets and polarization in preeclampsia and intrauterine fetal growth restriction. J. Obstet. Gynaecol. Res. 44, 2135–2148 (2018).
Faas, M. M., Spaans, F. & De Vos, P. Monocytes and macrophages in pregnancy and pre-eclampsia. Front. Immunol. 5, 298 (2014).
Tang, M. X., Zhang, Y. H., Hu, L., Kwak-Kim, J. & Liao, A. H. CD14++CD16+HLA-DR+ monocytes in peripheral blood are quantitatively correlated with the severity of pre-eclampsia. Am. J. Reprod. Immunol. 74, 116–122 (2015).
Wang, W., Liu, J. & Pan, E. CircHIPK3 contributes to human villous trophoblast growth, migration and invasion via modulating the pathway of miR-346/KCMF1. Placenta 118, 46–54 (2022).
Zhang, Y. et al. CircHIPK3 is decreased in preeclampsia and affects migration, invasion, proliferation, and tube formation of human trophoblast cells. Placenta 85, 1–8 (2019).
Zhang, X. et al. PPM1G and its diagnostic, prognostic and therapeutic potential in HCC (Review). Int. J. Oncol. 65, 109 (2024).
Stenmark, H. Rab GTPases as coordinators of vesicle traffic. Nat. Rev. Mol. Cell Biol. 10, 513–525 (2009).
Zhao, Y. et al. Rab11b promotes M1-like macrophage polarization by restraining autophagic degradation of NLRP3 in alcohol-associated liver disease. Acta Pharmacol. Sin. 46, 134–146 (2025).
Cirovic, B. et al. C/EBP-induced transdifferentiation reveals granulocyte-macrophage precursor-like plasticity of B cells. Stem Cell Rep. 8, 346–359 (2017).
Portale, F. et al. C/EBPβ-dependent autophagy inhibition hinders NK cell function in cancer. Nat. Commun. 15, 10343 (2024).
Fan, Z. et al. APOL6 predicts immunotherapy efficacy of bladder cancer by ferroptosis. BMC Cancer 24, 1046 (2024).
Shi, Q., Wang, D., Ding, X., Yang, X. & Zhang, Y. Exosome-shuttled miR-7162-3p from human umbilical cord derived mesenchymal stem cells repair endometrial stromal cell injury by restricting APOL6. Arch. Biochem. Biophys. 707, 108887 (2021).
Hatakeyama, S. TRIM proteins and cancer. Nat. Rev. Cancer 11, 792–804 (2011).
Xu, X. et al. Tripartite motif-containing 68-stabilized modulator of apoptosis-1 retards the proliferation and metastasis of lung cancer. Biochem. Biophys. Res. Commun. 648, 11–20 (2023).
Jiang, J. et al. TRIM68, PIKFYVE, and DYNLL2: the possible novel autophagy- and immunity-associated gene biomarkers for osteosarcoma prognosis. Front. Oncol. 11, 643104 (2021).
Laporte, M. H. et al. Human SFI1 and centrin form a complex critical for centriole architecture and ciliogenesis. EMBO J. 41, e112107 (2022).
Broadaway, K. A. et al. Loci for insulin processing and secretion provide insight into type 2 diabetes risk. Am. J. Hum. Genet. 110, 284–299 (2023).
Mzhavia, N. et al. Cloning, expression, and characterization of human metalloprotease 1: a novel member of the pitrilysin family of metalloendoproteases. DNA Cell Biol. 18, 369–380 (1999).
Suemizu, H. et al. Expression profiling of placentomegaly associated with nuclear transplantation of mouse ES cells. Dev. Biol. 253, 36–53 (2003).
Foerster, E. G. et al. How autophagy controls the intestinal epithelial barrier. Autophagy 18, 86–103 (2022).
Farina, A. et al. Performance of messenger RNAs circulating in maternal blood in the prediction of preeclampsia at 10–14 weeks. Am. J. Obstet. Gynecol. 203, 575.e1–575.e7 (2010).
The Human Protein Atlas. FLT1 protein expression summary. https://www.proteinatlas.org/ENSG00000102755-FLT1/single+cell#single_cell_type_summary (2025).
Acknowledgements
This work has been supported by the Sigrid Jusélius Foundation, the Föreningen Liv och Hälsa Association, the Jane and Aatos Erkko Foundation, and the Academy of Finland. The authors wish to acknowledge CSC – IT Center for Science, Finland, for computational resources, and thank Auli Saarinen for skilled technical assistance. Open access funded by Helsinki University Library.
Author information
Contributions
G.Y.G. analyzed the data and drafted the manuscript; S.E. performed the experiments; J.W. provided funding for RNA sequencing; K.R. managed and provided funding for the collection of the ITU study; J.K. formulated the hypothesis, designed and supervised the study, and reviewed and edited the manuscript; S.W. formulated the hypothesis, designed and supervised the study, and reviewed and edited the manuscript; S.K. formulated the hypothesis, designed the data analysis methods, supervised the study, and reviewed and edited the manuscript. All authors read and approved the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Gunaydin, G.Y., Ezer, S., Wedenoja, J. et al. Signs of immune dysregulation in second-trimester maternal blood RNA profiles in late-onset preeclampsia. Sci Rep 15, 42233 (2025). https://doi.org/10.1038/s41598-025-26323-3
DOI: https://doi.org/10.1038/s41598-025-26323-3






