Introduction

The great obstetrical syndromes (GOS) encompass a spectrum of pregnancy-related complications associated with defective deep placentation, including hypertensive pregnancies, preterm birth, and fetal growth restriction (FGR)1,2,3. Globally, these conditions remain the leading causes of maternal and perinatal morbidity and mortality, and the early prediction and prevention of GOS is still a significant challenge in obstetrics4,5,6.

Progress may have been hindered by a failure to recognize the common placental origin of, and the high degree of interconnectivity among, the different GOS conditions. Historically, these conditions have been regarded as distinct entities, and substantial effort has been devoted to predicting and preventing each complication separately. However, focusing solely on a single condition may overlook potential commonalities in pathogenesis. Shared etiologies in the placenta among different GOS conditions have been acknowledged, and the co-occurrence of these complications is common1,2,3. A high proportion of fetuses from pregnancies complicated by preterm preeclampsia (PE) are small for gestational age (SGA) or have FGR; similarly, early-onset FGR and preterm SGA often occur alongside PE7,8,9. The evidence that being born preterm or SGA is associated with a higher risk of hypertensive disorders of pregnancy (HDP) further supports a possible overlap in the underlying biology of these GOS conditions10. When screening tests constructed with PE as the outcome of interest are applied, women identified as being at high risk of developing PE are also at increased risk of gestational hypertension (GH), preterm birth, SGA and FGR11,12,13. These findings suggest that overlapping molecular pathways underlie these conditions before clinical presentation, which may be key to the early prediction and prevention of GOS.

Although each GOS condition has been attributed to various pathological mechanisms, our understanding of the molecular commonalities in early pregnancy remains incomplete. The abnormality in the two well-known circulating angiogenic/antiangiogenic proteins, placental growth factor (PlGF) and soluble fms-like tyrosine kinase 1 (sFlt-1), has been observed in multiple GOS conditions. However, longitudinal analysis revealed that differences in the PlGF/sFlt-1 profiles did not emerge until the end of, or even after, early pregnancy. Identifying early biomarkers remains an unmet need, and high-throughput omics technologies could help unveil new biological insights. To our knowledge, multi-omics profiles across different GOS phenotypes using serum samples have not been explored, particularly in early pregnancy.

In the present study, we systematically searched for maternal serum biomarkers that may reflect the common pathophysiological processes of GOS in early pregnancy using a multi-omics approach that combines protein, metabolite, and gene analysis (Fig. 1). We sought to: (1) characterize the molecular changes prior to the onset of different GOS conditions [GH, PE, preterm prelabor rupture of membranes (PPROM), spontaneous preterm labor (sPT), SGA and FGR] and identify shared disease mechanisms from a molecular perspective; (2) describe the temporal patterns in molecular signatures and examine the timing at which deviations from normative expression patterns appeared in the pre-symptomatic phase. Taken together, this study demonstrates the shared molecular changes among different GOS conditions in early pregnancy and provides valuable insights into the possibility of a common screening tool and broad-spectrum prophylactics for GOS.

Fig. 1: Graphical overview of study design and the concept.

GOS great obstetrical syndromes.

Methods

Study population

This case-control study was nested within the Shanghai Birth Cohort (SBC), which prospectively enrolled pregnant women from six participating hospitals in Shanghai, China, from 2013 to 2016. Details of the cohort have been described previously14. Briefly, women were recruited during their first prenatal care visit in early pregnancy. Maternal characteristics, pregnancy history and medical history were recorded through questionnaires. Fasting blood samples were obtained during prenatal care booking at 7 – 19 weeks of gestational age and stored at −80°C for subsequent analysis. Information on obstetrical disorders in the current pregnancy was extracted from medical records. The study was approved by the research ethics committees of Xinhua Hospital in Shanghai (ref # M2013010). All participants signed informed consent.

Nulliparous participants with singleton pregnancies and available serum samples during early pregnancy (before 20 weeks of gestation) were included in our study. Only serum samples that had not been thawed previously were used. Pregnancies that subsequently developed GH, PE, PPROM, sPT, SGA, and/or FGR formed the GOS group. Women with uncomplicated pregnancies and no previous medical history formed the control group. GH was defined as new-onset hypertension (systolic blood pressure ≥ 140 mm Hg and/or diastolic blood pressure ≥ 90 mm Hg on ≥2 occasions) after 20 weeks of gestation15. PE was defined as GH plus proteinuria (≥300 mg/day or ≥1+ on urine dipstick analysis)15. Preterm birth was defined as a delivery occurring between 21 and 36 completed weeks’ gestation, and we further divided spontaneous preterm birth into PPROM and sPT16. SGA and FGR were defined as birthweights below the 10th and 3rd percentiles, respectively, based on an adjustable global reference for fetal-weight percentiles17.
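The outcome definitions above can be written as simple rules. The following is a minimal sketch only: the record fields are hypothetical names chosen for illustration, the count of hypertensive readings is assumed to refer to new-onset readings after 20 weeks, and GH and PE are treated as mutually exclusive labels, as in the study groups.

```python
from dataclasses import dataclass

@dataclass
class Pregnancy:
    # Hypothetical field names; they only mirror the quantities named in the text.
    hypertensive_occasions_after_20w: int  # occasions with SBP >= 140 and/or DBP >= 90 mm Hg
    proteinuria_mg_per_day: float
    dipstick_plus: int                     # 0 = negative, 1 = "1+", 2 = "2+", ...
    birthweight_percentile: float

def hypertensive_label(p):
    """Return "PE", "GH" or None following the definitions in the text."""
    if p.hypertensive_occasions_after_20w < 2:
        return None
    proteinuria = p.proteinuria_mg_per_day >= 300 or p.dipstick_plus >= 1
    return "PE" if proteinuria else "GH"

def growth_labels(p):
    """SGA: birthweight below the 10th percentile; FGR: below the 3rd percentile."""
    labels = set()
    if p.birthweight_percentile < 10:
        labels.add("SGA")
    if p.birthweight_percentile < 3:
        labels.add("FGR")
    return labels
```

Under these rules a birthweight below the 3rd percentile carries both labels, which is consistent with SGA and FGR not being mutually exclusive in the study.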

Untargeted proteomics

Serum samples were centrifuged, and the supernatants were collected. High-abundance proteins in serum were depleted using Pierce™ Top 14 Abundant Protein Depletion Spin Columns (Thermo Fisher Scientific). Protein concentration was determined by BCA assay. Samples were reduced and alkylated prior to digestion with trypsin. The tryptic peptides were analyzed with an EASY-nLC 1200 ultra-performance liquid chromatography system (Thermo Fisher Scientific) coupled to an Orbitrap Exploris™ 480 mass spectrometer (Thermo Fisher Scientific). Buffers A and B were composed of 2% and 90% acetonitrile, respectively, each containing 0.1% formic acid. Peptides were separated over a 73 min gradient at a 500 nL/min flow rate. The separated peptides were injected into a nanospray ionization source and then analyzed by tandem mass spectrometry (MS/MS) in data-dependent acquisition mode. The m/z range of MS1 was 400–1200, with a resolution of 60,000. The top 15 precursors were selected for the MS/MS experiment, performed with a resolution of 30,000 (at 100 m/z) and higher-energy collisional dissociation at a normalized collision energy of 27%. The resultant mass spectrometric data were processed in Proteome Discoverer (Version 2.4.1.15, Thermo Fisher Scientific) against the human UniProt database (July 2021; 78120 sequences) concatenated with the reverse decoy database. Precursor and product ion mass tolerances were set to 10 ppm and 0.02 Da, respectively. The enzyme was set to trypsin with up to 2 missed cleavages. Protein and peptide identifications were filtered at a false discovery rate (FDR) < 1%. A total of 1930 proteins were identified in the serum samples.

Untargeted metabolomics

Serum samples (100 μL each) were treated with 400 μL of methanol containing internal standards and then centrifuged to collect the supernatant for freeze-drying. The freeze-dried sample was redissolved with 50 μL acetonitrile/MilliQ water (1/3) for liquid chromatography-mass spectrometry (LC-MS) analysis. To monitor the robustness of metabolomics analysis, quality control (QC) samples were generated by mixing equal amounts of each sample and analyzed after every ten sample runs. LC-MS analysis was performed using a Vanquish UHPLC system coupled to a Q Exactive™ Quadrupole-Orbitrap Mass Spectrometer (Thermo Fisher Scientific). An ACQUITY UPLC C8 column and an ACQUITY UPLC HSS T3 column (Waters) were used for chromatographic separation in positive and negative ion scan modes, respectively. MS conditions were set as follows: sheath gas flow rate was 45 arb, aux gas flow rate was 10 arb, spray voltage was 3.5 kV for positive ion scan mode and 3.0 kV for negative ion scan mode, capillary temperature was 320 °C, aux gas heater temperature was 350 °C, and resolution was 140,000. The scan range was 70–1,050 m/z. The identification of metabolites was based on the exact mass, MS/MS fragments, and retention time. We also referred to an in-house database, OSI-SMMS18. To improve the coverage of metabolite annotation, previous work from You et al.19 was referred to, and the MassBank of North America (MoNA) database was used20. Peak areas of identified metabolites were extracted using TraceFinder™ software (version 3.2.512.0). A total of 471 metabolites were identified in the serum samples.

Genotyping and SNP imputation

DNA was isolated from maternal blood using the TGuide Large Volume Blood Genomic DNA Extraction Kit (TIANGEN). Samples were genotyped using the Illumina Global Screening Array v.3.0 + multi-disease bead chips (GSAMD-24v3-0-EA) and Infinium chemistry. Each sample was interrogated on the arrays against 730,059 single-nucleotide polymorphisms (SNPs). We performed QC using PLINK2 and conducted pre-imputation QC on both a “per-individual” basis and a “per-marker” basis. Consequently, 373,632 SNPs from 6910 samples were available for subsequent imputation. Hidden Markov model-based phasing was performed using SHAPEIT v.421, and imputation was conducted using IMPUTE522, with the 1000 Genomes Project Phase 3 as a reference panel. Post-imputation quality control included removing SNPs with an imputation info score < 0.6, a missing call rate > 0.05, or a minor allele frequency (MAF) < 0.01, leading to 8,039,891 SNPs remaining for subsequent analyses. Of the 6910 samples, 313 women with proteomic information were available for the following genetic analysis.
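The post-imputation filter can be expressed as a single predicate. This is an illustrative sketch, not the PLINK2 or IMPUTE5 workflow used in the study; the dictionary keys are hypothetical names.

```python
def passes_post_imputation_qc(info, missing_rate, maf):
    """Keep a SNP unless info < 0.6, missing call rate > 0.05, or MAF < 0.01."""
    return info >= 0.6 and missing_rate <= 0.05 and maf >= 0.01

def filter_snps(snps):
    """snps: iterable of dicts with hypothetical keys 'id', 'info', 'missing', 'maf'."""
    return [s["id"] for s in snps
            if passes_post_imputation_qc(s["info"], s["missing"], s["maf"])]
```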

Statistics and reproducibility

Co-expression network analysis of proteome and metabolome datasets

Proteins and metabolites with more than 80% missing values in the control or case groups were removed from the final analyses. Missing values were imputed with the minimal values in the proteomics and metabolomics datasets separately. All features were then log2-transformed and quantile-normalized prior to the following analysis.
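The preprocessing steps can be sketched as follows. Several details are assumptions made for illustration rather than taken from the study: missing values are represented as None, an analyte is dropped if it exceeds 80% missingness in either group, the imputation value is the dataset-wide minimum, intensities are strictly positive, and quantile normalization is applied per sample without special handling of ties.

```python
import math

def quantile_normalize(rows):
    """Give every sample (column) the same distribution: the mean of the sorted columns."""
    n_rows, n_cols = len(rows), len(rows[0])
    sorted_cols = [sorted(rows[i][j] for i in range(n_rows)) for j in range(n_cols)]
    rank_means = [sum(col[r] for col in sorted_cols) / n_cols for r in range(n_rows)]
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        order = sorted(range(n_rows), key=lambda i: rows[i][j])
        for rank, i in enumerate(order):
            out[i][j] = rank_means[rank]
    return out

def preprocess(matrix, is_case, max_missing=0.8):
    """matrix: analyte rows of intensities (None = missing), one value per sample."""
    case_idx = [j for j, c in enumerate(is_case) if c]
    ctrl_idx = [j for j, c in enumerate(is_case) if not c]

    def missing_frac(row, idx):
        return sum(row[j] is None for j in idx) / len(idx)

    # Drop analytes with too many missing values in either group (assumed reading).
    kept = [row for row in matrix
            if missing_frac(row, case_idx) <= max_missing
            and missing_frac(row, ctrl_idx) <= max_missing]
    # Impute with the minimal observed value of the dataset, then log2-transform.
    floor = min(v for row in kept for v in row if v is not None)
    logged = [[math.log2(floor if v is None else v) for v in row] for row in kept]
    return quantile_normalize(logged)
```

In this sketch the function would be called once for the proteomics matrix and once for the metabolomics matrix, so that each dataset uses its own minimum.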

Protein and metabolite co-expression modules were identified using weighted gene co-expression network analysis (WGCNA) with the R package “WGCNA”23. Soft-thresholding powers were chosen based on the criterion of approximate scale-free topology24. We used powers of 5 and 9 for proteomics and metabolomics, respectively, as these were the lowest powers to achieve a scale-free R² fit of 0.85. Signed networks were created by constructing a topological overlap matrix, and hierarchical clustering dendrograms of proteins or metabolites were produced using the topological overlap-based dissimilarity. Modules were identified in the dendrogram using Dynamic Tree Cut25, and the minimum number of analytes per module was set to 15 for proteins and 10 for metabolites. The module eigenprotein and eigenmetabolite, the first singular vector of the module expression matrix, were calculated to represent the module’s protein/metabolite expression profiles. For each protein or metabolite, module membership (MM) was calculated by correlating its expression profile with the eigenprotein or eigenmetabolite to indicate how closely an analyte was associated with a given module.
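Three quantities in this paragraph can be illustrated on a toy matrix: the signed adjacency obtained by soft-thresholding a correlation, the module eigenprotein as the first singular vector of the standardized module matrix, and MM as the correlation of each analyte with that vector. The sketch below is not the WGCNA package itself. It uses the standard signed-network form ((1 + cor) / 2) raised to the soft-thresholding power, computes the singular vector by power iteration, and orients its sign towards the average profile, which is an assumption about convention.

```python
import math

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def signed_adjacency(x, y, power):
    """Signed-network adjacency: strong negative correlations map to values near 0."""
    return ((1 + pearson(x, y)) / 2) ** power

def standardize(row):
    n = len(row)
    m = sum(row) / n
    sd = math.sqrt(sum((v - m) ** 2 for v in row) / (n - 1))
    return [(v - m) / sd for v in row]

def module_eigen(rows, n_iter=200):
    """First right singular vector (one value per sample) of the standardized matrix."""
    z = [standardize(r) for r in rows]
    n = len(z[0])
    v = z[0][:]  # start inside the row space; a constant vector would be annihilated
    for _ in range(n_iter):
        u = [sum(a * b for a, b in zip(row, v)) for row in z]
        v = [sum(u[i] * z[i][j] for i in range(len(z))) for j in range(n)]
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]
    # Orient the vector so that it agrees in sign with the average profile.
    mean_profile = [sum(z[i][j] for i in range(len(z))) / len(z) for j in range(n)]
    if sum(a * b for a, b in zip(v, mean_profile)) < 0:
        v = [-a for a in v]
    return v

def module_membership(rows):
    eigen = module_eigen(rows)
    return [pearson(r, eigen) for r in rows]
```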

Associations of protein/metabolite modules with GOS conditions and the identification of hub molecules

The associations between modules and GOS conditions (GH, PE, PPROM, sPT, SGA, and FGR), the composite outcome GOS, as well as maternal risk factors (maternal age and pre-pregnancy BMI) were assessed by correlating the module eigenprotein/eigenmetabolite with each trait. Hub proteins or metabolites in the modules related to GOS conditions were identified using two approaches: 1) For modules that were also associated with maternal risk factors, we extracted the analytes with high MM (MM > median and p-value for MM < 0.05) and tested whether the level of these analytes was explained by GOS conditions, maternal age and/or BMI. We fitted a linear model for each analyte and used a nested F-test approach to classify analytes that were differentially expressed in two or more contrasts (GOS conditions, maternal age, and/or pre-pregnancy BMI) with the R package “limma”26,27. Analytes that were explained by GOS conditions only or by both GOS conditions and maternal age/pre-pregnancy BMI were defined as the hub analytes; 2) For modules not related to maternal risk factors, we extracted analytes with high MM and calculated the protein or metabolite significance, which represented the relationship between an analyte and a GOS condition. Analytes with high MM and a p-value for protein/metabolite significance of less than 0.05 were considered hub analytes. The hub proteins related to the composite GOS outcome in the GOS core module (turquoise protein module) were also identified using the second approach, as the turquoise protein module was not associated with maternal risk factors. Given the exploratory nature of our study, we did not adjust for multiple testing in the module-trait correlation analysis or the selection of hub proteins and metabolites. Pathway and biological process enrichment of proteins was performed using WebGestalt (WEB-based Gene SeT AnaLysis Toolkit) with annotation by Kyoto Encyclopedia of Genes and Genomes (KEGG), Reactome and Gene Ontology28. The “genome protein-coding” database was selected as the reference set. The sub-classes of metabolites were annotated based on the Human Metabolome Database (HMDB)29. The FDR-significant or top 10 categories were selected.
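The second hub-selection approach amounts to a filter on three precomputed quantities per analyte. The sketch below assumes that MM, its p-value, and the p-value for protein/metabolite significance have already been calculated; the dictionary keys are hypothetical names.

```python
from statistics import median

def select_hubs(analytes, alpha=0.05):
    """analytes: list of dicts with hypothetical keys 'name', 'mm', 'mm_p', 'sig_p'.

    An analyte is a hub if its MM exceeds the module median, its MM p-value is
    below alpha, and its significance p-value for the condition is below alpha.
    """
    mm_cut = median(a["mm"] for a in analytes)
    return [a["name"] for a in analytes
            if a["mm"] > mm_cut and a["mm_p"] < alpha and a["sig_p"] < alpha]
```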

Exploration of key protein-metabolite subnetworks in GOS conditions

Expression matrices of hub proteins and metabolites for each GOS condition were integrated with Data Integration Analysis for Biomarker discovery using Latent cOmponents (DIABLO)30. DIABLO extends generalized canonical correlation analysis, which maximizes the covariance between linear combinations of variables (proteins and metabolites in the present study), to a supervised framework30. We used DIABLO in the R package “mixOmics”31 to obtain a subnetwork of key proteins and metabolites for each condition. In addition to the blocks of hub proteins and hub metabolites, maternal characteristics (maternal age, pre-pregnancy BMI, and blood pressure during early pregnancy) were included as a separate block for DIABLO analysis. We determined the optimal number of components with the perf() function and tuned the number of features to retain per component with the tune.block.splsda() function, in both cases using 10-fold cross-validation repeated 10 times. Selected key proteins and metabolites were visualized using the circos plot function, and the correlation cut-off was set at ≥ 0.7 or ≤ −0.7.
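The resampling scheme, 10-fold cross-validation repeated 10 times, yields 100 train/test splits. The generator below is a generic sketch of that scheme and not the mixOmics implementation; it does not stratify by outcome, and the seed and function name are illustrative assumptions.

```python
import random

def repeated_kfold(n_samples, n_folds=10, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_indices, test_indices) for repeated k-fold CV."""
    rng = random.Random(seed)
    for rep in range(n_repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)  # a fresh random partition for every repeat
        for fold in range(n_folds):
            test = idx[fold::n_folds]
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            yield rep, fold, train, test
```

A model would be fitted on each training set and scored on the matching test set, and the error rates averaged over all splits to compare candidate numbers of components or features.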

Exploration of genetic variants associated with the GOS core module using genetic studies and molecular interaction network

Genome-wide associations with the eigenprotein of the GOS core module (turquoise protein module) were analyzed in PLINK2 using linear regression models assuming an additive genetic model, with adjustment for the residual population structure (the top two principal components from principal component analyses). The genome-wide significance threshold was set at 5.0 × 10⁻⁸ and suggestive significance at 5.0 × 10⁻⁶. Associations of known GOS variants with the eigenprotein of the GOS core module were analyzed using linear regression. A proxy in high linkage disequilibrium was used for any variant not available in our dataset. Previously reported variants associated with GOS conditions at P < 1.0 × 10⁻⁵ in published genome-wide association and meta-analysis studies were included (Supplementary Data 1)32,33,34,35,36,37,38,39,40,41,42,43. For hub proteins in the GOS core module and significant genes identified in genome-wide studies and association studies with known variants, we used NetworkAnalyst44 to construct a minimum interaction network, which only consisted of the seed nodes (significant genes and hub proteins) and the nodes that were necessary to connect the seed nodes. The literature-curated IMEx Interactome was used as the protein-protein interaction database45.
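For a single SNP, the additive model regresses the eigenprotein on the allele dosage (0, 1 or 2) together with an intercept and the top two principal components. The sketch below fits that model by ordinary least squares through the normal equations and returns only the per-allele effect; it is a plain illustration, not the PLINK2 implementation, and it omits standard errors and P values.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def additive_effect(y, dosage, pc1, pc2):
    """Per-allele effect of one SNP on y, adjusted for two principal components."""
    x = [[1.0, g, a, b] for g, a, b in zip(dosage, pc1, pc2)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)[1]  # coefficient of the dosage column
```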

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Results

Characteristics of the study population

Characteristics of nulliparous women in the control and GOS groups, as well as in each GOS condition group, are shown in Supplementary Data 2. A total of 203 pregnancies that subsequently developed GH (n = 57), PE (n = 14 for preterm PE, n = 51 for term PE), PPROM (n = 38), sPT (n = 43), SGA (n = 24), and/or FGR (n = 16) were included in the GOS group. The GOS conditions, except for SGA and FGR, were mutually exclusive. The control group consisted of 181 healthy pregnant women. Women in the GOS group were older and had a higher pre-pregnancy BMI than the control participants. The distribution of gestational ages at sampling in the control and GOS groups is shown in Supplementary Fig. 1.

Associations of protein/metabolite modules with GOS conditions and the identification of the GOS core module

Using omics data in early pregnancy, WGCNA identified 11 protein modules (Supplementary Fig. 2, Fig. 2a; see Supplementary Data 3 for table format) and 15 metabolite modules (Supplementary Fig. 3, Fig. 2b; see Supplementary Data 4 for table format). For proteomics, eight modules were associated with at least one GOS condition (Fig. 2a). Notably, one protein module (turquoise) was positively associated with GOS and all six conditions (Fig. 2a, c, d), as well as more detailed GOS phenotypes (GH without FGR, PE without FGR, PPROM without FGR, preterm PE and term PE) (Supplementary Fig. 4). Moreover, this module was not correlated with maternal risk factors (maternal age and pre-pregnancy BMI). Therefore, we defined this module as the GOS core module in the following analyses. Pathway analysis was performed for all the proteins in the GOS core module (Fig. 2e, Supplementary Data 5). The top 5 significantly enriched pathways were immune system-related (Innate immune system, Neutrophil degranulation, Immune system) and platelet-related pathways (Platelet degranulation, Response to elevated platelet cytosolic Ca²⁺). Eighty-one hub proteins related to the composite GOS outcome in the GOS core module and their odds ratios (ORs) for developing GOS are displayed in Fig. 2f (see Supplementary Data 6 for table format). The biological processes enriched for these proteins are illustrated in the directed acyclic graph in Fig. 2g (Supplementary Data 7), which primarily encompasses the immune system process. The other seven GOS condition-related modules were associated with maternal age or pre-pregnancy BMI, and five were shared across more than one condition (green, magenta, greenyellow, blue, red) (Fig. 2a).

Fig. 2: Associations of GOS conditions and maternal characteristics with co-expression modules and pathway enrichment and hub proteins for the GOS core module.

a Heat map for protein modules. b Heat map for metabolite modules. Each row of the heat map corresponds to a module eigenprotein/eigenmetabolite. Cells containing correlation coefficients and p-values represent the significant associations. Analysis was performed on data from n  =  384 study participants. c Boxplot for the eigenprotein values of the turquoise protein module in the control group versus the GOS group. d Boxplot for the eigenprotein values of turquoise protein module in the control group versus different GOS condition groups. The Kruskal-Wallis p-value was annotated on the top of the plot. e Pathway analysis of proteins in the GOS core module (turquoise protein module). f Hub proteins related to the composite GOS outcome in the GOS core module (turquoise protein module) and their ORs (95% CIs) for developing GOS. Logistic regression was used to estimate the associations. g DAG of GO biological process enrichment results for the hub proteins in the GOS core module. Boxes with blue color correspond to the top biological processes that passed the FDR correction. GOS great obstetrical syndromes, GH gestational hypertension, PE preeclampsia, PPROM preterm prelabor rupture of membranes, sPT spontaneous preterm labor, SGA small for gestational age, FGR fetal growth restriction, BMI body mass index, OR odds ratio, CI confidence interval, DAG directed acyclic graph, GO gene ontology, FDR false discovery rate.

For metabolomics, nine modules were associated with at least one GOS condition (Fig. 2b), among which three were also associated with maternal risk factors. No metabolite module showed significant associations across all the GOS conditions. Although GH and PE shared four disease-related metabolite modules, and PPROM and sPT shared two modules, there was little overlap between hypertension-related and preterm birth-related modules.

Hub proteins/metabolites for GOS conditions and their enriched functions

According to the modules’ relationships with maternal risk factors (maternal age and pre-pregnancy BMI), we used two strategies to identify the hub proteins and metabolites (Methods and Supplementary Figs. 5–8). Hub analytes common for ≥ 2 conditions are displayed in Fig. 3a, b (see Supplementary Data 8, 9 for table format). We identified 4 common hub proteins across all 6 conditions, and 20 and 21 proteins were shared across 5 and 4 conditions, respectively (Fig. 3a). Pathway analysis revealed several common pathways for GOS conditions (Fig. 3c, Supplementary Data 10). Platelet degranulation, response to elevated platelet cytosolic Ca²⁺, and innate immune system were shared across all six conditions. Several pathways common to two to five conditions involved events within the immune system, such as neutrophil degranulation, complement cascade, and complement and coagulation cascades (Fig. 3c). Condition-specific hub proteins and pathways were also identified (Supplementary Data 10, Supplementary Table 1). For example, apolipoproteins (APOC3, APOE, APOA1), involved in chylomicron assembly and remodeling, were unique hub proteins in GH.

Fig. 3: Hub proteins and metabolites for GH, PE, PPROM, sPT, SGA and FGR and their enriched functions.

List of hub proteins (a) and metabolites (b) common for ≥2 conditions. The heatmap shows the protein/metabolite significance, which represents the relationship between an analyte and a GOS condition. The annotation of frequency on the right indicates the number of conditions a hub protein or metabolite shares across. Point annotation of a pathway means that the protein is involved in immune and platelet function. Analysis was performed on data from n  =  384 study participants. c UpSet plot of the top 10 enriched pathways of hub proteins for GOS conditions. d Subclasses of hub metabolites for GOS conditions. Subclasses that have only one metabolite were labeled as “others”. GOS great obstetrical syndromes, GH gestational hypertension, PE preeclampsia, PPROM preterm prelabor rupture of membranes, sPT spontaneous preterm labor, SGA small for gestational age, FGR fetal growth restriction.

Compared to proteomics, fewer overlapping hub metabolites were found in GOS conditions (Fig. 3b, d, see Supplementary Data 9 and 11 for table format, Supplementary Table 2). GH- and PE-related modules were enriched in glycerophospholipids [glycerophosphocholines (GPChos) and glycerophosphoethanolamines (GPEths)], followed by amino acids and peptides. In contrast, hub metabolites for PPROM and sPT were mainly fatty acids, followed by bile acids, alcohols and derivatives. SGA-related modules were enriched in fatty acids, GPChos, amino acids, and peptides. GPChos were found to be the main hub metabolites for FGR.

Key protein-metabolite subnetworks in GOS conditions

To better understand the interplay between the different levels of biological systems, we employed DIABLO to conduct feature selection among hub molecules, in combination with maternal characteristics, to identify the key molecular subnetworks in each condition (Supplementary Fig. 9). Each condition had its specific protein-metabolite subnetworks and signatures. For example, upregulated carnitine and apolipoproteins were recognized as essential features of GH. The key subnetworks also underlined the negative relationships between long-chain fatty acids and preterm birth. Important relationships between immune and platelet function-related proteins and GOS conditions were reinforced in the DIABLO models, such as FN1, MME, GAA with GH, PE, SGA and FGR; MRC1 and C1QB with both PPROM and SGA; and CCL16 with both PPROM and sPT.

Genetic variants associated with the GOS core module

To investigate the genomic signatures correlated with the GOS core module, we first performed a genome-wide association study (GWAS). Although we found no variants with genome-wide significance (P < 5.0 × 10⁻⁸), we identified 5 loci with suggestive significance (P < 5.0 × 10⁻⁶; ADAM12, FANK1, ARHGAP44, GAB4 and LAPTM5) (Fig. 4a, Supplementary Fig. 10, Supplementary Data 12). Next, we conducted a linear regression analysis using known variants associated with hypertensive pregnancies and preterm births (Supplementary Data 1). SNPs in TRPC6, TGFBR3, CHMP2B, ZBTB38 and EEFSEC genes demonstrated associations with the eigenprotein of the GOS core module (Supplementary Table 3). However, all these associations were at the P < 0.05 level, and none survived multiple testing correction. Nonetheless, when we employed a knowledge-driven strategy based on known protein-protein interactions using NetworkAnalyst to examine the links between hub proteins in the GOS core module and the genes identified in our GWAS and in the association analysis with known variants, nine of the 10 genes and 67 of the 81 hub proteins were interconnected in a functional network (Fig. 4b). Among the nine genes, LAPTM5 ranked the highest with regard to degree centrality and was directly connected to LAMP1 protein, and TGFBR3 ranked the highest with regard to betweenness centrality and was directly connected to ENG protein (Fig. 4b, Supplementary Data 13). Among the 67 hub proteins, proteins involved in immune and platelet function ranked top in both degree centrality and betweenness centrality (PKM, CAND1, VCL, ACTN1, AHSG, NCF2, HSP90B1 and BTK) (Supplementary Data 13).
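Degree centrality counts the direct neighbours of a node, whereas betweenness centrality counts how many shortest paths between other node pairs pass through it. The sketch below computes both for a small undirected, unweighted graph using Brandes' algorithm. The values are unnormalized, which may differ from the scaling reported by NetworkAnalyst, and the node names in any example are placeholders rather than proteins from the study network.

```python
from collections import deque

def degree_centrality(adj):
    """adj: dict mapping each node to the set of its neighbours (undirected graph)."""
    return {v: len(nbrs) for v, nbrs in adj.items()}

def betweenness_centrality(adj):
    """Unnormalized betweenness for an undirected, unweighted graph (Brandes)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                  # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                  # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}
```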

Fig. 4: Genetic variants associated with GOS core module.

a Manhattan plot of genome-wide SNP associations with the eigenprotein of the turquoise protein module. The horizontal red line represents the suggestive significance threshold (P = 5 × 10⁻⁶). Analysis was performed on data from n = 313 study participants with both proteomic and genotyping information. b Integration of the hub proteins in the GOS core module and the genomic signatures identified through genome-wide association study and candidate gene association study via protein-protein interaction network. Green nodes represent the genomic signatures. Red, orange and yellow nodes represent GOS hub proteins, with darker colors corresponding to higher betweenness centrality values and larger nodes corresponding to higher degree centrality values. Grey nodes represent the nodes necessary to connect the seed nodes (genomic signatures and GOS hub proteins) in a minimum interaction network. GOS great obstetrical syndromes.

Dynamics of the common proteomic signatures during early pregnancy

Before investigating the temporal progression of protein expression patterns, we first explored the gestational time point at which the proteomic profile might be different from subsequent weeks during early pregnancy. We found that 16 and 14 weeks of gestation might be the time points for healthy and complicated pregnancies, respectively (Supplementary Fig. 11). For the GOS core protein module, the eigenprotein values were significantly associated with gestational age at sampling and increased as gestation progressed in both the control (Pearson correlation coefficient = 0.18, p-value  = 0.02) and GOS (Pearson correlation coefficient = 0.28, p-value  = 4.5E-05) groups (Fig. 5a). Importantly, starting at the end of 13 weeks of gestation, women who subsequently developed GOS had a significantly higher expression level of proteins in the core module than healthy controls (nonoverlapping 95% confidence intervals of the mean eigenprotein between GOS and control groups) (Fig. 5a).
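The comparison rule used here, nonoverlapping 95% confidence intervals of the group means, can be sketched as follows. The interval is computed with a normal approximation (mean ± 1.96 standard errors), which is an assumption made for illustration; the intervals in Fig. 5a come from fitted linear models and will not match this calculation exactly.

```python
import math
from statistics import mean, stdev

def mean_ci95(values):
    """Approximate 95% CI of the mean: mean +/- 1.96 * standard error."""
    m = mean(values)
    half_width = 1.96 * stdev(values) / math.sqrt(len(values))
    return m - half_width, m + half_width

def cis_do_not_overlap(group_a, group_b):
    lo_a, hi_a = mean_ci95(group_a)
    lo_b, hi_b = mean_ci95(group_b)
    return hi_a < lo_b or hi_b < lo_a
```

Nonoverlapping intervals are a conservative criterion: two groups can differ significantly in a formal test even when their intervals overlap slightly.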

Fig. 5: Dynamic changes in the common proteomic signatures during early pregnancy.

a Temporal changes in eigenprotein values across early pregnancy in GOS and control groups. Lines were smoothed by linear models and shaded areas represent 95% CIs. The correlation coefficients between eigenprotein values and gestational age at sampling and the P values are shown in the top left corner. b Heatmap of the P values on 81 hub proteins in the GOS core module. Logistic regression was used to estimate the associations, and gestational age at sampling was adjusted in the “<14w” category due to the significant differences in gestational age at sampling between the GOS and control groups. For “14-15w” and “≥16w” categories, the differences in gestational age at sampling between GOS and control groups were not significant. The categories of “<14w”, “14-15w” and “≥16w” represent the sample collection windows before 14, between 14 and 15, and at or after 16 weeks of gestation. The mean expression levels (log2 transformed and normalized values) of c ADGRE5, d TIMP1, e FN1, f MME, and g VWF in the control, HDP (GH + PE) and PT (PPROM + sPT) groups before and after 14 weeks of gestation. Error bars correspond to the 95% CI. Analysis was performed on data from n = 384 study participants. GOS great obstetrical syndromes, HDP hypertensive disorders of pregnancy, PT preterm birth, CI confidence interval.

According to the gestational age at sampling, we further divided the participants into 3 groups (<14 weeks, 14-15 weeks and ≥ 16 weeks) to identify potential biomarkers for GOS in different detection windows. Among the 81 hub proteins in the GOS core module, the expression levels of 16 proteins differed significantly between the GOS and control groups as early as the first trimester (Fig. 5b, see Supplementary Data 14 for table format). During the early second trimester, more proteins demonstrated the potential to differentiate normal and complicated pregnancies. We also found 5 proteins (CCL16, SAA2-SAA4, LAMA2, LAMP1, SERPINF1) with increased levels in complicated pregnancies not only before 14 weeks but also after 16 weeks of gestation (Fig. 5b). Two immune-related proteins, ADGRE5 and TIMP1, ranked at the top with regard to the P-values in the <14 weeks and ≥16 weeks groups, respectively. These two proteins demonstrated increased levels in both hypertensive pregnancies and preterm birth compared to the control group before and after 14 weeks of gestation, respectively (nonoverlapping 95% confidence intervals of the mean expression levels between hypertensive pregnancies/preterm birth and control groups) (Fig. 5c, d). In addition to the proteins in the core module, several immune- and platelet-related signatures common in ≥4 conditions (FN1, MME, VWF) also had significantly higher levels in complicated pregnancies (Fig. 5e–g); however, these differences were observed only after 14 weeks of gestation.
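The three sampling windows can be assigned with a small helper. This sketch assumes that gestational age is recorded in decimal weeks and that "14-15 weeks" covers 14 weeks 0 days up to, but not including, 16 weeks; the function name is illustrative.

```python
def sampling_window(ga_weeks):
    """Map gestational age at sampling (decimal weeks) to a detection window."""
    if ga_weeks < 14:
        return "<14w"
    if ga_weeks < 16:
        return "14-15w"
    return "≥16w"
```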

Discussion

This study used systems-level information from genomics, proteomics, and metabolomics to compare early-pregnancy molecular signatures across different GOS conditions. We revealed a shared molecular background underlying these conditions and the upstream genetic variants associated with these molecular changes. We also unveiled the temporal patterns of the common signatures of GOS. Our findings may have important implications for disease pathogenesis, early prediction, and prevention.

A shared molecular background of GOS conditions, which may be part of the “core” GOS pathology that precipitates the manifestation of symptoms, was identified at the module, molecule, pathway, multi-omics network, and genome levels. At the co-expression module level, six protein modules were found to be correlated with more than one GOS condition. Importantly, we identified a core protein module that was positively associated with all tested conditions and was not influenced by maternal age or pre-pregnancy BMI. At the molecular level, we observed 45 hub proteins shared across ≥4 conditions and identified 81 hub proteins mainly involved in the immune system processes for the composite outcome GOS. Pathway-level analysis in the core protein module and for all hub proteins under the tested GOS conditions highlighted the central role of immune and platelet functions in GOS. At the network level, by combining metabolomic data, we observed that although multi-omics models identified different protein-metabolite subnetworks for the tested GOS conditions, they reinforced the role of proteins related to immune and platelet functions.

We further examined the genomic information to investigate potential upstream contributors to these common proteomic changes. Using both data-driven and knowledge-driven strategies, we identified nine upstream genetic variants associated with the core protein module. Although the genetic findings are exploratory results with suggestive significance, they may provide clues to the initial molecular events in GOS. The top SNPs with the smallest P-value in our GWAS are located in the region of the ADAM12 gene, which encodes a disintegrin and metalloproteinase-12 and is linked with two platelet function-related hub proteins in the GOS core module (PLG and ACTN1) in the interaction network46. LAPTM5, the gene with the highest degree of centrality in the molecular interaction network, encodes lysosomal-associated protein transmembrane 5 and regulates multiple pathways in immune cells47. TGFBR3, which is the gene with the highest betweenness centrality, encodes the transforming growth factor-beta type III receptor and is involved in immune response and associated with early-onset preeclampsia48,49. TGF-β is a well-known cytokine regulating a variety of cellular functions, and the binding of TGF-β to its receptor activates various signaling pathways involved in trophoblast differentiation, invasion and migration50. Together with the functions of the core protein module, these genetic signatures inform us that the initial molecular events in GOS mainly involve immune and platelet-related processes.

The finding that immune and platelet functions are crucial in GOS is not unexpected, given previous studies9,51,52,53. Still, our comprehensive analysis of multiple GOS conditions underscored the commonalities in the pre-symptomatic phase at a systemic level. In addition, our findings provide molecular evidence for the co-occurrence and interrelatedness of GOS conditions, adding weight to the concept that shared mechanisms underlie these conditions. It has been suggested that GOS conditions are underpinned by defective deep placentation, which is characterized by incomplete conversion of spiral arteries and obstructive vascular lesions1,2. We speculate that the shared molecular signatures identified in this study may indicate the early development of placental bed pathology, where immunoregulation and platelets are involved. Indeed, several important genes and proteins identified in our single- and multi-omics analyses are involved in placentation. For example, ADAM12 is highly expressed in the human placenta and serves as a critical regulator of trophoblast migration and invasion during early pregnancy46,54. In trophoblastic cell models and placental villous explant cultures, the knockdown of ADAM12 dampened trophoblast cell invasion, column outgrowth, and cell fusion46,55. CCL16 (C-C Motif Chemokine Ligand 16), a GOS hub and immune-related protein, may influence angiogenesis by interfering with endothelial cells56. Trophoblasts acquire CCR1, the receptor of CCL16, in the initial step of invasion57, and it has been observed that high umbilical artery CCL16 was associated with PE and FGR, and CCR1 was also highly expressed in PE placentas56. Another GOS hub protein, ENG (endoglin), plays an important role in angiogenesis and may contribute to the risk of PE, FGR and preterm birth58,59,60. The overexpression of endoglin has been observed in the ischemic placental tissue, and the increased release of soluble endoglin into circulation may impair the invasive ability of trophoblast cells61. PKM, an immune-related protein that ranked top with regard to degree centrality in the interaction network, has been reported to be involved in trophoblast cell invasion, and increased expression of PKM has been found in PE and FGR placentas62,63.

More importantly, our findings support the use of a screening tool and broad-spectrum prophylaxis to detect and prevent GOS in early pregnancy and provide a resource for translatable prediction and intervention targets for future studies. Previous studies usually investigated the biomarkers of a specific GOS condition and constructed a corresponding prediction model. Under this approach, several screening tests would need to be applied in early pregnancy to cover the entire spectrum of GOS conditions. Such a strategy may be neither practical nor economical, considering the interrelatedness of different GOS conditions. Instead, one common prediction tool for the global risk of GOS in early pregnancy combined with phenotype-specific “rule in” or “rule out” tests in late pregnancy could be clinically valuable, especially for asymptomatic women with few risk factors. The possibility of common predictors has been evaluated in recent studies11,64. Boutin et al. and Minopoli et al. utilized the Fetal Medicine Foundation (FMF) screening test to estimate the risk of a composite GOS outcome including PE/HDP, SGA, fetal death, and preterm birth11,64. They found that this algorithm could identify women at a high risk of GOS with a higher positive predictive value, which may reduce the harms of a possible false-positive result, such as maternal anxiety and unnecessary monitoring or aspirin exposure64. However, the FMF algorithm achieved only moderate performance in early pregnancy, and we hypothesize that the addition of other biomarkers may enhance its performance. In the future, we will conduct further research to confirm the common molecular changes identified in this study and to explore the potential of a single screening model of GOS in early pregnancy.

Potential broad-spectrum prophylactics for various GOS conditions have also been explored in previous studies, particularly drugs that could impact immune- or platelet-related functions. Low-dose aspirin (LDA) has long been proposed for the prevention of GOS conditions3,65. Several meta-analyses of RCTs indicated that LDA initiated in early pregnancy may reduce the risk of preeclampsia, preterm birth, SGA, FGR, and perinatal mortality66,67,68,69. Low-molecular-weight heparin (LMWH), either alone or in combination with LDA, is another potential intervention strategy that may be beneficial. In high-risk women, meta-analyses have shown that LMWH started before 16 weeks of gestation was associated with a substantial reduction in the development of GH, PE, SGA, and perinatal death70,71. Notably, a recent meta-analysis found that vaginal progesterone initiated in the first trimester, which is known for its effect on reducing the risk of preterm birth, may also decrease the risk of HDP and PE72. Other candidate drugs targeting immune- or platelet-related pathways could also be future options, such as statins and hydroxychloroquine3,73,74,75,76.

The optimal timing for initiating prophylaxis in early pregnancy remains an important and controversial issue. For example, cut-off gestational ages of 14, 16, and 20 weeks have all been applied in clinical trials for LDA65,66,67,68,69. Here, from a molecular perspective, we suggest that initiating prophylactics before 14 weeks of gestation may be an optimal window not only for LDA but also for other potential drugs that modulate immune- or platelet-related pathways. Our data indicate that proteomic changes have already emerged as early as the first trimester in the sera from women who subsequently develop GOS. As pregnancy progressed to 16 weeks, more proteins demonstrated significant changes with larger deviation from normal expression patterns, which may imply an aggravating pathophysiology. These molecular findings might partly explain why some recent clinical trials initiating aspirin before 16 or 20 weeks observed a nonsignificant reduction of preterm birth or PE77,78. An earlier start of intervention, no later than the first trimester, might nip the pathology in the bud.

The major strengths of our work include the following: 1) We leveraged the systemic information in genome, proteome and metabolome to derive a comprehensive understanding of the pathogenesis of GOS at different biological system levels. 2) We simultaneously described the molecular profiles of multiple GOS conditions to avoid neglecting the commonalities and heterogeneity of these closely related conditions. 3) The multi-omics profiling was derived at a preclinical stage. It might serve as a source of translatable targets for early prediction and prevention. 4) We only included nulliparas in our work, so the confounding effects of previous pregnancy history can be excluded. Additionally, without the indication of previous pregnancy history, the screening for nulliparas at risk of GOS is still challenging16,79. Our data can be applied to the development of prediction and prevention strategies for these individuals.

This study also has several limitations. More detailed phenotypes were not considered due to the limited sample size; hence, we were unable to describe the molecular characteristics of early- and late-onset PE, early and late preterm birth, and SGA/FGR not complicated by other GOS conditions. The small sample size also decreased the power to detect significant associations in genomic analysis. Caution is necessary when interpreting the suggestive loci, and these exploratory results should be validated in a larger dataset or a meta-analysis in future research. Additionally, the investigation of the temporal progression of common proteomic changes was based on data from cross-sectional grouping rather than a longitudinal follow-up. Therefore, longitudinal studies using serial samples from the first into the second trimester are needed to track and validate the dynamic protein expression patterns suggested in the present study. Finally, information on placental histopathology is not available in the SBC dataset. Therefore, the findings in the present study do not provide direct evidence that the shared molecular abnormalities are associated with placental abnormalities. Taken together, our findings merit further investigation and the signatures need to be validated in larger cohorts.

Our work has several implications for future research. First, the exploratory results presented here, using untargeted omics analysis, particularly the common protein biomarkers, should be confirmed in independent cohorts using different platforms with targeted approaches. After verifying the common protein biomarkers, the construction and performance evaluation of a single screening model for GOS in early pregnancy can be conducted prospectively. Second, further mechanistic studies using in vitro and in vivo models are warranted to investigate the role of immune and platelet function-related proteins in placental maldevelopment and establish the causal links. Third, targeted intervention strategies, as well as combined intervention approaches, could be investigated based on the findings of our study. Using the molecular signatures presented here, future research could investigate the subsets of high-risk pregnant women who may benefit from LDA, LMWH or other prophylactics. Combining two or more prophylactics that target the immune- or platelet-related pathways is also a direction for future research. Finally, integrating other omics layers, such as cell-free DNA/RNA from maternal blood and the proteome of first-trimester placental villi or decidua obtained via biopsy, could enrich our understanding of the molecular changes occurring both systemically and at the maternal-fetal interface in early pregnancy.

In summary, this study presents a multi-omics landscape of multiple GOS conditions at the pre-symptomatic phase and demonstrates the shared proteomic changes among different GOS conditions. Together with the upstream genetic variants, the common molecular signatures may reflect the core GOS pathology, where the immune and platelet functions play a crucial role. Furthermore, the molecular profile of the core early-stage GOS pathologies provides useful clues to the possibility of a common screening tool and broad-spectrum prophylactics for a composite pregnancy outcome, as well as the necessity of early initiation of prevention no later than 14 weeks of gestation.