Abstract
Preeclampsia (PE) is a serious pregnancy complication that contributes to maternal and perinatal morbidity and mortality. Understanding its pathogenesis and identifying predictive biomarkers are essential for guiding treatment decisions. To explore the global changes in serum metabolites in PE patients and to identify potential predictive biomarkers for suspected PE patients (pregnant women who had already shown PE-related symptoms in the middle to late stages of pregnancy but had not yet received a confirmed diagnosis of PE), a large-scale serum metabolomic analysis was conducted with a prospective cohort of 328 suspected PE patients in the middle or late stages of pregnancy, as well as a retrospective cohort of 30 healthy pregnant women and 30 PE patients. Serum metabolomic profiling by liquid chromatography-mass spectrometry (LC-MS) revealed that the development of PE was closely associated with disturbed amino acid metabolism. Moreover, a panel of seven predictive biomarkers for PE development, comprising 2-methyl-3-hydroxy-5-formylpyridine-4-carboxylate, gamma-glutamyl-leucine, 2-hydroxyvaleric acid, LysoPC(16:1(9Z)/0:0), PC(DiMe(13,5)/MonoMe(13,5)), ADP-D-glycero-beta-D-manno-heptose and phenylalanyl-tryptophan, was identified through multiple statistical analyses and LASSO regression. The combination of these biomarkers showed promise in predicting PE development in suspected PE patients, with AUCs of 0.753 and 0.885 in the discovery and validation cohorts, respectively. These findings highlight the potential of large-scale prospective metabolomic studies combined with machine learning algorithms for identifying key biomarkers that predict PE development, while retrospective metabolomics studies provide insights into the pathogenesis of PE.
Introduction
Preeclampsia (PE) is a common and serious complication of pregnancy and the postpartum period, characterized by new-onset hypertension (> 140/90 mmHg) and proteinuria (> 300 mg/24 h), typically after 20 weeks of pregnancy. It can also be superimposed on preexisting hypertension or renal disease. PE is currently recognized as a public health concern for several reasons: (1) it is diagnosed in 2–5% of pregnancies, with a higher incidence for other forms of pregnancy-related hypertension1; (2) there is an urgent need for more effective diagnostic biomarkers to assist clinical decision-making for suspected PE patients, because traditional clinical indicators have limited predictive value for PE or its adverse pregnancy outcomes2,3; (3) PE is a complex multisystem disease that triggers problems in the liver, kidney, brain, and clotting system4, and, worse, it increases the risk of long-term complications, including metabolic and cardiovascular disease, in both mothers and offspring5; and (4) there is currently no cure for PE, and treatments focus mainly on relieving symptoms and minimizing complications6.
Recent research has focused on elucidating the development of PE and identifying biomarkers for its prediction. The current strategy for predicting preeclampsia is based on a combination of baseline maternal factors, biophysical parameters, and placenta-associated proteins7. Although this combination is effective in predicting PE in the early stage of pregnancy, it falls short in predicting PE development in women who have already shown PE-related symptoms8. Studies have suggested that PE might stem from an altered maternal pattern of circulating placentally derived proteins that regulate angiogenesis9, and a specific sFlt-1:PlGF ratio cutoff of 38 has shown promise in predicting PE development in women with clinical suspicion of the condition10. However, evidence for the diagnostic effectiveness of the ratio in screening women without clinical suspicion of the disease is poor11.
The metabolome is the final manifestation of integrated upstream biological information flow and determines the eventual phenotype. Compared with targeted metabolomics, which focuses on well-defined metabolites, high-throughput untargeted metabolomics aims to monitor all low-molecular-weight metabolites in a biological fluid and has been widely used to discover disease-specific metabolic patterns12. In addition, recent work has highlighted the potential of machine learning (ML) algorithms for processing large amounts of data and screening candidate biomarkers effectively. The combination of untargeted metabolomics and ML provides comprehensive insights into metabolic alterations, leading to improved predictions13,14,15. Despite advances in metabolomics for PE, challenges for clinical application remain. The heterogeneity of metabolites has contributed to poor specificity and low positive predictive values (PPV, 8-33%)16, leading to unnecessary tests and interventions for false-positive patients. Additionally, previous PE predictive models were established by comparing healthy controls (HC) with PE patients, so the differences between suspected-PE+ (patients with PE-related symptoms who are diagnosed with PE) and suspected-PE− (patients with PE-related symptoms who did not develop PE) were not revealed. Therefore, more diverse populations should be recruited to validate the biomarkers.
In this work, an integrated workflow for metabolome profiling of maternal serum was carried out to explore the global changes in serum metabolites in PE and to identify potential predictive biomarkers for PE development. First, HC, PE patients and suspected PE participants were recruited in independent experiments for an untargeted serum metabolomics study. Second, multivariate statistical analysis was employed to reveal the metabolic differences between HC vs. PE and suspected-PE− vs. suspected-PE+. Third, predictive biomarkers for suspected PE patients were identified and validated in two cohorts. To our knowledge, this work is the first to reveal candidate predictive biomarkers for suspected PE patients based on large-scale prospective serum metabolomics. We envision that this integrated research could deepen our understanding of PE.
Results
Characteristics of serum metabolomic profiles in all groups based on untargeted metabolomics
Studies on determinants of PE in pregnant women have suggested parity, age, pre-pregnancy BMI and gestational week at sampling as potential covariates of PE during pregnancy16,17,18. As shown in Fig. 1, retrospective and prospective cohorts were constructed, and the baseline characteristics of the study population, including these potential covariates, were summarized; no statistically significant differences were observed (Table 1). Following appropriate pre-treatment, serum samples were injected into the UPLC-QE-MS system for untargeted metabolomic analysis. A total of 8033 features were detected in the positive mode and 6291 features in the negative mode (Figure S1). After signal de-noising and dataset normalization, 1569 metabolites in the positive mode and 825 metabolites in the negative mode were annotated against the public database. A total of 187 compounds were identified in both positive and negative ion modes (Fig. 2A). Among these annotated metabolites, lipids and lipid-like molecules were the most abundant, accounting for 41.31% of the total metabolites. They were followed by organic acids and derivatives, organoheterocyclic compounds, benzenoids, organic oxygen compounds, phenylpropanoids and polyketides, and organic nitrogen compounds, among others (Fig. 2B).
The metabolites collected in the positive and negative ion modes were combined for multivariate statistical analysis. Unfortunately, the unsupervised pattern recognition method was not sufficient to differentiate the four groups (Figure S2). Subsequent partial least squares discriminant analysis (PLS-DA) revealed that HC samples were mainly located in the first quadrant, with mild variations observed for suspected-PE− samples. In contrast, significant intra-group variations were observed for suspected-PE+ samples and, in particular, for PE samples. Notably, there was a gradual deviation from normal physiological patterns during the development of PE (Fig. 2C). This suggests that certain candidate biomarkers could be used to reflect the pathophysiological progression of PE. To validate the reliability of the PLS-DA model, a permutation test was carried out and no overfitting was observed (Figure S3). Additionally, an unsupervised principal component analysis (PCA) was performed to evaluate the variance of QC samples (Figure S4). The results indicated that the QC samples clustered closely within a range of no more than two standard deviations, demonstrating the stability of the analysis system and the high quality of the data.
Venn diagram of metabolites detected in the positive ion mode and negative ion mode (A); Types and proportions of metabolites detected in untargeted metabolomics (B); Score scattering plot yielded from partial least-squares discrimination analysis (PLS-DA) on the quantitative metabolome datafile of serum samples from HC, PE, suspected-PE− and suspected-PE+ groups (C).
Differential metabolites and dysregulated metabolic pathways between healthy pregnant women and PE patients
To accentuate between-group differences in the metabolomics analysis and identify key differential metabolites, supervised orthogonal partial least squares discriminant analysis (OPLS-DA) was performed. The score plot showed a clear separation between HC and PE (Fig. 3A). To check for overfitting, a permutation test with 200 permutations was carried out, and the results suggested that the supervised model provided an objective comparison between HC and PE (Figure S5). Metabolites with a VIP value above 1.5 were dispersed from the origin in the loading plot (Fig. 3B). Additionally, through univariate nonparametric Wilcoxon analysis, 98 differential metabolites with VIP > 1.5, p < 0.05 and fold change (FC) > 1.2 or FC < 0.8 were recognized as the primary contributors to group classification (Fig. 3C).
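The three-part selection rule described above (VIP, p-value and fold change) can be sketched as a simple filter. This is a minimal illustration, not the authors' pipeline: the `Feature` class, the metabolite names and all numeric values are hypothetical, and the fold change is assumed here to be the ratio of the PE mean to the HC mean.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    vip: float          # VIP value from the OPLS-DA model
    p_value: float      # p-value from the univariate Wilcoxon test
    fold_change: float  # assumed definition: mean(PE) / mean(HC)


def is_differential(f, vip_min=1.5, p_max=0.05, fc_up=1.2, fc_down=0.8):
    """Apply the thresholds stated in the text: VIP > 1.5, p < 0.05, FC > 1.2 or FC < 0.8."""
    changed = f.fold_change > fc_up or f.fold_change < fc_down
    return f.vip > vip_min and f.p_value < p_max and changed


def direction(f):
    """Label a feature as up- or down-regulated in PE relative to HC."""
    return "up" if f.fold_change > 1 else "down"


# Hypothetical values for illustration only (not data from the study)
features = [
    Feature("metabolite_A", 2.1, 0.003, 1.80),
    Feature("metabolite_B", 1.7, 0.020, 0.60),
    Feature("metabolite_C", 1.9, 0.200, 1.50),  # fails the p-value threshold
    Feature("metabolite_D", 1.2, 0.001, 2.00),  # fails the VIP threshold
    Feature("metabolite_E", 1.6, 0.010, 1.05),  # fails the fold-change threshold
]

selected = {f.name: direction(f) for f in features if is_differential(f)}
print(selected)
```

Only features passing all three criteria are retained, which is why the volcano plot alone (p-value and fold change) is not sufficient to reproduce the selection.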
Among the 98 metabolites, 73 were up-regulated and 25 were down-regulated in the PE group. These significantly regulated metabolites fell into diverse structural categories, including organic compounds, bile acids, steroid hormones, amino acids, nucleotides, and purine metabolites. The five metabolites with the highest FC in each direction are highlighted in Fig. 3C. Notably, citric acid, taurocholic acid, 3-dehydrosphinganine, glycocholic acid and 3-phosphonooxypyruvate were significantly elevated in PE, while cortisol, estrone sulfate, dehydroepiandrosterone sulfate, L-phenylalanine and L-cysteine were significantly reduced compared to HC. Afterwards, a genetic algorithm-based evolutionary optimization method was employed to identify discriminant metabolites, with the top metabolites selected from 200 repeated 10-fold cross-validations as potential biomarkers to enhance the reliability and clinical applicability of the prediction models. A LASSO machine learning model was then used for variable selection to yield a concise panel (Figure S6). Ultimately, a total of 10 metabolites, including dehydroepiandrosterone sulfate, cortisol, valproic acid glucuronide, L-proline, 3-hydroxybutyric acid, L-histidine, 2-hydroxybutyric acid, L-glutamic acid, citric acid and 3-phosphonooxypyruvate, were identified (Table S1). As depicted in Fig. 3D, these selected biomarkers exhibited an area under the curve (AUC) ranging from 0.635 to 0.910 in the retrospective cohort. Unexpectedly, the combination of biomarkers showed outstanding accuracy (AUC = 1).
Orthogonal partial least-squares discrimination analysis (OPLS-DA) of serum samples from HC and PE groups: score scattering plots (A) and S-plot loading plots (B); Volcano plot analysis of significantly altered metabolites in the serum samples of HC versus PE (C); ROC analysis of the candidate biomarkers for PE prediction in the retrospective cohort (D).
Subsequently, the up-regulated and down-regulated metabolites were analyzed separately using MetaboAnalyst 5.0 to identify enriched pathways. We found that the down-regulated metabolites were mainly enriched in steroid hormone biosynthesis (p < 0.05, FDR < 0.25) (Fig. 4A, Table S2). On the other hand, the up-regulated metabolites were associated with pathways such as phenylalanine, tyrosine and tryptophan biosynthesis, nitrogen metabolism, phenylalanine metabolism, glyoxylate and dicarboxylate metabolism, glycine, serine and threonine metabolism, biosynthesis of unsaturated fatty acids, arginine biosynthesis, butanoate metabolism and histidine metabolism (Fig. 4B, Table S3). However, some metabolites can participate in multiple metabolic pathways. To explore the potential functional relationships between these key metabolites, a metabolite-metabolite interaction network was built. As shown in Fig. 4C, key metabolites such as L-glutamic acid, citric acid, L-phenylalanine, L-cysteine and L-proline, which had high degrees in the network, appeared to play critical roles in connecting different pathways, indicating that dysregulated amino acid metabolism underlies the various dysregulated metabolic pathways in PE.
Biomarkers discovery between suspected-PE− and suspected-PE+ groups
Predicting the development of PE in patients with PE-related symptoms is crucial in clinical practice. Therefore, a prospective study was conducted with a large cohort of pregnant women with suspected PE (n = 336 recruited), who were randomly divided into a discovery cohort and a validation cohort. The discovery cohort was used to reveal the metabolic differences between the suspected-PE− and suspected-PE+ groups. However, initial analysis using PCA did not show clear distinctions between the groups (Figure S7). Subsequently, a supervised OPLS-DA model was employed to highlight the differences more effectively. The model demonstrated a slight separation in metabolite profiles between the two groups (Fig. 5A), supported by satisfactory permutation results between two similar groups (Figure S8). Suspected-PE− samples were predominantly distributed on the positive half of the X-axis, while suspected-PE+ samples were mainly on the negative half. According to the corresponding loading plots, 117 variables with VIP > 1.5 were retained and are presented in Fig. 5B. Moreover, volcano plots were used to visualize statistical significance and fold change values, leading to the identification of 19 differential metabolites meeting the criteria VIP > 1.5, p < 0.05, and FC > 1.2 or FC < 0.8 (Fig. 5C, Table S4). Based on the significantly altered metabolites, metabolite set enrichment analysis (MSEA) and KEGG pathway analysis were used to determine the altered metabolic pathways in suspected-PE+. The enriched pathways were mainly involved in vitamin B6 metabolism, glutathione metabolism, arginine and proline metabolism and the Warburg effect (Fig. 5D).
Orthogonal partial least-squares discrimination analysis (OPLS-DA) of serum samples from suspected-PE− and suspected-PE+ groups: score scattering plots (A) and S-plot loading plots (B); Volcano plot analysis of metabolites significantly altered in the serum samples of suspected-PE− versus suspected-PE+ (C); Enrichment pathway analysis of the metabolites that were differentially altered in suspected-PE− versus suspected-PE+ (D).
Development and validation of the diagnostic model for PE in suspected patients
Systematic metabolomic investigations revealed metabolic alterations in serum between suspected-PE− and suspected-PE+ individuals. Differential metabolites were analyzed using logistic regression models with LASSO-constrained parameters to establish the prediction models. A panel of seven potential biomarkers, comprising 2-methyl-3-hydroxy-5-formylpyridine-4-carboxylate, gamma-glutamyl-leucine, 2-hydroxyvaleric acid, LysoPC(16:1(9Z)/0:0), PC(DiMe(13,5)/MonoMe(13,5)), ADP-D-glycero-beta-D-manno-heptose and phenylalanyl-tryptophan, was selected for predicting PE diagnosis (Fig. 6A). These differentially altered metabolites fell into diverse structural categories, including pyridinecarboxylic acids and derivatives; amino acids, peptides, and analogues; fatty acids and conjugates; glycerophosphocholines; and purine nucleotide sugars. The corresponding intercept and coefficients of the LASSO model are summarized in Table 2. As illustrated in Fig. 6B, the level of 2-methyl-3-hydroxy-5-formylpyridine-4-carboxylate was significantly lower in suspected-PE+ than in suspected-PE−, while the levels of gamma-glutamyl-leucine, 2-hydroxyvaleric acid, LysoPC(16:1(9Z)/0:0), PC(DiMe(13,5)/MonoMe(13,5)), ADP-D-glycero-beta-D-manno-heptose and phenylalanyl-tryptophan were significantly elevated in suspected-PE+.
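To make concrete how an intercept and a set of coefficients from a LASSO-penalized logistic model turn metabolite levels into a predicted probability, the following sketch applies the standard logistic formula. The intercept, the coefficients and the marker names are placeholders chosen for illustration; they are not the values reported in Table 2, and the inputs are assumed to be scaled relative intensities.

```python
import math


def predicted_probability(intercept, coefficients, levels):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(b * levels[name] for name, b in coefficients.items())
    return 1.0 / (1.0 + math.exp(-z))


# Placeholder values (NOT the coefficients from Table 2)
intercept = -0.5
coefficients = {
    "marker_down": -1.0,  # hypothetical marker that is lower in suspected-PE+
    "marker_up": 0.8,     # hypothetical marker that is higher in suspected-PE+
}

# Probability at average (zero, after scaling) levels of both markers
baseline = predicted_probability(
    intercept, coefficients, {"marker_down": 0.0, "marker_up": 0.0}
)
# Probability when the "down" marker is reduced and the "up" marker is raised
elevated = predicted_probability(
    intercept, coefficients, {"marker_down": -1.0, "marker_up": 1.0}
)
print(round(baseline, 3), round(elevated, 3))
```

Under the LASSO penalty, metabolites that add little predictive value receive a coefficient of exactly zero and therefore drop out of the sum, which is how a list of differential metabolites is reduced to a small panel.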
ROC analysis and the area under the ROC curve (AUC) were used to assess the diagnostic performance of all parameters. In the ROC analysis of the discovery cohort, the seven biomarker candidates ranked in descending order of AUC were phenylalanyl-tryptophan, gamma-glutamyl-leucine, 2-methyl-3-hydroxy-5-formylpyridine-4-carboxylate, PC(DiMe(13,5)/MonoMe(13,5)), LysoPC(16:1(9Z)/0:0), 2-hydroxyvaleric acid and ADP-D-glycero-beta-D-manno-heptose, with the best AUC being 0.673 (95% CI, 0.597–0.750). Multivariate ROC curve analysis may offer a more effective approach for creating and evaluating predictive biomarker models than univariate ROC curve analysis. Here, it yielded an AUC of 0.753 (95% CI, 0.683–0.824) and achieved a sensitivity of 67.91%, a specificity of 70.21%, a PPV of 80.86% and an NPV of 65.44% at the cutoff determined by Youden's index (Fig. 7A; Table 3). In the independent validation cohort, the panel of biomarker candidates showed the best predictive power, with an AUC of 0.885 (95% CI, 0.789–0.982). Using the cutoff determined by Youden's index, the biomarker panel showed a sensitivity of 76.79%, a specificity of 80.00%, a PPV of 93.26% and an NPV of 75.77% (Fig. 7B; Table 3).
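The performance measures reported above (AUC, and sensitivity, specificity, PPV and NPV at the Youden cutoff) can be computed from predicted scores and outcome labels as sketched below. This is an illustrative implementation on toy data, not the software used in the study; the scores and labels are invented, and a sample is assumed to be classified as positive when its score is at or above the cutoff.

```python
def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def metrics(scores, labels, cutoff):
    """Sensitivity, specificity, PPV and NPV when score >= cutoff is called positive."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= cutoff and y == 1)
    fp = sum(1 for s, y in pairs if s >= cutoff and y == 0)
    fn = sum(1 for s, y in pairs if s < cutoff and y == 1)
    tn = sum(1 for s, y in pairs if s < cutoff and y == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp) if tp + fp else float("nan"),
        "npv": tn / (tn + fn) if tn + fn else float("nan"),
    }


def youden_cutoff(scores, labels):
    """Cutoff maximizing Youden's J = sensitivity + specificity - 1."""
    def j(cutoff):
        m = metrics(scores, labels, cutoff)
        return m["sensitivity"] + m["specificity"] - 1
    return max(sorted(set(scores)), key=j)


# Toy data for illustration only (1 = developed PE, 0 = did not)
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 0, 1, 1, 1]

cutoff = youden_cutoff(scores, labels)
result = metrics(scores, labels, cutoff)
print(auc(scores, labels), cutoff, result)
```

Note that PPV and NPV depend on the proportion of positives in the cohort, whereas sensitivity and specificity do not, so PPV and NPV from cohorts with different PE rates are not directly comparable.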
Discussion
Hypertension and proteinuria are critical clinical symptoms of PE, as well as major objective diagnostic indicators for this condition. However, the presence of proteinuria or hypertension does not always lead to the development of PE, and the metabolic alterations in suspected-PE− and suspected-PE+ remain poorly defined. In this untargeted UHPLC-MS metabolomics study, both retrospective and prospective cohorts were included, and strict inclusion criteria were used to recruit healthy controls, PE patients and suspected-PE patients. After metabolomic profiling and systematic comparisons, significantly different serum metabolomic patterns were observed between the HC and PE groups, while the metabolic patterns were similar between suspected-PE− and suspected-PE+. Pathway enrichment analysis combined with metabolite-metabolite interaction network analysis revealed that altered amino acids bridged the various dysregulated metabolites during PE, and the endpoint of the interaction network mainly manifested as significant downregulation of steroid hormone metabolic pathways. Additionally, we constructed predictive models based on the differential metabolites between suspected-PE− and suspected-PE+ using machine learning algorithms, and then developed a consensus model with satisfactory predictive ability.
In the perturbed metabolic network of PE patients, amino acids played a central role and connected the entire network. Among them, L-glutamic acid, a key regulator of glutathione metabolism, exhibited the highest degree in the metabolic network. According to previous studies19,20, aberrant glutathione homeostasis also contributes to complications in PE. Therefore, glutathione metabolism is one of the key pathways to target in PE. In addition to amino acids, endogenous molecules including arachidonate and linoleate also served as important communication points. Over-accumulation of linoleate and arachidonate in PE indicated excessive biosynthesis of unsaturated fatty acids, leading to endothelial dysfunction due to oxidative stress and inflammation21. Moreover, metabolites related to cholesterol metabolism, such as taurocholic acid, glycocholic acid, estrone sulfate and dehydroepiandrosterone sulfate, showed the highest FC between HC and PE. Interestingly, bile acids and steroid hormones varied in opposite directions, which warrants further investigation.
Metabolomics combined with ML algorithms has been increasingly used to develop diagnostic models for various human diseases in recent years22,23,24. In this study, systematic metabolomic investigations revealed metabolic alterations in serum between HC and PE, as well as between suspected-PE− and suspected-PE+ individuals. Diagnostic models based on differential metabolites were established for PE using machine learning algorithms. For the comparison between HC and PE, the selected biomarkers showed promising results, with AUCs ranging from 0.635 to 0.910 in the retrospective cohort, but further validation in a larger population from multiple centers is necessary before clinical application.
Currently, the sFlt-1/PlGF ratio cutoff of 38 is employed for the short-term prediction of PE in Asian women with suspected PE with good performance, which has also been validated in our previous research (training cohort AUC = 0.637; validation cohort AUC = 0.733). Although the metabolic profiles of suspected-PE− and suspected-PE+ were found to be similar in this research, a panel of seven features was selected as predictive biomarkers with higher diagnostic efficiency (training cohort AUC = 0.753; validation cohort AUC = 0.885). Interestingly, although these seven biomarkers have not been previously reported in metabolomics studies on PE, they are all linked to the pathogenesis of PE. Inflammation, particularly in the preeclamptic placenta, has been highlighted as a key process. Molecules such as ADP-D-glycero-beta-D-manno-heptose (ADP-heptose) and certain glycerophospholipids, including LysoPC(16:1(9Z)/0:0) and PC(DiMe(13,5)/MonoMe(13,5)), have been implicated in modulating inflammation25,26,27. Meanwhile, lipid over-accumulation in maternal serum contributes to endothelial dysfunction secondary to oxidative stress, and abnormal glucose metabolism and lipid metabolism often occur in parallel28. Moreover, increased 2-hydroxyvaleric acid, gamma-glutamyl-leucine and phenylalanyl-tryptophan were detected in suspected-PE+ samples. 2-Hydroxyvaleric acid, a known organic acid in human fluids, has shown potential as a predictor of type 1 diabetes29,30. Additionally, gamma-glutamyl-leucine and phenylalanyl-tryptophan are bioactive peptides associated with obesity and type 2 diabetes31,32,33. As an intermediate metabolite in vitamin B6 production, 2-methyl-3-hydroxy-5-formylpyridine-4-carboxylate plays a role in regulating cellular homocysteine levels in the transsulfuration pathway and in acetylcholine-induced endothelium-dependent relaxation, thereby helping to prevent insulin resistance and vascular dysfunction34. Lower levels of 2-methyl-3-hydroxy-5-formylpyridine-4-carboxylate in the suspected-PE+ group indicated an increase in insulin resistance.
This study has several limitations that warrant acknowledgment. First, limited clinical information, including the sFlt-1/PlGF ratio, was available for the patients recruited in these cohorts, which hindered the analysis of associations between the identified biomarkers and clinical information. Second, no absolute quantification was conducted for the biomarkers. The overlap in the relative peak areas of these biomarkers between suspected-PE− and suspected-PE+ underscores the necessity of absolute quantification for accurate diagnostics. Lastly, constructing predictive models with different ML algorithms and comparing their performance is recommended to identify the most effective predictive model.
In summary, retrospective and prospective cohorts were recruited for untargeted metabolomics to reveal the altered metabolomic patterns between HC and PE, as well as suspected-PE− and suspected-PE+. Distinct metabolic differences were observed between HC and PE, primarily attributed to disturbed amino acid-related metabolic pathways, providing new insights into the pathogenesis of PE at the metabolic level. Despite the similar metabolomic profiles found in suspected-PE− and suspected-PE+, our diagnostic models showed promise in enhancing current clinical early diagnostic assessments of PE.
Methods
Study design and clinical samples collection
Serum samples and clinical data in this study were obtained from Beijing Obstetrics and Gynecology Hospital between January 2018 and January 2019. The study protocol adhered to the tenets of the Declaration of Helsinki and was approved by the Ethics Committee of Beijing Obstetrics and Gynecology Hospital, Capital Medical University (approval number: 2021-KY-05901). Informed consent was obtained from all participants.
As depicted in Fig. 1, the study followed a prospective-specimen-collection, retrospective-blinded-evaluation (PRoBE) design35. Initially, a total of 336 women with singleton pregnancies and suspected PE between 20 and 36 gestational weeks (GWs) were recruited for the prospective cohort study. Indications for suspected PE included elevated blood pressure, proteinuria, or symptoms associated with PE, such as right upper quadrant abdominal pain or headache with visual disturbances. Women with glucose metabolism disorders (including abnormal fasting blood glucose and fasting insulin) or a confirmed diagnosis of PE or of hemolysis, elevated liver enzymes and low platelets (HELLP) syndrome were excluded. Detailed inclusion and exclusion criteria for the suspected PE study cohort have been previously described36. Serum samples were collected from all participants at their first visit after the onset of the suspected symptoms but before a confirmed diagnosis of PE. The samples were then randomly numbered and assigned to the discovery cohort (n = 267) or the validation cohort (n = 69) according to their numbers. Participants were followed up until delivery, and the occurrence of PE was recorded according to the diagnostic criteria for PE proposed by the 2019 ACOG Practice Bulletin37. The recruited patients who eventually developed PE are referred to as ‘suspected-PE+’ and those who did not develop PE as ‘suspected-PE−’. However, eight of them were excluded due to incomplete follow-up data. As a result, 47 and 10 participants were clinically diagnosed with PE in the discovery cohort and validation cohort, respectively. For the retrospective cohort study, 30 pregnant women with PE and 30 women with singleton pregnancies without suspected PE were enrolled, and their serum samples were collected. Notably, the participants enrolled in the discovery cohort were completely independent of those in the validation cohort. All samples were centrifuged at 3000 rpm for 10 min, and the serum was separated and stored at −80 °C until use.
Sample preparation
Following thawing at 4 °C, 100 µL of serum was mixed with 300 µL of ice-cold methanol containing the internal standard (IS) solution (2-chloro-D-phenylalanine and ketoprofen) for protein precipitation. After centrifugation at 13,000 rpm for 10 min at 4 °C, 5 µL of the supernatant from each sample was injected into the LC-MS system. Moreover, equal volumes of serum supernatant from each participant were mixed to prepare quality control (QC) samples, and one QC sample was inserted into the acquisition queue after every ten samples.
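The QC scheme described above, with one pooled QC injection after every ten study samples, can be sketched as follows. The function name, the sample identifiers and the `"QC"` label are hypothetical and serve only to illustrate how such an acquisition queue could be laid out.

```python
def build_queue(sample_ids, qc_every=10, qc_label="QC"):
    """Return the injection order with a pooled QC after every `qc_every` samples."""
    queue = []
    for position, sample_id in enumerate(sample_ids, start=1):
        queue.append(sample_id)
        if position % qc_every == 0:
            queue.append(qc_label)
    return queue


# Hypothetical identifiers for 25 study samples
samples = [f"S{i:03d}" for i in range(1, 26)]
queue = build_queue(samples)
print(queue)
```

Because the pooled QC has the same composition at every injection, the spread of its repeated measurements reflects instrument drift rather than biological variation, which is what the QC clustering check in the PCA relies on.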
Untargeted metabolomics profiling analysis for serum
Untargeted metabolomics analysis was performed on a UHPLC system (Vanquish, Thermo Fisher Scientific) coupled to a Q Exactive mass spectrometer (Thermo Fisher Scientific) equipped with an electrospray ionization (ESI) interface. To ensure high-quality data acquisition, positive and negative polarities were acquired in separate runs using default settings. Metabolites were annotated by automatically comparing retention time, ion features, and tandem mass spectrometry fragmentation patterns with the public database. According to the Metabolomics Standards Initiative (MSI)38, the annotations correspond to level 2 identification. Details of sample collection and pretreatment, as well as data acquisition and processing for untargeted metabolomics analysis, are described in the Supplementary Methods.
Statistical analysis
All data are presented as mean ± standard error of the mean (SEM), and statistical analyses were performed using SPSS version 22.0 (IBM Corp., Armonk, NY). Depending on the normality of distribution assessed by the Kolmogorov-Smirnov test, data were analyzed with one-way ANOVA or Student’s t-test where appropriate. After all variables were Pareto-scaled (Par), orthogonal partial least squares discriminant analysis (OPLS-DA) was performed with SIMCA-P software (Version 14.1, Umetrics, Umeå, Sweden) to identify discriminating variables based on their Variable Importance in Projection (VIP) values. Subsequently, logistic regression modeling with 10-fold cross‐validation was used to test potential biomarkers (p < 0.05, VIP > 1.5). Diagnostic performance was assessed with receiver operating characteristic (ROC) curve analysis. Sensitivity, specificity and the cut-off values determined by Youden’s index are reported. Furthermore, these metabolites were mapped to metabolic pathways using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (http://www.genome.jp/kegg/).
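Assuming the scaling step applied before OPLS-DA is Pareto scaling (the "Par" option in SIMCA), each variable is mean-centered and divided by the square root of its standard deviation. The sketch below illustrates this for a single variable; the input intensities are invented, and the use of the sample standard deviation is an assumption.

```python
import math
import statistics


def pareto_scale(values):
    """Mean-center a variable and divide by the square root of its standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (assumed convention)
    if sd == 0:
        return [0.0 for _ in values]  # a constant variable carries no information
    return [(v - mean) / math.sqrt(sd) for v in values]


# Invented peak intensities for one metabolite across three samples
scaled = pareto_scale([2.0, 4.0, 6.0])
print(scaled)
```

Compared with unit-variance scaling, which divides by the standard deviation itself, Pareto scaling reduces the dominance of high-intensity peaks while keeping more of the original data structure, which is a common reason for choosing it in LC-MS metabolomics.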
Data availability
The metabolomics quantification data reported in this study are available as supplementary information.
References
Timpka, S. et al. Hypertensive disorders of pregnancy and offspring cardiac structure and function in adolescence. J. Am. Heart Assoc. 11, (2016).
Zhang, J. et al. Prediction of adverse outcomes by common definitions of hypertension in pregnancy. Obstet. Gynecol. 2, 261–267 (2001).
North, R. A. et al. Clinical risk prediction for pre-eclampsia in nulliparous women: development of model in international prospective cohort. BMJ. d1875 (2011).
Eiland, E. et al. Preeclampsia 2012. J. Pregnancy. 586578 (2012).
Yang, C. et al. Long-term impacts of preeclampsia on the cardiovascular system of mother and offspring. Hypertension 9, 1821–1833 (2023).
Perez-Roque, L. et al. Pregnancy-induced high plasma levels of Soluble Endoglin in mice lead to preeclampsia symptoms and placental abnormalities. Int. J. Mol. Sci. 1 (2020).
Poon, L. C. & Nicolaides, K. H. Early prediction of preeclampsia. Obstet. Gynecol. Int., 297397 (2014).
Pedrosa, A. C. & Matias, A. Screening for pre-eclampsia: a systematic review of tests combining uterine artery doppler with other markers. J. Perinat. Med. 6, 619–635 (2011).
Maynard, S. E. et al. Excess placental soluble fms-like tyrosine kinase 1 (sFlt1) may contribute to endothelial dysfunction, hypertension, and proteinuria in preeclampsia. J. Clin. Invest. 5, 649–658 (2003).
Ohkuchi, A. et al. Economic evaluation of the sFlt-1/PlGF ratio for the short-term prediction of preeclampsia in a Japanese cohort of the PROGNOSIS Asia study. Hypertens. Res. 7, 822–829 (2021).
Liu, Y. et al. Diagnostic accuracy of the soluble fms-like tyrosine kinase-1/placental growth factor ratio for preeclampsia: a meta-analysis based on 20 studies. Arch. Gynecol. Obstet. 3, 507–518 (2015).
Grissa, D. et al. Feature selection methods for early predictive biomarker discovery using untargeted metabolomic data. Front. Mol. Biosci. 30 (2016).
Lee, A. M. et al. Using machine learning to identify metabolomic signatures of pediatric chronic kidney disease etiology. J. Am. Soc. Nephrol. 2, 375–386 (2022).
Kuwabara, H. et al. Salivary metabolomics with machine learning for colorectal cancer detection. Cancer Sci. 9, 3234–3243 (2022).
Hou, X. W. et al. Machine learning-based integration of metabolomics characterisation predicts progression of myopic retinopathy in children and adolescents. Metabolites. 2, (2023).
Jeong, D. E. et al. The association between maternal pre-pregnancy body mass index and pregnancy outcomes of preeclampsia. Taiwan. J. Obstet. Gynecol. 3, 441–446 (2022).
Yang, Y. et al. Preeclampsia prevalence, risk factors, and pregnancy outcomes in Sweden and China. JAMA Netw. Open. 5, e218401 (2021).
Yue, C. et al. Association of first trimester serum uric acid with preeclampsia: an observational cohort study with propensity score matching. Hypertens. Res. 2, 377–385 (2023).
Cao, T. H. et al. Plasma proteomic approach in patients with heart failure: insights into pathogenesis of disease progression and potential novel treatment targets. Eur. J. Heart Fail. 1, 70–80 (2020).
Saito, T. et al. Serum metabolome profiles characterized by patients with hepatocellular carcinoma associated with hepatitis B and C. World J. Gastroenterol. 27, 6224–6234 (2016).
Zhang, L. et al. Integrated metabolomic and lipidomic analysis in the placenta of preeclampsia. Front. Physiol. 807583 (2022).
Shen, X. et al. Serum metabolomics identifies dysregulated pathways and potential metabolic biomarkers for hyperuricemia and gout. Arthritis Rheumatol. 9, 1738–1748 (2021).
Lappas, M. et al. The prediction of type 2 diabetes in women with previous gestational diabetes mellitus using lipidomics. Diabetologia 7, 1436–1442 (2015).
Cai, D. et al. Predicting acute kidney injury risk in acute myocardial infarction patients: an artificial intelligence model using medical information mart for intensive care databases. Front. Cardiovasc. Med. 964894 (2022).
Austdal, M. et al. Metabolic profiles of placenta in preeclampsia using HR-MAS MRS metabolomics. Placenta 12, 1455–1462 (2015).
van der Veen, J. N. et al. The critical role of phosphatidylcholine and phosphatidylethanolamine metabolism in health and disease. Biochim. Biophys. Acta Biomembr. 9(Pt B), 1558–1572 (2017).
Rodriguez-Cuenca, S. et al. Sphingolipids and glycerophospholipids - the Ying and Yang of lipotoxicity in metabolic diseases. Prog. Lipid Res. 14–29 (2017).
Austdal, M. et al. First trimester urine and serum metabolomics for prediction of preeclampsia and gestational hypertension: a prospective screening study. Int. J. Mol. Sci. 9, 21520–21538 (2015).
Toyoda, A. et al. Metabolomic analyses of plasma and liver of mice fed with immature Citrus tumida peel. Biosci. Biotechnol. Biochem. 6, 1098–1104 (2020).
Trupp, M. et al. Metabolomics reveals amino acids contribute to variation in response to simvastatin treatment. PLoS One. 7, e38386 (2012).
Wu, Q. et al. Gamma-glutamyl-leucine levels are causally associated with elevated cardio-metabolic risks. Front. Nutr. 936220 (2022).
Metrustry, S. J. et al. Metabolomic signatures of low birthweight: pathways to insulin resistance and oxidative stress. PLoS One. 3, e0194316 (2018).
Luo, P. et al. A large-scale, multicenter serum metabolite biomarker identification study for the early detection of hepatocellular carcinoma. Hepatology. 2, 662–675 (2018).
Liu, Z. et al. Vitamin B6 prevents endothelial dysfunction, insulin resistance, and hepatic lipid accumulation in Apoe(-/-) mice fed with high-fat diet. J. Diabetes Res. 1748065 (2016).
Pepe, M. S. et al. Pivotal evaluation of the accuracy of a biomarker used for classification or prediction: standards for study design. J. Natl. Cancer Inst. 20, 1432–1438 (2008).
Zhao, S. et al. Serum peptidomic screening identified circulating peptide biomarkers predictive for preeclampsia. Front. Cardiovasc. Med. 946433 (2022).
Gestational Hypertension and Preeclampsia. ACOG practice bulletin summary, number 222. Obstet. Gynecol. 6, 1492–1495 (2020).
Sumner, L. W. et al. Proposed minimum reporting standards for chemical analysis: Chemical Analysis Working Group (CAWG) Metabolomics Standards Initiative (MSI). Metabolomics 3, 211–221 (2007).
Acknowledgements
We gratefully acknowledge the study participants for their involvement in and contribution to data collection.
Funding
This work was supported by the Beijing Obstetrics and Gynecology Hospital, Capital Medical University, Beijing Maternal and Child Health Care Hospital ‘Discipline Backbone’ Plan Special Funds (No. XKGG201802) and the Training Fund for Open Projects at Clinical Institutes and Departments of Capital Medical University (CCMU2022ZKYXZ006). The funding bodies took no part in the design of the study; the collection, analysis, and interpretation of the data; or the writing of the manuscript.
Author information
Contributions
Conceptualization, Y.C., L.M., Y.Z., Z.S., and Z.C.; analysis, S.Z., Y.Y.Z., R.R., J.D., H.W., J.H., Z.X., Y.L., L.L., L.C., J.W., Y.L., and Z.C.; visualization, Y.C. and L.M.; resources, Y.C., L.M., Y.Z., and Z.C.; supervision and coordination, Z.S. and Z.C.; writing-original draft, Y.C., L.M., Y.Z., Z.S. and Z.C. All authors provided discussion, participated in revising the manuscript, and agreed to the final version.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Cao, Y., Meng, L., Wang, Y. et al. Large-scale prospective serum metabolomic profiling reveals candidate predictive biomarkers for suspected preeclampsia patients. Sci Rep 15, 4807 (2025). https://doi.org/10.1038/s41598-025-87905-9
DOI: https://doi.org/10.1038/s41598-025-87905-9