Abstract
Placental insufficiency affects 5–10% of pregnancies worldwide, and folate deficiency has emerged as a key contributing factor. However, the molecular pathways linking folate metabolism to placental pathology remain poorly characterized. We developed a heterogeneous graph neural network framework that integrates genomic, transcriptomic, proteomic, and metabolomic data to investigate these mechanisms. The architecture explicitly models distinct node types and edge relationships within biological networks, addressing a limitation of conventional approaches that treat all molecular entities uniformly. The constructed network comprises 6,704 molecular entities connected through 16,608 validated interactions. The model achieved 94.7% classification accuracy and an AUROC of 0.978, substantially outperforming traditional machine learning methods and single-omics analyses. Analysis of the attention weights identified key molecular signatures, including MTHFR downregulation (2.8-fold), FOLR1 depletion (4.5-fold), and homocysteine accumulation (6.3-fold). We identified seven interconnected functional modules spanning folate metabolism, methylation regulation, oxidative stress, and angiogenesis pathways. Because the model was trained on placental tissues collected at delivery, it cannot be applied directly to antenatal risk prediction. Future studies that correlate prenatal biospecimens with the placental signatures identified here may enable the development of early screening tools. This framework provides a foundation for multiomics integration applicable to diverse pregnancy complications.
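As a minimal sketch of what modelling distinct node types and edge relationships with attention can look like, the toy example below aggregates the typed neighbours of one node using softmax attention. The nodes, edge types, feature vectors, and the `relation_scale` weights are hypothetical and chosen for illustration only; this is not the architecture or the data used in the study, where such weights would be learned.

```python
import math

# Toy heterogeneous graph (illustrative only, not the study's network).
node_type = {"MTHFR_gene": "gene", "MTHFR_protein": "protein", "MTR_protein": "protein"}
features = {
    "MTHFR_gene": [1.0, 0.0],
    "MTHFR_protein": [1.0, 0.0],
    "MTR_protein": [0.0, 1.0],
}
# Each edge is (source, edge_type, destination).
edges = [
    ("MTHFR_gene", "encodes", "MTHFR_protein"),
    ("MTR_protein", "interacts", "MTHFR_protein"),
]
# One scalar per edge type; hypothetical stand-in for learned relation parameters.
relation_scale = {"encodes": 1.0, "interacts": 1.0}


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(target, features, edges, relation_scale):
    """Return the attention-weighted neighbour average for `target` and the weights."""
    incoming = [(src, etype) for src, etype, dst in edges if dst == target]
    if not incoming:
        return list(features[target]), {}
    h_t = features[target]
    # Score = edge-type scale times the dot product of target and source features.
    scores = [
        relation_scale[etype] * sum(a * b for a, b in zip(h_t, features[src]))
        for src, etype in incoming
    ]
    alphas = softmax(scores)
    out = [
        sum(alpha * features[src][i] for alpha, (src, _) in zip(alphas, incoming))
        for i in range(len(h_t))
    ]
    weights = {src: alpha for alpha, (src, _) in zip(alphas, incoming)}
    return out, weights


out, weights = attend("MTHFR_protein", features, edges, relation_scale)
```

The returned `weights` sum to one across the incoming neighbours. Inspecting which neighbours receive the largest weights is the kind of attention analysis the abstract refers to.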
Data availability
The multiomics datasets, complete analysis results, and computational code generated during this study are provided in the Supplementary Materials, which contain: (1) processed multiomics data matrices for all 298 samples; (2) complete differential expression results with raw and adjusted p-values for all features (Supplementary Tables S1–S3); (3) full feature importance rankings, including attention weights, gradient attributions, and permutation importance scores; (4) network construction parameters and edge lists; (5) Python scripts for data preprocessing, graph neural network training, and evaluation using the actual experimental data. Patient identifiers have been removed to protect confidentiality. Additional raw sequencing data are available from the corresponding author (Mengting Yuan, ymt1125@outlook.com) upon reasonable request, subject to institutional review board approval and data sharing agreements.
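For readers who want to reproduce adjusted p-values from the raw ones in the supplementary tables, the sketch below shows one common adjustment, the Benjamini-Hochberg step-up procedure. It assumes that this is the adjustment applied, which this section does not state explicitly. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]  # made-up raw p-values
adj = benjamini_hochberg(raw)
```

Each adjusted value is at least as large as its raw p-value and never exceeds 1, so features can be filtered at a chosen false discovery rate by thresholding `adj`.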
Abbreviations
- AUROC: Area Under the Receiver Operating Characteristic Curve
- ChIP-seq: Chromatin Immunoprecipitation Sequencing
- CV: Coefficient of Variation
- DHFR: Dihydrofolate Reductase
- DNA: Deoxyribonucleic Acid
- ELU: Exponential Linear Unit
- FOLR1/FOLR2: Folate Receptor 1/2
- GNN: Graph Neural Network
- HAT: Heterogeneous Attention
- HIF1A: Hypoxia-Inducible Factor 1 Alpha
- HMDB: Human Metabolome Database
- IQR: Interquartile Range
- KEGG: Kyoto Encyclopedia of Genes and Genomes
- LASSO: Least Absolute Shrinkage and Selection Operator
- MTHFR: Methylenetetrahydrofolate Reductase
- MTR: Methionine Synthase
- PCA: Principal Component Analysis
- PCFT/SLC46A1: Proton-Coupled Folate Transporter
- QC: Quality Control
- ReLU: Rectified Linear Unit
- RFC/SLC19A1: Reduced Folate Carrier
- RNA: Ribonucleic Acid
- SHMT: Serine Hydroxymethyltransferase
- SNP: Single Nucleotide Polymorphism
- STAT3: Signal Transducer and Activator of Transcription 3
- TF: Transcription Factor
- VEGF: Vascular Endothelial Growth Factor
Acknowledgements
The authors thank the clinical staff at Longyan First Affiliated Hospital of Fujian Medical University for assistance with sample collection and the participants for their invaluable contribution to this research.
Funding
This work was supported by the Fujian Province Natural Science Foundation (Grant No. 2024J011622).
Author information
Contributions
X.X. and Z.L. contributed equally to this work. X.X. designed the study, coordinated sample collection, performed clinical data analysis, and drafted the initial manuscript. Z.L. conducted multiomics data preprocessing and quality control procedures, and contributed to experimental validation. Q.X. developed the graph neural network architecture, implemented computational algorithms, and performed bioinformatics analyses. H.X. executed laboratory experiments, managed sample processing, and contributed to data interpretation. M.Y. conceived and supervised the project, provided critical intellectual input, coordinated all research activities, and finalized the manuscript. All authors reviewed, edited, and approved the final manuscript for publication.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethics approval and consent to participate
This study was approved by the Research Ethics Committee of Longyan First Affiliated Hospital of Fujian Medical University (Reference Number: IRB-2024-LFAH-067). All participants provided written informed consent prior to enrollment. The study was conducted in accordance with the Declaration of Helsinki and relevant national regulations governing human subject research.
Consent for publication
All authors have reviewed the manuscript and consent to its publication. No identifiable information regarding participants has been included.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Xie, X., Li, Z., Xiao, Q. et al. Heterogeneous graph neural networks reveal molecular mechanisms of folate deficiency in placental insufficiency through multiomics integration. Sci Rep (2026). https://doi.org/10.1038/s41598-026-38288-y
DOI: https://doi.org/10.1038/s41598-026-38288-y