Abstract
Oocyte and early embryo competence defects (OECD), characterized by impaired oocyte or early embryo development, are a cause of female infertility that has been recognized only recently through the application of assisted reproductive technology. To investigate the genetic landscape and subtypes of OECD, we performed whole-exome sequencing on 2,140 patients, classifying them into six distinct subtypes. We identified 183 pathogenic/likely pathogenic variants across 28 established genes. Notably, distinct genetic profiles and diagnostic rates emerged across subtypes, with a diagnostic rate of 53% in the Empty Follicle subtype. Additionally, we identified and validated two potentially causative genes, MLH3 and CENPH. Gene burden analysis, using 2,424 fertile controls, suggested nine previously unreported genes potentially associated with OECD and offered biological insights into the underlying pathogenic mechanisms. Collectively, these genetic findings accounted for 12.8–23.1% of OECD cases. This study delineates the genetic architecture of OECD, offering insights that may inform the development of diagnostic genetic screening and provide a reference for standardized subtyping of patients with OECD.
Data availability
The sequencing data of individuals included in this study have been deposited in the Genome Sequence Archive (GSA) in the National Genomics Data Center, China National Center for Bioinformation/Beijing Institute of Genomics, Chinese Academy of Sciences. Due to participant privacy and the Regulations on the Management of Human Genetics Resources of China, the raw sequencing data in the case group are available under restricted access in GSA-Human at https://ngdc.cncb.ac.cn/gsa (BioProject numbers PRJCA037616, PRJCA016901 and PRJCA038094). The raw data can be authorized for downloading by the Data Access Committee (DAC). Detailed guidance on data access requests can be found in the repository’s document (https://ngdc.cncb.ac.cn/gsa-human/document/). The DAC reviews requests on a monthly basis. Source data are provided with this paper.
Code availability
Our in-house gene-based analysis codes are available at https://github.com/ShuyanTang/OECD.
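For orientation only, the following is a minimal sketch of what a gene-based burden comparison can look like. It is not the authors' code, and the source does not state which statistical test was used; a one-sided Fisher's exact test is assumed here as a common choice. The cohort sizes (2,140 cases and 2,424 controls) come from the paper, whereas the function names and the carrier counts in the usage line are hypothetical.

```python
import math


def log_comb(n, r):
    """Natural log of the binomial coefficient C(n, r)."""
    return math.lgamma(n + 1) - math.lgamma(r + 1) - math.lgamma(n - r + 1)


def log_hypergeom_pmf(k, n_cases, n_controls, n_carriers):
    """log P(K = k): k of the n_carriers fall in the case group, margins fixed."""
    return (log_comb(n_cases, k)
            + log_comb(n_controls, n_carriers - k)
            - log_comb(n_cases + n_controls, n_carriers))


def burden_p_value(case_carriers, control_carriers, n_cases=2140, n_controls=2424):
    """One-sided Fisher's exact test (assumed test) for carrier enrichment in cases.

    Sums the hypergeometric probabilities of observing at least
    `case_carriers` carriers among cases, given the total carrier count.
    """
    n_carriers = case_carriers + control_carriers
    k_max = min(n_cases, n_carriers)
    p = sum(math.exp(log_hypergeom_pmf(k, n_cases, n_controls, n_carriers))
            for k in range(case_carriers, k_max + 1))
    return min(p, 1.0)


# Hypothetical gene: 10 carriers of qualifying variants in cases, 0 in controls.
p_example = burden_p_value(10, 0)
```

In a real analysis the resulting P values would also need correction for the number of genes tested; that step is omitted from this sketch.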
Acknowledgements
We are grateful to the patients who participated in this research. We thank all the staff at the Center for Reproductive Medicine, Shandong University, and the Reproductive and Genetic Hospital of CITIC-XIANGYA. This work was supported by the National Natural Science Foundation of China (82192874 (H. Zhao), 82421004 (H. Zhao), 32588201 (Z.-J.C.), 82371672 (W.Z.), 82402161 (H. Zhang) and 32288101 (F.Z.)); the National Key Research and Development Program of China (2024YFC3405600 (H. Zhao), 2023YFC2705504 (W.Z.), 2021YFC2700400 (H. Zhao), 2018YFC1004300 (S.Z.) and 2021YFC2700701 (S.Z.)); the specific research fund of The Innovation Platform for Academicians of Hainan Province (YSPTZX202310 (Z.-J.C.)); the Natural Science Foundation of Hunan Province (2024JJ2083 (W.Z.)); the Science and Technology Innovation Program of Hunan Province (2023RC3233 (W.Z.)); the Ningxia Hui Autonomous Region Key Research and Developmental Program (2024BEG02019 (H. Zhao)); the Shandong Provincial Key Research and Development Program (2024CXPT087 (H. Zhao) and 2020ZLYS02 (Z.-J.C.)); the Taishan Scholars Program of Shandong Province (ts20190988 (H. Zhao)); and Shanghai Jiao Tong University Trans-med Awards Research (STAR 20240202 (F.Z.)).
Author information
Contributions
Z.-J.C. is the senior author of the paper. Z.-J.C., H. Zhao, G.L., S.T., F.Z., W.Z., K.W. and F.G. contributed to study design and conceptualization. H. Zhao, G.L., K.W., W.Z., H.H., Y.W., J.Z. and C.L. provided cohort ascertainment, recruitment and phenotypic characterization of the patient cohort. C.Z., W.Z., H. Zhang and S.Z. performed WES production and validation. S.T., C.Z., X.X., F.M., Y.Y., Y.G. and S.Z. helped with variant calling and quality control of the data. S.T., C.Z. and X.X. performed bioinformatics analysis. S.T. and C.Z. performed statistical analysis. C.Z., H.H., X.W., J.S., W.S., Y.W., Y. Cui, Y. Cao, X.Z., C.Z. and Z.W. performed DNA extraction. C.Z., H.H., X.W., J.S., W.S., Y.W., Y. Cui, B.Y., C.Y. and X.Z. performed Sanger sequencing validation. S.T., C.Z. and L.L. evaluated variant pathogenicity. W.Z., H. Zhang, S.Z., H.H., X.W. and J.S. conducted functional experiments. H. Zhao, G.L., K.W., F.G., C.Z., W.Z., H. Zhang and Y.W. collected and summarized the clinical information of patients. C.Z., W.Z., H. Zhang, S.T., G.L. and H. Zhao wrote and reviewed the paper. Z.-J.C., H. Zhao, G.L., S.T., F.Z. and W.Z. administered the project. Z.-J.C., H. Zhao, G.L., S.T., F.Z., W.Z., K.W. and F.G. supervised the project. Z.-J.C., H. Zhao, F.Z., W.Z., S.Z. and H. Zhang acquired funding. All authors contributed to and reviewed the final version of the paper.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Medicine thanks Svetlana Yatsenko and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Anna Maria Ranzoni, in collaboration with the Nature Medicine team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Study overview.
Overview of the clinical characteristics of oocyte and early embryo competence defects (OECD), analytical workflow, and methodology. EF, Empty Follicle; MA, Oocyte Maturation Arrest; FF, Fertilization Failure; ZA, Zygote Arrest; EA, Early Embryonic Arrest; MIX, Mixed Phenotype; Ctrls, controls; SCMC, subcortical maternal complex.
Extended Data Fig. 2 Sample quality control (QC) overview.
Overview of the sample QC steps leading to the final cohort of 2,140 OECD individuals and 2,424 controls. The number of samples retained after applying each filter is shown. Samples were excluded based on the following criteria: low average coverage (<30×), low call rate (<0.9), discordance between reported and genetically predicted sex (genetically predicted male), second-degree or closer relatedness (one individual from each pair retained), or outlying values on the first two principal components of genetic variation.
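The exclusion criteria above can be sketched as a simple sequential filter. This is an illustrative sketch, not the authors' pipeline: the coverage and call-rate thresholds are taken from the caption, while the `Sample` fields, the order in which filters are applied, and the demo records are assumptions made for this example. Relatedness and principal-component outlier status are assumed to have been computed upstream and are passed in as flags.

```python
from dataclasses import dataclass

MIN_COVERAGE = 30.0   # average depth threshold from the caption (<30x excluded)
MIN_CALL_RATE = 0.9   # call-rate threshold from the caption (<0.9 excluded)


@dataclass
class Sample:
    # Hypothetical record layout; field names are assumptions.
    sample_id: str
    mean_coverage: float        # average sequencing depth
    call_rate: float            # fraction of genotypes called, 0..1
    predicted_sex: str          # genetically inferred: "female" or "male"
    related_to_retained: bool   # second-degree or closer to a retained sample
    pca_outlier: bool           # outlying on the first two principal components


def qc_failure(sample):
    """Return the first failed criterion, or None if the sample passes."""
    if sample.mean_coverage < MIN_COVERAGE:
        return "low_coverage"
    if sample.call_rate < MIN_CALL_RATE:
        return "low_call_rate"
    if sample.predicted_sex == "male":
        return "sex_discordance"
    if sample.related_to_retained:
        return "relatedness"
    if sample.pca_outlier:
        return "pca_outlier"
    return None


def apply_qc(samples):
    """Split samples into retained records and a dict of exclusion reasons."""
    retained, excluded = [], {}
    for s in samples:
        reason = qc_failure(s)
        if reason is None:
            retained.append(s)
        else:
            excluded[s.sample_id] = reason
    return retained, excluded


# Made-up demo records, one per exclusion criterion plus one passing sample.
demo = [
    Sample("S1", 45.0, 0.98, "female", False, False),
    Sample("S2", 22.5, 0.97, "female", False, False),
    Sample("S3", 60.0, 0.85, "female", False, False),
    Sample("S4", 50.0, 0.99, "male", False, False),
    Sample("S5", 38.0, 0.95, "female", True, False),
    Sample("S6", 41.0, 0.96, "female", False, True),
]
retained, excluded = apply_qc(demo)
```

Recording the first failed criterion per sample makes it straightforward to report how many samples remain after each filter, as the figure does.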
Extended Data Fig. 3 Phasing results of two heterozygous variants in the same patient via IGV visualization and TA cloning sequencing.
a, IGV visualization of mapped reads containing c.631T>C and c.630_631del detected in TUBB8 in case OE-2110. The two heterozygous variants were located on different reads and were confirmed to be in trans. b, IGV visualization of mapped reads containing c.576C>G and c.586C>T detected in ASTL in case OE-3134. The two P/LP variants were located on different reads and were confirmed to be in trans. c, IGV visualization of mapped reads containing c.289G>C and c.374T>C detected in PABPC1L in case OE-3036. The two P/LP variants were located on different reads and were confirmed to be in trans. d, IGV visualization of mapped reads containing c.857G>C and c.965G>A detected in CDC20 in case OE-0182. The two P/LP variants were located on different reads and were confirmed to be in trans. e, IGV visualization of mapped reads containing c.922T>G and c.922_930del detected in ZFP36L2 in case OE-2089. The two P/LP variants were located on different reads and were confirmed to be in trans. f, TA clone sequencing results of c.1506A>T and c.1693G>A in FBXO43 from case OE-3113. The two P/LP variants were located in different clones and were confirmed to be in trans.
Extended Data Fig. 4 Locations of P/LP variants and their affected protein domains in 28 established OECD genes.
The x-axis represents the number of amino acid residues. Dashed lines mark exon boundaries, and variants are mapped in relation to critical functional domains or motifs. Variant types are color-coded: red for stop-gained, orange for frameshift, blue for missense, green for splice site, and purple for in-frame or stop-lost variants. Numbers inside circles represent the count of variants; if the count is 1, the number is omitted.
Extended Data Fig. 5 Pathogenic mutations in CENPH and MLH3.
a, Locations of P/LP variants and their affected domains in the proteins encoded by CENPH and MLH3. b, Verification of the CENPH splicing variant (c.435+1G>A), indicating exon 6 skipping. cDNA derived from immature oocytes donated by a normal control (left) and the patient (right) was used for Sanger sequencing to confirm the aberrant splicing caused by c.435+1G>A in CENPH. c, Structural analysis of CENPH using PyMOL shows that the p.Leu44Pro mutation alters the hydrogen bond with p.Met40. Red dashed lines indicate hydrogen bonds. AlphaFold3 was used to construct models for the wild-type (WT) and mutant (p.Leu44Pro) CENPH proteins. d, Morphology of MII oocytes, zygotes and embryos obtained from two MLH3 carriers and a control donor. Representative time-lapse images for each patient were collected during the treatment process. Scale bar: 20 μm.
Extended Data Fig. 6 Experimental validation of the pathogenicity of CENPH and MLH3.
a, Schematic for Cenph or Mlh3 knockdown in mouse zygotes via siRNA microinjection. b, Knockdown efficiency of Cenph and Mlh3. Data show the mean ± SEM from three independent experiments. Statistical significance was determined using a two-sided Student’s t-test. The P values are 0.005 (**) and 0.035 (*) for siCenph and siMlh3, respectively. c, Representative images of the indicated developmental stages in Cenph knockdown or Mlh3 knockdown mouse embryos and siNC controls. Scale bar: 100 μm. d, Chromosome analysis using whole-genome sequencing data from embryos of patients OE-3099 and OE-2669 and a control.
Extended Data Fig. 7 Expression patterns of causative and associated OECD genes.
a, Heatmaps showing the expression dynamics of established and novel OECD genes at various stages from germinal vesicle (GV) to blastocyst, based on single-cell RNA-seq (left) and Ribo-seq (right). ICM, inner cell mass, which is part of the blastocyst. b, Expression changes of OECD-associated genes at different stages, as assessed by single-cell RNA-seq and Ribo-seq. The dots represent the mean values from two independent replicates, while the bars indicate the standard error of the mean (SEM). c, Knockdown efficiency of Rab27a and Cct4. Data show the mean ± SEM from three independent experiments. Statistical significance was determined using a two-sided Student’s t-test. The P values are 0.0003 (***) and 0.001 (**) for siRab27a and siCct4, respectively. d,e, Top 15 GO enrichment terms for molecular function among differentially expressed genes in 4-cell embryos after Rab27a (d) or Cct4 (e) knockdown.
Extended Data Fig. 8 Exploration of the digenic pathogenic model for SCMC genes.
a, Analysis of digenic pathogenicity patterns among subcortical maternal complex (SCMC) components in the control (left) and case (right) groups. b, Three-dimensional structural representations of the SCMC. The resolved structure of the SCMC (left) was obtained from the Protein Data Bank (PDB: 8X7V). The structure of the full-length SCMC core proteins (middle) was modeled by AlphaFold3. The two complex structures were aligned with an RMSD score of 2.305 (right). c, Mutation analysis of TLE6 (p.Arg555Cys) and NLRP5 (p.Ser293Leu) using the structure of the SCMC (PDB: 8X7V) indicated disrupted hydrogen bonds between amino acid residues in separate proteins. Yellow dashed lines indicate the hydrogen bonds. d, The model of the full-length SCMC core proteins with NLRP2 was constructed by AlphaFold3. The disrupted amino acids in NLRP2 (p.Ile880HisfsTer11) involved in its interaction interface with TLE6 are colored in red.
Extended Data Fig. 9 Circular dendrograms presenting biological insights from OECD case–control whole-exome sequencing data.
The outer circle presents the enriched pathways in Gene Ontology, REACTOME, KEGG and CORUM databases. The inner circle shows non-redundant representative terms after clustering of the outer terms. The size of each dot indicates the significance of enrichment, measured as -log10(P). Irrelevant clusters were removed following thorough biological curation.
Extended Data Fig. 10 Overview of mutations identified in established and novel OECD genes.
a, Distribution of pathogenic (P), likely pathogenic (LP), uncertain significance (VUS), likely benign (LB), and benign (B) variants across genes among all rare coding variants (allele frequency < 0.01) detected in OECD genes. b, Distribution of LoF, missense, and other variant types (including in-frame indels and splice-region variants) across genes. c, Distribution of LoF, missense, and other types of P/LP variants across genes.
Supplementary information
Supplementary Information
Supplementary Fig. 1.
Supplementary Tables 1–13
Supplementary Table 1. Clinical characteristics of 2,140 OECD cases.
Supplementary Table 2. List of 37 established genes.
Supplementary Table 3. Criteria used in this study for the classification of pathogenic variants according to ACMG standards and guidelines.
Supplementary Table 4. Information of P/LP variants in established genes for diagnosis from OECD cases.
Supplementary Table 5. Selection of 63 OECD-related mouse phenotypes and their corresponding genes in the MGI database.
Supplementary Table 6. Curation of 8,575 candidate genes.
Supplementary Table 7. Gene burden analysis results for genes with P < 0.001.
Supplementary Table 8. Functional annotations of the novel genes and their biological evidence.
Supplementary Table 9. Samples (cases and controls) with digenic variants in SCMC components.
Supplementary Table 10. GSEA results for 483 overrepresented genes.
Supplementary Table 11. Information of P/LP variants in novel genes from OECD cases.
Supplementary Table 12. Information of VUS variants in established and novel genes from OECD cases.
Supplementary Table 13. Primer sequence for the siRNA used in this study.
Source data
Source Data Fig. 3
Statistical source data.
Source Data Fig. 4
Statistical source data.
Source Data Extended Data Fig. 6
Statistical source data.
Source Data Extended Data Fig. 7
Statistical source data.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhang, C., Zheng, W., Zhang, H. et al. Genetic architecture and phenotypic diversity of oocyte and early embryo competence defects in female infertility. Nat Med (2025). https://doi.org/10.1038/s41591-025-04001-1
DOI: https://doi.org/10.1038/s41591-025-04001-1