Introduction

In mammals, somatic cells of the ovary are derived from the genital ridge, which forms as a thickening of the coelomic epithelium on the ventromedial surface of the mesonephros. After migrating to the genital ridge, primordial germ cells proliferate actively with incomplete cytokinesis to form germ cell nests1. Subsequently, activation of the Rspo1/Wnt4/β-catenin signaling pathway promotes Foxl2 expression and granulosa cell differentiation2. Foxl2 is highly expressed first in the ovarian medulla and then in the ovarian cortex3. Moreover, female germ cells undergo the mitosis-to-meiosis transition directly following granulosa cell differentiation and then arrest at the diplotene stage of prophase I (ref. 4). Around the period of follicle assembly, the germ cell nests in the medullary region break down to form the first wave of follicles, which are activated shortly after birth, and then the nests in the cortical region break down to form the primordial follicles that undergo the second wave of folliculogenesis after recruitment and activation during adulthood5.

With the completion of follicle assembly, a well-organized ovary with complex cell types forms. Subsequently, the ovary undergoes dramatic restructuring during follicle growth and ovulation. After ovulation, the basement membrane of follicles breaks down, the microvascular cells invade the granulosa layers, and the luteinizing granulosa cells and theca cells encroach on the follicular cavity to form the corpus luteum6. The corpus luteum is a temporary endocrine structure within the ovary, comprising small and large luteal cells7. These cells secrete progesterone, which is critical for maintaining pregnancy8. However, most ovarian cell types and their spatial distribution remain incompletely characterized, which calls for single-cell approaches to decode the dynamic changes of cell types at the molecular and spatial levels.

Recent advances in single-cell RNA sequencing (scRNA-seq) have enabled comprehensive characterization of molecular heterogeneity at single-cell resolution. Work from several groups has explored ovary development and pathology in different species3,5,9,10,11,12,13,14,15,16. Although scRNA-seq approaches can profile transcriptomes at the single-cell level, they do not preserve the spatial information of cells. Emerging spatial transcriptomic technologies provide new opportunities to profile spatial regions in tissue sections and define the organization of cellular niches17,18,19. Several studies have already applied these technologies to investigate ovary development and aging20,21,22,23,24. However, none has profiled the complete spatiotemporal development of the mouse ovary.

Here, using scRNA-seq and spatial transcriptional profiling, we investigate the cell types of the developing mouse ovary and their spatial distribution. In the present study, we dissected region-specific granulosa cells at different stages of ovary development and the differences in molecular features between atretic and healthy follicles. Additionally, we revealed the molecular diversity and spatial distribution of Bhmt-positive luteal cells (BLC) and Onecut2-positive luteal cells (OLC) in the adult ovary, which may regulate the secretion of progesterone in the corpus luteum. We anticipate that our results will enhance the current understanding of different cell types in the corpus luteum and provide directions for potential treatment of progesterone-related subfertility.

Results

Single-cell and spatiotemporal atlas of the developing mouse ovary

To understand the dynamic morphogenesis of the mouse ovary, we collected single-cell RNA-seq data from the early undifferentiated gonad (embryonic day 11.5, E11.5) to the mature adult ovary (postnatal day 90, PD90). Previously published datasets accounted for 75% of the cells3,5,10, while the single-cell data for PD0, PD21 and PD90 were newly sequenced in this study (Fig. 1a). After stringent quality control and graph-based clustering, 50,655 high-quality single cells were retained for further analysis and 11 transcriptionally distinct cell clusters were identified (Fig. 1b). Based on classic marker genes, we annotated them as female germ cells (FGCs; Ddx4), bipotential pre-granulosa cells (BPG; Foxl2), epithelial pre-granulosa cells and epithelial cells (EPG&Epi; Lgr5, Amhr2), early theca (Dlk1; ref. 25), theca (Hsd3b1), proliferative mesenchymal cells (pMesenchyma; Pclaf), mesenchymal cells (Mesenchyma; Col1a1), endothelial cells (Endo; Pecam1), pericytes (Rgs5), immune cells (Tyrobp) and erythrocytes (Alas2) (Fig. 1c and Supplementary Fig. 1a). Additionally, gene ontology (GO) analysis of the top 200 differentially expressed genes (DEGs) across cell types revealed features corresponding to known biological functions and characteristics of each cell cluster (Supplementary Fig. 1b). For instance, the GO term “adrenergic receptor binding” was specific to the early theca cell cluster, whereas “steroid dehydrogenase activity” was specific to the theca cell cluster (Supplementary Fig. 1b). We further quantified the relative proportions of these cell types at different ovary developmental stages and examined the distribution of these cell clusters in each donor sample (Fig. 1d, e). These cell types varied significantly in global relatedness according to the expression patterns of known marker genes and changed in a development-dependent manner.
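
As an illustration of how the relative proportions of cell types per developmental stage can be tabulated, the following minimal sketch counts annotated cells per stage and normalizes within each stage. The function name `stage_proportions` and the toy annotations are hypothetical and are not the study's code or data.

```python
from collections import Counter, defaultdict


def stage_proportions(cells):
    """Fraction of each cell type within each developmental stage.

    cells: iterable of (stage, cell_type) pairs, one pair per cell.
    Returns {stage: {cell_type: fraction of that stage's cells}}.
    """
    counts = defaultdict(Counter)
    for stage, cell_type in cells:
        counts[stage][cell_type] += 1
    return {
        stage: {ct: n / sum(c.values()) for ct, n in c.items()}
        for stage, c in counts.items()
    }


# Toy annotations (hypothetical; chosen only to make the arithmetic easy to follow).
toy_cells = [
    ("E11.5", "FGC"), ("E11.5", "Mesenchyma"),
    ("E11.5", "Mesenchyma"), ("E11.5", "BPG"),
    ("PD90", "Theca"), ("PD90", "Immune"), ("PD90", "Theca"),
    ("PD90", "Endo"), ("PD90", "Theca"),
]

props = stage_proportions(toy_cells)
for stage, fractions in props.items():
    print(stage, {ct: round(f, 2) for ct, f in fractions.items()})
```

Normalizing within each stage, rather than across the whole dataset, keeps stages with very different cell numbers comparable.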

Fig. 1: Spatially resolved single-cell transcriptomic landscape of the developing mouse ovary.

a Schematic diagram of the experimental workflow and analysis for single-cell RNA-seq and spatial transcriptomic data. b UMAP projection of 50,655 cells (divided into 11 clusters) from 12 pooled samples of ovaries, colored by cell type. BPG, bipotential pregranulosa cells; Endo, endothelial cells; EPG&Epi, epithelial pregranulosa cells and epithelial cells; FGC, female germ cells; pMesenchyma, proliferative mesenchyma. c Heatmap of cell type-specific marker genes. d The percentage of ovarian cells classified by developmental stage. Source data are provided in Supplementary Data 1. e UMAP projection of single-cell transcriptomes colored by developmental stage from E11.5 to PD90. E11.5, embryonic day 11.5; PD90, postnatal day 90. f Spatial spots clustered by gene expression and labeled by tissue anatomical compartment for eight developmental stages (different colors represent distinct spatial spot clusters).

Next, to precisely decode spatiotemporal ovary developmental events, we performed spatial transcriptomic analysis on mice at eight developmental stages, from the early fetal gonad (E13.5) to the mature adult ovary (PD60). After mapping and filtering, we obtained 1,120 individual spots with a median of 5,483 genes per spot. First, we normalized the spatial transcriptomic data with stSME, a normalization method implemented in stLearn (ref. 26) that is designed for spatial transcriptomics and uses tissue Spatial location, Morphology, and gene Expression. We then performed clustering on the normalized data and identified different clusters on the ovary sections (Fig. 1f and Supplementary Fig. 1c, d). Collectively, using spatial transcriptomic profiling combined with scRNA-seq, we described a comprehensive developmental atlas of the mouse ovary.
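
The per-spot summary reported after filtering (number of retained spots and median number of detected genes per spot) can be sketched as follows. This is a minimal illustration under stated assumptions: the `min_genes` threshold, the helper names, and the toy counts are hypothetical, and the study's actual filtering criteria are not specified in this section.

```python
from statistics import median


def genes_per_spot(counts):
    """Number of detected genes (count > 0) for each spot.

    counts: {spot_id: {gene: UMI count}}.
    """
    return {
        spot: sum(1 for c in genes.values() if c > 0)
        for spot, genes in counts.items()
    }


def filter_spots(counts, min_genes):
    """Keep spots with at least `min_genes` detected genes (hypothetical criterion)."""
    detected = genes_per_spot(counts)
    return {spot: genes for spot, genes in counts.items() if detected[spot] >= min_genes}


# Toy count matrix (hypothetical values, four genes only).
toy_counts = {
    "spot_1": {"Ddx4": 5, "Foxl2": 0, "Lgr5": 2, "Col1a1": 9},
    "spot_2": {"Ddx4": 0, "Foxl2": 3, "Lgr5": 0, "Col1a1": 1},
    "spot_3": {"Ddx4": 0, "Foxl2": 0, "Lgr5": 0, "Col1a1": 1},
    "spot_4": {"Ddx4": 1, "Foxl2": 4, "Lgr5": 2, "Col1a1": 7},
}

kept = filter_spots(toy_counts, min_genes=2)
median_genes = median(genes_per_spot(kept).values())
print(len(kept), "spots kept; median genes per spot:", median_genes)
```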

Characteristics of the female germ cell fate transition and primordial follicle formation

To dissect the molecular features and mechanisms of female germ cell fate transition, we integrated the scRNA-seq data of FGCs from the E11.5 to PD5 developmental stages and divided them into seven clusters (Fig. 2a). Next, we characterized the stage-specific marker genes for FGC subtypes; the mitotic FGCs (oogonia and pre-meiotic oogonia) highly expressed pluripotency genes, such as Nanog and Pou5f1, while meiotic FGCs highly expressed meiotic marker genes, such as Spo11 and Sycp3 (Fig. 2b). Notably, S100a8high oocytes were characterized by high expression levels of S100a8 and S100a9 and relatively low expression of Gdf9 and Zp3, whereas Zp3high oocytes exhibited higher expression of Zp3, Gdf9, and Nobox. These gene expression patterns suggest that S100a9 and S100a8 may play roles in maintaining oocyte dormancy, while Nobox, Gdf9, and Zp3 are likely involved in promoting early oocyte growth (Fig. 2b). Moreover, the developmental stages of S100a8high and Zp3high oocytes, along with their associated marker genes, are consistent with the developmental timepoints (Supplementary Fig. 2a, b).

Fig. 2: Molecular diversity and developmental trajectory of female germ cells.

a Visualization of FGC subclusters using UMAP. RA, retinoic acid. b Heatmap showing the marker genes of each female germ cell subcluster. Source data are provided in Supplementary Data 1. c Pseudotime trajectory of female germ cells analyzed by Monocle. d Expression of representative genes along pseudotime trajectory. e Heatmap representing the expression dynamics of 2 gene sets with increased or reduced expression at the cell state 1 (follicular oocyte) and cell state 2 (meiotic oocyte). f Visualization of E18.5 female germ cell subclusters and marker genes using UMAP. Stage 1: Stage 1 oocyte; Stage 2: Stage 2 oocyte. g Volcano plot showing the significantly differentially expressed transcripts in the Stage 2 compared with the Stage 1. h Immunofluorescence co-staining for the XDH and DDX4 in ovaries at E18.5. The scale bars represent 100 μm.

To further investigate the process of female germ cell fate transition, we constructed the developmental trajectory of FGCs using Monocle. Notably, the developmental trajectory exhibits two branch points, marking the divergence of cells from a common progenitor state into distinct cellular states (Fig. 2c). Although all FGCs express a series of mitotic and meiotic genes3, such as Dppa5a, Stra8 and Spo11 (Supplementary Fig. 2b), the survival marker genes, such as Sycp2l (ref. 27) and Uchl1 (ref. 28), and the oogenesis marker genes, such as Sohlh1 and Nobox, are highly expressed in the S100a8high and Zp3high oocytes (Fig. 2d). These results indicate that the FGCs in state 1 may form primordial follicles, while the majority of cells in state 2 may remain early meiotic oocytes (Fig. 2c). To further explore the cell fate transition at branch 2, we identified two gene sets with distinct expression patterns that are likely involved in determining female germ cell states. Set 1 includes genes, such as Uchl1 and Xdh, that are highly expressed in follicular oocytes, while set 2 includes genes, such as Inca1, Stag3, and Pbx3, that are highly expressed in early meiotic oocytes (Fig. 2d, e). To determine when the oocyte fate transition occurs, we plotted the developmental trajectory faceted by developmental stage, which showed that oocytes from E18.5 proceed along the trajectory of state 1, ultimately forming follicular oocytes (Supplementary Fig. 2c).

To further explore the differences in molecular features among E18.5 oocytes, we extracted the germ cell populations and re-clustered them into stage 1 and stage 2 groups (Fig. 2f). Next, differential expression analysis between stage 2 and stage 1 showed that the stage 2 oocytes highly expressed Figla, Sohlh1 and Xdh, while stage 1 oocytes highly expressed Hmgb2, Ccnb3 and Hist1h4i (Fig. 2g). Furthermore, immunofluorescent staining verified the expression of Xdh in the E18.5 ovary (Fig. 2h). The germ cells that do not express Xdh may remain in the early meiotic state, while those expressing Xdh are likely to further develop into follicular oocytes. Additionally, we performed single-cell regulatory network inference and clustering (SCENIC) analysis and found that dozens of transcription factors displayed a dynamic pattern (Supplementary Fig. 2d). For example, Sall4, Pbx3, Bclaf1 and Nfia seemed to be mostly activated in the oogonia (Oa), early meiotic oocytes (EMOe), late meiotic oocytes (LMOe) and DOe, respectively, and their motifs are listed (Supplementary Fig. 2e). Taken together, we depicted the developmental process of female germ cells, which may be associated with follicle growth or dormancy in the ovary.

Meiosis initiation and ovary morphogenesis

Recently, several studies have shown that an earlier wave of meiosis initiation (first wave) proceeds radially from the medial to the surface region of the ovary3,29. Subsequently, the second wave of meiosis initiation proceeds asynchronously from the anterior to the posterior region of the ovary30. To dissect the process of meiosis initiation, we extracted the scRNA-seq data from the E13.5 ovary and divided the germ cells into mitotic FGCs and meiotic FGCs based on the marker genes Pou5f1 and Sycp3 (ref. 10), respectively (Fig. 3a). Next, we compared the DEGs between mitotic FGCs and meiotic FGCs: Ubb, Hmgb1 and Slc25a31 were highly expressed in the meiotic FGCs, while Rrbp1, Arid1a and Malat1 were highly expressed in the mitotic FGCs (Fig. 3b and Supplementary Fig. 3a). Moreover, we visualized the expression of these genes on the spatial regions of the E13.5 ovary, and the spots that weakly expressed Arid1a and Malat1 highly expressed Slc25a31 (Supplementary Fig. 3b). Furthermore, GO analysis of these DEGs showed that Arid1a and Hmgb1 were enriched in “chromatin remodeling” and “response to steroid hormone” (Fig. 3c).
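
To make the logic behind a volcano-plot comparison concrete, the sketch below computes a log2 fold change between two groups and labels each gene as up, down, or not significant. It is a simplified illustration only: the pseudocount, the cutoffs, the placeholder gene names, and the toy values are all assumptions, and the adjusted P values are taken as given rather than computed by the statistical test used in the study.

```python
import math


def log2_fold_change(mean_a, mean_b, pseudocount=1.0):
    """log2 ratio of group A to group B mean expression (pseudocount is an assumption)."""
    return math.log2((mean_a + pseudocount) / (mean_b + pseudocount))


def classify(log2fc, adj_p, fc_cutoff=1.0, p_cutoff=0.05):
    """Label a gene for a volcano plot; both cutoffs are hypothetical defaults."""
    if adj_p < p_cutoff and log2fc >= fc_cutoff:
        return "up"
    if adj_p < p_cutoff and log2fc <= -fc_cutoff:
        return "down"
    return "ns"


# Toy input: (mean in meiotic FGCs, mean in mitotic FGCs, adjusted P value).
# Gene names are placeholders, not genes from the study.
toy_genes = {
    "geneA": (7.0, 1.0, 0.001),
    "geneB": (0.0, 3.0, 0.01),
    "geneC": (3.0, 3.0, 0.001),
    "geneD": (7.0, 1.0, 0.2),
}

labels = {
    gene: classify(log2_fold_change(a, b), p)
    for gene, (a, b, p) in toy_genes.items()
}
print(labels)
```

Requiring both an effect-size cutoff and a significance cutoff is what gives a volcano plot its two "arms"; geneD in the toy data has a large fold change but fails the significance cutoff and is therefore labeled "ns".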

Fig. 3: Spatial transcriptomics identify spatially restricted genes during the transition from mitosis to meiosis and follicle formation.

a UMAP clustering analysis of ovarian germ cells at E13.5 (left) and marker genes (right). b Volcano plot showing the significantly differentially expressed transcripts in the meiotic FGCs compared with the mitotic FGCs. c GO terms of differentially expressed transcripts between meiotic FGCs and mitotic FGCs at E13.5. d Visualization of mitotic FGCs and meiotic FGCs on E13.5 mouse ovary based on integration of scRNA-seq cell type annotations with spatial transcriptomic data (left); Spatially resolved gene expression for region restricted genes (right). A: Anterior region; P: Posterior region. e Immunostaining for the POU5F1 and SYCP3 in ovaries at E13.5. The scale bars represent 100 μm. A: Anterior region; P: Posterior region. f Spatially resolved gene expression for region restricted genes at E13.5 ovary. g Spatially resolved genes enriched in cortical and medullary regions (left). GO terms of differentially expressed genes at MSR and CSR (right). SR spatial region, MSR medullary spatial region, CSR cortical spatial region, SR&SC spatial region and single cell transcriptomes.

Next, we integrated the single-cell and spatial transcriptomic data of the E13.5 ovary and mapped the mitotic FGCs and meiotic FGCs to the spatial section using Seurat, which revealed that the meiotic FGCs were mainly located in the anterior region, while the mitotic FGCs were mainly located in the posterior region of the ovary (Fig. 3d, e). Subsequently, we constructed a spatial trajectory (white arrow in Supplementary Fig. 3c) from the anterior to the posterior region of the ovary and examined the expression patterns of pluripotency and meiotic genes using SPATA (ref. 31). The expression of several meiotic genes, such as Stra8 and Sycp3, exhibits two peaks along the spatial trajectory, and the first peak is lower than the second, suggesting that meiosis in the anterior region of the ovary had already occurred and the relevant meiotic genes had been downregulated (Supplementary Fig. 3c). We also examined the DEGs across different regions and found that the expression patterns of Slc52a2 and Zbtb26 were similar to that of Sycp3, suggesting roles for these genes in meiosis progression (Fig. 3f). Additionally, Suox, Alg8, Lipe and Papolg were expressed in the middle and posterior regions of the E13.5 ovary, respectively (Fig. 3f).
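
The idea of reading gene expression along a spatial trajectory can be sketched by projecting each spot onto a straight anterior-to-posterior axis and averaging expression within bins along that axis. This is a simplified stand-in for illustration and is not the algorithm implemented in SPATA; the coordinates, expression values, and function names are hypothetical.

```python
def project_onto_axis(point, start, end):
    """Position of `point` along the start->end segment (0 at start, 1 at end)."""
    ax, ay = end[0] - start[0], end[1] - start[1]
    px, py = point[0] - start[0], point[1] - start[1]
    return (px * ax + py * ay) / (ax * ax + ay * ay)


def expression_along_axis(spots, start, end, n_bins):
    """Mean expression per bin along the axis; None marks bins without spots.

    spots: list of ((x, y), expression value).
    """
    sums = [0.0] * n_bins
    sizes = [0] * n_bins
    for xy, value in spots:
        t = project_onto_axis(xy, start, end)
        if not 0.0 <= t <= 1.0:
            continue  # spot lies beyond either end of the trajectory
        b = min(int(t * n_bins), n_bins - 1)  # t == 1.0 falls into the last bin
        sums[b] += value
        sizes[b] += 1
    return [s / n if n else None for s, n in zip(sums, sizes)]


# Toy spots (hypothetical coordinates and expression of one gene).
toy_spots = [
    ((1.0, 0.5), 2.0), ((1.5, -0.5), 4.0),   # anterior end
    ((3.0, 0.0), 1.0),
    ((5.0, 1.0), 0.0),
    ((9.0, 0.0), 6.0), ((10.0, 0.0), 8.0),   # posterior end
    ((12.0, 0.0), 9.0),                      # outside the trajectory, ignored
]

profile = expression_along_axis(toy_spots, start=(0.0, 0.0), end=(10.0, 0.0), n_bins=5)
print(profile)
```

A binned profile of this kind is where features such as two expression peaks of unequal height along the anterior-posterior axis would become visible.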

To further explore the relationship between meiosis and folliculogenesis, we identified the BPG and EPG clusters at the E13.5, E18.5, PD0, and PD3 developmental stages using known marker genes (Supplementary Fig. 3d), such as Hmgcs2, Foxl2, Gng13, and Lgr5 (ref. 3). We also examined the spatial expression patterns of these marker genes; Hmgcs2 and Foxl2 are highly expressed in the ovary medulla, whereas Gng13 and Lgr5 are highly expressed in the ovary cortex (Supplementary Fig. 3e). Furthermore, we examined the DEGs between medullary and cortical regions at each stage (Fig. 3g) and found that Sh3bgr was mainly expressed in the medullary region, while Dpf1 was mainly expressed in the cortical region at E13.5. Moreover, GO analysis showed that the DEGs in the medullary region were mainly enriched in the meiotic cell cycle (Fig. 3g). Next, we mapped the BPG cluster at each stage to the relevant spatial sections and found that the BPG cluster was mainly located in the ovary medulla at the E18.5 stage and subsequently in the ovary cortex at the PD0 stage (Fig. 3g), suggesting that cortical granulosa cells begin to express Foxl2 and further form primordial follicles3. Altogether, these data provide transcriptional validation that medullary germ cells undergo the first wave of meiosis initiation and that Foxl2-positive pre-granulosa cells form follicles that are simultaneously activated shortly after birth (first wave of folliculogenesis). In contrast, cortical germ cells and Lgr5-positive pre-granulosa cells that subsequently express Foxl2 may form dormant primordial follicles that are gradually activated during adulthood (second wave of folliculogenesis)3.

Spatial diversification of follicles in the postnatal mouse ovary

We next turned our attention to postnatal follicle development, a process that encompasses either follicular maturation or atresia. Previous studies have shown that healthy follicles highly express inhibin family genes32, such as Inha and Inhbb, while atretic follicles exhibit decreased levels of inhibin and instead show high expression of Ctgf, a marker gene specifically associated with atretic follicles22. To accurately distinguish healthy follicles from atretic follicles, we examined the spatial distribution of Inha, Inhbb, and Ctgf with BayesSpace at the PD7, PD14, 1 M, and 2 M stages (Fig. 4a and Supplementary Fig. 4a). We found that Inha was widely expressed in the ovary and exhibited a higher expression level in larger follicles (Fig. 4a, b). Notably, the expression of Inhbb and Ctgf did not overlap in the same spatial regions, so these genes could be used as markers for healthy follicles and atretic follicles, respectively (Fig. 4a and Supplementary Fig. 4a). Additionally, our findings indicate that Grb14, a marker gene associated with subordinate follicles33, is significantly upregulated in atretic follicles, while its expression levels are markedly lower in healthy follicles (Fig. 4c). To further investigate the spatial characteristics of granulosa cells in healthy versus atretic follicles, we clustered granulosa cells from the PD90 ovary into four subtypes (Fig. 4d). These subtypes were subsequently annotated as granulosa cells with high Kctd14 expression (Kctd14high GC), high Top2a expression (Top2ahigh GC), high Inhbb expression (Inhbbhigh GC) and high Grb14 expression (Grb14high GC) based on their marker genes, top DEGs, and the GO results of these DEGs (Fig. 4d and Supplementary Fig. 4b–d).

Next, we mapped the subclusters of granulosa cells to the 1 M and 2 M ovary sections and found that the Inhbbhigh GC were mainly located in the healthy follicles, whereas the Grb14high GC were specifically enriched in the peripheral regions of mature follicles and in atretic follicles (Fig. 4e). Interestingly, a subset of Top2ahigh GC exhibited overlap with Inhbbhigh GC and Kctd14high GC, suggesting that Inhbbhigh GC and Kctd14high GC are highly proliferative (Fig. 4e and Supplementary Fig. 4e). This observation is consistent with the characteristics of healthy follicles, which are known to contain granulosa cells with high proliferative activity34. Although both mature follicles and atretic follicles express Grb14, mature follicles also contain a significant population of proliferative granulosa cells and exhibit high levels of Inhbb expression. In contrast, granulosa cells in atretic follicles are predominantly non-proliferative and display low levels of Inhbb expression.

Fig. 4: Spatial transcriptomics identify spatially restricted genes during follicle development.

a Spatially resolved gene expression for follicle development. H&E, hematoxylin and eosin. b Immunostaining for INHA and DDX4 in mouse ovary. INHA-P stands for INHA-Positive, indicating the presence of the INHA primary antibody; INHA-N stands for INHA-Negative, indicating the absence of the INHA primary antibody. The scale bars represent 100 μm. c The expression patterns of Grb14 in 1 M and 2 M ovarian sections. d Visualization of major clusters of PD90 ovarian cells using UMAP. BEndo, Blood vessel endothelial cells; Epi, epithelial cells; LEndo, Lymphatic endothelial cells; pMesenchyma, proliferative mesenchyma; (left). UMAP showing expression patterns of selected markers for subclusters of granulosa cells. Purple (or gray) represents a high (or low) expression level as shown on the color key at the right bottom (right). e Visualization of granulosa cell subclusters on 1-month and 2-month mouse ovaries based on integration of scRNA-seq cell type annotations with spatial transcriptomic data. f Identification of follicular states across different spatial regions. g Spatially resolved differentially expressed genes for atretic follicles and healthy follicles (left). GO terms of differentially expressed genes at atretic follicles and healthy follicles (right). h The expression patterns of Inha and Inhbb (left) and dynamic expression patterns of differentially expressed genes along 1 M and 2 M trajectories (right). PriF Primary follicles, SecF Secondary follicles.

To further dissect the differences in molecular features between healthy follicles (characterized by high expression of Inha and Inhbb) and atretic follicles (characterized by high expression of Ctgf and Grb14), we focused on two antral follicles of similar size in the 1 M ovary (Fig. 4f). Next, we examined the DEGs between the regions of healthy and atretic follicles and found that Cpa2, Pfkfb1, Naalad2, Hook2, Nr0b1 and Zfp11 were highly expressed in the atretic follicles, whereas Fshr, Fam222a, Hsd17b1, Ska3, Tmem52 and Jakmip3 were highly expressed in the healthy follicles (Fig. 4g and Supplementary Fig. 4f). Subsequently, GO analysis revealed that the highly expressed genes in healthy follicles were mainly enriched in “ovarian follicle development”, “regulation of endocrine process” and “response to follicle stimulating hormone”, while the highly expressed genes in atretic follicles were mainly enriched in “response to nitric oxide” and “regulation of protein kinase signaling pathway” (Fig. 4g). Furthermore, we constructed a spatial trajectory through the atretic and healthy follicles in the 1 M and 2 M ovary sections and examined the expression patterns of relevant genes (Fig. 4h). We found that the expression level of proliferative marker genes, such as Top2a and Ube2c, was lower in atretic follicles than in healthy and mature follicles, which suggested compromised proliferative activity of granulosa cells in atretic follicles (Fig. 4h). Taken together, through the single-cell and spatial transcriptomes, we identified several categories of granulosa cells with region-specific properties and highlighted the molecular differences between atretic and healthy follicles from a developmental perspective.

Spatial characteristics of the luteal cell subtypes

We next turned our attention to the region-specific luteal cells. To comprehensively characterize the molecular signatures of BLC and OLC, we examined the top DEGs and found that Ssu2 and Bhmt were specifically enriched in BLC, whereas Onecut2 and Akr1c18 were specifically enriched in OLC (Fig. 5a). Additionally, the luteal cell marker genes Ptgfr and Sfrp4 (ref. 35) were highly expressed in both BLC and OLC (Fig. 5a). Furthermore, using structural ovary markers, spatial transcriptomic imaging localized the BLC both to the corpus luteum and to other ovary regions, which was consistent with the spatial distribution of the Bhmt gene, while the OLC were specifically located within the corpus luteum (Fig. 5b). Interestingly, Onecut2 itself was detected in both the corpus luteum and other regions. Next, we performed immunohistochemistry using antibodies against Onecut2 and found that Onecut2 was highly expressed in luteal cells and oocytes, and that the corpus luteum begins to express Onecut2 as ovulation occurs (Fig. 5c and Supplementary Fig. 5a).

Fig. 5: Molecular diversity and spatial distribution differences of luteal cells in mouse ovary.

a UMAP visualization of marker genes for BLC and OLC. Color key represents the expression levels. BLC, Bhmt-positive luteal cells; OLC, Onecut2-positive luteal cells. b Visualization of the expression patterns of Bhmt and Onecut2 (left) and the spatial distribution of BLC and OLC (right) on 2-month mouse ovary based on integration of scRNA-seq cell type annotations with spatial transcriptomic data. c Immunohistochemistry staining of Onecut2 in mouse ovary. The scale bars represent 400 μm in the low-magnification view and 60 μm in the high-magnification view. d Spatial scatter pie plot representing the proportions of the BLC and OLC from the PD90 ovary reference atlas in the 2-month mouse ovary. BEndo, Blood vessel endothelial cells; LEndo, Lymphatic endothelial cells. e Volcano plot showing the significantly differentially expressed transcripts in the OLC compared with the BLC (left). The GO terms enriched in the differentially expressed genes that are unique to BLC and OLC (right). f Typical incoming signaling pathway (CD137) and outgoing signaling pathway (SPP1) of OLC. g Heatmap of regulon activity analyzed by SCENIC.

To further exclude the impact of potential precursor cells, the granulosa cells and theca cells were also mapped to the spatial ovary sections, and all of them exhibited no or low enrichment in the corpus luteum (Fig. 4e and Supplementary Fig. 5b). Next, we examined the ratios of cell types within the spots in the corpus luteum using SPOTlight (ref. 36) and found that approximately 60% of the cells in the corpus luteum are BLC and approximately 40% are OLC (Fig. 5d). To further interpret the functional differences, the DEGs between BLC and OLC were examined (Fig. 5e). Furthermore, GO analysis showed that the GO terms “sterol biosynthetic process” and “cellular response to fatty acid” were specific to BLC, whereas “negative regulation of cell population proliferation” and “response to prostaglandin D” were specific to OLC (Fig. 5e).
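
Once a deconvolution tool has assigned cell type proportions to each spot, a region-level composition can be summarized by averaging over the spots in that region. The sketch below shows only this averaging step, not the deconvolution itself; the spot names, proportions, and the `mean_composition` helper are hypothetical and do not reproduce the values reported above.

```python
def mean_composition(spot_props, spots):
    """Average per-spot cell type proportions over the selected spots.

    spot_props: {spot_id: {cell_type: proportion}}; a cell type missing from a
    spot is treated as contributing 0 to that spot.
    """
    totals = {}
    for spot in spots:
        for cell_type, p in spot_props[spot].items():
            totals[cell_type] = totals.get(cell_type, 0.0) + p
    return {cell_type: total / len(spots) for cell_type, total in totals.items()}


# Toy deconvolution output (hypothetical values).
toy_props = {
    "cl_spot_a": {"BLC": 0.5, "OLC": 0.5},
    "cl_spot_b": {"BLC": 0.75, "OLC": 0.25},
    "follicle_spot": {"GC": 1.0},
}

corpus_luteum_spots = ["cl_spot_a", "cl_spot_b"]
composition = mean_composition(toy_props, corpus_luteum_spots)
print(composition)
```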

Subsequently, we performed CellChat analysis on PD90 ovarian cell clusters. In terms of the number of interactions, we found that OLC (senders: the cells that express ligands) could interact with other cell types (receivers: the cells that express receptors corresponding to the ligands), such as BEndo, pMesenchyma and pericytes, while BLC (senders) could interact mainly with BEndo (receivers), but did not interact with OLC (receivers) (Supplementary Fig. 5c). Moreover, the interaction strength results indicated that both BLC and OLC exhibited stronger interactions with BEndo, which suggested their roles in corpus luteum angiogenesis (Supplementary Fig. 5c). Next, we focused on the interactions of OLC and found that the CD137 signaling pathway was one of the main incoming pathways from mesenchymal cells (senders) and pericytes (senders) to OLC (receivers) and T cells (receivers). Among the outgoing signaling patterns, the SPP1 signaling pathway is highly enriched in the OLC and macrophages, which cooperatively participate in the regulation of other cell clusters (Fig. 5f). Finally, we performed SCENIC analysis, which revealed the regulatory relationship between granulosa cells and luteal cells (Fig. 5g). For instance, Etv3 is highly enriched in the BLC and Maf is highly enriched in the OLC (Fig. 5g). Collectively, we identified a novel corpus luteum-specific cell type (OLC) and compared the molecular and functional differences between BLC and OLC, which provided important new information for understanding the physiological function of the corpus luteum.

Molecular differences in developmental trajectories of luteal cells

After ovulation, granulosa cells and theca cells further differentiate into luteal cells. To further understand the relationships between potential precursor cells and luteal cells, we performed correlation analysis and found that the four subtypes of granulosa cells clustered together, while the theca cells, BLC and OLC clustered together (Fig. 6a). To explore the derivation of BLC and OLC, we also constructed a developmental trajectory using Monocle, which revealed that both granulosa cells and theca cells differentiate into BLC and OLC after branch 2 (Fig. 6b). Moreover, we found that Nr5a2 was mainly expressed in granulosa cells, while Pcolce was mainly expressed in luteal cells and theca cells (Fig. 6c). Interestingly, two cell groups formed at the end of the trajectory, and both contained BLC and OLC, which suggested that two states of BLC and OLC exist within the corpus luteum (Fig. 6b). Next, we examined the DEGs between cell group 1 and cell group 2 and found that Basp1, Stxbp6 and Hist1h4i exhibited a higher expression level in cell group 1 than in cell group 2 (Fig. 6c). Furthermore, GO analysis revealed that the DEGs highly expressed in cell group 1 were mainly enriched in “steroid biosynthetic process” and “oxidative phosphorylation” (Supplementary Fig. 5d).
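
A Spearman correlation between the average expression profiles of two cell clusters can be sketched as a Pearson correlation computed on ranks, with tied values given their average rank. The implementation below is a self-contained illustration; the cluster profiles are hypothetical toy vectors, and how the study summarized expression per cluster before correlating is an assumption.

```python
def average_ranks(values):
    """1-based ranks of `values`, with ties sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy mean-expression profiles over five genes (hypothetical values).
cluster_a = [1.0, 2.0, 3.0, 4.0, 5.0]
cluster_b = [2.0, 4.0, 6.0, 8.0, 20.0]   # same ordering as cluster_a
cluster_c = [5.0, 4.0, 3.0, 2.0, 1.0]    # reversed ordering

print(spearman(cluster_a, cluster_b), spearman(cluster_a, cluster_c))
```

Because the measure depends only on ranks, the outlying value in `cluster_b` does not lower its correlation with `cluster_a`, which is one reason rank-based correlation is a common choice for comparing expression profiles.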

Fig. 6: Developmental trajectory of luteal cells in mouse ovary.

a Heatmap showing the correlations among different cell clusters by Spearman analysis. b Pseudotime trajectory of granulosa cells, luteal cells, and Theca cells analyzed by Monocle. c Expression of representative genes along developmental trajectory. d Spatial trajectory showing the differentiation of granulosa cells analyzed by stLearn and SPATA. e Dynamic expression patterns of differentially expressed genes along trajectory. SecF: Secondary follicles. f The distribution patterns of typical genes along trajectory. g Lineage tracing of Foxl2-expressing cells in the fetal ovary with Foxl2-Cre mice (Top panel) and in the 3-month ovary with Foxl2-CreERT2 mice (Middle panel); Lineage tracing of Cyp17a1-expressing cells in the 3-month ovary with Cyp17a1-Cre mice (Bottom panel). GFP signal was detected in both ONECUT2-positive luteal cells (inset, white arrows) and ONECUT2-negative luteal cells (inset, white arrowheads); Large luteal cells (inset, yellow arrows), Small luteal cells (inset, yellow arrowheads). The scale bars represent 100 μm.

Subsequently, we constructed pseudo-space-time trajectories using stLearn (ref. 26), which showed a well-defined differentiation process from granulosa cells to luteal cells (Fig. 6d). The DEGs along the spatial trajectory revealed that Rarres2 and Scaf1 were highly expressed in the secondary follicle, Ctgf, Adgrg1, and Slc38a5 were highly expressed in the atretic follicle, and Rab27a, Nrcam, and Foxk2 were highly expressed in the corpus luteum (Fig. 6e). Meanwhile, we found that Rab27a, which was mainly located in the corpus luteum, was highly expressed in the BLC. Additionally, Slc38a5 was highly expressed in some Top2ahigh GC, and Inhbbhigh GC was mainly located in the atretic follicles, suggesting an important role in the atresia of follicles (Fig. 6f).

Previous studies have shown that Foxl2-positive cells in the fetal ovary will give rise to pre-granulosa cells and stromal progenitor cells37. The pre-granulosa cells will further differentiate into granulosa cells and Cyp17a1-negative theca cells, while stromal progenitor cells will further differentiate into stromal cells and Cyp17a1-positive theca cells37. To further validate the derivation of OLC, we conducted in vivo lineage tracing to label Foxl2-expressing cells in the fetal ovary using Foxl2-Cre mice38 (Supplementary Fig. 5e). Our findings revealed that nearly all cells expressing Onecut2 were marked by GFP-positive signals, indicating that the OLCs are derived from Foxl2-expressing cells (Fig. 6g). However, when Foxl2-Cre activity was induced in the fetal ovary, GFP signals were detected in both granulosa cells and Cyp17a1-positive theca cells. To further investigate the relationship between the origin of OLCs and granulosa cells, co-immunostaining for ONECUT2 and GFP was performed on 3-month-old ovaries from mT/mG; Foxl2-CreERT2 mice39 (Fig. 6g). The results demonstrated that the majority of GFP-positive cells expressed Onecut2 (Supplementary Fig. 5f), suggesting that most granulosa cells differentiate into OLCs after ovulation. Additionally, we performed in vivo lineage tracing to label Cyp17a1-expressing cells in 3-month-old ovaries using Cyp17a1-Cre mice. We observed that a subset of the GFP-positive cells expressed Onecut2 (Supplementary Fig. 5f), indicating that some stromal cell-derived Cyp17a1-positive theca cells also differentiate into OLCs (Fig. 6g). Taken together, these findings demonstrate that both granulosa cells and Cyp17a1-positive theca cells can differentiate into OLCs after ovulation, with the majority of OLCs originating from granulosa cells.

Discussion

In this study, we systematically analyzed the single-cell and spatial transcriptomic maps of the mouse ovary from the early fetal gonad to the adult mature ovary. We elucidated the differences between early meiotic oocytes and follicular oocytes around primordial follicle assembly. We also dissected the molecular signatures and spatial distribution of granulosa cell subtypes to understand the processes of follicular growth and atresia. Furthermore, we demonstrated the differences in molecular features between atretic follicles and healthy follicles. Strikingly, we discovered two subtypes of luteal cells: one specifically expresses the transcription factor Onecut2 (OLC), whereas the other specifically expresses the betaine homocysteine methyltransferase Bhmt (BLC). Altogether, we present here the single-cell and spatial transcriptomic atlas of mouse ovaries, which provides mechanistic insights into both molecular and morphological aspects of ovary development.

Early female germ cell development mainly involves meiosis initiation and the death or survival of germ cells around the primordial follicle assembly12. A previous study has reported that retinoic acid is required for Stra8 expression, which could promote the mitosis-to-meiosis transition40. However, a recent study found that meiosis occurs normally in the fetal ovary of mice lacking all retinoic acid receptors41,42. In addition, transcription factors such as Msx2 and Zglp1 could promote meiosis initiation by activating the downstream effector Stra843,44. Interestingly, gonad somatic cells have been shown to be involved in germ cell meiosis initiation, and abnormal differentiation of gonad somatic cells could cause meiosis initiation defects45. Our study provides insights into the meiosis initiation of female germ cells. For example, Arid1a is highly expressed in mitotic FGCs, and Arid1a deficiency could lead to a global reduction in open chromatin46, suggesting a role in chromatin remodeling. Additionally, Slc25a31 is specifically expressed in meiotic FGCs (Supplementary Fig. 3a) and is also specifically expressed in male meiotic germ cells47, which suggests a role in the mitosis-to-meiosis transition. The data also suggest that female germ cells initiate meiosis first in the medullary regions and then in the cortical regions of the ovary (Fig. 3g), which is consistent with recent studies29,48. We also constructed the developmental trajectory of mouse female germ cells. The oocytes in cell state 2 barely expressed Sycp2l, Uchl1, Xdh, and Sohlh1 but specifically expressed the transcription factor Pbx3, suggesting an important role of Pbx3 in regulating the state of early meiotic oocytes (Fig. 2d).

Single-cell RNA sequencing is a powerful tool for the systematic identification of cell types in a given tissue, but it does not preserve its spatial information, leading to an inaccurate characterization of cell types and predictions of their physiological function. Our analysis demonstrates that spatial transcriptomics can be used to improve the robustness of scRNA-seq. In this study, we accurately characterized the subtypes of granulosa cells and luteal cells (Figs. 4d and 5b), considering their known marker genes and spatial distributions. Additionally, we compared the DEGs across the spatial regions of healthy follicles and atretic follicles, and we found that Inhbb and Fshr were highly expressed in healthy follicles (Fig. 4a, g), while Ctgf was highly expressed in atretic follicles (Fig. 4a), which can be used as marker genes for evaluation of follicle states.

While our spatial atlas of the mouse ovary provides a valuable resource for investigating ovary development and morphogenesis, it does have certain limitations. The 55-µm-diameter spots used in the Visium platform contain multiple cells and do not provide enough resolution for small ovarian sections and some sparsely distributed cell populations to be clustered as unique cell types. For example, the spatial data from E18.5, PD0, PD3, and PD7 ovarian sections exhibit a relatively uniform distribution of unique transcripts (Fig. 1f). However, this limitation was partially resolved by integrating scRNA-seq data and spatial data or by performing cellular deconvolution analysis using SPOTlight36, which could infer the cell-type composition of each spot.
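As a purely illustrative aid, the following minimal sketch shows why a spot that contains several cells can still be decomposed into cell-type fractions. It assumes a spot is a linear mixture of exactly two reference expression profiles and solves for the mixing fraction by least squares. This is not the SPOTlight algorithm; the function name, the marker values, and the two-cell-type restriction are all hypothetical simplifications.

```python
def mixing_fraction(spot, profile_a, profile_b):
    """Least-squares fraction p of cell type A in a spot modeled as
    p * profile_a + (1 - p) * profile_b, clipped to the range [0, 1].

    Illustrative two-component model only (an assumption of this sketch).
    """
    diff = [a - b for a, b in zip(profile_a, profile_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("reference profiles are identical")
    numer = sum((s - b) * d for s, b, d in zip(spot, profile_b, diff))
    return min(1.0, max(0.0, numer / denom))


# Hypothetical normalized expression of three marker genes per cell type.
granulosa = [8.0, 1.0, 0.5]
luteal = [1.0, 6.0, 4.0]

# A synthetic spot built as 75% granulosa and 25% luteal signal.
spot = [0.75 * g + 0.25 * l for g, l in zip(granulosa, luteal)]

p = mixing_fraction(spot, granulosa, luteal)
print(f"estimated granulosa fraction: {p:.2f}")
```

With noise-free synthetic input the estimate recovers the fraction used to build the spot. Real spots contain more than two cell types and measurement noise, which is why dedicated deconvolution tools are needed in practice.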

The important physiological function of the corpus luteum is producing the hormone progesterone to support pregnancy, but the cell composition of the corpus luteum and their spatial distribution are still poorly understood. Our research dissected the subtypes of luteal cells and identified a novel cell type (OLC) that highly expressed Onecut2 and was specifically located within the corpus luteum (Fig. 5a, b). Furthermore, lineage tracing results also showed that Onecut2 was expressed in both small GFP-positive cells and large GFP-positive cells (Fig. 6g), which suggested that OLC may include both small and large luteal cells. A previous study demonstrated that Onecut2 could suppress the androgen receptor transcriptional program by direct regulation of androgen receptor target genes49. Therefore, the functions of Onecut2 in the corpus luteum need to be further explored.

In conclusion, we have combined single-cell and spatial transcriptomics to explore the morphogenesis of the mouse ovary at high molecular, spatial, and temporal resolution. We analyzed cellular proximities within the ovary tissue, measured changes in cellular composition with anatomical location, and quantified anatomically restricted gene expression. Because cellular transitions in complex lineages do not occur in a synchronized manner, our data represent a broad range of cellular states from the undifferentiated gonad to the adult ovary. Our research thus demonstrates how combined spatial and single-cell RNA sequencing can be used to study the interplay between cellular differentiation and morphogenesis.

Methods

Animals

Wild-type C57BL/6J mice were purchased from Beijing SPF Biotechnology Co., Ltd. All mice were housed in a specific pathogen-free facility with individually ventilated cages and free access to a regular rodent chow diet and water. The room had a controlled temperature (23–25 °C), controlled humidity (40–65%), and light program (alternating light/dark cycles). All animal experiments were conducted with approval from the Institutional Animal Care and Use Committee of Guangdong Second Provincial General Hospital, Guangzhou, China. All the methods were carried out in accordance with the Code of Practice for the Care and Handling of Animals guidelines. We have complied with all relevant ethical regulations for animal use.

Preparation of ovarian cell suspensions and cryosections

The ovary samples for scRNA-seq were dissected from female mice at PD0, PD21, and PD90. For each single-cell sequencing experiment, the clean ovaries were subjected to a standard digestion procedure using a Tumor Dissociation Kit (Miltenyi Biotec, #130-095-929) as described previously12. Finally, the ovarian cell suspensions were prepared for further library construction. For the 10x Visium experiment, the fetal and postnatal ovaries were collected from female C57BL/6J mice at E13.5, E18.5, PD0, PD3, PD7, PD14, PD30, and PD60. The ovary samples were embedded in pre-chilled optimal cutting temperature (OCT) compound, and the frozen ovaries were sectioned at 10 µm thickness. Subsequently, each ovarian cryosection was mounted on 10x Visium Spatial Tissue Optimization slides.

Single-cell and spatial transcriptomic libraries construction and sequencing

The suspended single ovarian cells were encapsulated into droplet emulsions using a 10x Chromium system. Libraries were then constructed according to the manufacturer’s protocol for the Chromium Single-Cell 3′ Gel Bead and Library V3 kit. After the generation of gel beads-in-emulsion (GEMs), cDNA was barcoded with unique molecular identifiers (UMIs) during reverse transcription, and 12 cycles were used for cDNA amplification. The resulting libraries were fragmented and assessed on a fragment analyzer using a High Sensitivity NGS Analysis Kit. The average fragment length was quantified using the fragment analyzer and qPCR with a Kapa Library Quantification Kit for Illumina. The final libraries were sequenced on an Illumina NovaSeq 6000 (Illumina, San Diego) with paired-end 150 bp (PE150) reads.
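To illustrate the purpose of UMI barcoding, the minimal sketch below collapses reads that share a cell barcode, UMI, and gene into a single counted molecule, so that PCR duplicates from the amplification cycles do not inflate expression estimates. The function name and the toy barcodes are hypothetical, and the sketch omits the barcode and UMI error correction that production pipelines apply.

```python
from collections import defaultdict


def count_umis(reads):
    """Collapse reads into UMI counts per (cell barcode, gene).

    `reads` is an iterable of (cell_barcode, umi, gene) tuples, one per
    aligned read. Reads sharing all three fields are treated as PCR
    duplicates of one captured molecule and are counted once.
    """
    molecules = defaultdict(set)
    for barcode, umi, gene in reads:
        molecules[(barcode, gene)].add(umi)
    return {key: len(umis) for key, umis in molecules.items()}


# Hypothetical toy reads (barcodes and UMIs shortened for readability).
reads = [
    ("AAAC", "TTGCA", "Foxl2"),
    ("AAAC", "TTGCA", "Foxl2"),  # PCR duplicate of the read above
    ("AAAC", "GGTAC", "Foxl2"),  # second molecule in the same cell
    ("CCGT", "TTGCA", "Ddx4"),   # same UMI, different cell and gene
]

umi_counts = count_umis(reads)
for (barcode, gene), n in sorted(umi_counts.items()):
    print(barcode, gene, n)
```

Three reads for Foxl2 in the first cell collapse to two molecules because two of them carry the same UMI.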

For 10x Visium library construction, the sections were first placed on a Thermocycler Adaptor with the active surface facing up and incubated for 1 min at 37 °C, then fixed for 30 min with methanol at −20 °C, followed by staining with hematoxylin and eosin. The brightfield images were taken on a Leica DMI8 whole-slide scanner at 10× magnification. Next, permeabilization was performed using permeabilization enzyme to release cellular mRNA, followed by reverse transcription for cDNA synthesis. Subsequently, the cDNA amplification was performed on a S1000™ Touch Thermal Cycler (Bio-Rad). Finally, the Visium spatial libraries were constructed using the Visium Spatial Library Construction Kit (10x Genomics, PN-1000184) and sequenced on an Illumina NovaSeq 6000 with a sequencing depth of at least 100,000 reads per spot.

Processing of single-cell RNA-seq data

For scRNA-seq data, we combined previously published mouse ovary datasets generated by Niu et al.3, Wang et al.5, and Ge et al.10. All these published raw data were integrated with our sequenced raw data and subsequently analyzed using Cell Ranger v4.0.0 and Seurat v4.1.0, employing consistent parameters across all datasets to ensure comparability. Cell Ranger v4.0.0 was used to perform raw data demultiplexing, read mapping, and barcode processing to generate a matrix of gene counts versus cells. Briefly, the raw BCL files generated by the Illumina NovaSeq 6000 sequencer were demultiplexed into fastq files with the mkfastq function. The fastq files were then processed using the count function with default settings, including alignment (using STAR to align reads to the mm10 mouse reference genome), filtering, and UMI counting. The generated count matrices were loaded into R using the Read10X function of the Seurat package (version 4.1.0)50. The Seurat object was created based on two filtering parameters of “min.cells = 5” and “low.thresholds = 200”, followed by adding the sample information in the row names of the matrices and integrating data from different samples. Next, the Seurat object was processed with harmony to remove batch effects51. Subsequently, normalization was performed in Seurat on the filtered matrix to obtain the normalized counts, principal component analysis (PCA) was performed for dimensionality reduction, and the top 18 principal components were used. Then, cells were clustered at a proper resolution and visualized in two dimensions using UMAP52. To identify typical marker genes for cluster annotation, the “FindAllMarkers” function was applied with default parameters to detect differentially expressed genes between clusters.
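The filtering and normalization steps can be sketched in a few lines. The example below is an illustration under stated assumptions, not the code used in this study: it interprets “low.thresholds = 200” as a minimum number of detected genes per cell, filters cells before genes, and assumes log-normalization with a scale factor of 10,000, none of which is specified in the text above. The toy example uses smaller thresholds so that the effect is visible on three cells.

```python
import math


def filter_counts(counts, min_cells=5, min_features=200):
    """Filter a sparse count table given as {cell: {gene: UMI count}}.

    Assumed order (a choice of this sketch): drop cells with fewer than
    `min_features` detected genes, then drop genes detected in fewer
    than `min_cells` of the remaining cells.
    """
    kept_cells = {
        cell: genes
        for cell, genes in counts.items()
        if sum(1 for n in genes.values() if n > 0) >= min_features
    }
    cells_per_gene = {}
    for genes in kept_cells.values():
        for gene, n in genes.items():
            if n > 0:
                cells_per_gene[gene] = cells_per_gene.get(gene, 0) + 1
    kept_genes = {g for g, c in cells_per_gene.items() if c >= min_cells}
    return {
        cell: {g: n for g, n in genes.items() if g in kept_genes and n > 0}
        for cell, genes in kept_cells.items()
    }


def log_normalize(counts, scale_factor=10_000):
    """Return log1p(count / cell total * scale_factor) for every entry."""
    normalized = {}
    for cell, genes in counts.items():
        total = sum(genes.values())
        normalized[cell] = {
            g: math.log1p(n / total * scale_factor) for g, n in genes.items()
        }
    return normalized


# Hypothetical toy counts; thresholds lowered for the three-cell example.
raw = {
    "cell1": {"Foxl2": 3, "Amh": 1, "Inha": 6},
    "cell2": {"Foxl2": 1, "Inha": 4},
    "cell3": {"Ddx4": 2},  # one detected gene only, so the cell is dropped
}
filtered = filter_counts(raw, min_cells=2, min_features=2)
normalized = log_normalize(filtered)
print(sorted(filtered))
print(sorted(filtered["cell1"]))
```

Here cell3 is removed for having too few detected genes, and Amh is removed because it is detected in only one of the remaining cells.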

Processing of 10x Visium data

Space Ranger v1.2.0 was used to demultiplex Visium-prepared raw data with the mkfastq function. The output fastq files and microscope slide images were then processed with the count function using default settings, including alignment (using STAR to align reads to the mm10 mouse reference genome), tissue detection, fiducial detection, and barcode counting. The generated feature-barcode matrices were loaded into R using the Load10X_Spatial function of Seurat. According to the developer’s vignettes, the data were filtered and normalized by SCTransform53. Spatially variable genes were identified using the markvariogram method, PCA was performed for dimensionality reduction, and the top 30 principal components were used. Spots were clustered based on the stSME clustering by stLearn26. Additionally, the SPOTlight package was used to deconvolute spatial transcriptomics spots and infer the ratio of cell types within a given spot36. To explore the differences in molecular features across different spatial regions, we performed segmentation of the histological images using SPATA2::createSegmentation and then detected the differentially expressed genes between healthy and atretic follicles using SPATA2::runDEA54.

Pseudotime and pseudo-space-time analysis

Pseudotime analysis was performed on germ cells and granulosa cells from the mouse ovary atlas data using the Monocle2 R package55. With the gene count matrix as input, a new Monocle object was created. The ordering genes were differentially expressed genes between clusters in each cell type, which were further used to recover lineage trajectories. After pseudotime was determined, differentially expressed genes were clustered to verify the fidelity of lineage trajectories. The root state was set and adjusted following consideration of the biological meanings of different cell branches. Additionally, pseudo-space-time analysis was performed on 2-month-old (2 M) ovary sections using stLearn26. The ordering genes were differentially expressed genes between clades, which were further used to construct spatial trajectories.

Cell-cell interaction analysis

CellChat objects were created using CellChat (https://github.com/sqjin/CellChat, R package, v.1). With “CellChatDB.mouse” set as the ligand-receptor interaction database, cell-cell interaction analysis was then performed with default settings.

GO analysis

GO enrichment analyses were performed using clusterProfiler, an R package in Bioconductor, to detect the biological processes associated with the genes, and GO terms passing the threshold “pvalueCutoff = 0.05” were considered.
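Over-representation analysis of this kind rests on a hypergeometric test, which can be worked through directly. The sketch below computes the probability of observing at least a given overlap between a query gene list and a GO term by chance. The function name and the gene numbers are hypothetical, and the sketch omits the multiple-testing correction that an enrichment package applies across terms.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_term, n_query, n_overlap):
    """One-sided hypergeometric p value, P(X >= n_overlap).

    n_universe: number of background genes
    n_term:     background genes annotated to the GO term
    n_query:    genes in the query list (for example, a DEG list)
    n_overlap:  query genes annotated to the GO term
    """
    max_overlap = min(n_term, n_query)
    total = comb(n_universe, n_query)
    tail = sum(
        comb(n_term, k) * comb(n_universe - n_term, n_query - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical toy numbers: 20 background genes, 5 annotated to the term,
# 4 query genes, 3 of which carry the annotation.
p_value = hypergeom_enrichment_p(n_universe=20, n_term=5, n_query=4, n_overlap=3)
print(f"p = {p_value:.4f}")
```

In this toy case the tail probability is 155/4845, which falls below a 0.05 cutoff, so the term would be retained before any adjustment for multiple testing.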

Transcription factor regulatory network analysis

SCENIC v1.3.1 was used to predict the core regulatory transcription factors and their target genes based on the scRNA-seq data56. Following the standard pipeline, the co-expressed genes for each transcription factor were identified using GENIE3 (v1.18.0)57. RcisTarget was used to infer enriched transcription factor-binding motifs and to predict candidate target genes (regulons) based on the mm10 mouse-specific database containing motifs with genome-wide rankings. Finally, regulon activity was calculated using AUCell.

Lineage tracing experiments

The lineage tracing experiments were performed using Cre; mT/mG female mice. mT/mG mouse is a double-fluorescent Cre reporter mouse, which expresses membrane-targeted tandem dimer Tomato (mT) prior to Cre-mediated excision and membrane-targeted green fluorescent protein (mG) after excision58. Foxl2-Cre mice were crossed with mT/mG mice to study embryonic Foxl2-positive cell-derived luteal cells38. Cyp17a1-Cre mice were crossed with mT/mG mice to study Cyp17a1-positive theca cell-derived luteal cells. Foxl2-CreERT2 mice were crossed with mT/mG mice to study granulosa cell-derived luteal cells. For tamoxifen induction, a single dose of tamoxifen (250 mg tamoxifen/kg) was administered to 3-month-old Foxl2-CreERT2; mT/mG female mice. Ovaries were collected at 4 days post-tamoxifen injection to detect granulosa cell-derived luteal cells.

Immunofluorescence analysis

For immunostaining, the paraffin sections were dewaxed in xylene and rehydrated through a graded series of decreasing ethanol concentrations to PBS. For antigen retrieval, the sections were treated in citric acid for 20 min at 98 °C. After cooling down, the sections were incubated with 5% BSA in 0.3% Triton X-100 for 1 h and were then incubated overnight at 4 °C with the primary antibodies. The following primary antibodies were used: Xdh (Invitrogen, PA5-26285, 1:200), Inha (Invitrogen, MA5-15703, 1:200), ONECUT2 (Proteintech, 21916-1-AP, 1:50), GFP (Abcam, ab13970, 1:200), SCP3 (Abcam, ab97672, 1:50), POU5F1 (Proteintech, 11263-1-AP, 1:50), Ddx4 (Abcam, ab13840, 1:200), and Ddx4 (Abcam, ab27591, 1:200). After being washed three times in PBS, the sections were incubated with secondary antibodies for 1 h at room temperature. Nuclei were stained with 4′,6-diamidino-2-phenylindole (DAPI, Life Technologies), and sections were examined with a confocal laser scanning microscope (Carl Zeiss Inc., Thornwood, NY).

Immunohistochemistry analysis

For paraffin-embedded ovarian tissues, the sections were dewaxed in xylene and rehydrated through a graded series of decreasing ethanol concentrations to PBS. Next, the sections were pretreated in citric acid for 20 min at 98 °C, cooled down to room temperature, and then incubated in 0.3% H2O2 for 10 min to block endogenous peroxidase activity. The sections were incubated overnight at 4 °C with primary antibody ONECUT2 (Proteintech, 21916-1-AP, 1:200), and on the next day they were washed with 1% PBS three times for 5 min each. The sections were then incubated with secondary antibodies for 20 min at room temperature and washed again with 1% PBS three times for 5 min each. DAB was added to each section for 1–2 min, and hematoxylin was used for counterstaining for 1–2 min. Finally, the sections were dehydrated through a series of increasing graded ethanol, cleared in xylene, and sealed. Brown staining of the cytoplasm or nucleus of the cells was considered positive.

Statistics and reproducibility

The ovary samples for the spatial transcriptomic experiments were collected across eight different timepoints. Two or more biological replicates were included at the E13.5, E18.5, PD0, PD3, PD7, PD14, and PD60 timepoints. The scRNA-seq data across twelve different timepoints each have only one biological replicate. In the computational analysis of these high-throughput sequencing data, the statistical analyses outlined in the specific packages were used.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.