Main

Polycystic ovary syndrome (PCOS) is a reproductive, metabolic and psychological condition with impacts across the lifespan, affecting approximately 11% to 13% of women worldwide. From 1990 to 2019, its global age-standardized point prevalence increased by 30.4%, partially driven by the expansion and evolution of diagnostic criteria, which substantially broadened the definition of PCOS1,2. PCOS not only disrupts reproductive function, but also has detrimental effects on metabolism3. Women with PCOS are three times more likely to experience obesity than their healthy counterparts4. Moreover, they have a higher prevalence of insulin resistance (26.7%)5 with a fourfold increased risk of type 2 diabetes (T2DM) before the age of 40 (ref. 6). In addition, PCOS is associated with an elevated risk of cardiovascular disease, metabolic dysfunction-associated steatotic liver disease (MASLD), psychological disorders, baldness and other comorbidities, thereby imposing a considerable societal burden throughout an individual’s lifespan6.

PCOS exhibits a heterogeneous clinical presentation characterized by a wide array of symptoms, leading to varying potential consequences for affected women7. This variation in clinical phenotypic presentation leads to challenges in diagnosis, treatment and prevention, contributing to dissatisfaction among both healthcare practitioners and patients alike8. The lack of agreement on clinical subtypes, specifically because of an absence of robust evidence for distinctive subtypes, exacerbates healthcare costs for PCOS subjects, estimated to reach US$8 billion annually in the USA alone9. Some studies attempt to classify PCOS based on specific symptoms and metabolic comorbidities10. However, in the era of precision medicine, it is important to establish more precise and interpretable subtypes of the disease enabling distinct diagnosis and personalized therapy11. We hypothesize that the identification and validation of precise subtypes of women with PCOS in a broader population will improve understanding of their clinical consequences and the design of targeted interventions.

Our current study aimed to identify a set of commonly measured variables that could distinguish clinical subtypes of PCOS using unsupervised cluster analysis in a large discovery cohort, and to validate the subtypes in five independent cohorts of different ethnicities from China, the USA, Europe, Singapore and Brazil. In addition, we examined reproductive and metabolic outcomes of the subtypes over a median follow-up of 6.5 years, and compared in vitro fertilization (IVF) outcomes and pregnancy complications among the subtypes.

Results

Identification and validation of PCOS subtypes

In the discovery cohort, a total of 11,908 of the 47,071 women with PCOS who were not receiving therapy at the first visit were included in the analyses (Fig. 1 and Extended Data Table 1). To identify PCOS subtypes, we initially included 29 clinical variables relevant for PCOS. Following correlation analysis, principal component analysis and exploratory factor analysis, nine features were selected for subsequent clustering. Using unsupervised clustering, four subtypes were identified in the discovery cohort (Fig. 2a,b). The Jaccard scores for all four subtypes exceeded 0.79, suggesting good stability of clustering (Supplementary Table 1).
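The stability check mentioned above can be illustrated with a minimal sketch. It assumes the Jaccard scores are the usual resampling-based measure (re-cluster a resample, then compare each original cluster with its best-matching new cluster); the section itself does not spell out the procedure, and the function names and toy data here are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two sets of sample IDs."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_members(labels):
    """Map each cluster label to the set of sample IDs assigned to it."""
    members = {}
    for sample_id, label in labels.items():
        members.setdefault(label, set()).add(sample_id)
    return members


def stability_scores(original, resampled):
    """For each original cluster, the best Jaccard match in a re-clustering.

    Both arguments map sample ID -> cluster label. Only samples present in
    both clusterings are compared, because a resample omits some samples.
    """
    shared = original.keys() & resampled.keys()
    orig = cluster_members({s: original[s] for s in shared})
    new = cluster_members({s: resampled[s] for s in shared})
    return {
        label: max(jaccard(ids, other) for other in new.values())
        for label, ids in orig.items()
    }


# Hypothetical toy data: six samples, two clusters, one sample reassigned.
original = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
resampled = {1: "x", 2: "x", 3: "y", 4: "y", 5: "y", 6: "y"}
scores = stability_scores(original, resampled)
```

In a full analysis these per-cluster scores would typically be averaged over many resamples; a single comparison is shown only to make the quantity concrete.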

Fig. 1: Flow chart of study design.

COH, controlled ovarian hyperstimulation; CV, cross validation; PGD, preimplantation genetic diagnosis.

Fig. 2: Classification and validation of PCOS subtypes.

a, Visualization of the first two principal components (PCs) of the scaled quantitative trait data, with cases grouped according to their identified subtypes by k-means clustering. b, Distribution of cases in the discovery cohort from China (n = 11,908) by k-means clustering. c, Characteristics of each cluster in the Chinese discovery cohort (n = 2,952 in HA-PCOS; n = 3,138 in OB-PCOS; n = 3,103 in SHBG-PCOS; n = 2,715 in LH-PCOS). The center lines of the box indicate the median, the bounds of the box represent the 25th and 75th percentiles, and the bounds of whiskers are the 5th and 95th percentiles. d–h, Distribution of cases in the validation cohorts from China (n = 1,476) (d), USA (n = 593) (e), Europe (n = 197) (f), Singapore (n = 127) (g) and Brazil (n = 85) (h), based on k-means clustering. i–m, ROC curves for the four subtypes in the validation cohorts from China (n = 1,476) (i), USA (n = 593) (j), Europe (n = 197) (k), Singapore (n = 127) (l) and Brazil (n = 85) (m).

Each subtype exhibited distinct clinical characteristics (Fig. 2c and Extended Data Table 1). The hyperandrogenic subtype (HA-PCOS, 25%) was characterized by high testosterone and dehydroepiandrosterone sulfate (DHEA-S) levels, along with mild metabolic disorders. The subtype with obesity (OB-PCOS, 26%) was characterized by higher body mass index (BMI), fasting glucose and fasting insulin level, with the highest prevalence of T2DM (7.9%), dyslipidemia (75.3%) and hypertension (28.7%). The high-sex hormone-binding globulin subtype (SHBG-PCOS, 26%) had the highest sex hormone-binding globulin (SHBG) level and lowest BMI among the four subtypes, primarily manifested as lower luteinizing hormone (LH) and testosterone levels. The high-LH–AMH subtype (LH-PCOS, 23%) was distinguished by elevated levels of LH, follicle-stimulating hormone (FSH) and anti-Müllerian hormone (AMH).

To validate our clustering results, we replicated them in five independent validation cohorts from China (3,081 women with PCOS; 1,476 with eligible data), the USA (750 PCOS cases; 593 with eligible data), Europe (572 PCOS cases; 197 with eligible data), Singapore (428 PCOS cases; 127 with eligible data) and Brazil (100 PCOS cases; 85 with eligible data) (Extended Data Table 2). The same unsupervised clustering approach, using k-means with the same parameter setting as the discovery cohort, was applied for each validation cohort. These analyses identified the same four subtypes in each validation cohort (Fig. 2d–h).

To ensure the model could be applied to each woman with PCOS in clinic, we conducted ridge regression analysis in the discovery cohort, which generated four ridge regression equations to compute the probabilities that a given participant corresponded to each of those four subtypes. To verify the accuracy and reliability of these ridge regression equations, nine features from each participant in the validation cohorts were then applied as inputs into these four equations. We then calculated area under the curve (AUC) values from receiver operating characteristic (ROC) analysis of each subtype to evaluate the accuracy of the predictions obtained by the ridge regression equations based on comparison with subtype labels obtained via k-means (as reference). In the Chinese validation cohort, the average AUC of the four subtypes was 0.88 (0.87 to 0.90) (Fig. 2i); an average AUC of 0.92 (0.83 to 0.95) was obtained in the US validation cohort (Fig. 2j), 0.88 (0.88 to 0.89) in the European cohort (Fig. 2k), 0.95 (0.90 to 0.98) in the Singapore cohort (Fig. 2l) and 0.82 (0.80 to 0.86) in the Brazilian cohort (Fig. 2m). The sensitivity and specificity in each validation cohort are shown in Supplementary Table 2. Considering the ethnic heterogeneity of the US cohort, we further stratified this cohort into sub-populations of European (n = 463) and African (n = 82) descent for separate analysis, which resulted in an AUC value of 0.88 for both sub-populations (Supplementary Fig. 1). These results thus demonstrated that the ridge regression equations could predict PCOS subtype across different geographic and ethnic populations. We also repeated the above analysis with a random forest method, but obtained lower AUCs than with ridge regression in all validation cohorts.
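The per-subtype AUC evaluation described above can be sketched as a one-vs-rest comparison of each equation's output against the k-means reference labels. This is a minimal illustration only: the function names and the scores are hypothetical, and the pairwise AUC formula is a standard equivalent of the area under the ROC curve, not necessarily the computation used in the study.

```python
def auc(scores, is_positive):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(scores_by_subtype, reference_labels):
    """AUC of each subtype's score against the reference (k-means) labels."""
    return {
        subtype: auc(scores, [label == subtype for label in reference_labels])
        for subtype, scores in scores_by_subtype.items()
    }


# Hypothetical scores for five participants; values are illustrative only.
reference = ["HA", "HA", "OB", "SHBG", "LH"]
scores_by_subtype = {
    "HA": [0.9, 0.6, 0.7, 0.1, 0.2],
    "OB": [0.1, 0.2, 0.8, 0.3, 0.1],
}
aucs = one_vs_rest_auc(scores_by_subtype, reference)
```

The pairwise form is quadratic in cohort size, which is acceptable for a sketch; a rank-based implementation would be preferable for cohorts of thousands.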

Longitudinal follow-up for the four PCOS subtypes

To investigate the long-term complications and disease remission of the four subtypes, we performed longitudinal follow-up with a median duration of 6.5 years. A total of 4,542 women with PCOS diagnosed between 2014 and 2018 in the discovery cohort were followed up by telephone interviews (Fig. 1 and Extended Data Table 3) and 523 of them voluntarily underwent physical examinations (Fig. 3, Extended Data Figs. 1–3 and Extended Data Table 4). The average age of these participants was 34 years.

Fig. 3: Follow-up outcomes for the four PCOS subtypes.

a, PCOS remission rates across the four subtypes (n = 128 in HA-PCOS; n = 110 in OB-PCOS; n = 142 in SHBG-PCOS; n = 143 in LH-PCOS) during the follow-up period. Error bars represent the CIs for the overall rate estimates. b–d, Cumulative incidence of T2DM (b), hypertension (c) and dyslipidemia (d) across the follow-up period.

First, remission of PCOS was evaluated at the time of the follow-up visit. Based on the telephone interview, the percentages of women who had received treatment for PCOS or were diagnosed with PCOS in the past six months among participants who had routine physical examinations were: 78.0% in the HA-PCOS subtype, 61.4% in OB-PCOS, 61.6% in SHBG-PCOS and 80.6% in LH-PCOS (Extended Data Table 3). Similarly, according to physical examination data at the follow-up visit, the percentages of women in the four subtypes who still met the Rotterdam criteria were: 67.2% for HA-PCOS, 50.9% for OB-PCOS, 52.8% for SHBG-PCOS and 74.8% for LH-PCOS (Fig. 3a).

Second, the hyperandrogenic, ovulatory and polycystic ovarian conditions were assessed at follow-up, and remission of these diagnostic features varied notably among the subtypes (Extended Data Fig. 2). The percentages of women with PCOS who remained hyperandrogenic were 56.0%, 31.6%, 44.4% and 55.6% for HA-PCOS, OB-PCOS, SHBG-PCOS and LH-PCOS, respectively. In terms of the other two clinical features, SHBG-PCOS had the lowest incidence of persistent oligo-ovulation or anovulation (57.7%) and polycystic ovaries (65.4%) among the four subtypes.

Third, the cumulative incidence of chronic metabolic complications was also compared among the four subtypes. HA-PCOS had the highest incidence of dyslipidemia (24.4%), whereas OB-PCOS exhibited the highest incidence of T2DM (16.0%). The incidence of hypertension was higher in HA-PCOS (11.1%) and OB-PCOS (14.6%) than in the other two subtypes. Remarkably, SHBG-PCOS demonstrated better metabolic characteristics, with the lowest incidence of T2DM and hypertension (Fig. 3b–d).

During the follow-up period, all subtypes showed an increase in BMI (Extended Data Fig. 1). OB-PCOS continued to exhibit the most unfavorable glucose metabolism, along with the highest rates of overweight and obesity. Notably, hepatic steatosis analysis revealed that OB-PCOS had the highest prevalence of MASLD (85.8%), followed by HA-PCOS (77.2%) (Extended Data Fig. 3).

IVF outcomes among the four PCOS subtypes

There were 5,418 women with PCOS who received IVF treatment in the discovery cohort. Additional information on the controlled ovarian hyperstimulation, embryo culture and transfer protocols for each subtype can be found in Supplementary Table 3.

To investigate the primary IVF outcomes among the four subtypes, we compared the live birth rate, pregnancy rate and pregnancy loss rate (Fig. 4, Table 1 and Supplementary Table 4). The live birth rates for HA-PCOS, OB-PCOS, SHBG-PCOS and LH-PCOS were 50.6%, 48.9%, 56.3% and 54.8%, respectively. Notably, women with SHBG-PCOS and LH-PCOS had higher live birth rates, even exceeding that of the control group (53.8%).

Table 1 IVF outcomes
Fig. 4: Odds ratios of IVF outcomes for the four PCOS subtypes.

Rectangles in the plot represent the log-transformed OR and the error bars represent the 95% CIs. For each outcome, binary logistic regression was used to calculate the odds ratio with 95% CI for each subtype (n = 1,295 in HA-PCOS; n = 1,203 in OB-PCOS; n = 1,650 in SHBG-PCOS; n = 1,270 in LH-PCOS) compared with control (n = 4,165), adjusted for two models of potential confounders: (a) age and type of embryo transfer (fresh or frozen) for overall IVF outcomes; (b) age and ovarian stimulation therapy for specific comparisons.

The clinical pregnancy rates for the four subtypes were 66.0% in HA-PCOS, 62.9% in OB-PCOS, 67.4% in SHBG-PCOS and 66.7% in LH-PCOS, all higher than the control group (60.6%). The total pregnancy loss rates were also significantly higher in the four subtypes, at 31.5%, 32.5%, 24.7% and 27.7%, respectively, compared with the control (19.8%) (Table 1). HA-PCOS displayed the highest clinical pregnancy loss rate (23.3%), particularly in the second trimester (odds ratio (OR) 7.32, 95% confidence interval (95% CI) 4.94–10.85) (Fig. 4 and Supplementary Table 4). OB-PCOS had the poorest outcomes, with the lowest clinical pregnancy rate (62.9%) and the highest total pregnancy loss rate (32.5%). Conversely, SHBG-PCOS had the lowest total pregnancy loss (24.7%), which contributes to the best pregnancy outcomes among the four subtypes (Table 1). Because BMI is one of the variables used for clustering, a potential confounding effect of BMI on fertility outcomes cannot be excluded at this time.
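To make the OR and 95% CI notation concrete, the sketch below computes an unadjusted odds ratio with a Wald interval from a 2×2 table. This is a simplification and an assumption for illustration: the ORs reported in this article come from logistic regression adjusted for confounders, so they would not be reproduced by this formula, and the counts used here are hypothetical.

```python
import math


def odds_ratio_ci(events_exposed, n_exposed, events_control, n_control, z=1.96):
    """Unadjusted odds ratio with a Wald 95% CI from a 2x2 table."""
    a = events_exposed
    b = n_exposed - events_exposed
    c = events_control
    d = n_control - events_control
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for a Wald interval")
    log_or = math.log((a * d) / (b * c))
    # Standard error of the log odds ratio.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical counts: 20/100 events in a subtype versus 10/100 in controls.
or_, lower, upper = odds_ratio_ci(20, 100, 10, 100)
```

An interval that includes 1, as in this toy example, indicates that the difference from the control group is not statistically significant at the 5% level.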

Regarding maternal complications, all women with PCOS exhibited an increased risk of moderate to severe ovarian hyperstimulation syndrome (OHSS) (all OR >1) (Fig. 4 and Supplementary Table 4). Notably, LH-PCOS had the highest risk of OHSS compared with the control group (OR 7.44, 95% CI 4.63–11.96). OB-PCOS showed a higher risk of gestational diabetes (OR 1.70, 95% CI 1.20–2.39). Gestational hypertension was more prevalent in OB-PCOS and HA-PCOS, with ORs of 2.83 and 2.63, respectively. Moreover, HA-PCOS had the highest risk of premature rupture of membranes (OR 2.91, 95% CI 1.56–5.43).

Regarding neonatal complications, women with PCOS who underwent IVF generally had lower rates of small for gestational age (SGA) infants than the control group, except for HA-PCOS (OR 1.00, 95% CI 0.58–1.70). On the other hand, the incidence of large for gestational age (LGA) was higher in all four PCOS subtypes than in the control group, with rates exceeding 20%. OB-PCOS had the highest risk of LGA, with a rate reaching 38.1% and an OR of 2.14 (Table 1, Fig. 4 and Supplementary Table 4).

Outcomes of different IVF strategies in each PCOS subtype

To provide clinicians with exploratory IVF strategies for each PCOS subtype, we conducted in-depth analyses focusing on key clinical concerns: embryo transfer strategies, endometrial preparation protocols and ovarian hyperstimulation strategies (Table 2 and Extended Data Table 5).

Table 2 IVF outcomes with different protocols in each subtype

Among embryo transfer strategies, the transfer of fresh embryos was associated with a lower live birth rate (OR 0.71, 95% CI 0.51–0.99, P = 0.042) and clinical pregnancy rate (OR 0.65, 95% CI 0.46–0.92, P = 0.016) compared to frozen embryo transfer in women with the HA-PCOS subtype (Table 2). No significant differences were observed in live birth rates, clinical pregnancy rates or total pregnancy loss rates between fresh and frozen embryo transfer in the other three subtypes, indicating that women with HA-PCOS may benefit from frozen embryo transfer.

Furthermore, when comparing ovarian stimulation (OS) cycles with natural cycle (NC) or hormone replacement therapy (HRT) for endometrial preparation before frozen embryo transfer, our analysis revealed that HRT yielded the lowest live birth rate and clinical pregnancy rate in women with the LH-PCOS subtype, as well as the highest pregnancy loss risk in both OB-PCOS and LH-PCOS groups (Table 2). A summary of the clinical features of each subtype is shown in Extended Data Table 6.

Discussion

Accurate and globally accepted classification of PCOS clinical subtypes is essential in addressing this complex disorder associated with reproductive, metabolic as well as psychological dysfunction, although the latter was not addressed in this study. We identified four distinct PCOS subtypes (HA-PCOS, OB-PCOS, SHBG-PCOS and LH-PCOS) in a large discovery cohort. These subtypes were validated in five independent cohorts with different ethnicities around the world. Distinct reproductive and metabolic comorbidities and IVF treatment outcomes were identified in the subtypes.

As with other complex conditions, PCOS has been linked to several comorbidities, including fertility and pregnancy complications, hyperinsulinemia, T2DM and cardiovascular disorders. Accordingly, PCOS has no single diagnostic marker to provide a gold standard12. The Rotterdam criteria have indisputably improved prognostic capabilities for predicting reproductive outcomes in women with PCOS. However, they may overlook some key heterogeneities that result in potentially severe complications, which can be averted by incorporating clinical phenotypic data, and determining the women at risk of such complications based on their complex PCOS subtype. For the above purposes, we have developed PcosX (www.pcos.org.cn), a web-based tool designed to assign women with PCOS to specific subtypes, provided the necessary clinical variables have been measured.

A metabolic, a reproductive and an indeterminate sub-phenotype of PCOS are proposed in ref. 13. This classification system was established using 1,156 women from the USA who were diagnosed with PCOS according to the relatively strict National Institutes of Health (NIH) criteria. Here we established an unsupervised clustering model and defined four clinical subtypes of PCOS based on the broader Rotterdam diagnostic criteria, using a cohort ten times larger than that used for PCOS classification in ref. 13. We then validated the model in five PCOS cohorts of different ethnicities from different geographic locations. Comparison with this previously developed PCOS subtyping model showed that the metabolic subtype defined in ref. 13 largely overlapped with our HA-PCOS and, to a lesser extent, OB-PCOS subtype, which may be due to the hyperandrogenism requirement in the NIH criteria. However, our follow-up studies show that the HA-PCOS subtype clearly differs from a simple metabolic subtype. By contrast, the indeterminate group was largely distributed between our HA-PCOS and LH-PCOS subtypes, possibly because of the inclusion of AMH as a variable. Our clustering analysis included AMH among the nine features, serving as an alternative to antral follicle count for assessing polycystic ovarian morphology (PCOM), as recommended in the recently updated International Evidence-based Guideline of PCOS14. Furthermore, although the different subtypes of PCOS and components of the Rotterdam criteria can predict reproductive outcomes to some extent, key predictors such as insulin and AMH are not included in the Rotterdam criteria12. We adopted a different approach by integrating AMH into our analysis and studying a diverse cohort of women with PCOS. This allowed us to identify four subtypes with unique reproductive, metabolic and IVF treatment outcomes on longitudinal follow-up.

Importantly, the four PCOS subtypes are closely associated with distinct clinical characteristics and outcomes. The HA-PCOS subtype is linked to metabolic diseases including obesity, MASLD, T2DM, hypertension and dyslipidemia during follow-up. Notably, HA-PCOS demonstrated a higher incidence of dyslipidemia and severe MASLD compared to the OB-PCOS subtype, which is in line with previous reports demonstrating that hyperandrogenism is associated with an increased risk of lipid dysfunction15. Moreover, the HA-PCOS subtype was associated with an increased risk of second trimester pregnancy loss and premature rupture of membranes compared to other subtypes. Indeed, previous reports have shown that higher maternal testosterone levels are linked to low birth weight in offspring and an increased risk of preterm delivery, accompanied by fetal membrane rupture and cervical dilatation16. Thus, higher androgen levels might increase maternal plasma estrogen, oxytocin and amnion fibronectin levels, leading to premature rupture of membranes.

OB-PCOS is a metabolic subtype with the highest rate of PCOS remission during follow-up. As the disease progresses, women with OB-PCOS primarily develop metabolic disorders such as T2DM and hypertension, while their reproductive endocrine abnormalities tend to become less pronounced. This suggests that metabolic triggers contribute to the reproductive alterations such as low IVF success rates due to poor oocyte quality, reduced implantation rates, high pregnancy loss rates, preterm birth and ultimately low live birth rates, all of which are linked to the OB-PCOS subtype. Therefore, the long-term complications of metabolic syndrome should be emphasized in the management of OB-PCOS. Women with OB-PCOS are also more prone to pregnancy complications, including hypertensive disorders, gestational diabetes, preterm birth and cesarean delivery17. Although it is difficult for these women to conceive or have a healthy delivery, medications used during IVF or throughout pregnancy may improve their chances of conception and better outcomes.

SHBG-PCOS represents the mildest form of PCOS and has the best IVF outcomes, although it often presents with irregular cycles or PCOM. It appears to exhibit relatively mild neuroendocrine, androgenic and metabolic features, with primary abnormalities related to ovulatory dysfunction. SHBG, produced by the liver, binds to circulating sex steroids, affecting their bioavailability by sequestering androgens and estrogens from biological action. Several clinical studies have highlighted the potential role of SHBG in maintaining glucose homeostasis, because low levels of SHBG are strongly associated with an increased risk of T2DM18. Hence, the clinical features and IVF outcomes of the SHBG-PCOS subtype are almost the opposite of those observed in the OB-PCOS subtype. It is also worth considering that the follicle cutoff of 12 used in this study is below the current recommendation threshold of 20 follicles, meaning that this subtype may include women who would not meet the diagnostic criteria of the current International Evidence-based Guideline of PCOS. In addition, the lack of continuous menstrual cycle data may have influenced the classification of participants in this subtype, highlighting a need for further investigation in future research.

LH-PCOS showed the worst disease remission of PCOS at follow-up, indicating that the effects of high LH and AMH levels on PCOS may not be ameliorated by pregnancy or IVF procedures. Moreover, special attention should be given to the LH-PCOS subtype, because it is associated with the most typical reproductive characteristics of PCOS and carries an exceptionally high risk of OHSS. This heightened risk is likely due to the significantly elevated AMH levels in LH-PCOS, which are strongly associated with antral follicle count and serve as a reliable predictor of OHSS in IVF cycles. Although LH levels are also elevated in LH-PCOS, their ability to predict OHSS is relatively limited. Instead, it is possible that variants or dysregulation of the LH receptor, rather than the absolute levels of LH, may play a key role in the increased likelihood of OHSS19.

Fertility outcomes and complications are a major concern for infertile women with PCOS. Different outcomes from various intervention strategies, particularly when tailored to specific PCOS subtypes, were assessed to potentially provide clinicians with more IVF options for managing different subtypes. Our previous study20 showed that frozen embryo transfer results in a higher live birth rate than fresh embryo transfer in women with PCOS. In this study, we further found that the benefits of frozen embryo transfer are specific to women with the HA-PCOS subtype. For the other three PCOS subtypes, the more cost-effective fresh embryo transfer may serve as a preferred option.

In general, PCOS is a highly complex syndrome, exhibiting substantial heterogeneity in its etiology and clinical presentations. It is closely associated with a variety of adverse outcomes. To date, its etiology and symptoms have defied a singular explanation, making it challenging to apply trial data or individual clinical features for directly predicting outcomes. Moreover, no existing model has been proven to accurately forecast outcomes in PCOS. Our findings suggest that clustering provides a more comprehensive understanding of PCOS than examining isolated clinical features. This clustering model aligns with the multi-faceted metabolic and reproductive nature of the disease. Reliable subtypes serve as the foundation of precision medicine, and identifying these genuine subtypes requires analyzing data on a population scale, defining robust biomarkers for each subtype, refining diagnoses based on these subtype-specific markers and ultimately predicting treatment responses and disease remission through extensive research and validation.

Because this study represents a step in the long journey toward establishing and implementing clinical precision medicine practices in PCOS, several limitations should be considered when interpreting the results. First, forced patient clustering can create arbitrary groupings and discard valuable continuous information. In our study, we used k‑means clustering for this large dataset, while acknowledging that other unsupervised clustering or machine‑learning methods could also be applied for PCOS classification. Different techniques may yield varying results, especially for cases near cluster boundaries where sub‑phenotypic features overlap. By rigorously validating the clusters across international cohorts and assessing their clinical interpretability, we ensure that the resulting clusters are both statistically robust and clinically meaningful, and future methodological advances may further refine such classifications. In addition, we did not include categorical variables such as menstrual cycle regularity in the analysis. This decision was based on the self-reported nature of the data in our cohort, which introduced potential recall bias and reduced reliability. In addition, as a nonordered categorical variable, menstrual cycle regularity poses challenges for integration into current clustering algorithms without compromising interpretability. We acknowledge that this exclusion may have impacted the clustering results, because menstrual cycle regularity is a key feature of PCOS. Future research should prioritize the inclusion of accurate and objective measures of menstrual cycle regularity, such as data collected via digital tracking apps, to enhance the robustness of clustering analyses.

This study also has limitations related to recruitment methods. A substantial proportion of the participants were lost during the data completion phase. In addition, most participants—excluding those from the Turkish and Brazilian cohorts—were recruited from specialized reproductive centers. This may have introduced selection bias, favoring certain PCOS subtypes. The relatively small size of the validation cohorts from other countries further highlights the need for future validation in larger and more diverse populations to confirm the model’s applicability across different ethnicities. Moreover, follow-up data on disease remission were not available for the validation cohorts. For consistency across the study, we used the diagnostic threshold of 12 follicles per ovary for PCOM, in line with the 2003 Rotterdam criteria. This threshold is lower than the updated recommendations in more recent guidelines, which take into account improvements in ultrasound sensitivity. This discrepancy may limit the comparability of our findings with studies using updated diagnostic criteria. Also, follow-up via telephone interviews may introduce recall bias and lack clinical verification. To mitigate this, we incorporated in-person assessments, including blood draws, for a substantial proportion of participants to ensure objective data collection. However, we acknowledge that not all participants underwent in-person evaluations, which may limit the comprehensiveness of some follow-up data. Ongoing efforts will integrate additional in-person assessments to address this limitation and enhance the robustness of our findings. The control group was not included in the follow-up, precluding direct comparison of PCOS subtypes with controls. In addition, considering the dietary and geographical differences, the definition of hyperlipidemia in this study was based on Chinese criteria, which may differ in certain aspects from other international criteria. 
Lastly, it is important to note that when considering clinical applications, although our findings primarily focus on IVF outcomes, they do not include data on lifestyle interventions, ovulation induction or other first-line treatments for PCOS. Hence, the pregnancy outcomes reported here may not reflect the general outcomes for all women with PCOS21. Further research is needed to explore how this classification can be applied to other treatment strategies and their impact on patient outcomes.

Substantial work remains in both clinical and basic research to further develop and refine this classification. Because some of the variables used in the present model are not routinely recommended in current guidelines, sensitivity, specificity and cost evaluations need to be considered in the future application. Future studies should investigate continuous measures of ovarian or menstrual cycle dysfunction, genetic structures, epigenetic features and proteomic or metabolomic biomarkers for each subtype. Identifying additional variables could help refine cluster classifications, thereby enhancing our understanding of this complex yet often neglected syndrome, PCOS. Validation of these subtypes must be conducted in more diverse, unbiased and nonselected community-based populations to ensure generalizability. Furthermore, the statistical analysis of treatment outcomes should include more rigorous stratification and adjustments during follow-up. Incorporating contemporary diagnostic criteria, performing economic evaluations and developing robust translation tools will be essential for assessing the clinical utility of these clusters. These steps will ultimately support their integration into clinical practice and improve precision medicine approaches for PCOS.

This study, however, has multiple strengths, one of which is that it combines PCOS cohorts around the world. We propose a clinically relevant disease classification model, validated across diverse populations from different regions globally. For each subtype, we have identified their respective characteristics, highlighting the similarities and differences in the risk of metabolic disease, pregnancy complications and the variations in assisted reproductive outcomes. Furthermore, through long-term follow-up, we have revealed changes in the reproductive and metabolic features of each subtype. This provides crucial scientific evidence for early identification and enables early interventions for disease prevention.

In conclusion, our study identifies four PCOS subtypes, each with unique reproductive, metabolic and prognostic characteristics. These subtypes provide important insights into the heterogeneity of PCOS and highlight the potential for more personalized treatment approaches in clinical practice. However, continued international collaboration and research are essential to address the limitations identified and to ensure the reliability and effectiveness of these subtypes for personalized patient care. This collaborative effort will be crucial in refining classification systems and therapeutic strategies, ultimately paving the way for more tailored and effective clinical management of PCOS.

Methods

Ethics statement

Study protocols were approved by the relevant ethics review committees (China, [2021] IRB-No.140; Singapore, 2011/01716-SRF0012)22,23,24,25. All participants provided written informed consent. For information purposes, all cohorts were registered as an international multicenter study on clinicaltrials.gov (NCT06124391).

Study populations

This study involved a discovery cohort and five validation cohorts of participants with PCOS. All participants in the cohorts were between 20 and 45 years old. No transgender participant was included. PCOS was diagnosed using the broader Rotterdam diagnostic criteria26, which require the presence of any two of the following: (1) menstrual cycle length of <21 days or >35 days, and/or fewer than 8 cycles per year; (2) hyperandrogenism defined as an elevated total testosterone level according to local laboratory criteria, and/or a modified Ferriman–Gallwey score ≥5; (3) the presence of 12 or more follicles measuring 2–9 mm in diameter in each ovary and/or an ovarian volume >10 ml as determined by ultrasound. All clinical assessments were performed and recorded at the time of diagnosis.
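The two-of-three rule above can be written as a short check. This is a minimal sketch, not the study's code: the function and parameter names are hypothetical, `testosterone_elevated` stands in for the local laboratory's own cut-off, and `min_follicles_per_ovary` assumes the lower of the two ovarian follicle counts is supplied.

```python
def meets_rotterdam(cycle_length_days, cycles_per_year, testosterone_elevated,
                    mfg_score, min_follicles_per_ovary, ovarian_volume_ml):
    """Return True when at least two of the three criteria are present."""
    # (1) Cycle length <21 or >35 days, and/or fewer than 8 cycles per year.
    irregular_cycles = (cycle_length_days < 21 or cycle_length_days > 35
                        or cycles_per_year < 8)
    # (2) Elevated total testosterone (local laboratory criteria, passed in as
    #     a boolean) and/or a modified Ferriman-Gallwey score >= 5.
    hyperandrogenism = testosterone_elevated or mfg_score >= 5
    # (3) 12 or more follicles of 2-9 mm in each ovary and/or volume > 10 ml.
    polycystic_morphology = (min_follicles_per_ovary >= 12
                             or ovarian_volume_ml > 10)
    return sum([irregular_cycles, hyperandrogenism, polycystic_morphology]) >= 2
```

For example, a participant with 40-day cycles and a modified Ferriman–Gallwey score of 6 satisfies criteria (1) and (2) and would be classified as having PCOS regardless of ovarian morphology.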

The discovery cohort conducted at the Center for Reproductive Medicine, Shandong University, Shandong Province, China, comprised 47,071 women with PCOS diagnosed using the broader Rotterdam diagnostic criteria between December 2013 and June 2020.

Among the full cohort, 11,908 women were not receiving any therapy that could alter hormone levels or the ovulation cycle (such as oral contraceptives, metformin or weight loss regimens) at the time of their first visit and were therefore included in the subsequent statistical analysis. Women with PCOS who had received such therapies at the time of enrollment, and who therefore did not meet the Rotterdam diagnostic criteria in terms of clinical manifestations or test parameters, were excluded (n = 35,163).

The validation cohorts were from China, the USA, Europe, Singapore and Brazil. The China validation cohort comprised 3,081 PCOS cases from various regions in China, including East China, South China, North China, Northwest China and Southwest China, excluding cases from the Center for Reproductive Medicine, Shandong University, Shandong Province, China. The US cohort had 750 participants with PCOS from the Pregnancy in Polycystic Ovary Syndrome II (PPCOS II, NCT00719186) trial, including European, African American and other races and ethnicities based on their country or area of birth23. The Europe cohort had 392 cases from Turkey27 and 180 cases from Sweden28,29,30,31. The Singapore cohort consisted of 428 cases32. The Brazil cohort had 100 cases24,25.

Feature selection

A total of 29 baseline clinical features that are routinely tested in the clinic and associated with endocrine or metabolic parameters in PCOS were initially examined in the discovery cohort (detailed in Extended Data Table 1). Categorical variables were excluded, because they cannot be accommodated in most unsupervised clustering methods. Features with more than 30% missing data, which might be caused by the subjective bias of the physicians, were excluded from further analysis. K-nearest neighbor imputation with k = 10 was performed to impute the missing values for the remaining features, using the DMwR (v.0.1.4) package in R33. Spearman correlation was performed to analyze the correlations among the clinical features. Principal component analysis was then used to evaluate each feature’s contribution to the overall variance in the dataset, identifying those variables that provided the most distinctive information relevant to clustering. To minimize covariance and redundancy, within each pair of strongly correlated features (correlation coefficient >0.7), the feature with the lower contribution (<10% in the first three principal components) was also excluded. Finally, an exploratory factor analysis was performed. Features with factor loadings <0.4 were excluded to ensure that only those strongly associated with the identified factors were retained. This approach resulted in the selection of nine continuous variables (BMI, LH, FSH, testosterone, SHBG, DHEA-S, AMH, fasting insulin and fasting glucose) for clustering analysis.
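The first two steps of this pipeline (dropping sparse features, then k-nearest neighbor imputation) can be illustrated as follows. This is a simplified sketch rather than a reproduction of the DMwR routine: it fills each gap with the unweighted mean of the k nearest rows, it assumes the features have already been put on comparable scales, and the function names and the dictionary-per-participant layout are hypothetical.

```python
import math

def drop_sparse_features(rows, max_missing=0.30):
    """Keep the features whose missing fraction does not exceed max_missing."""
    names = list(rows[0].keys())
    n = len(rows)
    return [f for f in names
            if sum(r[f] is None for r in rows) / n <= max_missing]

def knn_impute(rows, features, k=10):
    """Fill each missing value with the mean over the k nearest donor rows."""
    def distance(a, b):
        shared = [f for f in features if a[f] is not None and b[f] is not None]
        if not shared:
            return math.inf
        # Root mean squared difference over the features both rows have,
        # so rows with different numbers of observed features stay comparable.
        return math.sqrt(sum((a[f] - b[f]) ** 2 for f in shared) / len(shared))

    completed = []
    for row in rows:
        filled = {f: row[f] for f in features}
        for f in features:
            if row[f] is None:
                # Donors are other rows in which this feature was observed.
                donors = [o for o in rows if o is not row and o[f] is not None]
                donors.sort(key=lambda o: distance(row, o))
                nearest = donors[:k]
                if nearest:
                    filled[f] = sum(d[f] for d in nearest) / len(nearest)
        completed.append(filled)
    return completed
```

The correlation, principal component and factor analysis steps are not sketched here; they would normally rely on a statistics library rather than hand-written code.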

Feature measurements

Blood samples were drawn at the first consultation, and fasting plasma glucose and insulin were analyzed after overnight fasting. A total of 29 baseline clinical features related to endocrine or metabolic disorders were included in the study, comprising age, height, weight, BMI, systolic pressure, diastolic pressure, LH, FSH, estradiol, testosterone, prolactin, progesterone, SHBG, AMH, DHEA-S, thyroid stimulating hormone, alanine transaminase, aspartate transaminase, gamma-glutamyl transferase, albumin, triglyceride, total cholesterol, high-density lipoprotein, low-density lipoprotein, fasting glucose, fasting insulin, ultrasound antral follicle counts, menstrual cycle history and age at menarche.

Reproductive steroid hormone levels were measured by chemiluminescence immunoassay during days 1–3 of menstruation (corresponding to the early follicular phase) for ovulatory women or at any time for anovulatory women. The levels of AMH and biochemical parameters were measured by enzyme-linked immunosorbent assay. The antral follicle count was assessed by transvaginal ultrasound.

Unsupervised clustering analysis

An initial cluster analysis was conducted in the discovery cohort of Chinese PCOS cases. All continuous variables were normalized using z-score transformation. K-means clustering analysis was chosen for unsupervised clustering based on methods in a previous study34. To determine the optimal number of clusters, k values from 3 to 8 were first evaluated to identify the maximum average silhouette width (with 30 iterations) using the fpc package (v.2.2-10) in R v.4.0.3. We found that k = 4 resulted in the highest average silhouette width in sensitivity analyses assessing the fit of individual objects in the classification, thus indicating that four clusters provided the most stable classifications (Supplementary Fig. 2). Ultimately, clustering with k = 4 was performed with Manhattan distance as the dissimilarity measure, using the cclust function in the flexclust package (v.1.4-0) in R. Cluster stability was assessed by computing Jaccard similarities across 1,000 bootstrap resampling iterations35.
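The workflow described above (z-score normalization, clustering under Manhattan distance, and comparison of k by average silhouette width) can be sketched in a few functions. This is an illustrative approximation, not the fpc or flexclust implementation: it assumes cluster centers are updated as column-wise medians, which is the usual companion of Manhattan distance, and the function names, random initialization and seed are hypothetical.

```python
import random
import statistics

def zscore_columns(data):
    """Standardize each column to mean 0 and sample standard deviation 1."""
    cols = list(zip(*data))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.stdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in data]

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def cluster_manhattan(data, k, iterations=30, seed=0):
    """Assign each point to its nearest center; update centers as medians."""
    rng = random.Random(seed)
    centers = rng.sample(data, k)
    labels = [0] * len(data)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: manhattan(p, centers[j]))
                  for p in data]
        for j in range(k):
            members = [p for p, lab in zip(data, labels) if lab == j]
            if members:  # keep the previous center if a cluster empties
                centers[j] = [statistics.median(c) for c in zip(*members)]
    return labels

def mean_silhouette(data, labels):
    """Average silhouette width under Manhattan distance."""
    widths = []
    for i, p in enumerate(data):
        by_cluster = {}
        for j, q in enumerate(data):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(manhattan(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            widths.append(0.0)  # singleton cluster or a single cluster overall
            continue
        a = sum(own) / len(own)                                # cohesion
        b = min(sum(d) / len(d) for d in by_cluster.values())  # separation
        denom = max(a, b)
        widths.append((b - a) / denom if denom else 0.0)
    return sum(widths) / len(widths)
```

In use, one would loop over k from 3 to 8, call `cluster_manhattan` on the standardized matrix for each k, and keep the k with the largest `mean_silhouette`. The pairwise silhouette computation is quadratic in the number of participants, so a cohort of this size would call for a library implementation or subsampling.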

Subtype validation

Likewise, cases with eligible data from the validation cohorts were used to assess the reproducibility of the clustering results, as in previous studies34,36,37. The same nine clinical features in each validation cohort were first standardized using z-score transformation. Each cohort was individually clustered with the nine standardized features using k-means with the same parameter setting (k = 4) as the discovery cohort.

Ridge regression analysis

To maximize the potential of the classification for clinical application in PCOS populations, we performed ridge regression analysis in the discovery cohort using the glmnet R package (v.4.1-3) with nine normalized model variables and multinomial outcome prediction for four subtypes. The lambda value was determined by assessing model performance with the ‘cv.glmnet’ function, which uses tenfold cross-validation. The ridge regression analysis output comprised four ridge regression equations, each corresponding to one subtype, of the form ‘Y = β1x1 + β2x2 + … + β9x9’ (in which x1 to x9 are the nine normalized variables and Y is the probability that a given individual belongs to this subtype, with values ranging from zero to one).

To verify the accuracy of these ridge regression equations for each subtype in the discovery cohort, we used nine features from each participant in the validation cohorts. The features were used as inputs for each of the four equations to compute the probability of a participant belonging to that subtype. Each individual was then assigned a corresponding label of subtype based on the highest value among the four calculated probabilities.
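The assignment step can be illustrated with a small sketch. It assumes, as in a standard multinomial logistic model, that each equation yields a linear score and that the four scores are converted to probabilities with a softmax; the intercept terms, function names and any coefficient values passed in are hypothetical and are not the fitted values from this study.

```python
import math

def subtype_probabilities(z_features, intercepts, coefficients):
    """Softmax over one linear score per subtype."""
    scores = [b0 + sum(b * x for b, x in zip(betas, z_features))
              for b0, betas in zip(intercepts, coefficients)]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def assign_subtype(z_features, intercepts, coefficients):
    """Return (index of the most probable subtype, list of probabilities)."""
    probs = subtype_probabilities(z_features, intercepts, coefficients)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs
```

Here `z_features` holds the participant's standardized values in the same order as each row of `coefficients`, with one row per subtype.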

Following this process, the subtype labels previously obtained in each validation cohort via unsupervised k-means clustering served as the reference for ROC curve analysis. Each individual’s subtype label, assigned from the predictive values obtained with the ridge regression equations, was then used to generate ROC curves. The AUC for each subtype was subsequently calculated to evaluate the accuracy of the predictions made by the ridge regression equations. An AUC closer to 1 indicates better consistency between the ridge regression results and the unsupervised clustering results.
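A one-vs-rest AUC of this kind can be computed directly from its rank interpretation. The sketch below assumes that the predicted probability for a subtype is used as the score and the k-means label as the reference; the function name and arguments are hypothetical.

```python
def one_vs_rest_auc(reference_labels, scores, subtype):
    """Probability that a random member of `subtype` receives a higher score
    than a random non-member, with ties counted as one half."""
    pos = [s for s, lab in zip(scores, reference_labels) if lab == subtype]
    neg = [s for s, lab in zip(scores, reference_labels) if lab != subtype]
    if not pos or not neg:
        raise ValueError("both members and non-members of the subtype are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A value of 0.5 corresponds to chance-level agreement and 1.0 to perfect separation of the reference subtype from the others.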

Longitudinal follow-up

A total of 9,601 women with PCOS from the discovery cohort were diagnosed between 2014 and 2018. Of these, 4,542 women voluntarily participated in telephone follow-ups between March 2021 and August 2024. In addition, 523 participants volunteered to undergo a physical examination, which included blood sample collection at the hospital. The median follow-up period for this cohort was 6.5 years (interquartile range 6.16–6.85). Remission of PCOS and its associated endocrine features was defined as the percentage of people who still had the disease or features at the time of the follow-up. The diagnostic criteria of PCOS were still based on the same Rotterdam criteria as at baseline. Use of hormonal contraception in the past six months was adjusted for in the hyperandrogenic and oligo-ovulation assessments.

Content of follow-up

The follow-up telephone interview included: (1) current height, weight and menstrual cycle status; (2) number of pregnancies and births within the past years; (3) whether the participant had received IVF treatment during the follow-up years; and (4) current disease status of PCOS, T2DM, hypertension and dyslipidemia based on whether the participant had received treatment or been diagnosed through physical examination within the last six months.

Physical examination included the following measurements: (1) anthropometric measurements such as height, weight, blood pressure, waist and hip circumference; (2) clinical evaluation for signs of hyperandrogenism; (3) laboratory tests for endocrine and metabolic disorders (same as at baseline); (4) ultrasound examination of the ovaries (antral follicle number and ovarian volume) and liver (fat content); and (5) medication use within the past six months.

Outcomes of chronic metabolic diseases

BMI was calculated as weight (kg) divided by the square of height (m2) and was classified based on criteria for the Chinese population38: (1) normal weight, BMI of 18.5–23.9 kg m−2; (2) overweight, BMI of 24–27.9 kg m−2; and (3) obesity, BMI ≥ 28 kg m−2. Degree of MASLD was assessed using abdominal ultrasound performed by an experienced ultrasonographer. Hypertension was defined as a systolic blood pressure ≥140 mmHg and/or diastolic blood pressure ≥90 mmHg. T2DM was defined as a fasting glucose ≥7.0 mmol l−1 (ref. 39). Dyslipidemia was defined as the presence of any of the following abnormalities: (1) total cholesterol ≥5.2 mmol l−1; (2) triglycerides ≥1.7 mmol l−1; (3) high-density lipoprotein <1.0 mmol l−1; and (4) low-density lipoprotein ≥3.35 mmol l−1 (ref. 40).
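These outcome definitions translate directly into threshold checks. The following is a minimal sketch with hypothetical function names; it treats the BMI bands as contiguous (values between 23.9 and 24 are counted as normal weight), and the "underweight" label for BMI below 18.5 is an assumption, as that category is not defined in the text.

```python
def bmi_category(weight_kg, height_m):
    """Classify BMI (kg per square meter) with the cut-offs given above."""
    bmi = weight_kg / height_m ** 2
    if bmi >= 28:
        return "obesity"
    if bmi >= 24:
        return "overweight"
    if bmi >= 18.5:
        return "normal weight"
    return "underweight"  # assumption: label not defined in the text

def has_hypertension(systolic_mmhg, diastolic_mmhg):
    return systolic_mmhg >= 140 or diastolic_mmhg >= 90

def has_t2dm(fasting_glucose_mmol_l):
    return fasting_glucose_mmol_l >= 7.0

def has_dyslipidemia(total_chol, triglycerides, hdl, ldl):
    """Any one abnormal lipid value (all in mmol per liter) is sufficient."""
    return (total_chol >= 5.2 or triglycerides >= 1.7
            or hdl < 1.0 or ldl >= 3.35)
```

For instance, a participant weighing 70 kg at 1.60 m has a BMI of about 27.3 and falls in the overweight band.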

Outcomes of IVF

In the discovery cohort, IVF data were available for 5,418 participants. To compare the IVF outcomes of different PCOS subtypes with those of a control population, we included a control group that underwent IVF treatment in the same clinic and met one of the following criteria: (1) infertility due to fallopian tubal adhesion or blockage, without any PCOS features; (2) infertility due to oligozoospermia, asthenospermia or abnormal spermatozoa in their male partner.

The live birth rate, pregnancy rate and pregnancy loss were calculated as the primary outcomes of IVF. Conception is diagnosed by serum human chorionic gonadotropin ≥10 mIU ml−1. Clinical pregnancy is defined as detection of a gestational sac in the uterine cavity. We define first trimester pregnancy loss as pregnancy loss before the end of the 11th gestational week by miscarriage or stillbirth, and second trimester pregnancy loss as pregnancy loss during the 12th gestational week to the end of the 27th gestational week by miscarriage or stillbirth arising from fetal abnormalities or maternal factors, extreme spontaneous preterm birth or iatrogenic preterm birth.

Secondary outcomes were maternal and neonatal complications. Preterm delivery is defined as a live birth during the 28th gestational week to the end of the 36th gestational week, including iatrogenic preterm delivery and spontaneous preterm delivery. Premature rupture of membranes is defined as membrane rupture after the 28th gestational week, including preterm premature rupture of membranes, which is also classified as spontaneous preterm delivery. SGA and LGA were determined on the basis of birth weight reference percentiles for Chinese populations, which were adjusted for sex and gestational age41. Birth weight lower than the 10th percentile of the reference was defined as SGA and birth weight higher than the 90th percentile as LGA.
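The neonatal definitions can be expressed as two small helpers. This sketch assumes the sex- and gestational-age-specific 10th and 90th percentile weights have already been looked up from the reference tables and are passed in by the caller; the function names and the "AGA" label for infants between the two percentiles are assumptions not taken from the text.

```python
def size_for_gestational_age(birth_weight_g, p10_g, p90_g):
    """Compare birth weight with the reference 10th and 90th percentiles."""
    if birth_weight_g < p10_g:
        return "SGA"
    if birth_weight_g > p90_g:
        return "LGA"
    return "AGA"  # assumption: label for weights between the two percentiles

def is_preterm(gestational_week_at_live_birth):
    """Live birth from the 28th to the end of the 36th gestational week."""
    return 28 <= gestational_week_at_live_birth <= 36
```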

Logistic regression was used to calculate the ORs for each subtype compared to controls.
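For orientation, an unadjusted odds ratio can be computed from a two-by-two table of event counts. This is only a crude illustration with hypothetical names and a Wald confidence interval; it does not include the covariate adjustment that a logistic regression model provides.

```python
import math

def crude_odds_ratio(events_subtype, n_subtype, events_control, n_control, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval (95% by default)."""
    a = events_subtype
    b = n_subtype - events_subtype
    c = events_control
    d = n_control - events_control
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells of the table must be positive")
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper
```

With 20 events among 100 women in a subtype and 10 events among 100 controls, the crude odds ratio is (20 × 90) / (80 × 10) = 2.25.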

Statistical analysis

All statistical analyses were performed using SPSS v.26 and R v.4.0.3. Shapiro–Wilk tests were used to assess the normality of the variables. Continuous variables were compared using Student’s t-test or analysis of variance, with natural logarithmic transformation of non-normally distributed data. Post-hoc comparisons among subtypes were performed using either Bonferroni or Dunnett T3 correction. Categorical variables were compared with either χ2 or Fisher’s exact test. ORs were calculated using logistic regression analysis, comparing cases with controls under two adjustment models: (1) age and ovarian stimulation methods, and (2) age and fresh or frozen embryo transfer.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.