Introduction

The symptoms attributed to any life stage must be viewed against an ever-evolving background of population health, and those associated with the menstrual cycle and menopause are in need of an update. Menopausal symptoms received increasing medical attention in the mid-nineteenth century1, and premenstrual syndrome (PMS) has been medically acknowledged since at least 19352. Although the baseline frequency of metabolic and female reproductive health problems (e.g., polycystic ovarian syndrome, obesity) has increased greatly since these eras, the clinical description of cycle-associated and climacteric symptoms has changed minimally. Health in America has degraded rapidly in the twenty-first century, with well-documented increases in overweight and obesity3,4, metabolic disease5,6, and potentially poorer coping with perceived stress7, all of which can impact reproductive function8,9 and menopause10,11,12,13. In particular, the recent years of the COVID-19 pandemic comprise a uniquely stressful and isolating time, the health effects of which are still being described14,15. This changing background of general “health” in the population suggests that the experience of female reproductive life stages may differ from the reports of the 1990s or even the early 2000s. Because life expectancy of women has now reached 80 (almost 35 years longer than at the turn of the twentieth century16), a greater proportion (40%) of a woman’s lifespan is spent in menopause and, consequently, with menopausal symptoms. The interaction of the changing baseline health environment with aging physiology in a population in which menopausal women will become an estimated half of females by 203017 creates a strong need for a “living profile” of symptoms and symptom clusters. These clusters can provide an “update” for clinicians as well as context for evaluating efficacy of new treatments for women of all ages.

Today, methods like symptom clustering have yielded objective diagnostic criteria and more specialized treatment in fields ranging from psychiatry18 to cardiology19,20, gastroenterology21, and female reproductive health10,11,22,23,24,25,26. Numerous efforts have been made to cluster symptoms among premenopausal to menopausal women using longitudinal clinical datasets and surveys10,11,24,27,28,29, and a few studies have employed self-collected data from menstrual tracking apps to evaluate premenopausal women30,31. Despite these efforts, consensus is lacking regarding which symptoms (1) present-day women experience, (2) consistently predict the occurrence of other symptoms, and (3) cluster to each hormonal life stage, and regarding (4) what distinct phenotypes exist within life stages that may be served by different treatments.

Large-scale efforts have described potential sub-groups of menopausal experience. Analyses are frequently directed at the open-access Study of Women’s Health Across the Nation (SWAN) dataset32, which followed 3289 patients at biannual office visits across 16 years, therefore capturing the full menopausal transition of many women. Although these infrequent visits relied heavily on recall at the time of doctors’ appointments, the dataset yielded many insights into the variety of symptoms experienced in menopause, potential trajectories, and influential factors. However, it has yet to generate a consistent picture of “the menopausal experience” or even consistent “menopausal subtypes”. For example, Harlow et al.10 used latent transition analysis to evaluate symptom relatedness in the SWAN dataset, identifying symptom severity clusters ranging from relatively asymptomatic to highly symptomatic. They additionally reported two distinct symptom cluster types: fatigue and psychosocial, versus vasomotor (VMS), sleep, and fatigue. A more recent analysis of a 557-woman subgroup with metabolic syndrome11 used latent class growth analysis to identify very different symptom clusters: sleep and urinary problems; VMS and vaginal dryness; and psychological, joint, and sexual dysfunction.

A separate menopausal cohort study of 971 women24, the Women’s Wellness Research Study, identified a further set of symptom clusters using a single timepoint survey: psychological, fatigue, and sleep; VMS; pain and numbness; and panic attacks and racing heart. Moreover, the authors identified more numerous and more severe symptoms in women with a history of breast cancer, who reported nearly double the rate of VMS and low libido. Woods et al.29 drew on the Seattle Midlife Women’s Health Study, which collected annual surveys of 508 menopausal and perimenopausal women. This effort identified only mildly symptomatic, moderately symptomatic, and highly symptomatic clusters. Finally, studies of populations around the world have suggested that culture and genetic background impact menopausal experience. An early study conducted via phone interview of 1900 Chinese women identified lower prevalence of VMS, and a peak of symptoms during perimenopause27. Five symptom clusters were identified: muscular and GI pain; psychological; respiratory; VMS and sleep disturbance; and non-specific somatic (fatigue, dizziness, headache). Subsequent studies have confirmed lower incidence of VMS and higher pain reporting in Asian women33. Additional differences may be present, with African American women exhibiting higher rates of VMS34,35 and Caucasian and Asian women reporting higher rates of psychological symptoms28.

Considering the above-named studies, totaling over 7000 women collectively, it remains unclear if consistent subtypes of menopause exist, and what physiological factors are responsible. Woods et al.29, Harlow et al.10, and Min et al.11 all make clear that external health and socioeconomic factors (breast cancer history, financial stress, ethnicity, and obesity/metabolic syndrome) can worsen the number and severity of symptoms, but do not provide consensus otherwise. Such different findings may be amplified by differences in data collection, analytical methods, number and range of reportable symptoms, symptom severity metrics, and changing perception of symptoms collected from different cultures in different decades.

In an attempt to reconcile results across analysis methods and life stages, we analyzed the characteristics of user-reported symptoms collected from a smartphone application designed to capture up to 45 symptoms in daily life. Analyzing symptom reporting across the continuum of pre- to peri- to post-menopause enabled us to distinguish between symptoms that are dependent on and independent of life stage. In addition, we aimed to avoid the bias inherent in each analytical method by employing several standard clustering methods: hierarchical clustering analysis (HCA) of symptom covariance, K-Means clustering of principal components generated from symptoms, and binomial network analysis. We hypothesized that the most common symptoms among premenopausal women would be associated with the menstrual cycle (e.g., cramps, ovulation pain, breast swelling, spotting). We hypothesized that VMS would emerge in perimenopause, and menstrual-associated symptoms would disappear by menopause. Finally, we hypothesized that symptom patterning would vary by cycle regularity and by type of menopause. Comparison of premenopausal through menopausal populations enabled us to distinguish between symptoms that depend on life stage and symptoms common to present-day women independent of life stage.

Results

Study population

Using a smartphone-based application where participants choose from 45 climacteric conditions/symptoms, 25,369 users recorded a total of 447,802 symptoms. In addition to self-reported symptoms, participants also self-reported menopausal status using a series of onboarding questions in order to determine how menopause was entered and menstrual cycle regularity (if applicable). Using inclusion criteria outlined in Fig. 1, a total of 4789 out of the 25,369 total users and 147,501 symptoms out of the 447,802 total symptoms were included in the analysis. All symptoms were collected from Fall 2021 through Spring 2023 (Supplemental Fig. 1). Of the 4789 total women included in the analysis, 1115 (23%) women met the criteria for premenopause, reporting a total of 27,731 symptoms (Table 1) with a median of 17 symptoms and median absolute deviation (MAD) of \(\pm 5\) reported per user; 1388 (29%) women met the criteria for perimenopause, reporting a total of 57,964 symptoms with an increased and more variable median symptom rate of 23 (\(\pm \, 10\)); and 2286 (48%) women met the criteria for menopause, reporting a total of 61,806 symptoms. Despite the increased number of menopausal users and symptoms, symptoms per user were remarkably consistent, with a median of 19 symptoms (\(\pm \, 6\)). Note that some users did not answer if or how they had entered menopause (n = 124) or logged chemotherapy (n = 6). These users were excluded from the analysis (see Fig. 1, Table 1). Distribution of symptom counts did not vary by group (Fig. 2).

Fig. 1

Inclusion criteria flow chart including separation of premenopausal, perimenopausal, and menopausal groups based on on-boarding survey questions asked in the MenoLife application.

Table 1 Number of women (N) comprising premenopausal, perimenopausal, and menopausal groups and their respective symptom counts (Sym).

Fig. 2

The number of users reporting each symptom count in premenopausal (red), perimenopausal (light blue) and menopausal (dark blue) groups.

Symptom prevalence

The most common logs, presented as a percent of total logs, in premenopausal users were fatigue (7.94%), spotting (7.44%), cramps (6.55%), bloating (6.16%) and headaches (5.78%). By contrast, menopausal hot flashes greatly outweighed the prevalence of any other log (22.3%), followed by fatigue (5.13%), night sweats (4.31%), anxiety (3.96%), joint pain (3.52%), and bloating (3.45%). Perimenopausal women exhibited a combination of these most prevalent symptoms from the pre- and post-menopausal cohort with log prevalence of the following symptoms: hot flashes (14.8%), fatigue (6.33%), headaches and night sweats (each 4.77%), and cramps (4.38%) (Fig. 3A).

Fig. 3

Symptom Reporting. (A) Histogram of each symptom as a percentage of total symptoms logged. (B) Histogram of the number of participants reporting each symptom as a percentage of total participants.

The percentage of users reporting symptoms depended on self-reported life stage. Premenopausal women were most likely to report fatigue (74.4%), followed by bloating (60.6%), cramps (57.3%), headaches (52.6%), and anxiety (52.2%), which closely mimicked the most common logs.

Perimenopausal women exhibited the highest rate of hot flashes (83.4%) and night sweats (62.2%), followed by fatigue (74.8%), headaches (58.9%), and bloating (57.1%), all comparable rates to premenopausal women (Fig. 3B). Even though hot flashes were the most reported symptom in the perimenopausal and menopausal cohorts, more menopausal users reported fatigue (75.0%) than hot flashes (73.1%), followed by reports of anxiety (58.7%), joint pain (56.1%), and brain fog (56.1%).

Total symptom counts by user exhibited statistical differences between premenopausal, perimenopausal, and menopausal women. Aside from symptoms known to explicitly relate to either menopause or the menstrual cycle (i.e., vasomotor symptoms, ovulation and ovulation pain, cramps, spotting), several symptoms differed. Premenopausal and perimenopausal women logged the largest differences in fatigue, bloating, headaches, diarrhea, and mood swings (p < 0.01, chi-sq > 48.7) compared to menopausal women. Menopausal women showed the greatest differences in the form of elevated rates of painful sex, insomnia, vaginal dryness, memory lapse, low sex drive, and UTI (p < 0.01, chi-sq > 26). No differences were observed in anxiety or vertigo.

Hierarchical clustering of symptom covariance

Premenopausal

Premenopausal symptoms fell into mood/cognitive and digestive groups, with most other symptoms minimally covarying (Supplemental Fig. 3, top). Brain fog and memory lapse were grouped more closely among regular cyclers, as were digestive and menstrual cycle-associated symptoms (e.g., breast pain, cramps, insomnia, bloating, constipation, ovulation pain) (See Supplemental Fig. 4). By contrast, fatigue and a variety of mood and cognitive symptoms were more closely related in irregular cyclers (See Supplemental Fig. 4).

Fig. 4

Premenopausal, perimenopausal, and menopausal symptom networks. Premenopausal (top, red), perimenopausal (middle, light blue), and menopausal (bottom, dark blue) symptom networks. Clusters are colored separately. For full key see: Supplemental Table 2.

Perimenopausal

Perimenopausal women exhibited 3 large symptom branches (Supplemental Fig. 3, Middle). The highest covarying symptom group included digestive and mood/cognitive problems. Hot flashes were unrelated to any other symptoms.

Menopausal

Menopausal symptoms exhibited different structure from premenopausal or perimenopausal women (Supplemental Fig. 3, Bottom), and further differed by type of menopause (natural vs. medical/surgical) (Supplemental Fig. 5). In the menopause cohort as a whole, most data fell into a large, moderately correlated cluster including integumentary and mood/cognitive problems. Mood/cognitive and integumentary problems were even more related in medical/surgically entered menopause (Supplemental Fig. 5, Top). Notably, hot flashes and night sweats were unrelated to any other symptoms within these hierarchies.

K-means clustering of symptom covariance PCA

Premenopausal

All observed premenopausal clusters shared a baseline phenotype resembling premenstrual syndrome. 81% of users fell into a cluster reporting fatigue, bloating, cramps, anxiety, and headache. A further 13% were differentiated by bloating. The remaining 6% exhibited a variety of integumentary complaints alongside spotting and additional digestive symptoms. The only observed difference in top symptoms for regular cyclers was the presence of brain fog rather than headache in the most common cluster (76% of regular cyclers). Fifteen principal components (PCs) were needed to capture 87% of the variance in the premenopausal dataset. The first PC captured 23% of the variance, and the second PC a further 11%. PC3 accounted for 9%, and PC4 for 7%. All remaining PCs were ≤ 5%. Relatively few symptoms contributed to the top PCs: spotting in PC1; fatigue, headaches, anxiety, bloating, cramps, and breast pain in PC2; breast pain and cramps in PC3; and headaches in PC4 (See Supplemental Fig. 2).

Perimenopausal

The top symptoms characterizing each main segment are as follows: 81% of users were placed in a segment characterized by VMS alongside fatigue, cramps, and bloating. 12% were characterized by their lack of night sweats and the presence of spotting and headaches. The remaining users exhibited additional digestive symptoms or muscular pain. Fifteen PCs were needed to capture 90% of the variance: 55% for PC1, 11% for PC2, and 5% for PC3 (See Supplemental Fig. 2). Remaining PCs captured ≤ 3% of the variance (data not shown). PC1 was exclusively determined by hot flashes, PC2 was a mixture of many symptoms, PC3 was largely driven by night sweats and, notably, the next PC was driven by the residual menstrual symptoms spotting and cramps.

Menopausal

The large majority (91%) of menopausal women were clustered by hot flashes and, similar to premenopausal women, fatigue, anxiety, bloating, and joint pain. An additional 6% included night sweats. The remaining users reported the above symptoms alongside insomnia, chills, and irregular heartbeat. In menopausal women, 90% of the variance was captured in the first 3 PCs, the vast majority by PC1: 86% for PC1, 3% for PC2, and 3% for PC3 (See Supplemental Fig. 2). Remaining PCs captured ≤ 3% of the variance (data not shown). PC1 was almost exclusively determined by hot flashes, whereas PC2 was a mixture of fatigue, night sweats, mood/cognitive, and integumentary problems. PC3 was dominated by night sweats.

Symptom networks

Symptom networks varied greatly by hormonal stage of life. Premenopausal symptoms were linked more sparsely than perimenopausal or menopausal symptoms and fell into 6 groups of 3 or more symptoms, with all remaining symptoms in singletons or pairs. The groups comprised (1) cognitive/mood, (2) integumentary, (3) flu-like/digestive, (4) nervous/muscular pain, (5) sexual symptoms, and (6) menstrual cycle-associated symptoms (See Supplemental Table 1 for symptoms comprising these groups). The nodes with highest degree centrality were brain fog, mood swings, dry skin, cramps, and nausea. For symptom networks by type of cycler, see Supplemental Fig. 6.

Perimenopausal symptoms clustered into 7 similar groups of 3 or more symptoms, with remaining symptoms in singletons or pairs (Fig. 4): (1) cognitive/mood, (2) integumentary, (3) dizziness/vertigo/irregular heartbeat, (4) sexual symptoms, (5) pain, and (6, 7) two groups of menstrual cycle-associated symptoms (ovulation vs. premenstrual). Notably, hot flashes and night sweats clustered together but not with any other symptoms. Mood swings, constipation, hair loss, vaginal dryness, and depression exhibited the greatest degree centrality.

Menopausal symptoms displayed a denser network than premenopausal symptoms, with clusters structured similarly to those found in HCA and K-Means clustering and some overlap with the premenopausal symptom network. Menopausal women exhibited a highly connected network of overlapping symptoms: (1) cognitive/mood, (2) digestive and integumentary, (3) dizziness/vertigo/irregular heartbeat similar to perimenopausal women, (4) sexual, (5) flu-like, and (6) hot flashes, night sweats, and chills. Joint pain, fatigue, itchy and dry skin, and memory lapse exhibited highest degree centrality. For symptom networks by type of menopause, see Supplemental Fig. 7. All symptom summaries across methods are described in Table 2.

Table 2 Symptom clustering summary.

Methods

Data collection and inclusion criteria

All procedures have been approved by Western Institutional Review Board-Copernicus Group (registration number, OHRP and FDA, IRB0000053; parent organization number, IORG0000432), Study Number 1284093. Anonymized data were drawn from the MenoLife mobile app created by MenoLabs (https://app.menolabs.com) and collected between Fall 2021 and Spring 2023. As part of the onboarding process, users provided informed consent for use of de-identified data in this research. Users completed an onboarding questionnaire that included whether menstruation occurred in the last twelve months, description of menstrual periods, how the user entered menopause (if they noted absence of menstrual periods or > 12 months since last menstruation) and, finally, selection of most common symptoms. Following onboarding, women used the app at will to enter symptom logs from a list of 45 available symptoms. Retrospective analysis of the entire app cohort indicates that approximately 94% of users were from the United States and ~ 4% from the United Kingdom.

Onboarding data were used to estimate user status as premenopausal, perimenopausal, or menopausal. These groups were separated for further analysis. Briefly, if users indicated that they had not entered menopause, further specified that they had had a menstrual period within 12 months, and did not record vasomotor symptoms (i.e., hot flashes or night sweats), they were classified as premenopausal. Premenopausal users were further grouped by whether they reported regular or irregular cycles. If users stated that they had entered menopause, and further confirmed that they had not had a period in more than 12 months, they were classified as menopausal. Menopausal users were further grouped by whether they reported entering menopause naturally or via medical/surgical interventions. Individuals reporting chemotherapy were not included. Perimenopause does not have a strict clinical definition using symptoms alone38. Here we chose to estimate the perimenopausal population as users who indicated they had not entered menopause, who reported irregular cycles within the past year, and who experienced vasomotor symptoms. Users who selected conflicting answers were omitted from further analysis (e.g., self-identified as menopausal but selected that their periods were regular). As individual onboarding questions could be skipped, and in some cases multiple answers could be selected for a question, users of indeterminate status were also omitted from further analysis.
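The grouping rules above can be sketched as a small decision function. This is only an illustrative reading of the text: the function name, argument names, and the handling of combinations the text does not mention (for example, regular cycles with vasomotor symptoms, which are treated here as indeterminate) are assumptions, not the app's actual logic.

```python
from typing import Optional


def classify_user(entered_menopause: Optional[bool],
                  period_within_12_months: Optional[bool],
                  regular_cycles: Optional[bool],
                  logged_vms: bool,
                  logged_chemotherapy: bool = False) -> Optional[str]:
    """Return a life-stage label, or None if the user would be omitted.

    None for an onboarding answer means the question was skipped.
    """
    # Chemotherapy reports and skipped key questions are excluded.
    if logged_chemotherapy:
        return None
    if entered_menopause is None or period_within_12_months is None:
        return None

    if entered_menopause:
        # Menopausal requires no period in more than 12 months;
        # a recent period is a conflicting answer.
        return None if period_within_12_months else "menopausal"

    # Assumption: "not menopausal" with no period in 12 months is
    # treated as indeterminate, since the text does not cover it.
    if not period_within_12_months:
        return None

    if not logged_vms:
        return "premenopausal"

    # Perimenopausal: not menopausal, irregular cycles, VMS recorded.
    if regular_cycles is False:
        return "perimenopausal"

    # Assumption: regular (or unanswered) cycles with VMS fit no group.
    return None
```

Under these assumptions, a user who has not entered menopause, had a period in the past year, and logged no hot flashes or night sweats is labeled premenopausal regardless of cycle regularity, which is then used only for sub-grouping.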

Finally, as many users were minimally interactive with the app, logging only a few symptoms, we opted to include only users who had logged at least 10 symptoms for further analysis of relationship among symptoms. To minimize the impact of “super-users” on symptom covariance, we opted to omit individuals who had logged > 300 symptoms.
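As a minimal sketch of this activity filter, the following counts logs per user and keeps users with between 10 and 300 logged symptoms inclusive. The input layout (a sequence of user/symptom pairs) and all names are hypothetical; only the two thresholds come from the text.

```python
from collections import Counter

MIN_LOGS = 10   # users must have logged at least 10 symptoms
MAX_LOGS = 300  # users with more than 300 logs ("super-users") are omitted


def eligible_users(logs):
    """logs: iterable of (user_id, symptom) pairs, one pair per logged symptom."""
    counts = Counter(user_id for user_id, _ in logs)
    return {user for user, n in counts.items() if MIN_LOGS <= n <= MAX_LOGS}


# Toy data: user "a" has 9 logs, "b" has 10, "c" has 300, "d" has 301.
logs = (
    [("a", "fatigue")] * 9
    + [("b", "hot flashes")] * 10
    + [("c", "bloating")] * 300
    + [("d", "cramps")] * 301
)
print(sorted(eligible_users(logs)))  # ['b', 'c']
```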

Self-collected data and analytical methods

This analysis relied on self-collected, rather than clinician-collected data, providing several advantages and limitations in terms of accuracy, detail, and timeliness. In contrast to in-clinic data collection, individuals in the Meno Life data set have the opportunity to collect repeat data over time. Most users interact with the app for ~ 1 week (data not shown), providing a representative view of their everyday life during this time as opposed to a recollected snapshot over many years. Self-collected data close to the time symptoms are experienced is likely to be more accurate than in-clinic recall once a year39. Self-collected data may be particularly more accurate for symptoms which recur multiple times a day (e.g., hot flashes, chills), and which would otherwise be difficult to count and evaluate separately39,40. Finally, self-collected data using a mobile app on a personal smartphone may alleviate hesitation to report more personal symptoms (indeed, some previous studies omit questions about urogenital or sexual symptoms altogether in their surveys27).

Symptom categories

Menstrual cycle associated symptoms were here defined as ovulation and ovulation pain, menstrual cramps, breast pain and swelling (See Supplemental Table 1). Although fatigue and mood changes are commonly considered premenstrual symptoms, we considered that these were characteristic of both pre and post-menopausal women. Vasomotor symptoms were defined as hot flashes and night sweats. Although chills may be defined as VMS, they can also result from illness/infection, and so were considered separately. Integumentary symptoms refer to symptoms affecting hair, skin, and nails.

Data analysis and statistics

Data were securely organized in Amazon Web Services (AWS) S3 and queried through AWS Athena. Custom Python and R code was written for all the analysis methods. Ranksum tests (non-parametric ANOVA) were used to avoid assumptions of normality in comparisons of symptom count by individual across all 45 measured symptoms. Prior to computing the covariance matrix, symptom counts were standardized across symptoms rather than across users, meaning that users who experienced more symptoms contributed more strongly to HCA (Python: clustermap() from seaborn, linkages generated using linkage() from scipy.cluster.hierarchy).
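The text describes the rank-based comparison as a non-parametric ANOVA but does not name the function used. One plausible reading, shown here purely as an assumption, is a Kruskal-Wallis test across the three life-stage groups for a single symptom, optionally followed by a pairwise Wilcoxon rank-sum test. The per-user counts below are simulated and are not study data.

```python
import numpy as np
from scipy.stats import kruskal, ranksums

rng = np.random.default_rng(0)

# Simulated per-user counts of one symptom in each group (hypothetical).
pre = rng.poisson(lam=2.0, size=200)
peri = rng.poisson(lam=3.0, size=200)
meno = rng.poisson(lam=5.0, size=200)

# Kruskal-Wallis H-test: rank-based comparison across all three groups,
# with no assumption that counts are normally distributed.
h_stat, p_overall = kruskal(pre, peri, meno)

# Wilcoxon rank-sum test for one pairwise follow-up comparison.
z_stat, p_pair = ranksums(pre, meno)

print(f"H = {h_stat:.1f}, overall p = {p_overall:.3g}, "
      f"pre vs. meno p = {p_pair:.3g}")
```

Repeating such a test for each of the 45 symptoms would normally call for some multiple-comparison correction; the text does not say whether one was applied.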

Hierarchical clustering analysis

HCA combines independent forests of clusters that are not part of an existing hierarchy by using a distance metric to grow clusters. It starts by treating each symptom as a single-node “forest”, maintaining a distance matrix between all clusters. This distance matrix is updated at each bottom-up iteration, with the algorithm converging at the formation of a single cluster. The distance metric used is \(d(u, v) = \min_{i, j} \, \text{dist}\left(u[i], v[j]\right)\), where u and v are two clusters and i and j index the elements within each cluster. This is repeated for all pairs of clusters up the hierarchical chain. We use Euclidean distance for the dist() function. Dendrograms representing the hierarchical structure of symptom data were generated using the unweighted pair group method using arithmetic mean (UPGMA) applied to the covariance matrix of normalized symptom counts, and silhouette score was used to determine clusters reported here.
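A rough sketch of this procedure using linkage() from scipy.cluster.hierarchy is given below. The count matrix is simulated, the per-symptom z-scoring is only one possible reading of "normalized symptom counts", and the choice of two clusters is arbitrary; none of these reflect the study's actual settings.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)

# Hypothetical user-by-symptom count matrix: 200 users, 6 symptoms.
counts = rng.poisson(lam=[3, 3, 1, 1, 2, 2], size=(200, 6)).astype(float)

# Assumed normalization: z-score each symptom column.
z = (counts - counts.mean(axis=0)) / counts.std(axis=0)

# Symptom-by-symptom covariance matrix (6 x 6).
cov = np.cov(z, rowvar=False)

# Agglomerative clustering of the covariance rows with Euclidean distance.
# method="average" is UPGMA; method="single" would instead use the
# minimum pairwise distance between cluster members.
Z = linkage(cov, method="average", metric="euclidean")

# Cut the dendrogram into at most 2 clusters (arbitrary for illustration).
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```

Each row of `Z` records one merge, so six symptoms produce five merges before a single cluster remains.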

Principal components analysis

We used PCA (Python: PCA() from sklearn.decomposition) to reduce the dimensionality of the symptom dataset prior to K-Means clustering. K-Means clusters were generated using the principal components needed to capture up to 90% of the variance in each group’s symptom data, and elbow of the sum of squared errors (SSE) and silhouette score were used to determine optimal cluster number.
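The pipeline described here can be sketched as follows, assuming scikit-learn's PCA and KMeans. The data are simulated, and the candidate range of cluster numbers (2 to 6) is an arbitrary choice for illustration rather than the range used in the study.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)

# Hypothetical user-by-symptom counts: 300 users, 10 symptoms.
X = rng.poisson(lam=2.0, size=(300, 10)).astype(float)
X[:150, 0] += 8  # a made-up subgroup that logs symptom 0 far more often

# A float n_components keeps the fewest PCs whose cumulative
# explained variance exceeds that fraction (here 90%).
pca = PCA(n_components=0.90, svd_solver="full")
scores = pca.fit_transform(X)

# For each candidate k, record SSE (inertia) and silhouette score.
results = {}
for k in range(2, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(scores)
    results[k] = (km.inertia_, silhouette_score(scores, km.labels_))

# Pick k by silhouette; the SSE values can be plotted to inspect the elbow.
best_k = max(results, key=lambda k: results[k][1])
print(pca.n_components_, best_k)
```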

K-Means clustering

K-Means clustering of principal components generated from normalized symptom counts was used to evaluate potential consensus with hierarchical clustering. Recall that HCA is a bottom-up approach in which each symptom begins as its own cluster, and clusters are iteratively merged until all points have been accounted for. By contrast, K-Means partitions data into a set number of clusters and aims to place data points into the group with the nearest centroid. We aimed to compare results generated under these methods to identify which symptom clusters were identified in both, as well as any small but notable groupings identified by HCA (e.g., the grouping of mood and cognitive symptoms).

Network analysis

Network analysis was performed in R using the package IsingFit41,42. The network estimation procedure used, called “eLasso” and based on the Ising model, pairs regularized logistic regression with model selection based on the Extended Bayesian Information Criterion (EBIC), a measure of fit that identifies variable relationships of interest. The resulting network consists of a symmetric (undirected) weight adjacency matrix. Each value above (below) the diagonal represents an edge (relationship) between a variable in a given row to the variable in that column.

Input data to IsingFit were one-hot encodings of the symptom matrices from each group. The presence of a symptom in any count in each individual was converted to a 1, and absence of a symptom remained a zero. Symptoms were set to null if their values were either (a) all blank or (b) rare enough that IsingFit reported an error due to lack of covariance. These were nipple discharge in all groups; hot flashes and night sweats for premenopausal women; and ovulation, ovulation pain, spotting, and vomiting for menopausal women.
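The encoding step can be illustrated with a toy example. The matrix and symptom order below are hypothetical, and dropping only zero-variance columns is a simplified stand-in for the exclusion rule described above, which also removed symptoms too rare for the network estimation to run.

```python
import numpy as np

# Hypothetical counts: rows are users, columns are symptoms.
names = ["hot flashes", "nipple discharge", "fatigue", "vomiting"]
counts = np.array([
    [3, 0, 1, 0],
    [0, 0, 2, 1],
    [5, 0, 0, 0],
    [1, 0, 4, 0],
])

# Any nonzero count becomes 1; absence stays 0.
binary = (counts > 0).astype(int)

# Drop symptoms with no variation across users (here, never logged).
keep = binary.std(axis=0) > 0
kept_names = [name for name, flag in zip(names, keep) if flag]
binary_kept = binary[:, keep]

print(kept_names)  # ['hot flashes', 'fatigue', 'vomiting']
```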

Networks were then exported and visualized in Python using the package iGraph. The Walktrap algorithm (Python: walktrap() from iGraph) was used to identify relevant communities within the network43. Negative correlations could not be used as inputs, were rare in the networks estimated by IsingFit, and were removed. Walktrap was tested with 3–10 steps, and the number of steps minimally impacted the estimated communities. The graphs displayed use 4 steps. Plots shown depict detected communities as shaded and nodes belonging to those communities in the same color. For a list of symptom names and abbreviations displayed on the graphs, see Supplemental Table 2.

Node degree, betweenness, closeness, and strength were calculated using iGraph (Python: degree(), betweenness(), closeness(), and strength() from iGraph) and used to identify the most important nodes in the network. Degree is the number of edges connected to a given node, betweenness is the extent to which one node lies along the shortest path between other nodes, closeness is a measure of the average path length between one node and the others in the network, and strength is the sum of weights attached to ties belonging to a given node.
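To make the two simplest measures concrete, degree and strength can be read directly off a symmetric weighted adjacency matrix. The sketch below uses NumPy rather than iGraph, and the symptoms and edge weights are invented for illustration; betweenness and closeness require shortest-path computations and are not shown.

```python
import numpy as np

# Hypothetical symmetric weight matrix for a 4-symptom network.
names = ["brain fog", "mood swings", "fatigue", "joint pain"]
W = np.array([
    [0.0, 0.8, 0.3, 0.0],
    [0.8, 0.0, 0.0, 0.0],
    [0.3, 0.0, 0.0, 0.5],
    [0.0, 0.0, 0.5, 0.0],
])

# Degree: number of nonzero edges attached to each node.
degree = (W != 0).sum(axis=1)

# Strength: sum of the edge weights attached to each node.
strength = W.sum(axis=1)

for name, d, s in zip(names, degree, strength):
    print(f"{name}: degree={d}, strength={s:.1f}")
```

In this toy network, brain fog and fatigue share the highest degree, while brain fog alone has the highest strength because one of its edges is heavily weighted.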