Introduction

Digital symptom checkers (SCs), also referred to as symptom assessment applications, are tools that help individuals self-assess their symptoms and decide on whether and how to seek medical care1. Digital SCs come in various forms, from simple web-based questionnaires to chatbots powered by large language models (LLMs)2. These tools are not only increasingly used by individuals but are also valued by health professionals, who report that receiving patient-entered symptom information in advance can support more efficient consultations3. Some national healthcare systems have also adopted digital SCs, particularly during periods of strain, to create standardized remote assessments and guide patients to appropriate care settings4,5.

Early evaluations of digital SCs primarily examined accuracy and safety, particularly in triage, with mixed results6,7,8. Some studies have raised concerns about the variable accuracy of digital SCs and their potential to disrupt the patient-physician relationship, underscoring the need for more validation and oversight8,9,10,11. However, other studies suggest that digital SCs match or even outperform medical professionals in directing users to the appropriate care and exhibit high triage safety12,13,14,15. When implemented effectively, digital SCs can support faster diagnosis and reduce healthcare system burden5,16,17. This has led to growing interest in their use for conditions that are underdiagnosed, slow to detect, or poorly understood by patients and even clinicians.

Endometriosis, a chronic gynecological condition affecting 6–10% of women of reproductive age and marked by inflammation, pelvic pain, and infertility, presents a compelling case for examining the potential value of digital SCs18,19. Diagnosis of endometriosis is often delayed (by 7 years or longer) due to non-specific symptoms, limited availability of non-invasive yet accurate tests20, and low disease awareness among both patients and healthcare providers19,21. These delays can lead to reduced quality of life and higher societal costs through unnecessary healthcare use and loss of productivity22,23,24,25. In the U.S., the total annual cost of endometriosis, including healthcare costs and productivity losses, is estimated at $78–119 billion24,26. In Europe, the annual costs per woman are around €9579, with productivity losses comprising up to 75% of total costs25,27. These figures underscore the urgent need for early identification of high-risk individuals to mitigate these socio-economic impacts. Digital SCs may help shorten the path to endometriosis diagnosis and eventually treatment by prompting earlier symptom recognition and clinical engagement28. Recently developed digital SCs embedded within women's health apps have exemplified this approach, offering users medical guideline-based self-assessments of their risk for gynecological conditions29. Preliminary evidence suggests that some of them can help users identify symptom patterns indicative of endometriosis, providing a scalable, non-invasive screening tool for earlier symptom recognition30,31.

While digital SCs hold promise for facilitating earlier symptom identification of conditions like endometriosis, where delayed diagnosis and high societal costs make early identification important, there is currently no published evidence on their cost-effectiveness. Existing economic evaluations focus on diagnostic testing or treatment strategies for endometriosis, overlooking the influence of early digital self-assessment tools on when and how women engage with the healthcare system32,33,34,35,36.

This study aims to address that gap in the evidence. We evaluated the cost-effectiveness of ‘Flo SC’, an early symptom self-assessment tool for endometriosis embedded within the female health digital platform Flo Health, compared to the standard of care (symptom-driven presentation to primary care and routine diagnostic and treatment pathways). We developed and validated a decision-analytic model to estimate Flo SC’s potential impact on time to diagnosis, healthcare costs, and quality-adjusted life years. In doing so, we aim to contribute to the small but growing evidence base on when and how digital SCs might offer value to patients and health systems alike.

Results

Base case analysis

The base case analysis, conducted over a 40-year time horizon with costs and QALYs discounted at 3% annually, suggests that Flo SC is likely to be a dominant strategy compared to standard care, potentially providing both cost savings and improved health outcomes.

Table 1 presents the average costs, quality-adjusted life years (QALYs), and diagnostic delays associated with Flo SC and the standard of care. The introduction of Flo SC shortens diagnostic delay by 4.36 years and reduces average costs per patient by $5196.22 compared to the standard of care. The incremental QALY gain per patient is modest at 0.049. The incremental net monetary benefit (INMB) per patient is estimated to be $10,089.00 at a willingness-to-pay (WTP) threshold of $100,000 per QALY, and $7642.68 at a WTP of $50,000 per QALY. Breaking costs down further, the majority of savings came from direct medical cost savings ($4042 per patient), complemented by reductions in indirect costs ($1545 per patient). In comparison, intervention costs (i.e., Flo SC subscription) were modest at $391 per patient.
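The relationship between these quantities can be checked with a short sketch. It assumes the standard definition INMB = WTP × ΔQALY − ΔCost; the function name and the use of the rounded figures quoted above are illustrative choices, not taken from the model code.

```python
def incremental_net_monetary_benefit(delta_qaly, delta_cost, wtp):
    """INMB = WTP * incremental QALYs - incremental cost."""
    return wtp * delta_qaly - delta_cost


# Rounded base case values quoted in the text (Flo SC minus standard care).
DELTA_QALY = 0.049      # incremental QALYs per patient
DELTA_COST = -5196.22   # incremental cost per patient (negative means a saving)

for wtp in (50_000, 100_000):
    inmb = incremental_net_monetary_benefit(DELTA_QALY, DELTA_COST, wtp)
    print(f"WTP ${wtp:,}: INMB = ${inmb:,.2f}")
```

With the rounded inputs the sketch gives roughly $7646 and $10,096, close to the reported $7642.68 and $10,089.00. The small gap is consistent with the QALY gain having been rounded to three decimals in the text.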

Table 1 Base case analysis results

Deterministic sensitivity analyses

To assess the robustness of the base case findings, we conducted one-way deterministic sensitivity analyses (DSA), evaluating how uncertainty in individual model parameters impacts the INMB of Flo SC versus the standard of care.

The one-way DSA results, presented in Fig. 1, indicate that health state utilities, Flo SC specificity and sensitivity, uptake probability, and compliance with the self-assessment results are the most influential drivers of cost-effectiveness. Higher specificity and sensitivity are associated with improved INMB, as are higher uptake and compliance with screening recommendations. In contrast, healthcare costs and productivity losses due to endometriosis symptoms have a relatively smaller impact. Utility values ranked as the top drivers for two reasons: (1) their impacts on NMB are direct and get amplified by the WTP threshold (which is high in our study); (2) with utilities bounded between 0 and 1, the same proportional uncertainty has a larger relative effect than costs, which vary across wider ranges.
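The mechanics of a one-way DSA, and the way the WTP threshold amplifies utility changes, can be sketched as follows. This is a toy stand-in, not the study model: `toy_inmb`, the parameter names, and every numeric value and range are hypothetical and chosen only to show how parameters are ranked by the swing they produce in INMB.

```python
def one_way_dsa(model, base, ranges):
    """Vary one parameter at a time and rank parameters by the swing in the output."""
    rows = []
    for name, (low, high) in ranges.items():
        at_low = model({**base, name: low})     # all other parameters stay at base
        at_high = model({**base, name: high})
        rows.append((name, at_low, at_high, abs(at_high - at_low)))
    # Largest swing first, which is the top-to-bottom order of a tornado chart.
    return sorted(rows, key=lambda row: row[3], reverse=True)


def toy_inmb(p):
    """Hypothetical model: a utility gain held for some years, plus a cost saving."""
    delta_qaly = p["utility_gain"] * p["years_in_state"]
    return p["wtp"] * delta_qaly + p["cost_saving"]


# Hypothetical base values and +/-20% ranges (not the study's inputs).
BASE = {"wtp": 100_000, "utility_gain": 0.01, "years_in_state": 10.0, "cost_saving": 5_000.0}
RANGES = {
    "utility_gain": (0.008, 0.012),
    "cost_saving": (4_000.0, 6_000.0),
}

for name, at_low, at_high, swing in one_way_dsa(toy_inmb, BASE, RANGES):
    print(f"{name:>13}: low={at_low:,.0f} high={at_high:,.0f} swing={swing:,.0f}")
```

In this toy setting the same ±20% variation moves INMB twice as far through the utility parameter as through the cost parameter, because the utility change is multiplied by the WTP threshold before it enters the net benefit.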

Fig. 1: Deterministic sensitivity analysis results showing the impact of key parameter variations on the cost-effectiveness of the Flo Symptom checker versus standard of care at a willingness-to-pay threshold of $100,000 per QALY.

Parameters at the top of the chart have the greatest influence on model outcomes, while those at the bottom have a smaller effect. Only the top 15 parameters are displayed for clarity.

We conducted a two-way DSA to examine the joint impact of two key parameters—Flo SC sensitivity and specificity—on its effectiveness in reducing diagnostic delay and its cost-effectiveness, as measured by INMB. Given that digital SCs often undergo iterative improvements in their algorithms, and that their accuracy may vary in real-world settings, a two-way DSA allows us to account for their inherent uncertainty and identify thresholds at which digital SCs remain cost-effective or reduce diagnostic delay.

As illustrated in Fig. 2, Flo SC generally leads to a positive INMB and a reduction in diagnostic delay across a range of sensitivity and specificity values, provided both remain above certain thresholds.

Fig. 2: Impact of Flo Symptom Checker accuracy on incremental net monetary benefit (INMB) and diagnostic delay reduction: a two-way sensitivity analysis.

The left panel illustrates the effect on INMB at a willingness-to-pay threshold of $100,000 per QALY, and the right panel shows the effect on diagnostic delay reduction. In each panel, the green zone represents combinations of sensitivity and specificity at which Flo SC remains cost-effective (left) or reduces time to diagnosis (right), while the blue zone indicates combinations at which it is not cost-effective or lengthens diagnostic delay. Higher specificity is particularly critical for reducing diagnostic delay, as indicated by the green zone, while lower specificity yields minimal reduction or even longer diagnostic delay.

The left panel shows that higher Flo SC accuracy levels lead to positive INMB, indicating cost-effectiveness. The green zone, corresponding to higher sensitivity and specificity, represents combinations of sensitivity and specificity levels where Flo SC remains cost-effective. In contrast, the blue zone represents scenarios where the Flo SC could be cost-ineffective or even result in longer diagnostic delay. The contours within the green zone highlight that INMB increases with improved accuracy. At a WTP threshold of $100,000 per QALY, the Flo SC achieves strong cost-effectiveness when both sensitivity and specificity are above 0.7.

The right panel depicts a similar pattern—high sensitivity and specificity are also crucial for reducing diagnostic delays. This result arises from the underlying logic of the care pathway, where any user flagged by Flo SC, whether a true or false positive, is likely to seek medical attention sooner. Higher sensitivity ensures more true cases are identified early, while higher specificity minimizes unnecessary referrals and reduces noise in the diagnostic process, indirectly facilitating faster diagnosis for true positives. As a result, the greatest reductions in diagnostic delay (up to 5 years) occur when both metrics exceed 0.8.
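A two-way analysis of this kind amounts to evaluating the model over a grid of sensitivity and specificity values and marking the cells where the outcome is positive. The sketch below illustrates only that procedure. `toy_inmb` is a hypothetical net-benefit function (benefit from true positives minus cost of false positives minus a tool cost), and the prevalence, benefit, and cost values are invented for illustration; none of them are the study's inputs, so the resulting thresholds will not match Fig. 2.

```python
def toy_inmb(sensitivity, specificity,
             prevalence=0.10, benefit_tp=20_000.0, cost_fp=1_500.0, cost_tool=400.0):
    """Hypothetical per-user net benefit; all default values are illustrative only."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives * benefit_tp - false_positives * cost_fp - cost_tool


def two_way_grid(model, xs, ys):
    """Evaluate the model at every (x, y) combination."""
    return {(x, y): model(x, y) for x in xs for y in ys}


levels = [0.5, 0.6, 0.7, 0.8, 0.9]
grid = two_way_grid(toy_inmb, levels, levels)

# Print a small text heat map: '+' where net benefit is positive, '-' otherwise.
print("sens\\spec " + " ".join(f"{spec:>4}" for spec in levels))
for sens in levels:
    cells = " ".join(f"{'+' if grid[(sens, spec)] > 0 else '-':>4}" for spec in levels)
    print(f"{sens:>9} {cells}")
```

Replacing `toy_inmb` with a call into a full cohort model, and adding a second grid for diagnostic delay, would give the two panels described above.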

Probabilistic sensitivity analysis (PSA)

We performed a probabilistic sensitivity analysis (PSA) using 1000 Monte Carlo simulations to assess the joint uncertainty of key model input parameters, including costs, utilities, and transition probabilities. The PSA results show that Flo SC was cost-effective in 93.7% of scenarios at the $100,000/QALY threshold. The mean cost per patient with Flo SC was $7797.55, compared to $13,076.09 under standard care, saving $5278.54 per patient (95% CI: $5212.85–$5344.23). The intervention also resulted in a modest increase in QALYs (18.79 vs. 18.72), with an incremental gain of 0.071 QALYs (95% CI: 0.066–0.076). The INMB per patient was $12,398.92 (95% CI: $11,893.11–$12,904.72). The incremental cost-effectiveness plane (Fig. 3) illustrates the majority of simulations (scatter points) falling in the lower right quadrant, indicating that Flo SC was highly likely to be cost-saving and more effective than standard care.
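The summary statistics reported for a PSA can be derived from the simulation output as sketched below. The sketch makes several assumptions that are not stated in the text: it samples the incremental cost and incremental QALYs directly from hypothetical normal distributions (a full PSA would instead sample the input parameters and rerun the model), and it reports a normal-approximation 95% interval around the simulation mean. The function names and distribution parameters are illustrative.

```python
import random
import statistics


def run_psa(n_sims, wtp, seed=0):
    """Return one INMB value per simulation, using hypothetical output distributions."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    inmbs = []
    for _ in range(n_sims):
        delta_cost = rng.gauss(-5200.0, 1000.0)  # hypothetical mean and SD
        delta_qaly = rng.gauss(0.07, 0.08)       # hypothetical mean and SD
        inmbs.append(wtp * delta_qaly - delta_cost)
    return inmbs


def summarize(inmbs):
    """Mean INMB, normal-approximation 95% CI of the mean, and share of runs with INMB > 0."""
    n = len(inmbs)
    mean = statistics.fmean(inmbs)
    standard_error = statistics.stdev(inmbs) / n ** 0.5
    return {
        "mean": mean,
        "ci95": (mean - 1.96 * standard_error, mean + 1.96 * standard_error),
        "prob_cost_effective": sum(value > 0 for value in inmbs) / n,
    }


summary = summarize(run_psa(n_sims=1000, wtp=100_000))
print(summary)
```

The probability of cost-effectiveness is simply the share of simulations with a positive INMB at the chosen threshold; repeating the calculation over a range of thresholds would trace out a cost-effectiveness acceptability curve.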

Fig. 3: Probabilistic sensitivity analysis results, visualized on an incremental cost-effectiveness ratio (ICER) plane for Flo SC versus standard care.

This scatter plot visualizes the results of 1000 Monte Carlo simulations, showing the incremental QALYs and incremental costs associated with Flo SC compared to standard care. The orange ellipse represents the 95% confidence interval of the ICER. The dotted green line represents the $100,000 per QALY willingness-to-pay (WTP) threshold. Most points fall in the fourth quadrant, indicating that Flo SC is generally cost-saving and more effective.

Additionally, Flo SC reduced the time to diagnosis by an average of 4.37 years (95% CI: 4.31–4.43) by prompting earlier self-recognition and clinical engagement. The probability that Flo SC reduces diagnostic delay by at least 3 years was 93.2%. Figure 4 shows a clear shift in diagnostic timing, with Flo SC users clustering around 3 years of delay (pink), compared to 7 years under standard care (blue).

Fig. 4: Distribution of diagnostic delay: standard of care versus Flo SC.

This histogram compares the distribution of diagnostic delays between standard care (blue) and Flo SC (pink), based on 1000 Monte Carlo simulations. The density curve (scaled) on the right indicates the relative frequency of each delay. The Flo SC distribution is centered around a shorter diagnostic delay (3.04 years), whereas the standard care distribution shows a longer delay (7.41 years).

Scenario analyses

We conducted scenario analyses to further explore how uncertainty in key parameters could affect the health and economic outcomes of Flo SC. We first examined influential behavioral drivers identified from the DSA: compliance, uptake, and the probability of visiting a doctor while symptomatic. We then examined uncertainty associated with factors particularly relevant to real-world implementation and payer decision-making, namely the time horizon and the subscription price of Flo SC.

Across most scenarios, Flo SC was a dominant strategy, providing both cost savings and greater health benefits (Table 2). However, when compliance was low (30%), the intervention became cost-increasing and was no longer considered cost-effective, with an ICER of $265,177.35 per QALY. In contrast, high compliance (90%) resulted in the greatest cost savings and QALY gains. While lower uptake (30%) still maintained cost-effectiveness and reduced diagnostic delay, the benefits were much more modest. To explore potential correlation between uptake and compliance, we conducted a two-way sensitivity analysis varying both parameters from 0.3 to 0.9. The results (see Supplementary Table 7a, b) reinforce the finding that cost-effectiveness is primarily driven by compliance: when compliance was very low (0.3), increasing uptake actually worsened outcomes, as more users engaged with the SC but did not act on its recommendations.
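The distinction between a dominant strategy and one judged by its ICER can be made explicit with a small classifier over the cost-effectiveness plane. The sketch below uses the conventional quadrant logic; the function name is illustrative, and the second example uses a hypothetical pair of increments chosen only so that their ratio equals the ICER quoted above (the actual scenario increments are in Table 2 and are not reproduced here).

```python
def classify(delta_cost, delta_qaly, wtp):
    """Classify an intervention versus its comparator; returns (label, ICER or None)."""
    if delta_cost == 0 and delta_qaly == 0:
        return "equivalent", None
    if delta_cost <= 0 and delta_qaly >= 0:
        return "dominant", None       # cheaper (or equal) and at least as effective
    if delta_cost >= 0 and delta_qaly <= 0:
        return "dominated", None      # costlier (or equal) and no more effective
    icer = delta_cost / delta_qaly
    if delta_qaly > 0:
        # Costlier and more effective: acceptable if the cost per QALY gained is within WTP.
        acceptable = icer <= wtp
    else:
        # Cheaper and less effective: acceptable if savings per QALY lost exceed WTP.
        acceptable = icer >= wtp
    return ("cost-effective" if acceptable else "not cost-effective"), icer


# Base case increments quoted in the text: cost-saving and health-improving.
print(classify(delta_cost=-5196.22, delta_qaly=0.049, wtp=100_000))

# Hypothetical increments whose ratio reproduces the low-compliance ICER.
print(classify(delta_cost=2651.7735, delta_qaly=0.01, wtp=100_000))
```

An ICER is reported only when the strategy is neither dominant nor dominated, which is why most rows in a scenario table of this kind carry the label "dominant" rather than a ratio.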

Table 2 Scenario analyses results

The spontaneous doctor visit scenarios showed that Flo SC was particularly cost-saving and beneficial when symptomatic women, regardless of their endometriosis status, were less proactive in seeking medical advice on their own. The time horizon analysis showed that shorter time horizons (e.g., 5 years) limited the scope of intervention effects, as some individuals may not develop symptoms of endometriosis within that period. Longer time horizons allowed for a more complete realization of the benefits, as Flo SC’s impact on healthcare utilization and symptom progression became more evident over time.

To test the robustness of our conclusions to pricing uncertainty, we varied the annual subscription cost from $15 to $240 per user (equivalent to 25%–400% of the current price). Across this entire range, Flo SC remained a dominant strategy, with INMB estimates ranging from $8917 to $10,382 per patient at a WTP threshold of $100,000 per QALY.

Discussion

We found that a digital SC, when used as a self-assessment tool for endometriosis, was likely to be a cost-saving and health-improving intervention compared to standard of care. Over a 40-year horizon, on average, the digital SC shortened the diagnostic delay by 4.36 years, delivered modest QALY gains of 0.049 per person, reduced costs by $5196.22 per person, and resulted in an INMB of $10,089.00 per person at a WTP threshold of $100,000 per QALY, primarily by improving early symptom recognition, which in turn shortened diagnostic delays. The PSA showed that the digital SC remained cost-effective in 93.7% of simulations and reduced diagnostic delays by at least 3 years in 93.2% of simulations, supporting the robustness of the base case findings.

Scenario analyses highlighted that digital SC accuracy, compliance, and uptake were the most critical drivers of cost-effectiveness. The digital SC delivered the greatest value when it provided accurate recommendations (both sensitivity and specificity were ≥0.7) and when users acted upon them (compliance exceeded 45%). Uptake alone had a smaller influence compared to compliance, but sufficient uptake remains a prerequisite for scaling the intervention in real-world settings. The model also showed that longer time horizons are essential to capture the full benefits of earlier diagnosis. Shorter horizons (e.g., 5 years) underestimated both cost savings and QALY gains, as many women in the simulated cohort had not yet developed symptoms.

Previous studies on digital SCs have largely focused on accuracy and triage safety6,7,8,12,37, with economic outcomes underexplored17. This study helps fill that gap and illustrates that digital SCs can offer economic value, particularly in conditions such as endometriosis, where diagnostic delays are costly and prolonged38. Our model also showed that savings were driven more by direct medical costs than by reductions in indirect costs related to productivity loss. This pattern is consistent with prior real-world evidence from the U.S.26 and implies that our findings are potentially relevant from the perspective of healthcare payers, for whom direct medical costs are the primary concern, even though the societal burden of productivity loss remains considerable.

Our sensitivity analyses suggest that digital SC accuracy, especially specificity, was a key driver of cost-effectiveness. This finding aligns with previous analyses showing that digital SCs tend to be too risk-averse, often over-triaging patients for medical attention11,37. While reduced sensitivity may delay diagnosis by increasing missed cases (false negatives), low specificity can lead to unnecessary referrals (false positives) that increase downstream costs. This highlights the importance of balancing the trade-offs between sensitivity and specificity in product iteration.

User behaviors—specifically, compliance and uptake—also played a major role in determining value for money. Even a well-performing digital SC will yield limited benefit if users do not engage with it or fail to act on its recommendations. These findings echo earlier work emphasizing that the success of digital health tools depends not only on technical performance but also on user adoption and adherence39,40.

Moreover, the cost-effectiveness of early endometriosis diagnosis aligns with findings from studies on menstrual health interventions and digital tracking applications41,42. These studies emphasize that empowering women with accurate, accessible information can lead to more proactive healthcare engagement. Consistent with these findings, our scenario analyses indicated that Flo SC delivered greater value when women were less likely to seek care on their own, highlighting its potential to reduce inequities in healthcare access. Indeed, in settings where symptom awareness is low or barriers to care are high, digital SCs may play a particularly valuable role in encouraging timely clinical evaluation and treatment.

The model also demonstrated that a longer time horizon is necessary to fully capture the benefits of earlier diagnosis. Because the simulated cohort began at age 20, shorter horizons excluded individuals who had not yet developed symptoms. As a result, both health gains and cost savings appeared smaller over 5 years, but became more pronounced over 10- and 20-year horizons. This supports the use of long-term modeling for evaluating digital interventions that affect disease trajectories over time.

This study is, to our knowledge, the first published economic evaluation of a digital SC for endometriosis and among the few assessing the cost-effectiveness of digital SCs more broadly.

The findings offer broader insights into the conditions where digital SCs may be most valuable. Digital SCs are likely to deliver the greatest economic and health benefits in conditions characterized by prolonged diagnostic delays, stigma, high societal costs, and low care-seeking rates. For example, perimenopausal symptoms, which often remain under-recognized despite significant impacts on quality of life and workforce participation43,44,45, may benefit from digital SCs tailored to improve awareness and engagement.

The modest QALY gains observed in our model reflect both conservative assumptions on the utility of diagnosis (we assumed none) and the nature of endometriosis: its low mortality risk, despite the morbidity burden46, naturally limits the magnitude of QALY improvements. Moreover, treatment effects for endometriosis are constrained by relapse risk and imperfect effectiveness: available therapies such as hormonal treatments or laparoscopy can relieve symptoms but are not curative and may involve side effects or peri-operative discomfort47,48. Many patients also experience recurrent symptoms despite treatment49,50. Consequently, even when Flo SC accelerates treatment, the resulting QALY gains remain modest. For conditions with more responsive treatments and higher mortality risk, digital SCs may deliver larger health gains and, in turn, greater cost-effectiveness.

Our scenario analyses also reinforce the need for long-term evaluations of digital SCs, as shorter time horizons substantially underestimated both health and economic gains.

As digital SCs proliferate, particularly through consumer-facing health apps51, healthcare systems will increasingly confront decisions about how to integrate, regulate, and reimburse such tools. This study provides a methodological framework for evaluating digital SCs and offers early evidence that they can deliver clinical and economic benefits under realistic assumptions. Looking ahead, the rapid emergence of LLM-based health applications52,53 may further transform the digital SC landscape, offering opportunities for more personalized, conversational symptom triage while also introducing new challenges around specificity, safety, and trust. Expanding evaluation frameworks to address the opportunities and risks introduced by AI-powered tools will be crucial to ensure that the next generation of digital symptom checkers delivers meaningful clinical and societal value.

Despite these promising results, several limitations must be noted. The model assumes that women complete a digital SC check-in every 6 months while symptomatic. In practice, usage patterns are likely more variable, influenced by symptom severity, healthcare access, and user experience. Additionally, the model assumes that women receiving a false negative from the SC continue seeking care at the same rate as those who never used the tool. While this assumption will not hold for every user, persistent or worsening symptoms often prompt continued care-seeking, meaning the model likely captures broader care-seeking dynamics despite these simplifications.

Another simplification concerns the modeled transition from “endometriosis-like symptoms” to “no endometriosis”. In the standard care pathway, symptom resolution only occurs after a clinical consultation, whereas spontaneous resolution without medical input is possible in reality. This may slightly underestimate quality of life in the standard care group; however, deterministic sensitivity analysis showed that even with optimistic utility values (0.85), the SC remained cost-effective (INMB: $7078.09 per patient).

The model also uses average disease trajectories and does not distinguish between phenotypes of endometriosis. Fertility-related outcomes were also not modeled, in line with prior findings34 that pain-related QALY gains tend to dominate due to their frequency and immediacy. This assumption likely underestimates the cost-effectiveness by excluding potential long-term benefits that may further favor the intervention. Treatment-specific responses and postmenopausal utility values were also averaged, and we assumed quality of life after menopause is similar for women with or without endometriosis—consistent with evidence suggesting symptom relief post-menopause19.

Real-world adoption remains a source of uncertainty. Flo SC compliance estimates were drawn from early data on U.S. users and may shift with broader implementation. Finally, while the societal perspective captures a broad set of benefits, including productivity gains, it may not fully align with the incentives of all decision-makers in the U.S. healthcare system. Private payers, for instance, may be less likely to support tools where financial benefits primarily accrue to individuals or society rather than the insurer.

To address those limitations, future research should focus on the following priorities. First, real-world validation through longitudinal studies using electronic health records, claims data, or prospective cohorts is needed to assess whether digital SCs meaningfully change diagnostic timelines, healthcare use, and outcomes51. Second, integration into care pathways is key; embedding digital SCs into telemedicine platforms, payer portals, or digital triage systems could improve early intervention, especially in complex healthcare systems like the U.S.54,55. Third, boosting user engagement and equity requires addressing barriers such as digital literacy, trust, and access. Tailoring digital SCs for the diverse needs of users will be essential to maximize uptake and impact39,40,56. Fourth, future research could also explore payer-specific perspectives and quantify how benefits are distributed across stakeholders, particularly in fragmented or non-universal healthcare systems.

By addressing these gaps, digital SCs can be further optimized not only for accuracy and usability, but also for their real-world impact, cost-effectiveness, and integration into equitable care.

In conclusion, this study provides the first economic evaluation of a digital SC for endometriosis screening. Using a validated decision model, we found that the tool can reduce diagnostic delays, improve health outcomes, and be cost-saving under a wide range of assumptions. These findings offer timely evidence for healthcare systems exploring the integration of digital triage tools—particularly for underdiagnosed, high-burden conditions. When implemented effectively, digital SCs may enhance early detection and support more efficient, equitable care.

Methods

Modeling overview

We developed a Markov decision process model to evaluate the cost-effectiveness of a digital SC (‘Flo SC’) for endometriosis compared to standard care. The analysis adheres to best practices in health economic modeling, following the CHEERS 2022 guidelines57 (see Supplementary Table 1). Following a pre-specified health economics analysis plan, the model simulates two parallel cohorts of 10,000 20-year-old women, based in the United States, and presenting with symptoms suggestive of endometriosis. We chose this starting age to reflect the typical onset of endometriosis during reproductive years18,58.

Each cohort starts in the healthy state and enters a decision tree capturing initial symptom assessment and care-seeking behaviors, followed by a Markov state-transition process representing diagnosis, treatment and management for endometriosis. Transitions occur at 6-month intervals, reflecting typical timelines for symptom checking and follow-up consultations. The intervention cohort is assumed to use Flo SC, a chatbot-based tool embedded within the Flo Health app, which guides users through a structured, guideline-based questionnaire to self-assess symptoms. Importantly, the model assumed that Flo SC augments, rather than replaces, existing care pathways, supporting users in recognizing symptoms earlier and navigating the healthcare system more effectively. For the comparator cohort, women follow standard of care pathways, with symptom recognition and healthcare engagement based on routine care-seeking. Model outcomes include time to diagnosis, healthcare costs, productivity losses, and QALYs.

This structure was adapted from the National Institute for Health and Care Excellence (NICE)’s economic model of the endometriosis care pathway34 and aligns with previously published work33 and best-practice modeling guidance59,60. Model conceptualization and assumptions were refined through consultation with independent health economists and U.S.-based gynecologists and primary care physicians to ensure that the model structure reflects key features of the diagnostic and treatment care pathway for endometriosis, as well as patients’ care-seeking behaviors, in the U.S. healthcare system.

As recommended in recent reviews of economic evaluations in women’s health61, the model adopts a societal perspective, incorporating both direct healthcare costs and productivity losses. A 40-year time horizon was chosen to reflect the typical reproductive lifespan, with menopause marking symptom relief in most cases19. Costs and QALYs were discounted at an annual rate of 3%, consistent with standard economic evaluation practice in the U.S. setting62.
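With 6-month cycles and a 3% annual rate, discounting over the 40-year horizon spans 80 cycles. A minimal sketch of the calculation follows. It assumes values are discounted at the start of each cycle using a fractional-year exponent and applies no half-cycle correction; the text does not specify these conventions, and the function names are illustrative.

```python
def discount_factor(cycle, annual_rate=0.03, cycles_per_year=2):
    """Discount factor for a given cycle index, converting cycles to years."""
    years_elapsed = cycle / cycles_per_year
    return 1.0 / (1.0 + annual_rate) ** years_elapsed


def present_value(stream, annual_rate=0.03, cycles_per_year=2):
    """Discounted sum of a per-cycle stream of costs or QALYs."""
    return sum(
        value * discount_factor(cycle, annual_rate, cycles_per_year)
        for cycle, value in enumerate(stream)
    )


# 40 years of 6-month cycles, each accruing half a year in full health (0.5 QALY).
undiscounted = [0.5] * 80
print(f"Undiscounted QALYs: {sum(undiscounted):.2f}")
print(f"Discounted QALYs:   {present_value(undiscounted):.2f}")
```

The same function applies to cost streams, so costs and QALYs are discounted on an identical basis.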

Model structure

The model reflects key stages of the endometriosis care pathway: symptom onset, initial care-seeking, diagnosis (or misdiagnosis), treatment, relapse, symptom resolution at menopause, and death. A schematic representation of both model arms is shown in Fig. 5. Women in each cohort either have endometriosis or experience similar symptoms due to other causes. To reflect real-world diagnostic uncertainty, we explicitly modeled both true-positive and false-positive pathways, including symptomatic women without endometriosis who may be misflagged to seek medical help or informed that their symptoms were not suggestive of endometriosis via Flo SC.

Fig. 5: Schematic diagram of the endometriosis care pathway model.

At model entry, healthy women may develop symptoms and transition into one of two underlying conditions: endometriosis or other causes. How a woman transitions to subsequent stages depends on the care pathway. In the intervention cohort, women first receive self-assessment via the tool, which influences the likelihood and time to seek care. In the standard of care cohort, care-seeking is based on women’s spontaneous behavior—they consult primary care physicians (PCPs) and, if needed, specialists, undergoing diagnosis and treatment based on cumulative diagnostic accuracy and empirical treatment response.

Diagnosis and treatment pathways were simplified by averaging treatment effects and omitting rare transitions such as delayed-onset endometriosis. Unlike prior models34, we excluded a separate “diagnosed but untreated” state, based on clinical expert input indicating that nearly all patients receive empirical treatment after consultation.

Women may relapse or cycle through treatments, and all can transition to menopause or death at any time based on age-specific probabilities. Menopause was modeled as a semi-absorbing state, after which symptoms typically resolve and no further transitions occur apart from death. Full state definitions and a full list of transitional decision nodes are included in Supplementary Tables 2 and 3.
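The cohort mechanics described here, including a semi-absorbing menopause state that can only lead to death, can be illustrated with a toy state-transition trace. The sketch is deliberately much smaller than the study model: the five states are a simplification of those in Supplementary Table 2, and every transition probability is hypothetical and held constant rather than age-specific.

```python
STATES = ["healthy", "symptomatic", "treated", "menopause", "dead"]

# Hypothetical per-cycle (6-month) transition probabilities; each row sums to 1.
TRANSITIONS = {
    "healthy":     {"healthy": 0.960, "symptomatic": 0.030, "menopause": 0.009, "dead": 0.001},
    "symptomatic": {"symptomatic": 0.800, "treated": 0.190, "menopause": 0.009, "dead": 0.001},
    "treated":     {"treated": 0.850, "symptomatic": 0.140, "menopause": 0.009, "dead": 0.001},
    "menopause":   {"menopause": 0.999, "dead": 0.001},  # semi-absorbing: only exit is death
    "dead":        {"dead": 1.0},                        # absorbing
}


def step(cohort):
    """Advance the cohort by one cycle by redistributing each state's members."""
    next_cohort = {state: 0.0 for state in STATES}
    for source, count in cohort.items():
        for target, probability in TRANSITIONS[source].items():
            next_cohort[target] += count * probability
    return next_cohort


def run_cohort(n_cycles=80, size=10_000):
    """Return the state occupancy at every cycle, starting with everyone healthy."""
    cohort = {state: 0.0 for state in STATES}
    cohort["healthy"] = float(size)
    trace = [cohort]
    for _ in range(n_cycles):
        cohort = step(cohort)
        trace.append(cohort)
    return trace


trace = run_cohort()
print({state: round(count) for state, count in trace[-1].items()})
```

Attaching a cost and a utility weight to each state and summing over the trace (with discounting) would turn this occupancy trace into the cost and QALY totals that the two arms are compared on.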

Model assumptions

The model incorporates a series of structural, clinical, and economic assumptions. These assumptions balance model simplicity and real-world relevance, ensuring that key drivers of diagnostic delay, treatment initiation, and healthcare costs are accurately represented. Women enter the model in a healthy state, with the potential to develop symptoms over time based on age-specific endometriosis onset rates. Once symptomatic, women are classified as either having endometriosis or just endometriosis-like symptoms, reflecting real-world diagnostic uncertainty.

Diagnostic decisions are modeled as a cumulative process rather than a single test, consistent with the NICE approach34. Instead of applying fixed sensitivity and specificity for a specific diagnostic test, the model incorporates the cumulative accuracy of primary care and gynecologist assessments over multiple visits, based on the empirical estimates of diagnostic delay from epidemiological studies21. Women misdiagnosed as having endometriosis are assumed to accrue costs due to misdiagnosis and potential treatment, but do not gain health benefits from formal treatment for endometriosis.

In the model, women who transition to the “No Endometriosis—Diagnosed” state (see Supplementary Table 2) are assumed not to develop endometriosis later in life. While rare cases of late-onset endometriosis may occur, this state is treated as final in the model to simplify the care pathway, similar to the approach taken by NICE34. Additionally, women can transition to menopause (see Supplementary Table 4) or death at any time, based on age-specific probabilities. Menopause serves as a semi-absorbing state, where disease progression and treatment cease, reflecting the natural decline in endometriosis symptoms after reproductive years.

All symptomatic women seeking medical care receive empirical treatment, regardless of whether they have endometriosis. Empirical treatments include non-steroidal anti-inflammatory drugs (NSAIDs) or hormonal therapy, which are commonly prescribed for pelvic pain management in primary care. The effectiveness of empirical treatment is assumed to be similar across all symptomatic women, with formal treatment only initiated upon confirmed diagnosis.

The model incorporates average effectiveness estimates for treatment, rather than differentiating between specific hormonal therapies or surgical procedures. Women who experience symptom relapse transition from the managed state back to active treatment rather than progressing through specific therapy lines.

The model does not explicitly incorporate comorbid conditions, assuming that endometriosis is the primary driver of healthcare utilization and health outcomes. Costs in the model are assigned based on healthcare utilization, diagnostic tests, and treatment interventions, reflecting real-world expenditures. Treatment costs are averaged over different therapeutic options rather than modeled separately for practical purposes.

Model input parameters

The model incorporates key input parameters grouped into three categories: transition probabilities, healthcare costs and health utilities (see Supplementary Fig. 1 for an overview of key parameters used in the model). To minimize selection bias, parameter choices were anchored in the comprehensive systematic review conducted by NICE (2017)63, which remains the most rigorous evidence synthesis in this area. Where appropriate, we retained baseline parameter values from the NICE model, and for parameters that appeared outdated or not suitable for the U.S. setting, we conducted targeted PubMed literature searches (past 10 years) for English-language, peer-reviewed U.S. studies, as well as consulting the latest clinical guidelines and national statistics. Final values were reviewed with two U.S.-based clinicians for face validity.

Tables 3–5 present the parameter values, distributions, and sources for transition probabilities, costs, and health utilities.

Table 3 Model inputs of transition probabilities, their corresponding distributions, and sources
Table 4 Model inputs of costs (2024 USD) and their corresponding distributions
Table 5 Model inputs of utilities and their corresponding distributions

Transition probabilities reflect the likelihood of progressing through health states as described above. Women enter the model as healthy and develop symptoms over time, with age-specific probabilities of endometriosis onset and prevalence of endometriosis among symptomatic individuals informed by epidemiological data.

Assumptions on care-seeking behaviors differ by pathway. In the standard of care cohort, care-seeking is modeled based on women’s spontaneous doctor-visiting patterns. In the intervention cohort, symptomatic users may engage with Flo SC at each cycle, which guides subsequent care-seeking depending on self-assessment outcomes and user compliance with the care-seeking advice. These dynamics are governed by Flo SC uptake among the symptomatic population, its triage accuracy, and user compliance rates, informed by empirical studies on general use patterns of digital SCs and internal Flo SC usage data (see Table 3).
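One way to picture how uptake, triage accuracy, and compliance could combine is as a simple product of probabilities. The sketch below is an illustrative assumption, not the published parameterization: it treats triage accuracy as a sensitivity/specificity pair, ignores spontaneous care-seeking among non-users, and uses hypothetical values rather than those in Table 3.

```python
def prob_prompted_visit(has_endometriosis, uptake, sensitivity, specificity, compliance):
    """Per-cycle probability that a symptomatic woman seeks care because of the SC.

    Assumed chain: she uses the tool (uptake), the tool flags her
    (sensitivity if she has endometriosis, 1 - specificity otherwise),
    and she follows the advice (compliance).
    """
    for name, value in [("uptake", uptake), ("sensitivity", sensitivity),
                        ("specificity", specificity), ("compliance", compliance)]:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    p_flagged = sensitivity if has_endometriosis else 1.0 - specificity
    return uptake * p_flagged * compliance


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
p_true_positive_visit = prob_prompted_visit(True, 0.5, 0.8, 0.75, 0.5)
p_false_positive_visit = prob_prompted_visit(False, 0.5, 0.8, 0.75, 0.5)
```

Under these assumed inputs, a flagged and compliant visit is more likely for women with endometriosis than for women with other causes, which is the mechanism through which the intervention cohort could reach diagnosis sooner.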

After diagnosis, women can begin treatment. Those who respond well enter a stable managed state, while others may experience symptom relapse and return to active treatment. The model also includes age-based transitions to menopause and death to reflect the natural course of endometriosis and capture long-term outcomes.

Since the model includes 11 health states, there are 121 possible transitions in the full transition matrix. However, only a small proportion of these transitions occur in practice, as most are non-reversible. Table 3, therefore, lists only non-zero transitions and omits self-transitions, where individuals remain in the same health state across cycles.
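The relationship between the sparse list of non-zero transitions and the full matrix can be sketched as follows. This is a toy four-state illustration with hypothetical probabilities, not the 11-state model itself; the state names and values are assumptions made for the example. Self-transitions are filled in as the complement of each row, which is why they need not be listed.

```python
# Toy state set; the published model has 11 states (Supplementary Table 2).
STATES = ["Healthy", "Symptomatic", "Menopause", "Death"]

# Hypothetical per-cycle probabilities for non-zero, off-diagonal transitions only.
NONZERO = {
    ("Healthy", "Symptomatic"): 0.02,
    ("Healthy", "Menopause"): 0.01,
    ("Healthy", "Death"): 0.001,
    ("Symptomatic", "Menopause"): 0.01,
    ("Symptomatic", "Death"): 0.001,
    ("Menopause", "Death"): 0.005,
}


def build_matrix(states, nonzero):
    """Expand a sparse transition list into a full row-stochastic matrix."""
    index = {s: i for i, s in enumerate(states)}
    n = len(states)
    matrix = [[0.0] * n for _ in range(n)]
    for (src, dst), p in nonzero.items():
        matrix[index[src]][index[dst]] = p
    for i in range(n):
        leaving = sum(matrix[i])
        if leaving > 1.0:
            raise ValueError(f"Row for {states[i]} sums to more than 1")
        matrix[i][i] = 1.0 - leaving  # self-transition is the complement
    return matrix


def cohort_trace(start, matrix, n_cycles):
    """Propagate a cohort distribution through the matrix, one cycle at a time."""
    n = len(start)
    trace = [list(start)]
    for _ in range(n_cycles):
        prev = trace[-1]
        trace.append([sum(prev[i] * matrix[i][j] for i in range(n)) for j in range(n)])
    return trace


matrix = build_matrix(STATES, NONZERO)
trace = cohort_trace([1.0, 0.0, 0.0, 0.0], matrix, n_cycles=10)
```

With four states the full matrix has 16 cells, of which only six off-diagonal entries are specified; the same bookkeeping scales to the 121 cells of an 11-state matrix.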

All costs, presented in Table 4, were expressed in 2024 USD, inflated where necessary using the U.S. Consumer Price Index64.
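Inflating a historical cost to a target year with a price index is a ratio adjustment. The sketch below shows that arithmetic; the index values are hypothetical placeholders, not actual Consumer Price Index figures, and the function name is an assumption for illustration.

```python
def inflate_to_target_year(cost, cpi_source_year, cpi_target_year):
    """Scale a cost by the ratio of the target-year index to the source-year index."""
    if cpi_source_year <= 0 or cpi_target_year <= 0:
        raise ValueError("Index values must be positive")
    return cost * cpi_target_year / cpi_source_year


# Hypothetical example: a cost of 1000 reported when the index stood at 250,
# expressed in the prices of a year when the index stands at 300.
cost_in_target_year = inflate_to_target_year(1000.0, cpi_source_year=250.0,
                                             cpi_target_year=300.0)
```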

As we adopted a societal perspective, the model captures both direct healthcare costs and indirect productivity losses, which were estimated to account for up to 75% of total endometriosis-related costs25. Direct costs sourced from existing literature include symptomatic care, diagnosis, treatment, and long-term management of endometriosis. These costs cover primary and specialist consultations, diagnostic procedures, pharmacological therapies, surgeries, and follow-up care. The cost of Flo SC was modeled as a semi-annual subscription fee. For false positives, the estimated cost included travel expenses and productivity losses associated with the doctor visit. The direct medical cost of the visit itself was already captured in the model through the transition to the Doctor Visit health state, so no additional diagnostic or treatment costs were applied; this ensures consistent application across all relevant patients and avoids double-counting.

Indirect costs account for productivity losses due to absenteeism and presenteeism across different disease states, particularly before and during treatment. These costs were sourced from existing literature or derived using a human capital approach65, which values lost productivity based on average wage (see Supplementary Table 5 for detailed cost parameters derivation and calculation process).
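The human capital approach can be illustrated with a short calculation. The sketch below is a simplified assumption of how absenteeism and presenteeism might be combined: presenteeism hours are weighted by an impairment fraction before being valued at the wage. The parameter names and values are hypothetical; the actual derivation is in Supplementary Table 5.

```python
def productivity_loss(absent_hours, present_hours, presenteeism_impairment, hourly_wage):
    """Value lost productivity at the average wage (human capital approach).

    absent_hours: work hours missed entirely (absenteeism)
    present_hours: hours worked while symptomatic
    presenteeism_impairment: assumed fraction of output lost while working (0 to 1)
    """
    if not 0.0 <= presenteeism_impairment <= 1.0:
        raise ValueError("presenteeism_impairment must lie in [0, 1]")
    effective_hours_lost = absent_hours + present_hours * presenteeism_impairment
    return effective_hours_lost * hourly_wage


# Hypothetical per-cycle inputs: 10 hours absent, 100 hours worked at 20% impairment,
# valued at an assumed wage of 30 per hour.
loss = productivity_loss(absent_hours=10.0, present_hours=100.0,
                         presenteeism_impairment=0.2, hourly_wage=30.0)
```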

Health utility values, presented in Table 5, were primarily sourced from EQ-5D-based studies on endometriosis (see Supplementary Table 6 for more detail). They were assigned to each health state to reflect how endometriosis symptoms, diagnosis, treatment options and treatment response affect women’s quality of life over time.

Women in the undiagnosed symptomatic state—whether due to endometriosis or similar conditions—had lower utility scores to reflect the physical discomfort and psychological distress from unresolved symptoms. The diagnosed and formally treated state also carried lower utility, acknowledging the temporary reduction in quality of life due to invasive procedures, side effects, or surgical recovery. This is consistent with findings that hormonal or surgical treatment often leads to short-term disability despite targeting long-term relief36.

A separate transitional utility was applied to women receiving first-line or empirical treatment, such as NSAIDs or hormonal therapy, before formal diagnosis66. This utility was not tied to a specific state but applied during treatment transitions for both women with and without endometriosis.

Women who responded well to treatment and entered the managed state experienced higher utility, reflecting sustained symptom control. The menopause state was assigned a population norm value for U.S. women aged 55–6467. The death state was modeled as absorbing, with a utility of zero.

Cost-effectiveness and sensitivity analyses

To evaluate cost-effectiveness, we calculated the incremental cost-effectiveness ratio (ICER) and incremental net monetary benefit (INMB) between the Flo SC intervention and standard of care. These were computed using the following formulas:

$$\mathrm{ICER}=\frac{C_{\mathrm{Flo\,SC}}-C_{\mathrm{standard\,care}}}{Q_{\mathrm{Flo\,SC}}-Q_{\mathrm{standard\,care}}}\quad\text{and}\quad\mathrm{INMB}=(\Delta Q\times \lambda )-\Delta C$$

where \(\Delta C\) is the difference in costs, \(\Delta Q\) is the difference in QALYs, and \(\lambda \) is the willingness-to-pay (WTP) threshold. The base case analysis used two WTP thresholds: $100,000/QALY, in line with the threshold for economic evaluations conducted in the US68, and a more conservative estimate of $50,000/QALY.
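The two measures can be computed directly from per-woman totals. The sketch below uses hypothetical cost and QALY values chosen for easy arithmetic, not results from this study; the function names are assumptions for illustration.

```python
def icer(cost_new, cost_old, qaly_new, qaly_old):
    """Incremental cost per QALY gained; undefined when QALYs are equal."""
    delta_q = qaly_new - qaly_old
    if delta_q == 0:
        raise ZeroDivisionError("ICER is undefined when incremental QALYs are zero")
    return (cost_new - cost_old) / delta_q


def inmb(cost_new, cost_old, qaly_new, qaly_old, wtp):
    """Incremental net monetary benefit at willingness-to-pay threshold wtp."""
    return (qaly_new - qaly_old) * wtp - (cost_new - cost_old)


# Hypothetical per-woman totals: the intervention costs 500 more and yields 0.25 more QALYs.
example_icer = icer(10_500.0, 10_000.0, 15.25, 15.0)
inmb_100k = inmb(10_500.0, 10_000.0, 15.25, 15.0, wtp=100_000.0)
inmb_50k = inmb(10_500.0, 10_000.0, 15.25, 15.0, wtp=50_000.0)
```

A positive INMB at a given threshold indicates the intervention is cost-effective at that threshold. INMB remains interpretable when the intervention is both cheaper and more effective, a case in which the sign of the ICER alone is ambiguous.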

In addition to cost-effectiveness, we modeled time to diagnosis to assess how Flo SC affects diagnostic delays, using outputs from the Markov trace. In each cycle, we calculated the number of women in the undiagnosed endometriosis state and the theoretical minimum time to diagnosis (assuming diagnosis would occur in the next cycle). The difference between these was used to estimate diagnostic delay, which was summed across cycles over the specified time horizon. This approach allowed us to estimate the average diagnostic delay per woman, reported in undiscounted years, for both the Flo SC cohort and the standard of care cohort.
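A simplified reading of this calculation is that time spent undiagnosed can be recovered from state occupancy in the Markov trace. The sketch below is an assumption-laden illustration, not the authors' exact procedure: it sums person-time in the undiagnosed endometriosis state and divides by the number of women who ever develop endometriosis, and it omits the adjustment for the theoretical minimum time to diagnosis. The cycle length and inputs are hypothetical.

```python
def mean_diagnostic_delay(undiagnosed_by_cycle, total_cases, cycle_years):
    """Average undiscounted years spent undiagnosed per woman with endometriosis.

    undiagnosed_by_cycle: number of women in the undiagnosed state in each cycle
    total_cases: number of women who ever enter that state over the horizon
    cycle_years: assumed length of one model cycle in years
    """
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    person_years = sum(undiagnosed_by_cycle) * cycle_years
    return person_years / total_cases


# Hypothetical trace: 10 women remain undiagnosed for four half-year cycles.
delay_years = mean_diagnostic_delay([10, 10, 10, 10], total_cases=10, cycle_years=0.5)
```

Running the same calculation on the trace of each cohort and taking the difference would give an estimate of the change in diagnostic delay attributable to the intervention.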

We performed a series of sensitivity analyses to assess the robustness of model findings. We conducted a one-way deterministic sensitivity analysis (DSA) to explore the influence of individual input parameters on cost-effectiveness by varying key variables one at a time within their plausible ranges. We also conducted a probabilistic sensitivity analysis (PSA) using 1000 Monte Carlo simulations to account for uncertainty across all model inputs. Additionally, we carried out scenario analyses to test alternative model assumptions regarding screening uptake, compliance, Flo SC accuracy, and time horizon to assess real-world implementation conditions and policy relevance.
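The mechanics of a PSA can be sketched as repeated draws from input distributions followed by a model run per draw. The example below is a minimal illustration under stated assumptions: the four inputs, their distributions, and the two-line `toy_model` are hypothetical stand-ins for the full Markov model and the distributions listed in Tables 3–5.

```python
import random


def toy_model(p_earlier_dx, utility_gain, sc_cost, savings_if_earlier):
    """Hypothetical stand-in for the Markov model: returns (delta_cost, delta_qaly)."""
    delta_qaly = p_earlier_dx * utility_gain
    delta_cost = sc_cost - p_earlier_dx * savings_if_earlier
    return delta_cost, delta_qaly


def run_psa(n_draws=1000, wtp=100_000.0, seed=0):
    """Draw all inputs jointly, run the model per draw, and summarize INMB."""
    rng = random.Random(seed)
    inmbs = []
    for _ in range(n_draws):
        # Beta for probabilities and utilities, gamma for costs (assumed choices).
        p_earlier_dx = rng.betavariate(20, 80)
        utility_gain = rng.betavariate(5, 95)
        sc_cost = rng.gammavariate(25, 2.0)        # shape, scale: mean 50
        savings = rng.gammavariate(16, 125.0)      # shape, scale: mean 2000
        delta_cost, delta_qaly = toy_model(p_earlier_dx, utility_gain, sc_cost, savings)
        inmbs.append(delta_qaly * wtp - delta_cost)
    prob_cost_effective = sum(v > 0 for v in inmbs) / n_draws
    return inmbs, prob_cost_effective


inmbs, prob_cost_effective = run_psa(n_draws=1000, wtp=100_000.0, seed=1)
```

Repeating `run_psa` across a grid of `wtp` values and plotting `prob_cost_effective` against the threshold would trace out a cost-effectiveness acceptability curve.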

Ethics approval and consent to participate

Not applicable. This study did not involve human participants, human data, or human tissue.