Introduction

Alzheimer’s disease (AD) has a multifactorial etiology, with advanced age as the strongest risk factor, while genetic predisposition (apolipoprotein E [APOE] ε4 carrier status) and female sex also significantly increase susceptibility1,2. These risk factors may influence not only the likelihood of developing AD but also the rate of clinical progression3,4. Clinically, AD is characterized along a continuum of cognitive stages ranging from cognitively unimpaired (CU) to mild cognitive impairment (MCI) and ultimately dementia5. Importantly, the rate of progression along this continuum can vary according to the specific cognitive status6 and underlying risk profile7,8,9,10,11,12.

Neuropathologically, AD features amyloid-β (Aβ) plaques, tau tangles, and neurodegeneration such as hippocampal atrophy13. These define the A/T/N (amyloid/tau/neurodegeneration) biomarker framework, but the cost and invasiveness of these methods drive the need for scalable plasma biomarkers14. To overcome these limitations, substantial efforts have been devoted to developing blood-based biomarkers that can capture A/T/N categories15,16,17. Core plasma biomarkers include Aβ42/40 ratio and phosphorylated tau species such as pTau181, pTau217, and pTau231, while non-specific markers—such as glial fibrillary acidic protein (GFAP) and neurofilament light chain (NfL)—reflect astrocytic activation/inflammation and axonal injury/neurodegeneration, respectively14. These plasma biomarkers have shown high concordance with their PET or MRI counterparts and have demonstrated utility not only for diagnosing AD but also for predicting prognosis, including rates of cognitive decline and progression across the disease continuum18,19,20. Among them, pTau217 stands out for its exceptional performance in detecting both amyloid and tau pathology18,21,22. In this study, preclinical AD was operationally defined as Aβ + CU individuals, consistent with current secondary prevention trials23. Plasma pTau217 was used as an indirect correlate of tau pathology rather than a direct measurement18; therefore, the proposed framework does not constitute a biological staging system based on combined amyloid and tau positivity.

To date, most studies have focused on identifying prognostic factors within individual cognitive stages of AD. However, as clinical trials and therapeutic options continue to advance, there is a growing need for a unified prognostic framework to provide stage-consistent and biologically informed risk stratification across the disease continuum24,25. Such a system would enable more consistent communication about prognosis, harmonized interpretation of biomarker profiles, and standardized comparison of disease trajectories across studies26.

In this work, we develop and externally validate a prognostic staging system for AD, integrating traditional cognitive status with risk factors and multiple plasma and imaging markers to stratify dementia progression risk. To achieve this, we first identify prognostic subgroups using survival prediction modeling within each cognitive status, then develop outcome-specific stages by merging similar-risk subgroups for each dementia progression outcome, and finally integrate them into a comprehensive staging framework across the Alzheimer’s spectrum.

Results

Demographics of study participants

The study included 1263 participants from the Korea-Registries to Overcome dementia and Accelerate Dementia research (K-ROAD) cohort, comprising 224 CU individuals, 779 with MCI, and 260 with dementia. As detailed in Table 1, the mean ± standard deviation (SD) age was 71.8 ± 8.1 years, and 62.5% of participants were female. Aβ PET positivity increased across initial cognitive status (36.2% in CU, 59.8% in MCI, and 84.2% in dementia). Longitudinal clinical dementia rating sum of boxes (CDR-SB) scores were obtained with a median of 2 assessments per participant (interquartile range [IQR] 2–3) over a median follow-up duration of 2.1 years (95% confidence interval [CI] 2.0–2.2). Clinically meaningful decline was defined using CDR-SB thresholds: ≥3 for very mild dementia, ≥4.5 for mild dementia, and ≥9.5 for moderate dementia.

Table 1 Baseline characteristics of K-ROAD and ADNI cohorts

Phase 1. Identifying prognostic risk subgroups through survival prediction modeling

Participants were classified into prognostic subgroups within each baseline cognitive status (CU, MCI, and dementia), with adjacent categories merged based on clinical and statistical validation (Fig. 1a). This resulted in 2 subgroups for CU, 3 for MCI, and 2 for dementia, totaling 7 refined subgroups (Fig. 2). In CU, participants were grouped by GFAP and pTau217: C1 (GFAP ≤ 195 pg/ml and pTau217 ≤ 0.62 pg/ml) and C2 (GFAP ≤ 195 pg/ml and pTau217 > 0.62 pg/ml, or GFAP > 195 pg/ml) (Fig. 2a). In MCI, hippocampal volume (HV) and pTau217 were the main discriminators: M1 (HV > 2815 mm³ and pTau217 ≤ 0.68 pg/ml), M2 (HV > 2815 mm³ with pTau217 > 0.68 pg/ml, or HV ≤ 2815 mm³ with GFAP ≤ 122 pg/ml), and M3 (HV ≤ 2815 mm³ with GFAP > 122 pg/ml) (Fig. 2b). In dementia, subgroups were defined by age and pTau217: D1 (age > 60 years with pTau217 ≤ 1.1 pg/ml) and D2 (age > 60 years with pTau217 > 1.1 pg/ml, or age ≤ 60 years regardless of pTau217) (Fig. 2c). Adjusted survival curves showed significant differences in progression risk across subgroups for CU (P = 0.011), MCI (P < 0.001), and dementia (P < 0.001), with incidence rates and 3-year cumulative incidences supporting discriminative validity (Supplementary Table 1).
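The subgroup rules above can be written as a small decision function. The following is a minimal sketch that only encodes the cutoffs reported in this section; the function name, argument names, and the status codes "CU", "MCI", and "Dem" are illustrative assumptions, not part of the published method.

```python
def phase1_subgroup(status, age=None, gfap=None, ptau217=None, hv=None):
    """Return the Phase 1 subgroup label for one participant.

    status: "CU", "MCI", or "Dem" (hypothetical codes for this sketch).
    gfap and ptau217 are in pg/ml, hv (hippocampal volume) in mm^3, age in years.
    """
    if status == "CU":
        # C1 requires both low GFAP and low pTau217; everything else is C2.
        if gfap <= 195 and ptau217 <= 0.62:
            return "C1"
        return "C2"
    if status == "MCI":
        if hv > 2815:
            # Preserved hippocampal volume: split on pTau217.
            return "M1" if ptau217 <= 0.68 else "M2"
        # Reduced hippocampal volume: split on GFAP.
        return "M2" if gfap <= 122 else "M3"
    if status == "Dem":
        # D1 requires age above 60 and low pTau217; everything else is D2.
        if age > 60 and ptau217 <= 1.1:
            return "D1"
        return "D2"
    raise ValueError(f"unknown cognitive status: {status!r}")
```

For example, an MCI participant with a hippocampal volume of 2500 mm³ and GFAP of 150 pg/ml falls into M3 under these rules, regardless of pTau217.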

Fig. 1: Development process of a unified prognostic staging system.

a Overall scheme to develop a unified prognostic staging system. b Detailed process for feature selection and cutoff determination to identify prognostic risk subgroups in Phase 1. CU cognitively unimpaired, MCI mild cognitive impairment, Dem dementia, CDR-SB clinical dementia rating–sum of boxes.

Fig. 2: Prognostic risk subgroups identified within initial cognitive status sub-cohorts in Phase 1.

a Risk subgroups in cognitively unimpaired (CU) group. b Risk subgroups in mild cognitive impairment (MCI) group. c Risk subgroups in dementia group. Left panel illustrates survival prediction models developed with selected features and their cutoff points within cognitive status sub-cohorts. The resulting prognostic risk subgroups are displayed with identical colors designating the same category: C1 and C2 in CU, M1 to M3 in MCI, and D1 and D2 in dementia. In the right panel, adjusted survival curves along with p-values indicate statistically significant progressive deterioration across prognostic subgroups (\(P=0.0106\) in CU, \(P=2.68\times {10}^{-13}\) in MCI, and \(P=0.0005\) in dementia). Adjusted survival curves were estimated with inverse probability weights and were compared using a log-rank test corrected for weighting. All statistical tests were two-sided, and source data are provided as a Source data file.

Phase 2. Developing a unified prognostic staging across the Alzheimer’s disease continuum

Prognostic subgroups from Phase 1 were used to generate outcome-specific stages for each of three progression outcomes (Fig. 1a): very mild dementia (N = 923), mild dementia (N = 1120), and moderate dementia (N = 1263). These outcome-specific stages were integrated into a six-stage unified prognostic system (Stage 0–IVB) reflecting increasing dementia severity (Fig. 3). Detailed procedures for developing the unified prognostic staging system are presented in Supplementary Method 3.

Fig. 3: Outcome-specific progression stages and a unified prognostic staging system across the Alzheimer’s disease continuum.

Unified prognostic staging system defined through the integration of outcome-specific stages. C1 and C2 are prognostic risk subgroups in the cognitively unimpaired (CU) group. M1 to M3 are prognostic risk subgroups in the mild cognitive impairment (MCI) group, while D1 and D2 are prognostic risk subgroups in the dementia group. Prognostic risk subgroups positioned at the same horizontal level represent equivalent stages. Stage labels (e.g., IVA, IVB) reflect clinical progression severity. Detailed methods are provided in Supplementary Method 3.

Adjusted survival curves confirmed effective separation between stages (Fig. 4a), and pairwise comparisons demonstrated significant differences in progression risk between all adjacent stages (all P < 0.05) (Fig. 4b). Notably, early to intermediate stages (Stages 0–III) were primarily distinguished by very mild and mild dementia outcomes, whereas advanced stages (Stage IVA–IVB) were driven by moderate dementia outcomes. Higher stages were consistently associated with increasing CDR-SB and decreasing MMSE scores (both P < 0.001) (Fig. 4c). Marked inflection points in incidence rates and 3-year cumulative incidence were observed at Stage 0 → I, Stage I → II, Stage II → III, and Stage III → IVA (Table 2), highlighting clinically meaningful thresholds for prognosis and intervention. For clarity, we emphasize these key transitions, particularly between mid-level stages, as they represent sharp increases in progression risk and may serve as optimal points for clinical decision-making.

Fig. 4: Clinical validation of the unified prognostic staging system.

Unified prognostic stages were clinically validated by comparing dementia progression curves and longitudinal cognitive trajectories with adjustment for age, sex, education, and apolipoprotein E (APOE) ε4 status. Stage labels reflect clinical progression severity. a Adjusted survival curves for each dementia progression outcome show effective stage differentiation across the cognitive spectrum: very mild dementia outcome differentiated lower stages, mild dementia outcome distinctly identified intermediate stages, and moderate dementia outcome distinguished advanced stages. Overall differences among stages were assessed using a log-rank test corrected for weighting (\(P=2.68\times {10}^{-13}\) in very mild dementia, \(P=4.74\times {10}^{-41}\) in mild dementia, and \(P=2.05\times {10}^{-18}\) in moderate dementia). b Adjusted pairwise comparisons were conducted using multivariable Cox proportional hazards models, and demonstrated statistically significant differences between adjacent stages, confirming discriminative validity. P values within each survival outcome were corrected for multiple hypothesis testing using a Benjamini–Hochberg procedure. c Longitudinal cognitive trajectories for each stage, measured by clinical dementia rating–sum of boxes (CDR-SB) and mini-mental state examination (MMSE), show progressively greater decline in higher stages over time. Data are presented as least squares means (LSmeans) with 95% confidence intervals estimated using generalized estimating equations. All statistical tests were two-sided, and source data are provided as a Source data file.

Table 2 Incidence rates and cumulative incidences at three years of the unified staging system according to the progression of dementia

External validation of the unified staging system using the ADNI cohort

For external validation, we utilized an independent cohort from the Alzheimer’s Disease Neuroimaging Initiative (ADNI), comprising 290 participants (160 CU, 118 MCI, and 12 dementia) with a median follow-up of 5.5 years (95% CI 5.1–6.0 years) (Table 1). Adjusted survival curves showed consistent patterns of worsening prognosis across stages for very mild and mild dementia outcomes, although moderate dementia differentiation was less distinct due to limited event occurrences in lower stages (Fig. 5a, b). Also, stages III, IVA, and IVB were excluded from analysis due to small sample sizes (N \(\le\) 10 each). Cognitive trajectories indicated progressive worsening, with increasing CDR-SB and decreasing MMSE scores at higher stages, aligning with severity escalation (Fig. 5c). Similar patterns were confirmed by 3-year cumulative incidence and incidence rates (Supplementary Table 2).

Fig. 5: External validation of the unified prognostic staging system using the ADNI cohort.

a Adjusted survival curves for very mild and mild dementia outcomes show consistent separation of worsening prognosis across stages, while less differentiation is found for the moderate dementia outcome due to limited event occurrence. A log-rank test corrected for weighting was used to assess overall differences among stages, yielding \(P=2.03\times {10}^{-6}\) for very mild dementia, \(P=3.52\times {10}^{-5}\) for mild dementia, and P = 0.0419 for moderate dementia. Stages with fewer than 10 participants (i.e., III, IVA, IVB) were excluded from this analysis. b Pairwise comparisons between stages using multivariable Cox proportional hazards models demonstrated significant differences in progression risk for adjacent stage pairs, supporting external discriminative validity. P values within each survival outcome were corrected for multiple hypothesis testing using a Benjamini–Hochberg procedure. c Longitudinal changes in clinical dementia rating–sum of boxes (CDR-SB) and mini-mental state examination (MMSE) also showed progressively steeper decline in higher stages over time. Data are presented as least squares means (LSmeans) and their 95% confidence intervals estimated with generalized estimating equations. All statistical tests were two-sided, and source data are provided as a Source data file.

Discussion

In this study, we developed a proof-of-concept prognostic staging framework for AD that integrates cognitive status, traditional risk factors, plasma biomarkers, and neuroimaging markers to stratify progression risk across the disease continuum. Unlike prior studies that identified prognostic factors within each cognitive stage independently, our approach unifies CU, MCI, and dementia into a single continuous framework, enabling stage-consistent interpretation of prognosis and transitions over time.

Our first major finding was that primary prognostic discriminators varied across cognitive stages (CU, MCI, and dementia), reflecting the evolving pathophysiology of AD. Specifically, the primary discriminators were non-specific markers rather than core AD biomarkers: CU—GFAP, representing neuroinflammation; MCI—hippocampal atrophy, representing neurodegeneration. In CU, the prominence of GFAP aligns with prior studies showing that it predicts future tau accumulation and cognitive decline27,28, likely reflecting early astrocytic responses to amyloid pathology13. In MCI, hippocampal atrophy has consistently been shown to predict progression to dementia, particularly in the prodromal phase, reinforcing its central prognostic role during this transitional stage29,30. In dementia, younger age emerged as the strongest predictor of faster progression3, a pattern consistent with extensive evidence that early-onset AD follows a more aggressive clinical course. Individuals with early-onset AD typically show markedly higher amyloid positivity rates31, greater tau burden on PET32, and more severe and widespread cortical atrophy33 compared with late-onset AD. These biological differences—rather than chronological age itself—likely account for the accelerated progression observed in younger patients34,35.

Across all stages, plasma pTau217 consistently served as a secondary discriminator, reinforcing its relevance as a dynamic biomarker that tracks both amyloid- and tau-related processes18,36,37. Indeed, plasma pTau217 has demonstrated independent associations with both amyloid and tau pathology18,38. Notably, our recent work39 demonstrated that pTau217 predicts both Aβ (AUC 0.96) and tau (AUC 0.90) PET positivity and distinguishes longitudinal cognitive decline across plasma-defined AT profiles, underscoring its utility for prognostic stratification even without tau PET. However, GFAP and NfL are not disease-specific, and even pTau217 partly reflects non–AD processes—including ageing40, cerebrovascular burden41, and other neurodegenerative conditions—so these markers should be interpreted as indicators of broader neural vulnerability rather than AD-specific pathology.

Our second major finding was that outcome-specific prognostic staging revealed complementary patterns of progression. Early-to-intermediate stages (0–III) were primarily separated by very mild and mild dementia outcomes, while later stages (IVA–IVB) were distinguished by moderate dementia outcomes. This hierarchical organization enabled us to preserve unique prognostic information from each outcome domain before integrating them into a unified system. Unlike biological staging frameworks such as A/T/N24, our model does not incorporate regional tau PET or quantitative pathological burden and therefore should not be viewed as a biological classification system. Instead, it is an outcome-driven, prognostic framework built from T1 biomarkers, non-specific plasma markers, and known clinical risk factors, intended to complement, not replace, biological staging approaches. Because T2 biomarkers such as tau PET are not yet widely available in routine research or clinical settings, the present framework necessarily emphasizes prognostic rather than biological classification. Within each cognitive stage, individuals exhibited substantial heterogeneity in progression risk7,8,12,42, and this variability was captured by the combined contribution of plasma biomarkers, neurodegeneration measures, and clinical risk factors.

Our final major finding was that the unified six-stage system (0–IVB) produced clear, stepwise gradients in functional and cognitive decline and identified reproducible inflection points at which clinical worsening accelerated. These thresholds may facilitate standardized prognostic communication and research-level stratification across the AD continuum. However, this framework is not designed for therapeutic decision-making. In particular, eligibility for monoclonal antibody therapy requires confirmed amyloid positivity and treatment-response biomarkers43—elements that the present system does not assess. Therefore, this model should be used strictly for prognostic stratification rather than for guiding treatment selection.

External validation using the ADNI cohort supported stage-dependent prognostic patterns, particularly for early and intermediate stages. However, only 290 of the 378 eligible participants could be analyzed because complete plasma pTau217, GFAP, hippocampal volume, Aβ PET, and longitudinal CDR data were required for consistent modeling. This resulted in a biomarker-enriched subset that underrepresented advanced dementia and limited evaluation of later-stage performance. Broader validation using population-based and clinically heterogeneous cohorts will therefore be necessary to establish generalizability.

The strengths of our study include a relatively large sample size with well-balanced representation across the Alzheimer’s continuum. However, several limitations should be considered. First, although the K-ROAD cohort used a hybrid recruitment strategy from both memory clinics and community dementia prevention centers, it nonetheless represents a selectively ascertained research sample rather than a fully population-based or memory-clinic cohort. Although individuals with extensive cerebrovascular disease were excluded, only limited vascular and genetic variables (age, hypertension, diabetes mellitus, APOE ε4 carrier status, and education) were included, and detailed vascular imaging markers or social determinants of health were not available—factors that may contribute to mixed pathology. Second, the dataset did not include systematic measurements of modifiable lifestyle risk factors (e.g., diet, physical activity, sleep, cardiovascular behaviors)44, precluding evaluation of lifestyle-related predictors commonly observed in population-based cohorts. Because such assessments are required to model prevention in CU individuals, the present framework should not be interpreted as a prevention model. Third, the cohort did not include individuals with atypical AD presentations, who may follow distinct clinical trajectories; future studies should examine whether this framework generalizes across phenotypic heterogeneity45. Finally, longitudinal follow-up will be required to determine whether the staging system reliably predicts long-term clinical trajectories. Despite these limitations, this study proposes a unified, biomarker-informed prognostic framework that captures heterogeneous progression patterns across the AD continuum and provides a structured foundation for future prognostic research, particularly as multimodal biomarkers, tau PET, and population-based datasets become increasingly available.

In conclusion, this study introduces a stage-spanning prognostic structure that leverages cognitive status, established risk factors, and scalable biomarker modalities to delineate progression risk across the Alzheimer’s continuum. Rather than functioning as a biological staging tool, the framework provides a pragmatic foundation for future work aimed at refining individualized prognosis—particularly as broader tau biomarkers, multimodal datasets, and population-based cohorts become more widely available.

Methods

For the K-ROAD cohort, the study was approved by the institutional review board of Samsung Medical Center (No. 2021-02-135). All participants provided informed consent, and the study was conducted in accordance with the Declaration of Helsinki. The ADNI study was also approved by the institutional review boards of all participating sites, and written informed consent was obtained from all participants. All data were handled in accordance with relevant data protection and privacy regulations.

Participant selection

We included 1416 participants with CU, MCI, and dementia from the K-ROAD project, all of whom had available data on baseline demographics including comorbidities, plasma and imaging markers, and longitudinal CDR-SB assessments. The K-ROAD project is a nationwide initiative involving 25 university-affiliated hospitals across South Korea46. Recruitment was conducted both through memory disorder clinics and government-commissioned community dementia prevention centers, resulting in a hybrid cohort. However, we emphasize that this hybrid structure does not make the cohort representative of either the general population or a typical memory clinic population, as participants were selectively enrolled based on research-appropriate clinical evaluations and biomarker availability. Inclusion and exclusion criteria for this study have been described in detail elsewhere18. In summary, CU participants met the following conditions: (1) no major medical or psychiatric illness that could affect cognitive function, and (2) no objective cognitive impairment in any cognitive domain47,48. MCI participants met the following conditions: (1) subjective cognitive complaints reported by the participant or caregiver; (2) objective cognitive impairment in one or more cognitive domains (defined as performance below −1.0 SD in memory and/or −1.5 SD in other domains, based on age- and education-adjusted norms); and (3) preserved instrumental activities of daily living49. Dementia participants fulfilled the National Institute on Aging–Alzheimer’s Association (NIA-AA) core clinical criteria for probable AD dementia50. Notably, baseline cognitive classifications (CU, MCI, and dementia) were determined independently of both CDR-SB and biomarker data (e.g., Aβ PET or plasma biomarkers), based solely on standard diagnostic procedures including neuropsychological testing, structured interviews, and functional assessments. As a result, each group may include individuals with or without biomarker positivity. The comprehensive participant selection flow is detailed in Supplementary Fig. 1.

Plasma biomarker measurements

Plasma Aβ40, Aβ42, GFAP, and NfL concentrations were measured using the commercial Neurology 4-Plex E kit (Quanterix, Billerica, MA, USA). Plasma pTau181 and pTau231 concentrations were measured using in-house Simoa assays developed at the University of Gothenburg, and pTau217 concentration was measured using the commercial ALZpath pTau217 assay kit. All samples were analyzed in a single run with one batch of reagents, and the intra-assay coefficient of variation was below 10%. All measurements were performed by analysts blinded to clinical data.

Brain MRI and hippocampal volume measurements

Three-dimensional T1-weighted turbo field echo imaging was performed for all participants, with a sagittal slice thickness of 1.0 mm and 50% overlap. As previously described, hippocampal volume was quantified using an automated segmentation method that combines a graph cut algorithm with atlas-based segmentation and morphological opening51.

Aβ PET acquisition and quantification

Aβ PET scans were performed using either 18F-Florbetaben (FBB) or 18F-Flutemetamol (FMM), following each manufacturer’s standardized imaging protocols. Aβ PET binding was quantified using the regional direct comparison centiloid (rdcCL) method, with the whole cerebellum as a reference region52. This method allows harmonization of FBB and FMM tracers without requiring 11C-labeled Pittsburgh compound B images. Aβ PET positivity was defined using a global MRI-based rdcCL threshold of 25.5, derived via Gaussian mixture modeling39. All imaging data were processed at the Samsung Medical Center laboratory, which served as the core center. The median time interval between plasma sampling and Aβ PET imaging was 4 days (IQR 0–69 days).

Physical comorbidities

Information on vascular risk factors, including hypertension and diabetes mellitus, was obtained from self-reported medical history or from records of current use of antihypertensive or antidiabetic medications.

Longitudinal cognitive assessments

The CDR-SB score is a measure of cognition and function, obtained by interviewing both patient and care partner. Its longitudinal assessments (i.e., CDR-SB profile) are used to track dementia progression over time. In this study, baseline CDR-SB was defined as the first assessment conducted within ±1 year of either blood sampling or Aβ PET imaging. All participants had at least two CDR-SB assessments with a minimum interval of 3 months between visits.

Three survival outcomes of dementia progression were defined using CDR-SB cutoffs of ≥ 3.0 (very mild dementia), ≥ 4.5 (mild dementia), and ≥ 9.5 (moderate dementia), supported by both prior literature53,54,55 and validation within our cohort. Staging based on CDR-SB cutoffs of 0.5, 4.5, and 9.5 showed high concordance with global CDR scores (Cohen’s κ = 0.923 [95% CI, 0.910–0.936]; Supplementary Fig. 2). To refine early-stage classification, we compared individuals with CDR-SB scores of 0.5–2.5 versus 3.0–4.0. The two groups showed differences in baseline MMSE, plasma biomarker levels, and Aβ PET positivity (Supplementary Fig. 3a). They also exhibited distinct longitudinal MMSE trajectories in least squares mean plots estimated with a generalized estimating equation model (Supplementary Fig. 3b), supporting the use of 3.0 as the cutoff for very mild dementia.

For each dementia progression outcome, the time-to-progression was defined as the interval from the baseline date to the first visit at which CDR-SB reached or exceeded the corresponding cutoff. Participants whose baseline CDR-SB was already at or above the cutoff were excluded from the analysis of that outcome.
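The outcome definition can be illustrated with a short sketch that derives a time-to-event pair from one participant's visit history. It assumes the cutoffs are inclusive, following the ≥ thresholds given for the three outcomes, and it assumes that participants without progression are censored at their last visit; the censoring rule, the function name, and the outcome keys are assumptions made for illustration.

```python
from datetime import date

# CDR-SB cutoffs for the three progression outcomes (from the text).
CUTOFFS = {"very_mild": 3.0, "mild": 4.5, "moderate": 9.5}


def time_to_progression(visits, outcome):
    """Derive (years_from_baseline, event) for one participant.

    visits: list of (visit_date, cdr_sb) tuples sorted by date; the first
    entry is treated as baseline. Returns None when the participant is
    excluded because baseline CDR-SB already meets the cutoff.
    """
    cutoff = CUTOFFS[outcome]
    baseline_date, baseline_score = visits[0]
    if baseline_score >= cutoff:
        return None  # excluded from the analysis of this outcome
    for visit_date, score in visits[1:]:
        if score >= cutoff:
            # First visit meeting the cutoff defines the event time.
            return ((visit_date - baseline_date).days / 365.25, 1)
    # Assumption: no progression observed, censor at the last visit.
    last_date = visits[-1][0]
    return ((last_date - baseline_date).days / 365.25, 0)
```

Because exclusion is applied per outcome, the same participant can contribute to the mild and moderate dementia analyses while being excluded from the very mild dementia analysis.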

Development process of a unified prognostic staging system

Phase 1. Identifying prognostic risk subgroups through survival prediction modeling

In Phase 1, the entire cohort was divided into three sub-cohorts by initial cognitive status (CU, MCI, and dementia) (Fig. 1a). Stratification within each sub-cohort was conducted independently by developing a survival prediction model for the corresponding dementia progression outcome: very mild dementia for CU, mild dementia for MCI, and moderate dementia for dementia. Candidate prognostic features included risk factors such as age, sex, education years, APOE ε4 carrier status, the presence of hypertension and diabetes mellitus, plasma biomarkers such as Aβ42/40 ratio, pTau181, pTau217, pTau231, GFAP, and NfL, and imaging markers including Aβ PET positivity and MRI hippocampal volume.

Random survival forest (RSF) was used to identify prognostically similar subgroups within each cognitive status based on survival outcomes of dementia progression56 (Fig. 1b). RSF was modified to select optimal prognostic features together with robust cutoff points. First, the tree structure was restricted to a maximum depth of one (i.e., only a single split of each parent node was performed). An optimal cutoff point was determined to enable bifurcation based on a numeric feature. For each node, we generated 1000 bootstrap samples to create 1000 single-split trees, each of which was represented by a best-splitting feature with its optimal cutoff point. The optimal prognostic feature was selected as the most frequently chosen feature across all single-split trees. When the selected feature was numeric, we employed weighted averaging of the identified cutoff points to determine the robust cutoff point: among all unique values of the identified cutoff points, we chose the 5 or 10 most frequently identified values and averaged them. Following binary partitioning of a parent node, this algorithmic process was applied recursively to each resultant child node. The stopping rule for tree expansion was established by limiting the numbers of events and participants within a candidate parent node, so that a node could not be split if it had fewer than the specified numbers. The cumulative hazard function served as the risk score of RSF56, and log-rank splitting57,58 or global non-quantile Brier score splitting59 was employed as the splitting rule. The detailed process is described in Supplementary Method 2, and the modified RSF was applied only in Phase 1.
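The bootstrap feature and cutoff selection described above can be sketched in plain Python. This is a simplified illustration, not the authors' implementation: it uses only a two-group log-rank statistic as the splitting rule, weights the most frequent cutoffs by how often they were chosen, and applies a minimum node size chosen arbitrarily. All function names and the `min_leaf`, `top_k`, and `seed` parameters are assumptions of this sketch.

```python
import random
from collections import Counter


def logrank_chi2(times, events, in_left):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    obs_minus_exp, var = 0.0, 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        n = sum(1 for x in times if x >= t)  # at risk, both groups
        n1 = sum(1 for x, g in zip(times, in_left) if x >= t and g)
        d = sum(1 for x, e in zip(times, events) if x == t and e)
        d1 = sum(1 for x, e, g in zip(times, events, in_left)
                 if x == t and e and g)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:  # hypergeometric variance of d1
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / var if var > 0 else 0.0


def best_single_split(rows, features, min_leaf=5):
    """Depth-one tree: best (feature, cutoff) by log-rank statistic."""
    times = [r["time"] for r in rows]
    events = [r["event"] for r in rows]
    best = (0.0, None, None)
    for f in features:
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            cut = (lo + hi) / 2  # candidate cutoff between adjacent values
            left = [r[f] <= cut for r in rows]
            if min(sum(left), len(left) - sum(left)) < min_leaf:
                continue
            stat = logrank_chi2(times, events, left)
            if stat > best[0]:
                best = (stat, f, cut)
    return best[1], best[2]


def robust_split(rows, features, n_boot=1000, top_k=5, seed=0):
    """Most frequently chosen feature and its averaged cutoff."""
    rng = random.Random(seed)
    picks = []
    for _ in range(n_boot):
        sample = [rng.choice(rows) for _ in rows]  # bootstrap resample
        f, cut = best_single_split(sample, features)
        if f is not None:
            picks.append((f, cut))
    if not picks:
        return None, None
    feature = Counter(f for f, _ in picks).most_common(1)[0][0]
    top = Counter(c for f, c in picks if f == feature).most_common(top_k)
    # Frequency-weighted mean of the top_k most frequent cutoffs.
    cutoff = sum(c * n for c, n in top) / sum(n for _, n in top)
    return feature, cutoff
```

Applying `robust_split` to a node and then recursing on the two resulting child nodes, subject to a stopping rule on events and participants, would mirror the recursive partitioning described in the text.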

Phase 2. Developing a unified prognostic staging across the Alzheimer’s disease continuum

The second phase employed a three-step process to develop a unified prognostic staging system (Fig. 1a). In the first step, outcome-specific staging was developed by merging prognostic risk subgroups with similar risk for each dementia progression outcome based on clinical evaluation: visually overlapping adjusted survival curves with non-significant differences in pairwise comparison, and similar incidence metrics. Prognostic risk subgroups remained separate if their curves were visually distinct and their incidence metrics differed notably, even without statistical significance. In the second step, we integrated these outcome-specific categorizations into a unified prognostic staging system encompassing dementia progression and cognitive status information. Specifically, early stages were derived by focusing on the very mild dementia-specific categorization; intermediate stages were formed by considering the mild dementia-specific categorization; and advanced stages were created based on the moderate dementia-specific categorization. In the final step, the recombined stages were labeled from stage 0 to stage IV, with each progressive stage representing increasing disease severity. Adjacent stages with overlapping 95% CIs for three-year cumulative incidence or incidence rates were assigned to the same primary stage, with further refinement into subcategories (A or B) within primary stages. Supplementary Method 3 illustrates the detailed process for developing the unified prognostic staging system.
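The final labeling step can be illustrated with a minimal sketch of the confidence-interval overlap rule alone. It does not reproduce the clinical judgment used in the earlier merging steps, it compares each group only with its immediate predecessor, and the intervals in the usage example are hypothetical values rather than study results. The function name and input format are assumptions.

```python
from string import ascii_uppercase

ROMAN = ["0", "I", "II", "III", "IV", "V", "VI"]


def label_stages(intervals):
    """Assign stage labels to groups ordered from lowest to highest risk.

    intervals: list of (lower, upper) 95% CIs, one per group. Adjacent
    groups with overlapping intervals share a primary stage and receive
    A/B suffixes; otherwise the primary stage advances by one.
    """
    primary = [0]
    for prev, cur in zip(intervals, intervals[1:]):
        overlap = cur[0] <= prev[1] and prev[0] <= cur[1]
        primary.append(primary[-1] if overlap else primary[-1] + 1)
    labels = []
    for i, p in enumerate(primary):
        members = [j for j, q in enumerate(primary) if q == p]
        # Only primary stages shared by several groups get a suffix.
        suffix = "" if len(members) == 1 else ascii_uppercase[members.index(i)]
        labels.append(ROMAN[p] + suffix)
    return labels


# Hypothetical 3-year cumulative incidence CIs (%), for illustration only.
example = [(1, 5), (8, 15), (20, 30), (35, 50), (60, 80), (70, 90)]
print(label_stages(example))  # ['0', 'I', 'II', 'III', 'IVA', 'IVB']
```

With these made-up intervals only the last two groups overlap, so they share primary stage IV and are split into IVA and IVB, which matches the shape of the six-stage labeling described in the text.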

The unified prognostic stages were clinically validated by comparing survival curves for dementia progression outcomes and longitudinal cognitive trajectories for CDR-SB and MMSE, with adjustment for age, sex, education, and apolipoprotein E (APOE) ε4 status.

External validation with the ADNI dataset

The unified staging system underwent external validation using the ADNI dataset, with detailed procedures described in Supplementary Method 1. A total of 290 ADNI participants with available plasma pTau217, GFAP, hippocampal volume on MRI, and longitudinal CDR-SB data were included to match the derivation criteria and minimize selection bias. After applying the proposed staging criteria to classify ADNI participants, we assessed whether three-year cumulative incidence and incidence rates demonstrated clear differentiation between adjacent stages in this independent cohort. Similarly, adjusted survival curves were generated, and pairwise comparisons using multivariable Cox proportional hazards regression models were conducted to evaluate stage differentiation. Finally, cognitive trajectories over follow-up were examined to assess whether the magnitude and progression rate of cognitive impairment differed between stages in this validation cohort, thereby evaluating the generalizability of the staging system.

Statistical methods

Descriptive statistics were calculated for each cohort and proposed staging category, with continuous variables presented as mean (SD) and categorical variables as frequencies (proportion). Median follow-up duration with corresponding 95% CIs was determined using the reverse Kaplan-Meier method60.
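As a brief illustration of the reverse Kaplan-Meier method, the sketch below swaps the event and censoring indicators, so that censoring is treated as the event of interest, and reads off the median of the resulting Kaplan-Meier curve. It is a simplified sketch rather than the analysis code used in this study: the function names are hypothetical and the 95% CI is not computed.

```python
def km_median(times, events):
    """Kaplan-Meier median: first event time at which estimated survival is <= 0.5."""
    surv = 1.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = sum(1 for ti in times if ti >= t)
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        surv *= 1 - d / at_risk
        if surv <= 0.5:
            return t
    return None  # median not reached

def median_follow_up(times, dementia_events):
    """Reverse Kaplan-Meier: censored observations become events and vice versa."""
    reversed_events = [0 if e else 1 for e in dementia_events]
    return km_median(times, reversed_events)
```

For example, with follow-up times `[1, 2, 3, 4, 5, 6]` and dementia indicators `[1, 0, 0, 0, 1, 0]`, the function returns 4, whereas the naive median of the observed times is 3.5. The naive value is lower because follow-up that ended early through progression counts against it.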

All proposed subgroups, categories, and the unified staging system were evaluated using multiple measures. Incidence rates of dementia per 100 person-years with exact Poisson 95% CIs were calculated61, along with three-year cumulative incidence of dementia with 95% CIs (derived as one minus the Kaplan-Meier estimated survival probability at 3 years)62. We assessed whether incidence rates and three-year cumulative incidence demonstrated sufficient discrimination between adjacent stages. Adjusted survival curves were drawn using inverse probability weights to evaluate potential overlap in dementia progression between stages63. These adjusted survival curves were compared across all groups using a log-rank test corrected for weighting, which tests whether all proposed categories share the same dementia-free survival curve over follow-up64. Additionally, all pairwise comparisons were conducted using multivariable Cox proportional hazards regression models. The P values for all pairwise comparisons were adjusted using the Benjamini–Hochberg procedure to account for multiplicity, thereby controlling the false discovery rate65. Firth’s bias correction was adopted when some groups or categories contained a very small number of events66. The longitudinal trajectories were estimated using least-squares mean plots derived from generalized estimating equations67,68. Age, sex, duration of education, and APOE ε4 carrier status were used as covariates for adjustment.
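Two of the measures above lend themselves to a short sketch: the incidence rate per 100 person-years with an exact Poisson 95% CI, and Benjamini–Hochberg adjustment of pairwise P values. The code is illustrative only and is not the SAS or R code used for the analyses. The exact interval is found by bisection on the Poisson distribution function, which is an implementation choice made here and is suitable only for modest event counts. All function names are hypothetical.

```python
import math

def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu)."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total

def _bisect(f, target, lo, hi, increasing):
    """Solve f(mu) = target on [lo, hi] for a monotone function f."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_poisson_ci(k, alpha=0.05):
    """Exact two-sided CI for a Poisson mean given k observed events."""
    upper_bracket = k + 10 * math.sqrt(k + 1) + 10
    lower = 0.0 if k == 0 else _bisect(
        lambda mu: 1 - poisson_cdf(k - 1, mu), alpha / 2, 0.0, upper_bracket, True)
    upper = _bisect(lambda mu: poisson_cdf(k, mu), alpha / 2, 0.0, upper_bracket, False)
    return lower, upper

def incidence_rate_per_100py(events, person_years, alpha=0.05):
    """Incidence rate per 100 person-years with its exact Poisson CI."""
    lower, upper = exact_poisson_ci(events, alpha)
    scale = 100 / person_years
    return events * scale, lower * scale, upper * scale

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # step down from the largest P value
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

As a hypothetical example, 10 progression events over 200 person-years give a rate of 5.0 per 100 person-years with an exact 95% CI of roughly 2.4 to 9.2.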

Statistical significance was defined as two-sided P values less than 0.05. All statistical analyses and RSF modeling were conducted with SAS version 9.4 (SAS Institute, Cary, NC, USA) and R version 4.4.1 (The R Foundation, www.R-project.org).

Statistics and reproducibility

Sample sizes were determined by the availability of eligible participants in the K-ROAD and ADNI cohorts, with predefined exclusion criteria applied to ensure adequate longitudinal follow-up and suitability for survival analyses. This was an observational study and did not involve randomization. Plasma biomarker analyses were conducted by investigators blinded to clinical and imaging data.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.