Fig. 3: Disease risk in the clusters of the discovery and validation cohorts.

A Coefficient of cluster membership (hazard ratio [HR] in the weighted Cox proportional hazards regression model) with respect to the onset of each cross-cohort disease in the UKB cohort (N = 502,504). The top five diseases with the strongest increase/decrease in risk in each cluster are indicated in the plot and listed on the right. The colour of the markers corresponds to the main ICD-10 disease category. D: Diseases of the blood and blood-forming organs, E: Endocrine, nutritional and metabolic diseases, F: Mental, behavioural and neurodevelopmental disorders, G: Diseases of the nervous system, H: Diseases of the eye, ear and mastoid process, I: Diseases of the circulatory system, J: Diseases of the respiratory system, K: Diseases of the digestive system, L: Diseases of the skin and subcutaneous tissue, M: Diseases of the musculoskeletal system and connective tissue, N: Diseases of the genitourinary system.

B Values and 95% confidence intervals of cluster membership coefficients (hazard ratios) from the weighted Cox proportional hazards regression models for the onset of MDD across various cohorts. Points indicate coefficient values and error bars represent the 95% confidence intervals. Colours represent the different cohorts.

C Weighted Kaplan‒Meier estimates of MDD-free survival in the various cohorts throughout participants’ lifespans. Survival curves are labelled by cluster numbers, and the colours of the curves indicate the distinct clusters. The dotted grey curves indicate the mean MDD-free survival in the whole cohort, regardless of cluster membership.

(In A and C: UKB N = 502,504; THL N = 41,092; CHSS N = 645,913; FinnGen N = 385,640; SHIP N = 1,449).
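
As a rough illustration of the quantities plotted in B and C, the following Python sketch computes a weighted Kaplan‒Meier curve and converts a Cox regression coefficient into a hazard ratio with a 95% confidence interval. It is not the authors' pipeline. The function names, the toy data, the participant weights, and the use of a normal-approximation (Wald) interval are all assumptions made for illustration; the study may have used a different variance estimator for the weighted models.

```python
import math
from collections import defaultdict


def weighted_kaplan_meier(times, events, weights):
    """Weighted Kaplan-Meier estimate of event-free survival.

    times   -- age at MDD onset or at censoring (hypothetical input)
    events  -- 1 if onset was observed, 0 if the participant was censored
    weights -- non-negative per-participant weights (hypothetical input)

    Returns a list of (time, survival) pairs, one per distinct event time.
    """
    weighted_events = defaultdict(float)   # weighted onsets at each time
    weighted_leaving = defaultdict(float)  # weight leaving the risk set at each time
    for t, e, w in zip(times, events, weights):
        weighted_leaving[t] += w
        if e:
            weighted_events[t] += w

    at_risk = float(sum(weights))  # weighted size of the risk set
    survival = 1.0
    curve = []
    for t in sorted(weighted_leaving):
        d = weighted_events.get(t, 0.0)
        if d > 0:
            # Product-limit step: multiply by the weighted share that stays event-free.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Participants with an event or censoring at t leave the risk set after t.
        at_risk -= weighted_leaving[t]
    return curve


def hazard_ratio_ci(beta, se, z=1.96):
    """Hazard ratio and Wald-type 95% CI from a Cox coefficient and its standard error.

    Assumes a normal approximation on the log-hazard scale.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


if __name__ == "__main__":
    # Toy data, not taken from any cohort in the figure.
    ages = [30, 40, 40, 50]
    onset = [1, 0, 1, 1]
    w = [2.0, 1.0, 1.0, 1.0]
    for age, s in weighted_kaplan_meier(ages, onset, w):
        print(age, round(s, 3))
    print(hazard_ratio_ci(beta=0.25, se=0.05))
```

In this sketch a hazard ratio above 1 corresponds to a positive coefficient (higher MDD risk for cluster members) and a ratio below 1 to a negative one, which is how the points and error bars in B would be read. Setting all weights to 1 recovers the ordinary, unweighted Kaplan‒Meier estimator.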