Table 1 Selection criteria for clustering of diseases.

From: Characterisation, identification, clustering, and classification of disease

| Clinical inclusion criteria | Description | Eligible hospital episodes |
| --- | --- | --- |
| Prior disease | Diseases were the first primary hospital diagnosis in each ICD-10 chapter | 425,383 male; 502,771 female |
| + Clinical considerations | Clinically distinct, age-related disease, or R-coded diseases of unknown aetiology | 400,006 male; 468,398 female |

| Statistical inclusion criteria | Description | Eligible diseases |
| --- | --- | --- |
| Successful fit | At least 50 cases, and a covariance matrix with no unusually large or small eigenvalues that exceed the mean by 2.5 standard deviations when outliers were included | 343 male; 346 female |
| + Statistically significant | Statistically significant risk factors at the 0.05 level, after a Bonferroni-adjusted (multiple-testing) multivariate \(\chi^2\) test | 150 male; 140 female |
| + Proportional hazards | Test of the proportional hazards assumption: no statistically significant deviation at the 0.05 level, after a multiple-testing FDR adjustment | 138 male; 127 female |
| + Unisex | The set must include the same disease in both men and women | 86 male and female |
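
The two multiple-testing steps in the statistical criteria can be illustrated with a minimal sketch. It rests on assumptions not stated in the table: that each adjustment is applied across the set of diseases tested, and that the FDR adjustment is the Benjamini-Hochberg step-up procedure (the table does not name the method). The function names and the p-values are hypothetical and purely illustrative; they are not taken from the study.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Return True for each test with p < alpha / m (Bonferroni correction)."""
    m = len(p_values)
    return [p < alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, alpha=0.05):
    """Return True for each test rejected by the Benjamini-Hochberg procedure.

    Assumption: the table only says "FDR adjustment"; BH is one common choice.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    rejected = [False] * m
    # Reject the k_max smallest p-values.
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Illustrative p-values only, one pair per hypothetical disease.
risk_factor_p = [0.0001, 0.004, 0.03, 0.0002]  # multivariate chi-squared test
ph_test_p = [0.50, 0.20, 0.01, 0.002]          # proportional hazards test

significant = bonferroni_reject(risk_factor_p)
ph_violated = benjamini_hochberg_reject(ph_test_p)

# A disease is retained if its risk factors are significant AND the
# proportional hazards test shows no significant deviation.
kept = [i for i in range(len(risk_factor_p))
        if significant[i] and not ph_violated[i]]
# kept == [0, 1]
```

Note that the two filters work in opposite directions: the "+ Statistically significant" step keeps diseases whose test is rejected, whereas the "+ Proportional hazards" step keeps diseases whose test is not rejected.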