Fig. 5: Factor 9 associations across top-level inpatient diagnostic phecodes.
From: Principled distillation of UK Biobank phenotype data reveals underlying structure in human variation

Box-and-whisker plots are shown for associations within UKB with 403 derived medical phecodes grouped by category. These associations are defined as the test statistics (that is, z-scores) for the factor score in a logistic regression model including our standard covariates (that is, first 20 genetic PCs, age, chromosomal sex, age², age × chromosomal sex, age² × chromosomal sex and dummy variables representing the assessment centres of origin). Boxes represent the middle quartiles of Factor 9’s test statistics across phecodes within a category, with whiskers extending to the maximum and minimum observed values, excluding outliers more than 1.5× the interquartile range away from the middle quartiles, which are plotted individually. Median values per category are indicated by individual black lines inside the boxes. The dotted grey lines represent the critical test statistics for significance at two-sided P < 0.05 after correcting for multiple comparisons across all 403 phecodes.
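
As a rough illustration of where the dotted grey lines might fall, the sketch below computes a critical z-score for two-sided P < 0.05 across 403 tests. It assumes a Bonferroni correction and a standard normal null distribution; the caption does not name the correction method, so both are assumptions, and the function name `bonferroni_z_threshold` is hypothetical.

```python
from statistics import NormalDist


def bonferroni_z_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Critical |z| for a two-sided test at level alpha, Bonferroni-corrected.

    Assumption: Bonferroni correction over n_tests comparisons and a
    standard normal null; the caption does not specify the method used.
    """
    # Split the family-wise alpha evenly across all tests.
    per_test_alpha = alpha / n_tests
    # Two-sided test: half of the per-test alpha sits in each tail.
    return NormalDist().inv_cdf(1.0 - per_test_alpha / 2.0)


# 403 phecodes, as stated in the caption.
z_crit = bonferroni_z_threshold(403)
print(f"Significant if |z| > {z_crit:.2f}")
```

Under these assumptions the two dotted lines would sit symmetrically at plus and minus this value; a different correction procedure would give a different threshold.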