Fig. 5: The distribution of repeat lengths in different populations.
From: Increased frequency of repeat expansion mutations across different populations

a, Half-violin plots showing the distribution of alleles in different populations (African 12,786, American 5,674, East Asian 1,266, European 59,568, South Asian 2,882) for 10 loci (Methods) from the combined 100K GP and TOPMed cohorts. The box plots highlight the interquartile range and median, and the black dots show values outside 1.5 times the interquartile range. The red dots mark the 99.9th percentile for each population and locus. The vertical bars indicate the intermediate and pathogenic allele thresholds (Supplementary Table 20). Predicted ancestries are abbreviated as follows: AFR, African; AMR, American; EAS, East Asian; EUR, European; SAS, South Asian. b, A scatter plot showing the frequency of intermediate allele carriers against the frequency of pathogenic allele carriers. The data points are divided by population (n = 5) and gene (n = 10), and the size represents the total number of intermediate alleles. Correlations were computed using the Spearman method and two-tailed P values.
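The summary statistics named in the caption (interquartile range, points beyond 1.5 times the interquartile range, the 99.9th percentile, and a Spearman correlation) can be illustrated with a minimal sketch. It is not the authors' analysis code. The function names, the linear-interpolation percentile convention, and the toy values are assumptions made for illustration and are not data from the study. The two-tailed P value is not computed here; a statistics library such as SciPy (`scipy.stats.spearmanr`) reports it alongside the coefficient.

```python
from statistics import mean


def percentile(values, q):
    """q-th percentile (0-100), linear interpolation between order statistics.
    Assumption: the caption does not state which percentile convention was used."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def box_summary(repeat_lengths):
    """Median, quartiles, 1.5 x IQR outliers and 99.9th percentile for one
    population at one locus (hypothetical helper)."""
    q1 = percentile(repeat_lengths, 25)
    q3 = percentile(repeat_lengths, 75)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "q1": q1,
        "median": percentile(repeat_lengths, 50),
        "q3": q3,
        # Values outside the fences correspond to the black dots in panel a.
        "outliers": [x for x in repeat_lengths if x < lower or x > upper],
        # Corresponds to the red dot in panel a.
        "p99_9": percentile(repeat_lengths, 99.9),
    }


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Toy repeat lengths for a single population and locus (made-up values).
toy_lengths = [10, 11, 12, 12, 13, 13, 14, 15, 40]
summary = box_summary(toy_lengths)

# Toy carrier frequencies (made-up), one value per population-gene pair as in panel b.
toy_intermediate = [0.010, 0.020, 0.030, 0.040, 0.050]
toy_pathogenic = [0.001, 0.002, 0.004, 0.003, 0.009]
rho = spearman_rho(toy_intermediate, toy_pathogenic)
```

In panel b each point is one population and gene pair, so a full calculation would pass 50 paired frequencies (5 populations by 10 genes) to `spearman_rho`, or fewer if the correlation is computed within subsets.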