Fig. 6: PrCa risk assessment.
From: Non-coding genetic variants underlying higher prostate cancer risk in men of African ancestry

A For different SNP sets, the percentage of EA cohort test cases whose PRS falls in the top or bottom 20th percentile of the PRS distribution among controls; error bars show the standard deviation across the 10-fold cross-validation (n = 10 for each category of each SNP set). B For different SNP sets, the percentage of AA cohort cases whose PRS falls in the top or bottom 20th percentile among controls, based on the model trained in the EA cohort; error bars show the standard deviation across the 10 bootstrap samples (each 90% of the AA cohort), n = 10 for each category. C Age distribution of the AA PrCa patients in the top 10% (n = 93), middle 50% (n = 746), and bottom 10% (n = 93) of PRSs based on the eSNP+PGS-545 SNPs. P-values are based on a one-sided Wilcoxon test. D Fraction of heterozygous SNPs that exhibit allelic imbalance. P-values are based on a one-sided Fisher’s exact test. E Enrichment of prostate-related diseases/traits for different sets of SNPs, quantified as the observed ratio of the fraction of SNPs overlapping prostate traits to the fraction of SNPs overlapping other traits, relative to the expected ratio. Central dots are the median values. Error bars show the standard deviation across 100 bootstrap samples, each comprising 90% of the SNPs (n = 100 for each set of SNPs). Bonferroni-corrected P-values are based on a one-sided Wilcoxon test. In boxplots (A, B, C), the horizontal line in the middle is the median, and the lower and upper edges of the boxes correspond to the 25th and 75th percentiles. The whiskers extend above and below the boxes up to 1.5 times the interquartile range (i.e., the distance between the 25th and 75th percentiles). Source data for these figures are provided as a Source data file.
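
The quantity plotted in panels A and B can be illustrated with a minimal sketch. It assumes that the percentile cutoffs are taken from the control PRS distribution and that a case counts as "top" or "bottom" when its score lies at or beyond the corresponding cutoff. The function name `tail_case_percentages`, the inclusive percentile method, and the scores below are hypothetical and are not taken from the paper.

```python
from statistics import quantiles


def tail_case_percentages(case_prs, control_prs, tail=20):
    """Return (% of cases in the top tail, % of cases in the bottom tail).

    Tails are defined by percentile cutoffs of the control scores
    (an assumption made for this illustration).
    """
    # quantiles(..., n=100) returns 99 cut points; index k-1 is the k-th percentile.
    cuts = quantiles(control_prs, n=100, method="inclusive")
    low_cut = cuts[tail - 1]          # e.g. 20th percentile of controls
    high_cut = cuts[100 - tail - 1]   # e.g. 80th percentile of controls
    n_cases = len(case_prs)
    top = 100.0 * sum(score >= high_cut for score in case_prs) / n_cases
    bottom = 100.0 * sum(score <= low_cut for score in case_prs) / n_cases
    return top, bottom


# Synthetic scores for illustration only (not data from the study).
controls = list(range(101))       # control PRSs 0, 1, ..., 100
cases = [10, 50, 85, 90, 95]      # case PRSs
print(tail_case_percentages(cases, controls))  # (60.0, 20.0)
```

If PRS carried no information, about 20% of cases would be expected in each tail; an excess in the top tail and a deficit in the bottom tail indicate that the score separates cases from controls.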