Fig. 3: Different skin cancer risk by ancestries. | Nature Communications

Fig. 3: Different skin cancer risk by ancestries.

From: A highly accurate risk factor-based XGBoost multiethnic model for identifying patients with skin cancer

A-D Survival plots for any type of skin cancer, showing: A all OTH individuals (healthy and skin cancer patients) grouped according to their self-reported race/ethnicity (Hispanic or Latino, White and Admixed); B only OTH skin cancer patients grouped according to their self-reported race/ethnicity (Hispanic or Latino, White and Admixed); C all individuals (healthy and skin cancer patients) who self-report as White, grouped according to their genetic ancestry (OTH and EUR); D only skin cancer patients who self-report as White, grouped according to their genetic ancestry (OTH and EUR). Only populations with at least 20 skin cancer patients are shown. In (A, C), the X axis indicates age at diagnosis for skin cancer patients and age at last follow-up for all other individuals. E Log2 odds ratio of the association of each genotype PC with a skin cancer diagnosis in all individuals of non-European ancestry. Segments represent 95% confidence intervals, calculated using the glm function in R. Filled points represent significant associations (p-value adjusted using Benjamini-Hochberg’s method <0.05). Odds ratios, confidence intervals and p-values were calculated using a single logistic regression with outcome = cancer and the top 16 genotype PCs as covariates. This model was fitted after removing all individuals of European ancestry. F, G Boxplots showing the associations between each ancestry and the value of genotype PC1. Healthy individuals (i.e. AoU participants who were not diagnosed with skin cancer; F) and skin cancer patients (G) are shown separately. The central line within each box represents the median, the box edges indicate the 25th and 75th percentiles (interquartile range, IQR), and the whiskers extend to the most extreme data points within 1.5 × IQR from the quartiles. H Barplots showing the coefficient of each variable used to develop the LASSO logistic regression on OTH individuals.
While Avanafil has a strong negative coefficient, it was prescribed to only 46 patients, so its estimate is underpowered. Likewise, the groups of individuals who self-identify as Middle Eastern or have MID genetic ancestry included <1000 individuals and are likely underpowered. Abbreviations (EUR, EAS, MID, AFR, SAS, AMR) represent genetic ancestry proportions, while the full names represent self-reported races and ethnicities.
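
The quantities plotted in panel E can be illustrated with a minimal Python sketch. It is not the authors' code (the caption states that R's glm function was used). It assumes the per-PC coefficients, standard errors and p-values have already been obtained from a fitted logistic regression, and it assumes a Wald-type interval, which may differ from the interval the authors computed. The function names are hypothetical.

```python
import math


def log2_odds_ratio_ci(beta, se, z=1.959964):
    """Convert a logistic-regression coefficient (natural-log odds ratio)
    and its standard error into a log2 odds ratio with a Wald 95% interval.

    Assumption: a Wald interval (beta +/- z * se); the source only says the
    intervals came from R's glm function.
    """
    scale = 1.0 / math.log(2.0)  # change of base: ln -> log2
    return (beta * scale, (beta - z * se) * scale, (beta + z * se) * scale)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical inputs for illustration only, not values from the paper.
    raw_p = [0.01, 0.04, 0.03, 0.005]
    adj_p = benjamini_hochberg(raw_p)
    significant = [p < 0.05 for p in adj_p]  # "filled points" in panel E
    print(adj_p, significant)
    print(log2_odds_ratio_ci(beta=math.log(2.0), se=0.1))
```

Under these assumptions, a PC would be drawn as a filled point when its adjusted p-value falls below 0.05, and the segment would span the lower and upper log2 bounds returned by `log2_odds_ratio_ci`.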
