Fig. 4: Disease-associated rG4-altering variants change protein expression level. | Nature Communications

From: G4mer: An RNA language model for transcriptome-wide identification of G-quadruplexes and disease variants from population-scale genetic data

a Distributions of ΔG4mer scores for ClinVar variants annotated as benign or pathogenic within predicted rG4s in the 5′ and 3′ UTRs. Asterisks denote statistical significance (one-sided Mann-Whitney U test; \(p = 5.21 \times 10^{-9}\) for the 5′ UTR, \(p = 1.03 \times 10^{-14}\) for the 3′ UTR). Boxplots show the median (line), interquartile range (box), and 1.5 × IQR (whiskers). Sample sizes: 5′ UTR n = 2,628 (benign), n = 403 (pathogenic); 3′ UTR n = 5,328 (benign), n = 1,705 (pathogenic).

b PheWAS plot for an rG4-disrupting variant in the 5′ UTR of EPN3. Each point is a Phecode (ICD-10), colored by disease category and plotted by \(-\log_{10}(P)\) from logistic regression adjusted for age, \(\text{age}^2\), sex, and the top 10 ancestry PCs. The solid red line indicates the Bonferroni threshold (\(p = 8.5 \times 10^{-6}\)), and the dashed red line indicates the FDR < 0.1 threshold (\(p = 5.5 \times 10^{-2}\)). Arrow direction reflects the odds ratio: increased risk (up) or decreased risk (down). N = 1201 breast cancer cases.

c Schematic of wild-type (WT) and mutant (Mut) dual-luciferase constructs used to assess the impact of rG4s in the 5′ UTR. WT includes the native sequence; Mut carries single-nucleotide substitutions: d rG4-breaking (G > U) in EPN3 and e rG4-forming (A > G) in MSH6. Renilla and Firefly luciferase activities were measured to quantify the effect on rG4-mediated translational regulation. Bar heights show the mean fold change in Firefly/Renilla activity for Mut vs. WT (normalized to WT = 1), n = 3 per condition except in CHO (n = 5). For each independent biological replicate, luciferase activity was averaged across ≥3 technical repeats. Statistical significance was assessed using a one-sided Welch's t-test. p-values for EPN3: NIH3T3, \(7.00 \times 10^{-3}\); CHO, \(2.76 \times 10^{-3}\); HuH-7, \(9.16 \times 10^{-4}\). p-values for MSH6: NIH3T3, \(1.47 \times 10^{-4}\); CHO, \(2.79 \times 10^{-5}\); HuH-7, \(8.92 \times 10^{-4}\). Error bars represent the standard error of the mean (SEM).
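
The bar heights and error bars in panels d and e can be illustrated with a minimal Python sketch. It assumes that each Mut Firefly/Renilla ratio is divided by the mean WT ratio (so that WT = 1) and that the SEM is the sample standard deviation over \(\sqrt{n}\) biological replicates; the caption does not spell out the exact normalization, so this is one plausible reading rather than the authors' pipeline. The function name `fold_change_summary` and all input values are hypothetical and not taken from the paper.

```python
from math import sqrt
from statistics import mean, stdev


def fold_change_summary(wt_ratios, mut_ratios):
    """Return (mean fold change, SEM) for Mut relative to WT.

    Each list holds one Firefly/Renilla ratio per biological replicate,
    already averaged over technical repeats. Normalizing against the mean
    WT ratio is an assumption made for this sketch.
    """
    wt_mean = mean(wt_ratios)
    # Express every Mut replicate as a fold change relative to WT = 1.
    fold_changes = [ratio / wt_mean for ratio in mut_ratios]
    # SEM = sample standard deviation / sqrt(number of replicates).
    sem = stdev(fold_changes) / sqrt(len(fold_changes))
    return mean(fold_changes), sem


# Hypothetical Firefly/Renilla ratios for n = 3 biological replicates.
wt = [2.0, 2.2, 1.8]
mut = [3.0, 3.4, 3.2]
mean_fc, sem_fc = fold_change_summary(wt, mut)
print(f"mean fold change = {mean_fc:.2f}, SEM = {sem_fc:.3f}")
```

With real data, the per-replicate fold changes would also be the natural input for the one-sided Welch's t-test named in the caption.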