Fig. 5: Selection bias in oncogenic fusions. | Nature Communications

Fig. 5: Selection bias in oncogenic fusions.

From: Etiology of oncogenic fusions in 5,190 childhood cancers and its clinical and therapeutic implication


a Model of selection. DNA breakpoints from the same intron have equivalent selection pressure because they generate the same fusion proteins. DNA breakpoints from different introns may have different selection pressure when the variable exon (red star) encodes critical protein domains, in which case the corresponding intron may have disproportionately more patients than other introns. We propose a relative selection bias (RSB) score to measure such imbalance by accounting for patient counts (N) and intronic lengths (L) for intronic versions (red vs blue). b Spectrum of intronic versioning (colored bands within bar) across leukemia, brain, and solid tumors. Oncogenic fusions may have a single version (TCF3-PBX1) or multiple versions (number of versions labeled on top of each bar). Colors indicate different versions (exact fusion versions are indicated for CBFB-MYH11). Oncogenic fusions with alternative splicing are indicated by asterisks (*) and excluded from selection bias analysis. c Theoretically possible (gray lines) and observed (red lines) intronic versions in CBFB-MYH11. We define the translation frames (0, 1, 2) for each exon by using the frame position of its first nucleotide. A functional oncogenic fusion can only be generated by connecting translationally compatible exons (gray lines). Because of the additional requirement for protein domains (e.g., Myosin Tail domain in MYH11), only a subset of translationally compatible fusions can result in tumorigenesis (red lines), and patient prevalence can differ dramatically among them (thickness of red lines). d Analysis of selection pressure in CBFB-MYH11. Version E5-33 has disproportionately more patients (n = 183) than version E5-28 (n = 17), although the corresponding MYH11 intron 32 is only 370 bp long whereas intron 27 is 5509 bp long, indicating a strong selection bias (RSB = 160.3) between E5-33 and E5-28 (with a χ² test Q value < 7.7 × 10⁻¹⁵).
e Intronic versioning (E5-33) better predicts event-free survival (measured as hazard ratio) than known clinical features (KIT mutation status, white blood cell counts, age, and end of induction (EOI) MRD) for CBFB-MYH11-positive AML patients. Error bars represent hazard ratio ± 95% confidence interval. f Analysis of selection bias in ETV6-RUNX1, KIAA1549-BRAF, and EWSR1-FLI1 fusions. In panels d and f, x-axes are the C’ genes, and y-axes are the N’ genes. Exon/intron lengths are indicated with scale bars in corresponding figures. Sizes of red dots are proportional to the number of patients for corresponding versions, and χ² test Q values (with Bonferroni correction for multiple testing) are indicated for each panel. Source data are provided accordingly as sheets b and c, and d, f and I in the Source Data file.
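
The caption describes the RSB score only as accounting for patient counts (N) and intronic lengths (L); it does not give the formula. As a minimal sketch, assuming RSB is the ratio of patients per intronic base pair between two versions, the panel d numbers for CBFB-MYH11 can be checked as follows. The function name and signature are hypothetical and chosen for illustration only; the assumed formula is consistent with the RSB of 160.3 quoted in panel d.

```python
def relative_selection_bias(n_a: int, len_a: int, n_b: int, len_b: int) -> float:
    """Ratio of patient density (patients per intronic bp) of version A to version B.

    Assumed form: RSB = (N_a / L_a) / (N_b / L_b). This is an illustrative
    reading of the caption, not a formula stated in it.
    """
    if min(n_a, n_b) <= 0 or min(len_a, len_b) <= 0:
        raise ValueError("patient counts and intron lengths must be positive")
    density_a = n_a / len_a  # patients per bp for version A
    density_b = n_b / len_b  # patients per bp for version B
    return density_a / density_b


# Panel d values: E5-33 (n = 183, MYH11 intron 32, 370 bp)
# versus E5-28 (n = 17, MYH11 intron 27, 5509 bp).
rsb = relative_selection_bias(183, 370, 17, 5509)
print(f"RSB(E5-33 vs E5-28) = {rsb:.1f}")
```

Normalizing by intron length reflects the idea in panel a: if breakpoints fell uniformly along introns with no selection, a longer intron would be expected to collect proportionally more patients, so a short intron with many patients signals selection.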
