Fig. 1: Overview of fusion oncoproteins (FOs). | Nature Communications

From: FusOn-pLM: a fusion oncoprotein-specific language model via adjusted rate masking

A FOs are formed by chromosomal rearrangements between two independent genes, the 5’ head gene and 3’ tail gene. Created in BioRender. Chatterjee, P. (2025) https://BioRender.com/p82v208.

B ESM-2 training data included the wild-type head and tail proteins involved in FOs, but not FOs themselves. FOs were compared to SwissProt, a representative subset of ESM-2’s training data, via BLAST. The best alignments for each FO are shown (% identity = total identities / length of FO sequence).

C AlphaFold2 structures of four well-studied fusion oncoproteins: PAX3::FOXO1, EWSR1::FLI1, EML4::ALK, and SS18::SSX1. Structures are colored by composition (red = head, blue = tail) and pLDDT, AlphaFold2’s primary confidence metric. Each FO has multiple known breakpoints, producing different amino acid sequences. Breakpoint regions (rectangle), per-residue pLDDTs (bar coloring), and average pLDDTs (colored circle) are shown for each sequence.

D The percentage of disordered residues per sequence for FOs and their respective heads and tails. Average disorder content is 45.9% for FOs, 33.7% for head proteins, and 32.7% for tail proteins. Only FOs with AlphaFold2 structures available on FusionPDB are included.

Source data for this figure are provided in the Source Data file.
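The % identity definition in panel B normalizes by the full length of the FO sequence rather than by the alignment length, so a short but perfect local alignment still scores low. A minimal sketch of that arithmetic follows; the function name `percent_identity` and the example numbers are hypothetical, chosen for illustration, and are not taken from the paper or its source data.

```python
def percent_identity(total_identities: int, fo_length: int) -> float:
    """Percent identity as defined in the caption:
    total identities divided by the length of the FO sequence, times 100.

    Both arguments are assumed to come from a BLAST result that was
    parsed elsewhere; this function only does the normalization.
    """
    if fo_length <= 0:
        raise ValueError("fo_length must be a positive number of residues")
    if total_identities < 0:
        raise ValueError("total_identities cannot be negative")
    return 100.0 * total_identities / fo_length


# Hypothetical example: 450 identical residues against a 600-residue FO.
example = percent_identity(450, 600)
print(f"{example:.1f}%")
```

Dividing by the FO length instead of the alignment length means the value reflects how much of the whole fusion sequence is covered by identical residues in the best SwissProt hit.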
