Introduction

Splicing of precursor messenger RNAs (pre-mRNAs) is a critical step in the processing of gene transcripts encoding most eukaryotic proteins, as part of an ordered set of reactions catalyzed by a large RNA–protein complex known as the spliceosome1. One of the components of the spliceosome, the U2 auxiliary factor U2AF1 (35 kDa), binds to pre-mRNA at the 3′ splice site during the early stages of splicing. It recognizes and binds the consensus sequence at the 3′ splice site, which is essential for the accurate recognition and selection of the splice site2. The expression of U2AF1 has been found to be dysregulated in various types of cancer. Among the most well-studied cancer types in relation to U2AF1 are myelodysplastic syndromes (MDS) and acute myeloid leukemia (AML). It has been predicted that the abnormal expression of a splicing factor in tumor cells can lead to the production of mRNA isoforms that are either absent or less abundant in normal cells, which may exert a direct or indirect impact on cancer development, progression, and response to therapy3. U2AF1 has also been found to play a role in alternative splicing, a process that allows the generation of multiple mRNA isoforms from a single gene. U2AF1 can influence the selection of alternative splice sites by binding to specific RNA sequences and promoting the inclusion or exclusion of certain exons4.

U2AF1 mutations have been linked to the occurrence and progression of various types of cancer5. A recurrent hotspot mutation in U2AF1 (S34F), located in its first zinc-finger domain, which is critical for RNA binding activity, alters RNA binding specificity and splicing kinetics. This leads to a wide variety of splicing outcomes, resulting in multiple gene fusion products that are considered ideal biomarkers for cancer prognosis6,7. However, the exact relationship between U2AF1 and cancer cell proliferation remains unclear. Knockdown of U2AF1 enhanced the viability and proliferation of prostate cancer cell lines8. In addition to increasing programmed cell death (apoptosis), the knockdown of U2AF1 also hinders the enucleation process and leads to the formation of late-stage erythroblasts with abnormal nuclei. This abnormality may be attributed to the impact of U2AF1 depletion on genes associated with cytokinesis and mitosis. The knockdown of U2AF1 also hindered the proliferation of primary erythroid progenitor cells, highlighting the important role of U2AF1 in the development of hematopoietic progenitor cells9. In A549 lung cancer cells, U2AF1 knockdown and U2AF1 mutation lead to distinct alternative splicing patterns10.

The discovery and characterization of chimeric RNAs have provided valuable insights into the complexity of gene regulation and the development of diseases including cancer11, although chimeric RNAs have now also been detected in normal tissues and cells12. Chimeric RNAs are fusion transcripts composed of exons, or fragments of exons, from different genes at the RNA level13. They are generally thought to be produced as a result of chromosomal rearrangements. However, cis-splicing and trans-splicing can also contribute to the formation of chimeric RNAs. Cis-splicing between adjacent genes (cis-SAGe) is an RNA processing event that occurs within a single pre-mRNA, where the transcription machinery reads through the intergenic region between two neighboring genes. During the formation of cis-SAGe chimeric RNAs, transcription skips the termination signal of the 5′ gene, and the intergenic region is spliced out as an intron to join the exon of the 3′ gene14. It is essentially alternative splicing between exons of neighboring genes15. Trans-splicing, on the other hand, is a non-canonical splicing process that can also generate chimeric RNAs. Unlike cis-splicing, trans-splicing joins exons from two different primary RNA transcripts16.

Furthermore, SRRM1, a splicing regulator, has been found to have a role in regulating chimeric RNAs. Silencing SRRM1 reduced the levels of cis-SAGe fusion RNAs, including CTNNBIP1-CLSTN1, DUS4L-BCAP29, CLN6-CALMl, UBA2-WTIP, and SLC29A1-HSP90AB1 17. Similarly, SF3B1, which is part of the U2 component of the spliceosome, plays an important role in splice site recognition18. We also found that the knockdown of SF3B1 resulted in the overexpression of the C21orf59-TCP10L and PSMA4-CHRN5 chimeric RNAs, which are intra-ss-0gap chimeric RNAs. Intra-ss-0gap chimeric RNAs are those whose parental genes are neighboring genes transcribed in the same direction; they are also referred to as cis-SAGe chimeric RNAs. Earlier studies have reported that knockdown of SF3B1 results in significant abnormal exon skipping19 and retention of introns20. These effects could cause ‘read-through’ events and, in turn, could contribute to the elevated levels of cis-SAGe fusion RNAs. These results suggest that splicing regulators and splicing factors can play an effective role in the regulation of chimeric RNAs.

In summary, U2AF1 plays a crucial role in exon inclusion and exclusion by binding to the pre-mRNA at the 3′ splice site and recruiting other splicing factors. Its interaction with the pre-mRNA ensures the accurate recognition of splice sites and facilitates the splicing process21. However, the specific role of U2AF1 in the formation of chimeric RNAs has not been identified. Understanding this role can provide insights into the molecular mechanisms underlying various biological processes, including cancer development and progression. This study therefore aimed to identify the potential role of U2AF1 in chimeric RNA formation and to investigate which categories of chimeric RNAs are regulated by U2AF1 (Supplementary Fig. S1).

Materials and methods

Cell lines

HEK293T cells were maintained in DMEM/High Glucose medium containing 4500 mg/L glucose and 4.0 mM L-glutamine (HyClone™, USA), supplemented with 10% fetal bovine serum (HyClone™, USA) and 1% penicillin and streptomycin (HyClone™, USA). HEL, K562, KYSE150, and KYSE450 cells were maintained in RPMI medium supplemented with 10% fetal bovine serum (HyClone™, USA) and 1% penicillin and streptomycin. Typical incubation conditions were 37 °C and 5% CO2, with medium changes every second day.

Lentivirus packaging and transfection

For lentivirus packaging, HEK293T cells were grown in Dulbecco’s modified Eagle medium with 10% fetal bovine serum and 1% penicillin and streptomycin and incubated at 37 °C with 5% CO2. When the cells reached 70–90% confluency, they were transfected with the packaging plasmid (psPAX2), the envelope plasmid (pMD2.G), and the shU2AF1 plasmids to produce lentivirus. Three shRNAs specifically targeting U2AF1 (shU2AF1) and a control shRNA (pLKO.1) were used to knock down U2AF1. The lentivirus produced was subsequently added to HEL, K562, KYSE150, and KYSE450 cells. Forty-eight hours after lentiviral transduction, puromycin was added at a final concentration of 2 μg/ml for HEL cells, 3 μg/ml for K562 cells, and 4 μg/ml for KYSE150 and KYSE450 cells to select stable cells.

RNA isolation and qRT-PCR analysis

RNA from the stable cells selected with puromycin was extracted with the TRIzol RNA isolation kit (Life Technologies, USA) and reverse-transcribed with the Vazyme HiScript® III RT SuperMix Kit (Nanjing, China) according to the manufacturer’s instructions. The primers for GAPDH and U2AF1 were synthesized by Sangon Biotech (Shanghai, China). qRT-PCR was carried out on an ABI StepOne Plus Real-Time PCR system (Life Technologies, USA) with 2× Universal SYBR Green Fast qPCR Mix from ABclonal Technology® (Wuhan, China). Gene expression was then analyzed using the ΔΔCt method, with GAPDH as an internal control.
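The ΔΔCt analysis mentioned above can be illustrated with a minimal Python sketch. It assumes the common 2^(−ΔΔCt) form of the method, which the text does not spell out, and roughly 100% amplification efficiency; the function name and the Ct values are hypothetical and are not data from this study.

```python
def relative_expression(ct_target_kd, ct_ref_kd, ct_target_ctrl, ct_ref_ctrl):
    """Fold change of a target gene (knockdown vs. control), 2^(-ddCt) form.

    Assumption: near-100% amplification efficiency, so one cycle
    corresponds to a two-fold difference in template.
    """
    delta_kd = ct_target_kd - ct_ref_kd        # dCt in the knockdown sample
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl  # dCt in the control sample
    delta_delta = delta_kd - delta_ctrl        # ddCt
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: target (e.g. U2AF1) and reference (e.g. GAPDH).
fold = relative_expression(ct_target_kd=26.0, ct_ref_kd=18.0,
                           ct_target_ctrl=24.0, ct_ref_ctrl=18.0)
print(fold)  # 0.25, i.e. the target is at 25% of the control level
```

A value below 1 indicates reduced expression relative to the control, which is the pattern expected for a successful knockdown.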

RNA-seq library construction

HEL and K562 cells were transduced with shU2AF1 or a control shRNA, with three biological replicates, to knock down U2AF1. After transduction, stable cells were selected with puromycin. The stable cells were collected by centrifugation and lysed in TRIzol (Life Technologies, USA). RNA was then extracted, and qRT-PCR was performed to confirm U2AF1 knockdown. The samples were then sent to Wuhan Kangce Technology Co., Ltd for paired-end RNA sequencing.

Transcriptome analysis

For transcriptomic analysis, the raw sequencing data were first filtered using Trimmomatic22. Low-quality reads were discarded, and reads contaminated with adaptor sequences were trimmed. Clean data were then mapped to the reference genome using STAR23. Reads mapped to the exon regions of each gene were counted using featureCounts, and RPKM values were subsequently calculated24. Differential gene expression between groups was identified using the edgeR package25. Gene expression was counted using reads aligned to the reference genome and reference genes, and the expressed genes were then annotated. A gene was considered differentially expressed when |logFC| > 1 and P-value < 0.05. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses of the differentially expressed genes were performed with KOBAS to annotate their biological functions26. Enriched GO terms and enriched KEGG pathways were each screened using P-value < 0.05 as the standard. The rMATS package was used to detect differential alternative splicing events27. rMATS compares alternative splicing between two samples or two groups of samples (each group with replicates) and reports the common and significantly different alternative splicing events together with the corresponding statistics. It classifies alternative splicing events (ASEs) into five categories: skipped exon (SE), mutually exclusive exon (MXE), alternative 5′ splice site (A5SS), alternative 3′ splice site (A3SS), and retained intron (RI). The thresholds for alternative splicing screening were P-value < 0.05 and |Δψ| > 0.1.
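The splicing screen stated above (P-value < 0.05 and |Δψ| > 0.1, across the five event classes) can be sketched as follows. This is an illustrative sketch, not the pipeline used in the study: the record keys (`event_type`, `p_value`, `delta_psi`) are hypothetical names rather than rMATS output columns, and the records are toy values.

```python
from collections import Counter

# The five event classes named in the text.
EVENT_TYPES = ("SE", "MXE", "A5SS", "A3SS", "RI")


def significant_events(events, p_cutoff=0.05, dpsi_cutoff=0.1):
    """Keep events with P < p_cutoff and |delta psi| > dpsi_cutoff."""
    return [
        e for e in events
        if e["p_value"] < p_cutoff and abs(e["delta_psi"]) > dpsi_cutoff
    ]


def count_by_type(events):
    """Count events per class, reporting zero for classes with no events."""
    counts = Counter(e["event_type"] for e in events)
    return {t: counts[t] for t in EVENT_TYPES}


# Hypothetical toy records, not data from the study.
toy = [
    {"gene": "geneA", "event_type": "SE", "p_value": 0.001, "delta_psi": -0.25},
    {"gene": "geneB", "event_type": "SE", "p_value": 0.20, "delta_psi": 0.40},
    {"gene": "geneC", "event_type": "RI", "p_value": 0.01, "delta_psi": 0.05},
    {"gene": "geneD", "event_type": "A3SS", "p_value": 0.03, "delta_psi": 0.15},
]
kept = significant_events(toy)
print(count_by_type(kept))
```

Taking the absolute value of Δψ means that both increased and decreased inclusion pass the screen, which matches the two-sided |Δψ| threshold given in the text.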
For the input in this GO analysis, we used genes with alternative splicing events detected in both the target and junction regions after U2AF1 knockdown. These genes were selected based on significant differential splicing, and then GO analysis was performed to identify the biological processes, cellular components, and molecular functions most enriched in these datasets.

Discovery of chimeric RNAs

Chimeric RNAs were identified from the paired-end RNA-seq data using the SOAPfuse algorithm. SOAPfuse combines alignment of the paired-end reads against the human genome reference sequence and annotated genes with the detection of candidate fusion events. It uses two types of reads to support a fusion event: spanning reads, which connect the candidate fusion gene pair, and junction reads, which confirm the exact junction site. For this study, we followed SOAPfuse’s default parameters, which require a minimum of three supporting read pairs, including two spanning read pairs and one junction read pair, to preliminarily indicate a fusion event. Additionally, we filtered fusion events by requiring at least two unique supporting reads from both sides of the fusion junction to ensure robust detection, as recommended by the SOAPfuse developers. This threshold has been shown to reliably identify fusion events with low false-positive rates in previous studies28.
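The read-support criteria described above can be expressed as a simple filter. The sketch below is one possible reading of those criteria and does not reproduce SOAPfuse itself: the function and parameter names are hypothetical, and treating "unique supporting reads from both sides" as two separate per-side counts is an assumption.

```python
def passes_support_filter(spanning_pairs, junction_reads,
                          upstream_unique, downstream_unique,
                          min_spanning=2, min_junction=1,
                          min_unique_per_side=2):
    """Return True when a candidate fusion has enough read support.

    spanning_pairs / junction_reads: counts of the two read types.
    upstream_unique / downstream_unique: unique supporting reads on each
    side of the junction (an assumed interpretation of the text).
    """
    enough_support = (spanning_pairs >= min_spanning
                      and junction_reads >= min_junction)
    both_sides = (upstream_unique >= min_unique_per_side
                  and downstream_unique >= min_unique_per_side)
    return enough_support and both_sides


# Hypothetical candidates, not data from the study.
print(passes_support_filter(3, 2, 2, 4))  # True
print(passes_support_filter(1, 5, 3, 3))  # False: too few spanning pairs
print(passes_support_filter(4, 1, 1, 6))  # False: one side under-supported
```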

Statistical analysis

Experimental data were processed and statistically analyzed using IBM SPSS Statistics 21.0 software. The relative expression of each gene was compared using a t-test, and P < 0.05 was considered statistically significant.

Results

Mutations in U2AF1 are associated with cancer progression and reduced overall survival

We accessed data from cBioPortal, an open-access resource, to obtain the profile of U2AF1 in different cancer types. The data were analyzed based on structural variants, mutation data, and copy number alteration (CNA). Notably, mutation and amplification of U2AF1 had the highest frequency in leukemia according to the TCGA PanCancer Atlas (Fig. 1A). The y-axis labeled “Alteration Frequency” in Fig. 1A represents the percentage of samples with genetic alterations in the U2AF1 gene among the total samples analyzed. Alterations typically include mutations, amplifications, deletions, or other changes affecting gene structure or function. This frequency indicates how commonly U2AF1 is altered across the cancer types in the cBioPortal dataset, offering insights into its potential role and prevalence as an oncogenic factor. Earlier studies have revealed that genes encoding components of the spliceosome complex (U2AF1, SF3B1, ZRSR2, SRSF2) are mutated in around one-third of patients with MDS and in nearly 50% of secondary acute myeloid leukemia (sAML) cases that evolve from MDS29,30. The U2AF1 protein consists of four main domains: a U2AF homology domain (UHM), two zinc finger domains (ZnF), and a serine/arginine-rich domain31. U2AF1 mutations are associated with poor prognosis in MDS and AML, inhibiting cell proliferation and inducing cellular apoptosis32. The most common U2AF1 mutation occurs at amino acid position 34 (S34F), a highly conserved serine in the zinc finger domain. This mutation substitutes phenylalanine for serine at position 34 of the U2AF1 protein (Fig. 1B). The S34F mutant promotes increased splicing and exon skipping, disrupting the function of U2AF1 in the RNA splicing machinery33.
MDS patients with U2AF1 mutations have been reported to have shorter overall survival than those without mutations34. Figure 1C, obtained from cBioPortal, compares overall survival between two groups: the altered group (N = 151) and the unaltered group (N = 10,652). The altered group includes patients with U2AF1 structural variants, mutations, and CNA. Consistent with the earlier reports, overall survival in the altered group is lower than in the unaltered group (Fig. 1C). Collectively, these findings suggest that mutations in U2AF1 are associated with poor prognosis. Together, these data support the notion that U2AF1 is an important and frequently mutated gene in different cancer types, contributing to cancer progression, splicing alteration, and reduced overall survival. This led us to further investigate the potential role of U2AF1 in the formation of chimeric RNAs.

Fig. 1

U2AF1 alteration profile in different cancers. (A) Data from cBioPortal showing the alteration frequency of U2AF1 in different cancer types based on structural variant data, mutation data, and copy number alteration (CNA). (B) Structure of U2AF1 showing the RNA recognition motif (RRM) domain and the two zinc-finger domains, where the common S34F/Y mutation occurs. (C) Data from cBioPortal showing the probability of overall survival in the altered and unaltered groups, with the altered group having lower overall survival35.

U2AF1 is highly expressed in diverse cancer cell lines

The expression and function of U2AF1 can vary across cell lines. Previous studies have shown that U2AF1 expression levels can be cell-type specific and can influence alternative splicing patterns. To examine this, we assessed the expression of U2AF1 in cell lines from several cancer types, including esophageal cancer, prostate cancer, and leukemia. We found that U2AF1 was differentially expressed across these cancer cell lines, underscoring its potential significance in tumorigenesis (Fig. 2A). To investigate the functional implications of U2AF1 expression, we designed shRNA constructs to specifically knock down U2AF1 (Supplementary Fig. S1). The shRNAs significantly knocked down U2AF1 in esophageal cancer and leukemia cell lines, as validated by qPCR (P value < 0.05) (Fig. 2B). U2AF1 mutations are frequently found in myelodysplastic syndromes and acute myeloid leukemia, and recent work has linked U2AF1 aberrations to various epithelial cancers, including esophageal cancer29. These findings lay the groundwork for further investigation of the functional significance of U2AF1 knockdown in cancer development and progression, and highlight U2AF1 as a candidate therapeutic target for strategies aimed at mitigating disease progression.

Fig. 2

U2AF1 is highly expressed in leukemia and esophageal cancer. (A) Expression of U2AF1 in esophageal, prostate, and leukemia cell lines. (B) Knockdown of U2AF1 using shRNA in different cell lines (P value < 0.05). Asterisks denote significance thresholds: * P ≤ 0.05, ** P ≤ 0.01, *** P ≤ 0.001.

U2AF1 knockdown unveils downstream effects in leukemia cells

To further examine the downstream effects of U2AF1 knockdown, we conducted RNA sequencing on two leukemia cell lines, HEL and K562, treated with three different shRNAs targeting U2AF1 along with a control (pLKO.1). Following stable selection with puromycin and confirmation of knockdown efficiency, differential gene expression analysis revealed distinct patterns of up- and down-regulated genes upon U2AF1 knockdown (Fig. 3A). Differentially expressed genes were screened by fold difference and significance level, using |logFC| > 1 and P-value < 0.05 as the standard. In HEL cells, 2848 genes were up-regulated and 2998 genes were down-regulated in the U2AF1 knockdown group relative to the control. In K562 cells, U2AF1 knockdown resulted in 1521 up-regulated genes and 917 down-regulated genes (Fig. 3B, Supplementary Tables 1 & 2). Furthermore, a heatmap depicted the differential expression of genes, highlighting the top 20 common up-regulated and down-regulated genes following U2AF1 knockdown in both cell lines. The leftmost panel shows the overall hierarchical clustering of all differentially expressed genes in all comparison groups, clustered by RPKM values; red represents highly expressed genes and blue represents lowly expressed genes. The x-axis represents the different samples, and the y-axis represents the gene names (Fig. 3C). Knockdown of U2AF1 also resulted in overlapping sets of up-regulated and down-regulated genes in K562 and HEL cells (Supplementary Fig. S2). The top five common down-regulated genes after U2AF1 knockdown in HEL and K562 cells were PCK2, IDH2, PIM1, STAR, and EPHX2, which were validated by qRT-PCR (Fig. 3D, Supplementary Fig. S3).
Similarly, the top five common up-regulated genes after U2AF1 knockdown in HEL and K562 cells were AOC2, AOC3, SBSN, PYGM, and LSMEM1, which were also confirmed by qRT-PCR (Fig. 3E, Supplementary Fig. S4).
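The screening rule used for these counts (|logFC| > 1 and P-value < 0.05), along with the RPKM normalization used for clustering, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names and the toy numbers are hypothetical, the RPKM formula is the standard reads-per-kilobase-per-million definition, and the statistical testing itself (done with edgeR in the study) is not reproduced.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def classify_gene(log_fc, p_value, fc_cutoff=1.0, p_cutoff=0.05):
    """Label a gene 'up', 'down', or 'not significant' by the stated rule."""
    if p_value < p_cutoff and log_fc > fc_cutoff:
        return "up"
    if p_value < p_cutoff and log_fc < -fc_cutoff:
        return "down"
    return "not significant"


# Hypothetical values, not data from the study.
print(rpkm(read_count=500, gene_length_bp=2000,
           total_mapped_reads=10_000_000))  # 25.0

toy_results = {"geneA": (2.3, 0.001), "geneB": (-1.8, 0.02),
               "geneC": (0.4, 0.001), "geneD": (3.0, 0.30)}
labels = {g: classify_gene(fc, p) for g, (fc, p) in toy_results.items()}
print(labels)
```

Note that a gene must pass both cutoffs: a large fold change with a non-significant P-value (geneD above) is not counted as differentially expressed.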

Fig. 3

Transcriptomic changes associated with knockdown of U2AF1. (A) Volcano plots showing the differentially expressed genes (DEGs) after knockdown of U2AF1 in HEL cells (left panel) and K562 cells (right panel). (B) Scatter plot showing the numbers of up- and down-regulated genes resulting from U2AF1 knockdown in HEL and K562 cells. (C) Heat map clustering analysis of the differentially expressed genes in the knockdown and control groups (right panel) and the top 20 up- and down-regulated genes for U2AF1 knockdown in HEL and K562 cells (RPKM values). (D, E) Validation by qPCR of selected down-regulated and up-regulated genes after U2AF1 knockdown (P value < 0.05). (F) Gene set enrichment analysis (GSEA) associated with U2AF1 knockdown in HEL cells. (G, H) Gene Ontology (GO) terms associated with down-regulated (left panel) and up-regulated genes (right panel). (I, J) KEGG pathways associated with the down-regulated (left panel) and up-regulated genes (right panel) after U2AF1 knockdown.

To investigate the molecular mechanisms associated with U2AF1 knockdown, we performed gene set enrichment analysis (GSEA) on all genes, comparing the knockdown group with the control group. GSEA revealed positive correlations with gene sets such as HALLMARK_APOPTOSIS, HALLMARK_TGF_BETA_SIGNALING, HALLMARK_G2M_CHECKPOINT, and various others (Fig. 3F, Supplementary Fig. S5). We then evaluated the Gene Ontology (GO) terms for the genes differentially expressed after U2AF1 knockdown in HEL and K562 cells to understand the associated molecular functions. The GO terms for the significantly down-regulated genes were highly enriched for the mitotic cell cycle process, cell migration, neuron apoptotic process, regulation of neuron death, and various others (Fig. 3G, Supplementary Fig. S6A). Similarly, the GO terms for the significantly up-regulated genes were enriched for the regulation of RNA metabolic process, regulation of RNA biosynthetic process, regulation of gene expression, nucleic acid binding, regulation of cell migration, and growth factor binding (Fig. 3H, Supplementary Fig. S6B). Moreover, KEGG pathway analysis revealed the involvement of down-regulated genes in pathways such as the p53 signaling pathway, DNA replication, cell cycle, RNA transport, and spliceosome (Fig. 3I, Supplementary Fig. S6C). The up-regulated genes were associated with pathways including the mTOR signaling pathway, MAPK signaling pathway, TGF-beta signaling pathway, PI3K-Akt signaling pathway, NF-kappa B signaling pathway, and FoxO signaling pathway (Fig. 3J, Supplementary Fig. S6D). In conclusion, our analysis of genes affected by U2AF1 knockdown reveals a multifaceted molecular landscape, implicating U2AF1 in diverse cellular processes including cell cycle regulation, apoptosis, gene expression, and signaling pathways.
These findings provide insights into the mechanisms underlying U2AF1’s role in cellular homeostasis and highlight potential targets for further investigation in gene regulation and cancer therapeutics.

Knockdown of U2AF1 impacts alternative splicing and cellular pathways

Alternative splicing of pre-mRNAs plays a significant role in both proteomic diversity and the regulation of gene expression. This process is tightly controlled across tissues and developmental stages, and its disruption can result in a wide range of human diseases. A crucial objective in the splicing field is to establish a set of rules, or a code, that can accurately predict the splicing pattern of any primary transcript from its sequence. To investigate the impact of U2AF1 knockdown on alternative splicing, we analyzed the paired-end RNA-seq data obtained after the successful knockdown of U2AF1 in the leukemia cell lines HEL and K562. The “target region” refers to the specific exonic or intronic regions where U2AF1 binding may directly influence alternative splicing; this is typically where U2AF1 is expected to interact with pre-mRNA to regulate the inclusion or exclusion of certain exons. The “junction region”, on the other hand, refers to the exon–exon or exon–intron boundaries adjacent to these target regions. These boundaries are critical in splicing decisions, as they determine where splicing events occur. First, we analyzed the junction region and found five types of differential splicing events: skipped exon (SE), mutually exclusive exon (MXE), alternative 3′ splice site (A3SS), alternative 5′ splice site (A5SS), and retained intron (RI). Skipped exons were the most numerous of these events (Fig. 4A, Supplemental Tables 3 & 4). When analyzing alternative splicing events, we considered both target reads (reads aligning to the splice site region, including the intron–exon boundary) and junction reads (reads spanning the exon–exon junction), so that we could comprehensively assess splicing changes and capture variations in both splicing efficiency and exon inclusion/skipping.
Therefore, we analyzed the target and junction regions together and consistently found that the majority of the splicing events were skipped exons (Fig. 4B, Supplemental Tables 5 & 6). This strongly suggests that knockdown of U2AF1 alters exon inclusion, predominantly through exon skipping, ultimately generating multiple mRNA isoforms from a single gene.

Fig. 4

U2AF1 knockdown impacts alternative splicing. (A) Differential splicing events detected in the junction region after knockdown of U2AF1 in HEL cells (right panel) and K562 cells (left panel). (B) Differential splicing events from both the target and junction regions, with skipped exons having the highest proportion. (C) Gene Ontology (GO) term analysis of the junction region after U2AF1 knockdown. (D) GO terms obtained from analyzing both the target and junction regions. (E) KEGG pathway analysis of the junction region after U2AF1 knockdown. (F) KEGG pathways obtained from analyzing both the target and junction regions.

To gain insights into the molecular processes affected by U2AF1 knockdown, we analyzed the GO terms for both the junction and the target regions. Our analysis revealed significant enrichment of terms including protein binding, intracellular part, metabolic process, cytoplasm, and binding (Fig. 4C,D, Supplementary Fig. S7A,B). Additionally, we performed KEGG analysis for both the junction and the target regions to determine the impact of U2AF1 knockdown on cellular pathways (Supplemental Tables 7 & 8). The analysis identified highly enriched KEGG pathways, including RNA transport, RNA degradation, spliceosome, mRNA surveillance pathway, pathways in cancer, cell cycle, and MAPK signaling pathway (Fig. 4E,F, Supplementary Fig. S7C,D). Together, these data indicate that the knockdown of U2AF1 results in exon skipping and in the inclusion or exclusion of specific exons, disrupting the normal splicing process and potentially producing protein isoforms with altered functions. Depletion or suppression of U2AF1 is also associated with enrichment of pathways affecting RNA transport and RNA degradation, and with disruption of the spliceosome machinery. Understanding the role of U2AF1 in alternative splicing and its implications in disease can provide valuable insights into the molecular mechanisms underlying the regulation of gene expression.

Knockdown of U2AF1 is associated with the formation and regulation of different categories of chimeric RNA

To investigate the role of U2AF1 in the formation of chimeric RNAs, and to characterize the chimeric RNAs formed after U2AF1 knockdown, we analyzed the RNA-seq data using the SOAPfuse algorithm, which combines the alignment of paired-end RNA-seq reads against the human reference sequence and annotated genes with the detection of candidate fusion events (Supplemental Table 12)36. We characterized the chimeric RNAs based on parental gene location, fusion junction site, and fusion protein coding potential, as previously described37. First, we characterized the chimeric RNAs according to the chromosomal locations of their parental genes. When the parental genes were neighboring genes transcribed from the same strand, the chimeric RNAs were categorized as INTRACHR-SS-OGO-0GAP. Chimeric RNAs whose parental genes were located on different chromosomes were categorized as INTERCHR, and other chimeric RNAs with parental genes on the same chromosome were categorized as INTRA-OTHERS, as previously described38. Surprisingly, after U2AF1 knockdown, the INTRACHR-SS-OGO-0GAP chimeric RNAs increased significantly, whereas the other two groups decreased significantly (Fig. 5A,D, Supplementary Fig. S8A,D). This shift suggests that U2AF1 depletion promotes the formation of chimeric RNAs originating from neighboring genes in close proximity to each other.
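The location-based categories described above can be written as a small decision rule. The sketch below is illustrative only and is not SOAPfuse's implementation: the function and argument names are hypothetical, and modeling "neighboring" as zero genes lying between the two parental genes is an assumption.

```python
def location_category(chrom_5p, chrom_3p, strand_5p, strand_3p, genes_between):
    """Assign a chimeric RNA to a category by parental gene location.

    genes_between: number of annotated genes separating the two parental
    genes (hypothetical field; 0 is taken to mean 'neighboring').
    """
    if chrom_5p != chrom_3p:
        return "INTERCHR"
    if strand_5p == strand_3p and genes_between == 0:
        return "INTRACHR-SS-OGO-0GAP"
    return "INTRA-OTHERS"


# Hypothetical parental gene pairs, not data from the study.
print(location_category("chr1", "chr1", "+", "+", 0))  # INTRACHR-SS-OGO-0GAP
print(location_category("chr1", "chr7", "+", "-", 0))  # INTERCHR
print(location_category("chr1", "chr1", "+", "-", 0))  # INTRA-OTHERS
print(location_category("chr1", "chr1", "+", "+", 3))  # INTRA-OTHERS
```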

Fig. 5

U2AF1 knockdown leads to chimeric RNA formation. (A) Categories of chimeric RNAs based on parental gene location. (B) Classification of chimeric RNAs based on junction position (E/M classification) and (C) based on fusion protein coding potential. (D) Bar graphs representing the changes in different categories of chimeric RNAs after the knockdown of U2AF1 in HEL cells as compared to the control group. (E) Percentages of chimeric RNAs unique to, and shared between, the knockdown and control groups. (F) Expression profile of selected chimeric RNAs expressed in all the knockdown groups. (G) Schematic representation of a read-through/cis-SAGe chimeric RNA.

Continuing our analysis, we further categorized the chimeric RNAs based on the junction position relative to the exons of the parental genes. If both junction positions were known exon/intron boundaries, the chimeric RNA was categorized as E/E. If one position was an exon/intron boundary and the other was not, it was categorized as E/M or M/E, and if both sides of the junction fell in the middle of an exon, it was categorized as M/M, as previously described by Babiceanu et al., 2016. Notably, upon U2AF1 knockdown we observed a striking increase in the proportion of chimeric RNAs falling into the E/E category (Fig. 5B,D, Supplementary Fig. S8B,D). This increase indicates that U2AF1 knockdown favors chimeric RNAs in which both junction positions coincide with known exon/intron boundaries of the parental genes.
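The E/M junction classification above reduces to two yes/no questions, one per side of the junction. A minimal sketch, assuming the first letter refers to the 5′ parental gene and the second to the 3′ parental gene (the text does not state the order explicitly); the function name is hypothetical.

```python
def junction_category(five_prime_on_boundary, three_prime_on_boundary):
    """Return E/E, E/M, M/E, or M/M for a fusion junction.

    Each argument is True when that side of the junction coincides with a
    known exon/intron boundary, and False when it falls mid-exon.
    """
    left = "E" if five_prime_on_boundary else "M"
    right = "E" if three_prime_on_boundary else "M"
    return f"{left}/{right}"


print(junction_category(True, True))    # E/E
print(junction_category(True, False))   # E/M
print(junction_category(False, True))   # M/E
print(junction_category(False, False))  # M/M
```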

Expanding our investigation, we classified the chimeric RNAs based on their reading frames. When the known reading frame of the 3′ gene was the same as that of the 5′ gene, the chimeric RNA was categorized as in-frame. Conversely, if the known protein coding sequence of the 3′ gene used a different reading frame than the 5′ gene, it was categorized as frame-shift. Chimeric RNAs in the NA category were those that did not affect the reading frame of the parental genes or were not translated into protein. The NA category includes chimeric RNAs whose junction sequence falls into an untranslated region, or for which one or both of the parental genes are long non-coding RNAs (lncRNAs)39. Intriguingly, following U2AF1 knockdown, we observed a significant increase in frame-shift chimeric RNAs, accompanied by a notable decrease in in-frame and NA chimeric RNAs (Fig. 5C,D, Supplementary Fig. S8C,D). This suggests that the knockdown of U2AF1 increases the probability of forming chimeric RNAs in which the protein coding sequence of the 3′ gene uses a different reading frame than the 5′ gene. Analysis with SOAPfuse thus identified different categories of chimeric RNAs regulated by U2AF1. Some of the cis-SAGe chimeric RNAs were common to the knockdown groups and the control group. We selected several of these, designed primers, and measured their expression by qRT-PCR. We found that U2AF1 knockdown significantly enhances the expression of cis-SAGe chimeric RNAs (Fig. 5E,F)40. A schematic representation of a read-through/cis-SAGe chimeric RNA is shown in Fig. 5G. Taken together, our findings underscore the substantial role of U2AF1 in governing chimeric RNA formation and regulating different categories of chimeric RNAs.
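The reading-frame categories above can be summarized as a three-way rule. In this sketch the representation is an assumption: each side's frame is given as an integer codon phase (0, 1, or 2), or as None when no frame applies (junction in an untranslated region, or a lncRNA parental gene). The function name is hypothetical.

```python
def frame_category(frame_5p, frame_3p):
    """Classify a chimeric RNA as 'in-frame', 'frame-shift', or 'NA'.

    frame_5p / frame_3p: codon phase (0, 1, 2) at the junction for the
    5' and 3' parental genes, or None when no reading frame applies.
    """
    if frame_5p is None or frame_3p is None:
        return "NA"
    if frame_5p % 3 == frame_3p % 3:
        return "in-frame"
    return "frame-shift"


# Hypothetical junctions, not data from the study.
print(frame_category(0, 0))     # in-frame
print(frame_category(1, 2))     # frame-shift
print(frame_category(None, 1))  # NA
```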

Discussion

We presented data on the molecular profile and survival outcome of patients carrying U2AF1 mutations. We observed a distinct alteration profile for the U2AF1 gene across cancer types, with the highest alteration frequency in leukemia. These results align with previously published data reporting that U2AF1 mutations are highly prevalent in various types of leukemia, including myelodysplastic syndromes and acute myeloid leukemia41. The U2AF1 S34F mutation is a well-known genetic alteration located in the highly conserved zinc finger domain; it is a substitution of serine (S) with phenylalanine (F) at position 34 of the U2AF1 protein. Numerous studies have investigated the prevalence and impact of the U2AF1 S34F mutation in various types of cancer, where it alters splice site recognition and dysregulates hematopoiesis42,43. Patients with U2AF1 mutations had lower overall survival than those with wild-type U2AF1, which is consistent with prior studies44. Similarly, individuals with MDS carrying U2AF1 mutations experienced reduced overall survival compared with those without the mutation. Furthermore, patients with U2AF1 mutations were found to be at a higher risk of developing AML45. Likewise, U2AF1 mutation was found to be significantly associated with shorter overall survival46.

U2AF1 expression varied widely across cell lines. The designed shRNA significantly knocked down U2AF1. When constructing the U2AF1 S34F mutant plasmid, the mutation was incorporated into the U2AF1a and U2AF1b isoforms but could not be incorporated into the U2AF1c isoform. Previous studies have indicated that the U2AF1a isoform is more prevalent than U2AF1b in higher vertebrates. Moreover, the U2AF1c isoform carries a premature termination codon and is subject to RNA surveillance47. Some researchers have also reported that the U2AF1a isoform is crucial for the division of HeLa cells. Furthermore, knockdown of the U2AF1a isoform alone or in combination with the U2AF1b isoform has a more substantial effect than knockdown of U2AF1b alone48.

RNA-seq analysis after U2AF1 knockdown identified differentially expressed genes (DEGs). Functional annotation showed that these genes were involved in enhancing apoptosis, cell cycle arrest, and DNA damage. In accordance with this, earlier studies have reported that knockdown of U2AF1 significantly inhibits cell proliferation and enhances apoptosis in HEL cells49. Similarly, the U2AF1 S34F mutation was found to have a significant impact on the rate of apoptosis, cell viability, and colony formation in myelodysplastic syndromes and myeloid malignancies32. Further, the GO terms for the DEGs were associated with RNA metabolic processes, regulation of gene expression, cell migration, and chemokine binding. In addition, the DEGs were involved in several biological pathways, including the p53 signaling pathway, MAPK signaling pathways, and the TGF-beta signaling pathway. These results are consistent with previous research showing that knockdown of U2AF1 induces apoptosis involving the p53 pathway and alters the alternative splicing of apoptosis-associated gene transcripts9.

Our findings indicate that U2AF1 plays an important role in splicing and that knockdown of U2AF1 leads to transcriptomic changes. Knockdown of U2AF1 disrupts the normal splicing mechanism and induces differential splicing events, leading to the formation of different isoforms of a gene. Depletion of U2AF1 considerably increased the ratio of skipped exons. Consistent with our results, mutant U2AF1 facilitates increased splicing and exon skipping, thus hampering the function of U2AF1 in the RNA splicing machinery50,51. Mutation in U2AF1 results in aberrant RNA splicing, non-functional transcripts, and nonsense-mediated decay. It also induces formation of the long isoform IRAK4-L, which possesses oncogenic activity52. Surprisingly, we found that U2AF1 plays a significant role in the formation of chimeric RNAs, and U2AF1 knockdown affected different categories of chimeric RNA. Taken together, and as discussed earlier, U2AF1 mutations have been associated with different cancer types and hematological diseases, such as MDS, AML, and lung cancer. Chimeric RNAs regulated by the splicing factor U2AF1 may have oncogenic properties, may help unveil different aspects of cancer progression, and could serve as biomarkers for cancer diagnosis and prognosis.

Conclusion

In conclusion, our study underscores the pivotal role played by U2AF1 in the formation of chimeric RNA, shedding light on its regulatory mechanisms across different categories of chimeric RNAs. The study of chimeric RNAs regulated by the splicing factor U2AF1 holds great promise for advancing our understanding of cancer biology. Future research should focus on identifying novel chimeric RNAs, elucidating their functional roles and the molecular mechanism by which U2AF1 regulates their formation, developing diagnostic tools, exploring therapeutic targeting strategies, and validating their clinical utility. By harnessing the power of chimeric RNAs associated with U2AF1, we can pave the way for transformative approaches in cancer diagnosis and treatment, ultimately improving patient outcomes.