Fig. 1: LncRNA MALNC, highly expressed in AML, is multi-exonic and polyadenylated.

A Expression of MALNC in AML (n = 103) versus normal bone marrow (NBM, CD34+; n = 11). Data from the KAW cohort are shown as read counts, log2(CPM + 1), normalized by exon length, in box-and-whisker dot plots with interquartile range (IQR). B Genetic locus of MALNC (custom track, all lab-validated exons combined) compared with the annotated LOC105370601 gene with its indicated exonic structure (RNA sequence XR_944092.2). Track data were retrieved from NCBI (GRCh38.p13, Primary Assembly Homo sapiens, Chr14, annotation release 109.20210514). C (Track 1) MALNC exonic structure as identified in this study, spanning a genomic region of about 80 kb. (Track 2) Genomic locus and exon structure of XLOC_091701 on chromosome 14 as initially predicted by Cufflinks, spanning 65,754 bp (chr14:83712431-83778185, GRCh37). (Tracks 3-5) Histograms and heatmap of summed read-alignment coverage in all ClinSeq AML samples (n = 325) within the Cufflinks-predicted lncRNA locus +/- 1 kb. ClinSeq samples are sorted by mean coverage; the brightest red corresponds to 323x coverage. Nucleotide coverage, represented by red bars, indicates exonic regions, while gap coverage, shown in blue, indicates intronic regions. (Tracks 6-7) Schematic overview of the three main MALNC isoforms identified to date in HL60 and NB4 cell lines (corresponding to the three distinct transcription start sites (TSS) at exons 1.0, 1.1 and 1.2) and the splicing pattern observed for the two alternative transcription termination sites (exon 7 and exon 10). Transcript sequences were amplified by primer walking and RACE, cloned and Sanger-sequenced. Sequences were then aligned against the reference genome using the hg19 assembly (Genome Reference Consortium Human Build 37, GRCh37). D Isoform usage as indicated by MALNC expression at the transcription start exons (exon 1.0, exon 1.1 and exon 1.2) among AML patients from the KAW cohort (n = 72, including only samples with CPM > 0 in all exons). Data are shown as read counts, log2(CPM + 1), in box-and-whisker dot plots with interquartile range (IQR). E Isoform usage as indicated by MALNC expression at the transcription termination exons (exon 7 and exon 10) among AML patients from the KAW cohort (n = 58, including only samples with CPM > 0 in all exons). Data are shown as read counts, log2(CPM + 1), normalized by exon length, in box-and-whisker dot plots with interquartile range (IQR). F Track view of CAGE-seq data at the TSS region of MALNC (chr14:83,711,285-83,737,170, GRCh37) for HL60, U937, monocytes, healthy CD34+ cells and K562. The view is zoomed in on MALNC start exons 1.1 (Ex1.1) and 1.2 (Ex1.2). Data were retrieved from the FANTOM5 project, scaled by group and visualized using IGV. G ChIP-seq track view of RNA polymerase II (Pol2A) binding and of transcription factor binding sites for c-Myc, Max, REST and PU.1 at the transcription start locus of MALNC (chr14:83,711,285-83,737,170, GRCh37) in HL60 and K562 cell lines. Data were retrieved from ENCODE/HAIB (GSE32465) and visualized using IGV. P-values were determined by Student’s t-tests (1B;1E) and two-way ANOVA followed by pairwise comparison testing (1D): ns, not significant; * p < 0.05; ** p < 0.01; *** p < 0.001.
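The log2(CPM + 1) values plotted in panels A, D and E can be illustrated with a minimal sketch. It assumes CPM is the exon read count divided by the total number of mapped reads in the sample, multiplied by one million. The legend does not state how exon-length normalization was applied, so the per-kilobase scaling below is an assumption, and the function name `log2_cpm` and its parameters are hypothetical.

```python
import math


def log2_cpm(count, library_size, exon_length_bp=None):
    """Return log2(CPM + 1) for one exon in one sample.

    count          -- reads assigned to the exon
    library_size   -- total mapped reads in the sample
    exon_length_bp -- optional exon length; if given, CPM is divided by the
                      exon length in kb (assumed scaling, not from the legend)
    """
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    cpm = count / library_size * 1_000_000
    if exon_length_bp is not None:
        # Assumption: length normalization is per kilobase of exon.
        cpm = cpm / (exon_length_bp / 1000)
    # The +1 pseudocount keeps zero-count exons defined (they map to 0).
    return math.log2(cpm + 1)
```

With this definition an exon with no reads has a value of 0, which is why panels D and E can restrict the analysis to samples with CPM > 0 in all exons.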