Facts

  • LncRNAs play an essential role in diverse biological processes.

  • LncRNAs can encode micropeptides.

  • LncRNA-encoded micropeptides affect human innate immunity, metabolism, and tumorigenesis.

Questions

  • What is the best method for identifying lncRNA-encoded micropeptides?

  • What is the physiological function of lncRNA-encoded micropeptides?

  • What is the underlying mechanism of bi-functional lncRNAs, either as coding peptides or ncRNA molecules, in human diseases?

Introduction

The Central Dogma of molecular biology posits that genetic information, encoded within genes as either DNA or RNA sequences, is translated into functional products, predominantly proteins [1]. Over the past two decades, advances in next-generation sequencing technologies have significantly deepened our understanding of the transcriptome, providing novel insights into how the genome is orchestrated. Strikingly, up to 98% of RNA transcripts from the human genome appear to be non-coding RNAs (ncRNAs), which do not code for proteins [2]. These ncRNAs were once dismissed as transcriptional “noise” or “dark matter” because they were believed to be non-functional parts of the genome. However, recent research has drawn attention to these previously overlooked molecules, revealing the crucial regulatory roles of ncRNAs in a spectrum of fundamental biological processes, from metabolism to development and differentiation [3]. According to their size, ncRNAs can be broadly divided into different classes, such as microRNAs, circular RNAs (circRNAs), long non-coding RNAs (lncRNAs), PIWI-interacting RNAs (piRNAs) and small nucleolar RNAs (snoRNAs) [4]. ncRNAs are implicated in many human diseases [5]. Such findings underscore the potential of ncRNAs not only as diagnostic markers but also as targets for therapeutic intervention.

LncRNAs are a class of ncRNAs longer than 200 nucleotides. In general, lncRNAs are transcribed like messenger RNAs by RNA polymerase II, capped at the 5’ end, polyadenylated at the 3’ end, and spliced [6]. Distinct from mRNAs, lncRNAs exhibit tissue-specific expression and directly modulate a plethora of biological processes. They exert diverse functions, including microRNA sponging, RNA stabilization, transcription regulation, and remodeling of chromatin and genome architecture [7]. Numerous studies have explored the diverse and significant roles of lncRNAs in cancer development, where they can act as oncogenes, tumor suppressors, and chromatin scaffolds [8]. Recently, it has become clear that lncRNAs can carry small open reading frames (sORFs) and encode micropeptides [9]. For example, Huang and colleagues discovered that the lncRNA HOXB-AS3 produces a 53-amino acid peptide, the HOXB-AS3 peptide. This peptide inhibits colorectal cancer (CRC) growth by binding with high affinity to the arginine residue motif of hnRNP A1, which impedes the splicing of pyruvate kinase M (PKM) by hnRNP A1 [10]. In another example, Ge and colleagues identified a 94-amino acid micropeptide called the ATP synthase–associated peptide (ASAP), which is encoded by the lncRNA LINC00467. They showed that ASAP interacts with ATP synthase subunits α and γ (ATP5A and ATP5C), facilitating ATP synthase assembly, which boosts its activity and mitochondrial oxygen consumption. This results in augmented colorectal cancer cell proliferation [11].

In this review, we outline the rapidly advancing field of lncRNA-encoded micropeptides, covering both the computational and experimental methods used to identify them and their biological significance. We highlight that certain cancer-relevant lncRNA-encoded peptides play a central role in regulating various biological processes and influence tumor initiation, progression, invasion, and metastasis. Finally, we discuss the outlook for lncRNA-encoded micropeptides in therapy, aiming to provide novel insights and strategies for cancer treatment.

Molecular functions of lncRNAs

LncRNAs affect a large number of biological processes. These include, but are not limited to, shaping chromatin architecture, enhancer action, contributing to phase separation, transcriptional processing, and both in-trans and in-cis regulation. Additionally, lncRNAs are involved in alternative splicing, DNA damage repair, microRNA processing, and even encoding micropeptides. Each of these aspects underscores the versatile and pivotal nature of lncRNAs within cellular biology (Fig. 1) [12]. (A) Chromatin architecture: Engreitz and colleagues showed that the lncRNA XIST coats the entire X chromosome during X chromosome inactivation (XCI) by exploiting the three-dimensional conformation of the chromosome to reach spatially proximal sites, exemplifying its critical role in chromatin reorganization (Fig. 1A) [13]. (B) Enhancer action: Zhang and colleagues showed that infiltration of M2-like tumor-associated macrophages (TAM2) creates a TGFβ-rich microenvironment and promotes SMAD3 binding to the enhancer of LINC01977, thereby driving malignancy through the TGFβ/SMAD3 pathway in lung cancer (Fig. 1B) [14]. (C) Nuclear body construction: Yamazaki et al. reported that the middle subdomains of NEAT1_2 recruit NONO dimers that initiate paraspeckle assembly with phase-separated features (Fig. 1C) [15]. Xing et al. found that SLERT, a snoRNA-ended lncRNA, binds the RecA domain of the RNA helicase DDX21 to control phase separation of the fibrillar center and the dense fibrillar component (DFC) and to reshape the donut-like ring structures, thereby preventing the repression of Pol I transcription [16, 17]. (D) Transcriptional processing: Schlackow et al. used mNET-seq to survey genome-wide Pol II density and found that the phosphorylation status of the Pol II C-terminal domain (CTD) differs between mRNAs and lncRNAs. LncRNAs are inefficiently polyadenylated and spliced, and are more extensively degraded post-transcriptionally by the nuclear exosome (Fig. 1D) [18]. (E) In-trans and in-cis regulation: The regulatory roles of lncRNAs can be broadly categorized as in-trans or in-cis. Trans-acting lncRNAs modulate gene expression in regions distant from their transcription sites by influencing chromosome structure and interacting proteins or RNA molecules [19]. In contrast, cis-acting lncRNAs recruit proteins or complexes to nearby loci to modulate gene activity (Fig. 1E) [19]. (F) DNA damage: Wang et al. recently reported that the lncRNA HCP5 interacts with YB1 and ILF2, resulting in the shuttling of YB1 to the nucleus, where it stimulates MSH5 and affects DNA damage repair (Fig. 1F) [20]. (G) Alternative splicing: Zhou et al. demonstrated that an intron 3 retention transcript of lncRNA PXN-AS1 (PXN-AS1-IR3) recruits p300 to the MYC promoter, activating MYC downstream genes and facilitating hepatocellular carcinoma (HCC) metastasis (Fig. 1G) [21]. (H) microRNA processing: Some lncRNAs are host genes of microRNAs and do not follow the canonical cleavage-and-polyadenylation pathway. Instead of being processed in the typical polyadenylation-dependent manner, the nascent lnc-pri-miRNA transcripts are cleaved by the Microprocessor complex (Drosha, DGCR8 and others) (Fig. 1H) [22, 23]. (I) Micropeptides: Rohrig et al. reported that the plant lncRNA ENOD40 encodes functional peptides that act on a sucrose-synthesizing enzyme during root organogenesis, illustrating the coding potential latent within ncRNAs (Fig. 1I) [24, 25]. In summary, lncRNAs are not mere passengers but active and versatile conductors of a multitude of cellular and biological processes, pointing to a richer understanding of the complexity of RNA-mediated regulation.

Fig. 1: Diverse functions of lncRNAs.

A LncRNA XIST alters three-dimensional genome architecture during X-chromosome inactivation. B LncRNA LINC01977 functions as an enhancer to stimulate the TGFβ/SMAD3 pathway. C NEAT1_2 recruits NONO dimers that initiate paraspeckle assembly. D The involvement of lncRNA in transcriptional processing via the microprocessor complex subunit DGCR8. E LncRNA can modulate gene expression in regions distant (in trans) or nearby (in cis) from their transcription sites. F LncRNA HCP5 affects DNA damage via YB1-mediated MSH5 in the nucleus. G Enforced DDX17 activates tumorigenesis by producing a long-spliced transcript of lncRNA PXN-AS1-IR3. H The Microprocessor cleaves the nascent lnc-pri-miRNA transcripts, resulting in microRNA processing via a noncanonical pathway. I The lncRNA-encoded micropeptide ENOD40 is involved in root organogenesis.

Prediction and identification of lncRNA-encoded micropeptides

In recent years, significant advancements have been made in methods to explore the coding capacity and potential functions of micropeptides. These approaches encompass a range of techniques, including predicting ORFs, analyzing translation start elements such as internal ribosome entry sites (IRES), investigating histone modifications, conducting translation omics and proteomics profiling, and utilizing Flag-labeled expression combined with mass spectrometry (Table 1) [9].

Table 1 Prediction tools, identification tools, and databases for lncRNA-encoded micropeptides.

Computational analysis

Coding potential assessment

The Coding-Potential Assessment Tool (CPAT) uses purely sequence-derived (linguistic) features calculated from RNA sequences to quickly and accurately assess the likelihood of protein coding, producing a probability (0 ≤ p ≤ 1) from the input nucleotide sequences or genomic coordinates of RNAs [26]. Another tool, the Coding Potential Calculator (CPC), estimates a transcript’s protein-coding potential by analyzing six sequence features [27]. CPC2, an updated version, is roughly a thousand-fold faster than its predecessor while improving accuracy and remaining species-neutral [28]. The Coding-Non-Coding Identifying Tool (CNIT) is well suited to transcriptome analysis, helping researchers test coding or noncoding hypotheses with high accuracy, robustness, and consistency [29]. Phylogenetic Codon Substitution Frequencies (PhyloCSF), developed by the Broad Institute, is a comparative genomics method that assists in identifying functionally conserved, protein-coding regions of genomes [30]. Additionally, COME is a coding potential calculation tool that integrates sequencing-derived or experiment-based features to enhance prediction accuracy and robustness [31]. ORF Finder is a widely used tool for identifying ORFs in lncRNA sequences [32]. Coding Region Identification Tool Invoking Comparative Analysis (CRITICA) includes various programs that search for and rank likely protein-coding ORF sequences [33].
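To make the notion of an sORF scan concrete, the following Python sketch enumerates candidate ATG-initiated ORFs on the forward strand of a transcript. It is only an illustration of the general idea and does not reproduce the algorithm of ORF Finder, CPAT, or any other tool named above. The function name `find_sorfs` and the 10 to 100 amino acid length window are assumptions chosen for the example; non-ATG start codons and the reverse strand are deliberately ignored.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_sorfs(seq, min_aa=10, max_aa=100):
    """Return (start, end, frame) tuples for ATG-initiated ORFs (forward strand only).

    start/end are 0-based, end-exclusive, and end includes the stop codon.
    The length window (min_aa, max_aa) is an illustrative assumption.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame ATG opens the ORF
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3  # residues, stop codon excluded
                if min_aa <= n_aa <= max_aa:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs

# Toy transcript: 2-nt leader, a 12-codon ORF, stop codon, 2-nt trailer
toy = "CC" + "ATG" + "GCT" * 11 + "TAA" + "GG"
print(find_sorfs(toy))
```

Candidates returned by such a scan would still need to be scored for coding potential and validated experimentally, as described in the following sections.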

Internal ribosomal entry sites (IRESs) analysis

An internal ribosome entry site (IRES) is an RNA segment that enables the initiation of translation without relying on the cap structure, playing a key role in protein synthesis [34]. RNA-binding proteins (RBPs) can bind to lncRNAs to form ribonucleoprotein (RNP) complexes, which function together with the Kozak sequence around the AUG start codon to activate translation initiation [35]. Recent studies have highlighted IRES elements within non-coding RNAs. Legnini et al. showed that the 5’ UTR of circ-ZNF609 can work as an IRES, enabling the encoding of a protein in a splicing-dependent manner [36]. Yu et al. recently found that DNA damage enhances the interaction of ribosomes with the IRES region of the lncRNA CTBP1-DT. This interaction relieves the effect of negative modulators on the ORF and enhances the translation of the micropeptide DNA damage-upregulated protein (DDUP) through a cap-independent mechanism [37].

m6A modification prediction

N6-methyladenosine (m6A) has been recognized as a prevalent RNA modification that regulates RNA expression across various physiological processes [38]. Emerging studies have shown that m6A modification contributes to lncRNA translation in mammals [39]. Accordingly, different approaches have been developed to predict the m6A sites on lncRNAs. DeepM6ASeq, a deep-learning framework, allows for the prediction and visualization of m6A sites within sequences [40]. Similarly, SRAMP (sequence-based RNA adenosine methylation site predictor), a web-based tool, can identify mammalian m6A sites at single-nucleotide resolution [41]. These tools are crucial for advancing our understanding of the impact of m6A on lncRNA function and its broader implications in disease and development.
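As a simple point of reference for what such predictors refine, m6A is typically found within the DRACH consensus motif (D = A/G/U, R = A/G, H = A/C/U). The Python sketch below merely lists DRACH occurrences in a sequence. It is a plain motif scan, not the model used by DeepM6ASeq or SRAMP, and the function name `find_drach_sites` is an assumption made for this example. Most DRACH motifs in a transcript are not methylated, so the output should be read as a list of candidates only.

```python
import re

# DRACH consensus: D = A/G/U, R = A/G, A = candidate m6A, C, H = A/C/U.
# The lookahead lets overlapping motifs be reported individually.
DRACH = re.compile(r"(?=([AGU][AG]AC[ACU]))")

def find_drach_sites(rna):
    """Return 0-based positions of the central adenosine of each DRACH motif."""
    rna = rna.upper().replace("T", "U")  # accept DNA or RNA alphabets
    return [m.start() + 2 for m in DRACH.finditer(rna)]

print(find_drach_sites("CCGGACUCC"))
```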

Transcriptomic-based method

Over the last decade, researchers have devoted considerable effort to developing high-throughput profiling techniques to analyze ncRNA sequences predicted to be translated, as summarized in Table 1.

Ribosome profiling, also known as Ribo-sequencing or active mRNA translation sequencing (ART-seq), has emerged as a common technique for quantitatively and thoroughly assessing translation. This method involves deep sequencing of ribosome-protected mRNA fragments, allowing researchers to identify hundreds of translated ORFs across various species, including zebrafish and Homo sapiens [42, 43]. Poly-Ribo sequencing is an advanced ribosome profiling technique that leverages active translation and the clustering of multiple ribosomes on a transcript to minimize false positives [44]. Ribosome-nascent chain complex (RNC) sequencing analyzes the RNAs associated with ribosomes that carry a nascent polypeptide chain, that is, transcripts undergoing translation [45].

Proteomics-based method

Researchers have employed proteomics, specifically mass spectrometry (MS), to validate micropeptides encoded by ncRNAs. For instance, Banfai and colleagues conducted a joint analysis of two public datasets that included tandem mass spectrometry (MS/MS) and RNA-seq data from the K562 and GM12878 cell lines. Their study examined 79,333 peptides derived from 9,640 lncRNA loci, ultimately identifying 85 unique peptides corresponding to 69 lncRNAs [46]. Additionally, Slavoff et al. combined RNA-seq with liquid chromatography-tandem mass spectrometry (LC/MS/MS) and identified 90 small open reading frame-encoded polypeptides (SEPs), 86 of which were characterized in K562 cells [47].
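The core logic of assigning MS-detected peptides to lncRNA loci can be sketched as follows: each transcript is translated in its three forward frames and each peptide is searched for within the resulting stop-free segments. This is a simplified illustration under stated assumptions, not the pipeline used in the studies cited above. Real analyses rely on dedicated proteomics search engines, false-discovery-rate control, and exclusion of peptides that also match annotated proteins. The names `translate` and `map_peptides` and the toy transcript are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(seq, frame=0):
    """Translate a nucleotide sequence in the given forward frame ('*' = stop)."""
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)  # 'X' = ambiguous codon

def map_peptides(transcripts, peptides):
    """Map each peptide to the (transcript_id, frame) pairs that could encode it."""
    hits = {p: [] for p in peptides}
    for tid, seq in transcripts.items():
        for frame in range(3):
            # Split on stops so a peptide can never span a stop codon
            segments = translate(seq, frame).split("*")
            for p in peptides:
                if any(p in s for s in segments):
                    hits[p].append((tid, frame))
    return hits

# Hypothetical transcript whose frame 2 encodes M-A-K-D-W followed by a stop
toy = {"lnc1": "GGATGGCTAAAGACTGGTAACC"}
print(map_peptides(toy, ["AKDW", "QQQ"]))
```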

Experimental identification

Immunoblotting is a straightforward and traditional method used to detect proteins. This technique is particularly valuable for examining the endogenous expression of small peptides. However, generating targeted antibodies presents several challenges. For instance, peptides that contain transmembrane domains may offer few epitopes suitable for antibody generation [48]. Alternatively, researchers can employ tagging systems, such as a GFP-tag or Flag-tag, for validation purposes. These tags are typically cloned into the ORF sequence just before the stop codon, followed by transfection into a cell line. Subsequently, immunoblotting and immunofluorescence (IF) assays are performed to verify the presence of the tagged proteins [49]. Moreover, the CRISPR-Cas9 system offers another approach by facilitating the insertion of a Flag-tag directly before the stop codon of the lncRNA locus within target cells, followed by immunoblotting and IF assays to detect and localize micropeptide expression [50].
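The tagging strategy described above amounts to placing the tag coding sequence in frame immediately upstream of the stop codon. The Python sketch below shows this for a C-terminal Flag tag at the sequence level. It is a design aid only, under the assumption that the ORF is supplied from start codon to stop codon. The function name `add_c_terminal_tag` is hypothetical, and the 24-nt sequence is one possible codon choice for the Flag peptide DYKDDDDK. Linkers, codon optimization, and cloning sites are not considered.

```python
# One possible codon choice for the Flag peptide DYKDDDDK (assumption for this sketch)
FLAG_DNA = "GACTACAAAGACGATGACGACAAG"
STOP_CODONS = {"TAA", "TAG", "TGA"}

def add_c_terminal_tag(orf, tag=FLAG_DNA):
    """Insert tag DNA in frame immediately before the stop codon of an ORF."""
    orf = orf.upper()
    if len(orf) % 3 != 0 or len(tag) % 3 != 0:
        raise ValueError("ORF and tag lengths must be multiples of 3 nt")
    if orf[-3:] not in STOP_CODONS:
        raise ValueError("ORF must end with a stop codon")
    return orf[:-3] + tag + orf[-3:]

# Toy ORF: ATG GCT TAA becomes ATG GCT <Flag> TAA
print(add_c_terminal_tag("ATGGCTTAA"))
```

Because the tag is inserted before the stop codon, a signal in immunoblotting or IF assays indicates that the upstream ORF was translated through to its end.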

LncRNA-encoded micropeptides in the immune system and inflammatory response

Recent studies have highlighted the significant role of lncRNA-encoded micropeptides in human innate immunity (Fig. 2). For instance, Niu and colleagues reported that lncRNA miR155HG encodes a 17-aa micropeptide, called miPEP155 (P155). P155 is highly expressed in inflamed antigen-presenting cells and interacts with HSC70 at the adenosine 5′-triphosphate binding domain. It affects antigen presentation by major histocompatibility complex class II and interferes with the HSC70-HSP90 machinery, thus regulating T-cell priming (Fig. 2A) [51]. Additionally, Jackson et al. reported that a non-canonical ORF within Aw112010 is translated into a peptide that influences mucosal immunity by enhancing IL-12 stability upon bacterial infection (Fig. 2B) [52]. Tang et al. recently reported that the lncRNA Dleu2-encoded micropeptide Dleu2-17aa can serve as a scaffold to promote the interaction between Smad3 and Foxp3, thereby strengthening inducible regulatory T (iTreg) cell generation. Knocking out Dleu2-17aa in mice diminishes iTreg cell formation and consequently exacerbates experimental autoimmune encephalomyelitis (EAE) (Fig. 2C) [53]. These findings point to the fundamental roles that micropeptides play as modulators of immunological processes.

Fig. 2: LncRNA-encoded micropeptides in the immune system and inflammatory response.

A The micropeptide miPEP155 (P155) drives DC-stimulated autoimmune inflammation by disrupting the HSC70-HSP90 machinery. B The Aw112010-derived ORF peptide enhances IL-12 signaling. C The micropeptide Dleu2-17aa maintains immune homeostasis by interaction with Smad3 and Foxp3.

LncRNA-encoded micropeptides in mitochondria

Mitochondria are dynamic organelles responsible for energy transformation and signaling, crucial for maintaining cellular bioenergetics through ATP production [54]. Recent studies have demonstrated that lncRNA-encoded micropeptides play a crucial role in mitochondrial activity (Table 2). Notably, three different groups have independently examined the function of the lncRNA 1810058I24Rik-encoded micropeptide STMP1 in mitochondrial processes. Zheng et al. initially identified STMP1 as a 47-aa mitochondrial micropeptide that is involved in retinal differentiation, promoting the differentiation of bipolar and amacrine cells via its 15-aa N-terminus [55]. They further demonstrated that STMP1 regulates retinal ischemia/reperfusion (IR) by activating microglia, enhancing aerobic glycolysis, and promoting mitochondrial fusion and reactive oxygen species (ROS) production (Fig. 3A) [56]. Xie et al. found that STMP1, located in the inner mitochondrial membrane, boosts mitochondrial fission and cell migration by increasing DRP1 expression and facilitating its interaction with MYH9 [57]. Sang et al. showed that STMP1 promotes cell cycle arrest by enhancing the activity of mitochondrial complex IV [58]. In addition, Bhatta et al. reported that the lncRNA 1810058I24Rik encodes another micropeptide, called Mm47, which is required for the interaction between Nlrc4 and Aim2, influencing Nlrp3 inflammasome activity (Fig. 3B) [59]. Moreover, Ge et al. reported that ASAP, a 94-aa micropeptide encoded by lncRNA LINC00467, is involved in mitochondrial metabolism. ASAP regulates ATP synthase activity via interaction with ATP5A and ATP5C, eventually affecting colon cancer tumorigenesis in vitro and in vivo (Fig. 3C) [11].

Table 2 LncRNA-encoded micropeptides in mitochondria.
Fig. 3: LncRNA-encoded micropeptides in mitochondria.

A The micropeptide STMP1 enhances mitochondrial fusion and ROS production. B The micropeptide Mm47 impacts Nlrp3 inflammasome-mediated responses by promoting the interaction between Nlrc4 and Aim2. C The micropeptide ASAP regulates ATP synthase activity via interaction with ATP5A and ATP5C, eventually affecting colon cancer tumorigenesis in vitro and in vivo.

In summary, these studies highlight the fundamental functions of lncRNA-encoded micropeptides in mitochondrial activities.

LncRNA-encoded micropeptides in cancer

Cancer is the second leading cause of death worldwide, with approximately 20 million newly diagnosed cases and approximately 10 million deaths in 2022 [60]. Cancer arises from the transformation of normal cells into tumor cells through a multi-step process that culminates in unconstrained growth and, typically, metastasis. Research shows that micropeptides can influence tumorigenesis via diverse mechanisms (Table 3) [61]. Here, we summarize the functions of lncRNA-encoded micropeptides in different cancer types.

Table 3 LncRNA-encoded micropeptides in cancer.

Colon cancer

Huang and colleagues found that lncRNA HOXB-AS3 is reduced in colorectal cancer (CRC) tissues compared with adjacent non-tumoral colon tissues. Highly metastatic colon cancer cell lines also exhibited reduced HOXB-AS3. They found that the lncRNA HOXB-AS3 encodes a conserved 53-aa peptide, and showed that the HOXB-AS3 peptide, but not the lncRNA HOXB-AS3 itself, suppresses CRC growth. Mechanistically, the HOXB-AS3 peptide interacts with the RNA-binding RGG box of the hnRNP A1 protein and suppresses hnRNP A1-dependent PKM splicing and miR-18a processing. This interaction prevents hnRNP A1 from binding to the sequences flanking PKM exon 9, effectively antagonizing CRC growth and migration/invasion [10]. In another study, lncRNA AP002387.2 (lnc-AP) was found to be downregulated in chemotherapy-resistant CRC cells, whereas enforced lnc-AP expression was associated with beneficial clinical outcomes. The authors further found that lnc-AP encodes a micropeptide called pep-AP. Pep-AP and its binding protein TALDO1 co-repress the pentose phosphate pathway (PPP), reducing NADPH/NADP+ and glutathione (GSH) levels. This leads to ROS accumulation and apoptosis, sensitizing CRC cells to oxaliplatin treatment [62]. Additionally, Zhu et al. recently showed that the lncRNA LINC00266-1 encodes a 71-amino acid peptide, called RNA-binding regulatory peptide (RBRP) because of its interaction with several functional RNA-binding proteins. RBRP, which is highly expressed in metastatic cell lines and CRC tumors, interacts with the RNA m6A reader IGF2BP1 to enhance its recognition of the mRNA encoding the transcription factor MYC, thereby promoting MYC mRNA stability (Fig. 4A) [63].

Fig. 4: The functions of lncRNA-encoded micropeptides in cancer.

A The micropeptide RBRP interacts with m6A reader IGF2BP1 and strengthens MYC stability in colorectal cancer. B Overexpression of micropeptide UBAP1-AST6 promotes cell growth, whereas UBAP1-AST6 KO inhibits cell proliferation in lung cancer. C The LINC00908-encoded micropeptide ASRPS inhibits angiogenesis by preventing phosphorylation of STAT3 in breast cancer. D The micropeptide SMIM30 activates MAPK signaling and HCC progression by interacting with the non-receptor tyrosine kinase SRC/YES1.

Lung cancer

Lu et al. reported that the lncRNA-derived micropeptide UBAP1-AST6 is localized in the nucleoli and highly expressed in the lung cancer cell line A549. Overexpression of UBAP1-AST6 promotes cell growth, whereas UBAP1-AST6 KO via CRISPR-Cas9 significantly inhibits cell proliferation and clone formation. Moreover, the effect of UBAP1-AST6 overexpression is abolished by mutating the ATG start codon, supporting the coding potential and importance of UBAP1-AST6 in lung cancer (Fig. 4B) [64]. Meanwhile, another lncRNA-encoded peptide called DLX6-AS1 ORF can promote cell proliferation, migration, and invasion by activating the Wnt/β-catenin pathway in non-small cell lung cancer (NSCLC) [65].

Breast cancer

Wang et al. recently found that the lncRNA LINC00908 encodes a 60-aa micropeptide named ASRPS in triple-negative breast cancer (TNBC). ASRPS is expressed at low levels in TNBC, and its reduction correlates with poor survival and promotes tumor growth. Functionally, ASRPS interacts with STAT3 and prevents STAT3 phosphorylation and VEGF activation, subsequently repressing tumorigenesis (Fig. 4C) [66]. Another study identified that the lncRNA LINC00665 encodes a micropeptide called CIP2A-BP, which inhibits migration and invasion in breast cancer. The translation of CIP2A-BP is blocked by TGF-β-induced SMAD activation, which promotes the translation inhibitory factor 4E-BP1 and suppresses the initiation factor eIF4E. CIP2A-BP specifically competes with the PP2A subunit B56γ to bind CIP2A, reducing CIP2A/PP2A-mediated activation of the PI3K/AKT/NFκB pathway and thus inhibiting TNBC tumorigenesis [67]. Additionally, the CASIMO1 peptide, a 10-kDa microprotein generally located in endosomes, has been shown to play a crucial role in cell lipid homeostasis and breast cancer proliferation. CASIMO1 interacts with squalene epoxidase (SQLE), enhancing SQLE accumulation and ERK phosphorylation, leading to G0/G1 arrest [68]. Recently, a study proposed that lncRNA CTD-2256P15.2 contributes to epirubicin (EPI) resistance in breast tumors. The authors further found that the lncRNA CTD-2256P15.2 encodes a micropeptide called PAR-amplifying and CtIP-maintaining micropeptide (PACMP), which modulates DNA double-strand break (DSB) repair and chemoresistance, and maintains CtIP protein abundance by counteracting KLHL15-mediated degradation. PACMP enhances poly(ADP-ribosyl)ation by PARP1 through its binding to DNA damage-generated poly(ADP-ribose) chains. Targeting PACMP could sensitize tumor cells to various treatments including PARP, ATR, and CDK4/6 inhibitors, ionizing radiation, and camptothecin, opening new avenues for therapeutic strategies to improve clinical outcomes [69].

Liver cancer

Xu et al. identified a conserved microprotein, KRASIM, encoded by the lncRNA NCBP2-AS2 by using ribosome profiling in hepatocellular carcinoma (HCC) cells. They noted that KRASIM is expressed at lower levels in HCC than in normal hepatocytes and found that it inhibits HCC cell growth and proliferation by reducing KRAS protein levels and dampening ERK signaling pathway activity [70]. In other studies, De Lara and Polenkowski identified two lncRNA-encoded peptides, C20orf204-189AA and linc013026-68AA, which correlate with tumor differentiation grade and patient survival. These findings suggest their roles as cancer-specific fine tuners, offering potential targets for therapy in HCC [71, 72]. Using an antibody against ribosomal protein S6 (RPS6), Pang et al. performed a RIP-seq assay and observed that the lncRNA LINC00998 encodes a micropeptide called SMIM30. SMIM30 is induced by MYC and can activate MAPK signaling and HCC progression by interacting with the non-receptor tyrosine kinase SRC/YES1 (Fig. 4D) [73]. Zhang et al. found that the TGF-β-induced lncRNA LINC02551 encodes a 174-amino-acid peptide, called Jun binding micropeptide (JunBP). JunBP binds c-JUN, enhancing its phosphorylation and affinity for SMAD3, which in turn induces LINC02551, forming a positive feedback loop that promotes HCC metastasis [74]. The hypoxia-responsive lncRNA AC115619 encodes a micropeptide, AC115619-22aa, in HCC. AC115619-22aa represses HCC progression by interacting with WTAP and impeding the assembly of the m6A methyltransferase complex, thereby affecting the expression of tumor-associated genes including SOCS2 and ATG14 [75].

Others

Sun and colleagues have recently shown that the micropeptide APPLE, encoded by the lncRNA ASH1L-AS1, is upregulated in acute myeloid leukemia (AML) and associated with poor outcomes in hematopoietic malignancies. Mechanistically, APPLE acts as a novel member of the PABPC1 complex: by binding the RRM1 and RRM3 domains of PABPC1, it facilitates the interaction between PABPC1 and eIF4G. This interaction promotes mRNA circularization and eIF4F-mediated translation initiation, thereby contributing to AML progression [76]. In esophageal squamous cell carcinoma (ESCC), the Y-linked lncRNA LINC00278 encodes a Yin Yang 1 (YY1)-binding micropeptide, designated YY1BM, which inhibits the interaction between YY1 and androgen receptor (AR). This decreases eEF2K expression and promotes cell apoptosis [39]. In renal cell carcinoma (RCC), overexpression of the micropeptide MIAC significantly reduces the capacity of cells to proliferate and migrate by binding to AQP2 and reducing EREG/EGFR expression in vitro and in vivo [77]. Furthermore, the terminal differentiation-induced non-coding RNA (TINCR) encodes a highly conserved ubiquitin-like microprotein that serves as a tumor suppressor to repress the growth of squamous cell carcinoma [78]. In glioblastoma (GBM), the tumor-suppressing micropeptide MP31 disrupts mitochondrial quality control, causing defective mitochondria to accumulate in cells, which in turn results in ROS production and DNA damage [79].

In summary, these novel investigations reveal that the lncRNA-encoded peptides are closely involved in tumor-relevant activities and might become promising targets for cancer treatment.

LncRNA-encoded micropeptides in other diseases

Pulmonary hypertension

A growing number of studies have shown that micropeptides also participate in the pathogenesis of other diseases. Pulmonary hypertension, characterized by pulmonary blood vessel abnormalities, has been linked to micropeptide involvement. Recently, the Zhu lab reported that lncRNA RPS4L encodes a micropeptide called 40S ribosomal protein S4 X isoform-like (RPS4XL), which promotes the proliferation of pulmonary artery smooth muscle cells (PASMCs) under hypoxic conditions. RPS4XL binds to RPS6 to inhibit its phosphorylation at the Ser240 and Ser244 sites [80]. Additionally, RPS4XL suppresses hypoxia-induced pyroptosis in PASMCs by interacting with the glycosylation site of HSC70 [81]. These findings suggest that RPS4XL could be a potential target for treating pulmonary hypertension.

Myocardial infarction

Myocardial infarction (MI), or heart attack, occurs when the myocardium receives decreased or no blood flow, leading to tissue damage or death [82]. Spiroski et al. reported that the lncRNA LINC00961-encoded micropeptide SPAAR, short for small regulatory polypeptide of amino acid response, is expressed mostly in human cardiac endothelial cells and fibroblasts. SPAAR is implicated in fibroblast function, hypoxic response and basal cardiovascular function in adulthood [83]. In a parallel study, Yan and colleagues observed that three micropeptides encoded by lncRNAs are involved in oxidative phosphorylation and in calcium and MAPK signaling pathways, thereby regulating cardiomyocyte hypertrophy [84].

Muscle development

Anderson et al. identified a conserved micropeptide, myoregulin (MLN), encoded by a muscle-specific lncRNA. MLN is structurally similar to phospholamban and sarcolipin, inhibitors of the membrane pump SERCA, and likewise inhibits SERCA, thereby regulating Ca2+ uptake into the sarcoplasmic reticulum (SR) [85]. LncRNA MyolncR4 has been found to encode a 56-aa micropeptide called lncRNA-encoded micropeptide (LEMP). LEMP is highly conserved among different species and is associated with myogenic differentiation. Mice in which LEMP is knocked out using CRISPR-Cas9 exhibit a deficit in muscle formation and development [86]. Nelson et al. described a putative muscle-specific lncRNA that encodes a 34-aa peptide called dwarf open reading frame (DWORF). Upregulated DWORF increases peak Ca2+ transient amplitude and sarcoplasmic reticulum Ca2+ load and enhances SERCA activity in mouse cardiomyocytes [87].

In summary, these findings underscore the importance of exploring lncRNA-encoded micropeptides and highlight the complexity of molecular mechanisms underlying disease processes.

Conclusions and perspectives

Current research has been intensively exploring the biological roles of lncRNAs. Unlike protein-coding mRNAs, lncRNAs contribute uniquely to several cellular mechanisms such as histone modification, DNA methylation, and transcription regulation [88]. Strategies that combine in silico prediction, experimental validation, and functional analysis are essential to better understand the complex operations of biological systems and their evolution. Moreover, the development of new technologies, including functional proteomics, gene editing, and extensive sequencing methods, has substantially enhanced research into micropeptides encoded by lncRNAs.

Functional studies of micropeptides have uncovered their essential biological functions, including roles in the immune response and mitochondrial metabolism. A growing number of studies also demonstrate that micropeptides are involved in the development of human diseases. For example, LINC00665 is upregulated in liver cancer, particularly in pathological stages III and IV, compared with normal counterparts. The LINC00665-encoded peptide CIP2A-BP-52 competes with PP2A to bind to CIP2A, leading to the release of PP2A and downregulation of the PI3K/AKT/NFκB pathway, thus suppressing invasion and metastasis in liver cancer [89]. This review focuses on the role of micropeptides across cancer types, raising the possibility of their use as biomarkers or novel therapeutic targets.

Despite significant efforts, substantial challenges remain in understanding the biological roles of micropeptides. First, given their relatively short length, it is crucial to develop specific and effective antibodies for further experimental analysis and clinical testing. Second, considering the cell-specific and tissue-specific phenotypes of lncRNAs, it is vital to determine the level and distribution of micropeptides across tissues. Last but not least, bi-functional lncRNAs, which act either by encoding peptides or as ncRNA molecules, merit further investigation. A more in-depth study of lncRNAs and their encoded micropeptides will significantly advance research in the life sciences, providing new insights and strategies for cancer therapy in particular.