Fig. 1: Organism-wide scRNA-seq uncovers new genes, splice forms and orthologues. | Nature

From: Mouse lemur cell atlas informs primate genes, physiology and disease

a–d, Discovery of new genes (transcriptionally active regions, TARs). f–j, Discovery of new splice forms. k–m, Enhancement of gene annotation. a, Scheme for finding uTARs in the genome. b, Fraction of the genome (base pairs) composed of uTARs and aTARs. c, Stacked bar plot showing the median percentage (transcript reads) of differentially expressed uTARs (DE uTARs), non-DE uTARs and aTARs for each atlas cell type. Example cell types enriched for DE uTARs are indicated by their designation number. 13, sweat gland; 35, enterocyte; H2, enterocyte/goblet; 130, pericyte; 179, basophil; 233, corticotroph; 235, lactotroph; 244, ependymal; 248, myelinating Schwann. d, Dot plot of mean expression (based on unique molecular identifier (UMI) counts: ln[UMI_gene/UMI_total × 10^4 + 1], abbreviated as ln[UP10K + 1] in dot heatmaps) and the percentage of cells (dot size) expressing the indicated DE uTARs during spermatogenesis. Gene names were assigned using a BLAST sequence homology search. e, Current (Mmur 3.0, top) and revised (using the scRNA-seq cell atlas, bottom) annotation of lemur immunoglobulin (Ig) loci. Numbers above gene clusters indicate the estimated number of functional genes; numbers in parentheses indicate pseudogenes, which lack transcripts. f, Scheme for characterizing lemur splice junctions. Bars, exons; lines, introns. g, Splice junction categories. A, previously annotated; B–E, not annotated, including novel junctions between two annotated exon boundaries (for example, novel exon skipping, B), between an annotated exon boundary and an unannotated location in the gene (C), between two unannotated locations in the gene (D), and outside annotated genes (E). h, Percentage of total splice junction counts and reads and mean reads per junction for each category. i, Percentage of lemur splice junctions in each category that are conserved in both human and mouse genomes (All), only in human (H&L), only in mouse (L&M) or neither (L).
j, Examples of genes (MYL6, CAST and FAM92A) with cell-specific and tissue-selective alternative splicing. Plots show the percentage of each isoform (coloured as in the diagram above) expressed in indicated cell types or compartments. k, Stacked bar plot showing the percentage of named (white), unnamed (grey) and uncharacterized (black) genes in lemur, human and mouse genomes, separated by protein-coding genes (PCGs), non-protein-coding (nPCGs) and all genes (All). l, Top, three types of human–lemur–mouse expression homologue triads. Left and middle, triads of sequence homologues with similar expression profiles that are assigned (NCBI and Ensembl) as orthologues (solid line) in all three species, and the lemur orthologue is named accordingly (left) or unnamed (middle). Right, triads of sequence homologues with similar expression profiles but not currently assigned as orthologues (dashed line) for at least one species. Bottom, number of each type when comparing lung or skeletal muscle cell-expression profiles. m, Dot plot comparison of the mean expression of selected expression homologue triads of each type across human, lemur and mouse lung and skeletal muscle cell types. Two lemur unnamed loci (LOC105862649 and LOC105862489) are assigned (NCBI) as orthologues of mouse and human CD14, but only LOC105862649 (arrowhead) is an expression homologue, which suggests that it is the true orthologue. LOC105874770 is assigned as an orthologue of human ALDH1A1 but not of mouse Aldh1a1 (missed). For the three RAMP genes in each species, note that lemur RAMP1 and human RAMP3 are evolutionary outliers (asterisks), with both resembling the conserved RAMP2 expression pattern. See also Extended Data Figs. 13 and Supplementary Fig. 2. Adv, adventitial; Alv, alveolar; AT2, alveolar type 2 cell; cDC, conventional dendritic cell; FAP, fibroadipogenic progenitor; pDC, plasmacytoid dendritic cell; PF, proliferating; SPC, spermatocyte; SPG, spermatogonium; SPT, spermatid.
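
The expression measure used in the dot plots, ln[UP10K + 1], can be illustrated with a minimal Python sketch. The function names `ln_up10k` and `dot_plot_stats` are hypothetical and chosen for illustration only. The sketch assumes that the mean is taken over per-cell normalized values and that a cell counts as expressing a gene when it has at least one UMI; the caption does not specify either detail.

```python
import math
from typing import Sequence, Tuple


def ln_up10k(umi_gene: int, umi_total: int) -> float:
    """Return ln[UMI_gene / UMI_total * 10^4 + 1] for one gene in one cell."""
    if umi_total <= 0:
        raise ValueError("umi_total must be positive")
    return math.log(umi_gene / umi_total * 1e4 + 1)


def dot_plot_stats(
    gene_counts: Sequence[int], total_counts: Sequence[int]
) -> Tuple[float, float]:
    """Summarize one gene across the cells of one cell type.

    Returns (mean ln[UP10K + 1], percentage of cells with >= 1 UMI).
    Assumption: the mean is computed over per-cell normalized values.
    """
    if not gene_counts or len(gene_counts) != len(total_counts):
        raise ValueError("inputs must be non-empty and of equal length")
    values = [ln_up10k(g, t) for g, t in zip(gene_counts, total_counts)]
    mean_expression = sum(values) / len(values)
    percent_expressing = 100 * sum(1 for g in gene_counts if g > 0) / len(gene_counts)
    return mean_expression, percent_expressing
```

In this sketch the first returned value would map to dot colour and the second to dot size.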