Extended Data Fig. 3: Microprotein expression and validation across brain development, Related to Fig. 3.
From: Developmental dynamics of RNA translation in the human brain

(a) Violin plot of average ribosome density (RD) by ORF type. Previously described ORFs are shown in red. Average ribosome density is shown in blue. (b) Venn diagram of sORFs detected in human brain (this study), human heart (van Heesch et al.4), or the sORFs.org database. In total, 6,071 translated sORFs identified in the human brain perfectly matched the amino acid sequence of an sORF previously reported in the sORFs.org database or identified in the human heart, a degree of overlap consistent with prior studies4. (c) Out of 8,590 lincRNA genes expressed across all brain samples, 415 lincRNA genes encode at least one translated ORF. We examined possible differences between ORF-encoding lincRNAs and non-translated lincRNAs. Box and whisker plots of annotated lincRNA features (expression, length, RPKM, and conservation) comparing lincRNAs that contain at least one ORF in the human brain to lincRNAs that do not contain any ORFs. Data are shown as median ± IQR (whiskers = 1.5*IQR), n = 8,175 (no ORF detected) and 415 (ORF detected) lincRNAs. ** p = 0.009079 by two-sided Welch two-sample t-test, n.s. = not significant. (d) Number and type of ORFs identified by size-selection proteomics in the adult and prenatal brain, or by Johnson et al.25. (e) Histograms of the number of proteins identified by size-selection proteomics in the adult and prenatal brain, or by Johnson et al.25, binned by protein length. (f) Box and whisker plots of Ribo-seq TPM for all ORFs detected by MS and ORFs not detected. Data are shown as median ± IQR (whiskers = 1.5*IQR), n = 352 (adult, Johnson et al.), 3,331 (adult, this study), 419 (prenatal, this study), 168,085 (not detected by MS). (g) Box and whisker plots of Ribo-seq TPM for sORFs detected by MS and ORFs not detected. (f-g) Data are shown as median ± IQR (whiskers = 1.5*IQR), n = 14 (adult, Johnson et al.), 17 (adult, this study), 31 (prenatal, this study), 16838,125085 (not detected by MS). 
*** p = 2.92 × 10⁻⁵, **** p < 2.2 × 10⁻¹⁶, by two-sided Kolmogorov–Smirnov test. While only a fraction of the sORFs identified by ribosome profiling were detected by our mass spectrometry analysis, this is not surprising given that such shotgun proteomic approaches have low sensitivity for the detection of individual proteins, particularly if the proteins are transient or low in abundance. Consistent with this finding, the sORF-encoded proteins that we were able to detect by proteomics exhibited a higher average ribosome density compared to all sORFs detected by ribosome profiling.
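
For readers who want to reproduce the box-and-whisker summaries described in panels (c), (f), and (g), the following is a minimal sketch of how a median, IQR, and 1.5*IQR whiskers can be computed. It assumes the common Tukey convention, in which each whisker ends at the most extreme observation within 1.5*IQR of the nearest quartile; the legend does not state which quartile method or plotting software was used. The function name `box_stats` and the example values are hypothetical and are not data from this study.

```python
from statistics import quantiles


def box_stats(values):
    """Summarise values as median, quartiles, and Tukey-style whiskers.

    Assumption: whiskers extend to the most extreme data points that lie
    within 1.5 * IQR of the quartiles; anything beyond is an outlier.
    """
    data = sorted(values)
    # "inclusive" treats the data as the full population of interest;
    # the quartile method used for the figure is not stated in the legend.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(v for v in data if v >= low_fence),
        "whisker_high": max(v for v in data if v <= high_fence),
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Made-up TPM-like values for illustration only.
example_tpm = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 50.0]
stats = box_stats(example_tpm)
print(stats)
```

In this toy example the value 50.0 falls beyond the upper fence, so the upper whisker stops at 8.0 and 50.0 is reported as an outlier.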