Fig. 1: Broad variation among studies in reports of unannotated microprotein detection. | Nature Communications

From: Community benchmarking and evaluation of human unannotated microprotein detection by mass spectrometry based proteomics

A Relationship between the number of sORFs used to construct each study's protein database and the number of sORF-encoded proteins reported as detected by MS (Spearman correlation = 0.43, p = 0.2). The source of each sORF database is indicated: a curated list of known sORFs, all possible sORFs from three-frame translation of a transcriptome, or a list of ORFs found to be translated using Ribo-Seq or RNC-seq data.

B For each study, the proportion of reported peptides supporting an unannotated protein that are also found by another study in our analysis. The number of peptides found in other studies, out of the total reported in the study, is indicated above each bar.

C Proportion of peptides mapping to annotated proteins using the ProteoMapper tool, divided into categories by the number of common single nucleotide polymorphism (SNP) differences separating the peptide from the peptide present in the reference protein, and by whether the annotated peptide is tryptic, i.e., could be generated by cleavage after lysine or arginine. Semi-tryptic peptides (where only one peptide end is tryptic) are grouped with non-tryptic peptides. Peptides from immunopeptidomics experiments were not generated by trypsin digestion and are therefore not classified as tryptic or non-tryptic. Peptides matching currently annotated proteins that were not annotated in UniProtKB/Swiss-Prot in 2016 (i.e., recently annotated proteins) are excluded.

D For each study, the proportion of reported peptides supporting an unannotated protein that are also found by another study in our analysis, excluding peptides that match annotated proteins according to the ProteoMapper tool. Note that most studies have focused on different biological systems, which can limit the overlap.

Source data are provided as a Source Data file.
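The tryptic versus non-tryptic grouping described for panel C can be illustrated with a minimal Python sketch. This is not the ProteoMapper implementation or the authors' code. The function name `classify_tryptic` is hypothetical, and the sketch assumes an exact match of the peptide within a single protein sequence, treats the protein's own N- and C-termini as valid tryptic ends, and ignores SNP differences and any proline-related cleavage rule.

```python
def classify_tryptic(peptide: str, protein: str) -> str:
    """Classify a peptide as 'tryptic' or 'non-tryptic' within a protein.

    Hypothetical helper for illustration only. A peptide end counts as
    tryptic when it could arise from cleavage after lysine (K) or
    arginine (R), or when it coincides with a protein terminus
    (an assumption of this sketch). Semi-tryptic peptides, with only
    one tryptic end, are grouped with non-tryptic, as in the caption.
    """
    if not peptide:
        raise ValueError("peptide must not be empty")

    # Assumption: use the first exact occurrence of the peptide.
    start = protein.find(peptide)
    if start == -1:
        raise ValueError("peptide not found in protein")
    end = start + len(peptide)

    # N-terminal end: the residue just before the peptide must be K or R,
    # unless the peptide begins at the protein N-terminus.
    n_term_ok = start == 0 or protein[start - 1] in "KR"

    # C-terminal end: the peptide's last residue must be K or R,
    # unless the peptide ends at the protein C-terminus.
    c_term_ok = end == len(protein) or peptide[-1] in "KR"

    return "tryptic" if (n_term_ok and c_term_ok) else "non-tryptic"


if __name__ == "__main__":
    protein = "MAKTLLSRGDEFK"  # made-up sequence, not real data
    for pep in ["TLLSR", "LLSR", "GDEFK", "GDE"]:
        print(pep, classify_tryptic(pep, protein))
```

In this made-up example, `LLSR` ends after an arginine but is preceded by a threonine, so it is semi-tryptic and falls into the non-tryptic group.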
