Figure 1

The master AMT DB. (A) Overall scheme of AMT DB construction. A total of 42 datasets (24 and 18 from the left and right branches, respectively) were generated by LC-MS/MS analysis. For each dataset, UMCs assigned protein IDs (identified UMCs) were obtained using iPE-MMR analysis and a target-decoy MS-GF+ search and were then used to construct the AMT DB. (B,C) Use of the AMT DB to assign protein IDs to unidentified UMCs. The 188,345 AMTs (magenta dots) in the AMT DB are visualized in a 2D scatter plot of NET and molecular weight (B). For an LC-MS/MS dataset (dataset k), the identified UMCs (blue dots) are shown in the upper scatter plot. By matching the unidentified UMCs in this dataset to AMTs within the indicated mass and NET tolerances, a subset of the unidentified UMCs (magenta dots) was assigned protein IDs. These matched UMCs are shown in the bottom scatter plot (C). (D) Relationship between expressed genes identified from mRNA-sequencing data and proteins detected in the LC-MS/MS datasets. Numbers in parentheses denote the numbers of expressed genes and detected proteins.
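The matching step in panels B and C can be illustrated with a minimal sketch. It is not the authors' implementation: the names `AMT`, `UMC`, and `match_umcs` are hypothetical, the mass tolerance is assumed to be expressed in ppm, the tie-breaking rule (closest mass, then closest NET) is an assumption, and the tolerance values used in the usage example are placeholders rather than the values indicated in the figure.

```python
from bisect import bisect_left, bisect_right
from typing import List, NamedTuple, Optional


class AMT(NamedTuple):
    """Hypothetical AMT DB entry: mass, NET, and the protein ID it carries."""
    mass: float        # molecular weight (Da)
    net: float         # normalized elution time
    protein_id: str


class UMC(NamedTuple):
    """Hypothetical unidentified UMC: only mass and NET are known."""
    mass: float
    net: float


def match_umcs(
    umcs: List[UMC],
    amts: List[AMT],
    mass_tol_ppm: float,   # assumption: mass tolerance given in ppm
    net_tol: float,        # assumption: absolute NET tolerance
) -> List[Optional[str]]:
    """Return one protein ID (or None) per UMC, in input order."""
    # Sort AMTs by mass once so each UMC needs only a binary search.
    amts_sorted = sorted(amts, key=lambda a: a.mass)
    masses = [a.mass for a in amts_sorted]

    assigned: List[Optional[str]] = []
    for umc in umcs:
        delta = umc.mass * mass_tol_ppm * 1e-6
        lo = bisect_left(masses, umc.mass - delta)
        hi = bisect_right(masses, umc.mass + delta)
        # Keep mass-window candidates that also fall inside the NET window.
        candidates = [
            a for a in amts_sorted[lo:hi] if abs(a.net - umc.net) <= net_tol
        ]
        if not candidates:
            assigned.append(None)
            continue
        # Assumed tie-break: closest in mass, then closest in NET.
        best = min(
            candidates,
            key=lambda a: (abs(a.mass - umc.mass), abs(a.net - umc.net)),
        )
        assigned.append(best.protein_id)
    return assigned


# Usage with made-up values; tolerances here are placeholders.
example_amts = [AMT(1000.0, 0.30, "P1"), AMT(1500.0, 0.50, "P2")]
example_umcs = [UMC(1000.005, 0.31), UMC(1500.0, 0.80), UMC(2000.0, 0.50)]
example_ids = match_umcs(example_umcs, example_amts, mass_tol_ppm=10.0, net_tol=0.02)
```

In this toy example only the first UMC falls inside both windows of an AMT; the second matches in mass but not in NET, and the third has no AMT near its mass, so both remain unidentified.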