Fig. 2: Data flow chart of vPro-MS for virus identification by proteomics. | Nature Communications

From: vPro-MS enables identification of human-pathogenic viruses from patient samples by untargeted proteomics

Data processing in the vPro-MS workflow is split into two parts: peptide library construction and virus detection. First, a peptide library is generated from UniProt protein sequences. The UniProt database (release 2023_01) contains >1.4 million protein sequences from human-pathogenic viruses. Structural virus proteins are extracted from these sequences and used to predict a viral peptide spectral library. The predicted peptides are then filtered for detectability (mass-to-charge ratio, m/z; indexed retention time, iRT; ion mobility, IM) and taxonomic specificity. The remaining peptides form the vPro peptide library, to which human and contaminant peptide sequences are added. This library is used to identify peptides from DIA-MS data with DIA-NN. The identified peptide sequences are analyzed with the vPro-MS R script to identify human-pathogenic viruses. vPro-MS controls the reliability of virus detection by calculating a confidence score (vProID) and summarizes the results in a tabular report. (Created in BioRender. Doellinger, J. (2025) https://BioRender.com/j84lltq).
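
The library filtering step described in the caption can be illustrated with a minimal Python sketch. This is not the vPro-MS implementation, which is an R script. The `Peptide` record, the numeric windows for m/z, iRT and IM, the example sequences and taxa, and the rule that a peptide counts as taxon-specific when it maps to exactly one taxon are all assumptions made for illustration; the caption gives no thresholds and does not define specificity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peptide:
    """Hypothetical record for one predicted library peptide."""
    sequence: str
    mz: float        # precursor mass-to-charge ratio
    irt: float       # indexed retention time
    im: float        # ion mobility
    taxa: frozenset  # virus taxa whose proteins contain this sequence


# Assumed detectability windows, chosen only for this example.
MZ_RANGE = (350.0, 1250.0)
IRT_RANGE = (-25.0, 150.0)
IM_RANGE = (0.6, 1.6)


def in_range(value, bounds):
    """Return True if value lies within the inclusive (low, high) bounds."""
    low, high = bounds
    return low <= value <= high


def is_detectable(p):
    """Keep peptides whose m/z, iRT and IM all fall inside the assumed windows."""
    return (
        in_range(p.mz, MZ_RANGE)
        and in_range(p.irt, IRT_RANGE)
        and in_range(p.im, IM_RANGE)
    )


def is_taxon_specific(p):
    """Assumption: a peptide is specific if it occurs in exactly one taxon."""
    return len(p.taxa) == 1


def build_library(candidates):
    """Apply both filters and return the surviving peptides."""
    return [p for p in candidates if is_detectable(p) and is_taxon_specific(p)]


# Made-up candidates: one passes, one fails on m/z, one is shared between taxa.
candidates = [
    Peptide("PEPTIDEA", 600.3, 40.0, 0.9, frozenset({"Virus A"})),
    Peptide("PEPTIDEB", 1500.0, 40.0, 0.9, frozenset({"Virus A"})),
    Peptide("PEPTIDEC", 700.1, 55.0, 1.0, frozenset({"Virus A", "Virus B"})),
]

library = build_library(candidates)
print([p.sequence for p in library])  # prints ['PEPTIDEA']
```

In the workflow shown in the figure, human and contaminant peptide sequences would be added to the filtered set before the library is used for the DIA-NN search.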
