Fig. 7 | Nature Communications

Fig. 7

From: Combining discovery and targeted proteomics reveals a prognostic signature in oral cancer

Prognostic signature in saliva distinguishes OSCC patients. a Workflow for the machine-learning approach used to measure the predictive power of peptides and proteins. b, c The predictive relevance of individual proteins and peptides for distinguishing N0 from N+ patients is represented by a bar chart indicating their cross-validation ROC AUC (100 repetitions of stratified tenfold cross-validation). Ordered by AUC, the most relevant protein and peptide are LTA4H and Pep8_LTA4H, respectively. When only the AUCs of the individual signatures (size 1) are considered, the three highest areas at the protein level are LTA4H (73.9%), COL6A1 (62.1%), and ITGAV (60.5%), and at the peptide level are Pep12_CSTB (73.5%), Pep8_LTA4H (72.8%), and Pep9_COL6A1 (71.0%). d Cross-validation estimated ROC curves of the best protein and peptide signatures. e Box plots representing the AUC of all possible signatures for both imbalanced and balanced (SMOTE) cross-validation. At the peptide level, 1024 signatures were tested. At the protein level, 63 signatures were tested. Signatures formed by peptides from different proteins, S1 {Pep8, Pep12} and S2 {Pep8, Pep9, Pep12}, have approximately 10.5% higher AUC than the peptide signature formed by LTA4H (S4). The S2 peptide signature outperformed both the S1 and S4 signatures. The candidate signatures are indicated by labels: S1, S2, S3, and S4. Peptide sequences: Pep1_MB: HGATVLTALGGILK; Pep2_MB: YLEFISECIIQVLQSK; Pep3_PGK1: VLNNMEIGTSLFDEEGAK; Pep4_PGK1: VLPGVDALSNI; Pep5_ITGAV: LQEVGQVSVSLQR; Pep6_ITGAV: STGLNAVPSQILEGQWAAR; Pep7_LTA4H: LTYTAEVSVPK; Pep8_LTA4H: DLSSHQLNEFLAQTLQR; Pep9_COL6A1: GLEQLLVGGSHLK; Pep10_COL6A1: TAEYDVAYGESHLFR; Pep11_NDRG1: EMQDVDLAEVKPLVEK; Pep12_CSTB: HDELTYF; Pep13_CSTB: SQVVAGTNYFIK; and Pep14_CSTB: VHVGDEDFVHLR. Four peptides were not included in the training model because they did not pass the filtering step (step 2 from Part 2 of Fig. 7a; P value < 0.1, Mann-Whitney U test).
Box plots represent the median and interquartile range, whiskers represent the 1st and 99th percentiles, and outliers are represented by “+”.
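
As an illustration of two quantities that appear in panels b to e, the following minimal Python sketch (not the authors' code) computes a ROC AUC as the fraction of correctly ordered N+/N0 score pairs, which is the normalized Mann-Whitney U statistic, and enumerates every non-empty feature subset as a candidate signature. The function names, the toy scores, and the six placeholder feature names are assumptions made for illustration only; the sketch leaves out the classifier, the repeated stratified cross-validation, and SMOTE.

```python
from itertools import combinations


def roc_auc(pos_scores, neg_scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This equals the Mann-Whitney U
    statistic divided by len(pos_scores) * len(neg_scores).
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def enumerate_signatures(features):
    """Yield every non-empty subset of `features` as a tuple."""
    for size in range(1, len(features) + 1):
        yield from combinations(features, size)


# Toy classifier scores (hypothetical, not data from the study).
n_plus_scores = [0.9, 0.8, 0.4]  # scores for N+ samples
n_zero_scores = [0.7, 0.3, 0.2]  # scores for N0 samples
auc = roc_auc(n_plus_scores, n_zero_scores)
print(f"AUC = {auc:.3f}")  # 8 of 9 pairs are ordered correctly

# Six placeholder feature names (hypothetical).
features = ["F1", "F2", "F3", "F4", "F5", "F6"]
signatures = list(enumerate_signatures(features))
print(len(signatures))  # 2**6 - 1 = 63 non-empty subsets
```

Six features give 2^6 − 1 = 63 non-empty subsets, which matches the number of protein signatures quoted in the caption; that the signatures were enumerated in exactly this way is an assumption.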
