Fig. 3: Performance of InstaNovo and InstaNovo+ on the labelled application-focused datasets. | Nature Machine Intelligence

From: InstaNovo enables diffusion-powered de novo peptide sequencing in large-scale proteomics experiments

a, Peptide-level accuracy of InstaNovo (IN) and InstaNovo+ (IN+) on each application-focused dataset. b, Total number of peptide-spectrum matches (PSMs) for the IN and IN+ models at 5% false discovery rate (FDR). Overlap with database search PSMs is shown in grey. c, Novel PSMs at 5% FDR for IN and IN+, expressed as a percentage of the total database search PSMs. d, Peptide-level precision–recall curves for the proteomes explored in this study: the HeLa cell lysate proteome, the ‘Candidatus Scalindua brodae’ proteome from a co-enrichment culture, snake venom proteomes and the proteome of human patient wound exudates extracted from dressings. e, Comparison of peptide-level precision–recall curves for both models on the datasets involving novel sequences: HLA peptide-enriched samples, nanobodies and the antibody herceptin, as well as a HeLa proteome dataset including semi-tryptic and open search peptides. f, Kernel-smoothed precision of model confidence distributions across multiple datasets for IN.
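For readers unfamiliar with the quantities plotted in panels c to e, the following is a minimal sketch of how a peptide-level precision–recall curve and a novel-PSM percentage could be computed. It is not the authors' code. The function names are hypothetical, and the definitions used here (precision as correct predictions over predictions kept, recall as correct predictions kept over all labelled spectra) are assumptions that may differ from those in the article.

```python
from typing import List, Tuple


def precision_recall_curve(
    confidences: List[float], correct: List[bool]
) -> List[Tuple[float, float, float]]:
    """Return (threshold, precision, recall) points, sweeping the cutoff downwards.

    Assumed definitions (not taken from the article):
      precision = correct predictions kept / predictions kept
      recall    = correct predictions kept / all labelled spectra
    Ties in confidence are not merged; each prediction yields one point.
    """
    total = len(confidences)
    # Rank predictions from most to least confident.
    ranked = sorted(zip(confidences, correct), key=lambda p: p[0], reverse=True)
    curve = []
    kept_correct = 0
    for kept, (conf, ok) in enumerate(ranked, start=1):
        kept_correct += int(ok)
        curve.append((conf, kept_correct / kept, kept_correct / total))
    return curve


def novel_psm_percentage(n_novel: int, n_database: int) -> float:
    """Novel PSMs expressed as a percentage of the database search PSM total."""
    return 100.0 * n_novel / n_database


# Toy data for illustration only; these are not results from the study.
toy_curve = precision_recall_curve(
    [0.9, 0.8, 0.6, 0.3], [True, True, False, True]
)
```

Each point in `toy_curve` corresponds to keeping every prediction at or above that confidence, so lowering the threshold raises recall and typically lowers precision.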
