Fig. 1: NeoDisc pipeline overview and benchmarking.

a, Schematic overview of the NeoDisc pipeline. Input data are shown in the white boxes at the top; the NeoDisc modules are shown as gray boxes and their outputs as white boxes. Background colors indicate the data types used by the modules. Arrows show the flow of data between modules. Dark blue squares below the module output boxes indicate which data are combined for multiple-sample analysis. b, Number of immunogenic peptides from the NCI-test dataset (15 samples and 24 immunogenic peptides) ranked by the NeoDisc rule-based algorithm, the ML algorithm, pTuneos and pVACseq, and as reported by Gartner et al. The color of the bars indicates the number of immunogenic peptides found when only the top n ranked peptides in each person are considered. Horizontal dashed bars indicate the highest number of immunogenic peptides ranked in the top n across all algorithms. The red horizontal line shows the total number of immunogenic peptides.
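
The top-n count plotted in panel b can be illustrated with a minimal sketch. It assumes each tool produces one ranked peptide list per person and that immunogenic peptides are known per person. The function name `count_top_n_hits`, the data layout and the toy values are hypothetical and are not taken from NeoDisc or the NCI-test dataset.

```python
def count_top_n_hits(ranked, immunogenic, n):
    """Count immunogenic peptides found in the top n of each person's ranking.

    ranked:      dict mapping a sample ID to its peptides, best-ranked first
    immunogenic: set of (sample ID, peptide) pairs known to be immunogenic
    n:           rank cutoff applied separately to every sample
    """
    total = 0
    for sample_id, peptides in ranked.items():
        top_n = peptides[:n]  # keep only the n best-ranked peptides
        total += sum((sample_id, p) in immunogenic for p in top_n)
    return total


# Toy data (invented for illustration only).
ranked = {
    "patient_A": ["PEP1", "PEP2", "PEP3", "PEP4"],
    "patient_B": ["PEP5", "PEP6", "PEP7"],
}
immunogenic = {
    ("patient_A", "PEP2"),
    ("patient_A", "PEP4"),
    ("patient_B", "PEP5"),
}

# One count per cutoff, analogous to one colored bar segment per n.
hits_by_cutoff = {n: count_top_n_hits(ranked, immunogenic, n) for n in (1, 2, 4)}
```

Applying the cutoff within each person, rather than to a pooled ranking, matches the caption's wording ("top n ranked peptides in each person"); the total number of immunogenic pairs is the upper bound that the red line represents.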