Fig. 2: Accurate fragment ion intensity prediction of XL-peptides by Prosit-XL.

a Schematic illustration of the general architecture of Prosit-XL-CMS2 and Prosit-XL-NMS2 for fragment ion intensity prediction of XL-peptides. The input data (XL-peptide precursor charge state, normalized collision energy (NCE), peptide sequence A, and peptide sequence B) are encoded into latent representations (latent space). These representations are then multiplied element-wise and subsequently decoded to fragment ion intensities. Compared to Prosit-XL-NMS2, Prosit-XL-CMS2 contains one extra decoder covering y-long and b-long fragments. Prosit-XL-CMS3 has the same architecture as HCD Prosit 2020, i.e., it lacks Encoder 2 and Decoder 2. b Violin plots comparing the prediction accuracy of the Prosit-XL models (dark blue) for CMS3, CMS2, and NMS2 with that of the previously published HCD Prosit 2020 and CID Prosit 2020 models (light blue) on the holdout set across five cross-linker types: CMS3-Alkene, CMS3-Thiol, CMS2-DSSO, CMS2-DSBU, and NMS2-DSS/BS3. The number of underlying spectra (n) is indicated at the bottom. The black solid line and corresponding numbers indicate the median spectral angle (SA) and Pearson correlation coefficient (PCC). The prediction performance was assessed separately for peptides A and B (PSM level). c Violin plots showing the prediction accuracy of Prosit-XL-CMS2 and Prosit-XL-NMS2 on external, unseen datasets using DSSO and DSS/BS3 as cross-linkers. The number of underlying spectra (n) is indicated at the bottom. The black solid line and corresponding numbers indicate the median SA and PCC. Data are presented as mean ± SEM: DSSO (mean = 0.776, SEM = 0.002), DSS/BS3 (mean = 0.726, SEM = 0.008).
d, e Mirror spectra of two XL-peptides, each comparing the experimentally acquired spectrum (top spectrum) with its respective prediction by Prosit-XL: the peptide DAIATVNKQEDANFSNNAMAEAFK (peptide A) cross-linked by DSSO with VTAVDAKGATVELADGVEGYLR (peptide B), predicted by Prosit-XL-CMS2 (d), and the peptide NGLTPITSLPNYNEDYKLR (peptide A) cross-linked by DSS with EKSIPSTITVGK (peptide B), predicted by Prosit-XL-NMS2 (e). Matching peaks are shown in dark red (b), red (b-s), and light red (b-l and b-xl) for b-type ions, and in dark blue (y), blue (y-s), and light blue (y-l and y-xl) for y-type ions.
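
The legend reports accuracy as spectral angle (SA) and Pearson correlation coefficient (PCC) but does not define either metric. The following is a minimal sketch, assuming the normalized spectral contrast angle commonly used to evaluate intensity predictors, SA = 1 − 2·arccos(cosine similarity)/π, computed on aligned fragment intensity vectors. The function names and the example intensities are hypothetical and are not taken from the Prosit-XL code or data.

```python
import numpy as np


def spectral_angle(observed, predicted):
    """Normalized spectral angle in [0, 1] for non-negative intensities.

    Assumed definition: 1 - 2 * arccos(cosine similarity) / pi.
    Both vectors must list the same fragment ions in the same order.
    """
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    norm = np.linalg.norm(obs) * np.linalg.norm(pred)
    if norm == 0.0:
        # No signal in one of the spectra: treat as no agreement.
        return 0.0
    # Clip to guard against rounding slightly outside [-1, 1].
    cosine = np.clip(np.dot(obs, pred) / norm, -1.0, 1.0)
    return float(1.0 - 2.0 * np.arccos(cosine) / np.pi)


def pearson(observed, predicted):
    """Pearson correlation coefficient between two intensity vectors."""
    return float(np.corrcoef(observed, predicted)[0, 1])


# Hypothetical aligned intensities for a handful of b/y fragments.
experimental = [1.00, 0.40, 0.00, 0.25, 0.10]
prediction = [0.90, 0.50, 0.05, 0.20, 0.10]

sa = spectral_angle(experimental, prediction)
pcc = pearson(experimental, prediction)
print(f"SA = {sa:.3f}, PCC = {pcc:.3f}")
```

Under this definition, identical spectra give SA = 1 and spectra with no shared fragment intensity give SA = 0. Because the legend states that performance was assessed separately for peptides A and B, such a function would be applied once per peptide rather than once per cross-linked pair.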