Extended Data Fig. 8: Comparison of different approaches for accurate mass calculation.
From: A modification-centric assessment tool for the performance of chemoproteomic probes

a-c, Histograms showing the distribution of mass shifts determined at the PSM level using three different methods. a, Mass shift average: all unknown mass shifts are assigned to mass labels fixed to two decimal places, and those with the same mass label are unified by averaging the mass shifts from multiple PSMs as \(\bar{M}_t = \frac{1}{K}\sum_{i=1}^{K} M_i\), where \(K\) denotes the number of spectra that carry the target mass label. Note that the modification mass \(M_i\) used here has been corrected for system error. b, Mass range average: unknown mass shifts within a fixed-size window (\(M_w\) = 0.01 Da by default) are considered the same unknown modification and are averaged to give the accurate mass of the candidate modification as \(\bar{M}_t = \frac{1}{Q}\sum_{i=1}^{Q} M_i,\ \forall\, M_t - M_w < M_i < M_t + M_w\), where \(M_w\) is the window radius, \(M_t\) is the target mass shift and \(Q\) is the number of spectra that fall within the mass range. c, Window-based iterative refinement, which builds upon the mass range average method: the mass shifts are initially unified within a predefined window (0.01 Da by default) and gradually converge to stable and accurate values through multiple iterations (see Methods for details). For a-c, the orange vertical line represents the accurate mass of the ground-truth unknown modification, while the red dashed line denotes the mass estimated by each approach. d, Violin plots showing the mass accuracy achieved by the above methods. Middle lines denote median values, while left- and right-end lines denote the 25th and 75th percentiles, respectively. Note that only data sets in which more than 500 spectra were used for mass calculation were included in this comparative analysis.
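
The three averaging schemes in a-c can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of rounding to form two-decimal mass labels, the example values, and the re-centering rule and stopping tolerance in the iterative variant are all assumptions made for illustration. The actual refinement procedure is the one defined in the paper's Methods.

```python
from collections import defaultdict
from statistics import fmean


def mass_shift_average(shifts, decimals=2):
    """Method a: average all mass shifts that share a two-decimal label.

    Assumption: the label is formed by rounding (not truncating).
    """
    groups = defaultdict(list)
    for m in shifts:
        groups[round(m, decimals)].append(m)
    return {label: fmean(members) for label, members in groups.items()}


def mass_range_average(shifts, target, radius=0.01):
    """Method b: average the shifts with target - radius < M_i < target + radius."""
    members = [m for m in shifts if target - radius < m < target + radius]
    return fmean(members) if members else None


def iterative_refinement(shifts, start, radius=0.01, tol=1e-6, max_iter=100):
    """Method c (assumed form): re-center the window on the current mean
    and repeat until the estimate changes by less than tol."""
    estimate = start
    for _ in range(max_iter):
        updated = mass_range_average(shifts, estimate, radius)
        if updated is None or abs(updated - estimate) < tol:
            break
        estimate = updated
    return estimate


# Hypothetical PSM-level mass shifts (Da), assumed already corrected for system error.
example_shifts = [15.991, 15.993, 15.994, 15.996, 15.998, 16.012]

per_label = mass_shift_average(example_shifts)
windowed = mass_range_average(example_shifts, target=15.995)
refined = iterative_refinement(example_shifts, start=16.0)
```

In this toy input, method a splits the cluster near 15.995 Da across the labels 15.99 and 16.0, whereas the window-based variants pool the five nearby values into a single estimate. That is one way a label-based average can lose accuracy relative to a window-based one.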