Extended Data Fig. 2: Factors affecting prey labelling and rationale for prey-wise analysis.
From: A proximity-dependent biotinylation map of a human cell

a, After sorting preys by proximity order and grouping by order across baits, the proportion of previously reported preys was calculated for the nth proximity order for n = 1, 2, …, 200. b–f, For each bait, the relative proximity of every prey (proximity order) was calculated from the control-subtracted length-adjusted spectral counts (CLSC) (Methods), such that the prey with the highest CLSC value was considered to be the ‘interactor’ most proximal to the bait and the prey with the lowest CLSC value the most distal. b, Number of baits with a minimum of n preys at a 1% FDR, for n = 1, 2, …, 200. c, Proximity order versus protein turnover rate (hours) in HeLa cells (turnover data are from ref. 55). d, Proximity order versus protein expression as represented by the log₁₀-normalized MS1 iBAQ intensity from ProteomicsDB (ref. 53). e, Proximity order versus the number of lysine residues per protein. f, The log₁₀-normalized MS1 iBAQ intensity of proteins expressed in HEK293 versus HeLa cells from ProteomicsDB (ref. 53). The similarity between the two proteomes supports the use of HeLa data in c, because suitable HEK293 data were not available. Values along the x axis could reflect zero expression or missing data in HeLa cells; these were excluded when calculating the R² value. g, Bait comparisons for a pair of mitochondrial matrix proteins. Control-subtracted spectral counts are plotted for all high-confidence preys (1% FDR) detected with either bait of the pair under comparison. AARS2 preferentially enriches components of the mitochondrial ribosome and proteins involved in translation, such as GFM1, MRPS9 and TRMT10C, whereas PDHA1 preferentially interacts with the pyruvate dehydrogenase complex component DLAT and the mitochondrial membrane ATP synthase ATP5F1B. h, Pipelines for localizing prey proteins using SAFE (ref. 40) and NMF (ref. 42). In our SAFE pipeline, preys with a correlation across baits ≥ 0.65 are considered interactors and these pairs are used to generate a network that is annotated for GO:CC terms (Methods).
In NMF, the bait–prey spectral counts matrix is reduced to a compartment–prey matrix and compartments are then defined using GO:CC for the compartment’s most abundant preys. A 2D network is generated in parallel from the compartment–prey matrix using t-SNE (ref. 44).
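As an illustration only, two steps described in this legend can be sketched in a few lines of Python: ranking preys by CLSC to obtain the proximity order for one bait (b–f), and selecting prey pairs whose profiles across baits correlate at ≥ 0.65 (the SAFE pipeline in h). This is a minimal sketch, not the authors' implementation. The function names, the alphabetical tie-breaking rule, the choice of Pearson correlation and the toy numbers are all assumptions made for the example; the CLSC calculation itself is defined in the Methods and is taken here as a given input.

```python
from math import sqrt


def proximity_order(clsc_by_prey):
    """Rank the preys of one bait: order 1 = highest CLSC (most proximal)."""
    # Assumption: ties are broken alphabetically by prey name.
    ranked = sorted(clsc_by_prey.items(), key=lambda kv: (-kv[1], kv[0]))
    return {prey: rank for rank, (prey, _) in enumerate(ranked, start=1)}


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlated_prey_pairs(profiles, threshold=0.65):
    """Return prey pairs whose profiles across baits correlate at >= threshold."""
    # Assumption: Pearson correlation; the legend only says "correlation".
    preys = sorted(profiles)
    pairs = []
    for i, first in enumerate(preys):
        for second in preys[i + 1:]:
            if pearson(profiles[first], profiles[second]) >= threshold:
                pairs.append((first, second))
    return pairs


if __name__ == "__main__":
    # Hypothetical CLSC values for a single bait (invented toy numbers).
    clsc = {"preyA": 12.0, "preyB": 30.5, "preyC": 4.2}
    print(proximity_order(clsc))

    # Hypothetical spectral-count profiles across four baits.
    profiles = {
        "preyA": [10, 0, 5, 1],
        "preyB": [20, 1, 9, 2],
        "preyC": [0, 8, 1, 7],
    }
    print(correlated_prey_pairs(profiles))
```

In this toy example, preyB receives proximity order 1 because it has the highest CLSC value, and only preyA and preyB pass the 0.65 threshold, so they would form the single edge of the resulting network.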