Extended Data Fig. 3: EndoMAP.v1 network characterization and application of AlphaFold-M across DSSO cross-linked protein pairs.
From: EndoMAP.v1 charts the structural landscape of human early endosome complexes

a, Degree distribution (number of edges per node) of the complete network. b, Log-log power-law plot of the complete network showing the degree of a node (number of edges) against its probability. c, Distribution of the shortest path distances between all proteins in the complete interaction network. d, Distribution and number of PPIs within and between selected organelles (Supplementary Table 2). e, Criteria for network filtering to create an integrated endosomal network (EndoMAP.v1, see METHODS). f, Mapping of known protein complexes from CORUM (ref. 126) onto the core components of the EndoMAP.v1 network (Supplementary Table 2). g,h, DisGeNET enrichment analysis of endosomal proteins as defined by our scoring method (panel g) and Gene Ontology (GO:0005768, panel h). The top 15 categories by gene ratio are depicted. Disorders related to the nervous system are indicated in bold. p-values from the hypergeometric test were adjusted with the Benjamini-Hochberg correction. i, Enrichment analysis of the endosomal proteome within several neurodegenerative diseases (LSD, Lysosomal Storage Disorders; ALS, Amyotrophic Lateral Sclerosis; PD, Parkinson’s disease; ASD, Autism Spectrum Disorders; DD/ID, epilepsy and severe neurodevelopmental disorder). j, Mapping of neurodegenerative disease-related proteins onto the core components of the EndoMAP.v1 network (see METHODS, Supplementary Table 2). k, Distribution of shortest path distances within various classes of neurodegenerative disease-related proteins. Three different sources of disease genes were used to retrieve proteins related to PD (see METHODS). l, Distances between DSSO cross-linked lysines for AF-M predictions compared to structures in the PDB. Green and orange dots represent interprotein and intraprotein cross-links, respectively. Filled and empty dots represent predictions with SPOC > 0.33 or SPOC < 0.33, respectively.
m, Distribution of Cα-Cα distances (Å) for intraprotein DSSO cross-linked lysines in all AF-M predictions compared to all lysines. n, Distribution of Cα-Cα distances (Å) for interprotein DSSO cross-linked lysines in all AF-M predictions compared to all lysines. o, Distribution of SPOC scores and average pLDDT for predictions with SPOC > 0. The number of interprotein DSSO cross-links evaluated and the number exceeding the cross-linker distance restraint are indicated by point size and color, respectively. p, Box plot showing the distribution of SPOC scores relative to the number of DSSO cross-links identified for each interaction (n, the number of interactions in each category, is indicated on top). The middle line corresponds to the median, the lower and upper ends of the box correspond to the first and third quartiles, respectively, and the whiskers extend from the box to 1.5 times the inter-quartile range. q,r, Distribution of Cα-Cα distances (Å) for intraprotein (q) and interprotein (r) DSSO cross-linked lysines in AF-M predictions involving endosomal proteins compared to all lysines. s, Distribution of Cα-Cα distances (Å) for interprotein DSSO cross-links reflecting predictions involving endosomal proteins with SPOC > 0.33 (orange) and SPOC < 0.33 (red).
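
The statistics named for panels g,h (a hypergeometric enrichment test followed by Benjamini-Hochberg adjustment) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `hypergeom_sf` and `benjamini_hochberg` and all counts in the example are hypothetical, chosen only to show how an over-representation p-value and its adjusted value are typically computed.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """Upper-tail probability P(X >= k) of a hypergeometric distribution.

    M: background size, n: genes annotated to the category,
    N: size of the query set, k: observed overlap.
    """
    upper = min(n, N)
    total = comb(M, N)
    # Sum the probability mass from the observed overlap up to the maximum possible.
    tail = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts, for illustration only (not taken from the study).
background = 20000   # assumed number of background genes
query_size = 500     # assumed size of the endosomal protein set
categories = {       # category -> (genes in category, overlap with query)
    "disorder_A": (100, 10),
    "disorder_B": (250, 8),
    "disorder_C": (40, 1),
}

names = list(categories)
raw = [hypergeom_sf(categories[c][1], background, categories[c][0], query_size)
       for c in names]
adj = benjamini_hochberg(raw)
for name, p, q in zip(names, raw, adj):
    gene_ratio = categories[name][1] / query_size
    print(f"{name}: gene ratio={gene_ratio:.3f}, p={p:.3g}, adjusted p={q:.3g}")
```

Here the gene ratio is taken as the overlap divided by the query set size, which is one common convention; the legend does not specify the definition used.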