Figure 7: Statistical analysis of SUMO sites and proteins.
From: System-wide identification of wild-type SUMO-2 conjugation sites

(a) IceLogo of all PRISM-identified SUMO-2 sites and their surrounding amino acids, ranging from −15 to +15 relative to the modified lysine. The amino acids shown are contextually enriched or depleted compared with random expectation, and the height of each amino acid represents its fold-change. All changes are significant with P<0.05 by two-tailed Student’s t-test. (b) Fill Logo of all PRISM-identified SUMO-2 sites and their surrounding amino acids, ranging from −7 to +7, with the height of each amino acid directly proportional to its percentage representation. (c) Heatmap representation of the data in a, giving an overview of enriched (blue) and depleted (red) amino acids surrounding SUMOylated lysines. Lysines and glutamic acids are enriched across the entire range surrounding the SUMOylated lysine. (d) Comparison of sequence windows of inverted SUMO sites (E or D at −2) with sequence windows of non-inverted SUMO sites. Amino acid height corresponds to the percentage enrichment or depletion between the data sets representing inverted and non-inverted sites. Displayed amino acids differ significantly between the two data sets, with P<0.05 by two-tailed Student’s t-test. (e) Term enrichment analysis, comparing all SUMO target proteins identified by PRISM with the human proteome. Gene Ontology Molecular Function terms were used to find statistical enrichments within the SUMO target protein data set. The term enrichment score is a composite score based on enrichment over random expectation and the negative logarithm of the false discovery rate. All listed terms are significant with P<0.02 by Fisher's exact test. (f) As in e, using Gene Ontology Cellular Component terms. (g) As in e, using Gene Ontology Biological Process terms. (h) As in e, using keywords.
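
The term enrichment described for panels e to h can be illustrated with a minimal Python sketch. It rests on several assumptions that are not stated in the legend: the Fisher's exact test is taken as one-sided (over-representation only), the false discovery rate is computed with the Benjamini-Hochberg procedure, and the composite score is taken as log2(enrichment) multiplied by −log10(FDR). The legend does not give the actual formula, so `composite_score` is a hypothetical stand-in, as are all function names and the example counts.

```python
from math import comb, log2, log10


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact P value for over-representation.

    k: SUMO target proteins annotated with the term
    n: all SUMO target proteins
    K: proteome proteins annotated with the term
    N: all proteins in the reference proteome
    Returns P(X >= k) under the hypergeometric null.
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def fold_enrichment(k, n, K, N):
    """Observed term frequency divided by the frequency expected at random."""
    return (k / n) / (K / N)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def composite_score(enrichment, fdr):
    """Hypothetical composite: log2(enrichment) * -log10(FDR).

    This formula is an assumption made for illustration only.
    """
    return log2(enrichment) * -log10(max(fdr, 1e-300))


if __name__ == "__main__":
    # Hypothetical counts: (term, k, n, K, N). Not taken from the paper.
    terms = [
        ("term A", 120, 1000, 900, 20000),
        ("term B", 60, 1000, 1100, 20000),
        ("term C", 15, 1000, 100, 20000),
    ]
    pvals = [fisher_enrichment_p(k, n, K, N) for _, k, n, K, N in terms]
    fdrs = benjamini_hochberg(pvals)
    for (name, k, n, K, N), p, q in zip(terms, pvals, fdrs):
        enr = fold_enrichment(k, n, K, N)
        print(name, round(enr, 2), p, q, round(composite_score(enr, q), 2))
```

Exact integer arithmetic via `math.comb` keeps the sketch free of third-party dependencies; for real data sets an established implementation such as `scipy.stats.fisher_exact` would be the more usual choice.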