Extended Data Fig. 3: Proteins that contain regions with significant periodicity.
From: An activity-specificity trade-off encoded in human transcription factors

a. Region of significant periodicity in HNRNPA1. The disorder score (Metapredict; top) and the P values of the periodicity algorithm (K–S test; bottom) are plotted against amino acid position. The positions of the two RNA-binding domains (RBD1, RBD2) are marked with grey boxes, the intrinsically disordered region (IDR) with a dark blue bar, and the prion-like domain (PLD) with a light blue bar. b. Density plot of all proteins that contain a region of significant periodicity. For each region of significant periodicity, the length of the region is plotted against the lowest P value (K–S test) within the region. A P value cutoff of 0.01 was used to identify 2,202 regions. Each black dot represents one region, and the colour depth of the cloud is proportional to the local density of dots. The positions of DAZ1, EWSR1, HNRNPA1 and EGR1 are highlighted with red circles. c. AlphaFold models of four proteins. Aromatic residues are coloured red, and all other residues are coloured yellow. Note that in DAZ1, the periodic aromatic residues lie within β-sheets. EGR1 is the transcription factor with the highest-ranked region of significant periodicity. d, e. Gene set enrichment analysis (GSEA) of the 2,202 human proteins that contain a region with significant periodicity. The GSEA revealed an enrichment of prion-like domains and a depletion of transcription factors. The 2,202 proteins were ranked according to the lowest P value of their most periodic 100-amino-acid window. The tick marks indicate the positions of prion-like domains, aromatic-rich prion-like domains (>10% aromatic content) and transcription factors on the ranked gene list. Because Zn-finger transcription factors (ZNFs) contain repetitive sequences, transcription factors excluding ZNFs are also shown. Empirical P values are reported.
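The ranking and enrichment test described for panels d and e can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes the classic unweighted running-sum enrichment score and a two-sided permutation test for the empirical P value, and the names `enrichment_score`, `empirical_p`, and the toy protein identifiers are hypothetical.

```python
import random


def enrichment_score(ranked_ids, gene_set):
    """Unweighted running-sum enrichment score over a ranked list.

    Returns the running-sum value with the largest absolute deviation
    from zero: positive means the set is concentrated near the top of
    the ranking, negative means it is concentrated near the bottom.
    """
    hits = [g in gene_set for g in ranked_ids]
    n_hit = sum(hits)
    n_miss = len(ranked_ids) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for is_hit in hits:
        # Step up at each set member, step down otherwise.
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def empirical_p(ranked_ids, gene_set, n_perm=1000, seed=0):
    """Two-sided empirical P value from random sets of matched size (an assumption)."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_ids, gene_set)
    k = sum(g in gene_set for g in ranked_ids)
    extreme = 0
    for _ in range(n_perm):
        null_set = set(rng.sample(ranked_ids, k))
        if abs(enrichment_score(ranked_ids, null_set)) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (extreme + 1) / (n_perm + 1)


# Toy data (hypothetical): lowest window P value per protein.
min_p = {f"p{i}": 10.0 ** -(20 - i) for i in range(20)}
ranked = sorted(min_p, key=min_p.get)  # most periodic protein first
top_set = {"p0", "p1", "p2", "p3"}
es, p = empirical_p(ranked, top_set)
```

In this sketch a positive score corresponds to enrichment among the most periodic proteins and a negative score to depletion, mirroring the direction of the two findings reported in the legend.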