Fig. 1: Traces of aromatic periodicity in human TF IDRs.
From: An activity-specificity trade-off encoded in human transcription factors

a, Model of a TF (top) and the method used to identify aromatic periodic blocks (bottom). b, The top 80 TFs ranked according to the IDR periodicity score. Ranks are shown in parentheses. The height of the bars in the outer circle is proportional to the periodicity score. The inner circles indicate whether the IDR contains a minimal activation domain (AD) identified in the four studies. c, Positioning of aromatic residues in NFAT5. Red dots indicate the positions of aromatic residues in the periodic block; yellow dots indicate the positions of all other aromatic residues. d, Omega plot of the NFAT5 IDR. The empirical P value is reported. Red dots indicate aromatic residues; white dots indicate any other residue. e, Disorder plot (Metapredict; black) and AlphaFold2 pLDDT score (yellow) for HOXC4. f, Omega plots of the HOXC4 IDR (top) and the portion encoding the periodic aromatic block (bottom). The coordinates, ΩAro scores and the percentage of randomly generated sequences that have a lower ΩAro score than the actual sequence are provided. g, Representative images of droplet formation of purified recombinant HOXC4 IDR–mEGFP proteins. Scale bars, 5 μm. h, Relative amount of condensed protein in the droplet assays. Data are the mean ± s.d. of n = 10 images from two replicates. The curves were generated as nonlinear regressions to a sigmoidal curve function. i, Schematic (top) and results of luciferase reporter assays (bottom). The luciferase values were normalized to an internal Renilla control and the values are displayed as percentages of the activity measured using an empty vector. Data are the mean ± s.d. of n = 3 biological replicates. P values are from two-sided unpaired Student’s t-tests. j, Pipeline for the identification of regions with significant periodicity. k, Density plot of protein regions with significant periodicity. The length of the region is plotted against the lowest P value from the K–S test within the region. 
The depth of the colour is proportional to the density of the dots. The numbers of proteins that contain a region with significant periodicity over the total number of proteins in each category are shown. l, Omega scores of IDRs in various protein classes. P values are from one-way analysis of variance with Tukey’s multiple comparisons post test. For the box plots, the centre line shows the median, the bounds of the box correspond to the interquartile range (25th–75th percentiles), and whiskers extend to Q3 + 1.5× the interquartile range and Q1 − 1.5× the interquartile range; the dots beyond the whiskers show Tukey’s fences outliers. m, Schematic models of prion-like domains (PLDs) and TF IDRs, and their omega scores.
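
The box plot conventions stated for panel l can be illustrated with a minimal Python sketch. It is not the authors' plotting code: the function name `tukey_box_stats`, the example values and the choice of the "inclusive" quartile method are assumptions made here for illustration, and other quartile conventions give slightly different bounds.

```python
from statistics import quantiles


def tukey_box_stats(values, k=1.5):
    """Return the box plot summary described in the legend for panel l.

    Hypothetical helper: quartiles use the 'inclusive' method (an assumption);
    fences sit at Q1 - k*IQR and Q3 + k*IQR, with k = 1.5 as in the legend.
    """
    # Quartiles: 25th percentile, median and 75th percentile.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # interquartile range (height of the box)

    # Tukey's fences, the whisker limits given in the legend.
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr

    # Points beyond the fences are drawn as individual dots.
    outliers = [v for v in values if v < lower_fence or v > upper_fence]

    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": outliers,
    }


if __name__ == "__main__":
    # Made-up scores, not data from the figure.
    example_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
    for name, value in tukey_box_stats(example_scores).items():
        print(f"{name}: {value}")
```

Many plotting libraries draw each whisker to the most extreme data point that still lies inside the fence rather than to the fence itself, so the fence values computed here are upper and lower limits for the whiskers.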