Extended Data Fig. 1: Conservation of multivalency-promoting codon biases. | Nature

From: Collective homeostasis of condensation-prone proteins via their mRNAs

a, An example transcript from the gene CCDC61. The upper panel shows the smoothed GeRM score along the transcript (solid line), with the dashed line depicting the average smoothed GeRM score after synonymous codon shuffling. The lower panel shows the amino acid entropy of the encoded sequence (black line) and the proportion of charged amino acids in the same window (orange line). b, The native DNA and amino acid sequences of CCDC61 within the GeRM peak. A codon-shuffled sequence in which all codons must be synonymously shifted is also shown. The conservation across 100 vertebrates (PhyloP) is shown for each position that can tolerate synonymous mutation, with the height of the letter corresponding to the PhyloP score. Below, a ratio compares the GeRM scores associated with the native codon choice to the GeRM scores associated with any possible synonymous mutation. c, A schematic detailing the calculation of the ratio of GeRM between native codon choices and synonymous mutations. d, The mean log2-transformed ratio for each type of possible synonymous mutation. For example, for the arginine codon CGG, position 1 is considered a synonymously mutable position (CGG -> AGG) and position 3 is considered separately as a mutable position (e.g., CGG -> CGC). Relative codon usage is calculated by scaling the transcriptomic codon usage such that the mean is 0 and the standard deviation is 1. e, The normalised conservation across 470 mammals of synonymously mutable positions in coding sequences that either encode LCDs or do not. Codons are binned by the degree to which the native codon choice supports sequence multivalency, where codons with the highest ratio support multivalency the most. Conservation values are normalised to the middle position of the codon, which is never synonymously mutable, and each codon is normalised to have a median conservation value of 0.
f, PhyloP scores across 100 vertebrates of the middle positions of codons inside or outside of LCDs. g, As in Fig. 1h and Extended Data Fig. 1e, but using unnormalised PhyloP scores across 100 vertebrates. h, The mean GeRM scores within CDS regions encoding LCDs (black lines) or the rest of the CDS (grey lines). Dashed lines show the mean GeRM scores in those regions after synonymously reordering the codons within each transcript. Unless otherwise stated, pairwise significance tests are FDR-corrected Welch t-tests, where * = p < 1e−15. Precise p-values are found in Source Data.
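
The bookkeeping described for panels c and d can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`synonymous_mutations`, `log2_ratio`, `zscore`) are hypothetical, the standard genetic code is assumed, the GeRM scores themselves are taken as given inputs because their calculation is not described in this legend, and the population standard deviation is assumed for the scaling step.

```python
import math
import statistics
from itertools import product

# Standard genetic code, built in TCAG order (assumption: standard code applies).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_mutations(codon):
    """Map each codon position (1-3) to the single-base mutants that
    encode the same amino acid as the native codon."""
    native_aa = CODON_TABLE[codon]
    mutants = {1: [], 2: [], 3: []}
    for i, base in enumerate(codon):
        for alt in BASES:
            if alt == base:
                continue
            mutant = codon[:i] + alt + codon[i + 1:]
            if CODON_TABLE[mutant] == native_aa:
                mutants[i + 1].append(mutant)
    return mutants


def log2_ratio(native_score, mutant_score):
    """log2 of the native-to-mutant score ratio; both scores are assumed
    to be positive and supplied by the caller."""
    return math.log2(native_score / mutant_score)


def zscore(values):
    """Scale values to mean 0 and standard deviation 1
    (population standard deviation is an assumption)."""
    mu = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mu) / sd for v in values]


if __name__ == "__main__":
    print(synonymous_mutations("CGG"))
```

For CGG, position 1 yields AGG and position 3 yields CGT, CGC and CGA, matching the example given for panel d. Under the standard code, no sense codon has a synonymous single-base change at its middle position.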
