Fig. 2: CodonTransformer learned codon patterns across organisms. | Nature Communications

From: CodonTransformer: a multispecies codon optimizer using context-aware neural networks

a Codon similarity index (CSI) for all original genes and for the original genes in the top 10% by CSI (yellow and blue, respectively), and for DNA sequences generated by CodonTransformer for all original proteins (base and fine-tuned models, light and dark red, respectively), shown for 9 of the 15 genomes used for fine-tuning in this study. See Supplementary Figs. 2–16 for all 15 genomes and for the additional metrics of GC content and codon distribution frequency (CDF). b Synonymous mutations in the E. coli (K12 strain) ccdA antitoxin gene, from the ccdAB toxin-antitoxin system, were analyzed using CodonTransformer (base and fine-tuned for E. coli K12 strain, with wild-type DNA as input to the model) and background frequency choice (BFC) models. For 62 mutations from Chandra et al.32, the natural log of the ratio of mutant-codon probability to wild-type-codon probability was computed and compared with the natural log of experimental relative fitness (blue bars; positive correlation) and with relative ribosome stalling (green bars; negative correlation, absolute values plotted). Two-sided Spearman correlation tests (n = 62 mutations) were used to evaluate the models' performance, with numerical p-values, from left to right for the six bars, of 0.0153 (*), 0.1232, 0.0015 (**), 0.1123, 0.0026 (**), 0.0094 (**). Raw data and source data for a and Supplementary Figs. 2–16 are available at https://zenodo.org/records/13262517; the data underlying b are provided in the Source Data File.
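The analysis in panel b can be illustrated with a minimal sketch, assuming per-codon probabilities are already available from some model. The function names and all numbers below are hypothetical placeholders, not the authors' code or data. The sketch computes the log ratio of mutant to wild-type codon probability and a Spearman rank correlation coefficient against a measured quantity; it does not compute the two-sided p-value reported in the caption.

```python
import math

def log_prob_ratio(p_mutant, p_wildtype):
    """Natural log of mutant-codon probability over wild-type-codon probability."""
    return math.log(p_mutant / p_wildtype)

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical toy values for five synonymous mutations (not from the paper).
p_mut = [0.10, 0.30, 0.05, 0.40, 0.20]
p_wt = [0.50, 0.30, 0.60, 0.20, 0.40]
log_fitness = [-0.8, 0.1, -1.2, 0.3, -0.4]

scores = [log_prob_ratio(m, w) for m, w in zip(p_mut, p_wt)]
print(spearman_rho(scores, log_fitness))
```

Because Spearman correlation depends only on ranks, it captures any monotonic relationship between model scores and measurements without assuming linearity.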
