Fig. 2: NaP-TRAP measures the effect of Kozak strength on translation.

a A schematic detailing the Kozak library containing six random nucleotides upstream and one nucleotide downstream of the start codon of 3xFLAG-GFP (top left). Histogram of NaP-TRAP translation values at 6 hpf (bottom left). Sequence logos for the top and bottom 10% of reporters based on translation (right) (ref. 63). b, c Cartoon of feature generation for the random forest regression model through one-hot encoding of reporter sequences (b). Scatterplot comparing the model's predictions to the experimentally derived translation values of a test set of reporters (two-sided Pearson's R; N = 785 reporters) (c). d Permuted feature importance derived from the random forest regression model (N = 10 repeats; error bars = SD). Nucleotide positions in purple correlate negatively with translation, whereas positions in blue correlate positively with translation (two-sided Spearman rank correlation coefficient). e Comparison of translation measurements at 6 hpf and an in silico-derived Kozak score based on the frequency of Kozak sequences in the transcriptome (two-sided Pearson's R; N = 2616 reporters; p < 10⁻⁵⁶) (ref. 26). f, g The translation of seven reporters with a Kozak score between 295 and 305 was measured using a dual luciferase-based assay (f). Plot comparing relative luciferase activity (Nano luciferase / Firefly luciferase) and NaP-TRAP translation values (two-sided Pearson's R; N = 3 replicates; 5 embryos per replicate; p < 0.0033; error bars = SEM) (g).
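
As an illustration of the one-hot encoding shown in panel b, the following is a minimal sketch and not the authors' implementation. It assumes that only the seven variable positions (six upstream and one downstream of the start codon) are encoded, that the alphabet is written as DNA (A, C, G, T), and that each position contributes four binary features in that order. The function name `one_hot_encode` and the example sequence are hypothetical.

```python
from typing import List

# Assumed alphabet and feature order; the source does not specify either.
NUCLEOTIDES = "ACGT"


def one_hot_encode(variable_positions: str) -> List[int]:
    """Return a flat 0/1 feature vector with four entries per nucleotide."""
    features: List[int] = []
    for nt in variable_positions.upper():
        if nt not in NUCLEOTIDES:
            raise ValueError(f"unexpected nucleotide: {nt!r}")
        # Exactly one of the four entries is set to 1 for each position.
        features.extend(1 if nt == candidate else 0 for candidate in NUCLEOTIDES)
    return features


# Hypothetical reporter: six upstream nucleotides followed by one downstream.
example = one_hot_encode("GCCACCG")
```

Under these assumptions each reporter maps to 7 × 4 = 28 binary features, which could then be passed to any regression model that accepts a numeric feature matrix.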