Fig. 1: A parsimonious model predicts the most probable outcomes of Target-AID mutagenesis.
From: Perturbing proteomes at single residue resolution using base editing

a gRNAs included in the time-course base editing experiment had diverse C content profiles in the Target-AID activity window. Nucleotides are color coded: guanines are purple, thymines are red, adenines are green and cytosines are blue. b Overall fraction of edited reads for all target sites across the time points of the experiment: T0 (start of induction), T6 (mid induction), T12 (end of induction). The solid time point represents surviving cells plated after galactose induction, while the liquid time point represents the cell population after canavanine co-selection. Amplification of the ERO1 target site from the liquid recovery time points was unsuccessful (shown in gray), so the solid recovery time point was used instead for the other analysis steps. c Fraction of genotypes with either one, two or three edits compared to the total fraction of reads that were edited. d Editing outcome type for all sites with a total editing rate greater than one percent after co-selection (n = 30 cytosines across all targeted sites). The C to G/T distribution represents the sum of editing that resulted in a C to G or C to T mutation. Position-wise editing rates and outcomes are shown in Supplementary Figs. 5 and 6. e Agreement between the per-nucleotide total editing rank predicted by the model used to predict mutagenesis outcomes in the large-scale experiment and the rank observed in the deep sequencing data (n = 28 sites, 10 gRNAs; gRNA-specific predicted and observed rankings are presented in Supplementary Figs. 5 and 6). The gRNAs targeting ADE1 and SES1 were excluded from the analysis because, respectively, there was only one editable site in the activity window and the total editing rate was too low. f Edited read coverage of the mutation outcome prediction model and the 99th percentile of edited allele combinations (n = 4 genotypes in both cases) for the gRNAs with editing activity included in the large-scale experiment.
Boxplots represent the upper and lower quartiles of the data, with the median shown as a yellow bar. Whiskers extend to 1.5 times the interquartile range (Q3–Q1) at most. Source data are available in the Source Data file.
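The boxplot convention described above can be illustrated with a minimal Python sketch. It is not the authors' plotting code: the function name `boxplot_stats`, the example values and the choice of quartile interpolation (`method="inclusive"`) are assumptions made for illustration, and plotting libraries may compute quartiles slightly differently.

```python
import statistics


def boxplot_stats(values, whisker=1.5):
    """Summarize values the way a Tukey-style boxplot draws them.

    Hypothetical helper: the box spans Q1 to Q3, and each whisker ends at
    the most extreme data point lying within `whisker` * IQR of the box.
    """
    data = sorted(values)
    # Quartile method is an assumption; other interpolation rules exist.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],    # lowest point within the lower fence
        "whisker_high": inside[-1],  # highest point within the upper fence
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


# Made-up numbers, not data from the figure.
example = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(example)
```

In this made-up example the upper whisker stops at 9 rather than at Q3 + 1.5 × IQR, because whiskers end at an observed data point and extend to 1.5 times the interquartile range "at most"; the value 30 falls outside and would be drawn as an individual point.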