Fig. 3: Sequences collected over time and number of mutations in training and test sets. | Communications Biology

From: Paying attention to the SARS-CoV-2 dialect : a deep neural network approach to predicting novel protein mutations

The left-hand graph shows the number of unique sequences collected per date. The red line indicates the cutoff date (January 1st, 2022). Whether a mutation counts as novel is determined by the first collection date of a sequence containing that mutation. The right-hand graph compares the number of substitution mutations in the training set (first collected before the cutoff) with the number of substitution mutations in the test set (first collected after the cutoff).
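
The cutoff-based split described in the caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the record layout, the function names `first_seen_dates` and `split_by_cutoff`, and the toy mutation labels are all hypothetical. The caption does not say how a mutation first collected exactly on the cutoff date is treated, so assigning it to the test set here is an assumption.

```python
from datetime import date

# Cutoff date stated in the caption (January 1st, 2022).
CUTOFF = date(2022, 1, 1)


def first_seen_dates(records):
    """Map each mutation to the earliest collection date of any sequence containing it.

    `records` is a hypothetical layout: an iterable of
    (collection_date, set_of_mutation_labels) pairs, one per sequence.
    """
    first_seen = {}
    for collected_on, mutations in records:
        for mutation in mutations:
            if mutation not in first_seen or collected_on < first_seen[mutation]:
                first_seen[mutation] = collected_on
    return first_seen


def split_by_cutoff(first_seen, cutoff=CUTOFF):
    """Split mutations into training (first seen before cutoff) and test sets.

    Assumption: a mutation first seen on the cutoff date itself goes to the test set.
    """
    train = {m for m, d in first_seen.items() if d < cutoff}
    test = {m for m, d in first_seen.items() if d >= cutoff}
    return train, test


# Toy records with made-up labels and dates, for illustration only.
records = [
    (date(2021, 3, 5), {"A10V", "K20R"}),
    (date(2021, 11, 20), {"K20R"}),
    (date(2022, 2, 14), {"A10V", "T30I"}),
    (date(2022, 1, 1), {"G40S"}),
]

first_seen = first_seen_dates(records)
train_mutations, test_mutations = split_by_cutoff(first_seen)
```

In this toy input, `A10V` reappears after the cutoff but stays in the training set because its first collection date is earlier, which is the behavior the caption describes for novel mutations.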