Fig. 1: Workflow used for the analysis of the full SARS-CoV-2 dataset, composed of three main stages: data preparation, phylogenetic analysis and post-processing.
From: High-resolution epidemiological landscape from ~290,000 SARS-CoV-2 genomes from Denmark

Data preparation included sequencing, identifying consensus sequences, aligning sequences to the reference sequence, masking sites and analysing nucleotide diversity. Phylogenetic analysis included building a preliminary phylogenetic tree, removing molecular clock outliers, partitioning the tree into sub-clades and re-inferring trees using a Bayesian approach for each sub-clade. Post-processing included inferring the effective reproduction number (Re) for each clade, linking tips to registries and conducting phylogeographic analysis.
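
As an illustration of the last two data-preparation steps (masking sites and analysing nucleotide diversity), the following is a minimal sketch, not the pipeline used in the study. It assumes the sequences are already aligned to equal length, defines nucleotide diversity as the mean per-site pairwise difference over unambiguous sites, and uses hypothetical helper names (`mask_sites`, `pairwise_differences`, `nucleotide_diversity`) and toy sequences invented for this example.

```python
from itertools import combinations


def mask_sites(seq, masked_positions, mask_char="N"):
    """Replace the given 0-based alignment positions with mask_char."""
    chars = list(seq)
    for pos in masked_positions:
        if 0 <= pos < len(chars):
            chars[pos] = mask_char
    return "".join(chars)


def pairwise_differences(a, b, valid="ACGT"):
    """Return (mismatches, compared sites) for two aligned sequences.

    Only sites where both sequences carry an unambiguous base are compared,
    so masked sites (N) and gaps are skipped.
    """
    diffs = compared = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in valid and y in valid:
            compared += 1
            if x != y:
                diffs += 1
    return diffs, compared


def nucleotide_diversity(seqs):
    """Mean per-site pairwise difference across all pairs of sequences."""
    per_pair = []
    for a, b in combinations(seqs, 2):
        diffs, compared = pairwise_differences(a, b)
        if compared:
            per_pair.append(diffs / compared)
    return sum(per_pair) / len(per_pair) if per_pair else 0.0


# Toy aligned sequences (illustrative only, not real SARS-CoV-2 data).
aligned = ["ACGTACGT", "ACGTACGA", "ACCTACGT"]

# Hypothetical mask: hide the last alignment column in every sequence.
masked = [mask_sites(s, masked_positions=[7]) for s in aligned]

pi_raw = nucleotide_diversity(aligned)
pi_masked = nucleotide_diversity(masked)
print(f"pi before masking: {pi_raw:.4f}")
print(f"pi after masking:  {pi_masked:.4f}")
```

In this toy example, masking the last column removes one of the two variable sites, so the estimated diversity drops. The all-pairs loop is quadratic in the number of sequences, so a dataset of the size described here would need a per-site allele-frequency formulation or a dedicated tool instead.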