Fig. 2: Phylogenetic relationships of previously undetected SARS-CoV-2 and other NY and global isolates.
From: Molecular evidence of SARS-CoV-2 in New York before the first pandemic wave

a Multiple sequence alignment of >95% complete SARS-CoV-2 genome sequences obtained from RPN pools relative to Wuhan-Hu-1 (RefSeq: NC_045512). RPN pools are ordered by date and PANGO lineage as displayed in (a). The SARS-CoV-2 genome coordinates and gene annotations are shown above. Single-nucleotide variations (SNVs) are depicted with vertical lines in red (clade defining) or black (other). b Coverage for pools with detectable RT-PCR targets (ORF1ab+E+ (magenta), ORF1ab+ (yellow), E+ (cyan)) collected prior to the first confirmed case in NY (NY1) with detectable SARS-CoV-2 reads that could not be assembled to complete genomes (>Q30 reads are shown). Nextera XT comprises data from both whole-genome and targeted amplicon sequencing library preparations. c Maximum-likelihood (ML) phylogenetic inference shown as a time tree of seven SARS-CoV-2 genome sequences from this surveillance study in a global background of 2993. Tip circles indicate the position of the respiratory pathogen-negative (RPN) pools (red) described in this report, the first reported COVID-19 case in NYC (green) from 29 February, later NYC cases from MSHS (yellow) and other institutions (dark gray), and US (blue) early isolates prior to 1 March. Tips without circles correspond to the background global isolates. The yellow box delineates the position of the clade containing the majority of NYC sequences detected during the early spread. The PANGO lineage classification of the RPN pools is indicated on the right, and the NextStrain clades are shown as node labels. The specimen identifier is indicated for RPN pools detected earlier than NY1. The time tree was inferred under a strict clock model with a nucleotide substitution rate of 0.80 × 10⁻³.