Extended Data Fig. 1: Dataset assembly and statistics. | Nature Microbiology

Extended Data Fig. 1: Dataset assembly and statistics.

From: Systematic evaluation of horizontal gene transfer between eukaryotes and viruses

a. Schematic of the dataset assembly pipeline. b. Numbers of eukaryotic, viral, and prokaryotic genomes and proteins examined and included in the initial and final datasets. The final dataset is the one on which the HGT analysis was conducted. c. Representation of viral phyla and other groups (that is, those lacking phylum classifications) in the initial and final datasets. d, e. Summary statistics for the final clustered protein families, including the number of sequences present (d) and the trimmed alignment lengths (e). See Methods for additional information.
