Table 1 Bacterial species, strain collections and antibiotic susceptibility phenotypes used in this study.

From: PowerBacGWAS: a computational pipeline to perform power calculations for bacterial genome-wide association studies

| Bacterial species | Strain collection | # of isolates (diversity) | LD: median R² (interquartile range) | # of SNP sites | # of genes in pan-genome | AMR phenotype (% R and S)ᵃ | AMR causal variants | AMR causal variants: AFᵇ, OR and GWAS p-value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Enterococcus faecium | Species-wide | n = 1432 (5.6 SNPs/kb) | 0.65 (0.37–0.95) | 263,875 | 11,800 | Kanamycin susceptibility (35.3%, 23.3%) | aph(3)-IIIa | AF: 56.3%; OR: 1083; p-value: 8.25 × 10⁻¹⁴⁵ |
| Enterococcus faecium | Single-clade | n = 761 (2.4 SNPs/kb) | 0.50 (0.28–0.98) | 50,790 | 5443 | Streptomycin susceptibility (34.5%, 60.3%) | ant(6)-Ia/aad(6) | AF: 34%; OR: 8986; p-value: 1.61 × 10⁻⁵¹ |
| Klebsiella pneumoniae | Species-wide | n = 2628 (5.2 SNPs/kb) | 0.67 (0.37–1.00) | 543,165 | 30,772 | Meropenem susceptibility (21%, 69.1%) | blaKPC | AF: 12%; OR: 180; p-value: 8.90 × 10⁻¹¹⁰ |
| Klebsiella pneumoniae | Single-clade | n = 1193 (0.11 SNPs/kb) | 0.78 (0.50–0.96) | 46,541 | 23,708 | Meropenem susceptibility (95.4%, 1.3%) | blaKPC | AF: 72%; OR: NAᶜ; p-value: NAᶜ |
| Mycobacterium tuberculosis | Species-wide | n = 2655 (0.2 SNPs/kb) | 0.86 (0.39–1.00) | 93,995 | 21,678 | Isoniazid susceptibility (30.9%, 66.4%) | nsSNPs in katG | AF: 20%; OR: 220; p-value: 2.54 × 10⁻¹⁰¹ |
| Mycobacterium tuberculosis | Single-clade | n = 1139 (0.05 SNPs/kb) | 0.98 (0.40–1.00) | 24,467 | 10,130 | Isoniazid susceptibility (23.8%, 71.7%) | nsSNPs in katG | AF: 13%; OR: 166; p-value: 6.40 × 10⁻⁷³ |

  1. Summary table of strain collections used in this study. The average diversity (third column) was calculated as the mean pairwise genetic distance between isolates, expressed as the number of SNPs per kilobase. The number of SNP sites in the chromosome (fifth column; extracted from the VCF file) and the number of genes in the pan-genome (sixth column; extracted from Panaroo’s output), both calculated across all isolates, indicate the degree of diversity within each collection. The last three columns show the AMR phenotypes and causal variants used by the sub-sampling approach to perform power calculations. The single-clade collections correspond to clade A1 isolates for E. faecium, CC258 isolates for K. pneumoniae, and lineage 4.3 isolates for M. tuberculosis.
  2. ᵃ The percentages of resistant and susceptible isolates may not add up to 100%, as a subset of isolates was not tested.
  3. ᵇ The MAF was calculated in the whole population, not only in the samples phenotyped for the antibiotic in question.
  4. ᶜ The unbalanced number of cases and controls prevented running the GWAS.
  5. Abbreviations: SNPs/kb, single nucleotide polymorphisms per kilobase; AF, allele frequency; OR, odds ratio; nsSNP, non-synonymous SNP; LD, linkage disequilibrium.
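
The summary statistics described in the notes above can be illustrated with a minimal Python sketch. It is not the PowerBacGWAS implementation: the function names, the toy alignment and the counts are hypothetical, and the odds ratio shown is a crude, unadjusted 2×2 estimate rather than the GWAS-derived value reported in the table. The sketch also assumes an alignment of variable sites with no missing calls.

```python
from itertools import combinations


def mean_pairwise_snps_per_kb(alignment, genome_length_bp):
    """Mean pairwise SNP distance between isolates, in SNPs per kilobase.

    `alignment` (hypothetical input) maps isolate name -> string of alleles
    at the variable sites only; all strings must have the same length.
    `genome_length_bp` is the length of the chromosome the sites come from.
    """
    pairs = list(combinations(alignment.values(), 2))
    if not pairs:
        raise ValueError("need at least two isolates")
    # Count mismatching sites for every pair of isolates.
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    mean_snps = total / len(pairs)
    return mean_snps / (genome_length_bp / 1000)


def allele_frequency(n_carriers, n_isolates):
    """Fraction of all isolates carrying the variant (whole collection)."""
    return n_carriers / n_isolates


def odds_ratio(res_with, res_without, sus_with, sus_without):
    """Crude odds ratio from a 2x2 table of phenotype by variant carriage.

    Returns None when a zero cell leaves the ratio undefined, which is one
    way a very unbalanced case/control split can make the estimate unusable.
    """
    if res_without == 0 or sus_with == 0:
        return None
    return (res_with * sus_without) / (res_without * sus_with)


# Toy data (made up for illustration only).
toy_alignment = {"iso1": "ACGT", "iso2": "ACGA", "iso3": "TCGA"}
diversity = mean_pairwise_snps_per_kb(toy_alignment, genome_length_bp=2000)
af = allele_frequency(n_carriers=300, n_isolates=1000)
crude_or = odds_ratio(res_with=90, res_without=10, sus_with=5, sus_without=95)
```

With real data, the alleles would typically come from the VCF file and the variant carriage from the pan-genome presence/absence output; both parsing steps are left out here.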