Table 1 Phylogeny and alignment information

From: Improving microbial phylogeny with citizen science within a mass-market video game

| Method | PASTA | Post-processing PASTA | MUSCLE | MAFFT | Greedy algorithm | BLS |
| --- | --- | --- | --- | --- | --- | --- |
| Phylogeny and alignment information | | | | | | |
| Kendall–Colijn mean | 2,193 | 1,521 | 1,772 | 1,298 | 1,246 | 1,115 |
| Triplet distance mean (k) | 86.6 | 73.4 | 101.4 | 80.9 | 80.2 | 52.2 |
| Compound Kendall–Colijn and Triplet | 87.2 | 67.1 | 86.1 | 66.4 | 65.0 | 48.4 |
| Sum of pairs (billions) | 2.09 | 2.18 | 2.05 | 2.11 | 2.16 | 2.17 |
| Alignment width | 193 | 188 | 260 | 270 | 198 | 196 |
| Proportion of bases lost in mapping to structure | | | | | | |
| Proportion of unmapped bases | 0.001 | | 0.149 | 0.100 | | 0.003 |

  1. Top: The reported distances are the mean of sampled distances from phylogenetic trees estimated with FastTree from alignments generated with various methods against our reference tree, the Greengenes-SEPP tree. The compound distance is a combination of Kendall–Colijn and Triplet. Because these are distances, optimal values are the lowest. For the sum of pairs, the optimal value is the highest. We also report alignment width for context. Bottom: Alignments were reduced to the 150 most populated columns (same as in Fig. 2a) to be mapped to the 16S V4 secondary structure. We report the proportion of bases in unmapped columns.
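The reading rule in the note (lowest is best for the three distances, highest is best for the sum of pairs) can be applied mechanically. The following is a minimal sketch, not part of the original analysis: the values are transcribed from the table, and the names `METHODS`, `METRICS`, `best_method`, and `best_per_metric` are hypothetical helpers introduced only for illustration. Alignment width is left out because the note reports it for context only and gives no optimal direction.

```python
# Values transcribed from Table 1; the order follows the table's method columns.
METHODS = [
    "PASTA",
    "Post-processing PASTA",
    "MUSCLE",
    "MAFFT",
    "Greedy algorithm",
    "BLS",
]

# Each metric maps to (values per method, lower_is_better).
# The direction flags follow the table note: distances are minimized,
# sum of pairs is maximized.
METRICS = {
    "Kendall-Colijn mean": ([2193, 1521, 1772, 1298, 1246, 1115], True),
    "Triplet distance mean (k)": ([86.6, 73.4, 101.4, 80.9, 80.2, 52.2], True),
    "Compound Kendall-Colijn and Triplet": ([87.2, 67.1, 86.1, 66.4, 65.0, 48.4], True),
    "Sum of pairs (billions)": ([2.09, 2.18, 2.05, 2.11, 2.16, 2.17], False),
}


def best_method(values, lower_is_better):
    """Return the name of the method with the optimal value for one metric."""
    pick = min if lower_is_better else max
    index = pick(range(len(values)), key=values.__getitem__)
    return METHODS[index]


def best_per_metric():
    """Map each metric name to its best-scoring method."""
    return {
        name: best_method(values, lower_is_better)
        for name, (values, lower_is_better) in METRICS.items()
    }


if __name__ == "__main__":
    for name, method in best_per_metric().items():
        print(f"{name}: {method}")
```

Under these assumptions, BLS comes out best on all three tree distances, while Post-processing PASTA has the highest sum of pairs (2.18 against 2.17 for BLS).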