Table 1 Pipelines used for the cluster congruence analysis, indicating the input type, the schema or reference genome used for each species, the output used for clustering, and the clustering method(s) applied

From: Multi-country and intersectoral assessment of cluster congruence between pipelines for genomics surveillance of foodborne pathogens

The four species columns give the "Schema (if allele) / Type of dataset (if SNP)" for each species.

| Pipeline name | Workflow | Type of pipeline | Input type | L. monocytogenes | S. enterica | E. coli | C. jejuni | Output for clustering | Clustering method |
|---|---|---|---|---|---|---|---|---|---|
| chewieSnake | chewieSnake | Allele | Assembly | Ruppitsch | Enterobase | Enterobase | PubMLST | Allele matrix | HC and GT |
| INNUENDO-like (Listeria) or INNUENDO-like-schema | INNUENDO-like* | Allele | Assembly | Moura | INNUENDO wgMLST/INNUENDO cgMLST§ (EFSA)/Enterobase | INNUENDO wgMLST/INNUENDO cgMLST§ (EFSA)/Enterobase | INNUENDO wgMLST/INNUENDO cgMLST/PubMLST | Allele matrix | HC and GT |
| SeqSphere or SeqSphere-wgMLST (C. jejuni) | Ridom SeqSphere+** | Allele | Assembly | Ruppitsch | Enterobase | Enterobase | SeqSphere (extended) | Allele or Distance or Partition table | HC and GT |
| Bionumerics | Bionumerics** | Allele | Assembly | Moura | Enterobase | Enterobase | Oxford | Distance matrix | HC |
| MentaLiST | MentaLiST | Allele | Assembly | Moura | INNUENDO | INNUENDO | INNUENDO | Allele matrix | HC and GT |
| SnippySnake | SnippySnake | SNP | Reads | ST | serotype | serotype | - | Distance matrix | HC |
| CSI | CSI Phylogeny | SNP | Reads | ST*** | serotype*** | serotype | ST | Distance matrix | HC |
| WGSBAC | WGSBAC | SNP | Reads | ST | serotype | serotype | - | Distance matrix | HC |
| SnapperDB | SnapperDB | SNP | Reads | - | serotype | - | - | Distance matrix | HC |

*This pipeline corresponds to an adaptation of the pipeline included in the INNUENDO platform [22]. **Commercial software. ***CSI Phylogeny [62] was also run with a more diverse dataset combining the sequencing data of the top STs/serotypes. §This schema is the one implemented and recommended by EFSA; for clarity, it is referred to throughout the manuscript as the “EFSA” schema.

When a pipeline was not used for a given species, the schema/dataset field has a hyphen (“-”). HC: hierarchical clustering; GT: GrapeTree.
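
To make the "Allele matrix" to "HC" step in the table concrete, the following is a minimal sketch of how clusters could be derived from an allele matrix. It is not the implementation used by any of the listed pipelines. The single-linkage criterion, the threshold value, the "-" placeholder for an uncalled locus, the rule of ignoring missing loci, and all sample names and allele values are assumptions chosen for illustration.

```python
from itertools import combinations

MISSING = "-"  # hypothetical placeholder for an uncalled locus


def allele_distance(a, b):
    """Count loci where both profiles have a call and the calls differ."""
    return sum(
        1 for x, y in zip(a, b)
        if x != MISSING and y != MISSING and x != y
    )


def single_linkage_clusters(profiles, threshold):
    """Group samples connected by chains of pairwise distances <= threshold.

    This is equivalent to cutting a single-linkage dendrogram at `threshold`.
    """
    parent = {name: name for name in profiles}

    def find(name):
        # Follow parent links to the cluster representative,
        # shortening the path along the way.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for s1, s2 in combinations(profiles, 2):
        if allele_distance(profiles[s1], profiles[s2]) <= threshold:
            parent[find(s1)] = find(s2)

    clusters = {}
    for name in profiles:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Toy allele matrix: rows are samples, columns are loci (made-up values).
profiles = {
    "sample_A": ["1", "4", "2", "7", "3"],
    "sample_B": ["1", "4", "2", "8", "3"],
    "sample_C": ["1", "5", "2", "8", "-"],
    "sample_D": ["9", "6", "3", "2", "1"],
}

clusters = single_linkage_clusters(profiles, threshold=1)
print(clusters)
```

With a threshold of one allelic difference, sample_A, sample_B and sample_C fall into one cluster even though sample_A and sample_C differ at two loci, because single linkage chains them through sample_B. This chaining behaviour is one reason the choice of linkage and threshold can affect congruence between pipelines.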