Table 2 Summary of benchmarking study design and methods

From: Systematic benchmarking of omics computational tools

| Benchmarking study | Application | No. of tools | Model of study | Raw input data type | Gold standard data preparation method | Parameter optimization |
|---|---|---|---|---|---|---|
| Yang et al. 2013 | Error correction | 7 | I | R | SIMUL | N |
| Aghaeepour et al. 2013 | Flow cytometry analysis | 14 | C | R | EXPERT | N |
| Bradnam et al. 2013 | Genome assembly | 21 | C | R | ALTECH | n/a |
| Hunt et al. 2014 | Genome assembly | 10 | I | R, S | SOFTWARE | N |
| Lindgreen et al. 2016 | Microbiome analysis | 14 | I | S | SIMUL | N |
| McIntyre et al. 2017 | Microbiome analysis | 11 | I | R, S | MOCK | N |
| Sczyrba et al. 2017 | Microbiome analysis | 25 | C | S | SIMUL | n/a |
| Altenhoff et al. 2016 | Ortholog prediction | 15 | I | DB | DB | Y |
| Jiang et al. 2016 | Protein function prediction | 121 | C | R | DB | n/a |
| Radivojac et al. 2013 | Protein function prediction | 54 | C | R | DB | n/a |
| Baruzzo et al. 2017 | Read alignment | 14 | I | S | SIMUL | Y |
| Earl et al. 2014 | Read alignment | 12 | C | R, S | SIMUL | n/a |
| Hatem et al. 2013 | Read alignment | 9 | I | R, S | SIMUL | Y |
| Hayer et al. 2015 | RNA-Seq analysis | 7 | I | R, S | ALTECH | N |
| Kanitz et al. 2015 | RNA-Seq analysis | 11 | I | R, S | ALTECH | N |
| Łabaj et al. 2016 | RNA-Seq analysis | 7 | I | R | ALTECH | N |
| Łabaj et al. 2016 | RNA-Seq analysis | 4 | I | R | DB | N |
| Li et al. 2014 | RNA-Seq analysis | 5 | I | R | ALTECH | Y |
| Steijger et al. 2013 | RNA-Seq analysis | 14 | C, I | R | ALTECH | n/a |
| Su et al. 2014 | RNA-Seq analysis | 6 | I | R | ALTECH | Y |
| Zhang et al. 2014 | RNA-Seq analysis | 3 | I | R | ALTECH | Y |
| Thompson et al. 2011 | Sequence alignment | 8 | I | DB | DB | N |
| Bohnert et al. 2017 | Variant analysis | 19 | I | R, S | I&A | Y |
| Ewing et al. 2015 | Variant analysis | 14 | C | S | SIMUL | n/a |
| Pabinger et al. 2014 | Variant analysis | 32 | I | R, S | SIMUL | N |

  1. Surveyed benchmarking studies published from 2011 to 2017 are grouped by area of application (column “Application”). We recorded the number of tools benchmarked by each study (“No. of tools”) and the coordinating model used to conduct it (“Model of study”): performed independently by a single group (“I”), competition-based (“C”), or a hybrid combining elements of both (“C, I”). We also documented the type of raw omics data (“Raw input data type”) and the gold standard data (“Gold standard data preparation method”) used in each benchmarking study. Studies that used computationally simulated data are marked “S”; studies that used real raw data generated experimentally in the wet lab are marked “R”; studies that used both are marked “R, S”. Gold standard data were computationally simulated (“SIMUL”), manually evaluated by experts (“EXPERT”), prepared by alternative technology (“ALTECH”), prepared as curated software input (“SOFTWARE”), prepared as a mock community (“MOCK”), prepared from curated databases (“DB”), or prepared using an integration and arbitration approach (“I&A”). In competition-based benchmarking studies, parameter optimization (“Parameter optimization”) is performed by each team and is not mandatory, so it is marked “n/a”. More details about the techniques used to prepare gold standard data sets are provided in Table 1.
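
The abbreviations defined in the footnote can be read as small lookup tables. The following is a minimal sketch of how a row might be decoded programmatically; it is not part of the original study. The dictionary names and the `decode` helper are hypothetical, the reading of "Y" and "N" as yes and no is an assumption (the footnote defines only "n/a"), and codes the footnote does not define for a column (for example "DB" as a raw input data type) are passed through unchanged rather than guessed.

```python
# Legends transcribed from the footnote of Table 2.
MODEL_OF_STUDY = {
    "I": "independent, single group",
    "C": "competition-based",
}
RAW_INPUT_DATA = {
    "R": "real, experimentally generated",
    "S": "computationally simulated",
}
GOLD_STANDARD = {
    "SIMUL": "computationally simulated",
    "EXPERT": "manually evaluated by experts",
    "ALTECH": "alternative technology",
    "SOFTWARE": "curated software input",
    "MOCK": "mock community",
    "DB": "curated databases",
    "I&A": "integration and arbitration",
}
# Assumption: "Y" and "N" mean yes and no; the footnote only defines "n/a".
PARAMETER_OPTIMIZATION = {
    "Y": "yes",
    "N": "no",
    "n/a": "left to each competing team, not mandatory",
}


def decode(cell, legend):
    """Expand a cell such as "R, S" into one description per code.

    Codes that are missing from the legend are returned unchanged.
    """
    codes = [code.strip() for code in cell.split(",")]
    return [legend.get(code, code) for code in codes]


# One row copied from the table (Hunt et al. 2014).
row = {"model": "I", "raw": "R, S", "gold": "SOFTWARE", "params": "N"}

decoded = {
    "model": decode(row["model"], MODEL_OF_STUDY),
    "raw": decode(row["raw"], RAW_INPUT_DATA),
    "gold": decode(row["gold"], GOLD_STANDARD),
    "params": decode(row["params"], PARAMETER_OPTIMIZATION),
}
print(decoded)
```

Keeping one legend per column avoids ambiguity for codes such as "DB", which the footnote defines only for the gold standard column.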