Table 1 Performance metrics of the models on the test sets.

From: Predictive design of sigma factor-specific promoters

 

| | σ70 specific promoters | | σB specific promoters | | σF specific promoters | | σW specific promoters | |
|---|---|---|---|---|---|---|---|---|
| | WT genotype | | σB genotype | | σF genotype | | σW genotype | |
| Metric | Mean | Std. | Mean | Std. | Mean | Std. | Mean | Std. |
| Spearman’s rho | 0.574 | 0.003 | 0.565 | 0.002 | 0.497 | 0.002 | 0.234 | 0.050 |
| Weighted ACC | 0.230 | 0.005 | 0.230 | 0.005 | 0.210 | 0.004 | 0.136 | 0.015 |
| Weighted MAE | 1.609 | 0.007 | 1.652 | 0.017 | 1.919 | 0.052 | 2.504 | 0.100 |
| MAE, y = 0 | 2.866 | 0.150 | 2.397 | 0.068 | 2.891 | 0.107 | 4.797 | 0.240 |
| MAE, y = 1 | 1.845 | 0.105 | 1.630 | 0.070 | 2.097 | 0.117 | 4.022 | 0.141 |
| MAE, y = 2 | 1.372 | 0.054 | 1.071 | 0.028 | 1.507 | 0.101 | 2.739 | 0.287 |
| MAE, y = 3 | 1.284 | 0.058 | 1.197 | 0.035 | 1.294 | 0.056 | 1.613 | 0.329 |
| MAE, y = 4 | 1.241 | 0.045 | 1.368 | 0.058 | 1.324 | 0.095 | 1.090 | 0.122 |
| MAE, y = 5 | 1.324 | 0.023 | 1.407 | 0.055 | 1.357 | 0.049 | 0.587 | 0.480 |
| MAE, y = 6 | 1.401 | 0.038 | 1.333 | 0.036 | 1.258 | 0.068 | 0.951 | 0.111 |
| MAE, y = 7 | 1.362 | 0.045 | 1.315 | 0.051 | 1.572 | 0.119 | 1.790 | 0.189 |
| MAE, y = 8 | 1.316 | 0.055 | 1.507 | 0.065 | 2.102 | 0.146 | 2.711 | 0.246 |
| MAE, y = 9 | 1.533 | 0.032 | 1.730 | 0.116 | 4.010 | 0.135 | 3.774 | 0.205 |
| MAE, y = 10 | 2.158 | 0.047 | 3.224 | 0.165 | 1.704 | 0.320 | 4.646 | 0.298 |
| Genotype | | | σF genotype | | σB genotype | | σB genotype | |
| ROC AUC | | | 0.694 | 0.004 | 0.652 | 0.004 | 0.615 | 0.010 |
| Genotype | | | σW genotype | | σW genotype | | σF genotype | |
| ROC AUC | | | 0.691 | 0.002 | 0.643 | 0.004 | 0.635 | 0.006 |
| Genotype | | | WT genotype | | WT genotype | | WT genotype | |
| ROC AUC | | | 0.665 | 0.004 | 0.632 | 0.003 | 0.635 | 0.006 |

  1. Weighted metrics are used for accuracy and mean absolute error to account for class imbalance. For each performance metric, the mean and standard deviation (Std.) on the test set are given, obtained by training multiple models in a five-fold set-up. Mean absolute errors are also given for each of the sample classes (y = 0–10). ROC/PR AUC is used for the binary classification problem; an ROC AUC of 1 represents a perfect model. (ACC accuracy, MAE mean absolute error, AUC area under the curve, ROC receiver operating characteristic, WT wild-type, σ sigma factor). Source data are provided as a Source Data file.
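
The table note does not state the exact weighting scheme, so the following is only a minimal sketch of how per-class and class-weighted metrics of this kind might be computed. It assumes that every class (y = 0–10) receives equal weight regardless of how many samples it contains, and that the spread across folds is the sample standard deviation. The function names and the toy data are hypothetical and are not taken from the article.

```python
from collections import defaultdict
from statistics import mean, stdev


def per_class_mae(y_true, y_pred):
    """Return {class label: mean absolute error over samples of that class}."""
    errors = defaultdict(list)
    for t, p in zip(y_true, y_pred):
        errors[t].append(abs(t - p))
    return {label: mean(errs) for label, errs in sorted(errors.items())}


def weighted_mae(y_true, y_pred):
    """Class-balanced MAE: each class counts equally (assumed weighting)."""
    return mean(list(per_class_mae(y_true, y_pred).values()))


def weighted_accuracy(y_true, y_pred):
    """Class-balanced accuracy: mean of per-class hit rates (assumed weighting)."""
    hits = defaultdict(list)
    for t, p in zip(y_true, y_pred):
        hits[t].append(1.0 if t == p else 0.0)
    return mean([mean(h) for h in hits.values()])


def summarize_folds(scores):
    """Mean and sample standard deviation of one metric across folds."""
    return mean(scores), stdev(scores)


# Hypothetical toy data in which class 0 is over-represented.
y_true = [0, 0, 0, 0, 5, 10]
y_pred = [0, 0, 0, 2, 4, 7]

print(per_class_mae(y_true, y_pred))      # per-class errors, as in the "MAE, y = k" rows
print(weighted_mae(y_true, y_pred))       # balanced MAE across the three classes present
print(weighted_accuracy(y_true, y_pred))  # balanced accuracy across the three classes present

# Hypothetical scores of one metric from five folds.
print(summarize_folds([1.0, 2.0, 3.0, 4.0, 5.0]))
```

On this toy data the plain MAE is 1.0, while the class-balanced MAE is 1.5. The rare, poorly predicted classes pull the balanced value up, which is the effect a weighted metric is meant to expose under class imbalance.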