Table 3. Statistical significance (p-values*) for generalized linear models testing the association between assembly parameters (strategy, assembler, and sequencing depth) and a key metric of the 3C criterion in the assessment of genome pan-assembly of four bacterial models.
| Bacterial model | 3C metric (response variable, Y) | Predictor (factor, Xi): Strategy | Predictor (factor, Xi): Assembler | Predictor (factor, Xi): Depth for short reads | Predictor (factor, Xi): Depth for long reads |
|---|---|---|---|---|---|
| B. henselae | Contigs | 5.72E-07 | 0.0007 | 0.3921 | 0.0090 |
| | N50 | 3.27E-05 | 0.0023 | 0.3786 | 0.0124 |
| | Mismatches | 1.09E-05 | 0.0144 | 0.1267 | 0.3386 |
| | Indels | 2.95E-06 | 1.09E-05 | 0.1986 | 0.2241 |
| | CDS | 2.79E-06 | 0.0591 | 0.8925 | 0.1900 |
| | BUSCO score | 0.0011 | 2.12E-05 | 0.1749 | 0.3063 |
| E. coli | Contigs | 5.55E-06 | 0.0001 | 0.0729 | 0.0006 |
| | N50 | 4.59E-05 | 0.0008 | 0.0059 | 0.0003 |
| | Mismatches | 0.3936 | 0.0043 | 2.43E-06 | 0.1331 |
| | Indels | 1.44E-07 | 4.14E-06 | 0.1184 | 0.0432 |
| | CDS | 1.48E-09 | 0.0010 | 0.2439 | 0.0651 |
| | BUSCO score | 3.00E-08 | 0.0081 | 0.0001 | 0.9368 |
| P. aeruginosa | Contigs | 1.92E-07 | 4.58E-05 | 0.0224 | 0.5472 |
| | N50 | 7.19E-08 | 0.0009 | 0.1138 | 0.4283 |
| | Mismatches | 2.36E-08 | 3.74E-06 | 0.3527 | 0.4355 |
| | Indels | 1.33E-10 | 6.06E-07 | 0.2463 | 0.4424 |
| | CDS | 2.13E-07 | 9.47E-05 | 0.2224 | 0.9103 |
| | BUSCO score | 0.0004 | 3.79E-07 | 0.2384 | 0.2384 |
| X. fastidiosa | Contigs | 1.52E-08 | 4.15E-05 | 0.0318 | 0.0472 |
| | N50 | 2.14E-08 | 3.49E-06 | 0.0230 | 0.0069 |
| | Mismatches | 0.0424 | 7.59E-08 | 0.0002 | 0.1449 |
| | Indels | 5.14E-07 | 1.31E-06 | 0.0004 | 0.3025 |
| | CDS | 6.61E-08 | 0.0005 | 0.1496 | 0.4162 |
| | BUSCO score | 5.09E-06 | 0.0026 | 5.26E-10 | 0.6539 |
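
As a reading aid, the following minimal sketch shows one way the p-values in Table 3 could be screened factor by factor. It assumes a significance threshold of 0.05, which the table itself does not state. The names `FACTORS`, `B_HENSELAE`, `ALPHA`, and `significant_factors` are illustrative only, and only the B. henselae rows are copied in.

```python
# P-values copied from the B. henselae rows of Table 3.
# Column order: strategy, assembler, short-read depth, long-read depth.
FACTORS = ("strategy", "assembler", "depth_short", "depth_long")
B_HENSELAE = {
    "Contigs": (5.72e-07, 0.0007, 0.3921, 0.0090),
    "N50": (3.27e-05, 0.0023, 0.3786, 0.0124),
    "Mismatches": (1.09e-05, 0.0144, 0.1267, 0.3386),
    "Indels": (2.95e-06, 1.09e-05, 0.1986, 0.2241),
    "CDS": (2.79e-06, 0.0591, 0.8925, 0.1900),
    "BUSCO score": (0.0011, 2.12e-05, 0.1749, 0.3063),
}

ALPHA = 0.05  # assumed significance threshold; not stated in the table


def significant_factors(p_values, alpha=ALPHA):
    """Return the names of the factors whose p-value is below alpha."""
    return [name for name, p in zip(FACTORS, p_values) if p < alpha]


# Map each 3C metric to the factors that pass the assumed threshold.
summary = {metric: significant_factors(p) for metric, p in B_HENSELAE.items()}

for metric, factors in summary.items():
    print(f"{metric}: {', '.join(factors) or 'none'}")
```

Under this assumed threshold, strategy passes for every B. henselae metric, whereas short-read depth passes for none; a different threshold would change which factors are flagged.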