Table 1 Proposed standards and metrics for defining genome assembly quality
From: Towards complete and error-free genome assemblies of all vertebrate species
Quality category | Metric | Finished | VGP-2020 | VGP-2016 | B10K-2014 | This study |
---|---|---|---|---|---|---|
Notation | x.y.P.Q.C | c.c.Pc.Q60.C100 | 7.c.P6.Q50.C95 | 6.7.P5.Q40.C90 | 4.5.Q30 | |
Continuity | Contig NG50 (x) | = Chr. NG50 | >10 Mb | >1 Mb | >10 kb | 1–25 Mb |
Continuity | Scaffold NG50 (y) | = Chr. NG50 | = Chr. NG50 | >10 Mb | >100 kb | 23–480 Mb |
Continuity | Gaps per Gb | No gaps | <200 | <1,000 | <10,000 | 75–1,500 |
Structural accuracy | Reliable blocks | = Chr. NG50 | >10 Mb | >1 Mb | Not required | 2.3–40.2 Mb |
Structural accuracy | False duplications | 0% | <1% | <5% | <10% | 0.2–5.0% |
Structural accuracy | Curation | Conflicts resolved | Manual | Manual | Not required | Manual |
Base accuracy | Base pair QV (Q) | >60 | >50 | >40 | >30 | 39–43 |
Base accuracy | k-mer completeness | 100% complete | >95% | >90% | >80% | 87–98% |
Haplotype phasing | Phase block NG50 (P) | = Chr. NG50 | >1 Mb | >100 kb | Not required | 1.6 Mb<sup>a</sup> |
Functional completeness | Genes | >98% complete | >95% complete | >90% | >80% | 82–98% |
Functional completeness | Transcript mappability | >98% | >90% | >80% | >70% | 96% |
Chromosome status | Assigned (C) | 100% | >95% | >90% | Not required | 94.4–99.9% |
Chromosome status | Sex chromosomes | Right order, no gaps | Localized homo pairs | At least one shared (for example, X or Z) | Fragmented | At least one shared |
Chromosome status | Organelles (for example, MT) | One complete allele | One complete allele | Fragmented | Not required | One complete allele |
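
The x.y.P.Q.C notation in the table appears to encode each length metric as its order of magnitude in base pairs (for example, >10 Mb becomes 7 and >100 kb becomes 5), with "c" marking a value that reaches the chromosome NG50. The following Python sketch illustrates that reading. The function names, the `chr_ng50` parameter, the rounding down of QV and percent assigned, and the treatment of threshold values as inclusive are assumptions made for illustration, not part of the published standard.

```python
from typing import Optional


def order_of_magnitude(length_bp: int) -> int:
    """Return floor(log10(length_bp)) for a positive integer length in bp."""
    if length_bp < 1:
        raise ValueError("length must be a positive number of base pairs")
    # Counting digits avoids floating-point error at exact powers of ten.
    return len(str(length_bp)) - 1


def assembly_notation(
    contig_ng50: int,
    scaffold_ng50: int,
    qv: float,
    phase_ng50: Optional[int] = None,
    assigned_pct: Optional[float] = None,
    chr_ng50: Optional[int] = None,
) -> str:
    """Build an x.y.P.Q.C string from assembly metrics (lengths in bp).

    Assumptions (hypothetical, for illustration only):
    - a length at or above chr_ng50 is written as "c";
    - QV and percent assigned are rounded down to whole numbers;
    - optional metrics (P, C) are omitted when not supplied, as in "4.5.Q30".
    """

    def level(length_bp: int) -> str:
        if chr_ng50 is not None and length_bp >= chr_ng50:
            return "c"
        return str(order_of_magnitude(length_bp))

    parts = [level(contig_ng50), level(scaffold_ng50)]
    if phase_ng50 is not None:
        parts.append("P" + level(phase_ng50))
    parts.append(f"Q{int(qv)}")
    if assigned_pct is not None:
        parts.append(f"C{int(assigned_pct)}")
    return ".".join(parts)


# Threshold values from the VGP-2016 and B10K-2014 columns of the table.
print(assembly_notation(1_000_000, 10_000_000, 40,
                        phase_ng50=100_000, assigned_pct=90))
print(assembly_notation(10_000, 100_000, 30))
```

Supplying a hypothetical chromosome NG50 of 100 Mb together with the VGP-2020 thresholds (10 Mb contigs, chromosome-level scaffolds, 1 Mb phase blocks, QV 50, 95% assigned) gives the string listed in that column of the Notation row.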