Table 1 Genome assembly and annotation statistics of two Arabidopsis allotetraploids

From: Concerted genomic and epigenomic changes accompany stabilization of Arabidopsis allopolyploids

Sequence statistics                         A. suecica (T + A)    Allo738 (T + A)
Total length of contigs (bp)                272,218,784           268,958,675
Total length of assemblies (bp)             272,391,284           269,147,175
Length of largest 13 super-scaffolds (bp)   263,860,340           265,394,178
Percentage of anchored sequence (bp)        96.90%                98.60%
Number of contigs                           380                   470
Contig L50 (bp)                             6,555,646             6,799,294
Number of scaffolds                         269                   218
Scaffold L50 (bp)                           19,847,963            19,689,293
Total length of assemblies (A) (bp)         150,632,036           147,419,868
Total length of assemblies (T) (bp)         120,857,189           121,174,287
Percentage of repeat sequences (A)          26.1                  25.0
Percentage of repeat sequences (T)          22.9                  23.0
Number (%) of TEs (A)^a                     60,716 (20.9)         68,541 (23.8)
Number (%) of TEs (T)^a                     35,893 (21.4)         36,669 (21.6)
Number of genes (A)^a                       28,945 + 341          27,939 + 288
Number of genes (T)^a                       25,834 + 316          26,553 + 73
Complete BUSCOs (%) (A)                     1,602 (99.2)          1,548 (95.9)
Complete BUSCOs (%) (T)                     1,589 (98.5)          1,601 (99.2)

Note: Allo738 has carried the A. thaliana (Ler) and A. arenosa genomes for ten or more generations (refs. 11,27), whereas the T (A. thaliana equivalent) and A (A. arenosa equivalent) subgenomes have evolved in A. suecica (Asu) for 14,000–300,000 yr (refs. 13,28).
^a Excludes TEs and genes in unassembled scaffolds.
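The anchored percentages in the table can be checked against the other rows. The sketch below assumes, since the table does not state the definition, that "percentage anchored" is the length of the 13 largest super-scaffolds divided by the total assembly length. The dictionary layout and the function name `anchored_pct` are illustrative only; the figures are copied from the table.

```python
# Values copied from Table 1 (all lengths in bp).
ASSEMBLIES = {
    "A. suecica": {"total": 272_391_284, "top13": 263_860_340, "reported_pct": 96.9},
    "Allo738": {"total": 269_147_175, "top13": 265_394_178, "reported_pct": 98.6},
}


def anchored_pct(top13_bp: int, total_bp: int) -> float:
    """Assumed definition: share of the assembly held in the 13 largest super-scaffolds."""
    return 100.0 * top13_bp / total_bp


for name, row in ASSEMBLIES.items():
    pct = anchored_pct(row["top13"], row["total"])
    # Compare the recomputed value with the percentage reported in the table.
    print(f"{name}: computed {pct:.1f}%, reported {row['reported_pct']:.1f}%")
```

Under this assumed definition, the recomputed values agree with the reported percentages to one decimal place for both assemblies.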