Table 1 Statistics of different assemblies using ONT standard simplex (SUP basecalling model) and PacBio HiFi reads

From: Efficient near-telomere-to-telomere assembly of nanopore simplex reads

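| Phased | Sample | Dataset | Approach | Wall time (h) | T2T count (scaffold) | T2T count (contig) | Multicopy genes retained (%) | N50, scaffold (Mb) | N50, contig (Mb) | QV |
|---|---|---|---|---|---|---|---|---|---|---|
| Full (trio) | HG001 | ONT (66×) | Hifiasm | 15.0 | 16/17 | 11/11 | 92.2/92.2 | 154.8/134.8 | 127.1/109.3 | 49.1/49.2 |
| Full (trio) | HG001 | ONT (66×) | Verkko + HERRO | 126.7 | 3/5 | 0/2 | 91.8/92.6 | 97.4/96.2 | 59.0/62.5 | 51.4/51.6 |
| Full (trio) | HG001 | HiFi (25×) | Hifiasm | 2.0 | 0/0 | 0/0 | 93.6/92.3 | 48.4/52.1 | 21.2/21.5 | 50.2/50.0 |
| Full (trio) | HG002 | ONT (47×) | Hifiasm | 8.4 | 15/17 | 7/15 | 92.1/95.0 | 146.9/143.8 | 131.5/143.8 | 47.5/47.1 |
| Full (trio) | HG002 | ONT (47×) | Verkko + HERRO | 103.1 | 3/2 | 0/0 | 91.2/94.4 | 91.8/90.1 | 52.3/46.0 | 49.0/48.5 |
| Full (trio) | HG002 | HiFi (52×) | Hifiasm | 5.4 | 1/1 | 0/0 | 92.0/95.0 | 96.9/97.6 | 78.7/63.1 | 53.8/53.0 |
| Full (trio) | HG005 | ONT (56×) | Hifiasm | 10.8 | 11/12 | 8/4 | 92.7/94.0 | 134.4/134.8 | 107.3/99.5 | 48.3/48.4 |
| Full (trio) | HG005 | ONT (56×) | Verkko + HERRO | 86.7 | 1/2 | 0/0 | 92.7/93.7 | 94.2/94.0 | 51.7/58.7 | 50.6/50.7 |
| Full (trio) | HG005 | HiFi (50×) | Hifiasm | 5.6 | 2/2 | 1/0 | 93.3/94.6 | 104.4/104.4 | 83.7/92.5 | 53.6/53.4 |
| Partial (dual) | HG001 | ONT (66×) | Hifiasm | 14.9 | 12/12 | 10/8 | 89.3/93.6 | 135.4/134.3 | 133.6/101.3 | 49.1/49.1 |
| Partial (dual) | HG001 | HiFi (25×) | Hifiasm | 1.9 | 0/0 | 0/0 | 90.8/92.2 | 55.0/54.6 | 27.5/22.9 | 50.1/50.0 |
| Partial (dual) | HG002 | ONT (47×) | Hifiasm | 8.2 | 15/15 | 7/10 | 94.5/91.4 | 141.6/133.5 | 103.7/110.7 | 47.3/47.3 |
| Partial (dual) | HG002 | HiFi (52×) | Hifiasm | 5.3 | 1/0 | 0/0 | 91.4/92.3 | 98.3/95.3 | 71.0/77.9 | 53.0/53.8 |
| Partial (dual) | HG003 | ONT (71×) | Hifiasm | 18.6 | 16/13 | 12/8 | 96.0/94.3 | 135.7/133.4 | 135.7/120.2 | 48.2/47.7 |
| Partial (dual) | HG003 | HiFi (74×) | Hifiasm | 9.3 | 3/2 | 0/0 | 93.3/94.5 | 94.9/104.5 | 91.4/71.8 | 51.8/52.4 |
| Partial (dual) | HG004 | ONT (66×) | Hifiasm | 14.2 | 13/12 | 9/6 | 93.5/92.5 | 111.1/133.6 | 104.0/97.4 | 48.5/48.6 |
| Partial (dual) | HG004 | HiFi (61×) | Hifiasm | 6.6 | 2/0 | 1/0 | 91.4/94.5 | 95.6/92.0 | 64.1/55.9 | 53.2/54.3 |
| Partial (dual) | HG005 | ONT (56×) | Hifiasm | 10.6 | 12/8 | 9/0 | 92.6/91.7 | 134.4/133.6 | 118.2/88.3 | 48.4/48.4 |
| Partial (dual) | HG005 | HiFi (50×) | Hifiasm | 5.5 | 2/3 | 0/2 | 94.7/90.3 | 97.5/100.7 | 85.7/85.7 | 53.5/53.5 |
| Partial (dual) | HG006 | ONT (56×) | Hifiasm | 10.4 | 15/13 | 9/6 | 93.6/92.1 | 134.4/134.1 | 107.6/100.7 | 47.9/47.6 |
| Partial (dual) | HG006 | HiFi (73×) | Hifiasm | 9.4 | 5/3 | 2/1 | 91.5/94.9 | 96.6/96.4 | 87.2/71.3 | 55.6/56.0 |
| Partial (dual) | HG007 | ONT (66×) | Hifiasm | 12.9 | 12/13 | 10/7 | 94.2/94.6 | 135.4/135.0 | 134.9/102.6 | 47.5/47.8 |
| Partial (dual) | HG007 | HiFi (61×) | Hifiasm | 7.2 | 1/2 | 0/2 | 93.1/93.2 | 95.2/93.4 | 85.4/87.5 | 53.4/54.5 |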

  1. Each assembly comprises two sets of sequences, representing either paternal/maternal sequences (for trio-binning assemblies) or haplotype 1/haplotype 2 (for hifiasm dual assemblies). The two numbers in each cell give the metrics for the two sets, respectively. The sequencing read coverage of each dataset is given in the ‘Dataset’ column. The N50 of an assembly is the length of the shortest contig or scaffold in the smallest set of longest sequences that together cover at least 50% of the total assembly size. ‘Multicopy genes retained’ is the percentage of multicopy genes in CHM13 (genes with multiple mapping positions at ≥97% sequence identity) that are also multicopy in the assembly; these percentages were calculated by the asmgene method (ref. 38), with CHM13 (ref. 36) as the reference genome. The number of T2T contigs is the number of chromosomes reconstructed without assembly gaps (complete contigs), and the number of T2T scaffolds is the number of chromosomes reconstructed either with or without assembly gaps (complete scaffolds). QV scores, which reflect per-base assembly accuracy, were evaluated using the k-mer-based tool yak. Comprehensive k-mer-based evaluation results using both yak and Merqury (ref. 39) are provided in Extended Data Table 1.
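
The N50 definition in the footnote can be illustrated with a minimal Python sketch. The function names `n50` and `qv_to_error_rate` and the toy lengths are hypothetical and are not taken from the paper or from any assembler. The QV conversion assumes the usual Phred scaling (QV = -10 log10 of the per-base error rate), which the footnote does not state explicitly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    total assembly size.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def qv_to_error_rate(qv):
    """Convert a QV to a per-base error rate, assuming Phred scaling."""
    return 10 ** (-qv / 10)


# Toy lengths (arbitrary units), not data from Table 1.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))        # total is 300; 80 + 70 reaches 150, so N50 is 70
print(qv_to_error_rate(50.0))  # about one error per 100,000 bases under this assumption
```

Under this Phred assumption, a difference of about 3 QV units, as between some ONT and HiFi rows in the table, corresponds to roughly a twofold difference in estimated per-base error rate.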