Table 6 Statistical analyses of gene structure annotation of the P. h. homarus genome.

From: Chromosome-level genome assembly of scalloped spiny lobster Panulirus homarus homarus

 

|  | Gene set | Number | Average transcript length (bp) | Average CDS length (bp) | Average exons/gene | Average exon length (bp) | Average intron length (bp) |
|---|---|---|---|---|---|---|---|
| De novo | Augustus | 659,318 | 1,799.42 | 355.72 | 1.73 | 205.12 | 1,966.25 |
|  | GlimmerHMM | 132,893 | 8,676.42 | 796.74 | 2.51 | 317.35 | 5,216.16 |
|  | SNAP | 875,253 | 2,584.61 | 394.55 | 2.05 | 192.48 | 2,086.10 |
|  | Geneid | 267,413 | 6,799.47 | 519.05 | 2.37 | 219.04 | 4,585.32 |
|  | Genscan | 79,061 | 21,323.13 | 1,286.12 | 5.23 | 246.05 | 4,740.20 |
| Homolog | D. melanogaster | 8,545 | 14,666.23 | 979.91 | 3.9 | 251.47 | 4,724.85 |
|  | E. sinensis | 84,894 | 3,107.67 | 723.74 | 1.64 | 441.67 | 3,732.89 |
|  | H. americanus | 135,558 | 2,773.94 | 678.41 | 1.6 | 424.38 | 3,500.81 |
|  | L. vannamei | 70,184 | 3,806.52 | 946.38 | 1.92 | 492.93 | 3,109.10 |
|  | M. japonicus | 75,131 | 5,650.06 | 796.72 | 2.08 | 383.73 | 4,509.48 |
|  | P. chinensis | 178,660 | 2,395.58 | 562.63 | 1.48 | 379.61 | 3,801.87 |
|  | P. ornatus | 96,200 | 3,829.82 | 616.76 | 1.97 | 312.78 | 3,306.05 |
|  | P. trituberculatus | 37,464 | 8,318.17 | 791.42 | 2.66 | 297.57 | 4,535.37 |
| RNAseq | PASA | 59,275 | 55,389.17 | 4,422.12 | 7.22 | 612.15 | 8,188.90 |
|  | Cufflinks | 118,883 | 11,285.66 | 856.62 | 2.77 | 309.37 | 5,895.81 |
|  | EVM | 118,883 | 11,285.66 | 856.62 | 2.77 | 309.37 | 5,895.81 |
|  | Pasa-update | 118,774 | 11,433.75 | 858.88 | 2.77 | 309.67 | 5,962.76 |
|  | Final set | 25,580 | 31,472.77 | 1,613.73 | 5.78 | 279.37 | 6,251.44 |

  1. De novo: Gene structure predictions performed using Augustus, GlimmerHMM, SNAP, Geneid, and Genscan software.
  2. Homolog: Gene annotations identified through comparison with orthologs from related species.
  3. RNAseq: Gene structures annotated based on transcriptome sequencing (RNA-seq) data.
  4. PASA: Gene structures obtained from the assembly of RNA-seq transcripts using Trinity and integrated with transcriptome data.
  5. Pasa-update: Corrections to gene structures based on PASA annotations.
  6. EVM: Integrated gene annotations combining de novo, homolog-based, and RNA-seq annotation results using EvidenceModeler (EVM).
  7. Final set: The finalized set of functionally annotated genes after integrating and refining all annotation methods.
  8. Transcript numbers indicate the identified transcript variants from each annotation strategy.
  9. Average transcript length, CDS length, exon number, and exon/intron lengths are based on the corresponding gene prediction methods listed.
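The averages in each row can be cross-checked against one another. The sketch below assumes, which the table itself does not state, that the reported exon lengths cover coding exons only, so that average CDS length should roughly equal average exons/gene multiplied by average exon length. The function name `cds_relative_gap` and the choice of four sample rows are illustrative, not part of the original analysis.

```python
# Sample rows copied from Table 6:
# (gene set, average CDS length in bp, average exons/gene, average exon length in bp)
ROWS = [
    ("Augustus", 355.72, 1.73, 205.12),
    ("Genscan", 1286.12, 5.23, 246.05),
    ("PASA", 4422.12, 7.22, 612.15),
    ("Final set", 1613.73, 5.78, 279.37),
]


def cds_relative_gap(cds_len, exons_per_gene, exon_len):
    """Relative gap between the reported CDS length and exons/gene * exon length.

    Assumption (not stated in the table): exon lengths refer to coding exons,
    so the product should reproduce the CDS length up to rounding.
    """
    estimate = exons_per_gene * exon_len
    return abs(estimate - cds_len) / cds_len


if __name__ == "__main__":
    for name, cds, exons, exon_len in ROWS:
        gap = cds_relative_gap(cds, exons, exon_len)
        print(f"{name}: relative gap {gap:.2%}")
```

For the four rows sampled here the gap stays below 1%, which is what rounding of the published averages would produce if the assumption holds. A markedly larger gap in some other row would suggest that its exon statistics include untranslated regions or that the row deserves a second look.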