Table 1 Simulated data information. Gene sequences (taken from the corresponding reference genes in Ensembl) and the PacBio CCS error rates used for simulation. We simulated multigene families using these gene sequences at the root. We refer to each resulting gene family by the name of its reference gene, sometimes dropping the last modifier (e.g. TSPY instead of TSPY13P).
From: Deciphering highly similar multigene family transcripts from Iso-Seq data with IsoCon
Reference gene | Sum of exon lengths (nt) | Number of exons | Overall simulation error rate (%) | Median errors per simulated transcript | Median no. of passes per simulated transcript |
---|---|---|---|---|---|
TSPY13P | 914 | 6 | 0.5 | 2 | 18 |
HSFY2 | 2668 | 6 | 2.6 | 41 | 6 |
DAZ2 | 5904 | 28 | 6.1 | 276 | 2 |