Table 2 Comparison of selected technologies with STRaM
Method | Data Collection Method | Data Processing Pathway | Error Check | Evaluation System | Advantage (A) / Disadvantage (D) |
---|---|---|---|---|---|
CE | Capillary electrophoresis | CE instrument and software | No | Yes: similarity index (SI) | A: Gold standard for cell line ID; high resolution; established commercial automation; reference databases and search tools. D: Requires a CE instrument and software; cannot capture base-change information in the sequence. |
STR-realigner, ExpansionHunter, STRetch, etc. | WGS or TAS | STR identification | No | No | A: Detects all types of STR in the input sequence; reports detailed read information (position, sequence, length, alignment score, etc.). D: Identification may be interrupted by changes in repeat structure; requires large memory, substantial processing power, and long processing time for large genomic data. |
STRinNGS, popSTR, GangSTR, etc. | WGS or TAS | STR flanking alignment | No | No | A: Similar to CE; fast analysis. D: Requires constructing an STR flanking-sequence database for a known set of STR loci; anomalies in STR flanking sequences terminate the analysis. |
STR-FM, toaSTR, SNiPSTR, etc. | WGS or TAS | Serial execution: STR identification and STR flanking alignment analyses | No | No | A: Rapid STR ID and positioning. D: Requires large memory, substantial processing power, and long processing time for large genomic data; the STR identification and STR flanking analyses operate independently of each other. |
STRaM | TAS | Parallel execution: STR identification, STR flanking alignment, and edited/mutant sequence analyses | Yes: error checks for the STR identification and STR flanking alignment analyses | Yes: similarity index (SI), purity index (PI), editing/mutation index (EMI) | A: STR identification and STR flanking analyses are compared for error checking; facilitates database construction for known STR loci; good integration performance. D: Requires constructing a flanking-sequence database for a known set of STR loci; restricting analysis to known STR locus sets hinders the search for and study of unknown loci; analysis of large genome databases requires large memory and long running time. |
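
The error check listed for STRaM compares the result of STR identification with the result of STR flanking alignment. The Python sketch below illustrates that idea on a single read; it is a toy under stated assumptions, not STRaM's implementation. The function names, the exact-match flank search, and the example motif and flank sequences are all hypothetical and chosen only for illustration.

```python
import re


def count_by_repeat_search(read, motif):
    """STR identification (toy): longest uninterrupted run of `motif` in the read."""
    runs = re.findall(f"(?:{re.escape(motif)})+", read)
    return max((len(run) // len(motif) for run in runs), default=0)


def count_by_flanks(read, left, right, motif):
    """Flanking alignment (toy, exact matching only): size of the region
    between the two flanks, expressed in motif units. Returns None if a
    flank is not found."""
    start = read.find(left)
    if start == -1:
        return None
    start += len(left)
    end = read.find(right, start)
    if end == -1:
        return None
    return (end - start) // len(motif)


def cross_check(read, motif, left, right):
    """Run both analyses and flag the read when they disagree."""
    by_repeat = count_by_repeat_search(read, motif)
    by_flank = count_by_flanks(read, left, right, motif)
    return {
        "repeat_search": by_repeat,
        "flank_based": by_flank,
        "consistent": by_flank is not None and by_repeat == by_flank,
    }


# Hypothetical flanks and motif, for illustration only.
LEFT, RIGHT, MOTIF = "TTGCA", "CCTAGG", "GATA"

clean_read = "ACGTTGCA" + MOTIF * 5 + "CCTAGGTT"
interrupted_read = "ACGTTGCA" + MOTIF * 2 + "GACA" + MOTIF * 2 + "CCTAGGTT"

clean_result = cross_check(clean_read, MOTIF, LEFT, RIGHT)
interrupted_result = cross_check(interrupted_read, MOTIF, LEFT, RIGHT)
print(clean_result)
print(interrupted_result)
```

In the interrupted read, a single base change splits the repeat run, so the repeat search and the flank-based estimate disagree and the read is flagged. This mirrors the disadvantages listed for the single-method tools: each analysis alone would either stop at the interruption or depend entirely on intact flanks.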