Table 2 Comparison of selected technologies with STRaM

From: STRaM: A genetic framework for improved cell product provenance for research and clinical translations

| Method | Data Collection Method | Data Processing Pathway | Error Check | Evaluation System | Advantage (A) / Disadvantage (D) |
| --- | --- | --- | --- | --- | --- |
| CE | Capillary electrophoresis | CE instrument and software | No | Yes: Similarity index (SI) | A: Gold standard for cell line identification; high resolution; commercial automation know-how; reference databases and search tools. D: Requires a CE instrument and software; cannot capture base-change information in the sequence. |
| STR-realigner, Expansion Hunter, STRetch, et al. | WGS or TAS | STR identification | No | No | A: Detects all types of STR in the input sequence; displays detailed read information (position, sequence, length, alignment score, etc.). D: Identification may be interrupted by changes in repeat structure; requires large memory, computing power, and long processing times for large genomic data. |
| STRinNGS, popSTR, GangSTR, et al. | WGS or TAS | STR flanking alignment | No | No | A: Similar to CE; fast analysis. D: Requires constructing a flanking sequence database for a set of known STR loci; anomalies in STR flanking sequences terminate the analysis. |
| STR-FM, toaSTR, SNiPSTR, et al. | WGS or TAS | Serial execution: STR identification and STR flanking alignment analyses | No | No | A: Rapid STR identification and positioning. D: Requires large memory, computing power, and long processing times for large genomic data; STR identification and STR flanking analyses operate independently of each other. |
| STRaM | TAS | Parallel execution: STR identification, STR flanking alignment, and edited/mutant sequence analyses | Yes: error checks for STR identification and STR flanking alignment analyses | Yes: Similarity index (SI), Purity index (PI), Editing/mutation index (EMI) | A: STR identification and STR flanking analyses are compared for error checks; facilitates database construction of known STR loci; good integration performance. D: Requires constructing a flanking sequence database for a set of known STR loci; restricting the analysis to known STR locus sets is unfavorable for searching for and studying unknown loci; large genome database analyses require large memory and long running times. |
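
The table contrasts two processing pathways, repeat scanning (STR identification) and STR flanking alignment, and notes that comparing the two can serve as an error check. The Python sketch below is a minimal illustration of that idea only. It is not STRaM's implementation or that of any tool listed above. The function names, the toy motif `GATA`, the flank sequences, and the status labels are all hypothetical, and exact string matching stands in for real read alignment.

```python
import re


def call_by_repeat_scan(read, motif, min_repeats=3):
    """Repeat-scanning pathway (hypothetical helper).

    Returns (start, repeat_count) for the longest uninterrupted run of
    `motif` with at least `min_repeats` copies, or None if there is none.
    """
    pattern = re.compile(f"(?:{re.escape(motif)}){{{min_repeats},}}")
    best = None
    for match in pattern.finditer(read):
        count = (match.end() - match.start()) // len(motif)
        if best is None or count > best[1]:
            best = (match.start(), count)
    return best


def call_by_flanks(read, left_flank, right_flank, motif):
    """Flanking pathway (hypothetical helper).

    Locates both known flanks by exact match and counts non-overlapping
    copies of `motif` between them. Returns (start, repeat_count), or None
    if either flank is missing, mirroring the table's note that flank
    anomalies stop a flank-only analysis.
    """
    left = read.find(left_flank)
    if left == -1:
        return None
    start = left + len(left_flank)
    right = read.find(right_flank, start)
    if right == -1:
        return None
    return (start, read[start:right].count(motif))


def cross_check(read, motif, left_flank, right_flank):
    """Compare the two pathways; the status labels are illustrative only."""
    scan = call_by_repeat_scan(read, motif)
    flank = call_by_flanks(read, left_flank, right_flank, motif)
    if scan is None or flank is None:
        status = "incomplete"
    elif scan[1] == flank[1]:
        status = "concordant"
    else:
        status = "discordant"
    return {"scan": scan, "flank": flank, "status": status}


# Toy reads built from assumed flanks; none of this is real locus data.
LEFT, RIGHT, MOTIF = "ACGTTC", "TTGCAA", "GATA"
intact_read = "CC" + LEFT + MOTIF * 5 + RIGHT + "GG"
interrupted_read = "CC" + LEFT + MOTIF * 3 + "GACA" + MOTIF * 2 + RIGHT

if __name__ == "__main__":
    print(cross_check(intact_read, MOTIF, LEFT, RIGHT))
    print(cross_check(interrupted_read, MOTIF, LEFT, RIGHT))
```

For the intact toy read, both pathways report five repeats and the check is concordant. For the interrupted read, the repeat scan stops at the `GACA` interruption and reports three repeats, while the flank-based count reports five, so the check is discordant. This is the kind of disagreement a combined pipeline could flag for review.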