Table 1. Technologies for tandem repeat (TR) sequence analysis based on NGS data
Software | Bioinformatics Pipeline | Features | Ref. |
|---|---|---|---|
lobSTR | Flags STR-containing reads, then maps their flanking regions to the reference to determine the STR position and length. | A rapid and accurate algorithm for STR profiling. | |
RepeatSeq | Maps reads to the reference sequence and discards reads that do not span the repeat. | A comprehensive genotyping software package for calling microsatellite repeat genotypes. | |
STRait Razor | Uses the AGREP function, an approximate string search tool, to count the number of nucleotides in each repeat sequence; each flanking-region query is 12 bases long. | A Perl script for identifying alleles at forensic STR loci. | |
PacmonSTR | Aligns reads to the reference genome. Estimates TRs with a pair hidden Markov model and predicts genotypes and boundaries for compound structural variants within or around a TR interval. | A reference-based probabilistic approach to identify TR regions and estimate the number of TR elements. | |
STRViper | Aligns reads to a reference sequence to determine STR length; requires paired-end reads aligned to the flanking regions. | A Bayesian method to estimate repeat-length variations. | |
TRhist | Retrieves STRs with an approximate algorithm, then maps them to a unique position to locate them in the genome. | An ab initio procedure for sensing, locating and sequencing STRs that are significantly expanded. | |
VNTRseek | Maps reads to reference TRs for calling, then maps them to reference flanking sequences for confirmation. | Software that identifies internal copy number variation at minisatellite TR loci. | |
TSSV | Aligns a pair of flanking markers at predefined loci via semiglobal alignment (25 bp). | An efficient and sensitive tool to specifically profile all allelic variants present in targeted STR loci. | |
STR-FM | Uses a string comparison algorithm for STR analysis; 20 bp flanking sequences are mapped to the reference genome. | A computational pipeline that detects the full spectrum of STR alleles. | |
CoalescentSTR | Aligns reads to the reference genome to determine STR length; reads must span the STR region. | A new statistical model that estimates repeat numbers. | |
STR-realigner | Realigns reads to the reference genome with a new dynamic programming-based method; reads must span the STR region. | A new realignment method for STR regions. | |
STRinNGS | Compares reads with the reference sequences for allele calling and to detect variation in the flanking regions. | A Python script for the analysis of STR regions. | |
PopSTR | Maps reads to the reference genome; the flanking regions are aligned to the reference genome (> 4 bases). | A method capable of studying microsatellite (STR) variation. | |
STRait Razor V3.0 | Performs fuzzy string matching for motif length. | A novel indexing strategy used to perform fuzzy string matching of anchor sequences. | |
HipSTR | Selects variation in STRs and identifies sequence variations. | A novel haplotype-based method for genotyping and phasing STRs. | |
TREDPARSE | Builds a series of STR-region references and aligns reads to them to determine repeat size. Requires at least 9 bp when matching flanking sequences. | A software package that incorporates various cues from read alignment and paired-end distance distribution, as well as a sequence stutter model, in a probabilistic framework to infer repeat sizes for genetic loci. | |
ExpansionHunter | Determines repeat size from spanning reads and identifies in-repeat reads (IRRs) and off-target regions. The flanking sequences are aligned to the reference. | A software package to genotype STRs. | |
STRetch | Generates a custom reference genome to determine STR length and position. | A new genome-wide method to scan for STR expansions. | |
exSTRa | Maps reads to reference loci. | A method to identify repeat expansions. | |
toaSTR | Uses a fast k-mer-based fuzzy search for clustered observations. Reads must span the complete repeat region plus a minimum of 30 nucleotides upstream and downstream of it. | A web application to help forensic experts work with MPS data in a simple and efficient way. | |
GangSTR | Uses Tandem Repeats Finder to establish a reference STR library. Determines STR length and off-target reads; enclosing reads must fully span the TR plus a minimum of 20 bp on either end. | A novel algorithm for genome-wide genotyping of both short and expanded TRs. | |
STRinNGS v2.0 | Reads the reference file to obtain information on all loci. Extracts the reference flanking sequences to identify variants. | An updated version (2.0) of the STR analysis tool. | |
SuperSTR | Uses a fast, compression-based estimator to identify reads containing repeat motifs. | An ultrafast, alignment-free method for efficient screening and identification of known and potential disease-associated STRs. | |
STRling | Scans candidate reads for k-mer content. When one read of a pair maps well to the reference genome and its mate has high STR content, the mapping position of the well-mapped read is used to reposition the STR read. | Uses k-mer counting to detect STR expansions. | |
SNiPSTR | Reads are first aligned to reference sequences to determine length and to the reference genome to identify flanking variants. STR motifs are obtained after flanking alignment. | A cost-efficient, shallow-sequencing NGS assay combined with a dedicated bioinformatics pipeline. | |
STRaM (current work) | First pipeline: STRs are identified with STR-FM to obtain genomic coordinates, length, read counts, etc. Second pipeline: reads are mapped to the reference sequence to obtain genomic coordinates, length, read counts, etc. STR information from the two pipelines is compared for error checking, and a third pipeline performs targeted analysis. | An integrated and cross-checked workflow for STR analysis and targeted sequences combined with an evaluation system for sample monitoring. | - |
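
Several tools in Table 1 require a read to span the whole repeat together with its flanking sequences before a repeat length is called. The following minimal Python sketch illustrates that shared idea only; it does not reproduce any listed tool. The function name `spanning_repeat_count`, the example flanks and the use of exact string matching are assumptions made for illustration, whereas the tools above generally rely on approximate matching or alignment.

```python
import re
from typing import Optional


def spanning_repeat_count(read: str, left_flank: str,
                          right_flank: str, motif: str) -> Optional[int]:
    """Count motif copies between two flanks in a spanning read.

    Hypothetical helper: exact flank matching is an illustrative
    simplification, not the method of any tool in Table 1.
    Returns None when the read does not contain both flanks.
    """
    start = read.find(left_flank)
    if start == -1:
        return None  # left flank missing: read does not span the repeat
    region_start = start + len(left_flank)
    end = read.find(right_flank, region_start)
    if end == -1:
        return None  # right flank missing: read does not span the repeat
    region = read[region_start:end]
    # Take the longest uninterrupted run of the motif inside the region.
    runs = re.findall(f"(?:{re.escape(motif)})+", region)
    if not runs:
        return 0
    return max(len(run) for run in runs) // len(motif)


# Made-up example: seven GATA units between two 8 bp flanks.
read = "ACGTTGCA" + "GATA" * 7 + "TTCAGGCT"
print(spanning_repeat_count(read, "ACGTTGCA", "TTCAGGCT", "GATA"))  # 7
```

A read that lacks either flank returns `None` and would be discarded, which mirrors the "reads must span the repeat" requirement noted for several entries in the table.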