Fig. 1: Overview of targeted NGS panel workflow and deep learning model (DLM) for predicting sequencing depth.
From: A deep learning model for predicting next-generation sequencing depth from DNA sequence

a DNA probes (P) are designed to hybridize to target sequences (T) for subsequent solid-phase separation and enrichment. In NGS, each read corresponds to a randomly sampled DNA molecule from the enriched library, and NGS reads are bioinformatically aligned to the probe sequences using standard algorithms and software. See Supplementary Note 1 for further details on the NGS experimental and bioinformatic workflow. b Overview of the NGS read depth prediction method. c The DLM consists of 4 recurrent neural networks implemented as gated recurrent units (GRUs) and 1 feed-forward neural network (FFNN). Each GRU has 128 internal state nodes. The final GRU node values for the target DNA sequence T and for the probe DNA sequence P read from \(5^{\prime}\) to \(3^{\prime}\) (\({\bf{H}}_{T}^{5^{\prime}\to 3^{\prime}}\) and \({\bf{H}}_{P}^{5^{\prime}\to 3^{\prime}}\)) are summed; likewise, the values read from \(3^{\prime}\) to \(5^{\prime}\) (\({\bf{H}}_{T}^{3^{\prime}\to 5^{\prime}}\) and \({\bf{H}}_{P}^{3^{\prime}\to 5^{\prime}}\)) are summed. These two summed hidden-state vectors are then concatenated into a vector of 256 node values, which serves as input to the FFNN. In addition, 4 global parameters also serve as input to the FFNN, bringing the total number of inputs to 260. The output is a single node corresponding to the predicted log read depth, \(\log_{10}(\text{depth})\), for the DNA target sequence.
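
The dimension bookkeeping in panel c (128 + 128 summed-state values plus 4 global parameters, giving 260 FFNN inputs) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. It assumes one GRU per sequence and reading direction, one-hot encoded bases, untrained random weights without biases, and a single linear output node standing in for the FFNN, whose hidden layers the caption does not specify. The names `GRU`, `ffnn_input`, and `predict_log10_depth`, the example sequences, and the global parameter values are all hypothetical.

```python
import math
import random

HIDDEN = 128    # internal state nodes per GRU (from the caption)
N_GLOBAL = 4    # global parameters fed to the FFNN (from the caption)
BASES = "ACGT"  # assumed one-hot alphabet


def one_hot(seq):
    """Encode a DNA string as a list of 4-element one-hot vectors."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class GRU:
    """Minimal GRU with untrained random weights (biases omitted)."""

    def __init__(self, n_in, n_hidden, rng):
        self.n_hidden = n_hidden
        # One weight matrix per gate, acting on the concatenation [input, hidden].
        self.w_update = rand_matrix(n_hidden, n_in + n_hidden, rng)
        self.w_reset = rand_matrix(n_hidden, n_in + n_hidden, rng)
        self.w_cand = rand_matrix(n_hidden, n_in + n_hidden, rng)

    def final_state(self, inputs):
        """Run over the whole sequence and return the last hidden state."""
        h = [0.0] * self.n_hidden
        for x in inputs:
            xh = x + h  # list concatenation, length n_in + n_hidden
            z = [sigmoid(a) for a in matvec(self.w_update, xh)]
            r = [sigmoid(a) for a in matvec(self.w_reset, xh)]
            x_rh = x + [ri * hi for ri, hi in zip(r, h)]
            cand = [math.tanh(a) for a in matvec(self.w_cand, x_rh)]
            h = [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]
        return h


def ffnn_input(target, probe, global_params, grus):
    """Build the FFNN input vector described in panel c."""
    t_fwd, t_rev, p_fwd, p_rev = grus
    t, p = one_hot(target), one_hot(probe)
    # Sum target and probe final states for each reading direction.
    h_fwd = [a + b for a, b in zip(t_fwd.final_state(t), p_fwd.final_state(p))]
    h_rev = [a + b for a, b in
             zip(t_rev.final_state(t[::-1]), p_rev.final_state(p[::-1]))]
    # Concatenate the two sums (256 values) and append the global parameters.
    return h_fwd + h_rev + list(global_params)


def predict_log10_depth(x, w_out, b_out=0.0):
    """Stand-in for the FFNN: a single linear output node."""
    return sum(w * xi for w, xi in zip(w_out, x)) + b_out


rng = random.Random(0)
grus = [GRU(len(BASES), HIDDEN, rng) for _ in range(4)]
global_params = [0.1, 0.2, 0.3, 0.4]  # placeholder values
x = ffnn_input("ACGTTGCA", "TGCAACGT", global_params, grus)
w_out = [rng.uniform(-0.1, 0.1) for _ in x]
y = predict_log10_depth(x, w_out)
print(len(x))
```

Because the target and probe states are summed rather than concatenated, the FFNN input stays at 2 × 128 + 4 = 260 values regardless of sequence length. The untrained output `y` has no biological meaning.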