Fig. 1: Overview of k-mer pattern partitioning (kmerPaPa) model. | Nature Communications


From: A method to build extended sequence context models of point mutations and indels


Input data consists of a list of observed mutations (here C→G mutations) and a BED file of regions with sufficient WGS coverage to detect mutations. From these, we calculate a table giving, for each possible k-mer, the number of times it is observed with the central base mutated and unmutated. The k-mers are then grouped using a set of IUPAC patterns such that each k-mer is matched by one and only one pattern. Out of the exponentially many possible pattern partitions, the one that minimizes the loss function is chosen.
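
The partition requirement in the caption (each k-mer matched by exactly one IUPAC pattern) can be illustrated with a minimal Python sketch. This is not the kmerPaPa implementation and does not perform the loss minimization; the function names `matches`, `is_partition`, and `pattern_counts`, the example patterns, and the toy counts are all hypothetical and chosen only for illustration. It assumes an odd k and a fixed central base.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def matches(pattern, kmer):
    """True if every base of kmer is allowed by the IUPAC code at that position."""
    return len(pattern) == len(kmer) and all(
        base in IUPAC[code] for code, base in zip(pattern, kmer)
    )

def is_partition(patterns, k, center="C"):
    """True if every k-mer with the given central base matches exactly one pattern.

    Assumes k is odd so that the central position is well defined.
    """
    half = k // 2
    for flank in product("ACGT", repeat=k - 1):
        kmer = "".join(flank[:half]) + center + "".join(flank[half:])
        if sum(matches(p, kmer) for p in patterns) != 1:
            return False
    return True

def pattern_counts(patterns, table):
    """Sum the (mutated, unmutated) counts of the k-mers matched by each pattern."""
    totals = {p: [0, 0] for p in patterns}
    for kmer, (mutated, unmutated) in table.items():
        for p in patterns:
            if matches(p, kmer):
                totals[p][0] += mutated
                totals[p][1] += unmutated
    return totals

# Toy 3-mer table (made-up counts): k-mer -> (mutated, unmutated).
toy_table = {"ACG": (3, 7), "TCG": (1, 9), "ACA": (2, 8)}

# "NCG" and "NCH" split the 3-mers centred on C by the following base (G vs. not G).
print(is_partition(["NCG", "NCH"], k=3))
print(pattern_counts(["NCG", "NCH"], toy_table))
```

In this sketch, a candidate set of patterns is first checked for being a valid partition, and the per-pattern totals are what a loss function would then be evaluated on; the search over partitions itself is deliberately left out.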
