Fig. 7: Phasing result analysis in lrRNA-seq. | Nature Communications

From: Clair3-RNA: a deep learning-based small variant caller for long-read RNA sequencing data

a Workflow of Clair3-RNA for model training and inference with phasing information. During training, variants were phased using the first round of called variants together with the GIAB truth variants, and the alignments were haplotagged based on these phased variants. Alignments are color-coded: pink for haplotype 1, blue for haplotype 2, and gray for unknown haplotype. Circles and rhombuses in the alignments represent SNPs and indels, respectively. Twelve phasing features were concatenated with the original eighteen features and input into the neural network for model training. During inference, heterozygous variants from the first round of Clair3-RNA variant calling were used for variant phasing. b Read haplotagging statistics for various samples across the PacBio and ONT platforms. The y axis shows the percentage of haplotagged reads, and the number displayed in each histogram is the total count of haplotagged reads within the respective subset. c Read haplotagging statistics, stratified by chromosome, for the PacBio MAS-Seq HG004 and ONT dRNA004 HG004 samples. d Variant phasing statistics for truth variants analyzed with WhatsHap across various PacBio and ONT samples. e Precision-recall curves of SNP performance across various samples, with and without phasing information in the neural network. Source data are provided as a Source Data file.
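
As an illustration of the quantity plotted in panels b and c, the percentage of haplotagged reads can be computed from per-read haplotype labels. The following is a minimal sketch, not the authors' code: the function name `haplotag_summary` and the encoding of labels as `1`, `2`, or `None` (unknown) are assumptions made for this example.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Union


def haplotag_summary(
    haplotypes: Iterable[Optional[int]],
) -> Dict[str, Union[int, float]]:
    """Summarize per-read haplotype labels.

    Assumed encoding (hypothetical): 1 = haplotype 1, 2 = haplotype 2,
    None = read could not be assigned to a haplotype.
    """
    counts = Counter(haplotypes)
    total = sum(counts.values())
    tagged = counts[1] + counts[2]
    # Guard against an empty input to avoid division by zero.
    pct = 100.0 * tagged / total if total else 0.0
    return {
        "total_reads": total,
        "haplotagged_reads": tagged,
        "haplotagged_pct": pct,
        "hap1_reads": counts[1],
        "hap2_reads": counts[2],
        "unknown_reads": counts[None],
    }


if __name__ == "__main__":
    # Toy labels for four reads; real labels would come from haplotagged alignments.
    print(haplotag_summary([1, 2, None, 1]))
```

Stratifying by chromosome, as in panel c, would amount to calling the same function once per chromosome on the labels of the reads aligned to it.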
