Fig. 4: Analysis of read-level predictions on site-level results. | Nature Communications

From: Rockfish: A transformer-based model for accurate 5-methylcytosine prediction from nanopore sequencing

a Complementary cumulative distribution function (CCDF) of the strand-specific calling coverage for each ONT-based method and for whole genome bisulfite sequencing (WGBS) on the R10.4.1 NA12878 dataset. b Distribution, across genomic contexts, of the proportion of positions not evaluated due to low coverage, distinguishing positions whose overall coverage is below the threshold (labeled Remora) from positions whose overall coverage is above the threshold but whose valid coverage falls below it because Remora filters uncertain calls (labeled Remora filter). c Distribution, across genomic contexts, of the proportion of high-filtering positions, that is, positions with sufficient valid coverage but more filtered calls than expected (>10%). d Site-level evaluation in CpG-poor and CpG-rich promoter regions for all positions and for high-filtering positions (HFP). Source data are provided as a Source Data file.
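
The quantities in panels a to c can be illustrated with a minimal sketch; this is not the authors' pipeline. It assumes the CCDF is taken as P(coverage >= x), and the function names, the category labels, and the coverage threshold `min_cov=5` are hypothetical. Only the 10% filtered-call cutoff comes from the caption.

```python
from bisect import bisect_left


def ccdf(coverages):
    """Return (x, P(coverage >= x)) for each distinct coverage value x.

    Assumption: the ">=" convention; some definitions use ">" instead.
    """
    ordered = sorted(coverages)
    n = len(ordered)
    points = []
    for x in sorted(set(ordered)):
        # Positions with coverage >= x are those from the first x onward.
        at_least = n - bisect_left(ordered, x)
        points.append((x, at_least / n))
    return points


def classify(total_cov, valid_cov, min_cov=5, max_filtered_frac=0.10):
    """Assign one position to a category (labels are illustrative).

    total_cov: all reads covering the position on one strand.
    valid_cov: reads remaining after uncertain calls are filtered out.
    min_cov:   hypothetical coverage threshold, not taken from the caption.
    """
    if total_cov < min_cov:
        return "low_overall_coverage"   # panel b, "Remora"
    if valid_cov < min_cov:
        return "low_valid_coverage"     # panel b, "Remora filter"
    filtered_frac = (total_cov - valid_cov) / total_cov
    if filtered_frac > max_filtered_frac:
        return "high_filtering"         # panel c, HFP
    return "evaluated"
```

For example, a position with 10 reads of which 8 survive filtering has 20% filtered calls and would count as a high-filtering position under these assumptions, while one with 10 reads and only 4 valid calls would be excluded for low valid coverage.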
