Fig. 5: Analysis of correlation between ONT-based tools and WGBS.

a Table of Pearson’s correlation between the nanopore signal-based methods and whole-genome bisulfite sequencing (WGBS). Only sites with coverage higher than 5x for each tool were included. Both R9.4.1 Rockfish models outperform Megalodon and Nanopolish on every evaluation dataset. R10.4.1 Rockfish outperforms Remora on the Mouse dataset, but slightly underperforms on the NA12878 dataset. The highest correlation scores are shown in bold. b–d 2D histograms of the correlation between b the two Rockfish models, c the base model and WGBS, and d the small model and WGBS for the R9.4.1 NA12878 dataset. The models exhibit a high correlation with each other and with WGBS. Each axis is divided into 20 bins, and counts are shown on a log scale. e Methylation frequency for every ONT-based tool and WGBS with respect to the binned distance from transcription start sites (TSSs) on the R9.4.1 NA12878 dataset. Both R9.4.1 Rockfish models show high consistency with the other ONT-based tools and, more importantly, with WGBS. f The distribution of the absolute difference between every ONT-based tool and WGBS. R9.4.1 Rockfish models reduce the absolute difference between ONT and WGBS. Data (n = 81) in the box plot are presented as follows: the centre line indicates the median, the bounds of the box represent the first and third quartiles (Q1 and Q3), and the whiskers extend to the minimum and maximum values within 1.5 times the interquartile range (IQR) of Q1 and Q3. Outliers beyond this range are plotted individually. Source data are provided as a Source Data file.
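
As an illustration of the statistics named in panels a and f, the following is a minimal Python sketch, not the authors' analysis code. It computes Pearson's correlation over sites whose coverage exceeds 5x in both inputs, and the box-plot summary with whiskers at the most extreme values within 1.5 × IQR of the box. The function names `pearson_on_covered_sites` and `box_plot_summary`, the per-site list inputs, and the quartile interpolation method are all assumptions made for this example.

```python
import math
import statistics


def pearson_on_covered_sites(freq_a, cov_a, freq_b, cov_b, min_cov=5):
    """Pearson's r over sites whose coverage is strictly above min_cov in both inputs.

    Hypothetical helper: inputs are parallel per-site lists of methylation
    frequency and read coverage for two methods (e.g. an ONT tool and WGBS).
    """
    pairs = [
        (fa, fb)
        for fa, ca, fb, cb in zip(freq_a, cov_a, freq_b, cov_b)
        if ca > min_cov and cb > min_cov
    ]
    if len(pairs) < 2:
        raise ValueError("need at least two sites passing the coverage filter")
    xs, ys = zip(*pairs)
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov_xy / math.sqrt(var_x * var_y)


def box_plot_summary(values):
    """Median, quartiles, whiskers and outliers under the 1.5 x IQR convention.

    The 'inclusive' quartile method is an assumption; plotting libraries
    differ slightly in how they interpolate quartiles.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Whiskers end at actual data points, not at the fences themselves.
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }
```

In this sketch a site with coverage of exactly 5 is excluded, following the "higher than 5x" wording of the caption; whether the original analysis applied the filter per tool or jointly across all tools is not stated here.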