Fig. 2: Performance comparison between fragmentomic models and functional analyses.
From: Integrated multiomics signatures to optimize the accurate diagnosis of lung cancer

A Schematic of the process for determining the first 6-nucleotide sequence (i.e., a 6-mer end motif) at each 5’ fragment end of cfDNA relative to the hg19 reference genome; B Hierarchical clustering analyses of the selected 6-bp end motifs derived from 5mC-sequencing data; Receiver operating characteristic analyses of the epigenomic models on the validation set (C), internal test set (D), and external test set (E); F Bar charts showing the TFs identified by the 6-mer end motifs selected from 5mC-sequencing data; G Bar charts showing the target genes regulated by these identified TFs; H The top 15 most enriched GO terms based on the target genes. The areas under the receiver operating characteristic curves were compared using DeLong’s test. All statistical tests were two-sided, with p < 0.05 indicating a statistically significant difference. 4bp-5mC, the model established by the 4-mer end motifs selected from 5mC-sequencing data; 6bp-5mC, the model established by the 6-mer end motifs selected from 5mC-sequencing data; 4bp-5hmC, the model established by the 4-mer end motifs selected from 5hmC-sequencing data; 6bp-5hmC, the model established by the 6-mer end motifs selected from 5hmC-sequencing data; 5mC, 5-methylcytosine; 5hmC, 5-hydroxymethylcytosine; ROCs, receiver operating characteristic curves; AUC, area under the ROC curve; CI, confidence interval; Sens, sensitivity; Spec, specificity; PPV, positive predictive value; NPV, negative predictive value; Accur, accuracy; TFs, transcription factors; GO, gene ontology. Source data are provided as a Source Data file.
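
The end-motif step in panel A can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes the reference is available as a plain in-memory string (a short toy sequence stands in for hg19 here), that fragments are given as 0-based, half-open `(start, end, strand)` tuples, and that the motif on a reverse-strand fragment is read as the reverse complement of the reference at the fragment's right edge. The function names `end_motif` and `motif_frequencies` are hypothetical.

```python
from collections import Counter

# Complement table for uppercase DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def end_motif(reference, start, end, strand, k=6):
    """Return the k-mer at the 5' end of one fragment, or None if unusable.

    Coordinates are 0-based and half-open (an assumption of this sketch).
    """
    if end - start < k:
        return None  # fragment shorter than the motif
    if strand == "+":
        # Forward fragment: its 5' end is the left edge on the reference.
        motif = reference[start:start + k].upper()
    else:
        # Reverse fragment: its 5' end is the right edge, read on the other strand.
        motif = reverse_complement(reference[end - k:end].upper())
    # Skip motifs that run off the reference or contain ambiguous bases (e.g. N).
    if len(motif) != k or set(motif) - set("ACGT"):
        return None
    return motif


def motif_frequencies(reference, fragments, k=6):
    """Count 5' end motifs over all fragments and normalize to frequencies."""
    counts = Counter()
    for start, end, strand in fragments:
        motif = end_motif(reference, start, end, strand, k)
        if motif is not None:
            counts[motif] += 1
    total = sum(counts.values())
    return {m: c / total for m, c in counts.items()} if total else {}


# Toy reference and fragments, purely illustrative.
toy_reference = "ACGTACGTAAGGCCTT"
toy_fragments = [(0, 10, "+"), (4, 16, "-"), (0, 12, "+")]
toy_freqs = motif_frequencies(toy_reference, toy_fragments)
```

Normalizing counts to per-sample frequencies gives one feature per motif (up to 4^6 = 4096 for 6-mers), which is the kind of input a downstream classifier or clustering step would take.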