Fig. 6: ChromatinHD learned a complex dependency between predictivity and fragment size. | Nature Communications


From: ChromatinHD connects single-cell DNA accessibility and conformation to gene expression through scale-adaptive machine learning


a Relationship between the abundance of a fragment size bin (±10 bp) and the overall loss in predictive accuracy (predictivity, Δcor) when fragments of that size were removed from the data. b Abundance, normalized predictivity (predictivity divided by abundance) and average effect of different fragment size bins. The effect is defined as the difference in predicted gene expression between the original data and the data with the respective fragment removed. Fragment sizes were split into footprint, mono−, mono, mono+, di−, di+, tri, tri+ and multi fragments by taking the midpoint between the local maxima and minima of normalized predictivity. c Motif enrichment for windows with mono− (80–120 bp) versus TF footprint (0–80 bp) fragments, compared to the overall enrichment of a motif in predictive windows. d Relationship between the number of (indirectly) bound TFs within a 100 bp window according to ENCODE GM12878 data (x-axis) and predictivity as defined by ChromatinHD-pred (blue), number of footprints according to HINT-ATAC on the pbmc10k data (red), ratio of mono− versus TF footprint fragments (green) and overall number of fragments (orange). Shown are the mean and standard error of a spline fit using R’s gam function with smoothing parameter sp = 1. ChIP-seq data are shown for the top 30 TFs (ordered by the correlation between predictivity and number of binding sites within 100 bp windows); data for all TFs are shown in Supplementary Fig. 8a. Source data are provided as a Source Data file.
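The two quantities defined for panel b can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the toy numbers and the reading of "midpoint" as the halfway point in fragment size between adjacent local extrema are all assumptions made for illustration.

```python
import numpy as np


def normalized_predictivity(predictivity, abundance):
    """Predictivity divided by abundance, per fragment size bin.

    Bins with zero abundance are returned as NaN rather than dividing by zero.
    """
    predictivity = np.asarray(predictivity, dtype=float)
    abundance = np.asarray(abundance, dtype=float)
    out = np.full_like(predictivity, np.nan)
    np.divide(predictivity, abundance, out=out, where=abundance > 0)
    return out


def local_extrema(values):
    """Indices of strict interior local maxima and minima."""
    values = np.asarray(values, dtype=float)
    left = values[1:-1] - values[:-2]
    right = values[1:-1] - values[2:]
    # Same sign on both sides: both neighbours lower (maximum)
    # or both neighbours higher (minimum).
    return np.flatnonzero(left * right > 0) + 1


def boundaries_from_extrema(sizes, values):
    """Midpoints (in bp) between consecutive local extrema.

    Assumption: "midpoint" means halfway in fragment size between
    adjacent extrema of the normalized predictivity curve.
    """
    sizes = np.asarray(sizes, dtype=float)
    positions = sizes[local_extrema(values)]
    return (positions[:-1] + positions[1:]) / 2


# Hypothetical toy values, not taken from the paper.
sizes = np.array([10, 30, 50, 70, 90, 110, 130])    # bin centres in bp
abundance = np.array([2, 4, 4, 2, 4, 4, 2])         # fragments per bin
predictivity = np.array([2, 8, 4, 6, 4, 12, 2])     # loss in accuracy per bin

norm = normalized_predictivity(predictivity, abundance)
print(norm)
print(boundaries_from_extrema(sizes, norm))
```

On real data the normalized curve would likely need smoothing before extrema are detected, since the strict neighbour comparison used here is sensitive to noise.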
