Fig. 4: In silico analysis of sequence composition and motif content of IC and DC elements.

a, Training of a support vector machine (SVM) model to identify heart enhancers using independent data from public repositories. Positive set: embryonic heart or cardiomyocyte ATAC-seq peaks; negative set: nonoverlapping ATAC-seq peaks from nonheart tissues. The model distinguishes heart-specific from limb-specific enhancers from chicken embryos. b, Evaluation of the SVM model's classification of DC+, IC+ and NC regions of the chicken genome. c, TF-MoDISco interpretation of the putative transcription factor binding sites (TFBS) that contribute to model specificity. Binding sites of several known heart-specific TFs contributed to model accuracy. d, Heart-expressed TFs identified by RNA-seq were consolidated into 301 motifs of heart-specific TFs. Promoter–enhancer pairs were screened for shared TFBS or ATAC-seq footprints. e, DC+IC+ promoters and enhancers shared more heart TFBS than DC−IC− or NC regions. f, Functionally conserved DC and IC ATAC-seq peak pairs shared more TF footprints than NC ATAC-seq peak pairs or control pairs (‘bg’ indicates a nonpaired ATAC-seq peak in the same TAD).
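
The screen for shared TFBS in promoter–enhancer pairs (d, e) can be illustrated with a minimal sketch. It assumes that each region has already been annotated with the set of TF motifs found in it; the function name, region identifiers and motif assignments below are hypothetical toy values for illustration and are not taken from the study, and the sketch is not the pipeline used to produce the figure.

```python
from typing import Dict, Iterable, Set, Tuple


def shared_motif_counts(
    pairs: Iterable[Tuple[str, str]],
    motifs_by_region: Dict[str, Set[str]],
    heart_motifs: Set[str],
) -> Dict[Tuple[str, str], int]:
    """Count heart-TF motifs present in both members of each pair.

    A motif is counted once per pair if it occurs in the promoter and in
    the enhancer and also belongs to the heart-TF motif set.
    """
    counts: Dict[Tuple[str, str], int] = {}
    for promoter, enhancer in pairs:
        promoter_motifs = motifs_by_region.get(promoter, set())
        enhancer_motifs = motifs_by_region.get(enhancer, set())
        shared = promoter_motifs & enhancer_motifs & heart_motifs
        counts[(promoter, enhancer)] = len(shared)
    return counts


# Hypothetical toy annotations (assumed for illustration only).
heart_motifs = {"GATA4", "MEF2C", "TBX5", "NKX2-5"}
motifs_by_region = {
    "promoter_1": {"GATA4", "MEF2C", "SP1"},
    "enhancer_1": {"GATA4", "MEF2C", "TBX5"},
    "promoter_2": {"SP1", "TBX5"},
    "enhancer_2": {"SP1", "HAND2"},
}
pairs = [("promoter_1", "enhancer_1"), ("promoter_2", "enhancer_2")]

counts = shared_motif_counts(pairs, motifs_by_region, heart_motifs)
for pair, n_shared in counts.items():
    print(pair, n_shared)
```

In this toy input the first pair shares two heart-TF motifs and the second shares none, because its only common motif (SP1) is not in the heart-TF set. Comparing the distribution of such counts between groups of pairs is one simple way to express the contrast drawn in e.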