Abstract
Deep neural networks (DNNs) have greatly advanced the ability to predict genome function from sequence. However, elucidating underlying biological mechanisms from genomic DNNs remains challenging. Existing interpretability methods, such as attribution maps, have their origins in non-biological machine learning applications and therefore have the potential to be improved by incorporating domain-specific interpretation strategies. Here we introduce SQUID (Surrogate Quantitative Interpretability for Deepnets), a genomic DNN interpretability framework based on domain-specific surrogate modelling. SQUID approximates genomic DNNs in user-specified regions of sequence space using surrogate models—simpler quantitative models that have inherently interpretable mathematical forms. SQUID leverages domain knowledge to model cis-regulatory mechanisms in genomic DNNs, in particular by removing the confounding effects that nonlinearities and heteroscedastic noise in functional genomics data can have on model interpretation. Benchmarking analysis on multiple genomic DNNs shows that SQUID, when compared to established interpretability methods, identifies motifs that are more consistent across genomic loci and yields improved single-nucleotide variant-effect predictions. SQUID also supports surrogate models that quantify epistatic interactions within and between cis-regulatory elements, as well as global explanations of cis-regulatory mechanisms across sequence contexts. SQUID thus advances the ability to mechanistically interpret genomic DNNs.
Data availability
The datasets, models and computational results used to support the findings in this paper are available on Zenodo at https://doi.org/10.5281/zenodo.10047748 (ref. 69). Datasets include the test set sequences held out during the training of ResidualBind-32, DeepSTARR and BPNet; the ChIP–seq peaks and background sequences used to train our three-layer CNN; and the CAGI5 challenge dataset.
Code availability
SQUID is an open-source Python package based on TensorFlow (ref. 70). SQUID can be installed via pip (https://pypi.org/project/squid-nn) or GitHub (https://github.com/evanseitz/squid-nn). Documentation for SQUID is provided on ReadTheDocs (https://squid-nn.readthedocs.io). The code for performing all analyses in this paper is also available on GitHub (https://github.com/evanseitz/squid-manuscript, ref. 71), and a static snapshot of this code is available on Zenodo (ref. 69).
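The package itself can be installed with `pip install squid-nn`. For readers who want a sense of the surrogate-modelling workflow without consulting the package documentation, the following is a minimal, generic sketch of the idea (generate an in silico MAVE library around a sequence of interest, score it with the DNN, and fit an additive surrogate model), written with NumPy and scikit-learn rather than the squid-nn API. Here `dnn_predict` is a hypothetical stand-in for any genomic DNN's prediction function, and ridge regression is a simplification of the MAVE-NN surrogate models used by SQUID.

```python
# Minimal, generic sketch of surrogate modelling via an in silico MAVE.
# This is NOT the squid-nn API: dnn_predict is a hypothetical stand-in for a
# genomic DNN's prediction function, and Ridge regression replaces the MAVE-NN
# surrogate models used by SQUID.
import numpy as np
from sklearn.linear_model import Ridge

ALPHABET = np.array(list("ACGT"))

def mutagenize(wt_seq, n_seqs=100_000, mut_rate=0.10, rng=None):
    """Build an in silico MAVE library by partial random mutagenesis of wt_seq."""
    rng = np.random.default_rng() if rng is None else rng
    wt = np.array(list(wt_seq))
    lib = np.tile(wt, (n_seqs, 1))
    mask = rng.random(lib.shape) < mut_rate
    # Sampled bases can coincide with the wild-type base, so the effective
    # mutation rate is roughly 3/4 of mut_rate.
    lib[mask] = ALPHABET[rng.integers(0, 4, size=int(mask.sum()))]
    return lib

def one_hot(lib):
    """Convert an (N, L) character array into (N, L*4) one-hot features."""
    return (lib[..., None] == ALPHABET).reshape(len(lib), -1).astype(float)

def additive_surrogate(wt_seq, dnn_predict, **library_kwargs):
    """Fit an additive surrogate and return an (L, 4) attribution map."""
    lib = mutagenize(wt_seq, **library_kwargs)
    y = np.array([dnn_predict("".join(seq)) for seq in lib])
    theta = Ridge(alpha=1.0).fit(one_hot(lib), y).coef_
    return theta.reshape(len(wt_seq), 4)
```

Under these assumptions, additive_surrogate(wt_seq, dnn_predict) returns an (L, 4) additive attribution map analogous in spirit to those analysed in the paper; the actual package additionally supports GE nonlinearities, heteroscedastic noise models and pairwise (epistatic) surrogate models.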
References
Linder, J., Srivastava, D., Yuan, H., Agarwal, V. & Kelley, D. R. Predicting RNA-seq coverage from DNA sequence as a unifying model of gene regulation. Preprint at bioRxiv https://doi.org/10.1101/2023.08.30.555582 (2023).
Avsec, Ž. et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat. Methods 18, 1196–1203 (2021).
Dudnyk, K., Cai, D., Shi, C., Xu, J. & Zhou, J. Sequence basis of transcription initiation in the human genome. Science 384, 694 (2024).
Jaganathan, K. et al. Predicting splicing from primary sequence with deep learning. Cell 176, 535–548.e24 (2019).
Avsec, Ž. et al. Base-resolution models of transcription-factor binding reveal soft motif syntax. Nat. Genet. 53, 354–366 (2021).
Chen, K. M., Wong, A. K., Troyanskaya, O. G. & Zhou, J. A sequence-based global map of regulatory activity for deciphering human genetics. Nat. Genet. 54, 940–949 (2022).
Zhou, J. Sequence-based modeling of three-dimensional genome architecture from kilobase to chromosome scale. Nat. Genet. 54, 725–734 (2022).
Koo, P. K. & Ploenzke, M. Deep learning for inferring transcription factor binding sites. Curr. Opin. Syst. Biol. 19, 16–23 (2020).
Novakovsky, G., Dexter, N., Libbrecht, M. W., Wasserman, W. W. & Mostafavi, S. Obtaining genetics insights from deep learning via explainable artificial intelligence. Nat. Rev. Genet. 24, 125–137 (2022).
Han, T., Srinivas, S. & Lakkaraju, H. Which explanation should I choose? A function approximation perspective to characterizing post hoc explanations. Preprint at https://arxiv.org/abs/2206.01254 (2022).
Hooker, S., Erhan, D., Kindermans, P.-J. & Kim, B. A benchmark for interpretability methods in deep neural networks. In Advances in Neural Information Processing Systems Vol. 32 (2019).
Ancona, M., Ceolini, E., Öztireli, C. & Gross, M. Towards better understanding of gradient-based attribution methods for deep neural networks. Preprint at https://arxiv.org/abs/1711.06104 (2017).
Simonyan, K., Vedaldi, A. & Zisserman, A. Deep inside convolutional networks: visualising image classification models and saliency maps. In Workshop at International Conference on Learning Representations (2014).
Shrikumar, A., Greenside, P. & Kundaje, A. Learning important features through propagating activation differences. In Proc. 34th International Conference on Machine Learning Vol. 70, ICML’17, 3145–3153 (JMLR.org, 2017).
Zhou, J. & Troyanskaya, O. G. Predicting effects of noncoding variants with deep learning-based sequence model. Nat. Methods 12, 931–934 (2015).
Smilkov, D., Thorat, N., Kim, B., Viégas, F. & Wattenberg, M. SmoothGrad: removing noise by adding noise. Preprint at https://arxiv.org/abs/1706.03825 (2017).
Sundararajan, M., Taly, A. & Yan, Q. Axiomatic attribution for deep networks. Preprint at https://arxiv.org/abs/1703.01365 (2017).
Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems Vol. 30, 4768–4777 (Curran Associates, 2017).
Starr, T. N. & Thornton, J. W. Epistasis in protein evolution. Protein Sci. 25, 1204–1218 (2016).
Weinreich, D. M., Lan, Y., Wylie, C. S. & Heckendorn, R. B. Should evolutionary geneticists worry about higher-order epistasis? Curr. Opin. Genet. Dev. 23, 700–707 (2013).
Aghazadeh, A. et al. Epistatic net allows the sparse spectral regularization of deep neural networks for inferring fitness functions. Nat. Commun. 12, 5225 (2021).
Zhou, J. et al. Higher-order epistasis and phenotypic prediction. Proc. Natl Acad. Sci. USA 119, e2204233119 (2022).
Domingo, J., Baeza-Centurion, P. & Lehner, B. The causes and consequences of genetic interactions (epistasis). Annu. Rev. Genomics Hum. Genet. 20, 433–460 (2019).
Otwinowski, J., McCandlish, D. M. & Plotkin, J. B. Inferring the shape of global epistasis. Proc. Natl Acad. Sci. USA 115, E7550–E7558 (2018).
Poelwijk, F. J., Krishna, V. & Ranganathan, R. The context-dependence of mutations: a linkage of formalisms. PLOS Comput. Biol. 12, e1004771 (2016).
Tareen, A. et al. MAVE-NN: learning genotype–phenotype maps from multiplex assays of variant effect. Genome Biol. 23, 98 (2022).
Tonner, P. D., Pressman, A. & Ross, D. Interpretable modeling of genotype–phenotype landscapes with state-of-the-art predictive power. Proc. Natl Acad. Sci. USA 119, e2114021119 (2022).
Ribeiro, M. T., Singh, S. & Guestrin, C. “Why should I trust you?”: explaining the predictions of any classifier. In Proc. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 1135–1144 (ACM, 2016).
Kinney, J. B., Murugan, A., Callan Jr, C. G. & Cox, E. C. Using deep sequencing to characterize the biophysical mechanism of a transcriptional regulatory sequence. Proc. Natl Acad. Sci. USA 107, 9158–9163 (2010).
Jones, M. C. & Faddy, M. J. A skew extension of the t-distribution, with applications. J. R. Stat. Soc. Ser. B 65, 159–174 (2003).
Tareen, A. & Kinney, J. B. Logomaker: beautiful sequence logos in Python. Bioinformatics 36, 2272–2274 (2019).
Gordân, R. et al. Genomic regions flanking E-box binding sites influence DNA binding specificity of bHLH transcription factors through DNA shape. Cell Rep. 3, 1093–1104 (2013).
Jolma, A. et al. DNA-dependent formation of transcription factor pairs alters their binding specificity. Nature 527, 384–388 (2015).
de Almeida, B. P., Reiter, F., Pagani, M. & Stark, A. DeepSTARR predicts enhancer activity from DNA sequence and enables the de novo design of synthetic enhancers. Nat. Genet. 54, 613–624 (2022).
Toneyan, S., Tang, Z. & Koo, P. Evaluating deep learning for predicting epigenomic profiles. Nat. Mach. Intell. 4, 1088–1100 (2022).
Spitz, F. & Furlong, E. E. M. Transcription factors: from enhancer binding to developmental control. Nat. Rev. Genet. 13, 613–626 (2012).
Bartlett, P. L., Long, P. M., Lugosi, G. & Tsigler, A. Benign overfitting in linear regression. Proc. Natl Acad. Sci. USA 117, 30063–30070 (2020).
Chatterji, N. S. & Long, P. M. Finite-sample analysis of interpolating linear classifiers in the overparameterized regime. J. Mach. Learn. Res. 22, 5721–5750 (2021).
Wang, Z. et al. Smoothed geometry for robust attribution. Adv. Neural Inform. Process. Syst. 33, 13623–13634 (2020).
Alvarez-Melis, D. & Jaakkola, T. S. Towards robust interpretability with self-explaining neural networks. In Proc. 32nd International Conference on Neural Information Processing Systems 7786–7795 (Curran Associates Inc., 2018).
Majdandzic, A. et al. Selecting deep neural networks that yield consistent attribution-based interpretations for genomics. In Machine Learning in Computational Biology 131–149 (PMLR, 2022).
Zhang, C., Bengio, S., Hardt, M., Recht, B. & Vinyals, O. Understanding deep learning requires rethinking generalization. Preprint at https://arxiv.org/abs/1611.03530 (2017).
Papagianni, A. et al. Capicua controls Toll/IL-1 signaling targets independently of RTK regulation. Proc. Natl Acad. Sci. USA 115, 1807–1812 (2018).
Crocker, J. et al. Low affinity binding site clusters confer hox specificity and regulatory robustness. Cell 160, 191–203 (2015).
Farley, E. K. et al. Suboptimization of developmental enhancers. Science 350, 325–328 (2015).
Castro-Mondragon, J. A. et al. JASPAR 2022: the 9th release of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 50, D165–D173 (2021).
Kircher, M. et al. Saturation mutagenesis of twenty disease-associated regulatory elements at single base-pair resolution. Nat. Commun. 10, 3583 (2019).
Shigaki, D. et al. Integration of multiple epigenomic marks improves prediction of variant impact in saturation mutagenesis reporter assay. Hum. Mutat. 40, 1280–1291 (2019).
Kelley, D. R. et al. Sequential regulatory activity prediction across chromosomes with convolutional neural networks. Genome Res. 28, 739–750 (2018).
Kim, S. & Wysocka, J. Deciphering the multi-scale, quantitative cis-regulatory code. Mol. Cell 83, 373–392 (2023).
Georgakopoulos-Soares, I. et al. Transcription factor binding site orientation and order are major drivers of gene regulatory activity. Nat. Commun. 14, 2333 (2023).
Koo, P. K., Majdandzic, A., Ploenzke, M., Anand, P. & Paul, S. B. Global importance analysis: an interpretability method to quantify importance of genomic features in deep neural networks. PLoS Comput. Biol. 17, e1008925 (2021).
Weinreich, D. M., Lan, Y., Jaffe, J. & Heckendorn, R. B. The influence of higher-order epistasis on biological fitness landscape topography. J. Stat. Phys. 172, 208–225 (2018).
Ackers, G. K., Johnson, A. D. & Shea, M. A. Quantitative model for gene regulation by lambda phage repressor. Proc. Natl Acad. Sci. USA 79, 1129–1133 (1982).
Bintu, L. et al. Transcriptional regulation by the numbers: models. Curr. Opin. Genet. Dev. 15, 116–124 (2005).
Segal, E. & Widom, J. From DNA sequence to transcriptional behaviour: a quantitative approach. Nat. Rev. Genet. 10, 443–456 (2009).
Sherman, M. S. & Cohen, B. A. Thermodynamic state ensemble models of cis-regulation. PLoS Comput. Biol. 8, e1002407 (2012).
Faure, A. J. et al. Mapping the energetic and allosteric landscapes of protein binding domains. Nature 604, 175–183 (2022).
Tareen, A. & Kinney, J. B. Biophysical models of cis-regulation as interpretable neural networks. In 14th Conference on Machine Learning in Computational Biology (MLCB 2019); https://doi.org/10.1101/835942
Estrada, J., Wong, F., DePace, A. & Gunawardena, J. Information integration and energy expenditure in gene regulation. Cell 166, 234–244 (2016).
Scholes, C., DePace, A. H. & Sánchez, Á. Combinatorial gene regulation through kinetic control of the transcription cycle. Cell Syst. 4, 97–108.e9 (2017).
Park, J. et al. Dissecting the sharp response of a canonical developmental enhancer reveals multiple sources of cooperativity. eLife 8, e41266 (2019).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Avsec, Z. & Weilert, M. kundajelab/bpnet-manuscript: Publication release of BPNet manuscript code. Zenodo https://zenodo.org/records/4294814 (2020).
Avsec, Z. et al. The Kipoi repository accelerates community exchange and reuse of predictive models for genomics. Nat. Biotechnol. https://doi.org/10.1038/s41587-019-0140-0 (2019).
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I. & Salakhutdinov, R. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014).
Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980 (2014).
Majdandzic, A., Rajesh, C. & Koo, P. K. Correcting gradient-based interpretations of deep neural networks for genomics. Genome Biol. 24, 1–13 (2023).
Seitz, E. evanseitz/squid-manuscript: SQUID manuscript workflow with outputs. Zenodo https://doi.org/10.5281/zenodo.10047747 (2023).
Abadi, M. et al. TensorFlow: large-scale machine learning on heterogeneous systems. Software available from tensorflow.org (2015).
Seitz, E. & Koo, P. evanseitz/squid-nn: surrogate quantitative interpretability for deepnets. Zenodo https://doi.org/10.5281/zenodo.11060672 (2023).
Acknowledgements
We thank Z. Tang, S. Toneyan, M. Kooshkbaghi, C. Rajesh, J. Kaczmarzyk and C. Martí-Gómez for helpful discussions. This work was supported in part by: the Simons Center for Quantitative Biology at Cold Spring Harbor Laboratory; NIH grants R01HG012131 (P.K.K., E.E.S., J.B.K. and D.M.M.), R01HG011787 (J.B.K., E.E.S. and D.M.M.), R01GM149921 (P.K.K.), R35GM133777 (J.B.K.) and R35GM133613 (D.M.M.); and an Alfred P. Sloan Foundation Research Fellowship (D.M.M.). Computations were performed using equipment supported by NIH grant S10OD028632.
Author information
Authors and Affiliations
Contributions
E.E.S., D.M.M., J.B.K. and P.K.K. conceived of the study. E.E.S. wrote the software and performed the analysis. E.E.S. designed the analysis with help from D.M.M., J.B.K. and P.K.K. J.B.K. and P.K.K. supervised the study. E.E.S., D.M.M., J.B.K. and P.K.K. wrote the paper.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Machine Intelligence thanks the anonymous reviewers for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Additional comparisons of K-Lasso LIME to SQUID.
Shown are the results of analyses, performed as in Fig. 2b for n = 50 genomic sequences, comparing the performance of SQUID with that of the K-Lasso implementation of LIME for four different values of K. P values were computed using a one-sided Mann-Whitney U test; ***, p < 0.001. Note that the attribution variation values obtained for SQUID in these tests varied systematically with the choice of K. The reason is that the K-Lasso LIME algorithm produces sparse attribution maps that have only K nonzero parameters, so the variation observed in K-Lasso LIME attribution maps systematically decreases as K decreases. This gives K-Lasso LIME an unfair advantage in the attribution variation test described in the main text and Methods. To compare K-Lasso LIME to SQUID fairly in this figure, we therefore modified this test. In the analysis of each in silico MAVE, the attribution map elements inferred by SQUID were first set to zero at the positions where all K-Lasso LIME attribution map elements were exactly zero. Attribution variation values were then calculated as described in the main text and Methods.
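The masking step described above can be sketched as follows, assuming each SQUID attribution map and the corresponding K-Lasso LIME map for the same in silico MAVE are stored as (L, 4) NumPy arrays; the function name is illustrative and not part of the SQUID package.

```python
import numpy as np

def mask_to_lime_support(squid_map, lime_map):
    """Zero SQUID attribution elements wherever the K-Lasso LIME map is exactly
    zero, so that both methods are compared over the same sparse support."""
    masked = squid_map.copy()
    masked[lime_map == 0] = 0.0
    return masked
```

Attribution variation is then computed on the masked maps exactly as for the unmasked ones.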
Extended Data Fig. 2 Influence of mutation rate and library size on SQUID attribution maps.
a,b, Attribution variation (left) and average attribution maps (right) found for 50 SQUID attribution maps computed for the TF AP-1 as in Fig. 2, but using in silico MAVE libraries having (a) a variable mutation rate (r) and a fixed number of mutagenized sequences (N = 100,000), or (b) a fixed mutation rate (r = 10%) and a variable number of mutagenized sequences (N). All SQUID attribution maps were computed using additive models with GE nonlinearities, followed by cropping these maps using a flank size of nf = 50 nt.
Extended Data Fig. 3 Nonlinearities and noise across DNNs and TFs.
Examples of the GE nonlinearities and heteroscedastic noise models inferred in the SQUID (GE) analyses performed for Fig. 3. Each plot shows results for a representative sequence among the n = 50 sequences analyzed for each combination of DNN and TF.
Extended Data Fig. 4 Average and example binding motifs for Oct4 and Sox2.
a, Oct4 motifs, centered on the putative binding site TTTGCAT. b, Sox2 motifs, centered on the putative binding site GAACAATAG. TF binding motifs are from attribution maps computed for BPNet and plotted as in Fig. 3c.
Extended Data Fig. 5 Benchmark analysis of attribution variation for putative weak TF binding sites.
a, ISM attribution map given by BPNet for a representative genomic sequence containing multiple putative weak binding sites for the mouse TF Nanog (top). Blue represents the wild-type nucleotides; gray represents other nucleotides. PWM scores (computed using the Nanog PWM) are displayed across the genomic sequence (bottom); only positive PWM scores are shown. b, For each TF and DNN, plots show attribution variation values for 150 putative TF binding sites plotted against putative binding site strength. Bold lines indicate signals smoothed with a sliding window of 20 nt. Stars indicate P values computed using the one-sided Mann-Whitney U test: **, 0.001 ≤ p < 0.01; ***, p < 0.001. PWM scores for each of the 150 putative sites are shown above, along with a logo representation of the PWM used. Each site is represented by a gray bar shaded according to the number of mutations (0, 1 or 2) in the core of the putative site.
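For concreteness, a minimal sketch of the two quantities plotted here, PWM scores scanned along a sequence (panel a, positive scores only) and the 20-nt sliding-window smoothing of the attribution variation signal (panel b), is given below; the PWM is assumed to be a (w, 4) log-odds array in A, C, G, T order, and the function names are illustrative.

```python
import numpy as np

BASE_INDEX = {"A": 0, "C": 1, "G": 2, "T": 3}

def pwm_scores(seq, pwm):
    """Score every w-length window of seq with a (w, 4) log-odds PWM."""
    idx = np.array([BASE_INDEX[b] for b in seq])
    w = pwm.shape[0]
    return np.array([pwm[np.arange(w), idx[i:i + w]].sum()
                     for i in range(len(seq) - w + 1)])

def smooth(signal, window=20):
    """Sliding-window (moving-average) smoothing of a 1D signal."""
    return np.convolve(signal, np.ones(window) / window, mode="same")
```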
Extended Data Fig. 6 Attribution maps computed for strong and weak TF binding sites.
a,b, Top row shows the average of 50 attribution maps in the 0-mutation ensemble computed for IRF1 using ResidualBind-32 (a), and for Ohler using DeepSTARR (b). Remaining rows show attribution maps for four representative genomic loci with the central putative binding site having varying numbers of mutations from the consensus binding site.
Extended Data Fig. 7 Occlusion analysis of AP-1 binding site effects.
Occlusion study based on the wild-type sequence investigated in Fig. 6d,e. Occlusions were performed for every combination of one, two or three occluded motifs, with the DNN prediction taken independently for each instance (n = 100 for each boxplot). In each occluded sequence, the corresponding AP-1 core (7-mer) sites were scrambled using a uniform probability of nucleotides at each position. The baseline score (CTRL) was calculated from the median of predictions corresponding to n = 100 instances of a dinucleotide shuffle over the full (2048 nt) sequence. The DNN prediction rapidly approaches the genomic baseline as additional binding sites are occluded. Boxplot lines represent the median, upper quartile and lower quartile; whiskers represent 1.5× the interquartile range.
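A sketch of the occlusion step is shown below, under the assumption that each AP-1 core is a 7-nt window with a known start position; dnn_predict is again a hypothetical stand-in for the DNN, and the dinucleotide-shuffled baseline is not implemented here.

```python
import numpy as np

def occlude_sites(seq, site_starts, site_len=7, rng=None):
    """Replace each selected core site with uniformly random nucleotides."""
    rng = np.random.default_rng() if rng is None else rng
    s = list(seq)
    for start in site_starts:
        s[start:start + site_len] = rng.choice(list("ACGT"), size=site_len)
    return "".join(s)

# Illustrative usage: n = 100 independent occlusions of two AP-1 sites.
# scores = [dnn_predict(occlude_sites(wt_seq, [start1, start2])) for _ in range(100)]
```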
Extended Data Fig. 8 SQUID supports global DNN interpretations.
a, In silico MAVE libraries used by SQUID to infer surrogate models of global TF specificity. Each sequence in the library contains a partially mutagenized version of a consensus TF binding site embedded within random DNA. b, Additive surrogate models for BPNet inferred by SQUID using libraries of the form in a for the mouse TFs Oct4, Sox2, Klf4 and Nanog. Gray bars indicate 10.5 nt periodicity on either side of the inferred Nanog motif. c, Format of the in silico MAVE libraries used to study global TF–TF epistatic interactions. Each sequence in the library contains two partially mutagenized consensus TF binding sites embedded a fixed distance apart (0 nt to 32 nt) within random DNA. d, Pairwise surrogate model inferred for BPNet by SQUID using a library of the form in c and putative binding sites for Nanog and Sox2. e, Distance dependence of inter-site interactions between Nanog and Sox2. Dots show the Frobenius norm of inter-site pairwise interactions, inferred as in d, using libraries having different distances between the embedded Nanog and Sox2 sites. Periodic minima in the inter-site interactions occurred at distances where the central nucleotides of the Sox2 site (that is, AA) overlapped with the periodic Nanog-associated flanking signals. The black line shows a least-squares fit of a degree-10 polynomial. d,e, Libraries comprised 500,000 sequences. BPNet profiles were projected to scalars using PCA, which yielded substantially lower background noise than obtained using profile contribution scores.
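The inter-site interaction summary in panel e can be sketched as follows, assuming the pairwise surrogate-model parameters are stored as an (L, 4, L, 4) array and that the positions of the two embedded sites are known; the PCA projection of profile predictions to scalars is included for completeness. Function names and array layout are assumptions, not the squid-nn API.

```python
import numpy as np
from sklearn.decomposition import PCA

def inter_site_frobenius(theta, site_a, site_b):
    """Frobenius norm of the pairwise-parameter block coupling two sites.
    theta: (L, 4, L, 4) pairwise parameters; site_a, site_b: position indices."""
    block = theta[np.ix_(site_a, np.arange(4), site_b, np.arange(4))]
    return np.linalg.norm(block)

def project_profiles(profiles):
    """Project (N, T) profile predictions onto their first principal component."""
    return PCA(n_components=1).fit_transform(profiles).ravel()
```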
Supplementary information
Supplementary Information
Supplementary Table 1 and Figs 1–3.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Seitz, E.E., McCandlish, D.M., Kinney, J.B. et al. Interpreting cis-regulatory mechanisms from genomic deep neural networks using surrogate models. Nat Mach Intell 6, 701–713 (2024). https://doi.org/10.1038/s42256-024-00851-5