Extended Data Fig. 11: Species specificity of open chromatin deep learning.
From: Conserved and divergent gene regulatory programs of the mammalian neocortex

a. Correlation across cell types for peaks, grouped by conservation level, in the human test dataset for single-species and multi-species models. Violin plots represent the density of data points. Box plots encompass the 25th to 75th percentiles; white dots represent medians; whiskers extend to 1.5 times the interquartile range. N = 39,236, 777, 21,737, 6,605, 4,896, 1,493. b. Scatter plot comparing the model’s ability to predict chromatin accessibility across cell types (Spearman r) with the conservation index in the test set. c. Scatter plot comparing the model’s ability to predict chromatin accessibility across cell types (Spearman r) with the divergence index in the test set. d. Box plots showing the relationship between model accuracy (mean L1 norm between predictions and true data) and conservation level in the test dataset. Box plots encompass the 25th to 75th percentiles; central lines represent medians; whiskers extend to 1.5 times the interquartile range. N as in a. e. Bar plot comparing poorly predicted peaks with the entire test dataset for each of the top 10 peak annotations from HOMER. Shown for the human-only model (top) and the multi-species model (bottom). N = 39,236 peaks. f. Accuracy of a three-species model across cell types with each species as an outgroup. Spearman correlation between model predictions and measured chromatin accessibility for each cell type, each represented as a dot. Plotted intervals are the same as in a. N = 16 for each. g. Correlation between test-set peak predictions and measured chromatin accessibility across cell types for each species. Violin plots represent the density of data points. Plotted intervals are the same as in d. N = 39,236 (human), 44,311 (macaque), 32,484 (marmoset) and 41,605 (mouse) test-set peaks. h. True and predicted chromatin accessibility across the huntingtin locus in the indicated cell types. Species silhouettes in g and h were created in BioRender.
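The two per-peak metrics named in the legend, Spearman r across cell types and the mean L1 norm between predicted and measured accessibility, can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy values are hypothetical, and reading "mean L1 norm" as the mean absolute difference across cell types for one peak is an assumption.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_r(pred, true):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rp, rt = average_ranks(pred), average_ranks(true)
    n = len(rp)
    mp, mt = sum(rp) / n, sum(rt) / n
    cov = sum((a - mp) * (b - mt) for a, b in zip(rp, rt))
    var_p = sum((a - mp) ** 2 for a in rp)
    var_t = sum((b - mt) ** 2 for b in rt)
    if var_p == 0 or var_t == 0:
        return float("nan")  # undefined when one vector is constant
    return cov / sqrt(var_p * var_t)


def mean_l1(pred, true):
    """Mean absolute difference between predicted and measured values."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(pred)


# Toy example: one peak measured in four hypothetical cell types.
measured = [0.1, 0.4, 0.2, 0.9]
predicted = [0.2, 0.5, 0.1, 0.7]
r = spearman_r(predicted, measured)
err = mean_l1(predicted, measured)
```

Under these assumptions, a peak whose predicted ranking of cell types matches the measured ranking scores r close to 1 even when the absolute values differ, which is why the legend reports the rank correlation and the L1 error as separate measures.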