Extended Data Fig. 1: Distribution of gene expression counts in single-cell RNA-seq and statistics from association between gene expression and chromatin accessibility under null simulation.
From: Tissue-specific enhancer–gene maps from multimodal single-cell data identify causal disease alleles

a. In an example dataset (the arthritis dataset), the mean count per gene was strongly correlated with the standard deviation of the count. b. The correlation between the maximum expression count per gene (x-axis) and the mean naïve association chi-square value (χ²) from Poisson regression between gene expression and chromatin accessibility under null simulation (y-axis). c. The quantile-quantile (QQ) plot of two-sided P values from Poisson regression between gene expression count and chromatin accessibility under null simulation. d. The QQ plot of two-sided P values from negative binomial regression between gene expression count and chromatin accessibility under null simulation. e. The QQ plot of two-sided P values from linear regression between log-normalized and inverse-normal-transformed gene expression and chromatin accessibility under null simulation. f. The QQ plot of two-sided P values estimated by bootstrapping, based on the distribution of test statistics from Poisson regression between gene expression count and chromatin accessibility under null simulation. g. The QQ plot of two-sided P values estimated by bootstrapping, based on the distribution of test statistics from negative binomial regression between gene expression count and chromatin accessibility under null simulation. h. Computational runtime benchmarking for Poisson regression with binarized ATAC-seq peaks (red), negative binomial regression with binarized ATAC-seq peaks (teal), and Poisson regression with non-binarized ATAC-seq peaks (blue). The values are relative to the computational time for Poisson regression, and bars show the mean across n = 100 randomly selected peak-gene pairs. Horizontal lines (error bars) indicate one standard deviation from the mean.
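The behaviour that panels b and c examine can be illustrated with a small simulation. The following is a minimal sketch and not the authors' simulation: it assumes a single binary accessibility predictor with no covariates, draws counts that are independent of accessibility, and computes the one-degree-of-freedom likelihood-ratio χ² for a Poisson model in closed form. All function names, the 0.3 open-peak probability, and the mean and dispersion values are hypothetical choices made only for illustration.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) count with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_count(rng, mean, dispersion):
    """dispersion == 0 gives Poisson; > 0 gives a gamma-Poisson
    (negative binomial) count with variance mean + dispersion * mean**2."""
    lam = mean
    if dispersion > 0:
        lam = rng.gammavariate(1.0 / dispersion, mean * dispersion)
    return sample_poisson(rng, lam)


def xlogy(total, rate):
    """total * log(rate), defined as 0 when total is 0."""
    return total * math.log(rate) if total > 0 else 0.0


def poisson_lrt_chi2(counts, is_open):
    """Likelihood-ratio chi-square (1 d.f.) for a Poisson model of counts
    on one binary predictor; the fitted rates are the two group means."""
    n1 = sum(is_open)
    n0 = len(counts) - n1
    s1 = sum(c for c, o in zip(counts, is_open) if o)
    s0 = sum(counts) - s1
    if n0 == 0 or n1 == 0 or s0 + s1 == 0:
        return 0.0
    ll_alt = xlogy(s1, s1 / n1) + xlogy(s0, s0 / n0)
    ll_null = xlogy(s0 + s1, (s0 + s1) / len(counts))
    return max(0.0, 2.0 * (ll_alt - ll_null))


def null_simulation(dispersion, n_sims=500, n_cells=200, mean=2.0, seed=0):
    """Return (mean chi-square, fraction of P < 0.05) when expression is
    simulated independently of accessibility (hypothetical settings)."""
    rng = random.Random(seed)
    chi2_values = []
    for _ in range(n_sims):
        is_open = [rng.random() < 0.3 for _ in range(n_cells)]
        counts = [sample_count(rng, mean, dispersion) for _ in range(n_cells)]
        chi2_values.append(poisson_lrt_chi2(counts, is_open))
    # Two-sided P value for a chi-square statistic with 1 d.f.
    p_values = [math.erfc(math.sqrt(x / 2.0)) for x in chi2_values]
    mean_chi2 = sum(chi2_values) / n_sims
    frac_sig = sum(p < 0.05 for p in p_values) / n_sims
    return mean_chi2, frac_sig


if __name__ == "__main__":
    print("Poisson counts:       ", null_simulation(dispersion=0.0))
    print("Overdispersed counts: ", null_simulation(dispersion=1.0))
```

Under these assumptions, the mean χ² should sit near 1 when the counts are truly Poisson and rise well above 1 when they are overdispersed, which is the kind of inflation a QQ plot of null P values makes visible.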