Fig. 5: Properties of AP site hotspots in the mouse genome.

a Illustration of the analytical steps used to define sample-level and sample-shared hotspots. b Comparison of the fractions of sample-level hotspots found in the real and simulated datasets at different read depths across the 71 mouse samples. c Comparison of the fractions of sample-level hotspots among different tissues (X-axes) at different read depths. Data are based on 12 samples per tissue type (3 biological replicates for each of 4 age groups), except for brain (11 samples). d Dot plots show the increase in the odds ratios of association between the sample-level hotspots detected at the indicated read depths and different genomic elements. Data are presented as mean ± SD based on the 71 mouse samples. e Box plot of the numbers of age- or tissue-specific sample-shared hotspots found in the real and simulated datasets. Data are shown for the 4 age groups or 6 tissue types in the real datasets and for 100 corresponding random simulations. b, c, e Box plots indicate the median (middle line), 25th and 75th percentiles (box) and 1.5× interquartile range (whiskers), as well as individual data points (single points). f Odds ratios of association between all hotspots (“Samples ≥1”) or only sample-shared hotspots (“Samples ≥2” and “Samples ≥3”) and various genomic elements (X-axis). d, f dELS, pELS and PLS denote candidate cis-regulatory elements with distal enhancer-like, proximal enhancer-like and promoter-like signatures, respectively, and “sps20” in a feature name means that the feature was found in at least 20 embryonic tissues and/or developmental time points. Asterisks above connecting lines indicate the significance of the difference between the real and simulated datasets (b) or between the indicated pairs of tissues (c), with raw p-values of ≤0.05 (*), ≤0.01 (**), ≤0.001 (***) and ≤0.0001 (****) calculated by the two-sided paired t test (b) or the two-sided Wilcoxon rank-sum test (c). Source data are provided as a Source data file.