Fig. 1: Most active genes are differentially expressed during spermatogenesis.

a–c Genome browser views of RNA-seq at Crabp1, a SG-specific gene (a), Spo11, a SC-specific gene (b), and Tssk3, a RS-specific gene (c). d, e Volcano plots showing fold changes in RNA-seq expression versus adjusted p-values for active genes (n = 19,823). SG is compared to SC (d), and SC to RS (e). Dotted lines indicate the 1.5-fold cut-off for significantly increased (red) or decreased (blue) genes, with the numbers of genes passing this threshold with adjusted p-values < 0.05 indicated on each plot. P-values are from DESeq2 (Wald test; adjusted p-values calculated using the Benjamini-Hochberg correction). f Heatmap showing relative RNA-seq expression in each cell type, using k-means clustering and standard Euclidean distances. Shown are the 17,078 genes differentially expressed in (d, e). Cluster 1 n = 3226, Cluster 2 n = 2645, Cluster 3 n = 3601, Cluster 4 n = 4024, Cluster 5 n = 1596, and Cluster 6 n = 1986. g–i Top Gene Ontology terms and significance values for clusters 1 (g), 4 (h) and 6 (i), generated with ClusterProfiler and consolidated using Revigo. Enrichment p-values were calculated by a hypergeometric test and adjusted using the Benjamini-Hochberg correction. j Percent of genes from each cluster that correspond to the indicated RNA biotypes, as compared to all genes. k Distribution of gene length (TSS to TES) for genes in each cluster defined in (f). Line represents the median, box represents the 25th–75th percentiles, and whiskers represent 1.5× the interquartile range. P-values from an unpaired, two-sided Mann-Whitney U test are shown for cluster 6 vs. all other clusters. Source data are provided as Source Data Fig. 1.
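
The thresholding rule described for panels d and e (a 1.5-fold cut-off combined with an adjusted p-value < 0.05) can be illustrated with a minimal Python sketch. This is not the DESeq2 implementation: it does not reproduce the Wald test or any filtering DESeq2 applies, and it only shows the generic Benjamini-Hochberg step-up adjustment followed by classification. The function names `benjamini_hochberg` and `classify`, the input values, and the choice to treat the fold-change cut-off as inclusive are assumptions made for illustration.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def classify(log2_fold_changes, pvalues, fold_cutoff=1.5, alpha=0.05):
    """Label each gene 'up', 'down', or 'unchanged'.

    Assumption: the fold-change cut-off is inclusive and is applied
    symmetrically on the log2 scale.
    """
    threshold = math.log2(fold_cutoff)
    padj = benjamini_hochberg(pvalues)
    labels = []
    for lfc, q in zip(log2_fold_changes, padj):
        if q < alpha and lfc >= threshold:
            labels.append("up")
        elif q < alpha and lfc <= -threshold:
            labels.append("down")
        else:
            labels.append("unchanged")
    return labels


if __name__ == "__main__":
    # Hypothetical values for four genes, not taken from the source data.
    example_lfc = [1.0, 0.2, -0.9, -0.3]
    example_p = [0.01, 0.04, 0.03, 0.005]
    print(classify(example_lfc, example_p))
```

Genes labelled "up" and "down" in this sketch correspond to the red and blue points of a volcano plot, respectively; everything else falls below one of the two thresholds.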