Fig. 5: Sex-specific patterns in temporal peaks/genes.

a Heatmap of ATAC-seq peaks (fitted values from ARIMA models) with a chronological trend in both women and men (n = 3197), as a function of age in years. Values are accessibility z-scores, normalized relative to the row (i.e., peak) mean. K-means clustering on concatenated data from men and women was used to group these peaks into three clusters (C1–C3). The color bar at the top represents the discrete age groups defined in this study (young, middle-aged, older). Rows are annotated according to their position relative to the nearest TSS: proximal if <1 kbp away, distal otherwise. b, c Annotations of shared temporal peaks using chromHMM states (b) and gene sets from the DICE database10 (c). Colors represent hypergeometric enrichment test p-values; light gray cells indicate that a gene set contained too few genes to run an enrichment test. The enrichment pattern strongly associates cluster C1 with T cells, suggesting a delayed loss of accessibility in women relative to men; C2 with CD19+ cells, suggesting the presence of CD19+ specific loci with opposing temporal behavior in men and women; and C3 with monocytes and NK cells. d Transcription factor (TF) motif enrichment results for each temporal cluster (C1–C3), relative to the other two clusters. Motif enrichment analyses were carried out on 1388 PWMs grouped into families based on sequence similarity; the most significant p-value for each motif family is shown. Tests were done using HOMER54. e, f Expression levels, grouped by age group and sex, of TFs associated with cluster 1 (C1) and cluster 3 (C3) whose expression follows the same pattern as the peak temporal clusters in which they are enriched. Cluster 2 (C2) is omitted since all TFs in this group show a significant increase with age in females or both sexes. Box plots represent median and IQR values, with whiskers extending to 1.5 times the IQR.
A Wilcoxon rank-sum test was used to compare expression levels between sexes (significance shown below boxes) and between age groups (above boxes): *p < 0.05, **p < 0.01, ***p < 0.001, ns: non-significant. Sample sizes: young n = 11 F, 6 M; middle-aged n = 10 F, 20 M; older n = 13 F, 14 M. Source data are provided as a Source Data file for (a, b, d).
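
As an illustration of two computations named in this caption, the following is a minimal sketch of row-wise z-score normalization (panel a) and an upper-tail hypergeometric enrichment test (panels b, c). It is not the authors' code: the function names `row_zscore` and `hypergeom_enrichment_p` are hypothetical, the use of the sample standard deviation is an assumption, and the counts in the usage lines are toy values rather than data from the study.

```python
from math import comb
from statistics import mean, stdev


def row_zscore(row):
    """Z-score one peak (row): subtract the row mean, divide by the row SD.

    Assumption: sample standard deviation (n - 1 denominator) is used.
    A constant row has SD 0 and is returned as all zeros.
    """
    mu = mean(row)
    sd = stdev(row)
    if sd == 0:
        return [0.0 for _ in row]
    return [(x - mu) / sd for x in row]


def hypergeom_enrichment_p(universe, annotated, cluster, overlap):
    """Upper-tail hypergeometric p-value P(X >= overlap).

    universe  : total number of items tested (e.g., all temporal peaks)
    annotated : items in the universe carrying the annotation
    cluster   : items in the cluster of interest
    overlap   : annotated items observed inside the cluster
    """
    max_k = min(annotated, cluster)
    total = comb(universe, cluster)
    tail = sum(
        comb(annotated, k) * comb(universe - annotated, cluster - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total


# Toy usage with made-up numbers (not from the study).
z = row_zscore([1.0, 2.0, 3.0])
p = hypergeom_enrichment_p(universe=1000, annotated=100, cluster=50, overlap=12)
```

A small p-value here indicates that the annotation occurs in the cluster more often than expected when drawing the same number of items at random from the universe.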