Biological sex impacts immune cell proportions and epigenetic profiles in the developing pediatric immune system

Edwards, Karlie; Merrill, Sarah M.; Konwar, Chaini; Jude, Marcia S.; Zhuang, Beryl C.; Meijer, Mandy; Navarro-Delgado, Erick; MacIsaac, Julie L.; Mandhane, Piush; Simons, Elinor; Moraes, Theo J.; Azad, Meghan B.; Subbarao, Padmaja; Bustamante, Mariona; Vrijheid, Martine; Turvey, Stuart; Kobor, Michael S.

doi:10.1038/s42003-025-08844-9

Article
Open access
Published: 09 October 2025

Communications Biology volume 8, Article number: 1447 (2025)

Abstract

Age- and sex-related differences in immune cell compositions and immune outcomes have been identified across the life course, but a comprehensive and nuanced characterization of these changes during the rapid developmental window of early life is lacking. We explore immune associated DNA methylation (DNAm) changes in the context of age and sex leveraging whole blood samples collected at ages one and five from CHILD, a Canadian longitudinal pediatric cohort (n = 760: 356 female and 404 male). DNAm-based computational cell type deconvolution reveals significant changes in all estimated immune cell types across time, with notable sex differences. In addition, we identify distinct DNAm signatures reflecting age- and sex-associated immune profiles in early life. While age-related DNAm changes are relatively limited, sex-associated differences are consistent across this developmental window and partially validate in independent pediatric cohorts. Together, these findings provide insights into early immune system maturation, underscoring the presence of sex differences prior to puberty.

Introduction

During early-life development, the pediatric immune system undergoes rapid changes, shifting from maternally derived protection to a mature state marked by changes in immune cell proportions and immunological function¹. Although this transition is essential for both sexes, differences in immune responses are observed between males and females in early life. However, most immunological work investigating male-female differences focuses almost exclusively on the post-pubescent stage, where sex hormones are known to play a role in moderating immune system function^2,3. While sex hormones are critical for fetal development and may contribute to sex differences occurring in immune cell composition, immune responsiveness, and disease incidence later in life, much remains unknown about prepubescent differences in the immune system composition and the underlying molecular mechanisms that may drive them². To address this gap, we investigated early-life molecular differences between the sexes within the immune system, focusing on shifts in immune cell composition and epigenetic changes during development.

In the first 3 months of life, all typically developing infants converge on a similar growth trajectory, suggesting that during development there is also a stereotypical path to immune system maturity⁴. Innate immune cells, such as neutrophils, eosinophils, basophils, and monocytes, form the first line of defense and mount a consistent response undergoing functional changes during early development¹. Adaptive immune cells, including B and T cells, serve as a secondary line of defense and develop antigen-specific responses that improve with repeated exposure¹. For example, in healthy children, monocytes are initially immature with impaired signaling capacity, natural killer (NK) cells exhibit only half the cytolytic function seen in adults at birth, and early B memory cells demonstrate reduced antibody affinity maturation compared to later life¹. Although immune cell maturation trajectories occur in both males and females, sex-specific differences in the proportions of certain immune cell types have been observed as early as in cord blood samples⁵. Specifically, males exhibit a higher proportion of monocytes, and females have higher proportions of B cells with greater basal immunoglobulin levels⁶. It is plausible that in the absence of sex hormone expression during childhood, these sex-related biological differences could arise during gestation or the subsequent mini puberty that occurs during infancy^7,8. Hormonal levels of progesterone and testosterone fluctuate during pregnancy, serving distinct roles during this gestational period and in the fetal developmental timeline^8,9,10. After birth, sex hormone levels rise briefly in association with maturation of sex organs^8,9,10. These sex hormones may also interact with environmental stressors and exposures, ranging from pollution to phthalates, resulting in altered hormonal levels in the fetus and disparities in later health outcomes between males and females^11,12,13. 
This research on birth tissues, such as umbilical cord blood, suggests that sex-based health disparities in later life could be related to biological differences in the immune system that are established during gestation and further modified by the re-emergence of sex hormones in puberty^5,7. Even with the observation that children converge on a common developmental trajectory overall, there is compelling evidence that sex differences are present in immune responses and health outcomes beginning at birth and proceeding throughout the life course.

Life course trajectories can be reflective of biological embedding of experience¹⁴. The Developmental Origins of Health and Disease (DOHaD) framework hypothesizes that the first 1000 days of life, conception to about two years of age, are characterized by rapid development. During this period, environmental exposures or intrinsic psychological states will impact the developing fetus and may have significant, long-lasting influence on the offspring’s health and wellbeing^15,16. One such mechanism associated with biological embedding of early life experiences is epigenetics, mitotically heritable changes that occur around the genome with the potential for influencing gene expression while not altering the underlying sequence¹⁷. DNA methylation (DNAm), a well-characterized epigenetic mark in human population studies, involves the addition of a methyl group to the cytosine DNA residue. Importantly, DNAm is dynamic and can be reflective of the environment^18,19,20. Differences in DNAm have been associated with age, sex, and other biological features, exposures, and disease states^21,22,23,24. This chemical tag is also associated with transcriptional changes and involved in defining and maintaining cell type identity¹⁷. With this feature in mind, the choice of investigated tissue is imperative for the interpretation and findings of epigenetic studies. While blood is a commonly used peripheral tissue in DNAm research, it is directly relevant for observing immunological activity due to its constituent cell types²⁵. Although blood, an immune-relevant tissue, is commonly used in DNAm studies, few have leveraged this epigenetic mark to explore molecular developmental differences specific to the immune system during early life.

Epigenetic research in pediatric populations has demonstrated that children experience more dynamic changes in DNAm compared to adults. However, the implications of DNAm changes in the immune system during early life have not been well explored²⁶. Given the extensive DNAm remodeling observed in childhood, it is essential to investigate how these changes influence specific developmental pathways. Several studies have reported substantial autosomal DNAm changes in early childhood associated with biological sex and enriched for developmental processes^27,28. Sex differences in autosomal DNAm have also been observed in adults where a number of co-methylated regions (CMRs) were found to be consistently different between the sexes from adolescence into older age²⁹. While these findings suggest genetic-based sex differences that extend beyond the sex chromosomes, they have not specifically examined immune system development. As a result, little is known about how DNAm dynamically contributes to early immune maturation or whether these patterns relate to later-life sex differences in immune responses.

Our study aims to explore the dynamics of the immune system in early life, focusing on DNAm differences over time and between biological sexes in the pediatric population. Leveraging a deeply phenotyped cohort, the Canadian Healthy Infant Longitudinal Development (CHILD) study, we measured DNAm in venous whole blood collected at ages one and five. We first explored the developmental alterations of predicted immune cell type proportions between these ages and sexes. Next, we conducted age- and sex-specific DNAm association studies employing immune gene candidates measured on a high-dimensional microarray. We subsequently validated the majority of the sex differences in independent pediatric cohorts. Together, we found significant sex differences in both cell type proportions and immune-specific DNAm, noting that DNAm differences between the sexes are relatively stable during the first five years of life. These data fill the gap in knowledge regarding developmental and sex-specific cell type composition and DNAm changes within the developing pediatric immune system.

Results

Estimated cell type proportions significantly changed in the first five years of life and exhibited sex-associated differences

Immune cell composition dynamics in early life are reflective of development and exposures as children grow. Leveraging our longitudinal pediatric cohort (CHILD), we sought to validate early life immune cell changes previously established from empirical measures such as complete blood cell counts and flow cytometry using DNAm data³⁰. Using DNAm-based deconvolution algorithms, we first estimated the proportions of 12 immune cell subtypes at age one and age five and next compared the changes between these time points (Fig. 1, Supplementary Data 1). Each cell type in both the innate and adaptive arm exhibited significant changes in their predicted cell type proportion between age one and age five (q < 0.05). Notably, the difference in neutrophils, B memory, B naive, CD4 memory, CD4 naive, CD8 memory, and CD8 naive cells between age one and age five had large effect sizes and exhibited high likelihood ratios as assessed by Cohen’s d and Bayes Factor, respectively, which provided further evidence to support that the two timepoints were significantly different from each other.
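As an illustration of the effect-size comparison described above, the following minimal sketch computes a paired-sample Cohen's d for one cell type. It assumes the d_z variant (mean within-child difference divided by the standard deviation of the differences), which may differ from the exact formulation used in the study; the function name and the proportions are hypothetical and are not study data.

```python
from statistics import mean, stdev


def paired_cohens_d(before, after):
    """Paired-sample Cohen's d (d_z variant, an assumption here):
    mean of the within-child differences divided by their sample SD."""
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(after, before)]
    return mean(diffs) / stdev(diffs)


# Hypothetical estimated proportions of one cell type for four children.
age_one = [0.30, 0.35, 0.28, 0.33]
age_five = [0.42, 0.45, 0.41, 0.44]

d = paired_cohens_d(age_one, age_five)
print(f"paired Cohen's d = {d:.2f}")
```

By the usual convention, |d| of about 0.8 or more is read as a large effect, so a consistent shift across children yields a large value even when the absolute change in proportion is modest.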

**Fig. 1: Cell type proportions predicted using bioinformatic deconvolution methods and compared between time points showed changes associated with immune system maturation.**

As these were paired data, we next examined how the individual participants’ immune cell profiles changed between time points. Most children had a clear and consistent trajectory (Table 1). In neutrophils, as well as in Bnv, Bmem, CD4nv, CD4mem, and CD8nv subsets, fewer than 10% of children demonstrated a change in the direction opposite to the predominant trend. In contrast, within eosinophils, 43% of children exhibited a decrease in proportion, while the remaining 57% experienced an increase.
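The within-participant summary can be sketched as the share of children whose change runs against the majority direction. This is an illustrative calculation only: the function name and values are hypothetical, and dropping children with no change from the denominator is an assumption rather than a documented choice.

```python
def share_against_trend(before, after):
    """Fraction of participants whose change is opposite to the majority
    direction. Children with no change are excluded (an assumption)."""
    diffs = [a - b for a, b in zip(after, before)]
    increases = sum(1 for x in diffs if x > 0)
    decreases = sum(1 for x in diffs if x < 0)
    changed = increases + decreases
    if changed == 0:
        return 0.0
    # The minority direction is, by definition, the one against the trend.
    return min(increases, decreases) / changed


# Hypothetical proportions of one cell type for five children.
prop_age_one = [0.02, 0.03, 0.04, 0.02, 0.05]
prop_age_five = [0.03, 0.02, 0.05, 0.03, 0.04]

minority_share = share_against_trend(prop_age_one, prop_age_five)
print(f"{minority_share:.0%} of children moved against the majority trend")
```

A value near 0 indicates a near-universal trajectory, whereas a value approaching 0.5, as reported for eosinophils, indicates no dominant direction.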

Table 1 Summary of changes in estimated immune cell-type proportions between ages 1 and 5

In addition to significant cellular changes in the immune system during the first five years of life, immune differences between the sexes are well-documented across the lifespan⁴. In this study, sex was based on the biological sex determined at the time of birth for each participant. Given the limited exploration of these differences in early life, we next compared cell type proportions at ages one and five between males and females (Fig. 2). At both time points, basophils, CD4 memory cells, and T regulatory cells (Tregs) were significantly different between the sexes (q < 0.05) (Supplementary Data 2). By age five, additional significant differences in the pediatric immune profile were identified with males having a greater proportion of monocytes and females having a greater proportion of CD8 naive cells. These data not only confirm established changes in cell type proportions in pediatric populations but also reveal sex-associated differences in immune cell populations during early life.

**Fig. 2: Predicted cell type proportions showed differences between sexes.**

Immune-specific epigenome association study revealed age- and sex-associated DNAm signatures in the first five years of life

Given the significant changes observed in all cell types between time points, and recognizing that functional changes occur during early development, we next employed a site-specific DNAm analysis of immune candidate genes to gain a detailed understanding of the molecular changes underlying immune system development.

We tested the associations between DNAm and time or sex by fitting robust linear regression models, which are less sensitive to potential outliers and non-normally distributed data. Statistically significant DNAm sites were identified based on a statistical cut-off FDR of 0.05 (p < 0.002) and a Δβ threshold of greater than 3% (|Δβ| > 0.03) (Supplementary Data 3). CpGs discovered in all presented regression models also met the same significance threshold in a sensitivity analysis corrected for immune-relevant estimated cell type proportions (Supplementary Data 4).
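The two-part significance rule (an FDR cut-off combined with an effect-size threshold) can be sketched as follows. The text does not name the FDR procedure, so the Benjamini-Hochberg adjustment used here is an assumption; the function names, CpG identifiers, and values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


def significant_sites(sites, fdr=0.05, min_abs_delta_beta=0.03):
    """Keep sites that pass both the FDR and the |delta-beta| thresholds.
    `sites` is a list of (cpg_id, p_value, delta_beta) tuples."""
    qvalues = benjamini_hochberg([p for _, p, _ in sites])
    return [
        cpg
        for (cpg, _, delta_beta), q in zip(sites, qvalues)
        if q < fdr and abs(delta_beta) > min_abs_delta_beta
    ]


# Hypothetical regression results: (CpG id, p-value, delta-beta).
sites = [
    ("cpg_a", 0.0001, 0.05),   # passes both thresholds
    ("cpg_b", 0.0004, 0.01),   # significant, but effect size too small
    ("cpg_c", 0.03, -0.04),    # passes both (negative direction)
    ("cpg_d", 0.6, 0.08),      # large effect, but not significant
]
print(significant_sites(sites))
```

Requiring both conditions guards against reporting sites that are statistically detectable in a large cohort but too small in magnitude to be of likely biological interest.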

We identified age-associated differential DNA methylation at 10 CpGs within nine genes out of 4968 CpGs across 628 immune system candidate genes, using robust linear regression analysis (Fig. 3, Table 2). Seven of the ten CpG sites were located within larger co-methylated regions (CMRs), two of which showed significant associations with age (Supplementary Data 5).

**Fig. 3: Significant changes in immune system DNAm associated with age.**

Table 2 Top differentially methylated CpG sites associated with age across the whole cohort and stratified by sex

When stratifying the same analysis by sex, we noted age-specific associations in seven CpGs in males and a non-overlapping set of seven CpGs in females (Fig. 3d). Of these CpGs, four CpGs were consistent between all samples and female only associations, and five CpGs were consistent between all samples and male only associations. Although not statistically significant in the sex stratified analyses, all 10 significant CpGs from the cohort-wide analysis exhibited consistent Δβ directionality in males and females (Supplementary Data 6). Overall, we identified age-specific DNAm associations with the immune system during this window of development that were significant for all children.

Given that we observed both cohort-wide DNAm differences in the immune system genes with age in conjunction with cell type proportion differences with sex, we next explored potential sex-associated DNAm differences at each time point. Within immune candidate genes, at age one, we identified 46 CpGs significantly associated with sex, and at age five we identified 37 CpGs significantly associated with sex (Fig. 4, Supplementary Data 3). Of all significant sex-associated CpGs, 77% (36 out of 47) exhibited consistent Δβ of sex differences across ages in both direction and magnitude (Pearson correlation: 0.995, p < 2.2e-16), and the Δβs of all 47 CpG sites were also highly correlated at both time points (Pearson correlation: 0.98, p < 2.2e-16). Furthermore, 38 of the sex-associated CpGs at age 1 and 29 at age 5 were located within larger co-methylated regions (CMRs), comprising 17 and 15 regions, respectively. Of these, 10 CMRs at age 1 and 7 at age 5 showed significant associations with sex (Supplementary Data 5).
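The counts above fit together by inclusion-exclusion, as the short check below shows. It assumes that the 36 consistent CpGs are exactly those significant at both ages, which is an interpretation of the reported numbers rather than something stated explicitly; the variable names are illustrative.

```python
sig_age_one = 46    # CpGs significantly associated with sex at age one
sig_age_five = 37   # CpGs significantly associated with sex at age five
shared = 36         # assumed: CpGs significant at both ages

# Inclusion-exclusion gives the total number of distinct CpGs.
union = sig_age_one + sig_age_five - shared
only_age_one = sig_age_one - shared
only_age_five = sig_age_five - shared
shared_fraction = shared / union

print(union, only_age_one, only_age_five, f"{shared_fraction:.0%}")
```

Under this assumption the totals are 47 distinct CpGs, of which 10 are specific to age one and 1 is specific to age five, with 36/47 (about 77%) shared.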

**Fig. 4: Immune system specific DNAm associated with sex.**

Validation of sex differences in DNAm

Finally, to confirm the robustness of our sex-associated DNAm findings, we took advantage of two independent cohorts where samples were collected at approximately the same ages as the children in our discovery cohort, CHILD. Specifically, one dataset was obtained from the publicly available repository, GEO (GSE142512), consisting of the healthy control one-year-old individuals in a previously published study on DNAm associations with Type 1 Diabetes, and the other dataset was composed of four-year-olds from a population-based prospective cohort: INMA^31,32. As the samples in the validation datasets were profiled using an older Illumina platform, the 450k array, which quantifies DNAm at approximately half the sites that are included on the EPICv1 array, only 35 of the original 47 significant sites were available to be tested. Using the same technical threshold as the previous analyses, 15 CpG sites validated at age one and 16 validated at age four (|Δβ| > 0.03 and p value < 0.05), showing sex-associated differential methylation. Of the remaining sites that did not meet the effect size threshold (|Δβ| > 0.03), 18 and 17 CpGs at ages one and four/five, respectively, still had effect sizes trending in the same direction as the original analysis and passed the significance threshold. All of the sites tested exhibited high correlation between the discovery and validation cohorts; R² = 0.8, p = 7.3e-13 at age one and R² = 0.9, p = 7.4e-15 at age five (Fig. 5, Supplementary Data 7, Supplementary Data 8).
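A minimal sketch of the validation logic follows: a per-site check against the stated thresholds, plus the correlation of effect sizes between cohorts. Requiring the same direction of effect as in the discovery cohort is an assumption added here for illustration, and R² is taken to be the squared Pearson correlation; all names and values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def validates(discovery_db, validation_db, validation_p,
              min_abs_delta_beta=0.03, alpha=0.05):
    """True if the validation effect passes both thresholds and has the
    same sign as the discovery effect (sign check is an assumption)."""
    same_direction = discovery_db * validation_db > 0
    return (same_direction
            and abs(validation_db) > min_abs_delta_beta
            and validation_p < alpha)


# Hypothetical sex-associated delta-beta values for four CpG sites.
discovery = [0.05, -0.04, 0.06, 0.035]
validation = [0.045, -0.02, 0.05, 0.04]
validation_p = [0.001, 0.01, 0.0005, 0.2]

flags = [validates(d, v, p)
         for d, v, p in zip(discovery, validation, validation_p)]
r = pearson_r(discovery, validation)
print(flags, f"R^2 = {r ** 2:.2f}")
```

In this toy example the second site fails only the effect-size threshold while still agreeing in direction, which mirrors the category of sites described above as trending in the same direction as the original analysis.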

**Fig. 5: Strong correlation of sex-associated Δβ between original discovery and validation cohorts.**

Discussion

The early life developmental window is a critical period that may establish lifelong health trajectories^15,16. The biological foundations of these trajectories can be better understood by examining the dynamics of the typically developing immune system as reflected in changing cell type proportions and molecular markers during early childhood¹⁴. Leveraging DNAm data from the longitudinal CHILD birth cohort, we characterized changes in whole blood immune cell proportions and DNAm patterns over the first five years of life. We observed system-wide changes in immune cell proportions with sex differences evident as early as age one, while DNAm patterns showed few age-specific changes but stable, consistent sex-associated differences. Collectively, these findings provide insight into immune composition and molecular changes underpinning immune system development, highlighting both the temporal dynamics and the prominent role of sex-associated DNAm patterns in this healthy pediatric population.

To begin exploring the immune system composition, we used a bioinformatic approach with an adult reference panel, allowing us to predict cell type proportions in this pediatric population. We were able to recapitulate early-life cell type changes previously established using other empirical methods. Our findings align with these prior observations, including the gradual increase of memory T cells across childhood identified through flow cytometry, the balance between B memory and B naïve cells observed via high-dimensional cytometry, and the inverse nature of neutrophil and lymphocyte proportions during pediatric development captured by complete blood cell counts^30,33,34. By utilizing cell type specificity inferred from bioinformatic predictions and validating these estimates through replication of prior findings, we demonstrate the applicability and utility of this approach for studying immune cell proportions from DNAm data in pediatric populations, particularly when empirical measurement methods are not available. This framework provided the basis for examining how these inferred immune cell proportions change across the rapid developmental period of early childhood.

Given that the first five years of life are marked by rapid immune development, we anticipated observing significant changes in all cell type proportions, with larger effect sizes in the adaptive arm compared to the innate arm, reflecting the accumulation of immunological memory from early-life exposures³⁵. In our study, we observed small effect sizes as assessed by Cohen’s d in most of the innate cell types (i.e., monocytes, eosinophils, and basophils). This was as we anticipated based on the known stability of the innate immune system over time, primarily in order to mount a consistent response to pathogenic exposures³⁵. Alternatively, given the role of the adaptive immune system to mount a learned response over time, such as in the case of vaccinations, we hypothesized that the adaptive immune cell type proportions would not only change over time, but that these changes would be larger in magnitude³⁵. As anticipated, the largest effect sizes were observed in changes to B and T cell proportions between one and five years old. These shifts reflected the dynamic developmental period during which children are increasingly exposed to environmental stimuli, prompting the transition from a predominantly naïve immune system to one characterized by acquired immunological memory¹. The study design also allowed us to examine within-participant changes during this developmental window. While most children exhibited a consistent pattern of change, some deviations from the majority were observed. Notably, eosinophils showed the most striking variation, which, given their small population, may be linked to the future development of allergy or asthma and warrants further investigation. These findings highlight the dynamic nature of the immune landscape in early childhood and underscore the importance of longitudinal studies in understanding individual variability and potential implications for future health outcomes.

We next investigated site-specific changes in DNAm patterns of the pediatric immune system between ages one and five to explore molecular features of immune development, which may reflect shifts in cell type proportions or functional changes associated with maturation. Owing to the fact that we observed large changes in cell type proportions with time, which was suggestive of a dynamic and developing immune system during this pediatric window, we expected to see immune-specific DNAm changes between one and five years old in a large set of immune genes. Instead, we identified a small number of significant DNAm associations annotated to nine genes, including NLRC5 and LAG3. As these genes are involved in antiviral immunity and antigen presentation, respectively, it is conceivable that we captured a signature of immune system maturation in whole blood that could be a result of vaccination or an overall increase of exposure to pathogens as children are entering daycare and increasing their social interactions with age³⁶. While these age-associated findings offered some insight into immune development, they did not fully explain the biological variation observed in our cohort.

We therefore examined sex-associated differences as another potential contributor to this variation. Analyzing changes from age one to five separately in males and females revealed significant, yet non-overlapping, CpGs. In females, but not males, two CpGs annotated to NOD2, a gene involved in bacterial recognition and proper immune activation, exhibited significantly different DNAm between ages one and five. Females are typically seen to mount a stronger innate immune response than males in early life, and the observed sex-associated DNAm differences in NOD2 CpGs may be involved in the molecular underpinnings of that observation². In males, but not females, two CpGs annotated to DHX58, a negative regulator of type 1 interferon, had significantly different DNAm between ages one and five. Type 1 interferon plays a key role in antiviral defense and may contribute to the higher incidence of viral infections reported in males³⁷. These minor differences in observed immune system-related DNAm changes in males and females during the first five years of life align with sex differences observed in health outcomes and could play a role in the underlying mechanism of immune system function. Although not statistically significant, effect sizes for sex-stratified CpGs in one sex tended to be similar in direction and magnitude in the other, suggesting that limited power or interindividual variation may have influenced the results, warranting validation in larger independent cohorts.

We were ultimately surprised to find what seemed like a fairly small number of age-specific immune system DNAm changes in both the cohort-wide and sex-stratified analyses. This limited number of changes may be partly attributable to the use of whole blood, a heterogeneous tissue in which the detectability of molecular changes can be constrained by cell type composition, particularly for DNAm differences of smaller magnitude (<3%), as noted in this study. Further, DNAm works in tandem with other epigenetic mechanisms, such as histone modification, to shape cell type identity and immune system function during childhood, indicating other epigenetic mechanisms should be investigated in the future³⁸. Together, we demonstrated DNAm within the immune system is relatively stable during this developmental window and is likely not the primary molecular driver of the observed compositional and functional changes in whole blood-based analysis.

With sex differences in immune responses and health outcomes observed across the life course, we also compared the estimated cell type proportions of males and females in our pediatric cohort. Currently, literature exploring sex differences in the immune system has typically included older adults or post pubescent populations when sex hormones are actively produced and modulating the immune responses, yet similar studies are lacking in early pediatric populations³. To address this gap, we compared the estimated cell type proportions between males and females at age one, and observed significant differences in basophils, CD4 memory T cells, and Tregs that persisted through age five. Differences in Tregs have been reported between older men and women, but much of what is known about Treg proportions and their immunosuppressive function is hypothesized to be associated with sex chromosomes and hormonal interactions^39,40. Our findings indicate these cell type proportional differences were present as early as one year old and bear further investigation across ages, as it may be plausible that hormones during gestation or the presence of sex chromosomes establish this difference prior to puberty. We also identified several sex-associated cell differences unique to a specific time point. For example, monocyte proportions were not significantly different between the sexes at age one but exhibited sex-associated differences by age five. In the literature, sex differences have been observed in monocytes, but these were from adults aged 25−71 years, where males were seen to have a consistently higher proportion of monocytes as compared to females, the same trend that we saw emerge in our study at age five⁴¹. Although multiple studies have highlighted significant sex differences in cell type proportions in adulthood, our study has provided compelling evidence of sex-associated cell type proportion differences that exist as early as one year old and persist through development.

While there were limited changes in immune-specific DNAm observed across early life in an age-associated manner, we further explored these molecular differences between the sexes at each time point. In contrast to the surprisingly small number of CpGs associated with immune maturation over time, we found four times more significant DNAm differences associated with sex at each time point, as well as more than a 75% overlap in sex-differential DNAm congruent in ages one and five. This consistency in sex-associated immune DNAm in the first five years of life suggests that these differences are present and stable throughout childhood, potentially indicating there may be different biological starting points for the male and female immune system even before age one that persist throughout early life immune development. The discovered CpGs were annotated to genes such as EGR2, CFD, SPIB, RUNX1, and ALOX15, which are involved in multiple elements of the immune response from the complement pathway in innate immunity to lymphoid-specific enhancers in the adaptive arm. While there was only one sex-associated CpG uniquely identified at age five, we identified ten CpGs that were different between the sexes at age one but not age five. These were annotated to ALOX5, IL1RL2 and RELB, which are genes involved in inflammation, suggesting that sex differences in inflammatory responses could be more pronounced at age one. Sex differences have been reported in other acute inflammatory conditions, supporting the notion that males and females exhibit distinct inflammatory responses during early life. Specifically, males experience sepsis, a systemic inflammatory response mounted by the immune system, 2.8 times more frequently than females in the first year of life, but by age 8 there is a convergence of disease rates between the sexes⁴². Aside from inflammation, we observed the most significant difference in DNAm between the sexes at both time points was a CpG in CALM1. 
As this gene is responsible for calmodulin regulation of ion channels, sex differences could be associated with calcium-based messaging in lymphocyte cell types⁴³. We also identified significant sex differences in lymphocyte proportions at both time points, which may be linked to differential DNAm at immune-related loci. Collectively, our findings show that the majority of sex-associated DNAm differences were stable across ages one and five, reinforcing the idea that immune-related epigenetic profiles are established early in life, remain consistent during this developmental window, and may underlie observed sex-based disparities in immune-related health outcomes. Notably, two of the sex-associated CMRs identified in our analysis associated with ITGA2B and EGR2 overlapped with previously reported sex-specific differentially methylated regions in adults, further highlighting the persistence of these epigenetic differences across the lifespan². These compelling results prompted us to evaluate whether these findings could be replicated in independent pediatric cohorts.

Replicating results in DNAm studies is often a challenge in the field⁴⁴. Though we did not have access to a comparable longitudinal cohort with the same collection time points and sample type (whole blood), we leveraged two previously published independent pediatric DNAm cohorts with whole blood samples to validate our findings at each time point separately. At age one, we were able to use the healthy controls from a study focused on Type 1 Diabetes, and at age five we used INMA, a Spanish birth cohort with samples collected at age four^31,32. At both time points, we were able to validate a significant subset of our sex-associated differences (approximately 30%) at the same statistical and biological thresholds in spite of smaller sample sizes and differences in genetic ancestry. In fact, for the CpG sites that did not meet the effect size cut-off, we still observed strong correlation between the effect sizes in the discovery and validation cohorts, regardless of the time point. Our validation confirmed the robustness, stability, and consistency of sex-based differences in early life immune system autosomal DNAm, despite platform disparities, limited sample sizes, and varied populations. These findings underscore the reliability of our approach and provide a foundation for future studies that continue to explore the nuances of immune development in early childhood.

The goal of this study was to characterize cell type changes and DNAm signatures of normative immune development within and between the sexes in the first five years of life. To investigate cell type changes in our pediatric population, we relied on bioinformatic deconvolution tools using an adult reference. Though this adult reference has been widely used in pediatric populations, future research would benefit from validation of these findings with empirical methods such as cell sorting. At this stage, we cannot determine the functional significance of the observed DNAm changes, as transcriptomic data were not available for integration. However, combining DNAm and gene expression data in future studies would be a valuable next step for elucidating the biological relevance of these epigenetic differences. We also observed inter-individual variation in immune cell type proportions, which is expected given the heterogeneous nature of blood. To ensure the robustness of our group comparisons, we employed additional statistical approaches to confirm meaningful differences, even in cases where the distributions appeared similar. Future translation or extrapolation of our findings could also be limited by the demographics of the current cohort. Specifically, we identified immune-specific DNAm changes in a normative population, yet, as these children have access to regular health care, are predominately of high socioeconomic status, and have a low incidence of allergy or severe illness, the findings may not translate to a population with more diversity or wider variety of health outcomes. We also observed significant differences between study subsets in study center distribution, self-reported maternal ethnicity, and mode of delivery. To account for these differences, genetic ancestry and study center were included in the models. However, mode of delivery was not included as a covariate, as it has not been shown to be significantly associated with DNAm in later life⁴⁵. 
Furthermore, the two collection time points at one and five years of age allowed us to focus on the early life developmental window, where we were able to identify previously underreported immunological sex differences in the prepubescent period. However, in the absence of DNAm data extending beyond five years of age, we are currently unable to determine whether the identified sex-associated DNAm differences in immune system genes persist until, during, or after puberty, especially with the introduction of sex hormones. Future work utilizing larger, more diverse cohorts with additional time points to not only replicate, but also expand on these findings, may provide valuable insights for pediatric research, advancing understanding of the molecular basis of immune system development, and informing clinical approaches to reduce sex-based health disparities⁴⁶.

Our study sought to leverage a longitudinal DNAm cohort to characterize pediatric immune system development. Though we observed significant changes in immune cell type proportions with age, the site-by-site DNAm changes over time were small and suggest that DNAm in immune-related genes is relatively stable during this period of development. We also identified consistent sex-associated differences in both cell type proportions and autosomal DNAm in prepubescent children. It is reasonable to hypothesize that during gestation, when sex hormones are present, a divergence in the male and female immune system may be set in place. While there are notable distinctions between male and female immune cell composition and immune-related DNAm patterns, many of these differences persist throughout the first five years of life, suggesting males and females have distinctive biological starting points, but are on a similar immune system developmental trajectory. This study characterizes early-life immune changes in typically developing children and provides a foundation for future research aimed at understanding and addressing sex disparities in pediatric health.

Methods

Participant description

The Canadian Healthy Infant Longitudinal Development (CHILD) Study began enrollment in 2008, recruiting a total of 3621 pregnant women from four major cities in Canada (Vancouver, Edmonton, Winnipeg, Toronto). Eligible infants (n = 3,454) with no congenital abnormalities and born at a minimum of 35 weeks of gestation were also enrolled and have been followed from pregnancy to age five years, and beyond⁴⁷. Informed consent was obtained from parents at the time of enrollment. CHILD Study participants were followed prospectively and thorough information on environmental exposures, clinical measurements, and assessments were gathered through a combination of questionnaires and in-person clinician appointments.

The samples used in the current study are a representative sub-cohort of infants (n = 760 paired, 1520 total) drawn from the CHILD study who had whole blood collected at ages one and five (Table 3). Biological sex was collected at birth.

Table 3 Study Demographics: Demographic characteristics of the full cohort, study subset, and participants stratified by sex


This study adheres to all applicable ethical guidelines and was developed and authorized by the University of British Columbia, University of Manitoba, University of Toronto, McMaster University, BC Children’s Hospital, The Hospital for Sick Children, and Simon Fraser University. The Research Ethics Board (#H07-03120) examined and authorized this study in accordance with the Tri-Council Policy Statement: Ethical Conduct for Human Research (TCPS2, 2018). All ethical regulations relevant to human research participants were followed.

DNA sample collection and microarray profiling

DNA was extracted from whole blood samples (n = 1616) using the DNeasy Blood & Tissue Kit (Qiagen, Venlo, The Netherlands) with concentration and quality assessed using a NanoDrop 8000 Spectrophotometer (Thermo Fisher Scientific, United States). For DNAm array profiling, purified DNA was bisulfite converted using EZ-96 DNA Methylation kit (Zymo Research, Irvine, CA, United States) and DNAm profiles of the samples were measured with the Infinium MethylationEPIC BeadChip array version 1 (EPICv1) (Illumina, San Diego, CA). Raw intensity IDAT files for 866,836 data points encompassing 863,904 CpG sites were produced and preprocessed for downstream analysis. This included 1616 individual samples plus 12 technical replicates randomized across 293 chips by consistent operators.

For whole genome single nucleotide polymorphism (SNP) profiling, purified DNA from all cohort samples was hybridized to the GSA v3 + Psych v1 array (Illumina, San Diego, CA) to measure SNP markers at 699,424 genotypes.

DNA methylation preprocessing

All available whole blood DNAm data from children collected at ages one and five (n = 1,628) were processed in RStudio (version 4.0.3). The ewastools R package was employed for sample quality control to assess multiple technical parameters including array staining, extension, hybridization, target removal, specificity, and bisulfite conversion based on 636 control probes⁴⁸. Next, the minfi R package was used to assess methylated and unmethylated intensities and to check concordance between reported sex and sex inferred from sex probe intensities, and the lumi R package was used to detect potential outliers^49,50,51. Poorly performing samples were defined as those with a high proportion of poor detection p values (p > 0.01 on > 1% of all probes), with DNAm intensity statistically different from the average intensity of the negative control probes, and/or with a bead count less than 3 on > 1% of all probes. The 59 SNP probes included on the array were also leveraged to check sample identity, as the one- and five-year samples from each child should cluster together. The 12 technical replicates were removed from the final data set, and based on the quality control checks, 14 of the 1,616 independent samples were removed: seven due to sex mismatch, four due to missing bead count, and three due to lack of SNP agreement.

Next, probe filtering was conducted to remove poorly performing probes, probes on the sex chromosomes, and those predicted to be cross-hybridizing or containing SNPs⁵². Poorly performing probes were identified as those with a high detection p value (p value > 0.1) and missing data (bead count < 3)⁵³. Beta-Mixture Quantile (BMIQ) normalization was done in conjunction with normal-exponential out-of-band (noob) correction to correct for probe type bias and background signal⁵⁴. Finally, to correct for known technical confounding factors, variation associated with batch (chip, row, and plate) was removed using the ComBat function from the sva R package⁵⁵. Principal component analysis (PCA) was employed during each step of data preprocessing to confirm that normalization and batch correction were appropriately performed on the data (Supplementary Fig. 1). After all preprocessing steps, a total of 1602 samples and 786,363 probes were included in the final sample set. For subsequent analyses, samples were subset to those with matched data at both the age one and age five collection time points (760 children at each time point for 1520 total distinct samples).

DNAm data reduction

For further data reduction, and to meet the assumption of equal likelihood of discovery underlying Benjamini–Hochberg false discovery rate (FDR) correction, all DNAm probes that passed preprocessing were subjected to interquartile range filtering to retain only variable probes, defined as those where the DNAm beta (β) values varied by at least 5% across all samples between the 5th and 95th percentiles. A total of 332,560 variable probes were included in the final object and used in downstream analyses and composition of the candidate list.
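The variability filter described above can be illustrated with a minimal Python sketch. It is not the code used in this study (the analyses were run in R); the probe IDs, the toy beta values, and the function names `percentile_range` and `variable_probes` are hypothetical, and the percentile interpolation method is an assumption because the text does not specify one.

```python
from statistics import quantiles

def percentile_range(betas, lower=5, upper=95):
    """Spread of one probe's beta values between two percentiles."""
    # quantiles(..., n=100) returns the 99 cut points for percentiles 1..99
    cuts = quantiles(betas, n=100, method="inclusive")
    return cuts[upper - 1] - cuts[lower - 1]

def variable_probes(beta_matrix, min_range=0.05):
    """Keep probe IDs whose 5th-95th percentile range is at least min_range."""
    return [probe for probe, betas in beta_matrix.items()
            if percentile_range(betas) >= min_range]

# Toy data: hypothetical probe IDs with beta values across 11 samples
toy = {
    "cg_invariant": [0.50, 0.50, 0.51, 0.50, 0.51, 0.50,
                     0.50, 0.51, 0.50, 0.51, 0.50],
    "cg_variable": [0.10, 0.20, 0.30, 0.40, 0.50, 0.60,
                    0.70, 0.80, 0.90, 0.15, 0.85],
}
kept = variable_probes(toy)
print(kept)
```

In this toy example only the second probe passes, because the first varies by at most 0.01 across samples.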

Computational cell type proportion estimation

To identify developmental and sex differences in cell types within our pediatric cohort, we completed cell type deconvolution to predict cell type proportions from raw, unfiltered DNAm data. The DNAm-based cell type estimates of 12 immune cell populations (Bnv, Bmem, CD4nv, CD4mem, CD8nv, CD8mem, Treg, NK, Monocyte, Neutrophil, Basophil, and Eosinophil) were extracted from the extended Identifying Optimal DNA methylation Libraries (IDOL-ext) reference dataset which has previously reported high correlations between mixtures of known cell type proportions and predicted values^56,57,58. We utilized the estimateCellCounts2 function from the FlowSorted.Blood.EPIC package to noob normalize our samples to the IDOL-ext reference and select the top 100 cell-type-specific probes for deconvolution^56,59.

Genotyping preprocessing

Samples were genotyped on the Illumina Infinium Global Screening + Psych Array (GSA) which measures SNP markers at 699,424 genotypes. Using the genotyping module of Illumina’s GenomeStudio software (version 2.0.4), samples were assessed for call rate and GenCall score. All samples had a call rate > 0.97 and a 10% GenCall confidence score. As there are common SNP probes (59) between the EPICv1 array and the GSA, discordant SNP patterns at these probes were used to identify potential sample mismatches. No mismatches were identified during this check.

SNPs were filtered based on various Illumina recommended metrics including GenTrain score (<0.4), cluster separation score (<0.45), call frequency (<0.97), and intensity of heterozygote and homozygote clusters (AB R Mean ≤ 0.4, AB T Mean ≤ 0.2 or ≥ 0.8, AA Frequency = 1 & AA T Mean ≥ 0.2, AA Frequency = 1 & AA T Deviation ≥ 0.04, BB Frequency = 1 & BB T Mean ≤ 0.8, BB Frequency = 1 & BB T Deviation ≥ 0.04, AA Frequency or BB Frequency = 0 & AB T Deviation ≥ 0.5, AB Frequency = 0). SNPs were also excluded based on a heterozygote excess score (<-0.3 or >0.2). SNPs that deviated from Hardy-Weinberg equilibrium (determined through a chi-squared test, p ≤ 10⁻⁶) were also removed. Next, rare SNP genotypes with a minor allele frequency (MAF) < 1% were excluded as they may reflect inaccuracies in genotype calling. Imputation was conducted using the Enhancing Neuroimaging Genetics through Meta-Analysis (ENIGMA) Consortium imputation protocol based on the Michigan Imputation Server pipeline⁶⁰. After imputation, SNPs with an R² ≤ 0.8 and MAF ≤ 0.01 were filtered out using bcftools v1.16. To minimize redundancy of genetic information between SNPs and reduce the computational burden, SNP data were pruned in PLINK software (version 1.9) based on linkage disequilibrium (LD) using the pairwise comparison method with a 50 kb sliding window, a 5 variant count window shift, and a 0.5 pairwise R² threshold.
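As an illustration of the Hardy-Weinberg filter, the following Python sketch computes a one-degree-of-freedom chi-squared test from genotype counts. It is a generic textbook formulation and not the study's pipeline; the genotype counts and the function names `hwe_chi2_pvalue` and `fails_hwe` are hypothetical.

```python
import math

def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Chi-squared (1 df) test of Hardy-Weinberg equilibrium from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-squared variable with 1 df: erfc(sqrt(x / 2))
    return chi2, math.erfc(math.sqrt(chi2 / 2))

def fails_hwe(n_aa, n_ab, n_bb, threshold=1e-6):
    """Flag a SNP for removal when the HWE p value is at or below the threshold."""
    _, p_value = hwe_chi2_pvalue(n_aa, n_ab, n_bb)
    return p_value <= threshold

# Hypothetical SNP in equilibrium (allele frequency 0.5)
chi2_ok, p_ok = hwe_chi2_pvalue(250, 500, 250)
# Hypothetical SNP with a strong heterozygote deficit
chi2_bad, p_bad = hwe_chi2_pvalue(400, 200, 400)
print(fails_hwe(250, 500, 250), fails_hwe(400, 200, 400))
```

Only the second hypothetical SNP would be removed at the p ≤ 10⁻⁶ threshold stated above.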

After obtaining the LD pruned dataset with 1,260,703 SNPs, we confirmed sample independence, an assumption of the majority of statistical models. Sample relatedness was checked by calculating kinship coefficients (identity by descent) based on Maximum Likelihood Estimation (MLE) in the SNPRelate R package. Based on a kinship coefficient of 0.5, 3 pairs of related samples were identified; these were technical replicates and, being sample duplicates, were expected to be identical. Only one sample from each of the 3 pairs was retained for downstream analysis.

The final genotyping data set consisted of 1,260,703 SNPs and 822 samples.

Genetic ancestry estimation

We next inferred the genetic structure of the study population using the 2504 individuals across four major superpopulations from the 1000 Genomes Project as the reference. Using the SNPRelate R package, we performed a principal component analysis (PCA) including both sets of samples and the LD-pruned SNPs shared between the CHILD cohort and the 1000 Genomes Project (Supplementary Fig. 2). The first 2 components from the PCA, representing 73% of the overall genetic variance, were included as covariates in the statistical models.

Immune candidate gene list

Leveraging established array-based assays, the genes included in the NanoString nCounter® immunology panel were selected as candidates for the immune-focused arm of this study. Based on Illumina’s annotation of the gene names as informed by the UCSC genome browser, of the 1033 genes included in the NanoString panel, 989 were present on the EPICv1 array. After subsetting to variable probes that were not located in gene regions with known splice variants, the final testing space included 4968 autosomal probes used in subsequent analysis (Supplementary Data 9). The chromatin structure associated with these locations was annotated using ChromHMM with the neutrophil and PBMC reference^61,62.

Cell type proportion comparisons

To test for statistically significant differences in cell type proportions between ages one and five and between sexes, paired t-tests were performed and a Bonferroni adjustment was applied to the p values with a threshold of 0.05, using the t_test and adjust_pvalue functions in the rstatix package. We then assessed effect size using Cohen’s d in the lsr R package and the evidence supporting the null hypothesis using Bayes Factors calculated with the BayesFactor R package^63,64.
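The quantities involved in these comparisons can be sketched in Python as follows. This is an illustrative stand-in and not the R code used in the study: the cell type proportions are made up, the paired form of Cohen’s d (mean difference divided by the standard deviation of the differences) is an assumption about the variant used, and the p value lookup from the t distribution is omitted because the Python standard library does not provide it.

```python
import math
from statistics import mean, stdev

def paired_differences(first, second):
    """Per-child difference between the second and first measurement."""
    return [b - a for a, b in zip(first, second)]

def paired_t_statistic(first, second):
    """t statistic of a paired t-test (n - 1 degrees of freedom)."""
    d = paired_differences(first, second)
    return mean(d) / (stdev(d) / math.sqrt(len(d)))

def cohens_d_paired(first, second):
    """Cohen's d for paired data: mean difference over SD of the differences."""
    d = paired_differences(first, second)
    return mean(d) / stdev(d)

def bonferroni(p_values):
    """Bonferroni-adjusted p values, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical proportions of one cell type in four children at each age
age_one = [0.10, 0.12, 0.11, 0.13]
age_five = [0.14, 0.15, 0.16, 0.15]
t_stat = paired_t_statistic(age_one, age_five)
effect = cohens_d_paired(age_one, age_five)

# Hypothetical raw p values for three cell types
adjusted = bonferroni([0.001, 0.02, 0.5])
print(t_stat, effect, adjusted)
```

With the paired definition, the t statistic equals Cohen’s d multiplied by the square root of the number of pairs, which is why a large sample can yield a significant test even when the standardized effect is small.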

Technical replicate Delta Beta (Δβ) threshold calculation

For this study, the biological effect size threshold was determined using the root mean square error (RMSE) of the 12 age five technical replicates after preprocessing (0.0199) added to the maximum standard error across all CpGs (0.008386347). This value, 0.0283, was rounded up to two decimal places, yielding a threshold of |Δβ| > 0.03, which is considered to be representative of an effect size greater than technical noise.
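The arithmetic behind this threshold can be reproduced with a short Python sketch. The two input values are those reported above; the `rmse` helper and its toy replicate pairs are hypothetical and only indicate how an RMSE between paired replicate measurements could be computed, since the exact aggregation across CpGs and replicate pairs is not detailed here.

```python
import math

def rmse(pairs):
    """Root mean square error between paired replicate beta values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in pairs) / len(pairs))

def effect_size_threshold(replicate_rmse, max_standard_error, decimals=2):
    """Sum the two noise terms, then round up to the given number of decimals."""
    total = replicate_rmse + max_standard_error
    scale = 10 ** decimals
    return total, math.ceil(total * scale) / scale

# Values reported in the text
total, threshold = effect_size_threshold(0.0199, 0.008386347)
print(round(total, 4), threshold)

# Hypothetical replicate pairs, only to show the RMSE helper
toy_rmse = rmse([(0.50, 0.52), (0.30, 0.28)])
```

The sum is 0.028286347, which rounds to 0.0283 at four decimals and rounds up to 0.03 at two decimals.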

Delta beta (Δβ) calculations

Biological effect size for DNAm is quantified as the change in methylation value, referred to as Δβ⁶⁵. In this study, the Δβs were calculated against continuous data (age) and categorical data (sex). For continuous data, Δβs were calculated by extracting the β coefficient of the time between sample collections (ΔAge) from the linear model CpG site ~ ΔAge. This coefficient, representing the change in β value on the y-axis divided by time elapsed on the x-axis was then multiplied by the cohort interquartile range of ΔAge. For categorical data, the average β for the test category (males) was subtracted from the average β from the reference category (females) for every CpG site. The resulting Δβs represent the observed change in β value at each CpG site in an unadjusted association with the variable of interest.
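Both Δβ definitions can be written out in a few lines of Python. This is a sketch of the calculation as described in the paragraph above, not the study's R code; the toy values and the function names are hypothetical, and the quartile interpolation method is an assumption.

```python
from statistics import mean, quantiles

def ols_slope(x, y):
    """Slope of the simple least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

def delta_beta_continuous(delta_age, betas):
    """Slope of beta on ΔAge scaled by the cohort interquartile range of ΔAge."""
    q1, _, q3 = quantiles(delta_age, n=4, method="inclusive")
    return ols_slope(delta_age, betas) * (q3 - q1)

def delta_beta_categorical(betas_reference, betas_test):
    """Mean beta of the reference category minus mean beta of the test category."""
    return mean(betas_reference) - mean(betas_test)

# Hypothetical time between collections (years) and beta values at one CpG
delta_age = [3.8, 4.0, 4.2, 4.4, 4.6]
betas = [0.1 * a + 0.1 for a in delta_age]  # exact slope of 0.1 per year
db_age = delta_beta_continuous(delta_age, betas)

# Hypothetical beta values at one CpG for females (reference) and males (test)
db_sex = delta_beta_categorical([0.60, 0.70], [0.50, 0.60])
print(db_age, db_sex)
```

In the toy data the interquartile range of ΔAge is 0.4 years, so a slope of 0.1 per year corresponds to a Δβ of 0.04.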

Assessing statistical power

To determine whether our sample size had sufficient statistical power to detect meaningful differences, we employed the pwrEWAS package in R⁶⁶. For the sex stratified analysis (male n = 404, female n = 356), a statistical power of 66% was achieved. For all associations, with the total sample (n = 760), we were well-powered at 84% to detect a technically meaningful effect of Δβ > 3% for the immune-relevant testing space of 4968 CpGs from 628 genes.

Candidate DNAm association studies with robust linear regression

A robust linear regression model using Huber M-estimation was used for site-by-site epigenome association analysis, as this method is robust to heteroskedasticity and outliers, both of which are present in these data. All models were run on variable immune candidate probes for a total of 4,968 linear regression tests. Model construction was tested for multicollinearity and relations were assumed to be linear and normally distributed. Covariates known to contribute significantly to DNAm differences, i.e., sex, age, genetic ancestry, recruitment location and prenatal smoke exposure, were included⁶⁵. We did not adjust for cell type proportion as we were interested in the developing immune system, i.e., the changes that occur as a result of immune cell proportions adapting and maturing.

The linear model used for testing associations with age is as follows:

$$ {Candidate\; CpG}\,\Delta {DNAm} \sim \Delta {Age}+{Sex}+{GA}1+{GA}2\\ +{Study\; Center}+{any\; prenatal\; smoke\; exposure}+\varepsilon$$

Where the input CpG ΔDNAm was represented by a differential matrix of the beta values at age five minus the beta values at age one for all candidate CpGs, ΔAge was the amount of time between sample collections, and GA1 and GA2 are genetic ancestry PCs. Rather than using a linear mixed model to account for the longitudinal nature of our data, we incorporated the change in age as a variable to capture the effect of time elapsed between samples, recognizing that longer intervals may correspond to greater potential for change.

The linear model used for testing associations with sex is as follows:

$$ {Candidate\; CpG} \sim {Sex}+{Age\; at\; sample\; collection}+{GA}1\\ +{GA}2+{Study\; Center}+{any\; prenatal\; smoke\; exposure}+\varepsilon$$

Where the input CpG beta values were from either the age one or the age five time point.

Model fit was assessed by QQ plot and genomic inflation of the models was observed^67,68. To address the inflation and reduce the incidence of false positives, the BACON package in R was employed to construct an empirical null distribution by which adjusted p values could be extracted (Supplementary Fig. 3)⁶⁹.

Following BACON, all regressions were multiple test corrected using the Benjamini–Hochberg FDR control method⁷⁰. A high confidence statistical threshold of FDR < 0.05, which corresponded to P ≤ 0.002 in our sample, and a biological effect size threshold of |Δβ| > 0.03 were used to determine significant associations.
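How the statistical and biological thresholds combine can be shown with a minimal Python sketch of the standard Benjamini–Hochberg adjustment followed by the effect size filter. It is illustrative only: the p values and Δβ values are made up, the function names are hypothetical, and the BACON step is not reproduced here.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def significant_sites(p_values, delta_betas, fdr=0.05, min_effect=0.03):
    """Indices of sites passing both the FDR and the |Δβ| thresholds."""
    q_values = benjamini_hochberg(p_values)
    return [i for i, q in enumerate(q_values)
            if q < fdr and abs(delta_betas[i]) > min_effect]

# Hypothetical p values and effect sizes for five CpG sites
p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
delta_betas = [0.05, -0.04, 0.04, 0.10, 0.01]
q_values = benjamini_hochberg(p_values)
hits = significant_sites(p_values, delta_betas)
print(q_values, hits)
```

In this toy example the third and fourth sites have large effect sizes but are not retained, because their adjusted p values exceed 0.05.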

Given that cell type proportions are known to influence DNA methylation, we also conducted a sensitivity analysis by including cell type proportions as covariates in our models. The adjusted models were as follows:

Age Associations:

${Candidate\; CpG}\,\Delta {DNAm} \sim \Delta {Age}+{Cell\; type\; proportions}+{Sex}+{GA}1+{GA}2+{Study\; Center}+{any\; prenatal\; smoke\; exposure}+\varepsilon$

Sex Associations:

${Candidate\; CpG} \sim {Sex}+{Cell\; type\; proportions}+{Age\; at\; sample\; collection}+{GA}1+{GA}2+{Study\; Center}+{any\; prenatal\; smoke\; exposure}+\varepsilon$

Post hoc characterization of significant DNAm sites

CpG sites of interest were annotated to genes, genomic regions, and chromosomal locations using the UCSC Genome Browser with the GRCh37/hg19 human genome reference assembly⁵². To relate the resulting sets of genes to their biological functions and gene network interactions, the GeneCards (https://www.genecards.org) database and EWAS Atlas (https://ngdc.cncb.ac.cn/ewas) were used^71,72,73. To evaluate chromatin structure, ChromHMM reference panels were employed^61,62. Given the regional correlation of DNAm and its potential biological relevance, a region-based approach was implemented. Co-methylated regions (CMRs) were defined using the Co-Methylation with Genomic CpG Background (CoMeBack) algorithm, which clusters genome-wide CpG sites based on a Pearson correlation threshold of ≥ 0.4 and a maximum inter-probe distance of 1 kb⁷⁴. Composite beta values for each CMR were computed using PCA. CMRs containing at least one significant CpG site were further analyzed for association with the variable of interest using the same robust linear regression models described in the section "Candidate DNAm association studies with robust linear regression", with the composite beta value as the input. Associations were considered significant if the |Δβ| of the CMR exceeded 0.03 and the false discovery rate (FDR) was below 0.05.

Validation cohorts and data preprocessing

To validate the sex-associated differences in DNAm, two separate pediatric whole blood cohorts were used. The publicly available dataset GSE142512 from GEO was used to validate the findings from age one (henceforth referred to as GEO) and the INfancia y Medio Ambiente (INMA) - Sabadell cohort was used to validate the findings from age five^31,32. Both cohorts were preprocessed independently from raw IDAT files following the same steps as outlined for the discovery cohort. DNAm from both GSE142512 and INMA was measured using the Illumina 450k array.

Briefly, samples were assessed against 17 Illumina quality control metrics using the ewastools R package, and detection p value, bead count, and average methylated and unmethylated intensity metrics were evaluated using the minfi R package⁵¹. To confirm the biological sex of the samples, sex was inferred from the array probes targeting the X and Y chromosomes with the R package minfi (version 1.44.0) and matched with reported sex. All samples passed the QC metrics. Dye bias and background correction were performed, and the data were subsequently normalized by BMIQ⁵⁴.

After preprocessing, 197 samples remained in INMA, and the GEO data set was subset to passing healthy control samples (n = 56) (Supplementary Data 10).

Validation of candidate DNAm association analysis with linear regression for sex

We performed the same analysis as described in the section "Candidate DNAm association studies with robust linear regression" independently in the GEO dataset and the INMA cohort using robust linear regression, including age, to assess the relations between changes in DNAm and biological sex. We tested only DNAm sites identified in the CHILD cohort analysis that were present in the datasets (n = 35 out of 47) (Supplementary Data 11). In accordance with the literature, CpG sites were determined to be meaningful at a threshold of |Δβ| > 0.03 and a p value of < 0.05⁷⁵.

Statistics and reproducibility

The discovery cohort included 760 individuals with samples collected at two time points, comprising 404 males and 356 females for the stratified analysis. Validation of sex differences at age 1 was performed using 55 samples from GSE142512 (GEO; 36 males, 19 females), and validation at age 5 was performed using 197 samples from the INMA cohort (102 males, 95 females). All statistical analyses were conducted in RStudio v4.2.2 with Bioconductor 3.16. Multiple testing corrections (Benjamini–Hochberg and Bonferroni) were applied as appropriate. For comparisons of continuous variables (e.g., cell type proportions) between two groups, paired t-tests were used. Comparisons of categorical variables (e.g., birth mode) were assessed with chi-squared tests. Pearson correlation was used for correlation analyses. To establish the technical threshold for DNA methylation analysis, technical replicates distributed across the arrays were used to calculate the root mean square error (RMSE). The relationship between the variable of interest and DNAm was assessed using robust linear modeling⁷⁶.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

Both the CHILD and INMA cohort data are available by request through the associated consortium. Further information and requests can be made through https://childstudy.ca/for-researchers/data-access/ and https://www.proyectoinma.org/en/inma-project/inma-collaboration-policy/, respectively. GSE142512 is available through GEO³¹. Source data for figures can be found in Supplementary Data 12.

References

Simon, A. K., Hollander, G. A. & McMichael, A. Evolution of the immune system in humans from infancy to old age. Proc. R. Soc. B Biol. Sci. 282, 20143085 (2015).
Klein, S. L. & Flanagan, K. L. Sex differences in immune responses. Nat. Rev. Immunol. 16, 626–638 (2016).
Taneja, V. Sex hormones determine immune response. Front. Immunol. 9, 1931 (2018).
Olin, A. et al. Stereotypic immune system development in newborn children. Cell 174, 1277–1292.e14 (2018).
Bous, M. et al. Sex differences in the frequencies of B and T cell subpopulations of human cord blood. Int. J. Mol. Sci. 24, 11511 (2023).
Ghosh, S. & Klein, R. S. Sex drives dimorphic immune responses to viral infections. J. Immunol. 198, 1782–1790 (2017).
Migliore, L., Nicolì, V. & Stoccoro, A. Gender specific differences in disease susceptibility: the role of epigenetics. Biomedicines 9, 652 (2021).
Renault, C. H. et al. Minipuberty of human infancy – a window of opportunity to evaluate hypogonadism and differences of sex development? Ann. Pediatr. Endocrinol. Metab. 25, 84–91 (2020).
Wilson, J. D., George, F. W. & Griffin, J. E. The hormonal control of sexual development. Science 211, 1278–1284 (1981).
Cathey, A. L. et al. Gestational hormone concentrations are associated with timing of delivery in a fetal sex-dependent manner. Front. Endocrinol. 12, 742145 (2021).
DiPietro, J. A. & Voegtline, K. M. The gestational foundation of sex differences in development and vulnerability. Neuroscience 342, 4–20 (2017).
Warembourg, C. et al. Exposure of pregnant women to persistent organic pollutants and cord sex hormone levels. Hum. Reprod. 31, 190–198 (2016).
Merrill, S. M. et al. Sex-specific associations between prenatal exposure to Di(2-ethylhexyl) phthalate, epigenetic age acceleration, and susceptibility to early childhood upper respiratory infections. Epigenomes 8, 3 (2024).
Hertzman, C. & Boyce, T. How experience gets under the skin to create gradients in developmental health. Annu. Rev. Public Health 31, 329–347 (2010).
Gluckman, P. D., Hanson, M. A. & Buklijas, T. A conceptual framework for the developmental origins of health and disease. J. Dev. Orig. Health Dis. 1, 6–18 (2010).
Barker, D. J. P. The origins of the developmental origins theory. J. Intern. Med. 261, 412–417 (2007).
Aristizabal, M. J. et al. Biological embedding of experience: a primer on epigenetics. Proc. Natl. Acad. Sci. USA 117, 23261–23269 (2020).
Bird, A. DNA methylation patterns and epigenetic memory. Genes Dev 16, 6–21 (2002).
Ziller, M. J. et al. Charting a dynamic DNA methylation landscape of the human genome. Nature 500, 477–481 (2013).
Dor, Y. & Cedar, H. Principles of DNA methylation and their implications for biology and medicine. Lancet 392, 777–786 (2018).
Horvath, S. & Raj, K. DNA methylation-based biomarkers and the epigenetic clock theory of ageing. Nat. Rev. Genet. 19, 371–384 (2018).
Min, J. L. et al. Genomic and phenotypic insights from an atlas of genetic effects on DNA methylation. Nat. Genet. 53, 1311–1321 (2021).
Schrott, R., Song, A. & Ladd-Acosta, C. Epigenetics as a biomarker for early-life environmental exposure. Curr. Environ. Health Rep. 9, 604–624 (2022).
Bell, C. G. Epigenomic insights into common human disease pathology. Cell. Mol. Life Sci. 81, 178 (2024).
Houseman, E. A., Kim, S., Kelsey, K. T. & Wiencke, J. K. DNA methylation in whole blood: uses and challenges. Curr. Environ. Health Rep. 2, 145–154 (2015).
Alisch, R. S. et al. Age-associated DNA methylation in pediatric populations. Genome Res 22, 623–632 (2012).
Mulder, R. H. et al. Epigenome-wide change and variation in DNA methylation in childhood: trajectories from birth to late adolescence. Hum. Mol. Genet. 30, 119–134 (2021).
Acevedo, N. et al. Age-associated DNA methylation changes in immune genes, histone modifiers and chromatin remodeling factors within 5 years after birth in human blood leukocytes. Clin. Epigenetics 7, 34 (2015).
Gatev, E. et al. Autosomal sex-associated co-methylated regions predict biological sex from DNA methylation. Nucleic Acids Res 49, 9097–9116 (2021).
Li, K. et al. Age-dependent changes of total and differential white blood cell counts in children. Chin. Med. J. (Engl.) 133, 1900–1907 (2020).
Johnson, R. K. et al. Longitudinal DNA methylation differences precede type 1 diabetes. Sci. Rep. 10, 3721 (2020).
Guxens, M. et al. Cohort profile: the INMA—INfancia y medio ambiente—(Environment and Childhood) project. Int. J. Epidemiol. 41, 930–940 (2012).
Garcia‐Prat, M. et al. Extended immunophenotyping reference values in a healthy pediatric population. Cytometry B Clin. Cytom 96, 223–233 (2019).
Jalali, S. et al. A high‐dimensional cytometry atlas of peripheral blood over the human life span. Immunol. Cell Biol. 100, 805–821 (2022).
Marshall, J. S., Warrington, R., Watson, W. & Kim, H. L. An introduction to immunology and immunopathology. Allergy Asthma Clin. Immunol. 14, 49 (2018).
Vissing, N. H., Chawes, B. L., Rasmussen, M. A. & Bisgaard, H. Epidemiology and risk factors of infection in early childhood. Pediatrics 141, e20170933 (2018).
Klein, S. L. Sex influences immune responses to viruses, and efficacy of prophylaxis and treatments for viral diseases. BioEssays 34, 1050–1059 (2012).
Sekiya, T. Immune cell development and epigenetics. In The Epigenetics of Autoimmunity (ed. Zhang, R.) 25–55 (Elsevier, 2018).
Engelbrecht, H.-R. et al. Sex differences in epigenetic age in Mediterranean high longevity regions. Front. Aging 3, 1007098 (2022).
Robinson, G. A. et al. Investigating sex differences in T regulatory cells from cisgender and transgender healthy individuals and patients with autoimmune inflammatory disease: a cross-sectional study. Lancet Rheumatol. 4, e710–e724 (2022).
Chen, B. H. et al. DNA methylation-based measures of biological age: meta-analysis predicting time to death. Aging 8, 1844–1865 (2016).
Bindl, L. et al. Gender-based differences in children with sepsis and ARDS: the ESPNIC ARDS database group. Intensive Care Med. 29, 1770–1773 (2003).
Vig, M. & Kinet, J.-P. Calcium signaling in immune cells. Nat. Immunol. 10, 21–27 (2009).
Sugden, K. et al. Patterns of reliability: assessing the reproducibility and integrity of DNA methylation measurement. Patterns 1, 100014 (2020).
Jaramillo, I. et al. Mode of birth and DNA methylation at birth, in childhood, and in adolescence: uncovering the relationship using ALSPAC data. Dev. Psychol. 60, 1746–1759 (2024).
Muenchhoff, M. & Goulder, P. J. R. Sex differences in pediatric infectious diseases. J. Infect. Dis. 209, S120–S126 (2014).
Moraes, T. J. et al. The Canadian Healthy Infant Longitudinal Development birth cohort study: biological samples and biobanking. Paediatr. Perinat. Epidemiol. 29, 84–92 (2015).
Heiss, J. A. & Just, A. C. Identifying mislabeled and contaminated DNA methylation microarray data: an extended quality control toolset with examples from GEO. Clin. Epigenetics 10, 73 (2018).
Fortin, J. P., Triche, T. J. & Hansen, K. D. Preprocessing, normalization and integration of the Illumina human methylationEPIC array with minfi. Bioinformatics 33, 558–560 (2017).
Du, P., Kibbe, W. A. & Lin, S. M. lumi: a pipeline for processing Illumina microarray. Bioinformatics 24, 1547–1548 (2008).
Murat, K. et al. Ewastools: Infinium human methylation beadchip pipeline for population epigenetics integrated into galaxy. GigaScience 9, giaa049 (2020).
Price, E. M. et al. Additional annotation enhances potential for biologically-relevant analysis of the Illumina infinium human methylation450 BeadChip array. Epigenetics Chromatin 6, 4 (2013).
Dedeurwaerder, S. et al. A comprehensive overview of infinium human methylation450 data processing. Brief. Bioinform. 15, 929–941 (2014).
Liu, J. & Siegmund, K. D. An evaluation of processing methods for HumanMethylation450 BeadChip data. BMC Genomics 17, 469 (2016).
Leek, J. T., Johnson, W. E., Parker, H. S., Jaffe, A. E. & Storey, J. D. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics 28, 882–883 (2012).
Salas, L. A. et al. Enhanced cell deconvolution of peripheral blood using DNA methylation for high-resolution immune profiling. Nat. Commun. 13, 761 (2022).
Gervin, K. et al. Systematic evaluation and validation of reference and library selection methods for deconvolution of cord blood DNA methylation data. Clin. Epigenetics 11, 125 (2019).
Koestler, D. C. et al. Improving cell mixture deconvolution by identifying optimal DNA methylation libraries (IDOL). BMC Bioinform. 17, 120 (2016).
Triche, T. J., Weisenberger, D. J., Van Den Berg, D., Laird, P. W. & Siegmund, K. D. Low-level processing of Illumina Infinium DNA Methylation BeadArrays. Nucleic Acids Res. 41, e90–e90 (2013).
Das, S. et al. Next-generation genotype imputation service and methods. Nat. Genet. 48, 1284–1287 (2016).
Ernst, J. & Kellis, M. ChromHMM: automating chromatin-state discovery and characterization. Nat. Methods 9, 215–216 (2012).
Ernst, J. & Kellis, M. Chromatin-state discovery and genome annotation with ChromHMM. Nat. Protoc. 12, 2478–2492 (2017).
Morey, R. D. & Rouder, J. N. Bayes factor approaches for testing interval null hypotheses. Psychol. Methods 16, 406–419 (2011).
Cohen, J. Statistical Power Analysis for the Behavioral Sciences 2nd edn (New York University, 1988).
Jones, M. J., Moore, S. R. & Kobor, M. S. Principles and challenges of applying epigenetic epidemiology to psychology. Annu. Rev. Psychol. 69, 459–485 (2018).
Graw, S., Henn, R., Thompson, J. A. & Koestler, D. C. pwrEWAS: a user-friendly tool for comprehensive power estimation for epigenome wide association studies (EWAS). BMC Bioinform. 20, 218 (2019).
Nabais, M. F. et al. An overview of DNA methylation-derived trait score methods and applications. Genome Biol. 24, 28 (2023).
Pleil, J. D. QQ-plots for assessing distributions of biomarker measurements and generating defensible summary statistics. J. Breath Res. 10, 035001 (2016).
The BIOS Consortium, Van Iterson, M., Van Zwet, E. W. & Heijmans, B. T. Controlling bias and inflation in epigenome- and transcriptome-wide association studies using the empirical null distribution. Genome Biol. 18, 19 (2017).
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B 57, 289–300 (1995).
Stelzer, G. et al. The GeneCards suite: from gene data mining to disease genome sequence analyses. Curr. Protoc. Bioinforma. 54, 1.30.1−1.30.33 (2016).
Safran, M. et al. The GeneCards suite. In Practical Guide to Life Science Databases (eds. Abugessaisa, I. & Kasukawa, T.) 27–56 (Springer Nature Singapore, 2021).
Li, M. et al. EWAS Atlas: a curated knowledgebase of epigenome-wide association studies. Nucleic Acids Res. 47, D983–D988 (2019).
Gatev, E., Gladish, N., Mostafavi, S. & Kobor, M. S. CoMeBack: DNA methylation array data analysis for co-methylated regions. Bioinformatics 36, 2675–2683 (2020).
Campagna, M. P. et al. Epigenome-wide association studies: current knowledge, strategies and recommendations. Clin. Epigenetics 13, 214 (2021).
Venables, W. & Ripley, B. Modern Applied Statistics with S 4th edn (Springer, 2002).

Acknowledgements

We thank the CHILD Cohort Study (CHILD) participant families for their dedication and commitment to advancing health research. CHILD was initially funded by CIHR and AllerGen NCE. Visit CHILD at childstudy.ca. We also thank the data stewards and participants of the INMA cohort for allowing us to validate our work with their data. This research utilized the FlowSorted.BloodExtended.EPIC software package developed at Dartmouth College, which is governed by the licensing terms provided by Dartmouth Technology Transfer. M.B.A. and M.S.K. are fellows of the CIFAR Network, and M.B.A. holds a position as Canada Research Chair in Early Nutrition and the Developmental Origins of Health and Disease. M.M. was supported by a personal grant from the Dutch Research Council (NWO/ZonMw): Rubicon (grant no. 04520232320009). The final validation dataset is available through NCBI’s Gene Expression Omnibus, accession number GSE142512.

Author information

Authors and Affiliations

Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC, Canada
Karlie Edwards & Michael S. Kobor
Edwin S.H. Leong Centre for Healthy Aging, University of British Columbia, Vancouver, BC, Canada
Karlie Edwards, Sarah M. Merrill, Chaini Konwar, Marcia S. Jude, Beryl C. Zhuang, Mandy Meijer, Erick Navarro-Delgado, Julie L. MacIsaac & Michael S. Kobor
British Columbia Children’s Hospital Research Institute, University of British Columbia, Vancouver, BC, Canada
Karlie Edwards, Sarah M. Merrill, Chaini Konwar, Marcia S. Jude, Beryl C. Zhuang, Mandy Meijer, Erick Navarro-Delgado, Julie L. MacIsaac & Michael S. Kobor
Department of Psychology, University of Massachusetts Lowell, Lowell, MA, USA
Sarah M. Merrill
Department of Medical Genetics, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada
Mandy Meijer
Department of Pediatrics, University of Alberta, Edmonton, AB, Canada
Piush Mandhane
Children’s Hospital Research Institute of Manitoba, University of Manitoba, Winnipeg, MB, Canada
Elinor Simons & Meghan B. Azad
Department of Pediatrics and Child Health, University of Manitoba, Winnipeg, MB, Canada
Elinor Simons & Meghan B. Azad
Department of Paediatrics, Hospital for Sick Children, Toronto, ON, Canada
Theo J. Moraes & Padmaja Subbarao
Department of Physiology & Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
Padmaja Subbarao
Department of Medicine, McMaster University, Hamilton, ON, Canada
Padmaja Subbarao
ISGlobal, Institute of Global Health, Barcelona, Spain
Mariona Bustamante & Martine Vrijheid
Universitat Pompeu Fabra (UPF), Barcelona, Spain
Mariona Bustamante & Martine Vrijheid
Centro de Investigación en Red de Epidemiología y Salud Pública (CIBERESP), Madrid, Spain
Mariona Bustamante & Martine Vrijheid
BC Children’s Hospital Research Institute Vancouver, Vancouver, BC, Canada
Stuart Turvey & Michael S. Kobor
Program in Child and Brain Development, CIFAR, Toronto, ON, Canada
Michael S. Kobor

Contributions

Conceptualization was carried out by K.E., S.M.M., and M.S.K. Cohort ownership and custodianship were provided by M.B., M.V., P.M., M.B.A., E.S., T.J.M., P.S., and S.T. Biological data were generated by J.L.M. and M.S.K., while biological data curation and preprocessing were conducted by K.E., C.K., M.S.J., B.C.Z., and E.N. Statistical analyses were performed by K.E. and S.M.M., and figures were generated and visualized by K.E. The manuscript was completed by K.E., S.M.M., C.K., B.C.Z., M.S.J., and M.M., with edits and revisions contributed by all authors.

Corresponding author

Correspondence to Michael S. Kobor.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Communications Biology thanks Saara Marttila and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editors: Rosie Bunton-Stasyshyn & Michele Repetto. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information (PDF)

Description of Additional Supplementary Materials (PDF)

Supplementary Data 1-11 (XLSX)

Supplementary Data 12 (XLSX)

Reporting Summary (PDF)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Edwards, K., Merrill, S.M., Konwar, C. et al. Biological sex impacts immune cell proportions and epigenetic profiles in the developing pediatric immune system. Commun Biol 8, 1447 (2025). https://doi.org/10.1038/s42003-025-08844-9

Received: 24 September 2024
Accepted: 04 September 2025
Published: 09 October 2025
Version of record: 09 October 2025
DOI: https://doi.org/10.1038/s42003-025-08844-9
