Ribosomal DNA copy number is associated with body mass in humans and other mammals

Law, Pui Pik; Mikheeva, Liudmila A.; Rodriguez-Algarra, Francisco; Asenius, Fredrika; Gregori, Maria; Seaborne, Robert A. E.; Yildizoglu, Selin; Miller, James R. C.; Tummala, Hemanth; Mesnage, Robin; Antoniou, Michael N.; Li, Weilong; Tan, Qihua; Hillman, Sara L.; Rakyan, Vardhman K.; Williams, David J.; Holland, Michelle L.

doi:10.1038/s41467-024-49397-5


Article
Open access
Published: 12 June 2024


Nature Communications volume 15, Article number: 5006 (2024)


Abstract

Body mass results from a complex interplay between genetics and environment. Previous studies of the genetic contribution to body mass have excluded repetitive regions due to the technical limitations of platforms used for population scale studies. Here we apply genome-wide approaches, identifying an association between adult body mass and the copy number (CN) of 47S-ribosomal DNA (rDNA). rDNA codes for the 18S, 5.8S and 28S ribosomal RNA (rRNA) components of the ribosome. In mammals, there are hundreds of copies of these genes. Inter-individual variation in the rDNA CN has not previously been associated with a mammalian phenotype. Here, we show that rDNA CN variation associates with post-pubertal growth rate in rats and body mass index in adult humans. rDNA CN is not associated with rRNA transcription rates in adult tissues, suggesting the mechanistic link occurs earlier in development. This aligns with the observation that the association emerges by early adulthood.

Introduction

Lifestyle changes have driven a relentless increase in the incidence of obesity¹. Current interventions have proven insufficient to curb this trend; understanding the basis of an individual’s response to an obesogenic environment is therefore of great interest. Genome-wide association studies (GWAS) have contributed to our understanding of the genetic influence over body mass, yet so far only part of the heritability estimated by family and twin studies can be explained^2,3. Epigenetic mechanisms have also been explored in depth but largely using array technologies that only partially capture the DNA methylome⁴. Due to technical limitations, repetitive parts of the genome, such as 47S-ribosomal DNA (rDNA), have been excluded from such studies⁵.

Here we utilize whole-genome approaches, inclusive of repetitive regions to identify an association between rDNA copy number (CN) and body mass in humans and rats. We observe this association in two independent cohorts and different tissues in adult humans. The association is not driven by cell-type specific variation in rDNA CN and rDNA CN is not altered between monozygotic twins discordant for body mass index (BMI), implying that the association is not downstream of environmental exposures or metabolic changes that occur with BMI variation. DNA methylation at the rDNA in adult tissues is correlated with the rDNA CN and acts to normalize the transcription rates. Therefore, if the mechanistic basis of the association is through direct effects on the production of rRNA transcripts, it must occur earlier during development before epigenetic silencing is fully established. This is supported by the observation that the growth rate in rats from puberty to young adulthood underlies the association with adult body mass.

Results

Obese individuals have lower rDNA copy number

Previously, we identified hypermethylated rDNA in inbred adult mice that had been exposed to protein restriction (PR) via maternal diet from the period of conception to weaning^6,7. This exposure resulted in growth restriction. Intriguingly, the hypermethylation of rDNA was inversely correlated with the weight of the PR-exposed animals at the end of the exposure period. This raised the question of whether the functional genomics of rDNA may be linked to body weight regulation.

To address whether altered DNA methylation of repetitive genomic elements, inclusive of the rDNA, is associated with body mass index (BMI) in humans, we analysed whole blood from lean (BMI < 25 kg/m²) or obese (BMI > 30 kg/m²) human males (Fig. 1A, p = 9.5 × 10⁻¹²) using whole genome bisulfite sequencing (WGBS). There was no difference in the mean age of the lean and obese groups (~36 years old, Table S1) and no history of diagnosed comorbidities or medication use across the entire cohort. However, there were significant differences between the lean and obese groups relating to anthropometric measurements, blood pressure and serum markers (Table S1), consistent with the obese group having a metabolic syndrome associated with higher risk of cardiovascular disease and Type 2 Diabetes⁸.

**Fig. 1: Methylation of rDNA is positively correlated with copy number in blood and associated with obesity.**

The WGBS data was mapped to a reference genome modified to include a representative copy of the rDNA since the rDNA clusters present on chromosomes 13, 14, 15, 21 and 22 in humans are not included in the genome assembly^5,9. The sequencing depth of these data (~14x), is sufficient to produce high resolution quantitation of DNA methylation when considering groups of genomic features, or mapping to a consensus for a multi-copy genomic element, such as the rDNA (Table S2). When all reads mapping to genomic features such as exons, introns or a range of repetitive genomic elements were collectively considered, there was no difference in DNA methylation between the lean and obese groups (Fig. S1), with the exception of rDNA, for which methylation was ~7% less across the entire promoter and transcriptional unit (~14 kB) in the obese compared to lean cohort (Fig. 1B, p = 0.0099 & Fig. 1C). Hypomethylation was also found in the obese group across the key regulatory regions of rDNA, the upstream control element and the core promoter when considered specifically (Fig. S2). This suggests that rDNA is the only genomic feature, when collectively considered, that demonstrates altered DNA methylation in association with obesity.
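The pooled methylation level described above can be illustrated with a minimal sketch. It assumes that per-CpG counts of methylated and unmethylated reads have already been extracted from the alignment to the rDNA consensus; the function name `pooled_methylation` and the example counts are hypothetical and are not taken from the study's pipeline.

```python
from typing import Iterable, Tuple


def pooled_methylation(counts: Iterable[Tuple[int, int]]) -> float:
    """Return the pooled methylation fraction across a set of CpG sites.

    Each item is (methylated_reads, unmethylated_reads) for one CpG.
    Reads are summed over all sites before dividing, so the result is a
    single value for the whole feature rather than a per-site average.
    """
    methylated = 0
    total = 0
    for meth_reads, unmeth_reads in counts:
        methylated += meth_reads
        total += meth_reads + unmeth_reads
    if total == 0:
        raise ValueError("no reads cover the supplied sites")
    return methylated / total


# Hypothetical counts for three CpGs in the rDNA consensus sequence.
example_sites = [(30, 70), (45, 55), (20, 80)]
example_level = pooled_methylation(example_sites)
```

Pooling reads across a multi-copy element is what allows modest genome-wide depth to give a stable estimate, since every rDNA copy contributes reads to the same consensus positions.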

As we and others have previously shown that DNA methylation levels at rDNA in adult tissues are associated with inter-individual variation in rDNA CN^10,11,12,13, we next queried if this epigenetic-genetic interaction is also observed in this cohort. rDNA CN was estimated from the WGBS data using a method based on that for WGS¹⁴. We and others have previously cross validated this approach^10,13 and also applied strict criteria for data quality control, as this can influence the accuracy of CN assessment¹⁵ (Supplementary Data 1). This was further confirmed by independent cross validation using digital droplet PCR (ddPCR) to estimate copy number using a sequencing-independent methodological approach (Fig. S3). Consistent with previous reports, we observed that rDNA CN and DNA methylation are highly correlated in blood, such that individuals with higher rDNA CN also have more rDNA methylation (Fig. 1D, Spearman r = 0.74, p = 2.2 × 10⁻¹⁶). This relationship was present in both the lean (Spearman r = 0.5738, p = 0.0009) and obese (Spearman r = 0.7845, p = 9.8 × 10⁻⁷) groups when considered separately as well as overall. Consistent with the observation of hypomethylation of rDNA in the obese group, rDNA CN was also significantly lower in the obese compared to lean group (Fig. 1E, p = 0.0024) both when calculated from WGBS and using ddPCR (Fig. S3). This was not accounted for by differences in the ethnic composition of the lean and obese groups (Table S3), or sequencing parameters (Fig. S4). Furthermore, the trend for lower CN in obese individuals was observed in multiple ethnicities (Fig. S5). These observations demonstrate that total rDNA methylation and CN are positively correlated in humans and that rDNA CN variation in blood is associated with variation in a complex trait in humans, adult BMI.
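The general idea behind depth-based copy number estimation can be sketched as follows. This is an illustration under stated assumptions, not the published method: it assumes that reads from all rDNA copies pile up on the single reference copy, that a set of single-copy background regions is available, and that the background is present twice per diploid genome. The function name, the choice of mean and median, and the example depths are hypothetical.

```python
from statistics import mean, median
from typing import Sequence


def estimate_rdna_cn(rdna_depth: Sequence[float],
                     background_depth: Sequence[float],
                     ploidy: int = 2) -> float:
    """Estimate total rDNA copies per diploid genome from read depth.

    rdna_depth: per-position depth over the single rDNA reference copy.
    background_depth: per-position depth over single-copy regions.
    The ratio of the two depths, scaled by ploidy, approximates how many
    rDNA copies contributed reads to the reference copy.
    """
    if not rdna_depth or not background_depth:
        raise ValueError("both depth vectors must be non-empty")
    background = median(background_depth)  # median resists outlier positions
    if background <= 0:
        raise ValueError("background depth must be positive")
    return ploidy * mean(rdna_depth) / background


# Hypothetical depths: ~14x background, ~2000x over the rDNA reference copy.
example_cn = estimate_rdna_cn([2100, 1900, 2000], [13, 14, 15])
```

Because the estimate is a ratio of depths, anything that biases coverage differently between rDNA and the background (for example GC content or library preparation) will bias the result, which is why strict quality control and an independent assay such as ddPCR are valuable.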

The strength of association is influenced by interventions

To validate the observed association between rDNA CN and BMI, we re-analysed published reduced representation bisulfite sequencing data (RRBS) derived from the adipose tissue of Finnish males 45–67 years of age from the METSIM study^16,17. We devised a methodology for estimating rDNA CN from RRBS for determining relative (rather than absolute) CN across individuals. The method was cross-validated with the previously published method used for the WGBS data above, which in turn we have previously validated using an independent methodology, ddPCR (Fig. S6 and Fig. S3). In this cohort, we also observed a strong positive correlation between rDNA CN and methylation, confirming its presence in at least two different human tissues (Fig. S7).

The individuals in the METSIM cohort were not selected for clinical obesity (BMI > 30 kg/m²). Therefore, a continuous spectrum of BMI is represented with the majority of individuals included in this analysis classified as clinically “overweight” (25 < BMI < 30 kg/m²). As such, we performed a correlation analysis between the relative rDNA CN and an individual’s BMI, again revealing a negative correlation that passed a nominal significance threshold (Fig. 2A, r = −0.18, p = 0.02). Consistent with this observation, rDNA methylation also demonstrated a negative correlation with BMI, although the effect did not pass the nominal threshold of p < 0.05 (Fig. 2B, r = −0.13, p = 0.09). As this cohort is older and has a larger age range than the discovery cohort, we calculated age-adjusted BMI and found that this did not have a significant effect on the correlation between rDNA CN or methylation with BMI (Fig. S8).
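The text does not state how BMI was adjusted for age. One common approach, shown here purely as an assumption, is to regress BMI on age by ordinary least squares and carry the residuals forward as the age-adjusted trait. The function name and example values are hypothetical.

```python
from statistics import mean
from typing import List, Sequence


def age_adjusted(values: Sequence[float], ages: Sequence[float]) -> List[float]:
    """Return residuals of a least-squares fit of values on ages.

    The residual for each individual is the part of the trait that a
    straight-line age trend does not explain.
    """
    if len(values) != len(ages) or len(values) < 2:
        raise ValueError("need two equal-length vectors with at least 2 points")
    mean_age, mean_value = mean(ages), mean(values)
    sxx = sum((a - mean_age) ** 2 for a in ages)
    if sxx == 0:
        raise ValueError("ages must not all be identical")
    slope = sum((a - mean_age) * (v - mean_value)
                for a, v in zip(ages, values)) / sxx
    intercept = mean_value - slope * mean_age
    return [v - (intercept + slope * a) for a, v in zip(ages, values)]


# Hypothetical ages (years) and BMI values (kg/m^2) for four individuals.
example_ages = [45, 50, 55, 60]
example_bmi = [26, 27, 29, 30]
example_residuals = age_adjusted(example_bmi, example_ages)
```

The residuals sum to zero by construction and can be correlated with rDNA CN in place of raw BMI.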

**Fig. 2: rDNA copy number negatively correlates with BMI but the association is diminished by lifestyle interventions.**

The METSIM validation cohort has a significant proportion (69/169) of individuals regularly taking one or more medications, most commonly statins. Therefore, we examined the association of medication status with anthropometric and metabolic measurements. Although there was no difference in age or BMI in the medicated compared to the non-medicated groups, the medicated group had elevated measurements for waist circumference, waist to hip ratio, diastolic blood pressure, fasting insulin levels and the Homeostatic Model Assessment for Insulin Resistance (HOMA-IR) (Table S4). Collectively, these findings indicate a higher prevalence of metabolic syndrome in these individuals⁸. However, unlike the younger discovery cohort, this was not reflected in higher low-density lipoprotein levels, consistent with mitigation by statin use¹⁸.

To investigate whether medication or other interventions that modify anthropometric and metabolic phenotypes might influence the strength of genotype-phenotype associations, we next considered the medicated and non-medicated groups separately. Intriguingly, the correlation between BMI and both rDNA CN and methylation strengthened when the non-medicated group was considered separately (Fig. 2C, r = −0.30, p = 0.0025 & Fig. 2D, r = −0.25, p = 0.01), but disappeared when the medicated group was analysed alone (Fig. 2E, r = 0.05, p = 0.71 & Fig. 2F, r = 0.07, p = 0.59). This could not be explained by differences in sequencing quality (Fig. S9) after sample exclusion based on strict QC of the sequencing data (Supplementary Data 2). Similar findings were observed after adjusting BMI for age (Fig. S10). Taken together, these results support a quantitative relationship between rDNA CN and BMI in adults. This has been observed in two independent, ethnically different populations and holds across two different tissues. However, we find that lifestyle interventions that influence phenotype can reduce the genotype-phenotype correlation.

Comparison of the relationship between rDNA CN and the anthropometric and metabolic traits captured in both cohorts suggests that body mass is the primary measured trait associated with rDNA CN variation, as BMI and waist circumference were the only variables significantly associated across both the extreme discovery and METSIM validation cohorts (Table S5). Other variables did show cohort-specific negative correlations with rDNA CN, such as C-reactive protein, fasting insulin levels and the HOMA-IR exclusively in the discovery cohort and diastolic and systolic blood pressure in the unmedicated subgroup of the replication cohort. As these are all established sequelae of being overweight, it is likely this is explained by differences in the age and phenotypic extremes between the cohorts. rDNA methylation was less strongly correlated with BMI and other traits than rDNA CN, supporting the conclusion that it is the genetic rDNA CN association with BMI that is driving the altered DNA methylation profiles we initially observed.

rDNA CN does not systematically vary by cell-type

Ageing¹⁹ and body mass²⁰ have been associated with changes in blood cell proportions and adipose infiltration²¹. However, total rDNA CN has previously been shown to be consistent across multiple tissues in mice²² and more recently humans¹³. We further tested this in purified cell populations rather than whole tissues, asking whether cell-type specific variation in rDNA CN is present across multiple purified blood and other cell types from the same donors²³ (Fig. S11). This confirmed that although there is extensive inter-individual rDNA CN variation, there is no systematic rDNA CN difference across cell types. Therefore, the rDNA CN association with BMI is not a downstream artifact of altered blood cell composition across samples.

rDNA CN is not discordant in twins with divergent BMI

The observation that rDNA CN negatively correlates with adult BMI raises questions about the origin of the association. Two scenarios are plausible, i) germline inherited rDNA CN, through an unknown mechanism, can influence BMI variation, or ii) environmental exposures and/or metabolic changes occurring concomitantly with increasing BMI may lead to rDNA (epi)genetic instability and rDNA CN loss. To our knowledge, there is no precedent for a human trait associated with germline inherited rDNA CN, but rDNA CN changes have been observed in human cancers^24,25.

To address the issue of causation, we obtained RRBS data derived from whole blood of monozygotic twins discordant for BMI and of a single ethnicity. The original study excluded individuals with diagnosed comorbidities²⁶. The cohort characteristics for the 10 male and 14 female monozygotic twin pairs included in this analysis after data quality control (Supplementary Data 3, Fig. S12) are shown in Table S6. The twins were confirmed to have highly discordant BMI (Fig. 3A, p = 1.9 × 10⁻⁵), with both the leaner and heavier twins spanning all clinical BMI group classifications. However, leaner co-twins were enriched in the clinically “lean” range (BMI < 25 kg/m²), and the heavier co-twins enriched in the clinically “overweight” range (25 kg/m² < BMI < 30 kg/m²). Despite discordance in BMI, there was no difference observed for the rDNA CN within twin pairs (Fig. 3B, p = 0.84). Consistent with this, there was no association between within-twin rDNA CN variation and twin age, or BMI discordance (Fig. S13). As with the previous cohorts, there was a positive correlation between rDNA CN and methylation (Fig. S14), and in line with this, there were no rDNA methylation differences between discordant co-twins (Fig. 3C, p = 0.29). These results, together with the cell type specific analysis above, support the idea that rDNA CN is not subject to extensive random drift over time or in response to environmental or metabolic changes associated with altered BMI. It is therefore more likely that it is germline inherited rDNA CN that is associated with adult BMI.

**Fig. 3: BMI does not induce rDNA copy number variation and is not influenced by BMI-associated genetic variation in the rest of the genome.**

rDNA CN is not associated with other known genetic variation

As BMI has a strong genetic as well as environmental component, we next sought to address whether there is a direct association between rDNA CN and previously identified single nucleotide variants (SNVs) associated with BMI from GWAS. To this end, we retrieved summary statistics from a meta-analysis of BMI including ~700,000 individuals²⁷ and utilised these as base data for calculating a BMI polygenic risk score (PRS) for individuals in the 1000 Genomes Project²⁸. Only individuals who have also independently been determined to have WGS data of sufficient quality for rDNA CN estimation were included¹⁵. PRS scores calculated using only the highly significant, near-independent SNVs previously identified²⁷, or a more relaxed significance threshold produced highly correlated PRS scores, as expected (Fig. S15). We then asked if the PRS calculated from BMI-associated SNVs explains variance in rDNA CN. We did not find any models that could explain variance in rDNA CN (top model (p < 0.005 threshold) goodness of fit R² = 0.00063, p = 0.19). This is reflected when the PRS scores are directly correlated with the rDNA CN estimates (Fig. 3D, r = 0.01284, p = 0.5304). In an alternative approach, we also performed a GWAS for rDNA CN in the same cohort. Only one SNV was identified that passed the nominal p < 5 × 10⁻⁸ threshold for genome-wide correction (Chr5: 159614102, p = 3.7 × 10⁻⁸). This position is only variable in some East Asian populations²⁹, suggesting that it is unlikely to be confounding results in the predominantly European discovery and validation cohorts. Furthermore, this SNV has not previously been associated with body mass or related traits. Taken together, these findings suggest that rDNA CN is not strongly influenced by common inter-individual genetic differences in other parts of the genome that contribute to BMI variation.
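A polygenic score of the kind described above is, at its core, a weighted sum of effect-allele dosages, with the p-value threshold deciding which variants contribute. The sketch below illustrates only that core idea; it omits steps such as clumping for linkage disequilibrium and strand matching that dedicated PRS software performs. The function names, variant identifiers and numbers are hypothetical.

```python
from typing import Dict, Mapping, Tuple

# variant id -> (per-allele effect size, GWAS p-value)
SummaryStats = Mapping[str, Tuple[float, float]]


def select_weights(summary: SummaryStats, p_threshold: float) -> Dict[str, float]:
    """Keep effect sizes for variants whose p-value is below the threshold."""
    return {variant: beta
            for variant, (beta, p_value) in summary.items()
            if p_value < p_threshold}


def polygenic_score(dosages: Mapping[str, float],
                    weights: Mapping[str, float]) -> float:
    """Sum effect-allele dosage (0, 1 or 2) times effect size per variant.

    Variants missing from either input are skipped.
    """
    return sum(weights[variant] * dosage
               for variant, dosage in dosages.items()
               if variant in weights)


# Hypothetical summary statistics and one individual's genotype dosages.
example_summary = {"var_a": (0.05, 1e-9), "var_b": (-0.02, 0.003), "var_c": (0.01, 0.2)}
example_dosages = {"var_a": 2, "var_b": 1, "var_c": 0}

strict_score = polygenic_score(example_dosages, select_weights(example_summary, 5e-8))
relaxed_score = polygenic_score(example_dosages, select_weights(example_summary, 0.005))
```

Scores computed at different thresholds for the same individuals can then be correlated with each other, or regressed against rDNA CN, to ask how much variance they explain.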

rDNA CN is associated with post-pubertal growth rate in rats

We next asked whether an association between rDNA CN and body mass also exists in non-human mammals. To address this, we leveraged published RRBS data generated from the liver of female, outbred (Sprague-Dawley) rats^30,31. The authors made available weight data that was longitudinally collected throughout the experimental period, weeks 8-19 of age. This period encompasses the onset of sexual maturity in this strain (9-10 weeks of age for females) and is still within a period of growth, which has been reported to extend to 24 weeks of age³².

After excluding some individuals based on strict quality control of the RRBS data (Fig. S16 and Supplementary Data 4), we further verified that there was no effect of the study treatments on either weight, rDNA CN or methylation (Fig. S17). Plotting the longitudinal weight data confirmed that this was still a period of active growth (Fig. 4A). The availability of longitudinally collected data allowed us to query whether growth rate correlates with rDNA CN. Interestingly, cross-sectional analysis of each time point demonstrated that there was no correlation at early timepoints (Fig. 4B, r = −0.072, p = 0.6432), with a negative correlation only emerging towards the end of the experimental period, reaching significance at weeks 18 and 19 (Fig. 4C, r = −0.35, p = 0.0206 & Table S7). Indeed, this could be explained by a negative correlation between the weight gained over the study period and rDNA CN (Fig. 4D, r = −0.41, p = 0.0055), but not methylation (Fig. S18), despite a positive correlation being observed between rDNA CN and methylation in the tissues at harvest, as in all other analysed datasets (Fig. S19). Collectively, these findings suggest that the correlation between rDNA CN and body mass is not unique to humans and furthermore, becomes manifest between the ages of puberty and early adulthood.
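The weight-gain analysis can be sketched as a rank correlation between each animal's gain over the study period and its rDNA CN. Spearman's coefficient is used here because it is the statistic named for Fig. 1D; the statistic used for the rat analysis is not specified in this text, so that choice is an assumption. All function names and values are hypothetical.

```python
from math import sqrt
from statistics import mean
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length vectors."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical weights (g) at weeks 8 and 19, and rDNA CN, for four rats.
week8 = [200, 210, 205, 215]
week19 = [310, 305, 285, 285]
rdna_cn = [150, 180, 210, 260]
weight_gain = [late - early for early, late in zip(week8, week19)]
example_rho = spearman(weight_gain, rdna_cn)
```

Correlating the gain rather than a single time point separates growth over the period from differences in starting weight.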

**Fig. 4: Weight gain from puberty to early adulthood negatively correlates with rDNA copy number in Sprague-Dawley rats.**

rDNA CN is not correlated with rRNA transcription in adult tissues

Having demonstrated an association between rDNA CN and body mass regulation across two mammalian species, we next asked whether rDNA CN variation is associated with nascent rRNA transcript levels in adult somatic tissues. Previously, we have shown that expression of specific, well-defined rRNA haplotypes in mice reflects their relative contribution to total rDNA CN after adjusting for the silencing of the methylated copies¹⁰. Similar haplotypes in humans are yet to be defined. Comparing the total rDNA copy number before and after adjusting for methylation in three independent data sets derived from inbred mice, outbred mice and human lymphoblastoid cell lines (LCLs) did not reveal an association with nascent rDNA transcription rates (Fig. S20). This supports the conclusion that the association between rDNA CN and body mass is not due to altered transcription rate in adult tissues and if the mechanism is associated directly with rDNA transcription, then this occurs earlier in development, prior to compensation of rDNA CN variation through epigenetic silencing upon the initiation of differentiation³³. This finding agrees with the timing at which the emergence of the genotype-phenotype association was observed in the rats.
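The adjustment of copy number for methylation is not spelled out in this text. One simple reading, offered here only as an assumption, is to treat the methylated fraction as silenced and count the remaining copies as potentially active. The function name and example values are hypothetical.

```python
def unmethylated_copy_number(total_cn: float, methylation_fraction: float) -> float:
    """Return the number of copies left after discounting methylated ones.

    Assumes the methylation fraction approximates the fraction of rDNA
    copies that are silenced, which is a simplification.
    """
    if not 0.0 <= methylation_fraction <= 1.0:
        raise ValueError("methylation_fraction must lie between 0 and 1")
    return total_cn * (1.0 - methylation_fraction)


# Two hypothetical individuals with different totals but similar active copies.
high_cn_active = unmethylated_copy_number(300, 0.40)
low_cn_active = unmethylated_copy_number(200, 0.10)
```

In this toy example both individuals end up with about 180 unmethylated copies, which illustrates how a positive correlation between CN and methylation could equalise the pool of active copies across individuals.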

Discussion

rDNA CN is highly variable in humans^9,10,15. This variation is thought to derive from a very high frequency of meiotic rearrangement at the rDNA repeats³⁴. However, germline genetic variation at rDNA has not, to the best of our knowledge, been studied in relation to human trait variation as it requires sequencing data rather than genotyping arrays. Here we provide evidence that germ-line variation in rDNA CN is associated with body mass in mammals using multiple cohorts and tissues, and both extreme and continuous phenotypes.

Somatic rDNA CN instability has been reported previously in human diseases, such as cancer and neurodegenerative disorders^22,24,25,35. However, in these cases there is clear evidence that rDNA CN variation is a consequence of the disease process and shows no directional association or predictive value. There are some rare examples of cancers with specific genetic lesions that do lead to tumour-specific rDNA CN loss^22,36. These are restricted to epigenetic regulators important for the maintenance of heterochromatin at repetitive genomic regions, including the rDNA³⁶ or tumour suppressors involved in chromatin stability³⁷. However, the evidence from the monozygotic twins and multiple cell types from the same donors analysed here suggests that rDNA CN does not undergo a high degree of random genetic drift, suggesting somatic stability.

A caveat of our findings is that at this stage we are limited to identifying an association between rDNA CN and body mass. Changes in inflammatory cell types are known to occur in obesity³⁸. However, our analyses demonstrate that rDNA CN does not systematically vary with cell type and furthermore, the association of rDNA CN with growth rate in rats in the absence of an obesogenic environment suggests that the origin of the association is not with obesity per se, but rather through a mechanism that more fundamentally regulates organismal growth. Interestingly, a human longitudinal study has recently linked the rate of BMI change during the post-pubertal to young adult ages to higher risk for obesity in mid-life, independently of the actual BMI during this period³⁹. This implies that factors influencing growth rate during this period are associated with later disease risk.

Our data suggest that epigenetic silencing, associated with DNA methylation, compensates for higher rDNA CN, essentially serving to “normalise” the rate of rRNA transcription in adult tissues. This, together with the phenotype-genotype association emerging by early adulthood, suggests that the physiological basis for the association between rDNA CN and BMI may occur earlier in development. The dynamic regulation of rDNA transcription is essential for embryonic development in mammals³³, with pluripotency requiring a lack of rDNA silencing to produce a translational state essential for maintaining stemness and inhibiting genes required for differentiation^37,40. Silencing of the rDNA units begins with cellular differentiation but stabilises slowly. Could then rDNA CN influence cell fate decisions at these earlier timepoints? Interestingly, diseases caused by mutations in ribosomal proteins that reduce total ribosome levels without changing ribosome composition have been shown to influence cell fate decisions through selectively influencing translation of lineage specifiers⁴¹. This raises the intriguing possibility that variation in rDNA CN may produce subtle effects on cell commitment in development which ultimately impact growth trajectories later on. An alternative hypothesis is based on observations from Drosophila, where variations in 35S-rDNA CN have been shown to influence the expression of other genes by altering genome-wide chromatin structure^42,43. Exploring these hypotheses is beyond the scope of the work presented here and will be challenging, as rDNA CN in mammals is not amenable to manipulation by reverse genetic approaches, even in the era of CRISPR. This is due to the large tandem arrays of rDNA with very high sequence homology, combined with a lack of knowledge of their organisation beyond the recently released telomere-to-telomere genome assembly, which is derived from a single homozygous cell line from a hydatidiform mole⁵. Nonetheless, the first demonstration that rDNA CN may be associated with a biomedically relevant human phenotype provides the impetus for such investigations in the future.

Methods

Sample information

Mixed ethnicity obese discovery cohort

Participants were recruited between May 2016 and March 2019 as part of a prospective cohort study, the Dad’s Health Study at University College Hospital London (UCLH), set up to investigate the association between paternal metabolic health (including lean and obese men) and offspring birth weight. Whole blood samples were collected from participants. Participants were phenotyped with regards to BMI, waist circumference, systolic and diastolic blood pressure, blood lipids, fasting insulin and glucose levels and C-reactive protein (CRP). Two groups of participants were included: lean (BMI < 25 kg/m²) and obese (BMI > 30 kg/m²). Summary phenotypic data for each group is detailed in Table S1. BMI was determined with light clothing, by a trained researcher in the same visit at which blood samples were obtained. Peripheral blood samples were centrifuged at 3000 × g for 15 minutes within one hour of venepuncture and the buffy coat stored at −80 °C. Only male participants were recruited, based on self-reported sex and verified by reads mapping to the Y chromosome from the WGBS data.

Ethics approval and consent to participate

Ethical approval for the study was granted from the South East Coast—Surrey Research Ethics Committee on 28 September 2015 (REC reference number 15/LO/1437, IRAS project ID 164459). The study was also registered with the University College London Hospital Joint Research Office (Project ID 15/0548). All participants provided written, informed consent.

All collaborating authors have been acknowledged in accordance with inclusion and ethics relevant to global research.

Data generation

DNA extraction

DNA was extracted from 200 μl of buffy coat using the Qiagen QIAamp DNA Blood Mini Kit (Qiagen, Cat No. 51106) according to the manufacturer’s instructions and including RNA digestion. Purity of the extracted DNA was confirmed on a Nanodrop (ThermoFisher, Cat No. ND-ONEC-W) and the concentration determined using the QuBit dsDNA HS Assay Kit (ThermoFisher, Cat No. Q32854).

Whole genome bisulfite sequencing library construction and sequencing

Genomic DNA was diluted to 10 ng/μl, and 100 μl sonicated using a Bioruptor® Pico (Diagenode, Cat No. B01060010) to achieve a 500-600 bp size range, which was confirmed using a TapeStation High Sensitivity D1000 System (Agilent, Cat No. 5067-5584 & 5067-5585). Once the desired size range was achieved, 200 ng of sonicated DNA was subjected to bisulfite conversion using the EZ DNA Methylation-Gold™ Kit (Zymo, Cat No. D5006). Libraries were then made using the Accel-NGS Methyl-Seq DNA Library Kit with unique dual indices following the size-selection guidance provided with the kit and 10 cycles of amplification to minimise clonality. The removal of all adaptors and quantification were confirmed with TapeStation and QuBit before libraries were pooled into equimolar 12-plex pools and subjected to 150 bp paired-end sequencing on a NovaSeq6000 (GeneWiz).

Human digital droplet PCR for rDNA copy number

1 μg of human genomic DNA was digested with NsiI in rCutSmart buffer (NEB, Cat No. R3127) for 1 hour at 37 °C, followed by heat inactivation. Digests were cleaned up using the DNA Clean and Concentrator-5 kit (Zymo, Cat No. D4014) using a 5:1 binding buffer to sample volume:volume ratio. Eluted DNA concentrations were determined using the QuBit dsDNA High sensitivity kit (ThermoFisher, Cat No. Q32854), diluted to ~0.5 ng/μl in nuclease-free water and then verified using the QuBit assay once more. This concentration was determined empirically using a standard curve to optimise the conditions for ddPCR. ddPCR reactions consisted of 1x Absolute Q™ DNA Digital PCR Master Mix (ThermoFisher, Cat No. A52490), ~0.5 ng of NsiI digested genomic DNA, 1x TaqMan RNase P-VIC (ThermoFisher, Cat No. A30064) and 1x TaqMan 18S-FAM (ThermoFisher, custom assay design) in a final volume of 10 μl. The custom primer/probe combination was designed using Genbank accession KY962518.1; the forward primer sequence was (5′-CCGCGGTTCTATTTTGTTGG-3′), the reverse primer sequence was (5′-CTGATCGTCTTCGAACCTCC-3′) and the probe sequence was (5′-CGAATGCCCCCGGCCGTCCC-3′). RNase P-VIC was used as a validated single copy gene reference. Thermal cycling conditions were 10 min at 96 °C, followed by 40 cycles of 5 s at 96 °C then 15 s at 60 °C on the QuantStudio Absolute Q Digital PCR System. Relative copy number was calculated using the QuantStudio Absolute Q Digital PCR Software (v6.3.0), with the CNV set to the FAM channel and CNV-REF set to the VIC channel with a value of 2. Samples were excluded if the Lambda (Cp/Rxn) values were ≥1.6. CN estimates and associated Lambda values are reported in Supplementary Data 5.
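The instrument software performs the copy number calculation, but the underlying arithmetic of digital PCR can be sketched as follows. The sketch uses the standard Poisson correction, in which the mean number of target copies per partition (lambda) is estimated from the fraction of positive partitions, and scales the target-to-reference ratio by the reference value of 2. It is not the vendor's implementation; the partition counts are hypothetical, and applying the 1.6 exclusion threshold to both channels is an assumption, since the text does not say which channel it refers to.

```python
from math import log
from typing import Optional


def copies_per_partition(positive: int, total: int) -> float:
    """Poisson-corrected mean copies per partition (lambda).

    A partition is negative only if it received zero copies, so the
    negative fraction equals exp(-lambda).
    """
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total")
    return -log(1 - positive / total)


def relative_copy_number(target_positive: int,
                         reference_positive: int,
                         total: int,
                         reference_copies: int = 2,
                         max_lambda: float = 1.6) -> Optional[float]:
    """Return target copies per genome, or None if a lambda is too high."""
    lam_target = copies_per_partition(target_positive, total)
    lam_reference = copies_per_partition(reference_positive, total)
    if lam_reference == 0:
        raise ValueError("reference channel has no positive partitions")
    if lam_target >= max_lambda or lam_reference >= max_lambda:
        return None  # exclusion rule; which channel it applies to is assumed
    return reference_copies * lam_target / lam_reference


# Hypothetical run: 20000 partitions, 10000 FAM-positive, 100 VIC-positive.
example_cn = relative_copy_number(10000, 100, 20000)
saturated = relative_copy_number(19000, 100, 20000)
```

The Poisson correction matters most for the multi-copy target, where many partitions contain more than one copy; as lambda rises, few negative partitions remain and the estimate becomes imprecise, which motivates an upper limit on lambda.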

Mouse digital droplet PCR for rDNA copy number

This assay was performed as described previously¹⁰.

Mouse nascent 47S-rRNA qRTPCR

This assay was performed on kidney samples from mice of different strains as described previously⁷.

Human nascent 47S-rRNA qRTPCR

RNA was isolated using TRIzol as per manufacturer’s instructions, quality assured on a RNA 6000 Nano Chip (Agilent) and 500 ng of total RNA was reverse transcribed using random hexamers (NEB ProtoScript®II). Real-time qPCR was performed using QuantiTect SYBR Green qPCR mix (Qiagen). Primers to amplify the precursor of the human rRNA and the housekeeping control ACTB were taken from previously published work and GAPDH primers designed as F- 5′-CCATCACCATGTTCCAGGAG-3′, and R- 5′-CCTGCTTCACCACCTTCTTG-3′^44,45,46.

External data sources

Justification for cohort selection

rDNA is only captured using long or short-read sequencing approaches. Therefore, for methylation analysis, we were restricted to selecting cohorts analysed using bisulfite-sequencing based approaches. Furthermore, the strong positive correlation between rDNA CN and methylation provided a useful additional quality control. This additional quality control is absent in WGS datasets. Therefore, we limited our analysis of WGS data to sources for which the rDNA CN had been previously established and rigorously quality assessed¹⁵.

Validation cohort (METSIM adipose tissue)

Raw RRBS data for this cohort was downloaded from GEO (GSE87893). Limited phenotype data for the samples included were made available through collaboration¹⁶. This pre-existing data resource consisted solely of participants of male sex, as previously described¹⁶. Summary phenotypic data from this cohort for the samples included in these analyses can be found in Table S4.

Monozygotic twin cohort

Raw reduced representation bisulfite sequencing data from the monozygotic twin cohort was made available through collaboration and, similarly, can be made available by request to Q. Tan²⁶. Data is derived from whole blood from twins with no diagnosed illness and taking no medication. Summary phenotypic data from this cohort for the twin pairs included in this analysis are provided in Table S6. Both male and female twin pairs were included and sex was verified by mapping reads to the sex chromosomes²⁶. Sex was not included in the analyses because all comparisons were paired within twin pairs.

1000 Genomes Project

rDNA CN estimates and single nucleotide variant (SNV) calls were obtained from published sources^15,46. Both sexes are included in this analysis and sex is included as a cofactor in analyses.

Rat data with longitudinal weights

Raw RRBS data for this cohort was downloaded from GEO (GSE157551). Longitudinal weight measurements were made available through collaboration^30,31. The original study included only female rats.

Methylation atlas

WGBS data generated from sorted and pure human cell populations was retrieved from the European Genome-phenome Archive (Dataset ID EGAD00001009789). Only individuals with multiple cell types were included in the analyses, with both sexes included and determined as specified in the original publication. Sex was not included in the analyses. This left 67 samples from 18 donors and included 26 cell types represented by 2 or more donors.

Mouse data for rDNA copy number and/or methylation quantitation

These data, where not described above, were generated as part of previously published work^7,10. Data from previous work were exclusively from male mice.

Data analysis

Reference sequences

Genomic reference sequences used throughout this study were generated as follows to alleviate potential coverage loss and spurious alignments. The repetitive element appearing in the rDNA IGS closest to the 3′ end of the unit was identified from the publicly available annotations for the human rDNA unit reference (Genbank accession KY962518.1). The midpoint of this repetitive element, which is located 2120 base pairs upstream of the TSS, was employed as the breakpoint for creating a “looped” rDNA unit. In particular, the bases downstream of the breakpoint up to the end of the rDNA unit reference were prepended to the bases upstream of the breakpoint. This improves read coverage around the TSS by avoiding reads being discarded due to split alignments.
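The rotation described above can be sketched in a few lines. This is only an illustration on a toy string: the function names are hypothetical, coordinates are 0-based, and the breakpoint value is a placeholder rather than the position used in the study.

```python
def loop_reference(unit: str, breakpoint: int) -> str:
    """Rotate a tandem-repeat unit so that it starts at `breakpoint`.

    Bases from the breakpoint to the end of the unit are placed in front of
    the bases that originally preceded the breakpoint.
    """
    if not 0 <= breakpoint <= len(unit):
        raise ValueError("breakpoint outside the unit")
    return unit[breakpoint:] + unit[:breakpoint]

def looped_position(original_pos: int, breakpoint: int, unit_length: int) -> int:
    """Map a 0-based coordinate on the original unit to the rotated unit."""
    return (original_pos - breakpoint) % unit_length

toy_unit = "ABCDEFGH"                    # stand-in for the rDNA unit sequence
looped = loop_reference(toy_unit, 5)
print(looped)                            # FGHABCDE
print(looped[looped_position(0, 5, 8)])  # A (the old first base, now internal)
```

Because the unit is a tandem repeat, the rotated sequence represents the same array; the old start of the unit (which coincides with the TSS in a unit that begins at the transcription start) simply sits inside the contig instead of at its edge.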

To minimise the risk of sequencing reads from locations outside the rDNA being spuriously mapped to the rDNA, identified rDNA pseudocopies in the Hg38 assembly were masked, and the “looped” rDNA reference mentioned above was appended. Masked regions and their genomic coordinates, including an entire rDNA unit located in an unplaced contig (Genbank accession GL000220.1) are shown in Table S8.

A human exome reference was obtained and adapted as described previously¹⁴, with exon sequences and their annotations being downloaded from the EMBL/EBI repository. In particular, exons from the sex chromosomes and exons smaller than 300 bases were removed, as were sequences with significant similarity as reported by blastn version 2.7.1+ in --ungapped mode. This left a total of 12,898 exon sequences in the adapted reference.

In the case of rat data, the most complete rDNA consensus sequence available presently spans solely from the 5′ end of the 18S to the 3′ end of the 28S (Genbank accession V01270.1). A blastn comparison between V01270.1 and the rat whole genome assembly Rn7 revealed several apparent partial rDNA pseudocopies scattered across the assembly, plus three end-to-end matches. These three complete pseudocopies, whose coordinates are indicated in Table S8, were thus masked. To minimise the potential detrimental effects of the partial pseudocopies, only the sequence corresponding to the 18S (positions 1 to 1874) was appended to the masked Rn7 assembly as an additional contig.

Sequencing data processing

Initial processing of data was performed using fastqc version 0.11.9 to identify failed libraries for exclusion. Data was then trimmed for base quality and adaptor removal using trimgalore version 0.6.5, with the --paired and --rrbs options enabled when appropriate. For WGBS data, parameters --clip_r1 10 and --clip_r2 20 were also enabled to improve further alignment. Remaining parameters were set to the default values.

Deduplication was not performed on WGBS data. Because the rDNA is highly repetitive within the genome, deduplication would be likely to exclude a significant number of genuine rDNA reads from further analysis in addition to PCR artifacts. Deduplication was likewise not performed on RRBS data, as recommended by the Bismark processing protocol.

Alignments to the reference sequence were performed using bismark version 0.23.0 with underlying bowtie2 for the WGBS and RRBS data sets. Bisulfite conversion of the reference sequence for WGBS/RRBS alignments was performed using bismark_genome_preparation. Alignment output files were then sorted, indexed and filtered to retain only reads uniquely mapping to the rDNA reference using samtools version 1.10.

Methylation data from the WGBS/RRBS data was extracted from the bismark alignments using bismark_methylation_extractor.

Data QC and sample exclusion

Samples were excluded on the basis of bismark report data (Tables S3, S6, S8, S10), namely if they had an extremely high or low number of uniquely mapped reads, poor mapping efficiency, or poor bisulfite conversion (as evidenced by high non-CpG methylation values). In the case of the RRBS datasets, samples that were extreme outliers with regard to total CpG methylation were also excluded, as this indicates a potential problem with enzyme digestion or size selection altering the genomic regions captured compared to other samples in the same set.

rDNA CN estimation

rDNA CN was estimated from WGBS using a previously described method^10,14. This involved aligning reads to the exome reference described above to obtain the average read depth for each sample using samtools depth. The average read depth mapped to the 18S subunit from the whole genome + rDNA alignments was then normalised to the exome value and CN calculated as 2 × (18S / exome average read depth) for each sample. We previously validated this method of estimating the relative total CN of rDNA across samples by comparison to digital droplet PCR¹⁰ as well as here (Fig. S4).
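The depth-ratio calculation can be illustrated with a short sketch. It assumes the three-column, tab-separated text that samtools depth prints for a single alignment file (reference name, position, depth); the function names and the toy values are hypothetical. Note that samtools depth omits zero-coverage positions unless asked to report them, so how the average is defined over such positions is a choice this sketch does not settle.

```python
from statistics import mean

def mean_depth(depth_lines):
    """Average the third column (depth) of `samtools depth`-style lines."""
    return mean(int(line.rstrip("\n").split("\t")[2]) for line in depth_lines)

def rdna_copy_number(depth_18s: float, depth_exome: float, ploidy: int = 2) -> float:
    """CN = ploidy x (18S average depth / exome average depth)."""
    if depth_exome <= 0:
        raise ValueError("exome depth must be positive")
    return ploidy * depth_18s / depth_exome

# Toy input: two 18S positions and two exome positions.
depth_18s = mean_depth(["rDNA\t3700\t3000", "rDNA\t3701\t3200"])
depth_exome = mean_depth(["exon_1\t10\t30", "exon_1\t11\t32"])
print(rdna_copy_number(depth_18s, depth_exome))  # 200.0
```

The factor of 2 reflects the assumption that each retained exon is present at two copies per diploid genome, which is why sex-chromosome exons are excluded from the exome reference.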

To estimate the relative rDNA CN across samples from the RRBS data, we developed an alternative approach due to the patchy coverage of exomes in these libraries. In this approach the number of reads aligned to the rDNA reference as reported by samtools idxstats was divided by the total number of alignments listed in the corresponding bismark report. This method was cross-validated with the methodology using whole-genome data and showed a high correlation (Fig. S4).
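A minimal sketch of this read-fraction estimate follows. It assumes the four-column, tab-separated output of samtools idxstats (reference name, length, mapped reads, unmapped reads) and takes the total alignment count as a number already read from the bismark report; the contig name `rDNA_looped` and the function name are placeholders, not names from the study.

```python
def rdna_read_fraction(idxstats_lines, rdna_contig: str, total_alignments: int) -> float:
    """Fraction of all alignments that map to the rDNA contig."""
    if total_alignments <= 0:
        raise ValueError("total_alignments must be positive")
    for line in idxstats_lines:
        name, _length, mapped, _unmapped = line.rstrip("\n").split("\t")
        if name == rdna_contig:
            return int(mapped) / total_alignments
    raise KeyError(f"{rdna_contig} not found in idxstats output")

# Toy idxstats output; the last line is the unplaced-reads row.
toy_idxstats = [
    "chr1\t248956422\t900\t0",
    "rDNA_looped\t44838\t100\t0",
    "*\t0\t0\t5",
]
print(rdna_read_fraction(toy_idxstats, "rDNA_looped", total_alignments=1000))  # 0.1
```

The result is a relative measure that is comparable between samples in the same dataset, not an absolute copy number.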

QC of rDNA CN estimation

The effect of sequencing depth on rDNA CN estimation was analysed using library down-sampling as described elsewhere¹³. Four datasets of comparably high coverage were selected from the WGBS (D284 and D454) and RRBS (SRR4418992 and SRR4418945) data. After trimming, the reads were split into 10 subsamples of equal size using seqkit version 2.6.1. Subsamples were then randomly selected and merged to achieve coverages of 90%, 80%, 70%, etc. of the original. rDNA CN was estimated from the merged subsamples as previously described. This confirmed that small variations in coverage within the range of our data were not influencing CN estimation in our analyses (Fig. S21).
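The down-sampling scheme can be mimicked on an in-memory list of read identifiers, as in the sketch below. It does not reproduce the seqkit invocation used in the study; the round-robin split, the function names and the seed are assumptions chosen for the example.

```python
import random

def split_reads(reads, n_parts: int = 10):
    """Deal reads round-robin into n_parts subsamples of (near-)equal size."""
    return [reads[i::n_parts] for i in range(n_parts)]

def downsample(parts, n_keep: int, rng: random.Random):
    """Merge n_keep randomly chosen subsamples (e.g. 9 of 10 for ~90% coverage)."""
    chosen = rng.sample(range(len(parts)), n_keep)
    return [read for i in sorted(chosen) for read in parts[i]]

reads = [f"read_{i}" for i in range(1000)]  # placeholder read identifiers
parts = split_reads(reads)
rng = random.Random(0)                       # arbitrary seed for repeatability
for n_keep in range(9, 0, -1):
    subset = downsample(parts, n_keep, rng)
    print(f"{n_keep * 10}% -> {len(subset)} reads")
```

Each merged subset would then be passed through the same alignment and CN estimation steps as the full library, so that the estimate can be compared across coverage levels.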

Genomic feature methylation and coverage estimates

Genomic feature methylation was extracted using the R package methylKit version 1.16.0⁴⁷. Genomic features and repetitive DNA elements were defined based on the hg38 Reference Sequence (RefSeq) genomic annotation and the RepeatMasker database, respectively, both acquired from the UCSC table browser⁴⁸. Regions 1 kb upstream and downstream of the transcription start site of the reference genome were considered as promoters. For rDNA analysis, only CpGs covered by at least 50 unique reads were used, whereas all CpGs were analysed for the rest of the genome. Coverage per genomic feature was estimated for each sample as the average number of reads covering the CpG sites within each annotated genomic feature, as reported by the methylKit methRead() function.

Genotype association between BMI PRSs and rDNA CN

The base data were obtained from the meta-analysis in ref. ²⁷, with files hosted within the GIANT consortium data site of the Broad Institute. In particular, the two summary files for BMI analysis (updated after June 25, 2018) were retrieved, one containing all considered loci (hereinafter referred to as the “COMPLETE” file, with 2,336,269 variants), and another including Conditional and Joint (COJO)-transformed p-values of only significant hits (hereinafter, “COJO” file, with 941 variants).

Base data SNPs were filtered for quality using the munge_sumstats.py script from ldsc version 1.0.1⁴⁹, with the options --snp SNP, --N-col N, --a1 Tested_Allele, --a2 Other_Allele, --frq Freq_Tested_Allele_in_HRS and --n-min 100000 for both COMPLETE and COJO input files. For the COJO input, the options --p P_COJO, --signed-sumstats BETA_COJO,0 and --ignore P,SE,BETA were also included to ensure the intended estimates were employed. This filtering left 1,977,697/2,336,269 SNPs from the COMPLETE input and 802/941 from the COJO input.

Although not explicitly specified, the genomic coordinates indicated in the base data files appear to refer to the GRCh37 assembly (e.g., rs1000096 is listed at chr4:38,692,835, instead of chr4:38,691,214 as it would be in GRCh38). For consistency with the target data, remaining locations in the base data were thus converted to GRCh38 coordinates using the liftOver function of the rtracklayer package version 1.54.0 in R with the hg19ToHg38.over.chain file obtained from UCSC.

Target data vcf files were retrieved from the 1000 Genomes Project FTP servers. In particular, information for 3202 individuals was obtained for the autosomal chromosomes from the 20201028_3202_raw_GT_with_annot folder within the 1000G_2504_high_coverage collection. These were initially filtered at chromosome level with plink2 (version 2.0-20200328) --make-bed, keeping solely the SNPs remaining in the base data using the --extract bed1 option, and QC parameters --mind 0.01, --geno 0.01, --maf 0.01, --hw2 1e-6, and --max-alleles 2. The --set-missing-var-ids @:# option was also included to avoid apparent duplicate names, and a list of successful loci was requested with the --write-snplist option. The per-chromosome filtered outputs were then merged using plink2 --pmerge-list. 1,196,533 SNPs remained in the COMPLETE case after this step, and 467 for the COJO input. These were further pruned to remove highly-correlated SNPs using plink2’s --indep-pairwise 200 50 0.25 option, where the values represent the window size, step size and maximum LD r² threshold allowed. A total of 1,002,886 and 8 SNPs were removed in this step in the COMPLETE and COJO cases, respectively.

The remaining SNPs were then employed to calculate the F coefficient for heterozygosity of each sample using plink2 --het. All individuals with F coefficients more than 3 standard deviations away from the overall mean were removed, leaving 3198 and 3199 samples in the COMPLETE and COJO cases, respectively. These were further pruned with plink2 --king-cutoff 0.125 to avoid closely-related samples biasing the results, with 0.125 representing the relatedness level of second-degree relatives. To ensure reproducibility, a fixed seed value (1986) was also included. This pruning step left 2575 and 2502 individuals for the COMPLETE and COJO analyses, respectively. The final target data files were then generated with plink2 --make-bed specifying the remaining variants and individuals.
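The heterozygosity filter amounts to a simple outlier rule on the per-sample F coefficients. The sketch below applies it to a dictionary of values that would, in practice, be parsed from the plink2 --het output; the function name, the sample identifiers and the use of the sample (n − 1) standard deviation are assumptions made for illustration.

```python
from statistics import mean, stdev

def heterozygosity_outliers(f_by_sample: dict, n_sd: float = 3.0) -> set:
    """Return sample IDs whose F coefficient lies more than n_sd SDs from the mean."""
    values = list(f_by_sample.values())
    centre = mean(values)
    spread = stdev(values)  # sample SD; the study does not state which estimator was used
    return {s for s, f in f_by_sample.items() if abs(f - centre) > n_sd * spread}

# Toy data: twenty typical samples and one with strongly deviating F.
f_values = {f"sample_{i}": 0.0 for i in range(20)}
f_values["sample_odd"] = 1.0
print(heterozygosity_outliers(f_values))  # {'sample_odd'}
```

The returned identifiers would be written to a file and removed from the target data before relatedness pruning.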

Clumping of the remaining variants (retaining only weakly-correlated SNPs most associated with the phenotype of interest) is not yet available in plink2, so plink --clump from version 1.9-170906 was employed instead, with parameters --clump-p1 1, --clump-r2 0.1, and --clump-kb 250. Whereas no significant clumps were identified for the COJO analysis, 1414 clumps from the top 2213 variants were generated for the COMPLETE input in this manner.

Polygenic risk scores (PRSs) were finally computed using plink2 --score at different p-value thresholds indicated with the --q-score-range option. The generated values at each threshold were then employed to construct linear models with rDNA CN estimates as independent variable, as well as the 6 first principal components calculated on the pre-clumping variants with plink2 --pca, sex and 1000 Genomes population as covariates. The goodness of fit of each model was then estimated as the R² of the model itself minus the R² of a null model without the PRSs.
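The goodness-of-fit measure is an incremental R²: the fit of a model containing the PRS and covariates minus the fit of the covariate-only null model. The sketch below computes it with a small ordinary least-squares routine in pure Python. It is an illustration only: the function names are hypothetical, `y` stands for whichever variable is modelled as the response, covariates are assumed to be numeric columns already (categorical terms such as population would need dummy coding first), and the solver fails on collinear predictors.

```python
def ols_r2(columns, y):
    """R^2 of a least-squares fit of y on the given predictor columns plus an intercept."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations: (X'X | X'y)
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gaussian elimination with partial pivoting, then back-substitution
    for k in range(p):
        pivot = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[pivot] = A[pivot], A[k]
        for r in range(k + 1, p):
            factor = A[r][k] / A[k][k]
            for c in range(k, p + 1):
                A[r][c] -= factor * A[k][c]
    beta = [0.0] * p
    for k in reversed(range(p)):
        beta[k] = (A[k][p] - sum(A[k][c] * beta[c] for c in range(k + 1, p))) / A[k][k]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    y_bar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def incremental_r2(prs, covariates, y):
    """R^2 of (PRS + covariates) minus R^2 of the covariate-only null model."""
    return ols_r2([prs] + list(covariates), y) - ols_r2(list(covariates), y)

# Toy data in which the score and the single covariate are uncorrelated.
prs = [float((i * 7) % 11 - 5) for i in range(44)]
covariates = [[(i % 4) - 1.5 for i in range(44)]]
y = [2 * s + c for s, c in zip(prs, covariates[0])]
print(round(incremental_r2(prs, covariates, y), 4))
```

Because the null model is nested in the full model, the difference cannot be negative beyond rounding error, so a value near zero indicates that the PRS adds little explanatory power once covariates are accounted for.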

Genome-wide association study for rDNA CN

Association between SNVs and rDNA CN was performed using the plink2 --glm command from plink release 2.0-20200328, with parameters --mind 0.05, --geno 0.05, --maf 0.05, and --hwe 0.001 for pre-filtering.

General statistical analysis

Final figures and statistical analysis were performed using GraphPad Prism (v9.2.0) and R (v4.1.1). Specific statistical analyses are indicated in the respective figure and table legends. All tests are two-sided unless specified otherwise.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

Whole genome bisulfite sequencing data generated for this study will be available from the Sequence Read Archive (SRA, NCBI) using BioProject ID: PRJNA817350 upon manuscript acceptance. External data sources were available in public repositories: GEO (GSE87893 and GSE157551), European Genome-phenome Archive (Dataset ID EGAD00001009789) and 1000 Genomes Project. According to the Danish and EU legislations, transfer and sharing of individual-level data require prior approval from the Danish Data Protection Agency and require that data sharing requests are dealt with on a case-by-case basis. For this reason, the raw data on the monozygotic twin cohort cannot be deposited in a public database but can be made available through collaboration or upon request (contact Qihua Tan, qtan@health.sdu.dk). All data was aligned to the GRCh38 assembly, unless otherwise specified. rDNA consensus sequences were obtained from GenBank (human KY962518.1, rat V01270.1).

Code availability

All programs and parameters are described within the relevant methods sections, but clarification can be provided upon request.

References

NCD Risk Factor Collaboration (NCD-RisC). Worldwide trends in body-mass index, underweight, overweight, and obesity from 1975 to 2016: a pooled analysis of 2416 population-based measurement studies in 128.9 million children, adolescents, and adults. Lancet 390, 2627–2642 (2017).
Khera, A. V. et al. Polygenic prediction of weight and obesity trajectories from birth to adulthood. Cell 177, 587–596 e589 (2019).
Loos, R. J. F. & Yeo, G. S. H. The genetics of obesity: from discovery to biology. Nat. Rev. Genet. 23, 120–133 (2022).
Wahl, S. et al. Epigenome-wide association study of body mass index, and the adverse outcomes of adiposity. Nature 541, 81–86 (2017).
Nurk, S. et al. The complete sequence of a human genome. Science 376, 44–53 (2022).
Danson, A. F., Marzi, S. J., Lowe, R., Holland, M. L. & Rakyan, V. K. Early life diet conditions the molecular response to post-weaning protein restriction in the mouse. BMC Biol. 16, 51 (2018).
Holland, M. L. et al. Early-life nutrition modulates the epigenetic state of specific rDNA genetic variants in mice. Science 353, 495–498 (2016).
Huang, P. L. A comprehensive definition for metabolic syndrome. Dis. Model. Mech. 2, 231–237 (2009).
Parks, M. M. et al. Variant ribosomal RNA alleles are conserved and exhibit tissue-specific expression. Sci. Adv. 4, eaao0665 (2018).
Rodriguez-Algarra, F. et al. Genetic variation at mouse and human ribosomal DNA influences associated epigenetic states. Genome Biol. 23, 54 (2022).
Shea, J. M. et al. Genetic and epigenetic variation, but not diet, shape the sperm methylome. Dev. Cell 35, 750–758 (2015).
Hori, Y., Shimamoto, A. & Kobayashi, T. The human ribosomal DNA array is composed of highly homogenized tandem clusters. Genome Res. 31, 1971–1982 (2021).
Razzaq, A., Bejaoui, Y., Alam, T., Saad, M. & El Hajj, N. Ribosomal DNA copy number variation is coupled with DNA methylation changes at the 45S rDNA locus. Epigenetics 18, 2229203 (2023).
Gibbons, J. G., Branco, A. T., Godinho, S. A., Yu, S. & Lemos, B. Concerted copy number variation balances ribosomal DNA dosage in human and mouse genomes. Proc. Natl Acad. Sci. USA 112, 2485–2490 (2015).
Hall, A. N., Turner, T. N. & Queitsch, C. Thousands of high-quality sequencing samples fail to show meaningful correlation between 5S and 45S ribosomal DNA arrays in humans. Sci. Rep. 11, 449 (2021).
Orozco, L. D. et al. Epigenome-wide association in adipose tissue from the METSIM cohort. Hum. Mol. Genet. 27, 1830–1846 (2018).
Laakso, M. et al. The metabolic syndrome in men study: a resource for studies of metabolic and cardiovascular diseases. J. Lipid Res. 58, 481–493 (2017).
Wurtz, P. et al. Metabolomic profiling of statin use and genetic inhibition of HMG-CoA reductase. J. Am. Coll. Cardiol. 67, 1200–1210 (2016).
Mitchell, E. et al. Clonal dynamics of haematopoiesis across the human lifespan. Nature 606, 343–350 (2022).
Desai, M. Y. et al. Association of body mass index, metabolic syndrome, and leukocyte count. Am. J. Cardiol. 97, 835–838 (2006).
Nishimura, S. et al. CD8+ effector T cells contribute to macrophage recruitment and adipose tissue inflammation in obesity. Nat. Med. 15, 914–920 (2009).
Xu, B. et al. Ribosomal DNA copy number loss and sequence variation in cancer. PLoS Genet. 13, e1006771 (2017).
Loyfer, N. et al. A DNA methylation atlas of normal human cell types. Nature 613, 355–364 (2023).
Stults, D. M. et al. Human rRNA gene clusters are recombinational hotspots in cancer. Cancer Res. 69, 9096–9104 (2009).
Valori, V. et al. Human rDNA copy number is unstable in metastatic breast cancers. Epigenetics 15, 85–106 (2020).
Li, W. et al. DNA methylome profiling in identical twin pairs discordant for body mass index. Int. J. Obes. (Lond.) 43, 2491–2499 (2019).
Yengo, L. et al. Meta-analysis of genome-wide association studies for height and body mass index in approximately 700000 individuals of European ancestry. Hum. Mol. Genet. 27, 3641–3649 (2018).
Choi, S. W., Mak, T. S. & O’Reilly, P. F. Tutorial: a guide to performing polygenic risk score analyses. Nat. Protoc. 15, 2759–2772 (2020).
Phan L. et al. ALFA: allele frequency aggregator. National Center for Biotechnology Information, U.S. National Library of Medicine. (2021).
Mesnage, R. et al. Comparative toxicogenomics of glyphosate and roundup herbicides by mammalian stem cell-based genotoxicity assays and molecular profiling in sprague-dawley rats. Toxicol. Sci. 186, 83–101 (2022).
Mesnage, R. et al. Multi-omics phenotyping of the gut-liver axis reveals metabolic perturbations from a low-dose pesticide mixture in rats. Commun. Biol. 4, 471 (2021).
Ghasemi, A., Jeddi, S. & Kashfi, K. The laboratory rat: age and body weight matter. EXCLI J. 20, 1431–1445 (2021).
Xie, S. Q. et al. Nucleolar-based Dux repression is essential for embryonic two-cell stage exit. Genes Dev. 36, 331–347 (2022).
Stults, D. M., Killen, M. W., Pierce, H. H. & Pierce, A. J. Genomic architecture and inheritance of human ribosomal RNA gene clusters. Genome Res. 18, 13–18 (2008).
Hallgren, J., Pietrzak, M., Rempala, G., Nelson, P. T. & Hetman, M. Neurodegeneration-associated instability of ribosomal DNA. Biochim. Biophys. Acta 1842, 860–868 (2014).
Udugama, M. et al. Ribosomal DNA copy loss and repeat instability in ATRX-mutated cancers. Proc. Natl. Acad. Sci. USA 115, 4737–4742 (2018).
Durand, S. et al. RSL24D1 sustains steady-state ribosome biogenesis and pluripotency translational programs in embryonic stem cells. Nat. Commun. 14, 356 (2023).
Gregor, M. F. & Hotamisligil, G. S. Inflammatory mechanisms in obesity. Annu. Rev. Immunol. 29, 415–445 (2011).
Zhang, T. et al. Rate of change in body mass index at different ages during childhood and adult obesity risk. Pediatr. Obes. 14, e12513 (2019).
Bulut-Karslioglu, A. et al. The transcriptionally permissive chromatin state of embryonic stem cells is acutely tuned to translational output. Cell Stem Cell 22, 369–383.e368 (2018).
Khajuria, R. K. et al. Ribosome levels selectively regulate translation and lineage commitment in human hematopoiesis. Cell 173, 90–103 e119 (2018).
Paredes, S. & Maggert, K. A. Ribosomal DNA contributes to global chromatin regulation. Proc. Natl Acad. Sci. USA 106, 17829–17834 (2009).
Paredes, S., Branco, A. T., Hartl, D. L., Maggert, K. A. & Lemos, B. Ribosomal DNA deletions modulate genome-wide gene expression: “rDNA-sensitive” genes and natural variation. PLoS Genet. 7, e1001376 (2011).
Tanaka, Y. et al. JmjC enzyme KDM2A is a regulator of rRNA transcription in response to starvation. EMBO J. 29, 1510–1522 (2010).
Murayama, A. et al. Epigenetic control of rDNA loci in response to intracellular energy status. Cell 133, 627–639 (2008).
Clarke, L. et al. The international Genome sample resource (IGSR): a worldwide collection of genome variation incorporating the 1000 Genomes Project data. Nucleic Acids Res. 45, D854–D859 (2017).
Akalin, A. et al. methylKit: a comprehensive R package for the analysis of genome-wide DNA methylation profiles. Genome Biol. 13, R87 (2012).
Karolchik, D. et al. The UCSC table browser data retrieval tool. Nucleic Acids Res. 32, D493–D496 (2004).
Bulik-Sullivan, B. K. et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 47, 291–295 (2015).

Acknowledgements

We thank Christopher Bell (QMUL) and Sue Ozanne (IMS, Cambridge) for their helpful comments on the manuscript. For the purpose of open access, the author has applied a Creative Commons Attribution (CC BY) licence to any Author Accepted Manuscript version arising. This work was supported by an Academy of Medical Sciences Springboard Award (SBF003\1026) (M.L.H.), Medical Research Council (MR/X009661/1) (M.L.H.), Medical Research Council (MR/P011799/1) (D.J.W., V.K.R., M.L.H., S.H.), Barts Charity Grant (MGU0604) (V.K.R.), Rosetrees grant (Seedcorn 2021\100182) (V.K.R.), Rosetrees grant (CF-2021-2\109) (V.K.R.), BBSRC grant (BB/R00675X/1) (V.K.R. & F.R.-A.). D.J.W.’s salary is partly supported by the National Institute for Health and Care Research University College London Hospitals Biomedical Research Centre.

Author information

Robert A. E. Seaborne
Present address: Centre for Human and Applied Physiological Studies, King’s College London, London, UK
These authors contributed equally: Pui Pik Law, Liudmila A. Mikheeva, Francisco Rodriguez-Algarra.

Authors and Affiliations

Department of Medical and Molecular Genetics, School of Basic and Medical Biosciences, King’s College London, London, UK
Pui Pik Law, Liudmila A. Mikheeva, James R. C. Miller, Robin Mesnage, Michael N. Antoniou & Michelle L. Holland
The Blizard Institute, School of Medicine and Dentistry, Queen Mary University of London, London, UK
Pui Pik Law, Francisco Rodriguez-Algarra, Robert A. E. Seaborne, Selin Yildizoglu, Hemanth Tummala & Vardhman K. Rakyan
UCL EGA Institute for Women’s Health, University College London, London, UK
Fredrika Asenius, Maria Gregori, Sara L. Hillman & David J. Williams
Population Research Unit, University of Helsinki, Helsinki, Finland
Weilong Li
Epidemiology, Biostatistics and Biodemography, Department of Public Health, University of Southern Denmark, Copenhagen, Denmark
Qihua Tan

Authors

Pui Pik Law
Liudmila A. Mikheeva
Francisco Rodriguez-Algarra
Fredrika Asenius
Maria Gregori
Robert A. E. Seaborne
Selin Yildizoglu
James R. C. Miller
Hemanth Tummala
Robin Mesnage
Michael N. Antoniou
Weilong Li
Qihua Tan
Sara L. Hillman
Vardhman K. Rakyan
David J. Williams
Michelle L. Holland

Contributions

Conceptualization: M.L.H., V.K.R., D.J.W. Methodology: M.L.H., V.K.R., D.J.W., P.P.L., F.R.-A., L.A.M. Investigation: P.P.L., L.A.M., F.R.-A., F.A., M.G., M.L.H., R.A.E.S., R.M., M.N.A., W.L., Q.T., S.Y., J.R.C.M., H.T. Visualization: M.L.H., P.P.L., F.R.-A., L.A.M. Funding acquisition: M.L.H., V.K.R., D.J.W., S.L.H. Project administration: M.L.H., V.K.R., D.J.W. Supervision: M.L.H., V.K.R., D.J.W. Writing – original draft: M.L.H., P.P.L., F.R.-A. Writing – review & editing: M.L.H., V.K.R., F.R.-A., L.A.M., P.P.L., R.A.E.S., D.J.W. These authors contributed equally: P.P.L., L.A.M., F.R.-A.

Corresponding author

Correspondence to Michelle L. Holland.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Nady El Hajj and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Peer Review File

Description of Additional Supplementary Files

Supplementary Data 1

Supplementary Data 2

Supplementary Data 3

Supplementary Data 4

Supplementary Data 5

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Law, P.P., Mikheeva, L.A., Rodriguez-Algarra, F. et al. Ribosomal DNA copy number is associated with body mass in humans and other mammals. Nat Commun 15, 5006 (2024). https://doi.org/10.1038/s41467-024-49397-5

Received: 21 July 2023
Accepted: 03 June 2024
Published: 12 June 2024
DOI: https://doi.org/10.1038/s41467-024-49397-5