Main

Mitochondria, often described as the powerhouses of eukaryotic cells, probably originated from an α-proteobacterium entering a host cell around 2 billion years ago through an endosymbiotic event1,2. Their circular, double-stranded genome (mitochondrial DNA, mtDNA) resembles bacterial DNA and has a mutation rate substantially higher (100–1,000 times) than that of nuclear DNA, attributed to the absence of histone protection, rapid replication cycles and less efficient DNA damage repair3,4. Given their sensitivity to various forms of age-related damage, such as replication errors and oxidative damage, mitochondria can act as a sensitive aging clock, recording a tissue-specific chronicle of age-accumulating damage.

The mitochondrial genome, inherited from the maternal germline, populates each cell in multiple copies and exhibits extensive clonal diversity, known as heteroplasmy, both between cells and across different body sites5. This clonal diversity is highly dynamic throughout a lifetime, with mutant clones constantly arising and their abundance undergoing dynamic changes with cell division and mitochondrial turnover (mitophagy, fission and fusion)6,7. These changes are driven by two primary mechanisms: replication errors during cell division in tissues with constant cellular turnover, and mitochondrial turnover, including a process known as relaxed replication8,9,10, which allows mtDNA to replicate independently of the cell cycle, even in post-mitotic cells. The expansion of defective clones can lead to diseases, especially in high-energy-demanding tissues11,12,13,14, such as Alzheimer’s and Parkinson’s diseases15,16,17, macular degeneration18, diabetes19 and heart disease20. Such tissue-specific dynamics position mitochondria as unique markers for uncovering body-wide accumulations of age-related damage.

Clonal mosaicism within mitochondria, characterized by the presence of distinct mitochondrial genotypes within tissues, may be both a driver and indicator of the underlying forces of aging21. Mitochondrial dysfunction, characterized by impaired biogenesis, fission, fusion and mitophagy, leads to increased production of reactive oxygen species (ROS) and genomic instability. These factors coupled with cellular turnover drive the clonal expansion of mutated mitochondrial genomes, contributing to tissue-specific aging patterns. As such, mitochondrial mosaicism reflects the broader genomic instability and cellular decline that underlie aging22.

While previous studies have explored tissue-specific clonal expansions in nuclear DNA during aging23,24,25,26, the mitochondrial genome offers much higher resolution due to its much higher copy number and mutation rate. Existing studies of mitochondrial mutations have focused primarily on animal models (mice27,28,29 and macaques30), tumor samples31 or a limited number of human tissues32,33,34,35. However, a systematic body-wide quantification of mitochondrial clonal mosaicism during aging remains elusive. While recent work has quantified the intrinsic mtDNA mutation rate per replication or turnover event35, the long-term, tissue-specific consequences of these mutational processes across the human lifespan, and how they contribute to the overall aging landscape, remain less well understood. This study aims to utilize the mitochondrial mutational clock to characterize age-accumulated cellular damage over time, specifically focusing on how this clock operates differently in tissues with distinct cellular turnover and metabolic demands. We propose that a biphasic mitochondrial molecular clock exists, with distinct mechanisms driving mutation accumulation in dividing versus post-mitotic tissues. By analyzing RNA reads from 47 tissues in the Genotype-Tissue Expression (GTEx) project36, we comprehensively measured tissue-specific mutational burdens and their dynamics with age. Our findings reveal a striking tissue-specific pattern of mitochondrial genome alteration during aging, driven by both cellular proliferation and mitochondrial turnover, resulting in distinctive aging hallmarks. These dynamics serve as a unique tissue-specific signature that not only delineates the trajectory of cellular clonal expansion and cumulative cellular damage throughout an individual’s lifespan, but also sheds light on the driving forces of aging, such as genomic instability, mitochondrial dysfunction and environmental exposure. Unraveling the intricacies of the mitochondrial clock has the potential to pave the way for early diagnosis and targeted interventions in age-related disorders.

Results

Dynamic mitochondrial clonal mosaicism across human tissues

Mitochondrial mutations accumulate and evolve over time, creating intricate clonal structures that vary across different tissues. This dynamic process is driven by the continuous turnover of mitochondria and the renewal or proliferation of the host cells (Fig. 1a). As new mutations arise throughout development and aging, subsequent clonal expansion creates a mosaic pattern of mitochondrial clonal structures (for an example of such structures as revealed by single-cell RNA sequencing (RNA-seq), see Fig. 1b), leading to highly heterogeneous mitochondrial populations across different body sites.

Fig. 1: Clonal dynamics of mitochondrial mutations.

a, A schematic illustrating the accumulation and proliferation of mtDNA variants over time, with each color representing a unique mitochondrial clone. Created with BioRender.com. b, A heatmap showcasing the mitochondrial clonal structure across individual cells of the pancreas (an example derived from reanalysis of data from ref. 64), where each column represents a cell, and row shading indicates variant abundance. The cumulative abundance of each variant across all cells is also shown, reflecting the expected heteroplasmy value in bulk sequencing data. c, A schematic overview of the workflow for calling somatic variants from 47 tissues collected from 838 GTEx donors aged 20–70 years. Created with BioRender.com. d, Typical examples of mitochondrial mutations and clonal expansion among samples from the GTEx cohort. Each row represents a specific mitochondrial variant, and the colored dots indicate the variant’s abundance (heteroplasmy) in different tissues of that individual. The color scheme for tissues follows the GTEx conventions specified in Extended Data Fig. 2. e, The average mitochondrial RNA read coverage across tissues. The plot illustrates the distribution of RNA-seq read coverage across the mitochondrial genome in representative tissues (liver, esophageal mucosa and whole blood), highlighting regions excluded from variant calling owing to low coverage or the presence of known artifacts (tRNA, NUMTs, gene boundaries and so on). f, The number of variant sites by substitution type and functional category.

To comprehensively profile mitochondrial clonal mosaicism, we systematically analyzed mitochondrial RNA (RNA transcribed from mtDNA) from 838 healthy individuals across 47 different tissues from the GTEx cohort (Fig. 1c and Extended Data Fig. 2). After stringent quality control for read depth and sample contamination (Extended Data Fig. 1), a total of 11,802 samples were included in the analysis. We performed variant calling (using the Mutect2 tool in mitochondrial mode, as per the gnomAD mitochondrial pipeline33) and clonal abundance quantification (heteroplasmy level) at 5,376 sites with detected variants (Fig. 1d,e), encompassing a variety of substitution types (likely originating from spontaneous deamination and ROS damage) and functional categories (Fig. 1f). RNA coverage over the mitochondrial genome ranged from 2,000× to 8,000× depending on the tissue (Fig. 1e), with all tissues achieving an average of at least 1,000×. Regions with low RNA coverage, including tRNA and gene boundaries, were excluded. Rigorous filtering, in accordance with previously published protocols for variant calling in both mitochondrial genomes33,37 and RNA-seq data24,25,26, was applied to remove potential artifacts, such as nuclear mitochondrial DNA segments (NUMTs)38, strand bias, low base quality and other blacklisted sites, resulting in the exclusion of 2,497 sites (Fig. 1e and Methods). To ensure reliability, only single-nucleotide substitutions were considered, excluding sites with indels or structural variants. In addition, potential sample contamination was thoroughly assessed and eliminated. Notably, mitochondrial RNA may harbor alterations beyond those transcribed from mtDNA, such as RNA modifications39,40. Those sites (2,617 and 13,710; Supplementary Fig. 2) were excluded from subsequent analyses. We also observed a difference between DNA and RNA at position 9,209 (Supplementary Fig. 2), a previously uncharacterized site of currently unknown etiology, which was analyzed separately.
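As an illustration of the quantification step described above, the following minimal sketch computes a heteroplasmy level as the fraction of reads supporting the variant allele at a site. It is not the pipeline used in the study (which relies on Mutect2 in mitochondrial mode); the SiteCounts structure, the heteroplasmy function and the per-site min_depth threshold are hypothetical names and values introduced here for illustration only. The two excluded positions are those named in the text as putative RNA modification sites.

```python
from dataclasses import dataclass
from typing import Optional

# Positions excluded in the text as likely RNA modifications rather than mtDNA variants.
RNA_MODIFICATION_SITES = {2617, 13710}


@dataclass
class SiteCounts:
    """Read counts at one mitochondrial position in one sample (hypothetical structure)."""
    position: int
    ref_reads: int
    alt_reads: int


def heteroplasmy(site: SiteCounts, min_depth: int = 1000) -> Optional[float]:
    """Return the variant allele fraction, or None if the site should be skipped.

    min_depth is an assumed per-site cutoff chosen for this sketch; the text
    reports average coverage per tissue, not a per-site threshold.
    """
    if site.position in RNA_MODIFICATION_SITES:
        return None  # excluded from analysis
    depth = site.ref_reads + site.alt_reads
    if depth < min_depth:
        return None  # too few reads to estimate heteroplasmy reliably
    return site.alt_reads / depth


# Hypothetical counts: 4,900 of 5,000 reads carry the variant allele.
example = SiteCounts(position=3392, ref_reads=100, alt_reads=4900)
print(heteroplasmy(example))
```

In practice a caller such as Mutect2 also models sequencing error, strand bias and base quality, none of which this sketch attempts.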

Mutant clones display distinct characteristics based on their likely origin, underscoring the complex underlying forces of mutational gain and clonal expansion. Figure 1d shows typical examples of germline variants and somatic mutations. Germline variants are shared across all tissues but can exhibit widely different clonal abundances (heteroplasmy levels) among tissues due to random drift, as seen with the variants chrM_228_G_A in GTEx-1C64N and chrM_12389_C_T in GTEx-WY7C. Somatic mutations mostly occur in a single tissue spot, with a few instances of early developmental origin shared across multiple tissues. Although somatic mutations acquired later in life usually exhibit low abundance, some can reach high abundance over time. For example, the somatic missense variant chrM_3392_G_A in the individual GTEx-1GZ4H attains nearly 100% heteroplasmy level in the thyroid but is undetectable in other tissues. Notably, pathogenic mutations can reach high levels of heteroplasmy in specific tissues, such as chrM_11778_G_A in GTEx-15CHQ, associated with Leber hereditary optic neuropathy (LHON, Fig. 1d). This evidence highlights the critical potential for clonal expansion and the importance of examining tissue-specific clonal dynamics. Moreover, mutations of likely early developmental origin are shared across multiple tissues and are more likely to originate from the same germ layer, as in the case of chrM_76_C_T in GTEx-1F6I4, all from the mesoderm. Our subsequent analyses focus on somatic mutations, defined as those mutations present in one or more, but not all, tissues of an individual (that is, not germline). These mutations provide valuable insights into the tissue-specific dynamics of mutation accumulation and clonal expansion during aging.
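The working definition of somatic mutations given above (present in one or more, but not all, tissues of an individual) can be sketched as a simple classification rule. The function name, the category labels and the detection_threshold value below are assumptions made for illustration and do not reproduce the study's calling criteria; the heteroplasmy values in the example are invented.

```python
from typing import Dict


def classify_variant(het_by_tissue: Dict[str, float],
                     detection_threshold: float = 0.01) -> str:
    """Classify one variant in one individual from its per-tissue heteroplasmy.

    detection_threshold is a hypothetical cutoff below which the variant is
    treated as undetected in a tissue.
    """
    detected = [t for t, h in het_by_tissue.items() if h >= detection_threshold]
    if not detected:
        return "absent"
    if len(detected) == len(het_by_tissue):
        return "germline"          # seen in every sampled tissue
    if len(detected) == 1:
        return "somatic_private"   # confined to a single tissue sample
    return "somatic_shared"        # several, but not all, tissues


# Invented example values for illustration.
print(classify_variant({"thyroid": 0.98, "liver": 0.0, "whole_blood": 0.0}))
print(classify_variant({"thyroid": 0.40, "liver": 0.15, "whole_blood": 0.62}))
```

Note that this rule depends on how many tissues were sampled per individual: with a single sampled tissue, every detected variant would be labeled germline, so a minimum number of tissues per donor would be needed in any real application.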

Age drives tissue-specific mitochondrial mutation rates

The accumulation rate of mitochondrial mutations varies widely across tissues, driving distinct clonal dynamics. Most tissues exhibit a continuous gain of somatic mutations, with a maximum rate of 0.3 somatic mutations per year observed in the liver (Fig. 2a,b). Notably, tissues with constantly renewing epithelial cells6,41 (for example, the gastrointestinal tract, typical of endodermal origin) and nondividing cells with high energy demands10 (for example, muscle and the heart, typical of mesodermal origin) show particularly high rates of mutation accumulation. This observation suggests that both replicative stress and metabolic demands play important roles in mitochondrial mutagenesis, with potential implications for tissue-specific aging and disease susceptibility. Interestingly, a small fraction of mutations are shared across different tissues (Fig. 2a), suggesting they may originate either during early development or from recurrent mutations at specific hotspot sites (explored further in subsequent sections). Notably, these shared mutations tend to occur within tissues derived from the same germ layer (Extended Data Fig. 5), further implying that shared cellular lineage influences mutational patterns. After excluding hotspot mutations (which are more likely to arise independently in different tissues rather than through shared lineage), approximately 20% of all somatic mutations appear to stem from early developmental lineages, with less than 5% shared across all three germ layers—probably arising before gastrulation (Extended Data Fig. 6). This relatively small percentage aligns with previous findings35, suggesting that, while shared lineage plays a role, the majority of somatic mutations in these tissues arise later in development or during adulthood.
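To make the germ-layer categorization of shared mutations concrete, the sketch below counts how many germ layers are spanned by the tissues in which a mutation is detected. The tissue-to-germ-layer map is a simplified, illustrative assumption (bulk tissues such as skin contain cells from more than one germ layer), and the tissue labels and function name are hypothetical rather than taken from the study.

```python
from typing import Iterable, Set

# Simplified, illustrative mapping; real bulk tissues are of mixed origin.
GERM_LAYER = {
    "liver": "endoderm",
    "esophagus_mucosa": "endoderm",
    "heart_left_ventricle": "mesoderm",
    "skeletal_muscle": "mesoderm",
    "brain_cortex": "ectoderm",
    "skin_sun_exposed": "ectoderm",
}


def germ_layers_spanned(tissues: Iterable[str]) -> Set[str]:
    """Return the set of germ layers covered by the tissues carrying a mutation."""
    return {GERM_LAYER[t] for t in tissues}


# A mutation shared by two mesodermal tissues spans one germ layer.
print(len(germ_layers_spanned(["heart_left_ventricle", "skeletal_muscle"])))
# A mutation seen in tissues from all three layers may predate gastrulation.
print(len(germ_layers_spanned(["liver", "skeletal_muscle", "brain_cortex"])))
```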

Fig. 2: Age accumulation of mitochondrial mutations across body sites.

a, The number of tissue-private mutations (somatic mutations found only in a single tissue sample) per individual across 15 representative tissues, stratified by age group. The average number for the remaining tissues within each germ layer origin (endoderm, mesoderm and ectoderm) is also presented. The stacked bars at the bottom illustrate the number of mutations shared across multiple tissues within an individual, categorized by the number of germ layers in which they occur (1, 2 or all 3). b, Mitochondrial mutations and clonal expansion with age in the liver. The number of mutations represents the average across individuals within each age group. Total heteroplasmy reflects the summation of the heteroplasmy of all variants in an individual. Linear regression shows a significant increase in mutation count with age (β = 0.31 mutations per year, P = 4 × 10−16, n = 190 individuals). c, The rate of mutation accumulation (β, mutations per year) assessed by linear regression across six representative tissues. The shaded areas represent the 95% confidence interval (CI) for the linear regression line β. d, Differential rates (β) of age-related mutation accumulation across all tissues, derived from linear regression (sample sizes per tissue indicated in d). The error bars represent the 95% CI, which is based on the t-statistic of the corresponding mutation rate β and total heteroplasmy μ estimates. The asterisks indicate significant age-dependent mutation rate β (P < 0.001, FDR <0.05 after Bonferroni correction). Sample sizes (n, number of individuals) are marked on top of β estimates of each tissue. Detailed statistics for each tissue are provided in Supplementary Table 2. Tissues in this panel are color-coded to represent different tissue types, following the standard GTEx conventions (Extended Data Fig. 2). 
e, The relationship between age-related mutation accumulation (mutations per 10 years) with the total number of somatic mutations and their total heteroplasmy at an advanced age (60 years old), with the circle size representing the total heteroplasmy of somatic mutations. In b and d, box plots display the median (center line), interquartile range (IQR, box limits) and whiskers (extending to 1.5× IQR); notches indicate 95% CI for the median.

We further examined the pattern of mutational accumulation in each individual tissue. Among the 47 tissues analyzed from the GTEx cohort, 21 exhibited a significant positive correlation between the number of mutations and age (linear regression β > 0, P < 0.001, false discovery rate (FDR) <0.05 after Bonferroni correction), suggesting progressive accumulation of mitochondrial mutations throughout life (Fig. 2d and Extended Data Figs. 3 and 4). However, the dynamics of accumulation vary significantly between different tissues. The liver exhibits the most rapid accumulation of mutations with age (β = 0.31 per year, P = 4 × 10−16; Fig. 2b), followed by the left ventricle (β = 0.16 per year, P = 3 × 10−23) and esophageal mucosa (β = 0.11 per year, P = 8 × 10−24; Fig. 2c). This differential accumulation pattern probably reflects the interplay between tissue-specific factors, suggesting that both cellular turnover rate and environmental exposure have a substantial impact on mutational burden and clonal expansion. Further comparing esophageal mucosa (primarily epithelial) with esophageal muscularis (smooth muscle) highlights the profound impact of cellular composition on mutational burden (βmucosa: 0.11 per year > βmuscularis: 0.03 per year, P = 5.6 × 10−11; Fig. 2c). The higher mutation rate observed in the epithelial layer suggests that the constant turnover of these cells, with their continuous replication and division, drives increased clonal dynamics and heightened accumulation of mutations over time, with implications for understanding tissue-specific vulnerabilities to age-related dysfunction and, potentially, the development of malignancies in epithelial tissues. In addition, sun-exposed skin exhibits a much higher rate of mitochondrial mutations compared with nonexposed skin (βexposed: 0.09 per year > βnonexposed: 0.03 per year, P = 2 × 10−6; Fig. 2c), which underscores the influence of environmental factors, such as ultraviolet (UV) radiation, on mutational dynamics.
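The per-tissue rates reported above are slopes from a linear regression of mutation count on age. A minimal sketch of such a fit, using ordinary least squares written out by hand, is given below together with a Bonferroni adjustment of P values across tissues. The function names are hypothetical, the input data are synthetic, and the choice of 47 tests (one per tissue) is an assumption; the study's exact model specification and covariates are not reproduced here.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def ols_slope(ages: Sequence[float],
              counts: Sequence[float]) -> Tuple[float, float, float]:
    """Fit counts = intercept + beta * age; return (beta, intercept, standard error of beta)."""
    n = len(ages)
    mean_x = sum(ages) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in ages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, counts))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_x
    rss = sum((y - (intercept + beta * x)) ** 2 for x, y in zip(ages, counts))
    se_beta = math.sqrt(rss / (n - 2) / sxx)
    return beta, intercept, se_beta


def bonferroni(p_values: Dict[str, float],
               n_tests: Optional[int] = None) -> Dict[str, float]:
    """Multiply each P value by the number of tests, capping at 1."""
    m = len(p_values) if n_tests is None else n_tests
    return {name: min(1.0, p * m) for name, p in p_values.items()}


# Synthetic data: mutation counts rising by 0.31 per year of age.
ages = [20, 30, 40, 50, 60, 70]
counts = [1 + 0.31 * a for a in ages]
beta, intercept, se_beta = ols_slope(ages, counts)
print(round(beta, 2))

# P values quoted in the text, adjusted for an assumed 47 tests.
print(bonferroni({"liver": 4e-16, "left_ventricle": 3e-23}, n_tests=47))
```

An approximate 95% confidence interval for the slope could then be formed from the standard error and the appropriate t quantile, which is omitted here to keep the sketch free of external dependencies.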

Both the number of distinct mitochondrial mutations and the total abundance of those mutations (total heteroplasmy) increase with age, suggesting a continuous acquisition of mutations and subsequent clonal expansions of these mutations (Fig. 2d,e). This expansion, resulting from the proliferation of cells carrying specific mitochondrial mutations, drives the overall increase in clonal abundance. The liver, exhibiting the highest rate of mutation accumulation, reaches an average mutant heteroplasmy level of 15% by the age of 60 (Fig. 2b,d). Overall, while age is the primary driver of mitochondrial clonal diversity, the rate of mutation accumulation and clonal expansion varies significantly across tissues (Fig. 2d,e). This variation probably reflects the influence of tissue-specific factors, such as cellular turnover rates, metabolic demands and exposure to environmental stressors.

Sporadic versus hotspot mutations define aging hallmarks

The distinct clonal dynamics observed across tissues potentially arise from two primary mutational processes: deamination-derived replication errors and oxidative (ROS) damage. However, their contributions differ significantly, leading to unique aging hallmarks across tissues. Replication errors, primarily C > T (β = 0.03 per year, P = 8.6 × 10−202) and T > C (β = 0.02 per year, P = 9.7 × 10−94) transitions characteristic of spontaneous deamination, are the dominant contributor to age-related mitochondrial mutation accumulation and clonal expansion (Fig. 3a). This is evident in the significant increase of these mutations with age, compared with minimal expansion of transversions, including ROS-related C > A and C > G.
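The grouping of substitution types used above can be sketched as follows. Each reference-to-alternate change is first collapsed onto a pyrimidine reference base (so that, for example, G > A is counted as C > T), and the resulting class is then labeled as a transition or as a ROS-related transversion following the categories named in the text. Collapsing complementary changes is an assumption of this sketch; mitochondrial spectra are often analyzed in a strand-specific manner, and the function names here are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref: str, alt: str) -> str:
    """Collapse a substitution onto a pyrimidine (C or T) reference base."""
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-nucleotide substitution: {ref}>{alt}")
    if ref in ("A", "G"):
        # Report the equivalent change on the complementary strand.
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def mutation_category(ref: str, alt: str) -> str:
    """Label a substitution using the categories discussed in the text."""
    cls = substitution_class(ref, alt)
    if cls in ("C>T", "T>C"):
        return "transition"
    if cls in ("C>A", "C>G"):
        return "ROS-related transversion"
    return "other transversion"


# chrM_11778_G_A from the text collapses to a C>T transition.
print(substitution_class("G", "A"), mutation_category("G", "A"))
```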

Fig. 3: Tissue-specific aging hallmarks driven by distinct mutational mechanisms.

a, Rate of mutational accumulation (β, mutations per year) by different substitution types, averaged over all 47 tissues. The lines represent linear regression fits of mutation count against age. The shaded areas represent the 95% CI for the slope (β). b, Tissue-specific aging hallmarks. The heatmap highlights the unique mutational signatures characteristic of aging in different tissues. Age-accelerated mutational hotspots are defined as sites with significant age-dependent increases in heteroplasmy (identified via linear regression; details in Supplementary Table 3). c, Non-ROS mutation accumulation across tissues. From top to bottom: the accumulation rate of hotspot and sporadic mutations (estimates from linear regression), total heteroplasmy at age 60, total number of mutations at age 60, and cell type enrichment of each tissue. The asterisks indicate a significant age-dependent mutation rate β assessed by linear regression (P < 0.001, FDR <0.05 after Bonferroni correction). The error bars represent 95% CI, which is based on the t-statistic of the corresponding slope β and mean μ estimates. Sample sizes (n, number of individuals) are marked on top of β estimates of each tissue. Detailed statistics for each estimate (slope β, 95% CI, exact P value) are provided in Supplementary Tables 2 and 3. For total heteroplasmy, box plots display the median (center line), interquartile range (IQR, box limits) and whiskers (extending to 1.5× IQR); the notches provide an approximate 95% CI around the median.

Further examination reveals that the accumulation of replication errors (non-ROS mutations) is not uniform across tissues but is instead shaped by two distinct mechanisms: sporadic mutations and mutations at hotspots. Sporadic mutations, occurring randomly throughout the mitochondrial genome, are primary contributors to mutagenesis and subsequent clonal expansion in tissues with continuous cellular turnover, such as the gastrointestinal tract (Fig. 3c). This is supported by the enrichment of constantly renewing cell types (epithelial cells, hepatocytes and keratinocytes) in tissues with high clonal expansion (Fig. 3c, bottom). Even within the brain, regions with a higher proportion of dividing glia display increased clonal diversity compared with neuron-rich regions (Fig. 3c, bottom), further emphasizing the role of cellular turnover in propagating these sporadic mutations.

By contrast, post-mitotic tissues such as the heart, brain and muscle exhibited a distinct pattern of mutation accumulation, concentrated primarily at specific hotspot sites (Fig. 3b,c). While hotspot mutations were observed across tissues of high metabolic activity, their relative contribution to the overall mutational burden was significantly higher in post-mitotic tissues. We defined hotspots as mitochondrial positions exhibiting both high prevalence (Supplementary Fig. 3, mutations present in over 10% of individuals by the age of 60, which is significantly higher than the overall mutation rate across all sites, P < 1 × 10−9, Fisher’s exact test, FDR <0.001 after Bonferroni correction) and a significant increase in frequency with age (Supplementary Fig. 4). We identified 22 such sites with significant age-accumulating effects (Fig. 3b and Supplementary Fig. 4). These hotspots, representing a distinct aging hallmark in these tissues (Fig. 3b), are probably a consequence of the cumulative burden of high energy metabolism and rapid mitochondrial turnover, coupled with the inherent error tendencies of DNA polymerase γ during relaxed replication associated with this turnover42. Relaxed replication allows individual mtDNA molecules to replicate independently of the cell cycle8,9,10. This is particularly relevant at hotspot sites, which are more susceptible to the same mutations across different cells, leading to their accumulation (increase of heteroplasmy) within a tissue. Given that this accumulation is expected to occur in a quasi-deterministic manner regardless of the individual, we further validated the age-dependent increase in heteroplasmy at these hotspot sites through targeted deep sequencing (average coverage >10,000×) of mitochondrial genomes from independent human testicular tissue samples (Extended Data Fig. 10). 
Furthermore, we replicated this age-dependent increase of heteroplasmy at these hotspots using independent data from the Encyclopedia of DNA Elements (ENCODE) and The Cancer Genome Atlas (TCGA, nontumor samples) (Supplementary Figs. 5 and 6). While linear regression was used to model the rate of mutation accumulation for ease of interpretation, this may not fully capture the complexity of the process, particularly in tissues like the heart, where mutation accumulation may exhibit a nonlinear pattern, potentially accelerating at advanced ages (Supplementary Fig. 7).
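The two-part hotspot definition above (high prevalence plus a significant age-dependent increase) can be expressed as a simple filter over per-site summary statistics. The sketch below assumes these statistics have already been computed elsewhere; the SiteSummary fields, the alpha value and the example numbers are hypothetical, and the Fisher's exact test used in the study to compare prevalence against the background rate is not implemented here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SiteSummary:
    """Precomputed statistics for one mitochondrial position (hypothetical structure)."""
    position: int
    prevalence_by_60: float   # fraction of individuals carrying a mutation by age 60
    age_slope: float          # change in heteroplasmy per year from regression
    age_p_adjusted: float     # multiple-testing-adjusted P value for the slope


def is_hotspot(site: SiteSummary,
               min_prevalence: float = 0.10,
               alpha: float = 0.05) -> bool:
    """Apply the prevalence and age-trend criteria; alpha is an assumed cutoff."""
    return (site.prevalence_by_60 > min_prevalence
            and site.age_slope > 0
            and site.age_p_adjusted < alpha)


def select_hotspots(sites: List[SiteSummary]) -> List[int]:
    """Return the positions that satisfy both hotspot criteria."""
    return [s.position for s in sites if is_hotspot(s)]


# Invented summary values for illustration only.
sites = [
    SiteSummary(position=100, prevalence_by_60=0.35, age_slope=0.002, age_p_adjusted=1e-8),
    SiteSummary(position=200, prevalence_by_60=0.02, age_slope=0.001, age_p_adjusted=1e-4),
    SiteSummary(position=300, prevalence_by_60=0.20, age_slope=0.000, age_p_adjusted=0.9),
]
print(select_hotspots(sites))
```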

ROS-related damage (C > A and C > G) is also primarily localized to a few hotspot lesions (Fig. 4a). While ROS damage is prevalent in energy-intensive tissues (Fig. 4a), it does not significantly accumulate with age (Fig. 4b), consistent with previous reports in mice27,28,29, suggesting efficient repair or removal mechanisms.

Fig. 4: Tissue-specific patterns and age dynamics of ROS-related mitochondrial damage.

a, ROS damage in all tissues. The relative mitochondrial copy number (reflecting higher energy demand), estimated from RNA data, is displayed to illustrate the correlation between a tissue’s energy demand and its susceptibility to ROS damage. Sample sizes (n, number of individuals) are marked on top of box plots of each tissue. To account for potential bias due to varying mitochondrial copy numbers across tissues, which could affect mutation detection sensitivity, we reanalyzed the data after subsampling all tissues to a uniform sequencing depth (Extended Data Fig. 7). Tissues are color-coded to represent different tissue types, following the standard GTEx conventions. b, Mutation accumulation in representative tissues. Top: the age-dependent increase in non-ROS mutation number for four selected tissues, exemplifying the varying rates and patterns of mutation accumulation and clonal expansion. Bottom: the varying extent of ROS-induced damage in those tissues, which remains relatively constant with age. The rate of mutation accumulation (β, mutations per year) was assessed by linear regression across six representative tissues. For total heteroplasmy in a and b, box plots display the median (center line), interquartile range (IQR, box limits) and whiskers (extending to 1.5× IQR); the notches provide an approximate 95% CI around the median.

In conclusion, tissue-specific mitochondrial aging hallmarks are shaped by the complex interplay between sporadic and hotspot mutations, associated with distinct processes: cellular turnover, which leads to clonal expansion of sporadic mutations in proliferative tissues, and mitochondrial turnover (including relaxed replication), which underlies increased heteroplasmy of hotspot mutations, particularly in high-metabolic, post-mitotic tissues. This duality defines the biphasic nature of the mitochondrial molecular clock.

Widespread accumulation of deleterious somatic mutations

In contrast to germline mitochondrial variants, which presumably undergo strong purifying selection, sporadic somatic mutations are strongly enriched for potentially deleterious effects (Fig. 5a, P < 1 × 10−15, Fisher’s exact test). Deleterious variants are defined as mutations in conserved sites, including loss-of-function (LoF), conserved missense and conserved rRNA sites as predicted by conservation metrics derived from population studies33. These deleterious somatic mutations are rarely observed at the population level (Extended Data Fig. 8) and are found predominantly at low heteroplasmy within individual tissue sites (Fig. 5c).

Fig. 5: Functional impact of age-accumulated somatic mitochondrial mutations.

a, The functional profile of somatic versus germline variants. The bin width is scaled to reflect the typical number of germline and somatic variants (as sampled across 15 body sites as in Fig. 2a) in an individual. b, The age accumulation of somatic mutations by variant type in representative tissues. Variant types are colored as in a. c, The tissue-specific accumulation of deleterious mutations (defined as mutations in conserved sites, including LoF, conserved missense and conserved rRNA sites). This panel demonstrates the preferential accumulation of deleterious mutations in tissues with high cellular turnover, particularly in constantly renewing epithelial compartments. By contrast, high-energy-demanding tissues primarily accumulate mutations at hotspots within the noncoding D-loop region. Age-dependent effects (β) are assessed using linear regression. The error bars represent the 95% CI, which is based on the t-statistic of the corresponding mutation rate β and mean heteroplasmy μ estimates. Tissues in c are color-coded to represent different tissue types, following the standard GTEx conventions. Sample sizes (n, number of individuals) are marked on top of β estimates of each tissue. Detailed statistics for each estimate (slope β, 95% CI, exact P value) are provided in Supplementary Table 2. For total heteroplasmy, box plots display the median (center line), interquartile range (IQR, box limits) and whiskers (extending to 1.5× IQR); the notches provide an approximate 95% CI around the median.

The accumulation of these deleterious mutations is not uniform across tissues. They occur sporadically, particularly in tissues with continuous cellular turnover (Fig. 5b,c), such as the liver (β = 0.098 per year, P = 1 × 10−8), esophageal mucosa (β = 0.029 per year, P = 4.2 × 10−8), and adrenal glands (β = 0.035 per year, P = 9.8 × 10−9). This leads to a higher burden of deleterious mutations in constantly renewing epithelial tissues, potentially contributing to age-related dysfunction.

By contrast, in high-energy-demand tissues such as the heart and kidney, mutations tend to cluster at specific hotspots as previously described, predominantly within the noncoding D-loop region (Fig. 5b,c). This concentration in a noncoding region suggests a potentially less detrimental impact compared with sporadic mutations in renewing tissues. However, the functional consequences of these hotspot mutations on cellular and tissue physiology require further investigation. These findings further highlight the complex interplay of sporadic and hotspot mutations in shaping the landscape of deleterious mitochondrial mutations.

Intriguingly, some predicted LoF variants exhibited unexpectedly high heteroplasmy (>0.5) and population frequency (>0.1%). These included start-codon loss in MT-ND1 (chrM_3308_T_C/G), stop-codon loss in MT-CO1 (chrM_7444_G_A) and another start-codon loss in MT-ATP6 (chrM_8527_A_G). However, further analysis revealed alternative downstream sequences that could potentially compensate for these mutations (Supplementary Fig. 8). In MT-ND1, a second AUG codon downstream could serve as an alternative start site. Similarly, in MT-CO1, the downstream AGG can act as a nonstandard stop codon in mitochondria. In MT-ATP6, GUG has been reported as an alternative start codon43. Thus, these variants might not result in protein function loss, which highlights the unique ability of mitochondria to tolerate certain mutations due to the presence of alternative codons.
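The compensation argument above rests on the vertebrate mitochondrial genetic code, in which several codons besides ATG can initiate translation and AGA and AGG are assigned as stop codons (NCBI translation table 2). The sketch below scans a coding sequence in frame for the next such codon. The function name is hypothetical and the sequences are short invented examples written in DNA sense (T rather than U), not the actual MT-ND1, MT-CO1 or MT-ATP6 sequences.

```python
from typing import Optional, Set

# Initiation and termination codons from NCBI translation table 2
# (vertebrate mitochondrial code), written in DNA sense.
MITO_START_CODONS = {"ATG", "ATA", "ATT", "ATC", "GTG"}
MITO_STOP_CODONS = {"TAA", "TAG", "AGA", "AGG"}


def next_in_frame_codon(seq: str, codons: Set[str], start: int = 0) -> Optional[int]:
    """Return the 0-based index of the first in-frame codon from `codons`, or None."""
    seq = seq.upper()
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in codons:
            return i
    return None


# Invented sequence whose first codon was mutated from ATG to ACG:
# the next in-frame candidate start lies two codons downstream.
mutated_start = "ACGCCCATGAAA"
print(next_in_frame_codon(mutated_start, MITO_START_CODONS))

# Invented sequence lacking a standard stop: AGG serves as an in-frame stop here.
lost_stop = "AAATGGAGGCCC"
print(next_in_frame_codon(lost_stop, MITO_STOP_CODONS))
```

Finding a candidate codon in frame only shows that compensation is possible in principle; whether the downstream codon is actually used would need experimental support.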

Environment and sex affect mitochondrial mutation patterns

Beyond age and tissue, environmental factors and biological sex significantly influence mitochondrial mutational burden and clonal expansion. Our analysis reveals a striking interaction between mitochondrial haplogroup and sun exposure in shaping the mutational landscape of skin. Individuals with non-L haplogroups, typically associated with non-African ancestry and lighter skin pigmentation, exhibit a substantially higher mutation burden in sun-exposed skin compared with those with the L haplogroup (βnon-L: 0.09 per year > βL: 0.02 per year, P = 0.01; Fig. 6a). This disparity probably arises from increased susceptibility to UV-induced damage in individuals with lighter skin, rather than a direct genetic effect of the haplogroup itself. Notably, the age-dependent accumulation of mutations is consistently high in sun-exposed skin across all non-L haplogroups, but not in non-sun-exposed skin (Fig. 6c). Interestingly, while nuclear genomes show a strong UV mutational signature (SBS7a/b) in sun-exposed skin, particularly in non-L haplogroups (Extended Data Fig. 9), mitochondria exhibit distinct mutational spectra across tissues. This pattern is dominated by C > T and T > C transitions, with no typical tobacco (SBS4) or UV (SBS7a/b) mutational signatures even in lung and skin cancer (Extended Data Fig. 9), as previously reported by the TCGA study31. Considering the potential impact of increased cell turnover and clonal expansion35,44 in sun-exposed skin, which could increase the abundance of mutations arising from normal mitochondrial mutagenesis, further research is needed to definitively establish the mechanisms involved in the link between UV exposure and mitochondrial mutation burden in individuals with lighter skin pigmentation.

Fig. 6: Environmental and sex-specific influences on mitochondrial mutational burden.

a, The differential accumulation of mutations in sun-exposed skin by mitochondrial haplogroup (L versus non-L). The fitted regression line shows the rate of mutation accumulation (slope β) in sun-exposed skin between individuals with the L haplogroup (typically associated with African ancestry and higher melanin content) and those of the non-L haplogroup. Shaded areas around the regression lines represent 95% CI of the mutation rate β. b, The mutational burden in mammary and reproductive tissues by sex. The lines show linear regression fits; the shaded areas represent the 95% CI for slope. A comparison was performed via interaction terms (sex and tissue) in the linear regression model. Detailed regression statistics are provided in Supplementary Table 2. c, Mutation rates (β, mutations gained per 10 years, derived from linear regression) of sun-exposed and non-sun-exposed skin across haplogroups, annotated with self-reported ethnicity makeup. The error bars represent the 95% CI of the mutation rate β. Sample sizes (n, number of individuals) are marked on top of β estimates of each group. d, The dosage sensitivity of mitochondrial versus nuclear genetic information processing machinery. LOEUF measures tolerance to LoF mutations, while allelic fold change (aFC) measures tolerance to expression altering cis-regulatory variants, with larger values indicating higher tolerance to dosage change.

Sex also plays a role in shaping the mitochondrial mutational landscape (Supplementary Fig. 9). We observe a substantially higher mutation burden in the breast tissue of females compared with males (female > male, P = 1 × 10⁻⁷; Fig. 6b), a difference probably established early in life, possibly during adolescence, as it plateaus after the age of 20. In addition, distinct mutational patterns are evident in male and female reproductive organs, with a notable age-dependent accumulation in the testis but not in the ovary (βtestis: 0.14 per year > βovary: 0.0 per year, P = 1 × 10⁻⁸; Fig. 6b). Sex differences in nondimorphic tissues are generally less pronounced than those in sexually dimorphic tissues; minor differences in mutational burdens (P < 0.05, not significant after multiple testing correction; Supplementary Fig. 9) were also observed in heart (male > female), adipose and skin (female > male) tissue, a trend similar to previous findings in nuclear genomes25.

Notably, mitochondrial genetic information processing machinery (replication, transcription and translation) demonstrates remarkable resilience to dosage alterations, contrasting sharply with the sensitivity of its nuclear counterparts. This resilience is indicated by the tolerance of LoF mutations, the predominantly recessive nature of mitochondrial diseases and the presence of large-effect expression quantitative trait loci (eQTLs) at nuclear genes coding those mitochondrial machinery components (Fig. 6d). These findings suggest that mitochondrial quality control mechanisms are robust and less susceptible to dosage-related disruptions. Nevertheless, while dosage may be well buffered, the functional consequences of the specific, age-accumulated mutations we observe across different tissues, and their potential role in age-related decline, warrant further study.

Overall, our findings emphasize the resilience of mitochondrial genetic information processing machinery to genetic variations and gene dosage changes, underscoring the unique robustness of mitochondria compared with nuclear systems. However, environmental factors such as UV exposure and biological sex significantly influence the accumulation and clonal expansion of mitochondrial mutations, highlighting their importance in shaping the tissue-specific landscape over age.

Discussion

Our findings support a biphasic model of the mitochondrial molecular clock that governs the accumulation of mutations across human tissues, with distinct mechanisms operating in proliferative and post-mitotic cells. Mitochondria, as ancient endosymbionts within eukaryotic cells, possess their own genomes and exhibit a high degree of clonal diversity due to their large copy numbers and constant turnover. This diversity, shaped by replication errors and ROS damage, has been implicated in aging and disease. Our systematic examination of mitochondrial RNA from 838 individuals reveals that mitochondria indeed function as a mutational clock, recording and potentially driving the aging process across various tissues.

A key finding is the steady accumulation of clonal diversity with age across the body, albeit at varying rates and under distinct driving mechanisms. This observation aligns with previous studies in mice and macaques, which have consistently demonstrated that mitochondrial mutations accumulate with age in various tissues27,29,30. This age-related accumulation is driven primarily by deamination-associated replication errors rather than ROS damage, suggesting that the fidelity of DNA replication may be a critical factor in the aging process28. The pace of clonal expansion mirrors the rate of cellular turnover, particularly evident in rapidly regenerating epithelial tissues such as the gastrointestinal tract and skin, where mutant clones can constitute up to 15% of the mitochondrial population by age 60. This rapid accumulation of genetic errors and subsequent clonal expansions underscore the vulnerability of these tissues to age-related dysfunction and potentially tumorigenesis. This observation is consistent with previous studies in cancer genomes, which have highlighted an age-dependent accumulation of mitochondrial mutations in tumor samples31.

By contrast, nondividing tissues such as the heart and brain exhibit age-accumulated mutations primarily at specific hotspots. These hotspots, probably a consequence of high mitochondrial turnover in energy-intensive tissues8,29, represent a unique aging hallmark. Interestingly, while these tissues also show elevated ROS damage at specific lesions, it does not accumulate with age, which aligns with previous findings in mice28,45. This suggests a robust mitochondrial machinery capable of clearing ROS-induced damage.

Environmental factors and biological sex significantly influence mitochondrial mutational dynamics. Sun exposure increases the mutation burden in lightly pigmented skin (non-L haplogroup), highlighting an interaction between environment and genetics. This elevated mutational burden aligns with findings on nuclear genomes, where UV exposure induces characteristic mutational signatures25,26. However, as mitochondria exhibit distinct mutational spectra dominated by C > T and T > C transitions, with no typical UV-associated signature, it is unclear whether UV radiation directly causes these mitochondrial mutations or indirectly increases the burden through elevated cellular turnover and clonal expansion, which warrants further investigation. Distinct sex-specific patterns are also evident in breast and reproductive tissues, emphasizing the importance of considering sex in understanding tissue-specific aging.

Our findings establish the mitochondrial genome as an informative mutational clock, intricately recording the diverse stressors encountered by different tissues throughout an individual’s lifespan. Previous studies have estimated the average mitochondrial mutation rate to be around 5 × 10⁻⁸ per base pair per replication35. Our work further demonstrates that the accumulation of mutations with age varies significantly across tissues. This variation is driven by factors such as cellular mitotic activity, the prevalence of mutational hotspots and the inherent replication dynamics of mitochondria independent of cell division6, creating distinct mutational landscapes in different tissues over time. The biphasic nature of this clock, with distinct mechanisms operating in dividing versus nondividing tissues, offers unique insight for understanding the multifaceted process of aging, highlighting varying degrees of vulnerability to age-related pathologies. The distinct mutational signatures observed in proliferative and energy-intensive tissues underscore the importance of considering both replicative stress and metabolic demand in understanding the etiology of age-related diseases. Moreover, the identification of age-accelerated mutational hotspots opens avenues for early detection and potential intervention strategies. Elucidating the precise etiology and functional consequences of these accumulated mutations, and their contribution to cellular and tissue dysfunction, remains a critical next step, ideally using single-cell multi-omics approaches8,46 to link specific mitochondrial genotypes to cellular phenotypes and aging. By harnessing the power of the mitochondrial clock, we can potentially develop personalized approaches to diagnose, monitor and mitigate age-related decline and dysfunction, ultimately promoting healthier aging and longevity.

Methods

Ethical compliance

The analysis of publicly available data from the GTEx project (accessed via dbGaP accession phs000424.v8.p2), ENCODE project and TCGA project was conducted in accordance with their respective data access policies and the ethical guidelines under which the data were originally collected, including appropriate institutional review board approvals and participant consent obtained by the respective consortia. The analysis of human testicular tissue samples was approved by the Ethics Committee of Shanghai Changhai Hospital (approval no. CHEC2021-086).

Study samples and datasets

Our analysis primarily utilized mitochondrial RNA-seq data from the GTEx project v8 dataset (dbGaP accession phs000424.v8.p2). Tissues with fewer than 70 individuals, as well as cultured fibroblasts and Epstein–Barr virus-transfected lymphocytes, were excluded. Our final dataset encompassed 11,802 samples (after quality control) from 838 individuals, spanning 47 distinct tissues. To validate and compare our findings, we incorporated data from the ENCODE project (https://www.encodeproject.org/) and TCGA project (https://portal.gdc.cancer.gov/) to assess the presence and accumulation of mitochondrial mutation hotspots in an independent cohort.

Statistics and reproducibility

This cross-sectional study analyzed public RNA-seq data from 47 tissues (838 individuals, aged 20–70) from the GTEx v8 cohort (dbGaP: phs000424.v8.p2). No statistical method was used to predetermine sample size; sample sizes were based on data availability for public cohorts (GTEx) and are comparable to those reported in previous studies25,36. Findings were replicated using ENCODE and TCGA data and were validated with targeted mtDNA sequencing from eight human testicular samples (Ethics Committee of Shanghai Changhai Hospital, CHEC2021-086). Samples and variants were excluded on the basis of quality control filters, including checks for contamination, sequencing coverage, error-prone sites and variant call quality. Randomization and blinding were not applied to this observational and computational study design. Statistical analyses, performed using R, primarily involved linear regression to model age effects (assessed via t-statistic for slope β, 95% confidence intervals reported) and Fisher’s exact test for proportions. For linear regression analyses, data distribution was assumed to be normal, but this was not formally tested; data distribution is shown graphically in relevant figures. Significance was generally defined as P < 0.05, with multiple testing correction (Bonferroni or FDR) applied where appropriate. All statistical tests were two-sided, unless noted otherwise.

Mitochondrial variant calling and annotation

Mitochondrial variant calling and annotation were conducted using RNA-seq data from the GTEx project, following a workflow similar to the mtDNA variant calling pipeline utilized in the Genome Aggregation Database (gnomAD)33. Raw sequencing reads were initially aligned to the human reference genome GRCh38 with STAR v2.5.3a, in accordance with the GTEx Consortium pipeline. Reads mapping to the mitochondrial genome (chrM) were extracted and realigned to chrM using the Burrows–Wheeler Aligner47, retaining only uniquely mapped reads to mitigate the impact of nuclear sequences of mitochondrial origin (NUMTs). After marking duplicates, the Mutect248 mitochondrial mode was applied for variant calling, requiring a minimum base quality of 18. Because mtDNA is circular whereas BWA-MEM supports mapping only to a linear genome, read coverage is low and variant calling is less accurate in the D-loop region, which spans the ends of the linear reference. We therefore implemented a two-step mapping and calling process. First, reads were directly mapped to the rCRS mtDNA sequence49 to identify variants in chrM:576–16024. Second, the mtDNA sequence was split at the 8,000th base to create a new linear genome, and reads were remapped to this modified reference to identify variants in chrM:1–575 and chrM:16025–16569. These two sets of variants were then combined and subjected to flagging and filtering using GATK FilterMutectCalls.
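
The two-step strategy depends on translating positions called on the rearranged reference back to rCRS coordinates before the two call sets are combined. The following is a minimal sketch of that bookkeeping, not the authors' pipeline code. It assumes that the rearranged reference begins at original position 8,001 (one reading of "split at the 8,000th base"); the function names and the `(position, ref, alt)` tuple layout are hypothetical.

```python
MT_LENGTH = 16569  # length of the rCRS mitochondrial reference
SHIFT = 8000       # assumption: rearranged reference starts at original position 8,001


def shifted_to_original(pos_shifted, shift=SHIFT, length=MT_LENGTH):
    """Convert a 1-based position on the rearranged reference to rCRS coordinates."""
    if not 1 <= pos_shifted <= length:
        raise ValueError("position outside the mitochondrial genome")
    return (pos_shifted - 1 + shift) % length + 1


def in_control_region(pos):
    """True for chrM:1-575 and chrM:16025-16569, the intervals called on the rearranged reference."""
    return pos <= 575 or pos >= 16025


def merge_calls(direct_calls, shifted_calls):
    """Combine both call sets; each call is a hypothetical (position, ref, alt) tuple.

    Direct calls are kept for chrM:576-16024; calls from the rearranged
    reference are converted to rCRS coordinates and kept elsewhere.
    """
    kept = [c for c in direct_calls if not in_control_region(c[0])]
    for pos_shifted, ref, alt in shifted_calls:
        pos = shifted_to_original(pos_shifted)
        if in_control_region(pos):
            kept.append((pos, ref, alt))
    return sorted(kept)
```

With this convention the junction of the original linear reference (positions 16,569 and 1) sits near the middle of the rearranged reference, which is why coverage there is no longer depressed by reads that span the ends.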

Population frequencies of variants were obtained from the gnomAD database33. Variants were annotated utilizing Ensembl’s Variant Effect Predictor (VEP) v10150. Variants classified as stop_gained, start_lost and stop_lost were designated as LoF. Local sequence constraint metrics (MLC) were derived from gnomAD, considering scores higher than 0.7 as conserved. Moreover, pathogenic variants were annotated according to the MITOMAP51 database. Haplogroups were assigned using Haplocheck52.
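
The annotation rules in this paragraph reduce to two simple predicates. The sketch below illustrates them under the assumption that consequence terms arrive as a single string joined by `&`, as in VEP's VCF output; the function names are hypothetical.

```python
# Consequence terms designated as loss of function (LoF) in the text.
LOF_CONSEQUENCES = {"stop_gained", "start_lost", "stop_lost"}


def is_lof(consequence_field):
    """True if any '&'-separated consequence term is one of the LoF classes."""
    return any(term in LOF_CONSEQUENCES for term in consequence_field.split("&"))


def is_conserved(mlc_score, threshold=0.7):
    """True if the local constraint (MLC) score is higher than the 0.7 cutoff."""
    return mlc_score > threshold
```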

Removal of contaminated samples

To minimize the risk of erroneous variant calls due to potential cross-contamination between samples, we implemented a multistep filtering strategy. Initially, samples flagged as potentially contaminated by Haplocheck were removed. In addition, we used a negative binomial regression model to identify and remove outliers in mutation counts, accounting for covariates such as tissue, age, sex, haplogroup, ischemic time and cohort as follows:

$$\mathrm{Mutation\ count} \sim \mathrm{Tissue} + \mathrm{Age} + \mathrm{Sex} + \mathrm{Haplogroup} + \mathrm{Ischemic\ time} + \mathrm{Cohort}.$$

Samples with residuals exceeding the population norm (P < 0.001) were considered potentially contaminated and excluded. Finally, to ensure sufficient sequencing depth for accurate variant detection, samples with inadequate coverage were removed. While most samples had adequate RNA coverage across protein-coding, rRNA and D-loop regions, those with less than 200× coverage at over 10% of the 16,569 mitochondrial genome positions were excluded from further analysis.
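
The coverage criterion can be written as a short check over per-position depths. This is an illustrative sketch only; the function name and the list-of-depths input are assumptions, and the thresholds are the ones stated above.

```python
def fails_coverage(depths, min_depth=200, max_low_fraction=0.10):
    """Return True if a sample should be excluded for inadequate coverage.

    `depths` holds one read depth per mitochondrial position (16,569 values
    for a full genome). A sample fails when more than `max_low_fraction` of
    positions have depth below `min_depth`.
    """
    low = sum(1 for d in depths if d < min_depth)
    return low / len(depths) > max_low_fraction
```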

Removal of error-prone sites

We applied multiple filters to eliminate sites vulnerable to technical artifacts: (1) exclusion of blacklisted regions, encompassing positions 66–71, 300–316, 513–526, 566–573, 3,106–3,107, 12,418–12,425 and 16,181–16,194 (refs. 32,33,53) (these regions harbor homopolymer tracts that may induce polymerase slippage during polymerase chain reaction, leading to sequencing errors); (2) exclusion of known RNA modification sites39,40 (Supplementary Fig. 2), including positions 295, 2,617 and 13,710; (3) exclusion of sites that may be affected by NUMTs38 (these sites were identified by mapping NUMT sequences to the mitochondrial genome and subsequently removed37); and (4) exclusion of all tRNA regions and gene boundary regions that are inadequately covered by polyA RNA libraries. To account for the low coverage observed at gene boundaries in RNA-seq, we trimmed bases from the ends of gene regions where coverage fell below 200×.
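
Filters (1) and (2) amount to a position lookup, sketched below. The interval list is copied from the text; the RNA modification set contains only the three positions named explicitly, so it is not the complete list used in the study. The NUMT, tRNA and gene-boundary filters are omitted because their coordinates are not given here. The function name is hypothetical.

```python
# Blacklisted homopolymer regions listed in the text (1-based, inclusive).
BLACKLISTED_REGIONS = [
    (66, 71), (300, 316), (513, 526), (566, 573),
    (3106, 3107), (12418, 12425), (16181, 16194),
]

# Only the RNA modification sites named explicitly in the text;
# the full set used in the study is not reproduced here.
RNA_MODIFICATION_SITES = {295, 2617, 13710}


def is_error_prone(pos):
    """True if a chrM position falls in a blacklisted region or a listed modification site."""
    if pos in RNA_MODIFICATION_SITES:
        return True
    return any(start <= pos <= end for start, end in BLACKLISTED_REGIONS)
```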

Quality control of mitochondrial variant calling

To enhance the reliability of our variant calls, we implemented several filtering steps. First, we excluded all indels, multiallelic sites and variants located within 3 bp of read ends. Moreover, variant calls exhibiting strand bias were filtered out. Strand bias was assessed using Fisher’s exact test, comparing allele-specific forward and reverse read counts. If the lower bound of confidence interval of odds ratio exceeded 2, the variant was flagged for strand bias.
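
To make the strand-bias rule concrete, the sketch below computes an odds ratio and an approximate 95% confidence interval from the four allele-by-strand read counts. It departs from the text in two labeled ways: the exact interval that accompanies Fisher's test is replaced by the simpler Woolf (logit) approximation with a 0.5 continuity correction, and bias toward either strand is flagged by also testing the reciprocal odds ratio. Both choices, and the function names, are assumptions made for illustration.

```python
import math


def odds_ratio_ci(ref_fwd, ref_rev, alt_fwd, alt_rev, z=1.96):
    """Approximate 95% CI of the allele-by-strand odds ratio (Woolf method).

    Adds 0.5 to every cell (Haldane-Anscombe correction) so that zero
    counts do not produce infinite values.
    """
    a, b, c, d = (x + 0.5 for x in (alt_fwd, alt_rev, ref_fwd, ref_rev))
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


def has_strand_bias(ref_fwd, ref_rev, alt_fwd, alt_rev, threshold=2.0):
    """Flag a variant when the CI lies entirely beyond the threshold in either direction."""
    lower, upper = odds_ratio_ci(ref_fwd, ref_rev, alt_fwd, alt_rev)
    return lower > threshold or upper < 1 / threshold
```

For example, 40 alternative-allele reads that all come from the forward strand, against a balanced reference allele, are flagged, whereas 20 forward and 20 reverse reads are not.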

In addition, to further differentiate variants from background sequencing errors, we adopted the same statistical approach used in previous studies24. We estimated an empirical error rate by counting the frequency of alternative alleles at all sites using BCFtools mpileup. Low-frequency alternative alleles (<1%), primarily attributed to sequencing errors, were modeled using a beta-binomial distribution. This analysis revealed a background error rate of approximately 5 × 10⁻⁴ in the GTEx data (Supplementary Fig. 1), consistent with estimates by previous GTEx studies25. Variants output by Mutect2 were then further filtered on the basis of their deviation from this distribution. To minimize false-positive calls, we applied an allele fraction cutoff of 0.005. Only variants with an allele fraction >0.005, representing a significant deviation from the sequencing error rate of 5 × 10⁻⁴ (P < 1 × 10⁻⁷, FDR <0.05 after Benjamini–Hochberg multiple testing correction), were considered reliable calls.
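
The logic of testing a variant against the background error rate can be sketched with a one-sided binomial test. This is a simplification: the study models errors with a beta-binomial distribution, whereas the code below assumes a plain binomial with a fixed error rate of 5 × 10⁻⁴, and combining the P-value threshold with the allele fraction cutoff in one function is an illustrative choice. Function names are hypothetical.

```python
import math


def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), with each term evaluated in log space."""
    if k <= 0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * log_p + (n - i) * log_q
        )
        total += math.exp(log_term)
    return min(total, 1.0)


def is_reliable_call(alt_reads, depth, error_rate=5e-4, min_af=0.005, max_p=1e-7):
    """Keep a variant if its allele fraction exceeds the cutoff and the
    alternative read count is very unlikely under the error rate alone."""
    allele_fraction = alt_reads / depth
    p_value = binom_upper_tail(alt_reads, depth, error_rate)
    return allele_fraction > min_af and p_value < max_p
```

At a depth of 5,000×, for instance, 30 alternative reads (allele fraction 0.006) pass both criteria, while 10 alternative reads (allele fraction 0.002) fall below the allele fraction cutoff.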

Assessing factors influencing mitochondrial clonal dynamics

To determine the factors contributing to clonal dynamics, we used a linear regression model with mutation count or heteroplasmy level as the response variable. The model included covariates such as tissue type, age, sex, macro-haplogroup, tissue ischemic time and cohort group (postmortem versus organ donor). The significance of the interaction between each covariate (sex, tissue type, environmental exposure and so on) and age on the rate of mutation accumulation was evaluated using the interaction term in the linear regression model. Reported confidence intervals are based on the t-statistic of the corresponding mutation rate β and mean heteroplasmy μ estimates.
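
The role of the interaction term can be illustrated with a two-group example. In a model with group-specific intercepts and slopes, the age-by-group interaction coefficient equals the difference between the slopes fitted separately in each group, which is what the sketch below computes. It omits the other covariates listed above and uses separate rather than pooled residual variances for the standard error, so it is an approximation of the full model; the function names are hypothetical.

```python
import math


def fit_slope(ages, counts):
    """Ordinary least-squares slope of counts on age, with its standard error."""
    n = len(ages)
    mean_x = sum(ages) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in ages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(ages, counts))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se


def slope_difference(ages_a, counts_a, ages_b, counts_b):
    """Difference in mutation accumulation rate between two groups and its standard error."""
    slope_a, se_a = fit_slope(ages_a, counts_a)
    slope_b, se_b = fit_slope(ages_b, counts_b)
    return slope_a - slope_b, math.sqrt(se_a ** 2 + se_b ** 2)
```

Dividing the difference by its standard error gives a test statistic for whether the two groups accumulate mutations at different rates.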

Recognizing the substantial variation in read coverage across tissues, we repeated the analysis by subsampling the coverage to a uniform depth of 1,000×. This step ensured that observed differences in mutation counts between tissues were not simply artifacts of varying sequencing depths.
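
One way to emulate a uniform depth is to draw reads without replacement at each site. The sketch below does this on reference and alternative read counts; whether the study subsampled at the level of reads or of per-site counts is not stated, so this is an assumption, as are the function name and its arguments.

```python
import random


def downsample_counts(ref_reads, alt_reads, target_depth=1000, rng=None):
    """Subsample a site's reads without replacement down to `target_depth`.

    Returns (ref_kept, alt_kept). Sites already at or below the target
    depth are returned unchanged.
    """
    rng = rng or random.Random()
    depth = ref_reads + alt_reads
    if depth <= target_depth:
        return ref_reads, alt_reads
    # Label reads 0..depth-1, with the first `alt_reads` labels as alternative reads.
    picks = rng.sample(range(depth), target_depth)
    alt_kept = sum(1 for i in picks if i < alt_reads)
    return target_depth - alt_kept, alt_kept
```

Passing a seeded generator, for example `random.Random(0)`, makes the subsampling reproducible.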

We assessed mutational signatures and their proposed etiologies using the SigProfilerAssignment54 tool to identify matches to known COSMIC signatures55.

Cell type enrichment assessment among bulk tissue samples

To investigate the association between cell type composition and mitochondrial mutation burden, we used xCell56, a method used by the GTEx consortium for cell-type-specific eQTL analysis, to assess cell type enrichment across tissues (excluding the brain). xCell enables enrichment analysis for 64 immune and stromal cell types based on gene expression data. We focused on the ten most abundant cell types, including endothelial cells, epithelial cells, fibroblasts, hepatocytes, keratinocytes, mesenchymal stem cells, myocytes, sebocytes, adipocytes and neutrophils.

For brain tissues, we utilized CIBERSORT57, a deconvolution algorithm using support vector regression, to achieve a more granular breakdown of specific cell types. CIBERSORT, using customizable reference gene expression signatures, allowed us to infer the composition of astrocytes, endothelia, microglia, neurons and oligodendrocytes within the original expression profile of the brain tissues.

Dosage constraint in genetic information processing genes

We evaluated gene dosage constraint using the LoF observed/expected upper-bound fraction (LOEUF) score58. For a tissue-specific assessment, we utilized our recently developed metric, MoDs (tissue-specific dosage constraint)59, which quantifies the sensitivity of gene expression to gene dosage changes based on the presence of large-effect cis-regulatory variants. Genes involved in genetic information processing were curated from the Kyoto Encyclopedia of Genes and Genomes (KEGG)60 and MitoCarta61 databases. Specifically, mitochondrial genetic information processing genes were selected on the basis of their classification in MitoCarta (‘mtDNA maintenance’, ‘mtRNA metabolism’ and ‘Translation’) and KEGG (‘Mitochondrial transcription and translation factors’ and ‘Mitochondrial DNA replication factors’). Nuclear genetic information processing genes were identified on the basis of their categorization in KEGG (‘DNA replication proteins’, ‘DNA repair and recombination proteins’, ‘Translation factors’, ‘Messenger RNA biogenesis’ and ‘Transcription machinery’). Inheritance modes associated with genes linked to Mendelian diseases were determined using the Online Mendelian Inheritance in Man (OMIM) database.

Validation of mitochondrial mutation hotspots in testicular tissue samples

To independently validate our findings from the analysis of bulk RNA-seq data, we performed targeted mtDNA sequencing on testicular tissue samples obtained from eight individuals. These tissues were collected with ethical approval (approval number CHEC2021-086) from patients undergoing orchiectomy for various medical conditions at Shanghai Changhai Hospital between 2021 and 2022. Each patient provided informed consent for the use of their discarded testicular tissue. Genomic DNA was extracted from the tissues using the QIAamp DNA Micro Kit (Qiagen). The complete mitochondrial genome was enriched using the xGen Human mtDNA Hyb Panel (Integrated DNA Technologies), which utilizes biotinylated probes to capture the entire mitochondrial genome. Captured DNA was then prepared for sequencing using the KAPA Hyper Prep Kit (Roche), following the manufacturer’s protocol. Sequencing was performed on the GeneMind SURFSeq 5000 platform, generating high-coverage data (>10,000×) for accurate variant calling. Samples were assigned deidentified study IDs (format: TST-R-NN; Supplementary Table 3). The original sequencing data are deposited at the National Omics Data Encyclopedia (NODE) at https://www.biosino.org/node, under the accession number OEP00006064.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.