Introduction

Physiological aging can be defined as the cumulative effect of time on the organism1. It is characterized by an imbalance between individual functional reserves and daily stressors, leading to a progressive loss of physiological integrity, impaired function and increased vulnerability to disease and death2. In an increasingly aging global population, the prevalence of age-related illnesses continues to escalate3. Among these, periodontal diseases affect a significant proportion of adults aged 50 and older4,5,6.

Contemporary research faces the challenge of deciphering the cellular and tissue mechanisms underlying preclinical and clinical changes associated with age-related health impairments. Advancing this understanding will enable the identification of aging trajectories through an integrated approach that combines biological and physiological markers (biological/physiological age)7,8,9. Such insights will facilitate the development of preventive and early intervention strategies aimed at extending health span and promoting healthy aging.

Recently, it has been proposed that healthy functions are achieved and maintained by an optimized relationship between three transversal multiscale elements: the Structure/supportive compartment (S), the source and driver of tissue architecture and repair; the Immune/inflammatory system (I), which apprehends and defends the organism’s integrity against injuries; and Metabolism (M), which provides energy for the healthy functioning of the organism1. Thus, the S – I – M paradigm is a pathophysiological key to identifying or predicting age-related tissue changes.

As an interface organ, the oral cavity serves as a central element between the external environment and the entire organism10. Moreover, the mouth’s unique resilience - balancing structure, immune function, and metabolism - makes it not only a critical target for early disease detection but also a potential indicator of whole-body aging trajectories. In the oral cavity, the gingiva is continuously exposed to mechanical, biological and chemical stresses11, and undergoes constant remodeling to adapt to these cumulative challenges over time. This dynamic nature makes it an ideal site for detecting early alterations in the structural, immune, and metabolic (SIM) triad and for investigating age-related tissue modifications. However, the role of aging as a major risk factor in periodontal homeostasis and its associated SIM changes remain poorly investigated12,13. Understanding these transformations could be pivotal in defining gingival biological age as a novel biomarker for aging assessment. Investigating gingival tissues could thus provide insights not only into oral health and its potential variations but also into an individual’s overall health status, reinforcing the concept of the mouth as a mirror of systemic aging10.

Through a comprehensive bulk RNA-Seq transcriptomic analysis, this study reveals age-related alterations in the gingival SIM transcriptomic landscape, with a particular impact on the extracellular matrix. This analysis enables the definition of a minimal signature for calculating gingival biological age.

Results

Acquisition and analysis of human gingival transcriptome data

Based on the defined keywords (gingival OR gingivitis OR gingiva on GEO DataSets) and following manual screening, 57 datasets corresponding to 57 individuals aged 16 to 78 years were identified (Fig. 1).

Fig. 1

Flow diagram of the dataset selection process. This flowchart illustrates the steps involved in selecting bulk RNA sequencing datasets for the study, starting with the total number of datasets retrieved from GEO DataSets (https://www.ncbi.nlm.nih.gov/gds) based on the search criteria (gingival OR gingivitis OR gingiva) and ending with the final number of datasets included in the analysis.

Initial data aggregation revealed a noticeable batch effect (Fig. 2A), likely due to heterogeneous sampling and/or sequencing procedures across the datasets. After removing the batch effect, Principal Component Analysis (PCA) conducted on the first two dimensions did not show any obvious segregation based on age or periodontal health status (Fig. 2B–D). However, a clear gender effect was observed (Fig. 2E).
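The text does not name the batch-correction method that was applied. As a purely illustrative sketch, assuming log-scale expression values and a naive per-batch mean-centering (a far simpler stand-in for dedicated batch-correction tools, and not necessarily what was used here), removing a batch offset for a single gene could look like the following. The function name and the toy values are hypothetical.

```python
from collections import defaultdict


def center_by_batch(values, batches):
    """Subtract each batch's mean from its samples (one gene, log-scale values)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, batch in zip(values, batches):
        sums[batch] += value
        counts[batch] += 1
    means = {batch: sums[batch] / counts[batch] for batch in sums}
    return [value - means[batch] for value, batch in zip(values, batches)]


# Toy log-expression values for one gene in two hypothetical batches.
log_expr = [5.0, 5.4, 5.2, 8.1, 7.9, 8.3]
batch_ids = [0, 0, 0, 1, 1, 1]
corrected = center_by_batch(log_expr, batch_ids)
```

Mean-centering also removes any true biological difference that happens to be confounded with batch, which is why methods that model covariates such as age, sex and periodontal status alongside batch are generally preferred in practice.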

Fig. 2

Batch effect correction. (A) Principal Component Analysis (PCA) of the first two dimensions, conducted after data aggregation, revealed a distinct batch effect, likely due to variability in sequencing protocols across the datasets. Following batch effect correction, no segregation was observed in the first two PCA dimensions, either between datasets from the same study (B), or based on age (C) and periodontal status (D). However, a gender effect was detected (E). Batch assignments correspond to the following GEO datasets: Batch 0: GSE111523; Batch 1: GSE83382; Batch 2: GSE173078; Batch 3: GSE133603; Batch 4: GSE224044.

Aging was associated with a marked transcriptomic shift in gingival tissue, characterized by a predominant downregulation of genes, particularly those related to extracellular matrix organization, and a smaller set of upregulated genes enriched for cholesterol metabolism.

To assess the potential effect of age within the local gingival context, a differential gene expression analysis was performed across all datasets, without excluding any genes, adjusting for periodontal profile and gender. This analysis revealed that 698 genes were significantly differentially expressed with age (Fig. 3A) at the p-value < 0.01 level, with 171 genes upregulated and 527 genes downregulated in older compared to younger gingival tissue samples (Fig. 3B), indicating a predominant decline in gingival gene expression with age. The normalized counts for the downregulated genes were lower than those for the upregulated genes, suggesting that downregulated genes had a lower basal expression than upregulated ones (Fig. 3C). No genes reached significance for the periodontal status × age or sex × age interaction terms (FDR < 0.01); subsequent analyses were conducted using reduced models without interactions to report the main effects of age adjusted for batch, periodontal status, and sex. The periodontal status × age interaction term identifies transcripts whose age-related expression trajectories differ by periodontal status, delineating molecular signatures of an age-dependent periodontal effect. Only nine genes met the interaction significance threshold, indicating that the interaction signal is present but limited in scope. These included CHIT1, IGLV7-46, IGKV2D-40, SPP1, CXorf40A, and IGLV4-60, which are associated with the immune (‘I’) function, and MMP9, associated with the structural (‘S’) function. All of these genes showed decreased expression with age in the periodontitis group, whereas their expression slightly increased with age in the healthy group (Supplementary Fig. 1).
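The thresholds above are expressed in terms of adjusted p-values and FDR, but the adjustment procedure is not stated. As a minimal sketch, assuming the widely used Benjamini-Hochberg procedure (an assumption, not a detail from the text), adjusted p-values can be computed from raw per-gene p-values as follows. The function name and the toy p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda idx: pvalues[idx])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy raw p-values for four hypothetical genes.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adj_p]
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum ensures that a gene with a smaller raw p-value never ends up with a larger adjusted one.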

Fig. 3

Analysis of Age-Related Gene Expression Changes. (A) Kernel density plot showing the distribution of gene expression p-values. (B) Visualization of gene expression variation over time using a volcano plot. All genes whose expression changes with age (n = 698) were plotted as individual dots on a graph of log2 fold change (log2FC) versus negative log10 adjusted p-value. (C) Genes whose expression significantly changed with age were represented as individual points on a graph of log2FC with age versus normalized counts. (D) Principal Component Analysis (PCA) performed on the normalized counts of genes significantly differentially expressed with age revealed that two dimensions accounted for 70% of the variance with age (component 1: 55%, component 2: 15%). (E) Venn diagram representing the distribution of genes into the S, I, M and other categories. (F) The S (n = 98), I (n = 72), and M (n = 86) gene sets made a major contribution to explaining the variance with age in the dataset, compared to the “not SIM” set (n = 477).

PCA performed on the normalized counts of genes significantly differentially expressed with age revealed that two dimensions accounted for 70% of the variance (component 1: 55%, component 2: 15%, Fig. 3D).

Genes involved in structure, immunity/inflammation, and metabolism play a key role in gingival aging

The 698 significantly differentially expressed genes were then classified into four categories according to “SIM” functions, the SIM categorization being applied post hoc as an interpretive framework. “S” refers to structural genes, which are involved in maintaining both the external structure (through extracellular matrix (ECM) management) and the internal structure related to the external one. “I” refers to immunity-related genes, responsible for detecting warning signals and emitting either pro-resolutive or pro-inflammatory signals. “M” encompasses genes involved in cell metabolism. A single gene could be assigned to more than one SIM category (Venn diagram, Fig. 3E). Genes that did not meet these criteria (e.g., transcription factors or proliferation-related genes) were assigned to a separate category (“not SIM”) comprising four subgroups: transcription factors, migration-related, proliferation-related and “unclassified” genes. Overall, 98 (14.0%), 72 (10.3%), 86 (12.3%) and 477 (68.3%) genes were defined as “S”, “I”, “M”, and “not SIM”, respectively. Within the not SIM group, 91 genes were annotated as proliferation-related, 89 as migration-related, and 60 as transcription-related, while 196 genes remained unclassified. Furthermore, 6 genes were identified as “S and I”, 5 as “I and M”, 4 as “S and M”, and 10 as “S, I and M” (Fig. 3E). The top 20 age-associated genes by SIM functional category are shown in Supplementary Table 2. Loadings for each gene were extracted for components 1 and 2 of the PCA (Fig. 3D), representing the contribution of each gene to the variance of the genes significantly differentially expressed with age. Strikingly, the S, I, and M groups contributed an average of 40% of the variance explained for component 1, while the “not SIM” group explained only 5% (Fig. 3F and Supplementary Fig. 2).
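The category counts can be cross-checked with inclusion-exclusion. The short sketch below assumes that the pairwise figures (“S and I”, “I and M”, “S and M”) are Venn regions that exclude the ten genes shared by all three categories; under that assumption the counts reconcile with the 698 differentially expressed genes and with the reported percentages. Variable names are illustrative.

```python
total_deg = 698
s, i, m, not_sim = 98, 72, 86, 477
# Venn regions: genes in exactly two categories, and genes in all three.
only_si, only_im, only_sm, all_three = 6, 5, 4, 10

# Full pairwise intersections include the triple-overlap genes.
si = only_si + all_three
im = only_im + all_three
sm = only_sm + all_three

# Inclusion-exclusion: number of distinct genes carrying at least one SIM label.
sim_union = s + i + m - si - im - sm + all_three

# Share of each category among all differentially expressed genes.
percentages = {
    name: round(100 * count / total_deg, 1)
    for name, count in {"S": s, "I": i, "M": m, "not SIM": not_sim}.items()
}
```

Because genes can carry several labels, the category percentages sum to more than 100%, while the distinct SIM genes plus the “not SIM” genes add up to the full set.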

The expression of genes involved in extracellular matrix (ECM) homeostasis is significantly decreased with age

To deepen the analysis, a qualitative assessment was performed using STRING, which allowed for the classification of transcriptomic data based on their respective protein activities. A functional protein interaction network was generated, focusing on physical and functional interactions. The STRING network revealed that the ‘S’ genes occupy a central position, connected to various other processes (Fig. 4). Specifically, this node was predominantly composed of ECM constituents, including collagens, proteoglycans, and glycoproteins. Figure 4 was not designed to foreground individual gene names, but to provide an overview of connectivity among gene categories, aided by color coding and a clear legend. The aim was to assess the coherence of gene blocks according to their SIM-assigned functions and to identify potential transition zones between them. Such transitions were exemplified by genes including ZEB1, ZEB2, TWIST1, and CXCL12, each showing age-related decreases in expression. These findings were further validated by Gene Ontology Biological Processes analysis, which enabled functional enrichment within the STRING network. The top five GO term enrichments highlighted include ECM organization, cell adhesion, regulation of developmental processes, response to fluid shear stress, and regulation of cell migration.

Fig. 4

Functional protein–protein interaction network analysis using STRING. Each node represents a protein, while the edges (lines) between nodes represent physical or functional interactions. Node color indicates the SIM classification.

To identify significant biological pathways potentially linked to aging-related variations, Gene Set Enrichment Analysis (GSEA) was performed. This analysis identified 9 enriched Gene Ontology (GO) pathways (Fig. 5A). The Normalized Enrichment Score (NES) reflects the degree of overrepresentation at either the top or bottom of the ranked gene list. The top canonical pathways differentially expressed with aging included “ECM organization,” “Degradation of the extracellular matrix,” and “ECM proteoglycans.” These pathways exhibited negative NES values, with most genes located at the bottom of the list. Their average expression level was high but significantly decreased with age (Fig. 5B). Specifically, among the ECM genes most affected by age were collagens and ECM regulators (Fig. 5C). When analyzed through the lens of Matrisome annotations, ECM-related genes exhibited a consistent decrease in expression across all categories with age, as their log2 fold changes were negative (Fig. 5D). Expression of POSTN, COL3A1 and COL1A1 decreased over time (Fig. 5E).
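The running-sum statistic that underlies an enrichment score can be sketched as follows. This is a simplified illustration of the standard weighted Kolmogorov-Smirnov-like statistic, not the pipeline used in the study: normalization against permutation-derived null scores, which yields the NES, is omitted, and the gene names and ranking scores are toy values.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Signed maximum deviation of the running sum along a ranked gene list."""
    in_set = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(in_set)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    # Hits are weighted by the magnitude of their ranking score.
    weights = [abs(sc) ** p if hit else 0.0 for sc, hit in zip(scores, in_set)]
    total = sum(weights)
    if total == 0:
        raise ValueError("all gene-set members have a zero ranking score")
    running, best = 0.0, 0.0
    for weight, hit in zip(weights, in_set):
        # Step up on a hit, step down by a constant on a miss.
        running += weight / total if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy list ranked from most increased to most decreased with age.
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
rank_scores = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
es_bottom = enrichment_score(genes, rank_scores, {"g5", "g6"})
es_top = enrichment_score(genes, rank_scores, {"g1", "g2"})
```

A set whose members cluster at the bottom of the list, as reported here for the ECM pathways, produces a negative score, whereas a set clustered at the top, as for cholesterol biosynthesis, produces a positive one.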

Fig. 5

Significant age-related decrease in the expression of genes involved in ECM homeostasis. (A) Results from the Gene Set Enrichment Analysis (GSEA) showing all enriched pathways. The bars indicate significant enrichment at padj < 0.01. A negative Normalized Enrichment Score (NES) indicates enrichment of genes that were downregulated over time, while a positive NES indicates enrichment of genes that were upregulated over time. Among all the pathways, the one related to “Extracellular Matrix Organization” showed the most significant differential expression with age, with a negative NES indicating downregulation. (B) The plots show the enrichment score (green line) of enriched GSEA pathways across ranked gene list positions (“ECM proteoglycans”, “ECM Organization”, “Degradation of ECM” and “Cholesterol biosynthesis” pathways). (C) Representation of the “Extracellular Matrix Organization” and “Cholesterol biosynthesis” pathways, with the differentially expressed genes highlighted in color according to their fold change. (D) Expression of Core Matrisome (CM) genes based on genes significantly differentially expressed with age (x-axis: log2FC; y-axis: normalized values). All categories showed a decrease in gene expression with age (negative log2FC). (E) Decrease in gene expression of collagen I (COL1A1), collagen III (COL3A1) and periostin (POSTN) with increasing chronological age.

In contrast, genes associated with the “Cholesterol biosynthesis” pathway showed a positive NES, with all genes predominantly positioned at the top of the list (Fig. 5B). Furthermore, analysis of the cholesterol biosynthesis pathway highlighted CYP51A1 as the gene showing the most significant increase with age (Fig. 5C).

Microscopic analyses highlight extracellular matrix pro-fibrotic profile with aging

To validate key transcriptomic findings indicating the downregulation of extracellular matrix (ECM) remodeling genes in aging gingiva, a focused panel of histological and immunofluorescence analyses was conducted. Healthy gingival tissue samples were obtained from 40 healthy individuals aged 2 to 87 years and processed for microscopic examination. All specimens were derived from clinically healthy sites (probing depth ≤ 3 mm, no bleeding on probing at the sampled site, no attachment loss), ensuring that observed architectural changes reflect aging rather than subclinical inflammation.

Hematoxylin and Eosin (HE) staining provided an overview of tissue architecture and cellular distribution across age groups. Younger specimens displayed a uniform, finely interspersed dispersion of stromal nuclei within a densely packed, noninflamed lamina propria; older specimens showed an apparent reduction in overall stromal nuclear density with focal clustering and intervening hypocellular zones, consistent with decreased diffuse fibroblast distribution rather than inflammatory infiltration (leukocytes scarce) (Fig. 6A). This pattern was captured by a fibroblast distribution score that was strongly inversely correlated with age (Spearman ρ = −0.62, p < 0.001; interobserver agreement, Cohen’s κ = 0.60).

Transcriptomic data identified COL1A1 and COL3A1 as two ECM-related genes significantly downregulated with age (log2FoldChange of −2.35 and −3.23, respectively). Given that type I and III collagens constitute the predominant structural components of gingival connective tissue, Masson’s Trichrome (TM) staining was used to evaluate overall collagen content and fiber organization. Compared to younger gingival tissue, typically characterized by a dense and homogeneous mesh of fine collagen fibers, aging specimens displayed marked increases in fiber heterogeneity, irregular interfibrillar spacing, and occasional focal microhyalinization, visible as compact, glassy eosinophilic areas within the connective tissue (Fig. 6A).

Picrosirius Red (PS) staining under polarized light was applied to distinguish red–orange (thicker, highly birefringent, type I collagen-enriched fibers) and green (thinner, less birefringent, type III collagen-enriched fibers) pixel classes14,15. PS staining further highlighted age-associated collagen remodeling: younger specimens presented a balanced interlacing network of thin green and thicker red–orange birefringent fibers; older specimens showed disrupted continuity, locally thickened irregular red–orange bundles and attenuation of the finer green component. Quantification of birefringence showed that the predominant Col I–/Col III–enriched birefringent fiber area ratio (red–orange/green pixel area; architectural surrogate) rose with age (Fig. 6B), indicating a shift toward thicker, densely packed fibrillar bundles and relative depletion of the thin fibrillar component. These architectural and cellular changes parallel the age-associated downregulation of COL1A1 (log2FoldChange −2.35) and COL3A1 (log2FoldChange −3.23). Three-dimensional light sheet microscopy provided complementary insights into ECM architecture: in young gingiva, collagen fibers appeared thin, densely packed, and uniformly distributed, whereas in aged tissue the matrix exhibited thicker, irregularly arranged bundles and greater structural heterogeneity (Fig. 6C).

Among the most strongly downregulated genes was POSTN (log2FoldChange of −3.52), encoding periostin, a matricellular protein essential for fibroblast activation, mechanotransduction, and collagen cross-linking. To assess whether reduced transcript levels correlated with diminished protein expression, periostin localization was examined via immunofluorescence staining. In young gingiva, periostin was broadly and homogeneously distributed throughout the lamina propria. With increasing age, the signal became sparse, discontinuous, and limited to a narrow band beneath the epithelium (Fig. 6D). The periostin localization score declined with age (ρ = −0.884, p < 0.001).

Although lectins are not ECM components per se, UEA I staining was included to visualize endothelial glycoproteins, based on transcriptomic evidence suggesting age-related changes in endothelial function and vascular ECM remodeling. This approach enabled visualization of microvascular patterns and their potential association with ECM alterations. UEA I staining demonstrated largely preserved microvascular patterning across age groups, indicating that ECM architectural changes occur without consistent vascular depletion. Collectively, histological, polarized-light, three-dimensional and immunofluorescence findings converged with transcriptomic downregulation of COL1A1, COL3A1 and POSTN to depict progressive age-associated remodeling characterized by reduced stromal cellular dispersion, increased collagen architectural heterogeneity, a higher predominant Col I–/Col III–enriched birefringent fiber area ratio, and periostin spatial restriction, features compatible with a fibrotic-like trajectory.
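The associations between the semi-quantitative scores and age are reported as Spearman rank correlations. A minimal, dependency-free sketch of that statistic is given below, using average ranks for ties; the ages and scores are invented toy values, not study data, and the function names are illustrative.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho is the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy example: an ordinal 0-3 score that tends to fall with age.
toy_ages = [5, 12, 30, 45, 66, 78]
toy_scores = [3, 3, 2, 2, 1, 0]
rho = spearman_rho(toy_ages, toy_scores)
```

Working on ranks rather than raw values makes the measure suitable for ordinal histological scores, since it only assumes a monotonic relationship with age.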

Fig. 6

Pro-Fibrotic Extracellular Matrix Profile with Aging revealed by microscopic analysis. Illustration of age-associated microscopic changes in non-inflamed human gingiva (ages given in years, y). Data derived from the analysis of 40 healthy gingival tissue samples collected from systemically healthy individuals (2–87 years old) with no history of medication use or periodontal inflammation. Samples were stratified into three age groups: Young < 16 y (n = 20; 12 ♂, 8 ♀), Middle-aged 17–40 y (n = 11; 10 ♂, 1 ♀) and Adult > 40 y (n = 9; 6 ♂, 3 ♀). All sampled sites were periodontally healthy (probing depth ≤ 3 mm, no bleeding on probing at the site, no attachment loss). (A_HE) Younger specimens show a uniform, finely interspersed dispersion of stromal nuclei within a densely packed, noninflamed lamina propria; older specimens (66 y, 78 y) display reduced nuclear density with focal clustering and hypocellular zones (leukocytes scarce) (Hematoxylin-Eosin (HE), scale: 250 μm). (A_TM) Dense, homogeneous fine collagen mesh in young gingiva versus heterogeneous, disorganized bundles with irregular interfibrillar spacing and focal microhyalinization (*) in the 78 y sample (n = 40) (Masson’s Trichrome (TM), scale: 100 μm). (A_PS) Polarization: polarized images show a balanced interlacing network of thin green and thicker red/orange birefringent fibers in young specimens, while older specimens exhibit disrupted continuity, thickened irregular red/orange bundles and attenuation of the green component. (B) The scatter plot summarizes the per-subject predominant Col I–/Col III–enriched birefringent fiber area ratio (red/orange / green pixel area; architectural surrogate); the ratio rises significantly with age (n = 40). (C) Three-dimensional light-sheet microscopy. Volumetric renderings show a tightly interwoven, uniformly distributed network of slender fibrils in young gingiva, contrasting with thicker, irregularly oriented bundles and enlarged interfibrillar voids in aged tissue (nuclei, red; collagen, white). (D) Periostin immunofluorescence and microvasculature: periostin (green) is broad and homogeneous in the lamina propria of younger samples, progressively restricting to a narrow, discontinuous subepithelial band in older samples. UEA I lectin (red) indicates largely preserved microvascular patterning; only the 78 y specimen shows focal capillary sparsity. Nuclei counterstained with DAPI (blue). Scale = 100 μm.
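The red/orange to green pixel area ratio summarized in panel B can be illustrated with a small hue-based pixel classifier. This is only a sketch: the hue windows, the brightness and saturation cut-offs, and the function names are assumptions chosen for illustration, since the segmentation settings actually used are not given in the text.

```python
import colorsys


def classify_pixel(r, g, b, min_value=0.2, min_saturation=0.2):
    """Label an 8-bit RGB pixel as 'red_orange', 'green', or None (ignored)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    # Dark or unsaturated pixels are treated as non-birefringent background.
    if v < min_value or s < min_saturation:
        return None
    hue_deg = h * 360
    if hue_deg < 50 or hue_deg >= 330:  # assumed red-orange window
        return "red_orange"
    if 70 <= hue_deg < 170:  # assumed green window
        return "green"
    return None


def fiber_area_ratio(pixels):
    """Ratio of red-orange to green pixel counts for an iterable of RGB tuples."""
    labels = [classify_pixel(*px) for px in pixels]
    red_orange = labels.count("red_orange")
    green = labels.count("green")
    if green == 0:
        raise ValueError("no green pixels: ratio is undefined")
    return red_orange / green


# Toy image fragment: two red/orange pixels, one green pixel, one dark background pixel.
toy_pixels = [(200, 30, 20), (230, 120, 20), (40, 200, 60), (5, 5, 5)]
toy_ratio = fiber_area_ratio(toy_pixels)
```

In a real workflow the pixel list would come from a polarized-light image, and the thresholds would need to be calibrated against the microscope and camera settings, because birefringence color also depends on fiber orientation and section thickness.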

Definition of a minimal gene expression signature of gingival biological age

A sparse partial least squares regression was performed to predict chronological age based on the normalized counts of “S” genes that exhibited differential expression with age. After model optimization and adjusting for periodontal health status and gender, 21 genes were required to compute gingival biological age, with an adjusted R² of 0.74 (Fig. 7A). The model highlighted that the predicted age (biological age) of subjects with periodontal disease was higher compared to their chronological age (Fig. 7B).
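Two quantities in this paragraph lend themselves to a short worked sketch: the adjusted R² of an age-prediction model and the gap between predicted (biological) and chronological age. The code below does not reproduce the sparse partial least squares model; it only assumes that predicted ages are already available, and the ages and the predictor count used in the example are hypothetical.

```python
def adjusted_r2(y_true, y_pred, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)


def age_gaps(chronological, predicted):
    """Positive gap: tissue appears older than the subject's chronological age."""
    return [p - c for c, p in zip(chronological, predicted)]


# Toy chronological and predicted ages (years) for five hypothetical subjects.
chrono = [20, 30, 40, 50, 60]
pred = [22, 29, 41, 52, 58]
adj_r2 = adjusted_r2(chrono, pred, n_predictors=1)
gaps = age_gaps(chrono, pred)
```

The adjustment penalizes R² for the number of predictors relative to the sample size, which matters when a multi-gene signature is fitted on a few dozen samples; a consistently positive gap within a group would correspond to the pattern described for subjects with periodontal disease.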

Fig. 7

Predicting gingival age to obtain a minimal gene signature. From a clinical perspective, intercepting individuals whose biological gingival age exceeds their chronological age could allow for personalized preventive strategies to delay or prevent the progression to overt periodontal pathology. (A) A sparse partial least squares (sPLS) regression was performed to explain chronological age based on the normalized counts of the “S” genes that showed differential expression with age. A total of 10 genes were selected by the model as necessary for dimension 1, and 11 genes for dimension 2, to effectively explain chronological age (S signature). (B) A linear regression model was then built using the genes of the S signature to explain chronological age, adjusting for periodontal health categories (healthy, gingivitis, periodontitis) and gender.

Discussion

The present study provides significant insights into gingival aging by aggregating and analyzing 57 bulk RNA sequencing samples available in public databases. This analysis offers a novel perspective on the aging trajectory of the superficial periodontium in humans, independent of the local inflammatory status, and primarily suggests modification in tissue architecture, including a collapse in matrix remodeling processes with age.

The global age-dependent gene downregulation observed in human gingiva aligns with previous transcriptomic studies on both human gingiva16 and skin17. Our findings are consistent with the observation that aging is associated with a loss of potential, leading to a global decline in biological functions18,19,20. It can be hypothesized that aging causes tissue-specific functional decline, with a form of “specialization/restriction” in response to stimuli21, and a reduced capacity to adapt to daily stresses.

Post hoc functional categorization revealed that SIM-classified genes made an outsized contribution to the variance captured by the principal components. Beyond the core SIM domains, a distinct subset of differentially expressed genes involved in transcriptional regulation and cell cycle control was identified. These genes likely function as upstream modulators that may indirectly influence or orchestrate structural, immune, and metabolic changes during gingival aging. Examples include chromatin remodelers, DNA-binding transcription factors, and proliferation regulators whose altered expression may affect ECM turnover, immune cell activation, or metabolic stress responses. The presence of these regulatory changes suggests that age-related alterations in gingival tissue architecture and function may arise not only from localized dysfunction within SIM pathways, but also from higher-order perturbations in global gene expression control. This layered regulation underscores the complexity of mucosal aging and highlights potential intersections between SIM functions and broader transcriptional networks. While the present SIM classification served as a post hoc functional interpretive framework, the contribution of regulatory mechanisms is explicitly recognized.

The SIM perspective suggests that healthy functions are achieved and maintained through an optimized relationship between Structure/supportive compartment (S), Immune/inflammatory system (I) and Metabolism (M)1. Among these components, enrichment findings highlighted canonical pathways specifically associated with the “S” part of the triad, with a significant decrease in the expression of pathways related to “Extracellular matrix (ECM) organization”, “Collagen formation”, and “ECM proteoglycans (PGs)”. These pathways encompass genes involved in remodeling (e.g., Matrix Metalloproteinases (MMPs)) as well as those that constitute the ECM (e.g., collagens, periostin). The ECM is a dynamic three-dimensional network of macromolecules that provides physical scaffolds for cells and tissues22 and plays key regulatory roles in many cellular processes and functions (growth, migration, differentiation, survival, homeostasis, morphogenesis and cell signaling)23. Age-dependent alterations in MMPs have been documented in skin24,25,26, Bruch’s–choroid27 and the cardiovascular system28,29. Additionally, MMP3 and MMP27 appeared to be down-regulated in old periodontal ligament cells and in old human gingival fibroblasts16,30. Within the pathway associated with collagen formation, several genes exhibited significant down-regulation with aging, including fibril-forming collagens (COL3A1, COL5A2, COL24A1), fibril-associated collagens with interrupted triple helices (COL9A2, COL12A1, COL14A1, COL22A1), beaded filament collagen (COL6A6) and ECM components of the basement membrane, such as laminins (LAMA4, LAMB1). A decrease in collagen production with age has been well documented in the literature, particularly in the skin31,32,33,34. Collagen I fibers impact rigidity, strength, and resistance to torsion and tension in the tissues, whereas collagen III fibers are thinner and more elastic35. Collagen III contributes to the tissue’s resilience and distensibility (the ability of biological tissue to stretch and contract in response to different stresses) and offers structural support during growth. Additionally, collagen III plays a role in the tissue regeneration process36. The age-related reduction in collagen III signaling may, therefore, contribute to tissue stiffening and the loss of the ability to effectively respond to daily stresses.

Alongside collagen, the gene expression of periostin also decreases with age. Periostin is a matricellular protein expressed in collagen-rich tissues subjected to constant mechanical strains, including periodontal tissues37. It may be involved in tissue remodeling by promoting adhesion, cellular differentiation, cell survival, and fibrogenesis38. A role for periostin in the regulation of ECM production and maturation has been established in various tissues, including the heart39, skin40 and periodontal tissues41. The downregulation of periostin with aging has been documented in the skin42, heart43 and adipose tissue44. Based on our findings, the decrease in periostin expression, observed at both the transcript and protein levels, may contribute to increased ECM stiffness, impaired wound healing, and heightened vulnerability to mechanical stress and damage. Periostin plays a crucial role in maintaining the integrity and function of the interface between the epithelium and the underlying connective tissue45,46. It helps anchor and support the epithelial layer, promoting proper tissue architecture and function. A reduction in periostin at this critical interface in older individuals could weaken epithelial-connective tissue interactions, potentially impairing tissue barrier function and increasing susceptibility to epithelial damage or dysfunction.

Matrisome variations from the analyzed datasets clearly illustrated a global reduction in the expression of genes related to ECM as aging progresses. The decline in ECM degradation and diminished collagen formation thus emerge as pivotal factors in gingival aging. According to the literature, the reduction in ECM turnover is associated with a gradual increase in intra- and inter-molecular post-translational collagen cross-linking (covalent bonds)47, leading to enhanced ECM stiffness with age48. ECM accumulation and stiffening play a critical role in the initiation and progression of fibrogenesis by promoting mechano-activation of pro-fibrotic signaling pathways48. Furthermore, covalent cross-links exhibit resistance to complete proteolytic cleavage. Consequently, age-related collagen fragmentation results in the accumulation of fragmented collagen49, irreversibly compromising the functional and structural integrity of the ECM34. Moreover, aging severely impairs the proliferative and regenerative capacities of oral fibroblasts50, resulting in the downregulation of ECM production, disorganization of ECM architecture, inefficient wound healing, and ultimately, distortion of organ architecture and loss of function. Aging-related stiffness and fibrosis51 have also been reported in other organs and tissues, such as liver52, kidney53,54, lung55,56, heart57, ovary58 and skin59. Furthermore, semi-quantitative fibroblast distribution and periostin localization scores in gingival tissue exhibited strong inverse correlations with age, supported by moderate to substantial interobserver agreement. These findings indicate a progressive reduction in stromal cellular dispersion and a loss of diffuse periostin organization. In parallel, the Picrosirius red–derived birefringent fiber ratio revealed an increased predominance of thick, red–orange (Col I–enriched) fibers relative to thin, green (Col III–enriched) fibers. 
This shift was accompanied by heightened architectural heterogeneity, including locally thickened collagen bundles, attenuation of fine birefringent components, and focal microhyalinization. Collectively, these findings define a progressive, age-associated fibrotic-like remodeling of the gingival lamina propria. This phenotype reflects structural matrix reorganization without evidence of end-stage fibrosis: no extensive acellular hyalinized plates, no vascular obliteration, and preservation of overall tissue architecture. The term “fibrotic-like” is used to denote this intermediate, non-pathological matrix phenotype—distinct from classical fibrosis yet indicative of declining connective tissue resilience with age54,60,61,62.

According to our results, age-related variations in the Structure compartment would also be associated with changes in Metabolism. The top canonical pathways differentially expressed with aging and characterized by a positive normalized enrichment score (NES) were those related to “Cholesterol biosynthesis”. Cholesterol is a critical component of the plasma membrane and plays a key role in regulating membrane fluidity and structure, as well as cell adhesion63. Significant enrichment of the “cholesterol biosynthesis” pathway during aging has already been reported in pneumocytes, lipofibroblasts64 and endothelial cells65,66. The cholesterol biosynthesis pathway intersects with several biological programs relevant to aging and immunity. The mevalonate pathway, which underlies cholesterol biosynthesis, has been implicated in the induction of cellular senescence through increased oxidative stress, mitochondrial dysfunction, and DNA damage signaling67. Furthermore, cholesterol intermediates serve as key modulators of innate immunity: they can prime type I interferon responses, promote NLRP3 inflammasome activation, and drive trained immunity via epigenetic reprogramming in myeloid cells68,69. In barrier tissues such as gingiva, where immune tolerance and microbial vigilance must be tightly balanced, such metabolic rewiring could gradually destabilize immune homeostasis. The age-related increase in cholesterol biosynthetic activity may therefore reflect both a response to cumulative stress and a contributor to the chronic para-inflammatory state observed in aging connective tissues. This aligns with the SIM framework, in which metabolic and immune alterations interact with structural decline to drive progressive loss of tissue resilience.

Gingival aging is thus associated with an increase in both ECM and cellular stiffness, supporting the hypothesis of the co-evolution between cells and their environment over time70. Additionally, since the elements of the SIM are interrelated, any change in one compartment will impact the other two, creating a vicious cycle that gradually leads to frailty and disease. In this context, targeted interventions on SIM elements could help reverse the vicious cycle by re-educating the SIM and reestablishing a balance between Structure, Immunity/Inflammation, and Metabolism, ultimately maintaining tissue homeostasis and health over time. The SIM paradigm developed in this study aims to move beyond established models of aging such as inflammaging, immunosenescence, or molecularly focused geroscience71. While those frameworks have significantly advanced our understanding of age-related molecular deregulations, they often remain anchored in discrete signaling cascades or immune dysfunction. By contrast, SIM proposes a functionally and spatially integrative view of aging, centered on the progressive disintegration of coordination between structural integrity, immune competence, and metabolic adaptability. This model conceptualizes aging not as a linear accumulation of molecular damage, but as a systems-level drift in tissue homeostasis - manifesting through architectural disorganization, immune inefficiency, and altered energetic regulation. As such, SIM provides a theoretical framework for identifying early, subclinical signs of aging at tissue interfaces - such as the gingiva - that are chronically exposed to mechanical and microbial stressors.

The identification of a gingival health signature and the evaluation of its evolution with age could allow clinicians to monitor an individual’s gingival health and detect early or predictive signs of imbalance or pathology. Similarly, analyzing biological gingival age and its variations could help practitioners assess beforehand the effectiveness of preventive and therapeutic measures for each patient. The difference between chronological age and predicted age (gingival biological age) might reveal a trajectory of oral aging that could be either accelerated or decelerated, regardless of the periodontal inflammatory status, and could indicate subclinical stages of future oral dysfunctions. Importantly, we observed that individuals with periodontitis tended to have a predicted biological age greater than their chronological age, suggesting accelerated gingival aging. This is conceptually meaningful, as it supports the notion that transcriptomic aging signatures could help identify individuals at higher risk of developing periodontal disease, even before clinical symptoms are apparent. From a clinical perspective, intercepting individuals whose biological gingival age exceeds their chronological age could allow for personalized preventive strategies to delay or prevent the progression to overt periodontal pathology. Conversely, individuals with a biological age below their chronological age may exhibit healthier tissue aging trajectories. This predictive modeling approach supports the potential use of transcriptomic data to infer biological gingival age and detect early signs of accelerated tissue aging, which could contribute to personalized risk stratification in periodontal medicine.

Moreover, the unique position and role of the mouth as an interface and interphase between the external and internal environment, coupled with its constant exposure to stress, could provide key insights into aging not only at the oral level but also at the systemic level10. In fact, the mouth could serve as an organ where early signs of pre-frailty or frailty are detected, enabling preventive action for the entire organism. In the future, this may allow for the identification of potential markers to assess both gingival and overall health, as well as to prevent or treat oral and age-related systemic diseases at an early stage. In this context, mesenchymal stromal cells (MSC), which play a key role in Structure, Immune and Metabolism regulation72,73, appear to be a crucial element for further investigation, to foster preventive and early therapeutic strategies aimed at promoting healthy aging.

Limitations of the study

The present study is subject to certain limitations inherent to the use of publicly available transcriptomic datasets. Due to incomplete metadata, potential confounding factors such as lifestyle habits (e.g., smoking, oral hygiene practices), systemic health conditions, and ethnic background could not be systematically evaluated. Similarly, granular clinical data regarding gingival status - such as staging or grading of periodontitis beyond general classifications - were often unavailable or inconsistently reported. While the present analysis included only non-malignant, fresh gingival samples and accounted for sex and periodontal status, residual confounding cannot be fully excluded. Future prospective studies with comprehensive clinical annotation and diverse populations will be essential to confirm and expand upon these findings.

Methods

Acquisition of human gingival transcriptome data

A review was conducted on GEO DataSets (https://www.ncbi.nlm.nih.gov/gds) using the following keywords: gingival OR gingivitis OR gingiva. Datasets were included if they contained bulk RNA sequencing data from non-pooled samples (individual data) of human fresh, unfixed gingiva (Fig. 1). Information on individual age, sex, and the inflammatory status of the sampled tissue (healthy site, inflammatory site with gingivitis alone or with periodontitis) is provided in Supplementary Table 1.

Exclusion criteria included: non-human gingival samples, cultured cells or tissue explants, cancer biopsies, and absence of data on individual age or tissue inflammation. Last search date: 2024/12/31.

Based on the defined keywords, 3068 records were retrieved. After manual screening, 57 datasets corresponding to 57 individuals were identified (Fig. 1). The average age of the entire sample was 44 years, with ages ranging from 16 to 78 years. No gingival transcriptomic data on human subjects younger than 16 were available in the literature. Detailed information, including age, sex, and the periodontal health status of the subjects (gingivitis, periodontitis, or clinically healthy gingiva) is provided in Supplementary Table 1. Tissue samples were annotated using the periodontal status available in the corresponding metadata. The following clinical parameters were used to categorize samples: Periodontal health: Absence of visible inflammation, mucosal candidiasis, or clinical signs of infection at the time of tissue sampling; probing depth (PD) ≤ 3 mm; no clinical attachment loss (CAL); and no bleeding on probing (BoP). Gingivitis: PD ≤ 3 mm; no CAL; presence of BoP. Periodontitis (as defined by Papapanou et al., 2018): PD ≥ 5 mm with CAL ≥ 3 mm and BoP, consistent with Stage III disease severity74.
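
The categorization rules listed in this paragraph can be written as a small decision function. The following is a minimal sketch in Python, not the procedure used by the authors: the function name `classify_site`, its argument names, and the `"unclassified"` fallback for sites that match none of the three definitions are assumptions made for illustration. The non-numeric criteria for periodontal health (no visible inflammation, candidiasis or infection) are not encoded.

```python
def classify_site(pd_mm, cal_mm, bop):
    """Assign a periodontal status from probing depth (mm),
    clinical attachment loss (mm) and bleeding on probing (bool)."""
    if pd_mm <= 3 and cal_mm == 0:
        # Shallow pocket without attachment loss: bleeding on probing
        # separates gingivitis from periodontal health.
        return "gingivitis" if bop else "health"
    if pd_mm >= 5 and cal_mm >= 3 and bop:
        # Thresholds quoted in the text for periodontitis.
        return "periodontitis"
    # Hypothetical fallback: the text defines no category for other sites.
    return "unclassified"
```

Under these rules, a site with PD = 4 mm falls into none of the three categories, which is why the sketch returns an explicit fallback value rather than guessing.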

Bulk RNA seq quality control, trimming, mapping and counting

The raw data were pre-processed using Galaxy (https://usegalaxy.org), which allowed quality control, trimming, mapping and gene counting75. All raw sequencing reads were aligned to the human genome reference hg38. Only uniquely and properly mapped read pairs were used for further analysis.

Biostatistical analysis

All subsequent statistical analyses were performed using R/RStudio. The analysis process involved importing all the data, removing the batch effect, analyzing the genes differentially expressed with age, functionally grouping these genes, and then defining a minimal signature of gingival biological age.

Data integration: All expression data were merged with their corresponding metadata (age and sex of the individual, tissue inflammatory status). The transcriptomic dataset was composed of multiple independent sub-cohorts originating from different studies or collection periods, referred to as “batches”. Each batch was identified by its corresponding GEO accession number (GSE), which is listed in Supplementary Table 1 along with the distribution of samples across age and periodontal status. Each set of samples sequenced for the same study was considered as belonging to the same batch. To mitigate potential non-biological variation introduced by differences in experimental protocols, RNA extraction methods, or sequencing platforms, a batch correction step was performed prior to downstream analyses. Specifically, the normalized expression matrix was adjusted using the removeBatchEffect() function from the limma R package (version 3.62.2), with batch identity specified based on GEO dataset origin76. The rationale for early correction was to minimize technical artifacts while preserving biological variation of interest. Following batch correction, a variance stabilizing transformation (VST) was applied to the expression matrix using the vst() function from the DESeq2 package. This transformation was used solely for exploratory analyses, including principal component analysis (PCA) and unsupervised clustering, as it stabilizes the variance across genes with differing expression levels and improves interpretability of the data in reduced-dimensional space. The batch correction was applied before conducting differential gene expression (DGE) analysis.
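
To illustrate what a batch adjustment does to a single gene, the sketch below removes per-batch mean shifts from one expression vector. It is written in Python as a simplified stand-in and is not the limma implementation: the function name `center_batches` is hypothetical, and no biological covariates (age, sex, periodontal status) are protected here, whereas a model-based correction can retain them through a design matrix.

```python
from collections import defaultdict

def center_batches(values, batches):
    """Remove batch-specific mean shifts for one gene.

    values  : expression of the gene in each sample
    batches : batch label (e.g. GSE accession) of each sample
    """
    by_batch = defaultdict(list)
    for value, batch in zip(values, batches):
        by_batch[batch].append(value)
    batch_means = {b: sum(v) / len(v) for b, v in by_batch.items()}
    # Reference level: unweighted mean of the batch means.
    reference = sum(batch_means.values()) / len(batch_means)
    # Subtract each batch's deviation from the reference level.
    return [v - (batch_means[b] - reference) for v, b in zip(values, batches)]

corrected = center_batches([1.0, 3.0, 10.0, 12.0, 14.0],
                           ["A", "A", "B", "B", "B"])
```

In this toy example, both batches end up with the same mean after correction while within-batch differences between samples are unchanged.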

Differential expression analysis: Differentially expressed genes associated with age were identified from raw count matrices using the DESeq2 package (version 1.46.0). The analysis included all genes from all datasets without any a priori exclusion, to provide a comprehensive overview of global gene expression changes with age. Multivariate models were fitted to assess the effects of age, periodontal status, and sex, while accounting for technical batch. Interaction terms (sex:age and periodontal_status:age) were included to test whether age-related transcriptional changes vary by sex or across periodontal-status categories. No genes remained significant after multiple-testing correction (FDR < 0.01) for either interaction. Accordingly, a reduced model without interaction terms was refitted to estimate and report the main effects.

Principal component analysis (PCA): PCA was performed on normalized counts of genes significantly differentially expressed with age (p-value threshold of 0.01), to understand their contribution to aging.

Gene set enrichment analysis (GSEA): GSEA was performed using the fgsea package on all genes significantly differentially expressed with age. Genes were ranked using a composite score combining the log2FoldChange and the corresponding p-value (log2FoldChange * -log10(p-value)). Significantly enriched pathways were identified at a False Discovery Rate (FDR) adjusted p-value threshold of 0.001.
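
The composite ranking score can be made concrete with a short worked example. This is a minimal sketch in Python, not the code used for the analysis: the gene names and values are invented, and the `p_floor` guard against p-values reported as zero is an assumption that is not described in the text.

```python
import math

def composite_score(log2_fold_change, p_value, p_floor=1e-300):
    """log2FoldChange * -log10(p-value), as described in the text."""
    # p_floor is an assumption: it avoids log10(0) when p is reported as 0.
    return log2_fold_change * -math.log10(max(p_value, p_floor))

def rank_genes(results):
    """results: {gene: (log2FoldChange, p_value)}.
    Returns gene names sorted by decreasing composite score."""
    return sorted(results,
                  key=lambda gene: composite_score(*results[gene]),
                  reverse=True)

# Invented values for illustration only.
example = {
    "gene_a": (2.0, 0.001),   # strongly up, highly significant
    "gene_b": (-1.0, 0.01),   # down-regulated
    "gene_c": (0.5, 0.005),   # mildly up
}
ranking = rank_genes(example)
```

The sign of the score follows the direction of change, while its magnitude grows with both effect size and significance, so strongly up-regulated genes sit at the top of the ranked list and strongly down-regulated genes at the bottom.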

SIM categorization: All genes significantly differentially expressed with age were classified into one or more of the following functional categories: Structure (S), Immunity (I), and Metabolism (M), based on their annotated biological functions. A gene could belong to multiple categories simultaneously. The classification into S, I, and M categories was based on functional annotations from the Gene Ontology (GO) database and the Human Protein Atlas. Structure (S): Genes involved in tissue architecture, extracellular matrix organization, cell adhesion to ECM, and structural integrity of gingival tissue. Typical GO terms included in this category were extracellular matrix organization and collagen binding. Immunity (I): Genes related to immune response, inflammation, cytokine signaling, and antigen processing. GO terms included immune system process, inflammatory response, cytokine activity, T-cell activation, etc. Metabolism (M): Genes involved in cellular metabolism, including oxidative phosphorylation, lipid and glucose metabolism, mitochondrial activity, and biosynthetic processes. Relevant GO terms included metabolic process, oxidoreductase activity, lipid metabolic process, and ATP synthesis. In addition to the SIM domains, further thematic categories were introduced to account for genes not primarily classified under structural, immune, or metabolic processes but nonetheless relevant to aging biology: Proliferation-related genes (encompassing regulators of cell cycle progression and mitotic activity); transcription factors (including DNA-binding proteins and transcriptional modulators not confined to a specific SIM domain); migration-related genes (covering cytoskeletal regulators, and chemotactic mediators); Other/Unclassified (a residual group capturing genes with diverse or poorly characterized functions not assignable to the above categories). 
This structured classification enabled a layered analysis of transcriptomic trajectories in aging gingiva and supported integrative interpretation across molecular and histological findings.

Identifying a “SIM” gene signature within expression data: To assess the contribution of significantly differentially expressed genes to aging through ‘SIM’ functions, the average loadings of each category on the first two principal components were calculated. A qualitative assessment was also conducted using STRING version 12.0 (https://string-db.org/), enabling the classification of transcriptomic data according to the associated protein activities. Finally, to reduce the transcriptomic space, a sparse Partial Least Squares (sPLS) regression was optimized to define biological gingival age with a minimal number of variables.
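
The per-category averaging of PCA loadings can be sketched as follows. This Python example assumes that loadings on the first two components are already available per gene and that a plain (signed) mean is taken; the text does not state whether signed or absolute loadings were averaged. The function name, gene names and values are hypothetical.

```python
def mean_loadings_by_category(loadings, categories):
    """Average PC1/PC2 loadings per functional category.

    loadings   : {gene: (pc1_loading, pc2_loading)}
    categories : {gene: set of labels such as "S", "I", "M"}
    A gene assigned to several categories contributes to each of them.
    """
    sums = {}
    for gene, (pc1, pc2) in loadings.items():
        for label in categories.get(gene, ()):
            acc = sums.setdefault(label, [0.0, 0.0, 0])
            acc[0] += pc1
            acc[1] += pc2
            acc[2] += 1
    return {label: (s[0] / s[2], s[1] / s[2]) for label, s in sums.items()}

# Invented loadings and category assignments for illustration only.
loadings = {"gene_a": (0.4, 0.1), "gene_b": (0.2, -0.3), "gene_c": (-0.1, 0.5)}
categories = {"gene_a": {"S"}, "gene_b": {"S", "I"}, "gene_c": {"M"}}
averages = mean_loadings_by_category(loadings, categories)
```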

Definition of the gingival biological age model: A linear model explaining chronological age from the previously reduced list of genes was then fitted. The age predicted by this model was considered the gingival biological age (GBA).
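
The principle of this model can be sketched as an ordinary least-squares fit of chronological age on a small set of gene expression values, with the prediction read as the gingival biological age. The Python code below is an illustration under stated assumptions and not the R analysis itself: the helper names, the two-gene toy data and the `age_gap` function (predicted minus chronological age) are invented for the example.

```python
def fit_linear_model(X, y):
    """Least squares via the normal equations (X'X) b = X'y.
    X: one row of expression values per sample; y: chronological ages.
    Returns [intercept, coef_1, ..., coef_p]."""
    rows = [[1.0] + list(r) for r in X]          # add intercept column
    p = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    for col in range(p):                          # Gauss-Jordan elimination
        pivot = max(range(col, p), key=lambda k: abs(A[k][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        A[col], A[pivot] = A[pivot], A[col]
        scale = A[col][col]
        A[col] = [v / scale for v in A[col]]
        for k in range(p):
            if k != col:
                factor = A[k][col]
                A[k] = [vk - factor * vc for vk, vc in zip(A[k], A[col])]
    return [A[i][p] for i in range(p)]

def predict_age(coefs, expression):
    """Predicted age, interpreted here as gingival biological age."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], expression))

def age_gap(coefs, expression, chronological_age):
    """Positive gap: predicted age exceeds chronological age."""
    return predict_age(coefs, expression) - chronological_age

# Toy data generated as age = 10 + 2*gene1 + 3*gene2 (invented values).
X = [[1, 0], [0, 1], [1, 1], [2, 1], [1, 3]]
ages = [12, 13, 15, 17, 21]
coefs = fit_linear_model(X, ages)
```

With these toy data the fit recovers the generating coefficients, and a sample with expression `[2, 2]` and a chronological age of 18 would show a positive age gap of 2 years.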

Gingival microscopy

To support transcriptomic findings related to extracellular matrix remodeling, histological staining was performed on 40 healthy gingival tissue specimens (2 × 1 mm), obtained from the buccal marginal gingiva, defined as the terminal, collar-like edge of the gingiva on the buccal surface, adjacent to the tooth77. This region forms the soft tissue wall of the gingival sulcus and is not directly attached to the underlying alveolar bone. Tissue samples were collected during tooth extractions performed for non-periodontal clinical indications, including advanced carious lesions, dental fractures, or traumatic injuries resulting in structural loss incompatible with restoration. Extractions performed for periodontal disease or infection were excluded. Prior to extraction, all sites underwent a comprehensive periodontal examination. Only samples meeting the following criteria were included: probing depth ≤ 3 mm, no clinical attachment loss, and absence of bleeding on probing (BoP). These criteria ensured the inclusion of clinically healthy gingival tissue. To minimize confounding factors, samples were excluded in cases of known systemic disease, current medication use (e.g., corticosteroids, immunosuppressants), or any signs of periodontal inflammation. Samples were stratified into three age categories: Young (< 16 years, n = 20; 12 males, 8 females), Middle-aged (17–40 years, n = 11; 10 males, 1 female), and Elderly (>40 years, n = 9; 6 males, 3 females), according to the NIH Lifespan Categories78.

For 2D microscopy, gingival tissues were fixed in 3.7% paraformaldehyde for 5 h, then stored in phosphate-buffered saline (PBS) at 4 °C until paraffin wax embedding. Five-micrometer-thick sections were cut and processed for hematoxylin/eosin (HE), Masson’s trichrome (TM) and Picrosirius red (PS) staining, then scanned (Zeiss Axio scan Z1, Centre d’Imagerie Quantitative Lyon Est (CIQLE)). For each specimen, four sections were analyzed per staining modality (HE, TM, and PS). Sections were collected at 50 μm intervals to ensure representative sampling across the connective tissue while avoiding redundancy from overlapping anatomical structures.

Hematoxylin and Eosin (HE) offered a global assessment of tissue morphology and cellular organization across different age groups (hematoxylin stains nuclei blue, while eosin stains cytoplasm and extracellular components pink). Masson’s Trichrome (TM) was used to visualize overall collagen fiber density and orientation (with collagen fibers appearing blue/green and cytoplasm red). Picrosirius Red (PS) staining under polarized light allowed specific differentiation between red–orange (thicker, highly birefringent, type-I collagen-enriched fibers) and green (thinner, less birefringent, type III collagen-enriched fibers) pixel classes14,15.

Periostin immunofluorescence was used to detect periostin expression, a matricellular protein involved in fibroblast activation and extracellular matrix remodeling, in order to assess its spatial distribution changes with age. UEA I Lectin-based immunofluorescence was employed to label endothelial glycoproteins, serving both as a compartmental landmark and as an indicator of microvascular integrity, since ECM stiffening and fibrosis often coincide with vascular remodeling.

For light sheet microscopy, samples were fixed using 3.7% PFA before being labelled with Propidium Iodide overnight. After the Propidium Iodide was rinsed off, samples were dehydrated in 100% Methanol before being labelled with a solution of Fast Green diluted 1:1000 in Methanol overnight. The Fast Green was rinsed off and the sample rehydrated in 1X PBS before embedding in 1% Low Melting agar. For in-depth imaging, the sample was dehydrated in 100% Methanol before being cleared with a solution of 1 vol. Benzyl Alcohol / 2 vol. Benzyl Benzoate (BABB). Light sheet microscopy was performed using a Light Sheet Z7 (Carl Zeiss, Rueil-Malmaison, France) and analyzed using Zen Blue Software.

Fluorescence microscopy was performed after antigen retrieval for 30 min (citrate buffer at 90 °C in a water bath). Sections were incubated with rabbit anti-human periostin antibody (Abcam (Cambridge, CB2 0AX, UK), Ab 14041, dilution 1/100, overnight at 4 °C) followed by secondary Alexa 488 donkey anti-rabbit antibody, and with human lectin (Ulex Europaeus Agglutinin I (UEA I), Biotinylated (B-1065-2), Vector Labs) followed by secondary streptavidin (Streptavidin Alexa Fluor 647 conjugated; Life technology) for one hour at room temperature, to localize the endothelium and superficial epithelial layers, followed by DAPI incubation for nuclear detection. Images were acquired using a widefield Axio Observer microscope and analyzed with Zen software (Carl Zeiss, Rueil-Malmaison, France).

Semiquantitative fibroblast and periostin scoring. Two experienced oral histopathologists independently scored fibroblast nuclear distribution on HE sections using a 4-point scale: 0 = very few fibroblast nuclei within dense collagen; 1 = sparse nuclei; 2 = heterogeneous clustered nuclei; 3 = numerous nuclei homogeneously dispersed. Periostin immunofluorescence was scored on a 4-point scale: 0 = no staining; 1 = focal, rare, discontinuous narrow subepithelial staining; 2 = heterogeneous, discontinuous narrow subepithelial staining; 3 = broad, homogeneous lamina propria staining. For each specimen, one representative section was used; observers were blinded to age group. Interobserver agreement was assessed by Cohen’s kappa and was found to be acceptable (κ = 0.60 for fibroblast score; κ = 0.70 for periostin).
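
For reference, Cohen’s kappa compares the observed agreement between two raters with the agreement expected by chance from their marginal score frequencies, κ = (p_o − p_e) / (1 − p_e). The Python sketch below computes the unweighted statistic; whether an unweighted or weighted kappa was used is not stated in the text, so the unweighted form is an assumption, and the scores shown are invented.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same specimens."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set")
    n = len(rater_a)
    # Observed agreement: fraction of specimens given identical scores.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal score frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1 - expected)

# Invented scores on the 0-3 scale for eight specimens.
scores_a = [0, 1, 2, 3, 3, 2, 1, 0]
scores_b = [0, 1, 2, 3, 2, 2, 1, 1]
kappa = cohen_kappa(scores_a, scores_b)
```

In this example the raters agree on 6 of 8 specimens (p_o = 0.75) and chance agreement is p_e = 0.25, giving κ ≈ 0.67.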

Analysis of PS staining. A Fiji® pipeline enabled the semi-automated quantification of collagen birefringence patterns in PS-stained sections. Using the Seeded Region Growing Tools plugin in Fiji® to delineate the epithelial layer from the underlying lamina propria, only the connective tissue compartment, where collagen fibers are most prominent, was retained for analysis. Subsequently, the red-orange/green birefringent pixel area ratio per subject was computed. This ratio represents the relative abundance of thick, tightly packed (red-orange) versus thin, loosely packed (green) birefringent fibers, interpreted as a surrogate for predominant collagen I versus collagen III-enriched matrix components14,15. Each ratio was calculated as the average across three non-overlapping stromal fields, acquired under standardized illumination and polarization conditions.
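
The per-subject ratio can be illustrated with a few lines of Python. This sketch assumes that the red-orange and green pixel areas have already been measured for each stromal field, and that the subject value is the mean of the per-field ratios rather than a ratio of pooled areas; the text does not specify which. The function name and pixel counts are hypothetical.

```python
def birefringence_ratio(fields):
    """Mean red-orange/green pixel-area ratio across stromal fields.

    fields: list of (red_orange_pixels, green_pixels), one pair per field.
    """
    if not fields:
        raise ValueError("at least one field is required")
    ratios = []
    for red_orange, green in fields:
        if green == 0:
            # The ratio is undefined without green birefringent pixels.
            raise ValueError("field has no green birefringent pixels")
        ratios.append(red_orange / green)
    return sum(ratios) / len(ratios)

# Invented pixel counts for three non-overlapping fields of one subject.
subject_ratio = birefringence_ratio([(600, 200), (400, 200), (500, 125)])
```

Here the three fields give ratios of 3, 2 and 4, so the subject-level value is 3; a higher value indicates a greater predominance of thick, red-orange birefringent fibers.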

Ethics declarations

Healthy gingival tissue was obtained from subjects aged 2 to 87 years, as biological product with change of purpose (care leftovers). The protocol for the preparation of human biological samples for research was approved under N°DC-2019-3731 by the competent authority (CODEOH, part of the Ministry of Higher Education and Research). The patients’ non-opposition was duly collected in accordance with ethical requirements. For minors, non-opposition was obtained from a parent or legal guardian. The metadata collected (age, sex and periodontal status) comply with the MR-004 methodology of the French regulatory authority for personal data (CNIL, Commission Nationale de l’Informatique et des Libertés). All methods were carried out in accordance with relevant guidelines and regulations.