Introduction

Periodontitis represents a chronic inflammatory disorder of the periodontal tissues mediated by dysbiotic microbial communities, affecting nearly half of the global population and ranking as the sixth most prevalent disease worldwide1. This pathological condition manifests clinically through gingival inflammation and progressive destruction of periodontal structures, typically accompanied by substantial immune cell infiltration into affected sites2,3,4,5,6. The disease process entails persistent inflammation leading to irreversible degradation of both mineralized and soft connective tissues supporting the dentition7,8. Emerging evidence highlights immune dysregulation as a pivotal factor in periodontal pathogenesis, with activated T and B lymphocytes demonstrating particular significance in mediating alveolar bone resorption during disease progression9,10,11. Comprehensive understanding of immune cell functions has become essential for elucidating disease mechanisms and advancing targeted therapeutic interventions for periodontitis.

Periodontitis pathogenesis is multifactorial, involving complex interactions between genetic predisposition, environmental exposures, pathogenic microorganisms, and behavioral factors. Although substantial progress has been made in periodontitis research, the precise molecular mechanisms underlying disease initiation and progression remain incompletely characterized. Single-cell RNA sequencing (scRNA-seq) has emerged as a powerful tool for elucidating molecular pathways and pinpointing key cellular populations involved in disease processes. This cutting-edge technology provides unprecedented resolution for profiling individual cells across multiple omics layers (genomic, transcriptomic, and epigenomic), thereby uncovering cellular diversity and intricate gene regulatory networks within the immune system12. Prior investigations have demonstrated associations between monocyte-specific differentially expressed genes (DEGs) and periodontitis, with subsequent validation in infiltrating immune cell populations13,14. However, these findings often failed to establish connections between immune cell subtypes and core regulatory genes, while also lacking robust validation through integrated scRNA-seq and Mendelian randomization (MR) methodologies, potentially compromising result reliability. To overcome these constraints, we employed a multi-omics integration strategy incorporating expression quantitative trait loci (eQTL)15, genome-wide association studies (GWAS)16 and scRNA-seq data17, combined with machine learning algorithms, to delineate immune cell-specific diagnostic markers and molecular subgroups in periodontitis. MR represents an innovative analytical framework that utilizes genetic variants as instrumental variables to infer causal relationships, effectively mimicking randomized controlled trials. This approach allows for comprehensive cellular classification and functional annotation at single-cell precision, enabling the identification of biologically relevant pathways that link transcriptional profiles with clinical phenotypes18.

To investigate the causal relationship between hub genes and immune cells in periodontitis, we conducted an integrated analysis combining single-cell sequencing with MR methodology19. The study employed a two-sample MR design to systematically evaluate immune cell enrichment patterns in periodontitis using single-cell sequencing data, while simultaneously applying MR techniques to identify genetic variants significantly associated with disease pathogenesis. Through this comprehensive approach, we established robust associations between key hub genes and specific immune cell populations, subsequently validating their causal role in periodontitis development using Bayesian MR modeling frameworks.

Materials and methods

Data collection

1) Two gene expression datasets (GSE174609, ref. 20; GSE156993, ref. 21) were acquired from the Gene Expression Omnibus (GEO) repository, a publicly accessible genomic data archive maintained by NCBI. These datasets were selected based on their comparable tissue types and quality metrics in periodontitis research. GSE174609 comprises scRNA-seq data from eight clinical specimens, equally distributed between healthy controls and periodontitis cases (Supplementary Table S1). GSE156993 contains microarray expression profiles from 12 individuals (6 controls and 6 periodontitis patients) generated using the Affymetrix GPL570 platform (Supplementary Table S2).

Both datasets underwent comprehensive bioinformatic processing including:

  • Quality control assessment.

  • Data normalization.

  • Batch effect correction.

  • Dataset integration.

  • Principal component analysis.

  • Cell population clustering.

  • UMAP-based dimensionality reduction.

All analyses were performed using specialized R packages. The experimental group classifications strictly adhered to the original study protocols, with complete sample metadata available through the GEO accession numbers [GSE174609, GSE156993].

2) The exposure datasets were obtained from the eQTLGen consortium (https://www.eqtlgen.org), an extensive research initiative dedicated to exploring the genetic basis of complex traits through blood transcriptome analysis. During its ongoing second phase, this collaborative project primarily conducts genome-wide association meta-analyses using peripheral blood samples, having completed 86 large-scale GWAS meta-analyses in blood-derived gene expression data.

3) The FinnGen consortium represents a large-scale genomic initiative specifically designed to investigate population-specific genetic architecture and disease susceptibility within European cohorts. This comprehensive biobank integrates extensive genotypic and phenotypic data from diverse geographical regions, facilitating robust analyses of genotype-phenotype associations and their implications for complex disorders21.

Particularly noteworthy is the project’s emphasis on translating genetic discoveries into clinical applications, with special relevance to precision medicine approaches and population health strategies. The database serves as an essential resource for researchers investigating the molecular mechanisms underlying disease pathogenesis and genetic predisposition.

Regarding the periodontal disease phenotype (finn-b-K11_PERIODON_CHRON), the current release comprises 3,046 confirmed cases matched against 195,395 population controls, providing substantial statistical power for association studies. This carefully curated dataset enables rigorous examination of genetic risk factors contributing to chronic periodontitis susceptibility within the studied population.

Single-cell RNA sequencing analysis

The single-cell RNA sequencing data underwent rigorous quality control and preprocessing using the Seurat computational framework. Initial data processing involved the elimination of low-quality cells and potential doublets through the implementation of DoubletFinder. Cell filtering criteria were established based on UMI counts and mitochondrial gene content, with threshold values determined by robust statistical measures. Specifically, cells were retained when exhibiting nFeature_RNA values between 200 and 3329.985, percent.mt below 17.24422, and nCount_RNA under 12151.96, calculated as three median absolute deviations from the median.
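The threshold rule described above (median plus three median absolute deviations) can be illustrated with a minimal sketch. This is not the pipeline used in the study, which ran in Seurat; the toy cells, the helper names `mad`, `upper_bound`, and `keep_cell`, and the use of the 1.4826 scaling constant (the default of R's `mad()`) are all assumptions made for illustration.

```python
from statistics import median

def mad(values, scale=1.4826):
    """Median absolute deviation. scale=1.4826 mirrors the default of R's mad();
    whether the study used the scaled form is an assumption."""
    m = median(values)
    return scale * median([abs(v - m) for v in values])

def upper_bound(values, n_mads=3.0):
    """Upper cutoff: median plus n_mads median absolute deviations."""
    return median(values) + n_mads * mad(values)

def keep_cell(cell, bounds, min_features=200):
    """Retain a cell only if every QC metric lies inside its bounds."""
    return (min_features < cell["nFeature_RNA"] < bounds["nFeature_RNA"]
            and cell["nCount_RNA"] < bounds["nCount_RNA"]
            and cell["percent.mt"] < bounds["percent.mt"])

# Hypothetical per-cell QC metrics (not taken from GSE174609).
cells = [
    {"id": "c1", "nFeature_RNA": 500, "nCount_RNA": 1500, "percent.mt": 3.0},
    {"id": "c2", "nFeature_RNA": 800, "nCount_RNA": 2500, "percent.mt": 4.0},
    {"id": "c3", "nFeature_RNA": 1000, "nCount_RNA": 3000, "percent.mt": 5.0},
    {"id": "c4", "nFeature_RNA": 1200, "nCount_RNA": 4000, "percent.mt": 6.0},
    {"id": "c5", "nFeature_RNA": 1500, "nCount_RNA": 5000, "percent.mt": 7.0},
    {"id": "c6", "nFeature_RNA": 9000, "nCount_RNA": 60000, "percent.mt": 5.5},
    {"id": "c7", "nFeature_RNA": 150, "nCount_RNA": 300, "percent.mt": 40.0},
]

metrics = ("nFeature_RNA", "nCount_RNA", "percent.mt")
bounds = {m: upper_bound([c[m] for c in cells]) for m in metrics}
kept = [c for c in cells if keep_cell(c, bounds)]
```

Because the bounds are derived from the data rather than fixed in advance, the non-round cutoffs reported above (for example 3329.985 for nFeature_RNA) are what one would expect from a rule of this kind.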

Subsequent analytical steps included data normalization and standardization, followed by principal component analysis for dimensionality reduction. Cellular heterogeneity was visualized through uniform manifold approximation and projection (UMAP), enabling clear delineation of distinct cell populations. Clusters (identified at resolution 1.2; RNA_snn_res.1.2) were annotated by integrating evidence from published literature and the CellMarker database, with particular emphasis on cell populations associated with periodontal disease pathogenesis.

For differential gene expression analysis, the FindMarkers algorithm was utilized to systematically compare transcriptional profiles across immune cell subpopulations. This analytical approach facilitates the identification of statistically significant gene expression differences between experimental groups or cellular subtypes.

Mendelian randomization analysis

The causal associations between eQTLs and disease phenotypes were derived from carefully filtered outcome identifiers. Genome-wide significant single nucleotide polymorphisms (SNPs) demonstrating robust associations with target genes (genome-wide significance threshold P < 1 × 10^−8) were systematically identified as candidate instrumental variables (IVs). To ensure independence among genetic variants, stringent linkage disequilibrium (LD) criteria were applied (R² < 0.001 within a 10,000 kb window).

Multiple MR approaches were employed to enhance the robustness of causal inference: (a) The inverse-variance weighted (IVW) method provided primary effect estimates by meta-analyzing SNP-specific Wald ratios; (b) MR-Egger regression accounted for potential pleiotropy under the Instrument Strength Independent of Direct Effect (InSIDE) assumption; (c) The weighted median estimator maintained consistency when up to 50% of instruments were invalid; (d) The weighted mode approach offered superior causal detection with minimized bias and reduced false positive rates compared to MR-Egger.

For analyses limited to single-SNP instruments, the Wald ratio method was utilized to evaluate causal effects and quantify the collective influence of both cis- and trans-acting gene expression in peripheral blood on periodontitis pathogenesis. Resultant causal estimates underwent rigorous validation through heterogeneity assessment (Cochran’s Q test) and examination of horizontal pleiotropy.
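As a rough illustration of how the Wald ratio and the IVW estimate relate, the sketch below computes per-SNP Wald ratios and combines them with fixed-effect inverse-variance weights. The SNP summary statistics are invented, the function names are hypothetical, and the first-order standard error (ignoring uncertainty in the SNP-exposure estimate) is a common simplification rather than a description of the software used in this study.

```python
import math

def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Per-SNP causal estimate: SNP-outcome effect divided by SNP-exposure effect.
    The standard error uses the first-order approximation se_outcome / |beta_exposure|."""
    return beta_outcome / beta_exposure, se_outcome / abs(beta_exposure)

def ivw(snps):
    """Fixed-effect inverse-variance weighted mean of the per-SNP Wald ratios."""
    ratios = [wald_ratio(*snp) for snp in snps]
    weights = [1.0 / se ** 2 for _, se in ratios]
    estimate = sum(w * r for w, (r, _) in zip(weights, ratios)) / sum(weights)
    return estimate, math.sqrt(1.0 / sum(weights))

# Hypothetical instruments: (beta_exposure, beta_outcome, se_outcome).
snps = [
    (0.10, 0.020, 0.010),
    (0.20, 0.030, 0.010),
    (0.15, 0.030, 0.015),
]

beta_ivw, se_ivw = ivw(snps)
odds_ratio = math.exp(beta_ivw)  # outcome effects are on the log-odds scale

# With a single instrument, IVW reduces to the Wald ratio.
single_est, single_se = ivw(snps[:1])
```

The last two lines show why the Wald ratio is the natural fallback for genes with only one valid instrument: the weighted mean of one ratio is the ratio itself.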

Heterogeneity test analysis

Our investigation implemented heterogeneity analysis to statistically assess potential variation among the examined SNPs. Across the genetic variants, we calculated Cochran's Q statistic as the inverse-variance-weighted sum of squared deviations of each SNP-specific effect estimate from the pooled estimate. Under the null hypothesis of homogeneity, Q follows a chi-square distribution with degrees of freedom equal to the SNP count minus one. A resulting p-value greater than 0.05 suggested no significant heterogeneity in SNP effects, implying consistent genetic influences on disease susceptibility across all analyzed variants.
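A minimal sketch of Cochran's Q test is given below, assuming per-SNP causal estimates and standard errors are already available. The input values are invented, and `chi2_sf` is a small helper written for this example (a series expansion of the regularized incomplete gamma function) so that the block needs only the standard library; in practice a statistics package would supply the chi-square tail probability.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable, computed from the
    series expansion of the regularized lower incomplete gamma function."""
    if x <= 0:
        return 1.0
    a, h = df / 2.0, x / 2.0
    term = total = 1.0 / a
    n = 0
    while term > total * 1e-15:
        n += 1
        term *= h / (a + n)
        total += term
    lower = total * math.exp(-h + a * math.log(h) - math.lgamma(a))
    return max(0.0, 1.0 - lower)

def cochran_q(estimates, std_errors):
    """Return (Q, degrees of freedom, p-value) for per-SNP causal estimates."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))
    df = len(estimates) - 1
    return q, df, chi2_sf(q, df)

# Hypothetical per-SNP Wald ratios and their standard errors.
estimates = [0.20, 0.15, 0.20]
std_errors = [0.10, 0.05, 0.10]

q, df, p_value = cochran_q(estimates, std_errors)
heterogeneous = p_value < 0.05
```

In this toy case the estimates agree closely relative to their standard errors, so Q is small and the test gives no indication of heterogeneity.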

We performed comprehensive MR sensitivity testing using a leave-one-out approach to evaluate individual SNP contributions to periodontitis risk. This systematic procedure involved sequentially eliminating each variant while recomputing the combined effect estimate for the remaining SNPs, facilitating detection and exclusion of influential outliers. During each iteration, we generated adjusted effect estimates with 95% confidence intervals (CI) to assess the specific impact of the removed SNP. The complete set of leave-one-out results was then collectively analyzed and contrasted with the primary estimate obtained from all SNPs. This comparative methodology provided rigorous assessment of our MR findings’ stability by examining whether any single SNP exclusion meaningfully changed the overall effect size.
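The leave-one-out procedure can be sketched as follows, assuming per-SNP Wald ratios and standard errors as inputs and a fixed-effect IVW estimator. The numbers and function names are illustrative only and do not reproduce the study's results.

```python
import math

def ivw(ratios):
    """Fixed-effect inverse-variance weighted estimate from (estimate, se) pairs."""
    weights = [1.0 / se ** 2 for _, se in ratios]
    estimate = sum(w * b for w, (b, _) in zip(weights, ratios)) / sum(weights)
    return estimate, math.sqrt(1.0 / sum(weights))

def leave_one_out(ratios, z=1.96):
    """Recompute the IVW estimate with each SNP removed in turn.
    Returns one (dropped_index, estimate, ci_low, ci_high) tuple per SNP."""
    results = []
    for i in range(len(ratios)):
        subset = ratios[:i] + ratios[i + 1:]
        est, se = ivw(subset)
        results.append((i, est, est - z * se, est + z * se))
    return results

# Hypothetical per-SNP Wald ratios: (estimate, standard error).
ratios = [(0.20, 0.10), (0.15, 0.05), (0.20, 0.10)]

full_estimate, _ = ivw(ratios)
loo = leave_one_out(ratios)
# Largest shift in the pooled estimate caused by removing any single SNP.
max_shift = max(abs(est - full_estimate) for _, est, _, _ in loo)
```

A small `max_shift` relative to the confidence interval width is the pattern that would support the stability claim; a single SNP producing a large shift would flag it as an influential outlier.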

GSEA enrichment analysis

To stratify the study cohort, participants were divided into subgroups based on elevated versus diminished gene expression levels. Pathway enrichment disparities between these subgroups were systematically analyzed through Gene Set Enrichment Analysis (GSEA). The reference gene collection, containing subtype-specific pathway annotations, was obtained from the MSigDB repository. Comparative pathway expression profiling across subtypes was performed, with statistically significant gene sets (adjusted p-value < 0.05) being identified through consistency scoring. This analytical approach is widely employed in biomedical investigations to establish robust correlations between disease phenotypes and their underlying molecular mechanisms.

Gene set variation analysis (GSVA)

GSVA represents a non-parametric computational approach that assesses pathway enrichment patterns in transcriptomic data without requiring predefined sample classifications. This methodology converts individual gene expression profiles into comprehensive pathway activity scores through statistical transformation, thereby facilitating the interpretation of biological processes at the systems level. For the current investigation, curated gene sets were obtained from the Molecular Signatures Database (MSigDB), followed by implementation of the GSVA framework to quantify pathway activity and perform comparative analyses across experimental conditions.

Regulatory network analysis of key genes

The R package ‘RcisTarget’ was employed to identify potential transcription factors through comprehensive motif enrichment analysis. For each regulatory motif, normalized enrichment scores (NES) were computed by considering the complete repertoire of motifs available in the reference database. Beyond the primary motif annotations provided in the original dataset, additional regulatory associations were established through comparative analyses of motif sequence homology and corresponding gene sequences. The analytical pipeline commenced with the computation of area under the curve (AUC) values for all possible motif-gene combinations, utilizing recovery curve methodology that evaluates the positional ranking of gene sets relative to specific motifs. Subsequently, NES values were determined by normalizing the observed AUC values against the background distribution of all motifs present in the target gene set.
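The normalization step described above amounts to standardizing each motif's AUC against the AUC distribution of all motifs for the same gene set. The sketch below illustrates that arithmetic only. The motif names and AUC values are invented, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev

def normalized_enrichment_scores(auc_by_motif):
    """NES = (AUC - mean AUC) / SD of AUC, taken across all motifs
    scored for one gene set (sample SD assumed)."""
    values = list(auc_by_motif.values())
    mu, sd = mean(values), stdev(values)
    return {motif: (auc - mu) / sd for motif, auc in auc_by_motif.items()}

# Hypothetical recovery-curve AUCs for four motifs.
auc_by_motif = {
    "motif_a": 0.02,
    "motif_b": 0.03,
    "motif_c": 0.04,
    "motif_d": 0.11,
}

nes = normalized_enrichment_scores(auc_by_motif)
top_motif = max(nes, key=nes.get)
```

With only four motifs the scores are necessarily small; a real motif collection contains thousands of entries, which is what allows a strongly enriched motif to reach a high NES.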

Immune infiltration analysis

Single-sample Gene Set Enrichment Analysis (ssGSEA) represents a robust computational approach for characterizing immune cell composition within tissue microenvironments. This method enables comprehensive profiling of 29 distinct human immune cell populations, encompassing T lymphocytes, B lymphocytes, and natural killer cells. Our investigation employed ssGSEA to systematically quantify immune cell infiltration levels from transcriptomic data, followed by rigorous correlation analyses examining relationships between specific gene expression patterns and immune cell abundance.

Ligand-receptor interaction analysis (CellChat)

The CellChat computational framework was utilized to systematically reconstruct cellular communication networks by integrating both intracellular signaling cascades and intercellular molecular interactions. This bioinformatics tool employs a knowledge-based approach, leveraging curated ligand-receptor-transcription factor (L-R-TF) regulatory axes derived from the KEGG pathway database22,23,24. Through comprehensive analysis of known L-R-TF interactions, the methodology facilitates the prediction of intercellular communication patterns by examining specific molecular pairs and their subsequent signaling cascades.

Developmental trajectories of key cell subtypes

Single-cell transcriptomic investigations provide unprecedented resolution for elucidating intricate biological mechanisms and transcriptional dynamics within heterogeneous cellular populations. Such analyses have successfully identified molecular signatures that define distinct cellular subtypes, biomarkers indicative of transitional phases during biological processes, and genetic markers associated with cell fate determination. A characteristic feature of single-cell gene expression profiles is their inherent temporal asynchrony, where each cell captures a snapshot of transcriptional activity at a discrete time point. The Monocle algorithm employs an innovative pseudotemporal ordering approach that capitalizes on this expression heterogeneity to reconstruct developmental trajectories, thereby enabling the temporal mapping of cellular transitions during processes like differentiation.

Statistical analysis

The validity of MR studies is contingent upon satisfying three fundamental criteria: (1) the relevance assumption, requiring that genetic variants be robustly associated with the exposure; (2) the independence assumption, requiring that the instrumental variables be unrelated to confounders of the exposure-outcome relationship; and (3) the exclusion restriction, stipulating that genetic instruments influence the outcome exclusively via the exposure. Any alternative biological pathway through which the instruments affect the outcome would constitute horizontal pleiotropy. Statistical computations were performed in R (v4.3.2), adopting a significance threshold of p < 0.05 for all analytical procedures.

Results

Single cell expression profile data

The present study analyzed gene expression patterns across eight periodontal tissue samples involved in remodeling processes (Supplementary Fig. 1A). Cellular data processing was performed using Seurat, implementing stringent quality control measures. Cells were filtered according to multiple parameters: unique molecular identifier (UMI) counts per cell, detected gene numbers, and mitochondrial read proportions. Outlier cells were systematically removed when exceeding three median absolute deviations from the median value.

Dimensionality reduction via UMAP revealed 18 well-defined cellular subpopulations within the periodontitis dataset (Fig. 1A). Quality filtering was guided by violin and scatter plots, and cells were retained when they met the following criteria: nFeature_RNA > 200, and mitochondrial percentage, gene counts, total RNA counts, and ribosomal percentage each within three median absolute deviations (3 MAD) of the median (Supplementary Figs. 1B and 1C). Potential doublets were identified and removed using DoubletFinder, yielding a final dataset comprising 66,209 high-quality cells for subsequent analyses.

Notably, we identified the top 10 most variable genes across the dataset, including biologically significant markers such as PPBP, HBB, IGKC, IGLC2, IGLC3, HBA2, HBA1, S100A9, S100A8, and LYPD2 (Fig. 1E, Supplementary Fig. 1D). These molecular signatures represent key players in periodontal tissue dynamics.

Cell subpopulation annotation for single cell data

To elucidate the immunological landscape of periodontitis, we conducted a comprehensive transcriptomic profiling of immune cell subpopulations. Cellular subsets were systematically classified according to their unique molecular signatures. Initial dimensionality reduction via principal component analysis (PCA) revealed significant batch effects among samples (Supplementary Fig. 1D), which were effectively mitigated through Harmony integration (Supplementary Fig. 1E). The ElbowPlot method identified 12 principal components as optimal for downstream analysis (Supplementary Fig. 1F). Subsequent UMAP clustering resolved 23 distinct cellular clusters, which were biologically annotated to encompass various immune lineages including CD4+ T cells (both naive and mature), CD8+ T cells (naive and effector), B cell populations (naive and differentiated), natural killer cells, monocytic cells, dendritic cells, and hematopoietic progenitors (Fig. 1B).

Visualization of cellular phenotypes was achieved through bubble plot representations, illustrating both canonical markers for the 10 major immune cell categories (Fig. 1C) and their relative abundance in diseased versus healthy specimens (Fig. 1D). Differential gene expression analysis was performed for each immune subset, with significant markers identified using stringent statistical thresholds (absolute log2 fold change > 1.5, adjusted p-value < 0.05) via the FindMarkers algorithm (Fig. 1E). The complete catalog of differentially expressed genes is provided in the supplementary dataset (Output_markers.csv).
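The marker thresholds stated above translate into a simple two-condition filter. The sketch below applies it to a few invented rows. The column names `avg_log2FC` and `p_val_adj` follow Seurat's usual FindMarkers output, but their presence in Output_markers.csv is an assumption, as are the example values.

```python
def filter_markers(rows, min_abs_log2fc=1.5, max_padj=0.05):
    """Keep rows whose absolute log2 fold change and adjusted p-value
    both pass the stated thresholds."""
    return [
        row for row in rows
        if abs(row["avg_log2FC"]) > min_abs_log2fc and row["p_val_adj"] < max_padj
    ]

# Hypothetical marker table rows (values invented for illustration).
rows = [
    {"gene": "GENE_A", "avg_log2FC": 2.10, "p_val_adj": 1e-12},
    {"gene": "GENE_B", "avg_log2FC": -1.80, "p_val_adj": 3e-4},
    {"gene": "GENE_C", "avg_log2FC": 1.20, "p_val_adj": 1e-8},   # fold change too small
    {"gene": "GENE_D", "avg_log2FC": 2.50, "p_val_adj": 0.20},   # not significant
]

significant = filter_markers(rows)
up = [r["gene"] for r in significant if r["avg_log2FC"] > 0]
down = [r["gene"] for r in significant if r["avg_log2FC"] < 0]
```

Taking the absolute value keeps both up- and down-regulated genes, which are then separated by the sign of the fold change.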

Fig. 1

scRNA-seq atlas of cell clusters and DEG analysis. (A) and (B) UMAP plots based on the clustering of immune cells in the periodontitis single-cell dataset. (C) Bubble plot of ligand-receptor interactions between immune cells and DEGs. (D) Intersections between the 10 immune molecular subtypes and the cell percent ratio in the control group and the disease group. (E) Differences in the expression and distribution of immune cell-associated genes.

Mendelian randomization analysis

MR has emerged as a powerful epidemiological method for causal inference, employing genetic variants, particularly SNPs, as IVs to investigate potential causal relationships between exposures and outcomes17. In our investigation, we leveraged SNPs associated with immune cell subpopulations as genetic instruments to examine their putative causal effects on periodontitis pathogenesis. To ensure methodological rigor, we implemented four complementary MR approaches: the IVW method as our primary analysis, supplemented by MR-Egger regression, weighted median estimation, and MR-PRESSO to assess and account for potential pleiotropy. Notably, all analytical approaches yielded concordant effect estimates, reinforcing the validity of our findings.

For genetic association analysis, we utilized comprehensive summary statistics derived from the FinnGen consortium (finn-b-K11_PERIODON_CHRON), comprising 198,441 individuals (195,395 controls and 3,046 cases). Our integrative MR approach identified seven genes demonstrating statistically significant associations with periodontitis susceptibility (IVW p-value < 0.05), including both risk-enhancing and protective genetic factors: ANXA1, ARL4C, CD79B, LRRC25, NKG7, SLC11A1, and VIM (Fig. 2A-G). These findings were further supported by sensitivity analyses and detailed in Supplementary Data 1 and Figure S2.

The effect estimates for each gene are as follows:

  • NKG7 (OR = 0.821; 95% CI: 0.711–0.947; p = 0.007),

  • ARL4C (OR = 0.861; 95% CI: 0.759–0.978; p = 0.021),

  • CD79B (OR = 0.791; 95% CI: 0.634–0.985; p = 0.036),

  • VIM (OR = 0.839; 95% CI: 0.706–0.998; p = 0.047),

  • LRRC25 (OR = 0.914; 95% CI: 0.837–0.999; p = 0.046).

These genes are potentially associated with a lower risk of periodontitis. In contrast, SLC11A1 (OR = 1.131; 95% CI: 1.036–1.234; p = 0.006) and ANXA1 (OR = 1.126; 95% CI: 1.037–1.222; p = 0.005) were found to be associated with an increased risk of periodontitis. Sensitivity analyses were conducted to evaluate the reliability of these causal relationships. The results demonstrated that excluding any single SNP did not significantly affect the overall effect estimates, confirming the robustness of the causal relationships identified for the seven genes (Fig. 3A–G). The validation cohort revealed that the SLC11A1 gene was statistically significant, which was consistent with the MR results (Supplementary Data 1).
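As a reading aid for the odds ratios reported above, the following sketch shows how an odds ratio, its 95% confidence interval, and a two-sided p-value relate under a normal approximation on the log-odds scale. It takes the published OR and CI values as inputs and back-calculates the standard error. This is a consistency check written for illustration, not the computation performed in the study, and the function names are hypothetical.

```python
import math

def se_from_ci(ci_low, ci_high, z=1.96):
    """Standard error of the log odds ratio implied by a 95% confidence interval."""
    return (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)

def two_sided_p(odds_ratio, ci_low, ci_high):
    """Two-sided p-value from a Wald z-test on the log odds ratio."""
    z = math.log(odds_ratio) / se_from_ci(ci_low, ci_high)
    return math.erfc(abs(z) / math.sqrt(2.0))

# (OR, lower 95% CI, upper 95% CI) as reported in the text.
reported = {
    "NKG7": (0.821, 0.711, 0.947),
    "ARL4C": (0.861, 0.759, 0.978),
    "CD79B": (0.791, 0.634, 0.985),
    "VIM": (0.839, 0.706, 0.998),
    "LRRC25": (0.914, 0.837, 0.999),
    "SLC11A1": (1.131, 1.036, 1.234),
    "ANXA1": (1.126, 1.037, 1.222),
}

implied_p = {gene: two_sided_p(*vals) for gene, vals in reported.items()}
protective = sorted(g for g, (o, _, _) in reported.items() if o < 1.0)
```

The implied p-values can be compared with the reported ones as a quick sanity check; small discrepancies are expected because the published figures are rounded to three decimal places.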

Overall, based on the above findings, the MR analysis highlighted ANXA1, ARL4C, CD79B, LRRC25, NKG7, SLC11A1, and VIM as key genes with causal roles in periodontitis. These findings underscore the importance of these hub genes in the immune response and their potential as therapeutic targets for periodontitis.

Fig. 2

The scatter plots demonstrate the associations between the seven candidate genes and periodontitis. (A) The causal effect of the ANXA1 gene on periodontitis is illustrated using four Mendelian randomization (MR) methods. (B) The causal effect of the ARL4C gene on periodontitis is shown through four MR methods. (C) The causal effect of the CD79B gene on periodontitis is depicted using four MR methods. (D) The causal effect of the LRRC25 gene on periodontitis is presented through four MR methods. (E) The causal effect of the NKG7 gene on periodontitis is displayed using four MR methods. (F) The causal effect of the SLC11A1 gene on periodontitis is illustrated using four MR methods. (G) The causal effect of the VIM gene on periodontitis is shown using four MR methods.

Fig. 3

Forest plots showing the associations between the seven candidate genes and periodontitis. (A) Forest plot showing the positive correlation between the ANXA1 gene and periodontitis. (B) Forest plot showing the negative correlation between the ARL4C gene and periodontitis. (C) Forest plot showing the negative correlation between the CD79B gene and periodontitis. (D) Forest plot showing the negative correlation between the LRRC25 gene and periodontitis. (E) Forest plot showing the negative correlation between the NKG7 gene and periodontitis. (F) Forest plot showing the positive correlation between the SLC11A1 gene and periodontitis. (G) Forest plot showing the negative correlation between the VIM gene and periodontitis.

Signaling pathways involved in key genes

To elucidate the molecular mechanisms underlying periodontitis pathogenesis, we conducted a comprehensive analysis of signaling pathways associated with the seven core hub genes. Pathway enrichment analysis using GSEA methodology revealed the most statistically significant biological pathways involved in disease progression.

  • For ANXA1, the enriched pathways include Antigen Processing and Presentation, NOD-like Receptor Signaling Pathway, and Staphylococcus Aureus Infection, among others (Fig. 4A).

  • ARL4C is enriched in the FoxO Signaling Pathway, Rap1 Signaling Pathway, Ras Signaling Pathway, and other relevant pathways (Fig. 4B).

  • The enriched pathways for CD79B include the B-cell Receptor Signaling Pathway, Herpes Simplex Virus 1 Infection, and Primary Immunodeficiency (Fig. 4C).

  • LRRC25 is associated with the B-cell Receptor Signaling Pathway, HIF-1 Signaling Pathway, and MAPK Signaling Pathways (Fig. 4D).

  • NKG7 is enriched in pathways such as the FoxO Signaling Pathway, Rap1 Signaling Pathway, and Ras Signaling Pathway (Fig. 4E).

  • SLC11A1 shows enrichment in the Adipocytokine Signaling Pathway, MAPK Signaling Pathway, and NOD-like Receptor Signaling Pathway (Fig. 4F).

  • For VIM, the enriched pathways include the B-cell Receptor Signaling Pathway, Legionellosis, and Neutrophil Extracellular Trap Formation (Fig. 4G).

Our investigation extended to examining eQTL data encompassing 33,538 genes derived from 3,046 clinical specimens, with the objective of pinpointing eQTLs and SNPs associated with periodontitis. The seven central hub genes demonstrated statistically significant differential expression patterns, as evidenced by F-statistic values surpassing the threshold of 10, thereby establishing their robust correlation with periodontitis-associated molecular pathways. Concurrently, we implemented GSVA methodology to evaluate the degree of pathway enrichment, utilizing the expression profiles of these pivotal hub genes as the analytical foundation.

  • Highly expressed ANXA1 was enriched in pathways such as PI3K/AKT/mTOR Signaling, UV Response DN, and Heme Metabolism (Fig. 5A).

  • ARL4C was enriched in mTORC1 Signaling, Androgen Response, and Heme Metabolism (Fig. 5B).

  • CD79B was enriched in Hedgehog Signaling, Pancreas Beta Cells, and IL2/STAT5 Signaling (Fig. 5C).

  • LRRC25 was linked to Hypoxia, Apoptosis, and the Reactive Oxygen Species (ROS) Pathway (Fig. 5D).

  • NKG7 was associated with mTORC1 Signaling, Androgen Response, and Heme Metabolism (Fig. 5E).

  • SLC11A1 was enriched in the ROS Pathway, IL6/JAK/STAT3 Signaling, and UV Response UP (Fig. 5F).

  • VIM was enriched in Hedgehog Signaling, KRAS Signaling UP, and the ROS Pathway (Fig. 5G).

These results indicate that the advancement of periodontal disease appears to be modulated through the stimulation of critical molecular pathways, facilitated by the core genes discovered. This investigation sheds light on the mechanistic involvement of immune regulation and fundamental biological processes in disease pathogenesis, while simultaneously revealing promising targets for clinical intervention.

Fig. 4

GSEA of the interactions between DEGs and signaling pathways in the periodontitis dataset. (A) Enrichment scores and zoomed-in Circos plot showing the pathways enriched for ANXA1. (B) Enrichment scores and zoomed-in Circos plot showing the pathways enriched for ARL4C. (C) Enrichment scores and zoomed-in Circos plot showing the pathways enriched for CD79B. (D) Enrichment scores and zoomed-in Circos plot showing the pathways enriched for LRRC25. (E) Enrichment scores and zoomed-in Circos plot showing the pathways enriched for NKG7. (F) Enrichment scores and zoomed-in Circos plot showing the pathways enriched for SLC11A1. (G) Enrichment scores and zoomed-in Circos plot showing the pathways enriched for VIM.

Fig. 5

GSVA analysis of the correlation between DEGs and signaling pathways in periodontitis. (A) GO analysis interaction network of ANXA1 (up- and down-regulated genes) in signaling pathways. (B) GO analysis interaction network of ARL4C (up- and down-regulated genes) in signaling pathways. (C) GO analysis interaction network of CD79B (up- and down-regulated genes) in signaling pathways. (D) GO analysis interaction network of LRRC25 (up- and down-regulated genes) in signaling pathways. (E) GO analysis interaction network of NKG7 (up- and down-regulated genes) in signaling pathways. (F) GO analysis interaction network of SLC11A1 (up- and down-regulated genes) in signaling pathways. (G) GO analysis interaction network of VIM (up- and down-regulated genes) in signaling pathways.

Key gene-related transcriptional regulatory network

Our investigation centered on elucidating the transcriptional control mechanisms of seven pivotal hub genes, with particular emphasis on their co-regulation by common transcription factors. By utilizing these core genes as our analytical framework, we performed comprehensive enrichment analyses to delineate associated transcriptional regulators. The assessment of transcription factor motif enrichment was conducted through cumulative recovery curve methodology, which unveiled substantial regulatory relationships.

The graphical representation in Fig. 6A demonstrates the identified enriched motifs and their corresponding transcription factors, highlighting the intricate regulatory architecture that modulates these essential genes in periodontitis pathogenesis. Through rigorous motif-TF annotation and selection analysis, we determined that cisbp__M5167 exhibited the most pronounced normalized enrichment score (NES = 6.38), as depicted in Fig. 6B. This observation implies that this particular motif serves as a crucial regulatory element, potentially interacting with multiple transcription factors to orchestrate the expression patterns of the seven central genes under investigation.

Fig. 6

The NES of each motif analysis. (A) Enrichment score of motif analysis and correlation gene sequence. (B) The NES of each motif based on the AUC distribution.

Immune infiltration

The periodontal inflammatory niche is characterized by a complex interplay of immunological components, extracellular matrix constituents, signaling molecules, and unique biophysical properties, all of which collectively influence disease pathogenesis, clinical manifestations, and therapeutic outcomes. Our investigation focused on elucidating the molecular pathways through which critical genetic factors mediate periodontitis development by examining their associations with immune cell infiltration patterns. Through comprehensive immune profiling, we quantified the relative abundance of diverse immune cell subsets across patient samples and delineated their complex network interactions (Fig. 7A). Employing the CIBERSORT algorithm, we systematically analyzed correlations among 29 distinct immune cell populations and generated visual representations of their connectivity patterns (Fig. 7B). Notably, our data revealed a marked enrichment of T helper cell populations in periodontitis patients relative to healthy controls (Fig. 7C), suggesting their potential involvement in disease pathogenesis.

Subsequent analysis focused on identifying relationships between pivotal genetic markers and immune cell dynamics. Multiple candidate genes demonstrated significant associations with specific immune subsets. The cytoskeletal protein VIM displayed an inverse relationship with follicular T helper cells (Tfh), whereas the calcium-binding protein ANXA1 showed negative correlation with MHC class I molecules. The metal ion transporter SLC11A1 exhibited dual regulatory patterns, demonstrating positive association with natural killer cells (NK cells) but negative correlation with B lymphocytes. Furthermore, the small GTPase ARL4C and the cytotoxic granule component NKG7 both showed positive correlations with cytotoxic activity, while the leucine-rich repeat protein LRRC25 was positively associated with parainflammatory responses but negatively correlated with B cell populations. The B cell receptor component CD79B demonstrated positive association with immune checkpoint molecules and negative correlation with immature dendritic cells (Fig. 7D). These observations highlight both the specialized and synergistic functions of distinct immune cell subsets in periodontal disease, reinforcing the concept that immune infiltration represents a critical determinant of disease progression. The identified gene-immune cell correlations provide mechanistic insights into periodontal pathogenesis and may inform the development of targeted immunomodulatory therapies.
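The gene-immune associations described above can be illustrated with a rank-based correlation between a gene's expression and an immune cell score across samples. The sketch below uses Spearman correlation as an assumption, since the text does not state which coefficient was used; the function names are hypothetical.

```python
import math
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(gene_expression, immune_score):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(gene_expression), average_ranks(immune_score))
```

A negative coefficient for a gene and a cell type, as reported for VIM and Tfh cells, would indicate that samples with higher expression tend to have lower estimated abundance of that cell type.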

Fig. 7

Immune Infiltration Analysis. (A) Panoramic mapping of the 22 immune cell types across high and low risk score stratifications. (B) Heatmap illustrating the interrelationships among immune cell populations. (C) Differential analysis of immune subpopulation abundance between periodontitis and control groups, using the Wilcoxon matched-pairs signed rank test (P < 0.05 considered significant). (D) Heatmap depicting the performance of seven gene sets across immune cell subpopulations.

Expression profiles of key genes in single-cell data and their co-expression with osteoblast/osteoclast marker genes, IL17, and TGFB

The current research systematically examined the transcriptional profiles of seven pivotal immune-related genes (ANXA1, ARL4C, CD79B, LRRC25, NKG7, SLC11A1, and VIM) within diverse immune cell populations associated with periodontitis pathogenesis. Comprehensive analysis encompassed multiple lymphocyte subsets (CD4+ T cells, naive CD4+ T cells, naive CD8+ T cells, CD8+ T cells, naive B cells, B cells), along with NK cells, monocytes, dendritic cells, and progenitor cells. Distinct expression patterns across these ten immune cell subpopulations were graphically represented (Fig. 8A), demonstrating cell-type-specific distributions that imply specialized functions in periodontal immune homeostasis. Uniform Manifold Approximation and Projection (UMAP) analysis (Fig. 8B) effectively illustrated the spatial organization of immune cells according to their expression profiles of these seven core genes, highlighting their differential involvement during various phases of periodontitis development. Single-cell clustering analysis (Fig. 8C) confirmed the association between these molecular markers and periodontitis-affected immune cell populations, with particularly notable upregulation of ANXA1 and NKG7 in monocytes compared to other cellular subsets.

Our investigation extended to examining potential regulatory interactions between these molecular markers and two critical inflammatory mediators, IL17 and TGF-β (Supplementary Data, Figures S2 and S3). The co-expression analysis showed statistically significant but weak inverse relationships between IL17 and multiple core genes, most prominently ANXA1 (r = −0.021, p = 4.7 × 10^−6), CD79B (r = −0.04, p = 6.7 × 10^−6), and NKG7 (r = −0.017, p = 0.0024), implying potential suppressive effects of IL17 on these genes in quiescent immune cells (Figures S2A-G). In contrast, TGF-β exhibited positive associations with several key genes, including ANXA1 (r = 0.14, p < 2.2 × 10^−16), ARL4C (r = 0.29, p < 2.2 × 10^−16), NKG7 (r = 0.37, p < 2.2 × 10^−16), and SLC11A1 (r = 0.15, p < 2.2 × 10^−16), suggesting its potential role in transcriptional activation within the periodontal inflammatory milieu. The correlations of TGF-β with CD79B (r = −0.024, p = 0.043) and VIM (r = 0.054, p < 2.2 × 10^−16) were much weaker and, for CD79B, negative, pointing to intricate cytokine-gene regulatory networks (Figures S2A-G).
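Several of the coefficients reported above are small in magnitude even where the p-values are very small, which is typical when correlations are computed across many thousands of cells. The sketch below shows how the p-value of a Pearson correlation depends on the number of observations. The cell counts used in the usage note are hypothetical and not taken from this study, and the normal approximation to the t-distribution is an assumption that is reasonable only for large n.

```python
import math


def correlation_p_value(r, n):
    """Two-sided p-value for a Pearson r from n observations.

    Uses t = r * sqrt((n - 2) / (1 - r^2)) and a normal approximation
    to the t-distribution (assumption: n is large).
    """
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return math.erfc(abs(t) / math.sqrt(2.0))


def variance_explained(r):
    """Fraction of variance shared by the two variables (r squared)."""
    return r * r
```

For example, with r = −0.021 the shared variance is about 0.04%. Under this approximation the same coefficient is far from significant with a hypothetical 100 cells but highly significant with a hypothetical 50,000 cells, so the p-value alone does not indicate the strength of the association.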

These results underscore the complexity of immune modulation during periodontitis progression and highlight promising molecular candidates for targeted therapeutic strategies.

Fig. 8

scRNA-seq analysis for osteoblast/osteoclast marker genes. (A) Bubble plot showing the expression patterns of marker genes among different immune cell clusters. (B) UMAP clustering plot of the 7 genes in the periodontitis samples. (C) Proportional plot showing the distribution of the 7 genes across immune cells in periodontitis.

Analysis of receptor-ligand relationship pairs of cell subpopulations from single-cell data and the developmental trajectories of key cell subtypes

Employing CellChat, we systematically investigated intercellular communication dynamics between immune cells and other cellular components within the periodontitis microenvironment, with particular emphasis on receptor-ligand interactions spanning diverse disease-related pathways. Our computational analysis uncovered an intricate interaction network, graphically represented through a bubble chart quantifying ligand-receptor pair frequencies (Fig. 9A). A complementary network visualization illustrated the connectivity patterns among immune cell subpopulations (Fig. 9B). Dendritic cells demonstrated the most extensive interaction profile, showing prominent connectivity with NK cells, monocytes, and CD8+ T lymphocytes. Monocytes similarly exhibited broad interaction networks involving NK cells, dendritic cells, and both CD8+ and CD4+ T cell subsets, highlighting their pivotal position in immune coordination.

To elucidate the ontogeny and specialization of emerging T cell variants, we conducted clustering analyses that resolved three discrete T cell subpopulations. Subsequent computational approaches included cellular similarity assessment and pseudotemporal trajectory reconstruction, enabling visualization of differentiation processes and associated gene expression dynamics throughout developmental stages. The resulting trajectories incorporated pseudotime values, cellular classification, and state-specific markers (denoted by branching patterns), offering a multidimensional perspective on differentiation pathways (Fig. 9C–D).

Through examination of differentially expressed genes at critical branching nodes along developmental trajectories, we detected substantial transcriptional reprogramming events. These molecular transitions were quantitatively represented in a branching heatmap, emphasizing genes exhibiting pronounced expression alterations surrounding bifurcation points (Fig. 9E). Longitudinal tracking of core regulatory gene expression patterns across pseudotemporal progression revealed dynamic transcriptional regulation throughout the complete differentiation continuum (Fig. 9F). Collectively, these findings provide mechanistic insights into both the cellular crosstalk architecture and developmental biology of immune cell subsets participating in periodontitis pathogenesis.

Fig. 9

DRSGs AUCell scores and Subtyping analysis. (A) Interaction of signal pathways and immune cell subpopulations. (B) Ligand-receptor interaction network between immune cell subpopulations. (C) Differentiation trajectory plot of immune cell subpopulations. (D) Differentiation timeline plot of differential immune cells. (E) Heatmap of log interaction scores between immune cell subpopulations. (F) Differentiation trajectory plot of DEGs.

Discussion

Periodontitis represents a chronic inflammatory disorder mediated by microbial dysbiosis, leading to progressive alveolar bone resorption and eventual tooth detachment25,26. Emerging evidence suggests this condition exerts both localized tissue damage and systemic immunomodulatory effects, with particular emphasis on the pivotal involvement of immune cell populations27. Nevertheless, the precise mechanisms underlying systemic immune adaptations following periodontal interventions remain poorly elucidated within the disease’s pathophysiological framework28. To advance current comprehension, our study identified seven pivotal genetic markers linked to etiological factors through comprehensive analysis of extensive periodontitis datasets, facilitating enhanced interpretation of cellular dynamics and network interactions. Subsequent investigations further established significant associations between these core genes and immune cell profiles within the periodontal microenvironment.

Our investigation revealed significant causal relationships and immunological mechanisms involving seven pivotal genes in periodontitis pathogenesis. Through Mendelian randomization analysis encompassing 198,441 cases, we established that seven core genes (ANXA1, ARL4C, CD79B, LRRC25, NKG7, SLC11A1, and VIM) exhibit causal interactions with various immune cell subsets. Notably, B lymphocytes, T lymphocytes, and natural killer cells demonstrated strong positive associations with these key genetic markers. The anti-inflammatory properties of ANXA1 and NKG7 were particularly noteworthy, as these genes appear to modulate inflammatory processes through dual mechanisms: stimulating anti-inflammatory mediator synthesis while suppressing pro-inflammatory cytokine secretion29. Our findings further indicate that ANXA1 functionally collaborates with ARL4C, NKG7, SLC11A1, and VIM to orchestrate immune cell responses by influencing cellular signaling pathways and motility, while simultaneously regulating B cell memory formation and T cell inflammatory activation30,31. Transcriptional analysis combined with Mendelian randomization and gene set enrichment approaches demonstrated that ANXA1 and SLC11A1, despite showing moderate transcriptional activity as indicated by odds ratios, were significantly associated with elevated periodontitis risk. Experimental validation through quantitative PCR and immunohistochemical techniques confirmed these observations. In contrast to prior reports32, our RT-qPCR and immunohistochemistry data consistently revealed upregulated expression of ANXA1 and SLC11A1 in periodontitis-affected tissues (Supplementary Data 3; Figures S4, S5, and S7). These comprehensive experimental results provide robust validation of our initial findings.

Through Mendelian randomization analysis, we systematically evaluated the causal relationship between seven key genes and periodontitis development, employing rigorous causal inference methods. Our findings demonstrated that elevated expression levels of ANXA1 and SLC11A1 significantly increased susceptibility to periodontitis. Functional enrichment analysis revealed these genes participate in critical signaling cascades including the PI3K-AKT-mTOR pathway, UV response downregulation, reactive oxygen species metabolism, and IL6-JAK-STAT3 signaling. Mechanistically, ANXA1 overexpression was found to downregulate MHC-I molecules while simultaneously enhancing CD8+ T cell activity and pro-inflammatory cytokine production (including TNF-α and IFN-γ), ultimately promoting periodontal tissue degradation and alveolar bone resorption33. Similarly, SLC11A1 upregulation stimulated neutrophil activation, resulting in excessive reactive oxygen species generation. These ROS molecules not only exacerbated oxidative stress and endothelial dysfunction but also induced structural alterations in cellular proteins. Furthermore, ROS facilitated neutrophil extracellular trap formation, which significantly contributed to periodontal tissue injury34. Our comprehensive analysis demonstrated that these molecular pathways, coupled with immune cell activation (particularly neutrophils and CD8+ T cells), exhibited strong interactions with the IL-17/TGF-β axis (Supplementary Data 3, Figures S2 and S3). The coordinated activation of these immune mechanisms by the identified hub genes through their associated signaling networks appears to play a pivotal role in exacerbating periodontal tissue destruction and bone loss characteristic of periodontitis pathogenesis.

Mendelian randomization analysis indicated that elevated expression levels of CD79B, ARL4C, NKG7, LRRC25, and VIM exhibited inverse correlations with periodontitis susceptibility, potentially conferring protection through immunomodulatory mechanisms. These findings aligned with our experimental validation using RT-qPCR and immunohistochemistry (Supplementary Data 3; Figures S6 and S7). Functionally, CD79B serves as a crucial element in B cell receptor (BCR) signal transduction, mediating the activation of NF-κB and MAPK cascades to orchestrate B lymphocyte proliferation and differentiation35. Immune profiling demonstrated significant positive associations between B cell infiltration and the presence of NK cells, monocytes, and dendritic cell co-inhibitory markers in periodontal lesions. This immunological evidence implies that B lymphocytes may mitigate autoimmune pathology resulting from hyperactivation of NK cells, monocytes, and T cells during inflammatory responses. Enhanced CD79B expression potentiates BCR signaling while attenuating immunopathological consequences of BCR activation36, collectively dampening excessive immune reactivity in periodontal disease. These mechanistic insights establish CD79B as a protective factor in periodontitis pathogenesis.

Our findings demonstrate that B cells exhibiting elevated CD79B expression levels engage in robust ligand-receptor interactions (including CD40-CD40L and MHC-II-TCR complexes) with dendritic cells and T cells, as revealed by CellChat analysis of immune cell subpopulations. These results highlight the pivotal role of B cells in maintaining immune microenvironment homeostasis through intricate intercellular signaling networks. Single-cell transcriptomic profiling further elucidated the functional involvement of NK cells, monocytes, and dendritic cells in relation to hub genes, complementing the mechanistic understanding derived from Mendelian randomization analysis. The investigation uncovered significant cellular communication patterns within the periodontitis microenvironment, particularly between dendritic cells and NK cells, as well as between monocytes and CD8+ T cells. Notably, monocytes exhibited extensive interaction networks with multiple immune cell types, including dendritic cells, NK cells, and both CD4+ and CD8+ T cell subsets. Molecular characterization revealed that key hub genes (NKG7, ARL4C, LRRC25, and VIM) participate in common signaling cascades such as reactive oxygen species, tumor necrosis factor, and NF-κB pathways. These molecular players appear to modulate immune responses and fibrotic repair processes in periodontitis by influencing neutrophil extracellular trap formation and lymphocyte functionality.

Through comprehensive analysis, we delineated the complex regulatory roles of these hub genes in periodontitis pathogenesis, demonstrating both positive and negative correlations within the disease landscape. The study identifies ANXA1 and SLC11A1 as particularly significant genetic risk factors associated with periodontitis susceptibility. Our integrative approach combining single-cell RNA sequencing with Mendelian randomization provides a powerful framework for establishing causal relationships between candidate genes and disease pathology. These findings offer valuable insights into the immunological underpinnings of periodontitis and suggest potential therapeutic targets for drug repurposing strategies. The identified hub genes appear to influence immune cell infiltration patterns, presenting opportunities for developing targeted interventions based on individual genetic profiles.

This investigation has three key methodological strengths that enhance causal inference. First, we systematically incorporated extensive genomic data on gene expression and immune cell characteristics using a two-sample Mendelian randomization approach, with subsequent validation of potential causal associations in two distinct population cohorts. Second, following current MR-Contamination Mixture protocols, we applied strict quality thresholds, excluding exposures with F-statistics below 10 or with fewer than 3 SNPs, which enabled robust assessment of horizontal pleiotropy via Egger regression and MR-PRESSO analyses. Third, the exclusive use of European-ancestry GWAS datasets combined with geographically matched LD reference panels effectively mitigated population stratification biases and minimized residual confounding from population substructure.
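The instrument thresholds described above (F-statistic of at least 10, at least 3 SNPs per exposure) can be sketched as follows, together with a fixed-effect inverse-variance weighted (IVW) estimate. The per-SNP approximation F = (beta/SE)^2, the choice of IVW as the estimator, and all class and function names are illustrative assumptions rather than the exact pipeline used in this study.

```python
import math
from dataclasses import dataclass


@dataclass
class Instrument:
    beta_exposure: float  # SNP effect on gene expression (eQTL)
    se_exposure: float
    beta_outcome: float   # SNP effect on periodontitis (GWAS, log-odds)
    se_outcome: float


def f_statistic(snp):
    """Approximate per-SNP instrument strength: F = (beta / SE)^2."""
    return (snp.beta_exposure / snp.se_exposure) ** 2


def select_instruments(snps, min_f=10.0, min_snps=3):
    """Keep SNPs with F >= min_f; drop the exposure if too few remain."""
    strong = [s for s in snps if f_statistic(s) >= min_f]
    return strong if len(strong) >= min_snps else []


def ivw_estimate(snps):
    """Fixed-effect IVW estimate: weighted mean of per-SNP Wald ratios."""
    weights = [(s.beta_exposure / s.se_outcome) ** 2 for s in snps]
    ratios = [s.beta_outcome / s.beta_exposure for s in snps]
    beta = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return beta, se
```

Because the outcome effects are on the log-odds scale, exponentiating the IVW estimate gives an odds ratio per unit increase in genetically predicted expression.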

Several limitations warrant consideration in this study. The statistical sensitivity of our findings may be limited by sample size constraints, particularly in the FinnGen dataset. Furthermore, the restricted ethnic diversity in our study populations may affect the external validity of our results across different demographic groups. The biological functions of identified hub genes require further elucidation through larger and more ethnically diverse periodontitis cohorts. Importantly, the precision of exposure assessment significantly influences MR estimates, necessitating additional mechanistic studies to clarify the functional roles of candidate genes in periodontitis pathogenesis.

Conclusion

In summary, our investigation elucidated the immunological underpinnings of periodontitis through comprehensive analysis of immune cell subsets and molecular pathways using single-cell RNA sequencing and Mendelian randomization approaches. The identified candidate genes, especially ANXA1, SLC11A1 and CD79B, demonstrated significant causal relationships with periodontitis susceptibility. Furthermore, our results indicated these key genes exhibit distinctive expression profiles within immune cell networks and signaling cascades during periodontal pathogenesis. Subsequent studies should prioritize: (1) deciphering the precise molecular mechanisms through which these genes contribute to disease progression, (2) assessing their utility as prognostic biomarkers for early risk stratification, and (3) investigating targeted gene modulation strategies to bolster innate immune responses against periodontal pathogens.