To investigate the tumor promotion role of PLOD3 in colorectal cancer and its potential as a prognostic biomarker and therapeutic target

Dong, Wanhui; Li, Su; Tang, Wei; Sun, Qingming; Zhu, Pei; Wang, Mingxing; Xu, Bin

doi:10.1038/s41598-025-89521-z

Published: 13 February 2025

Scientific Reports volume 15, Article number: 5371 (2025)

Abstract

PLOD3 (procollagen-lysine, 2-oxoglutarate 5-dioxygenase 3), a key enzyme involved in collagen post-translational modification, is critical for maintaining the structural integrity of the extracellular matrix (ECM). Dysregulation of PLOD3 has been implicated in various malignancies, including colorectal cancer (CRC). This study aimed to elucidate the role of PLOD3 in CRC and evaluate its potential as a prognostic biomarker and therapeutic target. We conducted a comprehensive analysis utilizing data from The Cancer Genome Atlas (TCGA) and the Gene Expression Omnibus (GEO) to evaluate PLOD3 expression in CRC. Univariate and multivariate Cox regression analyses were conducted to investigate its impact on overall survival. Functional assays, including wound healing, Transwell migration, and invasion assays, were carried out in CRC cell lines with modified PLOD3 expression to elucidate its role in regulating tumor cell behavior. Furthermore, Gene Set Enrichment Analysis (GSEA) was employed to identify signaling pathways associated with PLOD3 expression. Our findings demonstrate that PLOD3 is significantly overexpressed in CRC tissues compared to normal tissues, and its elevated expression is associated with poor prognosis and reduced overall survival. The study also developed a RiskScore model incorporating PLOD3 and 11 other genes, which exhibited strong predictive performance for patient outcomes. Functional experiments confirmed that PLOD3 overexpression enhances CRC cell migration and invasion. GSEA linked high PLOD3 expression to the activation of epithelial-mesenchymal transition (EMT) and metastasis-related pathways. In conclusion, PLOD3 plays a pivotal role in CRC progression by promoting tumor growth and metastasis. Its elevated expression serves as an independent prognostic marker and a potential target for therapeutic intervention, offering new insights into the molecular mechanisms driving CRC.

Introduction

Colorectal cancer (CRC) ranks as the third most common type of cancer globally and is the second leading cause of cancer-related mortality. According to GLOBOCAN 2020 data, there were approximately 1.9 million new cases and 930,000 deaths globally in 2020, with an emerging trend of increased incidence among younger age groups in high-income countries¹, highlighting a growing public health challenge. Current guidelines from the Chinese Society of Clinical Oncology (CSCO) and the National Comprehensive Cancer Network (NCCN) recommend treatment modalities for CRC that include surgical resection, radiotherapy, chemotherapy, and targeted therapies. However, early detection and intervention remain critical for improving patient survival rates². In recent years, advancements in molecular biology technology have facilitated the identification and clinical application of numerous biomarkers, significantly enhancing the diagnosis and treatment of CRC.

The PLOD gene family encodes a group of enzymes involved in the post-translational modification of collagen, with PLOD3 encoding lysyl hydroxylase 3. Aberrant expression of PLOD3 has been observed in various tumor types. Emerging evidence suggests that PLOD3 contributes to tumor development and progression by modulating extracellular matrix (ECM) stability and promoting tumor cell invasiveness³. However, the precise role and underlying mechanisms of PLOD3 in CRC remain poorly understood, and its potential as a biomarker for diagnosis or therapy has yet to be fully explored.

This study examines the expression pattern of PLOD3 in CRC and its association with tumor biological behaviors through a comprehensive analysis of gene expression data and single-cell sequencing information obtained from publicly available databases. Furthermore, a risk prediction model was constructed using machine learning approaches, and the relationship between PLOD3 expression and the immune microenvironment, as well as its potential impact on immunotherapy response, was investigated. Finally, the functional role of PLOD3 in CRC was validated through in vitro experiments, highlighting its influence on the biological characteristics of CRC cells. These findings provide a novel perspective for the early diagnosis and treatment of CRC.

Methods

Single-cell clustering and cell annotation analysis

The single-cell dataset GSE161277⁴ was retrieved from the GEO database (https://www.ncbi.nlm.nih.gov/geo/), and samples from CRC and normal tissues were subsequently selected for analysis. The R package Seurat (v5.1.0)⁵ was used to load and preprocess the single-cell data. The ‘PercentageFeatureSet’ function was employed to calculate the proportion of mitochondrial genes. Cells were retained only if they expressed more than 250 genes, had a mitochondrial content below 20%, contained at least 500 UMIs, and had a ‘log10genesperumi’ value greater than 0.8 (calculated as ‘GenesPerUMI = nFeature_RNA / nCount_RNA’).
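The filtering thresholds described above can be expressed as a simple per-cell predicate. The following Python sketch is illustrative only and is not the authors' Seurat code: the function name, argument names, and example cells are hypothetical. The text gives the complexity metric as a plain ratio, while its name suggests a ratio of log10 values, so the sketch supports both; which one was actually applied is an assumption here.

```python
import math

def passes_qc(n_genes, n_umis, pct_mito, log_scale=True):
    """Return True if a cell meets the QC thresholds quoted in the text.

    log_scale=True uses log10(genes) / log10(UMIs) (assumed from the metric's name);
    log_scale=False uses the plain ratio genes / UMIs as literally written in the text.
    """
    if n_genes <= 0 or n_umis <= 1:
        return False  # guards against log10(0) and division by zero
    if log_scale:
        complexity = math.log10(n_genes) / math.log10(n_umis)
    else:
        complexity = n_genes / n_umis
    return (
        n_genes > 250         # more than 250 expressed genes
        and n_umis >= 500     # at least 500 UMIs
        and pct_mito < 20.0   # mitochondrial content below 20%
        and complexity > 0.8  # complexity threshold
    )

# Hypothetical cells, for illustration only.
cells = [
    {"id": "c1", "n_genes": 1200, "n_umis": 3000, "pct_mito": 5.0},
    {"id": "c2", "n_genes": 200, "n_umis": 800, "pct_mito": 3.0},    # too few genes
    {"id": "c3", "n_genes": 900, "n_umis": 2500, "pct_mito": 35.0},  # too much mitochondrial RNA
    {"id": "c4", "n_genes": 300, "n_umis": 450, "pct_mito": 2.0},    # too few UMIs
]
kept = [c["id"] for c in cells if passes_qc(c["n_genes"], c["n_umis"], c["pct_mito"])]
print(kept)
```

Note that the two readings of the complexity metric are not interchangeable: a cell with 1200 genes and 3000 UMIs passes under the log10 form but fails under the plain ratio (0.4).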

Subsequently, the filtered GSE161277 dataset was log-normalized, and the FindVariableFeatures function was used with default parameters to identify highly variable genes based on the variance-stabilizing transformation (VST). To address batch effects among samples, canonical correlation analysis (CCA) was performed and integration anchors were identified using the FindIntegrationAnchors function.

After completing data integration with the “IntegrateData” function, we applied the “ScaleData” function to scale gene expression across all features. Principal component analysis (PCA) was then used for dimensionality reduction. Subsequently, cell clustering was performed using the “FindNeighbors” and “FindClusters” functions.

Following clustering, we stratified the cell population based on previously established CRC and immune cell markers⁶,⁷,⁸,⁹,¹⁰,¹¹, enabling analysis of the distribution of various cell types within the samples. Marker genes for each cell type were identified using the “FindAllMarkers” function, with gene selection criteria set at FC > 2 and FDR < 0.05. Finally, functional enrichment analysis of these marker genes was performed using the “clusterProfiler” package to reveal their potential biological significance.

Cellular communication and ligand analysis

We utilized the “CellChat” R package (version 1.6.0)¹² to analyze intercellular communication based on the single-cell data and the defined cell classification. The analysis was performed using the “CellChatDB.human” database integrated into the “CellChat” package, ensuring that each cell type contained at least 10 cells.

Expression profile data processing and analysis

We retrieved expression profile data and associated clinical information for colon adenocarcinoma (COAD) from the TCGA database. Additionally, we downloaded the raw data for the datasets GSE17538¹³, GSE33113¹⁴, GSE39582¹⁵, GSE14333¹⁶, and GSE38832¹⁷ from the GEO database and retained only the data generated on the GPL570 platform.

The RMA function (Robust multi-array average expression measure) in the R package Affy (version 1.66.0)¹⁸ was employed to process and normalize the expression profile data, generating a normalized expression dataset. Using the annotation file of the GPL570 platform, probes were mapped to their corresponding gene symbols. For cases where multiple probes were mapped to the same gene symbol, the median expression value was calculated and assigned to the gene symbol. Conversely, probes mapped to multiple gene symbols were excluded from the analysis. Only CRC tumor samples with complete survival time and survival status information were retained for further analysis.
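The probe-to-gene mapping rule described above (median across probes of the same gene, exclusion of probes mapped to several genes) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' R code: the function name, the probe identifiers, and the expression values are hypothetical.

```python
from collections import defaultdict
from statistics import median

def collapse_probes(expr, annotation):
    """Collapse probe-level expression to gene level.

    expr: probe ID -> list of per-sample expression values.
    annotation: probe ID -> list of gene symbols the probe maps to.
    Probes with no symbol or with several symbols are dropped; probes sharing
    one symbol are summarized by the per-sample median.
    """
    rows_by_gene = defaultdict(list)
    for probe, values in expr.items():
        symbols = annotation.get(probe, [])
        if len(symbols) != 1:
            continue  # unannotated or ambiguous probe
        rows_by_gene[symbols[0]].append(values)
    return {
        gene: [median(column) for column in zip(*rows)]
        for gene, rows in rows_by_gene.items()
    }

# Hypothetical probes and values (two samples), for illustration only.
expr = {
    "probe_1": [2.0, 4.0],
    "probe_2": [4.0, 8.0],
    "probe_3": [6.0, 6.0],
    "probe_4": [1.0, 1.0],
    "probe_5": [5.0, 7.0],
}
annotation = {
    "probe_1": ["PLOD3"],
    "probe_2": ["PLOD3"],
    "probe_3": ["PLOD3"],
    "probe_4": ["BGN", "ABL2"],  # maps to two genes, so it is excluded
    "probe_5": ["BGN"],
}
collapsed = collapse_probes(expr, annotation)
print(collapsed)
```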

The six datasets (TCGA-COAD, GSE17538, GSE33113, GSE39582, GSE14333, and GSE38832) were integrated into a single dataset, hereafter referred to as Meta. Batch effects across datasets were corrected using the ComBat function from the sva package. Subsequently, we applied the single-sample gene set enrichment analysis (ssGSEA) method to compute enrichment scores for the expression data, utilizing marker genes derived from the single-cell analysis.

Building prognostic models and evaluating their prognostic performance

We performed univariate Cox regression analysis on fibroblast marker genes in the TCGA dataset, applying a p-value threshold of < 0.01 to identify genes associated with survival. Based on the identified genes, we extracted their expression profiles from multiple independent datasets, including TCGA, GSE39582, GSE17538, GSE33113, GSE14333, and GSE38832. The TCGA dataset was designated as the training set, while the remaining five datasets (GSE39582, GSE17538, GSE33113, GSE14333, GSE38832) served as independent validation sets.

To screen for the optimal model, we applied 36 different methods¹⁹ and subsequently conducted multivariate Cox regression analysis on the selected genes to determine the risk coefficient for each gene, and the results were visualized for comparison. Additionally, we computed the risk score for each sample based on gene expression levels and plotted Kaplan-Meier (KM) survival curves to evaluate survival differences.
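For readers unfamiliar with the Kaplan-Meier curves mentioned above, the product-limit estimator behind them can be sketched in a few lines. This Python sketch is illustrative and is not the authors' code: the function name and the follow-up data are hypothetical, and the study presumably used established R survival tooling rather than a hand-written estimator.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times: follow-up time per subject.
    events: 1 if the event (death) was observed, 0 if the subject was censored.
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving  # deaths and censored subjects both leave the risk set
    return curve

# Hypothetical follow-up data, for illustration only.
curve = kaplan_meier([5, 8, 8, 12, 15], [1, 1, 0, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```

In a risk-model setting, one such curve would be computed per risk group and the groups compared, typically with a log-rank test.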

In the TCGA-COAD, GSE39582, GSE17538, and GSE14333 datasets, we performed both univariate and multivariate Cox regression analyses using clinical data to derive hazard ratios (HR), 95% confidence intervals (CI) for the HR, and associated P values. Through these analyses, we identified significant clinical characteristics and risk scores, which were subsequently employed to construct a nomogram model aimed at assessing its influence on survival prediction. Following this, we validated the performance of the nomogram by calibrating the predictions for 1-, 3-, and 5-year survival outcomes and visualizing the model’s efficacy. Finally, we conducted decision curve analysis (DCA) and compared the area under the receiver operating characteristic (ROC) curve (AUC value) across various time points.

Analysis of immune-related properties and functions in the model

We used the ESTIMATE tool to evaluate the Meta dataset, obtaining three scores (stromal score, immune score, and ESTIMATE score), and compared these scores between the high-risk and low-risk groups. Referring to PMID 31043417, we identified immune checkpoint-related genes, calculated the correlation between these genes and the risk score in the Meta dataset, and presented the results as a heatmap.

Subsequently, we used the “limma” package to analyze the differentially expressed genes (DEGs) between the high-risk and low-risk groups in the Meta dataset, using a threshold of |fold change (FC)| > 1.2 and p < 0.05. KEGG pathway enrichment analysis was then performed on the identified DEGs²⁰. Additionally, we utilized the GSEA function in the clusterProfiler package to perform KEGG pathway enrichment analysis on the high-risk and low-risk groups in the Meta dataset.

The potential of models in immunotherapy

We utilized the online tool TIDE (Tumor Immune Dysfunction and Exclusion)²¹,²² to predict immunotherapy responses in the Meta dataset. The correlation between RiskScore and TIDE scores was assessed, and the probabilities of immunotherapy response were compared between high-risk and low-risk groups.

Subsequently, we performed a subgroup analysis, comparing the expression profiles of the high- and low-risk groups in the Meta dataset with another published dataset, IMvigor210, which includes data on PD-L1 treatment. Additionally, we acquired the immunotherapy-related datasets GSE78220²³ and GSE135222²⁴ and applied our risk scoring formula to calculate the risk scores for the samples in these datasets. The differences in the proportions of immunotherapy responders between the high- and low-risk groups in these two datasets were then compared.

Finally, to assess single-drug treatment effects, we utilized the R package oncoPredict to estimate IC50 values based on GDSC1 data and analyzed the correlation between drug IC50 values and RiskScore.

Mendelian randomization analysis of model genes and CRC

We obtained the GWAS dataset for the PLOD3 gene (prot-a-2305) and the GWAS dataset for CRC (bbj-a-107) from the OpenGWAS database (https://gwas.mrcieu.ac.uk/). For the prot-a-2305 dataset, a screening criterion of P < 5 × 10⁻⁶ was applied to select instrumental variants. Subsequently, analyses were conducted with the bbj-a-107 dataset using five Mendelian randomization methods: MR Egger, Weighted median, Inverse variance weighted, Simple mode, and Weighted mode.
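Of the five methods listed above, the inverse variance weighted (IVW) estimate is the simplest to write down, and a sketch may help clarify what the analysis computes. The Python below is a minimal fixed-effect IVW illustration, not the authors' pipeline: the function name, field names, and all variant effect sizes are hypothetical toy values, and only the P < 5 × 10⁻⁶ instrument threshold is taken from the text.

```python
import math

def ivw_estimate(instruments):
    """Fixed-effect inverse variance weighted causal estimate.

    Each instrument is a dict with:
      bx   - variant effect on the exposure
      by   - variant effect on the outcome
      se_y - standard error of by
    Returns (beta, standard error) of the exposure-to-outcome effect.
    """
    numerator = sum(v["bx"] * v["by"] / v["se_y"] ** 2 for v in instruments)
    denominator = sum(v["bx"] ** 2 / v["se_y"] ** 2 for v in instruments)
    return numerator / denominator, math.sqrt(1.0 / denominator)

# Toy variants, for illustration only (not real GWAS summary statistics).
variants = [
    {"bx": 0.10, "by": 0.02, "se_y": 0.01, "p_x": 1e-7},
    {"bx": 0.20, "by": 0.04, "se_y": 0.01, "p_x": 3e-8},
    {"bx": 0.15, "by": 0.03, "se_y": 0.01, "p_x": 2e-6},
    {"bx": 0.05, "by": -0.10, "se_y": 0.01, "p_x": 1e-3},  # fails the threshold
]

# Instrument selection using the exposure P-value threshold from the text.
instruments = [v for v in variants if v["p_x"] < 5e-6]
beta, se = ivw_estimate(instruments)
odds_ratio = math.exp(beta)  # on the log-odds scale, OR > 1 indicates increased risk
print(len(instruments), round(beta, 4), round(se, 4), round(odds_ratio, 4))
```

The other four methods differ in how they weight or summarize the per-variant ratios and in how robust they are to pleiotropic instruments; they are not reproduced here.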

Experimental verification

To further elucidate the association between the PLOD3 gene in the RiskScore gene model and CRC, we conducted additional verification through in vitro experiments.

Cell culture and transfection

Normal intestinal epithelial cells (NCM460) and human CRC cell lines (SW1116 and Caco2) were obtained from BNCC. The cells were cultured in DMEM medium (Gibco, USA) supplemented with 10% fetal bovine serum (FBS; Invitrogen, Carlsbad, CA, USA) and 1% penicillin/streptomycin. The cultures were maintained in a controlled environment at 37°C with an atmosphere of 5% carbon dioxide. The negative control (Vector) and siRNA (5’-GCGGTGATGAACTTTGTGGTT-3’) were transfected into SW1116 and Caco2 cells using Lipofectamine 2000 (Invitrogen, USA).

RNA extraction and quantitative reverse transcription polymerase chain reaction (RT-PCR)

Total cellular RNA was extracted using TRIzol (Invitrogen, Carlsbad, CA, USA). The total RNA was then reverse transcribed into cDNA according to the manufacturer’s instructions for the PrimeScript™ RT Kit (TaKaRa, Otsu, Shiga, Japan). Quantitative PCR amplification was performed using the SYBR Premix Ex Taq II (TaKaRa, Otsu, Shiga, Japan) on the TaKaRa system, following the standard real-time PCR protocol to quantify the relative mRNA expression levels.

The RT-PCR primers were sourced from Gene Pharma (Shanghai Gene Pharma, Shanghai, China). mRNA expression levels were normalized to GAPDH, and the relative gene expression was calculated using the 2^(-ΔΔCT) method.

PLOD3: Forward: GACCCGGTCAACCCAGAGA, Reverse: CTCCACCAACTGTTCGAGCC.

GAPDH: Forward: GTCTCCTCTGACTTCAACAGCG, Reverse: ACCACCCTGTTGCTGTAGCCAA.
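The 2^(-ΔΔCT) calculation with GAPDH as the reference gene can be illustrated with a short worked example. The Python sketch below uses hypothetical CT values and a hypothetical function name; it shows the arithmetic only and does not reproduce any measurement from the study.

```python
def relative_expression(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCT) method.

    dCT  = CT(target) - CT(reference gene), computed separately for sample and control.
    ddCT = dCT(sample) - dCT(control).
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2 ** -(delta_sample - delta_control)

# Hypothetical CT values: PLOD3 and GAPDH in a tumor cell line versus a normal control line.
fold_change = relative_expression(
    ct_target_sample=24.0,   # PLOD3, tumor cells
    ct_ref_sample=18.0,      # GAPDH, tumor cells
    ct_target_control=26.0,  # PLOD3, control cells
    ct_ref_control=18.0,     # GAPDH, control cells
)
print(fold_change)  # 4.0, i.e. fourfold higher relative expression in the tumor cells
```

A lower CT means earlier amplification, so a ΔΔCT of -2 corresponds to a fourfold increase.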

Protein extraction and Western blot analysis

Total cellular proteins were extracted using RIPA buffer (Beyotime Biotechnology Co., Shanghai, China) containing protease inhibitors (Roche Diagnostics, Indianapolis, IN, USA). Protein concentration was determined according to the instructions provided with the BCA kit (Thermo Fisher Scientific, Waltham, MA, USA). A total of 40 µg of protein was separated by 10% SDS-PAGE and subsequently transferred onto a polyvinylidene fluoride (PVDF) membrane (Millipore, Billerica, MA, USA).

The membrane was blocked with 5% fat-free milk at 25 °C for 1 h, followed by overnight incubation with primary antibodies at 4 °C. The primary antibodies (details provided in Table 1) were sourced from Abcam (Cambridge, MA, USA). After washing, the membrane was incubated with a corresponding secondary antibody (1:50,000) at 25 °C for 1 h.

Table 1 Primary antibodies used for Western blot analysis.

Immune-reactive proteins were detected using the Pierce ECL detection system (Thermo Fisher Scientific, Waltham, MA, USA) with ECL substrate, and quantitative analysis was performed using ImageJ software (NIH, Bethesda, MD, USA).

Transwell

Cells (5 × 10⁴) were seeded into Transwell chambers that were either pre-coated with Matrigel (BD Biosciences, USA, for invasion assays) or left uncoated (for migration assays). The upper chamber was filled with serum-free medium, while the lower chamber contained complete medium. After 24 h of incubation, the migrated or invaded cells were fixed with 4% paraformaldehyde and stained with 0.1% crystal violet.

Scratch test

Cell migration was assessed using wound healing experiments. A total of 5 × 10⁴ transfected cells were seeded into 6-well plates and cultured until cell confluence reached 90%. A linear wound was created in the cell monolayer using a 20 µL pipette tip, followed by washing with phosphate-buffered saline (PBS). After 24 h, the cells were fixed with 4% paraformaldehyde and stained with 0.1% crystal violet (Servicebio) for 15 min. The stained cells were imaged using an inverted microscope (Leica), and the wound closure rate was quantified using ImageJ software.

Statistical analysis

The data were analyzed using SPSS 19.0 software (SPSS Inc., Chicago, IL, USA) and GraphPad Prism 5.0 software (GraphPad Software, La Jolla, CA, USA). One-way analysis of variance (ANOVA) was performed to evaluate differences between groups, followed by post hoc testing using the Student-Newman-Keuls method. A p-value < 0.05 was considered statistically significant.

Results

Data sorting and analysis

From the single-cell dataset of colorectal cancer, we identified a total of 24 subpopulations through filtering and dimensionality reduction (Fig. 1A). Based on criteria established in previous studies⁶,⁷,⁸,⁹,¹⁰,¹¹, we screened a series of key markers to distinguish colorectal cancer and its associated immune cells. This analysis encompassed various cell types, including epithelial cells (EPCAM, KRT19, CLDN4), fibroblasts (COL3A1, COL1A1, COL1A2), macrophages (IL7R, CD68, IL1B), T cells (IL7R, CD3G), NK cells (NKG7), B cells (CD79A, VPREB3, CD79B), endothelial cells (PECAM1), mast cells (TPSAB1, TPSB2, KIT), and plasma cells (SSR4, IGLL5). Based on the expression patterns of these marker genes across different clusters, we further classified the cells into nine distinct types (Fig. 1B), and visualized the expression levels of marker genes both among different cell types and within the same cell type (Fig. 1C-D). To gain deeper insight into the functional roles of these marker genes, we performed functional enrichment analysis (Fig. 1E). Additionally, we calculated the proportion of each cell type across samples (Fig. 1F-G). The results revealed significant variations in cell proportions between samples, highlighting notable cellular heterogeneity. Notably, the proportion of epithelial cells was 44% in tumor tissues, compared to 17% in normal tissues, consistent with findings from current clinical research.

In the COAD samples, we identified nine distinct cell types with varying proportions across different samples. Using the CellChat tool for analysis, we revealed a complex intercellular communication network. Notably, fibroblasts exhibited the highest number and weight of communications within this network (Fig. 2A-E), underscoring their potential central role in regulating intercellular communication.

Further analysis of intercellular signaling pathways revealed interactions involving the collagen signaling pathway, which is closely associated with the tumor extracellular matrix, as well as the CXCL signaling pathway, which plays critical roles in immune response and cell migration (Fig. 2F-J).

Subsequently, we employed single-cell marker genes to assess the cellular status of the merged Meta dataset. Our analysis focused on comparing fibroblast scores between normal and tumor samples in the TCGA-COAD and GSE33113 datasets. Notably, the results demonstrated significantly higher fibroblast scores in normal samples compared to tumor samples (Fig. 3A).

Construction and effect of prognostic model

We screened 57 survival-related genes associated with fibroblast marker genes from the TCGA dataset (Fig. 3B), including MMP23B, TRIP10, SLC39A1, CHID1, ABL2, PLOD3, among others. After extracting gene expression profiles from the Meta dataset, we constructed a model using the training and validation sets and identified key genes associated with prognosis. Subsequently, we evaluated the selected genes using multivariate Cox regression analysis (Fig. 3C), calculated the risk coefficient for each gene, and assessed the C-index of each dataset using different machine learning methods (Fig. 3D). The results showed that the 12 genes selected using the StepCox[backward] method had a higher average C-index. Therefore, we utilized this method to construct a prognostic model, represented by the following formula:

RiskScore = 0.8921*ABL2 + 0.2645*AIF1L + 0.1339*APOD - 0.3421*BGN + 0.3051*CD248 + 0.274*CRIP2 - 0.6143*NPR1 + 0.1456*NR4A1 + 0.1928*PCOLCE2 + 0.3098*PLOD3 + 0.1424*S100A4 - 0.5725*USP53.
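The formula above is a weighted sum of the expression values of the 12 model genes. As a minimal Python sketch, with the coefficients copied from the formula and everything else (function name, sample expression values) hypothetical, the score for one sample could be computed as follows:

```python
# Coefficients copied from the RiskScore formula in the text.
RISK_COEFFICIENTS = {
    "ABL2": 0.8921, "AIF1L": 0.2645, "APOD": 0.1339, "BGN": -0.3421,
    "CD248": 0.3051, "CRIP2": 0.274, "NPR1": -0.6143, "NR4A1": 0.1456,
    "PCOLCE2": 0.1928, "PLOD3": 0.3098, "S100A4": 0.1424, "USP53": -0.5725,
}

def risk_score(expression):
    """Weighted sum of model-gene expression for a single sample.

    expression: gene symbol -> normalized expression value.
    Raises KeyError if any of the 12 model genes is missing.
    """
    return sum(coef * expression[gene] for gene, coef in RISK_COEFFICIENTS.items())

# Hypothetical sample in which every model gene has expression 1.0;
# the score then equals the sum of the coefficients.
sample = {gene: 1.0 for gene in RISK_COEFFICIENTS}
print(round(risk_score(sample), 4))
```

Because the PLOD3 coefficient is positive, higher PLOD3 expression raises the score when the other genes are held fixed, whereas BGN, NPR1, and USP53 carry negative coefficients and lower it.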

We further calculated the KM and ROC curves for each dataset to evaluate the predictive performance of the model (Fig. 3E). Additionally, we compared the expression of model genes in normal and tumor tissues within the TCGA-COAD and GSE33113 datasets. The results revealed significant differences in the expression of genes such as ABL2, AIF1L, APOD, BGN, NPR1, PCOLCE2, PLOD3, and USP53 between the two types of samples (Fig. 3F).
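The C-index used during model selection measures how often the model ranks a pair of patients correctly, that is, assigns the higher risk to the patient with the shorter survival. The following Python sketch of Harrell's concordance index is illustrative only: the function name and data are hypothetical, and pairs with tied survival times are simply skipped, which is a simplification relative to standard implementations.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when subject i had an observed event and a
    strictly shorter time than subject j. It is concordant when the subject
    with the earlier event also has the higher risk score; ties in risk count 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical survival times, event indicators, and risk scores.
c_index = concordance_index(
    times=[2, 4, 6, 8],
    events=[1, 1, 0, 1],
    risks=[0.9, 0.4, 0.7, 0.1],
)
print(c_index)  # 4 of 5 comparable pairs are ranked correctly: 0.8
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.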

To validate the independence of the RiskScore model in clinical applications, we conducted further analyses. Firstly, across multiple datasets, the RiskScore demonstrated significant prognostic predictive power in both univariate and multivariate analyses (Fig. 4A-D). Secondly, the significance of other clinical characteristic factors in multivariate analysis was notably reduced, further supporting the independence and reliability of the RiskScore model. Notably, in the TCGA-COAD dataset, both univariate and multivariate analyses identified T stage, M stage, and the RiskScore as significant predictors of prognosis. Consequently, we incorporated these three features to construct a nomogram for further analysis (Fig. 4E).

The nomogram results indicated that the RiskScore had the greatest impact on survival rate prediction, underscoring the high efficacy of gene-based risk models in prognostic prediction. The performance of the model was further validated by calibrating the 1-year, 3-year, and 5-year survival predictions (Fig. 4F). Finally, through decision curve analysis (DCA) and AUC values of ROC curves at different time points, we confirmed that the RiskScore model exhibits robust predictive performance across various time points.

Association of the model with immune characteristics and function

By comparing the immune microenvironment scores of the high- and low-risk groups in the Meta dataset, we found that the immune score of the high-risk group was significantly higher than that of the low-risk group (Fig. 5A). Further analysis revealed a significant positive correlation between RiskScore and ImmuneScore (Fig. 5B).

Additionally, a comparative analysis of immune checkpoint genes showed that their expression was significantly and positively correlated with RiskScore in most cases (Fig. 5C). Differential gene expression analysis between the high- and low-risk groups identified 1923 differentially expressed genes. Enrichment analysis of these genes revealed significant enrichment in several signaling pathways, including the IL-17 signaling pathway, ECM-receptor interaction, TNF signaling pathway, PPAR signaling pathway, Hippo signaling pathway, PI3K-Akt signaling pathway, Wnt signaling pathway, Cytokine-cytokine receptor interaction, Chemokine signaling pathway, NF-kappaB signaling pathway, Focal adhesion, and Human papillomavirus infection pathway (Fig. 5D-F).

Further GSEA analysis revealed significant enrichment in the high-risk group for pathways such as the PI3K-Akt signaling pathway, NF-kappa B signaling pathway, VEGF signaling pathway, TGF-beta signaling pathway, and Wnt signaling pathway. Additionally, the high-risk group showed enrichment in pathways like the Chemokine signaling pathway, Toll-like receptor signaling pathway, Natural killer cell-mediated cytotoxicity, Th17 cell differentiation, and Th1 and Th2 cell differentiation. In contrast, the low-risk group exhibited significant enrichment in pathways including Homologous recombination, DNA replication, Mismatch repair, Base excision repair, and Nucleotide excision repair (Fig. 5G-I).

Potential association of the model with immunotherapy

We used the TIDE tool to predict immunotherapy responses in the samples from the Meta-dataset and compared the treatment response rates across different subtypes. The results showed that the TIDE scores in the low-risk group were lower than those in the high-risk group, indicating that the low-risk group might be more suitable for immunotherapy (Fig. 6A). Further analysis demonstrated a positive correlation between RiskScore and TIDE scores (Fig. 6B). Moreover, the percentage of samples with a True treatment response in the low-risk group was significantly higher than in the high-risk group, reaching 62% (Fig. 6C).

Comparing the expression profiles of the Meta dataset with those of the clinical cohort treated with PD-L1 therapy, we observed that the expression profiles of the high-risk group correlated significantly with the progressive disease (PD) and stable disease (SD) response groups (p = 0.016, p = 0.002), suggesting that the high-risk group may respond poorly to PD-L1 therapy. Conversely, the expression profiles of the low-risk group correlated significantly with the partial response (PR) group (p = 0.026), indicating that the low-risk group may demonstrate a more favorable response to PD-L1 therapy (Fig. 6D). Furthermore, we calculated the risk scores for two immunotherapy datasets using the risk model and observed that the proportion of partial or complete response (PR/CR) was higher in the low-risk group: 79% in GSE78220 and 57% in GSE135222 (Fig. 6E-F). These findings suggest that the RiskScore model could serve as a potential biomarker for guiding personalized PD-L1 therapy.

Finally, we utilized the GDSC1 dataset to analyze the correlation between drug IC50 and RiskScore. After establishing a threshold of |R| > 0.3 and a P-value < 0.05 for screening, we identified 38 significantly associated drugs. Among these, 24 showed a significant positive correlation between their IC50 and RiskScore, while 14 demonstrated a significant negative correlation (Fig. 6G-H). Notably, drugs such as Pyrimethamine, Gemcitabine, and Tozasertib exhibited a significant positive correlation with RiskScore, whereas drugs like Erlotinib, Lapatinib, and Dasatinib showed a significant negative correlation.

Mendelian randomization analysis of model genes and CRC

To further investigate the association between these model genes and CRC, we utilized the OpenGWAS database for data retrieval. Among the model genes, only PLOD3 had genome-wide association study (GWAS) data available in this database. We therefore conducted a Mendelian randomization analysis for PLOD3. The heterogeneity tests using the MR Egger and Inverse Variance Weighted methods indicated no significant heterogeneity between the datasets (p > 0.05), and a pleiotropy analysis also showed no significant pleiotropy (p > 0.05). The scatter plot demonstrated that all five analytical methods supported a positive correlation between the exposure factor PLOD3 and the outcome of CRC, suggesting that PLOD3 may be a risk factor for CRC (Fig. 7A). The odds ratios (OR) for the five methods (MR Egger, Weighted Median, Inverse Variance Weighted, Simple Mode, and Weighted Mode) were all greater than 1 (Fig. 7B).

Experimental verification of model gene

To further verify the role of PLOD3 in vitro, we evaluated the expression of the PLOD3 gene in tumor cells and normal cells through an in vitro cell culture experiment. The results indicated that the mRNA and protein levels of PLOD3 were significantly upregulated in SW1116 and Caco2 cells compared to normal NCM460 cells (p < 0.05, Fig. 8A-C). Following siRNA-mediated knockdown of PLOD3, the migration and invasive capabilities of the tumor cells were impaired (Fig. 8D-H), as evidenced by the significantly fewer cells present in the Transwell chambers in the si-PLOD3 group (p < 0.05) and the notably slower wound closure observed in the si-PLOD3 group during the wound healing experiment (p < 0.05).

Discussion

The molecular mechanisms underlying CRC are intricate, involving the dysregulated control of multiple genes and signaling pathways. The PLOD family, consisting of PLOD1, PLOD2, and PLOD3, represents a crucial group of enzymes involved in the post-translational modification of collagen proteins. Their abnormal expression in various tumors is closely associated with tumor progression and metastasis²⁵. The PLOD3 gene encodes an enzyme known as lysyl hydroxylase, whose primary function is to catalyze the hydroxylation of lysine residues in collagen precursors. This process is crucial for the correct folding and functionality of collagen. Specifically, PLOD3 introduces hydroxylysine residues into collagen molecules, which serve as essential sites for subsequent glycosylation and cross-linking reactions²⁶. Studies have demonstrated that PLOD3 is overexpressed in various cancers and is closely associated with modifications of collagen in the tumor microenvironment. These modifications include lysine hydroxylation, a critical step in collagen maturation. Increased PLOD3 activity in cancer cells may alter the collagen network, thereby influencing tumor invasiveness and metastatic potential²⁷. Collagen cross-linking is a critical feature of the tumor microenvironment, in which PLOD3 plays an integral role by hydroxylating lysine residues. This cross-linking enhances the stability of the extracellular matrix, which may be linked to increased tumor invasiveness and drug resistance. In cancers such as oral cancer, the activity of PLOD3 positively correlates with the degree of collagen cross-linking²⁸. These prior studies indicate that PLOD3 expression and activity in cancer cells are directly implicated in the hydroxylation of lysine residues within collagen, a process that substantially influences the stability and functionality of the collagen network in the tumor microenvironment. These observations are consistent with our findings in colorectal cancer.

In this study, we conducted a comprehensive analysis of gene expression data from public databases, single-cell sequencing data, and in vitro experimental validation to elucidate the role of PLOD3 in CRC and its potential molecular mechanisms.

By conducting single-cell clustering analysis on the GSE161277 dataset from the GEO database, we accurately annotated and classified various cell types in CRC. Data normalization and batch effect correction using the Seurat package ensured data consistency and reliability. Following clustering, cells were reclassified based on established markers of CRC and immune cells, enabling us to identify the primary cellular populations in the tumor microenvironment and reveal their proportional distribution across different samples. The significance of this approach lies in its precise depiction of tumor microenvironment heterogeneity, particularly with regard to key immune cell populations in CRC. Functional enrichment analysis further elucidated the biological significance of these marker genes, suggesting that PLOD3 may play a significant role in specific cell types, thereby impacting tumor progression.

Analysis of cellular communication and its role within the tumor microenvironment

Cellular communication analysis conducted using the CellChat package revealed ligand-receptor interactions among different cell types, particularly the intricate communication network between tumor cells and immune cells. The analysis indicates significant communication activity between PLOD3-overexpressing cell populations and pro-inflammatory macrophage subtypes, suggesting that PLOD3 may influence the progression of CRC by modulating the immune response within the tumor microenvironment. This finding aligns with published literature highlighting the critical role of inflammation in CRC, further emphasizing the role of the tumor microenvironment in tumor growth and metastasis. Specifically, the high expression of PLOD3 in macrophages may facilitate tumor immune evasion and progression through the release of cytokines such as TNF-α, IL-6, and IL-17. These findings not only elucidate the function of PLOD3 within the tumor microenvironment but also provide a theoretical basis for the future development of immunotherapies based on PLOD3.

The integration of meta-datasets and the elimination of batch effects enabled us to validate the role of PLOD3 on a larger scale, supporting the reliability and broad applicability of the findings. Importantly, the consistent expression of PLOD3 across multiple independent datasets reinforces its significance as a potential prognostic marker.

In constructing the prognostic models, we used univariate Cox regression analysis to identify genes significantly associated with the survival of patients with CRC. Based on these genes, we developed various predictive models. The TCGA dataset was used as the training set, allowing the models to be trained on a large sample size and thereby enhancing their stability. The genes selected for predictive model construction were ABL2, AIF1L, APOD, BGN, CD248, CRIP2, NPR1, NR4A1, PCOLCE2, PLOD3, S100A4, and USP53. Notably, Jones et al. showed through in vitro experiments that ABL proto-oncogene 2 (ABL2), a member of the non-receptor protein tyrosine kinase family, not only plays a role in tumor adhesion and metastasis but also induces leptomeningeal metastasis of medulloblastoma by regulating the downstream c-MYC gene²⁹. Cao et al. identified a novel circular RNA (hsa_circ_0007368) by RNA sequencing and showed that it inhibits the progression of thyroid cancer by upregulating the expression of the allograft inflammatory factor 1 like (AIF1L) gene³⁰. He et al. screened differentially expressed genes associated with metastasis in 681 CRC patients and found that BGN (biglycan) and THBS2 are closely related to this process and are associated with epithelial-mesenchymal transition (EMT) and overall survival³¹. Similarly, NPR1 (natriuretic peptide receptor 1) and USP53 (ubiquitin-specific peptidase 53), which are highly represented inhibitory genes in the gene model, play an important role in the regulation of various tumors. For instance, Liu et al. experimentally confirmed that the absence of NPR1 increases the expression of integrin subunit beta 4, thereby increasing the adhesion of vascular endothelial cells³². Zhao et al. found that reduced levels of USP53 are associated with lung adenocarcinoma, and that this low expression is related to reduced cell apoptosis and increased metabolic activity, suggesting a potential tumor suppressor role. They also discovered that USP53 affects cell apoptosis and glycolysis by regulating the AKT1 pathway, ultimately inhibiting the growth of lung adenocarcinoma³³. These findings support the rationale behind our gene model from both clinical and experimental perspectives.
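The univariate Cox screening step can be illustrated with a small, self-contained sketch. It fits a single-covariate Cox model by Newton-Raphson on the partial likelihood (Breslow-style risk sets, no tied event times assumed). The function name, the toy follow-up times, and the binary "high expression" covariate are all hypothetical; this is not the pipeline used in the study, which would normally rely on an established survival package.

```python
import math

def fit_univariate_cox(times, events, x, n_iter=25, tol=1e-9):
    """Return (beta, hazard_ratio) for a single-covariate Cox model.

    Maximizes the partial likelihood with Newton-Raphson updates.
    """
    beta = 0.0
    n = len(times)
    for _ in range(n_iter):
        grad, info = 0.0, 0.0
        for i in range(n):
            if not events[i]:
                continue  # censored subjects contribute only via risk sets
            s0 = s1 = s2 = 0.0
            for j in range(n):
                if times[j] >= times[i]:  # subject j is still at risk
                    w = math.exp(beta * x[j])
                    s0 += w
                    s1 += w * x[j]
                    s2 += w * x[j] * x[j]
            mean = s1 / s0
            grad += x[i] - mean             # score contribution
            info += s2 / s0 - mean * mean   # observed information
        if info <= 0:
            break
        step = grad / info
        beta += step
        if abs(step) < tol:
            break
    return beta, math.exp(beta)

# Hypothetical toy cohort: follow-up time, event flag (1 = death), and a
# binary indicator of high expression of one candidate gene.
times = [2, 3, 5, 7, 8, 11, 12, 15]
events = [1, 1, 1, 0, 1, 1, 0, 1]
high_expr = [1, 1, 0, 1, 1, 0, 0, 0]

beta, hazard_ratio = fit_univariate_cox(times, events, high_expr)
```

In a screening setting, a fit like this would be repeated gene by gene, and genes whose coefficient differs significantly from zero would be carried forward to model construction; the significance test itself is omitted from this sketch.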

Subsequently, validation in multiple independent datasets demonstrated that the prognostic model based on genes such as PLOD3 exhibits a high C-index, indicating its potential application value in the prognosis assessment of patients with CRC. Notably, PLOD3 not only shows significant predictive power for overall survival but also exhibits a strong correlation with immune-related pathways, enhancing its potential for future clinical applications.
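For readers unfamiliar with the C-index, the sketch below computes Harrell's concordance index from first principles: among all comparable patient pairs, it is the fraction in which the patient who died earlier was assigned the higher risk score, with ties in score counted as one half. The function name and the toy data are hypothetical and serve only to illustrate the metric.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk score."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            # Comparable: i had an observed event strictly before j's time.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Hypothetical toy data: shorter survival paired with higher risk score.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
perfect = concordance_index(times, events, [0.9, 0.7, 0.4, 0.1])
reversed_order = concordance_index(times, events, [0.1, 0.4, 0.7, 0.9])
uninformative = concordance_index(times, events, [0.5, 0.5, 0.5, 0.5])
```

A value of 1 indicates perfect ranking, 0.5 corresponds to a model with no discriminative ability, and values below 0.5 indicate systematically inverted ranking.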

Previous studies of the PLOD family in tumor research have indicated that these genes play a significant role in tumor development and metastasis by influencing the maturation of collagen and the remodeling of the extracellular matrix (ECM)³⁴. For instance, high expression of PLOD2 in tumor patients is associated with increased tumor aggressiveness and poor prognosis³⁵. In glioma, PLOD1 contributes to pathogenesis through the NF-κB signaling pathway and the transition to malignant mesenchymal subtypes³⁶. PLODs (PLOD1, PLOD2, and PLOD3) are commonly overexpressed in ovarian cancer, and a strong correlation exists between PLOD3 and the gap junction protein connexin 43, suggesting that PLOD3 may play a particularly significant role in regulating intercellular communication and collagen synthesis³⁷. These studies provide essential background on the role of the PLOD family in tumor biology.

Conclusions

The results of this study indicate that PLOD3 not only plays a crucial role in the occurrence and development of CRC, but also potentially modulates the immune response within the tumor microenvironment, thereby influencing the efficacy of immunotherapy. This finding holds significant clinical implications, particularly in the context of the widespread clinical application of immune checkpoint inhibitors³⁸. In the future, PLOD3 is expected to be incorporated as a novel biomarker into the molecular diagnostic and prognostic assessment systems for CRC. Moreover, combination therapies based on PLOD3, such as integration with existing immunotherapies, may enhance therapeutic efficacy, thus offering patients more personalized and precise treatment options.

In future studies, to further validate the efficacy of PLOD3 inhibitors in colorectal cancer and enhance the clinical translational significance of the research, we propose the integration of multi-faceted experimental designs and advanced technical approaches. First, CRISPR/Cas9 gene-editing technology can be utilized to knock out the PLOD3 gene in xenograft models, allowing observation of its effects on tumor growth and metastasis. Concurrently, the development of small-molecule inhibitors or monoclonal antibodies targeting PLOD3, coupled with pharmacokinetic (PK) analysis and maximum tolerated dose (MTD) assessments, could provide preliminary validation of drug efficacy. Moreover, multi-omics technologies, such as transcriptomics (RNA-seq) to examine the impact of PLOD3 inhibition on tumor-related signaling pathways (e.g., TGF-β, Wnt, VEGF), proteomics (LC-MS/MS) to investigate changes in collagen modifications, and metabolomics (LC-MS or GC-MS) to explore its role in tumor metabolic regulation (e.g., glycolysis and glutamine metabolism), could further elucidate the underlying mechanisms. Finally, liquid biopsy techniques, such as exosomal RNA and circulating tumor DNA detection, may enable dynamic monitoring of therapeutic efficacy and resistance mechanisms. These strategies will provide critical data to support the research and significantly enhance its scientific rigor and clinical relevance.

Data availability

The datasets analyzed in this article were obtained from the GEO and TCGA websites and are publicly accessible.

References

Morgan, E. et al. Global burden of CRC in 2020 and 2040: incidence and mortality estimates from GLOBOCAN. Gut 72(2), 338–344 (2023).
Benson, A. B. et al. Colon cancer, version 2.2021, NCCN clinical practice guidelines in oncology. J. Natl. Compr. Canc. Netw. 19(3), 329–359 (2021).
Chen, Y. et al. PLOD3 contributes to HER-2 therapy resistance in gastric cancer through FoxO3/Survivin pathway. Cell. Death Discovery. 8(1), 321 (2022).
Zheng, X. et al. Single-cell transcriptomic profiling unravels the adenoma-initiation role of protein tyrosine kinases during colorectal tumorigenesis. Signal. Transduct. Target. Therapy. 7(1), 60 (2022).
Hao, Y. et al. Integrated analysis of multimodal single-cell data. Cell 184(13), 3573–3587.e29 (2021).
Becker, W. R. et al. Single-cell analyses define a continuum of cell state and composition changes in the malignant transformation of polyps to CRC. Nat. Genet. 54(7), 985–995 (2022).
Wang, X. et al. Single-cell RNA-seq reveals the genesis and heterogeneity of tumor microenvironment in pancreatic undifferentiated carcinoma with osteoclast-like giant-cells. Mol. Cancer. 21(1), 133 (2022).
Devlin, J. C. et al. Single-cell transcriptional survey of ileal-anal pouch immune cells from ulcerative colitis patients. Gastroenterology 160(5), 1679–1693 (2021).
Khaliq, A. M. et al. Refining CRC classification and clinical stratification through a single-cell atlas. Genome Biol. 23(1), 113 (2022).
Fawkner-Corbett, D. et al. Spatiotemporal analysis of human intestinal development at single-cell resolution. Cell 184(3), 810–826.e23 (2021).
Che, L-H. et al. A single-cell atlas of liver metastases of CRC reveals reprogramming of the tumor microenvironment in response to preoperative chemotherapy. Cell. Discovery. 7(1), 80 (2021).
Jin, S. et al. Inference and analysis of cell-cell communication using CellChat. Nat. Commun. 12(1), 1088 (2021).
Smith, J. J. et al. Experimentally derived metastasis gene expression profile predicts recurrence and death in patients with colon cancer. Gastroenterology 138(3), 958–968 (2010).
Felipe de Sousa, E. M. et al. Methylation of cancer-stem-cell-associated Wnt target genes predicts poor prognosis in CRC patients. Cell Stem Cell 9(5), 476–485 (2011).
Marisa, L. et al. Gene expression classification of colon cancer into molecular subtypes: characterization, validation, and prognostic value. PLoS Med. 10(5), e1001453 (2013).
Jorissen, R. N. et al. Metastasis-associated gene expression changes predict poor outcomes in patients with Dukes stage B and C CRC. Clin. Cancer Res. 15(24), 7642–7651 (2009).
Tripathi, M. K. et al. Nuclear factor of activated T-cell activity is associated with metastatic capacity in colon cancer. Cancer Res. 74(23), 6947–6957 (2014).
Gautier, L., Cope, L., Bolstad, B. M. & Irizarry, R. A. Affy—analysis of Affymetrix GeneChip data at the probe level. Bioinformatics 20(3), 307–315 (2004).
Liu, Z. et al. Machine learning-based integration develops an immune-derived lncRNA signature for improving outcomes in CRC. Nat. Commun. 13(1), 816 (2022).
Kanehisa, M. & Goto, S. KEGG: Kyoto Encyclopedia of genes and genomes. Nucleic Acids Res. 28(1), 27–30 (2000).
Fu, J. et al. Large-scale public data reuse to model immunotherapy response and resistance. Genome Med. 12, 21 (2020).
Jiang, P. et al. Signatures of T cell dysfunction and exclusion predict cancer immunotherapy response. Nat. Med. 24(10), 1550–1558 (2018).
Hugo, W. et al. Genomic and transcriptomic features of response to anti-PD-1 therapy in metastatic melanoma. Cell 165(1), 35–44 (2016).
Kim, J. Y., Choi, J. K. & Jung, H. Genome-wide methylation patterns predict clinical benefit of immunotherapy in lung cancer. Clin. Epigenetics. 12, 1–10 (2020).
Tong, Y. et al. The PLOD2/succinate axis regulates the epithelial–mesenchymal plasticity and cancer cell stemness. Proc. Natl. Acad. Sci. USA 120(20), e2214942120 (2023).
GeneCards. GeneCards Commercial Trial. GeneCards [Internet]. 2025 [cited 2025 Jan 29]. https://www.genecards.org/cgi-bin/carddisp.pl?gene=PLOD3
Gong, S. et al. A human pan-cancer system analysis of procollagen-lysine, 2-oxoglutarate 5-dioxygenase 3 (Plod3). Int. J. Mol. Sci. 22(18), 9903 (2021).
Uzawa, K., Kasamatsu, A. & Yamauchi, M. Collagen cross-linking in oral cancer. Oral Sci. Int. 21(1), 3–14 (2024).
Jones, J. K. et al. ABL1 and ABL2 promote medulloblastoma leptomeningeal dissemination. Neuro-oncology Adv. 5(1), vdad095 (2023).
Cao, S. et al. CircGLIS3 inhibits thyroid cancer invasion and metastasis through miR-146b-3p/AIF1L axis. Cell. Oncol. (Dordrecht). 46(6), 1777–1789 (2023).
He, Z. et al. Identification of BGN and THBS2 as metastasis-specific biomarkers and poor survival key regulators in human colon cancer by integrated analysis. Clin. Translational Med. 12(11), e973 (2022).
Liu, H. et al. Lack of NPR1 increases vascular endothelial adhesion through induction of integrin Beta 4. Int. J. Mol. Sci. 23(20), 12627 (2022).
Zhao, X., Wu, X., Wang, H., Yu, H. & Wang, J. USP53 promotes apoptosis and inhibits glycolysis in lung adenocarcinoma through FKBP51-AKT1 signaling. Mol. Carcinog. 59(8), 1000–1011 (2020).
Yuan, Z. et al. Extracellular matrix remodeling in tumor progression and immune escape: from mechanisms to treatments. Mol. Cancer. 22(1), 48 (2023).
Du, H., Pang, M., Hou, X., Yuan, S. & Sun, L. PLOD2 in cancer research. Biomed. Pharmacother. 90, 670–676 (2017).
Wang, Z., Shi, Y., Ying, C., Jiang, Y. & Hu, J. Hypoxia-induced PLOD1 overexpression contributes to the malignant phenotype of glioblastoma via NF-κB signaling. Oncogene 40(8), 1458–1475 (2021).
Guo, T., Gu, C., Li, B. & Xu, C. PLODs are overexpressed in ovarian cancer and are associated with gap junctions via connexin 43. Lab. Invest. 101(5), 564–569 (2021).
Robert, C. A decade of immune-checkpoint inhibitors in cancer therapy. Nat. Commun. 11(1), 3801 (2020).

Acknowledgements

We express our gratitude to the oncology specialists from the Traditional Chinese Medicine Hospital of Lu’an City, Xu Sheng, Zhang Hongxia, Wu Dezhen, Shi Yue, Xu Jing, Cheng Li, and Ci Xueping, for their invaluable advice and guidance during the manuscript writing process.

Funding

This study was supported by the Natural Science Foundation of Anhui Provincial Higher Education Institutions (Project No. 2023AH050834) and the Science and Technology Plan Project of Lu’an City, Anhui Province (Project No. 2023lakj-zyy007).

Author information

Authors and Affiliations

Department of Medical Oncology, Lu’an Hospital Affiliated To Anhui University of Chinese Medicine, No. 73, Mozitan Road, Lu’an, 237000, Anhui, China
Wanhui Dong, Wei Tang, Qingming Sun, Pei Zhu & Mingxing Wang
Oncology Radiotherapy Department, Affiliated Hospital of West Anhui Health Vocational College, No. 73, Mozitan Road, Lu’an, 237000, Anhui, China
Su Li & Bin Xu

Contributions

WD and BX were responsible for the overall conceptual framework of the manuscript, data collection, and the drafting of the initial manuscript. SL, WT, and QS were in charge of data processing and visualization, as well as translating, polishing, and checking for duplication in the manuscript. PZ and MW were responsible for the validation and visualization of the experimental sections.

Corresponding author

Correspondence to Bin Xu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Dong, W., Li, S., Tang, W. et al. To investigate the tumor promotion role of PLOD3 in colorectal cancer and its potential as a prognostic biomarker and therapeutic target. Sci Rep 15, 5371 (2025). https://doi.org/10.1038/s41598-025-89521-z

Received: 09 September 2024
Accepted: 05 February 2025
Published: 13 February 2025
DOI: https://doi.org/10.1038/s41598-025-89521-z
