Introduction

Gallstones and cholecystitis are among the most common biliary tract diseases worldwide, with an adult prevalence of approximately 15%1. Prevalence is notably higher in women and in individuals aged 40 and older. Gallstones,2 when complicated by cholecystitis, can lead to a range of severe clinical manifestations, including biliary colic, acute infection, and biliary obstruction3,4. In some cases, these conditions may progress to pancreatitis, posing significant health risks. Gallstones and cholecystitis frequently co-occur in clinical settings, so timely and accurate diagnosis is crucial for preventing complications and for deciding between surgery and conservative treatment5. However, diagnosis is challenging in elderly patients and in those with pre-existing health issues, and early-stage or atypical presentations complicate it further, highlighting the need for improved diagnostic methods6. Currently, abdominal ultrasound is the preferred diagnostic method for gallbladder diseases because it is non-invasive and accessible7. However, its diagnostic accuracy depends heavily on the operator’s experience, and in cases such as obesity, distal stone location, or the absence of common bile duct dilatation, its sensitivity can decrease to approximately 73%8. There is therefore an urgent need for a reliable, non-invasive diagnostic method that does not rely on operator skill. CT and MRI are essential diagnostic tools in clinical practice, especially for complex or atypical cases, but they have significant drawbacks, including high cost and, for CT, radiation exposure, which make them less suitable for routine screening or for patients who cannot tolerate radiation.
For example, CT is often used for detailed imaging of the biliary tract, especially in cases of bile duct obstruction or when stones are located in areas that are difficult to visualize. MRI, particularly magnetic resonance cholangiopancreatography (MRCP), is the gold standard for visualizing the bile ducts and detecting subtle abnormalities in the biliary system. Despite these advantages, both modalities have limitations. CT carries a risk of radiation exposure, making it unsuitable for repeated use, particularly in vulnerable populations such as pregnant women or children. MRI, while non-invasive, is costly and requires specialized equipment and expertise that may not be available in all healthcare settings, especially in low-resource regions. Gas chromatography-ion mobility spectrometry (GC-IMS) presents a valuable alternative: it offers a non-invasive, low-cost, and operator-independent method for disease diagnosis, making it a promising complement, or even an alternative, to CT and MRI in specific clinical contexts. Because GC-IMS does not rely on expensive imaging equipment or radiation, it is particularly useful for early screening and diagnosis in settings where access to advanced imaging may be limited9,10,11.

Exploring objective biomarkers that can overcome reliance on imaging and operator subjectivity has become a current research focus. From a metabolomics perspective, the high-throughput analysis of volatile organic compounds (VOCs) via GC-IMS has established a new molecular foundation for non-invasive, precise disease diagnosis. GC-IMS is an emerging VOC detection technology that combines the separation capabilities of gas chromatography with the high-sensitivity detection capabilities of ion mobility spectrometry12. This technology first separates VOCs in samples via GC, then ionizes them in an IMS drift tube. Detection occurs based on differences in their migration speeds within an electric field, forming a compound-specific two-dimensional “fingerprint spectrum.” This enables rapid detection of VOCs in complex samples without requiring complex pretreatment13. This approach offers new insights for the advancement of metabolomics and non-invasive disease diagnosis. VOCs are small molecules produced during metabolic processes that can be excreted through urine, breath, sweat, and other pathways. Their types and concentrations undergo significant changes under disease conditions14. These substances are extensively involved in various physiological and pathological processes within the body, including oxidative stress, inflammatory responses, energy metabolism disorders, and microbe-host interactions15.
GC-IMS results are presented as VOC fingerprint spectra, which are two-dimensional or three-dimensional maps containing information such as retention time, migration time, and peak intensity for each compound16. By comparing fingerprint spectra from different samples, characteristic metabolic patterns associated with specific disease states can be identified17. Unlike traditional methods that rely on a single biomarker, VOC fingerprinting provides comprehensive information on metabolic status, making it more suitable for screening and differential diagnosis of complex diseases. In handling high-dimensional data, machine learning (ML) technology offers powerful tools for VOC spectrum analysis18. Machine learning algorithms (such as random forests, support vector machines, and neural networks) can automatically extract meaningful features from large-scale, complex VOC data to classify and identify disease and health states13.

VOCs, as metabolic biomarkers, have demonstrated high diagnostic potential across various diseases, particularly for non-invasive screening. Previous studies utilizing GC-IMS or SIFT-MS to analyze bile VOCs have successfully differentiated hilar cholangiocarcinoma from benign biliary diseases. The constructed machine learning model achieved sensitivity and specificity of 93.1% and 100%, respectively, with an AUC of 0.966. A VOC panel (12 differentially expressed compounds) for gallbladder cancer detection also achieved an AUC ≈ 0.972 (sensitivity 100%, specificity 94.4%)19. In the diagnosis of pancreatic and biliary tract tumors, bile VOC combination models demonstrate high diagnostic accuracy (e.g., AUC ≈ 0.86–0.98, sensitivity ≈ 79–93%, specificity ≈ 81–100%)20,21,22. Additionally, VOCs detected in breath, urine, and even fecal samples demonstrated excellent discriminatory performance in other digestive system tumors (such as CRC and gastroesophageal cancer) as well as hepatocellular carcinoma (with total AUC values approaching 0.95–0.96 and sensitivity and specificity both exceeding 0.88)23.

To date, urinary VOC profiling for the common comorbid condition of “cholecystitis with gallstones” remains unexplored. To address this gap and meet the urgent clinical demand for non-invasive, objective, and reproducible diagnostic tools, this study aims to systematically evaluate the added value of GC-IMS-derived urinary VOC fingerprinting in identifying biliary co-morbidities. This approach will provide novel molecular-clinical integration evidence for early detection and precision intervention.

Materials and methods

Research subjects

The study sample was recruited from Shandong University Third Hospital between September 2023 and September 2024. A power analysis was conducted to ensure adequate power for detecting significant differences between cholecystitis patients and healthy controls. Assuming a large effect size (Cohen’s d = 0.8), α = 0.05, and power = 0.80, the minimum required sample size was 26 participants per group (52 total). Based on this, we selected 100 participants per group (200 total) to ensure sufficient statistical power. The case group comprised patients diagnosed with gallstones complicated by cholecystitis via imaging studies and undergoing radical surgical treatment. Inclusion criteria were: age ≥ 18 years, no history of other malignancies or anticancer therapy, no other severe hepatobiliary or systemic diseases, ability to provide fresh urine samples and complete medical records, and postoperative pathological confirmation of gallstones and cholecystitis. The control group comprised healthy volunteers meeting the following criteria: no history of hepatobiliary diseases, no recent use of medications potentially affecting hepatobiliary function, no urinary tract diseases, no chronic conditions such as diabetes or hypertension, and normal urine analysis results. The study protocol was approved by the Ethics Committee of Shandong University Provincial Third Hospital, and informed consent was obtained from all participating patients and their families. This study was conducted in accordance with the Declaration of Helsinki.
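The sample size quoted above can be checked with a short calculation. The sketch below is illustrative only: it assumes a two-sided, two-sample t-test and uses the normal approximation with Guenther's small-sample correction, which may not be the software or formula the authors used. The function name `n_per_group` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided, two-sample t-test.

    Assumption: normal approximation plus Guenther's correction (z_alpha^2 / 4),
    not necessarily the procedure used in the study.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile matching the target power
    n = 2 * (z_alpha + z_beta) ** 2 / d ** 2 + z_alpha ** 2 / 4
    return ceil(n)


print(n_per_group(0.8))  # 26
```

With d = 0.8, α = 0.05, and power = 0.80, this approximation reproduces the 26 participants per group stated in the text.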

Sample preparation

All urine samples were collected preoperatively using midstream urine. Approximately 5 milliliters of urine per subject was collected in sterile containers and stored at -80 °C following collection.

Analysis of volatile organic compounds in urine

GC-IMS (brand name “FlavorSpec”, Dortmund, Germany) was used to analyze volatile organic compounds in urine samples. All samples were analyzed using static headspace sampling. GC-IMS first separates the complex VOC components in urine via GC, followed by secondary separation using ion mobility spectrometry (IMS) based on molecular ion mass and one-dimensional collision cross-section. Samples are thus characterized in two dimensions based on gas chromatographic retention index (RI) and ion mobility spectrometer drift time (Dt), and quantified according to signal response intensity. All samples underwent standardized processing: 2 mL of urine was placed into each headspace vial and incubated at 80 °C for 5 min. Subsequently, 1000 µL of gas was extracted from the headspace vials for analysis. Nitrogen was used as the carrier gas. Operating parameters were as follows: IMS drift gas flow rate maintained at 150 mL/min; carrier gas gradient: 0 min: 2 mL/min; 1 min: 2 mL/min; 8 min: 100 mL/min; 10 min: 150 mL/min; 15 min: 150 mL/min. Other key parameters include: T1 drift tube temperature: 45 °C; T2 GC column temperature: 80 °C; T3 inlet temperature: 80 °C; T4 line 1 temperature: 80 °C; T5 line 2 temperature: 80 °C; T6 drift tube temperature: 45 °C.

Statistical analysis

Statistical analyses were performed using SPSS version 26 (IBM, Armonk, NY, USA). For continuous variables, normality was first assessed using the Shapiro-Wilk test. Normally distributed data were summarized as mean and standard deviation (SD) and compared between the control and patient groups using the independent samples t-test. Non-normally distributed data were summarized as median and interquartile range (IQR) and compared using the Wilcoxon rank sum test (Mann-Whitney U test), which is appropriate for skewed distributions. Categorical variables were reported as frequency and percentage and compared between groups using Pearson’s Chi-squared test when the sample size was large, or Fisher’s exact test when it was small. All tests were two-sided, and P-values < 0.05 were considered statistically significant.

Preliminary qualitative identification of VOCs was performed using dual criteria: retention index (RI) from gas chromatography and drift time (Dt) from ion mobility spectrometry. Cross-validation was conducted using the NIST RI database and an internal IMS Dt database. Target compounds were identified by comparing their RI and Dt values against reference data, requiring both parameters to match standard reference peaks. In the data processing phase, the GC-IMS software automatically identified all VOC signal peaks. To ensure signal quality and eliminate potential noise or interference, manual peak selection was performed prior to feature selection. The criteria for peak selection were regularity of peak shape and signal-to-noise ratio (SNR), with the signal intensity required to be at least three times the baseline noise. Peaks were therefore retained on the basis of signal quality alone, not inter-group differences, so that only high-quality signals served as features for subsequent machine learning model training and predictive analysis. Additionally, reactant ion peak (RIP) normalization was applied to eliminate absolute drift time deviations. All compound names originate from the purchased NIST spectral library without modification, adhering to a unified chemical nomenclature system and enabling precise traceability via CAS numbers.
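The SNR criterion described above can be sketched as follows. This is a minimal illustration, not the instrument software's procedure: it assumes that noise is estimated as the standard deviation of a signal-free baseline region and that peak height is measured above the baseline mean. The function names and example values are hypothetical.

```python
from statistics import mean, pstdev


def signal_to_noise(peak_height, baseline):
    """SNR of a peak relative to a signal-free baseline region.

    Assumption: noise = population SD of the baseline; signal = height above
    the baseline mean.
    """
    noise = pstdev(baseline)
    if noise == 0:
        return float("inf")
    return (peak_height - mean(baseline)) / noise


def keep_peaks(peaks, baseline, min_snr=3.0):
    """Return the names of peaks whose SNR is at least min_snr."""
    return [name for name, height in peaks.items()
            if signal_to_noise(height, baseline) >= min_snr]


baseline = [10, 12, 10, 12]                              # hypothetical baseline intensities
peaks = {"peak_a": 15, "peak_b": 13, "peak_c": 14}       # hypothetical peak heights
print(keep_peaks(peaks, baseline))  # ['peak_a', 'peak_c']
```

Because the filter looks only at signal quality, it can be applied before any group labels are consulted, which matches the stated intent of avoiding selection based on inter-group differences.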

Data Cleaning: Missing values were handled as follows: for compounds with less than 10% missing data, mean imputation was used to fill in the missing values. For compounds with a higher percentage of missing data, the corresponding samples were deleted to avoid significant bias in the analysis results. Outliers were identified and removed using the boxplot method: values lying more than 1.5 times the interquartile range beyond the quartiles were considered outliers and were excluded from subsequent analyses. Standardization and Normalization: To eliminate scale differences in the concentrations of different compounds, the data were standardized. The concentration data for each compound were rescaled to mean = 0 and variance = 1 using Z-score standardization, ensuring balanced contributions from each feature during model training. For certain compounds, particularly those with larger concentration ranges, min-max normalization was applied, scaling the concentration data to a range between 0 and 1. This step helped eliminate scale differences between features, especially when the concentration distributions of compounds varied significantly.
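The cleaning and scaling steps above can be sketched for a single compound's values. This is an illustration under stated assumptions rather than the authors' pipeline: missing values are represented as `None`, quartiles use the "inclusive" method of Python's `statistics.quantiles`, and the Z-score uses the population standard deviation. All function names are hypothetical.

```python
from statistics import mean, pstdev, quantiles


def missing_fraction(values):
    """Share of missing (None) entries, used to apply the 10% rule."""
    return sum(v is None for v in values) / len(values)


def mean_impute(values):
    """Replace None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def iqr_bounds(values, k=1.5):
    """Boxplot fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def remove_outliers(values, k=1.5):
    """Drop values outside the boxplot fences."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if low <= v <= high]


def z_score(values):
    """Rescale to mean 0 and variance 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def min_max(values):
    """Rescale to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


print(remove_outliers([1, 2, 3, 4, 5, 6, 7, 8, 100]))  # [1, 2, 3, 4, 5, 6, 7, 8]
print(min_max([2, 4, 6]))                               # [0.0, 0.5, 1.0]
```

In a train/test workflow, the imputation mean and the scaling parameters would normally be estimated on the training set only and then applied to the test set, so that no information leaks from the held-out samples.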

After data preprocessing, feature selection was performed using univariate statistical analysis (such as t-tests and AUC) to assess whether each compound differed significantly between groups. Compounds with significant differences were selected as input features for the model. Feature Importance Evaluation: To further optimize the feature set, we used the random forest algorithm to calculate the feature importance of each compound and selected the most predictive compounds as the final input features.
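As an illustration of the univariate AUC screen mentioned above, the sketch below computes each feature's AUC as the probability that a randomly chosen case has a higher value than a randomly chosen control (the Mann-Whitney interpretation), then ranks features by how far the AUC departs from 0.5. The function names, the toy data, and the ranking rule are assumptions for illustration, not the authors' code.

```python
def auc(cases, controls):
    """Probability that a case value exceeds a control value (ties count 0.5)."""
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def rank_features(table, labels):
    """Rank features by direction-free AUC, max(AUC, 1 - AUC).

    table maps a feature name to its values; labels uses 1 = case, 0 = control.
    """
    scores = {}
    for name, values in table.items():
        cases = [v for v, y in zip(values, labels) if y == 1]
        controls = [v for v, y in zip(values, labels) if y == 0]
        a = auc(cases, controls)
        scores[name] = max(a, 1 - a)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


labels = [1, 1, 0, 0]                                   # hypothetical group labels
table = {"voc_x": [5, 6, 1, 2], "voc_y": [1, 5, 2, 6]}  # hypothetical peak heights
print(rank_features(table, labels))  # [('voc_x', 1.0), ('voc_y', 0.75)]
```

Using max(AUC, 1 − AUC) treats compounds that are lower in patients as equally informative as those that are higher.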

In model construction, to ensure the generalization ability of the models, the sample data were randomly split into training (70%) and testing (30%) sets using the sample() function in R, with a random seed (set.seed(123)) set to ensure the reproducibility of the results. Classification models were established based on all VOC features using RF, NN, SVM, and DT methods. Ten-fold cross-validation was performed using R (x64, version 4.2.0) to evaluate the performance of these models. To assess the statistical significance of performance differences between the models, one-way analysis of variance (ANOVA) was first employed to test whether the overall differences in model performance were significant. If significant differences were found, Tukey HSD post-hoc tests were conducted to evaluate the specific differences between each pair of models and identify which performance differences were statistically significant. During the feature selection process, Gini importance scores from the RF model were used to evaluate the importance of all extracted VOC features. The top ten ranked VOCs were selected to plot and compare the receiver operating characteristic (ROC) curves for each model, and the area under the curve (AUC) values were calculated to evaluate the diagnostic efficacy of the different models.
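The split and cross-validation scheme can be sketched as follows. The study used R's sample() with set.seed(123); this Python version is only an analogue and will not reproduce the same partition. It assumes the split was stratified by group, which is consistent with the 70/30 per-group counts reported in the Results. Function names are hypothetical.

```python
import random


def stratified_split(labels, train_frac=0.7, seed=123):
    """Split sample indices into train/test sets, preserving class proportions."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_frac * len(idx))
        train += idx[:cut]
        test += idx[cut:]
    return sorted(train), sorted(test)


def k_folds(indices, k=10, seed=123):
    """Partition indices into k disjoint folds for cross-validation."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


labels = [1] * 100 + [0] * 100          # 100 patients, 100 controls
train, test = stratified_split(labels)
folds = k_folds(train)
print(len(train), len(test), len(folds))  # 140 60 10
```

Cross-validating within the training indices only keeps the 60 held-out samples untouched until the final evaluation.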

To enhance the interpretability of the models, we employed the SHAP (SHapley Additive exPlanations) method to quantify the contribution of each feature to the models’ prediction outcomes. The SHAP method, based on game theory principles, explains a model’s decision-making process by calculating the contribution of each feature to the prediction. In this study, we applied SHAP analysis to reveal the key features in the RF, SVM, and NN models and visualized their impact on the prediction results through charts.
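To make the game-theoretic idea concrete, the sketch below computes exact Shapley values for one prediction by enumerating every feature subset. It is a toy illustration, not the SHAP implementation used in the study: it assumes that an "absent" feature is replaced by a single baseline value, whereas SHAP-style tools typically average over a background dataset, and exact enumeration is only feasible for a handful of features. The model and all names are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of each feature for the prediction predict(x).

    Assumption: a feature outside the coalition takes its baseline value.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_model(v):
    """Hypothetical linear score over three features."""
    return 2 * v[0] + 3 * v[1] - v[2]


phi = shapley_values(toy_model, x=[1, 1, 1], baseline=[0, 0, 0])
print([round(p, 6) for p in phi])  # [2.0, 3.0, -1.0]
```

The contributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is the additivity property that makes per-feature attributions comparable across models.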

Results

Participant characteristics

This study included a total of 200 participants, comprising 100 patients with pathologically confirmed gallstones complicated by cholecystitis and 100 healthy controls (NC). The recruited subjects were randomly assigned to a training set (n = 70 patients with gallstones and cholecystitis and n = 70 NC participants) and a testing set (n = 30 patients with gallstones and cholecystitis and n = 30 NC participants). Table 1 details the clinical characteristics of these participants.

In this study, there were no significant differences in gender and age between the experimental and control groups. The gender distribution was balanced in the experimental group, with 50% males and 46% females, and in the control group, 43% males and 53% females. The mean age was also similar between the two groups, with the experimental group having an average age of 56 ± 14 years and the control group 57 ± 15 years. These findings suggest that gender and age did not significantly affect the results, ensuring that other clinical parameters could be compared reliably without these potential confounding factors.

Regarding liver function, the experimental group had significantly lower levels of albumin (ALB) than the control group (42.4 ± 3.1 g/L vs. 47.7 ± 3.0 g/L, P < 0.001), indicating possible liver dysfunction or malnutrition in the experimental group. Total bilirubin (TBIL) was slightly lower in the experimental group than in the control group (15 ± 12 µmol/L vs. 17 ± 7 µmol/L, P = 0.086), but this difference did not reach statistical significance. In contrast, direct bilirubin (DBIL) was significantly elevated in the experimental group (7.90 ± 10.29 µmol/L vs. 3.29 ± 1.27 µmol/L, P < 0.001), suggesting the possibility of biliary obstruction or impaired bile excretion. This biochemical marker provides important diagnostic information for cholelithiasis complicated by cholecystitis. Indirect bilirubin (IBIL) was significantly lower in the experimental group (7.3 ± 3.5 µmol/L vs. 13.6 ± 5.9 µmol/L, P < 0.001), further supporting the notion of specific disturbances in bilirubin metabolism in these patients.

Additionally, the experimental group showed significantly higher values of reduced substances/porphyrin precursors (RPO; 13% vs. 2%, P = 0.003) and ketone bodies (KET; 9% vs. 1%, P = 0.010) than the control group. These significant changes in metabolic markers suggest that patients in the experimental group may experience metabolic abnormalities, particularly in fat metabolism and ketone body production, further supporting the hypothesis of metabolic disturbances in cholelithiasis complicated by cholecystitis. These findings offer valuable insights for further exploration of the disease’s pathophysiology and clinical treatment.

Table 1 Clinical characteristics of cholelithiasis with cholecystitis and NC patients.

Volatile organic compound profile analysis in NC participants and patients with gallstones complicated by cholecystitis

Volatile organic compounds were characterized by their gas chromatographic retention indices and ion drift times and quantified by signal peak intensity. Three-dimensional data (retention index, drift time, and peak intensity) were generated for each urine sample (Fig. 1B). Volatile organic compound data were extracted from the two-dimensional spectra, with each point representing a signal peak (Fig. 1A). The two-dimensional coordinates of each signal peak were retrieved (with drift time on the x-axis), and an integration region was selected around each peak to extract the peak height values used to characterize the compounds. Using VOCal software (v0.1.1) and the GC-IMS library, a total of 60 volatile organic compound peaks were manually selected from the urine samples based on retention indices and drift times across all patients. Figure 1 shows the differences in VOCs between gallstone-associated cholecystitis samples and healthy control samples. Red indicates substances with higher concentrations in urine, while blue indicates those with lower concentrations. The 60 peaks comprised 49 identified compounds and 11 unidentified compounds (Fig. 1C). The unidentified compounds could not be confirmed through existing databases or reference standards, so their chemical structures and properties remain unknown. This prevents accurate interpretation of their biological significance in disease contexts, and without reference standards or benchmark data, their variation across samples cannot be quantified or verified. Consequently, they were excluded from subsequent analyses.

Fig. 1

(A) VOC data extracted from the two-dimensional spectra, with each point representing a signal peak; (B) Three-dimensional dataset generated for each urine sample, including retention index, drift time, and peak intensity; (C) A total of 60 VOC peaks were selected from urine samples. Figure 1 illustrates the differences in VOCs between gallstones with cholecystitis and healthy control samples, with red indicating substances with higher concentrations in urine and blue indicating those with lower concentrations. Using VOCal software (v0.1.1) and the GC-IMS library, 60 VOC peaks were manually selected based on the retention index and drift time of all patients. These peaks included 49 identified compounds and 11 unidentified compounds.

Performance of machine learning algorithms for diagnosing volatile organic compounds in urine

Based on peak height data from 60 VOCs (Supplementary Tables 1, 3), four machine learning methods (random forest, neural network, support vector machine, decision tree) were employed for modeling analysis. Test set results revealed varying performance among the four models in distinguishing gallstones with cholecystitis from healthy controls (Table 2). The random forest (RF) model achieved an AUC of 0.824, with an accuracy of 78.3%, sensitivity of 80%, specificity of 76.7%, precision of 77.4%, and an F1 score of 78.7%. This indicates good classification performance, although it was slightly lower than previously reported. The neural network (NN) model achieved an AUC of 0.86, an accuracy of 83.3%, a sensitivity of 83.3%, a specificity of 83.3%, a precision of 83.3%, and an F1 score of 83.3%, showing excellent performance across all metrics. The support vector machine (SVM) model achieved an AUC of 0.854, an accuracy of 78.3%, a sensitivity of 73.3%, a specificity of 83.3%, a precision of 81.5%, and an F1 score of 77.2%. While its AUC and precision were relatively high, its sensitivity and F1 score were somewhat lower than those of the neural network model (Fig. 2).

In contrast, the decision tree (DT) model achieved an AUC of 0.668, with an accuracy of 63.3%, sensitivity of 76.7%, and relatively low specificity at 50%. Its precision was 60.5% and F1 score 67.6%, demonstrating significantly inferior diagnostic performance compared to the other three models.
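The reported metrics all follow from a model's confusion matrix on the 60 test samples. The sketch below shows the relationships, using counts back-calculated from the reported RF rates (30 patients and 30 controls in the test set). These counts are an inference for illustration and are not taken from the paper's tables; the function name is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # recall among patients
    specificity = tn / (tn + fp)                 # recall among controls
    precision = tp / (tp + fp)                   # positive predictive value
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Assumed RF counts: 24 of 30 patients and 23 of 30 controls classified correctly.
rf = diagnostic_metrics(tp=24, fn=6, tn=23, fp=7)
print({name: round(value, 3) for name, value in rf.items()})
# {'sensitivity': 0.8, 'specificity': 0.767, 'precision': 0.774, 'accuracy': 0.783, 'f1': 0.787}
```

Since F1 is the harmonic mean of precision and sensitivity, it always lies between those two values, which gives a quick consistency check on any reported set of metrics.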

The results of the ANOVA test indicated that the differences between the models were statistically significant (F = 1.167E+29, p < 0.001) (Supplementary Table 2), suggesting that the performance of the different models varied significantly. To further compare the specific differences between the models, Tukey HSD post-hoc tests were conducted. These results showed that the differences between all models were statistically significant, with the performance difference between the NN and DT models being the most pronounced, while the differences between the SVM and RF models were relatively smaller.

Overall, the neural network model demonstrated superior performance with the highest F1 score, while the random forest and support vector machine models also showed strong results. The decision tree model exhibited weaker classification capabilities, particularly in terms of specificity.

Table 2 Diagnostic performance of machine learning models.
Fig. 2

ROC curve comparison of the four models: Random Forest (RF), Neural Network (NN), Support Vector Machine (SVM), and Decision Tree (DT). The AUC values are as follows: RF (AUC = 0.824), NN (AUC = 0.860), SVM (AUC = 0.854), and DT (AUC = 0.668). As shown in the figure, the Neural Network and Support Vector Machine models exhibit a better balance between sensitivity and specificity, demonstrating better classification performance, while the Decision Tree model shows significantly poorer diagnostic performance.

Identification of volatile organic compounds in urine using random forest and support vector machine analysis

Figure 3A displays the peak heights of the top ten VOCs ranked by Gini coefficient in the model analysis. Figure 3B reveals the top ten VOC peak heights showing intergroup differences between patients with gallstones and cholecystitis and the NC group. These include Linalool, Propyl propenyl disulphide, Methylthiobutyrate-M, Butylamine, and Methyl pentanoate-M. The RF, SVM, and NN algorithms were each used to model these five VOCs (Fig. 3C). The NN model achieved an AUC of 0.81, the RF model 0.77, and the SVM model 0.76 (Fig. 3D). Detailed results for all models are provided in Supplementary Table 2.

Fig. 3

Identification of volatile organic compounds in urine using random forest and support vector machine analysis. (A) Volcano plot displaying the relationship between the log fold change and the p-value of VOCs. Red dots represent significant variables based on the p-value and log fold change threshold (FC > 0.5, p-value < 0.05). Linalool is shown as an example. (B) Bar plot of the top 10 features ranked by importance scores based on the Random Forest model. Linalool, Propyl propenyl disulphide, and Methylthiobutyrate-M show the highest feature importance. (C) Distribution of peak heights for the key VOCs (Propyl propenyl disulphide, Methylthiobutyrate-M, Butylamine, Methyl pentanoate-M) in patients with gallstones complicated by cholecystitis (CC) and healthy controls (NC). Statistical significance is indicated by asterisks: **** p < 0.0001, ** p < 0.01. (D) ROC curve comparison of Random Forest (RF), Neural Network (NN), and Support Vector Machine (SVM) models. The AUC values are: RF = 0.77, NN = 0.81, SVM = 0.76, showing the performance of each model in diagnosing gallstones complicated by cholecystitis.

Urinary biomarkers

ROC curve analysis of the NN model revealed that Linalool, Propyl propenyl disulphide, Methylthiobutyrate-M, and Butylamine demonstrated significant discriminatory power in distinguishing patients with gallstones complicated by cholecystitis from healthy controls, with all AUC values exceeding 0.70 (Supplementary Table 2). Specifically, Linalool exhibited an AUC of 0.777, Propyl propenyl disulphide and Methylthiobutyrate-M each showed an AUC of 0.768, while Butylamine demonstrated an AUC of 0.731 (Fig. 4A).

Through SHAP analysis, we revealed the contribution of key features to the prediction outcomes in the RF, NN, and SVM models (Fig. 4B-D). In the Random Forest model, Butylamine and Methylthiobutyrate-M were the primary features influencing the prediction results, with the former contributing positively and significantly increasing the prediction values. In comparison, the contributions of Propyl propenyl disulphide and Linalool were relatively small. In the Neural Network model, Butylamine remained a key feature, exerting a significant positive impact on the prediction outcomes, although Methylthiobutyrate-M had a smaller contribution to the model. In the SVM model, Butylamine continued to be the most important feature, with a clear positive effect on the prediction, while Methyl pentanoate-M also contributed positively to the model. Other features, such as Methylthiobutyrate-M and Propyl propenyl disulphide, showed smaller contributions, indicating that they had a limited impact on the SVM model’s predictions. Overall, these analyses help us understand the specific role of different features in the prediction decisions of each model, providing strong interpretability support.

Fig. 4

(A) Individual Neural Network (NN) models for each VOC feature. The ROC curves show the performance of NN models built for individual VOCs: Linalool (AUC = 0.777), Propyl propenyl disulphide (AUC = 0.769), Methylthiobutyrate-M (AUC = 0.768), Butylamine (AUC = 0.731), and Methyl pentanoate-M (AUC = 0.587). These models demonstrate varying diagnostic capabilities for each feature. (B) Feature importance plot for Methyl pentanoate-M, Methylthiobutyrate-M, Propyl propenyl disulphide, Linalool, and Butylamine. The plot shows the predicted impact of each feature on the NN model. The actual and average prediction values indicate that Methyl pentanoate-M is the most influential for the model. (C) Prediction distribution for Butylamine, Methylthiobutyrate-M, and other features. The actual prediction for Butylamine is 0.08, with an average prediction of 0.50. The phi coefficients show how the features contribute to the prediction. (D) Prediction distribution for Propyl propenyl disulphide, Butylamine, Methyl pentanoate-M, and other features. The actual prediction for Propyl propenyl disulphide is 0.22, with an average prediction of 0.50. The phi coefficients represent each feature’s impact on the prediction outcome.

Discussion

This study employed GC-IMS to investigate the urinary VOC profile characteristics in patients with gallstones complicated by cholecystitis. In urine samples from patients with gallstones complicated by cholecystitis and healthy controls, we identified four VOCs exhibiting differential trends and statistically significant differences. Based on these findings, we constructed and evaluated a VOC diagnostic model for gal

Increasing research is focusing on VOCs generated from metabolites in patients with disease, with urine, exhaled breath, bile, and feces demonstrating potential diagnostic value. Exhaled breath, due to its ease of collection and storage, became one of the earliest sample types used for VOC detection. Princivalle et al. developed a predictive model based on 10 VOCs in alveolar gas that effectively distinguishes pancreatic ductal adenocarcinoma from healthy controls24. Navaneethan et al. found that detecting volatile organic compounds in bile can aid in diagnosing cholangiocarcinoma in patients with primary sclerosing cholangitis25. VOCs can enhance the efficacy of fecal immunochemical testing (FIT) in colorectal cancer (CRC) screening. For individuals who test negative on FIT but still exhibit symptoms, VOCs also serve as an effective exclusionary test to reduce the risk of missed diagnosis. Furthermore, whether used alone or in combination with FIT, VOCs can be employed for triaging individuals awaiting colonoscopy in polyp surveillance populations, demonstrating superior performance compared to FIT in identifying high-risk individuals. Regarding urothelial bladder cancer, studies by Lett et al. and George et al. demonstrate that significant alterations in urinary VOCs can distinguish some (though not all) bladder cancer samples from clinically relevant controls, suggesting their potential diagnostic and monitoring applications for this disease26.

A study on prostate cancer analyzed urine samples from 66 prostate cancer patients and 87 non-cancer controls using GC-IMS, constructing machine learning models based on four representative VOCs. The AUC values for the RF and SVM models were 0.955 and 0.981, respectively, demonstrating strong diagnostic discrimination capability27. Another study combined GC-IMS with gas chromatography-time-of-flight mass spectrometry (GC-TOF-MS) to detect VOCs in urine samples from patients with bladder cancer (BCa) and prostate cancer (PCa): GC-IMS demonstrated significant discriminatory power for BCa and PCa (AUCs of 0.97 and 0.89, respectively) and effectively distinguished both from non-cancer controls (BCa AUC = 0.95, PCa AUC = 0.89), with key compounds including 2-ethyl-1-ethanol and 3-methoxyfuran. GC-TOF-MS further identified 34 potential biomarkers, including 13 associated with BCa and 7 with PCa. Key compounds such as 2,6-dimethyloctane and nonanal provide stronger evidence for precise subtyping of urinary tract tumors28. Mozdiak et al. analyzed urine samples from FOBT-positive individuals using GC-IMS, achieving effective differentiation between CRC and controls. In colorectal screening cohorts, urinary VOC biomarkers also distinguished CRC, adenomas, and individuals without significant intestinal pathology29. A systematic review of evidence on GC-IMS in CRC screening identified butyraldehyde (AUC = 0.98) as a key biomarker, with combined sensitivity and specificity of 84% and 70%, respectively. Combining FOBT with VOC detection can identify approximately 33% more CRC cases30. Combining traditional testing with VOCs holds promise for further enhancing clinical early screening and diagnosis capabilities. Studies on lung cancer populations demonstrate a sensitivity of 85% and specificity of 90%, outperforming electronic noses. The method detected eight key VOCs, including 2-pentanone, 2-hexenal, 2-hexen-1-ol, 4-hepten-2-ol, 2-heptanone, 3-octen-2-one, 4-methylpentanol, and 4-methyloctanol31. For esophageal cancer, a random forest model based on 8 VOCs achieved an AUC of 0.874, with a sensitivity of 84.2% and specificity of 90.6%, suggesting that urine VOC analysis can serve as a non-invasive screening method32. Additionally, urinary VOCs can distinguish hepatocellular carcinoma (HCC) from fibrosis cases (AUC = 0.97), preliminarily identifying seven HCC-associated chemicals including 2-butanone, 2-hexanone, and bicyclo(4.1.0)heptane33.
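
The AUC, sensitivity, and specificity values quoted above all derive from the same two inputs: a true class label and a model score for each sample. The following is a minimal sketch of how these metrics can be computed. The labels, scores, and the 0.5 threshold are hypothetical illustration values, not data from this study or from the cited studies.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive sample
    scores higher than a randomly chosen negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """Classify a sample as positive when score >= threshold, then return
    (sensitivity, specificity) = (TP / (TP + FN), TN / (TN + FP))."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical example: 1 = patient, 0 = healthy control.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

auc = roc_auc(labels, scores)
sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
print(f"AUC = {auc:.3f}, sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Unlike sensitivity and specificity, the AUC does not depend on a chosen threshold, which is one reason it is commonly reported alongside them.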

In this study, we uniformly collected midstream urine samples from patients and stored them at -80 °C. Prior to instrument operation, we ensured consistent freezing times for each urine sample to minimize variations caused by degradation and changes in urinary metabolites. Because urine collection is non-invasive, urinary VOCs are well suited to early disease screening and diagnosis. Compared to currently used blood screening methods, urine samples are more readily accepted by patients. Furthermore, urine is rich in both exogenous and endogenous metabolites filtered by the kidneys, reflecting rapid changes in the local environment of the urogenital system and in metabolic pathways throughout the body34. It can even indicate disease onset34, making it an ideal sample for VOC research.

To further enhance the interpretability of the models, we applied the SHAP method to reveal the contribution of different features to the model’s prediction outcomes. In the RF model, Butylamine and Methylthiobutyrate-M were the key features influencing the prediction results, with Butylamine contributing significantly to the model’s predictions and exhibiting a clear positive effect. In contrast, Propyl-propenyl disulphide and Linalool had smaller contributions to the prediction. In the NN model, Butylamine remained the dominant feature, exerting a significant positive impact on the prediction results, while Methylthiobutyrate-M contributed less. In the SVM model, Butylamine was also the most important feature, showing a clear positive effect, with Methyl pentanoate-M also contributing to the model’s predictions. Through SHAP analysis, we were able to gain deeper insights into the role of different features in each model, providing strong support for further optimization and interpretation of the models.
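
For readers unfamiliar with the quantities behind such plots, the sketch below computes exact Shapley values (the phi coefficients) for a single sample by brute force. It relies on the property that the phi values sum to the difference between the actual prediction for that sample and the average prediction over a background set. The toy model `toy_predict`, the three features, and the background rows are hypothetical and are not the models or data of this study; it is also an assumption that the software used here follows this definition, since practical SHAP implementations usually approximate it.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, background):
    """Exact Shapley values for one sample x.

    Features outside a coalition are replaced by the values of each
    background row in turn, and the resulting predictions are averaged.
    Returns (phi, average_prediction, actual_prediction).
    """
    n = len(x)

    def value(coalition):
        total = 0.0
        for row in background:
            mixed = [x[j] if j in coalition else row[j] for j in range(n)]
            total += predict(mixed)
        return total / len(background)

    phi = []
    for j in range(n):
        others = [k for k in range(n) if k != j]
        contribution = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                contribution += weight * (value(s | {j}) - value(s))
        phi.append(contribution)
    return phi, value(set()), value(set(range(n)))


def toy_predict(v):
    """Hypothetical model: uses features 0 and 1 (with an interaction)
    and ignores feature 2 entirely."""
    return 0.5 + 0.3 * v[0] - 0.2 * v[1] + 0.1 * v[0] * v[1]


# Hypothetical background data and the sample to explain.
background = [[0.0, 0.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 1.0, 0.0]]
x = [1.0, 0.0, 1.0]

phi, average, actual = shapley_values(toy_predict, x, background)
print("phi:", [round(p, 4) for p in phi])
print("actual - average:", round(actual - average, 4), "sum(phi):", round(sum(phi), 4))
```

A negative phi pushes the prediction below the average and a positive phi pushes it above, while a feature the model never uses receives a phi of zero. The brute-force loop scales exponentially with the number of features, so it is only practical for small feature sets such as the handful of VOCs considered here.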

These five key compounds (Linalool, Propyl-propenyl disulphide, Methylthiobutyrate-M, Butylamine, Methyl pentanoate-M) are closely related to the pathological mechanisms of gallstones complicated by cholecystitis. Linalool has anti-inflammatory and antioxidant properties and may alleviate inflammation and oxidative damage in the gallbladder, reducing the irritation caused by gallstones. Propyl-propenyl disulphide, an organic sulfur compound, may inhibit the release of inflammatory mediators in the gallbladder, slowing the formation of gallstones and improving bile composition. Methylthiobutyrate-M may be involved in lipid metabolism disorders, regulating the formation of cholesterol crystals and promoting gallstone formation. Butylamine plays an important role in the immune response in cholecystitis and may exacerbate the pathological progression of inflammation by enhancing the inflammatory response. Methyl pentanoate-M, a product of lipid metabolism, may be related to gallstone formation and the metabolic disorders in cholecystitis, further driving pathological changes in the gallbladder. In summary, these compounds play a key role in the metabolic and inflammatory processes of gallstones complicated by cholecystitis, potentially promoting the development and progression of the disease through the regulation of inflammation, immune response, and lipid metabolism.

In conclusion, this study successfully explored the potential of urinary VOCs for the early diagnosis of gallstones complicated by cholecystitis using GC-IMS combined with machine learning models. The study identified significant differences in compounds such as Linalool, Propyl-propenyl disulphide, Methylthiobutyrate-M, and Butylamine between patients with gallstones complicated by cholecystitis and healthy controls. Machine learning models, including neural networks, random forests, and support vector machines, achieved high diagnostic accuracy, with the neural network model showing the best classification performance. SHAP analysis further revealed the contributions of different features to the model’s predictions, providing a deeper understanding and enhancing model interpretability.

Despite these promising preliminary results, there are limitations, including the single-center design and small sample size, which may affect the generalizability of the findings. Future studies should conduct multi-center and larger-scale validations to improve the model’s robustness and external validity. Additionally, this study did not combine VOCs with other clinical indicators or biomarkers, which offers opportunities for further optimization of the model. Overall, the use of urinary VOCs combined with GC-IMS and machine learning methods presents a promising approach for the early diagnosis of gallstones complicated by cholecystitis, with significant clinical application potential.