Establishment of a machine learning model for predicting splenic hilar lymph node metastasis

Ishizu, Kenichi; Takahashi, Satoshi; Kouno, Nobuji; Takasawa, Ken; Takeda, Katsuji; Matsui, Kota; Nishino, Masashi; Hayashi, Tsutomu; Yamagata, Yukinori; Matsui, Shigeyuki; Yoshikawa, Takaki; Hamamoto, Ryuji

doi:10.1038/s41746-025-01480-x

Article
Open access
Published: 11 February 2025

Kenichi Ishizu^1,2,
Satoshi Takahashi^1,3,
Nobuji Kouno^1,3,
Ken Takasawa^1,3,
Katsuji Takeda^1,3,
Kota Matsui⁴,
Masashi Nishino²,
Tsutomu Hayashi²,
Yukinori Yamagata²,
Shigeyuki Matsui⁴,
Takaki Yoshikawa² &
…
Ryuji Hamamoto^1,3

npj Digital Medicine volume 8, Article number: 93 (2025)

Abstract

Upper gastrointestinal cancer (UGC) sometimes metastasizes to the splenic hilum lymph node (SHLN). However, surgical removal of the SHLN is technically difficult, and the risk of postoperative complications is high. Existing models that predict SHLN metastasis usually provide only point estimates of risk, which offer insufficient information for decision-making. To address this issue, we aimed to develop a Bayesian logistic regression model called Bayes-SHLNM. The performance of the model was compared with that of a frequentist logistic regression (FLR) model as a benchmark, and the posterior probability distribution (PPD) was shown for each individual case. The performance of Bayes-SHLNM was equivalent to that of the FLR model, and the PPD for each case was visualized to convey its uncertainty. These results indicate that the Bayes-SHLNM model has the potential to be used as a decision support system in clinical settings where uncertainty is high.

Introduction

Gastric cancer ranks fifth in terms of incidence and fourth in terms of cancer-related mortality worldwide¹. The principle of curative treatment for gastric cancer is surgical resection with appropriate regional lymphadenectomy, as excessive resection causes high morbidity and shortens life expectancy^2,3,4,5,6,7. According to the Japanese Gastric Cancer Classification, the splenic hilar lymph node (SHLN) is a regional lymph node for gastric cancer in the upper third of the stomach. The reported incidence of SHLN metastasis ranges from 2.8% to 27.9%^{8,9,10,11,12,13}. Although a Japanese phase III trial comparing splenectomy and spleen preservation demonstrated survival non-inferiority, that trial was limited to upper advanced gastric cancer not invading the greater curvature⁶. Gastric cancer invading the greater curvature showed a high frequency of metastasis and a high therapeutic value index at SHLNs; thus, the Japanese Gastric Cancer Treatment Guidelines recommended splenic hilar nodal dissection for these tumors². Moreover, our retrospective study showed that the frequency and therapeutic value index at SHLNs were high even for gastric cancer without invasion of the greater curvature, especially when the tumor was located in the posterior wall and the histology was of the undifferentiated type¹⁴. Thus, SHLN dissection with splenectomy is widely performed in Japan. Splenectomy has several disadvantages, particularly a high incidence of postoperative morbidity (approximately 20–30%), which has been reported to offset the survival benefit in certain cases in randomized controlled clinical trials and retrospective studies^{15,16,17,18,19}. The importance of developing and selecting an appropriate surgical approach based on the oncological status of the cancer has increased^20,21. Furthermore, uniform treatment strategies typically applied to healthy patients are no longer feasible in patients with complex comorbidities or significant frailty^22,23,24.
At present, tools and definitive criteria for this decision-making are clearly lacking and need to be developed. Surgeons and healthcare staff decide on the best operation for each patient based on the oncological profile and background of the patient. This complex decision-making process highlights the urgent need for tools that visualize and facilitate the sharing of critical information, thereby enhancing effective communication and enabling comprehensive decision-making in healthcare settings.

Machine learning techniques have been advancing rapidly in recent years and are being implemented in the field of medical oncology^{25,26,27,28,29,30,31,32,33}. A key feature of machine learning is its data-driven nature, in which algorithms analyze large amounts of data in order to discover patterns and rules. Another key feature is its adaptability, which allows models to be updated and improve their accuracy as new data becomes available. These features make machine learning a powerful tool for addressing complex medical challenges, where data-driven insights and adaptability are critical for improving predictive performance and tailoring clinical decisions. Various machine learning models for predicting lymph node metastasis (LNM) in gastric cancer have been developed, mainly focusing on the eligibility for endoscopic resection in early gastric cancer and the prognosis of advanced gastric cancer (AGC)^34,35,36. Notably, no models have been developed that attempt to change the surgical plans or extent of lymph node dissection^37,38,39,40. Traditional machine learning models based on frequentist approaches fail to meet the clinical practice requirements because of their inability to predict uncertainty. Even in high-performance models, it is difficult to change clinical decision-making processes. The Bayesian approach allows prior information and beliefs to be incorporated into statistical inferences, thus providing a powerful tool for decision-making under uncertainty. Unlike traditional frequentist methods, the Bayesian framework provides intuitive interpretability by expressing results as probabilities. For example, clinicians can better understand statements such as “there is a 95% probability that the odds ratio is between 2.0 and 3.0,” which directly conveys the uncertainty associated with a prediction. 
This probabilistic interpretation makes the Bayesian approach particularly useful for complex clinical decisions where actionable insights must account for uncertainty. Another key advantage of Bayesian methods is their ability to quantify uncertainty through the posterior distributions of model parameters, thereby providing a comprehensive understanding of prediction confidence. Bayesian methods are inherently adaptive, allowing models to be dynamically updated as new data becomes available. By leveraging these strengths - intuitive interpretability, robust uncertainty quantification, and adaptability - the Bayesian approach offers a unique advantage over traditional methods in addressing the challenges of clinical decision-making. In the medical field, Bayesian inference has demonstrated effectiveness in various settings, including diagnosis, treatment planning, and epidemiological studies^41,42,43. In this context, uncertainty may play an important role in determining the need for SHLN dissection.

In this study, we developed a Bayesian logistic regression model, the Bayesian prediction of SHLN metastasis (Bayes-SHLNM) model, to predict SHLN metastasis using data from patients who underwent total gastrectomy and splenectomy between 2000 and 2012. The primary distinction between Bayesian and frequentist modeling approaches lies in how they handle uncertainty and parameter estimation. Frequentist models, such as the traditional logistic regression model (frequentist logistic regression [FLR]), provide single-point parameter estimates, leading to a single predictive model. By contrast, Bayesian methods treat parameters as random variables with distributions, resulting in a collection of models rather than a single model (Fig. 1). This ensemble of models captures the uncertainty inherent in the parameter estimates. The key advantage of the Bayesian approach is that it yields the posterior probability distribution (PPD) for SHLN metastasis risk. This distribution offers a richer and more comprehensive view of the predictive uncertainty than the point estimates provided by frequentist models. The Bayesian models were benchmarked against the FLR model to evaluate their performance. The PPD for each prediction was visualized to demonstrate the uncertainty and range of possible outcomes, which can be crucial for clinical decision-making in high-risk and uncertain scenarios.
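The contrast between a single fitted model and a collection of models can be illustrated with a small sketch. This is not the Bayes-SHLNM model: the single covariate, the coefficient values, and the Gaussian draws standing in for MCMC output are all hypothetical, chosen only to show how a set of coefficient samples becomes a distribution over one patient's predicted risk.

```python
import math
import random
import statistics


def sigmoid(z):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical standardized covariate value for one patient.
x = 1.2

# Frequentist-style fit: one intercept and one slope give one risk value.
point_intercept, point_slope = -2.5, 0.8
point_risk = sigmoid(point_intercept + point_slope * x)

# Bayesian-style fit: many plausible (intercept, slope) pairs.
# Gaussian draws stand in for real posterior samples (an assumption).
rng = random.Random(0)
posterior_draws = [
    (rng.gauss(-2.5, 0.3), rng.gauss(0.8, 0.4)) for _ in range(4000)
]

# Each draw is one model; together they give a distribution of risks.
risk_samples = [sigmoid(b0 + b1 * x) for b0, b1 in posterior_draws]

print(f"point estimate of risk: {point_risk:.3f}")
print(f"posterior mean risk:    {statistics.fmean(risk_samples):.3f}")
print(f"posterior sd of risk:   {statistics.stdev(risk_samples):.3f}")
```

The spread of `risk_samples` is the kind of information a per-patient PPD conveys and a single point estimate does not.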

**Fig. 1: Comparison of frequentist and Bayesian (Bayes-splenic hilum lymph node metastasis [SHLNM]) models in predicting metastasis probability for SHLN dissection in upper gastrointestinal cancer.**

We therefore aimed to develop a predictive model that supports decision-making by quantifying prediction uncertainty through Bayesian inference, and to visualize the individual PPD of SHLN metastasis in upper gastrointestinal cancer (UGC) based on clinicopathologic characteristics.

Results

Study population selection process

The patient selection flowchart is shown in Fig. 2. Between January 2000 and December 2012, a total of 5957 patients underwent gastrectomy with nodal dissection. Of these, 798 patients underwent total gastrectomy with splenectomy for primary gastric cancer. In total, 35 patients diagnosed with pT0 or pT1, 169 patients diagnosed with pStage IV, and 1 patient who underwent R1 or R2 resection were excluded. The final study population comprised 593 patients.

Baseline characteristics

The study cohort comprised 593 patients. Male sex was predominant, and 15.2% of the patients received neoadjuvant chemotherapy. Half of the tumors predominantly invaded the upper gastric body, and 35.8% had greater curvature invasion (GCI). The predominant histology (histology 1) was poorly differentiated adenocarcinoma non-solid type (por2), and signet-ring cell carcinoma (sig) was the second most common histological component (histology 2). SHLN metastasis (#10), the prediction target, was present in 8.1% of the patients. The most frequent LNMs were found along the lesser curvature (#1, #3). LNMs adjacent to the SHLN (#4sb, #4d, #11d) were found in 7.3–12% of the patients. The cohort characteristics are summarized in Table 1.

Table 1 Patient cohort characteristics for training datasets (n = 593)

Comparison of the performance of the Bayes-SHLNM and frequency-based logistic machine learning models

Table 2 shows the 5-fold cross-validation (5fCV) performance of the Bayesian and FLR models. Among the four Bayesian models, the Bayes-SHLNM model showed superior performance in terms of the receiver operating characteristic area under the curve (ROC AUC) (0.83), precision-recall AUC (PRAUC, 0.35), and F1 score (0.31), with comparable results to those of the FLR model (Fig. 3). These findings indicate that the Bayes-SHLNM is a robust alternative to the FLR model.

**Fig. 3: Performance of four Bayesian models (Bayes-splenic hilum lymph node metastasis [SHLNM], Bayesian least absolute shrinkage and selection operator, basic, and Student-T models) and one frequentist model (frequentist logistic regression model) based on the average results from 5-fold cross-validation, along with the 95% confidence intervals.**

Table 2 Performance of 5-fold cross-validation for the Bayesian logistic regression and FLR models

The results of the Bayes-SHLNM and FLR model predictions obtained from 5fCV are summarized in Table 3 and Supplementary Table 1. When the tumors were divided into two categories according to the presence or absence of GCI, which is the recommended indication for SHLN dissection, the positive predictive value was approximately 20% in both categories, whereas the negative predictive value for tumors without GCI was 99%. The results of Bayes-SHLNM and FLR were similar; however, for cases without GCI, Bayes-SHLNM performed slightly better, whereas for cases with GCI, FLR was slightly more accurate.

Table 3 Prediction results of the Bayes-SHLNM model for AGC with or without GCI

The Bayes-SHLNM model posterior probability distribution for individual patients

Six representative cases are shown in Figs. 4, 5, in which the Bayes-SHLNM model provided the individual PPD of SHLN metastasis, conveying both the prediction and its uncertainty. Non-GCI (NonG)-Case 1 (Fig. 4a) was a 61-year-old man with AGC without GCI (U, Post, Type 1, 60 mm, por2 > sig, pT4a [SE]) and LNM (#1, #3, #4sa, #4sb, #11d). According to the Japanese Gastric Cancer Association (JGCA) guidelines (version 6), SHLN dissection, including splenectomy, is strongly recommended against for tumors that do not invade the greater curvature line. However, this model can provide an opportunity to reconsider whether such a patient is at sufficient risk to warrant SHLN dissection. WithG-Case 1 (Fig. 5a) was a 69-year-old man with AGC, GCI (UML, Circ, Type 4, 210 mm, por2 > sig, pT4a [SE]), and LNM (#1, #2, #3, #4sa, #4sb, #5, #6, #7, #9, #11p, #11d). NonG-Case 2 (Fig. 4b) was a 56-year-old woman with AGC without GCI (U, Less, Type 2, 60 mm, tub2, pT3 [SS]) and LNM (#3). The posterior distributions of these two cases support decisions based on the JGCA guidelines. For tumors that invade the greater curvature line, SHLN dissection, including splenectomy, is weakly recommended. WithG-Case 2 (Fig. 5b) was a 66-year-old woman with AGC with GCI (U, Ant, Type 2, 45 mm, tub2, pT3 [SS]) and no LNM. The posterior distribution of this case supports reconsideration of the decision based on the JGCA guidelines. This model can thus provide useful information for reaching a consensus on whether a patient should undergo splenectomy.

**Fig. 4: Posterior probability distributions of splenic hilum lymph node (SHLN) metastasis predicted by the Bayes-SHLNM model for three NonG cases.**

**Fig. 5: Posterior probability distributions of splenic hilum lymph node (SHLN) metastasis predicted by the Bayes-SHLNM model for three cases with greater curvature invasion (WithG).**

However, Figs. 4c, 5c show two cases in which decision-making might not change even when uncertainty is considered. In such cases, the predictive uncertainty is too large to be acceptable as a basis for clinical judgment, and clinicians can instead rely on the JGCA guidelines, the patient’s wishes, or institutional policies. One (Fig. 5c, WithG-Case 3) was a 55-year-old man with AGC, GCI (M, Gre, Type 5, 77 mm, por2 > tub2, pT2 [MP]), and LNM (#3, #4d). The Bayes-SHLNM model predicted “negative,” but there was room to reconsider treating the case as positive because the posterior mean was 0.075, with a 95% highest density interval (HDI) of 0–0.2. The other patient (Fig. 4c, NonG-Case 3) was a 79-year-old man with AGC without GCI (U, Ant, Type 2, 100 mm, tub2 > por1, pT3 [SS]) and LNM (#1, #3, #7, #9, #11p). The Bayes-SHLNM model predicted “positive,” but the PPD with its uncertainty left room to reconsider not performing a splenectomy because of the patient’s age. All individual PPDs of the 5fCV models are shown in Supplementary Figure 1.
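A highest density interval can be computed directly from posterior samples as the narrowest interval that holds the requested share of them. The sketch below is a minimal illustration using only the standard library; the helper name `hdi` and the synthetic Beta-distributed samples are assumptions for demonstration and are not the study's data or code.

```python
import math
import random
import statistics


def hdi(samples, mass=0.95):
    """Narrowest interval that contains a `mass` fraction of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    window = min(n, max(1, math.ceil(mass * n)))  # samples inside the interval
    best_low, best_high = ordered[0], ordered[window - 1]
    # Slide a fixed-size window over the sorted samples; keep the narrowest.
    for start in range(1, n - window + 1):
        low, high = ordered[start], ordered[start + window - 1]
        if high - low < best_high - best_low:
            best_low, best_high = low, high
    return best_low, best_high


# Synthetic right-skewed risk samples standing in for one case's PPD.
rng = random.Random(1)
ppd_samples = [rng.betavariate(1.5, 18.5) for _ in range(12000)]

low, high = hdi(ppd_samples, mass=0.95)
print(f"posterior mean: {statistics.fmean(ppd_samples):.3f}")
print(f"95% HDI: {low:.3f} to {high:.3f}")
```

For a skewed distribution such as this one, the HDI sits closer to zero than an equal-tailed interval would, which is why it is a natural summary for low-probability events.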

Posterior distribution of the regression coefficient parameters

Figure 6 shows the posterior distributions of the 47 regression coefficients in the Bayes-SHLNM model trained using 593 cases. Notably, the 95% HDIs of both the #4sb and #4sa coefficients lay entirely above 0, suggesting a significant positive influence on the model. Tumor location in the greater curvature, tumor size, predominant histological por2, LNM #11d, and LNM #12a tended to have positive coefficients.

**Fig. 6: Posterior distributions of the 47 regression coefficients in the Bayes-splenic hilum lymph node metastasis (SHLNM) model for SHLNM prediction in upper gastrointestinal cancer (UGC).**

Discussion

In the present study, we developed Bayesian models to predict SHLN metastasis in UGC using data from 593 patients with UGC who underwent total gastrectomy with splenectomy (TGS). To the best of our knowledge, this is the first report of a machine-learning model using Bayesian techniques for the prediction of SHLNM. The Bayes-SHLNM model performed comparably to the FLR model, with a mean ROC AUC of 0.83 and an F1 score of 0.31 by 5fCV. When UGC was divided into two categories based on whether the UGC had GCI, which has been widely used as an indicator of SHLN dissection in Japan, positive predictive values, or precisions, were 20% regardless of GCI, whereas the negative predictive values were 99% in UGCs without GCI and 91% in UGCs with GCI². Moreover, the Bayes-SHLNM model produced PPDs that conveyed the uncertainty in the prediction of individual cases. These results suggest that the Bayes-SHLNM model has the potential to help in clinical decision-making regarding whether SHLN dissection should be performed for UGC.

The performance of the Bayes-SHLNM model is comparable with that of the frequentist model. Previous models for predicting LNM in gastric cancer have focused on the decision between surgery and endoscopic resection, using regional LNM prediction based on a frequentist approach. These models were reported to achieve the following values: AUC, 0.69–0.94; F1, 0.29–0.31; precision, 8–21%; and negative predictive value, 99%^34,35,36. Given the well-established benefits of gastric surgery, false negatives in LNM prediction models for early gastric cancer are unacceptable. However, in our study, considering the wide range of safety and feasibility of TGS in real-world data, the acceptable levels of false negatives and false positives differed for each cohort. In this study, we developed a prediction model with predictive performance comparable with that of a frequentist model, together with a range of uncertainty for its predictions.

The strength of this model is that it provides a posterior probability distribution for each case; that is, uncertainty is accounted for. The benefit of considering uncertainty in decision-making has been reported in the literature, especially for situations that may occur infrequently but have significant negative effects if overlooked in unbalanced and limited data^43,44. This model provides an individual indicator that can be used to weigh the advantages of SHLN dissection, an invasive treatment, against the disadvantage of possible complications in each case.

In addition, incorporating uncertainty into decision-making has several practical benefits in clinical settings. First, uncertainty estimates allow clinicians to identify cases in which predictions are highly reliable versus those in which additional diagnostic testing or consultation may be needed. For example, if the model predicts SHLN metastasis with high certainty, it can streamline the decision to perform dissection. Conversely, when the model’s uncertainty is high, it highlights the need for further investigation or a more cautious approach.

Second, quantifying uncertainty facilitates better communication between clinicians and patients. By presenting predictions as probability distributions rather than definitive outcomes, clinicians can transparently discuss the risks and benefits, enabling shared decision-making and helping patients feel more informed and involved in their care. Incorporating these advantages into clinical workflows ensures that uncertainty not only informs decision-making but also enhances robustness, balancing the risks and benefits of invasive interventions such as SHLN dissection. We believe that this feature can serve as a powerful communication tool, fostering better collaboration among clinicians and between clinicians and patients. In situations with significant uncertainty, this method allows for the effective sharing of complex information and helps all parties understand the risks and probabilities involved. Through improved communication, clinicians and patients can work together to select informed and acceptable treatment options tailored to the unique circumstances of each case.

Furthermore, although the Bayesian model has demonstrated its ability to provide uncertainty estimates, comparisons with other machine learning models, such as random forests, remain an important direction for future work. Random forests and similar models can generate prediction intervals based on the variance in the data, providing insight into data uncertainty. However, these intervals primarily reflect data variability and do not account for model uncertainty, which is a critical component of clinical decision-making. By contrast, Bayesian methods provide a more comprehensive framework by quantifying both data and model uncertainty, which can be particularly valuable in situations with limited or imbalanced data.

Expanding this study to include comparisons with these models would provide a broader perspective on the strengths and limitations of Bayesian approaches. Such comparisons would require significant additional work in order to ensure fair and rigorous evaluations as well as thorough validation using external datasets. Although this is beyond the scope of this study, we acknowledge the value of this approach and suggest that it is an important avenue for future research. These efforts would help further establish the robustness and clinical utility of Bayesian models compared to other predictive modeling techniques.

Hyperparameter selection plays a critical role in both frequentist and Bayesian modeling approaches; however, the methodologies are fundamentally different. In frequentist models, hyperparameters such as regularization strength are treated as fixed values determined by cross-validation or optimization techniques. In our study, we used Optuna for efficient hyperparameter tuning in the frequentist logistic regression to ensure optimal model performance.

In contrast, Bayesian models treat hyperparameters as probabilistic variables and often assign prior distributions to directly incorporate uncertainty into the modeling process. In this study, several prior distributions of Bayesian model parameters were evaluated to assess their impact on predictive performance and uncertainty quantification. The horseshoe prior performed the best, demonstrating superior accuracy and robustness. The horseshoe prior is particularly advantageous because of its ability to handle sparsity and mitigate overfitting, making it a natural choice for our dataset, which has a moderate number of predictors relative to the sample size.
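The way a horseshoe prior handles sparsity can be seen by drawing coefficients from it and comparing them with draws from a standard normal prior. The sketch below uses the textbook construction (a normal distribution whose scale is a half-Cauchy local parameter) with the global scale `tau` fixed at 1 for simplicity; that fixed value, the helper names, and the cutoffs 0.1 and 3.0 are assumptions for illustration, not the settings used for Bayes-SHLNM.

```python
import math
import random

rng = random.Random(42)
N_DRAWS = 20000


def half_cauchy(rng, scale=1.0):
    """Draw from a half-Cauchy distribution via its inverse CDF."""
    return scale * math.tan(math.pi * rng.random() / 2.0)


def horseshoe_draw(rng, tau=1.0):
    """One coefficient: Normal(0, (tau * lambda)^2), lambda ~ HalfCauchy(1)."""
    local_scale = half_cauchy(rng)
    return rng.gauss(0.0, tau * local_scale)


def share(values, predicate):
    """Fraction of values that satisfy the predicate."""
    return sum(1 for v in values if predicate(v)) / len(values)


horseshoe = [horseshoe_draw(rng) for _ in range(N_DRAWS)]
normal = [rng.gauss(0.0, 1.0) for _ in range(N_DRAWS)]

# Mass very close to zero (strong shrinkage of irrelevant predictors).
near_zero_hs = share(horseshoe, lambda v: abs(v) < 0.1)
near_zero_n = share(normal, lambda v: abs(v) < 0.1)
# Mass far from zero (heavy tails let real signals escape shrinkage).
large_hs = share(horseshoe, lambda v: abs(v) > 3.0)
large_n = share(normal, lambda v: abs(v) > 3.0)

print(f"|beta| < 0.1: horseshoe {near_zero_hs:.3f}, normal {near_zero_n:.3f}")
print(f"|beta| > 3.0: horseshoe {large_hs:.3f}, normal {large_n:.3f}")
```

Compared with the normal prior, the horseshoe places more prior mass both very near zero and far in the tails, which is the property that lets it shrink most coefficients hard while leaving a few large ones nearly untouched.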

This distinction between the frequentist and Bayesian approaches highlights the flexibility and robustness of Bayesian models, particularly in contexts where quantifying uncertainty is critical. By incorporating hyperparameter uncertainty into posterior distributions, Bayesian models provide a more comprehensive framework for prediction and decision-making than frequentist methods that rely on fixed hyperparameter values.

We selected 47 parameters as the coefficients of logistic regression and examined models trained with four different prior distributions for regularization. The horseshoe prior model performed the best and exhibited the strongest regularization among the four models analyzed. This result can be attributed to correlations among various explanatory factors, such as tumor morphology, size, location, histology, and LNM sites. Because clinical decisions are usually based on a combination of all these factors, we decided not to omit categories arbitrarily, but to adjust the regularization to address overfitting and multicollinearity. Figure 6 shows the posterior distributions of the 47 parameters. Only the coefficients for #4sa and #4sb had 95% HDIs that excluded zero. These items are consistent with the factors identified by previous statistical methodologies. Previous studies have identified independent risk factors for SHLN metastasis, such as type 4 macroscopic type, larger and more deeply invading tumors, certain regional lymph node metastases (#4sa, #4sb, #7, or #11), and undifferentiated-type histology, which is consistent with our results^45,46. In our study, the pathological T-factors and #7 were not significant, which can be explained by the exclusion of pStage IV, including #16LN and CY positivity, in our cohort.

For several parameters, the 95% uncertainty range included zero. During the experimental phase, we explored models that included selected parameter subsets. However, given the limited dataset used in our experiments, it was challenging to completely eliminate subjectivity in the feature selection methods. Therefore, we decided not to perform feature selection. Instead, we chose to include all 47 parameters and relied on the regularization provided by the horseshoe prior to address overfitting and multicollinearity. We believe that this approach provides a more comprehensive representation of the data and is more consistent with the multifactorial nature of clinical decision-making. We also recognize that feature selection may become a more viable approach as larger datasets become available for future studies. This could help further reduce multicollinearity and improve model interpretability without compromising predictive accuracy.

This study has some limitations. First, this was a retrospective study involving a specific patient group from a single high-volume institution. Following the publication of the JGCA guidelines in Japan, splenectomy with SHLN dissection has not been performed in patients with gastric cancer without GCI. Consequently, a larger sample size is not expected in the future. Furthermore, the low rate of SHLN metastasis leads to imbalanced data, which results in the suboptimal performance of traditional machine learning methods. However, the Bayesian approach adopted in this study shows promise for addressing this issue. Second, the model was not validated using an external cohort. Therefore, further validation using a different cohort is required. Third, we could not verify the accuracy of the PPD. However, the most important factor is its usefulness in clinical decision-making. Future prospective studies are required. In addition, integrating explainability into a model remains challenging. In general, deterministic models tend to align well with explainability. Incorporating explainability into a Bayesian model, which visualizes uncertainty, is an important future direction. Addressing this challenge could further enhance the clinical utility of the model and improve its practical adoption. Finally, the validity and interpretability of the uncertainty bounds provided by the Bayesian model remain unclear. To assess the accuracy of the PPD and address potential model misspecification, future research will require both prospective studies and simulation-based analyses. These analyses would examine the calibration of the Bayesian model under various scenarios, including cases involving misspecification. In addition, validation using external datasets will provide further insight into the robustness of uncertainty estimates in different contexts. These efforts are critical for ensuring the reliability and practical applicability of the Bayesian approach in clinical settings.

In conclusion, the Bayes-SHLNM model demonstrates a performance equivalent to that of the traditional FLR while providing individual PPDs. It demonstrates potential contributions to decision-making processes and suggests promising prospects for personalized precision medicine.

Methods

Setting and ethical approval

All methods were performed in accordance with the ethical guidelines for medical and health research involving human subjects. Informed consent was obtained from all patients. This retrospective cohort study was approved by the Institutional Review Board of the National Cancer Center (2016-496, 2017-077). The study was conducted in accordance with the principles of the Declaration of Helsinki.

Datasets

In the present study, we used a cohort previously reported by Yura et al.⁴⁷. This study involved a retrospective review of the clinical records of 593 patients diagnosed with advanced gastric cancer classified as stages T2–T4. These stages are based on the depth of tumor invasion into the stomach wall: T2 indicates invasion into the muscle layer, T3 indicates invasion into the connective tissue beneath the outer membrane, and T4 indicates invasion into the membrane itself or adjacent structures. All of the patients had tumors located in the upper third of the stomach and underwent total gastrectomy (complete removal of the stomach) combined with splenectomy (removal of the spleen) and extensive lymph node dissection called D2 dissection. D2 dissection involves the removal of all the regional lymph nodes around the stomach, including those near the major blood vessels supplying the stomach. Surgeries were performed between January 2000 and December 2012 at the National Cancer Center Hospital in Japan. Importantly, all patients underwent curative surgery (referred to as R0 resection), indicating that no visible or microscopic tumors remained after surgery. Resected specimens were examined and evaluated according to the Japanese Classification of Gastric Carcinoma^48,49.

Criteria for patient selection

The following criteria were used to select the study population:

Initial Pool: Patients who underwent gastrectomy with nodal dissection between January 2000 and December 2012

Inclusion Criteria:

Patients who underwent total gastrectomy with splenectomy for primary gastric cancer

Exclusion Criteria:

1. Patients diagnosed with pT0 or pT1 disease

2. Patients diagnosed with pStage IV disease (#16 LN metastasis, positive cytology)

3. Patients who underwent R1 or R2 resection
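The selection criteria above can be checked against the counts given in the study population selection process (798 patients with total gastrectomy and splenectomy; 35, 169, and 1 excluded). The snippet is only an arithmetic check; the variable names are illustrative.

```python
# Counts reported in the study population selection process.
total_gastrectomy_with_splenectomy = 798

excluded = {
    "pT0 or pT1": 35,
    "pStage IV": 169,
    "R1 or R2 resection": 1,
}

# Apply the three exclusion criteria in turn.
final_cohort = total_gastrectomy_with_splenectomy - sum(excluded.values())
print(final_cohort)  # 593
```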

Criteria for variable selection

The explanatory variables for the model were selected from a comprehensive list of items reported to be clinically relevant for predicting SHLN metastasis in gastric cancer. The selected variables encompassed a range of clinical and pathological factors that could be predicted by preoperative examination. Clinical data included age, sex, whether neoadjuvant chemotherapy was administered, and whether the tumor invaded the greater curvature. Pathological data were classified according to the Japanese Classification of Gastric Carcinoma, including tumor location, cross-sectional area, macroscopic type, tumor size, predominant histological type, secondary predominant histological type, third predominant histological type, and metastasis to regional lymph nodes (numbers 1–12a) (Fig. 7). Common types were individually categorized, whereas special types were collectively classified as “special (sp)” due to their low incidence^48,49.

**Fig. 7: Lymph node numbering system defined by the Japanese Classification of Gastric Carcinoma (15th edition).**

Software and the basic structure of the model development

We designed our model using Python 3.10 and PyMC 5.9.2⁵⁰. The outcome variable for SHLN metastasis was binary (0 for absence, 1 for presence). The explanatory variables were incorporated using logistic regression. We assumed that the outcome followed a Bernoulli distribution. We selected a non-informative prior for our Bayesian model and applied four prior distributions for regularization: normal, Student’s t, Laplace, and horseshoe priors⁵¹. Details of the horseshoe prior are provided in Supplementary Figures 2, 3. Additionally, for performance benchmarking, the FLR model was developed using the scikit-learn module and tuned with the hyperparameter optimization framework “Optuna” version 3.5.0^52,53.
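For orientation, one common way of writing a Bayesian logistic regression with a horseshoe prior is sketched below. This is the textbook form and not necessarily the exact parameterization used for Bayes-SHLNM, whose details are given in the Supplementary Figures. The symbols $x_{ij}$, $\beta_0$, $\beta_j$, $\lambda_j$, and $\tau$, the number of coefficients $J$, and the unit scales of the half-Cauchy distributions $\mathrm{C}^{+}$ are notation assumed here for illustration.

```latex
\begin{aligned}
y_i \mid p_i &\sim \mathrm{Bernoulli}(p_i), \\
\operatorname{logit}(p_i) &= \beta_0 + \sum_{j=1}^{J} \beta_j x_{ij}, \\
\beta_j \mid \lambda_j, \tau &\sim \mathcal{N}\!\left(0,\ \lambda_j^{2}\tau^{2}\right), \\
\lambda_j &\sim \mathrm{C}^{+}(0, 1), \qquad \tau \sim \mathrm{C}^{+}(0, 1).
\end{aligned}
```

Here $y_i$ is the binary SHLN metastasis outcome for patient $i$. The local scales $\lambda_j$ allow individual coefficients to escape shrinkage, while the global scale $\tau$ pulls all coefficients toward zero together.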

Selection and normalization of explanatory variables

Continuous variables, such as age, tumor size, and pathological T category, were standardized by Z-score normalization, which rescales each variable to a mean of 0 and a standard deviation of 1. Categorical variables were one-hot encoded to transform them into a format suitable for model input, and the resulting one-hot encoded variables were also standardized using Z-score normalization.
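The preprocessing described above can be sketched in a few lines of standard-library Python. The helper names `z_score` and `one_hot`, the toy values, and the use of the population standard deviation are assumptions made for illustration; the section does not state which form of the standard deviation was used.

```python
from statistics import mean, pstdev


def z_score(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sd = pstdev(values)  # assumption: population SD; the sample SD is also common
    if sd == 0:
        raise ValueError("cannot standardize a constant column")
    return [(v - mu) / sd for v in values]


def one_hot(labels):
    """Return (categories, columns) with one 0/1 indicator column per category."""
    categories = sorted(set(labels))
    columns = {c: [1.0 if lab == c else 0.0 for lab in labels] for c in categories}
    return categories, columns


# Hypothetical toy data, not taken from the study.
age = [52, 61, 70, 75, 67]
location = ["U", "M", "U", "L", "U"]

age_z = z_score(age)
cats, cols = one_hot(location)
# The one-hot indicator columns are standardized in the same way.
location_z = {c: z_score(cols[c]) for c in cats}
```

Standardizing the indicator columns puts every coefficient on a comparable scale, which matters when the same shrinkage prior is applied to all of them.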

Sampling and inferred posterior probability distribution

In this study, PyMC was used to estimate the PPD of the Bayesian model. We employed the No-U-Turn Sampler algorithm to perform Markov chain Monte Carlo (MCMC) sampling⁵⁴. A total of 5000 samples were collected, and the initial 2000 samples were discarded as burn-in so that the distribution was estimated only from samples drawn after convergence. Four independent chains were run to ensure sample diversity. The target acceptance rate was set to 0.99 to achieve efficient and accurate sampling.
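The bookkeeping of burn-in removal and chain pooling can be illustrated without a real sampler. In the sketch below, `fake_chain` is a hypothetical stand-in that draws Gaussian noise rather than running NUTS, and the reading that 5000 samples were collected per chain, including the 2000 burn-in samples, is an assumption; the text does not settle either point.

```python
import random

N_CHAINS = 4       # independent chains, as stated in the text
N_SAMPLES = 5000   # assumption: samples collected per chain, burn-in included
N_BURN_IN = 2000   # initial samples discarded from each chain


def fake_chain(seed, n):
    """Hypothetical stand-in for one MCMC chain of a single parameter."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


chains = [fake_chain(seed, N_SAMPLES) for seed in range(N_CHAINS)]

# Drop the burn-in segment from every chain before any summary is computed.
kept = [chain[N_BURN_IN:] for chain in chains]

# Pool the retained draws from all chains into one posterior sample.
pooled = [draw for chain in kept for draw in chain]
```

Under these assumptions, 3000 draws per chain remain, giving 12,000 pooled draws per parameter.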

Evaluation of the model’s performance using the internal cross-validation method

To compare model performance, we used a stratified 5fCV approach. This process involved creating five separate models, each tested on a different data subset. In the training datasets, the samples from all chains obtained by MCMC posterior sampling were combined for each case to form a PPD. The PPD means were taken as predictions and compared with the observed SHLNM⁵⁵. For decision-making, the thresholds were calculated using the following procedure:

1. From the PPD of the training datasets for each fold, the mean was extracted as the predicted probability for each case.

2. These predicted probabilities were used to construct an ROC curve, and the Youden index was applied to determine the optimal threshold for classification. This threshold was used as the decision boundary and is referred to as the Yi threshold.
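The two steps above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `youden_threshold` and the toy probabilities are hypothetical, and candidate thresholds are restricted to the observed probabilities, with a case called positive when its probability is strictly above the threshold.

```python
def youden_threshold(y_true, y_prob):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes are required")
    best_t, best_j = None, float("-inf")
    for t in sorted(set(y_prob)):
        # A case is predicted positive when its probability is above t.
        tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p > t)
        tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p <= t)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:  # ties keep the lowest threshold
            best_t, best_j = t, j
    return best_t, best_j


# Hypothetical posterior-mean probabilities for one training fold.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
y_prob = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.80]

threshold, j = youden_threshold(y_true, y_prob)
```

On this toy fold the selected threshold is 0.30, where all three positives and four of the five negatives are classified correctly (J = 0.8).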

During testing, the models predicted outcomes, where a positive result was indicated if the mean posterior probability was above this threshold. These predictions were then compared with the actual outcomes, with the performance assessed using metrics, such as the ROC AUC, PRAUC, sensitivity, specificity, precision, and F1 score. This process was repeated for each of the five created datasets, and the resulting average of each value was calculated to assess the overall effectiveness of our model.
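A minimal sketch of the threshold-based evaluation on one test fold is given below. The function names and toy data are hypothetical, the ROC AUC is computed with the rank-based (pairwise) formulation, and PRAUC is omitted for brevity; both classes are assumed to be present in the fold.

```python
def roc_auc(y_true, y_prob):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def evaluate(y_true, y_prob, threshold):
    """Classify as positive when the probability is above the threshold, then score."""
    y_pred = [1 if p > threshold else 0 for p in y_prob]
    tp = sum(1 for y, q in zip(y_true, y_pred) if y == 1 and q == 1)
    fp = sum(1 for y, q in zip(y_true, y_pred) if y == 0 and q == 1)
    tn = sum(1 for y, q in zip(y_true, y_pred) if y == 0 and q == 0)
    fn = sum(1 for y, q in zip(y_true, y_pred) if y == 1 and q == 0)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "roc_auc": roc_auc(y_true, y_prob),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical test fold; the threshold would come from the matching training fold.
y_true = [0, 0, 1, 1, 0, 1]
y_prob = [0.10, 0.40, 0.35, 0.80, 0.20, 0.25]
result = evaluate(y_true, y_prob, threshold=0.30)
```

Averaging each entry of `result` over the five folds gives the cross-validated summary described in the text.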

In addition to our Bayesian model, FLR models were developed for benchmarking. We applied the same model development, testing, and evaluation process to the FLR model as to our main model, so that the two models could be compared directly in how well they predicted SHLN metastasis.

Evaluation of the utility of posterior probability distribution

Uncertainty was assessed using individual PPDs, and the feasibility of the model for clinical implementation was examined. The 95% HDI range, mean, and median of each PPD were calculated, and the PPDs were plotted as kernel density estimates. Cases were visualized according to the presence or absence of GCI, which is an important indicator for clinical decision-making according to the JGCA guideline criteria². We present cases in which the model prediction itself could change the clinical decision, as well as cases in which the uncertainty of the prediction could aid decision-making.
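The per-case summaries can be computed directly from pooled posterior draws. The sketch below assumes a unimodal PPD and uses the common "narrowest interval containing 95% of the sorted draws" definition of the HDI; the function name `hdi` and the Beta-distributed toy draws are hypothetical and stand in for one patient's posterior samples.

```python
import math
import random
from statistics import mean, median


def hdi(samples, mass=0.95):
    """Narrowest interval containing `mass` of the samples (unimodal case)."""
    xs = sorted(samples)
    n = len(xs)
    span = int(math.floor(mass * n))
    if span < 1 or span >= n:
        raise ValueError("not enough samples for the requested mass")
    # Width of every candidate interval covering `span` steps of sorted draws.
    widths = [xs[i + span] - xs[i] for i in range(n - span)]
    start = widths.index(min(widths))
    return xs[start], xs[start + span]


# Hypothetical posterior draws of the metastasis probability for one case.
rng = random.Random(0)
ppd = [rng.betavariate(2, 8) for _ in range(12000)]

low, high = hdi(ppd, mass=0.95)
summary = {"mean": mean(ppd), "median": median(ppd), "hdi_95": (low, high)}
```

A wide `hdi_95` relative to the distance between the mean and the decision threshold flags a case whose classification is uncertain, which is the information a point estimate alone does not convey.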

Uncertainty evaluation of Bayesian regression coefficients

Using all the training cohorts, we developed a final Bayesian logistic regression model and examined the posterior distributions of the parameters using the 95% HDI. Coefficients whose 95% HDI did not include zero were defined as significant.

Data availability

The data used in this study are not publicly available due to patient privacy concerns but are available from the corresponding author upon reasonable request.

Code availability

Code is available upon request from the corresponding authors.

References

Sung, H. et al. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J. Clin. 71, 209–249, https://doi.org/10.3322/caac.21660 (2021).
Japanese Gastric Cancer Association. Japanese Gastric Cancer Treatment Guidelines 2021 (6th edition). Gastric Cancer 26, 1–25, https://doi.org/10.1007/s10120-022-01331-8 (2023).
Sasako, M. et al. D2 lymphadenectomy alone or with para-aortic nodal dissection for gastric cancer. N. Engl. J. Med. 359, 453–462, https://doi.org/10.1056/NEJMoa0707035 (2008).
Sasako, M. et al. Left thoracoabdominal approach versus abdominal-transhiatal approach for gastric cancer of the cardia or subcardia: a randomised controlled trial. Lancet Oncol. 7, 644–651, https://doi.org/10.1016/S1470-2045(06)70766-5 (2006).
Kurokawa, Y. et al. Bursectomy versus omentectomy alone for resectable gastric cancer (JCOG1001): a phase 3, open-label, randomised controlled trial. Lancet Gastroenterol. Hepatol. 3, 460–468, https://doi.org/10.1016/s2468-1253(18)30090-6 (2018).
Sano, T. et al. Randomized Controlled Trial to Evaluate Splenectomy in Total Gastrectomy for Proximal Gastric Carcinoma. Ann. Surg. 265, 277–283, https://doi.org/10.1097/sla.0000000000001814 (2017).
Cuschieri, A. et al. Postoperative morbidity and mortality after D1 and D2 resections for gastric cancer: preliminary results of the MRC randomised controlled surgical trial. Lancet 347, 995–999, https://doi.org/10.1016/s0140-6736(96)90144-0 (1996).
Sasada, S. et al. Frequency of lymph node metastasis to the splenic hilus and effect of splenectomy in proximal gastric cancer. Anticancer Res. 29, 3347–3351 (2009).
Kunisaki, C. et al. Impact of splenectomy in patients with gastric adenocarcinoma of the cardia. J. Gastrointest. Surg. 11, 1039–1044, https://doi.org/10.1007/s11605-007-0186-z (2007).
Zhu, G. L. et al. Splenic hilar lymph node metastasis independently predicts poor survival for patients with gastric cancers in the upper and/or the middle third of the stomach. J. Surg. Oncol. 105, 786–792, https://doi.org/10.1002/jso.22149 (2012).
Ishikawa, S. et al. Pattern of lymph node involvement in proximal gastric cancer. World J. Surg. 33, 1687–1692, https://doi.org/10.1007/s00268-009-0083-6 (2009).
Huang, C. M. et al. A 346 case analysis for laparoscopic spleen-preserving no.10 lymph node dissection for proximal gastric cancer: a single center study. PLoS One 9, e108480, https://doi.org/10.1371/journal.pone.0108480 (2014).
Shin, S. H. et al. Clinical significance of splenic hilar lymph node metastasis in proximal gastric cancer. Ann. Surg. Oncol. 16, 1304–1309, https://doi.org/10.1245/s10434-009-0389-5 (2009).
Nishino, M. et al. Possible candidates for splenic hilar nodal dissection among patients with upper advanced gastric cancer without invasion of the greater curvature. Gastric Cancer 26, 460–466, https://doi.org/10.1007/s10120-023-01370-9 (2023).
Galizia, G. et al. Modified versus standard D2 lymphadenectomy in total gastrectomy for nonjunctional gastric carcinoma with lymph node metastasis. Surgery 157, 285–296, https://doi.org/10.1016/j.surg.2014.09.012 (2015).
Csendes, A. et al. A prospective randomized study comparing D2 total gastrectomy versus D2 total gastrectomy plus splenectomy in 187 patients with gastric carcinoma. Surgery 131, 401–407, https://doi.org/10.1067/msy.2002.121891 (2002).
Bonenkamp, J. J. et al. Randomised comparison of morbidity after D1 and D2 dissection for gastric cancer in 996 Dutch patients. Lancet 345, 745–748, https://doi.org/10.1016/s0140-6736(95)90637-1 (1995).
Kodera, Y. et al. Identification of risk factors for the development of complications following extended and superextended lymphadenectomies for gastric cancer. Br. J. Surg. 92, 1103–1109, https://doi.org/10.1002/bjs.4979 (2005).
Otsuji, E., Yamaguchi, T., Sawai, K., Ohara, M. & Takahashi, T. End results of simultaneous splenectomy in patients undergoing total gastrectomy for gastric carcinoma. Surgery 120, 40–44, https://doi.org/10.1016/s0039-6060(96)80239-x (1996).
Kinoshita, T. et al. Laparoscopic splenic hilar lymph node dissection for proximal gastric cancer using integrated three-dimensional anatomic simulation software. Surg. Endosc. 30, 2613–2619, https://doi.org/10.1007/s00464-015-4511-4 (2016).
Kinoshita, T. & Okayama, T. Is splenic hilar lymph node dissection necessary for proximal gastric cancer surgery? Ann. Gastroenterol. Surg. 5, 173–182, https://doi.org/10.1002/ags3.12413 (2021).
Feng, M. A. et al. Geriatric assessment in surgical oncology: a systematic review. J. Surg. Res 193, 265–272, https://doi.org/10.1016/j.jss.2014.07.004 (2015).
Huisman, M. G., Kok, M., de Bock, G. H. & van Leeuwen, B. L. Delivering tailored surgery to older cancer patients: Preoperative geriatric assessment domains and screening tools – A systematic review of systematic reviews. Eur. J. Surgical Oncol. (EJSO) 43, 1–14, https://doi.org/10.1016/j.ejso.2016.06.003 (2017).
Puts, M. T. et al. An update on a systematic review of the use of geriatric assessment for older adults in oncology. Ann. Oncol. 25, 307–315, https://doi.org/10.1093/annonc/mdt386 (2014).
Hamamoto, R. et al. Application of Artificial Intelligence Technology in Oncology: Towards the Establishment of Precision Medicine. Cancers (Basel) 12, 3532, https://doi.org/10.3390/cancers12123532 (2020).
Yamada, M. et al. Development of a real-time endoscopic image diagnosis support system using deep learning technology in colonoscopy. Sci. Rep. 9, 14465, https://doi.org/10.1038/s41598-019-50567-5 (2019).
Jinnai, S. et al. The Development of a Skin Cancer Classification System for Pigmented Skin Lesions Using Deep Learning. Biomolecules 10, 1123, https://doi.org/10.3390/biom10081123 (2020).
Hamamoto, R. et al. Introducing AI to the molecular tumor board: one direction toward the establishment of precision medicine using large-scale cancer clinical and biological information. Exp. Hematol. Oncol. 11, 82, https://doi.org/10.1186/s40164-022-00333-7 (2022).
Asada, K. et al. Uncovering Prognosis-Related Genes and Pathways by Multi-Omics Analysis in Lung Cancer. Biomolecules 10, 524, https://doi.org/10.3390/biom10040524 (2020).
Kobayashi, K., Miyake, M., Takahashi, M. & Hamamoto, R. Observing deep radiomics for the classification of glioma grades. Sci. Rep. 11, 10942, https://doi.org/10.1038/s41598-021-90555-2 (2021).
Asada, K. et al. Integrated Analysis of Whole Genome and Epigenome Data Using Machine Learning Technology: Toward the Establishment of Precision Oncology. Front Oncol 11, 666937, https://doi.org/10.3389/fonc.2021.666937 (2021).
Takahashi, S. et al. A New Era of Neuro-Oncology Research Pioneered by Multi-Omics Analysis and Machine Learning. Biomolecules 11, 565, https://doi.org/10.3390/biom11040565 (2021).
Kawaguchi, R. K. et al. Assessing Versatile Machine Learning Models for Glioma Radiogenomic Studies across Hospitals. Cancers (Basel) 13, 3611, https://doi.org/10.3390/cancers13143611 (2021).
Hayashi, T. et al. A discrimination model by machine learning to avoid gastrectomy for early gastric cancer. Ann. Gastroenterological Surg. 7, 913–921, https://doi.org/10.1002/ags3.12714 (2023).
Zhu, H. et al. Preoperative prediction for lymph node metastasis in early gastric cancer by interpretable machine learning models: A multicenter study. Surgery 171, 1543–1551, https://doi.org/10.1016/j.surg.2021.12.015 (2022).
Lee, H. D. et al. Development and Validation of Models to Predict Lymph Node Metastasis in Early Gastric Cancer Using Logistic Regression and Gradient Boosting Machine Methods. Cancer Res Treat. 55, 1240–1249, https://doi.org/10.4143/crt.2022.1330 (2023).
Zhang, A. Q. et al. Computed tomography-based deep-learning prediction of lymph node metastasis risk in locally advanced gastric cancer. Front Oncol 12, 969707, https://doi.org/10.3389/fonc.2022.969707 (2022).
Dong, D. et al. Deep learning radiomic nomogram can predict the number of lymph node metastasis in locally advanced gastric cancer: an international multicenter study. Annals of Oncology 31, 912–920, https://doi.org/10.1016/j.annonc.2020.04.003 (2020).
Lu, T. et al. Comparison of Machine Learning and Logic Regression Algorithms for Predicting Lymph Node Metastasis in Patients with Gastric Cancer: A two-Center Study. Technology in Cancer Research & Treatment 23, https://doi.org/10.1177/15330338231222331 (2024).
HajiEsmailPoor, Z., Tabnak, P., Baradaran, B., Pashazadeh, F. & Aghebati-Maleki, L. Diagnostic performance of CT scan–based radiomics for prediction of lymph node metastasis in gastric cancer: a systematic review and meta-analysis. Frontiers in Oncology 13, https://doi.org/10.3389/fonc.2023.1185663 (2023).
Giovagnoli, A. The Bayesian Design of Adaptive Clinical Trials. Int J Environ Res Public Health 18, https://doi.org/10.3390/ijerph18020530 (2021).
Ashby, D. Bayesian statistics in medicine: a 25 year review. Stat. Med 25, 3589–3631, https://doi.org/10.1002/sim.2672 (2006).
Troiani, J. S. & Carlin, B. P. Comparison of Bayesian, classical, and heuristic approaches in identifying acute disease events in lung transplant recipients. Stat. Med 23, 803–824, https://doi.org/10.1002/sim.1651 (2004).
Fanconi, C., de Hond, A., Peterson, D., Capodici, A. & Hernandez-Boussard, T. A Bayesian approach to predictive uncertainty in chemotherapy patients at risk of acute care utilization. EBioMedicine 92, 104632, https://doi.org/10.1016/j.ebiom.2023.104632 (2023).
Li, P. et al. Laparoscopic spleen-preserving splenic hilar lymphadenectomy in 108 consecutive patients with upper gastric cancer. World J. Gastroenterol. 20, 11376–11383, https://doi.org/10.3748/wjg.v20.i32.11376 (2014).
Aoyagi, K. et al. Prognosis of metastatic splenic hilum lymph node in patients with gastric cancer after total gastrectomy and splenectomy. World J. Hepatol. 2, 81–86, https://doi.org/10.4254/wjh.v2.i2.81 (2010).
Yura, M. et al. The Therapeutic Survival Benefit of Splenic Hilar Nodal Dissection for Advanced Proximal Gastric Cancer Invading the Greater Curvature. Ann. Surgical Oncol. 26, 829–835, https://doi.org/10.1245/s10434-018-07122-9 (2018).
Japanese Gastric Cancer Association. Japanese classification of gastric carcinoma: 3rd English edition. Gastric Cancer 14, 101–112, https://doi.org/10.1007/s10120-011-0041-5 (2011).
Nakamura, T. et al. History of the lymph node numbering system in the Japanese Classification of Gastric Carcinoma since 1962. Surg. Today 52, 1515–1523, https://doi.org/10.1007/s00595-021-02395-2 (2021).
Abril-Pla, O. et al. PyMC: a modern, and comprehensive probabilistic programming framework in Python. PeerJ Computer Sci. 9, e1516, https://doi.org/10.7717/peerj-cs.1516 (2023).
Carvalho, C. M., Polson, N. G. & Scott, J. G. Handling sparsity via the horseshoe. Artificial intelligence and statistics, 73–80 (2009).
Akiba, T., Sano, S., Yanase, T., Ohta, T. & Koyama, M. Optuna: A Next-generation Hyperparameter Optimization Framework. 2623–2631, https://doi.org/10.1145/3292500.3330701 (2019).
Pedregosa, F. et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Hoffman, M. D. & Gelman, A. The No-U-Turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. J. Mach. Learn. Res. 15, 1593–1623 (2014).
Youden, W. J. Index for rating diagnostic tests. Cancer 3, 32–35 (1950).
Japanese Gastric Cancer Association. Japanese Classification of Gastric Carcinoma - 2nd English Edition. Gastric Cancer 1, 10–24, https://doi.org/10.1007/s101209800016 (1998).

Acknowledgements

We thank all members of R. Hamamoto’s laboratory for providing valuable advice and a comfortable environment.

Author information

Authors and Affiliations

Division of Medical AI Research and Development, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo, 104-0045, Japan
Kenichi Ishizu, Satoshi Takahashi, Nobuji Kouno, Ken Takasawa, Katsuji Takeda & Ryuji Hamamoto
Department of Gastric Surgery, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo, 104-0045, Japan
Kenichi Ishizu, Masashi Nishino, Tsutomu Hayashi, Yukinori Yamagata & Takaki Yoshikawa
Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo, 103-0027, Japan
Satoshi Takahashi, Nobuji Kouno, Ken Takasawa, Katsuji Takeda & Ryuji Hamamoto
Department of Biostatistics, Nagoya University Graduate School of Medicine, 65 Tsurumai-cho, Showa-ku, Nagoya, Japan
Kota Matsui & Shigeyuki Matsui

Contributions

K.I. was responsible for the conceptualization, methodology, investigation, data curation, formal analysis, and writing (original draft). S.T. was responsible for the conceptualization, methodology, formal analysis, and writing (review and editing). N.K. was responsible for the methodology and writing (review and editing). K. Takasawa was responsible for the conceptualization, methodology, formal analysis, and writing (review and editing). K. Takeda was responsible for the methodology, formal analysis, and writing (review and editing). K.M. was responsible for the methodology and writing (review and editing). M.N. was responsible for the methodology, data curation, and writing (review and editing). T.H. was responsible for the methodology, data curation, and writing (review and editing). Y.Y. was responsible for the methodology, data curation, and writing (review and editing). S.M. was responsible for the methodology and writing (review and editing). T.Y. was responsible for the methodology, data curation, and writing (review and editing). R.H. was responsible for the funding acquisition, conceptualization, methodology, and writing (original draft). All authors confirm that they have full access to all data in the study and accept responsibility for the submission for publication. All authors have read and approved the final version of this manuscript.

Corresponding authors

Correspondence to Satoshi Takahashi or Ryuji Hamamoto.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information_npj Digital Medicine_rev_Final_2

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Ishizu, K., Takahashi, S., Kouno, N. et al. Establishment of a machine learning model for predicting splenic hilar lymph node metastasis. npj Digit. Med. 8, 93 (2025). https://doi.org/10.1038/s41746-025-01480-x

Received: 29 October 2024
Accepted: 25 January 2025
Published: 11 February 2025
Version of record: 11 February 2025
DOI: https://doi.org/10.1038/s41746-025-01480-x
