Introduction

Colonoscopy is the gold standard procedure for colorectal cancer screening and diagnosis, and high-quality colonoscopy is the key factor for improving the detection rate of early colorectal cancer and precancerous lesions1. Studies have shown that approximately one quarter of colonoscopies have inadequate bowel preparation2. Inadequate bowel preparation can prolong the colonoscopy time, increase the occurrence of bleeding or perforation during the examination, and reduce the adenoma detection rate3,4. Many studies have assessed predictive factors of inadequate bowel cleansing; however, their findings are not consistent5,6. Several studies have assessed predictive models of inadequate bowel preparation; however, the conclusions regarding these models are limited because of small sample sizes, a lack of validation, the inclusion of a limited number of potential predictors, and the lack of a validated scale of bowel cleansing7,8. Moreover, some studies have focused on hospitalized patients9, and a growing number of scholars believe that outpatients and inpatients require separate, population-specific prediction models9,10.

A recent study using external validation of prediction models showed that adoption of a validated and easy-to-use predictive model is an efficacious intervention that could indeed assist clinicians in promptly identifying patients who are at a high risk of inadequate bowel preparation11. However, a more recent study of external validation and application of the prediction models showed that they are inadequate in predicting intestinal preparation failure, thereby requiring the construction of new prediction models or optimization of existing models12. Thus, the aim of this study was to identify the factors that are associated with inadequate bowel preparation in Chinese outpatients, in order to derive and validate a predictive model that is suitable for clinical practice. The discrimination and calibration of the predictive model were evaluated via internal and external validation, and the clinical utility of the model was evaluated by drawing a clinical decision curve.

Materials and methods

Study design and population

This was a single-center, prospective, observational study. Consecutive patients with indications for colonoscopy were prospectively enrolled between December 15, 2022 and August 12, 2023. The center used a 4 L polyethylene glycol (PEG) bowel preparation regimen for all patients. Adult outpatients who were scheduled for colonoscopy for any indication were considered eligible. Patients who underwent emergency or elective therapeutic colonoscopies, as well as patients who did not undergo bowel preparation with PEG, were excluded from the study. Patient-related factors and bowel preparation-related factors were prospectively collected for quality auditing purposes. Diabetes mellitus, chronic constipation, medication use, and previous colorectal surgery history were obtained from the patient electronic medical record system and patient self-reports. Body mass index (BMI) and waist circumference were obtained from clinical measurements. Other variables were obtained from patient self-reports. All of the patients were outpatients undergoing elective colonoscopy who enrolled voluntarily and were able to withdraw from the study at any time. The study was approved by the Research Ethics Committee of The First Affiliated Hospital of Zhejiang University (No. 2022-940). Informed consent was obtained from all of the participants and/or their legal guardians. In addition, the study was performed in accordance with the ethical standards of the 1964 Declaration of Helsinki and its later amendments.

Procedures performed before colonoscopy

Members of the research team obtained written informed consent from all of the participants, explained the purpose of the study, and administered a questionnaire regarding predictors of inadequate bowel preparation, which included the following variables: age, sex, BMI, waist circumference, diabetes, smoking status (current or former smoker), medication use (antidepressants, opioids, and anxiolytic drugs), abdominal symptoms (abdominal pain or abdominal distension) experienced before the colonoscopy, educational level (elementary school, middle school, high school, or university), chronic constipation status (<3 bowel movements/week and at least one of the following: straining, hard stools defined as Bristol scale values of 1 or 2, and/or incomplete evacuation), personal history of colonoscopy (received an initial colonoscopy or not), history of colorectal surgery, and history of previous inadequate preparation. All of the participants received oral and written instructions that included a structured low-fiber diet for the day before the procedure. Patients undergoing morning colonoscopy received a split-dose regimen, and patients undergoing afternoon colonoscopy received a same-day regimen13. All of the participants were advised to adhere to a low-residue diet or clear liquids for 24 h before the procedure. Abdominal obesity was defined as a waist circumference of at least 90 cm in men and at least 85 cm in women, as defined for the Chinese population.
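The two derived questionnaire variables above (chronic constipation and abdominal obesity) can be expressed as simple rules. The following is a minimal sketch, not the study's own code: the class and function names are hypothetical, and the waist cut-offs are assumed to be inclusive (90 cm or more in men, 85 cm or more in women).

```python
from dataclasses import dataclass


@dataclass
class BowelHabits:
    """Hypothetical container for the self-reported constipation items."""
    movements_per_week: float
    straining: bool
    bristol_type: int  # Bristol stool scale value, 1 to 7
    incomplete_evacuation: bool


def has_chronic_constipation(habits: BowelHabits) -> bool:
    """<3 bowel movements/week plus at least one accompanying symptom."""
    hard_stools = habits.bristol_type in (1, 2)
    any_symptom = habits.straining or hard_stools or habits.incomplete_evacuation
    return habits.movements_per_week < 3 and any_symptom


def has_abdominal_obesity(waist_cm: float, is_male: bool) -> bool:
    """Waist cut-offs of 90 cm (men) and 85 cm (women); inclusivity is assumed."""
    cutoff_cm = 90.0 if is_male else 85.0
    return waist_cm >= cutoff_cm
```

Note that infrequent bowel movements alone do not satisfy the definition; at least one of the three accompanying symptoms must also be present.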

Procedures performed on the same day as the colonoscopy

Morning colonoscopies were scheduled from 08:00 to 13:30, and afternoon colonoscopies were scheduled from 13:30 to 17:30. Water-pump use was allowed during endoscopy. Before the colonoscopy, the participants completed another questionnaire that was administered by the members of the research team with respect to dietary habits before the procedure and bowel preparation regimen, the end of the bowel preparation, ingested volume of bowel preparation, bowel preparation tolerance (such as nausea, vomiting, abdominal pain, and abdominal distension), and willingness to take the same preparation in the future. The time interval ranging from the end of the bowel preparation to the start of the colonoscopy (PC) was calculated after the colonoscopy.
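The PC interval is a plain time difference, so it can be computed directly from two timestamps. The sketch below is illustrative only: the function names are hypothetical, and the 5 h cut-off is the one applied to PC in the Results.

```python
from datetime import datetime

PC_THRESHOLD_HOURS = 5.0  # cut-off used for the PC variable in the Results


def pc_hours(prep_end: datetime, colonoscopy_start: datetime) -> float:
    """Hours from the end of bowel preparation to the start of colonoscopy."""
    seconds = (colonoscopy_start - prep_end).total_seconds()
    if seconds < 0:
        raise ValueError("colonoscopy cannot start before the preparation ends")
    return seconds / 3600.0


def pc_exceeds_threshold(prep_end: datetime, colonoscopy_start: datetime) -> bool:
    """True when the PC interval is longer than 5 h."""
    return pc_hours(prep_end, colonoscopy_start) > PC_THRESHOLD_HOURS
```

Using full timestamps rather than clock times keeps the calculation correct when the preparation ends on the evening before the examination.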

Study outcomes

The primary outcome of this study was to identify the factors that are associated with inadequate bowel preparation among outpatients. Inadequate bowel preparation was defined as a Boston Bowel Preparation Scale (BBPS) score < 6 or BBPS < 2 for any colon segment14. The secondary outcome was the derivation and validation of a predictive model for inadequate bowel preparation.
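The outcome definition above can be written as a short rule. This is a minimal sketch with a hypothetical function name; it assumes the standard BBPS layout of three segments (right, transverse, and left colon), each scored from 0 to 3.

```python
def is_inadequate_preparation(right: int, transverse: int, left: int) -> bool:
    """Inadequate if the total BBPS is < 6 or any single segment scores < 2."""
    segments = (right, transverse, left)
    if any(score not in (0, 1, 2, 3) for score in segments):
        raise ValueError("each BBPS segment score must be an integer from 0 to 3")
    return sum(segments) < 6 or min(segments) < 2
```

For example, segment scores of 3, 3, and 1 give a total of 7, yet the preparation still counts as inadequate because one segment scores below 2.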

Statistical analyses and predictive model derivation

We specified the type I error and the type II error as 0.05 and 0.20, respectively; therefore, the power of this study was 0.80. The sample distribution was binomial. Additionally, we calculated a sample size of at least 977 patients overall via G*Power version 3.1.9.2 by using the methods of Faul et al.15 and Demidenko16.

Continuous variables are reported as means and standard deviations, and categorical variables are expressed as frequencies and percentages. Comparisons of variables were performed by using the Chi square test or Fisher’s exact test. Logistic regression analysis was performed with bowel preparation quality (defined as BBPS < 2 in each segment) as the dependent variable and the variables with a P value of < 0.05 in the univariate analysis as the independent variables. Results of the multivariate logistic regression analyses were expressed as regression coefficients, odds ratios (ORs) and 95% confidence intervals (CIs).

All of the patient- and bowel preparation-related variables that were statistically significantly associated with inadequate bowel preparation were included in a multivariate binary logistic regression analysis with backward stepwise elimination based on likelihood ratio statistics. The regression coefficients of the remaining predictive factors were used to assign integer points for the prediction score. The discrimination of the prediction score was assessed by calculating the area under the curve (AUC) of the receiver operating characteristic (ROC) curves in both cohorts. The calibration of the predictive model was assessed by computing the related goodness-of-fit test. The predicted score of inadequate bowel preparation for different scenarios that were defined according to the presence or absence of patient-related and preparation-related factors was determined in the validation cohort. The sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were used to evaluate the performance of the prediction model in both cohorts. The optimal cut-off point was determined in the validation cohort. A clinical decision curve was drawn to evaluate the clinical utility of the predictive model in both cohorts. All of the statistical analyses were conducted with IBM SPSS Statistics 25.0 and R 4.3.2 software. Two-sided P values < 0.05 were considered to be statistically significant.
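Two quantities behind these analyses have simple closed forms: Youden's index, which is used later to choose the score threshold, and the net benefit plotted in a clinical decision curve. The sketch below uses the standard textbook definitions; the function names are hypothetical, and this is not the SPSS or R code used in the study.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def net_benefit(true_pos: int, false_pos: int, n_patients: int,
                threshold_probability: float) -> float:
    """Net benefit at one threshold probability p_t of a decision curve.

    NB = TP/N - (FP/N) * p_t / (1 - p_t)
    """
    if not 0.0 < threshold_probability < 1.0:
        raise ValueError("threshold probability must lie strictly between 0 and 1")
    odds = threshold_probability / (1.0 - threshold_probability)
    return true_pos / n_patients - (false_pos / n_patients) * odds
```

A decision curve repeats the net benefit calculation over a range of threshold probabilities and compares the model with the two reference strategies of treating every patient as high risk and treating none as high risk (the latter has a net benefit of 0).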

Results

Overall study population

Overall, from December 15, 2022 to August 12, 2023, 1400 patients were enrolled in the study. Of these, 1035 outpatients who were treated from December 15, 2022 to May 31, 2023 served as the derivation cohort, and 279 patients who were treated from June 1 to August 12, 2023 served as the validation cohort (see Fig. 1 for subject inclusion and exclusion procedures). Among all of the included variable data, the overall missing data rate was 1.5%. The mean PC time was 4.99 ± 1.20 h and 4.58 ± 0.94 h in the derivation cohort and validation cohort, respectively. Moreover, the number of participants reporting abdominal pain/distension was 338 (32.66%) and 51 (18.28%) in the derivation cohort and validation cohort, respectively. Furthermore, the number of patients reporting a high-fiber diet was 89 (8.60%) and 40 (14.34%); the number of patients with abdominal obesity was 224 (21.64%) and 119 (42.65%); and the number of patients who complied with the intake regimen was 942 (91.01%) and 273 (97.85%) in the derivation and validation cohorts, respectively. These characteristics were statistically different between the two cohorts (p < 0.05). Detailed characteristics of the included patients can be found in Table 1.

Fig. 1 Study flowchart.

Table 1. Patient-related and Bowel preparation-related factors in the Overall Study Population, Derivation and Validation Cohort (N=1314 Patients).

Univariate analyses

Inadequate bowel preparation was reported in 260 patients (25.1%). Results of univariate analyses in the derivation cohort are shown in Table 2. The univariate factors associated with inadequate bowel preparation included age, male sex, smoking, chronic constipation, diabetes mellitus, previous history of colorectal surgery, history of inadequate bowel preparation, abdominal obesity, high fiber diet, no compliance to the intake regimen, and PC > 5 h. The ingested volume of the preparation was also associated with inadequate bowel preparation; however, because bowel preparation type and dose are inconsistent across institutions, it was not included in the multivariate analyses.

Table 2 Patient-related and Bowel preparation-related factors for inadequate bowel preparation via univariate logistic regression analysis in the derivation cohort (N = 1035).

1), T; 2), χ2. BP, bowel preparation.

Multivariate analyses

Because multiple systematic reviews have reported a significant impact of certain medications on bowel preparation5,17, medication use was included in the multivariate analysis even though it was not statistically significant in the univariate analysis. After multivariate logistic regression analysis with backward stepwise elimination, male sex (OR = 1.690, 95% CI: 1.242–2.300), chronic constipation (OR = 2.375, 95% CI: 1.560–3.617), diabetes mellitus (OR = 1.769, 95% CI: 1.059–2.954), previous history of colorectal surgery (OR = 2.915, 95% CI: 1.455–5.840), high fiber diet (OR = 2.662, 95% CI: 1.636–4.334), and PC > 5 h (OR = 2.471, 95% CI: 1.814–3.366) demonstrated an association with inadequate bowel preparation and were included in the prediction model. The results of the multivariate analyses in the derivation cohort are shown in Table 3.

Table 3 Multivariate analyses for patient-related and bowel preparation-related factors associated with inadequate bowel preparation in the derivation cohort.

Table 4 Independent predictive factors for inadequate bowel preparation and derivation of a prediction score.

Predictive model derivation

Male sex, which had the smallest regression coefficient, was assigned a score of 2. The other predictive factors were assigned scores based on their regression coefficients, as displayed in Table 4. The total prediction score ranged from 0 to 18, and higher scores were associated with an increased risk of inadequate bowel preparation in both cohorts (Fig. 2). The ROC curve of the derivation cohort is shown in Fig. 3. The prediction score had an AUC of 0.704 (95% CI: 0.667–0.741). Additionally, the model was well calibrated (P = 0.632) in the derivation cohort. At the maximum value of Youden’s index, the risk threshold was 4.5 points (Fig. 4). All of the factors were evaluable in 1035 patients, and a predictive score ≥ 4.5 was observed in 287 patients (28%) (Table 5). We also used a clinical decision curve to evaluate the clinical utility of the model. Across threshold probabilities from 0.10 to 0.70, the patient net benefit of the model was higher than that of the two extreme reference lines in the figure (treat all and treat none), indicating that the prediction model has clinical value in this range (Fig. 5). To avoid optimistic results due to overfitting, the model was internally validated by bootstrap resampling with B = 1000 repetitions. The resulting calibration curve (red curve) showed that the predicted probability of inadequate bowel preparation in outpatient colonoscopy was consistent with the observed probability, with a mean absolute error of 0.01 (Fig. 6).
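The scoring step can be illustrated with the odds ratios reported in the multivariate analysis. The sketch below rests on stated assumptions: the regression coefficients are recovered as the natural logarithm of the rounded ORs rather than taken from Table 4, and the rounding rule (2 points for the smallest coefficient, the others scaled proportionally and rounded to the nearest integer) is a guess at the authors' procedure. The dictionary keys and function names are hypothetical. Under this assumed rule the six scores sum to 18, which matches the reported score range.

```python
import math

# Odds ratios reported for the six retained predictors.
ODDS_RATIOS = {
    "male_sex": 1.690,
    "chronic_constipation": 2.375,
    "diabetes_mellitus": 1.769,
    "colorectal_surgery": 2.915,
    "high_fiber_diet": 2.662,
    "pc_over_5h": 2.471,
}


def derive_points(odds_ratios: dict, base_points: int = 2) -> dict:
    """Assumed rule: smallest coefficient gets base_points, others scale with it."""
    betas = {name: math.log(value) for name, value in odds_ratios.items()}
    smallest = min(betas.values())
    return {name: round(base_points * beta / smallest) for name, beta in betas.items()}


POINTS = derive_points(ODDS_RATIOS)


def prediction_score(present_factors: list) -> int:
    """Sum the points of the risk factors that are present in one patient."""
    return sum(POINTS[name] for name in present_factors)


def is_high_risk(score: float, threshold: float = 4.5) -> bool:
    """Threshold of 4.5 points, chosen in the paper by Youden's index."""
    return score >= threshold
```

Because the score is an integer, the 4.5-point threshold is equivalent to a score of 5 or more: under the assumed points, male sex alone stays below the threshold, whereas male sex combined with any other factor crosses it.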

Fig. 2 Frequencies of patients with inadequate bowel preparation for different prediction scores.

Fig. 3 Receiver operating characteristic curves of the prediction model in the derivation cohort. AUC, area under the curve.

Fig. 4 Predictive score, in accordance with the value of Youden’s index.

Table 5 Contingency table between number of patients with adequate or inadequate bowel preparation and predictive score in the derivation cohort.

Fig. 5 Decision curve analysis of the predictive model in the derivation cohort.

Fig. 6 Calibration belt of the predictive model in the derivation cohort.

Predictive model validation

A total of 279 patients (44% male; mean age: 49.63 ± 14.73 years) who underwent colonoscopy procedures between June and August 2023 were included in the validation cohort (Table 1). Adequate bowel preparation was reported in 209 patients. All of the factors were evaluable in 279 patients, and a predictive score ≥ 4.5 was observed in 53 patients (19%) (Table 6). The ROC curve of the validation cohort is shown in Fig. 7. The prediction score had an AUC of 0.704 (95% CI: 0.628–0.779) in the validation cohort. The predictive model was well calibrated (P = 0.376) and had a fair discriminative performance. In the validation cohort, a prediction score of ≥ 4.5 resulted in a sensitivity of 43%, a specificity of 89%, a PPV of 57%, an NPV of 82%, and a correctly classified rate of 77%. In the clinical decision curve, across threshold probabilities from 0.15 to 0.75, the patient net benefit of the model was higher than that of the two extreme reference lines in the figure (treat all and treat none), indicating that the prediction model has clinical value in this range (Fig. 8).
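The performance measures above all follow from a single 2 × 2 table. The sketch below is illustrative: Table 6 is not reproduced in this text, so the four cell counts are reconstructed from the reported totals (279 patients, 209 with adequate preparation, 53 with a score ≥ 4.5) and the reported percentages, and should be treated as an assumption. The function name is hypothetical.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard diagnostic accuracy measures from a 2 x 2 contingency table."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / total,
    }


# Assumed counts, reconstructed to be consistent with the reported figures:
# 70 inadequate preparations (tp + fn) and 53 patients with a score >= 4.5 (tp + fp).
VALIDATION_COUNTS = {"tp": 30, "fp": 23, "fn": 40, "tn": 186}

metrics = diagnostic_metrics(**VALIDATION_COUNTS)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

With these counts the low sensitivity and high specificity are easy to see: most patients flagged by the score do have inadequate preparation, but more than half of the inadequate preparations occur in patients below the threshold.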

Fig. 7 Receiver operating characteristic curves of the prediction model in the validation cohort.

Table 6 Contingency table between number of patients with adequate or inadequate bowel preparation and predictive score in the validation cohort.

Fig. 8 Decision curve analysis of the predictive model in the validation cohort.

Discussion

In this prospective observational study, 25.1% of the colonoscopies had inadequate bowel preparation, which is consistent with previous studies reporting rates ranging from 19 to 35%7,9,18,19,20. The results suggest that clinical healthcare professionals should devote increased attention to the persistent problem of inadequate bowel preparation. The prediction model that was constructed in this study ultimately included six independent factors. Among them, male sex, chronic constipation, diabetes, and previous history of colorectal surgery are variables related to patient clinical characteristics and are also known as unmodifiable risk factors21. Diabetes and constipation are currently well-known risk factors for inadequate bowel preparation, with hyperglycemia and autonomic nervous disorders postulated to play the key role in diabetes5,17,22, while constipation could result from decreased colonic transit23. A high fiber diet in the 24 h before examination and a PC time > 5 h are factors related to the bowel preparation itself and are also known as modifiable risk factors. Studies have shown that the quality of colonoscopy is inversely associated with the PC time24. The European Society of Gastrointestinal Endoscopy (ESGE) recommends a low fiber diet on the day preceding colonoscopy and a PC time within 5 h13. The model constructed in this study combines patient characteristics with bowel preparation-related factors and thereby emphasizes the importance of patients’ compliance with bowel preparation instructions. Some scholars have proposed that the combination of bowel preparation-related variables and patient-related clinical characteristics can provide a basis for the personalized bowel preparation of patients3. A recent meta-analysis also showed that factors associated with adherence during bowel preparation have an important role in predicting the occurrence of bowel preparation failure, and prediction models need to consider factors that are associated with adherence25.

To our knowledge, the AUCs of current prediction models7,8,18,19,20,26 for inadequate bowel preparation range from 0.62 to 0.74. Some models were tested via internal validation within a random split context8,27 or via bootstrap cross-validation18,28; however, these studies tested their models without the necessary external validation7,8,18,19,27,28. External validation focuses on the performance of the model on new datasets, which helps to evaluate the generalization ability of the model. Additionally, according to the requirement of events per variable (EPV) ≥ 20 for prediction models29, the sample sizes of several studies were deemed to be insufficient8,18,20,26,28,30. Moreover, more than half of the studies had an EPV of 10 to 20, and some studies20,28,30 even had an EPV < 10, which may have caused overfitting. Other studies have used unvalidated scales to assess bowel preparation quality, which limits model reliability to some extent7. In addition, clinical decision curves have been used to evaluate the utility of predictive models in the clinical setting31; however, only one previous study has drawn a clinical decision curve30.
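The events-per-variable criterion discussed above is a simple ratio and can be checked as follows. This is a minimal sketch with a hypothetical function name. The event count (260 inadequate preparations in the derivation cohort) comes from the Results; the count of 12 candidate variables (the 11 significant univariate factors plus medication use) is inferred from the text and should be read as an assumption.

```python
def events_per_variable(n_events: int, n_candidate_variables: int) -> float:
    """EPV = number of outcome events / number of candidate predictor variables."""
    if n_candidate_variables <= 0:
        raise ValueError("at least one candidate variable is required")
    return n_events / n_candidate_variables


# 260 events from the derivation cohort; 12 candidate variables is an assumed count.
derivation_epv = events_per_variable(n_events=260, n_candidate_variables=12)
meets_requirement = derivation_epv >= 20
```

The denominator should count every variable considered for the model, not only the predictors retained after stepwise elimination, because the selection step itself contributes to overfitting.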

This model can be incorporated into electronic health records and can serve as a scoring system and decision-support tool for clinicians when they order a colonoscopy for patients in the outpatient setting. The risk prediction model of inadequate bowel preparation that was constructed in this study can provide a reference for clinicians in giving patients personalized bowel preparation instructions. This model indicates an increased risk of inadequate bowel preparation when the prediction score is above 4.5 points. Clinical providers can target modifiable factors, such as adherence to a low fiber diet, to reduce the occurrence of inadequate bowel preparation. Studies have shown that patients are more likely to follow the corresponding instructions when they are predicted to be at an increased risk of inadequate bowel preparation32. Clinical providers can implement intensive education initiatives for high-risk patients, including detailed dietary guidance such as a patient educational placemat33, a 5-meal balanced diet34 with meals that are quantifiable, and visual methods to improve low fiber diet compliance. Furthermore, with the help of intelligent tools35, these initiatives can be used to improve the bowel preparation compliance of patients.

Additionally, the findings of our study can provide a reference for guiding patients in making new appointments for colonoscopy. When patients are predicted on the day of colonoscopy to be at high risk of inadequate bowel preparation, the model can help to avoid the missed lesions and the increased economic burden for patients that result from inadequate bowel preparation. Finally, the results of this study showed that both factors related to patient clinical characteristics (sex, history of previous colorectal surgery, chronic constipation, and diabetes mellitus) and factors related to bowel preparation (high fiber diet and PC time) were associated with inadequate bowel preparation. When implementing a personalized bowel preparation program, it is necessary to consider not only the type and dose of the bowel preparation but also the patients’ compliance with bowel preparation instructions; this provides a theoretical basis for clinical medical staff to deliver precise bowel preparation education to patients.

Our study has certain limitations. First, this was a single-center study, which limits the applicability of these results. To avoid overoptimistic results due to overfitting, we tested the performance of our prediction score model by using internal bootstrap cross validation. A multi-center study would be useful to validate this score, in order to better evaluate its clinical effectiveness. Second, we only included outpatients in this analysis; thus, the applicable population for the prediction model was limited. It has been shown that the risk of inadequate bowel preparation in inpatients is much higher than in outpatients10,21. Some scholars have confirmed that prediction models developed in hospitalized patients are more suitable for inpatients11. Third, we only used traditional logistic regression for model construction. With the development of artificial intelligence systems, an increasing number of machine learning algorithms are being used in the construction of prediction models, although these algorithms have not been demonstrated to have better predictive performance27. In addition, the high-dose bowel preparation regimen of 4 L PEG that was used in this study may decrease tolerance in patients. A previous study has shown that small doses of bowel preparation can have a similar intestinal cleaning effect but with better tolerance36. However, we adopted a split-dose regimen to improve tolerance: most of the patients who were included in this study were examined in the morning, and 88% (717/814) of the patients adopted the split-dose regimen. In our study, patients who were unwilling to take the same preparation again mainly cited the odor of the preparation, which was consistent with previous findings37. Future studies are needed to further improve the taste and dosage of PEG to improve patient satisfaction. Finally, in our study, we only recorded symptoms concerning the abdomen without including other indications for colonoscopy. Therefore, the indications for colonoscopy should be investigated in more detail in the future.

Conclusion

In this prospective observational study, a high fiber diet in the 24 h before examination, PC time > 5 h, male sex, chronic constipation, diabetes, and previous history of colorectal surgery were independent risk factors for inadequate bowel preparation. A model constructed based on these six basic, easily accessible variables can effectively identify individuals at high risk of inadequate bowel preparation and ultimately facilitate clinical decision-making.