Objective To identify the factors influencing inadequate bowel preparation before colonoscopy in patients with chronic constipation (CC), and to construct and validate machine learning-based predictive models for inadequate bowel preparation tailored to this high-risk group.
Methods A total of 700 CC patients who underwent colonoscopy at Nanjing Hospital of Traditional Chinese Medicine from May 2022 to May 2024 were enrolled by consecutive sampling and divided into a qualified group (560 cases) and an unqualified group (140 cases) according to the quality of bowel preparation. Univariate and multivariate logistic regression analyses were used to identify the factors influencing inadequate bowel preparation before colonoscopy in CC patients. Using SPSS software, three machine learning algorithms, namely logistic regression, a classification and regression tree (CRT) decision tree, and a back propagation neural network (BPNN), were applied to construct predictive models for inadequate bowel preparation before colonoscopy in CC patients. A further 250 CC patients who underwent colonoscopy at the same hospital from June to October 2024 were enrolled as an independent external validation cohort to assess the generalization ability of the models. The predictive value of the three models was compared using the area under the receiver operating characteristic (ROC) curve (AUC), sensitivity, and specificity.
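As an illustration of the comparison metrics named above, the following minimal Python sketch computes sensitivity, specificity, and AUC for one model. It is not the study's SPSS workflow: the labels, predicted probabilities, the 0.5 cut-off, and the function names are all hypothetical and chosen only to show the definitions (1 = inadequate preparation, 0 = adequate preparation).

```python
# Toy data only: these labels and scores are invented for illustration
# and are NOT taken from the study.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]            # 1 = inadequate bowel preparation
y_score = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]  # predicted probability


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Return (sensitivity, specificity) at an assumed probability cut-off."""
    tp = fn = tn = fp = 0
    for label, score in zip(labels, scores):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1 and predicted == 0:
            fn += 1
        elif label == 0 and predicted == 0:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


sens, spec = sensitivity_specificity(y_true, y_score)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}, AUC={auc(y_true, y_score):.3f}")
```

Applying the same functions to each model's predictions on the validation cohort would give directly comparable values; the cut-off used to derive sensitivity and specificity is an analyst's choice and is assumed here.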
Results Univariate and multivariate logistic regression analyses showed that age, duration of constipation, diabetes, history of abdominal surgery, waiting time for colonoscopy, and whether simethicone was taken were independent influencing factors for inadequate bowel preparation before colonoscopy in CC patients (P < 0.05). The CRT model identified duration of constipation, age, and diabetes as the classification factors for inadequate bowel preparation in CC patients. According to the standardized importance of the independent variables in the BPNN model, the top five factors influencing inadequate bowel preparation in CC patients were duration of constipation, age, waiting time for colonoscopy, whether simethicone was taken, and diabetes. The AUCs of the models built with the three machine learning algorithms were all greater than 0.800. The logistic regression model showed the best predictive performance, with an AUC of 0.889 (95% CI, 0.857 to 0.922), a sensitivity of 0.843, and a specificity of 0.821.
Conclusion The three machine learning predictive models constructed in this study can effectively identify CC patients at high risk of inadequate bowel preparation. The logistic regression model shows the best overall performance and provides a reliable tool for clinical risk stratification and targeted intervention.