Abstract
This study develops predictive models for Chinese female patients with VL utilizing machine learning techniques. The aim is to create an effective model that can assist in clinical diagnosis and treatment of vaginal relaxation, thereby enhancing women’s pelvic floor health. In total, 1184 women with VL have been randomly selected and categorized into groups using the finger measurement method. Among them, there are 383 cases of mild VL, 405 cases of moderate VL, and 396 cases of severe VL. Concurrently, 396 healthy women without VL who underwent routine health examinations have been chosen at random and assigned to the non-VL group. Based on 1580 cases, we have established LightGBM, Random Forest, XGBoost, and AdaBoost models based on training dataset using 5-fold cross-validation and GridSearch, and analyzed the performance of the models on the hold-out test dataset. The confusion matrix, precision, recall, F1 score, overall accuracy, and ROC curve of the models on the hold-out test dataset are compared. The overall accuracy of LightGBM model, RF model, XGBoost model, and AdaBoost model are 0.8987, 0.8987, 0.8987, and 0.8457, respectively. The average AUC of LightGBM model is 0.976, the one of RF model is 0.9763, the one of XGBoost model is 0.9775, and the one of AdaBoost model is 0.928. The XGBoost model has the more comprehensive and reasonable performance among the four prediction models, which can accurately distinguish between healthy, mild VL, as well as moderate VL and severe VL, which can assist doctors in diagnosing persons’ conditions more accurately, devising personalized treatment plans, avoiding unnecessary surgeries, reducing persons’ psychological stress, improving patient compliance and treatment outcomes, thus enhancing overall treatment results.
Similar content being viewed by others
Introduction
Vaginal laxity (VL) refers to a disease characterized by the looseness of the vaginal wall, frequently observed in women after childbirth. The development of VL is linked to multiple factors such as aging, pregnancy, childbirth, obesity, estrogen deficiency related to menopause, as well as various psychological and external pressures1,2. Currently, there is no gold standard for the diagnosis of vaginal laxity, and it is primarily based on clinical diagnosis3. A cross-sectional survey shows that at least 48% of women who have had vaginal deliveries experience VL4. In clinical practice, the incidence of clinical diagnosis of vaginal laxity is much lower than the actual prevalence5. In recent times, as the social economy has advanced and human perspectives have become more liberated, there has been an increasing focus on pelvic floor health. Hence, accurately predicting the risk of VL and taking targeted intervention measures is crucial to enhancing women’s quality of life.
To precisely forecast the risk of VL, an initial analysis of the factors most closely associated with VL is essential. Prior studies have delineated a range of risk factors such as pregnancy, childbirth, and obesity6. However, these studies frequently focus solely on assessing the correlation between individual predictive factors and VL, neglecting to conduct a comprehensive analysis of multiple factors. Although some studies have utilized machine learning for predicting the risk of VL, these investigations are predominantly carried out in populations from other countries, and there remains a scarcity of research on the risk of VL in Chinese women7. At the same time, a limited number of studies have employed machine learning methods for objective variable preselection, potentially resulting in challenges like multicollinearity or overfitting arising from the extensive array of variables8. To address this limitation, the study will employ Light Gradient Boosting Machine (LightGBM), Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Adaptive Boosting (AdaBoost) to develop and evaluate a forecasting model for VL, with the goal of overcoming the constraints of prior research. The machine learning feature selection method employed in this study can assist in identifying the most crucial predictive factors, thus partially mitigating the emergence of the previously mentioned issues, ultimately enhancing the accuracy and reliability of the model9,10.
This study conducts research on Chinese women, filling the gap in the lack of research on Chinese women in this field. By conducting pelvic floor screening during routine physical examinations and selecting participants for the study, it can better represent the Chinese female population, thereby constructing a VL prediction model suitable for Chinese women. The findings of the study will aid in implementing timely and targeted intervention measures, advancing women’s pelvic floor health, enhancing quality of life, and directing future research in this field.
Objectives and methods
Objectives
This study has been approved by the Ethics Committee of The Third Affiliated Hospital of Zhengzhou University (Ethics Approval Number: 2023-212-01, Approval Data: August 23, 2023) and all methods were performed in accordance with the relevant guidelines and regulations. Our research is retrospective in nature. In total, 1356 female patients with VL, who underwent routine physical examinations at the Third Affiliated Hospital of Zhengzhou University from January 2021 to July 2023 have been randomly selected and categorized into groups using the finger measurement method11. This includes 408 cases of mild vaginal relaxation, 474 cases of moderate vaginal relaxation, and 474 cases of severe vaginal relaxation. Concurrently, 416 cases of women without vaginal relaxation who underwent health examinations have been randomly selected as a control group. Based on the inclusion and exclusion criteria, 25 cases of mild vaginal relaxation, 69 cases of moderate vaginal relaxation, 78 cases of severe vaginal relaxation, and 20 cases of women without vaginal relaxation have been excluded. After screening, 383 cases of mild vaginal relaxation, 405 cases of moderate vaginal relaxation, 396 cases of severe vaginal relaxation, and 396 cases of women without vaginal relaxation have been included in the clinical study.
Inclusion and exclusion criteria
Inclusion criteria
Persons will be required to fit the following criteria in order to participate in this research: (1) Aged 18 years or older. (2) History of sexual activity. (3) Diagnosed with VL. (4) Not received any other treatment related to the disease within the past 3 months.
Exclusion criteria
If persons participating in the study fulfill any of the following criteria, they will not be included in the study: (1) Pregnancy and postpartum period. (2) Suffering from urinary incontinence and/or symptomatic pelvic organ prolapse (POP-Q stage 2 or above). (3) Persons with serious illnesses who cannot cooperate. (4) Incomplete clinical base-line information.
Data collection
Clinical baseline data
In this project, we gathered and organized the clinical baseline data of the research participants from the hospital’s electronic medical record system.
Pelvic floor pressure evaluation and modified Oxford muscle strength evaluation
All participants in the study underwent pelvic floor pressure assessment by pelvic floor rehabilitation therapists holding BFE Level I and Level II certificates. The assessment utilizes the MyoTrac biofeedback device from Nanjing Vishee Medical Technology Co., Ltd12. Furthermore, the modified Oxford grading scale is employed for evaluating pelvic floor muscle function. The Modified Oxford Scale for muscle strength was developed by Dr. Jo Laycock, a prominent physical therapist and recipient of the Lifetime Achievement Award from the International Continence Society. This method is extensively employed as a finger detection technique for the assessment of pelvic floor strength and has been consistently utilized in clinical practice and research since its inception. The Modified Oxford Scale holds considerable clinical value in evaluating pelvic floor muscle function, thereby providing a dependable basis for the diagnosis and management of pelvic-related disorders13.
Sexual satisfaction questionnaire and VL questionnaire evaluation
VLQ Questionnaire: Widely used for clinical assessment of VL, where patients need to select the corresponding degree of VL during sexual activity based on subjective feelings on a scale of 7 levels (very loose, loose, slightly loose, appropriate, slightly tight, tight, very tight, corresponding to scores of 1, 2, 3, 4, 5, 6, 7). A score of ≤ 3 indicates the possibility of VL14. When evaluating the effectiveness of intervention treatment, a score of ≥ 5 is considered as effective.
SSQ Questionnaire: Reflects the patient’s satisfaction with their own sexual life, where they select the corresponding level from 6 grades (none, poor, fair, good, very good, and excellent, corresponding to scores of 1, 2, 3, 4, 5, 6). A score of ≤ 4 indicates the possibility of VL15.
Measurement of vaginal pressure
The PHENIX USB2 neuromuscular stimulation device is used to assess dynamic vaginal pressure values in patients, with a normal reference range of 80 to 150 cmH2O16.
Statistical analysis methods
Statistical analysis is conducted using SPSS 26.0 and Python 3.9. The process modeling and analysis method is given Fig. 1. 28 clinical variables are included in this work, as shown in Table 1. We have excluded defective data and finally selected a total of 1580 persons from those who met the inclusion and exclusion criteria to form the final sample for analysis. After data cleaning, the total sample has been divided into a training dataset and a hold-out test dataset in a 7:3 ratio. Then, LightGBM, RF, XGBoost, and AdaBoost models have been established based on training dataset. The 5-fold cross-validation and GridSearch have been used to obtain the optimized hyperparameters. The hyperparameters for tuning include max_depth, num_leave, learning_rate, and n_estimators for LightGBM, XGBoost, and AdaBoost models. The hyperparameters for tuning include max_depth, min_samples_split, min_samples_leaf, and n_estimators for RF model. The performance of four models has been compared by comparing the confusion matrix, precision, recall, F1 score, overall accuracy, and ROC curve based on hold-out test dataset.
Results
The feature correlation heatmap is given in Fig. 2. The feature 3 (weight) and feature 4 (BMI), feature 6 (number of production) and feature 7 (number of vaginal births), and feature 12 (deep class II muscle strength), feature 13 (superficia class I muscle strength), and feature 14 (superficia class II muscle strength) are positively correlated and have a strong correlation. The feature 11 (deep class I muscle strength) and feature 15 (Intra-vaginal pressure) are negatively correlated and have a strong correlation. There is a certain degree of correlation between the features, showing strong nonlinearity.
In order to understand the relationship between each clinical feature and VL, we analyzed the importance of each clinical feature on VL, as shown in Fig. 3. The top four factors that have the greatest impact on VL, ranked from high to low, are phasic (Flick) contractions, average value during the endurance contraction phase, phasic (Flick) contractions, and intra-vaginal pressure.
Based on the clinical data, we build prediction models for VL using LightGBM, RF, XGBoost, and AdaBoost. The confusion matrixes of each model on the hold-out test dataset are shown in Fig. 4. In the data with label 0, the LightGBM model successfully predict 104 cases, and make 17 incorrect predictions as label 1; the RF model successfully predict 98 cases, and make 22 incorrect predictions as label 1 and 1 incorrect predictions as label 3; the XGBoost model successfully predict 101 cases, and make 20 incorrect predictions as label 1; and the AdaBoost model successfully predict 78 cases, and make 42 incorrect predictions as label 1. In the data with label 1, the LightGBM model successfully predict 99 cases, make 17 incorrect predictions as label 0, and make 1 incorrect predictions as label 3; the RF model successfully predict 104 cases, and make 11 incorrect predictions as label 1 and 2 incorrect predictions as label 3; the XGBoost model successfully predict 101 cases, and make 16 incorrect predictions as label 0; and the AdaBoost model successfully predict 104 cases, make 10 incorrect predictions as label 1, and made 3 incorrect predictions as label 1. In the data with label 2, the LightGBM model successfully predict 104 cases, and make 9 incorrect predictions as label 3; the RF model successfully predict 104 cases, and make 9 incorrect predictions as label 3; the XGBoost model successfully predict 104 cases, and make 9 incorrect predictions as label 3; and the AdaBoost model successfully predict 104 cases, and make 9 incorrect predictions as label 3. In the data with label 3, the LightGBM model successfully predict 119 cases and make 4 incorrect predictions as label 1; the RF model successfully predict 120 cases and 3 incorrect predictions as label 1; the XGBoost model successfully predict 120 cases and make 3 incorrect predictions as label 1; and the AdaBoost model successfully predict 114 cases, make 2 incorrect predictions as label 0, and make 3 incorrect predictions as label 1, and 4 incorrect predictions as label 2.
The precision, recall, F1 score, and accuracy are used to analyze the performance of the four prediction models. It is important to note that this work addresses a multi-class classification problem where each class has its precision, recall, and F1 score, but only the overall accuracy is available. The definitions of the above indicator are17,18
The precision, recall, and F1 score of each class and overall accuracy of LightGBM model, RF model, XGBoost model, and AdaBoost model on the hold-out test dataset are given in Fig. 5. For the data of label 0, the precisions of LightGBM model, RF model, XGBoost model, and AdaBoost model are 0.86, 0.9, 0.86, and 0.87, respectively; the recall of these models are 0.86, 0.81, 0.83, and 0.64, respectively; and the F1 score of these models are 0.86, 0.85, 0.85, and 0.74, respectively. For the data of label 1, the precisions of LightGBM model, RF model, XGBoost model, and AdaBoost model are 0.82, 0.81, 0.81, and 0.7, respectively; the recall of these models are 0.85, 0.89, 0.86, and 0.89, respectively; and the F1 score of these models are 0.84, 0.85, 0.84, and 0.78, respectively. For the data of label 2, the precisions of LightGBM model, RF model, XGBoost model, and AdaBoost model are 1, 1, 1, and 0.96, respectively; the recall of these models are 0.92, 0.92, 0.92, and 0.92, respectively; and the F1 score of these models are 0.96, 0.96, 0.96, and 0.94, respectively. For the data of label 3, the precisions of LightGBM model, RF model, XGBoost model, and AdaBoost model are 0.92, 0.91, 0.93, and 0.9, respectively; the recall of these models are 0.97, 0.98, 0.98, and 0.93, respectively; and the F1 score of these models are 0.94, 0.94, 0.95, and 0.91, respectively. The overall accuracy of LightGBM model, RF model, XGBoost model, and AdaBoost model are 0.8987, 0.8987, 0.8987, and 0.8457, respectively.
The receiver operating characteristic (ROC) curves of each label for LightGBM model, RF model, XGBoost model and AdaBoost model on the hold-out test dataset are given in Fig. 6. For label 0, the area under the curve (AUC) of LightGBM model, RF model, XGBoost model, and AdaBoost model are 0.97, 0.97, 0.971, and 0.939, respectively; for label 1, AUC of these models are 0.968, 0.967, 0.968, and 0.863, respectively; for label 2, AUC of these models are 0.971, 0.984, 0.984, and 0.988, respectively; and for label 3, AUC of these models are 0.983, 0.984, 0.987, and 0.922, respectively. The average AUC of LightGBM model is 0.976, the one of RF model is 0.9763, the one of XGBoost model is 0.9775, and the one of AdaBoost model is 0.928.
Discussions
This work is based on the clinical variables of the persons participating in the study, combining pelvic floor pressure assessment with modified Oxford muscle strength grading, SSQ scale, and VLQ scale for research. It includes 28 clinical variables that may affect VL, analyzing the correlations between variables and the associated risk factors for VL. Then, we adapted LightGBM, RF, XGBoost, and AdaBoost to establish the prediction models for VL. Through analyzing the prediction performance of each model based on confusion matrix, precision, recall, F1 score, overall accuracy, and ROC curve, we aim to find a prediction model capable of accurately distinguishing between individuals with healthy, mild VL, as well as moderate VL and severe VL. The results show that the top four factors that have the greatest impact on VL are phasic (Flick) contractions, average value during the endurance contraction phase, phasic (Flick) contractions, and intra-vaginal pressure. Subsequently, these indicators should be given high attention in the clinical screening process.
VL serves as a precursor symptom of pelvic floor dysfunction, but in the initial phases of the condition, especially in patients with mild VL, there are often no obvious clinical signs. This presents a significant challenge for clinical doctors in terms of diagnosis and treatment. Even for experienced clinical doctors, accurately diagnosing mild VL is quite difficult. In addition, duo to the lack of significant impact on persons’ daily lives, the clinical consultation rate at this stage is extremely low. The combined effects of the aforementioned reasons make it difficult to intervene in VL in the early stages in a timely manner. Moreover, it is challenging to effectively differentiate between moderate VL and severe VL. Moderate VL can be treated with either surgery or rehabilitation therapy, while severe VL typically requires surgery. Due to the challenges in effective differentiation, doctors often lean towards recommending surgery in clinical practice. This not only exacerbates the physical discomfort of persons with VL but also increases their psychological stress, reduces compliance, and impacts treatment outcomes. Therefore, it is crucial to find a prediction model that can accurately distinguish between healthy, mild VL, as well as moderate VL and severe VL. Such a model can assist doctors in diagnosing persons’ conditions more accurately, devising personalized treatment plans, avoiding unnecessary surgeries, reducing persons’ psychological stress, improving patient compliance and treatment outcomes, thus enhancing overall treatment results. Presently, machine learning is extensively utilized to aid in the diagnosis and treatment of a range of clinical diseases12,19,20,21,22,23, demonstrating extremely outstanding predictive performance.
LightGBM, Random Forest, XGBoost, and AdaBoost are four machine learning algorithms widely used in the field of disease diagnosis and prediction24. LightGBM is an efficient gradient boosting model that utilizes histogram algorithms to enhance training speed, rendering it particularly suitable for processing large-scale datasets25. Random Forest increases model stability and accuracy by constructing multiple decision trees and aggregating their outputs, while demonstrating remarkable resistance to overfitting26. XGBoost also belongs to the gradient boosting family and enhances training efficiency through the application of regularization and parallel computing techniques, contributing to its exceptional performance in numerous machine learning competitions27. In contrast, AdaBoost constructs a robust classifier by integrating the strengths of multiple weak classifiers, thereby significantly improving classification performance, although it is more sensitive to noise and outliers28. The primary differences among these four algorithms reside in their model construction methods, data processing strategies, and robustness to various data types and noise. Integrating these algorithms within specific application contexts can effectively aid researchers and engineers in selecting the most appropriate predictive models, thereby improving the accuracy of disease diagnosis and prediction.
Based on 1580 clinical cases, we conducted a comprehensive analysis of the clinical value of LightGBM, RF, XGBoost, and AdaBoost algorithms in predicting VL for the first time. To find a more reasonable prediction model, we firstly analyze the clinical cost of four prediction models. Through analyzing the confusion matrix, we found that the RF model and AdaBoost model have the risk of incorrectly predicting heathy persons and mild VL persons as severe VL persons. This is unacceptable. Although it is almost impossible for healthy and mild VL persons to be diagnosed as severe VL based on clinical diagnosis, we cannot ignore the possibility that doctors may overly rely on prediction model, no longer giving too much thought to clinical diagnosis, thereby leading to healthy and mild VL patients being diagnosed as severe VL. Since mild VL persons are treated clinically with rehabilitation methods, while severe VL patients require surgery, once healthy and mild VL persons are misdiagnosed as severe VL, it will lead to major clinical accidents. This is a catastrophic result that is unacceptable. Therefore, RF and AdaBoost algorithms are not suitable for establishing a prediction model for diagnosing VL. Comparing the LightGBM and XGBoost models, the F1 scores for label 0 are 0.86 and 0.85, the F1 scores for label 1 are 0.84 and 0.84, the F1 scores for label 2 are 0.96 and 0.96, and the F1 scores for label 3 are 0.94 and 0.95; the overall accuracy are 0.8987 and 0.8987; and AUC for label 0 are 0.97 and 0.971; AUC for label 1 are 0.968 and 0.968; AUC for label 2 are 0.971 and 0.984; and AUC for label 3 are 0.983 and 0.987. The F1 scores and total accuracy of the two models almost have no difference, with XGBoost model having a higher AUC. Based on the conclusion in Ref29. , when accuracy and AUC trends are inconsistent, choosing AUC as the evaluation metric for model performance is more reasonable. Thus, the XGBoost model is the best one in diagnosing VL.
The significance of this work is mainly in the following three aspects. Firstly, for the vast majority of persons with VL, the degree of VL is progressive, and at the beginning, the degree of VL is not high, making it easily treatable. However, as mentioned earlier, mild VL is difficult to diagnose effectively. Even doctors with extensive clinical experience find it challenging to diagnose and treat VL in the early stages promptly. Once the degree of VL deepens further, the person’s expenses and physical suffering will increase sharply, and even family relationships may be affected. As clinical doctors, we must admit with great shame that VL is often diagnosed at this stage. Over the years in medicine, we have witnessed too many persons seeking vaginal tightening surgery due to marital issues. We sympathize deeply with their experiences, which is also an important motivation for us to carry out this work. Through the prediction model established in this work, we can accurately diagnose mild VL and achieve timely treatment.
Secondly, moderate VL can be treated with surgery or rehabilitation methods, while severe VL must be treated with surgery. For persons with severe VL, rehabilitation methods are ineffective and can waste a lot of money. Therefore, to ensure the effectiveness of treatment, our decisions are usually conservative. That is, for many persons who may have moderate VL and for whom rehabilitation treatment may be effective, our recommendation is still to opt for surgical intervention. This will cause significant physical pain, and inevitable, antibiotics will be needed after surgery, which could have been avoided. With the help of predictive model, we can provide some persons with the option of rehabilitation treatment. Do not underestimate this choice, as it can greatly enhance patient compliance and increase the likelihood of curing VL. For many people, surgery is difficult to accept, leading them to refuse treatment. The stepwise process diagram of the predictive model used in clinical practice is given in Fig. 7.
Finally, and most importantly, in most parts of the world, there are still areas with limited medical resources, leading to a lack of awareness among local people about the dangers of VL. Although VL may seem like a functional organ disease that does not have serious consequences, it inevitably leads to recurrent vaginal infections, prolapse, urinary incontinence, and other outcomes that greatly impact person’s quality of life. Although treating the sick is our duty as doctors, curing diseases is not our ultimate goal. What we truly hope for is disease prevention. By promoting the prediction model in this work, timely diagnosis of early VL can be achieved in medically underserved areas. Popularizing Kegel exercises can help to stop the trend of VL development, fundamentally reducing the number of persons with VL.
Conclusions
Through analyzing the prediction performance of each model based on confusion matrix, precision, recall, F1 score, overall accuracy, and ROC curve, we found that XGBoost model has the more comprehensive and reasonable performance among the four prediction models. The XGBoost model can accurately distinguish between healthy, mild VL, as well as moderate VL and severe VL, which can assist doctors in diagnosing persons’ conditions more accurately, devising personalized treatment plans, avoiding unnecessary surgeries, reducing persons’ psychological stress, improving patient compliance and treatment outcomes, thus enhancing overall treatment results. Most important, by promoting the prediction model in this work, timely diagnosis of early VL can be achieved in medically underserved areas, which can help to stop the trend of VL development, fundamentally reducing the number of persons with VL.
Data availability
The datasets used and/or analysed during the current study available from the corresponding author on reasonable request.
References
Gao, L. et al. Fractional carbon dioxide laser improves vaginal laxity via remodeling of vaginal tissues in Asian women. J. Clin. Med. 11 (17), 5201 (2022).
Pereira, G. M. V. et al. Treatment of women with vaginal laxity: systematic review with meta-analysis. J. Sex. Med. 21 (5), 430–442 (2024).
Wang Jianliu, L. Expert consensus on the diagnosis and treatment of vaginal laxity (2020 edition). China Practical Gynecol. Obstet. 36 (10), 965–967 (2020).
Millheiser, L., Kingsberg, S. & Pauls, R. A cross-sectional survey to assess the prevalence and symptoms associated with laxity of the vaginal introitus. ICS Annual Meeting. : 23–27. (2010).
Pauls, R. N., Fellner, A. N. & Davila, G. W. Vaginal laxity: a poorly understood quality of life problem; a survey of physician members of the International Urogynecological Association (IUGA). Int. Urogynecol. J. 23, 1435–1448 (2012).
Mustafa-Mikhail, S. et al. Is Vaginal Laxity Associated with Vaginal Parity and Mode of Delivery? Int. Urogynecol. J. 35(12), 1–6 (2024).
Pereira, G. M. V. et al. Predictors of vaginal laxity and sexual function in a multiethnic population: a cross-sectional study.
Han, Y. & Wang, S. Disability risk prediction model based on machine learning among Chinese healthy older adults: results from the China Health and Retirement Longitudinal Study. Front. Public. Health. 11, 1271595 (2023).
Fazzari, M. J. et al. Adverse Pregnancy Outcomes in Women with Systemic Lupus Erythematosus: can we Improve Predictions with Machine Learning?9e000769 (Lupus Science & Medicine, 2022). 1.
Haug, C. J. & Drazen, J. M. Artificial intelligence and machine learning in clinical medicine. N. Engl. J. Med. 388 (13), 1201–1208 (2023).
Wang, J. et al. Consensus of diagnosis and treatment experts for vaginal laxity (2020 Edition). Chin. J. Practical Gynecol. Obstet. 36 (10), 965–967 (2020).
Yu, H. et al. Prediction of myofascial pelvic pain syndrome based on random forest model. Heliyon 10 (11), e31928 (2024).
Gao, L. et al. Association between pelvic floor muscle strength and sexual function based on PISQ-12—an analysis of data from a multicenter cross-sectional study on 735 nulliparae during pregnancy. Front. Med. 10, 1093830 (2023).
Polland, A. et al. Description of vaginal laxity and prolapse and correlation with sexual function (DeVeLoPS). Sex. Med. 9 (6), 100443–100443 (2021).
Plopa, M. Partners’ sexual satisfaction in partnership and marriage: a study method. J. Sex Marital Ther. 46 (1), 78–98 (2020).
Yang, S. et al. The effect of rehabilitation exercises combined with direct vagina low voltage low frequency electric stimulation on pelvic nerve electrophysiology and tissue function in primiparous women: a randomised controlled trial. J. Clin. Nurs. 26 (23–24), 4537–4547 (2017).
Yildirim, K. et al. Deep learning model for automated kidney stone detection using coronal CT images. Comput. Biol. Med. 135, 104569 (2021).
Yang, L. et al. Study of cardiovascular disease prediction model based on random forest in eastern China. Sci. Rep. 10 (1), 5245 (2020).
Wu, Y. et al. Risk factors and prediction models for cardiovascular complications of hypertension in older adults with machine learning: a cross-sectional study. Heliyon 10 (6), e27941 (2024).
Krittanawong, C. et al. Machine learning prediction in cardiovascular diseases: a meta-analysis. Sci. Rep. 10 (1), 16057 (2020).
Sailasya, G. & Kumari, G. L. A. Analyzing the performance of stroke prediction using ML classification algorithms. Int. J. Adv. Comput. Sci. Appl. 12(6), 539–545 (2021).
Bendifallah, S. et al. Machine learning algorithms as new screening approach for patients with endometriosis. Sci. Rep. 12 (1), 639 (2022).
Al Mudawi, N. & Alazeb, A. A model for predicting cervical cancer using machine learning algorithms. Sensors 22 (11), 4132 (2022).
Nagassou, M., Mwangi, R. W. & Nyarige, E. A hybrid ensemble learning approach utilizing light gradient boosting machine and category boosting model for lifestyle-based prediction of type-II diabetes mellitus. J. Data Anal. Inform. Process. 11 (4), 480–511 (2023).
Ke, G. et al. Lightgbm: a highly efficient gradient boosting decision tree. Adv. Neural. Inf. Process. Syst. 30, 1–9 (2017).
Hu, J. & Szymczak, S. A review on longitudinal data analysis with random forest. Brief. Bioinform. 24 (2), bbad002 (2023).
Hoque, R. et al. Breast cancer classification using XGBoost. World J. Adv. Res. Reviews. 21 (2), 1985–1994 (2024).
Bahad, P. & Saxena, P. Study of adaboost and gradient boosting algorithms for predictive analytics. International Conference on Intelligent Computing and Smart Communication 2019: Proceedings of ICSC 2019. Springer Singapore, : 235–244. (2020).
Ling, C. X., Huang, J. & Zhang, H. AUC: a better measure than accuracy in comparing learning algorithms. Advances in Artificial Intelligence: 16th Conference of the Canadian Society for Computational Studies of Intelligence, AI 2003, Halifax, Canada, June 11–13, 2003, Proceedings 16. Springer Berlin Heidelberg, : 329–341. (2003).
Funding
Foundation of He’nan Educational Committee (24A320036).
Author information
Authors and Affiliations
Contributions
The author’ contributions are as follows: Hongguo Zhao wrote the manuscript, modelling and simulation analysis; Peng Liu, Fei Chen, MengJuan Wang, and Jiaxi Liu collected the data; Xiling Fu, Hang Yu, and Manman Nai reviewed the manuscript; Lei Li and Xinbin Li were in charge of the whole trial.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethical approval
This study has been approved by the Ethics Committee of The Third Affiliated Hospital of Zhengzhou University (Ethics Approval Number: 2023-212-01, Approval Data: August 23, 2023). This study is a retrospective study and informed consent waived by ethics committee. The persons whose data are included in this manuscription have consented for all clinical data and other data included in the manuscript to be published. Ethics Approval Number: 2023-212-01.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Zhao, H., Liu, P., Chen, F. et al. Comparative analysis of machine learning approaches for predicting the risk of vaginal laxity. Sci Rep 15, 3147 (2025). https://doi.org/10.1038/s41598-025-86931-x
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41598-025-86931-x