Introduction

Polycystic ovary syndrome (PCOS)1 is mainly characterized by hyperandrogenism (androgen excess, acne and/or hirsutism) and ovarian dysfunction (anovulation or oligo-ovulation and/or polycystic ovarian morphology), and it is one of the most common gynecological and metabolic disorders. Its prevalence is about 5–10%, and the prevalence among women of childbearing age in China is 5.61%.

However, most medications used to treat PCOS symptoms are prescribed off-label2, because neither the FDA nor the European Medicines Agency has approved any medication specifically for PCOS. First-line treatment for PCOS3 relies mainly on health management, such as stopping smoking, limiting alcohol consumption, and controlling weight, to improve acne, obesity and other symptoms. For PCOS patients who wish to conceive, the most commonly used drugs are letrozole and clomiphene; patients who are not trying to conceive are advised to use oral contraceptives and anti-androgens. Although these drugs are effective, adverse events such as multiple pregnancy are prone to occur during treatment. Because PCOS is a lifelong and heterogeneous disease, its treatment should be symptom-oriented and based on the expectations of the individual patient. Traditional Chinese medicine (TCM) is a highly individualized approach that follows an independent theoretical path and formulates diagnosis and treatment plans by systematically assessing the patient’s symptoms and signs.

In the 2024 Guidelines for the Diagnosis and Treatment of PCOS by Combining Traditional Chinese and Western Medicine4, PCOS is divided into the syndrome types Spleen/Kidney Yang Deficiency with Phlegm Dampness and Kidney (Yin) Deficiency with Liver Depression. Kidney deficiency is therefore the most common syndrome type in PCOS. Bushen (kidney-tonifying) therapy is a routine treatment for PCOS, and most of the herbs selected warm the kidney and replenish essence, such as Cuscuta chinensis, Xianlingpi (Epimedium), and Cistanche deserticola. TCM has been widely used to treat gynecological diseases and the infertility associated with PCOS. TCM prescriptions are tailored to each patient, which is consistent with the principle of individualized treatment of PCOS, but rigorous clinical trials focusing on their specific effects are lacking. The available trials may also be affected by biases related to age, genetic background, or other confounding variables.

The rat is an ideal animal model for PCOS because it is sensitive to sex hormones and has a stable estrous cycle that is easy to observe. Because the effects of TCM on the various symptoms of PCOS remain unclear, and Bushen formulas are the most commonly used, it is worthwhile to evaluate the specific effects of Bushen formulas in rodent models of PCOS5. Most clinical research consists of single-center trials with small samples, which leads to a lack of high-quality evidence-based conclusions; we therefore included only animal experiments. Here, we report a systematic review and meta-analysis of studies on the effectiveness of Bushen formulas in animal models of PCOS. The study aimed to evaluate the efficacy of Bushen formulas in improving weight, disease symptoms, and insulin resistance in these models.

Materials and methods

Literature inclusion and exclusion

This systematic review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines6. The protocol, based on SYRCLE’s tool for animal studies7, was registered in PROSPERO (registration number: CRD42024529569).

Inclusion criteria: (1) Study type: randomized controlled animal experiments published in Chinese or English; (2) Subjects: mouse or rat models of PCOS; (3) Intervention: the experimental group was treated mainly with Bushen formulas; (4) Comparison: the study included a PCOS model group and a blank control group; (5) Outcomes: the effects of Bushen formulas on PCOS, including morphological and other changes in the animals, specifically effects on weight, sex hormone disorders, and insulin resistance.

Exclusion criteria: (1) interventions other than compound Chinese herbal formulas (for example, single herbs or acupuncture); (2) reviews, repeated studies, and meta-analyses; (3) in vitro experiments or experiments that did not use rats or mice as animal models; (4) studies without available full text or without relevant outcomes; (5) duplicate publications. Two authors independently screened the literature.

Literature search

Computer searches were performed in China National Knowledge Infrastructure, Wanfang Database, VIP Database, Chinese Biomedical Literature Database (CBM), PubMed, Embase, Web of Science, and the Cochrane Library. The search covered the period from the inception of each database to October 2, 2022. The Chinese searches used “Bushen”, “experimental animals”, “rat”, “mice”, and “PCOS” as keywords and subject headings; the English search terms included “Animal Models”, “mouse”, “rats”, “Bushen”, and “Polycystic Ovary Syndrome”. See Table 1 for the search formula.

Literature screening

The retrieved literature was screened against the inclusion and exclusion criteria and evaluated independently by at least two reviewers. Disagreements were resolved through discussion with a third party. The specific steps were as follows: (1) import the initially retrieved articles into EndNote and delete duplicate records; (2) read the full text and exclude literature that does not meet the inclusion criteria; (3) carefully examine the content and exclude repeatedly published studies; (4) review the remaining literature against the established inclusion and exclusion criteria and record in detail the reasons for each exclusion; (5) where information is incomplete, contact the publisher or the study authors for supplementary details; (6) confirm the literature included in the study and extract the data.

Data extraction

Two researchers independently extracted the data, and any differences between them were settled through discussion. The following characteristics were extracted: (1) publication details (author and year); (2) intervention used (prescription components, dose, route, and duration); (3) intervention in the control group; (4) PCOS induction method; (5) animals used (species and strain); (6) sample size; (7) potential mechanism of the intervention; (8) outcome data. When one control group was compared with several treatment groups, the sample size of the control group was divided by the number of treatment groups so that the control animals were not counted more than once. When a treatment was administered at multiple doses, the data for each dose were extracted and analyzed separately.
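The adjustment of the control-group sample size can be illustrated with a minimal Python sketch. The function name and the example numbers are hypothetical, and rounding down with a minimum of one animal is an assumption, since the rounding rule is not stated in this review.

```python
def split_control_n(control_n: int, n_treatment_groups: int) -> int:
    """Share one control group among several treatment groups.

    Assumption: the shared size is rounded down and never drops below 1;
    the review does not state how fractional sizes were handled.
    """
    if control_n < 1 or n_treatment_groups < 1:
        raise ValueError("sample size and number of groups must be positive")
    return max(1, control_n // n_treatment_groups)


# Hypothetical study: 12 model-group rats compared with 3 dose groups.
shared_n = split_control_n(12, 3)
print(f"control animals counted per comparison: {shared_n}")
```

Each dose group is then entered as its own comparison, paired with the reduced control sample size.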

Risk of bias

The internal validity of each study was evaluated independently by two reviewers using SYRCLE’s risk of bias tool for animal studies4, and any disagreements were resolved through discussion with a third party. The tool comprises 10 items: (1) sequence generation: whether the allocation sequence was described in enough detail to assess the comparability of groups; (2) baseline characteristics: whether all relevant prognostic factors or animal characteristics were described; (3) allocation concealment: whether the allocation could have been foreseen before or during enrollment of the animals; (4) random housing: whether the animals were housed randomly; (5) blinding of caregivers and investigators: how they were kept unaware of the intervention each animal received, with any information on the effectiveness of the blinding; (6) random outcome assessment: whether animals were selected at random for outcome assessment; (7) blinding of outcome assessors: how they were kept unaware of the intervention each animal received, with any information on the effectiveness of the blinding; (8) incomplete outcome data: the completeness of the data for each primary outcome, including attrition and exclusions from the analysis, whether these were reported, and the reasons for attrition, exclusion, or re-inclusion in each intervention group; (9) selective outcome reporting: how the possibility of selective reporting was examined and what was found; (10) other sources of bias: any important concerns not covered by the items above.

Statistics

Review Manager (RevMan) version 5.4, provided by Cochrane, was used for the statistical analysis. Relative risk (RR) was used for count data and mean difference (MD) for continuous data, each reported with a 95% confidence interval (CI). Heterogeneity was judged quantitatively by I2 and the P value. When P > 0.05 and I2 ≤ 50%, the fixed-effect model was used for the meta-analysis; otherwise, the random-effects model was used to pool the effect sizes. When more than 10 studies were included, a funnel plot was drawn to further evaluate publication bias.

I2 ≤ 50% was considered low heterogeneity, 50% < I2 ≤ 70% moderate heterogeneity, and I2 > 70% high heterogeneity.
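The pooling and heterogeneity rules above can be sketched in a few lines of Python. This is an illustrative sketch, not the RevMan computation itself: the function names and the example effect sizes are hypothetical, the between-study variance uses the DerSimonian-Laird estimator as an assumption, and the P value of the heterogeneity test is passed in by the caller because it requires a chi-square distribution function that the standard library does not provide.

```python
import math


def pool_mean_differences(effects, variances):
    """Pool per-study mean differences by inverse-variance weighting."""
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need two or more studies, one variance per effect")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    # Fixed-effect estimate: each weight is the inverse of the study variance.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and I2 (share of variation beyond chance, in percent).
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Between-study variance (DerSimonian-Laird; an assumed estimator).
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Random-effects estimate and its 95% confidence interval.
    w_re = [1.0 / (v + tau2) for v in variances]
    sum_w_re = sum(w_re)
    random_md = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum_w_re
    se = math.sqrt(1.0 / sum_w_re)
    return {
        "fixed_md": fixed,
        "random_md": random_md,
        "ci95": (random_md - 1.96 * se, random_md + 1.96 * se),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }


def classify_i2(i2):
    """Apply the thresholds stated in the Statistics section."""
    if i2 <= 50:
        return "low"
    if i2 <= 70:
        return "moderate"
    return "high"


def choose_model(i2, p_value):
    """Fixed effect only when P > 0.05 and I2 <= 50%, otherwise random."""
    return "fixed" if p_value > 0.05 and i2 <= 50 else "random"


# Hypothetical mean differences and variances from three comparisons.
result = pool_mean_differences([-2.1, -3.0, -1.2], [0.20, 0.35, 0.15])
print(result["random_md"], result["ci95"], classify_i2(result["i2"]))
```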

Results

Research selection

Initially, 594 studies were retrieved through a comprehensive search of eight databases, of which 107 duplicates were removed. After the titles and abstracts were reviewed, 43 studies were excluded on the basis of the predetermined exclusion criteria. A further 384 studies were excluded in the second selection stage, and 60 studies were finally included in the systematic review. Fig. 1 shows a flow chart of the study selection. All included studies are randomized controlled experiments. Fig. 2 summarizes the study characteristics.

Fig. 1

Flow diagram for identification and selection of included studies.

Table 1 General characteristics of included studies.

Quality assessment of included studies

See Table 2 for the SYRCLE risk of bias assessment of the 60 included studies.

Table 2 Specific bias risk table.

Twenty-two studies8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29 used a random number table; the other studies30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67 only mentioned randomization and did not describe the specific method used to generate the random sequence. No blinding during the conduct of the experiments was reported in any of the 60 studies. Two studies reported that the outcome assessor was blinded, and the other studies did not report this. No attrition was found, and no selective reporting of outcomes was found. None of the studies reported the basis for sample size estimation, but the baseline characteristics were comparable between groups, and there was no obvious conflict of interest in the inclusion or exclusion of subjects or in the funding. Therefore, the risk of other bias was low in all 60 studies.

Fig. 2

Summary of study characteristics.

The number of studies by treatment time, treatment method (zhifa), modeling method, and hormone measurement method.

Main outcomes

Ovarian mass

Five studies (including eight comparisons) reported ovarian mass, and there was statistical heterogeneity among the studies. The random effects model showed a significant difference between the Bushen formula treatment group and the control group. Five results showed that treatment reduced ovarian mass, and the pooled difference was statistically significant (MD = −1.01, 95% CI = [−1.50, −0.52], P < 0.0001), with heterogeneity test I2 = 51%. In PCOS animal models, Bushen formulas can therefore significantly reduce ovarian mass, as shown in Fig. 3.

Fig. 3

Ovarian mass forest plot.

Testosterone (T) level

Fifty-five studies (including 102 comparisons) reported T levels. The random effects model showed a significant difference between the Bushen formula treatment group and the control group. Ninety-six results showed a decrease in T levels in the treatment group relative to the control group, and the pooled difference was statistically significant (MD = −2.57, 95% CI = [−2.91, −2.23], P < 0.00001). There was significant heterogeneity between the studies (I2 = 87%). The forest plot is shown in Fig. 4, and the funnel plot is shown in Fig. 5. The subgroup analysis plots are shown in Attachment 2.

The subgroup analysis of treatment methods showed a heterogeneity of I2 = 83% between the treatment-method subgroups, indicating significant differences between treatment methods. Within subgroups, I2 was 68% for Bushen with resolving phlegm and removing blood stasis, 6% for Bushen with clearing the liver and promoting blood circulation, 63% for Bushen with regulating menstruation (Tiaojing), and 41% for Bushen with regulating the cycle (Tiaozhou). Heterogeneity within these subgroups was markedly lower than the overall value, indicating that the treatment method contributes to the heterogeneity between studies.

The subgroup analysis of modeling methods showed that the heterogeneity I2 between different modeling methods subgroups was 27.8%, and the I2 within each group was greater than 75%, indicating that the heterogeneity between studies was not caused by different modeling methods.

The subgroup analysis of treatment time showed a heterogeneity I2 of 82.3% among subgroups with different treatment times, indicating significant differences between different treatment times. The I2 within each group was greater than 75%, indicating that heterogeneity between studies was not caused by different treatment times.

The subgroup analysis of hormone measurement methods showed a heterogeneity I2 of 91.4% among subgroups of different hormone measurement methods, indicating significant differences between different hormone measurement methods. The I2 within each group was greater than 75%, indicating that the heterogeneity between studies was not caused by different hormone measurement methods.

Fig. 4

T forest plot.

Fig. 5

T funnel plot.

Secondary outcomes

Weight

Fourteen studies (including 25 comparisons) reported body weight. The random effects model showed a significant difference between the Bushen formula treatment group and the control group. Seventeen results showed significant weight loss in the treatment group, and the pooled difference was statistically significant (MD = −22.07, 95% CI = [−31.12, −13.03], P < 0.00001). There was significant heterogeneity between the studies (I2 = 98%), as shown in Fig. 6. The funnel plot is shown in Fig. 7, and the subgroup analysis plots are shown in Attachment 2.

The subgroup analysis of treatment methods showed a heterogeneity of I2 = 0% between the treatment-method subgroups, indicating no significant difference in the effect on weight between treatment methods. The I2 within each treatment-method subgroup was greater than 75%, indicating that the heterogeneity between studies is not caused by the treatment method and that results vary considerably even among studies using the same treatment method.

The subgroup analysis of modeling methods showed a heterogeneity I2 = 92% among subgroups of different modeling methods. The I2 within each group is greater than 75%, indicating that heterogeneity between studies is not caused by different modeling methods.

The subgroup analysis of treatment time showed a heterogeneity I2 = 63.7% between different treatment times, indicating no significant difference between different treatment times. The I2 within each group is greater than 75%, indicating that heterogeneity between studies is not caused by different treatment times.

Fig. 6

Weight forest plot.

Fig. 7

Weight funnel plot.

Follicle-stimulating hormone (FSH) level

Forty-seven studies (including 95 comparisons) reported FSH levels. The random effects model showed a significant difference between the Bushen formula treatment group and the control group. Sixty results showed an increase in FSH in the treatment group, and the pooled difference was statistically significant (MD = 0.96, 95% CI = [0.61, 1.30], P < 0.00001). There was significant heterogeneity between the studies (I2 = 90%). The forest plot is shown in Fig. 8, and the funnel plot is shown in Fig. 9. The subgroup analysis plots are shown in Attachment 2.

The subgroup analysis of treatment methods showed a heterogeneity I2 of 65.8% between the treatment-method subgroups, indicating moderate differences in FSH between treatment methods. Within subgroups, the I2 of the studies on Bushen with promoting ovulation (Cupai), Bushen with removing blood stasis and resolving phlegm, and Bushen with regulating blood circulation was less than 75%, indicating that the heterogeneity between studies was partially due to the treatment method.

The subgroup analysis of modeling methods showed that the heterogeneity I2 between different modeling methods subgroups was 35%, and the I2 within each group was greater than 75%, indicating that the heterogeneity between studies was not caused by different modeling methods.

The subgroup analysis of treatment time showed that the heterogeneity I2 between the treatment-time subgroups was 82.5%, indicating a significant difference between treatment times. The I2 within each subgroup was greater than 75%, indicating that the heterogeneity between studies was not caused by treatment time and that results vary considerably even among studies with the same treatment time.

The subgroup analysis results of hormone measurement methods are similar to the treatment time, indicating significant differences between different hormone measurement methods, and the heterogeneity between studies is not due to differences in hormone measurement methods.

Fig. 8

FSH forest plot.

Fig. 9

FSH funnel plot.

Luteinizing hormone (LH) level

Forty-eight studies (including 99 comparisons) reported LH levels. The random effects model showed a significant difference between the Bushen formula treatment group and the control group (MD = −1.43, 95% CI = [−1.79, −1.08], P < 0.00001). A total of 77 results showed a decrease in LH in the treatment group, and the difference was statistically significant. There was significant heterogeneity between the studies (I2 = 90%). The forest plot is shown in Fig. 10, and the funnel plot is shown in Fig. 11. The subgroup analysis plots are shown in Attachment 2.

The subgroup analysis of treatment methods showed a heterogeneity I2 of 67.2% between the treatment-method subgroups, indicating moderate differences in LH between treatment methods. Within subgroups, the I2 of the studies on Bushen with regulating the cycle (Tiaozhou) was 15%, indicating that the heterogeneity between studies was partially due to the treatment method.

The subgroup analysis of modeling methods showed that the heterogeneity I2 between different modeling methods subgroups was 23.2%, and the I2 within each group was greater than 75%, indicating that the heterogeneity between studies was not caused by different modeling methods.

The subgroup analysis of treatment time showed that the heterogeneity I2 between the treatment-time subgroups was 93.8%, indicating a high degree of difference between treatment times. The I2 within the 18–22 day subgroup was 66%, indicating that the heterogeneity between studies was partially due to treatment time.

The subgroup analysis of hormone measurement methods showed a heterogeneity I2 of 80.7% among subgroups of different hormone measurement methods, indicating significant differences between different hormone measurement methods. The I2 within each group was greater than 75%, indicating that the results were not entirely equal between different subgroups. The heterogeneity between studies was not caused by different hormone measurement methods.

Fig. 10

LH forest plot.

Fig. 11

LH funnel plot.

Homeostasis model assessment of insulin resistance (HOMA-IR) level

Seven studies (including 14 comparisons) reported HOMA-IR levels. The random effects model showed that the Bushen formula treatment group had significantly lower HOMA-IR values than the control group; a total of 10 results showed a decrease, and the pooled difference was statistically significant (MD = −3.26, 95% CI = [−4.50, −2.02], P < 0.00001). There was significant heterogeneity between the studies (I2 = 99%). The forest plot is shown in Fig. 12.

Fig. 12

HOMA-IR forest plot.

Handling of heterogeneity: The range of hormone levels in rats varies greatly. In other high-quality meta-analyses of animal models67,68,69,70, the heterogeneity of hormone levels such as testosterone is generally greater than 75%, and subgroup analysis did not reduce it. This is attributed to various uncertain factors in animal hormone measurement, which lead to significant heterogeneity between studies. We therefore applied a normalization method: the results of each treatment group were expressed as a ratio to the results of the corresponding model group, and the normalized results were then subjected to meta-analysis. T, FSH, and LH were normalized for subgroup analysis. The forest plots and funnel plots can be found in Attachment 3. The heterogeneity of the different subgroups for each indicator is shown in Tables 3, 4, 5 and 6.
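One way to read the normalization step is sketched below in Python. The exact formula is not given in this article, so the sketch makes a loud assumption: the treatment-group and model-group means and standard deviations are both divided by the model-group mean, which is treated as a fixed scaling constant (its sampling uncertainty is ignored). The function names and the numbers are hypothetical.

```python
def normalize_to_model(mean_t, sd_t, mean_m, sd_m):
    """Express both groups relative to the model-group mean.

    Assumption: the model-group mean is a fixed scaling constant, so
    means and standard deviations are simply divided by it.
    """
    if mean_m <= 0:
        raise ValueError("model-group mean must be positive")
    return (mean_t / mean_m, sd_t / mean_m), (1.0, sd_m / mean_m)


def normalized_mean_difference(mean_t, sd_t, n_t, mean_m, sd_m, n_m):
    """Mean difference and its variance on the normalized (unitless) scale."""
    (nt_mean, nt_sd), (nm_mean, nm_sd) = normalize_to_model(
        mean_t, sd_t, mean_m, sd_m
    )
    md = nt_mean - nm_mean
    var = nt_sd ** 2 / n_t + nm_sd ** 2 / n_m
    return md, var


# Hypothetical testosterone values: treatment 2.0 +/- 0.5, model 4.0 +/- 1.0,
# with 10 animals per group.
md, var = normalized_mean_difference(2.0, 0.5, 10, 4.0, 1.0, 10)
print(md, var)
```

Because every study is rescaled by its own model group, differences in assay units or baseline hormone ranges no longer enter the pooled estimate directly, which is the motivation given in the text for normalizing.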

After normalization, the heterogeneity of the T results remained high. Subgroup analysis showed that treatment method and treatment time affect heterogeneity and that different treatment methods lead to different results; modeling method and hormone measurement method do not affect heterogeneity, and different modeling methods do not affect T levels.

The heterogeneity of FSH normalization results is high, and treatment methods, modeling methods, hormone measurement methods, and treatment time do not affect heterogeneity. Different subgroups do not affect the level of FSH.

The heterogeneity of the results after LH normalization is moderate, and subgroup analysis shows that treatment method and treatment time can affect heterogeneity. Different treatment methods and treatment time can affect the differences in results. Modeling methods and hormone measurement methods can affect heterogeneity, but different subgroups do not affect LH levels.

In summary, normalization can reduce the heterogeneity between studies to some extent, which makes the results of the subgroup analysis more informative.

Table 3 Heterogeneity table of different subgroups of T.
Table 4 Heterogeneity table of different subgroups of weight.
Table 5 Heterogeneity table of different subgroups of FSH.
Table 6 Heterogeneity table of different subgroups of LH.

Sensitivity analysis: When studies were removed one at a time and the effect model was switched, the pooled effect sizes for each outcome remained close, without significant or directional changes, indicating that the meta-analysis results are generally robust and reliable.

Discussion

The 60 included articles covered 62 formulas and 99 treatment groups, distinguished by dosage and herbal ingredients. All included studies were conducted between 2016 and 2022. Fifty-seven studies used SD rats, while only 2 studies used Wistar rats. The composition of the compound Chinese herbal formulas varied greatly, ranging from 5 to 20 herbs. The treatment methods mainly included Bushen Huatan Quyu, Bushen Huoxie Huayu, Bushen Qinggan Huoxie, Zi Yin Bushen Xuguan, Bushen Huoxie, Bushen Huatan, Bushen Tiaozhou, Bushen Tiaojing, Bushen Huayu, and Bushen Cupai. The methods for inducing PCOS models included the Poretsky method, aromatase inhibitor modeling, androgen modeling, estrogen modeling, and progesterone modeling. The treatment time was mostly 10–30 days; only three studies, by Li Caixia41, Lin Hui47, and Pan Wen52, used a treatment time longer than 30 days. In all studies the route of administration was oral gavage. Hormone levels were measured by enzyme-linked immunosorbent assay, radioimmunoassay, or chemiluminescence assay.

In this study, we evaluated the effectiveness of Bushen formulas in PCOS animal models through meta-analysis. Compared with other papers, the advantage of this study is that only animal experiments were selected for meta-analysis. Unlike analyses of randomized controlled trials (RCTs), this allowed us to compare different formulas and doses to identify the most suitable treatment method and to conduct multi-factor subgroup analyses based on the treatment method and experimental design. Overall, the results indicate that Bushen formulas improve certain specific characteristics of PCOS (ovarian mass, weight, sex hormones, and insulin resistance). Specifically, ovarian mass, testosterone, body weight, LH and HOMA-IR decreased, while FSH increased. According to the subgroup analysis, different treatment methods differ in how much they improve the various symptoms of PCOS, as do different treatment times, whereas there is no significant difference between modeling methods in the improvement of the indicators. Different hormone measurement methods give significantly different results for testosterone, FSH, and LH. These findings may inform future clinical research: in PCOS clinical trials, patients can be classified and different treatment methods used for targeted drug research, which offers some guidance for clinical application. To better translate findings from animal models into clinical treatment72, further clinical research can be conducted73,74,75,76, and mechanisms can be explored in combination with network pharmacology and other methods77. In addition, this article normalizes the data to remove some of the heterogeneity caused by systematic differences between animal experiments, which makes the results of the subgroup analysis more informative.

Overall, the quality of the included studies is moderate, with little difference in quality between articles. Out of 60 studies, 6 publications reported random allocation and no publications mentioned blind evaluation of results, which suggests that blinding is often regarded as technically difficult in animal experiments. Given that failure to conduct blind evaluations may lead to overestimation of the effect size, we recommend following more standardized experimental practices in preclinical studies, such as designing double-blind animal experiments and strictly implementing randomization, to address the high risk of bias.

In addition, our research findings also indicate that there are differences in the therapeutic effects of Bushen formulas on various models, indicating that clinical applications should also pay attention to the advantages of individualized treatment of Bushen formulas. However, there is significant heterogeneity among different literature, which may be due to differences in specific treatment methods, differences in dosage and usage, and methods for measuring hormone levels. At the same time, we found that the range of hormone levels in rats varied greatly in different experiments, and the heterogeneity of hormone levels was generally greater than 75%, which may be due to various uncertain factors in animal hormone measurement.

Our research has the following limitations. First, we searched only Chinese and English databases and excluded databases in other languages; grey literature and negative results are also lacking. Second, the number of studies evaluating pregnancy rate and litter size is too small for meta-analysis, so the effects of Bushen formulas on infertility in animal models could not be evaluated. Third, relatively few of the included studies used mouse models, so we were unable to conduct a subgroup analysis by species. Fourth, this article reports the results as the mean difference (MD) of continuous variables but does not adjust for multiple comparisons or discuss the potential type I errors caused by the large number of outcomes tested. Multiple comparisons increase the likelihood of false-positive results78,79,80,81. To reduce type I errors, Bonferroni correction can be used, different subgroups can be merged separately, or one group can be selected for a merged analysis.
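The Bonferroni correction mentioned among the limitations can be sketched as follows in Python. The function name and the P values are hypothetical and are not results of this review; the sketch only shows how the adjustment works when several outcomes are tested.

```python
def bonferroni(p_values, alpha=0.05):
    """Return Bonferroni-adjusted P values and rejection decisions.

    Each P value is multiplied by the number of tests (capped at 1),
    which is equivalent to comparing the raw P value with alpha / m.
    """
    m = len(p_values)
    if m == 0:
        raise ValueError("at least one P value is required")
    adjusted = [min(1.0, p * m) for p in p_values]
    reject = [p_adj <= alpha for p_adj in adjusted]
    return adjusted, reject


# Hypothetical P values for four outcomes tested in the same analysis.
adjusted, reject = bonferroni([0.0001, 0.004, 0.03, 0.2])
for p_adj, significant in zip(adjusted, reject):
    print(f"adjusted P = {p_adj:.4f}, significant = {significant}")
```

In this hypothetical case the third outcome would be significant at 0.05 without correction but not after it, which is the kind of false-positive risk the limitation describes.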

Conclusion

Through a systematic review and meta-analysis of the effectiveness of Bushen formulas on PCOS in animal models, we conclude that Bushen formulas can improve ovarian mass, weight, T, FSH, LH, and HOMA-IR in these models. In addition, we suggest that studies of different mechanisms use PCOS modeling agents suited to their research objectives.