Background

Ectopic pregnancy (EP) refers to a pregnancy occurring outside the uterine cavity. The ampullary region of the fallopian tube is the most common site of extrauterine implantation1. Other sites include the cornu of the uterus, ovaries, cervix, and abdominal cavity2. EP occurs in approximately 1.5–2% of all pregnancies and is a major cause of female mortality3. In recent years, the morbidity and mortality rates of EP have declined significantly owing to advancements in noninvasive diagnostic procedures (e.g., transvaginal ultrasound). However, with the widespread use of assisted reproductive technology (ART) and controlled ovarian stimulation, the incidence of EP continues to increase globally, posing a threat to gestational security and safety4. Previous studies have identified ART, double embryo transfer, cleavage-stage (day-3) embryo transfer, and fresh cycles as potential risk factors for EP. The incidence of EP after ART is reported to be two to three times that after natural conception5,6,7,8.

Given the dangers of EP and its increasing incidence, recognizing risk factors, screening high-risk patients, and facilitating early detection and intervention are crucial. Current studies suggest that tubal abnormalities are the primary cause of tubal implantation; however, endometrial and embryonic factors may also play a role9. These factors may impair normal tubal function, embryo-tubal communication, embryo-endometrium synchronization, and the implantation process. Risk factors include previous tubal surgery, tubal infertility, pelvic inflammatory disease, uterine pathology, and lifestyle factors such as smoking10,11, which also contribute to female infertility. Because ART treatment is expensive and invasive and does not guarantee success, and because EP poses a significant risk to maternal health, identifying risk factors for EP following ART and developing a robust predictive model are essential.

Previous studies have identified double embryo transfer, fresh cycles, and cleavage-stage embryo transfer as independent risk factors12. However, double cleavage-stage embryo transfer remains one of the most common embryo transfer strategies owing to patient limitations, such as reduced ovarian reserve or poor endometrial receptivity, and has not yet been fully replaced by single-blastocyst transfer13,14. Therefore, identifying the risk factors, assessing individual risks, and developing a predictive model for ectopic pregnancy following double fresh cleavage-stage embryo transfer are necessary to promote the safety and success of IVF/ICSI. This study aims to address this gap. We believe that it may also help patients and healthcare professionals raise appropriate awareness, make timely clinical decisions, and advance our understanding of the mechanisms underlying EP.

Results

Demographic features and clinical information of the study population

In total, 23,858 fresh double cleavage-stage embryo transfers were performed during the study period. Of these, 7456 cycles involving 7215 intrauterine and 241 ectopic pregnancies met the inclusion criteria. The training and validation sets were formed in a 7:3 ratio based on the time of embryo transfer. All patients undergoing embryo transfer between January 2016 and June 2018 were arranged in chronological order; the first 7/10 were selected as the training set, and the remaining 3/10 were regarded as the validation set. A detailed flowchart of this process is shown in Fig. 1.
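The time-ordered 7:3 split described above can be illustrated with a minimal Python sketch. The record layout (`cycle_id`, `transfer_date`) and the toy dates are hypothetical and are not taken from the study database; only the "first 7/10 in chronological order" rule comes from the text.

```python
from datetime import date


def split_index(n_cycles, train_tenths=7):
    """Number of cycles assigned to the training set (integer arithmetic, no float rounding)."""
    return n_cycles * train_tenths // 10


def chronological_split(records, train_tenths=7):
    """Sort records by transfer date and return (training, validation) lists."""
    ordered = sorted(records, key=lambda r: r["transfer_date"])
    cut = split_index(len(ordered), train_tenths)
    return ordered[:cut], ordered[cut:]


# Hypothetical toy cohort: ten cycles with shuffled transfer dates.
days = [5, 3, 9, 1, 7, 2, 10, 4, 8, 6]
cycles = [{"cycle_id": i, "transfer_date": date(2016, 1, d)} for i, d in enumerate(days)]

train, valid = chronological_split(cycles)
print(len(train), len(valid))  # 7 3
print(split_index(7456))       # 5219
```

Applied to the 7456 included cycles, this rule yields 5219 training cycles and 2237 validation cycles. Unlike a random split, a chronological split places every validation cycle after every training cycle.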

Fig. 1

Study flowchart showing the selection process for intrauterine and ectopic pregnancies.

The training set included 5051 intrauterine pregnancies (IUPs) and 168 EPs, whereas the validation set included 2164 IUPs and 73 EPs. The median age in both sets was 32 years, and the median body mass index (BMI) was 22 kg/m². Most patients had an infertility duration of < 5 years, and > 60% had no previous gestation. Less than 5% of patients in both sets had previous pregnancies. Comparisons of other demographic characteristics and medical histories showed no statistically significant differences between the two sets.

In terms of uterine malformation, the training set included ten patients with a rudimentary horn and unicornuate uterus, three with a duplex uterus, and four with a uterine septum. The validation set included three patients with a rudimentary horn and unicornuate uterus, one with a duplex uterus, and one with a uterine septum. Regarding previous tubal surgery, in the training set 384/692 (55.5%) patients had received unilateral salpingectomy or tubal ligation, 80/692 (11.6%) bilateral salpingectomy or tubal ligation, and 96/692 (13.7%) and 125/692 (18.1%) unilateral and bilateral salpingostomy, respectively. Fifty-three of 692 (7.7%) patients had undergone unilateral fenestration, whereas 8/692 (1.2%) had received bilateral fenestration. A total of 54 patients had undergone multiple tubal surgeries. The corresponding proportions for the six types of surgery in the validation set were 179/287 (62.4%), 34/287 (11.8%), 33/287 (11.5%), 41/287 (14.3%), 19/287 (6.6%), and 2/287 (0.7%), respectively. The proportions of tubal surgery types did not differ significantly between the training and validation sets.

Additionally, 2775/5219 (53.2%) pregnancies in the training set and 1223/2237 (54.6%) pregnancies in the validation set had undergone endometrial biopsy owing to abnormal uterine bleeding, previous gestational history, previous ART failures, or abnormal ultrasonographic findings. A total of 285/2775 (10.3%) pregnancies in the training set and 149/1223 (12.2%) in the validation set were diagnosed with chronic endometritis (CE) through immunohistochemical staining of CD138+ plasma cells. Additionally, 163/2775 (5.9%) pregnancies in the training set and 73/1223 (6.0%) in the validation set were diagnosed with endometrial polyps. Uterine scarring was identified before oocyte retrieval in 11/2775 (0.4%) pregnancies in the training set and 2/1223 (0.2%) in the validation set. None of these differences between the two sets was statistically significant.

Regarding ART information on the current cycle, the proportion of patients receiving agonist versus antagonist protocols was approximately 1:1 in both sets, with a median endometrial thickness of 11 mm on the day of human chorionic gonadotropin (HCG) administration. Nearly 90% of all patients showed a type-A endometrium on HCG day on transvaginal ultrasound (TVS) examination, and > 97% of all patients had at least one top-quality embryo transferred. The proportion of fertilization methods, percentage of 2-pronucleus (2PN) embryos, serum estradiol level on HCG day, and number of retrieved oocytes did not differ significantly between the training and validation sets. The related data are presented in Table 1.

Table 1 Demographic characteristics and assisted reproduction information of the study population.

Risk factors for EP in the training set

Non-parametric and chi-square tests were performed on the training set to identify factors that differed significantly between EP and IUP patients. In the training set, demographic features including age, BMI, duration of infertility, gravidity, parity, previous cycles and causes of infertility did not differ between EP and IUP patients (P > 0.05). Patients with EP showed higher proportions of tubal factor infertility (63.1% vs. 39.6%, P < 0.001), previous tubal surgery (35.7% vs. 12.5%, P < 0.001), polycystic ovarian syndrome (PCOS) (17.9% vs. 10.9%, P = 0.005), and uterine malformation (1.8% vs. 0.3%, P = 0.007). They also showed higher serum estradiol levels (7104.00 [5155.00, 11816.00] vs. 6304.00 [4500.00, 10106.00], P = 0.001) and a thinner endometrium on HCG day (10.00 [9.00, 11.00] vs. 11.00 [10.00, 12.00], P < 0.001). Regarding previous tubal surgery, the proportions of unilateral and bilateral salpingostomy were higher in patients with EP than in those with IUP (9.5% vs. 1.6%, P < 0.001; 14.9% vs. 2.0%, P < 0.001). Moreover, 19/111 (17.1%) patients with EP were diagnosed with chronic endometritis, compared with 266/2664 (10.0%) patients with viable IUP, a statistically significant difference (P = 0.015). However, the proportions of endometrial polyps (4.5% and 4.5%) and uterine scars (0.0% and 0.4%) in patients with EP and IUP, respectively, did not differ significantly (Supplementary Table 1).

Tubal factor infertility, previous tubal surgery, PCOS, uterine malformation, endometrial thickness on HCG day, estradiol level on HCG day, and baseline characteristics including maternal age and BMI were included in the multivariate regression analysis. Pathological diagnoses of chronic endometritis were not included because not all patients had undergone endometrial biopsy before oocyte retrieval. The related data are presented in Table 2.

Table 2 Multivariate regression model in the training set.

As shown in Table 2, tubal factor infertility (OR = 1.680, 95% CI 1.129–2.500, P = 0.011), previous tubal surgery (OR = 2.667, 95% CI 1.772–4.015, P < 0.001), PCOS (OR = 1.809, 95% CI 1.169–2.799, P = 0.008), uterine malformation (OR = 5.663, 95% CI 1.524–21.043, P = 0.010), endometrial thickness (OR = 0.585, 95% CI 0.522–0.655, P < 0.001), and estradiol level on HCG day (OR = 1.000, 95% CI 1.000–1.000, P < 0.001; the OR represents the change in EP risk per 1 pmol/L increase in serum estradiol level) were all identified as independent predictors of EP. Collinearity diagnosis showed that the variance inflation factors (VIFs) for the above parameters were 1.239, 1.234, 1.016, 1.000, 1.007, and 1.004, respectively, indicating that the model was not affected by collinearity among the predictors.
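To make the reported odds ratios easier to interpret, the following Python sketch shows how an OR and its Wald 95% CI relate to a logistic regression coefficient, and how a per-unit OR scales over larger increments. It assumes the reported intervals are Wald intervals, which the text does not state. The helper names are illustrative, and the per-unit OR of 1.0001 in the last line is a hypothetical value, not the study's estradiol coefficient.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_or_ci(beta, se, z=Z_95):
    """Odds ratio and Wald confidence interval from a coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_ci(lower, upper, z=Z_95):
    """Back out the standard error of log(OR) from a reported Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def geometric_midpoint(lower, upper):
    """For a Wald interval, the OR is the geometric mean of the two bounds."""
    return math.sqrt(lower * upper)


def or_for_increment(or_per_unit, k):
    """Odds ratio for a k-unit increase, given the per-unit odds ratio."""
    return or_per_unit ** k


# Consistency check on a reported interval (previous tubal surgery, 1.772 to 4.015).
print(round(geometric_midpoint(1.772, 4.015), 3))  # 2.667

# Hypothetical per-unit OR of 1.0001 compounded over a 1000-unit increase.
print(round(or_for_increment(1.0001, 1000), 3))
```

The second helper illustrates why a predictor measured in small units, such as estradiol in pmol/L, can show an OR that rounds to 1.000 per unit while still being statistically significant: the effect only becomes visible over increments of hundreds or thousands of units.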

The nomogram, receiver operating characteristic (ROC) curve, and calibration curve are shown in Fig. 2, with an area under the curve (AUC) of 0.768 (95% CI 0.732–0.805, P < 0.001) indicating considerable predictive value of this model. The slope and Brier score of the calibration curve for the training set were 1.000 and 0.0294, respectively, indicating that the model was well calibrated. We further established an online calculator for the nomogram, which predicts the individual risk of EP from the input parameters of each subject in a user-friendly way (Fig. 3).
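For readers less familiar with these metrics, the sketch below computes the AUC (as the probability that a randomly chosen EP case receives a higher predicted risk than a randomly chosen IUP case) and the Brier score (mean squared difference between predicted probability and outcome) in plain Python. The toy labels and probabilities are hypothetical. The last function gives the Brier score of an uninformative model that predicts the cohort prevalence for everyone, a common reference point that is not reported in the study.

```python
def auc_mann_whitney(y_true, y_score):
    """AUC as the fraction of (case, non-case) pairs ranked correctly; ties count as half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and binary outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def brier_of_prevalence_only(prevalence):
    """Brier score when every subject is assigned the cohort prevalence."""
    return prevalence * (1 - prevalence)


# Hypothetical toy data: 1 = EP, 0 = IUP.
labels = [0, 0, 0, 0, 1, 1]
probs = [0.01, 0.02, 0.05, 0.10, 0.04, 0.20]

print(auc_mann_whitney(labels, probs))  # 0.75
print(brier_score(labels, probs))

# Reference value for the training set (168 EPs among 5219 pregnancies).
print(round(brier_of_prevalence_only(168 / 5219), 4))  # 0.0312
```

Because EP is rare, the Brier score is small even for an uninformative model (about 0.031 at the training-set prevalence), so the reported value of 0.0294 is best read alongside the calibration slope rather than on its own.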

The optimal cut-off value of the risk score, determined by the maximum value of the Youden index, was 86.1 for the 5219 patients in the established predictive model. The 5219 patients were subsequently categorized into a high-EP-risk group (1394 pregnancies with a risk score > 86.1) and a low-EP-risk group (3825 pregnancies with a risk score ≤ 86.1). The high-risk group showed a significantly higher incidence of EP than the low-risk group (OR = 4.950, 95% CI 3.623–6.711, P < 0.001). The model achieved a sensitivity of 64.3% and a specificity of 74.6% (Table 3).
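The cut-off selection described above can be sketched as follows. The Youden index is J = sensitivity + specificity − 1, and the optimal cut-off is the score that maximizes J. The scores and labels below are hypothetical toy values; only the rule "high risk if score > cut-off" mirrors the text.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Classify score > cutoff as high risk and return (sensitivity, specificity)."""
    tp = sum(1 for s, y in zip(scores, labels) if s > cutoff and y == 1)
    fn = sum(1 for s, y in zip(scores, labels) if s <= cutoff and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s <= cutoff and y == 0)
    fp = sum(1 for s, y in zip(scores, labels) if s > cutoff and y == 0)
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(scores, labels):
    """Return (cutoff, J, sensitivity, specificity) for the cutoff maximizing the Youden index."""
    best = None
    for cutoff in sorted(set(scores)):
        sens, spec = sensitivity_specificity(scores, labels, cutoff)
        j = sens + spec - 1
        if best is None or j > best[1]:
            best = (cutoff, j, sens, spec)
    return best


# Hypothetical nomogram scores; 1 = EP, 0 = IUP.
toy_scores = [40, 55, 60, 70, 85, 90, 95, 110]
toy_labels = [0, 0, 0, 1, 0, 1, 0, 1]

cutoff, j, sens, spec = youden_cutoff(toy_scores, toy_labels)
print(cutoff, round(j, 3), round(sens, 3), round(spec, 3))  # 60 0.6 1.0 0.6
```

The Youden index weights sensitivity and specificity equally. For a rare but dangerous outcome such as EP, a cut-off that favors sensitivity could also be justified; the equal weighting is a property of the index, not a clinical recommendation.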

Table 3 Association between nomogram scores and EP risk in training and validation set.

Fig. 2

Nomogram, ROC curve, and calibration curve for predicting EP in the study population. a–c show the nomogram, ROC curve, and calibration curve, respectively. The calibration curve examines the agreement between the observed and predicted probabilities. The x-axis of the calibration curve shows the nomogram-predicted probability, and the y-axis shows the observed probability. The 45° diagonal line represents ideal calibration, where the observed probability equals the predicted probability. The slope and Brier score quantify how closely the actual curve approaches the ideal diagonal line; a slope approaching 1.000 and a Brier score approaching 0.000 indicate good calibration.

Fig. 3

Screenshot of the user-friendly online calculator based on the nomogram. Users enter the measured parameters on the left and press the ‘predict’ button; the predicted probability of EP is then calculated automatically and displayed on the right. Five examples of input parameters are provided in the table below the figure.

Validation of the nomogram

We applied our prediction model to the validation set to evaluate the performance of the nomogram. Overall, the model performed well, achieving an AUC of 0.756 (95% CI 0.703–0.810). Additionally, the calibration curve showed a similarly high level of agreement between the observed and predicted probabilities, with a slope of 0.969 and a Brier score of 0.030. Applying the same cut-off point as in the training set, 556 pregnancies were classified into the high-EP-risk group and 1681 into the low-EP-risk group. The OR for ectopic pregnancy in the high-EP-risk group was 4.587 (95% CI 2.899–7.246, P < 0.001), with a sensitivity of 60.3% and a specificity of 76.4% (Table 3).

We also analyzed the clinical utility of this model by performing a decision curve analysis, which indicated substantial clinical utility of the nomogram (Fig. 4). Notably, among patients who underwent endometrial biopsy, the incidence of CE was significantly higher in those with ectopic pregnancies (10/40 vs. 139/1183, P = 0.011). However, the proportions of endometrial polyps (2/40 vs. 31/1183, P = 0.361) and uterine scars (0/40 vs. 2/1183, P = 0.794) did not differ significantly. We performed sensitivity analyses to exclude the effect of potential outliers, and the overall results were consistent with those of our primary analysis (Supplementary Figs. 1 and 2).
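As background for the decision curve analysis, the sketch below computes net benefit at a given risk threshold using the standard definition NB = TP/n − (FP/n) × pt/(1 − pt), together with the "treat all" reference strategy. The "treat none" strategy has a net benefit of zero by definition. The toy outcomes and predicted probabilities are hypothetical, and the study's own implementation is not described in the text.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of flagging patients whose predicted risk is at or above the threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit when every patient is treated as high risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


def standardized(nb, y_true):
    """Standardized net benefit: net benefit divided by the outcome prevalence."""
    return nb / (sum(y_true) / len(y_true))


# Hypothetical toy data: 1 = EP, 0 = IUP.
outcomes = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
risks = [0.01, 0.02, 0.02, 0.03, 0.03, 0.04, 0.06, 0.08, 0.12, 0.05]

nb_model = net_benefit(outcomes, risks, 0.05)
nb_all = net_benefit_treat_all(outcomes, 0.05)
print(round(nb_model, 4), round(nb_all, 4))  # 0.1895 0.1579
```

A model shows clinical utility at a given threshold when its net benefit exceeds both reference strategies, which is the comparison a decision curve displays across a range of thresholds.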

Fig. 4

ROC curve, calibration curve, and decision curve of the predictive model in the validation set. a–c show the ROC curve, calibration curve, and decision curve, respectively. The calibration curve examines the agreement between the observed and predicted probabilities. The x-axis of the calibration curve shows the nomogram-predicted probability, and the y-axis shows the observed probability. The 45° diagonal line represents ideal calibration, where the observed probability equals the predicted probability. The slope and Brier score quantify how closely the actual curve approaches the ideal diagonal line; a slope approaching 1.000 and a Brier score approaching 0.000 indicate good calibration. The decision curve evaluates the clinical utility of the nomogram, with the vertical axis showing the standardized net benefit and the horizontal axis showing the high-risk threshold. The solid black line represents the net benefit when all patients are considered not to develop the outcome (intrauterine pregnancy). The solid grey line represents the net benefit when all patients are considered to develop the outcome (ectopic pregnancy).

Establishment of a threshold for 48-h serum HCG level rising ratio in studied patients

After screening for high-risk patients, serum HCG levels serve as an essential parameter to help distinguish EP from normal IUPs. In this study, patients with EP had lower serum HCG levels at days 14–21 than those with IUP (129.71 [44.87–273.55] vs. 829.96 [520.93–1278.78] mIU/mL and 1178.46 [421.90–3936.50] vs. 15411.00 [10000.00–22621.00] mIU/mL; P < 0.001 for both comparisons). As patients with suspicious or ‘potentially dangerous’ HCG levels underwent a repeat serum HCG test 48 h later, we calculated the 48-h serum HCG rising ratio among patients with high and low EP risk, as determined by the nomogram cut-off point, in both the training and validation sets. The relationship between the 48-h serum HCG rising ratio and EP incidence followed a ‘dropping’ pattern in both high- and low-EP-risk patients in both sets. In the training set, EP incidence dropped in high-EP-risk patients at a 48-h serum HCG rising ratio of 2.18, whereas the threshold in the low-EP-risk group was 1.74 (P < 0.001). The rising thresholds for the high- and low-EP-risk groups in the validation set were 2.10 and 1.66 (P < 0.001), respectively (Fig. 5). In other words, patients with high EP risk, as categorized by the established nomogram, must meet a stricter safety threshold for the 48-h serum HCG rise before either an IUP or EP is diagnosed.
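The two-step logic described above (risk group from the nomogram score, then a risk-specific 48-h HCG threshold) can be expressed as a short Python sketch. It assumes the rising ratio is the later HCG value divided by the earlier one, which the text implies but does not define explicitly. The thresholds are the training-set values reported above, the function names are illustrative, and the HCG values in the example are hypothetical.

```python
NOMOGRAM_CUTOFF = 86.1  # risk-score cut-off reported for the training set
TRAINING_THRESHOLDS = {"high": 2.18, "low": 1.74}  # 48-h rising-ratio thresholds, training set


def risk_group(nomogram_score, cutoff=NOMOGRAM_CUTOFF):
    """Scores above the cut-off are classified as high EP risk."""
    return "high" if nomogram_score > cutoff else "low"


def hcg_rising_ratio(hcg_first, hcg_48h):
    """Assumed definition: HCG at 48 h divided by the initial HCG (same units)."""
    if hcg_first <= 0:
        raise ValueError("initial HCG must be positive")
    return hcg_48h / hcg_first


def exceeds_threshold(hcg_first, hcg_48h, group, thresholds=TRAINING_THRESHOLDS):
    """True when the 48-h rise is above the threshold for the given risk group."""
    return hcg_rising_ratio(hcg_first, hcg_48h) > thresholds[group]


# Hypothetical patient: HCG doubles from 200 to 400 mIU/mL within 48 h.
print(hcg_rising_ratio(200, 400))           # 2.0
print(exceeds_threshold(200, 400, "low"))   # True
print(exceeds_threshold(200, 400, "high"))  # False
```

The same doubling of HCG clears the threshold for a low-risk patient but not for a high-risk patient, which is what a "stricter safety threshold" means in practice.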

Fig. 5

Relationship between rising 48-h serum HCG level ratio and EP risk in training set (a) and validation set (b).

Furthermore, we examined patients with a ‘likely safe’ HCG rising ratio (exceeding the threshold for the 48-h serum HCG rising ratio) in both the high- and low-EP-risk groups. In the training set, 38 high-EP-risk patients and 41 low-EP-risk patients developed EP even though their 48-h serum HCG rising ratio exceeded the threshold. Of these, 92.1% (35/38) in the high-EP-risk group and 39.0% (16/41) in the low-EP-risk group had tubal factor infertility. In the validation set, the corresponding proportions among patients who developed EP despite a relatively high 48-h serum HCG rising ratio were 77.2% (17/22) and 57.1% (12/21) in the high- and low-EP-risk groups, respectively. This implies that a history of tubal disease plays an essential role in the onset of EP even in patients receiving embryo transfer, in whom the embryos are placed directly into the uterine cavity without passing through the fallopian tube, which suggests the existence of tubal-embryo communication. It also indicates that a 48-h HCG increase above the threshold is not 100% safe, considering cases of viable ectopic sacs. A comprehensive assessment of the patient’s risk factors, laboratory findings, TVS results, and physical examination is necessary.

Discussion

In this study, we established a predictive model for ectopic pregnancy after double fresh day-3 embryo transfer following IVF/ICSI procedures based on demographic features, clinical history, and assisted reproduction information. Before pregnancy is confirmed, the model can assist specialists in screening high-risk patients, choosing appropriate treatment or embryo transfer strategies, and offering clinical counselling. After a positive serum HCG test, risk-specific 48-h HCG rising thresholds can help classify patients in the high- or low-EP-risk group, as categorized by the nomogram, as ‘likely safe’ or ‘likely dangerous.’

The overall incidence of EP in patients undergoing double fresh day-3 embryo transfer was 2.72% in our study, in line with previous studies performed at our center15. In other studies, the incidence of EP after ART ranged from 1.40 to 3.45%16,17,18,19,20. ART, double embryo transfer, and fresh cycles are recognized risk factors for EP because they alter the hormonal milieu, uterine contractility, and embryo implantation position5,18. The differences in EP incidence may be attributed to the different study populations and ART methods used across cohorts21,22.

Our predictive model demonstrated robust performance, with an AUC of 0.768 in the training set and 0.756 in the validation set, showing strong agreement between the observed and predicted probabilities. By determining the optimal cutoff point using the maximal Youden index, we found that pregnancies in both the training and validation sets were successfully categorized into high- and low-risk groups for EP. Patients in the high-risk group exhibited a significantly higher incidence of EP. Given the low occurrence but high mortality rate of EP, our model provides a reference for providing timely interventions and counseling for high-risk patients to prevent EP.

In previous studies, attempts to establish a predictive model for EP have been made in limited scenarios. Li et al. investigated high-risk factors in 5606 frozen-thawed embryo transfers and established a nomogram based on endometrial thickness on the day of embryo transfer, tubal factor infertility, previous ectopic pregnancy, and stage of embryo transfer23. Liu et al. proposed a model in a cohort of 12,766 fresh and frozen cycles, emphasizing that tubal factor infertility, thin endometrium, cleavage-stage embryo transfer, and fresh embryo transfer are risk factors for EP24. Both models showed relatively high AUC values approaching 0.7, with Li’s model showing a sensitivity of 69.2% and specificity of 57.9%, and Liu’s model showing a sensitivity of 65.0% and specificity of 65.0% in their validation sets. Unlike the studies by Li and Liu, which included patients receiving fresh and frozen embryo transfers, our study specifically focused on patients undergoing fresh embryo transfer (day-3, 2 embryos), as previous studies have shown that both fresh embryo transfer and cleavage-stage embryo transfer are independent risk factors for EP. This reduced the effects of these confounding factors and supports the direct clinical application of the model. Moreover, we validated the nomogram using an internal validation set organized in chronological order to assess the calibration and clinical utility of the nomogram, an approach not applied in prior reports. We also determined a threshold for the total points of the nomogram rather than a threshold for a single risk factor, aiming to categorize patients with high or low EP risk by comprehensively considering both maternal and ART-related parameters.

Compared with prior studies, the relatively high specificity of this predictive model and the increased EP incidence in the high-EP-risk group suggest that high-risk patients warrant special attention, with potential measures being taken to carefully monitor the implantation site of the embryo. Early ultrasonography or intermittent serum HCG testing may prove beneficial.

Other studies have explored the possibility of predicting EP using HCG levels, HCG ratios, progesterone levels, and other ultrasound-based variables. The M4 model, established in 2007 using logistic regression of log HCG averages, HCG ratios, and HCG quadratic effects, has been proven to be an effective tool, correctly classifying 84% of EP cases as high risk25,26. This model also demonstrated the potential to reduce follow-up by 70% in patients with pregnancies of unknown location27. The identification of different EP risk levels has facilitated personalized therapeutic strategies for each patient, ranging from expectant treatment to monitoring using ultrasonography or HCG levels. In 2016, Bobdiwala et al. examined the clinical performance of the M4 model in a cohort of 835 patients. They classified patients into three groups: high-risk, low-risk (likely failed), and low-risk (likely IUP), with different procedures, including follow-up or re-examination of serum HCG. The adverse event rate of the M4 model in clinical practice was 2%, supporting its safety for use28. An updated version, the M6 model, was established in 2016; progesterone was applied as the first-step filter, with initial HCG, HCG ratio, and initial progesterone as predictors in the second-step model29. The same classification system as the M4 model was applied, correctly classifying 85.9% of EP cases, with no serious adverse events reported among the 2625 patients30. Another model included the primary causes of infertility and categorized patients into groups based on EP risk15. In 2024, Kyriacou et al. reported an updated version of the M6 model, introducing the M6P and M6NP models, which further classified patients based on the availability or suitability of progesterone level measurements. Both models achieved an AUC > 0.8, distinguishing EP cases with high accuracy31.

However, these models mainly focus on the effect of HCG after embryo transfer (ET), allowing risk assessment only after a positive serum HCG test. Little attention has been given to identifying risk factors before ET, such as infertility type, history of tubal surgery, and endometrial thickness. Similarly, limited emphasis has been placed on preventing the risk of EP or providing comprehensive medical counseling to high-risk patients before ET. Identifying these factors may facilitate the selection of an appropriate strategy for endometrial preparation and timely intervention, which was one of the focus areas of our study. After individual EP risk was identified through pre-ET factors, we considered serum HCG as a second step for determining the risk of EP, as patients in the high- and low-EP-risk groups had different 48-h serum HCG rising thresholds, with high-risk patients having a stricter safety line for the HCG rise. This echoes Xu’s finding that the 48-h HCG rise was significant in predicting EP15. However, a 48-h HCG increase above the threshold is not 100% safe, as rare cases, including live fetuses at ectopic sites or bilateral ectopic pregnancies, may occur, as reported by Acet et al. Transvaginal ultrasound can aid in the diagnosis under such circumstances32.

The independent risk factors included in the predictive model warrant considerable clinical attention. Uterine factors such as thin endometrium, uterine malformation, and CE increase the risk of EP. These factors trigger abnormal endometrial receptivity or alter intrauterine implantation, eventually leading to retrograde transport of the embryo to the fallopian tube33. Prior studies on the relationship between endometrial thickness and EP emphasized that sub-optimal endometrial build-up, either too thin or too thick, might impair endometrial-embryo cross-talk, leading to failed implantation in the uterine cavity2. The correlation between decreased endometrial thickness and increased ectopic pregnancy risk might be explained by the fact that endometrial thickness is a marker of endometrial receptivity, and insufficient endometrial build-up might fail to provide an optimal environment for embryo implantation. In the prior literature, a study by Haydardedeoğlu reported a significantly lower EP rate after hysteroscopic metroplasty, implying that correcting uterine malformations before ET may improve ART success34. Given that all CE patients were treated with standard oral antibiotics at our center, the higher proportion of CE history in patients with EP might reflect more complex inflammation or inflammation outside the uterine cavity, such as pelvic inflammatory disease, microbial disorders of the upper genital tract, or infections caused by atypical microbes35,36,37.

Tubal factor infertility and previous tubal surgery have been reported as risk factors for EP after IVF-ET, with the level of risk depending on the degree of damage and the extent of anatomic alteration. Surgical treatment of tubal factor infertility, including salpingectomy or conservative methods, has been recommended; however, a high ectopic pregnancy rate remains38. The proposed underlying mechanism is that damaged fallopian tubes are less sensitive in determining whether the environment is appropriate for sustaining embryo implantation and in recognizing embryonic factors that might lead to pregnancy loss. Consequently, damaged fallopian tubes may appear more receptive to the embryo39. The association of EP with PCOS indicates that an altered hormonal milieu, especially a high estradiol level after controlled ovarian stimulation (COS), might be a trigger for EP. This suggests that an excessive estradiol level alters endometrial receptivity as well as uterine contractility and thus disturbs the interaction between the endometrium and the embryo, causing retrograde movement of the embryo from the uterine cavity to the fallopian tube. Jwa SC et al. reported in 2020 that ovarian stimulation was significantly associated with EP risk, as those undergoing COS showed a higher incidence of EP than those in natural cycles. The authors pointed out that an altered estradiol level during the implantation period might interrupt embryo-tubal-endometrial crosstalk, affecting tubal peristalsis and egg or zygote transport40,41. Thus, applying an appropriate embryo transfer strategy to adjust hormonal status might be helpful42,43. Although some risk factors cannot be completely eliminated or prevented, comprehensive counseling and sufficient patient education should be provided.

The significance of this study lies in its ability to predict EP by identifying risk factors and screening patients at high risk of EP before embryo transfer. We also identified appropriate 48-h HCG rising thresholds for patients at high and low risk of EP. With a relatively high AUC, calibration, sensitivity, and specificity, our model effectively predicted EP risk in patients receiving double fresh day-3 embryo transfer. Therefore, specialists should arrange appropriate interventions for high-risk patients before and after embryo transfer.

In clinical practice, risk factors should be assessed before embryo transfer and interventions taken accordingly. An appropriate strategy may be selected for patients with PCOS or uncorrected uterine malformations (Step 1 in Fig. 6). Therapies including restoration of the endometrium with autologous tissue or the use of exogenous medication to reach a certain level of endometrial thickness might be applied for those with a thin endometrium. Correction of tubal factor infertility and uterine malformations through surgery has proven beneficial, although potential risks must also be considered. The relationship between EP and chronic endometritis and endometrial polyps supports the treatment of chronic endometritis and removal of endometrial polyps before embryo transfer. The nomogram can then be used for screening high-risk patients and predicting individual EP risk (Step 2 in Fig. 6).

After embryo transfer, specific measures could be taken in patients with high EP risk, including more frequent ultrasound or serum HCG tests and raising awareness of potential EP symptoms such as abdominal pain and vaginal bleeding. A stricter 48-h rising serum HCG threshold would help determine whether the HCG increase is satisfactory for high- and low-risk patients (Step 3 in Fig. 6). Thus, this nomogram can be used in patients with infertility undergoing fresh double-embryo transfer to maximize the safety and success of ART strategies.

From the patient’s perspective, we summarized the implications of the model into a ‘3A’ strategy comprising ‘Awareness, Action and Avoiding anxiety’, which helps to answer the three main questions that patients with infertility have about EP: ‘What should I know’, ‘What should I do’, and ‘Who should I turn to’. In detail, patients should be aware of the prevalence of EP and its potential risk factors, as well as whether they are at high or low risk according to the nomogram. Once the risk factors have been identified, taking action requires both patients and professionals to eliminate risk factors that can be corrected and to select appropriate ART strategies with regard to risk factors that cannot be eliminated or prevented. ‘Who should I turn to’, on the other hand, refers to self-monitoring of warning signs and turning to healthcare professionals when needed. This helps to avoid unnecessary anxiety while the results are unknown and to assure patients that everything possible has been done if the outcome is unfavorable. The clinical protocols for the nomogram are summarized in Fig. 6.

Fig. 6

Clinical protocol for reducing EP risk from both healthcare professionals’ and patients’ perspectives.

The strengths of this study are as follows: (a) a large sample size with abundant documented clinical information; (b) a nomogram was established and validated using a time-ordered internal validation set; (c) a threshold was determined, with the model showing good accuracy and clinical utility in both sets; and (d) therapeutic strategies were suggested for high-risk patients. However, the study had the following limitations: (a) a retrospective, single-center design causing potential selection bias; (b) information related to personal lifestyle factors (e.g., smoking) and past genital tract infections was not collected; (c) the pharmaceuticals used during gestation were not documented, potentially overlooking their effect on ectopic pregnancies; (d) external validation in other centers was not performed; and (e) patients undergoing other embryo transfer strategies were not included.

In summary, this study provides novel insights into the prediction of tubal ectopic pregnancy following double fresh day-3 embryo transfer. We established a predictive model and presented potential therapeutic strategies targeting specific risk factors. After internal validation, the model demonstrated considerable accuracy and clinical utility. We are fully aware that the single-center nature of this study might limit the generalizability of the model, as embryo transfer strategies and ovarian stimulation and monitoring protocols vary between centers. However, the core predictive factors in this study, including tubal factor infertility, PCOS and previous tubal surgery, are well-established determinants of EP across diverse IVF populations, which suggests that the structure of our model may be generalizable even if the specific coefficients need adjustment for local practices. Future studies with larger sample sizes and multi-center, external validation designs are needed to explicitly validate the predictive value of this model. Similar steps of ‘Validation-Calibration-Framework for Adaptation’ might be applied. Cost-effectiveness analyses and validation with more specific clinical monitoring protocols should also be performed to assess the clinical utility of the model and its direct effect on patients. We hope that this model can be conveniently used by patients and healthcare professionals to identify potential risk factors for EP and estimate EP risk. This would enable infertility specialists to screen high-risk patients with high specificity, help clinicians make informed clinical decisions, and provide sufficient counseling to patients. Recommendations for high-risk patients include adjusting ART strategies, selecting appropriate ET protocols, and encouraging greater caution when symptoms such as abdominal pain and vaginal bleeding occur. Increased awareness among high-risk patients may prompt them to seek timely help when needed.
We will conduct further studies with larger sample sizes and more detailed clinical data to explore additional risk factors for EP and improve the applicability of the model to the general population. This may help uncover the links between endometrial, tubal, embryonic, and technical factors, ultimately contributing to a better understanding of the mechanisms of EP.

Methods

Ethics approval

This study was approved by the Ethics Committee of Peking University Third Hospital (reference number: M2022799). Informed consent was obtained from all the participants. All methods were carried out in accordance with relevant guidelines and regulations.

Study participants

Medical records of patients who underwent double fresh cleavage-stage embryo transfer at Peking University Third Hospital between January 2016 and June 2018 were carefully reviewed. The inclusion criteria were as follows: (a) patients with infertility undergoing double fresh day-3 embryo transfer between January 2016 and June 2018 at Peking University Third Hospital; (b) clinically diagnosed intrauterine or ectopic pregnancy; and (c) age 18–45 years. The exclusion criteria were as follows: (a) spontaneous abortion; (b) heterotopic pregnancy or pregnancy of unknown location; (c) uncontrolled high maternal blood pressure, endocrine disorders, or coagulation disorders; (d) untreated hydrosalpinx; (e) maternal or paternal chromosomal abnormalities or preimplantation genetic testing (PGT) cycles; (f) alcohol or drug abuse; (g) psychiatric disorders; and (h) incomplete data.

Controlled ovarian stimulation protocols

The ovarian stimulation protocols, including antagonist, agonist, and other stimulation protocols, followed the routine protocols of Peking University Third Hospital as described previously12,15. Briefly, ovarian stimulation was achieved using recombinant follicle-stimulating hormone (FSH) and human menopausal gonadotropin44. The long protocol involved pituitary downregulation via mid-luteal administration of a gonadotropin-releasing hormone (GnRH) agonist, whereas the short protocol used a GnRH agonist starting on day 2 of the cycle. In the antagonist protocols, GnRH antagonists were added after evaluating follicular development and hormone levels45. All patients underwent transvaginal ultrasonography (TVS) and serum hormone tests to monitor follicular development. When two or more leading follicles reached a diameter of 18 mm (17 mm in antagonist protocols), final oocyte maturation was triggered by injecting 250 µg recombinant human chorionic gonadotropin (rHCG, Ovidrel, Merck Serono, UK) on HCG day. Oocyte retrieval was performed and oocytes were fertilized by IVF or ICSI 36–38 h later. Two fresh embryos were transferred on day 3 after insemination, and embryo quality was determined based on fragmentation percentage and cytoplasmic features46. Progesterone was routinely administered to maintain luteal function until 8–10 weeks of gestation. Serum HCG levels were examined at 14 and 21 days after embryo transfer, and TVS was performed at 28 and 35 days after embryo transfer. The 48-h serum HCG rising ratio was calculated based on the HCG levels measured 14 and 21 days after embryo transfer.
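The text does not state the formula used to derive a 48-h rising ratio from two measurements taken 7 days apart. The following is a minimal sketch of one possible conversion, assuming constant exponential growth of serum HCG between the two measurements; the function name `hcg_48h_ratio` and this growth assumption are illustrative and are not taken from the study.

```python
def hcg_48h_ratio(hcg_day14: float, hcg_day21: float, interval_days: float = 7.0) -> float:
    """Fold change in serum HCG per 48 h between two measurements.

    ASSUMPTION (not from the source): HCG grows exponentially at a constant
    rate over the interval, so the 7-day fold change can be rescaled to a
    2-day (48-h) fold change by raising it to the power 2 / interval_days.
    """
    if hcg_day14 <= 0 or hcg_day21 <= 0:
        raise ValueError("HCG levels must be positive")
    if interval_days <= 0:
        raise ValueError("interval_days must be positive")
    fold_change = hcg_day21 / hcg_day14
    return fold_change ** (2.0 / interval_days)


# Hypothetical values: a 128-fold rise over 7 days is seven doublings,
# i.e. one doubling per day, which is a 4-fold rise per 48 h.
ratio = hcg_48h_ratio(100.0, 12800.0)
```

Under this assumption, a ratio of 1.0 means no change between the two measurements, and values below 1.0 indicate falling HCG levels.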

Data collection

Demographic characteristics, medical history, and details of the current ART cycle were obtained from the hospital’s electronic database. The prognoses of ongoing pregnancies were assessed via telephone interviews. Age and BMI were the main demographic features analyzed in this study. Medical history included obstetric history, duration and causes of infertility, and conditions such as endometriosis and PCOS. PCOS was diagnosed following the Rotterdam criteria, while endometriosis was confirmed by surgery. Recurrent implantation failure (RIF) refers to failure to achieve clinical pregnancy after ≥ 3 fresh or frozen embryo transfer cycles with 1–2 high-quality embryos transferred in each cycle. These failures have been attributed to factors other than embryo aneuploidy47. Recurrent pregnancy loss (RPL) was diagnosed after ≥ 2 pregnancy losses48. Information on the current ART cycle included controlled ovarian stimulation (COS) protocols, fertilization methods, endometrial thickness, morphology, pathological diagnosis by endometrial biopsy (chronic endometritis), hormone levels on HCG day, number of retrieved oocytes, percentage of two-pronucleus (2PN) embryos, and embryo quality. Endometrial thickness was measured in millimeters (mm) on HCG day using TVS on the long axis or sagittal plane, encompassing the entire endometrial lining through the endocervical canal49. Endometrial morphology was graded using a two-tier system: type A endometrium was defined as a triple-line pattern with a hyperechoic central uterine cavity line and denser echogenicity within the basal endometrium and myometrium, whereas type B endometrium was characterized by a non-multilayered homogeneous hyperechogenic or iso-echogenic endometrium compared to the myometrium, as observed on TVS50.
The percentage of 2PN embryos was defined as the proportion of embryos with two pronuclei after fertilization, whereas high-quality embryos were defined as those with more than six cleaved cells, regularly shaped blastomeres, and < 20% cytoplasmic fragmentation. The pathological diagnosis of chronic endometritis (CE) was made by detecting CD138+ plasma cells via immunohistochemical staining of an endometrial biopsy performed during the early follicular phase prior to oocyte retrieval51. All patients diagnosed with chronic endometritis were treated with a full course of antibiotics before the ovarian stimulation cycle, without hysteroscopic reexamination.
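The high-quality embryo definition above is a three-part rule, and it can be written as a short check. This is a minimal sketch; the `Day3Embryo` record and its field names are hypothetical and do not reflect the structure of the hospital database.

```python
from dataclasses import dataclass


@dataclass
class Day3Embryo:
    # Hypothetical record layout, for illustration only.
    cell_count: int             # number of cleaved cells on day 3
    regular_blastomeres: bool   # blastomeres are regularly shaped
    fragmentation_pct: float    # cytoplasmic fragmentation, in percent


def is_high_quality(embryo: Day3Embryo) -> bool:
    """Apply the stated definition: more than six cells, regularly shaped
    blastomeres, and less than 20% cytoplasmic fragmentation."""
    return (
        embryo.cell_count > 6
        and embryo.regular_blastomeres
        and embryo.fragmentation_pct < 20.0
    )


# Hypothetical embryos: only the first satisfies all three conditions.
transferred = [Day3Embryo(8, True, 10.0), Day3Embryo(6, True, 5.0)]
n_high_quality = sum(is_high_quality(e) for e in transferred)
```

Both thresholds are strict inequalities as worded in the definition, so an embryo with exactly six cells or exactly 20% fragmentation is not counted as high quality.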

Outcome measures

The primary outcome was the clinical diagnosis of intrauterine pregnancy (IUP) or ectopic pregnancy. Both types of pregnancy were confirmed by serum HCG levels and TVS. IUP was defined as the presence of an intrauterine gestational sac or yolk sac. Ectopic pregnancy was diagnosed on the basis of TVS detection of a gestational sac outside the uterine cavity, which was further confirmed by laparoscopic or laparotomic surgery.

Sample size calculation

Sample size was calculated using the four-step method of Riley et al. for determining the sample size required to develop a clinical prediction model52. Specifically, the calculation followed these four steps as implemented in the ‘pmsampsize’ package in R 4.3.3. With the C-index set at 0.7 (the average of existing ectopic pregnancy-related prediction models), 10 candidate predictive parameters, and an outcome event (EP) incidence rate of 3%, the minimum required sample size was 5757, with 173 expected outcome events (EP). Patients with missing data were excluded from the analyses.
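The relationship between the reported sample size and the expected number of events is simple arithmetic, sketched below. This does not reproduce the four-step calculation itself, which was run with ‘pmsampsize’ in R; the helper names are illustrative.

```python
import math


def expected_events(sample_size: int, incidence: float) -> int:
    """Expected number of outcome events, rounded up to a whole patient."""
    return math.ceil(sample_size * incidence)


def events_per_predictor(sample_size: int, incidence: float, n_predictors: int) -> float:
    """Expected events per candidate predictor parameter."""
    return sample_size * incidence / n_predictors


# Values reported in the text: n = 5757, incidence = 3%, 10 candidate parameters.
events = expected_events(5757, 0.03)           # 5757 * 0.03 = 172.71, rounded up to 173
epp = events_per_predictor(5757, 0.03, 10)     # about 17.3 events per parameter
```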

Statistical analysis

All patients were divided into training and validation sets at a 7:3 ratio based on the time of embryo transfer, so as to avoid ‘plausibly related patients’ by using temporal validation53,54. Statistical analyses were performed using IBM SPSS version 26.0 and R 4.3.3 (packages ‘rms’, ‘car’, ‘epiR’, ‘pROC’, ‘DynNom’ and ‘rmda’). Continuous variables were presented as mean ± standard deviation and compared using Student’s t-test for normally distributed data, or presented as median (25th percentile, 75th percentile) and compared using a non-parametric test for non-normally distributed data. Categorical variables were described as proportions and compared using the chi-square test. Univariate and multivariate analyses were performed to calculate crude and adjusted odds ratios (ORs) for variables considered to be potential risk factors for EP. A nomogram was established based on the variables that remained statistically significant in the multivariate regression analysis, to visualize how the targeted risk factors might contribute to the occurrence of EP. ROC analysis was performed and the AUC was used to evaluate the accuracy of the prediction model. Calibration curves were plotted by comparing observed rates with predicted probabilities, and the slope value was used to demonstrate the degree of agreement between predicted and actual probabilities. The predictive value of the nomogram was examined using an internal validation cohort. ROC analysis, calibration curves, and decision curves were generated on the validation set to assess the model’s accuracy and clinical utility. A restricted cubic spline was applied to demonstrate the relationship between the 48-h serum HCG level rise and EP risk, and to determine a rising threshold for stratifying patients into high and low EP risk. A P value < 0.05 was considered statistically significant.
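The temporal 7:3 split and the AUC described above can be illustrated with a short sketch. The analyses in this study were run in SPSS and R; the Python code below is only an illustration under stated assumptions. The record layout (`transfer_date` key), the function names, and the example values are hypothetical, and the AUC is computed with the rank-based (Mann-Whitney) formulation rather than with ‘pROC’.

```python
from datetime import date


def temporal_split(records, train_fraction=0.7):
    """Order records by embryo transfer date and cut once, so that every
    training record precedes (or ties with) every validation record."""
    ordered = sorted(records, key=lambda r: r["transfer_date"])
    cut = round(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


def auc(labels, scores):
    """AUC as the probability that a randomly chosen EP case (label 1)
    receives a higher predicted risk than a randomly chosen IUP case
    (label 0); ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical records: ten transfers, one per month of 2016, in shuffled order.
months = [5, 1, 9, 3, 10, 2, 7, 4, 8, 6]
records = [{"transfer_date": date(2016, m, 1)} for m in months]
train, valid = temporal_split(records)

# Hypothetical predicted risks for two IUP (0) and two EP (1) patients.
example_auc = auc([0, 0, 1, 1], [0.10, 0.40, 0.35, 0.80])
```

Splitting on a single date boundary, rather than at random, means the validation set consists entirely of later patients, which is the property that temporal validation relies on.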