Introduction

In recent years, it has been suggested that the prophylactic use of low-dose aspirin beginning before 16 weeks of gestation could reduce the prevalence of preeclampsia (PE) by 50% compared with its use after 16 weeks of gestation [1]. It is therefore important to provide early screening and identify pregnant women at high risk of developing PE who might benefit from aspirin prophylaxis [2]. Early screening and management are also associated with improved neonatal outcomes [3]. Several risk factors have been reported as predictors of PE [4, 5, 6]. The traditional approach to screening for PE is to identify risk factors from maternal characteristics and medical history, but such an approach can identify only 40% of preterm PE cases (delivery at <37 weeks of gestation) at a screen-positive rate of 10% [7]. Recently, the Fetal Medicine Foundation (FMF) proposed a Bayes theorem-based model to predict preterm PE using a combination of maternal characteristics, medical history, mean arterial pressure (MAP), uterine artery pulsatility index (UtA-PI), and serum placental growth factor (PlGF). Women are classified into high- and low-risk groups according to the risk calculated by this combined prediction algorithm. The model combines the a priori risk from maternal characteristics and medical history (maternal factors) with the results of various combinations of biophysical and biochemical measurements and can predict ~90% of early PE cases (delivery at <32 weeks of gestation) and 75% of preterm PE cases (delivery at <37 weeks of gestation) at a screen-positive rate of 10% [4, 6]. Although this model has taken different ethnic backgrounds into account, it is uncertain whether Japanese and European pregnant women have different reference values for biophysical and biochemical markers. It is important to clarify whether the FMF Bayes theorem-based model is applicable for clinical use in Japan.
Therefore, the objective of this study was to assess the screening performance of the FMF Bayes theorem-based model in the Japanese population at 11–13 weeks of gestation.
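The idea of combining an a priori risk with marker results can be illustrated with the generic odds form of Bayes' theorem. The sketch below is only that generic form, written under the simplifying assumption that each marker contributes an independent likelihood ratio; it is not the published FMF algorithm, and the function name and example values are hypothetical.

```python
def posterior_risk(prior_risk, likelihood_ratios):
    """Update a prior risk with marker likelihood ratios (odds form of Bayes' theorem).

    Assumption for illustration: the markers are treated as conditionally
    independent, so their likelihood ratios simply multiply.
    """
    if not 0.0 < prior_risk < 1.0:
        raise ValueError("prior_risk must lie strictly between 0 and 1")
    odds = prior_risk / (1.0 - prior_risk)   # convert probability to odds
    for lr in likelihood_ratios:
        if lr <= 0:
            raise ValueError("likelihood ratios must be positive")
        odds *= lr                           # posterior odds = prior odds x LR
    return odds / (1.0 + odds)               # convert odds back to probability


# Hypothetical example: a prior risk of 1 in 100 and two markers with
# likelihood ratios of 2 and 3 give a posterior risk of 6/105.
risk = posterior_risk(0.01, [2.0, 3.0])
```

Likelihood ratios above 1 raise the risk and ratios below 1 lower it, which is how marker results move a woman across a fixed risk cutoff.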

Methods

Study design and patients

This prospective observational cohort study was conducted at Showa University Hospital in Tokyo, Japan. The eligibility criteria were maternal age ≥18 years; no serious mental illness or learning disabilities; and singleton pregnancy with a live fetus with no major abnormality identified at 11–13 weeks of gestation. Based on these criteria, we invited 2655 Japanese women to participate in the study, and 1035 women provided consent. Subsequently, we excluded 122 cases (12%): those with missing ultrasound measurement data (n = 17), biomarker data (n = 39), or outcome data (n = 44), and those in which the pregnancy ended in miscarriage before 22 weeks of gestation (n = 2) or termination of pregnancy (n = 20). As a result, 913 women were included in the present study (Fig. 1). All enrolled subjects were followed up and delivered at our hospital between June 2017 and December 2019.

Fig. 1

Flow chart depicting the study population, inclusion, and exclusion criteria

The study was approved by the Ethics Committee of our hospital (#2270). The confidentiality of the patients involved was protected, and no personal data were required for this study. All eligible women were provided with written information about the study, and those who agreed to participate provided written informed consent. This paper reports an expanded Japanese series that includes a number of women who participated in a previous study evaluating the screening performance of first-trimester prediction models for preterm PE in a large Asian population [6].

Study variables

All pregnant women at 11–13 weeks of gestation underwent ultrasonography for the measurement of fetal crown-rump length (CRL) and assessment of fetal morphological abnormalities. UtA-PI was measured at the same examination. Maternal characteristics and medical history, which consisted of gestational age, maternal age, weight, height, ethnic origin (East Asian), method of conception, smoking, chronic hypertension (CH), preexisting diabetes mellitus, systemic lupus erythematosus/antiphospholipid syndrome, parity, history of PE, and family history of PE, were recorded. Gestational age was determined by the fetal CRL at 11–13 weeks.

The MAP was measured with a validated automated device (Omron HCR-7101 sphygmomanometer, Omron Healthcare Co., Ltd., Japan) according to the standardized protocol by nurses who had received appropriate training on the use of the device [8]. The women were placed in a sitting position with their arms supported at the heart level. Small (<22 cm), normal (22–32 cm), and large (33–42 cm) adult cuffs were used depending on the mid-upper arm circumference. After a 5-min rest, blood pressure was measured in both arms simultaneously, and each measurement was repeated at a 1-min interval. When the last two blood pressure measurements in either arm differed by more than 10 mmHg for systolic or 6 mmHg for diastolic blood pressure, additional recordings were made from both arms until the variation between consecutive readings fell within 10 mmHg for systolic and 6 mmHg for diastolic blood pressure. We calculated the MAP from the average of all four measurements.
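The measurement rule above can be sketched in a few lines. The sketch assumes that the four measurements are the last two readings from each arm and that MAP is approximated per reading as diastolic pressure plus one third of the pulse pressure; neither detail is stated in the protocol description, and the function names are hypothetical.

```python
from statistics import mean


def mean_arterial_pressure(systolic, diastolic):
    """Approximate MAP (mmHg) as diastolic plus one third of pulse pressure.

    Assumption: this common approximation stands in for whatever the
    device or protocol actually reports.
    """
    return diastolic + (systolic - diastolic) / 3.0


def readings_stable(previous, last, sys_tol=10, dia_tol=6):
    """True if two consecutive (systolic, diastolic) readings agree within
    10 mmHg systolic and 6 mmHg diastolic."""
    return (abs(previous[0] - last[0]) <= sys_tol
            and abs(previous[1] - last[1]) <= dia_tol)


def map_from_final_readings(left_arm, right_arm):
    """Average the MAP of the last two readings from each arm (four in total).

    Raises ValueError when the last two readings of either arm are not
    yet stable, signalling that further recordings are needed.
    """
    for arm in (left_arm, right_arm):
        if len(arm) < 2:
            raise ValueError("need at least two readings per arm")
        if not readings_stable(arm[-2], arm[-1]):
            raise ValueError("last two readings differ too much; record again")
    final_four = list(left_arm[-2:]) + list(right_arm[-2:])
    return mean(mean_arterial_pressure(s, d) for s, d in final_four)


# Hypothetical readings as (systolic, diastolic) pairs in mmHg.
left = [(120, 80), (118, 78)]
right = [(122, 82), (120, 80)]
map_value = map_from_final_readings(left, right)
```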

The left and right UtA-PI were measured using transabdominal color Doppler ultrasonography, and the average values were recorded. The transabdominal approach for the assessment of UtA-PI followed the standardized protocol [9]. Midsagittal sections of the uterus and cervix were initially visualized. The transducer was gently tilted sideways so that the uterine arteries with high blood flow velocity along the side of the cervix and uterus (at the level of the internal os) could be identified using color flow mapping. Pulsed wave Doppler sample volume was set narrow (at ~2 mm) and positioned on either the ascending or descending branch of the uterine artery, at the point closest to the internal cervical os, with an insonation angle of <30°. The peak systolic velocity of >60 cm/s was used to verify that the uterine artery was being examined [10].

Maternal serum concentrations of PlGF were measured by an automated analyzer: DELFIA Xpress system (PlGF 1-2-3 kits; DELFIA Xpress random access platform; PerkinElmer Inc, Waltham, MA). All ultrasound operators obtained the appropriate certificate from the FMF.

The measured values of MAP, UtA-PI, and PlGF were converted into multiple of the median (MoM) values using the application, and then the risks were calculated using the prediction model in individual cases. The cutoff values for high-risk status were set to 1/100 for preterm PE and 1/50 for term PE.
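As a rough illustration of these two steps, the sketch below expresses a measurement as a multiple of the median and compares a calculated risk with the stated cutoffs. The expected median is passed in as a plain number because its derivation inside the application is not described here, and treating a risk equal to the cutoff as high risk is an assumption; all names are hypothetical.

```python
PRETERM_PE_CUTOFF = 1 / 100   # cutoff for high-risk status, preterm PE
TERM_PE_CUTOFF = 1 / 50       # cutoff for high-risk status, term PE


def multiple_of_median(observed, expected_median):
    """Express a measurement as a multiple of the expected median (MoM).

    expected_median is assumed to be supplied by the screening application
    for a woman with the same characteristics.
    """
    if expected_median <= 0:
        raise ValueError("expected_median must be positive")
    return observed / expected_median


def is_high_risk(risk, cutoff):
    """Classify a calculated risk against a cutoff.

    Assumption: a risk equal to the cutoff counts as high risk.
    """
    return risk >= cutoff


# Hypothetical example: a MAP of 90 mmHg against an expected median of 85 mmHg,
# and a calculated risk of 1 in 80.
map_mom = multiple_of_median(90.0, 85.0)
high_for_preterm = is_high_risk(1 / 80, PRETERM_PE_CUTOFF)   # True
high_for_term = is_high_risk(1 / 80, TERM_PE_CUTOFF)         # False
```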

Definitions

PE was defined as gestational hypertension accompanied by proteinuria or other maternal organ dysfunctions at or after 20 weeks of gestation with all symptoms normalizing by 12 weeks postpartum, according to the Japan Society for the Study of Hypertension in Pregnancy [11, 12]. Proteinuria is not mandatory for the diagnosis of PE. Rather, PE is diagnosed by the presence of de novo hypertension after 20 weeks of gestation accompanied by proteinuria and/or evidence of maternal acute kidney injury, liver dysfunction, neurological features, hemolysis or thrombocytopenia, or fetal growth restriction.

Systolic blood pressure ≥140 mmHg and/or diastolic blood pressure ≥90 mmHg on at least 2 occasions, 4 h apart, was defined as CH.

Proteinuria was defined as protein excretion of ≥300 mg/day in a 24-h urine collection.

Superimposed PE was defined as CH diagnosed before 20 weeks of gestation, with proteinuria emerging afterward. Superimposed PE was included as PE in this study. Researchers reconfirmed the clinical diagnosis determined by the clinicians to enhance accuracy. Preterm and term PE were defined as PE with delivery at <37 and ≥37 weeks of gestation, respectively. Data on perinatal outcomes were collected from the hospital maternity records.

Statistical analysis

The data were analyzed using a computerized data analysis program (Statistical Package for Social Science (SPSS) for Windows, version 20.0 J; Chicago, IL, USA). Continuous variables are presented as the median (range). The Mann–Whitney U-test was used to compare continuous variables including MAP, UtA-PI, and serum concentrations of PlGF between the high- and low-risk groups. Categorical variables are presented as counts and percentages and were compared using Fisher’s exact test or the chi-squared test. Statistical significance was defined as a p value of less than 0.05. Multiple comparisons between the preterm PE, term PE, and unaffected groups were performed with the Mann–Whitney U-test and the chi-squared test with Bonferroni correction. Discrimination was assessed by the area under the receiver operating characteristic curve (AUROC), and detection rates were calculated at fixed false-positive rates of 5%, 10%, and 20%.
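The two discrimination measures can be sketched as follows. This is a plain-Python illustration, not the SPSS procedure used in the study: the AUROC is computed as the probability that an affected pregnancy has a higher risk score than an unaffected one, and the detection rate uses the strictest threshold that keeps the false-positive rate at or below the target. Function names and the example scores are hypothetical.

```python
from math import floor


def auroc(affected, unaffected):
    """Probability that a random affected case scores higher than a random
    unaffected case, with ties counted as one half."""
    wins = 0.0
    for a in affected:
        for u in unaffected:
            if a > u:
                wins += 1.0
            elif a == u:
                wins += 0.5
    return wins / (len(affected) * len(unaffected))


def detection_rate_at_fpr(affected, unaffected, fpr):
    """Fraction of affected cases that screen positive when at most `fpr`
    of unaffected cases are allowed to screen positive."""
    ranked = sorted(unaffected, reverse=True)
    # Number of false positives permitted (small epsilon guards float error).
    allowed = floor(fpr * len(ranked) + 1e-9)
    # A score must exceed the (allowed + 1)-th highest unaffected score.
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    detected = sum(1 for a in affected if a > threshold)
    return detected / len(affected)


# Hypothetical risk scores (arbitrary units) for illustration only.
unaffected_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
affected_scores = [9.5, 5, 20]
area = auroc(affected_scores, unaffected_scores)
rates = {f: detection_rate_at_fpr(affected_scores, unaffected_scores, f)
         for f in (0.05, 0.10, 0.20)}
```

With very few affected cases, as in this cohort, each additional detected case changes the detection rate by several percentage points, so the rates at neighbouring false-positive rates can coincide.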

Results

Among the 913 women, 26 (2.8%) developed PE, including 11 (1.2%) with preterm PE. The risk of subsequent PE was calculated with the FMF Bayes theorem-based model in all 913 cases; 263 cases (28.8%) were classified as high risk for preterm PE, and 615 cases (67.4%) were classified as high risk for term PE. Maternal and perinatal characteristics were compared between the high- and low-risk groups for preterm PE (number of cases: 263 vs. 650; Supplementary Table 1) and term PE (number of cases: 615 vs. 298; Supplementary Table 2). The incidence of preterm PE in the high-risk group for preterm PE was significantly higher than that in the low-risk group (3.8% vs. 0.2%, p < 0.05). The incidence of term PE in the high-risk group for term PE was higher than that in the low-risk group, but this difference was nonsignificant (2.1% vs. 0.7%, p = 0.10). The AUROC values and detection rates of preterm and term PE at false-positive rates of 5%, 10%, and 20% for each combination of parameters in the model are shown in Table 1. The AUROC for preterm PE was highest for the combination of maternal characteristics, MAP, and PlGF, although the differences from the other combinations did not reach statistical significance. For term PE, the AUROC was highest for the combination of maternal characteristics and MAP. This AUROC was significantly higher than that of the combination of maternal characteristics, MAP, and UtA-PI (0.751 vs. 0.710, p < 0.05), but it did not differ significantly from that of any other combination.

Table 1 Predictive performance of screening for preterm and term PE

The risk calculation for preterm PE using the best model had detection rates of 73% (95% confidence interval (CI), 43.4–90.3), 91% (95% CI, 62.3–98.4), and 100% (95% CI, 74.1–100) for fixed false-positive rates of 5%, 10%, and 20%, respectively. The risk calculation for term PE using the best model had detection rates of 47% (95% CI, 24.8–69.9), 60% (95% CI, 35.8–80.2), and 60% (95% CI, 35.8–80.2) for fixed false-positive rates of 5%, 10%, and 20%, respectively. The results of ROC analysis for predicting preterm and term PE by the respective best models are shown in Fig. 2. Furthermore, we constructed calibration curves to validate our prediction models for preterm and term PE. These calibration curves showed intercepts of 0.009 (95% CI, −0.071 to 0.089) and 0.013 (95% CI, −0.004 to 0.030) and calibration slopes of 0.984 (95% CI, 0.723–1.235) and 1.175 (95% CI, 1.016–1.334) for preterm and term PE, respectively, suggesting a lack of good agreement between the predicted risks and the observed incidence of preterm and term PE (Supplementary Fig. 1).
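The width of these confidence intervals follows from the small number of affected cases. As a hedged check, the sketch below computes Wilson score intervals for detected/total counts inferred from the reported percentages (8, 10, and 11 of 11 preterm PE cases; 7 and 9 of the 15 term PE cases, taking 15 as 26 minus 11). Both the counts and the choice of the Wilson method are inferences rather than statements from the analysis; under these assumptions the results agree with the reported limits to within about 0.1 percentage point.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (default 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need 0 <= successes <= n and n > 0")
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point overshoot at the extremes.
    return max(0.0, centre - half), min(1.0, centre + half)


# Counts inferred from the reported detection rates (an assumption).
inferred_counts = {
    "preterm PE, FPR 5%": (8, 11),
    "preterm PE, FPR 10%": (10, 11),
    "preterm PE, FPR 20%": (11, 11),
    "term PE, FPR 5%": (7, 15),
    "term PE, FPR 10% and 20%": (9, 15),
}
for label, (detected, total) in inferred_counts.items():
    low, high = wilson_interval(detected, total)
    print(f"{label}: {detected}/{total} = {100 * detected / total:.0f}% "
          f"(95% CI, {100 * low:.1f} to {100 * high:.1f})")
```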

Fig. 2

Receiver operating characteristic curves for the combination of MC, MAP, and PlGF (―) and the combination of MC, MAP, PlGF, and UtA-PI (…) in the prediction of preterm PE (left). Receiver operating characteristic curves for the combination of MC and MAP (―) and the combination of MC, MAP, PlGF, and UtA-PI (…) in the prediction of term PE (right). MAP mean arterial pressure, MC maternal characteristics, PE preeclampsia, PlGF maternal serum placental growth factor, UtA-PI uterine artery pulsatility index

Maternal characteristics and biomarkers were compared between preterm PE, term PE, and unaffected cases (Table 2). The mean MAP MoM was significantly higher, and the mean PlGF MoM was significantly lower in women with preterm PE than in those with unaffected pregnancies. There were no significant differences in the mean UtA-PI MoM between preterm PE, term PE, and unaffected pregnancies (Fig. 3).

Table 2 Comparisons of maternal characteristics and biomarkers between outcome groups
Fig. 3

Box plots of the multiple of the median values of UtA-PI (left), MAP (middle), and PlGF (right) in the outcome groups. The closed box indicates the preterm PE group; the hatched box indicates the term PE group; the clear box indicates the no PE group. MAP mean arterial pressure, PE preeclampsia, PlGF maternal serum placental growth factor, UtA-PI uterine artery pulsatility index

Discussion

This study has demonstrated that the FMF Bayes theorem-based model at 11–13 weeks of gestation could be used to predict preterm PE in the Japanese population with a detection rate for preterm PE as high as 91% at a 10% false-positive rate. Previous publications have reported an overall detection rate of 75% at a 10% false-positive rate with this model [4, 5, 13]. The detection rates vary with racial origin, at 50.0% (95% CI, 6.8–93.2), 68.8% (95% CI, 62.7–75.5), and 72.8% (95% CI, 62.7–79.1) at a 10% false-positive rate for East Asian, White, and Afro-Caribbean women, respectively [4]. Since the detection rate of the FMF model for the Japanese population is higher than that for the White and Afro-Caribbean populations, this model can be applied to screen for PE in Japanese women. There are two possible reasons for the high detection rate of the FMF model in our study population. First, the risk of PE development in this population is higher than that in White women (2.8% vs. 2.2% [9, 14]). Moreover, this population exhibited a higher incidence of preterm PE (1.2%) than East Asian women (0.67%) [6]. The high incidence of PE might be explained in part by the fact that our subjects had characteristics such as advancing maternal age, medical history of CH, and conception by in vitro fertilization, which are associated with an increased risk of PE [7]. In particular, 36% (n = 4) of women who developed preterm PE had CH, compared with only 0.2% (n = 2) of women without PE. This reflects the importance of this categorical variable as a risk factor for PE. Furthermore, our hospital is a tertiary referral hospital for high-risk pregnancies in the region. Second, strict compliance with the standardized protocols for the measurements of the parameters may have contributed to the good screening performance.

Although recent studies have demonstrated that the best model for PE screening is a combination of maternal characteristics, MAP, UtA-PI, and PlGF, our study demonstrated that the combination of maternal characteristics, MAP, and PlGF is sufficient for the prediction of preterm PE in Japanese women. UtA-PI did not differ significantly between women with preterm PE, term PE, and unaffected pregnancies (2.0 vs. 1.3 vs. 1.8, p = 0.06), although there was a tendency for the median value to be higher in the preterm PE group. In contrast, the measurements of MAP and PlGF were significantly different between the outcome groups. Since the UtA-PI measurements were performed in strict compliance with the abovementioned protocol with good quality assessment, the UtA-PI findings might reflect the characteristics of this population rather than instability of the measurement. In this respect, UtA-PI has a limited effect on predicting preterm PE compared to other parameters in this population.

Previous studies have demonstrated that the performance of screening for term PE is considerably lower than that of screening for preterm PE. In this study, the performance in predicting term PE is consistent with that of previous studies. In the prediction of term PE, higher performance was observed with the combination of maternal characteristics and MAP than with the combination of all parameters. The mean MAP MoM was higher in the term PE group than in the preterm PE and unaffected groups (Fig. 3). This might be caused by differences in the underlying pathophysiological mechanisms for preterm and term PE [15]. Although the cause of PE is not well known, it is thought to be initiated by reduced placental perfusion with failed vascular remodeling, resulting in the release of factors that lead to maternal systemic pathophysiological changes [16]. Reduced placental perfusion and abnormal implantation affect placental arteries or placenta size, which results in changes in the UtA-PI or PlGF, leading to the onset of preterm PE. In contrast, maternal pathophysiology of endothelial dysfunction is associated with blood pressure, obesity, and other factors at the time of pregnancy. Maternal endothelial dysfunction is associated with the subsequent onset of term PE. Maternal characteristics and MAP have been shown to be powerful predictors, especially in the prediction of term PE in this population. The mean UtA-PI MoM was lower, and the mean PlGF MoM was higher, in women with term PE than in those with unaffected pregnancies. This finding suggested that term PE had little to do with underlying impaired placentation in this population. Thus, it has been demonstrated that the FMF Bayes theorem-based model can be applied for predicting both preterm and term PE in the Japanese population at 11–13 weeks of gestation.

A limitation of the study relates to the generalizability of the results, since this study was conducted at a single center, and the number of cases was relatively small. Our results cannot confirm that the FMF model is applicable to other Japanese populations. In addition, compared with previous studies in which the FMF model has been validated in different populations [17, 18], our prediction models were not well calibrated. The lower accuracy of the predicted probabilities may be attributed to the small number of observed PE cases. The next step toward clinical application of this model will be prospective verification in more cases at multiple facilities. There might be some challenges in ensuring prediction accuracy in a nationwide screening program for PE in Japan. Such a program requires training for all components of PE screening, with regular audits by the FMF. Once the program is established, screening for PE can be performed alongside screening for fetal morphological abnormalities.

In conclusion, we have demonstrated that the FMF Bayes theorem-based model is effective in predicting preterm PE in the Japanese population and that it can be implemented as part of routine prenatal care in Japan.