Introduction

Postpartum urinary retention (PUR) is a common complication after childbirth with an incidence varying from 0.18 to 20.6%1,2,3,4. As common definition PUR is divided into overt and covert retention5: Overt PUR is defined as an inability to urinate 6 h after vaginal delivery or after catheter removal following caesarean section. Covert PUR is defined as a post-void residual bladder volume of more than 150 ml in post-partum patients without urinary symptoms. . In several studies6, however, overt PUR is analyzed using other definitions such as the inability to undergo spontaneous micturition within 12 h after vaginal delivery7 or the requirement of at least one catheterization within the first 24 h postpartum8,9. Women requiring intermittent catheterization after PUR have long-term voiding difficulties, as reported by 5% of women within three years after diagnosis10. An early diagnosis and timely intervention improves the outcome1 so that these patients would benefit from an improved diagnostic tool, especially in the case of covert PUR with its high estimated number of unknown cases (approx. nine times the rate of overt PUR4). The measurement of post-void residual volume (PVRV) using a urinary catheter as gold standard11 is time- and staff-expensive as well as unpleasant for the patient. A measurement of PVRV via ultrasound is reported as highly reliable in most studies28,29 but is - at least - accurate enough for clinical decision marking despite the ongoing discussion regarding the exact formula for the calculation of the PVRV in comparison to the measurement using the catheter12. A sonographic monitoring13 of all postpartum patients with standard ultrasound devices (SUD) is still time- and staff-expensive so that no screening program for PUR has been established so far.

The next-generation of ultrasound machines consists of transducers developed with non-piezo silicon chip technology and connected to a smart-phone or tablet making them easily transportable. This new application of portable ultrasound devices (PUD) enables physicians to perform bedside examinations without heavy transport of the established SUD14. Therefore, the use of PUD may lead to a low threshold for an examination of the bladder, especially during daily rounds. While the applications of the PUD have been reported in several areas of obstetrics and gynecology15, such as examinations of early pregnancy, of gynecological pathology16, of postpartal complications regarding the uterus17 and of fetal biometry16,18,19,20,21, the application of the PUD for examinations of the bladder and exclusion of a PUR has not been assessed thus far in regards to its reliability and agreement.

The aim of this study is to assess the reliability and agreement between measurements of bladder dimensions and calculated PVRV obtained using both SUD and PUD in postpartum patients. It aims to determine whether PUD can serve as a reliable tool for diagnosing PUR and assessing bladder dimensions in postpartum patients, potentially offering a more accessible and cost-effective alternative to traditional SUD.

Methods

A prospective, observational investigation was carried out over a four-month interval (August to November 2023) within the Department of Obstetrics at a tertiary-level university hospital. The patients were informed about study participation and provided informed consent. The study was approved by the University Hospital Bonn’s Ethics Board (No. 345/21). The study was performed in accordance with the Declaration of Helsinki. Eligibility for participation was the in-patient care on the postnatal ward and, thus, included patients mostly between the first and third day after delivery. The treatment of patients with complications such as neonatal hyperbilirubinaemia or puerperal mastitis at this ward allowed the inclusion of patients with a longer period after delivery, as well. The examination was only performed once on each participant and was realized randomly on one day during their stay. The examinations were always perfromed prior to the participants’ discharge. Emergent patients and those with urogenital malformations were excluded. Patients with temporary urinary catheterization after cesarean section were asked to participate six hours after catheter removal. Additionally, the operational capacity of the Department of Obstetrics to facilitate precise dual-ultrasound device examinations was a prerequisite. Therefore, patients were randomly asked to participate when the operational capacity allowed the examinations. The patients’ medical histories were either acquired during their first presentation in the delivery room prior to delivery or acquired by the ward physicians at the postnatal admission in the case of puerperal complications. During daily rounds the patients were informed about the study’s objectives and methodology. After informed consent the participants were asked about their micturition in the 6 h after delivery to identify overt PUR and then asked to urinate prior to the examination. The investigative protocol entailed the random application of both PUD and SUD for capturing images of predetermined variables prior to measuring in order to mitigate bias (Fig. 1). A final catheterization after the micturition and examination to compare the post-void residual volume (PVRV) with the calculated volume based on the ultrasound diagnostic were not performed to prevent harm such as possible infection or pain for the participants. Every participant underwent a unique examination with a SUD (EPIQ 5 W by Philips, Amsterdam, Netherland) and a PUD (Butterfly iQ by Butterfly Network, Guilford, Connecticut, USA), conducted by the same operator. The SUDs are based on the piezoelectric crystal transducer and the PUD on a silicon chip using the conversion of voltage to resonance. The order of the captured images was standardized during the use of each ultrasound machine, thereby maximizing the period between each image of the same variable and reducing memory bias (Fig. 1). These assessments were carried out either by an assistant physician, trained in obstetrical scanning for two years, or consultants, trained in obstetrical scanning for more than two and less than ten years. Four examiners were part of this study (RP, AW, BS, FR). The collected data including the demographic information of patients’ histories wasrecorded and stored on the university hospital’s server using Excel software (Microsoft Corp., Redmond, WA, USA). Biometric measurements of the bladder were determined (Fig. 2), and the PVRV was calculated using the following formula:

PVRV (ml) = horizontal axis of the bladder (HA) (cm) X longitudinal axis of the bladder (LA) (cm) X sagittal axis of the bladder (SA) X 0.7 (cm)22.

A calculated post-void residual bladder volume of more than 150 ml defined covert PUR in this study5.

The study’s primary endpoint was the concordance of PVRV estimation, with secondary endpoints including HA, LA and SA of the bladder and the agreement rate for overt and covert PUR.

To avoid confirmation bias, measurements were performed post-image acquisition with each device, with a standardized sequence for PUD and SUD assessments ensuring a consistent interval between measurements of the same variable. Cases with incomplete data were excluded (n = 1), and the study concluded upon recruiting 101 patients, so that ultimately 100 patients were included (Fig. 1).

Statistical methods

The sample sizes of 100 participants was determined based on clinical feasibility and comparable studies19 so that the results of this pilot study are descriptive23. Statistical analyses and graphical representations were conducted using the Statistical Package for the Social Sciences (SPSS) software, version 27 (IBM Corp., Armonk, NY, USA). To evaluate the consistency and reliability of measurements derived via SUD and PUD, several statistical metrics were employed: Pearson correlation coefficient (PCC) with a 95% confidence interval using Wald methods, intraclass correlation coefficient (ICC) employing a two-way random-effect, agreement model with a 95% confidence interval, and Bland-Altman plots 24. An ICC and a PCC approaching the value of 1.0 indicate a high degree of agreement or correlation, as supported by the literature25,26,27. Bland-Altman plots were specifically utilized to qualitatively assess the agreement between measurement methodologies. The mean relative difference (MRD) was calculated by taking the absolute difference between individual measurements and the aggregate mean of these measurements, summing these differences for all subjects, and then dividing by the total number of cases. Subgroup categorization was based on mode of delivery, body mass index and day after delivery. The examiners were asked for feedback by completing a questionnaire.

Fig. 1
figure 1

Flow chart of the study design. Number (n). Standard ultrasound device (SUD), Portable ultrasound device (PUD).

Fig. 2
figure 2

Images of transabdominal ultrasound scans using a standard ultrasound device based on piezo technology (A, B) and a portable ultrasound device based on silicon chips (C, D): Demonstration of the transversal scan plan (A, C) showing the horizontal axis (a) and the sagittal axis (b) and of a sagittal scan plan (B, D) showing the longitudinal axis (c).

Results

A total of 100 paired postpartum ultrasound scans were performed from August to November 2023 (Fig. 2). The demographic characteristics of the study participants are outlined in Table 1. Age of the participants was between 18 and 45. 56% of participants delivered their first child, 32% delivered their second child, 10% delivered their third child and 2% delivered their fourth child. The child was delivered by spontaneous vaginal birth in 57%, by vacuum extraction in 15% and by caesarean section in 28% of the cases. The cause of caesarean section was elective in 39% and emergent in 61%. Preconceptional body mass index (BMI) varied from 16.9 to 37.9 kg/m2. Analysis of BMI distribution revealed that a majority, 62%, had a BMI within the normal range (18.5 to < 25 kg/m2), whereas 4% were categorized as underweight (BMI < 18.5 kg/m2). Notably, 34% of the cohort was identified as overweight (BMI ≥ 25 kg/m2); within this group, 53% had a BMI between 25 to < 30 kg/m2, 38% fell into obesity class 1 (BMI 30 to < 35 kg/m2) and 8% into obesity class 2 (BMI 35 to < 40 kg/m2). Of the scans, 43%, 33%, 12%, 3%, 4%, 1% and 2% were performed on day 1, 2, 3, 4, 5, 6 and 7 postpartum, respectively.

Table 1 Patients characteristics. number (n) 

In this study no overt PUR was reported because all participants were able to micturate. Five cases of covert PUR were diagnosed using PUD and SUD and confirmed by measurement of the PVRV. Thus, the two methods had an agreement rate of 100% for the diagnoses (Table 2). The mean and standard deviation of the examinations with the established standardized ultrasound device are reported dependent on mode of delivery, on the body mass index and the day of scan, and show a slight difference (s. Table 2).

Table 2 Results of the scan dependent on mode of delivery, body mass index, day of the scan using standard ultrasound device (SUD (straight)) using piezo technique and portable ultrasound device (PUD (italic)) using silicon chip technique: covert postpartum urinary retention (cPUR), Caesarean delivery (CD), horizontal axis of the bladder (HA), longitudinal axis of the bladder (HA), number (n), normal delivery (ND), normal weight (NW), overt postpartum urinary retention (oPUR), overweight (OW, including obesity), portable ultrasound device (PUD), postpartum urinary retention (PUR), estimated post-void residual volume (PVRV), sagittal axis of the bladder (SA), scan at the first day postpartum (S1PP), scan at the second day postpartum (S2PP), standard ultrasound device (SUD), spontaneous vaginal delivery (SVD), underweight (UW, ), vacuum-assisted vaginal delivery (VAVD).

The Bland-Altman analysis (Fig. 3; Table 3) for HA, SA, LA and PRV showed a similar performance between SUD and PUD with an average difference of -0.177, 0.021, -0.023 and − 1.751, respectively (s. Table 3). The 95% limits of agreement were between − 0.647 cm and 0.690 cm in the case of the HA’s measurement, -0.647 cm to 0.689 cm in the case of the SA’s measurement, -0.734 cm to 0.687 cm in the case of the LA’s measurement and − 17.404 to 13.899 in the case of the calculated PRV.

Fig. 3
figure 3

Bland-Altman analysis of the horizontal axis of the bladder (HA), the sagittal axis of the bladder (SA), the longitudinal axis of the bladder (LA) and the post-void residual volume (PVRV) between the devices: standard ultrasound device with piezo technology and portable ultrasound device with silicon chip technology.

The ICC (Table 3) values showed an excellent agreement for the measurements of the HA, SA and LA resulting in an excellent agreement for the PRV27. The PCC indicated near-perfect correlation for the measurements of HA, SA and LA as well as for the calculated PRV. The mean relative difference of the HA, SA and LA and PRV was 9%, 12%, 12% and 20%, respectively.

Table 3 Average difference, Bland-Altman 95%-limits of agreement, intraclass correlation (ICC, 95% confidence interval) and Pearson correlation coefficient (PCC, 95% confidence interval using Wald methods) and the mean relative difference (MRD) of horizontal axis of the bladder (HA), sagittal axis of the bladder (SA), longitudinal axis of the bladder (LA) and post-void residual volume (PVRV) between the devices: standard ultrasound device using piezo technic and portable ultrasound device using silicon chip technic.

Upon examining the influence of mode of delivery, the ICC (Table 4) and the PCC (Table 5) showed a high agreement and correlation for all modes of delivery. Regarding the influence of the body mass index (BMI) the subgroup normal weight range (18.5 to < 25 kg/m2) with 62 participants and the subgroup overweight (≥ 25 kg/m2) with 34 participants were analyzed, whereas the subgroup underweight (< 18.5 kg/m2) included only 4 participants and thus was too small for an analysis. The subgroups normal weight and overweight showed similar ICC except for the measurement of longitudinal axis where the confident intervals of the ICC lacked an overlap. However, both ICCs still indicate an excellent reliability27. The PCC indicate a high reliability and – as the ICC – showed no great differences between the measurements. The final subgroup analyses concerned the timepoint of examination: ICC and PCC proved an excellent agreement and reliability for examination at the first and second day after delivery (Tables 4 and 5).

Table 4 Subgroup analysis of the intraclass correlation coefficients (ICC) with 95%-confidence interval for agreement between measurements from standard ultrasound device and portable ultrasound device for the horizontal axis of the bladder (HA), longitudinal axis of the bladder (LA), sagittal axis of the bladder (SA) and post-void residual volume (PVRV). Caesarean delivery (CS), number (n), normal weight (NW), overweight (OW, including obesity), scan at the first day postpartum (S1PP), scan at the second day postpartum (S2PP), spontaneous vaginal delivery (SVD), vacuum-assisted vaginal delivery (VAVD).
Table 5 Subgroup analysis of the Pearson correlation coefficient (PCC) with a 95% confidence interval using Wald methods between measurements from standard ultrasound device and portable ultrasound device for the horizontal axis of the bladder (HA), longitudinal axis of the bladder (LA), sagittal axis of the bladder (SA) and post-void residual volume (PRV). Caesarean delivery (CD), number (n), normal weight (NW), overweight (OW, including obesity), scan at the first day postpartum (S1PP), scan at the second day postpartum (S2PP), spontaneous vaginal delivery (SVD), vacuum-assisted vaginal delivery (VAVD).

The evaluation of the examiners was mostly positive, as describedd in Table 6.

Table 6 Feedback of the examiners regarding the new ultrasound device.

Discussion

This prospective investigation for the measurement of the bladder examined the compatibility and reliability of SUD and a PUD. Employing statistical analysis tools such as the PCC, ICC and Bland-Altman plot, this study demonstrated a high correlation and agreement of the measurements with both ultrasound devices.

Concerning the diagnosis of covert postpartum urinary retention on basis of the examinations with both devices, the similar diagnostic results indicate the applicability of the new ultrasound device. The use of PUD would allow a rapid identification of critical patients, and the suspected diagnosis could be verified by measuring the PVRV by catheterization as gold standard without the further use of a PUD. The accuracy of PVRV using transabdominal ultrasound is reported as high in most studies28,29 but depends on the used formula22. The validation of the PUD’s PVRV estimation via urinary catheterization could be tested in a high-powered study, in which the diagnosis of more than the reported five cases of covert PUR is likely. A comparison between the the calculated PVRV and the measured PVRV via urinary catheterization for these five cases would lack a statistical relevance. Thus, a high-powered study could add a new perspective of the ongoing discussion of the formula and of the incidence of PUR. Therefore, this study focused on the validation of the new ultrasound device in comparison to the established diagnostic methods.

The ICC shows an excellent agreement with 0.986 for HA, 0.99 for SA and 0.986 for LA and thus reveal better results than the known inter-rater ICC determined in urological ultrasound and are comparable with the intra-rater ICC, despite different ultrasound devices30. The result of the ICC for the estimated PVRV indicates an excellent agreement and is in harmony with the results of the ICC in other obstetric ultrasound examinations1621. Other markers for analyzing the agreement between the two different assays, such as the PCC and the Bland-Altman plot, also show an excellent agreement and support the results of the ICC. The analysis of possible confounding factors such as obesity, mode of delivery and the timepoint of examination reveals that the new generation of ultrasound devices still shows a high agreement and reliability with the SUD, independent of these factors. In other obstetric ultrasound examinations the BMI influences the ICC due to the varying distance between the PUD transducers, with their smaller acoustic windows than the SUD transducers, and the object of interest17.

The feedback of the examiners highlights the simple handling and the time efficiency of the PUD in comparison to the SUD. This resulted in a low threshold to indicate an examination and thus may allow a better care of the patient. The resolution (Fig. 2) was reported to be sufficient for the examination despite the higher resolution of the SUD, suiting the high agreement and reliability of the measurements. The examiners further mentioned the patient benefits from a bedside examination. Negative feedback reflected the small screen, in the case of using a cellphone instead of a tablet, although no measurable effect on the reliability was determined. The problem of empty batteries could be prevented by appropriate charging.

This reliability study opens doors for new diagnostic opportunities, including the comprehensive diagnostic confirmation of in-patient postpartum urinary retention, or as an easy diagnostic tool for out-patient cases or for patients in underdeveloped regions without access to SUD. The first data regarding the feasibility and reliability of self-measurement of PVRV are promising31 and may enable the establishment of self-measurement for follow-ups, simplifying the investigation of long-term symptoms32. The use of new generation handheld ultrasound devices further enables a diagnostic tool to ascertain the incidence, to understand the development and the impact of PUR.

Limitations

Several limitations are inherent in this study design and scope that merit consideration: Primarily, the investigation’s single-center approach, conducted exclusively at a tertiary-level university hospital, potentially narrows the applicability of its findings. The unique environment and patient demographic of a single institution may not reflect the diverse settings and populations encountered in various healthcare systems, thus limiting the generalization of the results to broader clinical practices.

Furthermore, the study’s participant selection criteria led to unbalanced subgroups, notably in terms of body mass index, mode of delivery, and investigation period among the participants. This imbalance affects the ability to accurately assess the impact of these factors on the performance and reliability of PUDs compared to SUDs and thus could skew the analysis. The study was performed with one portable ultrasound system with a non-piezo, chip-based technology. Almost every ultrasound manufacturer has developed their own portable, handheld ultrasound device by now. Each device has its own technical specifications and works on different platforms and apps, so that results from one device cannot be generalized for all other PUDs. Aspects such as patient satisfaction, cost-effectiveness, and the potential for enhancing access to prenatal monitoring in underserved communities were not explored in our study. Such considerations are crucial for assessing the holistic value of integrating PUDs into general obstetric care.

Lastly, the study does not include any longitudinal follow-up to evaluate the long-term impacts of using PUDs on clinical outcomes or decision-making in patient management. Incorporating such follow-up could provide invaluable insights into the efficacy and practical benefits of PUD integration into routine obstetric care practices.

Conclusion

Postpartum urinary retention is a common complication whose established diagnostic tools unfortunately have relevant limitations: ultrasound examinations with standard devices are time-consuming leading to delayed diagnosis, and urinary catheterization is an invasive procedure with risk of further complications. This study presents a new diagnostic tool with the new generation of handheld ultrasound, allowing a point-of-care examination in a time-efficient manner during daily rounds. The agreement and reliability of the ultrasound examination is high in comparison to the established ultrasound examination, so that the application allows precise diagnostics and may help determine the real incidence and impact of postpartum urinary retention in the patients’ life.