Introduction

Endotracheal intubation (ETI) is a critical life-saving procedure frequently performed in neonatal intensive care units (NICUs), delivery rooms, and during neonatal resuscitation to secure the airway and facilitate effective ventilation [1]. Despite its widespread application in airway management, the accurate and timely confirmation of proper endotracheal tube (ETT) placement remains a significant clinical challenge [2]. Misplacement of the ETT, particularly esophageal intubation, is a potentially catastrophic complication associated with severe morbidity and mortality, including hypoxia, cardiac arrest, brain injury, and even death [3, 4]. Therefore, reliable and rapid methods are needed for confirming ETT position.

Traditional methods of ETT confirmation in neonates include observation of chest movement, auscultation, and assessment of end-tidal carbon dioxide (ETCO₂). However, these methods are known to be unreliable, especially in noisy environments or in patients with underlying lung disease [5, 6]. The gold standard for confirming ETT placement has historically been capnography, which detects exhaled carbon dioxide. However, capnography can be misleading in conditions of absent or extremely low pulmonary blood flow where CO2 delivery to the lungs is minimal. Furthermore, capnography may be insufficient in the neonatal setting during resuscitation of extremely preterm infants or in low-perfusion states where CO2 delivery to the lungs is minimal [7].

In recent years, point-of-care ultrasonography (POCUS) has emerged as a valuable diagnostic modality across various disciplines, offering real-time, non-invasive assessment bedside [8]. Tracheal ultrasonography has garnered increasing attention as a promising technique for confirming ETT placement. This technique involves visualizing the trachea and esophagus, allowing for the direct identification of the ETT within the tracheal lumen or its aberrant placement in the esophagus [9, 10]. Tracheal ultrasonography offers important advantages such as its non-invasive nature, rapid execution, and applicability in situations where capnography may be unavailable.

Several studies have evaluated the diagnostic accuracy of tracheal ultrasonography for confirming ETT placement in neonates [11, 12]. However, their findings vary widely due to differences in technique, operator expertise, gestational age groups, and reference standards used. Previous systematic reviews have predominantly focused on adult or mixed populations, leaving a knowledge gap specific to neonates in acute settings. This highlighted the necessity for a updated comprehensive synthesis of the existing evidence.

Therefore, the objective of this systematic review and meta-analysis is to comprehensively evaluate the diagnostic accuracy of real-time point-of-care tracheal ultrasonography in confirming endotracheal tube placement in neonatal patients.

Methods

This systematic review and meta-analysis was registered with the international Prospective Register of Systematic Reviews (PROSPERO) database with registration number (CRD420251082454). We reported this study according to the Preferred Reporting Items for Systematic Reviews and Meta-analysis (PRISMA) statement guidelines [13].

Search strategy

To identify relevant studies, we performed comprehensive electronic searches across PubMed, Web of Science, Scopus, and the Cochrane Library databases, covering all available records from inception through May 2025. Our search strategy incorporated both Medical Subject Headings (MeSH) terms and keyword variations related to “endotracheal intubation”, “ultrasonography”, acute care settings (like “emergency” and “ICU”), and diagnostic accuracy combined with Boolean operators. All details on the search methodology and the full search strategy for each database are provided in Supplementary Table. 1. Additionally, we scanned the reference lists of the included articles to identify any further eligible studies.

Eligibility criteria

Eligibility criteria were strictly defined to maintain relevance of studies. We selected observational or interventional studies that reported diagnostic accuracy data—specifically, studies providing true positives, false positives, true negatives, and false negatives, either directly or in a format allowing these values to be derived. Eligible studies had to evaluate ultrasonography for ETT placement confirmation in neonates (under 28 days old) within ED or ICU settings. The accuracy of ultrasonography required verification by an accepted reference standard such as capnography or direct visualization via laryngoscopy visualizing the ETT) passing through the vocal cords/glottis during or post-intubation when clinical response was inadequate.

Exclusion criteria encompassed case reports, cadaveric or animal studies, studies reporting on pediatric subjects, and conference abstracts lacking full-text data, and studies not in English Language.

Screening

Two independent blinded reviewers assessed the studies according to the eligibility criteria. Conflicts were resolved through discussion or by consulting a third independent reviewer. After removing duplicated of the retrieved references, the titles and abstracts of the articles were screened by the reviewers. We then screened full-text articles of the eligible abstracts for final inclusion in the systematic review.

Data extraction and outcomes

The following data were extracted by two independent authors into a standardized spreadsheet, for each eligible study: (1) Study information: country, study design, setting (ER/ICU), sample size, period of recruitment, ultrasound technique, operator type, probe type, timing of US, type of reference standard, main inclusion criteria, primary endpoints, and the conclusion of the study; (2) Patient baseline characteristics: age, sex, birth weight, indication for intubation, medical history, admission timing (daytime/night), intubation difficulty (Cormack and Lehane grade 3 or 4), and intubation time, (3) POCUS techniques and ultrasonographic signs: the primary POCUS technique reported across the included studies involved a transverse view of the neck at the level of the suprasternal notch. Correct tracheal intubation was typically confirmed by visualizing a single hyperechoic interface with posterior shadowing, representing the ETT within the trachea. In contrast, esophageal intubation was often identified by a ‘double-tract’ or ‘second target’ sign, where two tube-like structures (the trachea and the esophagus containing the ETT) were visualized; (4) Outcome measures: Endotracheal intubation (events), esophageal intubation (events), sensitivity, specificity, Area under ROC Curve (AUC), true positive, true negative, false positive, false negative, diagnostic odds ratio, predictive value (positive, negative), likelihood ratio (positive, negative), time to ETT confirmation by POCUS, time to ETT confirmation by reference standard, missed esophageal intubation rate, reintubation due to misplacement, POCUS-related complications, undetected esophageal intubation by POCUS, time to adequate view. Data extraction was conducted systematically, with one reviewer collecting study details and another verifying accuracy.

Quality assessment

Study quality was appraised using the QUADAS-2 tool, which evaluates risk of bias across four domains: patient selection, index test, reference standard, and flow/timing. Two investigators independently conducted assessments, with disagreements resolved by consensus [14].

Statistical analysis

The statistical analysis for this meta-analysis was conducted using R software (version 4.3.0) with the “meta” and “metafor” packages. Diagnostic accuracy measures, including sensitivity and specificity, were pooled using a bivariate random-effects model to account for between-study heterogeneity and the intrinsic correlation between sensitivity and specificity. Summary receiver operating characteristic (SROC) curves were generated to evaluate the overall diagnostic performance of POCUS for ETT confirmation, with the area under the curve (AUC) serving as a measure of discriminative ability.

For dichotomous variables, such as incidence of esophageal intubation, the proportions and totals were pooled as odds ratios (OR). For continuous outcomes, such as time to ETT confirmation, mean differences (MD) with 95% confidence intervals (CI) were calculated using inverse-variance random-effects models. Heterogeneity was assessed using the I² statistic and Cochran’s Q test, with I² > 50% or a p-value < 0.10 indicating substantial heterogeneity.

Publication bias was evaluated using funnel plots and Egger’s regression test for asymmetry, with p < 0.05 suggesting potential bias. Sensitivity analyses were conducted by sequentially excluding individual studies to assess the robustness of pooled estimates. All statistical tests were two-tailed, with p < 0.05 considered statistically significant.

Results

Screening and study selection

A total of 3824 records were obtained on initial search, of which 3277 unique records were further assessed for inclusion based on a title and abstract screening. Of these, the full texts of 47 records were assessed for inclusion. Finally, 13 studies were included for qualitative and quantitative synthesis (Fig. 1). The list and characteristics of all included studies are provided in Supplementary Table 2 [refs. 15,16,17,18,19,20,21,22,23,24,25,26,27]

Fig. 1
figure 1

PRISMA flowchart of screening and study selection.

Baseline and study characteristics

Of the 13 included studies with 930 neonates, four were cross-sectional, one was retrospective and the remaining were prospective cohort studies. All studies included neonates admitted to the NICU. Operators of POCUS were either neonatologists, nurses or radiologists. Weighted average mean of gestational age 32.07 ± 5.10 weeks, while neonate’s chronological age at Intubation range form 2 to 16 days. Majority of the included neonates were male. Further details are presented in Supplementary Table 2.

Quality assessment

Quality assessment using QUADAS-2 revealed Low to Some concerns across all studies except Takeuchi and de Kock et al., which demonstrated High risk of bias due to confounding in reference standard and patient selection. Across applicability domains, most studies demonstrated some concerns to low risk of bias except Takeuchi et al., which demonstrated high risk of bias due to confounding in reference standard. These studies were retained in the primary meta-analysis to ensure comprehensiveness, and their influence on the pooled estimates was evaluated through sensitivity analyses. Further details are provided in Supplementary Figs. 1, 2.

Outcomes

POCUS-confirmed success rate “POCUS diagnostic accuracy for correctly placed ETTs”

Success rate was defined as the proportion of neonates with correctly placed ETTs confirmed by POCUS as confirmed by reference standards (not the procedural success of intubation itself), form pooled analysis of 12 studies with 891 patients revealed a 99% odds of success of ET tube insertion when confirmed by POCUS (GLMM: 0.99; 95% CI: 0.94, 1.0) (Fig. 2) The assessment demonstrated high heterogeneity (I2 = 65.4%, P = 0.0008), which resolved on sensitivity analysis by omitting Singh et al. (I2 = 8%), with no change in effect size (0.99(95% CI: 0.96, 1)) (Supplementary Fig. 3). Funnel plot by Egger’s test revealed significant publication bias (p < 0.0001) (Supplementary Fig. 4).

Fig. 2
figure 2

Forest plot of POCUS-confirmed success rate.

Diagnostic test accuracy of POCUS for ETT confirmation

Pooled analysis of five studies revealed a high sensitivity of 93%(95% CI: 0.86, 0.96) and a moderate specificity of 59% (95% CI: 0.19, 0.89) (Fig. 3A, B). A positive likelihood ratio of 2.35 (95% CI: 0.92, 5.99) and a negative likelihood ratio of 0.22 (95% CI: 0.08, 0.58) was obtained (Fig. 3C, D). Further, pooled estimate demonstrated a diagnostic odds ratio of 17.07 (95% CI: 4.33, 67.29) (Fig. 4A). SROC curve revealed an AUC of 92% (95% CI: 1.59, 0.65) (Fig. 4B).

Fig. 3: Meta-analysis of diagnostic accuracy of POCUS for confirmation.
figure 3

A Sensitivity of POCUS across included studies with pooled estimate and heterogeneity statistics. B Specificity of POCUS with pooled estimate and heterogeneity statistics. C Positive likelihood ratio (+LR) with pooled estimate. D Negative likelihood ratio (–LR) with pooled estimate. Squares represent study-specific estimates, horizontal lines indicate 95% CIs, and diamonds show pooled effects.

Fig. 4: Diagnostic odds ratio and summary receiver operating characteristic (SROC) curve of POCUS for confirmation of endotracheal tube (ETT) placement.
figure 4

A Forest plot of diagnostic odds ratios (DOR) for individual studies with pooled estimate shown as a diamond. B SROC curve plotting sensitivity against false positive rate, with area under the curve (AUC) and confidence region displayed. Circles represent individual study estimates.

Esophageal intubation

The pooled analysis of 11 studies with 870 neonates recorded an incidence of esophageal intubation across the studies was 4% of total attempts (GLMM: 0.04; 95% CI: 0.03, 0.06) (Fig. 5A), with homogenous estimate (I2 = 0%, P = 0.99). Funnel plot by Egger’s test revealed significant publication bias (p < 0.0001) (Supplementary Fig. 5).

Fig. 5: Incidence of esophageal intubation and reintubation due to malpositioning.
figure 5

A Forest plot of esophageal intubation events across studies, with pooled incidence shown as a diamond. B Forest plot of reintubation due to malpositioning, with pooled incidence shown as a diamond. Squares represent study-specific estimates, horizontal lines indicate 95% CIs, and diamonds show pooled effects.

Reintubation due to misplacement

Pooled analysis of 10 studies revealed that reintubation was required in 1% proportion of included neonates due to initial tube misplacement (GLMM: 0.01; 95% CI: 0.01, 0.02) (Fig. 5B), with homogenous estimate (I2 = 0%, P = 0.623). Funnel plot by Egger’s test revealed no publication bias (p = 0.9) (Supplementary Fig. 6).

Discussion

Accurate and timely confirmation of ETT placement in neonates is a critical determinant of successful airway management in acute care settings, including NICUs and delivery rooms. Our systematic review and diagnostic test accuracy (DTA) meta-analysis demonstrated that POCUS is a highly sensitive (93%) and moderately specific (59%) modality for confirming correct ETT placement in neonates, with strong overall diagnostic performance (AUC = 92%), highlighting its diagnostic potential in real-world acute care settings. The overall proportion of neonates with correctly placed ETTs successfully confirmed by POCUS as verified against reference standards tests was 99%, while the proportion of esophageal intubations was 4%, and only 1% of neonates required reintubation due to initial misplacement of the ETT. “ In our analysis, a ‘positive test’ indicates the sonographic identification of the ETT within the tracheal lumen, while a ‘negative test’ indicates its absence from the trachea, suggesting esophageal placement with diagnostic accuracy assessed against established reference standards”.

The pooled sensitivity of 93% is consistent with previous meta-analyses conducted across broader populations, which reported sensitivity ranging from 91 to 98% for POCUS in detecting correct ETT positioning [10, 28]. However, our analysis offers critical value by focusing exclusively on neonatal populations, thereby addressing an existing evidence gap. The relatively moderate specificity (59%) observed may reflect a higher false-positive rate, possibly attributable to air artifacts or esophageal movement being misinterpreted as tracheal placement, especially by less experienced operators [29]. Nonetheless, the high sensitivity and a negative likelihood ratio (NLR) of 0.22 suggest POCUS is particularly valuable as a rule-out test in emergency settings where rapid clinical decisions are paramount.

Our diagnostic odds ratio (DOR) of 17.07 is noteworthy, indicating strong discriminatory power and clinical utility. The findings support the notion that tracheal ultrasonography offers a reliable alternative to capnography, especially in circumstances where traditional methods may be limited—such as poor perfusion states, ongoing chest compressions, or in extremely low birth weight neonates [30, 31].

POCUS presents a unique combination of advantages for neonatal care: it is non-invasive, radiation-free, repeatable, and provides real-time feedback. Its portability enables immediate bedside use during resuscitation, unlike chest radiography which introduces delays and logistical challenges [32]. Given its ability to visualize both tracheal and esophageal anatomy, POCUS also enables rapid identification of esophageal intubation—a potentially fatal event if missed. In our pooled analysis, the esophageal intubation rate was 4%, highlighting the relevance of a reliable verification tool. Importantly, the need for reintubation due to misplacement (e.g., incorrect depth) was only 1%, suggesting that early detection by POCUS can lead to prompt correction of malpositioned tubes, thereby improving outcomes. This demonstrates the utility of POCUS not just for verifying tracheal placement, but also for assessing tube depth.

Recent guidelines and consensus documents have also begun recognizing the potential role of airway ultrasound in neonatal and paediatric airway management. A 2022 expert consensus by the REPEM POCUS collaborators highlighted the growing utility of bedside ultrasound in neonatal intubation procedures, citing its ability to reduce confirmation time and misplacement risk [33]. While the diagnostic metrics are encouraging, POCUS remains inherently operator-dependent. Studies included in our analysis used varied operator profiles, including neonatologists, radiologists, and trained nurses. As noted in recent evaluations by Al-Absi et al., structured training programs significantly enhance the diagnostic yield of POCUS in imaging and diagnostics [34]. Future consideration of incorporating POCUS training into neonatal resuscitation protocols, such as the Neonatal Resuscitation Program (NRP), may improve clinician proficiency and standardize usage.

Despite robust findings, heterogeneity was moderate to high in some pooled estimates. This may reflect differences in ultrasound techniques (trans-thyroidal vs. suprasternal views), variability in reference standards (capnography vs. direct laryngoscopy), and gestational age groups. For example, extremely premature neonates pose technical challenges due to smaller airway diameters and reduced anatomical landmarks. Our sensitivity analysis showed that heterogeneity decreased significantly upon exclusion of outlier studies such as Singh et al., suggesting variability in methodology or patient selection as sources of inconsistency.

Additionally, although the AUC was high, the wide confidence interval around specificity and the presence of publication bias—particularly for success and esophageal intubation rates—indicate potential selective reporting. Many studies with negative or neutral findings may not have been published. Moreover, the majority of studies had some concern in quality assessment domains, with two studies (Takeuchi and de Kock et al.) showing high risk of bias due to inadequate reference standards and selection issues. A further limitation is the inability to perform robust subgroup analyses based on operator experience, ultrasound technique, or equipment type due to the limited number of studies and inconsistent reporting of these variables. Future research with larger, more detailed studies will be needed to explore how these factors influence the diagnostic accuracy of POCUS. Future studies should employ standardized diagnostic frameworks and ensure blinding to reduce bias.

Our findings advocate for the broader integration of tracheal POCUS into neonatal intubation algorithms, especially in circumstances where capnography may be unreliable, such as in neonates with poor perfusion states or during chest compressions. Future research should prioritize RCTs comparing POCUS-guided confirmation versus traditional methods in emergency and elective neonatal intubations. Standardization of training protocols is essential, allowing for routine and evidence-guided management of neonates.

Conclusion

Real-time point-of-care tracheal ultrasonography is a highly sensitive and moderately specific tool for confirming the correct site of endotracheal tube placement in neonates in acute care settings. It offers rapid, bedside, radiation-free confirmation and may reduce the incidence of undetected esophageal intubation. While variability in technique and operator skill remains a limitation, the overall diagnostic accuracy and safety profile of POCUS support its increasing adoption in neonatal airway management protocols. Despite methodological heterogeneity across studies, this analysis provides the most comprehensive evidence to date supporting POCUS as a valuable adjunctive tool for neonatal endotracheal tube placement verification, while highlighting the need for standardized protocols and training programs. Larger, high-quality prospective studies and implementation trials are warranted to further consolidate its role and optimize training frameworks.