Abstract
Background
Apgar score and cyanosis assessment may disadvantage darker-skinned babies. This review explored cyanosis and Apgar score assessments in Black, Asian, or minority ethnic neonates compared to White neonates.
Material and methods
Four databases were searched. Studies of any methodology were included. A narrative synthesis was undertaken.
Results
Ten studies were included. Three studies (n = 39,290,014) showed that an Apgar score ≤3 was predictive of neonatal mortality across all ethnicities. Black babies with an Apgar score ≤3 had lower mortality rates before 28 days; however, variations in scoring practices were also observed. Three further, smaller studies (n = 907) associated low Apgar scores with poorer mental development up to 22 months, especially in mixed ethnicity and Black infants. One study reported inadequate training in assessing ethnic minority neonates. Cyanosis was the focus of three included studies (n = 455), which revealed poor visual assessment of cyanosis across ethnicities. With pulse oximetry, occult hypoxemia occurred slightly more frequently in Black neonates. Tongue color indicated oxygen requirement at birth, regardless of ethnicity.
Conclusions
Apgar scores correlate well with neonatal mortality in all ethnicities; however, scoring variations exist. Cyanosis assessment is challenging, with the tongue and lips the best places to observe in the absence of pulse oximetry.
Impact
Visual assessment of cyanosis and of the color component of the Apgar score is not accurate in babies with darker skin. Small racial differences may exist for pulse oximetry in neonates, but it is more reliable than visual assessment.
Introduction
Ethnic inequalities in maternal and neonatal healthcare provision are increasingly being recognized.1,2 In the United Kingdom (UK), maternal mortality is two times higher for women from an Asian background and almost four times higher among Black women than women of White ethnicity.1 Similarly, there are ethnic inequalities in neonatal mortality within the UK (with 2.94, 2.22, and 1.68 neonatal deaths per 1000 live births for Black, Asian and White neonates respectively) and stillbirth (with 7.52, 5.15, and 3.30 stillbirths per 1000 births for Black, Asian and White babies respectively).2 Awareness is increasing that Black and Asian neonates may be disadvantaged by routine neonatal assessments that are based on normative White populations.3 Concerns have particularly been raised over neonatal assessments that require assessment of skin color, which is subject to observer and potential racial bias.4,5
The Apgar score is a routine perinatal practice which includes skin color assessment. The Apgar score assesses five components: the neonate’s heart rate, respiratory effort, reflex irritability, muscle tone, and appearance.6 The birth attendant attributes a score of 0–2 to each of these five components at 1 and 5 min after the birth, with a maximum possible score of 10.6 In particular, a score of 0 on the appearance component is defined as the neonate’s skin being “blue all over”, a score of 1 as “centrally pink with blue extremities”, and a score of 2 as the skin being “pink all over”.6 The Apgar score is extensively used across the world, as a standardized, convenient way for healthcare providers to report neonatal condition at birth.7 In general, an Apgar score of ≥7 indicates the neonate was born in a good condition, while a score of ≤3 is very low and indicates that the neonate was in a poor condition at birth.7 The reliability of the Apgar score is, however, questioned,8 particularly because the subjective assessment of skin color may disadvantage neonates with darker skin tones. To determine wellbeing more accurately in African, Caribbean, and Asian neonates, researchers have suggested that extended assessment is needed.6
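The scoring arithmetic described above can be sketched in a few lines of Python. This is an illustrative sketch only, not a clinical tool: the class and function names are hypothetical, and the "intermediate" label for totals of 4–6 is borrowed from the way the included studies group scores later in this review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApgarAssessment:
    """One Apgar assessment (hypothetical helper); each component is scored 0, 1 or 2."""
    heart_rate: int
    respiratory_effort: int
    reflex_irritability: int
    muscle_tone: int
    # Appearance: 0 = "blue all over", 1 = "centrally pink with blue
    # extremities", 2 = "pink all over" (the component questioned in this review).
    appearance: int

    def total(self) -> int:
        """Sum the five components after checking each lies in 0..2."""
        components = (
            self.heart_rate,
            self.respiratory_effort,
            self.reflex_irritability,
            self.muscle_tone,
            self.appearance,
        )
        for value in components:
            if value not in (0, 1, 2):
                raise ValueError("each Apgar component must be 0, 1 or 2")
        return sum(components)


def interpret(total: int) -> str:
    """Map a total score (0..10) to the broad bands used in the text."""
    if not 0 <= total <= 10:
        raise ValueError("total Apgar score must be between 0 and 10")
    if total >= 7:
        return "good condition"
    if total <= 3:
        return "poor condition"
    return "intermediate"


five_minute = ApgarAssessment(2, 2, 2, 2, 1)
print(five_minute.total(), interpret(five_minute.total()))
```

Because the appearance component contributes at most 2 of the 10 points, a systematic one-point difference on that component alone could shift a total across the ≥7 boundary, which is why its reliability matters for the comparisons that follow.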
Concerns have also been raised around the detection of neonatal hypoxia. This is typically portrayed as a neonate’s lips or skin appearing “blue” or “pale”.3 It is difficult to diagnose poor oxygenation in infants through visual assessment alone, with reliability varying between clinicians, including in darker skinned infants.9 Guidelines recommend that observation of skin color is insufficient to ascertain neonatal oxygenation.10,11 Pulse oximetry utilizes a non-invasive approach to assess oxygenation. Screening newborns with pulse oximetry is routinely recommended to identify heart conditions that may not be detected on visual examination.12 However, pulse oximeters have mainly been calibrated on White skin.13 Racial bias in pulse oximetry has been suggested to place Black adults at increased risk of occult hypoxemia.14 These inequalities in healthcare provision were particularly highlighted during the COVID-19 pandemic, where inconsistencies in technology such as pulse oximeters resulted in dark skinned individuals having a greater likelihood of inaccurate readings.15,16
The aim of this review was therefore to examine associations between neonatal examinations that include a subjective assessment of skin color and any objective measure of neonatal wellbeing in Black, Asian, or minority ethnic neonates compared to their White counterparts. This manuscript specifically focuses on Apgar score and the detection of cyanosis.
Methods
This systematic review was performed according to the protocol published on PROSPERO (CRD42022344617). This manuscript considers the neonatal assessments of Apgar score and cyanosis.
Data sources
A comprehensive electronic search was undertaken in MEDLINE, CINAHL, and PsycINFO databases. Databases were searched from inception until 30th April 2022, with the search updated on 31st August 2023. Gray literature such as statutory body reports or consultation exercise reports was also searched for using OpenGrey within the DANS EASY archive. Searches included medical subject headings (MeSH) combined with key text words. These included terms around Apgar score, cyanosis, hypoxia, neonate, and ethnicity. Appendix S1 provides an example full search strategy. Only English language articles were considered.
As well as the above digital searches, a group of stakeholders, including healthcare providers, academics, commissioners, and maternity user representatives, were asked via email regarding their awareness of relevant literature for inclusion.
Eligibility criteria and study selection
Two researchers independently screened citations by title and abstract against the inclusion criteria. The inclusion criteria were any qualitative, quantitative or mixed methods study that explored detection of cyanosis or examined associations between Apgar score and an objective measure of neonatal wellbeing, including but not limited to blood gas, intensive care admission, mortality, and ongoing development. Studies were included if they compared Black, Asian, or “dark” skinned neonates (less than 4 weeks old) to their White counterparts or those defined as “light” skinned within the included study. Studies undertaken in any country were eligible for inclusion regardless of development as defined by the United Nations Human Development Index.17 Reviews were not eligible for inclusion per se; however, references of reviews on a related topic were screened for additional primary studies, as were the references of all included studies. Two researchers independently screened all potentially relevant citations at full text. The whole research team discussed disagreements over inclusion.
Critical appraisal
Included studies were critiqued using the Mixed Methods Appraisal Tool.18 This incorporates two screening questions, followed by five quality criteria relating to the appropriateness of methodology, data collection, and data analysis. These five quality criteria differ for five study design categories: qualitative, quantitative randomized, quantitative non-randomized, quantitative descriptive, and mixed methods. Two researchers independently critiqued the included studies, with disagreements discussed with a third researcher until agreement was reached.
Data extraction
Two reviewers undertook data extraction using a pre-defined template to record relevant information. This included: title, author, year of publication, country of origin, study design, methodology (setting, data collection, data analysis), sample size, ethnicity and age of included neonates, study outcomes according to ethnic subgroups, covariates adjusted for within analyses, and source of study funding. The method used to classify ethnicity or race within each included study was noted. Where ambiguities were noted, authors of the original published report were contacted for clarification.
Data synthesis
A narrative synthesis approach19 was utilized, with Apgar score and Cyanosis/Hypoxia considered separately. Consideration was given to neonatal ethnicity within the included studies. Consideration was given to potentially confounding factors including socioeconomic status, education level, and maternal age.
Results
Of the 6303 citations obtained from database searches and the 37 studies identified through other sources, 227 were screened at full text. In total 10 studies were eligible for inclusion with seven studies considering the Apgar score7,20,21,22,23,24,25 and three studies considering cyanosis.9,16,26 Fig. 1 provides a PRISMA flow diagram of the study search and selection process.27 Table S1 provides detailed reasons for exclusion of full-text articles.
Study characteristics
Study characteristics are provided in Table 1. All but one study was conducted in a country with a very high human development index (HDI),17 including Australia (n = 1), UK (n = 1) and United States of America (USA) (n = 7). The other study21 took place in Zimbabwe (HDI 0.593), which has a medium HDI.17 In total, data from 39,291,376 neonates contributed to the included studies: 39,290,921 within studies considering the Apgar score and 455 in studies considering cyanosis. Table S2 reports the funding received for the included studies, with no concerns identified.
Of the nine studies that recruited neonates, three7,20,25 assigned neonatal race according to maternal self-reported race and one16 by parental identification on the neonate’s birth certificate. One study9 assigned neonates as “dark skinned” if at least two of the three professionals observing the neonate judged it to be “dark skinned” and one study was undertaken across three countries and assigned neonates according to their country of origin.21 The remaining three studies22,23,26 did not stipulate how neonatal race was attributed.
Study quality
Table 2 summarizes critical appraisal ratings for the included studies. The research question was deemed to be clear within all studies, and all but one study21 was deemed to have collected data that addressed the research question. Two of the seven Apgar score studies7,25 and none of the three cyanosis studies were assessed as low risk of bias across all five domains.
Apgar score
Seven included studies considered the Apgar score. One of these studies considered healthcare provider training (n = 67). The remaining six studies compared an objective measure of neonatal wellbeing according to Apgar score among White infants and at least one group of infants with any other skin tone (n = 39,290,921). Of these, three studies considered mortality and three considered longer-term development.
Three large studies (n = 39,290,014) all linked birth-death datasets to consider neonatal mortality according to Apgar score and ethnicity. The studies included 6,544,004 neonates,22 25,936,357 neonates7 and 6,809,653 neonates25 respectively. In all three studies, a low Apgar score (of 3 or less) was associated with increased risk of neonatal death across all ethnicities. The first study22 adjusted for maternal sociodemographic and health risk factors and birthweight. Compared to neonates with a 1-min Apgar score of 7–10, those with an Apgar score of 0 to 3 had the lowest odds of neonatal death if they were non-Hispanic Black (OR 20.40), with White (OR 36.21) and Mexican American neonates (OR 44.24) having higher odds of neonatal death with a low Apgar score (score 0–3).22 The same race/ethnic variation was seen for medium 1-min scores (Apgar score 4–6). However, more Black neonates received a 1-min Apgar score of 6 or lower (10.4%) compared to White or Mexican American neonates (7.6%).22 The second study,7 which was low risk of bias across all domains, found that at the same 5-min Apgar score neonatal mortality was consistently lower in non-Hispanic Black than non-Hispanic White neonates. The biggest difference was noted at the lowest 5-min Apgar scores (scores 1–3).7 This racial variation was evident for both term and preterm births. The lower risk of mortality in Black neonates with each Apgar score remained after adjusting for potential confounders including maternal smoking, education, marital status, and the gestation at which antenatal care commenced.7 However, the proportion of Black neonates compared to White or Mexican American neonates receiving each 5-min Apgar score was not reported, which could have influenced these findings.7 The final study25 was also low risk of bias across all domains.
They found that compared to White neonates, Non-Hispanic Black and “Non-Hispanic other” neonates with a 5-min Apgar score of 0–3 had lower early neonatal mortality (up to 6 days of age) [Black: 45.8 per 1000 (95% CI 39.5–53.0); “Non-Hispanic other”: 58.7 per 1000 (95% CI 43.7–79.1); White: 63.6 per 1000 (95% CI 58.8–68.9) respectively] and overall neonatal mortality (up to 27 days of age) [Black: 53.9 per 1000 (95% CI 43.7–61.5); “Non-Hispanic other”: 65.9 per 1000 (95% CI 49.8–87.1); White: 72.4 per 1000 (95% CI 67.2–78.04) respectively].25 However, the incidence of a low and intermediate Apgar score again varied by ethnicity, with low scores (Apgar score 0–3) being significantly higher in non-Hispanic Black (0.42%, n = 3931) and “non-Hispanic other” neonates (0.33%, n = 698) compared to White neonates (0.25%, n = 8,863) (p < 0.001) and intermediate scores (Apgar score 4–6) also being significantly higher in Black (1.26%, n = 11,816) and “non-Hispanic other” neonates (1.16%, n = 2464) compared to White neonates (1.02%, n = 36,144) (p < 0.001).25
Three studies with small sample sizes (246,23 27021 and 391 neonates20 respectively), all with concerns regarding risk of bias, explored long-term development according to Apgar score for neonates of different ethnicities. The long-term predictive value of the Apgar score according to ethnicity was inconsistent within the studies. Infants with 1-min Apgar scores of 0–3 had significantly lower Bayley mental development scores (74.7 ± 12.7) and psychomotor development scores (31.4 ± 6.6) than infants with an Apgar score of 7 to 10 (Bayley mental development score 80.5 ± 3.8 and motor development score 34.5 ± 4.1).20 When considering ethnicity, one study found Bayley mental and psychomotor development scores at 8 months and 1-min Apgar score were only significantly correlated in those of mixed ethnicity (a group mainly consisting of mixed Portuguese and Black African descent) (p < 0.01 for both Bayley mental and psychomotor development scores) but not among Black only or White only infants.20 A separate study of infants with a 1-min Apgar score of 3 or below and low birthweight (≤750 g) similarly found that while Black race was predictive of abnormal mental development at 18–22 corrected months (OR 2.2, 95% CI 1.2–3.7) within univariate analysis, this was no longer the case after adjusting for confounders such as income and education (OR 1.9, 95% CI 0.9–3.8).23 Furthermore, there was no association between psychomotor development and Black race within the univariate or multivariate analysis (OR 1.2, 95% CI 0.6–2.5).23 The final study, exploring long-term outcomes in infants with a 5-min Apgar score <5, found more abnormal neurological classifications in infants born in Zimbabwe (35.8%, n = 59) than infants born in the Netherlands (0%) or the Caribbean (19.2%, n = 18).21 However, the multiple confounders within this study, such as different maternal and neonatal care practices in each country, meant the impact of race per se could not be ascertained.
One study that looked at healthcare professionals’ training on assessing Black, Asian and minority ethnic neonates undertook pre- and post-training surveys (n = 67).24 Only 9.1% (5/55) of professionals reported previously receiving specific training around care of Black and minority ethnic mothers and babies.24 They instead relied on self-directed learning and discussions with colleagues. Black mannequins were more likely to have been used in the education of midwives who trained in the previous 5 years (44%) than in those trained 5–10 years ago (18%). The Apgar score was felt not to be the most appropriate way to determine neonatal condition at birth by 96% of midwives after the training, due to the inappropriateness of the term “pink” for many neonates. Overall, 98% of midwives intended to make alterations to their clinical practice because of the new knowledge they had acquired during the training, with midwives describing being shocked by the impact of implicit bias resulting in inequality and inequity within maternity care.24
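The sample sizes quoted for the six Apgar outcome studies can be cross-checked with simple addition. The short sketch below uses only the figures reported in this section; the dictionary keys are informal labels for the cited references, not identifiers from the original studies.

```python
# Cohort sizes as reported in the text, keyed by reference number.
mortality_cohorts = {"ref 22": 6_544_004, "ref 7": 25_936_357, "ref 25": 6_809_653}
development_cohorts = {"ref 23": 246, "ref 21": 270, "ref 20": 391}

mortality_total = sum(mortality_cohorts.values())
development_total = sum(development_cohorts.values())
apgar_outcome_total = mortality_total + development_total

print(f"mortality studies:   {mortality_total:,}")
print(f"development studies: {development_total:,}")
print(f"all six studies:     {apgar_outcome_total:,}")
```

The totals agree with the pooled figures given for the mortality studies and for the six outcome studies combined, and they show how heavily the three linked birth-death datasets dominate the evidence base.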
Detection of cyanosis or hypoxia
Three studies (n = 455), all with concerns over risk of bias, considered the detection of cyanosis in neonates with darker skin tones compared to White neonates.9,16,26
The first study, of 93 participants, determined how accurately arterial oxygen saturation was predicted by visual assessment of skin color at different body sites.9 Skin color was a crude guide to arterial oxygen saturation in neonates regardless of skin tone.9 False positive observations, where professionals thought the neonate was cyanosed when arterial oxygen saturation was actually over 90%, were common when observing the hands (46%), nailbeds (57%), and around the mouth (73%). However, there were few false negatives at these sites, with cyanosis observed in the hands, nailbeds, and around the mouth in all instances where arterial oxygen saturation was below 75%.9 The most reliable site to detect cyanosis was the lips; however, this was still poor, with 28% false positives and over 25% false negatives when arterial saturations were between 80 and 89%. Neonates were classified as dark skinned if a minimum of two out of the three observers thought they had dark skin.9 Dark skinned neonates, when compared to the overall group, had fewer false positives when assessing the hands (18% compared to 46%), trunk (8% compared to 19%) or around the mouth (60% compared to 73%).9
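To make the false positive and false negative definitions concrete, the sketch below tallies both from paired observations of a visual judgment and a measured arterial oxygen saturation. It is an illustration under stated assumptions: the function name, the choice of denominators (all well-oxygenated or all hypoxemic observations) and the example data are hypothetical, since the review does not report how the original study computed its percentages.

```python
def visual_assessment_error_rates(observations,
                                  well_oxygenated_above=90.0,
                                  hypoxemic_below=75.0):
    """Return (false_positive_rate, false_negative_rate) for one body site.

    observations: iterable of (judged_cyanosed, sao2_percent) pairs.
    A false positive is a neonate judged cyanosed although SaO2 is above
    `well_oxygenated_above`; a false negative is a neonate judged not
    cyanosed although SaO2 is below `hypoxemic_below`.
    Observations between the two thresholds are ignored.
    """
    false_pos = well_oxygenated = 0
    false_neg = hypoxemic = 0
    for judged_cyanosed, sao2 in observations:
        if sao2 > well_oxygenated_above:
            well_oxygenated += 1
            if judged_cyanosed:
                false_pos += 1
        elif sao2 < hypoxemic_below:
            hypoxemic += 1
            if not judged_cyanosed:
                false_neg += 1
    fp_rate = false_pos / well_oxygenated if well_oxygenated else None
    fn_rate = false_neg / hypoxemic if hypoxemic else None
    return fp_rate, fn_rate


# Hypothetical observations at one site: (judged cyanosed?, SaO2 %).
hands = [(True, 95), (False, 96), (True, 70), (False, 92), (True, 93)]
print(visual_assessment_error_rates(hands))
```

A site with many false positives but no false negatives, as reported for the hands, nailbeds and around the mouth, behaves like an over-sensitive screen: it rarely misses severe desaturation but frequently raises concern in well-oxygenated neonates.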
The second small study of 68 neonates explored whether supplemental oxygen requirement in the first 10 min after birth could be determined by tongue color.26 A pink tongue generally indicated that supplemental oxygen was not required as neonatal oxygen saturation was above 70%, which was the level at which the country specific guidelines advised administration of oxygen26 given healthy term neonates typically take between 5 and 10 min to achieve oxygen saturations of 90%.28 The area under the Receiver Operator Characteristics Curve was not affected by ethnicity. Area under the curve was 0.89 (95% CI 0.84–0.95) and 0.94 (95% CI 0.87–1.00) respectively for White and “non-White” neonates.26 While the exact sample size of ‘non-White’ neonates was not provided, the study demonstrated that evaluating tongue color in “non-White” neonates was not less effective at detecting hypoxemia than in White neonates.
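For readers less familiar with the area under the ROC curve, the following sketch computes it from first principles as the probability that a randomly chosen positive case has a higher index value than a randomly chosen negative case, with ties counted as one half. The function name and the example values are hypothetical, and this is not a reconstruction of the analysis in the cited study, whose exact variables are not described here.

```python
def roc_auc(scores_positive, scores_negative):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores_positive: index values for cases with the condition.
    scores_negative: index values for cases without the condition.
    Returns a value in [0, 1]; 0.5 means no discrimination.
    """
    if not scores_positive or not scores_negative:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for pos in scores_positive:
        for neg in scores_negative:
            if pos > neg:
                wins += 1.0      # positive case ranked correctly
            elif pos == neg:
                wins += 0.5      # ties count as half
    return wins / (len(scores_positive) * len(scores_negative))


# Hypothetical index values for two groups of neonates.
auc = roc_auc([0.9, 0.8, 0.4], [0.3, 0.5, 0.1])
print(round(auc, 3))
```

Read this way, the reported values of 0.89 and 0.94 indicate similar discrimination in both groups, and their overlapping confidence intervals are what underlie the statement that ethnicity did not affect the result.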
The final study examined the impact of ethnicity on pulse oximetry among 294 neonates admitted to neonatal intensive care.16 Overestimation of arterial saturation from pulse oximetry was 2.4-fold greater in Black than White neonates (mean bias 1.73% compared to 0.72%, p < 0.01). Black neonates consistently had higher pulse oximetry saturation readings at each arterial oxygen saturation level than White neonates. While the exact difference varied by pulse oximetry saturation (SpO2), the degree of error widened between Black and White neonates for SpO2 ≤ 95%.16 Occult hypoxemia (defined as SpO2 ≥ 90% with arterial oxygen saturation <85%) occurred in 9.2% of paired measurements from Black neonates (188/2044) compared to 7.7% of those from White neonates (181/2343), although the difference was not statistically significant (p = 0.08).16 The sensitivity of the pulse oximeter to detect true hypoxia (SpO2 < 90% when SaO2 < 85%) was similar for Black and White neonates (39% and 38% respectively) with specificity also similar (81% vs 78% respectively).16
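The measures reported for this study can be illustrated with a short sketch that takes paired pulse oximetry (SpO2) and arterial (SaO2) readings. The thresholds for occult hypoxemia and sensitivity follow the definitions quoted above; the definition of mean bias as SpO2 minus SaO2, the definition of specificity as SpO2 ≥ 90% when SaO2 ≥ 85%, the function name and the example pairs are assumptions made for illustration.

```python
def oximetry_summary(pairs):
    """Summarize paired (spo2, sao2) percentage readings.

    Returns mean bias (SpO2 - SaO2, assumed definition), the proportion
    of pairs with occult hypoxemia (SpO2 >= 90 with SaO2 < 85),
    sensitivity (SpO2 < 90 when SaO2 < 85) and specificity
    (SpO2 >= 90 when SaO2 >= 85, assumed definition).
    """
    if not pairs:
        raise ValueError("at least one paired reading is required")
    n = len(pairs)
    mean_bias = sum(spo2 - sao2 for spo2, sao2 in pairs) / n
    occult = sum(1 for spo2, sao2 in pairs if spo2 >= 90 and sao2 < 85)
    hypoxemic = [spo2 for spo2, sao2 in pairs if sao2 < 85]
    normoxemic = [spo2 for spo2, sao2 in pairs if sao2 >= 85]
    sensitivity = (sum(1 for s in hypoxemic if s < 90) / len(hypoxemic)
                   if hypoxemic else None)
    specificity = (sum(1 for s in normoxemic if s >= 90) / len(normoxemic)
                   if normoxemic else None)
    return {
        "mean_bias": mean_bias,
        "occult_hypoxemia_rate": occult / n,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


# Hypothetical paired readings: (SpO2 %, SaO2 %).
example = [(95, 93), (92, 84), (88, 83), (96, 95), (89, 90)]
print(oximetry_summary(example))

# Arithmetic check of the proportions and fold difference reported in the text.
print(round(100 * 188 / 2044, 1), round(100 * 181 / 2343, 1), round(1.73 / 0.72, 1))
```

Note that occult hypoxemia is expressed here as a proportion of all paired readings, whereas sensitivity is conditional on the arterial reading being below 85%, so the two figures are not complements of one another.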
Discussion
Main findings
This systematic review included 10 observational studies that considered cyanosis or Apgar score assessment and their association with neonatal wellbeing in Black, Asian, or ethnic minority neonates compared to White neonates. Three studies showed Black neonates have lower neonatal mortality rates at low Apgar score than their White counterparts.7,22,25 However, they were also more likely to receive a low Apgar score. Detection of cyanosis was poor for all ethnicities, with the tongue and lips the best places to observe. When using pulse oximetry, occult hypoxia was more likely in Black neonates although this did not reach significance.16 Only 9.1% of staff in one survey reported receiving adequate training around caring for ethnic minority babies.24
Comparison with other studies
The Apgar score provides a rapid scoring of a neonate,29 however many of the individual elements are recognized as subjective.6 In particular, neonatal skin color assessment within the appearance component has been questioned, as it is least correlated with cord pH, arterial carbon dioxide and base excess.30
The included studies all showed mortality increased with lower Apgar scores.7,22,25 Black neonates had lower odds of mortality with a low Apgar score than White neonates after adjusting for multiple confounders including socioeconomic factors and maternal lifestyle.7,22,25 This was despite Black neonates having higher overall rates of neonatal mortality within two of the studies.22,25 This may partly be explained by inconsistencies in Apgar scoring according to ethnicity, with many studies finding Black neonates are assigned significantly lower 1-min and 5-min Apgar scores than their White counterparts.22,25,31,32,33 Inconsistencies were still noted when only infants with normal blood gas measurements were included31 or when umbilical artery gases were statistically controlled for.32 Only the appearance component of the Apgar score differs by race, with Black neonates having significantly lower appearance scores even after controlling for multiple factors such as gestational age, cord gases, and maternal antenatal health.32 These scoring differences provide the most likely explanation of better survival among Black infants with low Apgar scores. In addition, differences in Apgar scoring are noted between hospitals, between professions such as neonatologists, midwives, and obstetricians,34 and between European countries, with the proportion of neonates receiving an Apgar score of 10 varying from 9% to 93% in different countries.35 This suggests major differences in clinical training and convention when scoring the Apgar.35 These differences in assignment mean the significance of lower neonatal mortality among Black neonates with a low Apgar score is currently unclear. However, it may indicate that the appearance component, which defines the highest score as “pink all over”, may not be a reliable component.
Inconsistent results were also found in a few small studies regarding the ability of the Apgar score to predict long-term outcomes, however, this is generally considered outside the remit of the Apgar score.36
In clinical practice, although each neonate is assigned an Apgar score, it is suggested that it is seldom used to determine clinical management and is often assigned in retrospect.37 This is best demonstrated through a United States study which showed 90% of nurses assigned an Apgar score within a vignette even when data for some Apgar score components were absent, for example heart rate or respiration rate.38 A correct Apgar score was assigned only 19% to 57% of the time and inter-rater agreement was poor.38 This has additionally been confirmed through qualitative interviews, where healthcare professionals stated that they simply assigned an Apgar score of 9 or 10 if a neonate was alert and crying without actually assessing each component.39
While many high resource countries have moved away from using the Apgar score as a tool for decision-making and rely more on pulse oximetry and electrocardiograms, such medical technology is not always available, especially in low resource settings and rural locations. It is therefore of paramount importance that more research is undertaken in countries where the Apgar score is still relied upon for clinical decision making, for example some countries in Africa, Asia, and Latin America. Studies within these countries are also essential to better understand any inconsistencies in Apgar scoring where neonates with darker skin tones are in the majority. Additionally, despite questions around the reliability of the Apgar score for clinical decision making, especially for neonates with darker skin pigmentation, its routine collection in practice means the Apgar score remains a focus of research studies. Within research there is an overemphasis on the Apgar score to classify neonatal wellbeing and to adjust within regression analyses, with little consideration of its reliability or validity.37 The limitations of the Apgar score, especially in those with darker skin tones, need to be more clearly understood within the research arena.
Replacements for the Apgar score have been proposed, for example, the Neonatal Resuscitation Assessment and Adaptation Score (NRAS)40,41,42,43 and the Expanded Apgar score which the American College of Obstetricians and Gynecologists recommend using if a neonate requires resuscitation.36 However, wider use of scores such as the NRAS is not currently recommended as it has only been assessed in small samples without explicit consideration of the predictiveness within neonates from different ethnicities.
It is known that visual changes to skin color with hypoxia may be less apparent in dark skinned neonates.9,44 In the absence of a pulse oximeter, hypoxia may not be detected in infants with darker skin tones by parents or professionals, resulting in later identification of deterioration. One small included study suggested tongue color was a good indicator of supplemental oxygen requirement in the delivery room, regardless of ethnicity.26 Generally however, visual assessment of cyanosis is poor, with pulse oximetry the mode of choice to detect cyanosis particularly in any neonatal resuscitation scenario.10 The COVID-19 pandemic highlighted inequalities in pulse oximetry reliability in Black, Asian or minority ethnic adults.14,15,45 One study included within this review also suggested pulse oximetry was less accurate in preterm neonates from Black and minority ethnic backgrounds, with slightly increased incidence of occult hypoxemia.16 Although pulse oximetry is better than visual assessment, it is suggested that saturations near the bottom of the recommended range are avoided in Black preterm neonates to minimize adverse outcomes.16 Studies in older children have shown mixed results. 
Two studies showed pulse oximetry overestimated arterial oxygenation more frequently in Black children.46,47 Occult hypoxemia was seen in 5.8% of White compared to 9.6% of Black children in one study46 and in 1% of White compared to 5% of Black children in the other study.47 In contrast another small study found no differences in pulse oximetry measurement bias between light and dark-skinned children when classifying children according to actual skin pigmentation, rather than ethnicity or race.48 A final study suggested that the absolute difference between pulse oximetry saturation and arterial saturation was lower in African American children, with a difference of more than 3% found in 30.0% of African American children compared to 48.9% of White children.49 Further research to examine the small but potentially clinically significant differences in pulse oximeter accuracy in neonates with diverse skin tones is warranted.
This review has highlighted the limitations of visual assessment of cyanosis especially among those with darker skin tones, as well as potential inconsistencies in scoring of the appearance component of the Apgar score. However, a recent review found that the skin color descriptors such as “pink”, “blue” and “pale” were still widely used within clinical guidelines and policies without consideration of how these may appear in neonates with different skin pigmentations.50 When concurrently considering that only 9.1% of staff report receiving adequate training around caring for ethnic minority babies,24 it highlights the urgent need to address care practices to ensure they are inclusive and safe for all communities that make up our diverse multiethnic society.
Strengths and limitations
This review had several strengths. Firstly, a robust approach was applied including research of any methodology and scrutinizing literature by two reviewers. Secondly, a comprehensive search was undertaken, which included looking for gray literature.
Several limitations were, however, identified. Excluding non-English articles may have limited the number of studies available for inclusion. The international nature of included studies meant heterogeneity was noted in the categorization of ethnicity, race, or skin tone. Categories within the review retained the terms within the original articles; however, transferability of findings is complex. Additionally, all but one included study classified neonates according to ethnicity or race, rather than by skin tone. It is acknowledged that variations of skin tone occur within each race/ethnic category, which are not captured within the included studies. Additionally, the heterogeneity in race/ethnicity categories as well as in outcome measures meant that meta-analysis of the results was not possible. Included studies did not separate their results according to socioeconomic status, education level, and maternal age; therefore it was not possible to do a sub-group analysis according to these factors. The adequacy of provider training was not considered within the included studies that assessed neonatal outcomes; therefore it is unknown whether training, especially around assessment in dark skinned neonates, may have impacted the results. Finally, only two included studies were deemed low risk of bias across all domains.
Conclusions and implications
Low Apgar score and neonatal mortality were strongly correlated across all ethnicities, but variations in scoring practices, particularly affecting Black neonates, require further attention. Additionally, further evaluation is needed to determine how to objectively assess newborn health. Visual detection of cyanosis is challenging, especially in neonates with darker skin, and therefore pulse oximetry is preferred to mitigate the health disadvantages experienced by those from ethnic minorities. However, small but potentially clinically significant differences in pulse oximetry compared to arterial oxygen saturation in neonates with darker skin tones warrant further exploration. Additionally, healthcare provider training gaps impact assessment accuracy. There is an urgent need for the development and robust prospective evaluation of targeted education around assessment in Black, Asian and minority ethnic neonates.
Data availability
All data generated or analyzed during this study are included in this published article and its online supplementary information files.
References
Knight, M. et al. (Eds.) On behalf of MBRRACE-UK. Saving Lives, Improving Mothers’ Care - Lessons learned to inform maternity care from the UK and Ireland Confidential Enquiries into Maternal Deaths and Morbidity 2018–20 (National Perinatal Epidemiology Unit, University of Oxford, 2022).
Draper, E. S. et al. On behalf of the MBRRACE-UK Collaboration. MBRRACE-UK Perinatal Mortality Surveillance Report, UK Perinatal Deaths for Births from January to December 2021: State of the Nation Report (The Infant Mortality and Morbidity Studies, Department of Population Health, University of Leicester, 2023).
Kapadia, D. et al. Ethnic Inequalities in Healthcare: A Rapid Evidence Review (NHS Race and Health Observatory, 2022).
Adams, B. N. & Grunebaum, A. Does “pink all over” accurately describe an Apgar color score of 2 in newborns of color? Obstet. Gynecol. 123, 36S (2014).
Moyer, V. A., Ahn, C. & Sneed, S. Accuracy of clinical judgment in neonatal jaundice. Arch. Pediatr. Adolesc. Med. 154, 391–394 (2000).
Blake, D. Do we assess ‘colour’ appropriately using the Apgar score? J. Neonatal Nurs. 16, 184–187 (2010).
Li, F. et al. The Apgar score and infant mortality. PLoS ONE 8, e69072 (2013).
ACOG Committee Opinion. Number 644. The Apgar score. Obstet. Gynecol. 126, e52–e55 (2015).
Goldman, H. I., Maralit, A., Sun, S. & Lanzkowsky, P. Neonatal cyanosis and arterial oxygen saturation. J. Pediatr. 82, 319–324 (1973).
Resuscitation Council. Air/oxygen Blenders and Pulse Oximetry in Resuscitation at Birth. https://www.resus.org.uk/library/quality-standards-cpr/airoxygen-blenders-and-pulse-oximetry-resuscitation-birth (Resuscitation Council United Kingdom (RCUK), 2011).
Resuscitation Council. Newborn Resuscitation and Support of Transition of Infants at Birth Guidelines. https://www.resus.org.uk/library/2021-resuscitation-guidelines/newborn-resuscitation-and-support-transition-infants-birth (Resuscitation Council United Kingdom (RCUK), 2021).
Brown, S., Liyanage, S., Mikrou, P., Singh, A. & Ewer, A. K. Newborn pulse oximetry screening in the UK: a 2020 survey. Lancet 396, 881 (2020).
Moran-Thomas, A. How a popular medical device encodes racial bias. Boston Review. Retrieved from: https://www.bostonreview.net/articles/amy-moran-thomas-pulse-oximeter/ (2020).
Sjoding, M. W., Dickson, R. P., Iwashyna, T. J., Gay, S. E. & Valley, T. S. Racial bias in pulse oximetry measurement. N. Engl. J. Med. 383, 2477–2478 (2020).
NHS Race and Health Observatory. Pulse Oximetry and Racial Bias: Recommendations for National Healthcare, Regulatory and Research Bodies. https://www.nhsrho.org/wp-content/uploads/2021/03/Pulse-oximetry-racial-bias-report.pdf (NHS Race and Health Observatory, 2021).
Vesoulis, Z., Tims, A., Lodhi, H., Lalos, N. & Whitehead, H. Racial discrepancy in pulse oximeter accuracy in preterm infants. J. Perinatol. 42, 79–85 (2022).
United Nations Development Programme. Human Development Index (HDI). United Nations. Retrieved from https://hdr.undp.org/data-center/human-development-index#/indicies/HDI (2023).
Hong, Q. N. et al. Mixed Methods Appraisal Tool (MMAT) version 2018. Registration of Copyright (#1148552). (Canadian Intellectual Property Office, Industry 2018).
Popay, J. et al. Guidance on the conduct of narrative synthesis in systematic reviews. A product from the ESRC methods programme Version.b92. (2006)
Serunian, S. A. & Broman, S. H. Relationship of Apgar scores and Bayley mental and motor scores. Child Dev. 46, 696–700 (1975).
Wolf, M. J., Beunen, G., Casaer, P. & Wolf, B. Neurological findings in neonates with low Apgar in Zimbabwe. Eur. J. Obstet. Gynecol. Reprod. Biol. 73, 115–119 (1997).
Mihoko Doyle, J., Echevarria, S. & Parker Frisbie, W. Race/ethnicity, Apgar and infant mortality. Popul. Res. Policy Rev. 22, 41–64 (2003).
Shankaran, S. et al. Outcome of extremely-low-birth-weight infants at highest risk: gestational age ≤24 weeks, birth weight ≤750 g, and 1-minute Apgar ≤3. Am. J. Obstet. Gynecol. 191, 1084–1091 (2004).
Chubb, B., Cockings, R., Valentine, J., Symonds, E. & Heaslip, V. Does training affect understanding of implicit bias and care of Black, Asian and minority ethnic babies? Br. J. Midwifery 30, 130–135 (2022).
Gillette, E., Boardman, J. P., Calvert, C., John, J. & Stock, S. J. Associations between low Apgar scores and mortality by race in the United States: a cohort study of 6,809,653 infants. PLoS Med. 19, e1004040 (2022).
Dawson, J. A. et al. Assessing the tongue colour of newly born infants may help to predict the need for supplemental oxygen in the delivery room. Acta Paediatr. 104, 356–359 (2015).
Page, M. J. et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ 372, n71 (2021).
Australian Resuscitation Council, New Zealand Resuscitation Council. Introduction to resuscitation of the newborn infant. ARC and NZRC Guideline 2010. Emerg. Med. Australas. 23, 419–423 (2011).
Apgar, V. A proposal for a new method of evaluation of the newborn infant. Anesth. Analg. 32, 260 (1953).
Crawford, J. S., Davies, P. & Pearson, J. F. Significance of the individual components of the Apgar score. Br. J. Anaesth. 45, 148–158 (1973).
Petrikovsky, B. M., Diana, L. & Baker, D. A. Race and Apgar scores. Anaesthesia 45, 988 (1990).
Edwards, S. E., Wheatley, C., Sutherland, M. & Class, Q. A. Associations between provider-assigned Apgar score and neonatal race. Am. J. Obstet. Gynecol. 228, 229.e1-9 (2023).
Grünebaum, A. et al. Hidden in plain sight in the delivery room—the Apgar score is biased. J. Perinat. Med. 51, 628–633 (2023).
Arri, S. J., Bucher, H. U., Merlini, M. & Fauchère, J. C. Inter-Observer Variability of the Apgar score of preterm infants between neonatologists, obstetricians and midwives. J. Neonatol. Clin. Pediatr. 5, 024 (2018).
Siddiqui, A. et al. Can the Apgar score be used for international comparisons of newborn health? Paediatr. Perinat. Epidemiol. 31, 338–345 (2017).
Watterberg, K. L. et al. The Apgar score. Pediatrics 136, 819–822 (2015).
Michel, A. Review of the reliability and validity of the Apgar score. Adv. Neonatal Care 22, 28–34 (2022).
Michel, A. D. & Lowe, N. K. Accuracy and interrater agreement of registered nurses’ assignment of Apgar Scores to standardized clinical vignettes. Clin. Nurs. Res. 32, 452–462 (2023).
Fair, F., Furness, A., Higginbottom, G., Oddie, S. & Soltani, H. Review of neonatal assessment and practice in Black, Asian and minority ethnic newborns: Exploring the Apgar score, the detection of cyanosis and jaundice. Available on: https://www.nhsrho.org/publications/review-of-neonatal-assessment-and-practice-in-black-asian-and-minority-ethnic-newborns-exploring-the-apgar-score-the-detection-of-cyanosis-and-jaundice/ (NHS Race and Health Observatory, 2023).
Jurdi, S. R., Jayaram, A., Sima, A. P. & Hendricks Muñoz, K. D. Evaluation of a comprehensive delivery room neonatal resuscitation and adaptation score (NRAS) compared to the Apgar score: a pilot study. Glob. Pediatr. Health 2, https://doi.org/10.1177/2333794X15598293 (2015).
Witcher, T. J. et al. Neonatal resuscitation and adaptation score vs Apgar: newborn assessment and predictive ability. J. Perinatol. 38, 1476–1482 (2018).
Elglil, A. M. A., Ibrahim, A. M., Shihab, N. S. & El Mashad, A. E. R. M. Study of the neonatal resuscitation and adaptation Score (NRAS) compared to the Apgar score in neonatal resuscitation. J. Adv. Med. Med. Res. 32, 100–110 (2020).
Villota, E. C., Pasquel, D. P., Cuenca, F. A. & De Los Monteros, R. E. Noninferiority assessment of the “neonatal resuscitation and adaptation score” versus the Apgar score. Rev. Ecuat. Pediatr. 22, 20 (2021).
Schott, J. & Henley, A. Health-care equity: how to recognize clinical signs in skin. Br. J. Midwifery 8, 271–273 (2000).
Cabanas, A. M., Fuentes-Guajardo, M., Latorre, K., León, D. & Martin-Escudero, P. Skin pigmentation influence on pulse oximetry accuracy: a systematic review and bibliometric analysis. Sensors 22, 3402 (2022).
Andrist, E., Nuppnau, M., Barbaro, R. P., Valley, T. S. & Sjoding, M. W. Association of race with pulse oximetry accuracy in hospitalized children. JAMA Netw. Open 5, e224584 (2022).
Ruppel, H. et al. Evaluating the accuracy of pulse oximetry in children according to race. JAMA Pediatr. 177, 540–543 (2023).
Foglia, E. E. et al. The effect of skin pigmentation on the accuracy of pulse oximetry in infants with hypoxemia. J. Pediatr. 182, 375–377 (2017).
Ross, P. A., Newth, C. J. & Khemani, R. G. Accuracy of pulse oximetry in children. Pediatrics 133, 22–29 (2014).
Furness, A., Fair, F., Higginbottom, G., Oddie, S. & Soltani, H. A review of the current policies and guidance regarding Apgar scoring and the detection of jaundice and cyanosis concerning Black, Asian and ethnic minority neonates. BMC Pediatr. 24, 198 (2024).
Acknowledgements
We would like to thank Ghazaleh Oshaghi and Kayla Baugh for their support in screening the citations by title. The photographic image within the graphical abstract is a montage compiled from photo: sirtravelot/Shutterstock and photo: Rawpixel.com/Shutterstock, which are royalty-free images after purchase. With thanks to Nick Waddington and John Kirkby for compiling the images in the graphical abstract.
Funding
This research was commissioned by the NHS Race & Health Observatory (RHO_Neonatal health and ethnicity). For the purpose of open access, the author has applied a Creative Commons Attribution (CC BY) licence to any Author Accepted Manuscript version arising from this submission.
Contributions
All authors participated in the study concept and design, led by H.S. F.J.F. and A.F. undertook literature searching. F.J.F., A.F., and H.S. participated in study selection, data extraction, critical appraisal of the included studies, and data analysis. All authors were involved in the interpretation of the results. A.F. drafted the background section and F.J.F. drafted all other sections in the manuscript. All authors revised the manuscript and approved the final manuscript for submission.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Fair, F.J., Furness, A., Higginbottom, G. et al. Systematic review of Apgar scores & cyanosis in Black, Asian, and ethnic minority infants. Pediatr Res 97, 939–952 (2025). https://doi.org/10.1038/s41390-024-03543-3
DOI: https://doi.org/10.1038/s41390-024-03543-3