Cognitive underpinnings and ecological correlates of implicit bias against non-Americans in the United States

Kurdi, Benedek; Okura, Keitaro; Hehman, Eric; Ferguson, Melissa J.

doi:10.1038/s41598-025-98384-3

Article
Open access
Published: 30 April 2025

Benedek Kurdi¹,², Keitaro Okura³, Eric Hehman⁴ & Melissa J. Ferguson¹

Scientific Reports volume 15, Article number: 15191 (2025)

Abstract

Of the 330 million residents of the United States, over 40 million were born abroad. Such individuals are routinely referred to using labels such as “alien,” “foreigner,” and “noncitizen.” In this multimethod project relying on data from 5437 U.S. citizens in experimental studies and 193,649 U.S. citizens in archival studies, we examine implicit (automatic) evaluations of non-Americans in the United States, their effects on impression formation, and their ecological correlates in the form of real-life outcomes. In Studies 1A–1C, the labels “alien,” “foreigner,” and “noncitizen” were found to be highly and similarly implicitly negative. In Studies 2A–2D, applying these labels to specific individuals created immediate implicit negativity toward them, irrespective of their gender or race. Finally, pro-American/anti-foreigner implicit evaluations predicted anti-immigrant policy positions at the level of individuals (Study 3A), and a conceptually and statistically related White–American/Asian–foreign implicit stereotype predicted anti-immigrant voting patterns in 18 relevant ballot initiatives at the level of U.S. counties (Study 3B). Across studies, implicit anti-foreigner bias generalized across participant demographics but was somewhat stronger among men and political conservatives. Together, this work highlights the cognitive underpinnings and real-world correlates of robust and pervasive anti-foreigner biases in the United States.

The founding ethos of the United States emphasizes adherence to certain cultural and political values rather than membership in a specific ethnoracial group as the main criterion for belonging to the community of Americans. This idea is often expressed by referring to the United States using monikers such as a multiracial and multicultural “melting pot”¹ or a “country of immigrants.” Indeed, more than 46 million people currently living in the United States were born abroad. About 21 million of these individuals are naturalized U.S. citizens, 12 million permanent residents (green card holders), 11 million undocumented immigrants, and 2 million temporary residents (e.g., those on student visas)². However, despite the large number of immigrants in the United States today, combined with the immigrant history of the country’s majority groups, those born and raised abroad have faced and continue to face myriad forms of marginalization and exclusion in U.S. society, including economic inequality³, residential segregation⁴, and hate crime victimization⁵.

Social group-based inequalities are multiply determined^6,7, including via a host of historical, political, sociological, and other factors. However, uncovering how human minds represent and apply social group-relevant information can make an important contribution toward understanding and potentially mitigating such inequalities⁸. As such, the studies reported below use a multimethod approach, relying both on experimental studies and an archival study of regional voting patterns to investigate three separate but interrelated aspects of the psychology of anti-foreigner bias in the United States. We pursue this multipronged approach because we believe that the study of societal inequality has much to gain from treating individual and structural levels of analysis as mutually informative and reinforcing^8,9.

Specifically, in Studies 1A–1C we ask whether the choice of label used to denote non-Americans (“alien,” “foreigner,” or “noncitizen”) can influence how Americans relate to and evaluate this large and diverse social group. Many Americans share the intuition that relatively subtle differences in labeling can have wide-ranging psychological repercussions. For example, in 2021, the Biden administration ordered immigration agencies to stop using the term “alien” — widely seen as dehumanizing — and to replace it with the ostensibly more neutral “noncitizen”¹⁰. The perceived importance of relevant labels is also illustrated by the public outcry following President Biden’s use of the term “illegals” in his 2024 State of the Union address¹¹.

Indeed, the psychological effects of different labels used to refer to the same social groups are well documented in empirical research^12,13,14,15, including in the context of non-American groups and immigration policy^16,17,18,19,20. The present project builds upon these existing findings in two ways. First, whereas past work has tended to investigate foreigner labeling effects in the context of undocumented immigration, here we probe the more general distinction between Americans and non-Americans, focusing on three labels: “alien,” “noncitizen,” and “foreigner.” The first two labels were selected for inclusion because, as mentioned above, the Biden administration switched from using “alien” to using “noncitizen” in official communications in 2021. However, both of these labels are highly technical and thus not often used in everyday discourse. In addition, “noncitizen” contains a lexical negation, which may shift its evaluations in a negative direction²¹. As such, we additionally included “foreigner,” which is both more colloquial and monomorphemic, in Studies 1A–1C.

Second, prior research has relied exclusively on self-report measures to investigate relevant labeling effects. Such measures are highly informative with respect to participants’ consciously endorsed values (e.g., about the inherent equality of different social groups). However, they are less well suited to index more automatic responses, which may be misaligned with such egalitarian views due to social desirability²² or a lack of introspective access to less controlled aspects of social thought and behavior²³. As such, each study in the present project measured both self-reported (explicit) and automatic (implicit) evaluations.

In Studies 2A–2D we turned to investigating the evaluative consequences of applying these labels to particular individuals. Across experiments, we additionally manipulated these individuals’ other social group memberships, both to ensure the generalizability²⁴ of the results and to potentially document emergent intersectional biases²⁵. Investigating whether targets’ racial group membership moderates foreigner labeling effects is especially important given that social representations, including implicit stereotypes, of who is seen as a (prototypical) member of the category “American” are heavily racialized in the United States^26,27,28,29,30,31,32. As such, the negative evaluative effects of applying a foreigner label might be exacerbated when such labels are used to refer to non-White individuals. This effect might be particularly strong in the context of Asian targets^29,31,32, especially against the backdrop of increased anti-Asian bias as a result of the COVID-19 pandemic³³.

Finally, in Studies 3A–3B we examine the association between implicit (and explicit) anti-foreigner evaluations and social behavior. Specifically, in Study 3A, we focus on the level of individual participants by correlating the extent of American–good/foreign–bad evaluative biases and White–American/Asian–foreign stereotypic biases with participants’ anti-immigrant policy views, such as the extent to which they oppose the existence of sanctuary cities or support states suing the federal government over stricter enforcement of immigration regulations. We examine evaluative biases in this study because these biases are the focus of Studies 1A–2D; we additionally include measures of related stereotypic biases because — due to data availability — these biases are the focus of Study 3B, described in more detail below.

Importantly, Study 3A has limitations consistent with much empirical work in this domain. Notably, participants may be responding strategically to the policy items. Moreover, the policy items themselves are hypothetical and, as such, their generalizability beyond the online study setting may be limited³⁴. To address this concern, in Study 3B we draw inspiration from the recent bias of crowds^35,36 and regional intergroup bias^37,38 approaches to probe whether regional aggregates of implicit White–American/Asian–foreign stereotypes predict anti-immigrant voting patterns across 18 relevant ballot initiatives.

As such, Studies 3A and 3B have complementary strengths and limitations. The former allows for inferences about individual participants, but its setting is relatively contrived and does not preclude strategic responding on the voting preference items. By contrast, the latter is not suited for individual-level inferences³⁹, but its criterion behaviors are both naturalistic and have obvious external validity given the direct and tangible repercussions of the relevant ballot initiatives for millions of non-Americans (and Americans) living in the United States.

Results and discussion

Study 1A

Study 1A measured implicit (Implicit Association Test; IAT) and explicit (self-reported) evaluations of the labels “American” vs. “alien,” “foreigner,” and “noncitizen” in a sample of U.S. citizens. The results are shown in Fig. 1. Means, standard deviations, and correlations between implicit and explicit evaluations for this and all remaining studies are reported in Supplementary Tables 1–4.

Overall, participants exhibited a statistically significant and very strong implicit preference for the label “American” relative to the labels “alien,” “foreigner,” and “noncitizen,” t(356) = 28.07, p < 0.001, Cohen’s d = 1.49, BF₁₀ = 3.30 × 10⁸⁸. This result is unsurprising given robust and near-ubiquitous findings of ingroup favoritism on implicit evaluation measures, especially among members of dominant groups^40,41.
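As a reading aid, the effect size reported above can be recovered from the test statistic. For a one-sample t test against zero, Cohen’s d equals t / √n, with n = df + 1. The minimal sketch below assumes that d was computed as the mean divided by the sample standard deviation; the function name `cohens_d_from_t` is our own illustration and is not taken from the authors’ analysis code.

```python
import math


def cohens_d_from_t(t: float, df: int) -> float:
    """Cohen's d for a one-sample t test against zero: d = t / sqrt(n), n = df + 1.

    Assumption: d is defined as mean / sample SD, so that t = d * sqrt(n).
    """
    n = df + 1
    return t / math.sqrt(n)


# Implicit evaluations in Study 1A: t(356) = 28.07
d_implicit = cohens_d_from_t(28.07, 356)
print(round(d_implicit, 2))  # 1.49
```

The same conversion applied to the other one-sample tests in this section gives values that match the reported effect sizes to two decimals, for example t(287) = 20.16 in Study 1B yields d ≈ 1.19.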

Counter to the widespread intuition that such labels are meaningfully different from each other, the specific category label used on the IAT (“alien,” “foreigner,” or “noncitizen”) produced no significant effect, F(2, 354) = 0.21, p = 0.812, η² < 0.01, BF₀₁ = 26.86. Importantly, however, in Study 1A we manipulated only the IAT category labels between conditions; category stimuli were the same across conditions and included all three labels (“alien,” “foreigner,” and “noncitizen”). As such, the lack of a condition effect may have been due to the relatively weak manipulation. We revisit this issue in Study 1B below.

The pattern for explicit evaluations was distinct in two ways. First, participants exhibited a small but statistically significant explicit preference for the labels “alien,” “foreigner,” and “noncitizen” relative to the label “American,” t(338) = -5.12, p < 0.001, Cohen’s d = -0.28, BF₁₀ = 1.64 × 10⁴. One potential interpretation of the discrepancy between implicit and explicit evaluations is that the latter were indicative of the social sensitivity of the domain²² and, relatedly, pressures to appear nonprejudiced⁴².

Second, unlike for implicit evaluations, this result was significantly modulated by the specific label used to denote the non-American category, F(2, 336) = 4.78, p = 0.009, η² = 0.03, BF₁₀ = 2.53. Whereas participants showed no significant preference in the American/alien condition, they exhibited an outgroup preference in the American/foreigner and American/noncitizen conditions. Accordingly, the American/alien condition was significantly different from the American/foreigner, t(336) = 2.49, p = 0.013, and American/noncitizen conditions, t(336) = 2.85, p = 0.005. The American/foreigner and American/noncitizen conditions did not differ from each other, t(336) = 0.32, p = 0.748.

The data thus suggest that, at least on the explicit evaluation measure, the “alien” label was perceived relatively more negatively than the “foreigner” and “noncitizen” labels, but not different from the “American” label. Importantly, overall, explicit evaluations did not suggest any negativity toward these three labels relative to “American.”

Study 1B

Study 1B followed the setup of Study 1A, but the manipulation of non-American labels was strengthened by varying not only IAT category labels but also IAT category stimuli between participants.

The results are shown in Fig. 2. Despite the stronger manipulation involving both category labels and category stimuli, implicit evaluations closely mirrored the results from Study 1A. Specifically, there was, again, a statistically significant and very strong implicit preference for the label “American” relative to the labels “alien,” “foreigner,” or “noncitizen,” t(287) = 20.16, p < 0.001, Cohen’s d = 1.19, BF₁₀ = 2.64 × 10⁵³. As in Study 1A, the IAT label (and in this case, the IAT stimuli; “alien,” “foreigner,” or “noncitizen”) produced no significant effect, F(2, 285) = 2.08, p = 0.127, η² = 0.01, BF₀₁ = 4.07. This result suggests that implicit preferences reflect evaluations of the labels’ shared referents (i.e., non-Americans) rather than connotations of the specific label.

Consistent with Study 1A, participants exhibited a small but statistically significant explicit preference for the labels “alien,” “foreigner,” and “noncitizen” relative to the label “American,” t(287) = -7.08, p < 0.001, Cohen’s d = -0.42, BF₁₀ = 5.83 × 10⁸. This result was again significantly moderated by the specific label used to denote the non-American category, F(2, 285) = 4.19, p = 0.016, η² = 0.03, BF₁₀ = 1.62. Specifically, the American/alien condition was significantly different from the American/foreigner, t(285) = 2.06, p = 0.041, and American/noncitizen conditions, t(285) = 2.80, p = 0.006, reflecting less positive evaluations of “alien” relative to “foreigner” and “noncitizen.” The American/foreigner and American/noncitizen conditions did not differ from each other, t(285) = 0.85, p = 0.398. These small differences notwithstanding, similar to Study 1A, explicit evaluations did not suggest any negativity toward these three labels relative to “American.”

Study 1C

Given the relative nature of the measures used in Studies 1A–1B, it is conceivable that differences across non-American labels may have been obscured in those studies given the overwhelming positivity of the American label in comparison. As such, in Study 1C, we directly contrasted the non-American labels with each other. That is, depending on participants’ condition assignment, the IAT labels and stimuli were “alien” vs. “foreigner,” “alien” vs. “noncitizen,” or “foreigner” vs. “noncitizen.”

The results are shown in Fig. 3. Unlike in Studies 1A–1B, implicit evaluations significantly differed from each other across the three label conditions, F(2, 303) = 5.07, p = 0.007, η² = 0.03, BF₁₀ = 3.49. Follow-up analyses indicated that the significant omnibus test was due to a difference between the noncitizen/foreigner and foreigner/alien conditions, t(303) = 2.89, p = 0.004, and between the noncitizen/foreigner and noncitizen/alien conditions, t(303) = 2.62, p = 0.009. The remaining two conditions did not differ from each other, t(303) = 0.38, p = 0.707. This pattern of between-condition differences indicates that whereas foreigner and alien as well as noncitizen and alien were evaluatively equivalent to each other, foreigner was somewhat more negative than noncitizen. However, even the noncitizen–foreigner comparison produced only a relatively modest effect (β = 0.35).

The pattern of cross-condition differences was more pronounced on the explicit evaluation measure, F(2, 303) = 21.65, p < 0.001, η² = 0.13, BF₁₀ = 6.10 × 10⁶. Each pairwise comparison was significant (ps ≤ 0.011). These significant differences were driven by the fact that foreigner was significantly preferred to alien (β = 0.55) but dispreferred to noncitizen (β = 0.32). In contrast, reflecting a non-transitive preference ordering, noncitizen and alien did not differ from each other.
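The η² values reported for the omnibus tests in Study 1C follow from the F statistics and their degrees of freedom. For a one-way between-participants design, η² = (df₁ · F) / (df₁ · F + df₂). The sketch below is a consistency check under that assumption; the function name `eta_squared_from_f` is hypothetical and not drawn from the authors’ materials.

```python
def eta_squared_from_f(f: float, df_effect: int, df_error: int) -> float:
    """Eta squared for a one-way between-participants ANOVA.

    Uses SS_effect / (SS_effect + SS_error) = (df1 * F) / (df1 * F + df2).
    Assumption: a single between-participants factor and no covariates.
    """
    if f < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    return (df_effect * f) / (df_effect * f + df_error)


# Study 1C, implicit evaluations: F(2, 303) = 5.07
print(round(eta_squared_from_f(5.07, 2, 303), 2))   # 0.03
# Study 1C, explicit evaluations: F(2, 303) = 21.65
print(round(eta_squared_from_f(21.65, 2, 303), 2))  # 0.13
```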

Study 2A

After examining implicit evaluations of the labels “alien,” “foreigner,” and “noncitizen” in isolation in Studies 1A–1C, in Study 2A we turned to studying the evaluative consequences of applying such labels to specific targets. Participants in this study were introduced to two novel individuals (both White men). In an attribute conditioning paradigm^43,44, one of these individuals was repeatedly paired with the label “American,” whereas the other was repeatedly paired with the labels “alien,” “foreigner,” and “noncitizen.” The goal was to measure implicit and explicit evaluations of the two individuals following this minimal learning manipulation. Given that Studies 1A–1C generally found the three labels to be similarly implicitly negative, in Studies 2A–2D we used them as a set to induce implicit evaluations toward novel targets.

The results are shown in Fig. 4. On the implicit evaluation measure, participants exhibited a small but statistically significant preference for the American over the non-American target, t(301) = 3.81, p < 0.001, Cohen’s d = 0.22, BF₁₀ = 73.48. We obtained a similar result in a Bayesian mixed-effects model, which additionally included a random intercept for the particular images used to represent the two individuals in the learning task and on the IAT, β₀ = 0.21 [-0.05; 0.46]; however, given that the 95-percent highest density interval (HDI) overlapped with zero, the condition difference in this study should be interpreted with some caution. As such, Studies 2B–2D were conducted, in part, to examine the robustness of the conditioning effect obtained in Study 2A.

Mirroring Studies 1A–1C, explicit evaluations were dissociated from implicit evaluations in Study 2A. Specifically, participants did not show any explicit preference between the American and non-American targets, t(292) = 0.91, p = 0.362, Cohen’s d = 0.05, BF₀₁ = 10.12. The same result also emerged in a Bayesian mixed-effects model accounting for stimulus effects, β₀ = 0.05 [-0.09; 0.19]. Similar to the previous studies, this result is likely indicative of the social sensitivity of the domain²² and pressures to appear nonprejudiced⁴².

Study 2B

Study 2B was procedurally identical to Study 2A, with an additional between-participant manipulation of target gender. As such, the male target condition was a direct replication of Study 2A, whereas the female target condition constituted a test of robustness and generalizability²⁴. In addition, the gender contrast is also of theoretical interest given that men are often assumed to be the main targets (as well as the main perpetrators) of intergroup conflict and prejudice⁴⁵.

The results are shown in Fig. 5. Similar to Study 2A, on the implicit evaluation measure, participants exhibited a small but statistically significant preference for the American over the non-American target, t(1039) = 5.96, p < 0.001, Cohen’s d = 0.18, BF₁₀ = 1.31 × 10⁶. The same result was also confirmed in a Bayesian mixed-effects model accounting for stimulus effects, β₀ = 0.19 [0.06; 0.32]. Target gender produced no significant effect, t(1022.6) = 0.77, p = 0.443, Cohen’s d = 0.05, BF₀₁ = 10.76, attesting to the generalizability of the findings from Study 2A.

Mirroring Study 2A, explicit evaluations were dissociated from implicit evaluations. Specifically, participants did not show any explicit preference between the American and non-American targets, t(1008) = -0.16, p = 0.875, Cohen’s d < 0.01, BF₀₁ = 27.86. The same result also emerged in a Bayesian mixed-effects model accounting for stimulus effects, β₀ = -0.01 [-0.09; 0.08]. Unlike on the implicit evaluation measure, target gender produced a statistically significant effect, t(1006.8) = 2.37, p = 0.018, Cohen’s d = 0.15, BF₁₀ = 1.11. However, given that the effect was small, and the Bayes Factor remained inconclusive, we refrain from interpreting this result.

Study 2C

In the United States, who is perceived to be American or non-American is heavily racialized^29,31,46. As such, Study 2C was designed to investigate whether the social category of race moderates the effects observed in Studies 2A–2B. Whereas Studies 2A–2B included only White targets, participants in this study were randomly assigned to learn about two Asian, two Black, two multiracial, or two White targets. However, we did not make the racial group membership of the two targets explicit, and a manipulation check item administered at the end of the study revealed that 58% of participants did not categorize both faces as intended. Therefore, we treat the present study as a further test of generalizability across stimulus materials and revisit the issue of race effects in Study 2D. We additionally report analyses by target race in Supplementary Results.

The results are shown in Fig. 6. Consistent with prior studies, on the implicit evaluation measure, participants exhibited a small but statistically significant preference for the American over the non-American target, t(986) = 11.39, p < 0.001, Cohen’s d = 0.36, BF₁₀ = 8.37 × 10²⁴. The same result was also confirmed in a Bayesian mixed-effects model accounting for stimulus effects, β₀ = 0.34 [0.26; 0.44]. This finding attests to the robustness of the implicit preference for American over non-American targets, even following a minimal learning manipulation.

Unlike in previous studies, explicit and implicit evaluations were characterized by similar mean levels. Specifically, mirroring implicit evaluations, participants exhibited an explicit preference for the American over the non-American target, t(967) = 3.97, p < 0.001, Cohen’s d = 0.13, BF₁₀ = 86.80. The same result also emerged in a Bayesian mixed-effects model accounting for stimulus effects, β₀ = 0.12 [0.03; 0.20]. However, we note that the explicit evaluation effect was about one third the size of the parallel implicit evaluation effect and small by conventional standards.

Study 2D

Study 2D was similar to Study 2C but also included an initial racial categorization task designed to explicitly teach participants the two focal targets’ racial category membership. Thanks to the inclusion of this categorization task, manipulation check accuracy improved to 77% from 42% in Study 2C. As such, in the present study, we are able to investigate the effect of the target race variable, as intended.

The results are shown in Fig. 7. Consistent with Studies 2A–2C, on the implicit evaluation measure, participants exhibited a small but statistically significant preference for the American over the non-American target, t(1190) = 10.22, p < 0.001, Cohen’s d = 0.30, BF₁₀ = 1.65 × 10²⁰. The same result was also confirmed in a Bayesian mixed-effects model accounting for stimulus effects, β₀ = 0.29 [0.21; 0.36]. In addition, the present results also underscore the generalizability of the pro-American/anti-foreigner bias across target race, given the very strong evidence that we obtained for the lack of any effect associated with this variable, F(3, 1187) = 0.02, p = 0.996, η² < 0.01, BF₀₁ = 420.55. The same results emerged among the subset of participants with perfect manipulation check performance (see Supplementary Results).

Unlike in Studies 2A–2B but as in Study 2C, explicit and implicit evaluations were characterized by similar mean levels. Specifically, mirroring implicit evaluations, participants exhibited an explicit preference for the American over the non-American target, t(1155) = 4.28, p < 0.001, Cohen’s d = 0.13, BF₁₀ = 287.55. The same result also emerged in a Bayesian mixed-effects model accounting for stimulus effects, β₀ = 0.13 [0.06; 0.19]. Also similar to implicit evaluations, target race did not modulate the labeling effect, F(3, 1152) = 0.67, p = 0.571, η² < 0.01, BF₀₁ = 159.60.

Study 3A

Studies 1A–2D investigated the cognitive underpinnings of implicit bias against non-Americans, including the evaluative effects of different labels (Studies 1A–1C) and the downstream consequences of applying such labels to specific individuals (Studies 2A–2D). In the final set of studies, we turned to probing the correlates of the American/foreign evaluative bias, and a related bias preferentially linking White Americans to the concept of Americanness and Asian Americans to the concept of foreignness^29,31,32, at the level of individual participants (Study 3A) and at the level of U.S. counties (Study 3B). Together, these studies probe whether, and to what extent, the relevant measures of explicit and implicit evaluation, whose basic cognitive properties we investigated in Studies 1A–2D, are predictive of relevant social behaviors.

The aims of Study 3A, which we conducted at the level of individual participants, were threefold. First, we examined whether the American/foreign evaluative IAT used in Studies 1A–2D was associated with anti-immigrant policy positions (e.g., opposition to the existence of sanctuary cities), thus providing a measure of criterion validity. To ensure correspondence with Study 3B, we asked participants to respond to 12 policy items modeled after the 18 real-world ballot initiatives featured in that study. Second, the archival data used to index regional anti-foreigner bias in Study 3B did not include a measure of American/foreign implicit evaluations but rather featured an IAT measuring White/Asian–American/foreign stereotypes. As such, we also included this stereotype IAT as a potential predictor of anti-immigrant policy preferences in the present study. Third, Study 3A also allowed us to investigate whether and to what extent the American/foreign–good/bad evaluation IAT included in Studies 1A–2D and the conceptually related White/Asian–American/foreign stereotype IAT available in the archival data used in Study 3B are related to each other.

The evaluative IAT used in Studies 1A–2D was significantly related to immigration policy views such that a stronger American–good/foreign–bad bias predicted more anti-immigrant policy preferences, β = 0.23 [0.17; 0.29], t(941) = 7.30, p < 0.001. Although the evaluative IAT and the stereotype IAT were significantly correlated with each other, r = 0.15, t(964) = 4.59, p < 0.001, performance on the stereotype IAT did not predict policy preferences at the individual level, β = 0.05 [-0.01; 0.12], t(941) = 1.58, p = 0.114. The effects remained virtually unchanged when both IATs were used to predict policy preferences simultaneously.
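The correlation between the two IATs and its test statistic are linked by r = t / √(t² + df). The sketch below applies this identity to the reported t(964) = 4.59; the function name `r_from_t` is our own, and the calculation assumes a standard Pearson correlation test.

```python
import math


def r_from_t(t: float, df: int) -> float:
    """Pearson correlation implied by its t statistic: r = t / sqrt(t^2 + df)."""
    if df <= 0:
        raise ValueError("df must be positive")
    return t / math.sqrt(t * t + df)


# Evaluative IAT vs. stereotype IAT in Study 3A: t(964) = 4.59
r = r_from_t(4.59, 964)
print(round(r, 2))  # 0.15
```

The implied correlation is slightly below 0.15 before rounding, which is consistent with the reported r = 0.15.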

Attesting to the unique predictive validity of the evaluative IAT, the effects of this variable on immigration policy views remained significant in a model that additionally included two measures of American–good/foreign–bad explicit evaluations and two measures of Asian–foreign/White–American explicit stereotypes, β = 0.16 [0.10; 0.22], t(872) = 5.33, p < 0.001. The two explicit evaluation measures also had unique effects of β = 0.29 [0.22; 0.35], t(872) = 8.11, p < 0.001, and β = 0.24 [0.18; 0.32], t(872) = 6.93, p < 0.001, respectively, whereas neither the implicit nor the explicit stereotype measures were significantly associated with ballot preferences. The fact that the evaluative IAT showed any incremental predictive validity over and above the explicit evaluation and stereotype measures is noteworthy given that, unlike those measures, the IAT does not share any method variance with the policy preference variable, which was measured via self-report⁴⁷. In addition, given that self-reports are highly controllable, participants had ample opportunity to ensure that their responses across the explicit evaluation and policy preference items were internally consistent with each other⁴⁸.

Together, these analyses suggest that the American/foreign–good/bad evaluation IAT has a unique effect in predicting relevant policy preferences above and beyond parallel explicit evaluation items and the White/Asian–American/foreign stereotype IAT. Although the stereotype IAT did not produce any effects at the individual level, in Study 3B we turn to investigating whether this IAT predicts actual voting patterns (rather than hypothetical policy preferences) at the level of U.S. counties. The evaluative IAT could not be included in this study because it was not available at the regional level.

Study 3B

Although Study 3A had the benefit of allowing for inferences about individual participants, the policy preferences measured were hypothetical, thus limiting the external validity of the design. As such, in Study 3B, we turned to investigating the ecological correlates of anti-foreigner bias in the United States using a real-world outcome as the criterion measure. Specifically, this study probed the relationship between county-level aggregates of the White/Asian–American/foreign stereotype IAT, obtained using archival data from the Project Implicit educational website (http://implicit.harvard.edu/)^40,41, and anti-immigrant vote shares in an exhaustive set of 18 real-world ballot initiatives from ten different states over a 28-year period between 1994 and 2022. Descriptive statistics for this study are available in Supplementary Table 5.

Aggregating across the 18 ballot initiatives, we found a significant meta-analytic relationship between the White/Asian–American/foreign IAT and anti-immigrant vote share (see Fig. 8), β = 0.24 [0.16; 0.32], z = 5.86, p < 0.001. In contrast, county-level explicit Asian–foreign/White–American bias did not have a significant meta-analytic effect, β = -0.07 [-0.16; 0.01], z = -1.75, p = 0.081. The relationship between county-level implicit bias and anti-immigrant vote share remained significant after controlling for county-level explicit bias, and the size of the effect was virtually unchanged, β = 0.28 [0.20; 0.37], z = 6.84, p < 0.001. These results provide evidence for the predictive value of anti-foreigner implicit bias for real-world outcomes.
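The z statistics above can be approximately recovered from the point estimates and their 95% intervals. Assuming symmetric Wald-type intervals (an assumption on our part, since the interval method is not stated in this section), the standard error is (upper − lower) / (2 × 1.96) and z is the estimate divided by that standard error. The helper name `z_from_ci` is hypothetical.

```python
Z_95 = 1.959964  # two-sided 95% critical value of the standard normal


def z_from_ci(estimate: float, lower: float, upper: float) -> float:
    """Approximate z statistic from an estimate and its 95% Wald interval.

    Assumption: the interval is estimate +/- Z_95 * SE, so SE = width / (2 * Z_95).
    Rounded inputs make the result approximate.
    """
    if upper <= lower:
        raise ValueError("upper bound must exceed lower bound")
    se = (upper - lower) / (2.0 * Z_95)
    return estimate / se


# Implicit stereotype predicting anti-immigrant vote share: 0.24 [0.16; 0.32]
print(round(z_from_ci(0.24, 0.16, 0.32), 1))  # 5.9
```

Because the estimate and interval bounds are rounded to two decimals, the recovered value only approximates the reported z = 5.86.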

We found that all heterogeneity in the effect of implicit bias on anti-immigrant votes could be accounted for by the methodological strength of the data pertaining to each individual ballot initiative, including restriction of range issues in the dependent variable⁴⁹ and the precision with which the independent variable was measured. Specifically, the relationship was stronger for initiatives with more county-level variability in anti-immigrant vote share, b = 4.65 [1.30; 8.01], z = 2.72, p = 0.007, and for initiatives with a larger median by-county sample size, b = 0.0009 [0.0002; 0.0017], z = 2.38, p = 0.017.

Demographic variability (Studies 1A, 1B, 2A–2D, and 3A)

Finally, we combined data from Studies 1A, 1B, 2A–2D, and 3A to probe demographic correlates of American/foreign implicit and explicit evaluations measured in those studies. To this end, we fit mixed-effects models to the data, with random intercepts for studies, separately for the two dependent measures. We used likelihood ratio tests to determine improvements in model fit. Relevant descriptive statistics are reported in Supplementary Table 6.

Participants’ race, age, place of birth (US vs. non-US), parents’ place of birth (US vs. non-US), and the language spoken at home (English, English and some other language, or only non-English) did not have significant effects on implicit bias against non-Americans. Participant gender had a significant effect, χ²(2) = 30.95, p < 0.001, such that male participants exhibited a stronger bias than did both female participants, β = 0.14, t(5076) = 5.30, p < 0.001, and participants of other genders, β = 0.27, t(5075) = 2.58, p = 0.010. Women and participants of other genders did not differ from each other, β = 0.14, t(5074) = 1.30, p = 0.194. We also observed a significant effect of participant ideology, χ²(1) = 46.30, p < 0.001, such that conservative participants exhibited a stronger bias than did liberal participants, b = 0.09, t(4983) = 6.82, p < 0.001. These demographic effects are in line with well-established trends from the relevant literature⁴¹. At the same time, we note that the pro-American/anti-foreigner bias remained significant even among female and strongly liberal participants.
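The likelihood ratio tests reported here compare nested models using a χ² statistic whose degrees of freedom equal the number of added parameters. For df = 1 and df = 2 the upper-tail probability has a closed form, so the reported p values can be checked with the standard library alone. The sketch below is illustrative; the function name `chi2_sf` is our own, and other degrees of freedom are deliberately left unhandled.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) of a chi-square variable, df = 1 or 2.

    df = 1: P = erfc(sqrt(x / 2)); df = 2: P = exp(-x / 2).
    """
    if x < 0:
        raise ValueError("chi-square statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 and df = 2 are handled in this sketch")


# Participant gender, chi2(2) = 30.95; participant ideology, chi2(1) = 46.30
p_gender = chi2_sf(30.95, 2)
p_ideology = chi2_sf(46.30, 1)
print(p_gender < 0.001, p_ideology < 0.001)  # True True
```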

Similar to implicit evaluations, explicit evaluations were moderated by participant ideology, χ²(1) = 102.40, p < 0.001, such that conservative participants exhibited stronger anti-foreigner biases than did liberal participants, b = 0.15, t(4850) = 10.17, p < 0.001. Unlike for implicit evaluations, highly liberal participants expressed an outgroup preference, whereas highly conservative participants expressed an ingroup preference of equivalent size. No other demographic effects on explicit evaluations were significant.

General discussion

In Studies 1A–1C we demonstrated a significant and very large implicit (automatic) preference for the label “American” over non-American labels including “alien,” “foreigner,” and “noncitizen.” In Studies 2A–2D, we applied these labels to specific targets and showed that an experimental manipulation consisting of as little as 30 stimulus pairings was sufficient to induce implicit negativity toward these targets relative to control targets paired with the label “American.” Study 3A provided evidence for predictive validity at the individual level by showing that the American–good/foreign–bad implicit bias was significantly associated with anti-immigrant policy preferences. Finally, in Study 3B, a conceptually and statistically related bias preferentially linking White Americans over Asian Americans to Americanness significantly predicted actual anti-immigrant voting patterns at the level of U.S. counties across ten states.

Across all nine studies reported here, we obtained evidence of robust implicit anti-foreigner biases in the United States. Although these results are in line with theoretical perspectives from the intergroup relations literature emphasizing the ubiquity of ingroup preference as a fundamental motive of human social cognition and behavior^50,51, the pervasive nature of implicit anti-foreigner bias documented here is still noteworthy: The bias emerged both toward abstract labels (Studies 1A–1C) and specific individuals (Studies 2A–2D) as well as both at the level of individual participants (Study 3A) and U.S. counties (Study 3B). The result also generalized across different versions of the label that intuitively differ from each other in valence (“alien,” “foreigner,” and “noncitizen”), across target demographics, including White male (Study 2A), White female (Study 2B), and Asian, Black, and multiracial male targets (Studies 2C–2D), and across participant demographics, despite some heterogeneity in results by gender and political ideology. Remarkably, participants’ personal and family history of immigration also did not moderate these effects, possibly indicative of the quick assimilation of individuals into U.S. society, whose cultural values heavily emphasize the idea of American exceptionalism⁵².

By contrast, the results involving explicit evaluations were more variable, likely reflecting the social sensitivity of the domain²² and pressures to appear nonprejudiced⁴². For example, whereas participants expressed an outgroup preference in Studies 1A–1C, explicit evaluations were neutral in Studies 2A–2B (which involved White targets) and exhibited an ingroup preference in Studies 2C–2D (which involved targets of multiple races). Finally, although explicit anti-foreigner biases were moderately predictive of policy views at the individual level (Study 3A), they were uncorrelated with actual voting patterns at the regional level (Study 3B). Together, these findings attest to the value of using a combination of self-report and indirect measures to understand the antecedents and correlates of human social behavior⁵³, especially against the backdrop of theoretical perspectives that center ingroup preference as an essential and consistent driver of human intergroup cognition and behavior^50,51.

The present work also raises some theoretical puzzles and opens up new avenues for empirical inquiry. First, given the relative nature of the Implicit Association Test (IAT) used to measure implicit evaluations and stereotypes in this work, it is not entirely clear to what extent the present results were driven by ingroup preference (i.e., positivity toward Americans), outgroup derogation (i.e., negativity toward non-Americans), or a combination of both. We chose to use the shorthand “anti-foreigner bias” to refer to the pattern of results reported above because Studies 3A and 3B provide clear evidence for the predictive validity of the relevant IATs in the context of anti-immigrant (rather than pro-American) behaviors. Nonetheless, future work could obtain more direct evidence on the relative contributions of ingroup preference versus outgroup derogation to the patterns of implicit evaluation and stereotyping obtained here.

Second, implicit evaluations of both the abstract labels “alien,” “foreigner,” and “noncitizen” and the individuals to whom those labels had been applied were negative. However, the average effect size was an order of magnitude larger for the former than for the latter. Why and how applying a label to a particular target decreases the evaluative strength of that label is an intriguing open question. Resolving this open question may be informed by past social–cognitive work on the “dilution effect”⁵⁴ suggesting that stereotypes are often stronger in the context of abstract groups than they are when applied to specific individuals.

Third, the correlation between the American/foreign–good/bad evaluation IAT and the White/Asian–American/foreign stereotype IAT was relatively modest. This pattern of results is unexpected given previous findings of robust correlations between implicit evaluations and stereotypes^55,56 and is thus ripe for further exploration. It should be noted, however, that whereas past work has compared IATs with the same categories but different attributes (e.g., White/Asian–good/bad and White/Asian–smart/dumb), in the present studies both the categories (American/foreign vs. White American/Asian American) and the attributes (good/bad vs. American/foreign) differed from each other. This feature of the design may have further reduced correspondence⁵⁷ between the two IATs and thus depressed the correlation between them.

Fourth, some of the patterns that emerged with respect to predictive validity merit empirical follow-up. Specifically, it is presently unclear why explicit bias was predictive of anti-immigrant policy views at the level of individuals but not at the level of geographic regions. Similarly, more work is needed to understand why the White/Asian–American/foreign stereotype IAT had no predictive validity at the individual level while being significantly (and uniquely) predictive of actual voting patterns at the regional level. Notably, the American/foreign stereotype IAT and the good/bad evaluative IAT were both available and thus allowed for a direct comparison only in the individual-level Study 3A. Given that the outcome measure in Study 3B consisted of anti-immigrant voting behavior in general (rather than toward Asian Americans in particular), the general evaluative IAT may have been even more predictive of such behaviors than the stereotype IAT that is specific to Asian Americans⁵⁷, had it been available in the aggregate-level archival data.

Such future work may also be able to contribute to the more general theoretical question of the relationship between (implicit) social group evaluations and stereotypes. Whereas implicit evaluations are often conceptualized as mental links between social group targets (such as White Americans and Asian Americans) and positive or negative valence, implicit stereotypes are usually thought of as containing additional information on specific semantic dimensions (such as smart vs. dumb, safe vs. dangerous, or American vs. foreign). Although these two constructs are conceptually distinct, the empirical relationship between them has been repeatedly investigated, with conflicting findings^55,56,58. Understanding under what conditions the evaluative and stereotype IATs included in the present project are relatively more or less highly related to each other may help move this debate forward. However, as noted above, unlike most relevant work, the evaluative and stereotype IATs included in the present studies as a result of data availability differed both in their categories (American vs. non-American and White American vs. Asian American) and in their attributes (good vs. bad and American vs. foreign), thus potentially creating an unfair test of the evaluation–stereotype relationship.

These open questions notwithstanding, the present work provides robust evidence for both the cognitive underpinnings and ecological correlates of pervasive anti-foreigner biases among U.S. citizens. These biases emerged both in the abstract and toward particular targets and both at the level of individual participants and at the level of geographic units. Notably, the implicit White–American/Asian–foreign bias was significantly and uniquely associated with the consequential real-world outcome of anti-immigrant vote share in ballot initiatives over the past 30 years. Among other goals, these ballot initiatives have sought, often successfully, to eliminate sanctuary cities, to exclude non-Americans from social services, and to further criminalize undocumented immigrants.

The present findings also dovetail with several recent reviews of the prejudice reduction literature^8,59,60, which have concluded that single-shot, light-touch interventions are unlikely to produce meaningful change in entrenched intergroup negativity. Specifically, in the context of the present studies, simply using different labels to refer to non-Americans did not eliminate or even significantly decrease the corresponding biases. As such, despite the good intentions leading to the removal of the term “alien” from the vocabulary of federal agencies, minimal steps of this kind are unlikely to yield meaningful reductions in anti-foreigner attitudes and behaviors. Rather, the prospect of positive change is likely predicated on a joint consideration of the dynamic interplay between the cognitive constraints characterizing human minds⁶¹ and the myriad forms of structural disadvantage and exclusion characterizing non-Americans’ everyday social environments in the United States^3,4,5.