Introduction

In the information age, social media platforms play a crucial role in facilitating access to public opinions. The avalanche of available information has led to the implementation of information filtering to assist users in retrieving information that is relevant to their interests (Pariser, 2021; Scheffauer et al., 2021; Shehata and Strömbäck, 2021). Filtering of information at the individual level is commonly described as a filter bubble, a phenomenon where algorithm-driven mechanisms curate content that reinforces users’ pre-existing beliefs based on their prior behavior and preferences (Bakshy et al., 2015; Bessi, 2016). Consequently, individuals are more likely to be fed information that confirms their viewpoints and may choose to engage predominantly with like-minded peers while excluding dissenting perspectives. This process leads to an environment characterized by repetitive exposure to similar opinions. While filter bubbles primarily shape the information that each user receives through algorithmic personalization, they also influence user behavior, which can contribute to complex dynamics at the community or group level. In theoretical terms, this group-level phenomenon is commonly referred to as “echo chambers” and identified as a significant feature of the social ecologies in social media (Bessi, 2016; Villa et al., 2021).

The dynamics of echo chambers represent a complex phenomenon, manifesting in diverse forms across different platforms and contexts (Gao et al., 2023; Terren and Borge-Bravo, 2021) and diverging considerably between online and offline environments (Silver et al., 2019). Additionally, cross-cutting exposure, defined as exposure to dissimilar opinions (Mutz, 2006), has emerged as a critical concern, highlighting the dual nature of echo chamber effects. On the one hand, exposure to diverse viewpoints can foster tolerance and encourage constructive debates (Lyons and Sokhey, 2017; Min and Wohn, 2018). Exposure to a diverse array of news and information potentially allows individuals to consume varied perspectives and hence reduce segregation (Van Aelst et al., 2017). On the other hand, some studies suggest that the correlation between echo chambers and segregation among general social media users may be overstated (Dubois and Blank, 2018; Ross Arguedas et al., 2022). This viewpoint indicates that, although echo chambers with a low level of cross-cutting exposure may reinforce users’ beliefs and provide opportunities for like-minded individuals to engage deeply with their views, the effects of echo chambers can vary based on context and user engagement patterns (Terren and Borge-Bravo, 2021). Although the term “echo chamber” often carries a negative connotation in online discussion, not all activities on social media platforms fall into echo chambers and yield negative outcomes.

Psychological factors significantly influence the patterns of transmission and how individuals are attracted to opinion leaders (Winter and Neubaum, 2016). Extant literature indicates that dissemination of opinions has been largely driven by opinion leaders, irrespective of social media platforms and offline context, including politicians and state-related organizations. On the one hand, opinion leaders possess considerable authority in shaping political views and consolidating users into like-minded communities that propagate similar opinions, thus contributing to phenomena such as group polarization (Van Bavel et al., 2024; Wardle and Derakhshan, 2018). On the other hand, they may also play a role in mitigating polarization (Dubois et al., 2020; Wang et al., 2022). These findings challenge the prevailing notion that social media allows equal voice to all internet users. In reality, the influence of specific individuals on opinion diffusion remains disproportionately significant. However, existing studies often analyze online social networks at a fixed point of time, with limited attention accorded to their dynamic evolution in response to changes in public opinions (Avin et al., 2024). Notably, during significant events and the rise of hostile sentiments, inherent opinion leadership can escalate confrontations between user groups with conflicting interests, thereby squeezing out a genuine diversity of opinions as new centers of opinions sway others and congeal into new blocs of echo chambers.

The academic literature on echo chambers has predominantly concentrated on political contexts, focusing on the polarization of opinions, the increasing segregation between political camps, and the dissemination of misleading information (Cinelli et al., 2021; Hobolt et al., 2024). In the same vein, social media users are often categorized into distinct political groups to model partisan polarization based on their conversational preferences around policy issues (Barberá et al., 2015; Bessi, 2016). However, existing research on echo chambers is constrained in the sense that studies primarily focus on quantifying ideological segregation among users. This narrow focus risks oversimplifying complex social media behaviors into presumed political binaries. In reality, political discussions encompass a variety of interpersonal connections driven by individuals’ tendencies to associate with like-minded peers, which are mediated by selective exposure factors (McPherson et al., 2021), such as sentiment (Stiglitz and Dang-Xuan, 2013), topics (Aiello et al., 2012), and social contexts (Dahlgren, 2019). These dynamics can lead to varying extents of isolation among users (Van Bavel et al., 2024). Nevertheless, few studies acknowledge these nuanced variations, highlighting the need for a more comprehensive understanding of the interplay between political discussion and echo chambers on social media platforms.

Furthermore, the analytical unit of echo chamber effects on social media has predominantly focused on individuals. However, rather than actively proposing new opinions and engaging in discussions and debate, general users mainly act as passive recipients of information and share messages from opinion leaders, thereby contributing to the formation of cohesive communities (McClain et al., 2021; Song et al., 2024). These communities establish structured environments that facilitate emotional contagion and group polarization through member interactions. Essentially, communities serve as the primary units for the diffusion of information, offering valuable insights into transmission patterns and the dynamics of message dissemination influenced by opinion leaders who shape collective attention (Ercegovac and Tankosic, 2024; Ranalli and Malcom, 2023). However, the use of communities as the primary unit in investigating echo chamber effects remains relatively underexplored. Thus, analyzing community dynamics characterized by shared topics and sentiments, as well as interactions both within and across communities, is crucial for developing a nuanced understanding of the factors contributing to selective exposure and opinion segregation in the context of echo chamber effects.

In response to these challenges, this paper adopts a behavior-focused approach to examine echo chamber effects through information transmission among communities of like-minded users on social media spaces. Specifically, the identification of communities is grounded in clustering individuals according to their topological connections and the opinions they cite. We incorporate characteristics of selective exposure suggested in the literature, notably regarding sentiments and topics, to characterize communities and analyze their contribution to cross-cutting exposure. Our hypothesis is that while like-minded communities are prevalent in social media spaces, they do not necessarily lead to opinion polarization and total segregation. Instead, their impact is highly dependent on the dynamic interplay of diverse sentiments and topics that these user communities collectively engage with. To test this argument, we explore the relationships between communities, opinion leaders, sentiment, and topics around a politically contentious issue across two periods. Specifically, we aim to address the following three questions. (1) How do communities contribute to varying degrees of opinion isolation in social media? (2) Under what circumstances can different perspectives permeate into each other among communities? (3) How have communities evolved over time in response to new centers of opinions? Fig. 1 provides a visual overview of the data flow throughout the study’s analytical stages.

Fig. 1

A visual summary of the analytical workflow employed in this study.

This study focuses on information posted on Twitter related to China’s Belt and Road Initiative (BRI), a hotly debated developmental strategy that may potentially redefine the distribution of geopolitical power between China, an emerging global power, and the West. Launched in 2013 by Chinese President Xi Jinping, the BRI is a transcontinental project connecting China with over 155 partner countries by means of infrastructure investment and economic cooperation. In recent years, the BRI has been increasingly perceived as central to contemporary geopolitical competition, both at the ideological and material levels. Ideologically, concerns have been raised in Western countries about whether the Chinese state will leverage the BRI to accumulate more legitimacy for its authoritarian political system and even disseminate such a political culture. Materially, the BRI is expected to increase China’s economic competitiveness through infrastructure and outbound investment, and for this reason has raised the alarm in major Western economies with regard to China’s expanding influence in key economic sectors in developing countries. In summary, the BRI has transformed from a primarily economic initiative to a highly politicized topic. Thus, the BRI provides a good lens of how echo chambers and political opinions are formed around a potentially contentious political issue, and it also presents a dynamic evolution from a relatively less contested economic initiative to a highly politicized controversy.

Following the outbreak of the COVID-19 pandemic in January 2020, the BRI became an even more controversial topic, adding to the suspicion surrounding China and the Chinese state. On the one hand, the BRI drew a massive amount of criticism, with critics contending that transportation infrastructures built under the BRI facilitated the spread of viruses across the globe (Tahmasbi et al., 2021). On the other hand, China attempted to boost its image as a responsible global player and to enhance local acceptance of the BRI by delivering medical and economic aid during the pandemic, furthering China’s geopolitical clout and some countries’ dependency on China (Mouritz, 2020; Schulhof et al., 2022). These controversies present a rare opportunity to investigate how political discourses about the BRI helped to shape the evolving field of geopolitical opinions implicating China, and how echo chamber effects in social media have co-evolved with the changing sentiments.

This study analyzed public opinions on the BRI and their diffusion on Twitter. On Twitter, users post 280-character messages or replies called tweets, which can be reposted by multiple users with options for adding comments. A plain repost is referred to as a retweet, which simply carries viewpoints expressed in the original tweets to wider audiences, and it also represents a recognition, if not endorsement, of informational value (Bruns and Burgess, 2011). A repost with additional comments is termed a quote tweet, wherein users can post 280-character messages, including hashtags and multimedia objects. Therefore, in this paper, a tweet can be a message, a reply, or a quote tweet.

The current study is based on a publicly available dataset compiled in 2022, comprising 500,711 tweets and 714,794 retweets related to the Belt and Road Initiative (BRI) (Man et al., 2022). The data were collected using five English-language keyword searches on Twitter. While Twitter hosts a multilingual user base, English remains a common language on the platform, and the predominance of English tweets in the dataset allows for an examination of political discourses shared across different contexts (Alshaabi et al., 2021). These data span the period from 2013 to 2021. To align with our research scope, we implemented a two-stage filtering process. Initially, the dataset was constrained to a specific timeframe capturing public discourses from September 7, 2013 (the date of the BRI’s announcement) through August 31, 2020, during the early COVID-19 pandemic. Subsequently, we conducted further refinement to eliminate noise and irrelevant content within this interval, facilitating a more precise analysis of the evolution of public opinion and echo chamber dynamics related to the BRI before and during the initial stages of the pandemic. From this refined dataset, two subsets were extracted, each encompassing an 8-month period, with 1 January 2020 used as the break point between them: from 1 May 2019 to 31 December 2019 (60,207 tweets and 70,603 retweets), and from 1 January 2020 to 31 August 2020 (52,636 tweets and 87,585 retweets) (see Method 1). In this study, the former and the latter periods are named P1 and P2, respectively, for subsequent comparative analyses (Fig. 1).
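The split into P1 and P2 can be illustrated with a minimal sketch. It uses only the period boundaries stated above; the function name `assign_period` and the sample records are hypothetical and are not taken from the dataset or from the authors' pipeline.

```python
from datetime import date

# Period boundaries as stated in the text (inclusive on both ends).
P1_START, P1_END = date(2019, 5, 1), date(2019, 12, 31)
P2_START, P2_END = date(2020, 1, 1), date(2020, 8, 31)


def assign_period(posted_on):
    """Return 'P1', 'P2', or None for a tweet's posting date."""
    if P1_START <= posted_on <= P1_END:
        return "P1"
    if P2_START <= posted_on <= P2_END:
        return "P2"
    return None  # outside both 8-month windows


# Hypothetical records (tweet id, posting date), for illustration only.
tweets = [
    ("t1", date(2019, 4, 30)),
    ("t2", date(2019, 12, 31)),
    ("t3", date(2020, 1, 1)),
    ("t4", date(2020, 9, 1)),
]
periods = {tweet_id: assign_period(day) for tweet_id, day in tweets}
print(periods)
```

Treating both boundaries as inclusive is an assumption; the text gives the dates but not how boundary timestamps or time zones were handled.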

Results

Sentiment and topic evolution in BRI Tweets

To measure the sentiment polarity and identify the conversation topic of each BRI tweet in P1 and P2, we employed rule-based sentiment analysis with the pre-trained VADER lexicon (Hutto and Gilbert, 2014) and machine-learning topic modeling with Latent Dirichlet Allocation (LDA) (Blei et al., 2003), respectively (See Method 1&2). The number of tweets, retweets, and the average sentiments of tweets over the two periods are plotted in Fig. 2. The selection of optimal topic models and their keyword composition are illustrated in Fig. 3.
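The topic-model selection summarized in Fig. 3 relies on an elbow criterion over the number of topics \(k\) and the coherence score \(c_v\). The sketch below shows one common elbow heuristic: pick the point farthest from the straight line joining the first and last points of the curve. This is an assumption about how an elbow might be located, not necessarily the procedure the authors used, and the coherence values are invented for illustration.

```python
def elbow_k(ks, scores):
    """Return the k whose (k, score) point lies farthest from the chord
    joining the first and last points, after scaling both axes to [0, 1]."""
    if len(ks) < 3 or max(scores) == min(scores):
        raise ValueError("need at least three points and a non-flat curve")
    x_lo, x_hi = ks[0], ks[-1]
    y_lo, y_hi = min(scores), max(scores)
    xs = [(k - x_lo) / (x_hi - x_lo) for k in ks]
    ys = [(s - y_lo) / (y_hi - y_lo) for s in scores]
    # Direction of the chord from the first to the last point.
    dx, dy = xs[-1] - xs[0], ys[-1] - ys[0]
    norm = (dx * dx + dy * dy) ** 0.5
    # Perpendicular distance of every point from that chord.
    dists = [abs(dy * (x - xs[0]) - dx * (y - ys[0])) / norm
             for x, y in zip(xs, ys)]
    return ks[dists.index(max(dists))]


# Hypothetical coherence scores for k = 2..12 (not the paper's values).
candidate_ks = list(range(2, 13))
hypothetical_cv = [0.31, 0.36, 0.41, 0.45, 0.48, 0.50,
                   0.51, 0.515, 0.52, 0.522, 0.523]
print(elbow_k(candidate_ks, hypothetical_cv))
```

Scaling both axes first keeps the result independent of the units of \(k\) and \(c_v\); other elbow definitions can select a different \(k\) on the same curve.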

Fig. 2

Number of tweets, retweets, and average sentiment on a monthly basis from May 2019 to August 2020.

Fig. 3: Summary of topic model optimization and keywords composition.

Right: By using the elbow technique to locate the best trade-off between \(k\) and \(c_v\), a model of nine topics and a model of seven topics were found optimal in P1 and P2, respectively. Left: The top four major topics and their ten most representative words are shown to analyze the evolution of major public opinions, covering 89% and 96.5% of tweets posted before and during the early stage of the COVID-19 pandemic, respectively (See Figs. S4–S5 and Tables S4–S7 for distribution of topics and keywords).

We further investigated the temporal changes of public opinion on the BRI by comparing the distribution of topics and their keyword compositions in P1 and P2, and we found that the opinions were diversified in P2 on account of the following observations (Topics are represented in abbreviated form to provide a concise presentation. For instance, topic 1 in P1 is denoted as P1-T1). Firstly, compared with 74% of tweets about trade and investment in P1 (P1-T7), only 34% of tweets pertaining to these aspects were captured in P2 (P2-T6). To a large extent, P2-T6 carried the same semantics as P1-T7, given that half of the top representative words overlap. Secondly, “BRI Criticism” (P2-T3) featured 35% of tweets in P2 related to accusations against the BRI, such as conspiracy theories about the Chinese Communist Party as a mastermind behind the pandemic, allegations of mass incarceration in Xinjiang, and the suppression of democracy in Hong Kong. Finally, “Victoria-China Controversy” (P2-T7) characterized 24% of tweets in P2, focusing on the BRI agreement between China and Victoria State in Australia, a recently emerging topic that heightened the sensitive nature of partnerships with the BRI in Western liberal democracies.

Information transmission in social networks

To capture the patterns of information transmission between users, we modeled the social networks of retweets in P1 and P2 and found that the network density values for both periods are minimal, at less than 0.01 (See Method 3.1-3.2). This implied an uneven pattern of information transmission among users, namely that retweets were mainly sourced from a small portion of users. Hence, to identify major information sources, we measured each user’s out-degree centrality (ODC) by counting the number of users that each user connected with in terms of disseminating information over P1 and P2 (See Method 3.3). We aggregated users’ ODC values and found that 65% of total ODC was contributed by the top 1% of users with the highest ODC. Thus, this demonstrated that the general users were not inter-connected in transmitting opinions but only cited information from 1% of users, named “disseminators” (with 645 and 525 users in P1 and P2, respectively) in this study, who assumed the role of opinion leaders (See SI2 for details on Disseminators’ Identities and Influence).
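The three quantities used in this paragraph (network density, out-degree centrality, and the share of total ODC held by the top users) can be sketched as follows. The edge direction, from the retweeted source to the retweeting user, and all names and sample edges are assumptions made for illustration; the authors' exact definitions are in Method 3.

```python
import math
from collections import defaultdict


def out_degree(edges):
    """Map each source user to the number of distinct users who retweeted them."""
    audiences = defaultdict(set)
    for source, retweeter in edges:
        audiences[source].add(retweeter)
    return {user: len(others) for user, others in audiences.items()}


def density(edges):
    """Directed density: distinct edges divided by n * (n - 1) possible edges."""
    users = {user for edge in edges for user in edge}
    n = len(users)
    return len(set(edges)) / (n * (n - 1)) if n > 1 else 0.0


def top_share(odc, n_users, fraction=0.01):
    """Share of total ODC contributed by the top `fraction` of all users."""
    k = max(1, math.ceil(n_users * fraction))
    ranked = sorted(odc.values(), reverse=True)
    return sum(ranked[:k]) / sum(ranked)


# Hypothetical (source, retweeter) pairs, for illustration only.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("e", "b")]
odc = out_degree(edges)
print(density(edges), odc, top_share(odc, n_users=5, fraction=0.2))
```

Users who are never retweeted have an ODC of zero and are simply absent from the `odc` mapping, which is why the total user count is passed to `top_share` separately.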

Given the disseminators’ dominance of opinion diffusion, a holistic understanding of tweet transmission can be drawn from the relationships between disseminators and other users, in terms of the communities of opinions structured around these connections. To that end, we employed the Louvain method to group all users into communities based on their retweet connections (Blondel et al., 2008), and we identified 2138 \(\left(Q=0.81\right)\) and 1967 \(\left(Q=0.83\right)\) communities in P1 and P2, respectively (Table S8), with a high modularity (\(Q > 0.8\)) indicating the validity of this analysis (See Method 3.4).
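The Louvain method searches for a partition that maximizes the modularity \(Q\) reported above. The sketch below does not implement Louvain itself; it only computes \(Q\) for a given partition, using the standard Newman formulation on an undirected, unweighted graph. Treating the retweet network as undirected and unweighted is a simplifying assumption, and the toy graph is hypothetical.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity Q = sum over communities of L_c / m - (d_c / (2m))^2,
    where L_c is the number of internal edges, d_c the total degree of the
    community's nodes, and m the number of edges in the graph."""
    m = len(edges)
    internal = defaultdict(int)
    degree_sum = defaultdict(int)
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Hypothetical toy graph: two triangles joined by a single bridging edge.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("d", "e"), ("e", "f"), ("d", "f"),
             ("c", "d")]
toy_partition = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
print(modularity(toy_edges, toy_partition))
```

Values of \(Q\) close to 1 indicate that far more edges fall inside communities than a random graph with the same degrees would produce, which is the sense in which \(Q > 0.8\) is read as a well-separated partition.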

Although numerous communities were detected, nearly 95% (2014 and 1870 communities in P1 and P2, respectively) of them were topologically disjointed from other communities. In contrast, 3% of the communities contained the vast majority (>80%) of users and >90% of retweets, involving 59 communities (with 37,714 users and 64,117 retweets) in P1 and 48 communities (with 45,289 users and 80,818 retweets) in P2, respectively. In this study, these communities were named “Main communities”.

In the context of sentiment and topic analysis in Main communities, the majority of retweets in P1 (85.53%) are classified as “Trade and Investment”. In the case of the Main communities in P2, the distribution of “BRI Criticism”, “Trade and Investment”, and “Victoria-China Controversy” amounts to 27.6%, 34.89%, and 28.37%, respectively. The average sentiment values associated with the primary topics in both P1 and P2 Main communities fall within the range of −0.05 to 0.1 (Fig. S11). Comparing the Main communities with the disjointed communities, we found that users in disjointed communities exhibited more positive sentiments about the BRI with higher interests in marginalized topics (See SI3). As such, a small number of users were dispersed across a large number of isolated “islands” of opinions, while the overwhelming majority of users were concentrated in a discursive “continent” comprising fewer and larger communities of opinions (Fig. 4).

Fig. 4: The illustrative retweet topology on Twitter, exemplified by the retweet networks of P2.

The graph was generated by using ForceAtlas2, a graph generation algorithm to cluster the densely connected nodes nearby and drive sparsely connected nodes out of the center in a 2-D plane (Garrett, 2013). Users in the Main communities are colored light rose, while the rest of the users are colored gray. The node sizes are directly proportional to users’ ODC. The larger a node, the more connections a user had in information dissemination (See SI3 for graph generation details).

To unravel patterns of information transmission among the Main communities, we further analyzed and compared the disseminators and other users’ frequencies of being sources and recipients of information (i.e., the origin and the destination of a retweet thread) (Table S9), and we found that >=70% of intra-community retweets (71% in P1 and 70% in P2) and >=75% of inter-community retweets (77% in P1 and 76% in P2) in Main communities were diffused from the disseminators to other users across P1 and P2. For the composition of disseminators across P1 and P2, we found that only 136 disseminators in both P1 and P2 overlapped temporally, accounting for 21% and 25% of disseminators in P1 and P2, respectively. This demonstrated that the disseminators played the single most important role in the diffusion of opinions both within and across communities across two periods. With the significant shift of major topics in P2, new disseminators engaged in retweet networks, replacing 75% of P1’s disseminators as the dominant sources of retweets and swaying the opinions of other users.

Intra- and inter-community retweet dynamics and semantic correlations

To analyze the patterns of retweets at intra- and inter-community levels, we quantified the distributions of sentiments and topics in Main communities by applying six indicators (see Table 1, Method 4, and Figs. S12–S19 for indicators’ distributions) and we found that intra- and inter-community retweets were moderately sentimentalized on average, given the average levels of “IU-sentiment” (0.2 in P1 and 0.12 in P2) and “CU-sentiment” (0.14 in P1 and 0.09 in P2). Also, retweets at the community levels generally pertained to a singular topic (i.e., purity > 0.5), given the average levels of “I-purity” (0.85 and 0.56 in P1 and P2) and “C-purity” (0.89 in P1 and 0.5 in P2). This demonstrated the general patterns of information transmission at the community level: communities represented the essential unit that held users together by their preference for certain topics and sentiments, carried in the information percolated from disseminators, who have overwhelming influences on congealing collective attention to emotion-laden and relatively monotonous topics.
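The exact definitions of the six indicators are given in Table 1 and Method 4, which are not reproduced here. As a hedged illustration of the reading "purity > 0.5 means a singular dominant topic", the sketch below assumes that purity is the share of a community's retweets that belong to its most frequent topic. That definition, the function name, and the sample labels are assumptions.

```python
from collections import Counter


def topic_purity(topic_labels):
    """Assumed definition: fraction of retweets carrying the most common topic."""
    counts = Counter(topic_labels)
    return max(counts.values()) / len(topic_labels)


# Hypothetical topic labels of 20 retweets circulating inside one community.
community_topics = ["P1-T7"] * 17 + ["P1-T3"] * 3
print(topic_purity(community_topics))
```

Under this assumed definition, a value above 0.5 means that a single topic accounts for more than half of the retweets, while a value near 1/(number of topics) indicates an even mix.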

Table 1 Indicators of intra- and inter-community semantics.

As to the relationships between intra- and inter-community retweets, we found that intra-community exchanges accounted for >85% of all retweets in the networks identified by the Louvain method (86% and 89% in P1 and P2, respectively. Table S9) (Blondel et al., 2008). Also, compared to inter-community retweets, more sentimentalized opinions were found in the intra-community retweets across the two periods (Two-tailed t-test on IU-sentiment and CU-sentiment. See SI4). On the one hand, the limited frequencies of inter-community exchanges showed that communities were indeed to an extent isolated from one another. On the other hand, non-disseminator users were implicated in the highly enclosed conversational spaces within communities and exposed to sentimentalized and monotonous information, as opposed to less sentimentalized information transmitted at the inter-community level, attesting to the existence of echo chamber effects.

To investigate the patterns of intra-community exchanges, we applied Pearson correlation (R) to test the linear relationships between the distributions of major topics (i.e., ratios of the number of each topic’s tweets to all retweets within communities) and “I-sentiments”. We found a directly proportional relationship between community sentiments and the “Trade and Investment” topic (Fig. 5). In P2, the higher the distribution of “Trade and Investment” in tweets, the more positive the sentiment exhibited (Fig. 6). By contrast, we found a highly negative correlation between communities’ sentiments and the distribution of “BRI Criticism” (P2-T3) (Fig. 6). In P1, the more retweets about “Trade and Investment” circulated within communities, the more positive the sentiments users were repetitively exposed to within those communities. In P2, higher frequencies of retweeting “BRI Criticism” meant that users were more likely to be assimilated into communities permeated by less positive moods, whereas, as in P1, a higher distribution of “Trade and Investment” meant that general users were repetitively exposed to more positive moods. Thus, communities’ sentiments were closely correlated with certain topics, and such relationships co-evolved with the oscillations of political discourses.
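The correlation test described above can be sketched with a plain implementation of Pearson's R between a topic's share of retweets in each community and that community's sentiment. The per-community values below are hypothetical and serve only to show the computation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical per-community values: topic share and mean sentiment.
topic_share = [0.10, 0.35, 0.60, 0.80, 0.95]
community_sentiment = [-0.05, 0.02, 0.10, 0.18, 0.21]
print(pearson_r(topic_share, community_sentiment))
```

A value near +1 corresponds to the positive relationship reported for “Trade and Investment”, and a value near −1 to the negative relationship reported for “BRI Criticism”; the coefficient captures linear association only.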

Fig. 5

The correlations between the “Trade and Investment” topic’s sentiments and communities’ sentiment in P1(left) and P2(right).

Fig. 6: The correlations between topic purity and communities’ sentiment.

“BRI Criticism” in P2 (left) and “Trade and Investment” in P2 (right).

Community fragmentation in retweet networks

To analyze the patterns of inter-community retweets, we applied Fragmentation (F) to measure the level of opinion inter-penetration between each pair of communities. F, which ranges over [−1, 1], compares the number of boundary nodes’ connections (i.e., users who diffused opinions to both of two communities) and the number of internal edges (i.e., retweets that circulated within one community exclusively) (See Method 5). The higher the F, the lower the opinion transitivity between two communities, and vice versa. We found that the distribution of F was significantly skewed and prone to variations (Fig. 7. See SI5 for details), given that 75% of community pairs (1283 in P1 and 846 in P2) were highly fragmented (F > 0.7 and F > 0.5 in P1 and P2, respectively), while <25% of pairs (319 and 287 community pairs in P1 and P2, respectively) had denser topological connections and were subject to lower levels of fragmentation (F < 0.5). In other words, despite the limited traffic of inter-community exchanges in general, the varied degree of fragmentation between communities raised the question of why the pattern of fragmentation was so uneven.

Fig. 7: Distributions of pair-wise communities F in P1 (Left) and P2 (Right).

Data samples are from a combination of communities in P1 and P2, with 1711 and 1128 pairs of communities analyzed, respectively.

To tease out the patterns of pair-wise communities’ isolation, we modeled the information transmission networks at the community level and further measured the fragmentation-weighted Eigenvector centrality (EC) for ranking each community’s levels of cross-cutting exposure. In the networks, nodes represent communities, and edges are weighted by the inverse normalized F (Method 5). The higher the EC, the lower the level of isolation the community of users had, and vice versa. We further applied Spearman’s ρ to test the correlations between EC and topic distribution and found moderately positive correlations between P1-EC and “Trade and Investment” (P1-T7) and between P2-EC and “BRI Criticism” (P2-T3) (Fig. 8). The same analytical framework was applied to investigate the correlations between EC and sentiment (i.e., indicators “I-sentiment” and “C-sentiment”) and no significant correlation was detected. The correlations between EC and topics across the two periods highlighted the temporal patterns of cross-cutting exposure: inter-community interactions tended to occur between communities that shared a high volume of certain topics, regardless of sentiment levels. In P1, the higher the proportion of retweets transmitted on the topics of trade and investment, the less isolated users were. By contrast, in P2, the higher the ratio of information regarding risks and accusations retweeted in communities, the more likely communities were to converse with each other.
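A fragmentation-weighted eigenvector centrality can be sketched with power iteration on an undirected, weighted community graph. The text specifies only that edges are weighted by the "inverse normalized F"; the mapping used below, weight = 1 − (F + 1) / 2, is one possible reading and is an assumption, as are the community names and F values.

```python
import math


def eigenvector_centrality(weights, iterations=200, tol=1e-10):
    """Power iteration on an undirected weighted graph.

    `weights` maps (u, v) pairs to non-negative edge weights. Each step
    applies (A + I) rather than A, which leaves the leading eigenvector
    unchanged but avoids oscillation on bipartite graphs.
    """
    nodes = sorted({node for pair in weights for node in pair})
    scores = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        updated = dict(scores)  # the "+ I" part of (A + I)
        for (u, v), w in weights.items():
            updated[u] += w * scores[v]
            updated[v] += w * scores[u]
        norm = math.sqrt(sum(value * value for value in updated.values()))
        updated = {node: value / norm for node, value in updated.items()}
        if max(abs(updated[node] - scores[node]) for node in nodes) < tol:
            return updated
        scores = updated
    return scores


# Hypothetical pair-wise fragmentation values F in [-1, 1].
fragmentation = {("A", "B"): -0.2, ("B", "C"): 0.1, ("A", "C"): 0.9}
# Assumed weighting: rescale F to [0, 1], then invert so low F = strong tie.
edge_weights = {pair: 1.0 - (f + 1.0) / 2.0 for pair, f in fragmentation.items()}
ec = eigenvector_centrality(edge_weights)
print(ec)
```

With this weighting, a community that is weakly fragmented from other well-connected communities receives a high EC, matching the interpretation of high EC as a high level of cross-cutting exposure.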

Fig. 8: The fragmentation-weighted networks of communities, exemplified by communities in P2 (left) and correlations between EC and topic distribution in P1 and P2 (right).

To highlight the EC, node sizes are directly proportional to the community’s EC. The larger and the darker a node, the higher levels of cross-cutting exposure a community had.

Community dynamics: cluster of opinion polarization

To holistically unravel the characteristics of communities and estimate the magnitude of opinion polarization, we applied hierarchical cluster analysis and the K-means algorithm to capture similar patterns of intra- and inter-community information transmission by grouping communities based on three indicators, namely “I-sentiment”, “I-purity”, and “EC” (Method 6). Hence, users have been grouped in such a way that those within each cluster had similar sentiment values, levels of topic purity, and levels of cross-cutting exposure. Accordingly, we detected four categories in P1 and three in P2 (Fig. 9) and grouped them into four cluster types based on their similarity (See Table 2 and Figs. 10–11 for indicators’ distributions).
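The K-means step can be sketched on three-dimensional points of the form (I-sentiment, I-purity, EC). This is a minimal stdlib version with a deterministic farthest-point initialization; it omits the hierarchical clustering step and any feature scaling, whose details are in Method 6, and the community values are hypothetical.

```python
def sqdist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=100):
    """Lloyd's algorithm with a deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points,
                           key=lambda p: min(sqdist(p, c) for c in centers)))
    labels = None
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda j: sqdist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels, centers


# Hypothetical communities as (I-sentiment, I-purity, EC) triples.
communities = [(0.60, 0.90, 0.05), (0.55, 0.95, 0.10),
               (0.05, 0.40, 0.80), (0.00, 0.45, 0.90)]
labels, centers = kmeans(communities, k=2)
print(labels)
```

In this toy input the first two communities resemble the sentimentalized, topically pure, low-EC profile and the last two a less sentimentalized, mixed-topic, high-EC profile, so they fall into separate clusters. Because the three indicators have different ranges, standardizing them before clustering is a common choice in practice.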

Fig. 9

The distributions of cluster types in P1 (left) and P2 (right).

Table 2 Variables mean of clusters in P1 and P2.

Fig. 10

The distribution of indicators across clusters in P1 (top) and P2 (bottom).

Fig. 11

The topic distributions by cluster in P1 (left) and P2 (right).

1. Polarized optimist

This cluster comprised 2305 users (10 communities) in P1 and 6269 users (10 communities) in P2 who mainly disseminated highly positive opinions on the trade and investment information of the BRI and were isolated from adjacent communities, with little exposure to less sentimentalized and more diverse information (i.e., low EC).

2. Outsider

The 2854 users (10 communities) in P1 disseminated emotionally neutral and slightly negative opinions of multiple topics, and they were highly insulated from adjacent communities’ opinions. These users deviated from the majority of users in P1, who were mainly engaged in information transmission around trade and investment.

3. Bridge

Bridges were communities of users who disseminated moderately positive or negative opinions on a specific topic with a notable presence of cross-cutting exposure (i.e., high EC). In P1, Bridges consisted of 32,582 users across 41 communities and were divided into two distinct types: Bridge-P1P (positive sentiment) and Bridge-P1N (negative sentiment). The former comprised users who expressed positive opinions on the trade and investment topic within their respective communities while also engaging in inter-community retweeting with both moderately positive and negative opinions. The latter exhibited the same pattern of inter-community conversations but maintained a moderately negative stance on the same topic within their respective communities. In P2, the Bridge encompassed 5879 users across 6 communities, who held moderately negative opinions on “BRI Criticism” while still displaying diverse interests that cut across various topics.

Therefore, Bridge users transmitted opinions on a singular topic within their intra-community settings while also remaining connected both to like-minded and differing opinions. These conversations extended beyond intra-community boundaries and intersected with communities exhibiting higher EC. The disseminators, who exerted significant influence in the transmission of opinion exchanges across communities, facilitated this cross-cutting dynamic.

4. Diverse majority

The diverse majority consisted of 33,141 users across 32 communities in P2 who disseminated less sentimentalized opinions (ranging from slightly positive to slightly negative) regarding multiple topics. In P2, 73% of users mainly disseminated and were exposed to less sentimentalized opinions on a wide range of topics. Temporally, however, the proportion of users in Polarized Optimist increased from 6% in P1 to 14% in P2, showing that more users were polarized as the pandemic heightened the sensitive nature of BRI. Nonetheless, we also detected a counter-polarization mechanism in the Diverse majority, wherein the majority of users in P2 were less sentimentalized, interested in multiple topics of information (i.e., low Purity), and permeable to dissimilar opinions, reinforcing the diverse interests towards the BRI (Fig. 10).

Indeed, although users in general tend to repetitively retweet and encounter like-minded opinions at the intra-community level, we found that the overall public opinions around the BRI were neither polarized nor highly sentimentalized. Taking the sizes of clusters into account, we estimated that <15% of users (2305 and 6269 users in Polarized Optimist in P1 and P2, respectively) were susceptible to opinion polarization. Contrary to the common belief that the echo chamber effect acts as a mechanism for intensifying divisions among people, its influences are contingent on the topics and sentiments of information that are produced by and transmitted from the disseminators to other users, which accounts for the highly diverse nature of users’ cross-cutting exposure to dissimilar opinions.

Discussion

In this study, we investigate an overarching question: How do communities of users contribute to varying degrees of opinion isolation in social media? To that end, we applied a behavior-focused approach to comprehensively model echo chamber effects around politically contentious issues, from identifying the variety of communities, to investigating their practical interactions, and to teasing out the factors that may potentially influence the percolation of dissimilar opinions among social media users.

By modeling the information transmission networks of public opinions around the BRI on Twitter before and during the early stage of the COVID-19 pandemic, we detected different communities and their respective discursive boundaries. These boundaries are generally exclusive to dissimilar sentiments and topics of information from adjacent communities. Our analysis revealed that public opinion mainly diffused from <1% of users (i.e., disseminators) to general users. These disseminators functioned as the dominant source of, and the most critical bridges for, (dis)similar information transmitted both within and across communities in forming collective opinions. Aligning with the characteristics of social networks described in the extant literature, the network mapping in this study reveals an inherently hierarchical and uneven information flow in social media spaces, in particular highlighting how influence is concentrated among key actors, which critically shapes the formation and polarization of public opinion.

Through an interrogation of how opinion transmission occurred at both intra- and inter-community levels, respectively, we have identified two key findings. Firstly, intra-community exchanges were more sentimentalized and topically monotonous compared to inter-community exchanges, highlighting the echo chamber effects observed in this study. This finding also highlights the presence of enclosed discursive circles, characterized by the dominance of topically repetitive and emotionally charged opinions propagated by opinion leaders.

Secondly, using correlation analysis at the community level, we discovered a nuanced variation in communities’ cross-cutting exposure, which correlated with the distribution of specific topics and sentiments. By exploring the relationship between sentiment and topic distribution within communities, our study provides valuable insights into the mechanisms of emotional contagion through member interactions. Notably, users who predominantly retweeted highly positive opinions related to the trade and investment dimensions of the Belt and Road Initiative (BRI) demonstrated heightened susceptibility to both sentimental and opinion polarization.

The findings presented in this paper represent a novel effort to understand the contours of echo chamber effects on social media. While echo chambers persist over time within certain communities, the characteristics of communities are diverse, as reflected by the varying sentiments and topics that circulate. Our analysis estimated that <15% of users are prone to opinion polarization, whereas the majority (>50%) of users exhibit neither topical nor sentimental polarization. Even during the early stage of the pandemic, most users engaged with a broad spectrum of topics and displayed lower sentiment levels, indicating a variety of interests related to the BRI and suggesting that this diversity acts as a counter-polarization mechanism. At a theoretical level, the term “echo chamber” is often employed in the existing literature to describe the polarizing dynamics of partisan politics in social media spaces, but fewer studies have critically examined the validity of this proposition. Our finding aligns with the perspective that ordinary users may not be co-opted by the echo chamber effect or be confined to purely ideological spaces (Bruns, 2017; De Vries et al., 2018). By identifying distinct clusters of communities and measuring their affiliated user size, this study seeks to advance the field by modeling the magnitude of opinion polarization, which appears contingent upon the diversity of opinions, topics, and sentiments exchanged among individuals. Overall, this paper demonstrates that opinion polarization and counter-polarization coexisted in the discussion of politically contentious topics on social media across the two analyzed periods, rather than social media solely intensifying divisions among individuals.

At the methodological level, this paper demonstrates the scalability and integration of analytical units for examining echo chamber effects on social media by combining the quantification of selective exposure factors, specifically sentiment and topic, in dyadic interactions at the individual level with the modeling of echo chamber effects at the community level. Echo chambers emerge within insulated environments composed of like-minded individuals, necessitating a comprehensive approach to identify the conditions that promote insulation from dissimilar viewpoints at the community or group scale. Recent studies have recognized the importance of in-group and out-group dynamics to better understand echo chamber effects and provide insight into the underlying causes of polarization (Hobolt et al., 2024). To advance understanding of echo chamber effects, we propose that social media conversations be examined through two complementary interaction levels: individual selective exposure and community-level interactions. This approach includes not only intra-community exchanges but also inter-community interactions, where investigating cross-cutting dynamics is essential to capturing the complexity of social media environments.

In sum, we believe that this study represents one of the first steps in developing a novel research agenda by demonstrating how macroeconomic initiatives and crisis-driven discourse reshape social networks temporally, create new opinion leaders, and influence opinion polarization. In this vein, our research suggests questions and goals for future research. We highlight three: (1) examining the relationships between echo chambers and community dynamics based on different forms of interactions among users, including replies, mentions, and quotes; (2) extending discussions on the relationship between echo chamber effects and diverse political agendas and concerns, as well as different types of agents; and (3) developing comparative research across different political issues and across different social media platforms.

Methods

1. Data Collection

The raw data in this article were obtained from a public repository published in 2022 (Man et al., 2022). The referenced dataset was collected over a two-day period, from 10 December to 11 December 2021, by searching for terms that have been commonly applied in news media about the BRI, including “belt and road”, “one belt one road”, “new silk road”, “maritime silk road”, and “silk road economic belt”. Any tweet that contained one of those phrases or its hashtag version was downloaded, yielding public opinions of the BRI on Twitter from 7 September 2013 to 30 November 2021. Specifically, all data available in the “01. Tweets” and “02. Retweets” sections of the dataset were included, resulting in 500,711 tweets and 714,794 retweets. In accordance with Twitter’s data sharing policies, tweet authorship and content were anonymized and represented by unique identifiers (IDs). We subsequently used these tweet IDs to retrieve the tweet content and the usernames of their authors via the Twitter API v2 under the academic research access level, which permits retrieval of up to 10 million tweets and is sufficient to cover the dataset used in this study.

Furthermore, a hybrid computational and manual identification process was employed in the referenced dataset to eliminate irrelevant tweets from the raw data. Irrelevant tweets arose due to the ambiguity of the search keywords, such as tweets about “seat belt and road safety” or the homonym “silk road,” which refers to an online marketplace on the dark web. These unrelated tweets were identified through Boolean-based data filtering combined with manual verification to exclude content with overlapping keywords. Seven irrelevant and four definitive terms were used to filter 1564 potentially unrelated tweets, which were then independently reviewed by two raters, yielding substantial inter-rater agreement (Cohen’s κ = 0.68). Only tweets unanimously deemed irrelevant were removed (Man et al., 2022).

Data processing

Based on the downloaded raw data, we performed a two-stage filtering process based on timeframes to first narrow down a relevant time window and then refine the dataset by removing noise or irrelevant data within that window. This approach enhanced data quality and ensured more accurate and meaningful analyses from large public repositories.

In the first stage, we aimed to examine general changes in sentiment and conversation topics before and during the early stage of the COVID-19 pandemic. Therefore, we selected tweets from 7 September 2013—the date when the BRI was first announced by Chinese President Xi Jinping—up to 31 August 2020, corresponding to the early phase of the pandemic. After this filtering, 440,191 tweets and 558,156 retweets were retained. In total, these 998,347 messages were contributed by 314,948 Twitter users.

To identify conversational topics within the BRI-related tweets, Latent Dirichlet Allocation (LDA) was employed. LDA is a generative statistical algorithm that models probabilistic relationships among documents, topics, and words to capture underlying content (Blei et al., 2003) (See SI1.1 for details). For sentiment analysis, the VADER tool, a rule-based method, was applied to assess sentiment polarity (see Method 2). To analyze textual content on a comparable basis, all tweets and retweets included in the LDA and VADER analyses were translated into English. The dataset predominantly consists of English-language tweets, reflecting the initial focus on English-language keywords; 92.31% of tweets were identified as English in the raw data (column “text_lang_ft” in the “01. Tweets” database). Among the remaining tweets, approximately 31,320 were classified as non-English. These non-English tweets were translated into English using Google Translate, a machine translation service noted for its ability to preserve the connotations of words across languages (Baccianella et al., 2010).

In the second filtering stage, two subsets of tweets, from 1 May 2019 to 31 December 2019 (P1) and from 1 January 2020 to 31 August 2020 (P2), were selected to investigate public opinions about the BRI in detail. These periods were selected for practical reasons. Firstly, P1 and P2 are temporally continuous, providing a suitable window to observe the evolution of echo chamber effects before and after the beginning of the COVID-19 pandemic. Secondly, two peaks of tweets observed in May 2017 and April 2019 during the major international BRI Forums were excluded from the selected datasets, preventing the estimation of echo chamber effects in this study from being skewed toward particular BRI events. Thirdly, we chose January 2020 as the start of P2 because keywords on the BRI and the COVID-19 pandemic first co-appeared on Twitter on 1 January 2020, when words such as “coronavirus” and “pneumonia outbreak” were posted as users expressed anxiety about a media statement published by the World Health Organization (WHO) country office on a case of viral pneumonia in China. Additionally, this date coincided with a sharp drop in sentiment observed from January 2020 onward (see SI1 and Fig. S1 for details). These anomalies may be caused by critical changes in the creation and diffusion of opinions on social networks and require an in-depth investigation. In total, P1 included 60,207 tweets and 70,603 retweets, while P2 comprised 52,636 tweets and 87,585 retweets.
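The two-stage, timeframe-based filtering described above can be sketched as follows. This is only an illustrative sketch: the function names `in_study_window` and `assign_period` are hypothetical, and the only values taken from the text are the window boundaries.

```python
from datetime import date

# Stage 1: overall study window (BRI announcement to early pandemic phase).
STUDY_START, STUDY_END = date(2013, 9, 7), date(2020, 8, 31)

# Stage 2: the two sub-periods analyzed in detail.
PERIODS = {
    "P1": (date(2019, 5, 1), date(2019, 12, 31)),
    "P2": (date(2020, 1, 1), date(2020, 8, 31)),
}


def in_study_window(day):
    """Return True if a tweet date falls inside the stage-1 window."""
    return STUDY_START <= day <= STUDY_END


def assign_period(day):
    """Return 'P1' or 'P2' for a tweet date, or None if it is in neither."""
    for name, (start, end) in PERIODS.items():
        if start <= day <= end:
            return name
    return None
```

A tweet dated during the April 2019 BRI Forum peak passes the first stage but is assigned to neither P1 nor P2, which matches the exclusion of that peak described in the text.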

2. Sentiment Analysis

The sentiment values of tweets were evaluated using the rule-based analysis tool Valence Aware Dictionary and sEntiment Reasoner (VADER), wrapped in the “vaderSentiment (version 3.3.1)” Python package, which measures emotional polarity and intensity by averaging the sentiment scores of all words in a tweet based on a pre-evaluated sentiment lexicon (Hutto and Gilbert, 2014). Although sentiments in texts can be recognized by different tools such as SentiWordNet (Freeman, 2002), VADER was applied in this paper owing to its relatively high accuracy for analyzing sentiments in social media content, including sentiment-laden symbols, slang, and emoticons. In this article, all translated tweets were assessed and labeled with a value between −1 and +1, ranging from the most negative to the most positive opinions.

3. Social Network Analysis

Social Network Analysis (SNA) was applied in this study to model and analyze the reposting patterns of the BRI tweets.

Network modeling

A network comprises a set of nodes and edges. In this study, nodes represent users, and edges represent retweets between users. Because retweets are unidirectional, a message is shared one-way from its author to its retweeters. An illustrative example can be found in Table 3. If a tweet is originally posted by user X and reposted by user Y, an edge is created from X to Y. When this retweet attracts Z (a follower of Y), who thus cites X, another edge is created. Furthermore, the number of retweets between two users is counted: if X creates a new tweet that is again reposted by Y, the edges between them are aggregated by reposting frequency. Accordingly, we modeled the retweet networks of users. The network in P1 consists of 45,621 nodes and 59,602 edges. The network in P2 consists of 52,425 nodes and 74,675 edges.

Table 3 Example of edge formation in the retweet network.
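The edge-formation rule can be illustrated with a short sketch. It assumes that each retweet is available as an (original author, retweeter) pair; the function name `build_weighted_edges` and the sample records are hypothetical.

```python
from collections import Counter


def build_weighted_edges(retweets):
    """Aggregate (author, retweeter) pairs into directed edges.

    The returned Counter maps each directed edge (author -> retweeter)
    to its reposting frequency, which serves as the edge weight.
    """
    return Counter((author, retweeter) for author, retweeter in retweets)


# Hypothetical records: Y reposts X twice, Z reposts X once.
sample = [("X", "Y"), ("X", "Z"), ("X", "Y")]
edges = build_weighted_edges(sample)
nodes = {user for edge in edges for user in edge}
```

Here the two reposts of X by Y collapse into a single edge with weight 2, so the sample yields three nodes and two edges.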

Network density

To analyze how dense a network is in terms of edge connectivity, graph density was employed. It represents the ratio between the edges present in a network and the maximum number of possible edges in the network. The value of density ranges from 0 to 1, with the lower limit corresponding to no connection in a network and the upper limit denoting a network with all possible connections between nodes. The closer the value is to 1, the more cohesive the nodes in a network, which means information can flow more easily between nodes. In this paper, we calculated network density based on the density formula for a directed graph, and the definitions of variables are presented in Table 4:

$${{DEN}}_{G}=\frac{\left|E\right|}{\left|V\right|\times (\left|V\right|-1)}$$
(1)
Table 4 List of variables applied in calculating network density.
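As a quick check of equation (1), the sketch below evaluates the directed density for the node and edge counts reported for P1. The function name `directed_density` is illustrative, and no output value is asserted beyond what the formula gives.

```python
def directed_density(n_nodes, n_edges):
    """Equation (1): |E| / (|V| * (|V| - 1)) for a directed graph."""
    if n_nodes < 2:
        return 0.0  # density is undefined for fewer than two nodes; 0 by convention here
    return n_edges / (n_nodes * (n_nodes - 1))


# Node and edge counts of the P1 retweet network as reported in the text.
p1_density = directed_density(45621, 59602)
```

With these counts the density is on the order of 10^-5, which reflects how sparse a retweet network of this size is relative to the number of possible directed connections.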

Opinion leaders

The popularity of each user in the retweet networks was measured by using degree centrality and betweenness centrality (equation 2) (Clauset et al., 2004). Betweenness centrality measures the probability that the shortest path between two randomly chosen users passes through a given user (in this study, an information bridge). To compare the differently sized networks in P1 and P2, the probability was further normalized by dividing by the maximal number of pairs of directed flows between all users in the respective period.

Additionally, one’s actual number of users connected in the network was measured by using degree centrality (ODC) (Clauset et al., 2004), which was further divided into four indicators, namely intra-cluster-in-degree (I_ID), intra-cluster-out-degree (I_OD), inter-cluster-in-degree (C_ID), and inter-cluster-out-degree (C_OD) to illustrate the frequency of disseminating and encountering information in intra- and inter-clusters retweets (see Table 5).

$${BC}\left(i\right)=\frac{1}{\left(n-1\right)\left(n-2\right)}\mathop{\sum}\limits_{j\ne i,\,k\ne i,\,j\ne k}\frac{{N}_{{sp}}(j{\to }^{i}k)}{{N}_{{sp}}(j\to k)}$$
(2)
Table 5 Definitions of symbols applied in centrality measures.
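Equation (2) can be illustrated with a brute-force sketch for small directed graphs. It makes two assumptions that are not stated in the text: path length is counted in hops (edge weights are ignored), and the graph is small enough for an all-pairs computation. The function names are hypothetical, and this is not necessarily the implementation used in the study.

```python
from collections import deque


def shortest_path_counts(adj, source):
    """BFS from source: hop distance and number of shortest paths to each reachable node."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma


def betweenness(adj):
    """Normalized betweenness centrality of every node, following equation (2)."""
    nodes = sorted(set(adj) | {v for targets in adj.values() for v in targets})
    n = len(nodes)
    info = {s: shortest_path_counts(adj, s) for s in nodes}
    result = {}
    for i in nodes:
        dist_i, sigma_i = info[i]
        total = 0.0
        for j in nodes:
            dist_j, sigma_j = info[j]
            if j == i or i not in dist_j:
                continue
            for k in nodes:
                if k in (i, j) or k not in dist_j or k not in dist_i:
                    continue
                # i lies on a shortest j -> k path iff the distances add up.
                if dist_j[i] + dist_i[k] == dist_j[k]:
                    total += sigma_j[i] * sigma_i[k] / sigma_j[k]
        result[i] = total / ((n - 1) * (n - 2)) if n > 2 else 0.0
    return result


# Hypothetical chain X -> Y -> Z: only Y bridges a pair of other users.
chain = betweenness({"X": ["Y"], "Y": ["Z"]})
```

In the three-user chain, Y sits on the single shortest path from X to Z, so its normalized score is 1 / ((3 − 1)(3 − 2)) = 0.5, while X and Z score 0.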

Network partitioning

Nodes in the retweet networks were classified to find communities of like-minded users. Conceptually, an ideal partitioning recognizes groups of users that are mutually exclusive, wherein nodes within the same cluster have dense connections, but nodes in different clusters have sparse connections. In this study, weights between nodes are characterized by their reposting frequency as discussed in 3.1.

The Louvain algorithm, an optimization method that partitions graphs by maximizing the modularity (Q), was applied in partitioning (Blondel et al., 2008). Specifically, modularity, which ranges over [−1,1], measures the fraction of edges within groups minus the fraction expected if edges were distributed at random. The greater the value of modularity, the more robust the community structure. As a rule of thumb, Q > 0.3 implies that a significant community structure is detected (Girvan and Newman, 2002). The formula of modularity is shown in equation (3). \(m\) denotes the number of edges in the observed network; \({A}_{{ij}}\) represents the reposting frequency between nodes \(i\) and \(j\); \({d}_{i}\) and \({d}_{j}\) are the degrees of nodes \(i\) and \(j\); \(\delta ({c}_{i},{c}_{j})=1\) if nodes \(i\) and \(j\) are in the same community, and 0 otherwise.

$$Q=\frac{1}{2m}\mathop{\sum}\limits_{i,j}\left[{A}_{{ij}}-\frac{{d}_{i}{d}_{j}}{2m}\right]\delta \left({c}_{i},{c}_{j}\right)$$
(3)
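To make equation (3) concrete, the following sketch evaluates Q by the direct double sum over node pairs. It assumes an undirected (symmetric) weighted adjacency structure, which is the form in which equation (3) is written; the helper `symmetric_adjacency` and the toy graph of two triangles joined by one edge are hypothetical.

```python
from itertools import product


def symmetric_adjacency(edges):
    """Build a symmetric dict-of-dicts A_ij from (u, v, weight) triples."""
    adjacency = {}
    for u, v, w in edges:
        adjacency.setdefault(u, {})[v] = w
        adjacency.setdefault(v, {})[u] = w
    return adjacency


def modularity(adjacency, community):
    """Equation (3): Q = 1/(2m) * sum_ij [A_ij - d_i d_j / (2m)] * delta(c_i, c_j)."""
    degree = {i: sum(neighbors.values()) for i, neighbors in adjacency.items()}
    two_m = sum(degree.values())  # every edge is counted from both endpoints
    q = 0.0
    for i, j in product(adjacency, repeat=2):
        if community[i] == community[j]:
            q += adjacency[i].get(j, 0.0) - degree[i] * degree[j] / two_m
    return q / two_m


# Hypothetical toy graph: triangles a-b-c and d-e-f joined by the edge c-d.
toy = symmetric_adjacency([
    ("a", "b", 1), ("b", "c", 1), ("a", "c", 1),
    ("d", "e", 1), ("e", "f", 1), ("d", "f", 1),
    ("c", "d", 1),
])
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
q_split = modularity(toy, split)
q_single = modularity(toy, {node: 0 for node in toy})
```

For the two-triangle partition, Q = 5/14 ≈ 0.357, whereas placing all nodes in a single community gives Q = 0, since the observed and expected within-community weights then coincide.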

Initialization of the Louvain method requires assigning every node a unique community label. Subsequently, the optimization of modularity comprises two phases:

  A. Modularity optimization

    • A1. Order the nodes and, for each node \(i\), move \(i\) to the community of the neighbor \(j\) that leads to the maximum increase in Q \((\Delta Q)\)

    • A2. If \(\Delta Q \le 0\), \(i\) remains in its current community

    • A3. Loop through all nodes repeatedly until \(\Delta Q=0\)

  B. Community aggregation

    • B1. Create a weighted network of the communities from Phase A, wherein nodes represent those communities and the weight of edges marks the number of links between communities

    • B2. Apply the processes in Phase A to the resulting network until \(\Delta Q=0\)
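The modularity-optimization phase (Phase A) can be sketched as below. This is a deliberately naive illustration, not the optimized Louvain implementation: it recomputes Q from scratch for every candidate move, treats the graph as undirected, and omits the aggregation phase (Phase B). The function names and the toy graph are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Q for an undirected weighted edge list of (u, v, weight) triples."""
    degree, two_m = defaultdict(float), 0.0
    for u, v, w in edges:
        degree[u] += w
        degree[v] += w
        two_m += 2 * w
    # Observed within-community weight, counted over ordered pairs.
    inside = sum(2 * w for u, v, w in edges if community[u] == community[v])
    totals = defaultdict(float)
    for node, d in degree.items():
        totals[community[node]] += d
    expected = sum(t * t for t in totals.values()) / two_m
    return (inside - expected) / two_m


def local_moving(edges):
    """Phase A: move nodes between neighboring communities while Q increases."""
    nodes = sorted({n for u, v, _ in edges for n in (u, v)})
    neighbors = defaultdict(set)
    for u, v, _ in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    community = {n: n for n in nodes}  # initialization: one community per node
    improved = True
    while improved:
        improved = False
        for i in nodes:
            original = community[i]
            best_c, best_q = original, modularity(edges, community)
            for c in sorted({community[j] for j in neighbors[i]} - {original}):
                community[i] = c  # tentatively move i and evaluate Q
                q = modularity(edges, community)
                if q > best_q + 1e-12:
                    best_c, best_q = c, q
            community[i] = best_c  # keep the best move, or stay put
            improved = improved or best_c != original
    return community


# Hypothetical toy graph: triangles a-b-c and d-e-f joined by the edge c-d.
toy_edges = [("a", "b", 1), ("b", "c", 1), ("a", "c", 1),
             ("d", "e", 1), ("e", "f", 1), ("d", "f", 1), ("c", "d", 1)]
found = local_moving(toy_edges)
```

On the toy graph the procedure settles on the two triangles as separate communities. Practical implementations compute \(\Delta Q\) incrementally instead of re-evaluating Q, which is what makes the method usable on networks with tens of thousands of nodes.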

Eigenvector centrality

We applied the eigenvector centrality (EC) to rank each community’s levels of cross-cutting exposure. Given the different levels of fragmentation (F) among community pairs, the edges of community networks were further weighted by using the inverse normalized F. Hence, the higher EC, the lower level of isolation the community of users had, and vice versa.

The calculation of EC is unpacked as follows. Consider a graph G := (V,E) with \(\left|V\right|\) vertices. Let A = (\({a}_{v,t}\)) be the adjacency matrix (i.e., \({a}_{v,t}=1\) if vertex v is linked to vertex t, and \({a}_{v,t}\,\)= 0 otherwise), let \(M(v)\) be the set of neighbors of v, and let \(\lambda\) be a constant (i.e., the eigenvalue). The relative centrality score \({x}_{v}\) of vertex v can be defined as:

$${x}_{v}=\frac{1}{{\rm{\lambda }}}\mathop{\sum}\limits_{t\,\in \,M(v)}{x}_{t}=\frac{1}{{\rm{\lambda }}}\mathop{\sum}\limits_{t\,\in \,{\rm{V}}}{a}_{v,t}{x}_{t}$$
(4)
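Equation (4) is typically solved by power iteration; a minimal sketch follows. It assumes an undirected weighted community network given as a dictionary of (v, t) pairs, and it iterates with A + I rather than A, a common device to avoid oscillation that leaves the eigenvector unchanged. The function name, the toy graph, and its weights are hypothetical and do not reproduce the fragmentation-based weights used in the study.

```python
import math


def eigenvector_centrality(weights, n_iter=200, tol=1e-10):
    """Power iteration for x_v = (1/lambda) * sum_t a_{v,t} x_t.

    weights maps each undirected pair (v, t), listed once, to a_{v,t}.
    """
    nodes = sorted({n for pair in weights for n in pair})
    x = {v: 1.0 for v in nodes}
    for _ in range(n_iter):
        new = dict(x)  # starting from x itself applies (A + I) instead of A
        for (v, t), a in weights.items():
            new[v] += a * x[t]
            new[t] += a * x[v]
        norm = math.sqrt(sum(value * value for value in new.values()))
        new = {v: value / norm for v, value in new.items()}
        converged = max(abs(new[v] - x[v]) for v in nodes) < tol
        x = new
        if converged:
            break
    return x


# Hypothetical community network: triangle A-B-C plus a peripheral node D tied to A.
scores = eigenvector_centrality({
    ("A", "B"): 1.0, ("A", "C"): 1.0, ("B", "C"): 1.0, ("A", "D"): 1.0,
})
```

In this toy network A scores highest, B and C tie, and the peripheral node D scores lowest, which mirrors the reading of EC in the text: the better connected a community is to other well-connected communities, the lower its isolation.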

4. Intra- and inter-community conversational patterns

In this study, six continuous variables were applied to characterize C-Main communities based on the topological and semantic patterns of retweet behaviors within and among them. Each variable is illustrated in Table 6 and explained in Table 7, where I-sentiment and C-sentiment mark the mean sentiment values of all retweets reposted within and across a community, respectively, ranging from −1 (the most negative) to 1 (the most positive). Next, I-purity and C-purity measure the alignment between topics and echo chambers, determining whether a single topic appears more frequently than others by calculating the ratio of the largest class of topics to the total number of retweets shared within and across a community, respectively (Bright, 2018). The maximum purity is 1, showing that all retweets pertain to a single topic.

Table 6 List of variables and respective formulas applied in clustering.
Table 7 Definitions of variables applied in clustering.
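The sentiment and purity variables can be illustrated with a small sketch. It assumes each retweet is a record of (author community, retweeter community, topic, sentiment) and, as a simplifying assumption, treats a retweet as "across" a community when exactly one of its endpoints belongs to it. The function names and all sample values are hypothetical.

```python
from collections import Counter


def purity(topics):
    """Share of retweets that belong to the most frequent topic."""
    return max(Counter(topics).values()) / len(topics)


def mean(values):
    return sum(values) / len(values)


def community_features(retweets, community):
    """Return I-sentiment, I-purity, C-sentiment and C-purity for one community."""
    intra = [r for r in retweets if r[0] == community and r[1] == community]
    cross = [r for r in retweets if (r[0] == community) != (r[1] == community)]
    return {
        "I-sentiment": mean([r[3] for r in intra]),
        "I-purity": purity([r[2] for r in intra]),
        "C-sentiment": mean([r[3] for r in cross]),
        "C-purity": purity([r[2] for r in cross]),
    }


# Hypothetical retweets: (author community, retweeter community, topic, sentiment).
sample_retweets = [
    ("c1", "c1", "trade", 0.8), ("c1", "c1", "trade", 0.6),
    ("c1", "c1", "health", 0.4), ("c1", "c2", "trade", -0.2),
    ("c2", "c1", "health", 0.0), ("c2", "c2", "health", 0.1),
]
features = community_features(sample_retweets, "c1")
```

For community c1 in this sample, two of the three internal retweets concern trade (I-purity = 2/3) with a mean sentiment of 0.6, while the two cross-community retweets split evenly between topics (C-purity = 0.5).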

5. Measuring fragmentation

The development of inter-community fragmentation (F) was inspired by the work of Steinley (2006), who devised a measure based on comparing the activities of boundary nodes (users who communicate with political figures from both groups) to the activities of internal nodes, who communicate solely within one group. Instead of focusing on connectivity between partisan agents, this study characterizes boundary nodes simply as content creators whose opinions were retweeted by members from both communities, aiming to evaluate the extent to which two communities shared mutual perspectives.

The calculation of F is exemplified in a toy network shown in Fig. 12. The network contains users who retweeted at least one of the others’ opinions and the directional diffusion of opinions among them. Users are partitioned into two communities, A and B, and two sets of nodes are created: internal nodes \({N}_{i}=\left\{A2,{A}3,{A}4,{B}1,B2,B3,B4\right\}\) and a boundary node \({N}_{b}=\left\{A1\right\}\). Accordingly, edges are classified by their node of origin, where \({E}_{i}=\left\{\left[A2\to A3\right],\,\left[A2\to A4\right],\left[B1\to B2\right],\left[B1\to B3\right],[B1\to B4]\right\}\), and \({E}_{b}=\left\{\left[A1\to A2\right],\,\left[A1\to A4\right],\left[A1\to B1\right]\right\}\). \(\left|{E}_{i}\right|\) and \(\left|{E}_{b}\right|\) mark the cardinality of \({E}_{i}\) and \({E}_{b}\), here 5 and 3, respectively. Hence, by equation (5), \(F=(5-3)/(5+3)=0.25\), indicating a mild degree of inter-community fragmentation.

$$F=\frac{\left|{E}_{i}\right|-\left|{E}_{b}\right|}{\left|{E}_{i}\right|+\left|{E}_{b}\right|}$$
(5)
Fig. 12 A toy network of inter-community fragmentation.
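The toy calculation can be reproduced with a short sketch. It assumes that edges are (author, retweeter) pairs and that a boundary node is any author retweeted by members of more than one community, following the definition above; the function name `fragmentation` is hypothetical.

```python
from collections import defaultdict


def fragmentation(edges, community):
    """Equation (5): F = (|E_i| - |E_b|) / (|E_i| + |E_b|)."""
    audiences = defaultdict(set)
    for author, retweeter in edges:
        audiences[author].add(community[retweeter])
    # Boundary nodes are authors whose retweeters span more than one community.
    boundary = {a for a, reached in audiences.items() if len(reached) > 1}
    e_b = sum(1 for author, _ in edges if author in boundary)
    e_i = len(edges) - e_b
    return (e_i - e_b) / (e_i + e_b)


# Edges of the toy network described in the text.
toy_edges = [
    ("A1", "A2"), ("A1", "A4"), ("A1", "B1"),  # from boundary node A1
    ("A2", "A3"), ("A2", "A4"),                 # internal to community A
    ("B1", "B2"), ("B1", "B3"), ("B1", "B4"),   # internal to community B
]
toy_community = {name: name[0] for edge in toy_edges for name in edge}
toy_f = fragmentation(toy_edges, toy_community)
```

With five edges originating from internal nodes and three from the boundary node A1, the function returns 0.25, in agreement with the value reported for the toy network.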

6. Clustering of communities

The K-means unsupervised clustering algorithm, using Euclidean distance and initialized with Ward’s method, was applied to cluster C-Main communities (Deng et al., 2014). Specifically, K-means is a non-hierarchical clustering method that can be incorporated with graph partitioning methods to develop a detailed understanding of users’ collective preferences in social networks (Hoffman et al., 2018; Ward, 1963). The objective of clustering is to maximize intra-cluster homogeneity based on three indicators, namely “I-sentiment”, “I-purity”, and “EC”. Thus, communities were classified into K clusters such that each community belongs to the cluster with the nearest mean (cluster centroid). To determine the a priori parameter K (i.e., the optimal number of clusters), we employed Ward’s method, which also helps to avoid the locally optimal solutions known to occur with K-means clustering (Jacomy et al., 2014; Steinley, 2006).

In the initialization of Ward’s method, each community was assigned a unique cluster. The selection of which two clusters to merge is based on which combination of clusters minimizes the intra-cluster heterogeneity based on the Euclidean distance. At each step, the two clusters combined are those that minimize the increase in the total sum of squares across all variables in all clusters.

Subsequently, the K-means algorithm was executed with the following steps:

  A. Define the number of clusters (K) obtained from Ward’s method

  B. Randomly place a centroid for each cluster

  C. For each data point (i.e., the echo chamber in this study), find the nearest centroid based on the Euclidean distance and assign the point to the corresponding cluster

  D. For each cluster, calculate the mean of all points assigned to it as the new centroid

  E. Repeat C and D until the end of a fixed number of iterations (10)
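Steps A to E can be sketched in plain Python as follows. This is an illustrative sketch only: K is assumed to be supplied (in the study it comes from Ward’s method, which is not reproduced here), the optional `init` argument is a hypothetical convenience for fixing the starting centroids, and the feature vectors are made-up values rather than results from the study.

```python
import math
import random


def kmeans(points, k, init=None, n_iter=10, seed=0):
    """K-means with Euclidean distance and a fixed number of iterations."""
    rng = random.Random(seed)
    # Step B: random initial centroids unless explicit ones are given.
    centroids = list(init) if init is not None else rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):  # Step E: fixed number of iterations
        # Step C: assign each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Step D: recompute each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical (I-sentiment, I-purity, EC) vectors for four communities.
communities = [(0.8, 0.9, 0.1), (0.7, 0.95, 0.15), (0.1, 0.3, 0.8), (0.05, 0.35, 0.9)]
labels, centroids = kmeans(communities, k=2, init=[communities[0], communities[2]])
```

In this example the two sentimentalized, topically pure, low-EC communities fall into one cluster and the two diverse, high-EC communities into the other. Because K-means depends on its starting centroids, fixing K beforehand with a hierarchical method is one way to reduce sensitivity to initialization.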

Data limitations

Since all data presented in this study were retrieved from Twitter’s database by using Twitter APIs, tweets and retweets deleted by Twitter due to the violation of Twitter’s policies (such as disseminating hateful or violent speech) were not accessible to us and hence excluded in this study. Subject to data attributes provided by Twitter, we cannot draw any conclusions about the demographics of users and the authenticity of discourses contained in our dataset. Hence, all tweets and retweets presented in this study can tell us only about public discourses toward the BRI on Twitter. We do not make assumptions about connections between the discourses on Twitter and people’s opinions offline and on other platforms.

Meanwhile, among the “tweets” investigated in this study there are 14,924 retweets with comments. Among them, 1732 are derived from original tweets that are also included in the database of the study, while 13,192 are derived from original tweets outside the database. Due to API restrictions imposed by Twitter, we were unable to clarify from whom the retweets with comments were derived and thus unable to portray how they were transmitted. However, given the small number of this type of tweets (1732), the impact on data analysis is assumed to be minimal.

Further limitations pertain to the challenges of accurately interpreting sentiment from social media text. We approached VADER sentiment scores with caution, avoiding reliance on arbitrary thresholds for determining sentiment polarity classification. It is important to note that the sentiment of individual tweets has limited influence on aggregate community sentiment, as a mixture of positive and negative tweets may yield neutral overall sentiment scores. To address this, we incorporated quantitative measures (see Table 2) that capture more nuanced sentiment patterns within communities and their correlation with topic distributions. This approach facilitates the identification of explainable sentiment patterns that account for contextual and topical factors, rather than the simplistic sentiment categorizations.