Introduction

In the contemporary era of the knowledge economy, the dissemination of knowledge increasingly transcends geographical boundaries, significantly reshaping patterns of scientific collaboration worldwide (Lin et al. 2023). Consequently, scientific research activities worldwide have entered a phase of unprecedented collaboration, resulting in numerous breakthroughs and innovations across diverse scientific fields through extensive partnerships spanning regions and nations (Simonton, 2013; Franco and Pinho, 2019; Thompson, 2004). This phenomenon has spurred substantial scholarly interest in the changing landscape of the global economy and knowledge flows (Adams, 2012). Previous research has thoroughly analyzed the formation and dynamics of the Global Scientific Collaboration Network (GSCN, hereafter), revealing patterns of regional clustering, global knowledge diffusion, and shifting innovation centers (Wang and Lin, 2018). The United States occupies a central hub position within the knowledge collaboration network, with the United Kingdom, France, Germany, and Japan playing key roles (De Prato and Nepelski, 2014). With the ascent of emerging technological powers such as China and India, the once Western- and American-dominated global landscape is evolving into a tripartite structure encompassing Western Europe, North America, and the Asia-Pacific (Wilsdon, 2011; Csomós, 2018). This structural shift is attributable to the dissemination and accumulation of knowledge. According to Grossetti et al. (2014), knowledge production is spreading to more countries and cities, indicating a significant trend toward ‘decentralization’ on both global and national scales.

Studies in recent decades have seen a transformative shift from traditional nation-centric frameworks towards city-centric perspectives. Influential scholars have argued that global cities are surpassing conventional administrative boundaries, evolving into key nodes of global economic and informational flows (Castells, 1996; Taylor et al. 2002; Sassen, 1991). The interactions between these cities have shaped a global city network, transforming cities into primary territorial entities through which nations compete and collaborate for innovation resources (Lüthi et al. 2018). Indeed, developing innovative world-class cities has become a crucial strategic priority for countries seeking advantageous positions within global competition (Wolfe and Bramwell, 2016; Elkhidir et al. 2023). Furthermore, a city’s prominence largely depends on its innovation capabilities and its ability to consistently update its knowledge base (Pflieger and Rozenblat, 2010). Relying solely on internal knowledge can lead cities toward technological stagnation. Thus, active participation in cross-regional or global knowledge exchange networks is essential for sustainable innovation.

The literature on city-level scientific collaboration has primarily developed along two research agendas. The first investigates patterns of scientific collaboration among cities, primarily within national or regional contexts. This focus is justified by the fact that scientific collaboration is influenced by factors such as national size, geographic distance, and language barriers (Katz, 1994; Glänzel and Schubert, 2005). Specifically, a significant distance-decay effect is observed in knowledge flows, as innovation activities at sub-national scales are typically concentrated within geographically localized clusters (Ponds et al. 2007; Liang and Zhu, 2002; Breschi and Lissoni, 2009; Essletzbichler, 2012). Consequently, proximity has emerged as a key determinant shaping inter-city cooperation, influencing the intensity and structure of collaboration networks. The second research agenda recognizes that advancements in information technology and intensified globalization have weakened the distance-decay effect, allowing knowledge collaboration to increasingly transcend geographic proximity and expand globally. Accordingly, scholars have begun examining global inter-city knowledge flows beyond traditional geographical constraints. Research has illustrated, for example, that European cities, particularly London, Paris, and Amsterdam, have historically exerted substantial influence within the GSCN (Matthiessen et al. 2002; Matthiessen et al. 2010). However, recent studies indicate that the traditional dominance of European and North American cities has been declining over the past two decades, accompanied by a notable rise in the prominence of Asia-Pacific cities (Cao et al. 2023b). Cities are intrinsically integrated into the GSCN, enhancing their positions through collaborative efforts (Tian et al. 2023). Their roles in the GSCN determine their access to innovative resources and their standing in technological competition (Bonaventura et al. 2021).

While previous studies have extensively discussed the global knowledge network from national or metropolitan perspectives, there remains a critical gap regarding how cities globally organize and integrate into the GSCN. Specifically, few studies have systematically explored cities’ roles and their unique collaboration patterns at a comprehensive global scale, especially within fields closely related to urban sustainable development. Addressing this research gap, this analysis investigates global scientific collaboration at the city level, drawing upon comprehensive scientific publication collaboration data from cities worldwide in 2022. This analysis’s contribution lies in its city-centric global approach and in uncovering the diverse roles and interactions of cities within the global innovation landscape. Furthermore, by integrating spatial and network analyses, this analysis provides a novel methodological perspective, deepening our understanding of how cities participate in, shape, and benefit from global scientific collaboration. Such an understanding is vital for policy-makers and city planners aiming to enhance their cities’ competitiveness and innovation capacities within the global knowledge economy.

The sections of this paper are structured as follows: The section “Literature review” offers a comprehensive literature review, outlining the main components of research on the GSCN. The section “Methodology” introduces the data sources and research methodologies employed. The section “Results” quantifies network topology structure indicators and presents the global scientific collaboration pattern through geographical visualization, revealing the community structure and center-hinterland dynamics of the knowledge collaboration network. Finally, the paper discusses the findings and provides a summary in the “Discussion” and “Conclusion” sections, respectively.

Literature review

Regional innovation system and external collaboration

Studies of scientific collaboration can be traced back to the concept of the regional innovation system, which Cooke (1992) defined as ‘a regional organizational system where interconnected actors within a certain geographical area continuously learn from each other in an embedded environment, thus generating sustained innovation’. The regional innovation system encompasses various actors such as enterprises, universities, and research institutes, with knowledge, information, technology, talent, and capital being the primary production factors. Knowledge exchange and creative capability are regarded as the core of regional innovation systems (Chung, 2002; Raco, 1999). In the context of scientific globalization, scholars have increasingly recognized the cross-regional flow of innovation elements as a potential source of regional innovation (Coe and Bunnell, 2003). Consequently, there is a growing demand for external collaboration within regional innovation systems to continuously expand the search for innovation resources. By integrating external innovation elements with local ones, external innovation factors are assimilated into regional innovation activities. As a result, regional innovation systems evolve from closed to open development. The flow and complementarity of innovation elements on a global scale significantly enhance the innovation capabilities of innovation actors and their respective regions.

Scientific research provides a theoretical foundation and knowledge reservoir, serving as a key driving force for technological innovation and social progress. Over the past century, science has increasingly become collaborative and globalized. Consequently, collaboration in scientific endeavors is commonly perceived as a global, collective process involving researchers from around the world (Hennemann and Liefner, 2015; Hennemann et al. 2012b). This global trend of scientific collaboration can be attributed to several reasons. From a scientific perspective, new discoveries are often complex and nonlinear (Rosenberg, 1982), leading to increased uncertainties and risks. Transnational collaboration facilitates the exploration of cutting-edge questions and the resolution of global problems (Lee and Haupt, 2021). From a knowledge flow perspective, knowledge is categorized into explicit and tacit forms. While explicit knowledge is recorded, tacit knowledge remains unrecorded in the human mind, posing challenges for widespread dissemination (van den Berg and Kaur, 2022). It is worth noting that the collaboration and mobility of knowledge elites and researchers greatly facilitate the global diffusion of tacit knowledge (Saxenian and Sabel, 2008). However, this distribution and global flow of knowledge are uneven. Ultimately, this imbalance results in unequal relationships between core and peripheral nations in international cooperation, with developing countries often positioned at the periphery. Hence, global knowledge collaboration serves as a channel for underdeveloped countries to access international scientific knowledge and external resources (Keller, 2004).

Aggregation and diffusion of innovation cooperation

As urbanization accelerates, cities increasingly become epicenters of innovation, attracting knowledge, technology, talent, and capital. Early innovation theories highlighted the crucial role of entrepreneurs, later shifting focus to the significance of large corporations in driving innovation (Xu et al. 2007). Innovation, transcending mere intellectual effort, necessitates the integration of existing resources. Unlike traditional manufacturing, innovation activities are geographically concentrated (Feldman and Kogler, 2010) and exhibit spatial variability due to regional development disparities (Feldman and Florida, 1994). Cities, with their aggregation of universities, research institutions, talent, and businesses, alongside developed infrastructure, evolve into more than mere backdrops for individual or corporate innovation (Florida et al. 2017; Cheng et al. 2022). The elevated role of cities in innovation has led geographers and regional scientists to add a spatial dimension to the study of innovation and entrepreneurship, focusing on the geographic distribution of innovation and the processes shaping these geographical patterns. Geographers have discerned that innovative activities (e.g. patents, copyrights, publications) tend to be highly concentrated within and between cities. An examination of the geographical locations of patents and their citations has revealed that knowledge spillovers initially localize but become more geographically dispersed over time (Jaffe et al. 1993). This has paved the way for examining inter-city scientific cooperation, exemplified by studies on international pharmaceutical networks (Cantner and Rake, 2014) and international co-invention patent networks (De Prato and Nepelski, 2014), as well as the geographic distribution and connectivity of scientific innovation activities (Dong et al. 2023).

The multidimensional proximity analytical framework offers a nuanced mechanism for understanding the global cross-city diffusion and agglomeration of knowledge. The French School of Proximity pioneered the investigation into the role of proximity in the innovation process, analyzing interactions in the innovation process through various forms of proximity (Ibert et al. 2015). Boschma (2005b) outlines types of proximity, including geographic, cognitive, organizational, institutional, and social, with geographic and cognitive proximities being widely discussed in the context of global knowledge cooperation (Gui et al. 2018). Geographic proximity, for instance, facilitates the flow of tacit knowledge, with the initial regional innovation systems emphasizing the significance of tacit knowledge to regional innovation (Gertler, 2003). Classic agglomeration theories highlight the importance of geographic distance to economic activities, leading to a high degree of knowledge concentration within certain areas. However, current research on the dynamic evolution of global knowledge centers suggests a gradual shift from the West to the East, with knowledge hubs transitioning from high-level concentration to dispersion, facilitated by advancements in transportation and communication infrastructure. This shift highlights the growing importance of cognitive proximity in knowledge collaboration. Cognitive proximity measures the closeness and similarity in knowledge and capabilities between entities, fostering the understanding and utilization of external knowledge. Entities within the same cognitive community can overcome the limitations of spatial proximity, enabling transnational knowledge collaboration and the formation of a global scientific network. 
The importance of cognitive proximity as a determinant in selecting cooperation partners in innovation activities suggests regions with higher cognitive proximity are more likely to engage in cross-regional R&D collaborations (Hennemann et al. 2012b). From the perspective of epistemic communities, the exploration of global scientific cooperation patterns reveals that the globalized scientific system is profoundly influenced by local science clusters.

The analytical framework of multidimensional proximity provides a robust mechanism for explaining the global diffusion and agglomeration of knowledge across cities. The French School of Proximity was at the forefront of studying the role of proximity in the innovation process, analyzing the exchange and interaction during the innovation process through various forms of proximity (Ibert et al. 2015). Boschma (2005b) highlighted that proximity typically includes geographical, cognitive, organizational, institutional, and social aspects. Among these, the roles of geographical and cognitive proximity in global knowledge collaboration have been extensively discussed. Initially, classical agglomeration theory underscored the significance of geographical distance on economic activities, leading to the high concentration of knowledge within certain areas to some extent (Gertler, 2003). Subsequently, researchers discovered that subjects with closer and similar levels of knowledge and capabilities are more likely to engage in inter-regional R&D cooperation. Hennemann et al. (2012a), from the perspective of epistemic communities, investigated global patterns of scientific cooperation and found that the global science system is influenced by local science clusters. This means that members of the same cognitive community can overcome spatial proximity, enabling knowledge collaboration across national borders and fostering the formation of a global scientific network.

Measurement and pattern characteristics of knowledge collaboration networks

In characterizing the connections between nations or cities through knowledge flow, scholars commonly use patent cooperation to construct international co-patenting networks (Nam and Barnett, 2011; Xu et al. 2024), patent citations for international patent citation networks (Chen and Guan, 2016), co-authored scientific publications (Cantner and Rake, 2014; Li et al. 2015), as well as technological cooperation networks (Ma et al. 2015), and global talent mobility networks (Czaika and Orazbayev, 2018) to measure knowledge cooperation networks. For instance, Ozcan and Islam (2014) examined the global nanowire patent cooperation network, indicating that the current international cooperation structure is still centered around the USA, but the network is evolving towards a more dispersed structure. Hoekman et al. (2009) explored the co-publications network in Europe, revealing an elite structure in past research collaborations in the region. From the perspective of centrality analysis, global knowledge production centers exhibit a ‘hub-and-spoke’ spatial agglomeration. Ribeiro et al. (2018) investigated the characteristics of the co-publications network, demonstrating that international cooperation grows in a specific pattern, forming a scale-free network, and also revealed a high level of connections between authors in the network, exhibiting a ‘power-law form’. Cao et al. (2023b) analyzed the evolution of the global scientific cooperation network, focusing on the changing rankings of Chinese cities. They found that most Chinese cities have risen in centrality rankings, but only a few have emerged as global leading science centers.

Global scientific collaboration is continuously evolving, with the pattern of cooperation constantly being updated. Matthiessen et al. (2010) investigated the structure of the world city knowledge network in the first decade of this century, observing a decline in the traditional central position of North America and Western Europe and a rise of regions such as Southeast and South Asia. Dong et al. (2017) analyzed publications from the last century and the early part of this century and found that in the early 20th century, the United States, the United Kingdom, and Germany were the mainstays of global scientific research with high scientific output. After World War II, science progressively globalized, with more countries becoming key players in global research and engaging in extensive collaborations. For instance, top research institutions in countries like South Africa, Korea, and Australia began collaborating in scientific research with European nations (Dickey et al. 2022). Based on a study of the international knowledge dissemination structure using patent citation networks, Chen and Guan (2016) found that Asian countries and regions have significantly increased their influence in knowledge, with BRICS countries playing a crucial role in the innovation systems of developing and emerging countries. Similarly, Gui et al. (2019) examined the structure and evolution of the GSCN in the first fifteen years of this century, indicating that the once bipolar world dominated by the UK and the USA is gradually being replaced by a tripartite world (Europe, North America, and the Asia-Pacific). The center of global knowledge production is shifting from the West to the East. Cao et al. (2023b) constructed an inter-city scientific collaboration network based on global cities, finding a rise in the Asia-Pacific region and a weakened bipolar dominance of Europe and North America.
In sum, even though the international cooperation network has historically been dominated by certain European countries and the USA (Heimeriks and Boschma, 2014; Liu et al. 2014), it has expanded rapidly across the globe (Leydesdorff et al. 2013).

Summary

In summary, inter-city scientific collaboration networks represent a prominent research area within innovation studies. The existing literature employs diverse approaches to represent knowledge flows, with scientific publications and patents frequently serving as key indicators of national innovation capabilities. Among these approaches, analyses based on scientific publication collaborations stand out due to their effectiveness in capturing knowledge dissemination processes. They not only illuminate patterns of knowledge production between cities and nations but also offer broader applicability and insights into the dynamics of global innovation networks. However, even as the globalization of science deepens, the majority of existing research has explored scientific collaboration predominantly at the national scale, often focusing on specific disciplines or regions (Zhang et al. 2023b; Yin and Li, 2019). While some studies have indeed adopted cities as the primary analytical units, they largely remain confined to national or regional contexts. Consequently, there is a notable scarcity of comprehensive empirical studies examining city-level scientific collaboration from a truly global and multi-disciplinary perspective. To address this gap, this analysis aims to enhance research in two ways. First, it adopts global cities explicitly as the research units rather than nations or regions. Second, it expands the scope of analysis by covering multiple disciplines relevant to urban development. Specifically, this study constructs a GSCN using data from 579 cities worldwide (Fig. 1). Through this comprehensive approach, the analysis provides a detailed understanding of global cities’ relative positions and clarifies cooperation patterns within the GSCN. Moreover, the findings offer practical guidance and policy recommendations, assisting more cities in strategically engaging with and benefiting from the GSCN.

Fig. 1 Research framework.

Methodology

Data sources

This section describes three steps: city selection, data acquisition, and network construction. (1) Industrialization and urbanization have created significant challenges for cities worldwide, including escalating resource consumption and rising greenhouse gas emissions, which have grown into a global crisis (Zhao et al. 2022). Inter-city collaboration on a global scale is deemed essential for achieving sustainable urban development (Chan, 2016). Furthermore, a higher population density is often correlated with an accelerated pace of innovation cycles (Cai et al. 2021; Bettencourt et al. 2007). Therefore, based on population statistics from 2022, we selected cities with more than one million residents, for two reasons. First, large cities typically function as comprehensive hubs within the GSCN, serving multiple roles that considerably amplify their influence in global knowledge flows (Filippetti and Zinilli, 2023). The scale of these cities inherently facilitates broader international cooperation, resource aggregation, and talent mobility. Second, practical constraints on data availability at the global scale further guided our selection. Most global datasets (e.g. OECD, UN datasets) predominantly aggregate and report data for large metropolitan areas, which makes it difficult to maintain consistency and comparability when incorporating smaller cities. This criterion led to the inclusion of 579 cities in our dataset, representing a diverse array of urban environments across the globe.

(2) Our analysis is primarily based on scholarly publications indexed in the Web of Science (WoS) database for the year 2022. The extraction included, but was not limited to, publications indexed in the SCI, SSCI, and A&HCI databases. We collected data on authors’ affiliations at the city level, focusing on papers that involved collaborations across two or more distinct cities. Furthermore, we aggregated the number of collaborative papers within different academic disciplines, facilitating a nuanced analysis of interdisciplinary and inter-city scientific collaboration. (3) Delineating the GSCN requires defining its constitutive elements, namely nodes and edges. In our study, cities serve as the nodes of the network. We conceptualize the GSCN as an undirected network, assuming that scientific collaborations, as reflected through co-publications, are inherently symmetrical between cities. This assumption allows us to create a collaboration matrix in which each cell represents the number of collaborative publications between a pair of cities. From this matrix, we constructed an undirected weighted collaboration network encompassing all 579 cities, with a total of 4,035,146 publications covering all disciplines. In addition, we selected three disciplines (‘energy fuels’, ‘engineering’, and ‘environmental sciences ecology’) to analyze the structure and dynamics of global, sustainability-related scientific collaboration networks among cities. These three disciplines frequently appear in international urban sustainability policy documents and urban research agendas, thereby reflecting contemporary priorities in global urban development practices. To maintain consistency, we used the Web of Science subject categories classification to delineate these disciplines in our data collection process.
The three disciplines comprise 131,423, 501,698, and 396,459 publications, respectively. As a result, four networks are constructed, covering ‘all disciplines’, ‘energy fuels’, ‘engineering’, and ‘environmental sciences ecology’.
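The network construction step can be illustrated with a minimal Python sketch. It assumes full counting, in which every paper adds one to the weight of each pair of distinct cities among its author affiliations. The counting scheme is not stated above, so this choice, the function name build_collaboration_weights, and the toy records with placeholder city labels are illustrative assumptions rather than the procedure actually applied to the WoS data.

```python
from collections import defaultdict
from itertools import combinations


def build_collaboration_weights(papers):
    """Count co-publications for every pair of distinct cities.

    papers: iterable of lists, each holding the affiliation cities of one paper.
    Returns a dict mapping a sorted (city_a, city_b) tuple to a paper count.
    Assumption: full counting (each paper adds 1 to every city pair it contains).
    """
    weights = defaultdict(int)
    for cities in papers:
        distinct = sorted(set(cities))  # several authors in one city count once
        if len(distinct) < 2:
            continue  # single-city papers are not inter-city collaborations
        for a, b in combinations(distinct, 2):
            weights[(a, b)] += 1
    return dict(weights)


# Toy records with placeholder city labels (not real data).
papers = [
    ["A", "B"],
    ["A", "B", "C"],
    ["C"],            # ignored: only one city
    ["A", "A", "D"],  # duplicate affiliations collapse to the pair (A, D)
]
weights = build_collaboration_weights(papers)
```

Storing each undirected pair once under a sorted key keeps the implied collaboration matrix symmetric by construction.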

Methods

The methods encompass spatial analysis and network exploration. Firstly, the geographical layout of the GSCN is depicted through geographic visualization. Secondly, the centrality metrics such as degree centrality, betweenness centrality, and weighted degree centrality are calculated. Topological features of the GSCN are explored through network indicators such as average degree, clustering coefficient, shortest path length, and network density. Thirdly, hierarchical clustering algorithms are utilized to examine the spatial structure, modularity algorithms to explore the community structure, and dominant flow analysis to investigate the functional roles of global cities within the GSCN.

Centrality metric

Centrality measures are used to identify the most critical and central nodes in a network. Among them, degree centrality and betweenness centrality are the most widely used in unweighted networks. The network metrics are calculated using the formulas in Table 1.

Table 1 Network indicator description.
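As an illustration of how such indicators can be computed, the following Python sketch derives degree, weighted degree, betweenness, and network density from a dictionary of pairwise co-publication counts. It uses standard textbook definitions (betweenness via Brandes' algorithm on the unweighted graph, without normalization), which may differ in normalization from the formulas used in this study. The function name and the three-node example are hypothetical.

```python
from collections import deque


def centralities(weights):
    """weights: dict mapping (city_a, city_b) -> co-publication count,
    with one key per undirected pair."""
    adj, strength = {}, {}
    for (a, b), w in weights.items():
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
        strength[a] = strength.get(a, 0) + w  # weighted degree
        strength[b] = strength.get(b, 0) + w
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    n = len(adj)
    density = 2 * len(weights) / (n * (n - 1)) if n > 1 else 0.0

    # Brandes' algorithm: shortest-path betweenness on the unweighted graph.
    betweenness = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                betweenness[w] += delta[w]
    # Every undirected pair was visited from both endpoints.
    betweenness = {v: b / 2 for v, b in betweenness.items()}
    return degree, strength, betweenness, density


# Hypothetical path network A - B - C.
degree, strength, betweenness, density = centralities({("A", "B"): 3, ("B", "C"): 5})
```

In this example only B lies on a shortest path between two other nodes, so it is the only node with non-zero betweenness.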

Dominant flow analysis method

The dominant flow analysis method posits that, in a network, the largest flow of each node carries more weight than its other connections (Nystuen and Dacey, 1961). This approach has found widespread application in domains such as trade flows (König et al. 2022; Ou et al. 2024) and knowledge flows (Gui et al. 2019). Based on the size of nodes and the strength of connections between them, nodes can be categorized into three types: (1) dominant nodes, whose maximum flow is directed towards nodes smaller than themselves; (2) subdominant nodes, whose maximum flow is directed towards nodes both larger and smaller than themselves; and (3) subsidiary nodes, which do not receive the maximum flow from any other nodes. The dominant flow analysis method is employed to reveal the center-hinterland structure within the network.
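The classification can be sketched in Python under one common operationalization, which is an assumption here rather than the exact rule applied in this study: node size is the weighted degree; a node is dominant if its largest flow goes to a smaller node; it is subdominant if its largest flow goes to a larger node while it receives the largest flow of at least one smaller node; otherwise it is subsidiary. The function name and the toy network are hypothetical.

```python
def dominant_flow_roles(weights):
    """weights: dict mapping (city_a, city_b) -> co-publication count,
    with one key per undirected pair."""
    size, flows = {}, {}
    for (a, b), w in weights.items():
        size[a] = size.get(a, 0) + w  # node size = weighted degree (assumption)
        size[b] = size.get(b, 0) + w
        flows.setdefault(a, {})[b] = w
        flows.setdefault(b, {})[a] = w

    # Partner that receives each node's largest flow
    # (ties are broken by node label, an arbitrary choice).
    top = {v: max(nbrs, key=lambda u: (nbrs[u], u)) for v, nbrs in flows.items()}

    # Nodes that receive the largest flow of at least one smaller node.
    has_dependants = {u for v, u in top.items() if size[v] < size[u]}

    roles = {}
    for v, u in top.items():
        if size[u] < size[v]:
            roles[v] = "dominant"      # largest flow points to a smaller node
        elif v in has_dependants:
            roles[v] = "subdominant"   # depends on a larger node, dominates smaller ones
        else:
            roles[v] = "subsidiary"    # receives no largest flow from smaller nodes
    return roles


# Hypothetical four-node network with placeholder labels.
roles = dominant_flow_roles({
    ("Hub", "Mid"): 10,
    ("Mid", "Leaf"): 4,
    ("Hub", "Other"): 6,
    ("Hub", "Leaf"): 1,
})
```

Linking every non-dominant node to the target of its largest flow yields a tree-like center-hinterland structure rooted at the dominant nodes.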

Hierarchical clustering algorithm

The core-periphery structure model primarily draws upon methods from mathematical graph theory and relational structure (Nystuen and Dacey, 1961; Joseph and Newman, 2010) to delineate the positions of different nodes within the network. Spatial order within social networks is primarily determined by node size and the flow of relationships between nodes (De Nooy et al. 2018). The core-periphery structure is initially established by applying hierarchical clustering algorithms to pair nodes with similar sizes and relationship patterns into a group. Subsequently, the hierarchical clustering algorithm continues to merge the most similar groups into a larger one. This process is repeated until all similar nodes are grouped into the same cluster. Finally, the hierarchical relationships among different cohesive subgroups are used to determine the network positions of each node. In the global inter-city research collaboration network, cities with higher weighted degrees, indicating more extensive publication collaborations with other cities, are more likely to occupy core positions.
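The merging logic can be illustrated with a deliberately simplified Python sketch. It clusters cities on a single feature, the logarithm of weighted degree, using centroid linkage, and stops at a chosen number of layers. The procedure described above also considers relationship patterns, so the feature choice, the linkage rule, the function name, and the toy values are all assumptions made for illustration.

```python
import math


def agglomerative_layers(strength, k):
    """Group cities into k layers by repeatedly merging the two most similar groups.

    strength: dict mapping city -> weighted degree (must be positive).
    Similarity is the gap between group means of log(weighted degree).
    """
    feature = {city: math.log(s) for city, s in strength.items()}
    # Start with one cluster per city, ordered from largest to smallest.
    clusters = [[city] for city in sorted(strength, key=strength.get, reverse=True)]

    def mean(cluster):
        return sum(feature[c] for c in cluster) / len(cluster)

    while len(clusters) > k:
        # On a single feature the clusters stay contiguous in sorted order,
        # so the closest pair of centroids is always an adjacent pair.
        gaps = [mean(clusters[i]) - mean(clusters[i + 1])
                for i in range(len(clusters) - 1)]
        i = gaps.index(min(gaps))
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return clusters


# Hypothetical weighted degrees with placeholder labels.
layers = agglomerative_layers({"A": 1000, "B": 900, "C": 100, "D": 90, "E": 10}, k=3)
```

Working on the log scale means that groups are separated by relative rather than absolute differences in collaboration volume, which suits heavily skewed weighted degrees.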

Weighted stochastic block model to detect network community

In this study, the weighted stochastic block model (WSBM, hereafter) is used to identify communities in the GSCN. Community detection methods presuppose the existence of community structures within networks but are limited to identifying these structures and fail to detect core-periphery structures (Cao et al. 2023a; Gui et al. 2019; Newman and Girvan, 2004). Block models, while considering the similarity of connections between nodes and grouping them accordingly, do not account for the weights of the edges within the network. Moreover, these methods are subject to ‘methodological determination’ (Zhang and Thill, 2019), meaning they presuppose the existence of particular meso-scale structures and simultaneously exclude other potential structures within the network. In reality, a network subjected to external shocks may exhibit a diverse and hybrid structure. Therefore, the WSBM is needed to explore meso-scale structures, searching for an optimal meso-scale structure within the data that can account for the complex and varied interconnection characteristics (Zhang et al. 2023a).

In the present study, the network is modeled as an undirected network whose nodes represent the 579 cities. The cooperation flows between these cities serve as the weights of the edges. The construction of the WSBM involves two steps. Firstly, each city node \(i = 1, 2, \ldots, N\) is assigned to a latent group \(Z_i \in \{1, 2, \ldots, K\}\), where \(N\) denotes the number of nodes and \(K\) the number of groups into which the network is partitioned; \(Z_i\) is the group index of node \(i\). Secondly, each entry \(a_{ij}\) of the weighted adjacency matrix, the connection between cities \(i\) and \(j\), is treated as stochastic rather than deterministic. The distribution of \(a_{ij}\) depends on the edge parameters \(\theta_{Z_i Z_j}\) associated with the group memberships of cities \(i\) and \(j\), reflecting the principle of stochastic equivalence: all cities within a group have the same probability of connectivity to the rest of the network. If \(a_{ij}\) follows a normal distribution, each edge bundle \((Z_i, Z_j)\) is parameterized by a mean and a variance, \(\theta_{Z_i Z_j} = (\mu_{z_i, z_j}, \sigma^2_{z_i, z_j})\). Given a block matrix \(\theta = [\theta_{kk'}]_{K\times K}\) and the distribution parameterized by \(\theta_{Z_i Z_j}\), the adjacency matrix \(A = [a_{ij}]_{N\times N}\) can be estimated.

The WSBM involves an inverse inference process: given the number of groups \(K\) and the observed edge bundles, the underlying group assignments \(Z\) and the stochastic block matrix \(\theta\) can be inferred (Aicher et al. 2013). The objective of the WSBM is to infer the optimal partition \(Z\) and the weighted stochastic block matrix \(\theta = [\theta_{kk'}]_{K\times K}\) by maximizing the likelihood function, which is formulated as follows:

$$P(A \mid Z, \theta) = P(A \mid Z, \mu, \sigma^{2}) = \prod_{i,j} \exp\left(a_{ij}\frac{\mu_{z_i, z_j}}{\sigma^{2}_{z_i, z_j}} - a_{ij}^{2}\frac{1}{2\sigma^{2}_{z_i, z_j}} - \frac{\mu^{2}_{z_i, z_j}}{2\sigma^{2}_{z_i, z_j}} - \log \sigma_{z_i, z_j}\right).$$
(1)
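To make Eq. 1 concrete, the following Python sketch evaluates its logarithm for a given partition and given block parameters. It is not the inference procedure itself, which additionally searches over Z and θ; it only scores one candidate solution. Summing over unordered pairs i < j for an undirected network, the function name, and the toy matrices are assumptions made for illustration.

```python
import math


def wsbm_log_likelihood(A, Z, mu, sigma2):
    """Logarithm of Eq. 1 for a fixed partition and fixed block parameters.

    A:      N x N symmetric weight matrix (list of lists).
    Z:      list of group indices, Z[i] in {0, ..., K-1}.
    mu:     K x K matrix of block means.
    sigma2: K x K matrix of block variances (all positive).
    Assumption: each undirected pair (i, j) with i < j is counted once.
    """
    n = len(A)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            m = mu[Z[i]][Z[j]]
            s2 = sigma2[Z[i]][Z[j]]
            a = A[i][j]
            # The four terms inside the exponential of Eq. 1 (log sigma = 0.5 * log sigma^2).
            total += a * m / s2 - a * a / (2 * s2) - m * m / (2 * s2) - 0.5 * math.log(s2)
    return total


# Toy network: two tightly linked pairs (0, 1) and (2, 3), weak links across pairs.
A = [
    [0, 5, 1, 1],
    [5, 0, 1, 1],
    [1, 1, 0, 4],
    [1, 1, 4, 0],
]
mu = [[5.0, 1.0], [1.0, 4.0]]
sigma2 = [[1.0, 1.0], [1.0, 1.0]]

matching = wsbm_log_likelihood(A, [0, 0, 1, 1], mu, sigma2)     # partition that fits the blocks
mismatched = wsbm_log_likelihood(A, [0, 1, 0, 1], mu, sigma2)   # partition that cuts across them
```

Each term equals the normal log-density of the edge weight up to the additive constant ½ log 2π, so the partition that matches the block structure scores higher than the one that cuts across it.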

Aicher et al. (2015) employed Bayesian regularization to optimize the likelihood function of the WSBM by treating the parameters \(Z\) and \(\theta\) as random variables endowed with prior distributions. The posterior distribution \(P(Z, \theta \mid A)\) is derived using Bayes’ theorem as follows:

$$P(Z, \theta \mid A) \propto P(A \mid Z, \theta)\, P(Z, \theta).$$
(2)

The WSBM algorithm was implemented in MATLAB 2021a, and the results were visualized using both MATLAB and Gephi 0.10.1.

Results

Overview of spatial layout

Using a hierarchical clustering algorithm, this analysis divides the GSCN into a four-level hierarchy: core, semi-core, semi-periphery, and periphery, as illustrated in Fig. 2. (1) At the apex is the core layer, exemplified by Beijing, which connects with 93.1% of the cities in this analysis. With the highest number of publication collaborations (261,515), Beijing stands as the central nexus of the GSCN, underscoring its pivotal role in global scientific engagement. (2) The second level, the semi-core layer, comprises cities such as London, New York, Boston, Paris, Shanghai, Guangzhou, and Nanjing. These cities exhibit robust interconnections within the layer and significant engagement across the network, evidenced by high average degree values (514) and average weighted degree values (132,897). Notably, connections within this layer account for 10.74% of their total collaborations, indicating a strong propensity for inter-level engagements. (3) The third level, the semi-periphery layer, includes 35 cities, such as Hong Kong and Shenzhen, with a relatively high average degree (478) and average weighted degree (63,988). This layer focuses primarily on inter-layer connections, with 29.36% of its connections being intra-layer, highlighting its intermediary role in the GSCN. (4) The fourth level, the periphery layer, encompasses 535 cities on the fringes of the GSCN. Cities within this layer hold less significance in the network, with both the average node degree and average weighted degree below the overall network metrics. Connectivity among periphery-layer cities is relatively low, with a network density of 0.44, slightly below the GSCN’s overall density. However, a significant portion of publication collaborations (67.15%) originates within this layer, emphasizing a strong intra-layer collaboration tendency. Each of these layers plays a distinct role within the GSCN, contributing to the overall structure and function of global scientific collaboration.
The core and semi-core layers are crucial for inter-city connections and maintaining the network’s robustness, while the semi-periphery and periphery layers serve as bridges and support the network’s extensive reach. This hierarchical organization illustrates the varying levels of influence and connectivity among cities in the global scientific landscape.

Fig. 2: The patterns of GSCN at the city-level.

The number of scientific publication collaborations, both intercontinental and within continents, is visualized in Fig. 3. The specifics are as follows: (1) Asia, Europe, and North America lead global scientific collaborations, accounting for 35.45%, 24.19%, and 18.81% of the total, respectively. (2) Intra-continental collaborations dominate global scientific collaborations, comprising 55.77% of the total. Among these, intra-Asian collaborations are the most numerous, representing 29.36%. Additionally, there are 518,504 collaborations (13.03%) within North America, 389,455 (9.79%) within South America, 85,850 (2.16%) within Europe, 31,991 (0.8%) within Africa, and 24,935 (0.63%) within Oceania. (3) Intercontinental collaborations account for 44.23% of scientific publication collaborations and show significant disparities. Cooperation among Europe, North America, and Asia forms a tight-knit collaboration network. The collaboration between Europe and North America is the most active, accounting for 9.20% of the total global scientific collaborations. Following closely is the collaboration between Europe and Asia, constituting 8.90% of the total. In comparison, cooperation among Oceania, South America, and Africa is relatively weak. (4) The proportions of internal and external collaborations differ markedly across continents. Asia and South America exhibit a relatively balanced ratio of internal and external collaborations, indicating both close internal cooperation and extensive external connections. In contrast, Europe, Africa, Oceania, and North America tend to engage more in intercontinental collaborations. For instance, Europe has fewer internal collaborations, with the majority involving other continents. Similarly, Africa relies heavily on external collaborations.

Fig. 3: The patterns of GSCN in continents.

Network structural characteristics across disciplines

Centrality metrics

This section analyzes the influence and control of various cities within disciplinary networks by calculating centrality metrics, including degree centrality, weighted degree centrality, and betweenness centrality. These metrics are visualized in accompanying figures to facilitate a comprehensive understanding. Our findings are as follows: (1) Over a quarter of cities worldwide engage in knowledge exchanges with more than 400 other cities, signifying a highly inclusive network that promotes collaboration across cities of diverse sizes. Beijing and London act as central hubs in the global inter-city publication collaboration network, significantly enhancing the network’s connectivity and inclusiveness. As transit nodes, Beijing, London, and Shanghai are crucial in connecting the majority of global cities and play a significant role in the dissemination of knowledge. Weighted degree centrality, which reflects the strength of a city’s connections, identifies Beijing, London, and New York as the central hubs. Beijing’s exceptional contribution is evidenced by its leading 261,515 publication collaborations, affirming its indispensable role within the GSCN (Fig. 4a). (2) An analysis of centrality indicators reveals a dominance of Asian cities in terms of their centrality within the network. Specifically, 14 out of the top 20 cities, based on degree centrality, are in Asia. This trend is consistent with betweenness centrality, where 15 out of the top 20 cities are Asian. Beijing stands out as a central hub in all analyzed networks, underscoring its critical role in facilitating knowledge exchange and innovation (Fig. 4a). Other notable cities such as Hong Kong, Hangzhou, Sydney, and Seoul serve as critical nodes, acting as significant channels for knowledge flow and innovation, thereby anchoring a vigorous collaboration network predominantly in Asia. (3) The prominence of cities within the network varies substantially across different scientific disciplines.
Beijing maintains a consistent position as a central node across various fields, highlighting its versatile role in global scientific collaboration. Other cities, including Shanghai, Nanjing, Wuhan, Guangzhou, Shenzhen, Singapore, and London, display notable standings within networks related to ‘energy fuels’, ‘engineering’, and ‘environmental sciences and ecology’. Their significant positions in these specific disciplines suggest a specialized focus and contribution to global knowledge exchange in these areas (Fig. 4b–d).
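To make two of these metrics concrete, the sketch below computes degree, weighted degree (strength), and normalized degree centrality from a weighted edge list. The helper name `degree_and_strength`, the city labels, and the weights are illustrative placeholders rather than values from the study, and betweenness centrality is omitted for brevity.

```python
from collections import defaultdict


def degree_and_strength(weighted_edges):
    """Return (degree, strength) dictionaries for an undirected weighted edge list.

    degree[c]   : number of distinct partner cities of c
    strength[c] : total number of co-publications involving c (weighted degree)
    """
    degree = defaultdict(int)
    strength = defaultdict(float)
    for u, v, w in weighted_edges:
        degree[u] += 1
        degree[v] += 1
        strength[u] += w
        strength[v] += w
    return dict(degree), dict(strength)


# Toy edge list: (city, city, number of co-publications); values are made up.
toy_edges = [("A", "B", 5), ("A", "C", 2), ("B", "C", 1), ("A", "D", 4)]
degree, strength = degree_and_strength(toy_edges)

# Degree centrality normalizes degree by the maximum possible number of partners.
n = len(degree)
degree_centrality = {city: d / (n - 1) for city, d in degree.items()}
```

In the toy network, city A is linked to every other city, so its degree centrality is 1.0, while its strength of 11 reflects how intensive those links are.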

Fig. 4: Network centrality metrics of nodes across disciplines.

a All disciplines, b energy fuels, c engineering, d environmental sciences ecology.

Topological characteristics

In terms of clustering coefficients and average shortest path lengths, the networks of ‘all disciplines’, ‘engineering’, and ‘environmental sciences and ecology’ exhibit notably higher clustering coefficients and shorter average path lengths than randomly generated networks. Their clustering coefficients exceed the value of 0.502 for random networks, and their average path lengths consistently fall below the random network benchmark of 1.8 (Fig. 5a, b). Together, these values point to several localized clusters within the network, characterized by dense internal connections and relatively sparse connections between clusters, embodying the quintessential small-world phenomenon. Conversely, the cooperation network in ‘energy fuels’ displays a lower clustering coefficient and a longer average path length, lacking distinct small-world characteristics. This deviation underscores a less integrated and possibly more exploratory or fragmented nature of collaboration within this specific field.
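The two small-world indicators can be computed as in the following sketch, which works on an unweighted adjacency structure. The function names and the four-node toy graph are hypothetical, and the text does not state which random-graph null model was used for the benchmark. If it were an Erdős–Rényi graph (an assumption), the expected clustering coefficient would simply equal the edge probability, that is, the network density.

```python
from collections import deque
from itertools import combinations


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with fewer than 2 neighbors count as 0."""
    total = 0.0
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue
        # Count the links that exist among the neighbors of this node.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs (BFS from each node)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


# Toy graph: a triangle A-B-C with a pendant node D attached to C.
adj = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
clustering = average_clustering(adj)    # (1 + 1 + 1/3 + 0) / 4
path_length = average_path_length(adj)  # 8 / 6 over the six unordered pairs
```

A network is usually called small-world when its clustering is well above, and its path length close to or below, the values of a random graph with the same number of nodes and edges.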

Fig. 5: Topological indicators of GSCN across disciplines.

a All disciplines, b energy fuels, c engineering, d environmental sciences ecology.

When analyzing network density and the average degree among nodes, the GSCN of ‘all disciplines’ emerges with the highest network density and average node degree, recorded at 0.5 and 288, respectively (Fig. 5c, d). These figures highlight the network’s extensive linkages of collaborative publications among global cities, indicating strong, robust overall connectivity. By comparison, the networks of ‘engineering’ and ‘environmental sciences and ecology’ exhibit similar network density and average connectivity. In contrast, the cooperation network in ‘energy fuels’ registers a network density and average node degree of 0.12 and 72. This suggests that the collaborative framework in the ‘energy fuels’ area remains underdeveloped, pointing towards an imperative for enhanced cooperative engagement.
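Density and average degree are tied together by the number of nodes, so the reported pairs can be cross-checked with a few lines. The sketch below assumes, which the text does not state explicitly, that every disciplinary network contains all 579 cities; the function names are hypothetical.

```python
def density_and_average_degree(n_nodes, n_edges):
    """Density and mean degree of a simple undirected graph."""
    density = 2.0 * n_edges / (n_nodes * (n_nodes - 1))
    average_degree = 2.0 * n_edges / n_nodes
    return density, average_degree


def implied_density(average_degree, n_nodes):
    """Density implied by a mean degree, since average_degree = density * (N - 1)."""
    return average_degree / (n_nodes - 1)


N = 579  # number of cities in the study

# Reported mean degrees from the text; densities are recomputed from them.
all_disciplines = implied_density(288, N)  # about 0.498
energy_fuels = implied_density(72, N)      # about 0.125

# Toy check of the first function: 4 nodes and 4 edges.
toy_density, toy_degree = density_and_average_degree(4, 4)
```

Under that assumption, both recomputed densities round to the reported values of 0.5 and 0.12, so the two indicators in each pair are mutually consistent.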

Spatial organization structure across disciplines

Community structure analysis

By applying the WSBM, this section dissects the GSCN into distinct communities, unveiling patterns of both strong and weak connections among cities. This approach enables the identification of optimal groupings (K = 3–8) based on maximum likelihood connectivity, facilitating a nuanced understanding of the network’s community structure across ‘all disciplines’, ‘energy fuels’, ‘engineering’, and ‘environmental sciences and ecology’, as depicted in Fig. 6.

Fig. 6: Communities and distributions of GSCN.

a.1 Communities of all disciplines, a.2 communities of energy fuels, a.3 communities of engineering, a.4 communities of environmental sciences and ecology. b.1 distributions of all disciplines, b.2 distributions of energy fuels, b.3 distributions of engineering, b.4 distributions of environmental sciences and ecology.

Firstly, the community structure of the GSCN, encompassing all disciplines, exhibits clear core-periphery characteristics across groups. Figure 6(a.1) reveals that the GSCN is partitioned into four groups, where Group 1 and Group 2 are characterized by a higher density of knowledge collaborations and substantial inter-group connections. Specifically, Group 1 is identified as the core, and Group 2 as the semi-periphery, as depicted in Fig. 6(b.1). In contrast, Group 3 and Group 4 exhibit less frequent inter-group collaborations and are thus categorized as the periphery. The spatial distribution of nodes within these groups, as shown in Fig. 6(b.1), indicates that the core and semi-periphery groups have a global presence, whereas the periphery groups are more regionally concentrated. The extensive knowledge collaborations within and between Groups 1 and 2, distributed across continents, suggest that leading innovation hubs worldwide are actively engaged in global knowledge collaborations, enriching the network with diverse knowledge inputs. In contrast, Groups 3 and 4, the periphery, have fewer knowledge links both internally and externally, with a concentration of nodes in Asia and Africa. This indicates Asia’s evolving stature as a pivotal knowledge center globally, although some of its cities have yet to achieve a comprehensive level of knowledge globalization.

Secondly, this analysis examines the community structures within specific disciplines, indicating that ‘energy fuels’ and ‘engineering’ exhibit a core-periphery arrangement, whereas the GSCN in ‘environmental sciences and ecology’ presents a dual-core structure. Figure 6(a.2) and (a.3) show that ‘energy fuels’ and ‘engineering’ have a dominant core group (Group 1) with extensive knowledge collaborations. The remaining groups, with fewer internal connections, are identified as peripheral groups. In the ‘energy fuels’ network, the peripheral Group 2 maintains a closer association with the core Group 1 than do other peripheral groups. The ‘environmental sciences and ecology’ network features a dual-core configuration, as visualized in Fig. 6(a.4), with two core groups (Groups 1 and 2) engaging in significant knowledge collaborations. Groups 3 and 4, despite fewer internal collaborations, maintain more extensive links with the core groups and are thus identified as semi-peripheral groups.

Additionally, Fig. 6(b.2) and (b.3) show the distribution of nodes in different disciplinary networks. It was found that the core group cities in ‘energy fuels’ and ‘engineering’ disciplines are scattered all over the world, with cities from different countries actively participating in GSCN. However, the peripheral group nodes in ‘energy fuels’ and ‘engineering’ networks comprise 71 and 480 nodes, respectively, suggesting a substantial number of nodes in the ‘engineering’ network are peripherally integrated, indicating a lesser degree of inclusion. Specifically, the community structure in the ‘environmental sciences and ecology’ network exhibits pronounced regional characteristics. As depicted in Fig. 6(b.4), the core groups (Groups 1 and 2) in the ‘environmental sciences and ecology’ network are geographically split between the eastern and western hemispheres, with Asia representing the east and Europe and the Americas representing the west. The semi-periphery groups (Groups 3 and 4) also display an east-west division, with Asia and Africa in the east and the Americas in the west, illustrating the geographical and thematic dichotomies within GSCN.

Center hinterland structure analysis

Using dominant flow analysis, this section reveals the ‘center-hinterland’ structure within the GSCN. As shown in Fig. 7, the size of the nodes represents their strength, while the thickness of the lines indicates the quantity of publication collaborations. The GSCN manifests as a composite of multiple ‘center-hinterland’ systems, each varying in size and collectively forming a discontinuous ‘archipelago’ structure. The largest is the ‘Asian archipelago’, with Beijing as its core, spanning 139 cities, predominantly located in East Asia. Its size underscores the significant role of Beijing and its surrounding cities in regional and global scientific collaboration. Next is the ‘African–European archipelago’, centered around pivotal cities such as London and Paris. African cities (47.9%) and European cities (33.3%) jointly form 81.2% of this 73-city archipelago. This arrangement highlights the influential position of European cities within Euro-African publication collaborations, a dynamic interwoven with historical colonial ties. The third noteworthy archipelago, the ‘North American archipelago’, is centered on New York and encompasses 56 cities, with North American cities accounting for 75% of the total. The three archipelagos, anchored by core cities in Asia, North America, and Europe, play critical roles in shaping the GSCN structure, demonstrating the global influence of cities both within and beyond their respective continents.
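The text does not spell out which variant of dominant flow analysis was applied. The sketch below follows one common formulation, in the spirit of Nystuen and Dacey, as an assumption: each city is attached to the partner with which it shares its largest flow, provided that partner has a larger total strength, and cities with no larger dominant partner become centers. Function names, city labels, and weights are hypothetical.

```python
from collections import defaultdict


def dominant_flow(weighted_edges):
    """Map each city to its dominant (larger) partner, or None if it is a center."""
    strength = defaultdict(float)
    flows = defaultdict(dict)
    for u, v, w in weighted_edges:
        strength[u] += w
        strength[v] += w
        flows[u][v] = w
        flows[v][u] = w
    parent = {}
    for city, partners in flows.items():
        # Partner with the single largest flow from this city.
        top = max(partners, key=partners.get)
        # The city is subordinate only if that partner is larger than itself.
        parent[city] = top if strength[top] > strength[city] else None
    return parent


def center_of(city, parent):
    """Follow dominant links upward until a center is reached."""
    while parent[city] is not None:
        city = parent[city]
    return city


# Toy edge list (made-up weights): two center-hinterland systems emerge.
toy_edges = [("A", "B", 10), ("A", "C", 8), ("B", "C", 1),
             ("D", "E", 6), ("D", "A", 2), ("E", "A", 1)]
parent = dominant_flow(toy_edges)
centers = sorted(c for c, p in parent.items() if p is None)
```

Because a dominant partner must be strictly larger, the chain of dominant links cannot loop, and each city ends up in exactly one system. In the toy network, A and D remain separate centers even though they are directly linked, which mirrors the discontinuous ‘archipelago’ pattern described above.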

Fig. 7: The “center-hinterlands” structure of the GSCN.

a ‘All disciplines’, b ‘energy fuels’, c ‘engineering’, and d ‘environmental sciences ecology’.

Distinct variations in the center-hinterland structure are observed across different scientific disciplines. In the networks of ‘energy fuels’ and ‘engineering’, Chinese cities, prominently led by Beijing, constitute the largest center-hinterland cluster. This cluster includes significant hinterland cities such as Shanghai, Nanjing, and Shenzhen, illustrating the dominant role of Chinese cities in these disciplines. Other smaller yet notable clusters revolve around cities like Delhi, Chennai, Sao Paulo, and Moscow. In these disciplines, the clusters anchored by London and New York are comparatively limited in scope. In contrast, within the ‘environmental sciences and ecology’ network, the Beijing-centric cluster contracts, while the influence and reach of clusters centered on New York and London notably increase. The broadest clusters span East Asia, Europe, and North America, emphasizing the pivotal contributions of cities from these regions in forming the GSCN structure.

Discussion

Scientific publication cooperation represents the forefront of collaborative knowledge generation within scientific research, serving as a crucial reservoir of information for exploring the GSCN. Notably, scientific publication collaboration not only reflects direct research partnerships but also captures the complex interplay of global academic interactions and collaborative efforts. An in-depth analysis of the spatial patterns and topological structure of the GSCN reveals the multifaceted and intricate nature of global scientific knowledge collaboration.

The small-world characteristic indicates that scientific collaboration relationships between global cities are relatively close, with efficient information and knowledge flow. Consequently, communication and collaboration among cities are less impeded, and the technological innovation network possesses high connectivity and information transmission efficiency. However, our analysis also observed significant spatial imbalances and regional clustering phenomena in the global inter-city publication collaboration network. An analysis of the spatial patterns revealed a clear North–South divide, characterized by regional clustering and spatial diffusion. The strong mobility of knowledge has incorporated numerous cities into the GSCN, including some from economically underdeveloped countries. However, collaborative publications are mainly concentrated in Europe, North America, and the Asia-Pacific, with core cities from these regions illustrating significant regional clustering. This finding aligns with previous studies that identified a top-end clustering pattern in the scientific domain (Florida et al. 2018) rather than a flat one (Asheim and Isaksen, 2002). The global inter-city publication collaboration network exhibits an elite structure (Hoekman et al. 2009), exacerbating the imbalanced pattern of global knowledge flows. The imbalance stems from several profound underlying factors, such as differences in government funding for local scientific research across regions (Filippetti and Zinilli, 2023), language barriers encountered during inter-regional knowledge exchange (Hwang, 2013), and geopolitical issues among regions (Makkonen and Mitze, 2023).

Our analysis found that Asian cities, particularly Chinese cities, have significantly risen in stature within the GSCN, diverging from previous research. Earlier studies primarily emphasized the central positions of cities such as London, New York, San Francisco, Boston, and Tokyo, with few Asian cities in the core (Matthiessen et al. 2010). In contrast, this analysis highlights Beijing, Shanghai, Guangzhou, and Nanjing as emerging central cities, breaking the European–North American duopoly in the global knowledge collaboration network. The network’s core is shifting from Europe–North America to Asia, particularly China, which is now emerging as a central pole in global knowledge collaboration (Cao et al. 2023b). These findings highlight the emerging trends in scientific globalization and the sustained expansion of China in scientific research and international collaboration. This trend might be attributed to its economic development, increased educational investments, and improved research environments. These findings reveal the dynamism and diversity of the global knowledge collaboration network, indicating a shift in scientific collaboration from the West to the East. This transition represents both a geographical shift and a redistribution of scientific research power and influence (Maisonobe et al. 2016). This trend suggests that future global knowledge collaboration will be more diverse and inclusive, offering countries and regions of different economic development levels more opportunities to contribute to global scientific knowledge production and dissemination.

By dividing the global collaboration networks into community clusters, we observed the communities to which the nodes belong. This analysis found that there are evident clusters in the GSCN, with connections highly concentrated in certain communities. This finding might be collectively shaped by a combination of geographical, cultural, linguistic, and historical factors. Specifically, cities, leveraging similar knowledge bases and technological conditions, engage in close knowledge collaboration, thereby establishing widespread connections among group cities based on cooperation. This aligns with previous findings that although scientific cooperation operates globally, it favors domestic or regional collaboration, demonstrating the persistence of cognitive communities (Hennemann and Liefner, 2015; Hennemann et al. 2012b). Although the trend of scientific knowledge globalization is increasingly evident, this analysis found that the geographical extension of the GSCN remains somewhat limited since geographical proximity can reduce collaboration, communication and coordination costs (Bathelt et al. 2004). This suggests that, in the age of globalization and informatization, while the constraints of physical distance are diminishing and knowledge can easily transfer across geographical spaces, promoting knowledge flow and diffusion across cities, regions, and globally (Cairncross, 1997; Friedman, 2005), tacit knowledge might still have geographical boundaries, with such knowledge more likely to flow and exchange within geographically proximate areas (Liu et al. 2015; Dickey et al. 2022). Additionally, non-material factors, including culture, language, institutions, and history, which foster trust and understanding, still significantly influence knowledge flow and collaboration (Boschma, 2005a; Pouris, 2010).

The analysis provided in the results section offers a nuanced understanding of the structure and dynamics of the GSCN across different disciplines. The observed high clustering coefficients and shorter average path lengths in the networks of ‘all disciplines’, ‘engineering’, and ‘environmental sciences and ecology’ indicate that scientific collaborations do not occur haphazardly but are the result of deliberate, strategic connections among researchers. This analysis finds that the global innovation network formed in ‘energy fuels’ is not as closely knit or characterized by small-world features as the other networks, indicating that research is still concentrated in a few countries. Existing studies show that the complexity of energy technologies requires the recipient to have a certain level of technical capability to effectively absorb, adapt, and apply imported technology (Yang et al. 2024; Johnstone et al. 2010). Furthermore, due to differences in geography, climate, and economic and social structures, an energy technology might need to be adjusted according to local conditions to achieve optimal efficiency, a process that can become a significant barrier to technology dissemination. Consequently, knowledge and technological innovation in the ‘energy’ sector often revolve around specific geographic or knowledge clusters. From a spatial distribution perspective, Chinese cities have been the main frontiers for research on energy and environmental areas (Shan et al. 2019). In China, these clusters not only facilitate the rapid flow of knowledge and iterative updates of technology but also promote the development and application of carbon neutrality technologies. For instance, the city clusters along the eastern coast of China have become innovation centers for carbon capture and storage technology due to their dense industrial activities and research institutions (Wang et al. 2023).
The preferential attachment characteristic suggests cities that are marginalized across the network can obtain high value by establishing connections with existing highly connected nodes (Zhang et al. 2019; Resce et al. 2022; Zinilli et al. 2024). This finding provides guidance on collaboration strategy for scientific research institutions and scholars: in the era of scientific globalization, it is crucial for cities and countries to join the global scientific knowledge collaboration network and seek international collaborations to promote their development (Sheppard, 2002). Establishing collaborative relationships with central nodes can more effectively acquire and disseminate knowledge, thereby enhancing the impact and innovation of scientific research (Wagner and Leydesdorff, 2005).

While a comparison with global production networks is not the primary objective of this analysis, the results highlight similarities and differences between the patterns in global production networks and the global inter-city publication collaboration network. The similarity is that, in both networks, cities with strong economic power, research capabilities, and global influence occupy stable central positions. The differences are twofold. (1) The global inter-city publication collaboration network consists of multiple clusters of different scales and characteristics, reflecting the diversity and complexity of knowledge collaboration. In contrast, global production networks might be more concentrated and unified, reflecting the integration of global production and supply chains (Werner, 2016). (2) The global inter-city publication collaboration network reveals new central cities, especially the rise of Asian cities, showcasing the dynamic changes and multipolar trends in the global knowledge collaboration network. In contrast, global production networks, constrained by the distribution of global economic power and industrial chain layouts, are more stable and less prone to change (Parnreiter, 2019). In summary, the differences between the global inter-city publication collaboration network and global production networks reflect the distinct characteristics and patterns of knowledge and material outputs in global distribution and flow. These differences provide crucial insights into understanding the complexity of knowledge and economic activities under globalization. They also reveal how cities from different regions and cultural backgrounds participate in global knowledge innovation and dissemination in various ways and at various levels, and how such participation reflects the diversity and complexity of global knowledge production and flow.

Conclusion

Through the analysis of publication collaboration data, this study utilizes complex network theory to investigate the GSCN, specifically focusing on the collaboration patterns across 579 cities worldwide. This exploration sheds light on the cooperation patterns of scientific partnerships, an area ripe for detailed investigation within the context of global scientific endeavors. The findings reveal that (1) the GSCN exhibits a small-world structure, indicative of a network marked by substantial interconnectedness and collaboration efficiency. (2) The GSCN displays both a significantly imbalanced distribution and a pronounced tripolar geographical concentration centered on North America, Western Europe, and East Asia. Cities, notably Beijing, London, New York, and Shanghai, anchor intensive collaboration, with Beijing demonstrating exceptional centrality. (3) The GSCN exhibits a multi-level community structure. Global cities (e.g., Beijing, London, New York) serve as transnational hubs, forming the core community through intensive interconnections. Meanwhile, influenced by geographical proximity, these global cities have developed center-hinterland structures with their surrounding regions. In addition, comparative analysis across disciplines unveils distinct structural divergence and topological characteristics.

From a theoretical standpoint, this analysis enriches our understanding of the underlying structures that facilitate scientific collaboration on a global scale. By gathering and analyzing extensive datasets, we explore cooperation and communication patterns among global cities, thereby offering novel perspectives and analytical tools for deciphering intricate urban relationships. Furthermore, the cities encompassed in our analysis are not limited to those traditionally included in the world urban system, addressing gaps in prior research and enhancing our comprehensive global understanding. The analysis also underscores the importance of recognizing and addressing the spatial and disciplinary imbalances that exist within these networks to foster a more inclusive and interconnected global scientific community. The practical significance of this analysis lies in its potential to outline a tangible pathway for advancing sustainable development through global scientific collaboration, particularly addressing the considerable gaps faced by countries in the Global South in achieving the SDGs. Global collaboration provides essential experiences and reference points for countries with limited technological capabilities, facilitating advancements in renewable energy technologies, smart city initiatives, and other sustainability innovations. A critical step toward effective knowledge exchange involves overcoming the collaborative isolation faced by Global South countries within the GSCN. Therefore, we suggest that scholars and institutions from cities in the Global North proactively engage with researchers from the Global South in addressing SDG-related issues, rather than excluding them.

This analysis has several limitations. First, regarding city selection, this analysis adopted a population threshold of one million, assuming that larger cities are more likely to host universities and research institutions. However, this criterion might overlook smaller cities that host significant research institutions, potentially omitting valuable scientific collaboration data from the global inter-city innovation network. Nevertheless, since this analysis measures inter-city linkages primarily by the quantity rather than the quality of co-authored publications, the exclusion of smaller cities, which may have strong innovative capabilities in specialized research areas, likely has a limited impact on the robustness of the overall results. Second, the WOS database does not fully cover publications in languages other than English, potentially affecting the universality of the findings (Mongeon and Paul-Hus, 2016). However, given that non-English publications tend to concentrate within specific countries or regions, their exclusion is likely to have only a minor influence on the assessment of global inter-city scientific collaboration and on the general applicability of the study’s conclusions. Future studies should further investigate how factors such as geography, language differences, and citation and publication quality influence the formation and evolution of the GSCN. Additionally, future research could examine alternative forms of knowledge exchange to more comprehensively capture knowledge flows between cities, thus contributing to a richer understanding of global scientific collaboration dynamics.