Introduction

Throughout urban history, cities have been considered “melting pots” where people of different ethnicities, occupations, social statuses and income groups meet and live together to benefit from the business and economic opportunities of agglomeration economies (Loo & Axhausen, 2022). As a result, urban vibrancy and diversity are key characteristics of cities. Yet, with the juxtaposition of the rich and the poor (among other contrasting differences), social segregation has been an urban challenge.

Issues of social segregation are further aggravated if not dealt with carefully in urban planning (Moro et al., 2021). In many American cities, urban sprawl occurred with the large-scale development of detached single-family housing (often with private gardens) in suburban and rural areas. In contrast, the lower-income class has been “stuck” in downtowns with deteriorating housing and infrastructure (Ewing & Cervero, 2010). Similarly, social segregation has been an increasing concern in Asian cities. The strategy of developing mixed neighbourhoods through public housing at prime downtown sites has become increasingly difficult because of high land prices and opposition from local communities, which seek to keep public estates and other community facilities away to protect the value of their properties. In Hong Kong, large-scale public housing development has become increasingly difficult, except in government-led new towns away from traditional urban cores (Loo & Chow, 2011). This has led to the relocation of the lower-income class out of the urban core areas.

With the increasing segregation of the lower-income class (though in different locations, that is, within downtowns in American cities versus the farther-away new towns in many Asian cities), the level of residential mixing in many cities worldwide has reduced over time. Such an urban development process undermines one of the major characteristics and benefits of living in cities with diversity and inclusivity. In the longer term, researchers found social segregation detrimental to access to public health and education resources (Sharkey & Faber, 2014), occupational success (Quillian, 2014) and children’s future economic outcomes (Chetty et al., 2022). All these reduce the capabilities of the low-income class to move up the social ladder and reinforce the vicious cycle of social segregation in society.

Even though the study of segregation dates back to the 1930s (Duncan & Duncan, 1955; James & Taeuber, 1985), most of the work has focused on residential segregation. Because the research interest here is to estimate people’s chance of exposure to a different social group, focusing solely on where people live overlooks the fact that people do not spend all their time at home. Noting this limitation, recent studies with people-oriented and place-based approaches (Loo, 2021; Loo & du Verle, 2017) led us to focus on measures of segregation within the activity space of individuals (Cagney et al., 2020). Activity space is the spatiotemporal pattern of individuals that results from their routine activities (Browning & Soller, 2014; Pred, 1977) and has been widely used to measure mobility (Song et al., 2010; Q. Wang et al., 2018), inequalities (Jones & Pebley, 2014) and, more recently, disease risk (Loo et al., 2023). Research on social segregation and isolation has been conducted using mobile phone records, social media traces, and traffic card logs to consider social encounters in daily life (Athey et al., 2021; Ellis et al., 2004; Moro et al., 2021; Nilforoshan et al., 2023; Q. Wang et al., 2018; Silm et al., 2018; Yabe et al., 2023). The growing body of activity space segregation research highlights exposure to different social groups as people conduct different activities at different locations beyond their homes. While the activity space approach focuses on exposure at activity locations, other dimensions, such as segregation within professions or institutions like schools, need to be further explored.
Theoretically, these studies echo the uncertain geographic context problem (UGCoP) (Kwan, 2012) when “the effects of area-based contextual variables on individual behaviours or outcomes” deviate from the “true geographical experience.” Empirically, they provide evidence that deeply rooted social segregation persists beyond residential segregation and lies more in people’s activity space.

With a focus on experienced social segregation during daytime, a body of research has drawn our attention to the role of public transit. For example, the New York City subway is a “place” where diverse social groups can encounter one another (Ocejo & Tonnelat, 2014). Some studies also found that cities with higher public transit use tend to have relatively lower social isolation (Athey et al., 2021; Pentland, 2015). It is plausible that efficient and equitable access to transportation helps to overcome the isolation of communities with respect to accessing jobs, necessities, and education opportunities in cities (Wissink et al., 2016).

In this study, we build on the activity-based segregation measures that focus on people’s exposure to other income groups (Fan et al., 2023; Yabe et al., 2023). Specifically, we test the effects of different transport modes on activity-based social mixing. Studies that use large-scale mobile phone records rarely capture transport modes at the individual level. In addition, most of the activity-based segregation studies have been cross-sectional since mobile phone market penetration was much lower ten years ago (Nilforoshan et al., 2023). In this regard, this study leverages the travel survey data across two decades to investigate social mixing and its relationship with the transport system from both spatial and temporal perspectives.

Literature review and research gaps

In this section, we mainly review previous literature from two perspectives. First, we review recent studies on measures of social segregation. Then, we review the literature that focuses on the intertwined relationships of travel modes, public transit, and social segregation.

Measures of segregation

Recent research on segregation has begun to consider the extent to which people can encounter social groups different from their own beyond residential spaces (Cagney et al., 2020). Many indexes have been developed accordingly. Athey et al. (2021) and Cook et al. (2024) used “experienced isolation” to measure the racial isolation people experienced throughout their daily travel. Moro et al. (2021) and Yabe et al. (2023) used “experienced segregation” and “diversity” to measure the evenness of individuals’ exposure to different income groups. Nilforoshan et al. (2023) used “exposure segregation” to describe the correlation between a person’s socioeconomic status and the average socioeconomic status of everyone to whom that person is exposed.

These activity-based segregation studies have been supported by the recent surge in the availability of mobile phone records (Athey et al., 2021; Cook et al., 2024; Moro et al., 2021). However, with the limited individual information in these datasets, researchers have to infer demographic information from mobility history and spatial attributes. Such methods are less applicable to a dense and mixed urban environment where multiple land use types co-exist in high-rise buildings. To overcome this limitation of mobile phone data, researchers have also used available travel survey data to create activity-based segregation measures. For example, Boterman & Musterd (2016) used a travel survey in the Netherlands and found that low-income groups tend to live in homogeneous neighbourhoods. Le Roux et al. (2017) used similar data in Paris and identified the temporal dynamics of social segregation: districts with a similar social composition at night differed during the day because of socially selective daily trips.

Social segregation, inequality, and transportation

Despite the differences in specific measures used, some consensus is reached. First, these studies based on activity-based measures have confirmed that experienced segregation is correlated with residential segregation (Athey et al., 2021; Jones & Pebley, 2014; Moro et al., 2021). In addition, studies also highlighted that the existing structural variance (for example, among the young and old or the rich and the poor) persisted. People in a vulnerable state of life tend to have a greater exposure to poverty and experience a higher level of segregation. For example, Cook et al. (2024) found students are more segregated than adults. Cornwell & Cagney (2017) tracked older adults in New York City to show that older adults with fewer years of education have greater exposure to poverty.

Beyond these critical issues, studies also hinted at potential remedies from city planning and urban design perspectives. For example, Moro et al. (2021), Fan et al. (2023), and Abbiasov et al. (2024) demonstrated the effects of different urban amenities such as restaurants, retail, grocery stores and movie theatres in alleviating or exacerbating activity-based segregation. Accessibility to transportation is also suggested to play a role in social segregation and inequality. Cities benefit their dwellers by providing access to multi-modal transport systems, thus affecting individuals’ mobility patterns. However, the transport system plays contradictory roles in inequality and segregation studies. On the one hand, regional-level measures imply that higher public transit accessibility is associated with lower social segregation and isolation (Athey et al., 2021). Blumenberg & Hess (2003) found that improved public transit was positively associated with keeping a job. On the other hand, a body of research highlights the inherent inequality of transport accessibility. Landis (2022) pointed out that residents of wealthier neighbourhoods have more mobility options than residents of poorer neighbourhoods, thus leading to further disparities between the two. Furthermore, the development of mass transit systems has long been considered a potential driver of gentrification, further pushing the poor away from urban resources (Baker & Lee, 2019; Houston & Zuniga, 2021; Liang et al., 2022; Lung-Amam et al., 2019; Padeiro et al., 2019). This mixed evidence encourages us to examine the role of travel mode and transport infrastructure planning in segregation studies, in terms of both the mobility choices they provide and their development impact.

Building upon the activity-based segregation research, we propose four major hypotheses that follow from the research gaps identified:

H1: At the individual level, people in the most vulnerable groups experience more social segregation during both day and night.

The vulnerable groups are most affected by segregation (Arbaci, 2007; Mitchell & Chakraborty, 2018). We want to know to what extent this is also true in high-density cities where the opportunities to meet people from another income group can be enhanced with the diversity created by density.

Then, we examine the effects of different transport modes on the degree of social mixing that a place exhibits, the impact of major transport infrastructural development on social mixing, and changes over time.

H2: Different transport modes are associated with different social mixing experienced.

Social mixing is affected by travel behaviour. The choice of transport modes affects the social mixing experienced by people. Generally, the more active transport modes are hypothesised to be more conducive to social mixing.

H3: The extension of railway lines has encouraged more social mixing.

Newly built metro stations are hypothesised to have encouraged higher social mixing at the place level during daytime and nighttime and at the individual level for residents living around the station neighbourhoods.

H4: The level of social mixing has reduced over time in the long term.

Studies of social exposure typically focus on a particular temporal snapshot. Thus, the longitudinal shifts in a city’s social integration dynamics over a decade or longer have yet to be explored. We expect residential segregation to increase over time in the long term. Yet, with the opportunities offered by higher density and diversity in compact cities, we expect the level of social mixing to decline less rapidly during daytime than during nighttime.

Data and method

This section presents our data and method following the research structure shown in Fig. 1.

Fig. 1: Research structure.

This research uses Travel Characteristics Survey (TCS) data to construct three main social mixing measures: place-based mixing, individual daytime mixing, and individual nighttime mixing. Place characteristics are described by POI types and functions in the city and serve as place control variables. a. Corresponding to H1, we identify the vulnerable social groups from the survey and compare their social mixing levels. b. Corresponding to H2, we estimate the impact of trip mode distribution on place-level social mixing. c. Corresponding to H3, we estimate the impact of significant Mass Transit Railway (MTR) (the subway in Hong Kong) infrastructure changes on social mixing. d. Corresponding to H4, with data dating back to 1992, we present the changes in social mixing over the past two decades.

Travel characteristics survey

The main analysis in this paper is based on Hong Kong TCS 2002 and 2011. TCS 2002 includes 92,520 participants from 29,981 households. TCS 2011 includes 101,385 participants from 35,401 households (See supplemental Table S1 for details on all TCS data used in this study). Firstly, TCS provides each participant’s household monthly income in different ranges (e.g., below 3999, 4000–5999, etc.). Accordingly, we divided the participants into five groups based on the rank of their household income. Along with household income, each TCS participant also reports gender, age, number of cars per household, and home location.

Moreover, TCS asks participants to report their journeys, which usually include multiple trip legs (sample questions from TCS are included in SI Fig. 1). For each surveyed trip, a participant reports the main travel mode and the mode of each trip leg. The main travel modes include walking only, private car, MTR, ferry, bus, taxi, special purpose bus (SPB), public light bus (PLB), and others. Because of multi-modal transfers, a trip may consist of multiple trip legs, each with an associated travel mode. To illustrate, one survey respondent might walk to a subway station, take the subway, and then walk to the office. For this one trip, we consider three trip legs. All trips and trip legs are also associated with an expansion factor provided by the TCS data. The expanded number of participants matches the census population in each survey year. Figure S2b shows the 2011 trip Origin-Destination Matrix of Hong Kong. The travel surveys are reported at the census Tertiary Planning Unit Street Block level (TPUSB, hereafter SB; area ranges from 384 m² to 22 km², 0.23 km² on average).

MTR extension

The other data we have used include the network extension history of the local MTR company, MTRC, from 2002 to 2011 and the geographical boundaries of spatial planning units in Hong Kong. SI Figure S5 shows all MTR stations by their year of establishment from 1920 to 2020.

Place attributes

To measure the effect of transport mode on each place’s social mixing, we need to control for places’ attributes. Here, we collect Points of Interest (POIs) from multiple sources in Hong Kong (including the Hong Kong Lands Department (Footnote 1) and OpenStreetMap) to measure seven types of POIs at each SB. The POI types include food, accommodation, finance, retail, health, education, and recreation. The original POI data come with subcategories (see Table S3 for the POI category distribution). We have encoded them into the seven large types of POIs for transport-related analysis (Lian et al., 2024; Zhang et al., 2021). In addition to the POI data, we include urban functional areas in Hong Kong (see SI Fig. S6 for the functional map; Loo et al., 2024). There are four functional areas: Suburbs, New Towns, Urban Core, and CBD. Given that the CBD covers a very limited number of spatial units in this study, we combined the Urban Core and CBD in the analysis.

People-oriented social mixing index

To test H1, we first compute the social mixing at the individual level. Following the previous literature (Fan et al., 2023; Moro et al., 2021), the individual daytime or mobility-based social mixing index \({{DM}}_{i}\) measures individual \(i\)’s evenness of co-location with people from different income groups throughout the entire day. In this exercise, we remove people who only reported staying at home during a survey. \({{DM}}_{i}\) is defined as:

$${{\rm{DM}}}_{{\rm{i}}}=1-\frac{5}{8}{\sum }_{{\rm{q}}}\left|{{\rm{\tau }}}_{{iq}}-\frac{1}{5}\right|$$
(1)

where \({{\rm{\tau }}}_{{iq}}\) is individual i’s relative exposure to income group \(q\).

$${{\rm{\tau }}}_{{\rm{iq}}}=\mathop{\sum }\limits_{{\rm{\alpha }}}{{\rm{\tau }}}_{{\rm{i}}{\rm{\alpha }}}{{\rm{\tau }}}_{{\rm{\alpha }}{\rm{q}}}$$
(2)

where \({{\rm{\tau }}}_{i{\rm{\alpha }}}\) is individual \(i\)’s proportion of visits (or time spent) at place \({\rm{\alpha }}\) among all places \(i\) visited. We also tested results weighted by the time spent at each location by an individual (See Supplemental Note). \({\tau }_{\alpha q}\) is income group \(q\)’s proportion of visits at place \(\alpha\). \({{DM}}_{i}=1\) when a person meets every income group evenly throughout the entire day’s trips. \({{DM}}_{i}=0\) when an individual meets only people of the same income group at all places visited in a day. A statistical summary of individual social mixing from 2002 to 2011 is shown in Table S6.
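
To make Eqs. 1 and 2 concrete, the following is a minimal Python sketch, not the authors' implementation. The function name, the dictionary layout, and the two example places are hypothetical; the visit shares are assumed to be already normalised so that each set sums to one.

```python
from typing import Dict, List

N_GROUPS = 5  # five income groups, as in the text


def individual_mixing(visit_share: Dict[str, float],
                      place_income_share: Dict[str, List[float]]) -> float:
    """Return DM_i following Eqs. 1 and 2.

    visit_share[place]        -> tau_{i,alpha}: share of i's visits at each place
    place_income_share[place] -> tau_{alpha,q}: income-group shares of visits at that place
    """
    # Eq. 2: exposure to each income group, weighted by where i's visits go
    exposure = [0.0] * N_GROUPS
    for place, share in visit_share.items():
        for q in range(N_GROUPS):
            exposure[q] += share * place_income_share[place][q]
    # Eq. 1: one minus the scaled absolute deviation from an even 1/5 split
    deviation = sum(abs(t - 1 / N_GROUPS) for t in exposure)
    return 1 - (5 / 8) * deviation


# Hypothetical places: one perfectly mixed, one visited by a single group only
places = {
    "office": [0.2, 0.2, 0.2, 0.2, 0.2],
    "club": [0.0, 0.0, 0.0, 0.0, 1.0],
}
dm_half = individual_mixing({"office": 0.5, "club": 0.5}, places)
dm_club_only = individual_mixing({"club": 1.0}, places)
dm_office_only = individual_mixing({"office": 1.0}, places)
```

In this toy case, a person who visits only the single-group place sits at the lower bound of the index, and one who visits only the evenly mixed place sits at the upper bound.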

We also further develop the social extroversion index beyond the individual mixing index (Moro et al., 2021; Yabe et al., 2023). Here, we define an individual as socially exploring (a social extrovert) if he/she visits a place (any spatial unit) where his/her own income group or adjacent income group makes up less than 20% of the visitors (each income group represents approximately 20% of the sample). Conversely, if an individual visits a place where his/her own income group or adjacent income group makes up more than 20% of the visitors, the individual is not a social extrovert. Given that this index value can change with the threshold selected, we also recompute it using other thresholds (30% or 40%). The detailed description can be found in SI 5.3.

Place-based social mixing index

Following the previous literature (Fan et al., 2023; Moro et al., 2021), we calculate place-based daytime mixing \(D{M}_{{\rm{\alpha }}}\) for different spatial units. The place-based social mixing index (\({\rm{D}}{M}_{{\rm{\alpha }}}\)) is calculated by using each income group’s total visits (except going home) at a defined spatial unit \({\rm{\alpha }}\).

$${\rm{D}}{M}_{{\rm{\alpha }}}=1-\frac{5}{8}{\sum }_{q}\left|{v}_{q{\rm{\alpha }}}-\frac{1}{5}\right|$$
(3)

where \({v}_{q{\rm{\alpha }}}\) is income group \(q\)’s share of the total weighted visits to unit \({\rm{\alpha }}\). \(D{M}_{{\rm{\alpha }}}\) equals 0 when the unit \({\rm{\alpha }}\) is only visited by people from one income group. \(D{M}_{{\rm{\alpha }}}\) equals 1 when the visits to unit \({\rm{\alpha }}\) are evenly distributed among all five income groups.
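
As an illustration of Eq. 3, the sketch below computes the place-based index from expansion-weighted visit totals per income group. It is a hedged example rather than the study's code: the function name is hypothetical, and it assumes the totals are first normalised to shares, which is what the stated bounds of 0 and 1 require.

```python
from typing import List, Optional


def place_mixing(weighted_visits: List[float]) -> Optional[float]:
    """Return DM_alpha (Eq. 3) from weighted visit totals of the five income groups.

    Returns None for a unit with no recorded visits, where the index is undefined.
    """
    total = sum(weighted_visits)
    if total == 0:
        return None
    # Assumption: v_{q,alpha} is each group's share of all visits to the unit
    shares = [v / total for v in weighted_visits]
    return 1 - (5 / 8) * sum(abs(s - 1 / 5) for s in shares)


dm_even = place_mixing([10, 10, 10, 10, 10])    # evenly visited unit
dm_single = place_mixing([50, 0, 0, 0, 0])      # visited by one group only
dm_skewed = place_mixing([40, 30, 20, 10, 0])   # hypothetical skewed unit
```

The same function applies to the nighttime index if resident counts per income group are passed instead of visit totals.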

Home location is obtained from the TCS databases at the SB level for the nighttime or residential social mixing index. Using the home location, we derive the proportion of the population from different income groups living in an SB \({\rm{\alpha }}.\) Then we compute the nighttime social mixing (\({\rm{N}}{M}_{{\rm{\alpha }}}\)) as:

$${\rm{N}}{M}_{{\rm{\alpha }}}=1-\frac{5}{8}{\sum }_{q}\left|{p}_{q{\rm{\alpha }}}-\frac{1}{5}\right|$$
(4)

where \({p}_{q{\rm{\alpha }}}\) is the share of income group \(q\) among the residents of \(\alpha\). \(N{M}_{{\rm{\alpha }}}\) equals 0 when residential segregation in unit \({\rm{\alpha }}\) is at its maximum, that is, when the unit is inhabited by people of only one income group. \(N{M}_{{\rm{\alpha }}}\) equals 1 when there is no residential segregation, that is, when the resident population at \({\rm{\alpha }}\) is evenly distributed among all five income groups.
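
As a quick check of the normalising constant \(\frac{5}{8}\) (a worked illustration added here, not part of the original derivation), consider a unit inhabited by a single income group, so that one share equals 1 and the other four equal 0:

$${\sum }_{q}\left|{p}_{q{\rm{\alpha }}}-\frac{1}{5}\right|=\left|1-\frac{1}{5}\right|+4\left|0-\frac{1}{5}\right|=\frac{8}{5},\qquad {\rm{N}}{M}_{{\rm{\alpha }}}=1-\frac{5}{8}\times \frac{8}{5}=0$$

When every share equals \(\frac{1}{5}\), each absolute deviation is zero and \({\rm{N}}{M}_{{\rm{\alpha }}}=1\). The same arithmetic applies to the daytime indices, which share this functional form.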

For the robustness of the study, we compute the place-based social mixing indices using different spatial units. To compare the effects of planning units and regular grids, we include two planning units of different sizes and three levels of hexagonal grids. The two planning units are the TPU and the SB. The size of SBs in Hong Kong ranges from 384 m² to 22 km² (0.23 km² on average). The size of TPUs ranges from 59,023 m² to 28.5 km² (3.8 km² on average). In other words, the size of each of these planning units varies substantially. Though they are relevant for policy implications, they are poorly suited to testing the modifiable areal unit problem (MAUP) because their sizes vary too much.

To test the MAUP better, we use a regular grid, which is common in the analysis of location data because it provides smooth gradients and the ability to measure differences between cells (Sahr et al., 2003). Hence, we divide the city into grid cells of identical shape and size at each of several levels. Athey et al. (2021) and Xu et al. (2019) use squares to analyse racial segregation. Here, we use H3 hexagons (Footnote 2), mainly because of their distance properties: all neighbours of a hexagon are equidistant. Specifically, we include H3 Levels 7, 8 and 9 in the analysis, considering their sizes are closest to the average size of TPUs and SBs. A Level 7 hexagon has an edge length of 1.2 km (5.2 km²), a Level 8 hexagon has an edge length of 0.46 km (0.74 km²), and a Level 9 hexagon has an edge length of 0.17 km (0.11 km²).

As trips and trip legs were originally reported at the SB level, we interpolate the trips to each H3 level according to the overlapping area. For any H3 hexagon \(h\) with area \({A}_{h}\), we first obtain the list of \(N\) SBs \(\{{b}_{1},{b}_{2},\ldots ,{b}_{n},\ldots ,{b}_{N}\}\) overlapping with \(h\). Each overlapping SB \({b}_{n}\) has an intersection area \({a}_{hn}\) with \(h\) and an original area \({a}_{bn}\). If SB \({b}_{n}\) attracts \({trip}_{n}\) trips, the trips interpolated to \(h\) are \({trip}_{hn}=\frac{{trip}_{n}\times {a}_{hn}}{{a}_{bn}}\). The total trips attracted to H3 hexagon \(h\) are \({Trip}_{h}={\sum }_{n}{trip}_{hn}\). To conform to reality, we only consider built-up areas based on data from OpenStreetMap; natural land (water or greenery) is excluded.
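
The area-weighted interpolation described above can be sketched as follows. This is an illustrative example under stated assumptions: the function name and the numbers are hypothetical, and the intersection areas are assumed to have been computed beforehand (for instance with a GIS overlay restricted to built-up land).

```python
from typing import List, Tuple


def interpolate_trips(overlaps: List[Tuple[float, float, float]]) -> float:
    """Return the trips assigned to one hexagon h.

    Each tuple is (trip_n, a_hn, a_bn): trips attracted to street block b_n,
    the area of its intersection with h, and the block's full area.
    """
    total = 0.0
    for trip_n, a_hn, a_bn in overlaps:
        if a_bn <= 0:
            raise ValueError("street block area must be positive")
        # Each block contributes trips in proportion to the overlapped area
        total += trip_n * a_hn / a_bn
    return total


# Hypothetical hexagon overlapping a quarter of one block and all of another
trips_h = interpolate_trips([(100.0, 0.05, 0.20), (40.0, 0.10, 0.10)])
```

A block lying entirely inside the hexagon contributes all of its trips, while a block that straddles the boundary contributes only the fraction matching its overlapped area.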

It is expected that the larger the spatial unit, the higher the DM should be. Comparing the two planning units, DM should be larger for TPU than for SB. For the hexagonal cells, DM should increase from Level 9 (smallest) to Level 7 (largest). A full comparison of all study units is shown in SI Fig. S7.

Impact of travel modes on social mixing

To test H2, we analyse the impact of travel modes on place-level social mixing and individual social mixing separately. For place-level mixing, we first run two cross-sectional regression analyses:

$${\rm{D}}{{\rm{M}}}_{{\rm{\alpha }}}={\rm{\gamma }}\times L{{\rm{Trip}}s}_{m}$$
(5)
$${\rm{D}}{{\rm{M}}}_{{\rm{\alpha }}}={{\rm{\beta }}}_{1}\times L{\rm{Pop}}+{{\rm{\beta }}}_{2}\times L{\rm{TotalT}}+{\rm{\gamma }}\times L{{\rm{Trip}}s}_{m}+{\rm{\sigma }}\times P{\rm{OI}}$$
(6)

where \({LTrip}{s}_{m}\) is the log-transformed trip legs by travel mode \(m\). The coefficients of γ for different transport modes will allow us to see whether a transport mode is conducive to social mixing (positive slope or coefficient) or associated with social segregation (negative slope or coefficient). LTotalT stands for the log-transformed total trips attracted to any spatial unit, and LPop stands for the log-transformed population living in any spatial unit. For LTotalT, we also test using the log-transformed total trip legs for better model fit. \({POI}\) includes a list of POI type variables that control for the place type. Here we compare the γ between Eqs. 5 and 6 to understand the effects of a transport mode controlling for other context variables. Still, urban functional characteristics may impact place-based mixing. To address this concern, we add a variation of Eq. 6 by including urban functional area (Loo et al., 2024):

$$\begin{array}{l}{\rm{D}}{{\rm{M}}}_{{\rm{\alpha }}}={{\rm{\beta }}}_{1}\times L{\rm{Pop}}+{{\rm{\beta }}}_{2}\times L{\rm{TotalT}}+{\rm{\gamma }}\times L{{\rm{Trip}}s}_{m}\\\qquad\qquad+\,{\rm{\sigma }}\times P{\rm{OI}}+\alpha \times {Function}\end{array}$$
(7)

where the Function variable is a categorical variable that is one of the Suburbs, New Towns, or Urban Core (including CBD).
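
To show how the sign of γ in Eq. 5 is read, here is a minimal pure-Python sketch of the bivariate regression. It is not the authors' estimation code. It assumes a natural logarithm, strictly positive trip-leg counts, and an intercept term (which the equations above leave implicit); the data are synthetic, and the controls in Eqs. 6 and 7 are omitted.

```python
import math
from typing import List, Tuple


def fit_gamma(trip_legs: List[float], dm: List[float]) -> Tuple[float, float]:
    """Ordinary least squares of DM_alpha on log(trip legs by mode m).

    Returns (gamma, intercept). A positive gamma means that units with more
    trip legs by this mode tend to show higher place-based mixing.
    """
    x = [math.log(t) for t in trip_legs]  # assumes every count is > 0
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(dm) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, dm))
    gamma = sxy / sxx
    return gamma, mean_y - gamma * mean_x


# Synthetic units whose mixing rises by 0.05 per log-unit of trip legs
legs = [10.0, 100.0, 1000.0, 10000.0]
mixing = [0.3 + 0.05 * math.log(t) for t in legs]
gamma_hat, intercept_hat = fit_gamma(legs, mixing)
```

Comparing this unadjusted slope with the coefficient from a model that adds population, total trips, POI counts and functional area is what indicates how much of the association is explained by place context.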

We also test the longitudinal effect of trip mode changes on the DM with the model below:

$$\begin{array}{l}\Delta D{{\rm{M}}}_{{\rm{\alpha }}}={{\rm{\beta }}}_{1}\Delta L{\rm{Pop}}+{{\rm{\beta }}}_{2}\Delta L{\rm{TotalT}}+{\rm{\gamma }}\Delta L{{\rm{Trip}}s}_{m}\\\qquad\qquad+\,{POI}+{Function}\end{array}$$
(8)

where \(\Delta {\rm{L}}{Trip}{s}_{m}\) measures the change in log-transformed total trips by mode \(m\) to each study unit between 2002 and 2011. This model also controls for the count of POIs of each type and for urban functional areas to account for land use differences.

To further understand whether travel modes explain social mixing experienced at the individual level (\({{DM}}_{i}\)), we specify another regression:

$${\rm{D}}{{\rm{M}}}_{{\rm{i}}}={{\rm{\beta }}}_{1}{\rm{Demographics}}+{{\rm{\beta }}}_{2}{\rm{Mode}}+{{\rm{\beta }}}_{3}{\rm{Mobility}}$$
(9)

where Demographics include three indicators: income level, age and gender. Income is the categorical variable describing each individual’s household income level per survey answer (1 being the lowest income group, 5 being the highest income group). Mode is described by the proportion of trip legs (\({P}_{m}\)) conducted through a single travel mode \(m\). For example, if a person \(i\) reports \(N\) trip legs in the survey, and \(n\) out of \(N\) are conducted by travel mode \(m\), then we have \({P}_{m}=n/N\). Mobility is described by two indicators, L and A. L represents the total number of trip legs an individual reported in the survey. The higher this number, the more mobile the individual is. A is the total activity size. We construct an activity container for each individual based on the centroids of the reported trip origin and destination SBs. The container is a convex hull covering all places visited by an individual. A similar method to create an activity container was used by Alessandretti et al. (2020). Then, we compute the perimeter of each participant’s activity space (See SI Note 1.3 and Fig. S4) to describe the size of their activities. Table 1 presents four variations of Eq. 9 that are listed below:

$${DM}={{\rm{\beta }}}_{2}{Mode}$$
(10)
$${DM}={{\rm{\beta }}}_{3}{Mobility}$$
(11)
$${DM}={{\rm{\beta }}}_{1}{Demographics}+{{\rm{\beta }}}_{2}{Mode}$$
(12)
$${DM}={{\rm{\beta }}}_{1}{Demographics}+{{\rm{\beta }}}_{2}{Mode}+{{\rm{\beta }}}_{3}{Mobility}$$
(13)
Table 1 Cross-sectional explanatory models of individual daytime social mixing (2011).

The same models are applied to both 2002 and 2011 to test for robustness. Understanding that people’s mobility patterns and choice of transport mode are largely shaped by what they have access to, we added a transport opportunity control to Eqs. 12 and 13. This transport opportunity is described by whether the surveyed participants reported having a car at home, live within 800 metres of a bus station, or live within 1000 metres of any MTR station (SI section 9 describes the details). Repeated models using 2002 data are shown in SI Table S8.
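
The Mode and Mobility regressors entering Eq. 9 can be sketched as below. This is an illustrative example, not the study's code: the function names are hypothetical, the centroid coordinates are assumed to be in a projected system (so that Euclidean distance is meaningful), and the hull is built with Andrew's monotone chain algorithm, a standard choice that the paper does not specify.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def mode_shares(leg_modes: List[str]) -> Dict[str, float]:
    """P_m = n / N: share of a person's trip legs made by each mode."""
    if not leg_modes:
        return {}
    counts = Counter(leg_modes)
    return {mode: c / len(leg_modes) for mode, c in counts.items()}


def convex_hull(points: List[Point]) -> List[Point]:
    """Hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o: Point, a: Point, b: Point) -> float:
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def activity_perimeter(centroids: List[Point]) -> float:
    """Perimeter of the convex hull around all visited centroids."""
    hull = convex_hull(centroids)
    n = len(hull)
    # Two distinct points give an out-and-back length; one point gives zero
    return sum(math.dist(hull[i], hull[(i + 1) % n]) for i in range(n))


shares = mode_shares(["walk", "MTR", "walk"])
perimeter = activity_perimeter([(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5)])
```

The interior point in the example does not change the perimeter, which reflects that the container measures the outer extent of a person's activities rather than the number of places visited.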

Impact of MTR expansion on social mixing dynamics

To test H3, we analyse the impact of MTR expansion at the place level and individual level separately. Hong Kong has undergone substantial MTR network expansion from 2002 to 2011. Using the MTR station establishment history, we divide all study units into five groups by their distance from the nearest MTR stations built between 2002 and 2011 (Fig. 2a):

a. Inner ring of new MTRs: spatial units that are within the 1000-metre buffer of any MTR stations built between 2002 and 2011;
b. Outer ring of new MTRs: spatial units within the 1000 to 2000-metre rings of the MTR stations built between 2002 and 2011;
c. Inner ring of Pre-02 MTR: spatial units that are within the 1000-metre buffer of MTR stations built before 2002;
d. Outer ring of Pre-02 MTR: spatial units within the 1000 to 2000-metre rings of pre-2002 MTR stations;
e. No MTR: all others outside the 2000-metre buffer of any MTR stations built before 2011.
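
The five-way grouping above can be expressed as a small classification function. The sketch below is hypothetical: the function name and labels are illustrative, distances are assumed to be precomputed in metres, boundaries are treated as inclusive, and a unit that falls within the buffers of both new and pre-2002 stations is assigned to the new-station group, a precedence rule the text does not state.

```python
import math


def classify_unit(dist_new_m: float, dist_pre02_m: float) -> str:
    """Assign a spatial unit to one of the five MTR proximity groups.

    dist_new_m   -> distance to the nearest station built between 2002 and 2011
    dist_pre02_m -> distance to the nearest station built before 2002
    Use math.inf when no station of that kind exists.
    """
    # Assumption: new-station rings take precedence over pre-2002 rings
    if dist_new_m <= 1000:
        return "a: inner ring, new MTR"
    if dist_new_m <= 2000:
        return "b: outer ring, new MTR"
    if dist_pre02_m <= 1000:
        return "c: inner ring, pre-02 MTR"
    if dist_pre02_m <= 2000:
        return "d: outer ring, pre-02 MTR"
    return "e: no MTR"


group_inner_new = classify_unit(650.0, 400.0)
group_outer_old = classify_unit(5200.0, 1500.0)
group_none = classify_unit(math.inf, 3100.0)
```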

Fig. 2: Methods.

a Spatial distribution of spatial units based on distance from MTR stations newly established between 2002 and 2011. All grey areas are places without any MTR station built before 2011 within 2000 metres. b Construction of the treatment and control groups among the participants based on their home location. c Visualisation of the trip legs collected via the TCS 2011 data. For all trips’ origin and destination distribution, see Figure S1.

The 1000-metre buffer is a commonly used threshold to determine the catchment area of a given transit station (Fan et al., 2021). Summary statistics of the five groups of places are shown in SI Table S9. After establishing the five groups, we compare the change in place-based social mixing between the inner ring (treatment) and the outer ring (control) of the new MTRs, on the assumption that these two groups shared similar socioeconomic status before the MTR treatment. We construct a Difference-in-Differences (DD) model adjusted for overall trend, change in population, and change in trip volumes by MTR, buses and private cars, given that the extension of MTR also introduces changes in other trip modes. The linear model estimating the social mixing at spatial unit \(i\) in year \(t\) can be written as:

$${{DM}}_{{it}}={{\rm{\beta }}}_{i}+{{\rm{\gamma }}}_{t}+{{\rm{\sigma }}}_{{DD}}{withMT}{R}_{i,t}+{{\rm{\epsilon }}}_{i,t}$$
(14)

where \({withMT}{R}_{i,t}\) is an indicator of whether spatial unit \(i\) is within the 1000-metre buffer of a new MTR station in year \(t\). \({\beta }_{i}\) is the location effect, indicating whether spatial unit \(i\) is within the 1000-metre buffer of any new MTR station. \({{\rm{\gamma }}}_{t}\) is the year effect.
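
With two survey years and the same units observed in both, the coefficient \({{\rm{\sigma }}}_{{DD}}\) in Eq. 14 reduces, in the absence of further covariates, to a difference of mean changes between the treatment and control rings. The sketch below illustrates that arithmetic only. The function name and the mixing values are hypothetical, and the adjustments for population and trip volumes described above are left out.

```python
from typing import List


def did_estimate(treat_pre: List[float], treat_post: List[float],
                 ctrl_pre: List[float], ctrl_post: List[float]) -> float:
    """Two-period difference-in-differences of mean place-based mixing."""

    def mean(values: List[float]) -> float:
        return sum(values) / len(values)

    change_treated = mean(treat_post) - mean(treat_pre)
    change_control = mean(ctrl_post) - mean(ctrl_pre)
    # The control change stands in for the trend the treated units
    # would have followed without the new stations
    return change_treated - change_control


# Hypothetical DM values for inner-ring (treated) and outer-ring (control) units
sigma_dd = did_estimate(
    treat_pre=[0.60, 0.64], treat_post=[0.70, 0.74],
    ctrl_pre=[0.62, 0.66], ctrl_post=[0.65, 0.69],
)
```

In this toy example the treated units gain 0.10 and the control units gain 0.03, so the estimated effect attributed to the new stations is the gap between the two changes.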

Considering that the extension of MTR infrastructure not only changes people’s experience but also may lead to the redistribution of local residents (He et al., 2018; Yip, 2016), we conduct a Chi-square test among all four groups of spatial units to compare whether the observed change of proportion of an income group population living within the proximity of metro stations is statistically different from the expected changes.

To test the impact of MTR extension on social mixing at the individual level, we assign each individual to a group based on their home location. Mirroring the methods adopted for the place-level analysis, we compare people’s change in social mixing across the same five groups. The changes in individual social mixing are described via a variant of Eq. 14:

$${M}_{{it}}={\beta }_{i}+{\gamma }_{t}+{\sigma }_{{DD}}{withMT}{R}_{i,t}+{\lambda }_{i}+{\epsilon }_{i,t}$$
(15)

where we add \({{\rm{\lambda }}}_{i}\) effect to control for the income level, age, and gender of each participant in the survey.

Lastly, to examine whether increased usage of MTR can lead to increased social mixing at the individual level, we construct an Indirect Least Squares (ILS) model, using whether an individual lives within the new MTR catchment area as the instrumental variable.

$$\Delta {M}_{\alpha }=\sigma \Delta {Tri}{p}_{{mtr}}+\lambda$$
(16)

where \(\Delta {Tri}{p}_{{mtr}}\) is predicted with the equation \(\Delta {\rm{Tri}}{{\rm{p}}}_{{\rm{mtr}}}={\rm{\pi }}{\rm{withMTR}}\), and \(\Delta {M}_{{\rm{\alpha }}}\) is the change in the average individual social mixing by home location \({\rm{\alpha }}\), gender, and income group. \({\rm{\lambda }}\) is the effect of home location, gender and income group.
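The logic of the ILS estimator can be illustrated with a minimal sketch. With a single instrument, the ILS estimate of the effect in Eq. 16 is the reduced-form slope (outcome on instrument) divided by the first-stage slope (MTR trips on instrument). The data below are hypothetical, the helper names `slope` and `ils` are ours, and the sketch leaves out the home location, gender and income group effects.

```python
from statistics import mean

def slope(x, y):
    """OLS slope of y on x (with an intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def ils(instrument, treatment, outcome):
    """Indirect least squares with one instrument: reduced form / first stage."""
    first_stage = slope(instrument, treatment)   # pi: effect of withMTR on change in MTR trips
    reduced_form = slope(instrument, outcome)    # effect of withMTR on change in mixing
    return reduced_form / first_stage

# Hypothetical groups: 1 = home inside a new MTR catchment area, 0 = outside.
with_mtr = [1, 1, 1, 0, 0, 0]
d_trip = [0.30, 0.20, 0.25, 0.05, 0.10, 0.00]   # change in share of MTR trips
d_mix = [0.04, 0.03, 0.035, 0.01, 0.02, 0.00]   # change in individual social mixing

sigma = ils(with_mtr, d_trip, d_mix)
print(round(sigma, 4))
```

Because the instrument is binary, both slopes are simple differences in group means, so the estimate is the change in mixing attributable to living near a new station scaled by the corresponding change in MTR usage.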

Long-term changes in social mixing over two decades

Lastly, we also consider the TCS data from 1992 in Hong Kong. The focus here is to provide a long-term view of changes in social mixing at the individual and place levels over time. Generally, the quality of visiting data in TCS 1992 is lower, with the spatial unit not at the SB level (as in 2002 and 2011) but at the traffic zone level. The traffic zones are generally much larger (their areas range from 384 m² to 22,600,000 m², with a median of 37,180 m²). Given that traffic zones are larger, we must adjust the spatial scale from H3 Level 9 to H3 Level 7 for temporal comparisons (see Fig. S3 for the income distribution across the two decades). Nonetheless, as it is rare to examine social mixing in a city over two decades, we consider the analysis meaningful despite the limitations. Similar to previous steps, the individual mixing during the day and their experienced nighttime mixing at their home location are computed separately.

Results

Individual mixing varies greatly by income, age, gender and time of the day

At the individual level, we first find that social mixing is not homogeneous across income groups, age, home location, and time of day. Figure 3a shows the results, including all trip legs and considering the frequency of visits to each location. The blue lines describe the distribution of individual mixing with TCS data. The grey line is the distribution of individual mixing calculated with randomly assigned income groups for each person. On average, an individual in Hong Kong has a higher social mixing level than previous findings in the U.S. (Moro et al., 2021) (see SI Fig. S8 for a comparison of the distribution of individual mixing between Hong Kong and the Boston Metropolitan region). Figure 3a also indicates that the distribution of individual social mixing is far from random (two-sided t-test statistic = −111.6, p-value < 0.0001). Figure 3b distinguishes between social introverts and social extroverts. Social extroverts tend to spend more time at places where similar income groups are not the majority (\({{\rm{\sigma }}}_{i}\ge 50 \%\)). In contrast, social introverts tend to visit and spend most of their time at places where similar income groups are the majority (\({{\rm{\sigma }}}_{i} < 50 \%\)). In Hong Kong, the share of social extroverts (34%) is lower than that of introverts (66%), meaning that, in the context of Hong Kong, people are more likely to stay with their income group (see SI Note 5.2 Table S7 for robustness testing of the definition of social extrovert and introvert).
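The extrovert and introvert labels can be illustrated with a minimal sketch. It assumes that the "major income group" of a place is its single most common income group and that the share is computed over visits with a 50% cut-off; the place identifiers, visit log and function names are hypothetical.

```python
def outgroup_visit_share(own_group, visits, major_group):
    """Share of a person's visits made to places whose major income group is not their own."""
    outgroup = sum(1 for place in visits if major_group[place] != own_group)
    return outgroup / len(visits)

def classify(share, threshold=0.5):
    """Label a person as a social extrovert if the out-group share reaches the threshold."""
    return "extrovert" if share >= threshold else "introvert"

# Hypothetical mapping from place to its major income group (1 = lowest, 5 = highest).
major_group = {"p1": 1, "p2": 3, "p3": 5}
# Hypothetical visit log for a person in income group 1, one entry per visit.
visits = ["p1", "p1", "p2", "p3"]

share = outgroup_visit_share(1, visits, major_group)
label = classify(share)
print(share, label)
```

Changing `threshold` or the rule that defines the major income group corresponds to the kind of sensitivity check reported in SI Table S7.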

Fig. 3: Individual mixing.

a Distribution of individual mixing compared to simulated results with randomly assigned income groups based on 2011 income distribution. The simulated value shows an average of 100 stochastic assignments. b The proportion of visits paid to places where the major income group is not their own compared to random assignments. We repeated the analysis by adjusting the “major income group” threshold in the SI notes Table S7. c Individual social mixing at day and night by age groups (only spatial units with at least 5 households interviewed are included in the calculation). d Individual mixing by income group. Income group 1 is the lowest income group, and 5 is the highest. e Individual social mixing at nighttime by income groups. f Distribution of individual social mixing by the main travel purpose of each individual. HBS: home-based-school; HBO: home-based-other; EB: employment-based; HBW: home-based-work; NHB: non-home-based.

To further understand the temporal dynamics of social mixing, we compare the daytime individual-level mixing and the nighttime social mixing at home locations (residential social mixing). The territory-wide averages of \({{DM}}_{i}\) and \({{NM}}_{i}\) are 83.7% and 54.4%, respectively. We further compare the day and night social mixing among all age groups (Fig. 3c). The nighttime social mixing is relatively stable across all ages; however, the daytime level of mixing shows substantial variations across age groups. Before the age of 30, people tend to have more diversified social experiences during the day as they grow up. Afterwards, people’s daytime social mixing decreases with age. On average, people over 65 years old tend to have 3.2% lower mobility-based daytime mixing than people aged 18 to 35 (two-sided t-test p-value < 0.0001).

Figure 3d, e show that, among all five income groups, individuals from the lowest income group tend to encounter less diverse people during the daytime. In 2011, individuals from the lowest income group experienced 1.4% less social mixing than group 4 during the day (two-sided t-test t-value = −11.88, p-value < 0.0001). Further, people from the lowest income group also tend to reside in highly segregated places (ranking second to last in residential social mixing). In addition, we also compare individuals’ daytime social mixing based on the main trip purpose for each person (Fig. 3f). The main trip purpose represents the most frequent trip purpose taken by each person. We observe that individuals who made more non-home-based (NHB) trips were more socially active and, hence, experienced more social mixing. In contrast, individuals whose main trip purpose was home-based-school (HBS) experienced the lowest social mixing. This partly reflects the public school allocation system in Hong Kong being based on residence and, hence, aggravating the segregation that children may experience, especially for those living in disadvantaged neighbourhoods (Loo & Lam, 2015).

Buses are the most friendly mode for social mixing

To test H2 and explain the effects of different transport modes on social mixing, we first show the place-level results. We compare the results of Eqs. 5 and 6 using Fig. 4a–e. The five binned scatter plots demonstrate the coefficient of each trip mode, with or without the spatial context. Not yet considering the total trip number and the local population, all transit mode usage positively correlates with the place’s social mixing level. After controlling for the density factors (TotalT and Pop), bus, walking, and MTR remain positively correlated with the social mixing level. On the contrary, taxi trips’ association with social mixing level diminishes, and car trips show a negative relationship with the social mixing level (Full table results are included in the SI Note Table S4). Among walking, bus, and MTR trips, bus trips have the strongest association with social mixing levels. Generally, a one percent increase in bus trips is associated with a 2.8% increase in social mixing; a one percent increase in MTR trips is associated with a 2.0% increase in social mixing; a one percent increase in walking trips is associated with a 1.4% increase of social mixing.

Fig. 4: Explain social mixing with transportation.

a–e Estimating the place-level social mixing with transport modes with and without density control (see SI Table S4 for the full results, including effects of POI counts by types). f Impact of total change of trips by each transport mode on place-level social mixing (Eq. 7, see Table S5 for full results). g Equation 6 is repeated within each functional area. Table S11 shows the distribution of transportation opportunities.

It is worth noting that the place characteristics, described by the types of POIs and urban functional areas, have significant relationships with place-level social mixing. SI Note Table S4 shows that the number of food and health-related POIs contributes most significantly to place-level social mixing. A one percent increase in food-related POIs is associated with a 1.4–2.0% increase in daytime social mixing and a 1.1–1.7% increase in nighttime social mixing. Similarly, a one percent increase in health-related POIs is associated with a 1.5–2.1% increase in daytime social mixing and a 1.7–2.1% increase in nighttime social mixing. We also find that education and recreational POIs are positively associated with more socially mixed places, although these associations are less significant than those of food and health-related POIs. Among the three types of urban functional areas, the New Towns demonstrate the highest place mixing both during the day and at night. During the day, the Suburb is 4.4% less socially diverse, and the Urban Core (including CBD) areas are 2.4% less socially diverse compared to the New Towns. At night, the Urban Core (including CBD) areas are the least socially diverse, with 4.1% lower diversity than the New Towns.

Figure 4f summarises the results for Eq. 7. Consistent with the cross-sectional study, increased trips by bus and MTR exert positive effects on the change of social mixing over time in Hong Kong. In contrast, the increase in taxi trips has a negative effect on the change in social mixing.

These results also apply at the individual level. Table 1 summarises four explanatory models of the variability of individual mobility-based social mixing (DMi). The four models are variants of Eq. 8. Beyond the travel modes discussed above (Model 1), how far and how frequently people travel is shown to have a positive connection with mobility-based social mixing (Model 2). Next, we also control for individuals’ income level, gender, age, and access to different transportation opportunities (Models 3 and 4). Based on the more comprehensive model (Model 4), the transport modes adopted by each individual are also highly correlated with one’s exposure to other income groups, that is, the level of individual social mixing. Specifically, people taking more MTR and bus trips are likely to experience higher social mixing. Though to a lesser extent, ferry and tram trips also have similar effects. However, walking and cycling at the individual level are not as effective in exposing people to a diverse social group. In contrast, taxi and private car trips negatively affect the social mixing level. Generally, mobility (as reflected by the number of total trip legs and size of activity space) has a positive relationship with individual social mixing.

Mass Transit Railway extension and place-level social mixing

We found that the extension of MTR has increased \(D{M}_{{\rm{\alpha }}}\) significantly (a 5–6% increase, controlling for the change of trips and the change of population) while exerting little effect on \(N{M}_{{\rm{\alpha }}}\). The major reason is that locations chosen for the MTR extension tended to be more socially mixed even before the MTR extension (SI Table S9: the inner ring group had 4.9% higher daytime mixing and 11.4% higher nighttime mixing than the outer ring group before the MTR extension).

Even though the change in residential social mixing (NMα) brought by the MTR extension is negligible, the distribution of income groups may have shifted. To illustrate, a place may have a similar level of social mixing, but the dominant income group might have changed from low-income to high-income. To test this, we conduct a Chi-square test among all five groups of spatial units (Table 2). The Chi-square test shows that the changes in the distribution of income groups 1, 4 and 5 across study unit groups are not homogeneous (p-value < 0.01). When we focus on the lowest income group, we can see that the treatment group’s change of group 1 (10,766) is lower than the expected value (24,209). The control group’s change of residents is higher than expected (9287), implying that the treatment study units have experienced a relative decline in low-income population (gentrification). In contrast, more people from the higher income group (group 4) have been found residing around the new MTR station areas. We also repeat this analysis by changing the definition of low and high-income groups using the government low-income households’ line by family size each year. The results indicate consistent findings (see SI Note 7). While there is support for H3 at the place level during daytime, the impact of the MTR extension on nighttime social mixing is negligible.

Table 2 Chi-square Test: Comparing changes of the residential population by income groups.

Mass Transit Railway extension’s positive effect on individual social mixing

First, we present the changes in travel behaviour by individual groups. Figure 5a–c show that people who live closer to the new MTR catchment areas tend to travel further, use more MTR, and use fewer buses in 2011 than in 2002. Correspondingly, their unique trip legs were reduced more (Fig. 5d). Further, using whether people’s home location is in a newly-established MTR catchment area as the instrument, we estimate that a one percent increase in MTR trips is associated with a 0.13% increase in individual social mixing (see Table S10 for the full result). This result further emphasises the importance of public transit in improving social mixing.

Fig. 5: Impact of metro extension on individual social mixing.

a People living near the new MTR stations have seen a larger activity perimeter than other groups. b People close to the new MTR stations have seen more MTR trips than other groups. c The change in percentage of bus trip legs. d The change of percentage of total unique trip legs.

Two decades of social mixing

Lastly, we test H4 and present the differences in social mixing experienced by individuals in the daytime and nighttime from 1992 to 2011. Firstly, individual-experienced daytime mixing (mobility-based social mixing) is consistently higher than nighttime mixing (residential social mixing) (Fig. 6d). Still, daytime mixing (\(D{M}_{i}\)) has a slightly downward trend, indicating that people became less socially mixed over the two decades. Furthermore, residential or nighttime mixing (\(N{M}_{i}\)) peaked in 2002 and declined afterwards, resonating with the slowdown of public housing development in Hong Kong, which became very difficult during the decade (Loo & Chow, 2011). Specifically, for mobility-based or daytime mixing, people from lower-income groups experienced higher social mixing in 1992 than in 2002 and 2011 (Fig. 6e). Overall, the highest and lowest income groups (groups 1 and 5) have been the most segregated at night across the two decades.

Fig. 6: Two decades of social mixing in Hong Kong experienced by individuals day and night.

a Individual social mixing by major trip purpose. HBS home-based-school, HBO home-based-other, EB employment-based, HBW home-based-work, NHB non-home-based. b Individual mixing by gender. c Individual mixing by age groups. d Differences between individual social mixing and nighttime social mixing in residential areas from 1992 to 2011. e Individual social mixing by each year. f Residential social mixing by income group and year.

Secondly, the difference between men and women regarding daytime or mobility-based social mixing has reduced (Fig. 6b). Thirdly, the relationships between trip purpose and social mixing have also changed. People whose primary trip purposes were employment-based (EB) have become more socially mixed during the two decades, while the people who mainly conducted home-based trips (HBS, HBO and HBW) have generally become less socially mixed (Fig. 6a). Finally, variations of mobility-based social mixing across different age groups increased from 1992 to 2011.

Discussion

This research makes four main contributions to the existing activity-based segregation studies. First, resonating with the U.S.-based studies utilising mobile phone records (Athey et al., 2021; Yabe et al., 2023), transit cards (Xu et al., 2019), and Twitter posts (Q. Wang et al., 2018), the TCS-based measures of daytime and nighttime social mixing provide further evidence that residential segregation extends to the activity space even in a high-density urban environment such as Hong Kong. Theoretically, our results anchor the UGCoP effect. In addition, building upon the structural variance of inequality (Boterman & Musterd, 2016; Cagney et al., 2020; Cook et al., 2024), this study shows that the social mixing level is related to a person’s age, income group, and main trip purpose. We found that teenagers experienced less social mixing during the day. Similarly, people whose trips are mostly for school experience the lowest social mixing. In addition, people from the lowest income group experienced significantly lower social mixing than those from higher income groups. This pattern has persisted since 1992. If we consider exposure to a diverse social group to be a key to long-term personal development, especially for low-income families (Chetty et al., 2022), this result warns us that even though mobility moderates social segregation for individuals, people from the most vulnerable group still benefit less. Therefore, local travel incentives such as low transit fares for older adults, students, and low-income families can have a long-term impact by encouraging people from these groups to increase their social exposure.

Second, this project utilises the detailed trip mode information provided by TCS to analyse the trip modes’ impact on social mixing at place and individual levels across years. At the place level, we found that, conditioning on place amenity types (described by the composition of POIs), population density, and visitor density, a higher percentage of bus trips and MTR trips is associated with higher place-level mixing. The effect is consistent across both cross-sectional and longitudinal studies. We also found that the POIs included in the study contribute significantly to place-level social mixing. Resonating with previous studies on POI and activity-based social segregation (Fan et al., 2023; Moro et al., 2021), we found that food-related POIs contribute highly to place-level social mixing during the day and slightly less to nighttime social mixing. In parallel, a series of cross-sectional models also show the positive effect of public transit on social mixing at the individual level. Longitudinally, we used an instrumental variable to show that the increased usage of MTR is associated with increased social mixing on average. All these results support the social benefits of an urban public transit system as it moderates social segregation in the long run.

Despite the promising effect of public transit, when evaluating social mixing at the place level, the intricate impact of MTR extensions on social mixing is worth mentioning. Rather than giving a simple answer as to whether MTR extensions are conducive to increasing social mixing, our analysis reveals that MTR extensions were biased towards areas already characterised by a high degree of social mixing during daytime and nighttime prior to the extension. Consequently, the additional benefits brought about by such extensions may have been overstated, given the lack of consideration for the pre-existing contextual conditions. Theoretically, this points to the pitfall of potential “place-selection bias” in transport infrastructure planning. Despite the reputation of transit-oriented development (TOD) for promoting mixed-use development, rail extension proposals are predominantly implemented in areas that are already socially diverse. Additionally, our analysis indicates a decrease in the population of the lowest-income group in areas impacted by MTR extensions. This result resonates with studies on railway extension’s gentrification effects on local neighbourhoods (He et al., 2018; Liang et al., 2022; Loo & du Verle, 2017). It further cautions against replacing lower-income neighbourhoods with higher-income ones through metro line expansions.

Finally, it is imperative to recognise that walking is positively associated with social mixing at the place level. This reinforces the need for enhancing walkability by creating spaces that are welcoming to all income groups within cities. In contrast, people who primarily walk for their daily activities are associated with lower individual social mixing. These findings remind us to reconsider the potential implications of local living initiatives: people who benefit from a very local life may also sacrifice their social diversity.

Conclusion

In this study, we used travel survey data to demonstrate that public transit moderates social segregation in a city. By analysing both the patterns and trends of social mixing at the place level and the individual level, we highlighted the value of using human mobility data in examining social mixing in cities over time. This study still has the following limitations. First, given the long timespan of this study, there could be systematic changes in how the survey data were collected, especially in 1992. Second, we have not yet been able to identify changes in other transport infrastructures that happened simultaneously with the MTR extension, which can be a confounding factor in the difference-in-difference setting. Lastly, the study site, Hong Kong, is a high-density city. Its public transport is accessible to around 98.8% of the population (Footnote 3), comparable to European cities like Paris and London, but much higher than most U.S. cities included in many previous activity-based segregation studies. Therefore, the effect of public transit and other mobility options is worth further investigation in a less accessible context. These factors should be further evaluated in future studies.