Table 1 Definition of variables used in the decision tree model

From: US cities are defined by rings and pockets with limited socioeconomic mixing

| Variable | Definition |
| --- | --- |
| Dist. to CBD | Distance in kilometers from the block group to the CBD; CBD identified as the largest cluster of POIs within 50 m of each other using the DBSCAN algorithm |
| Amenity (H) | Shannon entropy of POI types, calculated as \(H=-\sum_{x} p(x)\log p(x)\), where \(p(x)\) is the proportion of POIs in each six-digit NAICS industry classification |
| Density | Population density (population per square kilometer) |
| Amenity (number) | Count of POIs |
| Median income | Median household income in US dollars |
| Nonwhite (%) | Percentage of population not identifying as non-Hispanic white |
| College (%) | Percentage of adults (age 25+ years) with at least a bachelor’s degree |
| Household size | Average number of people per household |
| Vacancy rate | Percentage of rental units that are vacant |
| Rent burden | Percentage of households spending more than 30% of income on rent |
| Under 16 (%) | Percentage of population under 16 years of age |
| Unemployed (%) | Percentage of labor force that is unemployed |

  1. The model uses a combination of demographic, economic and urban structure variables to predict neighborhood segregation and isolation. Data come from the US Census Bureau’s American Community Survey and SafeGraph location data. All variables are measured at the Census block group level.
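
To make the Amenity (H) definition concrete, the following is a minimal sketch of how the Shannon entropy of POI types could be computed for a single block group. It is not the authors' code. The function name `shannon_entropy`, the example NAICS codes, the use of the natural logarithm, and the choice to return 0 for a block group with no POIs are all assumptions made for illustration; the table does not state the log base or how empty block groups are handled.

```python
import math
from collections import Counter


def shannon_entropy(poi_types):
    """Return H = -sum_x p(x) * log p(x) over the POI types in one block group.

    poi_types: iterable of six-digit NAICS codes, one entry per POI.
    Uses the natural logarithm (an assumption; the source gives no base).
    """
    counts = Counter(poi_types)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # assumption: a block group with no POIs gets H = 0
    proportions = [n / total for n in counts.values()]
    return -sum(p * math.log(p) for p in proportions)


# Hypothetical block group: six POIs spread over three illustrative NAICS codes.
example_pois = ["722511", "722511", "722511", "445110", "445110", "812112"]

amenity_h = shannon_entropy(example_pois)   # Amenity (H)
amenity_number = len(example_pois)          # Amenity (number)
print(f"Amenity (H) = {amenity_h:.3f}, Amenity (number) = {amenity_number}")
```

Under these assumptions, H is zero when every POI in a block group shares one NAICS code and reaches its maximum, \(\log k\), when POIs are spread evenly over \(k\) codes, so the variable captures the diversity of amenities while Amenity (number) captures their quantity.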