The city as text

Reades, Jonathan; Hu, Yingjie; Tranos, Emmanouil; Delmelle, Elizabeth

doi:10.1038/s44284-025-00314-x

Review Article
Published: 01 September 2025

Nature Cities volume 2, pages 794–800 (2025)

Preface

Urban researchers now have access to vast amounts of textual data, from social media and news to planning documents and property listings. These data provide important information about the activities of people and organizations in urban environments. Meanwhile, recent advances in computational tools, including large language models, have expanded our ability to analyze text. Here we explore how these tools are reshaping the ways we analyze, understand and theorize the city through text. By outlining key developments, applications and challenges, we argue that text is no longer a ‘fringe resource’ but a central component of urban analytics, with the potential to connect quantitative and qualitative researchers.

Author information

Authors and Affiliations

Centre for Advanced Spatial Analysis (CASA), UCL, London, UK
Jonathan Reades
Department of Geography, University at Buffalo, Buffalo, NY, USA
Yingjie Hu
School of Geographical Sciences, University of Bristol, Bristol, UK
Emmanouil Tranos
Department of City and Regional Planning, University of Pennsylvania, Philadelphia, PA, USA
Elizabeth Delmelle

Contributions

J.R. conceived and designed the experiments. All authors contributed materials and analysis tools and wrote the paper.

Corresponding author

Correspondence to Jonathan Reades.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Cities thanks Julia Harten, Xinyu Fu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Reades, J., Hu, Y., Tranos, E. et al. The city as text. Nat Cities 2, 794–800 (2025). https://doi.org/10.1038/s44284-025-00314-x

Received: 30 July 2024
Accepted: 30 July 2025
Published: 01 September 2025
Version of record: 01 September 2025
Issue date: September 2025
DOI: https://doi.org/10.1038/s44284-025-00314-x