Introduction

The impacts of climate change pose enormous challenges for governments, institutions, and economies worldwide1. Climate-induced disasters—including storms, wildfires, floods, droughts, and heatwaves—are increasing in both frequency and severity, necessitating urgent climate adaptation strategies to enhance resilience and reduce vulnerability (IPCC, 2022). While substantial progress has been made in understanding and implementing climate mitigation efforts to reduce greenhouse gas emissions, adaptation remains comparatively neglected, both in scholarship and practice2,3,4,5,6. This imbalance persists despite a growing recognition that adaptation is not merely a reactive response but a necessary pillar of long-term climate governance.

This gap is particularly pronounced in two overlapping contexts: the Global South and authoritarian regimes. Countries in the Global South face heightened exposure to climate impacts but often lack the institutional capacity and financial resources necessary for effective adaptation1,7,8,9,10,11,12. In contrast, existing research on political behaviors, attitudes, and policy enactment in the aftermath of climate disasters has predominantly focused on developed democracies13,14,15,16,17,18,19,20,21,22.

Furthermore, research on citizen-driven demands for climate adaptation in non-democracies remains limited. Existing studies focusing on non-democracies often delve into the incentives and behaviors of political elites and bureaucracies in shaping sustainability outcomes23,24,25,26. Among the few that examine public responses to climate disasters in authoritarian contexts, most rely on sentiment analysis of commercial social media platforms, offering largely descriptive or correlational insights27,28,29 or assess public perceptions and preferences regarding adaptation strategies30,31,32.

In democratic systems, public opinion exerts influence through electoral accountability, incentivizing politicians to respond to citizen preferences. In authoritarian regimes, however, public sentiment holds substantially less direct political weight. Yet, under systems of “responsive authoritarianism,” as in China, state-sanctioned mechanisms for expressing grievances and needs, though mediated, can nonetheless serve as conduits for bottom-up influence33,34. Understanding how adaptation demands are articulated through such official venues is, therefore, critical to evaluating political behavior under authoritarian rule. This study centers on one such institution: the Local Leaders’ Message Board (LLMB), China’s largest government-run petition system.

Against this backdrop, this study addresses a critical empirical and conceptual gap in understanding how citizens in developing autocracies articulate resilience-oriented demands directly to the state in the aftermath of climate-linked extreme weather events. It employs a mixed-methods approach that leverages recent advancements in causal inference to examine how a mega disaster—the 2021 Henan flood in China—drove public appeals for government-led risk mitigation and infrastructural resilience. Specifically, it investigates whether the disaster catalyzed a measurable increase in citizen appeals for flood-related protection and remediation, both within Henan and across other provinces. In addition to estimating treatment effects, the study examines the substantive content of these appeals to identify how the public perceived flood-related risks and articulated demands for disaster preparedness within a non-democratic context. This contributes to a growing body of research on climate risk perception and citizen-state interaction in authoritarian regimes while also offering insight into how extreme weather events may drive bottom-up mobilization for resilience, even in the absence of climate-specific language.

China is emblematic of the challenges facing large, developing autocracies grappling with severe climate risks and limited institutional experience in adaptation planning2,6. The 2021 Henan flood marked a major turning point in China’s climate governance. Triggered by record-breaking rainfall from July 17 to 23, the event drew national and international attention and prompted the State Council—China’s highest administrative authority—to launch a formal investigation. Scientific attribution research conducted by the Chinese Academy of Sciences and Peking University concluded that anthropogenic climate change increased the rainfall intensity by 7.5%35. In the provincial capital of Zhengzhou, 617.1 mm of rain fell over three days, nearly equivalent to the annual average of 640.8 mm36. On July 20 alone, the city recorded 201.9 mm of rainfall in a single hour—the highest hourly total ever recorded in China37. The deluge overwhelmed infrastructure, flooded hospitals, and breached riverbanks, ultimately leaving 398 people dead or missing (380 of whom were in Zhengzhou), displacing nearly 14 million residents, and causing direct economic losses of 120.06 billion RMB (approximately 16.5 billion USD)37.

This study finds that the 2021 Henan flood triggered a sharp and sustained increase in citizen petitions demanding flood protection, infrastructure upgrades, and neighborhood safety—demands that, although not explicitly framed in climate terms, are functionally aligned with adaptation. Evidence of spillover effects in unaffected provinces highlights how disaster salience can spread through informational and affective channels, reshaping public expectations for state-led resilience. Topic modeling reveals that these appeals were grounded in concrete, localized vulnerabilities rather than abstract climate awareness. Taken together, the findings highlight how extreme weather events can trigger bottom-up political engagement in authoritarian contexts, suggesting that incorporating citizen perspectives—however obliquely expressed—can enhance the relevance and effectiveness of adaptation policies, particularly where technocratic, top-down approaches are the norm.

Results

Citizen petitioning and the Henan flood context

To examine the political aftermath of the flood, this study draws on a comprehensive and proprietary dataset of citizen queries submitted to the LLMB. Launched in 2008 and operated by People.cn—the official online arm of the People’s Daily—the LLMB is the largest centrally operated digital petition platform in China, covering all 32 provincial-level administrative units. All citizens can use the platform to report local misgovernance, request administrative support, or offer suggestions to government agencies.

When submitting a query, individuals choose a topic category and specify the target government office. Messages are subject to review and potential censorship by the platform before being posted publicly, after which local governments issue public responses. This dual public-facing structure, where both the citizens’ appeal and the government’s reply are visible, offers a unique lens into bottom-up engagement under authoritarian rule.

While detailed demographic information for individual LLMB petitioners is unavailable, regional variation in usage appears to be correlated with levels of urbanization and internet penetration38. These structural factors likely shape who participates and what types of risks and demands are articulated. As such, the petition data should be understood as reflecting the expressed preferences of a politically engaged and digitally connected subset of the population, rather than a fully representative cross-section of Chinese society.

Submitting to LLMB constitutes a form of political behavior rather than merely expressing opinions. LLMB messages are formalized appeals concerning routine governance issues, such as water access, transportation, education, and environmental hazards38,39. As a tool of “responsive authoritarianism,” the LLMB enables the central state to monitor local performance and absorb grassroots pressure without relinquishing political control. It thus serves as both a pressure-release valve and an instrument of bureaucratic accountability33,39,40.

Crucially, the LLMB is analytically distinct from commercial blogs or microblogs, which have received substantial scholarly attention for their relationship to government censorship41,42. Unlike social media platforms that amplify content through virality, LLMB enforces categorization, moderation, and hierarchical routing of complaints to government offices. Government-run microblogs, by contrast, are more often used for propaganda, information dissemination, or superficial public engagement rather than formal grievance redress43,44.

The dataset used in this study includes all LLMB submissions between January 1 and October 4, 2021. Henan Province serves as the treatment group, while all other provinces serve as the control group. The time frame ends on October 4 to avoid contamination from a second major flood event in Henan that began on October 5, 2021.

To measure public demand for disaster-related risk mitigation, the dependent variable is defined as the daily percentage of petitions in each province that pertain to adaptation. In this study, climate adaptation requests refer to citizen petitions that express concern about flood risk, drainage, and infrastructure vulnerabilities, including both short-term appeals for immediate protection and longer-term calls for systemic resilience. While some petitions may reference damage repair, only those that imply anticipatory or structural risk reduction, rather than routine maintenance, are coded as adaptation-oriented. These petitions are interpreted as a proxy for adaptation-relevant concerns, based on the rationale that appeals concerning flood impact management most directly reflect citizen-identified needs for enhanced local resilience45.

To identify relevant petitions, the study employs a keyword-based filtering approach. Messages are flagged as flood-related if they include one or more of the following terms: “flood water,” “heavy water,” “water disaster,” “flood disaster,” “torrent,” “flood waterlogging,” “rainstorm,” “flood control,” “flood prevention,” “flood discharge,” “flood drainage,” and “flood storage.” These terms were selected to capture citizen concerns about both immediate flood impacts and the need for anticipatory or preventive infrastructure.
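The filtering step and the resulting outcome variable can be illustrated with a minimal sketch. It assumes petitions are available as (province, date, text) records and uses the English renderings of the keywords given above; the original Mandarin terms are not listed here, and the function names and sample records are hypothetical. It covers only the keyword flag, not the further coding of petitions as anticipatory rather than routine maintenance.

```python
from collections import defaultdict

# English renderings of the keywords listed in the text; the petitions
# themselves are in Mandarin, so a real filter would use the original terms.
FLOOD_KEYWORDS = (
    "flood water", "heavy water", "water disaster", "flood disaster",
    "torrent", "flood waterlogging", "rainstorm", "flood control",
    "flood prevention", "flood discharge", "flood drainage", "flood storage",
)

def is_flood_related(text):
    """Return True if the message contains at least one keyword."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in FLOOD_KEYWORDS)

def daily_adaptation_share(petitions):
    """Map (province, date) -> percentage of that day's petitions flagged.

    `petitions` is an iterable of (province, date, text) tuples.
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for province, date, text in petitions:
        key = (province, date)
        totals[key] += 1
        if is_flood_related(text):
            flagged[key] += 1
    return {key: 100.0 * flagged[key] / totals[key] for key in totals}

# Hypothetical toy records, not drawn from the LLMB data.
sample = [
    ("Henan", "2021-07-25", "Please improve flood control near our compound."),
    ("Henan", "2021-07-25", "The school bus route was cancelled."),
    ("Henan", "2021-07-25", "Rainstorm water filled the basement again."),
    ("Hubei", "2021-07-25", "Street lights are broken."),
]
shares = daily_adaptation_share(sample)
```

Province-days with no petitions at all do not appear in the output, so a full panel would need those cells to be handled explicitly.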

To contextualize the Henan flood within the broader landscape of climate disasters during the study period, this study compares its severity to that of other major—but less destructive—floods occurring over the same timeframe. Information on climate disasters is compiled primarily from authoritative publications issued by the Ministry of Emergency Management (MEM), the central government agency responsible for disaster reporting and response in China. These official sources serve as the foundation for constructing a comprehensive and detailed list of climate-related disasters, particularly flooding events, across provinces. The dataset includes key impact metrics, such as the number of total deaths, the number of people affected, and estimated economic damages (in 2021 USD).

The EM-DAT database, maintained by the Centre for Research on the Epidemiology of Disasters at the Catholic University of Louvain in Belgium, remains a valuable global reference in disaster research. However, for the Chinese context, EM-DAT entries were occasionally found to lack granularity or to diverge from official domestic reporting. Therefore, EM-DAT is used selectively for cross-validation, while MEM records and provincial emergency bulletins are treated as primary data sources. In case of missing or ambiguous MEM data, supplementary information is drawn from other credible official reports and state media summaries.

Surge in adaptation demands in Henan

Figure 1 presents the log-transformed percentage of daily climate adaptation requests in Henan Province and the rest of the country from January 1 to October 4, 2021. These trends are visualized on a log scale to improve interpretability at the lower end of the distribution and to better capture meaningful changes in rare event data.

Fig. 1: Log-transformed daily climate adaptation requests as a percentage of total requests in Henan Province and the rest of the country.

Each blue dot represents the percentage of daily climate adaptation requests relative to all requests received that day, plotted on a log scale. Due to the log transformation, values equal to zero are not displayed—these reflect the complete absence of adaptation-related petitions on those days. The red line is a LOESS smoothing curve (span = 0.3), illustrating nonparametric trends in request volume, with the shaded blue region denoting the 95% confidence interval. Red dashed vertical lines mark the beginning and end of the July 17–23, 2021, rainstorm in Henan Province.
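The handling of zero-valued days under the log transformation can be sketched as follows. This is an illustration only: it assumes a natural logarithm (the base is not stated in the text), and the input values are hypothetical.

```python
import math

def log_share(percentages):
    """Return (day index, log value) pairs, skipping zero-valued days.

    Days with a share of zero have no finite logarithm, so they are
    dropped from the plotted series rather than displayed.
    """
    return [(i, math.log(p)) for i, p in enumerate(percentages) if p > 0]

daily_percent = [0.0, 0.0, 2.5, 0.4, 0.0, 1.0]  # hypothetical values
points = log_share(daily_percent)
```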

In the months leading up to the flood, Henan registered virtually no adaptation-related requests. Immediately following the onset of the July 2021 rainstorm, however, the percentage of climate adaptation requests in Henan spiked dramatically. While this initial surge later declined, levels remained markedly elevated through late summer and early fall, indicating a persistent shift in public attention and demand for government-led climate adaptation. In contrast, the rest of the country exhibited a comparatively stable baseline throughout the period, despite six provinces also experiencing major flood events of lesser magnitude and impact than the 2021 Henan flood.

Spillover effects across provinces

The severe flood that struck Henan Province had measurable spillover effects, prompting a rise in climate adaptation demands in other regions of China (Fig. 2). From the onset of the rainstorm on July 17 through October 4, approximately 20% of climate adaptation requests (13 out of 67) submitted outside Henan explicitly referenced “Zhengzhou,” “Henan,” or the shorthand “720” and “7.20” commonly used to denote the disaster. One potential concern is that some of these messages might have been submitted by Henan residents posting to other provinces’ LLMB portals. However, IP address data confirm that all 13 messages originated from users physically located in the provinces where the petitions were submitted. These references suggest that the scale and destructiveness of the Henan flood prompted residents in other regions to reflect on their own vulnerabilities and to urge local governments to take preventive action, even in the absence of direct exposure to flooding. Illustrative excerpts from these requests are shown in Table 1.
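As an illustration of how such references might be detected, the sketch below matches the terms named above and reproduces the reported share (13 of 67). The pattern and function names are hypothetical, and matching on the digits "720" alone is crude, so flagged messages would need manual review in practice.

```python
import re

# Flood references named in the text; "720" and "7.20" denote July 20.
# Real petitions are in Mandarin, so the place names here are placeholders.
HENAN_REFERENCE = re.compile(r"Zhengzhou|Henan|7\.?20")

def references_henan(text):
    """Return True if the message mentions the Henan flood by name or date."""
    return HENAN_REFERENCE.search(text) is not None

def spillover_share(n_referencing, n_total):
    """Percentage of out-of-province requests that reference the flood."""
    return 100.0 * n_referencing / n_total

share = spillover_share(13, 67)  # counts reported in the text
```

The computed share is about 19.4%, consistent with the "approximately 20%" stated above.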

Fig. 2: Geographic distribution of spillover requests referencing the 2021 Henan flood.

Provinces shaded in gray submitted one or more climate adaptation requests between July 17 and October 4, 2021, that explicitly referenced the Henan flood. Henan Province is outlined in black. Color shading indicates the number of spillover-related requests per province on a log-transformed scale.

Table 1 Sample excerpts of citizen requests for climate adaptation referencing the Henan floods made from outside Henan Province

At the same time, it is essential to acknowledge the limitations of content-based filtering: citizens affected by the Henan flood may not always explicitly reference it. Media coverage and emotional salience could still motivate individuals to request adaptation measures without directly naming the event. Such latent or unobservable spillover effects cannot be systematically detected, but their likely presence suggests that this analysis may underestimate the full influence of the Henan flood on public adaptation demands.

Distinctiveness of the Henan flood

Figure 3 visualizes trends in the daily percentage of climate adaptation requests (log-transformed) alongside key disaster severity indicators across seven Chinese provinces affected by major floods from January 1 to October 4, 2021. Following established conventions46, disaster severity is captured through three metrics: total fatalities, the number of people affected, and direct economic losses (converted to 2021 USD). Henan Province, which experienced the deadliest and most economically damaging flood, stands out with the largest and most sustained increase in climate adaptation requests. Shaanxi Province, which experienced two distinct flood events during the study period, exhibited a more noticeable increase in adaptation requests than other provinces, though the magnitude remained smaller than that observed in Henan.

Fig. 3: Trends in daily climate adaptation requests and disaster severity across flood-affected provinces from January 1 to October 4, 2021.

The left column displays the log percentage of adaptation-related requests relative to all requests submitted each day. Blue dots represent daily values plotted on a log scale, and shaded red regions indicate major flood periods specific to each province, as reported in official records. Due to the log transformation, values equal to zero are not displayed—these reflect the complete absence of adaptation-related petitions on those days. The three right columns present lollipop plots showing the total deaths, number of people affected, and economic damages (in 2021 U.S. dollars) associated with each flood. Sources: Ministry of Emergency Management of the People’s Republic of China66; Centre for Research on the Epidemiology of Disasters67; Xinhua68; Xinhua69; People’s Daily70; Department of Emergency Management of Hebei Province71.

Estimating causal effects of the Henan flood

To assess the timing and persistence of treatment effects and to test the parallel trends assumption, a dynamic difference-in-differences (DiD) model using event-time indicators is employed. This method, also known as an event study design, estimates how citizen demand for adaptation evolved on a weekly basis before and after the flood’s onset on July 24, 2021. The dynamic specification enables a flexible examination of treatment timing, including potential anticipatory behavior, and provides a visual diagnostic for pre-treatment trend violations47,48.

Figure 4 presents the estimated weekly treatment effects relative to the week prior to the flood. Pre-treatment coefficients are centered around zero and, with one marginally significant exception, statistically insignificant, supporting the plausibility of the parallel trends assumption. The single exception is consistent with expected random variation rather than evidence of systematic pre-trend violations47. Following the onset of the Henan flood, the results show a marked and sustained increase in adaptation-related requests in Henan, with several post-treatment coefficients reaching conventional significance thresholds. The pattern suggests that the flood catalyzed a meaningful and persistent shift in public demand for government-led climate adaptation.

Fig. 4: Dynamic difference-in-differences estimates of the weekly effects of the 2021 Henan flood on the percentage of climate adaptation requests, relative to the week immediately preceding the flood (Week-1).

Week-1 is the reference period prior to the onset of the flood (July 24, 2021). The dashed vertical line marks Week 0. Shaded regions represent 95% confidence intervals clustered at the province level. Asterisks denote significance levels (*p < 0.1, **p < 0.05, ***p < 0.01). Henan’s adaptation request share was zero in all weeks prior to the flood, resulting in some event-time coefficients being absorbed by fixed effects or lacking identifying variation. Estimates are plotted for weeks in which sufficient variation allows for coefficient identification.

Importantly, these estimates are likely conservative. As shown in earlier sections, the Henan flood generated spillover effects in other provinces. Through national media coverage and heightened public awareness, residents outside Henan may have also experienced increased concern about climate risks, even without direct flood exposure49. These indirect effects reduce the contrast between treated and control units, potentially attenuating measured treatment effects. In addition, contemporaneous but less severe flooding in other provinces further narrows this contrast. As a result, the estimated effect size likely underrepresents the true magnitude of public political response to this mega disaster.

To complement the dynamic analysis, a standard DiD model was employed to estimate the average treatment effect of the Henan flood. Results indicate that the percentage of adaptation-related requests in Henan increased by approximately 0.39 percentage points relative to other provinces (p = 0.007). Although this numerical change appears modest, it is substantively significant given that Henan’s pre-flood baseline was effectively zero. The flood thus appears to have catalyzed the emergence of climate adaptation as a new domain of citizen petitioning. As noted earlier, spillover effects into the control group, driven by national media coverage and heightened risk awareness, likely attenuate the estimated effect. Accordingly, this result should be interpreted as a conservative, lower-bound estimate of the true impact.
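The logic of the average effect can be illustrated with a two-by-two comparison of means. In a balanced panel with a single treated unit and a common treatment date, the two-way fixed effects coefficient on the treated-by-post interaction coincides with this difference of differences. The sketch below is illustrative only: the data are hypothetical, the function name is an assumption, and it produces no standard errors.

```python
from statistics import mean

def did_estimate(rows, treated_unit, post_start):
    """2x2 DiD from (unit, day, outcome) rows; days >= post_start are 'post'."""
    cells = {(g, p): [] for g in (True, False) for p in (True, False)}
    for unit, day, y in rows:
        cells[(unit == treated_unit, day >= post_start)].append(y)
    treated_change = mean(cells[(True, True)]) - mean(cells[(True, False)])
    control_change = mean(cells[(False, True)]) - mean(cells[(False, False)])
    return treated_change - control_change

# Hypothetical daily shares (percent) for one treated and two control units.
toy_rows = [
    ("Henan", 0, 0.0), ("Henan", 1, 0.0), ("Henan", 2, 0.5), ("Henan", 3, 0.4),
    ("A", 0, 0.1), ("A", 1, 0.1), ("A", 2, 0.1), ("A", 3, 0.2),
    ("B", 0, 0.0), ("B", 1, 0.2), ("B", 2, 0.1), ("B", 3, 0.1),
]
estimate = did_estimate(toy_rows, treated_unit="Henan", post_start=2)
```

Because any increase in the control group is subtracted from the treated change, spillovers into control provinces mechanically shrink the estimate, which is why the reported figure is read as a lower bound.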

Themes in adaptation appeals: a topic modeling analysis

Topic modeling of Henan climate adaptation petitions yielded six distinct themes, each grounded in tangible, place-specific risks associated with flood impacts. As shown in the keyword bar chart (Fig. 5), citizens focused primarily on infrastructural concerns. Topic 0 centers on river management and urban development, with keywords such as “riverway,” “construction,” and “city.” Topic 1 emphasizes major transit disruptions and neighborhood vulnerability, referencing “expressway,” “flood,” and “old city district.” Topic 2 focuses on residential services and drainage problems, with keywords like “water supply” and “basement.” Topic 3, dominated by terms such as “new river” and “management,” points to localized restoration or infrastructure projects. Topics 4 and 5 include broader socio-political framings, highlighting local governance, township priorities, and calls for state-led development.

Fig. 5: Keyword bar charts for six topics derived from Henan climate adaptation petitions.

Each panel displays the top-ranked keywords for a given topic, identified through BERTopic modeling. The horizontal axis represents the class-based TF-IDF score, which reflects the relative distinctiveness of each term within the corpus. Topics are indexed numerically (0–5) for identification; the numbering does not indicate importance. All keywords are English translations from the Mandarin Chinese petitions. Scores are in arbitrary units. TF-IDF: Term Frequency-Inverse Document Frequency.
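For readers unfamiliar with the class-based score, the sketch below gives a simplified rendering of the weighting described in the BERTopic documentation: a term's within-topic frequency (normalized by topic length) multiplied by log(1 + A / f), where A is the average number of words per topic and f is the term's frequency across all topics. It is not the library's implementation, the function name is hypothetical, and the toy tokens merely reuse keywords mentioned in the text.

```python
import math
from collections import Counter

def class_tfidf(class_tokens):
    """class_tokens: dict topic -> list of tokens pooled over its documents."""
    counts = {c: Counter(tokens) for c, tokens in class_tokens.items()}
    term_totals = Counter()
    for counter in counts.values():
        term_totals.update(counter)
    # Average number of words per topic (A in the formula above).
    avg_words = sum(len(t) for t in class_tokens.values()) / len(class_tokens)
    scores = {}
    for c, counter in counts.items():
        n_words = sum(counter.values())
        scores[c] = {
            term: (freq / n_words) * math.log(1 + avg_words / term_totals[term])
            for term, freq in counter.items()
        }
    return scores

# Hypothetical pooled tokens for two topics.
toy = {
    0: ["riverway", "construction", "city", "riverway"],
    1: ["expressway", "flood", "city", "flood"],
}
scores = class_tfidf(toy)
top_term_0 = max(scores[0], key=scores[0].get)
```

In this toy case "city" appears in both topics and therefore scores lower than "construction," which appears equally often within topic 0 but nowhere else.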

Notably, terms such as “climate change,” “global warming,” or “climate warming” were never mentioned in citizen appeals. This underscores that the petitions reflect concrete calls for safety and infrastructure improvements in response to an extreme event, rather than ideational commitments to climate adaptation per se. Public demand for government-led disaster resilience was anchored in practical concerns rather than in the language of climate governance.

Discussion

Public opinion serves as a cornerstone of climate policymaking in democracies, where elections, protests, and advocacy enable citizens to influence outcomes13,14,15,16,17,18,19,20,21,22,50. In authoritarian regimes, however, such influence is tightly constrained and typically operates through narrow, state-sanctioned channels33,39,40. China’s LLMB is one such channel, offering a moderated space where citizens can submit petitions that are both visible to the public and legible to the state. This study demonstrates that even within such confines, a high-profile, devastating extreme weather event can trigger a visible and sustained surge in citizen appeals for risk reduction.

Following the 2021 Henan flood, the volume of petitions addressing drainage, infrastructure, and neighborhood safety rose sharply. These petitions, although devoid of explicit references to climate change, articulate concrete vulnerabilities and needs that closely align with the objectives of climate adaptation. This challenges a common assumption in parts of the literature that climate awareness must precede climate action51,52,53. In this context, “climate awareness” refers specifically to an understanding of anthropogenic climate change as a systemic, long-term driver of local risks, distinct from more localized or experiential forms of flood risk perception. Instead, these findings suggest that immediate, tangible threats may be more powerful catalysts for adaptation-oriented demands than abstract knowledge of climate change.

Moreover, the detection of spillover effects in provinces unaffected by the Henan flood reveals how adaptation concerns can diffuse through informational and emotional channels, such as news coverage and shared national sentiment. This contrasts with much of the existing literature, which emphasizes how personal experience with extreme weather shapes individual beliefs and behaviors14,18,20,22,54. This study demonstrates that witnessing distant disasters—via media or social networks—can also mobilize adaptation-oriented demands, highlighting how climate risk perception becomes socially distributed when the severity of an event crosses a salient threshold.

These dynamics illustrate how publics in non-democratic systems engage climate risk through institutionally sanctioned means. Rather than collective protest or electoral pressure, citizens in China respond through institutionalized petitioning, which allows for targeted articulation of grievances. LLMB petitions did not invoke climate science, but they expressed a pragmatic grammar of vulnerability that mapped closely onto adaptation priorities. The emergence of such themes indicates that citizen preferences can be expressed meaningfully—even if obliquely—within constrained participatory channels.

Still, interpretations must consider the filtered nature of LLMB data. Messages are moderated by People.cn, and content that is politically sensitive may be redirected or withheld. Nevertheless, bureaucratic inefficiencies, especially following natural disasters, are often permissible targets of critique under China’s “responsive authoritarianism”34,39. The flood likely created a temporary expressive opening, enabling more candid public airing of concerns. The persistence of adaptation petitions weeks after the disaster suggests these were not ephemeral expressions of distress, but reflections of newly salient priorities. While disentangling genuine shifts in preferences from changing thresholds of expressibility remains methodologically difficult, the specificity and consistency of petition themes support the conclusion that the data capture authentic public demand.

Local governments also influence the expressive landscape of the LLMB through their role in responding to citizen petitions. Although the platform is centrally hosted and moderated by People.cn, local cadres play a central role in issuing responses and addressing the issues raised. After a major disaster, they may be more inclined to tolerate or even welcome critical feedback as a means of demonstrating responsiveness and bureaucratic accountability. At the same time, the visibility of petitions may be shaped by central-level moderation practices, which tend to favor technically framed, infrastructure-oriented messages over overt political critique. Despite these constraints, the emergence of concrete, localized adaptation themes in the data suggests genuine bottom-up engagement. Nevertheless, LLMB petitions reflect only expressed preferences, not latent ones. Future research, particularly with access to non-public or censored submissions, could help illuminate how self-censorship and platform filtering shape the boundaries of citizen-state interaction in authoritarian settings. Given this platform-specific focus, further work could investigate how citizen discourse unfolds on commercial social media platforms, where expressions of risk may be more spontaneous, emotionally charged, or differently framed. Such comparative analysis could help determine whether the tone, content, or perceived urgency of climate-related demands diverges across venues. Additionally, future studies could investigate intra-provincial variation in petition content and frequency, which may provide insight into localized disparities in climate vulnerability, administrative responsiveness, or digital access, particularly between urban and rural areas.

While this study centers on citizen demand, future work should evaluate the government’s responsiveness, both immediate and institutional. Did these petitions translate into policy action? Did adaptation investments follow in their wake? In addition, expanding the analysis beyond a single event to encompass a broader range of climate shocks (e.g., droughts, typhoons, and heatwaves) would enhance our understanding of how different hazards mobilize citizen concern across varied sociopolitical contexts. The findings also hold broader implications for understanding climate governance under authoritarian rule. Future research could build on this work by examining and comparing how participatory pathways operate across different non-democratic regimes.

Finally, although this study does not directly assess policy design, it provides an empirical foundation for rethinking citizen engagement in resilience planning. The recurrence of neighborhood-level concerns (e.g., culverts, basements, and old city districts) suggests that residential communities may be critical units of adaptive capacity. This is especially salient in urban China, where walled compounds house nearly 80% of the population and are managed by local property firms55. Recognizing these site-specific articulations of risk—and creating formal mechanisms to incorporate them into planning—could enhance the relevance and durability of adaptation interventions.

These insights are especially relevant considering the limitations of China’s flagship “sponge city” pilot initiative. Launched in 2015, the initiative aimed to integrate gray, green, and blue infrastructure (e.g., drainage tunnels, wetlands, parks, and permeable pavements) to enhance urban flood resilience. Zhengzhou, a pilot city, invested over 53.4 billion RMB (approximately 8.3 billion USD) in sponge city upgrades (Fu et al.56); yet the 2021 disaster exposed the pilot’s failure to address compounding hazards or integrate localized knowledge. Petition data revealed public awareness of specific infrastructural deficiencies—under-capacity drainage and unprotected culverts—that sponge city blueprints had not addressed.

This highlights the importance of complementing top-down planning with bottom-up input. Co-design—where citizens, planners, and officials jointly define risks and priorities—can help tailor adaptation strategies to the realities of everyday life57. In constrained political systems, even modest participatory mechanisms, such as analyzing citizen petitions, consulting neighborhood associations, or engaging community managers, can improve targeting, legitimacy, and resilience outcomes. While authoritarian regimes often default to centralized, technocratic policy design, this study underscores the benefits of resisting that tendency. The concrete, place-based concerns raised by citizens after the Henan flood demonstrate how co-produced knowledge can reveal critical blind spots in top-down planning. In this light, adaptation, particularly in rapidly urbanizing autocracies, should be seen not merely as a technical task but as a shared process of governing risk. Embedding citizen input within formal planning systems can make adaptation more cost-effective, contextually grounded, and politically sustainable.

Methods

Estimation strategy

To estimate both the dynamic and average effects of the Henan flood on public demand for adaptation, this study employed two complementary DiD specifications.

A dynamic DiD model using event-time indicators tested for pre-treatment trends and estimated how treatment effects evolved over time. Event-time was measured in weekly intervals relative to the onset of the Henan flood (July 24, 2021). The model was specified in Eq. (1) below:

$${Y}_{i,t}=\mathop{\sum }\limits_{{l=-a,}\atop {l\ne -1}}^{b}{\beta }_{l} \cdot {{\bf{1}}}\left({k}_{i,t}=l\right)+{\alpha }_{i}+{\gamma }_{t}+{\epsilon }_{i,t}$$
(1)

where \({Y}_{i,t}\) represented the percentage of daily climate adaptation requests in province \(i\) on day \(t\), and \(a\) and \(b\) denoted the numbers of leads and lags, respectively58. \({k}_{i,t}=t-{T}_{i}\), expressed in weeks, measured event time relative to the flood onset date \({T}_{i}\) (July 24, 2021). The term \({{\bf{1}}}\left({k}_{i,t}=l\right)\) was an indicator for week \(l\) before or after the flood. Following standard practice, \(l=-1\), the week immediately before treatment, was omitted as the reference category59. \({\alpha }_{i}\) represented province fixed effects, capturing time-invariant provincial characteristics, and \({\gamma }_{t}\) denoted date fixed effects, accounting for time-specific shocks that affected all provinces simultaneously. \({\epsilon }_{i,t}\) was the error term, clustered at the province level.
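The event-time coding in Eq. (1) can be sketched in a few lines of Python. This is a minimal illustration rather than the study's code: the function names event_week and event_dummies are hypothetical, weeks are assumed to be seven-day bins counted from the onset date, and the indicators are assumed to switch on only for the treated province, since a clock shared by all provinces would be absorbed by the date fixed effects.

```python
from datetime import date

FLOOD_ONSET = date(2021, 7, 24)  # T_i in Eq. (1), as stated in the text


def event_week(day, onset=FLOOD_ONSET):
    """Weeks relative to onset; week 0 runs from the onset day to onset + 6 days."""
    # Floor division keeps the seven days before onset in week -1.
    return (day - onset).days // 7


def event_dummies(day, leads, lags, treated, onset=FLOOD_ONSET):
    """Return {l: 0 or 1} for l in [-leads, lags], omitting the reference week l = -1.

    Assumption: indicators are non-zero only for the treated province.
    """
    k = event_week(day, onset)
    return {
        l: int(treated and k == l)
        for l in range(-leads, lags + 1)
        if l != -1
    }


# Example: a treated-province observation eight days after onset falls in week 1.
dummies = event_dummies(date(2021, 8, 1), leads=4, lags=4, treated=True)
```

Dropping week \(-1\) from the dictionary mirrors the omitted reference category, so every estimated coefficient is read relative to the week before the flood.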

The model was estimated using the feols function from the fixest package in R, which is optimized for high-dimensional fixed effects and supports cluster-robust inference. This estimator incorporated finite-sample corrections that reduced the risk of over-rejection with a moderate number of clusters, making it well-suited to applications with 32 provinces60.

As the focus of this study was the causal effect of the 2021 Henan flood—even in the presence of less severe, contemporaneous flood events elsewhere—Henan was the only treated unit, with treatment occurring once in week \(0\). Coefficients \({\beta }_{l}\) were identified only for event-time periods in which Henan had non-zero outcomes. Because treatment occurred in a single unit at a single point in time, issues associated with staggered adoption designs—such as contamination bias or negative weighting—were not applicable in this setting61,62.

To estimate the average treatment effect, a static DiD model was specified in Eq. (2) as follows:

$${Y}_{i,t}=\beta \cdot ({{\mathrm{Treated}}}_{i}\times {{\mathrm{Post}}}_{t})+{\alpha }_{i}+{\gamma }_{t}+{\epsilon }_{i,t}$$
(2)

where \({Y}_{i,t}\) represented the percentage of climate adaptation requests in province \(i\) on date \(t\); \({{\mathrm{Treated}}}_{i}\) was a binary indicator equal to 1 if province \(i\) was Henan and 0 otherwise; \({{\mathrm{Post}}}_{t}\) was an indicator for the post-treatment period, equal to 1 for dates on or after July 24, 2021, and 0 otherwise. The model included province fixed effects \({\alpha }_{i}\), which controlled for time-invariant characteristics specific to each province, and date fixed effects \({\gamma }_{t}\), which accounted for temporal shocks or seasonal patterns common to all provinces. The error term \({\epsilon }_{i,t}\) was clustered at the province level to allow for arbitrary serial correlation within provinces over time. The coefficient of interest, \(\beta\), captured the average treatment effect, that is, the differential change in the percentage of adaptation requests in Henan relative to other provinces following the flood.
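In a balanced panel with a single treated unit and one treatment date, the two-way fixed effects coefficient \(\beta\) in Eq. (2) should coincide with the familiar two-by-two difference in group means. The sketch below computes that quantity on a toy panel. It is an illustration only: the numbers are made up, the helper did_estimate is hypothetical, and the clustered standard errors are not reproduced.

```python
from statistics import mean


def did_estimate(rows, treated_unit, post_start):
    """rows: iterable of (unit, t, y).

    Returns (treated post - treated pre) - (control post - control pre).
    """
    cells = {(g, p): [] for g in (0, 1) for p in (0, 1)}
    for unit, t, y in rows:
        group = int(unit == treated_unit)   # Treated_i
        period = int(t >= post_start)       # Post_t
        cells[(group, period)].append(y)
    m = {key: mean(values) for key, values in cells.items()}
    return (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])


# Toy balanced panel (made-up numbers):
# y = province effect + period effect + 2.5 * Treated * Post
province_effect = {"A": 1.0, "B": 2.0, "Henan": 4.0}
rows = [
    (p, t, a + 0.5 * t + (2.5 if p == "Henan" and t >= 3 else 0.0))
    for p, a in province_effect.items()
    for t in range(6)
]
estimate = did_estimate(rows, treated_unit="Henan", post_start=3)
print(round(estimate, 6))  # 2.5
```

Because the province and period effects difference out, the toy estimate recovers the built-in effect of 2.5 exactly.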

However, evidence of spillovers, such as increased awareness in untreated provinces due to media exposure, raised concerns about violations of the stable unit treatment value assumption. As described by Aronow and Samii63, this scenario reflected “general interference,” where treatment in one unit affected outcomes in others. Such spillovers likely attenuated estimated effects, meaning the reported estimates should be interpreted as conservative.
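The direction of this bias can be seen in a stylized case. The notation here is an illustrative assumption rather than part of the estimated model: let \(\tau\) be the direct effect of the flood in Henan (\(H\)), let \(\delta \ge 0\) be a common spillover onto control provinces (\(C\)), and let both groups share a trend \(\Delta\). The static DiD contrast then becomes:

$${\underbrace{\left({\bar{Y}}_{H}^{{\mathrm{post}}}-{\bar{Y}}_{H}^{{\mathrm{pre}}}\right)}_{\Delta +\tau }}-{\underbrace{\left({\bar{Y}}_{C}^{{\mathrm{post}}}-{\bar{Y}}_{C}^{{\mathrm{pre}}}\right)}_{\Delta +\delta }}=\tau -\delta \le \tau$$

Under these assumptions the coefficient \(\beta\) in Eq. (2) recovers \(\tau -\delta\), so any positive spillover pulls the estimate below the direct effect.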

Topic modeling using BERTopic

To analyze how citizens articulated adaptation demands in Henan following the 2021 flood, this study employed BERTopic, a neural topic modeling technique that integrates transformer-based document embeddings, dimensionality reduction, and unsupervised clustering with a class-based term frequency–inverse document frequency (TF-IDF) method for topic representation. This approach was chosen over traditional models, such as Latent Dirichlet Allocation (LDA), due to its superior semantic coherence, ability to capture contextual meaning, and robustness when working with short-text corpora, all of which were critical for analyzing LLMB petitions that typically contain fewer than 200 characters. Unlike LDA, which treats documents as unordered collections of words (“bag-of-words”) and ignores context, BERTopic clusters contextual embeddings that are sensitive to word order and semantic relationships, enabling more accurate detection of themes in compact, citizen-generated texts.

BERTopic operated in four main steps. It first used a pre-trained transformer model (e.g., Sentence-BERT) to encode each petition into a dense semantic vector. These vectors were then reduced in dimensionality using Uniform Manifold Approximation and Projection (UMAP) and clustered using Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN), a density-based algorithm that can identify irregularly shaped clusters and outliers. Finally, the model generated interpretable topic representations by applying a class-based variant of TF-IDF, which highlighted terms that were distinctive to each cluster relative to the overall corpus. As demonstrated in Grootendorst64, BERTopic yielded more coherent and policy-relevant topics than traditional approaches, particularly in applications involving short texts.

The modeling procedure comprised five key steps. The first step was data processing. Among petitions related to floods, texts were preprocessed by removing ceremonial greetings, punctuation, and stopwords, using a customized Chinese stopword list. These steps ensured that non-substantive elements did not distort semantic representations in the embedding phase.

The second step involved generating sentence-level semantic embeddings for each petition. Three pre-trained transformer models were evaluated for this task: hfl/chinese-roberta-wwm-ext, uer/chinese_roberta_L-12, and distiluse-base-multilingual-cased-v2. Each model was tested within the same BERTopic framework and evaluated on three quantitative performance metrics: topic coherence (via normalized pointwise mutual information), topic diversity, and inter-topic distance. As shown in Table 2, the multilingual model (distiluse-base-multilingual-cased-v2) achieved the highest coherence score (0.6367), perfect topic diversity (1.0), and the largest inter-topic distance (0.3120), indicating both internal consistency and separation across topics. Accordingly, this model was selected to generate embeddings for the final analysis.
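Of the three metrics, topic diversity is the simplest to state: it is commonly computed as the share of unique words among the top-ranked keywords of all topics, so a score of 1.0 means no keyword is shared between topics. The sketch below assumes that common definition; the study's exact implementation is not specified, and the function name, the cutoff top_k, and the toy keywords are hypothetical.

```python
def topic_diversity(topics, top_k=8):
    """Share of unique words among the top_k keywords of every topic.

    topics: list of keyword lists, each ordered from most to least important.
    Returns a value in (0, 1]; 1.0 means no keyword is shared across topics.
    """
    words = [word for topic in topics for word in topic[:top_k]]
    return len(set(words)) / len(words) if words else 0.0


# Toy keyword lists (made up for illustration).
disjoint = [["drainage", "culvert"], ["basement", "pump"]]
overlapping = [["drainage", "culvert"], ["drainage", "pump"]]

print(topic_diversity(disjoint))     # 1.0
print(topic_diversity(overlapping))  # 0.75
```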

Table 2 Performance comparison of embedding models used in BERTopic

The third step involved dimension reduction and clustering. To enhance clustering efficiency and maintain the integrity of semantic relationships, the high-dimensional sentence embeddings were projected into a two-dimensional space using UMAP. UMAP preserved both local and global structure in the embedding space, enabling meaningful distance comparisons between documents. Clustering was subsequently performed using HDBSCAN, a non-parametric, density-based algorithm that automatically identified the optimal number of clusters. This was especially well-suited for small-sample, short-text corpora, such as this one (n = 36), where topic boundaries may not be uniformly distributed. Model hyperparameters were set as follows:

  • min_topic_size = 2, to retain topic granularity given the limited sample size

  • top_n_words = 8, to maximize interpretability of topic labels

  • zeroshot_min_similarity = 0.8, to avoid premature topic merging.

All resulting clusters were manually reviewed for semantic coherence, and no automatic topic consolidation was performed. This manual oversight ensured that emergent topics reflected substantively meaningful distinctions in citizen concern.

The fourth step concerned topic representation and interpretation. Topic representations were generated using class-based TF-IDF, a modified TF-IDF metric tailored to unsupervised topic models. This method identified words that were especially characteristic of a given cluster by comparing term importance within the topic to its importance across the entire corpus. For each topic, the following outputs were examined:

  • Top-ranked keywords, selected based on their weighted contribution to topic definition

  • Representative documents, identified as those closest to the centroid of each cluster in the embedding space

  • Relative topic frequency, used to assess the prominence of each theme across the dataset.

Thematic labels were assigned through a hybrid process that combined automated outputs with qualitative interpretations of exemplar texts. This ensured that topic categories were both internally valid and externally meaningful.
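The class-based TF-IDF weighting can be illustrated with a short, dependency-free sketch. It assumes the formulation in which the weight of a term in a cluster is its frequency in that cluster multiplied by log(1 + A / f), where A is the average number of tokens per cluster and f is the term's frequency across all clusters. This is not BERTopic's internal code, which may normalize counts differently; the function names and toy tokens are hypothetical.

```python
import math
from collections import Counter


def c_tf_idf(clusters):
    """clusters: {topic_id: list of tokenized documents}.

    Returns {topic_id: {term: weight}} with weight = tf * log(1 + A / f).
    """
    # Term counts per cluster: all documents in a cluster are pooled.
    tf = {c: Counter(tok for doc in docs for tok in doc)
          for c, docs in clusters.items()}
    total = Counter()
    for counts in tf.values():
        total.update(counts)
    avg_tokens = sum(total.values()) / len(tf)  # A
    return {
        c: {term: n * math.log(1 + avg_tokens / total[term])
            for term, n in counts.items()}
        for c, counts in tf.items()
    }


def top_terms(weights, n=8):
    """Highest-weighted terms; ties are broken alphabetically."""
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]


# Toy clusters (made-up tokens): "flood" appears in both, so it is down-weighted.
clusters = {
    0: [["drainage", "road", "flood"], ["drainage", "pipe"]],
    1: [["basement", "flood", "compound"], ["basement", "pump"]],
}
weights = c_tf_idf(clusters)
print(top_terms(weights[0], n=1))  # ['drainage']
```

Terms that occur in every cluster receive a smaller logarithmic factor, which is why a shared word such as "flood" ranks below words specific to one cluster.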

The final step was visualization and evaluation. To support interpretation and diagnostic evaluation, topic structure was visualized using bar charts (for prevalence; Fig. 5), hierarchical dendrograms (for inter-topic relationships; Fig. 6), and two-dimensional UMAP plots (for semantic distance; Fig. 7).

Fig. 6: Hierarchical clustering of adaptation-related topics in Henan.

Dendrogram shows hierarchical agglomerative clustering of six topics derived from BERTopic analysis, using Ward’s method and cosine distance between class-based TF-IDF vectors. Each leaf node represents one topic, with truncated labels showing the top-ranked keywords. Horizontal linkage distance indicates semantic dissimilarity; shorter branches denote greater topic similarity. TF-IDF Term Frequency-Inverse Document Frequency.

Fig. 7: Inter-topic distance map of climate adaptation requests in Henan.

Topics are visualized in a two-dimensional semantic space after dimensionality reduction via UMAP and clustering via HDBSCAN. Each circle represents one topic; circle size reflects its relative frequency in the corpus. Spatial proximity indicates semantic similarity based on document embeddings. Units on the axes are arbitrary and represent UMAP dimensions. UMAP Uniform Manifold Approximation and Projection; HDBSCAN Hierarchical Density-Based Spatial Clustering of Applications with Noise.

The topic distribution presented in Fig. 5 was generated using the visualize_barchart() function in BERTopic. The six topics were discovered through HDBSCAN clustering applied to Sentence-BERT embeddings of flood-related petitions. They were labeled Topic 0 through Topic 5 because BERTopic automatically indexes topics numerically starting at 0; the number does not imply importance or rank. Six topics emerged because HDBSCAN, under the chosen hyperparameters (e.g., min_topic_size = 2, zeroshot_min_similarity = 0.8), yielded six semantically coherent clusters from the Henan climate adaptation petition corpus. Each panel displayed the top keywords ranked by class-based TF-IDF scores, which reflected how distinctive a term was to that topic relative to the full corpus. Despite the small dataset (n = 36), distinct clusters of concern emerged. These topics were interpretable without post hoc merging, validating the model’s performance in a short-text, low-sample regime. Their internal coherence also aligned with themes identified during manual content review.

To assess the semantic similarity between discovered topics, a hierarchical agglomerative clustering algorithm was applied to the class-based TF-IDF representations of each topic. The dendrogram was generated using Ward’s method on cosine distances between each topic’s class-based TF-IDF vector. Each row corresponded to one of six discovered topics (numbered 0–5), with a truncated label displaying its top-ranked keywords. Horizontal linkage distance represented semantic dissimilarity: shorter branches indicated greater topic overlap. Two high-level groupings emerged (Fig. 6): one encompassing Topics 0, 1, and 4 (emphasizing urban infrastructure and regional challenges), and another including Topics 2 and 5 (focused on local governance and neighborhood development). This clustering supported the face validity of the BERTopic model and offered additional insight into the alignment of citizen concerns.
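The distance input to such a dendrogram can be sketched as a pairwise cosine distance matrix over topic vectors. The example below is a minimal, dependency-free illustration with made-up vectors and hypothetical function names; it covers only the distance step, and the agglomerative linkage itself would be handled by a clustering library.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity for two equal-length, non-zero numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def distance_matrix(vectors):
    """Symmetric matrix of pairwise cosine distances."""
    return [[cosine_distance(u, v) for v in vectors] for u in vectors]


# Toy topic vectors (made up): the first two point in the same direction,
# the third shares no terms with them.
topic_vectors = [
    [1.0, 2.0, 0.0],
    [2.0, 4.0, 0.0],
    [0.0, 0.0, 3.0],
]
distances = distance_matrix(topic_vectors)
```

Topics whose vectors point in nearly the same direction have a distance close to 0 and merge early in the dendrogram, while topics with no shared terms have a distance of 1.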

Figure 7 presents a two-dimensional projection that visualizes the semantic relationships among topics derived using BERTopic. Document embeddings were reduced using UMAP and then clustered with HDBSCAN. Each circle represents a single topic, with its size scaled to reflect the relative frequency of that topic in the corpus. The spatial separation between circles reflects their semantic dissimilarity: topics that are closer together share more lexical and contextual overlap, while those further apart are more distinct in meaning. The visualization confirms that topics identified in post-flood petitions are well-separated, indicating discrete categories of citizen concern.

Inclusion and ethics

This study did not involve human participants, animal subjects, or any identifiable personal data, and therefore did not require ethics approval or informed consent. All textual data analyzed was sourced from publicly accessible online petition platforms, in full accordance with the platforms’ terms of use and relevant data protection regulations. Prior to analysis, all data were anonymized and aggregated to safeguard privacy. The author confirms that no third-party copyrighted or proprietary materials were used without appropriate permission or licensing. The map was created using open-access spatial data from the rnaturalearth R package, which is distributed under a permissive license.