Introduction

Uterine corpus cancer (UCC) stands as the most prevalent gynecological malignancy in high- and middle-income countries1,2. According to the International Agency for Research on Cancer, the incidence rate of uterine corpus cancer is increasing rapidly and is estimated to increase by more than 50% worldwide by 20403. In China’s demographic statistical system, household-registered residents (HRR) refer to individuals with official household registration status issued by local public security authorities4. The delineation of this population group is determined exclusively by its household registration status, disregarding actual residential location. Permanent residents (PR) are defined as all individuals maintaining continuous residence in a designated administrative area for 6 months or longer, regardless of registration status, thus including both household-registered residents and internal migrants. The current cancer registration system, which relies predominantly on HRR-based data, exhibits significant epidemiological limitations. Specifically, it does not include migrant populations, resulting in significant selection bias in uterine corpus cancer incidence rate estimations at both national and subnational levels5,6. Permanent residents constitute the epidemiologically valid denominator, as they comprehensively capture the target population exposed to geographically stratified socioeconomic determinants and environmental risk factors. Moreover, although China’s cancer registry system can generate national incidence statistics, its limited population coverage may significantly constrain the evidence base for cancer control policy formulation. 
Consequently, establishing a PR-based small-area cancer incidence surveillance system is critical for implementing precision cancer prevention and control across three-tiered (county, provincial, and national) administrative levels, enabling granular spatial assessment of cancer burden to support data-driven public health decisions.

In 2016, China’s migrant population reached 245 million, accounting for 17.7% of the total 1.38 billion population6. Provincial-level data showed immigrant populations constituting 40.2%, 37.4%, and 33.1% of permanent residents in Shanghai, Beijing, and Tianjin, respectively. Conversely, emigrant populations represented 20.2%, 16.2%, and 13.3% of household-registered residents in Guizhou, Henan, and Guangxi6,7. This substantial migrant population suggests that HRR-based cancer registries systematically underestimate incident cases within this demographic group. Under China’s current public health service system, migrant populations face multidimensional health inequities, including service coverage gaps, excessive financial burdens, and disparities in care quality. These institutional deficiencies persistently exacerbate health disadvantages among this vulnerable group8. Accurately assessing UCC incident cases among permanent residents is critical for the precise allocation of oncology specialists, nursing resources, and hospital beds. Systematic analysis of disparities in UCC incidence rates and case numbers between permanent and household-registered residents provides not only essential evidence for optimizing regional healthcare resource distribution, but also crucial data support for developing tailored cancer prevention and control strategies9,10. Notably, the UCC incidence among permanent residents in mainland China has not been previously reported.

In recent years, Bayesian spatial statistical methods and models have been extensively applied in epidemiology. While traditional Markov chain Monte Carlo approaches maintain theoretical rigor, their computational demands make them unsuitable for routine cancer registry analysis11. The integrated nested Laplace approximation-stochastic partial differential equation (INLA-SPDE) method provides an efficient alternative, using analytical approximations rather than stochastic sampling to achieve computational efficiency while retaining Bayesian advantages. This method demonstrates particular strength in small-area cancer surveillance estimates12. Our study followed the Reporting of Studies Conducted using Observational Routinely-collected Health Data (RECORD) Statement and the Guidelines for Accurate and Transparent Health Estimates Reporting (GATHER) statements12,13.

Methods

Study design

We used Bayesian INLA-SPDE modeling to estimate the UCC incidence among permanent residents in mainland China in 2016. The study flow diagram is shown in Fig. 1.

Fig. 1 Flow diagram.

Data sources

Uterine corpus cancer

We extracted UCC incidence data for 2016 from 487 cancer registries in mainland China. Data selection followed the stringent quality evaluation criteria outlined in the annual cancer registry report, and UCC was defined by International Classification of Diseases, tenth revision, codes C54–C5514. UCC incident cases were calibrated against the female household-registered residents of the respective cancer registries. Following the national standard GB2260-2019, cities at or above the prefectural level were classified as urban areas, while counties and county-level cities were classified as rural areas. The inclusion and exclusion criteria for the data are detailed in Supplementary Materials 1.0. The 487 cancer registries encompass 669 districts and counties across 31 provinces, representing 23.5% of the total 2845 districts and counties in mainland China.

Population

We obtained household-registered and permanent resident population data for all 2,845 districts and counties in mainland China from the Population and Employment Statistics Yearbook and City Statistics Yearbook. The net migrant population was calculated as the difference between permanent and household-registered residents. Interprovincial migration proportions were derived from the 2016 China Migrants Dynamic Survey (CMDS)15, with methodological details provided in Supplementary Table S1.
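The net-migrant calculation described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the `CountyPopulation` record, the county names, and the population figures are all hypothetical. The share convention (net immigrants as a share of permanent residents, net emigrants as a share of household-registered residents) follows the way the article reports provincial percentages.

```python
from dataclasses import dataclass


@dataclass
class CountyPopulation:
    """Hypothetical record for one district or county (illustrative only)."""
    name: str
    permanent: int   # permanent residents (PR)
    registered: int  # household-registered residents (HRR)


def net_migrants(county: CountyPopulation) -> int:
    """Net migrant population: PR minus HRR (positive means net immigration)."""
    return county.permanent - county.registered


def migrant_share(county: CountyPopulation) -> float:
    """Net immigrants as a share of PR, or net emigrants as a share of HRR."""
    net = net_migrants(county)
    if net >= 0:
        return net / county.permanent
    return -net / county.registered


# Made-up populations, chosen only to show both directions of net migration.
counties = [
    CountyPopulation("county_a", permanent=520_000, registered=400_000),
    CountyPopulation("county_b", permanent=310_000, registered=390_000),
]

for county in counties:
    net = net_migrants(county)
    if net > 0:
        direction = "net immigration"
    elif net < 0:
        direction = "net emigration"
    else:
        direction = "balanced"
    print(f"{county.name}: {net:+,d} ({direction}, {migrant_share(county):.1%})")
```

A positive value marks a receiving area and a negative value a sending area, which is the distinction used later when migrant cases are attributed.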

Covariates

We extracted data on 27 covariates from local yearbooks to estimate the UCC incidence among household-registered residents. These covariates span various domains, including demography (proportion of ethnic minority population, urban population, rural population, proportion of the population aged 65 and over, proportion of women of childbearing age (15–49 years), population aged 15 years and over), economy (gross domestic product (GDP), GDP per capita, per capita disposable income of urban residents, proportions of primary, secondary and tertiary industries), education (average years of schooling, illiterate population aged 15 and over, illiteracy rate), marriage (unmarried, married, divorced, and widowed), vital statistics (average number of live births, average number of surviving children), housing (average number of rooms per household, floor area per capita), air quality (PM2.5) and meteorology (average sunshine duration, average temperature, average precipitation) (see Supplementary Table S2 online)16,17,18. Together, these covariates cover a broad set of factors that may influence UCC incidence in the studied population.

Bayesian spatial model and validation

We developed a Bayesian spatial model to estimate UCC incidence among household-registered residents across 2,845 districts and counties, leveraging data from 487 cancer registries19,20. Covariates were standardized to a mean of zero and a standard deviation of one, and a stepwise selection process identified the set for inclusion. Univariable models assessed the relationship between each covariate and UCC incidence; covariates were retained when significant (p < 0.05), with the Akaike information criterion guiding selection21. Multicollinearity was examined by identifying highly correlated covariates (correlation coefficient > 0.8 and variance inflation factor > 10.0; Supplementary Fig. S1). Spatial autocorrelation of UCC incidence was evaluated through a semi-variogram (Supplementary Fig. S3)22. Using well-defined registry coordinates, we modeled point-level incidence within a Bayesian geostatistical framework via the INLA-SPDE approach23.
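The standardization and multicollinearity screen can be illustrated as follows. This is a sketch under stated assumptions, not the study's pipeline (the article reports using R): the data are simulated, the variable names are placeholders, and the variance inflation factor is taken from the diagonal of the inverse correlation matrix, a standard identity. Only the thresholds (0.8 and 10.0) come from the text.

```python
import numpy as np


def standardize(X):
    """Scale each column to mean 0 and (sample) standard deviation 1."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)


def variance_inflation_factors(X):
    """VIF of each column: diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))


def screen_collinearity(X, names, r_max=0.8, vif_max=10.0):
    """Return covariate pairs with |r| > r_max and covariates with VIF > vif_max."""
    corr = np.corrcoef(X, rowvar=False)
    vifs = variance_inflation_factors(X)
    pairs = [
        (names[i], names[j])
        for i in range(len(names))
        for j in range(i + 1, len(names))
        if abs(corr[i, j]) > r_max
    ]
    high_vif = [name for name, v in zip(names, vifs) if v > vif_max]
    return pairs, high_vif


# Simulated covariates (assumption): two nearly collinear, one independent.
rng = np.random.default_rng(0)
gdp = rng.normal(size=200)
gdp_per_capita = gdp + 0.1 * rng.normal(size=200)
divorced = rng.normal(size=200)

names = ["gdp", "gdp_per_capita", "divorced"]
X = standardize(np.column_stack([gdp, gdp_per_capita, divorced]))
pairs, high_vif = screen_collinearity(X, names)
print("highly correlated pairs:", pairs)
print("high-VIF covariates:", high_vif)
```

In a screen like this, one covariate of each flagged pair would typically be dropped before fitting the multivariable model.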

Prior to model construction, we examined the distribution of UCC incident cases across cancer registries. The distribution was right-skewed, with the variance (\(\sigma^{2}\)) substantially exceeding the mean (\(\mu\)), confirming overdispersion (Supplementary Fig. S2). We computed the UCC incidence (\(P_{(X_{i})}\)) at each cancer registry (\(X_{i}\)) based on the number of local female household-registered residents (\(N_{i}\)). We modeled the number of UCC incident cases (\(Y_{i}\)) as following an overdispersed Poisson distribution with mean \(N_{i} P_{(X_{i})}\): \(Y_{i} \mid P_{(X_{i})} \sim \mathrm{Poisson}(N_{i} P_{(X_{i})})\). We then used a log-link function with regional population size as the offset term: \(\log(P_{(X_{i})}) = \beta_{0} + \beta_{1} m_{1} + \cdots + \beta_{m} m_{m} + f_{iid}(V_{X_{i}}) + f_{loc}(S_{i})\), where \(\beta_{0}\) is the model intercept, \(\boldsymbol{\beta} = (\beta_{1}, \ldots ,\beta_{m})\) is the vector of coefficients for the covariates \((m_{1}, \ldots ,m_{m})\), \(f_{iid}(V_{X_{i}})\) is an independent and identically distributed (\(i.i.d.\)) random effect at each monitoring point that addresses overdispersion, and \(f_{loc}(S_{i})\) is a spatial random effect. In this model, points that are spatially proximate or have similar covariate patterns are expected to exhibit similar UCC incidences. Model selection used the deviance information criterion (DIC), Watanabe–Akaike information criterion (WAIC), and marginal log-likelihood (MLL). The estimates were summarized as means with 95% Bayesian credible intervals (BCIs) at a spatial resolution of 4.4 km by 4.4 km (see Supplementary Fig. S6 online). The gridded estimates were then aggregated, as population-weighted averages, to the 2845 districts and counties. Provincial UCC incidences for household-registered residents were calculated from the number of UCC incident cases and the number of household-registered residents at the provincial level.
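To make the log-link with a population offset concrete, the sketch below turns a linear predictor into an expected case count and applies the variance-versus-mean overdispersion check. It is illustrative only and does not fit a model. The function names, the population of 500,000, and the registry counts are hypothetical. The baseline rate of 9.0/100,000 and the coefficient 0.062 are borrowed from the Results purely as plausible inputs, and the random effects are set to zero for simplicity.

```python
import math
from statistics import mean, variance


def linear_predictor(beta0, betas, covariates, iid_effect=0.0, spatial_effect=0.0):
    """log(P) = beta0 + sum(beta_k * m_k) + f_iid + f_loc."""
    return beta0 + sum(b * m for b, m in zip(betas, covariates)) + iid_effect + spatial_effect


def expected_cases(population, log_incidence):
    """With log(N) as offset, the Poisson mean is N * exp(log P)."""
    return population * math.exp(log_incidence)


def is_overdispersed(counts):
    """Crude check: sample variance larger than the mean."""
    return variance(counts) > mean(counts)


# Illustrative inputs (assumptions, not fitted values).
beta0 = math.log(9.0 / 100_000)   # baseline incidence on the log scale
betas = [0.062]                   # one standardized covariate
covariates = [1.0]                # one standard deviation above the mean

eta = linear_predictor(beta0, betas, covariates)
mu = expected_cases(500_000, eta)
print(f"expected cases: {mu:.2f}")

registry_counts = [2, 5, 9, 40, 3]  # made-up registry case counts
print("overdispersed:", is_overdispersed(registry_counts))
```

When the variance clearly exceeds the mean, as in the made-up counts here, a plain Poisson model is too restrictive, which is the motivation for the i.i.d. random effect in the model.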
Model validation employed a fivefold cross-validation approach, evaluating performance through mean error, mean absolute error, mean square error, root mean square error, and the percentage of observations covered by 95% BCIs. Additionally, we conducted a sensitivity analysis to evaluate the model’s stability (see Supplementary Table S5 online). This analysis focused on range parameters, scale parameters, and mesh grid settings, examining how variations in these parameters within broad prior distributions impact the model’s DIC, WAIC and MLL. Detailed descriptions of the modelling, estimation, and validation processes can be found in Supplementary Figure S4 and Supplementary Materials 2.0.
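The validation metrics named above can be computed from held-out observations as in the following sketch. The numbers are invented, and the sign convention for the error (predicted minus observed) is an assumption, since the article does not state one. In a fivefold scheme, these metrics would be computed on each held-out fold and then summarized.

```python
import math


def validation_metrics(observed, predicted, lower, upper):
    """Mean error, MAE, MSE, RMSE and percent of observations inside the interval."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]  # assumed sign convention
    mse = sum(e * e for e in errors) / n
    covered = sum(lo <= o <= hi for o, lo, hi in zip(observed, lower, upper))
    return {
        "ME": sum(errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "coverage_pct": 100.0 * covered / n,
    }


# Made-up held-out incidences (per 100,000) with 95% interval bounds.
observed = [8.0, 10.0, 12.0, 9.0]
predicted = [9.0, 9.0, 12.0, 11.0]
lower = [7.0, 7.5, 10.0, 9.5]
upper = [11.0, 10.5, 14.0, 12.5]

metrics = validation_metrics(observed, predicted, lower, upper)
for key, value in metrics.items():
    print(f"{key}: {value:.3f}")
```

The coverage figure is the counterpart of the percentage of observations covered by the 95% BCIs that the article reports for the final model.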

Output indicators

We obtained the UCC incidence in household-registered residents through modelling. Using the CMDS data, we quantified UCC incidence in migrant populations through a province-specific weighting approach. For immigration areas, the number of cases attributable to immigrants was calculated by applying origin-province incidence rates to population inflows weighted by migration patterns. For emigration areas, case estimates among emigrants were derived using the local incidence rates of their source provinces. The UCC incidence and incident cases among permanent residents were derived by combining the estimates for the household-registered and migrant populations. Using an indirect standardization approach, we estimated the age-standardized UCC incidence rate in permanent residents by applying age-specific incidence rates from household-registered residents to the age structure of permanent residents. The analysis incorporated adjustments for age stratification (0–1, 1–4, 5–9, 10–14, 15–19,…, ≥ 85 years) based on national census data, with additional calibration for provincial urban–rural distributions.
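One way to read the combination step is sketched below. This is a simplified interpretation, not the authors' implementation: the province labels, rates, migration shares, and populations are hypothetical, and the additive combination (household-registered cases plus immigrant cases minus emigrant cases) is an assumption about how the pieces fit together. The last function illustrates applying household-registered age-specific rates to a permanent-resident age structure.

```python
def immigrant_cases(net_inflow, origin_shares, hrr_rate_per_100k):
    """Apply each origin province's rate to its share of the inflow."""
    return sum(
        net_inflow * share * hrr_rate_per_100k[origin] / 100_000
        for origin, share in origin_shares.items()
    )


def emigrant_cases(net_outflow, local_rate_per_100k):
    """Apply the sending province's own rate to its outflow."""
    return net_outflow * local_rate_per_100k / 100_000


def permanent_resident_cases(hrr_cases, cases_in, cases_out):
    """Assumed combination: HRR cases plus immigrant cases minus emigrant cases."""
    return hrr_cases + cases_in - cases_out


def crude_rate_per_100k(cases, population):
    return cases / population * 100_000


def rate_from_age_specific(rates_per_100k, population_by_age):
    """Expected rate when HRR age-specific rates meet the PR age structure."""
    expected = sum(
        rates_per_100k[age] * population_by_age[age] / 100_000
        for age in population_by_age
    )
    return crude_rate_per_100k(expected, sum(population_by_age.values()))


# Hypothetical receiving province "C" drawing migrants from "A" and "B".
hrr_rates = {"A": 10.0, "B": 8.0}
cases_in = immigrant_cases(1_000_000, {"A": 0.6, "B": 0.4}, hrr_rates)
pr_cases = permanent_resident_cases(hrr_cases=500, cases_in=cases_in, cases_out=0)
pr_rate = crude_rate_per_100k(pr_cases, 6_000_000)
print(f"immigrant cases: {cases_in:.0f}, PR cases: {pr_cases:.0f}, PR rate: {pr_rate:.2f}")

# Hypothetical sending province "A" losing 600,000 residents.
print(f"emigrant cases: {emigrant_cases(600_000, hrr_rates['A']):.0f}")

# Two made-up age bands with equal populations.
age_rate = rate_from_age_specific(
    {"40-49": 10.0, "50-59": 30.0}, {"40-49": 100_000, "50-59": 100_000}
)
print(f"rate from age-specific rates: {age_rate:.1f}")
```

Because immigrants carry the rates of their origin provinces, a receiving province with low-rate origins sees its permanent-resident rate fall below its household-registered rate, as in this toy example.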

Statistical analysis

We used Epi Info for data management, R 4.3.2 for covariate processing, model fitting and validation, SPSS 17.0 for common statistical analysis, and ArcGIS 9.0 for geographical mapping.

Results

Cancer registries

In 2016, data from 487 cancer registries in mainland China met the stringent quality control criteria. These registries cover 669 districts and counties, representing 28.6% of the total female household-registered population. Among them, 211 registries were classified as urban, covering 45.3% of the national urban female population, and 276 were rural, covering 20.1% of the rural female population. Beijing and Tianjin achieved complete population coverage, whereas Shanxi, Xizang, and Xinjiang reported the lowest coverage rates, at 7.5%, 7.8%, and 8.1%, respectively. Additional details are shown in Fig. 2A, B.

Fig. 2 Description of cancer registries, migrant population, UCC incidence and number of incident cases. Cancer registries (Panel A), population of cancer registries (Panel B), the migrant population (Panel C), the estimated UCC incidence (Panel D), incident cases (Panel E), and age-standardized UCC incidence (Panel F) of the permanent residents in mainland China in 2016.

Female migrant population

In 2016, the cumulative female migrant population across the 31 provinces of mainland China reached 67,509,881, constituting 9.9% of the overall female household-registered residents. The total net emigrant population in China was 40,675,472. The three provinces with the largest net emigration were Henan (8,987,420 emigrants, representing 15.9% of provincial household-registered residents), Guizhou (4,464,844; 20.5%), and Anhui (3,888,113; 11.3%). Conversely, the net immigrant population totaled 26,834,409, with the highest concentrations in Guangdong (6,806,607 immigrants, accounting for 13.2% of permanent residents), Shanghai (4,622,773; 39.6%), Zhejiang (4,335,011; 15.2%), and Beijing (3,946,737; 37.0%). The urban immigrant population reached 31,635,770, predominantly located in Guangdong (10,088,054), Zhejiang (4,024,817), and Beijing (2,017,174). In contrast, rural areas exhibited a net emigration pattern, with total rural migrants numbering 61,330,622. The net rural emigrants (52,768,691) originated primarily from Henan (9,523,361), Sichuan (4,393,875), and Guizhou (4,312,313), while the rural immigrant population totaled 8,561,931, over 48% of whom were clustered in Shanghai (4,185,660). Provincial-level distributions are detailed in Table 1 and Fig. 2C.

Table 1 The differences between the household-registered residents (HRR) and the permanent residents (PR).

Model fitting and validation

Taking into account the research objectives and predictive power, the final geostatistical model included four covariates: GDP, divorced population, illiterate population aged 15 years and above, and rural population (see Supplementary Fig. S4 online). Among them, the estimated coefficient of the divorced population for UCC incidence in the household-registered population was 0.062 (95% BCI: 0.006–0.118). The standard deviation of the spatial random effect was 0.890 (95% BCI: 0.446–1.885), indicating a spatial effect at the overall level. The DIC and WAIC of the final model were 3353.37 and 3307.93, respectively (see Supplementary Tables S3 and S4 online). In model validation, 93.6% of observations fell within the 95% BCIs.

The estimated UCC incidence among the permanent residents

In 2016, the estimated UCC incidence was 9.0/100,000 females for both household-registered and permanent residents in mainland China. Among the 31 provinces, the largest differences in estimated UCC incidence between permanent and household-registered residents were observed in Shanghai (− 0.4/100,000), Guangdong (− 0.2/100,000), Beijing (− 0.2/100,000), and Qinghai (0.2/100,000). Within urban areas, the largest difference was observed in Guangdong (− 0.7/100,000), followed by Beijing (− 0.3/100,000). In rural regions, the largest differences were noted in Shanghai (− 0.5/100,000) and Xizang (0.4/100,000). Details are presented in Table 2, Fig. 2D, and Supplementary Table S6.

Table 2 The estimated differences in the incidence of UCC between the HRR and the PR (per 100,000).

The estimated UCC incident cases among the permanent residents

We estimated 61,510 UCC incident cases among household-registered residents and 60,275 among permanent residents in mainland China in 2016, with a cumulative absolute difference of 6,176 cases across all 31 provinces. The most substantial provincial differences occurred in Henan (− 899 cases; 15.7% of provincial total), Guizhou (− 442; 20.3%), Guangdong (630; 13.7%), and Shanghai (409; 60.0%). Urban areas showed the highest immigrant-associated cases in Guangdong (915; 37.9% of urban total), Zhejiang (356; 43.5%), and Beijing (185; 43.2%), while rural emigrant-related cases peaked in Henan (− 953; 21.3% of rural total), Guizhou (− 417; 25.5%), and Anhui (− 402; 17.9%). Detailed results are presented in Table 3 and Fig. 2E.
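The national totals differ by far less than the cumulative absolute difference because signed provincial differences cancel when summed. The short calculation below illustrates this using only figures quoted in this paragraph; the four provinces are a subset, so their sums are not national totals.

```python
# Signed differences (PR minus HRR) for the four provinces named in the text.
provincial_diff = {"Henan": -899, "Guizhou": -442, "Guangdong": 630, "Shanghai": 409}

net_subset = sum(provincial_diff.values())
absolute_subset = sum(abs(d) for d in provincial_diff.values())
print(f"four-province net difference: {net_subset:+d}")
print(f"four-province absolute difference: {absolute_subset}")

# National totals reported in the text.
hrr_cases = 61_510
pr_cases = 60_275
cumulative_absolute = 6_176

national_net = pr_cases - hrr_cases
share_of_hrr = cumulative_absolute / hrr_cases
print(f"national net difference: {national_net:+d}")
print(f"absolute difference as share of HRR cases: {share_of_hrr:.1%}")
```

The net figure describes how the national case count shifts, while the absolute figure describes how many cases are relocated between provinces, which is the quantity relevant to regional resource allocation.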

Table 3 The estimated differences in the UCC incident cases between the HRR and the PR.

The estimated age-standardized UCC incidence in the permanent residents

The overall age-standardized UCC incidence among the permanent residents in mainland China was 9.3/100,000. Among the 31 provinces, the highest age-standardized incidence rates were 13.5/100,000 in Xizang, 12.7/100,000 in Guangdong, and 12.0/100,000 in Guizhou, while the lowest rates were 7.4/100,000 in Chongqing, 7.5/100,000 in Heilongjiang, and 7.6/100,000 in Hubei. In urban areas, the highest age-standardized incidence rates were 13.6/100,000 in Guangdong, 13.3/100,000 in Guizhou, and 11.6/100,000 in Henan. In rural areas, the highest rates were 15.0/100,000 in Xizang, 12.8/100,000 in Xinjiang, and 12.3/100,000 in Guizhou. Details are provided in Table 2, Fig. 2F, and Supplementary Table S6.

Discussion

In 2016, the 487 cancer registries covered 669 districts and counties, representing 23.5% of all 2845 districts and counties and 28.6% of the household-registered population in mainland China, providing nationally representative data.

China’s cancer registration system has undergone significant refinement during the last twenty years. Nevertheless, the present HRR-based system fails to incorporate migrant populations, rendering it progressively inadequate for tackling the public health challenges associated with large-scale migration24,25. The HRR-based allocation of social entitlements in China systematically restricts migrants’ access to healthcare and welfare services in host communities26. Moreover, migrants are mostly engaged in high-risk, labor-intensive occupations, are chronically exposed to hazardous environments, and have low education levels, which increases the risk of uterine corpus cancer27. With China representing 17.7% of the global population, estimating UCC incidence among permanent residents contributes to the global understanding of this cancer28,29.

Our estimates relied on geo-referenced UCC incidence data from cancer registries, analyzed using Bayesian modeling to account for high-resolution spatial heterogeneity in disease exposure and population density when estimating country-level incidence30. The Bayesian INLA-SPDE approach combines the strengths of INLA and SPDE methods, providing fast, accurate estimates and efficient handling of large spatial datasets31,32. Globally, estimates of cancer incidence and mortality are generally conducted at the national level33. However, the INLA-SPDE method provides accurate estimates at smaller scales (districts or counties)34. This approach has been widely adopted in epidemiology35 and demonstrated excellent performance in our model, with 93.6% of estimates falling within the 95% BCIs. The sensitivity analysis yielded results consistent with the final outcomes, indicating robust model stability and reliable study findings.

The female inter-provincial migrant population in 2016 reached 67,509,881, ranking 30th globally, and is too large to ignore when reporting UCC incidence36. The migrant population proportion is expected to remain high, making it essential to estimate UCC incidence among the permanent residents in mainland China. The estimated UCC incidence was 9.0/100,000 for both female permanent and household-registered residents. However, provincial-level differences in incidence between these groups varied, ranging from 0.2/100,000 in Qinghai to − 0.4/100,000 in Shanghai. Despite the equal national incidence among permanent and household-registered residents, the complexity of population mobility means that migrants are exposed to risk and protective factors inconsistently. The marginal differences in overall UCC incidence between permanent and household-registered residents should therefore not be dismissed as inconsequential. Rather, estimating the UCC incidence among permanent residents is crucial, and efforts to refine these estimates should be intensified.

The disparity in UCC incidence between the permanent and household-registered residents is influenced by the interprovincial distribution and the age composition of the migrant population. In China, UCC age-specific incidence rates generally rise with age, peaking in the 50–59-year age group37. In provinces with net emigration, such as Hebei and Guizhou, UCC incidence among permanent residents exceeds that of household-registered residents. This disparity primarily stems from the outmigration of the working-age population aged 16–55 years, combined with a rapidly growing elderly permanent resident population38. In Liaoning, another net emigration province, the lower incidence among permanent residents reflects distinctive population mobility patterns, characterized by emigration of people aged 30–50 years and immigration of young adults. In Jiangsu, Qinghai, and Xizang, which experience net immigration, higher incidence among permanent residents coincides with migrant populations drawn primarily from neighboring provinces, together with increased family-oriented migration and a shift toward adult-dominant age structures39. The lower incidence among permanent residents in Beijing, Guangdong, and Shanghai, regions with substantial population inflow, is concentrated in rapidly developing economic areas that attract large labor migrant populations with younger age structures. This demographic shift toward younger permanent residents consequently reduces overall incidence rates.

The total estimated UCC incident cases among migrant populations reached 6176, with 2470 cases in immigrants and 3706 cases in emigrants, accounting for 10% of all estimated UCC incident cases. Regarding regional distribution, Henan province showed the highest disparity with 899 cases among emigrants, followed by Guizhou with 442 cases. For immigrants, Guangdong recorded 630 cases and Shanghai 409 cases. From an urban–rural perspective, UCC incident cases among migrant populations are higher in rural than urban areas. Specifically, in rural areas, there were 5502 incident cases among the migrant population, while in urban areas, there were 2954 incident cases among the migrant population. The estimated UCC incident case distribution correlates proportionally with provincial migrant population sizes. Based on this demographic correlation, differentiated healthcare resource allocation strategies should be implemented, with increased resource investment in immigrant areas like Beijing and Tianjin, while appropriately reducing allocations in emigrant areas. These findings provide critical evidence for precision healthcare planning, particularly regarding the required density of gynecological oncologists and other specialized resources40. The case estimates among permanent residents offer an epidemiologically robust basis for such resource optimization, ensuring alignment between service provision and actual population needs across both receiving and sending regions. The highest age-standardized UCC incidence rates were recorded in Xizang (13.5/100,000), Guangdong (12.7/100,000), and Guizhou (12.0/100,000). Urban residents consistently demonstrated higher standardized incidence rates compared to rural residents. These findings are consistent with epidemiological studies linking elevated UCC incidence and mortality to higher regional GDP.

These findings, encompassing incidences and new cases, play a crucial role in developing UCC control strategies in mainland China. This nationwide county-level analysis pioneers the incorporation of migrant populations into uterine corpus cancer incidence estimation, explicitly assessing their contribution to epidemiological burden and generating policy-relevant baselines for Healthy China 2030 monitoring. This research holds particular significance for nations that rely on household-registered residents for cancer registration, particularly those with substantial migrant populations42. Moreover, for cancers with higher incidence and mortality rates, a greater number of new cases and deaths would be expected within the migrant population.

The limitations of the study include the inability to provide estimated age-specific UCC incidence and directly standardized UCC incidence, owing to the unavailability of detailed age-specific UCC incidence across regions. Estimates based on full age-specific data may reveal larger disparities, which is an area for future research. The 2016 uterine corpus cancer incidence estimates in this study provide crucial metrics for cancer control planning, though their model-based derivation and lack of external validation necessitate cautious interpretation.

In conclusion, the direction and magnitude of differences in incident cases correlate with provincial migrant population dynamics. The estimated UCC incidence rates and case counts among female permanent residents across all 2845 districts and counties provide critical evidence for developing UCC control strategies at provincial and national levels in mainland China. This research is particularly relevant for countries or regions that use household-registered residents for cancer registration and have significant migrant populations. The observed disparities between permanent and household-registered residents’ uterine corpus cancer incidence offer evidence for optimizing prevention strategies and precision healthcare resource allocation.