Predictive quality of census-based socio-economic indicators on Covid-19 infection risk at a fine spatial scale in France

Romain-Scelle, Nicolas; Riche, Benjamin; Benet, Thomas; Rabilloud, Muriel

doi:10.1038/s41598-025-03768-0


Article
Open access
Published: 01 July 2025


Nicolas Romain-Scelle (ORCID: orcid.org/0000-0002-9036-411X)^1,2,3, Benjamin Riche^3, Thomas Benet^4 & Muriel Rabilloud^1,2,3

Scientific Reports volume 15, Article number: 22076 (2025)


Abstract

The COVID-19 pandemic in France prompted the development of a national database of confirmed infection cases with high spatiotemporal resolution. We aimed to estimate the ability of census-based indicators to predict infection risk, in order to assess their potential usefulness in future pandemic response. We collected and aggregated all counts of biologically confirmed cases of SARS-CoV-2 infection in the Auvergne-Rhône-Alpes region of France at the level of small-area statistical units between May 2020 and February 2021 (second wave). Ten census-based ecological covariates were evaluated as predictors of case incidence using a Poisson regression with conditional autoregressive (CAR) spatial effects. The benefits of CAR effects and covariates for model predictive ability were assessed by comparing the posterior predictive distribution of case incidence with the observed value for each statistical unit. Among 7,917,997 inhabitants, 438,992 infection cases over 5,410 neighbourhoods were analysed. Spatial correlation was high for the periods before and after the epidemic peak, and is illustrated with maps. Adding covariates to the null model increased the proportion of satisfying predictions by 5 percentage points at best (from 14% to 19%), with a maximum of 21% across all periods. The ecological covariates assessed were insufficient to provide a satisfying prediction of infection risk without explicitly accounting for the spatial organization of the epidemic.


Introduction

The onset of the COVID-19 pandemic in 2020 prompted a vast surveillance effort to track both the individual and population-level impact of SARS-CoV-2 circulation and to guide public policy toward individual and collective interventions¹. In France, a near real-time passive surveillance system for infections was created, collating data on every infection diagnostic test in a single information system (SI-DEP)². This extensive data collection included, for each tested subject, their age, sex and day of testing and, for the first time in France, standardized postal addresses allowing for the systematic geocoding of cases. The spatial unit at which data were made available is the Îlot de Regroupement pour l’Information Statistique (IRIS). It is a statistical unit that subdivides administrative towns into small units (~ 1500 residents on average) that are homogeneous from an urban planning standpoint (commonality of housing, economic activities…). The IRIS is also the unit at which most census-based and derived indicators are available. Hence, for the purpose of epidemic management, public health professionals were keenly interested in using the socio-economic composition of neighbourhoods as a predictor of the risk of infection at a small geographical scale. Although the published literature is rich with ecological analyses of case incidence explained by socio-economic indicators^{3,4,5,6,7,8,9}, analyses conducted at such a fine geographical scale are scarcely reported, and when they are, an adequate measurement of the predictive quality of the models is rarely given. Therefore, we aimed to assess the ability of local socio-demographic and economic indicators measured at a population level to explain the observed SARS-CoV-2 infection incidence.

Materials and methods

We conducted a retrospective, ecological analysis of the count of incident cases of SARS-CoV-2 infection within the French region of Auvergne-Rhône-Alpes, to assess the association between infection rate and socio-economic indicators from census data. The study region accounted for 12% of the total metropolitan French population, with 8.1 million inhabitants in 2022. The study period ran from May 13th, 2020 to February 14th, 2021, covering most of the second wave in France. It was divided into four epidemic periods matching the dynamics of the second wave: a low incidence period up to July 26th, 2020; the second wave growth phase up to October 25th; the peak and decrease period, marked by local curfews followed by a national confinement, up to December 13th; and the stabilization phase up to February 14th, 2021. No confinement or curfew was in effect outside the third period, which starts at the epidemic peak of the second wave.

The infection case data (a case being defined as a positive PCR or antigen test conducted by a health care professional) were extracted from the SI-DEP information system and aggregated over each epidemic period and spatial unit. No reliable data on the number of individuals tested were available, owing to technical limitations and a changing definition of that count over the study period, so only positive tests were analysed. Case counts between 1 and 5 were censored by the provider to prevent reidentification of subjects. The censored counts represented 2% of all cases and were imputed following a procedure detailed in the Supplementary Material. No age- and sex-stratified data were considered, because stratification would have led to a substantial loss of information due to statistical censorship.

The socio-economic indicators evaluated as predictors of the distribution of infection cases were the proportion of immigrants, the proportion of unemployed individuals, the proportion of single-person homes, the proportion of households without a child, the proportion of households owning a car, the proportion of individuals without a high school diploma, and the proportion of overcrowded homes. Two other indicators were included to describe the volume and nature of population mobility, i.e. the proportion of individuals working outside their town of residence, and the proportion of individuals travelling to work by car. All the above-mentioned indicators were obtained from the 2017 population census data from the French National Institute of Statistics and Economic Studies (INSEE, see the Data availability statement).

To quantify the association between the socio-economic indicators and case incidence, four parametric models were defined for each period. The null model (M1) was a generalized linear model with the counts of cases following a Poisson distribution, estimating the expected incidence rate for one person-week per IRIS. The offset is the log of the product of the IRIS population and the duration of the period (in weeks). The linear predictor for all four models is the following, with $\mu$ the intercept, ${\varvec{X}}{\varvec{\beta}}$ the fixed effects of covariates and ${\varvec{\phi}}$ a conditional autoregressive (CAR) random effect:

$${\mathcal{M}}_{1}:offset+\mu$$

$${\mathcal{M}}_{2}:offset+\mu +{\varvec{X}}{\varvec{\beta}}$$

$${\mathcal{M}}_{3}:offset+\mu +{\varvec{\phi}}$$

$${\mathcal{M}}_{4}:offset+\mu +{\varvec{X}}{\varvec{\beta}}+{\varvec{\phi}}$$
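As a minimal sketch of how the offset enters these linear predictors, the expected count for an IRIS is the exponential of the linear predictor, so the offset scales the incidence rate by the number of person-weeks. The function name and all numeric values below are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import numpy as np

def expected_cases(population, weeks, mu, xb=0.0, phi=0.0):
    """Expected count under log(E[Y]) = offset + mu + X*beta + phi.

    offset = log(population * weeks), i.e. the log of person-weeks at risk.
    Setting xb and phi to zero gives the null model M1.
    """
    offset = np.log(np.asarray(population, dtype=float) * weeks)
    return np.exp(offset + mu + xb + phi)

# Hypothetical inputs (illustration only)
population = np.array([1500, 2000, 900])  # residents per IRIS
weeks = 10.0                              # duration of the period
mu = np.log(0.002)                        # 2 cases per 1,000 person-weeks

m1_counts = expected_cases(population, weeks, mu)
print(m1_counts)
```

Under M1 every unit shares the same rate, so the expected counts differ only through the person-weeks; covariate effects and CAR random effects then shift the log-rate unit by unit.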

CAR random effects allow for the decomposition of the between IRIS residual variance in two components: one spatially autocorrelated, with the correlation structure defined by a neighbourhood matrix and one uncorrelated. The CAR random effect proposed by Leroux and MacNab^10,11 was used, providing two parameters to estimate with a direct interpretation: ${\tau }^{2}$ the total between-unit variance and $\rho$ the spatial autocorrelation coefficient controlling how ${\tau }^{2}$ is distributed between correlated and uncorrelated effects (0 giving fully uncorrelated effects, and 1 fully correlated effects). The neighbourhood matrix between spatial units was defined on a travel-time-based metric. All IRIS within 60 min of each other were considered neighbours, with a weight defined as the inverse of the computed travel time separating each other. The matrix was standardized to ensure a mean sum of rows equal to one, rendering the ${\tau }^{2}$ parameter directly interpretable on the scale of the predicted incidence rates of infection cases.
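The construction described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: travel times are taken as symmetric, "within 60 min" is read as less than or equal to 60, the standardization is implemented as a single global rescaling so that the mean row sum equals one, and the precision matrix uses the usual Leroux form $[\rho(D - W) + (1-\rho)I]/\tau^{2}$. The function names and the toy travel-time matrix are hypothetical.

```python
import numpy as np

def neighbourhood_weights(travel_min, cutoff=60.0):
    """Inverse travel-time weights for pairs of units within `cutoff` minutes."""
    t = np.asarray(travel_min, dtype=float)
    w = np.zeros_like(t)
    mask = (t > 0) & (t <= cutoff)  # excludes the diagonal and distant pairs
    w[mask] = 1.0 / t[mask]
    # Assumed standardization: global rescaling to a mean row sum of one
    return w / w.sum(axis=1).mean()

def leroux_precision(w, rho, tau2):
    """Leroux CAR precision matrix: [rho * (D - W) + (1 - rho) * I] / tau2."""
    d = np.diag(w.sum(axis=1))
    identity = np.eye(w.shape[0])
    return (rho * (d - w) + (1.0 - rho) * identity) / tau2

# Toy symmetric travel times in minutes between four units (hypothetical)
travel = np.array([[0, 20, 50, 90],
                   [20, 0, 30, 70],
                   [50, 30, 0, 40],
                   [90, 70, 40, 0]])

W = neighbourhood_weights(travel)
Q_indep = leroux_precision(W, rho=0.0, tau2=0.5)  # no spatial correlation
Q_corr = leroux_precision(W, rho=0.9, tau2=0.5)   # strong spatial correlation
```

With $\rho = 0$ the precision matrix reduces to $I/\tau^{2}$, i.e. independent random effects of variance $\tau^{2}$; as $\rho$ approaches 1 the off-diagonal terms carried by $W$ dominate and neighbouring units are pulled toward each other.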

To assess the functional forms of the covariates, we used a penalized spline regression on the model M2 (covariates only) in each period, with a response following a negative binomial distribution. The estimation procedure in penalized spline regression¹² provides the effective degrees of freedom (edf) for each covariate. This estimation and the graphical analysis of the corresponding functional forms were used to guide the choice of the degree of freedom used in the final models, implemented using orthogonal polynomials. The definitive degrees of freedom were selected empirically to maintain the most notable functional forms while not compromising the convergence of the Bayesian sampling procedure.

The $\beta$ parameters quantifying the association between the incidence rates of cases and the evaluated covariates were given a normal prior distribution centred on zero with variance 1000, and ${\tau }^{2}$ was given an Inverse-Gamma prior with shape 0.1 and scale 1. The prior for $\rho$ is set internally in the package as a uniform distribution on [0,1]. For each model fit, 3 parallel Markov chain Monte Carlo (MCMC) chains were run to produce 6,000 samples from the posterior distribution of each parameter. Convergence was assessed using Gelman plots and the potential scale reduction factor¹³. For the random effects’ parameters (${\tau }^{2}, \rho$), the median of the posterior distribution was used as the point estimate, and the credibility interval was built using the 2.5th and 97.5th centiles of the posterior distribution. The random variance ${\tau }^{2}$ is presented as a standard deviation $\tau$, additive on the log-incidence scale, and as $\text{exp}\left(2\tau \right)$, multiplicative on the incidence scale. For the fixed effects of covariates, the estimated incidence rate ratio (IRR) for a covariate $p$ was defined as the ratio of the predicted incidence rate at a given value of the covariate to the predicted incidence rate at the mean value of the covariate, and plotted. The predicted IRR for each IRIS and period, using the overall mean predicted incidence rate as the denominator, was mapped.
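The posterior summaries described above (median as estimate, 2.5th and 97.5th centiles as interval bounds, IRR relative to the rate at the covariate mean) can be sketched as follows. The draws are simulated from an arbitrary Beta distribution purely for illustration; the function names are hypothetical and nothing here reproduces the study's output.

```python
import numpy as np

def posterior_summary(samples):
    """Median and 95% credibility interval from pooled posterior draws."""
    draws = np.asarray(samples, dtype=float)
    lower, median, upper = np.percentile(draws, [2.5, 50.0, 97.5])
    return {"median": median, "ci95": (lower, upper)}

def irr_curve(rate_at_values, rate_at_mean):
    """IRR: predicted rate at each covariate value over the rate at the mean."""
    return np.asarray(rate_at_values, dtype=float) / rate_at_mean

# Simulated stand-in for 6,000 posterior draws of rho (illustration only)
rng = np.random.default_rng(0)
rho_draws = rng.beta(8, 2, size=6000)
summary = posterior_summary(rho_draws)
print(summary)
```

In practice the draws from the parallel chains would be pooled after convergence has been checked, and the same summary would be applied to ${\tau}^{2}$ and to each point of the IRR curves.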

One major hypothesis in Poisson process modelling is that the probability of the next event is independent of the number of events that occurred previously. When modelling counts of cases aggregated over time, as is the case in our study, this implies that the number of expected cases in a given IRIS is strictly proportional to the number of subjects at risk in this IRIS. For an infectious disease this does not hold, as the probability of contamination within an IRIS is a function of the number of contacts between residents and of the number of infectious residents, which may themselves vary with the total population of the IRIS. To relax this hypothesis, models M2 and M4 include as a predictor the number of person-weeks, centred on the mean person-week count per IRIS in the study region. The resulting IRR estimates the effect of the population size of a given IRIS upon the infection risk of an individual living in this IRIS.

Each model was evaluated by computing the Moran index of the residuals to assess whether the model had successfully accounted for the spatial autocorrelation, with 1 indicating complete positive autocorrelation and 0 no autocorrelation. We used the deviance explained to measure and compare, for each period, the goodness-of-fit of the models. The deviance explained by models M2, M3, and M4 relative to the null model (M1) was used to assess the improvement in goodness-of-fit, with 1 indicating a perfect fit and 0 a fit no better than that of the null model. Predictive quality was evaluated using the quantile of the observed count of cases in the posterior predictive distribution of the counts for each unit. Each prediction was classified into one of three categories: poor if the observed value fell under the 5th or above the 95th percentile, mediocre if it fell outside the interquartile range but between the 5th and 95th percentiles, and satisfying if it fell within the interquartile range.
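The three-way classification of predictions can be sketched as below, assuming the posterior predictive distribution is available as an array of simulated counts with one column per unit. The function name is hypothetical, and the treatment of values falling exactly on a percentile boundary is an assumption made for this example.

```python
import numpy as np

def prediction_quality(observed, predictive_draws):
    """Classify each unit by the position of its observed count in the
    posterior predictive draws (array of shape n_draws x n_units)."""
    draws = np.asarray(predictive_draws, dtype=float)
    obs = np.asarray(observed, dtype=float)
    q05, q25, q75, q95 = np.percentile(draws, [5, 25, 75, 95], axis=0)
    labels = np.full(obs.shape, "mediocre", dtype=object)
    labels[(obs >= q25) & (obs <= q75)] = "satisfying"  # inside the IQR
    labels[(obs < q05) | (obs > q95)] = "poor"          # in the 5% tails
    return labels

# Toy predictive draws: every unit has simulated counts 0, 1, ..., 100
draws = np.tile(np.arange(101)[:, None], (1, 3))
observed = np.array([50, 10, 99])

labels = prediction_quality(observed, draws)
print(list(labels))
```

The share of units labelled "satisfying", "mediocre" and "poor" then gives a distribution of prediction quality per model and period.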

The detailed model specifications, the weight matrix computation, the results of the edf determination (Figure S1) and the evaluation of the MCMC sampling and of the priors’ influence (Figure S2) are reported in the Supplementary Material. All analyses were conducted using R (version 4.4.0). Travel times were computed using the Conveyal R5 routing engine¹⁴. Geographic data were extracted from the OpenStreetMap dataset. The model parameters were estimated using the CARBayes package¹⁵, and the functional form exploration was conducted using the mgcv package.

The dataset of infection cases used in this study is composed of aggregated counts, with censorship of the statistical units for which there were between 1 and 5 observations. Therefore, we deemed the data anonymized¹⁶, with no further Institutional Review Board approval necessary under French law. The infection data collection in France was mandatory for the study period, and a default authorization for the reuse of data for research purposes was included in the relevant law and subsequent decree².

Results

The study region is composed of 5,410 IRIS, of which 5,069 had all variables available, and 7,917,997 individuals at risk, for an average of 1,562 individuals per available IRIS. The 341 IRIS with missing covariate data were in highly rural and mountainous areas with very low population counts. The total number of cases observed during the study period was 562,376, of which 438,992 cases (78%) could be attributed to an IRIS and used in this study. The median proportion of single-person homes was 31% (IQR 25–39), and the median proportion of families without a child was 50% (IQR 45–57) (Table 1). Regarding transportation, the median proportion was 59% (IQR 37–72) for working outside the town of residence, 83% (IQR 73–89) for travelling to work by car, and 93% (IQR 87–96) for car ownership. The median proportion of overcrowded homes was 1.6% (IQR 0–3.5).

Table 1 Description of the demographic and socio-economic characteristics of the IRIS (infra-municipal spatial units) of the French region Auvergne-Rhône-Alpes.


The effects of the socio-economic indicators estimated by the full model (M4, Fig. 1) varied across periods, with several noticeable trends. The proportion of immigrants and the proportion of households without a child had similar effects across periods, in shape and magnitude. The incidence rate was positively associated with the proportion of immigrants in a near-linear fashion, negatively associated with the proportion of households without a child only for values above 50%, and negatively associated with the proportion of households owning a car only during the Growth period. The effect of the IRIS population on the incidence rate had a consistent shape across periods, with a sharp reduction in incidence rate in IRIS with a low population count compared to the regional mean, and an increase in incidence rate for populations between 1 and 4 times the mean value. Finally, regarding the Low Incidence period, although no significant association was found (credibility intervals are not displayed, to improve readability), the incidence rate seems to increase with the proportion of overcrowded homes.

Table 2 presents the estimations by the full model (M4) for the parameters of the CAR random effects and the mean predicted incidence rate over the study. The spatial correlation coefficient is found to be moderate during the Low Incidence and Stabilization periods (0.46 and 0.59 respectively), indicating a similar share of spatially correlated and uncorrelated variance across IRIS. During the Growth and Peak and decrease periods, a strong spatial autocorrelation was found (0.78 and 0.88), indicating a strong spatial structure of the incidence rate across the study region. The between-IRIS variance in incidence rates, presented as standard deviation, is found to be 1.16 for the Growth period and between 0.51 and 0.58 for the last three periods, implying a dispersion of the IRIS’ predicted incidence rates by a factor of 10.2 and by a factor of about 3 respectively. Figure 2 provides a map by period of the predicted IRR by the full model. The high degree of spatial autocorrelation for periods 2 and 3 is strongly visible on panels B and C, with a grouping of IRIS with elevated IRR along urban areas as illustrated by the railway network on the map. Insets for the named metropolitan areas are presented in Supplementary Material (Figures S3 to S6).
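The conversion from the standard deviation $\tau$ on the log-incidence scale to the multiplicative factor $\text{exp}\left(2\tau \right)$ on the incidence scale can be checked directly from the values quoted in this paragraph. The helper name is hypothetical; the only inputs are the three standard deviations given in the text.

```python
import math

def dispersion_factor(tau):
    """Multiplicative spread exp(2 * tau) on the incidence scale."""
    return math.exp(2.0 * tau)

# Standard deviations quoted in the text
for tau in (1.16, 0.51, 0.58):
    print(f"tau = {tau:.2f} -> exp(2 * tau) = {dispersion_factor(tau):.1f}")
```

A standard deviation of 1.16 gives a factor of about 10.2, while 0.51 and 0.58 give factors of about 2.8 and 3.2, which matches the "factor of about 3" reported for the remaining periods.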

Table 2 Mean predicted incidence rate per period for the entire study area (for 1,000 person-week), and CAR random effects parameters estimated by the full model (M4) with their 95% credibility interval.


The model residuals for the Low incidence and Stabilization periods exhibited minimal spatial autocorrelation, even for the null model M1 (0.028 and 0.072 respectively, Table 3). For the Growth and Peak and decrease periods, both the inclusion of a spatial effect and the inclusion of covariates mitigated the residual autocorrelation, indicating that the covariates are themselves spatially structured. The fit for the Growth period was also noticeably improved by the introduction of covariates (M2), with the deviance explained reaching 0.15. For the Peak and decrease and Stabilization periods, the fit improved only with the addition of a spatial effect to the model, with no relevant difference between M3 and M4. During the Low incidence period, the deviance explained showed a marginal improvement between M3 and M4, with a small benefit from the addition of covariates, but the proportion of IRIS with a poor prediction remained above 80% in all cases. Regarding the three other periods, only the models with a CAR random effect provided a high proportion of satisfying predictions (above 81% satisfying, under 8% poor), whereas models M1 and M2 produced poor predictions for 43–58% of the IRIS, with a maximum of 21% of satisfying predictions. For all periods, the null model and the model with covariates only were very close in predictive quality. The best improvement of M2 over M1 was an increase from 14 to 19% of satisfying predictions during the Growth period.
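For readers unfamiliar with the two fit metrics used here, the following sketch gives their textbook definitions: Moran's I for a vector of residuals and a weight matrix, and the deviance explained as one minus the ratio of the Poisson deviance of a model to that of the null model. This is an illustration on toy data, not the authors' implementation; the type of residual they used is not specified in the text, and all names are hypothetical.

```python
import numpy as np

def morans_i(values, w):
    """Moran's I: (n / sum(W)) * (z' W z) / (z' z), with z the centred values."""
    z = np.asarray(values, dtype=float)
    z = z - z.mean()
    w = np.asarray(w, dtype=float)
    return (len(z) / w.sum()) * (z @ w @ z) / (z @ z)

def poisson_deviance(y, mu):
    """Poisson deviance 2 * sum(y * log(y / mu) - (y - mu)), with 0 * log(0) = 0."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    safe_y = np.where(y > 0, y, 1.0)  # avoids log(0) for zero counts
    log_term = np.where(y > 0, y * np.log(safe_y / mu), 0.0)
    return 2.0 * np.sum(log_term - (y - mu))

def deviance_explained(y, mu_model, mu_null):
    """1 for a perfect fit, 0 for a fit no better than the null model."""
    return 1.0 - poisson_deviance(y, mu_model) / poisson_deviance(y, mu_null)

# Toy example: four units on a line, each adjacent to its neighbours
adjacency = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]])
counts = np.array([1.0, 2.0, 3.0, 4.0])
null_fit = np.full(4, counts.mean())

print(morans_i(counts, adjacency))
print(deviance_explained(counts, counts, null_fit))
```

On this toy gradient Moran's I is positive because adjacent units carry similar values, and a model that reproduces the counts exactly reaches a deviance explained of 1.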

Table 3 Moran indices, deviance explained and distribution of prediction quality for each model fitted on the observed data, by epidemic period.


Discussion

Our main result is an inability of our ecological covariates alone to raise the proportion of satisfying prediction above 21%, indicating a predictive performance on the observed data barely superior to that of the simple estimated mean incidence over the study region. Putting aside the Low Incidence period for which cases are extremely sparse across the study region, the addition of spatial random effect produced major increases in predictive quality. In conjunction with the notable improvement in explained deviance with the spatial effects, this suggests our choice of a travel time metric for the neighbourhood matrix is appropriate. However, a proper assessment of this adequacy remains to be done.

The present literature remains scarce regarding small-area modelling of infectious disease, limiting our ability to compare the present results appropriately. We identified a handful of authors reporting on small-area analysis about COVID-19 incidence^7,8,17,18 but without providing a quantitative estimation of the predictive ability of the fixed effects in each model. Our report therefore brings to the literature a quantification of the potential for ecological variables to contribute to infection risk prediction, although our initial aim of providing proper estimates of risk on which the local authorities could act was not reached. We can however strongly stress the necessity to present an exhaustive panel of model performance metrics when conducting this type of ecological analysis at a small spatial scale.

The results found regarding the socio-economic and mobility indicators are less clear, with only two indicators maintaining a relatively constant and significant effect across study periods: the proportion of immigrants and the proportion of families without a child. The protective effect of families without a child may be related to the family size (i.e. number of family sources) and lack of exposure to school-based contamination thus limiting household transmissions¹⁹. Our results are concordant with other findings that identified higher risk of Covid-19 infection among immigrants²⁰. Time-varying effects of the ecological covariates could be related to changes in behaviour within specific populations, as was described in Germany during the first wave in 2020³, or ultimately to unmeasured confounding.

Our study has several limitations. First, for reasons of personal data protection, we were unable to conduct an analysis standardized for age and sex: more than 30% of the cases observed during the entire study period would have been censored. Estimates of the age-related risk of infection suggest an upward trend in risk with increasing age²¹, indicating a potential excess of cases in units where the elderly are overrepresented, and the opposite in areas inhabited by a younger population. With regard to the sex-related risk, other studies found variable results, with a tendency toward a higher risk of positivity among males compared to females^6,22.

Our second caveat is the unavailability of the counts of individuals tested for SARS-CoV-2 infection. Indeed, assuming a homogeneous testing rate across all spatial units is a strong hypothesis⁵. Moreover, similar work on French data has established that the testing rate varies with socio-economic level, with testing decreasing as deprivation increases⁷. It is therefore reasonable to suspect that the counts of infection cases are underestimated in more deprived areas, which could bias the estimates of the association between incidence rates and socio-economic indicators.

In conclusion, the present findings suggest that ecological indicators at a small-area scale (1000–10,000 inhabitants) are insufficient to predict incidence rates during an infectious disease epidemic, at least when applied to a respiratory infectious disease such as Covid-19. As modelling strategies in disease mapping frequently involve the use of random effects to account for excess variability and autocorrelation, it appears advisable, when analysing aggregated data at a fine geographic resolution, to thoroughly assess whether the estimated fixed effects are relevant to the overall model performance.

Data availability

The infection cases data that support the findings of this study are available from the French Health Data Hub within the SI-DEP database. Restrictions apply to the availability of these data, which were used under licence for this study. Data may be requested by the following procedure: https://www.demarches-simplifiees.fr/commencer/soumission-d-un-projet-de-recherche-etude-ou-evalu. The census data that support the findings of this study are openly available on the INSEE website at the following links: https://www.insee.fr/fr/statistiques/4799268, https://www.insee.fr/fr/statistiques/4799323, https://www.insee.fr/fr/statistiques/4799252, https://www.insee.fr/fr/statistiques/4799305. The administrative layout data that support the findings of this study are openly available on the INSEE website at the following link: https://www.insee.fr/fr/information/2017499. The geographical data that support the findings of this study are openly available on the Geofabrik website at the following links: https://download.geofabrik.de/europe/france.html. Codes and specific datasets (with the exception of the infection cases dataset) are available on request to the corresponding author.

References

1. Ibrahim, N. K. Epidemiologic surveillance for controlling Covid-19 pandemic: Types, challenges and implications. J. Infect. Public Health 13(11), 1630–1638 (2020).
2. Décret n° 2020–551 du 12 mai 2020 relatif aux systèmes d’information mentionnés à l’article 11 de la loi n° 2020–546 du 11 mai 2020 prorogeant l’état d’urgence sanitaire et complétant ses dispositions [Internet]. 13 May 2020. Available at: https://www.legifrance.gouv.fr/loda/id/JORFTEXT000041869923/
3. Plümper, T. & Neumayer, E. The pandemic predominantly hits poor neighbourhoods? SARS-CoV-2 infections and COVID-19 fatalities in German districts. Eur. J. Public Health 30(6), 1176–1180 (2020).
4. Scarpone, C. et al. A multimethod approach for county-scale geospatial analysis of emerging infectious diseases: A cross-sectional case study of COVID-19 incidence in Germany. Int. J. Health Geogr. 19(1), 32 (2020).
5. Padellini, T. et al. Time varying association between deprivation, ethnicity and SARS-CoV-2 infections in England: A population-based ecological study. Lancet Reg. Health Eur. 15, 100322 (2022).
6. Vahidy, F. S. et al. Sex differences in susceptibility, severity, and outcomes of coronavirus disease 2019: Cross-sectional analysis from a diverse US metropolitan area. PLoS ONE 16(1), e0245556 (2021).
7. Vandentorren, S. et al. The effect of social deprivation on the dynamic of SARS-CoV-2 infection in France: A population-based analysis. The Lancet Public Health 7(3), e240–e249 (2022).
8. Fernández-Martínez, N. F. et al. Socioeconomic differences in COVID-19 infection, hospitalisation and mortality in urban areas in a region in the South of Europe. BMC Public Health 22(1), 2316 (2022).
9. Ahmad, K. et al. Association of poor housing conditions with COVID-19 incidence and mortality across US counties. PLoS ONE 15(11), e0241327 (2020).
10. Leroux, B. G., Lei, X. & Breslow, N. Estimation of disease rates in small areas: A new mixed model for spatial dependence. In Statistical models in epidemiology, the environment, and clinical trials (eds Halloran, M. E. & Berry, D.) 179–191 (Springer, 2000).
11. MacNab, Y. C. Hierarchical Bayesian modeling of spatially correlated health service outcome and utilization rates. Biometrics 59(2), 305–315 (2003).
12. Wood, S. N. Generalized additive models: An introduction with R 2nd edn, 496 (Chapman and Hall/CRC, 2017).
13. Brooks, S. P. & Gelman, A. General methods for monitoring convergence of iterative simulations. J. Comput. Graph. Statist. 7(4), 434–455 (1998).
14. Contributors. Conveyal R5 Routing Engine. GitHub repository. 2022 [cited 27 Jul 2022]. Available at: https://github.com/conveyal/r5
15. Lee, D. CARBayes: An R package for bayesian spatial modeling with conditional autoregressive priors. J. Stat. Softw. 55(13), 1–24 (2013).
16. Article 29 Data Protection Working Party. Opinion 05/2014 on Anonymisation Techniques [Internet]. European Commission; 2014 Apr [cited 13 Sep 2024]. Report No.: WP216. Available at: https://ec.europa.eu/justice/article-29/documentation/opinion-recommendation/files/2014/wp216_en.pdf
17. Arauzo-Carod, J. M., Domènech, A. & Gutiérrez, A. Do local characteristics act in a similar way for the first two waves of COVID-19? Analysis at intraurban level in Barcelona. J. Public Health 43(3), 455–461 (2021).
18. Riou, J. et al. Socioeconomic position and the COVID-19 care cascade from testing to mortality in Switzerland: A population-based analysis. The Lancet Public Health 6(9), e683–e691 (2021).
19. Madewell, Z. J., Yang, Y., Longini, I. M. Jr., Halloran, M. E. & Dean, N. E. Household secondary attack rates of SARS-CoV-2 by variant and vaccination status: An updated systematic review and meta-analysis. JAMA Netw. Open 5(4), e229317 (2022).
20. Gosselin, A., Warszawski, J., Bajos, N., EpiCov Study Group. Higher risk, higher protection: COVID-19 risk among immigrants in France – Results from the population-based EpiCov survey. Eur. J. Public Health 32(4), 655–663 (2022).
21. Goldstein, E., Lipsitch, M. & Cevik, M. On the effect of age on the transmission of SARS-CoV-2 in households, schools, and the community. J. Infect. Dis. 223(3), 362–369 (2021).
22. Prinelli, F. et al. Sex- and gender-related differences linked to SARS-CoV-2 infection among the participants in the web-based EPICOVID19 survey: The hormonal hypothesis. Maturitas 158, 61–69 (2022).


Acknowledgements

We thank the DATA direction at Santé publique France for providing us with the ad hoc dataset, and Dr. Christine Saura from Santé publique France for her input during the elaboration of this study. We also thank the three reviewers for their productive observations and suggestions to improve upon our original manuscript.

Funding

The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.

Author information

Authors and Affiliations

Université Lyon 1, Lyon, France
Nicolas Romain-Scelle & Muriel Rabilloud
Laboratoire de Biométrie et Biologie Evolutive - UMR CNRS 5558, Villeurbanne, France
Nicolas Romain-Scelle & Muriel Rabilloud
Service de Biostatistiques-Bioinformatique, Hospices Civils de Lyon, Lyon, France
Nicolas Romain-Scelle, Benjamin Riche & Muriel Rabilloud
Auvergne-Rhône-Alpes Regional Office, Santé Publique France, Lyon, France
Thomas Benet


Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Nicolas ROMAIN-SCELLE. The first draft of the manuscript was written by Nicolas ROMAIN-SCELLE and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Nicolas Romain-Scelle.

Ethics declarations

Competing interest

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.


About this article

Cite this article

Romain-Scelle, N., Riche, B., Benet, T. et al. Predictive quality of census-based socio-economic indicators on Covid-19 infection risk at a fine spatial scale in France. Sci Rep 15, 22076 (2025). https://doi.org/10.1038/s41598-025-03768-0


Received: 08 September 2024
Accepted: 22 May 2025
Published: 01 July 2025
Version of record: 01 July 2025
DOI: https://doi.org/10.1038/s41598-025-03768-0