Assessing COVID-19 transmission through school and family networks using population-level registry data from the Netherlands

Garcia-Bernardo, Javier; Hedde-von Westernhagen, Christine; Emery, Tom; van Hoek, Albert Jan

doi:10.1038/s41598-024-82646-7

Article
Open access
Published: 28 December 2024

Javier Garcia-Bernardo^1,2,
Christine Hedde-von Westernhagen³,
Tom Emery⁴ &
…
Albert Jan van Hoek⁵

Scientific Reports volume 14, Article number: 31248 (2024)

Abstract

Understanding the impact of different types of social interactions is key to improving epidemic models. Here, we use extensive registry data—including PCR test results and population-level networks—to investigate the impact of school, family, and other social contacts on SARS-CoV-2 transmission in the Netherlands (June 2020–October 2021). We isolate and compare different contexts of potential SARS-CoV-2 transmission by matching pairs of students based on their attendance at the same or different primary school (in 2020) and secondary school (in 2021) and their geographic proximity. We then calculate the probability of temporally associated infections—i.e. the probability of both students testing positive within a 14-day period. Our results highlight the relative importance of household and family transmission in the spread of SARS-CoV-2 compared to school settings. The probability of temporally associated infections for siblings and parent-child pairs living in the same household ranged from 22.6–23.2%. Interestingly, a high probability (4.7–7.9%) was found even when family members lived in different households, underscoring the persistent risk of transmission within family networks. In contrast, the probability of temporally associated infections was 0.52% for pairs of students living nearby but not attending the same primary or secondary school, 0.66% for pairs attending different secondary schools but having attended the same primary school, and 1.65% for pairs attending the same secondary school. It is worth noting, however, that even small increases in school-related infection probabilities can trigger large-scale outbreaks due to the dense network of interactions in these settings. Finally, we used multilevel regression analyses to examine how individual, school, and geographic factors contribute to transmission risk. We found that the largest differences in transmission probabilities were due to unobserved individual (60%) and school-level (35%) factors. 
Only a small proportion (3%) could be attributed to geographic proximity of students or to school size, denomination, or the median income of the school area.

Introduction

Epidemic models that explicitly incorporate network structure have gained traction during the COVID-19 pandemic^1,2,3,4,5. Including information about the intricate network structure of the population allows for better predictions of the shape of the epidemic curve, the regions or population groups likely to be infected given observed cases, and the types of contacts relevant for transmission. All three are highly relevant for public health responses.

Despite advances in network-based epidemic modeling during the COVID-19 pandemic, a significant gap remains in measuring the relative impact of different types of school and family contacts on transmission dynamics. Previous studies on influenza^6,7 and SARS-CoV-2^8,9 highlight the importance of schools as bridges for disease transmission between households, schools and other social contact areas. Indeed, in an attempt to contain the spread of the virus, governments around the world closed schools, resulting in large learning losses, especially among students from less-educated families¹⁰.

At the same time, studies in diverse countries such as the United Kingdom, Australia, and Singapore emphasize that transmission within schools can be managed with interventions such as physical distancing, air filtering and rapid isolation, and that household contacts remain a more prominent pathway for transmission^{11,12,13,14,15,16,17}. For instance, Cordery et al.¹³ observed minimal school-based transmission when precautions were in place, contrasting with high secondary transmission rates in households, likely due to prolonged close contact and viral shedding. Furthermore, in Wales, Thompson et al.¹¹ found that while students faced increased risks of infection from peers in their immediate year groups, the total number of cases in a school was not associated with an increased risk for staff or pupils. Similarly, Macartney et al.¹² observed low SARS-CoV-2 transmission rates in Australian educational settings, suggesting that schools did not contribute significantly to COVID-19 spread when effective case-contact testing and epidemic management strategies were in place.

Our study adds to this body of research by examining the impact of family and school contacts on COVID-19 transmission among Dutch students using detailed population-level registry data from the Netherlands. We specifically examine students who transitioned from primary to secondary school in 2021, a transition that takes place at age 12. Focusing on this transition allows us to distinguish whether infections occur primarily at school, through social ties inherited from primary school, or through non-school interactions such as community transmission. Specifically, we match pairs of students who attended primary school together in 2020 and either attended separate secondary schools or the same school in 2021. Given the large student segregation at schools¹⁸, we expect students who attended the same primary school to be similar to each other. We compare these groups of students to a reference group of students who did not attend primary or secondary school together. We then calculate the probability of temporally associated infections for the different groups as a function of the distance between the students’ homes. Next, we compare our results to the probabilities of temporally associated infections among different family members (siblings, parent-child, co-parents) living in the same house or at varying distances. Finally, we run a series of multilevel regressions to understand the heterogeneity between schools.

Our findings show that family ties contribute strongly to the spread of SARS-CoV-2. While attending the same school increased the probability of temporally associated infections from 0.5% to 1.6%, the probability of associated infections was much higher for family members living in the same house (25–50%) and even for family members living at different addresses (around 10%). During the period studied in this paper, temporally associated infections in primary schools were rare. These results align with previous literature showing a high frequency of secondary transmission in households^11,13 and low frequency of transmission in schools. Examining heterogeneity at the school level, we found that factors such as the distance between the students’ homes, school size, the median income of the postcode area of the school, and school denomination explained only 3% of the variance in outcomes. Most of the variance manifested at the individual level (60%) and at the school level (35%).

The paper proceeds as follows: Section 2 details the data, the matching of students and the analysis of the data. Section 3 details the probability of temporally associated infections for different subgroups. Section 4 concludes and discusses the potential of administrative data for epidemic studies.

Data and methods

Main datasets and network construction

Our analysis integrates two main datasets from Statistics Netherlands (CBS): the COVID-19 PCR-test data and the population network data. All CBS datasets can be linked to one another at the individual level through a unique identifier. Below we provide a short summary of the data processing steps. A detailed explanation of all datasets and variables can be found in the online Supplementary Information.

The first dataset is the COVID-19 test dataset, which includes all PCR-tests conducted by municipal health services in the Netherlands outside of a hospital setting between June 2020 and September 2021. Schools were open for the majority of the period studied (Fig. 1). Since reinfections were unusual for the period studied, we retained for each person the first recorded infection.

The second dataset is the Person Network dataset, which contains formal relationships (i.e., family, school and household relationships recorded officially by the government) between individuals, connecting the entire population of the Netherlands. These connections indicate “a highly increased probability that two individuals interact socially”¹⁹. Administrative networks offer a novel opportunity for researchers studying social processes, since they do not suffer from common drawbacks of studies based on surveys, digital trace data, or contact tracing, such as non-response bias, selection bias, or social desirability effects¹⁹. Furthermore, these data are readily available in the Netherlands as well as in many other countries, and could lower the burden of additional data collection efforts to inform policy decisions in a pandemic situation.

We constructed school networks using educational records from primary and secondary schools. Educational records connect students to their schools, year of education, and program tracks. We included only students who did not attend special schools (schools serving students with special needs, such as blind or deaf individuals), where infection dynamics are likely to be different. To compare infections arising from school, family and non-school interactions (Sections 2.2–2.3) we focus on students transitioning to secondary school in September 2021. For the multilevel regression analysis (Section 2.4) we focus on students registered in primary schools. This approach allows us to compare transmission dynamics across distinct social environments by examining both primary and transitioning secondary students.

To analyze the role of family ties in the transmission of SARS-CoV-2, we collected all family pairs in the following categories: full siblings, co-parents (two adults being the parents of the same child) and parent-child. We extracted family networks using data derived from parent-child records¹⁹. For example, siblings are recorded if they share at least one parent or if their parents are partners. Different types of siblings, such as half-siblings (who share one biological parent) and step-siblings (who have no biological parents in common and are related through their parents’ relationship), may have very different levels of closeness. Some might grow up together, some might grow up in different houses (especially those who share the same father), while others might become siblings as adults and have less frequent contact. To better estimate the probability of temporally associated co-infections in siblings who are likely to keep regular contact, we focus only on full siblings.

Finally, for the multilevel regression analyses we classified schools according to their denomination. The school denomination denotes the type of school and is correlated with attitudes towards COVID-19. In the Netherlands, parents have the right to choose schools that match their values. A majority of schools are Christian (either Protestant, Catholic, Evangelic or Reformist), while around one third are public schools¹⁰. Other denominations include, for example, Islamic and Anthroposophic schools. We also included in the regression analyses the school size, the median income of the school’s neighborhood (at the 4-digit postcode level, an administrative area equivalent to a neighborhood with an average population size of 4,314 and a maximum population of 28,190²⁰), and the distance between the home addresses of each pair of students. For privacy considerations, the location of the houses is only known at a resolution of 100 × 100 m. We assigned all individuals to the last known address before 2021 and kept individuals who remained living in the Netherlands throughout 2021. We estimated the distance between two households as the Euclidean distance plus 52 meters, the average distance between two random points in a 100 × 100 m square. This implies that students living in the same 100 × 100 m square are estimated to live at a distance of 52 meters. Students living in the same household (sharing the same house ID) were set at a distance of 0 meters.
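The distance rule described above can be illustrated with a minimal Python sketch. It assumes, for illustration only, that each home is represented by the reference coordinates (in meters) of its 100 m grid cell; the function and constant names are hypothetical and are not taken from the authors' repository. The 52-meter offset agrees with the closed-form mean distance between two uniformly random points in a unit square (about 0.5214), scaled to a 100 m cell.

```python
import math

# Mean distance between two uniform random points in a unit square
# (closed-form constant, approximately 0.5214).
UNIT_SQUARE_MEAN_DISTANCE = (2 + math.sqrt(2) + 5 * math.log(1 + math.sqrt(2))) / 15

OFFSET_M = 52.0  # rounded offset stated in the text for a 100 m grid cell


def estimated_distance_m(cell_a, cell_b, same_household=False):
    """Estimate the home-to-home distance in meters.

    cell_a, cell_b: (x, y) reference coordinates of each home's grid cell,
    in meters (an assumed representation, for illustration only).
    """
    if same_household:
        # Pairs sharing a house ID are set to a distance of 0 meters.
        return 0.0
    dx = cell_a[0] - cell_b[0]
    dy = cell_a[1] - cell_b[1]
    # Euclidean distance between the cells plus the within-cell offset.
    return math.hypot(dx, dy) + OFFSET_M
```

Under this sketch, two students in the same grid cell but in different households get 52 meters, while two students in the same household get 0 meters.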

Matching students in groups of increasing level of contact

We first analyzed the role of schools in the transmission of SARS-CoV-2 by focusing on students transitioning from primary school to secondary school in 2021 (illustrated in Fig. 2). Focusing on this transition allows us to distinguish whether infections mainly occur at school, or from non-school interactions such as community transmission. We create four groups of student pairs representing increasing levels of contact:

Group 1 (Baseline): Pairs of students who did not attend the same primary or secondary school. Since we are interested in a comparison group of pairs of students living near each other, we oversampled pairs of students living within the same municipality.
Group 2 (Same background): Pairs of students who attended the same primary school (and will have a similar social background) but not the same secondary school.
Group 3 (Same school, different program track): Pairs of students who attended both primary and secondary schools together but were not in the same program track in secondary school, i.e., they attend different classrooms.
Group 4 (Same school, same program track): Pairs of students who attended both primary and secondary schools together and were in the same program track in secondary school.

Finally, we created three separate categories for twins, which we identify as pairs of students living in the same household and attending the same school year:

Twins 2 (Same background): Pairs of twins who attended the same primary school but not the same secondary school.
Twins 3 (Same school, different program track): Pairs of twins who attended both primary and secondary schools together but were not in the same program track in secondary school, i.e., they attend different classrooms.
Twins 4 (Same school, same program track): Pairs of twins who attended both primary and secondary schools together and were in the same program track in secondary school.

Unsurprisingly, we were unable to create a group Twins 1 (Baseline), since there were fewer than 10 twin pairs in the studied cohort who did not attend the same primary school.
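The grouping rules can be sketched as a small classifier. This is only an illustration under stated assumptions: the `Student` record, its field names, and the `contact_group` function are hypothetical, and the sketch assumes that twin pairs are kept out of Groups 2–4 and that pairs sharing only a secondary school fall outside all groups.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    # All field names are hypothetical stand-ins for registry variables.
    student_id: int
    household_id: int
    primary_school: str
    secondary_school: str
    program_track: str
    school_year: int


def contact_group(a, b):
    """Return the contact group label for a pair, or None if no group applies."""
    same_primary = a.primary_school == b.primary_school
    same_secondary = a.secondary_school == b.secondary_school
    same_track = same_secondary and a.program_track == b.program_track
    # Twins: same household and same school year.
    is_twin = a.household_id == b.household_id and a.school_year == b.school_year

    if not same_primary:
        # Baseline requires different primary AND different secondary schools.
        # No "Twins 1" group exists, so such twin pairs are left unclassified.
        if same_secondary or is_twin:
            return None
        return "Group 1"

    if not same_secondary:
        level = 2
    elif not same_track:
        level = 3
    else:
        level = 4
    prefix = "Twins" if is_twin else "Group"
    return f"{prefix} {level}"
```

Returning None for uncovered combinations keeps the four groups mutually exclusive, which mirrors the nested "increasing level of contact" design.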

Probability of temporally associated infection

We calculated the probability of temporally associated infections within a 14-day period for each group as a function of distance between the places of residence of two students or two family members. To preserve the individuals’ privacy, results can only be exported from the secure computer of CBS in groups of at least 10 individuals. Because of this, we calculated the number of student or family pairs and the number of temporally associated infected pairs for the following distance bins: 0 m, 0–300 m, 300–1000 m, 1000–3000 m, 3000–10,000 m, 30,000+ m. Each bin excludes its left boundary (e.g., 0–300 m includes distances greater than 0 m and up to 300 m), except for the 0 m bin, which represents individuals living in the same household. The 14-day period was chosen based on the approximately 7-day incubation and generation periods for SARS-CoV-2²¹, but the results are robust to changes of this threshold.

The probability of temporally associated infections within a distance bin d is calculated as $P_{\text {temp}}(d) = \frac{\sum _{i,j} I_{ij}(d)}{\sum _{i,j} N_{ij}(d)},$ where $I_{ij}(d)$ equals one if the student or family pair (i, j) lives within the distance bin d and both members tested positive within a 14-day period (and zero otherwise), and $N_{ij}(d)$ equals one if the pair (i, j) lives within the distance bin d (and zero otherwise), so that the denominator is the total number of student or family pairs within the distance bin d.

The number of student and family pairs ($\sum _{i,j} N_{ij}(d)$) and the number of temporally associated infections ($\sum _{i,j} I_{ij}(d)$) are given in Table 1 for school ties and in Table 2 for family ties.
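As a rough illustration of how $P_{\text {temp}}(d)$ could be computed from pair-level records, consider the following Python sketch. The bin edges in the code are illustrative placeholders, the input layout (distance, first positive test date of each member, or None) is an assumption, and treating "within a 14-day period" as an absolute difference of at most 14 days is also an assumption. All names are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict
from datetime import date

# Illustrative upper bin edges in meters. Index 0 is the same-household bin;
# the last index collects everything above the largest edge.
BIN_UPPER_EDGES_M = [0, 300, 1000, 3000, 10_000, 30_000]


def bin_index(distance_m):
    """Left-exclusive, right-inclusive binning: 300 m falls in (0, 300]."""
    return bisect_left(BIN_UPPER_EDGES_M, distance_m)


def temporally_associated(date_i, date_j, window_days=14):
    """True if both members tested positive at most `window_days` apart."""
    if date_i is None or date_j is None:
        return False
    return abs((date_i - date_j).days) <= window_days


def p_temp_by_bin(pairs):
    """pairs: iterable of (distance_m, first_positive_i, first_positive_j)."""
    n_pairs = defaultdict(int)  # sum of N_ij(d)
    n_inf = defaultdict(int)    # sum of I_ij(d)
    for distance_m, date_i, date_j in pairs:
        b = bin_index(distance_m)
        n_pairs[b] += 1
        n_inf[b] += int(temporally_associated(date_i, date_j))
    return {b: n_inf[b] / n_pairs[b] for b in n_pairs}


example_pairs = [
    (0, date(2021, 1, 1), date(2021, 1, 10)),    # same household, associated
    (0, date(2021, 1, 1), None),                 # same household, one never positive
    (52, date(2021, 1, 1), date(2021, 3, 1)),    # nearby, too far apart in time
    (300, date(2021, 1, 1), date(2021, 1, 15)),  # nearby, exactly 14 days apart
]
probabilities = p_temp_by_bin(example_pairs)
```

In a real export, bins with fewer than 10 individuals would additionally have to be suppressed to satisfy the CBS disclosure rule mentioned above; the sketch leaves that step out.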

Table 1 Number of student pairs (N) and student pairs co-infected within 14 days of each other (N_inf) as a function of distance and background. G1: Pairs of students who did not attend the same primary (in 2020) or secondary school (in 2021). G2–4: Pairs who attended the same primary school in 2020 and (G2) did not attend the same secondary school in 2021; (G3) attended the same secondary school but a different program; (G4) attended the same secondary school and the same program. Note that results can only be exported from the secure computer of CBS in groups of at least 10 individuals.

Table 2 Number of family pairs (N) and family pairs co-infected within 14 days of each other (N_inf) as a function of distance and type of family tie.

Statistical analysis of school and municipality heterogeneity

In our second analysis we focused on the factors driving the transmission dynamics between students. We conducted a multilevel regression analysis, where we modeled temporally associated infections between student pairs using logistic regressions, with parameters estimated via maximum likelihood estimation.

Multilevel models can accurately estimate regression parameters when data are hierarchically structured and thus violate the assumption of independence of observations. They furthermore allow variance to be attributed to the respective levels in the data structure.

We constructed models accounting for a three-level structure: individuals, schools, and municipalities (Dutch: gemeenten). This enabled us to identify the extent to which the school context contributes to temporally associated infection events, separating it from influences at the individual and municipal level.

We added several explanatory variables at each level to explain variability in temporally associated infection probabilities. At the individual level, we added the distance between student pairs as a predictor of an associated infection. Due to the skewed distribution of the variable, and to aid convergence in parameter estimation, we took its natural logarithm, centered, and z-scaled it (i.e., subtracted the mean and divided by the standard deviation to normalize the values). The school-level predictors were the number of students (indicating the size of the school), the median income of the school’s 4-digit postcode area, and the school’s denomination (if any). School size and income were centered and z-scaled for the same reasons as the distance variable.
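The transformation of the distance variable can be sketched as follows. This is a minimal illustration that assumes strictly positive distances and uses the sample standard deviation; the text does not say which standard deviation estimator was used, and the function name is hypothetical.

```python
import math
import statistics


def log_z_scale(values):
    """Natural log, then center and scale to unit standard deviation.

    Assumes all values are strictly positive, since log(0) is undefined.
    """
    logged = [math.log(v) for v in values]
    mean = statistics.fmean(logged)
    sd = statistics.stdev(logged)  # sample SD (an assumption)
    return [(x - mean) / sd for x in logged]


# Example with illustrative distances in meters.
scaled_distances = log_z_scale([52.0, 300.0, 1000.0, 5000.0])
```

After this step a coefficient on distance is interpreted per standard deviation of log-distance, which puts it on a scale comparable to the z-scaled school size and income.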

A first model served as the baseline, decomposing the variance at the different levels by including random intercepts for schools and municipalities, but not including any predictors:

$$\begin{aligned} y_{ism} = \gamma _{000} + v_{0m} + u_{0sm} + e_{ism}, \end{aligned}$$

(1)

where i denotes an individual, s a school, and m a municipality. In the equation, $\gamma _{000}$ is the overall intercept and $v_{0m}, u_{0sm}, e_{ism}$ represent the error terms at the municipality, school, and individual level, respectively.

We then estimated a second model including the random intercepts introduced above as well as the predictors at the individual and school level:

$$\begin{aligned} y_{ism} = \gamma _{000} + \gamma _{p00} X_{pism} + \gamma _{0q0} Z_{qsm} + v_{0m} + u_{0sm} + e_{ism}, \end{aligned}$$

(2)

where $X_{pism}$ corresponds to the ($p=1$) predictor at the individual level: the logged distance between students. $Z_{qsm}$ represents the ($q=3$) predictors at the school level: school size, median income, and school denomination.

Finally, we included random slopes ($u_{psm}$) at the school level for the distance between student pairs ($X_{pism}$):

$$\begin{aligned} y_{ism} = \gamma _{000} + \gamma _{p00} X_{pism} + \gamma _{0q0} Z_{qsm} + u_{psm} X_{pism} + v_{0m} + u_{0sm} + e_{ism}. \end{aligned}$$

(3)

Significance of predictors was assessed using Wald tests at $\alpha =0.05$. Significant differences between the models were determined by likelihood-ratio tests. Explained variance at different levels of the models was calculated according to the method proposed by McKelvey & Zavoina²², which relates the systematic variance of the model introduced by the predictor variables to the residual variance at all levels. Model coefficients and variances were furthermore rescaled following Hox et al.²³ (p. 125) to enable comparison of explained variance across models (see also Hox et al.²³, pp. 121–125, for more clarification on variance calculation and rescaling procedures in multilevel models for dichotomous outcomes).

Instead of running the three models for all the different groups of student pairs introduced in Section 2.2, we restricted this part of the study to students who attended primary education together (in the same class year) in 2021. Furthermore, the analyses were based on a 5-percent sample of all schools in the data, with an inclusion probability proportional to school size. This was done in order to make the models computationally feasible, while still including a substantial number of different schools from various areas. The sampling yielded a dataset of 2,509,927 observations representing student pairs, grouped in 312 schools and 174 municipalities. While there is a large class imbalance, with 0.1% of the student pairs temporally co-infected, we follow the advice of recent research and do not correct for it²⁴. Coefficient estimates are stable under high class imbalance as long as there are enough observations in the minority class, and corrections tend to miscalibrate the models²⁴.

The statistical models were run using the lme4 library in R. The Python and R code documenting the performed steps of all data processing and analysis procedures is available at https://github.com/jgarciab/covid_schools/.

Results and discussion

Shared school and classroom environments

We first analyzed the role of schools in the transmission of SARS-CoV-2 by focusing on students transitioning from primary school to secondary school (see Methods Section 2.2), which allowed us to better separate school from non-school social interactions such as community transmission.

We found that the probability of temporally associated infection was 0.52% (95% confidence interval (CI): 0.49–0.56%) for the baseline group, 1.11% (CI: 1.01–1.20%) for students in group 3, and 1.65% (CI: 1.52–1.77%) for students in group 4. Compared with the baseline group (group 1), attending the same primary school and the same program track in secondary school (group 4) increased the probability of associated infections significantly by 213% (CI: 183–247%, see Fig. 3A). Attending secondary school in a different program track (group 3)—thus not sharing a classroom so frequently—increased the probability of associated infections significantly by 111% (CI: 89–135%, see Fig. 3A).

Students who attended primary school together but not secondary school together (group 2) had only a slightly higher probability of temporally associated infections (0.66%, CI: 0.62–0.69%) compared to the baseline (0.52%). This small difference indicates that social ties inherited from primary school had little impact on this probability. Moreover, for all groups of students, the distance between students’ houses had only a minor effect on the probability of associated infection (Fig. 3B).

Shared household and family contexts

After assessing the increase in the probability of temporally associated infections for students attending the same school, we examined how this probability compares to the probability for individuals of the same family.

For twins living in the same household, attending the same school modestly increased the probability of temporally associated infection: 33% (28 out of 84 pairs) for twins attending different secondary schools, compared to 39% (24 out of 62 pairs) for twins attending the same school and a different track, and 50% (54 out of 107 pairs, CI: 41–60%) for twins attending the same school and program track (Fig. 4A). Living in the same household results in a large increase in the probability of temporally associated infection (Fig. 4B). The estimated probabilities ranged from 23% for sibling pairs and parent-child pairs to 50% (CI: 41–60%) for twins attending the same program track in secondary school. This increased risk is presumably due to prolonged exposure if there is an active case. The probabilities for family pairs are much larger than the estimated 1.6% for students in group 4 (attending the same year group in primary and secondary school). This finding aligns with current scientific knowledge highlighting the key role of household transmission in the spread of SARS-CoV-2 (see for example^8,9,11,13).
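The text does not state how the confidence intervals for these proportions were computed. As one plausible option, a Wilson score interval can be calculated from the pair counts; the sketch below is an assumption about the method rather than the authors' procedure, and the function name is hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (default ~95% level)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Twin-pair counts quoted in the text: (co-infected pairs, all pairs).
twin_counts = {
    "different secondary school": (28, 84),
    "same school, different track": (24, 62),
    "same school, same track": (54, 107),
}
twin_intervals = {label: wilson_interval(k, n) for label, (k, n) in twin_counts.items()}
```

For 54 out of 107 pairs this gives an interval of roughly 41% to 60%, in line with the interval reported for twins in the same program track.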

Among individuals who do not share a household, family relationships are highly predictive of the probability of temporally associated infections (Fig. 4B). The probability of temporally associated infections for parent-child, co-parents and siblings living in different but nearby households is 7–12%. This probability decreases with distance (Fig. 4B), as social interactions between family members are more likely when they live close together.

School and municipality heterogeneity in the probability of temporally associated infections

We finally investigated the determinants of temporally associated infections through a series of multilevel regression models (Table 3), which explicitly attribute variance to different levels of observational units (see Section 2.4). The presented results of Model 1 (including random intercepts for schools and municipalities but no predictors) serve as a baseline to assess the variance explained by the predictors. The majority of the total variance in this three-level structure manifested at the individual level with $\frac{3.29}{3.29+1.93+0.25} = 0.60$ (i.e., 60% of the variance is due to individual-level differences). The school-level variance made up a share of 0.35, which left 0.05 to the municipality level.
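The variance shares quoted above can be reproduced with a few lines of Python. The sketch assumes the usual convention for multilevel logistic models, in which the individual-level residual variance is fixed at $\pi^2/3 \approx 3.29$; the school and municipality variances are the values quoted in the text, and the function name is hypothetical.

```python
import math

# Residual variance of the standard logistic distribution (about 3.29).
LEVEL1_LOGISTIC_VARIANCE = math.pi**2 / 3


def variance_shares(school_var, municipality_var,
                    individual_var=LEVEL1_LOGISTIC_VARIANCE):
    """Share of total variance attributed to each level of the model."""
    total = individual_var + school_var + municipality_var
    return {
        "individual": individual_var / total,
        "school": school_var / total,
        "municipality": municipality_var / total,
    }


# Random-intercept variances of Model 1 as quoted in the text.
shares = variance_shares(school_var=1.93, municipality_var=0.25)
```

Rounded to two decimals, the shares come out as 0.60, 0.35 and 0.05, matching the decomposition in the text.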

Table 3 Results of logistic multilevel regression models. Statistically significant coefficients are marked as *** $p<0.001$, ** $p<0.01$, * $p<0.05$. Standard errors are displayed next to the coefficients in parentheses. The school denominations were translated from Dutch. Codes of the original variable from the CBS dataset INSCHRWPOTAB are displayed in square brackets. The reference category of denomination is Specialized non-denominational education [ABZ] (e.g., Montessori). Numerical variables were centered and z-scaled.

We then included predictors indicating the residential distance of student pairs, the size of the school they attended, the median income of the postcode area of the school, and the school’s denomination (Model 2). The random effects remained the same as in Model 1—i.e., including random intercepts for schools and municipalities.

The distance between student pairs’ homes was found to significantly decrease the temporally associated infection probability. This is in line with the results of the preceding probability analysis, which identified student pairs living in the same household as facing the highest risk of temporally associated infection. We also found significant decreases in associated infection probabilities in Islamic and Protestant schools as compared to the reference category. While Islamic schools are only a small minority of all schools, Protestant schools, together with public and Catholic schools, are among the largest denominations. However, our analysis does not distinguish whether the results are driven by a lower spread of the virus or by a lower propensity to test. School size and median income in the postcode area did not show a significant association with temporally associated infections.

In total, the predictors were able to explain 3% of the overall variance. This can be calculated by the share of variance in the linear predictor compared to the total variance of Model 2, or comparing the total variances of Model 2 and Model 1. Looking at variance reduction at all levels individually, variance decreased by 3 percent at the individual level, 6.2 percent at the school level, and, most notably, by 48 percent at the municipality level. While the predictors explained a large share of variability at the municipality level, the variability was very low to begin with (5% of the total). A likelihood-ratio test confirmed a significant improvement in model fit ($\chi ^2(13) = 65.24, p < 0.001$) of Model 2 over Model 1.

Finally, in Model 3, we included random slopes of student-pair distance at the school level, i.e., the effect of distance was allowed to vary by school. A term for intercept-slope covariance was also included, modeling the strength of the distance effect depending on the average probability by school. These parameters substantially increased model fit, meaning that the association between the distance separating the students’ residences and the temporally associated infection probability did indeed depend on the specific school. This is also indicated by the significant result of the likelihood-ratio test comparing Model 3 to Model 2 ($\chi ^2(2) = 200.16, p < 0.001$).

To conclude, the variance decomposition of the multi-level models showed that the vast majority of variability in the data results from differences at the individual level (60%) and the school level (35%). While we could find significant effects of student distance and school denomination on temporally associated infection probability, these effects could explain only 3 percent of the overall variance. Possible omitted factors driving differences in this probability could be families’ attitudes towards COVID-19, or prevention measures implemented at the school level.

Conclusion

In this paper we investigate the impact of schools and families in the temporal association of SARS-CoV-2 infections among students during the period from June 2020 to September 2021. This is possible by integrating population-scale networks and PCR test result data using registry data from Statistics Netherlands.

Our results show that living together at home is the most significant factor correlated with two individuals testing positive within a 14-day period, underscoring the importance of household transmission in the spread of the virus. Both social ties inherited from primary school and geographical distance were found to have little effect on the probability of both students testing positive within a 14-day period. This suggests that either social ties with classmates in primary school are weakened after students move to secondary school, or that COVID restriction strategies targeting non-school social networks were highly effective. Future studies could estimate the effect of school and non-school restrictions on students’ contacts and SARS-CoV-2 transmission.

In contrast with the low impact of social ties inherited from primary school, shared school and classroom environments were found to significantly increase the likelihood that both students would test positive within a 14-day period. Although the likelihood of temporally associated infections in schools was low, it should be noted that even small increases in the transmission rate in schools can lead to larger outbreaks, since the transmission rate is linearly related to the reproduction number²⁵ and a large proportion of children’s contacts are expected to occur in schools. The observed increase in temporally associated infections from 0.6 to 1.6% may lead to reproduction numbers above one infectee per infected when schools reopen²⁶. These insights into the transmission dynamics of SARS-CoV-2 within Dutch families and educational institutions can inform future use of network models and provide insights for possible interventions, such as school closures.

The analysis presented in this paper has a limitation that opens up fruitful avenues for future research. Governments around the world introduced several interventions to reduce transmission, including (partially) closing schools and workplaces and requiring masks in confined spaces. Due to the wide variety of measures at schools, workplaces, and public spaces²⁷, we did not examine the role of school closures. Using data from Statistics Netherlands and a similar methodological approach, further research could explore the effectiveness of various interventions. Furthermore, a similar approach that merges infection test results with population-scale data could be used to understand the effects of school and family connections on the spread of other diseases.

Data availability

Administrative data from Statistics Netherlands (CBS) is available for statistical and scientific research under specific conditions. For more details about the data, as well as access and usage regulations, we refer the reader to Bokányi et al.²⁸ and van der Laan et al.¹⁹. For inquiries regarding micro-data, please contact microdata@cbs.nl.

Aggregated data exported from CBS and all analysis scripts can be found at https://github.com/jgarciab/covid_schools/. For further needs, contact the corresponding author.

References

1. Bustamante-Castañeda, F., Caputo, J., Cruz-Pacheco, G., Knippel, A. & Mouatamide, F. Epidemic Model on a Network: Analysis and Applications to COVID-19. Physica A: Statistical Mechanics and its Applications 564, 125520 (2021).
2. Prasse, B., Achterberg, M. A., Ma, L. & Van Mieghem, P. Network-Inference-Based Prediction of the COVID-19 Epidemic Outbreak in the Chinese Province Hubei. Applied Network Science 5, 1–11 (2020).
3. Firth, J. A. et al. Using a Real-World Network to Model Localized COVID-19 Control Strategies. Nature Medicine 26, 1616–1622 (2020).
4. Sanchez, F. et al. A Multilayer Network Model of Covid-19: Implications in Public Health Policy in Costa Rica. Epidemics 39, 100577 (2022).
5. Cui, Y., Ni, S. & Shen, S. A network-based model to explore the role of testing in the epidemiological control of the COVID-19 pandemic. BMC Infectious Diseases 21, 1–12 (2021).
6. Endo, A. et al. Within and between Classroom Transmission Patterns of Seasonal Influenza among Primary School Students in Matsumoto City, Japan. Proceedings of the National Academy of Sciences 118, e2112605118 (2021).
7. Cauchemez, S. et al. Role of Social Networks in Shaping Disease Transmission during a Community Outbreak of 2009 H1N1 Pandemic Influenza. Proceedings of the National Academy of Sciences 108, 2825–2830 (2011).
8. Van Iersel, S. C. J. L. et al. Empirical Evidence of Transmission over a School-Household Network for SARS-CoV-2; Exploration of Transmission Pairs Stratified by Primary and Secondary School. Epidemics 43, 100675 (2023).
9. CMMID COVID-19 Working Group et al. Implications of the School-Household Network Structure on SARS-CoV-2 Transmission under School Reopening Strategies in England. Nature Communications 12, 1942 (2021).
10. Engzell, P., Frey, A. & Verhagen, M. D. Learning Loss Due to School Closures during the COVID-19 Pandemic. Proceedings of the National Academy of Sciences 118, e2022376118 (2021).
11. Thompson, D. A. et al. Staff–pupil SARS-CoV-2 infection pathways in schools in Wales: a population-level linked data approach. BMJ Paediatrics Open 5 (2021).
12. Macartney, K. et al. Transmission of SARS-CoV-2 in Australian educational settings: a prospective cohort study. The Lancet Child & Adolescent Health 4, 807–816 (2020).
13. Cordery, R. et al. Transmission of SARS-CoV-2 by children to contacts in schools and households: a prospective cohort and environmental sampling study in London. The Lancet Microbe 3, e814–e823 (2022).
14. Heavey, L., Casey, G., Kelly, C., Kelly, D. & McDarby, G. No evidence of secondary transmission of COVID-19 from children attending school in Ireland, 2020. Eurosurveillance 25, 2000903 (2020).
15. Ismail, S. A., Saliba, V., Bernal, J. L., Ramsay, M. E. & Ladhani, S. N. SARS-CoV-2 infection and transmission in educational settings: a prospective, cross-sectional analysis of infection clusters and outbreaks in England. The Lancet Infectious Diseases 21, 344–353 (2021).
16. Yung, C. F. et al. Novel coronavirus 2019 transmission risk in educational settings. Clinical Infectious Diseases 72, 1055–1058 (2021).
17. Vardavas, C. et al. Transmission of SARS-CoV-2 in educational settings in 2020: a review. BMJ Open 12, e058308 (2022).
18. Kazmina, Y., Heemskerk, E. M., Bokányi, E. & Takes, F. W. Socio-economic segregation in a population-scale social network. Social Networks 78, 279–291 (2024).
19. Van der Laan, J., de Jonge, E., Das, M., Te Riele, S. & Emery, T. A Whole Population Network and Its Application for the Social Sciences. European Sociological Review, jcac026 (2022).
20. Centraal Bureau voor de Statistiek (CBS). Data from: Kerncijfers per postcode. https://www.cbs.nl/nl-nl/dossier/nederland-regionaal/geografische-data/gegevens-per-postcode.
21. Zhao, S. et al. Estimating the Generation Interval and Inferring the Latent Period of COVID-19 from the Contact Tracing Data. Epidemics 36, 100482 (2021).
22. McKelvey, R. D. & Zavoina, W. A Statistical Model for the Analysis of Ordinal Level Dependent Variables. The Journal of Mathematical Sociology 4, 103–120 (1975).
23. Hox, J. J., Moerbeek, M. & Van De Schoot, R. Multilevel Analysis: Techniques and Applications 3rd edn (Routledge, New York, NY, 2017).
24. Van den Goorbergh, R., van Smeden, M., Timmerman, D. & Van Calster, B. The harm of class imbalance corrections for risk prediction models: illustration and simulation using logistic regression. Journal of the American Medical Informatics Association 29, 1525–1534 (2022).
25. Miller, J. C. A Note on the Derivation of Epidemic Final Sizes. Bulletin of Mathematical Biology 74, 2125–2141 (2012).
26. Munday, J. D. et al. Estimating the Impact of Reopening Schools on the Reproduction Number of SARS-CoV-2 in England, Using Weekly Contact Survey Data. BMC Medicine 19, 233 (2021).
27. Rozhnova, G. et al. Model-based evaluation of school- and non-school-related measures to control the COVID-19 pandemic. Nature Communications 12, 1614 (2021).
28. Bokányi, E., Heemskerk, E. M. & Takes, F. W. The anatomy of a population-scale social network. Scientific Reports 13, 9209 (2023).

Acknowledgements

This work was financed by the Ministry of Health, Welfare and Sport (Ministerie van Volksgezondheid, Welzijn en Sport: H16-4068-27762) and facilitated by ODISSEI (https://odissei-data.nl/), the Dutch National Infrastructure for Social Research.

Author information

Authors and Affiliations

ODISSEI Social Data Science (SoDa) Team & Department of Methodology and Statistics, Utrecht University, Utrecht, Netherlands
Javier Garcia-Bernardo
Centre for Complex Systems Studies, Utrecht University, Utrecht, Netherlands
Javier Garcia-Bernardo
Department of Industrial Engineering and Innovation Sciences, Eindhoven University of Technology, Eindhoven, Netherlands
Christine Hedde-von Westernhagen
ODISSEI & Department of Public Administration and Sociology, Erasmus University, Rotterdam, Netherlands
Tom Emery
Centre for Infectious Diseases Control, National Institute for Public Health and the Environment, Bilthoven, Netherlands
Albert Jan van Hoek

Contributions

Conceptualization: all authors; methodology, data curation and analysis: J.G.B., C.H.vW.; writing (original draft preparation): J.G.B., C.H.vW.; writing (review and editing): all authors; visualization: J.G.B.; supervision: J.G.B., T.E., A.J.vH.; project administration: J.G.B. and T.E.; funding acquisition: T.E.

Corresponding author

Correspondence to Javier Garcia-Bernardo.

Ethics declarations

Ethical statement

Data is collected by Statistics Netherlands (CBS) and the National Institute for Public Health and the Environment (RIVM), and made available to researchers for well-defined projects and statistical analysis. Researchers need to be pre-approved before accessing the data, and all data is pseudonymized and available only in a secure research environment. The data is safeguarded under the stringent privacy regulations set by the Statistics Netherlands Act (“Wet op het Centraal bureau voor de statistiek”) and the European Union’s General Data Protection Regulation, guaranteeing that individual personal information is not revealed during the analysis. All methods were carried out in accordance with relevant guidelines and regulations.

Competing interests

The authors declare no competing interests.

Declaration of generative AI and AI-assisted technologies in the writing process

During the preparation of this work the author(s) used https://www.deepl.com/write for copyediting and to improve readability. After using this tool/service, the author(s) reviewed and edited the content as needed and take(s) full responsibility for the content of the publication.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Garcia-Bernardo, J., Hedde-von Westernhagen, C., Emery, T. et al. Assessing COVID-19 transmission through school and family networks using population-level registry data from the Netherlands. Sci Rep 14, 31248 (2024). https://doi.org/10.1038/s41598-024-82646-7

Received: 18 July 2024
Accepted: 06 December 2024
Published: 28 December 2024
Version of record: 28 December 2024
DOI: https://doi.org/10.1038/s41598-024-82646-7