Introduction

Globally, non-religious people have been observed to have smaller families than their religious counterparts1,2,3,4,5,6,7. Moreover, within religious traditions, those who report stronger religious beliefs8, higher personal importance of religion9, more frequent religious attendance10,11,12, or higher importance of the religious community13 exhibit higher fertility than those with weaker religious commitments. The positive association between religiosity and fertility is further evidenced by cross-national studies14,15,16,17,18,19,20. For example, Adsera demonstrated that in 1994, weekly churchgoers had larger families than non-religious people across 13 high-income countries including the U.S., Canada, Australia, U.K., Northern Ireland and New Zealand21. Historical data from prior to the industrial revolution22,23,24,25 and ethnographic data from a rural community26 further support the general argument that more religious people have more children. In this study, fertility refers to realized reproductive outcomes, i.e. reported childbearing, rather than to biological fecundity or reproductive capacity.

Why might the strength of religious commitment affect fertility in these high-income, low-fertility countries? Some investigators observe that the central normative claims of religions promote larger families. As such, those with stronger religious beliefs will find direct motivation to reproduce. Moreover, religious people are more likely to object to contraception27,28, oppose abortions29,30,31, and adopt family-oriented worldviews3,9. Religious people also often have higher fertility intentions, and can be more effective in realizing their fertility intentions18, presumably as a result of one or a combination of these norms, or other social factors. More recently, religion-fertility research has considered institutional, socio-economic, and communal factors13 as interdependent features of religion that, along with religious norms and values, affect fertility. Evidence suggests that institutional support from religious organizations can lower the costs of large families and encourage higher fertility. For example, the Catholic Church in Europe prior to the Second Vatican Council in the 1960s operated kindergartens, which lowered the costs of large families to parents and was associated with higher fertility32. Similarly, religious mothers in both low- and high-fertility countries receive more help with their children26,33,34,35,36,37, which may decrease the costs of having additional children38, shorten interbirth intervals, and as a result, lead to higher fertility39.

To date, most studies that examine the relationship between religion and fertility have been correlational. Serious objections to inferring causality from cross-sectional studies on fertility have been raised for at least 30 years40. This is because although religion is associated with fertility, we cannot assess whether becoming religious affects fertility rates, or whether people who transition into parenthood then become more religious. Indeed, the latter claim has some support from qualitative41, experimental42, and longitudinal43,44 studies. Importantly, correlational research may also lead to biased causal interpretations because unobserved third variables can influence both religiosity and fertility. Specifically, factors such as existential security and education are consistently linked to lower fertility rates as well as reduced religious attendance and higher apostasy, especially in the context of developed countries45,46,47,48,49,50,51,52. Likewise, political conservatism may independently foster both higher fertility53 and stronger religiosity54, reflecting shared traditional value orientations rather than a direct causal link between religiosity and fertility.

The evidence that religiosity may affect fertility is scarce. A longitudinal study of Dutch respondents found that while having a child does not predict attendance, attendance predicts later fertility12. Data from Britain, France and the Netherlands suggest that practicing Catholics in France and nominal Catholics and practicing Protestants in the Netherlands had a higher probability of having an additional child in the three years following data collection16. On the other hand, a study using a sample of women in Spain and Italy indicated that maternal religiosity during a daughter’s childhood was negatively associated with later fertility. Only in Spain was paternal religiosity positively associated with a daughter’s fertility55. Thus, early socialization into religion seems to have a mixed impact on later fertility.

Such studies represent substantial advancement over purely cross-sectional studies of religiosity and fertility; however, they still rely on associations observed over time rather than on causal evidence. The reason is that, as two-wave investigations, they cannot adjust for common causes of both religiosity and later fertility, and thus they do not estimate the effect of change in religiosity. Two-wave designs cannot account for situations in which a person who is not religious or not very religious becomes highly religious, or when someone who is highly religious becomes less religious or an apostate. An alternative approach to estimating a causal relationship between religion and fertility was employed by Xie and Zhou56, who used the number of religious sites across Chinese provinces as an instrumental variable for religiosity. This method rests on the assumption that the number of religious sites is unrelated to historical factors—such as economic development, education levels, or cultural norms—that could also influence fertility. Under this assumption, they reported a positive causal effect of religiosity on fertility, significant only at the 10% level, despite a large sample size of over 4,700 respondents. The reliance on assumptions about endogeneity of religious places and the marginal statistical significance call into question the robustness of these findings as strong causal evidence, and further point to the need for more studies that would diversify attempts to track the impact of religion on fertility.

Here, we address this need by providing another approach to inferring the causal effect of religiosity on fertility by testing whether changes in religious attendance and faith importance predict fertility in the eight-year period following these changes. While we outline several mechanisms proposed in the literature that might link religiosity to fertility, we neither evaluate nor compare these mechanisms in our analyses. Our aim is to assess whether becoming more religious is associated with subsequent changes in fertility. Based upon previous research, we predict that becoming more religious will have a positive impact on fertility in the eight following years. The sample we use consists of adolescents from the United States, which is especially suitable to address these questions. First, adolescence is a flexible life period that is recognized across cultures as the most significant period for socializing religious worldviews57, and thus it is likely that this sample will change their religiosity substantially58,59,60,61, which is crucial for estimating the effects of religious change. Moreover, at the time of data collection, the U.S. had relatively high freedom in family planning decisions and ability to achieve fertility intentions through contraception and/or abortions. Therefore, even during those eight years that followed the period in which religious change was estimated, respondents had enough time to adopt a full range of norms, or be affected by accompanying communal factors, and adjust their behavior accordingly.

Because participants in our dataset were relatively young, we also examined whether the effects of changes in religiosity on fertility differ between men and women. Across cultures, male fertility varies more than female fertility62, and some men in the U.S. may employ religion to pursue high fertility goals (cf. ref. 63). Moreover, longitudinal research from 1980 to 2001 indicates that religiosity is associated with delay in the sexual debut of adolescent women but shows mixed effects among men64. If religiosity both delays sexual debut and motivates larger family size, these influences could counteract each other in women, limiting early fertility despite pro-natal motivations, while reinforcing each other in men. On this basis, we predicted that the effect of churchgoing on subsequent fertility would be stronger among men.

Methods

We adopted a three-wave panel design (see Fig. 1) for identifying the causal effect of regular religious service attendance on fertility rates65. In such a design, measurements of common causes of both the treatment and outcome, or descendants of such common causes, are obtained at the first measurement interval, called the “baseline wave.” Included in this covariate set are baseline measures of the treatment and the outcome, because previous states of the treatment and outcome are often the strongest confounders of the subsequent treatment and outcome. Crucially, by including baseline measures of the treatment, we are able to obtain an effect estimate for the incidence of new regular religious service attendance, rather than merely of its prevalence66,67. Baseline measures were obtained in 2003 (t0); the exposure (service attendance and importance of faith) was measured in 2005 (t1); and the outcome, fertility, was measured in 2013 (t2) and, in order to model the incidence of new children, also in 2003 (t0). This combination of waves offers the longest feasible observation period for any effects of religious change to influence fertility outcomes, and thus represents the most suitable design given the structure of the data. The simple conceptual formalization of this approach is given by the formula: fertility(t2) = religiosity(t1) + religiosity(t0) + fertility(t0) + covariates(t0), where religiosity is either faith importance or church attendance. All variables are detailed in the following sections.
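As a minimal sketch of the adjustment logic behind this design, the toy example below compares a naive exposed-versus-unexposed contrast with one standardized over a single baseline stratum. It is not the estimator used in this study; the function names, the single binary stratum, and the invented records are all illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical toy records: (baseline_stratum, exposed_at_t1, children_at_t2).
# All values are invented for illustration; they are not NSYR data.
ROWS = (
    [(0, 1, 1), (0, 1, 0)]                 # stratum 0, exposed: mean 0.5
    + [(0, 0, 1)] * 2 + [(0, 0, 0)] * 6    # stratum 0, unexposed: mean 0.25
    + [(1, 1, 2)] * 2 + [(1, 1, 1)] * 6    # stratum 1, exposed: mean 1.25
    + [(1, 0, 1)] * 2                      # stratum 1, unexposed: mean 1.0
)


def naive_contrast(rows):
    """Unadjusted difference in mean outcome, exposed minus unexposed."""
    exposed = [y for _, a, y in rows if a == 1]
    unexposed = [y for _, a, y in rows if a == 0]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)


def standardized_contrast(rows):
    """Stratum-specific contrasts averaged over the baseline stratum sizes."""
    cells = defaultdict(list)   # (stratum, exposure) -> list of outcomes
    sizes = defaultdict(int)    # stratum -> number of people
    for stratum, a, y in rows:
        cells[(stratum, a)].append(y)
        sizes[stratum] += 1
    total = sum(sizes.values())
    effect = 0.0
    for stratum, n in sizes.items():
        y1, y0 = cells.get((stratum, 1)), cells.get((stratum, 0))
        if not y1 or not y0:
            # Positivity fails: the stratum lacks exposed or unexposed people.
            raise ValueError(f"no exposure variation in stratum {stratum}")
        effect += (n / total) * (sum(y1) / len(y1) - sum(y0) / len(y0))
    return effect
```

In this invented data the naive contrast is 0.7 children, whereas the standardized contrast is 0.25, because exposed people are concentrated in the higher-fertility baseline stratum.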

Fig. 1

Causal diagram using the approach of VanderWeele et al.65 for confounding control in a three-wave panel design (solid lines denote assumed causality, dashed lines denote confounding paths). By including baseline measures of the outcome, the baseline treatment, and a rich array of covariates, we assume that back-door paths between the treatment and outcome will be blocked. However, because of possible unmeasured confounding, we also perform sensitivity analyses with E-values.

Data

Data used in this study come from the National Study of Youth and Religion (hereafter NSYR)68. This project interviewed 3,370 U.S. adolescents through a randomly generated sample representative of all household phone numbers in the U.S. including Alaska and Hawaii. Researchers have used the longitudinal data from this project to study various questions regarding religion, such as ritual participation and morality69, education and religiosity46, or socialization into religion70. The dataset includes four waves collected in 2003, 2005, 2007/8, and 2013. In 2003, respondents were 13–17 years old, meaning that the project followed respondents for around a 10-year period from their adolescence to early adulthood–a life stage when U.S. adults typically start their families and have children. For example, in 2000 white women became mothers at an average age of 25.9, rising to 27.0 in 2014, while African American women became mothers at 22.3 in 2000 and 24.2 in 201471.

Variables

Outcome measures: fertility

We model fertility as a total count of children reported in 2013 (t2). In t2, respondents were asked how many times they were pregnant (if female) and how many times they got someone pregnant (if male). For each of these events, respondents were asked which year and how the pregnancy ended (currently pregnant, abortion, live birth gave up for adoption, live birth kept, miscarriage, stillbirth, and don’t know). This allowed us to recover birth histories for each respondent counting “births kept” to obtain the number of children in t0 and t2. Our method directly measured only biological offspring.

Exposure variables

We conducted two separate analyses using two binary exposure variables. The first is regular church attendance, defined as attending weekly or more often (or not), and the second is faith being very or extremely important in the participant's daily life (or not). Church attendance was measured using two items: one asking how often participants usually attend services at the main place of worship of their denomination, and another asking how often they attend services in a place of worship of a different denomination, if applicable. The original items included seven ordered categorical options (never, few times a year, many times a year, once a month, two to three times a month, once a week, and more than once a week). We dichotomized responses by coding participants who attended at least once a week as 1, and all others as 0. If participants responded two to three times a month for both the main and the second denomination, we assigned them 1, since this suggests they attended at least four times a month on average. The reason for dichotomization was, besides collapsing attendance across denominations, to separate regular churchgoers from all others in order to provide estimates with a clear natural interpretation. Importance of faith was measured using an item that asked participants how important religious faith is in shaping their daily life (not important at all, not very important, somewhat important, very important, and extremely important). In order to align the faith importance variable with church attendance and provide comparable estimates, we dichotomized the variable by assigning 1 to participants reporting very or extremely important, and 0 to all others. Both variables were measured in the same way in t0 (and adjusted for in the models).
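The coding rules above can be sketched as follows. The function names are hypothetical, the response labels are taken from the text, and treating weekly attendance at either place of worship as sufficient is our reading of the rule rather than something the text states explicitly.

```python
# Ordered response options as listed in the text (list index = ordinal code).
LEVELS = [
    "never",
    "few times a year",
    "many times a year",
    "once a month",
    "two to three times a month",
    "once a week",
    "more than once a week",
]
WEEKLY = LEVELS.index("once a week")
TWO_TO_THREE = LEVELS.index("two to three times a month")


def regular_attendance(main, other=None):
    """Return 1 for regular (at least weekly) attendance, else 0.

    `main` and `other` are response labels for the main and the second
    place of worship; `other=None` means the second item did not apply.
    """
    m = LEVELS.index(main)
    o = LEVELS.index(other) if other is not None else None
    # Assumption: weekly or more at either place counts as regular.
    if m >= WEEKLY or (o is not None and o >= WEEKLY):
        return 1
    # Two to three times a month at both places averages at least
    # four services a month, so it is also coded 1.
    if m == TWO_TO_THREE and o == TWO_TO_THREE:
        return 1
    return 0


def high_faith_importance(answer):
    """Return 1 if faith is very or extremely important, else 0."""
    return 1 if answer in {"very important", "extremely important"} else 0
```

For example, `regular_attendance("two to three times a month", "two to three times a month")` is coded 1, while the same main-denomination answer with no second place of worship is coded 0.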

Baseline covariates

Besides baseline religiosity and fertility measured at t0, we control for potential common causes of both religiosity and fertility and their proxies, most importantly existential security, education, and conservative values. As suggested in the introduction, these factors can potentially affect adolescents’ life trajectories and shift them towards higher religiosity and fertility without assuming a causal association between religiosity and fertility45,46,47,48,49,50,51,52. However, since respondents were very young in t0, for these variables we use proxies reported by caregivers in the survey (~90% of caregivers were parents), namely their completed education, household income, perceived safety of the neighborhood, number of siblings in the household, political ideology, and religiosity. On top of these, we also included standard demographic controls: age, sex, and ethnicity. Respondent’s religion was included as there is evidence for differences in family size ideals and birth timing between denominations in the U.S.72,73. All confounders, save for caregiver’s religiosity, were measured with single-item measures (Table S1 in SM). Caregiver’s religiosity was modelled as a latent variable composed of five items, namely dummy variables asking whether caregivers pray with the child before meals (1) and on other occasions, except for meals (2), ordinal variables asking how often caregivers talk with the child about religion (3), how often they attend religious services (4), and finally how important their faith is for their daily life (5). Factor loadings (between 0.680 and 0.809) and Cronbach’s alpha (0.861) suggested excellent internal reliability and thus we extracted the final latent score for analyses (see Table S2 in SM for details). Because respondents were very young at t0, including theoretically relevant controls such as marital status was not meaningful (marriages were likely extremely rare, if any); moreover, the dataset does not include such a question.
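For readers unfamiliar with the reliability statistic reported above, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of the total score). The function name and input layout are hypothetical, and this section does not state whether the five items were standardized before alpha was computed, so the sketch works on raw scores.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for `items`, a list of k lists of scores.

    Each inner list holds one item's scores, in the same respondent order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total score per respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))
```

Items that move together perfectly give an alpha of 1, while items whose scores are unrelated push alpha toward 0.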

To statistically identify key predictors of fertility at t2, we first fitted a Poisson regression model using baseline (t0) covariates. The results indicated that fertility was positively associated with age, lower caregiver household income, lower caregiver education, and a higher number of siblings. No significant associations were found for caregiver political ideology, religiosity, or neighborhood safety. Based on these findings, we decided to run two versions of the main model: one including all covariates, and another excluding caregiver political ideology, religiosity, and neighborhood safety. Additionally, fertility differed by religion and ethnicity, with men exhibiting lower fertility than women. Further details are provided in Table S3 and Figure S2 in SM.

We included only baseline covariates and excluded any covariates from later waves to avoid potential post-treatment bias. A wide range of life-course processes, such as shifts in dating practices, sexual debut, or partnership formation, may proximally affect the pathway between religiosity and fertility. However, because these factors are themselves potentially influenced by religiosity at t1, adjusting for their later-wave measurements would block the very effect we aim to estimate and thereby introduce post-treatment bias74,75. For example, higher religiosity may increase the probability of getting married, which in turn is strongly associated with the likelihood of having a first child76,77,78. This relationship may be causal, or the two events may simply co-occur. In either case, including marital status measured at t2 would potentially lead to a false negative result. Similarly, religiosity measured at t1 may influence religiosity at t2, which in turn may affect fertility at t2; including t2 religiosity would therefore again introduce post-treatment bias.

Analytical plan

We analyzed a total sample of 3,355 participants who had complete data on religious service attendance and faith importance at t0 (excluding 15 participants). Missing values in baseline covariates (4.29% of observations)79 were imputed using single imputation80,81, and attrition between waves was addressed with inverse probability of censoring weights82,83, accounting for 760 participants lost between baseline and exposure waves, and 454 lost between t1 and t2. Our analyses relied on baseline covariates to avoid post‑treatment bias and assumed missingness was at random given measured covariates. Causal inference was based on the assumptions of causal consistency84,85, conditional exchangeability (exogeneity after conditioning on observed covariates)65,83,86, and positivity87, and substantial variance in changes in the main predictor65,83,88. We implemented targeted maximum likelihood estimation (TMLE)89 with ensemble machine learning (SuperLearner) to estimate the causal effect while flexibly modelling both the outcome regression and the treatment mechanism (propensity score for a binary exposure), with censoring adjustment90,91. TMLE combines these components in a doubly robust framework that yields consistent, asymptotically efficient estimates when at least one of the nuisance functions is correctly specified. In our setting, TMLE targets a static policy contrast at t1—setting all individuals to weekly attendance versus non-weekly attendance—and evaluates its effect on fertility measured at t2, using cross-validated SuperLearner for the outcome, treatment, and censoring mechanisms to provide targeted inference while adjusting for observed confounding. Sensitivity to unmeasured confounding was assessed using E‑values92,93, and all analyses were repeated in sex‑stratified subsamples to explore effect heterogeneity. Details on the analytical plan are available at SM, section S2.
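As a rough illustration of the sensitivity measure mentioned above, the sketch below implements the standard E-value formula for an effect on the risk-ratio scale, RR + sqrt(RR × (RR − 1)), with ratios below 1 inverted first. The function names are hypothetical. How the additive estimates in this study were mapped onto a ratio scale is not described in this section, so the sketch is not expected to reproduce the reported E-values.

```python
import math


def e_value(rr):
    """E-value for a point estimate on the risk-ratio scale."""
    if rr <= 0:
        raise ValueError("a risk ratio must be positive")
    if rr < 1:
        rr = 1 / rr  # protective effects are inverted before the formula
    return rr + math.sqrt(rr * (rr - 1))


def e_value_ci_bound(rr, lower, upper):
    """E-value for the confidence limit closest to the null (RR = 1)."""
    if lower <= 1 <= upper:
        return 1.0  # the interval already includes the null
    return e_value(lower) if rr > 1 else e_value(upper)
```

An E-value close to 1 means that even weak unmeasured confounding could explain the estimate away; larger values indicate that stronger confounding would be required.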

We note that attrition between waves resulted in substantial missing fertility data at t2 and t0 (40% missing among men and 32% among women). Censoring weights rely on the assumption that attrition is at random, conditional on the measured covariates. This assumption may not hold if dropout is driven by factors related to both our predictor and outcome. One plausible scenario is that many participants withdrew due to increasing time constraints associated with marriage and raising a family. If these life changes were preceded by an increase in religiosity, then some of those missing at t2—particularly 454 participants lost between t1 and t2—may have first become more religious, then married, had children, and subsequently discontinued participation. Such a process could attenuate observed associations, biasing results toward the null and increasing the risk of false negatives. Conversely, attrition could also bias results in the opposite direction. For example, participants who experienced considerable life changes—such as relationship breakdown, job loss, poor health, or migration—may have been more likely to drop out and also to have lower fertility. If these individuals were disproportionately less religious at baseline, their absence could make the remaining sample appear both more religious and more fertile, inflating the estimated association. Although we applied censoring weights to mitigate these risks, the possibility of residual bias from unmeasured predictors of attrition remains.
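The idea behind censoring weights can be sketched with a toy example in which dropout depends only on a single measured stratum. The data, names, and one-covariate setup are invented for illustration and are far simpler than the weighting models used in the analysis.

```python
from collections import defaultdict

# Hypothetical records: (baseline_stratum, children_at_t2 or None if lost).
FOLLOW_UP = (
    [("low", 1)] * 4 + [("low", 0)] * 4 + [("low", None)] * 2
    + [("high", 2)] * 2 + [("high", 1)] * 2 + [("high", None)] * 6
)


def complete_case_mean(rows):
    """Mean outcome among people who were observed, ignoring dropout."""
    observed = [y for _, y in rows if y is not None]
    return sum(observed) / len(observed)


def ipc_weighted_mean(rows):
    """Mean outcome with each observed person weighted by
    1 / P(observed | stratum), estimated from stratum retention."""
    n_all = defaultdict(int)
    n_obs = defaultdict(int)
    for stratum, y in rows:
        n_all[stratum] += 1
        if y is not None:
            n_obs[stratum] += 1
    for stratum in n_all:
        if n_obs[stratum] == 0:
            raise ValueError(f"nobody observed in stratum {stratum}")
    numerator = denominator = 0.0
    for stratum, y in rows:
        if y is None:
            continue
        weight = n_all[stratum] / n_obs[stratum]
        numerator += weight * y
        denominator += weight
    return numerator / denominator
```

In this toy data the complete-case mean is pulled toward the better-retained "low" stratum, while the weighted mean restores each stratum to its baseline share. The correction only works to the extent that dropout is random within strata, which is the assumption discussed above.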

Results

Before proceeding to the main three-wave design analysis, we first conducted a cross-sectional analysis using only data from t2, without any treatment for missing data (participants with missing values were excluded). In this analysis, we retained the original (non-dichotomized) measures of faith importance and attendance in the main denomination, both of which were z-scored (again to mirror the methods used in previous cross-sectional studies). We included covariates commonly used in cross-sectional studies—education level, income, marital status, and age—and conducted separate analyses for men and women to maintain consistency with the subsequent analyses. Poisson regression was applied to account for the count nature of the fertility data. Attendance showed a strong positive association with fertility for both men (IRR = 1.20, 95% CI [1.06, 1.35], p = 0.005) and women (IRR = 1.14, 95% CI [1.05, 1.24], p = 0.002). Faith importance was also positively associated with fertility among men (IRR = 1.29, 95% CI [1.12, 1.48], p < 0.001) and women (IRR = 1.28, 95% CI [1.16, 1.42], p < 0.001). Full details of these analyses are provided in SM section S3. We then proceeded with the analysis of causal effects as described above.

In NSYR data, changes in religiosity between t0 and t1 are frequent enough to analyze their effects on fertility (see Figure S1 and Table S4 in SM). Table 1 shows the distributions of outcome variables in t2, Table 2 provides information on the exposure variable in t1, and Table S3 (in SM) displays distributions of all covariates at baseline wave t0. Because we employ nuanced methods for causal inference with a focus on the exposure variable, we do not report estimates associated with baseline covariates. Their function in the model is solely to ensure estimating the effect of the exposure.

Table 1 Description of outcome variables (fertility) in t2 (2013). All missing values are the result of attrition between waves (i.e., missingness in the children variable is affected by attrition in t2).
Table 2 Description of exposure (religiosity) variables in t0 (2003) and t1 (2005).

Regular church attendance was associated with 0.11 more children eight years later (95% CI = [0.00, 0.22], p = 0.048). Furthermore, an E-value of 1.45 and its lower bound of 1.05 suggest weak but reliable evidence for causality. This effect was driven by men. Regular male churchgoers had 0.16 more children (95% CI = [0.01, 0.32], p = 0.036), an effect for which there was evidence of causality (E-value = 1.59 with lower bound 1.11). On the other hand, we observed neither a significant nor a meaningful effect of religious attendance on fertility among women (coef. = 0.05, 95% CI = [−0.10, 0.20], p = 0.526). The difference between men and women in the effects of religious attendance on number of children was 0.11 (95% CI = [−0.10, 0.33]).
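One way to see where an interval for the sex difference can come from is the sketch below. It recovers approximate standard errors from the two reported 95% intervals and combines them as if the male and female estimates were independent and normally distributed. That procedure is our assumption, not a description of how the interval was actually computed, and the function names are hypothetical.

```python
import math

Z95 = 1.959963984540054  # 97.5th percentile of the standard normal


def se_from_ci(lower, upper, z=Z95):
    """Approximate standard error implied by a symmetric Wald interval."""
    return (upper - lower) / (2 * z)


def difference_with_ci(est_a, ci_a, est_b, ci_b, z=Z95):
    """Difference of two independent estimates with a Wald interval."""
    se_a = se_from_ci(ci_a[0], ci_a[1], z)
    se_b = se_from_ci(ci_b[0], ci_b[1], z)
    diff = est_a - est_b
    se_diff = math.hypot(se_a, se_b)  # sqrt(se_a**2 + se_b**2)
    return diff, (diff - z * se_diff, diff + z * se_diff)


# Reported attendance estimates: men, then women.
diff, (low, high) = difference_with_ci(0.16, (0.01, 0.32), 0.05, (-0.10, 0.20))
```

With the rounded inputs, this gives a difference of 0.11 and an interval of roughly −0.11 to 0.33, close to the reported [−0.10, 0.33]; small discrepancies are expected because the inputs are rounded to two decimals.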

High faith importance was associated with 0.06 more children eight years later, but the association was not significant (95% CI = [−0.03, 0.15], p = 0.161). Among men, the effect of high faith importance was close to significant (coef. = 0.09, 95% CI = [0.01, 0.32], p = 0.075), whereas women were not affected by a change in faith importance (coef. = 0.04, 95% CI = [−0.09, 0.18], p = 0.502). The difference between men and women in the effects of faith importance on number of children was 0.05 (95% CI = [−0.12, 0.22]). Next, we repeated the analyses excluding caregiver religiosity, political ideology, and perceived neighborhood safety. The results remained consistent with the previous results (see Figure S3 in SM).

In summary (see Fig. 2), regular religious service attendance is causally associated with increased fertility eight years later among men. This effect, although significant, is small, and the wide confidence intervals suggest low certainty of the result. Among men, faith importance also likely increases fertility, but this effect is again weak and not significant. Among women, there is no effect of either religious attendance or faith importance on fertility. These results are in striking opposition to the results of the cross-sectional one-wave (t2) analysis, which suggests a strong positive correlation between both religiosity measures and fertility among both men and women.

Fig. 2

Effects of regular service attendance and importance of faith on fertility eight years later. Note: the numbers represent estimate (95%CI) [E-value, lower E-value bound].

To further assess the robustness of these findings, we estimated a series of Poisson regression models using the unweighted, non-imputed dataset and list-wise deletion, relying only on baseline covariates to preserve the three-wave causal structure. Across twelve models—including predictors operationalized as (a) the TMLE-equivalent treatment dummies, (b) the original Likert-scale measures, and (c) first differences between t1 and t0—we observed a pattern highly consistent with the TMLE results. Among men, all three specifications for attendance and most specifications for faith importance showed positive associations with fertility, with effect sizes similar in magnitude to those produced by TMLE. Among women, none of the Poisson models indicated meaningful or significant effects for either religiosity measure, again mirroring the TMLE estimates. These supplementary analyses, described in detail in SM section S5, reinforce the conclusion that the observed causal effects are modest, gender-specific, and robust to alternative operationalizations of religiosity and model specifications.

Discussion

In this study, we aimed to explore a potential causal association between religiosity and fertility among U.S. adolescents. In addition to tracking births after changes in religiosity, we leveraged an ability to control for baseline attendance/faith importance and fertility two years before the exposure, and for key common causes of religious change, allowing a stronger causal inference from religiosity to reproductive behaviour. Several strategies that we employed, collectively, provide confidence that the changes in religiosity instigated changes in fertility.

The effect of religiosity on fertility was not strong. Only men seem affected by religiosity and, moreover, the effect is rather small and imprecisely estimated (broad CIs). In the case of faith importance, the effect is significant only at a 10% threshold. This is in stark contrast with the results of the cross-sectional analysis from the last wave, which demonstrated stable and strong associations between religiosity and fertility. However, weak evidence for a causal link is not surprising, considering the complex societal factors that drive fertility, including economic and cultural factors94,95. Moreover, given that outcomes were measured eight years after the exposure wave, some amount of switching is expected. Participants who became religious shortly after the exposure measurement will have been misclassified as not religious, and vice versa for those who switched out. This misclassification will overstate the relevance of religiosity among those who switched out of religiosity and understate the relevance of religiosity to those who switched in. Given the often-strong associations reported between religiosity and fertility in prior cross-sectional studies, one interpretation is that our weak findings raise important concerns about the robustness of those earlier results. Specifically, they suggest that previously observed associations may have been influenced by unmeasured confounding factors or reverse causality, rather than reflecting a direct causal impact of religion on fertility. Another interpretation is that, although the results hold for the population from which our sample was drawn, they may not generalize to other contexts. This limitation, of course, also applies to prior studies reporting strong effects. A strength of our study, however, is that we explicitly define our causal estimands and use appropriate workflows to recover them under clearly stated assumptions. These assumptions, while not fully testable aside from practical positivity, are transparent and can be scrutinized or rejected by future investigators. In this sense, the primary contribution of our study lies in its methodological approach.

Secondly, our findings add to previous studies drawing on U.S. samples which indicate that having a newborn leads to stronger connections to the religious community96 and that having a child transitioning to school age leads to greater religiosity97. Speculatively, these previous findings, taken together with our results, suggest that the religion-fertility relationship might take the form of a mutually reinforcing feedback loop where religiosity affects fertility and fertility in turn affects religiosity, at least in a society with relatively high economic growth, high religiosity, and low fertility98. Such a feedback loop may explain why correlational studies often find strong positive associations between religiosity and fertility.

Our models revealed a notable discrepancy between men and women in their sensitivity to the religious exposure. Women were almost unaffected by becoming more or less religious, while men exhibited differences in subsequent fertility as the result of religious exposure. This finding contrasts with previous studies reporting no difference in the association between religiosity and fertility across sexes100,101. One possible explanation draws on research showing that religiosity generally tends to delay sexual debut among young women but not young men55, and that religiosity is generally more strongly associated with lower sexual permissiveness in women than in men, perhaps because religious norms often place greater emphasis on women’s chastity102. As a result of these opposing influences (norms for large families and norms restricting women’s sexual activity), religion may not exert a net causal effect on fertility among young women. Later in life, however, when religious women are married, the influence of religion on encouraging larger families may come to the fore, as marital sex aligns with religious norms and family expansion can be pursued without moral conflict. Future research should replicate our three-wave analysis in samples of participants who have completed their fertility.

Our study has limitations regarding the generalizability of the findings to the broader world population103,104. Some previous evidence highlights positive correlations between religiosity and fertility, particularly in societies that are not (or only slightly) integrated into the market economy25,26,105. However, there remains a notable scarcity of longitudinal studies within such contexts. This scarcity is even more conspicuous when contrasted with the wealth of research in Western market-integrated and low-fertility countries from America and Europe. In less market-integrated societies, religion may contribute to increased fertility among its adherents by fostering parental support26, which in turn may reduce child mortality106,107, rather than by directly impacting family planning, while still contributing to the overall fertility rates of its adherents.

Another potential objection to our design is that religious attendance and faith importance are not exogenous in the same way as governmental policies108, natural disasters109, or experimental manipulations110. As a result, unmeasured third variables or time-varying confounders (anything that happened after t1 and is not blocked by t0 covariates) may confound our findings. Indeed, manipulating religiosity in a natural setting is unethical, and naturally occurring events, such as epidemics99 or wars111, which affect religiosity, may also influence other factors that drive the potential effects of religiosity on fertility112. To address the threat of unmeasured confounding, we controlled for a set of theoretically relevant variables in the baseline (t0) that capture key aspects of family norms, ideological orientation, and economic background. Additionally, we controlled for the baseline measures of religiosity. For a third variable to bias our findings, a major event would need to have occurred across the entire U.S. prior to 2003, affecting both fertility and changes in religiosity through channels other than those we account for, such as perceived safety, ideology, or material security. Moreover, in our design, an unmeasured confounder would need to affect both religiosity and fertility independently of the baseline measurements of religiosity. Although our method imposes considerable confounding control, there still remain factors that we were not able to control for in t0, such as health. Thus, our results should be interpreted with full awareness of this assumption, and assessed with appropriate reserve.