Introduction

Non-cognitive abilities, such as creativity, teamwork, and adaptability, are essential for lifelong success (Bowles et al., 2001; Heckman, 2007; Roberts, 2009; Gutman and Schoon, 2013; Duckworth and Yeager, 2015). Research has shown that these skills, alongside cognitive abilities, significantly influence individuals’ social status, economic achievement, health, and overall well-being. Early childhood is a critical period for the development of these abilities, and investments in early childhood education, particularly in fostering non-cognitive skills, yield substantial long-term returns (Jencks et al., 1979; Heckman et al., 2006; Edin et al., 2022). Given the significant impact of early education on later life outcomes, it is imperative to thoroughly investigate the long-term effects of preschool education on the development of non-cognitive abilities.

Preschool education plays a multifaceted role in fostering the development of students’ non-cognitive abilities (Wang et al., 2024). Through engaging in group play and collaborative activities, students are embedded in social settings that foster positive interactions and essential skills such as sharing, turn-taking, and conflict resolution, all within a supportive and nurturing environment (Ashiabi, 2007; Bodrova and Leong, 2015; Tersi and Matsouka, 2020). Additionally, preschool education emphasizes artistic and creative activities that spark imagination and creativity, while also boosting students’ confidence and self-expression (Nagamachi, 2002; Rauf and Tan, 2020; Rainford, 2020). These activities offer a platform for emotional and social intelligence development, enabling students to explore and articulate their thoughts and feelings through diverse creative outlets.

The long-term impact of preschool education on non-cognitive abilities remains inconclusive. Some studies have demonstrated positive effects: a review of 36 preschool programs for low-income students showed improvements in socialization, school enthusiasm, and interpersonal relationships (Barnett, 1995). These programs were also associated with better educational outcomes and long-term behaviors, such as lower rates of teenage pregnancies and criminal activity (Currie and Almond, 2011). The Chicago Longitudinal Study by Reynolds et al. found that enriched preschool environments can enhance both cognitive and non-cognitive skills, with non-cognitive abilities proving particularly influential on criminal behavior, as indicated by reduced incarceration rates (Reynolds et al., 2010).

Conversely, other studies reveal less consistent results regarding the impact of preschool education on non-cognitive development. Apps et al. (2013) observed that, although preschool significantly improves test scores at ages 11, 14, and 16, its impact on non-cognitive aspects is less predictable. Zhao and Li (2024) found no significant association between preschool experience and the development of non-cognitive skills such as social skills and leadership. Additionally, a Canadian study suggested that preschool attendance might negatively affect students’ social skills and increase anxiety levels (Baker et al., 2008). Similar findings from U.S. studies indicate that preschool experiences may, in some cases, diminish learning motivation, self-control, and interpersonal communication skills (Loeb et al., 2007).

Despite the recognized importance of preschool education in nurturing non-cognitive abilities, existing research has three key limitations. First, most studies are concentrated in developed countries, like the United States, and evidence from developing regions remains sparse (Loeb et al., 2007; Baker et al., 2008; Reynolds et al., 2010). Socioeconomic and cultural differences may influence the impact of early education on non-cognitive skills, making it crucial to investigate these relationships in varied contexts. Second, the prevalence of correlational studies limits the causal inferences we can draw between preschool education and non-cognitive development, particularly given potential endogeneity biases. Lastly, existing research often focuses narrowly on specific non-cognitive skills, such as social interaction or motivation, without employing a holistic framework (Barnett, 1995; Loeb et al., 2007). Thus, there is a need for a comprehensive index grounded in frameworks like the Big Five Personality Model to capture the full spectrum of non-cognitive skills developed in preschool education.

China provides an ideal context for examining the long-term impact of preschool education on non-cognitive skills, particularly given the country’s ongoing emphasis on human capital as a driver of economic growth (Li et al., 2005). With rural areas undergoing significant socioeconomic transformations, non-cognitive skills are increasingly valued in both education and the labor market. Leveraging data from the China Education Panel Survey (CEPS), a nationally representative dataset (Zheng et al., 2021; Wang, 2022; Zhang et al., 2023), this study applies instrumental variable analysis to explore the causal relationship between preschool attendance and adolescent non-cognitive development within this distinct environment. This context not only offers insights into human capital formation in rural China but also contributes valuable evidence for other developing countries facing similar challenges.

This study aims to address these research gaps by (1) describing preschool attendance patterns and non-cognitive ability levels among students, (2) assessing the enduring effects of preschool attendance on non-cognitive skills in adolescence, (3) examining heterogeneous effects across different student subgroups, and (4) identifying potential mechanisms through which preschool education impacts non-cognitive development.

Methods and measures

Study design and participants

This study utilizes data from the 2013–2014 China Education Panel Survey (CEPS), a project led by Renmin University of China that tracks Chinese middle school students over the long term to investigate the socioeconomic factors influencing their development. The study employs a phase-stratified probability proportional to size (PPS) sampling method to ensure the sample’s representativeness. Initially, 28 counties were selected nationwide, followed by a random selection of schools within these areas, resulting in a thorough investigation of 438 7th and 9th-grade classes and a total of 19,958 students. The survey not only examines students’ individual characteristics but also delves into the impact of family, classroom, and school environments on their development, collecting multidimensional information through comprehensive interviews.

In line with the research objectives, the final analytical sample was restricted to 9443 rural students from 431 classrooms across 112 schools, after excluding observations with missing key variables, specifically preschool attendance and non-cognitive ability measures. Further details regarding the sampling methodology, questionnaires, and other aspects are available at http://ceps.ruc.edu.cn/English/Home.htm.

Data collection

Data collection for our study took place during the second semester of the 2013–2014 academic year, using a comprehensive questionnaire designed to gather detailed information on students’ socioeconomic backgrounds, preschool attendance records, non-cognitive ability scores, and various mechanism variables, such as financial and time investments, family dynamics, and peer influence. The survey was distributed to all eligible 7th and 9th-grade students within our sample. In addition to collecting basic demographic information (e.g., age, gender, and preschool attendance), the survey gathered extensive data on parental characteristics, including education levels and economic status, to provide a nuanced understanding of family background and its potential impact on students’ non-cognitive abilities.

Variables

Preschool attendance

The CEPS student survey asked participants if they had attended preschool after turning three, including both You Er Yuan (centers that cater to 50 or more children aged two or three and above) and Xue Qian Ban (offers one year of preschool education prior to formal schooling) (Gong et al., 2016). Ideally, distinguishing between these types of preschools would be crucial for assessing differences in educational quality. However, limitations in the available data prevent such detailed differentiation. As a result, our study uses a binary variable to gauge preschool attendance, signifying whether a child received education before the age of six. A child who attended preschool is assigned a value of 1, and those who did not are assigned a value of 0. This method offers a general measure of the impact of preschool education in China (Zheng et al., 2019; Wang et al., 2023; Zhang and Zhang, 2023).

Non-cognitive ability score

In this study, non-cognitive abilities were measured across five dimensions: Conscientiousness, Positive Emotion, Agreeableness, Openness, and Extraversion (Zhang and Zhou, 2023). Each dimension’s score was calculated as the mean of all items corresponding to that dimension. The overall non-cognitive ability score for each student was then derived by summing the standardized scores of these five dimensions (Min et al., 2019; Zhao and Chen, 2022). To facilitate comparison, each dimension’s score was standardized to a range of 0 to 1 before being summed to create a total non-cognitive ability score.
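
As a rough illustration of the scoring procedure described above, the sketch below averages the items within each dimension, standardizes each dimension across students, and sums the five standardized scores. The item responses, the function names, and the use of z-scores are assumptions made for illustration; the text also mentions rescaling to a 0 to 1 range, which could be substituted in the `standardize` step.

```python
from statistics import mean, pstdev

DIMENSIONS = ["conscientiousness", "positive_emotion", "agreeableness",
              "openness", "extraversion"]

def dimension_means(student):
    """Average the item responses within each dimension for one student."""
    return {d: mean(student[d]) for d in DIMENSIONS}

def standardize(values):
    """Z-score a list of values (assumed standardization, population SD)."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]

def total_scores(students):
    """Sum the five standardized dimension scores for each student."""
    raw = [dimension_means(s) for s in students]
    z = {d: standardize([r[d] for r in raw]) for d in DIMENSIONS}
    return [sum(z[d][i] for d in DIMENSIONS) for i in range(len(students))]

# Hypothetical item responses on a 1-5 scale (not CEPS data).
students = [
    {"conscientiousness": [4, 5], "positive_emotion": [3, 4],
     "agreeableness": [5, 4], "openness": [4, 4], "extraversion": [2, 3]},
    {"conscientiousness": [2, 3], "positive_emotion": [4, 5],
     "agreeableness": [3, 3], "openness": [5, 5], "extraversion": [4, 4]},
    {"conscientiousness": [3, 3], "positive_emotion": [2, 2],
     "agreeableness": [4, 4], "openness": [3, 2], "extraversion": [5, 5]},
]

scores = total_scores(students)
print([round(s, 3) for s in scores])
```

Because each dimension is centered before summing, the total scores in this sketch average to zero across students, so a negative total only means a below-average profile within the sample.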

Socio-demographic characteristics

Drawing on theoretical frameworks and empirical evidence (Hampson, 1984; Fabes et al., 1999; Caprara et al., 2014; Magnuson et al., 2016; Schneider and Weber, 2022; He et al., 2023), our analysis includes a comprehensive set of student, parental, and family factors that previous studies have shown to influence students’ non-cognitive skills. Recognizing that these factors may act as confounders of our primary variable of interest, preschool attendance, we have incorporated them as control variables in our empirical models to reduce potential omitted variable bias. Student data include age, gender, only-child status, left-behind status, and boarding status, while parental data cover parental education levels and family economic status. The variable “Low family economic status” is a binary indicator derived from the question on student family economic conditions, “What is the current economic status of your family?” with responses “difficult,” “medium,” and “wealthy.” We recoded “difficult” as 1, indicating low family economic status, and the other two as 0.

Statistical analysis

This research used four main statistical approaches. To begin with, we use Ordinary Least Squares (OLS) regression, incorporating the set of covariates previously mentioned, as our initial approach to uncover the link between preschool attendance and the non-cognitive skills of students. This initial step establishes a baseline that serves as a benchmark for the more advanced analytical methods employed later on. Accordingly, we start by estimating the following linear equation.

$$\mathrm{Y}_{\mathrm{i}}=\beta_{0}+\beta_{1}\mathrm{Preschool}_{\mathrm{i}}+\beta_{2}\mathrm{X}_{\mathrm{i}}+\varepsilon_{\mathrm{i}} \tag{1}$$

where \({{\rm{Y}}}_{{\rm{i}}}\) represents the standardized scores of non-cognitive abilities for student i; \({{\rm{Preschool}}}_{{\rm{i}}}\) is a binary indicator for whether student i attended preschool, with \({{\rm{Preschool}}}_{{\rm{i}}}\) = 1 if the student attended and \({{\rm{Preschool}}}_{{\rm{i}}}\) = 0 otherwise. The vector \({{\rm{X}}}_{{\rm{i}}}\) encompasses a range of covariates at both the individual and parental levels, as previously discussed. \({{\rm{\varepsilon }}}_{{\rm{i}}}\) represents the random error term for each observation. If Eq. (1) is correctly specified, the estimated coefficient for the preschool attendance variable, \({{\rm{\beta }}}_{1}\), will reflect the impact of preschool attendance on the development of non-cognitive abilities. Here, i indexes each individual observation within our dataset.
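
To make the specification in Eq. (1) concrete, the following minimal sketch estimates the coefficients by solving the normal equations on simulated data. The data, the single covariate `age` standing in for the vector of controls, and the helper names `solve` and `ols` are assumptions for illustration; the study itself was run in Stata on CEPS data.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(rows, y):
    """Return OLS coefficients; each design row already starts with a 1."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Simulated data (not CEPS): the true preschool coefficient is set to 0.4.
random.seed(1)
rows, y = [], []
for _ in range(2000):
    preschool = 1.0 if random.random() < 0.76 else 0.0
    age = random.uniform(12, 16)  # stand-in for the covariate vector X_i
    rows.append([1.0, preschool, age])
    y.append(0.5 + 0.4 * preschool - 0.1 * age + random.gauss(0, 1))

beta0, beta1, beta2 = ols(rows, y)
print(f"estimated beta_1 = {beta1:.3f}")
```

In this simulation preschool attendance is assigned at random, so the estimate of β1 is unbiased by construction; the later identification strategies exist precisely because that condition is unlikely to hold in observational data.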

Secondly, a significant challenge in the direct application of OLS estimation is the variation in preschool education attendance rates due to disparities in preschool resources among different counties. To accurately account for these differences in preschool education availability and their impact on the development of non-cognitive abilities at the county level, we have incorporated county fixed effects (\({{\rm{\gamma }}}_{{\rm{i}}}\)) into our model (Eq. (2)). This adjustment helps to control for both observed and unobserved variations in county-specific characteristics, particularly those related to preschool attendance.

$$\mathrm{Y}_{\mathrm{i}}=\beta_{0}+\beta_{1}\mathrm{Preschool}_{\mathrm{i}}+\beta_{2}\mathrm{X}_{\mathrm{i}}+\gamma_{\mathrm{i}}+\varepsilon_{\mathrm{i}} \tag{2}$$
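
The role of the county fixed effects in Eq. (2) can be sketched with the within transformation: subtracting county means from the outcome and the regressor gives the same slope as including a dummy for every county. The simulation below is hypothetical (28 simulated counties, no covariates, a true effect of 0.4) and is meant only to show why a pooled estimate can be biased when unobserved county conditions drive both attendance and outcomes.

```python
import math
import random
from collections import defaultdict

def slope(x, y):
    """Bivariate OLS slope of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return num / sum((a - mx) ** 2 for a in x)

def demean_within(values, groups):
    """Subtract each group's mean from its members (within transformation)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]

# Simulated data (not CEPS): county conditions raise both attendance and scores.
random.seed(2)
x, y, county = [], [], []
for c in range(28):
    effect = random.gauss(0, 1)               # unobserved county effect
    p_attend = 0.5 + 0.3 * math.tanh(effect)  # attendance rate rises with it
    for _ in range(200):
        pre = 1.0 if random.random() < p_attend else 0.0
        x.append(pre)
        y.append(0.4 * pre + effect + random.gauss(0, 1))
        county.append(c)

pooled = slope(x, y)
within = slope(demean_within(x, county), demean_within(y, county))
print(f"pooled = {pooled:.3f}, within-county = {within:.3f}")
```

The pooled slope absorbs the county-level confounding, while the within-county slope compares attendees and non-attendees in the same county and should land close to the simulated effect.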

Thirdly, the assessment of the long-term impact of preschool education using OLS and county fixed-effect models may be subject to bias due to significant differences between comparison groups. To estimate the long-term effects of preschool education more precisely, we adopted the Propensity Score Matching (PSM) method. PSM pairs students in the treatment group with similar students in the comparison group, so that differences in non-cognitive abilities can be attributed to preschool attendance (Rosenbaum and Rubin, 1985). We apply nearest neighbor and caliper matching with replacement, and bootstrap standard errors over 200 replications. The PSM findings also serve as a check for bias in our OLS and county fixed effects results.
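
The matching step can be illustrated with a small sketch of one-to-one nearest neighbor matching with replacement inside a caliper. It assumes propensity scores have already been estimated; the scores, outcomes, caliper width, and the function name `att_nearest_neighbor` are hypothetical, and the bootstrap for standard errors is not shown.

```python
def att_nearest_neighbor(ps, treated, outcome, caliper):
    """Average treated-minus-matched-control outcome difference (ATT).

    Each treated unit is matched to the control with the closest propensity
    score; controls may be reused, and pairs farther apart than the caliper
    are dropped.
    """
    controls = [i for i, t in enumerate(treated) if t == 0]
    diffs = []
    for i, t in enumerate(treated):
        if t != 1:
            continue
        j = min(controls, key=lambda c: abs(ps[c] - ps[i]))
        if abs(ps[j] - ps[i]) <= caliper:
            diffs.append(outcome[i] - outcome[j])
    att = sum(diffs) / len(diffs) if diffs else float("nan")
    return att, len(diffs)

# Hypothetical propensity scores and outcomes (not CEPS data).
ps      = [0.80, 0.60, 0.95, 0.78, 0.62, 0.30]
treated = [1,    1,    1,    0,    0,    0]
outcome = [1.0,  0.5,  2.0,  0.6,  0.2,  0.0]

att, n_matched = att_nearest_neighbor(ps, treated, outcome, caliper=0.05)
print(f"ATT = {att:.2f} from {n_matched} matched treated units")
```

In this toy example the treated unit with a score of 0.95 has no control within the caliper and is excluded, which is the trade-off a caliper imposes: closer matches at the cost of a smaller matched sample.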

Lastly, to address potential endogeneity, we employ instrumental variable (IV) estimation. Our selected instruments, the school-level preschool attendance rate and the proportion of kindergartens in the community, are expected to be correlated with the endogenous regressor (preschool attendance) and to affect the dependent variable (non-cognitive abilities) only through preschool attendance, thereby serving as valid instruments for this analysis. Both of our instruments reflect the local supply-side availability of preschool education. These measures influence whether a child attends preschool primarily by increasing accessibility and reducing logistical or financial barriers faced by families. In rural China, preschool enrollment decisions are shaped largely by structural factors such as the presence and capacity of local kindergartens, rather than by individual child characteristics or family preferences related to non-cognitive development. Moreover, the instruments capture conditions specific to the preschool period, which precedes the time frame in which non-cognitive outcomes are measured during junior high school. There is no plausible direct pathway through which the density of kindergartens or school-level enrollment rates several years earlier would independently influence adolescents’ non-cognitive skills, except via preschool participation. To further support the exclusion restriction, we conducted placebo analyses using parental occupation as an outcome, which is theoretically unrelated to preschool availability (see Appendix Table A1). The results showed no significant association with this placebo variable, lending additional credibility to the assumption that the instruments’ influence on non-cognitive outcomes operates exclusively through preschool attendance. We report robust standard errors for all regression models, and all analyses were conducted using Stata 17.0 (StataCorp, Texas, USA).
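
The logic of the IV strategy can be sketched with a manual two-stage least squares (2SLS) estimate on simulated data. Everything in the example is an assumption for illustration: the two instruments `z1` and `z2` stand in for the school-level attendance rate and community kindergarten density, `u` is an unobserved confounder, the true effect is set to 1.0, and the exclusion restriction holds by construction.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(rows, y):
    """Return OLS coefficients; each design row already starts with a 1."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def two_stage_least_squares(y, x, instruments):
    """2SLS point estimate for one endogenous regressor and several instruments."""
    z_rows = [[1.0] + list(z) for z in zip(*instruments)]
    gamma = ols(z_rows, x)  # first stage: regress x on the instruments
    x_hat = [sum(g * v for g, v in zip(gamma, r)) for r in z_rows]
    return ols([[1.0, xh] for xh in x_hat], y)[1]  # second stage slope

# Simulated data (not CEPS): u affects both attendance and the outcome.
random.seed(0)
n = 20000
z1 = [random.gauss(0, 1) for _ in range(n)]
z2 = [random.gauss(0, 1) for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]
x = [1.0 if a + b + c + random.gauss(0, 1) > 0 else 0.0
     for a, b, c in zip(z1, z2, u)]
y = [1.0 * xi + 2.0 * ui + random.gauss(0, 1) for xi, ui in zip(x, u)]

naive = ols([[1.0, xi] for xi in x], y)[1]
iv = two_stage_least_squares(y, x, [z1, z2])
print(f"naive OLS = {naive:.2f}, 2SLS = {iv:.2f}")
```

The sketch returns point estimates only. Standard errors taken from a hand-run second stage are not valid, so inference, the first-stage F-statistic, and overidentification tests should come from a dedicated IV routine.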

Results

Rural students’ preschool attendance and non-cognition development

Analyzing a dataset of 9443 students, we found that 76.13% had preschool experience and that attendance correlated with various demographic factors. Table 1 reveals a balanced gender distribution in the sample, with 51.1% (n = 4825) male students and 48.9% (n = 4618) female students. The average age of participants was 13.99 years, with a standard deviation of 1.36 years. In terms of family structure, 26.6% were from only-child families, 26.1% were left-behind students, and 47.4% were boarding students. Regarding family economics, about 27% of students were from the poorest households. Lastly, only 19.2% of fathers and 13.6% of mothers had education levels beyond high school. The data also show that rural preschool attendees typically come from wealthier backgrounds and are more likely to be younger, only children, non-left-behind, and non-boarding, differing significantly from non-attendees (P < 0.001). Their parents also have higher education and income levels (P < 0.001). These differences emphasize the need to account for baseline disparities when evaluating the impact of preschool on non-cognitive skill development in rural China.

Table 1 Summary statistics of background characteristics.

The data further indicate that individuals with preschool experience tend to have higher non-cognitive ability levels than those without such experience (mean ± SD: −0.041 ± 3.383 vs. −0.680 ± 3.391). This suggests a possible positive impact of preschool education on the development of non-cognitive abilities. The analysis reveals significant differences in four of the five non-cognitive ability dimensions: positive emotions (mean ± SD: 0.021 ± 0.944 vs. −0.128 ± 0.967), agreeableness (mean ± SD: −0.020 ± 0.981 vs. −0.153 ± 1.010), openness (mean ± SD: −0.042 ± 0.974 vs. −0.146 ± 0.990), and extraversion (mean ± SD: −0.019 ± 0.986 vs. −0.237 ± 1.039). On these dimensions, junior high school students who had received preschool education significantly outperformed those who had not. However, there seems to be little difference in conscientiousness between individuals with and without preschool attendance experience (mean ± SD: 0.018 ± 0.960 vs. −0.015 ± 0.960).

Long-term effects of the preschool attendance on rural students’ non-cognition development

Table 2 details the long-term effects of preschool attendance on rural students’ non-cognitive abilities, with columns 1–6 adjusting for individual and family factors, and columns 7–12 adding county-fixed effects. The key figures are the coefficients for preschool attendance, with robust standard errors shown in parentheses. Our empirical study on the long-term impact of preschool attendance on non-cognitive abilities found two main results. The regression model, controlling for individual and family factors (Table 2, columns 1–6), shows a significant positive effect of preschool on non-cognitive abilities (β = 0.386, P < 0.01). It also positively influences positive emotions (β = 0.095, P < 0.01), agreeableness (β = 0.086, P < 0.01), openness (β = 0.055, P < 0.05), and extraversion (β = 0.171, P < 0.01), but not conscientiousness. Columns 7–12 of Table 2 confirm the robustness of these findings after accounting for county fixed effects. Our analysis also identifies several key factors that enhance students’ non-cognitive abilities and their corresponding dimensions. Notably, younger students, females, only children, non-left-behind students, those with highly educated parents, and those from families with better economic status tend to have higher non-cognitive scores (P < 0.01).

Table 2 The effect of preschool attendance on student’s non-cognition development.

In addition, we performed Oster’s δ sensitivity analysis for the OLS estimates, as shown in Appendix Table A2. The δ value required to reduce the estimated effect to zero is 1.074, suggesting that unobserved confounders would need to be at least as influential as the observed covariates to nullify our findings. The δ value required to halve the estimated effect is 0.544, which suggests some sensitivity to moderate unobserved bias; nevertheless, our key conclusions remain robust.

Propensity score matching (PSM)

Table 3 presents the propensity score matching (PSM)-adjusted outcomes for non-cognitive abilities. Rows 1 and 7, column 1 show regressions for total non-cognitive scores, while rows 2–6 and 8–12, column 1 detail the long-term effects of preschool attendance on five non-cognitive dimensions. Specifically, preschool attendees show 7.9–8.6 percentage points higher positive emotions, 8.9 points higher agreeableness, 4.9–6.2 points higher openness, and 9.5–12.7 points higher extraversion compared to non-attendees. The PSM estimates indicate that our findings are robust to selection on observed characteristics.

Table 3 PSM results of the effect of preschool attendance on student’s non-cognition development.

Figure 1a depicts the post-matching probability densities for preschool attendees and non-attendees using the county-fixed effects model, showing good overlap between the groups. This substantial overlap, along with a balanced covariate distribution, validates the subsequent analysis on the matched sample of 9443. To assess covariate balance, we report standardized mean differences (SMDs) before and after matching (Tan et al., 2022), and present a Love plot to visually display the improvement in balance. Results show that no covariate exhibits an SMD above 10% after matching, suggesting satisfactory balance across treatment groups (details in Table 4, Table A3, and Fig. 1b). We also conducted a Rosenbaum bounds sensitivity analysis to assess the robustness of the estimated treatment effect to potential hidden bias. The results (see Appendix Table A4) show that the findings remain statistically significant even under a sensitivity parameter Γ = 2.0, meaning that an unobserved confounder would have to double the odds of receiving treatment (an increase of 100%) to invalidate the results. This indicates strong robustness to unobserved selection bias.
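
The balance diagnostic above can be sketched in a few lines. The function below computes the standardized mean difference in percent, using the square root of the average of the two group variances as the denominator, which is one common convention and an assumption here; the covariate values are hypothetical.

```python
from statistics import mean, variance

def smd_percent(treated, control):
    """Standardized mean difference between two groups, in percent."""
    pooled_sd = ((variance(treated) + variance(control)) / 2) ** 0.5
    return 100 * (mean(treated) - mean(control)) / pooled_sd

# Hypothetical ages before and after matching (not CEPS data).
age_treated = [13.0, 14.0, 15.0, 14.0]
age_control_before = [14.0, 15.0, 16.0, 15.0]
age_control_after = [13.0, 14.0, 15.0, 14.5]

before = smd_percent(age_treated, age_control_before)
after = smd_percent(age_treated, age_control_after)
print(f"SMD before = {before:.1f}%, after = {after:.1f}%")
```

A covariate would pass the 10% rule used in the text when the absolute value of the post-matching SMD falls below 10.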

Fig. 1: Diagnostics for propensity score matching.

a Overlap in the support of the covariates between attendees and non-attendees after matching. b Standardized mean differences of baseline covariates before and after matching (Love plot).

Table 4 Summary of standard bias (in %) before and after matching.

IV estimates

Table 5 presents the results from the two-stage least squares (2SLS) estimation using both the school-level preschool enrollment rate and the community kindergarten density as instruments for preschool attendance. Including both instruments jointly in the first stage enhances the relevance of the instruments and allows for overidentification tests to assess instrument validity.

Table 5 IV estimate.

The first-stage F-statistic is 22.82, exceeding the conventional threshold of 10 and suggesting that weak instruments are not a concern. The Kleibergen–Paap rk LM statistic yields a p-value of 0.0007, rejecting the null hypothesis of underidentification. The Hansen J-test produces p-values above 0.30 across all specifications (e.g., 0.550 in column 2), so the null hypothesis that the instruments are uncorrelated with the second-stage error term cannot be rejected, which is consistent with the exclusion restriction.

In the second stage, preschool attendance is positively and significantly associated with non-cognitive outcomes. Specifically, attending preschool increases the total non-cognitive score by 4.396 points (standard error = 2.381, p < 0.10). Among subdimensions of non-cognitive traits, significant positive effects are found for Neuroticism (coefficient = 0.753, SE = 0.400, p < 0.10), Agreeableness (coefficient = 1.321, SE = 0.654, p < 0.05), and Openness (coefficient = 1.126, SE = 0.558, p < 0.05). The coefficients for Conscientiousness (0.516, SE = 0.533) and Extraversion (0.680, SE = 0.546) are positive but not statistically significant at conventional levels. These results suggest that preschool participation has favorable impacts on overall non-cognitive abilities and several specific personality traits, even after accounting for potential endogeneity.

Robustness check

We perform multiple robustness checks to verify the stability of our basic model’s findings, with the outcomes presented in Table 6. Given the positive impact of preschool education on child development, it is reasonable to expect that extended exposure to quality preschool would yield superior outcomes. To explore this, we categorized preschool duration into three groups: up to 1 year (duration ≤ 1 year), 1–2 years (1 year < duration ≤ 2 years), and above 2 years (duration > 2 years). We constructed three dummy variables for these groups, using children with no preschool experience as the reference category. Our findings, as detailed in Table 6, Panel A, reveal that students with the longest preschool tenure, particularly those attending for over 2 years, experienced the most substantial benefits in terms of non-cognitive development.
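
The dummy coding for preschool duration can be written out as a short sketch. The function and key names are hypothetical; the cut-points follow the three categories defined above, with non-attendees (duration of zero) as the reference group.

```python
def duration_dummies(years):
    """Map preschool duration in years to three indicator variables.

    All three indicators are 0 for children with no preschool experience,
    which makes them the reference category in the regression.
    """
    return {
        "up_to_1_year": int(0 < years <= 1),
        "1_to_2_years": int(1 < years <= 2),
        "over_2_years": int(years > 2),
    }

for years in (0, 1, 1.5, 3):
    print(years, duration_dummies(years))
```

With this coding, each coefficient in the regression compares one duration group against children who never attended preschool.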

Table 6 Robustness check of the effect of preschool attendance on student’s non-cognition development.

We investigated the significance of the timing of preschool entry (Table 6, Panel B). We categorized preschool starting age into five groups: under 2 years (age < 2), 2–3 years (2 ≤ age < 3), 3–4 years (3 ≤ age < 4), 4–5 years (4 ≤ age < 5), and over 5 years (age ≥ 5), with an additional control group for those who did not attend preschool. Considering that students starting earlier might have the longest exposure, we controlled for the duration of preschool in our model to isolate timing effects from dosage effects. Our results indicate that students who began preschool between 3 and 4 years of age (including 3 years) saw the greatest benefits for their non-cognitive development.

Furthermore, the lack of random class assignment could unbalance the sample, leading to potential bias. To counter this, the regressions in Table 6, Panel C, use a randomly sorted sample and yield significant coefficients for non-cognitive skills and their five dimensions (β = 0.074–0.350, P < 0.05). Concerning birth order, variations in results may stem from differences in lower-parity students or parental investment choices varying with birth order, as noted by Black et al. (2005). Additionally, having older siblings in preschool can help bypass waiting lists. The last panel (D) of Table 6 presents findings specific to first-born students, with coefficient estimates aligning with our previous results.

Heterogeneity analysis

To examine the differential impact of preschool attendance on students’ non-cognitive abilities, we conducted a heterogeneity analysis from eight distinct perspectives, as presented in Table 7. These include age, gender, only-child status, left-behind status, boarding status, father’s educational level (high school or above), mother’s educational level (high school or above), and family economic status (poorest). Our findings indicate that certain subgroups exhibit a stronger effect from preschool attendance. Specifically, younger students, male students, non-only-child students, non-left-behind students, boarding students, and students with parents who have lower education levels or better family economic status tend to experience more pronounced gains in non-cognitive abilities.

Table 7 Heterogeneity analysis of the effect of preschool attendance on student’s non-cognition development.

Mechanism

To understand the pathways through which preschool attendance influences students’ non-cognitive abilities, we focus on two key dimensions: family resource investment and peer interactions. These dimensions were selected because they represent fundamental aspects of a child’s early social and emotional environment that are likely to shape non-cognitive skills. Family resource investment involves the time, money, and attention parents provide to support their child’s development, fostering skills such as perseverance, emotional regulation, and social communication. Peer interactions, on the other hand, expose students to diverse social scenarios, enhancing abilities like empathy, cooperation, and adaptability, which are essential for long-term social success. Table 8 presents the detailed findings.

Table 8 Mechanism analysis of the effect of preschool attendance on student’s non-cognition development.

Monetary, time investment, and family dynamics

To examine how preschool attendance fosters non-cognitive skills through family resources, we analyzed a range of related activities, which include both material resources and parent-child interactions. Specifically, nine channels were examined to assess the effect of preschool attendance on students’ non-cognitive development: a large collection of books at home, exercising with parents, visiting cultural venues (such as museums, zoos, or science museums) with parents, attending movies, shows, or sports games with parents, quality of the parent-child relationship, discussing school matters with parents, talking about the child’s emotional experiences with parents, visiting cultural venues with classmates, and attending movies, shows, or sports games with classmates.

In Panel A of Table 8, we report how the experience of preschool attendance positively affects indicators of family resource investment. The results show statistically significant effects of preschool attendance on indicators including: “A large collection of books at home (β = 0.105, P < 0.01)”, “Exercising with parents (β = 0.089, P < 0.1)”, “Visiting cultural venues with parents (β = 0.103, P < 0.01)”, “Attending entertainment events with parents (β = 0.090, P < 0.01)”, “Parent-child relationship quality (β = 0.037, P < 0.01)”, “Discussing school matters with parents (β = 0.062, P < 0.01)”, and “Talking about the child’s emotions with parents (β = 0.042, P < 0.01)”. These findings suggest that increased family time and resource investment fostered by preschool attendance may play a role in developing students’ non-cognitive skills, enhancing their social, emotional, and cognitive resilience.

Peer level

Li and Zhao (2022) suggest that peer interactions are instrumental in improving students’ non-cognitive skills. In line with this, our study used two indicators to evaluate the impact of preschool attendance on peer relationships: “visiting cultural venues with classmates” and “attending entertainment events with classmates”. As shown in the results (Table 8, Panel B), both indicators had significantly positive coefficients, implying that preschool attendance enhances students’ non-cognitive abilities by promoting positive and frequent interactions with peers.

In addition to the baseline mechanism analysis, we performed KHB decomposition analyses. The results indicate that family monetary and time investments, as well as peer interactions, partially explain the positive effect of preschool attendance on non-cognitive outcomes, though significant direct effects remain (see Appendix Table A5).

Conclusion and discussions

With the labor market increasingly valuing human capital, the role of non-cognitive skills is gaining importance. Preschool education significantly contributes to the development of these skills by providing an early foundation in essential abilities, such as adaptability, teamwork, and social engagement. This study investigates the long-term effects of preschool attendance on non-cognitive development among junior high school students in rural China, utilizing nationally representative data from the 2013–2014 China Education Panel Survey (CEPS). Specifically, it examines how preschool attendance affects factors tied to students’ development, including family resource investment, time investment, and peer interactions. Our findings contribute new insights into the ongoing benefits of preschool education within rural contexts, highlighting its potential to enhance students’ non-cognitive abilities in meaningful ways.

The results demonstrate that students with preschool experience exhibit stronger development in non-cognitive dimensions, particularly positive emotions, agreeableness, openness, and extraversion. Furthermore, the data reveal that certain demographic factors—such as being an only child, living with parents, not boarding, having more educated parents, and coming from economically advantaged backgrounds—are associated with higher rates of preschool attendance in China. This suggests that access to preschool is partially shaped by socioeconomic status, raising important considerations for equitable early education access. This conclusion is consistent with other studies (Wu, 2011; Zhang, 2013; Gong et al., 2016). The observed long-term benefits of preschool attendance for non-cognitive development highlight the importance of expanding preschool access, especially in under-resourced rural areas, where these skills are crucial for personal and academic growth in an evolving economic landscape.

Using four identification strategies (OLS regression, county fixed effects, propensity score matching, and instrumental variables), we find a significant positive long-term impact of preschool attendance on students’ non-cognitive abilities. These results align with Chinese studies on long-term effects (Gong et al., 2016) as well as findings from international samples (Nores and Barnett, 2010). Our data indicate that longer preschool attendance (two years or more) confers greater benefits in non-cognitive abilities and that enrolling between the ages of 3 and 4 yields the most substantial advantages for non-cognitive development. Thus, our findings underscore the critical role of preschool in fostering long-term non-cognitive abilities in rural China.

Our heterogeneity analysis offers further insights into the nuanced effects of preschool attendance across different student profiles. Firstly, the impact of preschool attendance on non-cognitive development is more pronounced for non-only-child students, possibly due to a lack of sibling interaction in only-child households that might otherwise foster social skills (Falbo and Poston Jr, 1993; Tobin et al., 2009). Secondly, left-behind students (those separated from one or both parents due to migration) appear less able to benefit from preschool experiences, likely due to a combination of both reinforcing and offsetting social effects. Finally, younger students, boys, non-only-child students, boarders, and those whose parents have lower educational levels show the greatest benefits from preschool attendance. These findings suggest the need for policies that alleviate financial burdens on vulnerable households and provide affordable preschool access to rural families. Such support can amplify the benefits of preschool, reduce developmental disparities, and foster greater educational equity between disadvantaged and advantaged students.

We identify two key mechanisms by which preschool education could enhance students’ non-cognitive skills: monetary and time investment from families, as well as enriched peer interactions, both of which contribute positively to non-cognitive development. Preschool attendance frees up parents’ time (Hirshberg et al., 2005; Barnett and Jung, 2021), which they can use for work, potentially raising family income for educational resources (Hirshberg et al., 2005; Guo et al., 2024), or for rest, improving their capacity for supportive parenting (Wu, 2011; Havnes and Mogstad, 2015). Additionally, preschool may expose parents to child development information, supporting more positive and engaged parenting practices (Wu, 2011; Ladd, 2016), positively influencing family dynamics. On the peer side, preschool provides a structured environment where students can regularly interact with peers, helping them develop interpersonal skills like cooperation, conflict resolution, and empathy (Boivin et al., 2005; Eivers et al., 2012; Cappelen et al., 2020; Xiao et al., 2022). These early peer interactions allow students to practice social behaviors and adaptability in a safe setting, building confidence and social awareness (Repper and Carter, 2011; Xiao et al., 2022). Together, these enhanced family and peer experiences offer a foundation for developing essential non-cognitive skills that support long-term success.

We recognize three limitations in our study that merit consideration. Firstly, due to data limitations, we were unable to analyze the effects of specific preschool types or investigate the mechanisms in depth, despite offering speculative explanations. Future research with more detailed data is necessary to gain a clearer understanding of these nuances. Secondly, although we employed four different strategies to address endogeneity concerns, unobserved confounders may still influence our findings. Further investigation with stronger identification methods could help clarify the link between preschool attendance and non-cognitive outcomes. Lastly, our use of cross-sectional data limits our ability to establish causality definitively. Future studies employing longitudinal panel data would provide a more robust basis for causal inferences.

Despite these limitations, our study makes an important contribution by offering a comprehensive analysis of how preschool attendance affects the development of non-cognitive skills in students. The findings have significant policy implications for student development in rural China, suggesting that expanding preschool access could be an effective way to enhance non-cognitive outcomes. Given the limited preschool availability for rural students, as highlighted by Gong et al. (2016), efforts to increase access for students from rural households are crucial. Expanding preschool opportunities in rural areas could help bridge developmental disparities between rural and urban regions and promote equity from early childhood onward within rural communities.

Future research could build upon this study by exploring the moderating effects of factors such as family engagement, community support, and regional educational policies. Such work would offer deeper insights into how preschool interventions can be tailored to different socioeconomic contexts. Further studies could also investigate the longitudinal impact of preschool education on various life outcomes, thereby enhancing our understanding of the role of early childhood interventions across the lifespan.