Introduction

The educational development and achievement of young people not only carry profound implications for their long-term health and well-being in adulthood (Kosik et al., 2018; Zhang et al., 2022) but also serve as critical drivers of socioeconomic advancement and cultural enrichment across national and societal contexts (Dolby and Rizvi, 2008; Menzies and Baars, 2021). Crucially, the attainment of an undergraduate degree shapes youths’ prospects for employment, income stability, and quality of life across the adult life course (Anderson and Li, 2020; Bass and Besen-Cassino, 2016; Dolby and Rizvi, 2008), particularly for those pursuing science, technology, engineering, and mathematics (STEM) disciplines. Indeed, the proportion of STEM graduates within a nation directly and positively influences its political-economic competitiveness, national innovation capacity, societal development, and technological advancement (Penprase, 2020; Reid et al., 2025; Tasos et al., 2018). Despite this, the rates at which students enroll in STEM majors at matriculation and/or successfully graduate with a STEM degree have been declining over the past few decades (Ji, 2021; Pov et al., 2024), underscoring the urgency of identifying factors that promote successful STEM degree completion. Although researchers have investigated how science- or STEM-related socialization (e.g., parental guidance in science education, home-based STEM resources, teacher support, and school STEM environment) contributes to youths’ STEM development (de las Cuevas et al., 2022; Ji, 2021), little research has examined how the dynamic relationships between youths’ educational expectations and science performance during middle school contribute to their later STEM attainment in adulthood. This gap is notable given robust evidence linking educational expectations to academic achievement (Fishman, 2022; Keung and Ho, 2019; Pinquart and Ebeling, 2020). Importantly, both educational expectations and academic performance are developmental constructs that evolve over time, rather than the static traits they are often conceptualized to be. To address this, the present study employs a dynamic, longitudinal framework to investigate how shifts in youths’ educational expectations affect changes in their science performance during middle school, and how these trajectories work collectively to predict eventual STEM degree completion in adulthood.

The current study’s focus on youths’ educational expectations and science performance during middle school is grounded in the developmental significance of early adolescence. This period represents a formative window for cultivating educational aspirations, academic motivation, and foundational learning competencies (Carolan, 2017; Park, 2021), with implications that reverberate into adulthood through educational and career trajectories. Such a focus is particularly salient for youth STEM development, as STEM achievement is inherently cumulative and progressive, requiring sustained academic motivation and mastery of essential science knowledge and skills acquired during, or even prior to, middle school (Miller and Pearson, 2012; Penprase, 2020; Zhang, 2022). However, existing longitudinal studies often reduce the contributions of early educational development and science-related socialization factors to static, fixed effects, measuring these variables at single timepoints rather than capturing their evolution over time (Kohen and Nitzan, 2022; LeBeau et al., 2012; Luo et al., 2022; Wang, 2024). This methodological limitation constrains the field’s ability to discern how evolving processes in adolescents’ educational development shape later STEM achievement, thereby hindering the design of targeted educational reforms and policies to advance science and STEM education. To address these research gaps, the current study examines how the developmental and growth trajectories of youths’ educational expectations contribute to their developmental and growth trajectories in science performance across middle school, and how these trajectories jointly lead to youths’ later successful graduation with a STEM degree in adulthood.

Theoretical framework of youths’ educational expectations and science performance during middle school and STEM attainment in adulthood

Although the educational development of youths and their later academic achievement, including successful graduation with a STEM degree in adulthood, depend on various environmental and personal factors, youths’ educational expectations have been consistently found to positively predict better academic performance and success, such as higher standardized exam scores, GPAs, college education enrollment, and graduation with a university degree (Andrew and Hauser, 2011; Liu et al., 2025; Liu et al., 2020; Wang, 2016). These relationships align with core tenets of situated expectancy-value theory (SEVT; Eccles, 2009; Eccles and Wigfield, 2020), which posits that achievement-related choices (e.g., pursuing STEM degrees) are directly influenced by (a) expectations for success (beliefs about one’s ability to succeed) and (b) subjective task values (perceived importance, interest, or utility of a task). Educational expectations reflect both constructs: They embody youths’ success expectations in academia and the SEVT concept of attainment value (the importance placed on educational goals), which collectively motivate persistent engagement in learning activities—including science—and long-term goal pursuit, such as STEM degree attainment in adulthood.

Nevertheless, limited research has examined how youths’ educational expectations may dynamically shape their science performance during middle school or how these evolving trajectories may jointly predict later STEM degree attainment. Recent longitudinal studies report that both educational expectations and academic performance of youths are time-varying processes (Chykina, 2019; Dochow and Neumeyer, 2021; Marsh, 2023), denoting their developmental and changing nature. SEVT elucidates this dynamism: Expectations and values are situated and malleable, continually reshaped by contextual feedback (e.g., science performance outcomes) and sociocultural influences (e.g., parental support, school climate). Early success in science reinforces competence beliefs and interest values, creating recursive feedback loops that sustain motivation and performance growth (Eccles and Wigfield, 2020; Wang and Degol, 2013). Science performance therefore serves as both a direct predictor of long-term STEM degree outcomes (by signaling foundational competence necessary for advanced study) and a mediator of the expectation-STEM link (by translating aspirational motivation into tangible academic achievement that validates STEM identity and task-related utility/attainment values). Additionally, middle school marks a pivotal transition from childhood, characterized by dependency and undefined self-concept, to adolescence (Reynolds et al., 2019), a period for cultivating identity, academic motivation, learning interests and performance, as well as future educational and occupational directions. It is during this formative period that youths establish the educational competencies that bridge childhood and early adulthood within the life course. SEVT likewise underscores adolescence as a “critical window” for STEM identity formation: When early science experiences align with youths’ interest values (e.g., fascination with scientific inquiry) and utility values (e.g., belief that science aids career goals), they foster enduring STEM aspirations (Eccles and Wigfield, 2023; Lauermann et al., 2017). It is therefore worthwhile to investigate how youths form their educational expectations and performance in middle school (Reynolds et al., 2019; Zilanawala et al., 2017), which are believed to have profound impacts on their later academic achievement in adulthood, such as successful completion of a STEM degree.

As noted above, the STEM development of youths is a cumulative and progressive process that relies on educational motivation and essential science knowledge and skills acquired in the middle school years or earlier (Miller and Pearson, 2012; Penprase, 2020). There is, however, a paucity of research that has empirically explored the evolving relationship between youths’ educational expectations and science performance during middle school and scrutinized how these trajectories may jointly affect later STEM attainment in adulthood. The current study therefore adopts a dynamic approach to study how the developmental and growth trajectories of youths’ educational expectations shape their developmental and growth trajectories in science performance across the middle school years, both of which are expected to contribute to later STEM degree completion in adulthood. SEVT provides a mechanistic framework for this process: Strong educational expectations (expectancy beliefs) enhance science effort and persistence, improving performance (attainment); this success subsequently validates and strengthens STEM-related utility/interest values, ultimately predicting STEM degree pursuit (choice) (Eccles and Wigfield, 2020). Thus, science performance is theorized to mediate the expectation-STEM link, by translating motivational beliefs into demonstrable competence, while also functioning as a direct predictor of STEM attainment due to its role as a gateway to advanced STEM coursework. This dual role is empirically tested in this study.

In this study, the educational expectations of youths are defined in line with the Wisconsin model, which portrays them as youths’ general educational expectancy and aspirations for achieving better academic performance (Roth, 2017). Accordingly, educational expectations are understood as the level of educational attainment that youths perceive as reachable (Andrew and Hauser, 2011; Roth, 2017). This approach to assessing educational expectations and their effects on adolescents’ academic performance has been commonly applied by empirical researchers (Andrew and Hauser, 2011; Liu et al., 2020; O’Donnell et al., 2022; Zhang, 2014).

Notably, while SEVT traditionally emphasizes domain-specific expectations (e.g., math self-efficacy), general educational expectations reflect overarching success beliefs that permeate academic domains. This aligns with SEVT’s acknowledgment that broader achievement beliefs scaffold domain-specific motivations (Wigfield and Eccles, 2023), particularly during early adolescence when career identities are nascent, e.g., science and STEM development. Empirically, contemporary advanced longitudinal research underscores that educational expectations and academic performance are dynamic, evolving constructs rather than static traits, necessitating analysis across developmental trajectories (Andrew and Hauser, 2011; Chykina, 2019; Marsh, 2023; Widlund et al., 2023). This is consonant with the life course perspective and Bayesian learning theory explicating that youths are cognizant agents capable of refining their cognitive and behavioral processes—including educational expectations and science learning practices—throughout middle school years in order to accomplish what they expect valuable and achievable in the future when assimilating new information and experiences across the life course (Morgan, 2005; Pallas, 2003). In the same vein, SEVT complements this: Expectations and values are continually updated via recursive processes where prior outcomes (e.g., science grades) recalibrate future expectations (success beliefs) and task values (Eccles and Wigfield, 2020).

The STEM development of youths in adulthood hinges on foundational science knowledge and skills and on sustained educational aspirations that are cumulatively and progressively established during early adolescence, particularly in middle school (English, 2017; Larkin and Lowrie, 2022; Zhang, 2022). This corresponds with what Mau et al. (1995) mentioned: “(e)ighth-grade students typically are in the crucial stage of exploring self and the world of work. Unlike some occupational fields, preparation for nontraditional occupations, especially in the areas of science and engineering, must begin early. …training in math and science needs to be sequential and uninterrupted from elementary school, and fundamentals must be mastered before high school (p. 324).” SEVT formalizes this: Middle school science mastery builds foundational competence beliefs, while early STEM exposure cultivates interest values, both of which are prerequisites for later STEM choices (Ozulku and Kloser, 2023; Wang, 2013). In this study, the science performance of youths during middle school refers to their academic proficiency in core science subjects, including biology, chemistry, physical science, and environmental science, which are the pivotal scientific foundations for their later STEM development (Larkin and Lowrie, 2022; Tasos et al., 2018). Despite this, very few longitudinal studies have systematically examined how youths’ educational expectations and science performance during middle school jointly predict later STEM attainment. Pertinently, Larson et al. (2014) found in their short-term longitudinal study that first-year college students’ initial educational aspirations and science interests significantly predicted STEM degree completion. Similarly, Zhang et al. (2019) linked grade-10 academic aspirations to BA-STEM attainment, while Marsh (2023) identified dynamic relationships between high-school math self-concept and postsecondary STEM credentials. These findings resonate with the SEVT tenet that proximal expectations/values predict distal choices, but crucially, middle school is the formative phase in which these beliefs crystallize (Eccles, 2009).

While prior longitudinal studies affirm the connection between adolescents’ educational expectations, science performance, and eventual STEM achievement, they predominantly emphasize rank-order or autoregressive effects, overlooking how developmental and growth trajectories of these constructs during middle school prospectively shape STEM degree attainment in adulthood. Empirical evidence confirms that educational expectations and academic performance evolve dynamically during adolescence (O’Donnell et al., 2022; Zhao et al., 2019). For example, O’Donnell et al. (2022), tracking 1477 Australian adolescents from ages 12–13 to 16–17, demonstrated that intrapersonal shifts in educational expectations significantly predicted later academic outcomes. Besides, Zhao et al. (2019), following 775 Chinese students from grades 6–8, found that early educational exploration and commitment predicted developmental progression in these measures, ultimately enhancing grade 8 academic achievement. Yet, such work largely treats educational development as static, neglecting latent developmental and growth trajectories that may underpin long-term educational outcomes like STEM degree attainment. For this, Carolan (2017) mentioned: “(e)mpirical work emanating from this tradition, however, has treated this complex mental process of educational expectations formation as one that is relatively fixed by adolescence and not responsive to new, relevant information (p. 238).”

In this study, we conceptualize the developmental and growth trajectories of youths’ educational expectations and science performance during middle school as two interrelated and evolving processes: (1) the developmental trend of an individual youth relative to their peers over time (reflecting developmental status in educational expectations and science performance), and (2) intrapersonal changes within each youth across grades 7 to 9 (capturing growth in these measures through continuing and repeated assessments). These dual trajectories align with SEVT’s developmental principles that initial status (intercepts) reflects early-formed expectations/values, while growth (slopes) captures how school experiences reshape them (Eccles and Wigfield, 2023; Wigfield and Eccles, 2000). We hypothesize that the developmental and growth trajectories of youths’ educational expectations during middle school will positively shape their developmental and growth trajectories in science performance. Together, these two evolving trajectories are expected to predict youths’ later successful graduation with a STEM degree in adulthood. By modeling these longitudinal dynamic relationships, the current study contributes to a nuanced understanding of how early academic motivation and skill development coalesce to shape STEM trajectories. Thereby, this study bridges critical research gaps in education and behavioral research by reconceptualizing youths’ educational expectations and science performance as dynamic, co-evolving processes during middle school. It reveals how these developmental and growth trajectories synergistically predict youths’ later STEM degree attainment—a contribution that challenges static conceptual frameworks while advancing life course and Bayesian learning theories.

To the authors’ knowledge, this is the first empirical inquiry to conceptualize educational expectations and science performance as time-varying constructs and rigorously test their dynamic interplay during adolescence. By modeling these relationships through a longitudinal framework, the study advances understanding of how cumulative and progressive processes in early academic motivation and science learning—and their interdependencies—spill over into long-term STEM outcomes. The findings of this study help provide insights for how to design policies and plan educational interventions in nurturing educational aspirations and foundational science competencies of youths systematically in early school years with an aim to prepare their later STEM development across the life course.

The current study

Taken together, the literature reviewed above motivates a dynamic approach to investigating how the developmental and growth trajectories of youths’ educational expectations and science performance across middle school predict their later successful graduation with a STEM degree in adulthood. Under this approach, youths’ educational expectations and science performance vary between individuals in their standing levels (the development) and change within individuals over time (the growth) across middle school. This helps scrutinize the cultivability and malleability of educational motivation and academic performance during early school years in relation to youths’ long-term academic achievement, e.g., STEM degree attainment in adult life (Affuso et al., 2025; Chykina, 2019; Perinelli et al., 2022). In addition, educational expectations are empirically reported to contribute to youths’ academic performance in school years (Dotterer, 2022; Perinelli et al., 2022; Zhang, 2014), and both are important for youths’ later completion of a college degree (Kim and Fong, 2014; Sommerfeld, 2016; Zhan and Sherraden, 2011; Zhang, 2022). We therefore expect that the developmental and growth trajectories of youths’ science performance would mediate the relationship between the developmental and growth trajectories of youths’ educational expectations and their STEM degree completion in adulthood. Given the cumulative and progressive nature of youths’ educational expectations and science performance in school, we further hypothesize that the initial levels of youths’ educational expectations and science performance predict their subsequent growth across middle school. Accordingly, we set the hypotheses as follows:

H1: The development and growth of youths’ educational expectations in middle school would positively lead to their development and growth in science performance across middle school.

H2: The development and growth of youths’ educational expectations and their science performance during middle school would positively predict their later successful graduation with a STEM degree in adulthood.

H3: The development and growth of youths’ science performance would mediate the effects of their development and growth in educational expectations during middle school on their later successful graduation with a STEM degree in adulthood.

H4: The growth of youths’ educational expectations and science performance during middle school would be a function of their initial development of educational expectations and science performance in middle school, anticipating that the initial development of youths’ educational expectations and science performance would predict their growth in educational expectations and science performance across middle school.
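The mediation logic in H3 can be illustrated with the product-of-coefficients approach, in which the indirect effect is the path from expectations to science performance multiplied by the path from science performance to STEM degree completion. The sketch below is only a numerical illustration: the function names and path values are hypothetical and are not estimates from this study, and the simple additive decomposition is assumed to hold on a linear (latent) scale.

```python
def indirect_effect(a: float, b: float) -> float:
    """Indirect effect as the product of two paths.

    a: expectations trajectory -> science performance trajectory
    b: science performance trajectory -> STEM degree completion
    """
    return a * b


def total_effect(a: float, b: float, c_prime: float) -> float:
    """Total effect = direct effect (c_prime) + indirect effect (a * b)."""
    return c_prime + indirect_effect(a, b)


# Hypothetical path values, chosen only for illustration.
a_path, b_path, direct_path = 0.40, 0.30, 0.10
print("indirect:", round(indirect_effect(a_path, b_path), 3))
print("total:", round(total_effect(a_path, b_path, direct_path), 3))
```

Under this reading, a non-zero product of the two paths would be consistent with H3, while a non-zero direct path alongside it would indicate partial rather than full mediation.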

To ensure the estimated relationships between educational trajectories and STEM outcomes reflect core theoretical processes rather than sociodemographic disparities, we account for key contextual factors central to STEM equity research, which include youths’ gender, family composition, parental SES, and ethnicity (Ji, 2021; Luo et al., 2022; Penprase, 2020). These factors were included as statistical controls to isolate the unique predictive effects of motivational and performance trajectories while acknowledging their established roles in systemic opportunity gaps. The rationales are that female students are reported to have higher educational motivation and academic performance than their male counterparts (Atchia and Chinapah, 2023; Keung and Ho, 2019). Moreover, youths living in two-parent families with a biological mother and father and having higher parental SES were found to have higher educational expectations and better academic achievement compared to their student counterparts of other family structures and lower parental SES (Gu et al., 2024; Renzulli and Barr, 2017; Sun and Li, 2011; Zhang et al., 2021). Additionally, Asian youths are reported to have stronger academic motivation and better educational performance, especially in STEM development, compared to their peers of other ethnic origins (Feliciano and Lanuza, 2017; Hsieh and Simpkins, 2022). Thereby, the current study classified youths into five ethnic origins: White, African American, Hispanic, Asian, and Native American, with Asian youths as the reference group due to their higher academic motivation and educational outperformance (Feliciano and Lanuza, 2017; Portes and Rumbaut, 2014). Youth participants’ age was excluded as a control variable in the modeling procedures to mitigate potential collinearity, as educational expectations and science performance are time-varying constructs inherently aligned with school years and chronological age (Wang and Wang, 2019).

Methods

Data and sample

The present study utilized data from the Longitudinal Study of American Youth (LSAY), a nationally representative study of public middle- and high-school students in the United States (Miller, 2014). LSAY surveyed two cohorts beginning in 1987: The first cohort included 2829 high school students in the 10th grade, and the second cohort included 3116 middle school students in the 7th grade. The longitudinal interviews were conducted annually from 1987 to 1994 (spanning 7 years). Cohort-1 participants were tracked for 3 years of high school and 4 years after high school graduation, while cohort-2 participants were followed for 3 years of middle school, 3 years of high school, and 1 year after high school graduation. In 2006, the National Science Foundation (NSF) provided additional funding to trace the educational and occupational development of the cohort-1 and cohort-2 participants in LSAY. Five additional longitudinal surveys were conducted in 2007, 2008, 2009, 2010, and 2011. In 2007, LSAY successfully relocated around 95% of the original sample of cohort-1 and cohort-2 students (N = 5945), who were then between 33 and 37 years old. LSAY employed a stratified sampling framework, randomly drawing student participants from a national population of public middle and high schools in 12 sampling strata defined by geographic region and type of community, so that the final sample was representative of the public middle- and high-school student populations in the United States. The original sample comprised ~48% female and 52% male students, and 70% Whites, 17% African Americans, 9% Hispanics, 3% Asian Americans, and 1% Native Americans. The present study used the data of cohort-2 students as the study sample because they contain pertinent information regarding educational expectations and science learning in the middle school years and STEM development in adulthood.

Measures

Educational expectations of youths were measured annually from grades 7 through 9 by a single LSAY item: “As things stand now, how far in school do you think you will get?” Responses were rated on a 6-point scale: 1 = high school only, 2 = vocational/trade school, 3 = some college, 4 = bachelor’s degree, 5 = master’s degree, and 6 = doctorate/professional degree, with higher responses representing higher educational expectations. The measurement item is consistent with the Wisconsin model’s assessment of youths’ educational expectations (Liu et al., 2020; O’Donnell et al., 2022; Roth, 2017) and has been commonly used to rate educational expectancy and aspirations in existing empirical research (Feliciano and Lanuza, 2016; Karlson, 2015; Liu et al., 2020).

Science performance of youths was assessed annually from grades 7 to 9 using standardized scores provided by the LSAY personnel, capturing students’ proficiency in core STEM disciplines: biology, chemistry, physical science, and environmental science. These foundational subjects are critical for future STEM engagement and development (Ayuso et al., 2022; Liversidge, 2009). Scores were derived via item-response theory (IRT) methodology to adjust for measurement reliability, guessing, and item difficulty (DeMars, 2010). To ensure cross-cohort comparability, LSAY employed BILOG-MG software (Zimowski et al., 1996) to recalibrate scores using multiple-group IRT modeling (MGM). This aligned Cohort-1 students’ middle school performance with high school benchmarks and integrated middle-to-high school data for Cohort-2 students (the study sample). Annual score ranges spanned 26–88 (7th grade), 22–83 (8th grade), and 27–91 (9th grade), with higher scores reflecting better science achievement.
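To illustrate how an IRT model can account for item difficulty and guessing, the sketch below evaluates a three-parameter logistic (3PL) item response function. Whether LSAY used the 3PL form specifically is an assumption made here for illustration; the function name and parameter values are hypothetical.

```python
import math


def three_pl_probability(theta: float, a: float, b: float, c: float) -> float:
    """Probability of a correct response under a 3PL IRT model (assumed form).

    theta: student proficiency
    a: item discrimination
    b: item difficulty
    c: lower asymptote reflecting guessing
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item: moderate discrimination, average difficulty, 20% guessing floor.
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(three_pl_probability(theta, a=1.2, b=0.0, c=0.2), 3))
```

When proficiency equals the item difficulty, the probability sits halfway between the guessing floor and 1, which is why scores scaled this way are not inflated by lucky guesses on hard items.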

STEM degree completion in college was measured in 2007 by an item provided by the LSAY personnel indicating whether participants had successfully graduated with a four-year baccalaureate degree in science, technology, engineering, mathematics, or medicine (hence the STEMM label in the coding). This dichotomous classification, which compares graduates holding a STEM degree with all other participants, has been used in prior research (Luo et al., 2022; Wright et al., 2017) and is coded 0 = no baccalaureate or non-STEMM major and 1 = STEMM major.

Contextual factors of youths’ gender, family composition, parental SES, and ethnicity were included in the modeling procedures to preclude confounding effects. Youths’ gender (0 = female, 1 = male) and family composition (0 = otherwise, 1 = two-parent family) are dichotomous variables. Parental SES is a continuous variable measured by Duncan’s Socioeconomic Index (SEI), computed from parental reports of educational level, income, and occupational prestige (Caston, 1989). SEI is constructed by weighting an occupation’s median education and income on the metric of occupational prestige (Caston, 1989) and has been widely used in empirical research to indicate socioeconomic status, with higher scores indicating higher parental SES (Montero et al., 2021; Pitt and Zhu, 2019). Youths’ ethnicity is represented by four dummy variables (White, African American, Hispanic, and Native American), each coded 1 for membership in that group and 0 otherwise, with Asian youths serving as the reference group (0 on all four dummies).
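The ethnicity coding described above can be sketched as follows. This is a minimal illustration of reference-group dummy coding; the function and constant names are hypothetical and do not correspond to LSAY variable names.

```python
# Comparison groups each get their own dummy; the reference group gets none.
COMPARISON_GROUPS = ("White", "African American", "Hispanic", "Native American")
REFERENCE_GROUP = "Asian"


def code_ethnicity(label: str) -> dict:
    """Return the four ethnicity dummies for one youth.

    Asian youths (the reference group) are coded 0 on all four dummies.
    """
    if label != REFERENCE_GROUP and label not in COMPARISON_GROUPS:
        raise ValueError(f"Unknown ethnicity label: {label!r}")
    return {group: int(label == group) for group in COMPARISON_GROUPS}


print(code_ethnicity("Hispanic"))
print(code_ethnicity("Asian"))
```

With this coding, each dummy coefficient is interpreted as the difference between that group and Asian youths, holding the other covariates constant.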

Modeling procedures

Given the longitudinal design of this study, parallel-process latent growth curve modeling (PP-LGCM) was employed to examine how the developmental (initial levels) and growth trajectories (rates of changes) in youths’ educational expectations and science performance during middle school (grades 7–9) jointly predict later successful graduation with a STEM degree in adulthood. PP-LGCM, belonging to growth modeling, is a robust modeling procedure to analyze longitudinal relationships between evolving constructs and their association with distal outcomes (Grimm et al., 2017), such as STEM degree completion. The basic latent growth curve model (LGCM) is expressed as:

$$Y={\tau }_{y}+{\varLambda }_{y}\eta +\varepsilon ,$$

where \(Y\) is the vector of observed scores, \({\tau }_{y}\) denotes the population means of \(Y\), \({\varLambda }_{y}\) contains the factor loadings, \(\eta\) includes the latent intercept (\({\eta }_{0}\)) and slope (\({\eta }_{1}\)) factors of youths’ educational expectations and/or science performance, and \(\varepsilon\) represents the residuals. Intercept factors capture initial levels, while slope factors quantify changes over time. As both educational expectations and science performance of youths were measured repeatedly at equal time intervals from grades 7 through 9, the time scores were set to t0, t1, and t2 across the middle school years in the current study.
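As a rough illustration of what the basic LGCM implies for three equally spaced waves, the sketch below computes model-implied means and covariances from a set of growth parameters. It assumes time scores of 0, 1, and 2 for t0, t1, and t2, and it assumes the measurement intercepts are fixed at zero so that the means are carried by the growth factors (a common convention, not stated in the source). All parameter values are hypothetical.

```python
TIME_SCORES = (0.0, 1.0, 2.0)  # assumed coding of t0, t1, t2 (grades 7, 8, 9)


def loading_matrix(time_scores=TIME_SCORES):
    """Rows are waves; columns are [intercept loading, slope loading]."""
    return [[1.0, t] for t in time_scores]


def implied_means(mean_intercept, mean_slope, time_scores=TIME_SCORES):
    """Model-implied mean at each wave: intercept mean + time score * slope mean."""
    return [mean_intercept + t * mean_slope for t in time_scores]


def implied_covariance(psi, theta, time_scores=TIME_SCORES):
    """Model-implied covariance: Lambda * Psi * Lambda' + diag(theta).

    psi: 2x2 covariance matrix of the (intercept, slope) factors
    theta: residual variance for each wave
    """
    lam = loading_matrix(time_scores)
    n = len(lam)
    sigma = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            sigma[r][c] = sum(
                lam[r][j] * psi[j][k] * lam[c][k]
                for j in range(2)
                for k in range(2)
            )
        sigma[r][r] += theta[r]  # residual variance only on the diagonal
    return sigma


# Hypothetical growth parameters for an expectations-like scale.
print(implied_means(4.0, 0.5))
print(implied_covariance([[1.0, 0.2], [0.2, 0.5]], [0.3, 0.3, 0.3]))
```

Fitting the model amounts to choosing the growth parameters so that these implied moments match the observed means and covariances as closely as possible.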

PP-LGCM extends LGCM by simultaneously modeling two parallel processes that include the developmental and growth trajectories of youths’ educational expectations and science performance within a unified framework in the current study, which can be written as:

$${Y}_{ti}^{m}={\eta }_{0i}^{{y}_{m}}+{\lambda }_{t}^{{y}_{m}}{\eta }_{1i}^{{y}_{m}}+{\varepsilon }_{ti}^{{y}_{m}}$$
$${\eta }_{0i}^{{y}_{m}}={\eta }_{0}^{{y}_{m}}+{\varsigma }_{0i}^{{y}_{m}}$$
$${\eta }_{1i}^{{y}_{m}}={\eta }_{1}^{{y}_{m}}+{\varsigma }_{1i}^{{y}_{m}}.$$

The first equation is the within-subject model, in which \({Y}_{ti}^{m}\) indicates the educational expectations or science performance of youth \(i\) observed at time \(t\) in middle school, and \({\eta }_{0i}^{{y}_{m}}\) and \({\eta }_{1i}^{{y}_{m}}\) are the latent intercept and slope of the mth growth process (youths’ educational expectations or science performance), with \({\lambda }_{t}^{{y}_{m}}\) as the time score and \({\varepsilon }_{ti}^{{y}_{m}}\) as the time-specific residual. The second and third equations denote the between-subject models, in which \({\eta }_{0i}^{{y}_{m}}\) and \({\eta }_{1i}^{{y}_{m}}\) are the dependent variables, \({\eta }_{0}^{{y}_{m}}\) is the estimated overall initial level of youths’ educational expectations or science performance in middle school (the development), \({\eta }_{1}^{{y}_{m}}\) is the average rate of change across middle school (the growth), and \({\varsigma }_{0i}^{{y}_{m}}\) and \({\varsigma }_{1i}^{{y}_{m}}\) are the person-specific residuals.
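The within-subject and between-subject equations can be made concrete with a small data-generating sketch for one youth and two parallel processes. The time scores of 0, 1, and 2, the cross-process paths `g0` and `g1` (linking expectations growth factors to science growth factors, in the spirit of H1), and all parameter values are assumptions introduced for illustration only; this simulates data under the model and is not the estimation routine.

```python
import random

TIME_SCORES = (0, 1, 2)  # assumed coding for grades 7, 8, 9


def trajectory(eta0, eta1, rng, sd_residual):
    """Within-subject equation: Y_ti = eta0_i + lambda_t * eta1_i + e_ti."""
    return [eta0 + t * eta1 + rng.gauss(0.0, sd_residual) for t in TIME_SCORES]


def simulate_youth(rng, p):
    """Generate three waves of expectations and science scores for one youth."""
    # Between-subject equations for the expectations process.
    exp_dev0 = rng.gauss(0.0, p["exp_sd0"])
    exp_dev1 = rng.gauss(0.0, p["exp_sd1"])
    exp_eta0 = p["exp_mean0"] + exp_dev0
    exp_eta1 = p["exp_mean1"] + exp_dev1
    # Science process: growth factors shifted by the expectations deviations
    # through the hypothetical cross-process paths g0 and g1.
    sci_eta0 = p["sci_mean0"] + p["g0"] * exp_dev0 + rng.gauss(0.0, p["sci_sd0"])
    sci_eta1 = p["sci_mean1"] + p["g1"] * exp_dev1 + rng.gauss(0.0, p["sci_sd1"])
    return {
        "expectations": trajectory(exp_eta0, exp_eta1, rng, p["exp_sd_e"]),
        "science": trajectory(sci_eta0, sci_eta1, rng, p["sci_sd_e"]),
    }


# Hypothetical parameter values, not estimates from the study.
EXAMPLE_PARAMS = {
    "exp_mean0": 4.0, "exp_mean1": 0.1, "exp_sd0": 1.0, "exp_sd1": 0.3, "exp_sd_e": 0.5,
    "sci_mean0": 50.0, "sci_mean1": 2.0, "sci_sd0": 8.0, "sci_sd1": 1.5, "sci_sd_e": 3.0,
    "g0": 3.0, "g1": 2.0,
}
print(simulate_youth(random.Random(2024), EXAMPLE_PARAMS))
```

Setting every standard deviation to zero reduces each simulated youth to the average trajectory, which is a quick way to check that the mean structure is specified as intended.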

To combine the above equations with a distal outcome of youths’ successful graduation with a STEM degree in adulthood, we have:

$${\eta }_{{Di}}={a}_{D0}+{\beta }_{1}{\eta }_{{Ii}}+\,{\beta }_{2}{\eta }_{{Si}}+{\beta }_{3}{z}_{i}+\,{\xi }_{{Di}},$$

where \({\eta }_{{Di}}\) is the distal outcome of students’ graduation with a STEM degree, \({\beta }_{1}\) and \({\beta }_{2}\) are the regression coefficients of the intercept (\({\eta }_{{Ii}}\)) and slope (\({\eta }_{{Si}}\)) factors that represent the developmental and growth trajectories of youths’ educational expectations and science performance in middle school, and \({\beta }_{3}\) is the regression coefficient for the contextual factors \({z}_{i}\) (youths’ gender, family composition, parental SES, and ethnicity). These contextual factors were adjusted for in the modeling procedures to preclude background confounding effects on the developmental and growth trajectories of youths’ educational expectations and science performance and their later STEM degree completion in adulthood. \({a}_{D0}\) is the intercept of the distal STEM attainment outcome, and \({\xi }_{{Di}}\) is the person-specific residual, i.e., the part of \({\eta }_{{Di}}\) not explained by the predictors.
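The distal-outcome equation can be read as a linear predictor built from a youth's growth factors and covariates. The sketch below evaluates that predictor and, as an assumption made only for illustration, maps it to a probability with a logistic link (the source does not state which link function was used for the binary STEM outcome). Function names and coefficient values are hypothetical.

```python
import math


def distal_linear_predictor(a_d0, growth_factors, betas, covariates, covariate_betas):
    """a_D0 + sum(beta_k * eta_k) + sum(beta_z * z) for one youth."""
    if len(growth_factors) != len(betas) or len(covariates) != len(covariate_betas):
        raise ValueError("Each predictor needs exactly one coefficient.")
    growth_part = sum(b * eta for b, eta in zip(betas, growth_factors))
    covariate_part = sum(b * z for b, z in zip(covariate_betas, covariates))
    return a_d0 + growth_part + covariate_part


def logistic(x):
    """Logistic link (an assumed choice, not confirmed by the source)."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical youth: intercept factor 1.0, slope factor 0.5, one covariate coded 1.
eta_d = distal_linear_predictor(-2.0, [1.0, 0.5], [0.8, 0.4], [1.0], [0.2])
print(round(eta_d, 3), round(logistic(eta_d), 3))
```

In the parallel-process model there are two intercept and two slope factors, so `growth_factors` would hold four values, each with its own coefficient.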

The models were fit in Mplus 8.10 (Muthen and Muthen, 1998–2017). To address the nested structure of the LSAY data (clustered at the school level), the TYPE = COMPLEX analysis option was applied to adjust standard errors and chi-square tests for non-independence (Wang and Wang, 2019). Missing values are a common concern in longitudinal research. In this study, the proportion of missing data (1.3–12.7%) indicated low attrition, consistent with longitudinal population-based studies (Gustavson et al., 2012). Little’s test (Little and Rubin, 2002) is traditionally used to evaluate whether data are missing completely at random (MCAR), as opposed to missing at random (MAR) or missing not at random (MNAR). However, it has recently been found to risk biased rejection of MCAR because of its sensitivity to large sample sizes and to the nonnormality of longitudinal data (Ahmad and Zhang, 2021; Enders, 2022). Instead, we adopted the TestMCARNormality procedure in the MissMech package for R to test the pattern of missing data (Jamshidian et al., 2014), which accounts for longitudinal data complexities, temporal dependencies, and large-scale samples via bootstrapping. The result supports the MCAR assumption for the data used in this study (p = 0.099; see Footnote 1). Missing data were handled using full information maximum likelihood (FIML), a likelihood-based method that leverages all available participant data for efficient estimation (Lee and Shi, 2021), thereby preserving statistical power and minimizing bias under the assumption that data are MCAR or MAR. In the context of growth modeling, FIML can accommodate missingness due to attrition or sporadic non-response by estimating model parameters directly from the observed-data likelihood, avoiding the need for ad hoc procedures such as listwise deletion or imputation, which can distort results (Enders, 2022).
Model fit was evaluated by comparative fit index (CFI), root mean-square error of approximation (RMSEA), and standardized root mean-square residual (SRMR). Acceptable model fit is: CFI > 0.90, RMSEA < 0.08, and SRMR < 0.1; and excellent model fit is: CFI > 0.95, RMSEA < 0.06, and SRMR < 0.08 (Wang and Wang, 2019).
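
The cutoffs above can be expressed as a small decision rule. The Python sketch below is only an illustration of that rule: the function name and the label "poor" for models that miss the acceptable thresholds are our assumptions, not terms used by Wang and Wang (2019).

```python
def classify_fit(cfi, rmsea, srmr):
    """Label model fit using the CFI, RMSEA, and SRMR cutoffs stated in the text."""
    # Excellent fit: CFI > 0.95, RMSEA < 0.06, and SRMR < 0.08.
    if cfi > 0.95 and rmsea < 0.06 and srmr < 0.08:
        return "excellent"
    # Acceptable fit: CFI > 0.90, RMSEA < 0.08, and SRMR < 0.1.
    if cfi > 0.90 and rmsea < 0.08 and srmr < 0.10:
        return "acceptable"
    # Hypothetical label for anything below the acceptable cutoffs.
    return "poor"


if __name__ == "__main__":
    # Fit indices reported for Model 3A in the Results.
    print(classify_fit(cfi=0.997, rmsea=0.026, srmr=0.011))
```

All three indices must pass a tier for the model to receive that label, so a single weak index moves the model down a tier.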

Results

Descriptive analysis

Descriptive statistics are summarized in Table 1. The sample comprised 48% female (n = 1495) and 52% male (n = 1621) youths. Most participants (87.1%, n = 2715) resided in two-parent households, with 12.9% (n = 401) from other family structures. Parental socioeconomic status (SES) averaged M = 39.72 (SD = 16.575). Ethnically, the majority identified as White (69.6%, n = 2169), followed by African American (16.2%, n = 504), Hispanic (9.1%, n = 284), Asian (3.5%, n = 112), and Native American (1.5%, n = 47). Youths’ educational expectations showed a slight decline across grades: M = 4.011 (Grade 7), M = 4.005 (Grade 8), and M = 3.886 (Grade 9). Conversely, science performance increased progressively (M = 49.85 in Grade 7, M = 53.28 in Grade 8, M = 57.44 in Grade 9). By 2007, only 9% of youths (n = 281) had attained a four-year STEM baccalaureate degree. The decrease in educational expectations across middle school, despite the observed improvement in science performance, may reflect youths forming more realistic perceptions of their educational prospects as academic standards become more demanding in higher grades.

Table 1 Descriptive statistics of the study variables (N = 3116).

Unconditional LGCM models of educational expectations and science performance

Table 2 presents standardized results of the unconditional latent growth curve model (LGCM) analyzing youths’ educational expectations (Model 1A). The model demonstrated excellent fit (CFI = 1.000, RMSEA = 0.000, SRMR = 0.000). Factor loadings for the intercept factor (grade 7–9 educational expectations) were significant (λ = 0.857, 0.886, 0.826; p < 0.001), while slope factor loadings (grade 8–9) were λ = 0.227 and 0.424 (p < 0.001). A significant negative correlation emerged between intercept and slope factors (r = −0.167, p < 0.01), suggesting that youths with higher initial development of educational expectations exhibited slower subsequent growth. Both intercept (σ² = 1.456) and slope (σ² = 0.096) variances were significant (p < 0.001), indicating substantial individual variation. Table 3 summarizes LGCM results for youths’ science performance (Model 1B), which also showed excellent model fit (CFI = 1.000, RMSEA = 0.000, SRMR = 0.000). Intercept factor loadings (grade 7–9 scores) were λ = 0.964, 0.908, 0.904 (p < 0.001), with slope loadings (grade 8–9) at λ = 0.292 and 0.581 (p < 0.001). Similarly, a negative intercept-slope correlation (r = −0.134, p < 0.001) implied that higher initial development of science performance was related to slower later growth. Significant variances for intercept (σ² = 91.419) and slope (σ² = 10.080) factors (p < 0.001) further highlighted individual differences. Collectively, both models fit the data well, accurately capturing developmental and growth trajectories of youths’ educational expectations and science performance during middle school.

Table 2 Standardized results of unconditional latent growth curve model (Unconditional LGCM) for youths’ educational expectations (Model 1A).
Table 3 Standardized results of unconditional latent growth curve model (Unconditional LGCM) for youths’ science performance (Model 1B).

Conventional PP-LGCM model of educational expectations and science performance

A conventional parallel-process latent growth curve model (Conventional PP-LGCM) was estimated to examine associations between youths’ educational expectations and science performance in middle school (Model 2). This model regressed the intercept and slope factors of science performance on the intercept factor of educational expectations, and the science performance slope factor on the educational expectations slope factor (Figs. 1–3). The model demonstrated good fit: CFI = 0.997, RMSEA = 0.049, SRMR = 0.018. As shown in Fig. 1, the intercept of educational expectations positively predicted both the intercept (β = 0.424, p < 0.001) and slope factors (β = 0.065, p < 0.01) of science performance, indicating that higher initial development of educational expectations was associated with better initial development and modestly faster growth in science performance. Additionally, the slope of educational expectations positively predicted the science performance slope (β = 0.141, p < 0.001), suggesting that greater growth in educational expectations was accompanied by greater gains in science performance. The factor loadings were robust: educational expectations intercept (λ = 0.845, 0.874, 0.815) and slope (λ = 0.196 and 0.366), and science performance intercept (λ = 0.955, 0.909, 0.899) and slope (λ = 0.281 and 0.556), all p < 0.001. These results provide empirical justification for estimating conditional PP-LGCMs to predict distal STEM degree attainment. Full parameter estimates of Model 2, including cross-domain regressions and within-domain covariances, are reported in Table S1 in the Appendix.

Fig. 1: Standardized effects of conventional parallel-process latent growth curve model (Conventional PP-LGCM) of youths’ educational expectations and science performance (Model 2).

Note. For model fit, CFI = 0.997, RMSEA = 0.049, SRMR = 0.018. InpEdExp = Intercept Factor of Youths’ Educational Expectations; SlpEdExp = Slope Factor of Youths’ Educational Expectations; InpScPef = Intercept Factor of Youths’ Science Performance; SlpScPef = Slope Factor of Youths’ Science Performance. *p < 0.05, **p < 0.01, ***p < 0.001.

Fig. 2: Standardized effects of conditional parallel-process latent growth curve model (Conditional PP-LGCM) of youths’ educational expectations and science performance in prediction of successful completion of a STEM degree (Model 3A).

Note. For model fit, CFI = 0.997, RMSEA = 0.026, SRMR = 0.011. InpEdExp = Intercept Factor of Youths’ Educational Expectations; SlpEdExp = Slope Factor of Youths’ Educational Expectations; InpScPef = Intercept Factor of Youths’ Science Performance; SlpScPef = Slope Factor of Youths’ Science Performance. Dotted arrows are the effects of youths’ gender, family composition, parental SES, and ethnicity on the intercept and slope factors of youths’ expectations and science performance in middle school years and their STEM degree attainment in adulthood. *p < 0.05, **p < 0.01, ***p < 0.001.

Fig. 3: Standardized effects of conditional and directional parallel-process latent growth curve model (Conditional and Directional PP-LGCM) of youths’ educational expectations and science performance in prediction of successful completion of a STEM degree by regressing slope factors on intercept factors (Model 3B).

Note. For model fit, CFI = 0.997, RMSEA = 0.026, SRMR = 0.011. InpEdExp = Intercept Factor of Youths’ Educational Expectations; SlpEdExp = Slope Factor of Youths’ Educational Expectations; InpScPef = Intercept Factor of Youths’ Science Performance; SlpScPef = Slope Factor of Youths’ Science Performance. Dotted arrows are the effects of youths’ gender, family composition, parental SES, and ethnicity on the intercept and slope factors of youths’ expectations and science performance in middle school years and their STEM degree attainment in adulthood. *p < 0.05, **p < 0.01, ***p < 0.001.

Conditional PP-LGCM models of educational expectations and science performance in prediction of STEM degree completion

A conditional parallel-process latent growth curve model (Conditional PP-LGCM) was estimated to examine how youths’ educational expectations, science performance, and contextual factors (youths’ gender, family composition, parental SES, and ethnicity) predict STEM degree completion in adulthood (Model 3A). STEM degree completion was modeled as an ordered categorical outcome. The model demonstrated excellent fit: CFI = 0.997, RMSEA = 0.026, SRMR = 0.011. Fig. 2 shows that higher initial development of youths’ educational expectations (β = 0.148, p < 0.001) and science performance (β = 0.150, p < 0.001) in middle school predicted greater odds of STEM degree attainment (15.95% and 16.18% higher per standard deviation increase, respectively). Similarly, growth in educational expectations (β = 0.087, p < 0.01) and science performance (β = 0.045, p < 0.05) predicted modest increases in the odds of STEM degree attainment (9.08% and 5% per SD). Full parameter estimates of Model 3A are reported in Table S2 in the Appendix. To test whether growth in educational expectations and science performance depends on their initial developmental trajectories, a conditional and directional parallel-process latent growth curve model (Conditional and Directional PP-LGCM) was estimated, regressing the slope factors of educational expectations and science performance on their respective intercept factors while keeping the structure of Model 3A intact (Model 3B). The fit of this model was excellent: CFI = 0.997, RMSEA = 0.026, SRMR = 0.011. Fig. 3 shows that the developmental trajectories (intercept factors) of youths’ educational expectations and science performance negatively predicted the growth rates of educational expectations (β = −0.186, p < 0.001) and science performance (β = −0.165, p < 0.001), respectively, indicating that higher baseline development of educational expectations and science performance was associated with slower subsequent growth.
Notably, the effect of educational expectations’ intercept on science performance’s slope strengthened in Model 3B (β = 0.151 vs. 0.091 in Model 3A; p < 0.001), suggesting that modeling intercept-slope relationships as directional (vs. correlational) better captures the developmental trajectories in educational expectations and science performance. The remaining paths in Model 3B had effects similar to those in Model 3A. Full parameter estimates of Model 3B are reported in Table S3 in the Appendix.
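
The percentage changes in odds quoted for Model 3A appear to follow from exponentiating the standardized coefficients, as one would for a logit-type coefficient. The Python sketch below reproduces that arithmetic under this assumption only; the function name is ours, and the link function used in the fitted Mplus model is not restated here.

```python
import math


def percent_change_in_odds(beta):
    """Percentage change in odds per 1-SD increase, assuming OR = exp(beta)."""
    return (math.exp(beta) - 1.0) * 100.0


if __name__ == "__main__":
    # Standardized coefficients reported for Model 3A.
    coefficients = {
        "expectations intercept": 0.148,
        "science intercept": 0.150,
        "expectations slope": 0.087,
        "science slope": 0.045,
    }
    for name, beta in coefficients.items():
        print(f"{name}: OR = {math.exp(beta):.3f}, "
              f"change in odds = {percent_change_in_odds(beta):.2f}%")
```

With the rounded coefficients, the two intercept effects come out at about 15.95% and 16.18%. The values for the slope effects may differ slightly from the reported figures, plausibly because the published coefficients are rounded to three decimals.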

Effects of youths’ contextual factors

Regarding the effects of youths’ gender, family composition, parental SES, and ethnicity, male youths showed higher science performance (β = 0.217, p < 0.001) and greater STEM degree attainment (β = 0.065, p < 0.01; OR = 1.069), and youths from two-parent families (β = 0.037, p < 0.01; OR = 1.037) and with higher parental SES (β = 0.080, p < 0.01; OR = 1.083) had greater odds of STEM degree attainment. In addition, higher parental SES contributed to stronger developmental trajectories of youths’ educational expectations (β = 0.322, p < 0.001) and science performance (β = 0.144, p < 0.001). Hispanic, African American, White, and Native American youths showed lower developmental trajectories of educational expectations (β = −0.148 to −0.051, p < 0.05 to p < 0.001) and science performance (Hispanic: β = −0.139; African American: β = −0.210, p < 0.001) compared to Asian peers. These groups also had reduced odds of STEM degree completion (OR = 0.894–0.913, p < 0.01).

Mediation analysis

Latent mediation tests (Table 4) revealed significant indirect effects of educational expectations’ intercept on successful graduation with a STEM degree via science performance’s intercept (β = 0.055, p < 0.001) and slope (β = 0.007, p < 0.05). A marginally significant mediation emerged for science performance’s intercept and slope (β = −0.003, p < 0.10). While the slope-to-slope path from educational expectations to science performance was nonsignificant (β = −0.003, p = 0.131), the total indirect effect was significant (β = 0.058, p < 0.001), indicating that science performance of youths jointly mediated the link between their educational expectations and the STEM outcome.
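
Specific indirect effects of this kind are conventionally computed as the product of the path coefficients along each mediating chain, and the total indirect effect as the sum over chains. The Python sketch below illustrates that arithmetic only. The path values are hypothetical placeholders rather than estimates from Model 3B, and the function names are ours; significance testing, which requires standard errors for each product, is omitted.

```python
from math import prod


def specific_indirect(path_coefficients):
    """Product of the coefficients along one mediating chain."""
    return prod(path_coefficients)


def total_indirect(chains):
    """Sum of the specific indirect effects over all chains."""
    return sum(specific_indirect(paths) for paths in chains.values())


if __name__ == "__main__":
    # Hypothetical standardized paths (NOT the study's estimates).
    chains = {
        "InpEdExp -> InpScPef -> STEM": [0.40, 0.15],
        "InpEdExp -> SlpScPef -> STEM": [0.15, 0.05],
        "InpEdExp -> SlpEdExp -> SlpScPef -> STEM": [-0.20, 0.15, 0.05],
    }
    for label, paths in chains.items():
        print(f"{label}: {specific_indirect(paths):.4f}")
    print(f"Total indirect effect: {total_indirect(chains):.4f}")
```

Because each chain multiplies standardized coefficients that are smaller than one in absolute value, longer chains tend to yield smaller indirect effects, which is one reason a total effect can be significant even when an individual chain is not.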

Table 4 Standardized mediated effects of the slope and intercept factors of youths’ science performance and/or the slope factor of youths’ educational expectations on the relationship between the intercept factor of youths’ educational expectations and successful STEM degree completion (Model 3B).

Monte Carlo simulation for power test and sensitivity analysis

The application of parallel-process latent growth curve modeling (PP-LGCM) to examine dynamic longitudinal relationships among youths’ educational expectations, science performance, and STEM degree attainment is a sophisticated analytical approach. Given the complexity of these longitudinal models, we conducted Monte Carlo simulations to assess (1) statistical power to detect the true parameter values and (2) model reliability and stability (sensitivity analysis). Conventional analytical methods for power analysis (e.g., multivariate methods and exact tests) cannot handle the complexities of the growth modeling employed in the current study because they rely on a priori, deterministic calculations (Arend and Schäfer, 2019; Carsey and Harden, 2014). Moreover, running Monte Carlo simulations with varying sample sizes to verify model reliability and stability is an effective form of sensitivity analysis for corroborating replicability in complicated mixed-effects and growth models (Lee and Hong, 2021). Hence, Monte Carlo simulation was adopted to conduct the power test and sensitivity analysis for Model 3B, the final growth model in the current study (see Footnote 2).

For the power test, 4000 replications were drawn by repeated random sampling (seed = 12345) at the original sample size (N = 3116) to generate synthetic datasets (see Footnote 3). The parameter estimates of the fitted PP-LGCM Model 3B were used as input values in both the population model and the analysis model to assess their statistical power across replications. All focal parameters, namely the regression effects among the developmental and growth trajectories of youths’ educational expectations and science performance and STEM degree completion, had statistical power above 0.80, ranging from \(\hat{P}=0.897\) to 1.000 (Table 5). Bias in parameter estimates (0–1.91%) and standard errors (0–3.62%) fell below the 5% benchmark, indicating high precision. The coverage rates of the 95% confidence intervals (95% Coverage), that is, the proportion of replications whose interval contains the true population value, were between 0.942 and 0.963, well within the 0.91 to 0.98 threshold. In addition, the average model fit across replications was excellent: CFI = 1.000, RMSEA = 0.004, SRMR = 0.004. The results of the power test indicate that the sample size (N = 3116) employed in the current study is more than adequate to detect the true parameter values (see Footnote 4). To ensure simulation reliability, we repeated the simulation analyses with 5000 and 10,000 replications. The results of these replications all met the required Monte Carlo criteria.
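
The four criteria used here (parameter bias, standard error bias, coverage, and power) can be computed from the replication-level estimates and standard errors. The Python sketch below shows one common way to do so for a single parameter. It is not the Mplus MONTECARLO routine used in this study: the toy data-generating step estimates a simple mean rather than a PP-LGCM parameter, and the function names, sample size, and number of replications are assumptions chosen to keep the example fast.

```python
import random
import statistics


def summarize_replications(estimates, std_errors, true_value, z_crit=1.96):
    """Summarize Monte Carlo replications for one parameter."""
    n_reps = len(estimates)
    mean_est = statistics.fmean(estimates)
    empirical_sd = statistics.stdev(estimates)  # SD of estimates across replications
    mean_se = statistics.fmean(std_errors)
    # Coverage: share of 95% intervals that contain the true value.
    covered = sum(
        est - z_crit * se <= true_value <= est + z_crit * se
        for est, se in zip(estimates, std_errors)
    )
    # Power: share of replications in which the estimate is significant.
    significant = sum(abs(est / se) > z_crit
                      for est, se in zip(estimates, std_errors))
    return {
        "parameter_bias_pct": 100.0 * (mean_est - true_value) / true_value,
        "se_bias_pct": 100.0 * (mean_se - empirical_sd) / empirical_sd,
        "coverage": covered / n_reps,
        "power": significant / n_reps,
    }


def simulate_mean_estimates(true_mean, sd, n, n_reps, seed):
    """Toy data-generating step: estimate a mean and its SE in each replication."""
    rng = random.Random(seed)
    estimates, std_errors = [], []
    for _ in range(n_reps):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        estimates.append(statistics.fmean(sample))
        std_errors.append(statistics.stdev(sample) / n ** 0.5)
    return estimates, std_errors


if __name__ == "__main__":
    # Toy settings chosen for speed; they are not the study's design.
    ests, ses = simulate_mean_estimates(true_mean=0.15, sd=1.0, n=400,
                                        n_reps=500, seed=12345)
    for key, value in summarize_replications(ests, ses, true_value=0.15).items():
        print(f"{key}: {value:.3f}")
```

The returned values can then be compared with the benchmarks stated above: bias below 5%, coverage between 0.91 and 0.98, and power of at least 0.80.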

Table 5 Results of Monte Carlo simulation for power test of Model 3B (Replications = 4000).

We further conducted a Monte Carlo sensitivity analysis to test model robustness by simulating smaller and larger samples (N = 2500, 4000, 5000, and 7000; see Footnote 5), checking the reliability and stability of the growth models analyzed in the current study. Table 6 shows the results of the Monte Carlo sensitivity analysis for Model 3B. The average model fit across the varying sample sizes was excellent (CFI = 1.000, RMSEA = 0.002 to 0.004, and SRMR = 0.003 to 0.004), and parameter recovery was highly stable, as bias in parameter estimates and standard errors was below 5% at every sample size. In addition, the coverage rates of the 95% confidence intervals (0.943 to 0.960) were all within the 0.91 to 0.98 threshold, suggesting consistent coverage even with the reduced sample size. Statistical power across the different sample sizes ranged from 0.829 to 1.000, demonstrating reliability across conditions. The results of the Monte Carlo sensitivity analysis support the reliability and stability of Model 3B in terms of model structure and parameter estimates for the longitudinal relationships between developmental and growth trajectories of youths’ educational expectations and science performance in contribution to later STEM degree attainment in adulthood. Full Mplus code and result outputs of all the growth models and Monte Carlo simulation analyses for the power test and sensitivity analysis are available at https://osf.io/m4eyn/files/osfstorage.

Table 6 Results of Monte Carlo simulation for sensitivity analysis of Model 3B with N = 2500, 4000, 5000, and 7000.

Discussion

The cultivation of STEM talent is crucial for national and global socioeconomic, cultural, and technological advancement (Larkin and Lowrie, 2022; Penprase, 2020). While prior studies have identified factors influencing youth STEM development, little research has explored how middle school trajectories of educational expectations and science performance jointly shape later STEM degree attainment of youths in adulthood. Existing work often treats these factors as static rather than dynamic (Kohen and Nitzan, 2022; Premraj et al., 2021). This study demonstrates that the developmental and growth trajectories of youths’ educational expectations and science performance during middle school are critical contributors to adult STEM degree completion. Furthermore, the trajectories of science performance are significantly influenced by concurrent development and growth in educational expectations, with both pathways jointly enhancing the likelihood of earning a STEM degree. These findings underscore the need for educational strategies that nurture both academic performance and aspirational growth during middle school, as these dynamic trajectories collectively lay the groundwork for long-term STEM success.

Educational expectations are a well-established driver of academic performance (Park, 2021; Reynolds and Johnson, 2011). The current study extends this understanding by demonstrating that higher initial development of general educational expectations in middle school predicts better development and steeper growth in science performance during early adolescence. Critically, these general expectations, which reflect youths’ overarching beliefs about their academic potential and the value of educational attainment (Wisconsin Model; Roth, 2017), were robust predictors of later STEM degree completion, even when measured independently of STEM-specific aspirations. This aligns with SEVT’s principle that broad achievement beliefs scaffold domain-specific motivations (Wigfield and Eccles, 2023), particularly during early adolescence when STEM identities are nascent. Further, growth in educational expectations itself significantly predicts growth in science performance. These findings are particularly salient given the heightened malleability of educational expectations during middle school compared to later developmental stages (Andrew and Hauser, 2011; Pyne et al., 2018), underscoring the importance of cultivating positive general educational expectations among students in early adolescence to bolster science achievement and long-term STEM outcomes.

Parallel-process latent growth curve modeling (PP-LGCM) analyses revealed that initial development (intercept factors) of expectations and science performance shapes their subsequent growth (slopes), respectively. While higher initial levels were associated with slower growth rates, both intercepts and slopes independently contributed to STEM degree attainment. This highlights the dual significance of fostering strong aspirational and science learning foundations and sustaining growth trajectories during middle school to support later STEM success. This study further demonstrates that growth trajectories in general educational expectations during middle school significantly predict STEM degree attainment in adulthood. This relationship operates through two key, interrelated mechanisms: (1) General expectations foster persistence in science learning, building foundational competence necessary for advanced STEM study (as evidenced by mediation via science performance trajectories); and (2) They establish a credentialing pathway—youths with higher expectations are more likely to pursue and complete college degrees, thereby accessing the institutional gateway to STEM majors (Bass and Besen-Cassino, 2016; Khattab et al., 2022). These findings corroborate evidence that the link between adolescent expectations, science proficiency, and adult STEM achievement is inherently developmental and cumulative (Dokme et al., 2022; Miller and Pearson, 2012). Given the formative role of middle school in shaping educational and career motivations and directions, policymakers and educators must prioritize these dynamic trajectories when designing interventions (Bozick et al., 2010; Penprase, 2020). To this end, fostering supportive ecosystems—encompassing schools, families, and communities—is essential for cultivating expectations and science engagement during early adolescence, thereby anchoring equity-driven educational reforms (Ji, 2021; Zhou et al., 2019).

Latent mediation analyses revealed that both the developmental and growth trajectories of science performance mediated the effects of educational expectations on STEM degree attainment, both collectively and independently. These findings offer actionable insights for advancing science education and STEM equity (Anderson and Li, 2020; Penprase, 2020). The significant relationship between general educational expectations and STEM attainment challenges the assumption that only domain-specific motivations (e.g., science self-efficacy) are relevant for STEM pathways. Instead, it highlights the foundational role of broad academic aspirations in creating the preconditions for STEM engagement: Sustained effort in foundational science courses (mediating the expectation-STEM link) and progression into higher education where STEM specialization occurs. Beyond fostering direct engagement in STEM, policymakers and educators should design programs that cultivate general educational expectations as foundational drivers of motivation—for instance, through mentorship initiatives or project-based learning—to amplify student agency and sustained growth in science proficiency. While the indirect path linking growth in educational expectations (slopes) to STEM attainment via science performance growth (slopes) was nonsignificant, the total indirect effect—combining initial levels and growth trajectories—underscores the cumulative and progressive role of middle school expectations and science development in shaping adult STEM outcomes. This highlights the need for early, targeted efforts to strengthen general educational expectations at the onset of adolescence. Such efforts may catalyze cascading gains in both educational expectations and science performance, ultimately enhancing STEM achievement. 
In fact, by modeling general educational expectations as dynamic trajectories, this study reveals how their cumulative development during middle school—propelled by recursive feedback loops where early academic success reinforces aspirations (Eccles and Wigfield, 2020)—creates momentum toward college completion, within which STEM degree attainment becomes a viable pathway. This broader credentialing effect complements the direct motivational influence on science mastery, underscoring why general expectations uniquely predict long-term STEM degree outcomes even after accounting for science performance.

The current study provides robust evidence for the long-term effects of youths’ educational expectations and science performance during middle school on their later STEM degree completion in adulthood. Using a nationally representative longitudinal sample, we found that both the initial development (intercepts) of and the growth (slopes) in educational expectations and science performance significantly predicted STEM attainment, with standardized effects ranging from β = 0.045 to 0.150. While these effects may appear modest by conventional benchmarks (e.g., Cohen’s guidelines), they are substantively meaningful given the long-term, population-level nature of the study. For example, a 1-standard-deviation increase in the intercept of science performance (β = 0.150) corresponds to roughly 16% higher odds of STEM degree completion, which could translate to thousands of additional STEM graduates in a national cohort. While few studies have examined the evolving trajectories of educational expectations and science performance as predictors of STEM attainment in adulthood, our findings resonate with, yet critically extend, prior research examining static predictors of STEM outcomes. For example, Jiang and Simpkins (2024) reported comparable effect sizes (β = 0.11–0.15) for students’ self-concept in mathematics and science abilities, measured at discrete time points (grades 9 and 11), on STEM major enrollment. Similarly, Moakler and Kim (2014) identified modest effects of academic confidence (β = 0.14) and high school GPA (β = 0.15) on STEM major selection in cross-sectional analyses. However, by modeling educational expectations and science performance as dynamically evolving trajectories rather than static snapshots, our study uniquely reveals that their rates of change (slopes: β = 0.087 and 0.045, respectively) independently contribute to long-term STEM attainment.
This underscores the necessity of conceptualizing academic motivation and achievement as fluid processes, where cumulative gains (or declines) over time amplify their impact on career pathways. These results challenge the reliance on single-timepoint measurements in prior literature and advocate for educational policies and interventions that nurture both initial competencies and sustained growth in science engagement and aspirations during adolescence.

Our findings underscore the critical role of educational expectations and science performance during middle school in shaping later STEM degree attainment. To leverage these insights, schools should integrate aspirational support with skill-building, for instance through mentorship initiatives or project-based learning (PBL) workshops that connect classroom science to real-world challenges for middle school students (Larkin and Lowrie, 2022). In addition, schools should consider pairing academic skill-building in science with aspirational interventions (Ayuso et al., 2022; English, 2017), such as growth mindset training and partnership schemes between schools and STEM industries, to boost students’ confidence and aspirations as well as their science competence. Teachers and educators should also be equipped to recognize and support both the aspirational and the science learning development of middle school students, which is crucial for providing timely assistance and remediation so that struggling learners can catch up in their aspirational and STEM development (Ozulku and Kloser, 2024). More importantly, longitudinal tracking of, and accountability for, students’ development and progression in educational expectations and academic performance (including science learning) during middle school are pivotal for monitoring academic readiness and informing resource allocation to support students’ STEM learning trajectories. This is critical for cultivating the next generation of STEM professionals, given the cumulative and progressive nature of STEM development found in this study. Moreover, policymakers must prioritize funding for middle school STEM resources, particularly in underserved communities, to address equity gaps (English, 2017; Larkin and Lowrie, 2022). Early identification systems can flag students at risk of declining engagement, enabling timely interventions such as mentorship and learning guidance initiatives. 
By aligning educational practices with these dual pathways of educational expectations and science performance, stakeholders can cultivate a robust STEM pipeline, ensuring students are well prepared earlier to meet future workforce demands.

This study also highlights contextual influences on youths’ educational expectations, science performance, and STEM degree attainment. Male youths demonstrated higher STEM degree attainment and stronger initial development in science performance, reflecting persistent gender disparities often attributed to gendered academic preferences (de las Cuevas et al., 2022; Dokme et al., 2022; Owen, 2023). However, recent studies challenge these patterns, reporting comparable science interest and ability across genders (Zhao and Perez-Felkner, 2022). Family background also played a pivotal role: youths from two-parent families and from households of higher socioeconomic status (SES) exhibited greater STEM attainment and stronger initial development in academic trajectories. This underscores the role of home resources, family socialization, and parental human capital as critical assets for STEM development (Ji, 2021; Luo et al., 2022; Miller and Pearson, 2012). Ethnic disparities further emerged, with Asian youths showing higher baseline expectations and STEM attainment, a pattern linked to cultural emphasis on academic achievement and communal educational support (Feliciano and Lanuza, 2016; Portes and Rumbaut, 2014; Toyokawa and Toyokawa, 2019). Collectively, these findings call for policies that prioritize equitable resource allocation to mitigate contextual barriers and ensure inclusive pathways to STEM success. Such measures are important for transforming education into a true equalizer, empowering students from diverse family, ethnic, and societal backgrounds.

Conclusion

Utilizing longitudinal data on a representative sample of middle school youths from the Longitudinal Study of American Youth (LSAY), spanning early adolescence to adulthood, the current study corroborated that youths’ educational expectations developmentally enhance science performance during middle school, and that the two together later foster STEM degree completion. Furthermore, youths’ science performance in middle school emerged not only as a direct predictor of later STEM degree attainment but also as a mediator linking educational expectations to the STEM outcome. Nevertheless, some limitations of the current study should be noted. First, although the LSAY sample is nationally representative, its exclusion of private school students leaves unclear whether developmental and growth trajectories of expectations, science learning, and STEM attainment differ across educational settings. Hence, replication with recent, multi-cohort data including students in both public and private schools is critical to validate these dynamics in modern contexts. Second and more importantly, the findings of this study should be interpreted in light of potential cohort effects and the historical context of the LSAY data, which spanned 1987–2011. The experiences of youth participants were shaped by era-specific factors, such as pre-internet educational practices and STEM policies that have since been superseded by contemporary reforms (Larkin and Lowrie, 2022; Penprase, 2020). Consequently, generalizability to current educational environments, which are marked by technological integration (e.g., AI tools, online learning) and emerging equity initiatives, may be limited. Third, educational systems, particularly in science and STEM fields, have undergone significant reforms over the past decade (Ayuso et al., 2022; Klees, 2018). Consequently, contemporary data reflecting these shifts are essential to validate the generalizability of this study’s findings. 
Fourth, while this study examines the developmental trajectories of youths’ educational expectations and science performance during middle school as predictors of later STEM degree attainment, it is important to acknowledge that STEM outcomes are shaped by a complex interplay of interconnected systems, including familial, peer, societal, institutional, and intrapersonal influences. Future research should adopt multi-system frameworks to better disentangle these dynamic interactions across educational contexts (Anderson and Li, 2020; Penprase, 2020). Additionally, because LSAY used a single item to measure educational expectations, future studies should employ well-validated multi-item measures of youths’ educational expectations, which would allow the current findings to be compared across different approaches to measuring this construct. Finally, although middle school represents a critical period for fostering STEM motivations, these trajectories develop dynamically across the lifespan, spanning early childhood through postgraduate education (Larkin and Lowrie, 2022; Pallas, 2003). To inform comprehensive policy and educational reforms, future research should explore how intersecting systems across various educational stages interact to shape STEM achievement, thereby enabling the design of more nuanced and effective interventions, especially for students in disadvantaged learning environments with little science stimulation.