Introduction

Despite the growing global emphasis on STEM education and its integration into the training of pre-service teachers, there remains a significant disconnect between STEM education and actual job opportunities in the field. The National Science Foundation (NSF) in the United States has highlighted the importance of STEM education in preparing students for the future workforce, yet this emphasis does not always translate into job availability or alignment with pre-service STEM education.

In the Sultanate of Oman, the need to strengthen STEM competencies among pre-service teachers has been highlighted, particularly in light of Oman’s performance in international assessments such as the Trends in International Mathematics and Science Study (TIMSS). The TIMSS 2019 results, for example, showed Oman ranking relatively low in science among participating countries, indicating a crucial need to improve science education quality. Researchers like Shahat et al. (2022) and Shahat and Al-Balushi (2023) have suggested that enhancing science education in Oman requires adequately training pre-service teachers in essential STEM skills.

To clarify the relationship between STEM education and science education, it is crucial to recognize that science education is closely linked to STEM, as it provides the foundational knowledge on which many STEM concepts are built. Science education focuses on teaching scientific principles, theories, and methodologies across disciplines such as biology, chemistry, and physics. In contrast, STEM education goes beyond this by integrating Science, Technology, Engineering, and Mathematics to foster interdisciplinary learning and real-world problem-solving (Berisha and Vula, 2021; Bybee, 2011). Key issues in STEM education include promoting interdisciplinary thinking, fostering innovation, developing problem-solving skills, and ensuring equitable access to STEM opportunities. These topics are critical for preparing students to address complex global challenges and to meet the needs of the evolving workforce. Within the field of STEM, this paper investigates several important issues, such as the integration of engineering design into science instruction and the role of science teachers in bridging the gap between theoretical knowledge and practical STEM applications. Specifically, we examine how science teachers can incorporate engineering processes to enhance students’ problem-solving skills and their preparedness for broader STEM competencies (Alvarado and Huerta, 2020; Atman et al., 2007).

Since 2019, Oman’s Ministry of Education has offered a STEM program for 10th-grade students, taught by science teachers who have undergone specific training. The program, which involves modeling machines such as cars, boats, and rockets, underscores the natural connection between science education and STEM. Science teachers, with their deep understanding of scientific principles, are well-equipped to lead these STEM activities, demonstrating the close alignment between both fields (Shahat et al., 2022, 2024; Shahat and Al-Balushi, 2023). While this study focuses on specific aspects of STEM, such as engineering design integration, it also contributes to addressing broader issues in STEM education.

The importance of integrated STEM education in preparing pre-service teachers is widely acknowledged among educators and researchers. Studies have shown that many science teachers do not feel confident teaching STEM subjects in an integrated manner. This lack of confidence is often due to a superficial understanding of STEM concepts and a lack of adequate training. For example, He et al. (2022) found that over 80% of pre-service STEAM teachers in the UK and China in their study had not attended any STEM-related training or lectures. The necessity for more robust professional preparation and institutional support in implementing integrated STEM lessons is therefore evident.

Institutional reports, such as that of the National Academies of Sciences, Engineering, and Medicine in the United States, have advocated for increased opportunities for pre-service teachers to engage in STEM-focused activities such as engineering design projects. Research by Chiu et al. (2013) and Handan and Çoramik (2022) has demonstrated that integrating STEM into pre-service teacher training can significantly enhance the teaching of STEM concepts and foster students’ interest in STEM careers.

One effective component of STEM education is engineering design, which stands at the forefront of contemporary reforms in science education. Engineering design helps students apply scientific principles to solve real-world problems, thus bridging the gap between science education and broader STEM competencies. However, studies like those by Winarno et al. (2020) and Shahat et al. (2022) have identified gaps in science teachers’ knowledge about integrating engineering design into science education. Additionally, research by Elayyan and Al-Mazroi (2020) into the challenges in implementing STEM in Oman’s science education highlighted the lack of STEM education programs for in-service teachers.

According to the National Academies of Sciences, Engineering, and Medicine, contemporary reform initiatives in science education increasingly advocate integrated teaching approaches, particularly in the United States, where engineering practices and design have been recognized as essential components in meeting K-12 science education standards. Despite these advancements, a significant challenge remains: many educators are unfamiliar with effective methodologies for teaching science through engineering design.

This study aims to contribute significantly to the advancement of STEM education in Oman by developing a tool to assess pre-service science teachers’ self-efficacy beliefs for teaching science through engineering design processes (EDPs). The remainder of this paper details the tool, discussing its creation, application, and the analysis of its measurement properties using Rasch software. The ultimate goal is to ensure that future generations are well-prepared to face the challenges of the global economy and succeed in the labor market.

Preparing science teachers in the Sultanate of Oman

In the Sultanate of Oman, science teacher preparation involves earning a bachelor’s degree in education with a science specialization or a two-year Qualified Diploma in Education for those with a related scientific bachelor’s degree. Both pathways include extensive training in educational theories, scientific knowledge, pedagogical skills, and practical classroom experience. Graduates must pass a national certification exam given by the Ministry of Education. The Ministry also offers continuous professional development for practicing teachers through courses and workshops to keep them updated with educational trends (Shahat et al., 2022, 2023).

Literature review and background

This study focuses on engineering design in STEM education. Below is a literature review providing background on the EDP, highlighting its definition, importance, and integration. This review synthesizes relevant research and shows how this study builds upon and contributes to existing scholarship, offering new insights into engineering design within STEM education.

Engineering design and science education

EDPs are integral to science education and, as Kim et al. (2019) have highlighted, the systematic and iterative nature of these processes is similar to many scientific investigations. The processes used in engineering design, which include defining problems, generating ideas, and evaluating and refining solutions, play a crucial role in scientific inquiry. Despite this, there remains an ongoing debate regarding the precise definitions used. For instance, Kim et al. (2019) view engineering design as an optimal problem-solving method, whereas Atman et al. (2005) consider it a strategy for devising solutions to existing problems. Ulrich et al. (2020) expand on these definitions, describing it as the application of scientific and mathematical principles for problem-solving and the development of new products or systems. They argue it necessitates extensive research, comprehension of the problem, and generation and evaluation of solutions, culminating in decision-making and implementation (Eggert, 2005). Lin et al. (2021) further suggest that engineering design thinking challenges existing standards and embraces a multi-perspective approach to problem-solving. Such adaptability is crucial because it directly enhances teachers’ self-efficacy. When teachers are adaptable, they feel more confident in their abilities, which enables them to effectively apply educational principles to real-world challenges. This, in turn, improves their effectiveness in conveying complex concepts to students, ensuring that students can better understand and engage with the material.

Engineering design is fundamental across all engineering disciplines and is essential for solving a variety of problems and critical to every engineer’s success (Atman et al., 2005); it has been found to aid in early problem identification, reducing failure risks, and improving safety and reliability (Ullman, 2010). Moreover, it drives the optimization of solutions, reduces costs and environmental impacts, encourages innovative problem-solving, and fuels technological progress (Karsnitz et al., 2012). Utilizing it enriches science learning at all educational levels (Marulcu and Barnett, 2013) by translating theoretical knowledge into practical applications (Eggert, 2005). These processes are central to scientific inquiry and are instrumental in boosting self-efficacy among science teachers. This connection to self-efficacy is crucial as it aligns directly with the objectives of our study, which seeks to develop a tool for measuring self-efficacy beliefs in science teaching through the lens of EDP.

The integration of engineering design in education transcends mere construction activities; it involves teaching students how to systematically organize ideas and make informed decisions, thereby creating high-quality solutions or products (Hynes, 2012). The Next Generation Science Standards (NGSS), introduced in 2013 by the National Research Council (NRC), the National Science Teachers Association (NSTA), the American Association for the Advancement of Science (AAAS), and Achieve, aim to provide a cohesive framework for science education across the United States and underscore the significance of incorporating engineering practices within scientific education. The integration proposed by the standards is designed to enhance student motivation and interest in STEM in alignment with modern career demands in engineering (Brenda, 2020). Moreover, these standards advocate for a pedagogical approach that enhances engagement and significantly bolsters teacher self-efficacy by providing teachers with the tools and methodologies to integrate engineering concepts into their teaching practices effectively.

While engineering focuses on identifying human needs and devising solutions, science primarily aims to understand natural phenomena (Bybee, 2011). Schwarz et al. (2017) advocate for including engineering in science classes, enabling students to engage in engineering design projects and apply scientific knowledge in practical ways to real-world problems. Crotty et al. (2017) identified three primary methods for incorporating engineering design into STEM units: (1) culminating projects, (2) mastery through design, and (3) consistent engineering integration. Of these, mastery through design, especially when introduced at the beginning of the unit, was found to be the most effective.

Research has shown the varied influence of engineering design on science learning. Guzey et al. (2019) observed increased learning gains through engineering design in science classes. Similarly, Berland et al. (2014) found that after engineering design-based units, students better understood and valued aspects of engineering design such as generating potential solutions. In addition, Schubert et al. (2012) reported an increase in students’ knowledge and confidence through engineering design experiments, and Crotty et al. (2017) also found enhanced student achievement when integrating engineering design into science education from the start of a unit. Kim et al. (2019) reported that engineering design connects students with scientific concepts, fosters cognitive independence, and encourages learning. Alvarado and Huerta (2020) and Cakir and Guven (2012) observed enhanced problem-solving skills and a deeper understanding of scientific concepts in students engaged in engineering design projects in physics. Selcen Guzey et al. (2016) noted that incorporating fundamental practices and concepts from engineering and science, alongside training in engineering design, provided an engaging and motivational challenge, boosting students’ interest in engineering. Practical and cognitive skills were investigated by Strimel et al. (2018), who discovered that high school students with prior engineering experience were more adept in constructing design scenarios and required less time for various cognitive processes. As a result of all these benefits, Lie et al. (2019) advocated for using engineering design activities across various disciplines, including integrated STEM education.

Research also explores the utility of EDP in teacher training and the challenges involved. The usefulness of incorporating engineering design seems clear: for example, Lin et al. (2021) showed that integrating EDP into STEM project-based learning for pre-service technology teachers was effective in fostering design thinking. Kuvac and Koc (2022) examined pre-service science teachers’ perceptions of engineering and concluded that STEM education could rectify stereotypical views about engineering. However, Mumba et al. (2023) investigated pre-service secondary science teachers’ understanding of engineering design and concluded that there is a need for deeper training and that many science teachers struggle with developing STEM-based teaching materials due to limited exposure to engineering. More evidence of this need comes from Shahat and Al-Balushi (2023), who evaluated Omani pre-service science teachers in tasks related to engineering design stages and revealed varying levels of competence. The National Academy of Engineering (NAE) in 2012 emphasized the importance of integrating engineering concepts into K-12 education, and reported a gap in teacher preparation for engineering education in K-12 schools. Hynes (2012) noted that teachers often receive inadequate preparation for teaching engineering. Chiu et al. (2013) found that successful engineering design projects in schools require extensive support for teachers and students, suggesting computer-based learning environments as a solution. Lindahl (2006) highlighted the lack of resources for engineering design as a significant challenge. These studies highlight the importance of sufficient resources and of training pre-service and in-service teachers in engineering design.

Integrating engineering design projects into science curricula comes with many challenges. For example, evaluating engineering design projects is a more complex undertaking compared with typical science education evaluations (Shahat and Al-Balushi, 2023). Time constraints in science curricula also pose challenges for integrating engineering design projects. Atman et al. (2007) identified challenges faced by students in completing engineering design projects, emphasizing the need for clear descriptions of EDP for effective integration into science education.

Engineering design processes (EDPs)

EDP frameworks offer structured methodologies essential for effectively addressing engineering challenges. The different proposed models of the EDP illustrate its diverse applications in education. The International Technology and Engineering Educators Association (ITEEA) introduced a five-step framework in 2010 that includes: (1) defining the problem, (2) researching and gathering information, (3) developing potential solutions, (4) choosing the best solution, and (5) testing and improving the solution. Similarly, the National Academy of Engineering (NAE) in 2009 in the United States proposed a six-step process, which consists of: (1) defining the problem, (2) researching and gathering information, (3) generating ideas, (4) selecting the best idea, (5) developing a prototype, and (6) testing and improving the solution.

Atman et al. (2007) offered a more detailed three-stage process with ten activities: (1) defining the problem’s scope by identifying needs, defining the problem, and collecting information; (2) developing alternative solutions through idea generation, modeling, feasibility analysis, and evaluation; and (3) ensuring project quality, which involves decision-making, communication, and implementation. Figure 1 illustrates these stages as described by Atman et al. (2007). Hynes (2012) introduced an eight-step process, including identifying the problem, researching, developing solutions, selecting the best solution, building a prototype, testing and evaluating, communicating, and redesigning. In contrast, Marulcu and Barnett (2013) condensed the EDP into six steps: defining the problem, searching for solutions, choosing the best solution, building and testing a prototype, and iterating steps for refinement. Each framework provides unique insights and steps, offering a clear, structured method that is adaptable to various educational and professional scenarios and thereby enhancing problem-solving skills and practical application.

Fig. 1: A diagram of the Engineering Design Processes by Atman et al. (2007), with three stages: ‘Problem Scoping,’ ‘Developing Alternative Solutions,’ and ‘Project Realization’. Each stage lists key activities, with arrows showing progression from one stage to the next.

Chiu et al. (2013) created a comprehensive model (Fig. 2) to help students understand engineering design, encompassing six key steps:

  1. Clarifying Design Specifications and Constraints: Students select specific specifications and constraints to consider while developing solutions, including time, cost, size, and surface area limitations.

  2. Developing Knowledge: This involves investigation and inquiry to examine related concepts for solving the problem.

  3. Ideating Solutions: Students propose every idea as valid and testable, fostering innovation and generating appropriate solutions.

  4. Building a Prototype: From various potential solutions, students choose and construct either virtual models or realistic prototypes.

  5. Testing and Evaluating the Design: Students test their designs against project standards and share, present, and critique them among peers.

  6. Improving the Design: Based on evaluation outcomes and feedback from teachers and peers, students refine their designs for improvement.

Fig. 2: A circular diagram of the Informed Engineering Design Model labeled ‘Design Cycle,’ featuring six stages: ‘Refine Design,’ ‘Specifications Constraints,’ ‘Develop Knowledge,’ ‘Ideate Solutions,’ ‘Build Prototype,’ and ‘Test and Evaluate Design’. Arrows show a clockwise flow, with external factors like ‘Design Challenge,’ ‘Design Solution,’ and others.

This model by Chiu et al. guides students through a systematic EDP, from conceptualization to refinement, ensuring deep engagement with engineering principles.

A review by Winarno et al. (2020) of 40 international journal articles highlighted variations in the EDP steps used in science education, with the most common steps being problem identification, construction, testing, evaluation, and redesign. Similarly, Shahat et al. (2022) outlined a comprehensive EDP involving defining the problem, conducting research, developing and choosing solutions, creating detailed designs, building and testing prototypes, improving the design, implementing the solution, and evaluating results.

The EDP frameworks are essential for addressing complex engineering problems and play a pivotal role in enhancing educational practices. Teaching methodologies that utilize the EDP can be finely crafted to interlink with key concepts, thereby enhancing teacher self-efficacy. By effectively applying the EDP in their teaching, educators gain confidence in their ability to teach complex topics, which improves their overall effectiveness across various educational stages. A core aspect of the EDP—defining and contextualizing problems—is a fundamental engineering skill that is equally vital in educational environments. This skill directly enhances a teacher’s capability to plan lessons and deliver relevant and intellectually engaging content, thereby boosting their confidence and effectiveness in the classroom.

Educators can significantly refine their instructional strategies by incorporating EDP steps into teaching methodologies. This integration fosters an environment that encourages critical thinking and problem-solving among students and enhances the overall educational impact by making learning more dynamic and applicable to real-world scenarios. These frameworks, therefore, serve as a bridge between theoretical knowledge and practical application, which is crucial for preparing students to tackle future challenges effectively.

Moreover, these models underscore the role of the EDP in promoting critical thinking and problem-solving skills within the realms of science and engineering education. Research by Lin et al. (2021) reveals that different educational levels prioritize various aspects of the EDP. Elementary students often focus on idea generation, high school students on predictive analysis and testing, and pre-service teachers on building and redesigning prototypes. This differentiation highlights the adaptability of the EDP to cater to diverse educational needs and levels, providing a tailored approach that enhances both teaching and learning experiences. By understanding and taking into account these varied focal points, educators can more effectively prepare students for the complexities of modern scientific and engineering challenges.

Assessing student achievements in engineering design is challenging yet essential. Xing et al. (2021) noted difficulties in tracking and evaluating students’ design processes. Performance assignments or projects, as suggested by Wind et al. (2019) and Anwar et al. (2022), offer comprehensive assessments of students’ EDP skills. Rubrics provide a structured assessment framework, assessing various EDP steps, as seen in the Creative Engineering Design Assessment (CEDA) developed by the College of Psychology and Engineering at Florida Tech in the United States (Charyton et al., 2008). Self-assessment also plays a critical role, with Sulaeman et al. (2021) highlighting its impact on student engagement and understanding of design skills. In addition, computerized systems and online assessments, like those used by Sikurajapathi et al. (2020), offer time-saving standardized methods for formative and summative assessment. Continuous feedback, as observed by English and King (2015), further enhances students’ abilities and understanding of the EDP. By integrating these assessments with regular feedback, educators can provide a more effective and comprehensive learning experience, fostering deeper comprehension and application of the EDP.

These theoretical foundations are closely connected with the development of our self-efficacy measurement tool. Each item in the questionnaire is designed to assess teachers’ confidence across different stages of the EDP. For example, items may ask teachers to rate their confidence in their ability to generate and evaluate potential solutions to problems posed in the science curriculum. This approach ensures that the tool is deeply rooted in the practical application of engineering design principles, providing a robust measure of self-efficacy that is directly linked to teachers’ day-to-day teaching experiences.

In summary, this study aims to measure preservice teachers’ self-efficacy beliefs for teaching science through EDPs.

Research aim

To date, no research in Arab nations has focused on evaluating the self-efficacy beliefs of preservice science teachers regarding the instruction of science through EDPs. Recognizing the importance of this area, we developed the Self-Efficacy Beliefs for Teaching Science through Engineering Design Processes Questionnaire (SEBTSEDPQ). This instrument aims to gauge the understanding and implementation of specific EDPs by Omani preservice science teachers in their science classrooms. Earlier discussions in our study have highlighted how such data could be pivotal for educational reforms in Oman. Therefore, the primary objective of our research was to devise items for the SEBTSEDPQ, gather and analyze data using this tool, assess the scale’s measurement effectiveness, and utilize the resulting Wright map to pinpoint areas of science education in Oman in need of enhancement. While this study is specific to Oman, its outcomes are valuable to science education, particularly in the context of preservice science teacher training, in Arabic-speaking countries and beyond.

Methodology

General background

This study employed a quantitative methodology, integrating survey data and Rasch psychometric analysis, to explore preservice science teachers’ confidence in teaching science through EDPs. Conducted from September to December 2022, the research received ethical approval from the Sultan Qaboos University Institutional Review Board and the Research Ethics Committee of the Ministry of Education in Oman, affirming adherence to the highest ethical standards. The approval was granted under the recognition that all aspects of the study strictly comply with international guidelines and regulations for research involving human participants such as the Declaration of Helsinki. This approval certifies that our research design, including participant engagement and data management, was thoroughly reviewed, deemed ethical, and compliant with relevant guidelines/regulations.

Participants’ involvement was voluntary, with assurances provided that their participation would not affect their professional roles. It was explicitly communicated that the aim of data collection was purely for research purposes. To safeguard confidentiality and comply with ethical standards, all data were pseudonymized. This process ensured that while data could be linked for analytical purposes, it was impossible to trace back any information to individual teachers, thus maintaining strict anonymity.

Regarding informed consent, our study meticulously ensured that consent was obtained from all participants or their legal guardians before participation. This process was carried out in accordance with ethical guidelines that emphasize respect for participant autonomy and informed decision-making. Participants were fully informed about the study’s objectives, methods, potential impacts, and their rights, including the right to withdraw at any time without any repercussions. In cases where consent was deemed unnecessary, a detailed justification was provided based on specific exemptions recognized by the Research Ethics Committee, ensuring that such decisions were in line with ethical guidelines and did not compromise the welfare or rights of participants.

Sample selection

The study’s sample comprised 326 preservice science teachers from five Omani universities: Sultan Qaboos University (n = 73), Nizwa University (n = 62), Rustaq College (n = 70), A’Sharqiyah University (n = 56), and Dhofar University (n = 55). These participants consented to complete the survey and take part in the study. They were chosen through a readily available sampling technique, ensuring a representative selection across various categories, including gender and the specific teacher education program completed. As described by Kadilar and Cingi (2005), this method effectively combines random selection with categorization, facilitating both quantitative and qualitative research approaches. Table 1 provides information regarding the sample.

Table 1 Sample statistics.

Instrument and data collection

The SEBTSEDPQ instrument was developed from the standardized steps of engineering design for science education outlined by Shahat et al. (2022). In adapting this instrument, we considered the cultural nuances, educational contexts, and the predominant use of Arabic in Oman. To assess content validity, Omani experts reviewed the instrument in light of the existing literature, evaluating the accuracy and relevance of each item within the dimensions. Cohen’s kappa (κ) was employed to gauge inter-rater agreement among these evaluators, and the instrument demonstrated acceptable levels, with values ranging from 0.65 to 0.79. The SEBTSEDPQ comprises 36 items on the Personal Self-efficacy (SE) scale and 31 items on the Outcome Expectancy (OE) scale, as detailed in Table 2. Our study adhered to the definitions provided by the Friday Institute for Educational Innovation (FIEI, 2012), which describes personal self-efficacy as the self-assurance and confidence a teacher has in their ability to teach a specific STEM subject, and outcome expectancy as the extent to which an individual believes that the actions of teachers generally influence student learning within that particular STEM subject. Based on expert feedback, minor adjustments were made to three items; for example, a behavioral verb was placed at the start of item 4. For linguistic accuracy, the SEBTSEDPQ underwent translation and back-translation between Arabic and English by six independent professional translators. The finalized English versions of these statements are presented in Table 2. Responses were coded on a scale from ‘strongly agree = 5’ to ‘strongly disagree = 1’, with reverse coding applied to negatively worded items (Aiken, 1997).
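The coding and agreement steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used in the study: the function names, the sample responses, and the two-expert "relevant / not relevant" judgments are all hypothetical, and the source does not say how many experts took part or how pairwise kappa values were combined.

```python
from collections import Counter


def reverse_code(score, low=1, high=5):
    """Reverse a Likert score so that 5 becomes 1, 4 becomes 2, and so on."""
    return low + high - score


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who judged the same set of items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical responses to five items; the third and fourth are assumed
# to be negatively worded and are therefore reverse coded.
responses = [5, 4, 2, 1, 3]
negatively_worded = [False, False, True, True, False]
coded = [reverse_code(r) if neg else r
         for r, neg in zip(responses, negatively_worded)]

# Hypothetical relevance judgments from two experts on ten items.
expert_1 = ["relevant", "relevant", "not relevant", "relevant", "relevant",
            "not relevant", "relevant", "relevant", "not relevant", "relevant"]
expert_2 = ["relevant", "relevant", "not relevant", "relevant", "not relevant",
            "not relevant", "relevant", "relevant", "relevant", "relevant"]
kappa = cohens_kappa(expert_1, expert_2)

print(coded)
print(round(kappa, 2))
```

Cohen's kappa is defined for a pair of raters, so with more than two experts one would either report pairwise values or switch to a multi-rater statistic; which of these applies here is not stated in the source.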

Table 2 The items in the SEBTSEDPQ.

Following the expert assessment, the subsequent step involved a psychometric analysis to assess the performance of the tool. Data gathering took place, and the characteristics of the newly developed instrument were examined. This tool was deployed as an online questionnaire via the Google Forms application (refer to Table 1 for additional information on the sample). The digital survey was distributed to all aspiring science educators at five chosen universities across the Sultanate of Oman.

Analysis techniques

To address the research questions, we analyzed the measure with Statistical Discovery Software. We used the Rasch model to evaluate the underlying psychometric properties of the SEBTSEDPQ survey, which comprises a 36-item Self-Efficacy (SE) instrument and a 31-item Outcome Expectancy (OE) instrument. Specifically, Rasch analysis was used to determine which items would be retained or deleted.

Rasch psychometric techniques

The self-efficacy (SE) data and the outcome expectancy (OE) data were evaluated utilizing Rasch psychometric techniques (Boone et al., 2014; Boone and Staver, 2020; Rasch, 1960; Wright and Masters, 1982; Wright and Stone, 1979). In the last 30 years, Rasch methods have been extensively utilized in the field of science education (Shahat et al., 2022; Soeharto and Rosmaiyadi, 2018) as well as many other fields such as medicine (e.g., Tennant and Conaghan, 2007).

Rasch analysis is especially useful for working with data collected from rating scale instruments. There are many reasons to utilize Rasch’s methods. First, Rasch does not treat raw rating scale data as if it is linear. Consider ‘Strongly Agree’ being coded with a 5, ‘Agree’ with a 4, and ‘Neutral’ with a 3. A common procedure is to evaluate the rating scale data as if the data were linear (to commence computations with the ‘raw’ numbers 5, 4, and 3). However, this can lead to misleading results as all we actually know for certain is that ‘Strongly Agree’ means there is more agreement than ‘Agree’, and that ‘Agree’ means more agreement than ‘Neutral’. The jump from ‘Neutral’ to ‘Agree’ cannot be assumed to be the same sized jump as from ‘Agree’ to ‘Strongly Agree’. Rasch methods allow one to evaluate ordinal rating scale data and ultimately compute person ‘measures’ and item ‘measures’ on a linear scale. Such linear measures are a requirement if one wishes to use parametric statistical tests in an analysis.

Another strength of using Rasch methods is that all items are not assumed to be of the same ‘agreeability’. For example, there is no assumption that an ‘Agree’ (4) on item 7 of the SE scale means the same level of agreement as an ‘Agree’ (4) on item 10. Another way to appreciate this point is that a ‘Strongly Agree’ (5) for item 25 by one respondent might indicate the same overall confidence for the same respondent as an answer of ‘Agree’ (4) to item 15 on the SE survey. This is because all items do not measure the same part of a trait.

Rasch techniques also provide many analyses (e.g., fit, person reliability, item reliability, analysis of category probability curves, Wright maps) that can be utilized to evaluate the measurement functioning of an instrument. These techniques go far beyond the computation of a Cronbach alpha or the use of factor analysis. Rasch methods are commonly used for evaluating rating scale data as they often allow results to be graphically organized to provide greater explanatory power. For example, Wright maps can be constructed to present the ordering of items for an instrument (Brann et al., 2021). Such maps provide a very informative profile regarding respondents’ attitudes and can provide a mechanism to evaluate the construct validity of an instrument; because of this, Wright maps have been used to guide reform efforts (ibid).

The most widely utilized Rasch software program, Winsteps (Linacre, 2023), was used for the data analysis presented in this paper. The Rasch rating scale model was used for the analysis (Andrich, 1978). Numerous researchers have utilized this Rasch model formulated for rating scales (e.g., Fitzpatrick and Hill-Briggs, 2015; Oliveria et al., 2014). Details concerning the Rasch rating scale model, as presented by Planinic et al. (2019) and as discussed by Brann et al. (2021), are provided below.

In the rating scale model, each item has several rating scale categories. The probability that a person n endorses category j of item i over the previous category (j − 1), or the probability of being observed in category j, can be represented by the Rasch-Andrich rating scale model:

$$\mathrm{ln}\left(\frac{{P}_{nij}}{{P}_{ni(j-1)}}\right)={B}_{n}-{D}_{i}-{F}_{j}$$

In this equation, Pnij is the probability that person n is observed in category j of item i, Bn is the measure of person n, and Di is the difficulty of item i, with measures and difficulties expressed in logits. Fj represents the Rasch-Andrich threshold, also known as the step calibration. It signifies the point on the latent variable scale where the likelihood of a person n being observed in category j for item i is equal to the likelihood of them being observed in the preceding category (j − 1). The estimation of Fj is based on the frequency of each category and on item difficulty. Item difficulty, in turn, is derived from comparing the highest and lowest categories: it is located at the point where the probability of observing person n in the highest category matches the probability of observing them in the lowest category, as outlined by Brann et al. (2021).
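To make the rating scale model concrete, the sketch below computes the probability of each response category for one person and one item. It follows directly from the log-odds equation above: summing the terms (Bn − Di − Fj) up to category j gives the log-numerator of that category. The function name and the threshold values are illustrative assumptions, not estimates from our data; categories are indexed 0 to 4, corresponding to the coded responses 1 to 5.

```python
import math


def rsm_category_probabilities(person_measure, item_difficulty, thresholds):
    """Category probabilities (categories 0..m) under the Rasch-Andrich rating scale model.

    thresholds holds F_1..F_m in logits; the lowest category has no threshold.
    """
    # Cumulative sums of (B_n - D_i - F_j) give each category's log-numerator.
    log_numerators = [0.0]
    for f_j in thresholds:
        log_numerators.append(log_numerators[-1] + person_measure - item_difficulty - f_j)
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_numerators)
    weights = [math.exp(value - peak) for value in log_numerators]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical thresholds for a five-category scale (not estimates from this study).
example_thresholds = [-2.0, -0.5, 0.5, 2.0]
probabilities = rsm_category_probabilities(0.5, 0.0, example_thresholds)
print([round(p, 3) for p in probabilities])
```

Because the person sits exactly at the third threshold in this example (Bn − Di = F3 = 0.5), the two adjacent categories on either side of that threshold come out equally probable, which is the defining property of a Rasch-Andrich threshold.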

Results

The SEBTEDQ survey presented to the sample of respondents can be viewed as a 36-item Self Efficacy (SE) instrument, and a 31-item Outcome Expectancy (OE) instrument. Utilizing Rasch analysis techniques, we present our analysis for both instruments. The following discussion provides our interpretation of the psychometric analyses.

Fit

An important step that must be conducted when Rasch techniques are utilized is an evaluation as to whether the data ‘fit’ the Rasch model. When data fit, it is evidence of unidimensionality, meaning that a set of items defines a single trait. This is critical if item responses are to be pooled to provide an overall person measure. Unidimensionality is a requirement of the Rasch model and a necessity if one hopes to measure with an instrument. There are numerous fit statistics utilized in Rasch analyses. We utilized the fit statistics of MNSQ Outfit and MNSQ Infit. Researchers have used a variety of cut-off values to flag items that may not ‘fit’ the Rasch model. Such misfitting items might not be part of the same dimension as the other items. We specified fit values of 0.5 to 1.5. Items with fit values lower than 0.5 or higher than 1.5 were deemed to not fit the Rasch model (O’Connor et al., 2016). In our analysis of the 36 items of the SE scale, 3 items (Q14, Q26, Q33) did not fit the model. These items were removed from the analysis. The remaining 33 items of the SE scale did fit the model and represent the final SE instrument of our study. Table 3 presents the final MNSQ fit values of the 33 items from the SE instrument that did fit the model. Analysis of the 31-item OE scale suggested that one item (Q1) did not fit the Rasch model. The remaining 30 items of the OE scale did fit the model. These 30 items make up the final OE scale. Table 4 presents the MNSQ fit values for the 30 OE items that did fit the model.
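For readers unfamiliar with how the mean-square fit statistics are formed, the following sketch computes Infit and Outfit MNSQ for a single item and applies the 0.5 to 1.5 window described above. It assumes that the model-expected score and model variance for each response are already available (in practice, software such as Winsteps derives them from the estimated measures); the function names and the numbers are hypothetical.

```python
def mean_square_fit(observed, expected, variance):
    """Return (infit, outfit) mean-square statistics for one item.

    observed: responses given by each person
    expected: model-expected score for each person on this item
    variance: model variance of each response
    """
    squared_residuals = [(x - e) ** 2 for x, e in zip(observed, expected)]
    # Outfit: unweighted mean of squared standardized residuals (outlier-sensitive).
    outfit = sum(r / w for r, w in zip(squared_residuals, variance)) / len(observed)
    # Infit: information-weighted, i.e. summed squared residuals over summed variances.
    infit = sum(squared_residuals) / sum(variance)
    return infit, outfit


def fits_model(infit, outfit, low=0.5, high=1.5):
    """Apply the 0.5-1.5 window to both statistics; values on the bounds count as fitting."""
    return low <= infit <= high and low <= outfit <= high


# Hypothetical responses from four persons to one item (illustrative values only).
infit, outfit = mean_square_fit(
    observed=[4, 2, 5, 3],
    expected=[3.5, 2.5, 4.0, 3.0],
    variance=[0.25, 0.25, 1.0, 1.0],
)
print(round(infit, 2), round(outfit, 2), fits_model(infit, outfit))
```

Values near 1.0 indicate that the responses vary about as much as the model predicts; values well above 1.0 indicate more noise than expected, and values well below 1.0 indicate responses that are more predictable than the model expects.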

Table 3 Summary of the Rasch Item Statistics for the Self-Efficacy Scale (SE).
Table 4 Summary of the Rasch Item Statistics for the Outcome Expectancy Scale (OE).

Added analysis of dimensionality

Above, we presented an analysis of dimensionality using fit statistics. Dimensionality was also evaluated by conducting a Principal Component Analysis of Residuals (PCAR) and assessing point-measure correlations. PCAR allows one to investigate whether there is a higher level of noise in the data than one would expect if the items measured a single trait. If there is a higher level of noise present, then follow-up tests are conducted as outlined in Boone and Staver (2020). Many researchers conducting a Rasch analysis consider PCAR techniques (Brentari and Golia, 2007). The PCAR analysis for the self-efficacy data and the outcome expectancy data suggested that the unexplained variance in the first contrast exceeded the targeted value of 2.0 (Boone and Staver, 2020). This required follow-up PCAR analyses to explore dimensionality. For example, a review of the item content as a function of PCAR identified item clusters. For these two data sets, further PCAR analyses of these potential groups of items did not suggest more than one variable.

Dimensionality can also be evaluated using point-measure correlations. Li et al. (2016) recommended a benchmark of 0.3 for establishing unidimensionality, viewing items with point-measure correlations above 0.3 as acceptable. In our analysis, all items in both the self-efficacy scale and the outcome expectancy scale exhibited correlations above 0.3. This also supports the assertion that the set of self-efficacy items defines one trait and that the set of outcome expectancy items defines a single trait.
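A point-measure correlation is the Pearson correlation between the responses to one item and the Rasch measures of the persons who gave them. The sketch below shows the computation for a single item and the comparison against the 0.3 benchmark; the function name and the data are hypothetical and serve only to illustrate the calculation.

```python
import math


def point_measure_correlation(item_responses, person_measures):
    """Pearson correlation between one item's responses and the person measures."""
    if len(item_responses) != len(person_measures) or len(item_responses) < 2:
        raise ValueError("need two equally long sequences with at least two values")
    n = len(item_responses)
    mean_x = sum(item_responses) / n
    mean_y = sum(person_measures) / n
    covariance = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(item_responses, person_measures)
    )
    var_x = sum((x - mean_x) ** 2 for x in item_responses)
    var_y = sum((y - mean_y) ** 2 for y in person_measures)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical data: responses (1-5) of six persons and their measures in logits.
responses = [2, 3, 3, 4, 5, 4]
measures = [-1.1, -0.4, 0.2, 0.6, 1.5, 0.9]
r = point_measure_correlation(responses, measures)
print(round(r, 2), r > 0.3)
```

A positive correlation above the benchmark indicates that persons with higher measures tend to give higher ratings on the item, as expected when the item belongs to the same trait as the rest of the scale.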

Independence

Another component of a Rasch analysis concerns the evaluation of what is termed item independence. It should not be the case that items are dependent upon one another. One technique that has been used by researchers has been to review residual correlations of items to assess item independence. Fitzpatrick and Hill-Briggs (2015) have suggested that correlations below 0.70 indicate that the items are locally independent. In our analysis all items of the self-efficacy scale had correlations below 0.70. Also, all items of the outcome expectancy scale had correlations below 0.70.

Reliability

There is a range of reliability indices that can be computed for a Rasch analysis (Eggert et al., 2017). Two of the most reported indices are person reliability and item reliability (Linacre, 2023). A rule of thumb for a well-functioning instrument is that it should have a target value of 0.80 or higher for person reliability and 0.90 or higher for item reliability (Malec et al., 2007). Additionally, two separation indices can be computed (person separation and item separation). These values provide an assessment of how well the instrument can distinguish, or separate, respondents, and how well respondent responses help to distinguish groups of items (Malec et al., 2007). A commonly reported rule of thumb is a target value of 2.0 or higher for person separation, and a value of 4.0 or higher for item separation (Malec et al., 2007). These rules of thumb have been applied in other analyses (e.g., Robert et al., 2012). In our analysis, the following values were observed for the SE scale: 0.91 item reliability, 0.94 person reliability, 3.22 item separation, and 4.06 person separation. The values observed for the OE scale were as follows: 0.77 item reliability, 0.93 person reliability, 1.85 item separation, and 3.36 person separation.
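Separation and reliability are two expressions of the same information: in the standard Rasch formulation, reliability R and separation G are related by R = G² / (1 + G²). The short sketch below applies this relationship to the separation values reported above; the function names are our own, and the calculation is offered as a consistency check rather than as the procedure used by the software.

```python
import math


def reliability_from_separation(separation):
    """Reliability implied by a separation index: R = G^2 / (1 + G^2)."""
    return separation ** 2 / (1 + separation ** 2)


def separation_from_reliability(reliability):
    """Separation implied by a reliability coefficient: G = sqrt(R / (1 - R))."""
    if not 0 <= reliability < 1:
        raise ValueError("reliability must lie in [0, 1)")
    return math.sqrt(reliability / (1 - reliability))


# Separation values as reported in the text.
reported_separation = {
    "SE item": 3.22,
    "SE person": 4.06,
    "OE item": 1.85,
    "OE person": 3.36,
}
for label, g in reported_separation.items():
    print(label, round(reliability_from_separation(g), 2))
```

The implied reliabilities for SE items, SE persons, and OE items round to the reported 0.91, 0.94, and 0.77. For OE persons the implied value is about 0.92, slightly below the reported 0.93, a small difference that may reflect rounding or the particular reliability estimate that was reported.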

Rating scale effectiveness

We also evaluated the SE and OE scales using guidelines suggested by Linacre (1999, 2004). These guidelines concern evaluating the functioning of the rating scale. These guidelines include the following:

  1. At least 10 observations in each rating scale category.
  2. Regular observation distribution.
  3. Average measures advance monotonically with category.
  4. The outfit MNSQ for rating scale categories is less than 2.0.
  5. Orderly series of step calibrations that advance in a monotonic way.
  6. The distance or gap between response categories should be larger than 1.4 logits and less than 5.0 logits (O’Connor et al., 2016).

These guidelines can be evaluated using a part of the Winsteps output which we provide below in Table 5. Brann et al. (2021) is one example of a research project which has utilized these guidelines.

Table 5 Self-Efficacy (SE) and Outcome Expectancy (OE) Rating Scale Performance.

The SE analysis revealed the following in relation to the six guidelines detailed above:

  1. Every rating scale step of the SE scale has an observed count exceeding 10.
  2. The distribution of observed counts first increases and then decreases across the rating scale steps.
  3. Average measures show a steady rise, starting from −0.82 at the lowest to 1.04 at the highest.
  4. For each step of the rating scale, the Outfit MNSQ remains under 2.0.
  5. The Andrich Thresholds (or step calibrations) demonstrate a uniform and monotonic increase.
  6. All gaps are under five logits, and there is one instance of a gap being smaller than 1.4 logits. The others all exceeded 1.4 logits.

For the OE analysis the same pattern in rating scale functioning was observed as for the SE data.

For the 6 guidelines suggested by Linacre, there is only one instance of noncompliance for both the SE and OE, which was a single gap being less than the suggested gap width cutoff of 1.4 logits.
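The six guidelines can be checked mechanically once the category-level output is available. The sketch below is one possible way to do so; it assumes that ‘regular observation distribution’ is read as a single-peaked distribution of counts (one common reading, and an assumption here), and every input value is hypothetical, chosen only to mimic a scale that fails the gap-width guideline in a single place.

```python
def is_strictly_increasing(values):
    """True when each value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


def check_rating_scale(counts, average_measures, outfit_mnsq, thresholds):
    """Evaluate the six rating scale guidelines; returns guideline name -> bool."""
    peak = counts.index(max(counts))
    rising = all(a <= b for a, b in zip(counts[:peak], counts[1:peak + 1]))
    falling = all(a >= b for a, b in zip(counts[peak:], counts[peak + 1:]))
    # Gaps are the distances between adjacent Andrich thresholds, in logits.
    gaps = [b - a for a, b in zip(thresholds, thresholds[1:])]
    return {
        "at_least_10_observations": all(c >= 10 for c in counts),
        "single_peaked_counts": rising and falling,
        "average_measures_advance": is_strictly_increasing(average_measures),
        "category_outfit_below_2": all(m < 2.0 for m in outfit_mnsq),
        "thresholds_advance": is_strictly_increasing(thresholds),
        "gaps_between_1.4_and_5.0": all(1.4 < g < 5.0 for g in gaps),
    }


# Hypothetical category statistics for a five-category scale (not from Table 5).
report = check_rating_scale(
    counts=[40, 120, 300, 520, 260],
    average_measures=[-0.8, -0.3, 0.2, 0.6, 1.0],
    outfit_mnsq=[1.3, 1.1, 0.9, 0.9, 1.0],
    thresholds=[-2.6, -1.0, 0.1, 2.9],
)
for guideline, passed in report.items():
    print(guideline, passed)
```

With these illustrative inputs, only the gap-width guideline fails, because one pair of adjacent thresholds lies about 1.1 logits apart.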

Category probability curves

An added step that we took in our analysis of the functioning of our rating scales concerns the creation and evaluation of category probability curves. A category probability curve can be utilized to examine how a rating scale is performing and how the category thresholds are ordered (Linacre, 2023). In Figs. 3 and 4 we provide the curves for the SE scale and the OE scale, respectively. When a scale is functioning well, each rating scale step should be the most probable response over some portion of the measurement range: there should be a portion of the SA curve that is most probable, a portion of the curve for A that is most probable, and so on for the remaining rating scale steps (N, D, and SD). In other words, a most probable peak should be observed for each rating scale step. Indeed, a unique ‘most probable’ peak is observed for each rating scale step for both the SE scale and the OE scale.
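The ‘most probable peak’ criterion can also be checked numerically: for a grid of locations along the latent variable, one computes the category probabilities from the rating scale model and records which category is most probable at each location. The sketch below does this for two hypothetical sets of thresholds, one ordered and one disordered; neither set comes from our data, and the grid range and function names are illustrative choices.

```python
import math


def category_probabilities(relative_measure, thresholds):
    """Rating scale model probabilities at a given person-minus-item measure."""
    log_numerators = [0.0]
    for f_j in thresholds:
        log_numerators.append(log_numerators[-1] + relative_measure - f_j)
    peak = max(log_numerators)
    weights = [math.exp(value - peak) for value in log_numerators]
    total = sum(weights)
    return [w / total for w in weights]


def modal_categories(thresholds, low=-6.0, high=6.0, steps=241):
    """Set of categories that are the most probable somewhere on the grid."""
    modal = set()
    for k in range(steps):
        theta = low + (high - low) * k / (steps - 1)
        probs = category_probabilities(theta, thresholds)
        modal.add(probs.index(max(probs)))
    return modal


# Hypothetical thresholds: ordered (well-functioning) versus disordered.
print(sorted(modal_categories([-2.0, -0.5, 0.5, 2.0])))
print(sorted(modal_categories([-2.0, 1.0, -1.0, 2.0])))
```

With ordered thresholds, all five categories (0 to 4) are most probable somewhere along the scale. With the disordered set, the middle category is never the most probable response, which is the pattern that a missing peak in a category probability plot would reveal.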

Fig. 3: A Category Probability Curve for the SE, showing the probability of selecting each response category (Strongly Disagree, Disagree, Neutral, Agree, Strongly Agree) relative to item difficulty.

The x-axis measures item difficulty, and the y-axis shows category probability. The curves for each response category (‘SD’ in red, ‘D’ in blue, ‘N’ in magenta, ‘A’ in black, ‘SA’ in green) intersect across the difficulty range. This curve is based on the full set of 33 SE items.

Fig. 4: A Category Probability Curve for the OE, showing the probability of selecting each response category (Strongly Disagree, Disagree, Neutral, Agree, Strongly Agree) relative to item difficulty.

The x-axis measures item difficulty, and the y-axis shows category probability. The curves (‘SD’ in red, ‘D’ in blue, ‘N’ in magenta, ‘A’ in black, ‘SA’ in green) intersect at various points. Note: SD=Strongly Disagree, D=Disagree, N=Neutral, A=Agree, SA=Strongly Agree. This curve is based on the full set of 30 OE items.

Wright maps

Many researchers have utilized Wright maps for the presentation and communication of Rasch rating scale data (e.g., Lu et al., 2022; Shahat et al., 2022). A Wright map for the SE scale is presented in Fig. 5 and the Wright map for the OE scale is presented in Fig. 6. Items in our Wright maps are ordered from items that respondents agree with the most (items at the base of the map that have a negative logit measure) to items that were agreed with the least (items toward the top of the map that have a positive measure). Wright maps are informative in many respects, one of which is an assessment of the way items mark the trait being measured. The Wright maps for the SE and OE scales reveal that although there is some overlap of items along each trait, there are items marking the trait as one proceeds up the Wright map. This is the pattern one would like to observe with measurement instruments. Large gaps between items suggest that there are regions of the trait that are not being measured by items. Future versions of both the SE and OE instruments might attempt to provide 1–2 more items that are difficult for respondents to Strongly Agree with, which could potentially help fill the gap at the very high end of the two Wright maps.
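Gaps of the kind described above can also be located by sorting the item measures and inspecting the distance between neighbors. The following sketch does this for a handful of hypothetical items; the item labels, the measures, and the 0.4-logit cutoff are illustrative assumptions rather than values taken from Figs. 5 and 6.

```python
def find_gaps(item_measures, min_gap):
    """Return (lower_item, upper_item, gap) for neighboring items more than min_gap apart."""
    ordered = sorted(item_measures.items(), key=lambda pair: pair[1])
    gaps = []
    for (low_name, low), (high_name, high) in zip(ordered, ordered[1:]):
        if high - low > min_gap:
            gaps.append((low_name, high_name, round(high - low, 2)))
    return gaps


# Hypothetical item measures in logits (negative = easier to endorse).
example_measures = {
    "item_a": -0.60,
    "item_b": -0.55,
    "item_c": -0.10,
    "item_d": 0.05,
    "item_e": 0.70,
}
for lower, upper, gap in find_gaps(example_measures, min_gap=0.4):
    print(f"gap of {gap} logits between {lower} and {upper}")
```

Neighboring items that lie very close together (such as item_a and item_b here) mark nearly the same part of the trait, whereas the flagged gaps point to regions where an additional item could be written.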

Fig. 5: The Wright map of SE items, displaying the difficulty of endorsing each item on a logit scale.

The y-axis represents the logit values, ranging from −0.75 to 0.75, with positive values indicating items that are harder to endorse (e.g., harder to select ‘Strongly Agree’) and negative values indicating items that are easier to endorse (e.g., easier to select ‘Strongly Agree’). Items are plotted as a function of item measure, with each item labeled as ‘Q’ followed by its number (e.g., Q1, Q2) and positioned along the logit scale according to its difficulty. Q1 is shown as the hardest to endorse, while Q2 is the easiest.

Fig. 6: The Wright map of OE items, showing the difficulty of endorsing each item on a logit scale.

The y-axis ranges from −0.50 to 0.50, with positive values indicating items harder to endorse and negative values indicating items easier to endorse. Items are labeled ‘Y’ followed by their number (e.g., Y2, Y27) and positioned according to difficulty. Y27 is the hardest to endorse, while Y31 is the easiest. Items are plotted as a function of item measure.

Discussion

This study focused on the science domain within the broader STEM framework, reflecting the current state of STEM education in Oman, where science teachers primarily lead STEM instruction. While this aligns with local educational practices, it is important to clarify that STEM encompasses more than just science; it integrates Science, Technology, Engineering, and Mathematics to promote an interdisciplinary approach to learning, which is essential for solving real-world problems (Shahat and Al-Balushi, 2023). Our findings demonstrate that integrating engineering design—a key STEM element—into science classrooms reinforces scientific understanding and promotes critical STEM-related skills like critical thinking and creativity. For example, modeling machines such as cars, boats, and rockets allows students to apply scientific principles in practical contexts, addressing both theoretical and hands-on aspects of STEM education (Marulcu and Barnett, 2013; Mumba et al., 2023).

The data collected from 326 teachers provides valuable insights into how science teachers in Oman perceive and implement STEM education. Science teachers are the primary facilitators of STEM-related content, largely due to the current curriculum structure, which places significant emphasis on science education. For example, Oman’s Ministry of Education has implemented STEM programs in secondary schools that are primarily led by science teachers, with specific training programs focused on integrating engineering design and technology into science lessons. These programs include professional development initiatives such as the STEM in Schools training, which equips science teachers with skills to incorporate technology and engineering concepts into their classroom practices. This emphasis on science education reflects the pivotal role that science educators play in leading STEM initiatives (Shahat et al., 2022; Shahat et al., 2024; Shahat and Al-Balushi, 2023).

However, this science-centric approach may limit the broader applicability of the findings across the full spectrum of STEM domains—namely, technology, engineering, and mathematics (Shahat and Al-Amri, 2023). To fully capture the interdisciplinary nature of STEM education, it is essential to integrate these additional domains alongside science. For example, expanding teacher training programs to include dedicated modules on engineering and technology education would better align with the goals of STEM education. This integration would foster a more holistic approach, creating a comprehensive framework that reflects the interdisciplinary learning experiences that STEM aims to achieve. Future research should broaden its focus to include subject areas beyond science, ensuring that the full spectrum of STEM—technology, engineering, and mathematics—is represented (Honey et al., 2014).

STEM education aims to integrate multiple disciplines to build a diverse skill set in students, equipping them for challenges in the modern workforce. While science education plays a central role, an effective STEM curriculum requires collaboration across various domains (Bybee, 2011; Anwar et al., 2022). In Oman, science teachers often take on this responsibility, but the integration of other subjects, such as technology and engineering, is necessary for a holistic approach. This integration would enrich students’ learning experiences and align more closely with international best practices in STEM education (Shahat et al., 2022).

While the study offers valuable contributions, particularly through the development of the SEBTSEDQ instrument, it also has some limitations. The study was small in scale and focused primarily on science, which may restrict the generalizability of the findings across all STEM domains. It is essential to expand future research to include technology, engineering, and mathematics, areas that are equally vital to a comprehensive STEM framework. Despite these constraints, this study lays a strong foundation for broader investigations into how preservice teachers integrate interdisciplinary STEM knowledge into their classrooms.

The findings suggest that Omani science teachers may be confident in delivering traditional science content but struggle with the integration of more complex STEM elements, particularly engineering design. This gap is reflected in the misfit of several items on the SE scale, such as SE14 (“I engage in discussions with students about evaluating solutions for scientific challenges”) and SE26 (“I have confidence in my capacity to respond to students’ inquiries regarding engineering design”). These items pointed to the difficulties teachers face when facilitating student engagement with engineering design. The challenges observed in responding to student inquiries and evaluating engineering-related solutions indicate the need for more practical, hands-on training. This type of training, allowing preservice teachers to practice applying theoretical concepts in controlled environments, is critical for improving teacher self-efficacy and ultimately enhancing STEM instruction (Fantz et al., 2011; Mumba et al., 2023).

The Rasch analysis conducted in this study demonstrated that while the SEBTSEDQ instrument functions well overall, there are areas for improvement. The removal of misfitting items such as SE14 and OE1 highlights a gap between theoretical knowledge and practical application in the classroom. Teacher training programs in Oman must address these gaps by focusing not only on content knowledge but also on equipping teachers with the skills needed to facilitate student-centered learning that encourages critical thinking and real-world problem-solving (Hafiz and Ayop, 2019).

The category probability curves suggested that the SEBTSEDQ functions well overall, but the analysis also indicated that expanding the range of item difficulty could better capture the diverse experiences of teachers. While many science teachers feel confident in delivering traditional content, their self-efficacy appears to decline when they are required to integrate more interdisciplinary tasks, such as engineering design. This points to the need for differentiated professional development that addresses the varying levels of teacher preparedness for STEM education (Mumba et al., 2023).

The Wright maps further provided insights into teacher preparedness and self-efficacy. For example, the proximity of certain items (Q9, Q20, and Q25) suggested that some aspects of teacher training may be redundant, while gaps between other items (Q2 and Q3) indicate areas where teachers lack confidence. Addressing these gaps through targeted professional development, particularly in engineering design and technology integration, is critical for improving teacher preparedness and enhancing the overall quality of STEM education in Oman (Lu et al., 2013).

The findings of this study underscore the need for teacher training programs in Oman to be restructured to include more emphasis on engineering design and technology integration (Shahat et al., 2024). These areas are critical for fostering interdisciplinary problem-solving skills among students. Furthermore, the study highlights the importance of giving teachers more opportunities for hands-on practice in engineering design, allowing them to build confidence and competence before entering the classroom (Shahat and Al-Balushi, 2023). This approach aligns with research that emphasizes the value of experiential learning in building teacher self-efficacy (Fantz et al., 2011).

The insights gained from this study provide a solid foundation for future research that can explore the broader application of these findings across other STEM domains, particularly technology and mathematics. As Oman continues to expand its STEM education initiatives, developing teacher training programs that emphasize interdisciplinary learning and practical application will be essential for preparing both teachers and students to meet the demands of the modern workforce.

Conclusion and implications for science education

This study aimed to evaluate Omani preservice science teachers’ self-efficacy in implementing engineering design processes (EDPs) within science education using the Self-Efficacy Beliefs for Teaching Science through Engineering Design Processes Questionnaire (SEBTSEDQ). Given the absence of prior research in Arab countries exploring preservice teachers’ self-efficacy regarding engineering design, this study fills a significant gap and offers essential insights into science education in the Arabic-speaking world (Shahat and Al-Balushi, 2023). The findings contribute to local educational practices and the global literature on teacher self-efficacy in STEM education, particularly in the context of integrating interdisciplinary STEM elements like engineering design into science teaching.

Several psychometric analyses were performed to assess the SE and OE sections of the instrument. The Rasch analysis revealed that three items from the SE scale and one from the OE scale did not fit the Rasch model and were removed to improve the instrument’s overall functioning. The misfit was attributed to divergences in preservice teachers’ experience and understanding of key concepts, such as evaluating engineering design solutions and facilitating discussions about scientific challenges (Shahat and Al-Balushi, 2023; Shahat et al., 2024). The category probability curves and reliability indices confirmed that the SEBTSEDQ instrument generally functions well, though the OE scale could be further refined by expanding the range of item difficulty or increasing sample size to enhance item reliability (Honey et al., 2014).

The Wright maps provided additional insights by highlighting areas where items might be redundant and identifying gaps that could be filled with new items. For example, items related to basic scientific inquiry were closely clustered, while those requiring more interdisciplinary skills, such as engineering design, were more challenging for teachers. These findings indicate that preservice teachers are more confident in traditional science instruction than in engineering-related tasks, underscoring the need for targeted professional development in interdisciplinary STEM skills (Bybee, 2011). The Wright maps thus serve as a valuable tool for refining the SEBTSEDQ instrument and improving future assessments of teacher preparedness in STEM education.

The study has important implications for science education, particularly in Arab countries where research on preservice teachers’ self-efficacy in engineering design is limited. The findings reveal key areas where preservice teachers need further support, particularly in integrating EDPs into their teaching practices. For example, the observed challenges in facilitating problem identification and solution evaluation in engineering design suggest that teacher education programs should provide more hands-on, practical training in these areas (Fantz et al., 2011; Mumba, 2023). These findings can inform the development of teacher training programs that better equip preservice teachers to implement STEM education effectively, especially by fostering interdisciplinary problem-solving and critical thinking skills (Hafiz and Ayop, 2019).

This study makes a significant contribution by applying the Rasch model to evaluate the psychometric properties of the SEBTSEDQ instrument, which was designed specifically to measure self-efficacy in integrating engineering design into science education. As the first such study in the Arabic-speaking world, it addresses a critical gap in the literature on STEM education in this region. With minor modifications, the instrument could be adapted and applied in other educational contexts, providing a rigorous, quantitative tool for measuring self-efficacy and informing the development of teacher training programs.

Furthermore, this research highlights the critical role of teacher self-efficacy in successfully implementing engineering design in STEM curricula. Engineering design fosters essential skills such as creativity, problem-solving, and critical thinking, all of which are necessary for preparing students for future STEM careers. To fully develop these competencies, teacher education programs must focus on providing more comprehensive training in engineering design and interdisciplinary STEM teaching. The study also emphasizes the importance of interdisciplinary approaches in STEM education, as engineering design requires the integration of concepts from science, technology, and mathematics (Bybee, 2011; Honey et al., 2014; Shahat et al., 2023, 2024).

The insights gained from this research have broader implications for improving STEM education, particularly by highlighting the need for hands-on, practical training that allows teachers to build confidence in interdisciplinary teaching. This is crucial for ensuring that teachers are equipped to guide students through complex, real-world problem-solving processes that span multiple STEM domains. Moreover, this study underscores the importance of continued professional development, focusing on areas such as engineering design and technology integration, to address gaps in teacher competencies (Mumba, 2023).

Finally, the use of Wright maps and the Rasch analysis in this study offers a valuable framework for future research, particularly in refining psychometric instruments like the SEBTSEDQ. These tools can optimize teacher training programs by identifying specific areas where teachers may need additional support and tailoring professional development initiatives accordingly. The SEBTSEDQ instrument, with appropriate adjustments, could be applied in other regions or educational levels, making it a versatile tool for future researchers exploring teacher self-efficacy in STEM education.

Overall, this research provides essential knowledge for educators, policymakers, and teacher trainers, offering actionable insights that can inform the development of more effective STEM education strategies. As Oman and other Arabic-speaking countries continue to expand their STEM education initiatives, these findings will be critical for shaping teacher training programs that prepare educators to meet the demands of modern education and equip students with the skills they need for success in a rapidly evolving workforce.

Future directions

This study lays the groundwork for future research to extend the findings across other STEM domains, advancing the development of a comprehensive and integrated STEM education framework in Oman. While this study focused primarily on science education, future research should explore preservice teacher self-efficacy in other STEM areas—Technology, Engineering, and Mathematics. Investigating self-efficacy in these domains would validate the current findings and provide a more holistic understanding of how Omani teachers can deliver interdisciplinary STEM education (Shahat et al., 2024). Expanding this research to include these domains would help ensure that teacher training programs are adequately preparing educators to implement all aspects of STEM education effectively.

Furthermore, as emerging technologies such as artificial intelligence (AI), virtual and augmented reality, and robotics increasingly become integrated into educational practices, future research should examine their impact on teacher self-efficacy in STEM education. These technologies offer innovative opportunities for engaging students and bridging the gap between theoretical knowledge and practical application. Investigating how preservice and in-service teachers use these technologies could offer deeper insights into the ways that digital tools enhance both teaching and learning experiences. For instance, AI could support personalized learning pathways, while augmented reality could make complex STEM concepts more accessible and engaging (Honey et al., 2014). Understanding the role of such technologies in improving teacher self-efficacy and student engagement could significantly shape future professional development programs.

Additionally, future studies should investigate interdisciplinary pedagogical strategies that integrate science, technology, engineering, and mathematics, rather than teaching them in isolation. Given the need for teachers to guide students through complex, real-world problems that require skills from multiple disciplines, research should focus on how interdisciplinary curricula can enhance critical thinking, creativity, and problem-solving abilities. Exploring how integrated STEM approaches impact teacher confidence and student outcomes would provide valuable insights into improving STEM education as a whole (Fantz et al., 2011). Interdisciplinary strategies can potentially strengthen the link between classroom learning and practical applications, better preparing students for the challenges of the modern workforce.

Finally, longitudinal research is necessary to track the long-term impacts of improved teacher self-efficacy on student outcomes. Such studies could explore how increased teacher confidence correlates with student achievement, motivation, and interest in pursuing STEM careers. A longitudinal approach would offer critical data on the sustainability of teacher training interventions and their effects on student success over time. This evidence could further inform education reform efforts and teacher training programs in Oman and beyond, guiding policymakers and educators in designing strategies that promote long-term improvements in both teaching and learning (Mumba, 2023). By understanding how self-efficacy evolves and influences student outcomes over time, educators can refine their approaches to STEM education, ensuring that teachers and students are well-equipped for the demands of a rapidly changing world.