Introduction

Mastering effective communication skills is fundamental in foreign language education, with oral proficiency playing a crucial role in English as a Foreign Language (EFL) instruction (Fernandez-Garcia and Fonseca-Mora, 2022; Pitura, 2022). While knowledge of grammar and vocabulary is essential, true oral proficiency extends beyond linguistic competence to include fluency, accuracy, and the ability to use language appropriately in various contexts (Buckingham and Alpaslan, 2017; Hsu et al. 2023). Speaking is not just a linguistic activity but also a cognitive and emotional one, deeply connected to learners’ motivation and willingness to communicate (Buckingham and Alpaslan, 2017; Fathi et al. 2023; Piniel and Albert, 2018). Given the importance of oral communication in real-life situations—from casual conversations to formal negotiations—educators are continually seeking innovative methods to enhance students’ speaking skills (Yang et al. 2022).

The integration of technology into language education has opened new possibilities for enhancing language learning (Fathi and Rahimi, 2024; Liang and Hwang, 2023; Liu et al. 2024; Zhang and Zou, 2022). Among these innovations, digital storytelling (DST) has emerged as a promising approach (Gimeno-Sanz, 2015). DST involves creating narratives using digital tools that combine text, images, audio, and video, providing learners with an engaging platform to construct and share stories (Huang, 2023; Kim and Li, 2021). By blending multimedia resources with storytelling, DST creates a rich learning environment that can boost motivation and enhance language proficiency (Chen Hsieh and Lee, 2023; Liang and Hwang, 2023).

Research suggests that DST can positively impact various aspects of language learning, including speaking proficiency, self-regulation, and anxiety reduction (Chen Hsieh and Lee, 2023; Huang, 2023; Liang and Hwang, 2023; Yan et al. 2024; Yang et al. 2022). However, these factors are often studied in isolation. It is important to recognize that language learning is not merely the acquisition of discrete skills but a complex interplay of cognitive, metacognitive, and affective processes (Dörnyei, 2006; Pavlenko, 2013). Therefore, examining the interrelationships among DST, L2 speaking skills, self-regulation, and speaking anxiety is essential for a holistic understanding of how DST influences language learning.

Understanding the interplay between DST, L2 speaking skills, self-regulation, and speaking anxiety is crucial for several reasons. First, DST may enhance speaking skills not only by providing practice opportunities but also by encouraging learners to actively engage in self-regulatory processes. Creating digital stories requires planning, organizing ideas, and reflecting on language use, which are key components of self-regulated learning (Efklides, 2011; Yang and Wu, 2012; Zimmerman, 2002). By fostering these skills, DST can help learners become more autonomous and effective in their language learning. Second, DST has the potential to reduce speaking anxiety by creating a supportive and less threatening environment (Huang, 2023; Liang and Hwang, 2023). The use of multimedia elements allows learners to express themselves in diverse ways, which can alleviate the pressure associated with traditional speaking tasks (Chen, 2022; Huang, 2023; Sadik, 2008). Additionally, the collaborative aspects of DST, such as peer review and group discussions, can build a sense of community and provide emotional support, helping learners overcome anxiety (Chen, 2022; Lim et al. 2022; Xiangming et al. 2020).

We hypothesize that DST, by promoting a sense of agency and providing opportunities for scaffolded practice and reflection, can create a beneficial cycle. In this cycle, improved speaking skills may enhance learners’ self-regulation, leading to further proficiency gains and reduced anxiety. Likewise, better self-regulation and lower anxiety can improve speaking performance. Understanding these interconnected dynamics is crucial for optimizing DST as a teaching tool and adapting instruction to support overall language development. Examining these variables together is essential because they likely interact in complex ways to shape learners’ speaking development. Language acquisition involves not only cognitive processes but also metacognitive strategies and emotional factors. By exploring how DST influences these areas collectively, we can gain deeper insights into its role in facilitating language learning and identify strategies to enhance its effectiveness.

This study addresses a gap in the literature by systematically examining how DST affects L2 speaking skills, self-regulation, and speaking anxiety in the context of an IELTS preparation course. By investigating these factors together, the research offers a comprehensive view of DST’s impact on language learning. The findings may help educators and curriculum designers understand the potential of DST as a tool that not only improves linguistic proficiency but also promotes self-regulated learning and reduces anxiety. These insights are particularly valuable in high-stakes testing environments like IELTS, where speaking proficiency and learner confidence are critical. Accordingly, the study addresses the following research questions:

  1. To what extent does digital storytelling impact the L2 speaking skills (fluency, vocabulary, accuracy, and pronunciation) of EFL learners in an IELTS preparation course?

  2. To what extent does digital storytelling influence the self-regulation strategies (goal-setting, self-monitoring, self-evaluation) of EFL learners in an IELTS preparation course?

  3. To what extent does digital storytelling affect the levels of speaking anxiety experienced by EFL learners in an IELTS preparation course?

Literature review

Digital storytelling in L2 learning

Digital storytelling (DST), which integrates multimedia elements like sounds, images, and text, has become an effective tool in language teaching (Gimeno-Sanz, 2015; Soler Pardo, 2014; Vinogradova et al. 2011). It helps learners, especially those with lower proficiency, to improve communication by offering a dynamic way to construct and present narratives (Kim and Li, 2021; Smeda et al. 2014). By incorporating non-linguistic elements, DST supports creative expression and offers an alternative for students who struggle with traditional language production, expanding their ability to communicate (Smeda et al. 2014). The combination of linguistic and visual elements enhances language learning, fostering engagement and improving various language skills (Sadik, 2008). Beyond communication, DST promotes a comprehensive language learning experience. It engages learners in research, writing, presenting, problem-solving, and the use of multimedia tools, building a broad range of skills (Bailey et al. 2021). This approach mirrors real-world language demands, where multiple competencies are often required simultaneously.

Moreover, DST extends beyond oral expression, influencing multiple aspects of language acquisition. Research shows it promotes critical thinking through digital storytelling games (Chen and Chuang, 2021) and enhances vocabulary through strategies in extramural English gaming (Calafato and Clausen, 2024). The integration of DST with media like interactive ebooks and role-playing games expands the possibilities within Computer Assisted Language Learning (CALL), advancing engagement in language education (Raffone, 2023). Innovations in computational storytelling continue to push creativity and learning further (Trichopoulos et al. 2023).

The benefits of DST across reading, writing, listening, and speaking have been well-documented (Hava, 2021; Yang et al. 2022), with a particular focus on speaking performance. For instance, Tsou et al. (2003) found that DST improved language competency and sentence complexity, while Bobkina and Domínguez Romero (2022) reported gains in grammar, syntax, vocabulary, and pronunciation. These studies suggest that DST enhances both fluency and accuracy, contributing to overall oral proficiency. Specifically, DST has been linked to smoother speech, reduced hesitation, and improved use of discourse markers (Tatlı et al. 2022). The continuous practice and refinement required by DST help learners develop more coherent speech patterns, as noted by Yang et al. (2022) and Fu et al. (2022), who found it crucial for fluency improvement over time.

DST has been shown to improve accuracy in grammatical use, word choice, and sentence structure. The scriptwriting and revision process encourages learners to focus on precision, resulting in more accurate and sophisticated language production (Kim, 2014; Bobkina and Domínguez Romero, 2022). Tsou et al. (2003) found notable gains in grammar and syntax among DST participants. Additionally, pronunciation benefits from DST, as its audio components provide models for correct pronunciation and intonation. This exposure, along with opportunities for practice and feedback, improves phonological control, including stress patterns and intonation (Kim and Lee, 2018; Liang and Hwang, 2023). In terms of vocabulary, DST pushes learners to use a more diverse and contextually appropriate range of words, expanding their lexical repertoire (Lustenberger, 2024). Some studies (Chen and Yeh, 2024; Yang et al. 2022) highlight how DST facilitates vocabulary acquisition by immersing learners in meaningful language use, aiding long-term retention.

Several mechanisms explain the effectiveness of DST in developing speaking skills. First, the extended spoken output required by DST ensures continuous engagement with the language, which is essential for skill development (Huang, 2023). Second, the multimodal support, integrating visual and auditory elements, reinforces correct pronunciation and vocabulary use (Kim and Lee, 2018). Third, the interactive and creative nature of DST increases motivation and engagement, leading to more active language use and practice. Hava (2021) found that DST interventions significantly boosted learners’ self-confidence and personal use of English, with motivation being a key factor in speaking skill improvement.

DST encourages feedback and self-reflection, key elements of self-regulated learning. The iterative process of creating and revising digital stories allows learners to monitor their progress, receive feedback, and make targeted improvements in speaking performance (Kim, 2014). This approach not only enhances language skills but also builds learner autonomy and confidence, which are crucial for language acquisition. Beyond specific language skills, DST supports overall development, including literacy and digital abilities, as learners engage in topic selection, research, scriptwriting, and creative expression (Skinner and Hagood, 2008). This fosters critical thinking, problem-solving, and metacognitive awareness (Yang and Wu, 2012).

DST has also been shown to positively influence attitudes toward language learning and promote autonomy. Studies, such as those by Sam and Hashim (2022) and Tecedor (2023), highlight its role in boosting learners’ confidence and improving their perception of speaking skills. However, subjective perceptions alone may not fully capture the impact, and future research could combine feedback with objective performance data for a more comprehensive understanding. The flexibility of DST across various learning environments further highlights its versatility. For example, Liang and Hwang (2023) demonstrated its effectiveness in enhancing multimodal storytelling in robot-centered contexts, while Zhussupova and Shadiev (2023) explored its use in developing public speaking skills in diverse classrooms. The integration of digital storytelling with gaming, as examined by Raffone (2023), opens new possibilities in CALL. Moreover, advances in computational storytelling (Trichopoulos et al. 2023) continue to push the boundaries of its role in language education, making it an adaptable tool for future learning contexts.

Taken together, the reviewed literature highlights the broad potential of DST as an effective tool in language learning by fostering active engagement, holistic language development, and essential skill acquisition. It enhances oral proficiency through meaningful language use, practice, and self-monitoring, while its interactive and collaborative features boost motivation, engagement, and self-efficacy, reducing language-related anxiety. Moreover, DST promotes cognitive and metacognitive growth by fostering critical thinking, problem-solving, and strategic learning.

Self-regulated learning

Self-regulated learning (SRL) is essential in EFL education, enabling students to take active control of their language acquisition (Oxford, 2016; Teng and Zhang, 2016). SRL involves goal-setting, selecting learning strategies, monitoring progress, and adjusting approaches as needed (Andrade and Evans, 2012; Teng, 2024; Zhang and Zou, 2024). Research shows that EFL learners who engage in self-regulation often achieve higher proficiency and greater motivation (Tseng et al. 2006; Zheng et al. 2018). For example, the use of metacognitive strategies—such as planning, organizing, and self-evaluating—has been linked to improved reading and writing skills (Shih and Huang, 2018; Zhang and Seepho, 2013). By fostering autonomy and responsibility, SRL helps learners become more independent and effective in language learning (Little, 2007; Zhang and Zou, 2024).

In EFL speaking, SRL is especially important as learners face challenges like speaking anxiety, pronunciation, and fluency (Aregu, 2013; Derakhshan and Fathi, 2024). Research highlights the context-specific nature of SRL (Hu et al. 2023; Kuo et al. 2014), stressing the need for tailored assessments to evaluate learners’ self-regulation in different learning settings. SRL supports cognitive processes by empowering learners to manage their educational experiences (Zimmerman, 2002). The iterative process of planning, monitoring, and assessing progress (Pintrich, 2000) is particularly relevant in EFL speaking, where strategies like goal-setting, self-monitoring, and self-evaluation are crucial for improving proficiency and reducing anxiety (El-Sakka, 2016).

Several key studies have explored the application of SRL in EFL speaking contexts. Uztosun (2020) developed a scale to measure self-regulated motivation for improving English speaking, while Aregu (2013) found that enhancing SRL in teaching spoken communication improves speaking efficacy and performance. Derakhshan and Fathi (2024) highlighted the connection between growth mindset, self-efficacy, and SRL in fostering successful L2 speaking outcomes. Öztürk and Çakıroğlu (2021) demonstrated how flipped learning designs incorporating SRL strategies can develop language skills, including speaking. Chen et al. (2020) investigated SRL strategy profiles among EFL learners, with implications for speaking proficiency, and Lei et al. (2022) showed how mobile-assisted language learning enhances SRL and speaking skills.

Several SRL strategies have been identified as particularly effective in EFL speaking. Goal-setting helps learners focus by establishing clear objectives for speaking tasks (Zimmerman, 2002). Self-monitoring allows learners to track their progress and identify areas for improvement, such as fluency and accuracy (Bong and Skaalvik, 2003). Self-evaluation involves reflecting on performance to assess strengths and weaknesses (Efklides, 2011). Seeking feedback from peers and instructors offers insights for refining speaking strategies (Aregu, 2013). Finally, adapting strategies based on self-evaluation and feedback improves speaking outcomes (Zimmerman and Kitsantas, 2014).

Despite these findings, several gaps remain. More research is needed on the effectiveness of specific SRL strategies across different EFL speaking tasks and contexts, as well as the role of individual differences, such as motivation and anxiety, in SRL use. Additionally, the impact of instructional interventions on promoting SRL and the long-term effects of SRL training on speaking proficiency and self-efficacy are underexplored. This study aims to address these gaps by investigating how digital storytelling fosters SRL strategies in EFL speaking, focusing on goal-setting, self-monitoring, and self-evaluation to enhance speaking skills and reduce anxiety.

L2 speaking anxiety

L2 speaking anxiety is a pervasive and multifaceted challenge that L2 learners encounter throughout their language acquisition journey, often persisting and intensifying as they progress (Badrasawi et al. 2020; Gkonou, 2013; Poza, 2011). This phenomenon manifests in various forms, including concerns about pronunciation, apprehension about negative evaluation, and general unease in speaking situations (Koch and Terrell, 1991; Tercan and Dikilitaş, 2015; Zheng and Cheng, 2018). Although traditionally viewed as a hindrance, the impact of anxiety on L2 speaking is not uniformly negative. Indeed, it can function as both a debilitative and facilitative force, depending on its intensity. Scovel (1978) was among the first to highlight that moderate levels of anxiety could enhance learners’ awareness and motivation, potentially leading to improved performance. Recent studies support this dual nature of anxiety. For instance, Calafato (2024) found a positive correlation between anxiety and language aptitude in multilingual language teachers, suggesting that a certain level of anxiety might actually contribute to greater language learning abilities. Similarly, El Shazly (2021) reported that although AI chatbot interactions initially intensified foreign language anxiety, they ultimately led to gains in oral proficiency, underscoring anxiety’s potential facilitative role.

However, it is crucial to recognize that excessive or chronic anxiety can severely impede learners’ speaking performance and overall language development. Specific activities, such as public speaking, role-plays, and even defining words, have been identified as particularly anxiety-inducing (Galante, 2018; Koch and Terrell, 1991). The prevalence and intensity of speaking anxiety vary considerably across learners and contexts, with studies reporting a range of anxiety levels. For example, moderate anxiety levels have been observed among Turkish EFL undergraduates (Balemir, 2009), while significantly higher levels are reported among Malaysian gifted students compared to their non-gifted peers (Kamarulzaman et al. 2013). These findings suggest that individual differences and learning environments significantly shape anxiety experiences. Interestingly, research consistently shows that gender differences in speaking anxiety are largely insignificant (Batiha et al. 2016; Ahmed et al. 2017).

L2 speaking anxiety arises from both internal and external factors. Internally, fear of making mistakes, negative self-perceptions, and perfectionism heighten anxiety (Jomaa and Jupri, 2014). For example, Ahmed et al. (2017) found that concerns about English proficiency and difficult assignments contributed to anxiety in postgraduate learners. Externally, anxiety is amplified by fear of negative evaluation from instructors or peers, unfamiliar environments, and assessment pressure (Batiha et al. 2016). Ozdemir and Papi (2022) also noted that fixed mindsets increase anxiety, while growth mindsets boost self-confidence in speaking. The effects of speaking anxiety on L2 performance are complex. High anxiety can hinder fluency and accuracy (Kamarulzaman et al. 2013), though moderate anxiety may enhance performance. Rood and de Jong (2023) found that anxiety influenced both utterance and cognitive fluency, while Aubrey (2022) showed how anxiety can serve as both a motivator and challenge. However, Mora et al. (2024) emphasize that anxiety’s effects can worsen with increased task complexity, underscoring the need to consider task demands in language learning.

To reduce speaking anxiety, strategies like fostering a positive classroom environment with clear expectations and instructor support can help (Zhiping and Paramasivam, 2013). Instructional innovations, such as concept mapping-based flipped learning (Chen and Hwang, 2020) and mobile-assisted peer feedback (Ebadijalal and Yousofi, 2023), have also been effective in reducing anxiety and improving speaking performance (Hwang et al. 2024). Technology-enhanced language learning has shown potential in addressing anxiety. While new technology can initially increase anxiety (El Shazly, 2021), it can also create a more supportive environment (Chen, 2022; Xiangming et al. 2020). Interactive digital tools may empower learners, enabling more comfortable and confident expression (Aktaş, 2023).

The present study aims to contribute to this ongoing discussion by examining the potential of DST as a tool for reducing anxiety in the context of IELTS preparation. By fostering a supportive and creative learning environment, DST may empower learners to overcome their anxieties and develop greater confidence in their speaking abilities.

The interplay of DST, self-regulation, and anxiety in EFL speaking

Digital storytelling has gained recognition as an effective tool for enhancing L2 speaking skills (Huang, 2023; Yang et al. 2022). Meanwhile, research highlights the importance of self-regulation and anxiety in language learning (Özer and Akçayoğlu, 2021; Yuksel et al. 2023). Despite growing interest in these areas, few studies have examined the interplay between DST, self-regulation, and anxiety in EFL speaking, leaving a gap in understanding how DST can address both learners’ speaking skills and emotional needs. Although the positive impact of DST on L2 speaking proficiency is well-documented (Gimeno-Sanz, 2015; Soler Pardo, 2014; Vinogradova et al. 2011), research specifically linking DST with self-regulation and anxiety in EFL settings remains limited (Su and Guo, 2024). This gap is significant, as exploring these connections could improve the application of DST to enhance self-regulation and reduce anxiety in language learning.

While direct studies on DST’s effect on self-regulation in EFL speaking are scarce, related research suggests its potential. DST promotes learner autonomy, a key element of self-regulation, by encouraging students to manage their learning (Hava, 2021; Kim, 2014). Creating digital stories involves metacognitive activities like goal-setting, planning, and reflection, which are core to self-regulated learning (Zimmerman and Kitsantas, 2014), indicating that DST can foster self-regulation, even if not explicitly measured in prior studies. Research on self-regulated learning in other L2 contexts supports the idea that technology-enhanced tools like DST can promote self-regulation (Rahimi and Fathi, 2022). Tools that encourage reflective practice, peer feedback, and autonomy have been shown to enhance self-regulation (Teng, 2022; Uztosun, 2020). Given DST’s emphasis on creative input, revision, and narrative control, it likely promotes SRL in EFL learners, facilitating improved speaking skills through enhanced self-regulation.

Although direct research linking DST to anxiety reduction in EFL contexts is limited, studies on related concepts such as learner confidence and motivation provide insight. DST has been associated with increased learner confidence and engagement, which can reduce anxiety (Hava, 2021; Huang, 2023; Kim and Lee, 2018). The multimodal nature of DST—combining text, images, and audio—offers a less intimidating platform for language production, potentially reducing fear of negative evaluation, a common contributor to speaking anxiety (Aktaş, 2023; Chen, 2024). Research on technology-enhanced language learning broadly suggests that such tools create supportive, low-pressure environments (Chen, 2022; Fathi et al. 2024; Xiangming et al. 2020), implying that DST, through its engaging format, might similarly reduce anxiety by offering diverse ways for learners to express themselves.

Drawing on Vygotsky’s SCT and related studies, DST’s multi-step process—planning, scripting, revising, and presenting—naturally engages learners in self-regulatory behaviors. These tasks, which involve goal-setting, monitoring, and evaluation, are essential components of SRL (Zimmerman, 2002). The iterative nature of DST allows for ongoing self-assessment and refinement, strengthening self-regulatory skills over time. DST also gives learners autonomy over content and presentation decisions, fostering intrinsic motivation, a key factor in effective self-regulation (Çelik et al. 2012; Hava, 2021). This autonomy, paired with collaborative elements like peer feedback, allows learners to internalize feedback and gradually develop stronger self-regulation (Zimmerman and Kitsantas, 2014). Thus, DST facilitates both social and autonomous self-regulation in EFL learners.

The creative and multimodal nature of DST offers a less intimidating environment for language production by allowing learners to use a combination of visual, auditory, and textual elements. This reduces reliance on verbal output, helping to alleviate anxiety associated with speaking in a foreign language (Kim and Lee, 2018). Additionally, DST enables learners to rehearse and revise their spoken content multiple times before presenting, building confidence and reducing performance-related anxiety (El Shazly, 2021). The opportunity to refine their work over time eases the pressure to perform perfectly in one attempt. Furthermore, the collaborative aspect of DST projects fosters a supportive peer environment, reducing the fear of negative evaluation and making the learning process less stressful (Zhiping and Paramasivam, 2013).

Taken together, there is a need for research that directly explores the relationship between DST, self-regulation, and anxiety in EFL speaking. This study addresses that gap by examining how DST can improve speaking proficiency, enhance self-regulation, and reduce anxiety in EFL learners. By integrating these constructs, the research provides insights into the effectiveness of DST as a comprehensive learning tool and offers evidence-based recommendations for fostering learner autonomy, promoting self-regulation, and reducing anxiety in language learning.

Theoretical framework

This study is grounded in Vygotsky’s (1978) Sociocultural Theory (SCT), which emphasizes the role of social interaction and cultural tools in cognitive development (Kozulin, 2003). Vygotsky posits that learning is embedded in social contexts and mediated by cultural artifacts, such as language and technology. This perspective provides a useful framework to examine the effects of DST on L2 speaking skills, self-regulation, and anxiety. In SCT, language is a key cultural tool shaping thought and learning (Vygotsky, 1978). DST, by integrating language with visual and auditory elements, serves as a multifaceted tool that enriches communication and supports knowledge construction. This aligns with Vygotsky’s view of cultural tools’ mediational role in learning (Lantolf and Thorne, 2006). DST allows learners to experiment with language, express ideas creatively, and engage in meaningful interactions, fostering linguistic competence and cognitive development (Gimeno-Sanz, 2015; Soler Pardo, 2014).

The Zone of Proximal Development (ZPD), central to Vygotsky’s theory, represents the space where learners can expand their abilities with guidance. DST offers opportunities for interaction and feedback, providing scaffolding from peers and instructors that helps learners push their linguistic limits and internalize new skills (Hava, 2021; Yang et al. 2022). This collaborative environment supports the development of advanced speaking skills through activities learners initially struggle to perform independently (Bailey et al. 2021).

Furthermore, Vygotsky emphasized scaffolding as essential for skill development, and DST offers multiple scaffolding opportunities, such as storyboarding templates, multimedia resources, and peer feedback (Sadik, 2008; Robin, 2016). The collaborative aspect of DST, where learners co-create and share stories, embeds learning in a rich social context, aligning with Vygotsky’s view that cognitive development is shaped by interactions with more knowledgeable others (Vygotsky, 1978; Lantolf and Thorne, 2006). These interactions not only improve language skills but also foster a supportive learning community. Learning, from Vygotsky’s perspective, is a process of internalization, where learners gradually transform external knowledge into internal mental functions through social interactions and cultural tools. DST encourages active participation, reflection, and feedback, creating an environment conducive to internalization. As learners develop their digital stories, they also build self-regulatory skills, including goal-setting and strategic decision-making (Godwin-Jones, 2015; Hafner and Miller, 2011). This process empowers learners to take control of their language development, enhancing both cognitive and metacognitive abilities (Zimmerman, 2000; Efklides, 2011). Vygotsky also emphasized the importance of social support in reducing anxiety and creating a sense of belonging. The collaborative nature of DST fosters a supportive environment where learners can express themselves without fear of judgment, which helps reduce the anxiety often associated with L2 speaking (Chen, 2022; Aktaş, 2023). Through peer feedback and interaction, learners build resilience and confidence, overcoming their fear of making mistakes (Zhiping and Paramasivam, 2013; Sam and Hashim, 2022).

Grounded in Vygotsky’s SCT, this study examines L2 speaking skills, self-regulation, and anxiety. DST offers a rich context for developing speaking skills through social interaction, scaffolding, and cultural tools (Yang and Wu, 2012; Fu et al. 2022). It also supports self-regulation by promoting learner autonomy, goal-setting, and self-monitoring (Huang, 2023; Lantolf and Thorne, 2006). Moreover, DST’s collaborative environment helps mitigate speaking anxiety, a common challenge in L2 learning (Chen Hsieh and Lee, 2023; Xiangming et al. 2020). While prior studies have shown the benefits of DST in language learning (Fu et al. 2022; Huang, 2023), this research seeks to deepen understanding through a Vygotskian lens. By exploring how DST scaffolds learning within learners’ ZPD, this study aims to reveal how it enhances speaking skills, fosters self-regulation, and reduces anxiety, offering insights for both theory and practice.

Methods

Participants and materials

To ensure unbiased evaluation, a double-blind randomized controlled trial design was employed. The required sample size of 89 participants was determined through an a priori power analysis using G*Power software. The analysis targeted the detection of a moderate effect size (0.25) at a significance level of α = 0.05 with 80% power.

The participants were recruited from an IELTS preparation program at a language institute affiliated with Shandong University, Weihai, China. The sample comprised 89 EFL learners, of whom 68% were female (n = 61) and 32% were male (n = 28), with ages ranging from 18 to 25 years (M = 21.1, SD = 1.9). All participants demonstrated intermediate English proficiency (B1-B2 on the Common European Framework of Reference for Languages), as assessed by their placement test scores upon entering the IELTS preparation course. This ensured a relatively homogeneous sample in terms of language ability, allowing for a clearer assessment of the impact of the DST intervention.

The participants’ academic backgrounds were diverse, spanning fields such as engineering (n = 25), humanities (n = 22), natural sciences (n = 18), and social sciences (n = 24), potentially enhancing the generalizability of the findings to EFL learners from various disciplines. Moreover, all participants expressed a strong desire to improve their speaking skills for academic purposes and achieve a higher band score on the IELTS exam. This intrinsic motivation likely contributed to their active engagement in the study and may have influenced their learning outcomes.

Following recruitment, the participants were randomly assigned to two groups: the experimental group (n = 45), which received the DST intervention, and the control group (n = 44), which followed the conventional instructional methods of the IELTS preparation program. Prior to their participation, all individuals were thoroughly informed about the study’s objectives and procedures, and their written informed consent was obtained, adhering to the ethical guidelines established by the Ethics Committee at Shandong University, Weihai, China. This committee had previously reviewed and approved the study protocol, ensuring adherence to ethical principles throughout the research process. Strict measures were also implemented to guarantee the confidentiality, anonymity, and well-being of all participants during the study.
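The allocation step described above can be illustrated with a minimal sketch. The paper does not report the software or seed used for randomization, so the function name `assign_groups`, the participant IDs, and the seed below are all hypothetical; only the 45/44 split comes from the study.

```python
import random


def assign_groups(participant_ids, n_experimental, seed=2024):
    """Randomly split participants into experimental and control groups.

    The seed is an assumption, fixed only so this sketch is reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(participant_ids)  # copy so the caller's sequence is untouched
    rng.shuffle(shuffled)
    experimental = sorted(shuffled[:n_experimental])
    control = sorted(shuffled[n_experimental:])
    return experimental, control


# 89 hypothetical participant IDs, split 45 / 44 as reported in the study
experimental, control = assign_groups(range(1, 90), n_experimental=45)
print(len(experimental), len(control))
```

Shuffling the full list once and cutting it at a fixed position guarantees the reported group sizes, which a per-participant coin flip would not.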

Instrumentation

Speaking skills evaluation

To evaluate the EFL learners’ speaking abilities, the study used the IELTS speaking test, a standardized assessment that measures candidates’ proficiency across four equally weighted domains: fluency, the coherent and smooth expression of ideas without undue hesitation; vocabulary, the accurate and appropriate use of a diverse range of words; accuracy, the precise application of grammatical structures; and pronunciation, the use of English phonetic features such as intonation, stress, and connected speech. Using the IELTS Speaking Band Descriptors, raters assigned each learner a score from 1 to 9 in each domain. The four domain scores were then averaged to yield an overall (global) speaking score, so each learner received individual scores for fluency, vocabulary, accuracy, and pronunciation as well as a total speaking score. Scoring was carried out by two proficient and experienced raters, with an inter-rater reliability coefficient of 0.87.
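As a minimal sketch of the scoring rule described above, the overall score is the unweighted mean of the four domain bands. The function name and the example learner are hypothetical; the paper does not state how the two raters’ scores were combined, so that step is not modeled here.

```python
from statistics import mean

# The four equally weighted domains named in the text
DOMAINS = ("fluency", "vocabulary", "accuracy", "pronunciation")


def overall_speaking_score(band_scores):
    """Average the four domain bands (each 1-9) into one overall score."""
    for domain in DOMAINS:
        if not 1 <= band_scores[domain] <= 9:
            raise ValueError(f"{domain} band must be between 1 and 9")
    return mean(band_scores[domain] for domain in DOMAINS)


# Hypothetical learner, for illustration only
learner = {"fluency": 5, "vocabulary": 6, "accuracy": 5, "pronunciation": 6}
print(overall_speaking_score(learner))  # 5.5
```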

Self-regulation questionnaire

To investigate the self-regulatory strategies employed by EFL learners, we used the Self-Regulating Trait (SRT) questionnaire developed by O’Neil and Herl (1998). The questionnaire comprises 32 Likert-type items with response options ranging from ‘almost never’ to ‘almost always’. It covers two fundamental dimensions, metacognition and motivation: self-monitoring and planning represent metacognition, while effort and self-efficacy represent motivation. Herl et al. (1999) previously validated the reliability and validity of this instrument; in the present study, it yielded a Cronbach’s Alpha of 0.83, indicating high reliability. Content validity was confirmed through a thorough review by three experts in language learning and educational psychology, who assessed the relevance and comprehensiveness of the items. Additionally, face validity was evaluated during a pilot phase involving a sample of 20 participants similar to the main study cohort. Feedback from this pilot phase indicated that the items were clear and relevant to the constructs being measured.

Furthermore, the reliability of each subscale was assessed using Cronbach’s Alpha, with the metacognition dimension yielding an alpha coefficient of 0.85 and the motivation dimension an alpha of 0.82, both indicating high internal consistency. The overall reliability of the SRT questionnaire in this study was 0.83, consistent with the findings of Herl et al. (1999), thereby affirming the instrument’s reliability in measuring self-regulation among EFL learners.
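For readers unfamiliar with the statistic, the internal-consistency coefficients reported above follow the standard Cronbach’s Alpha formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The sketch below applies it to a small made-up response matrix; the data are hypothetical and are not drawn from the SRT questionnaire.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of respondents and columns of items."""
    k = len(responses[0])                      # number of items
    item_columns = list(zip(*responses))       # transpose: one tuple per item
    item_variances = [variance(col) for col in item_columns]
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: 5 respondents answering 3 Likert items
sample = [
    [4, 5, 4],
    [2, 3, 3],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 4],
]
alpha = cronbach_alpha(sample)
print(round(alpha, 3))  # 0.933
```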

L2 speaking anxiety scale

To assess students’ English speaking anxiety, the study adopted a modified version of the 12-item Speaking Anxiety Scale derived from Horwitz et al.’s (1986) Foreign Language Classroom Anxiety Scale. Focusing specifically on items related to anxiety and confidence while speaking English, this adapted scale, termed the English Speaking Anxiety Scale (ESAS), comprised statements such as “I never feel quite sure of myself when I am speaking English in my class” and “I don’t feel anxious when I talk to native speakers of English.” Respondents rated these items on a 5-point Likert scale ranging from ‘Strongly disagree’ (1) to ‘Strongly agree’ (5). Like the SRT questionnaire, the ESAS underwent content and face validity assessments. Expert reviews confirmed the relevance and appropriateness of the items for measuring speaking anxiety in the EFL context. The face validity of the scale was also affirmed during the pilot testing phase, where participants reported that the items were easy to understand and accurately reflected their anxiety levels when speaking English.

The internal consistency of the ESAS was robust, with a Cronbach’s Alpha of 0.81, aligning with the reliability reported by Liu (2021). These measures confirm the ESAS as a reliable and valid tool for assessing L2 speaking anxiety in this study.
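Because the ESAS mixes anxiety-worded and confidence-worded statements, scales of this kind are usually scored after reverse-coding the confidence-worded items so that higher values always indicate more anxiety. The paper does not describe its scoring procedure, so the sketch below is an assumption throughout: the reverse-coded item positions, the use of an item mean, and the example responses are all hypothetical.

```python
def score_esas(responses, reverse_items):
    """Mean item score on a 1-5 scale after reverse-coding selected items.

    reverse_items holds zero-based positions of confidence-worded items
    (a hypothetical choice; the paper does not list them).
    """
    scored = []
    for index, value in enumerate(responses):
        if not 1 <= value <= 5:
            raise ValueError("responses must be on the 1-5 Likert scale")
        # Reverse-coding maps 1<->5 and 2<->4 and leaves 3 unchanged
        scored.append(6 - value if index in reverse_items else value)
    return sum(scored) / len(scored)


# Hypothetical responses to the 12 items
answers = [4, 2, 4, 3, 5, 1, 4, 4, 3, 4, 5, 3]
print(score_esas(answers, reverse_items={1, 5}))  # 4.0
```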

DST software

Adobe Spark was chosen as the primary DST software for this study due to its versatile features and alignment with the research objectives. Its user-friendly interface minimized technical barriers, allowing participants to focus on crafting narratives and expressing themselves. Its professional templates also facilitated visually appealing presentations, enhancing both the storytelling experience and audience engagement. The seamless integration of multimedia elements—images, videos, audio, and text—provided participants with tools to create dynamic narratives, which contributed to improving their speaking skills.

Throughout the study, Adobe Spark played a crucial role in supporting the DST intervention. During the first week, its intuitive design helped participants learn the DST process, including story planning, multimedia selection, scripting, and presentation preparation. In weeks 2–3, the platform enabled brainstorming and topic selection aligned with academic interests and IELTS themes, while also helping participants outline and plan their stories with instructor guidance. In weeks 4–6, text capabilities of Spark were used for drafting story scripts, ensuring clarity and engagement. Participants integrated images, audio, and music to create cohesive narratives. During the recording phase (weeks 7–8), participants narrated their stories using the platform, which supported the incorporation of multimedia and provided tools for feedback on pronunciation and fluency.

In week 9, Spark facilitated peer review and revisions, enabling presentations and collaborative feedback. Participants improved their stories based on peer and instructor input, focusing on language proficiency and presentation effectiveness. In the final week (week 10), participants presented their refined stories, and the instructor evaluated them based on content delivery, language proficiency, creativity, and multimedia use. Adobe Spark’s integration throughout the study enhanced participant engagement and contributed to meeting the research goals on the impact of DST on L2 speaking skills.

Procedures

Experimental group

The experimental group actively participated in a ten-week DST intervention, involving two weekly sessions of 60 min each. This structured intervention comprised several distinct stages, offering a comprehensive approach to enhance speaking skills (see Table 1 for an overview of the instructional procedures).

Table 1 Overview of Instructional Procedures.

During the initial week, the instructor introduced the concept of digital storytelling, emphasizing its potential benefits for improving speaking abilities. Participants were acquainted with the fundamental steps integral to the DST process, covering story planning, multimedia element selection, narrative scripting, and presentation preparation. Clear communication of ethical considerations and expectations for participation contributed to a transparent understanding among participants. In the subsequent weeks, learners engaged in topic selection and planning (weeks 2–3), where they brainstormed and selected storytelling topics aligned with their academic interests and IELTS exam themes. Guided by the instructor, participants developed detailed story outlines, identified key elements, and planned the overall narrative structure.

The following weeks (4–6) focused on scriptwriting and multimedia integration. Participants crafted individual story scripts characterized by clear language and engaging narrative flow. The selection of multimedia elements, including images, audio recordings, and background music, was carefully guided by the instructor, ensuring cohesiveness within the chosen narrative framework.

Weeks 7–8 involved the recording and production phase, where participants utilized digital storytelling software or online platforms to record their narrated stories. The instructor provided personalized feedback during this process, concentrating on refining pronunciation, enhancing fluency, and ensuring overall presentation clarity. Moving into week 9, learners engaged in peer review and revision, presenting their digital stories to peers in small groups. This facilitated constructive feedback and collaborative idea-sharing. Participants, guided by peer feedback and instructor suggestions, iteratively revised their stories, addressing language proficiency concerns and enhancing overall presentation effectiveness.

In the final week (week 10), participants delivered their refined digital stories to the entire class, showcasing their storytelling abilities. The instructor employed a standardized rubric to evaluate each presentation, considering key aspects such as content delivery, language proficiency, creativity, and the effective use of multimedia elements.

Control group

The control group served as a crucial reference point for evaluating the impact of the digital storytelling intervention. Over the ten-week study period, they adhered to the standard curriculum of the IELTS preparation program, which employed a variety of established pedagogical approaches to enhance speaking skills. Engaging in guided discussions formed a core component of the control group’s activities. These discussions covered a wide range of topics relevant to the IELTS exam and participants’ academic fields. Through these discussions, learners were encouraged to express their opinions, engage in critical thinking, and practice spontaneous language use.

Additionally, students collaborated in small groups to prepare and deliver presentations on assigned topics. This collaborative activity fostered teamwork, honed public speaking skills, and enhanced participants’ ability to organize and present information effectively. Role-playing exercises provided learners with opportunities to immerse themselves in simulated scenarios relevant to academic or real-life situations. This allowed them to practice using English in diverse contexts and develop their communication skills. Moreover, participants engaged in individual speaking tasks, including impromptu speeches, short presentations, and responding to prompts. These activities provided personalized feedback and targeted skill development opportunities.

Within these activities, the instructor employed various teaching methods, including explicit instruction on grammatical structures, vocabulary development exercises, and dedicated sessions focused on improving pronunciation accuracy, intonation, and clarity. It is essential to highlight that while the control group benefited from diverse speaking activities and received instruction on essential language skills, they did not partake in any digital storytelling interventions. This clear distinction between the experimental and control groups allows for a robust comparison, enabling researchers to isolate the specific effects of the digital storytelling intervention on L2 speaking skills in the experimental group.

Maintaining internal validity

To ensure internal validity and control for potential confounding variables, several rigorous measures were taken. First, randomization was used to allocate participants impartially to either the experimental or control group through a double-blind procedure, reducing the risk of bias. Baseline assessments of speaking skills, self-regulation, and speaking anxiety were conducted to establish comparability between groups before the intervention. Blinding was maintained throughout, with both instructors and evaluators unaware of group assignments, minimizing bias in instruction and evaluation.

Standardized intervention protocols were strictly followed for the experimental group, while consistent instructional methods were applied to the control group, ensuring uniformity in implementation. Ethical considerations were carefully observed, with informed consent obtained from all participants in accordance with guidelines for research involving human subjects, maintaining the integrity of the study and protecting participants’ rights.

Statistical analyses

In the initial phase, descriptive statistics (means and standard deviations) were computed for the key variables at each time point. Normality of the data and residuals was assessed with Shapiro-Wilk tests and visual inspection of Q-Q plots (Razali and Wah, 2011), and homogeneity of variance was checked with Levene’s tests (Johnson, 2013). Independent samples t-tests were performed at baseline to confirm the initial similarity between the two groups.
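A baseline comparison of this kind can be sketched as a pooled-variance (Student’s) t statistic together with Cohen’s d as the effect size. This is an illustration under the equal-variance assumption and is not the authors’ Jamovi workflow; the scores below are hypothetical, and the p-value is omitted because it requires the t distribution, which the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_and_d(group_a, group_b):
    """Return (t, degrees of freedom, Cohen's d) for two independent samples."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    pooled_sd = sqrt(pooled_var)
    diff = mean(group_a) - mean(group_b)
    t_stat = diff / (pooled_sd * sqrt(1 / n_a + 1 / n_b))
    cohens_d = diff / pooled_sd
    return t_stat, df, cohens_d


# Hypothetical baseline scores for two small groups
t_stat, df, d = pooled_t_and_d([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(round(t_stat, 3), df, round(d, 3))  # -1.0 8 -0.632
```

With the study’s group sizes of 45 and 44, the same formula gives 45 + 44 − 2 = 87 degrees of freedom.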

Given the repeated measures design of the data, Linear Mixed Effects Models (LMM) served as the principal analytical method for exploring the longitudinal impact of the intervention on all measured variables (Mirman, 2017). This approach was favored over traditional repeated-measures analysis of variance (ANOVA) because it handles missing data better, preserving statistical power by using all available observations rather than excluding incomplete cases (Singer and Willett, 2003). Moreover, the LMM accommodated both fixed effects (e.g., group effects) and random effects (e.g., individual disparities in pre-intervention scores) (Bates et al. 2015).

The LMM incorporated fixed effects—group, time, and their interaction—alongside random effects, specifically participant intercepts (Baayen et al. 2008). Variance components were estimated using Restricted Maximum Likelihood (REML) (Bolker et al. 2009), and Satterthwaite adjustments were applied to determine degrees of freedom (Satterthwaite, 1946). All statistical analyses were performed using the General Analyses for Linear Models (GAMLj) module in Jamovi, version 2.2.5 (The Jamovi Project, 2022).
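The model described above can be written out as a random-intercept specification. The notation below is a sketch and is not taken from the paper: it assumes dummy coding with the control group and baseline (T0) as reference levels, and the symbols G, T1, and T2 are introduced here for illustration.

```latex
\begin{aligned}
y_{ij} &= \beta_0 + \beta_1 G_i + \beta_2 T1_j + \beta_3 T2_j
        + \beta_4 (G_i \times T1_j) + \beta_5 (G_i \times T2_j)
        + u_i + \varepsilon_{ij}, \\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2)
\end{aligned}
```

Here y_{ij} is the score of participant i at time point j, G_i equals 1 for the experimental group, T1_j and T2_j are indicators for the post-test and follow-up, and u_i is the participant-specific random intercept. Under this coding, the interaction coefficients β4 and β5 capture how much more the experimental group changed than the control group from T0 to T1 and from T0 to T2, respectively.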

To further investigate the effects of time within each group, separate simple effects analyses were conducted to examine changes across time points in the experimental and control groups (Olejnik and Algina, 2003). Statistical significance was set at p < 0.05. Post-hoc analyses with Bonferroni correction were used to evaluate the intervention’s effect within each group at each time point separately (Rothman, 1990).
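The Bonferroni correction mentioned above multiplies each p-value by the number of comparisons and caps the result at 1. A minimal sketch follows; the p-values are hypothetical and the number of comparisons actually used in the study is not reported.

```python
def bonferroni(p_values):
    """Bonferroni-adjust p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical p-values from three pairwise comparisons
raw = [0.004, 0.020, 0.300]
adjusted = bonferroni(raw)
significant = [p < 0.05 for p in adjusted]
print(significant)  # [True, False, False]
```

In this illustration the second comparison is significant before adjustment (0.020 < 0.05) but not after, which shows how conservative the correction is.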

Results

Baseline comparisons (Pre-intervention assessments)

Prior to the intervention, we conducted comprehensive assessments to ensure the comparability of the experimental and control groups. As detailed in Table 2, independent samples t-tests confirmed a lack of statistically significant differences between the groups across all measured variables: global speaking (t(87) = 0.545, p = 0.352, Cohen’s d = 0.12), fluency (t(87) = −1.026, p = 0.151, Cohen’s d = 0.22), vocabulary (t(87) = −1.170, p = 0.124, Cohen’s d = 0.25), accuracy (t(87) = 0.649, p = 0.273, Cohen’s d = 0.14), pronunciation (t(87) = −1.059, p = 0.146, Cohen’s d = 0.23), speaking self-regulation (t(87) = −0.927, p = 0.192, Cohen’s d = 0.20), and speaking anxiety (t(87) = 0.805, p = 0.224, Cohen’s d = 0.17). These findings indicate that at the baseline level, there were no significant differences between the groups in any of these measured variables, ensuring comparability between the groups prior to the intervention.

Table 2 Results of Independent Samples T-Tests for the Groups in Pre-intervention Assessments.

Post-intervention and follow-up assessments

Following the intervention, significant differences emerged between the groups across various time points. For the experimental group, mean global speaking scores increased progressively from 4.12 (T0) to 5.86 (T1) and further to 6.02 (T2), while the control group scored 4.09 (T0), 5.11 (T1), and 5.32 (T2). A repeated measures ANOVA was conducted to evaluate the differences in mean scores over time for both groups. The analysis revealed significant differences over time for both the experimental group (F(2, 86) = 32.17, p < 0.001, η2 = 0.42) and the control group (F(2, 86) = 14.26, p < 0.001, η2 = 0.25), indicating that both groups experienced significant improvements in their speaking skills, with the experimental group showing a more pronounced increase.

Regarding speaking self-regulation, participants in the experimental group displayed mean scores of 4.85 (T0), 5.97 (T1), and 5.81 (T2), while the control group showed scores of 4.93 (T0), 5.15 (T1), and 5.17 (T2). Over time, significant differences were observed solely within the experimental condition (F(2, 86) = 12.79, p < 0.001, η2 = 0.31), with a non-significant trend in the control group (F(2, 86) = 2.85, p = 0.141, η2 = 0.06; see Table 3).

Table 3 Descriptive Statistics and Time Effects Analysis.

In terms of vocabulary, the experimental group showed mean scores of 4.52 (T0), 6.03 (T1), and 5.91 (T2), while the control group exhibited scores of 4.63 (T0), 5.19 (T1), and 5.07 (T2). Significant differences were found in the experimental group (F = 16.37, p < 0.001, η2 = 0.35), with a marginal trend in the control group (F = 4.82, p = 0.056, η2 = 0.10).

Pronunciation scores in the experimental group were 4.74 (T0), 6.42 (T1), and 6.53 (T2), whereas the control group demonstrated scores of 4.83 (T0), 5.26 (T1), and 5.34 (T2). Notably, significant differences over time were observed in both the experimental (F = 19.21, p < 0.001, η2 = 0.38) and control (F = 8.36, p = 0.034, η2 = 0.15) groups.

Regarding speaking anxiety, the experimental group exhibited mean scores of 3.72 (T0), 2.67 (T1), and 2.52 (T2), while the control group showed scores of 3.65 (T0), 3.39 (T1), and 3.41 (T2). Over time, significant differences were observed in the experimental group (F = 12.31, p < 0.001, η2 = 0.32), while the control group did not show significant changes (F = 0.96, p = 0.324, η2 = 0.02).

The instructional intervention utilizing digital storytelling had a distinct impact on the global speaking scores of the experimental group compared to the control group. As shown in Table 4, the Linear Mixed Model (LMM) analysis revealed a substantial and statistically significant increase in global speaking scores within the experimental group from baseline (T0) to post-test (T1) compared to the control group (β = 0.72, p = 0.035). Furthermore, at the one-month follow-up (T2), the experimental group showed a notably higher increase in scores compared to the control group (β = 0.67, p = 0.039). The fixed effects of the model accounted for 32.1% of the variance in score differences (R2 = 0.321), while the full model incorporating random effects explained approximately 59% of the variance (R2 = 0.589).
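The two R2 values above correspond to what is commonly called marginal R2 (fixed effects only) and conditional R2 (fixed plus random effects). Assuming the usual variance-partitioning definitions, which the paper does not spell out, they can be sketched as follows; the variance components in the example are hypothetical and are not the study’s estimates.

```python
def mixed_model_r2(var_fixed, var_random, var_residual):
    """Return (marginal R2, conditional R2) from variance components.

    marginal    = fixed / total
    conditional = (fixed + random) / total
    """
    total = var_fixed + var_random + var_residual
    marginal = var_fixed / total
    conditional = (var_fixed + var_random) / total
    return marginal, conditional


# Hypothetical variance components, for illustration only
marginal, conditional = mixed_model_r2(var_fixed=2.0, var_random=1.5, var_residual=1.5)
print(marginal, conditional)  # 0.4 0.7
```

The gap between the two values reflects how much variance is attributable to stable differences between participants, which is what the random intercepts capture.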

Table 4 The Results of Linear Mixed Model Analysis.

Time effects analyses reiterated these findings, indicating significant time effects in both the experimental (F = 32.17, p < 0.001) and control groups (F = 14.26, p < 0.001) for global speaking scores. The experimental group demonstrated a more substantial shift in scores compared to controls, particularly at the one-month follow-up, emphasizing the sustained advantages of the digital storytelling intervention in fostering global speaking abilities.

Similar discernible patterns emerged for underlying speaking skills, including fluency, vocabulary, accuracy, and pronunciation. The intervention led to statistically significant differences between the experimental and control groups from T0 to T1 for fluency (β = 0.91, p = 0.014), vocabulary (β = 0.95, p = 0.016), accuracy (β = 1.05, p = 0.011), and pronunciation (β = 1.01, p = 0.001). Remarkably, the differences persisted and increased from baseline to the one-month follow-up (T2) for all sub-skills: fluency (p = 0.023), vocabulary (p = 0.018), accuracy (p = 0.009), and pronunciation (p = 0.001). The increases in scores for these sub-skills were consistently higher in the experimental group, suggesting the potential of digital storytelling-based instruction in conferring enduring impacts on speaking skills compared to conventional approaches.

Time effects analyses for underlying speaking skills revealed significant effects in the experimental group for fluency (F = 12.79, p < 0.001), vocabulary (F = 16.37, p < 0.001), accuracy (F = 10.38, p < 0.001), and pronunciation (F = 19.21, p < 0.001). However, the control group did not exhibit significant time effects for fluency (F = 2.85, p = 0.141), vocabulary (F = 4.82, p = 0.056), and accuracy (F = 2.16, p = 0.134), highlighting the distinctive benefits of the digital storytelling intervention on sustained improvement in speaking skills. In terms of speaking self-regulation, the experimental group showcased significant advantages post-intervention (T0-T1). The increase in scores for speaking self-regulation was notably higher in the experimental group compared to the control group (β = 0.45, p = 0.041), and the decrease in speaking anxiety scores was significantly pronounced (β = −0.79, p = 0.026).

At the one-month follow-up (T0-T2), the advantage in speaking self-regulation remained significantly higher in the experimental group compared to controls (β = 0.54, p = 0.038), and the reduction in speaking anxiety likewise remained significantly greater in the experimental group (β = −0.96, p = 0.014). Time effects analyses supported these observations, revealing significant time effects for speaking self-regulation (F = 11.51, p < 0.001) and speaking anxiety (F = 12.31, p < 0.001) in the experimental group, while the control group exhibited non-significant effects for speaking self-regulation (F = 2.74, p = 0.162) and speaking anxiety (F = 0.96, p = 0.324).

Taken together, the use of digital storytelling in the instructional intervention notably improved various aspects of speaking skills among participants compared to conventional methods. The experimental group consistently outperformed the control group in global speaking abilities, fluency, vocabulary, accuracy, pronunciation, speaking self-regulation, and anxiety reduction. These findings emphasize the effectiveness of digital storytelling as a potent tool in enhancing speaking skills and addressing anxiety within language learning contexts. The sustained improvements observed at the one-month follow-up further highlight the potential long-term benefits of this innovative instructional approach.

Discussion

This study aimed to investigate the effects of digital storytelling on L2 speaking skills, self-regulation, and speaking anxiety among EFL learners in an IELTS preparation course. The discussion below ties each major finding directly back to the initial research questions and relates these findings to existing literature.

RQ1: impact of DST on L2 speaking skills

Our first research question explored the extent to which DST impacts the L2 speaking skills of EFL learners. The findings reveal a significant positive effect of DST on speaking proficiency. The experimental group showed statistically significant improvements in global speaking scores post-intervention, which aligns with previous studies demonstrating the efficacy of DST in enhancing speaking abilities (Fu et al. 2022; Huang, 2023; Liang and Hwang, 2023; Nair and Yunus, 2021; Yang et al. 2022). Notably, the gains extended to specific sub-skills such as fluency, vocabulary, accuracy, and pronunciation. This comprehensive improvement suggests that DST facilitates multifaceted enhancements in speaking skills, supporting its role as a powerful pedagogical tool (Kim and Lee, 2018; Rahimi and Yadollahi, 2017).

The unique features of DST contribute to these improvements. By constructing narratives using text, audio, and visuals, learners engage in a multi-sensory approach that promotes deeper immersion in spoken language (Hwang et al. 2016; Yang et al. 2022; Zhussupova and Shadiev, 2023). This active engagement allows for targeted practice, as learners thoughtfully consider word choice, grammar, and pronunciation while crafting their stories. The iterative process of scripting, recording, and presenting stories provides opportunities for self-monitoring and feedback, fostering metacognitive awareness (Godwin-Jones, 2015; Huang, 2023). Peer feedback during creation and presentation adds a layer of scaffolding, offering insights for refining speaking proficiency (Shen et al. 2023).

From a Vygotskian perspective, these findings illustrate how learning occurs through social interaction and cultural tools. DST served as a mediating cultural tool, enabling learners to interact with language in a rich context. Engaging with multimedia elements provided scaffolding within their ZPD, allowing them to perform beyond their independent capabilities (Vygotsky, 1978). The collaborative aspects of DST, such as peer feedback and group discussions, facilitated social interaction central to SCT. Through sharing stories and receiving input, learners co-constructed knowledge and refined their speaking skills, reflecting Vygotsky’s notion that cognitive development is socially mediated (Lantolf and Thorne, 2006).

RQ2: influence of DST on self-regulation strategies

Addressing our second research question, the study found a significant increase in speaking self-regulation scores in the experimental group. This indicates that DST fosters the development of self-regulatory strategies, aligning with research suggesting that DST empowers learners to take control of their learning (Liu et al. 2018; Sam and Hashim, 2022; Tecedor, 2023). The emphasis of DST on learner autonomy and goal-oriented activities creates an environment conducive to cultivating self-regulation (Fu et al. 2022; Kim, 2014). Learners engaged in setting goals, monitoring progress, and adapting their approaches—key components of the self-regulatory cycle (Zimmerman, 2002).

From the SCT viewpoint, this improvement represents the internalization of practices modeled and scaffolded within the DST environment (Yang and Wu, 2012). Through social interaction and the use of cultural tools (the digital platform), learners gradually assumed greater control over their learning (Vygotsky, 1978). The requirement to make decisions about content and presentation fostered a sense of agency, consistent with Vygotsky’s idea that tools facilitate higher mental functions through internalization. The collaborative nature of DST provided a social context essential for developing self-regulation. Interacting with peers and receiving feedback helped learners refine strategies and develop metacognitive skills for self-monitoring and evaluation (Efklides, 2011; Teng, 2022; Rahimi and Yadollahi, 2017). This process mirrors Vygotsky’s emphasis on the role of social interaction in cognitive development, leading to independent application of self-regulatory strategies.

Our findings extend previous research by demonstrating how DST promotes specific self-regulatory behaviors that enhance speaking proficiency. They support the notion that technology-enhanced environments play a crucial role in facilitating self-regulated learning (Lei et al. 2022; Öztürk and Çakıroğlu, 2021).

RQ3: effect of DST on speaking anxiety

Regarding our third research question, the study found a significant reduction in speaking anxiety levels in the experimental group. While both groups experienced some anxiety reduction, the decrease was more substantial among learners exposed to DST. This suggests that DST effectively creates a supportive environment for speaking practice. This finding aligns with studies indicating that technology-enhanced learning can reduce anxiety and foster positive learning experiences (Chen, 2022; Xiangming et al. 2020). DST’s multimodal nature provides alternative channels for expression, easing the pressure of oral performance (Huang, 2023; Sadik, 2008; Yang, 2012). By allowing personalized communication, DST helps learners feel more comfortable and confident (Kim and Lee, 2018). Crafting personal narratives and sharing them with peers fosters a sense of ownership and authenticity, linked to reduced anxiety and increased self-efficacy (Kim and Li, 2021; Hava, 2021). The iterative process allows for rehearsal and refinement, building confidence before final presentations (Yang and Wu, 2012), which aligns with self-regulated learning principles (Zimmerman, 2002).

From an SCT perspective, the reduction in anxiety can be attributed to the supportive social environment of DST. Vygotsky posited that emotional development is intertwined with cognitive growth and that positive social contexts alleviate barriers to learning (Vygotsky, 1978). DST fostered a sense of community, with learners receiving encouragement and constructive feedback, which built confidence. This reflects Vygotsky’s emphasis on social support in overcoming challenges and facilitating development.

Overall, our findings extend SCT by demonstrating how digital tools can serve as modern cultural artifacts that mediate learning. While Vygotsky’s original work did not encompass digital technology, the principles of SCT are applicable to contemporary educational contexts. DST represents a convergence of social interaction, cultural tools, and individual cognition, embodying the core elements of SCT in a digital age. The study shows that digital environments can effectively facilitate the internalization of complex skills like self-regulation and reduce affective barriers such as anxiety. This suggests that SCT remains a relevant and robust framework for understanding learning processes, even as the tools and contexts evolve. By integrating technology into language learning, we can harness the power of SCT to enhance educational outcomes in new and meaningful ways.

Conclusion and implications

This study provides clear evidence that digital storytelling improves L2 speaking skills, enhances self-regulation, and reduces speaking anxiety among EFL learners in an IELTS preparation course. Drawing on Vygotsky’s (1978) SCT, the findings underscore the critical role of social interaction, cultural tools, and internalization in language learning. The results emphasize the importance of integrating digital tools in alignment with sociocultural principles, fostering collaboration, using meaningful cultural tools, and supporting the internalization of skills for more effective language learning.

The significant improvement in global speaking scores in the experimental group underscores the benefits of DST. Improvements were also observed in sub-skills such as fluency, vocabulary, accuracy, and pronunciation. These gains are likely due to DST’s interactive platform, which encourages active engagement with spoken language through multimedia, supporting learner expression. Additionally, the iterative nature of DST fosters self-monitoring and targeted practice, contributing to metacognitive development.

The increase in speaking self-regulation in the experimental group reinforces DST’s effectiveness. By promoting learner autonomy, DST aligns with research on the role of technology in fostering self-directed learning. Learners engaged in goal-setting and adaptive strategies, leading to a more autonomous and empowered learning experience.

While both groups experienced reduced speaking anxiety, the experimental group saw a greater reduction, suggesting DST’s role in creating a supportive and less stressful environment. The interactive and creative features of DST likely contributed to this decrease, allowing learners to express themselves with greater confidence.

From a practical standpoint, the alignment of these findings with SCT underscores the importance of designing language activities that promote collaboration and self-regulation. DST offers a rich, interactive platform that enhances language proficiency while fostering autonomy and reducing anxiety. To integrate DST effectively, educators should adopt a structured approach, aligning DST tasks with language learning objectives. DST can be introduced as a supplementary activity where students create stories related to their language goals, with activities scaffolded from basic narrative construction to more complex storytelling.

DST is also well-suited for promoting self-regulated learning. Educators can encourage students to set language goals—such as improving fluency or expanding vocabulary—at the outset of a DST project. Tools like progress checklists or reflection journals can help students monitor their learning and take ownership of the process, fostering deeper engagement. The multimedia nature of DST supports diverse learning styles, but educators must scaffold its use to prevent students from feeling overwhelmed. Introductory sessions on the technical aspects of DST, combined with ongoing support, can help students focus on language production. Encouraging peer collaboration during the storytelling process can further enrich the learning experience through idea exchange and mutual support.

The reduction in speaking anxiety observed in the experimental group suggests that DST creates a less intimidating environment for language practice. Educators can leverage this by using DST as an alternative to traditional oral presentations, allowing students to rehearse and refine their narratives, thus reducing performance pressure. A supportive classroom atmosphere, where peer feedback focuses on improvement, can further reduce anxiety and encourage fuller participation.

DST also promotes authentic language use in meaningful contexts. Educators can design projects that require learners to create stories about personal experiences, cultural traditions, or academic topics. These real-world contexts enhance engagement and motivation, encouraging the use of more complex language structures.

For successful DST implementation, educators need to be comfortable with both the technological and pedagogical aspects. Institutions should provide professional development, including workshops on using DST platforms and aligning them with language objectives. Access to user-friendly digital storytelling platforms and technical support for both educators and students is also essential. Institutions might also explore partnerships with technology providers to offer affordable access to DST tools, ensuring they are widely available to learners.

While this study contributes valuable insights, its limitations should be acknowledged, as they may inform future research. Firstly, the participants represented a specific demographic, so the results may not be fully generalizable to diverse learner populations. Future research could expand the participant pool to encompass a wider range of language learners, accounting for factors such as proficiency levels, cultural backgrounds, and learning preferences. Additionally, the intervention in this study was relatively short. Longitudinal studies tracking participants over extended periods would help clarify the sustained impact of digital storytelling on speaking skills, as well as the enduring benefits and potential challenges of incorporating DST into language curricula. Furthermore, the study primarily relied on quantitative measures to assess speaking skills, self-regulation, and anxiety reduction. Integrating qualitative methods, such as interviews or reflective journals, could offer deeper insights into learners’ experiences and perceptions regarding the use of DST in language learning.

Another limitation lies in the potential influence of external factors, such as learners’ prior exposure to technology or varying levels of motivation. While efforts were made to control for these variables, future studies might consider more rigorous control measures or explore how these factors interact with the effectiveness of DST in language education. Lastly, the study focused on the effects of DST in a controlled educational setting. Exploring the applicability and effectiveness of DST in more diverse and authentic language learning environments, such as community language programs or online language courses, would enhance the external validity of the findings.