Introduction

Over the last two years, a plethora of research has been conducted on using Generative Artificial Intelligence (GenAI) in teaching and learning contexts. GenAI is understood in this paper as “a category of AI that can create new content such as text, images, videos, and music” (OECD, 2025), in contrast to AI defined as “a technology that enables machines to imitate various complex human skills” (Sheikh et al., 2023, p. 15) or “systems that display intelligent behavior by analyzing their environment and taking actions—with some degree of autonomy—to achieve specific goals” (High-Level Expert Group on Artificial Intelligence, 2019, p. 1). The emergence of new technologies and tools based on GenAI has led language teachers and learners to integrate these into various tasks such as administrative work, classroom activities, and self-regulated learning (e.g., Ahn et al., 2024; Cardona et al., 2023; Dizon, 2024; Godwin-Jones et al., 2024). In language teaching and learning, the uses of GenAI range from language checking and text editing to generating materials for reading and writing, lesson plans, exercises, and images and videos (Dijkstra et al., 2022; Fawzi, 2023; Hsu, 2023; Kehoe, 2023; Kapp & Briskin, 2025; Kohnke et al., 2023; Li et al., 2024; van den Berg & du Plessis, 2023; Warschauer et al., 2023).

When GenAI tools first emerged, their benefits and hazards for education were unclear. Language teachers and teacher trainers mainly looked for ways to keep learners from using these tools, as they led to inappropriate ways of preparing assignments, plagiarism, overreliance, and cheating (e.g., Cotton et al., 2023; Fyfe, 2023; Jose & Jose, 2024) and raised other concerns such as bias and privacy (Akgün & Greenhow, 2022; Chan & Colloton, 2024), which are at the forefront of concerns about AI. However, attention has since shifted to the opportunities these tools offer for language teaching, learning, and classroom management, and to maximizing their benefits in and outside the classroom through a diversity of activities (e.g., Gazaille et al., 2023; Yee et al., 2023). Language learners in particular turn to such technologies, for example Grammarly, for academic purposes and support (Johnston et al., 2024; Walton Family Foundation, 2024).

In this article, pre-service language teachers are understood as students of language studies with a specialization in teaching who are just acquiring their first teaching and pedagogical experience, in contrast to experienced, practising teachers. Passing on recommendations for the use of GenAI in teaching, and especially in lesson planning, to this group is of great importance because it makes it possible, on the one hand, to sensitize this young generation to the possible challenges of the new technology and, on the other, to establish meaningful links between the technology and their practice at the beginning of their professional careers. In addition, today’s pre-service teachers, most of whom belong to Generation Z, approach technology and its use very differently from experienced teachers; hence, the approach to the two groups, e.g., in the context of schools, should be differentiated.

Pre-service language teachers might have limited, if any, access to GenAI in their university curriculum. However, awareness of GenAI use matters for future generations of teachers, who will soon be teaching, and given the rapid pace of technological development, teaching without GenAI may soon be impossible. Future graduates may finish their studies without enough practice in the use of GenAI for their future jobs, where AI will dominate in various aspects (Gatlin, 2023); this gap can be addressed by modeling the use of AI tools for language learning and teaching purposes (Moorhouse & Kohnke, 2024). More importantly, this can be done collectively and collaboratively with learners and pre-service teachers, who can think and reflect on what, how, and why they use GenAI in the classroom. Therefore, the rationale underlying the current study lies in investigating how GenAI can be introduced to pre-service language teachers and how GenAI can be used with Lesson Study (LS) in designing lesson plans for practicum. In addition, Clark and van Kessel (2024) highlight a research gap concerning the standardization of GenAI use for lesson planning. The researchers identify the need to develop protocols (standardized methods of proceeding) for GenAI implementation in language learning and teacher training. Implementing GenAI in practicum-related fields opens up new perspectives, since GenAI has become an important tool for designing the recruitment process in many fields (Chen, 2023; Shen & Zhang, 2024).

The study aims to investigate how GenAI tools are integrated into language lesson planning with a practicum-based focus and to dissect the challenges and benefits the participants face regarding training activities in integrating GenAI into lesson planning within the Lesson Study framework. In line with the aim of the study, the following research questions were proposed:

  1. How did participants integrate Generative AI (GenAI) tools into language lesson planning within the Lesson Study framework?

  2. What are the participants’ perceived benefits and challenges of training activities in integrating Generative AI (GenAI) into language lesson planning within the Lesson Study framework?

Literature review

GenAI in education

With the emergence of GenAI and GenAI tools such as ChatGPT and Copilot, technology has disrupted teaching and learning by generating audio-visual materials and carrying out many tasks, reducing the effort needed to create content. For many teaching and learning contexts, AI offers customized, diverse activities for learners at different levels, two-way communication, and human-like responses. AI tools have offered teachers many opportunities, from designing classroom activities and assessments to lesson plans (Kehoe, 2023; Peachey, 2024; Sekli et al., 2024; UNESCO, 2023). The use of technology in assessment and lesson planning has received much attention from researchers due to the opportunities provided, such as scoring and providing feedback (Edmett et al., 2024; Gatlin, 2023; Kehoe, 2023; Sadeghi & Douglas, 2023; van den Berg & du Plessis, 2023). As for using GenAI in lesson planning, several studies focus on teachers’ and pre-service teachers’ use of GenAI tools in creating lesson plans. For example, van den Berg and du Plessis (2023) investigated the role of GenAI tools such as ChatGPT in creating lesson plans and fostering critical thinking in teacher education. In that investigation, ChatGPT generated lessons for specific subjects and levels via prompts to develop objectives, steps, procedures, exercises, and worksheets. It was acknowledged that ChatGPT had impressive features that enabled teachers to create lesson plans and exercises; however, caution was advised regarding its limitations and potential biases.

In another study conducted by Hsu et al. (2024), it was found that primary pre-service teachers in Ireland mainly used GenAI tools, especially ChatGPT, for personal and academic tasks and activities. However, the results did not show frequent use compared to other uses. They suggested that pre-service teachers would benefit from training on the academic use of GenAI tools as a part of their professional development, which would also address their concern about the ethical use of these tools in the classroom (Holmes & Miao, 2023).

GenAI in language teaching

The report by Edmett et al. (2024) presents the results of a global survey of 1,348 teachers in 118 countries on their use of GenAI tools. The findings indicated that English language teachers mainly used chatbots and that many participants used GenAI tools to create lesson plans. Contrary to these findings, the results of a survey distributed by Elsherbiny (2024) to sixteen language instructors and eighty-nine students at several universities in the USA indicate that most of these instructors have yet to use AI in their teaching. The instructors who reported using AI in their classrooms indicated that they mainly used it to create classroom materials and to save time on lesson plans and classroom assessments. Interestingly, however, most of the surveyed students reported using GenAI tools for language learning.

The study conducted by Mananay (2024) sheds new light on GenAI use, pointing out the importance of GenAI in the customization of lesson plans and scaffolding practices. The study was based on a survey of 100 teachers from Visayas with different lengths of experience and a focus group discussion among selected representatives of the survey participants. The findings highlight the significance of incorporating ChatGPT into the scheduling and planning of educational activities. A key point of the teachers’ reflections was the need for adequate training programs at almost every stage of teacher education to better prepare future teachers for the challenges of using ChatGPT.

Lesson study model as the framework

Lesson Study (LS) was originally a Japanese model of collaborative professional development aimed at improving science and mathematics teaching; however, it has developed over time and is now used for lesson planning, practice-oriented research, and other forms of professional development (Kusanagi, 2021). In lesson planning, LS is conducted through several phases, such as planning, practice, and assessment (Kıncal et al., 2019). In the planning phase, a group of teachers prepares a lesson together. Then, in the practice phase, the lesson is taught by one of these teachers while the others observe the classroom and the students. In the last phase, assessment, teachers assess the lesson plan, reflect on it, and revise it based on the comments and suggestions. Studies on teaching practicum and technology have drawn on LS in various ways. For example, Çetin (2024) and Çetin and Daloğlu (2025) proposed a model named Practicum Lesson Study (PLS) to use the benefits of implementing LS and provide a more practical and sustainable model for lesson planning. Hudson et al. (2024) also offered a rural science teacher development model via Technology-Mediated Lesson Study (TMLS) to benefit teachers living in different regions through collaboration and implementation of lessons. Coenders and Verhoef (2018) reported that the lesson plan model can be implemented successfully in teachers’ personal development, which reversed the previous perception of the method as one for teaching only and shifted interest toward its application to teachers’ self-regulatory practices. In their concept, Coenders and Verhoef (2018) discussed a two-tier system of knowledge widening and experience exchange through active observation and reflection on the implemented strategies and techniques. In the development phase, teachers are confronted with new pedagogies and their potential use to meet changing needs. The second phase aims at reflecting on the methods used and building connections between the personal development sphere, practice, developed materials, and potential consequences.

Practicum between teaching reality and higher education

Practicum is an important part of education for future teachers, as it allows students to have their first encounter with the professional context and to put the theory they acquired during their studies into practice. Reflection on their own performance during the practicum might be crucial for their future career path and their understanding of job-related challenges.

In the current study, GenAI was integrated into lesson planning and reflection through LS (Lewis et al., 2004; Lewis & Hurd, 2011), in a five-module GenAI training model for lesson design based on LS: (1) Study: GenAI Tools and Discussion; (2) Plan: Lesson Planning and Critical Evaluation; (3) Teach: GenAI-enhanced Lessons; (4) Observation: Structured Observations and Feedback; (5) Reflect: Reflective Analysis and Ethical Challenges. This model, guided by LS, which is the instructional inquiry model of Lewis and Hurd (2011), has been tailored for language teachers, emphasizing specific training on GenAI tools for lesson planning as well as teaching practice and reflections based on the lesson plans designed and improved in these phases. Each phase focuses on pedagogical integration and ethical considerations related to using GenAI in lesson planning and teaching, ensuring that students learn how to use these tools effectively and consider the broader implications of their use in educational contexts. The LS-based GenAI training was conducted over 12 weeks in the study, and details are provided in the Procedure section.

Method

This study adopted a qualitative case study approach to investigate the integration of GenAI tools in the practicum settings of Teaching English as a Foreign Language (TEFL). The choice of a case study design was driven by the need for an in-depth, context-rich understanding of how pre-service English language teachers engage with GenAI tools during their teaching practice, how these tools influence pedagogical decision-making, and the perceived impacts on language learning processes and outcomes. A case study is particularly suitable when the research aims to explore a contemporary phenomenon within its real-life context, especially when the boundaries between the phenomenon and the context are not clearly defined. In this regard, the TEFL practicum, as a complex, authentic instructional context involving real students, institutional expectations, and mentor supervision, provides a dynamic backdrop against which the use of GenAI tools can be critically examined. To enhance methodological rigor, the study was further framed within the LS methodology, an established teacher professional development model that emphasizes collaborative planning, observation, and reflection on teaching practices. Embedding the study within the LS framework enabled a cyclical exploration of GenAI tool integration, where teacher participants iteratively designed lessons incorporating AI tools, implemented these lessons in practicum classrooms, and then engaged in reflective dialogs with peers and mentors. This methodological synergy between the case study design and the LS framework allowed the researchers to capture both the situated complexity of GenAI use and the evolving pedagogical reasoning of novice teachers as they grappled with new technological affordances. The LS framework also provided a systematic structure for data collection across phases (plan–teach–observe–reflect), allowing triangulation from multiple sources and participants over time.

Participants

The study was conducted with ten pre-service English language teachers in their final year of a four-year teacher education program at a state university in Türkiye. The participants, comprising seven males and three females, were aged between 22 and 24 (Table 1). All participants had completed the required theoretical coursework and were engaged in practicum experiences at various K–12 schools as part of their graduation requirements. The unit of analysis for this study was the individual pre-service teacher, as each participant represented a distinct case for analyzing the integration and experience with GenAI tools.

Table 1 Participants’ characteristics.

For the qualitative component of the study, a purposive sampling strategy was employed to select participants in the department. This group was targeted because they had previously taken courses addressing the use of technology in materials design and were doing their practice teaching in schools. The participants were considered a homogeneous group since they took the same courses in the department and their English proficiency was comparable, as indicated by their grades in these courses and their university entrance performance. The recruitment process for this study involved a combination of email communications and oral invitations, ensuring accessibility and clarity in initial participant engagement. Potential participants were contacted directly via institutional email addresses, and in some cases, they were also approached in person. All individuals who agreed to participate in the study provided written informed consent prior to data collection. The consent process included a clear explanation of the study’s purpose, the procedures involved, and participants’ rights. Specifically, participants were informed that their involvement was entirely voluntary and that they retained the right to withdraw from the study at any point without any negative consequences. Additionally, the anonymity and confidentiality of their responses were emphasized, and all data collected were anonymized to protect the identities of the participants.

Context

The study was situated within the final semester of a four-year pre-service English language teacher education program at a state university in Türkiye. At this stage in their academic trajectory, participants had completed the core theoretical coursework in areas such as second language acquisition, language teaching methodology, materials design, assessment, and coursebook evaluation. These foundational courses were designed to equip pre-service teachers with both pedagogical knowledge and practical skills relevant to contemporary language teaching contexts. Notably, several of these courses explicitly addressed the integration of technology in language instruction, particularly in materials development and classroom practice, providing participants with conceptual and applied insights into technology-enhanced teaching. During the semester in which the study was conducted, the participants were concurrently enrolled in their second and final teaching practicum. This practicum, a capstone component of the program, required them to engage in classroom teaching within K–12 educational settings under the dual supervision of mentor teachers at placement schools and a university-based practicum supervisor. Within this structure, participants were expected to design, implement, and evaluate lesson plans and teaching materials in authentic instructional environments. A specific emphasis was placed on the integration of digital technologies in lesson planning, instruction, and assessment. Participants were also encouraged to reflect critically on their own teaching practices as well as those of their peers, often through structured peer feedback sessions and guided self-evaluation activities.

Procedure

The study was conducted in five phases, namely Phase 1: Study: GenAI Tools and Discussion, Phase 2: Plan: Lesson Planning and Critical Evaluation, Phase 3: Teach: GenAI-enhanced Lessons, Phase 4: Observation: Structured Observations and Feedback, and Phase 5: Reflect: Reflective Analysis and Ethical Challenges (Fig. 1).

Fig. 1 Procedure of the study.

Phase 1: Study: GenAI tools and discussion

The lecturer conducted a workshop to familiarize the participants (practicum students) with various GenAI tools for lesson planning and activities (Weeks 1–2). This section included a demonstration of Large Language Models (LLMs) such as ChatGPT, Gemini, and Copilot, activity generators such as Twee, MagicSchool, and Diffit, and formative assessment platforms such as Kahoot! and Quizizz. The goal was to build awareness and demonstrate the potential use of GenAI to enhance lesson planning, activities, and material creation. The session also included guided discussions that explored the ethical considerations of using GenAI in language teaching and learning. The topics included privacy issues, AI bias, intellectual property rights, and the implications of GenAI for academic integrity, as outlined by the Council of Higher Education (CoHE, 2024) in Türkiye. The participants also worked collaboratively on a code of ethics outlining responsible GenAI use in lesson planning and materials design.

Phase 2: Plan: Lesson planning and critical evaluation

In this phase, participants were guided in using GenAI tools to generate lesson materials based on the curriculum of the practicum school, such as tailored reading passages, vocabulary lists, quizzes, and scaffolded activities, and practiced using AI as a brainstorming partner, supporting creativity and innovation. Moreover, the participants created AI-generated materials for diverse learning needs, such as providing different reading levels or alternative approaches for varying language abilities, fostering a student-centered approach. Upon creating materials, the participants also collaboratively reviewed AI-generated content for cultural, linguistic, and contextual appropriateness. For example, they might check for language nuances that could affect learners, ensuring resources align with cultural sensitivity. The participants also discussed transparency regarding AI-generated content with learners.

Phase 3: Teach: GenAI-enhanced lessons

After a week of developing these lesson plans, one teacher from each group implemented their AI-enhanced lesson plans in natural classroom settings in the practicum schools. During the teaching phase, the participants used the AI tools introduced in previous sessions to engage students actively, which was particularly beneficial in language classes where immediate feedback and practice were essential. For instance, language quizzes or interactive vocabulary games were used to boost engagement, especially at the beginning and end of the lessons. The participants also emphasized safeguarding student data by favoring real-time AI applications that limit personal information collection.

Phase 4: Observation: Structured observations and feedback

During the teaching phase, the observer participants, the mentor, and the lecturer observed the classroom, the students’ reactions to the materials, the interaction, and how GenAI-based materials supported lesson delivery and classroom management. Observers (mentors and peers) conducted structured observations of the AI-integrated lessons, using the observation checklist to gather data on engagement levels, the effectiveness of GenAI tools, and the overall learning environment. Following the observations, mentors and peers held feedback sessions to discuss the observations with the participants. The feedback included insights on how effectively AI tools enhanced traditional teaching methods and their influence on the student-teacher dynamic.

Phase 5: Reflect: Reflective analysis and ethical challenges

After implementing the AI-driven lessons, practicum students engaged in reflective analysis sessions. Under the teacher’s direction, they discussed the impact of GenAI on their teaching efficacy, classroom participation, and student learning outcomes. The participants also identified specific instances where AI positively contributed to their teaching and areas where it may have limited creativity or interaction. Moreover, they documented the challenges of using GenAI that were raised in the discussions. They developed a code of practice for GenAI in lesson planning and teaching, which resulted from their reflections and experiences and could serve as a resource for future practicum students and teacher educators.

Data collection and instruments

Qualitative data were collected from four types of data sources for the integration of GenAI in lesson planning: teaching materials, available as a supplementary file (lesson plans and activities, Appendix A), self-report (Appendix B) and evaluation (Appendix C), peer and mentor evaluations (Appendix D), and discussions (notes from meetings). Each participant contributed the following types of data: one GenAI-enhanced lesson plan, one teaching observation checklist (completed by a peer or mentor), and one written reflection. In total, the dataset consisted of 10 lesson plans, 10 structured observation forms, and 10 participant reflections. Additionally, 4 group discussion transcripts from peer debriefings were included. For the first research question, participants’ lesson plan activities and observations were evaluated to determine how GenAI tools were integrated into activities and lesson plans within the LS during the training sessions, as well as the participants’ teaching of these plans and using the activities in real classrooms. Data were collected by the first author, who also served as the practicum supervisor, with structured observations conducted by peers and mentors based on a shared rubric.

Data analysis procedures

The data analysis in this study followed a thematic analysis approach, a widely used method in qualitative research for identifying, analyzing, and reporting patterns (themes) within the data. This approach was deemed appropriate due to its flexibility and its capacity to provide a rich, detailed, and nuanced understanding of participants’ experiences and perceptions. The process of data analysis was conducted in a systematic manner through several stages.

All data sources, including participants’ lesson plans, activity materials, reflective journals, and feedback from structured observations, were transcribed verbatim, where necessary, to ensure accurate representation. This transcription also involved the integration of field notes taken during the observations and meetings. The first step in analysis involved familiarizing the researcher with the data through careful reading and re-reading. Initial coding was conducted manually to highlight meaningful segments within the data that aligned with the research questions, focusing on how GenAI tools were integrated, the teaching strategies used, and the perceptions and challenges of the participants.

The coding process was iterative, with each researcher independently coding the data, followed by discussions to ensure consistency and reliability. Following initial coding, the next step involved grouping related codes into potential themes. Themes were developed inductively, meaning they emerged from the data rather than being imposed at the outset. Themes were refined through discussions among the research team, ensuring that they captured the nuances of participants’ experiences with GenAI tools in their practicum. During this stage, the data were continually revisited, and new insights were integrated into the analysis. As the study employed a case study approach, a cross-case analysis was also performed. The unit of analysis for this study was the individual participant’s experience with GenAI tool integration in their lesson planning and teaching.

For cross-case analysis, the research team compared themes across the participants to identify common patterns and variations. This process was essential to understanding how different contextual factors (e.g., school environment, mentor influence, and participants’ prior experiences) influenced the integration and effectiveness of GenAI tools. The final stage of analysis involved interpreting the findings, linking them to existing literature, and addressing the research questions. To ensure validity, the research team employed member checking, where participants were invited to review the findings and confirm whether the interpretation resonated with their experiences. Additionally, peer debriefing sessions were conducted with colleagues from the same academic field to further ensure the credibility of the results.

Results

Use of GenAI tools in lesson planning

The first research question aimed to determine how GenAI tools were used in lesson planning within the LS framework. Table 2 provides the related themes, categories, codes, and representative quotes, highlighting the various GenAI tools used by participants for lesson planning.

Table 2 Types of GenAI tools and their applications in lesson planning.

As indicated in this table, the participants leveraged various GenAI tools to enhance their lesson-planning process, utilizing distinct tools to address diverse pedagogical needs. Nine participants relied on ChatGPT and MagicSchool for lesson content creation, crafting reading passages and comprehension questions. Another six used ChatGPT and Gemini to generate vocabulary lists and explanations. While the tools were common across many participants, their applications often reflected individual pedagogical intentions. For example, Participant 3 used ChatGPT not just for generating comprehension questions, but to simulate scaffolded questioning, saying, “I started with simple yes/no questions and moved toward inferential ones. ChatGPT gave me a base, but I modified it according to my students’ reactions in class”. In contrast, Participant 7 focused on saving time: “The AI gave me ready-made content that was good enough. I didn’t edit much—it was useful, especially when I had limited prep time”. These cases reveal differing degrees of critical engagement with AI outputs.

As for images, seven participants incorporated Microsoft Bing Image Generator to create visual aids, while five used DALL·E to generate cultural visuals. For instance, Participant 6 discussed how she used DALL·E to design visuals that would resonate with students’ interests: “I asked it to generate fantasy-style settings to explain narrative tenses. The students were excited because the visuals weren’t from the textbook”. Conversely, Participant 1 expressed hesitation, stating, “Some images didn’t feel appropriate for my class. I had to check each carefully before showing them”. This contrast illustrates a tension between innovation and caution.

Video-based tools such as Synthesia and Animoto were used by 5 and 3 participants, respectively, to create instructional videos and animated storytelling. Audio tools featured prominently, with 4 participants using ElevenLabs for listening activities. One participant noted, “ElevenLabs allowed me to generate high-quality listening activities”. Similarly, 3 participants employed Audyo to produce Customized Dialogs for targeted listening comprehension tasks. Participant 8 found the audio tools empowering: “With ElevenLabs, I could adjust the accent and pace. It was helpful for teaching different varieties of English”. Meanwhile, Participant 2 raised concerns about authenticity: “The voices sounded robotic. I wasn’t sure if this was what students should be listening to when trying to learn natural pronunciation”. Kahoot! was widely used by 8 participants for quizzes and games, and Quizizz by 5 participants for formative assessments. Finally, 4 participants utilized Canva to create multimedia resources that integrated text, images, and audio, emphasizing its role in catering to Diverse Learning Styles. Participant 5 described how Canva served as a unifying tool: “It became the final step where I pulled everything together—text from ChatGPT, images from DALL·E, and audio clips”. In contrast, Participant 9 found the multimodal design process overwhelming: “It took more time than I expected. AI gave me the parts, but assembling them took a lot of decision-making”. These individual differences show that while tool use was common, the pedagogical rationales, critical editing, and emotional responses varied, highlighting the complexity of integrating GenAI into lesson planning within authentic classroom contexts.

Benefits and challenges

The participants’ perceived challenges and benefits of training activities in integrating GenAI into lesson planning were also investigated within the LS framework, and Table 3 presents the emerging themes, categories, codes, and representative quotes.

Table 3 Theme, categories, and codes for GenAI integration in lesson planning.

Benefits of GenAI

The “Benefits of GenAI” theme emerged as a central aspect of the participants’ experiences. This theme is supported by three key categories: Activity Creation, Material Personalization, and Collaborative Design. Activity Creation was the category most frequently mentioned by participants. Tailoring content for different levels was noted by 10 participants, highlighting the flexibility of GenAI tools in creating customized reading and vocabulary exercises. One participant remarked, “Using AI helped me design reading texts for advanced and beginner students, which saved me a lot of time”. Participant 4 explained, “I had a mixed-level class, and I asked ChatGPT to write two versions of the same reading passage, one for A2 and one for B1. That helped me give something to everyone without rewriting from scratch”. However, not all participants were equally satisfied. Participant 10 noted, “Sometimes the AI couldn’t simplify the text in a meaningful way; it would just shorten sentences without making it easier to understand”. This highlights a limitation of AI’s adaptive potential. Similarly, Interactive Exercises were cited by 8 participants as a key benefit, with one stating, “The AI-generated vocabulary games boosted student engagement, especially at the beginning of the lesson”.

Material Personalization was another prominent category. Seven participants emphasized the ability to create differentiated learning materials that cater to diverse student needs. One participant noted, “AI allowed me to prepare materials suited for different student needs in the same classroom”. Five participants reflected on the issue of contextual appropriateness. Participant 6 illustrated a moment of critical awareness: “One text had a birthday party scene with alcoholic drinks. I teach in a conservative area, so I had to remove those references. AI doesn’t always understand the local classroom culture”. In contrast, Participant 2 argued, “I didn’t mind such issues; students actually found it funny, and we used it as a discussion topic”.

Challenges of GenAI

Under the “Challenges of GenAI” theme, three categories emerged: Appropriateness of Content, Overuse, and Ethical Considerations. The challenge of appropriateness of content was mentioned by 5 participants. AI-generated content occasionally included culturally inappropriate phrases, which required careful revision. As one participant said, “Some phrases generated by AI were culturally inappropriate, but reviewing them helped me tailor the content”.

Over-reliance was a concern raised by seven participants. For instance, Participant 7 admitted, “I didn’t write my lesson plans from scratch at all—just edited what ChatGPT gave me. I know it’s not ideal, but I had too much going on”. This contrasts sharply with Participant 3, who stated, “I used AI as a brainstorming partner. It never replaced my own planning”. These differing stances illustrate the tension between pragmatism and professional identity in early-career teachers.

Ethical considerations emerged as a central theme in participants’ reflections, underscoring the multifaceted responsibilities educators bear when integrating AI tools into pedagogical contexts. A recurrent issue emphasized by six participants was the necessity of transparency with learners. These participants expressed a conscientious effort to inform students about the use of AI-generated content in the preparation of lesson materials, reflecting a commitment to ethical teaching practices and informed student engagement. One participant articulated this clearly: “I always informed students when using AI-generated materials to keep everything transparent”. This emphasis on openness aligns with broader ethical principles in education, particularly those related to trust, accountability, and the development of learners’ digital literacies.

Another important ethical issue raised by four participants pertained to bias and accuracy in AI-generated content. Participants acknowledged that while AI tools can expedite lesson planning and resource generation, they are not devoid of flaws. These educators highlighted the need for vigilant content review, particularly to identify and mitigate potential algorithmic biases that might propagate stereotypes or misrepresent cultural or linguistic realities. The potential for AI-generated content to unintentionally reflect dominant ideologies or marginalize minority perspectives was seen as a serious ethical concern, particularly in multilingual and multicultural classrooms. This awareness reflects an emerging critical literacy among educators who are not only users of technology but also evaluators of its socio-educational implications.

A particularly rich area of ethical engagement was found in the domain of collaborative peer review, cited by nine participants. Peer discussions around AI-generated materials were described as invaluable, not only for enhancing the quality and contextual appropriateness of resources but also for fostering a shared ethical responsibility. These discussions often served as informal sites of professional development, where educators could interrogate the pedagogical assumptions embedded in AI-generated content, reflect on representational fairness, and deliberate on the appropriateness of certain materials for diverse learner groups. As one participant noted, “Discussing AI-generated content with peers helped refine and improve the material for better classroom use”. In this sense, collaborative review processes functioned as ethical safeguards, allowing for collective scrutiny and continuous reflection.

Moreover, some participants hinted at broader ethical dilemmas concerning authorship and intellectual responsibility. Although not always explicitly articulated, there was an undercurrent of concern about the implications of using machine-generated content in place of teacher-created materials. Questions such as “Who owns the lesson?”, “Whose voice is being represented?”, and “To what extent should we rely on AI tools?” were implicit in the discussions. These concerns touch upon foundational issues in the ethics of education, including the authenticity of teacher work, the preservation of pedagogical identity, and the shaping of epistemological norms in the AI-enhanced classroom.

Discussion

The findings showed that the participants used several tools for designing lesson plans and practiced generating multiple lesson plans and activities with them. The findings also indicated that the participants created efficient lesson plans using these tools. The tools were useful, especially for creating activities and lesson plans on functions and topics that coursebooks may not typically cover. The quality of these lesson plans was evaluated during the discussions and, once they had been fully developed, after their implementation in real classes. The participants used GenAI tools such as ChatGPT and Gemini for lesson and content creation, so they did not have to prepare lesson plans from scratch, although in some cases they used these tools to revise plans they had already created. Although the number of available GenAI tools is increasing daily (Peachey, 2024), the analysis of the tools selected for lesson plan and activity creation indicates that the participants mainly used freely available ones that were easy to access and use. During the Study and Plan phases, the participants viewed these tools more as teaching assistants or mentors ready to comment on the work and help them (Ahn et al., 2024; Clark & van Kessel, 2024). Moreover, during the reflective analysis, several participants also used ChatGPT as a self-directed tool for not only learning but also teaching, which is in line with the discussion by Dizon (2024) and the reviews by Li et al. (2024) and Kohnke et al. (2023).

The LS framework provided a structured, collaborative environment that enabled participants to effectively integrate GenAI tools into their lesson planning and teaching practices. By following the LS phases, participants not only explored a variety of AI tools but also critically evaluated their pedagogical impact. During the Study phase, participants familiarized themselves with AI tools like ChatGPT, DALL·E, and ElevenLabs, gaining a clear understanding of their potential to create diverse educational materials. This phase seems to have laid the foundation for participants to experiment with AI-enhanced lesson plans, as they brainstormed innovative ways to address various learning objectives. This phase was also important for the participants, as they could try the AI tools modeled by the lecturer (Mananay, 2024; Moorhouse & Kohnke, 2024; Peachey, 2024).

The Plan phase supported this exploration by encouraging collaborative material development and peer reviews, fostering a culture of shared learning and constructive feedback. Participants collaboratively assessed the appropriateness of AI-generated content, ensuring cultural and linguistic relevance while tailoring activities to meet diverse student needs. This finding is in line with the suggestion by Hong (2023) that lesson plans created using ChatGPT and other GenAI tools be used as a starting point: the output is checked, discussed, adapted, and improved as a group, which was possible here through the LS framework.

The Teach phase allowed participants to implement their AI-enhanced lesson plans in real classroom settings, where they observed first-hand the impact of these tools on student engagement and learning outcomes. Structured observations during the Observation phase, conducted by peers and mentors, provided critical insights into the effectiveness of GenAI tools in improving lesson delivery and classroom interaction. This might have also contributed to the participants’ critical thinking during these sessions, which corroborates the finding of the study by van den Berg and du Plessis (2023). Feedback sessions helped participants refine their approaches, highlighting the benefits and limitations of GenAI integration.

Finally, the Reflect phase facilitated deep reflective practices as participants analyzed their experiences, identified ethical considerations, and developed a practical code of conduct for GenAI use in education. This iterative process within the LS framework improved participants’ pedagogical skills and fostered a critical understanding of how to use emerging technologies responsibly and effectively in language education.

While the participants valued using these GenAI tools for tailoring content for different levels and creating differentiated and contextually appropriate learning materials, which is in line with the findings of other studies (e.g., Edmett et al., 2024; Gatlin, 2023), several issues were also raised, including transparency with learners and addressing bias and accuracy (Li et al., 2024). The participants underscored the importance of informing learners whenever they used materials produced and/or improved using GenAI tools, as learners were expected to do the same in their own work. Since GenAI tools are trained on knowledge created by humans, they might reproduce the biases present in this knowledge. Therefore, the participants also suggested checking the outputs for unfair bias, in line with the discussion and suggestions by Akgün and Greenhow (2022) and Chan and Colloton (2024). Moreover, overreliance was another concern raised by the participants, who indicated that human evaluation of the GenAI output was always required, which is in line with the discussion by Jose and Jose (2024).

As the participants in the current study indicated, GenAI tools can be an important supplement to the short and limited time spent with learners, allowing more time for interaction, discussion, and sharing in and outside the classroom (e.g., Gazaille et al., 2023; Yee et al., 2023). These tools also provide a range of effective materials for enhancing receptive and productive capabilities through diverse audio-visual aids. It is, therefore, important for teachers to think carefully about how they will use different technologies and what they want the students to do with them. In addition, GenAI outputs might include inaccurate information and suggestions, so teachers need to judge and evaluate these outputs for potential issues. Equity of access to GenAI tools might also be a concern, as not every learner has access to advanced features, and those without it might be at a disadvantage. In the near future, this might widen the gap between those who have access to GenAI tools with more advanced features and those who have no access or have to use the basic, free versions.

As for pedagogical implications and guidance on using GenAI tools in lesson planning, the following principles can be stated based on the participants’ responses and views.

  • Teachers do not have to use GenAI tools for the exercises and lesson plans they could prepare themselves. However, GenAI tools can be used to improve and enhance activities, and teachers must see these tools as assistants. These tools are not meant to replace teachers.

  • GenAI outputs must be carefully evaluated and edited by teachers themselves, using reliable sources for the online/print materials. Personalization of the content based on local needs may always be necessary.

  • As learners are expected and encouraged to disclose any GenAI use in their work, teachers should also inform learners of their own use of these tools in materials such as reading texts, vocabulary activities, and lesson plans.

  • Before using GenAI tools, teachers must carefully inform themselves about and follow the AI policy in their schools or institutions. If the policy or instructions are unclear, teachers can request information and/or use international guidelines.

  • When assigning work to learners, teachers must also consider that not all learners have equal access to these tools.

  • Teachers must be careful about privacy and security issues. They must protect their own and their learners’ privacy when writing prompts and interacting with these GenAI tools.

Conclusion

There is a wide range of technologies that we can use to help both teachers and learners in different contexts. However, among these, GenAI has been by far the most disruptive one, as it can generate any type of audio-visual and written content and act as the material writer. In contrast, previous technologies were tools that helped us enhance materials by turning them into more engaging and interactive versions. Considering all the data obtained through journals, discussions, and lesson plans within the LS study, it can be concluded that GenAI tools can be effective for enhancing lesson plans and activities if appropriate training is provided in a collaborative environment and certain principles are followed, such as checking and evaluating the output. Like everything else, GenAI technology needs to be used effectively and efficiently, not just for the sake of use. What is important for us, however, is to work out how to meaningfully integrate these technologies into what is usually done in the classroom.

Limitations and suggestions for further research

This research has a number of limitations that should be taken into account. To begin with, the results are based on a small, context-dependent sample of pre-service teachers undertaking a practicum-based Lesson Study in one teacher education program. Therefore, the generalizability of these results to other settings, institutions, or in-service contexts may be restricted. The participants’ past experiences with GenAI tools and their levels of digital literacy were not determined systematically, and these could have influenced their use of the tools as well as their ethical and pedagogical considerations. Further research can address these limitations through longitudinal, multi-site studies with a diverse range of teacher populations, including practicing teachers as well as individuals from different cultural and institutional contexts. Comparative case studies would also help describe how contextual variables (i.e., institutional policy, digital infrastructure, and student populations) mediate the integration of GenAI into lesson planning. Future research could also examine the impact of formal training on GenAI pedagogies, such as how formalized ethical guidelines and critical media literacy education shape teacher decision-making and student outcomes. There is a pressing need for research that examines the implications of using GenAI for the identity, authorship, and professional autonomy of educators, particularly as AI tools become more embedded in pedagogical design and evaluative practices.