Introduction

In the era of Artificial Intelligence (AI), the importance of programming education is becoming increasingly prominent, and it has become one of the key ways to cultivate future innovators (González-Pérez & Ramírez-Montoya, 2022). Computational thinking (CT), as a core competency for adapting to changes in digital society and AI, is widely recognized as a foundational skill(Tikva & Tambouris, 2021), and programming is an effective way to develop CT (Belmar, 2022). According to the International Society for Technology in Education (ISTE), CT includes problem solving, creativity, critical thinking, algorithmic thinking, and collaborative skills, elements that help students meet the challenges of increasingly complex and open digital technologies (ISTE, 2015). A growing number of scholars are advocating for the introduction of programming education at the primary and secondary school levels due to the fact that it not only equips students with knowledge and skills in the field of computing, but also positively impacts their future learning; moreover, the earlier programming education begins, the more pronounced the benefits will be (Lai et al., 2021; Lindberg et al., 2019). However, programming is particularly challenging for primary and secondary school students (Yang & Lin, 2024), requiring sufficient external support during the learning process (Webb et al., 2017), such as assistance from teachers. However, due to the low teacher-student ratio in programming classes, it is difficult to provide timely help, leaving some students unable to achieve the expected outcomes in programming learning, which in turn reduces their programming self-efficacy (Medeiros et al., 2018). Self-efficacy, defined as an individual’s confidence in their ability to complete specific tasks, directly influences their persistence and effort when facing challenges (Bandura, 1978). Previous research has shown that self-efficacy plays a critical role in programming learning (Gurer & Tokumaci, 2020). Therefore, how to provide timely support for students in K-12 programming education and improve their programming performance, CT, and self-efficacy has become an important issue that needs to be addressed in the field of educational technology.

The rapid development of Generative Artificial Intelligence (GenAI) technology brings some solutions to the above problems. GenAI chatbots based on advanced converter architectures (e.g., GPT-4) can predict, understand, and generate human-like text (Pavlik, 2023), generate coherent and contextually relevant responses during long interactions (Mohamed, 2024), and provide targeted assistance based on individual needs. With their powerful natural language understanding and extensive knowledge base, GenAI-based chatbots can provide timely feedback to users, and thus a growing number of researchers are applying them in the field of education (Kasneci et al., 2023). In programming instruction, GenAI chatbots can give various solutions to students’ problems, provide code examples, or modify code (Husain, 2024). Using GenAI chatbots as programming learning assistants for elementary and secondary school students will increase confidence in solving programming problems and improve learning outcomes and motivation (Muñoz et al., 2023; Shoufan, 2023). Although the GenAI chatbot shows great potential in the field of programming education, there are certain problems. Currently, GenAI chatbots are commonly used for the delivery of learning knowledge or as practice tools (McGrath et al., 2024). Without additional instruction or strategies to guide students in the use of chatbots, students may struggle to effectively correlate newly acquired knowledge with existing knowledge, leading to disappointing learning outcomes (Chin et al., 2014; Hwang et al., 2019). In addition, learners may lack critical thinking when using GenAI and get answers to questions directly through the tool. This adversely affects the development of skills such as critical thinking creativity, and if students become overly reliant on GenAI chatbots, they may fail to develop the skills needed to solve problems on their own, which may even lead to cognitive stagnation (Dwivedi et al., 2023; Qin et al., 2023).

In terms of helping learners organize what they have learned, researchers have suggested the use of mind maps, which is a method of instructing learners to present elements related to core concepts in a graphical pattern (Edwards & Cooper, 2010). As a thinking visualization tool, mind mapping can externalize learners’ thinking processes and guide students to think in an orderly manner and engage in reflective learning(Merchie & Van Keer, 2012; Stokhof et al., 2020). Many previous studies have reported the benefits of mind mapping for improving students’ academic performance and problem-solving skills (Cristea et al., 2011; Eppler, 2006). Feedback of answers from GenAI chatbots relies on students’ questioning skills (Xia et al., 2022). Mind mapping organizes and develops students’ ideas (Buzan, 2024), which will facilitate their provision of clearer and more structured information when interacting with the chatbot to enhance the accuracy of its feedback (Abd-Alrazaq et al., 2023), and will also help students to understand and internalize what they have learned in a deeper way, which will lead to better learning outcomes. The inclusion of a mind mapping session will also avoid the tendency of students to get their answers directly from the GenAI chatbot, which helps to develop their critical thinking and problem-solving skills (Dwivedi et al., 2023). Therefore, this study proposed a learning approach that integrates mind mapping with a GenAI chatbot to improve students’ programming performance. In addition, numerous studies have shown that the type of mind maps and the way they are used can also affect the impact of learning outcomes (Shi et al., 2023; Zhao et al., 2022). It is also an interesting topic to explore how generative chatbots, supported by different types of mind maps, affect students’ programming learning outcomes. Based on this, this study proposed the following research questions:

RQ1: Does an integrated mind mapping and GenAI chatbot learning approach improve students’ programming academic performance?And is there a difference in the impact of different types of mind mapping-supported GenAI chatbots on students’ programming academic performance?

RQ2: Does an integrated mind mapping and GenAI chatbot learning approach improve students’ computational thinking? And is there a difference in the impact of GenAI chatbots supported by different types of mind maps on students’ computational thinking?

RQ3: Does an integrated mind mapping and GenAI chatbot learning approach improve students’ self-efficacy? And are there differences in the effects of different types of mind mapping-supported GenAI chatbots on students’ self-efficacy?

Literature review

Programming education in K-12

In recent years, programming education has received increasing attention from educators and researchers as an effective way to develop students’ 21st-century skills (Hu, 2024). Many countries and regions are incorporating programming into K-12 education in response to the future demand for talent in a digital society (Åkerfeldt et al., 2024).

However, numerous studies have shown that programming poses a significant challenge for K-12 students due to their lack of relevant knowledge. The abstract and complex nature of programming concepts is a significant barrier for K-12 learners (Ma et al., 2023). To address the obstacles students face in learning programming, researchers have introduced various teaching tools and methods in programming education (Lai & Wong, 2022). For example, graphical programming languages help students intuitively understand programming concepts and processes (Tsai, 2019). Game-based programming learning, through a series of designed instructional scenarios, makes learning both challenging and fun, potentially motivating students to achieve better learning outcomes. Additionally, collaborative programming is another teaching method (Wu et al., 2019) whereby students work in groups to complete programming tasks and build knowledge (Van Aalst, 2009), understanding complex concepts and solving problems through communication and conflict resolution. Although these tools and methods can somewhat reduce the difficulty of learning programming, challenges remain. Programming, as an activity that involves logic and problem solving, requires complex thinking skills, which are often difficult for students to develop independently (Shadiev et al., 2014). They need instructional support to clearly articulate programming logic and express solutions (Kwon, 2017).

Chatbots in education

Chatbots, also known as conversational agents, are computer programs designed to simulate human-like conversations (Wu & Yu, 2024; R. Zhang et al., 2023b). They interact with users on specific topics or in specific domains through text and speech in a natural conversational manner (Smutny & Schreiberova, 2020). There are significant differences between traditional chatbots and AI chatbots (Wu & Yu, 2024). Traditional chatbots rely on predefined patterns and templates, and their interactions are limited by rules that do not allow them to accurately understand students’ questions, leading to the provision of irrelevant or fixed answers (Coniam, 2014; Yang et al., 2022). Not only does this not help solve the problem, but it may also increase student negativity and decrease technology acceptance (Yang et al., 2022). GenAI chatbots use a variety of AI techniques such as natural language processing, machine learning, information retrieval, and deep learning, which retain user input and learns from previous user input, promoting enhanced engagement and interaction (Nguyen et al., 2022).

Researchers have noted that GenAI chatbots have great potential in education to improve student performance to some extent (McGrath et al., 2024). For example, Tai and Chen (2024) developed a GenAI chatbot based on ChatGPT and applied it to teaching English speaking in elementary schools, and the results showed that it significantly improved the speaking ability of English learners. Chae et al. (2023) developed a GenAI chatbot for English language learning, which, by mimicking the language education behaviors of human teachers, could autonomously organize teaching tasks and progress. In programming education, Yilmaz and Yilmaz (2023b) also demonstrated the positive effects of GenAI tools on students’ CT, programming self-efficacy, and motivation to learn programming. However, some studies have found that GenAI chatbots do not have a positive impact on students’ programming learning. Sun et al. (2024) integrated GenAI chatbots into programming education and found that they did not significantly improve programming performance. By applying GenAI chatbots directly in the classroom, students can receive timely feedback on their knowledge. However, this immediate feedback mechanism may only contribute to students’ surface understanding of knowledge, and not really promote their deeper internalization of knowledge.

Mind mapping

Mind maps can help learners externalize their thinking processes and lead them to visual thinking and orderly thinking, which can in turn improve their problem-solving skills (Liu et al., 2018). Some previous studies have shown that mind maps can transform complex ideas into visual diagrams, which not only help learners integrate new knowledge with existing knowledge, but also help them memorize and comprehend the learning content (Buzan & Buzan, 2002) and promote their creative thinking (Abd Karim & Abu, 2018; Bonk & Cunningham, 2012). Some researchers have reported the potential of mind maps for promoting learners’ critical thinking by helping them to analyze and identify associations between different aspects of knowledge (Abd Karim & Mustapha, 2022; Fu et al., 2019; Wu & Wu, 2020).

Mind maps also have a very positive impact in programming education. Ismail et al. (2010) pointed out in their study that combining mind map learning scaffolds with cooperative learning significantly positively impacted students’ problem-solving skills, programming performance, and metacognitive knowledge. Liu et al. (2018)investigated the impact of mind maps on undergraduate programming learning, finding that they could transform abstract and intangible thought patterns into visible and radiating thought patterns, thus improving students’ logical and creative thinking. Chen (2020) integrated mind maps into the teaching of visual programming with Scratch to enhance students’ motivation and reflective abilities in learning.

However, training with mind maps as a cognitive tool can be time-consuming and might impose a significant cognitive load on K-12 learners (Van Gog et al., 2006). Therefore, educators need to identify more suitable mind map scaffolding methods for K-12 learners.

Method

In order to assess the effects of a learning approach that integrates mind mapping with a GenAI chatbot on middle school students’ programming academic performance, CT, and programming self-efficacy, and to investigate whether there is a difference in students’ performance in the above areas when using different types of mind mapping-supported GenAI chatbots, this quasi-experimental study was conducted with three groups of 111 middle school students. The participants, study-related design, experimental procedures, and measurement tools are described in detail below.

Participants

The 111 participants in this study were seventh-grade students (11–12 years old) from three classes in a public junior high school in southeastern China. Two classes were randomly selected as Experimental Group 1 (16 boys, 20 girls) and Experimental Group 2 (17 boys, 19 girls), and the other class as a control group (19 boys, 20 girls). All participants had taken information technology(IT) courses at the primary school level and had basic computer skills. Students in the three groups completed an 11-week experimental course. The course taught the same content and was taught by the same IT teacher. The main content included “Python Programming Basics” and “Python Programming Basic Structure.”

Design of the GenAI chatbot based on a large language model

In order to successfully complete this experiment, the research team constructed a GenAI chatbot based on a large language model, choosing GPT-4 as the core model, a choice that makes it significantly different from traditional AI chatbots. Compared to chatbots that rely on rules or simple pattern matching, GPT-4 implements dynamic language processing via deep neural networks, trained on a wide range of datasets, and has the ability to understand and generate human-like responses across multiple topics(Tai & Chen, 2024). This advanced capability significantly enhances the chatbot’s conversational capabilities, enabling it to interact with users in a more natural and contextually relevant way. By integrating GPT-4, the chatbot’s level of intelligence is enhanced to provide a more realistic and immersive interactive experience for the user. The chat interface is shown in Fig. 1. Students can initiate conversations on various topics with the chatbot by entering their questions in the main window.

Fig. 1
figure 1

GenAI chatbot based on large language modeling.

Deploying GPT-4 (or the Large Language Model) within the educational domain requires careful consideration of the construction of prompts to maximize the efficiency of the model (Liu et al., 2024). In this study, a modular hybrid prompt design (Liu et al., 2022) was used to ensure relevance and precision across course stages by combining general course prompts with stage-specific prompts. General course prompts provide meta-information such as course objectives, expected outcomes, and target grade level at the beginning of the course, laying the groundwork for students to interact with the GenAI chatbot. Stage-specific prompts, on the other hand, provide customized guidance according to the different stages of the course (e.g., clarifying the problem, analyzing the problem, formulating a solution, writing a program, and summarizing and reflecting on it) to ensure that the chatbot’s feedback is tightly correlated with the learning objectives of the current stage. In addition, students were free to interact with the GenAI chatbot at various stages of the course to ask questions, seek clarification, or request more examples, thus enabling dynamic support for personalized learning. In addition, the chatbot designed for this study supports voice interaction to provide students with more convenient input-output feedback to meet the cognitive and emotional needs of different students.

Protecting students’ private data is important while ensuring interaction with a GenAI chatbot. We developed a dedicated web application based on open source code. The application allows educators to update system prompts in real time based on course dynamics. Especially critical is that all sensitive information, such as personalization settings and conversation logs, is securely stored in the indexedDB database of the local browser. Interaction with GPT-4 is via OpenAI’s API, ensuring that data privacy standards are strictly adhered to.

Design of thinking maps scaffolding and teaching activities

When using a GenAI chatbot, appropriate and accurate questions can help to get higher-quality answers. By creating a mind map to clarify the knowledge points and logical ideas needed to solve problems, it helps to clarify the direction and purpose of communication with the chatbot and obtain more accurate and useful responses. After getting feedback from the chatbot, learners can modify and optimize the mind map to make the problem-solving ideas clearer and more visible. Therefore, the relationship between the mind map and the chatbot is mutually reinforcing, with the mind map facilitating the chatbot to generate high-quality answers and the chatbot facilitating the iterative optimization of the mind map. The relationship between students, mind maps, and chatbots is shown in Fig. 2.

Fig. 2
figure 2

Diagram of the relationship between students, mind maps, and the GenAI chatbot.

In traditional teaching, some teachers let students construct mind maps by themselves, that is, “self-constructed” mind maps. Since junior high school students are young and lack hands-on skills and sufficient thinking logic, it is difficult for them to complete mind maps independently. To solve this problem, we designed a progressive mind map scaffold with three stages, using different forms of mind maps in each stage to gradually reduce the support for learners’ knowledge construction. The first stage was a gap-filling mind map; the second stage was a prompting mind map; and the third stage was a self-constructed mind map. Gap-filling mind maps are mind map frames pre-designed by the teacher, which contain clues to solve the problem and some blank spaces for students to fill in the key knowledge points. In prompting mind maps, the teacher no longer provides the framework of the mind map, but gives some hints or key points to solve the problem, and students need to design the solution and complete the mind map according to the hints. Self-constructed mind maps refer to mind maps that students construct independently according to their own understanding of the knowledge and thinking logic. These three stages are interrelated and present a progressive approach to promote students’ cognitive and thinking development. The progressive mind map is shown in Fig. 3.

Fig. 3
figure 3

Example of a progressive mind map.

In this study, the instructional activities consisted of five segments, namely: clarifying the problem, analyzing the problem, formulating the program, writing the program, and summarizing and reflecting on the program.

  1. (1)

    Defining the problem

    In the Defining the Problem session, the teacher set up situational problems based on daily life scenarios, presented the teaching case through the scenarios, and posed questions to the students. Students then entered the problem situation designed by the teacher and identified the core of the problem. The role of mind mapping in this session was to help students break down the problem and focus on the core of the problem. Students could sort out the information in the situation through the mind map, so as to clarify the different aspects and levels of the problem, and sort out the logical relationship of the problem through the presentation of structured thinking. The GenAI chatbot helped students quickly capture the key issues in a situation through question explanations and interactive feedback.

  2. (2)

    Analyze the problem

    In the Analyze the problem session, the teacher guided the students to specifically analyze the identified problem and create an initial mind map. Students were guided by the teacher to use the mind map to break down the problem, sort out the knowledge points needed to solve the problem, and break down the problem. The mind map helped students sort out the logic in the analysis process and build a clear logical structure in this session. The GenAI chatbot could provide rich analyzing suggestions in this session to help students further expand their thinking. Difficulties encountered by students in the analysis process could be inspired and answered through conversations with the chatbot.

  3. (3)

    Developing solutions

    In this session, the teacher organized the students to complete the mind maps and monitored the teaching and learning process. Students needed to conceptualize the solution to the problem based on the previous session and complete the mind map as a whole. The mind map helped students organize the solution and clarify the steps at this stage, transforming abstract solution ideas into concrete operational solutions. The GAI chatbot helped students expand their ideas by providing different solution suggestions and discussions to ensure that their solutions were reasonable and innovative. Through interaction, the chatbot could also help students evaluate the strengths and weaknesses of different solutions.

  4. (4)

    Program writing

    In this session, the teacher organized the students to write code on the computer and made rounds to supervise the students to see if they were generating code directly. Students needed to transform the mind mapping solution they created into code and debug and run it for optimization. Mind mapping helped students to visualize the problem solution scenarios in this session. Through mind mapping, students were able to check whether the implementation steps in the programming process matched the solution against the ideas they had designed, thus ensuring that the programming was logical and systematic. The GenAI chatbot provided debugging suggestions and instant error correction functions in real time based on the students’ code, helping them solve specific problems in the programming process, optimize the program structure and logic, and improve the effectiveness and performance of the program.

  5. (5)

    Summarize and reflect

    In the summary reflection session, the teacher guided the students to use the GAI chatbot to analyze their programming process and comprehensively reviewed the students’ learning outcomes and classroom performance. The chatbot could provide personalized suggestions based on the interaction process with students to assist them in summarizing and reflecting. Students reflected on the summarization and suggestions of the teacher and the GenAI chatbot, and revised and improved their mind maps. The complete flow of the teaching activities is shown in Fig. 4.

    Fig. 4
    figure 4

    Teaching activity flow.

The experimental procedure is shown in Fig. 5. Participants conducted the study in a computer classroom for a duration of 11 weeks, with one 40-min session per week. Since the participant group had not previously learned the Python language and were having their first exposure to the GenAI chatbot, the first 3 weeks of the study consisted of lessons on the basics of the Python programming language and the use of the GenAI chatbot. In week 4, students completed pretest questions on programming knowledge as well as a pretest questionnaire. The first phase of progressive learning in Experimental Group 1 was weeks 5 and 6, in which students learned programming using the learning method of integrating fill-in-the-blank mind maps with the GenAI chatbot. The second phase was weeks 7 and 8, in which students learned programming using the learning method of integrating prompted mind maps with the GenAI chatbot, and the third phase was weeks 9 and 10, in which they learned programming using the learning method of integrating self-constructed mind maps with the GenAI Chatbot to learn programming. Students in Experimental Group 2 learned programming using the learning method of integrating self-constructed mind maps with the GenAI chatbot during the 5th to 10th weeks of the program. Students in the control group did not use the mind map scaffolding, but learned programming only with the assistance of the GenAI chatbot during weeks 5 through 10 of the course.

Fig. 5
figure 5

Experimental flow chart.

Instrument

In this study, we conducted a pretest and posttest of programming knowledge, pre- and post-questionnaire surveys of CT, programming self-efficacy, and programming learning motivation, as well as interviews. We designed the pre- and posttest questions of programming knowledge based on the course syllabus, which were revised by two senior IT teachers. The number and type of questions in the pre- and posttests were the same, with a full score of 100. The pretest questions were designed to assess learners’ prior knowledge of Python programming, and the posttest questions were designed to assess participants’ mastery of programming knowledge after completing the course.

The questionnaire used to measure participants’ CT was adapted from the scale developed by Korkmaz et al. (2015). The scale was developed based on the theoretical framework of the ISTE to measure the CT of K-12 students. The scale comprises 22 questions, representing the five dimensions of CT: creativity, algorithmic thinking, collaboration, critical thinking, and problem-solving tendency. The total Cronbach’s alpha coefficient of the questionnaire was 0.82, and the Cronbach’s alpha coefficients of the five dimensions were 0.84, 0.87, 0.86, 0.78, and 0.73, respectively. The questionnaire used a 5-point Likert scale, and the answer range was from 1 (strongly disagree) to 5 (strongly agree).

The programming self-efficacy questionnaire was adapted from the scale of the self-efficacy part of the Learning Motivation Strategies Questionnaire (MSLQ) developed by Pintrich and De Groot (1990). The scale consisted of nine 7-point Likert scales, and each item ranged from 1 (strongly disagree) to 7(strongly agree). After testing, the Cronbach’s alpha coefficient of the scale was 0.835, indicating good reliability.

Results

In this study we used the Jamovi 2.3.28 software to conduct statistical analysis on the three groups’ data. Before data analysis, the dependent variables were tested for normal distribution and homogeneity of variances.

Academic achievements

After completion of the relevant foundational knowledge instruction and before the formal teaching intervention, a pretest on programming was administered to students from the three classes. The descriptive results of the programming pretest scores are shown in Table 1. According to the statistical data in Table 1, the average pretest scores for Experiment Group 1, Experiment Group 2, and the control group were 59.75, 60.06, and 58.64, respectively. The differences in the average score deviations were minimal.

Table 1 Descriptive statistics of programming pretest scores.

To test whether there were differences in the three classes’ pretest scores of programming performance, a one-way analysis of variance was conducted. The results are shown in Table 2. There was no significant difference in the pretest scores of the three classes. Therefore, the initial programming level of the participants in the three classes was relatively close.

Table 2 One-Way ANOVA(Fisher’s).

In this study, one-way analysis of variance (ANOVA) and Tukey post hoc test analysis were used to examine differences in the posttest scores of programming achievement among the three groups of students, with the results represented by Tables 3 and 4, respectively. The ANOVA showed that there was a significant difference (p < 0.01) in the programming scores of the three groups of students: Experimental Group 1, Experimental Group 2 and the control group. In order to specifically analyze which groups differed, a post hoc test was conducted, and the results are shown in Table 2. The posttest mean of the performance of Experimental Group 1 was 5.11 points higher than that of Experimental Group 2, and the difference was statistically significant (t = 2.98, p = 0.013); the posttest mean of Experimental Group 1 was 9.31 points higher than that of the control group, and the difference was statistically significant (t = 5.34, p < 0.001); the posttest mean of Experimental Group 2 was 4.20 points higher than that of the control group, and the difference was statistically significant (t = 2.41, p = 0.046). Therefore, the three teaching methods had different impacts on participants’ programming learning; the teaching method of integrating mind mapping and the GenAI chatbot was beneficial for students’ programming learning, while the chatbot supported by progressive mind mapping was more favorable to students’ programming learning.

Table 3 One-Way ANOVA(Fisher’s).
Table 4 Tukey Post-Hoc test-academic achievement.

Computational thinking

The pretest questionnaire of CT was subjected to one-way ANOVA. The mean and standard deviation of CT for Experimental Group 1 were 75.8 and 6.64, respectively; the mean and standard deviation of Experimental Group 2 were 74.8 and 7.20, respectively; and the mean and standard deviation of the control group were 74.3 and 6.01, respectively. The results (see Table 5) showed that there was no statistically significant difference in overall CT ability among the three groups (F = 0.452, p = 0.638 > 0.05).

Table 5 One-Way ANOVA(Fisher’s).

One-way analysis of variance was performed on the total scores of the CT posttest of the three groups of students; the results are shown in Table 6. The data showed that there were significant differences in the three groups’ total scores of CT. A Tukey post hoc test was performed and pairwise comparisons were made to specifically analyze which groups had differences. The results are shown in Table 7. The posttest average of the total CT score of Experimental Group 1 was 3.33 points higher than that of Experimental Group 2, with a statistically significant difference (t = 2.40, p = 0.047); the posttest average of Experimental Group 1 was 7.63 points higher than that of the control group, with a statistically significant difference (t = 5.61, p < 0.001); and the posttest average of Experimental Group 2 was 4.29 points higher than that of the control group, with a statistically significant difference (t = 3.16, p = 0.006). Therefore, the three teaching methods did not affect participants’ CT in the same way; the teaching method of integrating mind mapping and the GenAI chatbot was beneficial for students’ CT development, and the chatbot supported by progressive mind mapping was more favorable to students’ CT.

Table 6 One-Way ANOVA(Fisher’s).
Table 7 Tukey Post-Hoc test-computational thinking total.

Creativity

One-way analysis of variance was performed on the total posttest scores of the creativity dimensions of the three groups of students; the results are shown in Table 8. The data showed that there were significant differences in the total creative posttest scores of the three groups of students. For this reason, Tukey’s post-hoc test was performed and pairwise comparisons were made to specifically analyze which groups had differences. The results are shown in Table 9. The posttest average of the creativity of Experimental Group 1 was 0.0278 points lower than that of Experimental Group 2, but the difference was not statistically significant (t = −0.0634, p = 0.998). The posttest average of Experimental group 1 was 1.04 points higher than that of the control group, with a statistically significant difference (t = 2.43, p = 0.044); while the posttest mean value of Experimental Group 2 was 1.07 points higher than that of the control group, with a statistically significant difference (t = 2.50, p = 0.037). This shows that the teaching method of integrating mind mapping and the GenAI chatbot was favorable for the development of students’ creative thinking tendencies.

Table 8 One-Way ANOVA(Fisher’s).
Table 9 Tukey Post-Hoc Test-Creativity.

Algorithmic thinking

One-way analysis of variance was performed on the algorithmic thinking posttest scores of the three groups of students; the results are shown in Table 10. The data showed that there was no significant difference in the algorithmic thinking posttest scores of the three groups of students. Therefore, the three teaching methods had similar effects on improving students’ algorithmic thinking.

Table 10 One-Way ANOVA(Fisher’s).

Cooperativity

One-way analysis of variance was performed on the three groups’ posttest scores of cooperative learning tendencies; the results are shown in Table 11. The data showed that there was no significant difference in the posttest scores of the three groups of students’ cooperative learning tendencies. The three teaching methods, therefore had similar effects on students’ cooperative learning tendency.

Table 11 One-Way ANOVA(Fisher’s).

Critical thinking

One-way analysis of variance was performed on the total posttest scores of the critical thinking dimension of the three groups of students. The results are shown in Table 12. The data showed that there were significant differences in the critical thinking dimension scores of the three groups of students. To this end, Tukey’s post hoc test was performed and pairwise comparisons were made to specifically analyze which groups had differences. The results are shown in Table 13. The average critical thinking posttest of Experimental Group 1 was 0.583 points higher than that of Experimental Group 2; this difference was not statistically significant (t = 1.17, p = 0.475). The posttest average of Experimental Group 1 was 1.77 points higher than that of the control group, with a statistically significant difference (t = 3.62, p = 0.001); the posttest mean value of Experimental Group 2 was 1.19 points higher than that of the control group, with a statistically significant difference (t = 2.43, p = 0.044). Therefore, the three teaching methods did not have the same impact on the participants’ critical dimension; the teaching method of integrating mind mapping and the GenAI chatbot was favorable to the development of students’ critical thinking, but the chatbot supported by progressive mind mapping was more favorable to the development of students’ critical thinking.

Table 12 One-Way ANOVA(Fisher’s).
Table 13 Tukey Post-Hoc test-critical thinking.

Problem solving

One-way analysis of variance was performed on the total posttest scores of the three groups of students’ problem-solving tendencies; the results are shown in Table 14. The data showed that there were significant differences in the posttest scores of the problem-solving tendency dimension of the three groups of students. To this end, a Tukey post hoc test was performed and pairwise comparisons were made to specifically analyze which groups had differences. The results are shown in Table 15. The posttest mean value of Experimental Group 1 was 2.03 points higher than that of Experimental Group 2, and the difference was statistically significant (t = 3.27, p = 0.004); the posttest mean value of Experimental Group 1 was 3.54 points higher than that of the control group, with a statistically significant difference (t = 5.83, p < 0.001); the posttest mean value of Experimental Group 2 was 1.51 points higher than that of the control group, with a statistically significant difference (t = 2.49, p = 0.037). As a result, the three instructional approaches had different impacts on participants’ problem-solving tendencies. The instructional approach of integrating mind mapping and the GenAI chatbot was favorable to the development of students’ problem-solving tendencies, but the chatbot supported by progressive mind mapping was more favorable to the development of students’ problem-solving tendencies.

Table 14 One-Way ANOVA(Fisher’s).
Table 15 Tukey Post-Hoc test-problem solving.

Programming self-efficacy

One-way analysis of variance was performed on the self-efficacy posttest scores of the three groups of students. The results (see Table 16) showed that there was no significant difference in the self-efficacy posttest scores of the three groups (F = 1.39, p = 0.253 > 0.05).

Table 16 One-Way ANOVA(Fisher’s).

A paired sample t test was conducted on the pre- and posttest scores of the self-efficacy of the three groups of students. The results (see Table 17) showed that there were significant differences in the pre- and posttest scores of the self-efficacy of the three groups of students. Therefore, the analysis of this study showed that all three methods could improve students’ self-efficacy in learning programming.

Table 17 Paired Samples t Test.

Discussion

In order to improve students’ programming performance, we designed a learning approach that integrated mind mapping with a GenAI chatbot. The results of the study showed that the learning method of integrating mind mapping with a GenAI chatbot had a positive impact on students’ academic performance, computational thinking, and programming self-efficacy compared with the traditional chatbot-based learning method. There were also some differences in the effects of GenAI chatbots supported by different mind maps on the above aspects.

In terms of academic performance, the programming performance of students in Experimental Group 1 was significantly higher than that of students in Experimental Group 2 and in the control group. The programming performance of students in Experimental Group 2 was significantly higher than that of students in the control group. This result is consistent with previous research results (J.-H. Zhang et al., 2023a). Mind maps can stimulate learners’ interest and increase their enthusiasm and attention (Feng et al., 2023), which will then have a positive impact on their academic performance (Yildiz Durak, 2020). By making mind maps, students can express the flow and hierarchical structure of their thoughts, which is beneficial for their understanding of programming concepts (Zhao et al., 2022). Compared with self-constructed mind mapping, progressive mind mapping allows students to go through training from filling in the blanks to prompting and finally self-constructing learning scaffolds. This process can more effectively improve students’ ability to understand complex problems and their programming skills. With proper prompting, students can describe the problem-solving steps in detail and ask more specific questions to the GenAI chatbot for more in-depth discussions. At this point, the GenAI chatbot can provide more specific and explicit support to the students, thus promoting their academic performance.

In terms of computational thinking, the overall scores of participants in Experimental Group 1 were significantly higher than those of students in Experimental Group 2 and in the control group. The overall scores of the participants in Experimental Group 2 were significantly higher than those of the students in the control group. In terms of the sub-dimensions of computational thinking, the study found that there were significant differences between the three groups of participants in the areas of “creativity,” “critical thinking,” and “problem solving,” but not in the areas of “algorithmic thinking” and “cooperative learning.” There were no significant differences in “algorithmic thinking” and “cooperative learning.” This finding suggests that while the learning method of integrating mind maps with a GenAI chatbot promotes the development of students’ computational thinking, it does not provide support for every aspect of computational thinking. The results of the post hoc test showed that the two groups of students who used the learning method of integrating mind mapping with the GenAI chatbot outperformed the students in the control group on the creativity dimension, which is consistent with the results of previous studies (Su et al., 2022). Mind mapping makes complex concepts and ideas more visual. This visualization of the thinking process helps learners to more easily see the connections between different concepts, sparking new ideas and inspiration (Dong et al., 2021). Students in the process of drawing mind maps can constantly sort out the veins of knowledge, summarize the framework, and understand the knowledge at different levels, and in the process, can promote the enhancement of creative thinking. For the two experimental groups, the data showed that there was no significant difference in the creativity performance of the two groups, but the students in Experimental Group 2 scored slightly higher on creativity than the students in Experimental Group 1. In Experimental Group 2, students were able to construct their own thinking network through self-constructed mind maps. Self-constructed knowledge systems help to enhance students’ creative thinking because they are free to explore different thinking paths and develop new ways of solving problems in the process (Hunter et al., 2008). Although progressive mind mapping provides more systematic guidance, such guidance may not necessarily have a clear advantage over fully autonomous thought construction in the creativity dimension. In terms of critical thinking, participants in the Experimental Group 1 and Experimental Group 2 also performed significantly better than those in the control group. Cottrell (2023) stated that critical thinking provides people with the tools to constructively use doubt and questioning, allowing them to analyze the information available to students and make better and more informed decisions about whether this information is likely to be true. In this study, students in the two experimental groups were first communicating and interacting with the GenAI chatbot on the basis of mapping their thinking and clarifying their ideas, so students would select more favorable and reliable support. In addition, students were asked to reflect on the process of designing a mind map, which provided them with an opportunity to apply critical thinking for more detailed analysis and internalization of the learning process (Polat & Aydın, 2020). Self-constructed mind mapping requires students to actively construct their own knowledge networks, a process that helps them think independently and integrate new information through critical analysis. While progressive mind mapping provides students with additional guidance, the two may ultimately converge in terms of critical thinking development, as both forms of mind mapping support students to reflect on and analyze problems in depth. In terms of problem-solving tendency, students in Experimental Group 1 scored significantly higher than students in Experimental Group 2 and those in the control group, and students in Experimental Group 2 scored significantly higher than students in the control group. This indicates that the learning method of integrating mind mapping and the GenAI chatbot had a positive effect on students’ problem solving, and the effect of the learning method supported by progressive mind mapping was more significant. Through mind mapping, students can decompose abstract and complex programming problems into more concrete and simple problems, which makes the structure of the problems more inclined, and this externalized thinking process not only enhances students’ comprehension but also promotes the development of their problem-solving ability. In addition, progressive mind mapping to some extent helps students clarify the steps of problem solving, so they are more clearer about their learning goals(J.-H. Zhang et al., 2023a). With clear goals, students are motivated to invest their energy in finding solutions and communicating and interacting with the GenAI chatbot.

Self-efficacy represents a person’s inner confidence in completing a certain task and is an important factor affecting the success of programming learning (Wei et al., 2021). Problem solving in programming learning is a task that requires complex cognitive processes. Lack of prior knowledge and experience may make it difficult for students to solve problems due to a lack of confidence (Cheah, 2020). The analysis of variance results of the posttest questionnaire showed that there was no significant difference in the programming self-efficacy of the three groups of students. However, according to the paired t-test results of the pre- and posttest questionnaires of the three groups of students, the programming self-efficacy of the three groups increased significantly, which suggests that the GenAI chatbot can help students build up their confidence in programming learning. The real-time feedback provided by the GenAI chatbot during the learning process can enhance students’ programming self-efficacy; this result is consistent with the results of previous studies. Therefore, it can be said that the use of GenAI chatbots in programming education is effective in terms of enhancing students’ programming self-efficacy.

Conclusions

The use of GenAI chatbots in teaching programming has become a trend. It can provide instant feedback to help students with their programming learning (Yilmaz & Yilmaz, 2023a). However, there are some problems with GenAI chatbots applied in programming teaching (Husain, 2024). According to the characteristics of mind mapping, it can weaken the disadvantages of GenAI chatbots and strengthen its advantages to some extent. However, few studies have investigated the impact of integrating mind maps with GenAI chatbots on students’ programming learning. Therefore, this study investigated the effects of integrating a learning approach of mind mapping with a GenAI chatbot on students’ programming academic performance, computational thinking, and programming self-efficacy. The results showed that the learning method of integrating mind maps with a GenAI chatbot had a positive effect on students’ programming performance, creative thinking, critical thinking, and problem-solving tendencies. The learning approach of integrating progressive mind mapping with GenAI chatbots had an even more significant positive effect on students’ programming achievement and problem-solving skills.

Based on the research process and results, we make some recommendations for researchers and educators: first, teachers should guide students to make reasonable use of artificial intelligence, such as GenAI chatbots, to promote independent learning and higher-order thinking development. In instructional design, teachers should encourage students to think independently before using GenAI chatbots to validate ideas, expand knowledge, or obtain assistive suggestions. Through the rational use of AI tools, students are promoted to become active agents of independent learning, not just passive receivers of information. Second, it is necessary to strengthen the effective integration of mind maps and GenAI chatbots, and to emphasize the application of progressive mind maps. Mind maps can help students intuitively clarify their thoughts and structure complex programming knowledge, while GenAI chatbots can provide students with timely feedback and support. Progressive mind mapping can effectively support students’ knowledge construction process in programming teaching. Through the gradual guidance, students can better understand programming concepts and show higher flexibility of thinking when solving programming problems. Finally, it is important to focus on the cultivation of students’ motivation and engagement. Teachers need to carefully design programming problems so that they can both arouse students’ interest and stimulate their desire to learn. For example, by setting interesting and challenging programming problems, students can not only feel the fun of programming, but also experience the sense of accomplishment of gradually overcoming difficulties in the process of completing the tasks, thus further enhancing their learning motivation.

Limitations and further research

This study provides a reference for the application of generative AI chatbots in programming teaching, but there are still limitations. First, the experimental intervention time was short, only 6 weeks, and the improvement of programming performance requires long-term investment, so the findings need to be further validated by long-term studies. Second, the sample size and representativeness were limited, involving only three classes of first-year students. Future research should expand the sample size to cover students of different ages and backgrounds to improve the generalizability of the results. Finally, the evaluation content and methods were insufficient; this study was mainly based on test questions and scale data; a reasonable evaluation method has not been determined for students’ mind mapping works and the process data of students’ interactions with the GenAI chatbot. In the future, it is still necessary to choose more reasonable evaluation methods to obtain more rigorous evaluation results.