Improving students’ programming performance: an integrated mind mapping and generative AI chatbot learning approach

Ye, Xindong; Zhang, Wenyu; Zhou, Yuxin; Li, Xiaozhi; Zhou, Qiang

doi:10.1057/s41599-025-04846-4

Article
Open access
Published: 20 April 2025

Humanities and Social Sciences Communications volume 12, Article number: 558 (2025)

Abstract

Programming education has become increasingly important for individual development. However, for programming beginners such as primary and secondary school students, learning to program is not a simple task and requires additional learning support. Generative AI (GenAI) chatbots are effective teaching aids that can reduce the difficulty of learning programming by providing real-time guidance and personalized learning support based on students’ abilities, and applying them in teaching has therefore become a trend. However, over-reliance on these chatbots may weaken students’ ability to think independently and impair their learning effectiveness. How to use GenAI chatbots rationally in the classroom and improve their effectiveness has thus become an important issue for both researchers and frontline teachers. To address this issue, the present study proposed a learning method that integrates mind mapping with GenAI chatbots. To assess the effectiveness of this learning method and to investigate whether various types of mind mapping-supported GenAI chatbots differ in their impact on students’ programming academic performance, computational thinking, and programming self-efficacy, the research team conducted a quasi-experimental study. The participants were 111 seventh-grade students at a junior high school in southeastern China. Experimental Group 1 (36 students) used a learning approach that integrated progressive mind maps with a GenAI chatbot, Experimental Group 2 (36 students) used a learning approach that integrated self-constructed mind maps with a GenAI chatbot, and the control group (39 students) used a conventional GenAI chatbot-based learning approach without mind mapping. The results showed that participants in both experimental groups had significantly better programming learning performance and computational thinking than the control group, and that the learning method integrating progressive mind mapping with GenAI chatbots was more effective.

Introduction

In the era of Artificial Intelligence (AI), the importance of programming education is becoming increasingly prominent, and it has become one of the key ways to cultivate future innovators (González-Pérez & Ramírez-Montoya, 2022). Computational thinking (CT), as a core competency for adapting to changes in digital society and AI, is widely recognized as a foundational skill (Tikva & Tambouris, 2021), and programming is an effective way to develop CT (Belmar, 2022). According to the International Society for Technology in Education (ISTE), CT includes problem solving, creativity, critical thinking, algorithmic thinking, and collaborative skills, elements that help students meet the challenges of increasingly complex and open digital technologies (ISTE, 2015). A growing number of scholars advocate introducing programming education at the primary and secondary school levels because it not only equips students with knowledge and skills in the field of computing, but also positively impacts their future learning; moreover, the earlier programming education begins, the more pronounced the benefits will be (Lai et al., 2021; Lindberg et al., 2019). However, programming is particularly challenging for primary and secondary school students (Yang & Lin, 2024), who require sufficient external support during the learning process (Webb et al., 2017), such as assistance from teachers. Because of the low teacher-student ratio in programming classes, it is difficult to provide timely help, leaving some students unable to achieve the expected outcomes in programming learning, which in turn reduces their programming self-efficacy (Medeiros et al., 2018). Self-efficacy, defined as an individual’s confidence in their ability to complete specific tasks, directly influences their persistence and effort when facing challenges (Bandura, 1978). Previous research has shown that self-efficacy plays a critical role in programming learning (Gurer & Tokumaci, 2020). Therefore, how to provide timely support for students in K-12 programming education and improve their programming performance, CT, and self-efficacy has become an important issue that needs to be addressed in the field of educational technology.

The rapid development of Generative Artificial Intelligence (GenAI) technology offers potential solutions to the above problems. GenAI chatbots based on advanced transformer architectures (e.g., GPT-4) can predict, understand, and generate human-like text (Pavlik, 2023), generate coherent and contextually relevant responses during long interactions (Mohamed, 2024), and provide targeted assistance based on individual needs. With their powerful natural language understanding and extensive knowledge base, GenAI-based chatbots can provide timely feedback to users, and thus a growing number of researchers are applying them in the field of education (Kasneci et al., 2023). In programming instruction, GenAI chatbots can give various solutions to students’ problems, provide code examples, or modify code (Husain, 2024). Using GenAI chatbots as programming learning assistants for elementary and secondary school students can increase their confidence in solving programming problems and improve learning outcomes and motivation (Muñoz et al., 2023; Shoufan, 2023). Although GenAI chatbots show great potential in the field of programming education, certain problems remain. Currently, GenAI chatbots are commonly used for delivering knowledge or as practice tools (McGrath et al., 2024). Without additional instruction or strategies to guide students in the use of chatbots, students may struggle to connect newly acquired knowledge with existing knowledge, leading to disappointing learning outcomes (Chin et al., 2014; Hwang et al., 2019). In addition, learners may not think critically when using GenAI and may simply obtain answers to questions directly from the tool. This adversely affects the development of skills such as critical thinking and creativity, and if students become overly reliant on GenAI chatbots, they may fail to develop the skills needed to solve problems on their own, which may even lead to cognitive stagnation (Dwivedi et al., 2023; Qin et al., 2023).

To help learners organize what they have learned, researchers have suggested the use of mind maps, a method in which learners present elements related to core concepts in a graphical pattern (Edwards & Cooper, 2010). As a thinking visualization tool, mind mapping can externalize learners’ thinking processes and guide students to think in an orderly manner and engage in reflective learning (Merchie & Van Keer, 2012; Stokhof et al., 2020). Many previous studies have reported the benefits of mind mapping for improving students’ academic performance and problem-solving skills (Cristea et al., 2011; Eppler, 2006). The quality of the answers returned by GenAI chatbots depends on students’ questioning skills (Xia et al., 2022). Mind mapping organizes and develops students’ ideas (Buzan, 2024), which helps them provide clearer and more structured information when interacting with the chatbot and thereby enhances the accuracy of its feedback (Abd-Alrazaq et al., 2023). It also helps students understand and internalize what they have learned more deeply, which leads to better learning outcomes. Including a mind mapping session also counters the tendency of students to get their answers directly from the GenAI chatbot, which helps to develop their critical thinking and problem-solving skills (Dwivedi et al., 2023). Therefore, this study proposed a learning approach that integrates mind mapping with a GenAI chatbot to improve students’ programming performance. In addition, numerous studies have shown that the type of mind map and the way it is used can also affect learning outcomes (Shi et al., 2023; Zhao et al., 2022). It is therefore also worth exploring how GenAI chatbots supported by different types of mind maps affect students’ programming learning outcomes. Based on this, this study proposed the following research questions:

RQ1: Does an integrated mind mapping and GenAI chatbot learning approach improve students’ programming academic performance? And is there a difference in the impact of different types of mind mapping-supported GenAI chatbots on students’ programming academic performance?

RQ2: Does an integrated mind mapping and GenAI chatbot learning approach improve students’ computational thinking? And is there a difference in the impact of GenAI chatbots supported by different types of mind maps on students’ computational thinking?

RQ3: Does an integrated mind mapping and GenAI chatbot learning approach improve students’ self-efficacy? And are there differences in the effects of different types of mind mapping-supported GenAI chatbots on students’ self-efficacy?

Literature review

Programming education in K-12

In recent years, programming education has received increasing attention from educators and researchers as an effective way to develop students’ 21st-century skills (Hu, 2024). Many countries and regions are incorporating programming into K-12 education in response to the future demand for talent in a digital society (Åkerfeldt et al., 2024).

However, numerous studies have shown that programming poses a significant challenge for K-12 students due to their lack of relevant knowledge. The abstract and complex nature of programming concepts is a significant barrier for K-12 learners (Ma et al., 2023). To address the obstacles students face in learning programming, researchers have introduced various teaching tools and methods in programming education (Lai & Wong, 2022). For example, graphical programming languages help students intuitively understand programming concepts and processes (Tsai, 2019). Game-based programming learning, through a series of designed instructional scenarios, makes learning both challenging and fun, potentially motivating students to achieve better learning outcomes. Additionally, collaborative programming is another teaching method (Wu et al., 2019) whereby students work in groups to complete programming tasks and build knowledge (Van Aalst, 2009), understanding complex concepts and solving problems through communication and conflict resolution. Although these tools and methods can somewhat reduce the difficulty of learning programming, challenges remain. Programming, as an activity that involves logic and problem solving, requires complex thinking skills, which are often difficult for students to develop independently (Shadiev et al., 2014). They need instructional support to clearly articulate programming logic and express solutions (Kwon, 2017).

Chatbots in education

Chatbots, also known as conversational agents, are computer programs designed to simulate human-like conversations (Wu & Yu, 2024; R. Zhang et al., 2023b). They interact with users on specific topics or in specific domains through text and speech in a natural conversational manner (Smutny & Schreiberova, 2020). There are significant differences between traditional chatbots and AI chatbots (Wu & Yu, 2024). Traditional chatbots rely on predefined patterns and templates, and their interactions are limited by rules that do not allow them to accurately understand students’ questions, leading to irrelevant or fixed answers (Coniam, 2014; Yang et al., 2022). Not only does this fail to solve the problem, but it may also increase student negativity and decrease technology acceptance (Yang et al., 2022). GenAI chatbots use a variety of AI techniques such as natural language processing, machine learning, information retrieval, and deep learning, which allow them to retain user input and learn from previous input, promoting enhanced engagement and interaction (Nguyen et al., 2022).

Researchers have noted that GenAI chatbots have great potential in education to improve student performance to some extent (McGrath et al., 2024). For example, Tai and Chen (2024) developed a GenAI chatbot based on ChatGPT and applied it to teaching English speaking in elementary schools, and the results showed that it significantly improved the speaking ability of English learners. Chae et al. (2023) developed a GenAI chatbot for English language learning, which, by mimicking the language education behaviors of human teachers, could autonomously organize teaching tasks and progress. In programming education, Yilmaz and Yilmaz (2023b) also demonstrated the positive effects of GenAI tools on students’ CT, programming self-efficacy, and motivation to learn programming. However, some studies have found that GenAI chatbots do not have a positive impact on students’ programming learning. Sun et al. (2024) integrated GenAI chatbots into programming education and found that they did not significantly improve programming performance. When GenAI chatbots are applied directly in the classroom, students can receive timely feedback on their knowledge. However, this immediate feedback mechanism may contribute only to a surface understanding of knowledge and may not promote its deeper internalization.

Mind mapping

Mind maps can help learners externalize their thinking processes and lead them to visual thinking and orderly thinking, which can in turn improve their problem-solving skills (Liu et al., 2018). Some previous studies have shown that mind maps can transform complex ideas into visual diagrams, which not only help learners integrate new knowledge with existing knowledge, but also help them memorize and comprehend the learning content (Buzan & Buzan, 2002) and promote their creative thinking (Abd Karim & Abu, 2018; Bonk & Cunningham, 2012). Some researchers have reported the potential of mind maps for promoting learners’ critical thinking by helping them to analyze and identify associations between different aspects of knowledge (Abd Karim & Mustapha, 2022; Fu et al., 2019; Wu & Wu, 2020).

Mind maps also have a positive impact in programming education. Ismail et al. (2010) found that combining mind map learning scaffolds with cooperative learning had a significant positive impact on students’ problem-solving skills, programming performance, and metacognitive knowledge. Liu et al. (2018) investigated the impact of mind maps on undergraduate programming learning, finding that they could transform abstract and intangible thought patterns into visible and radiating thought patterns, thus improving students’ logical and creative thinking. Chen (2020) integrated mind maps into the teaching of visual programming with Scratch to enhance students’ motivation and reflective abilities in learning.

However, training with mind maps as a cognitive tool can be time-consuming and might impose a significant cognitive load on K-12 learners (Van Gog et al., 2006). Therefore, educators need to identify more suitable mind map scaffolding methods for K-12 learners.

Method

In order to assess the effects of a learning approach that integrates mind mapping with a GenAI chatbot on middle school students’ programming academic performance, CT, and programming self-efficacy, and to investigate whether there is a difference in students’ performance in the above areas when using different types of mind mapping-supported GenAI chatbots, this quasi-experimental study was conducted with three groups of 111 middle school students. The participants, study-related design, experimental procedures, and measurement tools are described in detail below.

Participants

The 111 participants in this study were seventh-grade students (11–12 years old) from three classes in a public junior high school in southeastern China. Two classes were randomly assigned as Experimental Group 1 (16 boys, 20 girls) and Experimental Group 2 (17 boys, 19 girls), and the other class served as the control group (19 boys, 20 girls). All participants had taken information technology (IT) courses at the primary school level and had basic computer skills. Students in the three groups completed an 11-week experimental course. The course covered the same content in all groups and was taught by the same IT teacher. The main content included “Python Programming Basics” and “Python Programming Basic Structure.”

Design of the GenAI chatbot based on a large language model

For this experiment, the research team constructed a GenAI chatbot based on a large language model, choosing GPT-4 as the core model, a choice that makes it significantly different from traditional AI chatbots. Compared to chatbots that rely on rules or simple pattern matching, GPT-4 implements dynamic language processing via deep neural networks, is trained on a wide range of datasets, and can understand and generate human-like responses across multiple topics (Tai & Chen, 2024). This capability significantly enhances the chatbot’s conversational abilities, enabling it to interact with users in a more natural and contextually relevant way and to provide a more realistic and immersive interactive experience. The chat interface is shown in Fig. 1. Students can initiate conversations on various topics with the chatbot by entering their questions in the main window.

Deploying GPT-4 (or large language models in general) within the educational domain requires careful consideration of how prompts are constructed in order to maximize the efficiency of the model (Liu et al., 2024). In this study, a modular hybrid prompt design (Liu et al., 2022) was used to ensure relevance and precision across course stages by combining general course prompts with stage-specific prompts. General course prompts provide meta-information such as course objectives, expected outcomes, and target grade level at the beginning of the course, laying the groundwork for students to interact with the GenAI chatbot. Stage-specific prompts, on the other hand, provide customized guidance according to the different stages of the course (e.g., clarifying the problem, analyzing the problem, formulating a solution, writing a program, and summarizing and reflecting on it) to ensure that the chatbot’s feedback is closely aligned with the learning objectives of the current stage. Students were also free to interact with the GenAI chatbot at any stage of the course to ask questions, seek clarification, or request more examples, which enabled dynamic support for personalized learning. Finally, the chatbot designed for this study supports voice interaction to give students more convenient input and output and to meet the cognitive and emotional needs of different students.
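The modular hybrid prompt design described above can be illustrated with a minimal sketch. The study does not publish its prompt text, so the `CourseInfo` fields, the stage keys, the guidance strings, and the `build_system_prompt` helper below are all hypothetical; only the idea of combining one general course prompt with one stage-specific prompt comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CourseInfo:
    """General course meta-information (field values are illustrative)."""
    objectives: str
    expected_outcomes: str
    grade_level: str


# One entry per course stage named in the text. The guidance strings are
# invented placeholders, not the prompts used in the study.
STAGE_PROMPTS = {
    "clarify_problem": "Help the student restate the problem in their own words.",
    "analyze_problem": "Help the student list the knowledge points the problem needs.",
    "formulate_solution": "Discuss alternative solution plans and their trade-offs.",
    "write_program": "Give debugging hints for the student's code; do not write it for them.",
    "summarize_reflect": "Help the student review what worked and what to improve.",
}


def build_system_prompt(course: CourseInfo, stage: str) -> str:
    """Combine the general course prompt with the prompt for one stage."""
    if stage not in STAGE_PROMPTS:
        raise ValueError(f"unknown stage: {stage!r}")
    general = (
        f"Course objectives: {course.objectives}\n"
        f"Expected outcomes: {course.expected_outcomes}\n"
        f"Target grade level: {course.grade_level}"
    )
    return f"{general}\n\nCurrent stage: {stage}\n{STAGE_PROMPTS[stage]}"


if __name__ == "__main__":
    course = CourseInfo(
        objectives="Learn the basic structures of Python programs",
        expected_outcomes="Write and debug short Python programs",
        grade_level="Grade 7",
    )
    print(build_system_prompt(course, "analyze_problem"))
```

Keeping the general prompt and the stage prompts in separate pieces means a teacher could swap the stage text as the lesson advances without touching the course-level information.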

Protecting students’ private data while they interact with a GenAI chatbot is important. We therefore developed a dedicated web application based on open-source code. The application allows educators to update system prompts in real time based on course dynamics. Critically, all sensitive information, such as personalization settings and conversation logs, is securely stored in the IndexedDB database of the local browser. Interaction with GPT-4 takes place via OpenAI’s API, ensuring that data privacy standards are strictly adhered to.

Design of thinking maps scaffolding and teaching activities

When using a GenAI chatbot, appropriate and accurate questions help to obtain higher-quality answers. Creating a mind map to clarify the knowledge points and logical ideas needed to solve a problem helps students clarify the direction and purpose of their communication with the chatbot and obtain more accurate and useful responses. After getting feedback from the chatbot, learners can modify and optimize the mind map to make their problem-solving ideas clearer and more visible. The relationship between the mind map and the chatbot is therefore mutually reinforcing: the mind map helps the chatbot generate high-quality answers, and the chatbot supports the iterative optimization of the mind map. The relationship between students, mind maps, and chatbots is shown in Fig. 2.

In traditional teaching, some teachers let students construct mind maps by themselves, that is, “self-constructed” mind maps. Since junior high school students are young and lack hands-on skills and sufficient thinking logic, it is difficult for them to complete mind maps independently. To solve this problem, we designed a progressive mind map scaffold with three stages, using different forms of mind maps in each stage to gradually reduce the support for learners’ knowledge construction. The first stage was a gap-filling mind map; the second stage was a prompting mind map; and the third stage was a self-constructed mind map. Gap-filling mind maps are mind map frames pre-designed by the teacher, which contain clues to solve the problem and some blank spaces for students to fill in the key knowledge points. In prompting mind maps, the teacher no longer provides the framework of the mind map, but gives some hints or key points to solve the problem, and students need to design the solution and complete the mind map according to the hints. Self-constructed mind maps refer to mind maps that students construct independently according to their own understanding of the knowledge and thinking logic. These three stages are interrelated and present a progressive approach to promote students’ cognitive and thinking development. The progressive mind map is shown in Fig. 3.

In this study, the instructional activities consisted of five segments, namely: defining the problem, analyzing the problem, developing solutions, writing the program, and summarizing and reflecting.

(1) Defining the problem

In this session, the teacher set up situational problems based on daily life scenarios, presented the teaching case through the scenarios, and posed questions to the students. Students then entered the problem situation designed by the teacher and identified the core of the problem. The role of mind mapping in this session was to help students break down the problem and focus on its core. Students could sort out the information in the situation through the mind map, so as to clarify the different aspects and levels of the problem and, by presenting their thinking in a structured way, work out its logical relationships. The GenAI chatbot helped students quickly capture the key issues in a situation through question explanations and interactive feedback.

(2) Analyzing the problem

In this session, the teacher guided the students to analyze the identified problem in detail and create an initial mind map. Guided by the teacher, students used the mind map to break down the problem and sort out the knowledge points needed to solve it. The mind map helped students organize the logic of their analysis and build a clear logical structure. The GenAI chatbot could provide rich suggestions for analysis to help students further expand their thinking. When students encountered difficulties during the analysis, they could seek inspiration and answers through conversations with the chatbot.

(3) Developing solutions

In this session, the teacher organized the students to complete the mind maps and monitored the teaching and learning process. Students needed to conceptualize the solution to the problem based on the previous session and complete the mind map as a whole. The mind map helped students organize the solution and clarify the steps at this stage, transforming abstract solution ideas into concrete operational solutions. The GenAI chatbot helped students expand their ideas by suggesting and discussing different solutions to ensure that their solutions were reasonable and innovative. Through interaction, the chatbot could also help students evaluate the strengths and weaknesses of different solutions.

(4) Writing the program

In this session, the teacher organized the students to write code on the computer and made rounds to check whether students were having the chatbot generate code directly. Students needed to transform the solution in their mind map into code, then debug, run, and optimize it. Mind mapping helped students visualize the solution in this session. By referring to the mind map, students were able to check whether the implementation steps in their program matched the solution they had designed, thus ensuring that the programming was logical and systematic. The GenAI chatbot provided debugging suggestions and instant error correction in real time based on the students’ code, helping them solve specific problems in the programming process, optimize the program structure and logic, and improve the effectiveness and performance of the program.

(5) Summarizing and reflecting

In this session, the teacher guided the students to use the GenAI chatbot to analyze their programming process and comprehensively reviewed the students’ learning outcomes and classroom performance. The chatbot could provide personalized suggestions based on its interaction with students to assist them in summarizing and reflecting. Students reflected on the summaries and suggestions of the teacher and the GenAI chatbot, and revised and improved their mind maps. The complete flow of the teaching activities is shown in Fig. 4.

Fig. 4 Teaching activity flow.

The experimental procedure is shown in Fig. 5. Participants took part in the study in a computer classroom over 11 weeks, with one 40-min session per week. Since the participants had not previously learned the Python language and were being exposed to the GenAI chatbot for the first time, the first 3 weeks of the study consisted of lessons on the basics of the Python programming language and the use of the GenAI chatbot. In week 4, students completed pretest questions on programming knowledge as well as a pretest questionnaire. The first phase of progressive learning in Experimental Group 1 was weeks 5 and 6, in which students learned programming using the learning method of integrating gap-filling mind maps with the GenAI chatbot. The second phase was weeks 7 and 8, in which students learned programming using the learning method of integrating prompting mind maps with the GenAI chatbot, and the third phase was weeks 9 and 10, in which they learned programming using the learning method of integrating self-constructed mind maps with the GenAI chatbot. Students in Experimental Group 2 learned programming using the learning method of integrating self-constructed mind maps with the GenAI chatbot during weeks 5 to 10. Students in the control group did not use the mind map scaffolding, but learned programming only with the assistance of the GenAI chatbot during weeks 5 to 10.
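The week-by-week design above can be summarized as a small lookup, which may make the difference between the three groups easier to see. This is only an illustrative sketch: the group codes `EG1`, `EG2`, and `CG` and the function `activity_for` are hypothetical names, and week 11 is left unspecified because this section does not describe it.

```python
EG1, EG2, CG = "EG1", "EG2", "CG"  # hypothetical codes for the three groups

# Progressive scaffold used by Experimental Group 1, two weeks per stage.
PROGRESSIVE_STAGES = ("gap-filling", "prompting", "self-constructed")


def activity_for(group: str, week: int) -> str:
    """Return the activity for a group in a given week of the 11-week course."""
    if group not in (EG1, EG2, CG):
        raise ValueError(f"unknown group: {group!r}")
    if not 1 <= week <= 11:
        raise ValueError("week must be between 1 and 11")
    if week <= 3:
        return "Python basics and GenAI chatbot training"
    if week == 4:
        return "programming pretest and pretest questionnaire"
    if week == 11:
        return "not described in this section"
    # Weeks 5 to 10 are the intervention weeks.
    if group == CG:
        return "GenAI chatbot only"
    if group == EG2:
        return "self-constructed mind map + GenAI chatbot"
    stage = PROGRESSIVE_STAGES[(week - 5) // 2]  # weeks 5-6, 7-8, 9-10
    return f"{stage} mind map + GenAI chatbot"


if __name__ == "__main__":
    for week in range(1, 12):
        print(week, [activity_for(g, week) for g in (EG1, EG2, CG)])
```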

Instrument

In this study, we conducted a pretest and posttest of programming knowledge, pre- and post-questionnaire surveys of CT, programming self-efficacy, and programming learning motivation, as well as interviews. We designed the pre- and posttest questions of programming knowledge based on the course syllabus, which were revised by two senior IT teachers. The number and type of questions in the pre- and posttests were the same, with a full score of 100. The pretest questions were designed to assess learners’ prior knowledge of Python programming, and the posttest questions were designed to assess participants’ mastery of programming knowledge after completing the course.

The questionnaire used to measure participants’ CT was adapted from the scale developed by Korkmaz et al. (2015). The scale was developed based on the theoretical framework of the ISTE to measure the CT of K-12 students. The scale comprises 22 questions, representing the five dimensions of CT: creativity, algorithmic thinking, collaboration, critical thinking, and problem-solving tendency. The total Cronbach’s alpha coefficient of the questionnaire was 0.82, and the Cronbach’s alpha coefficients of the five dimensions were 0.84, 0.87, 0.86, 0.78, and 0.73, respectively. The questionnaire used a 5-point Likert scale, and the answer range was from 1 (strongly disagree) to 5 (strongly agree).
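For readers unfamiliar with the reliability coefficient reported above, Cronbach’s alpha for a scale of k items is k/(k − 1) × (1 − sum of item variances / variance of total scores). The sketch below computes it with the Python standard library. The function name `cronbach_alpha` and the response data are made up for illustration and are not the study’s data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Uses sample variances (n - 1 denominator) for items and total scores.
    """
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("need at least two items")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")

    item_variances = [
        variance([row[i] for row in responses]) for i in range(n_items)
    ]
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Hypothetical 5-point Likert answers: 4 respondents x 3 items.
    sample = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 3, 3]]
    print(round(cronbach_alpha(sample), 3))
```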

The programming self-efficacy questionnaire was adapted from the self-efficacy subscale of the Motivated Strategies for Learning Questionnaire (MSLQ) developed by Pintrich and De Groot (1990). The scale consisted of nine items rated on a 7-point Likert scale, ranging from 1 (strongly disagree) to 7 (strongly agree). The Cronbach’s alpha coefficient of the scale was 0.835, indicating good reliability.

Results

In this study we used the Jamovi 2.3.28 software to conduct statistical analysis on the three groups’ data. Before data analysis, the dependent variables were tested for normal distribution and homogeneity of variances.
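The text does not say which homogeneity-of-variance test was used. As an assumption for illustration only, the sketch below computes the statistic of Levene’s test (mean-centered version), one common choice for this check, using the Python standard library. The function name `levene_statistic` and the scores are hypothetical, and the p-value (which requires the F distribution) is deliberately left out.

```python
from statistics import fmean


def levene_statistic(groups):
    """Levene's W (mean-centered) and its degrees of freedom (k - 1, N - k).

    W is compared against an F distribution with these degrees of freedom;
    a large W suggests the group variances differ.
    """
    k = len(groups)
    if k < 2 or any(len(g) < 2 for g in groups):
        raise ValueError("need at least two groups with two values each")
    n_total = sum(len(g) for g in groups)

    # Absolute deviation of each score from its own group mean.
    deviations = []
    for g in groups:
        center = fmean(g)
        deviations.append([abs(y - center) for y in g])

    group_means = [fmean(z) for z in deviations]
    grand_mean = fmean([v for z in deviations for v in z])

    between = sum(
        len(z) * (m - grand_mean) ** 2 for z, m in zip(deviations, group_means)
    )
    within = sum(
        (v - m) ** 2 for z, m in zip(deviations, group_means) for v in z
    )
    if within == 0:
        raise ValueError("no within-group variation in the deviations")
    w = (n_total - k) / (k - 1) * between / within
    return w, (k - 1, n_total - k)


if __name__ == "__main__":
    # Hypothetical test scores for three small groups.
    scores = [[55, 60, 65, 58], [52, 61, 70, 59], [57, 59, 61, 60]]
    print(levene_statistic(scores))
```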

Academic achievements

After completion of the relevant foundational knowledge instruction and before the formal teaching intervention, a pretest on programming was administered to students from the three classes. The descriptive results of the programming pretest scores are shown in Table 1. According to Table 1, the average pretest scores for Experimental Group 1, Experimental Group 2, and the control group were 59.75, 60.06, and 58.64, respectively. The differences among the group means were minimal.

Table 1 Descriptive statistics of programming pretest scores.
