Introduction

Research background and motivations

With the advancement of educational informatization, artificial intelligence (AI) has been increasingly applied in personalized learning and automated assessment1,2. However, most existing studies focus on general text classification or sentiment analysis and rarely address the closed-loop process of text adaptation, student proficiency mapping, and real-time feedback in high school English reading instruction3,4. Compared with large pre-trained Transformer models such as the BERT series, which require high computational resources, Text Convolutional Neural Network (Text CNN) offers natural advantages in computational efficiency, real-time performance, and sensitivity to local semantics (phrases or sentence structures), making it easier to deploy and iterate in resource-constrained classroom environments5. Moreover, integrating Text CNN with interpretability modules (e.g., key term and phrase visualization) and instructional strategies can provide teachers with actionable guidance, forming a teaching loop that connects model output, teacher intervention, and data feedback. Based on these motivations, this study proposes and validates a Text CNN application framework for high school reading instruction, focusing on three key aspects: automatic difficulty assessment of materials, personalized recommendations, and interpretable feedback. Comparative experiments were conducted in real classroom settings to evaluate the framework’s practical effectiveness.

Research objectives

The main objective of this study is to enhance the effectiveness of high school English reading instruction using the Text CNN model, focusing on three key aspects:

  1. Content adaptation: Employing Text CNN to classify reading materials and extract keywords, thereby providing personalized learning content tailored to students’ reading levels.

  2. Learning outcome evaluation: Conducting a comparative experiment to assess the effectiveness of the Text CNN-assisted tool in improving students’ reading comprehension.

  3. Teaching efficiency optimization: Investigating the use of Text CNN for automated assessment to reduce teachers’ workload and improve overall teaching efficiency.

To achieve these objectives, a comparative experiment was designed involving 60 high school students, who were divided into an experimental group and a control group. The experimental group used the Text CNN-based assistive tool alongside traditional teaching methods, while the control group relied solely on conventional textbooks and teacher guidance. Pre- and post-test comparisons, along with model performance evaluations, were conducted to validate the effectiveness of Text CNN in English reading instruction. Furthermore, this study provides both theoretical foundations and practical guidance for the development and optimization of future educational technologies.

Literature review

In recent years, deep learning-based text analysis techniques have received widespread attention for their applications in language processing and educational settings. Qin and Irshad (2024) developed an English textbook readability evaluation method based on the Text CNN model, providing an innovative solution for reading instruction. Their study demonstrated that selecting learning materials according to students’ reading abilities not only enhanced reading interest and comprehension but also achieved high evaluation accuracy (90%) on a self-constructed dataset, offering a scientific basis for English instructional design6.

In terms of reliability assessment and predictive analysis methodologies, recent studies have offered valuable references for evaluating the robustness of educational technology systems. Shehadeh et al.7 proposed a multi-state system reliability assessment method based on interval universal generating functions, using Dempster–Shafer theory and interval analysis to estimate system reliability under uncertainty7. This approach captures potential system states and their likelihoods through interval-valued belief functions, providing a refined perspective for complex system safety evaluation. Similarly, Shehadeh and Alshboul8 applied advanced ensemble machine learning algorithms to building safety prediction, achieving a prediction accuracy of 98.13% with a modified decision tree model, demonstrating the advantages of ensemble learning in complex environments8. These methodologies offer useful guidance for reliability assessment and predictive analysis in educational technology systems.

In the field of natural language processing, Khan et al.9 employed a Convolutional Neural Network–Long Short-Term Memory (CNN-LSTM) architecture for multilingual sentiment analysis in English and Roman Urdu texts. Their model performed well across multiple corpora, illustrating the complementary strengths of CNN and LSTM9. Susandri et al.10 further improved sentiment classification accuracy to 88% using a hybrid Convolutional Neural Network–Bidirectional Long Short-Term Memory (CNN-BiLSTM) model, highlighting the practical potential of deep learning for large-scale sentiment analysis10.

Regarding model interpretability, Ce and Tie11 proposed a backtracking analysis method for CNN-based text classification models and visualized results on the IMDb dataset, providing new approaches for enhancing transparency and trust in text classifiers11. Ren et al.12 introduced the Dynamic Label Alignment Strategy (DyLas) to address dynamic label changes in large-scale multi-label text classification using large language models. This method proved effective in domains such as e-commerce, news, medical coding, and legal documents12.

In reference parsing and semantic modeling, Yin and Wang13 proposed the Contrastive Prompt-based Parser for References (CONT_Prompt_ParseRef) model, which demonstrated strong robustness even in low-resource settings13. Jiang et al.14 developed the Feature Fusion and Multi-branch Graph Convolutional Network (Fpa-GCN) to improve the extraction accuracy of aspect-based sentiment triples through multi-branch graph convolution and feature fusion, showing the advantage of graph structures for capturing fine-grained contextual relations14. Chen et al.15 further proposed the Graph Cross-correlation Recommendation (GCR) method, which modeled cross-correlations between user and item subgraphs, achieving superior performance in recommendation tasks compared to mainstream approaches15.

In addition, recent years have seen a deepening of cross-disciplinary research on the integration of AI and complex system decision models, providing new reference paths for the reliability and interpretability of intelligent models in education. Shehadeh et al.7 proposed an integrated management framework combining digital twin technology with machine learning algorithms to achieve dynamic coordination within urban “water–energy–food–environment” systems. Through 3D modeling and real-time data-driven simulation mechanisms, the study enabled visualized prediction and optimized management of resource allocation, offering quantitative support for sustainable development goals. This framework emphasizes the role of AI in multi-source data fusion and feedback-driven decision-making, providing valuable insights for the design of learner behavior modeling and instructional feedback loops in educational intelligent systems16. Moreover, Shehadeh et al.7 introduced a predictive analytics approach based on an improved Extreme Gradient Boosting (XGBoost) algorithm for conflict detection in Building Information Modeling (BIM). This method significantly enhanced classification and prediction accuracy across multi-structural information, demonstrating the high robustness of machine learning models in complex semantic environments17. In addition, Shehadeh and Alshboul8 integrated virtual reality with machine learning to construct an intelligent framework for proactive detection and collaborative design, effectively reducing design conflicts by 16% and project duration by 12%. This study highlights AI’s potential in interactive visualization and human–machine collaborative optimization, and its “real-time feedback–adaptive improvement” paradigm bears clear parallels to adaptive content recommendation mechanisms based on learner responses in educational contexts18. 
Further, Alshboul et al.19 proposed a quality management decision framework based on evidence reasoning and belief functions, employing the Dempster combination rule to integrate multi-source information, enabling dynamic updates and accurate judgment under uncertainty19. This approach provides a mathematical foundation for building AI-based teaching models capable of self-learning and dynamic adjustment, particularly valuable when handling fuzzy labels and multivariate features in educational text analysis.

Overall, these studies collectively reflect the latest advances of AI in predictive analytics, data fusion, and uncertainty management within complex systems. They not only expand AI’s application boundaries in engineering and management but also offer cross-disciplinary methodological support for educational technology research. Building on these insights, this study develops a Text-CNN-based intelligent reading instruction model that enhances a data-driven instructional feedback system. The model facilitates the transition of AI from simple “model optimization” to system-level intelligence and improved instructional interpretability.

Research methodology

Adaptation of Text CNN for reading instruction tasks

CNNs were initially developed for computer vision tasks, but their natural language processing variant, known as Text CNN, has been shown to efficiently capture local contextual features in text20,21,22. Compared with large-scale pre-trained models with complex architectures, such as the BERT series, Text CNN can achieve high performance in text representation and classification while maintaining a smaller model size, faster inference speed, and lower training cost. These characteristics make it particularly suitable for resource-constrained educational scenarios, such as real-time classroom feedback systems and learning platforms requiring rapid text analysis.

The core idea of Text CNN is to apply convolutional filters of varying sizes over sequences of embedded word vectors to capture n-gram features at different scales. For example, a filter size of 2 can extract phrase-level collocations, while sizes of 3 or 4 can capture more complex semantic patterns. These local features are then aggregated through pooling layers to form a compact representation of the entire text. Unlike traditional methods, Text CNN directly models local dependencies between words, better reflecting sentence-level semantics within the broader discourse context.
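As an illustration of this mechanism, the sliding-window convolution and max pooling can be sketched in NumPy (a toy example with random kernels and embeddings, not the trained model):

```python
import numpy as np

rng = np.random.default_rng(0)

def pooled_ngram_feature(X, kernel, bias=0.0):
    """Slide a (k x d) kernel over an (n x d) embedding matrix X.

    Each window of k consecutive word vectors yields one response
    (an n-gram feature); max pooling keeps the strongest response.
    """
    n, _ = X.shape
    k = kernel.shape[0]
    fmap = np.array([np.sum(X[i:i + k] * kernel) + bias
                     for i in range(n - k + 1)])
    return fmap.max()

# Toy sentence: 10 words embedded in 8 dimensions.
X = rng.normal(size=(10, 8))

# Parallel filters of sizes {2, 3, 4}, one pooled feature per filter.
features = [pooled_ngram_feature(X, rng.normal(size=(k, 8))) for k in (2, 3, 4)]
print(len(features))  # 3
```

Concatenating the pooled responses from all filters produces the compact text representation described above.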

In English reading instruction, students often face challenges such as inaccurate assessment of text difficulty, unclear identification of key terms, and insufficient personalized exercises. To address these issues, this study adapts and extends the standard Text CNN architecture for educational tasks:

  1. Multi-scale convolution design: Parallel convolutional filters of sizes {2, 3, 4} are used to extract features at the phrase, syntactic fragment, and complex sentence levels, enabling the model to capture linguistic information at the word, sentence, and paragraph scales.

  2. Attention-based pooling mechanism: Replacing traditional max pooling, an attention-weighted strategy highlights key words and sentences relevant to reading comprehension, enhancing model interpretability and allowing teachers to intuitively identify areas of difficulty for students.

  3. Teaching difficulty mapping module: A difficulty-level mapping layer is added after the convolutional representation, aligning textual features with Common European Framework of Reference for Languages (CEFR) levels or a custom difficulty system. This enables automatic assessment of appropriate reading material levels.

  4. Personalized recommendation interface: By integrating students’ historical performance and error patterns, model outputs are connected to a recommendation module that dynamically delivers adaptive exercises, forming a closed loop of “model assessment → recommended resources → student feedback.”

Through these adaptations, Text CNN not only provides efficient text representation and classification but also deeply aligns with instructional needs, balancing real-time performance, interpretability, and personalization. In high school English reading instruction, the model can assist teachers in selecting appropriate materials while providing students with exercises tailored to their proficiency levels, demonstrating the practical value of AI technologies in educational settings.
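The attention-weighted pooling adaptation can be sketched as follows (a minimal illustration with a randomly initialized context vector, not the authors' exact module):

```python
import numpy as np

def attention_pool(C, u):
    """Attention-weighted pooling over a feature map.

    C: (L, d) feature map from the convolutional layer.
    u: (d,) context vector scoring each position.
    Returns the pooled vector and the weights; the weights can be
    visualized to show which positions the model emphasizes.
    """
    scores = C @ u
    w = np.exp(scores - scores.max())   # numerically stable softmax
    w /= w.sum()
    return w @ C, w

rng = np.random.default_rng(1)
C = rng.normal(size=(7, 16))            # 7 positions, 16 feature channels
pooled, weights = attention_pool(C, rng.normal(size=16))
print(pooled.shape)  # (16,)
```

Unlike max pooling, which keeps only one response per filter, the weight vector retains a per-position importance score that can be surfaced to teachers.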

Text CNN-based English reading material evaluation model

Model evaluation typically involves splitting the dataset into two parts: one for training the model and the other for evaluating its performance23. During the training phase, the primary task is to learn the core patterns related to readability assessment from labeled texts, which form the foundation of the model. Once trained, the model can be applied not only to texts within the dataset but also to new, unseen texts, accurately predicting their readability levels. In deep learning-based text evaluation models, the main objective is to establish a mapping from a text dataset \(\:D=\{{d}_{1},{d}_{2},...,{d}_{n}\}\) to corresponding readability levels \(\:{l}_{i}\in\:\{{G}_{1},{G}_{2},...,{G}_{m}\}\)24,25,26,27. Typically, the model consists of three components: text data representation, feature extraction, and classification. Figure 1 illustrates the overall structure of the text evaluation model.

Fig. 1. Text evaluation model.

First, the raw text undergoes preprocessing, including tokenization, stop-word removal, and word vector embedding. The processed text is then fed into the Text CNN network, which employs multi-scale convolutions and attention-based pooling to extract both local and global semantic features. Subsequently, a teaching adaptation layer maps the convolutional representations to reading difficulty levels and key terms, which are further linked to a personalized recommendation interface. After the model outputs the classification results, it is trained using cross-entropy loss and the Adam optimizer. Techniques such as Dropout, L2 regularization, and early stopping are applied to reduce the risk of overfitting. Finally, model performance is evaluated using K-fold cross-validation and an independent test set, while confidence intervals and confusion matrices are employed to assess the robustness of the results.

The implementation of text representation is a crucial prerequisite for performing text readability evaluation tasks using DL. During the text data representation phase, the Word2vec tool is commonly used to generate word vectors28. Word2vec efficiently converts words into numerical vectors and expresses the characteristics of these words in vector space. The core idea is to train word vectors of specific dimensions using a DL model; by calculating the distance between word vectors, the similarity between words can be measured. Inputting a sequence of words \(\:[{w}_{1},{w}_{2},...,{w}_{n}]\) into Word2vec yields the corresponding word vectors \(\:\mathbf{X}=[{\mathbf{x}}_{1},{\mathbf{x}}_{2},...,{\mathbf{x}}_{n}]\), where \(\:{\mathbf{x}}_{i}\in\:{\mathbb{R}}^{d}\) and n denotes the sequence length. For a sentence of length n (padding can be applied if necessary), its representation is as follows:

$$\:{\mathbf{X}}_{1:n}={\mathbf{x}}_{1}\oplus\:{\mathbf{x}}_{2}\oplus\:...\oplus\:{\mathbf{x}}_{n}$$
(1)

\(\:\oplus\:\) denotes the concatenation operation. Each row is the Word2vec vector of one word, and the rows are stacked vertically in the order of the words in the sentence. The input data size is \(\:n\times\:k\), where n denotes the number of words in the longest sentence in the training data (typically set to 64) and k represents the embedding dimension (typically set to 300).
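Under these typical settings (n = 64, k = 300), the input construction of Eq. (1) can be sketched as follows; the embedding lookup is simulated here with random vectors standing in for a trained Word2vec model:

```python
import numpy as np

n, k = 64, 300
rng = np.random.default_rng(42)

# Simulated Word2vec table: in practice this comes from a trained model.
vocab = {w: rng.normal(size=k) for w in ["the", "student", "reads", "a", "text"]}
pad = np.zeros(k)

def sentence_matrix(tokens):
    """Stack word vectors row-wise and zero-pad/truncate to n rows."""
    vecs = [vocab.get(t, pad) for t in tokens[:n]]
    vecs += [pad] * (n - len(vecs))          # zero-pad shorter sentences
    return np.vstack(vecs)                    # shape (n, k): one row per word

X = sentence_matrix(["the", "student", "reads", "a", "text"])
print(X.shape)  # (64, 300)
```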

During the feature extraction process, the CNN model is chosen as the encoder. First, the word vector representation X of the input sequence is obtained. To extract meaningful features from the sequence, convolutional kernels of different sizes are applied to X for convolution operations29,30. Using multiple kernel sizes allows relationships between words at different ranges in the sequence to be captured effectively, resulting in more representative features. Suppose a convolution kernel has size k; it operates on a segment of the sequence of length k, extracting local contextual information through a sliding-window mechanism.

Next, the convolution operation generates feature maps, where each feature map represents the response degree of a specific pattern in the sequence. To improve the generalization ability of the model and focus on important features, a non-linear activation function and pooling operations are usually added after the convolutional layer31,32,33. The non-linear activation function helps capture complex patterns in the data, while pooling reduces the feature map size, decreases computational complexity, and suppresses noise. As the convolutional kernel moves, a window matrix \(\:{\mathbf{W}}_{i}=[{\mathbf{x}}_{i},{\mathbf{x}}_{i+1},...,{\mathbf{x}}_{i+k-1}]\) containing k consecutive words is formed at each position i in the sequence. The window matrix \(\:{\mathbf{W}}_{i}\) is then convolved with the kernel matrix M, generating a feature map \(\:\mathbf{C}\in\:{\mathbb{R}}^{n-k+1}\). The feature mapping of the word window vector w at position i is calculated as follows:

$$\:{\mathbf{c}}_{i}=\sigma\:(\mathbf{w}\otimes\:\mathbf{m}+b)$$
(2)

\(\:\otimes\:\) denotes the convolution (element-wise multiplication and summation) operation, b represents the bias term, and \(\:\sigma\:\) denotes the sigmoid activation function. Through these calculations, the feature map is obtained as follows:

$$\:\mathbf{C}=[{\mathbf{c}}_{1},{\mathbf{c}}_{2},...,{\mathbf{c}}_{n-k+1}]$$
(3)

Next, max pooling is applied to the results obtained from the convolution operation, as shown in the following equation:

$$\:\widehat{\mathbf{c}}=max\left\{\mathbf{C}\right\}$$
(4)

The max pooling operation selects the maximum value from the feature map \(\:\mathbf{C}\) as the feature representation produced by that particular convolutional kernel.

For regularization, dropout is applied at the penultimate layer \(\:\mathbf{Z}=[{\widehat{\mathbf{c}}}_{1},{\widehat{\mathbf{c}}}_{2},...,{\widehat{\mathbf{c}}}_{m}]\), and an \(\:{l}_{2}\) norm constraint is applied to the weight vector. The equation is as follows:

$$\:\mathbf{y}=\mathbf{w}\cdot\:(\mathbf{z}\circ\:\mathbf{r})+b$$
(5)

\(\:\circ\:\) represents the element-wise multiplication operator, and \(\:\mathbf{r}\) denotes a masking vector of Bernoulli random variables, each equal to 1 with probability p. Additionally, after each gradient descent step, if \(\left\| {\mathbf{w}} \right\|_{2} > s\), the weight vector \(\:\mathbf{w}\) is rescaled so that \(\left\| {\mathbf{w}} \right\|_{2} = s\), thereby enforcing the \(\:{l}_{2}\) norm constraint on the weight vector.
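The dropout masking and max-norm rescaling of Eq. (5) can be sketched directly (illustrative values; the p and s used here are hypothetical, not the study's settings):

```python
import numpy as np

rng = np.random.default_rng(3)
p, s = 0.5, 3.0                     # keep probability and l2 max-norm bound

z = rng.normal(size=10)             # penultimate-layer features Z
w = rng.normal(size=10) * 5.0       # weight vector (deliberately large norm)
b = 0.1

r = rng.binomial(1, p, size=10)     # Bernoulli mask: 1 with probability p
y = w @ (z * r) + b                 # Eq. (5): y = w . (z o r) + b

# Max-norm constraint applied after a gradient step.
norm = np.linalg.norm(w)
if norm > s:
    w = w * (s / norm)              # rescale so that ||w||_2 = s
print(round(float(np.linalg.norm(w)), 4))
```

At test time the mask is dropped and the learned weights are scaled by p, following the standard dropout procedure.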

In the final step, the chosen features are forwarded to a dense Softmax layer for classification. In the text classification component, logistic regression is utilized to build a multi-class classifier, where the input vector corresponds to the feature vector generated by the CNN34. The final representation vector, v, is derived from the preceding Pooling layer and subsequently passed to the Softmax layer for classification. The equation is as follows:

$$\:Softmax\left({z}_{i}\right)=\frac{{e}^{{z}_{i}}}{\sum\:_{c=1}^{C}\:{e}^{{z}_{c}}}$$
(6)

The output value of the i-th node is denoted as \(\:{z}_{i}\), while C represents the total number of output nodes, which corresponds to the number of categories the nodes are classified into35,36,37,38. The Softmax function is used to transform the input values of a multi-class classification problem into a probability distribution within the range [0, 1], where the sum of all probabilities equals 1. The Text CNN model employs cross-entropy as the loss function, and its equation is as follows:

$$\:L=\frac{1}{N}\sum\:_{i}\:{L}_{i}=\frac{1}{N}\sum\:_{i}\:-\sum\:_{c=1}^{M}\:{y}_{ic}\text{l}\text{o}\text{g}\left({p}_{ic}\right)$$
(7)

M represents the total number of categories; \(\:{y}_{ic}\) is an indicator variable (taking values 0 or 1): it equals 1 if the true category of sample i is c, and 0 otherwise. \(\:{p}_{ic}\) denotes the predicted probability that sample i belongs to category c.
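Equations (6) and (7) can be written out directly in NumPy (a sketch of the computations, not the training code used in the study):

```python
import numpy as np

def softmax(z):
    """Eq. (6): exponentiate and normalize; subtracting the max is for stability."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, y_onehot):
    """Eq. (7): L = -(1/N) * sum_i sum_c y_ic * log(p_ic)."""
    p = softmax(logits)
    return -np.mean(np.sum(y_onehot * np.log(p), axis=1))

# Two samples, three difficulty categories (toy logits).
logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 1.5, 0.3]])
y = np.array([[1, 0, 0],
              [0, 1, 0]])

probs = softmax(logits)
loss = cross_entropy(logits, y)
print(round(float(loss), 4))
```

Each row of `probs` sums to 1, and the loss decreases as the probability mass assigned to the true category grows.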

Research design for English reading instruction

This study aims to explore the potential of an English reading instruction tool based on the Text CNN model to improve teaching efficiency and effectiveness. A comparative experiment was designed with the following arrangements: (1) Experimental Subjects: Students from an English learning class at a high school were selected and divided into a control group and an experimental group, with an equal number of students in each. The English proficiency levels of the students in both groups were comparable. (2) Experimental Content: The Text CNN model was employed to classify textbooks and supplementary reading materials, providing content appropriate for students’ reading levels. The model also extracted key terms from the reading materials, helping students quickly grasp the main themes and essential information of each article. Based on students’ learning feedback and comprehension performance, personalized reading materials and exercises were delivered to the experimental group. (3) Comparison of Teaching Plans: The control group followed the traditional approach of textbook reading combined with teacher guidance. The experimental group, in addition to the traditional method, used the Text CNN-based reading support tool to assist students in selecting materials, extracting key information, and receiving personalized practice. Figure 2 illustrates the experimental process flow diagram.

Fig. 2. Experimental process flow diagram.

Finally, a multidimensional quantitative evaluation of students’ learning outcomes is conducted. By comparing the pre-test and post-test, the improvement in reading comprehension abilities between the control group and the experimental group is assessed.

Experimental design and performance evaluation

Dataset collection, experimental environment, and parameter settings

The dataset used in this study consisted of 2,000 English reading texts, primarily sourced from publicly available educational corpora (available at: https://www.kaggle.com/datasets) and extended reading materials from the school. The texts covered six thematic categories: literature, science and technology, news, social topics, law, and education. Each text ranged from 500 to 1,500 words. All texts were independently annotated by two teachers, each with more than three years of high school English teaching experience. The annotations included text type, core themes, and reading difficulty levels. In cases of disagreement between the two teachers, a third senior teacher served as an arbitrator. To ensure annotation consistency, Cohen’s Kappa coefficient was calculated, yielding a value of 0.86, indicating a high level of agreement.

Reading difficulty levels were determined using a combination of automated and manual approaches. First, texts were automatically graded using the Flesch-Kincaid and Lexile readability formulas. These results were then reviewed and adjusted by teachers according to CEFR standards, producing five final levels: Beginner, Lower-Intermediate, Intermediate, Upper-Intermediate, and Advanced. This approach ensured that the grading process was both objectively supported and aligned with actual teaching practices.
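For reference, the automated first pass can be illustrated with the standard Flesch-Kincaid grade-level formula; the sketch below uses a rough vowel-group syllable heuristic, not the production grading tool:

```python
import re

def count_syllables(word):
    """Crude heuristic: count contiguous vowel groups (minimum one)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_kincaid_grade(text):
    """FK grade = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / sentences
            + 11.8 * syllables / len(words)
            - 15.59)

sample = "The student reads the text. The text is short."
grade = flesch_kincaid_grade(sample)
print(round(grade, 2))
```

The resulting grade scores, together with Lexile measures, were then mapped by teachers onto the five CEFR-aligned levels.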

The participants were 60 second-year high school students (average age 16.8 years, with an approximately equal gender distribution) from a key high school. Students’ English proficiency was stratified into low, medium, and high levels based on pretest scores from a standardized English reading comprehension test. Using stratified randomization, students were assigned to an experimental group and a control group, with 30 students in each, ensuring balance in pretest scores and gender distribution. Prior to the experiment, an equivalence test was conducted using independent-samples t-tests, which indicated no significant difference in pretest scores between the groups (p > 0.05), confirming their comparability.

The experiment was conducted using the Python programming language and deep learning frameworks, primarily TensorFlow and Keras, for model training. An NVIDIA GPU (GeForce GTX 1080 Ti) was employed to accelerate training, while a standard server with 32 GB of memory handled data processing. The parameters of the Text CNN model were adjusted based on prior research and the characteristics of the dataset. The model architecture consists of multiple convolutional layers for feature extraction, pooling layers to reduce dimensionality, and a fully connected layer for classification. Key hyperparameter settings are summarized in Table 1.

Table 1 Key hyperparameter settings.

To ensure robust performance evaluation, the dataset was split into training, validation, and test sets in a 70%/10%/20% ratio using stratified sampling by text category. Hyperparameter tuning (Grid or Bayesian search) was performed on the training set, with 5-fold cross-validation used to verify the stability of the final hyperparameters. The test set was reserved exclusively for final performance reporting. Evaluation metrics included accuracy, recall, and F1 score, with 95% confidence intervals calculated using bootstrap sampling (n = 1000).
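The bootstrap confidence-interval procedure (n = 1000 resamples) can be sketched as follows, with simulated predictions standing in for the real model outputs:

```python
import numpy as np

rng = np.random.default_rng(7)

# Simulated test set: five difficulty levels, roughly 90% correct predictions.
y_true = rng.integers(0, 5, size=400)
y_pred = np.where(rng.random(400) < 0.9, y_true,
                  rng.integers(0, 5, size=400))

def bootstrap_ci(y_true, y_pred, n_boot=1000, alpha=0.05):
    """Resample (true, pred) pairs with replacement and take accuracy quantiles."""
    accs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), size=len(y_true))
        accs.append(np.mean(y_true[idx] == y_pred[idx]))
    lo, hi = np.quantile(accs, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)

lo, hi = bootstrap_ci(y_true, y_pred)
print(round(lo, 3), round(hi, 3))  # 95% CI bounds for accuracy
```

The same resampling scheme applies unchanged to recall and F1, since each is a function of the resampled (true, predicted) pairs.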

Performance evaluation

Performance of the Text CNN model on different types of reading materials

When evaluated on different types of reading materials, the Text CNN model achieved high accuracy, recall, and F1 scores, demonstrating strong adaptability in handling diverse text-processing tasks. Comparisons of accuracy, recall, and F1 scores across different text categories are shown in Fig. 3.

Fig. 3. Comparison of accuracy, recall, and F1 score (%) across different types of reading materials.

The Text CNN model demonstrates robustness and adaptability across various types of reading materials. The results show that the model achieves the highest performance on science & technology and education texts, with accuracy rates of 93.5% and 94.2%, respectively. This suggests that the Text CNN model excels when processing texts with clear structures and specialized vocabulary. Such materials often feature standardized language and fixed expressions, which allow the model to effectively capture underlying patterns. The model also performs well on news and literary texts, achieving accuracy rates of 92.1% and 91.8%, respectively. Although these types of texts contain more emotional expressions and complex semantic relationships, the model is still able to extract key features and make accurate classifications. In contrast, performance on social media texts is slightly lower, with an accuracy of 89.6%. This decrease can be attributed to the colloquial language and informal expressions typical of social media, which pose challenges for Text CNN when processing unstructured content and diverse forms of expression. Nevertheless, the recall rate and F1 score remain high, indicating that the model maintains good generalization and balanced performance even on these more variable texts. Performance on legal texts is comparatively lower, with an accuracy of 90.3% and a recall of 89.8%. Legal texts often include complex terminology and require precise language, demanding deeper contextual understanding and fine-grained feature extraction. As a result, the Text CNN model exhibits slightly reduced performance on these tasks.

Overall, the model demonstrates stable and consistent performance across all text types, highlighting its adaptability in diverse textual environments. In conclusion, Text CNN achieves high accuracy and strong stability, particularly excelling with texts that have clear structures and standardized terminology. While the complexity of social media and legal texts slightly affects performance, the model remains effective and supportive across a wide range of reading tasks.

Performance comparison of different models in text analysis tasks

This section compares the performance of six commonly used text analysis models. They are Text CNN, Support Vector Machine (SVM), Naive Bayes, LSTM, BERT, and Gated Recurrent Unit (GRU). The models’ performance in text classification tasks is evaluated based on accuracy, recall, and F1 score. Figure 4 shows the comparison of accuracy, recall, and F1 score across different models in text analysis tasks.

Fig. 4. Comparison of accuracy, recall, and F1 score across different models for text analysis tasks.

Figure 4 reveals that Text CNN outperforms other models, with an accuracy of 94.1%, recall of 93.8%, and F1 score of 94%. This indicates that Text CNN is highly effective in text classification tasks, reflecting its strength in detecting local patterns in texts, which makes it especially suitable for tasks like sentence classification. BERT follows closely, with an accuracy of 92.3%, recall of 91.8%, and F1 score of 92%, reflecting the advantage of its deep contextual representations, albeit at substantially higher computational cost. The LSTM model also performs quite well, with an accuracy of 90.5%, recall of 89.7%, and F1 score of 90.1%. LSTM has an advantage in capturing long-range dependencies in text, making it suitable for text classification tasks that require long sequence context. GRU, similar to LSTM but with a simpler architecture, has an accuracy of 89.8%, recall of 88.6%, and F1 score of 89.2%. Although slightly inferior to LSTM, GRU still performs well in text classification tasks and offers higher computational efficiency. The SVM model performs well in high-dimensional spaces, with an accuracy of 86.7%, recall of 85.3%, and F1 score of 86%. While its performance is lower than that of the DL models, SVM remains a reliable baseline model in text classification tasks. Finally, the Naive Bayes model performs the worst, with an accuracy of 81.4%, recall of 80.8%, and F1 score of 81.1%. Despite its computational simplicity and ease of interpretation, Naive Bayes’ assumption of feature independence limits its performance on complex text data. Overall, BERT and Text CNN perform best among all models, with Text CNN showing the most outstanding performance.

Student performance analysis

This section analyzes the English reading comprehension test scores of the experimental and control groups before and after the experiment. The main indicators for the score analysis include the average score, standard deviation, median, maximum, and minimum values. By comparing the test results before and after the experiment, this work evaluates the effectiveness of the Text CNN-based auxiliary learning tool in enhancing students’ English reading ability. Figure 5 shows the student performance analysis.

Fig. 5. Student performance analysis.

In the control group, the average score before the experiment is 65.4, with a standard deviation of 5.8, indicating some variability in the students’ performance. After the experiment, although the control group’s scores improve slightly, with the average score rising to 70.3 and the standard deviation increasing to 6.2, the change is not significant. This suggests that traditional teaching methods have limited effectiveness in improving students’ reading comprehension abilities. In contrast, the experimental group shows a more significant change after using the Text CNN-assisted tool. Before the experiment, the experimental group’s average score is 66.2, with a standard deviation of 6.1, and their performance distribution is similar to that of the control group. After the intervention, the experimental group’s average score increases to 79.6, with a standard deviation of 7.3, and the median score is 80, indicating a notable improvement. The maximum score is 90, and the minimum score is 68, showing a broader distribution, but the overall improvement is substantial. From the data analysis, it is evident that the experimental group experiences a significant improvement in scores after using the Text CNN-based auxiliary tool, especially in reading comprehension. This demonstrates that the Text CNN model’s advantages in personalized learning, keyword extraction, and content recommendation can effectively enhance students’ performance in English reading comprehension.

Discussion

The experimental results demonstrate the effectiveness of the Text CNN model in English reading instruction, particularly for science & technology and education texts, achieving accuracy rates of 93.5% and 94.2%, respectively. Nevertheless, several limitations remain. First, the dataset is relatively small, containing only 2,000 texts with a limited range of topics and genres, which may restrict the model’s generalization to broader educational contexts. Second, although techniques such as Dropout and L2 regularization were applied, deep learning models trained on limited data are still prone to overfitting, particularly on text types with high linguistic variability, such as social media and legal documents. Third, the current model does not systematically incorporate external knowledge or cross-modal information, which may constrain its performance in reading tasks requiring commonsense reasoning or deep contextual integration. Future research should evaluate model robustness on larger, more diverse corpora and explore strategies such as noise injection and adversarial training to enhance generalization.
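As a concrete illustration of the L2 regularization mentioned above, the snippet below adds a weight-decay penalty to a cross-entropy classification loss. This is a minimal NumPy sketch with an illustrative penalty strength `lam`, not the training code used in the study.

```python
import numpy as np

def regularized_cross_entropy(logits, labels, weights, lam=1e-4):
    """Cross-entropy loss plus an L2 penalty on the weight matrices,
    discouraging large weights and thus overfitting on small datasets."""
    # Numerically stable softmax over the class dimension.
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    ce = -np.log(probs[np.arange(len(labels)), labels]).mean()
    l2 = sum((w ** 2).sum() for w in weights)  # squared Frobenius norms
    return ce + lam * l2

# Toy example: two samples, two classes, one weight matrix.
logits = np.array([[2.0, 0.5], [0.1, 1.5]])
labels = np.array([0, 1])
W = [np.ones((3, 2))]
loss = regularized_cross_entropy(logits, labels, W, lam=0.01)
```

Dropout works complementarily at training time by randomly zeroing activations, so the two techniques are typically combined, as they were in this study.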

Conclusion

Research contribution

This study makes contributions at the theoretical, methodological, and practical levels: (1) Theoretical Contribution: The study successfully applies the Text CNN model to high school English reading instruction, demonstrating its effectiveness in text difficulty classification, semantic feature extraction, and personalized content recommendation. This enriches the theoretical framework of deep learning applications in education. (2) Methodological Contribution: An end-to-end instructional adaptation framework is proposed, integrating multi-scale convolution, attention-based pooling, and difficulty mapping mechanisms. This framework enhances model classification performance and interpretability, providing a reusable methodological reference for future research. (3) Practical Contribution: A teaching support tool with real-time feedback and adaptive recommendation capabilities was developed, significantly improving students’ reading comprehension scores and reducing teachers’ grading workload. Experiments indicate particularly strong performance on science & technology and education texts, with accuracy exceeding 93%.
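The multi-scale convolution and attention-based pooling named in the methodological contribution can be sketched as follows. This is a toy NumPy illustration with one filter per kernel width and made-up dimensions, not the study’s actual configuration; the per-position attention weights `attn` are the kind of scores that support the interpretability claims above, since they indicate which text positions drive the pooled feature.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d_feature(E, W):
    """Slide one kernel W (k, d) over embeddings E (T, d); ReLU feature map."""
    k = W.shape[0]
    out = np.array([(E[i:i + k] * W).sum() for i in range(E.shape[0] - k + 1)])
    return np.maximum(out, 0.0)

def attention_pool(h, v):
    """Attention-based pooling: softmax-weighted sum of positional features."""
    scores = h * v                       # per-position relevance scores
    a = np.exp(scores - scores.max())
    a /= a.sum()
    return (a * h).sum(), a              # pooled feature and attention weights

T, d = 12, 8                             # toy sequence length and embedding size
E = rng.normal(size=(T, d))              # stand-in for word embeddings
pooled = []
for k in (3, 4, 5):                      # multi-scale kernel widths
    W = rng.normal(size=(k, d))
    h = conv1d_feature(E, W)
    p, attn = attention_pool(h, v=1.0)
    pooled.append(p)

features = np.array(pooled)              # one pooled feature per kernel scale
```

In a full model, `features` would feed a classifier head that maps texts to difficulty levels, and the attention weights could be visualized over the source text for teachers.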

The significance of this study lies primarily in advancing the deep integration of AI technologies with English language teaching and providing new perspectives and practical pathways for the sustainable development of intelligent education. By introducing a Text CNN model, the research establishes an intelligent support framework for secondary school English reading instruction, enabling a dynamic closed-loop process in text analysis, content recommendation, and learning feedback. This demonstrates both the interpretability and practical feasibility of AI applications in educational settings. The framework not only improves the alignment of reading materials with learners’ needs and enhances the precision of learning resource delivery but also promotes a data-driven shift in instructional decision-making, making classroom teaching more scientific and personalized. The results indicate that the judicious application of AI models can effectively reduce teachers’ repetitive workload. This reduction frees up time for differentiated instruction and the development of higher-order thinking skills. Consequently, it facilitates a shift from a traditional “knowledge delivery” model to a more student-centered “learning guidance” model. Importantly, this study provides empirical evidence for education policymakers and intelligent learning system designers, offering valuable insights for the long-term development of AI-empowered basic education.

Future works and research limitations

The limitations of this study are as follows. First, the dataset used in the experiments was relatively small, with samples drawn primarily from a single school, which may limit the generalizability of the conclusions. Second, the Text CNN model focuses mainly on semantic features at the text level and does not fully account for affective, pragmatic, or cross-contextual factors in reading comprehension. Finally, the experimental period was relatively short, preventing a long-term evaluation of the model’s impact on learning continuity and knowledge transfer.

To address the study’s limitations, future research could proceed in several directions: (1) Expanding Data Scale and Diversity: Introduce reading materials from multiple regions and disciplines to construct a more representative benchmark corpus, enabling systematic evaluation of model adaptability across educational contexts. (2) Exploring Multimodal Learning: Incorporate multimodal information, such as audio and images, to create “text–audio” aligned corpora. This can improve the model’s understanding of spoken and unstructured texts, supporting complex reading tasks in real classroom settings. (3) Graph-based Modeling for Recommendation Optimization: Inspired by models such as Fpa-GCN, relationships among students, learning materials, and knowledge elements can be explicitly modeled as heterogeneous graphs. Graph neural networks can then capture semantic dependencies in complex interactions, improving the accuracy and interpretability of personalized recommendations. (4) Developing Lightweight and Explainable AI Tools: Design tools for real-world educational scenarios that are lightweight and interpretable, providing teachers with transparent and controllable model decisions without adding deployment costs. This approach can further facilitate human–AI collaborative teaching.