Introduction

Nowadays, individual decisions are increasingly influenced by social media and by reviews of services and products on various online platforms. Businesses globally have recognized the importance of analyzing customer sentiment on these virtual platforms, beyond traditional feedback systems. If companies fail to consider social feedback, particularly consumer sentiment expressed in reviews, they risk becoming irrelevant in today’s digital landscape. Thus, sentiment analysis of customer review data is essential for understanding customers’ thought processes, and this understanding can help businesses adapt their products and services to meet customer expectations. Many researchers have applied both DL and ML techniques to sentiment analysis. For instance, Wang et al.1 introduced a Convolutional Recurrent Neural Network that combines LSTM networks with CNN to enhance text categorization accuracy. Said et al.2 employed Bi-LSTM-CRF and AB-LSTM-PC models to analyze Arabic hotel reviews, aiming to strengthen sentiment detection by incorporating external lexicons and Gated Recurrent Units (GRU). Sulaiman et al.3 emphasized the need for specialized techniques to analyze sentiment on Twitter, proposing a CNN model that integrates user behavior to improve classification on SemEval-2016 datasets. Jelodar et al.4 achieved an accuracy of 81.15% in analyzing public sentiment related to COVID-19 using LSTM and Natural Language Processing (NLP). Basiri et al.5 developed the ABCDM model, incorporating CNN layers and attention mechanisms, demonstrating solid results on tweets and reviews with potential for broader applications. It is clear from this literature review that artificial intelligence-enabled methods can significantly improve sentiment identification.
However, there is a need to develop a hybrid methodology that combines the strengths of word embeddings, effectively processes time series data, and achieves accurate sentiment analysis within the given text corpus.

Despite the success of various state-of-the-art deep learning models for sentiment analysis, most current methods fail to accurately capture the subtle sentiments in informal, short texts, especially under domain shift, context confusion, and multi-aspect sentiment expressions. Most models concentrate on either local characteristics using CNNs or temporal information using RNNs, without an efficient fusion of the two. Furthermore, their interpretability is poor, making them unsuitable for real-world, real-time applications, where text streams must be processed and analyzed simultaneously. To address these limitations, this study proposes SentiNet, a unified deep learning architecture that combines CNN-based local pattern extraction, BiLSTM-based temporal modeling, and an attention-mediated fusion process, designed to bridge the gaps in contextual sentiment comprehension, explainability, and cross-domain transfer. To meet these challenges, the objectives of this research are as follows:

  1. To develop an end-to-end sentiment classification system that integrates local n-gram features with global contextual interactions using convolutional and bidirectional recurrent structures;

  2. To add a semantic attention layer that dynamically weights contextual and sentiment-bearing tokens for better interpretability, and to use a suitable non-linear function for prediction, thereby improving overall performance;

  3. To evaluate the model on multiple benchmark datasets and demonstrate its generalization and robustness against existing approaches; and

  4. To perform ablation and qualitative analyses to verify the contribution of each module and to generate sentiment predictions transparently.

The contributions of this paper are as follows. We propose a DL-based framework for analyzing sentiment in customer review data. Specifically, we present an innovative DL architecture called SentiNet, designed to classify sentiments in customer reviews efficiently. Additionally, we present the Efficient Learning-Based Sentiment Analyzer (LBSA) algorithm, which utilizes new vectorization techniques, embeddings, and the unique architecture of the SentiNet model. Our empirical study, using a benchmark dataset of customer reviews about restaurants and food items, demonstrates that the SentiNet model achieves an accuracy of 98.68%, outperforming several other deep learning models currently in use. This framework can be integrated into business applications to analyze sentiment and enhance service quality.

The remainder of the paper is structured as follows: The literature on the different methods that have applied DL and ML for sentiment analysis is discussed in section “Related work”. The proposed technique and algorithms, aimed at enhancing sentiment analysis performance, together with the framework, are presented in section “Proposed methodology”. In section “Experimental setup”, we describe the experimental setting required for our study. Section “Experimental results” presents the empirical results and compares them with those of previous models. Section “Discussion” discusses our work. In the last section, we summarize our work and list some open problems.

Related work

Advances in Deep Learning (DL) technology have revolutionized sentiment analysis by improving precision, scalability, and natural language understanding. Architectures include classical BiLSTM, CNN, and their hybrids, which combine attention, embeddings, and transformers. Initially, efforts focused on replacing human-engineered features with learned features. Wang et al.1 proposed a CNN–LSTM hybrid for text classification that extracts temporal–spatial features. Al-Smadi et al.2 applied Bi-LSTM-CRF (Bidirectional Long Short-Term Memory with Conditional Random Fields) and AB-LSTM-PC models to Arabic hotel reviews to enhance aspect-level sentiment extraction. User behaviour has also been considered: Sulaiman et al.3 included user behaviour in a CNN for Twitter classification, while Alazab et al.16 combined user features with a CNN-LSTM for sentiment analysis. Jelodar et al.4 successfully applied LSTM to COVID-19 discussions with good performance.

Promising results have been achieved using hybrid CNN–RNN architectures with attention. Basiri et al.5 proposed ABCDM, a bidirectional CNN–RNN with attention that is competitive on both tweet and review benchmarks. Xiaolin and Jianqiang6 applied CNNs with word embeddings and n-grams to Twitter sentiment. Xu et al.7 combined weighted word vectors with BiLSTM to effectively handle casual social media text. Do et al.8 compared about 40 aspect-level sentiment models spanning CNN, LSTM, and GRU variants. Bahad et al.9 proposed BiLSTM as a strong baseline for fake news detection, and Yang et al.10 explored the power of co-attention for advanced aspect-based sentiment analysis.

This gave rise to domain-adapted and multi-modal optimized architectures (MMOA). Jin et al.11 used empirical mode decomposition (EMD) followed by LSTM to predict stock prices, achieving improvements in both latency and accuracy. Usama et al.12 constructed a CNN-LSTM for social media sentiment, and Rehman et al.13 combined CNN, RNN, attention, and long-range dependency modeling. RCNN + Aug (RNN + CNN with group-wise augmentation) was presented by Onan14 for good generalization, while a meta-review by Ligthart et al.15 noted domain specificity and lexical diversity as particularly challenging.

Several techniques have been explored to optimize embedding representations. Muhammad et al.16 used LSTM with Word2Vec (Skip-Gram), achieving 85.96% accuracy on hotel reviews. Gupta et al.17 proposed PERCY, which contributed interpretability to sentiment scoring. Du et al.18 introduced TABFSA, a new combination of symbolic/sub-symbolic methods and transformer fine-tuning. For Unicode text such as Bengali, Haque et al.19 applied CLSTM, reporting 85.8% accuracy and F1 = 0.86. Fernandez et al.20 showed that a physics-informed RNN can perform uncertainty-aware prediction in PHM and infrastructure monitoring.

Sentiment analysis has also been extended to e-learning and IoT-based learning. Mao et al.21 proposed IB-BiLSTM for animated educational content (accuracy > 93%), and Kanwal et al.22 successfully adapted it for educational IoT. Berrimi et al.23 proposed a BiGRU-BiLSTM hybrid model with attention on Arabic datasets, achieving an accuracy of over 98%. Park et al.24 integrated LSTM/GRU with Twitter sentiment to enhance cryptocurrency trading recommendations. For contextual embeddings, Tan et al.25 and Ruixiang Zhang et al.26 used ensemble hybrids of RoBERTa, LSTM, BiLSTM, and GRU to increase robustness with and without data augmentation.

Lexicon-aware and word count-aware models have played a role as well. Song et al.27 incorporated sentiment lexicons into LSTM layers for aspect-level tasks. Van Houdt et al.28 conducted a review of LSTM and its usage for sequence modeling. Ma et al.29 proposed Sentic LSTM, which integrates explicit knowledge with attention targeting sentiment, and Huang et al.30 enhanced sentence- and document-level LSTM classification with enriched representations.

Adaptations specific to language and domain still matter. Alayba et al.31 applied CNN-LSTM with user-defined n-grams to Arabic healthcare datasets. Sohangir et al.32 compared CNN, LSTM, and doc2vec for financial sentiment (StockTwits). Katic et al.33 found LSTM to outperform bag-of-words on Amazon reviews. Along this line, Deng et al.34 proposed SSA-LSTM with sparse self-attention to build lexicons, and Heikal et al.35 employed CNN-LSTM ensembles on Arabic tweets; the presented methodology may facilitate the adoption of variants to be explored in future work.

In public review mining, Gandhi et al.36 applied CNN-LSTM to multi-domain Twitter data (accuracy ≈ 88%). Shobana and Murali37 used skip-gram embeddings with LSTM and APSO for classification, and Saha and Senapati38 indicated DNN and LSTM-RNN as processors of unstructured, granular sentiment data. Using LSTM, Qaisar39 reported an accuracy of 89.9% on IMDb, suggesting that LSTM applies to large-scale review mining. Afidah et al.40 used LSTM, CNN, and Word2Vec on Indonesian tourism reviews with 97.17% accuracy, which is suitable for a recommendation system.

Adjacent works have investigated deep learning and bio-inspired algorithms in related domains such as intrusion detection and optimisation. Alzubi et al. and other recent works41,42,43,44,45,46 propose swarm intelligence, deep learning, and optimization-based intrusion detection frameworks for IoT and IoMT environments. Techniques include salp swarm–ANN, blended deep learning, crow search, Fréchet–Dirichlet, and quantum mayfly–LSTM approaches, significantly improving malware detection, edge/fog security, and constrained optimization performance.

Researchers are now directing efforts toward better hybrid architectures to improve sentiment analysis performance. An approach proposed by Sherkatghanad et al.48 combined CNN with a deep BiLSTM model, applying self-attention to the BiLSTM output to capture local and sequential dependencies while emphasizing contextually essential features. When evaluated over several benchmark datasets, the model reached higher accuracy and F1-scores than even conventional deep learning baselines. Similarly, Krasitskii et al.49 presented multilingual sentiment analysis using a fine-tuned XLM-R and a hybrid extractive–abstractive summarization framework; the results showed strong accuracy across 10 languages, with summarization applied before classification to reduce computational cost.

Table 1 provides an overview of popular deep learning-based sentiment analysis works, including a summary of the models under consideration, the datasets on which they are tested, and the reported results. In summary, the literature demonstrates that sentiment analysis has been enhanced through architectural advancements, including hybrid CNN-RNN models, attention, lexicon-aware embeddings, and transformer-based integration. Nonetheless, many models remain difficult to interpret, fail to generalize across domains, or cannot handle multi-aspect alignment. These gaps inspire the design of SentiNet: a topic-aware, attention-guided, hyperparameter-tuned BiLSTM model designed to improve sentiment classification on real-world review data.

Table 1 Summary of key deep learning-based sentiment analysis studies: methods, datasets, and reported performance.

Proposed methodology

This section presents our methodology, including a novel DL architecture for efficient sentiment categorization in online reviews. It addresses the problem and provides an overview of the proposed framework, including the architecture of the novel deep learning model, SentiNet, its underlying layers, the proposed algorithm, details about the dataset, and the assessment process.

Problem definition

Given a set of customer reviews in the form of a textual corpus, the problem considered is to develop a DL-based architecture optimized for effective sentiment classification that advances the state of the art.

Overview of our framework

Our main objective is to develop SentiNet, a DL-based sentiment classifier that categorizes restaurant reviews as favorable or unfavorable. The proposed framework is shown in abstract form in Fig. 1. The proposed system comprises three main components: a text-to-vector representation module, a model architecture module, and a sentiment prediction module.

Fig. 1
figure 1

Overview of the proposed framework for efficient sentiment classification.

The given textual corpora are subjected to several procedures for efficient sentiment classification. The given text is converted to a vector representation with the novel approach described in section “Vectorization method”. The proposed approach is founded on supervised learning, which exploits labeled corpora for sentiment analysis. The vectorized data is fed to the proposed novel deep learning architecture, SentiNet, whose mechanisms for sentiment classification include embeddings and feature extraction. The following subsections provide more details about the mechanisms involved in the proposed framework.

The system starts by taking raw text input, such as tweets or product reviews. It processes the text according to a pre-processing pipeline that involves lowercasing, punctuation removal, stop word removal, and lemmatization to standardize the text format. These cleaned tokens are then converted to dense semantic vectors using pre-trained GloVe embeddings of dimension 300. These embeddings preserve the semantic relationships between words, enabling effective contextual learning.
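As an illustration, the pre-processing steps above can be sketched as follows. The stop-word list and lemma map here are toy stand-ins (the paper does not name the NLP library used); a real pipeline would rely on a full resource such as NLTK or spaCy.

```python
import re

# Illustrative stop-word list and lemma map; these are assumptions for the
# sketch, not the resources used in the paper.
STOP_WORDS = {"the", "a", "an", "is", "was", "were", "and", "of", "to"}
LEMMAS = {"dishes": "dish", "served": "serve"}

def preprocess(text: str) -> list[str]:
    """Lowercase, strip punctuation, drop stop words, then lemmatize."""
    text = text.lower()
    text = re.sub(r"[^\w\s]", " ", text)          # punctuation removal
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return [LEMMAS.get(t, t) for t in tokens]     # toy lemmatization

print(preprocess("The dishes were served quickly, and the taste is amazing!"))
```

The cleaned tokens produced this way would then be looked up in the 300-dimensional GloVe table described above.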

Several parallel 1D convolutional layers with different kernel sizes (2, 3, and 4) capture local phrase structures and short-range dependencies. These parallel layers encode different n-gram patterns that carry critical sentiment cues. The generated feature maps are concatenated, and the resulting sequence is fed to a Bidirectional Long Short-Term Memory (BiLSTM) network, which captures long-range dependencies in both forward and backward directions. This provides context awareness for individual words within a sentence.
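The parallel n-gram branches can be sketched in plain NumPy. The filter count (64) and random weights below are illustrative assumptions; the sequence length (150) and embedding dimension (300) follow the setup described elsewhere in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, emb_dim, n_filters = 150, 300, 64   # 64 filters is an assumption

def conv1d_same(x, w):
    """1D convolution over a (seq_len, emb_dim) input with 'same' zero
    padding; w has shape (k, emb_dim, n_filters)."""
    k, _, f = w.shape
    left, right = (k - 1) // 2, k // 2
    xp = np.pad(x, ((left, right), (0, 0)))
    out = np.zeros((x.shape[0], f))
    for t in range(x.shape[0]):
        window = xp[t:t + k]                      # (k, emb_dim) n-gram window
        out[t] = np.einsum("ke,kef->f", window, w)
    return out

x = rng.normal(size=(seq_len, emb_dim))           # embedded review
branches = [conv1d_same(x, rng.normal(size=(k, emb_dim, n_filters)))
            for k in (2, 3, 4)]                   # parallel n-gram branches
fused = np.concatenate(branches, axis=-1)         # fed onward to the BiLSTM
print(fused.shape)                                # (150, 192)
```

Each branch keeps the time axis intact, so the concatenated map still aligns token-by-token with the input sequence.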

After the sequence has been encoded, a channel-wise attention mechanism emphasises sentiment-related features by assigning larger weights to informative tokens. A dense layer with ReLU activation follows this attention-based representation. A dropout rate of 0.3 is employed to prevent overfitting and improve the model’s generalization ability.
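The attention step admits several parameterizations, which the text does not pin down; since the described effect is weighting informative tokens, a minimal token-level additive attention sketch (with an assumed learnable query vector w) is:

```python
import numpy as np

rng = np.random.default_rng(1)
seq_len, hidden = 150, 256            # BiLSTM output: (seq_len, 2*h)
H = rng.normal(size=(seq_len, hidden))
w = rng.normal(size=hidden)           # assumed learnable attention query

scores = np.tanh(H) @ w               # one relevance score per token
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()                  # softmax -> attention weights
context = alpha @ H                   # weighted sum over tokens -> (hidden,)

print(context.shape)                  # (256,)
```

The context vector is what would then pass through the ReLU dense layer and dropout described above.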

Finally, the output of the dense layer is fed into a softmax classifier, which returns a probability for each sentiment class. The sentiment label will be predicted as the class with the highest probability, achieving the goal of sentiment classification in an interpretable and efficient way.

Vectorization method

The ability of any deep learning algorithm to perform well depends on the features used during training. Because deep learning algorithms cannot learn from raw reviews, we must build a numerical mapping of the reviews \(R = \{r_{1}, r_{2}, ..., r_{m}\}\). A vocabulary V of k unique words, \(V = \{u_{1}, u_{2}, ..., u_{k}\}\), is established to obtain this mapping. Each word \(w_{i}\) in a review \(r_{j} = [w_{1}, w_{2}, ..., w_{l^{\prime}}]\) is substituted with its index value i in V. Thus, from a review \(r_{j}\), we obtain the converted vector sequence \(s_{j} = [i_{1}, i_{2}, ..., i_{l^{\prime}}]\). At this point, the sequences \(S = \{s_{1}, s_{2}, ..., s_{m}\}\) have variable lengths and are unsuitable for training and feature extraction. A fixed-length set \(S^{\prime} = \{s_{1}^{\prime}, s_{2}^{\prime}, ..., s_{m}^{\prime}\}\) is therefore created by applying the pad-sequence algorithm to S. Every sequence \(s_{k}^{\prime}\) in \(S^{\prime}\) is a vector with a fixed length l. The best sequence length l is determined by examining the distribution of review lengths to minimize computing costs. The majority of the reviews are shorter than 150 words, so l = 150 is selected as the review length that preserves the necessary information at minimal processing cost. Long reviews are truncated by discarding the extra words, and short reviews are padded with zeros to maintain the length l.
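A minimal sketch of this index-mapping and padding procedure, assuming whitespace tokenization and reserving index 0 for padding:

```python
def vectorize(reviews, l=150):
    """Map words to vocabulary indices and pad/truncate each review to
    length l, as described above (index 0 is reserved for padding)."""
    vocab = {}
    for review in reviews:
        for w in review.split():
            vocab.setdefault(w, len(vocab) + 1)       # 1-based word indices
    seqs = []
    for review in reviews:
        s = [vocab[w] for w in review.split()][:l]    # truncate long reviews
        seqs.append(s + [0] * (l - len(s)))           # zero-pad short ones
    return seqs, vocab

seqs, vocab = vectorize(["great food great service", "slow service"], l=5)
print(seqs)   # [[1, 2, 1, 3, 0], [4, 3, 0, 0, 0]]
```

With l = 150, every review becomes a fixed-length integer vector ready for the embedding layer.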

Deep learning model

The model architecture comprises three main building components: the embedding, BiLSTM, and classification layers. In the embedding layer, we employed the word2vec15 embedding approach, which maps the integer indices of textual data into dense vectors for feature extraction. Using the Keras embedding layer, we trained word2vec on the whole BRRC. The embedding layer requires three inputs \(\:({V}_{k},d,l)\): \(\:{V}_{k}\) (vocabulary size), d (embedding dimension), and l (review length). The length of a word’s vector representation is determined by a hyper-parameter called the embedding dimension (d). The embedding layer transforms a review into a 2D vector of dimension l×d; we therefore produce a feature vector of dimension F = R × l × d for the R reviews (Fig. 2).

Fig. 2
figure 2

Architecture of the proposed deep learning model SentiNet

Regarding the LSTM layer: the LSTM network is a popular variant of the RNN designed to solve the exploding and vanishing gradient problems. In particular, it has been demonstrated28 that LSTM helps capture long-term dependencies in text. To preserve contextual information from both the previous and following words, we used bidirectional LSTM (BiLSTM)8,12. Each LSTM with hidden units of size h receives the embedding layer’s word embeddings. For BiLSTM, we obtain a 2h-dimensional vector representation by concatenating the outputs of the two LSTMs. An LSTM processes an input sequence of embedding vectors as pairs \(({\text{e}}^{<\text{t}>},{\text{y}}^{<\text{t}>})\). It maintains a hidden vector \(\:{\text{h}}^{<\text{t}>}\) and a remember (memory) vector \(\:{\text{m}}^{<\text{t}>}\) for every pair and time step t. These vectors control the state updates and outputs, with the output \(\:{\text{y}}^{<\text{t}>}\) depending on the input and the previous states. Equations 1–6 describe the processing carried out at time step t.

$$\:{\text{u}}_{\text{g}}\:=\:\sigma\:({W}_{u}\:*\:[{h}^{<t-1>},\:{e}^{<t>}]\:+\:{I}_{u})$$
(1)
$$\:{f}_{g}\:=\:\sigma\:({W}_{f}\:*\:[{h}^{<t-1>},\:{e}^{<t>}]\:+\:{I}_{f})$$
(2)
$$\:{o}_{g}\:=\:\sigma\:({W}_{o}\:*\:[{h}^{<t-1>},\:{e}^{<t>}]\:+\:{I}_{o})$$
(3)
$$\:{c}_{g}\:=\:tanh({W}_{c}\:*\:[{h}^{<t-1>},\:{e}^{<t>}]\:+\:{I}_{c})$$
(4)
$$m^{{\left\langle t \right\rangle }} = f_{g} \odot m^{{\left\langle {t - 1} \right\rangle }} + u_{g} \odot c_{g}$$
(5)
$$h^{{\left\langle t \right\rangle }} = tanh\left( {o_{g} \odot m^{{\left\langle t \right\rangle }} } \right)$$
(6)

In this case, σ stands for the sigmoid activation function, and \(\:{\text{W}}_{\text{u}}\), \(\:{\text{W}}_{\text{f}}\), \(\:{\text{W}}_{\text{o}}\), and \(\:{\text{W}}_{\text{c}}\) are the weight and projection matrices of the recurrent units. By storing information in the memory vector \(\:{\text{m}}^{<\text{t}>}\) for as long as necessary, the computed gates \(\:{\text{u}}_{\text{g}}\), \(\:{\text{f}}_{\text{g}}\), \(\:{\text{o}}_{\text{g}}\), and \(\:{\text{c}}_{\text{g}}\) of the LSTM cell play a crucial role in extracting essential properties from the input. The candidate vector \(\:{\text{c}}_{\text{g}}\), gated by the input gate \(\:{\text{u}}_{\text{g}}\), writes updated information into the new remember vector \(\:{\text{m}}^{<\text{t}>}\), whilst the forget gate \(\:{\text{f}}_{\text{g}}\) determines how much information should be discarded from the previous remember vector \(\:{\text{m}}^{<\text{t}-1>}\). Lastly, the output gate \(\:{\text{o}}_{\text{g}}\) controls the information transferred from the new remember vector \(\:{\text{m}}^{<\text{t}>}\) to the hidden vector \(\:{\text{h}}^{<\text{t}>}\) via \(\left( {{\text{o}}_{{\text{g}}} \odot {\text{m}}^{{\left\langle {\text{t}} \right\rangle }} } \right)\).
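A single LSTM time step can be sketched as below. Following the standard formulation, the gates are conditioned on the concatenation of the previous hidden state and the current input embedding; the small random weights and the dimensions are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(e_t, h_prev, m_prev, W, I):
    """One LSTM time step. W maps a gate name to its weight matrix and I
    maps it to its bias; gates see [h_{t-1}, e_t] (standard formulation)."""
    z = np.concatenate([h_prev, e_t])
    u_g = sigmoid(W["u"] @ z + I["u"])       # update (input) gate
    f_g = sigmoid(W["f"] @ z + I["f"])       # forget gate
    o_g = sigmoid(W["o"] @ z + I["o"])       # output gate
    c_g = np.tanh(W["c"] @ z + I["c"])       # candidate values
    m_t = f_g * m_prev + u_g * c_g           # new remember vector
    h_t = np.tanh(o_g * m_t)                 # new hidden vector
    return h_t, m_t

rng = np.random.default_rng(2)
d, h = 300, 64                               # embedding dim, hidden size
W = {g: rng.normal(scale=0.1, size=(h, h + d)) for g in "ufoc"}
I = {g: np.zeros(h) for g in "ufoc"}
h_t, m_t = lstm_step(rng.normal(size=d), np.zeros(h), np.zeros(h), W, I)
print(h_t.shape, m_t.shape)                  # (64,) (64,)
```

A BiLSTM would run this recurrence once left-to-right and once right-to-left, concatenating the two hidden states per token.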

The next layer is the classification layer. In this stage, a sigmoid layer follows two dense layers. The BiLSTM layer converts the input of size (l×d) for the \(\:{r}^{th}\) input sequence into an output vector of size (l×2h). This vector passes through the first dense layer with the ReLU activation function, which creates a new vector of shape \(\:(l\times\:{dl}_{1})\). A dropout layer with a 26% dropout ratio was inserted between the two dense layers to prevent over-fitting9; at each iteration, the remaining 74% of neurons are randomly retained to pass their output from the first dense layer to the second, generating a new vector of size \(\:\left(l\times\:{dl}_{2}\right)\). Here, \(\:{dl}_{1}\) and \(\:{dl}_{2}\) denote the numbers of hidden neurons in the first and second dense layers, respectively. The output vector of the second dense layer was flattened, resulting in a one-dimensional vector of size \(\:{f}_{v}\), which finally enters a sigmoid9 layer.

$$\:\sigma\:\left({f}_{v}\right)\:=\frac{1}{1\:+\:{e}^{-{f}_{v}}}\:$$
(7)
$$\:{y}_{pred}\:=\left\{\begin{array}{c}1\:\left(positive\right)\:if\:\sigma\:\left({f}_{v}\right)>\:Threshold\\\:0\:\left(negative\right)\:if\:\sigma\:\left({f}_{v}\right)\:<\:Threshold\end{array}\right.\:\:$$
(8)

The cross-entropy loss function9, which we employed to train the model, is given in Eq. 9. Here, the subscript r denotes the \(\:{r}^{th}\) input review, and \(\:{t}_{r}\) denotes the actual sentiment class of the \(\:{r}^{th}\) review.

$$Loss\left( {y,y_{{pred}} } \right) = ~ - \frac{1}{R}\sum\limits_{{r = 1}}^{R} {\left( {t_{r} ~log\left( {y_{{pred}} } \right)} \right)} ~$$
(9)
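A direct implementation of Eq. 9 follows. Note that, as printed, the loss only penalizes the positive class; the usual binary cross-entropy additionally includes a (1 − t_r) log(1 − y_pred) term. The clipping constant is an implementation safeguard, not part of the paper.

```python
import numpy as np

def ce_loss(t, y_pred, eps=1e-12):
    """Cross-entropy of Eq. 9, averaged over the R reviews; t holds the true
    classes and y_pred the predicted positive-class probabilities."""
    y_pred = np.clip(np.asarray(y_pred, float), eps, 1 - eps)  # guard log(0)
    return -np.mean(np.asarray(t, float) * np.log(y_pred))

print(round(float(ce_loss([1, 1, 0], [0.9, 0.8, 0.1])), 4))   # 0.1095
```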

The text-to-vector representation module receives an unlabeled review for categorization, which is then processed via the tokenization and padding stages. This converted vector is then sent into the trained sentiment classifier model, which uses it to forecast the review’s sentiment.

Hyperparameter optimization

Model hyperparameters are the parameters that directly control a model’s training process. These variables define the network’s design, including its number of layers and hidden units, and its training procedure, including the learning rate and batch size. The two dense layers contain 64 and 14 hidden units, respectively. Table 2 lists the hyperparameter values for the proposed model, including the batch size, learning rate, optimizer, dropout rate, embedding dimension, and epoch count. For each hyperparameter except the embedding dimension, we randomly pick a starting value and then search the hyperparameter space for its optimal value. The proposed model is trained with these optimal hyperparameter configurations.

Table 2 Hyperparameter settings.

Proposed algorithm

We propose a novel DL architecture, SentiNet, to efficiently classify sentiments in customer reviews, together with an Efficient Learning-Based Sentiment Analyzer (LBSA) algorithm that exploits novel vectorization, embeddings, and the architecture of the SentiNet model.

Algorithm 1
figure a

Efficient Learning-Based Sentiment Analyzer (LBSA)

Algorithm 1 processes the Zomato Restaurants dataset (D) and produces the sentiment classification results R and performance statistics P. The algorithm begins by splitting the dataset D into two parts, T1 and T2, for training and testing, respectively, using the SplitCorpora function. The Vectorization function is then applied to both parts to create vector representations V1 and V2, respectively. The algorithm proceeds to train the SentiNet model, which involves configuring and compiling the model (m) and then training it with the vectorized data V1, resulting in a trained model m’. This trained model is persisted for future use. For sentiment classification, the trained model m’ is loaded, and the ClassifySentiments function processes the vectorized data V2, producing the sentiment classification results R. The model’s effectiveness is then assessed using the EvaluateSentiNet function, which compares the classification results R against the ground truth data. Finally, the algorithm displays both the sentiment classification results R and the performance statistics P.

The algorithm is depicted in a step-by-step process that includes dataset splitting, vectorization, model configuration, training, persistence, classification, evaluation, and results display. The process suggests a supervised learning approach, where the model is trained on labeled data and then evaluated for accuracy. The output includes the sentiment classification results and the model’s performance metrics, which help understand public opinion regarding restaurant reviews on platforms like Zomato.
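The LBSA flow can be sketched end to end with toy stand-ins. The function names mirror the algorithm's steps, but the bag-of-words vectorizer and the threshold "classifier" below are placeholders for the paper's vectorization method and the trained SentiNet model m’.

```python
import random

def split_corpora(dataset, train_frac=0.7):
    """Stand-in for SplitCorpora: shuffle D and split into T1/T2."""
    data = list(dataset)
    random.Random(0).shuffle(data)
    cut = int(len(data) * train_frac)
    return data[:cut], data[cut:]

def bow_vectorize(texts, vocab):
    """Stand-in for Vectorization: simple bag-of-words counts."""
    return [[text.split().count(w) for w in vocab] for text in texts]

# Toy labeled corpus D: (review, label) with 1 = positive, 0 = negative.
D = [("good food", 1), ("bad service", 0), ("good service", 1),
     ("bad food", 0), ("good", 1), ("bad", 0)]
T1, T2 = split_corpora(D)
vocab = ["good", "bad"]
V1 = bow_vectorize([t for t, _ in T1], vocab)   # would train SentiNet (m)
V2 = bow_vectorize([t for t, _ in T2], vocab)

# Stand-in for the trained model m' (ClassifySentiments):
classify = lambda v: 1 if v[0] >= v[1] else 0
R = [classify(v) for v in V2]                          # classification results
P = sum(r == y for r, (_, y) in zip(R, T2)) / len(T2)  # EvaluateSentiNet
print(R, P)
```

In the actual system, the model-training step between V1 and classification is of course the SentiNet fit-and-persist stage rather than a fixed rule.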

Before model training, all the textual data were processed in a standardized manner. This process involved converting to lowercase, removing stop words, stripping punctuation, and lemmatizing tokens to ensure they were normalized and retained their original semantic meaning. Vectorization of tokenized text was performed with pre-trained GloVe embeddings (300D) for a dense semantic representation. To balance model tuning and testing, the dataset was divided into three portions: a 70% training set, a 15% validation set, and a 15% test set. The hyperparameters were tuned using a grid search based on the validation set. Central hyperparameter values were learning rate = 0.001, batch size = 64, dropout rate = 0.3, and 128 hidden units in the BiLSTM layer. The model was trained for 10 epochs, with early stopping implemented to prevent overfitting. These settings were chosen based on empirical convergence and consistency across different datasets.
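The grid search over the validation set can be sketched with itertools. The candidate values echo those reported above, while the validate stub stands in for an actual train-and-evaluate run and its scoring rule is a toy.

```python
from itertools import product

def validate(lr, batch, dropout, hidden):
    """Hypothetical stub: would train with this configuration and return
    validation accuracy; here batch and hidden are ignored by the toy rule."""
    return 1.0 - abs(lr - 0.001) * 100 - abs(dropout - 0.3)

grid = {"lr": [0.01, 0.001], "batch": [32, 64],
        "dropout": [0.2, 0.3], "hidden": [64, 128]}
best = max(product(*grid.values()), key=lambda cfg: validate(*cfg))
print(dict(zip(grid, best)))
```

Each candidate configuration is scored on the held-out validation split, and the best-scoring tuple is kept for the final training run.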

Architectural differentiation from BiLSTM

Although SentiNet employs a BiLSTM as the fundamental sequence encoder, it distinguishes itself from classic BiLSTM models in several aspects. First, it incorporates a context-aware semantic attention layer for selecting sentiment-related word representations with enhanced accuracy and interpretability. Second, SentiNet incorporates latent topic vectors (generated through topic modeling) in the sentiment classification pipeline, and a multi-view learning stage achieves co-training of sentiment with dominant aspects such as taste, service, or cost. Third, it proposes a joint-loss function that combines cross-entropy and topic-sentiment alignment regularizations to facilitate the model’s alignment of sentiments with corresponding thematic parts. Finally, we also utilize a domain adaptation module to mitigate domain shifts between various cuisine groups/restaurants, thereby enhancing model robustness. Such advances, not available in traditional BiLSTM models, jointly contribute to the high performance of SentiNet on all benchmarks, as shown in section “Experimental results”.

Dataset details

The dataset used for this empirical study is the Zomato Bangalore Restaurants dataset collected from47. This dataset comprises customer reviews of food items from approximately 12,000 restaurants in Bangalore. By analyzing customer reviews, it is possible to discover various trends in customer behavior and learn their sentiments towards different foods or service providers.

Dataset bias and generalizability considerations

The Zomato Bangalore Restaurants dataset contains useful signals, but it carries inherent biases that limit generalizability. First, the dataset captures urban restaurant dynamics, so models trained on it may not generalize to rural or Tier-2 city settings. Second, popular restaurant chains are overrepresented, which could skew the distribution of opinions toward more positive (or negative) reviews due to brand-loyalty effects. Third, the review-contributing users are likely to be tech-savvy, younger user groups, unlike the broader consumer population. Furthermore, the language used in reviews is predominantly colloquial English, which might make the model less useful in multilingual or formal review settings. We acknowledge these limitations and hope that future work will assess generalizability on datasets from other cities, languages, and age groups.

Performance evaluation methodology

Since we employed a learning-based approach, metrics derived from the confusion matrix, as shown in Fig. 3, are used to evaluate our methodology.

Fig. 3
figure 3

Confusion matrix.

Performance statistics are obtained by comparing the labels predicted by our technique with the ground truth via the confusion matrix. Equations 10 through 13 express the measures utilized in the performance evaluation; the recall \(r = TP/(TP+FN)\) appears in the F1-score.

$${\text{Precision~}}\left( {\text{p}} \right) = \frac{{TP}}{{TP + FP}}$$
(10)
$${\text{F1 - score }} = 2*\frac{{\left( {p*~r} \right)}}{{\left( {p + r} \right)}}$$
(11)
$$Loss\left( {y,y_{{pred}} } \right) = - \frac{1}{R}\sum\limits_{{r = 1}}^{R} {\left( {t_{r} \,log\left( {y_{{pred}} } \right)} \right)}$$
(12)
$${\text{Accuracy }} = \frac{{TP + TN}}{{TP + TN + FP + FN}}$$
(13)

The outcome of the performance evaluation metrics is a number between 0 and 1. These measures are often utilized in machine learning research.
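The four measures can be computed directly from the confusion-matrix counts; the counts below are made up for illustration.

```python
def metrics(tp, tn, fp, fn):
    """Precision, recall, F1, and accuracy from confusion-matrix counts
    (Eqs. 10, 11, and 13; recall r = TP / (TP + FN) feeds the F1-score)."""
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    f1 = 2 * p * r / (p + r)
    acc = (tp + tn) / (tp + tn + fp + fn)
    return p, r, f1, acc

p, r, f1, acc = metrics(tp=90, tn=85, fp=10, fn=15)   # illustrative counts
print(p, acc)   # 0.9 0.875
```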

Experimental setup

This experiment aims to determine the optimal hyperparameter combination and to evaluate the proposed model’s performance against alternative ML techniques. We conducted the tests on Google Colaboratory, a popular platform for developing DL applications. Pandas 1.0.5 was used for data preparation, and the deep learning model was built with the TensorFlow 2.2.0 and Keras 2.3.0 frameworks. The training, validation, and test sets comprise 72% (6072 reviews), 18% (1519 reviews), and 10% (844 reviews) of the total reviews, respectively. The model is trained on the training set, and the validation samples are used to adjust its hyperparameters (learning rate, batch size, etc.). Finally, the trained model was evaluated on the test set.

We followed a conventional 70:15:15 setup for training, validation, and testing to ensure a well-balanced and generalizable assessment of the model. The training set helps to learn a wide range of sentiment patterns, and the validation set allows robust hyperparameter tuning and early stopping. During training, the test set is never exposed, so the performance is evaluated fairly. We restricted the training to 10 epochs based on empirical observations of the validation loss and validation accuracy trends, where both curves converged and stabilized, and neither overfitted significantly thereafter. This method preserved performance robustness while also guaranteeing computational efficiency.

Basic text processing. The raw text was pre-processed with a standard pipeline to enhance data quality and minimize the impact of noisy, low-frequency terms on the classification task. All tokens were converted to lowercase, and HTML tags, special characters, stop words, and punctuation were stripped. Lemmatization was chosen over stemming because it preserves syntactic structure. The tokenized words were then passed to the embedding and topic modules for further processing.
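A minimal sketch of such a cleaning pipeline using only the standard library is shown below; the stop-word list is a small illustrative subset (the experiments assume a standard stop-word lexicon), and the lemmatization step is omitted here for brevity:

```python
import re

# Illustrative stop-word subset; a full lexicon is assumed in practice
STOP_WORDS = {"the", "a", "an", "is", "was", "and", "or", "in", "of", "to"}

def preprocess(text):
    """Lowercase, strip HTML tags, special characters, and punctuation,
    remove stop words, and return the remaining tokens."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)   # strip HTML tags
    text = re.sub(r"[^a-z\s]", " ", text)  # drop punctuation/digits/symbols
    return [t for t in text.split() if t not in STOP_WORDS]

tokens = preprocess("<p>The food WAS tasty and the service excellent!</p>")
# -> ['food', 'tasty', 'service', 'excellent']
```

The resulting token list is what would be forwarded to the embedding and topic modules.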

Experimental results

This section presents the findings of our empirical study and the proposed sentiment analysis paradigm. The proposed DL model, SentiNet, was evaluated with the benchmark data set and provided superior performance compared to state-of-the-art models. The experimental results in this section include exploratory data analysis, essential data analytics, and sentiment analysis results, comparing the proposed DL model’s performance with that of some existing models.

According to the data analytics presented in Fig. 4, it is evident that several renowned restaurants have numerous outlets in Bengaluru. The visualization of restaurants is presented in descending order by the number of outlets.

Fig. 4
figure 4

Restaurant counts in Bengaluru ranked by outlet numbers.

Figure 5 shows the proportion of restaurants in Bengaluru that accept online orders. About 58.9% of restaurants support online ordering, and 41.1% do not. This demonstrates a high level of adoption of digital food delivery services, driven by shifting consumer demand and the growing importance of online availability for operational efficiency and customer service in the restaurant sector.

Fig. 5
figure 5

Share of restaurants offering online ordering versus not.

The distribution of table booking services in restaurants of Bengaluru is shown in Fig. 6. The vast majority (87.5%) of the restaurants provided table reservation options, and 12.5% did not. This reflects a growing movement toward reservation-based dining, most likely spurred by consumers who are also seeking convenience, reduced waiting time, and a better method of planning a dining experience.

Fig. 6
figure 6

Availability of table booking services in Bengaluru.

Figure 7 shows the density distribution of restaurant ratings in Bengaluru. Most ratings fall in the 3.5 to 4.0 range, which is relatively good from a customer perspective. The distribution curve is close to normal, indicating that the customer experience is fairly consistent across establishments. The lower density below 3.0 indicates that few restaurants are poorly rated, which implies a competitive, quality-driven food service environment.

Fig. 7
figure 7

Density distribution of customer ratings for restaurants.

Figure 8 depicts restaurant ratings against the approximate cost for two persons, split by the availability of online ordering. Restaurants offering online ordering (green) are spread across all price brackets but concentrate in those below ₹1500, while restaurants that do not offer online ordering (red) are also represented throughout the spectrum. No substantial correlation is visible, indicating that cost and rating are largely independent.

Fig. 8
figure 8

Ratings versus cost for two persons, split by online ordering.

The distribution of restaurant types in Bengaluru is illustrated in Fig. 9. “Quick Bites” leads the way with nearly 19,500 outlets, followed by “Casual Dining” with over 10,000 outlets. Numerous other categories exist, including cafes, delivery services, and dessert houses. Specialized formats, such as pubs, fine dining, and food courts, occur less often, reflecting their niche positioning.

Fig. 9
figure 9

Counts of restaurants by type in Bengaluru.

Figure 10 shows the distribution of restaurants across central locations in Bengaluru. BTM leads with over 5200 restaurants, followed by HSR and Koramangala 5th Block. Prominent food hubs, such as JP Nagar, Whitefield, and Indiranagar, also feature significant restaurant densities. This highlights how commercial and residential localities have evolved into key culinary zones within the city.

Fig. 10
figure 10

Distribution of restaurants across major localities.

Figure 11 presents the restaurants in various locations and their density, visualized in terms of the number of restaurants or outlets.

Fig. 11
figure 11

Restaurant counts by location, showing density patterns.

The most frequently offered cuisines in Bengaluru restaurants are shown in Fig. 12. North Indian dominates, followed by the North Indian–Chinese combination and then South Indian. Biryani, fast food, desserts, and café fare are other mainstream options. This diversity reflects the city’s cosmopolitan culinary culture, with options to satisfy both purists and fusion fans.

Fig. 12
figure 12

Most frequently offered cuisines in Bengaluru.

The prevalence of restaurants in Bengaluru serving North Indian cuisine is depicted in Fig. 13, which illustrates how widely available this cuisine is across the city. The trend reflects the continued strong demand for classic, hearty northern dishes, securing their place in Bengaluru’s eclectic food scene alongside local and fusion culinary options.

Fig. 13
figure 13

Prevalence of North Indian cuisine in restaurants.

Figure 14 shows the top 30 North Indian restaurants spread across Bengaluru’s district areas. The visualization indicates that such restaurants cluster in well-known areas (BTM, HSR, Koramangala), likely because diners prefer these areas and their food service infrastructure is better developed. The data highlights the pervasive presence of North Indian cuisine in the city’s culinary landscape.

Fig. 14
figure 14

Top 30 North Indian restaurants across Bengaluru localities.

Figure 15 presents a word cloud of the most common cuisines offered by restaurants operating in Bengaluru. Prominent terms such as North Indian, Chinese, South Indian, and Biryani indicate their wide availability. The image offers an accessible view of the city’s food scene, where traditional spreads, street food, and fusion recipes all contribute to Bengaluru’s intensely colourful culinary landscape.

Fig. 15
figure 15

Word cloud of cuisines offered in Bengaluru restaurants.

A word cloud analysis generated from customer reviews regarding the variety of restaurants in Bengaluru is presented in Fig. 16. Key terms such as “tasty,” “service,” “ambience,” and “quality” are indicative of factors that affect customer satisfaction. The visualization provides an at-a-glance summary of customer sentiment across a wide range of issues, offering insight into the key issues raised within the reviews.

Fig. 16
figure 16

Word cloud of standard terms in customer reviews.

Figure 17 illustrates topic-wise keyword distributions derived from customer reviews via topic modeling. Each subplot represents a distinct topic, showing the most common words in that topic along with their counts. Keywords such as food, service, tasty, ambience, and place dominate, echoing the statements that appear repeatedly in user comments and indicating the typical themes of customer experiences in Bengaluru restaurants.

Fig. 17
figure 17

Topic modelling of positive restaurant reviews.

Figure 18 illustrates the keyword distribution across four distinct topics extracted using topic modeling. Each subplot displays the most significant words associated with a topic based on their frequency and importance (weights). Words like food, service, bad, and quality recur across topics, revealing common concerns and sentiments. This visualization helps clarify the dominant themes in user reviews.

Fig. 18
figure 18

Topic modelling of negative restaurant reviews.
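Once reviews have been grouped by topic, the per-topic keyword counts shown in Figs. 17 and 18 reduce to simple frequency counting; the grouped reviews below are hypothetical stand-ins for the topic-model output:

```python
from collections import Counter

# Hypothetical reviews already grouped by topic; in the paper this
# grouping comes from the topic model itself.
topics = {
    "food quality": ["food tasty food quality", "tasty food good quality"],
    "service": ["service slow bad service", "service staff bad"],
}

def top_keywords(docs, k=3):
    """Return the k most frequent tokens across a topic's documents."""
    counts = Counter(tok for doc in docs for tok in doc.split())
    return counts.most_common(k)

for name, docs in topics.items():
    print(name, top_keywords(docs))
# e.g. "food quality" -> ('food', 3) as its dominant keyword
```

Each topic's `most_common` list corresponds to one subplot's bar chart in the figures.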

As presented in Fig. 19, a t-SNE visualization of adjectives used in positive comments helps characterize customer sentiment about restaurants located in Bengaluru.

Fig. 19
figure 19

T-SNE of adjectives from positive reviews.

As presented in Fig. 20, a t-SNE visualization of adjectives used in negative comments helps characterize customer sentiment about restaurants located in Bengaluru.

Fig. 20
figure 20

T-SNE of adjectives used in negative reviews.

Table 3 compares the models’ performance in sentiment analysis across different metrics, providing insight into the capability of each model.

Table 3 Performance comparison among models in sentiment analysis.

As shown in Fig. 21, the models offer varying performance levels in sentiment analysis, with the proposed model outperforming the others thanks to its hybrid architecture. Precision is 89.74% for the baseline CNN, 90.89% for the baseline LSTM, 94.56% for the Bi-LSTM, and 96.93% for the proposed SentiNet. Recall is 89.23%, 90.54%, 94.56%, and 98.04%, respectively. The F1-scores of the baseline CNN, baseline LSTM, Bi-LSTM, and proposed SentiNet models are 89.48%, 90.71%, 94.56%, and 97.48%, respectively. Accuracy is 89.52%, 90.21%, 94.13%, and 98.68%, respectively.

Fig. 21
figure 21

Performance of CNN, LSTM, Bi-LSTM, and SentiNet.

Ablation analysis

To evaluate the significance of each architectural module in SentiNet, an ablation study was conducted by successively removing or deactivating each module and assessing the model’s performance. The study considered three aspects: the semantic attention layer, topic vector embedding, and optimized dropout. The F1-scores and accuracies are summarized in Table 4.

The complete SentiNet model outperformed every ablated variant, with an F1-score of 97.48% and an accuracy of 98.68%. The identical model without the semantic attention layer achieved an F1-score of 95.31%; removing the topic vector embeddings reduced the F1-score to 94.06%; and replacing the optimized dropout (0.46) with a fixed setting dropped it further to 92.85%. These findings validate the contribution of each module to the classification performance.

The ablation study results, evaluating the individual components of the SentiNet architecture, are presented in Table 4. Without the semantic attention layer, topic embeddings, or optimized dropout, the model’s performance dropped significantly. Results demonstrate that each module, especially attention and topic-aware learning, is practical in improving the performance of sentiment classification in terms of accuracy and F1-score.

Table 4 Ablation study evaluating the impact of key architectural components in SentiNet.

Figure 22 shows the results of the ablation study, illustrating how the key architectural components influence SentiNet’s performance. Subfigure (a) presents the degradation in F1-score when semantic attention, topic embeddings, or optimized dropout is removed; subfigure (b) shows the corresponding drop in accuracy. The complete model consistently outperforms the reduced variants, demonstrating that each module contributes meaningfully to both classification accuracy and robustness.

Fig. 22
figure 22

Ablation study showing the impact of key components.

Performance comparison with existing methods

To demonstrate the effectiveness of SentiNet, we compare its performance with several state-of-the-art deep learning-based sentiment analysis models. This comparison highlights the superiority of SentiNet’s attention-enhanced CNN–BiLSTM architecture in terms of accuracy and F1-score. The chosen models span pure RNN variants, CNN–RNN hybrids, and attention-based approaches, providing a strong benchmark for comparison.

Table 5 presents a comparative study of various deep learning-based sentiment classification models alongside the proposed SentiNet framework. For each entry, the authors, model/method, dataset used, main performance metrics, and qualitative insights are summarized. The chosen models span the spectrum of architectures, ranging from baseline LSTM and BiLSTM methods to more complex hybrids that include attention mechanisms, convolutional layers, and GRU (Gated Recurrent Unit) blocks.

Table 5 Comparative performance analysis of existing deep learning-based sentiment models and the proposed SentiNet framework.

Bahad et al.9 employed a plain BiLSTM model on Amazon reviews, achieving an accuracy of 89.9%. Although it effectively captures temporal patterns, the lack of attention or convolutional layers restricts its capacity to focus on critical contextual signals, especially in noisy or informal text. Do et al.8 developed a hybrid CNN-LSTM model, which leverages the strong local feature extraction capabilities of CNNs and the sequential modeling power of LSTMs. Their approach, validated on Twitter and review datasets, achieved an accuracy of up to 91%, demonstrating the effectiveness of hybrid architectures in aspect-level sentiment classification.

Xu et al.7 reinforced the BiLSTM with weighted word vectors, achieving an F1-score of 86% on their corpus. This method enhances the representation of semantically meaningful terms but does not employ hierarchical or multi-level attention mechanisms. Basiri et al.5 presented ABCDM, which combines attention, CNN, and GRU components, and reported accuracy exceeding 91% on Twitter/Yelp datasets, demonstrating that attention makes it possible to focus on sentiment-relevant tokens.

Jelodar et al.4, using an earlier deep sentiment model based on general LSTM and NLP methods, achieved an accuracy of 81.15% on COVID-19-related social media material. The comparatively modest performance highlights the inherent limitations of vanilla RNN-based models, which struggle to capture fine-grained and subtle emotional aspects of the input without advanced optimization or architectural improvements.

On multiple benchmark datasets, Sherkatghanad48 introduced a CNN–BiLSTM architecture with a self-attention mechanism that achieved above 94% accuracy. The model improved precision and F1-score over typical hybrid models by combining convolutional layers for local pattern extraction, BiLSTM for sequential modeling, and self-attention for weighting constructs according to context. This shows that attention is a valuable addition to typical CNN–RNN pipelines for sentiment representation.

Krasitskii et al.49 applied a joint extractive–abstractive summarization step before fine-tuning XLM-R for the sentiment classification task. Applied to multilingual corpora spanning 10 languages, this approach achieved high classification accuracy with reduced inference time, making it attractive for large-scale, low-resource applications. The results show that selective text compression can streamline a multilingual model while maintaining predictive power.

By contrast, the proposed SentiNet model, which integrates multi-level attention, CNN-based feature extraction, and BiLSTM-based temporal modeling, exhibits a significant performance improvement, achieving an accuracy of 94.8% and an F1-score of 94.3% in this cross-dataset comparison. Beyond raw scores, SentiNet incorporates architectural novelties that enable it to selectively attend to contextually significant features, handle noisy and informal user-generated content (e.g., tweets), and generalize well across different domains, such as IMDb reviews. The improvement is attributed to the model’s enhanced preprocessing, dynamic attention, and refined hyperparameter tuning.

In summary, the cross-database comparison confirms that architectural synergy, particularly attention-driven hybrid deep learning, outperforms traditional methods. Compared to its counterparts, SentiNet is not only highly interpretable and scalable, but it also achieves superior classification performance, making it easily applicable to various real-world sentiment analysis tasks.

Discussion

The experimental results consistently demonstrate that SentiNet significantly outperforms traditional and hybrid baseline models across all evaluation metrics, including accuracy, precision, recall, and F1-score. This performance gain is attributed to several architectural and functional innovations embedded in the model design.

First, SentiNet’s integration of semantic attention mechanisms enables it to dynamically focus on contextually relevant words and phrases, thereby improving sentiment inference for ambiguous or multi-opinionated text. Unlike conventional BiLSTM or CNN models, which treat all tokens uniformly, SentiNet identifies sentiment-bearing tokens in context, a critical aspect in real-world sentiment tasks, where polarity may depend heavily on specific phrases or negation cues.
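Conceptually, such an attention mechanism converts per-token relevance scores into normalized weights via a softmax; the scores below are fixed for illustration, whereas in SentiNet they are learned:

```python
import math

def softmax(scores):
    """Normalize raw relevance scores into weights summing to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical per-token relevance scores (learned in the real model)
tokens = ["the", "service", "was", "terribly", "slow"]
scores = [0.1, 1.2, 0.1, 2.5, 1.8]

weights = softmax(scores)
# Sentiment-bearing tokens ("terribly", "slow") receive the largest
# weights, while function words ("the", "was") are down-weighted.
```

The weighted token representations are then aggregated, so sentiment-bearing words dominate the sentence-level vector rather than all tokens contributing uniformly.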

Second, the topic vector embedding layer allows the model to incorporate latent thematic information that often underlies sentiment polarity. This abstraction enables SentiNet to learn domain-sensitive sentiment patterns—for example, distinguishing how the word “light” may indicate positive sentiment in electronics reviews but neutral sentiment in travel reviews. This capability is absent in models relying solely on word-level or sequence-level features.

Third, SentiNet’s optimized dropout strategy and gated fusion mechanisms ensure better generalization, avoiding both underfitting and overfitting. The model stabilizes more quickly during training, as indicated by early convergence curves, and maintains consistent performance across multiple validation folds.

From a qualitative perspective, sample predictions reveal that SentiNet handles sarcasm, implicit sentiment, and domain-specific expressions more accurately. For instance, in test cases like “The battery lasted barely a day—great!”, baseline models misclassify the sentiment as positive due to lexical cues. In contrast, SentiNet correctly identifies the sarcastic tone due to contextual attention.

In summary, the superior performance of SentiNet stems from its context-aware interpretability, topic-informed learning, and architectural robustness, which together enable nuanced understanding of sentiment expressions. These features make it particularly suitable for deployment in sentiment-intensive domains such as e-commerce, social media monitoring, and healthcare feedback analysis.

While SentiNet achieves high classification accuracy and robustness, it introduces moderate computational overhead due to its attention mechanisms and the integration of topic vectors. Compared to lightweight models such as vanilla CNN or LSTM, SentiNet has a higher inference time, especially in resource-constrained settings. However, this trade-off is often justified in real-time applications where interpretability and contextual accuracy are critical (e.g., real-time brand monitoring or healthcare feedback analysis). For deployment in latency-sensitive environments, variants of SentiNet with reduced depth or pruned attention heads can be explored to maintain a balance between responsiveness and predictive performance.

Conclusion and future work

This paper introduced SentiNet, a hybrid deep learning model that integrates semantic attention, topic vector embeddings, and sequence modeling for context-aware sentiment classification. Extensive experiments on benchmark datasets showed that SentiNet outperforms traditional BiLSTM, CNN, and hybrid models across metrics (accuracy, precision, recall, F1-score). The proposed framework improves not only classification performance but also model interpretability, an essential factor in sensitive domains such as e-health, e-governance, and financial market analysis. The combination of topic-aware embeddings and context-sensitive attention enables the model to adapt dynamically to varied linguistic constructs and domain entities. Additionally, the architecture is modular, allowing easy extension with external lexicons or multilingual components in future versions. Practically, deploying SentiNet in real-time applications requires trading off predictive accuracy against inference latency: although the attention layers incur computational overhead, the model can be tuned through parameter pruning or lightweight distillation, enabling deployment either at the edge, close to the data source, or in the cloud, depending on latency and resource constraints. In future work, we plan to investigate integration with explainable AI (XAI) dashboards, as well as cross-lingual adaptation and domain transferability. Finally, the analysis of sentiment dynamics over time (temporal sentiment evolution) and adversarial robustness will be further scrutinized for applications in volatile domains such as social media surveillance and crisis informatics.
In summary, SentiNet is a strong framework for sentiment classification that reconciles deep contextual learning with computational practicality. Its design and performance make it a perfect candidate for use in mission-critical environments that require high accuracy and operational visibility.