Abstract
The COVID-19 pandemic, which emerged in late 2019, confronted the world with a major health crisis for nearly two and a half years. Prompt and accurate news was crucial for the decisions taken to control the pandemic. In recent years, social media has proved effective at disseminating information rapidly, and Twitter sentiment analysis has become an active area of research. However, most of these studies consider only the local viewpoint when assessing public sentiment. In this work, a fine-tuned transformer-based model is used to determine the sentiment of tweets locally. Additionally, various real data sources are incorporated to improve the understanding of sentiment polarity. Because a labeled dataset for Twitter sentiment analysis that reflects both global and local viewpoints is difficult to obtain, we propose three methods for precise sentiment analysis that consider both perspectives. The classification results obtained from these methods are used to train various traditional classification models and can serve as ground truth for further analysis. Our extensive analysis shows that more precise sentiments, based on the global perspective, can be extracted and are useful as training data for different machine learning techniques.
Data availability
Due to Twitter’s Terms of Service and privacy policies, the tweet texts and associated user-level information cannot be publicly shared. The dataset of Tweet IDs and labels is made available. Other researchers can hydrate these Tweet IDs using the Twitter API, in accordance with Twitter’s Developer Policy. The dataset is available at https://docs.google.com/document/d/1XOENZGV00utJrsYEnLVIZTy4hIgZMwB09QPEA9WGKWc/edit?tab=t.0.
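As an illustration of the hydration step mentioned above, the following is a minimal sketch of how a list of Tweet IDs might be prepared for lookup requests. It is not the authors' code. The helper names batch_tweet_ids and lookup_params are hypothetical, and the limit of 100 IDs per request and the "ids" query parameter are assumptions that should be checked against the current API documentation. The sketch only builds request parameters; it makes no network call and needs no credentials.

```python
from typing import Dict, Iterable, Iterator, List

# Assumption: the lookup endpoint accepts at most 100 IDs per request.
MAX_IDS_PER_REQUEST = 100


def batch_tweet_ids(
    tweet_ids: Iterable[str], batch_size: int = MAX_IDS_PER_REQUEST
) -> Iterator[List[str]]:
    """Yield cleaned, de-duplicated Tweet IDs in lists of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    seen = set()
    batch: List[str] = []
    for raw in tweet_ids:
        tweet_id = str(raw).strip()
        # Skip blank or non-numeric entries and IDs already emitted.
        if not tweet_id.isdigit() or tweet_id in seen:
            continue
        seen.add(tweet_id)
        batch.append(tweet_id)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


def lookup_params(batch: List[str]) -> Dict[str, str]:
    """Build query parameters for one lookup request (hypothetical helper)."""
    # Assumption: IDs are passed as one comma-separated "ids" parameter.
    return {"ids": ",".join(batch)}
```

Each batch would then be sent with an authenticated HTTP client, and the returned tweets joined back to the released labels by Tweet ID. Tweets that have been deleted or made private since collection will not be returned, so the hydrated set may be smaller than the released ID list.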
Acknowledgements
A.M. acknowledges the support received from the MATRICS project grant (MTR/2020/000326) of SERB, DST, Govt. of India.
Author information
Contributions
D.C.: Conceptualization, Data curation, Writing - original draft, Methodology, Validation. S.C.: Conceptualization, Data curation, Methodology, Writing - original draft. A.M.: Conceptualization, Investigation, Supervision.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Chakrabarty, D., Chatterjee, S. & Mukhopadhyay, A. A global twitter sentiment analysis model for COVID-vaccination. Sci Rep (2026). https://doi.org/10.1038/s41598-026-38553-0
DOI: https://doi.org/10.1038/s41598-026-38553-0