Abstract
Predicting cryptocurrency price movements using social media sentiment remains challenging due to the noisy, heterogeneous, and rapidly evolving nature of online signals. While prior studies commonly combine sentiment analysis with deep learning models, less attention has been given to how sentiment signals are constructed, aggregated, and aligned with price dynamics. This study investigates the impact of sentiment representation and price change labeling on short-term Bitcoin price movement classification. Over 1.1 million Bitcoin-related tweets spanning April to August 2021 are analyzed using a RoBERTa-based sentiment model, incorporating both sentiment probabilities and user-level activity metrics. These features are consolidated via Principal Component Analysis (PCA) and aggregated over time using a decay-weighted scheme to emphasize recent information. Price movements are categorized into discrete regimes using a data-driven K-means clustering approach, with controlled Gaussian noise applied to improve boundary robustness. Multiple predictive models, including a Gated Recurrent Unit (GRU), Temporal Convolutional Network (TCN), LightGBM, and multinomial logistic regression, are evaluated. Although the GRU achieves the highest overall performance, an extensive ablation study demonstrates that the primary performance gains arise from the proposed sentiment construction and labeling framework rather than the forecasting architecture alone. Removing PCA-based aggregation, adaptive clustering, or noise injection leads to substantial degradation, particularly for extreme price movement classes. The findings highlight the importance of sentiment feature design and class definition in cryptocurrency prediction and provide empirical guidance for constructing robust sentiment-driven financial models.
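As a rough illustration of the labeling step described above, the following minimal Python sketch clusters one-dimensional price changes into ordered regimes with K-means after perturbing them with small Gaussian noise. It is not the authors' implementation. The function names (kmeans_1d, label_returns), the quantile-based initialization, the default of three regimes, the noise scale, and the point at which noise is injected are all assumptions made for illustration.

```python
import random


def kmeans_1d(values, k, iters=100):
    """Lloyd's algorithm on scalars; returns centroids in ascending order."""
    if k < 1 or k > len(values):
        raise ValueError("k must be between 1 and len(values)")
    ordered = sorted(values)
    n = len(ordered)
    # Assumption: deterministic quantile-based initialization.
    centroids = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            buckets[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        updated = sorted(
            sum(b) / len(b) if b else centroids[j]
            for j, b in enumerate(buckets)
        )
        if updated == centroids:
            break
        centroids = updated
    return centroids


def label_returns(returns, k=3, noise_std=0.01, seed=0):
    """Assign each price change to a regime (0 = most negative centroid)."""
    rng = random.Random(seed)
    # Assumption: noise is added before fitting the centroids, so that
    # regime boundaries depend less on individual observations.
    noisy = [r + rng.gauss(0.0, noise_std) for r in returns]
    centroids = kmeans_1d(noisy, k)
    labels = [
        min(range(k), key=lambda j: abs(r - centroids[j])) for r in returns
    ]
    return labels, centroids
```

Calling label_returns on a list of percentage price changes returns one regime index per observation together with the fitted centroids. With a larger k, the lowest and highest indices would correspond to the extreme movement classes.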
Data availability
The Twitter (X) dataset used in this study is publicly available on Kaggle at [https://www.kaggle.com/datasets/kaushiksuresh147/bitcoin-tweets]. The Bitcoin price data from Coinbase is also publicly accessible via Kaggle at [https://www.kaggle.com/datasets/patrickgendotti/btc-and-eth-1min-price-history?select=coinbaseUSD_1-min_data.csv]. Processed datasets, including the merged sentiment and price data with derived features, are available from the corresponding author upon reasonable request.
Code availability
The custom code developed for this study, including the batch-processing pipeline for RoBERTa-based sentiment analysis, the PCA-based sentiment aggregation, the time-decaying aggregation mechanism, and the implementation of the GRU, TCN, LightGBM, and Logit models, is provided as Supplementary Material in the journal’s submission system.
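For readers without access to the supplementary code, the time-decaying aggregation mentioned above can be approximated by a weighted mean in which older observations receive exponentially smaller weights. The sketch below is only an assumption about the general form of such a scheme. The function name decay_weighted_mean, the exponential kernel, and the 30-minute half-life are hypothetical and are not taken from the authors' pipeline.

```python
import math


def decay_weighted_mean(scores, ages_minutes, half_life_minutes=30.0):
    """Weighted mean of scores, where weight halves every half-life.

    scores:       per-tweet (or per-interval) sentiment values
    ages_minutes: how long ago each value was observed, in minutes
    """
    if len(scores) != len(ages_minutes):
        raise ValueError("scores and ages_minutes must have equal length")
    if not scores:
        return 0.0
    # Assumption: exponential decay, w = exp(-ln(2) * age / half_life).
    rate = math.log(2.0) / half_life_minutes
    weights = [math.exp(-rate * age) for age in ages_minutes]
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, scores)) / total
```

For example, a score of 1.0 observed now and a score of 0.0 observed one half-life ago receive weights 1 and 0.5, giving an aggregate of 1 / 1.5, or about 0.667, rather than the unweighted mean of 0.5.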
Funding
The authors did not receive support from any organization for the submitted work.
Author information
Contributions
Mahmood Mohammadi Nezhad: Methodology, Data Curation, Formal Analysis, Writing. Saeed Rouhani: Methodology, Formal Analysis, Conceptualization, Supervision. Navid Mohammadi: Methodology, Formal Analysis, Conceptualization, Supervision, Writing Final. Ali Shahedi: Data Curation, Methodology, Writing.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethical approval
All relevant ethical issues were considered in this research.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Nezhad, M.M., Rouhani, S., Mohammadi, N. et al. A unified GRU model for cryptocurrency price prediction and harsh price movement detection using enhanced sentiment analysis. Sci Rep (2026). https://doi.org/10.1038/s41598-026-46271-w
DOI: https://doi.org/10.1038/s41598-026-46271-w


