Table 4 Keywords used for data collection and statistics of unlabelled data.
From: Multilingual identification of nuanced dimensions of hope speech in social media texts
Language | Keywords | # tweets scraped with keywords | # recent tweets at time of collection | Total # raw tweets | # tweets after preprocessing |
---|---|---|---|---|---|
English | hope, Inshallah, aspire, believe, expect, want, wish | 50,000 | 50,000 | 100,000 | \(\sim\)23,000 |
Spanish | Confiar, Espero, Confío, Ojalá, Creo, Esperanza, Desearía, Rezo, Anhelo, Sueño, Oro | 82,725 | 50,000 | 132,725 | \(\sim\)35,000 |
German | Hoffnung, Wunsch, Erwartung, Träume, optimistisch, glauben, zuversichtlich, vorfreuen | 33,330 | 50,000 | 83,330 | \(\sim\)40,000 |