Table 4. Language distribution of tweets.

From: Twitter Sentiment Geographical Index Dataset

| Language | Number of tweets in 50 million sample | Percentage of tweets in 50 million sample |
| --- | --- | --- |
| English (en) | 16,618,957 | 35.81% |
| Spanish (es) | 5,897,099 | 12.71% |
| Portuguese (pt) | 5,300,609 | 11.42% |
| Japanese (ja) | 4,881,423 | 10.52% |
| Arabic (ar) | 1,581,877 | 3.41% |
| Indonesian (in) | 1,370,835 | 2.95% |
| Turkish (tr) | 1,340,662 | 2.89% |
| Hindi (hi) | 931,777 | 2.00% |
| Others | 9,146,503 | 18.29% |