Figure 5
From: Machine learning based attribution mapping of climate related discussions on social media

Underlying features driving the discussions within the optimized clusters obtained for Sample 1. The features are obtained by training a supervised random forest binary classifier. The output shows that each cluster is composed of discussions around a distinct theme and its sub-themes. For instance, features such as solar, oil, and nuclear appear in the Energy cluster (a), and carbon, coal, and emission in the Carbon [emissions] cluster (b), whereas features such as trump, Obama, and party dominate the Administration cluster (c). Similarly, warming, global, science, etc. comprise the Climate science cluster (d), whereas temperature, ice, arctic, etc. form the Global warming cluster (e). In the Population & economy cluster (f), features such as collapse, population, and crisis receive the highest weights, while plastic, waste, bag, etc. receive the highest weights in the Plastic & waste cluster (g). In the Agriculture & administration cluster (h), features such as water, chemical, and ban dominate the discussions, while in the Wildlife cluster (i), features such as specie, animal, and fish dominate. Finally, storms, hurricanes, etc. dominate the Natural catastrophe cluster (j), whereas general climate-related terms such as green and environmental form the General posts cluster (k). The Unidentifiable cluster (l), on the other hand, does not assign a high weight to any climate-related theme.