Fig. 3
From: High resolution synthetic residential energy use profiles for the United States

Impurity-based feature importance and correlation. Each plot shows the Gini importance of features for two dependent variables, home and work. The x-axis lists the independent variables in order of importance based on IncNodePurity. Two parameters were varied: ‘ntree’ (number of decision trees) and ‘node size’ (minimum size of terminal nodes). Eight conditions, one for each combination of the two parameters, were tested: ntree = 500, 1000, 1500, and 2000; node size = 5 and 10. The plots show that the results are robust across the different conditions. According to the plots, five independent variables (wrkhrs, worker, age, hinc3, and hsize) are the ones that most affect all the dependent variables. The right-hand y-axis shows the absolute Pearson correlation coefficient. Positive and negative coefficients are distinguished by blue dots and squares, respectively. Except for wrkhrs and worker, all independent variables are weakly correlated with the dependent variables.
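
The two quantities the caption relies on, the eight-condition parameter grid and the absolute Pearson correlation with its sign marker, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names `pearson` and `abs_correlation_with_sign` are hypothetical, the sample values are invented toy data, and the random forest fit itself is not reproduced here.

```python
from itertools import product
from math import sqrt

# Parameter grid from the caption: 4 values of ntree x 2 values of node size.
NTREE_VALUES = [500, 1000, 1500, 2000]
NODE_SIZE_VALUES = [5, 10]
conditions = list(product(NTREE_VALUES, NODE_SIZE_VALUES))  # 8 combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def abs_correlation_with_sign(xs, ys):
    """Return (|r|, marker), mirroring the plot's right-hand axis.

    The marker follows the caption: dots for positive coefficients,
    squares for negative ones.
    """
    r = pearson(xs, ys)
    marker = "dot" if r >= 0 else "square"
    return abs(r), marker


# Invented toy data, for illustration only (not taken from the paper).
wrkhrs = [0, 2, 4, 6, 8]
home = [10, 8, 6, 4, 2]  # decreases linearly as wrkhrs increases

magnitude, marker = abs_correlation_with_sign(wrkhrs, home)
print(len(conditions), magnitude, marker)
```

Plotting the magnitude and encoding the sign in the marker shape lets strongly negative and strongly positive predictors share one axis, which is presumably why the figure uses the absolute coefficient.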