Figure 5

Twitter estimation of the polls 7 days in advance versus linear extrapolation of the polls. (a) Twitter estimation of the NYT polls 7 days in advance (blue line), 7-day linear extrapolation of the NYT polls (black line), 7-day forecast using an ARIMA model (dotted orange line), and Hillary Clinton's NYT National Polling Average score, normalized to the share of Donald Trump and Hillary Clinton (dashed purple line). Our model is trained using only data from June 1st to September 1st. (b) Estimation error, in percentage points, of the NYT polls. The Twitter estimation error (blue) has a root-mean-square error of RMSE = 0.40% (correlation coefficient r = 0.89). The 7-day linear extrapolation of the polls (black) has an RMSE of 1.19% (r = 0.64), the 7-day ARIMA forecast (orange) has an RMSE of 0.33% (r = 0.91), and the baseline error, computed as the difference between the NYT Polling Average and its mean value (red), has an RMSE of 0.83%.
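
The error measures quoted in this caption can be illustrated with a minimal sketch. It is not the authors' code: the function names, the 7-point fitting window used for the linear extrapolation, and the toy series are all assumptions made for illustration, since the caption does not specify how the extrapolation line is fitted.

```python
import math


def rmse(errors):
    """Root-mean-square value of a sequence of errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def baseline_errors(series):
    """Baseline error: difference between each value and the series mean."""
    mean = sum(series) / len(series)
    return [v - mean for v in series]


def linear_extrapolate(history, horizon=7, window=7):
    """Fit a least-squares line to the last `window` points of `history`
    and evaluate it `horizon` steps past the last point.
    The window length is an assumption, not taken from the caption."""
    recent = history[-window:]
    n = len(recent)
    mean_x = (n - 1) / 2
    mean_y = sum(recent) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(recent))
    slope = sxy / sxx
    return mean_y + slope * ((n - 1 + horizon) - mean_x)


if __name__ == "__main__":
    # Made-up daily poll shares in percent; NOT the NYT data from the figure.
    polls = [51.0, 51.2, 51.1, 51.4, 51.6, 51.5, 51.8, 52.0,
             51.9, 52.1, 52.3, 52.2, 52.0, 51.8, 51.9, 52.1]
    horizon, window = 7, 7
    # Forecast day t + horizon from the data available up to day t.
    targets = polls[window - 1 + horizon:]
    forecasts = [linear_extrapolate(polls[:t + 1], horizon, window)
                 for t in range(window - 1, len(polls) - horizon)]
    errors = [f - y for f, y in zip(forecasts, targets)]
    print("extrapolation RMSE:", rmse(errors))
    print("baseline RMSE:", rmse(baseline_errors(targets)))
```

Under these assumptions, the baseline RMSE is simply the spread of the polling average around its own mean, so a forecast is informative only if its RMSE falls below that value.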