Fig. 5

A comparison of farmer-reported vs. government-reported crop yields. In (a), the co-located yield estimates from farmer surveys and regional government statistics are aggregated across all countries and grouped by crop. In (b), the co-located yield data are aggregated across all crops and grouped by country. The ‘yield bias’ (y-axis) denotes whether the government-reported yield statistics are systematically higher or lower than the farmer-reported yields; negative values indicate that farmer-reported yields are lower than government statistics. The mean yield bias (averaged across all countries with available co-located government and farmer survey data) is −11 ± 4.0% (μ ± s.e., where μ is the mean and s.e. is the standard error), indicating that farmer-based estimates are slightly lower than government statistics. In both panels, the triangles denote the mean yield bias and the boxes denote the 25th–75th percentiles of the estimated yield bias. Note that the yield bias is calculated in a total least squares sense: rather than treating either the government-based or the farmer-based yield estimates as a ‘gold-standard’ representation of the true yield, we acknowledge that both estimates are flawed representations of the true value. We first perform an orthogonal projection of each data point in the government-based yield vs. farmer-based yield crossplots (e.g., see Fig. 4) onto the 1:1 line. Next, we evaluate whether the data cloud tends to sit above or below the 1:1 line, weighting each observation by its orthogonal distance to the 1:1 line and by its sample size (\(\sqrt{n}\), where n is the number of farmer observations that were aggregated to generate the regional estimate). See Figures S9–S10 in the Supporting Information for the data underlying these yield bias estimates.
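
The weighting scheme described in this caption can be illustrated with a minimal sketch. This is not the authors' code: the caption does not state how the percentage bias is normalised, so the sketch assumes that each point's deviation is expressed relative to its orthogonal projection onto the 1:1 line (the midpoint of the two estimates). The function name `yield_bias_percent` and its arguments are hypothetical.

```python
import math


def yield_bias_percent(gov, farmer, n_obs):
    """Weighted mean yield bias (%) relative to the 1:1 line.

    Negative values mean farmer-reported yields are lower than
    government statistics, matching the sign convention in the caption.
    """
    numerator = 0.0
    denominator = 0.0
    for g, f, n in zip(gov, farmer, n_obs):
        # Orthogonal projection of (g, f) onto the 1:1 line: both
        # coordinates of the projected point equal the midpoint.
        midpoint = (g + f) / 2.0
        if midpoint == 0.0:
            continue  # no yield information in this point
        # Signed orthogonal distance to the 1:1 line.
        signed_dist = (f - g) / math.sqrt(2.0)
        # ASSUMPTION: percentage deviation relative to the projected value.
        rel_dev = 100.0 * (f - g) / midpoint
        # Weight by orthogonal distance and by sqrt(sample size).
        weight = abs(signed_dist) * math.sqrt(n)
        numerator += weight * rel_dev
        denominator += weight
    # If every point lies exactly on the 1:1 line there is no bias.
    return numerator / denominator if denominator > 0.0 else 0.0


# Hypothetical regional estimates (e.g., t/ha) and farmer sample sizes.
bias = yield_bias_percent(gov=[2.0, 4.0], farmer=[1.0, 4.0], n_obs=[4, 9])
```

Because neither estimate is treated as the reference, swapping the government and farmer inputs flips only the sign of the result under this assumed normalisation.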