Fig. 8

An evaluation of different sources of variability and error in the yield statistics extracted from farmer surveys. We explore four questions: (1) How does the number of farmer observations within a regional administrative boundary affect the quality of the aggregated regional yield estimate? (2) How does the plot area affect the yield estimate? (3) How does the method for estimating the plot area (GPS measurement vs. farmer interview) affect the yield estimate? (4) How does the method for quantifying production (crop cuts vs. farmer interview) affect the yield estimate? To answer each question, we subdivide the dataset into groups based on the farmer survey sample size, the plot area, the area measurement method, and the crop production measurement method, respectively (see below for details). We then compare each subset of farmer survey data to the co-located regional government statistics. The top row (a–d) shows the correlation coefficient (ρ) between the farmer survey and government data. The bottom row (e–h) shows the average yield bias between the farmer survey and government data. We define the yield bias the same way as in Fig. 5; negative values indicate that the farmer survey data produce lower yields than the government figures. In (a,e), we divide the dataset of co-located farmer surveys and regional government statistics (e.g., Fig. 4) into four groups based on the number of farmer surveys (n) within the administrative boundary: n ≤ 5, n = 5–20, n = 20–200, and n > 200. In (b,f), we divide the same dataset according to the plot area (A): A ≤ 0.1 ha, A = 0.1–0.4 ha, A = 0.4–2.0 ha, and A > 2.0 ha. In (c,g), we split the dataset according to whether the farmer surveys used GPS-measured field areas or farmer-reported estimates of field areas. In (d,h), we split the dataset between yield estimates based on farmer recall of total production (kg) and yield estimates based on crop cuts.
Note that only Ethiopia has an extensive network of crop cuts (Fig. 6). Cross-plots of the data underlying this figure are provided in Figure S21 of the Supporting Information.
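
The grouping procedure for panels (a,e) can be illustrated with a minimal sketch. It is not the authors' code and rests on several assumptions: ρ is taken to be the Pearson correlation, the yield bias is taken to be the mean of (survey yield minus government yield), which matches the stated sign convention but may differ from the definition used in Fig. 5, and a value falling exactly on a bin edge is assigned to the lower bin. The names `SAMPLE_SIZE_BINS`, `bin_label`, `pearson`, and `summarize_by_bin` are hypothetical.

```python
from math import sqrt

# Bin edges from the caption; treating each bin as (low, high] is an assumption.
SAMPLE_SIZE_BINS = [
    (0, 5, "n<=5"),
    (5, 20, "n=5-20"),
    (20, 200, "n=20-200"),
    (200, float("inf"), "n>200"),
]


def bin_label(n, bins=SAMPLE_SIZE_BINS):
    """Return the label of the (low, high] bin that contains n."""
    for low, high, label in bins:
        if low < n <= high:
            return label
    raise ValueError(f"value {n} falls outside all bins")


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if undefined)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def summarize_by_bin(records):
    """Group (n_surveys, survey_yield, government_yield) records by sample-size
    bin and return the correlation and mean bias for each bin."""
    groups = {}
    for n, survey_yield, gov_yield in records:
        groups.setdefault(bin_label(n), []).append((survey_yield, gov_yield))

    summary = {}
    for label, pairs in groups.items():
        survey = [s for s, _ in pairs]
        gov = [g for _, g in pairs]
        # Assumed bias definition: survey minus government, averaged over the bin.
        bias = sum(s - g for s, g in pairs) / len(pairs)
        rho = pearson(survey, gov) if len(pairs) >= 2 else float("nan")
        summary[label] = {"rho": rho, "bias": bias, "count": len(pairs)}
    return summary
```

The same function applies to panels (b,f) by swapping in plot-area bins, and to panels (c,g) and (d,h) by replacing `bin_label` with a categorical key for the measurement method.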