Fig. 4: Evaluation of intermediate process tasks in GPT-4o- and Gemma3-driven DisasTeller. Each bar plot shows the accuracy scores (0–10 scale) for local disaster grading and map annotation.

The five coloured bars represent five independent runs of DisasTeller, and the dashed line indicates the mean score across these runs. The text box above each subplot reports the average accuracy score together with the corresponding deviation error: grading deviation (+G) for local disaster grading and distance deviation (in pixels) for map annotation. Source data are provided as a Source Data file.