Fig. 3: Scene understanding maps and other fixation prediction models. | Nature Communications

From: Eye movements during free viewing to maximize scene understanding

a Procedural workflow for creating our proposed scene understanding maps (SUM): The impact of an object on scene understanding was quantified by comparing the similarity (rated by humans) between descriptions that people provided when the object was present and those provided when the object was removed. See Supplementary Fig. 1b, c for details on the consistency of LLMs with human raters in rating the similarity of the descriptions. The SUM was generated by placing a 2D Gaussian on each object, with its amplitude determined by the impact of the object on scene understanding and its standard deviation determined by the size of the bounding box around the center of each object (SD range: 0.5 to 1.5 dva) (43; see Methods for details). b Workflow for creating maps for the other fixation prediction models used in this study: meaning maps (top), DeepGaze maps (middle), and GBVS maps (bottom). The grey regions in the heat maps indicate predictions that fall on people in the scene; these were excluded from the initial analyses (see further below for results that include fixations on people). Consent was obtained from the person in the featured image for its publication.
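The SUM construction in panel a can be sketched as follows. This is a minimal illustrative implementation, not the authors' code: the object dictionary keys (`impact`, `bbox`), the pixel-space sigma bounds standing in for the 0.5–1.5 dva range, and the sigma-from-bounding-box rule are all assumptions made for the sketch.

```python
import numpy as np

def scene_understanding_map(objects, height, width):
    """Sum one 2D isotropic Gaussian per object.

    Amplitude = the object's impact on scene understanding;
    sigma is derived from the bounding-box size and clipped to a
    fixed range (the paper clips SD to 0.5-1.5 dva; a pixel-space
    equivalent is assumed here). `objects` is a list of dicts with
    hypothetical keys: 'impact' (float) and 'bbox' = (x, y, w, h).
    """
    ys, xs = np.mgrid[0:height, 0:width]
    sum_map = np.zeros((height, width), dtype=float)
    for obj in objects:
        x, y, w, h = obj["bbox"]
        cx, cy = x + w / 2.0, y + h / 2.0  # bounding-box center
        # Illustrative rule: sigma proportional to box extent, clipped.
        sigma = np.clip(0.25 * max(w, h), 10.0, 30.0)
        g = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
        sum_map += obj["impact"] * g
    if sum_map.max() > 0:
        sum_map /= sum_map.max()  # normalize for comparison across maps
    return sum_map
```

Under these assumptions, an object with a larger impact score contributes a taller Gaussian, so fixation density is predicted to concentrate on objects whose removal most changes scene descriptions.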
