Figure 2
From: Active vision in immersive, 360° real-world environments

Comparing salience and meaning maps to gaze behavior. (A) Each photosphere was first decomposed into smaller, undistorted image tiles. Next, we created two models of the content in each real-world environment. (B) “Salience maps” were generated by modeling the low-level visual features of each tile using the GBVS Toolbox [16]. Each tile was then projected onto a two-dimensional salience map. (C) “Meaning maps” were generated from online participants’ ratings of the semantic content, or “meaning,” of each image tile. Each tile’s rating was then projected onto a two-dimensional meaning map. (D) Group gaze maps were trimmed vertically to match the field of view of the passive condition. (E) Points were sampled evenly on a sphere and used to correct for photosphere distortion in the two-dimensional maps. (F) A linear mixed effects model was used to compare the degree to which each model predicted attentional guidance in our two conditions. (Image adapted from Pyramid Oracle Panorama by Nathan Tweti.)
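The paper does not include the tile-extraction code for panel (A); the sketch below shows one standard way to cut an undistorted perspective tile out of an equirectangular photosphere via an inverse gnomonic projection. The function name, field of view, and tile size are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def gnomonic_tile(pano: np.ndarray, center_lon: float, center_lat: float,
                  fov_deg: float = 60.0, size: int = 512) -> np.ndarray:
    """Cut an undistorted perspective tile out of an equirectangular panorama.

    pano: H x W (x C) equirectangular image; tile center given in radians.
    """
    h, w = pano.shape[:2]
    half = np.tan(np.radians(fov_deg) / 2.0)
    # Pixel grid on the image plane of a virtual pinhole camera.
    u, v = np.meshgrid(np.linspace(-half, half, size),
                       np.linspace(-half, half, size))
    # Inverse gnomonic projection: image plane -> longitude/latitude.
    rho = np.sqrt(u ** 2 + v ** 2)
    c = np.arctan(rho)
    safe_rho = np.maximum(rho, 1e-12)  # avoid 0/0 at the tile center
    lat = np.arcsin(np.clip(
        np.cos(c) * np.sin(center_lat)
        + v * np.sin(c) * np.cos(center_lat) / safe_rho, -1.0, 1.0))
    lon = center_lon + np.arctan2(
        u * np.sin(c),
        rho * np.cos(center_lat) * np.cos(c) - v * np.sin(center_lat) * np.sin(c))
    # Look up the source pixels (nearest neighbour, wrapping in longitude).
    cols = (((lon / (2 * np.pi) + 0.5) % 1.0) * (w - 1)).astype(int)
    rows = ((0.5 - lat / np.pi) * (h - 1)).astype(int)
    return pano[rows, cols]
```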
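Likewise, the even sphere sampling in panel (E) can be approximated with a Fibonacci (golden-angle) lattice; the sketch below draws roughly equal-area points on the unit sphere and maps them to equirectangular pixel coordinates, so that statistics computed on the two-dimensional maps are not biased toward the oversampled poles. The function names, point count, and map resolution are assumptions for illustration.

```python
import numpy as np

def fibonacci_sphere(n_points: int) -> np.ndarray:
    """Return n_points approximately evenly spaced unit vectors (x, y, z)."""
    # Golden-angle spiral: successive points rotate by ~137.5 degrees.
    golden_angle = np.pi * (3.0 - np.sqrt(5.0))
    i = np.arange(n_points)
    z = 1.0 - 2.0 * (i + 0.5) / n_points      # even spacing in z => equal area
    radius = np.sqrt(1.0 - z ** 2)            # radius of each latitude ring
    theta = golden_angle * i
    return np.stack([radius * np.cos(theta), radius * np.sin(theta), z], axis=1)

def sphere_to_equirect(points: np.ndarray, width: int, height: int) -> np.ndarray:
    """Map unit vectors to (col, row) pixel coordinates of an equirectangular map."""
    lon = np.arctan2(points[:, 1], points[:, 0])         # [-pi, pi]
    lat = np.arcsin(np.clip(points[:, 2], -1.0, 1.0))    # [-pi/2, pi/2]
    col = (lon / (2 * np.pi) + 0.5) * (width - 1)
    row = (0.5 - lat / np.pi) * (height - 1)
    return np.stack([col, row], axis=1)

# Sample a 2048 x 1024 map at distortion-corrected locations (sizes hypothetical).
pts = fibonacci_sphere(10_000)
pix = sphere_to_equirect(pts, width=2048, height=1024)
```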
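The caption does not name the statistical software used for panel (F); as one possible setup, the sketch below fits a mixed linear model with statsmodels, using the map-gaze correlation as the outcome, map type and condition (plus their interaction) as fixed effects, and a random intercept per scene. The data frame, column names, and correlation values are hypothetical placeholders, not the paper's data.

```python
import itertools
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per (scene, condition, map type),
# where `r` is the correlation between that map and the group gaze map.
rows = list(itertools.product(
    [f"s{i}" for i in range(1, 5)],   # four scenes (photospheres)
    ["active", "passive"],            # viewing condition
    ["salience", "meaning"],          # candidate model
))
df = pd.DataFrame(rows, columns=["scene", "condition", "map_type"])
df["r"] = [0.31, 0.44, 0.28, 0.52, 0.25, 0.47, 0.22, 0.49,
           0.35, 0.50, 0.30, 0.55, 0.27, 0.45, 0.24, 0.51]  # placeholder values

# Fixed effects: map type, condition, and their interaction;
# random intercept for each scene.
model = smf.mixedlm("r ~ map_type * condition", data=df, groups=df["scene"])
print(model.fit().summary())
```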