Fig. 1: Maze navigation task.

Subjects explored the pre-learned grid maze from an unknown initial state and were intermittently asked to predict the upcoming scene and to rate their confidence in that prediction. To perform the task successfully, subjects needed to infer their current state from the history of their actions and previously observed scenes. a A sample action trial sequence. At the beginning of each trial, subjects observed the scene from their current state (i.e., the status of the doors to the left, forward, and right) and then chose an action to move in one of those three directions. The doors were either open (passable; black) or closed (impassable; brown). Only an open door allowed the subject to move to the adjacent state in the grid and to see its scene. If a subject’s state inference (i.e., belief) was incorrect, the scene observed in the subsequent trial would differ from their prediction. Subjects performed 1–5 consecutive action trials between prediction trials. The 3 × 3 maze shown here is for illustration only; the maze used in the actual experiment was 5 × 5 (Supplementary Fig. S1a). b A sample prediction trial sequence. In a prediction trial, a fixation cross was displayed for 4–6 s (delay period), during which the subjects were asked to predict the upcoming scene. Next, the subjects reported their confidence in that prediction on a scale from 1 (low) to 4 (high). They were then asked to select their predicted scene from four options, consisting of the true scene and three distractor scenes. A green or red frame was presented around the selected scene to indicate a correct or incorrect choice, respectively. c Occurrence probabilities for each scene selected by subjects as the predicted upcoming scene. The maze contained seven scene types, one for each combination of the three door statuses except the dead end (i.e., three closed doors), which did not occur.
Center lines of the box plots indicate the medians, boxes indicate the lower and upper quartiles, and whiskers extend to 1.5× the interquartile range (IQR). Cross markers indicate outliers.
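
The count of seven scene types in panel c follows from simple enumeration: three doors with two statuses each give 2^3 = 8 combinations, and removing the dead end (all three doors closed) leaves seven. The Python sketch below only illustrates this arithmetic; the names `DOORS` and `enumerate_scenes` are hypothetical and are not taken from the study's materials.

```python
from itertools import product

# Door positions visible in each scene (labels chosen for illustration).
DOORS = ("left", "forward", "right")


def enumerate_scenes():
    """Return every open/closed door combination except the all-closed dead end."""
    scenes = []
    for status in product((True, False), repeat=len(DOORS)):
        # True = open (passable), False = closed (impassable).
        if any(status):  # keep scenes with at least one open door
            scenes.append(dict(zip(DOORS, status)))
    return scenes


scenes = enumerate_scenes()
print(len(scenes))  # 2**3 - 1 = 7
```

Each entry in `scenes` maps a door position to its open or closed status, so the list can serve as a lookup of the scene types that subjects could choose among.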