Extended Data Fig. 3: Details of ‘visual Turing tests’.
From: Emulating human-like adaptive vision for efficient and flexible machine visual perception

a, The full procedure of ‘visual Turing tests’. We first collect visual perception behaviours from real humans, the machine (AdaptiveNN), and random generation. We then construct multiple trials, each comprising a pair of perceptual behaviours. We consider three types of trials: i) human vs machine; ii) human vs human; and iii) human vs random, each corresponding to N trials (we use N=36), yielding 3N trials in total for each ‘visual Turing test’. Finally, these 3N trials are mixed and shuffled independently for every human judge (n=39). The participants are instructed only to identify the machine behaviour within each trial, across all three trial types i)-iii). The identification accuracy for each of i)-iii) is calculated per participant and then aggregated across participants. Consequently, i) yields the Turing test result, while ii) and iii) serve as randomized control baselines and also validate that our experimental setup is sound. b,c, Results of the visual Turing tests for visual fixation behaviours (b) and difficulty-assessment behaviours (c). Each data point represents the average identification accuracy of one human judge. Bars show the mean accuracy across judges with the corresponding 95% confidence interval. Ideal performance is 50%, at which machine behaviours are indistinguishable from human behaviours in these binary-choice tasks. Data points and their distributions (Gaussian kernel density estimates) are shown above the bars.
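The trial construction and per-judge scoring described in a can be sketched as follows. This is a minimal illustration only; the data structures, function names, and the representation of a "behaviour" are hypothetical, not the authors' actual code.

```python
import random

def build_trials(human, machine, rand, n=36):
    """Construct the 3N paired trials for one 'visual Turing test'.

    `human` must hold at least 2n behaviour samples (n for the
    human-vs-machine and human-vs-random trials, plus n more to pair
    against in the human-vs-human control). Each trial is a tuple
    (trial_type, behaviour_A, behaviour_B); behaviour objects are
    hypothetical placeholders here.
    """
    trials = (
        [("human_vs_machine", h, m) for h, m in zip(human[:n], machine[:n])]
        + [("human_vs_human", a, b) for a, b in zip(human[:n], human[n:2 * n])]
        + [("human_vs_random", h, r) for h, r in zip(human[:n], rand[:n])]
    )
    random.shuffle(trials)  # mixed and shuffled independently per judge
    return trials

def accuracy_by_type(judgements):
    """Per-participant identification accuracy for each trial type.

    `judgements` is a list of (trial_type, correct) pairs, where
    `correct` is True if the judge identified the machine behaviour.
    Returns {trial_type: accuracy}; these per-participant values are
    then aggregated (mean and 95% CI) across judges.
    """
    counts = {}
    for trial_type, correct in judgements:
        hits, total = counts.get(trial_type, (0, 0))
        counts[trial_type] = (hits + bool(correct), total + 1)
    return {t: hits / total for t, (hits, total) in counts.items()}
```

Under this scheme, an accuracy near 0.5 on the human-vs-machine trials means the judges perform at chance, i.e. the machine behaviours are indistinguishable from human ones, while the two control trial types calibrate what chance-level and easily separable performance look like in the same setup.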