Table 1 Overview of testing materials.

From: Testing AI on language comprehension tasks reveals insensitivity to underlying meaning

| Sample prompt | Number of featured entities | Target answer | Number of prompts per condition |
| --- | --- | --- | --- |
| John deceived Mary and Lucy was deceived by Mary. In this context, did Mary deceive Lucy? | 3 | Yes | 10 |
| Steve hugged Molly and Donna was kissed by Molly. In this context, was Molly kissed? | 3 | No | 10 |
| Jessica and Mary were kissed by Alice. Jessica was kissed by Samuel and Andrew was kissed by Mary. In this context, was Mary kissed? | 5 | Yes | 10 |
| Bob kissed Donna and Barbara kissed Peter. Donna was hugged by Alice. In this context, was Alice hugged? | 5 | No | 10 |