Table 4. Overall performance of humans and o1-preview on systematic thinking.

From: Comparative evaluation of OpenAI O1 and human performance in higher order cognition

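| ST instrument | Dimension | Human score (mean ± SD) | o1-preview score (mean ± SD) | Z-score |
|---|---|---|---|---|
| The Village of Abeesee | Problem identification | 1.62 ± 0.64 | 2.50 ± 0.62 | 1.38 |
| The Village of Abeesee | Information needs | 1.81 ± 0.52 | 2.90 ± 0.21 | 2.10 |
| The Village of Abeesee | Stakeholder awareness | 1.23 ± 0.99 | 2.95 ± 0.16 | 1.74 |
| The Village of Abeesee | Goals | 1.71 ± 0.62 | 2.90 ± 0.21 | 1.92 |
| The Village of Abeesee | Unintended consequences | 1.38 ± 0.58 | 2.65 ± 0.24 | 2.19 |
| The Village of Abeesee | Implemented challenges | 1.64 ± 0.57 | 2.70 ± 0.35 | 1.86 |
| The Village of Abeesee | Alignment | 1.71 ± 1.00 | 2.35 ± 0.41 | 0.64 |
| The Lake Urmia Vignette (LUV) | Variables | 10.95 ± 4.00 | 19.70 ± 1.57 | 2.19 |
| The Lake Urmia Vignette (LUV) | Causal links | 9.17 ± 3.97 | 23.30 ± 2.21 | 3.56 |
| The Lake Urmia Vignette (LUV) | Feedback loops | 0.16 ± 0.45 | 3.10 ± 1.10 | 6.53 |
| The Lake Urmia Vignette (LUV) | Total score | 20.08 ± 8.13 | 46.10 ± 4.12 | 3.20 |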
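
The table does not define how the Z-score column is computed. The reported values are consistent with standardizing the o1-preview mean against the human distribution, that is, (o1-preview mean − human mean) / human SD, rounded to two decimal places. This is an inference from the numbers, not a definition stated in the table. The sketch below checks that assumption against every row; the names `ROWS`, `z_score`, and `check_rows` are hypothetical and introduced only for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

# (dimension, human mean, human SD, o1-preview mean, reported Z-score),
# copied from Table 4. Values stay as strings so Decimal parses them exactly.
ROWS = [
    ("Problem identification",  "1.62",  "0.64", "2.50",  "1.38"),
    ("Information needs",       "1.81",  "0.52", "2.90",  "2.10"),
    ("Stakeholder awareness",   "1.23",  "0.99", "2.95",  "1.74"),
    ("Goals",                   "1.71",  "0.62", "2.90",  "1.92"),
    ("Unintended consequences", "1.38",  "0.58", "2.65",  "2.19"),
    ("Implemented challenges",  "1.64",  "0.57", "2.70",  "1.86"),
    ("Alignment",               "1.71",  "1.00", "2.35",  "0.64"),
    ("Variables",               "10.95", "4.00", "19.70", "2.19"),
    ("Causal links",            "9.17",  "3.97", "23.30", "3.56"),
    ("Feedback loops",          "0.16",  "0.45", "3.10",  "6.53"),
    ("Total score",             "20.08", "8.13", "46.10", "3.20"),
]


def z_score(human_mean, human_sd, o1_mean):
    """Assumed formula: (o1-preview mean - human mean) / human SD.

    Decimal arithmetic with half-up rounding avoids binary floating-point
    artifacts on exact ties such as 0.88 / 0.64 = 1.375.
    """
    z = (Decimal(o1_mean) - Decimal(human_mean)) / Decimal(human_sd)
    return z.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def check_rows(rows=ROWS):
    """Return (dimension, computed Z, reported Z) for each table row."""
    return [
        (name, z_score(h_mean, h_sd, o1_mean), Decimal(reported))
        for name, h_mean, h_sd, o1_mean, reported in rows
    ]


if __name__ == "__main__":
    for name, computed, reported in check_rows():
        print(f"{name:24s} computed={computed} reported={reported}")
```

Under this reading, a Z-score of 2.19 for Variables means the o1-preview mean sits about 2.19 human standard deviations above the human mean. The o1-preview SD does not enter the calculation.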