Large language models (LLMs), such as ChatGPT-o1, display subtle blind spots in complex reasoning tasks. We illustrate these pitfalls with lateral thinking puzzles and medical ethics scenarios. Our observations indicate that patterns in training data may contribute to cognitive biases, limiting the models’ ability to navigate nuanced ethical situations. Recognizing these tendencies is crucial for responsible AI deployment in clinical contexts.
Shelly Soffer, Vera Sorin, Eyal Klang