AI scientists powered by large language models and AI agents present both opportunities and risks in automated scientific discovery. Here, the authors examine the vulnerabilities of AI scientists, propose a risk taxonomy based on user intent and impact domains, and develop a triadic safeguarding framework emphasizing human regulation, agent alignment, and an understanding of environmental feedback.
- Xiangru Tang
- Qiao Jin
- Mark Gerstein