Large language models are starting to be used for safety-critical tasks such as controlling robots. Zhou et al. present LabSafety Bench, a benchmark that evaluates how well large language models identify hazards and assess risks in laboratory settings.
- Yujun Zhou
- Jingdong Yang
- Xiangliang Zhang