Nature Medicine

Table 2 Performance comparison stratified by question difficulty for the sleep professional examination

From: A personal health large language model for sleep and fitness coaching

Difficulty	Count	Expert	Gemini Ultra 1.0	PH-LLM
Easy (90–100%)	214	90%	94%	95%
Medium (75–90%)	204	81%	78%	80%
Hard (0–75%)	211	53%	55%	57%

Performance of PH-LLM is compared to that of Gemini Ultra 1.0 and human experts. Questions were classified as ‘Easy’, ‘Medium’ or ‘Hard’ based on the reported percentage range of human test takers who answered the corresponding questions correctly. Bold values indicate the highest performance.
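
The difficulty classification described in the caption can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `classify_difficulty` is hypothetical, and because the reported ranges share their endpoints (90% and 75%), the sketch assumes that a boundary value falls into the easier category.

```python
def classify_difficulty(percent_correct: float) -> str:
    """Map the percentage of human test takers who answered a question
    correctly to a difficulty label.

    Assumption (not stated in the source): thresholds are lower-inclusive,
    so exactly 90% counts as 'Easy' and exactly 75% counts as 'Medium'.
    """
    if not 0 <= percent_correct <= 100:
        raise ValueError("percent_correct must be between 0 and 100")
    if percent_correct >= 90:
        return "Easy"
    if percent_correct >= 75:
        return "Medium"
    return "Hard"


if __name__ == "__main__":
    # Illustrative inputs only; these are not values from the study.
    for pct in (97.0, 82.5, 40.0):
        print(pct, classify_difficulty(pct))
```

If the examination data assign boundary values differently, only the two comparison operators need to change.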
