Showing 1–3 of 3 results
Author: Suhana Bedi
  • MedHELM, an extensible evaluation framework including a new taxonomy for classifying medical tasks and a benchmark of many datasets across these categories, enables the evaluation of large language models on real-world clinical tasks.

    • Suhana Bedi
    • Hejie Cui
    • Nigam H. Shah
    Research
    Nature Medicine
    P: 1-9
  • Although large language models (LLMs) show promise in controlled settings, a study now exposes their limitations in real-world clinical applications and points the way towards robust evaluation and benchmarking before clinical use.

    • Suhana Bedi
    • Sneha S. Jain
    • Nigam H. Shah
    News & Views
    Nature Medicine
    Volume: 30, P: 2409-2410