Table 1 Strategies to mitigate the impact of hallucinations in large language models (LLMs)

From: The long but necessary road to responsible use of large language models in healthcare research

| Strategy | Description |
| --- | --- |
| Pre-defined purpose | Where possible, LLMs should be tailored for specific use cases (e.g. information abstraction from pathology reports) to ensure that end-users clearly understand their intended applications and limitations. |
| High-quality data | LLMs should be trained on domain-specific, representative, and factually accurate data. These data should not be limited to publicly available sources; efforts should also be made to include content behind paywalls or member-only subscriptions. |
| Data templates | Data templates facilitate data consistency and clarity, which may mitigate the risk of generating incorrect outputs (a minimal sketch follows the table). |
| Chain-of-verification | Chain-of-verification incorporates a structured approach for LLMs to verify each output against a reliable data source (e.g. the original pathology report) before finalization. This process enables LLMs to detect and correct any inconsistencies in their initial outputs (a minimal sketch follows the table). |
| Degree of uncertainty | Indicating the LLM’s confidence in its output allows end-users to better assess the reliability of the information provided and determine whether additional verification is required (a minimal sketch follows the table). |
| Response restrictions | Establishing “safe” boundaries for possible LLM outputs may mitigate the risk of generating incorrect or biased responses (a minimal sketch follows the table). |
| Human in the loop | Human oversight contributes domain-specific knowledge and awareness of social constructs when assessing LLM outputs, and serves as the final safeguard against hallucinations prior to their intended use. |
| Updates | Processes should be established to continuously evaluate the accuracy and appropriateness of LLM responses, with updates provided as needed to ensure outputs remain aligned with current knowledge. |

  1. Adapted from refs. 4,11,14,15.
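
To make the “Data templates” row concrete, the sketch below shows one way a template might constrain what an LLM is allowed to return when abstracting a pathology report. It is a minimal Python illustration, and the field names, allowed values, and validation rule are hypothetical rather than taken from the article.

```python
from dataclasses import dataclass, asdict
from typing import Optional

# Hypothetical pathology-report template: field names and allowed values are
# illustrative only, not drawn from the article.
ALLOWED_STAGES = {"pT1", "pT2", "pT3", "pT4", "unknown"}

@dataclass
class PathologyAbstraction:
    tumour_site: Optional[str] = None       # free text, copied verbatim from the report
    histologic_type: Optional[str] = None   # free text, copied verbatim from the report
    pt_stage: str = "unknown"               # must be one of ALLOWED_STAGES
    margins_involved: Optional[bool] = None # True / False / None if not stated

    def validate(self) -> list[str]:
        """Return a list of template violations instead of silently accepting them."""
        problems = []
        if self.pt_stage not in ALLOWED_STAGES:
            problems.append(f"pt_stage '{self.pt_stage}' is not in the allowed set")
        return problems

# An LLM prompted to fill this template can only return these fields, which makes
# missing or fabricated values easier to detect downstream.
record = PathologyAbstraction(tumour_site="sigmoid colon", pt_stage="pT3")
assert record.validate() == []
print(asdict(record))
```

Constraining outputs to a fixed schema does not prevent hallucinated field values, but it makes them detectable and auditable.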
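The “Chain-of-verification” row can be sketched as a four-step loop: draft an answer, plan verification questions, answer them strictly against the source report, and revise. The generic `llm(prompt) -> str` callable and the prompt wording below are assumptions for illustration, not a prescribed implementation.

```python
from typing import Callable

def chain_of_verification(llm: Callable[[str], str], source_report: str, task: str) -> str:
    """Minimal chain-of-verification loop around a hypothetical `llm(prompt) -> str` callable.

    The prompts and the single revision pass are illustrative; a production pipeline
    would add output parsing, retries, and logging.
    """
    # 1. Draft an initial answer from the source document.
    draft = llm(f"{task}\n\nSource report:\n{source_report}")

    # 2. Ask the model to plan verification questions about its own draft.
    questions = llm(
        "List short factual questions that would verify every claim in this draft, "
        f"one per line:\n{draft}"
    )

    # 3. Answer each question using only the original report.
    checks = [
        llm(
            "Answer using ONLY the source report below. If it is not stated, say 'not stated'.\n"
            f"Question: {q}\nSource report:\n{source_report}"
        )
        for q in questions.splitlines() if q.strip()
    ]

    # 4. Revise the draft so it is consistent with the verified answers.
    return llm(
        "Revise the draft so that it agrees with the verified answers and drops any "
        f"unsupported claims.\nDraft:\n{draft}\nVerified answers:\n" + "\n".join(checks)
    )
```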
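For the “Degree of uncertainty” row, one simple (and admittedly crude) way to attach a confidence signal is self-consistency: query the model several times and report how often the modal answer recurs. The `sample` callable, sample count, and review threshold below are illustrative assumptions; low-agreement outputs are flagged for the human-in-the-loop step described in the table.

```python
from collections import Counter
from typing import Callable

def answer_with_confidence(sample: Callable[[], str], n_samples: int = 5,
                           review_threshold: float = 0.8) -> dict:
    """Crude self-consistency confidence estimate.

    `sample` is a hypothetical zero-argument callable that re-queries the LLM
    with the same prompt at a temperature above zero.
    """
    answers = [sample().strip() for _ in range(n_samples)]
    top_answer, count = Counter(answers).most_common(1)[0]
    confidence = count / n_samples
    return {
        "answer": top_answer,
        "confidence": confidence,
        # Low-agreement outputs are routed to human review rather than used directly.
        "needs_human_review": confidence < review_threshold,
    }
```

Agreement across samples is not a calibrated probability, but it gives end-users a concrete signal for deciding when additional verification is required.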
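For the “Response restrictions” row, a lightweight guardrail can map free-text output onto a closed set of allowed labels and convert anything outside that boundary into an explicit abstention. The label set below is hypothetical.

```python
import re

ALLOWED_RESPONSES = {"positive", "negative", "not stated"}  # illustrative label set

def restrict_response(raw_output: str) -> str:
    """Map a free-text LLM output onto a closed set of allowed labels.

    Anything outside the 'safe' boundary is replaced with an explicit abstention
    rather than being passed downstream as fact.
    """
    cleaned = re.sub(r"[^a-z ]", "", raw_output.lower()).strip()
    return cleaned if cleaned in ALLOWED_RESPONSES else "out of scope - needs review"

print(restrict_response("Negative."))         # -> "negative"
print(restrict_response("Probably benign?"))  # -> "out of scope - needs review"
```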