Table 2 Model card: OncoLLM

From: PRISM: Patient Records Interpretation for Semantic clinical trial Matching system using large language models

Model details
Developers	OncoLLM was developed by Triomics Research
Model date	March 2024
Model version	1.0
Model type	Large language model (LLM)
Training approach	Supervised fine-tuning (SFT) on a combination of synthetic and real-world data from a single cancer center’s oncology electronic health record (EHR) datasets
Paper/resource	N/A
License	CC BY-NC-ND 4.0
Contact	hrituraj@triomics.com
Intended Use
Primary uses	Providing medical explanations and reference evidence within EHR records, assisting healthcare professionals in querying EHR systems for patient-related information, and supporting clinical decision-making processes by answering oncology-specific questions.
Primary users	Healthcare professionals, researchers, and developers in the oncology domain.
Out-of-scope Uses	Direct patient care decisions and legal or regulatory decision-making processes.
Factors
Relevant factors	Demographic or phenotypic groups, technical attributes specific to oncology EHR datasets.
Evaluation factors	Model performance across various demographic groups, fairness considerations in predictive outcomes.
Metrics	Accuracy on binary questions, ranking on outputs, and accuracy of explanations and evidence citations
Evaluation data	Historical trial enrollment data and patient charts manually annotated on trial-criteria questions
Datasets	Single cancer center’s oncology EHR datasets.
Motivation	To evaluate the model’s performance in providing accurate and relevant responses to oncology-specific queries within EHR records to match patients to clinical trials.
Preprocessing	Data preprocessing involved cleaning, anonymization, and formatting of EHR records for model training and evaluation.
Training Data	Several thousand question-chunk pairs, manually annotated for citation, explanation, and final answer
Unitary results	Model performance on individual question-answering tasks within oncology EHR records.
Intersectional results	Analysis of model performance across different phenotypic groups within the dataset.
Considerations	Ensured patient privacy and data confidentiality by excluding identifiable patient data from training. Upheld integrity in reporting model performance by using separate datasets for evaluation.
Caveats and Recommendations
Considerations	Interpret model results with caution, considering potential biases inherent in the training data and limitations in generalizability. Use the model as a supportive tool in clinical decision-making processes, with human validation and oversight.
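
The "Accuracy on binary questions" metric listed in the table can be illustrated with a minimal Python sketch. It assumes that each trial-criteria question has a yes/no model answer and a yes/no manual annotation. The function name and the sample data are hypothetical and are not taken from the OncoLLM evaluation code.

```python
from typing import Sequence


def binary_accuracy(predicted: Sequence[bool], annotated: Sequence[bool]) -> float:
    """Fraction of yes/no answers that match the manual annotation."""
    if len(predicted) != len(annotated):
        raise ValueError("predicted and annotated must have the same length")
    if not predicted:
        raise ValueError("at least one question is required")
    # Count the positions where the model answer equals the annotation.
    matches = sum(p == a for p, a in zip(predicted, annotated))
    return matches / len(predicted)


# Hypothetical answers to four trial-criteria questions (illustrative data only).
model_answers = [True, False, True, True]
chart_annotations = [True, False, False, True]
accuracy = binary_accuracy(model_answers, chart_annotations)
print(f"accuracy = {accuracy:.2f}")  # prints "accuracy = 0.75"
```

The ranking and citation metrics in the table would need their own definitions, which the model card does not specify, so they are not sketched here.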
