Table 2 Model card: OncoLLM
Model details | |
---|---|
Developers | OncoLLM was developed by Triomics Research |
Model date | March 2024 |
Model version | 1.0 |
Model type | Large language model (LLM) |
Training approach | Supervised fine-tuning (SFT) on a combination of synthetic and real-world data from a single cancer center’s oncology electronic health record (EHR) datasets |
Paper/resource | N/A |
License | CC BY-NC-ND 4.0 |
Contact | hrituraj@triomics.com |
Intended Use | |
Primary uses | Providing medical explanations with supporting evidence drawn from EHRs, assisting healthcare professionals in querying EHR systems for patient-related information, and supporting clinical decision-making by answering oncology-specific questions. |
Primary users | Healthcare professionals, researchers, and developers in the oncology domain. |
Out-of-scope Uses | Direct patient care decisions and legal or regulatory decision-making processes. |
Factors | |
Relevant factors | Demographic or phenotypic groups; technical attributes specific to oncology EHR datasets. |
Evaluation factors | Model performance across demographic groups; fairness considerations in predictive outcomes. |
Metrics | Accuracy on binary questions, ranking of outputs, and accuracy of explanations and evidence citations. |
Evaluation data | Historical trial enrollment data and patient charts manually annotated for trial-criteria questions. |
Datasets | Single cancer center’s oncology EHR datasets. |
Motivation | To evaluate the model’s ability to give accurate and relevant responses to oncology-specific queries over EHRs, in order to match patients to clinical trials. |
Preprocessing | EHRs were cleaned, anonymized, and formatted for model training and evaluation. |
Training Data | Several thousand question-chunk pairs, manually annotated for citation, explanation, and final answer. |
Unitary results | Model performance on individual question-answering tasks over oncology EHRs. |
Intersectional results | Analysis of model performance across different phenotypic groups within the dataset. |
Ethical considerations | Ensured patient privacy and data confidentiality by excluding identifiable patient data from training. Upheld integrity in reporting model performance by using separate datasets for evaluation. |
Caveats and Recommendations | |
Considerations | Interpret model results with caution, considering potential biases inherent in the training data and limitations in generalizability. Use the model as a supportive tool in clinical decision-making processes, with human validation and oversight. |