Fig. 3: t-SNE visualization of embeddings generated using four language models for the pathology reports. | npj Digital Medicine

From: HONeYBEE: enabling scalable multimodal AI in oncology through foundation model-driven embeddings

Each point represents a patient, colored by TCGA cancer type. Fine-tuned models show clearer cluster separation than their pre-trained counterparts, and the adjusted mutual information (AMI) score rises for every model (pre-trained → fine-tuned): GatorTron 0.35 → 0.91, Qwen3 0.39 → 0.93, Med-Gemma 0.26 → 0.93, Llama 0.32 → 0.93. These results indicate improved cancer type discrimination after LLM fine-tuning.
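For readers unfamiliar with the metric, the sketch below shows how an AMI score compares cluster assignments with cancer type labels. It is a minimal pure-Python illustration and not the authors' pipeline: the label lists are hypothetical toy data, the arithmetic-mean normalization is an assumption, and the way clusters were derived from the embeddings is not stated in this caption. In practice, scikit-learn's `adjusted_mutual_info_score` computes the same quantity.

```python
from collections import Counter
from math import exp, lgamma, log


def entropy(labels):
    """Shannon entropy (in nats) of a labeling."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(true_labels, cluster_labels):
    """Mutual information between two labelings of the same items."""
    n = len(true_labels)
    a = Counter(true_labels)
    b = Counter(cluster_labels)
    joint = Counter(zip(true_labels, cluster_labels))
    return sum(
        (nij / n) * log(n * nij / (a[u] * b[v]))
        for (u, v), nij in joint.items()
    )


def expected_mutual_information(true_labels, cluster_labels):
    """Expected MI of two random labelings with the same group sizes."""
    n = len(true_labels)
    emi = 0.0
    for ai in Counter(true_labels).values():
        for bj in Counter(cluster_labels).values():
            for nij in range(max(1, ai + bj - n), min(ai, bj) + 1):
                # Log of the hypergeometric probability of an overlap of size nij.
                log_prob = (
                    lgamma(ai + 1) + lgamma(bj + 1)
                    + lgamma(n - ai + 1) + lgamma(n - bj + 1)
                    - lgamma(n + 1) - lgamma(nij + 1)
                    - lgamma(ai - nij + 1) - lgamma(bj - nij + 1)
                    - lgamma(n - ai - bj + nij + 1)
                )
                emi += (nij / n) * log(n * nij / (ai * bj)) * exp(log_prob)
    return emi


def adjusted_mutual_information(true_labels, cluster_labels):
    """AMI with arithmetic-mean normalization (an assumption of this sketch)."""
    if len(set(true_labels)) == 1 and len(set(cluster_labels)) == 1:
        return 1.0  # both labelings are a single group: treat as identical
    mi = mutual_information(true_labels, cluster_labels)
    emi = expected_mutual_information(true_labels, cluster_labels)
    mean_entropy = 0.5 * (entropy(true_labels) + entropy(cluster_labels))
    return (mi - emi) / (mean_entropy - emi)


# Hypothetical toy data: six patients, three TCGA cancer types.
cancer_types = ["BRCA", "BRCA", "LUAD", "LUAD", "KIRC", "KIRC"]
clusters_separated = [0, 0, 1, 1, 2, 2]  # clusters match cancer types
clusters_mixed = [0, 1, 2, 0, 1, 2]      # clusters cut across cancer types

perfect = adjusted_mutual_information(cancer_types, clusters_separated)
mixed = adjusted_mutual_information(cancer_types, clusters_mixed)
print(f"separated: {perfect:.2f}, mixed: {mixed:.2f}")
```

AMI corrects mutual information for chance agreement, so a score near 1 means clusters line up with cancer types, while a score near 0 or below means the agreement is no better than random.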
