Table 1 Comparison of model characteristics and performance
From: Multimodal AI for Yuan Buddhist sculpture chronology and style
Model | Parameter Size (approx.) | Context Length | Open Source | Characteristics |
|---|---|---|---|---|
ChronoStyleNet | ~7B (Qwen-VL base) | ~32k tokens (vision + text) | Fully open-source | Domain-specific, fine-tuned for Buddhist sculpture classification tasks. |
GPT-4o (OpenAI) | Undisclosed | 128k tokens (text) | Not open-source, only accessible via API | Proprietary multimodal model, general-purpose with strong vision-language reasoning. |
Claude 3.5 Sonnet (Anthropic) | Undisclosed | 200k+ tokens (text) | Not open-source, only accessible via API | High-performance generalist with long-context understanding and reasoning. |
Gemini 1.5 Pro (Google) | Undisclosed | 1M tokens (text, experimental) | Not open-source, only accessible via API | Strong code, image, and document reasoning; limited academic access. |
LLaMA 3.3 70B (Meta) | 70B | 8k–128k tokens | Open weights (under the Llama 3.3 Community License) | Openly available, large-scale general model, fine-tuning friendly. |
Grok 3 Beta (xAI) | Undisclosed | Unknown | Not open-source, proprietary | Tesla-integrated, web-connected; limited documentation. |