Table 4 Inference speed comparison of various VLMs across different deployment environments.

From: Multitasking vision language models for vehicle plate recognition with VehiclePaliGemma

| Method | Deployment | Speed (s) |
|---|---|---|
| Moondream2 | Local | 0.09 |
| LLaVA-NeXT-7b | Local | 0.84 |
| VILA | Local | 0.35 |
| Gemini 1.5 Flash | API | 1.65 |
| GPT-4o-mini | API | 1.7 |
| LLaVA-NeXT-34b | Local | 10 |
| Gemini 1.5 Pro | API | 1.85 |
| GPT-4o | API | 1.6 |
| Llama 3.2 | Local | 0.42 |
| Claude 3.5 Sonnet | API | 1.8 |
| Fine-tuned PaliGemma | Local | 0.135 |

Note: The table presents both locally deployed models and API-based models, with inference times measured in seconds; lower values indicate faster inference.
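As a quick way to compare the figures, the per-image latencies above can be inverted to approximate throughput (images per second). This is only a rough sketch: it ignores batching, warm-up, and network overhead for the API-based models, and the model names and values are taken directly from the table.

```python
# Per-image inference latency in seconds, as reported in Table 4.
latencies = {
    "Moondream2": 0.09,
    "Fine-tuned PaliGemma": 0.135,
    "VILA": 0.35,
    "Llama 3.2": 0.42,
    "LLaVA-NeXT-7b": 0.84,
    "GPT-4o": 1.6,
    "Gemini 1.5 Flash": 1.65,
    "GPT-4o-mini": 1.7,
    "Claude 3.5 Sonnet": 1.8,
    "Gemini 1.5 Pro": 1.85,
    "LLaVA-NeXT-34b": 10,
}

# Throughput = 1 / latency: lower latency means more images per second.
throughput = {model: 1.0 / t for model, t in latencies.items()}

for model, tput in sorted(throughput.items(), key=lambda kv: -kv[1]):
    print(f"{model:22s} {tput:6.2f} images/s")
```

For example, the fine-tuned PaliGemma at 0.135 s/image corresponds to roughly 7.4 images per second, versus about 0.54 images per second for Gemini 1.5 Pro.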