Focsle

SmolVLM-3B

by HuggingFace-Edge

Compact VLM that comfortably runs on a Jetson Orin Nano. The default 'on-device assistant for cameras' pick.

Multimodal · Apache-2.0 · INT8 · MIXED · vlm · smol · captioning
184K downloads · 6.4K deployments · Updated Apr 9, 2028
Headline: 142ms · NVIDIA Jetson Orin Nano · INT8

Discussion

Issues, PRs, and methodology questions on SmolVLM-3B. Chip-vendor authors and reference-set maintainers are auto-pinged on threads tagged with their target.

28

INT4 calibration recipe — overshoot on long-tail classes?

The published INT4 calibration set is heavily weighted toward head classes. We’re seeing a 6.4% accuracy drop on long-tail traffic categories vs. the FP16 baseline; switching to a stratified calibration sample fully recovers it. Recipe attached.

Marcus Chen · 14 replies · updated 2h ago
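The attached recipe isn’t reproduced here; a minimal sketch of the stratified-sampling idea it describes (the function name and the `per_class` budget are illustrative assumptions, not values from the recipe):

```python
import random
from collections import defaultdict

def stratified_calibration_sample(dataset, labels, per_class=32, seed=0):
    """Draw an equal per-class budget of calibration samples so that
    long-tail categories are represented in the INT4 calibration set
    instead of being swamped by head classes."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for item, label in zip(dataset, labels):
        by_class[label].append(item)
    sample = []
    for label, items in by_class.items():
        k = min(per_class, len(items))          # small classes contribute all they have
        sample.extend(rng.sample(items, k))     # sample without replacement
    rng.shuffle(sample)
    return sample
```

The point of the equal per-class budget is exactly the failure mode described above: a frequency-weighted calibration set lets the quantizer fit its ranges to head-class activations, and the tail pays for it.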
41

Hailo-10H performance on multi-stream — 4x1080p hits a wall at 38 FPS

Posting matched-pair numbers. We see 38.2 FPS sustained across 4 1080p streams on the Hailo-10H with the published HEF — the bottleneck is the post-processing thread, not the NPU. PR with a fused NMS path coming.

Hailo · 8 replies · updated 1d ago
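The fused-NMS PR isn’t posted yet; for context, the kind of greedy NMS that typically dominates a CPU post-processing thread in multi-stream setups looks roughly like this (NumPy sketch assuming `[x1, y1, x2, y2]` box layout — not the PR’s fused path):

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression: keep the highest-scoring box,
    drop everything overlapping it above iou_thresh, repeat."""
    order = scores.argsort()[::-1]              # indices, best score first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection of box i with all remaining boxes
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + areas - inter)
        order = rest[iou <= iou_thresh]         # survivors for the next round
    return keep
```

Even vectorized, this loop runs once per stream per frame on the host, which is consistent with the post-processing thread (not the NPU) being the 38 FPS ceiling.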
12

ONNX export breaks dynamic shapes >1024

Repro: any input axis above 1024 fails the symbolic shape inference pass. The interim workaround is to pin the input shape before export; will open an issue upstream.

Dr. Anika Patel · 5 replies · updated 3d ago
9

Proposed: split FP16 and MIXED variants into separate model entries

The MIXED variant has materially different downstream behavior on transformer-class chips (latency goes down, accuracy retention is +0.6pp). Splitting into a separate model card would let it compete on its own merits in the catalog.

EdgeML Collective · 22 replies · updated 1w ago