SmolVLM-3B

Compact VLM that comfortably runs on a Jetson Orin Nano. The default 'on-device assistant for cameras' pick.

Multimodal · Apache-2.0 · INT8 · MIXED · vlm · smol · captioning

184K downloads · 6.4K deployments · Updated Apr 9, 2028

Headline: 142 ms · NVIDIA Jetson Orin Nano · INT8

Deploy SmolVLM-3B

Pick a chip family. We hand you the artifacts (HEF, TRT engine, Core ML, ONNX) plus a one-click endpoint deploy. For private endpoints, on-prem deploy, or air-gapped distribution, see Enterprise.

NVIDIA Jetson Orin Nano

# Build a TensorRT engine
$ focsle pull huggingface-edge/smolvlm-3b --target jetson-orin-nano
$ focsle build trt --plan smolvlm-3b.plan \
    --precision mixed \
    --workspace 4G

# Run with TensorRT
# `frame` is your input image; loading it is not shown here
import focsle.runtime as fr
m = fr.load("smolvlm-3b.plan", target="trt")
out = m.run(frame)
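
To check the headline latency on your own board, you can time the run call yourself. The sketch below is a generic harness and not part of the focsle SDK: `measure_latency_ms` and `fake_run` are hypothetical names, and `fake_run` only stands in for `m.run` so the example runs without a device. On real hardware you would pass `m.run` and a real frame instead.

```python
import statistics
import time


def measure_latency_ms(run, frame, warmup=3, iters=20):
    """Time run(frame) and return (median, worst) latency in milliseconds."""
    # Warm-up calls are discarded so one-time setup cost does not skew results.
    for _ in range(warmup):
        run(frame)

    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run(frame)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples), max(samples)


def fake_run(frame):
    """Hypothetical stand-in for m.run; sleeps briefly instead of running a model."""
    time.sleep(0.002)
    return "caption"


# Assumption: a dummy byte buffer replaces a real camera frame here.
median_ms, worst_ms = measure_latency_ms(fake_run, frame=b"\x00" * 16)
print(f"median {median_ms:.1f} ms, worst {worst_ms:.1f} ms")
```

Reporting the median alongside the worst case keeps a single slow run from hiding in, or dominating, the result.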

One-click endpoint

Spins up a managed endpoint in the closest region. Available on Pro plans and above.

Or deploy yourself

Docs · NVIDIA backend
SDK on GitHub
CLI install