Florence-2-Edge


Florence-2 distilled for edge inspection use cases: captioning, detection, and segmentation from a unified head.

Multimodal · MIT · INT8 · FP16 · vlm · unified · industrial

76K downloads · 2.8K deployments · Updated Mar 2, 2028

Headline: 84 ms · NVIDIA Jetson Orin Nano · INT8


Deploy Florence-2-Edge

Pick a chip family. We hand you the artifacts (HEF, TRT engine, Core ML, ONNX) plus a one-click endpoint deploy. For private endpoints, on-prem deploy, or air-gapped distribution, see Enterprise.

NVIDIA Jetson Orin Nano

# Build a TensorRT engine
$ focsle pull microsoft/florence-2-edge --target jetson-orin-nano
$ focsle build trt --plan florence-2-edge.plan \
    --precision fp16 \
    --workspace 4G

# Run with TensorRT
import focsle.runtime as fr
m = fr.load("florence-2-edge.plan", target="trt")
# frame: an input image supplied by your capture pipeline
out = m.run(frame)
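
To compare on-device latency against the headline figure, a small timing helper can wrap the run call. The sketch below is illustrative and not part of the SDK: `median_latency_ms` and `fake_run` are hypothetical names, and it assumes only that the loaded model exposes a blocking `run(frame)` callable as in the snippet above. If `run` returns before inference has finished, the timings will understate the real latency.

```python
import statistics
import time
from typing import Any, Callable, Iterable


def median_latency_ms(
    run: Callable[[Any], Any],
    frames: Iterable[Any],
    warmup: int = 3,
) -> float:
    """Return the median wall-clock latency of run(frame) in milliseconds."""
    frames = list(frames)
    if len(frames) <= warmup:
        raise ValueError("need more frames than warmup iterations")

    # Warm-up calls are executed but not timed, since the first calls
    # often include one-off setup cost.
    for frame in frames[:warmup]:
        run(frame)

    timings = []
    for frame in frames[warmup:]:
        start = time.perf_counter()
        run(frame)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


if __name__ == "__main__":
    # Hypothetical stand-in for m.run so the sketch runs without hardware.
    def fake_run(frame):
        time.sleep(0.002)
        return frame

    latency = median_latency_ms(fake_run, range(13))
    print(f"median latency: {latency:.1f} ms")
```

On a device, pass `m.run` and a list of representative frames in place of `fake_run` and `range(13)`. The median is used rather than the mean so that an occasional slow call does not dominate the result.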

One-click endpoint

Spins up a managed endpoint in the closest region. Available on Pro plans and above.

Or deploy yourself

Docs · NVIDIA backend
SDK on GitHub
CLI install