by Apple
MobileCLIP-B
Larger MobileCLIP. Better recall, twice the compute.
Multimodal · Apache-2.0 · FP16 · clip · embedding
142K downloads · 8.2K deployments · Updated Feb 4, 2028
Headline: 9.8 ms · Apple Neural Engine (M4) · FP16
About this model
Larger MobileCLIP. Better recall, twice the compute.
Authored by apple. Curated into the Fo’c’sle reference set on 2028-02-04. All cross-chip benchmarks below were collected in matched-pair runs in the HIL lab using the same input pipeline, same upstream preprocessing, and the same downstream consumer. See the methodology page for the full protocol.
- Task: Multimodal
- Parameters: 86.3 M
- Benchmarked on: 5 chips
- Deployments: 8.2K
Architecture
Vision-text dual encoder
Inferred from upstream weights · simplified
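The dual-encoder layout means images and text are embedded separately and compared by cosine similarity. The sketch below shows that comparison with random stand-in embeddings; the embedding dimension (512) and logit temperature (100) are assumptions for illustration, not values taken from this card:

```python
import numpy as np

DIM = 512  # assumed embedding dimension, not confirmed by the card

def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Project embeddings onto the unit sphere so dot product = cosine similarity."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def similarity(image_emb: np.ndarray, text_emb: np.ndarray,
               temperature: float = 100.0) -> np.ndarray:
    """CLIP-style logits: temperature-scaled cosine similarity
    between every image embedding and every text embedding."""
    return temperature * l2_normalize(image_emb) @ l2_normalize(text_emb).T

rng = np.random.default_rng(0)
image_emb = rng.standard_normal((2, DIM))  # stand-in for the vision tower's output
text_emb = rng.standard_normal((3, DIM))   # stand-in for the text tower's output
logits = similarity(image_emb, text_emb)
print(logits.shape)  # one score per image-text pair
```

In a real deployment the two towers run independently, so a gallery of text (or image) embeddings can be precomputed once and reused across queries.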
Headline benchmarks
Training data
Pretrained on the upstream maintainer’s released checkpoint. Edge-distillation pass uses 2.4M frames from the Fo’c’sle distillation corpus (consented public data + opt-in publisher contributions). Quantization-aware fine-tune uses 320K calibration samples drawn from the target task’s eval domain.
- Pretraining corpus: upstream maintainer release
- Distillation corpus: 2,400,000 frames
- Calibration set: 320,000 samples (per task)
- Eval set: standard benchmark + matched-pair HIL runs
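The card does not specify the quantization scheme behind the calibration pass, but a common approach the 320K-sample set could feed is symmetric per-tensor calibration: take the activation absmax over the calibration data and derive a scale from it. A minimal sketch under that assumption (all names and sizes here are illustrative):

```python
import numpy as np

def per_tensor_scale(calibration_batches, num_levels: int = 127) -> float:
    """Symmetric per-tensor scale: absmax over the calibration set
    divided by the top integer level. A generic sketch; the actual
    scheme used for this model is not documented on the card."""
    absmax = max(float(np.max(np.abs(b))) for b in calibration_batches)
    return absmax / num_levels

def quantize_dequantize(x: np.ndarray, scale: float, num_levels: int = 127) -> np.ndarray:
    """Round to the integer grid, clip to the representable range, map back."""
    q = np.clip(np.round(x / scale), -num_levels, num_levels)
    return q * scale

rng = np.random.default_rng(1)
calib = [rng.standard_normal((320, 16)) for _ in range(10)]  # stand-in calibration batches
scale = per_tensor_scale(calib)
x = calib[0]
err = float(np.mean(np.abs(quantize_dequantize(x, scale) - x)))
print(scale, err)  # round-trip error is bounded by scale / 2 per element
```

Drawing calibration samples from the target task's eval domain, as the card describes, keeps the observed absmax (and hence the scale) representative of the activations seen at deployment time.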