Apple
Apple Neural Engine (M4)
Reference target for on-device foundation-model workloads; benchmarks via Core ML.
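The tables on this page rank models by latency p50, i.e. the median time of a single inference. The page does not describe its harness, so the following is only a minimal sketch of how a median latency could be measured. The helper name `p50_latency_ms`, the warm-up count, and the iteration count are all assumptions made for illustration, not the methodology behind these numbers.

```python
import statistics
import time
from typing import Callable


def p50_latency_ms(run_once: Callable[[], object],
                   warmup: int = 10,
                   iters: int = 100) -> float:
    """Return the median wall-clock latency of run_once, in milliseconds.

    Hypothetical helper: warmup and iters are illustrative defaults.
    """
    # Warm-up calls are discarded so one-time costs (model load, caches)
    # do not skew the median.
    for _ in range(warmup):
        run_once()

    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)

    return statistics.median(samples)


if __name__ == "__main__":
    # Stand-in workload. On macOS, run_once could instead wrap a
    # coremltools MLModel.predict(...) call on a prepared input dict.
    print(f"p50: {p50_latency_ms(lambda: sum(range(10_000))):.3f} ms")
```

The median is less sensitive than the mean to occasional slow runs, which is presumably why a p50 figure is reported.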
Top models on this chip
Object detection
Sorted by latency p50 · matched-pair runs on Apple Neural Engine (M4)
| # | Model | Quant | Latency p50 | Throughput | Acc. | Power |
|---|---|---|---|---|---|---|
| 1 | YOLO27-Edge | FP16 | 19.1 ms | 52 FPS | 99.9% | 7.1 W |
| 2 | RT-DETR-Edge | FP16 | 19.6 ms | 51 FPS | 99.8% | 6.8 W |
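The FPS figures in these tables appear to match 1000 divided by the p50 latency in milliseconds, which is what single-stream (batch size 1) execution would give. The sketch below checks that relationship for the two object-detection rows and also derives a rough energy-per-inference estimate. Both function names are hypothetical, and the energy figure assumes the Power column is the average draw over the whole inference, which the page does not state.

```python
def fps_from_latency(latency_ms: float) -> float:
    """Frames per second implied by a per-frame latency, assuming batch size 1."""
    return 1000.0 / latency_ms


def energy_per_inference_mj(latency_ms: float, power_w: float) -> float:
    """Rough energy per inference in millijoules (watts x milliseconds).

    Assumes power_w is the average draw for the full duration of one inference.
    """
    return power_w * latency_ms


# (model, latency p50 in ms, power in W), copied from the object-detection table
rows = [
    ("YOLO27-Edge", 19.1, 7.1),
    ("RT-DETR-Edge", 19.6, 6.8),
]

for name, latency_ms, power_w in rows:
    fps = fps_from_latency(latency_ms)
    energy = energy_per_inference_mj(latency_ms, power_w)
    print(f"{name}: {fps:.1f} FPS, {energy:.1f} mJ per inference")
```

Under these assumptions the slightly slower RT-DETR-Edge comes out marginally cheaper per inference (about 133 mJ versus about 136 mJ), because its lower power draw more than offsets its extra half millisecond. The ASR tables report tok/s, so the FPS relationship does not apply there.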
Depth
Sorted by latency p50 · matched-pair runs on Apple Neural Engine (M4)
| # | Model | Quant | Latency p50 | Throughput | Acc. | Power |
|---|---|---|---|---|---|---|
| 1 | MiDaS-Distilled-S | FP16 | 15.9 ms | 63 FPS | 99.4% | 7.4 W |
| 2 | ZoeDepth-Mobile | FP16 | 22.7 ms | 44 FPS | 99.5% | 6.4 W |
| 3 | Depth-Anything-Edge | MIXED | 29.2 ms | 34 FPS | 98.7% | 6.9 W |
Segmentation
Sorted by latency p50 · matched-pair runs on Apple Neural Engine (M4)
| # | Model | Quant | Latency p50 | Throughput | Acc. | Power |
|---|---|---|---|---|---|---|
| 1 | Mask2Former-Mobile | FP16 | 36.3 ms | 28 FPS | 99.6% | 7.3 W |
| 2 | SAM-3-Distilled | FP16 | 55.9 ms | 18 FPS | 99.5% | 7.7 W |
Embedded ASR
Sorted by latency p50 · matched-pair runs on Apple Neural Engine (M4)
| # | Model | Quant | Latency p50 | Throughput | Acc. | Power |
|---|---|---|---|---|---|---|
| 1 | Whisper-Edge-Tiny | INT8 | 203.5 ms | 31 tok/s | 97.7% | 6.8 W |
| 2 | Distil-Whisper-Tiny | FP16 | 333.5 ms | 19 tok/s | 99.8% | 7.3 W |
Multimodal
Sorted by latency p50 · matched-pair runs on Apple Neural Engine (M4)
| # | Model | Quant | Latency p50 | Throughput | Acc. | Power |
|---|---|---|---|---|---|---|
| 1 | MobileCLIP-S2 | FP16 | 34.9 ms | 29 FPS | 99.6% | 6.5 W |
| 2 | MobileCLIP-B | FP16 | 66.9 ms | 15 FPS | 99.9% | 6.6 W |
| 3 | Florence-2-Edge | FP16 | 70.2 ms | 14 FPS | 99.8% | 7.2 W |
| 4 | DINO-v3-Distilled | FP16 | 78.1 ms | 13 FPS | 99.8% | 7.1 W |
| 5 | SmolVLM-3B | MIXED | 342.2 ms | 2.9 FPS | 99.4% | 7.3 W |
Pose
Sorted by latency p50 · matched-pair runs on Apple Neural Engine (M4)
| # | Model | Quant | Latency p50 | Throughput | Acc. | Power |
|---|---|---|---|---|---|---|
| 1 | MoveNet-Lightning | INT8 | 4.65 ms | 215 FPS | 98.2% | 7.1 W |
| 2 | MediaPipe-Hands-Edge | INT8 | 5.29 ms | 189 FPS | 97.6% | 6.2 W |
OCR
Sorted by latency p50 · matched-pair runs on Apple Neural Engine (M4)
| # | Model | Quant | Latency p50 | Throughput | Acc. | Power |
|---|---|---|---|---|---|---|
| 1 | PaddleOCR-Edge-v5 | FP16 | 10.2 ms | 98 FPS | 99.4% | 7.2 W |
| 2 | DocLayout-Edge | FP16 | 16.6 ms | 60 FPS | 99.7% | 6.6 W |