Moonshine-Embedded
Streaming-first embedded ASR. Lower latency than Whisper-Tiny at the cost of language coverage.
Cross-chip benchmark matrix
Every supported chip, in matched-pair runs from the Fo’c’sle HIL lab. Cells where the chip can’t run this model show Not supported.
| Chip platform | Quant | Latency p50 (ms) | Latency p95 (ms) | Throughput | Acc. retention (%) | Power (W) | Memory (MB) | Tested by |
|---|---|---|---|---|---|---|---|---|
| Raspberry Pi 5 + Hailo HAT (26 TOPS · HAT) | INT8 | 279.4 | 343.6 | 23 tok/s | 98.1 | 7.2 | 32 | Community |
| Hailo-8 (26 TOPS · M.2) | INT8 | 342.7 | 407.0 | 19 tok/s | 98.2 | 2.1 | 32 | Fo’c’sle HIL |
| Qualcomm QCS6490 (12 TOPS · SoC) | INT8 | 511.9 | 656.7 | 13 tok/s | 98.2 | 7.8 | 32 | Publisher |
| Ambarella CV5 (16 TOPS · SoC) | INT8 | 612.8 | 795.1 | 10 tok/s | 97.7 | 4.0 | 32 | Fo’c’sle HIL |
| Rockchip RK3588 (6 TOPS · SoC) | INT8 | 895.2 | 1082.8 | 7.1 tok/s | 97.5 | 9.1 | 32 | Fo’c’sle HIL |
| Intel Movidius Myriad X (4 TOPS · SoC) | INT8 | 1164.2 | 1399.2 | 5.5 tok/s | 97.9 | 2.4 | 32 | Fo’c’sle HIL |
| Hailo-10H (40 TOPS · M.2) | Not supported | | | | | | | |
| Qualcomm QCS8550 (48 TOPS · SoC) | Not supported | | | | | | | |
| Snapdragon 8 Gen 3 NPU (45 TOPS · SoC) | Not supported | | | | | | | |
| NVIDIA Jetson Orin Nano (40 TOPS · SoM) | Not supported | | | | | | | |
| NVIDIA Jetson AGX Orin (275 TOPS · Module) | Not supported | | | | | | | |
| NVIDIA Jetson Thor (2070 TOPS · Module) | Not supported | | | | | | | |
| Ambarella CV72 (32 TOPS · SoC) | Not supported | | | | | | | |
| MediaTek Genio 700 (4 TOPS · SoC) | Not supported | | | | | | | |
| Google Coral Edge TPU (4 TOPS · USB) | Not supported | | | | | | | |
| AMD Versal AI Edge VE2302 (22 TOPS · SoC) | Not supported | | | | | | | |
| Apple Neural Engine (M4) (38 TOPS · SoC) | Not supported | | | | | | | |
HIL conditions
All numbers are measured on Fo’c’sle HIL rigs in Tel Aviv (primary), Munich (secondary), and Pittsburgh (robotics). Runs are single-stream, batch-1, with real preprocessing and a real downstream consumer. p50/p95 are computed over 10,000-frame steady-state windows after a 30-second warm-up. Power is package power, not wall power. Memory is the peak resident footprint of the model plus activations, not the on-disk size.
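To make the latency methodology concrete, here is a minimal Python sketch of how p50/p95 could be derived from a raw run log. It is an illustration, not the lab's actual tooling: the function names, the `(timestamp_s, latency_ms)` sample layout, and the nearest-rank percentile method are all assumptions, since the page does not say which percentile definition the rigs use.

```python
import math

WARMUP_SECONDS = 30.0   # from the HIL conditions: 30-second warm-up
WINDOW_FRAMES = 10_000  # from the HIL conditions: 10,000-frame window


def percentile(sorted_values, q):
    """Nearest-rank percentile (integer q in 1..100) of an ascending list.

    Nearest-rank is an assumption; the lab may interpolate instead.
    """
    if not sorted_values:
        raise ValueError("no samples")
    rank = max(1, math.ceil(q * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def steady_state_latency(samples, warmup_s=WARMUP_SECONDS, window=WINDOW_FRAMES):
    """Return (p50, p95) in ms over the first `window` frames after warm-up.

    `samples` is a hypothetical log format: an iterable of
    (timestamp_s, latency_ms) pairs, timestamps relative to run start.
    """
    # Discard everything recorded during warm-up, then cap at the window size.
    steady = [lat for t, lat in samples if t >= warmup_s][:window]
    if len(steady) < window:
        raise ValueError(f"need {window} steady-state frames, got {len(steady)}")
    steady.sort()
    return percentile(steady, 50), percentile(steady, 95)
```

Dropping the warm-up samples before taking percentiles matters because early frames typically include cache and clock ramp-up effects that would inflate p95 without reflecting steady-state behaviour.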
Submitted publisher numbers are accepted only if they reproduce within ±8% of an HIL-lab matched run on the same chip in the same input mode. Otherwise they live separately under the Discussion tab.
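The ±8% acceptance rule can be sketched as a relative-deviation check. This is a hedged illustration only: the function name is hypothetical, and it assumes the deviation is measured relative to the HIL-lab value for a single metric; the page does not state which metrics are compared or which value serves as the baseline.

```python
TOLERANCE = 0.08  # ±8%, from the acceptance rule above


def reproduces(submitted, hil_reference, tolerance=TOLERANCE):
    """True if `submitted` is within ±tolerance of `hil_reference`.

    Assumption: deviation is relative to the HIL-lab matched run.
    """
    if hil_reference <= 0:
        raise ValueError("HIL reference must be positive")
    return abs(submitted - hil_reference) / hil_reference <= tolerance


# Hypothetical submissions checked against the Hailo-8 HIL p50 of 342.7 ms.
print(reproduces(365.0, 342.7))  # about 6.5% above the reference
print(reproduces(375.0, 342.7))  # about 9.4% above the reference
```

Under these assumptions the first submission would be accepted into the matrix and the second would be routed to the Discussion tab.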