Moonshine-Embedded

Streaming-first embedded ASR. Lower latency than Whisper-Tiny at the cost of language coverage.

Embedded ASR · MIT · INT8 · streaming · asr · english

184K downloads · 9.1K deployments · Updated Feb 28, 2028

Headline: 110 ms · Raspberry Pi 5 + Hailo HAT · INT8

Cross-chip benchmark matrix

Every supported chip, in matched-pair runs from the Fo’c’sle HIL lab. Cells where the chip can’t run this model show Not supported.

Chip platform	Compute (TOPS)	Form factor	Quant	Latency p50 (ms)	Latency p95 (ms)	Throughput (tok/s)	Acc. retention (%)	Power (W)	Memory (MB)	Tested by
Raspberry Pi 5 + Hailo HAT	26	HAT	INT8	279.4	343.6	23	98.1	7.2	32	Community
Hailo-8	26	M.2	INT8	342.7	407.0	19	98.2	2.1	32	Fo’c’sle HIL
Qualcomm QCS6490	12	SoC	INT8	511.9	656.7	13	98.2	7.8	32	Publisher
Ambarella CV5	16	SoC	INT8	612.8	795.1	10	97.7	4.0	32	Fo’c’sle HIL
Rockchip RK3588	6	SoC	INT8	895.2	1082.8	7.1	97.5	9.1	32	Fo’c’sle HIL
Intel Movidius Myriad X	4	SoC	INT8	1164.2	1399.2	5.5	97.9	2.4	32	Fo’c’sle HIL
Hailo-10H	40	M.2	Not supported
Qualcomm QCS8550	48	SoC	Not supported
Snapdragon 8 Gen 3 NPU	45	SoC	Not supported
NVIDIA Jetson Orin Nano	40	SoM	Not supported
NVIDIA Jetson AGX Orin	275	Module	Not supported
NVIDIA Jetson Thor	2070	Module	Not supported
Ambarella CV72	32	SoC	Not supported
MediaTek Genio 700	4	SoC	Not supported
Google Coral Edge TPU	4	USB	Not supported
AMD Versal AI Edge VE2302	22	SoC	Not supported
Apple Neural Engine (M4)	38	SoC	Not supported

LibriSpeech test-clean · 16 kHz streaming

HIL conditions

All numbers are measured on Fo’c’sle HIL rigs in Tel Aviv (primary), Munich (secondary), and Pittsburgh (robotics). Runs are single-stream, batch-1, with real preprocessing and a real downstream consumer. p50 and p95 are computed over 10,000-frame steady-state windows after a 30-second warm-up. Power draw is package power, not wall power. Memory footprint is the peak resident footprint of the model plus activations, not the on-disk size.
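
As a rough illustration of the windowing described above, the sketch below drops a 30-second warm-up and computes p50/p95 over the next 10,000 frames. The function names, the (timestamp, latency) sample layout, and the linear-interpolation percentile are assumptions made for this example; the lab's actual tooling and percentile method are not documented here.

```python
import math


def percentile(sorted_vals, q):
    """Percentile q (0-100) of an ascending list, using linear interpolation.

    The interpolation method is an assumption; other definitions exist.
    """
    if not sorted_vals:
        raise ValueError("no samples")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def steady_state_percentiles(samples, warmup_s=30.0, window=10_000):
    """Return (p50, p95) latency in ms for one steady-state window.

    samples: sequence of (timestamp_s, latency_ms) tuples in arrival order
             (hypothetical layout chosen for this sketch).
    Frames that arrive within warmup_s of the first frame are discarded,
    then the next `window` frames form the measurement window.
    """
    samples = list(samples)
    if not samples:
        raise ValueError("no samples")
    t0 = samples[0][0]
    steady = [lat for ts, lat in samples if ts - t0 >= warmup_s]
    if len(steady) < window:
        raise ValueError(
            f"need {window} steady-state frames, got {len(steady)}"
        )
    ordered = sorted(steady[:window])
    return percentile(ordered, 50), percentile(ordered, 95)
```

Raising an error on a short run, rather than reporting percentiles from a partial window, keeps results comparable across chips.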

Submitted publisher numbers are accepted only if they reproduce within ±8% of an HIL-lab matched run on the same chip in the same input mode. Otherwise they live separately under the Discussion tab.
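
The acceptance rule can be sketched as a simple relative-tolerance check. This is a minimal illustration, not the lab's implementation: it assumes the ±8% is measured relative to the HIL-lab value and applied to a single metric, and the function names and return labels are hypothetical.

```python
def reproduces(publisher_value, hil_value, tolerance=0.08):
    """True if publisher_value lies within ±tolerance of hil_value.

    Assumption: the tolerance is relative to the HIL-lab measurement.
    """
    if hil_value <= 0:
        raise ValueError("HIL reference value must be positive")
    return abs(publisher_value - hil_value) <= tolerance * hil_value


def route_submission(publisher_value, hil_value, tolerance=0.08):
    """Return where a publisher number would be shown under this rule.

    "matrix" and "discussion" are hypothetical labels for this sketch.
    """
    if reproduces(publisher_value, hil_value, tolerance):
        return "matrix"
    return "discussion"
```

Both values must come from the same chip and the same input mode for the comparison to be meaningful; the sketch leaves that pairing to the caller.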