Piper
Fast, local neural text-to-speech. Runs fully offline. Single binary + voice model file. Designed for low latency on embedded devices such as the Raspberry Pi and for Home Assistant integrations.
Usage
# Basic
echo "Hello world" | piper --model en_US-lessac-medium.onnx --output_file out.wav
# Play directly
echo "Hello world" | piper --model en_US-lessac-medium.onnx --output-raw | aplay -r 22050 -f S16_LE -t raw -
# Pipe to sox for playback
echo "Hello world" | piper -m en_US-lessac-medium.onnx --output-raw | sox -t raw -r 22050 -e signed -b 16 -c 1 - -d
Models
Each model is an .onnx file paired with a .onnx.json config; keep both files in the same directory. Download from Hugging Face: rhasspy/piper-voices.
Quality levels: x_low · low · medium · high
en_US-lessac-medium.onnx # ~63MB, good quality
en_US-amy-low.onnx # ~5MB, fast, lower quality
de_DE-thorsten-medium.onnx # German
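The raw-output commands under Usage hard-code 22050 Hz, which matches the medium voice shown but may not hold for every voice. A minimal sketch that reads the rate from the voice config instead, assuming the config sits next to the model as <model>.onnx.json and stores the rate under audio.sample_rate (check your own file). The helper names voice_sample_rate and aplay_args are hypothetical, not part of Piper.

```python
import json
from pathlib import Path


def voice_sample_rate(model_path: str) -> int:
    """Return the sample rate recorded in the voice's .onnx.json config.

    Assumption: the config lives at <model_path>.json and keeps the rate
    under config["audio"]["sample_rate"].
    """
    config_path = Path(str(model_path) + ".json")
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    return int(config["audio"]["sample_rate"])


def aplay_args(model_path: str) -> list[str]:
    """Build the aplay argument list for raw 16-bit mono output of this voice."""
    rate = voice_sample_rate(model_path)
    return ["aplay", "-r", str(rate), "-f", "S16_LE", "-t", "raw", "-"]
```

The returned list can be passed to subprocess.run or joined into the shell pipeline in place of the fixed -r 22050.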
Home Assistant / Wyoming protocol
# docker-compose
services:
  piper:
    image: rhasspy/wyoming-piper
    command: --voice en_US-lessac-medium
    volumes:
      - piper_data:/data
    ports:
      - "10200:10200"

# The named volume must be declared at the top level
volumes:
  piper_data:

Add as a Wyoming TTS provider in Home Assistant.
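Before adding the provider, it can help to confirm that something is listening on the published port. A minimal sketch using only the standard library; it checks plain TCP reachability and does not speak the Wyoming protocol. The function name and the localhost default are assumptions for illustration.

```python
import socket


def wyoming_port_open(host: str = "localhost", port: int = 10200,
                      timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False
```

Calling wyoming_port_open("homeassistant.local") (a hypothetical hostname) from another machine also confirms that the port mapping is reachable over the network, not only on the Docker host.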
Python API
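import wave
from piper import PiperVoice

voice = PiperVoice.load("en_US-lessac-medium.onnx")
# synthesize() expects a wave.Wave_write object, not a plain file handle.
# Newer piper-tts releases may expose this call as synthesize_wav() instead.
with wave.open("output.wav", "wb") as wav_file:
    voice.synthesize("Hello from Piper!", wav_file)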
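To sanity-check the synthesized file, the WAV header can be read back with the standard library. A minimal sketch; wav_info is a hypothetical helper and does not depend on Piper.

```python
import wave


def wav_info(path: str) -> dict:
    """Read basic header fields from a WAV file using the standard library."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_width_bytes": wav_file.getsampwidth(),
            "sample_rate": rate,
            "duration_s": frames / rate,
        }
```

For example, wav_info("output.wav") should report a sample rate that matches the one in the voice's .onnx.json config.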