Piper

Fast, local neural text-to-speech. Runs fully offline. Needs only a single binary plus a voice model file. Designed for low latency on embedded devices (e.g. Raspberry Pi) and Home Assistant integrations.

Usage

# Basic
echo "Hello world" | piper --model en_US-lessac-medium.onnx --output_file out.wav
 
# Play directly (22050 Hz matches medium/high voices; low and x_low voices output 16000 Hz)
echo "Hello world" | piper --model en_US-lessac-medium.onnx --output-raw | aplay -r 22050 -f S16_LE -t raw -
 
# Pipe to sox for playback
echo "Hello world" | piper -m en_US-lessac-medium.onnx --output-raw | sox -t raw -r 22050 -e signed -b 16 -c 1 - -d
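
The playback commands above hard-code a 22050 Hz sample rate. As a minimal sketch, assuming the voice's .onnx.json config stores the rate under audio.sample_rate (as the published voice configs do), the rate can be read from the config instead of typed by hand. The helper names sample_rate and aplay_command are hypothetical, not part of Piper.

```python
import json
from pathlib import Path


def sample_rate(model_path: str) -> int:
    """Read the output sample rate from the config next to the model.

    Assumes the config lives at "<model_path>.json" and contains
    {"audio": {"sample_rate": ...}}.
    """
    config_path = Path(f"{model_path}.json")
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    return int(config["audio"]["sample_rate"])


def aplay_command(model_path: str) -> list[str]:
    """Build the aplay argument list for raw 16-bit mono output from this voice."""
    rate = sample_rate(model_path)
    return ["aplay", "-r", str(rate), "-f", "S16_LE", "-t", "raw", "-"]
```

The returned list can be passed to subprocess.Popen as the consumer of piper --output-raw, so switching to a 16000 Hz voice does not require editing the playback command.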

Models

Each voice is an .onnx model file paired with an .onnx.json config; keep both in the same directory. Download them from the rhasspy/piper-voices repository on Hugging Face.

Quality levels: x_low · low · medium · high

en_US-lessac-medium.onnx         # ~63MB, good quality
en_US-amy-low.onnx               # 16 kHz output, faster, lower quality
de_DE-thorsten-medium.onnx       # German
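
The file names above follow a locale-speaker-quality pattern. The sketch below splits a voice file name into those parts and lists the two files a voice needs. It assumes every voice name follows this pattern; parse_voice and required_files are hypothetical helpers, not part of Piper.

```python
from typing import NamedTuple

QUALITIES = ("x_low", "low", "medium", "high")


class VoiceName(NamedTuple):
    locale: str   # e.g. "en_US"
    speaker: str  # e.g. "lessac"
    quality: str  # one of QUALITIES


def parse_voice(filename: str) -> VoiceName:
    """Split 'en_US-lessac-medium.onnx' into (locale, speaker, quality)."""
    # Accept the bare voice name, the .onnx file, or the .onnx.json config.
    stem = filename.removesuffix(".json").removesuffix(".onnx")
    locale, rest = stem.split("-", 1)        # locale is everything before the first hyphen
    speaker, quality = rest.rsplit("-", 1)   # quality is everything after the last hyphen
    if quality not in QUALITIES:
        raise ValueError(f"unknown quality level: {quality!r}")
    return VoiceName(locale, speaker, quality)


def required_files(voice: str) -> list[str]:
    """Return the model and config file names for a voice name."""
    return [f"{voice}.onnx", f"{voice}.onnx.json"]
```

Splitting from both ends keeps speaker names that contain underscores or hyphens intact.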

Home Assistant / Wyoming protocol

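# docker-compose.yml
services:
  piper:
    image: rhasspy/wyoming-piper
    command: --voice en_US-lessac-medium
    volumes:
      - piper_data:/data   # downloaded voices persist here
    ports:
      - "10200:10200"      # Wyoming protocol port

volumes:
  piper_data:              # named volumes must be declared at the top level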

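In Home Assistant, add the Wyoming Protocol integration (Settings > Devices & services) and enter the container's host and port 10200. Piper then appears as a TTS provider.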
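
Before adding the integration, it can help to confirm that the container is accepting connections. The sketch below only checks that a TCP connection to the port succeeds; it does not speak the Wyoming protocol. The function name wyoming_port_open and the default host "localhost" are assumptions; port 10200 comes from the compose file above.

```python
import socket


def wyoming_port_open(host: str = "localhost", port: int = 10200, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, DNS failure, and timeout.
        return False


if __name__ == "__main__":
    print("reachable" if wyoming_port_open() else "not reachable")
```

If this reports "not reachable" while the container is running, the port mapping or a firewall is the likely place to look.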

Python API

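import wave

from piper import PiperVoice

voice = PiperVoice.load("en_US-lessac-medium.onnx")  # expects the .onnx.json config alongside
# synthesize() takes a wave.Wave_write, not a plain file handle (piper-tts 1.2.x;
# newer releases name this method synthesize_wav).
with wave.open("output.wav", "wb") as wav_file:
    voice.synthesize("Hello from Piper!", wav_file)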