Text-to-Speech - Runcrate

Generate speech audio from text input using TTS models.

Endpoint

POST https://api.runcrate.ai/v1/audio/speech

Basic Usage

curl https://api.runcrate.ai/v1/audio/speech \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer rc_live_YOUR_API_KEY" \
  --output speech.mp3 \
  -d '{
    "model": "hexgrad/Kokoro-82M",
    "input": "Hello, welcome to Runcrate!",
    "voice": "af_heart"
  }'
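
For readers calling the endpoint from a script, the same request can be sketched in Python with only the standard library. The helper names `build_speech_request` and `synthesize_to_file` are illustrative and not part of any Runcrate SDK; the URL, headers, and body fields are copied from the curl example above.

```python
import json
import urllib.request

API_URL = "https://api.runcrate.ai/v1/audio/speech"


def build_speech_request(api_key, text, model="hexgrad/Kokoro-82M", voice="af_heart"):
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def synthesize_to_file(api_key, text, path="speech.mp3"):
    """Send the request and write the raw audio bytes to `path` (hypothetical helper)."""
    request = build_speech_request(api_key, text)
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())
```

Calling `synthesize_to_file("rc_live_YOUR_API_KEY", "Hello, welcome to Runcrate!")` with a valid key should produce the same `speech.mp3` as the curl command.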

Parameters

Parameter	Type	Description
`model`	string	Model ID (required)
`input`	string	Text to synthesize (required)
`voice`	string	Voice preset name
`response_format`	string	Output format (`mp3`, `pcm`)
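
As a rough illustration of how these parameters combine, the sketch below assembles a request body in Python. It treats `voice` and `response_format` as optional because the table marks only `model` and `input` as required; that is an inference from the table, not a documented guarantee. `build_payload` is a hypothetical helper name.

```python
import json

# Output formats listed in the parameter table above.
SUPPORTED_FORMATS = ("mp3", "pcm")


def build_payload(model, text, voice=None, response_format=None):
    """Return the JSON request body, leaving out optional fields that are unset."""
    if not model or not text:
        raise ValueError("`model` and `input` are required")
    if response_format is not None and response_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")

    payload = {"model": model, "input": text}
    # Assumption: optional fields may simply be omitted from the body.
    if voice is not None:
        payload["voice"] = voice
    if response_format is not None:
        payload["response_format"] = response_format
    return json.dumps(payload)
```

For example, `build_payload("hexgrad/Kokoro-82M", "Hello", voice="af_bella", response_format="pcm")` yields a body that requests PCM output instead of MP3.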

Available Models & Voices

Kokoro 82M

Lightweight, fast TTS with natural-sounding voices. Voices: af_heart, af_bella, af_nicole, af_sky, am_adam, am_michael, bf_emma, bf_isabella, bm_george, bm_lewis

Orpheus 3B

High-quality expressive speech synthesis. Voices: tara, leah, jess, leo, dan, mia, zac, zoe

Qwen3-TTS

Multilingual TTS from Alibaba. Voices: Vivian, Serena, Uncle_Fu, Dylan, Eric, Ryan, Aiden, Ono_Anna, Sohee
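
Because each model family accepts a different set of voice names, a client may want to check the voice before sending a request. The following is a minimal sketch of such a check. The dictionary keys are the display names used on this page, not API model IDs (only `hexgrad/Kokoro-82M` is documented above), and `check_voice` is a hypothetical helper.

```python
# Preset voices as listed above, keyed by the display names used on this page.
# These keys are NOT API model IDs.
PRESET_VOICES = {
    "Kokoro 82M": [
        "af_heart", "af_bella", "af_nicole", "af_sky", "am_adam",
        "am_michael", "bf_emma", "bf_isabella", "bm_george", "bm_lewis",
    ],
    "Orpheus 3B": ["tara", "leah", "jess", "leo", "dan", "mia", "zac", "zoe"],
    "Qwen3-TTS": [
        "Vivian", "Serena", "Uncle_Fu", "Dylan", "Eric",
        "Ryan", "Aiden", "Ono_Anna", "Sohee",
    ],
}


def check_voice(model_name, voice):
    """Return True if `voice` is a listed preset for `model_name`."""
    return voice in PRESET_VOICES.get(model_name, [])
```

The comparison is case-sensitive, so `check_voice("Qwen3-TTS", "vivian")` is rejected while `"Vivian"` is accepted.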

Voice-Clone Models

Some models (HiggsAudio, Zonos, Chatterbox) support voice cloning. They have no preset voices; instead, they clone a voice from reference audio.

Response

The response body is the raw binary audio, in MP3 or PCM according to `response_format`. Save it directly to a file or stream it to an audio player.
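
Since the body is plain bytes, saving it amounts to copying a binary stream to disk. The sketch below does this in chunks so that long audio does not have to sit in memory all at once. `save_audio` is a hypothetical helper; it accepts any object with a `read(n)` method, such as the response returned by `urllib.request.urlopen`.

```python
def save_audio(response, path, chunk_size=64 * 1024):
    """Copy a binary file-like response body to `path` in chunks.

    Returns the number of bytes written.
    """
    written = 0
    with open(path, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:  # an empty read signals the end of the stream
                break
            out.write(chunk)
            written += len(chunk)
    return written
```

The same loop could hand each chunk to an audio player instead of a file, provided the player accepts the format that was requested.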
