REAL-TIME STT

Real-time transcription, low latency.

Transcribe live audio streams with Whisper V3 Turbo for near-real-time results. Turbo runs 8x faster than Whisper Large V3 with near-identical accuracy, making it ideal for live captioning, call center analytics, meeting transcription, and voice-controlled interfaces. Billing is per minute, with no monthly commitments.

Turbo price: $0.02/min
Speed: 8x faster
Languages: 100+

QUICK START

Integrate in minutes.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.runcrate.ai/v1",
    api_key="rc_live_YOUR_API_KEY",
)

# Use Turbo for lowest latency; verbose_json is required
# for word-level timestamps.
with open("audio_chunk.wav", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="openai/whisper-large-v3-turbo",
        file=audio_file,
        response_format="verbose_json",
        timestamp_granularities=["word"],
    )

for word in transcript.words:
    print(f"[{word.start:.2f}s] {word.word}")

AVAILABLE MODELS

Models you can use today.

openai/whisper-large-v3-turbo (OpenAI) - $0.02/min
8x faster, ideal for real-time

openai/whisper-large-v3 (OpenAI) - $0.045/min
Highest accuracy, 100+ languages

mistralai/Voxtral-Small (Mistral) - $0.03/min
Strong multilingual, long-form

WHY RUNCRATE

Built for production.

8x Faster with Turbo

Whisper V3 Turbo processes audio 8x faster than the full model with near-identical word error rates. Sub-second processing for short audio chunks.

Word-Level Timestamps

Get precise timing for every word. Essential for live captioning, subtitle generation, and audio-visual synchronization.
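Word timestamps map directly onto subtitle cues. Here is a minimal sketch that groups timed words into numbered SRT cues; the `(start, end, text)` tuple shape, the helper names, and the 3-second cue length are illustrative assumptions, not part of the API:

```python
# Sketch: turn word-level timestamps into SRT subtitle cues.
# Input is a list of (start_seconds, end_seconds, text) tuples,
# an illustrative shape for the words returned in verbose_json.

def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"

def words_to_srt(words, cue_seconds: float = 3.0) -> str:
    """Group timed words into numbered SRT cues of ~cue_seconds each."""
    cues, current, cue_start = [], [], None
    for start, end, text in words:
        if cue_start is None:
            cue_start = start
        current.append((end, text))
        if end - cue_start >= cue_seconds:
            cues.append((cue_start, current[-1][0],
                         " ".join(t for _, t in current)))
            current, cue_start = [], None
    if current:  # flush the trailing partial cue
        cues.append((cue_start, current[-1][0],
                     " ".join(t for _, t in current)))
    blocks = [
        f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}"
        for i, (start, end, text) in enumerate(cues, 1)
    ]
    return "\n\n".join(blocks)
```

Feeding it the `(word.start, word.end, word.word)` values from a verbose_json transcript yields a ready-to-serve `.srt` file.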

Chunked Processing

Split live audio into short chunks and transcribe each one. Stitch results together for a continuous real-time transcript with minimal delay.
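A minimal sketch of that chunk-and-stitch loop, assuming 16-bit PCM WAV input. The 5-second chunk length and the helper names (`chunk_wav`, `transcribe_stream`) are illustrative choices, and the endpoint and key placeholders follow the quick start above:

```python
# Sketch: split a WAV recording into fixed-length chunks, transcribe
# each chunk, and shift word timestamps onto the stream timeline.
import io
import wave

def chunk_wav(path: str, chunk_seconds: int = 5):
    """Yield (start_offset_seconds, wav_bytes) for fixed-length chunks."""
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = params.framerate * chunk_seconds
        offset = 0.0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            buf = io.BytesIO()
            with wave.open(buf, "wb") as dst:
                dst.setparams(params)
                dst.writeframes(frames)
            yield offset, buf.getvalue()
            n_frames = len(frames) // (params.sampwidth * params.nchannels)
            offset += n_frames / params.framerate

def transcribe_stream(path: str):
    """Transcribe each chunk and stitch words into one running transcript."""
    from openai import OpenAI  # deferred so chunking works standalone
    client = OpenAI(
        base_url="https://api.runcrate.ai/v1",
        api_key="rc_live_YOUR_API_KEY",
    )
    words = []
    for offset, wav_bytes in chunk_wav(path):
        transcript = client.audio.transcriptions.create(
            model="openai/whisper-large-v3-turbo",
            file=("chunk.wav", io.BytesIO(wav_bytes)),
            response_format="verbose_json",
            timestamp_granularities=["word"],
        )
        # Timestamps are chunk-local; add the chunk's start offset.
        words += [(offset + w.start, w.word) for w in transcript.words]
    return words
```

In a true live setting the same loop applies, except chunks come from a microphone or socket buffer instead of a file; words near chunk boundaries can be split, so production pipelines often overlap chunks slightly and de-duplicate.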

100+ Languages

Automatic language detection across 100+ languages. No need to specify the language parameter for most real-time use cases.


Start real-time transcription today.