Models FAQ

Is the Runcrate Models API compatible with the OpenAI API?

The Runcrate Models API is OpenAI-compatible. Point your existing OpenAI SDK or HTTP client at https://api.runcrate.ai/v1 and authenticate with your Runcrate API key. See the Quickstart for examples.
Can I use the official OpenAI SDKs?

Yes. Both the Python and JavaScript OpenAI SDKs work with Runcrate. Change the base_url (Python) or baseURL (JavaScript) to https://api.runcrate.ai/v1 and use your Runcrate API key:
from openai import OpenAI

client = OpenAI(
    base_url="https://api.runcrate.ai/v1",
    api_key="rc_live_YOUR_API_KEY",
)
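For clients that don't use an SDK, the same endpoint accepts plain HTTP requests. A minimal sketch using only the Python standard library; the model name "gpt-4o" is illustrative, and any chat model from the Model Catalog can be substituted:

```python
import json
import urllib.request

API_KEY = "rc_live_YOUR_API_KEY"  # replace with your Runcrate API key

# Build a chat-completion request against the OpenAI-compatible endpoint.
payload = {
    "model": "gpt-4o",  # illustrative; any chat model from the catalog works
    "messages": [{"role": "user", "content": "Hello!"}],
}
request = urllib.request.Request(
    "https://api.runcrate.ai/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# Send with: response = urllib.request.urlopen(request)
```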
Which model should I use?

Use case                      Recommended models
General chat and assistants   GPT-4o, Claude 4 Sonnet, DeepSeek-V3
Complex reasoning and math    DeepSeek-R1, QwQ
Code generation               Codestral, DeepSeek-Coder, Qwen-Coder
Image generation              FLUX.1, FLUX.2, Stable Diffusion
Video generation              Sora 2, Veo 3, Kling
Voice synthesis               Kokoro, Orpheus
Transcription                 Whisper

Check the Model Catalog for the full list with pricing.
Do you support streaming?

Yes. Streaming is supported for all chat completion models. Set stream: true in your request to receive the response incrementally as server-sent events.
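Streamed chunks arrive as server-sent events, one "data: {...}" line per chunk, ending with "data: [DONE]". A minimal sketch of reassembling the text, assuming the standard OpenAI streaming chunk shape (choices[0].delta.content):

```python
import json

def collect_stream_text(sse_lines):
    """Reassemble assistant text from streamed SSE lines.

    Each chunk is a 'data: {...}' line in the standard OpenAI
    streaming shape; the stream ends with 'data: [DONE]'.
    """
    parts = []
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank/keep-alive lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        parts.append(delta.get("content") or "")
    return "".join(parts)
```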
How is usage billed?

  • Text models (chat, code, reasoning) — billed per token (input + output)
  • Image generation — billed per image
  • Video generation — billed per video
  • Text-to-speech — billed per generation
  • Speech-to-text — billed per audio minute
Rates vary by model. Check the Model Catalog or Pricing page.
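As a worked example of per-token billing, here is the arithmetic for a text model. The rates below are placeholders, not Runcrate's actual pricing; check the Model Catalog for real per-token rates:

```python
# Hypothetical per-token rates (USD per 1M tokens); real rates are
# listed in the Model Catalog and vary by model.
INPUT_RATE_PER_1M = 2.50
OUTPUT_RATE_PER_1M = 10.00

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one chat completion request."""
    return (input_tokens * INPUT_RATE_PER_1M
            + output_tokens * OUTPUT_RATE_PER_1M) / 1_000_000

# A request with 1,200 input tokens and 300 output tokens:
# 1,200 * $2.50/1M + 300 * $10.00/1M = $0.003 + $0.003 = $0.006
cost = estimate_cost(1200, 300)
```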
What is the Playground?

The Playground is a web-based interface in the dashboard where you can test any model interactively without writing code. Go to Dashboard → Playground to try it. It uses your project’s default API key.
Are there rate limits?

Yes. Rate limits apply to prevent abuse, and they vary by model and account. If you exceed a limit, the API returns a 429 Too Many Requests response; wait briefly and retry.
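A common pattern for handling 429s is exponential backoff with jitter. A sketch, assuming your client surfaces rate-limit responses as an exception (RateLimitError below is a stand-in for whatever error your HTTP client or SDK raises):

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for the 429 error your client or SDK raises."""

def call_with_retry(make_request, max_retries=5, base_delay=1.0):
    """Call make_request, retrying rate-limited calls with backoff."""
    for attempt in range(max_retries):
        try:
            return make_request()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error
            # Back off 1s, 2s, 4s, ... plus a little jitter.
            time.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))
```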
Where can I find model pricing?

Go to the Model Catalog in the dashboard or the Pricing page in the docs. Each model shows its per-token or per-generation rate.