MISTRAL API
Run Mistral models through a single OpenAI-compatible endpoint. From lightweight instruction-following to powerful code generation, Mistral models deliver strong performance at competitive token costs. Same API format as OpenAI, same SDK, different model parameter.

QUICK START
from openai import OpenAI

# Point the OpenAI SDK at the RunCrate endpoint; only base_url and api_key change.
client = OpenAI(
    base_url="https://api.runcrate.ai/v1",
    api_key="rc_live_YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="mistralai/Mistral-Small-24B-Instruct-2501",
    messages=[
        {"role": "user", "content": "Write a Python function to merge two sorted lists."}
    ],
)
print(response.choices[0].message.content)

AVAILABLE MODELS
| Model | Provider | Billing | Details |
|---|---|---|---|
| mistralai/Mistral-Small-24B-Instruct-2501 | Mistral | Per-token | 24B params, strong instruction following |
| mistralai/Voxtral-Small | Mistral | Per-minute | Audio transcription, multilingual |
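
Per-minute models such as Voxtral-Small are used for transcription rather than chat. Below is a minimal sketch, assuming the endpoint exposes the OpenAI-compatible audio.transcriptions route (not confirmed on this page); the file name is a placeholder.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.runcrate.ai/v1",
    api_key="rc_live_YOUR_API_KEY",
)

# Assumption: the /v1/audio/transcriptions route is available for per-minute models.
# "meeting.mp3" is a placeholder for a local audio file.
with open("meeting.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="mistralai/Voxtral-Small",
        file=audio_file,
    )

print(transcript.text)
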
WHY RUNCRATE
Mistral models use sliding-window attention and grouped-query attention for fast inference with a lower memory footprint, which translates into lower per-token costs.
Mistral models excel at code generation, bug fixing, and code review, and are competitive with much larger models on coding benchmarks.
Solid performance across English, French, German, Spanish, Italian, and many other languages makes them a good fit for global applications.
Instruct-tuned variants follow complex instructions precisely, making them reliable for structured outputs, JSON generation, and tool use, as shown in the sketch below.
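
A minimal sketch of JSON-structured output, assuming the endpoint passes through the OpenAI response_format parameter (not confirmed on this page); the prompt and keys are illustrative only.

import json
from openai import OpenAI

client = OpenAI(
    base_url="https://api.runcrate.ai/v1",
    api_key="rc_live_YOUR_API_KEY",
)

# Assumption: response_format={"type": "json_object"} is supported by the endpoint.
# The prompt should still ask for JSON explicitly and name the expected keys.
response = client.chat.completions.create(
    model="mistralai/Mistral-Small-24B-Instruct-2501",
    messages=[
        {
            "role": "user",
            "content": 'Return JSON with keys "city" and "country" for: '
                       "The conference will be held in Lyon, France.",
        }
    ],
    response_format={"type": "json_object"},
)

# Parse the JSON string returned in the message content.
data = json.loads(response.choices[0].message.content)
print(data["city"], data["country"])
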
FAQ