The chat completions endpoint powers text generation across Chat, Code, Reasoning, and Vision models. It supports streaming, system prompts, multi-turn conversations, and image inputs for vision-capable models.

Endpoint

POST https://api.runcrate.ai/v1/chat/completions

Basic Usage

curl https://api.runcrate.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer rc_live_YOUR_API_KEY" \
  -d '{
    "model": "deepseek-ai/DeepSeek-V3",
    "messages": [
      {"role": "user", "content": "Explain quantum computing in simple terms"}
    ],
    "max_tokens": 512,
    "temperature": 0.7
  }'
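
The same request can be sent from Python using only the standard library. The sketch below is an illustration rather than an official client: the helper names build_request and send are hypothetical, and the response shape (choices[0].message.content) is assumed to follow the OpenAI-compatible format that the client examples on this page imply.

```python
import json
import urllib.request

API_URL = "https://api.runcrate.ai/v1/chat/completions"


def build_request(api_key, prompt, model="deepseek-ai/DeepSeek-V3"):
    """Build (but do not send) the POST request from the curl example."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,
        "temperature": 0.7,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the reply text.

    Assumption: the response body is OpenAI-compatible JSON, with the
    reply at choices[0].message.content.
    """
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling send(build_request("rc_live_YOUR_API_KEY", "Explain quantum computing in simple terms")) would perform the network call. Keeping construction and sending separate makes the payload easy to inspect before anything is sent.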

Streaming

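Set stream to true to receive tokens as they are generated (stream=True in the Python client):

from openai import OpenAI

# Assumption: the OpenAI Python SDK pointed at the Runcrate base URL.
client = OpenAI(base_url="https://api.runcrate.ai/v1", api_key="rc_live_YOUR_API_KEY")

stream = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3",
    messages=[{"role": "user", "content": "Write a poem"}],
    stream=True,
)
for chunk in stream:
    # Skip chunks that carry no choices (some servers send a trailing usage chunk).
    if not chunk.choices:
        continue
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)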
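
When the full text is needed after streaming finishes, the deltas can be accumulated instead of printed. The following is a minimal sketch: collect_stream and fake_chunk are hypothetical helpers, and fake_chunk only imitates the chunk shape used in the loop above (choices[0].delta.content) so the logic can be tried without a network call.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Join the text deltas of a streamed completion into one string."""
    parts = []
    for chunk in chunks:
        # Ignore chunks without choices and deltas without text.
        if not chunk.choices:
            continue
        content = chunk.choices[0].delta.content
        if content:
            parts.append(content)
    return "".join(parts)


def fake_chunk(content):
    """Stand-in for an SDK chunk, used only to exercise collect_stream offline."""
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [fake_chunk("Roses "), fake_chunk(None), fake_chunk("are red")]
print(collect_stream(demo))  # Roses are red
```

With a real stream, collect_stream(stream) would take the iterator returned by the streaming call in place of demo.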

Vision Models

Vision-capable models accept images in the message content. Send images as URLs or base64:

from openai import OpenAI

# Assumption: the OpenAI Python SDK pointed at the Runcrate base URL.
client = OpenAI(base_url="https://api.runcrate.ai/v1", api_key="rc_live_YOUR_API_KEY")

response = client.chat.completions.create(
    model="google/gemini-2.5-flash",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)

Vision-capable models include Gemini 2.5, Llama 4 Maverick, Gemma 3, GPT-4o, and others marked with “Vision” in the Model Catalog.
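
The example above uses an image URL. For the base64 option, the exact encoding is not spelled out here; the sketch below assumes the data URL convention common to OpenAI-compatible APIs (data:<mime>;base64,<payload> inside image_url.url). The helper names image_part_from_bytes and vision_message are hypothetical.

```python
import base64


def image_part_from_bytes(data, mime_type="image/jpeg"):
    """Wrap raw image bytes as an image_url content part.

    Assumption: the API accepts a base64 data URL in image_url.url.
    """
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


def vision_message(question, image_bytes, mime_type="image/jpeg"):
    """Build a user message that pairs a text question with an inline image."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            image_part_from_bytes(image_bytes, mime_type),
        ],
    }
```

A local file could be sent by reading it in binary mode and passing the bytes to vision_message, then using the result as the single entry of messages.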

Reasoning Models

Reasoning models (DeepSeek-R1, QwQ, etc.) produce chain-of-thought output. The reasoning steps appear in the reasoning_content field of the streamed response delta, separate from the final answer in content.
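
One way to keep the two fields apart while streaming is to accumulate them into separate buffers. This is a sketch under assumptions: split_reasoning and fake_delta_chunk are hypothetical helpers, and the fake chunks only imitate a delta that carries reasoning_content and content as described above.

```python
from types import SimpleNamespace


def split_reasoning(chunks):
    """Return (reasoning, answer) accumulated from streamed deltas."""
    reasoning, answer = [], []
    for chunk in chunks:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        # reasoning_content may be absent for non-reasoning models, so use getattr.
        thought = getattr(delta, "reasoning_content", None)
        if thought:
            reasoning.append(thought)
        text = getattr(delta, "content", None)
        if text:
            answer.append(text)
    return "".join(reasoning), "".join(answer)


def fake_delta_chunk(**fields):
    """Stand-in for an SDK chunk, used only to exercise split_reasoning offline."""
    delta = SimpleNamespace(**fields)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [
    fake_delta_chunk(reasoning_content="2 + 2 is ", content=None),
    fake_delta_chunk(reasoning_content="4.", content=None),
    fake_delta_chunk(content="The answer is 4."),
]
reasoning, answer = split_reasoning(demo)
print(reasoning)  # 2 + 2 is 4.
print(answer)     # The answer is 4.
```

Keeping the buffers separate lets an application show or hide the reasoning without affecting the final answer text.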

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| model | string | required | Model ID (e.g., deepseek-ai/DeepSeek-V3) |
| messages | array | required | Conversation messages with role and content |
| max_tokens | integer | varies | Maximum tokens to generate |
| temperature | number | 0.7 | Randomness (0 = deterministic, 2 = very random) |
| stream | boolean | false | Enable streaming responses |
| top_p | number | 1.0 | Nucleus sampling threshold |
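
The parameters above can be assembled into a request body with a small helper. The sketch below is hypothetical (build_payload is not part of any SDK); it applies the listed defaults and treats 0 to 2 as the valid temperature range and (0, 1] as the valid top_p range, both of which are assumptions rather than documented limits.

```python
def build_payload(model, messages, max_tokens=None, temperature=0.7,
                  top_p=1.0, stream=False):
    """Assemble a chat completions request body from the listed parameters."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required")
    # Assumption: 0 and 2 are the bounds of the temperature scale.
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    # Assumption: top_p is a probability mass in (0, 1].
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in (0, 1]")
    payload = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "top_p": top_p,
        "stream": stream,
    }
    # The max_tokens default varies by model, so send it only when set.
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    return payload
```

Omitting max_tokens when it is unset leaves the model-specific default in effect instead of overriding it with a guess.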

Message Roles

| Role | Purpose |
|---|---|
| system | Sets the model’s behavior and personality |
| user | The user’s input |
| assistant | Previous model responses (for multi-turn) |
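
For a multi-turn conversation, the roles combine into a single messages list that grows with each exchange. The sketch below uses hypothetical helpers (start_conversation and add_turn) and hard-coded replies in place of real model output.

```python
def start_conversation(system_prompt):
    """Begin a history with a system message that sets the model's behavior."""
    return [{"role": "system", "content": system_prompt}]


def add_turn(history, user_text, assistant_text):
    """Record one completed exchange so the next request includes it."""
    history.append({"role": "user", "content": user_text})
    history.append({"role": "assistant", "content": assistant_text})
    return history


history = start_conversation("You are a concise assistant.")
add_turn(history, "What is the capital of France?", "Paris.")
# The follow-up question relies on the earlier turns for context.
history.append({"role": "user", "content": "And its population?"})
print([m["role"] for m in history])  # ['system', 'user', 'assistant', 'user']
```

The whole history list is then passed as messages on the next request, so the model sees the earlier turns.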

Models

| Model | Context | Best For |
|---|---|---|
| deepseek-ai/DeepSeek-V3 | 128K | General purpose, cost-effective |
| anthropic/claude-4-sonnet | 200K | Reasoning, analysis, coding |
| google/gemini-2.5-flash | 1M | Fast, multimodal, long context |
| meta-llama/Llama-4-Scout | 128K | Multilingual, efficient |
| Qwen/Qwen3-Max | 128K | Reasoning, multilingual |