Reasoning models think step by step before answering. They return a reasoning_content trace followed by a final answer, which improves accuracy on math, logic, coding, and multi-step analysis.

Available models

| Model | Parameters | Strengths |
| --- | --- | --- |
| DeepSeek-R1-0528 | 671B MoE | Top-tier math and code reasoning |
| Qwen3-Max-Thinking | Proprietary | Strong multilingual reasoning |
| Qwen3-235B-A22B-Thinking-2507 | 235B MoE | Open weights, cost-effective |

Basic reasoning

from runcrate import Runcrate

client = Runcrate(api_key="rc_live_YOUR_API_KEY")
response = client.models.chat_completion(
    model="deepseek-ai/DeepSeek-R1-0528",
    messages=[{"role": "user", "content": "A train leaves Chicago at 9am at 80mph. Another leaves New York (790mi away) at 10am at 100mph toward Chicago. When do they meet?"}],
)
print("Thinking:", response.choices[0].message.reasoning_content)
print("Answer:", response.choices[0].message.content)
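The model's answer to the train prompt can be checked by hand. The sketch below is a minimal verification, assuming both departure times are in the same time zone and the Chicago train heads toward New York (the prompt states neither). The helper name `meeting_time` is hypothetical and not part of the Runcrate SDK.

```python
from datetime import datetime, timedelta

def meeting_time(distance_mi, first_depart, first_mph, second_depart, second_mph):
    """Return when two trains travelling toward each other meet.

    Assumes second_depart is not earlier than first_depart.
    """
    # Distance the first train covers before the second one departs.
    head_start_h = (second_depart - first_depart).total_seconds() / 3600
    remaining_mi = distance_mi - first_mph * head_start_h
    # Once both are moving, the gap closes at the sum of the two speeds.
    hours_to_meet = remaining_mi / (first_mph + second_mph)
    return second_depart + timedelta(hours=hours_to_meet)

day = datetime(2025, 1, 1)  # arbitrary date, only the clock time matters
meet = meeting_time(790, day.replace(hour=9), 80, day.replace(hour=10), 100)
print(meet.strftime("%H:%M:%S"))  # 13:56:40
```

Under those assumptions the trains meet at about 1:57 pm: the first train covers 80 miles alone, and the remaining 710 miles close at 180 mph.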

Streaming

# Using the same client
stream = client.models.chat_completion(
    model="deepseek-ai/DeepSeek-R1-0528",
    messages=[{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
    stream=True,
)
for chunk in stream:
    delta = chunk["choices"][0]["delta"]
    # The reasoning trace streams first, followed by the final answer.
    if delta.get("reasoning_content"):
        print(delta["reasoning_content"], end="", flush=True)
    if delta.get("content"):
        print(delta["content"], end="", flush=True)
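If the trace and the answer need to be kept apart (for example, to log the reasoning but show only the answer), the deltas can be accumulated into two strings instead of printed. This is a minimal sketch, assuming chunks are dicts shaped like the ones in the loop above. `split_stream` is a hypothetical helper, and the chunks in `fake_stream` are hand-written for illustration, not real API output.

```python
def split_stream(chunks):
    """Accumulate reasoning and answer text from an iterable of stream chunks."""
    reasoning_parts, answer_parts = [], []
    for chunk in chunks:
        choices = chunk.get("choices") or []
        if not choices:
            continue  # skip chunks that carry no choices
        delta = choices[0].get("delta") or {}
        if delta.get("reasoning_content"):
            reasoning_parts.append(delta["reasoning_content"])
        if delta.get("content"):
            answer_parts.append(delta["content"])
    return "".join(reasoning_parts), "".join(answer_parts)

# Hand-written chunks standing in for a real stream (assumption).
fake_stream = [
    {"choices": [{"delta": {"reasoning_content": "Assume sqrt(2) = p/q. "}}]},
    {"choices": [{"delta": {"reasoning_content": "Then p and q are both even."}}]},
    {"choices": [{"delta": {"content": "sqrt(2) is irrational."}}]},
    {"choices": []},
]
reasoning, answer = split_stream(fake_stream)
print("Thinking:", reasoning)
print("Answer:", answer)
```

With a live stream, the same function could be called as `split_stream(stream)`, provided the chunks really are dicts of this shape.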

Vercel AI SDK

import { runcrate } from '@runcrate/ai';
import { streamText } from 'ai';

const result = streamText({
  model: runcrate('deepseek-ai/DeepSeek-R1-0528'),
  prompt: 'Find all primes p where p^2 + 2 is also prime. Prove completeness.',
});
for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}

Financial analysis

response = client.models.chat_completion(
    model="Qwen/Qwen3-Max-Thinking",
    messages=[
        {"role": "system", "content": "You are a financial analyst. Think step by step."},
        {"role": "user", "content": "Q1: Rev $12.4M, COGS $4.8M, OpEx $5.1M. Q2: $14.1M, $5.2M, $5.4M. Q3: $13.8M, $5.5M, $5.6M. Q4: $16.2M, $5.9M, $5.8M. Calculate margins, identify trends, project Q1 next year."}
    ],
)
print(response.choices[0].message.content)
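Because the figures in the prompt are fixed, the margin arithmetic in the model's answer can be cross-checked locally. The sketch below assumes the usual definitions: gross margin is (revenue - COGS) / revenue, and operating margin is (revenue - COGS - OpEx) / revenue. The prompt does not say which margins it wants, so treat this choice as an assumption.

```python
# Quarterly figures from the prompt, in millions of dollars: (revenue, COGS, OpEx).
quarters = {
    "Q1": (12.4, 4.8, 5.1),
    "Q2": (14.1, 5.2, 5.4),
    "Q3": (13.8, 5.5, 5.6),
    "Q4": (16.2, 5.9, 5.8),
}

def margins(revenue, cogs, opex):
    """Return (gross margin, operating margin) as fractions of revenue."""
    gross = (revenue - cogs) / revenue
    operating = (revenue - cogs - opex) / revenue
    return gross, operating

results = {name: margins(*figures) for name, figures in quarters.items()}
for name, (gross, operating) in results.items():
    print(f"{name}: gross {gross:.1%}, operating {operating:.1%}")
```

Gross margin stays in the low 60s all year (about 61%, 63%, 60%, 64%), while operating margin dips in Q3 (about 20%) and peaks in Q4 (about 28%), a pattern the model's trend analysis should reflect.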

Code debugging

response = client.models.chat_completion(
    model="Qwen/Qwen3-235B-A22B-Thinking-2507",
    messages=[{"role": "user", "content": "Find the bug:\ndef merge_sorted(a, b):\n    result, i, j = [], 0, 0\n    while i < len(a) and j < len(b):\n        if a[i] <= b[j]: result.append(a[i]); i += 1\n        else: result.append(b[j]); j += 1\n    return result"}],
)
print(response.choices[0].message.content)
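For reference when judging the model's reply: the function in the prompt stops as soon as either list is exhausted, so the remaining elements of the other list are dropped. One possible fix, shown as a sketch rather than the only correct answer, is to append both tails after the loop.

```python
def merge_sorted(a, b):
    """Merge two already-sorted lists into one sorted list."""
    result, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            result.append(a[i])
            i += 1
        else:
            result.append(b[j])
            j += 1
    # The version in the prompt returns here and loses the leftovers.
    # At most one of these slices is non-empty.
    result.extend(a[i:])
    result.extend(b[j:])
    return result

print(merge_sorted([1, 4, 9], [2, 3]))  # [1, 2, 3, 4, 9]
```

A good response should identify the missing tail handling; without it, the example call above returns `[1, 2, 3]`.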

Tips

  • reasoning_content contains the step-by-step trace. The final answer is in content.
  • Longer thinking generally means better answers on hard problems. 10-30 seconds of thinking is normal for them.
  • Cost scales with thinking tokens. Complex problems generate more reasoning tokens.
  • When to use. Math, logic, code debugging, multi-step analysis. For simple Q&A, standard models are faster.
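To get a feel for how much of a response is thinking, the lengths of the two fields can be compared. This is a rough sketch only: it counts characters, not tokens, so it approximates the split rather than the billed amount. `reasoning_share` is a hypothetical helper, not part of the SDK.

```python
def reasoning_share(reasoning, answer):
    """Fraction of output characters spent on reasoning (a crude proxy for tokens)."""
    total = len(reasoning) + len(answer)
    return len(reasoning) / total if total else 0.0

# Placeholder strings standing in for message.reasoning_content and message.content.
share = reasoning_share("x" * 900, "y" * 100)
print(f"{share:.0%} of output characters were reasoning")  # 90% of output characters were reasoning
```

A consistently high share on simple prompts is a hint that a standard model may be the faster, cheaper choice for that workload.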