# Route Between AI Models

> Use different AI models for different tasks in one app: cheap models for classification, expensive models for generation. DeepSeek for bulk, Claude for quality.

Not every request needs the same model. Use a fast, cheap model for classification and routing, then send generation tasks to a stronger model. With Runcrate, every model shares the same API, so switching models means changing only the `model` string in the request.
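As a small illustration of that point, the sketch below builds two request payloads that differ only in the `model` string. `build_request` is a hypothetical helper written for this example, not part of the Runcrate API or the OpenAI SDK; the model IDs are the ones used in the router example on this page.

```python
def build_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build keyword arguments for a chat completion call (hypothetical helper)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

prompt = "What is the capital of Japan?"
cheap = build_request("deepseek-ai/DeepSeek-V3.2", prompt)
strong = build_request("anthropic/claude-4-sonnet", prompt)

# Collect the keys whose values differ between the two payloads.
changed = {key for key in cheap if cheap[key] != strong[key]}
print(changed)
```

Either payload can be passed to an OpenAI-compatible client as `client.chat.completions.create(**cheap)`.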

## The routing pattern

```
User request
    ↓
Classify intent (fast model — DeepSeek V3.2)
    ↓
┌─────────────────────────────────────────────┐
│  simple question → DeepSeek V3.2 ($0.30/M)  │
│  creative writing → Claude 4 Sonnet ($3/M)  │
│  code generation → Qwen3 Coder ($0.20/M)    │
│  unsafe content  → blocked                  │
└─────────────────────────────────────────────┘
    ↓
Response
```

***

## Build a model router

```python theme={"theme":"github-dark"}
from openai import OpenAI

client = OpenAI(
    base_url="https://api.runcrate.ai/v1",
    api_key="rc_live_YOUR_API_KEY",
)

# Fallback when the classifier returns an unrecognized category
DEFAULT_MODEL = "deepseek-ai/DeepSeek-V3.2"

# Step 1: Classify the request with a fast, cheap model
def classify_intent(user_message: str) -> str:
    response = client.chat.completions.create(
        model="deepseek-ai/DeepSeek-V3.2",
        messages=[
            {
                "role": "system",
                "content": 'Classify the user message into exactly one category: "simple_qa", "creative", "code", "unsafe". Return only the category string, no quotes, no explanation.',
            },
            {"role": "user", "content": user_message},
        ],
        max_tokens=16,
    )
    # message.content can be None, so guard before normalizing
    content = response.choices[0].message.content or ""
    return content.strip().lower()

# Step 2: Route to the right model
MODEL_MAP = {
    "simple_qa": "deepseek-ai/DeepSeek-V3.2",
    "creative": "anthropic/claude-4-sonnet",
    "code": "Qwen/Qwen3-Coder-480B-A35B-Instruct-Turbo",
}

def generate(user_message: str, intent: str) -> str:
    """Generate a response with the model mapped to an already-classified intent."""
    if intent == "unsafe":
        return "I can't help with that request."

    model = MODEL_MAP.get(intent, DEFAULT_MODEL)

    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": user_message}],
        max_tokens=2048,
    )
    return response.choices[0].message.content or ""

def route_and_generate(user_message: str) -> str:
    """Classify the message, then generate with the routed model."""
    return generate(user_message, classify_intent(user_message))


# Try it
queries = [
    "What is the capital of Japan?",
    "Write a short story about a robot learning to paint.",
    "Write a Python function to parse CSV files with error handling.",
]

for query in queries:
    # Classify once and reuse the result, so the printed intent
    # is the one actually used for generation
    intent = classify_intent(query)
    model = MODEL_MAP.get(intent, DEFAULT_MODEL)
    print(f"Query: {query}")
    print(f"Intent: {intent} → Model: {model}")
    print(f"Response: {generate(query, intent)[:100]}...")
    print()
```
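The classifier returns free text, so its reply may not match a `MODEL_MAP` key exactly (for example `"Creative".` or `Category: code`). Below is a minimal sketch of a normalization step you could apply to the output of `classify_intent`, assuming the four categories from the system prompt above. `normalize_intent` and the `simple_qa` fallback are illustrative choices, not part of the Runcrate API.

```python
import re

# Categories listed in the classifier's system prompt.
CATEGORIES = ("simple_qa", "creative", "code", "unsafe")

def normalize_intent(raw: str, default: str = "simple_qa") -> str:
    """Map free-form classifier output onto a known category.

    Returns `default` when no known category appears in the text.
    """
    text = (raw or "").lower()
    # Split on anything that is not a lowercase letter or underscore,
    # so quotes, punctuation, and prefixes are ignored.
    tokens = re.split(r"[^a-z_]+", text)
    # Err on the side of blocking: "unsafe" wins if it appears anywhere.
    if "unsafe" in tokens:
        return "unsafe"
    for token in tokens:
        if token in CATEGORIES:
            return token
    return default

print(normalize_intent('"Creative".'))     # creative
print(normalize_intent("Category: code"))  # code
print(normalize_intent("not sure"))        # simple_qa
```

Falling back to the cheapest category keeps an ambiguous reply from being routed to the most expensive model; a stricter router could retry the classification or reject the request instead.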

***

## Cost comparison

| Strategy                   | Avg cost per request | Quality                      |
| -------------------------- | -------------------- | ---------------------------- |
| Always use Claude 4 Sonnet | \~\$0.015            | Highest                      |
| Always use DeepSeek V3.2   | \~\$0.001            | Good                         |
| **Routed (this example)**  | **\~\$0.003**        | **Highest where it matters** |

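Routing typically cuts costs 60-80% compared to always using the strongest model, with minimal quality loss, because most requests are simple Q\&A that a fast model handles well. The routed figure depends on your traffic mix: the more requests that need the stronger model, the smaller the savings.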
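To see where a blended figure like the one in the table can come from, here is a rough calculation. The two per-request costs are taken from the table; the traffic mix (85% of requests routed to the cheap model, 15% to Claude 4 Sonnet) and the per-request classifier overhead are assumptions for illustration, not measured Runcrate data.

```python
# Per-request costs from the table above (USD).
COST_STRONG = 0.015  # always Claude 4 Sonnet
COST_CHEAP = 0.001   # always DeepSeek V3.2

# Assumptions for illustration only.
SHARE_CHEAP = 0.85        # fraction of requests routed to the cheap model
CLASSIFIER_COST = 0.0001  # extra classification call added to every request

def blended_cost(share_cheap: float) -> float:
    """Average cost per request when routing between two models."""
    routed = share_cheap * COST_CHEAP + (1 - share_cheap) * COST_STRONG
    return routed + CLASSIFIER_COST

cost = blended_cost(SHARE_CHEAP)
savings = 1 - cost / COST_STRONG
print(f"Routed cost per request: ${cost:.4f}")  # $0.0032
print(f"Savings vs. strongest model: {savings:.0%}")  # 79%
```

Under these assumptions the routed cost lands near the table's \~\$0.003. If half of the traffic needed the stronger model instead, the same formula gives about \$0.0081 per request, a saving of roughly 46%.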

***

## Next steps

* [AI content moderation](/examples/ai-content-moderation) — add Llama Guard as a safety check before routing.
* [Build an AI SaaS backend](/examples/ai-saas-backend) — full production backend with routing, billing, and rate limiting.
* [Model catalog](/models/model-catalog) — compare models by price, speed, and capability.
