Inference by the token. Compute by the second. Scale by the month — or don't. We run the infra so you can ship the model.
Trusted by teams at
Inference API
200+ models across chat, code, image, video, audio, and more. One endpoint, every provider.
80+ models


25+ models



30+ models


15+ models


20+ models



15+ models


12+ models



8+ models



Infrastructure
Raw bare metal. Full root access. Pick your hardware, deploy in 60 seconds. Per-minute billing. Scale from 1 node to 128.
Current fleet
Key specs
Full control over your environment. SSH, Docker, custom images.
Scale horizontally on demand. Add nodes in seconds, release when done.
No minimum commitments. Spin up for 5 minutes or 5 months.
Platform
(Not another provider to reconcile.)
Add funds once. Spend on API calls or GPU hours — same balance, same dashboard. No reconciling invoices from three providers.
Get an API key. Ship today. Talk to nobody.
$0
per month · pay as you go
What's included
Deployment
When the rate card starts to hurt.
Custom
volume discounts
What's included
Deployment
Run it our way. Or run it in your VPC.
Custom
tailored contract
What's included
Deployment
One platform for every AI workflow. Your team's default.
ManagedMulti-node clusters with NVLink fabric. Submit jobs via squeue. We handle the orchestration.
Whether you're pre-training a frontier model, finetuning on proprietary data, or serving production traffic — it all runs on the same infrastructure. One dashboard. One bill.
Upload dataset, pick base model. Pay per training second.
Reserved GPUs, p99 SLAs. 40–60% cheaper than hyperscalers.
Multi-node clusters. Hundreds of GPUs, one command.
api.runcrate.ai/v1/chat/completionsWhy Runcrate
Inference API and GPU compute, unified. One API key, one bill, one dashboard. Stop juggling providers.
DeepSeek, Llama, Claude, Qwen, FLUX, Sora — all via OpenAI-compatible API. Swap base_url and ship.
H100, H200, B200, MI300X — per-second billing, no commitments. Stop the instance, stop the meter.
Public rate card beats every aggregator. Volume discounts on dedicated. No hidden fees, credits never expire.