Runcrate Compute
Instances and Crates on the GPU lineup that matters — L40S to B200, single-node to 8× H200 clusters. Sub-minute cold starts, per-second billing, no commit required.
● Multi-cloud · 12 regions · 24/7 capacity
WHY RUNCRATE COMPUTE
Speed
Cold-start to SSH. Template-cached environments mean you wait on your code to load, not your VM.
Variety
GPU SKUs from L40S dev boxes to 8× B200 training clusters. Match the hardware to the job, then switch it for the next one.
Cost
Billed by the second. No reservation commits, no minimum runtime, no idle charges. Stop the instance, stop the meter.
WHAT YOU GET
Template-cached environments. SSH or HTTPS access in under 60 seconds — no AMI fetch, no Docker pull, no IAM dance.
Per-second billing with no minimum runtime and no idle charges. Stop the instance, stop the meter: an hour-long job costs exactly an hour of GPU time.
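The per-second math is simple to sketch. The rate below is hypothetical (check the public rate card for real numbers), but the mechanics are what per-second billing means:

```shell
# Hypothetical rate card entry: $3.50/hr, billed per second.
# A 70-minute job bills exactly 4200 seconds, with no round-up to the next hour.
rate_per_hour=3.50
seconds=4200
awk -v r="$rate_per_hour" -v s="$seconds" 'BEGIN { printf "%.2f\n", r * s / 3600 }'
# prints 4.08, versus 7.00 on per-hour billing that rounds 70 minutes up to 2 hours
```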
L40S, A100, H100, H200, B200, RTX 6000 Ada. Single-GPU dev boxes to 8× training clusters in the same workspace.
PyTorch, vLLM, Axolotl, ComfyUI, JupyterLab — start from a working stack, not an empty Ubuntu image.
Durable disks that outlive the instance. Snapshot a dataset on an L40S, attach it to an 8× H200 cluster next.
Real shell, real port forwarding, real tooling. Not a notebook trapped in a browser UI.
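Port forwarding here is plain OpenSSH, nothing proprietary. A sketch, assuming a hypothetical instance hostname and JupyterLab listening on its default port 8888 (substitute your instance's real SSH endpoint):

```shell
# Hypothetical endpoint. To open a shell and forward local port 8888
# to JupyterLab on the instance, you would run:
#   ssh -L 8888:localhost:8888 dev@gpu-instance.example.com
# `ssh -G` prints the resolved client config without connecting,
# which is a quick way to verify the forward is set up as intended:
ssh -G -L 8888:localhost:8888 dev@gpu-instance.example.com | grep -i localforward
```

After connecting for real, `http://localhost:8888` in a local browser reaches the notebook running on the GPU.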
TWO SHAPES, ONE PLATFORM
Instances run as containers, VMs, or bare metal — pick what the job wants, get a full shell on all of them. For dev, training, fine-tuning, or anything that needs a real environment. Crates are pre-configured deployments — model runners, app servers, one-click stacks where you never want to open a terminal.
Both share the same provisioning layer: template cache, multi-cloud autoscaler, per-second billing, persistent volumes. Pick the shape that fits the job. Switch when the job changes.
VS THE TRADITIONAL CLOUDS
| Feature | Traditional cloud | Runcrate Compute |
|---|---|---|
| Boot time | Minutes — AMI fetch + IAM + Docker pull | <60s, template-cached |
| Billing | Per-hour or per-minute | Per-second |
| Reservation commit | 1–3 year RIs for the discounted rate | No commit, list = your rate |
| GPU availability | Capacity errors on H200 / B200 | Multi-cloud, multi-region |
| Pricing model | List + opaque enterprise discount | Public per-second rate card |
| Setup | Custom AMI + Docker + IAM dance | One-click template |
| Dev environments | Cobbled together with SageMaker | First-class shape (Instances) |
| Multi-cloud | Painful, lock-in by design | The default |
ON-DEMAND GPUS
Blackwell
B200
192GB · 8TB/s
Hopper
H200
141GB · 4.8TB/s
Hopper
H100
80GB · 3.3TB/s
Ampere
A100
80GB · 2.0TB/s
Ada
L40S
48GB · 864GB/s
Ada
RTX 6000 Ada
48GB · 960GB/s
Single-GPU dev boxes to 8× multi-node clusters. Switch SKU on the next job, not the next contract.
Browse all SKUs
TALK TO AN ENGINEER
Self-serve covers most jobs. For reserved multi-node clusters, BYOC deployments, or specialty SKUs, tell us your workload and we'll come back with a rate card within 24 hours.
Prefer Slack?
Create a Slack Connect channel
FAQ