Runcrate Compute

GPU compute,
built for the way AI actually ships.

Instances and Crates on the GPU lineup that matters — L40S to B200, single-node to 8× H200 clusters. Sub-minute cold starts, per-second billing, no commit required.

Multi-cloud · 12 regions · 24/7 capacity

<60s
Cold-start boot
Per-sec
Billing granularity
20+
GPU SKUs

WHY RUNCRATE COMPUTE

Stop fighting the cloud to do AI work.

<60s

Speed

Cold-start to SSH. Template-cached environments mean you wait on your code to load, not your VM.

20+

Variety

GPU SKUs from L40S dev boxes to 8× B200 training clusters. Match the hardware to the job, then switch it for the next one.

Per-sec

Cost

Billed by the second. No reservation commits, no minimum runtime, no idle charges. Stop the instance, stop the meter.

WHAT YOU GET

Compute that behaves like a tool, not a project.

Boot in seconds

Template-cached environments. SSH or HTTPS access in under 60 seconds — no AMI fetch, no Docker pull, no IAM dance.

Per-second billing

No minimum runtime, no idle charges. Stop the instance, stop the meter. An hour-long job bills as an hour-long job, nothing rounds up.
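To see what per-second granularity buys, here's a quick sketch comparing it against per-hour rounding. The $2.50/hr H100 rate is a hypothetical number for illustration only, not Runcrate's actual price; check the public rate card for real figures.

```python
import math

# Hypothetical rate for illustration -- not Runcrate's actual price.
HOURLY_RATE = 2.50  # $/hr for a single H100 (assumed)

def per_second_cost(runtime_seconds: int) -> float:
    """Bill exactly the seconds used."""
    return runtime_seconds * HOURLY_RATE / 3600

def per_hour_cost(runtime_seconds: int) -> float:
    """Per-hour billing rounds up to the next full hour."""
    hours_billed = math.ceil(runtime_seconds / 3600)
    return hours_billed * HOURLY_RATE

# A 61-minute fine-tuning run:
runtime = 61 * 60  # 3660 seconds
print(f"per-second: ${per_second_cost(runtime):.2f}")  # $2.54
print(f"per-hour:   ${per_hour_cost(runtime):.2f}")    # $5.00 -- rounded to 2 hours
```

The one-minute overrun is the whole difference: per-second billing charges for 61 minutes, hourly billing charges for 120.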

Full GPU lineup

L40S, A100, H100, H200, B200, RTX 6000 Ada. Single-GPU dev boxes to 8× training clusters in the same workspace.

Templates library

PyTorch, vLLM, Axolotl, ComfyUI, JupyterLab — start from a working stack, not an empty Ubuntu image.

Persistent volumes

Durable disks that outlive the instance. Snapshot a dataset on an L40S, attach it to an 8× H200 cluster next.

SSH + HTTPS access

Real shell, real port forwarding, real tooling. Not a notebook trapped in a browser UI.

TWO SHAPES, ONE PLATFORM

Instances + Crates. Different jobs, same plumbing.

Instances run as containers, VMs, or bare-metal — pick what the job wants, get a full shell on all of them. For dev, training, fine-tuning, or anything that needs a real environment. Crates are pre-configured deployments — model runners, app servers, one-click stacks where you never want to open a terminal.

Both share the same provisioning layer: template cache, multi-cloud autoscaler, per-second billing, persistent volumes. Pick the shape that fits the job. Switch when the job changes.

Start from a template
                 Instance                       Crate
Form             Container · VM · Bare-metal    Container
Access           SSH + HTTPS                    HTTPS endpoint only
Typical job      Dev, training, custom builds   Model serving, app deploy
Configuration    Pick a template, drop in       Pick an app, hit deploy
Workload         Interactive, multi-step        Long-running, autoscaled
Best for         Power users                    Shipping a product

VS THE TRADITIONAL CLOUDS

Built different from AWS, GCP, and Azure, for AI work.

Boot time
Traditional: Minutes — AMI fetch + IAM + Docker pull
Runcrate: <60s, template-cached
Billing
Traditional: Per-hour or per-minute
Runcrate: Per-second
Reservation commit
Traditional: 1–3 year RIs for the discounted rate
Runcrate: No commit, list = your rate
GPU availability
Traditional: Capacity errors on H200 / B200
Runcrate: Multi-cloud, multi-region
Pricing model
Traditional: List + opaque enterprise discount
Runcrate: Public per-second rate card
Setup
Traditional: Custom AMI + Docker + IAM dance
Runcrate: One-click template
Dev environments
Traditional: Cobbled together with SageMaker
Runcrate: First-class shape (Instances)
Multi-cloud
Traditional: Painful, lock-in by design
Runcrate: The default

ON-DEMAND GPUS

The lineup that matters, ready in seconds.

Blackwell

B200

192GB · 8TB/s

Hopper

H200

141GB · 4.8TB/s

Hopper

H100

80GB · 3.3TB/s

Ampere

A100

80GB · 2.0TB/s

Ada

L40S

48GB · 864GB/s

Ada

RTX 6000 Ada

48GB · 960GB/s

Single-GPU dev boxes to 8× multi-node clusters. Switch SKU on the next job, not the next contract.

Browse all SKUs

TALK TO AN ENGINEER

Need reserved capacity?

Self-serve covers most jobs. For reserved multi-node clusters, BYOC deployments, or specialty SKUs, tell us your workload and we'll come back with a rate card within 24 hours.

Response within 24 hours
Reserved-capacity rate card
BYOC and custom-image support

Prefer Slack?

Create a Slack Connect channel

FAQ

The questions every team asks.

Spin up the instance. Ship the job.