ML INFRASTRUCTURE

ML infrastructure, simplified.

Everything an AI team needs in one platform. Serverless API inference for 200+ models, dedicated GPU instances for custom workloads, managed storage for datasets, and team collaboration tools. Start with the API for prototyping, scale to dedicated instances for production. One account, one billing dashboard, no vendor sprawl.

200+
API models
20+
GPUs available
60s
Deploy time

AVAILABLE GPUS

GPUs you can deploy today.

NVIDIA H200 141GB
From $2.41/hr
141GB HBM3e, 4.8TB/s bandwidth
NVIDIA H100 80GB
From $1.54/hr
80GB HBM3, 3.35TB/s, NVLink
NVIDIA B200 192GB
From $3.20/hr
192GB HBM3e, Blackwell arch
NVIDIA A100 80GB
From $1.06/hr
80GB HBM2e, 2TB/s bandwidth
NVIDIA L40S 48GB
From $0.80/hr
48GB GDDR6X, Ada Lovelace
NVIDIA RTX 4090 24GB
From $0.52/hr
24GB GDDR6X, best value
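The cards above double as a quick selection table. As a sketch (a hypothetical helper, with rates copied from the cards; not part of the Runcrate SDK), picking the cheapest GPU that meets a VRAM requirement is a one-liner:

```typescript
// GPU catalog copied from the pricing cards above (illustrative rates).
const gpus = [
  { name: "H200", vramGb: 141, hourlyUsd: 2.41 },
  { name: "H100", vramGb: 80, hourlyUsd: 1.54 },
  { name: "B200", vramGb: 192, hourlyUsd: 3.2 },
  { name: "A100", vramGb: 80, hourlyUsd: 1.06 },
  { name: "L40S", vramGb: 48, hourlyUsd: 0.8 },
  { name: "RTX 4090", vramGb: 24, hourlyUsd: 0.52 },
];

// Cheapest GPU with at least `minVramGb` of memory, or undefined if none fits.
function cheapestWith(minVramGb: number) {
  return gpus
    .filter((g) => g.vramGb >= minVramGb)
    .sort((a, b) => a.hourlyUsd - b.hourlyUsd)[0];
}

console.log(cheapestWith(80)?.name); // "A100" — $1.06/hr beats H100 and H200
```

For an 80GB model that doesn't need NVLink, the A100 is the cheapest fit; bump the requirement past 141GB and only the B200 qualifies.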

WHY RUNCRATE

Built for production.

Inference API + Bare-Metal GPUs

200+ models via serverless API for instant inference. Or deploy dedicated H100/H200/B200/A100 instances for training, fine-tuning, and self-hosted models. Both from one account.

Deploy in 60 Seconds

Select GPU, pick a template (PyTorch, CUDA, Jupyter, VS Code), and launch. SSH access, port forwarding, and browser IDE included. No VPC, no IAM, no cloud-architect PhD.

Per-Minute Billing, No Lock-In

Pay per minute for GPU instances, per token for API inference. No reservations, no minimum spend, no contracts. Spin up for 10 minutes or 10 months.
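Per-minute billing makes run costs easy to reason about: the hourly rate is simply prorated to the minute. A minimal sketch (`gpuCostUsd` is a hypothetical helper for illustration, not part of the Runcrate SDK; rates are from the pricing cards above):

```typescript
// Hypothetical helper: prorate an hourly GPU rate to per-minute billing.
function gpuCostUsd(hourlyRate: number, minutes: number): number {
  const perMinute = hourlyRate / 60;
  return Math.round(perMinute * minutes * 100) / 100; // round to cents
}

// A 90-minute fine-tuning run on a single H100 at $1.54/hr:
console.log(gpuCostUsd(1.54, 90)); // 2.31

// A 10-minute experiment on an RTX 4090 at $0.52/hr:
console.log(gpuCostUsd(0.52, 10)); // 0.09
```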

Multi-GPU & Distributed Training

Scale from 1 to 128 GPUs per node. NVLink and NVSwitch interconnects on H100/H200. DeepSpeed, FSDP, and Megatron-LM ready out of the box.

COMPARISON

Runcrate vs AWS SageMaker.

Setup time
Runcrate: 60 seconds
AWS SageMaker: Hours to days
H100 80GB price
Runcrate: $1.54/hr
AWS SageMaker: $4.90/hr (p5.48xlarge)
A100 80GB price
Runcrate: $1.06/hr
AWS SageMaker: $4.10/hr (p4d.24xlarge)
Pre-built inference
Runcrate: 200+ models, one API call
AWS SageMaker: JumpStart marketplace + deploy
Billing
Runcrate: Per-minute, prepaid credits
AWS SageMaker: Per-second, invoice billing
Complexity
Runcrate: API key + SDK
AWS SageMaker: IAM, VPC, endpoints, roles...

GET STARTED

Try it now.

import Runcrate from "@runcrate/sdk";

const rc = new Runcrate({ apiKey: "rc_live_YOUR_API_KEY" });

// List available GPU types and pricing
const gpus = await rc.instances.listTypes({ gpuType: "H100" });
console.log(gpus); // [{ gpuType: "H100", hourlyRate: 1.54, ... }]

// Deploy an H100 instance with PyTorch
const instance = await rc.instances.create({
  name: "training-run-1",
  gpuType: "H100",
  gpuCount: 4,
  sshKeyId: "sk_...",
  template: "pytorch-cuda",
  storage: 200,
});
console.log(instance.id, instance.status);

// Check status
const status = await rc.instances.getStatus(instance.id);
console.log(status.ip); // SSH into your instance

// Also run inference via API
const response = await rc.chat.completions.create({
  model: "deepseek-ai/DeepSeek-V3",
  messages: [{ role: "user", content: "Hello!" }],
});

// Terminate when done
await rc.instances.terminate(instance.id);
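One caveat on the snippet above: it reads `status.ip` right after creation, but a freshly launched instance may still be provisioning. Assuming the SDK has no built-in waiter (an assumption — check the Runcrate docs), a small generic polling helper covers it; `waitFor` below is hypothetical, not part of the SDK:

```typescript
// Hypothetical generic poller: calls `check` until it returns a defined
// value or the timeout elapses. Works with any async status function.
async function waitFor<T>(
  check: () => Promise<T | undefined>,
  { intervalMs = 5_000, timeoutMs = 120_000 } = {}
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const result = await check();
    if (result !== undefined) return result;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("timed out waiting for condition");
}

// Usage with the instance from above (sketch):
// const ip = await waitFor(async () => {
//   const s = await rc.instances.getStatus(instance.id);
//   return s.status === "running" ? s.ip : undefined;
// });
```

Defaulting to a 5-second interval keeps polling well under API rate limits while still resolving within the advertised 60-second deploy window.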

FAQ

Common questions.

Build your ML infrastructure.