ML INFRASTRUCTURE
Everything an AI team needs in one platform. Serverless API inference for 200+ models, dedicated GPU instances for custom workloads, managed storage for datasets, and team collaboration tools. Start with the API for prototyping, scale to dedicated instances for production. One account, one billing dashboard, no vendor sprawl.
AVAILABLE GPUS
| Model | Provider | Price | Detail |
|---|---|---|---|
| NVIDIA H200 141GB | Nvidia | From $2.41/hr | 141GB HBM3e, 4.8TB/s bandwidth |
| NVIDIA H100 80GB | Nvidia | From $1.54/hr | 80GB HBM3, 3.35TB/s, NVLink |
| NVIDIA B200 192GB | Nvidia | From $3.20/hr | 192GB HBM3e, Blackwell arch |
| NVIDIA A100 80GB | Nvidia | From $1.06/hr | 80GB HBM2e, 2TB/s bandwidth |
| NVIDIA L40S 48GB | Nvidia | From $0.80/hr | 48GB GDDR6X, Ada Lovelace |
| NVIDIA RTX 4090 24GB | Nvidia | From $0.52/hr | 24GB GDDR6X, best value |
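Picking a GPU from the table usually comes down to the cheapest card with enough VRAM. A minimal sketch of that selection (the rates are copied from the table above; the helper itself is hypothetical, not part of the SDK):

```typescript
// Catalog mirroring the pricing table above (rates in USD/hr, starting prices).
interface GpuOption {
  gpu: string;
  vramGB: number;
  hourlyRate: number;
}

const catalog: GpuOption[] = [
  { gpu: "H200", vramGB: 141, hourlyRate: 2.41 },
  { gpu: "H100", vramGB: 80, hourlyRate: 1.54 },
  { gpu: "B200", vramGB: 192, hourlyRate: 3.2 },
  { gpu: "A100", vramGB: 80, hourlyRate: 1.06 },
  { gpu: "L40S", vramGB: 48, hourlyRate: 0.8 },
  { gpu: "RTX 4090", vramGB: 24, hourlyRate: 0.52 },
];

// Cheapest GPU with at least `minVramGB` of memory, or undefined if none fits.
function cheapestWithVram(minVramGB: number): GpuOption | undefined {
  return catalog
    .filter((g) => g.vramGB >= minVramGB)
    .sort((a, b) => a.hourlyRate - b.hourlyRate)[0];
}

console.log(cheapestWithVram(80)?.gpu);  // "A100" — cheaper than H100 at 80GB
console.log(cheapestWithVram(100)?.gpu); // "H200" — cheaper than B200 above 80GB
```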
WHY RUNCRATE
200+ models via serverless API for instant inference. Or deploy dedicated H100/H200/B200/A100 instances for training, fine-tuning, and self-hosted models. Both from one account.
Select GPU, pick a template (PyTorch, CUDA, Jupyter, VS Code), and launch. SSH access, port forwarding, and browser IDE included. No VPC, no IAM, no cloud-architect PhD.
Pay per minute for GPU instances, per token for API inference. No reservations, no minimum spend, no contracts. Spin up for 10 minutes or 10 months.
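Per-minute billing makes cost estimates straightforward: GPU-minutes times the per-minute rate. A quick sketch (rates taken from the pricing table above; rounding to cents is a display choice here, not documented billing behavior):

```typescript
// Estimated cost in USD for `gpuCount` GPUs billed at `hourlyRate` USD/hr
// per GPU, running for `minutes` minutes.
function estimateCost(hourlyRate: number, gpuCount: number, minutes: number): number {
  const raw = (hourlyRate / 60) * minutes * gpuCount;
  return Math.round(raw * 100) / 100; // round to cents for display
}

// 4x H100 at $1.54/hr for a 90-minute run:
console.log(estimateCost(1.54, 4, 90)); // 9.24

// A 10-minute experiment on a single RTX 4090:
console.log(estimateCost(0.52, 1, 10)); // 0.09
```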
Scale from 1 to 128 GPUs per node. NVLink and NVSwitch interconnects on H100/H200. DeepSpeed, FSDP, and Megatron-LM ready out of the box.
COMPARISON
| Feature | Runcrate | AWS SageMaker |
|---|---|---|
| Setup time | 60 seconds | Hours to days |
| H100 80GB price | $1.54/hr | $4.90/hr (p5.48xlarge, per GPU) |
| A100 80GB price | $1.06/hr | $4.10/hr (p4d.24xlarge, per GPU) |
| Pre-built inference | 200+ models, one API call | JumpStart marketplace + deploy |
| Billing | Per-minute, prepaid credits | Per-second, invoice billing |
| Complexity | API key + SDK | IAM, VPC, endpoints, roles... |
GET STARTED
import Runcrate from "@runcrate/sdk";
const rc = new Runcrate({ apiKey: "rc_live_YOUR_API_KEY" });
// List available GPU types and pricing
const gpus = await rc.instances.listTypes({ gpuType: "H100" });
console.log(gpus); // [{ gpuType: "H100", hourlyRate: 1.54, ... }]
// Deploy an H100 instance with PyTorch
const instance = await rc.instances.create({
name: "training-run-1",
gpuType: "H100",
gpuCount: 4,
sshKeyId: "sk_...",
template: "pytorch-cuda",
storage: 200,
});
console.log(instance.id, instance.status);
// Check status
const status = await rc.instances.getStatus(instance.id);
console.log(status.ip); // SSH into your instance
// Also run inference via API
const response = await rc.chat.completions.create({
model: "deepseek-ai/DeepSeek-V3",
messages: [{ role: "user", content: "Hello!" }],
});
// Terminate when done
await rc.instances.terminate(instance.id);
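Provisioning is not instant, so in practice you poll until the instance reports a running state before SSHing in. A generic retry helper like the one below pairs with the getStatus call from the snippet above (the helper is a sketch; the "running" status value and the SDK's actual lifecycle states are assumptions):

```typescript
// Poll `check` until its result satisfies `done`, or give up after
// `maxAttempts` tries spaced `intervalMs` apart.
async function pollUntil<T>(
  check: () => Promise<T>,
  done: (value: T) => boolean,
  { intervalMs = 5000, maxAttempts = 60 } = {},
): Promise<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const value = await check();
    if (done(value)) return value;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`gave up after ${maxAttempts} attempts`);
}

// Usage with the snippet above (assumed status value "running"):
// const status = await pollUntil(
//   () => rc.instances.getStatus(instance.id),
//   (s) => s.status === "running",
// );
// console.log(status.ip); // now safe to SSH
```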