How much does the NVIDIA GB200 cost per hour?

GB200 pricing on Runcrate is $3.50-$4.00/hr (avg $3.75/hr) for reserved capacity. Oracle lists $5.85/hr, AWS $6.20/hr, Azure $6.50/hr, and GCP $6.95/hr, so Runcrate is the lowest published GB200 cloud rate. GB200 is reserved-only because it is supply-constrained.

GB200 vs B200 — what's the difference?

GB200 is a Grace + Blackwell superchip pairing 1 ARM CPU with 2 Blackwell GPUs, sharing 384GB unified memory. B200 is just the Blackwell GPU (192GB HBM3e). GB200 is the right choice for the largest training jobs that need CPU↔GPU unified memory; B200 is right for inference and most fine-tuning.

GB200 vs GB300 — which to choose?

GB300 has 33% more VRAM per GPU (256GB vs 192GB), 25% more bandwidth, and 33% more compute. For frontier-model pretraining (405B+) or models that don't fit on GB200, GB300 wins. For everything else, GB200 is the more available option today.

Is the GB200 available for on-demand rental?

GB200 is currently reservation-only due to supply constraints. Contact us for a quote with committed term and capacity guarantee.

What's the GB200 NVL72 system?

NVL72 is a rack-scale system built from 36 GB200 superchips, giving 72 Blackwell GPUs and 36 Grace CPUs that share memory via NVLink. Runcrate offers full NVL72 racks for frontier training; contact sales.

NVIDIA · Blackwell · 2024

NVIDIA GB200.

Name: NVIDIA GB200 GPU Cloud Instance
Brand: NVIDIA
Price: 3.75 USD
Availability: PreOrder

Grace + Blackwell unified-memory superchip — frontier-model training at the largest scale.

192 GB HBM3e
8.0 TB/s bandwidth
180 TFLOPS FP16
8× GPU NVLink

Reserved capacity

Cloud rental

$3.75/hr

On-demand · per-second billing

$3.50–$4.00/hr range

Or $2.63/hr reserved (30% off)
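
As a rough check on the reserved figure, the sketch below applies a 30% discount to the $3.75/hr rate and rounds to the nearest cent. The function name `reservedRate` is hypothetical and not part of any Runcrate SDK; it also assumes the discount is taken off the average rate rather than the ends of the range.

```typescript
// Hypothetical helper for illustration only; not a Runcrate API.
// Applies a percentage discount to an hourly USD rate and rounds to cents.
function reservedRate(hourlyUsd: number, discountPercent: number): number {
  // Work in thousandths of a dollar so the half-cent case rounds predictably.
  const mills = Math.round(hourlyUsd * 1000);
  const discountedMills = (mills * (100 - discountPercent)) / 100;
  // Math.round rounds halves upward, so 262.5 cents becomes 263 cents.
  return Math.round(discountedMills / 10) / 100;
}

console.log(`$${reservedRate(3.75, 30).toFixed(2)}/hr`);
```

Rounding in integer units avoids the case where `3.75 * 0.7` lands just under 2.625 in floating point and would otherwise round down.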

Pricing across clouds

NVIDIA GB200 cloud rental price comparison.

Same GPU, cheapest-first. Prices reflect publicly listed hourly rates for NVIDIA GB200 on each provider. Runcrate is the lowest published rate.

Runcrate (cheapest): $3.75/hr
Oracle: $5.85/hr
AWS: $6.20/hr
Azure: $6.50/hr
GCP: $6.95/hr
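
To put the listed rates in context, here is a minimal sketch that computes the percentage saved at the Runcrate rate relative to each other provider, plus the cost of a 720-hour month. The rates are the ones listed on this page; the 720-hour month and the helper names `savingsPercent` and `costForHours` are assumptions made for illustration.

```typescript
// Hourly GB200 rates in USD, copied from the comparison on this page.
const runcrateRate = 3.75;
const otherRates: Array<[string, number]> = [
  ["Oracle", 5.85],
  ["AWS", 6.2],
  ["Azure", 6.5],
  ["GCP", 6.95],
];

// Percent saved at `baseline` relative to `other`, rounded to one decimal.
function savingsPercent(baseline: number, other: number): number {
  return Math.round(((other - baseline) / other) * 1000) / 10;
}

// Total cost in USD for a number of hours at an hourly rate, rounded to cents.
function costForHours(rate: number, hours: number): number {
  return Math.round(rate * hours * 100) / 100;
}

const hoursPerMonth = 720; // assumption: 30 days of continuous use

for (const [provider, rate] of otherRates) {
  const saved = savingsPercent(runcrateRate, rate);
  const monthlyDiff = costForHours(rate - runcrateRate, hoursPerMonth);
  console.log(`${provider}: ${saved}% lower on Runcrate, $${monthlyDiff.toFixed(2)} per month`);
}
```

The comparison uses published hourly rates only; committed-term discounts on any provider would change the numbers.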

Workloads

What you'll actually use this for.

Real workloads sized for the NVIDIA GB200, with concrete performance numbers.

405B Pretraining: Full-precision training in NVL72
Frontier Multi-Modal: Vision + audio + text in one pass
Llama 405B Inference: ~165 tok/s with NVLink scaling
Reasoning Models at Scale: DeepSeek-R1 / o-series replication

One command. NVIDIA GB200 in 60 seconds.

Skip the dashboard if you don't need it. SDK, Python, or cURL — copy the snippet, paste your API key, ship.

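import Runcrate from "@runcrate/sdk";

// Authenticate with your API key (masked placeholder shown here).
const rc = new Runcrate({ apiKey: "rc_live_••••••••••••••••" });

// Create a GB200 instance running the vLLM image.
const instance = await rc.instances.create({
  gpu: "gb200",
  region: "auto",
  image: "runcrate/vllm:latest",
});

// Print the SSH command for the new instance.
console.log(`SSH: ssh root@${instance.host}`);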