Dedicated Clusters FAQ

What are Runcrate dedicated clusters?

Dedicated clusters are bare-metal GPU deployments reserved exclusively for your organization. Unlike on-demand GPU instances (shared, hourly), dedicated clusters give you single-tenant infrastructure with guaranteed capacity on a fixed monthly contract.

What GPUs are available?

NVIDIA H100 (80 GB), H200 (141 GB), B200 (192 GB), and B300 (288 GB). All clusters use NVLink within each node and InfiniBand between nodes. See Available GPUs for full specs.

What's the minimum cluster size?

The minimum is typically 16 nodes (128 GPUs). For smaller needs, consider on-demand GPU Instances, which support 1–8 GPUs per instance with no commitment.
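
As a rough sizing aid, the arithmetic behind these figures can be sketched in Python. The per-GPU memory values are the ones listed in this FAQ. The figure of 8 GPUs per node is an assumption inferred from "16 nodes (128 GPUs)", and the helper name cluster_capacity is hypothetical, not part of any Runcrate tooling.

```python
# Per-GPU memory in GB, as listed in this FAQ.
GPU_MEMORY_GB = {"H100": 80, "H200": 141, "B200": 192, "B300": 288}

# Assumption: 8 GPUs per node, inferred from "16 nodes (128 GPUs)".
GPUS_PER_NODE = 8


def cluster_capacity(gpu_type: str, nodes: int) -> dict:
    """Return total GPU count and aggregate GPU memory for a cluster."""
    if gpu_type not in GPU_MEMORY_GB:
        raise ValueError(f"unknown GPU type: {gpu_type}")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    gpus = nodes * GPUS_PER_NODE
    return {"gpus": gpus, "total_memory_gb": gpus * GPU_MEMORY_GB[gpu_type]}


# A minimum-size H100 cluster: 128 GPUs, 10240 GB of aggregate GPU memory.
print(cluster_capacity("H100", 16))
```

Aggregate memory is only a first-order guide. How much of it a workload can use depends on how the job is partitioned across nodes.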

How long does provisioning take?

Typically 1–2 weeks from signed agreement to live cluster, depending on GPU type and location.

What contract lengths are available?

12 or 24 months. Longer contracts may offer better per-GPU pricing.

Can I scale my cluster mid-contract?

Yes. You can add nodes to your existing cluster subject to availability. Contact your account manager to discuss expansion.

What software can I run?

Anything. You have full root access to bare-metal servers. Common setups include Kubernetes, Slurm, Docker, and direct bare-metal access. Runcrate can assist with managed Kubernetes or Slurm if needed.
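
With root access, a common first step on a new node is to confirm that every GPU is visible. The following is a minimal sketch, assuming the NVIDIA driver and its nvidia-smi utility are already installed on the node. The function names parse_gpu_inventory and query_gpus are hypothetical helpers written for this example.

```python
import shutil
import subprocess


def parse_gpu_inventory(csv_text: str) -> list[tuple[str, int]]:
    """Parse 'name, memory' CSV rows into (name, memory_mib) tuples."""
    gpus = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        # Split on the last comma so a comma in the name would not break parsing.
        name, memory = line.rsplit(",", 1)
        gpus.append((name.strip(), int(memory.strip())))
    return gpus


def query_gpus() -> list[tuple[str, int]]:
    """Ask nvidia-smi for each GPU's name and total memory in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_inventory(result.stdout)


if __name__ == "__main__":
    if shutil.which("nvidia-smi") is None:
        print("nvidia-smi not found; is the NVIDIA driver installed?")
    else:
        try:
            for index, (name, memory_mib) in enumerate(query_gpus()):
                print(f"GPU {index}: {name}, {memory_mib} MiB")
        except subprocess.CalledProcessError as error:
            print(f"nvidia-smi failed with exit code {error.returncode}")
```

Parsing is kept separate from the subprocess call so the same function can check output collected from many nodes, for example through Slurm or a Kubernetes job.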

What regions are available?

North America (US West, US East, Canada), Europe (Germany, Netherlands, UK), and Asia Pacific (South Korea, Japan, Singapore, India, Vietnam). Availability varies by GPU type. Contact us for current options.

How are dedicated clusters priced?

Dedicated clusters are billed at a fixed monthly rate based on GPU type, cluster size, and contract duration. Per-GPU-hour pricing is locked for the entire contract, so there are no surprise fees and no usage-driven spikes in your bill.
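
For budgeting, one way to relate a per-GPU-hour rate to a monthly figure is sketched below. Everything numeric here is an assumption for illustration: the rate is a placeholder rather than a Runcrate price, and billing a month as 730 hours (8760 hours per year divided by 12) is a common convention that your service agreement may define differently.

```python
# Assumption: an average month is billed as 8760 / 12 = 730 hours.
HOURS_PER_MONTH = 730


def monthly_cost(gpus: int, rate_per_gpu_hour: float) -> float:
    """Estimate a fixed monthly charge from a locked per-GPU-hour rate."""
    if gpus < 1 or rate_per_gpu_hour < 0:
        raise ValueError("gpus must be >= 1 and rate must be non-negative")
    return gpus * HOURS_PER_MONTH * rate_per_gpu_hour


def contract_cost(gpus: int, rate_per_gpu_hour: float, months: int) -> float:
    """Total over the contract; the rate stays constant for every month."""
    if months < 1:
        raise ValueError("months must be at least 1")
    return monthly_cost(gpus, rate_per_gpu_hour) * months


# Placeholder rate of 2.00 per GPU-hour, NOT an actual quote.
PLACEHOLDER_RATE = 2.00
print(monthly_cost(128, PLACEHOLDER_RATE))
print(contract_cost(128, PLACEHOLDER_RATE, 12))
```

Because the rate is locked, the monthly figure is the same whether the GPUs are idle or fully utilized. Use the rate from your proposal in place of the placeholder.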

Is there an SLA?

Yes. Dedicated cluster customers receive an uptime SLA and dedicated support. Details are included in your service agreement.

Can I start on shared inference or on-demand and move to a dedicated cluster?

Absolutely. Many customers prototype on the Inference Engine or on-demand GPU instances, then move to a dedicated cluster when they need reserved capacity at scale. Our team can help plan the transition.

How do I get started?

Email support@runcrate.ai with your GPU requirements. We’ll respond within 24 hours with a proposal.