For Agents
Fetch the complete documentation index at: https://runcrate.ai/docs/llms.txt Use this file to discover all available pages before exploring further.

Introducing Runcrate

Runcrate is the complete platform for AI teams to access open-source models and GPU compute. One account gives you production inference for 140+ models, on-demand GPU instances, dedicated clusters, and the SDKs to build with all of it.

Quickstart

Make your first API call in under 60 seconds.

Model Catalog

Browse 140+ open-source models across text, image, video, and audio.

SDKs

Python and TypeScript clients. Drop-in OpenAI SDK replacements.

API Reference

Full REST API documentation for inference and infrastructure.

The Runcrate Platform

Everything your AI team needs: production inference, GPU compute, and dedicated clusters — all under one account and one bill.

Inference Engine

OpenAI-compatible API for 140+ open-source models. Chat, image, video, TTS, ASR — billed per token or per generation.

GPU Compute

On-demand instances and dedicated clusters. H100, H200, B200, B300 with root SSH access.

Models API

Chat completions, image generation, video, TTS, and transcription endpoints.

GPU Instances

Deploy containers or VMs with dedicated NVIDIA GPUs in 60 seconds.

Storage

Persistent volumes with a built-in file explorer. Data survives instance termination.

Dedicated Clusters

Reserved bare-metal clusters from 16 to 128+ nodes with InfiniBand.

Explore use cases

See how teams use Runcrate to build AI products, run inference at scale, train models, and deploy custom servers.

AI SaaS Backend

Build a production AI backend with chat, image generation, and RAG.

RAG Pipeline

Build retrieval-augmented generation with embeddings and vector search.

Fine-tune LLMs

Fine-tune Llama, Mistral, or Qwen on your own data with GPU instances.

Video Generation

Generate videos with Kling, Veo, Sora, and Seedance APIs.

Start building

Python SDK

Official Python client. Drop-in replacement for the OpenAI SDK.

TypeScript SDK

Official TypeScript client for Node.js and edge runtimes.

Vercel AI SDK

First-class Runcrate provider for the Vercel AI SDK.

MCP Server

Control Runcrate from Claude, Cursor, or any MCP-compatible AI assistant. Deploy instances, manage storage, and monitor usage with natural language.

Or use the CLI for full terminal control:

CLI Overview

Deploy instances, SSH in, transfer files, and manage volumes from your terminal.

CLI Installation

Install on macOS, Linux, or Windows and authenticate in 30 seconds.

Which product do you need?

	Inference Engine	Compute
Best for	Building AI features on open-source models	Training, fine-tuning, custom inference servers, reserved capacity
Billing	Per token / per generation	Per hour (instances) · Monthly (dedicated)
Setup time	60 seconds	60 seconds (instances) · 1–2 weeks (dedicated)
Commitment	None	None (instances) · 12–24 months (dedicated)
Access	Self-serve · API key	Self-serve (instances) · Contact sales (dedicated)
GPUs	Managed for you	H100, H200, B200, B300, A100, L40S, RTX 4090

Not sure which fits? Start with the Inference Engine quickstart. Most teams never need anything else.

Welcome

SDKs

CLI

Inference

Compute

Storage

Dedicated Clusters

Billing

Account

FAQ

Introduction

For Agents

Introducing Runcrate

Quickstart

Model Catalog

SDKs

API Reference

The Runcrate Platform

Inference Engine

GPU Compute

Models API

GPU Instances

Storage

Dedicated Clusters

Explore use cases

AI SaaS Backend

RAG Pipeline

Fine-tune LLMs

Video Generation

Start building

Python SDK

TypeScript SDK

Vercel AI SDK

MCP Server

CLI Overview

CLI Installation

Which product do you need?

Welcome

SDKs

CLI

Inference

Compute

Storage

Dedicated Clusters

Billing

Account

FAQ

Documentation Index

​For Agents

​Introducing Runcrate

Quickstart

Model Catalog

SDKs

API Reference

​The Runcrate Platform

Inference Engine

GPU Compute

Models API

GPU Instances

Storage

Dedicated Clusters

​Explore use cases

AI SaaS Backend

RAG Pipeline

Fine-tune LLMs

Video Generation

​Start building

Python SDK

TypeScript SDK

Vercel AI SDK

MCP Server

CLI Overview

CLI Installation

​Which product do you need?

For Agents

Introducing Runcrate

The Runcrate Platform

Explore use cases

Start building

Which product do you need?