> ## Documentation Index
> Fetch the complete documentation index at: https://runcrate.ai/docs/llms.txt
> Use this file to discover all available pages before exploring further.

# Chat / LLM completion

> OpenAI-compatible chat completion endpoint. Drop-in replacement for `OpenAI(base_url='https://api.runcrate.ai/v1')`.


## OpenAPI

````yaml /api-reference/openapi.json post /chat/completions
openapi: 3.1.0
info:
  title: Runcrate API
  version: 1.0.0
  summary: OpenAI-compatible inference API + GPU infrastructure API.
  description: >-
    Runcrate exposes two API surfaces. The inference API at
    `https://api.runcrate.ai/v1` is OpenAI-compatible — it accepts the standard
    OpenAI SDK with the base URL changed. The infrastructure API at
    `https://runcrate.ai/api/v1` manages GPU instances, persistent volumes,
    environments, SSH keys, templates, and billing. Both share a single
    `rc_live_*` API key.


    Pricing is per-token, per-image, per-second, or per-minute on the inference
    API (varies by model; see https://www.runcrate.ai/api/models/catalog for
    live rates) and per-second on GPU rentals (see
    https://www.runcrate.ai/pricing for hourly rates per GPU class).


    All infrastructure API responses are wrapped in `{ "data": ... }`. List
    responses also include a `meta` object with pagination cursors. Error
    responses use `{ "error": { "code": "...", "message": "..." } }`.
  termsOfService: https://www.runcrate.ai/terms
  contact:
    name: Runcrate
    url: https://www.runcrate.ai/contact
  license:
    name: Proprietary
    url: https://www.runcrate.ai/terms
servers:
  - url: https://api.runcrate.ai/v1
    description: Inference API (OpenAI-compatible)
  - url: https://runcrate.ai/api/v1
    description: >-
      Infrastructure API (instances, storage, environments, ssh-keys, templates,
      billing)
security:
  - ApiKeyAuth: []
tags:
  - name: Chat
    description: OpenAI-compatible chat completions.
  - name: Images
    description: Text-to-image generation.
  - name: Videos
    description: Async text-to-video and image-to-video generation.
  - name: Audio
    description: Text-to-speech and speech-to-text.
  - name: Embeddings
    description: Vector embeddings for RAG and search.
  - name: Models
    description: Model catalog discovery.
  - name: Instances
    description: GPU instance lifecycle.
  - name: Storage
    description: Persistent volume management.
  - name: Environments
    description: Resource isolation containers within a workspace.
  - name: SSH Keys
    description: SSH key management for instance access.
  - name: Templates
    description: Pre-built instance images and configurations.
  - name: Billing
    description: Credit balance, transactions, usage metrics.
paths:
  /chat/completions:
    post:
      tags:
        - Chat
      summary: Chat / LLM completion
      description: >-
        OpenAI-compatible chat completion endpoint. Drop-in replacement for
        `OpenAI(base_url='https://api.runcrate.ai/v1')`.
      operationId: createChatCompletion
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ChatCompletionRequest'
      responses:
        '200':
          description: Completion (or SSE stream when stream=true).
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ChatCompletionResponse'
            text/event-stream:
              schema:
                type: string
                description: Server-sent event stream of OpenAI-compatible chunks.
        '401':
          $ref: '#/components/responses/Unauthorized'
        '402':
          $ref: '#/components/responses/PaymentRequired'
        '429':
          $ref: '#/components/responses/RateLimited'
components:
  schemas:
    ChatCompletionRequest:
      type: object
      required:
        - model
        - messages
      properties:
        model:
          type: string
          description: Model id from the catalog (e.g. `deepseek/deepseek-v3.2`).
        messages:
          type: array
          items:
            $ref: '#/components/schemas/ChatMessage'
        max_tokens:
          type: integer
          minimum: 1
        temperature:
          type: number
          minimum: 0
          maximum: 2
        top_p:
          type: number
          minimum: 0
          maximum: 1
        stop:
          oneOf:
            - type: string
            - type: array
              items:
                type: string
        frequency_penalty:
          type: number
          minimum: -2
          maximum: 2
        presence_penalty:
          type: number
          minimum: -2
          maximum: 2
        stream:
          type: boolean
          default: false
        tools:
          type: array
          items:
            type: object
        tool_choice:
          oneOf:
            - type: string
            - type: object
        response_format:
          type: object
    ChatCompletionResponse:
      type: object
      properties:
        id:
          type: string
        object:
          type: string
          enum:
            - chat.completion
        created:
          type: integer
        model:
          type: string
        choices:
          type: array
          items:
            type: object
            properties:
              index:
                type: integer
              message:
                $ref: '#/components/schemas/ChatMessage'
              finish_reason:
                type: string
                enum:
                  - stop
                  - length
                  - tool_calls
                  - content_filter
        usage:
          type: object
          properties:
            prompt_tokens:
              type: integer
            completion_tokens:
              type: integer
            total_tokens:
              type: integer
    ChatMessage:
      type: object
      required:
        - role
        - content
      properties:
        role:
          type: string
          enum:
            - system
            - user
            - assistant
            - tool
        name:
          type: string
        content:
          oneOf:
            - type: string
            - type: array
              items:
                type: object
                properties:
                  type:
                    type: string
                    enum:
                      - text
                      - image_url
                  text:
                    type: string
                  image_url:
                    type: object
                    properties:
                      url:
                        type: string
                        format: uri
    Error:
      type: object
      required:
        - error
      properties:
        error:
          type: object
          required:
            - code
            - message
          properties:
            code:
              type: string
              description: >-
                Machine-readable error code (e.g. `validation_error`,
                `unauthorized`, `not_found`, `insufficient_credits`,
                `rate_limited`, `internal_error`).
            message:
              type: string
              description: Human-readable error message.
            details:
              type: object
              additionalProperties: true
              description: Optional structured error context.
  responses:
    Unauthorized:
      description: Missing or invalid API key.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Error'
    PaymentRequired:
      description: Insufficient credits.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Error'
    RateLimited:
      description: Rate limited. Retry with exponential backoff.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Error'
  securitySchemes:
    ApiKeyAuth:
      type: http
      scheme: bearer
      description: >-
        Use a Runcrate API key with the `rc_live_*` prefix as the bearer token.
        Create one at https://www.runcrate.ai/dashboard/api-keys.

````