GEMMA API

Google Gemma, hosted and ready.

Run Google's Gemma 3 models through Runcrate's OpenAI-compatible endpoint. Gemma 3 27B delivers strong instruction following and reasoning in a relatively compact model. Built on the same research as Gemini but released as open weights for maximum flexibility.

Latest model: Gemma 3 27B
License: Open-weight
Context: 128K

QUICK START

Integrate in minutes.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.runcrate.ai/v1",
    api_key="rc_live_YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="google/gemma-3-27b-it",
    messages=[
        {"role": "user", "content": "Compare PostgreSQL and MySQL for a new SaaS project."}
    ],
)
print(response.choices[0].message.content)
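Because the endpoint is OpenAI-compatible, you can also call it with nothing but Python's standard library. A minimal sketch, assuming the same /v1/chat/completions route the SDK uses above; the API key is a placeholder and the request is built but not sent.

```python
# Sketch: assemble a chat completion request against Runcrate's
# OpenAI-compatible endpoint using only the standard library.
# The key below is a placeholder; replace it before actually sending.
import json
import urllib.request

API_BASE = "https://api.runcrate.ai/v1"
API_KEY = "rc_live_YOUR_API_KEY"  # placeholder

def build_chat_request(model: str, user_content: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to /chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_content}],
    }
    return urllib.request.Request(
        f"{API_BASE}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("google/gemma-3-27b-it", "Say hello.")
# To send it for real: urllib.request.urlopen(req) — requires a valid key.
```

This is useful in environments where you cannot install the openai package; the response body follows the same chat completion JSON shape the SDK parses for you.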

AVAILABLE MODELS

Models you can use today.

google/gemma-3-27b-it
Google, per-token pricing
27B, instruction-tuned, 128K context

google/gemma-3-4b-it
Google, per-token pricing
4B, lightweight, fast
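Since both checkpoints share the same API, switching between them is a one-string change. A small illustrative helper, assuming the model ids from the list above; the "prefer 4B when latency matters" rule is a heuristic of ours, not a Runcrate recommendation.

```python
# Sketch: pick between the two hosted Gemma checkpoints listed above.
# The selection rule (lighter model for latency-sensitive calls) is an
# illustrative heuristic — both ids come straight from the model list.
GEMMA_27B = "google/gemma-3-27b-it"
GEMMA_4B = "google/gemma-3-4b-it"

def pick_model(latency_sensitive: bool) -> str:
    """Return the lighter model when response time matters more than depth."""
    return GEMMA_4B if latency_sensitive else GEMMA_27B

pick_model(True)   # "google/gemma-3-4b-it"
pick_model(False)  # "google/gemma-3-27b-it"
```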

WHY RUNCRATE

Built for production.

Gemini Heritage

Built on the same research as Google's Gemini. Gemma distills frontier-model quality into open weights you can run anywhere.

Compact and Efficient

At 27B parameters, Gemma 3 punches above its weight class. Strong reasoning and instruction following at lower inference cost than 70B models.

128K Context

Long context window for document analysis, codebase comprehension, and extended conversations without truncation.
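Before shipping a large document, it helps to estimate whether it fits the window. A rough sketch, assuming the common ~4 characters-per-token rule of thumb for English text; for exact counts (and billing), use a real tokenizer.

```python
# Sketch: rough check that a document fits the 128K-token context window.
# CHARS_PER_TOKEN = 4 is a common English-text rule of thumb, not an exact
# Gemma tokenizer measurement — use a real tokenizer for precise budgeting.
CONTEXT_TOKENS = 128_000
CHARS_PER_TOKEN = 4  # heuristic assumption

def fits_in_context(text: str, reserve_for_output: int = 2_000) -> bool:
    """Estimate whether `text` plus a reserved output budget fits the window."""
    estimated_tokens = len(text) / CHARS_PER_TOKEN
    return estimated_tokens + reserve_for_output <= CONTEXT_TOKENS

fits_in_context("a" * 1_000)    # True: ~250 tokens, well within budget
fits_in_context("a" * 600_000)  # False: ~150K tokens exceeds the window
```

The `reserve_for_output` parameter is a hypothetical name we introduce here: it leaves headroom for the model's reply, since input and output share the same window.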

Open Weights

Use the API for convenience, or download the weights and self-host on Runcrate GPU instances for full control over serving configuration.
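If you self-host the weights behind an OpenAI-compatible server (vLLM, for example, exposes the same /v1 interface), the only change to the Quick Start code is the client configuration. A sketch; the localhost URL and "EMPTY" key are placeholders for whatever your own deployment uses.

```python
# Sketch: the same OpenAI client kwargs for either deployment mode.
# The self-hosted URL and "EMPTY" key are placeholders — substitute the
# address and auth of your own OpenAI-compatible server (e.g. vLLM).
def client_config(self_hosted: bool) -> dict:
    """Return OpenAI-client kwargs for the hosted API or a local server."""
    if self_hosted:
        return {"base_url": "http://localhost:8000/v1", "api_key": "EMPTY"}
    return {
        "base_url": "https://api.runcrate.ai/v1",
        "api_key": "rc_live_YOUR_API_KEY",  # placeholder
    }

# Usage: OpenAI(**client_config(self_hosted=True)) — the rest of the
# Quick Start code runs unchanged against either backend.
```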

Start building with Gemma.