GEMMA API
Run Google's Gemma 3 models through Runcrate's OpenAI-compatible endpoint. Gemma 3 27B delivers strong instruction following and reasoning in a relatively compact model. Built on the same research as Gemini but released as open weights for maximum flexibility.

QUICK START
from openai import OpenAI

client = OpenAI(
    base_url="https://api.runcrate.ai/v1",
    api_key="rc_live_YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="google/gemma-3-27b-it",
    messages=[
        {"role": "user", "content": "Compare PostgreSQL and MySQL for a new SaaS project."}
    ],
)
print(response.choices[0].message.content)

AVAILABLE MODELS
| Model | Pricing | Details |
|---|---|---|
| google/gemma-3-27b-it | Per-token | 27B, instruction-tuned, 128K context |
| google/gemma-3-4b-it | Per-token | 4B, lightweight, fast |
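Since both models share the same endpoint and request shape, one common pattern is routing short prompts to the lightweight 4B model and longer ones to 27B. The sketch below is illustrative: the character cutoff is a hypothetical heuristic, not a Runcrate recommendation.

```python
def pick_model(prompt: str, cutoff_chars: int = 2000) -> str:
    """Route short prompts to the fast 4B model, longer ones to 27B.

    The 2000-character cutoff is an illustrative threshold; tune it
    against your own latency and quality needs.
    """
    if len(prompt) <= cutoff_chars:
        return "google/gemma-3-4b-it"
    return "google/gemma-3-27b-it"

print(pick_model("Summarize this sentence."))
```

Pass the returned name as the `model` argument in the quick-start call above.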
WHY RUNCRATE
Built on the same research as Google's Gemini. Gemma distills frontier-model quality into open weights you can run anywhere.
At 27B parameters, Gemma 3 punches above its weight class. Strong reasoning and instruction following at lower inference cost than 70B models.
A 128K-token context window supports document analysis, codebase comprehension, and extended conversations without truncation.
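Even with a long context window, very large documents may need to be split across requests. Here is a minimal chunking sketch using the rough rule of thumb of about 4 characters per token; the ratio is an approximation, not an exact tokenizer count.

```python
def chunk_text(text: str, max_tokens: int = 120_000, chars_per_token: int = 4):
    """Split text into chunks that fit a rough token budget.

    chars_per_token=4 is a common English-text approximation; use a real
    tokenizer if you need precise counts near the context limit.
    """
    max_chars = max_tokens * chars_per_token
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
```

Each chunk can then be sent as a separate user message, keeping every request safely under the model's context limit.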
Use the API for convenience, or download the weights and self-host on Runcrate GPU instances for full control over serving configuration.
FAQ