GEMMA API
Run Google's Gemma 3 models through Runcrate's OpenAI-compatible endpoint. Gemma 3 27B delivers strong instruction following and reasoning in a relatively compact model. Built on the same research as Gemini but released as open weights for maximum flexibility.

QUICK START
from openai import OpenAI

client = OpenAI(
    base_url="https://api.runcrate.ai/v1",
    api_key="rc_live_YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="google/gemma-3-27b-it",
    messages=[
        {"role": "user", "content": "Compare PostgreSQL and MySQL for a new SaaS project."}
    ],
)
print(response.choices[0].message.content)

AVAILABLE MODELS
| Model | Pricing | Details |
|---|---|---|
| google/gemma-3-27b-it | Per-token | 27B, instruction-tuned, 128K context |
| google/gemma-3-4b-it | Per-token | 4B, lightweight, fast |
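Since both models share the same endpoint and request shape, one common pattern is routing short prompts to the lightweight 4B model and longer ones to 27B. The sketch below is illustrative: the character cutoff is a hypothetical heuristic, not a Runcrate recommendation.

```python
def pick_model(prompt: str, cutoff_chars: int = 2000) -> str:
    """Route short prompts to the fast 4B model, longer ones to 27B.

    The 2000-character cutoff is an illustrative threshold; tune it
    against your own latency and quality needs.
    """
    if len(prompt) <= cutoff_chars:
        return "google/gemma-3-4b-it"
    return "google/gemma-3-27b-it"

print(pick_model("Summarize this sentence."))
```

Pass the returned name as the `model` argument in the quick-start call above.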
WHY RUNCRATE
Built on the same research as Google's Gemini. Gemma distills frontier-model quality into open weights you can run anywhere.
At 27B parameters, Gemma 3 punches above its weight class. Strong reasoning and instruction following at lower inference cost than 70B models.
A 128K-token context window supports document analysis, codebase comprehension, and extended conversations without truncation.
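Even with a long context window, very large documents may need to be split across requests. Here is a minimal chunking sketch using the rough rule of thumb of about 4 characters per token; the ratio is an approximation, not an exact tokenizer count.

```python
def chunk_text(text: str, max_tokens: int = 120_000, chars_per_token: int = 4):
    """Split text into chunks that fit a rough token budget.

    chars_per_token=4 is a common English-text approximation; use a real
    tokenizer if you need precise counts near the context limit.
    """
    max_chars = max_tokens * chars_per_token
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
```

Each chunk can then be sent as a separate user message, keeping every request safely under the model's context limit.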
Use the API for convenience, or download the weights and self-host on Runcrate GPU instances for full control over serving configuration.
FAQ