Fine-Tuning

Adapt any model
to your data.

Run LoRA, QLoRA, or full-parameter fine-tuning on bare-metal GPUs. Hugging Face Transformers, Axolotl, and LLaMA-Factory come pre-installed. Bring your dataset, pick a base model, and start adapting -- per-minute billing means you only pay for active training time.

LoRA
Efficient adapters
CUDA 12.4
Pre-installed
Per-minute
Billing

The Workflow

Every fine-tuning method, one platform.

LoRA & QLoRA

Train lightweight adapters on top of frozen base models. QLoRA with 4-bit quantization lets you fine-tune 70B models on a single GPU.
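The core LoRA idea, frozen base weights plus a small trainable low-rank update, fits in a few lines. A minimal PyTorch sketch (not the PEFT library's implementation, just the arithmetic it performs):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank adapter (LoRA sketch)."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights
        # standard LoRA init: A small random, B zero, so the update starts at zero
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        # base output + scaled low-rank update: W x + scale * B (A x)
        return self.base(x) + self.scale * (x @ self.lora_A.T @ self.lora_B.T)

layer = LoRALinear(nn.Linear(4096, 4096), r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable fraction: {trainable / total:.4%}")
```

For a 4096-wide layer at rank 8, the adapter is well under 1% of the layer's parameters, which is why adapter checkpoints are megabytes instead of gigabytes.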

Full-parameter training

Unfreeze all layers for maximum performance. Multi-GPU instances with NVLink handle models too large for a single card.

Pre-installed frameworks

Hugging Face Transformers, Axolotl, LLaMA-Factory, PEFT, DeepSpeed, and bitsandbytes ready out of the box. No setup required.

Dataset management

Upload datasets directly to your instance via SCP, or pull from Hugging Face Hub. Persistent storage across sessions so you never re-upload.
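Instruction-tuning datasets are typically JSONL, one JSON object per line. A stdlib sketch of validating a file before upload (the `instruction`/`output` field names are illustrative; use whatever schema your training config expects):

```python
import io
import json

def load_jsonl(fp):
    """Parse one JSON object per line; skip blank lines."""
    return [json.loads(line) for line in fp if line.strip()]

# hypothetical instruction-tuning records (field names are illustrative)
raw = io.StringIO(
    '{"instruction": "Summarize the report.", "output": "The report covers Q3 revenue."}\n'
    '{"instruction": "Translate to French: hello", "output": "bonjour"}\n'
)
records = load_jsonl(raw)
assert all({"instruction", "output"} <= rec.keys() for rec in records)
print(f"{len(records)} records validated")
```

Running the same check locally before `scp` saves a round trip when a malformed line would otherwise fail mid-training.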

Experiment tracking

Weights & Biases, MLflow, and TensorBoard all work out of the box. Track loss curves, hyperparameters, and checkpoints across runs.
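Whichever tracker you use, the pattern is the same: log a metrics dict every step. A toy sketch of that loop using a plain list as the stand-in sink (with W&B you would call `wandb.log(...)` where the `append` is):

```python
history = []  # stand-in for wandb.log / mlflow.log_metric / a TensorBoard writer

# toy "training run": gradient descent on f(w) = (w - 3)^2
w, lr = 0.0, 0.1
for step in range(50):
    loss = (w - 3.0) ** 2
    grad = 2.0 * (w - 3.0)
    w -= lr * grad
    history.append({"step": step, "loss": loss, "lr": lr})

print(f"first loss {history[0]['loss']:.3f}, last loss {history[-1]['loss']:.2e}")
```

The logged dicts are exactly what a loss-curve plot is drawn from: one point per step, plus any hyperparameters you want to compare across runs.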

Deploy when done

Export your adapter or merged model directly to the Runcrate Inference API or a dedicated serving instance. No data transfer needed.
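Merging means folding the adapter back into the base weights, `W' = W + scale * B @ A`, so the deployed model needs no adapter code at inference time (this is what PEFT's `merge_and_unload` does). A sketch of the arithmetic:

```python
import torch

torch.manual_seed(0)
in_f, out_f, r, alpha = 64, 64, 8, 16
W = torch.randn(out_f, in_f)      # frozen base weight
A = torch.randn(r, in_f) * 0.01   # trained adapter factors
B = torch.randn(out_f, r) * 0.01
scale = alpha / r

W_merged = W + scale * (B @ A)    # fold the adapter into the base weight

x = torch.randn(5, in_f)
adapter_out = x @ W.T + scale * (x @ A.T @ B.T)  # base + adapter path
merged_out = x @ W_merged.T                      # single merged matmul
print(torch.allclose(adapter_out, merged_out, atol=1e-5))
```

The two paths are equal up to float rounding, which is why a merged model serves at exactly the base model's speed.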

Recommended Setups

Match your method
to the right hardware.

LoRA on a single L40S or full fine-tuning across H100s -- pick the setup that fits your model size and budget.

QLoRA on 7-13B models · L40S · 48 GB · From $0.50/hr
LoRA on 70B models · A100 · 80 GB · From $1.10/hr
Full fine-tune up to 70B · H100 · 80 GB · From $2.49/hr
Full fine-tune 100B+ · H200 · 141 GB · From $3.49/hr

How It Works

Three steps to a custom model.

01

Pick a base model and method

Choose any open-source model from Hugging Face. Select LoRA, QLoRA, or full fine-tuning based on your model size and dataset.

02

Upload data and configure

Upload your training data via SCP or Hugging Face Hub. Use Axolotl or LLaMA-Factory config files to set hyperparameters, or write your own training script.
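For reference, an Axolotl run is driven by a single YAML file. A minimal QLoRA-style sketch; the key names follow Axolotl's documented schema but should be checked against the version installed on your instance, and the model name and dataset path are placeholders:

```yaml
# Minimal QLoRA config sketch (verify key names against your Axolotl version)
base_model: meta-llama/Llama-3.1-8B   # placeholder base model
load_in_4bit: true                    # QLoRA: 4-bit quantized base weights
adapter: qlora
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
datasets:
  - path: ./data/train.jsonl          # placeholder dataset path
    type: alpaca
micro_batch_size: 2
gradient_accumulation_steps: 8
num_epochs: 3
learning_rate: 0.0002
output_dir: ./outputs/qlora-run
```

With the config in place, training is one command against that file, so the same YAML doubles as a record of the run's hyperparameters.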

03

Train, evaluate, deploy

Launch training with real-time GPU monitoring. Evaluate against your test set, then push your model directly to a Runcrate serving instance.

Start fine-tuning on Runcrate.

Spin up a GPU instance with all fine-tuning frameworks pre-installed. Per-minute billing, no credit card required to explore.

Per-minute billing
Stop anytime, no waste
Frameworks included
Axolotl, LLaMA-Factory, PEFT
Train to deploy
Serve on the same platform