> ## Documentation Index
> Fetch the complete documentation index at: https://runcrate.ai/docs/llms.txt
> Use this file to discover all available pages before exploring further.

# Deploy and Train Models with AI Agents

> Use MCP tools to deploy a GPU, upload training data, run training, monitor progress, download results, and tear down — all through conversation.

export const RuncrateStyles = () => {
  if (typeof document !== 'undefined' && !document.getElementById('runcrate-overrides')) {
    const s = document.createElement('style');
    s.id = 'runcrate-overrides';
    s.textContent = `
      /* Match Runcrate's rounding scale (--radius: 0.75rem) */
      .rounded-sm { border-radius: 0.5rem !important; }   /* 8px */
      .rounded-md { border-radius: 0.625rem !important; } /* 10px */
      .rounded-lg { border-radius: 0.75rem !important; }  /* 12px */
      .rounded-l-sm { border-top-left-radius: 0.5rem !important; border-bottom-left-radius: 0.5rem !important; }
      .rounded-r-sm { border-top-right-radius: 0.5rem !important; border-bottom-right-radius: 0.5rem !important; }
      .rounded-l-md { border-top-left-radius: 0.625rem !important; border-bottom-left-radius: 0.625rem !important; }
      .rounded-r-md { border-top-right-radius: 0.625rem !important; border-bottom-right-radius: 0.625rem !important; }
      .rounded-l-lg { border-top-left-radius: 0.75rem !important; border-bottom-left-radius: 0.75rem !important; }
      .rounded-r-lg { border-top-right-radius: 0.75rem !important; border-bottom-right-radius: 0.75rem !important; }

      /* Cards: never pure white in light mode */
      .card { background-color: #fcfcfc !important; border-radius: 0.75rem !important; }
      html.dark .card { background-color: #141414 !important; }

      /* Docs hero box */
      .rc-hero { background-color: #fcfcfc; border: 1px solid #e0e0e0; }
      html.dark .rc-hero { background-color: #141414; border-color: #242424; }
      html.dark .rc-hero h1 { color: #f5f5f5; }

      /* Runcrate scrollbar — thin, transparent track, hide-until-hover thumb */
      ::-webkit-scrollbar { width: 6px; height: 6px; background-color: transparent; }
      ::-webkit-scrollbar-track { background-color: transparent; }
      ::-webkit-scrollbar-thumb { background-color: rgba(155, 155, 155, 0.5); border-radius: 10px; transition: opacity 0.3s ease; opacity: 0; }
      ::-webkit-scrollbar-thumb:hover { background-color: rgba(155, 155, 155, 0.7); }
      *:hover::-webkit-scrollbar-thumb,
      *:focus::-webkit-scrollbar-thumb,
      *:active::-webkit-scrollbar-thumb { opacity: 1; }
      * { scrollbar-width: thin; scrollbar-color: rgba(155, 155, 155, 0.5) transparent; }
    `;
    document.head.appendChild(s);
  }
  return null;
};

<RuncrateStyles />

Walk through a full training workflow without leaving your editor. Your AI agent handles GPU provisioning, file transfers, training execution, and cleanup.

***

## "Deploy an A100, upload my dataset, and start training."

The agent orchestrates the entire setup:

1. **`list_gpu_types`** — confirms A100 80GB availability and pricing (\$1.60/hr)
2. **`create_instance`** — deploys an instance named `fine-tune-run` with an A100
3. **`instance_status`** — polls every 10 seconds until the instance is `running`
4. **`file_upload`** — uploads `train.py` to `/workspace/train.py`
5. **`file_upload`** — uploads `dataset.jsonl` to `/workspace/dataset.jsonl`
6. **`ssh_execute`** — runs `pip install torch transformers datasets accelerate`
7. **`ssh_execute`** — runs `nohup python /workspace/train.py > /root/train.log 2>&1 &`

> Deployed **fine-tune-run** (1x A100 80GB, \$1.60/hr).
> Uploaded `train.py` and `dataset.jsonl`.
> Training started — check progress with "how's the training going?"

***

## "How's the training going? Show me GPU usage and the last few log lines."

The agent checks two things in parallel:

1. **`ssh_execute`** — runs `nvidia-smi --query-gpu=utilization.gpu,memory.used,memory.total --format=csv,noheader`
2. **`ssh_execute`** — runs `tail -20 /root/train.log`

> **GPU utilization:** 97% | **Memory:** 68,420 / 81,920 MiB (83%)
>
> ```
> Epoch 2/5 | Step 1240/3100 | Loss: 0.4812 | LR: 2.1e-5
> Epoch 2/5 | Step 1260/3100 | Loss: 0.4756 | LR: 2.0e-5
> Epoch 2/5 | Step 1280/3100 | Loss: 0.4701 | LR: 2.0e-5
> ```
>
> Loss is decreasing steadily. At the current rate, training will finish in \~2.5 hours.

***

## "Training is done. Download the model checkpoint and tear down the instance."

The agent wraps up the workflow:

1. **`ssh_execute`** — runs `ls -lh /workspace/output/` to confirm checkpoint files
2. **`file_download`** — downloads `/workspace/output/adapter_model.safetensors`
3. **`file_download`** — downloads `/workspace/output/adapter_config.json`
4. **`file_download`** — downloads `/workspace/output/training_args.json`
5. **`billing_usage`** — checks total spend for this session
6. **`delete_instance`** — terminates `fine-tune-run`

> Downloaded 3 files (adapter weights, config, training args).
> Total cost for this training run: **\$6.40** (4 hours on A100).
> Instance **fine-tune-run** terminated. Billing stopped.

***

## Tools used in this workflow

| Tool              | Purpose                                          |
| ----------------- | ------------------------------------------------ |
| `list_gpu_types`  | Check GPU availability and pricing               |
| `create_instance` | Provision the training machine                   |
| `instance_status` | Wait for deployment to complete                  |
| `file_upload`     | Transfer training code and data                  |
| `ssh_execute`     | Install dependencies, start training, check logs |
| `file_download`   | Retrieve trained model artifacts                 |
| `billing_usage`   | Verify session cost                              |
| `delete_instance` | Clean up and stop billing                        |
