Initial commit: Ollama GPU Switcher

Simple web UI to toggle OpenClaw agents between work mode (qwen3 on ollama)
and lab mode (groq cloud fallback), giving the lab agent exclusive GPU access.

Features:
- One-click mode switching
- Real-time agent status
- Lab model selector
- Direct config file patching + gateway restart
2026-02-18 17:16:35 +00:00
commit 3366d6d9ec
5 changed files with 677 additions and 0 deletions

4
.gitignore vendored Normal file

@@ -0,0 +1,4 @@
__pycache__/
*.pyc
.env
venv/

66
README.md Normal file

@@ -0,0 +1,66 @@
# Ollama GPU Switcher
A simple web UI to toggle OpenClaw agents between **work mode** (local ollama inference) and **lab mode** (cloud fallback), so experiments get exclusive GPU access.
## The Problem
With a single GPU (RTX 3090), loading different models causes VRAM swaps. When the lab agent (Eric) loads granite4 while other agents are using qwen3, both tasks fail. This tool lets you switch all non-lab agents to cloud (groq) with one click.
## Modes
| Mode | GPU Agents | Lab Agent | GPU Status |
|------|-----------|-----------|------------|
| 🛠️ Work | qwen3-128k:14b (ollama) | granite4 (ollama) | Shared |
| 🧪 Lab | groq (cloud) | granite4 (ollama) | Exclusive for lab |
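Switching works by rewriting each agent's primary model in `openclaw.json`. The slice of the config this tool reads and patches looks roughly like the sketch below in work mode (inferred from how `app.py` traverses the file — your actual config will have more fields):

```json
{
  "agents": {
    "defaults": {
      "subagents": { "model": { "primary": "ollama/qwen3-128k:14b" } }
    },
    "list": [
      { "id": "rex", "model": { "primary": "ollama/qwen3-128k:14b" } },
      { "id": "lab", "name": "Eric", "model": { "primary": "ollama/granite4:32b-a9b-h" } }
    ]
  }
}
```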
## Features
- One-click mode switching (work ↔ lab)
- Real-time agent status display
- Lab model selector (change what Eric runs)
- Auto-refresh every 30s
- Dark theme, mobile-friendly
- **No LLM involved** — pure config-file patching plus a gateway restart
## Setup
```bash
pip install -r requirements.txt
```
## Usage
```bash
# Point at your openclaw.json if it is not in the default location
export OPENCLAW_CONFIG="$HOME/.openclaw/openclaw.json"
# Run on port 8585 (default)
python app.py
# Or custom port
PORT=9090 python app.py
```
Then open `http://localhost:8585` in your browser.
## Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `OPENCLAW_CONFIG` | `~/.openclaw/openclaw.json` | Path to the gateway config file |
| `PORT` | `8585` | Web UI port |
## How It Works
The app reads and patches the OpenClaw gateway config file (`openclaw.json`) directly, and serves a small REST API that the web UI calls:
1. **Status**: `GET /api/status` → reads agent model assignments
2. **Switch**: `POST /api/switch` → patches agent models (qwen3 ↔ groq)
3. **Lab model**: `POST /api/lab-model` → changes Eric's model
After each config write, the app restarts the gateway via `openclaw gateway restart` (falling back to `SIGUSR1` on the gateway process).
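The same endpoints the UI uses can be driven from a script (e.g. to free the GPU before a scheduled experiment). A minimal standard-library client sketch, assuming the switcher is running on its default port:

```python
import json
import urllib.request

BASE = "http://localhost:8585"  # default; match your PORT setting

def get_status() -> dict:
    """Fetch current mode and per-agent model assignments."""
    with urllib.request.urlopen(BASE + "/api/status") as resp:
        return json.load(resp)

def post_json(path: str, payload: dict) -> dict:
    """POST a JSON body to the switcher and return the decoded reply."""
    req = urllib.request.Request(
        BASE + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example: free the GPU for the lab agent, then confirm the mode flipped.
# print(post_json("/api/switch", {"mode": "lab"}))
# print(get_status()["mode"])
```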
## License
MIT

195
app.py Normal file

@@ -0,0 +1,195 @@
#!/usr/bin/env python3
"""
Ollama GPU Switcher — Toggle OpenClaw agents between work mode (qwen3) and lab mode (GPU exclusive).
No LLM involved. Reads/writes openclaw.json directly, then signals the gateway to restart.
"""
import json
import os
import signal
import subprocess
from flask import Flask, jsonify, request, send_from_directory
app = Flask(__name__, static_folder="static")
CONFIG_PATH = os.environ.get("OPENCLAW_CONFIG", os.path.expanduser("~/.openclaw/openclaw.json"))
# Agents that use ollama and compete for GPU
OLLAMA_AGENTS = ["rex", "maddy", "coder", "research"]
WORK_PRIMARY = "ollama/qwen3-128k:14b"
LAB_PRIMARY = "groq/llama-3.3-70b-versatile"
def read_config():
with open(CONFIG_PATH, "r") as f:
return json.load(f)
def write_config(config):
with open(CONFIG_PATH, "w") as f:
json.dump(config, f, indent=2)
f.write("\n")
def restart_gateway():
"""Restart the openclaw gateway via CLI."""
try:
        subprocess.run(["openclaw", "gateway", "restart"], timeout=10, check=True, capture_output=True)
return True
except Exception:
# Fallback: try SIGUSR1 to the gateway process
try:
result = subprocess.run(["pgrep", "-f", "openclaw.*gateway"], capture_output=True, text=True)
if result.stdout.strip():
pid = int(result.stdout.strip().split("\n")[0])
os.kill(pid, signal.SIGUSR1)
return True
except Exception:
pass
return False
def find_agent(config, agent_id):
for agent in config.get("agents", {}).get("list", []):
if agent.get("id") == agent_id:
return agent
return None
def detect_mode(config):
ollama_count = 0
groq_count = 0
for agent_id in OLLAMA_AGENTS:
agent = find_agent(config, agent_id)
if agent:
primary = agent.get("model", {}).get("primary", "")
if "ollama/" in primary:
ollama_count += 1
elif "groq/" in primary:
groq_count += 1
if ollama_count == len(OLLAMA_AGENTS):
return "work"
elif groq_count >= len(OLLAMA_AGENTS):
return "lab"
return "mixed"
@app.route("/")
def index():
return send_from_directory("static", "index.html")
@app.route("/api/status")
def status():
try:
config = read_config()
mode = detect_mode(config)
agent_details = []
for agent_id in OLLAMA_AGENTS:
agent = find_agent(config, agent_id)
if agent:
agent_details.append({
"id": agent["id"],
"name": agent.get("name", agent["id"]),
"model": agent.get("model", {}).get("primary", "unknown"),
})
lab = find_agent(config, "lab")
lab_info = {
"name": lab.get("name", "Eric") if lab else "Eric",
"model": lab.get("model", {}).get("primary", "unknown") if lab else "unknown",
}
# Subagents default
subagents_primary = (
config.get("agents", {})
.get("defaults", {})
.get("subagents", {})
.get("model", {})
.get("primary", "unknown")
)
return jsonify({
"ok": True,
"mode": mode,
"lab": lab_info,
"agents": agent_details,
"subagentsPrimary": subagents_primary,
})
except Exception as e:
return jsonify({"ok": False, "error": str(e)}), 500
@app.route("/api/switch", methods=["POST"])
def switch():
try:
        data = request.get_json(silent=True) or {}
target_mode = data.get("mode", "work")
if target_mode == "lab":
new_primary = LAB_PRIMARY
elif target_mode == "work":
new_primary = WORK_PRIMARY
else:
return jsonify({"ok": False, "error": f"Unknown mode: {target_mode}"}), 400
config = read_config()
# Patch each agent's primary model
for agent_id in OLLAMA_AGENTS:
agent = find_agent(config, agent_id)
if agent:
if "model" not in agent:
agent["model"] = {}
agent["model"]["primary"] = new_primary
# Patch subagents default
config.setdefault("agents", {}).setdefault("defaults", {}).setdefault("subagents", {}).setdefault("model", {})
config["agents"]["defaults"]["subagents"]["model"]["primary"] = new_primary
write_config(config)
restarted = restart_gateway()
return jsonify({
"ok": True,
"mode": target_mode,
"restarted": restarted,
})
except Exception as e:
return jsonify({"ok": False, "error": str(e)}), 500
@app.route("/api/lab-model", methods=["POST"])
def set_lab_model():
try:
        data = request.get_json(silent=True) or {}
model = data.get("model", "")
if not model:
return jsonify({"ok": False, "error": "No model specified"}), 400
config = read_config()
lab = find_agent(config, "lab")
if not lab:
return jsonify({"ok": False, "error": "Lab agent not found"}), 404
if "model" not in lab:
lab["model"] = {}
lab["model"]["primary"] = model
write_config(config)
restarted = restart_gateway()
return jsonify({"ok": True, "model": model, "restarted": restarted})
except Exception as e:
return jsonify({"ok": False, "error": str(e)}), 500
if __name__ == "__main__":
port = int(os.environ.get("PORT", 8585))
print(f"🔀 Ollama GPU Switcher running on http://0.0.0.0:{port}")
print(f"📄 Config: {CONFIG_PATH}")
app.run(host="0.0.0.0", port=port, debug=False)

2
requirements.txt Normal file

@@ -0,0 +1,2 @@
flask>=3.0
requests>=2.31

410
static/index.html Normal file

@@ -0,0 +1,410 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Ollama GPU Switcher</title>
<style>
:root {
--bg: #0d1117;
--surface: #161b22;
--border: #30363d;
--text: #e6edf3;
--text-dim: #8b949e;
--green: #3fb950;
--green-dim: #238636;
--orange: #d29922;
--orange-dim: #9e6a03;
--blue: #58a6ff;
--red: #f85149;
--purple: #bc8cff;
}
* { box-sizing: border-box; margin: 0; padding: 0; }
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Helvetica, Arial, sans-serif;
background: var(--bg);
color: var(--text);
min-height: 100vh;
display: flex;
flex-direction: column;
align-items: center;
padding: 2rem 1rem;
}
h1 {
font-size: 1.5rem;
font-weight: 600;
margin-bottom: 0.5rem;
}
.subtitle {
color: var(--text-dim);
font-size: 0.875rem;
margin-bottom: 2rem;
}
.card {
background: var(--surface);
border: 1px solid var(--border);
border-radius: 12px;
padding: 1.5rem;
width: 100%;
max-width: 480px;
margin-bottom: 1rem;
}
.card h2 {
font-size: 0.875rem;
font-weight: 600;
text-transform: uppercase;
letter-spacing: 0.05em;
color: var(--text-dim);
margin-bottom: 1rem;
}
/* Mode indicator */
.mode-display {
display: flex;
align-items: center;
gap: 0.75rem;
margin-bottom: 1.5rem;
}
.mode-dot {
width: 12px;
height: 12px;
border-radius: 50%;
flex-shrink: 0;
}
.mode-dot.work { background: var(--green); box-shadow: 0 0 8px var(--green); }
.mode-dot.lab { background: var(--orange); box-shadow: 0 0 8px var(--orange); }
.mode-dot.mixed { background: var(--purple); box-shadow: 0 0 8px var(--purple); }
.mode-label {
font-size: 1.25rem;
font-weight: 600;
}
/* Toggle switch */
.toggle-container {
display: flex;
gap: 0;
border-radius: 8px;
overflow: hidden;
border: 1px solid var(--border);
}
.toggle-btn {
flex: 1;
padding: 0.75rem 1.5rem;
border: none;
background: transparent;
color: var(--text-dim);
font-size: 0.9rem;
font-weight: 500;
cursor: pointer;
transition: all 0.2s;
}
.toggle-btn:hover { background: rgba(255,255,255,0.05); }
.toggle-btn.active-work {
background: var(--green-dim);
color: white;
}
.toggle-btn.active-lab {
background: var(--orange-dim);
color: white;
}
.toggle-btn:disabled {
opacity: 0.5;
cursor: wait;
}
/* Agent list */
.agent-list {
list-style: none;
}
.agent-item {
display: flex;
justify-content: space-between;
align-items: center;
padding: 0.5rem 0;
border-bottom: 1px solid var(--border);
}
.agent-item:last-child { border-bottom: none; }
.agent-name {
font-weight: 500;
}
.agent-model {
font-size: 0.8rem;
color: var(--text-dim);
font-family: 'SF Mono', SFMono-Regular, Consolas, 'Liberation Mono', Menlo, monospace;
}
.agent-model.ollama { color: var(--green); }
.agent-model.groq { color: var(--blue); }
/* Lab model selector */
.lab-model-row {
display: flex;
gap: 0.5rem;
align-items: center;
margin-top: 0.75rem;
}
.lab-model-row select {
flex: 1;
padding: 0.5rem;
border-radius: 6px;
border: 1px solid var(--border);
background: var(--bg);
color: var(--text);
font-size: 0.85rem;
font-family: inherit;
}
.lab-model-row button {
padding: 0.5rem 1rem;
border-radius: 6px;
border: 1px solid var(--border);
background: var(--surface);
color: var(--text);
cursor: pointer;
font-size: 0.85rem;
transition: all 0.2s;
}
.lab-model-row button:hover {
background: rgba(255,255,255,0.1);
}
/* Status bar */
.status-bar {
text-align: center;
font-size: 0.8rem;
color: var(--text-dim);
margin-top: 1rem;
min-height: 1.2em;
}
.status-bar.error { color: var(--red); }
.status-bar.success { color: var(--green); }
/* Loading */
.loading {
text-align: center;
padding: 2rem;
color: var(--text-dim);
}
@keyframes spin {
to { transform: rotate(360deg); }
}
.spinner {
display: inline-block;
width: 20px;
height: 20px;
border: 2px solid var(--border);
border-top-color: var(--blue);
border-radius: 50%;
animation: spin 0.8s linear infinite;
margin-right: 0.5rem;
vertical-align: middle;
}
</style>
</head>
<body>
<h1>🔀 Ollama GPU Switcher</h1>
<p class="subtitle">Toggle agents between work mode and lab experiments</p>
<div class="card">
<h2>Current Mode</h2>
<div class="mode-display">
<div id="mode-dot" class="mode-dot"></div>
<span id="mode-label" class="mode-label">Loading...</span>
</div>
<div class="toggle-container">
<button id="btn-work" class="toggle-btn" onclick="switchMode('work')">
🛠️ Work Mode
</button>
<button id="btn-lab" class="toggle-btn" onclick="switchMode('lab')">
🧪 Lab Mode
</button>
</div>
</div>
<div class="card">
<h2>GPU Agents</h2>
<ul id="agent-list" class="agent-list">
<li class="loading"><span class="spinner"></span> Loading...</li>
</ul>
</div>
<div class="card">
<h2>Lab Agent (Eric)</h2>
<div id="lab-info" style="margin-bottom: 0.5rem;">
<span class="agent-model">Loading...</span>
</div>
<div class="lab-model-row">
<select id="lab-model-select">
<option value="ollama/granite4:32b-a9b-h">granite4:32b-a9b-h</option>
<option value="ollama/qwen3-128k:14b">qwen3-128k:14b</option>
<option value="ollama/qwen3:14b">qwen3:14b</option>
<option value="ollama/gpt-oss:20b">gpt-oss:20b</option>
<option value="ollama/gpt-oss:20b-64k">gpt-oss:20b-64k</option>
<option value="ollama/gemma3:27b">gemma3:27b</option>
<option value="ollama/gemma3:12b">gemma3:12b</option>
<option value="ollama/granite3.3:latest">granite3.3:latest</option>
<option value="groq/llama-3.3-70b-versatile">groq (cloud)</option>
</select>
<button onclick="setLabModel()">Apply</button>
</div>
</div>
<div id="status-bar" class="status-bar"></div>
<script>
let currentMode = 'unknown';
let switching = false;
async function fetchStatus() {
try {
const r = await fetch('/api/status');
const data = await r.json();
if (!data.ok) throw new Error(data.error);
updateUI(data);
} catch (e) {
showStatus('Failed to fetch status: ' + e.message, 'error');
}
}
function updateUI(data) {
currentMode = data.mode;
// Mode indicator
const dot = document.getElementById('mode-dot');
const label = document.getElementById('mode-label');
dot.className = 'mode-dot ' + data.mode;
const modeNames = {
work: '🛠️ Work Mode — Agents on qwen3',
lab: '🧪 Lab Mode — Agents on groq, GPU free',
mixed: '⚠️ Mixed — Check agent config',
};
label.textContent = modeNames[data.mode] || data.mode;
// Toggle buttons
const btnWork = document.getElementById('btn-work');
const btnLab = document.getElementById('btn-lab');
btnWork.className = 'toggle-btn' + (data.mode === 'work' ? ' active-work' : '');
btnLab.className = 'toggle-btn' + (data.mode === 'lab' ? ' active-lab' : '');
// Agent list
const list = document.getElementById('agent-list');
list.innerHTML = data.agents.map(a => {
const isOllama = a.model.includes('ollama/');
const cls = isOllama ? 'ollama' : 'groq';
const shortModel = a.model.replace('ollama/', '').replace('groq/', '');
return `<li class="agent-item">
<span class="agent-name">${a.name}</span>
<span class="agent-model ${cls}">${shortModel}</span>
</li>`;
}).join('');
// Lab info
const labInfo = document.getElementById('lab-info');
const shortLab = data.lab.model.replace('ollama/', '').replace('groq/', '');
const labCls = data.lab.model.includes('ollama/') ? 'ollama' : 'groq';
labInfo.innerHTML = `Current: <span class="agent-model ${labCls}">${shortLab}</span>`;
// Set select to current value
const select = document.getElementById('lab-model-select');
for (let opt of select.options) {
if (opt.value === data.lab.model) {
opt.selected = true;
break;
}
}
}
async function switchMode(mode) {
if (switching) return;
if (mode === currentMode) return;
switching = true;
const btns = document.querySelectorAll('.toggle-btn');
btns.forEach(b => b.disabled = true);
showStatus('Switching to ' + mode + ' mode...', '');
try {
const r = await fetch('/api/switch', {
method: 'POST',
headers: {'Content-Type': 'application/json'},
body: JSON.stringify({mode}),
});
const data = await r.json();
if (!data.ok) throw new Error(data.error);
showStatus('Switched to ' + mode + ' mode. Gateway restarting...', 'success');
// Wait for gateway to restart, then refresh
setTimeout(async () => {
await fetchStatus();
switching = false;
btns.forEach(b => b.disabled = false);
}, 3000);
} catch (e) {
showStatus('Switch failed: ' + e.message, 'error');
switching = false;
btns.forEach(b => b.disabled = false);
}
}
async function setLabModel() {
const select = document.getElementById('lab-model-select');
const model = select.value;
showStatus('Setting lab model to ' + model + '...', '');
try {
const r = await fetch('/api/lab-model', {
method: 'POST',
headers: {'Content-Type': 'application/json'},
body: JSON.stringify({model}),
});
const data = await r.json();
if (!data.ok) throw new Error(data.error);
showStatus('Lab model updated. Gateway restarting...', 'success');
setTimeout(fetchStatus, 3000);
} catch (e) {
showStatus('Failed: ' + e.message, 'error');
}
}
function showStatus(msg, type) {
const bar = document.getElementById('status-bar');
bar.textContent = msg;
bar.className = 'status-bar' + (type ? ' ' + type : '');
}
// Init
fetchStatus();
// Auto-refresh every 30s
setInterval(fetchStatus, 30000);
</script>
</body>
</html>