From debfb466ad1821a6e5ebbe6938dd01e72f1196ea Mon Sep 17 00:00:00 2001
From: Michael
Date: Mon, 23 Feb 2026 11:14:40 -0500
Subject: [PATCH] Add comprehensive deployment guide with systemd service setup and LXC configuration
---
 DEPLOYMENT.md | 379 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 379 insertions(+)
 create mode 100644 DEPLOYMENT.md

diff --git a/DEPLOYMENT.md b/DEPLOYMENT.md
new file mode 100644
index 0000000..8f9c168
--- /dev/null
+++ b/DEPLOYMENT.md
@@ -0,0 +1,379 @@

# Deployment Guide

This guide covers deploying watsonx-openai-proxy in production environments.

## System Requirements

### Fedora 43 (or similar RPM-based distributions)

#### Essential Packages

```bash
sudo dnf install -y \
    python3.12 \
    python3-pip \
    git
```

#### Optional Build Tools (for compiling Python packages)

```bash
sudo dnf install -y \
    python3-devel \
    gcc \
    gcc-c++ \
    make \
    libffi-devel \
    openssl-devel \
    zlib-devel
```

**Note**: Most Python packages have pre-built wheels for x86_64 Linux, so build tools are rarely needed.

## Installation

### 1. Clone Repository

```bash
cd /home/app
git clone watsonx-openai-proxy
cd watsonx-openai-proxy
```

### 2. Install Dependencies

```bash
# Using system Python
python3 -m pip install --user -r requirements.txt

# Or using virtual environment (recommended)
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

### 3. Configure Environment

Create `.env` file (copy from `.env.example`):

```bash
cp .env.example .env
```

**IMPORTANT**: Remove all inline comments from the `.env` file. Pydantic cannot parse values with inline comments.
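If an existing `.env` already carries such comments, one way to strip them is a quick `sed` pass. This is a sketch only: it assumes no value legitimately contains a `#` character, and the `/tmp/demo.env` file is purely illustrative.

```shell
# Create a sample .env with the problematic inline comments.
cat > /tmp/demo.env <<'EOF'
LOG_LEVEL=info # Options: debug, info, warning, error
TOKEN_REFRESH_INTERVAL=3000 # Refresh token every N seconds
PORT=8000
EOF

# Delete everything from the first '#' to end of line, then drop
# any lines left empty. Assumes no value itself contains '#'.
sed -i -e 's/[[:space:]]*#.*$//' -e '/^[[:space:]]*$/d' /tmp/demo.env

cat /tmp/demo.env
```

Review the result before restarting the service; any value that did contain a `#` (unusual for this configuration) would be truncated.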
**Correct format:**

```bash
IBM_CLOUD_API_KEY=your_api_key_here
WATSONX_PROJECT_ID=your_project_id_here
WATSONX_CLUSTER=us-south
HOST=0.0.0.0
PORT=8000
LOG_LEVEL=info
TOKEN_REFRESH_INTERVAL=3000
```

**Incorrect format (will cause errors):**

```bash
LOG_LEVEL=info # Options: debug, info, warning, error
TOKEN_REFRESH_INTERVAL=3000 # Refresh token every N seconds
```

## Systemd Service Setup

### 1. Create Service Unit

Create `/etc/systemd/system/watsonx-proxy.service`:

```ini
[Unit]
Description=watsonx OpenAI Proxy
After=network.target

[Service]
Type=simple
User=app
Group=app
WorkingDirectory=/home/app/watsonx-openai-proxy
Environment="PATH=/usr/local/bin:/usr/bin:/bin"
EnvironmentFile=/home/app/watsonx-openai-proxy/.env
ExecStart=/usr/bin/python3 -m uvicorn app.main:app --host 0.0.0.0 --port 8000 --workers 2
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
```

### 2. Enable and Start Service

```bash
# Reload systemd
sudo systemctl daemon-reload

# Enable service (start on boot)
sudo systemctl enable watsonx-proxy.service

# Start service
sudo systemctl start watsonx-proxy.service

# Check status
sudo systemctl status watsonx-proxy.service
```

### 3. View Logs

```bash
# Follow logs
sudo journalctl -u watsonx-proxy.service -f

# View last 50 lines
sudo journalctl -u watsonx-proxy.service -n 50
```
### 4. Service Management

```bash
# Stop service
sudo systemctl stop watsonx-proxy.service

# Restart service
sudo systemctl restart watsonx-proxy.service

# Disable auto-start
sudo systemctl disable watsonx-proxy.service
```

## LXC Container Deployment

### Recommended Resources (5 req/s)

#### Minimum Configuration
- **CPU**: 1 core (1000 CPU shares)
- **RAM**: 2 GB
- **Storage**: 10 GB
- **Swap**: 1 GB

#### Recommended Configuration
- **CPU**: 2 cores (2000 CPU shares)
- **RAM**: 4 GB
- **Storage**: 20 GB
- **Swap**: 2 GB

#### Optimal Configuration
- **CPU**: 4 cores (4000 CPU shares)
- **RAM**: 8 GB
- **Storage**: 50 GB
- **Swap**: 4 GB

### Proxmox LXC Configuration

Edit `/etc/pve/lxc/.conf`:

```ini
# CPU allocation
cores: 2
cpulimit: 2
cpuunits: 2000

# Memory allocation
memory: 4096
swap: 2048

# Storage
rootfs: local-lvm:vm--disk-0,size=20G

# Network
net0: name=eth0,bridge=vmbr0,firewall=1,ip=dhcp,type=veth
```

### Resource Monitoring

```bash
# CPU usage
lxc-cgroup -n cpu.stat

# Memory usage
lxc-cgroup -n memory.usage_in_bytes

# Network stats
lxc-attach -n -- ifconfig eth0
```

## Python Version Management

### Using update-alternatives

```bash
# Set up alternatives
sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.12 1
sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.14 2

# Switch version
sudo update-alternatives --config python3

# Fix pip if needed
python3 -m ensurepip --default-pip --upgrade
```

### Verify Installation

```bash
python3 --version
python3 -m pip --version
```

## Troubleshooting

### pip Module Not Found

After switching Python versions:

```bash
python3 -m ensurepip --default-pip --upgrade
python3 -m pip --version
```

### Service Fails to Start

Check logs for errors:

```bash
sudo journalctl -u watsonx-proxy.service -n 100 --no-pager
```

Common issues:
1. **Inline comments in .env**: Remove all `# comments` from environment variable values
2. **Missing dependencies**: Run `pip install -r requirements.txt`
3. **Permission errors**: Ensure the `app` user owns `/home/app/watsonx-openai-proxy`
4. **Port already in use**: Change `PORT` in `.env` or stop the conflicting service

### Token Refresh Errors

Check IBM Cloud credentials:

```bash
# Test token generation
curl -X POST "https://iam.cloud.ibm.com/identity/token" \
    -H "Content-Type: application/x-www-form-urlencoded" \
    -d "grant_type=urn:ibm:params:oauth:grant-type:apikey&apikey=YOUR_API_KEY"
```

### High Memory Usage

Reduce the number of workers:

```bash
# Edit service file
ExecStart=/usr/bin/python3 -m uvicorn app.main:app --host 0.0.0.0 --port 8000 --workers 1

# Restart service
sudo systemctl restart watsonx-proxy.service
```

## Performance Tuning

### Worker Configuration

- **1 worker**: ~50 MB RAM, handles ~5 req/s
- **2 workers**: ~100 MB RAM, handles ~10 req/s
- **4 workers**: ~200 MB RAM, handles ~20 req/s

### Scaling Strategy

1. **Vertical scaling**: Increase workers up to the number of CPU cores
2. **Horizontal scaling**: Deploy multiple instances behind a load balancer
3. **Auto-scaling**: Monitor CPU/memory and scale based on thresholds

## Security Considerations

### API Key Authentication

Enable proxy authentication:

```bash
# In .env file
API_KEY=your_secure_random_key_here
```

### CORS Configuration

Restrict origins:

```bash
# In .env file
ALLOWED_ORIGINS=https://app1.example.com,https://app2.example.com
```

### Firewall Rules

```bash
# Allow only specific IPs
sudo firewall-cmd --permanent --add-rich-rule='rule family="ipv4" source address="10.0.0.0/8" port protocol="tcp" port="8000" accept'
sudo firewall-cmd --reload
```

## Monitoring

### Health Check

```bash
curl http://localhost:8000/health
```

### Metrics

Monitor these metrics:
- CPU usage (should stay <70%)
- Memory usage (should stay <80%)
- Response times
- Error rates
- Token refresh success rate

### Log Levels

Adjust in `.env` — the comments below are explanatory only; per the note above, omit them in the actual file:

```bash
LOG_LEVEL=debug    # For troubleshooting
LOG_LEVEL=info     # For production
LOG_LEVEL=warning  # For minimal logging
```

## Backup and Recovery

### Backup Configuration

```bash
# Backup .env file
sudo cp /home/app/watsonx-openai-proxy/.env /backup/.env.$(date +%Y%m%d)

# Backup service file
sudo cp /etc/systemd/system/watsonx-proxy.service /backup/
```

### Disaster Recovery

```bash
# Restore configuration
sudo cp /backup/.env.YYYYMMDD /home/app/watsonx-openai-proxy/.env
sudo systemctl restart watsonx-proxy.service
```

## Updates

### Update Application

```bash
cd /home/app/watsonx-openai-proxy
git pull
pip install -r requirements.txt --upgrade
sudo systemctl restart watsonx-proxy.service
```

### Zero-Downtime Updates

Use multiple instances behind a load balancer and update one at a time.

---

For additional help, see:
- [README.md](README.md) - General usage and features
- [MODELS.md](MODELS.md) - Available models
- [AGENTS.md](AGENTS.md) - Development guidelines
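As a sketch of the multi-instance setup that the Zero-Downtime Updates section assumes: a systemd template unit can run one proxy per port on a single host. The unit name `watsonx-proxy@.service` and the per-port instances are illustrative assumptions, not part of the project.

```ini
# /etc/systemd/system/watsonx-proxy@.service
# Instantiate one copy per port, e.g.:
#   sudo systemctl enable --now watsonx-proxy@8000 watsonx-proxy@8001
# then restart instances one at a time while the load balancer
# health-checks GET /health on each port.
[Unit]
Description=watsonx OpenAI Proxy (port %i)
After=network.target

[Service]
Type=simple
User=app
Group=app
WorkingDirectory=/home/app/watsonx-openai-proxy
EnvironmentFile=/home/app/watsonx-openai-proxy/.env
ExecStart=/usr/bin/python3 -m uvicorn app.main:app --host 0.0.0.0 --port %i --workers 1
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
```

Here `%i` is the systemd instance name (the text after `@`), so the `--port %i` flag controls each instance's bind port and both instances can share one `.env`.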