farm-manager/README.md
pluto fe76ca7456 fix: agent Dockerfile package structure and Dead container crash
- Dockerfile: COPY to /app/agent/ and use agent.main:app for proper
  package imports
- docker_ops: use low-level API in get_health() to avoid NotFound on
  containers stuck in Docker Dead state
- Add comprehensive README with architecture, API docs, and usage

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 22:48:20 -06:00

# Farm Manager
Web dashboard for managing Docker services across the homelab cluster. Provides full control (start/stop/restart/logs/pull) with service grouping for bulk operations.
## Architecture
```
Browser --> API Server (hf-pdocker-01:8888)
                 |
     +-----------+-----------+
     |           |           |
   Agent       Agent       Agent
 (01:8889)   (02:8889)  (bart:8889)
     |           |           |
   Docker      Docker      Docker
   Socket      Socket      Socket
```
- **API Server** — FastAPI on :8888. Serves the dashboard, proxies commands to node agents, manages service groups.
- **Node Agent** — FastAPI on :8889 per node. Mounts Docker socket, auto-detects Swarm vs Compose containers.
- **Dashboard** — Vanilla HTML/CSS/JS. Dark theme, service cards, group management, log viewer.
## Quick Start
```bash
# Build and push images
docker build -f agent/Dockerfile -t 127.0.0.1:5050/farm-agent:latest agent/
docker build -f Dockerfile.server -t 127.0.0.1:5050/farm-manager:latest .
docker push 127.0.0.1:5050/farm-agent:latest
docker push 127.0.0.1:5050/farm-manager:latest
# Deploy agent on each node
cd /mnt/docker-data/compose/{node}/farm-agent && docker compose up -d
# Deploy server on hf-pdocker-01
cd /mnt/docker-data/compose/hf-pdocker-01/farm-manager && docker compose up -d
```
## Configuration
- `config.json`: Node definitions (name, host, agent_port)
- `groups.json`: Service groups for bulk operations (10 default groups included)

Both files are stored at `/mnt/docker-data/configs/farm-manager/`.
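
As a rough illustration, a `config.json` carrying the three fields named above could be loaded as follows. The top-level `nodes` key, the example host value, and the `Node` dataclass are assumptions made for this sketch, not the project's confirmed schema.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One node entry; field names follow the README (name, host, agent_port)."""
    name: str
    host: str
    agent_port: int

    @property
    def agent_base_url(self) -> str:
        # Base URL the server would use to reach this node's agent.
        return f"http://{self.host}:{self.agent_port}"


def parse_nodes(raw: str) -> list[Node]:
    """Parse config.json text into Node objects.

    Assumes a top-level "nodes" list (hypothetical layout).
    """
    data = json.loads(raw)
    return [
        Node(name=n["name"], host=n["host"], agent_port=int(n["agent_port"]))
        for n in data["nodes"]
    ]


# Hypothetical example content, not copied from the real config file.
EXAMPLE_CONFIG = """
{
  "nodes": [
    {"name": "hf-pdocker-01", "host": "hf-pdocker-01", "agent_port": 8889}
  ]
}
"""

nodes = parse_nodes(EXAMPLE_CONFIG)
```

Keeping the base URL as a derived property means the port only has to be stated once per node.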
## API
### Server (`:8888`)
| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/nodes` | List nodes with health |
| GET | `/api/services` | All containers across nodes |
| POST | `/api/services/{node}/{container}/start` | Start container |
| POST | `/api/services/{node}/{container}/stop` | Stop container |
| POST | `/api/services/{node}/{container}/restart` | Restart container |
| GET | `/api/services/{node}/{container}/logs` | Get logs |
| POST | `/api/services/{node}/{container}/pull` | Pull image |
| GET/POST/PUT/DELETE | `/api/groups[/{id}]` | Group CRUD |
| POST | `/api/groups/{id}/start\|stop\|restart` | Bulk group actions |
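
A bulk group action can be thought of as a fan-out over the per-container routes in the table above. The sketch below shows that expansion only. The group layout (a `members` list of `node`/`container` pairs) and the function name are hypothetical, since the `groups.json` schema is not documented here, and the server may implement bulk actions differently.

```python
GROUP_ACTIONS = {"start", "stop", "restart"}


def expand_group_action(group: dict, action: str) -> list[str]:
    """Return the per-container server paths equivalent to one group action.

    `group["members"]` is an assumed shape: a list of
    {"node": ..., "container": ...} dictionaries.
    """
    if action not in GROUP_ACTIONS:
        raise ValueError(f"unsupported group action: {action!r}")
    return [
        f"/api/services/{m['node']}/{m['container']}/{action}"
        for m in group["members"]
    ]


# Hypothetical group used only to illustrate the expansion.
example_group = {
    "id": "media",
    "members": [
        {"node": "hf-pdocker-01", "container": "web"},
        {"node": "bart", "container": "db"},
    ],
}

paths = expand_group_action(example_group, "restart")
```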
### Agent (`:8889`)
| Method | Path | Description |
|--------|------|-------------|
| GET | `/health` | Agent health check |
| GET | `/containers` | List all containers |
| POST | `/containers/{id}/start\|stop\|restart` | Container actions |
| GET | `/containers/{id}/logs` | Get logs |
| POST | `/containers/{id}/pull` | Pull image |
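
Because the server proxies commands to node agents, each server route under `/api/services/{node}/{container}/...` corresponds to an agent route under `/containers/{id}/...`. The following is a minimal sketch of that translation, assuming the `{container}` value is passed through unchanged as the agent's `{id}`. The `to_agent_request` helper and the `nodes` mapping are illustrative names, not the project's code.

```python
from urllib.parse import quote

# HTTP method per agent action, taken from the agent table above.
AGENT_METHODS = {
    "start": "POST",
    "stop": "POST",
    "restart": "POST",
    "pull": "POST",
    "logs": "GET",
}


def to_agent_request(
    nodes: dict[str, tuple[str, int]], node: str, container: str, action: str
) -> tuple[str, str]:
    """Map a server-side (node, container, action) to (method, agent URL).

    `nodes` maps a node name to (host, agent_port); this shape is an assumption.
    """
    if action not in AGENT_METHODS:
        raise ValueError(f"unsupported action: {action!r}")
    if node not in nodes:
        raise KeyError(f"unknown node: {node!r}")
    host, port = nodes[node]
    # Percent-encode the container reference so it is safe as one path segment.
    url = f"http://{host}:{port}/containers/{quote(container, safe='')}/{action}"
    return AGENT_METHODS[action], url


method, url = to_agent_request({"bart": ("bart", 8889)}, "bart", "web", "logs")
```

The returned pair can be handed to any HTTP client; the sketch deliberately stops short of sending the request.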
## Testing
```bash
pip install -r agent/requirements.txt -r server/requirements.txt pytest pytest-asyncio httpx
pytest -v
```
The suite contains 52 tests covering agent Docker operations, the agent API, server proxy routes, and group CRUD.
## Swarm Handling
The agent auto-detects Swarm containers via the `com.docker.swarm.service.name` label:
- **Start**: Scales service to 1 replica
- **Stop**: Scales service to 0 replicas
- **Restart**: Force-updates the service
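
The rules above can be sketched as a pure decision function over a container's labels. This is an illustration of the described behaviour, not the agent's actual implementation. The `plan_action` name and the returned dictionary shape are assumptions.

```python
SWARM_LABEL = "com.docker.swarm.service.name"


def plan_action(labels: dict[str, str], action: str) -> dict:
    """Decide how to carry out start/stop/restart for one container.

    Containers carrying the Swarm service label are handled at the service
    level; all others are acted on directly.
    """
    if action not in {"start", "stop", "restart"}:
        raise ValueError(f"unsupported action: {action!r}")

    service = labels.get(SWARM_LABEL)
    if service is None:
        # Not a Swarm task: act on the container itself.
        return {"target": "container", "op": action}

    if action == "start":
        return {"target": "service", "service": service, "op": "scale", "replicas": 1}
    if action == "stop":
        return {"target": "service", "service": service, "op": "scale", "replicas": 0}
    # Restart: force an update so Swarm replaces the service's tasks.
    return {"target": "service", "service": service, "op": "force_update"}


swarm_stop = plan_action({SWARM_LABEL: "web"}, "stop")
plain_restart = plan_action({}, "restart")
```

On the Docker CLI, the service-level operations correspond roughly to `docker service scale <service>=<n>` and `docker service update --force <service>`.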