Demesne is a git-versioned, configure-only deployment tool. It configures already-provisioned OVHcloud hosts (Ubuntu 24.04 LTS) to run a private, hardened AI control plane suitable for processing real personal data under UK GDPR / DPA 2018.
Scope (lean control plane). Demesne hosts the LiteLLM gateway, Open
WebUI, and Periscope (later) on the OVH control host. Product inference
runs on hosted Claude under the Anthropic DPA. On-host GPU / local
inference is deferred/conditional (activated only if a client forbids hosted
models), and autonomous agents are out of scope — deal-flow automation runs
off-box (Claude Routines + Claude Code/Cowork + GitHub + Slack).
It does not provision infrastructure. It configures existing hosts via Ansible, behind a reviewed plan → apply workflow. The durable artifact is this repository: deterministic, idempotent, and auditable. An LLM may author and drive the tooling; the deterministic mutation is always Ansible's, behind a dry run you review.
Status: under construction. See
CHANGELOG.mdfor milestone progress andCOMPLETION-REPORT.md(when present) for the full validation summary.
Tailscale tailnet (private, deny-by-default)
┌──────────────────────────────────────────────────────────────┐
│ │
│ control (small VPS) gpu (DORMANT — deferred)│
│ ┌────────────────────────┐ ┌────────────────────┐ │
│ │ Coolify (mgmt, :8000) │ │ Conditional GPU │ │
│ │ dashboard NOT public │ │ tier — activated │ │
│ ├────────────────────────┤ │ ONLY if a client │ │
│ │ gateway (LiteLLM) ─────┼─► hosted │ forbids hosted │ │
│ │ open-webui (chat UI) │ Claude │ models: │ │
│ │ periscope (later) │ (DPA) │ Ollama/vLLM + │ │
│ │ │ │ NVIDIA + worker │ │
│ └────────────────────────┘ └────────────────────┘ │
└──────────────────────────────────────────────────────────────┘
- control — Coolify (deployment/management) plus the light containers: the model gateway, the chat UI, and (later) Periscope.
- gpu — dormant. The GPU/local-inference tier (Ollama/vLLM, NVIDIA,
Coolify worker) is deferred and activated only if a client forbids hosted
models. No
gpuhost is in the default inventory.
Open WebUI ─┐ virtual key ┌─► hosted Claude (Anthropic DPA)
Periscope ──┼──► GATEWAY ────────┤ (de-identified data only)
(later) ┘ (LiteLLM, :4000) └─► local inference (Ollama/vLLM) — DORMANT
│ (GPU tier; activate only if required)
single place for:
• provider keys • data-handling policy
• budgets/virtual keys• redacted logging
The gateway (LiteLLM) is the hub. It is the only place provider keys live and the only place data-handling policy is enforced: product traffic goes to hosted Claude under the Anthropic DPA with de-identified data; prompt/response logging is redacted/minimised. Consumers get a budget-limited, revocable virtual key — never a raw provider key. (The PII-safe local route is dormant until the GPU tier is activated.)
Private by default over Tailscale, with three segmented tiers:
| Tier | Tailscale tag | May reach |
|---|---|---|
| team | tag:team |
chat UI + gateway (over tailnet) |
| guests (semi-trusted) | tag:guest |
nothing sensitive — never the gateway |
| agents (reserved) | tag:agent |
out of scope — automation runs off-box; tag kept as a reserved pattern, no active grant |
The Coolify dashboard and the gateway API stay off the public internet
(tailnet / loopback only). Because Docker bypasses UFW, the firewall
role applies a DOCKER-USER-chain fix and services bind to 127.0.0.1 where
possible. The intended tailnet ACL is policy/tailscale-acls.hujson (applied by
the operator in the Tailscale admin — see DECISIONS.md).
First-class, not bolted on — see policy/:
- Segmentation (
segmentation.md,tailscale-acls.hujson) - Gateway-enforced data handling (
data-handling.md) — hosted Claude under the Anthropic DPA, de-identified data only; PII-safe local route dormant - Encryption — TLS in transit; encrypted volumes + backups at rest (root-FDE
limitation on pre-provisioned hosts honestly documented in
art32-measures.md) - Audit logging (
art32-measures.md) + the.claude/audit hook - Retention & erasure (
retention-and-erasure.md) - DPIA scaffolding (
dpia-scope.md) and the Article 32 measures register (art32-measures.md)
Any Claude Code session operating this repo inherits:
- Permission boundary (
settings.json): denies reading secret/PII paths and blast-radius commands; a generous allowlist is fine because deny always wins. Mutatingtask applyand pushes prompt first. - Audit + guard hook (
hooks/audit-and-guard.sh): aPreToolUsehook that appends every Bash command to an append-only audit log and hard-blocks blast-radius, secret-exfiltration, and guardrail-tampering patterns. Unit-test it withtask guard-test. - Conventions (
CLAUDE.md): the operating rules and §5 invariants.
Control node (your laptop / CI), not the target hosts:
- Python 3.12+,
task(go-task),git sops+age(secrets),gitleaks(secret-scan),shellcheck(lint)ansible-core2.21 and the pinned collections — installed bytask setup
Target hosts (already provisioned by you):
- Two OVH instances running Ubuntu 24.04 LTS (one small
control, onegpu), reachable over Tailscale, with a non-root SSH user (key-only) that cansudo. Thegpuhost has an NVIDIA GPU.
# 0. Tooling + collections
task setup
# 1. Inventory (real, tailnet addresses; gitignored)
cp inventory/hosts.example.yml inventory/hosts.yml
$EDITOR inventory/hosts.yml
# 2. Secrets (SOPS + age)
age-keygen -o ~/.config/sops/age/keys.txt # once; back this up
$EDITOR .sops.yaml # put your age PUBLIC recipients
cp inventory/group_vars/all/secrets.sops.yaml.example \
inventory/group_vars/all/secrets.sops.yaml
$EDITOR inventory/group_vars/all/secrets.sops.yaml # real values
sops --encrypt --in-place inventory/group_vars/all/secrets.sops.yaml
# 3. Validate locally (never touches a host)
task lint && task test
# 4. DRY RUN — review the diff carefully
task plan # ansible-playbook --check --diff
# (optionally: task plan LIMIT=control or task plan TAGS=common)
# 5. APPLY (prompts first)
task applyThen apply the tailnet ACL (policy/tailscale-acls.hujson) in the Tailscale
admin, mint Open WebUI's gateway virtual key (services/gateway/README.md), and
run the post-apply checks in runbooks/apply.md.
inventory/ control + gpu groups, group_vars, *.sops.yaml(.example) secrets
roles/ OS/platform config: common, ssh_hardening, tailscale, firewall,
docker, coolify (+ nvidia, coolify_worker — dormant GPU tier)
services/ Coolify/Compose templates (digest-pinned): gateway, open-webui,
inference (dormant), and _template/ (copy-me)
policy/ compliance as docs/config (DPIA, data-handling, retention,
segmentation, Tailscale ACLs, OVH firewall, Art. 32 register)
runbooks/ operator + future-session workflows (apply, add-service, agent-eval, day-2)
playbooks/ site.yml and per-layer playbooks
tests/ lint/secret-scan/validators/guard-hook/idempotency harness
docs/adr/ architecture decision records
.claude/ guardrails: settings.json, CLAUDE.md, audit+guard hook
Taskfile.yml entry points ansible.cfg .sops.yaml .gitleaks.toml
task lint (yamllint + ansible-lint production + shellcheck) and task test
(syntax-check + secret-scan over tree & history + inventory/SOPS/services
validators + guard-hook unit test) run locally and in CI on every push. Role
idempotency runs via task idempotency (container double-run; capability-aware,
falling back to operator-verify where Docker is unavailable).
Adding a service is a clean, repeatable operation — copy services/_template/
and follow runbooks/add-service.md. Every new service must respect the §5
invariants by construction (digest-pinned; gateway-routed via a virtual key, not
raw provider keys; correct trust tier; *.example secrets; data-handling
documented if it touches personal data).
Not yet chosen — see DECISIONS.md item 7.