WE3io/Demesne

Demesne — AI control-plane deployment tool (configure-only)

Demesne is a git-versioned, configure-only deployment tool. It configures already-provisioned OVHcloud hosts (Ubuntu 24.04 LTS) to run a private, hardened AI control plane suitable for processing real personal data under UK GDPR / DPA 2018.

Scope (lean control plane). Demesne hosts the LiteLLM gateway, Open WebUI, and Periscope (later) on the OVH control host. Product inference runs on hosted Claude under the Anthropic DPA. On-host GPU / local inference is deferred/conditional (activated only if a client forbids hosted models), and autonomous agents are out of scope — deal-flow automation runs off-box (Claude Routines + Claude Code/Cowork + GitHub + Slack).

It does not provision infrastructure. It configures existing hosts via Ansible, behind a reviewed plan → apply workflow. The durable artifact is this repository: deterministic, idempotent, and auditable. An LLM may author and drive the tooling; the deterministic mutation is always Ansible's, behind a dry run you review.

Status: under construction. See CHANGELOG.md for milestone progress and COMPLETION-REPORT.md (when present) for the full validation summary.


Architecture

Topology — two OVH hosts (both Ubuntu 24.04 LTS)

                 Tailscale tailnet (private, deny-by-default)
   ┌──────────────────────────────────────────────────────────────┐
   │                                                                │
   │   control (small VPS)                  gpu (DORMANT — deferred)│
   │   ┌────────────────────────┐           ┌────────────────────┐ │
   │   │ Coolify (mgmt, :8000)   │           │ Conditional GPU    │ │
   │   │  dashboard NOT public   │           │ tier — activated   │ │
   │   ├────────────────────────┤           │ ONLY if a client   │ │
   │   │ gateway  (LiteLLM) ─────┼─► hosted  │ forbids hosted     │ │
   │   │ open-webui (chat UI)    │   Claude  │ models:            │ │
   │   │ periscope  (later)      │  (DPA)    │  Ollama/vLLM +     │ │
   │   │                         │           │  NVIDIA + worker   │ │
   │   └────────────────────────┘           └────────────────────┘ │
   └──────────────────────────────────────────────────────────────┘
  • control — Coolify (deployment/management) plus the light containers: the model gateway, the chat UI, and (later) Periscope.
  • gpu (dormant). The GPU/local-inference tier (Ollama/vLLM, NVIDIA, Coolify worker) is deferred and activated only if a client forbids hosted models. No gpu host is in the default inventory.

Service architecture — one gateway, every consumer points at it

   Open WebUI ─┐   virtual key       ┌─► hosted Claude (Anthropic DPA)
   Periscope ──┼──►  GATEWAY  ────────┤    (de-identified data only)
   (later)     ┘   (LiteLLM, :4000)   └─► local inference (Ollama/vLLM) — DORMANT
                         │                 (GPU tier; activate only if required)
                   single place for:
                   • provider keys          • data-handling policy
                   • budgets/virtual keys   • redacted logging

The gateway (LiteLLM) is the hub. It is the only place provider keys live and the only place data-handling policy is enforced: product traffic goes to hosted Claude under the Anthropic DPA with de-identified data; prompt/response logging is redacted/minimised. Consumers get a budget-limited, revocable virtual key — never a raw provider key. (The PII-safe local route is dormant until the GPU tier is activated.)
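As an illustration of the virtual-key model described above, here is a minimal sketch of minting a budget-limited, expiring key through LiteLLM's `/key/generate` endpoint. The function names, the `LITELLM_MASTER_KEY` variable, and the example budget and lifetime are assumptions for illustration; the repository's own procedure lives in services/gateway/README.md and may differ.

```shell
# Build the JSON body for LiteLLM's POST /key/generate endpoint.
# Arguments: key alias, max budget (USD), lifetime (for example 30d).
key_request_body() {
  printf '{"key_alias":"%s","max_budget":%s,"duration":"%s"}\n' "$1" "$2" "$3"
}

# Mint a virtual key against the gateway on loopback (:4000).
# Assumes LITELLM_MASTER_KEY is exported in the operator's shell (hypothetical name).
mint_virtual_key() {
  curl --fail --silent --show-error \
    -X POST "http://127.0.0.1:4000/key/generate" \
    -H "Authorization: Bearer ${LITELLM_MASTER_KEY:?set LITELLM_MASTER_KEY}" \
    -H "Content-Type: application/json" \
    -d "$(key_request_body "$1" "$2" "$3")"
}

# Preview the request body without contacting the gateway.
key_request_body open-webui 25 30d
```

Running `mint_virtual_key open-webui 25 30d` on the control host would then return the new key, which the consumer uses in place of any provider key.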

Trust model & segmentation

Private by default over Tailscale, with three segmented tiers:

Tier                    Tailscale tag   May reach
team                    tag:team        chat UI + gateway (over tailnet)
guests (semi-trusted)   tag:guest       nothing sensitive; never the gateway
agents (reserved)       tag:agent       out of scope: automation runs off-box; tag kept as a reserved pattern, no active grant

The Coolify dashboard and the gateway API stay off the public internet (tailnet / loopback only). Because Docker bypasses UFW, the firewall role applies a DOCKER-USER-chain fix and services bind to 127.0.0.1 where possible. The intended tailnet ACL is policy/tailscale-acls.hujson (applied by the operator in the Tailscale admin — see DECISIONS.md).
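To make the Docker/UFW point concrete: published container ports are forwarded through Docker's own iptables chains and never hit UFW's input rules, so the filtering has to happen in the `DOCKER-USER` chain. The sketch below only prints one plausible rule set for review. It is not the firewall role's actual implementation; the function name and the `eth0` public interface are assumptions, and `tailscale0` is Tailscale's default interface name on Linux.

```shell
# Print (do not apply) a DOCKER-USER rule set that keeps published container
# ports reachable from the tailnet while dropping new connections that arrive
# on the public interface.
docker_user_rules() {
  pub_if="${1:?usage: docker_user_rules <public-interface>}"
  cat <<EOF
iptables -F DOCKER-USER
iptables -A DOCKER-USER -m conntrack --ctstate ESTABLISHED,RELATED -j RETURN
iptables -A DOCKER-USER -i tailscale0 -j RETURN
iptables -A DOCKER-USER -i ${pub_if} -j DROP
iptables -A DOCKER-USER -j RETURN
EOF
}

docker_user_rules eth0
```

Replies to container-initiated outbound traffic still pass through the ESTABLISHED,RELATED rule, so the gateway can reach hosted Claude while unsolicited public traffic is dropped.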

Compliance posture (UK GDPR / DPA 2018)

First-class, not bolted on — see policy/:

  • Segmentation (segmentation.md, tailscale-acls.hujson)
  • Gateway-enforced data handling (data-handling.md) — hosted Claude under the Anthropic DPA, de-identified data only; PII-safe local route dormant
  • Encryption — TLS in transit; encrypted volumes + backups at rest (root-FDE limitation on pre-provisioned hosts honestly documented in art32-measures.md)
  • Audit logging (art32-measures.md) + the .claude/ audit hook
  • Retention & erasure (retention-and-erasure.md)
  • DPIA scaffolding (dpia-scope.md) and the Article 32 measures register (art32-measures.md)

Safety guardrails (.claude/)

Any Claude Code session operating this repo inherits:

  • Permission boundary (settings.json): denies reads of secret/PII paths and denies blast-radius commands; a generous allowlist is fine because deny always wins. Mutating actions (task apply and pushes) prompt first.
  • Audit + guard hook (hooks/audit-and-guard.sh): a PreToolUse hook that appends every Bash command to an append-only audit log and hard-blocks blast-radius, secret-exfiltration, and guardrail-tampering patterns. Unit-test it with task guard-test.
  • Conventions (CLAUDE.md): the operating rules and §5 invariants.
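The audit-and-guard behaviour described above can be sketched as a single shell function: log first, then block on a match. This is an illustrative approximation, not the contents of hooks/audit-and-guard.sh; the log path, the function name, and the deliberately coarse patterns are all assumptions.

```shell
# Hypothetical log location; the real hook's path is not shown in this README.
AUDIT_LOG="${AUDIT_LOG:-./audit.log}"

# Append the command to the audit log, then return 2 (block) if it matches a
# deny pattern. Patterns here are coarse examples, not the real rule set.
guard_check() {
  cmd="$1"
  printf '%s\t%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$cmd" >> "$AUDIT_LOG"
  case "$cmd" in
    *"rm -rf /"*|*mkfs*|*"dd if="*)            reason="blast-radius command" ;;
    *"sops -d"*|*"sops --decrypt"*|*keys.txt*) reason="possible secret exfiltration" ;;
    *.claude/*)                                reason="guardrail tampering" ;;
    *) return 0 ;;
  esac
  echo "BLOCKED ($reason): $cmd" >&2
  return 2
}
```

In a PreToolUse hook, Claude Code supplies the tool call as JSON on stdin and treats exit code 2 as a block, so a wrapper along the lines of `guard_check "$(jq -r '.tool_input.command')"` would connect this function to the session. The command is logged before the pattern check so that blocked attempts are audited too.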

Prerequisites

Control node (your laptop / CI), not the target hosts:

  • Python 3.12+, task (go-task), git
  • sops + age (secrets), gitleaks (secret-scan), shellcheck (lint)
  • ansible-core 2.21 and the pinned collections — installed by task setup

Target hosts (already provisioned by you):

  • OVH instances running Ubuntu 24.04 LTS, reachable over Tailscale, with a non-root SSH user (key-only) that can sudo: one small control host, plus a gpu host with an NVIDIA GPU only if the dormant GPU tier is activated (no gpu host is in the default inventory).

Operator workflow (quickstart)

# 0. Tooling + collections
task setup

# 1. Inventory (real, tailnet addresses; gitignored)
cp inventory/hosts.example.yml inventory/hosts.yml
$EDITOR inventory/hosts.yml

# 2. Secrets (SOPS + age)
mkdir -p ~/.config/sops/age                         # age-keygen does not create parent directories
age-keygen -o ~/.config/sops/age/keys.txt          # once; back this up
$EDITOR .sops.yaml                                  # put your age PUBLIC recipients
cp inventory/group_vars/all/secrets.sops.yaml.example \
   inventory/group_vars/all/secrets.sops.yaml
$EDITOR inventory/group_vars/all/secrets.sops.yaml  # real values
sops --encrypt --in-place inventory/group_vars/all/secrets.sops.yaml

# 3. Validate locally (never touches a host)
task lint && task test

# 4. DRY RUN — review the diff carefully
task plan                 # ansible-playbook --check --diff
# (optionally: task plan LIMIT=control   or   task plan TAGS=common)

# 5. APPLY (prompts first)
task apply

Then apply the tailnet ACL (policy/tailscale-acls.hujson) in the Tailscale admin, mint Open WebUI's gateway virtual key (services/gateway/README.md), and run the post-apply checks in runbooks/apply.md.
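One check an operator might run after apply is that the Coolify dashboard (:8000) and the gateway (:4000) are not listening on a public address. The sketch below is an assumption about what such a check could look like, not a copy of runbooks/apply.md; the function name is hypothetical. It filters `ss -tlnH` output, where the fourth field is the local address and port.

```shell
# Print any listener on the given comma-separated ports that is not bound to
# loopback. Reads `ss -tlnH` output on stdin.
public_listeners() {
  awk -v ports="$1" '
    BEGIN { n = split(ports, p, ","); for (i = 1; i <= n; i++) want[p[i]] = 1 }
    {
      addr = $4
      port = addr; sub(/.*:/, "", port)                        # text after the last colon
      host = substr(addr, 1, length(addr) - length(port) - 1)  # text before it
      if ((port in want) && host != "127.0.0.1" && host != "[::1]") print addr
    }'
}

# On the control host:
#   ss -tlnH | public_listeners 8000,4000
```

Empty output means both ports are loopback-only. Any address it prints needs a look: a tailnet address may be intentional, while `0.0.0.0` or `*` means the service is relying on the firewall alone.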

Repository layout

inventory/   control + gpu groups, group_vars, *.sops.yaml(.example) secrets
roles/       OS/platform config: common, ssh_hardening, tailscale, firewall,
             docker, coolify (+ nvidia, coolify_worker — dormant GPU tier)
services/    Coolify/Compose templates (digest-pinned): gateway, open-webui,
             inference (dormant), and _template/ (copy-me)
policy/      compliance as docs/config (DPIA, data-handling, retention,
             segmentation, Tailscale ACLs, OVH firewall, Art. 32 register)
runbooks/    operator + future-session workflows (apply, add-service, agent-eval, day-2)
playbooks/   site.yml and per-layer playbooks
tests/       lint/secret-scan/validators/guard-hook/idempotency harness
docs/adr/    architecture decision records
.claude/     guardrails: settings.json, CLAUDE.md, audit+guard hook
Taskfile.yml entry points    ansible.cfg    .sops.yaml    .gitleaks.toml

Validation gates

task lint (yamllint, ansible-lint with the production profile, shellcheck) and task test (syntax-check, secret-scan over tree and history, inventory/SOPS/services validators, guard-hook unit test) run locally and in CI on every push. Role idempotency runs via task idempotency: a container double-run that is capability-aware and falls back to operator-verify where Docker is unavailable.
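The idea behind a double-run idempotency check is that a second run of the same playbook should change nothing. A minimal sketch of that assertion, assuming the standard `ansible-playbook` PLAY RECAP format, is below; the function name is hypothetical and the real task idempotency harness may work differently.

```shell
# Succeed only if every host line in a PLAY RECAP reports no changes,
# failures, or unreachable hosts. Reads ansible-playbook output on stdin.
recap_is_idempotent() {
  recap="$(grep -E ' ok=[0-9]+ ')" || return 1   # no recap lines found: fail closed
  ! printf '%s\n' "$recap" | grep -Eq '(changed|failed|unreachable)=[1-9]'
}

# Example (second run of the same playbook):
#   ansible-playbook playbooks/site.yml | recap_is_idempotent && echo "idempotent"
```

Failing closed when no recap is present avoids reporting success for a run that crashed before finishing.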

Extending the stack

Adding a service is a clean, repeatable operation — copy services/_template/ and follow runbooks/add-service.md. Every new service must respect the §5 invariants by construction (digest-pinned; gateway-routed via a virtual key, not raw provider keys; correct trust tier; *.example secrets; data-handling documented if it touches personal data).
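The digest-pinning invariant is easy to check mechanically. The sketch below lists any `image:` line in a Compose file that lacks an `@sha256:` digest. It is an illustration only: the function name and the example file path are assumptions, and the repository's own services validator may apply stricter rules.

```shell
# List `image:` references in a Compose file that are not pinned by digest.
# Returns 0 when every image is pinned, 1 otherwise.
unpinned_images() {
  if grep -E '^[[:space:]]*image:' "$1" | grep -v '@sha256:[0-9a-f]\{64\}'; then
    return 1
  fi
  return 0
}

# Example (hypothetical file name):
#   unpinned_images services/gateway/docker-compose.yml || echo "pin these images"
```

A tag such as `:latest` can move between pulls, whereas a digest identifies exactly one image, which is what keeps plan and apply deterministic.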

Licence

Not yet chosen — see DECISIONS.md item 7.
