# petersweb-infra/nixos — CLAUDE.md

## What this repo is

NixOS configuration for a single Hetzner server ("mainframe") running Philip Peterson's personal/Quine Foundation infrastructure. One machine, one flake configuration: `nixosConfigurations.mainframe`.

## Applying changes

```bash
./apply.sh   # git pull + nixos-rebuild switch --flake .#mainframe
# or manually:
nixos-rebuild switch --flake /root/petersweb-infra/nixos#mainframe
```

## File layout

| Path | Purpose |
|---|---|
| `flake.nix` | Single flake, defines `nixosConfigurations.mainframe` |
| `hetzner.nix` | Hardware config: GRUB on `/dev/sda`, static networking, openssh |
| `linux.nix` | Main system config: services, secrets, docker containers, ACME certs |
| `nginx.nix` | Nginx virtual hosts and reverse proxies |
| `firewall.nix` | Open TCP ports |
| `disk-config.nix` | disko disk layout |
| `cloned_repos/` | `pullomatic` configs for auto-pulling git repos to `/etc/pullomatic/` |
| `arion/` | Arion (docker-compose-like) for Forgejo |
| `arion-riverside/` | Arion for the Riverside service |
| `pullomatic/` | Rust tool that watches git remotes and pulls on a schedule |
| `invoke-ddns/` | Python DDNS updater for NearlyFreeSpeech DNS |
| `secrets/` | agenix-encrypted secrets |
| `keys/` | SSH public keys used as age recipients |
| `system/` | User definitions and home-manager config |
| `pdxdestiny/` | Static site files for pdxdestiny.com |
| `vnc-desktop/` | Dockerfile + build scripts for the KDE Plasma VNC desktop container |

## Secrets (agenix)

Secrets live in `secrets/*.age`. They are encrypted with the key in `keys/mainframe.pub` (which is identical to `/root/.ssh/id_rsa_nix.pub` on the server).

**Important:** Agenix uses three identity paths for decryption (see activation script):

1. `/etc/ssh/ssh_host_rsa_key`
2. `/etc/ssh/ssh_host_ed25519_key`
3. `/root/.ssh/id_rsa_nix` ← **this is the actual working key**

The decrypted secrets land at `/run/agenix/` at boot.
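
To confirm which of the three identity paths actually decrypts a given secret, a helper along these lines may be useful. It is a minimal sketch, not part of this repo: the function name `first_working_identity` is hypothetical, and it assumes an `age` binary is on `PATH` (for example inside `nix shell nixpkgs#age`).

```bash
# Hypothetical helper (not in this repo): print the first identity file that
# can decrypt the given secret. Assumes `age` is available on PATH.
first_working_identity() (
  secret=$1
  shift
  for identity in "$@"; do
    # Skip identity files that are missing or unreadable.
    [ -r "$identity" ] || continue
    # Discard the plaintext; only the exit status matters here.
    if age -d -i "$identity" "$secret" >/dev/null 2>&1; then
      printf '%s\n' "$identity"
      exit 0
    fi
  done
  # No identity could decrypt the secret.
  exit 1
)

# Example (run as root on the server):
#   first_working_identity secrets/forgejo-runner-token.age \
#     /etc/ssh/ssh_host_rsa_key /etc/ssh/ssh_host_ed25519_key /root/.ssh/id_rsa_nix
```

Given the note above, `/root/.ssh/id_rsa_nix` is the path expected to be printed; a non-zero exit status would suggest the secret was encrypted for a different recipient.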

### Secret format matters

The NixOS `gitea-actions-runner` module reads the token via `EnvironmentFile=`, so the secret file must be in `KEY=VALUE` format:

- `forgejo-runner-token.age` → must contain `TOKEN=` (not just the raw token)
- `nearlyfreespeech.age` → contains `NEARLYFREESPEECH_API_KEY=...` and `NEARLYFREESPEECH_LOGIN=...`
- `webdav.age` → contains `WEBDAV_PASSWORD=...`
- `anthropic-api-key.age` → contains `ANTHROPIC_API_KEY=...`
- `postmark.age` → contains `POSTMARK_SERVER_TOKEN=...`

### Re-encrypting a secret

```bash
# Encrypt new content for the mainframe key
printf "TOKEN=newvalue\n" | nix run nixpkgs#age -- \
  -r "$(cat /root/petersweb-infra/nixos/keys/mainframe.pub)" \
  -o /root/petersweb-infra/nixos/secrets/forgejo-runner-token.age

# Verify it decrypts correctly
nix run nixpkgs#age -- -d -i /root/.ssh/id_rsa_nix \
  /root/petersweb-infra/nixos/secrets/forgejo-runner-token.age
```

Note: `secrets/default.nix` is the agenix recipients file. Agenix looks for `secrets.nix` by default — to use the CLI with this repo's `default.nix`, you'd need a symlink or pass the path manually. Use `age` directly instead (as above).
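
To catch a raw token before it reaches the runner, the decrypted content can be checked for `KEY=VALUE` shape. This is a minimal sketch, assuming a hypothetical helper `is_env_format` (not part of this repo) and a deliberately simplified reading of the `EnvironmentFile=` syntax: blank lines and `#` comments are skipped, and every other line must start with `NAME=`.

```bash
# Hypothetical helper (not in this repo): succeed only if stdin looks like an
# EnvironmentFile, i.e. at least one NAME=... line and no other content.
is_env_format() (
  seen=0
  # The `|| [ -n "$line" ]` part also handles a final line without a newline.
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      '' | '#'*) continue ;;   # skip blank lines and comments
    esac
    # Anything else must start with a shell-style variable name and '='.
    printf '%s\n' "$line" | grep -Eq '^[A-Za-z_][A-Za-z0-9_]*=' || exit 1
    seen=1
  done
  # An empty file is rejected: at least one assignment is required.
  [ "$seen" -eq 1 ]
)

# Example: check a secret after decrypting it.
#   nix run nixpkgs#age -- -d -i /root/.ssh/id_rsa_nix \
#     /root/petersweb-infra/nixos/secrets/forgejo-runner-token.age | is_env_format
```

The check only looks at the shape of each line; it does not verify that the variable name is the one a particular service expects.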

## Key services

| Service | Description |
|---|---|
| `gitea-runner-ubuntu.service` | Forgejo (Gitea) Actions CI runner, uses docker images |
| `forgejo-arion.service` | Forgejo itself, run via Arion/Podman |
| `riverside-arion.service` | Riverside app, run via Arion/Docker |
| `podman-coldairnetworks-postgres.service` | PostgreSQL 16 on port 5432 (publicly exposed) |
| `podman-coldairnetworks-pgadmin.service` | pgAdmin 4 on port 5050 (localhost only) |
| `podman-navidrome.service` | Navidrome music server on port 4533 |
| `podman-nextcloud.service` | Nextcloud/SSH container on port 8087 |
| `podman-sync.io.service` | sync.io app on port 9090 |
| `podman-blog-quine.service` | Blog on port 3010 |
| `podman-coldairnetworks.service` | Cold Air Networks site on port 3012 |
| `podman-vnc-desktop.service` | KDE Plasma desktop, noVNC on port 6080 (localhost only) |
| `build-vnc-image.service` | Builds the VNC desktop image from `vnc-desktop/`; runs before `podman-vnc-desktop` |
| nginx | Reverse proxy + ACME certs for multiple domains |

## Virtualisation

- **Podman** is used for all OCI containers (`virtualisation.oci-containers.backend = "podman"`) — navidrome, nextcloud, blog, VNC desktop, etc. — and for Forgejo via Arion.
- **Docker** is still present for the Riverside Arion stack.
- `DOCKER_HOST` for the gitea-runner is set to `unix:///run/podman/podman.sock`.
- The gitea-runner runs docker images for CI jobs, so the `gitea-runner` user is in the `docker` and `podman` supplementary groups.

## PostgreSQL / pgAdmin (coldairnetworks)

Two Podman containers defined in `linux.nix` under `virtualisation.oci-containers`.

| Container | Image | Port | Role |
|---|---|---|---|
| `coldairnetworks-postgres` | `postgres:16` | 5432 (public) | PostgreSQL database |
| `coldairnetworks-pgadmin` | `dpage/pgadmin4` | 5050 (localhost) | pgAdmin 4 web UI |

### Credential files (not in git — create manually on server)

| Path | Contents |
|---|---|
| `/var/coldairnetworks-db/postgres.env` | `POSTGRES_USER`, `POSTGRES_PASSWORD`, `POSTGRES_DB` |
| `/var/coldairnetworks-db/pgadmin.env` | `PGADMIN_DEFAULT_EMAIL`, `PGADMIN_DEFAULT_PASSWORD` |
| `/var/coldairnetworks-db/htpasswd` | nginx basic auth — generate with `htpasswd -c /var/coldairnetworks-db/htpasswd ` |

### Data directories

| Host path | Purpose |
|---|---|
| `/var/coldairnetworks-db/postgres` | PostgreSQL data (owned root:root) |
| `/var/coldairnetworks-db/pgadmin` | pgAdmin state (owned uid 5050 — the pgAdmin container user) |

### Access

- **Web UI**: `https://db.coldairnetworks.com` — nginx basic auth first, then pgAdmin login
- **Direct connection**: `psql -h mainframe.philippeterson.com -U admin -d coldairnetworks` (port 5432 open in firewall)
- **pgAdmin → PostgreSQL**: when adding a server in pgAdmin, use `host.containers.internal` as the hostname (Podman host gateway), port 5432

## VNC desktop

`podman-vnc-desktop.service` runs a KDE Plasma desktop inside a container, accessible via noVNC at `localhost:6080` (reverse-proxied by nginx). The image is built locally — no registry involved.

- **Image source**: `vnc-desktop/Dockerfile` (Ubuntu 24.04, TigerVNC, KDE, Firefox, patched Discover)
- **Auto-rebuild**: `build-vnc-image.service` runs on boot and on `nixos-rebuild switch` whenever `vnc-desktop/` changes. The trigger is `vncContext = builtins.path { path = ./vnc-desktop; }` — a Nix store path that invalidates when any file in the directory changes.
- **Auto-restart**: `podman-vnc-desktop.service` has `restartTriggers = [ vncContext ]`, so the container restarts automatically after a rebuild during `nixos-rebuild switch`.
- **Secrets**: `VNC_PASSWORD` and `ROOT_PASSWORD` come from `age.secrets.vnc-password`.
- **Discover logging**: `vnc-desktop/discover-logging/` contains a build-time patch (`patch.py`) that instruments `PKTransaction.cpp` with `qWarning` calls to diagnose hanging installs. Logs are visible via `podman logs vnc-desktop`.

## Networking / DNS

- Dynamic DNS via `invoke-ddns` (NearlyFreeSpeech provider).
- ACME certs issued via DNS challenge for `philippeterson.com` and `webdav.philippeterson.com`.
- Forgejo accessible on ports 3000 (HTTP) and 2200 (SSH).

## OpenClaw

OpenClaw runs as two Arion/Podman containers defined in `arion-openclaw/arion-compose.nix`, both using `network_mode = "host"` so they share the host's `127.0.0.1`.

| Container | Image | Port | Role |
|---|---|---|---|
| `openclaw-gateway` | `node:22-alpine` | 18789 (WebSocket) | OpenClaw Gateway (`openclaw@latest`) |
| `openclaw` | `node:22-alpine` | 4310 (HTTP) | OpenClaw Control Center (SSR UI) |

### Volumes and paths

| Host path | Container path | Notes |
|---|---|---|
| `/var/openclaw/gateway` | `/app` (gateway), `/gateway` (app) | npm install location for `openclaw` package |
| `/var/openclaw/app` | `/app` | Control center git clone + runtime files |
| `/root/.openclaw` | `/root/.openclaw` | OpenClaw home; shared **read-write** by both containers |

`/root/.openclaw` must be **writable** in the app container (not `:ro`) — the CLI writes state files at startup and connection probes fail with EROFS otherwise. The CLI's effective state dir is `/root/.openclaw/.openclaw/` (double-nested: the CLI treats `OPENCLAW_HOME` as HOME and appends `.openclaw/` internally).

### Auth and connectivity

- Gateway runs with `--auth none --dev`. In `--auth none` mode, clients must still present either a device identity (challenge-response) or any token via `OPENCLAW_GATEWAY_TOKEN`.
- `OPENCLAW_GATEWAY_TOKEN=openclaw-local-dev` is set in the app container — this lets the CLI probes connect immediately without waiting for device auto-approval.
- Device identity lives at `/root/.openclaw/.openclaw/identity/device.json`. In `--dev` mode the gateway auto-approves the local device after first contact.
- The control center calls `openclaw status --json` and `openclaw gateway status --json` as CLI subprocesses (not via WebSocket directly). The binary path is set via `OPENCLAW_BIN_PATH=/gateway/node_modules/.bin/openclaw`.

### nginx

`claw.quineglobal.com` is proxied to `127.0.0.1:4310`. Key settings:

- `forceSSL = false; addSSL = true` — Cloudflare Flexible SSL sends plain HTTP to origin; `forceSSL = true` would create a redirect loop.
- `basicAuthFile = "/var/openclaw/htpasswd"` — credentials: `ironmagma / Nargism333`.
- WebSocket upgrade headers are set (`Upgrade`, `Connection: upgrade`) so the control center's live-update SSE works through the proxy.

### Control center startup sequence

The app container startup script (in `arion-compose.nix`):

1. `apk add git`
2. Clones `https://github.com/TianyiDataScience/openclaw-control-center.git` to `/app/repo` (once)
3. Patches `src/ui/server.ts` and `src/runtime/ui-preferences.ts` via `sed` to default language to `"en"` instead of `"zh"`
4. `npm install && npm run build && npm run dev:ui`

### Usage connector sources

The Settings → Usage panel tracks 6 data sources.
Current status:

| Source | Status | How to connect |
|---|---|---|
| Context capacity | Connected | `runtime/model-context-catalog.json` exists at `/var/openclaw/app/repo/runtime/` |
| Provider attribution | Connected | Derived from context catalog |
| Digest history | Partial (auto) | Builds up as the monitor runs over time |
| Request counts | Not connected | Needs real AI requests through the gateway |
| Budget limit | Not connected | Add cost thresholds to agent config |
| Subscription usage | Not connected | Add `runtime/subscription-snapshot.json` or provider billing snapshot |

The `model-context-catalog.json` format:

```json
{
  "models": [
    { "match": "gpt-5.5", "contextWindowTokens": 200000, "provider": "openai" },
    ...
  ]
}
```

`match` is compared case-insensitively against the model name reported by the runtime.

### Restarting / rebuilding

After changing `arion-compose.nix`, a `nixos-rebuild switch` regenerates the compose YAML but **does not recreate running containers**. You must force recreation:

```bash
podman rm -f openclaw   # or openclaw-gateway
systemctl restart arion-openclaw
```

### Cloudflare SSL gotcha

This server sits behind Cloudflare in **Flexible** mode (Cloudflare → origin over plain HTTP). Any `nginx.nix` virtualHost for a Cloudflare-proxied domain must use `forceSSL = false; addSSL = true`, not `forceSSL = true`. The latter causes an infinite redirect loop because Cloudflare sends HTTP but nginx redirects to HTTPS, which Cloudflare re-proxies as HTTP again.

## Known gotchas

- `gitea-runner` is a `DynamicUser` in the systemd service, so it has no persistent uid. Setting `age.secrets.forgejo-runner-token.owner = "gitea-runner"` causes a chown error at activation; use `owner = "root"` instead (the service gets the token via `EnvironmentFile`, which systemd reads as root before dropping privileges).
- `secrets/default.nix` must have the public key from `keys/mainframe.pub` as the recipient — if the host SSH keys change, you must also update `mainframe.pub` and re-key all secrets.
- `pullomatic` uses `/root/.ssh/id_rsa.pem` (a PEM-format SSH key) to pull private git repos.
- **ACME cyclic dependency list**: `linux.nix` has a `systemd.services.nginx.after = lib.mkForce [...]` list that breaks a systemd cycle between nginx and ACME services. Every new domain added with `enableACME = true` in `nginx.nix` **must** also have its `acme-selfsigned-.service` added to this list in `linux.nix`, otherwise nixos-rebuild will fail with a cyclic dependency error.
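
The ACME rule in the last item can be checked mechanically before a rebuild. The following is a sketch under stated assumptions: `missing_acme_units` is a hypothetical helper (not part of this repo), the domain list is passed by hand rather than parsed from `nginx.nix`, and the unit name is assumed to be `acme-selfsigned-` followed by the domain and `.service`. It does a plain substring search, so it cannot tell whether a unit appears inside the `after` list or somewhere else in the file.

```bash
# Hypothetical helper (not in this repo): report domains whose assumed
# acme-selfsigned unit name does not appear anywhere in the given file.
missing_acme_units() (
  file=$1
  shift
  status=0
  for domain in "$@"; do
    # Assumed naming scheme; adjust if the units are named differently.
    unit="acme-selfsigned-${domain}.service"
    # -F: match the unit name literally, so dots are not regex wildcards.
    if ! grep -qF -- "$unit" "$file"; then
      printf 'missing: %s\n' "$unit"
      status=1
    fi
  done
  exit "$status"
)

# Example, with example.com standing in for a newly added domain:
#   missing_acme_units /root/petersweb-infra/nixos/linux.nix example.com
```

A non-zero exit status means at least one unit name was not found, which makes the helper usable as a guard in front of `nixos-rebuild switch`.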