# petersweb-infra/nixos — CLAUDE.md
## What this repo is
NixOS configuration for a single Hetzner server ("mainframe") running Philip Peterson's personal/Quine Foundation infrastructure. One machine, one flake configuration: `nixosConfigurations.mainframe`.
## Applying changes
```bash
./apply.sh # git pull + nixos-rebuild switch --flake .#mainframe
# or manually:
nixos-rebuild switch --flake /root/petersweb-infra/nixos#mainframe
```
## File layout
| Path | Purpose |
|---|---|
| `flake.nix` | Single flake, defines `nixosConfigurations.mainframe` |
| `hetzner.nix` | Hardware config: GRUB on `/dev/sda`, static networking, openssh |
| `linux.nix` | Main system config: services, secrets, docker containers, ACME certs |
| `nginx.nix` | Nginx virtual hosts and reverse proxies |
| `firewall.nix` | Open TCP ports |
| `disk-config.nix` | disko disk layout |
| `cloned_repos/` | `pullomatic` configs for auto-pulling git repos to `/etc/pullomatic/` |
| `arion/` | Arion (docker-compose-like) for Forgejo |
| `arion-riverside/` | Arion for the Riverside service |
| `pullomatic/` | Rust tool that watches git remotes and pulls on a schedule |
| `invoke-ddns/` | Python DDNS updater for NearlyFreeSpeech DNS |
| `secrets/` | agenix-encrypted secrets |
| `keys/` | SSH public keys used as age recipients |
| `system/` | User definitions and home-manager config |
| `pdxdestiny/` | Static site files for pdxdestiny.com |
| `vnc-desktop/` | Dockerfile + build scripts for the KDE Plasma VNC desktop container |
## Secrets (agenix)
Secrets live in `secrets/*.age`. They are encrypted with the key in `keys/mainframe.pub` (which is identical to `/root/.ssh/id_rsa_nix.pub` on the server).
**Important:** Agenix uses three identity paths for decryption (see activation script):
1. `/etc/ssh/ssh_host_rsa_key`
2. `/etc/ssh/ssh_host_ed25519_key`
3. `/root/.ssh/id_rsa_nix` (**this is the actual working key**)
The decrypted secrets land at `/run/agenix/<name>` at boot.
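
To confirm which of these identities actually decrypts a given secret, a small loop like the following can help. This is a sketch, not part of the repo: `first_working_identity` and the `AGE` override variable are hypothetical names, and it assumes an `age` binary is on `PATH` (for example inside `nix shell nixpkgs#age`).

```bash
# Hypothetical helper: print the first identity file that can decrypt a secret.
# Usage: first_working_identity <secret.age> <identity>...
# AGE may be set to an alternative decryption command; it defaults to `age`.
first_working_identity() {
  local secret=$1
  shift
  local id
  for id in "$@"; do
    # Discard the plaintext; only the exit status matters here.
    if "${AGE:-age}" -d -i "$id" "$secret" >/dev/null 2>&1; then
      printf '%s\n' "$id"
      return 0
    fi
  done
  # No identity worked.
  return 1
}

# Example (run on the server, from the repo root):
#   first_working_identity secrets/webdav.age \
#     /etc/ssh/ssh_host_rsa_key /etc/ssh/ssh_host_ed25519_key /root/.ssh/id_rsa_nix
```

The function stops at the first identity that succeeds, so the order of arguments decides which path is reported when more than one key would work.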
### Secret format matters
The NixOS `gitea-actions-runner` module reads the token via `EnvironmentFile=`, so the secret file must be in `KEY=VALUE` format:
- `forgejo-runner-token.age` → must contain `TOKEN=<raw_token>` (not just the raw token)
- `nearlyfreespeech.age` → contains `NEARLYFREESPEECH_API_KEY=...` and `NEARLYFREESPEECH_LOGIN=...`
- `webdav.age` → contains `WEBDAV_PASSWORD=...`
- `anthropic-api-key.age` → contains `ANTHROPIC_API_KEY=...`
- `postmark.age` → contains `POSTMARK_SERVER_TOKEN=...`
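
A quick way to catch a raw-token mistake before committing is to check the decrypted content against the `KEY=VALUE` shape. The helper below is a minimal sketch under that assumption; `is_env_format` is a hypothetical name, not something the repo provides, and it only checks the line shape, not the key names.

```bash
# Hypothetical helper: succeed only if every line on stdin is empty,
# a comment, or starts with an identifier followed by "=".
is_env_format() {
  local line
  local re='^[A-Za-z_][A-Za-z0-9_]*='
  # The "|| [ -n ... ]" part handles a last line with no trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      # Skip empty lines and lines starting with '#' or ';' (comments).
      '' | '#'* | ';'*) continue ;;
    esac
    [[ $line =~ $re ]] || return 1
  done
  return 0
}

# Example (run on the server):
#   nix run nixpkgs#age -- -d -i /root/.ssh/id_rsa_nix \
#     /root/petersweb-infra/nixos/secrets/forgejo-runner-token.age | is_env_format \
#     && echo "format ok"
```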
### Re-encrypting a secret
```bash
# Encrypt new content for the mainframe key
printf "TOKEN=newvalue\n" | nix run nixpkgs#age -- \
  -r "$(cat /root/petersweb-infra/nixos/keys/mainframe.pub)" \
  -o /root/petersweb-infra/nixos/secrets/forgejo-runner-token.age

# Verify it decrypts correctly
nix run nixpkgs#age -- -d -i /root/.ssh/id_rsa_nix \
  /root/petersweb-infra/nixos/secrets/forgejo-runner-token.age
```
Note: `secrets/default.nix` is the agenix recipients file. The agenix CLI looks for `secrets.nix` by default, so using it with this repo's `default.nix` would require a symlink or passing the path manually. Use `age` directly instead (as above).
## Key services
| Service | Description |
|---|---|
| `gitea-runner-ubuntu.service` | Forgejo (Gitea) Actions CI runner, uses docker images |
| `forgejo-arion.service` | Forgejo itself, run via Arion/Podman |
| `riverside-arion.service` | Riverside app, run via Arion/Docker |
| `podman-navidrome.service` | Navidrome music server on port 4533 |
| `podman-nextcloud.service` | Nextcloud/SSH container on port 8087 |
| `podman-sync.io.service` | sync.io app on port 9090 |
| `podman-blog-quine.service` | Blog on port 3010 |
| `podman-coldairnetworks.service` | Cold Air Networks site on port 3012 |
| `podman-vnc-desktop.service` | KDE Plasma desktop, noVNC on port 6080 (localhost only) |
| `build-vnc-image.service` | Builds the VNC desktop image from `vnc-desktop/`; runs before `podman-vnc-desktop` |
| nginx | Reverse proxy + ACME certs for multiple domains |
## Virtualisation
- **Podman** is used for all OCI containers (`virtualisation.oci-containers.backend = "podman"`) — navidrome, nextcloud, blog, VNC desktop, etc. — and for Forgejo via Arion.
- **Docker** is still present for the Riverside Arion stack.
- `DOCKER_HOST` for the gitea-runner is set to `unix:///run/podman/podman.sock`.
- The gitea-runner runs docker images for CI jobs, so the `gitea-runner` user is in the `docker` and `podman` supplementary groups.
## VNC desktop
`podman-vnc-desktop.service` runs a KDE Plasma desktop inside a container, accessible via noVNC at `localhost:6080` (reverse-proxied by nginx). The image is built locally — no registry involved.
- **Image source**: `vnc-desktop/Dockerfile` (Ubuntu 24.04, TigerVNC, KDE, Firefox, patched Discover)
- **Auto-rebuild**: `build-vnc-image.service` runs on boot and on `nixos-rebuild switch` whenever `vnc-desktop/` changes. The trigger is `vncContext = builtins.path { path = ./vnc-desktop; }` — a Nix store path that invalidates when any file in the directory changes.
- **Auto-restart**: `podman-vnc-desktop.service` has `restartTriggers = [ vncContext ]`, so the container restarts automatically after a rebuild during `nixos-rebuild switch`.
- **Secrets**: `VNC_PASSWORD` and `ROOT_PASSWORD` come from `age.secrets.vnc-password`.
- **Discover logging**: `vnc-desktop/discover-logging/` contains a build-time patch (`patch.py`) that instruments `PKTransaction.cpp` with `qWarning` calls to diagnose hanging installs. Logs visible via `podman logs vnc-desktop`.
## Networking / DNS
- Dynamic DNS via `invoke-ddns` (NearlyFreeSpeech provider).
- ACME certs issued via DNS challenge for `philippeterson.com` and `webdav.philippeterson.com`.
- Forgejo accessible on ports 3000 (HTTP) and 2200 (SSH).
## Known gotchas
- `gitea-runner` is a `DynamicUser` in the systemd service, so it has no persistent uid. Setting `age.secrets.forgejo-runner-token.owner = "gitea-runner"` causes a chown error at activation; use `owner = "root"` instead. This works because systemd reads the `EnvironmentFile` as root before dropping privileges for the service.
- `secrets/default.nix` must have the public key from `keys/mainframe.pub` as the recipient — if the host SSH keys change, you must also update `mainframe.pub` and re-key all secrets.
- `pullomatic` uses `/root/.ssh/id_rsa.pem` (a PEM-format SSH key) to pull private git repos.
- **ACME cyclic dependency list**: `linux.nix` has a `systemd.services.nginx.after = lib.mkForce [...]` list that breaks a systemd cycle between nginx and ACME services. Every new domain added with `enableACME = true` in `nginx.nix` **must** also have its `acme-selfsigned-<domain>.service` added to this list in `linux.nix`, otherwise nixos-rebuild will fail with a cyclic dependency error.
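
For the ACME gotcha above, a pre-flight check can list which `acme-selfsigned-<domain>.service` names are absent from `linux.nix` before running a rebuild. This is a rough sketch: both function names are hypothetical, the domain list is supplied by hand rather than parsed from `nginx.nix`, and the check is a plain text search, so a unit name that appears only in a comment would still count as present.

```bash
# Hypothetical helper: print the acme-selfsigned unit name for each domain.
acme_selfsigned_units() {
  local domain
  for domain in "$@"; do
    printf 'acme-selfsigned-%s.service\n' "$domain"
  done
}

# Hypothetical helper: print the unit names that do not appear in a file.
# Usage: missing_acme_units <file> <domain>...
missing_acme_units() {
  local file=$1
  shift
  local unit
  for unit in $(acme_selfsigned_units "$@"); do
    # -F: fixed-string match, so dots in the domain are not treated as regex.
    grep -qF -- "$unit" "$file" || printf '%s\n' "$unit"
  done
}

# Example (from the repo root):
#   missing_acme_units linux.nix philippeterson.com webdav.philippeterson.com
```

Empty output means every domain passed in has a matching entry somewhere in the file.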