# Hetzner rebuild pipeline map

This document summarizes the currently intended rebuild flow for the repo-driven Hetzner single-node cluster.

It is a companion to the operator runbooks, not a competing source of truth. Use these first for exact commands and required env:

- `docs/hetzner-k3s-bootstrap.md`
- `docs/hetzner-self-hosted-ci-runbook.md`
- `docs/k8s-observability.md`

## High-level rebuild sequence

1. prepare `scripts/hetzner/bootstrap-secrets.env`
2. source it so `*_PASS` mappings resolve through `pass`
3. optionally run `scripts/hetzner/destroy.sh`
4. run `scripts/hetzner/bootstrap.sh`
5. let bootstrap:
   - provision/update Hetzner infra with Terraform
   - configure DNS when provider credentials are present
   - fetch the real kubeconfig from the node
   - render `.state/hetzner/generated-overlay/`
   - apply platform + project manifests
   - bootstrap Forgejo admin, runner, repo, and Actions configuration
   - seed the repo into Forgejo
   - trigger the normal Forgejo Actions build/push/deploy path
6. verify public/operator surfaces:
   - Forgejo
   - registry
   - Grafana
   - Headlamp
7. verify workload health and CI success

## Ownership boundaries

### Terraform owns

- Hetzner VM
- network
- firewall
- cloud-init user data

### Cloud-init owns

- OS package prep
- optional Tailscale join
- k3s installation
- a marker file under `/opt/unrip/bootstrap/README.txt`

Cloud-init does **not** clone this repo or apply Kubernetes manifests.

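To confirm that cloud-init reached the marker step on a node, a small check along these lines may help. It is a sketch, not part of the repo: the `marker_present` function and its optional root-prefix argument are hypothetical, and only the marker path comes from the list above.

```bash
#!/usr/bin/env bash
# Sketch only: report whether the cloud-init marker file exists.
# The optional first argument is a hypothetical root prefix, added so the
# check can be exercised against a scratch directory instead of a real node.
marker_present() {
  root="${1:-}"
  [ -f "$root/opt/unrip/bootstrap/README.txt" ]
}

# Example (on the node):
#   marker_present && echo "marker found" || echo "marker missing"
```

Called with no argument on the node, the function checks the real path. How you reach the node is left to the runbooks.
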
### Bootstrap script owns

- `pass`-resolved secret loading
- DNS automation
- kubeconfig retrieval/rendering
- generated overlay rendering under `.state/hetzner/generated-overlay/`
- imperative registry auth secret creation
- Forgejo bootstrap API calls
- repo seeding
- Headlamp token export to `pass`

### Kubernetes manifests own

- platform services
- project services
- ingress/TLS resources
- observability stack
- persistent volume claims and workload specs

## Current default runtime model

Platform services:

- Forgejo
- Forgejo runner
- registry
- cert-manager
- Grafana
- Loki
- Promtail
- Headlamp

Project services:

- Redpanda
- `near-intents-ingest`
- `dummy-reactor`
- `dummy-executor`
- `dummy-consumer`

Ingress/controller model:

- Traefik bundled with k3s
- no ingress-nginx in the active path

## Rebuild verification checklist

After bootstrap, verify:

```bash
export KUBECONFIG=$PWD/.state/hetzner/kubeconfig.yaml
kubectl get nodes -o wide
kubectl get pods -A
kubectl -n observability get deploy,ds,pods,svc,ingress,secrets
kubectl -n forgejo get deploy,pods,svc,ingress
kubectl -n registry get deploy,pods,svc,ingress
kubectl -n unrip get deploy,pods
```

Public/operator surfaces should respond:

- `https://git./`
- `https://registry./v2/`
- `https://grafana./`
- `https://headlamp./`

CI should show a successful deploy workflow in Forgejo Actions.

## Current caveat

The core Hetzner/k3s/Forgejo path has been rebuilt successfully before.

Headlamp was added afterward and validated live on the rebuilt cluster, but a brand-new destroy/rebuild rehearsal with Headlamp included has not yet been re-run from zero.

So the rebuild story is repo-driven and operationally close to fully reproducible, with one remaining value-add validation step: a final clean-room rebuild after the latest Headlamp/docs cleanup.
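
As a complement to the rebuild verification checklist, the manual `kubectl get` inspection could be backed by a pass/fail wait on Deployments. The sketch below is not part of the repo. It assumes the four namespaces from the checklist each contain at least one Deployment; the `verify_deployments` name, the `DRY_RUN` switch, and the default timeout are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: wait for every Deployment in the checklist namespaces to
# report the Available condition. Namespace names come from the kubectl
# commands in the checklist; DRY_RUN and the timeout default are
# hypothetical additions for illustration.
NAMESPACES="observability forgejo registry unrip"

verify_deployments() {
  wait_timeout="${1:-120s}"
  for ns in $NAMESPACES; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Print the command instead of running it.
      echo "kubectl -n $ns wait --for=condition=Available deployment --all --timeout=$wait_timeout"
    else
      # Stop at the first namespace that does not become Available in time.
      kubectl -n "$ns" wait --for=condition=Available deployment --all \
        --timeout="$wait_timeout" || return 1
    fi
  done
}

# Example:
#   export KUBECONFIG=$PWD/.state/hetzner/kubeconfig.yaml
#   verify_deployments 180s && echo "deployments available"
```

Setting `DRY_RUN=1` prints the commands without contacting the cluster, which is a cheap way to review what would run. This covers only Deployments; the DaemonSets, public surfaces, and CI result still need the checks listed in the checklist.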