Hetzner rebuild pipeline map
This document summarizes the currently intended rebuild flow for the repo-driven Hetzner single-node cluster.
It is a companion to the operator runbooks, not a competing source of truth. Use these first for exact commands and required env:
- docs/hetzner-k3s-bootstrap.md
- docs/hetzner-self-hosted-ci-runbook.md
- docs/k8s-observability.md
High-level rebuild sequence
- prepare scripts/hetzner/bootstrap-secrets.env
- source it so *_PASS mappings resolve through pass
- optionally run scripts/hetzner/destroy.sh
- run scripts/hetzner/bootstrap.sh
- let bootstrap:
  - provision/update Hetzner infra with Terraform
  - configure DNS when provider credentials are present
  - fetch the real kubeconfig from the node
  - render .state/hetzner/generated-overlay/
  - apply platform + project manifests
  - bootstrap Forgejo admin, runner, repo, and Actions configuration
  - seed the repo into Forgejo
  - trigger the normal Forgejo Actions build/push/deploy path
- verify public/operator surfaces:
  - Forgejo
  - registry
  - Grafana
  - Headlamp
- verify workload health and CI success
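The sequence above can be sketched as a small bash wrapper. This is only an illustration: the three script paths come from this document, while the `run` and `rebuild` helpers, the `DRY_RUN` switch, and the `--destroy` flag are hypothetical and not part of the repo. It also assumes both scripts run without arguments from the repo root; the runbooks remain the source for exact invocations.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; helper names and flags are assumptions.

# run CMD...: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# rebuild [--destroy]: load secrets, optionally tear down, then bootstrap.
rebuild() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '+ source %s\n' scripts/hetzner/bootstrap-secrets.env
  else
    # Assumption: the env file is meant to be sourced into the current shell.
    . scripts/hetzner/bootstrap-secrets.env || return 1
  fi

  # The destroy step is optional, so it is opt-in here.
  if [ "${1:-}" = "--destroy" ]; then
    run scripts/hetzner/destroy.sh || return 1
  fi

  run scripts/hetzner/bootstrap.sh
}
```

With `DRY_RUN=1 rebuild --destroy` the function only prints the three steps in order, which is a cheap way to confirm the plan before touching the cluster.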
Ownership boundaries
Terraform owns
- Hetzner VM
- network
- firewall
- cloud-init user data
Cloud-init owns
- OS package prep
- optional Tailscale join
- k3s installation
- a marker file under /opt/unrip/bootstrap/README.txt
Cloud-init does not clone this repo or apply Kubernetes manifests.
Bootstrap script owns
- pass-resolved secret loading
- DNS automation
- kubeconfig retrieval/rendering
- generated overlay rendering under .state/hetzner/generated-overlay/
- imperative registry auth secret creation
- Forgejo bootstrap API calls
- repo seeding
- Headlamp token export to pass
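As an illustration of the pass-resolved secret loading listed above, the following bash sketch shows one way a `*_PASS` mapping could be turned into a real value. It is a guess at the pattern, not the repo's implementation: the function name `resolve_pass_vars`, the rule "strip the `_PASS` suffix to get the target variable", and the choice to take only the first line of the entry are all assumptions. Only `pass show` itself is a real command.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: for each variable FOO_PASS holding a pass entry name,
# export FOO with the first line of that entry.
resolve_pass_vars() {
  local mapping target value
  # compgen -v lists the names of all shell variables (bash builtin).
  for mapping in $(compgen -v | grep '_PASS$'); do
    target="${mapping%_PASS}"                 # FOO_PASS -> FOO (assumed convention)
    # ${!mapping} is bash indirect expansion: the value of the variable named by $mapping.
    value="$(pass show "${!mapping}" | head -n 1)" || return 1
    export "$target=$value"
  done
}
```

For example, after setting `DEMO_TOKEN_PASS=infra/demo-token` (a made-up entry name) and calling `resolve_pass_vars`, `DEMO_TOKEN` would hold the first line of that pass entry.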
Kubernetes manifests own
- platform services
- project services
- ingress/TLS resources
- observability stack
- persistent volume claims and workload specs
Current default runtime model
Platform services:
- Forgejo
- Forgejo runner
- registry
- cert-manager
- Grafana
- Loki
- Promtail
- Headlamp
Project services:
- Redpanda
- near-intents-ingest
- dummy-reactor
- dummy-executor
- dummy-consumer
Ingress/controller model:
- Traefik bundled with k3s
- no ingress-nginx in the active path
Rebuild verification checklist
After bootstrap, verify:
export KUBECONFIG=$PWD/.state/hetzner/kubeconfig.yaml
kubectl get nodes -o wide
kubectl get pods -A
kubectl -n observability get deploy,ds,pods,svc,ingress,secrets
kubectl -n forgejo get deploy,pods,svc,ingress
kubectl -n registry get deploy,pods,svc,ingress
kubectl -n unrip get deploy,pods
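The `get` commands above show state at one moment. For a scripted check, a sketch like the following could block until the node and the deployments in the same four namespaces report ready. The function name `wait_for_cluster` and the default timeout are assumptions; it relies on `KUBECONFIG` being exported as shown above, and `kubectl wait` will fail for a namespace that contains no deployments.

```shell
#!/usr/bin/env bash
# Hypothetical helper: wait for node readiness, then for every deployment
# in the namespaces this document inspects.
wait_for_cluster() {
  local timeout="${1:-300s}" ns
  kubectl wait --for=condition=Ready nodes --all --timeout="$timeout" || return 1
  for ns in observability forgejo registry unrip; do
    kubectl -n "$ns" wait --for=condition=Available deployment --all \
      --timeout="$timeout" || return 1
  done
}
```

Calling `wait_for_cluster 120s` returns non-zero as soon as one wait times out, so it can gate later verification steps.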
Public/operator surfaces should respond:
- https://git.<public-domain>/
- https://registry.<public-domain>/v2/
- https://grafana.<public-domain>/
- https://headlamp.<public-domain>/
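A quick way to probe these endpoints from a shell is sketched below. The function names `surface_urls` and `check_surfaces` and the 15-second limit are assumptions for illustration. The script only reports the HTTP status per URL and fails when a request gets no response at all; which status counts as healthy is left to the operator. In particular, a registry that requires authentication answers `/v2/` with 401, which still shows the service is reachable.

```shell
#!/usr/bin/env bash
# surface_urls DOMAIN: print the four endpoints named in this document.
surface_urls() {
  local domain="$1"
  printf 'https://git.%s/\n' "$domain"
  printf 'https://registry.%s/v2/\n' "$domain"
  printf 'https://grafana.%s/\n' "$domain"
  printf 'https://headlamp.%s/\n' "$domain"
}

# check_surfaces DOMAIN: print "<status> <url>" per endpoint.
# Returns non-zero if curl itself fails for any URL (DNS, TLS, timeout).
check_surfaces() {
  local url code rc=0
  for url in $(surface_urls "$1"); do
    # -w '%{http_code}' prints the response status; the body is discarded.
    code="$(curl -sS -o /dev/null --max-time 15 -w '%{http_code}' "$url")" || rc=1
    printf '%s %s\n' "$code" "$url"
  done
  return "$rc"
}
```

Usage would be `check_surfaces <public-domain>` with the real domain substituted.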
CI should show a successful deploy workflow in Forgejo Actions.
Current caveat
The core Hetzner/k3s/Forgejo path has been rebuilt successfully before. Headlamp was added afterward and validated live on the rebuilt cluster, but a brand-new destroy/rebuild rehearsal with Headlamp included has not yet been re-run from zero.
So the rebuild flow is repo-driven and close to fully reproducible. One validation step remains: a final clean-room rebuild after the latest Headlamp and docs cleanup.