# Hetzner + k3s + self-hosted Git/CI bootstrap

Goal: provision and deploy everything from this repo to a single Hetzner machine with no manual server login.

## Stack

- Terraform provisions the Hetzner Cloud VM, private network, and firewall
- cloud-init installs Tailscale first when configured, then installs k3s automatically
- cloud-init leaves only a bootstrap marker on the node; it does not clone this repo or apply Kubernetes assets
- Kubernetes manifests deploy:
  - Redpanda
  - trading system services
  - private registry
  - Forgejo
  - Loki + Promtail + Grafana observability
  - k3s-bundled Traefik ingress resources
  - cert-manager
  - ACME issuers
- local bootstrap script:
  - runs Terraform
  - optionally creates DNS records via Cloudflare or Porkbun
  - fetches the real kubeconfig from the node
  - writes overlay secrets/host patches from local env
  - renders `.state/hetzner/generated-overlay/` from the checked-in Hetzner overlay template plus `deploy/k8s/platform/base/kustomization.yaml`
  - applies that generated overlay from the operator workstation checkout
  - builds the current app image locally
  - imports the bootstrap image into k3s for the first rollout

## Files

- `infra/terraform/hetzner/`
- `deploy/k8s/base/`
- `deploy/k8s/overlays/hetzner-single-node/`
- `scripts/hetzner/bootstrap.sh`
- `scripts/hetzner/configure-cloudflare-dns.sh`
- `scripts/hetzner/destroy.sh`
- `scripts/k8s/logs.sh`
- `.forgejo/workflows/deploy.yml`

## Required local tools

Always required:

- `terraform`
- `kubectl`
- `curl`
- `python3`
- `ssh`
- `git`
- `base64`
- `realpath`
- `pass` when using any `*_PASS` mapping

Conditionally required:

- `docker` only for `BOOTSTRAP_DELIVERY_MODE=local-image-import`, or as a fallback when no native `htpasswd` binary is available locally
- `htpasswd` is preferred for registry secret generation and avoids the docker fallback

Required local Python modules:

- `PyYAML` (`python3 -m pip install PyYAML`) for kubeconfig rendering during bootstrap
- `PyNaCl` (`python3 -m pip install PyNaCl`) only when `BOOTSTRAP_DELIVERY_MODE=forgejo-actions` so bootstrap can encrypt Forgejo Actions secrets

## Required local env

Start from:

```bash
cp scripts/hetzner/bootstrap-secrets.env.example scripts/hetzner/bootstrap-secrets.env
${EDITOR:-vi} scripts/hetzner/bootstrap-secrets.env
source scripts/hetzner/bootstrap-secrets.env
```

The mapping file should contain non-secret config plus `pass` entry references for secrets. Bootstrap and destroy load the first line from each configured pass entry without echoing it. Explicit env exports still override `pass` lookups.

When you run `scripts/hetzner/bootstrap.sh`, it uses this file to materialize local Kubernetes inputs before apply:

- overwrites `deploy/k8s/overlays/hetzner-single-node/secrets/unrip.env` with `NEAR_INTENTS_API_KEY`
- overwrites `deploy/k8s/overlays/hetzner-single-node/secrets/forgejo.env` with Forgejo `root_url` and `domain`
- overwrites `deploy/k8s/overlays/hetzner-single-node/secrets/observability.env` with Grafana bootstrap credentials and root URL
- renders `.state/hetzner/generated-overlay/` as the bootstrap-time source of truth
- copies the checked-in overlay patch behavior into that generated overlay
- imports platform resources from `deploy/k8s/platform/base/kustomization.yaml`, so newly added platform modules such as observability manifests are included automatically
- creates `registry-secrets` in namespace `registry` from `REGISTRY_USERNAME` and `REGISTRY_PASSWORD`
- creates the project docker-registry pull secret in `PROJECT_NAMESPACE` from the same registry credentials

This is different from running `kubectl apply -k deploy/k8s/overlays/hetzner-single-node` manually: plain Kustomize apply only consumes the checked-in overlay files, while bootstrap applies the generated overlay copy. Manual apply still only reads the checked-in files and does not read `scripts/hetzner/bootstrap-secrets.env` or create the imperative registry auth secrets on its own.
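
The `*_PASS` lookup rule described above (an explicit export wins, otherwise the first line of the `pass` entry is used and never echoed) could look roughly like the sketch below. This is an illustration only, not the code in `bootstrap.sh`; the helper name `resolve_secret` is hypothetical, and it assumes the variable name passed in is a trusted identifier because it is expanded through `eval`.

```bash
# Illustrative sketch of the *_PASS lookup; not the actual bootstrap.sh code.
# resolve_secret NAME exports NAME from the pass entry named by NAME_PASS,
# unless NAME is already set in the environment.
# NAME must be a trusted variable name because it is expanded via eval.
resolve_secret() {
  local name="$1" current entry value

  # An explicit export always wins over the pass lookup.
  eval "current=\${$name:-}"
  if [ -n "$current" ]; then
    return 0
  fi

  # No mapping configured for this name: report failure to the caller.
  eval "entry=\${${name}_PASS:-}"
  if [ -z "$entry" ]; then
    return 1
  fi

  # Keep only the first line of the entry and never print it.
  value="$(pass show "$entry" | head -n 1)"
  if [ -z "$value" ]; then
    return 1
  fi
  export "$name=$value"
}
```

A caller might run `resolve_secret HCLOUD_TOKEN || echo "HCLOUD_TOKEN is not configured" >&2` for each required secret before invoking Terraform.
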

Required values:

- `HCLOUD_TOKEN_PASS` or `HCLOUD_TOKEN`
- `SSH_PUBLIC_KEY_PATH`
- `PUBLIC_DOMAIN`
- `BASE_DOMAIN`
- recommended Tailscale values:
  - `TAILSCALE_AUTH_KEY_PASS` or `TAILSCALE_AUTH_KEY`
  - optional `TAILSCALE_CONTROL_PLANE_HOSTNAME` to force a stable Tailscale DNS name for kube access
  - if `TAILSCALE_CONTROL_PLANE_HOSTNAME` is left empty, bootstrap auto-discovers the node via local `tailscale status --json`
- `FORGEJO_DOMAIN`
- `FORGEJO_ROOT_URL`
- `REGISTRY_DOMAIN`
- `GRAFANA_DOMAIN`
- `GRAFANA_ROOT_URL`
- `LETSENCRYPT_EMAIL`
- `REGISTRY_USERNAME`
- `REGISTRY_PASSWORD_PASS` or `REGISTRY_PASSWORD`
- `NEAR_INTENTS_API_KEY_PASS` or `NEAR_INTENTS_API_KEY`
- `FORGEJO_ADMIN_USERNAME`
- `FORGEJO_ADMIN_EMAIL`
- `FORGEJO_ADMIN_PASSWORD_PASS` or `FORGEJO_ADMIN_PASSWORD`
- `GRAFANA_ADMIN_USERNAME` (defaults to `admin`)
- `GRAFANA_ADMIN_PASSWORD_PASS` or `GRAFANA_ADMIN_PASSWORD`
- optional repo settings: `FORGEJO_REPO_OWNER`, `FORGEJO_REPO_NAME`, `FORGEJO_REPO_PRIVATE`

Optional for automatic DNS:

- Cloudflare:
  - `CLOUDFLARE_API_TOKEN_PASS` or `CLOUDFLARE_API_TOKEN`
  - `CLOUDFLARE_ZONE_ID_PASS` or `CLOUDFLARE_ZONE_ID`
- Porkbun:
  - `PORKBUN_API_KEY_PASS` or `PORKBUN_API_KEY`
  - `PORKBUN_SECRET_API_KEY_PASS` or `PORKBUN_SECRET_API_KEY`

## Bootstrap

```bash
bash scripts/hetzner/bootstrap.sh
```

Outputs:

- Hetzner VM created
- Tailscale joined if configured
- k3s installed
- cloud-init writes `/opt/unrip/bootstrap/README.txt` as a marker that node-local repo bootstrap is not active yet
- kubeconfig written to `.state/hetzner/kubeconfig.yaml`
- CI kubeconfig written to `.state/hetzner/kubeconfig.incluster.yaml`
- overlay secrets and ingress host patches rendered from local env / `pass`
- `.state/hetzner/generated-overlay/` rendered and applied as the canonical bootstrap manifest set for that run
- namespaces, Redpanda, app deployments, Forgejo, registry, Traefik-targeted ingress resources, cert-manager, issuers, and any additional platform resources referenced by `deploy/k8s/platform/base/kustomization.yaml` applied
- Forgejo admin account created automatically if missing
- Forgejo runner registration is generated automatically from inside the Forgejo pod and the resulting `/data/.runner` config is stored under the shared `forgejo-data` persistent volume used by the runner deployment
- Forgejo repository created automatically in either the admin user's namespace or a pre-existing organization named by `FORGEJO_REPO_OWNER`
- Forgejo Actions secrets and variables configured automatically
- repo pushed to Forgejo automatically in the default `forgejo-actions` delivery mode via authenticated HTTPS Git push
- first deployment triggered from Forgejo Actions by default

## Tailscale-first admin access

Recommended mode:

- public firewall exposes only `80/443`
- admin access uses Tailscale
- Kubernetes API uses the Tailscale hostname when `TAILSCALE_CONTROL_PLANE_HOSTNAME` is set

`TF_ADMIN_CIDR_BLOCKS` remains only as a fallback if you intentionally want public admin/API exposure.

## DNS and TLS

If DNS provider credentials are present, bootstrap updates:

- `${PUBLIC_DOMAIN}`
- `git.${PUBLIC_DOMAIN}`
- `registry.${PUBLIC_DOMAIN}`
- `grafana.${PUBLIC_DOMAIN}`

Supported scripted providers:

- Cloudflare
- Porkbun

TLS is handled in-cluster by cert-manager using Let's Encrypt issuers and the rendered ingress hosts.

Grafana is the default observability UI wired into the public hostname model. Keep Grafana authenticated.

The platform base assumes the default k3s Traefik ingress controller is present; it does not install ingress-nginx.

For clean-cluster applies, the base kustomization now includes cert-manager before the `ClusterIssuer` resources so the issuer CRs can be created in the same bootstrap flow.

## Observe the cluster

```bash
KUBECONFIG=.state/hetzner/kubeconfig.yaml kubectl get pods -A
bash scripts/k8s/logs.sh
```

For the web log UI and observability stack, see `docs/k8s-observability.md`.
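
Beyond pod status, it can help to confirm that the hostnames listed under "DNS and TLS" actually answer over HTTPS once cert-manager has issued certificates. The sketch below is a hedged illustration: `expected_hosts` and `probe_hosts` are hypothetical helper names that do not exist in this repo, and the host list simply mirrors the four names documented above. A status other than 200 (for example a redirect to a login page) is not necessarily a failure.

```bash
# Illustrative helpers, not part of this repo.

# Print the hostnames that scripted DNS manages for a given public domain.
expected_hosts() {
  local domain="$1"
  printf '%s\n' "$domain" "git.$domain" "registry.$domain" "grafana.$domain"
}

# Print the HTTP status returned by each hostname over HTTPS.
# curl exits non-zero while DNS or TLS is not ready; that case prints "unreachable".
probe_hosts() {
  local host status
  for host in $(expected_hosts "$1"); do
    status="$(curl -sS -o /dev/null --max-time 10 -w '%{http_code}' "https://$host/" 2>/dev/null)" \
      || status="unreachable"
    printf '%-40s %s\n' "$host" "$status"
  done
}
```

After sourcing the mapping file, `probe_hosts "$PUBLIC_DOMAIN"` prints one line per hostname.
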
## Self-hosted CI/CD handoff

Default bootstrap now automates the Forgejo handoff:

1. create the Forgejo repo in the admin namespace or in a pre-existing organization named by `FORGEJO_REPO_OWNER`
2. configure the repository Actions secrets:
   - `KUBECONFIG_B64`
   - `REGISTRY_USERNAME`
   - `REGISTRY_PASSWORD`
3. configure the repository Actions variables:
   - `REGISTRY_HOST=${REGISTRY_DOMAIN}`
   - `PROJECT_NAME`
   - `PROJECT_NAMESPACE`
   - `PROJECT_DEPLOYMENTS`
4. push the current repo to `main`

The workflow then:

- starts a Kubernetes Job in the target namespace
- checks out the repo inside that Job using the Forgejo job token via `Authorization: Bearer ...` HTTP auth
- uses Kaniko plus the Kubernetes registry auth secret to build and push `${REGISTRY_DOMAIN}/${PROJECT_NAME}:${GIT_SHA}`
- updates the app deployments in `PROJECT_NAMESPACE`
- waits for rollout

Legacy local-image bootstrap remains available with:

```bash
BOOTSTRAP_DELIVERY_MODE=local-image-import bash scripts/hetzner/bootstrap.sh
```

## Destroy everything

Default destroy only removes Terraform-managed Hetzner infrastructure:

```bash
source scripts/hetzner/bootstrap-secrets.env
bash scripts/hetzner/destroy.sh
```

Opt-in flags make destructive cleanup of bootstrap-managed leftovers explicit:

```bash
source scripts/hetzner/bootstrap-secrets.env
DESTROY_DNS=true \
DESTROY_LOCAL_STATE=true \
DESTROY_FORGEJO_REPO=true \
bash scripts/hetzner/destroy.sh
```

`destroy.sh` reads `HCLOUD_TOKEN`, optional `TAILSCALE_AUTH_KEY`, optional DNS provider credentials, and optional Forgejo admin credentials via the same `*_PASS` mapping mechanism as bootstrap.

It uses the same Terraform inputs as bootstrap for the infrastructure resources, then can optionally:

- delete the scripted DNS records for `${BASE_DOMAIN}`, `git.${BASE_DOMAIN}`, `registry.${BASE_DOMAIN}`, and `grafana.${BASE_DOMAIN}`
- remove local bootstrap artifacts under `.state/hetzner/`, `deploy/k8s/overlays/hetzner-single-node/generated/`, and the local Terraform working/state files in `infra/terraform/hetzner/`
- delete the bootstrap-managed Forgejo repository via the Forgejo API

Supported scripted DNS cleanup providers:

- Cloudflare
- Porkbun

Cleanup defaults are intentionally conservative:

- `DESTROY_DNS=false` keeps provider records unless you explicitly opt in
- `DESTROY_LOCAL_STATE=false` keeps the last kubeconfigs and generated manifests for inspection
- `DESTROY_FORGEJO_REPO=false` keeps the remote Git repository unless you explicitly opt in

If any optional cleanup step is enabled but cannot run because credentials are missing, `destroy.sh` prints a skip message describing what was not removed. If DNS cleanup or Forgejo repo deletion fails after Terraform teardown, rerun the same cleanup flags or remove the remaining resources manually.
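
The conservative-default and skip-message behaviour described above can be pictured with a small sketch. It is an assumption-laden illustration, not the logic in `destroy.sh`: the helper names `is_enabled` and `run_optional_cleanup` are hypothetical, the messages are invented for the example, and the sketch only prints what would happen instead of deleting anything.

```bash
# Illustrative sketch of the opt-in cleanup pattern; not the actual destroy.sh.

# Treat only the literal string "true" as opt-in; unset or anything else is off.
is_enabled() {
  [ "${1:-false}" = "true" ]
}

# run_optional_cleanup FLAG_VALUE CREDENTIAL_VALUE DESCRIPTION
# Prints the decision for one optional cleanup step without performing it.
run_optional_cleanup() {
  local flag="$1" credential="$2" description="$3"
  if ! is_enabled "$flag"; then
    echo "keep: $description (not requested)"
  elif [ -z "$credential" ]; then
    echo "skip: $description (credentials missing, nothing removed)"
  else
    echo "remove: $description"
  fi
}
```

For example, `run_optional_cleanup "${DESTROY_DNS:-false}" "${CLOUDFLARE_API_TOKEN:-}" "scripted DNS records"` reports `keep` unless `DESTROY_DNS=true` was exported, and reports `skip` when the flag is set but no token is available.
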
## Current limitations

- organization-owned repo bootstrap works only when `FORGEJO_REPO_OWNER` names a pre-existing organization that the configured admin can create repositories in; bootstrap does not create the organization itself
- unattended repo seeding now uses an authenticated HTTPS remote built from the configured Forgejo admin credentials, so operators should replace that local remote with a token, SSH, or credential-helper-backed remote after bootstrap if they do not want credentials stored in `.git/config`
- cloud-init no longer clones a bootstrap repository onto the node; Kubernetes asset delivery is still workstation-driven after Terraform
- `bootstrap_repo_path` in Terraform is only a reserved marker for a future node-local bootstrap/GitOps flow
- bootstrap requires either a local `htpasswd` binary or local `docker` as a fallback to generate the registry htpasswd secret
- bootstrap and CI authentication paths should still be hardened before production use
- runner identity is persisted under the shared `forgejo-data` PVC, so deleting the `forgejo-runner` pod is safe but deleting that PVC forces re-registration on the next bootstrap run
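
For the limitation about credentials ending up in `.git/config`, one possible cleanup is to rewrite the remote URL without its `user:password@` part and let a credential helper or SSH take over. The sketch below is only a suggestion under assumptions: `strip_url_credentials` and `scrub_remote` are hypothetical helper names, and the name of the remote that bootstrap creates is not stated here, so it is passed in as an argument.

```bash
# Illustrative helpers; pass in whichever remote name bootstrap created locally.

# Remove any "user:password@" part from an HTTP(S) URL; other URLs pass through unchanged.
strip_url_credentials() {
  printf '%s\n' "$1" | sed -E 's#^(https?://)[^/@]*@#\1#'
}

# Rewrite REMOTE in the current repository so it no longer embeds credentials.
scrub_remote() {
  local remote="$1" current
  current="$(git remote get-url "$remote")" || return 1
  git remote set-url "$remote" "$(strip_url_credentials "$current")"
}
```

After `scrub_remote <remote-name>`, Git falls back to whatever credential helper is configured, or prompts on the next push.
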