# Hetzner + k3s + self-hosted Git/CI bootstrap

Goal: provision and deploy everything from this repo to a single Hetzner machine with no manual server login.

## Stack

- Terraform provisions the Hetzner Cloud VM, private network, and firewall
- cloud-init installs Tailscale first when configured, then installs k3s automatically
- cloud-init leaves only a bootstrap marker on the node; it does not clone this repo or apply Kubernetes assets
- Kubernetes manifests deploy:
  - Redpanda
  - trading system services
  - private registry
  - Forgejo
  - Loki + Promtail + Grafana + Headlamp observability
  - k3s-bundled Traefik ingress resources
  - cert-manager
  - ACME issuers
- local bootstrap script:
  - runs Terraform
  - optionally creates DNS records via Cloudflare or Porkbun
  - fetches the real kubeconfig from the node
  - writes overlay secrets/host patches from local env
  - renders `.state/hetzner/generated-overlay/` from the checked-in Hetzner overlay template plus `deploy/k8s/platform/base/kustomization.yaml`
  - applies that generated overlay from the operator workstation checkout
  - builds the current app image locally
  - imports the bootstrap image into k3s for the first rollout

## Files

- `infra/terraform/hetzner/`
- `deploy/k8s/base/`
- `deploy/k8s/overlays/hetzner-single-node/`
- `scripts/hetzner/bootstrap.sh`
- `scripts/hetzner/configure-cloudflare-dns.sh`
- `scripts/hetzner/destroy.sh`
- `scripts/k8s/logs.sh`
- `.forgejo/workflows/deploy.yml`

## Required local tools

Always required:

- `terraform`
- `kubectl`
- `curl`
- `python3`
- `ssh`
- `git`
- `base64`
- `realpath`

Conditionally required:

- `pass` when using any `*_PASS` mapping
- `docker` only for `BOOTSTRAP_DELIVERY_MODE=local-image-import`, or as a fallback when no native `htpasswd` binary is available locally
- `htpasswd`, preferred for registry secret generation because it avoids the `docker` fallback

Required local Python modules:

- `PyYAML` (`python3 -m pip install PyYAML`) for kubeconfig rendering during bootstrap
- `PyNaCl` (`python3 -m pip install PyNaCl`) only when `BOOTSTRAP_DELIVERY_MODE=forgejo-actions`, so bootstrap can encrypt Forgejo Actions secrets
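A preflight check along these lines can catch a missing tool before Terraform starts. This is a minimal sketch, not the check `bootstrap.sh` itself performs; the function name `missing_tools` is hypothetical.

```bash
# Hypothetical helper: print each named tool that is not on PATH.
# Returns non-zero if at least one tool is missing.
missing_tools() {
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf 'missing: %s\n' "$tool"
      rc=1
    fi
  done
  return "$rc"
}

# Example: the always-required tools from the list above.
# missing_tools terraform kubectl curl python3 ssh git base64 realpath
```

Conditional tools such as `pass`, `docker`, and `htpasswd` can be appended to the argument list depending on the delivery mode in use.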

## Required local env

Start from:

```bash
cp scripts/hetzner/bootstrap-secrets.env.example scripts/hetzner/bootstrap-secrets.env
${EDITOR:-vi} scripts/hetzner/bootstrap-secrets.env
source scripts/hetzner/bootstrap-secrets.env
```

The mapping file should contain non-secret config plus `pass` entry references for secrets. Bootstrap and destroy load the first line from each configured `pass` entry without echoing it. Explicit env exports still override `pass` lookups.
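The lookup order described above (explicit export first, then the first line of the `pass` entry) could be sketched as follows. The helper name `resolve_secret` is hypothetical and the real scripts may implement this differently; `eval` is used only so the sketch also runs under a plain POSIX shell.

```bash
# Hypothetical helper: print the value for NAME.
# An exported NAME wins; otherwise the first line of the pass entry
# named by NAME_PASS is used.
resolve_secret() {
  name=$1
  eval "value=\${$name:-}"
  eval "entry=\${${name}_PASS:-}"
  if [ -n "$value" ]; then
    printf '%s\n' "$value"
  elif [ -n "$entry" ]; then
    # `pass show` prints the whole entry; keep only the first line.
    pass show "$entry" | head -n 1
  else
    printf 'neither %s nor %s_PASS is set\n' "$name" "$name" >&2
    return 1
  fi
}

# Usage: capture the value without echoing it to the terminal.
# HCLOUD_TOKEN=$(resolve_secret HCLOUD_TOKEN) && export HCLOUD_TOKEN
```

Only pass trusted variable names to a helper like this, since the name is expanded through `eval`.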

When you run `scripts/hetzner/bootstrap.sh`, it uses this file to materialize local Kubernetes inputs before apply:

- overwrites `deploy/k8s/overlays/hetzner-single-node/secrets/unrip.env` with `NEAR_INTENTS_API_KEY`
- overwrites `deploy/k8s/overlays/hetzner-single-node/secrets/forgejo.env` with Forgejo `root_url` and `domain`
- overwrites `deploy/k8s/overlays/hetzner-single-node/secrets/observability.env` with Grafana bootstrap credentials and root URL
- renders `.state/hetzner/generated-overlay/` as the bootstrap-time source of truth
- copies the checked-in overlay patch behavior into that generated overlay
- imports platform resources from `deploy/k8s/platform/base/kustomization.yaml`, so newly added platform modules such as observability manifests are included automatically
- creates `registry-secrets` in namespace `registry` from `REGISTRY_USERNAME` and `REGISTRY_PASSWORD`
- creates the project docker-registry pull secret in `PROJECT_NAMESPACE` from the same registry credentials

This differs from running `kubectl apply -k deploy/k8s/overlays/hetzner-single-node` manually. A plain Kustomize apply consumes only the checked-in overlay files: it does not read `scripts/hetzner/bootstrap-secrets.env`, and it does not create the imperative registry auth secrets. Bootstrap applies the generated overlay copy instead.
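To inspect which manifest set would be used without applying anything, a helper like the one below may be useful. It is a sketch under assumptions: the name `overlay_dir` is hypothetical, and it assumes the generated overlay has a `kustomization.yaml` at its root.

```bash
# Hypothetical helper: print the generated overlay directory if it exists,
# otherwise fall back to the checked-in overlay.
# Assumes the generated overlay contains a kustomization.yaml at its root.
overlay_dir() {
  repo_root=${1:-.}
  generated="$repo_root/.state/hetzner/generated-overlay"
  checked_in="$repo_root/deploy/k8s/overlays/hetzner-single-node"
  if [ -f "$generated/kustomization.yaml" ]; then
    printf '%s\n' "$generated"
  else
    printf '%s\n' "$checked_in"
  fi
}

# Render without applying, or diff against the live cluster:
# kubectl kustomize "$(overlay_dir)"
# KUBECONFIG=.state/hetzner/kubeconfig.yaml kubectl diff -k "$(overlay_dir)"
```

Neither command creates the imperative registry secrets, so a diff against a bootstrapped cluster will not show them.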

Required values:

- `HCLOUD_TOKEN_PASS` or `HCLOUD_TOKEN`
- `SSH_PUBLIC_KEY_PATH`
- `PUBLIC_DOMAIN`
- `BASE_DOMAIN`
- recommended Tailscale values:
  - `TAILSCALE_AUTH_KEY_PASS` or `TAILSCALE_AUTH_KEY`
  - optional `TAILSCALE_CONTROL_PLANE_HOSTNAME` to force a stable Tailscale DNS name for kube access
  - if `TAILSCALE_CONTROL_PLANE_HOSTNAME` is left empty, bootstrap auto-discovers the node via local `tailscale status --json`
- `FORGEJO_DOMAIN`
- `FORGEJO_ROOT_URL`
- `REGISTRY_DOMAIN`
- `GRAFANA_DOMAIN`
- `GRAFANA_ROOT_URL`
- `HEADLAMP_DOMAIN`
- `LETSENCRYPT_EMAIL`
- `REGISTRY_USERNAME`
- `REGISTRY_PASSWORD_PASS` or `REGISTRY_PASSWORD`
- `NEAR_INTENTS_API_KEY_PASS` or `NEAR_INTENTS_API_KEY`
- `FORGEJO_ADMIN_USERNAME`
- `FORGEJO_ADMIN_EMAIL`
- `FORGEJO_ADMIN_PASSWORD_PASS` or `FORGEJO_ADMIN_PASSWORD`
- `GRAFANA_ADMIN_USERNAME` (defaults to `admin`)
- `GRAFANA_ADMIN_PASSWORD_PASS` or `GRAFANA_ADMIN_PASSWORD`
- optional `HEADLAMP_ADMIN_TOKEN_PASS` for storing the generated Headlamp login token back into `pass`
- optional repo settings: `FORGEJO_REPO_OWNER`, `FORGEJO_REPO_NAME`, `FORGEJO_REPO_PRIVATE`
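Before a long bootstrap run it can be worth confirming that the required names are populated. The sketch below is illustrative only: `have_value` and `check_required` are hypothetical names, and the check accepts a `_PASS` variant for every name, which is looser than the list above.

```bash
# Hypothetical helper: succeed if NAME or NAME_PASS is non-empty.
have_value() {
  eval "direct=\${$1:-}"
  eval "mapped=\${${1}_PASS:-}"
  [ -n "$direct" ] || [ -n "$mapped" ]
}

# Print every unset name to stderr; return non-zero if any are missing.
check_required() {
  rc=0
  for name in "$@"; do
    if ! have_value "$name"; then
      printf 'unset: %s (or %s_PASS)\n' "$name" "$name" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Example with a few of the names listed above:
# check_required HCLOUD_TOKEN PUBLIC_DOMAIN REGISTRY_PASSWORD LETSENCRYPT_EMAIL
```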

Optional for automatic DNS:

- Cloudflare:
  - `CLOUDFLARE_API_TOKEN_PASS` or `CLOUDFLARE_API_TOKEN`
  - `CLOUDFLARE_ZONE_ID_PASS` or `CLOUDFLARE_ZONE_ID`
- Porkbun:
  - `PORKBUN_API_KEY_PASS` or `PORKBUN_API_KEY`
  - `PORKBUN_SECRET_API_KEY_PASS` or `PORKBUN_SECRET_API_KEY`

## Bootstrap

```bash
bash scripts/hetzner/bootstrap.sh
```

Outputs:

- Hetzner VM created
- Tailscale joined if configured
- k3s installed
- cloud-init writes `/opt/unrip/bootstrap/README.txt` as a marker that node-local repo bootstrap is not active yet
- kubeconfig written to `.state/hetzner/kubeconfig.yaml`
- CI kubeconfig written to `.state/hetzner/kubeconfig.incluster.yaml`
- overlay secrets and ingress host patches rendered from local env / `pass`
- `.state/hetzner/generated-overlay/` rendered and applied as the canonical bootstrap manifest set for that run
- namespaces, Redpanda, app deployments, Forgejo, registry, Traefik-targeted ingress resources, cert-manager, issuers, and any additional platform resources referenced by `deploy/k8s/platform/base/kustomization.yaml` applied
- Headlamp is deployed and wired to the configured public hostname model
- bootstrap stores the generated Headlamp service-account token in `pass` when `HEADLAMP_ADMIN_TOKEN_PASS` is configured
- Forgejo admin account created automatically if missing
- Forgejo runner registration is generated automatically from inside the Forgejo pod and the resulting `/data/.runner` config is stored under the shared `forgejo-data` persistent volume used by the runner deployment
- Forgejo repository created automatically in either the admin user's namespace or a pre-existing organization named by `FORGEJO_REPO_OWNER`
- Forgejo Actions secrets and variables configured automatically
- repo pushed to Forgejo automatically in the default `forgejo-actions` delivery mode via authenticated HTTPS Git push
- first deployment triggered from Forgejo Actions by default
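After a run, the local artifacts named in the list above can be checked quickly. This is a minimal sketch with a hypothetical helper name, `check_bootstrap_state`; it only tests that the paths exist, not that their contents are valid.

```bash
# Hypothetical helper: report which expected bootstrap artifacts are missing
# under the state directory (default: .state/hetzner).
check_bootstrap_state() {
  state_dir=${1:-.state/hetzner}
  rc=0
  for path in kubeconfig.yaml kubeconfig.incluster.yaml generated-overlay; do
    if [ ! -e "$state_dir/$path" ]; then
      printf 'missing: %s\n' "$state_dir/$path"
      rc=1
    fi
  done
  return "$rc"
}

# If the artifacts exist, confirm the node answers:
# check_bootstrap_state && KUBECONFIG=.state/hetzner/kubeconfig.yaml kubectl get nodes
```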

## Tailscale-first admin access

Recommended mode:

- public firewall exposes only `80/443`
- admin access uses Tailscale
- Kubernetes API uses the Tailscale hostname when `TAILSCALE_CONTROL_PLANE_HOSTNAME` is set

`TF_ADMIN_CIDR_BLOCKS` remains only as a fallback if you intentionally want public admin/API exposure.

## DNS and TLS

If DNS provider credentials are present, bootstrap updates:

- `${PUBLIC_DOMAIN}`
- `git.${PUBLIC_DOMAIN}`
- `registry.${PUBLIC_DOMAIN}`
- `grafana.${PUBLIC_DOMAIN}`
- `headlamp.${PUBLIC_DOMAIN}`
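The record names above follow a fixed pattern, so they can be generated from `PUBLIC_DOMAIN` when checking DNS by hand. The helper name `dns_names` is hypothetical, and the commented loop uses `dig`, which this README does not list as a required tool.

```bash
# Print the hostnames listed above for a given PUBLIC_DOMAIN.
dns_names() {
  domain=$1
  printf '%s\n' "$domain"
  for sub in git registry grafana headlamp; do
    printf '%s.%s\n' "$sub" "$domain"
  done
}

# Example: show the first answer for each record (assumes `dig` is installed).
# for host in $(dns_names "$PUBLIC_DOMAIN"); do
#   printf '%s -> %s\n' "$host" "$(dig +short "$host" | head -n 1)"
# done
```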

Supported scripted providers:

- Cloudflare
- Porkbun

TLS is handled in-cluster by cert-manager using Let's Encrypt issuers and the rendered ingress hosts.
Grafana and Headlamp are both wired into the public hostname model by default. Keep Grafana authenticated, and treat the Headlamp token as an operator credential.
The platform base assumes the default k3s Traefik ingress controller is present; it does not install ingress-nginx.
For clean-cluster applies, the base kustomization now includes cert-manager before the `ClusterIssuer` resources so the issuer CRs can be created in the same bootstrap flow.

## Observe the cluster

```bash
KUBECONFIG=.state/hetzner/kubeconfig.yaml kubectl get pods -A
bash scripts/k8s/logs.sh
```

For the web log UI and observability stack, see `docs/k8s-observability.md`.

## Self-hosted CI/CD handoff

Default bootstrap now automates the Forgejo handoff:

1. create the Forgejo repo in the admin namespace or in a pre-existing organization named by `FORGEJO_REPO_OWNER`
2. configure the repository Actions secrets:
   - `KUBECONFIG_B64`
   - `REGISTRY_USERNAME`
   - `REGISTRY_PASSWORD`
3. configure the repository Actions variables:
   - `REGISTRY_HOST=${REGISTRY_DOMAIN}`
   - `PROJECT_NAME`
   - `PROJECT_NAMESPACE`
   - `PROJECT_DEPLOYMENTS`
4. push the current repo to `main`
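A value such as `KUBECONFIG_B64` is typically a kubeconfig encoded as single-line base64. The sketch below shows one way to produce and reverse such a value. The helper names are hypothetical, and which kubeconfig bootstrap actually encodes is an assumption here (the CI kubeconfig at `.state/hetzner/kubeconfig.incluster.yaml` is the likely candidate).

```bash
# Encode a file as single-line base64, suitable for a secret value.
# GNU base64 wraps at 76 columns by default, so strip newlines explicitly.
encode_b64() {
  base64 < "$1" | tr -d '\n'
}

# Decode it again, as a workflow step might before calling kubectl.
# `-d` is the GNU coreutils spelling; older macOS releases use `-D`.
decode_b64() {
  printf '%s' "$1" | base64 -d
}

# Example (file path is an assumption, see above):
# KUBECONFIG_B64=$(encode_b64 .state/hetzner/kubeconfig.incluster.yaml)
```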

The workflow then:

- starts a Kubernetes Job in the target namespace
- checks out the repo inside that Job using the Forgejo job token via `Authorization: Bearer ...` HTTP auth
- uses Kaniko plus the Kubernetes registry auth secret to build and push `${REGISTRY_DOMAIN}/${PROJECT_NAME}:${GIT_SHA}`
- updates the app deployments in `PROJECT_NAMESPACE`
- waits for rollout
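The last two steps can be approximated by hand with standard `kubectl` commands. The sketch below only prints the commands rather than running them, and it rests on assumptions: the helper names are hypothetical, `PROJECT_DEPLOYMENTS` is treated as a space-separated list, and every container in each deployment is assumed to use the new image. The actual steps live in `.forgejo/workflows/deploy.yml` and may differ.

```bash
# Compose the image reference in the form the workflow pushes.
image_ref() {
  printf '%s/%s:%s\n' "$1" "$2" "$3"
}

# Print (not run) the kubectl commands for each deployment.
# Assumes every container in each deployment should run the new image.
rollout_plan() {
  namespace=$1
  image=$2
  shift 2
  for deploy in "$@"; do
    printf "kubectl -n %s set image deployment/%s '*=%s'\n" "$namespace" "$deploy" "$image"
    printf 'kubectl -n %s rollout status deployment/%s\n' "$namespace" "$deploy"
  done
}

# Example (word splitting of PROJECT_DEPLOYMENTS is intentional here):
# rollout_plan "$PROJECT_NAMESPACE" \
#   "$(image_ref "$REGISTRY_DOMAIN" "$PROJECT_NAME" "$GIT_SHA")" $PROJECT_DEPLOYMENTS
```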

Legacy local-image bootstrap remains available with:

```bash
BOOTSTRAP_DELIVERY_MODE=local-image-import bash scripts/hetzner/bootstrap.sh
```

## Destroy everything

Default destroy only removes Terraform-managed Hetzner infrastructure:

```bash
source scripts/hetzner/bootstrap-secrets.env
bash scripts/hetzner/destroy.sh
```

Opt-in flags make destructive cleanup of bootstrap-managed leftovers explicit:

```bash
source scripts/hetzner/bootstrap-secrets.env
DESTROY_DNS=true \
DESTROY_LOCAL_STATE=true \
DESTROY_FORGEJO_REPO=true \
bash scripts/hetzner/destroy.sh
```

`destroy.sh` reads `HCLOUD_TOKEN`, optional `TAILSCALE_AUTH_KEY`, optional DNS provider credentials, and optional Forgejo admin credentials via the same `*_PASS` mapping mechanism as bootstrap.
It uses the same Terraform inputs as bootstrap for the infrastructure resources, then can optionally:

- delete the scripted DNS records for `${PUBLIC_DOMAIN}`, `git.${PUBLIC_DOMAIN}`, `registry.${PUBLIC_DOMAIN}`, `grafana.${PUBLIC_DOMAIN}`, and `headlamp.${PUBLIC_DOMAIN}`
- remove local bootstrap artifacts under `.state/hetzner/`, `deploy/k8s/overlays/hetzner-single-node/generated/`, and the local Terraform working/state files in `infra/terraform/hetzner/`
- delete the bootstrap-managed Forgejo repository via the Forgejo API

Supported scripted DNS cleanup providers:

- Cloudflare
- Porkbun

Cleanup defaults are intentionally conservative:

- `DESTROY_DNS=false` keeps provider records unless you explicitly opt in
- `DESTROY_LOCAL_STATE=false` keeps the last kubeconfigs and generated manifests for inspection
- `DESTROY_FORGEJO_REPO=false` keeps the remote Git repository unless you explicitly opt in

If any optional cleanup step is enabled but cannot run because credentials are missing, `destroy.sh` prints a skip message describing what was not removed.
If DNS cleanup or Forgejo repo deletion fails after Terraform teardown, rerun the same cleanup flags or remove the remaining resources manually.
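The opt-in and skip behavior described above can be illustrated for a single step. This is a sketch, not the logic in `destroy.sh`: the helper names and message wording are hypothetical, and it covers only the Cloudflare case.

```bash
# Hypothetical helper: treat only the literal string "true" as opt-in.
flag_enabled() {
  [ "${1:-false}" = "true" ]
}

# Sketch of one optional cleanup decision: DNS records on Cloudflare.
plan_dns_cleanup() {
  if ! flag_enabled "${DESTROY_DNS:-}"; then
    printf 'keep: DNS records (DESTROY_DNS is not true)\n'
  elif [ -z "${CLOUDFLARE_API_TOKEN:-}" ] || [ -z "${CLOUDFLARE_ZONE_ID:-}" ]; then
    printf 'skip: DNS records not removed, Cloudflare credentials missing\n'
  else
    printf 'delete: DNS records\n'
  fi
}
```

Treating anything other than `true` as off keeps a typo such as `DESTROY_DNS=yes` on the conservative side.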

## Current limitations

- organization-owned repo bootstrap works only when `FORGEJO_REPO_OWNER` names a pre-existing organization that the configured admin can create repositories in; bootstrap does not create the organization itself
- unattended repo seeding now uses an authenticated HTTPS remote built from the configured Forgejo admin credentials, so operators should replace that local remote with a token, SSH, or credential-helper-backed remote after bootstrap if they do not want credentials stored in `.git/config`
- cloud-init no longer clones a bootstrap repository onto the node; Kubernetes asset delivery is still workstation-driven after Terraform
- `bootstrap_repo_path` in Terraform is only a reserved marker for a future node-local bootstrap/GitOps flow
- bootstrap requires either a local `htpasswd` binary or local `docker` as a fallback to generate the registry htpasswd secret
- bootstrap and CI authentication paths should still be hardened before production use
- runner identity is persisted under the shared `forgejo-data` PVC, so deleting the `forgejo-runner` pod is safe but deleting that PVC forces re-registration on the next bootstrap run
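For the credential-bearing HTTPS remote mentioned in the list above, one possible cleanup is to rewrite the remote URL without its user-info part. The helper name `strip_url_credentials` is hypothetical, and the remote name `origin` in the usage comment is an assumption; bootstrap may use a different remote name.

```bash
# Remove any user[:password]@ part from an HTTP(S) remote URL.
strip_url_credentials() {
  printf '%s\n' "$1" | sed -E 's#^(https?://)[^/@]*@#\1#'
}

# Usage (the remote name "origin" is an assumption):
# git remote set-url origin "$(strip_url_credentials "$(git remote get-url origin)")"
```

After this change, pushes rely on whatever credential helper, token, or SSH setup the operator configures instead of credentials stored in `.git/config`.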