doran/docs/hetzner-k3s-bootstrap.md
2026-03-28 20:53:29 +01:00

# Hetzner + k3s + self-hosted Git/CI bootstrap
Goal: provision and deploy everything from this repo to a single Hetzner machine with no manual server login.
## Stack
- Terraform provisions the Hetzner Cloud VM, private network, and firewall
- cloud-init installs Tailscale first when configured, then installs k3s automatically
- Kubernetes manifests deploy:
  - Redpanda
  - trading system services
  - private registry
  - Forgejo
  - ingress-nginx
  - cert-manager
  - ACME issuers
- local bootstrap script:
  - runs Terraform
  - optionally creates DNS records via Cloudflare or Porkbun
  - writes overlay secrets/host patches from local env
  - applies the Hetzner single-node k8s overlay
  - builds the current app image locally
  - fetches the real kubeconfig from the node
  - imports the bootstrap image into k3s for the first rollout
## Files
- `infra/terraform/hetzner/`
- `deploy/k8s/base/`
- `deploy/k8s/overlays/hetzner-single-node/`
- `scripts/hetzner/bootstrap.sh`
- `scripts/hetzner/configure-cloudflare-dns.sh`
- `scripts/hetzner/destroy.sh`
- `scripts/k8s/logs.sh`
- `.forgejo/workflows/deploy.yml`
## Required local tools
- `terraform`
- `kubectl`
- `docker`
- `curl`
- `python3`
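
A quick preflight check can catch a missing tool before Terraform starts creating resources. The sketch below is not part of the repo; `missing_tools` is a hypothetical helper name, and the tool list is copied from the section above.

```bash
# Print each named tool that is not found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Warn (without aborting) if any required tool is absent.
missing=$(missing_tools terraform kubectl docker curl python3)
if [ -n "$missing" ]; then
  printf 'missing required tools:\n%s\n' "$missing" >&2
fi
```

The check only confirms that each command exists; it does not verify versions.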
## Required local env
Start from:
```bash
cp scripts/hetzner/bootstrap-secrets.env.example scripts/hetzner/bootstrap-secrets.env
source scripts/hetzner/bootstrap-secrets.env
```
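
Plain `source` only makes variables visible to child processes such as `bootstrap.sh` if the env file itself uses `export`. Whether the example file does so is not stated here, so the following sketch assumes it may contain bare `KEY=value` lines and enables auto-export while loading it. `load_env_file` is a hypothetical helper, not something the repo provides.

```bash
# Source an env file with auto-export enabled so that plain KEY=value
# assignments are exported to child processes.
load_env_file() {
  set -a   # export every variable assigned from here on
  . "$1"
  set +a   # restore the default behaviour
}

env_file=scripts/hetzner/bootstrap-secrets.env
if [ -f "$env_file" ]; then
  load_env_file "$env_file"
fi
```

If the file already prefixes every line with `export`, this is harmless and behaves the same as a plain `source`.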
Required values:
- `HCLOUD_TOKEN`
- `SSH_PUBLIC_KEY_PATH`
- `PUBLIC_DOMAIN`
- `BASE_DOMAIN`
- recommended Tailscale values:
  - `TAILSCALE_AUTH_KEY`
  - `TAILSCALE_CONTROL_PLANE_HOSTNAME`
- `FORGEJO_DOMAIN`
- `FORGEJO_ROOT_URL`
- `REGISTRY_DOMAIN`
- `LETSENCRYPT_EMAIL`
- `REGISTRY_USERNAME`
- `REGISTRY_PASSWORD`
- `NEAR_INTENTS_API_KEY`
- `FORGEJO_RUNNER_REGISTRATION_TOKEN`
Optional for automatic DNS:
- Cloudflare:
  - `CLOUDFLARE_API_TOKEN`
  - `CLOUDFLARE_ZONE_ID`
- Porkbun:
  - `PORKBUN_API_KEY`
  - `PORKBUN_SECRET_API_KEY`
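
Before running bootstrap it can help to confirm that none of the required values is empty. This is a minimal sketch under the assumption that the variables listed above are the complete required set; `missing_env` is a hypothetical helper and the bootstrap script may perform its own validation.

```bash
# Print the name of every listed variable that is unset or empty.
missing_env() {
  for name in "$@"; do
    # Indirect lookup; the names are fixed identifiers, so eval is safe here.
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}

missing_env HCLOUD_TOKEN SSH_PUBLIC_KEY_PATH PUBLIC_DOMAIN BASE_DOMAIN \
  FORGEJO_DOMAIN FORGEJO_ROOT_URL REGISTRY_DOMAIN LETSENCRYPT_EMAIL \
  REGISTRY_USERNAME REGISTRY_PASSWORD NEAR_INTENTS_API_KEY \
  FORGEJO_RUNNER_REGISTRATION_TOKEN
```

Empty output means every required value is set. The Tailscale and DNS provider variables are left out because they are recommended or optional.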
## Bootstrap
```bash
bash scripts/hetzner/bootstrap.sh
```
Outputs:
- Hetzner VM created
- Tailscale joined if configured
- k3s installed
- kubeconfig written to `.state/hetzner/kubeconfig.yaml`
- overlay secrets and ingress host patches rendered from local env
- namespaces, Redpanda, app deployments, Forgejo, registry, ingress, cert-manager, and issuers applied
- bootstrap image built and first rollout triggered
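
One way to confirm these outputs from the local machine is to check that the kubeconfig was written and that the node reports Ready. The sketch below assumes only the kubeconfig path listed above; `kubeconfig_present` is a hypothetical helper and the two-minute timeout is an arbitrary choice.

```bash
KUBECONFIG_PATH=.state/hetzner/kubeconfig.yaml

# Succeed only if the kubeconfig exists and is non-empty.
kubeconfig_present() {
  [ -s "$1" ]
}

if kubeconfig_present "$KUBECONFIG_PATH"; then
  # Block until every node reports Ready, or give up after two minutes.
  kubectl --kubeconfig "$KUBECONFIG_PATH" wait --for=condition=Ready nodes --all --timeout=120s
  kubectl --kubeconfig "$KUBECONFIG_PATH" get pods -A
else
  echo "kubeconfig not found at $KUBECONFIG_PATH; did bootstrap finish?" >&2
fi
```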
## Tailscale-first admin access
Recommended mode:
- public firewall exposes only `80/443`
- admin access uses Tailscale
- Kubernetes API uses the Tailscale hostname when `TAILSCALE_CONTROL_PLANE_HOSTNAME` is set

`TF_ADMIN_CIDR_BLOCKS` remains only as a fallback if you intentionally want public admin/API exposure.
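
To check which endpoint the fetched kubeconfig actually targets, the API server URL can be read back and compared with the Tailscale hostname. This is a sketch, assuming the kubeconfig has a single cluster entry; `server_uses_host` is a hypothetical helper.

```bash
# Succeed if the API server URL points at the given hostname.
server_uses_host() {
  case "$1" in
    "https://$2"|"https://$2:"*|"https://$2/"*) return 0 ;;
    *) return 1 ;;
  esac
}

if [ -n "${TAILSCALE_CONTROL_PLANE_HOSTNAME:-}" ]; then
  # Read the server URL of the first cluster entry in the kubeconfig.
  server=$(kubectl config view --kubeconfig .state/hetzner/kubeconfig.yaml \
    -o jsonpath='{.clusters[0].cluster.server}')
  if server_uses_host "$server" "$TAILSCALE_CONTROL_PLANE_HOSTNAME"; then
    echo "Kubernetes API is reached over Tailscale: $server"
  else
    echo "Kubernetes API is not using the Tailscale hostname: $server" >&2
  fi
fi
```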
## DNS and TLS
If DNS provider credentials are present, bootstrap updates:
- `${BASE_DOMAIN}`
- `git.${BASE_DOMAIN}`
- `registry.${BASE_DOMAIN}`
Supported scripted providers:
- Cloudflare
- Porkbun
TLS is handled in-cluster by cert-manager using Let's Encrypt issuers and the rendered ingress hosts.
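
Certificate issuance depends on the three hostnames resolving, so it can be useful to check DNS before expecting valid certificates. The sketch below derives the names from `BASE_DOMAIN` as listed above and resolves them with `python3`, which is already a required tool; `bootstrap_hosts` is a hypothetical helper.

```bash
# Print the hostnames bootstrap manages for a given base domain.
bootstrap_hosts() {
  printf '%s\n' "$1" "git.$1" "registry.$1"
}

# Report which of those names currently resolve from this machine.
if [ -n "${BASE_DOMAIN:-}" ]; then
  for host in $(bootstrap_hosts "$BASE_DOMAIN"); do
    if python3 -c 'import socket, sys; socket.gethostbyname(sys.argv[1])' "$host" 2>/dev/null; then
      echo "ok       $host"
    else
      echo "pending  $host"
    fi
  done
fi
```

Once the names resolve, `KUBECONFIG=.state/hetzner/kubeconfig.yaml kubectl get certificate -A` lists the cert-manager `Certificate` resources and whether each is ready.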
## Observe the cluster
```bash
KUBECONFIG=.state/hetzner/kubeconfig.yaml kubectl get pods -A
bash scripts/k8s/logs.sh
```
## Self-hosted CI/CD handoff
After bootstrap:
1. open Forgejo at `https://${FORGEJO_DOMAIN}`
2. seed or mirror this repo into Forgejo
3. add Forgejo Actions secrets:
   - `KUBECONFIG_B64`
   - `REGISTRY_USERNAME`
   - `REGISTRY_PASSWORD`
4. add Forgejo Actions variable:
   - `REGISTRY_HOST=${REGISTRY_DOMAIN}`
5. push to `main`

The workflow then:
- builds the image
- pushes it to `https://${REGISTRY_DOMAIN}`
- updates the app deployments in `unrip`
- waits for rollout
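
The name `KUBECONFIG_B64` suggests the secret holds the kubeconfig as a base64 string. Assuming that is what the workflow expects (this document does not confirm it), the value could be produced as follows. `encode_file_b64` is a hypothetical helper.

```bash
# Encode a file as single-line base64, suitable for pasting into a secret field.
encode_file_b64() {
  base64 < "$1" | tr -d '\n'
}

kubeconfig=.state/hetzner/kubeconfig.yaml
if [ -s "$kubeconfig" ]; then
  encode_file_b64 "$kubeconfig"
  echo
fi
```

Stripping newlines with `tr` avoids relying on the `-w0` flag, which not every `base64` implementation supports. The output grants cluster access, so treat it like any other credential.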
## Destroy everything
```bash
bash scripts/hetzner/destroy.sh
```
## Current limitations
- Forgejo admin bootstrap and repo seeding are still operator-driven after the first cluster bootstrap.
- Bootstrap and CI authentication paths should still be hardened before production use.
- Routine deploys are intended to be registry-native through Forgejo Actions, but that path still needs a real-world verification pass.