Hetzner self-hosted CI/CD runbook

This is the operator runbook for the handoff from local bootstrap to self-hosted Forgejo-based deployment.

Bootstrap prerequisites

From your workstation:

cp scripts/hetzner/bootstrap-secrets.env.example scripts/hetzner/bootstrap-secrets.env
source scripts/hetzner/bootstrap-secrets.env
python3 -c 'import nacl'  # verify PyNaCl is installed for Actions secret encryption
bash scripts/hetzner/bootstrap.sh

scripts/hetzner/bootstrap-secrets.env should contain non-secret bootstrap settings plus pass entry mappings (variables naming entries in the pass password store) such as HCLOUD_TOKEN_PASS, REGISTRY_PASSWORD_PASS, and FORGEJO_ADMIN_PASSWORD_PASS. If you explicitly export the corresponding raw env vars, they override the pass lookups.
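
The override rule can be pictured with a small shell sketch. This is not the code in bootstrap.sh: the helper name resolve_secret is hypothetical, and so is the assumption that the raw counterpart of HCLOUD_TOKEN_PASS is called HCLOUD_TOKEN.

```shell
# Hypothetical helper (not taken from bootstrap.sh):
# an exported raw value wins; otherwise the pass entry is read.
resolve_secret() {
  local raw_value="$1"    # e.g. "${HCLOUD_TOKEN:-}" (assumed raw variable name)
  local pass_entry="$2"   # e.g. "${HCLOUD_TOKEN_PASS:-}"
  if [ -n "$raw_value" ]; then
    printf '%s\n' "$raw_value"
  elif [ -n "$pass_entry" ]; then
    # By pass convention, the first line of an entry holds the secret itself.
    pass show "$pass_entry" | head -n 1
  else
    echo "no raw value and no pass entry configured" >&2
    return 1
  fi
}
```

A caller would then write something like HCLOUD_TOKEN="$(resolve_secret "${HCLOUD_TOKEN:-}" "${HCLOUD_TOKEN_PASS:-}")", so the pass store is only consulted when nothing was exported.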

After that you should have:

  • .state/hetzner/kubeconfig.yaml
  • .state/hetzner/kubeconfig.incluster.yaml
  • Forgejo reachable at https://${FORGEJO_DOMAIN}
  • the target Forgejo repo created automatically
  • repository Actions secrets/variables populated for CI
  • the current repo pushed to Forgejo automatically in default mode
  • Registry reachable at https://${REGISTRY_DOMAIN}
  • private admin/control-plane access over Tailscale if configured

Bootstrap repo automation requires FORGEJO_ADMIN_USERNAME, FORGEJO_ADMIN_PASSWORD, and Python PyNaCl locally so the script can encrypt Forgejo Actions secrets before upload. The same bootstrap flow now also creates the initial Forgejo admin account and generates the one-time runner registration token after Forgejo is up.
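
If you want to catch a missing prerequisite before starting a long bootstrap run, a preflight check along these lines may help. It is a sketch only: the function name check_bootstrap_prereqs is hypothetical, and it assumes the admin password may be supplied either directly or through its FORGEJO_ADMIN_PASSWORD_PASS mapping.

```shell
# Hypothetical preflight (not part of bootstrap.sh): report every missing
# prerequisite on stderr and return non-zero if at least one is missing.
check_bootstrap_prereqs() {
  local missing=0
  if [ -z "${FORGEJO_ADMIN_USERNAME:-}" ]; then
    echo "FORGEJO_ADMIN_USERNAME is not set" >&2
    missing=1
  fi
  # Assumption: either the raw password or its pass mapping is acceptable.
  if [ -z "${FORGEJO_ADMIN_PASSWORD:-}${FORGEJO_ADMIN_PASSWORD_PASS:-}" ]; then
    echo "admin password is not set (raw variable or pass mapping)" >&2
    missing=1
  fi
  # Same import test as the bootstrap prerequisites use.
  if ! python3 -c 'import nacl' >/dev/null 2>&1; then
    echo "PyNaCl is not importable by python3" >&2
    missing=1
  fi
  return "$missing"
}
```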

Verify the cluster

export KUBECONFIG=$PWD/.state/hetzner/kubeconfig.yaml
kubectl get nodes -o wide
kubectl get pods -A
kubectl -n forgejo get deploy,pods,svc,ingress
kubectl -n registry get deploy,pods,svc,ingress
kubectl -n unrip get deploy,pods

Seed the repo into Forgejo

Default bootstrap already seeds the repo with:

bash scripts/hetzner/seed-forgejo-repo.sh

You only need to run it manually if you skipped seeding during bootstrap or want to push again after local changes.

Configure Forgejo Actions secrets and variables

Bootstrap upserts these repository secrets automatically:

  • KUBECONFIG_B64
  • REGISTRY_USERNAME
  • REGISTRY_PASSWORD

Bootstrap upserts these repository variables automatically:

  • REGISTRY_HOST=${REGISTRY_DOMAIN}
  • PROJECT_NAME=${PROJECT_NAME}
  • PROJECT_NAMESPACE=${PROJECT_NAMESPACE}
  • PROJECT_DEPLOYMENTS as a comma-separated version of the bootstrap deployment list
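
How the deployment list is stored inside bootstrap is not shown here; assuming it is available as separate words, the comma-separated form could be produced like this (join_by_comma is a hypothetical helper name):

```shell
# Hypothetical helper: join all arguments with commas.
join_by_comma() {
  local IFS=,
  # "$*" joins the arguments using the first character of IFS.
  printf '%s\n' "$*"
}

PROJECT_DEPLOYMENTS="$(join_by_comma near-intents-ingest dummy-reactor dummy-executor dummy-consumer)"
echo "$PROJECT_DEPLOYMENTS"
# prints near-intents-ingest,dummy-reactor,dummy-executor,dummy-consumer
```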

The Forgejo repo configuration step is idempotent, so rerunning bootstrap updates the same repo secrets/variables in place.

Workflow behavior

The workflow in .forgejo/workflows/deploy.yml now:

  1. installs kubectl on the Forgejo runner
  2. loads kubeconfig from KUBECONFIG_B64
  3. computes IMAGE=${REGISTRY_HOST}/${PROJECT_NAME}:${GIT_SHA}
  4. creates an in-cluster Kubernetes Job in PROJECT_NAMESPACE
  5. the Job checks out the repo in an init container, using the Forgejo job token
  6. Kaniko builds and pushes the image using the Kubernetes registry auth secret
  7. the workflow updates each deployment listed in PROJECT_DEPLOYMENTS inside PROJECT_NAMESPACE
  8. the workflow waits for rollout after each image update
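
Steps 2, 3, 7, and 8 can be sketched in plain shell as follows. This is an illustration, not the contents of deploy.yml: the function names are hypothetical, the Kaniko Job from steps 4 to 6 is left out, and the `*=` wildcard assumes every container in a deployment should receive the same image.

```shell
# Step 2: decode the kubeconfig secret into a file only the current user can read.
write_kubeconfig() {
  local target="$1"
  ( umask 077; printf '%s' "$KUBECONFIG_B64" | base64 -d > "$target" )
}

# Step 3: image reference built from the repository variables and the commit SHA.
image_ref() {
  printf '%s/%s:%s\n' "$REGISTRY_HOST" "$PROJECT_NAME" "$GIT_SHA"
}

# Steps 7-8: update each listed deployment, waiting for its rollout before the next.
deploy_all() {
  local image="$1" deployment old_ifs="$IFS"
  IFS=,
  set -- $PROJECT_DEPLOYMENTS   # split the comma-separated list into "$@"
  IFS="$old_ifs"
  for deployment in "$@"; do
    # Assumption: "*=" updates all containers of the deployment to the new image.
    kubectl -n "$PROJECT_NAMESPACE" set image "deployment/$deployment" "*=$image" || return 1
    kubectl -n "$PROJECT_NAMESPACE" rollout status "deployment/$deployment" --timeout=300s || return 1
  done
}
```

Waiting for each rollout before touching the next deployment means a broken image stops the run at the first failing deployment instead of being applied everywhere.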

Default behavior if you do not set project variables:

  • PROJECT_NAME=unrip
  • PROJECT_NAMESPACE=unrip
  • PROJECT_DEPLOYMENTS=near-intents-ingest,dummy-reactor,dummy-executor,dummy-consumer
  • PROJECT_REGISTRY_SECRET_NAME=unrip-registry-creds
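
In shell terms, this kind of fallback is commonly written with default-assigning parameter expansion. The sketch below only mirrors the defaults listed above; the function name apply_project_defaults is hypothetical, and the actual workflow may implement the fallback differently.

```shell
# Hypothetical helper: fill in the documented defaults for any project
# variable that is unset or empty, leaving explicitly set values alone.
apply_project_defaults() {
  : "${PROJECT_NAME:=unrip}"
  : "${PROJECT_NAMESPACE:=unrip}"
  : "${PROJECT_DEPLOYMENTS:=near-intents-ingest,dummy-reactor,dummy-executor,dummy-consumer}"
  : "${PROJECT_REGISTRY_SECRET_NAME:=unrip-registry-creds}"
}
```

The `:=` form also replaces empty values, which matters if an unset repository variable reaches the job as an empty string.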

For a future project, reuse the same workflow by changing only the Forgejo repository variables instead of copying the workflow.

Default bootstrap now uses the same routine CI path for the first deploy:

  • bootstrap fetches the real kubeconfig from the node
  • bootstrap derives an in-cluster kubeconfig for the runner
  • bootstrap creates the Forgejo repo and Actions config
  • bootstrap pushes to main
  • Forgejo Actions builds the image in-cluster and deploys it

Legacy mode still exists if you explicitly set:

BOOTSTRAP_DELIVERY_MODE=local-image-import

Trigger deploys

Push to main in Forgejo:

git push forgejo main

Observe deploys

export KUBECONFIG=$PWD/.state/hetzner/kubeconfig.yaml
kubectl -n unrip rollout status deployment/near-intents-ingest --timeout=300s
kubectl -n unrip rollout status deployment/dummy-reactor --timeout=300s
kubectl -n unrip rollout status deployment/dummy-executor --timeout=300s
kubectl -n unrip rollout status deployment/dummy-consumer --timeout=300s
kubectl -n unrip get pods -o wide
kubectl get events -A --sort-by=.lastTimestamp | tail -n 50

DNS and TLS

If DNS automation was enabled during bootstrap, the A records for the base, Forgejo, and registry hosts are already managed by the repo-side bootstrap.

Currently supported DNS providers:

  • Cloudflare
  • Porkbun

TLS certificates are issued by cert-manager using the rendered Let's Encrypt email and ingress hosts.

Current limitations

  • the bootstrap path now creates the initial admin account and one-time runner registration token automatically from inside the Forgejo pod, but it still depends on the operator supplying the intended admin credentials up front
  • runner registration no longer needs a pre-seeded Kubernetes secret, but the runner config still lives on an emptyDir volume, so it is lost whenever the runner pod is replaced and bootstrap must recreate /data/.runner afterwards
  • automated repo creation currently assumes FORGEJO_REPO_OWNER == FORGEJO_ADMIN_USERNAME
  • the runner currently uses host-mode jobs and installs kubectl at job start; the image build itself runs in-cluster via Kaniko, which is functional but not yet optimized