# Kubernetes observability on the Hetzner single-node cluster

This cluster now includes a reproducible ops/observability stack in the `observability` namespace:

- `loki` for log storage and querying
- `promtail` as a DaemonSet that ships pod stdout/stderr logs from every node
- `grafana` for log search and historical exploration
- `headlamp` for a Kubernetes web UI with pods, workloads, events, and pod logs

## What gets collected

Promtail tails Kubernetes container log files under `/var/log/pods` on each node.
That means any container writing logs to stdout/stderr automatically shows up in Loki/Grafana.
This fits the current app setup in this repo because the services already log to stdout/stderr.

What is **not** collected automatically:

- arbitrary log files written somewhere else inside a container filesystem
- logs from external services that are not running as Kubernetes pods on this cluster

## Access

Grafana is exposed through Traefik + cert-manager at:

- `https://${GRAFANA_DOMAIN}` when bootstrapped from `scripts/hetzner/bootstrap-secrets.env`
- in the current live environment: `https://grafana.doran.133011.xyz/`

Grafana credentials come from:

- `GRAFANA_ADMIN_USERNAME`
- `GRAFANA_ADMIN_PASSWORD_PASS` or `GRAFANA_ADMIN_PASSWORD`

The recommended path is `pass`. In the current live setup the password is stored at:

- `api/hetznerk3s/grafana-admin-password`

Headlamp is exposed at:

- `https://${HEADLAMP_DOMAIN}` when bootstrapped from `scripts/hetzner/bootstrap-secrets.env`
- in the current live environment: `https://headlamp.doran.133011.xyz/`

Headlamp uses a Kubernetes service-account token for login.
Bootstrap stores the generated token in `pass` when `HEADLAMP_ADMIN_TOKEN_PASS` is set.
In the current live setup it is stored at:

- `api/hetznerk3s/headlamp-admin-token`

## Reproducible bootstrap path

The observability stack is part of the repo-managed platform layer:

- `deploy/k8s/platform/base/observability.yaml`
- `deploy/k8s/platform/base/headlamp.yaml`
- `deploy/k8s/platform/base/kustomization.yaml`
- `deploy/k8s/platform/base/namespace.yaml`
- `deploy/k8s/overlays/hetzner-single-node/storage-class.patch.yaml`
- `deploy/k8s/overlays/hetzner-single-node/kustomization.yaml`
- `deploy/k8s/overlays/hetzner-single-node/ingress-hosts.patch.yaml`
- `deploy/k8s/overlays/hetzner-single-node/secrets/observability.env.example`

Bootstrap materializes the Grafana secret from local env / `pass` and also stores the generated Headlamp login token back into `pass` when configured:

- writes `deploy/k8s/overlays/hetzner-single-node/secrets/observability.env`
- copies it into `.state/hetzner/generated-overlay/`
- applies the generated overlay
- waits for `headlamp-admin-token`
- stores that token via `HEADLAMP_ADMIN_TOKEN_PASS`

## Verify the stack

```bash
export KUBECONFIG=$PWD/.state/hetzner/kubeconfig.yaml
kubectl -n observability get pods
kubectl -n observability get pvc
kubectl -n observability get ingress
kubectl -n observability rollout status deployment/loki --timeout=300s
kubectl -n observability rollout status deployment/grafana --timeout=300s
kubectl -n observability rollout status deployment/headlamp --timeout=300s
kubectl -n observability rollout status daemonset/promtail --timeout=300s
```

## Verify logs are arriving

Generate some app logs, then query Loki directly:

```bash
export KUBECONFIG=$PWD/.state/hetzner/kubeconfig.yaml
kubectl -n observability port-forward svc/loki 3100:3100
```

In another shell:

```bash
curl -sS 'http://127.0.0.1:3100/loki/api/v1/labels' | jq
curl -G -sS 'http://127.0.0.1:3100/loki/api/v1/query' \
  --data-urlencode 'query={namespace="unrip"}' | jq
```

If those queries return labels/streams, pod logs are reaching Loki.
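To check a specific time window rather than a single point in time, the same check can be run against Loki's `query_range` endpoint. The following is a minimal sketch, not part of this repo: the helper names `logql_selector` and `loki_recent` and the `LOKI_WINDOW_MINUTES` variable are hypothetical, and it assumes the port-forward from the previous step is still running on `127.0.0.1:3100`.

```bash
# Sketch only: helper names are hypothetical and not part of this repo.
# Assumes `kubectl port-forward svc/loki 3100:3100` is running in another shell.

# Build a LogQL stream selector from a namespace ($1) and an optional
# pod-name regex ($2).
logql_selector() {
  if [ -n "${2:-}" ]; then
    printf '{namespace="%s", pod=~"%s"}\n' "$1" "$2"
  else
    printf '{namespace="%s"}\n' "$1"
  fi
}

# Fetch up to 20 log lines matching the selector from the last N minutes
# (default 15). `start` is passed as a Unix timestamp in nanoseconds.
loki_recent() {
  minutes="${LOKI_WINDOW_MINUTES:-15}"
  start_ns="$(( $(date +%s) - minutes * 60 ))000000000"
  curl -G -sS 'http://127.0.0.1:3100/loki/api/v1/query_range' \
    --data-urlencode "query=$(logql_selector "$@")" \
    --data-urlencode "start=${start_ns}" \
    --data-urlencode 'limit=20'
}
```

For example, `loki_recent unrip | jq '.data.result | length'` prints the number of matching streams, and `loki_recent unrip 'near-intents-ingest.*'` narrows the query to pods whose names match that regex.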
## Use Headlamp

1. open `https://headlamp.doran.133011.xyz/`
2. fetch the login token with:

   ```bash
   pass show api/hetznerk3s/headlamp-admin-token
   ```

3. paste that token into the Headlamp login form
4. browse namespaces, workloads, pods, and use the built-in pod log view

For this disposable cluster the generated Headlamp token is bound to `cluster-admin` so the UI can show everything.
For a production setup, replace that with narrower RBAC.

## Use Grafana

After logging into Grafana:

1. open **Explore**
2. choose the default **Loki** datasource
3. run queries like:
   - `{namespace="unrip"}`
   - `{namespace="forgejo"}`
   - `{namespace="registry"}`
   - `{pod=~"near-intents-ingest.*"}`
   - `{container="app"}`

Useful labels added by promtail:

- `namespace`
- `pod`
- `container`
- `app`
- selected `app.kubernetes.io/*` labels

## Day-to-day ops

CLI remains useful for fast debugging:

```bash
kubectl get pods -A
kubectl -n unrip logs deploy/near-intents-ingest -f
kubectl -n forgejo logs deploy/forgejo -f
bash scripts/k8s/logs.sh
```

Use Headlamp when you want:

- a web UI listing workloads and pods
- click-through pod inspection
- built-in pod log viewing
- events and resource browsing

Use Grafana when you want:

- historical log search
- cross-pod filtering
- LogQL queries
- easier multi-namespace log exploration

## Security notes

Grafana is an admin/operator surface.
For this cluster it is publicly reachable behind Grafana login.
That is acceptable for this disposable single-node setup, but for a harder production posture prefer one of:

- Tailscale-only access
- ingress auth in front of Grafana and Headlamp
- SSO/OIDC

## Add a new app and have logs show up there

Nothing special is required as long as the new pod logs to stdout/stderr.
If you deploy a new app under Kubernetes and expose it through the usual manifests/Ingress flow, promtail will scrape its pod logs automatically.
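
One way to confirm this for a new namespace is a throwaway probe pod that prints a unique marker to stdout, followed by a Loki query for that marker. The sketch below is illustrative only: the `log-probe` pod name, the `busybox` image, the marker format, the 15-second wait, and all function names are assumptions rather than anything defined in this repo. It also assumes `KUBECONFIG` is exported and the Loki port-forward on `127.0.0.1:3100` is running.

```bash
# Sketch only: pod name, image, marker format, and function names are
# hypothetical. Requires KUBECONFIG and the Loki port-forward on :3100.

# Build a unique marker string from a suffix (for example a timestamp).
probe_marker() {
  printf 'log-probe-%s\n' "$1"
}

# Build a LogQL query that matches the marker ($2) in the probe pod's
# logs within a namespace ($1).
probe_query() {
  printf '{namespace="%s", pod="log-probe"} |= "%s"\n' "$1" "$2"
}

# Run the probe pod in namespace $1, wait for promtail to ship the line,
# query Loki for it, then delete the pod.
run_log_probe() {
  ns="$1"
  marker="$(probe_marker "$(date +%s)")"
  kubectl -n "$ns" run log-probe --image=busybox --restart=Never \
    -- sh -c "echo ${marker}"
  sleep 15  # assumed to be long enough for promtail to pick up the line
  curl -G -sS 'http://127.0.0.1:3100/loki/api/v1/query_range' \
    --data-urlencode "query=$(probe_query "$ns" "$marker")"
  kubectl -n "$ns" delete pod log-probe --ignore-not-found
}
```

Running `run_log_probe unrip | jq '.data.result'` should show a stream containing the marker line if collection works for that namespace; an empty result may only mean the wait was too short, so retry with a longer sleep before drawing conclusions.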