near-intents-monitor

Production-shaped first slice of the trading system:

  • venue ingest: NEAR Intents solver-bus quote flow
  • bus: Redpanda first, Kafka-compatible by design
  • reactor: dummy decision engine emitting commands
  • executor: dummy execution worker with durable idempotency state
  • result consumer: downstream observer of execution outcomes

Canonical repo shape

src/
  apps/
    near-intents-ingest.mjs
    dummy-reactor.mjs
    dummy-executor.mjs
    dummy-consumer.mjs
  bus/
    kafka/
      producer.mjs
      consumer.mjs
  core/
    event-envelope.mjs
    executor-state-store.mjs
    log.mjs
    pair-filter.mjs
    schemas.mjs
  lib/
    config.mjs
    env.mjs
  venues/
    near-intents/
      ingest.mjs
      normalize.mjs
      ws.mjs
compose.yml
Dockerfile
docs/contracts.md
deploy/hetzner/README.md

Event flow

NEAR Intents WebSocket
        |
        +--> raw.near_intents.quote
        |
        v
norm.swap_demand
        |
        v
cmd.execute_trade
        |
        v
exec.trade_result

Core rule: services do not call each other directly for trading flow; they communicate through bus topics only.
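
To illustrate the rule, a reactor stage can be written as a handler that receives one event from norm.swap_demand and returns one message addressed to cmd.execute_trade, without ever calling the executor. This is only a sketch under stated assumptions: the event fields (demand_id, pair), the handleSwapDemand function, and the in-memory bus are hypothetical and are not taken from docs/contracts.md or the modules under src/bus/kafka.

```javascript
// Topic names as listed in this README.
const TOPICS = {
  normSwapDemand: 'norm.swap_demand',
  cmdExecuteTrade: 'cmd.execute_trade',
};

// Hypothetical reactor handler: one normalized demand in, one command out.
// It does not call the executor; it only returns a message addressed to a topic.
function handleSwapDemand(event) {
  return {
    topic: TOPICS.cmdExecuteTrade,
    value: {
      command_id: `cmd-${event.demand_id}`,          // assumed id scheme
      idempotency_key: `demand:${event.demand_id}`,  // assumed key scheme
      pair: event.pair,
    },
  };
}

// In-memory stand-in for a bus producer (assumption; the real bus is Kafka-compatible).
function createMemoryBus() {
  const log = [];
  return {
    publish(message) { log.push(message); },
    messagesOn(topic) { return log.filter((m) => m.topic === topic); },
  };
}

const demoBus = createMemoryBus();
demoBus.publish(handleSwapDemand({ demand_id: 'd1', pair: 'asset_a->asset_b' }));
console.log(demoBus.messagesOn(TOPICS.cmdExecuteTrade).length);
```

Keeping the handler free of transport code means the same function can be exercised against an in-memory bus in tests and against the real producer in a deployed service.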

Contracts

See docs/contracts.md.

Current topics:

  • raw.near_intents.quote
  • norm.swap_demand
  • cmd.execute_trade
  • exec.trade_result

Primary deployment path: repo-driven Hetzner bootstrap

The primary production path is no longer a Compose-only VM workflow.

The intended operating model is:

  • Terraform provisions a Hetzner single-node environment
  • cloud-init installs k3s automatically on first boot
  • a local operator workstation performs the first repo-driven bootstrap
  • Kubernetes manifests install Redpanda, the app workloads, Forgejo, runner, registry, and ingress-related components
  • once the in-cluster Git + CI stack is running, routine app deploys move to self-hosted CI

This is a two-phase model:

  • Phase 0: local workstation bootstrap of a brand-new cluster
  • Phase 1: self-hosted Forgejo + runner takes over app delivery

Compose still exists for local development and optional single-machine testing, but it is not the canonical production story.

Prerequisites for first deployment

Install locally on the operator workstation:

  • Terraform >= 1.6
  • kubectl
  • docker
  • curl

You also need:

  • a Hetzner Cloud API token
  • a local SSH public key file for Terraform node provisioning
  • DNS control for your chosen base domain and Forgejo hostname
  • preferably a Tailscale tailnet and auth key for private admin/control-plane access
  • the repo checked out locally

Required bootstrap secrets and inputs

Create the bootstrap env file:

cp scripts/hetzner/bootstrap-secrets.env.example scripts/hetzner/bootstrap-secrets.env

Set at least:

  • HCLOUD_TOKEN
  • SSH_PUBLIC_KEY_PATH
  • PUBLIC_DOMAIN
  • recommended:
    • TAILSCALE_AUTH_KEY
    • TAILSCALE_CONTROL_PLANE_HOSTNAME
  • optional fallback:
    • TF_ADMIN_CIDR_BLOCKS
  • BASE_DOMAIN
  • FORGEJO_DOMAIN
  • FORGEJO_ROOT_URL
  • REGISTRY_DOMAIN
  • LETSENCRYPT_EMAIL
  • REGISTRY_USERNAME
  • REGISTRY_PASSWORD
  • NEAR_INTENTS_API_KEY
  • FORGEJO_RUNNER_REGISTRATION_TOKEN
  • optional DNS automation:
    • Cloudflare:
      • CLOUDFLARE_API_TOKEN
      • CLOUDFLARE_ZONE_ID
    • Porkbun:
      • PORKBUN_API_KEY
      • PORKBUN_SECRET_API_KEY

Then load them:

source scripts/hetzner/bootstrap-secrets.env
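
Before running the bootstrap it can help to confirm that the inputs are actually set. The following is a minimal sketch, not part of the repo: findMissingVars is a hypothetical helper, and treating every top-level name in the list above (excluding the recommended and optional groups) as mandatory is an assumption.

```javascript
// Names taken from the "Set at least" list above, excluding the
// recommended and optional groups. Treating all of them as mandatory
// is an assumption made for this sketch.
const REQUIRED_BOOTSTRAP_VARS = [
  'HCLOUD_TOKEN', 'SSH_PUBLIC_KEY_PATH', 'PUBLIC_DOMAIN', 'BASE_DOMAIN',
  'FORGEJO_DOMAIN', 'FORGEJO_ROOT_URL', 'REGISTRY_DOMAIN', 'LETSENCRYPT_EMAIL',
  'REGISTRY_USERNAME', 'REGISTRY_PASSWORD', 'NEAR_INTENTS_API_KEY',
  'FORGEJO_RUNNER_REGISTRATION_TOKEN',
];

// Returns the names that are unset or contain only whitespace.
function findMissingVars(env, required = REQUIRED_BOOTSTRAP_VARS) {
  return required.filter((name) => !env[name] || env[name].trim() === '');
}

const missingInputs = findMissingVars(process.env);
if (missingInputs.length > 0) {
  console.error(`Missing bootstrap inputs: ${missingInputs.join(', ')}`);
} else {
  console.log('All required bootstrap inputs are set.');
}
```

Note that a Node process only sees variables that have been exported into its environment, so a check like this reflects what child processes of the shell would see.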

First bootstrap sequence

Run the end-to-end bootstrap from repo root:

bash scripts/hetzner/bootstrap.sh

The script currently does the following:

  1. runs Terraform in infra/terraform/hetzner
  2. optionally creates DNS records for the base, Forgejo, and registry hosts via Cloudflare or Porkbun
  3. if configured, joins the node to Tailscale and prefers the Tailscale control-plane hostname for Kubernetes API access
  4. waits for SSH and the k3s API endpoint to become ready
  5. fetches the real k3s kubeconfig from the node and writes it to .state/hetzner/kubeconfig.yaml
  6. renders the Hetzner single-node overlay from local operator inputs
  7. creates registry pull/auth secrets
  8. applies the Kubernetes bootstrap manifests
  9. builds the app image locally and imports it into k3s on the node
  10. performs the first rollout using the imported bootstrap image
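
The sequence above has two conditional steps (DNS automation and Tailscale). A rough sketch of how the effective step list depends on the inputs is shown below. planBootstrap and the step names are hypothetical, and the exact conditions bootstrap.sh uses to enable each optional step are an assumption here.

```javascript
// Hypothetical planner: returns ordered step names for a given set of inputs.
// The enabling conditions are assumptions, not read from bootstrap.sh.
function planBootstrap(env) {
  const hasCloudflare = Boolean(env.CLOUDFLARE_API_TOKEN && env.CLOUDFLARE_ZONE_ID);
  const hasPorkbun = Boolean(env.PORKBUN_API_KEY && env.PORKBUN_SECRET_API_KEY);
  const hasTailscale = Boolean(env.TAILSCALE_AUTH_KEY);

  const steps = ['terraform-apply'];
  if (hasCloudflare || hasPorkbun) steps.push('create-dns-records'); // step 2, optional
  if (hasTailscale) steps.push('join-tailscale');                    // step 3, if configured
  steps.push(
    'wait-for-ssh-and-k3s-api',
    'fetch-kubeconfig',
    'render-overlay',
    'create-registry-secrets',
    'apply-bootstrap-manifests',
    'build-and-import-image',
    'first-rollout',
  );
  return steps;
}

console.log(planBootstrap({ TAILSCALE_AUTH_KEY: 'tskey-example' }).join('\n'));
```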

Use the generated kubeconfig afterward:

export KUBECONFIG="$PWD/.state/hetzner/kubeconfig.yaml"
kubectl get nodes -o wide
kubectl get pods -A
kubectl -n unrip get deploy,pods
kubectl -n forgejo get deploy,pods,svc

What is deployed into k3s

The repo-managed Kubernetes assets are under deploy/k8s/.

Current single-node target includes resources for:

  • unrip workloads in namespace unrip
  • Redpanda
  • Forgejo
  • Forgejo runner
  • private registry
  • ingress-nginx namespace/resources
  • cert-manager namespace/resources
  • ACME issuers and ingress definitions
  • a bootstrap job for Redpanda topic creation

Shared platform namespaces:

  • forgejo
  • registry
  • ingress-nginx
  • cert-manager

Project-specific namespaces:

  • unrip
  • future projects should get their own namespace rather than sharing unrip

Important current-state nuance:

  • the bootstrap script currently applies deploy/k8s/base
  • the longer-term intended target is deploy/k8s/overlays/hetzner-single-node

Executor persistence in k3s

The executor is stateful by design because it persists idempotency/execution tracking.

Current persistence boundary:

  • app env uses EXECUTOR_STATE_DIR=/var/lib/unrip/executor-state
  • in Kubernetes, the executor deployment mounts storage at that path
  • the Hetzner single-node overlay pins storage to the k3s local-path storage class
  • cloud-init also prepares the host directory boundary for executor state on first boot

Operational meaning:

  • executor state lives on node-backed storage in the single-node k3s environment
  • if that PVC or underlying node storage is lost, duplicate-suppression history is lost too
  • treat executor persistence as part of the minimal durable state of the cluster

Failure recovery and operator checks

If bootstrap fails before Terraform completes

Re-run after fixing the local input problem:

  • missing token
  • invalid CIDRs
  • invalid SSH public key path

If the infrastructure must be torn down:

source scripts/hetzner/bootstrap-secrets.env
bash scripts/hetzner/destroy.sh

If Terraform succeeds but Kubernetes is not ready

Check the Kubernetes API and cluster state from the workstation:

export KUBECONFIG="$PWD/.state/hetzner/kubeconfig.yaml"
kubectl get nodes -o wide
kubectl get pods -A
kubectl get events -A --sort-by=.lastTimestamp | tail -n 50

Typical causes to rule out:

  • cloud-init may still be finishing
  • k3s may still be starting
  • a workload may be crash-looping due to missing secret values or image-delivery issues

If workloads do not roll out

Inspect the affected namespace:

kubectl -n unrip get pods
kubectl -n unrip describe pod <pod-name>
kubectl -n unrip logs deploy/dummy-executor --tail=100
kubectl -n forgejo logs deploy/forgejo --tail=100

If you need to recreate secrets

The workstation bootstrap creates these Secrets:

  • unrip/unrip-secrets
  • forgejo/forgejo-secrets

Verify them:

kubectl -n unrip get secret unrip-secrets
kubectl -n forgejo get secret forgejo-secrets

Current known limitations

One important gap has already been identified:

  • bootstrap and CI are not yet fully production-hardened, even though the first deploy path now fetches the real kubeconfig and imports the bootstrap image directly into k3s

Treat the current bootstrap as a repo-driven first-deploy path suitable for testing, with hardening still pending.

Self-hosted CI handoff

After cluster bootstrap:

  • open Forgejo at https://${FORGEJO_DOMAIN}
  • seed or push this repo into Forgejo
  • create Forgejo repository secrets:
    • KUBECONFIG_B64
    • REGISTRY_USERNAME
    • REGISTRY_PASSWORD
  • create Forgejo repository variables:
    • REGISTRY_HOST=${REGISTRY_DOMAIN}
    • optional: PROJECT_NAME=unrip
    • optional: PROJECT_NAMESPACE=unrip
    • optional: PROJECT_DEPLOYMENTS=near-intents-ingest,dummy-reactor,dummy-executor,dummy-consumer
  • push to main
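
The name KUBECONFIG_B64 suggests the secret holds the kubeconfig as base64 text. The sketch below shows that encoding and its round trip. It assumes plain base64 of the file contents, which this README does not state explicitly; encodeKubeconfig and decodeKubeconfig are hypothetical helpers, and the sample text is a placeholder rather than a real kubeconfig.

```javascript
// Encode kubeconfig text as single-line base64 (assumed format for KUBECONFIG_B64).
function encodeKubeconfig(text) {
  return Buffer.from(text, 'utf8').toString('base64');
}

// Decode it back, as a CI job would presumably do before calling kubectl.
function decodeKubeconfig(b64) {
  return Buffer.from(b64, 'base64').toString('utf8');
}

// Placeholder content; in practice this would be the text of
// .state/hetzner/kubeconfig.yaml.
const sampleKubeconfig = 'apiVersion: v1\nkind: Config\n';
const encodedKubeconfig = encodeKubeconfig(sampleKubeconfig);

console.log(encodedKubeconfig);
console.log(decodeKubeconfig(encodedKubeconfig) === sampleKubeconfig);
```

A single-line value avoids problems with multi-line input in secret forms. Check docs/hetzner-self-hosted-ci-runbook.md for the format the workflow actually expects.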

Routine application deploys then follow .forgejo/workflows/deploy.yml:

  • build image as REGISTRY_HOST/PROJECT_NAME:${GIT_SHA}
  • push to the private registry
  • kubectl set image for each deployment listed in PROJECT_DEPLOYMENTS inside PROJECT_NAMESPACE
  • wait for rollout

If project variables are omitted, the workflow defaults to the current repo project:

  • PROJECT_NAME=unrip
  • PROJECT_NAMESPACE=unrip
  • PROJECT_DEPLOYMENTS=near-intents-ingest,dummy-reactor,dummy-executor,dummy-consumer
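
The deploy steps and defaults described above can be sketched as a function that resolves the variables and produces the kubectl commands. This is an illustration, not the contents of .forgejo/workflows/deploy.yml: buildDeployCommands is hypothetical, and the '*' container selector is an assumption, since the real workflow may name containers explicitly.

```javascript
// Defaults as documented in this README.
const DEPLOY_DEFAULTS = {
  PROJECT_NAME: 'unrip',
  PROJECT_NAMESPACE: 'unrip',
  PROJECT_DEPLOYMENTS: 'near-intents-ingest,dummy-reactor,dummy-executor,dummy-consumer',
};

// Hypothetical helper: resolves variables and returns the image reference
// plus the kubectl commands a deploy job would run.
function buildDeployCommands(vars, gitSha) {
  if (!vars.REGISTRY_HOST) throw new Error('REGISTRY_HOST is required');
  const pick = (key) => vars[key] || DEPLOY_DEFAULTS[key];

  const namespace = pick('PROJECT_NAMESPACE');
  const image = `${vars.REGISTRY_HOST}/${pick('PROJECT_NAME')}:${gitSha}`;
  const deployments = pick('PROJECT_DEPLOYMENTS')
    .split(',')
    .map((name) => name.trim())
    .filter(Boolean);

  const commands = [];
  for (const name of deployments) {
    // '*' updates every container in the deployment (assumption for this sketch).
    commands.push(`kubectl -n ${namespace} set image deployment/${name} '*=${image}'`);
    commands.push(`kubectl -n ${namespace} rollout status deployment/${name}`);
  }
  return { image, commands };
}

const deployPlan = buildDeployCommands({ REGISTRY_HOST: 'registry.example.com' }, 'abc123');
console.log(deployPlan.commands.join('\n'));
```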

Infrastructure changes remain Terraform-driven from the operator workstation unless and until that responsibility is also automated.

For the detailed operator runbooks, see:

  • docs/hetzner-k3s-bootstrap.md
  • docs/hetzner-self-hosted-ci-runbook.md
  • deploy/k8s/projects/README.md
  • docs/next-session-architecture.md

Local development with Compose

Compose remains available for local development and debugging.

npm install
cp .env.example .env
# edit .env

docker compose build
docker compose up -d

Useful commands:

docker compose ps
docker compose logs -f
docker compose logs -f near-intents-ingest dummy-reactor dummy-executor dummy-consumer
docker compose restart dummy-executor
docker compose down       # stop and remove containers and networks
docker compose down -v    # also remove volumes declared in compose.yml

Individual services

npm run near-intents:ingest
npm run dummy-reactor
npm run dummy-executor
npm run dummy-consumer

Optional pair filter:

npm run near-intents:ingest -- --pair 'asset_a->asset_b'
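
A filter string in this form can be parsed roughly as follows. This is a sketch and not the code in src/core/pair-filter.mjs: parsePairFilter and matchesPair are hypothetical, and the event field names asset_in and asset_out are assumptions.

```javascript
// Parse a filter of the form 'asset_a->asset_b' (format taken from the example above).
function parsePairFilter(text) {
  const parts = text.split('->').map((part) => part.trim());
  if (parts.length !== 2 || parts.some((part) => part === '')) {
    throw new Error(`Invalid pair filter: ${text}`);
  }
  return { from: parts[0], to: parts[1] };
}

// Hypothetical matcher: the event field names are assumptions.
function matchesPair(filter, event) {
  return event.asset_in === filter.from && event.asset_out === filter.to;
}

const pairFilter = parsePairFilter('asset_a->asset_b');
console.log(matchesPair(pairFilter, { asset_in: 'asset_a', asset_out: 'asset_b' }));
console.log(matchesPair(pairFilter, { asset_in: 'asset_b', asset_out: 'asset_a' }));
```

The matcher here is direction-sensitive. Whether the real filter also matches the reverse direction is not stated in this README.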

Idempotent executor behavior

  • every command has a command_id
  • commands carry idempotency_key and execution_key
  • executor persists state under EXECUTOR_STATE_DIR
  • completed commands are skipped after restart or replay
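
The behavior above can be sketched with a small check-then-record wrapper. This is an illustration only: the in-memory store stands in for whatever src/core/executor-state-store.mjs persists under EXECUTOR_STATE_DIR, and createStateStore and executeOnce are hypothetical names.

```javascript
// In-memory stand-in for the durable store (assumption: the real store
// writes to disk under EXECUTOR_STATE_DIR).
function createStateStore(initialKeys = []) {
  const completed = new Set(initialKeys);
  return {
    isCompleted: (key) => completed.has(key),
    markCompleted: (key) => { completed.add(key); },
    snapshot: () => [...completed],
  };
}

// Run a command at most once per idempotency_key, as far as the store remembers.
function executeOnce(store, command, execute) {
  if (store.isCompleted(command.idempotency_key)) {
    return { command_id: command.command_id, status: 'skipped' };
  }
  const result = execute(command);
  store.markCompleted(command.idempotency_key); // recorded only after success
  return { command_id: command.command_id, status: 'executed', result };
}

let executions = 0;
const fakeExecute = () => { executions += 1; return 'ok'; };
const command = { command_id: 'c1', idempotency_key: 'k1' };

const store = createStateStore();
console.log(executeOnce(store, command, fakeExecute).status);     // executed
console.log(executeOnce(store, command, fakeExecute).status);     // skipped (replay)

const restarted = createStateStore(store.snapshot());             // state survived restart
console.log(executeOnce(restarted, command, fakeExecute).status); // skipped

const wiped = createStateStore();                                 // state lost
console.log(executeOnce(wiped, command, fakeExecute).status);     // executed again
```

The last case shows why the state directory counts as durable state: once the record is gone, a replayed command runs again. In this sketch the key is recorded after execution, so a crash between the two steps would also lead to a repeat on replay.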

Env

NEAR_INTENTS_API_KEY=your_solver_jwt
NEAR_INTENTS_WS_URL=wss://solver-relay-v2.chaindefuser.com/ws
KAFKA_BROKERS=redpanda:9092
KAFKA_CLIENT_ID=unrip
KAFKA_TOPIC_RAW_NEAR_INTENTS_QUOTE=raw.near_intents.quote
KAFKA_TOPIC_NORM_SWAP_DEMAND=norm.swap_demand
KAFKA_TOPIC_CMD_EXECUTE_TRADE=cmd.execute_trade
KAFKA_TOPIC_EXEC_TRADE_RESULT=exec.trade_result
KAFKA_CONSUMER_GROUP_DUMMY=dummy-reactor-v1
KAFKA_CONSUMER_GROUP_EXECUTOR=dummy-executor-v1
EXECUTOR_STATE_DIR=/var/lib/unrip/executor-state
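
These variables could be read into a config object along the following lines. This is a sketch, not the contents of src/lib/config.mjs: loadKafkaConfig is hypothetical, and using the values listed above as fallbacks, as well as accepting a comma-separated broker list, are assumptions.

```javascript
// Hypothetical loader; fallback values mirror the example values listed above.
function loadKafkaConfig(env) {
  return {
    // Comma-separated broker list is an assumption (a common convention).
    brokers: (env.KAFKA_BROKERS || 'redpanda:9092')
      .split(',')
      .map((broker) => broker.trim())
      .filter(Boolean),
    clientId: env.KAFKA_CLIENT_ID || 'unrip',
    topics: {
      rawQuote: env.KAFKA_TOPIC_RAW_NEAR_INTENTS_QUOTE || 'raw.near_intents.quote',
      swapDemand: env.KAFKA_TOPIC_NORM_SWAP_DEMAND || 'norm.swap_demand',
      executeTrade: env.KAFKA_TOPIC_CMD_EXECUTE_TRADE || 'cmd.execute_trade',
      tradeResult: env.KAFKA_TOPIC_EXEC_TRADE_RESULT || 'exec.trade_result',
    },
    executorStateDir: env.EXECUTOR_STATE_DIR || '/var/lib/unrip/executor-state',
  };
}

console.log(JSON.stringify(loadKafkaConfig(process.env), null, 2));
```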