# Kubernetes Deployment
Status: draft runtime package for Checkpoint 8G

This document describes the Kubernetes package for the Polymarket raw
order-book collector. It follows the shared Hetzner k3s cluster model from
`../nuri/unrip3`: application code, Dockerfile, manifests, and Forgejo workflow
live in this repository; platform services, the shared registry, and the shared
Forgejo runner remain platform-owned.

This package does not claim production readiness. Production readiness still
requires a real Kubernetes runtime smoke run with preserved evidence.
## Cluster Decisions
- Namespace: `orderbooks`
- Workstation kubeconfig for validation: `../nuri/unrip3/.state/hetzner/kubeconfig.yaml`
- Shared registry and shared Forgejo runner
- Existing rclone Secret: `orderbooks/orderbooks-rclone-config`
- Secret key mounted by the uploader: `rclone.conf`

Do not commit or print rclone config contents.
## Runtime Layout
The collector and uploader share one PVC:
```text
PVC: orderbooks-data
mount: /var/lib/orderbooks
raw files: /var/lib/orderbooks/raw_orderbooks
manifests: /var/lib/orderbooks/manifests
discovery: /var/lib/orderbooks/discovery
```
The REST snapshot collector uses one Deployment with one replica. The container
runs `/app/scripts/run_polymarket_collector_loop.sh`, which repeatedly executes
the existing bounded collector cycle and records loop failure/interruption
manifests instead of relying on Kubernetes crash loops for normal operation.

The websocket recorder canary uses a separate Deployment named
`orderbooks-ws-recorder`. It runs `/app/scripts/run_polymarket_ws_recorder_loop.sh`
and does not replace or stop `orderbooks-collector`. It writes raw websocket
archives under `/var/lib/orderbooks/raw_orderbooks/polymarket/ws_raw/`, REST
checkpoint archives under `/var/lib/orderbooks/raw_orderbooks/polymarket/rest_checkpoints/`,
and runtime manifests under `/var/lib/orderbooks/manifests/`.

The uploader uses one CronJob that runs the existing rclone uploader in execute
mode against the same PVC. It mounts `orderbooks-rclone-config` read-only at
`/etc/rclone/rclone.conf` and sets `RCLONE_CONFIG` to that file. It uploads only
closed/aged files, skips `.open`/temporary writer files, and uses
`--cleanup-after-verify`: local cleanup is allowed only after rclone copy and
check succeed. The Kubernetes retention setting is 3 days because websocket raw
capture is materially larger than REST snapshots and the current PVC is 10Gi.
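
The collector loop described earlier in this section records failures as manifests instead of letting the container crash. A minimal sketch of that pattern follows. It is not the real `run_polymarket_collector_loop.sh`: the function name, arguments, and the manifest fields are assumptions made for illustration, and interruption (SIGTERM) handling is omitted.

```sh
# Hypothetical sketch of a supervise-and-record loop; all names are illustrative.

# run_cycles MANIFEST_DIR MAX_CYCLES SLEEP_SECONDS CMD [ARGS...]
# Runs CMD up to MAX_CYCLES times. A failing cycle is recorded as a small JSON
# manifest and the loop continues instead of exiting.
run_cycles() {
  manifest_dir=$1
  max_cycles=$2
  sleep_seconds=$3
  shift 3
  mkdir -p "$manifest_dir"
  cycle=0
  while [ "$cycle" -lt "$max_cycles" ]; do
    cycle=$((cycle + 1))
    rc=0
    "$@" || rc=$?
    if [ "$rc" -ne 0 ]; then
      ts=$(date -u +%Y%m%dT%H%M%SZ)
      # Assumed manifest shape; the real loop manifest may differ.
      printf '{"event":"loop_failure","cycle":%d,"exit_code":%d,"utc":"%s"}\n' \
        "$cycle" "$rc" "$ts" > "$manifest_dir/loop_failure_${ts}_${cycle}.json"
    fi
    sleep "$sleep_seconds"
  done
}

# Usage (hypothetical cycle command):
#   run_cycles /var/lib/orderbooks/manifests 10 5 ./one_collector_cycle.sh
```

Keeping the process alive across failed cycles means a pod restart signals something abnormal rather than a routine upstream error.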
## Bootstrap This App Repo
Run the orderbooks-specific bootstrap from this repository:
```sh
scripts/deploy/bootstrap_orderbooks_k8s.sh
```
The bootstrap loads platform defaults and resolved secrets from the local
platform state without printing secret values. It ensures namespace `orderbooks`,
creates or updates `orderbooks-registry-creds`, verifies the existing
`orderbooks-rclone-config` secret has key `rclone.conf`, creates or updates the
Forgejo repo `philipp/orderbooks`, and upserts the required Actions secret and
variables.

After bootstrap, push a clean source tree to Forgejo `main`. Do not push local
`data/`, `artifacts/`, `reports/`, `orchestration/`, kubeconfigs, rclone config,
`.env`, private keys, or other local evidence/secrets.
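
One way to check a tree before pushing is to filter the tracked file list against the exclusions above. The sketch below is an assumption-laden example, not part of the bootstrap: the function name and the exact patterns (including treating `.pem` and `.key` as private-key suffixes) are illustrative.

```sh
# find_forbidden_paths: reads repo-relative paths on stdin and prints those
# that look like local evidence or secrets. Patterns are assumptions derived
# from the exclusion list above.
find_forbidden_paths() {
  grep -E '^(data|artifacts|reports|orchestration)/|(^|/)\.env$|kubeconfig|rclone\.conf|\.(pem|key)$' || true
}

# Usage (assumes a git checkout); empty output means nothing matched:
#   git ls-files | find_forbidden_paths
```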
## Image Build And Deploy
The Forgejo workflow is `.forgejo/workflows/deploy.yml`. It follows the shared
runner pattern:
1. load `KUBECONFIG_B64` from Forgejo secrets;
2. clone this repo inside the runner;
3. create an in-cluster Kaniko Job;
4. build and push `REGISTRY_HOST/orderbooks:<git-sha>`;
5. apply `deploy/k8s/base` with the built image;
6. wait for `deployment/orderbooks-collector` and `deployment/orderbooks-ws-recorder` rollout.

Required Forgejo repo secret:
```text
KUBECONFIG_B64
```
Required Forgejo repo variable:
```text
REGISTRY_HOST
```
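
Steps 1 and 4 of the workflow can be sketched with two small helpers. This is only an illustration under assumptions: the function names are hypothetical, and the actual `.forgejo/workflows/deploy.yml` may decode the secret and compose the tag differently.

```sh
# image_ref REGISTRY_HOST GIT_SHA
# Prints the image reference in the REGISTRY_HOST/orderbooks:<git-sha> form.
image_ref() {
  printf '%s/orderbooks:%s\n' "$1" "$2"
}

# write_kubeconfig B64_VALUE TARGET_FILE
# Decodes a base64-encoded kubeconfig into a file that only the current user
# can read. The value is never echoed to the terminal or logs.
write_kubeconfig() {
  ( umask 077 && printf '%s' "$1" | base64 -d > "$2" )
}

# Usage inside a runner step (variable names as listed above):
#   write_kubeconfig "$KUBECONFIG_B64" "$HOME/kubeconfig.yaml"
#   image_ref "$REGISTRY_HOST" "$(git rev-parse HEAD)"
```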
Project defaults used by the workflow:
```text
PROJECT_NAME=orderbooks
PROJECT_NAMESPACE=orderbooks
PROJECT_DEPLOYMENTS=orderbooks-collector,orderbooks-ws-recorder
PROJECT_REGISTRY_SECRET_NAME=orderbooks-registry-creds
```
The registry pull/build secret `orderbooks-registry-creds` must exist in the
`orderbooks` namespace before the workflow builds and deploys.

Pushes to `main` are intentionally non-deploying during the websocket canary
work. `workflow_dispatch` remains the broad release path and may roll both
Deployments listed in `PROJECT_DEPLOYMENTS`. Do not use that broad workflow for
websocket-only canary evidence.
## Websocket Canary-Only Deploy Path
Checkpoint 10D1 uses `scripts/deploy/deploy_ws_canary_kaniko.sh` for the
websocket canary. The helper builds an image from the committed Forgejo `main`
SHA with an in-cluster Kaniko Job, then applies only:
```text
namespace.yaml
configmap.yaml
pvc.yaml
cronjob-uploader.yaml
deployment-ws-recorder.yaml
```
It does not apply `deployment-collector.yaml`, does not set the
`orderbooks-collector` image, and waits only for
`deployment/orderbooks-ws-recorder`. Validate the scoped apply set first:
```sh
KUBECONFIG=../nuri/unrip3/.state/hetzner/kubeconfig.yaml \
scripts/deploy/deploy_ws_canary_kaniko.sh --server-dry-run
```
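
As a rough illustration of the scoped apply set, the sketch below enumerates only the five manifests listed above. It assumes the files sit directly under `deploy/k8s/base`; the function name is hypothetical, and the real helper may render manifests and set the image in a different way.

```sh
# ws_canary_manifests BASE_DIR
# Prints one manifest path per line for the websocket canary apply set.
# deployment-collector.yaml is deliberately absent.
ws_canary_manifests() {
  for name in namespace configmap pvc cronjob-uploader deployment-ws-recorder; do
    printf '%s/%s.yaml\n' "$1" "$name"
  done
}

# Usage (requires kubectl and a valid KUBECONFIG):
#   ws_canary_manifests deploy/k8s/base | while read -r f; do
#     kubectl apply --dry-run=server -f "$f"
#   done
```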
After a clean source-only commit has been pushed to Forgejo `main`, deploy the
canary with:
```sh
KUBECONFIG=../nuri/unrip3/.state/hetzner/kubeconfig.yaml \
scripts/deploy/deploy_ws_canary_kaniko.sh --git-ref "$(git rev-parse HEAD)"
```
The helper writes compact deploy evidence under
`data/manifests/ws_canary_deploy_<UTC_TIMESTAMP>.json`.
## Websocket Recorder Canary
Checkpoint 10D adds the websocket recorder as a canary, not as a replacement for
the REST snapshot collector. The canary subscribes to public Polymarket market
websocket messages for active BTC Up/Down token IDs, preserves every websocket
text payload exactly in `raw_text`, and keeps periodic REST `/books` checkpoints
for recovery and divergence evidence.

The script and example config default to `market_limit: 0`, which means all
discovered active BTC Up/Down markets. The Kubernetes canary config currently
sets `market_limit: 2`, `manifest_write_interval_seconds: 60`,
`first_message_timeout_seconds: 90`, and `stale_feed_threshold_seconds: 90` as
explicit smoke/safety settings. The 10D local bounded run wrote about 3.35 MB of
compressed websocket data in two minutes for two markets; running all active BTC
markets on the current 10Gi PVC needs a separate sizing or retention decision
before removing the cap. Do not use a cap silently in
production evidence.
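
The sizing concern can be made concrete with a back-of-the-envelope extrapolation. At 3.35 MB per two minutes, two markets produce roughly 2412 MB per day, or about 7236 MB over a 3-day retention window, which is around two thirds of a 10Gi volume (about 10737 MB). The sketch below reproduces that arithmetic. It assumes a constant write rate, treats MB as 10^6 bytes, ignores REST checkpoint archives and manifests, and assumes nothing is cleaned up before the retention window ends; the function names are hypothetical.

```sh
# estimate_mb_per_day SAMPLE_MB SAMPLE_MINUTES
# Extrapolates a short sample to a full day (1440 minutes).
estimate_mb_per_day() {
  awk -v mb="$1" -v min="$2" 'BEGIN { printf "%.0f\n", mb / min * 1440 }'
}

# retained_mb MB_PER_DAY RETENTION_DAYS
# Worst-case local footprint if every file stays for the whole window.
retained_mb() {
  awk -v d="$1" -v n="$2" 'BEGIN { printf "%.0f\n", d * n }'
}

# Usage with the 10D sample and the 3-day retention setting:
#   retained_mb "$(estimate_mb_per_day 3.35 2)" 3
```

Scaling the same rate linearly to more than two or three markets would exceed the current volume under these assumptions, which is why removing the cap needs its own decision.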
Raw/current file safety:
- completed archives end in `.jsonl.gz`;
- the recorder writes current gzip files with a hidden `.open` name and renames
them only after close;
- the uploader skips `.open`, `.tmp`, and `.partial` files;
- verified cleanup deletes local files only after rclone verification succeeds.
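
The naming rules above amount to a simple filename filter. The following sketch shows one possible form of it. It is not the uploader's actual code: the function name is hypothetical, the exact shape of the hidden `.open` name is assumed, and the age check the uploader also applies is left out.

```sh
# is_uploadable_name PATH
# Returns 0 only for names that look like completed archives.
is_uploadable_name() {
  base=${1##*/}
  case "$base" in
    # In-progress or temporary writer files are never candidates.
    *.open|*.open.*|*.tmp|*.partial) return 1 ;;
    # Completed archives end in .jsonl.gz.
    *.jsonl.gz) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   is_uploadable_name "$file" && echo "candidate: $file"
```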
## Pre-Deploy Validation
From this repository:
```sh
bash -n scripts/run_polymarket_collector_loop.sh
bash -n scripts/run_polymarket_ws_recorder_loop.sh
bash -n scripts/k8s_runtime_smoke_check.sh
bash -n scripts/k8s_ws_runtime_smoke_check.sh
python -m py_compile scripts/collect_polymarket_ws_orderbooks.py
kubectl kustomize deploy/k8s/base
KUBECONFIG=../nuri/unrip3/.state/hetzner/kubeconfig.yaml kubectl apply -k deploy/k8s/base --dry-run=server
KUBECONFIG=../nuri/unrip3/.state/hetzner/kubeconfig.yaml kubectl -n orderbooks get secret orderbooks-rclone-config -o go-template='{{if index .data "rclone.conf"}}rclone_secret_key_present{{else}}rclone_secret_key_missing{{end}}{{"\n"}}'
```
The last command checks only whether the key exists. It must not print secret
data.
## Runtime Smoke Gate
After the image is built and the workload is actually deployed, run:
```sh
KUBECONFIG=../nuri/unrip3/.state/hetzner/kubeconfig.yaml \
scripts/k8s_runtime_smoke_check.sh \
  --namespace orderbooks \
  --deployment orderbooks-collector \
  --cronjob orderbooks-uploader \
  --raw-dir /var/lib/orderbooks/raw_orderbooks \
  --manifest-dir /var/lib/orderbooks/manifests \
  --wait-seconds 1800 \
  --upload-min-age-seconds 600
```
The smoke gate uses `kubectl`, not systemd. It writes local JSON evidence under
`data/manifests/k8s_runtime_smoke_<UTC_TIMESTAMP>.json` by default. It verifies:
- collector pod is running;
- latest collector manifest has `gate_status: PASS`, `rows_written > 0`, and
`failure_count: 0`;
- raw gzip JSONL parses and is under `/var/lib/orderbooks/raw_orderbooks`;
- deleting the collector pod does not corrupt the old raw file checksum or row
count;
- a later post-restart collector cycle writes valid rows;
- an uploader Job created from the CronJob completes;
- the latest upload manifest records a verified rclone upload with at least one
verified file.

A failed smoke run still writes JSON evidence and exits nonzero. Preserve failed
manifests, raw files, upload manifests, and pod logs for review.
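
The "pod deletion does not corrupt the old raw file" check boils down to comparing a fingerprint taken before and after the restart. A minimal sketch of such a fingerprint is shown below. It is an assumption about how the comparison could be done, not the smoke script's implementation; the function name is hypothetical and it relies on `sha256sum` and `gzip` being available.

```sh
# archive_fingerprint FILE
# Prints "<sha256> <row_count>" for a gzip JSONL archive, or fails if the
# file is not a valid gzip stream.
archive_fingerprint() {
  gzip -t "$1" || return 1
  sum=$(sha256sum "$1" | awk '{print $1}') || return 1
  rows=$(gzip -dc "$1" | wc -l | tr -d ' ') || return 1
  printf '%s %s\n' "$sum" "$rows"
}

# Usage: capture before the pod is deleted, compare afterwards.
#   before=$(archive_fingerprint "$file")
#   ... restart happens ...
#   [ "$(archive_fingerprint "$file")" = "$before" ] || echo "archive changed"
```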
## Websocket Reliability Observation
After deploying a websocket recorder reliability fix, run a read-only bounded
observation before treating the canary as unattended:
```sh
KUBECONFIG=../nuri/unrip3/.state/hetzner/kubeconfig.yaml \
scripts/k8s_ws_reliability_check.sh --wait-seconds 1800
```
The observation fails if websocket message counts and archive mtimes do not
advance while active tokens exist, if REST checkpoints stop succeeding, if parse
errors appear, or if reconnect/stale counters grow rapidly without recovery. It
also records the REST collector image/readiness before and after the observation.
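
The pass/fail conditions above are comparisons between two samples of the same counters. The sketch below shows the idea with two hypothetical helpers. The growth budget is an invented parameter for illustration; the thresholds used by `scripts/k8s_ws_reliability_check.sh` are not documented here.

```sh
# counter_advanced BEFORE AFTER
# Succeeds when a monotonically increasing counter (for example a message
# count) moved forward between two samples.
counter_advanced() {
  [ "$2" -gt "$1" ]
}

# counter_within_budget BEFORE AFTER MAX_DELTA
# Succeeds when a counter that should stay quiet (for example reconnects)
# grew by at most MAX_DELTA. MAX_DELTA is a hypothetical threshold.
counter_within_budget() {
  [ $(( $2 - $1 )) -le "$3" ]
}

# Usage:
#   counter_advanced "$msgs_before" "$msgs_after" || echo "feed stalled"
#   counter_within_budget "$reconnects_before" "$reconnects_after" 3 || echo "unstable"
```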
## Not Included
- No trading, signing, wallets, private keys, or API keys.
- No dashboard, database, strategy, backtest, or second-market connector.
- No websocket rewrite.
- No rclone config contents in this repository.
## Websocket Canary Smoke Gate
After the canary image is deployed and has run long enough to close at least one
websocket and REST checkpoint archive, run:
```sh
KUBECONFIG=../nuri/unrip3/.state/hetzner/kubeconfig.yaml \
scripts/k8s_ws_runtime_smoke_check.sh \
  --namespace orderbooks \
  --deployment orderbooks-ws-recorder \
  --rest-deployment orderbooks-collector \
  --cronjob orderbooks-uploader \
  --wait-seconds 900 \
  --upload-min-age-seconds 600
```
The smoke gate verifies:
- the websocket pod is running;
- raw websocket gzip JSONL parses;
- REST checkpoint gzip JSONL parses;
- manifests expose reconnect/stale and divergence counters;
- pod deletion/restart does not corrupt the prior closed raw file or, when no
prior closed file exists, produces a SIGTERM-closed archive;
- a later pod writes new data;
- the existing REST collector remains healthy.

For upload evidence it creates a one-off uploader Job from the deployed image
and the same PVC/secret with `ORDERBOOKS_UPLOAD_MIN_AGE_SECONDS=0`, then verifies
that the upload manifest has `UPLOAD_VERIFIED`, `gate_status: PASS`, and at least
one verified websocket recorder raw or REST checkpoint file. Production CronJob
upload min age remains 600 seconds.