Agent Instructions
Project: Cross-Market Live Orderbook Archive
This repository exists to preserve live market microstructure data that is usually lost: order books, spreads, liquidity, depth, timestamps, request metadata, and enough raw context to later decide whether a trading idea was observable, fillable, and reproducible at the time.
The first market is Polymarket. Future markets may include NEAR-related venues and other prediction or crypto markets, but do not build generic multi-market infrastructure before the second market exists.
Active Collaboration Model
This project uses a two-role workflow:
- orchestrator: coordinates checkpoints with the user, keeps scope narrow, records decisions, reviews evidence, states gates, and decides the next smallest step.
- builder: works in a separate session to implement the active checkpoint artifacts, run commands, collect evidence, and write manifests/reports.
The current primary chat session is the orchestrator. The orchestrator should not silently become the builder unless the user explicitly asks. The builder should treat AGENTS.md, ROADMAP.md, docs/METHODOLOGY.md, and the active checkpoint report as the durable source of instructions.
Hand-offs between orchestrator and builder must be written to disk under orchestration/ or reports/checkpoints/ when they contain decisions, scope changes, endpoint findings, or validation results. Chat-only instructions are not enough for project-critical state.
Non-Negotiable Rules
- Preserve raw data first. Raw API and websocket payloads are the source of truth. Derived datasets are secondary and must reference raw files.
- No trading. Do not add order placement, signing, private-key handling, wallet logic, strategy execution, or bot behavior.
- No secrets in the repo. Never commit API keys, rclone credentials, wallet material, cookies, or private endpoints.
- Every checkpoint needs durable evidence on disk: code or docs, config or run instructions, manifest/report, and validation evidence.
- Do not claim success without commands, outputs, files, checksums, or real collected data to support the claim.
- Do not delete mistakes. If an artifact is wrong, misleading, partial, or deprecated, preserve it and label it with a reason and replacement.
- Keep the scope narrow. No dashboard, database, ML, strategy, backtest, or generic framework until the roadmap gate allows it.
- Public data only, unless a later checkpoint explicitly documents why authenticated access to public data is required.
- "Production-ready" is forbidden until the collector has completed a documented 24h soak test with acceptable quality.
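The raw-first and evidence rules above can be illustrated with a minimal sketch. It is not the project's collector: the function names `write_raw_payload` and `derived_record`, the file naming, and the field names are all hypothetical, chosen only to show raw bytes being written untouched and a derived row pointing back to them by path and checksum.

```python
import hashlib
from pathlib import Path


def write_raw_payload(raw_dir: Path, name: str, payload: bytes) -> dict:
    """Store the payload bytes exactly as received and return a reference to them."""
    raw_dir.mkdir(parents=True, exist_ok=True)
    raw_path = raw_dir / name
    raw_path.write_bytes(payload)  # no parsing or re-serialization before writing
    return {
        "raw_path": raw_path.as_posix(),
        "sha256": hashlib.sha256(payload).hexdigest(),
        "size_bytes": len(payload),
    }


def derived_record(raw_ref: dict, fields: dict) -> dict:
    """Build a derived row that references the raw file it was computed from."""
    return {**fields, "raw_path": raw_ref["raw_path"], "raw_sha256": raw_ref["sha256"]}
```

Recording the checksum at write time means a later report can cite a file and a hash instead of an unsupported claim, and any derived row can be traced to the exact bytes it came from.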
Expected Workflow
For each checkpoint:
- Define the smallest useful checkpoint.
- Build only what is needed for that checkpoint.
- Validate with real commands and, when applicable, real public data.
- Write a machine-readable manifest and a short markdown note.
- State PASS, FAIL, or BLOCKED.
- Identify the strongest fake-progress risk.
- Recommend the next smallest step.
- Stop only when a real user or orchestrator decision is needed.
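As a rough sketch of the machine-readable manifest this workflow calls for, the snippet below builds and writes one. The manifest schema is an assumption: the field names, the rule that PASS requires at least one evidence path, and the helper names `build_checkpoint_manifest` and `write_manifest` are hypothetical and not defined elsewhere in this repository.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

ALLOWED_RESULTS = ("PASS", "FAIL", "BLOCKED")


def build_checkpoint_manifest(checkpoint, result, evidence_paths, fake_progress_risk, next_step):
    """Return a manifest dict; refuse unknown results and PASS without evidence."""
    if result not in ALLOWED_RESULTS:
        raise ValueError(f"result must be one of {ALLOWED_RESULTS}, got {result!r}")
    if result == "PASS" and not evidence_paths:
        raise ValueError("PASS requires at least one evidence path")
    return {
        "checkpoint": checkpoint,
        "result": result,
        "evidence_paths": list(evidence_paths),
        "fake_progress_risk": fake_progress_risk,
        "next_step": next_step,
        "written_at_utc": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


def write_manifest(manifest: dict, path: Path) -> None:
    """Write the manifest as stable, diff-friendly JSON."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(manifest, indent=2, sort_keys=True) + "\n", encoding="utf-8")
```

Rejecting a PASS with no evidence in code is one way to make the "do not claim success without evidence" rule hard to skip by accident.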
Repository Conventions
- scripts/: executable probes, discovery scripts, collectors, normalizers, and upload helpers.
- config/: example configuration only. Real secrets and machine-local config stay outside git.
- docs/: durable methodology, data contracts, operational runbooks, and endpoint notes.
- orchestration/prompts/: prompts and templates used by future agents.
- data/probes/: bounded endpoint probe outputs and probe notes.
- data/discovery/: market discovery outputs and manifests.
- data/live_sample/: short sample collector runs.
- data/normalized_sample/: derived sample outputs generated from raw samples.
- data/manifests/: machine-readable manifests for probes, collectors, normalization, uploads, and checkpoints.
- reports/: human-readable checkpoint, soak test, and incident reports.
- systemd/: VPS runtime units when added.
The initial Polymarket implementation should remain simple scripts until the collector works. Introduce collectors/<market_name>/ only when adding a second market or when duplication proves painful.
Artifact Status Labels
Every durable artifact should be treated as one of:
- valid: current and usable.
- partial: useful but incomplete.
- deprecated: superseded by a newer artifact.
- invalid: known to be wrong or misleading.
When marking an artifact deprecated or invalid, write a sibling markdown note or manifest entry with:
- original artifact path
- status
- reason
- replacement path, if any
- labeled_at_utc
- labeled_by
Do not remove the original artifact unless the user explicitly asks and there is a written reason.
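One possible way to write the sibling note described above is sketched here. The `.status.md` suffix, the note layout, and the function name `write_status_note` are assumptions made for illustration; only the six fields and the four status labels come from this document. The original artifact is never modified or removed.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

ALLOWED_STATUSES = ("valid", "partial", "deprecated", "invalid")


def write_status_note(
    artifact: Path,
    status: str,
    reason: str,
    labeled_by: str,
    replacement: Optional[Path] = None,
) -> Path:
    """Write a markdown note next to the artifact; leave the artifact itself untouched."""
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"status must be one of {ALLOWED_STATUSES}, got {status!r}")
    if status in ("deprecated", "invalid") and not reason.strip():
        raise ValueError("deprecated or invalid artifacts need a written reason")
    labeled_at = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    lines = [
        f"- original artifact path: {artifact.as_posix()}",
        f"- status: {status}",
        f"- reason: {reason}",
        f"- replacement path: {replacement.as_posix() if replacement else 'none'}",
        f"- labeled_at_utc: {labeled_at}",
        f"- labeled_by: {labeled_by}",
    ]
    # Hypothetical naming convention: <artifact name>.status.md in the same directory.
    note = artifact.with_name(artifact.name + ".status.md")
    note.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return note
```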
Adding New Market Connectors Later
Before adding a second market, Polymarket must have working discovery, raw order-book collection, Google Drive offload, and a 24h soak test.
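The gate above could be checked mechanically with something like the following sketch. The gate keys and the idea of mapping each gate to an evidence report path are assumptions; `missing_gates` is a hypothetical helper, not an existing script.

```python
# Hypothetical gate names mirroring the four requirements stated above.
REQUIRED_GATES = (
    "discovery",
    "raw_orderbook_collection",
    "google_drive_offload",
    "soak_test_24h",
)


def missing_gates(evidence: dict) -> list:
    """Return the gates that have no evidence path recorded, in the order listed above."""
    return [gate for gate in REQUIRED_GATES if not evidence.get(gate)]
```

An empty result means every gate has an evidence path recorded. Whether that evidence is acceptable remains a decision for the orchestrator and the user.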
When the gate is met:
- Create collectors/<market_name>/ for market-specific code.
- Keep shared code minimal and concrete.
- Reuse the same raw-first file layout and manifest format.
- Document endpoint quirks, timestamp semantics, rate limits, and schema differences in docs/.
- Avoid abstract base classes until at least two real collectors expose repeated code that is painful to maintain.