# Agent Instructions

Project: Cross-Market Live Orderbook Archive

This repository exists to preserve live market microstructure data that is usually lost: order books, spreads, liquidity, depth, timestamps, request metadata, and enough raw context to later decide whether a trading idea was observable, fillable, and reproducible at the time.

The first market is Polymarket. Future markets may include NEAR-related venues and other prediction or crypto markets, but do not build generic multi-market infrastructure before the second market exists.

## Active Collaboration Model

This project uses a two-role workflow:

- `orchestrator`: coordinates checkpoints with the user, keeps scope narrow, records decisions, reviews evidence, states gates, and decides the next smallest step.
- `builder`: works in a separate session to implement the active checkpoint artifacts, run commands, collect evidence, and write manifests/reports.

The current primary chat session is the `orchestrator`. The orchestrator should not silently become the builder unless the user explicitly asks. The builder should treat `AGENTS.md`, `ROADMAP.md`, `docs/METHODOLOGY.md`, and the active checkpoint report as the durable source of instructions.

Hand-offs between orchestrator and builder must be written to disk under `orchestration/` or `reports/checkpoints/` when they contain decisions, scope changes, endpoint findings, or validation results. Chat-only instructions are not enough for project-critical state.

## Non-Negotiable Rules

1. Preserve raw data first. Raw API and websocket payloads are the source of truth. Derived datasets are secondary and must reference raw files.
2. No trading. Do not add order placement, signing, private-key handling, wallet logic, strategy execution, or bot behavior.
3. No secrets in the repo. Never commit API keys, rclone credentials, wallet material, cookies, or private endpoints.
4. Every checkpoint needs durable evidence on disk: code or docs, config or run instructions, manifest/report, and validation evidence.
5. Do not claim success without commands, outputs, files, checksums, or real collected data to support the claim.
6. Do not delete mistakes. If an artifact is wrong, misleading, partial, or deprecated, preserve it and label it with a reason and replacement.
7. Keep the scope narrow. No dashboard, database, ML, strategy, backtest, or generic framework until the roadmap gate allows it.
8. Public data only unless a later checkpoint explicitly documents why authenticated public-data access is required.
9. "Production-ready" is forbidden until the collector has completed a documented 24h soak test with acceptable quality.

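As an illustration of rules 1 and 5, the sketch below writes a raw payload to disk byte-for-byte, refuses to overwrite an existing raw file, and records a SHA-256 checksum in a sidecar file that derived datasets could reference. The function name `save_raw_payload`, the `.meta.json` sidecar suffix, and all field names are assumptions made for this example, not an established project contract.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def save_raw_payload(raw_dir: Path, name: str, payload: bytes, request_meta: dict) -> dict:
    """Write payload bytes unchanged and return a sidecar record describing them.

    Hypothetical helper: the layout and field names are illustrative only.
    """
    raw_dir.mkdir(parents=True, exist_ok=True)
    raw_path = raw_dir / name

    # Mode "xb" fails with FileExistsError instead of overwriting,
    # so an already-collected raw file is never silently replaced.
    with open(raw_path, "xb") as fh:
        fh.write(payload)

    record = {
        "raw_path": str(raw_path),
        "sha256": hashlib.sha256(payload).hexdigest(),
        "size_bytes": len(payload),
        "received_at_utc": datetime.now(timezone.utc).isoformat(),
        "request": request_meta,  # must be JSON-serializable
    }

    # Sidecar sits next to the raw file; derived data can cite raw_path + sha256.
    sidecar = raw_path.with_name(raw_path.name + ".meta.json")
    sidecar.write_text(json.dumps(record, indent=2, sort_keys=True) + "\n")
    return record
```

A derived record would then carry the `raw_path` and `sha256` values from the returned dictionary, so any later claim can be traced back to the exact bytes that were received.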
## Expected Workflow

For each checkpoint:

1. Define the smallest useful checkpoint.
2. Build only what is needed for that checkpoint.
3. Validate with real commands and, when applicable, real public data.
4. Write a machine-readable manifest and a short markdown note.
5. State PASS, FAIL, or BLOCKED.
6. Identify the strongest fake-progress risk.
7. Recommend the next smallest step.
8. Stop only when a real user or orchestrator decision is needed.

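Steps 4 through 7 could be captured in a small manifest writer like the one below. It is a minimal sketch: the function name `write_checkpoint_manifest`, the JSON field names, and the rule that a PASS needs at least one evidence path are assumptions for illustration, since this file does not define a manifest schema.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

# The three outcomes named in the workflow above.
ALLOWED_RESULTS = {"PASS", "FAIL", "BLOCKED"}


def write_checkpoint_manifest(
    path: Path,
    checkpoint: str,
    result: str,
    evidence_paths: list,
    fake_progress_risk: str,
    next_step: str,
) -> dict:
    """Write a machine-readable checkpoint manifest (hypothetical schema)."""
    if result not in ALLOWED_RESULTS:
        raise ValueError(f"result must be one of {sorted(ALLOWED_RESULTS)}, got {result!r}")
    # Assumed policy: a PASS with no evidence on disk is not a PASS.
    if result == "PASS" and not evidence_paths:
        raise ValueError("PASS requires at least one evidence path")

    manifest = {
        "checkpoint": checkpoint,
        "result": result,
        "evidence": [str(p) for p in evidence_paths],
        "fake_progress_risk": fake_progress_risk,
        "next_step": next_step,
        "written_at_utc": datetime.now(timezone.utc).isoformat(),
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(manifest, indent=2) + "\n")
    return manifest
```

Rejecting unknown result strings at write time keeps vague outcomes such as "mostly done" out of the manifests.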
## Repository Conventions

- `scripts/`: executable probes, discovery scripts, collectors, normalizers, and upload helpers.
- `config/`: example configuration only. Real secrets and machine-local config stay outside git.
- `docs/`: durable methodology, data contracts, operational runbooks, and endpoint notes.
- `orchestration/prompts/`: prompts and templates used by future agents.
- `data/probes/`: bounded endpoint probe outputs and probe notes.
- `data/discovery/`: market discovery outputs and manifests.
- `data/live_sample/`: short sample collector runs.
- `data/normalized_sample/`: derived sample outputs generated from raw samples.
- `data/manifests/`: machine-readable manifests for probes, collectors, normalization, uploads, and checkpoints.
- `reports/`: human-readable checkpoint, soak test, and incident reports.
- `systemd/`: VPS runtime units when added.

The initial Polymarket implementation should remain simple scripts until the collector works. Introduce `collectors/<market_name>/` only when adding a second market or when duplication proves painful.

## Artifact Status Labels

Every durable artifact should be treated as one of:

- `valid`: current and usable.
- `partial`: useful but incomplete.
- `deprecated`: superseded by a newer artifact.
- `invalid`: known to be wrong or misleading.

When marking an artifact `deprecated` or `invalid`, write a sibling markdown note or manifest entry with:

- original artifact path
- status
- reason
- replacement path, if any
- labeled_at_utc
- labeled_by

Do not remove the original artifact unless the user explicitly asks and there is a written reason.

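One way to record such a label is a JSON sidecar written next to the artifact, as sketched below. The function name `label_artifact`, the `.status.json` suffix, and the key spellings `original_artifact_path` and `replacement_path` are assumptions for this example; only `labeled_at_utc` and `labeled_by` are taken verbatim from the list above. The original artifact is left untouched.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

# Only these two statuses require a written label per the section above.
LABEL_STATUSES = {"deprecated", "invalid"}


def label_artifact(
    artifact: Path,
    status: str,
    reason: str,
    labeled_by: str,
    replacement: Optional[Path] = None,
) -> Path:
    """Write a sibling status note for an artifact without modifying it."""
    if status not in LABEL_STATUSES:
        raise ValueError(f"status must be one of {sorted(LABEL_STATUSES)}, got {status!r}")
    if not artifact.exists():
        raise FileNotFoundError(f"cannot label a missing artifact: {artifact}")

    entry = {
        "original_artifact_path": str(artifact),
        "status": status,
        "reason": reason,
        "replacement_path": str(replacement) if replacement is not None else None,
        "labeled_at_utc": datetime.now(timezone.utc).isoformat(),
        "labeled_by": labeled_by,
    }
    # Hypothetical naming: <artifact name>.status.json in the same directory.
    note = artifact.with_name(artifact.name + ".status.json")
    note.write_text(json.dumps(entry, indent=2) + "\n")
    return note
```

For example, `label_artifact(Path("data/probes/book_probe.json"), "invalid", "wrong endpoint", "builder")` would leave the probe output in place and add `book_probe.json.status.json` beside it; the probe path here is purely illustrative.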
## Adding New Market Connectors Later

Before adding a second market, Polymarket must have working discovery, raw order-book collection, Google Drive offload, and a 24h soak test.

When the gate is met:

1. Create `collectors/<market_name>/` for market-specific code.
2. Keep shared code minimal and concrete.
3. Reuse the same raw-first file layout and manifest format.
4. Document endpoint quirks, timestamp semantics, rate limits, and schema differences in `docs/`.
5. Avoid abstract base classes until at least two real collectors expose repeated code that is painful to maintain.