Strengthen quote lifecycle proof guardrails
All checks were successful
deploy / deploy (push) Successful in 31s
Proof: The active quote-lifecycle turn now explicitly requires semantic truth, forbidden overclaim labels, negative invariant tests, and naming cleanup so submission evidence cannot be presented as trade completion.

Assumptions: The repo can derive truthful first-pass quote lifecycle states from existing decision and execution records, and prevention should be enforced by code and tests rather than reviewer memory.

Still fake: This commit only tightens the live planning docs; the implementation work to rename legacy fields and derive lifecycle-backed summaries is still outstanding.
parent 7ddefb500e
commit b695f60bc6
2 changed files with 104 additions and 3 deletions
|
|
@ -15,6 +15,8 @@ Replace ambiguous quote and decision wording with a truthful per-quote lifecycle
|
||||||
- Prefer one explicit lifecycle derivation path shared by backend and dashboard over ad hoc page-specific wording.
|
- Prefer one explicit lifecycle derivation path shared by backend and dashboard over ad hoc page-specific wording.
|
||||||
- Do not invent downstream certainty where durable evidence is absent.
|
- Do not invent downstream certainty where durable evidence is absent.
|
||||||
- Remove `Actionable` completely from operator-facing copy.
|
- Remove `Actionable` completely from operator-facing copy.
|
||||||
|
- Do not use stronger operator words than the durable evidence supports.
|
||||||
|
- Fix semantic bugs by changing both the code and the tests that encoded the wrong assumption.
|
||||||
|
|
||||||
## Problem statement for this turn
|
## Problem statement for this turn
|
||||||
The current dashboard still forces operators to infer too much:
|
The current dashboard still forces operators to infer too much:
|
||||||
|
|
@ -29,6 +31,11 @@ The repo already stores enough of the real lifecycle to do better:
|
||||||
- emitted command id
|
- emitted command id
|
||||||
- execution result status and result code
|
- execution result status and result code
|
||||||
|
|
||||||
|
The recent submission-versus-trade bug showed the broader prevention gap:
|
||||||
|
- wrong semantics were encoded in backend query names
|
||||||
|
- the dashboard rendered stronger claims than the evidence supported
|
||||||
|
- tests asserted the wrong meaning instead of protecting the truth
|
||||||
|
|
||||||
The turn therefore needs to improve:
|
The turn therefore needs to improve:
|
||||||
- lifecycle derivation
|
- lifecycle derivation
|
||||||
- durable reason mapping
|
- durable reason mapping
|
||||||
|
|
@ -53,6 +60,8 @@ The first mandatory states are:
|
||||||
- `Awaiting outcome`
|
- `Awaiting outcome`
|
||||||
- `Completed`
|
- `Completed`
|
||||||
|
|
||||||
|
These states must become the repo-owned evidence vocabulary for operator surfaces and summaries.
|
||||||
|
|
||||||
Suggested meanings:
|
Suggested meanings:
|
||||||
- `Filtered`
|
- `Filtered`
|
||||||
quote never entered the active trade path or was excluded before strategy decision
|
quote never entered the active trade path or was excluded before strategy decision
|
||||||
|
|
@ -94,13 +103,34 @@ If the exact reason is missing:
|
||||||
- expose `reason_unknown`
|
- expose `reason_unknown`
|
||||||
- keep the row truthful instead of synthesizing an explanation
|
- keep the row truthful instead of synthesizing an explanation
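
The missing-reason rule above can be sketched as a small fallback helper. This is a minimal sketch, not the repo's implementation: `deriveReason` and its parameter are hypothetical names, and only the `reason_unknown` code comes from the plan itself.

```typescript
// Hypothetical helper: returns the stored reason code when one exists,
// otherwise the plan's explicit `reason_unknown` marker.
// It never synthesizes an explanation from other fields.
function deriveReason(storedReason: string | null | undefined): string {
  const trimmed = (storedReason ?? "").trim();
  return trimmed.length > 0 ? trimmed : "reason_unknown";
}
```

Passing the stored code through unchanged keeps the row tied to durable evidence; any friendlier wording would be layered on top in the rendering code rather than invented here.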
|
||||||
|
|
||||||
|
## Semantic guardrails for this turn
|
||||||
|
|
||||||
|
### 1. Ban overloaded certainty words unless evidence justifies them
|
||||||
|
Review and remove or rename operator-facing and backend terms such as:
|
||||||
|
- `successfulTradeCount`
|
||||||
|
- `lastSuccessfulTradeAt`
|
||||||
|
- `loadSuccessfulTradesPage`
|
||||||
|
- `trade_asset_changes`
|
||||||
|
- any UI label implying trade completion, realized asset movement, or PnL attribution from mere submission evidence
|
||||||
|
|
||||||
|
Allowed wording must be tied to the strongest durable evidence actually present.
|
||||||
|
|
||||||
|
### 2. Encode semantic invariants in code and tests
|
||||||
|
Add explicit checks and regression coverage for:
|
||||||
|
- `submitted != completed`
|
||||||
|
- `submitted != realized asset delta`
|
||||||
|
- executor blocking != strategy rejection
|
||||||
|
- no UI label may claim trade completion from submission-only evidence
|
||||||
|
|
||||||
|
Negative tests are required, not just positive-path tests.
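
The invariants listed above could be encoded roughly as follows. This is a sketch under stated assumptions: the `QuoteEvidence` shape, its field names, and both functions are hypothetical, and the state names are taken from the evidence vocabulary in PROOF.md.

```typescript
// Hypothetical evidence record for one quote; field names are illustrative only.
interface QuoteEvidence {
  strategyApproved: boolean;
  commandEmitted: boolean;
  executionStatus?: "blocked" | "submitted" | "failed";
  hasTerminalCompletionEvent: boolean;
}

type DerivedState =
  | "evaluated"
  | "command_emitted"
  | "rejected"
  | "blocked"
  | "submitted"
  | "failed"
  | "completed";

// Derive the strongest state the evidence supports, and nothing stronger.
function deriveState(e: QuoteEvidence): DerivedState {
  if (!e.strategyApproved) return "rejected"; // strategy said no
  if (e.executionStatus === "blocked") return "blocked"; // executor said no
  if (e.executionStatus === "failed") return "failed";
  if (e.executionStatus === "submitted") {
    // Submission alone never promotes a row to completed.
    return e.hasTerminalCompletionEvent ? "completed" : "submitted";
  }
  return e.commandEmitted ? "command_emitted" : "evaluated";
}

// Summary counts are taken from derived states, so a submitted row
// cannot leak into a completed-trade count.
function countCompleted(rows: QuoteEvidence[]): number {
  return rows.filter((r) => deriveState(r) === "completed").length;
}
```

A negative test then asserts what must not happen: a row with only `executionStatus: "submitted"` must not derive to `completed`, and an approved row blocked by the executor must not derive to `rejected`.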
|
||||||
|
|
||||||
## Backend changes
|
## Backend changes
|
||||||
|
|
||||||
### 1. Add a lifecycle derivation helper
|
### 1. Add a lifecycle derivation helper
|
||||||
Create or extend a backend module that derives quote lifecycle from:
|
Create or extend a backend module that derives quote lifecycle from:
|
||||||
- recent trade decisions
|
- recent trade decisions
|
||||||
- recent execution results
|
- recent execution results
|
||||||
- successful trade records
|
- later terminal records only where they are real
|
||||||
- any available quote-status or venue result surfaces
|
- any available quote-status or venue result surfaces
|
||||||
|
|
||||||
It should emit a normalized row object with:
|
It should emit a normalized row object with:
|
||||||
|
|
@ -121,9 +151,11 @@ The backend should no longer leave the frontend to infer execution from isolated
|
||||||
|
|
||||||
For each recent quote/decision row:
|
For each recent quote/decision row:
|
||||||
- attach the matching execution result by `command_id`, `decision_id`, or `quote_id`
|
- attach the matching execution result by `command_id`, `decision_id`, or `quote_id`
|
||||||
- attach successful-trade or later terminal evidence where available
|
- attach terminal completion or non-fill evidence only where it is genuinely available
|
||||||
- expose whether the row is strategy-only, strategy-plus-command, or strategy-plus-execution
|
- expose whether the row is strategy-only, strategy-plus-command, or strategy-plus-execution
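
One way to sketch the join described above, assuming hypothetical row shapes (`DecisionRow`, `ExecutionResult`) and camelCase field names that stand in for `command_id`, `decision_id`, and `quote_id`. The choice to refuse ambiguous matches is an assumption made here to avoid attributing the wrong execution result.

```typescript
// Hypothetical shapes; the real payload fields may differ.
interface DecisionRow {
  decisionId: string;
  quoteId: string;
  commandId?: string;
}

interface ExecutionResult {
  commandId?: string;
  decisionId?: string;
  quoteId?: string;
  resultStatus: string;
}

type EvidenceDepth = "strategy-only" | "strategy-plus-command" | "strategy-plus-execution";

// Try the most specific key first. An ambiguous match (more than one hit
// on the same key) returns undefined instead of guessing.
function matchExecution(d: DecisionRow, results: ExecutionResult[]): ExecutionResult | undefined {
  const candidates: ExecutionResult[][] = [
    d.commandId === undefined ? [] : results.filter((r) => r.commandId === d.commandId),
    results.filter((r) => r.decisionId === d.decisionId),
    results.filter((r) => r.quoteId === d.quoteId),
  ];
  for (const matches of candidates) {
    if (matches.length === 1) return matches[0];
    if (matches.length > 1) return undefined;
  }
  return undefined;
}

// Report how far the durable evidence for this row actually reaches.
function evidenceDepth(d: DecisionRow, results: ExecutionResult[]): EvidenceDepth {
  if (matchExecution(d, results) !== undefined) return "strategy-plus-execution";
  return d.commandId !== undefined ? "strategy-plus-command" : "strategy-only";
}
```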
|
||||||
|
|
||||||
|
As part of this phase, rename misleading backend aggregation helpers and payload fields where practical so code meaning matches evidence meaning.
|
||||||
|
|
||||||
### 3. Preserve operator drilldown identifiers
|
### 3. Preserve operator drilldown identifiers
|
||||||
Ensure the bootstrap payload exposes:
|
Ensure the bootstrap payload exposes:
|
||||||
- full quote id
|
- full quote id
|
||||||
|
|
@ -142,6 +174,8 @@ Remove `Actionable` from:
|
||||||
|
|
||||||
Replace it with explicit state labels driven by lifecycle derivation.
|
Replace it with explicit state labels driven by lifecycle derivation.
|
||||||
|
|
||||||
|
Also remove or rename any remaining wording that presents submission evidence as trade completion or realized asset movement.
|
||||||
|
|
||||||
### 5. Make recent rows self-explanatory
|
### 5. Make recent rows self-explanatory
|
||||||
For each row, render:
|
For each row, render:
|
||||||
- primary lifecycle state
|
- primary lifecycle state
|
||||||
|
|
@ -179,6 +213,7 @@ If a strategy-only summary remains, it must be visually separate from per-quote
|
||||||
Inspect quote and system surfaces for similar ambiguity and align the wording if they expose the same concepts.
|
Inspect quote and system surfaces for similar ambiguity and align the wording if they expose the same concepts.
|
||||||
|
|
||||||
Do not let one page say `Submitted` while another page still says `Actionable` for the same row.
|
Do not let one page say `Submitted` while another page still says `Actionable` for the same row.
|
||||||
|
Do not let one page say `trade` while another page only has `submitted` evidence for the same row.
|
||||||
|
|
||||||
## Data and state edge cases
|
## Data and state edge cases
|
||||||
- Strategy decision exists, no command emitted:
|
- Strategy decision exists, no command emitted:
|
||||||
|
|
@ -193,6 +228,8 @@ Do not let one page say `Submitted` while another page still says `Actionable` f
|
||||||
render as `Submitted` or `Awaiting outcome`
|
render as `Submitted` or `Awaiting outcome`
|
||||||
- Successful trade summary exists but no explicit per-quote completion event:
|
- Successful trade summary exists but no explicit per-quote completion event:
|
||||||
only promote to `Completed` where the durable linkage is real
|
only promote to `Completed` where the durable linkage is real
|
||||||
|
- Submission evidence appears in profitability or summary widgets:
|
||||||
|
rename and constrain those widgets so they do not imply realized trade truth
|
||||||
|
|
||||||
## Concrete implementation order
|
## Concrete implementation order
|
||||||
|
|
||||||
|
|
@ -200,11 +237,14 @@ Do not let one page say `Submitted` while another page still says `Actionable` f
|
||||||
- inspect current durable decision and execution payloads
|
- inspect current durable decision and execution payloads
|
||||||
- write the normalized lifecycle state mapping
|
- write the normalized lifecycle state mapping
|
||||||
- define forbidden and allowed operator labels
|
- define forbidden and allowed operator labels
|
||||||
|
- list misleading backend and UI names that must be changed
|
||||||
|
- define the semantic invariant tests up front
|
||||||
|
|
||||||
### Phase 2. Implement backend aggregation
|
### Phase 2. Implement backend aggregation
|
||||||
- derive unified recent lifecycle rows
|
- derive unified recent lifecycle rows
|
||||||
- expose full identifiers and reason codes
|
- expose full identifiers and reason codes
|
||||||
- keep old consumers working until the frontend is switched
|
- keep old consumers working until the frontend is switched
|
||||||
|
- rename misleading submission-as-trade helpers and summary fields where touched
|
||||||
|
|
||||||
### Phase 3. Update Strategy page rendering
|
### Phase 3. Update Strategy page rendering
|
||||||
- replace verdict column with lifecycle state
|
- replace verdict column with lifecycle state
|
||||||
|
|
@ -216,11 +256,13 @@ Do not let one page say `Submitted` while another page still says `Actionable` f
|
||||||
- remove `Actionable`
|
- remove `Actionable`
|
||||||
- align supporting labels
|
- align supporting labels
|
||||||
- ensure blocked vs rejected vs submitted are clearly distinct
|
- ensure blocked vs rejected vs submitted are clearly distinct
|
||||||
|
- ensure submitted vs completed vs realized asset movement are clearly distinct
|
||||||
|
|
||||||
### Phase 5. Validate with live recent rows
|
### Phase 5. Validate with live recent rows
|
||||||
- verify a row rejected due to executor disarmed renders as blocked with reason
|
- verify a row rejected due to executor disarmed renders as blocked with reason
|
||||||
- verify a submitted row renders as submitted
|
- verify a submitted row renders as submitted
|
||||||
- verify quote ids can be copied and used for tracing
|
- verify quote ids can be copied and used for tracing
|
||||||
|
- verify no submission-only row is rendered as a trade, completion, or realized asset delta
|
||||||
|
|
||||||
## Test plan
|
## Test plan
|
||||||
- unit tests for lifecycle derivation from:
|
- unit tests for lifecycle derivation from:
|
||||||
|
|
@ -228,14 +270,18 @@ Do not let one page say `Submitted` while another page still says `Actionable` f
|
||||||
- executor-disarmed rows
|
- executor-disarmed rows
|
||||||
- submission-failed rows
|
- submission-failed rows
|
||||||
- submitted rows
|
- submitted rows
|
||||||
|
- unit tests for semantic invariants:
|
||||||
|
- submitted rows must not be counted as completed trades
|
||||||
|
- submission-only rows must not render as asset deltas
|
||||||
- dashboard bootstrap tests for:
|
- dashboard bootstrap tests for:
|
||||||
- forbidden `Actionable` removal
|
- forbidden `Actionable` removal
|
||||||
- explicit lifecycle labels
|
- explicit lifecycle labels
|
||||||
- reason text rendering
|
- reason text rendering
|
||||||
- identifier exposure
|
- identifier exposure
|
||||||
|
- dashboard summary tests for renamed or narrowed submission metrics
|
||||||
- frontend component tests if needed for copy affordance or row rendering logic
|
- frontend component tests if needed for copy affordance or row rendering logic
|
||||||
|
|
||||||
No lifecycle ambiguity fix is complete without a regression test proving the old ambiguous wording cannot return.
|
No lifecycle ambiguity fix is complete without a regression test proving the old ambiguous wording or overclaim cannot return.
|
||||||
|
|
||||||
## Validation checklist against the proof
|
## Validation checklist against the proof
|
||||||
- `Actionable` no longer appears
|
- `Actionable` no longer appears
|
||||||
|
|
@ -243,9 +289,18 @@ No lifecycle ambiguity fix is complete without a regression test proving the old
|
||||||
- recent blocked rows explain why they did not trade
|
- recent blocked rows explain why they did not trade
|
||||||
- recent submitted rows show that they were submitted
|
- recent submitted rows show that they were submitted
|
||||||
- quote ids are directly usable from the dashboard
|
- quote ids are directly usable from the dashboard
|
||||||
|
- submission-only evidence is no longer rendered as trade completion or asset delta truth
|
||||||
|
|
||||||
## Failure modes to plan for
|
## Failure modes to plan for
|
||||||
- the backend joins rows incorrectly and attributes the wrong execution result
|
- the backend joins rows incorrectly and attributes the wrong execution result
|
||||||
- the UI uses softer wording than the backend lifecycle state
|
- the UI uses softer wording than the backend lifecycle state
|
||||||
- older rows lack enough evidence and the UI pretends certainty
|
- older rows lack enough evidence and the UI pretends certainty
|
||||||
- ids are still truncated without a copy or expand path
|
- ids are still truncated without a copy or expand path
|
||||||
|
- misleading legacy names remain in place and create new semantic drift later
|
||||||
|
|
||||||
|
## Truth review checklist for this turn
|
||||||
|
For every operator-facing label, metric, table, or badge touched in this proof:
|
||||||
|
- what exact durable table or event backs it?
|
||||||
|
- what is the strongest claim the evidence supports?
|
||||||
|
- what wording would overclaim certainty?
|
||||||
|
- what negative regression test locks that boundary in?
|
||||||
|
|
|
||||||
46
PROOF.md
|
|
@ -12,6 +12,7 @@ The concrete target is the live NEAR Intents BTC/EURe system:
|
||||||
- execution submission must be distinguishable from strategy approval
|
- execution submission must be distinguishable from strategy approval
|
||||||
- blocked, rejected, submitted, failed, and not-filled paths must be visibly different
|
- blocked, rejected, submitted, failed, and not-filled paths must be visibly different
|
||||||
- quote identifiers must be directly usable by operators for tracing and support
|
- quote identifiers must be directly usable by operators for tracing and support
|
||||||
|
- operator-facing labels must not overclaim beyond the durable evidence actually stored
|
||||||
|
|
||||||
## Why this is a meaningful architecture test
|
## Why this is a meaningful architecture test
|
||||||
The current operator surface still fails a core thesis requirement:
|
The current operator surface still fails a core thesis requirement:
|
||||||
|
|
@ -22,6 +23,13 @@ The current operator surface still fails a core thesis requirement:
|
||||||
|
|
||||||
That is not just a copy problem. It is an observability gap in the trading product itself. If the system cannot explain a quote outcome precisely, execution is outrunning observability.
|
That is not just a copy problem. It is an observability gap in the trading product itself. If the system cannot explain a quote outcome precisely, execution is outrunning observability.
|
||||||
|
|
||||||
|
The immediate trigger for this turn is a real semantic failure:
|
||||||
|
- the dashboard treated `trade_execution_results.status = submitted` as a successful trade
|
||||||
|
- recent submitted quote terms were rendered as if they were realized asset deltas
|
||||||
|
- tests passed because that wrong assumption had been encoded into the test suite itself
|
||||||
|
|
||||||
|
This turn must therefore fix both the UI and the conditions that allowed the mistake through.
|
||||||
|
|
||||||
## Hypothesis
|
## Hypothesis
|
||||||
`unrip` becomes more trustworthy if quote handling is modeled and rendered as an explicit lifecycle instead of a single strategy verdict:
|
`unrip` becomes more trustworthy if quote handling is modeled and rendered as an explicit lifecycle instead of a single strategy verdict:
|
||||||
- strategy evaluation is only one stage in the lifecycle
|
- strategy evaluation is only one stage in the lifecycle
|
||||||
|
|
@ -49,9 +57,12 @@ The turn passes only if an operator can inspect a quote and immediately understa
|
||||||
- The existing durable stores already contain enough information for at least the current live path through strategy decision and executor result.
|
- The existing durable stores already contain enough information for at least the current live path through strategy decision and executor result.
|
||||||
- Some downstream venue-outcome states may still be partially fake or unavailable for older rows; if so, the UI must say that plainly rather than implying more certainty.
|
- Some downstream venue-outcome states may still be partially fake or unavailable for older rows; if so, the UI must say that plainly rather than implying more certainty.
|
||||||
- The immediate turn should prioritize truthful lifecycle explanation over broader analytics such as markout or long-window outcome attribution.
|
- The immediate turn should prioritize truthful lifecycle explanation over broader analytics such as markout or long-window outcome attribution.
|
||||||
|
- The prevention strategy must be implemented in repo code and tests rather than left to reviewer judgment alone.
|
||||||
|
|
||||||
## Turn-shaping rules
|
## Turn-shaping rules
|
||||||
- `Actionable` is forbidden as an operator-facing state or label.
|
- `Actionable` is forbidden as an operator-facing state or label.
|
||||||
|
- Operator-facing labels must not overstate event certainty.
|
||||||
|
- Terms such as `trade`, `success`, `filled`, `completed`, `profit`, and `asset delta` are forbidden unless backed by a durable event explicitly representing that fact.
|
||||||
- Do not add a second analytics product. Stay focused on per-quote lifecycle truth for the live active pair.
|
- Do not add a second analytics product. Stay focused on per-quote lifecycle truth for the live active pair.
|
||||||
- Do not invent lifecycle states that cannot be backed by durable repo-owned evidence.
|
- Do not invent lifecycle states that cannot be backed by durable repo-owned evidence.
|
||||||
- If a state transition is inferred rather than durably observed, the UI must make that distinction explicit.
|
- If a state transition is inferred rather than durably observed, the UI must make that distinction explicit.
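
The forbidden-term rule in this list could be enforced by a label check along these lines. This is a rough sketch: `forbiddenTermsIn` is a hypothetical helper, the single `hasDurableTerminalEvent` flag is a simplifying assumption (a real check would likely need per-term evidence), and substring matching is deliberately crude.

```typescript
// `Actionable` is forbidden regardless of evidence.
const ALWAYS_FORBIDDEN: readonly string[] = ["actionable"];

// Terms the rules allow only when a durable event explicitly backs them.
const GUARDED_TERMS: readonly string[] = [
  "trade",
  "success",
  "filled",
  "completed",
  "profit",
  "asset delta",
];

// Returns the forbidden terms found in a label, given whether a durable
// terminal event backs the row. An empty array means the label passes.
function forbiddenTermsIn(label: string, hasDurableTerminalEvent: boolean): string[] {
  const lower = label.toLowerCase();
  const hits = ALWAYS_FORBIDDEN.filter((t) => lower.indexOf(t) !== -1);
  if (hasDurableTerminalEvent) return hits;
  return hits.concat(GUARDED_TERMS.filter((t) => lower.indexOf(t) !== -1));
}
```

A regression test can run every operator-facing label through such a check with submission-only evidence and fail the build on any non-empty result.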
|
||||||
|
|
@ -78,6 +89,19 @@ For each visible quote or decision row, the operator must be able to identify th
|
||||||
|
|
||||||
Exact labels may vary, but they must be specific and mutually meaningful.
|
Exact labels may vary, but they must be specific and mutually meaningful.
|
||||||
|
|
||||||
|
The repo must adopt a hard evidence-state vocabulary for this turn. At minimum:
|
||||||
|
- `observed`
|
||||||
|
- `evaluated`
|
||||||
|
- `command_emitted`
|
||||||
|
- `rejected`
|
||||||
|
- `blocked`
|
||||||
|
- `submitted`
|
||||||
|
- `failed`
|
||||||
|
- `awaiting_outcome`
|
||||||
|
- `completed`
|
||||||
|
|
||||||
|
No operator surface may collapse these into softer or stronger claims.
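
As an illustration, the vocabulary above could be pinned down as a closed type with one label per state. The state names come from the list above; `EVIDENCE_STATES`, `operatorLabel`, and the label strings are assumptions for this sketch.

```typescript
// The evidence-state vocabulary as a closed list.
const EVIDENCE_STATES = [
  "observed",
  "evaluated",
  "command_emitted",
  "rejected",
  "blocked",
  "submitted",
  "failed",
  "awaiting_outcome",
  "completed",
] as const;

type EvidenceState = (typeof EVIDENCE_STATES)[number];

// One distinct label per state. The `never` check makes the compiler
// reject this function if a state is added without a label.
function operatorLabel(state: EvidenceState): string {
  switch (state) {
    case "observed": return "Observed";
    case "evaluated": return "Evaluated";
    case "command_emitted": return "Command emitted";
    case "rejected": return "Rejected";
    case "blocked": return "Blocked";
    case "submitted": return "Submitted";
    case "failed": return "Failed";
    case "awaiting_outcome": return "Awaiting outcome";
    case "completed": return "Completed";
    default: {
      const unreachable: never = state;
      throw new Error("unhandled evidence state: " + String(unreachable));
    }
  }
}
```

Because every state maps to its own label, a test can assert that no two states share wording, which is one concrete reading of "no collapsing".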
|
||||||
|
|
||||||
### Reason truth
|
### Reason truth
|
||||||
Each non-terminal or terminal non-trade state must expose a clear decisive reason, such as:
|
Each non-terminal or terminal non-trade state must expose a clear decisive reason, such as:
|
||||||
- unsupported pair
|
- unsupported pair
|
||||||
|
|
@ -111,16 +135,27 @@ Any replacement label must answer a concrete operator question, such as:
|
||||||
- was it submitted?
|
- was it submitted?
|
||||||
- did it fail?
|
- did it fail?
|
||||||
|
|
||||||
|
### Semantic invariants
|
||||||
|
The implementation and tests must enforce at least these invariants:
|
||||||
|
- `submitted` is not `completed`
|
||||||
|
- `submitted` is not a realized asset delta
|
||||||
|
- executor-side blocking is not strategy rejection
|
||||||
|
- stronger labels must not be rendered from weaker evidence
|
||||||
|
|
||||||
|
These invariants are proof-critical, not optional cleanup.
|
||||||
|
|
||||||
## Definition of done
|
## Definition of done
|
||||||
- `Actionable` is removed from operator-facing dashboard surfaces.
|
- `Actionable` is removed from operator-facing dashboard surfaces.
|
||||||
- A durable quote lifecycle model exists in repo-owned code and is used by the dashboard.
|
- A durable quote lifecycle model exists in repo-owned code and is used by the dashboard.
|
||||||
- At least the current live quote path through strategy decision and executor result is rendered coherently per quote.
|
- At least the current live quote path through strategy decision and executor result is rendered coherently per quote.
|
||||||
- The operator can tell, from one row, why a recent quote did or did not turn into a submitted trade.
|
- The operator can tell, from one row, why a recent quote did or did not turn into a submitted trade.
|
||||||
- Quote ids are copyable and clearly visible enough for tracing.
|
- Quote ids are copyable and clearly visible enough for tracing.
|
||||||
|
- Overloaded backend and UI names that imply stronger certainty than the evidence supports are removed or renamed.
|
||||||
- Regression tests cover at least:
|
- Regression tests cover at least:
|
||||||
- strategy-approved but executor-disarmed rows
|
- strategy-approved but executor-disarmed rows
|
||||||
- submitted rows
|
- submitted rows
|
||||||
- forbidden ambiguous label removal
|
- forbidden ambiguous label removal
|
||||||
|
- forbidden semantic overclaims such as treating `submitted` as `completed`
|
||||||
|
|
||||||
For this turn to close with status `passed`, the specific operator question:
|
For this turn to close with status `passed`, the specific operator question:
|
||||||
|
|
||||||
|
|
@ -134,6 +169,7 @@ must be answerable directly from the dashboard for recent rows without needing m
|
||||||
- direct evidence that a submitted row renders as submitted
|
- direct evidence that a submitted row renders as submitted
|
||||||
- direct evidence that quote ids are directly usable for tracing
|
- direct evidence that quote ids are directly usable for tracing
|
||||||
- automated test evidence for lifecycle derivation and dashboard rendering
|
- automated test evidence for lifecycle derivation and dashboard rendering
|
||||||
|
- automated test evidence for negative semantic invariants, especially `submitted != completed`
|
||||||
|
|
||||||
## Failure conditions
|
## Failure conditions
|
||||||
- `Actionable` still appears in the dashboard
|
- `Actionable` still appears in the dashboard
|
||||||
|
|
@ -141,6 +177,8 @@ must be answerable directly from the dashboard for recent rows without needing m
|
||||||
- non-trade rows still lack a decisive reason
|
- non-trade rows still lack a decisive reason
|
||||||
- quote ids remain hidden or non-copyable
|
- quote ids remain hidden or non-copyable
|
||||||
- lifecycle labels are only cosmetic and not backed by durable repo-owned state
|
- lifecycle labels are only cosmetic and not backed by durable repo-owned state
|
||||||
|
- the repo still uses `trade` or `asset delta` language for mere submission evidence
|
||||||
|
- tests still encode the old overclaiming semantics
|
||||||
|
|
||||||
## Current real before this turn
|
## Current real before this turn
|
||||||
- strategy decisions are stored durably
|
- strategy decisions are stored durably
|
||||||
|
|
@ -152,3 +190,11 @@ must be answerable directly from the dashboard for recent rows without needing m
|
||||||
- full venue settlement attribution for all historic trades
|
- full venue settlement attribution for all historic trades
|
||||||
- generalized quote analytics beyond lifecycle explanation
|
- generalized quote analytics beyond lifecycle explanation
|
||||||
- multi-venue lifecycle harmonization
|
- multi-venue lifecycle harmonization
|
||||||
|
|
||||||
|
## Prevention requirements for this proof
|
||||||
|
- Add a truth-review checklist to the implementation work:
|
||||||
|
- what exact durable table or event backs this label?
|
||||||
|
- what is the strongest claim the evidence supports?
|
||||||
|
- what would make this wording false?
|
||||||
|
- what negative regression test prevents that overclaim from returning?
|
||||||
|
- Separate lifecycle derivation from summary metrics so summaries are computed from lifecycle states rather than raw convenience queries.
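
The separation in the last bullet might look like the following sketch, where summary counts are a pure function of already-derived lifecycle rows. `LifecycleRow`, `LifecycleSummary`, and `summarizeLifecycle` are hypothetical names, and the state subset shown is illustrative.

```typescript
// Hypothetical derived row; `state` is assumed to come from the shared
// lifecycle derivation path, not from a raw query.
interface LifecycleRow {
  quoteId: string;
  state: "rejected" | "blocked" | "submitted" | "failed" | "awaiting_outcome" | "completed";
}

// Field names state exactly what is counted, so a submission count
// cannot be read as a trade count.
interface LifecycleSummary {
  submittedCount: number;
  completedCount: number;
  blockedCount: number;
}

function summarizeLifecycle(rows: LifecycleRow[]): LifecycleSummary {
  const summary: LifecycleSummary = { submittedCount: 0, completedCount: 0, blockedCount: 0 };
  for (const row of rows) {
    if (row.state === "submitted") summary.submittedCount += 1;
    else if (row.state === "completed") summary.completedCount += 1;
    else if (row.state === "blocked") summary.blockedCount += 1;
  }
  return summary;
}
```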
|
||||||
|
|
|
||||||