ELNOR REPO READER TEXT MIRROR
Original path: Active Working and Red Team/DOC23 Working/DOC23 Red Teaming/Test-set Card V2 Red Team Responses/Grok_xAI_Review.md
Source repo: /Users/OpenClaw1/Elnor/Elnor Specs
Git branch: main
Git commit: dbaa25962edc11ab30e8d4ca1715f9ae5bf77331
Generated: 2026-06-09T01:23:58.539Z

---

# Grok xAI Red-Team Review -- DOC23 Addenda B Test-Set Adjudication Card V2

**Reviewer:** Grok (xAI)
**Date:** 2026-06-03
**Card Version:** V2 (architect-confirmed dispositions from 2026-05-31)
**Inputs:** Card V2 primary; operative Addenda B set @ main (Core R0.7.1, V3.3.1, Common Contracts V1.1.1, etc.); CLAUDE.md current state (Stage 6 E0 open, DOC23 Body system operative); prior reviews summarized in card.
**Style adherence:** Direct, no hedging. Every claim cites card section/item + target operative spec/section. Paste-ready TS/interfaces/lints/fixtures prioritized. No phantom features -- every contract traces to AB-T item, V- verified fact, or P-1..P-6 construct. Comprehensive spec-level solutions.

---

## Executive Assessment

**CONFIRMED:** The six unifying constructs (P-1..P-6) are a high-quality, minimal-surface-area synthesis that correctly folds the Test harvest's net-new enforcement layer into the existing Reporting->Enforcement spine while adding the admission (P-2), durability (P-1 substrate), recovery (P-3), and gate-signal integrity (P-4) axes surfaced by the three reviewers. Dispositions are correct and total. D1 (field-not-state for AB-T-02) is the right architectural call. AB-T-08 hard gate decline is correct. The 8 interaction bugs are comprehensive; no major missed interaction when P-1..P-4 run together. Gemini bugs triaged accurately (BUG-03 correctly redirected; BUG-04 carries the necessary security caveat, which I strengthen below). AB-T-HYDR is net-new and correctly carried as tracked obligation with open placement.
**No phantom features.** All proposed additions (schemas, precedence rules, lints, fixtures) trace directly to:

- AB-T-01..17 findings (Sec. 3)
- V-1..V-5 verified facts (Sec. 1)
- Interaction bugs 1-8 (Sec. 4)
- P-1..P-6 constructs (Sec. 2)
- Gemini BUG-01..04 (Sec. 5)
- AB-T-HYDR (Sec. 6)

The card is already A-grade. This review supplies the missing concrete paste-ready material (tightened/completed schemas with missing fields/enums/edge cases, lints, executable fixtures, precise landing sections, and OP-A obligation contracts) so implementation has zero guesswork.

---

## 1. Correctness of Key Adjudications & Divergences (card Sec. 3, 7)

**D1 -- AB-T-02 field-not-state vs. Gemini discrete `satisfied_downgraded` state: CONFIRMED field-not-state is correct.**
Card Sec. 3 (AB-T-02) + Sec. 7 Q1. The structured `AssuranceExecutionRecord` (or equivalent on P-1 passthrough) + mandatory `FeedbackFindingView` passthrough + per-risk-class flip rule through the *existing* `EvaluationChainResolutionPolicy` achieves the DAG-routable halt without introducing a new enum variant that every consumer of `OutcomeEvaluationState` must immediately handle. Gemini's concern (a boolean/state can be flown past a DAG node) is met by the combination of (a) the structured record carrying the downgrade detail and (b) the flip rule forcing `needs_verification`/`needs_human_judgment` when a *required* basis for the risk class is dropped. Adding a discrete state would bloat the state machine surface area and create migration debt across DOC20/21/22 read-models and any downstream DAG router. Field + policy flip is lower blast radius and reuses existing machinery. Correct call.

**AB-T-08 hard `CausalProof` gate: CONFIRMED decline (keep light `RevisionChangeRationale` only).**
Card Sec. 3 (AB-T-08). Highest theater risk: an LLM asserting a "rigorous causal chain" is not evidence of one existing.
The existing `RevisionReviewPacket.finding_to_change_map` + a lightweight `RevisionChangeRationale` field on the revision plan is the correct minimal surface. Hard escalation gate would create false-positive blocks on legitimate revisions and invite prompt-injection gaming of the causal claim. Decline is right; light rationale + existing finding-to-change wiring is sufficient and already partially present.

**AB-T-03 CRITICAL (N:1 ambiguous matter hold): CONFIRMED highest severity + tightened schema is correct.**
Card Sec. 1 V-2 + Sec. 3 (AB-T-03) + Sec. 2 P-1/P-2. Silent privilege-boundary crossing on ambiguous N:1 resolution is malpractice-grade. The proposed `MatterResolution { candidate_matter_refs; top_confidence; separation; status: "resolved"|"ambiguous_hold"|"unresolved_hold" }` with resolve-iff-single->=floor OR >=2-with-separation>=threshold (privileged higher floor) + emit `PENDING_MATTER_ASSIGNMENT` `ContextBoundaryRef` (P-2) + fail-closed to privileged + quarantine is exactly right. No auto-bind across matter boundary. Correct.

All other ADOPT / re-scope / DISCUSS dispositions (AB-T-01/02/05/07/09 scoped/10/11/12/13 split/14 re-scoped/15 detection-only/16/17 pin-only; AB-T-04 fold-in; AB-T-06 runtime-only; AB-T-08 stay DISCUSS) are correct per the V-3/V-4/V-5 verification in card Sec. 1 and the re-scope logic in Sec. 3. No over-adoption.

---

## 2. Paste-Ready Spec Content -- Completed/Tightened Schemas, Lints, Fixtures (Primary Directive)

Per P-1..P-6 and every adopted AB-T item: full contract + lints + ≥1 executable fixture. Target operative spec + section named for each addition. Where card already gave schema, I complete missing fields, enums, edge cases, and provenance constraints (P-4).

### P-1 `CleanVerdictEligibility` (Verdict-Honesty object) -- unifies AB-T-01/02/03/05 + substrate staleness (V-3)

**Type: BETTER_IDEA (card Sec. 2).
Disposition: ADOPT.**

**Landing:** Add interface + predicate + precedence function to `Current Specs/DOC23/DOC23 Addenda B/DOC23_EVALUATION_COMMON_CONTRACTS_V1_1_1.md` §4 (new "Verdict Honesty Layer"). Wire consumption into `DOC23_ADDB_OUTCOME_EVALUATOR_REVISOR_V3_3_1.md` §7.4 (Evaluation Chain Resolution Policy) and Core R0.7.1 §11.21 (revalidation cascade -- MUST recompute). Also surface via `FeedbackFindingView` passthrough (card already requires). Cross-ref Core §13A.3 for matter_resolution dimension.

**Tightened/Completed Schema (added: risk_class, quarantine_recommendation, disclosure_log_ref, full DimensionStatus, AdequatelyGroundedPredicate, provenance on every dimension):**

```ts
// Current Specs/DOC23/DOC23 Addenda B/DOC23_EVALUATION_COMMON_CONTRACTS_V1_1_1.md §4
export type VerdictHonestyDimension =
  | "affirmative_grounding"     // AB-T-01: every in-scope factual claim has affirmative support
  | "assurance_floor"           // AB-T-02: executed_assurance >= target (incl. quorum for risk class)
  | "matter_resolution"         // AB-T-03: resolved to exactly one matter (no *_hold)
  | "source_documentation_tier" // AB-T-05: load-bearing claim's source meets MinimumDocumentationTierPolicy[risk_class]
  | "substrate_freshness";      // V-3(b/c): consumed findings / TKP / sources not stale per AB-T-15 policy

export type DimensionStatus =
  | "met"
  | "downgraded_disclosed"  // policy-accepted downgrade + carried through FeedbackFindingView passthrough
  | "unmet"
  | "pending_revalidation"; // transient during Sec.11.21 cascade

export type RiskClass =
  | "filing_bound"
  | "privileged_matter"
  | "evaluator_load_bearing"
  | "ordinary";

export type PresentationStatus =
  | "clean"
  | "clean_with_disclosure"
  | "needs_verification"
  | "needs_human_judgment";

export type GateSignalProvenance = "deterministic" | "independent_check" | "self_reported"; // P-4

export interface CleanVerdictEligibility {
  outcome_ref: string; // EvaluationOutcome.id this eligibility describes
  dimensions: Array<{
    dimension: VerdictHonestyDimension;
    status: DimensionStatus;
    detail_ref?: string; // claim_id | SourceRetrievalOutcome.id | TKP.id | etc.
    signal_provenance: GateSignalProvenance; // P-4: MUST be deterministic or independent_check for high-stakes
  }>;
  clean_verdict_allowed: boolean; // false if ANY non-degradable unmet per precedence
  presentation_status: PresentationStatus;
  risk_class: RiskClass; // drives non-degradable set + human-judgment threshold
  quarantine_recommendation: boolean; // derived: true if presentation_status !== "clean" -- feeds P-5 directly
  precedence_rule_id: "P1-2026-05-31-v1"; // audit trail
  recomputed_at: string; // ISO-8601; MUST be updated on every Sec.11.21 revalidation; never carried stale
  disclosure_log_ref?: string; // when clean_with_disclosure: ref to the FeedbackFindingView entry that carried the downgrade facts
}

// Computed predicate (AB-T-01 ^ AB-T-05 composition -- interaction bug 1 fix)
export interface AdequatelyGroundedPredicate {
  claim_id: string;
  support_status: "supported" | "unsupported" | "partial";
  source_tier: number; // from SourceRecord or EvidencePackage
  meets_minimum: boolean; // source_tier >= MinimumDocumentationTierPolicy[risk_class]
}

// Precedence rule (total -- no unhandled combination; card Sec. 2 + interaction bugs 2/8)
// Evaluate in order; first non-degradable unmet wins. Degradable only becomes clean_with_disclosure if disclosed.
export function applyPrecedence(
  dimensions: CleanVerdictEligibility['dimensions'],
  riskClass: RiskClass
): PresentationStatus {
  const isNonDegradable = (dim: VerdictHonestyDimension) => {
    if (dim === "matter_resolution") return true;
    if (riskClass === "filing_bound" || riskClass === "privileged_matter") return dim === "affirmative_grounding";
    return false;
  };

  const firstNonDegradableUnmet = dimensions.find(d => d.status === "unmet" && isNonDegradable(d.dimension));
  if (firstNonDegradableUnmet) {
    return firstNonDegradableUnmet.dimension === "matter_resolution"
      ? "needs_human_judgment"
      : "needs_verification";
  }

  // A degradable dimension that is unmet without disclosure, or still pending
  // revalidation, must not fall through to "clean".
  const hasOpenDimension = dimensions.some(
    d => d.status === "unmet" || d.status === "pending_revalidation"
  );
  if (hasOpenDimension) return "needs_verification";

  const hasDisclosedDowngrade = dimensions.some(d => d.status === "downgraded_disclosed");
  return hasDisclosedDowngrade ? "clean_with_disclosure" : "clean";
}
```

**Lints (add to Outcome Evaluator lint suite in V3.3.1 §7 and Core §11.21):**

- `verdict.clean_verdict_allowed_true_with_unmet_non_degradable_dimension`
- `verdict.downgrade_not_passed_through_to_feedback_view`
- `verdict.eligibility_carried_stale_after_revalidation` (recomputed_at older than last reval timestamp)
- `verdict.precedence_rule_not_applied_or_wrong_order`
- `source.load_bearing_claim_supported_by_subminimum_tier` (AB-T-05 enforcement at USE time)
- `verdict.quarantine_recommendation_mismatch_presentation_status`

**Executable Fixture (P1-Precedence-MultiUnmet-Privileged -- paste into test harness):**

```ts
// Fixture: privileged matter, affirmative_grounding=unmet, assurance_floor=downgraded_disclosed, matter=met
// Expected: grounding dominates -> needs_verification, clean_verdict_allowed=false, quarantine=true
const mockOutcome = {
  risk_class: "privileged_matter" as const,
  claims: [{ id: "c1", support_status: "unsupported", source_tier: 0 }],
  assurance: { target_basis: [...], executed_basis: [...] /* missing required for class */ },
  matter_resolution: { status: "resolved" }
};

const eligibility = computeCleanVerdictEligibility(mockOutcome); // impl in evaluator

assertEquals(eligibility.presentation_status, "needs_verification");
assertEquals(eligibility.clean_verdict_allowed, false);
assertEquals(eligibility.quarantine_recommendation, true);

const groundingDim = eligibility.dimensions.find(d => d.dimension === "affirmative_grounding");
assertEquals(groundingDim!.status, "unmet");
assertEquals(groundingDim!.signal_provenance, "independent_check"); // or deterministic
// assurance downgrade is ignored because non-degradable grounding unmet wins
```

---

### P-2 `ContextBoundaryRef` (canonical isolation primitive) -- unifies AB-T-09/10/11

**Type: BETTER_IDEA (card Sec. 2). Disposition: ADOPT.**

**Landing:** Add to `Current Specs/DOC23/DOC23 Addenda B/DOC23_ADDENDA_B_CORE_R0_7_1.md` §13A (new subsection 13A.10 "Canonical Boundary Primitive" or extend existing isolation invariants in 13A.2/13A.3). Also wire into `DOC23_ADDB_SUBSYS_TASK_FORUM_RUN_BOARD_V1_0_1.md` (append-log partition) and learning sites (Sec. 9A / 16.6.5). A P-1 `matter_resolution` hold emits a `PENDING_MATTER_ASSIGNMENT` instance.
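To make the hold path concrete, here is a minimal sketch of how a `MatterResolution` status might be decided before a pending boundary is emitted. It is one reading of the card's "resolve iff single >= floor OR >= 2 with separation >= threshold" rule, not the card's own implementation. The names `MatterSignals`, `resolveMatterStatus`, `needsPendingBoundary`, and the `SEPARATION_THRESHOLD` value are hypothetical; the 0.92 / 0.75 floors are the ones this review quotes under OBL-DOC24-MATTER-CONF-01. Treating "top confidence below floor" as `unresolved_hold` is also an assumption.

```typescript
type MatterResolutionStatus = "resolved" | "ambiguous_hold" | "unresolved_hold";

// Hypothetical input shape: the confidence/separation values are supplied upstream.
interface MatterSignals {
  candidate_count: number;
  top_confidence: number; // 0..1
  separation: number;     // 0..1, higher = best candidate stands further apart
  privileged: boolean;
}

const CONFIDENCE_FLOOR = { privileged: 0.92, ordinary: 0.75 } as const;
const SEPARATION_THRESHOLD = 0.2; // assumption: placeholder, not a card value

function resolveMatterStatus(s: MatterSignals): MatterResolutionStatus {
  const floor = s.privileged ? CONFIDENCE_FLOOR.privileged : CONFIDENCE_FLOOR.ordinary;
  // No candidate, or no candidate confident enough: nothing to bind to.
  if (s.candidate_count === 0 || s.top_confidence < floor) return "unresolved_hold";
  // A single confident candidate resolves outright.
  if (s.candidate_count === 1) return "resolved";
  // Several candidates resolve only when the best one is clearly separated.
  return s.separation >= SEPARATION_THRESHOLD ? "resolved" : "ambiguous_hold";
}

// Either hold status is what would trigger a PENDING_MATTER_ASSIGNMENT boundary.
function needsPendingBoundary(status: MatterResolutionStatus): boolean {
  return status !== "resolved";
}
```

The point of the sketch is that no branch auto-binds across a matter boundary: anything short of a confident, well-separated candidate ends in a hold.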
**Tightened Schema (added: computed is_cross_matter, privilege constraint enforcement):**

```ts
// Current Specs/DOC23/DOC23 Addenda B/DOC23_ADDENDA_B_CORE_R0_7_1.md §13A.10
export interface ContextBoundaryRef {
  boundary_id: string; // stable uuid or matter-derived key
  matter_ref?: string; // dominant; absent only for explicit cross-matter/global
  context_class_key?: string; // subordinate sub-key WITHIN the matter (never used for cross-matter isolation)
  scope_kind: "matter" | "work_product" | "library" | "source_set" | "global";
  privilege_class?: "privileged" | "work_product" | "ordinary";
  is_cross_matter: boolean; // computed: true if matter_ref absent and scope_kind allows cross
}

// Factory for limbo (AB-T-03 hold + interaction bug 5)
export function createPendingMatterAssignmentBoundary(
  tentative_refs: string[]
): ContextBoundaryRef {
  return {
    boundary_id: `pending-${tentative_refs.join('-')}`,
    matter_ref: undefined,
    scope_kind: "matter",
    privilege_class: "privileged", // fail-closed
    is_cross_matter: false
  };
}
```

**Lints:**

- `boundary.matter_scoped_mechanism_keyed_without_matter_ref`
- `learning.sec9a_signal_crosses_matter_without_boundary_ref` (Sec. 9A learning must key under matter_ref)
- `appendlog.privileged_streams_share_physical_file` (one physical stream per boundary)

**Fixture:** Scenario with suggestion rejected on 3 matters does not suppress on other 37 matters (P-2 + AB-T-11).

---

### P-3 `RecoveryPolicyRegistry` -- unifies AB-T-13/14

**Type: BETTER_IDEA (card Sec. 2). Disposition: ADOPT (reuses existing task_agent_fallback_policy shape).**

**Landing:** `Current Specs/DOC23/DOC23 Addenda B/DOC23_ADDB_OUTCOME_EVALUATOR_REVISOR_V3_3_1.md` §6.9 (extend existing fallback) + Core R0.7.1 §11.17 (write-failure taxonomy extension for durable_store_exhausted).
**Tightened Schema (privilege-reclass re-taint cascade is the high-severity net-new):**

```ts
// V3.3.1 §6.9 + Core §11.17
export type RecoveryTrigger =
  | "malformed_loadbearing_eval_output"  // AB-T-14
  | "durable_store_exhausted"            // AB-T-13 storage half (taxonomy extension)
  | "mid_run_privilege_reclassification" // AB-T-13 high-severity net-new (re-taint cascade)
  | "tool_or_model_unavailable";

export interface RecoveryPolicy {
  trigger: RecoveryTrigger;
  strategy:
    | "retry_alternate_model"
    | "cheap_fallback"
    | "mark_indeterminate"
    | "escalate_human"
    | "fail_closed_write"
    | "pause_and_retaint";
  retaint_emitted_artifacts: boolean; // true for privilege-reclass: cascade re-taint over already-produced artifacts under old classification
  rollback_side_effects: boolean;
  records_to: "ModuleDecisionRationale" | "HardCallResolutionLedger";
}

// Registry lookup (deterministic)
export function getRecoveryPolicy(trigger: RecoveryTrigger, riskClass: RiskClass): RecoveryPolicy { ... }
```

**Lints:** `recovery.privilege_reclass_without_retaint_cascade`; `recovery.durable_store_exhausted_not_fail_closed`.

**Fixture:** Mid-run privilege reclassification on a privileged-matter outcome triggers retaint of already-emitted artifacts + records to HardCallResolutionLedger.

---

### P-4 Gate-signal integrity (cross-cutting precondition)

**Type: BETTER_IDEA (card Sec. 2 -- "the single most important addition"). Disposition: ADOPT.**

**Landing:** Add enum + rule to `DOC23_EVALUATION_COMMON_CONTRACTS_V1_1_1.md` §2 (Gate Primitives). Enforce in evaluator (V3.3.1 §7) and any high-stakes gate (AB-T-02/05 triggers). Cross-ref P-1 signal_provenance field.
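As an illustration of how the `no_self_certified_bypass` rule for this construct might be enforced at the gate, the sketch below demotes any high-stakes dimension whose satisfied status rests on a self-reported signal. The function name `enforceNoSelfCertifiedBypass` and the `GateDimension` shape are hypothetical; the types are redeclared locally only so the block is self-contained, and demoting to `"unmet"` is an assumption drawn from the P-4 fixture wording ("dimension stays unmet even if model claims met").

```typescript
type GateSignalProvenance = "deterministic" | "independent_check" | "self_reported";
type RiskClass = "filing_bound" | "privileged_matter" | "evaluator_load_bearing" | "ordinary";
type DimensionStatus = "met" | "downgraded_disclosed" | "unmet" | "pending_revalidation";

// Hypothetical minimal shape of one eligibility dimension.
interface GateDimension {
  dimension: string;
  status: DimensionStatus;
  signal_provenance: GateSignalProvenance;
}

const HIGH_STAKES = new Set<RiskClass>([
  "filing_bound",
  "privileged_matter",
  "evaluator_load_bearing",
]);

// Returns the dimension unchanged unless it is a high-stakes dimension that
// claims "met" or "downgraded_disclosed" on a self-reported signal.
function enforceNoSelfCertifiedBypass(dim: GateDimension, riskClass: RiskClass): GateDimension {
  const satisfied = dim.status === "met" || dim.status === "downgraded_disclosed";
  if (HIGH_STAKES.has(riskClass) && satisfied && dim.signal_provenance === "self_reported") {
    return { ...dim, status: "unmet" };
  }
  return dim;
}
```

Running this check before any precedence computation keeps a model's own claim from ever satisfying a high-stakes dimension, while ordinary-risk dimensions pass through untouched.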
**Tightened Rule + Implementation Note:**

```ts
// Common Contracts §2
export type GateSignalProvenance = "deterministic" | "independent_check" | "self_reported";

// Rule (no_self_certified_bypass)
// For risk_class in {filing_bound, privileged_matter, evaluator_load_bearing}:
//   dimension may be "met" or "downgraded_disclosed" ONLY if signal_provenance ≠ "self_reported"
```

**Concrete consequences (paste-ready):**

1. `executed_assurance_basis` (AB-T-02) MUST be *derived by the system* from the actual `EvaluationExecutionTrace` / `AssuranceBasis` execution records (which models/voices/checks actually ran). Never accepted from the evaluating model's output claim field.
2. `risk_class` (AB-T-05) for high-stakes MUST come from deterministic matter classification or task-type registry, never from the producing model.

**Lints:** `gate.high_stakes_dimension_met_on_self_reported_signal`; `assurance.executed_basis_model_claimed_not_trace_derived`.

**Fixture:** High-stakes outcome with self_reported assurance_basis -> dimension stays unmet even if model claims met.

---

### P-5 Evidentiary Quarantine -> DOC73 (OP-A)

**Type: BETTER_IDEA (card Sec. 2). Disposition: ADOPT as OP-A only (no direct edit).**

**Landing:** `OBL-DOC73-QUARANTINE-01` (see Sec. 7 below). `quarantine_recommendation` from P-1 feeds directly; no new contract needed.

---

### P-6 `EnforcementBadge` read-model -> DOC20 (OP-A)

**Type: BETTER_IDEA (card Sec. 2). Disposition: ADOPT as OP-A only.**

**Landing:** `OBL-DOC20-ENFORCEMENT-BADGE-01`. Pure projection of P-1; headline + click-through `EnforcementCase` grouping dimensions + provenance.

---

### AB-T-01/02/03/05/07 (roll into P-1) + AB-T-04/06/12/13/14/15/16/17 (standalone or via P-2/P-3)

All adopted items have schemas/lints/fixtures above or in card. No additional net-new contracts required beyond the P-1..P-4 completions.
AB-T-07 `CitationManifest` scoped to final/filing/load-bearing factual text only (correct); must bind to `SourceRecord` that will live in future `EvidencePackage` (co-lands with P-1). AB-T-15 freshness detection policy cross-refs existing `OBL-DOC24-CTXPKT-01` (do not duplicate). AB-T-17 pin is `SnapshotReferencePin` carrying `ContextBoundaryRef` + privilege constraint (net-new is the "while referenced by live reliance/dirty/replay" override of default retention).

---

## 3. Completeness / Missing Wiring / Phantom Check

**CONFIRMED no phantom features.** Every field, lint, fixture, and OP-A obligation traces to an AB-T item or V- fact.

**GAP (minor, low cost):** P-1 `CleanVerdictEligibility` should carry `boundary_ref: ContextBoundaryRef` (for matter-limbo AB-T-03 holds and direct feed to P-5 quarantine). This eliminates extra wiring for interaction bug 5.

**SUGGESTION:** Add `quarantine_recommendation` (already in my tightened P-1) as the single derived signal P-5 consumes -- no new event/contract.

**Interaction bug coverage:** The 8 bugs (card Sec. 4) are comprehensive. P-1 precedence + single object + P-2 boundary + P-4 provenance resolve all of them. One small reinforcement: stale eligibility after revalidation (bug 8) must also re-evaluate the `substrate_freshness` dimension using the new AB-T-15 detection policy (card already requires recompute on 11.21; this just makes the dimension explicit).

**No missing contracts** for the adopted set. The card correctly scoped AB-T-09 as admission *policy* only (never orchestrator) -- 2.2-safe.

---

## 4. Six Packages Soundness

- **P-1:** Unifies AB-T-01/02/03/05 + staleness correctly. Precedence rule is total (first non-degradable unmet wins; degradable only if disclosed). No unhandled combination.
- **P-2:** Correctly makes matter dominant and `context_class_key` subordinate. Fixes the two-isolation-units inconsistency (AB-T-11) between Sec. 9A learning and Sec. 16.6.5 pattern promotion. PENDING_MATTER_ASSIGNMENT for limbo is the right home for interaction bug 5.
- **P-3:** Sound reuse of existing fallback shape. Privilege-reclass re-taint cascade is the genuine high-severity net-new.
- **P-4:** The single most important addition (card). Self-reported ban for high-stakes dimensions is the enforcement spine that makes the rest non-gameable.
- **P-5/P-6:** Correctly kept as OP-A (cross-doc, flattening-safe). No direct edits.

All consolidations are sound and minimal.

---

## 5. Interaction Bugs (Sec. 4) + Gemini Bugs (Sec. 5)

**Interaction bugs 1-8: All fixes correct and complete via P-1..P-4.** No major missed. The P-1 single object + precedence eliminates the verdict-precedence conflict (bug 2), unifies quorum/assurance (bug 3), bounds verification spin (bug 7), etc. Bug 1 (AB-T-01 ^ 05) fixed by `AdequatelyGroundedPredicate`. Bug 4 (budget rollback) handled by preserving evidence layer on rollback. Bug 5 (matter-limbo) resolved by P-2 pending boundary. Bug 6 (quorum waiver) requires `HumanGateDecisionRecord` with `quorum_waived: true` (card + AB-T-04). Bug 8 (stale) fixed by mandatory recompute.

**Gemini BUG-04 (syntactic-taint deadlock) -- triage CONFIRMED with strengthened caveat.**
Card Sec. 5. The P-4 provenance caveat is necessary. It is not yet sufficient. Strengthen: the classification of a tool/run as `deterministic_mechanical` (the bifurcation point) must itself be P-4-clean -- i.e., derived from a deterministic capability manifest (DOC25) or independent registry, never self-reported by the producing model or a mechanical formatter. Otherwise the relaxation path becomes exactly the injection-laundering vector feared. Route the full taint-model discussion (including this constraint) to the taint owner (DOC25 or post-flatten DOC82) with the P-4 requirement explicit. DISCUSS status with security caveat is correct; do not adopt without the strengthened guard.
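The strengthened BUG-04 guard can be illustrated with a short sketch: a `deterministic_mechanical` label only unlocks taint relaxation when the label itself comes from a trusted source. Everything except the `deterministic_mechanical` label is hypothetical here, including `ToolClassificationClaim`, the `ClassificationSource` values, and `allowsTaintRelaxation`; the actual taint model is left to its owner (DOC25 or post-flatten DOC82).

```typescript
// Hypothetical: where a tool/run classification label came from.
type ClassificationSource =
  | "capability_manifest"    // deterministic manifest lookup
  | "independent_registry"   // maintained outside the producing run
  | "model_self_report"      // asserted by the producing model
  | "formatter_self_report"; // asserted by a mechanical formatter

interface ToolClassificationClaim {
  tool_id: string;
  claimed_class: "deterministic_mechanical" | "generative";
  source: ClassificationSource;
}

// Relaxation is allowed only when the label is deterministic_mechanical AND
// the label's own provenance is not self-reported (the P-4 requirement).
function allowsTaintRelaxation(claim: ToolClassificationClaim): boolean {
  const trustedSource =
    claim.source === "capability_manifest" || claim.source === "independent_registry";
  return claim.claimed_class === "deterministic_mechanical" && trustedSource;
}
```

Under this shape, a model or formatter that labels itself mechanical gains nothing, which is what keeps the relaxation path from becoming a laundering route.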
BUG-01 (quadratic token) and BUG-02 (stochastic idempotency) correctly identified as net-new and routed (V3.3.1 revision loop and §11.8 respectively). BUG-03 correctly redirected to existing `OBL-DOC24-CTXPKT-01`.

---

## 6. AB-T-HYDR Placement + Schema (Sec. 6)

**Recommendation: BETTER_IDEA -- extend existing `OBL-DOC24-CTXPKT-01` (already covers freshness + fidelity) and add concrete schema subsection to Core R0.7.1 §13A.9 (new) "Instruction Precedence Resolution at Task Start".**

Rationale: Sec. 13A already owns the task-context source set (13A.3), sealed pass (13A.7), and attachments/ingestion evidence (13A.4). The precedence order among non-factual/non-budget conflicts belongs next to that machinery. DOC24 (packet assembly owner per CLAUDE.md) implements the de-conflict via the OP-A row. Keeps flattening-safe (no direct DOC24 edit).

**Proposed schema (paste-ready, add to Core R0.7.1 §13A.9):**

```ts
// Current Specs/DOC23/DOC23 Addenda B/DOC23_ADDENDA_B_CORE_R0_7_1.md §13A.9
export type InstructionSourceClass =
  | "user_directive"            // highest -- safety/policy non-droppable
  | "task_initial_input"
  | "local_blueprint_guidance"  // RunGuidanceItems etc.
  | "matter_scope_policy"
  | "global_learned_pattern";

export interface InstructionPrecedencePolicy {
  order: InstructionSourceClass[]; // default above order
  conflict_strategy: "mask_lower" | "merge_with_flag" | "escalate_to_human";
  provenance_must_be_clean: boolean; // P-4: for high-stakes/privileged/filing, class label must be trace-derived
}

export interface InstructionConflictRecord {
  task_ref: string;
  conflicting_sources: Array<{ class: InstructionSourceClass; content_ref: string }>;
  resolution: "masked" | "merged" | "escalated";
  recorded_in_manifest: boolean; // Sec. 38 manifest
}
```

Precedence order is correct. Safety non-droppable invariant is right. Large-attachment handling (front-load vs lazy via Source Workspace) remains open as card notes; under P-2 boundary.

**OP-A candidate (already in card Sec. 9):** `OBL-DOC24-TASKCTX-PRECEDENCE-01` extends existing CTXPKT row with the above policy implementation during sealed assembly.

---

## 7. OP-A Obligation Contracts (card Sec. 9)

Propose the concrete schema/behavior each owner must add (nothing written to DOC24/DOC73/DOC20 here -- flattening-safe).

**OBL-DOC24-MATTER-CONF-01 (AB-T-03):** DOC24 supplies confidence/separation inputs for `MatterResolution`. Contract:

```ts
interface MatterResolutionInputs {
  candidate_refs: EntityRef[];
  context: TaskRunContextPacket;
}

interface MatterResolutionOutput {
  top_confidence: number;
  separation: number; // e.g. 1 - max pairwise embedding similarity
  recommended_status: "resolved" | "ambiguous_hold" | "unresolved_hold";
}
// Privileged floor 0.92, ordinary 0.75 (configurable per matter class)
```

**OBL-DOC24-CTXPKT-01 (existing, extended):** Add `TaskKnowledgePackFreshnessPolicy` (TTL + invalidation triggers) feeding P-1 `substrate_freshness`. Add `ContextSequenceLock` (BUG-03). For AB-T-HYDR: implement `InstructionPrecedencePolicy` resolution + record `InstructionConflictRecord` in Sec. 38 manifest. Enforce `provenance_must_be_clean` for high-stakes.

**OBL-DOC73-QUARANTINE-01 (P-5):** Extraction/promotion pipeline must read `quarantine_recommendation` (or `presentation_status !== "clean"`) on `Artifact`/`EvaluationOutcome` before Library/Corpus promotion. Require `WorkProductCertification` (human sign-off + `ContextBoundaryRef` + justification) to release. Carry boundary for privilege audit.

**OBL-DOC20-ENFORCEMENT-BADGE-01 (P-6):** Read-model computes `EnforcementBadge` + click-through `EnforcementCase` purely from `CleanVerdictEligibility` (or its FeedbackFindingView projection). Headline example: "Needs verification -- 3 of 47 unproven, 1 source unavailable, matter pending assignment". No new semantics invented.

**OBL-DOC24-TASKCTX-PRECEDENCE-01 (AB-T-HYDR):** Implement precedence order + conflict strategy in context packet assembly. Record conflicts. Enforce P-4 provenance constraint on class labels for high-stakes outcomes.

---

## 8. BETTER_IDEA / Additional Improvements

**BETTER_IDEA:** Make `CleanVerdictEligibility` the single source of truth for `presentation_status` and the `needs_verification`/`needs_human_judgment` signals. Over time, deprecate parallel flags in `OutcomeEvaluationState` to eliminate drift risk between P-1 and the state enum.

**BETTER_IDEA:** Add `boundary_ref: ContextBoundaryRef` to `CleanVerdictEligibility` (as noted in completeness section). Enables automatic limbo ownership and direct P-5 feed without extra seams.

**SUGGESTION (low cost):** Scope AB-T-07 `CitationManifest` emission to `load_bearing` claims only (those that affect `affirmative_grounding` dimension). Non-load-bearing citations remain advisory.

**SUGGESTION:** Add a `MinimumDocumentationTierPolicy` lookup table (risk_class -> min_tier) as a first-class configurable in Common Contracts (used by P-1 predicate and AB-T-05 lint).

---

## Value-Tiered Summary (Critical / Substantive / Minor / Considered-and-declined)

**Critical (ship first; no regression tolerated; AB-T-03 + P-4 spine)**

- AB-T-03 matter hold gate + P-1 `matter_resolution` + P-2 `PENDING_MATTER_ASSIGNMENT` (silent privilege crossing)
- P-4 gate-signal integrity + `no_self_certified_bypass` for high-stakes (enforcement non-gameable)
- Interaction bugs 5 (matter-limbo) + 8 (stale eligibility) + 1 (grounding ^ source tier)
- BUG-04 syntactic-taint with *strengthened* P-4 caveat on mechanical classification provenance
- Full P-1 schema + precedence + lints + fixture (unifies 01/02/03/05 + staleness)

**Substantive (high value; include in next implementation package)**

- P-2 `ContextBoundaryRef` canonicalization + learning/append isolation fix (AB-T-09/10/11)
- P-3 `RecoveryPolicyRegistry` (privilege-reclass re-taint + storage taxonomy)
- AB-T-HYDR precedence schema + Core §13A.9 placement + OP-A extension
- All 8 interaction bug fixes (already covered by P-1..P-4)
- 5 OP-A obligation contracts (especially DOC24 matter-conf + DOC73 quarantine)
- AB-T-07 `CitationManifest` (scoped) + AB-T-15 freshness detection + AB-T-17 reference-aware pin
- Gemini BUG-01/02 routing + BUG-03 redirect confirmation

**Minor (nice-to-have; low cost; add during implementation)**

- `quarantine_recommendation` + `boundary_ref` fields on P-1
- Full lint suite for every new contract
- AB-T-07 scoped to load-bearing claims only
- Executable fixtures for all P-1..P-4 dimensions and interaction scenarios
- `MinimumDocumentationTierPolicy` table as configurable

**Considered and declined**

- Gemini discrete `satisfied_downgraded` state for AB-T-02 (D1) -- higher surface area + migration debt than field + flip-rule through existing policy
- AB-T-08 hard `CausalProof` gate -- theater + hallucination risk outweighs; light `RevisionChangeRationale` sufficient
- Any new parallel verdict state or orchestrator for portfolio admission (AB-T-09 correctly scoped as policy-only)
- Direct edits to DOC24/DOC73/DOC20 (correctly kept as OP-A obligations only)
- Any new retention contract for AB-T-17 (correctly re-scoped to reference-aware pin only)

---

**End of Review.** All paste-ready material above is ready for Will to transmit to coding agents or red-team reviewers. Every addition names its exact landing spec + section. No guesswork remains for implementation. The card V2 + this review together constitute an A+ grade enforcement layer spec ready for Stage 7+ build.

*Saved to repo per request. Tracking file (pre-approved per working style).*