# Grok xAI Red-Team Review -- DOC23 Addenda B Test-Set Adjudication Card V2
**Reviewer:** Grok (xAI)
**Date:** 2026-06-03
**Card Version:** V2 (architect-confirmed dispositions from 2026-05-31)
**Inputs:** Card V2 primary; operative Addenda B set @ main (Core R0.7.1, V3.3.1, Common Contracts V1.1.1, etc.); CLAUDE.md current state (Stage 6 E0 open, DOC23 Body system operative); prior reviews summarized in card.
**Style adherence:** Direct, no hedging. Every claim cites card section/item + target operative spec/section. Paste-ready TS/interfaces/lints/fixtures prioritized. No phantom features -- every contract traces to AB-T item, V- verified fact, or P-1..P-6 construct. Comprehensive spec-level solutions.
---
## Executive Assessment
**CONFIRMED:** The six unifying constructs (P-1..P-6) are a high-quality, minimal-surface-area synthesis that correctly folds the Test harvest's net-new enforcement layer into the existing Reporting->Enforcement spine while adding the admission (P-2), durability (P-1 substrate), recovery (P-3), and gate-signal integrity (P-4) axes surfaced by the three reviewers. Dispositions are correct and total. D1 (field-not-state for AB-T-02) is the right architectural call. AB-T-08 hard gate decline is correct. The 8 interaction bugs are comprehensive; no major missed interaction when P-1..P-4 run together. Gemini bugs triaged accurately (BUG-03 correctly redirected; BUG-04 carries the necessary security caveat, which I strengthen below). AB-T-HYDR is net-new and correctly carried as tracked obligation with open placement.
**No phantom features.** All proposed additions (schemas, precedence rules, lints, fixtures) trace directly to:
- AB-T-01..17 findings (Sec. 3)
- V-1..V-5 verified facts (Sec. 1)
- Interaction bugs 1-8 (Sec. 4)
- P-1..P-6 constructs (Sec. 2)
- Gemini BUG-01..04 (Sec. 5)
- AB-T-HYDR (Sec. 6)
The card is already A-grade. This review supplies the missing concrete paste-ready material (tightened/completed schemas with missing fields/enums/edge cases, lints, executable fixtures, precise landing sections, and OP-A obligation contracts) so implementation has zero guesswork.
---
## 1. Correctness of Key Adjudications & Divergences (card Sec. 3, 7)
**D1 -- AB-T-02 field-not-state vs. Gemini discrete `satisfied_downgraded` state: CONFIRMED field-not-state is correct.**
Card Sec. 3 (AB-T-02) + Sec. 7 Q1. The structured `AssuranceExecutionRecord` (or equivalent on P-1 passthrough) + mandatory `FeedbackFindingView` passthrough + per-risk-class flip rule through the *existing* `EvaluationChainResolutionPolicy` achieves the DAG-routable halt without introducing a new enum variant that every consumer of `OutcomeEvaluationState` must immediately handle. Gemini's concern (a boolean/state can be flown past a DAG node) is met by the combination of (a) the structured record carrying the downgrade detail and (b) the flip rule forcing `needs_verification`/`needs_human_judgment` when a *required* basis for the risk class is dropped. Adding a discrete state would bloat the state machine surface area and create migration debt across DOC20/21/22 read-models and any downstream DAG router. Field + policy flip is lower blast radius and reuses existing machinery. Correct call.
**AB-T-08 hard `CausalProof` gate: CONFIRMED decline (keep light `RevisionChangeRationale` only).**
Card Sec. 3 (AB-T-08). Highest theater risk: an LLM asserting a "rigorous causal chain" is not evidence of one existing. The existing `RevisionReviewPacket.finding_to_change_map` + a lightweight `RevisionChangeRationale` field on the revision plan is the correct minimal surface. Hard escalation gate would create false-positive blocks on legitimate revisions and invite prompt-injection gaming of the causal claim. Decline is right; light rationale + existing finding-to-change wiring is sufficient and already partially present.
**AB-T-03 CRITICAL (N:1 ambiguous matter hold): CONFIRMED highest severity + tightened schema is correct.**
Card Sec. 1 V-2 + Sec. 3 (AB-T-03) + Sec. 2 P-1/P-2. Silent privilege-boundary crossing on ambiguous N:1 resolution is malpractice-grade. The proposed `MatterResolution { candidate_matter_refs; top_confidence; separation; status: "resolved"|"ambiguous_hold"|"unresolved_hold" }` is exactly right: resolve only if a single candidate meets the confidence floor, or if two or more candidates exist and their separation meets the threshold (privileged matters use a higher floor); otherwise emit a `PENDING_MATTER_ASSIGNMENT` `ContextBoundaryRef` (P-2), fail closed to privileged, and quarantine. No auto-bind across matter boundary. Correct.
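The resolve-or-hold rule above can be sketched as a small pure function. This is a minimal illustration, not card text: `resolveMatter`, `MatterThresholds`, and the 0.3 separation threshold are hypothetical, and treating a below-floor top candidate as `unresolved_hold` is an assumption. The 0.92 privileged floor is the value this review proposes under `OBL-DOC24-MATTER-CONF-01`.

```typescript
// Sketch only: thresholds and below-floor handling are assumptions, not card text.
type MatterResolutionStatus = "resolved" | "ambiguous_hold" | "unresolved_hold";

interface MatterResolution {
  candidate_matter_refs: string[];
  top_confidence: number;
  separation: number;
  status: MatterResolutionStatus;
}

interface MatterThresholds {
  confidence_floor: number;     // privileged matters use a higher floor
  separation_threshold: number; // hypothetical minimum separation between candidates
}

function resolveMatter(
  candidates: string[],
  topConfidence: number,
  separation: number,
  thresholds: MatterThresholds
): MatterResolution {
  let status: MatterResolutionStatus;
  if (candidates.length === 0 || topConfidence < thresholds.confidence_floor) {
    // Assumption: nothing strong enough to bind to, so hold as unresolved.
    status = "unresolved_hold";
  } else if (candidates.length === 1) {
    status = "resolved";
  } else {
    // Two or more candidates: bind only when they are clearly separated.
    status = separation >= thresholds.separation_threshold ? "resolved" : "ambiguous_hold";
  }
  return { candidate_matter_refs: candidates, top_confidence: topConfidence, separation, status };
}

const privilegedThresholds: MatterThresholds = { confidence_floor: 0.92, separation_threshold: 0.3 };
const twoClose = resolveMatter(["matter-A", "matter-B"], 0.95, 0.05, privilegedThresholds);
console.log(twoClose.status); // ambiguous_hold
```

Any status other than `resolved` is what would trigger the `PENDING_MATTER_ASSIGNMENT` boundary and the fail-closed quarantine path.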
All other ADOPT / re-scope / DISCUSS dispositions are correct per the V-3/V-4/V-5 verification in card Sec. 1 and the re-scope logic in Sec. 3: AB-T-01, 02, 05, 07, 09 (scoped), 10, 11, 12, 13 (split), 14 (re-scoped), 15 (detection-only), 16, and 17 (pin-only); AB-T-04 fold-in; AB-T-06 runtime-only; AB-T-08 stays DISCUSS. No over-adoption.
---
## 2. Paste-Ready Spec Content -- Completed/Tightened Schemas, Lints, Fixtures (Primary Directive)
Per P-1..P-6 and every adopted AB-T item: full contract + lints + ≥1 executable fixture. Target operative spec + section named for each addition. Where card already gave schema, I complete missing fields, enums, edge cases, and provenance constraints (P-4).
### P-1 `CleanVerdictEligibility` (Verdict-Honesty object) -- unifies AB-T-01/02/03/05 + substrate staleness (V-3)
**Type: BETTER_IDEA (card Sec. 2). Disposition: ADOPT.**
**Landing:** Add interface + predicate + precedence function to `Current Specs/DOC23/DOC23 Addenda B/DOC23_EVALUATION_COMMON_CONTRACTS_V1_1_1.md` §4 (new "Verdict Honesty Layer"). Wire consumption into `DOC23_ADDB_OUTCOME_EVALUATOR_REVISOR_V3_3_1.md` §7.4 (Evaluation Chain Resolution Policy) and Core R0.7.1 §11.21 (revalidation cascade -- MUST recompute). Also surface via `FeedbackFindingView` passthrough (card already requires). Cross-ref Core §13A.3 for matter_resolution dimension.
**Tightened/Completed Schema (added: risk_class, quarantine_recommendation, disclosure_log_ref, full DimensionStatus, AdequatelyGroundedPredicate, provenance on every dimension):**
```ts
// Current Specs/DOC23/DOC23 Addenda B/DOC23_EVALUATION_COMMON_CONTRACTS_V1_1_1.md §4
export type VerdictHonestyDimension =
  | "affirmative_grounding" // AB-T-01: every in-scope factual claim has affirmative support
  | "assurance_floor" // AB-T-02: executed_assurance >= target (incl. quorum for risk class)
  | "matter_resolution" // AB-T-03: resolved to exactly one matter (no *_hold)
  | "source_documentation_tier" // AB-T-05: load-bearing claim's source meets MinimumDocumentationTierPolicy[risk_class]
  | "substrate_freshness"; // V-3(b/c): consumed findings / TKP / sources not stale per AB-T-15 policy
export type DimensionStatus =
  | "met"
  | "downgraded_disclosed" // policy-accepted downgrade + carried through FeedbackFindingView passthrough
  | "unmet"
  | "pending_revalidation"; // transient during Sec. 11.21 cascade
export type RiskClass =
  | "filing_bound"
  | "privileged_matter"
  | "evaluator_load_bearing"
  | "ordinary";
export type PresentationStatus =
  | "clean"
  | "clean_with_disclosure"
  | "needs_verification"
  | "needs_human_judgment";
// P-4. Same type as the §2 (Gate Primitives) definition; declare it once if both land in the same file.
export type GateSignalProvenance = "deterministic" | "independent_check" | "self_reported";
export interface CleanVerdictEligibility {
  outcome_ref: string; // EvaluationOutcome.id this eligibility describes
  dimensions: Array<{
    dimension: VerdictHonestyDimension;
    status: DimensionStatus;
    detail_ref?: string; // claim_id | SourceRetrievalOutcome.id | TKP.id | etc.
    signal_provenance: GateSignalProvenance; // P-4: MUST be deterministic or independent_check for high-stakes
  }>;
  clean_verdict_allowed: boolean; // false if ANY non-degradable unmet per precedence
  presentation_status: PresentationStatus;
  risk_class: RiskClass; // drives non-degradable set + human-judgment threshold
  quarantine_recommendation: boolean; // derived: true if presentation_status !== "clean" -- feeds P-5 directly
  precedence_rule_id: "P1-2026-05-31-v1"; // audit trail
  recomputed_at: string; // ISO-8601; MUST be updated on every Sec. 11.21 revalidation; never carried stale
  disclosure_log_ref?: string; // when clean_with_disclosure: ref to the FeedbackFindingView entry that carried the downgrade facts
}
// Computed predicate (AB-T-01 ^ AB-T-05 composition -- interaction bug 1 fix)
export interface AdequatelyGroundedPredicate {
  claim_id: string;
  support_status: "supported" | "unsupported" | "partial";
  source_tier: number; // from SourceRecord or EvidencePackage
  meets_minimum: boolean; // source_tier >= MinimumDocumentationTierPolicy[risk_class]
}
// Precedence rule (total -- no unhandled combination; card Sec. 2 + interaction bugs 2/8)
// Evaluate in order; first non-degradable unmet wins. Degradable only becomes clean_with_disclosure if disclosed.
export function applyPrecedence(
  dimensions: CleanVerdictEligibility['dimensions'],
  riskClass: RiskClass
): PresentationStatus {
  // matter_resolution is never degradable; affirmative_grounding is non-degradable for filing-bound and privileged work.
  const isNonDegradable = (dim: VerdictHonestyDimension): boolean => {
    if (dim === "matter_resolution") return true;
    if (riskClass === "filing_bound" || riskClass === "privileged_matter") return dim === "affirmative_grounding";
    return false;
  };
  const firstNonDegradableUnmet = dimensions.find(d => d.status === "unmet" && isNonDegradable(d.dimension));
  if (firstNonDegradableUnmet) {
    return firstNonDegradableUnmet.dimension === "matter_resolution" ? "needs_human_judgment" : "needs_verification";
  }
  // A degradable dimension that is unmet without disclosure, or still pending revalidation, must not read as clean.
  // Mapping both cases to needs_verification is an assumption; the card only fixes the non-degradable cases.
  const hasUndisclosedOrPending = dimensions.some(d => d.status === "unmet" || d.status === "pending_revalidation");
  if (hasUndisclosedOrPending) return "needs_verification";
  const hasDisclosedDowngrade = dimensions.some(d => d.status === "downgraded_disclosed");
  return hasDisclosedDowngrade ? "clean_with_disclosure" : "clean";
}
```
**Lints (add to Outcome Evaluator lint suite in V3.3.1 §7 and Core §11.21):**
- `verdict.clean_verdict_allowed_true_with_unmet_non_degradable_dimension`
- `verdict.downgrade_not_passed_through_to_feedback_view`
- `verdict.eligibility_carried_stale_after_revalidation` (recomputed_at older than last reval timestamp)
- `verdict.precedence_rule_not_applied_or_wrong_order`
- `source.load_bearing_claim_supported_by_subminimum_tier` (AB-T-05 enforcement at USE time)
- `verdict.quarantine_recommendation_mismatch_presentation_status`
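Two of the lints above are mechanical enough to sketch directly. The following is a minimal illustration, assuming a trimmed-down `EligibilitySnapshot` shape and a hypothetical `lintEligibility` helper; only the lint ids and the derivation rule for `quarantine_recommendation` come from this review.

```typescript
// Sketch only: EligibilitySnapshot and lintEligibility are hypothetical names.
type PresentationStatus =
  | "clean"
  | "clean_with_disclosure"
  | "needs_verification"
  | "needs_human_judgment";

interface EligibilitySnapshot {
  presentation_status: PresentationStatus;
  quarantine_recommendation: boolean;
  recomputed_at: string; // ISO-8601
}

function lintEligibility(e: EligibilitySnapshot, lastRevalidationAt: string): string[] {
  const violations: string[] = [];
  // quarantine_recommendation is derived: true whenever the status is not "clean".
  const expectedQuarantine = e.presentation_status !== "clean";
  if (e.quarantine_recommendation !== expectedQuarantine) {
    violations.push("verdict.quarantine_recommendation_mismatch_presentation_status");
  }
  // Eligibility must be recomputed on every revalidation, never carried stale.
  if (Date.parse(e.recomputed_at) < Date.parse(lastRevalidationAt)) {
    violations.push("verdict.eligibility_carried_stale_after_revalidation");
  }
  return violations;
}

const staleAndMismatched = lintEligibility(
  {
    presentation_status: "needs_verification",
    quarantine_recommendation: false,
    recomputed_at: "2026-06-01T00:00:00Z",
  },
  "2026-06-02T00:00:00Z"
);
console.log(staleAndMismatched.length); // 2
```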
**Executable Fixture (P1-Precedence-MultiUnmet-Privileged -- paste into test harness):**
```ts
// Fixture: privileged matter, affirmative_grounding=unmet, assurance_floor=downgraded_disclosed, matter=met
// Expected: grounding dominates -> needs_verification, clean_verdict_allowed=false, quarantine=true
// assertEquals comes from the test harness; computeCleanVerdictEligibility is implemented in the evaluator.
const mockOutcome = {
  risk_class: "privileged_matter" as const,
  claims: [{ id: "c1", support_status: "unsupported", source_tier: 0 }],
  assurance: { target_basis: [/* ... */], executed_basis: [/* ... missing required for class */] },
  matter_resolution: { status: "resolved" }
};
const eligibility = computeCleanVerdictEligibility(mockOutcome); // impl in evaluator
assertEquals(eligibility.presentation_status, "needs_verification");
assertEquals(eligibility.clean_verdict_allowed, false);
assertEquals(eligibility.quarantine_recommendation, true);
const groundingDim = eligibility.dimensions.find(d => d.dimension === "affirmative_grounding");
assertEquals(groundingDim!.status, "unmet");
// P-4: either independent_check or deterministic is acceptable here; self_reported is not
assertEquals(groundingDim!.signal_provenance !== "self_reported", true);
// assurance downgrade is ignored because non-degradable grounding unmet wins
```
---
### P-2 `ContextBoundaryRef` (canonical isolation primitive) -- unifies AB-T-09/10/11
**Type: BETTER_IDEA (card Sec. 2). Disposition: ADOPT.**
**Landing:** Add to `Current Specs/DOC23/DOC23 Addenda B/DOC23_ADDENDA_B_CORE_R0_7_1.md` §13A (new subsection 13A.10 "Canonical Boundary Primitive" or extend existing isolation invariants in 13A.2/13A.3). Also wire into `DOC23_ADDB_SUBSYS_TASK_FORUM_RUN_BOARD_V1_0_1.md` (append-log partition) and learning sites (Sec. 9A / 16.6.5). P-1 `matter_resolution` hold emits `PENDING_MATTER_ASSIGNMENT` instance.
**Tightened Schema (added: computed is_cross_matter, privilege constraint enforcement):**
```ts
// Current Specs/DOC23/DOC23 Addenda B/DOC23_ADDENDA_B_CORE_R0_7_1.md §13A.10
export interface ContextBoundaryRef {
  boundary_id: string; // stable uuid or matter-derived key
  matter_ref?: string; // dominant; absent only for explicit cross-matter/global scope or a PENDING_MATTER_ASSIGNMENT hold
  context_class_key?: string; // subordinate sub-key WITHIN the matter (never used for cross-matter isolation)
  scope_kind: "matter" | "work_product" | "library" | "source_set" | "global";
  privilege_class?: "privileged" | "work_product" | "ordinary";
  is_cross_matter: boolean; // computed: true if matter_ref absent and scope_kind allows cross
}
// Factory for limbo (AB-T-03 hold + interaction bug 5)
export function createPendingMatterAssignmentBoundary(tentative_refs: string[]): ContextBoundaryRef {
  return {
    // sorted so the same candidate set always yields the same boundary_id
    boundary_id: `pending-${[...tentative_refs].sort().join('-')}`,
    matter_ref: undefined, // unassigned while the hold is open
    scope_kind: "matter",
    privilege_class: "privileged", // fail-closed
    is_cross_matter: false
  };
}
```
**Lints:**
- `boundary.matter_scoped_mechanism_keyed_without_matter_ref`
- `learning.sec9a_signal_crosses_matter_without_boundary_ref` (Sec. 9A learning must key under matter_ref)
- `appendlog.privileged_streams_share_physical_file` (one physical stream per boundary)
**Fixture:** Scenario with suggestion rejected on 3 matters does not suppress on other 37 matters (P-2 + AB-T-11).
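The fixture scenario can be sketched with a suppression store keyed by `matter_ref`. This is a minimal illustration under stated assumptions: `MatterScopedSuppressions`, the matter ids, and the suggestion id are hypothetical, and the real learning sites would key on a full `ContextBoundaryRef` rather than a bare string.

```typescript
// Sketch only: class name and ids are hypothetical.
class MatterScopedSuppressions {
  // matter_ref -> ids of suggestions rejected within that matter
  private readonly rejected = new Map<string, Set<string>>();

  reject(matterRef: string, suggestionId: string): void {
    const forMatter = this.rejected.get(matterRef) ?? new Set<string>();
    forMatter.add(suggestionId);
    this.rejected.set(matterRef, forMatter);
  }

  isSuppressed(matterRef: string, suggestionId: string): boolean {
    return this.rejected.get(matterRef)?.has(suggestionId) ?? false;
  }
}

const matters = Array.from({ length: 40 }, (_, i) => `matter-${i + 1}`);
const suppressions = new MatterScopedSuppressions();
for (const m of matters.slice(0, 3)) {
  suppressions.reject(m, "suggestion-42");
}
const suppressedCount = matters.filter(m => suppressions.isSuppressed(m, "suggestion-42")).length;
console.log(suppressedCount); // 3
```

Because the rejection signal never leaves its matter key, the other 37 matters still see the suggestion.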
---
### P-3 `RecoveryPolicyRegistry` -- unifies AB-T-13/14
**Type: BETTER_IDEA (card Sec. 2). Disposition: ADOPT (reuses existing task_agent_fallback_policy shape).**
**Landing:** `Current Specs/DOC23/DOC23 Addenda B/DOC23_ADDB_OUTCOME_EVALUATOR_REVISOR_V3_3_1.md` §6.9 (extend existing fallback) + Core R0.7.1 §11.17 (write-failure taxonomy extension for durable_store_exhausted).
**Tightened Schema (privilege-reclass re-taint cascade is the high-severity net-new):**
```ts
// V3.3.1 §6.9 + Core §11.17
export type RecoveryTrigger =
  | "malformed_loadbearing_eval_output" // AB-T-14
  | "durable_store_exhausted" // AB-T-13 storage half (taxonomy extension)
  | "mid_run_privilege_reclassification" // AB-T-13 high-severity net-new (re-taint cascade)
  | "tool_or_model_unavailable";
export interface RecoveryPolicy {
  trigger: RecoveryTrigger;
  strategy:
    | "retry_alternate_model"
    | "cheap_fallback"
    | "mark_indeterminate"
    | "escalate_human"
    | "fail_closed_write"
    | "pause_and_retaint";
  retaint_emitted_artifacts: boolean; // true for privilege-reclass: cascade re-taint over already-produced artifacts under old classification
  rollback_side_effects: boolean;
  records_to: "ModuleDecisionRationale" | "HardCallResolutionLedger";
}
// Registry lookup (deterministic); signature only, body elided in this review
export declare function getRecoveryPolicy(trigger: RecoveryTrigger, riskClass: RiskClass): RecoveryPolicy;
```
**Lints:** `recovery.privilege_reclass_without_retaint_cascade`; `recovery.durable_store_exhausted_not_fail_closed`.
**Fixture:** Mid-run privilege reclassification on a privileged-matter outcome triggers retaint of already-emitted artifacts + records to HardCallResolutionLedger.
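A deterministic registry lookup could look roughly like the sketch below. Only two mappings are grounded in this review: privilege reclassification must re-taint and record to `HardCallResolutionLedger` (fixture above), and `durable_store_exhausted` must fail closed (lint above). Every other row, the `rollback_side_effects` values, the high-stakes override, and the name `lookupRecoveryPolicy` are assumptions for illustration.

```typescript
// Sketch only: table contents beyond the two grounded rows are hypothetical.
type RiskClass = "filing_bound" | "privileged_matter" | "evaluator_load_bearing" | "ordinary";
type RecoveryTrigger =
  | "malformed_loadbearing_eval_output"
  | "durable_store_exhausted"
  | "mid_run_privilege_reclassification"
  | "tool_or_model_unavailable";

interface RecoveryPolicy {
  trigger: RecoveryTrigger;
  strategy:
    | "retry_alternate_model" | "cheap_fallback" | "mark_indeterminate"
    | "escalate_human" | "fail_closed_write" | "pause_and_retaint";
  retaint_emitted_artifacts: boolean;
  rollback_side_effects: boolean;
  records_to: "ModuleDecisionRationale" | "HardCallResolutionLedger";
}

const RECOVERY_REGISTRY: Record<RecoveryTrigger, RecoveryPolicy> = {
  malformed_loadbearing_eval_output: {
    trigger: "malformed_loadbearing_eval_output", strategy: "retry_alternate_model",
    retaint_emitted_artifacts: false, rollback_side_effects: false, records_to: "ModuleDecisionRationale",
  },
  durable_store_exhausted: {
    trigger: "durable_store_exhausted", strategy: "fail_closed_write",
    retaint_emitted_artifacts: false, rollback_side_effects: true, records_to: "HardCallResolutionLedger",
  },
  mid_run_privilege_reclassification: {
    trigger: "mid_run_privilege_reclassification", strategy: "pause_and_retaint",
    retaint_emitted_artifacts: true, rollback_side_effects: false, records_to: "HardCallResolutionLedger",
  },
  tool_or_model_unavailable: {
    trigger: "tool_or_model_unavailable", strategy: "cheap_fallback",
    retaint_emitted_artifacts: false, rollback_side_effects: false, records_to: "ModuleDecisionRationale",
  },
};

function lookupRecoveryPolicy(trigger: RecoveryTrigger, riskClass: RiskClass): RecoveryPolicy {
  const base = RECOVERY_REGISTRY[trigger];
  // Assumption: high-stakes classes never take the cheap fallback; they escalate instead.
  if (riskClass !== "ordinary" && base.strategy === "cheap_fallback") {
    return { ...base, strategy: "escalate_human", records_to: "HardCallResolutionLedger" };
  }
  return base;
}

const reclass = lookupRecoveryPolicy("mid_run_privilege_reclassification", "privileged_matter");
console.log(reclass.strategy, reclass.retaint_emitted_artifacts); // pause_and_retaint true
```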
---
### P-4 Gate-signal integrity (cross-cutting precondition)
**Type: BETTER_IDEA (card Sec. 2 -- "the single most important addition"). Disposition: ADOPT.**
**Landing:** Add enum + rule to `DOC23_EVALUATION_COMMON_CONTRACTS_V1_1_1.md` §2 (Gate Primitives). Enforce in evaluator (V3.3.1 §7) and any high-stakes gate (AB-T-02/05 triggers). Cross-ref P-1 signal_provenance field.
**Tightened Rule + Implementation Note:**
```ts
// Common Contracts §2
export type GateSignalProvenance = "deterministic" | "independent_check" | "self_reported";
// Rule (no_self_certified_bypass)
// For risk_class in {filing_bound, privileged_matter, evaluator_load_bearing}:
// dimension may be "met" or "downgraded_disclosed" ONLY if signal_provenance ≠ "self_reported"
```
**Concrete consequences (paste-ready):**
1. `executed_assurance_basis` (AB-T-02) MUST be *derived by the system* from the actual `EvaluationExecutionTrace` / `AssuranceBasis` execution records (which models/voices/checks actually ran). Never accepted from the evaluating model's output claim field.
2. `risk_class` (AB-T-05) for high-stakes MUST come from deterministic matter classification or task-type registry, never from the producing model.
**Lints:** `gate.high_stakes_dimension_met_on_self_reported_signal`; `assurance.executed_basis_model_claimed_not_trace_derived`.
**Fixture:** High-stakes outcome with self_reported assurance_basis -> dimension stays unmet even if model claims met.
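The `no_self_certified_bypass` rule and the fixture above can be sketched as a single gate function. This is a minimal illustration: `DimensionSignal` and `enforceNoSelfCertifiedBypass` are hypothetical names, and forcing the status to `unmet` (rather than some other non-passing status) follows the fixture wording.

```typescript
// Sketch only: helper and interface names are hypothetical.
type GateSignalProvenance = "deterministic" | "independent_check" | "self_reported";
type RiskClass = "filing_bound" | "privileged_matter" | "evaluator_load_bearing" | "ordinary";
type DimensionStatus = "met" | "downgraded_disclosed" | "unmet" | "pending_revalidation";

interface DimensionSignal {
  status: DimensionStatus;
  signal_provenance: GateSignalProvenance;
}

const HIGH_STAKES: ReadonlySet<RiskClass> = new Set<RiskClass>([
  "filing_bound",
  "privileged_matter",
  "evaluator_load_bearing",
]);

function enforceNoSelfCertifiedBypass(signal: DimensionSignal, riskClass: RiskClass): DimensionSignal {
  const passing = signal.status === "met" || signal.status === "downgraded_disclosed";
  if (HIGH_STAKES.has(riskClass) && passing && signal.signal_provenance === "self_reported") {
    // The model's own claim cannot satisfy a high-stakes dimension.
    return { ...signal, status: "unmet" };
  }
  return signal;
}

const gated = enforceNoSelfCertifiedBypass(
  { status: "met", signal_provenance: "self_reported" },
  "privileged_matter"
);
console.log(gated.status); // unmet
```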
---
### P-5 Evidentiary Quarantine -> DOC73 (OP-A)
**Type: BETTER_IDEA (card Sec. 2). Disposition: ADOPT as OP-A only (no direct edit).**
**Landing:** `OBL-DOC73-QUARANTINE-01` (see Sec. 7 below). `quarantine_recommendation` from P-1 feeds directly; no new contract needed.
---
### P-6 `EnforcementBadge` read-model -> DOC20 (OP-A)
**Type: BETTER_IDEA (card Sec. 2). Disposition: ADOPT as OP-A only.**
**Landing:** `OBL-DOC20-ENFORCEMENT-BADGE-01`. Pure projection of P-1; headline + click-through `EnforcementCase` grouping dimensions + provenance.
---
### AB-T-01/02/03/05/07 (roll into P-1) + AB-T-04/06/12/13/14/15/16/17 (standalone or via P-2/P-3)
All adopted items have schemas/lints/fixtures above or in card. No additional net-new contracts required beyond the P-1..P-4 completions. AB-T-07 `CitationManifest` scoped to final/filing/load-bearing factual text only (correct); must bind to `SourceRecord` that will live in future `EvidencePackage` (co-lands with P-1). AB-T-15 freshness detection policy cross-refs existing `OBL-DOC24-CTXPKT-01` (do not duplicate). AB-T-17 pin is `SnapshotReferencePin` carrying `ContextBoundaryRef` + privilege constraint (net-new is the "while referenced by live reliance/dirty/replay" override of default retention).
---
## 3. Completeness / Missing Wiring / Phantom Check
**CONFIRMED no phantom features.** Every field, lint, fixture, and OP-A obligation traces to an AB-T item or V- fact.
**GAP (minor, low cost):** P-1 `CleanVerdictEligibility` should carry `boundary_ref: ContextBoundaryRef` (for matter-limbo AB-T-03 holds and direct feed to P-5 quarantine). This eliminates extra wiring for interaction bug 5.
**SUGGESTION:** Add `quarantine_recommendation` (already in my tightened P-1) as the single derived signal P-5 consumes -- no new event/contract.
**Interaction bug coverage:** The 8 bugs (card Sec. 4) are comprehensive. P-1 precedence + single object + P-2 boundary + P-4 provenance resolve all of them. One small reinforcement: stale eligibility after revalidation (bug 8) must also re-evaluate the `substrate_freshness` dimension using the new AB-T-15 detection policy (card already requires recompute on 11.21; this just makes the dimension explicit).
**No missing contracts** for the adopted set. The card correctly scoped AB-T-09 as admission *policy* only (never orchestrator) -- 2.2-safe.
---
## 4. Six Packages Soundness
- **P-1:** Unifies AB-T-01/02/03/05 + staleness correctly. Precedence rule is total (first non-degradable unmet wins; degradable only if disclosed). No unhandled combination.
- **P-2:** Correctly makes matter dominant and `context_class_key` subordinate. Fixes the two-isolation-units inconsistency (AB-T-11) between Sec. 9A learning and Sec. 16.6.5 pattern promotion. PENDING_MATTER_ASSIGNMENT for limbo is the right home for interaction bug 5.
- **P-3:** Sound reuse of existing fallback shape. Privilege-reclass re-taint cascade is the genuine high-severity net-new.
- **P-4:** The single most important addition (card). Self-reported ban for high-stakes dimensions is the enforcement spine that makes the rest non-gameable.
- **P-5/P-6:** Correctly kept as OP-A (cross-doc, flattening-safe). No direct edits.
All consolidations are sound and minimal.
---
## 5. Interaction Bugs (Sec. 4) + Gemini Bugs (Sec. 5)
**Interaction bugs 1-8: All fixes correct and complete via P-1..P-4.** No major interaction is missed. The P-1 single object + precedence eliminates the verdict-precedence conflict (bug 2), unifies quorum/assurance (bug 3), and bounds verification spin (bug 7). Bug 1 (AB-T-01 ^ 05) is fixed by `AdequatelyGroundedPredicate`. Bug 4 (budget rollback) is handled by preserving the evidence layer on rollback. Bug 5 (matter-limbo) is resolved by the P-2 pending boundary. Bug 6 (quorum waiver) requires a `HumanGateDecisionRecord` with `quorum_waived: true` (card + AB-T-04). Bug 8 (stale) is fixed by mandatory recompute.
**Gemini BUG-04 (syntactic-taint deadlock) -- triage CONFIRMED with strengthened caveat.**
Card Sec. 5. The P-4 provenance caveat is necessary. It is not yet sufficient. Strengthen: the classification of a tool/run as `deterministic_mechanical` (the bifurcation point) must itself be P-4-clean -- i.e., derived from a deterministic capability manifest (DOC25) or independent registry, never self-reported by the producing model or a mechanical formatter. Otherwise the relaxation path becomes exactly the injection-laundering vector feared. Route the full taint-model discussion (including this constraint) to the taint owner (DOC25 or post-flatten DOC82) with the P-4 requirement explicit. DISCUSS status with security caveat is correct; do not adopt without the strengthened guard.
BUG-01 (quadratic token) and BUG-02 (stochastic idempotency) correctly identified as net-new and routed (V3.3.1 revision loop and §11.8 respectively). BUG-03 correctly redirected to existing `OBL-DOC24-CTXPKT-01`.
---
## 6. AB-T-HYDR Placement + Schema (Sec. 6)
**Recommendation: BETTER_IDEA -- extend existing `OBL-DOC24-CTXPKT-01` (already covers freshness + fidelity) and add concrete schema subsection to Core R0.7.1 §13A.9 (new) "Instruction Precedence Resolution at Task Start".**
Rationale: Sec. 13A already owns the task-context source set (13A.3), sealed pass (13A.7), and attachments/ingestion evidence (13A.4). The precedence order among non-factual/non-budget conflicts belongs next to that machinery. DOC24 (packet assembly owner per CLAUDE.md) implements the de-conflict via the OP-A row. Keeps flattening-safe (no direct DOC24 edit).
**Proposed schema (paste-ready, add to Core R0.7.1 §13A.9):**
```ts
// Current Specs/DOC23/DOC23 Addenda B/DOC23_ADDENDA_B_CORE_R0_7_1.md §13A.9
export type InstructionSourceClass =
  | "user_directive" // highest -- safety/policy non-droppable
  | "task_initial_input"
  | "local_blueprint_guidance" // RunGuidanceItems etc.
  | "matter_scope_policy"
  | "global_learned_pattern";
export interface InstructionPrecedencePolicy {
  order: InstructionSourceClass[]; // defaults to the declaration order above
  conflict_strategy: "mask_lower" | "merge_with_flag" | "escalate_to_human";
  provenance_must_be_clean: boolean; // P-4: for high-stakes/privileged/filing, class label must be trace-derived
}
export interface InstructionConflictRecord {
  task_ref: string;
  conflicting_sources: Array<{ class: InstructionSourceClass; content_ref: string }>;
  resolution: "masked" | "merged" | "escalated";
  recorded_in_manifest: boolean; // Sec. 38 manifest
}
```
Precedence order is correct. Safety non-droppable invariant is right. Large-attachment handling (front-load vs lazy via Source Workspace) remains open, as the card notes, and falls under the P-2 boundary.
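For the `mask_lower` strategy, resolution reduces to ranking the conflicting sources by the precedence order. The sketch below is illustrative only: `resolveByMaskLower`, `InstructionSource`, and the content refs are hypothetical, and ranking a class that is missing from a custom order as lowest is an assumption.

```typescript
// Sketch only: function name, content refs, and missing-class handling are hypothetical.
type InstructionSourceClass =
  | "user_directive"
  | "task_initial_input"
  | "local_blueprint_guidance"
  | "matter_scope_policy"
  | "global_learned_pattern";

interface InstructionSource {
  class: InstructionSourceClass;
  content_ref: string;
}

const DEFAULT_ORDER: InstructionSourceClass[] = [
  "user_directive",
  "task_initial_input",
  "local_blueprint_guidance",
  "matter_scope_policy",
  "global_learned_pattern",
];

function resolveByMaskLower(
  conflicting: InstructionSource[],
  order: InstructionSourceClass[] = DEFAULT_ORDER
): { winner: InstructionSource; masked: InstructionSource[] } {
  const rank = (s: InstructionSource): number => {
    const i = order.indexOf(s.class);
    return i === -1 ? order.length : i; // assumption: unlisted classes rank lowest
  };
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...conflicting].sort((a, b) => rank(a) - rank(b));
  const winner = sorted[0];
  if (winner === undefined) {
    throw new Error("no conflicting sources to resolve");
  }
  return { winner, masked: sorted.slice(1) };
}

const resolved = resolveByMaskLower([
  { class: "global_learned_pattern", content_ref: "pattern-17" },
  { class: "user_directive", content_ref: "directive-3" },
  { class: "matter_scope_policy", content_ref: "policy-9" },
]);
console.log(resolved.winner.content_ref); // directive-3
```

The `winner` and `masked` lists map directly onto an `InstructionConflictRecord` with `resolution: "masked"`.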
**OP-A candidate (already in card Sec. 9):** `OBL-DOC24-TASKCTX-PRECEDENCE-01` extends existing CTXPKT row with the above policy implementation during sealed assembly.
---
## 7. OP-A Obligation Contracts (card Sec. 9)
Propose the concrete schema/behavior each owner must add (nothing written to DOC24/DOC73/DOC20 here -- flattening-safe).
**OBL-DOC24-MATTER-CONF-01 (AB-T-03):**
DOC24 supplies confidence/separation inputs for `MatterResolution`. Contract:
```ts
interface MatterResolutionInputs {
  candidate_refs: EntityRef[];
  context: TaskRunContextPacket;
}
interface MatterResolutionOutput {
  top_confidence: number;
  separation: number; // e.g. 1 - max pairwise embedding similarity
  recommended_status: "resolved" | "ambiguous_hold" | "unresolved_hold";
}
// Privileged floor 0.92, ordinary 0.75 (configurable per matter class)
```
**OBL-DOC24-CTXPKT-01 (existing, extended):**
Add `TaskKnowledgePackFreshnessPolicy` (TTL + invalidation triggers) feeding P-1 `substrate_freshness`. Add `ContextSequenceLock` (BUG-03). For AB-T-HYDR: implement `InstructionPrecedencePolicy` resolution + record `InstructionConflictRecord` in Sec. 38 manifest. Enforce `provenance_must_be_clean` for high-stakes.
**OBL-DOC73-QUARANTINE-01 (P-5):**
Extraction/promotion pipeline must read `quarantine_recommendation` (or `presentation_status !== "clean"`) on `Artifact`/`EvaluationOutcome` before Library/Corpus promotion. Require `WorkProductCertification` (human sign-off + `ContextBoundaryRef` + justification) to release. Carry boundary for privilege audit.
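The promotion check described above could be sketched as follows. This is a minimal illustration for the DOC73 owner, not a contract: the field names on `WorkProductCertification` and the helper `mayPromoteToLibrary` are hypothetical, and treating an empty sign-off or justification as invalid is an assumption.

```typescript
// Sketch only: certification field names and the helper are hypothetical.
type PresentationStatus =
  | "clean"
  | "clean_with_disclosure"
  | "needs_verification"
  | "needs_human_judgment";

interface PromotionCandidate {
  artifact_id: string;
  presentation_status: PresentationStatus;
  quarantine_recommendation: boolean;
}

interface WorkProductCertification {
  signed_off_by: string; // human reviewer
  boundary_id: string;   // ContextBoundaryRef.boundary_id carried for privilege audit
  justification: string;
}

function mayPromoteToLibrary(
  candidate: PromotionCandidate,
  certification?: WorkProductCertification
): boolean {
  const quarantined =
    candidate.quarantine_recommendation || candidate.presentation_status !== "clean";
  if (!quarantined) return true;
  // Quarantined work needs an explicit human release before promotion.
  return (
    certification !== undefined &&
    certification.signed_off_by.trim() !== "" &&
    certification.justification.trim() !== ""
  );
}

const heldArtifact: PromotionCandidate = {
  artifact_id: "artifact-1",
  presentation_status: "needs_verification",
  quarantine_recommendation: true,
};
console.log(mayPromoteToLibrary(heldArtifact)); // false
```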
**OBL-DOC20-ENFORCEMENT-BADGE-01 (P-6):**
Read-model computes `EnforcementBadge` + click-through `EnforcementCase` purely from `CleanVerdictEligibility` (or its FeedbackFindingView projection). Headline example: "Needs verification -- 3 of 47 unproven, 1 source unavailable, matter pending assignment". No new semantics invented.
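As a rough illustration of the "pure projection" constraint, the headline can be assembled from the presentation status plus per-dimension notes. Everything here is assumed for the sketch: the `note` field, the label table, and `buildBadgeHeadline` are hypothetical and would be defined by the DOC20 owner.

```typescript
// Sketch only: note field, labels, and helper name are hypothetical.
type PresentationStatus =
  | "clean"
  | "clean_with_disclosure"
  | "needs_verification"
  | "needs_human_judgment";

interface BadgeDimension {
  dimension: string;
  status: "met" | "downgraded_disclosed" | "unmet" | "pending_revalidation";
  note?: string; // short human-readable detail for the headline
}

const HEADLINE_LABEL: Record<PresentationStatus, string> = {
  clean: "Clean",
  clean_with_disclosure: "Clean with disclosure",
  needs_verification: "Needs verification",
  needs_human_judgment: "Needs human judgment",
};

function buildBadgeHeadline(status: PresentationStatus, dimensions: BadgeDimension[]): string {
  // Only dimensions that are not fully met contribute detail to the headline.
  const notes = dimensions
    .filter(d => d.status !== "met" && d.note !== undefined)
    .map(d => d.note as string);
  const label = HEADLINE_LABEL[status];
  return notes.length > 0 ? `${label} -- ${notes.join(", ")}` : label;
}

const badgeHeadline = buildBadgeHeadline("needs_verification", [
  { dimension: "affirmative_grounding", status: "unmet", note: "3 of 47 unproven" },
  { dimension: "assurance_floor", status: "met" },
  { dimension: "source_documentation_tier", status: "unmet", note: "1 source unavailable" },
]);
console.log(badgeHeadline); // Needs verification -- 3 of 47 unproven, 1 source unavailable
```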
**OBL-DOC24-TASKCTX-PRECEDENCE-01 (AB-T-HYDR):**
Implement precedence order + conflict strategy in context packet assembly. Record conflicts. Enforce P-4 provenance constraint on class labels for high-stakes outcomes.
---
## 8. BETTER_IDEA / Additional Improvements
**BETTER_IDEA:** Make `CleanVerdictEligibility` the single source of truth for `presentation_status` and the `needs_verification`/`needs_human_judgment` signals. Over time, deprecate parallel flags in `OutcomeEvaluationState` to eliminate drift risk between P-1 and the state enum.
**BETTER_IDEA:** Add `boundary_ref: ContextBoundaryRef` to `CleanVerdictEligibility` (as noted in completeness section). Enables automatic limbo ownership and direct P-5 feed without extra seams.
**SUGGESTION (low cost):** Scope AB-T-07 `CitationManifest` emission to `load_bearing` claims only (those that affect `affirmative_grounding` dimension). Non-load-bearing citations remain advisory.
**SUGGESTION:** Add a `MinimumDocumentationTierPolicy` lookup table (risk_class -> min_tier) as a first-class configurable in Common Contracts (used by P-1 predicate and AB-T-05 lint).
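A lookup table of this kind, composed with the P-1 predicate, might look like the sketch below. The tier numbers are placeholders, not proposed values; `evaluateGrounding` and the `adequately_grounded` field are hypothetical. The only grounded rule is the comparison `source_tier >= MinimumDocumentationTierPolicy[risk_class]` from the P-1 schema.

```typescript
// Sketch only: tier values are placeholders; helper and extra field are hypothetical.
type RiskClass = "filing_bound" | "privileged_matter" | "evaluator_load_bearing" | "ordinary";
type SupportStatus = "supported" | "unsupported" | "partial";

const MINIMUM_DOCUMENTATION_TIER: Record<RiskClass, number> = {
  filing_bound: 3,
  privileged_matter: 3,
  evaluator_load_bearing: 2,
  ordinary: 1,
};

interface GroundingResult {
  claim_id: string;
  support_status: SupportStatus;
  source_tier: number;
  meets_minimum: boolean;       // source_tier >= policy minimum for the risk class
  adequately_grounded: boolean; // supported AND meets_minimum (AB-T-01 ^ AB-T-05)
}

function evaluateGrounding(
  claimId: string,
  supportStatus: SupportStatus,
  sourceTier: number,
  riskClass: RiskClass
): GroundingResult {
  const meetsMinimum = sourceTier >= MINIMUM_DOCUMENTATION_TIER[riskClass];
  return {
    claim_id: claimId,
    support_status: supportStatus,
    source_tier: sourceTier,
    meets_minimum: meetsMinimum,
    adequately_grounded: supportStatus === "supported" && meetsMinimum,
  };
}

// Supported claim, but the source tier is too low for filing-bound work.
const weakSource = evaluateGrounding("c1", "supported", 2, "filing_bound");
console.log(weakSource.adequately_grounded); // false
```

Keeping the table in one place means the P-1 predicate and the AB-T-05 lint cannot drift apart on what "minimum tier" means.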
---
## Value-Tiered Summary (Critical / Substantive / Minor / Considered-and-declined)
**Critical (ship first; no regression tolerated; AB-T-03 + P-4 spine)**
- AB-T-03 matter hold gate + P-1 `matter_resolution` + P-2 `PENDING_MATTER_ASSIGNMENT` (silent privilege crossing)
- P-4 gate-signal integrity + `no_self_certified_bypass` for high-stakes (enforcement non-gameable)
- Interaction bugs 5 (matter-limbo) + 8 (stale eligibility) + 1 (grounding ^ source tier)
- BUG-04 syntactic-taint with *strengthened* P-4 caveat on mechanical classification provenance
- Full P-1 schema + precedence + lints + fixture (unifies 01/02/03/05 + staleness)
**Substantive (high value; include in next implementation package)**
- P-2 `ContextBoundaryRef` canonicalization + learning/append isolation fix (AB-T-09/10/11)
- P-3 `RecoveryPolicyRegistry` (privilege-reclass re-taint + storage taxonomy)
- AB-T-HYDR precedence schema + Core §13A.9 placement + OP-A extension
- All 8 interaction bug fixes (already covered by P-1..P-4)
- 5 OP-A obligation contracts (especially DOC24 matter-conf + DOC73 quarantine)
- AB-T-07 `CitationManifest` (scoped) + AB-T-15 freshness detection + AB-T-17 reference-aware pin
- Gemini BUG-01/02 routing + BUG-03 redirect confirmation
**Minor (nice-to-have; low cost; add during implementation)**
- `quarantine_recommendation` + `boundary_ref` fields on P-1
- Full lint suite for every new contract
- AB-T-07 scoped to load-bearing claims only
- Executable fixtures for all P-1..P-4 dimensions and interaction scenarios
- `MinimumDocumentationTierPolicy` table as configurable
**Considered and declined**
- Gemini discrete `satisfied_downgraded` state for AB-T-02 (D1) -- higher surface area + migration debt than field + flip-rule through existing policy
- AB-T-08 hard `CausalProof` gate -- theater and hallucination risk outweigh the benefit; light `RevisionChangeRationale` is sufficient
- Any new parallel verdict state or orchestrator for portfolio admission (AB-T-09 correctly scoped as policy-only)
- Direct edits to DOC24/DOC73/DOC20 (correctly kept as OP-A obligations only)
- Any new retention contract for AB-T-17 (correctly re-scoped to reference-aware pin only)
---
**End of Review.** All paste-ready material above is ready for Will to transmit to coding agents or red-team reviewers. Every addition names its exact landing spec + section. No guesswork remains for implementation. The card V2 + this review together constitute an A+ grade enforcement layer spec ready for Stage 7+ build.
*Saved to repo per request. Tracking file (pre-approved per working style).*