Active Working and Red Team/DOC23 Working/DOC23 Red Teaming/Test-set Card V2 Red Team Responses/Grok_Deep_Dive_ReReview.md
Generated 2026-06-09T01:23:58.539Z from commit dbaa25962edc11ab30e8d4ca1715f9ae5bf77331. Worktree: clean.
# Grok xAI Deep-Dive Re-Review + Second Look -- DOC23 Addenda B Test-Set Adjudication Card V2

**Reviewer:** Grok (xAI)  
**Date:** 2026-06-03 (follow-up to initial review)  
**Trigger:** User request for deeper analysis, missed bugs, new ideas, math/formula/function correctness assessment after closer re-examination of card V2 + my prior Grok_xAI_Review.md.

**Method:** Re-read full card V2 (all sections 0-10), re-examined every schema/proposal in my prior review, cross-checked against card's own V-1..V-5 verified facts, interaction bugs 1-8, Gemini bugs, P-1..P-6 constructs, AB-T-HYDR, OP-A rows, and operative invariants from CLAUDE.md (EC sole writer, P-4 provenance, matter as dominant boundary, no phantom features). Explicitly audited all logic, precedence rules, "derived from trace" claims, and any implicit calculations.

**Overall finding after deeper dive:** The card V2 remains A+ / excellent. My initial review was strong but had two small implementation-level issues in the paste-ready code I supplied (array-order dependency in precedence; incomplete per-dimension disclosure tracking). No critical missed bugs in the *card's* adjudication or constructs. A few additional minor wiring seams and one underspecified metric (separation) surfaced. All numeric/policy calculations that exist are trivial, deterministic, and will work. New BETTER_IDEA and SUGGESTION items below improve robustness without adding surface area.

**No phantom features introduced in this re-review.** Everything traces to card Sec/Item, V- facts, or P-1..P-6.

---

## 1. Math, Formulas, Technical Calculations & Functions -- Assessment

**Conclusion: All existing and proposed calculations/functions will work correctly. No arithmetic bugs, precision issues, or non-deterministic behavior.**

- **Precedence logic (P-1 applyPrecedence and card Sec. 2 rule):** Pure predicate logic + linear scan over a fixed small set (5 dimensions), so O(1) in practice. Deterministic and total (every combination of dimension statuses maps to exactly one outcome). No floating-point, no division, no edge-case arithmetic. The only "calculation" is the policy lookup `isNonDegradable(dim, riskClass)`, which is a simple table (matter_resolution always non-degradable; for filing/privileged: affirmative_grounding also non-degradable). The rule itself will work; the array-order dependency in my earlier implementation is an implementation bug, covered in Section 2.

- **MatterResolution resolve condition (AB-T-03, card Sec. 3 + V-2):** `single candidate >= floor OR (>=2 candidates AND separation >= threshold)`. No formula given for `separation`. This is *intentional policy abstraction* (good). Thresholds/floors are configurable per risk class (privileged higher). No math to break. The only risk is the implementation of the `separation` metric (see GAP-01 in Section 2).

- **AdequatelyGroundedPredicate (my P-1 addition, interaction bug 1 fix):** Boolean AND of support_status + tier check against policy table. Trivial. Will work.

- **Quorum / assurance_floor (AB-T-02, card Sec. 3 + interaction bug 3):** Count-based (RequiredQuorumManifest) + comparison to target. No complex formula. Card correctly unifies quorum loss as one form of assurance downgrade. Will work.

- **RecoveryPolicy lookup (P-3):** Simple enum match + strategy table. Deterministic. Will work.

- **GateSignalProvenance rule (P-4):** Pure conditional on riskClass + provenance tag. No calculation. Will work.

- **AB-T-HYDR precedence order (card Sec. 6 + my schema):** Fixed list order with explicit highest (user_directive, safety non-droppable). No numeric ranking or scoring. Conflict strategy is enum dispatch. Will work.

- **No other formulas exist in card or my proposals.** No probabilities, weighted sums, floating-point thresholds that could accumulate error, no crypto/math that needs constant-time, no concurrency math (races are handled by sequence locks or EC sole writer). All policy-driven table lookups + boolean logic. Correct and robust for legal work (auditable, no hidden nondeterminism).

**Minor observation:** The card's "separation >= threshold" for multi-candidate matter resolution is the closest to a numeric gate. Because it is left abstract (good for now), the OP-A must eventually specify a reproducible computation (see GAP-01 below). Embedding cosine or simple Jaccard on normalized matter descriptions would be sufficient and auditable; LLM-as-judge would require additional provenance (P-4).
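
The resolve condition quoted above is small enough to write as one pure function. The following is a minimal sketch under stated assumptions: `MatterCandidate`, `ResolvePolicy`, and `canResolveMatter` are hypothetical names that do not appear in the card, "single candidate" is read as "exactly one candidate", and the threshold numbers are placeholders rather than card values.

```typescript
// Hypothetical shapes for illustration; the card does not define these types.
interface MatterCandidate {
  matterRef: string;
  score: number; // match confidence, higher is better (assumed)
}

interface ResolvePolicy {
  floor: number;               // minimum score for a lone candidate
  separationThreshold: number; // minimum separation when several candidates compete
}

// Literal reading of:
//   single candidate >= floor OR (>=2 candidates AND separation >= threshold)
function canResolveMatter(
  candidates: readonly MatterCandidate[],
  separation: number,
  policy: ResolvePolicy
): boolean {
  const first = candidates[0];
  if (first === undefined) return false; // no candidate, nothing to resolve to
  if (candidates.length === 1) return first.score >= policy.floor;
  return separation >= policy.separationThreshold;
}

// Thresholds are configured per risk class; these numbers are placeholders.
const privilegedPolicy: ResolvePolicy = { floor: 0.9, separationThreshold: 0.4 };
```

Whether the top candidate must also clear the floor in the multi-candidate case is not stated in the quoted condition, so the sketch does not add that check.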

---

## 2. Issues / Bugs Missed in Initial Review (or surfaced on re-read)

**BUG (in my supplied precedence function, not in card):** `applyPrecedence` uses `dimensions.find(...)` which depends on the *input array order* rather than a canonical semantic precedence order. If a caller passes dimensions in arbitrary order (e.g., substrate_freshness first), the "first non-degradable unmet" may be wrong even though the rule is conceptually total.

**Fix (paste-ready replacement -- lands in same place as before: DOC23_EVALUATION_COMMON_CONTRACTS_V1_1_1.md §4):**

```ts
// Replace the previous applyPrecedence with this version.
// The array is annotated as readonly rather than asserted `as const`: a readonly
// tuple is not assignable to the mutable type VerdictHonestyDimension[].
const PRECEDENCE_ORDER: readonly VerdictHonestyDimension[] = [
  "matter_resolution",        // always first and dominant (card Sec. 2)
  "affirmative_grounding",    // next for high-stakes (filing/privileged)
  "assurance_floor",
  "source_documentation_tier",
  "substrate_freshness"
];

export function applyPrecedence(
  dimensions: CleanVerdictEligibility['dimensions'],
  riskClass: RiskClass
): PresentationStatus {
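  // Build a lookup for quick status checks. A dimension absent from the input
  // yields undefined below and is therefore not treated as unmet.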
  const statusByDim = new Map(dimensions.map(d => [d.dimension, d.status]));

  for (const dim of PRECEDENCE_ORDER) {
    const status = statusByDim.get(dim);
    if (status === "unmet" && isNonDegradable(dim, riskClass)) {
      return dim === "matter_resolution" ? "needs_human_judgment" : "needs_verification";
    }
  }

  // No blocking unmet non-degradable found -- check for disclosed downgrades on degradable dims
  const hasDisclosedDowngrade = dimensions.some(d => 
    d.status === "downgraded_disclosed" && 
    !isNonDegradable(d.dimension, riskClass)
  );
  return hasDisclosedDowngrade ? "clean_with_disclosure" : "clean";
}

function isNonDegradable(dim: VerdictHonestyDimension, riskClass: RiskClass): boolean {
  if (dim === "matter_resolution") return true;
  if (riskClass === "filing_bound" || riskClass === "privileged_matter") {
    return dim === "affirmative_grounding";
  }
  return false;
}
```

This version is independent of the input array's order and strictly follows the card's documented evaluation order. Safer for multi-threaded or revalidation paths.

**BUG (minor, in my P-1 schema disclosure handling):** `disclosure_log_ref` is a single top-level field. If multiple dimensions are downgraded (possible), a single ref may be ambiguous for audit. Card requires each downgrade to be "disclosed" via the passthrough.

**Fix / BETTER_IDEA (paste-ready):**

Add to `CleanVerdictEligibility`:
```ts
  disclosed_dimension_refs: Array<{
    dimension: VerdictHonestyDimension;
    feedback_view_entry_ref: string;   // the specific entry in FeedbackFindingView that carried this downgrade
  }>;
```

Then in `applyPrecedence` (or a separate `validateDisclosures` helper called before setting `clean_with_disclosure`):
```ts
// Only count a downgrade as disclosed if it has an entry in disclosed_dimension_refs
```

This makes per-dimension disclosure auditable without much extra surface area.
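
One possible shape for the `validateDisclosures` helper mentioned above is sketched here. The signature, the `"met"` status value, and the rule that a blank `feedback_view_entry_ref` does not count as a disclosure are assumptions made for illustration; only the dimension names, `"unmet"`, `"downgraded_disclosed"`, and the ref fields come from this review.

```typescript
type VerdictHonestyDimension =
  | "matter_resolution"
  | "affirmative_grounding"
  | "assurance_floor"
  | "source_documentation_tier"
  | "substrate_freshness";

// "met" is an assumed value; the review only names the other two statuses.
type DimensionStatus = "met" | "unmet" | "downgraded_disclosed";

interface DimensionEntry {
  dimension: VerdictHonestyDimension;
  status: DimensionStatus;
}

interface DisclosedDimensionRef {
  dimension: VerdictHonestyDimension;
  feedback_view_entry_ref: string;
}

// Returns every downgraded dimension that has no usable disclosure ref.
// An empty result means all downgrades are individually disclosed.
function validateDisclosures(
  dimensions: readonly DimensionEntry[],
  refs: readonly DisclosedDimensionRef[]
): VerdictHonestyDimension[] {
  const disclosed = new Set(
    refs
      .filter(r => r.feedback_view_entry_ref.trim() !== "") // blank refs do not count (assumed)
      .map(r => r.dimension)
  );
  return dimensions
    .filter(d => d.status === "downgraded_disclosed" && !disclosed.has(d.dimension))
    .map(d => d.dimension);
}
```

A caller would return `clean_with_disclosure` only when the result is empty. Which status to fall back to otherwise is a policy decision that this sketch leaves open.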

**GAP-01 (card Sec. 3 AB-T-03 + OP-A OBL-DOC24-MATTER-CONF-01):** The `separation` metric for multi-candidate matter resolution is referenced but has no computation rule or reproducibility requirement. For a CRITICAL item (privilege boundary), this is underspecified.

**Recommendation:** In the OP-A obligation, require:
- Deterministic, logged computation (e.g., `separation = 1 - max_cosine_similarity(embed(candidate_i), embed(candidate_j))` taken over all candidate pairs i != j, using the locked Qwen3-Embedding-0.6B, or a simple normalized token-overlap Jaccard if embeddings are unavailable).
- The computation method + raw scores must be recorded in the `MatterResolution` detail_ref or a linked `MatterResolutionAudit` record so a human reviewer can replay it.
- For privileged matters, separation threshold is higher and the metric must be P-4 provenance-clean (independent of the drafting model).

This turns an abstract gate into an auditable, replayable one without inventing new heavy machinery.
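
As an illustration of the embedding-free fallback, the sketch below computes `separation` as one minus the highest pairwise Jaccard similarity and keeps the raw pair scores for replay. All names (`MatterDescription`, `SeparationAudit`, `computeSeparation`) and the ASCII-only tokenization rule are assumptions; the card specifies none of them.

```typescript
interface MatterDescription {
  matterRef: string;
  description: string;
}

interface PairScore {
  a: string;
  b: string;
  similarity: number;
}

interface SeparationAudit {
  method: "jaccard_token_overlap";
  pairs: PairScore[];   // raw scores, kept so a reviewer can replay the result
  maxSimilarity: number;
  separation: number;   // 1 - maxSimilarity
}

// Lowercase and split on non-alphanumerics. ASCII-only normalization is an assumption.
function tokenize(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/[^a-z0-9]+/).filter(t => t.length > 0));
}

function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1; // two empty descriptions are indistinguishable
  let shared = 0;
  a.forEach(token => {
    if (b.has(token)) shared++;
  });
  return shared / (a.size + b.size - shared);
}

// With fewer than two candidates there are no pairs and separation is 1.
function computeSeparation(candidates: readonly MatterDescription[]): SeparationAudit {
  const pairs: PairScore[] = [];
  candidates.forEach((left, i) => {
    candidates.slice(i + 1).forEach(right => {
      pairs.push({
        a: left.matterRef,
        b: right.matterRef,
        similarity: jaccard(tokenize(left.description), tokenize(right.description)),
      });
    });
  });
  const maxSimilarity = pairs.reduce((max, p) => Math.max(max, p.similarity), 0);
  return { method: "jaccard_token_overlap", pairs, maxSimilarity, separation: 1 - maxSimilarity };
}
```

Because the computation uses no model at all, it is independent of the drafting model by construction. The trade-off is that token overlap is a much coarser signal than an embedding comparison.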

**SUGGESTION (wiring seam, interaction bug 8 + P-3):** After Sec.11.21 revalidation recomputes P-1 eligibility and a dimension flips to unmet (especially substrate_freshness), the system should explicitly check whether a P-3 recovery action is now required (e.g., `mark_indeterminate` or `pause_and_retaint`). The card says "re-evaluate" but does not wire the eligibility change -> recovery trigger seam. Easy to miss in implementation.

**Fix:** Add a single sentence in V3.3.1 §7.4 or §11.21: "After eligibility recomputation, if any dimension status changed to unmet and the outcome is still active, invoke `getRecoveryPolicy` for the relevant trigger and apply if the strategy is non-noop." Trivial one-line seam.
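
The seam described in that sentence could look roughly like the sketch below. `getRecoveryPolicy` is injected as a parameter because its real P-3 signature is not given here; passing the dimension name as the trigger, the `"met"` status value, and the `RecoveryAction` shape are all assumptions made for illustration.

```typescript
type DimensionStatus = "met" | "unmet" | "downgraded_disclosed"; // "met" is an assumed value
type RecoveryStrategy = "noop" | "mark_indeterminate" | "pause_and_retaint";

interface DimensionEntry {
  dimension: string;
  status: DimensionStatus;
}

interface RecoveryAction {
  dimension: string;
  strategy: RecoveryStrategy;
}

// Dimensions whose status changed to "unmet" between two eligibility computations.
function newlyUnmet(
  before: readonly DimensionEntry[],
  after: readonly DimensionEntry[]
): string[] {
  const previous = new Map<string, DimensionStatus>();
  before.forEach(d => previous.set(d.dimension, d.status));
  return after
    .filter(d => d.status === "unmet" && previous.get(d.dimension) !== "unmet")
    .map(d => d.dimension);
}

// Returns the non-noop recovery actions to apply after a revalidation recompute.
function recoveryActionsAfterRevalidation(
  before: readonly DimensionEntry[],
  after: readonly DimensionEntry[],
  outcomeActive: boolean,
  getRecoveryPolicy: (trigger: string) => RecoveryStrategy // hypothetical signature
): RecoveryAction[] {
  if (!outcomeActive) return []; // inactive outcomes need no recovery
  return newlyUnmet(before, after)
    .map(dimension => ({ dimension, strategy: getRecoveryPolicy(dimension) }))
    .filter(action => action.strategy !== "noop");
}
```

Returning the actions instead of applying them keeps the function pure, so the caller that owns the write path decides how they are executed.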

**Minor GAP (AB-T-04 fold-in):** Card defines `HumanGateDecisionRecord` fields but my initial review did not supply the interface. Even though the item is folded into R0.4, the type must exist so that `decision_log_required` has a concrete record type behind it.

**Paste-ready (add to Common Contracts or Core near gate handling):**

```ts
export interface HumanGateDecisionRecord {
  decider_ref: string;                     // principal, delegate, or system user id
  decision: "approved" | "rejected" | "deferred" | "waived";
  rationale?: string;
  standard_applied?: string;               // e.g. policy version or checklist id
  shown_refs: string[];                    // REQUIRED per V2 -- all material the human actually saw
  weighed_refs?: string[];                 // optional, what was considered beyond shown
  decided_at: string;                      // ISO
  quorum_waived: boolean;                  // V2 addition; if true, rationale MUST explain the waiver
  boundary_ref?: ContextBoundaryRef;       // for privileged-matter gates (P-2)
  eligibility_snapshot_ref?: string;       // snapshot of P-1 eligibility at decision time for audit
}

export type HumanGateSummary = Pick<HumanGateDecisionRecord, "decision" | "decided_at" | "quorum_waived">;
```

This completes the AB-T-04 item that was FOLD-INTO-R0.4.
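
The interface comments carry two cross-field rules that a type alone cannot enforce: `shown_refs` is required, and a waived quorum needs a rationale. A minimal validation sketch follows. `GateDecisionFields` and `validateGateDecision` are hypothetical names, treating "REQUIRED" as "non-empty" is an assumption, and `Date.parse` is a looser check than strict ISO 8601.

```typescript
// Subset of HumanGateDecisionRecord fields that carry cross-field rules.
interface GateDecisionFields {
  shown_refs: string[];
  decided_at: string;
  quorum_waived: boolean;
  rationale?: string;
}

// Returns a list of problems; an empty list means the record passes these checks.
function validateGateDecision(record: GateDecisionFields): string[] {
  const problems: string[] = [];
  if (record.shown_refs.length === 0) {
    // Reading "REQUIRED" as "must list at least one item" is an assumption.
    problems.push("shown_refs is empty");
  }
  if (record.quorum_waived && (record.rationale ?? "").trim() === "") {
    problems.push("quorum_waived is true but rationale is missing");
  }
  if (isNaN(Date.parse(record.decided_at))) {
    // Date.parse also accepts some non-ISO formats, so this is only a sanity check.
    problems.push("decided_at is not a parseable timestamp");
  }
  return problems;
}
```

Any `HumanGateDecisionRecord` can be passed to this function directly, since it only reads fields that the interface already declares.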

---

## 3. New Ideas / Improvements After Deeper Look

**BETTER_IDEA (robustness, low cost):** Make `CleanVerdictEligibility` immutable after initial computation and only replaceable by a fresh recompute on revalidation or recovery. This prevents any downstream consumer from mutating dimensions or presentation_status (defensive against the very self-reported signal problem P-4 exists to solve).

Add `readonly` or freeze in TS + a lint `verdict.eligibility_mutated_after_creation`.
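
A runtime guard for this idea can be sketched with a recursive `Object.freeze`. The `EligibilitySketch` type is a deliberately reduced stand-in for `CleanVerdictEligibility`, and `deepFreeze` and `createEligibility` are hypothetical helper names, not part of the card or the earlier schema.

```typescript
// Recursively freezes an object graph.
// Freezing before recursing keeps cyclic graphs from looping forever.
function deepFreeze<T>(value: T): Readonly<T> {
  if (typeof value === "object" && value !== null && !Object.isFrozen(value)) {
    Object.freeze(value);
    Object.getOwnPropertyNames(value).forEach(key => {
      deepFreeze((value as unknown as Record<string, unknown>)[key]);
    });
  }
  return value as Readonly<T>;
}

// Reduced stand-in for CleanVerdictEligibility; the real interface has more fields.
interface EligibilitySketch {
  presentation_status: string;
  dimensions: Array<{ dimension: string; status: string }>;
}

// A recompute builds and freezes a new object instead of editing the old one.
function createEligibility(draft: EligibilitySketch): Readonly<EligibilitySketch> {
  return deepFreeze(draft);
}
```

`Readonly<T>` is shallow at the type level, so the compiler still permits writes to nested entries; the runtime freeze is what actually blocks them. In strict mode such a write throws a `TypeError`, and in sloppy mode it is silently ignored.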

**BETTER_IDEA (audit + P-5 quarantine):** When P-1 sets `quarantine_recommendation = true`, immediately emit a `QuarantineEvent` carrying the full eligibility + `ContextBoundaryRef` (P-2) into the append-log (AB-T-10). This gives DOC73 and human reviewers a single place to discover degraded work without scanning every outcome.

**SUGGESTION (P-4 + AB-T-05 risk_class):** The source of `risk_class` for high-stakes outcomes must be explicitly one of: (a) deterministic matter classification from DOC24/DOC81, (b) task-type registry, or (c) explicit user override with HumanGateDecisionRecord. Never from the producing/evaluating model. Add this as a one-line invariant in the P-4 rule section.
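
One way to make this invariant hard to violate is to encode the three allowed sources as a discriminated union, so a model-reported source has no representable form. This is a sketch only: the `kind` tags, the ref field names, and the `"ordinary"` risk class value are assumptions, not card definitions.

```typescript
type RiskClass = "ordinary" | "filing_bound" | "privileged_matter"; // "ordinary" is an assumed value

// The three sources the invariant allows; tag and field names are illustrative.
type RiskClassSource =
  | { kind: "matter_classification"; classifierRef: string } // (a) DOC24/DOC81
  | { kind: "task_type_registry"; registryEntryRef: string } // (b)
  | { kind: "user_override"; humanGateDecisionRef: string }; // (c) backed by a HumanGateDecisionRecord

interface RiskClassAssignment {
  riskClass: RiskClass;
  source: RiskClassSource;
}

const ALLOWED_SOURCE_KINDS: readonly string[] = [
  "matter_classification",
  "task_type_registry",
  "user_override",
];

// Runtime guard for untyped input, such as a record read back from a log.
function hasAllowedRiskClassSource(record: { source?: { kind?: string } }): boolean {
  const kind = record.source?.kind;
  return kind !== undefined && ALLOWED_SOURCE_KINDS.indexOf(kind) !== -1;
}
```

The union covers compile-time construction, while the guard covers data that crosses a serialization boundary, where types no longer protect anything.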

**SUGGESTION (AB-T-HYDR large-attachment open point):** Resolve the front-load-extract vs lazy-retrieve decision by policy under the `ContextBoundaryRef`: for privileged or filing_bound boundaries, default to lazy-retrieve + on-demand extraction (via Source Workspace) so the sealed context packet never bloats with full large docs. For ordinary boundaries, front-load is acceptable. Record the choice in the manifest. This is a simple policy table keyed by boundary privilege_class.
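
The policy table suggested here is small enough to show in full. The sketch assumes the three privilege class values and the two strategy names below; the card's actual `ContextBoundaryRef` fields and manifest schema are not reproduced, and `AttachmentManifestEntry` is a hypothetical record shape.

```typescript
type PrivilegeClass = "ordinary" | "filing_bound" | "privileged"; // value names are assumed
type AttachmentStrategy = "front_load_extract" | "lazy_retrieve";

// Default strategy per boundary privilege class.
const ATTACHMENT_POLICY: Readonly<Record<PrivilegeClass, AttachmentStrategy>> = {
  ordinary: "front_load_extract", // front-load is acceptable for ordinary boundaries
  filing_bound: "lazy_retrieve",  // keep large documents out of the sealed context packet
  privileged: "lazy_retrieve",
};

// Hypothetical manifest entry recording which strategy was chosen and why.
interface AttachmentManifestEntry {
  attachmentRef: string;
  privilegeClass: PrivilegeClass;
  strategy: AttachmentStrategy;
}

function planAttachment(
  attachmentRef: string,
  privilegeClass: PrivilegeClass
): AttachmentManifestEntry {
  return { attachmentRef, privilegeClass, strategy: ATTACHMENT_POLICY[privilegeClass] };
}
```

Typing the table as a `Record` over `PrivilegeClass` means that adding a new class without a policy row is a compile error rather than a silent default.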

**Minor BETTER_IDEA (testing):** Add a "chaos fixture" that deliberately supplies dimensions in random order plus a `downgraded_disclosed` dimension without a matching `disclosed_dimension_refs` entry. With the order-independent precedence and the per-dimension disclosure check in place, the result must not depend on input order, and `clean_with_disclosure` must be produced only when every downgrade has a disclosure entry. This guards against the exact class of bug I initially shipped.
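
The random-order half of such a fixture can be written as a generic harness that permutes the input and reports the first permutation that changes the result. Everything named here (`seededRandom`, `shuffled`, `findOrderDependence`) is a hypothetical helper; the generator is a plain linear congruential generator chosen only so that a failing permutation can be replayed from its seed.

```typescript
// Small linear congruential generator, used for reproducible shuffles.
function seededRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296;
  };
}

// Fisher-Yates shuffle on a copy; the input array is left untouched.
function shuffled<T>(items: readonly T[], random: () => number): T[] {
  const copy = items.slice();
  for (let i = copy.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const tmp = copy[i] as T;
    copy[i] = copy[j] as T;
    copy[j] = tmp;
  }
  return copy;
}

// Returns the first permutation whose result differs from the baseline,
// or null if every tried permutation agrees. Results are compared with ===,
// so the evaluator should return a primitive such as a status string.
function findOrderDependence<T, R>(
  evaluate: (items: readonly T[]) => R,
  items: readonly T[],
  trials: number,
  seed: number
): T[] | null {
  const random = seededRandom(seed);
  const baseline = evaluate(items);
  for (let k = 0; k < trials; k++) {
    const permutation = shuffled(items, random);
    if (evaluate(permutation) !== baseline) return permutation;
  }
  return null;
}
```

In a real fixture the evaluator would wrap `applyPrecedence` with a fixed `riskClass`, and a `null` result over many trials is the passing condition. The missing-disclosure half of the fixture needs its own assertion and is not covered by this harness.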

---

## 4. Re-Assessment of Interaction Bugs, Gemini Bugs, and Adjudications

All 8 interaction bugs remain comprehensively covered by P-1..P-4. The new seam suggestion (eligibility change -> P-3 recovery) is the only additional wiring I would add; it is minor and does not change any disposition.

Gemini BUG-04 caveat strengthening from initial review still stands and is reinforced: the `deterministic_mechanical` classification itself must be P-4 provenance-clean.

D1 and AB-T-08 remain correctly decided (field-not-state + decline hard gate).

AB-T-03 remains the highest-severity item; the CRITICAL ship-first priority is unchanged.

No new divergences or incorrect dispositions found on re-read.

---

## 5. Updated Value-Tiered Summary (incorporating deeper-dive findings)

**Critical (add to ship list from initial review)**
- Fix for precedence function array-order dependency (my supplied code)
- Per-dimension disclosure tracking in P-1 (my schema gap)
- GAP-01: explicit reproducible separation metric + audit log for AB-T-03 (CRITICAL privilege gate)
- HumanGate* interfaces (completes AB-T-04 fold-in)
- Eligibility change -> P-3 recovery trigger seam (interaction bug 8 reinforcement)

**Substantive (high value)**
- All prior substantive items remain
- Immutability of CleanVerdictEligibility after creation
- QuarantineEvent emission on P-1 recommendation
- AB-T-HYDR large-attachment policy under P-2 boundary
- Chaos fixture covering input-order independence and the per-dimension disclosure check

**Minor**
- `risk_class` source invariant (P-4 + AB-T-05)

**Considered and declined**
- None new. All prior declined items still declined for the same reasons.

---

**Final deeper-dive verdict:** Card V2 is solid. The two issues I found were in *my own paste-ready code* and are corrected by the replacements above. The card itself had no missed critical bugs. With the fixes, the additional interfaces, and the one new seam, the enforcement layer is more robust and harder to implement incorrectly. All math/logic is trivial and will work. Ready for coding agents; the one open specification on the reviewed items is the `separation` computation (GAP-01), which the OP-A must pin down.

*This re-review is a tracking file (pre-approved). No operative spec files were modified.*