Elnor Repo Reader

Current Specs/DOC72/DOC72_HYPER_INTELLIGENCE_OVERLAY_R5_73.md

Short text page 9645a7599907. Generated 2026-06-09T01:23:58.539Z from commit dbaa25962edc11ab30e8d4ca1715f9ae5bf77331. Worktree: clean.

ELNOR REPO READER TEXT MIRROR
Original path: Current Specs/DOC72/DOC72_HYPER_INTELLIGENCE_OVERLAY_R5_73.md
Source repo: /Users/OpenClaw1/Elnor/Elnor Specs
Git branch: main
Git commit: dbaa25962edc11ab30e8d4ca1715f9ae5bf77331
Generated: 2026-06-09T01:23:58.539Z

---

# DOC72 — Hyper Intelligence Overlay R5.73

## ELNOR Knowledge Architecture: Entity Graph, Procedural Intelligence, Goals, and Extended Learning

**Status:** R5.73 absorbs the V2.2 revision of the Graph Intelligence Enhancement (29 normalized patches NP01-NP29 from a 5-reviewer red-team adjudication) into §42A. The R5.72 owner text outside §42A is preserved verbatim. Both the prior V2.1 absorption (in §42A) and the freestanding V2.2 proposal are superseded by §42A as it appears here.  
**Version:** R5.73  
**Date:** 2026-04-26  
**Adjudication source:** DOC72 R5.6 Final Red Team Adjudication Card (final), incorporating the accepted and accepted-with-modifications findings across ChatGPT, Claude, Grok/Harper, and Gemini, plus the post-adjudication meta-review fixes. Companion integration sources absorbed where required by that card: DOC24 KDA R2, DOC72 Graph Intelligence Enhancement Proposal V2.2 (incorporating all 29 normalized patches NP01-NP29), DOC72 Knowledge Intelligence Enhancement R2, DOC24 Addendum A BDSM v6.4, DOC24 R2.5, DOC8 v1.11.4, and MultiDoc PropA R6.1.  
**Scope:** DOC72 is the intelligence substrate for ELNOR. It defines how knowledge is structured, how entities relate, how skills compound, how goals drive behavior, how the system learns from use, how knowledge evolves over time, and how knowledge intake from all surfaces is observed, assessed, extracted, quality-gated, promoted, rendered into downstream contracts, and governed over time. DOC72 owns the ARCHITECTURE of knowledge — the schemas, payload contracts, relationships, render-input contracts, intake pipeline contracts, domain signal profiles, evidence-first relation promotion, self-learning interfaces, storage registry, and graph hygiene rules. DOC24/KDA owns rendering templates and runtime rendering; DOC8 owns learning computation; DOC1 owns memory governance; DOC3 owns procedural skill lifecycle; EC remains the sole durable writer.

**Integration note:** This revision uses the carry-forward-clean R5.71 text as the base and applies the remaining adjudication-gap fixes needed to treat the DOC72 owner text as the complete R5.72 build. Surviving R5.6 owner text remains preserved verbatim unless a change was required by the adjudication card, the prior audit report, or the residual numbering/contract defects discovered during this closure pass.  


---

### Integration note — V2.2 absorption (2026-04-26)

R5.73 differs from R5.72 in one substantive way: §42A is the V2.2 absorption of the Graph Intelligence Enhancement, replacing the V2.1 absorption that landed in R5.7. V2.2 incorporated 29 normalized patches (NP01-NP29) over V2.1, drawn from a 5-reviewer red-team adjudication (ChatGPT, Claude, Gemini, Grok, Codex). Wholesale replacement was performed because V2.2's supersession statement covers V2.1 in full and the V2.1 integration mods (§42A.x.y numbering, heading nesting) were purely mechanical. No other R5.72 sections changed.

**One V2.2 oversight corrected during absorption.** V2.2 omitted §5.4 (Temporal regularity computation) and §5.5 (Standing procedure candidacy) from its restated §5 content and did not mark them with `*(Unchanged from V2.1)*` placeholders. Surviving V2.2 schemas (`CoOccurrencePatternSchema`, `ProcedureMotifSchema`) still carry `temporal_regularity`, `temporal_regularity_confidence`, and `suggested_node_kind: ["procedure", "standing_procedure", "undetermined"]` fields whose computation is defined only in those two subsections. Dropping the prose while keeping the fields would violate V2.2's own §0 governing principle 13 ("Every schema field and scoring term must have a computation function. No declared-but-never-computed fields.") and leave no NP rationale in the patch list explaining the deletion. R5.73 §42A.5.4 and §42A.5.5 are therefore restored verbatim from R5.72 §42A as a gap-fix during absorption.

Per the post-absorption versioning rule, both V2.1 (already archived) and V2.2 are now archived. Future work in this area authors a fresh proposal against R5.73 §42A, not a revision of either archived proposal.

---

## 1. The Problem

ELNOR's current skill model (DOC3) treats skills as standalone procedural recipes. Each SKILL.md file is a self-contained document describing how to perform a specific task. This creates three compounding problems:

**Redundancy.** Ten different Word skills each contain their own instructions for navigating the ribbon, using the font menu, and managing heading styles. The same procedural knowledge is duplicated across every skill that touches the same application. Teaching Elnor a correction to font menu navigation requires updating every skill that references it.

**No compounding.** When Elnor learns how to use the References Tab during a TOC creation session, that knowledge doesn't benefit future proofreading, track changes, or brief formatting sessions. Each skill is an island. Learning in one skill doesn't strengthen others, even when they share the same underlying application knowledge.

**Manual skill creation.** Every new procedure requires either a hand-authored SKILL.md or a formal learning pipeline cycle (observation → signal → synthesis → proposal → review → installation). The user cannot simply use an application and have Elnor accumulate structured knowledge from successful interactions.

---

## 2. The Solution

Make all knowledge — about the world, about skills, about goals, about the user's language and habits — a first-class citizen of a unified entity graph. Every piece of knowledge exists in six dimensions simultaneously, creating an intelligence substrate that is inspectable, auditable, learnable, and temporally resilient.

### 2.1 The Six-Dimension Knowledge Framework

Every knowledge node in DOC72 exists in six dimensions. These dimensions are the overarching structure for the entire architecture.

| Dimension | Question it answers | What it contains |
|---|---|---|
| **Content** | WHAT does Elnor know? | Every knowledge node type: world entities (matters, people, orgs, calendars, folders, system objects — tasks, rooms, panels, notes), application entities (apps + lightweight capability metadata), procedures (semantic intent actions), execution traces, standing procedures, goals, domain concepts, obligations, work products, memory directives (preferences, constraints, vocabulary, styles, archetypes) |
| **Provenance** | WHERE/WHY does Elnor believe it? | The evidence chain — emails, chats, documents, user statements, corrections, authorities, traces, system inferences, onboarding. Every node has one. Three-tiered by node importance. |
| **Temporal** | WHEN did things change and how fresh? | Creation time, last verification, staleness state, change history, TTL. When knowledge was learned and whether it's still current. |
| **Confidence** | HOW SURE is Elnor? | Beta distribution: `C = α/(α+β)` with lazy time-decay. Derived from provenance quality, experience outcomes, recency, and correction history. Aligned with DOC1's Beta(2,2) model. |
| **Connections** | HOW does knowledge relate to everything else? | Typed relationships to other nodes — goals it serves, procedures it uses, entities it links to, constraints that govern it, skills it composes into. |
| **Experience** | HOW has this knowledge lived in actual use? | Usage frequency, outcome history (positive AND negative), behavioral distribution, trend signals, corrections, contextual patterns. Three-tiered by node importance. The substrate where learning happens. |

The six dimensions apply to EVERY node type. A procedure node has content (its semantic intent steps), provenance (where it was learned), temporal state (when it was last validated), confidence (Beta-derived from success rate), connections (which skills use it, which app it belongs to, which goals it serves), and experience (how many times it's been used, what outcomes occurred, what corrections were made). Skills improve through use because the experience dimension captures outcomes, DOC8 detects patterns, and DOC3 manages content updates.

**Not all dimensions are equally deep on every node.** The six-dimension sparsity policy (§2.2) ensures high-value professional nodes get full depth on all dimensions while low-value operational nodes carry minimal provenance and no separate experience records. Content and Confidence are always present (lightweight). Connections are always present but may be sparse. The variable dimensions are Provenance, Experience, and Temporal depth.

### 2.2 Six-Dimension Sparsity Policy

| Node tier | Examples | Provenance | Experience | Temporal | Connections |
|---|---|---|---|---|---|
| **Tier A** (high-value professional) | Domain concepts, goals, obligations, standing procedures, work products, matters | Full chain (all entries) | Full record (20 recent events + variants + trends) | Full TemporalMetadata + ChangeRecord history | Full typed edges |
| **Tier B** (operational) | Procedures, applications, important world entities | Compact (latest 3 entries + summary) | Light (counters + last_outcome) | Full TemporalMetadata, minimal ChangeRecord | Full typed edges |
| **Tier C** (ephemeral/low-value) | Candidate entities, suggestions, low-value world entities | Minimal (single source_ref) | None (usage_count inline on node) | staleness_state + created_at only | Edges as needed |

The tier is determined by `node_kind` → tier mapping (configured, not per-node). This mapping is part of DOC72's ontology definition.
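
As a rough illustration of how a configured `node_kind` → tier mapping could drive dimension depth, the sketch below trims a provenance chain to the depth the table gives for each tier. `TIER_BY_NODE_KIND`, `provenanceLimit`, `trimProvenance`, and the Tier C default for unmapped kinds are assumptions made for this example; the real mapping lives in DOC72's ontology definition.

```ts
type Tier = "A" | "B" | "C";

// Hypothetical mapping that follows the examples in the sparsity table.
// world_entity is left out on purpose: the table places matters in Tier A
// and low-value entities in Tier C, and this section does not say how
// that split is configured.
const TIER_BY_NODE_KIND: { [kind: string]: Tier | undefined } = {
  domain_concept: "A",
  goal: "A",
  obligation: "A",
  standing_procedure: "A",
  work_product: "A",
  procedure: "B",
  application: "B",
};

// Provenance depth per tier: A keeps the full chain, B the latest 3
// entries, C a single source reference.
function provenanceLimit(tier: Tier): number {
  if (tier === "A") return Number.POSITIVE_INFINITY;
  return tier === "B" ? 3 : 1;
}

// Entries are assumed to be ordered oldest to newest.
function trimProvenance<T>(nodeKind: string, entries: T[]): T[] {
  const tier: Tier = TIER_BY_NODE_KIND[nodeKind] ?? "C"; // assumed default
  const limit = provenanceLimit(tier);
  return limit >= entries.length ? entries.slice() : entries.slice(-limit);
}

const chain = ["email", "confirmation", "correction", "trace"];
console.log(trimProvenance("goal", chain).length); // 4
console.log(trimProvenance("procedure", chain));   // ["confirmation", "correction", "trace"]
```

Keeping the lookup in one table means a tier change is a configuration edit, not a per-node migration.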

**Example — pineapple allergy memory:**
1. **Content:** "Will is allergic to pineapple" (memory_directive, preference/health)
2. **Provenance:** Dr. Martinez email Aug 2025 → user confirmation Sep 2025 → user correction Mar 2026 (now managed with medication)
3. **Temporal:** Created Aug 2025, corrected Mar 2026, current
4. **Confidence:** α=4.95, β=2 → C_base=0.71, decayed ≈ 0.71 (recently verified)
5. **Connections:** linked to Dr. Martinez (person), health domain, nutrition goal (if exists), grocery ordering (standing procedure constraint)
6. **Experience:** Applied 3 times in grocery contexts. Corrected once (system excluded pineapple, user said it's fine now). Current trend: pineapple is acceptable.

**Example — loss causation legal concept:**
1. **Content:** "Loss causation requires showing fraud caused economic loss" (domain_concept)
2. **Provenance:** Tellabs, 551 U.S. 308 (binding authority) + Will's Westlaw research + confirmed in Henderson MTD
3. **Temporal:** Created Mar 2026, last Shepardized Mar 20, fresh
4. **Confidence:** α=4.90, β=2 → C_base=0.71 (multiple authorities, user confirmation). Policy cap: has authority → no cap. Half-life 365 days → decay ≈ 1.0
5. **Connections:** serves Henderson settlement goal, addressed in Henderson MTD, element of 10b-5, Judge Chen has ruled on this 3 times
6. **Experience:** Argued in 4 briefs. Corrective disclosure approach used 3/4 times. One adverse ruling (Judge Chen found it insufficient — required direct price impact evidence). Trend: consider addressing price impact preemptively.

**Example — "Insert Table of Contents" skill procedure:**
1. **Content:** Semantic intent: "Open the references section and insert a table of contents with automatic heading detection." Environment: Word desktop, Mac.
2. **Provenance:** Learned from execution trace Mar 27. User taught keyboard shortcut variant Mar 30. LLM bootstrap confirmed References section exists.
3. **Temporal:** Created Mar 27, last validated Apr 2, fresh. Stale after 90 days (app procedure TTL).
4. **Confidence:** α=5.45, β=2 → C_base=0.73 (4 successful traces, user-taught variant)
5. **Connections:** `composes_into` → "Create filing-ready brief" composite procedure. `constrained_by` → "12pt Times New Roman for legal briefs." `part_of_application` → Microsoft Word.
6. **Experience:** Used 4 times. 4 successes. 0 failures. 0 corrections. Context: 3x Henderson, 1x Narayanan. Trend: stable, reliable.
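
The `C_base` figures in the three examples above follow directly from `C = α/(α+β)`. The sketch below reproduces them and adds a read-time decay factor. The exponential half-life shape of `decayFactor` is an assumption: this section names lazy time-decay and a half-life but does not give the decay formula, and the function names are illustrative.

```ts
// C_base = alpha / (alpha + beta), as defined for the Confidence dimension.
function baseConfidence(alpha: number, beta: number): number {
  return alpha / (alpha + beta);
}

// Assumed decay shape: exponential with a half-life in days.
function decayFactor(daysSinceVerified: number, halfLifeDays: number): number {
  return Math.pow(0.5, daysSinceVerified / halfLifeDays);
}

// "Lazy" is read here as: decay is applied when the value is read,
// not by a job that rewrites every row.
function confidenceAt(
  alpha: number,
  beta: number,
  daysSinceVerified: number,
  halfLifeDays: number
): number {
  return baseConfidence(alpha, beta) * decayFactor(daysSinceVerified, halfLifeDays);
}

const round2 = (x: number): number => Math.round(x * 100) / 100;

console.log(round2(baseConfidence(4.95, 2))); // 0.71 (pineapple allergy)
console.log(round2(baseConfidence(4.9, 2)));  // 0.71 (loss causation)
console.log(round2(baseConfidence(5.45, 2))); // 0.73 (Insert Table of Contents)
console.log(decayFactor(0, 365));             // 1 (just verified)
console.log(decayFactor(365, 365));           // 0.5 (one half-life later)
```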

**The self-improvement cycle for skills:** Skills get better through use automatically. Experience captures outcomes → DOC8 detects patterns (e.g., cite-check step fails 1 in 3 times) → DOC8 proposes improvement (add precondition, update step) → DOC3 manages the update (reviews, approves, regenerates skill artifact) → Content improves → next execution benefits. No manual skill editing required.
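
The first hop of that cycle, an outcome landing in the experience dimension and moving confidence, might look like the sketch below. The update rule (successes add to α, failures and corrections add to β, unit weight by default) is a placeholder assumption; the actual weights belong to DOC8 and are not given in this section. `ExperienceState` and `applyOutcome` are illustrative names.

```ts
type Outcome = "success" | "failure" | "correction";

type ExperienceState = {
  alpha: number;
  beta: number;
  total_usage_count: number;
  total_success_count: number;
  total_failure_count: number;
  total_correction_count: number;
};

// Hypothetical update rule; returns a new state and leaves the input untouched.
function applyOutcome(state: ExperienceState, outcome: Outcome, weight = 1): ExperienceState {
  const next: ExperienceState = { ...state, total_usage_count: state.total_usage_count + 1 };
  if (outcome === "success") {
    next.alpha += weight;
    next.total_success_count += 1;
  } else if (outcome === "failure") {
    next.beta += weight;
    next.total_failure_count += 1;
  } else {
    next.beta += weight;
    next.total_correction_count += 1;
  }
  return next;
}

// Start from the Beta(2,2) prior.
let procedureState: ExperienceState = {
  alpha: 2,
  beta: 2,
  total_usage_count: 0,
  total_success_count: 0,
  total_failure_count: 0,
  total_correction_count: 0,
};

const observed: Outcome[] = ["success", "success", "failure", "success"];
for (const outcome of observed) {
  procedureState = applyOutcome(procedureState, outcome);
}

// alpha = 5, beta = 3
console.log(procedureState.alpha / (procedureState.alpha + procedureState.beta)); // 0.625
```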

### 2.3 Four-Layer Procedural Taxonomy

The system distinguishes four layers of procedural knowledge, each with different owners, triggers, and complexity:

1. **Procedural memory** (DOC1) — guidance and principles. "When filing in S.D.N.Y., check local rules first." Not executable automation — soft knowledge that gets injected into prompts as context. Lives in DOC1 as memory_directive.

2. **Skill procedures** (DOC72 graph + DOC3) — reusable app/tool interaction techniques stored as semantic intent. "How to insert a TOC in Word." Answers HOW to do something. Lives in the graph as procedure nodes; DOC3 owns the executable skill lifecycle. Procedures store SEMANTIC ACTION DESCRIPTIONS, not mechanical UI step sequences — the LLM bridges intent to current UI at runtime.

3. **Standing procedures** (DOC72 graph + DOC1 governance) — conditional trigger-action behavioral automation. "When court filings arrive, extract deadlines and calendar them." Answers WHEN to do something and WHAT to do. Lives in the graph with DOC1 governing lifecycle and confirmation policy.

4. **DOC23 tasks** (DOC23) — hardened modular automation pipelines. Full module graphs with typed cables, gates, and retry logic. Maximum reliability, maximum configuration effort. Standing procedures may be promoted to DOC23 tasks when reliability requirements increase.

Each layer has a clear escalation path: procedural memory can inspire a standing procedure. A standing procedure can be promoted to a DOC23 task. Knowledge flows upward through the layers as confidence and operational importance increase.

### 2.4 Ownership Split

**DOC72 graph** is the shared knowledge substrate — storing what exists, how things relate, what was tried, and what worked. Owns the SHAPES of knowledge (schemas, ontology, relationships).

**DOC3** remains the executable skill lifecycle owner — owning observation, candidate synthesis, proposal, review, promotion, enablement, and execution-facing skill artifacts.

**DOC1** remains the lifecycle/governance/confidence owner for memories, preferences, constraints, directives, and standing procedure governance. DOC1 memories are unified into the same SQLite database as `node_kind = 'memory_directive'` — one database, one transaction, one truth. DOC1's Write Gate, maturity lifecycle, and calibrated_confidence (Beta α/β) operate on these rows. The governance LOGIC doesn't change — only the storage backend.

**DOC8** remains the learning engine — processing experience data to detect patterns, generate confidence adjustments, and propose behavioral optimizations. Reads from and writes to the experience dimension. Updates α and β on experience events.

Neither DOC3 nor DOC72 replaces the other. The graph provides the reuse substrate. DOC3 provides the executable projection.

---

## 3. Storage Architecture

### 3.1 SQLite as Primary Query Surface

The entity graph uses SQLite as its primary materialized query surface, stored as a file under `ELNOR_MEMORY/`. This provides file-backed, local-first, inspectable storage with native recursive CTE traversals, concurrent read/write via WAL mode, and atomic transactions.

An append-only JSONL event log is retained alongside SQLite for audit, replay, and deterministic rebuild capability. SQLite is the authoritative truth for queries. JSONL is an async export for auditability. If JSONL export fails, SQLite remains authoritative and export retries.

**Configuration:**
- `PRAGMA journal_mode = WAL` for concurrent read/write
- `PRAGMA query_only = ON` for read-only connections (Text-to-SQL, agent inspection tools)
- Deterministic snapshot rebuild capability from event log
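
To make "deterministic rebuild" concrete, here is a minimal in-memory sketch of replaying a JSONL event log into a snapshot. The `GraphEvent` shape, its `seq` field, and the two operations are hypothetical, since the log's record format is not defined in this section. The point is only that replay order comes from the events themselves, so the same log always yields the same snapshot.

```ts
// Hypothetical event shape for illustration.
type GraphEvent =
  | { seq: number; op: "upsert_node"; id: string; canonical_name: string }
  | { seq: number; op: "deactivate_node"; id: string };

type NodeSnapshot = { id: string; canonical_name: string; is_active: boolean };

function rebuildSnapshot(jsonl: string): Map<string, NodeSnapshot> {
  const events: GraphEvent[] = jsonl
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as GraphEvent)
    .sort((a, b) => a.seq - b.seq); // replay order must not depend on file order

  const snapshot = new Map<string, NodeSnapshot>();
  for (const ev of events) {
    if (ev.op === "upsert_node") {
      snapshot.set(ev.id, { id: ev.id, canonical_name: ev.canonical_name, is_active: true });
    } else {
      const existing = snapshot.get(ev.id);
      if (existing) snapshot.set(ev.id, { ...existing, is_active: false });
    }
  }
  return snapshot;
}

const sampleLog = [
  JSON.stringify({ seq: 1, op: "upsert_node", id: "n1", canonical_name: "Microsoft Word" }),
  JSON.stringify({ seq: 2, op: "upsert_node", id: "n2", canonical_name: "Henderson" }),
  JSON.stringify({ seq: 3, op: "deactivate_node", id: "n2" }),
].join("\n");

const rebuilt = rebuildSnapshot(sampleLog);
console.log(rebuilt.get("n1")?.is_active); // true
console.log(rebuilt.get("n2")?.is_active); // false
```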

### 3.1A Storage Model Registry

R5.7 makes the storage classification explicit so SQLite-canonical graph truth, append-only audit lanes, atomic JSON current views, and derived read-models cannot be confused.

```ts
type StorageRegistryEntry = {
  path: string;
  store_type: "sqlite_table" | "jsonl_append" | "atomic_json" | "derived_sqlite";
  canonical_or_derived: "canonical" | "derived";
  owner_doc: string;
  rebuild_source?: string;
  description: string;
};
```

**Required registry entries in this revision:**
- `entity_graph.sqlite` — canonical durable graph store (`nodes`, `aliases`, `edges`, `experience_records`, `provenance_entries`, `conversation_threads`, `thread_entity_links`, `disambiguation_biases`, `pending_extractions`, `outcome_chains`, `outcome_chain_concept_index`)
- `graph_events.jsonl` — async audit export written from canonical SQLite mutations
- `ELNOR_MEMORY/system/backlink/mention_observations.jsonl` — derived append-only evidence log
- `mention_observations_derived` — derived SQLite query table rebuilt from mention observations log
- `ELNOR_MEMORY/system/backlink/evidence_bundles.json` — derived atomic JSON evidence bundles
- `ELNOR_MEMORY/system/learning/satisfaction_matrix/*` — derived Matrix artifacts under DOC72-governed schemas, rebuildable from graph experience + learning inputs
- `ELNOR_MEMORY/system/learning/implicit_preference_candidates.json` — derived atomic JSON
- `ELNOR_MEMORY/system/learning/implication_results.jsonl` — derived append-only JSONL
- `ELNOR_MEMORY/system/config/parameter_registry.json` — canonical atomic JSON parameter registry
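
A few of these entries, written against `StorageRegistryEntry`, could look like the sketch below, together with a small consistency check. The `owner_doc` values, the description strings, and the rule that every derived store should name a `rebuild_source` are assumptions for illustration; `registryProblems` is a hypothetical helper, not part of the specification.

```ts
type StorageRegistryEntry = {
  path: string;
  store_type: "sqlite_table" | "jsonl_append" | "atomic_json" | "derived_sqlite";
  canonical_or_derived: "canonical" | "derived";
  owner_doc: string;
  rebuild_source?: string;
  description: string;
};

const sampleRegistry: StorageRegistryEntry[] = [
  {
    path: "entity_graph.sqlite",
    store_type: "sqlite_table",
    canonical_or_derived: "canonical",
    owner_doc: "DOC72", // assumed owner
    description: "canonical durable graph store",
  },
  {
    path: "mention_observations_derived",
    store_type: "derived_sqlite",
    canonical_or_derived: "derived",
    owner_doc: "DOC72", // assumed owner
    rebuild_source: "ELNOR_MEMORY/system/backlink/mention_observations.jsonl",
    description: "derived SQLite query table rebuilt from the mention observations log",
  },
  {
    path: "ELNOR_MEMORY/system/config/parameter_registry.json",
    store_type: "atomic_json",
    canonical_or_derived: "canonical",
    owner_doc: "DOC72", // assumed owner
    description: "canonical atomic JSON parameter registry",
  },
];

// Flags entries whose classification is internally inconsistent.
function registryProblems(entries: StorageRegistryEntry[]): string[] {
  const problems: string[] = [];
  for (const entry of entries) {
    if (entry.canonical_or_derived === "derived" && !entry.rebuild_source) {
      problems.push(`${entry.path}: derived store has no rebuild_source`);
    }
    if (entry.store_type === "derived_sqlite" && entry.canonical_or_derived !== "derived") {
      problems.push(`${entry.path}: derived_sqlite store is marked canonical`);
    }
  }
  return problems;
}

console.log(registryProblems(sampleRegistry)); // []
```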

### 3.2 Core Schema

```sql
CREATE TABLE nodes (
    id TEXT PRIMARY KEY,
    node_kind TEXT NOT NULL,
    canonical_name TEXT NOT NULL,
    alpha REAL NOT NULL DEFAULT 2.0,        -- Beta distribution positive evidence
    beta REAL NOT NULL DEFAULT 2.0,         -- Beta distribution negative evidence
    confidence REAL NOT NULL DEFAULT 0.5,   -- Last computed: alpha/(alpha+beta) * decay
    staleness_state TEXT NOT NULL DEFAULT 'fresh',
    lifecycle_state TEXT NOT NULL DEFAULT 'observed',
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    updated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    last_verified_at DATETIME,
    payload JSON,                           -- Node-kind-specific fields
    is_active BOOLEAN DEFAULT 1
);
CREATE INDEX idx_nodes_kind ON nodes(node_kind, is_active);
CREATE INDEX idx_nodes_lifecycle ON nodes(lifecycle_state, is_active);

CREATE TABLE aliases (
    id TEXT PRIMARY KEY,
    node_id TEXT NOT NULL REFERENCES nodes(id) ON DELETE CASCADE,
    normalized_alias TEXT NOT NULL,
    alias_type TEXT NOT NULL,               -- 'name' | 'abbreviation' | 'shorthand' | 'system'
    confidence REAL DEFAULT 0.8
);
CREATE INDEX idx_alias_search ON aliases(normalized_alias);

CREATE TABLE edges (
    source_id TEXT NOT NULL REFERENCES nodes(id) ON DELETE CASCADE,
    target_id TEXT NOT NULL REFERENCES nodes(id) ON DELETE CASCADE,
    relation_type TEXT NOT NULL,
    confidence REAL NOT NULL DEFAULT 0.5,
    lifecycle_state TEXT DEFAULT 'active',
    start_date DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    end_date DATETIME,
    payload JSON,
    PRIMARY KEY (source_id, target_id, relation_type, start_date)
);
CREATE INDEX idx_edges_target ON edges(target_id, relation_type);
CREATE UNIQUE INDEX idx_edges_current_unique
ON edges(source_id, target_id, relation_type)
WHERE end_date IS NULL;

CREATE TABLE experience_records (
    target_node_id TEXT PRIMARY KEY REFERENCES nodes(id) ON DELETE CASCADE,
    total_usage_count INTEGER DEFAULT 0,
    total_success_count INTEGER DEFAULT 0,
    total_failure_count INTEGER DEFAULT 0,
    total_correction_count INTEGER DEFAULT 0,
    recent_usage_count INTEGER DEFAULT 0,
    recent_success_count INTEGER DEFAULT 0,
    recent_failure_count INTEGER DEFAULT 0,
    recent_correction_count INTEGER DEFAULT 0,
    last_used_at DATETIME,
    variant_distribution JSON,
    context_distribution JSON,
    recent_events JSON,
    older_event_summary JSON,
    overall_trend TEXT DEFAULT 'new',
    trend_signal TEXT
);
CREATE INDEX idx_experience_last_used ON experience_records(last_used_at);

CREATE TABLE provenance_entries (
    id TEXT PRIMARY KEY,
    node_id TEXT NOT NULL REFERENCES nodes(id) ON DELETE CASCADE,
    entry_type TEXT NOT NULL,
    source_description TEXT,
    source_ref TEXT,
    citation TEXT,
    authority_type TEXT,
    still_current TEXT,
    confidence_contribution REAL,
    supersedes_entry_id TEXT,
    correction_reason TEXT,
    jurisdiction_scope JSON,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX idx_provenance_node ON provenance_entries(node_id);

CREATE TABLE conversation_threads (
    thread_id TEXT PRIMARY KEY,
    title_hint TEXT,
    linked_entity_ids JSON,
    key_decisions JSON,
    unresolved_items JSON,
    checkpoint_summary TEXT,
    occurred_at DATETIME,
    turn_count INTEGER,
    channel TEXT
);
CREATE INDEX idx_threads_occurred ON conversation_threads(occurred_at DESC);

CREATE TABLE thread_entity_links (
    thread_id TEXT REFERENCES conversation_threads(thread_id),
    entity_id TEXT REFERENCES nodes(id),
    PRIMARY KEY (thread_id, entity_id)
);
CREATE INDEX idx_thread_entity ON thread_entity_links(entity_id);

CREATE TABLE disambiguation_biases (
    context_key TEXT NOT NULL,
    preferred_entity_id TEXT NOT NULL REFERENCES nodes(id),
    correction_count INTEGER DEFAULT 1,
    last_corrected_at DATETIME,
    PRIMARY KEY (context_key, preferred_entity_id)
);

CREATE TABLE pending_extractions (
    pending_id TEXT PRIMARY KEY,
    idempotency_key TEXT NOT NULL,
    source_ref TEXT NOT NULL,
    from_version_id TEXT,
    to_version_id TEXT,
    queued_at DATETIME NOT NULL,
    updated_at DATETIME NOT NULL,
    priority TEXT NOT NULL DEFAULT 'normal',
    retries INTEGER NOT NULL DEFAULT 0,
    max_retries INTEGER NOT NULL DEFAULT 3,
    max_age_hours INTEGER NOT NULL DEFAULT 168,
    status TEXT NOT NULL DEFAULT 'queued',
    error_code TEXT
);
CREATE UNIQUE INDEX idx_pending_extractions_idempotency
ON pending_extractions(idempotency_key);
CREATE INDEX idx_pending_extractions_status_updated
ON pending_extractions(status, updated_at);
```

**Normative notes:**
- `updated_at` is never NULL. Every node mutation SHALL set `updated_at = CURRENT_TIMESTAMP`.
- The SQL schema in §3.2 implements the TypeScript type in §34.3. Field names MUST match.
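- Hard deletes are rare but supported. When they occur, FTS and dependent rows are cleaned through explicit delete triggers and `ON DELETE CASCADE`. SQLite enforces foreign keys, including cascades, only on connections that run `PRAGMA foreign_keys = ON`, so every write connection must set it.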
- `WorkProductPayload.addressed_concepts` is a derived/cache field rebuilt from canonical `addresses_concept` edges; it is not an independent source of truth.

### 3.3 Hybrid Vector + Symbolic Retrieval (sqlite-vec)

```sql
-- Vector embeddings for semantic search fallback
CREATE VIRTUAL TABLE vec_nodes USING vec0(
    node_id TEXT PARTITION KEY,
    embedding float[1024]
);
```

**Normative:** `1024` MUST be sourced from a single canonical embedding-dimension configuration value. The DDL, embedding infrastructure config, and migration pipeline must not duplicate incompatible literals.

Embedding model: See §3.7 Embedding Infrastructure for primary model configuration. Brute-force exact search in `sqlite-vec` handles 50K-100K vectors in 2-5ms on M4 Pro — no ANN index needed at this scale.

Embeddings are precomputed on node creation/update (async, Tier 3). Each node's `canonical_name: description` is embedded. Embedding model change requires full re-embedding via the migration pipeline (§3.7.3).
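
For intuition about what exact (non-ANN) search means, the sketch below scores every stored vector against the query and keeps the best `k`. It is a plain in-memory stand-in, not the `sqlite-vec` API; cosine similarity is used because §3.7 names it, and the three-dimensional vectors and the names `StoredVector`, `cosineSimilarity`, and `topK` are illustrative.

```ts
type StoredVector = { node_id: string; embedding: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Brute force: score every vector, sort descending, keep the first k.
function topK(query: number[], vectors: StoredVector[], k: number): { node_id: string; score: number }[] {
  return vectors
    .map((v) => ({ node_id: v.node_id, score: cosineSimilarity(query, v.embedding) }))
    .sort((p, q) => q.score - p.score)
    .slice(0, k);
}

const stored: StoredVector[] = [
  { node_id: "a", embedding: [1, 0, 0] },
  { node_id: "b", embedding: [0, 1, 0] },
  { node_id: "c", embedding: [1, 1, 0] },
];

console.log(topK([1, 0, 0], stored, 2).map((hit) => hit.node_id)); // ["a", "c"]
```

The dimension check matters for the same reason the embedding dimension is kept in one canonical config value: vectors of different lengths cannot be compared.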

### 3.4 FTS5 Keyword Pre-Filtering

```sql
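-- contentless_delete=1 (SQLite 3.43.0+) lets the triggers below issue plain
-- DELETE statements; a contentless FTS5 table without it rejects DELETE.
CREATE VIRTUAL TABLE fts_nodes USING fts5(
    canonical_name,
    searchable_text,
    tokenize='unicode61',
    content='',
    contentless_delete=1
);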

CREATE TRIGGER nodes_ai AFTER INSERT ON nodes BEGIN
  INSERT INTO fts_nodes(rowid, canonical_name, searchable_text)
  VALUES (
    NEW.rowid,
    NEW.canonical_name,
    NEW.canonical_name || ' ' ||
    COALESCE(
      (SELECT GROUP_CONCAT(normalized_alias, ' ')
       FROM aliases
       WHERE node_id = NEW.id),
      ''
    ) || ' ' ||
    COALESCE(json_extract(NEW.payload, '$.description'), '')
  );
END;

CREATE TRIGGER nodes_au AFTER UPDATE ON nodes BEGIN
  DELETE FROM fts_nodes WHERE rowid = OLD.rowid;
  INSERT INTO fts_nodes(rowid, canonical_name, searchable_text)
  VALUES (
    NEW.rowid,
    NEW.canonical_name,
    NEW.canonical_name || ' ' ||
    COALESCE(
      (SELECT GROUP_CONCAT(normalized_alias, ' ')
       FROM aliases
       WHERE node_id = NEW.id),
      ''
    ) || ' ' ||
    COALESCE(json_extract(NEW.payload, '$.description'), '')
  );
END;

CREATE TRIGGER nodes_ad AFTER DELETE ON nodes BEGIN
  DELETE FROM fts_nodes WHERE rowid = OLD.rowid;
END;

CREATE TRIGGER aliases_ai AFTER INSERT ON aliases BEGIN
  DELETE FROM fts_nodes
   WHERE rowid = (SELECT rowid FROM nodes WHERE id = NEW.node_id);
  INSERT INTO fts_nodes(rowid, canonical_name, searchable_text)
  SELECT
    n.rowid,
    n.canonical_name,
    n.canonical_name || ' ' ||
    COALESCE(
      (SELECT GROUP_CONCAT(normalized_alias, ' ')
       FROM aliases
       WHERE node_id = NEW.node_id),
      ''
    ) || ' ' ||
    COALESCE(json_extract(n.payload, '$.description'), '')
  FROM nodes n
  WHERE n.id = NEW.node_id;
END;

CREATE TRIGGER aliases_au AFTER UPDATE ON aliases BEGIN
  DELETE FROM fts_nodes
   WHERE rowid = (SELECT rowid FROM nodes WHERE id = NEW.node_id);
  INSERT INTO fts_nodes(rowid, canonical_name, searchable_text)
  SELECT
    n.rowid,
    n.canonical_name,
    n.canonical_name || ' ' ||
    COALESCE(
      (SELECT GROUP_CONCAT(normalized_alias, ' ')
       FROM aliases
       WHERE node_id = NEW.node_id),
      ''
    ) || ' ' ||
    COALESCE(json_extract(n.payload, '$.description'), '')
  FROM nodes n
  WHERE n.id = NEW.node_id;
END;

CREATE TRIGGER aliases_ad AFTER DELETE ON aliases BEGIN
  DELETE FROM fts_nodes
   WHERE rowid = (SELECT rowid FROM nodes WHERE id = OLD.node_id);
  INSERT INTO fts_nodes(rowid, canonical_name, searchable_text)
  SELECT
    n.rowid,
    n.canonical_name,
    n.canonical_name || ' ' ||
    COALESCE(
      (SELECT GROUP_CONCAT(normalized_alias, ' ')
       FROM aliases
       WHERE node_id = OLD.node_id),
      ''
    ) || ' ' ||
    COALESCE(json_extract(n.payload, '$.description'), '')
  FROM nodes n
  WHERE n.id = OLD.node_id;
END;
```

FTS5 keyword search is a contentless derived index rebuilt from `nodes` + `aliases`. Alias table changes are first-class refresh triggers. Hard-delete support remains explicit even when most operational deletes are realized as tombstone-first lifecycle transitions.
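
Every trigger above builds the same `searchable_text` value. As a reading aid, the sketch below mirrors that SQL expression in application code; `buildSearchableText` is a hypothetical helper and the sample description string is made up for the example.

```ts
// Mirrors: canonical_name || ' ' || COALESCE(GROUP_CONCAT(alias, ' '), '')
//          || ' ' || COALESCE(json_extract(payload, '$.description'), '')
function buildSearchableText(
  canonicalName: string,
  aliases: string[],
  description?: string | null
): string {
  // GROUP_CONCAT yields NULL for a node with no aliases; COALESCE maps it to ''.
  const aliasText = aliases.join(" ");
  return canonicalName + " " + aliasText + " " + (description ?? "");
}

console.log(buildSearchableText("Microsoft Word", ["word", "ms word"], "Word processor"));
// "Microsoft Word word ms word Word processor"
```

A node with no aliases and no description still indexes its canonical name, followed by two separator spaces.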

### 3.5 DOC1 Memory Unification

DOC1 memories are stored in the same SQLite database as the entity graph, as rows with `node_kind = 'memory_directive'`. This eliminates the dual-write consistency problem — one database, one transaction, one truth.

```sql
-- DOC1 memories as graph nodes
-- node_kind = 'memory_directive'
-- payload JSON contains DOC1-specific fields:
--   maturity_state, memory_type ('preference' | 'constraint' | 'standing_order' |
--   'vocabulary_mapping' | 'style_profile' | 'document_archetype' | 'heuristic'),
--   taint, inject_count, inject_correct_count,
--   recency_type, verification_ttl_days, etc.
-- All DOC1 governance operates on these rows via normal SQL queries
```

**What changes from the DOC1 perspective:**
- `entity_knowledge_write` command writes both the graph node AND the associated DOC1 memory in ONE SQLite transaction. No 2PC needed.
- JSONL event log is retained as an async audit/replay trail. A background worker exports committed transactions to JSONL. If JSONL export fails, SQLite remains the authoritative truth and export retries.
- DOC1's `memory_store.jsonl` becomes a legacy format. Migration: import existing JSONL memories into SQLite `nodes` table at startup if no SQLite records exist.
- DOC1's Write Gate, maturity transitions, calibrated_confidence (Beta α/β), and all governance rules operate on SQLite rows. The governance LOGIC doesn't change — only the storage backend.
- DOC1's query seams (§4.6 in DOC1 R1) query SQLite instead of parsing JSONL.

### 3.6 Nightly Database Maintenance

```sql
-- Run after all nightly cleanup jobs complete
-- Rebuilds the database file and reclaims disk space
VACUUM;
```

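Schedule: nightly if >100 nodes were archived/deleted that day, otherwise weekly. Runs after trace compaction, experience compaction, candidate decay, suggestion expiry, and archive operations. Because `nodes` has no `INTEGER PRIMARY KEY`, `VACUUM` may renumber its rowids; `fts_nodes` is keyed on `nodes.rowid`, so it should be rebuilt from `nodes` + `aliases` after each `VACUUM`.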
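
The schedule rule can be written as a single predicate. In this sketch, "weekly" is read as "at least 7 days since the last `VACUUM`", which is an interpretation and not stated above; the function and parameter names are illustrative.

```ts
function shouldVacuumTonight(
  nodesArchivedOrDeletedToday: number,
  daysSinceLastVacuum: number
): boolean {
  // Heavy-churn day: more than 100 archived or deleted nodes.
  if (nodesArchivedOrDeletedToday > 100) return true;
  // Otherwise fall back to the weekly cadence (assumed to mean 7+ days).
  return daysSinceLastVacuum >= 7;
}

console.log(shouldVacuumTonight(101, 1)); // true
console.log(shouldVacuumTonight(100, 1)); // false
console.log(shouldVacuumTonight(0, 7));   // true
```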

### 3.7 Embedding Infrastructure

The embedding model is infrastructure, not a configurable agent. Unlike LLM extraction models which can fall back to cheaper alternatives, the embedding model is the LANGUAGE of the vector space. If you switch models without migration, cosine similarity produces garbage. The embedding model config locks when set and changes ONLY through the migration pipeline.

#### 3.7.1 Configuration

```ts
type EmbeddingInfraConfig = {
  active_model_id: string;               // e.g., "qwen3-0.6b-int8"
  embedding_dimension: number;           // canonical dimension value, currently 1024
  execution_mode: "local_mlx" | "local_ollama";
  locked: true;                          // ALWAYS true
  migration_in_progress?: {
    target_model_id: string;
    target_embedding_dimension: number;
    progress_pct: number;
    started_at: string;
  };
};
```

`embedding_dimension` is the single canonical source for vec-dimension wiring in this revision.

This is NOT in the model selection UI alongside LLM agents. It's in Settings > System > Vector Infrastructure, with clear warnings about migration requirements.

#### 3.7.2 Default Model

Qwen3-Embedding-0.6B (or 4B quantized to int8) running locally via MLX. Nomic-embed-text is the alternative, chosen only if Qwen3 does not run on the hardware. Moving from one model to the other goes through the migration pipeline (§3.7.3); it is never a runtime fallback.

**The embedding model is NOT subject to fallback.** Unlike extraction models that can degrade to cheaper alternatives, the embedding model must remain consistent across the entire vector space. If the primary embedding model is unavailable, embedding generation pauses (queued for retry) rather than falling back to a different model.

#### 3.7.3 Incremental Vector Migration Pipeline

When upgrading:
1. EC creates `vec_nodes_v_next` with new dimension size
2. Tier 3 background worker re-embeds in batches of 100
3. Foreground routing continues to read the existing `vec_nodes` for the whole migration
4. When v_next reaches 100% parity, atomic swap:

```sql
BEGIN EXCLUSIVE TRANSACTION;
DROP TABLE vec_nodes;
ALTER TABLE vec_nodes_v_next RENAME TO vec_nodes;
UPDATE system_config SET active_embedding_model = :new_model_id;
COMMIT;
```

Zero downtime. Zero hallucinated routing during upgrade.
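
The control flow of the pipeline (shadow table, batched re-embedding, swap only at full parity) can be sketched in memory as follows. The maps stand in for `vec_nodes` and `vec_nodes_v_next`; `MigrationState`, `runBatch`, `trySwap`, and the toy embedder are hypothetical names, and nodes created while a migration is running are not handled here.

```ts
type VectorTable = Map<string, number[]>;

type MigrationState = {
  live: VectorTable; // stands in for vec_nodes
  next: VectorTable; // stands in for vec_nodes_v_next
  pending: string[]; // node ids still to re-embed
};

function startMigration(live: VectorTable): MigrationState {
  return { live, next: new Map(), pending: Array.from(live.keys()) };
}

// One background step: re-embed up to batchSize nodes into the shadow table.
function runBatch(
  state: MigrationState,
  texts: Map<string, string>,
  embedNext: (text: string) => number[],
  batchSize = 100
): void {
  const batch = state.pending.splice(0, batchSize);
  for (const id of batch) {
    state.next.set(id, embedNext(texts.get(id) ?? ""));
  }
}

function progressPct(state: MigrationState): number {
  return state.live.size === 0 ? 100 : (state.next.size / state.live.size) * 100;
}

// Swap only at full parity; until then readers keep using `live`.
function trySwap(state: MigrationState): boolean {
  if (state.pending.length > 0 || state.next.size !== state.live.size) return false;
  state.live = state.next;
  state.next = new Map();
  return true;
}

// Toy data: 250 nodes with 2-dimensional vectors, migrating to 3 dimensions.
const liveTable: VectorTable = new Map();
const nodeTexts = new Map<string, string>();
for (let i = 0; i < 250; i++) {
  liveTable.set(`n${i}`, [i, 0]);
  nodeTexts.set(`n${i}`, `node ${i}`);
}
const toyEmbedder = (text: string): number[] => [text.length, 0, 0];

const migration = startMigration(liveTable);
runBatch(migration, nodeTexts, toyEmbedder);
runBatch(migration, nodeTexts, toyEmbedder);
console.log(Math.round(progressPct(migration)), trySwap(migration)); // 80 false
runBatch(migration, nodeTexts, toyEmbedder);
console.log(trySwap(migration)); // true
```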

#### 3.7.4 Semantic Chunking for Documents

Documents are embedded as chunks (semantic boundaries — headers, paragraph breaks), not whole-document. vec_nodes supports multiple vectors per document. DOC18 LlamaIndex sidecar handles chunking.

#### 3.7.5 Future: Dual-Vector Architecture (flagged, not v1)

Architecture supports adding a second embedding column for domain-specific precision (e.g., Voyage Law via ZDR API for legal nodes). Each column queried in its own language — never cross-compared. Implementation deferred until user evaluates Voyage Law.

---

## 4. Node Taxonomy — 11 Canonical Types

The entity graph uses 11 canonical node types with rich metadata. R5.7 adds `tool_capability` as a first-class canonical node kind while preserving the simplification made relative to v4, which removes unnecessary granularity and keeps routing and promotion semantically clear.

| Canonical Type | What it represents | Absorbs from v4 |
|---|---|---|
| `world_entity` | Matters, persons, orgs, calendars, folders, email accounts | Via `entity_type` subtype field |
| `application` | Apps + lightweight capability metadata | Apps. UI affordances become metadata, not separate nodes |
| `procedure` | Atomic and composite semantic-intent procedures | Procedures + composite skills (`is_composite` flag + `used_procedure_ids[]`) |
| `execution_trace` | Evidence nodes — records of what happened | Stays unchanged |
| `standing_procedure` | Trigger-action behavioral automation | Stays unchanged |
| `goal` | Narrative strategic context | Stays unchanged |
| `domain_concept` | Authority-backed knowledge with hierarchical concepts | Stays unchanged |
| `obligation` | Lightweight commitments — lighter than tasks | Stays unchanged |
| `work_product` | Documents, briefs, filings as first-class entities | NEW — fills the document entity gap |
| `memory_directive` | Preferences, constraints, vocabulary, styles, archetypes | DOC1-governed typed memories with schemas |
| `tool_capability` | Learned semantic capabilities and reliability facts about runtime tools | R5.7 addition; runtime existence still belongs to DOC24 capability registry |

**Key changes from v4 and R5.7:**
- **UI affordances are NOT separate nodes.** They become metadata on application entities (`known_capabilities: string[]` for bootstrap/retrieval). This addresses the strongest red team critique: UI nodes break when apps update. The semantic intent approach (§4.3) makes durable UI tracking unnecessary.
- **Vocabulary mappings, communication styles, and document archetypes** become typed `memory_directive` nodes governed by DOC1's Write Gate and maturity lifecycle. They connect to entities through the edge model.
- **Actor domain overlays** are stored as a separate overlay table (not a separate node type), keyed by `actor_entity_id + domain`.
- **Composite skills** are folded into the `procedure` type with an `is_composite: true` flag and `used_procedure_ids: string[]`.
- **Tool capability knowledge** is now a first-class graph concept. DOC72 stores learned reliability, alternatives, and procedure/tool relationships as `tool_capability` nodes. DOC24 still owns the live capability registry and runtime invocation truth; DOC72 never becomes a second runtime registry.

### 4.1 Knowledge Node Scoping

Every knowledge node carries ownership and visibility fields:

```ts
type KnowledgeNodeScope = {
  principal_id: string;                    // Who "owns" this knowledge (defaults to current user)
  scope: "firm_shared" | "matter_scoped" | "personal" | "private";
  scope_inference_basis: "matter_linked" | "domain_profile" | "personal_surface" | "user_override" | "default";
};
```

**Scope inference at write time (automatic, no manual classification):**
- Content linked to a matter entity OR extracted by a professional domain profile → `firm_shared`
- Content from personal notes not linked to any matter, personal preferences, hobbies → `personal`
- Content the user explicitly marks private → `private`
- Default for ambiguous → `personal`

Users can override scope conversationally. When multi-user access is built later, `firm_shared` nodes are visible to all users, `matter_scoped` to users assigned to that matter, `personal` and `private` only to the owning principal.

These fields are stored in the node's `payload` JSON and populated automatically by the entity creation pipeline at write time. Every `entity_knowledge_write` command infers scope from context if not explicitly provided.
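A minimal sketch of the write-time inference rules above. `WriteContext` and `inferScope` are hypothetical names, and the precedence order (explicit user choice first, then matter link, then domain profile) is an assumption; the rules above do not state an order.

```ts
type Scope = "firm_shared" | "matter_scoped" | "personal" | "private";
type ScopeInferenceBasis =
  | "matter_linked" | "domain_profile" | "personal_surface" | "user_override" | "default";

// Hypothetical description of what the pipeline knows at write time.
type WriteContext = {
  userOverride?: Scope;                     // e.g. the user explicitly marked the content private
  linkedMatterId?: string;                  // set when the content is linked to a matter entity
  extractedByProfessionalProfile?: boolean; // extracted by a professional domain profile
  fromPersonalSurface?: boolean;            // personal notes, preferences, hobbies
};

type InferredScope = { scope: Scope; scope_inference_basis: ScopeInferenceBasis };

function inferScope(ctx: WriteContext): InferredScope {
  // Assumed precedence: an explicit user choice always wins.
  if (ctx.userOverride) {
    return { scope: ctx.userOverride, scope_inference_basis: "user_override" };
  }
  if (ctx.linkedMatterId) {
    return { scope: "firm_shared", scope_inference_basis: "matter_linked" };
  }
  if (ctx.extractedByProfessionalProfile) {
    return { scope: "firm_shared", scope_inference_basis: "domain_profile" };
  }
  if (ctx.fromPersonalSurface) {
    return { scope: "personal", scope_inference_basis: "personal_surface" };
  }
  // Ambiguous content defaults to personal.
  return { scope: "personal", scope_inference_basis: "default" };
}
```

Defaulting to `personal` keeps ambiguous content visible only to the owning principal until a matter link or user override widens it.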

### 4.2 Application Entities

Durable representations of software applications and tools Elnor interacts with.

```ts
// Stored in nodes table with node_kind = 'application'
type ApplicationPayload = {
  canonical_name: string;                // "Microsoft Word"
  aliases: string[];                     // ["Word", "MS Word", "word"]
  platform: "mac" | "windows" | "web" | "cross_platform" | "unknown";
  app_variant?: string;                  // "desktop", "web", "365"
  version_hint?: string;                 // "16.x", "2024", etc.
  source: "system" | "trace" | "manual";
  known_capabilities: string[];          // Lightweight list for bootstrap/retrieval
  // e.g., ["references section", "styles", "track changes", "comments",
  //        "find and replace", "headers and footers", "table of contents"]
};
```

**Creation trigger:** First successful interaction with the application. Auto-confirmed — applications are facts, not speculation.

**Known capabilities:** A lightweight list of capability areas the application provides. NOT durable UI element tracking — these are semantic capability descriptions used for retrieval and bootstrap. Populated initially by LLM bootstrap, refined by trace observation. The LLM resolves these capabilities to the current UI at runtime.

### 4.3 Procedure Nodes — Semantic Intent

Reusable procedural knowledge stored as SEMANTIC ACTION DESCRIPTIONS, not mechanical UI step sequences. This is the single most important architectural change from v4.

**Before (v4):** Procedure steps stored UI navigation: "Click Home tab → Click Font Size dropdown → Type 12." These break when apps update, require durable UI affordance nodes, and produce a fragile graph.

**After (v5):** Procedure steps store semantic intent: "Set the font size to 12 points." The LLM (or vision model, or accessibility tree reader) resolves the intent to the current UI at runtime. This is resilient to UI changes, eliminates durable UI nodes, and produces a smaller, cleaner graph.

```ts
// Stored in nodes table with node_kind = 'procedure'
type ProcedurePayload = {
  canonical_name: string;                 // "Insert Table of Contents"
  aliases: string[];
  description: string;                    // concise purpose statement
  parent_app_id?: string;                 // link to application entity
  is_composite: boolean;                  // true for composite skills
  used_procedure_ids?: string[];          // for composites: constituent procedure IDs

  // Semantic intent steps (NOT mechanical UI steps)
  steps: SemanticStep[];

  preconditions?: string[];               // what must be true before execution
  postconditions?: string[];              // what is true after successful execution
  environment_scope: ProcedureContext;
  source: "trace_captured" | "onboarding" | "manual" | "doc3_learned";
  lifecycle_state: "captured" | "candidate" | "validated" | "promoted";
};

type SemanticStep = {
  step_index: number;
  intent: string;                         // "Set font size to 12pt"
                                          // NOT "Click Home → Click Font → Type 12"
  notes?: string;                         // additional context
};

type ProcedureContext = {
  app_entity_id: string;
  platform: "mac" | "windows" | "web" | "unknown";
  app_variant?: string;                   // "desktop", "web", "365"
};
```

**Why semantic intent is stronger than UI tracking:**
- Shared procedures still compound — "set font size" is reused across skills. The INTENT is shared; UI navigation is resolved at runtime.
- The graph is ~100x smaller without durable UI affordance nodes.
- Procedures survive application updates without invalidation.
- Cross-platform procedures work naturally — the intent "set font size to 12pt" is the same on Mac and Windows even though the UI path differs.

**Ephemeral UI context during traces:** When an execution trace is captured, the actual UI elements navigated exist in the Working Context during the trace but are NEVER persisted as durable graph nodes. Procedures map to application entities directly.

### 4.4 Work Product Entities

**Normative payload note:** The conceptual summary below remains useful, but R5.7 defines the canonical document payload split as: durable document identity/lifecycle in `work_product`, and heavy document-intelligence operational artifacts (raw text, page summaries, model caches, send-history) in a derived document-intelligence store keyed by node ID. Precomputed summaries are retrieval aids, not direct graph writes.

Documents, briefs, filings, and other artifacts that need graph-linked knowledge tracking. This fills the gap identified during red team review — documents were referenced throughout v4 but had no canonical entity type.

```ts
// Stored in nodes table with node_kind = 'work_product'
type WorkProductPayload = {
  document_type: string;              // "motion_to_dismiss", "expert_report", "client_update"
  archetype_ref?: string;             // link to archetype memory_directive if applicable
  matter_entity_id?: string;
  file_ref?: string;                  // path or artifact ID
  version?: number;
  prior_version_id?: string;          // for evolution tracking (§17.6)
  status: "draft" | "review" | "filed" | "final";
  addressed_concepts?: string[];      // domain concept IDs
  cited_authorities?: string[];       // provenance entry IDs
  red_team_results?: string[];        // CANDOR findings linked
};
```

Connections: `about_case` → matter entity, `addresses_concept` → domain concepts, `uses_style` → style memory_directive, `created_by_procedure` → drafting procedure, `serves_goal` → strategic goal, `reviewed_by` → CANDOR/red team results, `version_of` → prior version.

### 4.5 Memory Directive Nodes

**Normative payload note:** The simplified subtype summary below is now backed by the canonical absorbed `MemoryDirectiveKnowledgeContractSchema` in §4A.4. The extraction pipeline classifies conversation capture scope separately from durable assertion class; the Write Gate uses the durable assertion class only for graph-eligible items.

DOC1-governed memories stored as graph nodes. This unifies storage while preserving DOC1's governance ownership. The `memory_type` field distinguishes subtypes:

| Memory type | What it represents | Example |
|---|---|---|
| `preference` | How the user wants things done | "Use 12pt Times New Roman for legal briefs" |
| `constraint` | What NOT to do | "Never use Word web for final formatting" |
| `standing_order` | Persistent behavioral directive | "Always CC the paralegal on Henderson emails" |
| `vocabulary_mapping` | User-specific phrase-to-meaning | "'Calendar it' = create calendar event" |
| `style_profile` | Communication/writing style rules | "Litigation external: very formal, Bluebook citations" |
| `document_archetype` | Structural template for document types | "Motion to dismiss: Introduction → Facts → Standard → Argument → Conclusion" |
| `heuristic` | Generalized pattern from DOC1 §12 | "Cases typically need calendars within 48 hours" |
| `correction` | Explicit user correction | "Pineapple is fine now — managed with medication" |

These are governed by DOC1's Write Gate and maturity lifecycle. DOC72 defines the schema; DOC1 defines the governance rules.

### 4.6 Tool Capability Nodes

Runtime capability truth remains owned by DOC24's capability registry and live action state. DOC72 stores learned, durable semantic knowledge about tools — what they do, which procedures execute through them, reliability trends, alternatives, and user-experienced fit by context.

```ts
type ToolCapabilityPayload = {
  tool_id: string;
  tool_source: "mcp" | "wrapper" | "applescript" | "accessibility_bridge";
  description?: string;
  capabilities_summary: string[];
  runtime_binding_ref?: string;
};
```

Relations introduced in R5.7:
- `has_tool`: application → tool_capability
- `executes_via`: procedure → tool_capability
- `alternative_to`: tool_capability → tool_capability
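The endpoint rules for these three relations can be expressed as a small lookup table. This is a sketch under assumptions: `RELATION_ENDPOINTS` and `isValidToolEdge` are hypothetical names, and whether EC enforces endpoints this way is not stated here.

```ts
type NodeKind = "application" | "procedure" | "tool_capability";
type ToolRelation = "has_tool" | "executes_via" | "alternative_to";

// Source and target node kinds for each relation, as listed above.
const RELATION_ENDPOINTS: Record<ToolRelation, { from: NodeKind; to: NodeKind }> = {
  has_tool: { from: "application", to: "tool_capability" },
  executes_via: { from: "procedure", to: "tool_capability" },
  alternative_to: { from: "tool_capability", to: "tool_capability" },
};

// Hypothetical check: true when the edge connects the node kinds the relation expects.
function isValidToolEdge(relation: ToolRelation, fromKind: NodeKind, toKind: NodeKind): boolean {
  const expected = RELATION_ENDPOINTS[relation];
  return expected.from === fromKind && expected.to === toKind;
}
```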

### 4.7 Remaining Node Types

**Execution traces** (`execution_trace`): Evidence nodes recording what actually happened during a procedure execution. Traces are facts, not promoted. They feed procedure confidence and experience records. Retained per procedure: most recent 10 with full detail, older traces archived with statistical summary.
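The per-procedure retention rule for traces can be sketched as follows. The `Trace` and `ArchivedSummary` shapes are assumptions made for illustration: the fields of the statistical summary are not specified here, so count, success rate, and mean duration are placeholders.

```ts
// Hypothetical shapes: not the execution_trace payload contract.
type Trace = {
  trace_id: string;
  completed_at: string; // ISO 8601 UTC timestamp, all in the same format
  succeeded: boolean;
  duration_ms: number;
};

type ArchivedSummary = { count: number; success_rate: number; mean_duration_ms: number };

const FULL_DETAIL_LIMIT = 10; // most recent traces kept with full detail

function applyRetention(traces: Trace[]): { retained: Trace[]; archived: ArchivedSummary | null } {
  // Uniform ISO 8601 UTC strings sort chronologically as plain strings.
  const newestFirst = [...traces].sort((a, b) => b.completed_at.localeCompare(a.completed_at));
  const retained = newestFirst.slice(0, FULL_DETAIL_LIMIT);
  const older = newestFirst.slice(FULL_DETAIL_LIMIT);
  if (older.length === 0) {
    return { retained, archived: null };
  }
  const successes = older.filter((t) => t.succeeded).length;
  const totalDuration = older.reduce((sum, t) => sum + t.duration_ms, 0);
  return {
    retained,
    archived: {
      count: older.length,
      success_rate: successes / older.length,
      mean_duration_ms: totalDuration / older.length,
    },
  };
}
```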

**Standing procedures** (`standing_procedure`): Trigger-action automation. Full schema in §18.

**Goals** (`goal`): Narrative strategic context. Full schema in §29.

**Domain concepts** (`domain_concept`): Authority-backed knowledge with hierarchical concept models. Full schema in §35.

**Obligations** (`obligation`): Lightweight social commitments lighter than DOC23 tasks. Created from conversation mining, email extraction, or user statements. Tracked with deadlines and urgency. Can be promoted to DOC23 tasks when formalization is needed.

```ts
// Stored in nodes table with node_kind = 'obligation'
type ObligationPayload = {
  description: string;                   // "Send discovery responses to opposing counsel"
  obligor_hint?: string;                 // "Will" (usually the user)
  obligee_entity_ref?: string;           // person/org entity this is owed to
  linked_entity_refs: string[];          // matter, task, document entities involved
  deadline?: string;                     // ISO date if explicit
  deadline_source?: string;             // "conversation", "email", "court_order", "calendar"
  urgency: "critical" | "high" | "normal" | "low" | "unknown";
  obligation_status: "active" | "completed" | "overdue" | "cancelled" | "delegated";
  completed_at?: string;
  completed_evidence_ref?: string;
  source: "conversation_mined" | "email_extracted" | "user_stated" | "onboarding";
  source_artifact_ref?: string;
};
```

**World entities** (`world_entity`): Matters, persons, organizations, calendars, folders, email accounts, and other real-world entities. Distinguished by `entity_type` subtype field in payload. Existing DOC24 §8.5 promotion rules apply.

**System entity subtypes within `world_entity`:** ELNOR's own operational objects — tasks, rooms, panels, forums, notes — are part of the user's world and need graph representation. They are stored as `world_entity` nodes with system-specific `entity_type` subtypes:

| Entity type subtype | What it represents | Example |
|---|---|---|
| `task` | A DOC23 task pipeline | "Henderson deadline extraction task" |
| `room` | A DOC12 multi-agent room | "Henderson strategy room" |
| `panel` | A DOC12 deliberation panel | "Narayanan expert selection panel" |
| `forum` | A DOC12 forum | "Securities litigation knowledge forum" |
| `note` | A DOC20 note | "Henderson research notes" |
| `document_view` | A document opened in the Document Viewer | "Henderson MTD v3 review session" |

These system entities are linkable to matters (via `about_case`), referenced by standing procedures, carry experience records, and participate in graph traversal like any other entity. When Elnor resolves "the Henderson research notes," it resolves to a `world_entity` with `entity_type: "note"` linked to the Henderson matter.

```ts
// System entity payload examples
type SystemEntityPayload = {
  entity_type: "task" | "room" | "panel" | "forum" | "note" | "document_view";
  system_id: string;                     // DOC23 task_id, DOC12 room_id, DOC20 note_id, etc.
  display_name: string;
  linked_matter_ids?: string[];
  status: "active" | "archived" | "completed" | "closed";
  participants?: ParticipantRef[];       // For rooms/panels/forums
  created_by: "user" | "agent" | "system";
  content_summary?: string;              // Brief summary of what this object contains
};

type ParticipantRef = {
  participant_id: string;               // agent_id or "user"
  participant_type: "user" | "agent";
  participant_name: string;             // "Will", "Elnor", "Prism", "Nova"
  role?: string;                        // "moderator", "reviewer", "panelist"
};
```

### 4.8 Actor Domain Overlay

Existing person/org entities gain domain-specific knowledge without changing the base entity, stored as a separate overlay table:

```sql
CREATE TABLE actor_domain_overlays (
    overlay_id TEXT PRIMARY KEY,
    actor_entity_id TEXT NOT NULL REFERENCES nodes(id),
    domain TEXT NOT NULL,
    role TEXT NOT NULL,  -- 'judge' | 'opposing_counsel' | 'party' | 'expert' | etc.
    domain_notes JSON,  -- Array<{ note, confidence, source }>
    associated_concept_ids JSON,
    relevant_decisions JSON,
    UNIQUE(actor_entity_id, domain)
);
```

"What does Judge Chen do on securities motions to dismiss?" → traverse: Judge Chen → overlay → relevant_decisions filtered by concept + domain → return summary with citations from provenance chain. Every assertion backed by verifiable authority.

---


## 4A. Canonical Shared Knowledge Contract Schemas (Absorbed from KDA R2)

The following incorporated material is now normative DOC72 owner text. It is carried directly from the accepted KDA shared-contract revision because the adjudication requires DOC72 payload schemas to absorb the KDA shared knowledge contracts instead of maintaining thinner parallel payload examples. Rendering templates and runtime rendering remain DOC24/KDA-owned. If a simplified payload sketch elsewhere in this document conflicts with the schemas below, this section governs.

### 4A.0 Architectural Invariants

#### 4A.0.1 Contract ownership — DOC72 owns shape, DOC24 owns delivery

**[Source: A1, SR-32, SR-36, ChatGPT III.15]**

The shared knowledge contracts defined in this document are **DOC72-governed payload schemas**. They define the shape of knowledge stored in the entity graph. DOC24 **consumes** these contracts for rendering and delivery — it does not own them.

- **DOC72 owns:** All shared knowledge contract schemas (`ProcedureKnowledgeContractSchema`, `MemoryDirectiveKnowledgeContractSchema`, etc.), the entity graph that stores them, and write-time validation via `NodePayloadValidatorRegistry` (SR-32).
- **DOC24 owns:** Rendering templates, tiered injection (compact/standard/full), dynamic composition, injection manifests, execution-strategy cache, routing integration, packet assembly, and the rendering pipeline.
- **EC (ELNOR Core) is the sole durable writer.** All graph mutations go through `entity_knowledge_write`. Q Dashboard is read/control surface only.

These contracts originated in the KDA shared-contract revision, where they were designed alongside the rendering pipeline. With their absorption here, DOC72 adopts them as canonical payload type definitions, and this section is the authoritative schema reference.

#### 4A.0.2 One context authority — DOC24 feeds existing assembly chain

**[Source: SR-36]**

DOC24's knowledge delivery pipeline MUST feed DOC10/DOC11/OpenClaw's existing prompt assembly chain. DOC10 builds routing facts. DOC11 assembles the lean annotation block and owns final runtime prompt assembly via OpenClaw. DOC24 produces rendered knowledge cards that are consumed by this chain. DOC24 MUST NOT create a parallel prompt assembler that bypasses DOC10/DOC11. The rendered cards are inputs to the existing assembly pipeline, not a replacement for it.

#### 4A.0.3 No LLM call at render time

Rendering is deterministic string assembly from stored graph data plus execution-strategy cache lookups. The LLM does heavy lifting during interpretation (intake). Delivery is pure template rendering. See §3.1.
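A minimal sketch of what "deterministic string assembly" means in practice. The card layout, `StoredStep`, and the `render*` functions are hypothetical; the real templates are DOC24-owned. Deriving the placeholder text from the parameter name is also an assumption.

```ts
// Hypothetical, simplified view of stored step data.
type StepParameter = { name: string; value: string; is_variable: boolean };
type StoredStep = { step_index: number; intent: string; parameters: StepParameter[] };

// Variable parameters render as a placeholder; fixed ones render their stored value.
function renderParameter(p: StepParameter): string {
  const shown = p.is_variable ? `[${p.name}]` : `"${p.value}"`;
  return `${p.name}: ${shown}`;
}

function renderStep(step: StoredStep): string {
  const line = `${step.step_index}. ${step.intent}`;
  if (step.parameters.length === 0) {
    return line;
  }
  return `${line} (${step.parameters.map(renderParameter).join("; ")})`;
}

// Pure function of stored data: the same payload always yields the same text.
function renderSteps(steps: StoredStep[]): string {
  return [...steps]
    .sort((a, b) => a.step_index - b.step_index)
    .map(renderStep)
    .join("\n");
}
```

Nothing in this path calls a model, so rendering cost and output are predictable for a given graph state.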

#### 4A.0.4 Procedures store semantic intent, not mechanical UI steps

"Insert a table of contents with automatic heading detection" — NOT "Click References tab → Click Table of Contents." The LLM bridges intent to current UI at runtime.

---

### 4A.1 The Shared Contract Architecture

#### 4A.1.1 Design principles

1. **The contract IS the graph payload.** It is not a separate schema: the `ProcedureKnowledgeContract` is stored directly as the procedure node's `payload` JSON in DOC72. Interpretation fills it in. The graph stores it. Rendering reads it. One schema, three consumers. **Ownership: DOC72 governs the schema; DOC24 consumes it for rendering (§4A.0.1).**

2. **Universal fixed fields + extensible annotations.** Fixed fields cover what EVERY instance of a node type needs regardless of domain. Annotations cover everything else with an open vocabulary.

3. **Domain-agnostic field naming.** Fields describe the ROLE of information, not the DOMAIN. Not `jurisdictional_notes` but `contextual_variations`. Not `key_authorities` but `reference_sources`.

4. **The rendering template is assembly, not generation.** Every field the rendering needs, the interpretation was asked to produce. No LLM call at render time. Pure string interpolation with graph queries.

5. **Configurable extraction model.** The LLM that fills in the contract is user-configurable, not hardcoded.

6. **Canonical payload stores semantic knowledge only.** Execution bindings (tool directives, invalidation flags, resolution timestamps) live in a derived execution-strategy cache (§3.5), not in the canonical DOC72 payload. **[Source: SR-19]**

#### 4A.1.2 The universal annotation model

Instead of pre-defining every possible annotation field per node type, use a typed annotations array with an open vocabulary:

```ts
// packages/contracts/src/knowledge/annotation.ts
import { z } from "zod";

export const KnowledgeAnnotationSchema = z.object({
  annotation_type: z.string().max(60),        // Open vocabulary — NOT an enum
  // Examples: "jurisdictional_variation", "genre_convention", "version_specific_behavior",
  //           "prerequisite_check", "signal_chain_position", "court_requirement",
  //           "hardware_dependency", "client_preference", "framework_gotcha"
  
  content: z.string().max(2000),              // The annotation content
  
  // Optional structured metadata
  context_scope: z.string().max(200).optional(),   // When does this annotation apply?
  source: z.enum(["user_stated", "inferred", "imported", "system"]).optional(),
  confidence: z.number().min(0).max(1).optional(),
  
  schema_version: z.literal(1),
});
```

The annotation_type is a free string. The system learns which types exist from usage. No enum, no pre-definition, no schema migration when a new type emerges. A legal user produces annotations of type `court_requirement`. A musician produces `gear_setting`. A developer produces `framework_gotcha`. All stored the same way.
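Because the vocabulary is open, the set of annotation types in use can only be discovered from stored data. A minimal sketch of that discovery, using a plain TypeScript mirror of the schema; `countAnnotationTypes` is a hypothetical helper, not part of the contract package.

```ts
// Plain-type mirror of the required KnowledgeAnnotationSchema fields, for illustration only.
type KnowledgeAnnotation = {
  annotation_type: string; // open vocabulary
  content: string;
  context_scope?: string;
  schema_version: 1;
};

// Count how often each annotation_type appears across stored annotations.
function countAnnotationTypes(annotations: KnowledgeAnnotation[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const annotation of annotations) {
    const current = counts.get(annotation.annotation_type) ?? 0;
    counts.set(annotation.annotation_type, current + 1);
  }
  return counts;
}
```

A new type such as `gear_setting` simply appears as a new key the first time it is written; no schema change is involved.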

#### 4A.1.3 Shared contract: Procedures

**[Incorporates: SR-07, SR-08, SR-09, SR-19, SR-21, SR-22, SR-23, SR-27, SR-31]**

```ts
// packages/contracts/src/knowledge/procedure-knowledge-contract.ts
import { z } from "zod";
import { KnowledgeAnnotationSchema } from "./annotation";

export const SemanticStepSchema = z.object({
  step_index: z.number().int().min(1),
  
  // The one-line summary
  intent: z.string().max(500),
  
  // Full instructions — what the LLM actually reads to execute
  detailed_instructions: z.string().max(3000).optional(),
  
  // Specific values to use in this step
  // [SR-08: is_variable for parameterized reuse]
  // [SR-23: literal_retention_class for privacy enforcement]
  parameters: z.array(z.object({
    name: z.string().max(80),
    value: z.string().max(200),
    context: z.string().max(200).optional(),
    is_variable: z.boolean().default(false),              // [SR-08]
    literal_retention_class: z.enum([                     // [SR-23]
      "placeholder_only",          // Default: value is a placeholder like "[Client Name]"
      "retain_if_user_confirmed",  // Retained only after explicit user confirmation
      "retain_allowed",            // Generic values safe to retain (e.g., "Times New Roman")
    ]).default("placeholder_only"),
  })).default([]),
  
  // How to verify the step succeeded
  verification: z.object({
    check: z.string().max(500),
    method: z.enum([
      "applescript_query", "ax_state_check", "visual_confirmation",
      "user_confirmation", "mcp_tool_result", "none",
    ]),
    query: z.string().max(500).optional(),
  }).optional(),
  
  // When to skip this step
  conditional: z.string().max(500).optional(),
  
  // Why this step matters (from narration)
  rationale: z.string().max(500).optional(),
  
  // What app capability area this step exercises
  capability_area: z.string().max(80).optional(),
  
  // Whether this step is mechanical (tool-driven), cognitive (LLM evaluation), or hybrid
  // [SR-09]
  step_nature: z.enum(["mechanical", "cognitive", "hybrid"]).default("mechanical"),
  
  // Source event references (ephemeral — purged after session cleanup)
  source_event_ids: z.array(z.string().uuid()).optional(),

  // NOTE: resolved_execution is NOT stored here. It lives in the derived
  // execution-strategy cache (§3.5). The canonical payload stores semantic
  // knowledge only. [SR-19]
});

export const ProcedureConstraintSchema = z.object({
  rule: z.string().max(500),
  scope: z.enum(["this_procedure", "this_app", "global"]),
  priority: z.enum(["absolute", "strong", "default", "suggestion"]),
  source: z.enum(["user_stated", "inferred", "imported", "system"]),
});

export const ConstituentOrderingSchema = z.object({
  procedure_id: z.string().max(160),
  order_position: z.number().int(),
  depends_on: z.array(z.string().max(160)).default([]),
  dependency_reason: z.string().max(300).optional(),
});

// [SR-20: app_name added]
export const EnvironmentScopeSchema = z.object({
  app_entity_id: z.string().max(160),
  app_name: z.string().max(80),                            // [SR-20]
  platform: z.enum(["mac", "windows", "web", "unknown"]),
  app_variant: z.string().max(40).optional(),
  app_bundle_id: z.string().max(120).optional(),
});

// [SR-27: contract quality metadata]
export const ContractQualitySchema = z.object({
  extraction_model_ref: z.string().max(160),
  overall_quality_score: z.number().min(0).max(1),
  field_quality: z.record(z.string(), z.number().min(0).max(1)).default({}),
  missing_fields: z.array(z.string()).default([]),
});

export const ProcedureKnowledgeContractSchema = z.object({
  // ── IDENTITY ──
  canonical_name: z.string().max(200),
  description: z.string().max(1000),
  
  // ── APPLICABILITY ──
  use_conditions: z.array(z.string().max(300)).min(1),
  non_use_conditions: z.array(z.string().max(300)).default([]),
  
  // ── ROUTING DISCOVERY — separate from applicability [SR-21] ──
  // These are routing-only metadata. NOT rendered into injection cards.
  // Required: an empty `.default([])` would contradict `.min(3)`.
  trigger_phrases: z.array(z.string().max(120)).min(3).max(8),
  semantic_lookup_phrases: z.array(z.string().max(160)).max(8).default([]),
  
  // ── PREREQUISITES AND POSTCONDITIONS ──
  preconditions: z.array(z.string().max(500)).default([]),
  postconditions: z.array(z.string().max(500)).default([]),
  
  // ── STEPS ──
  steps: z.array(SemanticStepSchema).min(1),
  
  // ── COMPOSITE ORDERING ──
  is_composite: z.boolean().default(false),
  used_procedure_ids: z.array(z.string().max(160)).optional(),
  constituent_ordering: z.array(ConstituentOrderingSchema).optional(),
  
  // ── EXECUTION TOPOLOGY [SR-31] ──
  // Linear by default. Branching fields exist for forward compatibility.
  // For v1: execution_topology is always "linear" and decision_points is always empty.
  execution_topology: z.enum(["linear", "branching"]).default("linear"),
  decision_points: z.array(z.object({
    decision_id: z.string().max(160),
    prompt: z.string().max(400),
    branches: z.array(z.object({
      when: z.string().max(300),
      then_step_indices: z.array(z.number().int()).min(1),
    })).min(2),
  })).default([]),
  
  // ── CONSTRAINTS ──
  constraints: z.array(ProcedureConstraintSchema).default([]),
  
  // ── STRUCTURAL METADATA — fixed fields for deterministic dedup [SR-22] ──
  artifact_class: z.string().max(120).optional(),       // "caption_page", "brief_body", "toc"
  document_region: z.string().max(120).optional(),      // "first_page_caption_block", "body_text"
  derivation_quality: z.enum(["high", "medium", "low"]).default("medium"),
  
  // ── CAPABILITY TAGS — procedure-level, derived from steps [SR-07] ──
  // EC auto-populates by deduplicating capability_area values from all steps at write time.
  capability_tags: z.array(z.string().max(80)).default([]),
  
  // ── ENVIRONMENT ──
  environment: EnvironmentScopeSchema,
  source: z.enum([
    "manual_demonstration", "doc3_learned", "trace_captured",
    "imported", "system", "onboarding",
  ]),
  
  // ── QUALITY METADATA [SR-27] ──
  // Populated by the extraction pipeline. Rendering may suppress low-quality optional
  // fields at compact/standard tiers. overall_quality_score < 0.60 forces compact
  // rendering unless procedure is direct target.
  contract_quality: ContractQualitySchema.optional(),
  
  // ── EXTENSIBLE ANNOTATIONS ──
  annotations: z.array(KnowledgeAnnotationSchema).default([]),
  
  schema_version: z.literal(1),
});
```

**Field design notes:**

- **`trigger_phrases` and `semantic_lookup_phrases` (SR-21):** Routing discovery reads these fields. They are routing-only metadata, so the rendering template does NOT include them in the injection card. `use_conditions` remains the semantic applicability field, rendered into the card as "WHEN TO USE." The interpretation prompt must generate trigger phrases and `use_conditions` separately. This overrides the original Patch 7, which merged them.

- **`artifact_class`, `document_region`, `derivation_quality` (SR-22):** Fixed optional fields needed by deterministic dedup hard vetoes. The dedup algorithm queries these as direct field comparisons (`incoming.artifact_class !== existing.artifact_class` → never merge). These are NOT annotations — they're structural metadata needed by core algorithms.

- **`capability_tags` (SR-07):** Procedure-level field auto-populated at write time. `capability_area` is per-step instructional grouping metadata. `capability_tags` is the procedure-level freshness/dedup/routing field. On write, EC MUST derive `capability_tags` from distinct non-empty `steps[*].capability_area` values unless explicitly provided.

- **`is_variable` (SR-08):** When `true`, the rendering template produces a placeholder (e.g., `[Client Name]` instead of `"Henderson"`) and the LLM is prompted to substitute the context-appropriate value at runtime.

- **`literal_retention_class` (SR-23):** Schema-level privacy enforcement. At write time, EC validation logs a warning if `literal_retention_class` is `placeholder_only` and the value does not look like a placeholder (it contains neither `[` nor generic terms). A post-interpretation scrubber runs before the graph write and converts observed literal values to placeholders unless the user has explicitly confirmed them.

- **`step_nature` (SR-09):** Mechanical steps have tool directives and execute via system.run/MCP/GUI. Cognitive steps (e.g., "Review the caption for formatting errors") have no tool directive — the rendering template tells the LLM to evaluate the document context rather than perform an action. Hybrid steps do both. The rendering template adjusts format accordingly.

- **`resolved_execution` removed from SemanticStepSchema (SR-19):** Tool directives, invalidation flags, and resolution timestamps live in a derived execution-strategy cache (§3.5), not in the canonical payload. The canonical DOC72 payload stores durable semantic knowledge only. See §3.5 for cache architecture and §3.2 for how the renderer merges cache data into cards.

- **`execution_topology` and `decision_points` (SR-31):** Forward-compatible branching support. For v1, both fields are at their defaults (linear, empty). When branching is implemented (Phase 3), the rendering template and execution engine will read `decision_points`. This avoids a schema migration later while keeping v1 simple.
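Two of the notes above (SR-07 and SR-22) describe small deterministic rules that can be sketched directly. `deriveCapabilityTags` and `hasHardVeto` are hypothetical names. Treating an empty `capability_tags` array as "not explicitly provided" is an assumption, made because the schema defaults the field to `[]`.

```ts
type StepLike = { capability_area?: string };

// SR-07: derive procedure-level tags from distinct non-empty step capability areas,
// unless the writer supplied tags explicitly (assumed to mean a non-empty array).
function deriveCapabilityTags(steps: StepLike[], provided?: string[]): string[] {
  if (provided && provided.length > 0) {
    return provided;
  }
  const distinct = new Set<string>();
  for (const step of steps) {
    const area = step.capability_area?.trim();
    if (area) {
      distinct.add(area);
    }
  }
  return Array.from(distinct); // insertion order: first appearance in the steps
}

type DedupFields = { artifact_class?: string };

// SR-22: a differing artifact_class is a hard veto, so the two procedures never merge.
function hasHardVeto(incoming: DedupFields, existing: DedupFields): boolean {
  return incoming.artifact_class !== existing.artifact_class;
}
```

Both functions compare fixed fields only, which is why these values are modeled as structural metadata rather than as open-vocabulary annotations.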

#### 4A.1.4 Shared contract: Memory Directives

**[Incorporates: SR-20, SR-24]**

```ts
// packages/contracts/src/knowledge/memory-directive-knowledge-contract.ts
import { z } from "zod";
import { KnowledgeAnnotationSchema } from "./annotation";

export const MemoryDirectiveKnowledgeContractSchema = z.object({
  // ── IDENTITY ──
  memory_type: z.enum([
    "preference", "constraint", "standing_order", "vocabulary_mapping",
    "style_profile", "document_archetype", "heuristic", "correction",
  ]),
  summary: z.string().max(500),
  
  // ── ASSERTION CLASS [SR-24] ──
  // Separate from source_certainty. source_certainty is HOW we know something;
  // assertion_class is WHAT KIND of claim it is. DOC1 maturity bypass keys on
  // assertion_class, not source_certainty.
  assertion_class: z.enum([
    "durable_fact",       // Rarely changes (firm name, jurisdiction rules)
    "preference",         // User preference, can change (font choice, style)
    "constraint",         // User-imposed rule (never auto-number appendices)
    "standing_order",     // Behavioral directive (always check local rules)
    "vocabulary_rule",    // Terminology mapping (definitional, stable)
    "heuristic",          // Inferred pattern (may become outdated)
  ]),
  
  // ── APPLICABILITY ──
  applies_when: z.array(z.string().max(300)).min(1),
  does_not_apply_when: z.array(z.string().max(300)).default([]),
  
  // [SR-20: project_specific replaces matter_specific — domain-agnostic naming]
  scope: z.enum([
    "global", "app_specific", "project_specific",
    "document_type_specific", "context_specific",
  ]),
  scope_details: z.string().max(500).optional(),
  
  // ── CONFLICT RESOLUTION ──
  priority_class: z.enum([
    "absolute",    // Never overridden (ethical rules, court mandates)
    "strong",      // Override by explicit user instruction only
    "default",     // Override by client preference or context
    "suggestion",  // Can be ignored if inconvenient
  ]).default("default"),
  
  overridden_by: z.array(z.string().max(300)).default([]),
  
  // ── ORIGIN ──
  source_certainty: z.enum([
    "user_explicitly_stated", "inferred_from_actions",
    "inferred_from_documents", "imported",
  ]),
  
  // ── RENDERING HINT ──
  injection_format: z.enum([
    "imperative",     // "Use Times New Roman"
    "conditional",    // "When filing in state court, use Times New Roman"
    "contextual",     // "The user prefers Times New Roman for filings"
    "cautionary",     // "Note: user has specified Times New Roman for filings"
  ]).default("imperative"),
  
  // ── EXTENSIBLE ANNOTATIONS ──
  annotations: z.array(KnowledgeAnnotationSchema).default([]),
  
  schema_version: z.literal(1),
});
```

**SR-24 design note:** `assertion_class` and `source_certainty` are orthogonal dimensions. A user-stated preference is `source_certainty: "user_explicitly_stated"` but `assertion_class: "preference"`, not `"durable_fact"`. DOC1 maturity bypass requires `assertion_class: "durable_fact"` AND `confidence >= 0.85` AND user confirmation. Preferences, constraints, and heuristics follow standard maturity progression regardless of source_certainty.
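
The bypass rule in this note can be read as a three-part predicate. The following is a minimal sketch, assuming a hypothetical `user_confirmed` flag and a precomputed `confidence` value; neither is a field of `MemoryDirectiveKnowledgeContractSchema`, and the function name is illustrative only.

```ts
// Sketch only: BypassInput, user_confirmed, and qualifiesForMaturityBypass are
// hypothetical names. The assertion_class values and the 0.85 threshold come
// from the SR-24 design note.
type AssertionClass =
  | "durable_fact" | "preference" | "constraint"
  | "standing_order" | "vocabulary_rule" | "heuristic";

type BypassInput = {
  assertion_class: AssertionClass;
  confidence: number;       // computed confidence in [0, 1]
  user_confirmed: boolean;  // assumed flag for explicit user confirmation
};

const BYPASS_MIN_CONFIDENCE = 0.85;

// All three conditions must hold; source_certainty is deliberately not consulted.
function qualifiesForMaturityBypass(input: BypassInput): boolean {
  return (
    input.assertion_class === "durable_fact" &&
    input.confidence >= BYPASS_MIN_CONFIDENCE &&
    input.user_confirmed
  );
}
```

Under this reading, a user-stated preference with very high confidence still follows standard maturity progression, because its `assertion_class` is not `"durable_fact"`.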

#### 4A.1.5 Shared contract: Domain Concepts

```ts
// packages/contracts/src/knowledge/domain-concept-knowledge-contract.ts
import { z } from "zod";
import { KnowledgeAnnotationSchema } from "./annotation";

export const DomainConceptKnowledgeContractSchema = z.object({
  // ── IDENTITY ──
  concept_name: z.string().max(200),
  definition: z.string().max(2000),
  
  // ── APPLICABILITY ──
  applies_in: z.array(z.string().max(300)),
  does_not_apply_in: z.array(z.string().max(300)).default([]),
  
  // ── NUANCE ──
  pitfalls: z.array(z.string().max(500)).default([]),
  
  key_distinctions: z.array(z.object({
    this_concept: z.string().max(200),
    versus: z.string().max(200),
    distinction: z.string().max(500),
  })).default([]),
  
  // ── REFERENCES ──
  reference_sources: z.array(z.object({
    citation: z.string().max(300),
    relevance: z.string().max(300),
    still_current: z.boolean(),
  })).default([]),
  
  // ── CONTEXTUAL VARIATIONS ──
  contextual_variations: z.array(z.object({
    context: z.string().max(120),
    variation: z.string().max(500),
  })).default([]),
  
  // ── PRACTICAL GUIDANCE ──
  practical_guidance: z.array(z.string().max(500)).default([]),
  
  // ── EXTENSIBLE ANNOTATIONS ──
  annotations: z.array(KnowledgeAnnotationSchema).default([]),
  
  schema_version: z.literal(1),
});
```

#### 4A.1.6 Shared contract: Goals

```ts
// packages/contracts/src/knowledge/goal-knowledge-contract.ts
import { z } from "zod";
import { KnowledgeAnnotationSchema } from "./annotation";

export const GoalKnowledgeContractSchema = z.object({
  goal_description: z.string().max(500),
  success_criteria: z.array(z.string().max(300)).default([]),
  priorities: z.array(z.object({
    priority: z.string().max(200),
    rank: z.number().int(),
    tension_with: z.string().max(200).optional(),
  })).default([]),
  time_pressure: z.object({
    description: z.string().max(300).optional(),
    key_date: z.string().max(100).optional(),
    urgency: z.enum(["low", "moderate", "high", "critical"]).optional(),
  }).optional(),
  stakeholder_considerations: z.array(z.string().max(300)).default([]),
  annotations: z.array(KnowledgeAnnotationSchema).default([]),
  schema_version: z.literal(1),
});
```

#### 4A.1.7 Shared contract: Obligations

**[Incorporates: SR-20 typed datetime]**

```ts
// packages/contracts/src/knowledge/obligation-knowledge-contract.ts
import { z } from "zod";
import { KnowledgeAnnotationSchema } from "./annotation";

export const ObligationKnowledgeContractSchema = z.object({
  obligation_description: z.string().max(500),
  deadline: z.object({
    date: z.string().datetime(),      // [SR-20: typed ISO 8601 datetime; Zod's default accepts only UTC ("Z"), pass { offset: true } to allow offsets]
    type: z.enum(["court_ordered", "statutory", "contractual", "internal", "courtesy"]),
    consequence_of_missing: z.string().max(300),
    extendable: z.boolean(),
    extension_mechanism: z.string().max(300).optional(),
  }),
  dependencies: z.array(z.object({
    description: z.string().max(300),
    deadline_offset: z.string().max(100).optional(),
  })).default([]),
  preparation_requirements: z.array(z.object({
    task: z.string().max(300),
    lead_time_days: z.number().int(),
    assignee_hint: z.string().max(100).optional(),
  })).default([]),
  status: z.enum(["pending", "in_progress", "completed", "overdue", "extended", "waived"]),
  annotations: z.array(KnowledgeAnnotationSchema).default([]),
  schema_version: z.literal(1),
});
```

#### 4A.1.8 Shared contract: Standing Procedures

```ts
// packages/contracts/src/knowledge/standing-procedure-knowledge-contract.ts
import { z } from "zod";
import { KnowledgeAnnotationSchema } from "./annotation";

export const StandingProcedureKnowledgeContractSchema = z.object({
  trigger_summary: z.string().max(300),
  trigger_conditions: z.array(z.object({
    signal_type: z.enum([
      "email_received", "file_added", "calendar_event", "time_based",
      "document_state_change", "user_request", "system_event",
    ]),
    conditions: z.array(z.string().max(300)),
  })).min(1),
  trigger_exclusions: z.array(z.string().max(300)).default([]),
  action_summary: z.string().max(300),
  action_steps: z.array(z.object({
    step_index: z.number().int().min(1),
    intent: z.string().max(500),
    detailed_instructions: z.string().max(3000).optional(),
    requires_user_confirmation: z.boolean(),
  })),
  safety_class: z.enum(["notify", "confirm_first", "auto_safe"]),
  false_positive_handling: z.string().max(300).optional(),
  failure_handling: z.string().max(300).optional(),
  annotations: z.array(KnowledgeAnnotationSchema).default([]),
  schema_version: z.literal(1),
});
```

---

#### 4A.1.9 Shared contract: Tool Capabilities

```ts
import { z } from "zod";
import { KnowledgeAnnotationSchema } from "./annotation";

export const ToolCapabilityKnowledgeContractSchema = z.object({
  tool_id: z.string().max(160),
  tool_source: z.enum(["mcp", "wrapper", "applescript", "accessibility_bridge"]),
  description: z.string().max(1000).optional(),
  capabilities_summary: z.array(z.string().max(200)).default([]),
  runtime_binding_ref: z.string().max(160).optional(),
  annotations: z.array(KnowledgeAnnotationSchema).default([]),
  schema_version: z.literal(1),
});
```

**Owner rule:** DOC72 owns this schema and the graph facts written under it. DOC24 owns whether the referenced runtime binding is live at any given moment.

## 5. Confidence Model — Pure Beta Distribution

### 5.1 The Formula

Confidence uses a pure Beta distribution, aligned with DOC1's existing Beta(2,2) `calibrated_confidence` model. This is mathematically elegant, O(1) to compute, naturally bounded [0,1], and incrementally updatable.

```ts
function computeConfidence(
  state: { alpha: number; beta: number },
  ageDays: number,
  halfLifeDays: number,
  policyFlags: {
    domain_principle_no_authority?: boolean;
    staleness_expired?: boolean;
    staleness_invalidated?: boolean;
    superseded_authority?: boolean;
  }
): number {
  // Cap total evidence mass (alpha + beta) at 200, preserving the alpha:beta ratio
  let { alpha, beta } = state;
  const total = alpha + beta;
  if (total > 200) {
    const scale = 200 / total;
    alpha *= scale;
    beta *= scale;
  }

  const betaMean = alpha / (alpha + beta);
  // Exponential time decay: the factor halves every halfLifeDays
  const decay = Math.exp(-Math.log(2) * ageDays / halfLifeDays);
  let confidence = betaMean * decay;

  // Policy caps (applied after decay)
  if (policyFlags.domain_principle_no_authority) confidence = Math.min(confidence, 0.39);
  if (policyFlags.staleness_expired) confidence *= 0.75;
  if (policyFlags.staleness_invalidated) confidence = 0.0;
  if (policyFlags.superseded_authority) confidence = Math.min(confidence, 0.15);

  return Math.max(0, Math.min(1, confidence));
}

// Contradictions go into β directly.
// `db` is the shared SQLite handle, defined elsewhere.
async function recordContradiction(nodeId: string, weight: number): Promise<void> {
  await db.run(`UPDATE nodes SET beta = beta + ? WHERE id = ?`, [weight, nodeId]);
}
```

**α+β cap at 200:** Prevents unbounded accumulation of evidence that would make confidence insensitive to new events. When the total exceeds 200, both α and β are scaled down proportionally. Because the scaling preserves the α:β ratio, the Beta mean itself is unchanged by the rescale; the effect is on later updates, which move the mean more when applied to the capped counters.

### 5.2 Event Weights

| Event type | Weight |
|---|---|
| user_flagged (Focus / "learn from this") | 1.00 |
| confirmed_by_user | 1.00 |
| taught_by_user | 0.95 |
| supported_by_authority (binding) | 0.95 |
| stated_by_user (in room/panel) | 0.90 |
| supported_by_statute/rule | 0.90 |
| learned_from_onboarding | 0.85 |
| accepted_from_agent | 0.80 |
| learned_from_note (user-authored) | 0.80 |
| learned_from_candor | 0.75 |
| learned_from_trace | 0.75 |
| learned_from_task_execution | 0.70 |
| learned_from_email | 0.65 |
| learned_from_document_view | 0.65 |
| learned_from_chat | 0.55 |
| learned_from_room (agent, uncontested) | 0.40 |
| inferred_by_system | 0.35 |
| learned_from_panel (pending) | 0.30 |
| llm_bootstrap | 0.25 |
| agent_observation | 0.20 |

### 5.3 Half-Life Defaults by Node Kind

| Node kind | Half-life (days) | Rationale |
|---|---|---|
| domain_concept | 365 | Law changes slowly |
| work_product | 365 | Documents don't change after filing |
| world_entity | 180 | People and orgs change moderately |
| standing_procedure | 120 | Automation needs periodic review |
| procedure | 90 | Apps update quarterly |
| goal | 90 | Goals evolve with strategy |
| obligation | 30 | Obligations are time-sensitive |
| memory_directive | 180 | Preferences change slowly |
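
To make the table concrete, the sketch below computes the time-decay factor for a node of a given kind and age, using the same exponential form as `computeConfidence` in §5.1. The names `HALF_LIFE_DAYS` and `decayFactor` are illustrative assumptions, not identifiers from the spec.

```ts
// Sketch only: HALF_LIFE_DAYS and decayFactor are hypothetical names.
// The values are copied from the half-life table above.
type NodeKind =
  | "domain_concept" | "work_product" | "world_entity" | "standing_procedure"
  | "procedure" | "goal" | "obligation" | "memory_directive";

const HALF_LIFE_DAYS: Record<NodeKind, number> = {
  domain_concept: 365,
  work_product: 365,
  world_entity: 180,
  standing_procedure: 120,
  procedure: 90,
  goal: 90,
  obligation: 30,
  memory_directive: 180,
};

// Multiplier applied to the Beta mean: 1.0 at age 0, 0.5 at one half-life.
function decayFactor(kind: NodeKind, ageDays: number): number {
  return Math.exp(-Math.log(2) * ageDays / HALF_LIFE_DAYS[kind]);
}
```

With these defaults, a 30-day-old obligation keeps half of its base confidence, while a domain concept of the same age keeps roughly 0.94 of it.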

### 5.4 Policy Caps

Applied after decay:
- Domain principle with no authority provenance: `C = min(C, 0.39)`
- `staleness_state = "expired"`: `C *= 0.75`
- `staleness_state = "invalidated"`: `C = 0.0`
- Superseded authority (`superseded_authority` flag in §5.1): `C = min(C, 0.15)`

### 5.5 Storage and Computation

Each node stores `alpha: REAL` and `beta: REAL` alongside `confidence: REAL` (last computed). DOC8 updates α and β on experience events. The `confidence` field is recomputed lazily on read (with time-decay) or eagerly during nightly consolidation.

**Alignment with DOC1:** DOC1's `calibrated_confidence = (α + inject_proceed_count) / (α + β + inject_count)` is the SAME Beta model applied to memory injection feedback. DOC72's model generalizes it to all knowledge nodes. When a node is also a DOC1 memory (`node_kind = 'memory_directive'`), DOC1's inject feedback updates the same α/β counters. One mathematical foundation for the entire system.

**Example — procedure node:**
- Starts at α=2, β=2 (prior) → C_base = 0.50
- User teaches it (+0.95 to α) → α=2.95, C_base = 0.60
- 2 successful traces (+0.75 each to α) → α=4.45, C_base = 0.69
- Last verified yesterday, half_life 90 days → decay ≈ 0.992
- C = 0.69 × 0.992 = 0.685
- After 10 more successes: α ≈ 12, C_base ≈ 0.86, decayed ≈ 0.85
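
The arithmetic in this example can be reproduced in a few lines. The sketch is illustrative only: `addSupport`, `betaMean`, and `decayFactor` are hypothetical helper names, and the two weights are copied from §5.2.

```ts
// Sketch only: helper names are assumptions; weights come from the §5.2 table.
type BetaState = { alpha: number; beta: number };

const EVENT_WEIGHT = {
  taught_by_user: 0.95,
  learned_from_trace: 0.75,
} as const;

// Supporting evidence adds the event weight to alpha; beta is untouched.
function addSupport(s: BetaState, event: keyof typeof EVENT_WEIGHT): BetaState {
  return { alpha: s.alpha + EVENT_WEIGHT[event], beta: s.beta };
}

function betaMean(s: BetaState): number {
  return s.alpha / (s.alpha + s.beta);
}

function decayFactor(ageDays: number, halfLifeDays: number): number {
  return Math.exp(-Math.log(2) * ageDays / halfLifeDays);
}

let state: BetaState = { alpha: 2, beta: 2 };     // Beta(2,2) prior, mean 0.50
state = addSupport(state, "taught_by_user");      // alpha = 2.95
state = addSupport(state, "learned_from_trace");  // alpha = 3.70
state = addSupport(state, "learned_from_trace");  // alpha = 4.45

// Verified one day ago with a 90-day half-life.
const confidence = betaMean(state) * decayFactor(1, 90);
console.log(confidence.toFixed(3));
```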

**Owner:** DOC8 updates α and β. DOC72 defines the schema, priors, weights, half-lives, and policy caps. Confidence field is recomputed on read or during nightly batch.

---


### 5.6 Conflict Score / Evidence Mass

R5.7 retains Beta confidence as the epistemic backbone but adds an explicit conflict-aware computed property so the system can distinguish “low evidence” from “lots of balanced contradictory evidence.”

```ts
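function conflictScore(alpha: number, beta: number): number {
  const total = alpha + beta;
  if (total <= 0) return 0; // no evidence at all: nothing to conflict, and avoids division by zero
  // balance is 1 when alpha equals beta and 0 when the evidence is one-sided
  const balance = 1 - Math.abs(alpha - beta) / total;
  // mass saturates at 1 once total evidence reaches 50
  const mass = Math.min(total / 50, 1);
  return balance * mass;
}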
```

**Delivery implications:**
- `conflict_score > 0.7` → DOC24 injects with `[distinguish]` and surfaces both support and contradiction when relevant.
- `0.5 < conflict_score ≤ 0.7` → DOC24 injects with `[caution]`.
- Matrix confidence adjustments are runtime-effective only and MUST NOT raise effective confidence above the DOC72 baseline by more than 0.15. They never persist back into canonical node α/β.
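
A short sketch of how the score separates "low evidence" from "balanced contradictory evidence" and maps to the delivery markers. `conflictScore` is restated so the block is self-contained (with a defensive zero-total guard); `deliveryMarker` and the `"none"` value are assumptions introduced for illustration.

```ts
// Sketch only: deliveryMarker and the "none" marker are hypothetical.
function conflictScore(alpha: number, beta: number): number {
  const total = alpha + beta;
  if (total <= 0) return 0; // defensive guard, an addition for this sketch
  const balance = 1 - Math.abs(alpha - beta) / total;
  const mass = Math.min(total / 50, 1);
  return balance * mass;
}

type DeliveryMarker = "distinguish" | "caution" | "none";

// Thresholds taken from the delivery implications above.
function deliveryMarker(score: number): DeliveryMarker {
  if (score > 0.7) return "distinguish";
  if (score > 0.5) return "caution";
  return "none";
}

// Fresh prior: perfectly balanced but almost no evidence mass.
console.log(deliveryMarker(conflictScore(2, 2)));
// Heavy, evenly split evidence: a real conflict.
console.log(deliveryMarker(conflictScore(30, 30)));
// Moderately contested evidence.
console.log(deliveryMarker(conflictScore(20, 15)));
// Heavy but one-sided evidence: no conflict.
console.log(deliveryMarker(conflictScore(45, 5)));
```

Both the Beta(2,2) prior and the 30/30 node have a mean of 0.50, yet only the latter crosses the `[distinguish]` threshold, which is the distinction this property exists to capture.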

### 5.7 Canonical Lifecycle and Staleness Enums

```ts
// Shared across DOC72, DOC24, and KDA
type StalenessState =
  | "fresh"
  | "stale"
  | "verification_required"
  | "expired"
  | "invalidated";

// Shared coarse lifecycle state
type CanonicalLifecycleState =
  | "observed"
  | "suggested"
  | "confirmed"
  | "managed"
  | "archived"
  | "merged"
  | "split"
  | "retracted";

type NodeLifecycleMetadata = {
  lifecycle_state: CanonicalLifecycleState;
  lifecycle_substate?: string;
};
```

**Normative rules:**
1. `stale` is NEVER a lifecycle state; it belongs only to `StalenessState`.
2. `tombstoned` is deletion/retention metadata, not a shared lifecycle enum value.
3. `verification_required` is the cross-doc state used when upstream dependencies changed and the knowledge needs explicit re-verification.


## 6. Differentiated Promotion Model

### 6.1 Promotion Thresholds by Node Type

| Node type | Creation trigger | Starting state | Starting confidence | Promotion path |
|---|---|---|---|---|
| Application entity | First interaction | confirmed (auto) | α=2+0.95 (trace), β=2 | Immediate: apps are facts |
| Procedure | First successful trace | captured | α=2+0.75 (trace), β=2 | validated after 2+ successes, promoted after cross-skill use |
| Composite procedure | DOC3 pipeline or composition detection | candidate | α=2, β=2 | validated after successful composite execution |
| Execution trace | Every execution | N/A (evidence) | N/A | Traces are facts, not promoted |
| Goal | User stated or onboarding | active | α=2+0.95, β=2 | Not promoted — achieved, evolved, or abandoned |
| Standing procedure | User taught or generalized | candidate | α=2, β=2 | active after confirmation |
| World entity (existing) | Cross-source corroboration | observed/suggested | Varies | Existing DOC24 §8.5 rules apply |
| Domain concept | Research, authority, user-taught | suggested | Varies | confirmed on authority + user verification |
| Obligation | Conversation mining, email | active | α=2+0.65, β=2 | Not promoted — fulfilled or overdue |
| Work product | File detection, user creation | confirmed | α=2+0.75, β=2 | Not promoted — tracks document lifecycle |
| Memory directive | DOC1 Write Gate | Per DOC1 maturity | Per DOC1 | Governed by DOC1 maturity lifecycle |

### 6.2 Environment-Scoped Confidence

Procedure confidence is scoped to the exact environment where validation occurred. A procedure validated in Word desktop on Mac has high confidence for that environment and unknown confidence for Word desktop on Windows or Word web. When Elnor encounters a request in an unvalidated environment, the procedure is still RETRIEVABLE but confidence is lower.

### 6.3 Correction Propagation

When the user corrects a procedure step:
1. The trace records the correction (`outcome: "corrected"`, step-level `correction` field)
2. The procedure node is updated with the corrected semantic intent
3. The original step is preserved as a `specializes` variant (may still be valid in other environments)
4. All composite procedures that use this procedure automatically get the correction via graph edge traversal
5. DOC3 skill artifacts that reference this procedure are flagged for regeneration

This is the core compounding benefit: fixing one procedure fixes every skill that uses it.
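
Step 4 amounts to a reverse traversal over `uses_procedure` edges. The sketch below is a minimal illustration over an in-memory edge list; the `Edge` shape, the function name, and the sample IDs are assumptions, and a real implementation would presumably query the `edges` table instead.

```ts
// Sketch only: Edge, affectedComposites, and the sample IDs are hypothetical.
type Edge = { source_id: string; target_id: string; relation_type: string };

// Returns every composite that reaches `procedureId` through uses_procedure
// edges (composite -> sub-procedure), directly or transitively.
function affectedComposites(procedureId: string, edges: Edge[]): string[] {
  const affected = new Set<string>();
  const queue: string[] = [procedureId];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const e of edges) {
      const isUser = e.relation_type === "uses_procedure" && e.target_id === current;
      // Skip already-visited nodes and the start node so cycles terminate.
      if (isUser && e.source_id !== procedureId && !affected.has(e.source_id)) {
        affected.add(e.source_id);
        queue.push(e.source_id);
      }
    }
  }
  return Array.from(affected);
}

const sampleEdges: Edge[] = [
  { source_id: "file_brief", target_id: "format_caption", relation_type: "uses_procedure" },
  { source_id: "brief_package", target_id: "file_brief", relation_type: "uses_procedure" },
  { source_id: "build_report", target_id: "insert_toc", relation_type: "uses_procedure" },
];

// Correcting format_caption reaches file_brief and, through it, brief_package.
console.log(affectedComposites("format_caption", sampleEdges));
```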

### 6.4 Promotion Hysteresis / Cooldown

To prevent promotion oscillation on borderline nodes:

```ts
type PromotionHysteresis = {
  node_id: string;
  last_promotion_attempt_at: string;
  last_demotion_at?: string;
  cooldown_ms: number;  // default 86400000 (24 hours)
  consecutive_demotion_count: number;
};
```

Rule: After demotion, a node cannot be re-promoted for at least `cooldown_ms`. After 2+ consecutive demotions, re-promotion requires user confirmation.
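
The rule can be sketched as a small decision function. This is an illustration under stated assumptions: timestamps are taken to be ISO 8601 strings, the cooldown check is evaluated before the confirmation check, and `RepromotionDecision` and `repromotionDecision` are hypothetical names.

```ts
// Sketch only: the decision type and function name are assumptions.
type PromotionHysteresis = {
  node_id: string;
  last_promotion_attempt_at: string;
  last_demotion_at?: string;
  cooldown_ms: number;  // default 86400000 (24 hours)
  consecutive_demotion_count: number;
};

type RepromotionDecision = "allowed" | "blocked_cooldown" | "needs_user_confirmation";

function repromotionDecision(h: PromotionHysteresis, nowMs: number): RepromotionDecision {
  // Never demoted: hysteresis does not apply.
  if (h.last_demotion_at === undefined) return "allowed";
  const demotedAtMs = Date.parse(h.last_demotion_at); // assumes ISO 8601 timestamps
  if (nowMs - demotedAtMs < h.cooldown_ms) return "blocked_cooldown";
  if (h.consecutive_demotion_count >= 2) return "needs_user_confirmation";
  return "allowed";
}
```

Checking the cooldown first means a node that has been demoted twice is still held for the full cooldown before the user is asked to confirm.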

### 6.5 Extraction Quality Gates

Per-candidate-class precision gates prevent graph pollution from low-quality extraction:

```ts
type ExtractionQualityGate = {
  candidate_class: "entity" | "relationship" | "preference" | "procedure" | "goal" | "obligation";
  rolling_precision_7d: number;
  min_precision_required: number;
  action_if_below: "hold_for_review" | "demote_confidence" | "disable_auto_promotion";
};
```

Precision floors: entity 0.85, relationship 0.80, preference 0.75, goal 0.90, obligation 0.80.

Precision is measured against user corrections: each correction counts as an extraction error. If rolling precision for a candidate class drops below its floor, auto-promotion freezes for that class until the extractor is improved.
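
A minimal sketch of the gate check, assuming precision is computed as the share of extractions in the window that were not corrected. `rollingPrecision` and `autoPromotionFrozen` are hypothetical names, and `procedure` is left out of the lookup because no floor is listed for it above.

```ts
// Sketch only: function names and the precision formula are assumptions.
// Floors are copied from the precision-floor list above.
type GatedClass = "entity" | "relationship" | "preference" | "goal" | "obligation";

const PRECISION_FLOOR: Record<GatedClass, number> = {
  entity: 0.85,
  relationship: 0.80,
  preference: 0.75,
  goal: 0.90,
  obligation: 0.80,
};

// Share of extractions in the window that the user did not correct.
function rollingPrecision(extracted: number, corrected: number): number {
  if (extracted === 0) return 1; // assumption: an empty window does not trip the gate
  return (extracted - corrected) / extracted;
}

function autoPromotionFrozen(cls: GatedClass, extracted: number, corrected: number): boolean {
  return rollingPrecision(extracted, corrected) < PRECISION_FLOOR[cls];
}
```

With 100 extractions and 20 corrections in the window, precision is 0.80: the `entity` class freezes (floor 0.85) while the `preference` class keeps auto-promoting (floor 0.75).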

### 6.6 Suggested Entity Visibility and Routing Eligibility

Entities mined from conversation at `lifecycle_state: "suggested"` with `truth_grade: "inferred"` remain hedged and reviewable, but R5.7 no longer blocks all routing to them. When the query is unambiguous and the candidate meets the routability policy below, DOC24 may resolve to it with a `suggested_unconfirmed` trace marker and hedged rendering.

```ts
type SuggestedRoutingEligibility = {
  min_confidence: number;
  min_distinct_sources: number;
  require_explicit_string_match: boolean;
};

function isSuggestedEntityRoutable(input: {
  confidence: number;
  distinctSourceCount: number;
  explicitMatch: boolean;
  policy: SuggestedRoutingEligibility;
}): boolean {
  if (input.confidence < input.policy.min_confidence) return false;
  if (input.distinctSourceCount < input.policy.min_distinct_sources) return false;
  if (input.policy.require_explicit_string_match && !input.explicitMatch) return false;
  return true;
}
```

**Default policy:** `min_confidence = 0.60`, `min_distinct_sources = 2`, `require_explicit_string_match = true`. Promotion to `confirmed` still requires corroboration or user confirmation.

### 6.7 Domain-Specific Promotion Gates

Domain signal profiles (§35) may define additional promotion requirements specific to their knowledge domain. These are evaluated via a generic accessor:

```ts
function getNestedField(obj: any, path: string): any {
  return path.split('.').reduce((o, key) => o?.[key], obj);
}

function canPromoteGeneric(p: DomainKnowledgeExtraction, profile: DomainSignalProfile): boolean {
  if (!profile.promotion_requirements) return p.confidence >= 0.6;
  return profile.promotion_requirements.every(req => {
    const value = getNestedField(p, req.field);
    switch (req.operator) {
      case "exists": return value !== undefined;
      case "min_count": return Array.isArray(value) && value.length >= (req.value as number);
      case "confidence_gte": return typeof value === 'number' && value >= (req.value as number);
      case "min_length": return typeof value === 'string' && value.length >= (req.value as number);
      default: return true;
    }
  });
}
```

For legal propositions specifically:

```ts
function canPromoteLegalProposition(p: DomainKnowledgeExtraction): boolean {
  if (p.supporting_refs.length === 0) return false;
  if (p.verified_state !== "verified") return false;
  const facet = p.domain_payload as LegalPropositionFacet;
  if (!facet.verbatim_excerpt) return false;
  return true;
}
```

---


### 6.8 Procedure Lifecycle Policy

Procedures do not decay exactly like general factual knowledge. Demonstrated skills encode user-taught operational know-how and therefore receive differentiated lifecycle rules.

```ts
type ProcedureLifecyclePolicy = {
  decay_model: "time_based" | "usage_based" | "none";
  half_life_days?: number;
  min_usage_for_active: number;
  verified_by_user_immune: boolean;
  demonstrated_skill_protection: boolean;
};
```

**Default policy:** standard procedures use a 90-day half-life; demonstrated skills learned by example use a 180-day half-life; user-verified procedures may be immune from ordinary decay; a procedure demonstrated once may enter `active` lifecycle state if the user explicitly taught it.


## 7. Relationship Vocabulary

Typed edges between nodes. The relationship vocabulary is governed — new relation types require explicit registration.

| Relationship | Source → Target | Meaning |
|---|---|---|
| `part_of_application` | procedure → application | This procedure belongs to this app |
| `uses_procedure` | procedure (composite) → procedure | This composite uses this sub-procedure |
| `depends_on_procedure` | procedure → procedure | This procedure requires this other procedure first |
| `validated_by_trace` | procedure → execution_trace | This trace validates this procedure |
| `specializes` | procedure → procedure | This is a variant of a more general procedure |
| `constrained_by` | procedure/goal → memory_directive | This constraint limits behavior |
| `preferred_in_context` | procedure → memory_directive | This preference applies when using this procedure |
| `composes_into` | procedure → procedure (composite) | This procedure is part of this composite |
| `learned_from` | procedure → execution_trace | This procedure was extracted from this trace |
| `serves_goal` | entity/action → goal | This serves this goal |
| `sub_goal_of` | goal → goal | This is a sub-goal |
| `competes_with` | goal → goal | These goals are in tension |
| `stakeholder_of` | world_entity (person/org) → goal | This person has a stake in this goal |
| `informed_by_decision` | goal/entity → memory_directive | This was shaped by a past decision |
| `tracked_by_deadline` | goal → world_entity (calendar/deadline) | This deadline tracks this goal |
| `uses_style` | work_product/entity → memory_directive (style) | This uses this style profile |
| `has_archetype` | application/entity → memory_directive (archetype) | This domain produces this type of document |
| `obligated_to` | obligation → world_entity (person/org) | This commitment is owed to this entity |
| `fulfills_obligation` | execution_trace → obligation | This action fulfilled this obligation |
| `about_case` | work_product/obligation → world_entity (matter) | Linked to this matter |
| `addresses_concept` | work_product → domain_concept | This document addresses this concept |
| `version_of` | work_product → work_product | This is a newer version of that document |
| `created_by_procedure` | work_product → procedure | Created using this procedure |
| `parent_concept` | domain_concept → domain_concept | Hierarchical concept relationship |
| `related_concept` | domain_concept → domain_concept | Lateral concept relationship |
| `triggers_procedure` | standing_procedure → procedure | This standing procedure invokes this skill |
| `promoted_to_task` | standing_procedure → (DOC23 task ref) | This was promoted to a DOC23 task |
| `contradicts` | domain_concept → domain_concept | Direct contradiction between knowledge nodes |
| `tension_with` | domain_concept → domain_concept | Tension or partial conflict |
| `limits` | domain_concept → domain_concept | One concept limits the scope of another |
| `distinguishes` | domain_concept → domain_concept | One concept distinguishes the applicability of another |
| `overrules` | domain_concept → domain_concept | One authority overrules another |
| `narrows` | domain_concept → domain_concept | One concept narrows the holding of another |
| `expands` | domain_concept → domain_concept | One concept expands the holding of another |

All edges carry temporal metadata via the `edges` table: `start_date`, `end_date`, `lifecycle_state`, `confidence`. Ended relationships aren't deleted — they become historical context.


### 7.0A Canonical Relation Vocabulary Governance

R5.7 tightens relation typing. The canonical edge vocabulary is governed and validated; arbitrary free-string relation types are no longer an acceptable write contract.

```ts
type CoreRelationType =
  | "related_to" | "part_of" | "specializes" | "contradicts" | "supersedes"
  | "version_of" | "cites" | "references" | "applies_to" | "produced_by"
  | "has_tool" | "executes_via" | "alternative_to" | "learned_from"
  | "applies_to_concept" | "applies_to_actor" | "applies_to_procedure"
  | "about_case" | "addresses_concept" | "created_by_procedure" | "serves_goal"
  | "constrained_by" | "preferred_in_context" | "validated_by_trace"
  | "uses_procedure" | "depends_on_procedure" | "composes_into"
  | "parent_concept" | "related_concept" | "tension_with" | "limits"
  | "distinguishes" | "overrules" | "narrows" | "expands";

type DomainRelationType = `domain_${string}`;
type RelationType = CoreRelationType | DomainRelationType;
```

EC MUST reject edge writes that are neither in the canonical set nor prefixed `domain_`. Domain profiles register extensions before use.
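
One way to picture the write-time check is a validator that accepts canonical types and registered `domain_` extensions. The sketch below is illustrative: `isAcceptedRelationType`, the registry parameter, and the sample extension name are assumptions; the canonical list is copied from `CoreRelationType`.

```ts
// Sketch only: the validator name, the registry set, and the example
// extension "domain_legal_cites_rule" are hypothetical.
const CORE_RELATION_TYPES = [
  "related_to", "part_of", "specializes", "contradicts", "supersedes",
  "version_of", "cites", "references", "applies_to", "produced_by",
  "has_tool", "executes_via", "alternative_to", "learned_from",
  "applies_to_concept", "applies_to_actor", "applies_to_procedure",
  "about_case", "addresses_concept", "created_by_procedure", "serves_goal",
  "constrained_by", "preferred_in_context", "validated_by_trace",
  "uses_procedure", "depends_on_procedure", "composes_into",
  "parent_concept", "related_concept", "tension_with", "limits",
  "distinguishes", "overrules", "narrows", "expands",
] as const;

const CORE_SET = new Set<string>(CORE_RELATION_TYPES);

// Accepts a canonical type, or a domain_-prefixed type that a domain
// profile has registered before use.
function isAcceptedRelationType(value: string, registeredDomainTypes: Set<string>): boolean {
  if (CORE_SET.has(value)) return true;
  return value.startsWith("domain_") && registeredDomainTypes.has(value);
}

const registered = new Set<string>(["domain_legal_cites_rule"]);
console.log(isAcceptedRelationType("contradicts", registered));
console.log(isAcceptedRelationType("domain_unregistered", registered));
```

Requiring registration in addition to the prefix is a stricter reading than the prefix test alone; a writer that only checks the prefix would accept the second call above.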

### 7.1 Contradiction Edges

```ts
type ContradictionEdge = {
  source_id: string;
  target_id: string;
  relation_type: "contradicts";
  split_type: "direct_contradiction" | "circuit_split" | "factual_tension" | "strategic_disagreement";
  detected_by: "semantic_folding" | "candor_finding" | "user_correction";
  jurisdiction_match: boolean;
};
```

Contradiction edges are created by semantic folding (§42), CANDOR findings, or user corrections. When contradictions are detected, they are stored as edges with rich metadata enabling the delivery layer (DOC24) to enforce tension-aware injection policies.


---

## 8. Learning Pipeline

### 8.1 Aggressive First-Interaction Capture

On the FIRST successful interaction with any application, the system captures everything asynchronously:

```
Action succeeds
    │
    ▼ (async, off the hot path)
    │
Application entity exists?
    ├── No: Create application entity (auto-confirmed)
    └── Yes: Continue
    │
    ▼
Capture execution trace
    │
    ▼
Extract semantic-intent procedure candidate from trace
    │
    ▼
Link procedure to application entity
Link trace to procedure (validated_by_trace)
    │
    ▼
Check for composite procedure candidates
(does this procedure + recent procedures compose into a higher-level skill?)
```

**User-perceived latency:** Zero. The action completed before any of this runs.

**Cost:** One background write operation per action. No LLM call required for basic trace capture — the trace is a structured log. LLM calls are only needed for procedure EXTRACTION (interpreting the trace into a reusable semantic-intent procedure), which can be batched or deferred.

### 8.2 LLM-Assisted Entity Bootstrapping

On first interaction with a NEW application entity, a background LLM call bootstraps the entity's capability list:

```ts
type BootstrapExtractionRequest = {
  app_entity_id: string;
  app_name: string;
  platform: string;
  trigger_trace_id: string;
};
```

**Extraction prompt (bounded):**

```
You just successfully interacted with {app_name} on {platform}.
List the 10-15 most important capability areas that are likely to be
useful for future tasks in this application. Describe each as a
semantic capability ("insert table of contents") not a UI element
("References Tab"). Only list capabilities you are confident exist
in the current version. Return as JSON array of { capability, description }.
```

**Rules:**
- **Async only.** Never blocks a user-facing action.
- **Bounded.** Cap at 15 capability candidates per bootstrap.
- **Moderate confidence.** Bootstrap capabilities enter the app's `known_capabilities` list with source `llm_bootstrap` (event weight 0.25 per §5.2).
- **One-time per application.** Re-bootstrap only if the app version changes significantly.

### 8.3 Background Procedure Extraction

Traces alone are raw execution logs. Extracting REUSABLE PROCEDURES from traces requires interpretation — identifying which steps are essential vs incidental, mapping to semantic intent, and identifying preconditions and postconditions.

```ts
type ProcedureExtractionResult = {
  source_trace_ids: string[];
  proposed_procedure: {
    canonical_name: string;
    steps: SemanticStep[];           // Semantic intent, NOT UI steps
    environment_scope: ProcedureContext;
    preconditions: string[];
    postconditions: string[];
  };
  confidence: number;
  extraction_model: string;
};
```

**Extraction logic:**
1. Take one or more related traces
2. Identify the reusable core (steps that appear in multiple traces or are clearly app-specific)
3. Convert mechanical steps to semantic intent descriptions
4. Check for existing procedures that overlap (dedup via entity linking)
5. If new: create procedure node at `captured` state
6. If existing: enrich with new trace evidence, update confidence (increment α)

**Scheduling:** Extraction runs after each significant interaction (async). Batch extraction runs nightly for unprocessed traces.

### 8.4 Composite Procedure Detection

When Elnor performs a sequence of related procedures in one session, the learning pipeline checks whether this sequence represents a composite procedure:
- Are 3+ procedures from the same application used in sequence?
- Do they share a coherent goal?
- Has a similar sequence occurred before?

If yes, propose a composite procedure candidate. The candidate links to all constituent procedures via `uses_procedure` edges. It enters DOC3's promotion pipeline for review.
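
The three checks can be sketched as a single predicate. This is a rough illustration that assumes all three conditions must hold, approximates "coherent goal" by an identical `goal_id`, and approximates "similar sequence" by an exact match on procedure order; `SessionStep` and `isCompositeCandidate` are hypothetical names.

```ts
// Sketch only: the types, the function name, and the similarity and goal
// approximations are assumptions made for illustration.
type SessionStep = {
  procedure_id: string;
  application_id: string;
  goal_id?: string;
};

function isCompositeCandidate(steps: SessionStep[], priorSequences: string[][]): boolean {
  const first = steps[0];
  if (first === undefined || steps.length < 3) return false;

  // 3+ procedures from the same application, in sequence.
  const sameApp = steps.every((s) => s.application_id === first.application_id);

  // Coherent goal, approximated here as an identical, defined goal_id.
  const sharedGoal = first.goal_id !== undefined && steps.every((s) => s.goal_id === first.goal_id);

  // Similar sequence seen before, approximated here as an exact order match.
  const key = steps.map((s) => s.procedure_id).join(">");
  const seenBefore = priorSequences.some((seq) => seq.join(">") === key);

  return sameApp && sharedGoal && seenBefore;
}
```

A production detector would likely use a softer similarity measure; the exact-match test here only keeps the sketch small.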

### 8.5 Enrichment-on-Promotion Trigger

When a procedure transitions from `captured` to `validated` (2+ successful traces or user confirmation), it is immediately queued for enrichment:
- Search for other procedures in the same application that might share steps
- Search for composite procedures that could benefit from this procedure
- Check if any existing composites have gaps this procedure fills

---

## 9. Onboarding for Applications

The three-lane onboarding model applies to application knowledge, with demonstration mode added below as a fourth lane (Lane D):

### Lane A — Passive ambient learning

Trace capture and background extraction discover application knowledge automatically. Elnor uses Word → traces capture what happened → semantic-intent procedures extracted → graph grows. No user action needed.

### Lane B — Natural conversational capture

During normal conversation, Elnor can capture application knowledge:

User: "For legal briefs, always use 12-point Times New Roman."
→ Creates a preference memory_directive linked to the Word entity.
→ Creates `constrained_by` edges from font-related procedures to this directive.

User: "Use Cmd+Shift+S for styles instead of the ribbon."
→ Creates or updates a procedure variant with `source: "manual"`, high confidence.
→ Creates a `specializes` edge from this variant to the general procedure.

### Lane C — Structured onboarding studio

When the user says "Elnor, learn how I use Word," the onboarding studio activates:

**Phase 1:** Check existing Word knowledge (capabilities, validated procedures, trace history). Identify gaps.

**Phase 2:** Targeted questions based on gaps:
- "Do you prefer keyboard shortcuts or menu navigation?"
- "What are your most common Word tasks?"
- "Are there firm-specific templates you use?"
- "Do you use Word web or only desktop?"

**Phase 3:** Answers create memory_directive nodes (preferences, constraints), procedure refinements, and environment scoping updates. All linked to the Word entity.

**Phase 4:** Summary and review. "I learned 3 preferences, 2 constraints, and refined 4 procedures for Word. Accept / edit / reject?"

### Lane D — Teach-by-doing (demonstration mode)

When the user says "Elnor, watch how I do this," demonstration mode activates. It differs from passive capture in capture detail and in how the learning pipeline treats the result:

**Passive capture:** Standard-detail trace. Background learning.

**Demonstration mode:** MAXIMUM detail capture. Every step, every choice, every correction. Learning pipeline treats every trace as high-confidence teaching material. Procedure extraction is immediate. User's verbal explanations are captured as procedure notes and constraint memory_directives.

What demonstration mode produces:
- High-confidence procedure nodes (source: "manual", α boosted by 0.95)
- Procedure-linked notes from verbal explanation
- Constraint memory_directives from "don't do it this way" explanations
- Candidate standing procedures if trigger conditions described
- Candidate vocabulary mappings if shorthand used

Demonstration mode ends on user signal or configurable idle timeout. Elnor summarizes what he learned.

---

## 10. DOC1 Integration

### 10.1 Unified Storage

DOC1 memories and DOC72 graph entities share the same SQLite database. `entity_knowledge_write` writes both the graph node AND associated DOC1 memory in ONE transaction. See §3.5 for storage details.

### 10.2 What DOC1 Stores for Skill Knowledge

- **Preferences** linked to applications or procedures: "Use keyboard shortcuts in Word," "For legal briefs, use 12pt Times New Roman"
- **Constraints** linked to applications or procedures: "Never use Word web for final formatting"
- **Standing orders** linked to application workflows: "When proofreading, always check heading styles first"
- **Heuristics** generalized from multiple applications: "Desktop apps usually have richer UI than web versions" (type-level, learned through DOC1 §12 generalization)

### 10.3 Memory vs Procedure Precedence

- **Procedure** = how to do something (structural, graph-owned)
- **Memory/preference** = how the USER wants it done (soft, DOC1-owned)
- **Constraint** = what NOT to do (governance, DOC1-owned)

If a procedure says "navigate to references" but a preference says "use the keyboard shortcut," the preference wins at execution time. CIL injects the preference alongside the procedure, and the LLM adapts.

### 10.4 Maturity Bypass for User-Stated Durable Facts

When the agent proposes a knowledge write during conversation and the user confirms, certain high-confidence facts can bypass the normal maturity lifecycle:

```ts
// Only graph-eligible durable facts at high confidence bypass maturity:
if (
  source === "user_stated_in_chat" &&
  (capture_scope === "durable" || capture_scope === "contextual") &&
  confidence >= 0.85 &&
  durable_assertion_class === "durable_fact"
) {
  maturity_state = "active";
} else {
  maturity_state = "observation";
}
```

This allows explicit user statements like "Jones & Smith is opposing counsel on Henderson" to enter the graph at `active` maturity immediately, rather than waiting for corroboration. The capture must be graph-eligible (`capture_scope = "durable" | "contextual"`), the durable assertion class must be `durable_fact`, and the confidence must be ≥ 0.85.

### 10.5 Correction Pipeline

Corrections flow through multiple docs:

```ts
type CorrectionEvent = {
  correction_id: string;
  target_node_id: string;
  correction_kind: "user_correction" | "adverse_ruling" | "new_authority" | "candor_critique" |
                   "contradictory_work_product" | "failed_application";
  source_ref: string;
  severity: "low" | "medium" | "high" | "critical";
  proposed_action: "reduce_confidence" | "mark_caution" | "split_node" | "supersede_node" |
                   "create_conflict_set" | "archive_variant";
};
```

Pipeline: DOC72 (schema, detection) → DOC1 (governance, maturity impact) → DOC8 (pattern learning, threshold adjustment) → DOC24 (injection update, caution tag propagation).

---

## 11. DOC8 Integration

### 11.1 Friction Signals for Skill Learning

| Signal | Friction type | Stage |
|---|---|---|
| Procedure failed during execution | `tool_failure` | `skill_execution:{procedure_id}` |
| User corrected a procedure step | `quality_degradation` | `skill_correction:{procedure_id}` |
| Procedure worked but user was unhappy | `ux_annoyance` | `skill_quality:{procedure_id}` |
| Bootstrap created unhelpful capabilities | `quality_degradation` | `skill_bootstrap:{app_entity_id}` |
| Learning pipeline created noisy candidates | `ux_annoyance` | `skill_noise:{app_entity_id}` |

### 11.2 DOC8 as the Experience Dimension Processor

DOC8's friction events are negative-outcome UsageEvents in the experience records. DOC8's positive signals are positive-outcome UsageEvents. DOC8 becomes the ENGINE that processes the experience dimension:

1. Something happens (action executed, preference applied, concept argued, procedure run)
2. A UsageEvent is recorded on the relevant node's ExperienceRecord (§34)
3. DOC8 reads experience records during processing cycles (nightly + real-time for critical events)
4. DOC8 detects patterns: declining success rate, behavioral shift, recurring correction, consistent success, adverse outcomes with reasoning
5. DOC8 produces signals: α/β adjustment, content update proposal, behavioral tuning, staleness flag, trend alert
6. Those signals update other dimensions: confidence adjusts, content gets correction proposals, temporal freshness updates
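
As one illustration of step 4, a "declining success rate" check could compare the older and newer halves of a node's usage history. This is a sketch under stated assumptions: `UsageEventLite`, `detectDecline`, the half-split comparison, the four-event minimum, and the `minDrop` default of 0.2 are all hypothetical. The real detection logic is DOC8-owned and is not specified here.

```ts
// Hypothetical, reduced view of a UsageEvent; the full schema lives in §34.
type UsageEventLite = {
  occurred_at: string; // ISO-8601 timestamp
  outcome: "success" | "failure" | "corrected";
};

function successRate(events: UsageEventLite[]): number {
  if (events.length === 0) return 0;
  return events.filter(e => e.outcome === "success").length / events.length;
}

// Splits the history into an older and a newer half and flags a decline when
// the success rate dropped by at least `minDrop` (assumed default: 0.2).
function detectDecline(
  events: UsageEventLite[],
  minDrop = 0.2
): { declining: boolean; older: number; recent: number } {
  const sorted = [...events].sort((a, b) =>
    a.occurred_at < b.occurred_at ? -1 : a.occurred_at > b.occurred_at ? 1 : 0
  );
  const mid = Math.floor(sorted.length / 2);
  const older = successRate(sorted.slice(0, mid));
  const recent = successRate(sorted.slice(mid));
  // Require a minimum history (assumed: 4 events) to avoid noisy signals.
  const declining = sorted.length >= 4 && older - recent >= minDrop;
  return { declining, older, recent };
}
```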

### 11.3 What DOC8 Covers

**Negative outcomes:** Tool failures, user corrections, adverse rulings, rejected suggestions. UsageEvents with `outcome: "failure"` or `outcome: "corrected"`. DOC8 detects patterns: recurring failures → prevention rule, repeated corrections → content update proposal.

**Positive outcomes:** Successful procedures, correct entity resolutions, accepted standing procedure executions. UsageEvents with `outcome: "success"`. Consistent success → α increase, repeated acceptance → promotion candidate.

**Behavioral shifts:** Brand X declining, Brand Y increasing. Corrective disclosure approach getting adverse rulings. DOC8 detects through variant_distribution analysis and proposes content updates.

**Cross-graph patterns:** Patterns across multiple nodes. "Cases consistently need calendars within 48 hours." DOC8 reads experience records across entity types and detects cross-cutting patterns for DOC1 generalization.

**Graph quality:** Stale entities, near-duplicates, decaying confidence, sparse enrichment candidates. DOC8 flags from experience record analysis.

### 11.4 How DOC8 Feeds Back to DOC72

- **→ Confidence:** Adjust α/β based on experience outcome patterns
- **→ Content:** Propose updates when behavioral shifts detected
- **→ Temporal:** Flag staleness when experience records show inactivity
- **→ Provenance:** Record DOC8's recommendations as provenance entries
- **→ Connections:** Suggest new relationships discovered through cross-graph patterns

### 11.5 This Stays in DOC8, Not DOC72

DOC72 defines the ExperienceRecord SCHEMA (§34) — the shape of the data. DOC8 defines the PROCESSING LOGIC — how to read experience data, detect patterns, compute recommendations, and produce signals. DOC72 is the brain's structure. DOC8 is the neuroplasticity mechanism.

### 11.6 Self-Learning Extraction Feedback Interface

DOC72 sends DOC8 five categories of feedback signals from the intake pipeline:

**Signal 1 — PatternPerformanceRecord:** Per-pattern precision, extraction yield, confirmation rate, and current weight adjustment (±0.05 per cycle, small to prevent oscillation). DOC8 adjusts domain signal profile weights based on these.
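
The ±0.05 per-cycle bound can be sketched as a clamped step. The bound itself comes from the text. The update rule (moving the weight toward observed precision) and the names `clampTo` and `nextPatternWeight` are assumptions made for illustration, since the actual adjustment formula is DOC8-owned.

```ts
// From the spec text: at most ±0.05 weight change per cycle.
const MAX_WEIGHT_STEP_PER_CYCLE = 0.05;

function clampTo(value: number, lo: number, hi: number): number {
  return Math.min(hi, Math.max(lo, value));
}

// Hypothetical update rule: nudge the weight toward the observed precision,
// never by more than the per-cycle bound, and keep it inside [0, 1].
function nextPatternWeight(currentWeight: number, observedPrecision: number): number {
  const desiredDelta = observedPrecision - currentWeight;
  const step = clampTo(desiredDelta, -MAX_WEIGHT_STEP_PER_CYCLE, MAX_WEIGHT_STEP_PER_CYCLE);
  return clampTo(currentWeight + step, 0, 1);
}
```

Because each cycle moves the weight by at most 0.05, a single noisy period cannot swing a pattern from trusted to ignored, which is the oscillation the bound is meant to prevent.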

**Signal 2 — FalseNegativeSignal:** Detected when the user teaches something that existed in content the system skipped, when a correction traces back to skipped content, or when a concept is found in previously skipped content.

```ts
type FalseNegativeSignal = {
  signal_id: string;
  detected_at: string;
  detection_method: "user_taught_existing_content" | "user_correction_traceback" | "concept_found_in_skipped_content";
  knowledge_summary: string;
  source_artifact_ref?: string;
  original_assessment?: { score: number; dispatch: string; reason_for_skip: string };
};

type MissedKnowledgeSignal = {
  source_surface: IntakeSurface;
  source_ref: string;
  missed_kind: "entity" | "concept" | "obligation" | "procedure" | "conversation_checkpoint";
  discovered_by: "user_correction" | "later_reuse" | "manual_search" | "doc8_inference";
};
```

**Signal 3 — Surface ROI:** Weekly budget reallocation recommendation based on value-per-cost across surfaces.

**Signal 4 — Prompt performance:** Per-extraction-type precision tracking. Below 0.70 → DOC8 surfaces recommendation to revise extraction prompt.

**Signal 5 — Sonar quality:** Utility metric (connections_used / connections_created). Below 0.10 → DOC8 proposes reducing sonar frequency.
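
The two thresholds in Signals 4 and 5 can be expressed as a small check. The 0.70 and 0.10 floors and the `connections_used / connections_created` ratio come from the text. `ExtractionHealthMetrics`, `Doc8Recommendation`, and the handling of a zero denominator are hypothetical.

```ts
const PROMPT_PRECISION_FLOOR = 0.7; // Signal 4 threshold
const SONAR_UTILITY_FLOOR = 0.1;    // Signal 5 threshold

// Hypothetical input and output shapes for this sketch.
type ExtractionHealthMetrics = {
  prompt_precision: number;
  connections_used: number;
  connections_created: number;
};
type Doc8Recommendation = "revise_extraction_prompt" | "reduce_sonar_frequency";

// Returns null when no connections were created, so an idle period is not
// misread as zero utility (an assumption; the spec does not cover this case).
function sonarUtility(used: number, created: number): number | null {
  return created === 0 ? null : used / created;
}

function doc8Recommendations(m: ExtractionHealthMetrics): Doc8Recommendation[] {
  const out: Doc8Recommendation[] = [];
  if (m.prompt_precision < PROMPT_PRECISION_FLOOR) out.push("revise_extraction_prompt");
  const utility = sonarUtility(m.connections_used, m.connections_created);
  if (utility !== null && utility < SONAR_UTILITY_FLOOR) out.push("reduce_sonar_frequency");
  return out;
}
```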

**Typed extraction outcomes (sent to DOC8 for all five dimensions):**

```ts
type ExtractionOutcomeEvent = {
  event_id: string;
  source_surface: IntakeSurface;
  extracted_item_id?: string;
  extracted_item_kind: "entity" | "relationship" | "memory_directive" | "domain_concept" |
                       "standing_procedure" | "conversation_checkpoint" | "obligation";
  outcome: "confirmed" | "used_successfully" | "ignored" | "rejected" |
           "corrected" | "contradicted" | "stale_unused" | "promoted" | "demoted";
  severity?: "low" | "medium" | "high";
  user_visible: boolean;
  occurred_at: string;
};
```

DOC8 uses these to tune significance thresholds, pattern weights, and extraction budgets. See §36 for the complete self-learning loop.

### 11.7 Extraction Quality Ledger, Verification, and Active Sampling

R5.7 closes the largest operational risk — silent extraction drift — by wiring extraction quality tracking directly into the existing quality gate, adding cheap post-extraction verification, and allocating a bounded active-learning sampling budget.

```ts
type ExtractionQualityLedger = {
  surface: IntakeSurface;
  model_id: string;
  candidate_class: "entity" | "relationship" | "preference" | "procedure" | "goal" | "obligation";
  period: string;
  extractions_total: number;
  user_corrections: number;
  auto_detected_issues: number;
  false_negative_proxy_count: number;
  precision_estimate: number;
  recall_proxy_estimate: number;
};

type ExtractionQualityResponsePolicy = {
  candidate_class: "entity" | "relationship" | "preference" | "procedure" | "goal" | "obligation";
  consecutive_periods_to_trip: number;
  action_below_floor: "hold_for_review" | "demote_confidence" | "disable_auto_promotion";
};
```

**Normative consume path:** if `precision_estimate < ExtractionQualityGate.min_precision_required` for 2 consecutive periods for the same `(surface, model_id, candidate_class)`, EC SHALL apply `ExtractionQualityGate.action_if_below` and emit a health event.
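
A minimal sketch of the consecutive-period check, assuming the ledger history passed in is already filtered to a single `(surface, model_id, candidate_class)` key and ordered oldest first. `LedgerPeriod`, `trailingPeriodsBelowFloor`, and `qualityGateTripped` are hypothetical names. The default of 2 periods mirrors the rule above.

```ts
// Hypothetical reduced ledger row: one entry per period for a single
// (surface, model_id, candidate_class) key, ordered oldest first.
type LedgerPeriod = { period: string; precision_estimate: number };

// Counts how many of the most recent periods in a row fell below the floor.
function trailingPeriodsBelowFloor(history: LedgerPeriod[], minPrecision: number): number {
  let count = 0;
  for (const row of [...history].reverse()) {
    if (row.precision_estimate < minPrecision) count++;
    else break;
  }
  return count;
}

// True when the gate action should be applied and a health event emitted.
function qualityGateTripped(
  history: LedgerPeriod[],
  minPrecision: number,
  consecutivePeriodsToTrip = 2
): boolean {
  return trailingPeriodsBelowFloor(history, minPrecision) >= consecutivePeriodsToTrip;
}
```

Counting only trailing periods means one good period resets the count, so an old dip followed by recovery does not trip the gate.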

```ts
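// Cheap post-extraction check: an entity counts as "verified" only if its
// canonical name or one of its aliases appears (case-insensitively) in the
// source text. Anything else is returned as "suspect" for review.
function verifyExtractionAgainstSource(
  extracted: ExtractedEntity[],
  sourceText: string
): { verified: ExtractedEntity[]; suspect: ExtractedEntity[] } {
  const haystack = sourceText.toLowerCase(); // lowercase once, not per entity
  const appearsInSource = (name: string) => haystack.includes(name.toLowerCase());
  const verified: ExtractedEntity[] = [];
  const suspect: ExtractedEntity[] = [];
  for (const entity of extracted) {
    const found =
      appearsInSource(entity.canonical_name) ||
      (entity.aliases ?? []).some(alias => appearsInSource(alias));
    (found ? verified : suspect).push(entity);
  }
  return { verified, suspect };
}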
```

**Active learning sampling:** DOC24 may surface 3–5 recently extracted entities once per day for user confirmation. This is a bounded sampling lane, not a prompt storm.

**Clarifying note (AUDIT-2):** the pattern-level adjustments discussed in §11.6 are additional DOC8 outputs computed from aggregated experience, not a second replacement mechanism for the per-event α/β increments in §5.2.

---


## 12. Retrieval Model for Skill Knowledge

When Elnor needs to perform an application task, the retrieval infrastructure handles skill knowledge with no additional systems:

### Example: "Create a table of contents for the Henderson brief"

**Step 1 — Entity resolution:** Semantic router resolves "Henderson" (world entity, type: matter) and "table of contents" (procedure linked to Word).

**Step 2 — Graph traversal:**
- Henderson entity → linked folder root → brief file location
- "Create TOC" composite procedure → constituent procedures: apply heading styles, insert TOC, configure TOC format
- Each procedure → semantic intent steps
- Each procedure → recent traces: last 3 successful TOC creations
- Each procedure → constraints: "use 12-point Times New Roman for legal briefs"

**Step 3 — Pack mounting:** Word is type `application` → entity-type-to-pack mapping mounts `files_pack` + `desktop_apps_pack`. Henderson is type `matter` → mounts `email_pack` + `calendar_pack` + `files_pack` + `search_pack`.

**Step 4 — Context injection:** CIL injects: Henderson entity card, Word entity card with relevant procedures summarized, applicable constraints/preferences, mounted tool directory.

**Step 5 — Execution:** Elnor executes using the retrieved semantic-intent procedure steps, resolved to current UI by the LLM. If a step fails, the trace records the failure and β is incremented.

### Retrieval is graph traversal, not flat search

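The key difference from SKILL.md: retrieval follows EDGES rather than searching by keyword similarity. "Create TOC in Word" traverses from "Word" to "Create TOC" procedure to constituent procedures, pulling in all shared knowledge along the way. Because the traversal follows explicit links, it also returns the constraints, traces, and sub-procedures attached to those nodes, which a keyword match on the task name alone would miss.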

---

## 13. Scope Controls and Memory Modes

### 13.1 What Should NOT Become Graph Nodes

- One-off actions specific to a single document and not reusable
- Raw clipboard operations, mouse movements, or low-level OS interactions
- Application-internal state that doesn't represent a meaningful capability
- Procedures that are purely LLM reasoning (not app interaction)
- Individual UI elements (per semantic intent approach — §4.2)

### 13.2 Graph Growth Limits

- **Capabilities per application:** Soft cap at 30. If exceeded, review lowest-confidence capabilities for removal.
- **Procedure nodes per application:** Soft cap at 50. Encourage composition and archival of superseded procedures.
- **Traces per procedure:** Keep most recent 10. Archive older traces with statistical summary.
- **Bootstrap candidates per extraction:** Hard cap at 15.
- **Overall active node target:** Under 50K for optimal SQLite + sqlite-vec performance.
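
The trace retention rule (keep the most recent 10, archive the rest with a statistical summary) might look like the sketch below. `TraceLite`, `TraceArchiveSummary`, and `pruneTraces` are hypothetical, and the summary fields are an assumed minimum, since the text does not say which statistics are kept.

```ts
const TRACES_KEPT_PER_PROCEDURE = 10; // from the growth limits above

// Hypothetical reduced trace and summary shapes for this sketch.
type TraceLite = { trace_id: string; recorded_at: string; succeeded: boolean };
type TraceArchiveSummary = { archived_count: number; archived_success_rate: number };

// Keeps the newest `keep` traces and summarizes everything older.
function pruneTraces(
  traces: TraceLite[],
  keep: number = TRACES_KEPT_PER_PROCEDURE
): { kept: TraceLite[]; summary: TraceArchiveSummary | null } {
  const newestFirst = [...traces].sort((a, b) =>
    a.recorded_at < b.recorded_at ? 1 : a.recorded_at > b.recorded_at ? -1 : 0
  );
  const kept = newestFirst.slice(0, keep);
  const archived = newestFirst.slice(keep);
  if (archived.length === 0) return { kept, summary: null };
  const successes = archived.filter(t => t.succeeded).length;
  return {
    kept,
    summary: {
      archived_count: archived.length,
      archived_success_rate: successes / archived.length,
    },
  };
}
```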

### 13.3 Environment Scoping Discipline

Every procedure MUST carry a `ProcedureContext`. Procedures without environment scoping are treated as `platform: "unknown"` and receive lower confidence for any specific platform.

### 13.4 Memory Modes

```ts
type MemoryModeState = {
  mode: "normal" | "matter_only" | "personal_only" | "incognito";
  activated_at: string;
  activated_by: "user" | "policy";
  ttl_minutes?: number;
  suppress_extraction: boolean;
  suppress_injection: boolean;
  suppress_cross_scope_linking: boolean;
  excluded_matter_ids?: string[];
};
```

**Mode behaviors:**
- `normal`: Full extraction and injection. Default mode.
- `matter_only`: Only inject knowledge scoped to the currently active work context. Personal preferences, goals, hobbies → suppressed from injection. Professional procedures and firm-level knowledge → still injected.
- `personal_only`: Only inject personal knowledge. Work context-scoped knowledge → suppressed.
- `incognito`: No extraction, no injection. The agent operates as if it has no memory — clean baseline. Core personality and tools remain available.

**Incognito TTL enforcement:** EC's `handleOperationEnvelope` checks timestamp against TTL. If `now > activated_at + ttl_minutes`, silently revert to `normal` before processing.
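
A sketch of the TTL check, with the minute-to-millisecond conversion made explicit. `ModeStateLite` is a reduced, hypothetical view of `MemoryModeState`, and `effectiveMode` is an illustrative name. Applying the TTL to any mode that carries `ttl_minutes`, not only `incognito`, is an assumption of this sketch.

```ts
type MemoryMode = "normal" | "matter_only" | "personal_only" | "incognito";

// Hypothetical reduced view of MemoryModeState for this sketch.
type ModeStateLite = { mode: MemoryMode; activated_at: string; ttl_minutes?: number };

// Returns the mode that should actually govern the current operation.
// Once the TTL has elapsed, the state is treated as "normal".
function effectiveMode(state: ModeStateLite, nowMs: number): MemoryMode {
  if (state.ttl_minutes === undefined) return state.mode; // no TTL: mode persists
  const expiresAtMs = Date.parse(state.activated_at) + state.ttl_minutes * 60000;
  return nowMs > expiresAtMs ? "normal" : state.mode;
}
```

Passing `nowMs` in rather than reading the clock inside the function keeps the check deterministic and easy to test.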

**Per-matter incognito:** `excluded_matter_ids` allows excluding specific matters from extraction/injection while remaining in normal mode for everything else. Useful when working on a particularly sensitive matter.

**Two independent toggles** in Settings > Knowledge: extraction ON/OFF, injection ON/OFF. Emergency brake for broken memory behavior.

UI badge always visible when not in normal mode.

---

## 14. Architectural Relationship to DOC24

### 14.1 Where Skill Nodes Fit

The skill graph uses the SAME SQLite infrastructure as the entity graph:
- **Storage:** Same SQLite database under `ELNOR_MEMORY/`
- **EC single writer:** Same — all graph writes go through EC
- **Retrieval:** Same hybrid retrieval (graph traversal + FTS5 keyword + sqlite-vec vector)
- **Injection:** CIL injects relevant procedure summaries alongside entity cards, within the same token budget
- **Learning:** Same background pipeline infrastructure

### 14.2 Entity-Type-to-Pack Mapping Extension

| Entity kind | Default packs | Probing hints |
|---|---|---|
| application | files_pack, desktop_apps_pack | common tasks, preferred shortcuts, version/platform |
| procedure | (inherits from parent application) | preconditions, environment scope |
| domain_concept | search_pack, files_pack | jurisdiction, related authorities |
| work_product | files_pack | status, version, matter context |

### 14.3 Routing Integration

When the semantic router resolves a request involving an application:
1. Resolve the application entity
2. Traverse to relevant procedures (composite and atomic)
3. Mount associated packs
4. Inject procedure summaries as context
5. The LLM executes using semantic intent steps, resolved to current UI

The routing doesn't need a separate "skill routing" path — it's the same entity resolution → pack mounting → context injection pipeline.

### 14.4 Routing Cascade with FTS5

The routing cascade gains a keyword tier between exact alias lookup and vector search:

| Tier | Method | Latency | When used |
|---|---|---|---|
| **Tier 1** | Exact alias lookup (hash index) | <1ms | Always — first attempt |
| **Tier 1.5** | FTS5 BM25 keyword search | <3ms | If Tier 1 returns no/low-confidence match |
| **Tier 2** | Vector similarity via sqlite-vec | <15ms | Only if Tier 1 + 1.5 return low/zero results |
| **Tier 3** | Model-assisted disambiguation | <500ms | Only for persistent ambiguity |

This saves embedding computation cost on most queries.

**R5.7 additions:** before Tier 1 alias resolution, the router checks `disambiguation_biases` for context-specific user corrections. The novelty gate builder is a DOC72-owned reusable service and may be consumed by both extraction prompts and DOC24 packet assembly contexts.

### 14.5 Routing Cache

If the user asks about Henderson three times in 5 minutes, the routing cascade shouldn't re-resolve each time:

```ts
type RoutingCache = {
  key: string;  // hash(normalized_query + active_entity_ids)
  result: RoutingDecision;
  cached_at: number;
  ttl_ms: number;  // default 300000 (5 min)
};
```

Invalidation: any graph write within the active entity neighborhood invalidates matching cache entries.
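
A minimal in-memory sketch of TTL expiry plus write-driven invalidation. `RoutingCacheStore` and `CachedRoute` are hypothetical. The "active entity neighborhood" is approximated here by the list of entity IDs recorded on each entry, which is an assumption, and the key is taken as already computed by the caller.

```ts
// Hypothetical cache entry: RoutingCache fields plus the entity IDs the
// cached decision depended on (an assumption used for invalidation).
type CachedRoute<T> = {
  key: string;
  result: T;
  cached_at: number;
  ttl_ms: number;
  entity_ids: string[];
};

class RoutingCacheStore<T> {
  private entries = new Map<string, CachedRoute<T>>();

  put(key: string, result: T, entityIds: string[], nowMs: number, ttlMs = 300000): void {
    this.entries.set(key, { key, result, cached_at: nowMs, ttl_ms: ttlMs, entity_ids: entityIds });
  }

  // Returns the cached result, or undefined when missing or past its TTL.
  get(key: string, nowMs: number): T | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (nowMs - entry.cached_at > entry.ttl_ms) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.result;
  }

  // Drops every entry that depended on the written entity; returns the count.
  invalidateForWrite(writtenEntityId: string): number {
    let removed = 0;
    this.entries.forEach((entry, key) => {
      if (entry.entity_ids.indexOf(writtenEntityId) !== -1) {
        this.entries.delete(key);
        removed++;
      }
    });
    return removed;
  }
}
```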

### 14.6 Target Resolution Reason Traces

When `resolve_target` says Henderson means X, include "why":

```ts
type ResolutionTrace = {
  resolution_path: "alias_exact" | "alias_fuzzy" | "fts5_keyword" | "vector_similarity" | "memory_hint" | "provider_metadata";
  matched_alias?: string;
  similarity_score?: number;
  confidence: number;
  disambiguated_by?: string;  // "active_context" | "recent_thread" | "user_clarification"
};
```

Feeds conversational inspectability (§41) — when the user asks "why did you think I meant that?" Elnor can explain the resolution path.

### 14.7 Entity-Scoped Search / Query Rewriting

When the LLM uses `search_world_model` or `search_memory`, the query is enriched with active context. "The Henderson brief" → system resolves Henderson, then searches within Henderson-linked artifacts for briefs.

```ts
// Entity-scoped search variant added to GraphQuery API
{ kind: "entity_scoped_search"; entity_id: string; query: string; max_results: number }
```

### 14.8 Parallel Retrieval Within Tiers

The routing cascade is conceptually sequential between tiers (Tier 1 → Tier 1.5 → Tier 2 → Tier 3 with short-circuit), but within each tier, sub-operations execute in parallel where independent. Entity alias lookup, memory search, and capability check can all start simultaneously and merge results: `Promise.all([aliasLookup(), memorySearch(), capabilityCheck()])`.
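
The sequential-between-tiers, parallel-within-tier shape can be sketched as below. `TierHit`, `CascadeTier`, and `runCascade` are hypothetical names, and the single `minConfidence` short-circuit threshold is an assumption, since the spec does not give a numeric cut-off for "low-confidence".

```ts
// Hypothetical shapes for this sketch.
type TierHit = { entity_id: string; confidence: number; via: string };
type TierLookup = () => Promise<TierHit[]>;
type CascadeTier = { label: string; lookups: TierLookup[] };

// Runs tiers in order. Within a tier, independent lookups run concurrently
// via Promise.all. The cascade stops at the first tier whose best hit meets
// the confidence threshold, so later (costlier) tiers are never invoked.
async function runCascade(
  tiers: CascadeTier[],
  minConfidence: number
): Promise<{ tier: string; hit: TierHit } | null> {
  for (const tier of tiers) {
    const batches = await Promise.all(tier.lookups.map(lookup => lookup()));
    const hits = batches.reduce<TierHit[]>((all, batch) => all.concat(batch), []);
    const best = hits.reduce<TierHit | null>(
      (bestSoFar, h) => (bestSoFar === null || h.confidence > bestSoFar.confidence ? h : bestSoFar),
      null
    );
    if (best !== null && best.confidence >= minConfidence) {
      return { tier: tier.label, hit: best };
    }
  }
  return null;
}
```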

### 14.9 DOC24 Delivery Architecture Split

The Knowledge-to-LLM Delivery Architecture — including packet assembly, card rendering, injection selection, retrieval lanes, rendering templates, injection tags, provenance display policy, and kill-switch behavior — is specified in DOC24. DOC72 owns payload schemas and render-input contracts; DOC24/KDA owns rendering templates and runtime rendering.

**What DOC72 sends to DOC24:**

| Data from DOC72 | How DOC24 uses it |
|---|---|
| Entity cards with confidence + provenance | Rendered into packets with appropriate tags |
| AuthorialVoice + AssertionType on provenance | Mapped to injection tags ([cite_as_rule], etc.) |
| DomainSignalProfile.rendering_templates / render-input contracts | Used to format domain-specific cards |
| MemoryModeState | Filters injection scope; also consumed by the intake pipeline for extraction gating |
| LegalSupportBundle on domain concept cards | Rendered with citation display |
| ContradictionEdge between nodes | Triggers [distinguish] tag and mandatory paired injection |
| ExtractionOutcomeEvent | Feeds back to DOC8 for quality tuning |
| SurfaceThresholdProfile | Informs retrieval priority per surface |
| FocusModeOverride | Adjusts retrieval lane budgets |
| PriorContextCard | Injected when continuity detected |
| Outcome chains / lessons / syntheses / implications | Rendered through DOC24/KDA contracts when eligible |
| Tool capability nodes + reliability state | Combined with DOC24 live capability truth during routing and execution selection |

**Matrix baseline fallback:** When `apply_matrix = false`, Gate 2 uses DOC24 structural relevance scores only (no experience-weighted boost). Gate 3 assigns `force_level: 'standard'` for all eligible cards and `hedge_mode` from DOC72 confidence alone. Matrix-off therefore means baseline non-Matrix packet assembly, not silent suppression of knowledge delivery.

**Derived Matrix artifact rule:** BDSM compiled bundles, utility ledgers, attribution records, feedback logs, and related current views are derived non-graph artifacts stored under `ELNOR_MEMORY/system/learning/`. They are governed by DOC72-owned schemas but are not canonical graph truth and do not appear in the DOC72 node taxonomy.

---


## 15. Non-Application Skill Knowledge

The graph-backed procedural substrate works for any repeatable process, not just desktop applications:

**Legal procedures:** "File a motion in S.D.N.Y." is a composite procedure. Sub-procedures: check local rules, format per court requirements, file via ECF, serve opposing counsel. Each stores semantic intent.

**System operations:** "Set up a DOC23 task pipeline" is a composite procedure. Sub-procedures: configure trigger modules, set up LLM extraction, configure calendar write modules, cable them together.

**Research workflows:** "Research loss causation case law" is a composite procedure. Sub-procedures: search corpora, search legal databases, analyze results, draft summary.

**Personal workflows:** "Process Eurorack patch documentation" is a composite procedure if done regularly. Sub-procedures: photograph patch, document module settings, save to notes.

The architecture is domain-agnostic. Any repeatable multi-step process benefits from graph-backed procedural knowledge.

---

## 16. Temporal Resilience — How the System Handles Change

Everything in the graph exists in time and may change. Files move. Contacts change firms. Websites redesign. Apps update. Deadlines pass. Procedures become obsolete.

### 16.1 Core Principle

Every graph node and relationship carries temporal metadata. All knowledge is treated as potentially mutable. The question is never "is this true?" but "is this still true, and when was it last verified?"

**Cross-cutting governance note:** governance constraints (DOC1 maturity state, consent policy, operational eligibility, suppression flags, visibility restrictions) cross-cut all six dimensions. Delivery eligibility is always the intersection of dimensional strength and governance state.

### 16.2 Temporal Metadata on Every Node

```ts
type TemporalMetadata = {
  created_at: string;
  updated_at: string;
  last_verified_at?: string;
  verification_method?: "trace" | "index_refresh" | "user_confirmed" | "system_check";
  stale_after_ms?: number;            // TTL before staleness flag
  staleness_state: "fresh" | "stale" | "verification_required" | "expired" | "invalidated";
  superseded_by?: string;             // if replaced by a newer version
  change_history: ChangeRecord[];     // append-only history of mutations
};

type ChangeRecord = {
  change_id: string;
  changed_at: string;
  change_type: "created" | "updated" | "moved" | "renamed" | "merged" |
               "corrected" | "invalidated" | "restored" | "environment_changed" |
               "policy_drift";
  changed_fields: string[];
  previous_value_summary?: string;
  change_source: "user" | "indexer" | "system" | "trace" | "correction";
  notes?: string;
};
```

Note: For Tier C nodes (§2.2), only `staleness_state` + `created_at` are tracked. Full ChangeRecord history is Tier A/B only.

### 16.3 Categories of Change

#### A. File and Folder Movement

Procedures and standing procedures reference ENTITY IDs, not raw file paths. When Elnor saves a file to the Henderson folder, he resolves `henderson_folder_root` to its current path at execution time. If the folder moved, the entity's path is updated and the procedure works without modification.

Detection: background indexer monitors approved file roots. Resolution: update folder_root entity path, log `moved` ChangeRecord, emit `CapabilityStateChangeEvent`.

#### B. Contact and Relationship Changes

Person entities linked by role-based relationships. When opposing counsel changes on a matter, the old relationship gets `end_date`, the new one gets created. Standing procedures apply to the new opposing counsel because they reference the relationship type, not the person.

#### C. Application Changes

Procedures store semantic intent, not UI steps — so app updates rarely invalidate procedures. When an app fundamentally changes a capability, the procedure's confidence decays through time and failed traces increment β. The system adapts organically.

#### D. Deadline and Schedule Changes

Time-proximity triggers recompute against current entity dates. When a trial date moves, preparation triggers move with it automatically.

#### E. Entity State Changes

Entities have states that evolve: matters go from `active` to `trial_prep` to `settled` to `closed`. State transitions trigger propagation rules.

#### F. Correction and Contradiction

User corrections are the highest-authority change type. When a correction conflicts with existing knowledge, the correction wins and the old knowledge gets a `superseded_by` pointer.

#### G. Environment Migration

When the user changes machines or updates OS, procedures scoped to the old environment get `staleness_state: "stale"`. Re-validation through successful traces in the new environment restores confidence.

#### H. Policy and Permissions Drift

Auth tokens expire, connector permissions change, provider access is revoked. These affect CAPABILITY, not just knowledge. When an OAuth scope narrows, standing procedures that depended on the broader scope may fail. When a provider is disconnected, entities sourced from it are flagged for staleness review.

Detection: `CapabilityStateChangeEvent` from DOC24 → propagate to DOC72 entities linked to that provider. Standing procedures with `trigger_filter` referencing affected providers are flagged.

### 16.4 Change Propagation Rules

```
Change detected on entity X
    │
    ▼
Update entity X (path, name, state, confidence, staleness)
Log ChangeRecord in X's change_history
    │
    ▼
Traverse outgoing relationships from X (depth 1)
For each linked entity Y:
    │
    ├── If Y directly depends on the changed property of X:
    │   mark Y as needing verification
    │
    ├── If Y is a procedure that references X:
    │   check if the procedure's steps are still valid
    │
    ├── If Y is a standing procedure scoped to X:
    │   check if trigger and actions still resolve
    │
    └── If Y is a composite procedure that uses procedures linked to X:
        flag for potential impact, don't invalidate automatically
    │
    ▼
Emit CapabilityStateChangeEvent if any capability affected
Update packet version
Surface material changes in suggestions inbox
```

**Propagation depth:** One hop for immediate impact. Two hops maximum for transitive warnings. Never unbounded traversal.
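
The depth bound can be sketched as a breadth-first walk that stops after two hops. `DependencyEdge`, `PropagationMark`, and `propagateChange` are hypothetical names. Mapping hop 1 to "verify" and hop 2 to "warn" follows the immediate-impact and transitive-warning split described above.

```ts
// Hypothetical shapes for this sketch; edges point from X to entities linked to X.
type DependencyEdge = { from: string; to: string };
type PropagationMark = { entity_id: string; hops: number; action: "verify" | "warn" };

// Breadth-first propagation from the changed entity, hard-capped at 2 hops.
// Hop 1 entities need verification; hop 2 entities only get a warning.
function propagateChange(
  changedId: string,
  edges: DependencyEdge[],
  maxHops = 2
): PropagationMark[] {
  const hopLimit = Math.min(maxHops, 2); // never unbounded traversal
  const visited = new Set<string>([changedId]);
  const marks: PropagationMark[] = [];
  let frontier: string[] = [changedId];
  for (let hops = 1; hops <= hopLimit && frontier.length > 0; hops++) {
    const next: string[] = [];
    for (const edge of edges) {
      if (frontier.indexOf(edge.from) !== -1 && !visited.has(edge.to)) {
        visited.add(edge.to);
        next.push(edge.to);
        marks.push({ entity_id: edge.to, hops, action: hops === 1 ? "verify" : "warn" });
      }
    }
    frontier = next;
  }
  return marks;
}
```

The `visited` set also guards against cycles, so an edge back to the changed entity does not re-mark it.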

### 16.5 Conflicting Change Detector Reconciliation

When multiple change detection signals conflict (file watcher says moved, indexer says deleted, user says renamed), apply priority ordering:
1. User correction (highest authority) — always wins
2. Direct observation (file watcher, provider API) — high trust
3. Indexer inference — medium trust
4. System heuristic — lowest trust

If two same-tier signals conflict, hold both as competing hypotheses, surface in suggestions inbox. Never auto-resolve conflicting same-tier signals.
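
The priority ordering and the same-tier hold rule can be sketched as follows. The four detector tiers mirror the list above. `ChangeClaim`, `Reconciliation`, and `reconcileClaims` are hypothetical names, and the three claim kinds are taken from the moved/deleted/renamed example.

```ts
type DetectorTier = "user_correction" | "direct_observation" | "indexer_inference" | "system_heuristic";

// Hypothetical claim shape for this sketch.
type ChangeClaim = { detector: DetectorTier; claim: "moved" | "deleted" | "renamed"; source: string };

const DETECTOR_RANK: Record<DetectorTier, number> = {
  user_correction: 0, // highest authority
  direct_observation: 1,
  indexer_inference: 2,
  system_heuristic: 3, // lowest trust
};

type Reconciliation =
  | { kind: "no_signal" }
  | { kind: "resolved"; winner: ChangeClaim }
  | { kind: "competing_hypotheses"; candidates: ChangeClaim[] };

// Picks the highest-authority tier present. If claims inside that tier
// disagree, nothing is auto-resolved and the candidates are held for the inbox.
function reconcileClaims(claims: ChangeClaim[]): Reconciliation {
  if (claims.length === 0) return { kind: "no_signal" };
  const bestRank = Math.min(...claims.map(c => DETECTOR_RANK[c.detector]));
  const topTier = claims.filter(c => DETECTOR_RANK[c.detector] === bestRank);
  const distinctClaims = new Set(topTier.map(c => c.claim));
  const first = topTier[0];
  if (first !== undefined && distinctClaims.size === 1) {
    return { kind: "resolved", winner: first };
  }
  return { kind: "competing_hypotheses", candidates: topTier };
}
```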

### 16.6 Matter State Transition Propagation

When matter state changes, propagation rules apply:
- `active → trial_prep`: Boost injection priority for trial-related entities, procedures, deadlines. Surface anticipatory preparation standing procedures.
- `active → settled`: Suspend all standing procedures for this matter. Archive non-essential connections. Keep domain concepts and authority refs (reusable).
- `settled → closed`: Archive matter entity and all linked operational nodes. Preserve domain knowledge.

### 16.7 Staleness Tiers

| Staleness state | Meaning | Retrieval behavior | Action required |
|---|---|---|---|
| `fresh` | Recently verified or within TTL | Normal retrieval and confidence | None |
| `stale` | Past TTL but not known wrong | Retrieval with lower ranking + caution | Verification encouraged |
| `expired` | Significantly past TTL, no recent activity | Retrieval only as fallback, clearly marked | Review or archive recommended |
| `invalidated` | Known to be wrong | Excluded from primary retrieval | Must be corrected, superseded, or archived |

### 16.8 Resilience Design Rules

1. **Reference entities, not raw values.** Procedures link to entity IDs, not file paths, emails, or URLs.
2. **Never silently break.** If a dependency changes and something can't be resolved, surface it.
3. **History is append-only.** Changes are recorded, not overwritten. Previous state is recoverable.
4. **Staleness is graduated.** Fresh → stale → expired → invalidated. Never complete amnesia.
5. **Environment-scoped invalidation.** A procedure failing on Word web doesn't invalidate it on Word desktop.
6. **Graph edges carry temporal metadata.** Relationships have start_date and optionally end_date.

---

## 17. Extended Applications of the Graph Architecture

### 17.1 Communication Intelligence

The graph entity for each contact gains relationship-scoped communication metadata: tone patterns, CC conventions, subject line formats, response urgency expectations, preferred sending account. Learned from email traces.

DOC1's generalization engine detects type-level communication heuristics: "emails to opposing counsel are formal, include case number in subject, CC the partner." These attach to the `opposing_counsel` relationship type so every future opposing counsel inherits them.

Communication style memory_directives (§4.4) provide domain-specific writing intelligence beyond per-recipient patterns. Profiles cascade through the graph hierarchy: global → space → entity type → specific entity → specific recipient.

### 17.2 Decision Memory

Decision events are captured as memory_directive nodes that record the decision itself, alternatives considered, reasoning, context at the time, and links to involved entities. They are created from conversations where conclusions are reached.

When facing a similar decision on a different matter, Elnor surfaces prior reasoning. Decisions carry temporal metadata: what was known WHEN the decision was made. If new information contradicts a past decision's premise, the system can flag it.

### 17.3 Anticipatory Preparation

Standing procedures triggered by time proximity rather than events. "Two weeks before any trial date, remind me to check exhibit designations." The graph provides entity links; the standing procedure provides the temporal trigger. Anticipatory workflows are temporally resilient — if a trial date moves, the preparation trigger moves with it automatically.

### 17.4 Research Lineage

Research sessions produce trace nodes: search queries → results found → sources evaluated → sources kept (with reasoning) → sources rejected (with reasoning) → argument threads constructed. Each trace links to the matter entity, legal theory entity, and specific sources as entities.

### 17.5 Workflow Rhythm Detection

Session traces record when you work on what. DOC1's generalization engine detects patterns: "Will does email review first thing. Deep work blocks run 2+ hours. Administrative tasks happen afternoons." These feed probing (DOC8: don't probe during deep work), anticipatory systems, and context switching suggestions.

### 17.6 Document Evolution Tracking

Work product versions linked in the graph. `Henderson_MTD_v1.docx` through `Henderson_MTD_v7.docx` are the same document at different stages. The graph tracks what changed between versions. When you say "pull up the Henderson MTD," Elnor knows you mean the LATEST version.

### 17.7 Preference Inheritance Through Graph Hierarchy

Preferences cascade through graph relationships: global → space (work/personal) → entity type → specific entity → session context. Each level can override the levels above it, so the most specific level that defines a preference wins. One-time context overrides apply only to the current interaction and don't persist.
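
The cascade can be sketched as a layered merge in which more specific levels overwrite broader ones. The `PreferenceLayers` shape and the string-valued preferences are assumptions made for illustration; the spec does not prescribe a storage format.

```typescript
// Levels ordered from least to most specific, as in §17.7.
const PREFERENCE_LEVELS = ["global", "space", "entity_type", "entity", "session"] as const;
type PreferenceLevel = typeof PREFERENCE_LEVELS[number];

// Hypothetical shape: each level holds a partial map of preference key -> value.
type PreferenceLayers = Partial<Record<PreferenceLevel, Record<string, string>>>;

function resolvePreferences(layers: PreferenceLayers): Record<string, string> {
  const resolved: Record<string, string> = {};
  // Apply levels in order so more specific levels overwrite broader ones.
  for (const level of PREFERENCE_LEVELS) {
    Object.assign(resolved, layers[level] ?? {});
  }
  return resolved;
}

// One-time session overrides are dropped before anything is persisted.
function persistableLayers(layers: PreferenceLayers): PreferenceLayers {
  const { session: _discarded, ...durable } = layers;
  return durable;
}
```

With a global `doc_format` of `markdown`, an entity-level `docx`, and a session-level `pdf`, the current interaction resolves to `pdf`, while the persisted view still resolves to `docx`.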

### 17.8 Relationship Evolution

Graph relationships carry temporal data: start dates, end dates, change history. "Jane Smith was opposing counsel on Henderson from January 2025 to June 2025." When she appears on a new case, Elnor notes the history.
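
A minimal sketch of how temporal edge metadata could answer "was this relationship active on a given date" and "what history does this person have". The `TemporalEdge` shape is hypothetical; only `start_date` and `end_date` come from the text (§16.8 rule 6).

```typescript
// Hypothetical edge shape; only start_date / end_date are named in the spec text.
type TemporalEdge = {
  src_id: string;
  dst_id: string;
  relationship_type: string;
  start_date: string;  // ISO 8601 date, e.g. "2025-01-01"
  end_date?: string;   // absent while the relationship is ongoing
};

function isActiveOn(edge: TemporalEdge, isoDate: string): boolean {
  // ISO 8601 dates in the same format compare correctly as strings.
  return edge.start_date <= isoDate &&
    (edge.end_date === undefined || isoDate <= edge.end_date);
}

// Relationships that ended before the given date: the "history" to surface
// when the same person appears on a new matter.
function priorRelationships(
  edges: TemporalEdge[],
  personId: string,
  asOf: string,
): TemporalEdge[] {
  return edges.filter(
    (e) => e.src_id === personId && e.end_date !== undefined && e.end_date < asOf,
  );
}
```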

### 17.9 Failure Pattern Recognition

DOC8's friction clustering promotes recurring problem types into named failure patterns linked to applications, procedures, or interaction patterns. "TOC creation in Word web fails because custom styles aren't supported." Proactive warnings are then delivered through standing procedures.

### 17.10 Ambient State Awareness

The graph tracks current STATUS of active entities: Henderson is in trial prep, taxes are due April 15, the Eurorack build is waiting for parts. Entity states update from calendar events, email activity, task completions.

### 17.11 Attention and Interruption Modeling

Attention policies are behavioral memory_directives — entity-scoped or global — governing urgency classification, interruption thresholds, channel appropriateness, time-of-day sensitivity, bundling rules, and escalation rules. Learned from explicit teaching, onboarding, and DOC8 friction tracking.

### 17.12 Cross-Domain Transfer Learning

DOC1's generalization engine clusters similar items WITHIN entity types. Cross-domain transfer learns meta-patterns about HOW THE USER ORGANIZES across all domains. Global-scope organizational heuristics apply across legal work, music production, personal tasks.

### 17.13 Cross-Matter Pattern Detection

After accumulating experience across multiple cases, DOC8 nightly analysis runs cross-matter graph queries: "In 7 of 10 securities cases, corrective disclosure was stronger." "Opposing counsel Jones & Smith focuses on scienter. Historical success: 3 of 4."

Results stored as `cross_matter_insight` domain_concept nodes with multi-matter provenance.

---

## 18. Standing Procedures

**Normative payload note:** The conceptual standing-procedure architecture in this section is now backed by the absorbed `StandingProcedureKnowledgeContractSchema` in §4A.8.

### 18.1 What Standing Procedures Are

A standing procedure answers WHEN something should happen and WHAT should be done. The actions might USE skill procedures, but the standing procedure is the conditional logic layer.

Examples:
- "When opposing counsel emails about Henderson, save the attachment to the Henderson folder and notify me."
- "When court filings arrive for any case, extract deadlines and add them to the case calendar."
- "Two weeks before any trial date, remind me to check exhibit designations."

### 18.2 Schema

```ts
// Stored in nodes table with node_kind = 'standing_procedure'
type StandingProcedurePayload = {
  canonical_name: string;
  description: string;

  // Trigger — structured for machine matching, natural language for display
  trigger: TriggerSpec;

  // Action sequence
  actions: StandingProcedureAction[];

  // Scope
  scope: "global" | "entity_specific" | "entity_type" | "space";
  scoped_entity_id?: string;
  scoped_entity_type?: string;

  // Lifecycle and governance
  source: "user_taught" | "onboarding" | "generalized" | "system_suggested";
  lifecycle_state: "candidate" | "active" | "suspended" | "archived";
  requires_confirmation: boolean;
  execution_count: number;
  last_executed_at?: string;
  last_succeeded_at?: string;
};
```

### 18.3 Structured Trigger Spec

```ts
type TriggerSpec = {
  trigger_type: "event" | "condition" | "schedule" | "time_proximity" | "manual";
  event_source?: "email_received" | "file_created" | "file_modified" |
                 "calendar_event" | "task_completed" | "task_failed" |
                 "entity_created" | "entity_confirmed" | "folder_changed" |
                 "deadline_approaching" | "capability_changed";
  trigger_filter: TriggerFilter[];       // Concrete filter language (ISS-13)
  time_reference_field?: string;         // e.g., "trial_date" — for proximity triggers
  time_offset_ms?: number;              // e.g., -1209600000 (2 weeks before)
  entity_scope?: string[];              // specific entity IDs
  entity_type_scope?: string[];         // entity types this applies to
  natural_language_description: string; // human-readable for display/editing
};

// Concrete filter language — replaces Record<string, unknown>
type TriggerFilter = {
  field_path: string;                   // JSONPath-style: "$.sender_email", "$.subject", "$.folder_path"
  operator: "equals" | "contains" | "matches_regex" | "starts_with" |
            "ends_with" | "in_list" | "not_in_list" | "greater_than" |
            "less_than" | "exists" | "not_exists";
  value: string | string[] | number | boolean;
};
```

**Matching algorithm:** For each incoming event, iterate over active standing procedures matching `event_source`. For each, evaluate ALL `trigger_filter` entries (AND logic). All must match for the procedure to fire. This is deterministic field comparison — no LLM call.

**R5.7 performance rules:** trigger filters are precompiled at procedure creation time, indexed by `event_source`, and may use a compiled regex cache for `matches_regex` operators. If evaluation exceeds 10ms, degradation telemetry is emitted to the health dashboard.
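
The matching algorithm above can be sketched as follows. This is an illustrative reading, not the normative implementation: `resolvePath` handles only dotted property access (not full JSONPath), type mismatches are treated as non-matches, and the regex is compiled on each call where a production version would use the precompiled cache described above.

```typescript
type TriggerOperator =
  | "equals" | "contains" | "matches_regex" | "starts_with" | "ends_with"
  | "in_list" | "not_in_list" | "greater_than" | "less_than" | "exists" | "not_exists";

type TriggerFilter = {
  field_path: string;
  operator: TriggerOperator;
  value: string | string[] | number | boolean;
};

// Resolve a simple "$.a.b" path. Wildcards and array slices are out of scope here.
function resolvePath(payload: unknown, fieldPath: string): unknown {
  const segments = fieldPath.replace(/^\$\.?/, "").split(".").filter((s) => s.length > 0);
  let current: unknown = payload;
  for (const segment of segments) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[segment];
  }
  return current;
}

function matchesFilter(payload: unknown, filter: TriggerFilter): boolean {
  const actual = resolvePath(payload, filter.field_path);
  const expected = filter.value;
  const bothStrings = typeof actual === "string" && typeof expected === "string";
  const bothNumbers = typeof actual === "number" && typeof expected === "number";
  const inList = Array.isArray(expected) && typeof actual === "string" && expected.includes(actual);
  switch (filter.operator) {
    case "exists": return actual !== undefined && actual !== null;
    case "not_exists": return actual === undefined || actual === null;
    case "equals": return actual === expected;
    case "in_list": return inList;
    case "not_in_list": return Array.isArray(expected) && !inList;
    case "greater_than": return bothNumbers && (actual as number) > (expected as number);
    case "less_than": return bothNumbers && (actual as number) < (expected as number);
    case "contains": return bothStrings && (actual as string).includes(expected as string);
    case "starts_with": return bothStrings && (actual as string).startsWith(expected as string);
    case "ends_with": return bothStrings && (actual as string).endsWith(expected as string);
    // Assumes the pattern was validated at procedure creation time.
    case "matches_regex": return bothStrings && new RegExp(expected as string).test(actual as string);
  }
}

// AND logic: every filter must match for the procedure to fire.
function triggerMatches(payload: unknown, filters: TriggerFilter[]): boolean {
  return filters.every((f) => matchesFilter(payload, f));
}
```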

### 18.4 Action Schema with Safety Classes

```ts
type StandingProcedureAction = {
  step_index: number;
  action_kind: "invoke_capability" | "update_memory" | "emit_notification" |
               "delegate_to_agent" | "branch" | "invoke_skill";
  safety_class: ActionSafetyClass;      // ISS-47: safety classification
  description: string;                   // natural language description
  target_action_id?: string;            // capability registry action_id
  target_entity_selector?: string;      // how to resolve target at execution time
  branch_condition?: string;            // for conditional steps
  requires_confirmation_override?: boolean;
};

type ActionSafetyClass = "read" | "draft" | "internal_write" | "destructive_send";
```

**Graduation rules by safety class:**
- `read`: Auto-graduate after 3 successful executions. Low risk.
- `draft`: Auto-graduate after 5 successful executions. Creates artifacts but doesn't send.
- `internal_write`: Auto-graduate after 5 successful executions with daily digest review. Writes to internal systems (calendar, files, notes) but nothing external.
- `destructive_send`: **NEVER auto-graduate.** Stays `requires_confirmation: true` permanently unless the user explicitly overrides per-procedure. Sending emails, filing on ECF, posting externally — always confirmed.

This prevents the most dangerous automation failure: silently sending the wrong thing to the wrong person.
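
The graduation rules can be expressed as a small lookup. The sketch below assumes that a procedure with several actions is governed by its strictest action, and it models the daily digest review as a boolean flag; both are illustrative assumptions rather than spec requirements. Explicit per-procedure user overrides for `destructive_send` are outside this check.

```typescript
type ActionSafetyClass = "read" | "draft" | "internal_write" | "destructive_send";

// Successful executions required per class; null means never auto-graduate.
const GRADUATION_THRESHOLD: Record<ActionSafetyClass, number | null> = {
  read: 3,
  draft: 5,
  internal_write: 5,
  destructive_send: null,
};

// Assumption: every action's class must allow graduation, so the strictest
// action in the sequence governs the whole procedure.
function canAutoGraduate(
  actionClasses: ActionSafetyClass[],
  successfulExecutions: number,
  digestReviewed: boolean, // hypothetical flag: the daily digest was reviewed
): boolean {
  if (actionClasses.length === 0) return false;
  return actionClasses.every((cls) => {
    const threshold = GRADUATION_THRESHOLD[cls];
    if (threshold === null) return false;
    if (cls === "internal_write" && !digestReviewed) return false;
    return successfulExecutions >= threshold;
  });
}
```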

### 18.5 Execution Receipts

Every execution produces a receipt:

```ts
type StandingProcedureExecution = {
  execution_id: string;
  procedure_id: string;
  trigger_event_id?: string;
  triggered_at: string;
  status: "pending" | "running" | "success" | "failed" |
          "skipped" | "requires_confirmation" | "partial_success";
  skip_reason?: "duplicate_event" | "conflict_with_procedure" |
                "confirmation_timeout" | "capability_unavailable";
  executed_actions: Array<{
    step_index: number;
    action_kind: string;
    safety_class: ActionSafetyClass;
    status: "success" | "failed" | "skipped";
    result_ref?: string;
    error_code?: string;
  }>;
  result_refs?: string[];
  error_codes?: string[];
  completed_at?: string;
};
```

### 18.6 Dedup Keys

Duplicate event delivery triggering duplicate actions is a real production risk:

```ts
type StandingProcedureExecutionKey = {
  procedure_id: string;
  trigger_event_fingerprint: string;    // hash of event identifying fields
  dedup_window_ms: number;              // default 900000 (15 min)
};
```

Rule: Before execution, check the dedup store. If the key exists within the window, skip with receipt `status: "skipped", skip_reason: "duplicate_event"`.
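
A minimal in-memory sketch of the dedup check. The `DedupStore` class is hypothetical; a real store would need persistence and eviction of old keys. Here the window is measured from the first accepted event and is not extended by duplicates, which is one possible reading of the rule.

```typescript
type StandingProcedureExecutionKey = {
  procedure_id: string;
  trigger_event_fingerprint: string;
  dedup_window_ms: number;
};

const DEFAULT_DEDUP_WINDOW_MS = 900_000; // 15 minutes

// Hypothetical in-memory store keyed by procedure + event fingerprint.
class DedupStore {
  private readonly lastAcceptedMs = new Map<string, number>();

  // Returns true if execution should proceed, false for a duplicate in the window.
  shouldExecute(key: StandingProcedureExecutionKey, nowMs: number): boolean {
    const storeKey = `${key.procedure_id}:${key.trigger_event_fingerprint}`;
    const previous = this.lastAcceptedMs.get(storeKey);
    if (previous !== undefined && nowMs - previous < key.dedup_window_ms) {
      return false; // caller writes a receipt with skip_reason "duplicate_event"
    }
    this.lastAcceptedMs.set(storeKey, nowMs);
    return true;
  }
}
```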

### 18.7 Conflict Resolution

When multiple standing procedures match the same event with conflicting actions:

```ts
type StandingProcedureConflict = {
  procedure_a_id: string;
  procedure_b_id: string;
  conflict_type: "action_overlap" | "contradictory_actions" | "resource_contention";
  resolution: "a_wins_by_confidence" | "b_wins_by_confidence" | "user_resolution_required";
};
```

1. Check for action conflicts (two procedures saving same file to different locations)
2. If conflict detected: execute higher-confidence procedure, skip conflicting one with receipt
3. If no conflict: execute all matching procedures (independent)
4. Surface conflicts in daily automation digest
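
One way to sketch steps 1 to 3 is shown below. It covers only the case where two procedures write to the same resource. The `confidence` and `target_resources` fields are hypothetical: the payload schema in §18.2 does not define either, so how confidence is derived is left open here.

```typescript
type MatchedProcedure = {
  procedure_id: string;
  confidence: number;         // hypothetical: not defined in the §18.2 payload
  target_resources: string[]; // hypothetical: resources the actions would write to
};

type ExecutionPlan = {
  execute: string[];
  skipped: Array<{ procedure_id: string; skip_reason: "conflict_with_procedure" }>;
};

function planExecution(matched: MatchedProcedure[]): ExecutionPlan {
  // Higher confidence first, so winners claim resources before lower-confidence ones.
  const ordered = [...matched].sort((a, b) => b.confidence - a.confidence);
  const claimed = new Set<string>();
  const plan: ExecutionPlan = { execute: [], skipped: [] };
  for (const proc of ordered) {
    const conflicts = proc.target_resources.some((r) => claimed.has(r));
    if (conflicts) {
      // Skipped procedures still get a receipt and appear in the daily digest.
      plan.skipped.push({ procedure_id: proc.procedure_id, skip_reason: "conflict_with_procedure" });
    } else {
      proc.target_resources.forEach((r) => claimed.add(r));
      plan.execute.push(proc.procedure_id);
    }
  }
  return plan;
}
```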

### 18.8 How Standing Procedures Are Created

**From conversation:** "Elnor, whenever opposing counsel sends us a brief on Henderson, save it to the Henderson folder." One sentence creates a standing procedure linked to Henderson, opposing counsel, and Henderson folder entities.

**From onboarding:** "Are there things you want me to do automatically?"

**From generalization (DOC1 §12):** If Elnor notices you've done the same thing three times, DOC1 proposes: "Want me to do this automatically?"

**From system suggestion:** Type-level heuristics suggest relevant standing procedures when new matters are created.

**From conversation mining (§20):** Post-chat extraction identifies instructional statements ("from now on...", "whenever...", "always...").

### 18.9 Confirmation Model and Daily Digest

New standing procedures default to `requires_confirmation: true`. After successful executions without corrections, the user can set it to autonomous (subject to ActionSafetyClass constraints — `destructive_send` NEVER auto-graduates).

**Daily automation digest:** Summarizes all standing procedure executions (both confirmed and autonomous). "Here's what your standing procedures did today: [list]. Any problems?" This provides oversight without per-execution fatigue.

DOC8 tracks confirmation fatigue: if the user confirms everything within 2 seconds, they are likely not reading the prompts, and the system surfaces a warning.

### 18.10 Promotion to DOC23 Tasks

```ts
type ProcedurePromotionDecision = {
  source_procedure_id: string;
  promotion_target: "doc23_task" | "composite_procedure" | "keep_as_standing_procedure";
  reason_codes: string[];
  promotion_criteria_met: {
    repeated_successful_executions: boolean;   // 10+ successes
    low_ambiguity: boolean;                    // trigger matching consistently clean
    high_operational_importance: boolean;      // failure would cause real problems
    needs_retry_audit: boolean;                // wants guaranteed execution
    stable_trigger_pattern: boolean;           // trigger hasn't been modified recently
    structured_io_suitable: boolean;           // actions map to DOC23 module graph
  };
};
```

### 18.11 Temporal Resilience of Standing Procedures

Standing procedures reference ENTITIES and RELATIONSHIPS, not raw values. When a file path changes, the entity updates and the procedure follows. When opposing counsel changes, the procedure applies to the new person because it references the relationship type. When a deadline moves, time-proximity triggers recompute.
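
For time-proximity triggers, recomputation can be as simple as re-deriving the fire time from the entity's current field value. The sketch below assumes date fields are stored as ISO 8601 strings on a flat record; the `EntityFields` shape is hypothetical.

```typescript
type ProximityTrigger = {
  time_reference_field: string; // e.g. "trial_date"
  time_offset_ms: number;       // negative = before the reference time
};

// Hypothetical entity shape: date fields stored as ISO 8601 strings.
type EntityFields = Record<string, string | undefined>;

// Returns the epoch-ms fire time, or undefined if the field is missing or unparseable.
function computeFireTimeMs(trigger: ProximityTrigger, entity: EntityFields): number | undefined {
  const raw = entity[trigger.time_reference_field];
  if (raw === undefined) return undefined;
  const referenceMs = Date.parse(raw);
  if (Number.isNaN(referenceMs)) return undefined;
  return referenceMs + trigger.time_offset_ms;
}
```

Because the fire time is derived from the entity rather than stored as a fixed timestamp, moving the trial date moves the reminder with it.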

### 18.12 What Should NOT Become a Standing Procedure

A standing procedure requires:
- A clear, repeatable trigger condition
- A meaningful multi-step action sequence
- Reusable value beyond a single instance
- Enough confidence the behavior is intentional

Simple preferences stay as memory_directives. One-time instructions stay as ephemeral context. DOC1's generalization engine requires 3+ similar instances before proposing.

---

## 19. Latency Discipline

### 19.1 The Constraint

OpenClaw's baseline model call overhead is approximately 20 seconds. The knowledge system has ZERO budget for additional perceptible latency. **The entire DOC72 knowledge system MUST NOT add more than 50ms total to prompt assembly time under any circumstances.**

### 19.2 Three-Tier Timing Model

Nothing in Tier 2 or Tier 3 may BLOCK a Tier 1 operation. If any Tier 1 operation cannot complete in time, the system degrades gracefully.

#### Tier 1 — Prompt-time (must complete in <50ms total cumulative)

Pre-computed, cached, in-memory data only. No model calls, no database queries beyond indexed lookups, no network requests.

| Operation | Method | Target latency |
|---|---|---|
| Entity resolution | Cached alias/name/ID index + FTS5 | <3ms |
| Pack selection | Cached entity-type-to-pack mapping | <1ms |
| Entity card retrieval | SQLite indexed lookup | <2ms |
| Standing order/directive retrieval | Cached DOC1 memory | <1ms |
| Capability summary | Cached live action state | <1ms |
| Standing procedure trigger check | Structured filter matching | <5ms |
| Packet assembly (string rendering) | Template rendering | <5ms |
| **Total Tier 1 budget** | | **<50ms** |

#### Tier 2 — Background-immediate (during/after model turn, <500ms)

| Operation | When |
|---|---|
| Warm-path memory search | On session start, topic shift |
| Complex-path routing with vector search | When fast-path resolution returns no match |
| Execution trace capture | After tool action completes |
| Standing procedure trigger evaluation for background events | After email triage, file change |

#### Tier 3 — Background-deferred (async, seconds to hours)

| Operation | Schedule |
|---|---|
| Entity extraction from email/calendar/files | Continuous + nightly deep enrichment |
| Procedure extraction from execution traces | After significant interactions |
| LLM bootstrap for new applications | Once per new application entity |
| DOC1 nightly generalization | Nightly |
| DOC8 friction/experience analysis | Nightly |
| Cross-chat pattern detection | Nightly |
| Graph integrity checks + cleanup | Nightly |
| Conversation mining / chat extraction | After chat closes, significance-gated |
| Embedding generation for new/updated nodes | Async on creation/update |
| Change propagation through graph edges | Event-driven, async |

### 19.3 Pre-Computed Data Requirements

For Tier 1 to work:
- **Graph alias index** — loaded on EC startup, updated incrementally
- **Entity-type-to-pack mapping** — loaded from config
- **DOC1 hot-path memories** — cached per DOC1 §5.1
- **Capability summary** — cached, invalidated on changes
- **Active standing procedure trigger index** — indexed by event_source
- **Tool pack definitions** — loaded once, cached
- **Routing cache** — LRU with 5-minute TTL
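
As an illustration of the last item, an LRU cache with a 5-minute TTL can be built on a `Map`, whose insertion order doubles as recency order. The `RoutingCache` class and its capacity parameter are assumptions; the spec states only "LRU with 5-minute TTL".

```typescript
class RoutingCache<V> {
  // Map preserves insertion order, so the first key is the least recently used.
  private readonly entries = new Map<string, { value: V; expiresAtMs: number }>();
  private readonly maxEntries: number; // assumed capacity, expected to be >= 1
  private readonly ttlMs: number;

  constructor(maxEntries: number, ttlMs: number = 5 * 60 * 1000) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
  }

  get(key: string, nowMs: number): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (nowMs >= entry.expiresAtMs) {
      this.entries.delete(key); // expired: treat as a miss
      return undefined;
    }
    // Re-insert to mark the key as most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V, nowMs: number): void {
    this.entries.delete(key);
    if (this.entries.size >= this.maxEntries) {
      const oldest = this.entries.keys().next();
      if (!oldest.done) this.entries.delete(oldest.value);
    }
    this.entries.set(key, { value, expiresAtMs: nowMs + this.ttlMs });
  }
}
```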

### 19.4 Degradation Rules

If any Tier 1 data is unavailable:
1. Use last available version (stale > absent)
2. If no version exists, proceed with core-pack-only and minimal context
3. Log the degradation as a DOC8 friction event
4. Queue a background refresh
5. Never block the user's turn waiting for data
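
The degradation rules can be sketched as a synchronous read that never waits on a refresh. The `Tier1Source` shape and the `DegradationHooks` callbacks are hypothetical stand-ins for the real DOC8 friction log and background scheduler interfaces.

```typescript
type Tier1Result<T> =
  | { kind: "current"; data: T }
  | { kind: "stale"; data: T }
  | { kind: "core_pack_only" };

type Tier1Source<T> = {
  current?: T;   // fresh cached value, if available
  lastKnown?: T; // last available version
};

// Hypothetical side-effect hooks; the real interfaces belong to DOC8 and the scheduler.
type DegradationHooks = {
  logFriction: (dataset: string) => void;
  queueRefresh: (dataset: string) => void;
};

function readTier1<T>(
  dataset: string,
  source: Tier1Source<T>,
  hooks: DegradationHooks,
): Tier1Result<T> {
  if (source.current !== undefined) return { kind: "current", data: source.current };
  // Degraded path: record it and refresh in the background. Nothing is awaited.
  hooks.logFriction(dataset);
  hooks.queueRefresh(dataset);
  if (source.lastKnown !== undefined) return { kind: "stale", data: source.lastKnown };
  return { kind: "core_pack_only" };
}
```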

### 19.5 Background Resource Management

```ts
type BackgroundSchedulerPolicy = {
  foreground_active: {
    tier3_state: "paused" | "throttled";
    max_tier3_cpu_percent: number;        // 10% when foreground active
    max_concurrent_extractions: number;   // 1
    max_embedding_batch_size: number;     // 10
  };
  foreground_idle: {
    tier3_state: "active";
    max_tier3_cpu_percent: number;        // 50%
    max_concurrent_extractions: number;   // 3
    max_embedding_batch_size: number;     // 100
  };
  backpressure: {
    max_extraction_queue: number;         // 50 (ring buffer)
    max_enrichment_queue: number;         // 200
    max_embedding_queue: number;          // 1000
    drop_policy: "lowest_priority_first";
  };
};
```

When the active chat session is hot, all Tier 3 background extraction and embedding generation must be paused or throttled. This prevents background processing from starving the foreground model of compute resources on a single MacBook.
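
A sketch of how a scheduler might apply these limits, using the values from the inline comments in the policy type. Choosing `"throttled"` rather than `"paused"` for the foreground-active state is an assumption, as are the helper names.

```typescript
type Tier3Limits = {
  tier3_state: "paused" | "throttled" | "active";
  max_tier3_cpu_percent: number;
  max_concurrent_extractions: number;
  max_embedding_batch_size: number;
};

// Values taken from the inline comments in BackgroundSchedulerPolicy.
const FOREGROUND_ACTIVE_LIMITS: Tier3Limits = {
  tier3_state: "throttled", // assumption: "paused" is the other allowed value
  max_tier3_cpu_percent: 10,
  max_concurrent_extractions: 1,
  max_embedding_batch_size: 10,
};

const FOREGROUND_IDLE_LIMITS: Tier3Limits = {
  tier3_state: "active",
  max_tier3_cpu_percent: 50,
  max_concurrent_extractions: 3,
  max_embedding_batch_size: 100,
};

function tier3Limits(foregroundActive: boolean): Tier3Limits {
  return foregroundActive ? FOREGROUND_ACTIVE_LIMITS : FOREGROUND_IDLE_LIMITS;
}

// How many new extractions may start right now.
function extractionSlotsAvailable(foregroundActive: boolean, runningExtractions: number): number {
  const limits = tier3Limits(foregroundActive);
  if (limits.tier3_state === "paused") return 0;
  return Math.max(0, limits.max_concurrent_extractions - runningExtractions);
}
```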

---


### 19.6 Daily Extraction Budget Governance

Tier 3 extraction operations are further governed by a daily extraction budget (§23). The budget provides a hard ceiling on deep extraction slots per day, with fair-share allocation across active work contexts and reserved slots for critical events (§24). When the daily budget is exhausted, non-critical deep extractions are deferred to the next day with carried-forward slots.

### 19.7 Critical Non-Drop Queue

Critical events (user corrections, court order detections, task failures, CANDOR critical findings, explicitly accepted recommendations, user-flagged Focus) bypass normal queue priority and budget constraints via a protected zone in the extraction ring buffer (§24). This ensures safety-critical knowledge is never lost due to backpressure.


### 19.8 Backlog Health, SLA Visibility, and Incremental Nightly Work

The system now surfaces extraction backlog health directly in the Knowledge Manager / System Health surfaces: queue depth, items deferred, items dropped this week, and nightly catch-up status. Historical calibration sweeps use a separate budget and do not draw from daily extraction capacity.

Nightly jobs process incrementally against per-job high-water marks. If a job exceeds its configured maximum duration, it checkpoints and resumes on the next cycle rather than blocking downstream jobs.


## 20. Conversation Mining — Post-Chat Knowledge Extraction

### 20.1 The Problem

During conversations, Elnor is focused on the user's task — not simultaneously cataloging every entity, relationship, and preference. Cross-chat pattern detection is impossible in real-time. In Monday's chat the user mentions "the Blackstone deal." In Wednesday's chat: "Sarah at Blackstone." In Friday's chat: "check the Blackstone folder." Only post-chat review across conversations can connect these dots.

### 20.2 What Conversation Mining Does

A background process reviews conversations AFTER they happen, identifies knowledge mentioned but not explicitly saved, and proposes it for the graph. Operates on CLOSED conversations only.

This section covers two-party chat extraction (user + Elnor). For multi-participant surfaces (rooms, panels, forums) and other surfaces (notes, document viewer, tasks, CANDOR), see §20B — each has a dedicated extraction contract with attribution rules. For how the system decides WHEN to extract (significance gating), see §20A.

### 20.3 What It Looks For

- **Entity candidates:** Names, organizations, matters, projects, applications not already in the graph
- **Relationship signals:** "Jane Smith emailed about Henderson" → if both entities exist, is there an edge?
- **Implicit preferences:** User chose Word over Markdown, sent from personal email, used specific subject line format
- **Procedural demonstrations:** Multi-step processes demonstrated in conversation enrich procedure nodes
- **Decision events:** Strategic decisions with reasoning, captured as memory_directives
- **Standing procedure candidates:** Instructional statements ("from now on...", "whenever...", "always...")
- **Corrections:** Path corrections, entity corrections, relationship corrections

### 20.4 Extraction Process

The process uses a spec-defined durable-knowledge extraction prompt. R5.7 replaces the earlier thin block with the integrated prompt below, taken from the accepted Knowledge Intelligence Enhancement R2 module and applied to DOC72 conversation/document extraction:

```
You extract durable knowledge from user conversation and related artifacts into
strict structured outputs. Extract ONLY when the content is durable, scoped,
and useful beyond the immediate turn.

OUTPUT CLASSES
1. entities
2. relationships
3. memory_directives
4. procedures
5. obligations
6. goals
7. outcome_chains
8. behavioral_actor_notes
9. user_taught_lessons

GENERAL RULES
- Prefer abstaining over guessing.
- Do not globalize matter-specific knowledge.
- Do not infer protected traits, emotional states, health, or private-life details.
- For legal/authority-derived content, preserve the authority source and direct
  excerpt when available.
- If information is incomplete, output partial objects with explicit nulls;
  do not hallucinate missing fields.

OUTCOME CHAINS
Extract an outcome chain only for SUBSTANTIVE outcomes:
- ruling/decision/result/success/failure/correction with durable learning value
Do NOT extract routine successful operations.

For each outcome chain, capture:
- approach_summary
- arguments_applied (reference known concept IDs if possible: {novelty_gate_concepts})
- procedures_used (reference known procedure IDs if possible)
- tools_used (stable tool/action IDs if known)
- outcome_type (from: success, partial_success, failure, rejection,
  correction_needed, unexpected_positive, neutral)
- outcome_summary
- reasoning (ONLY if explicit or strongly supported. Do NOT speculate.)
- consequence_severity using this rubric:
  critical = real-world harm / missed deadline / wrong recipient / sanctions risk
  high = major rework / rejected filing / major strategy change
  medium = retry / moderate delay / formatting correction
  low = minor annoyance / cosmetic issue
- severity_justification (if adjusting from deterministic baseline, cite evidence)
- root_cause_class (from: knowledge_gap, procedural_failure, tool_failure,
  infrastructure_failure, external_factor, third_party_delay, human_error,
  user_strategic_choice)
- counterfactual_hint (ONLY if explicit or strongly implied)
- work_context_id, work_context_phase, jurisdiction_scope, related_actor_ids
- source

If root cause is external (outage, third-party delay, unavailable data), mark:
  root_cause_class = infrastructure_failure or external_factor or third_party_delay
and DO NOT convert this by itself into a durable lesson.

USER-TAUGHT LESSONS
When the user explicitly states a reusable takeaway ("next time…", "I should have…",
"don't do X; do Y", "lesson learned", "never again", "the takeaway is"):
- lesson_summary, what_happened, why_it_happened, what_to_do_instead
- applicable_contexts at the NARROWEST supported scope
- creation_trigger = user_taught

BEHAVIORAL ACTOR NOTES
Extract ONLY professional/operational behavior from the allowed list:
- workflow_preference, scheduling_habit, communication_pattern,
  decision_tendency, negotiation_style, procedural_preference
NEVER extract: protected characteristics, personality diagnoses, health/mental state,
private-life speculation, broad competence labels, physical appearance.
For each note: actor_ref, note, note_type, evidence_count, source_refs, confidence.

MEMORY DIRECTIVES
Output the full DOC72/KDA contract fields including:
- memory_type, summary, assertion_class, applies_when, does_not_apply_when,
  scope, priority_class, source_certainty

FINAL RULE
If unsure whether something is durable, extract as a suggestion/review candidate,
not as a strongly scoped durable rule.
```

---

The extraction pipeline classifies conversation capture scope separately from durable assertion class. `capture_scope = session | hypothetical | transient` is never graph-promoted. Only graph-eligible captures may carry a durable assertion class into the DOC1 Write Gate.

```ts
type CaptureScope =
  | "durable"
  | "contextual"
  | "session"
  | "hypothetical"
  | "transient";

type DurableAssertionClass =
  | "durable_fact"
  | "preference"
  | "constraint"
  | "standing_order"
  | "vocabulary_rule"
  | "heuristic";

type ConversationCaptureClassification = {
  capture_scope: CaptureScope;
  durable_assertion_class?: DurableAssertionClass;  // only set when capture_scope = "durable" | "contextual"
};
```

**Normative:**
- `capture_scope = session | hypothetical | transient` → never graph-promoted
- `capture_scope = durable | contextual` → may map into KDA/DOC1 durable assertion classes
- The extraction prompt (§20.4) classifies using `CaptureScope`; the Write Gate (DOC1) uses `DurableAssertionClass`
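
The normative rules above amount to a small gate in front of the Write Gate. The sketch below re-declares the two types for self-containment; the function names and error strings are illustrative assumptions.

```typescript
type CaptureScope = "durable" | "contextual" | "session" | "hypothetical" | "transient";

type DurableAssertionClass =
  | "durable_fact" | "preference" | "constraint"
  | "standing_order" | "vocabulary_rule" | "heuristic";

type ConversationCaptureClassification = {
  capture_scope: CaptureScope;
  durable_assertion_class?: DurableAssertionClass;
};

function isGraphEligible(scope: CaptureScope): boolean {
  return scope === "durable" || scope === "contextual";
}

// Flags classifications that attach a durable class to an ephemeral capture.
function validateClassification(c: ConversationCaptureClassification): string[] {
  const errors: string[] = [];
  if (!isGraphEligible(c.capture_scope) && c.durable_assertion_class !== undefined) {
    errors.push(`durable_assertion_class is not allowed when capture_scope = ${c.capture_scope}`);
  }
  return errors;
}

// What the Write Gate may see: undefined means the capture stays ephemeral.
function writeGateClass(c: ConversationCaptureClassification): DurableAssertionClass | undefined {
  return isGraphEligible(c.capture_scope) ? c.durable_assertion_class : undefined;
}
```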

### 20.5 Extraction Output Schema

```ts
type ChatExtractionResult = {
  extraction_id: string;
  source_chat_id: string;
  source_chat_summary: string;
  extracted_at: string;
  entity_candidates: EntityCandidate[];
  relationship_signals: Array<{
    src_hint: string;
    dst_hint: string;
    relationship_type: string;
    confidence: number;
  }>;
  preference_signals: Array<{
    description: string;
    scope: string;
    confidence: number;
  }>;
  procedural_signals: Array<{
    description: string;
    procedure_type: "skill" | "standing" | "guidance";
    confidence: number;
  }>;
  decision_signals: Array<{
    decision: string;
    reasoning: string;
    linked_entity_hints: string[];
    confidence: number;
  }>;
  correction_signals: Array<{
    what_was_corrected: string;
    old_value_hint?: string;
    new_value: string;
    target_entity_hint?: string;
    confidence: number;
  }>;
};
```

### 20.6 Promotion Pipeline

Extracted items go through the SAME promotion pipeline as all other graph candidates:
- Entity candidates → source gating → entity linking → promotion rules
- Memory candidates → DOC1 Write Gate (dedup, contradiction, maturity)
- Procedure candidates → skill learning pipeline
- Standing procedure candidates → governance flow (requires confirmation)
- Corrections → direct graph/memory updates with high confidence

### 20.7 Cross-Chat Pattern Detection

The nightly consolidation pass aggregates extractions from multiple recent chats:
- Same entity in 3+ chats → high promotion confidence
- Same person mentioned with same matter across chats → relationship signal strengthened
- Same workflow pattern multiple times → DOC1 generalization candidate
- Same correction multiple times → strong preference/constraint signal
- Same standing instruction repeated → should definitely be a standing procedure
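
The first rule, for example, reduces to counting distinct chats per entity. The `ChatMention` shape is a hypothetical flattening of per-chat `entity_candidates`; the other rules would follow the same grouping pattern with different keys.

```typescript
// Hypothetical flattened view of entity candidates across recent chats.
type ChatMention = { chat_id: string; entity_hint: string };

// Entities mentioned in at least `minChats` distinct chats, sorted by name.
function entitiesSeenAcrossChats(mentions: ChatMention[], minChats: number = 3): string[] {
  const chatsByEntity = new Map<string, Set<string>>();
  for (const m of mentions) {
    const chats = chatsByEntity.get(m.entity_hint) ?? new Set<string>();
    chats.add(m.chat_id); // repeated mentions within one chat count once
    chatsByEntity.set(m.entity_hint, chats);
  }
  const result: string[] = [];
  chatsByEntity.forEach((chats, entity) => {
    if (chats.size >= minChats) result.push(entity);
  });
  return result.sort();
}
```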

### 20.8 When Mining Runs

**Per-chat extraction:** Triggered on chat close when ANY of the following is true:
- chat length > 10 turns AND at least one substantive turn
- any known graph entity is referenced by the user
- any tool is executed
- the user explicitly states durable knowledge (corrections, decisions, "from now on...", standing instructions)

Short casual chats outside those conditions skip extraction entirely.
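
The per-chat trigger conditions can be sketched as a single predicate. The `ClosedChatStats` fields are hypothetical; how "substantive turn" or "durable statement" is detected is not defined here. The `learningPaused` parameter reflects the global pause toggle in §20.10.

```typescript
// Hypothetical summary computed when a chat closes.
type ClosedChatStats = {
  turn_count: number;
  substantive_turn_count: number;
  known_entity_referenced: boolean;
  tool_executed: boolean;
  durable_statement_detected: boolean; // corrections, decisions, "from now on..."
};

function shouldExtractOnClose(chat: ClosedChatStats, learningPaused: boolean): boolean {
  // The global pause toggle (§20.10) overrides every trigger condition.
  if (learningPaused) return false;
  const longAndSubstantive = chat.turn_count > 10 && chat.substantive_turn_count >= 1;
  return (
    longAndSubstantive ||
    chat.known_entity_referenced ||
    chat.tool_executed ||
    chat.durable_statement_detected
  );
}
```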

**Cross-chat consolidation:** Runs as part of DOC1's nightly performance loop.

### 20.9 Mining Queue Management

A bounded ring-buffer queue with priority ordering prevents out-of-memory (OOM) failures:

```
MAX_EXTRACTION_QUEUE = 50
if extraction_queue.length() > MAX_EXTRACTION_QUEUE:
    dropped_chat = extraction_queue.pop_lowest_priority()
    log_telemetry("intake.chat.extraction_dropped", {chat_id: dropped_chat.id})
```

High-priority: chats with known entity references and tool execution. Low-priority: short casual chats.

### 20.10 Global Pause Toggle

A GLOBAL "pause learning" toggle in settings for times when the user wants full privacy (sensitive client conversations, personal matters). Single switch, not per-conversation. When active, no conversation mining runs and no knowledge is extracted.

### 20.11 What Mining Does NOT Do

- Does not re-read every chat every night (maintains a processed-chat cursor)
- Does not extract transient task details or small talk
- Does not auto-create high-impact entities without promotion discipline
- Does not modify the graph during a live conversation
- Does not add hot-path latency to any user interaction
- Does not store full chat transcripts redundantly (references existing chat by ID)

---


### 20.12 Real-Time Agent-Initiated Knowledge Writes (Chat Intake)

In addition to post-chat mining, the agent can propose knowledge writes inline during conversation:

```ts
type ProposeKnowledgeWrite = {
  updates: Array<{
    type: "create_entity" | "create_relationship" | "update_entity" | "create_obligation";
    data: Record<string, unknown>;
    source: "user_stated_in_chat";
    confidence: number;
    capture_scope: CaptureScope;
    durable_assertion_class?: DurableAssertionClass;
  }>;
  inline_confirmation?: string;
};
```

**Capture-scope rule:** Only graph-eligible captures (`capture_scope = "durable" | "contextual"`) may carry `durable_assertion_class` into the DOC1 Write Gate. Session, hypothetical, and transient captures remain ephemeral and MUST NOT be graph-promoted.

**Maturity bypass for durable facts:** When the user explicitly states a fact in chat and the agent proposes it as a knowledge write with `capture_scope = "durable" | "contextual"`, `durable_assertion_class = "durable_fact"`, and `confidence >= 0.85`, the node enters at `active` maturity immediately (see §10.4). All other graph-eligible assertion classes enter at `observation` maturity.
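
The capture-scope rule and maturity bypass together determine how a proposed write enters the graph. In the sketch below, a `durable_fact` with confidence under 0.85 falls back to `observation`; the text does not state this case explicitly, so treat it as an assumption. The function and type names are illustrative.

```typescript
type CaptureScope = "durable" | "contextual" | "session" | "hypothetical" | "transient";

type DurableAssertionClass =
  | "durable_fact" | "preference" | "constraint"
  | "standing_order" | "vocabulary_rule" | "heuristic";

type ProposedWrite = {
  capture_scope: CaptureScope;
  durable_assertion_class?: DurableAssertionClass;
  confidence: number;
};

type EntryMaturity = "active" | "observation";

const DURABLE_FACT_BYPASS_CONFIDENCE = 0.85;

// Returns null when the write must stay ephemeral and never reach the graph.
function entryMaturity(write: ProposedWrite): EntryMaturity | null {
  const eligible = write.capture_scope === "durable" || write.capture_scope === "contextual";
  if (!eligible) return null;
  const bypass =
    write.durable_assertion_class === "durable_fact" &&
    write.confidence >= DURABLE_FACT_BYPASS_CONFIDENCE;
  // Assumption: a durable_fact below the threshold enters at observation.
  return bypass ? "active" : "observation";
}
```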

**Agent guardrails for chat intake:**
- CAN: write candidates at `suggested`, update experience records, create edges, write high-confidence durable facts
- CANNOT without confirmation: promote to `confirmed`, demote/archive confirmed knowledge
- CANNOT ever: override user corrections, modify user-stated provenance, change DOC1 standing orders

This populates the graph ~10x faster than post-chat mining, builds trust (user sees learning in real-time), and reduces extraction error because the LLM works with full conversational context.


---

## 20A. Knowledge Intake — Observation and Assessment

### 20A.1 The Core Distinction — Learning vs Access vs Episodic Recall

Content in the system exists in three persistence modes:

| Mode | What it means | How it works | Cost | Example |
|---|---|---|---|---|
| **Accessible** | ELNOR can find and retrieve this content on demand | File on disk, indexed by LlamaIndex/QMD, findable by name | Near zero | "Pull up the Brown complaint" |
| **Episodically Recallable** | ELNOR can recall prior conversations and their context | Thread checkpoints in conversation corpus with decisions, entities, unresolved items | Low | "What did we discuss about Henderson last Tuesday?" |
| **Learned (Graph Knowledge)** | ELNOR extracted structured knowledge and committed it to the graph | Entities, relationships, decisions, domain concepts as graph nodes with provenance | Expensive | "What legal issues did I argue in the Henderson MTD?" |

Episodic recall entries remain in the conversation corpus and are never automatically promoted to durable graph nodes unless the user explicitly says "learn from this conversation" or accepts a DOC8 suggestion.

**Governing principle:** ELNOR learns only from user actions that demonstrate intent (saving, citing, asking, editing intensively, deciding, or explicit "Focus" / "learn from this"). Passive viewing, glancing, or incidental dwell time produces only temporary access indexing, never durable knowledge nodes. This principle is enforced at the significance gate and is non-negotiable.

```ts
type KnowledgePersistenceMode =
  | "accessible_only"
  | "episodic_recall"
  | "candidate_graph"
  | "durable_graph";
```

### 20A.2 The Three-Stage Observation Pipeline

Every surface follows the same pipeline:

#### Stage 1 — DETECT (free)

EC listens to its existing command stream. No new event buses. The one exception is the Q Browser dwell timer (time between navigations on approved domains). This is the only new observation mechanism and is capped at SHALLOW extraction only (§20B.3). No deep extraction ever triggers from dwell alone.

#### Stage 2 — ASSESS (cheap, deterministic)

```ts
type IntakeSurface = "note" | "document" | "browser" | "chat" | "room" | "panel" | "forum" | "candor" | "task" | "email" | "bucket";

type IntakeObservationEnvelope = {
  observation_id: string;
  observed_at: string;
  surface: IntakeSurface;
  source_ref: string;
  source_version_id?: string;
  actor: "user" | "agent" | "system";
  memory_mode_snapshot: MemoryModeState;
  features: IntakeScoreFeatures;
  dedup_key: string;                // stable hash(surface + source_ref + source_version_id) — excludes trigger_kind to prevent double-extraction
};

type IntakeDispatchDecision = {
  observation_id: string;
  decision: "skip" | "shallow" | "deep" | "defer";
  queue_class: "critical_non_drop" | "high_priority" | "normal" | "background";
  reason_codes: string[];
  score: number;
  persistence_mode: KnowledgePersistenceMode;
  created_at: string;
};
```
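
The `dedup_key` comment above describes a stable hash over surface, source, and version. A minimal sketch of that idea, assuming a `trigger_kind` field exists on the caller's side and using FNV-1a purely as a stand-in for whatever stable hash the implementation chooses; all names here are hypothetical:

```typescript
type DedupInputSketch = {
  surface: string;
  source_ref: string;
  source_version_id?: string;
  trigger_kind?: string; // assumed field; deliberately NOT part of the key
};

// FNV-1a (32-bit) over UTF-16 code units. Stand-in hash, not the spec's choice.
function fnv1a32(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

function dedupKeyFor(obs: DedupInputSketch): string {
  // JSON.stringify of an array keeps field boundaries unambiguous.
  return fnv1a32(JSON.stringify([obs.surface, obs.source_ref, obs.source_version_id ?? ""]));
}

const seenDedupKeys = new Set<string>();

// Returns true when the same surface/source/version was already observed.
function isDuplicateObservation(obs: DedupInputSketch): boolean {
  const key = dedupKeyFor(obs);
  if (seenDedupKeys.has(key)) return true;
  seenDedupKeys.add(key);
  return false;
}
```

Because `trigger_kind` never enters the key, a save and a later focus-loss event on the same note version collapse into one observation.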

#### Stage 3 — EXTRACT (expensive, gated)

Shallow: regex + alias matching (no LLM, <100ms). Deep: background LLM call using contextual retrieval. Results pass through the entity pipeline, the DOC1 Write Gate, and versioned commits (§37).

**The funnel:** ~100-200 observations/day → ~25-50 pass gating → ~12-25 extractions → ~30-80 items become durable graph knowledge (a single extraction can yield several items).

### 20A.3 Who Runs Extraction

**EC decides. Background pipeline executes.** Extraction is NOT run by a named agent. The pipeline uses the cheapest adequate model under the governing extraction policy. Extraction prompts are spec-defined templates with tunable parameters. DOC8 may tune thresholds and selected parameter families but does not rewrite canonical prompt structure.

```ts
type PendingExtractionRecord = {
  pending_id: string;
  idempotency_key: string;
  source_ref: string;
  from_version_id?: string;
  to_version_id?: string;
  queued_at: string;
  updated_at: string;
  priority: "critical" | "normal" | "deferred";
  retries: number;
  max_retries: number;
  max_age_hours: number;
  status:
    | "queued"
    | "running"
    | "extracted"
    | "write_committed"
    | "audited"
    | "failed"
    | "stale_archived";
  error_code?: string;
};
```

```ts
async function recoverCrashedExtractions(): Promise<void> {
  const stuck = await db.all(
    `SELECT * FROM pending_extractions
     WHERE status = 'running' AND updated_at < datetime('now', '-10 minutes')`
  );
  for (const record of stuck) {
    if (record.retries < record.max_retries) {
      await db.run(
        `UPDATE pending_extractions SET status = 'queued', retries = retries + 1,
         updated_at = CURRENT_TIMESTAMP WHERE pending_id = ?`, [record.pending_id]
      );
    } else {
      await db.run(
        `UPDATE pending_extractions SET status = 'failed',
         error_code = 'crash_recovery_exhausted',
         updated_at = CURRENT_TIMESTAMP WHERE pending_id = ?`,
        [record.pending_id]
      );
    }
  }
}
```

**Normative rules:**
- Each extraction produces a single atomic `KnowledgeCommit`. Partial commits are not possible.
- `stale_archived` is forbidden for `priority = "critical"`.
- After `max_retries` are exhausted on `deep` extraction, downgrade to `shallow` and re-queue with a fresh retry count. If shallow also fails, mark as `stale_archived` and surface in the audit / health surfaces.
- Low-priority dropped work is recorded in the deferred extraction journal for nightly top-N requeue.

**Model fallback policy:** if the primary extraction model is unavailable or over budget, fall back according to the governed extraction policy. Never silently skip queued extraction. Record fallback reason in `ExtractionAuditRecord` (§38).

**Embedding model is NOT subject to fallback.** See §3.7 — the embedding model is infrastructure, not a configurable agent.

### 20A.4 Importance Scoring — Unified Across Surfaces

```ts
type IntakeScoreFeatures = {
  explicit_user_intent: number;      // 0-1: bookmark, save, ask, focus button, accept correction
  novelty_score: number;             // 0-1: new entity/concept vs existing graph
  domain_impact_score: number;       // 0-1: domain-specific signals from active profiles (§35)
  entity_centrality_score: number;   // 0-1: linked to active work context/goal
  action_strength_score: number;     // 0-1: annotate, resolve comment, accept recommendation
  matter_link_confidence: number;    // 0-1: confidence this belongs to an active work context
  duplication_risk: number;          // 0-1: semantic near-duplicate probability (POSITIVE value)
  privacy_risk: number;              // 0-1: privileged/sensitive scope (POSITIVE value)
};

type SurfaceWeightProfile = {
  surface: IntakeSurface;
  intent: number;
  novelty: number;
  domain: number;
  centrality: number;
  action: number;
  matter: number;
  dup: number;                       // Stored as POSITIVE, subtracted in formula
  privacy: number;                   // Stored as POSITIVE, subtracted in formula
  min_score_for_shallow: number;
  min_score_for_deep: number;
  schema_version: 1;
};

function scoreImportance(features: IntakeScoreFeatures, surface: IntakeSurface): number {
  // Hard overrides
  if (features.explicit_user_intent >= 0.9) return 1.0;
  if (features.privacy_risk >= 0.85 && features.explicit_user_intent < 0.9) return 0.0;

  const w = loadSurfaceWeightProfile(surface);
  const raw = (
    w.intent * features.explicit_user_intent +
    w.novelty * features.novelty_score +
    w.domain * features.domain_impact_score +
    w.centrality * features.entity_centrality_score +
    w.action * features.action_strength_score +
    w.matter * features.matter_link_confidence -
    w.dup * features.duplication_risk -
    w.privacy * features.privacy_risk
  );
  return Math.max(0, Math.min(1, raw));
}

// Default weights (DOC8 tunes per surface):
// intent: 0.22, novelty: 0.18, domain: 0.22, centrality: 0.14,
// action: 0.12, matter: 0.06, dup: 0.10, privacy: 0.08
// min_score_for_shallow: 0.45, min_score_for_deep: 0.75
```


**Explicit user intent mapping (normative):**

| User action | `explicit_user_intent` value |
|---|---|
| Focus button / "learn from this" | 1.0 (hard override) |
| Save as artifact/note | 0.95 (hard override, since it is >= 0.9) |
| Accept agent recommendation | 0.90 (hard override) |
| Bookmark (domain-relevant) | 0.85 |
| Ask Elnor about content | 0.80 |
| User correction | 1.0 (hard override + critical lane) |
| Annotate / highlight | 0.75 |
| None of the above | 0.0 |

**Clarifying note:** the privacy-risk hard override in `scoreImportance()` handles moderate privacy concerns where explicit user intent can still elevate the item. Absolute privilege and legal-hold rules are enforced separately in `resolveExtractionPolicy()` and cannot be overridden by focus or score.
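
A worked example may help when tuning weights. The sketch below restates the formula with the default weights inlined and maps the score to a dispatch decision using `min_score_for_shallow` / `min_score_for_deep`; that threshold mapping and all names ending in `Sketch` are assumptions for illustration.

```typescript
type ScoreFeaturesSketch = {
  explicit_user_intent: number; novelty_score: number; domain_impact_score: number;
  entity_centrality_score: number; action_strength_score: number; matter_link_confidence: number;
  duplication_risk: number; privacy_risk: number;
};

const DEFAULT_WEIGHTS_SKETCH = {
  intent: 0.22, novelty: 0.18, domain: 0.22, centrality: 0.14,
  action: 0.12, matter: 0.06, dup: 0.10, privacy: 0.08,
  min_score_for_shallow: 0.45, min_score_for_deep: 0.75,
};

function scoreWithDefaultWeights(f: ScoreFeaturesSketch): number {
  if (f.explicit_user_intent >= 0.9) return 1.0; // hard override: strong explicit intent
  if (f.privacy_risk >= 0.85) return 0.0;        // hard override: high privacy risk
  const w = DEFAULT_WEIGHTS_SKETCH;
  const raw =
    w.intent * f.explicit_user_intent + w.novelty * f.novelty_score +
    w.domain * f.domain_impact_score + w.centrality * f.entity_centrality_score +
    w.action * f.action_strength_score + w.matter * f.matter_link_confidence -
    w.dup * f.duplication_risk - w.privacy * f.privacy_risk;
  return Math.max(0, Math.min(1, raw));
}

// Assumed mapping from score to dispatch decision.
function dispatchFromScore(score: number): "skip" | "shallow" | "deep" {
  if (score >= DEFAULT_WEIGHTS_SKETCH.min_score_for_deep) return "deep";
  if (score >= DEFAULT_WEIGHTS_SKETCH.min_score_for_shallow) return "shallow";
  return "skip";
}

// "Ask Elnor about content" (0.80) on moderately novel, domain-relevant material.
const exampleFeatures: ScoreFeaturesSketch = {
  explicit_user_intent: 0.8, novelty_score: 0.6, domain_impact_score: 0.7,
  entity_centrality_score: 0.5, action_strength_score: 0.3, matter_link_confidence: 0.5,
  duplication_risk: 0.2, privacy_risk: 0.1,
};
const exampleScore = scoreWithDefaultWeights(exampleFeatures); // about 0.546
const exampleDecision = dispatchFromScore(exampleScore);       // falls in the shallow band
```

With the default weights the positive terms sum to at most 0.94, so a `deep` result without the intent override needs most features to be high at once.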

### 20A.5 First-Week Calibration

During the first 30 days, use lowered thresholds (0.65 deep, 0.35 shallow, versus the 0.75 / 0.45 defaults in §20A.4) so that more observations reach extraction. The first-week budget is elevated to 40 deep slots. DOC8 calibrates to normal levels as confirmation rates validate. Add first-run messaging: "Elnor is learning your workspace. This is more active than usual and will settle down within a week."

### 20A.6 Policy Precedence Matrix

```ts
type ExtractionPolicyResult = {
  allow: boolean;
  decision: "skip" | "shallow" | "deep" | "defer";
  reason_code: "incognito_mode" | "extraction_disabled" | "privilege_block" |
               "critical_lane" | "budget_deferred" | "score_dispatch" | "focus_override";
};

function resolveExtractionPolicy(ctx: {
  mode: MemoryModeState;
  eventClass: "critical" | "normal";
  budgetOk: boolean;
  scoreDecision: "skip" | "shallow" | "deep";
  focusPressed: boolean;
}): ExtractionPolicyResult {
  // Evaluation order — strict precedence:
  // 1. incognito → FORCE_SKIP (Focus button cannot override)
  // 2. suppress_extraction → FORCE_SKIP
  // 3. privilege/legal_hold policy → FORCE_SKIP
  // 4. critical event class → allow in critical lane
  // 5. budget exceeded → DEFER or SHALLOW
  // 6. significance dispatch via scoreImportance()
  // 7. Focus button elevates to DEEP only at step 6, never above 1-3
}
```
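
The precedence comments above leave the body open. One possible reading is sketched below. It assumes `MemoryModeState` exposes three boolean flags with the names shown, that a critical event is extracted at least at `shallow`, and that an exceeded budget defers rather than downgrades. None of these choices are fixed by the spec.

```typescript
// Assumed shape; the real MemoryModeState is defined elsewhere in the spec set.
type MemoryModeSketch = { incognito: boolean; suppress_extraction: boolean; privilege_or_legal_hold: boolean };

type PolicyResultSketch = {
  allow: boolean;
  decision: "skip" | "shallow" | "deep" | "defer";
  reason_code: string;
};

function resolveExtractionPolicySketch(ctx: {
  mode: MemoryModeSketch;
  eventClass: "critical" | "normal";
  budgetOk: boolean;
  scoreDecision: "skip" | "shallow" | "deep";
  focusPressed: boolean;
}): PolicyResultSketch {
  // Steps 1-3: force-skips that nothing below can override, including Focus.
  if (ctx.mode.incognito) return { allow: false, decision: "skip", reason_code: "incognito_mode" };
  if (ctx.mode.suppress_extraction) return { allow: false, decision: "skip", reason_code: "extraction_disabled" };
  if (ctx.mode.privilege_or_legal_hold) return { allow: false, decision: "skip", reason_code: "privilege_block" };

  // Step 4: critical lane is never dropped (assumed floor: shallow).
  if (ctx.eventClass === "critical") {
    const decision = ctx.scoreDecision === "skip" ? "shallow" : ctx.scoreDecision;
    return { allow: true, decision, reason_code: "critical_lane" };
  }

  // Step 5: budget exceeded (this sketch always defers).
  if (!ctx.budgetOk) return { allow: true, decision: "defer", reason_code: "budget_deferred" };

  // Steps 6-7: score dispatch, with Focus elevating to deep.
  if (ctx.focusPressed) return { allow: true, decision: "deep", reason_code: "focus_override" };
  return { allow: ctx.scoreDecision !== "skip", decision: ctx.scoreDecision, reason_code: "score_dispatch" };
}
```

Writing the checks as early returns keeps the evaluation order identical to the numbered list, which makes the precedence auditable line by line.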

### 20A.7 Domain Signal Integration

Domain signal profiles (§35) contribute to importance scoring by matching surface-specific patterns against the content delta:

```ts
const domain_hits = activeProfiles.flatMap(profile =>
  profile.high_value_patterns
    .filter(p => new RegExp(p.pattern, 'i').test(delta.added_content))
    .map(p => ({ pattern_name: p.pattern_name, weight: p.weight, profile_id: profile.profile_id }))
);
const domain_impact_score = Math.min(1.0, domain_hits.reduce((sum, h) => sum + h.weight, 0));
```
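
Profile patterns are data, so a malformed pattern could throw inside `new RegExp(...)`. The self-contained variant below shows one way to guard against that; treating an invalid pattern as a non-match is an assumption, and the type names are hypothetical.

```typescript
type HighValuePatternSketch = { pattern_name: string; pattern: string; weight: number };
type DomainProfileSketch = { profile_id: string; high_value_patterns: HighValuePatternSketch[] };

// An invalid pattern counts as "no match" instead of aborting scoring.
function safePatternTest(pattern: string, text: string): boolean {
  try {
    return new RegExp(pattern, "i").test(text);
  } catch {
    return false;
  }
}

function domainImpactScoreFor(profiles: DomainProfileSketch[], addedContent: string): number {
  let total = 0;
  for (const profile of profiles) {
    for (const p of profile.high_value_patterns) {
      if (safePatternTest(p.pattern, addedContent)) total += p.weight;
    }
  }
  return Math.min(1.0, total); // same cap as the snippet above
}

const legalProfileSketch: DomainProfileSketch = {
  profile_id: "legal_example",
  high_value_patterns: [
    { pattern_name: "dispositive_motion", pattern: "motion to dismiss", weight: 0.4 },
    { pattern_name: "deadline_language", pattern: "due by|deadline", weight: 0.3 },
    { pattern_name: "broken_pattern", pattern: "(", weight: 0.9 }, // invalid regex, ignored
  ],
};
const sampleImpact = domainImpactScoreFor([legalProfileSketch], "Draft Motion to Dismiss, due by Friday");
```

Here the two valid patterns match and the broken one contributes nothing, so the result stays below the 1.0 cap.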

### 20A.8 Cost and Volume Governance

**Calibration budget rule:** historical calibration sweeps and active-work-context bootstrap sweeps use a separate calibration budget. They do not draw from the ordinary daily extraction budget.

**Daily volume estimate (typical workday):**

| Surface | Observations/day | After gating | Deep | Shallow | Cost |
|---|---|---|---|---|---|
| Notes | 10-20 | 3-8 | 2-5 | 2-3 | 20-150s |
| Browser | 20-40 | 2-5 | 1-3 | 1-2 | 10-90s |
| Documents | 5-15 | 3-8 | 2-4 | 2-3 | 20-120s |
| File watcher | 20-50 | 2-5 | 1-3 | 1-2 | 10-90s |
| Chats | 3-5 | 2-4 | 2-4 | 0-1 | 20-120s |
| Tasks | 10-50 | 1-5 | 0-2 | 1-3 | 0-60s |
| Rooms | 1-3 | 1-2 | 1-2 | 0-1 | 10-60s |
| Panels/CANDOR | 0-2 | 0-2 | 0-2 | 0 | 0-60s |
| Emails | 0-5 | 0-5 | 0-2 | 0-5 | 0s–60s depending on `extraction_method` |
| Sonar | 0-3 | 0-3 | 0-3 | 0 | 0-90s |
| **Total** | **~100-200** | **~15-50** | **~10-25** | **~8-20** | **~2-14 min** |

Extraction runs under a hard daily cost ceiling (token, dollar, and GPU caps). Model fallback applies to extraction, NOT to embedding. Non-critical deep extraction auto-pauses when foreground latency rises.

### 20A.9 What the System Sees vs What It Remembers

**What the system SEES (observation — free):** Every note save, every page visit, every document open, every task run, every room message.

**What the system NOTICES (significance gating — cheap):** A subset of observations that pass deterministic significance rules. ~25-50% of events.

**What the system ANALYZES (extraction — expensive):** A smaller subset dispatched for LLM extraction. ~15-25% of observations.

**What the system REMEMBERS (promotion — selective):** An even smaller subset that passes DOC1 Write Gate and promotion rules. ~30-50% of extracted items become durable graph knowledge.

The funnel: **100-200 observations → 25-50 significant → 12-25 extractions → graph writes on roughly 30-80 items per day.**

### 20A.10 Significance Rule Tuning

**R5.7 edge-case protections:**
1. Dwell time alone is insufficient for high-significance document extraction; combine dwell with scroll/focus signals.
2. A small edit that changes a date, name, or numeric value on a known entity is high significance regardless of delta size.
3. Duplicate suppression is overridden when the observation carries a status change (deadline moved, obligation updated, version superseded, etc.).

Significance thresholds are configurable, not hardcoded. Initial defaults from this spec, adjusted through DOC8 feedback:

- If extraction consistently produces no new knowledge from a surface, DOC8 proposes raising the significance threshold.
- If the user corrects something that was present in a skipped observation, DOC8 flags the missed extraction and proposes lowering the threshold.
- Significance rules are stored in `ELNOR_MEMORY/config/significance_rules.json` and editable through Q Settings or conversationally.


---

## 20B. Knowledge Intake — Surface-Specific Contracts

### 20B.1 The Multi-Participant Attribution Problem

In a two-party chat, extraction is simple: the user said it, so it's high-authority. In a multi-agent room, panel, or forum, multiple participants speak. The extraction system MUST attribute every extracted signal to a specific participant:

```ts
type AttributedExtraction = {
  extracted_content: string;
  attribution: {
    participant_id: string;
    participant_type: "user" | "agent";
    participant_name: string;
    authority_level: "directive" | "recommendation" | "observation";
  };
  confidence: number;
  context_turn_ref?: string;
};
```

**Authority rules:**

| Participant type | What they said | Authority level | Graph treatment |
|---|---|---|---|
| **User** | Statement of fact, preference, decision, correction | `directive` — highest authority | Creates/updates graph directly |
| **User** | Question, hypothetical, request for analysis | Not extracted as knowledge | Context only — no graph write |
| **Agent (accepted)** | Agent recommendation that user explicitly accepted | `directive` — becomes user-endorsed | Creates graph node with dual provenance |
| **Agent (uncontested)** | Agent statement that user didn't contradict | `observation` — low authority | Creates candidate at `suggested` state |
| **Agent (contested)** | Agent statement that user pushed back on | Not extracted as positive knowledge | May generate a correction signal |
| **Agent (rejected)** | Agent recommendation user explicitly rejected | Negative signal | Records as rejected option |

**The critical rule:** Agent speech is NEVER treated as user-endorsed knowledge unless the user explicitly accepted it. Silence is NOT acceptance.
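
The authority table can be expressed as a small lookup. A minimal sketch, assuming the extractor has already classified each utterance and the user's reaction; the `kind`, `user_reaction`, and `graph_treatment` labels are illustrative names, not contract values.

```typescript
type UtteranceSketch = {
  participant_type: "user" | "agent";
  kind: "assertion" | "question_or_hypothetical";
  user_reaction: "accepted" | "rejected" | "contested" | "none"; // only meaningful for agent speech
};

type AuthorityOutcomeSketch = {
  authority_level: "directive" | "observation" | null; // null: not extracted as positive knowledge
  graph_treatment:
    | "write_direct" | "context_only" | "write_dual_provenance"
    | "candidate_suggested" | "correction_signal" | "record_rejected_option";
};

function resolveAuthority(u: UtteranceSketch): AuthorityOutcomeSketch {
  if (u.participant_type === "user") {
    return u.kind === "question_or_hypothetical"
      ? { authority_level: null, graph_treatment: "context_only" }
      : { authority_level: "directive", graph_treatment: "write_direct" };
  }
  switch (u.user_reaction) {
    case "accepted":
      return { authority_level: "directive", graph_treatment: "write_dual_provenance" };
    case "rejected":
      return { authority_level: null, graph_treatment: "record_rejected_option" };
    case "contested":
      return { authority_level: null, graph_treatment: "correction_signal" };
    default:
      // Silence is NOT acceptance: uncontested agent speech stays a low-authority candidate.
      return { authority_level: "observation", graph_treatment: "candidate_suggested" };
  }
}
```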

### 20B.2 Notes (DOC20) — `intake.note.content_analyzed`

**Session-close triggers:** 20-minute timeout (configurable 10/20/30). OS focus loss as early trigger (5 min) where supported. Note-to-note switching within Notes surface does NOT trigger close.

```ts
type NoteAssessmentPointer = {
  note_id: string;
  last_knowledge_assessed_version_id: string;
  pending_extraction_id?: string;
  extraction_baseline_version_id?: string;
  updated_at: string;
};
```

Pointer advances ONLY after successful extraction. If extraction is in flight, defer new assessments.

**Note type classification:**

```ts
type NoteTypeClassification = {
  note_type: string;
  source: "universal_rule" | "domain_profile" | "learned_cluster";
  confidence: number;
  supporting_signals: string[];
};
```

Universal types: `today_note`, `scratch`, `planning`, `meeting_notes`, `personal`, `user_flagged`, `reference_dump`. Domain-derived types (active when relevant profile is active): `legal_research`, `matter_working`, `strategy_note`, `hearing_prep`, `authority_bank`, `client_intake`, `technical`.

**Significance assessment:**

```ts
function onNoteSessionClose(noteId: string) {
  const current = loadCurrentNoteVersion(noteId);
  const prior = loadLastKnowledgeAssessedVersion(noteId);
  // NoteAssessmentPointer for this note (defined above); the loader name is illustrative.
  const pointer = loadNoteAssessmentPointer(noteId);

  // An extraction is already in flight for this note: defer instead of assessing again.
  if (pointer.extraction_baseline_version_id) {
    return dispatch("defer");
  }

  // First assessment: skip placeholders, otherwise extract the whole note.
  if (!prior) {
    if (current.char_count < 200) return markAssessed(noteId, current.version_id, "skip_placeholder");
    return queueExtraction(noteId, undefined, current.version_id, "deep_first_write");
  }

  // Micro-edits are marked assessed without extraction.
  const delta = diff(prior.content, current.content);
  const deltaBytes = byteLength(delta.added_content);
  if (deltaBytes < 50) return markAssessed(noteId, current.version_id, "skip_micro_edit");

  // User-flagged notes always go deep.
  const entityHits = scanAliasIndex(delta.added_content);
  const noteType = classifyNoteType(noteId, current, entityHits);
  if (noteType.note_type === "user_flagged") {
    return queueExtraction(noteId, prior.version_id, current.version_id, "deep_user_flagged");
  }

  const semanticDiff = detectSemanticDiff(delta);
  const features = computeNoteFeatures(noteType, delta, entityHits, semanticDiff, activeProfiles);
  const decision = dispatchScore(features, "note");

  // Avoid a second extraction in the same session for a small follow-up edit.
  if (hasPriorDeepInSession(noteId) && deltaBytes < 200) {
    return markAssessed(noteId, current.version_id, "skip_double_extract");
  }

  if (decision === "deep" || decision === "shallow") {
    // Record the baseline so later session closes defer until this extraction completes.
    pointer.extraction_baseline_version_id = current.version_id;
    return queueExtraction(noteId, prior.version_id, current.version_id, decision);
  }
  return markAssessed(noteId, current.version_id, "skip_scored");
}
```

**Semantic diff signals:**

```ts
type SemanticDiffSignals = {
  added_content: string;
  removed_content: string;
  changed_dates: boolean;
  changed_citations: boolean;
  polarity_shift_detected: boolean;
  conclusion_shift_detected: boolean;
  deadline_shift_detected: boolean;
};
```

Any semantic reversal forces at least SHALLOW, often DEEP.
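
The "at least SHALLOW" floor can be applied as a post-step on the scored decision. In this sketch a semantic reversal is assumed to mean a polarity or conclusion shift; when to escalate further to DEEP is left to the scoring path.

```typescript
type DepthDecisionSketch = "skip" | "shallow" | "deep";

type ReversalSignalsSketch = {
  polarity_shift_detected: boolean;
  conclusion_shift_detected: boolean;
};

// Raises "skip" to "shallow" when a reversal is present; never lowers a decision.
function applyReversalFloor(
  decision: DepthDecisionSketch,
  signals: ReversalSignalsSketch,
): DepthDecisionSketch {
  const reversal = signals.polarity_shift_detected || signals.conclusion_shift_detected;
  if (reversal && decision === "skip") return "shallow";
  return decision;
}
```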

**Block-level extraction guidance:** Additive context for prompts, not a filter. Text blocks = highest priority. Heading blocks = section context. TaskList = obligations/priorities. Citation blocks = authorities. Resolved agent comments = user-endorsed. Unresolved = skip. Rejected tracked changes = negative signal.

**To-do module as priority declaration:** To-do items with matter linkage AND obligation language spawn `Obligation` nodes. General personal to-dos stay as episodic context. Today Note to-dos: same-day context, expire at midnight.

**Contextual retrieval for diffs:** Full prior version as baseline. LLM extracts from diff WITH full context.

**Deep extraction prompt:** Includes NOVELTY GATE (max 80 tokens compact known entities summary), domain_extraction_guidance from active profiles, semantic reversal detection, and quote annotation detection.

### 20B.3 Q Browser (DOC20) — `intake.browser.*`

Deep extraction triggers ONLY on explicit engagement: bookmark (only when domain_impact_score or matter_link_confidence is above threshold), save as artifact/note, Ask Elnor, Focus button, or quote/cite into work product. Dwell time alone NEVER triggers deep extraction — SHALLOW only per §20A.2.

**Default significance rules:**

| Signal | Assessment | Rationale |
|---|---|---|
| Page visited, left within 30 seconds | `skip` | Glance — not meaningful engagement |
| Page visited, dwell >2 minutes on approved domain | `shallow` | Moderate engagement — scan for entity mentions only |
| User bookmarked page (with domain/matter relevance) | `deep` | Explicit save signal — user thinks this matters |
| User saved page as artifact or note | `deep` | Strongest signal — user explicitly preserved content |
| User asked Elnor about page content | `deep` | User engaged Elnor — the interaction itself is worth capturing |
| User pressed Focus button | `deep` | Explicit "learn from this" signal |
| Page on non-approved domain | `skip` | Domain consent model (§39.4) — don't capture without approval |
| Same page visited 3+ times in 7 days | Upgrade previous `shallow` to `deep` | Repeated visits = significance |

**Key R5.4 policy change from R5.3:** Dwell time alone (even >5 minutes) produces SHALLOW only, never deep. R5.3 had dwell >5 min → deep, which was too aggressive — users read pages without intending to memorize them. Deep extraction requires an explicit engagement action.
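
The default browser rules can be sketched as one function. Field names are hypothetical. Two gaps in the table are filled by assumption: a dwell between 30 seconds and 2 minutes is treated as `skip`, and the non-approved-domain check runs before any engagement signal.

```typescript
type BrowserVisitSketch = {
  approved_domain: boolean;
  dwell_seconds: number;
  bookmarked_with_relevance: boolean;   // bookmark AND domain/matter relevance above threshold
  saved_as_artifact_or_note: boolean;
  asked_elnor: boolean;
  focus_pressed: boolean;
  quoted_into_work_product: boolean;
  visits_last_7_days: number;
};

function assessBrowserVisit(v: BrowserVisitSketch): "skip" | "shallow" | "deep" {
  // Domain consent comes first (assumed ordering).
  if (!v.approved_domain) return "skip";

  // Deep requires an explicit engagement action.
  const explicitEngagement =
    v.bookmarked_with_relevance || v.saved_as_artifact_or_note ||
    v.asked_elnor || v.focus_pressed || v.quoted_into_work_product;
  if (explicitEngagement) return "deep";

  // Dwell alone is capped at shallow; repeated visits upgrade shallow to deep per the table.
  if (v.dwell_seconds > 120) return v.visits_last_7_days >= 3 ? "deep" : "shallow";

  return "skip";
}
```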

Research Session button. Domain suggestions after 10+ visits. Research session summary always generated on session close regardless of budget.

### 20B.4 Document Viewer & File System — `intake.document.content_analyzed`

**Default significance rules (Document Viewer):**

| Signal | Assessment | Rationale |
|---|---|---|
| Document opened and closed within 10 seconds | `skip` | Quick glance — wrong document, or just checking title |
| Document viewed for 10-60 seconds | `shallow` | Brief review — capture identity and entity mentions |
| Document viewed for >60 seconds | `deep` | Meaningful reading — full content extraction |
| User annotated (comment, highlight) | `deep` | Active engagement — extract annotation context |
| User used "Ask Elnor" on document content | `deep` | Elnor interaction — already has content in context |
| Document is a new version of known work_product | `deep` | Version tracking — compare with prior knowledge |
| Document is from opposing counsel or court | `deep` | High-value external content |

**Precomputed-summary rule:** document-intelligence precomputed summaries and page extractions are cache/read-model artifacts first. Entities, relations, obligations, or concepts derived from them still pass through DOC72 significance gates and the DOC1 Write Gate before entering canonical graph truth.

**Document type multi-signal classification:**

```ts
type DocumentClassification = {
  document_kind: "own_work" | "opposing_work" | "court_order" | "client_doc" | "reference" | "research_notebook" | "unknown";
  party_role: "self" | "firm_colleague" | "co_counsel" | "opposing" | "court" | "vendor";
  confidence: number;
  basis: string[];
  recheck_if_new_evidence: boolean;
};

type DocumentClassificationRule = {
  rule_id: string;
  sender_pattern?: string;
  subject_pattern?: string;
  attachment_name_pattern?: string;
  mime_type?: string;
  inferred_kind: string;
  inferred_party_role: string;
  confidence: number;
  sample_count: number;
  ttl_days: number;
};
```

Signal chain: filename → folder → email provenance → author metadata → first-page scan → ask once if ambiguous. Default for ambiguous: "external work product." Classification rules scoped by sender + subject + mime + TTL.

OneNote sections (`.one` files, OneDrive-synced notebooks) recognized as `research_notebook` with `party_role: "self"`. At minimum Tier 1 ambient indexing.

**File activity tracking:**

```ts
type FileActivityRecord = {
  file_path: string;
  file_name: string;
  folder_path: string;
  matter_id?: string;
  first_seen_at: string;
  last_modified_at: string;
  significant_modification_count: number;
  last_significant_size: number;
  last_text_content_hash?: string;
  text_delta_chars?: number;
  version_files?: string[];
  attention_tier: 1 | 2;
};
```

For `.docx` files: compare extracted text content, not raw ZIP file size. Significant if `text_delta_chars > 200`. For non-docx: `size_delta > 500 bytes`.
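
A minimal sketch of the significance test above. The input shape is hypothetical, and treating a shrink as significant (absolute size delta) is an assumption, since the rule only states `size_delta > 500 bytes`.

```typescript
type FileChangeSketch = {
  file_name: string;
  text_delta_chars?: number; // extracted-text delta, available for .docx
  size_delta_bytes?: number; // raw size delta, used for everything else
};

function isSignificantFileChange(change: FileChangeSketch): boolean {
  const isDocx = /\.docx$/i.test(change.file_name);
  if (isDocx) {
    // .docx is a ZIP container, so raw size is a poor proxy for text change.
    return (change.text_delta_chars ?? 0) > 200;
  }
  // Assumption: growth and shrinkage both count.
  return Math.abs(change.size_delta_bytes ?? 0) > 500;
}
```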

**Work-product concept-link rule:** when deep extraction identifies domain concepts in a document, EC SHALL create canonical `addresses_concept` edges from the `work_product` node to the relevant `domain_concept` nodes. `WorkProductPayload.addressed_concepts` is rebuilt from those edges for fast rendering and MUST NOT drift into an independent truth store.

### 20B.5 DOC23 Tasks — `intake.task.execution_completed`

```ts
type TaskExecutionIntake = {
  task_id: string;
  task_name: string;
  execution_id: string;
  status: "success" | "failure" | "partial_success" | "timeout" | "cancelled";
  executed_at: string;
  duration_ms: number;
  input_entity_refs: string[];
  output_entity_refs: string[];
  output_artifacts?: string[];
  entity_state_changes?: Array<{
    entity_id: string;
    change_description: string;
  }>;
  failure_module_id?: string;
  failure_reason?: string;
  error_code?: string;
  originated_from_standing_procedure?: string;
};
```

Failure → ALWAYS deep. Repeated failures → DOC8 pattern promotion to standing procedure caution. First success → DEEP. Routine success → SKIP. Success with state changes → SHALLOW.
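
These dispatch rules can be sketched as follows. The rule text only names failure and success, so this sketch assumes every non-`success` status is handled like a failure, and that the caller supplies how many prior successes the task has. Both are assumptions.

```typescript
type TaskRunSketch = {
  status: "success" | "failure" | "partial_success" | "timeout" | "cancelled";
  has_entity_state_changes: boolean;
  prior_success_count: number; // assumed to be tracked by the caller
};

function assessTaskExecution(run: TaskRunSketch): "skip" | "shallow" | "deep" {
  // Failure is always deep (non-success statuses are grouped with failure here).
  if (run.status !== "success") return "deep";

  // First success is deep; later successes depend on state changes.
  if (run.prior_success_count === 0) return "deep";
  if (run.has_entity_state_changes) return "shallow";
  return "skip";
}
```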

### 20B.6 DOC12 Rooms — `intake.room.extracted`

| Who said it | User reaction | Weight | Treatment |
|---|---|---|---|
| **User** stated | N/A | 0.90 | High-confidence write |
| **Agent** recommended | User accepted | 0.80 | Write with dual provenance |
| **Agent** stated | No response, <3 agents agree | 0.40 | `suggested` only |
| **Agent** stated | No response, 3+ agents agree | 0.55 | `suggested` + "review-needed" flag |
| **Agent** stated | User contradicted | Not positive | User's correction extracted |

Silence is NOT acceptance. `turn_count > 20` override applies ONLY to rooms where user participated. Agent-only rooms → SHALLOW regardless of length.
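
The room weighting table can be sketched as a lookup. Field and treatment names below are illustrative; the weights are the ones listed in the table.

```typescript
type RoomSignalSketch = {
  speaker: "user" | "agent";
  user_reaction: "accepted" | "contradicted" | "none";
  agreeing_agents: number; // agents making the same statement
};

type RoomTreatmentSketch = {
  weight: number | null; // null: not extracted as positive knowledge
  treatment: "high_confidence_write" | "dual_provenance_write" | "suggested"
    | "suggested_review_needed" | "extract_user_correction";
};

function weighRoomSignal(s: RoomSignalSketch): RoomTreatmentSketch {
  if (s.speaker === "user") return { weight: 0.9, treatment: "high_confidence_write" };
  if (s.user_reaction === "accepted") return { weight: 0.8, treatment: "dual_provenance_write" };
  if (s.user_reaction === "contradicted") return { weight: null, treatment: "extract_user_correction" };
  // No user response: agreement among agents raises the weight but never past `suggested`.
  return s.agreeing_agents >= 3
    ? { weight: 0.55, treatment: "suggested_review_needed" }
    : { weight: 0.4, treatment: "suggested" };
}
```

Even three agreeing agents stay below the accepted-recommendation weight, which keeps agent consensus from substituting for user endorsement.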

### 20B.7 Panels and Forums — `intake.panel.extracted`

Extraction focuses on panel output, not deliberation. A `pending` user disposition becomes `stale_suggested` after 7 days.

```ts
type PanelExtractionResult = {
  panel_id: string;
  panel_purpose: string;
  linked_entity_ids: string[];
  panel_recommendation?: {
    recommendation: string;
    confidence: number;
    supporting_agents: string[];
    dissenting_agents?: string[];
    dissent_reasoning?: string[];
  };
  entity_assessments: Array<{
    entity_id: string;
    assessment: string;
    assessing_agent_id: string;
    confidence: number;
  }>;
  domain_concept_signals: Array<{
    concept_ref: string;
    signal: string;
    signal_type: "strength" | "weakness" | "gap" | "contradiction";
    attributing_agents: string[];
  }>;
  user_disposition?: "accepted" | "rejected" | "modified" | "pending";
  user_modification?: string;
};
```

### 20B.8 CANDOR Sessions — `intake.candor.findings_extracted`

```ts
const BASE_CANDOR_WEIGHT = 0.75;
const candorBetaWeight: Record<string, number> = {
  critical: 2.0, major: 1.5, minor: 0.5, observation: 0.25,
};
```

`authority_contradicted` → set `still_current` to `"uncertain"` (NOT directly "overruled"). Requires second independent signal to move to "overruled". "No findings" does NOT auto-increment α.
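
The two-signal rule for `still_current` can be sketched as below. The `"current"` label for the untouched state and the use of distinct source identifiers as the test for independence are both assumptions made for illustration.

```typescript
// "uncertain" and "overruled" come from the rule above; "current" is an assumed label.
type StillCurrentSketch = "current" | "uncertain" | "overruled";

// contradictingSourceIds: one entry per authority_contradicted signal,
// identified by where it came from (e.g. a CANDOR session id).
function stillCurrentAfter(contradictingSourceIds: string[]): StillCurrentSketch {
  const independentSignals = new Set(contradictingSourceIds).size;
  if (independentSignals === 0) return "current";
  if (independentSignals === 1) return "uncertain"; // a single signal never overrules
  return "overruled";
}
```

Repeated findings from the same source collapse to one signal, so a single session cannot overrule an authority by itself.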

```ts
type CANDORExtractionResult = {
  session_id: string;
  session_type: "red_team" | "review" | "analysis";
  target_work_product_id: string;
  target_matter_id?: string;
  reviewing_agents: string[];
  findings: CANDORFinding[];
  overall_confidence: number;
  overall_recommendation?: string;
};

type CANDORFinding = {
  finding_id: string;
  finding_type: "argument_weakness" | "factual_gap" | "citation_missing" |
                "authority_contradicted" | "logic_flaw" | "strategic_risk" |
                "style_issue" | "procedural_error" | "strength_confirmed";
  severity: "critical" | "major" | "minor" | "observation";
  description: string;
  affected_concept_ids?: string[];
  affected_work_product_id?: string;
  affected_procedure_ids?: string[];
  citation_ref?: string;
  contradicting_authority?: string;
  finding_agent_id: string;
  concurring_agents?: string[];
};
```

### 20B.9 Structured Email Parsing

```ts
type StructuredEmailTemplate = {
  template_id: string;
  template_name: string;
  template_version: number;
  sender_pattern: string;
  subject_pattern?: string;
  body_pattern?: string;
  extraction_method: "regex" | "llm_schema";
  field_extractors?: Array<{ field_name: string; extraction_regex: string; entity_kind?: string }>;
  extraction_schema_ref?: string;
  llm_extraction_prompt?: string;
  default_matter_linking: "timing" | "client_id_field" | "subject_line" | "none";
  create_entity_kind?: string;
  false_positive_guardrail?: string;
};
```

Westlaw: regex (stable). ECF/CourtDrive: llm_schema (volatile). Before creating entities, check existing graph nodes. DOC8 detects recurring structured emails and proposes new templates. The Westlaw Client ID field is NEVER prompted — optional bonus signal only.

### 20B.10 Focus Button

Single consistent button on Notes, Document Viewer, Q Browser. Respects incognito precedence (§20A.6). Cooldown: if same content Focus-extracted within last hour, surface note rather than re-extracting.
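
The one-hour cooldown can be sketched with a map keyed by content hash. This assumes the incognito check of §20A.6 has already run; the in-memory store and the names are hypothetical.

```typescript
const FOCUS_COOLDOWN_MS = 60 * 60 * 1000; // one hour

// content hash -> time of the last Focus-triggered extraction (epoch ms)
const lastFocusExtractionAt = new Map<string, number>();

function onFocusPressed(contentHash: string, nowMs: number): "extract" | "surface_notice" {
  const last = lastFocusExtractionAt.get(contentHash);
  if (last !== undefined && nowMs - last < FOCUS_COOLDOWN_MS) {
    // Same content was Focus-extracted within the last hour: tell the user instead.
    return "surface_notice";
  }
  lastFocusExtractionAt.set(contentHash, nowMs);
  return "extract";
}
```

Keying on the content hash rather than the document ID lets an edited document be re-extracted immediately.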

### 20B.11 Provenance Entry Types for Surface Extraction

```ts
// Additional entry_type values for surface-specific extraction:
| "learned_from_note"              // Extracted from DOC20 note content
| "learned_from_document_view"     // Extracted from Document Viewer session
| "learned_from_room"              // Extracted from DOC12 room conversation
| "learned_from_panel"             // Extracted from DOC12 panel
| "learned_from_candor"            // Extracted from DOC14 CANDOR session
| "learned_from_task_execution"    // Extracted from DOC23 task run
| "accepted_from_agent"            // Agent proposed, user accepted
| "agent_observation"              // Agent stated, user did not contradict
| "learned_from_bucket_file"       // Extracted from DOC7 context bucket file
```

### 20B.12 Passive-Signal Exceptions

Bypass engagement gate to SHALLOW: Westlaw email, file in Tier 2 folder, calendar event. Respect memory mode gating — incognito blocks even passive signals.

### 20B.13 Extraction Scheduling Summary

| Surface | Trigger | Timing tier | Volume estimate |
|---|---|---|---|
| Chat (§20) | Session close, significance-gated | Tier 3 | 3-5/day |
| Notes | Save or significant edit (>100 chars changed) | Tier 3 | 5-15/day |
| Document Viewer | Open (shallow) + close (deep if viewed >30s) | Tier 3 | 5-10/day |
| DOC23 Tasks | After each execution | Tier 2 | 10-50/day (many automated) |
| DOC12 Rooms | Room close | Tier 3 | 1-3/day |
| DOC12 Panels/Forums | Panel conclusion or periodic | Tier 3 | 0-2/day |
| DOC14 CANDOR | Session complete | Tier 3 | 0-1/day |
| Browser (§39) | Various — see §39.8 | Tier 2-3 | Varies |
| Bucket files (DOC7) | File add or content update | Tier 2 (high) | 1-5/day |

All surface extraction respects the background scheduler policy (§19.5) — Tier 3 extraction pauses when foreground is active.


### 20B.14 Context Bucket Files (DOC7) — `intake.bucket.file_added`

Context bucket files are high-value intake sources. The user explicitly curated them as relevant reference material — adding a document to a bucket is an intentional act that signals the content matters. Bucket files receive elevated extraction priority and starting confidence.

**Trigger:** When EC processes a `context_bucket_file_add` command (DOC7) and the file reaches `index_status: "ready"`, EC emits an `intake.bucket.file_added` observation. File content updates (detected via `content_hash` change on re-add or version increment) also trigger extraction. File removals do NOT trigger extraction or entity deletion — extracted knowledge persists independently of the bucket file's lifecycle.

**Default significance rules:**

| Signal | Assessment | Rationale |
|---|---|---|
| File added to any bucket | `deep` | User explicitly curated this content — always extract |
| File updated (content_hash changed) | `deep` | User refreshed the document — re-extract |
| File removed from bucket | No action | Knowledge persists; bucket membership is ephemeral |
| File already extracted (same content_hash) | `skip` | Dedup — don't re-extract identical content |
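
A minimal sketch of the table above, assuming the caller maintains a set of content hashes whose extraction already succeeded. Names are illustrative.

```typescript
type BucketFileEventSketch = {
  kind: "added" | "updated" | "removed";
  content_hash: string;
};

function assessBucketFileEvent(
  event: BucketFileEventSketch,
  extractedHashes: ReadonlySet<string>,
): "deep" | "skip" | "no_action" {
  // Removal never triggers extraction or deletion; extracted knowledge persists.
  if (event.kind === "removed") return "no_action";

  // Identical content was already extracted: dedup.
  if (extractedHashes.has(event.content_hash)) return "skip";

  // New or changed curated content is always extracted deeply.
  return "deep";
}
```

The function does not mutate the set. Adding a hash only after a successful commit avoids marking content as extracted when the job later fails.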

**Extraction prompt (bucket-file-specific):**

The extraction prompt for bucket files is enriched with bucket context:

```
Extract structured knowledge from this document.

Document title: {file_title}
Source bucket: {bucket_title}
Bucket summary: {bucket_summary}
Bucket matter associations: {linked_matter_names}
User's active work contexts: {active_work_context_names}

Extract:
- Named entities (people, organizations, courts, case names, case numbers)
- Dates and deadlines mentioned
- Key facts, holdings, or rulings
- Document type and purpose (complaint, motion, brief, letter, memo, order, etc.)
- Parties and their roles
- Claims or causes of action
- Key arguments or positions taken
- Relief sought
- Any procedural history mentioned

For each extracted item, include:
- The entity or fact
- Confidence (how clearly stated vs inferred)
- The approximate location in the document (beginning/middle/end or section reference)

Output as structured JSON candidates.
Do NOT extract boilerplate, signature blocks, certificates of service, or formatting instructions.
```

**Provenance:** All candidates extracted from bucket files carry:

```ts
provenance: {
  entry_type: "learned_from_bucket_file",
  source_ref: "{bucket_id}:{file_id}",
  source_content_hash: "{content_hash}",
  extraction_model: "{model_used}",
  extracted_at: "{timestamp}",
}
```

This provenance chain enables DOC24's overlap detection — DOC24 can identify knowledge cards that were extracted FROM a bucket file and suppress them when the same file is being inlined by DOC7.
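
As a rough illustration of how that overlap check could use `source_ref`, the sketch below filters out cards learned from a file that is currently being inlined. The overlap logic itself belongs to DOC24; `KnowledgeCard`, `bucketSourceRef`, and `suppressInlinedOverlap` are hypothetical names, not part of either spec.

```ts
// Minimal card shape for illustration; real cards carry the full provenance entry.
type KnowledgeCard = {
  card_id: string;
  provenance: { entry_type: string; source_ref: string };
};

// Builds the "{bucket_id}:{file_id}" reference used in bucket-file provenance.
function bucketSourceRef(bucketId: string, fileId: string): string {
  return `${bucketId}:${fileId}`;
}

// Drops cards that were extracted from a bucket file which is already inlined in full.
function suppressInlinedOverlap(cards: KnowledgeCard[], inlinedRefs: string[]): KnowledgeCard[] {
  return cards.filter(
    (c) =>
      !(
        c.provenance.entry_type === "learned_from_bucket_file" &&
        inlinedRefs.indexOf(c.provenance.source_ref) !== -1
      ),
  );
}
```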

**Starting confidence:** Bucket file candidates receive elevated starting confidence because the user explicitly curated the file:
- Entities: starting α = 3 (vs α = 2 for conversation-mined entities)
- Facts and holdings: starting α = 3
- The user chose to put this document in a bucket — that is an endorsement of its relevance
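
To see what the higher starting α means in practice, recall that the mean of a Beta(α, β) distribution is α / (α + β). The sketch below compares the two starting points. The starting β is not stated in this section, so the value 1 used here is purely an assumption for illustration.

```ts
// Mean of a Beta(alpha, beta) distribution: alpha / (alpha + beta).
function betaMean(alpha: number, beta: number): number {
  if (alpha <= 0 || beta <= 0) throw new Error("alpha and beta must be positive");
  return alpha / (alpha + beta);
}

// ASSUMPTION: starting beta is not given in this section; 1 is illustrative only.
const ASSUMED_STARTING_BETA = 1;

const bucketFileStart = betaMean(3, ASSUMED_STARTING_BETA);   // 3 / 4
const conversationStart = betaMean(2, ASSUMED_STARTING_BETA); // 2 / 3
```

Under that assumed β, a bucket-file candidate starts at 0.75 expected confidence versus roughly 0.67 for a conversation-mined one.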

**Matter association:** If the bucket is assigned to a project or matter, all extracted entities SHALL be linked to that matter via edges. If the bucket has no matter association, extracted entities are linked based on standard entity resolution against existing graph nodes.

**Scheduling:** Bucket file extraction is queued as high priority in the BackgroundJobOrchestrator (EC Core Addendum A §3) using the tier2_extractor agent profile. High priority because the user explicitly added the file — this is an intentional action indicating the content matters now.

**Memory mode gating:** Bucket file extraction respects the global memory control hierarchy (EC Core Addendum A §2). If `memory_system_enabled = false`, `collection_enabled = false`, or the bucket surface collection toggle is off, no extraction occurs. Incognito mode does not apply to bucket files (buckets are persistent curated content, not ephemeral browsing).
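
A minimal sketch of that gate follows. `memory_system_enabled` and `collection_enabled` come from the text above; `bucket_surface_collection_enabled`, `incognito_active`, and the function name are hypothetical.

```ts
type MemoryControls = {
  memory_system_enabled: boolean;
  collection_enabled: boolean;
  bucket_surface_collection_enabled: boolean; // hypothetical name for the bucket surface toggle
  incognito_active: boolean;
};

function bucketExtractionAllowed(c: MemoryControls): boolean {
  // incognito_active is deliberately ignored: incognito does not apply to bucket files.
  return c.memory_system_enabled && c.collection_enabled && c.bucket_surface_collection_enabled;
}
```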


---


### 20B.15 Calendar — `intake.calendar.event_observed`

Calendar events are first-class intake sources in R5.7. Significance defaults:
- Court hearings, trial dates, depositions, filing deadlines, and similar adjudicative or externally binding events → `deep`
- Single-instance work-context events with linked parties, documents, or obligations → `shallow` or `deep` depending on linked-entity density
- Recurring meetings and low-salience routine events → `shallow` or `skip`
- External invites with unknown parties → attendee/entity extraction plus shallow event capture

Calendar extraction SHOULD prefer deterministic identity, participant, and timing capture before any LLM expansion.

### 20B.15A To-Do / Checklist Intake — `intake.todo.item_changed`

To-do items are observed eagerly but do not bypass significance discipline.

```ts
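function shouldExtractTodo(todo: TodoItem): ExtractionDecision {
  // Matter-linked items with obligation language become durable graph entities.
  if (todo.matter_linked && todo.has_obligation_language) return "graph_entity";
  // Matter-linked items without obligation language get shallow extraction only.
  if (todo.matter_linked) return "shallow_extraction";
  // Obligation-like items with no matter link go to the user for review.
  if (todo.has_obligation_language && !todo.matter_linked) return "suggestion_inbox";
  // Everything else stays episodic context and never becomes a durable node.
  return "episodic_context_only";
}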
```

This prevents trivial personal reminders from becoming durable obligations while still capturing work-linked or obligation-like items.

### 20B.16 IntakeSurfaceContract Template

```ts
type IntakeSurfaceContract = {
  surface_id: IntakeSurface;
  trigger_events: string[];
  significance_policy: SignificanceGateConfig;
  extraction_modes: ("shallow" | "deep")[];
  output_node_kinds: NodeKind[];
  dedup_rules: DedupConfig;
  backlog_priority: "critical" | "normal" | "deferred";
  failure_semantics: FailurePolicy;
};
```

Every new intake surface in DOC72 MUST define itself through this template or a strict superset of it.


## 20C. Knowledge Intake — Cross-Surface Consolidation

```ts
type CrossSurfaceMergeDecision = {
  concept_or_entity_id: string;
  contributing_sources: Array<{
    source_surface: IntakeSurface;
    source_ref: string;
    confidence: number;
    authority_weight: number;
  }>;
  merge_outcome: "merge" | "variant" | "conflict_set" | "suggested_review";
  dominant_source_ref?: string;
};
```

**Precedence:** user-confirmed > authority-backed document > user-authored note > accepted room result > generic chat > browser observation. Citations are normalized before dedup comparison. Batch dedup uses MinHash/semantic similarity.

When the same entity or concept appears across multiple surfaces, nightly consolidation merges them using the precedence ordering above. The highest-authority source becomes the dominant provenance entry; lower-authority sources become supporting evidence.
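
The dominant/supporting split can be sketched as a sort over the precedence list. This is illustrative only: the `SourceClass` labels are hypothetical identifiers mirroring the list above, each source is assumed to be pre-classified, and the confidence tie-break within a class is an assumption the spec does not state.

```ts
// Hypothetical labels mirroring the precedence list, highest authority first.
const PRECEDENCE = [
  "user_confirmed",
  "authority_backed_document",
  "user_authored_note",
  "accepted_room_result",
  "generic_chat",
  "browser_observation",
] as const;
type SourceClass = (typeof PRECEDENCE)[number];

type ContributingSource = {
  source_class: SourceClass; // assumption: classified before consolidation
  source_ref: string;
  confidence: number;
};

function splitByPrecedence(
  sources: ContributingSource[],
): { dominant_source_ref: string | undefined; supporting_refs: string[] } {
  const ranked = sources.slice().sort((a, b) => {
    const byClass = PRECEDENCE.indexOf(a.source_class) - PRECEDENCE.indexOf(b.source_class);
    // Tie-break inside one class by confidence (assumption, not in the spec).
    return byClass !== 0 ? byClass : b.confidence - a.confidence;
  });
  const refs = ranked.map((s) => s.source_ref);
  return { dominant_source_ref: refs[0], supporting_refs: refs.slice(1) };
}
```

Note that a high-confidence chat observation still ranks below a lower-confidence authority-backed document, because class precedence is compared first.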

### 20C.1A Field-Level Merge Policy

Cross-surface consolidation in R5.7 is field-wise, not merely node-wise.

```ts
type MergePolicy =
  | "authoritative_override"
  | "multi_value_union"
  | "conflict_set"
  | "latest_authoritative_if_fresher";
```

**Default rules:**
- identity and canonical name fields: `authoritative_override`
- aliases and reference sets: `multi_value_union`
- contested factual assertions: `conflict_set` until resolved
- freshness-sensitive operational fields: `latest_authoritative_if_fresher`
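
One way to picture the four policies is a per-field merge function over two observations. The sketch below is a hedged reading, not the normative behavior: `FieldObservation`, `MergedField`, and `mergeField` are hypothetical, and the exact tie and freshness rules (ties keep the current value; a fresher value must be at least as authoritative) are assumptions.

```ts
type MergePolicy =
  | "authoritative_override"
  | "multi_value_union"
  | "conflict_set"
  | "latest_authoritative_if_fresher";

// Hypothetical per-field observation used only for this illustration.
type FieldObservation = { value: string; authority_weight: number; observed_at: string };

type MergedField =
  | { kind: "single"; value: string }
  | { kind: "multi"; values: string[] }
  | { kind: "conflict"; candidates: string[] };

function mergeField(
  policy: MergePolicy,
  current: FieldObservation,
  incoming: FieldObservation,
): MergedField {
  const same = current.value === incoming.value;
  switch (policy) {
    case "authoritative_override":
      // Higher authority wins; ties keep the current value (assumption).
      return {
        kind: "single",
        value: incoming.authority_weight > current.authority_weight ? incoming.value : current.value,
      };
    case "multi_value_union":
      return { kind: "multi", values: same ? [current.value] : [current.value, incoming.value] };
    case "conflict_set":
      return same
        ? { kind: "single", value: current.value }
        : { kind: "conflict", candidates: [current.value, incoming.value] };
    case "latest_authoritative_if_fresher": {
      const fresher = Date.parse(incoming.observed_at) > Date.parse(current.observed_at);
      const authoritative = incoming.authority_weight >= current.authority_weight;
      return { kind: "single", value: fresher && authoritative ? incoming.value : current.value };
    }
  }
}
```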

---

## 21. Retroactive Knowledge Sonar

### 21.1 Domain-Agnostic

Sonar triggers for ANY significant new node (work context, goal, domain concept, person, obligation), not just legal concepts.

### 21.2 Two-Pass Architecture

Pass 1 searches the GLOBAL index (both Tier 1 and Tier 2). Pass 2 prioritizes Tier 2 hits but still includes high-relevance Tier 1 hits.

```ts
type RetroactiveEnrichmentJob = {
  job_id: string;
  target_node_id: string;
  target_node_kind: string;
  search_queries: string[];
  status: "queued" | "searching_index" | "extracting_chunks" | "completed" | "cancelled";
  max_chunks_to_process: number;    // default 50
  token_cap?: number;
  usd_cap?: number;
  sonar_derived: boolean;           // If true, results CANNOT trigger secondary sweeps
  chunks_found: number;
  chunks_processed: number;
  connections_created: number;
};
```

**Sonar-derived flag:** Nodes created by sonar are FORBIDDEN from triggering secondary sweeps. This prevents an infinite cascade.

### 21.3 Cooldown and Triggers

30-day cooldown per concept. User-initiated sweeps always bypass cooldown. Triggers: confidence ≥ 0.7 + active work context, new obligation with deadline, new work context entity, user explicitly requests.
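
The cooldown and the sonar-derived restriction combine into a simple gate, sketched below. `SweepRequest`, `SweepDecision`, and `decideSweep` are hypothetical names. The sketch assumes user-initiated sweeps are evaluated before the sonar-derived check; the spec states only that they bypass the cooldown, so that ordering is an assumption.

```ts
const COOLDOWN_DAYS = 30;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

type SweepRequest = {
  user_initiated: boolean;
  target_is_sonar_derived: boolean;
  last_sweep_at?: string; // ISO timestamp of the previous sweep for this concept, if any
};

type SweepDecision = "run" | "blocked_sonar_derived" | "blocked_cooldown";

function decideSweep(req: SweepRequest, nowIso: string): SweepDecision {
  // User-initiated sweeps always bypass the cooldown.
  // ASSUMPTION: they are also checked before the sonar-derived rule.
  if (req.user_initiated) return "run";
  // Sonar-created nodes may not trigger secondary sweeps.
  if (req.target_is_sonar_derived) return "blocked_sonar_derived";
  if (req.last_sweep_at !== undefined) {
    const elapsedDays = (Date.parse(nowIso) - Date.parse(req.last_sweep_at)) / MS_PER_DAY;
    if (elapsedDays < COOLDOWN_DAYS) return "blocked_cooldown";
  }
  return "run";
}
```

The trigger conditions themselves (confidence ≥ 0.7 with an active work context, new obligation with deadline, and so on) would be evaluated before this gate and are not modeled here.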

### 21.4 Surfacing Model

Sonar results enrich the graph in background. **NOT proactively injected into conversations.** Surfaced ON DEMAND when user asks. Exception: contradictions found by sonar → weekly digest or suggestions inbox.

### 21.5 User-Initiated Sweep

"Elnor, learn about loss causation from my cases." Triggers immediately. Elnor shows current knowledge → "go deeper" → sweep runs → reports back.

---

## 22. Folder Attention Tiers

Tier 1: ambient index (file names, structure, no content). Tier 2: active monitoring with content extraction.

Promotion to Tier 2: manual user designation, detected engagement (repeated file opens), work context linkage. LRU demotion when cap exceeded.
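
LRU demotion under a Tier 2 cap might look like the sketch below. The shape `MonitoredFolder`, the `last_used_at` field, and the `tier2Cap` parameter are hypothetical; the spec does not define the cap value or which timestamp drives recency.

```ts
type MonitoredFolder = { path: string; tier: 1 | 2; last_used_at: number };

// Promotes one folder to Tier 2, then demotes least-recently-used Tier 2 folders over the cap.
function promoteToTier2(
  folders: MonitoredFolder[],
  path: string,
  now: number,
  tier2Cap: number,
): MonitoredFolder[] {
  // Work on copies so the caller's array is not mutated.
  const updated: MonitoredFolder[] = folders.map((f): MonitoredFolder =>
    f.path === path ? { ...f, tier: 2, last_used_at: now } : { ...f },
  );
  const tier2OldestFirst = updated
    .filter((f) => f.tier === 2)
    .sort((a, b) => a.last_used_at - b.last_used_at);
  const overflow = Math.max(0, tier2OldestFirst.length - tier2Cap);
  for (const lru of tier2OldestFirst.slice(0, overflow)) lru.tier = 1;
  return updated;
}
```

Because the newly promoted folder gets the freshest timestamp, it is never the one demoted unless the cap is zero.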

---

## 23. Daily Extraction Budgets

**Separate calibration budget:** historical calibration and active-work-context bootstrap sweeps use a separate budget lane and do not consume the ordinary daily extraction capacity. See §31.8 and the bootstrap config in this revision.

```ts
type DailyExtractionBudget = {
  deep_max_total: number;           // default 20 (40 first week)
  token_max_total: number;          // e.g., 400k input-equivalent/day
  usd_max_total: number;            // e.g., $3/day
  gpu_minutes_max_total: number;
  deep_max_browser: number;         // default 3
  deep_max_notes: number;           // default 6
  deep_max_docs: number;            // default 6
  reserved_critical_slots: number;  // default 5
  carried_forward_slots: number;    // max 2x daily default
  focus_mode: "normal" | "deep" | "conservative" | "context_focus";
  focus_context_id?: string;
  last_reset_at: string;
};

type FocusModeOverride = {
  mode: "normal" | "deep" | "conservative" | "context_focus";
  focus_context_id?: string;
  override_fair_share: boolean;
  activated_at: string;
  ttl_hours?: number;               // Auto-revert, default 24
};
```

**Fair-share quotas:** Each active work context gets minimum allocation. One hot context cannot consume all slots. "Go deep on Henderson" temporarily suspends fair-share — Henderson gets priority access to full budget, others still get minimum 1-2 reserved slots.
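
A possible slot allocator for this rule is sketched below. It is an assumption-heavy illustration: `allocateDeepSlots` and its parameters are hypothetical, the round-robin split in normal mode is one of several reasonable choices, and the spec does not fix the exact minimum (it says 1-2 reserved slots).

```ts
// Returns deep-extraction slots per work context id.
function allocateDeepSlots(
  totalSlots: number,
  contextIds: string[],
  minPerContext: number,
  focusContextId?: string,
): Record<string, number> {
  const allocation: Record<string, number> = {};
  if (contextIds.length === 0 || totalSlots <= 0) return allocation;
  const add = (id: string, n: number): void => {
    allocation[id] = (allocation[id] || 0) + n;
  };
  // Guarantee the minimum first, shrinking it if the budget cannot cover every context.
  const floor = Math.min(minPerContext, Math.floor(totalSlots / contextIds.length));
  for (const id of contextIds) add(id, floor);
  let remaining = totalSlots - floor * contextIds.length;
  // Focus mode: fair share is suspended and the focus context takes everything left.
  if (focusContextId !== undefined && contextIds.indexOf(focusContextId) !== -1) {
    add(focusContextId, remaining);
    return allocation;
  }
  // Normal mode (assumed): hand out the remainder round-robin.
  while (remaining > 0) {
    for (const id of contextIds) {
      if (remaining === 0) break;
      add(id, 1);
      remaining -= 1;
    }
  }
  return allocation;
}
```

With the default `deep_max_total` of 20, three contexts, and a minimum of 2, focusing on one context gives it 16 slots while the other two keep 2 each.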

---

## 24. Critical Non-Drop Extraction Queue

**R5.7 rule:** items in the critical lane are never `stale_archived` and process ahead of normal and deferred work.

```ts
type CriticalEventClass =
  | "user_correction"
  | "court_order_detected"
  | "task_failure"
  | "candor_critical_finding"
  | "explicit_accepted_recommendation"
  | "user_flagged_focus";
```

The ring buffer keeps a protected zone of 5 reserved slots for critical items. The queue also has a hard upper bound to prevent unbounded growth.

---

## 25. Conversation Corpus / Episodic Recall

Conversation mining (turning chats into graph entities) is NOT the same as episodic recall ("in that thread from Tuesday, we already discussed which Henderson motion you meant"). Both are needed. The conversation corpus is a separate searchable store for cross-chat continuity.

### 25.1 Schema

```ts
type ConversationThreadSummary = {
  thread_id: string;
  title_hint: string;
  linked_entity_ids: string[];
  key_decisions: string[];
  unresolved_items: string[];
  checkpoint_summary: string;
  occurred_at: string;
  turn_count: number;
  channel: string;
};

type DecisionCheckpoint = {
  decision_id: string;
  thread_id: string;
  description: string;
  reasoning: string;
  linked_entity_ids: string[];
  occurred_at: string;
};
```

Stored in the `conversation_threads` SQLite table (§3.2) with vector embedding for semantic search.

### 25.2 Retrieval Tools

- `search_conversation_history(query, entity_scope?)` — semantic search over thread summaries
- `get_related_threads(entity_id)` — threads linked to a specific entity
- `resume_context(entity_id?)` — retrieve the most recent thread's checkpoint for an entity
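
The two entity-scoped tools can be sketched over an in-memory list of summaries. This is only an illustration of the lookup logic: the real store is the `conversation_threads` table, semantic search is omitted, and `ThreadSummaryLite` is a hypothetical trimmed version of `ConversationThreadSummary`.

```ts
type ThreadSummaryLite = {
  thread_id: string;
  linked_entity_ids: string[];
  checkpoint_summary: string;
  occurred_at: string; // ISO timestamp
};

// get_related_threads: threads linked to a specific entity.
function getRelatedThreads(threads: ThreadSummaryLite[], entityId: string): ThreadSummaryLite[] {
  return threads.filter((t) => t.linked_entity_ids.indexOf(entityId) !== -1);
}

// resume_context: checkpoint of the most recent thread, optionally scoped to an entity.
function resumeContext(threads: ThreadSummaryLite[], entityId?: string): string | undefined {
  const pool = entityId === undefined ? threads : getRelatedThreads(threads, entityId);
  let latest: ThreadSummaryLite | undefined;
  for (const t of pool) {
    if (latest === undefined || Date.parse(t.occurred_at) > Date.parse(latest.occurred_at)) {
      latest = t;
    }
  }
  return latest === undefined ? undefined : latest.checkpoint_summary;
}
```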

### 25.3 Injection

New packet card type: `[Prior Context — reference]` card injected when the router detects continuity with a recent thread. Full transcripts retained on disk but not in the graph — only summaries and checkpoints enter the conversation_threads table.

---

## 26. The Three-System Architecture

ELNOR is built from three distinct systems. DOC72 sits in the first of them, the Knowledge System. Understanding the boundaries prevents overlap.

### 26.1 Knowledge System (the brain) — understands the world

| Doc | Role |
|---|---|
| **DOC72** | Hyper Intelligence Overlay — graph architecture, ontology, entity/skill/goal/procedure models, temporal resilience, extended intelligence patterns, knowledge intake architecture |
| **DOC1** | Memory Governance — Write Gate, maturity lifecycle, confidence calibration, authority memories, contradiction detection |
| **DOC8** (expanded) | Learning Engine — friction, positive signals, cross-graph pattern detection, behavioral optimization, graph quality improvement, extraction threshold tuning |
| **DOC3** | Skill Acquisition Pipeline — observe, synthesize, propose, review, promote. Writes to DOC72 graph |

### 26.2 Operations System (the nervous system) — communicates and coordinates

| Doc | Role |
|---|---|
| **DOC24** | Capability, Routing, Invocation, and Delivery — registry, routing cascade, tool packs, MCP, receipts, packet assembly, knowledge-to-LLM delivery, injection tags, rendering |
| **DOC15** | CIL / Context Injection — injection hierarchy, authority injection, token budgets, context assembly |
| **DOC4** | OpenClaw Bridge — runtime interface, native tools, ambient baseline |
| **DOC11** | Gateway / Model Controls — runtime truth, model routing, auth |
| Q UI specs | DOC21/DOC22/DOC20 UI layer — dashboard, pages, surfaces |

### 26.3 Body (the muscles) — acts in the world

| Doc | Role |
|---|---|
| **DOC23** | Task automation |
| **DOC12** | Rooms, panels, forums |
| **DOC16** | Provider integrations (M365, future Gmail, etc.) |
| **DOC5** | File sync |
| **DOC18** | LlamaIndex retrieval sidecar |
| **DOC20** | Notes, documents (capability layer) |
| **DOC17** | Overlays |
| **DOC14** | CANDOR |

### 26.4 Information Flow Between Systems

**Perceiving:** World events → **Body** (receives email, detects file changes, gets calendar events) → **Operations** (routes the information) → **Knowledge** (learns from it, updates graph)

**Acting:** User request → **Operations** (routes, resolves entities from **Knowledge**, mounts packs, assembles packet with context from **Knowledge**) → **Body** (executes the action) → **Operations** (captures receipt and trace) → **Knowledge** (updates graph with trace, enriches entities)

**Learning:** **Knowledge** produces intelligence (entity cards, goals, procedures, directives). **Operations** delivers it to the agent. **Body** executes and produces outcomes. Outcomes flow back through **Operations** to **Knowledge**. The loop continues.

---

## 27. DOC72 Scope and Boundaries

### 27.1 What DOC72 Owns (Knowledge Architecture)

- Entity graph schema (node types, edge types, ontology) — 10 canonical types
- Full entity ontology with semantic intent procedures
- 4-layer procedural taxonomy
- Standing procedure model (trigger schemas, action schemas, ActionSafetyClass, execution receipts, dedup, conflict resolution, promotion criteria)
- Goal model (narrative goals, sub-goals, tensions, stakeholders, relevance tiers)
- Temporal resilience model (TemporalMetadata, ChangeRecord, staleness tiers, change propagation, matter state transitions, policy drift)
- Entity creation pipeline rules
- Six-dimension sparsity policy
- Confidence model (Beta distribution, event weights, half-lives, policy caps)
- Retrieval model (how graph queries work, routing cascade tiers)
- Rendering contract (how knowledge cards are formatted)
- Conversation corpus / episodic recall
- Extraction quality gates
- Graph cleanup / entropy control
- Knowledge intake architecture — observation pipeline, significance gating, surface-specific contracts, domain signal profiles, self-learning feedback loop
- Cold start / bootstrap knowledge requirements
- Embedding infrastructure
- Observation mechanism architecture — how each surface is technically watched, significance gating for extraction dispatch, extraction depth tiers, cost/volume governance
- Surface-specific knowledge intake contracts — per-surface extraction schemas and authority rules
- Multi-participant attribution model
- Agent knowledge profiles and delegation protocol

### 27.2 What DOC72 Does NOT Own

| Engine | Owner | How it interacts with DOC72 |
|---|---|---|
| Friction detection and prevention | DOC8 | Reads graph, writes friction patterns and prevention rules |
| Positive learning and optimization | DOC8 | Reads graph, writes α/β adjustments and tuning parameters |
| Cross-graph pattern detection | DOC8 | Reads graph, writes pattern proposals to DOC1 |
| Memory Write Gate and maturity | DOC1 | Reads graph for dedup/contradiction, writes maturity transitions |
| Memory generalization | DOC1 §12 | Reads graph context, writes generalized heuristic proposals |
| Skill learning pipeline | DOC3 | Reads graph for reuse, writes skill nodes and traces |
| Background indexer execution | EC Core / DOC24 | Writes entity candidates through intake interface |
| Conversation mining execution | EC Core / DOC24 | Writes extraction results through intake interface |
| Routing and pack mounting | DOC24 | Reads graph for entity resolution and pack mapping |
| Packet assembly and CIL injection | DOC24 + DOC15 | Reads graph for entity cards, goals, directives |
| Knowledge-to-LLM delivery | DOC24 | Reads rendering contracts, assembles packets, applies injection tags |
| Standing procedure trigger matching | DOC24 | Reads standing procedure trigger index from graph |
| Post-generation compliance verification | DOC24 | Reads injected tags, checks LLM output compliance |

### 27.3 The Key Principle

DOC72 defines the WHAT and SHAPE of knowledge. The engines define the HOW of processing. DOC72 says "here's what a procedure node looks like." DOC8 says "here's how I detect that confidence should change." DOC1 says "here's how I decide what passes the Write Gate." DOC3 says "here's how I turn an observation into a skill." DOC24 says "here's how I route this to the right tool and deliver knowledge to the LLM."

---

## 28. The Entity Graph as a First-Class Durable Store

### 28.1 What the Graph IS

The entity graph is a **first-class EC-owned durable store** under `ELNOR_MEMORY/`. It is written exclusively by EC (single writer preserved). It is stored in SQLite with an append-only JSONL event log for audit and replay.

### 28.2 What the Graph is NOT

- It is NOT DOC1. DOC1 owns memory governance. The graph owns structural knowledge. (But they share the same SQLite database for storage — §3.5.)
- It is NOT a rebuildable cache. Extracted entities, cross-source relationships, user-confirmed structures, and goal narratives are durable knowledge that cannot be trivially rebuilt.
- It is NOT a second hidden writer. EC writes it. Only EC writes it.

### 28.3 Relationship Between DOC1 and the Entity Graph

**DOC1 feeds the graph.** When Elnor saves a memory through `memory_save`, it goes through DOC1's pipeline AND creates/updates graph nodes. One EC command (`entity_knowledge_write`) coordinates both — in a single SQLite transaction.

**The graph is NOT DOC1.** DOC1 §4.12 explicitly states it does not create a universal graph database. The entity graph is a separate EC-owned service with its own schema and lifecycle states. DOC1 memories (`memory_directive` nodes) are one input source. Background indexing is another.

## 29. Goals

**Normative payload note:** The conceptual goal discussion in this section is now backed by the absorbed `GoalKnowledgeContractSchema` in §4A.6.

### 29.1 Why Goals Matter

Everything in the system answers WHAT (entities), HOW (procedures), and WHEN (standing procedures). Goals answer WHY. Without WHY, Elnor executes tasks in isolation. With goal context, Elnor reasons strategically.

### 29.2 What Goals Are

Goals are **narrative nodes** in the entity graph. Natural-language descriptions of what the user is trying to achieve, linked to relevant entities, with sub-goals, tensions, stakeholders, and temporal context.

Goals are NOT project plans, KPIs, or rigid structured decompositions. Goals ARE natural-language strategic intent, linked through graph edges, hierarchical, tension-aware, temporal, and injected as LLM context when strategically relevant.

### 29.3 Goal Node Schema

```ts
// Stored in nodes table with node_kind = 'goal'
type GoalPayload = {
  canonical_name: string;                    // "Resolve Henderson favorably"
  narrative: string;                         // 50-200 tokens natural language
  parent_goal_id?: string;
  sub_goal_ids?: string[];
  linked_entity_ids: string[];
  primary_entity_id?: string;
  tensions?: GoalTension[];
  stakeholders?: GoalStakeholder[];
  success_indicators?: string[];             // "Settlement at or above $30M"
  time_horizon?: string;                     // "Before Q3 2026"
  deadline_entity_ref?: string;
  goal_status: "active" | "achieved" | "abandoned" | "evolved" | "on_hold";
  evolved_from_goal_id?: string;
  evolution_reason?: string;
  source: "user_stated" | "onboarding" | "conversation_mined" | "system_suggested";
  last_referenced_at?: string;
};

type GoalTension = {
  tension_description: string;
  current_priority?: string;
  linked_sub_goal_ids?: string[];
};

type GoalStakeholder = {
  stakeholder_entity_id?: string;
  stakeholder_role: string;
  interest_description?: string;
};
```

### 29.4 Three Tiers of Goal Relevance

**Tier 1 — Explicitly goal-linked.** "File the Henderson MTD" — Henderson has a stated goal. Goal narrative injected. Strategic reasoning enabled.

**Tier 2 — Potentially goal-relevant.** "Order groceries" — IF a nutrition goal exists, tangentially relevant. Context AVAILABLE but only injected if request is strategically relevant.

**Tier 3 — Not goal-relevant.** "What's the weather?" — no goal context needed. Most daily interactions.

The tier is determined by checking whether the active entity has linked goals and whether the action type is strategically relevant (decision-making, communication, deadline management, planning) vs purely operational.

### 29.5 How Goals Are Created

**Goals are created TOP-DOWN, not BOTTOM-UP.** Elnor should almost never create a goal by observing repeated actions. Goals come from the user — conversation, onboarding, or explicit statements.

**From conversation:** "The whole point of the expert report is to strengthen our damages argument for settlement." Goal-level language detected, goal proposed.

**From onboarding:** User-initiated ("let's talk about Henderson goals") or Elnor-initiated gap detection ("You've been doing a lot of work on Henderson. What are you trying to accomplish?").

**From system suggestion:** Type-level heuristics suggest goals for entity types that benefit from them.

Goals emerge INCREMENTALLY across conversations. Cross-chat pattern detection (§20.7) aggregates fragments into richer goal pictures.

### 29.6 How Goals Evolve

Goals CHANGE, not just become stale. "Survive the MTD" evolves to "leverage the MTD denial into settlement pressure."

Goal evolution captured through:
- `evolved_from_goal_id` linking to prior version
- `evolution_reason` explaining what changed
- Old goal node gets `goal_status: "evolved"` (preserved for history)
- New goal node becomes active version
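
The evolution steps above can be sketched as a pure function that returns both the retired and the successor record. `GoalRecord` is a hypothetical trimmed version of `GoalPayload`, with `goal_id` standing in for the node id, and `evolveGoal` is an illustrative name rather than an EC command.

```ts
type GoalStatus = "active" | "achieved" | "abandoned" | "evolved" | "on_hold";

// Trimmed goal record; goal_id is a hypothetical stand-in for the node id.
type GoalRecord = {
  goal_id: string;
  canonical_name: string;
  narrative: string;
  goal_status: GoalStatus;
  evolved_from_goal_id?: string;
  evolution_reason?: string;
};

function evolveGoal(
  prior: GoalRecord,
  successorId: string,
  canonicalName: string,
  narrative: string,
  reason: string,
): { prior: GoalRecord; successor: GoalRecord } {
  return {
    // The old node is preserved for history; only its status changes.
    prior: { ...prior, goal_status: "evolved" },
    // The new node becomes the active version and links back to the old one.
    successor: {
      goal_id: successorId,
      canonical_name: canonicalName,
      narrative: narrative,
      goal_status: "active",
      evolved_from_goal_id: prior.goal_id,
      evolution_reason: reason,
    },
  };
}
```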

### 29.7 What Should NOT Have Goals

Goal probing weighted by entity type:
- `case` / `matter` — almost always benefits
- `work_context` (projects) — usually benefits
- `person` / `organization` — sometimes (sensitive)
- `application` / `folder_root` / `calendar` — never need their own goals
- Personal domains — only when user expresses them

### 29.8 Goal Context Injection Budget

Goal narratives: 50-200 tokens. Injection budget: ≤100 tokens per turn. If multiple goals active, prioritize by: most recently referenced > primary entity match > highest confidence. Goal context sits within DOC15's injection hierarchy alongside entity cards and directives.
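
One reading of the prioritization rule is a lexicographic sort followed by a greedy fill of the token budget, as sketched below. `GoalCandidate`, `selectGoalsForInjection`, and the precomputed `narrative_tokens` field are hypothetical. The sketch also assumes that a narrative which does not fit is skipped rather than truncated or summarized, which the spec does not specify.

```ts
type GoalCandidate = {
  goal_id: string;
  narrative_tokens: number; // assumed to be precomputed by a tokenizer
  last_referenced_at?: string;
  is_primary_entity_match: boolean;
  confidence: number;
};

const GOAL_INJECTION_BUDGET_TOKENS = 100;

function selectGoalsForInjection(
  candidates: GoalCandidate[],
  budget: number = GOAL_INJECTION_BUDGET_TOKENS,
): GoalCandidate[] {
  const ranked = candidates.slice().sort((a, b) => {
    const ra = a.last_referenced_at !== undefined ? Date.parse(a.last_referenced_at) : 0;
    const rb = b.last_referenced_at !== undefined ? Date.parse(b.last_referenced_at) : 0;
    if (ra !== rb) return rb - ra; // most recently referenced first
    if (a.is_primary_entity_match !== b.is_primary_entity_match) {
      return a.is_primary_entity_match ? -1 : 1; // then primary entity match
    }
    return b.confidence - a.confidence; // then highest confidence
  });
  const selected: GoalCandidate[] = [];
  let used = 0;
  for (const g of ranked) {
    // ASSUMPTION: goals that do not fit are skipped, not truncated.
    if (used + g.narrative_tokens <= budget) {
      selected.push(g);
      used += g.narrative_tokens;
    }
  }
  return selected;
}
```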

---

## 30. Information Intake — How the Graph Gets Fed

### 30.1 The Intake Contract

DOC72 defines the SCHEMAS. DOC24 and body systems produce the EVENTS. EC's event system carries them. The graph service processes them.

### 30.2 Intake Event Sources

| Source | What it produces | Event type | Timing |
|---|---|---|---|
| Email processing (DOC16) | Entity candidates, relationships | `intake.email.entities_extracted` | Continuous (Tier 3) |
| Calendar discovery (DOC16) | Calendar entities, events | `intake.calendar.discovered` | Periodic (Tier 3) |
| File system monitoring | Folder/file changes | `intake.filesystem.changed` | Event-driven (Tier 3) |
| Tool execution traces | Execution traces | `intake.trace.captured` | After each action (Tier 2) |
| Chat session close | Mining extraction results | `intake.chat.extracted` | After close, significance-gated (Tier 3) |
| Room close (DOC12) | Multi-participant extraction with attribution | `intake.room.extracted` | After close (Tier 3) |
| Panel/forum conclusion (DOC12) | Deliberation outcomes, recommendations, votes | `intake.panel.extracted` | After conclusion (Tier 3) |
| CANDOR session (DOC14) | Red team findings, argument weaknesses, concept feedback | `intake.candor.findings_extracted` | After session (Tier 3) |
| DOC23 task execution | Task outcome, performance data, entity state changes | `intake.task.execution_completed` | After each run (Tier 2) |
| Note content analysis (DOC20) | Entities, research, decisions from note content | `intake.note.content_analyzed` | On save/significant edit (Tier 3) |
| Document Viewer session (DOC20) | Document understanding, annotations, entity extraction | `intake.document.content_analyzed` | On view/annotate (Tier 3) |
| Settings changes | New accounts, aliases, providers | `intake.settings.changed` | Immediate (Tier 2) |
| Capability state changes | Auth changes, provider health | `intake.capability.changed` | Immediate (Tier 2) |
| Onboarding commit | Entities, memories, goals, procedures | `intake.onboarding.committed` | On commit (Tier 2) |
| User corrections | Entity/relationship/goal corrections | `intake.correction.applied` | Immediate (Tier 1 for graph update) |
| Internal object creation | Tasks, panels, rooms, notes created | `intake.internal.created` | Immediate (Tier 2) |
| Agent-initiated proposals | Inline knowledge update proposals | `intake.agent.proposed` | During conversation (Tier 2) |
| Browser (various) | See §32.8 | `intake.browser.*` | Various (Tier 2-3) |

Each surface-specific extraction contract is defined in §20B. Significance gating (what triggers extraction vs what's ignored) is defined in §20A.

### 30.3 Intake Processing Rule

Every intake event goes through the entity creation pipeline — source gating → extraction → classification → linking → promotion. The pipeline is the single entry point for ALL graph knowledge. No intake source bypasses it.

Exception: user corrections and settings-backed auto-confirmations fast-track (high trust, immediate confirmation).

---

## 31. Cold Start and Bootstrap

### 31.1 The Problem

When ELNOR starts fresh — no email connected, no entities, no memories — the knowledge graph is empty.

### 31.2 Bootstrap Behavior

When the entity graph has fewer than 5 confirmed entities AND no providers connected, inject a one-time bootstrap context note: "I don't have access to your email, calendar, or files yet. Would you like to connect them so I can learn about your world?"

The note is injected once per session until it is dismissed or 10+ entities exist. It is not a blocking gate.

### 31.3 Progressive Enrichment Visibility

When background indexing is in progress:
- Q UI: "Learning about your email — 847 of 4,200 items processed"
- Packet (when entity resolution fails): "Background indexing is 20% complete. Some matters may not appear yet."

### 31.4 Concrete User Journey

**Day 1:** Provider connected. Background indexer runs shallow pass. Elnor says "I'm learning about your email — I've found 15 matters, 42 contacts, and 8 calendars so far. Want to spend a few minutes telling me about your most important cases?"

**Day 7:** Graph has ~500 entities from indexing + onboarding. Entity resolution works for known matters. Standing procedures suggested for detected patterns. "I've noticed you check ECF every morning. Want me to do that automatically?"

**Day 30:** Graph has ~2,000 entities. Procedures validated through traces. Goals set for active work contexts. Domain concepts accumulating from research. The system feels like it "knows" the user's practice.

### 31.5 Agent-Initiated Knowledge Writes

Instead of waiting for post-chat mining, Elnor proposes knowledge updates inline during conversation: "I notice you mentioned Jones & Smith as new opposing counsel. I'm adding them to Narayanan. Correct?"

The graph updates immediately on confirmation. This populates the graph 10x faster than mining, builds trust (the user sees learning in real time), and reduces extraction error because the LLM works with full conversational context.

```ts
// Core pack tool
type ProposeKnowledgeUpdate = {
  entity_type: string;
  data: Record<string, unknown>;
  linked_entities: string[];
  confirmation_prompt: string;
};
// If confirmed → entity_knowledge_write. If rejected → negative DOC8 signal.
```

### 31.6 Predictive Entity Pre-Loading

Before the user speaks, predict likely entities from time of day, day of week, calendar proximity, recent activity patterns, and channel context, then pre-load their cards into the hot cache.

```ts
type PredictiveLoadResult = {
  predicted_entities: EntityRef[];
  prediction_method: "temporal" | "calendar" | "activity" | "channel";
  confidence: number;
};
// Runs at session start, Tier 2, <500ms. Zero latency impact if wrong.
```

### 31.7 Pause Onboarding Tool

The LLM can voluntarily pause probing when question debt is high:

```ts
type PauseOnboardingRequest = {
  reason: "question_debt_high" | "user_seems_busy" | "context_sufficient" | "session_ending";
  resume_after_turns?: number;
};
```

DOC8 probing policy respects this signal. Prevents the "20 questions" feel.

---


### 31.8 Historical Calibration Pass (Onboarding)

During onboarding or first setup, the user designates folders or files for calibration: "Here are my current active cases — learn from these." The quantity is flexible: a user might provide 15 MTDs, 10 oppositions, complaints, and court orders.

Background extraction processes all designated content. The user reviews results in Knowledge Manager, and corrections calibrate DOC8. Expected state after calibration: 50-100+ authorities, 20-30+ domain concepts, and a functional concept hierarchy before the first regular workday.

This eliminates the cold-start problem for professional domains where the user already has substantial work product.


---


### 31.9 Active-Work-Context Bootstrap Lane

R5.7 adds an explicit bootstrap lane for large existing corpora so the user does not wait weeks for useful intelligence.

```ts
type BootstrapConfig = {
  active_work_context_priority_multiplier: number;
  bootstrap_deep_budget_per_hour: number;
  bootstrap_max_duration_days: number;
};
```

Historical calibration passes run under this separate budget, not the daily extraction budget. The lane is scoped to the user-designated active work context and should front-load folders, briefs, authorities, and key artifacts most likely to support immediate work.


## 32. DOC24 ↔ DOC72 Split Guide

### 32.1 What DOC72 Owns

- Entity graph schema, ontology, node types, edge types
- Knowledge intake pipeline architecture (DETECT → ASSESS → EXTRACT)
- Domain signal profiles
- Surface-specific intake contracts
- Confidence model, event weights, half-lives
- Payload schemas and render-input contracts (how knowledge nodes are shaped for delivery)
- Embedding infrastructure
- Self-learning extraction feedback schemas
- Cold start, bootstrap, onboarding intelligence model
- Conversational inspectability and Knowledge Manager UI
- Graph cleanup, semantic folding, entropy control
- Agent knowledge profiles

### 32.2 What DOC24 Owns

- Capability registry, live action state
- Routing cascade and pack mounting
- Tool packs, MCP exposure
- Invocation bindings and receipts
- Knowledge-to-LLM DELIVERY architecture (packet assembly, card rendering, injection selection, injection tags, retrieval lanes, rendering templates implementation)
- Provenance display policy
- Injection kill-switch behavior
- Prior context cards
- Semantic cache warming
- Weekly digest rendering
- Deadline cascade intelligence
- Cross-work-context prior work product retrieval
- Post-generation compliance verification

### 32.3 The Split Principle

DOC72 defines the SHAPE of knowledge (schemas, ontology, intake contracts, confidence rules, payload schemas, and render-input contracts). DOC24/KDA defines the DELIVERY of knowledge (how it gets assembled into prompts, how injection decisions are made, and how runtime rendering happens). DOC72 is the brain's structure. DOC24 is the nervous system that delivers intelligence to the muscles.


---

## 33. Universal Provenance Chain and Epistemic Tiering

### 33.1 Design Principles

The provenance chain ensures that the system can explain WHY it believes anything, with verifiable evidence for every claim.

### 33.2 Three-Tiered Provenance Depth

| Tier | Provenance depth | Applied to |
|---|---|---|
| **Full** | Complete chain — every evidence source, every correction, every supersession | Tier A nodes: domain concepts, goals, obligations, standing procedures, work products, matters |
| **Compact** | Latest 3 entries + summary of earlier history | Tier B nodes: procedures, applications, important world entities |
| **Minimal** | Single source reference | Tier C nodes: candidates, suggestions, low-value entities |
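
The three depths can be illustrated as a view over a node's provenance entries. This sketch is hypothetical throughout: `ProvEntry` is a trimmed entry shape, the "minimal" tier is assumed to keep the latest entry, and the summary string is a placeholder for whatever summarization the real system applies.

```ts
type ProvenanceDepth = "full" | "compact" | "minimal";

// Trimmed entry shape for illustration only.
type ProvEntry = { id: string; source_ref: string; created_at: string };

type ProvenanceView = { entries: ProvEntry[]; earlier_summary?: string };

function viewProvenance(entries: ProvEntry[], depth: ProvenanceDepth): ProvenanceView {
  const newestFirst = entries
    .slice()
    .sort((a, b) => Date.parse(b.created_at) - Date.parse(a.created_at));
  if (depth === "full") return { entries: newestFirst };
  // ASSUMPTION: the single retained reference is the most recent one.
  if (depth === "minimal") return { entries: newestFirst.slice(0, 1) };
  // Compact: latest 3 entries plus a summary of earlier history.
  const kept = newestFirst.slice(0, 3);
  const older = newestFirst.slice(3);
  if (older.length === 0) return { entries: kept };
  return { entries: kept, earlier_summary: `${older.length} earlier entries omitted` };
}
```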

### 33.3 Provenance Entry Schema

```ts
type ProvenanceEntry = {
  id: string;
  node_id: string;
  entry_type: string;                    // See §20B.11 for full list
  source_description: string;
  source_ref: string;
  citation?: string;
  authority_type?: "binding" | "persuasive" | "informational" | "personal";
  still_current?: "yes" | "uncertain" | "overruled" | "outdated" | "not_checked";
  confidence_contribution: number;
  supersedes_entry_id?: string;
  correction_reason?: string;
  jurisdiction_scope?: string[];
  
  // Epistemic tiering fields (R5.4 addition)
  authorial_voice?: AuthorialVoice;
  author_entity_id?: string;
  assertion_type?: AssertionType;
  verbatim_excerpt?: string;             // max 500 chars, MANDATORY for legal/authority, OPTIONAL for other
  
  // Email source context
  email_context?: {
    from: string;
    to?: string[];
    cc?: string[];
    subject: string;
    date: string;
    has_attachments: boolean;
    attachment_names?: string[];
  };
  
  created_at: string;
};
```

### 33.3A Immutable Provenance Snapshot

```ts
type ProvenanceSnapshot = {
  excerpt_hash: string;
  capture_timestamp: string;
  source_metadata: {
    filename?: string;
    email_subject?: string;
    url?: string;
  };
};
```

When the original source later disappears, explanation degrades gracefully and re-verification receives `source_unavailable` status, but the node remains citable with its last captured evidence digest.

### 33.4 AuthorialVoice and AssertionType

```ts
type AuthorialVoice =
  // Universal
  | "self" | "colleague" | "external_party" | "authority_source" | "system"
  // Legal-specific (active when legal profile active)
  | "first_party" | "second_party" | "opposing_counsel" | "adjudicator" | "statutory_body";

type AssertionType =
  | "direct_quote" | "synthesized_rule" | "applied_argument" | "factual_claim" | "procedural_rule";
```

These drive injection tag mapping in DOC24 (e.g., `authority_source` + `direct_quote` → `[cite_as_rule]`). See DOC24 for the complete injection rendering tag table.

### 33.5 Email Source Context

When knowledge is derived from an email, the provenance entry captures enough metadata for Elnor to answer "where did this come from?" without re-reading the original email. DOC16 already parses these fields during email triage — populating `email_context` is a copy at write time.

## 34. Experience Dimension

### 34.1 What Experience Captures

Experience tracks HOW knowledge has lived in actual use — usage frequency, outcome history (positive AND negative), behavioral distribution, trend signals, corrections, and contextual patterns.

### 34.2 Three-Tier Experience Model

- **Full experience** (procedures, composite procedures, standing procedures, goals, domain concepts): Full ExperienceRecord with 20 recent events, variant distribution, trend signals. Stored in `experience_records` table.
- **Light experience** (world entities, applications): `{ usage_count, last_used_at, last_outcome }` — 3 fields. Stored in `nodes.payload` JSON.
- **No separate experience** (memory_directives at minimal tier): Just `usage_count` and `last_used_at` inline on node.

### 34.3 ExperienceRecord Schema

```ts
type ExperienceRecord = {
  target_node_id: string;
  total_usage_count: number;
  total_success_count: number;
  total_failure_count: number;
  total_correction_count: number;
  recent_usage_count: number;
  recent_success_count: number;
  recent_failure_count: number;
  recent_correction_count: number;
  variant_distribution?: Array<{
    variant_label: string;
    count: number;
    recent_count: number;
    last_used_at: string;
    trend: "increasing" | "stable" | "decreasing";
  }>;
  context_distribution?: Array<{
    context_label: string;
    count: number;
    recent_count: number;
  }>;
  recent_events: UsageEvent[];            // Last 20 for Tier A nodes
  older_event_summary?: {
    total_count: number;
    date_range: { from: string; to: string };
    success_rate: number;
    most_common_context?: string;
    most_common_variant?: string;
  };
  overall_trend: "growing" | "stable" | "declining" | "shifting" | "new";
  trend_signal?: string;
};

type UsageEvent = {
  event_id: string;
  used_at: string;
  usage_type: "accessed" | "applied" | "referenced" | "executed" |
              "invoked" | "injected" | "suggested" | "rejected";
  context?: string;
  outcome?: "success" | "failure" | "corrected" | "ignored" |
            "adverse_ruling" | "partial_success" | "unknown";
  outcome_detail?: string;
  outcome_source_ref?: string;
  variant_used?: string;
  correction_detail?: string;
  session_ref?: string;
  agent_id?: string;                     // Which agent performed the action
  outcome_chain_id?: string;             // Optional FK to outcome_chains (§34A.1.5); most events have none
};
```

### 34.4 Experience Feeds All Other Dimensions

- **→ Confidence:** 20 successes → increment α. 3 consecutive failures → increment β.
- **→ Content:** Behavioral shift (Brand X → Brand Y) → propose content update.
- **→ Temporal:** `last_used_at` feeds freshness.
- **→ Provenance:** Significant events generate provenance entries.
- **→ Connections:** Usage patterns reveal hidden connections.
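
The confidence bullet above can be sketched as a small counter over an event history. This is one possible reading, offered under stated assumptions: every 20th cumulative success proposes one α increment, each run of 3 consecutive failures proposes one β increment and then resets, and any non-failure outcome breaks a failure run. `proposeConfidenceUpdate` and the two constants are hypothetical names; in practice the thresholds would be registered parameters rather than literals.

```ts
type SimpleOutcome = "success" | "failure" | "other";
type ConfidenceProposal = { alpha_increments: number; beta_increments: number };

// Hypothetical constants mirroring the bullet above.
const SUCCESSES_PER_ALPHA = 20;
const CONSECUTIVE_FAILURES_PER_BETA = 3;

// Walks outcomes oldest to newest and returns proposed increments only;
// it does not write α/β itself.
function proposeConfidenceUpdate(outcomes: SimpleOutcome[]): ConfidenceProposal {
  let successes = 0;
  let failureRun = 0;
  let alpha = 0;
  let beta = 0;
  for (const o of outcomes) {
    if (o === "success") {
      successes += 1;
      failureRun = 0;
      if (successes % SUCCESSES_PER_ALPHA === 0) alpha += 1;
    } else if (o === "failure") {
      failureRun += 1;
      if (failureRun === CONSECUTIVE_FAILURES_PER_BETA) {
        beta += 1;
        failureRun = 0; // assumption: a new run must accumulate before the next proposal
      }
    } else {
      failureRun = 0;   // assumption: any other outcome breaks the failure run
    }
  }
  return { alpha_increments: alpha, beta_increments: beta };
}
```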

### 34.5 Negative Experiences as Learning Events

Every bad outcome captures WHY, not just THAT:
- **Legal — adverse ruling:** "Judge found corrective disclosure insufficient — required direct price impact evidence."
- **Preference — user correction:** "User requested Brand Y instead of Brand X." Variant distribution shifts.
- **Procedure — execution failure:** "Step 4 failed — file format rejected, PDF/A required." Procedure gains precondition.

### 34.6 Experience Retention

Recent events: keep last 20 with full detail. Older events: summarize into aggregate statistics. Never delete aggregate statistics. Individual events beyond window: archived, not deleted.
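
The retention rule can be sketched as a rolling window that folds overflow into the aggregate and hands the overflowed events to an archive. This is an illustrative sketch only: `applyRetention`, `StoredEvent`, `Aggregate`, and `RECENT_WINDOW` are hypothetical names, and the aggregate shown is a reduced version of `older_event_summary`.

```ts
// Hypothetical, simplified stand-in for UsageEvent.
type StoredEvent = { event_id: string; used_at: string; outcome?: string };

type Aggregate = { total_count: number; success_count: number };

type RetentionResult = {
  recent_events: StoredEvent[]; // full detail, oldest first, at most RECENT_WINDOW items
  aggregate: Aggregate;         // never deleted, only grows
  to_archive: StoredEvent[];    // events that left the window: archived, not deleted
};

const RECENT_WINDOW = 20;

function applyRetention(
  recent: StoredEvent[],
  aggregate: Aggregate,
  incoming: StoredEvent
): RetentionResult {
  const all = [...recent, incoming];
  const overflowCount = Math.max(0, all.length - RECENT_WINDOW);
  const overflow = all.slice(0, overflowCount);   // oldest events fall out first
  const kept = all.slice(overflowCount);
  return {
    recent_events: kept,
    aggregate: {
      total_count: aggregate.total_count + overflow.length,
      success_count:
        aggregate.success_count + overflow.filter(e => e.outcome === "success").length,
    },
    to_archive: overflow,
  };
}
```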

---




### 34.7 Shared Outcome Taxonomy

Experience records, extraction outcome events, and BDSM feedback transforms use one canonical outcome vocabulary so the same event is not described in incompatible terms across subsystems.

```ts
type CanonicalOutcome =
  | "success"
  | "partial_success"
  | "failure"
  | "corrected"
  | "ignored"
  | "rejected"
  | "adverse_ruling"
  | "promoted"
  | "demoted"
  | "unknown";
```

DOC72 execution outcomes track whether knowledge or a procedure worked when used. BDSM utility tracks whether injecting or selecting that knowledge was helpful in context. These are orthogonal measurements and MUST NOT be collapsed into one score.


### 34.8 Extraction Outcome Events

Every extraction result feeds back into the experience dimension via typed outcome events:

```ts
type ExtractionOutcomeEvent = {
  event_id: string;
  source_surface: IntakeSurface;
  extracted_item_id?: string;
  extracted_item_kind: "entity" | "relationship" | "memory_directive" | "domain_concept" |
                       "standing_procedure" | "conversation_checkpoint" | "obligation";
  outcome: "confirmed" | "used_successfully" | "ignored" | "rejected" |
           "corrected" | "contradicted" | "stale_unused" | "promoted" | "demoted";
  severity?: "low" | "medium" | "high";
  user_visible: boolean;
  occurred_at: string;
};
```

These events close the self-learning loop (§36): extraction produces knowledge → knowledge is used or not → outcomes are recorded → DOC8 adjusts extraction parameters.

**Orthogonality note:** execution success in `ExperienceRecord` and injection utility in the Satisfaction Matrix answer different questions. A node can be reliable when executed yet low-value when injected into context, or vice versa. Utility can indirectly generate review or β-increment proposals through DOC8, but it does not directly rewrite canonical α/β truth.


---


## 34A. Knowledge Intelligence Enhancement R2 (Integrated)

The following module is incorporated directly from the accepted DOC72 Knowledge Intelligence Enhancement R2 proposal. It is normative in R5.7.

### 34A.0 Governing Principles

1. **Intelligence, not just storage.** The graph should reason about what it knows — connecting outcomes to causes, recognizing patterns across contexts, detecting implications of changes, and learning what NOT to do.
2. **Severity over frequency.** A single catastrophic failure is a lesson. Three minor annoyances might not be. The system must assess consequence, not just count occurrences. Aggregate friction from many low-severity failures also matters (§2.4B).
3. **No hardcoded magic numbers.** Every numeric threshold must be classified as a tunable threshold (DOC8-learnable with a defined feedback signal) or a user-configurable setting. Unclassified thresholds are spec debt. Architectural invariants and model-coupled constants are documented in their owner-doc sections, not in the parameter registry.
4. **Wire through existing machinery.** These enhancements use DOC72's existing graph schema, DOC8's existing nightly processing, DOC24/KDA's existing rendering pipeline, and BDSM's existing utility tracking. No new subsystems — new capabilities built on existing architecture.
5. **The BDSM closes the loop.** Every new knowledge type (lessons, synthesis, implications) is injectable. BDSM tracks whether injection helps. The system learns which intelligence to surface and which to suppress.
6. **Domain-agnostic core.** All schemas, enums, and extraction rules use generic terminology. Domain-specific extensions (legal outcome types, judicial role patterns, litigation severity rules) live in domain signal profiles, not in the core schema.
7. **Start narrow, broaden with evidence.** Lessons, preferences, and syntheses begin at the narrowest scope supported by their source data. Broadening requires evidence from multiple distinct contexts.

---

### 34A.1 Outcome Chains — Structured Causal Reasoning on Experience Records

#### 34A.1.1 Problem

DOC72's `UsageEvent` (§34) records THAT something succeeded or failed, with `outcome_detail` as a free-text string. But for reasoning — lessons learned, cross-context patterns, implication detection — the system needs to know WHY. "Motion denied" is a fact. "Motion denied because the court found corrective disclosure insufficient without direct price impact evidence, distinguishing from Tellabs" is a causal chain that enables learning.

#### 34A.1.2 Schema

```ts
type OutcomeChain = {
  chain_id: string;
  
  // Anchor — the PRIMARY entity this outcome is about
  anchor_node_id: string;              // typically work_product, work_context, procedure, or obligation
  anchor_kind: "work_product" | "work_context" | "procedure" | "obligation";
  
  // Link to source event
  usage_event_id?: string;             // FK to experience_records. Nullable — chains can exist
                                       // without a linked usage event (e.g., extracted from document)
  
  // What approach was taken
  approach_summary: string;            // "Argued corrective disclosure for loss causation"
  arguments_applied?: string[];        // domain_concept node IDs used in the approach
  procedures_used?: string[];          // procedure node IDs executed
  tools_used?: string[];               // tool/capability IDs invoked
  
  // What happened
  outcome_type: OutcomeType;
  outcome_summary: string;             // "Motion denied; court required direct price impact evidence"
  
  // Why it happened — the causal link
  reasoning?: string;                  // "Court distinguished from Tellabs, found temporal proximity
                                       //  alone insufficient for loss causation"
                                       // ONLY if the user explained or the source explicitly states.
                                       // Extraction MUST NOT speculate.
  ruling_entity_ref?: string;          // If outcome is a ruling, link to the authority node
  
  // Consequence assessment
  consequence_severity: ConsequenceSeverity;
  severity_justification?: string;     // LLM's cited evidence for ±1 band adjustment
  consequence_description?: string;    // "Must now address price impact in amended complaint"
  
  // Root cause classification — gates lesson creation
  root_cause_class: RootCauseClass;
  
  // What might have worked instead
  counterfactual_hint?: string;        // "Address price impact evidence preemptively"
                                       // ONLY if discussed or implied. Do NOT speculate.
  
  // Temporal and jurisdictional context
  occurred_at?: string;                // When the outcome actually happened (vs when captured)
  work_context_id?: string;
  work_context_phase?: string;         // "discovery", "trial_prep", "settlement", etc.
  jurisdiction_scope?: string[];       // Jurisdictional applicability
  
  // Related actors
  related_actor_ids?: string[];        // Judge, opposing counsel, expert, etc.
  
  // Provenance
  source: OutcomeChainSource;
  source_ref?: string;
  
  // Quality and lifecycle
  extraction_confidence: number;       // 0-1. How confident is the extraction model in this chain.
                                       // NOT DOC72 epistemic confidence. Used only for gating
                                       // lesson/synthesis quality. Never fed into α/β.
  completeness_score: number;          // 0-1. Based on how many optional fields are populated.
                                       // Cross-context synthesis requires >= 0.6.
  superseded_by?: string;              // chain_id of a later chain that overrides this one
};

type CoreOutcomeType = "success" | "partial_success" | "failure" | "rejection" |
                       "correction_needed" | "unexpected_positive" | "neutral";
type DomainOutcomeType = `domain_${string}`;
type OutcomeType = CoreOutcomeType | DomainOutcomeType;
// Domain profiles register extensions:
// securities_litigation: domain_adverse_ruling, domain_favorable_ruling
// software_development: domain_deploy_failure, domain_regression_detected

type ConsequenceSeverity = "critical" | "high" | "medium" | "low" | "none";
// critical = real-world harm: missed deadline, client-visible error, sanctions risk,
//            sent to wrong recipient, data loss, financial loss
// high = significant rework: rejected filing, major strategy change, credibility impact
// medium = moderate friction: formatting error, needed retry, minor delay
// low = minor annoyance: UI issue, extra step needed, cosmetic problem
// none = no negative consequence (successful outcomes)

type RootCauseClass =
  | "knowledge_gap"                    // System or user lacked relevant knowledge
  | "procedural_failure"               // A defined procedure failed or was wrong
  | "tool_failure"                     // A tool malfunctioned or was misused
  | "infrastructure_failure"           // Network outage, system crash, service unavailable
  | "external_factor"                  // Opposing party action, court decision outside control
  | "third_party_delay"               // Vendor, co-counsel, or external party delayed
  | "human_error"                     // User made a mistake unrelated to knowledge
  | "user_strategic_choice";           // Deliberate choice with known risks

type OutcomeChainSource =
  | "user_explained"                   // User described outcome in conversation
  | "agent_analyzed"                   // Agent identified outcome during reflection
  | "ruling_extracted"                 // Extracted from a ruling or decision document
  | "candor_finding"                   // CANDOR red team finding
  | "task_outcome";                    // DOC23 task execution result
```

#### 34A.1.3 Severity Assessment — Deterministic Baseline + Bounded LLM

Severity is assessed in two steps:

**Step 1 — Deterministic baseline.** Domain-profile-driven rules produce a baseline severity:

```ts
type SeverityBaselineRule = {
  rule_id: string;
  trigger: RegExp | string;
  baseline_severity: ConsequenceSeverity;
  domain_profile_id?: string;          // omitted = universal rule
  reason: string;
};

// Universal rules (always active)
const UNIVERSAL_SEVERITY_RULES: SeverityBaselineRule[] = [
  { rule_id: "sev_missed_deadline", trigger: /(?:missed|late|untimely)\s+(?:deadline|filing|submission)/i,
    baseline_severity: "critical", reason: "missed_deadline" },
  { rule_id: "sev_wrong_recipient", trigger: /(?:sent|filed|served)\s+(?:to\s+)?(?:the\s+)?(?:wrong|incorrect)/i,
    baseline_severity: "critical", reason: "misdirected_communication" },
  { rule_id: "sev_data_loss", trigger: /(?:data\s+loss|deleted|lost\s+(?:file|document|work))/i,
    baseline_severity: "critical", reason: "data_loss" },
];

// Domain-specific rules registered by domain signal profiles
// securities_litigation would add: sanctions → critical, dismissed with prejudice → critical,
//   filing rejection → high, adverse ruling on dispositive motion → high
// software_development would add: production outage → critical, data breach → critical,
//   deploy failure → high, regression → high

function assessSeverity(
  chain: Partial<OutcomeChain>,
  llmAssessment: ConsequenceSeverity,
  activeProfiles: DomainSignalProfile[]
): { severity: ConsequenceSeverity; justification?: string } {
  const text = `${chain.outcome_summary ?? ""} ${chain.reasoning ?? ""} ${chain.consequence_description ?? ""}`;
  
  // Check all active rules (universal + domain-specific)
  const allRules = [...UNIVERSAL_SEVERITY_RULES,
    ...activeProfiles.flatMap(p => p.severity_baseline_rules ?? [])];
  
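  // Deterministic baseline = highest severity among matching rules.
  // It stays undefined when no rule matches.
  let baseline: ConsequenceSeverity | undefined;
  for (const rule of allRules) {
    const pattern = rule.trigger instanceof RegExp ? rule.trigger : new RegExp(rule.trigger, 'i');
    if (pattern.test(text) &&
        (baseline === undefined || severityRank(rule.baseline_severity) > severityRank(baseline))) {
      baseline = rule.baseline_severity;
    }
  }
  
  // No rule matched: there is no deterministic baseline, so the LLM assessment stands
  if (baseline === undefined) {
    return { severity: llmAssessment, justification: chain.severity_justification };
  }
  
  // LLM can move ±1 band from baseline, but ONLY with cited textual evidence
  const llmDelta = severityRank(llmAssessment) - severityRank(baseline);
  if (Math.abs(llmDelta) <= 1 && chain.severity_justification) {
    return { severity: llmAssessment, justification: chain.severity_justification };
  }
  return { severity: baseline };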
}

function severityRank(s: ConsequenceSeverity): number {
  return { none: 0, low: 1, medium: 2, high: 3, critical: 4 }[s];
}
```

**Step 2 — LLM refinement.** The extraction LLM may adjust severity ±1 band from the deterministic baseline, but ONLY with explicit textual evidence cited in `severity_justification`. If no justification is provided, the baseline stands.

#### 34A.1.4 When outcome chains are captured

Outcome chains are NOT captured on every `UsageEvent`. They are captured when:

1. **User explains an outcome in conversation.** "The MTD was denied because Judge Chen wanted price impact evidence." The conversation mining pipeline extracts this as an `OutcomeChain` with `source: "user_explained"`.

2. **A ruling or decision is extracted from a document.** Document viewer extraction (§20B.4) captures rulings as domain concepts. When a ruling references a prior approach, the extraction prompt links them into an outcome chain with `source: "ruling_extracted"`.

3. **A task or procedure fails with consequence.** Task execution (§20B.5) with `status: "failure"` produces an outcome chain with `source: "task_outcome"`. The `root_cause_class` is assessed from the failure type.

4. **CANDOR produces a finding.** Red team findings with `severity: "critical"` or `"major"` produce outcome chains with `source: "candor_finding"`.

5. **Agent analysis of an interaction.** After significant interactions, the reflection loop (DOC24 §11.3) may produce an outcome chain with `source: "agent_analyzed"` when the agent can identify what approach was taken and what outcome occurred.

**Partial chains are encouraged.** Missing `reasoning`, `arguments_applied`, or `counterfactual_hint` should not prevent chain creation. The `completeness_score` tracks how complete the chain is. Lesson creation from critical outcomes requires only severity + outcome. Cross-context synthesis requires `completeness_score >= 0.6` and `extraction_confidence >= 0.6`.
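
The two data-sufficiency gates in the paragraph above can be sketched as predicates. This is a minimal illustration under stated assumptions: `ChainQuality` is a hypothetical reduced view of `OutcomeChain`, the helper and constant names are invented here, and the `root_cause_class` gate on lesson creation is deliberately out of scope.

```ts
// Hypothetical reduced view of OutcomeChain with only the gating fields.
type ChainQuality = {
  consequence_severity: "critical" | "high" | "medium" | "low" | "none";
  outcome_summary: string;
  completeness_score: number;    // 0-1
  extraction_confidence: number; // 0-1
};

// Thresholds as stated in the text; names are hypothetical.
const SYNTHESIS_MIN_COMPLETENESS = 0.6;
const SYNTHESIS_MIN_CONFIDENCE = 0.6;

// Cross-context synthesis needs both scores at or above 0.6.
function eligibleForCrossContextSynthesis(c: ChainQuality): boolean {
  return c.completeness_score >= SYNTHESIS_MIN_COMPLETENESS &&
         c.extraction_confidence >= SYNTHESIS_MIN_CONFIDENCE;
}

// A critical outcome needs only severity plus an outcome summary.
function hasMinimumForCriticalLesson(c: ChainQuality): boolean {
  return c.consequence_severity === "critical" && c.outcome_summary.trim().length > 0;
}
```

A sparse critical chain can therefore feed lesson creation while still being excluded from synthesis.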

**Entity resolution in extraction:** The extraction prompt includes a known-concept context block (from the Graph Enhancement V2 contextual novelty gate, or a bounded list of top 20 active concept names + IDs from the current work context as fallback) so the LLM can resolve mentioned concepts to existing node IDs:

```
If you recognize any of the following known concepts in the outcome,
reference them by ID:
{novelty_gate_concepts}
```

If resolution fails, `arguments_applied` is empty — the chain is still useful for lesson creation but not for concept-based synthesis.
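
The fallback path (a bounded list of concept names and IDs) can be sketched as follows. This is illustrative only: `renderKnownConceptBlock`, `resolveArgumentsApplied`, and `KnownConcept` are hypothetical names, the list is assumed to arrive already ordered by activity, and the post-filter that discards IDs the model was never offered is an added safeguard, not something this spec mandates.

```ts
type KnownConcept = { id: string; name: string };

const FALLBACK_CONCEPT_LIMIT = 20; // "top 20 active concept names + IDs"

// Renders the {novelty_gate_concepts} block for the extraction prompt.
function renderKnownConceptBlock(
  concepts: KnownConcept[],
  limit: number = FALLBACK_CONCEPT_LIMIT
): string {
  const lines = concepts.slice(0, limit).map(c => `- ${c.name} (id: ${c.id})`);
  return [
    "If you recognize any of the following known concepts in the outcome,",
    "reference them by ID:",
    ...lines,
  ].join("\n");
}

// Assumption: keep only IDs that were actually offered to the model.
// An empty result is valid; the chain is still usable for lesson creation.
function resolveArgumentsApplied(returnedIds: string[], offered: KnownConcept[]): string[] {
  const valid = new Set(offered.map(c => c.id));
  return returnedIds.filter(id => valid.has(id));
}
```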

#### 34A.1.5 Storage

The `outcome_chains` table is the SOLE CANONICAL STORE for outcome chain data. The `UsageEvent` stores only an optional `outcome_chain_id?: string` foreign key reference, NOT a copy of the chain data.

```sql
CREATE TABLE outcome_chains (
    chain_id TEXT PRIMARY KEY,
    anchor_node_id TEXT NOT NULL REFERENCES nodes(id) ON DELETE CASCADE,
    anchor_kind TEXT NOT NULL,
    usage_event_id TEXT,                            -- nullable FK
    approach_summary TEXT NOT NULL,
    arguments_applied JSON,                         -- string[] of concept node IDs
    procedures_used JSON,                           -- string[] of procedure node IDs
    tools_used JSON,                                -- string[] of tool/capability IDs
    outcome_type TEXT NOT NULL,
    outcome_summary TEXT NOT NULL,
    reasoning TEXT,
    ruling_entity_ref TEXT,
    consequence_severity TEXT NOT NULL,
    severity_justification TEXT,
    consequence_description TEXT,
    root_cause_class TEXT NOT NULL,
    counterfactual_hint TEXT,
    occurred_at DATETIME,
    work_context_id TEXT,
    work_context_phase TEXT,
    jurisdiction_scope JSON,                        -- string[]
    related_actor_ids JSON,                         -- string[]
    source TEXT NOT NULL,
    source_ref TEXT,
    extraction_confidence REAL NOT NULL DEFAULT 0.5,
    completeness_score REAL NOT NULL DEFAULT 0.0,
    superseded_by TEXT,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_oc_context ON outcome_chains(work_context_id, outcome_type);
CREATE INDEX idx_oc_severity ON outcome_chains(consequence_severity);
CREATE INDEX idx_oc_anchor ON outcome_chains(anchor_node_id);
CREATE INDEX idx_oc_root_cause ON outcome_chains(root_cause_class);
CREATE INDEX idx_oc_created ON outcome_chains(created_at DESC);
```

**Pre-computed concept index (rebuilt nightly by DOC8):**

```sql
-- Derived table for efficient cross-context concept queries
-- Rebuilt nightly. Not canonical — rebuildable from outcome_chains.
DROP TABLE IF EXISTS outcome_chain_concept_index;  -- lets the nightly rebuild recreate the table
CREATE TABLE outcome_chain_concept_index AS
SELECT oc.chain_id, oc.work_context_id, je.value AS concept_id,
       oc.outcome_type, oc.consequence_severity, oc.extraction_confidence
FROM outcome_chains oc, json_each(oc.arguments_applied) je
WHERE oc.arguments_applied IS NOT NULL;
CREATE INDEX idx_occi_concept ON outcome_chain_concept_index(concept_id);
CREATE INDEX idx_occi_context ON outcome_chain_concept_index(work_context_id);
```

**FK constraints:** `ON DELETE CASCADE` on `anchor_node_id` ensures chains don't outlive their anchor entity. `usage_event_id` is nullable — chains can exist without a linked usage event (e.g., extracted from a document).

#### 34A.1.6 Completeness scoring

```ts
function computeCompletenessScore(chain: Partial<OutcomeChain>): number {
  const fields = [
    { name: "approach_summary", weight: 0.15, filled: !!chain.approach_summary },
    { name: "arguments_applied", weight: 0.15, filled: (chain.arguments_applied?.length ?? 0) > 0 },
    { name: "reasoning", weight: 0.20, filled: !!chain.reasoning },
    { name: "counterfactual_hint", weight: 0.10, filled: !!chain.counterfactual_hint },
    { name: "related_actor_ids", weight: 0.10, filled: (chain.related_actor_ids?.length ?? 0) > 0 },
    { name: "jurisdiction_scope", weight: 0.05, filled: (chain.jurisdiction_scope?.length ?? 0) > 0 },
    { name: "work_context_phase", weight: 0.05, filled: !!chain.work_context_phase },
    { name: "occurred_at", weight: 0.05, filled: !!chain.occurred_at },
    { name: "consequence_description", weight: 0.10, filled: !!chain.consequence_description },
    { name: "severity_justification", weight: 0.05, filled: !!chain.severity_justification },
  ];
  return fields.reduce((sum, f) => sum + (f.filled ? f.weight : 0), 0);
}
```

#### 34A.1.7 Normative rules

1. The `outcome_chains` table is the sole canonical store. `UsageEvent` holds `outcome_chain_id` only.
2. Outcome chains are optional on `UsageEvent`. Most usage events have no chain.
3. Consequence severity uses deterministic baseline + bounded LLM ±1 band with cited justification.
4. Severity baseline rules are domain-profile-driven. Universal rules are always active.
5. `arguments_applied` links to existing `domain_concept` node IDs for cross-context queries.
6. Partial chains are valid. Missing fields reduce `completeness_score` but don't prevent creation.
7. `root_cause_class` gates lesson creation (§34A.2). Only `knowledge_gap` and `procedural_failure` auto-create.
8. DOC8 SHALL use outcome chains as primary input for lesson synthesis (§34A.2) and cross-context synthesis (§34A.4).
9. The extraction prompt MUST NOT speculate about `reasoning` or `counterfactual_hint`. If the source doesn't explain why, those fields are null.

#### 34A.1.8 Socratic probe for incomplete chains

DOC24 MAY allocate a daily clarification probe budget specifically for incomplete outcome chains: chains with a `consequence_severity` of `"medium"` or higher that are missing `reasoning` or `arguments_applied`. This probe uses the contextual probe cost and fires only when the user is working in the relevant work context.

The daily probe count is registered as a Category B tunable threshold in the governance registry (§7):

```ts
// Parameter: doc72.outcome_chain.daily_clarification_probes
// Category: B (DOC8-tunable)
// Default: 1
// Min: 0, Max: 3
// Feedback signal: probe acceptance rate × chain completeness improvement
// Adjustment step: 1
// Cooldown: 30 days
```
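
The registry entry above implies a bounded, cooldown-gated adjustment. A minimal sketch of that behavior follows; `TunableParameter` and `adjustParameter` are hypothetical names, and the rule that a no-op at a bound does not restart the cooldown is an assumption.

```ts
type TunableParameter = {
  key: string;
  value: number;
  min: number;
  max: number;
  step: number;
  cooldown_days: number;
  last_adjusted_at?: string; // ISO-8601 timestamp of the last applied change
};

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Moves the value one step up or down, clamped to [min, max],
// and refuses to move while the cooldown is still running.
function adjustParameter(p: TunableParameter, direction: 1 | -1, now: Date): TunableParameter {
  if (p.last_adjusted_at !== undefined) {
    const elapsedDays = (now.getTime() - new Date(p.last_adjusted_at).getTime()) / MS_PER_DAY;
    if (elapsedDays < p.cooldown_days) return p;
  }
  const next = Math.min(p.max, Math.max(p.min, p.value + direction * p.step));
  if (next === p.value) return p; // already at a bound; assumption: cooldown is not restarted
  return { ...p, value: next, last_adjusted_at: now.toISOString() };
}

// Values taken from the parameter comment block above.
const dailyProbes: TunableParameter = {
  key: "doc72.outcome_chain.daily_clarification_probes",
  value: 1, min: 0, max: 3, step: 1, cooldown_days: 30,
};
```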

---

### 34A.2 Lessons Learned as a First-Class Memory Type

#### 34A.2.1 Problem

When something goes wrong, the negative outcome is recorded in experience records and DOC8 detects patterns. But the SYNTHESIZED LESSON — the actionable takeaway — never becomes a named, retrievable, injectable entity.

#### 34A.2.2 New memory_directive subtype

Add `"lesson_learned"` to the memory_directive `memory_type` enumeration (DOC72 §4.4).

#### 34A.2.3 Lesson learned payload extension

```ts
type LessonLearnedPayload = {
  memory_type: "lesson_learned";
  
  // Lesson content
  lesson_summary: string;
  what_happened: string;
  why_it_happened?: string;
  what_to_do_instead?: string;
  
  // Source evidence
  source_outcome_chain_ids: string[];
  source_correction_ids?: string[];
  
  // Scope — when should this lesson be injected?
  applicable_contexts: LessonContext[];
  applicability_rule?: ApplicabilityRule;
  
  // Creation path
  creation_trigger: LessonCreationTrigger;
  
  // Governance
  auto_created: boolean;
  reviewed_by_user: boolean;
  
  // Lifecycle
  lesson_lifecycle_state: LessonLifecycleState;
};

type LessonContext = {
  context_type: "concept" | "entity_type" | "procedure" | "actor" | "work_context" | "global";
  context_ref?: string;
  context_description: string;
};

type ApplicabilityRule = {
  rule_id: string;
  source_node_id: string;
  applies_when: string[];
  does_not_apply_when: string[];
  support_count: number;
  contradiction_count: number;
  scope_evolution_history: Array<{
    changed_at: string;
    old_scope: string;
    new_scope: string;
    evidence_chain_id: string;
  }>;
};

type LessonCreationTrigger =
  | "single_critical_outcome"
  | "single_high_outcome"
  | "pattern_detected"
  | "aggregate_friction"
  | "user_taught"
  | "candor_critical_finding"
  | "correction_with_explanation";

type LessonLifecycleState =
  | "suggested"          // auto-created, awaiting user review
  | "active"             // user-reviewed or user-taught, actively injected
  | "watchlist"          // BDSM utility declining for 30+ days OR 3+ successful handlings
  | "superseded"         // user learned the skill, no longer makes the mistake
  | "archived";          // no longer relevant

// Lifecycle transitions:
// suggested → active: user confirms
// suggested → archived: user dismisses
// active → watchlist: BDSM utility declining 30+ days OR 3+ consecutive successes
// watchlist → superseded: continued success without the mistake
// watchlist → active: user makes the mistake again
// active → archived: matter closed, procedure changed, user explicitly dismisses
// any → archived: user explicitly archives
```
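
The transition comments above can be encoded as a lookup table so that illegal moves are rejected before a write. This is a sketch under assumptions: `LessonState`, `ALLOWED_TRANSITIONS`, and `canTransition` are hypothetical names, and `archived` is treated as terminal, so "any → archived" is read as applying to every non-archived state.

```ts
type LessonState = "suggested" | "active" | "watchlist" | "superseded" | "archived";

// One row per source state, listing the states it may move to.
const ALLOWED_TRANSITIONS: Record<LessonState, LessonState[]> = {
  suggested: ["active", "archived"],               // user confirms / dismisses
  active: ["watchlist", "archived"],               // utility declining or repeated success / archived
  watchlist: ["superseded", "active", "archived"], // continued success / mistake repeated / archived
  superseded: ["archived"],                        // covered by "any → archived"
  archived: [],                                    // assumption: terminal state
};

function canTransition(from: LessonState, to: LessonState): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}
```

The table checks only whether a move is structurally allowed; the triggering conditions (user confirmation, utility trend, success counts) would be evaluated separately.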

#### 34A.2.4 Lesson creation pipeline

**Path A — Immediate creation from critical/high-severity outcomes:**

```ts
function shouldCreateImmediateLesson(chain: OutcomeChain): boolean {
  // Infrastructure/external failures are not knowledge lessons
  if (["infrastructure_failure", "external_factor", "third_party_delay"]
      .includes(chain.root_cause_class)) return false;
  
  // Only knowledge gaps and procedural failures auto-create lessons
  if (!["knowledge_gap", "procedural_failure"].includes(chain.root_cause_class)) return false;
  
  if (chain.consequence_severity === "critical") return true;
  if (chain.consequence_severity === "high" && chain.reasoning) return true;
  
  // High-severity without reasoning: queue for user clarification
  if (chain.consequence_severity === "high" && !chain.reasoning) {
    if (!isSensitiveOutcome(chain.outcome_summary)) {
      queueClarificationProbe(chain);
    } else {
      // Sensitive outcomes: create partial lesson, defer probe to next session
      queueDeferredProbe(chain);
    }
    return false;
  }
  return false;
}

function isSensitiveOutcome(summary: string): boolean {
  const sensitivePatterns = [
    /sanction/i, /malpractice/i, /bar\s+complaint/i, /contempt/i,
    /terminated/i, /fired/i, /client\s+lost/i,
  ];
  return sensitivePatterns.some(p => p.test(summary));
}
```

For critical-severity outcomes, the lesson enters at `lesson_lifecycle_state: "suggested"` through the DOC1 Write Gate, and the notification surfaces immediately. The lesson candidate is created internally even when the user-facing probe is deferred for sensitivity reasons.

**Path B — DOC8 pattern-based synthesis from accumulated outcomes:**

For medium and low-severity outcomes, DOC8's nightly processing reads outcome chains and detects recurring patterns. BDSM pattern records (`repeated_correction`, `degradation`) that meet lesson criteria are forwarded to this pipeline as additional input — one pattern detector (DOC8), one lesson creator.

```ts
type LessonCandidateFromPattern = {
  pattern_id: string;
  outcome_chain_ids: string[];
  pattern_description: string;
  occurrence_count: number;
  distinct_work_contexts: number;
  aggregate_severity_score: number;    // SUM of severity weights
  proposed_lesson: {
    lesson_summary: string;
    what_to_do_instead?: string;
    applicable_contexts: LessonContext[];
  };
  confidence: number;
};
```

**Pattern detection thresholds (all DOC8-tunable via §7):**
- Occurrence count: default 3+ outcome chains with similar `outcome_type` + overlapping `arguments_applied` or `procedures_used`
- OR aggregate severity score ≥ 8, with weights medium = 2 and low = 1 (e.g., 4 low-severity failures score 4 and 2 medium + 2 low score 6, neither of which reaches the threshold; 4 medium score 8, which triggers a lesson)
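
The aggregate-severity arithmetic in the second bullet can be checked with a short sketch. The weights (medium = 2, low = 1) mirror the `sev_weight` CASE expression in the synthesis SQL; `aggregateSeverityScore`, `meetsAggregateThreshold`, and the default of 8 as a function argument are hypothetical names chosen for illustration, and the occurrence-count criterion is evaluated separately.

```ts
type PatternSeverity = "medium" | "low";

const SEVERITY_WEIGHT: Record<PatternSeverity, number> = { medium: 2, low: 1 };

// Sum of severity weights across the chains in one pattern bucket.
function aggregateSeverityScore(chains: PatternSeverity[]): number {
  return chains.reduce((sum, s) => sum + SEVERITY_WEIGHT[s], 0);
}

// Aggregate criterion only; minAggregate would come from the tunable registry.
function meetsAggregateThreshold(chains: PatternSeverity[], minAggregate: number = 8): boolean {
  return aggregateSeverityScore(chains) >= minAggregate;
}

const fourLow: PatternSeverity[] = ["low", "low", "low", "low"];              // score 4
const mixed: PatternSeverity[] = ["medium", "medium", "low", "low"];          // score 6
const fourMedium: PatternSeverity[] = ["medium", "medium", "medium", "medium"]; // score 8
```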

DOC8 lesson synthesis SQL:

```sql
WITH concept_refs AS (
  SELECT oc.chain_id, oc.work_context_id, je.value AS concept_id,
         oc.outcome_type, oc.consequence_severity, oc.reasoning,
         oc.counterfactual_hint, oc.created_at
  FROM outcome_chains oc
  JOIN json_each(oc.arguments_applied) je
  WHERE oc.arguments_applied IS NOT NULL
    AND oc.root_cause_class IN ('knowledge_gap', 'procedural_failure')
),
negative_medium_low AS (
  SELECT * FROM concept_refs
  WHERE outcome_type IN ('failure', 'rejection', 'correction_needed')
    AND consequence_severity IN ('medium', 'low')
    AND created_at > datetime('now', '-90 days')
),
bucketed AS (
  SELECT chain_id, work_context_id, concept_id,
         LOWER(TRIM(COALESCE(reasoning, 'unspecified'))) AS reason_key,
         counterfactual_hint,
         CASE consequence_severity WHEN 'medium' THEN 2 WHEN 'low' THEN 1 ELSE 0 END AS sev_weight
  FROM negative_medium_low
),
pattern_candidates AS (
  SELECT concept_id, reason_key,
         COUNT(*) AS occurrence_count,
         COUNT(DISTINCT work_context_id) AS distinct_context_count,
         SUM(sev_weight) AS aggregate_severity_score,
         json_group_array(chain_id) AS source_chain_ids,
         json_group_array(counterfactual_hint) AS counterfactual_hints
  FROM bucketed
  GROUP BY concept_id, reason_key
  HAVING COUNT(*) >= :min_occurrences OR SUM(sev_weight) >= :min_aggregate_severity
)
SELECT pc.*, n.canonical_name AS concept_name
FROM pattern_candidates pc
JOIN nodes n ON n.id = pc.concept_id
ORDER BY pc.aggregate_severity_score DESC, pc.occurrence_count DESC
LIMIT 10;
```

DOC8 then rejects buckets with low semantic coherence, scores the remaining candidates, and converts the top-N into `lesson_learned` candidates through DOC1 Write Gate, using a canary window on delivery force.

**Path C — User-taught lessons:**

The user explicitly states a lesson. The conversation mining pipeline detects lesson-language patterns ("lesson learned," "never again," "from now on avoid," "the takeaway is," "I should have") and creates a `lesson_learned` with `creation_trigger: "user_taught"` and `auto_created: false`. User-taught lessons enter directly at `active` maturity — the user IS the authority.

**Path D — CANDOR critical findings:**

CANDOR findings with `severity: "critical"` produce `lesson_learned` candidates with `creation_trigger: "candor_critical_finding"`. Enter at `suggested` state.

#### 34A.2.5 Lesson scope inference

Initial scope is the NARROWEST supported by the source chain's data. Broadening requires evidence.

```ts
function inferLessonContexts(chain: OutcomeChain): LessonContext[] {
  const contexts: LessonContext[] = [];
  
  // Linked concepts — most valuable scope (fires when concept is used anywhere)
  if (chain.arguments_applied?.length) {
    for (const conceptId of chain.arguments_applied) {
      contexts.push({
        context_type: "concept", context_ref: conceptId,
        context_description: `When using concept ${conceptId}`,
      });
    }
  }
  
  // Specific actors involved
  if (chain.related_actor_ids?.length) {
    for (const actorId of chain.related_actor_ids) {
      contexts.push({
        context_type: "actor", context_ref: actorId,
        context_description: `When interacting with actor ${actorId}`,
      });
    }
  }
  
  // Global scope is added only when the chain is neither jurisdictionally bounded nor actor-specific
  if (!chain.jurisdiction_scope?.length && !chain.related_actor_ids?.length) {
    contexts.push({ context_type: "global", context_description: "General practice lesson" });
  }
  
  return contexts;
}
```

Scope evolution via `ApplicabilityRule`: when the same pattern appears in a new context, the scope broadens. When the lesson is overridden in a specific context, an exception is added to `does_not_apply_when`.

#### 34A.2.6 Interaction with CorrectionEvent pipeline

A single failure may produce BOTH:
- A `CorrectionEvent` (DOC72 §10.5) adjusting confidence on existing knowledge
- A `lesson_learned` providing new actionable guidance

These are complementary, not competing. **Ordering:** correction pipeline runs first (adjusts α/β), then lesson synthesis runs second (creates new guidance node), linked by `source_correction_ids`.

#### 34A.2.7 Injection policy

Lessons are injected with `primary_tag: "caution"`.
- `hedge_mode`: `"state_as_fact"` (user-reviewed) or `"hedged"` (auto-created, not yet reviewed)
- `force_level`: `"strong"` initial for user-taught, `"standard"` for auto-created. BDSM adjusts.

#### 34A.2.8 Adversarial safeguards

1. A single user-taught lesson cannot reach `force_level: "strong"` without at least 3 separate supporting interactions.
2. Before promoting a lesson to `active`, verify it doesn't contradict existing `active` lessons or `[enforce]`-tagged constraints.
3. Never upgrade to `strong` from one user-taught heuristic alone without corroboration from at least one outcome chain or correction event.

#### 34A.2.9 Edges

```
lesson_learned → learned_from → anchor entity (work_product, domain_concept, procedure)
lesson_learned → applies_to_concept → domain_concept
lesson_learned → applies_to_actor → world_entity
lesson_learned → applies_to_procedure → procedure
```

---

### 34A.3 Actor Behavioral Profiles as a Prioritized Extraction Target

#### 34A.3.1 Problem

The actor domain overlay (DOC72 §4.6) stores `domain_notes` per actor per domain. But the extraction pipeline doesn't systematically look for behavioral characterizations.

#### 34A.3.2 Behavioral pattern signals

Add generic behavioral pattern signals to the universal base signals (DOC72 §35.1):

```ts
const UNIVERSAL_BEHAVIORAL_SIGNALS: HighValuePattern[] = [
  // Generic tendency detection — domain-agnostic
  { pattern: "(?:(?:he|she|they|[A-Z][a-z]+(?:\\s[A-Z][a-z]+)?)\\s+(?:tends?|usually|typically|always|never|prefers?|dislikes?|requires?|expects?|is\\s+(?:known|likely|unlikely)))",
    pattern_name: "actor_tendency", weight: 0.12 },
];
```

Legal-specific role patterns (judge, opposing counsel, etc.) belong in the `securities_litigation` domain signal profile, not in universal signals. Each domain profile adds its own actor-role vocabulary.
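
A short usage sketch shows what the `actor_tendency` pattern does and does not match. The pattern string is copied from the block above; `matchActorTendency` is a hypothetical helper, and the sample sentences are invented for illustration.

```typescript
// Pattern copied verbatim from UNIVERSAL_BEHAVIORAL_SIGNALS.
const ACTOR_TENDENCY =
  "(?:(?:he|she|they|[A-Z][a-z]+(?:\\s[A-Z][a-z]+)?)\\s+(?:tends?|usually|typically|always|never|prefers?|dislikes?|requires?|expects?|is\\s+(?:known|likely|unlikely)))";

// Hypothetical helper: returns the matched span, or null when the signal does not fire.
function matchActorTendency(text: string): string | null {
  const m = new RegExp(ACTOR_TENDENCY).exec(text);
  return m ? m[0] : null;
}

const named = matchActorTendency("Judge Chen prefers concise briefs");    // "Judge Chen prefers"
const pronoun = matchActorTendency("they usually respond within a day");   // "they usually"
const none = matchActorTendency("The filing was submitted on Friday");     // null
```

Because the pattern has no word-boundary anchors, the `he` alternative can also match inside a longer word (for example, `"the never"` yields `"he never"`). If that proves noisy in practice, prefixing the pattern with `\b` is one possible tightening; that change is a suggestion, not part of the spec.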

#### 34A.3.3 Actor domain overlay note structure

```ts
type ActorDomainNote = {
  note: string;
  note_type: AllowedActorNoteClass;
  confidence: number;
  source: string;
  evidence_count: number;
  last_observed_at: string;
  outcome_chain_refs?: string[];
};

type AllowedActorNoteClass =
  | "workflow_preference"
  | "scheduling_habit"
  | "communication_pattern"
  | "decision_tendency"
  | "negotiation_style"
  | "procedural_preference";

type ForbiddenActorNoteClass =
  | "protected_characteristic"
  | "health_or_mental_state"
  | "physical_appearance"
  | "private_life"
  | "broad_competence_label"
  | "personality_diagnosis";
```

**Dual gate:** The extraction pipeline MUST reject notes matching forbidden classes. It MUST ALSO reject notes whose `note_type` doesn't map to an allowed class, even if no forbidden pattern matched. The allowlist is the primary gate; the forbidden list is the safety net.
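
A minimal sketch of the dual gate follows. It assumes an upstream classifier has already produced the list of forbidden classes a note matched; `applyDualGate`, `GateDecision`, and the `matchedForbiddenClasses` parameter are hypothetical names introduced for illustration.

```typescript
const ALLOWED_NOTE_CLASSES: ReadonlySet<string> = new Set([
  "workflow_preference", "scheduling_habit", "communication_pattern",
  "decision_tendency", "negotiation_style", "procedural_preference",
]);

type GateDecision = {
  accepted: boolean;
  reason?: "forbidden_class_matched" | "note_type_not_allowed";
};

// Hypothetical gate: a note must clear BOTH checks to be accepted.
function applyDualGate(noteType: string, matchedForbiddenClasses: string[]): GateDecision {
  // Safety net: any forbidden-class hit rejects the note outright.
  if (matchedForbiddenClasses.length > 0) {
    return { accepted: false, reason: "forbidden_class_matched" };
  }
  // Primary gate: the note_type must be on the allowlist.
  if (!ALLOWED_NOTE_CLASSES.has(noteType)) {
    return { accepted: false, reason: "note_type_not_allowed" };
  }
  return { accepted: true };
}
```

The second check is what makes the allowlist primary: a note with an unrecognized `note_type` is rejected even when nothing on the forbidden list matched.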

#### 34A.3.4 Note consolidation

Actor behavioral notes are consolidated via the compiled truth mirror (Graph Enhancement V2 §4). The mirror's `buildKnowledgeSummaryRenderInput()` reads all `domain_notes` for an actor entity, groups by `note_type`, and produces a synthesized behavioral profile summary. Individual raw notes are retained as supporting evidence but not injected individually. The synthesized summary is what gets injected.

#### 34A.3.5 Actor patterns from outcome chains

A nightly DOC8 job detects actor-level patterns from accumulated outcome chains using `related_actor_ids`:

```ts
type ActorPatternDetection = {
  actor_node_id: string;
  pattern: string;
  evidence_chain_ids: string[];
  observation_count: number;
  consistency: number;
};
```

This produces evidence-based behavioral notes (more reliable than regex-based detection).

#### 34A.3.6 Normative rules

1. Behavioral characterization extraction triggers from universal base signals. Domain-specific role patterns live in domain signal profiles.
2. Single-observation notes enter at `confidence: 0.5`. Multi-observation notes accumulate through α/β.
3. Extraction uses a dual gate: allowed class allowlist + forbidden class blacklist.
4. Actor behavioral notes are NEVER candidates for DOC1 type-level generalization. "Judge Chen prefers concise briefs" stays on Judge Chen. It never becomes "judges prefer concise briefs."
5. Notes linked to outcome chains gain higher injection priority (evidence of real-world consequence).

---

### 34A.4 Cross-Context Pattern Synthesis

#### 34A.4.1 Problem

After accumulating outcome chains across multiple work contexts, the system should synthesize patterns: "In 4 out of 5 securities cases, the corrective disclosure argument succeeded."

#### 34A.4.2 Synthesis job

A nightly DOC8 job queries outcome chains across work contexts for shared concepts.

```ts
type CrossContextSynthesis = {
  synthesis_id: string;
  pattern_description: string;
  pattern_type: "argument_effectiveness" | "procedural_reliability" | "actor_behavior_consistency" |
                "approach_comparison" | "risk_pattern";
  supporting_chains: Array<{
    outcome_chain_id: string;
    work_context_id: string;
    work_context_name: string;
    outcome_type: OutcomeType;
    supports_pattern: boolean;
  }>;
  total_observations: number;
  success_rate?: number;
  success_rate_lower_bound?: number;   // Wilson lower bound
  distinct_work_contexts: number;
  actionable_insight?: string;
  risk_if_ignored?: string;
  node_kind: "domain_concept";
  source: "cross_context_synthesis";
  confidence: number;
  created_at: string;
};
```

#### 34A.4.3 Confidence scoring with sample-size penalty

```ts
function computeSynthesisConfidence(input: {
  total_observations: number;
  consistency: number;
  distinct_contexts: number;
  success_rate: number;
}): number {
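  // Sample-size penalty: linear ramp from 0 at n=2 to 1.0 at n=10, flat above
  const samplePenalty = Math.min(1.0, (input.total_observations - 2) / 8);
  
  // Context diversity bonus: +0.05 per context beyond 2, capped at 0.15
  const contextBonus = Math.min(0.15, (input.distinct_contexts - 2) * 0.05);
  
  // Wilson lower bound (conservative estimate) for a proportion p over n observations
  const z = 1.96; // 95% confidence
  const n = input.total_observations;
  if (n <= 0) return 0; // no observations: avoids dividing by zero below
  const wilsonLower = (p: number): number =>
    (p + z*z/(2*n) - z * Math.sqrt((p*(1-p) + z*z/(4*n)) / n)) / (1 + z*z/n);
  
  // How extreme the skew is: bound the success rate and the failure rate separately
  // and keep the larger. Taking 1 - wilsonLower(rate) would score a 50/50 split as a
  // strong pattern, because the complement of a lower bound is not conservative.
  const rate = input.success_rate;
  const adjustedConsistency = Math.max(wilsonLower(rate), wilsonLower(1 - rate));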
  
  return Math.max(0, Math.min(1.0,
    adjustedConsistency * samplePenalty + contextBonus
  ));
}
```

Default confidence threshold for synthesis creation: **0.55** (DOC8-tunable).
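
To make the sample-size effect concrete, the sketch below evaluates the Wilson lower bound on its own for the same observed success rate at two sample sizes. `wilsonLowerBound` is a hypothetical standalone helper using the same formula as the function above; the input counts are invented.

```typescript
// Wilson score lower bound for `successes` out of `n` trials (z = 1.96 for ~95%).
function wilsonLowerBound(successes: number, n: number, z = 1.96): number {
  if (n <= 0) return 0; // no observations: nothing can be claimed
  const p = successes / n;
  const z2 = z * z;
  const centre = p + z2 / (2 * n);
  const margin = z * Math.sqrt((p * (1 - p) + z2 / (4 * n)) / n);
  return (centre - margin) / (1 + z2 / n);
}

// Same 80% raw success rate, very different conservative estimates.
const small = wilsonLowerBound(4, 5);    // roughly 0.38
const large = wilsonLowerBound(40, 50);  // roughly 0.67
```

With only five observations the bound stays below 0.4 even though the raw ratio is 0.8, which is why the raw ratio is not used for scoring.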

#### 34A.4.4 Synthesis SQL

```sql
WITH concept_links AS (
  SELECT oc.chain_id, oc.work_context_id, je.value AS concept_id,
         oc.outcome_type, oc.consequence_severity, oc.reasoning,
         oc.counterfactual_hint, oc.created_at
  FROM outcome_chains oc
  JOIN json_each(oc.arguments_applied) je
  WHERE oc.arguments_applied IS NOT NULL
    AND oc.extraction_confidence >= 0.6
    AND oc.completeness_score >= 0.6
),
concept_stats AS (
  SELECT concept_id,
         COUNT(*) AS total_observations,
         COUNT(DISTINCT work_context_id) AS distinct_work_contexts,
         SUM(CASE WHEN outcome_type IN ('success', 'partial_success', 'unexpected_positive')
                  THEN 1 ELSE 0 END) AS success_count,
         SUM(CASE WHEN outcome_type IN ('failure', 'rejection', 'correction_needed')
                  THEN 1 ELSE 0 END) AS failure_count
  FROM concept_links
  GROUP BY concept_id
  HAVING COUNT(*) >= :min_observations
     AND COUNT(DISTINCT work_context_id) >= :min_contexts
),
supporting_chains AS (
  SELECT cl.concept_id,
         json_group_array(json_object(
           'chain_id', cl.chain_id, 'work_context_id', cl.work_context_id,
           'outcome_type', cl.outcome_type, 'severity', cl.consequence_severity,
           'reasoning', cl.reasoning, 'counterfactual_hint', cl.counterfactual_hint,
           'created_at', cl.created_at
         )) AS supporting_chain_bundle
  FROM concept_links cl
  GROUP BY cl.concept_id
)
SELECT cs.concept_id, n.canonical_name AS concept_name,
       cs.total_observations, cs.distinct_work_contexts,
       cs.success_count, cs.failure_count,
       ROUND(1.0 * cs.success_count / NULLIF(cs.success_count + cs.failure_count, 0), 3) AS success_rate,
       sc.supporting_chain_bundle
FROM concept_stats cs
JOIN nodes n ON n.id = cs.concept_id
JOIN supporting_chains sc ON sc.concept_id = cs.concept_id
ORDER BY cs.distinct_work_contexts DESC, cs.total_observations DESC;
```

#### 34A.4.5 Cluster prioritization

When more than 5 clusters are eligible, prioritize by: `severity_weighted_support × context_diversity × recency × novelty × (1 - contradiction_penalty)`. Process in priority order, cap at 5 per night. Unprocessed clusters re-evaluate next night.
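
The prioritization formula can be sketched as a score plus a capped sort. The field names come from the formula above; `ClusterScoreInput`, `clusterPriority`, and `selectClustersForTonight` are hypothetical, and the sketch assumes each factor is already normalized and that `contradiction_penalty` lies in [0, 1].

```typescript
type ClusterScoreInput = {
  cluster_id: string;
  severity_weighted_support: number;
  context_diversity: number;
  recency: number;
  novelty: number;
  contradiction_penalty: number; // assumed to lie in [0, 1]
};

// Product of the five factors named in the prioritization rule.
function clusterPriority(c: ClusterScoreInput): number {
  return c.severity_weighted_support * c.context_diversity * c.recency *
         c.novelty * (1 - c.contradiction_penalty);
}

// Highest priority first, capped per night; the rest wait for the next run.
function selectClustersForTonight(clusters: ClusterScoreInput[], cap = 5): ClusterScoreInput[] {
  return [...clusters]
    .sort((a, b) => clusterPriority(b) - clusterPriority(a))
    .slice(0, cap);
}
```

Because the score is a product, a cluster with a contradiction penalty of 1 scores zero regardless of how much support it has.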

#### 34A.4.6 Synthesis invalidation

When a new outcome chain shares concept IDs with an existing synthesis, check for contradiction. If the new chain's outcome contradicts the synthesis pattern, recompute confidence. If support drops below threshold, flag for re-evaluation.

#### 34A.4.7 Conflict-aware rendering with lessons

When DOC24 detects concept overlap between a `lesson_learned` and a `domain_concept` (source: synthesis) during packet assembly, render as a composed conflict-aware card: "Base pattern says X across contexts; known failure mode is Y when condition Z holds." This uses DOC24's existing tension-aware injection policy. Lessons are the exception structure around a broader synthesis.

#### 34A.4.8 Output

Synthesis results are written as `domain_concept` nodes with `source: "cross_context_synthesis"` through DOC1 Write Gate at `suggested` state. BDSM tracks injection utility. If utility declines, BDSM reduces `force_level` to `avoid` (no proactive injection, but available for on-demand retrieval). Synthesis nodes are NOT tombstoned on low utility — they're expensive to recompute.

#### 34A.4.9 Future enhancement

Analogical reasoning via embedding-based similarity over outcome chain summaries would augment concept matching with vector similarity for broader analogical discovery. It is deferred to the next proposal; the `outcome_chains` table and its indexes are designed to support it.

#### 34A.4.10 Normative rules

1. Synthesis runs nightly after DOC72 cleanup and after new outcome chains are committed.
2. Bounded: max 5 clusters/night, max 3 patterns/cluster.
3. Requires `completeness_score >= 0.6` and `extraction_confidence >= 0.6` on source chains.
4. Uses Wilson lower bound for success rate, not raw ratio.
5. Confidence threshold 0.55 (DOC8-tunable).
6. Synthesis nodes are derived — rebuildable from outcome chains.

---

### 34A.5 Implicit Preference Acceleration

#### 34A.5.1 Problem

DOC1's generalization engine detects repeated patterns, but the detected patterns wait in the suggestions inbox. The system should learn and apply safe preferences faster.

#### 34A.5.2 Design: DOC1 fast path

Implicit preference acceleration is a SPECIALIZED FAST PATH within DOC1's existing generalization pipeline, not a standalone subsystem. It runs THROUGH DOC1's Write Gate. Before creating a new preference, check existing DOC1 memory directives for the same behavior pattern — if one exists, upgrade it to `auto_applied: true`.

#### 34A.5.3 Auto-applicable preference classification

```ts
type AutoApplicablePreferenceClass =
  | "ui_navigation_default"
  | "output_format_default"
  | "display_format_default"
  | "default_tool_selection_local"      // "Word for drafting" — local, reversible only
  | "ui_layout_preference"
  | "search_behavior_default";          // "always search Henderson first"

// REMOVED from auto-applicable (external consequences):
// - communication_channel_default (affects how others receive communications)
// - workflow_entry_point (too many contextual exceptions)
// - default_tool_selection_collaborative (shared docs, co-counsel involvement)

type NeverAutoApplicableClass =
  | "content_decision"
  | "communication_content"
  | "communication_channel_default"     // moved here from auto-applicable
  | "external_action"
  | "standing_procedure"
  | "scope_change"
  | "default_tool_selection_collaborative";
```

#### 34A.5.4 Safety boundary

Auto-applicable preferences are ONLY those where:
1. The outcome is visible and immediately reversible.
2. No external consequence.
3. The preference is about HOW, not WHAT.

`default_tool_selection_local` checks for collaboration signals before auto-applying:

```ts
function isCollaborativeContext(workContextId: string): boolean {
  // Check for shared folder roots, co-counsel entities, external collaborators
  const collaborators = db.all(`
    SELECT 1 FROM edges e
    JOIN nodes n ON e.target_id = n.id
    WHERE e.source_id = ? AND e.relation_type IN ('co_counsel', 'shared_with', 'collaborated_on')
    AND n.is_active = 1 LIMIT 1
  `, [workContextId]);
  return collaborators.length > 0;
}
```

#### 34A.5.5 Detection and application

```ts
type ImplicitPreferenceCandidate = {
  candidate_id: string;
  preference_class: AutoApplicablePreferenceClass;
  description: string;
  observation_count: number;
  distinct_session_count: number;
  last_observed_at: string;
  first_observed_at: string;
  auto_apply_eligible: boolean;
  auto_applied: boolean;
  auto_applied_at?: string;
  override_count: number;
  override_rate: number;
  excluded_contexts?: string[];
};

function shouldAutoApply(candidate: ImplicitPreferenceCandidate): boolean {
  if (candidate.distinct_session_count < 3) return false;   // DOC8-tunable
  if (!isAutoApplicableClass(candidate.preference_class)) return false;
  // Override rate check — not raw count
  if (candidate.observation_count >= 5) {
    if (candidate.override_count / candidate.observation_count > 0.25) return false;
  }
  return true;
}
```

#### 34A.5.6 Auto-application with undo

1. Applied silently.
2. Notification: "I noticed you prefer opening documents in tabs — doing that now. [Undo] [Not always]"
3. "Not always" → demoted to `suggested` for manual review.
4. Auto-applied preferences appear in "Recent Learning" section.

#### 34A.5.7 Override clustering

When overrides are detected, DOC8 analyzes context distribution:
- If >75% of overrides cluster in a single work context → split into base preference + `excluded_contexts` list
- If overrides are uniformly distributed → preference is being abandoned, demote
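
One way to sketch the override analysis is shown below. The 75% cluster share comes from the rule above; everything else is an assumption made for illustration: the names `classifyOverrides` and `OverrideDecision`, the `hold` outcome for cases that fit neither rule, and the test for "uniformly distributed" (no context holding more than 1.5 times an even share).

```typescript
type OverrideDecision =
  | { action: "split"; excluded_contexts: string[] }
  | { action: "demote" }
  | { action: "hold" }; // assumed outcome when neither rule applies

// Input: override counts keyed by work context ID.
function classifyOverrides(
  overridesByContext: Record<string, number>,
  clusterShare = 0.75,
): OverrideDecision {
  const entries = Object.entries(overridesByContext).filter(([, n]) => n > 0);
  const total = entries.reduce((sum, [, n]) => sum + n, 0);
  if (total === 0) return { action: "hold" };

  const [topContext, topCount] = entries.reduce((best, e) => (e[1] > best[1] ? e : best));
  const topShare = topCount / total;

  // Rule 1: overrides concentrated in one context become an exclusion.
  if (topShare > clusterShare) return { action: "split", excluded_contexts: [topContext] };

  // Rule 2 (assumed definition of "uniform"): top share within 1.5x of an even share.
  const evenShare = 1 / entries.length;
  if (topShare <= 1.5 * evenShare) return { action: "demote" };

  return { action: "hold" };
}
```

The uniformity test is the part most likely to need tuning; a statistical dispersion measure would be a reasonable alternative.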

#### 34A.5.8 Normative rules

1. Auto-application limited to the defined `AutoApplicablePreferenceClass` set. Expansion requires spec review.
2. Auto-applied preferences are `memory_directive` nodes with `auto_applied: true` flag.
3. Single toggle: Settings > Knowledge & Learning > "Auto-apply observed preferences." Default: ON.
4. BDSM tracks outcomes. DOC8 tunes session count and override thresholds.
5. Cross-session evidence: 3+ distinct sessions required (DOC8-tunable).

---

### 34A.6 Implication Detection on Significant Changes

#### 34A.6.1 Problem

DOC72's change propagation (§16.4) updates structural dependencies but doesn't reason about logical implications.

#### 34A.6.2 Two-tier architecture

**Tier 1 — Deterministic structural implications (inline, no LLM).**

Traverse 1-hop graph edges from the changed entity. Identify affected standing procedures, linked obligations, and deadline dependencies. These are factual consequences, not speculation. Urgency can be `immediate`.

**Tier 2 — Bounded LLM inference (async, Tier 3 priority).**

For changes that pass a significance threshold, queue a bounded LLM call to reason about non-obvious implications. These are speculative and may only be `soon` or `background`. Default tag: `[consider]`, never `[caution]` or `[enforce]`.

#### 34A.6.3 Trigger filter

```ts
type ImplicationTrigger =
  | "deadline_change"
  | "work_context_phase_transition"     // domain-agnostic
  | "key_role_change"
  | "external_decision_received"        // domain-agnostic
  | "new_authority_affects_concept"
  | "provider_capability_change"
  | "obligation_status_change";

type ChangeEventEnvelope = {
  node_id: string;
  node_kind: string;
  change_record: ChangeRecord;
  work_context_ids: string[];
};

function shouldTriggerImplicationCheck(event: ChangeEventEnvelope): ImplicationTrigger | null {
  const fields = event.change_record.changed_fields;
  
  if (fields.some(f => ["deadline", "due_date", "trial_date"].includes(f))) {
    return "deadline_change";
  }
  if (event.node_kind === "world_entity" && fields.includes("status")) {
    return "work_context_phase_transition";
  }
  // Key role changes: check for typed edge deltas with domain-profile role vocabulary
  if (event.node_kind === "world_entity" && fields.includes("role_edges_changed")) {
    return "key_role_change";
  }
  if (event.node_kind === "domain_concept" && event.change_record.change_type === "created"
      && fields.includes("authority_type")) {
    return "external_decision_received";
  }
  return null;
}
```

**Role-change detection uses typed edge deltas** (not substring matching). When a `world_entity` person node gains or loses an edge with `relation_type` in the domain profile's role vocabulary (e.g., `adjudicated_by`, `represented_by`), that constitutes a role change.
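
A typed edge delta can be computed by comparing the role-vocabulary edges before and after a change. The sketch below is an assumption about shape only: `Edge`, `edgeKey`, and `detectRoleEdgeDelta` are hypothetical names, and the role vocabulary is passed in rather than read from a domain profile.

```typescript
type Edge = { source_id: string; target_id: string; relation_type: string };

// Stable identity for an edge, used for set membership.
function edgeKey(e: Edge): string {
  return `${e.source_id}|${e.relation_type}|${e.target_id}`;
}

// Returns role-vocabulary edges that were added or removed between two snapshots.
function detectRoleEdgeDelta(
  before: Edge[],
  after: Edge[],
  roleVocabulary: ReadonlySet<string>,
): { gained: Edge[]; lost: Edge[] } {
  const isRole = (e: Edge): boolean => roleVocabulary.has(e.relation_type);
  const beforeKeys = new Set(before.filter(isRole).map(edgeKey));
  const afterKeys = new Set(after.filter(isRole).map(edgeKey));
  return {
    gained: after.filter(e => isRole(e) && !beforeKeys.has(edgeKey(e))),
    lost: before.filter(e => isRole(e) && !afterKeys.has(edgeKey(e))),
  };
}
```

A non-empty `gained` or `lost` list would correspond to the `role_edges_changed` field checked in `shouldTriggerImplicationCheck`; edges outside the vocabulary are ignored entirely.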

#### 34A.6.4 Implication result schema

```ts
type ImplicationResult = {
  implication_id: string;
  trigger_event: string;
  trigger_node_id: string;
  work_context_id: string;
  tier: "structural" | "inferred";
  implications: Implication[];
  source: "implication_detection";
  detected_at: string;
};

type Implication = {
  description: string;
  affected_entity_ids: string[];
  urgency: "immediate" | "soon" | "background";
  action_suggested?: string;
  confidence: number;
};
```

#### 34A.6.5 Rate limiting

- Max 1 `immediate` implication card per session.
- Max 3 Tier 2 (LLM-inferred) implications per work context per week.
- Dedup by trigger family.
- BDSM suppresses low-utility trigger types over time.

#### 34A.6.6 Feedback taxonomy

```ts
type ImplicationFeedbackOutcome =
  | "acted_on"
  | "acknowledged_no_action"
  | "already_known"
  | "not_relevant"
  | "dismissed_wrong"
  | "timed_out_unobserved";
```

#### 34A.6.7 Output routing

- `"immediate"` (Tier 1 only): injected as `[caution]` card on next interaction + Q notification.
- `"soon"`: suggestions inbox with priority badge + next weekly digest.
- `"background"`: weekly digest only.
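
The routing rules can be sketched as a small lookup. The surface names (`caution_card`, `q_notification`, `suggestions_inbox`, `weekly_digest`) and the function `routeImplication` are hypothetical labels for the destinations listed above. Downgrading an `immediate` Tier 2 result to `soon` is an assumption about how the "Tier 1 only" restriction might be enforced.

```typescript
type Urgency = "immediate" | "soon" | "background";
type ImplicationTier = "structural" | "inferred";
type Surface = "caution_card" | "q_notification" | "suggestions_inbox" | "weekly_digest";

function routeImplication(urgency: Urgency, tier: ImplicationTier): Surface[] {
  // Assumed enforcement: inferred (Tier 2) results are never delivered as immediate.
  const effective: Urgency =
    urgency === "immediate" && tier === "inferred" ? "soon" : urgency;

  switch (effective) {
    case "immediate":
      return ["caution_card", "q_notification"];
    case "soon":
      return ["suggestions_inbox", "weekly_digest"];
    case "background":
      return ["weekly_digest"];
  }
}
```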

#### 34A.6.8 Normative rules

1. Tier 1 is deterministic graph traversal (no LLM, no hallucination risk). Urgency can be `immediate`.
2. Tier 2 is bounded LLM, async. Tag: `[consider]` only. Never `[caution]` or `[enforce]`. Max urgency: `soon`.
3. Implications are ephemeral suggestions, not durable graph nodes.
4. BDSM tracks which trigger types produce valuable implications.

---

### 34A.7 Threshold Governance Framework

#### 34A.7.1 Problem

The spec suite contains dozens of hardcoded numeric thresholds. Each is a reasonable initial guess, but none has been empirically validated and most are invisible to the user.

#### 34A.7.2 Registry scope

The threshold governance registry covers TUNABLE THRESHOLDS (Category B) and USER SETTINGS (Category C) only. Architectural invariants (single-writer enforcement, max cascade depth) and model-coupled constants (embedding dimension) are documented in their owner-doc sections, not in the parameter registry.

#### 34A.7.3 Registry schema

```ts
type GovernanceParameterEntry =
  | {
      kind: "tunable_threshold";
      parameter_id: string;
      category: "B";
      current_value: number;
      default_value: number;
      min_value?: number;
      max_value?: number;
      owner_doc: string;
      section_ref: string;
      description: string;
      feedback_signal: string;
      adjustment_step: number;
      adjustment_cooldown_days: number;
      tuning_family?: string;
    }
  | {
      kind: "user_setting";
      parameter_id: string;
      category: "C";
      current_value: string | number | boolean | string[];
      owner_doc: string;
      section_ref: string;
      description: string;
      settings_surface: string;
    };
```

#### 34A.7.4 Initial registry

**Category B — DOC8-tunable:**

| parameter_id | Default | Feedback signal | Step | Family |
|---|---|---|---|---|
| `doc72.extraction.deep_threshold` | 0.75 | extraction precision | ±0.05 | extraction |
| `doc72.extraction.shallow_threshold` | 0.45 | extraction precision | ±0.05 | extraction |
| `doc1.generalization.min_instances` | 3 | heuristic acceptance rate | ±1 | — |
| `doc72.lesson.pattern_min_occurrences` | 3 | lesson utility (acknowledged/dismissed) | ±1 | lesson_synthesis |
| `doc72.lesson.min_aggregate_severity` | 8 | lesson utility | ±2 | lesson_synthesis |
| `doc72.synthesis.min_observations` | 3 | synthesis utility (used/ignored) | ±1 | lesson_synthesis |
| `doc72.synthesis.confidence_threshold` | 0.55 | synthesis utility | ±0.05 | — |
| `doc72.preference.auto_apply.min_sessions` | 3 | override rate | ±1 | preference |
| `doc72.preference.override_rate_threshold` | 0.25 | user satisfaction | ±0.05 | preference |
| `doc72.sonar.cooldown_days` | 30 | sonar utility | ±7 | — |
| `doc72.cleanup.candidate_decay_days` | 14 | re-discovery rate | ±3 | cleanup |
| `doc72.cleanup.suggestion_expiry_days` | 30 | late acceptance validity | ±7 | cleanup |
| `doc72.procedure.half_life_days` | 90 | stale procedure success rate | ±15 | — |
| `doc72.outcome_chain.daily_clarification_probes` | 1 | probe acceptance × completeness improvement | ±1 | — |

**Category C — User-configurable:**

| parameter_id | Default | Settings location |
|---|---|---|
| `doc72.extraction.daily_deep_budget` | 20 | Settings > Knowledge & Learning > Extraction |
| `doc72.extraction.daily_cost_ceiling_usd` | 3 | Settings > Knowledge & Learning > Extraction |
| `doc72.preference.auto_apply.enabled` | true | Settings > Knowledge & Learning > Preferences |
| `doc72.domain_profiles.active_set` | per install | Settings > Knowledge & Learning > Domain Profiles |

#### 34A.7.5 Tuning families and coupling constraints

| Family | Members | Constraint |
|---|---|---|
| extraction | deep_threshold, shallow_threshold | deep - shallow ≥ 0.15 |
| cleanup | candidate_decay_days, suggestion_expiry_days | decay ≤ expiry |
| lesson_synthesis | pattern_min_occurrences, min_aggregate_severity, synthesis.min_observations | lesson threshold ≤ synthesis threshold |
| preference | auto_apply.min_sessions, override_rate_threshold | coupled behavior |

**DOC8 may only run canary trials for ONE family at a time.**
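
The coupling constraints can be checked with a small table of predicates, sketched below. `COUPLING_CHECKS` and `violatedFamilies` are hypothetical names. Reading "lesson threshold ≤ synthesis threshold" as `pattern_min_occurrences ≤ synthesis.min_observations` is an assumption, and the `preference` family is omitted because its constraint is not numeric.

```typescript
type ParamValues = Record<string, number>;

const EPS = 1e-9; // tolerance for floating-point subtraction

const COUPLING_CHECKS: Record<string, (v: ParamValues) => boolean> = {
  extraction: v =>
    v["doc72.extraction.deep_threshold"] - v["doc72.extraction.shallow_threshold"] >= 0.15 - EPS,
  cleanup: v =>
    v["doc72.cleanup.candidate_decay_days"] <= v["doc72.cleanup.suggestion_expiry_days"],
  // Assumed interpretation of "lesson threshold <= synthesis threshold".
  lesson_synthesis: v =>
    v["doc72.lesson.pattern_min_occurrences"] <= v["doc72.synthesis.min_observations"],
};

// Returns the families whose constraint would be violated by the given values.
function violatedFamilies(values: ParamValues): string[] {
  return Object.keys(COUPLING_CHECKS).filter(family => !COUPLING_CHECKS[family](values));
}

const REGISTRY_DEFAULTS: ParamValues = {
  "doc72.extraction.deep_threshold": 0.75,
  "doc72.extraction.shallow_threshold": 0.45,
  "doc72.cleanup.candidate_decay_days": 14,
  "doc72.cleanup.suggestion_expiry_days": 30,
  "doc72.lesson.pattern_min_occurrences": 3,
  "doc72.synthesis.min_observations": 3,
};
```

A proposed adjustment would be merged into a copy of the current values and passed through `violatedFamilies` before any trial begins; an empty result means the proposal respects every numeric constraint.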

#### 34A.7.6 DOC8 tuning protocol

```ts
type ParameterTuningDecision = {
  parameter_id: string;
  current_value: number;
  proposed_value: number;
  direction: "increase" | "decrease" | "hold";
  evidence: { feedback_signal_value: number; observation_count: number; confidence: number };
  decision: "apply" | "trial" | "hold";
};
```

1. DOC8 computes feedback signal nightly.
2. If suboptimal, propose adjustment within `adjustment_step`.
3. 7-day canary trial (reusing DOC72 §36.2 `ThresholdTrial`).
4. Auto-revert if trial worsens signal. Confirm if improves or holds.
5. `adjustment_cooldown_days` between adjustments per parameter (default 14).
6. Coupling constraints enforced before applying any adjustment.
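
Steps 2 and 5 of the protocol can be sketched as a function that proposes a bounded next value. The fields on `TunableParam` mirror the registry schema except `last_adjusted_at`, which is a hypothetical field assumed here so the cooldown can be checked; `proposeValue` is likewise an illustrative name.

```typescript
type TunableParam = {
  current_value: number;
  min_value?: number;
  max_value?: number;
  adjustment_step: number;
  adjustment_cooldown_days: number;
  last_adjusted_at?: string; // hypothetical: ISO timestamp of the last applied adjustment
};

const MS_PER_DAY = 86400000;

// Returns the proposed value, or null when no adjustment should be made.
function proposeValue(
  p: TunableParam,
  direction: "increase" | "decrease" | "hold",
  now: Date,
): number | null {
  if (direction === "hold") return null;

  // Respect the per-parameter cooldown.
  if (p.last_adjusted_at) {
    const daysSince = (now.getTime() - Date.parse(p.last_adjusted_at)) / MS_PER_DAY;
    if (daysSince < p.adjustment_cooldown_days) return null;
  }

  // Move by exactly one step, then clamp to the registered bounds.
  const raw = p.current_value + (direction === "increase" ? 1 : -1) * p.adjustment_step;
  const clamped = Math.min(p.max_value ?? Infinity, Math.max(p.min_value ?? -Infinity, raw));
  return clamped === p.current_value ? null : clamped;
}
```

The returned value would still need to pass the coupling constraints and a canary trial before being confirmed.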

#### 34A.7.7 Oscillation detector

```ts
type OscillationDetector = {
  parameter_id: string;
  recent_adjustments: Array<{ direction: "increase" | "decrease"; at: string }>;
  oscillation_detected: boolean;
};

// 3+ direction changes in 6 cycles → freeze at midpoint, extend cooldown to 30 days
function checkOscillation(detector: OscillationDetector): boolean {
  if (detector.recent_adjustments.length < 6) return false;
  const last6 = detector.recent_adjustments.slice(-6);
  let directionChanges = 0;
  for (let i = 1; i < last6.length; i++) {
    if (last6[i].direction !== last6[i-1].direction) directionChanges++;
  }
  return directionChanges >= 3;
}
```

#### 34A.7.8 Default classification for new thresholds

New thresholds enter as `unclassified` spec debt. They MUST be classified before the spec is considered complete. This creates a forcing function — no unclassified thresholds in normative text.

#### 34A.7.9 Transparency

Knowledge Manager Health Dashboard (DOC72 §41.8, Tab 1) includes a "System Parameters" section:
- All Category B parameters with current values and last adjustment date
- Trend indicator: DOC8's assessment
- Category C parameters with link to Settings

---

### 34A.8 Extraction Prompt

The following prompt block is added to DOC72 conversation/document extraction (§20.4):

```
You extract durable knowledge from user conversation and related artifacts into
strict structured outputs. Extract ONLY when the content is durable, scoped,
and useful beyond the immediate turn.

OUTPUT CLASSES
1. entities
2. relationships
3. memory_directives
4. procedures
5. obligations
6. goals
7. outcome_chains
8. behavioral_actor_notes
9. user_taught_lessons

GENERAL RULES
- Prefer abstaining over guessing.
- Do not globalize matter-specific knowledge.
- Do not infer protected traits, emotional states, health, or private-life details.
- For legal/authority-derived content, preserve the authority source and direct
  excerpt when available.
- If information is incomplete, output partial objects with explicit nulls;
  do not hallucinate missing fields.

OUTCOME CHAINS
Extract an outcome chain only for SUBSTANTIVE outcomes:
- ruling/decision/result/success/failure/correction with durable learning value
Do NOT extract routine successful operations.

For each outcome chain, capture:
- approach_summary
- arguments_applied (reference known concept IDs if possible: {novelty_gate_concepts})
- procedures_used (reference known procedure IDs if possible)
- tools_used (stable tool/action IDs if known)
- outcome_type (from: success, partial_success, failure, rejection,
  correction_needed, unexpected_positive, neutral)
- outcome_summary
- reasoning (ONLY if explicit or strongly supported. Do NOT speculate.)
- consequence_severity using this rubric:
  critical = real-world harm / missed deadline / wrong recipient / sanctions risk
  high = major rework / rejected filing / major strategy change
  medium = retry / moderate delay / formatting correction
  low = minor annoyance / cosmetic issue
- severity_justification (if adjusting from deterministic baseline, cite evidence)
- root_cause_class (from: knowledge_gap, procedural_failure, tool_failure,
  infrastructure_failure, external_factor, third_party_delay, human_error,
  user_strategic_choice)
- counterfactual_hint (ONLY if explicit or strongly implied)
- work_context_id, work_context_phase, jurisdiction_scope, related_actor_ids
- source

If root cause is external (outage, third-party delay, unavailable data), mark:
  root_cause_class = infrastructure_failure or external_factor or third_party_delay
and DO NOT convert this by itself into a durable lesson.

USER-TAUGHT LESSONS
When the user explicitly states a reusable takeaway ("next time…", "I should have…",
"don't do X; do Y", "lesson learned", "never again", "the takeaway is"):
- lesson_summary, what_happened, why_it_happened, what_to_do_instead
- applicable_contexts at the NARROWEST supported scope
- creation_trigger = user_taught

BEHAVIORAL ACTOR NOTES
Extract ONLY professional/operational behavior from the allowed list:
- workflow_preference, scheduling_habit, communication_pattern,
  decision_tendency, negotiation_style, procedural_preference
NEVER extract: protected characteristics, personality diagnoses, health/mental state,
private-life speculation, broad competence labels, physical appearance.
For each note: actor_ref, note, note_type, evidence_count, source_refs, confidence.

MEMORY DIRECTIVES
Output the full DOC72/KDA contract fields including:
- memory_type, summary, assertion_class, applies_when, does_not_apply_when,
  scope, priority_class, source_certainty

FINAL RULE
If unsure whether something is durable, extract as a suggestion/review candidate,
not as a strongly scoped durable rule.
```

---

### 34A.9 Rendering Contracts

#### 34A.9.1 Lesson Learned

```ts
function renderLessonLearnedCard(lesson: LessonLearnedPayload, tier: "compact" | "standard" | "full"): string {
  if (tier === "compact") {
    return `⚠ [Lesson] ${lesson.lesson_summary}`;
  }
  if (tier === "standard") {
    // Lines for missing optional fields evaluate to a falsy value and are dropped by filter(Boolean).
    return [
      `⚠ [Lesson — ${lesson.auto_created ? "reviewed pattern" : "user-taught"}] ${lesson.lesson_summary}`,
      `When: ${lesson.applicable_contexts.map(c => c.context_description).join("; ")}`,
      lesson.why_it_happened && `Why: ${lesson.why_it_happened}`,
      lesson.what_to_do_instead && `Do instead: ${lesson.what_to_do_instead}`,
    ].filter(Boolean).join("\n");
  }
  return [
    `⚠ [Lesson — ${lesson.creation_trigger}] ${lesson.lesson_summary}`,
    lesson.what_happened && `What happened: ${lesson.what_happened}`,
    lesson.why_it_happened && `Why: ${lesson.why_it_happened}`,
    lesson.what_to_do_instead && `Do instead: ${lesson.what_to_do_instead}`,
    `Applies when: ${lesson.applicable_contexts.map(c => c.context_description).join("; ")}`,
    lesson.source_outcome_chain_ids?.length && `Sources: ${lesson.source_outcome_chain_ids.join(", ")}`,
    `Review state: ${lesson.reviewed_by_user ? "reviewed" : "unreviewed"}`,
  ].filter(Boolean).join("\n");
}
```

**Delivery rule:** Lessons default to `[caution]` tag.

#### 34A.9.2 Cross-Context Synthesis

```ts
function renderCrossContextSynthesisCard(synthesis: CrossContextSynthesis, tier: "compact" | "standard" | "full"): string {
  if (tier === "compact") {
    return `[Pattern] ${synthesis.pattern_description}`;
  }
  if (tier === "standard") {
    return [
      `[Pattern] ${synthesis.pattern_description}`,
      `Support: ${synthesis.total_observations} outcomes across ${synthesis.distinct_work_contexts} contexts`,
      `Success rate: ${Math.round((synthesis.success_rate ?? 0) * 100)}%`,
      `Insight: ${synthesis.actionable_insight}`,
    ].filter(Boolean).join("\n");
  }
  return [
    `[Pattern — ${synthesis.pattern_type}] ${synthesis.pattern_description}`,
    `Support: ${synthesis.total_observations} outcomes across ${synthesis.distinct_work_contexts} contexts`,
    `Success rate: ${Math.round((synthesis.success_rate ?? 0) * 100)}% (lower bound: ${Math.round((synthesis.success_rate_lower_bound ?? 0) * 100)}%)`,
    `Risk if ignored: ${synthesis.risk_if_ignored}`,
    `Insight: ${synthesis.actionable_insight}`,
  ].filter(Boolean).join("\n");
}
```

**Delivery rule:** Syntheses default to `[consider]` tag unless highly supported and user-reviewed.

---

### 34A.10 Knowledge Manager UI Surfaces

No new tabs are added. The additions slot into the existing DOC72 §41.8 tabs:

| Tab | Additions |
|---|---|
| Tab 1 — Overview / Health | Outcome chain volume, lesson candidate count, implication backlog, System Parameters panel |
| Tab 2 — World Model / Entity Browser | Actor detail right panel gets Behavioral Profile section |
| Tab 3 — Domain Concepts | Filter/badge for `source = cross_context_synthesis` |
| Tab 5 — Suggestions Inbox | Lessons, syntheses, and implications as reviewable items |
| Tab 7 — Memory & Constraints | `lesson_learned` subtype filter, review state column |
| Tab 8 — Audit / Telemetry | Parameter changes, implication logs, auto-application reversals |

---

### 34A.11 Weekly Digest

Tiered delivery model:

- **Tier 1 daily micro-push:** Critical lessons and immediate implications only.
- **Tier 2 weekly summary:** Sections for Lessons to Review, New Cross-Context Patterns, Auto-Applied Preferences with Overrides, Actor Profile Updates.
- **Tier 3 monthly/full in Knowledge Manager:** Full supporting chain lists, parameter changes, dismissed implication stats.

---

### 34A.12 First Useful Output Timeline

| Subsystem | First useful output | Steady state |
|---|---|---|
| Implicit preferences | ~1-2 weeks (3 sessions) | Ongoing |
| Actor behavioral profiles | ~1-2 weeks (first characterization) | Ongoing |
| Lessons learned (user-taught) | Day 1 | Ongoing |
| Lessons learned (auto-created) | First critical/high failure | Ongoing |
| Outcome chains | ~1-2 weeks (first outcome discussion) | Ongoing |
| Implication detection | ~1-4 weeks (first significant change) | Ongoing |
| Cross-context synthesis | ~6-9 months | After 3+ contexts with 5+ chains each |
| Threshold governance | Month 2-3 | Ongoing |

Cross-context synthesis is explicitly a long-horizon capability. The user gets value from lessons, actor profiles, and implicit preferences long before synthesis matures.

---

### Appendix A — Cross-Doc Obligations

#### A.1 DOC8

| Obligation | Priority |
|---|---|
| Lesson synthesis from outcome chains (Path B, nightly) | High |
| Cross-context synthesis job (nightly) | High |
| Actor pattern detection from outcome chains (nightly) | High |
| Implicit preference detection + override tracking | High |
| Parameter tuning engine (all Category B via §7) | High |
| Feedback signal computation (nightly) | High |
| Outcome chain concept index rebuild (nightly) | Medium |

#### A.2 DOC24 / KDA

| Obligation | Priority |
|---|---|
| `LessonLearnedRenderContract` (compact/standard/full) | High |
| `CrossContextSynthesisRenderContract` | Medium |
| `ImplicationCardRenderContract` with urgency-based injection | Medium |
| Actor behavioral note injection (`[apply]`/`[consider]` by type) | Medium |
| Auto-applied preference injection as `[apply]` directives | Medium |
| Conflict-aware lesson+synthesis rendering via tension-aware injection | Medium |
| Daily clarification probe budget for incomplete chains | Medium |

#### A.3 BDSM (DOC24 Addendum A)

| Obligation | Priority |
|---|---|
| Lesson utility tracking (`[caution]` acknowledged/dismissed) | High |
| Preference auto-application tracking (positive outcomes/overrides) | High |
| Synthesis utility tracking | Medium |
| Implication utility tracking with rich feedback taxonomy | Medium |

#### A.4 DOC1

| Obligation | Priority |
|---|---|
| Accept `lesson_learned` as valid `memory_type` | High |
| Auto-applied preference commit path (`maturity_state: "active"`) | Medium |
| Immediate lesson creation from critical outcomes (fast-track to `suggested`) | High |
| Exclude actor behavioral notes from type-level generalization | Medium |

#### A.5 EC Core

| Obligation | Priority |
|---|---|
| `outcome_chains` table creation with FK constraints | High |
| `outcome_chain_concept_index` nightly rebuild | Medium |
| Parameter registry maintenance (build artifact) | Medium |
| Implication detection job registration (Tier 3 background) | Medium |
| Cross-context synthesis job registration | Medium |
| Settings wiring for Category C parameters | Medium |

---

### Appendix B — Implementation Index

| Change | DOC72 | DOC1 | DOC8 | DOC24/KDA | BDSM | EC |
|---|---|---|---|---|---|---|
| OutcomeChain schema + table | ✓ table | — | ✓ reads | ✓ render | ✓ tracks | ✓ writer |
| `lesson_learned` memory type | ✓ subtype | ✓ gate | ✓ synthesis | ✓ render | ✓ utility | ✓ writer |
| `auto_applied` preference | ✓ field | ✓ path | ✓ tracking | ✓ inject | ✓ utility | ✓ writer |
| `ApplicabilityRule` | ✓ schema | — | ✓ evolution | — | — | ✓ writer |
| Actor behavioral notes | ✓ overlay | — | ✓ detection | ✓ inject | — | ✓ writer |
| Cross-context synthesis | ✓ source | ✓ gate | ✓ job | ✓ render | ✓ utility | ✓ writer |
| Implication results | — ephemeral | — | — | ✓ render | ✓ utility | ✓ JSONL |
| Parameter registry | ✓ schema | — | ✓ tuning | — | — | ✓ registry |
| Conflict rendering | — | — | — | ✓ tension | — | — |

---

### Appendix C — Build Dependency Order

| Step | What | Dependencies |
|---|---|---|
| 1 | Outcome chains table + schema (§1) | DOC72 entity graph (existing) |
| 2 | Outcome chain extraction prompt (§8) | Step 1 |
| 3a | Lesson learned type + immediate creation (§2) | Steps 1-2 |
| 3b | Actor behavioral profile extraction (§3) | None |
| 3c | Implicit preference DOC1 fast path (§5) | None |
| 3d | Threshold governance registry (§7) | None |
| 4 | DOC8 lesson synthesis from patterns (§2.4B) | Steps 1-3a |
| 5 | Cross-context pattern synthesis (§4) | Steps 1-2 (needs accumulated chains) |
| 6 | Implication detection (§6) | Steps 1-2 |

Steps 3a-3d can be built in parallel. Steps 5 and 6 can be built in parallel once Steps 1-2 are complete.
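
As a rough illustration of the dependency table above, the steps can be grouped into build waves, where every step in a wave depends only on steps from earlier waves. The `buildDeps` map is transcribed from the table; `buildWaves` is a hypothetical helper, not part of any ELNOR tooling.

```ts
// Dependencies transcribed from the Appendix C table.
const buildDeps: Record<string, string[]> = {
  "1": [],
  "2": ["1"],
  "3a": ["1", "2"],
  "3b": [],
  "3c": [],
  "3d": [],
  "4": ["1", "2", "3a"],
  "5": ["1", "2"],
  "6": ["1", "2"],
};

// Hypothetical helper: repeatedly collect every step whose dependencies are done.
function buildWaves(deps: Record<string, string[]>): string[][] {
  const done = new Set<string>();
  const remaining = new Set<string>(Object.keys(deps));
  const waves: string[][] = [];
  while (remaining.size > 0) {
    const ready = Array.from(remaining).filter(step =>
      (deps[step] ?? []).every(d => done.has(d)),
    );
    if (ready.length === 0) throw new Error("dependency cycle detected");
    for (const step of ready) {
      remaining.delete(step);
      done.add(step);
    }
    waves.push(ready);
  }
  return waves;
}

const waves = buildWaves(buildDeps);
```

Under this grouping Steps 3b-3d land in the first wave because they have no dependencies, Steps 5 and 6 share a wave, and Step 4 lands after Step 3a.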

---

### Appendix D — Storage Paths

| Artifact | Path | Format |
|---|---|---|
| Outcome chains | `outcome_chains` SQLite table | Relational |
| Outcome chain concept index | `outcome_chain_concept_index` SQLite table | Derived, nightly rebuild |
| Lesson learned nodes | `nodes` table, `memory_type = 'lesson_learned'` | Existing schema |
| Cross-context syntheses | `nodes` table, `source = 'cross_context_synthesis'` | Existing schema |
| Implicit preference candidates | `ELNOR_MEMORY/system/learning/implicit_preference_candidates.json` | Atomic JSON |
| Implication results | `ELNOR_MEMORY/system/learning/implication_results.jsonl` | Append-only JSONL |
| Parameter registry | `ELNOR_MEMORY/system/config/parameter_registry.json` | Atomic JSON |
| Parameter tuning history | `ELNOR_MEMORY/system/learning/parameter_tuning_history.jsonl` | Append-only JSONL |

---

### Appendix E — Normative Rules Summary

1. Outcome chains are the sole canonical store for causal reasoning about outcomes. No duplication.
2. Severity assessment uses deterministic baseline + bounded LLM ±1 band with justification.
3. Only `knowledge_gap` and `procedural_failure` root causes auto-create lessons.
4. Lessons start at the narrowest scope. Broadening requires cross-context evidence.
5. Lessons and CorrectionEvents are complementary. Correction first, lesson second.
6. Actor behavioral notes use dual gate: allowed class + forbidden class.
7. Actor behavioral notes are NEVER candidates for type-level generalization.
8. Cross-context synthesis uses Wilson lower bound, sample-size penalty, and confidence ≥ 0.55.
9. Implicit preference acceleration is a DOC1 fast path, not a parallel subsystem.
10. Implication detection has two tiers. Only Tier 1 (deterministic) can be `immediate`.
11. Every numeric threshold is classified as Category B or C. Unclassified = spec debt.
12. DOC8 tunes one parameter family at a time. Oscillation → freeze + extended cooldown.
13. Conflict-aware rendering for co-injected lessons + syntheses.
14. This proposal requires Graph Intelligence Enhancement V2 co-landing in R5.7.
15. All new knowledge types are tracked by BDSM for injection utility.
16. A single user-taught lesson cannot reach `force_level: "strong"` without corroborating evidence.
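
Rule 8 refers to the Wilson lower bound. For readers unfamiliar with it, the sketch below computes the lower end of the Wilson score interval for a success proportion. The function name and the default `z = 1.96` (roughly a 95% interval) are assumptions; this section does not state which confidence level DOC8 uses, nor how the sample-size penalty is combined with the bound.

```ts
// Lower bound of the Wilson score interval for `successes` out of `total` trials.
// z = 1.96 is an assumed default, not a value taken from the spec.
function wilsonLowerBound(successes: number, total: number, z = 1.96): number {
  if (total <= 0) return 0; // no observations: claim nothing
  const p = successes / total;
  const z2 = z * z;
  const centre = p + z2 / (2 * total);
  const margin = z * Math.sqrt((p * (1 - p)) / total + z2 / (4 * total * total));
  return (centre - margin) / (1 + z2 / total);
}
```

The bound is always below the raw success rate and tightens as observations accumulate, so 8 successes out of 10 score lower than 80 out of 100. That is the property that makes it suitable for a field such as `success_rate_lower_bound`.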

---

*End of DOC72 Proposal — Knowledge Intelligence Enhancement R2*

## 35. Domain Knowledge Architecture

**Normative payload note:** The domain-concept architecture in this section is now backed by the absorbed `DomainConceptKnowledgeContractSchema` in §4A.5.

### 35.1 Domain Signal Profiles — Pluggable Domain Intelligence

**Multi-domain bleed control:** active profile scoring is bound per work context. Signals from profile X apply only within work contexts whose `DomainProfileBinding` includes profile X. This prevents legal-domain scoring from spilling into personal finance, personal workflow, or other non-legal contexts.

```ts
type DomainProfileBinding = {
  work_context_id: string;
  active_profile_ids: string[];
};
```
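
A minimal sketch of how the binding might gate profile signals, assuming a hypothetical `PatternHit` shape and helper name. Universal base signals are outside this filter because they are always active.

```ts
type DomainProfileBinding = {
  work_context_id: string;
  active_profile_ids: string[];
};

// Hypothetical shape for a fired profile pattern; not a spec type.
type PatternHit = { profile_id: string; pattern_name: string; weight: number };

// Keep only hits whose profile is bound to the given work context.
// An unbound work context receives no profile hits at all.
function hitsForContext(
  hits: PatternHit[],
  bindings: DomainProfileBinding[],
  workContextId: string,
): PatternHit[] {
  const binding = bindings.find(b => b.work_context_id === workContextId);
  if (!binding) return [];
  const active = new Set(binding.active_profile_ids);
  return hits.filter(h => active.has(h.profile_id));
}
```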

```ts
type DomainSignalProfile = {
  profile_id: string;
  profile_name: string;
  lifecycle_state: "suggested" | "trial" | "active" | "inactive" | "archived";
  activated_at?: string;
  activation_source?: "pre_built" | "user_command" | "doc8_confirmed" | "onboarding_created";
  profile_activation_hint?: string;

  high_value_patterns: Array<{
    pattern: string;
    pattern_name: string;
    weight: number;
  }>;

  domain_entity_kinds: string[];
  priority_document_types: string[];
  domain_extraction_guidance?: string;

  promotion_requirements?: Array<{
    field: string;
    operator: "exists" | "min_length" | "min_count" | "confidence_gte";
    value?: unknown;
  }>;

  surface_biases?: Record<IntakeSurface, number>;
  entity_biases?: Record<string, number>;
  rendering_templates?: Record<string, string>;  // Rendering details in DOC24

  schema_version: 1;
};

type DomainFacetRegistryEntry = {
  profile_id: string;
  facet_name: string;
  json_schema: Record<string, unknown>;
  required_fields: string[];
  validator_version: number;
};
```

**Universal base signals (always active, hardcoded):** Decision language, obligation language, date references, quote patterns. Domain-specific patterns live in profiles.

**Profile lifecycle:**
- `suggested`: DOC8 detected consistent patterns. Appears in suggestions inbox. **Never silently activated.**
- `trial`: Shadow mode — profile patterns fire but produce NO graph writes for 7 days. DOC8 computes what WOULD have been extracted and estimates precision/recall.
- `active`: User confirmed after trial, or activated conversationally, or pre-built, or onboarding-created.
- `inactive`/`archived`: Deactivated or zero utility over 90 days.

At most 3-4 profiles may be active at the same time.

**Pre-built profiles:** These ship with the system; only `securities_litigation` is active by default on a law-firm install:
- `securities_litigation` — legal citations, securities elements, motion types, party references
- `client_intake` — stock tickers, plaintiff names, loss amounts, case filings, retainer terms
- `general_legal` — broader legal patterns for non-securities work
- `music_production` — VST names, BPM, signal routing, patch references
- `personal_finance` — ticker symbols, account references, tax terms

### 35.2 Domain Knowledge Extraction — Generic Supertype

```ts
type DomainKnowledgeExtraction = {
  id: string;
  extraction_type: string;              // Discriminator
  text: string;
  concept_ids: string[];
  supporting_refs: string[];
  domain_profile_id: string;
  verified_state: "verified" | "unverified" | "contradicted";
  confidence: number;
  domain_payload: LegalPropositionFacet | ClientIntakeFacet | AudioRoutingFacet | Record<string, unknown>;
  schema_version: 1;
};
```

### 35.3 Legal-Specialized Facet

```ts
type LegalPropositionFacet = {
  jurisdiction: string[];
  authorial_voice: AuthorialVoice;
  assertion_type: AssertionType;
  legal_knowledge_kind: "authority_holding" | "derived_doctrine" | "argument_position" |
                        "strategic_heuristic" | "open_question";
  authority_type?: "binding" | "persuasive" | "informational";
  still_current?: "yes" | "uncertain" | "overruled" | "outdated" | "not_checked";
  verbatim_excerpt: string;          // MANDATORY for legal propositions
  excerpt_location?: string;
};
```

### 35.4 Legal Conflict Engine

```ts
type LegalConflictSet = {
  conflict_id: string;
  concept_cluster_id: string;
  conflicting_node_ids: string[];
  conflict_kind: "authority_vs_authority" | "jurisdiction_conflict" | "argument_vs_authority" |
                 "framing_conflict" | "overruled_vs_current";
  dominant_node_id?: string;
  needs_user_review: boolean;
};

type LegalFormulationVariant = {
  variant_id: string;
  canonical_concept_id: string;
  statement_text: string;
  source_ref: string;
  variant_kind: "authority_quote" | "paraphrase" | "work_product_framing" | "heuristic";
};
```

Merge when same scope + same epistemic status + same authority pattern. Otherwise create variant or conflict set.
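
The merge rule above can be read as a three-way equality check. The sketch below is one possible reading. `FormulationKey` and its field names are hypothetical, and how the non-merge branch chooses between a variant and a conflict set is left open because this section does not define it.

```ts
// Hypothetical comparison key; these field names are illustrative, not spec fields.
type FormulationKey = {
  scope: string;
  epistemic_status: string;
  authority_pattern: string;
};

type MergeDecision = "merge" | "variant_or_conflict";

// Merge only when all three dimensions match; otherwise defer to the
// variant / conflict-set path.
function decideMerge(a: FormulationKey, b: FormulationKey): MergeDecision {
  const same =
    a.scope === b.scope &&
    a.epistemic_status === b.epistemic_status &&
    a.authority_pattern === b.authority_pattern;
  return same ? "merge" : "variant_or_conflict";
}
```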

### 35.5 Correction Pipeline

See §10.5 for the CorrectionEvent schema. Corrections flow: DOC72 (schema, detection) → DOC1 (governance) → DOC8 (learning) → DOC24 (injection).

### 35.6 Citation PageRank

Computed as a SQL view over provenance entries rather than stored in the `nodes` table, which avoids cache invalidation:

```sql
CREATE VIEW v_firm_citation_weights AS
SELECT json_extract(payload, '$.authority_ref') AS authority,
       COUNT(DISTINCT source_ref) AS citation_weight,
       COUNT(DISTINCT matter_id) AS matter_diversity
FROM provenance_entries
WHERE entry_type IN ('learned_from_note', 'learned_from_document_view', 'learned_from_trace')
GROUP BY authority;
```

### 35.7 Progressive Expertise Path

Months 1-3: citation chain building. Months 3-6: concept hierarchy. Months 6-12: cross-context pattern detection. Argument template detection triggers after 3+ work products of same type across 2+ work contexts.

```ts
type ArgumentTemplate = {
  id: string;
  template_name: string;
  concept_ids_ordered: string[];
  authority_refs_per_step: string[][];
  contexts_used_in: string[];
  success_rate?: number;
  last_used_at: string;
};
```

### 35.8 Hierarchical Concept Model

Domain concepts support parent/child relationships through `parent_concept` edges. The hierarchy enables concept-scoped retrieval, supersession tracking, and conflict detection at the concept cluster level.

### 35.9 Novelty Gate

All extraction prompts include a DOC72-owned novelty gate. In R5.7 the static 80-token sketch is superseded by the integrated contextual novelty gate in §42A, which supports extraction and packet-assembly use cases through one reusable service.

### 35.10 Matter-Link Confidence

```ts
type MatterLinkCandidate = {
  entity_id: string;
  matter_id: string;
  confidence: number;
  basis: string[];
  needs_confirmation: boolean;
};
```

Unresolved links below 0.8 confidence are surfaced for review rather than auto-committed.
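
A minimal sketch of the 0.8 gate, assuming a hypothetical `routeMatterLink` helper. Treating `needs_confirmation` as an independent reason to route to review is an assumption; the text above only states the confidence threshold.

```ts
type MatterLinkCandidate = {
  entity_id: string;
  matter_id: string;
  confidence: number;
  basis: string[];
  needs_confirmation: boolean;
};

const AUTO_COMMIT_MIN_CONFIDENCE = 0.8; // threshold stated in §35.10

// Links below the threshold, or flagged for confirmation (assumption), go to review.
function routeMatterLink(c: MatterLinkCandidate): "auto_commit" | "review" {
  if (c.needs_confirmation || c.confidence < AUTO_COMMIT_MIN_CONFIDENCE) return "review";
  return "auto_commit";
}
```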


---

## 36. Self-Learning Extraction Feedback Loop

### 36.1 Five Feedback Dimensions

**Dimension 1 — Per-pattern performance:** `PatternPerformanceRecord` with precision, extraction yield, confirmation rate, and current weight adjustment (±0.05 per cycle).

**Dimension 2 — False negative detection:** See §11.6 for FalseNegativeSignal schema.

**Dimension 3 — Surface ROI:** Weekly budget reallocation based on value-per-cost.

**Dimension 4 — Prompt performance:** Per-type precision tracking. Below 0.70 → DOC8 surfaces recommendation to revise.

**Dimension 5 — Sonar quality:** Utility metric (connections_used / connections_created). Below 0.10 → propose reducing frequency.
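
The numeric rules in Dimensions 1, 4, and 5 can be sketched as small pure functions. All names below are hypothetical. Two assumptions are made: ±0.05 is read as a per-cycle cap on weight movement, and a Sonar run with zero created connections is treated as zero utility.

```ts
const MAX_WEIGHT_STEP = 0.05;       // Dimension 1: per-cycle adjustment cap
const PROMPT_PRECISION_FLOOR = 0.7; // Dimension 4
const SONAR_UTILITY_FLOOR = 0.1;    // Dimension 5

// Move a pattern weight toward the desired value by at most one step per cycle.
function stepWeight(current: number, desired: number): number {
  const delta = Math.max(-MAX_WEIGHT_STEP, Math.min(MAX_WEIGHT_STEP, desired - current));
  return current + delta;
}

// Utility metric from Dimension 5: connections_used / connections_created.
function sonarUtility(connectionsUsed: number, connectionsCreated: number): number {
  return connectionsCreated > 0 ? connectionsUsed / connectionsCreated : 0;
}

// Recommendations DOC8 would surface; the labels are illustrative only.
function feedbackRecommendations(promptPrecision: number, utility: number): string[] {
  const out: string[] = [];
  if (promptPrecision < PROMPT_PRECISION_FLOOR) out.push("revise_prompt");
  if (utility < SONAR_UTILITY_FLOOR) out.push("reduce_sonar_frequency");
  return out;
}
```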

### 36.2 Adaptive Thresholds with Canary Trial

```ts
type SurfaceThresholdProfile = {
  surface: IntakeSurface;
  deep_threshold: number;
  shallow_threshold: number;
  target_precision: number;
  current_precision: number;
  min_threshold: number;
  max_threshold: number;
};

type ThresholdTrial = {
  trial_id: string;
  surface: IntakeSurface;
  old_thresholds: { deep: number; shallow: number };
  new_thresholds: { deep: number; shallow: number };
  started_at: string;
  ends_at: string;
  status: "active" | "reverted" | "confirmed";
};
```

DOC8 proposes changes → 7-day trial → auto-revert if precision drops. **Exploration quota:** 5-10% of items that would be skipped are randomly processed to track whether useful knowledge is being missed.

**Anti-oscillation rule:** after a threshold family moves in one direction, DOC8 must respect a minimum hold period before moving it in the opposite direction. If oscillation is detected, freeze the family and extend cooldown.
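
One way to read the canary trial is as a small state transition on `ThresholdTrial`. The sketch below makes two assumptions: reverting as soon as observed precision falls below the pre-trial baseline, rather than only at the end of the trial, and confirming once `ends_at` is reached. `IntakeSurface` is stubbed as a string because its union is defined elsewhere in DOC72, and both helper names are hypothetical.

```ts
type IntakeSurface = string; // placeholder; the real union is defined elsewhere

type ThresholdTrial = {
  trial_id: string;
  surface: IntakeSurface;
  old_thresholds: { deep: number; shallow: number };
  new_thresholds: { deep: number; shallow: number };
  started_at: string;
  ends_at: string;
  status: "active" | "reverted" | "confirmed";
};

// Returns an updated copy of the trial; the input is never mutated.
function resolveTrial(
  trial: ThresholdTrial,
  baselinePrecision: number,
  trialPrecision: number,
  now: Date,
): ThresholdTrial {
  if (trial.status !== "active") return trial;
  if (trialPrecision < baselinePrecision) return { ...trial, status: "reverted" };
  if (now.getTime() >= Date.parse(trial.ends_at)) return { ...trial, status: "confirmed" };
  return trial;
}

// Thresholds that should be in force for a given trial state.
function effectiveThresholds(trial: ThresholdTrial): { deep: number; shallow: number } {
  return trial.status === "reverted" ? trial.old_thresholds : trial.new_thresholds;
}
```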

### 36.3 Complete Self-Learning Loop

1. Intake event → 2. Pipeline dispatches → 3. Knowledge committed → 4. Outcomes observed → 5. ExtractionOutcomeEvent emitted → 6. DOC8 updates patterns/thresholds/ROI/prompts → 7. DOC1 updates confidence/maturity/cautions → 8. DOC24 adjusts injection preferences.

---

## 37. Versioned Knowledge Commits

```ts
type KnowledgeCommit = {
  idempotency_key?: string;
  expected_updated_at?: string;
  commit_id: string;
  parent_commit_ids: string[];
  kind: "forward" | "rollback";
  rolled_back_commit_id?: string;
  timestamp: string;
  source_event: string;
  change_summary: string;
  mutations: Array<{
    type: "create" | "update" | "archive" | "edge_create" | "edge_remove";
    target_id: string;
    patch: Array<{ op: "add" | "replace" | "remove"; path: string; value?: any }>;       // JSON Patch RFC 6902
    reverse_patch: Array<{ op: "add" | "replace" | "remove"; path: string; value?: any }>;
  }>;
  extraction_prompt_version?: string;
  reversible: boolean;
};
```

A rollback is a compensating commit; historical records are never mutated. Mutations are stored as JSON Patch operations rather than full state snapshots. Selective re-extraction is available.
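
To illustrate the compensating-commit rule, the sketch below derives a rollback commit from a forward commit by reversing mutation order and swapping `patch` with `reverse_patch`. It uses a trimmed copy of `KnowledgeCommit`. The `INVERSE_TYPE` mapping, the `source_event` value, and the helper name are assumptions that this section does not define.

```ts
type PatchOp = { op: "add" | "replace" | "remove"; path: string; value?: unknown };
type MutationType = "create" | "update" | "archive" | "edge_create" | "edge_remove";
type Mutation = { type: MutationType; target_id: string; patch: PatchOp[]; reverse_patch: PatchOp[] };

// Trimmed subset of KnowledgeCommit, enough for the illustration.
type CommitCore = {
  commit_id: string;
  parent_commit_ids: string[];
  kind: "forward" | "rollback";
  rolled_back_commit_id?: string;
  timestamp: string;
  source_event: string;
  change_summary: string;
  mutations: Mutation[];
  reversible: boolean;
};

// ASSUMPTION: which mutation type undoes which; not specified in the spec.
const INVERSE_TYPE: Record<MutationType, MutationType> = {
  create: "archive",
  update: "update",
  archive: "update",
  edge_create: "edge_remove",
  edge_remove: "edge_create",
};

// Builds a new commit; the forward commit is left untouched.
function buildRollback(forward: CommitCore, newCommitId: string, now: string): CommitCore {
  if (!forward.reversible) throw new Error(`commit ${forward.commit_id} is not reversible`);
  const mutations = forward.mutations.slice().reverse().map((m): Mutation => ({
    type: INVERSE_TYPE[m.type],
    target_id: m.target_id,
    patch: m.reverse_patch,
    reverse_patch: m.patch,
  }));
  return {
    commit_id: newCommitId,
    parent_commit_ids: [forward.commit_id],
    kind: "rollback",
    rolled_back_commit_id: forward.commit_id,
    timestamp: now,
    source_event: "rollback", // hypothetical label
    change_summary: `Rollback of ${forward.commit_id}`,
    mutations,
    reversible: true,
  };
}
```

Mutations are undone in reverse order so that a later mutation that depended on an earlier one is removed first.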

**Concurrency note:** all `entity_knowledge_write` operations SHOULD carry an idempotency key and MAY carry `expected_updated_at` for lightweight optimistic concurrency. Full semantic merge remains deferred; V1 relies on EC serialization plus these guards.

---

## 38. Extraction Audit Trail

```ts
type ExtractionAuditRecord = {
  extraction_id: string;
  surface: IntakeSurface;
  source_content_ref: string;
  extraction_prompt_version: string;
  significance_score: number;
  dispatch_decision: "skip" | "shallow" | "deep" | "defer";
  skip_reason?: string;
  error_code?: string;
  retry_count: number;
  skip_content_fingerprint?: string;    // Hash of key terms in skipped content
  skip_entity_mentions?: string[];      // Entity aliases found during shallow scan even if skipped
  items_extracted: number;
  items_promoted: number;
  items_filtered_by_write_gate: number;
  knowledge_commit_ids: string[];
  extracted_at: string;
};
```

---

## 39. Browser Integration

### 39.1 The Integrated Browser (DOC20 §6.19)

The embedded browser runs inside Electron's webview, giving Elnor the ability to see page content alongside full ELNOR_MEMORY context. This is the core value: context fusion.

### 39.2 Browser Dwell Policy

Browser dwell time produces SHALLOW extraction only, never deep. Dwell is the ONLY new observation mechanism DOC72 introduces beyond EC's existing command stream. Deep extraction requires explicit engagement: bookmark (conditional on domain_impact_score/matter_link_confidence), save as artifact/note, Ask Elnor, Focus button, or Research Session.

The Research Session button activates focused capture mode. A research session summary is always generated on session close, regardless of extraction budget.

### 39.3 Credential Management

Elnor KNOWS ABOUT credentials (which accounts, which usernames, which services) but defers actual secret storage to Apple Keychain. Non-secret personal data (addresses, phone numbers) are graph entity attributes.

### 39.4 Domain Consent Model

- **Default off.** Browser intelligence is opt-in per domain.
- **Domain allowlisting.** User approves domains explicitly.
- **Per-domain privacy levels:** four tiers governing what data is captured.
- **Session scoping.** "Learn from this session" toggle for temporary capture.
- **Local-only storage.** Browsing data never leaves the local machine.

```ts
type BrowserDomainApproval = {
  domain: string;
  privacy_level: "full_capture" | "entity_extraction" | "navigation_only" | "blocked";
  approved_at: string;
  approved_by: "user_explicit" | "onboarding";
  capture_policy: BrowserCapturePolicy;
  linked_app_entity_id?: string;
};
```

Privacy levels: `full_capture` (work tools), `entity_extraction` (professional sites), `navigation_only` (rhythm tracking), `blocked` (default).

### 39.5 Field-Level Redaction Policy

```ts
type BrowserCapturePolicy = {
  domain: string;
  allowed_data_classes: Array<"structure_only" | "metadata" | "content_summary" | "full_text">;
  blocked_patterns: string[];
  blocked_field_selectors?: string[];
  privileged_context_mode: "block_fulltext" | "allow_with_confirm";
  retain_duration_days?: number;
};
```

### 39.6 Browser Application Entity Model

Each significant website becomes an application entity with `platform: "web"`. Procedures for web applications follow the same semantic-intent model.

### 39.7 Research Session Capture

When in a research session on an approved domain, the system captures: search queries, sources evaluated, sources kept/rejected with reasoning, argument threads constructed. This produces research lineage trace nodes linked to matters and legal theories.

### 39.8 Browser Intake Sources

| Source | Event type | Timing |
|---|---|---|
| Page visit (approved domain) | `intake.browser.page_visited` | Continuous (Tier 3) |
| Research session | `intake.browser.research_session` | After session ends (Tier 3) |
| Website interaction trace | `intake.browser.interaction_traced` | After interaction (Tier 2) |
| Account/service observation | `intake.browser.account_observed` | On login detection (Tier 3) |

### 39.9 Graceful Degradation

When Q runs in Chrome (no Electron webview), browser intake sources don't fire. But knowledge already captured persists and remains accessible.

### 39.10 Observer Effect Prevention

When Elnor writes text into a document (tracked changes, direct insertion), EC records:

```ts
type AgentInsertionLog = {
  document_id: string;
  inserted_text_hash: string;
  inserted_at: string;
  character_range?: [number, number];
};
```

When extraction later processes that document, the pipeline checks the insertion log. If a passage matches a logged hash → skip (agent-generated). No document-level metadata, no watermarks, no invisible characters. Documents sent to court/counsel are completely clean.
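
A minimal sketch of the hash check, under stated assumptions. The hash algorithm is not named in this section, so the hash function is injected as a parameter. The helper name and passage-level granularity are also hypothetical, and the check only works if extraction hashes passages at the same granularity that was logged at insertion time.

```ts
type AgentInsertionLog = {
  document_id: string;
  inserted_text_hash: string;
  inserted_at: string;
  character_range?: [number, number];
};

// Drop passages whose hash matches a logged agent insertion for this document.
// `hashText` must be the same function EC used when writing the log (assumption).
function filterAgentPassages(
  documentId: string,
  passages: string[],
  log: AgentInsertionLog[],
  hashText: (text: string) => string,
): string[] {
  const agentHashes = new Set(
    log.filter(e => e.document_id === documentId).map(e => e.inserted_text_hash),
  );
  return passages.filter(p => !agentHashes.has(hashText(p)));
}
```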

## 40. Agent Knowledge Profiles and Inter-Agent Awareness

**Conflict adjudication rule:** conflicting agent conclusions are stored, connected through `contradicts` or `tension_with` edges, and surfaced with `[distinguish]` delivery. User authority outranks all other sources; binding authority outranks accepted multi-agent consensus; accepted multi-agent consensus outranks a single agent observation.

```ts
type ConflictResolutionPriority =
  | "user_authority"
  | "binding_authority"
  | "accepted_multi_agent_consensus"
  | "single_agent_observation";
```

Both conclusions are stored. `contradicts` or `tension_with` edges connect them. DOC24 injects with `[distinguish]` delivery. The user resolves; their decision is recorded as `user_correction` with highest authority weight.
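
The ordering in the adjudication rule can be expressed as a simple rank lookup. The sketch below assumes a hypothetical `outranks` helper; it only orders the sources and does not resolve the conflict, which remains the user's decision.

```ts
type ConflictResolutionPriority =
  | "user_authority"
  | "binding_authority"
  | "accepted_multi_agent_consensus"
  | "single_agent_observation";

// Highest authority first, as stated in the conflict adjudication rule.
const PRIORITY_ORDER: ConflictResolutionPriority[] = [
  "user_authority",
  "binding_authority",
  "accepted_multi_agent_consensus",
  "single_agent_observation",
];

// True when source `a` strictly outranks source `b`.
function outranks(a: ConflictResolutionPriority, b: ConflictResolutionPriority): boolean {
  return PRIORITY_ORDER.indexOf(a) < PRIORITY_ORDER.indexOf(b);
}
```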

### 40.1 Agent Knowledge Profiles with Least-Privilege Default

```ts
type AgentKnowledgeProfile = {
  agent_id: string;
  agent_name: string;
  inheritance_mode: "minimal" | "domain_template" | "full";
  approved_domains: string[];            // high injection priority
  observed_domains: string[];            // auto-populated from usage
  suppressed_domains?: string[];         // explicitly excluded (user-set only)
  auto_evolve: boolean;                  // default true — profiles adapt from usage
  inherits_from?: string;                // "elnor_baseline" for new agents
};
```

New agents start with `inheritance_mode: "minimal"` (core knowledge only). User explicitly grants domains. Elnor stays `"full"`. This addresses the privacy concern — new agents shouldn't automatically know health/financial data.

**Soft domain scoping:** Nothing is hard-blocked except explicit user suppression. If routing resolves relevant knowledge from any domain, the agent gets it. Domain scoping only affects PRIORITY when token budget forces choices. `approved_domains` get full injection priority. `observed_domains` get slightly lower priority. Knowledge outside both is still available if directly relevant.
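
A rough sketch of soft scoping as an ordering step, applied only when the token budget forces a choice. The numeric ranks, the `Candidate` shape, and the helper names are hypothetical; the only hard exclusion is explicit user suppression, as stated above.

```ts
type ScopedProfile = {
  approved_domains: string[];
  observed_domains: string[];
  suppressed_domains?: string[];
};

type Candidate = { id: string; domain: string }; // illustrative shape

// Lower rank sorts first; null means the domain is user-suppressed.
function domainRank(profile: ScopedProfile, domain: string): number | null {
  if ((profile.suppressed_domains ?? []).indexOf(domain) !== -1) return null;
  if (profile.approved_domains.indexOf(domain) !== -1) return 0;
  if (profile.observed_domains.indexOf(domain) !== -1) return 1;
  return 2; // outside both lists: still available, lowest priority
}

// Remove suppressed items, then order the rest by domain priority.
function orderForInjection(profile: ScopedProfile, candidates: Candidate[]): Candidate[] {
  const ranked: Array<{ c: Candidate; rank: number }> = [];
  for (const c of candidates) {
    const rank = domainRank(profile, c.domain);
    if (rank !== null) ranked.push({ c, rank });
  }
  return ranked.sort((a, b) => a.rank - b.rank).map(r => r.c);
}
```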

**No manual tagging required.** Agent profiles build from observed usage. Users can edit if they want (inspectable), but the system works fully automatically.

### 40.2 Agent Evolution

Agent profiles adapt through use. `auto_evolve: true` by default:

- When an agent is used in a new domain, that domain is added to `observed_domains` automatically
- When an agent's observed domain usage exceeds a threshold (20+ interactions), DOC8 proposes promoting it to `approved_domains`
- When an agent stops being used in a previously observed domain for an extended period, the domain eventually drops from `observed_domains` (but is never hard-blocked)
- When a user explicitly tells Elnor "start using Nova for brief drafting," Nova's profile gets "litigation" added to `approved_domains` immediately
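
The first two bullets can be sketched as a pure update function. The names are hypothetical, "20+ interactions" is read as a count of 20 or more, and the proposal flag is raised on every qualifying interaction, so a real implementation would need to de-duplicate proposals.

```ts
const PROMOTION_PROPOSAL_THRESHOLD = 20; // "20+ interactions" from §40.2

type EvolvingProfile = {
  approved_domains: string[];
  observed_domains: string[];
  auto_evolve: boolean;
};

type EvolutionResult = {
  profile: EvolvingProfile;
  usage: Record<string, number>;
  propose_promotion: boolean;
};

// Record one interaction in `domain`; inputs are not mutated.
function recordInteraction(
  profile: EvolvingProfile,
  usage: Record<string, number>,
  domain: string,
): EvolutionResult {
  if (!profile.auto_evolve) return { profile, usage, propose_promotion: false };
  const count = (usage[domain] ?? 0) + 1;
  const nextUsage = { ...usage, [domain]: count };
  const approved = profile.approved_domains.indexOf(domain) !== -1;
  const observed = profile.observed_domains.indexOf(domain) !== -1;
  const nextProfile = approved || observed
    ? profile
    : { ...profile, observed_domains: [...profile.observed_domains, domain] };
  return {
    profile: nextProfile,
    usage: nextUsage,
    propose_promotion: !approved && count >= PROMOTION_PROPOSAL_THRESHOLD,
  };
}
```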

Spawned/temporary agents have session-scoped capability assessments. If a spawned agent performs well, that data persists in the experience record even if the agent is later deleted (UsageEvents carry `agent_id` for historical reference).

### 40.3 Inter-Agent Awareness

All agents write to the SAME graph. When Nova red-teams a brief and produces adverse findings, those are UsageEvents tagged with `agent_id: "nova"`. When Elnor later retrieves that concept, Nova's findings are in the experience record.

```ts
type AgentCapabilityProfile = {
  agent_entity_id: string;
  observed_capabilities: Array<{
    domain: string;
    capability_label: string;
    usage_count: number;
    success_rate: number;
    last_used_at: string;
    user_assessment?: string;
  }>;
  delegated_to_count: number;
  delegated_from_count: number;
};
```

### 40.4 Delegation Protocol

When Elnor delegates to another agent, what knowledge travels:

```ts
type DelegationPayload = {
  delegating_agent_id: string;
  receiving_agent_id: string;
  task_description: string;
  // Knowledge that travels:
  entity_cards: string[];               // pre-rendered, relevant to task
  goal_context?: string;                // if strategically relevant
  constraints: string[];                // applicable constraints
  procedure_refs?: string[];            // suggested procedures
  conversation_context?: string;        // relevant prior summary
  // What does NOT travel:
  // - Full graph access (receiving agent uses its own profile)
  // - Personal memories outside task scope
  // - Suppressed knowledge
};
```

Delegation payload is scoped by the receiving agent's knowledge profile. If the receiving agent doesn't have access to a domain, those items are excluded.

### 40.5 Agent ID on Experience Events

Every UsageEvent carries `agent_id` identifying which agent performed the action. Enables per-agent performance analysis, delegation intelligence, agent evolution tracking, and multi-agent pattern detection.

---

## 41. Conversational Inspectability

### 41.1 Why This Matters

The knowledge system needs constant refinement. Memories will be wrong. Injection decisions will be suboptimal. The ONLY way this works in practice is if you can debug and adjust it CONVERSATIONALLY.

### 41.2 The `inspect_knowledge` Tool

Core pack tool, always available:

```ts
type InspectKnowledgeRequest = {
  query_type:
    | "node_detail"              // full six-dimension profile
    | "injection_history"        // what was injected on recent turns
    | "provenance_chain"         // full evidence chain
    | "experience_record"        // usage/outcome history
    | "injection_decision"       // why was/wasn't something injected
    | "agent_profile"            // current knowledge profile
    | "suppression_list"         // what's currently suppressed
    | "tag_explanation"          // why a specific tag was applied
    | "confidence_breakdown"     // how confidence was computed
    | "connected_nodes"          // what's connected to a node
    | "entity_knowledge_brief"   // 200-300 token structured brief
    | "search_knowledge";        // search across the full graph
  target_node_id?: string;
  target_agent_id?: string;
  turn_ref?: string;
  query_params?: Record<string, unknown>;
};

type InspectKnowledgeResult = {
  query_type: string;
  result_summary: string;              // human-readable for LLM to relay
  result_data: Record<string, unknown>;
  suggested_actions?: string[];
};
```

### 41.3 Conversational Interactions

**"Why did you remember that?"**
Elnor calls `inspect_knowledge` with `query_type: "node_detail"`, explains the provenance chain and experience record, suggests corrections if warranted.

**"Where did you learn that?"**
Calls `inspect_knowledge` with `query_type: "provenance_chain"`, traces back through emails, confirmations, corrections.

**"Why didn't you include Henderson context?"**
Calls `inspect_knowledge` with `query_type: "injection_decision"` for the recent turn, explains routing decision.

**"How confident are you?"**
Calls `inspect_knowledge` with `query_type: "confidence_breakdown"`, shows α/β values, authority backing, experience record.

### 41.4 Knowledge Brief

"What do you know about Henderson?" → `inspect_knowledge` with `query_type: "entity_knowledge_brief"`. Returns structured 200-300 token brief covering: entities, concepts, goals, procedures, obligations, recent activity, confidence, stale items. Available from day 1.

### 41.5 Adjustment Capabilities

Through conversation, the user can:
- **Correct knowledge:** "Actually, I switched to Brand Y." → Update node, add correction provenance.
- **Suppress injection:** "Don't include X when I'm doing Y." → Context-scoped suppression.
- **Boost injection:** "Always include Henderson context for securities cases." → Priority boost.
- **Update goals:** "Henderson strategy changed — we're going to trial." → Goal evolves.
- **Teach new knowledge:** "In S.D.N.Y., expert reports are due 30 days before trial." → Domain principle with user-taught provenance.
- **Modify agent profiles:** "Start using Nova for brief drafting." → Profile update.
- **Review injection decisions:** "Show me what you injected for the last Henderson request."

What CANNOT be adjusted conversationally (requires Claude Code):
- Injection algorithm thresholds and parameters
- CIL hierarchy positions
- Token budget allocations
- Core pack tool definitions
- ProvenanceChain schema changes

### 41.6 Text-to-SQL Inspectability

Because data is structured in SQLite, the LLM can query its own brain directly:

```ts
type QueryKnowledgeSQLRequest = {
  sql: string;               // read-only SQL
  max_rows: number;          // default 20, hard cap 100
  timeout_ms: number;        // default 500, hard cap 2000
};

type QueryKnowledgeSQLResult = {
  columns: string[];
  rows: unknown[][];
  row_count: number;
  truncated: boolean;
  execution_ms: number;
};
```

**Safety constraints:**
- Read-only connection (`PRAGMA query_only = ON`)
- Query timeout (bounds the cost of expensive full-table scans)
- Row limit
- Schema provided to LLM as part of KOI or on-demand via `describe_knowledge_schema`

When the user asks "What cases did I work on last week involving loss causation?" the LLM writes SQL against the graph tables directly. This replaces most custom `inspect_knowledge` query types with one general-purpose tool.
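
The request limits above can be enforced before a query reaches SQLite. The following is a minimal sketch, not the normative implementation: `clampSqlRequest` and `looksReadOnly` are hypothetical helper names, and making `max_rows` and `timeout_ms` optional on input is an assumption made so the documented defaults can be applied. The text check is only a coarse pre-filter (a CTE-prefixed write would pass it), so the read-only connection remains the actual guard.

```ts
// Assumption: callers may omit the limits, in which case defaults apply.
type QueryKnowledgeSQLRequest = {
  sql: string;
  max_rows?: number;
  timeout_ms?: number;
};

// Defaults and hard caps taken from the comments on the request type.
const LIMITS = {
  max_rows: { fallback: 20, cap: 100 },
  timeout_ms: { fallback: 500, cap: 2000 },
} as const;

function clamp(value: number | undefined, fallback: number, cap: number): number {
  if (value === undefined || !Number.isFinite(value) || value < 1) return fallback;
  return Math.min(Math.floor(value), cap);
}

// Hypothetical helper: apply defaults and hard caps to a request.
function clampSqlRequest(req: QueryKnowledgeSQLRequest): Required<QueryKnowledgeSQLRequest> {
  return {
    sql: req.sql.trim(),
    max_rows: clamp(req.max_rows, LIMITS.max_rows.fallback, LIMITS.max_rows.cap),
    timeout_ms: clamp(req.timeout_ms, LIMITS.timeout_ms.fallback, LIMITS.timeout_ms.cap),
  };
}

// Hypothetical pre-filter: accept one statement that starts with SELECT or WITH.
// It is deliberately conservative: any inner semicolon is rejected.
function looksReadOnly(sql: string): boolean {
  const body = sql.trim().replace(/;\s*$/, "");
  if (body.includes(";")) return false; // reject stacked statements
  return /^(select|with)\b/i.test(body);
}
```

A caller would typically run `looksReadOnly` first, then pass the clamped `max_rows` and `timeout_ms` to whatever executes the query on the `query_only` connection.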

### 41.7 The Knowledge Operating Instructions Surface

Q Dashboard displays alongside packet inspector:
- **Current KOI baseline** (editable)
- **Last injection report:** what was injected, what tags, what trimmed
- **Agent knowledge profiles** (editable)
- **Active suppressions**
- **Injection experience summary:** which knowledge used well vs causing friction
- **Knowledge health dashboard** (§36)

---


### 41.8 Knowledge Manager UI Specification

**Cross-doc UI obligation:** any new Knowledge Manager / System Health surface, stored content type, route, setting, or telemetry panel added by this revision SHALL be registered in DOC20 §6.18.2, DOC21, and DOC22 in the same integration pass.

The Knowledge Manager is the user-facing control surface for the entire knowledge system. It is a page in the Q Dashboard, not a separate application.

**Layout:** Left sidebar with navigation tabs. Main content area for the active section. Right panel (optional) for detail view of selected entity/concept/memory.

**Tab 1 — Overview / Health Dashboard**
- Graph stats: total entities, domain concepts, obligations, procedures, memories
- Active domain profiles with status indicators
- Daily budget usage (progress bar)
- Extraction quality: rolling 30-day precision per surface (sparkline charts)
- Items needing review count (badge)
- One-click "Sweep now" for any active work context
- Recent extraction activity (last 24h summary)
- Knowledge debt score (unresolved suggestions, low-confidence active nodes, active contradictions, extraction-precision trend)
- Backlog health (queue depth, dropped items this week, deferred items)
- Graph density / hygiene metrics with manual hygiene trigger
- Extraction quality panel by surface/model/candidate class

```ts
type KnowledgeDebtScore = {
  unresolved_suggestion_count: number;
  low_confidence_active_nodes: number;  // active nodes with confidence < 0.4
  active_contradictions: number;
  extraction_precision_7d: number;
  debt_score: number;                   // composite 0-100 (lower is better)
};
```

Knowledge debt is a deterministic derived health metric surfaced in Tab 1. The composite score MUST be computed from the fields above and refreshed with the same cadence as the Health Dashboard read-models; it MUST NOT be a hand-entered or purely qualitative value.

**Tab 2 — World Model / Entity Browser**
- Searchable, filterable list of all entities
- Columns: name, kind, confidence, lifecycle state, last updated, linked work context
- Filter by: entity kind, work context, confidence range, lifecycle state, domain profile
- Click entity → right panel: full properties, relationships graph, provenance chain, experience records, edit/merge/archive actions

**Tab 3 — Domain Concepts**
- Hierarchical tree view of domain concepts
- Each concept shows: name, hierarchy path, linked authorities count, confidence, `still_current` status
- Conflict indicators on concepts with `contradicts` edges

**Tab 4 — Conversation Recall**
- Searchable list of conversation thread checkpoints
- Action: "Resume this context" (loads into active context)

**Tab 5 — Suggestions Inbox**
- Pending suggestions: entity merges, profile activations, domain concept proposals, correction proposals
- Actions per suggestion: accept, reject, edit, defer
- Batch actions for quick triage

**Tab 6 — Procedures / Standing Procedures**
- List of all procedures and standing procedures
- Status, trigger conditions, last executed, success rate

**Tab 7 — Memory & Constraints**
- List of all memory directives
- Columns: summary, scope, confidence, source, last applied, correction count
- Actions: edit/rewrite, demote/delete, convert to standing procedure, change scope

**Tab 8 — Audit / Telemetry**
- Knowledge timeline: chronological feed of creates, promotes, corrections, merges, contradictions, injections
- Filter by entity, surface, date range, event type
- Export capability for review

```ts
type KnowledgeTelemetryEvent =
  | { kind: "entity_created"; entity_id: string; source_surface: IntakeSurface; created_at: string }
  | { kind: "entity_promoted"; entity_id: string; prior_state: string; new_state: string; created_at: string }
  | { kind: "entity_merged"; source_ids: string[]; target_id: string; created_at: string }
  | { kind: "memory_corrected"; memory_id: string; correction_id: string; created_at: string }
  | { kind: "standing_procedure_executed"; procedure_id: string; status: string; created_at: string }
  | { kind: "domain_concept_conflicted"; conflict_id: string; concept_ids: string[]; created_at: string }
  | { kind: "packet_card_used"; card_type: string; linked_id?: string; created_at: string }
  | { kind: "suggestion_accepted"; suggestion_id: string; created_at: string }
  | { kind: "suggestion_rejected"; suggestion_id: string; created_at: string }
  | { kind: "knowledge_commit_created"; commit_id: string; created_at: string }
  | { kind: "knowledge_commit_rolled_back"; commit_id: string; rollback_id: string; created_at: string };
```

**Decay alerting (constrained):** NO per-item decay notifications. Weekly digest includes a single line for deadline-linked obligations and current-status items only.

**R5.7 design note:** the growing Health / Audit / Graph-Hygiene surface should be reviewed during DOC20/DOC21 design. Splitting operational health into a dedicated System Health page remains a valid product-direction option.

**Live knowledge health widget (Q Dashboard sidebar):** Persistent small widget showing budget usage, active profile indicators, items needing review, one-click "Sweep now."

### 41.9 Observer Effect Prevention in Inspectability

When the user inspects their own knowledge graph through the Knowledge Manager or conversational queries, the inspection itself must NOT generate extraction events. Hash-based extraction-side tracking (§39.10) prevents the system from "learning" its own inspection output. Conversational queries about knowledge state are explicitly excluded from the conversation mining pipeline.

## 42. Graph Cleanup / Entropy Control

### 42.1 Why This Matters

Without active graph hygiene, the graph becomes sludge after 6 months — full of stale candidates, near-duplicates, orphan nodes, and contradictions.

### 42.2 Nightly Cleanup Jobs

| Job | Rule | Threshold |
|---|---|---|
| **Candidate decay** | `observed` entities that never reached `suggested` | 14 days → archive |
| **Suggestion expiry** | `suggested` entities that never reached `confirmed` | 30 days → archive |
| **Near-duplicate detection** | Similar canonical_name + overlapping aliases | → merge suggestion queue |
| **Contradiction detection** | Conflicting provenance entries on same node | → contradiction review queue |
| **Orphan detection** | Nodes with zero active edges | → flag for review |
| **Trace compaction** | Execution traces older than 180 days | → compress to summary records |
| **Experience compaction** | UsageEvents older than 30 days on non-Tier-A nodes | → aggregate into counters |
| **Stale procedure sweep** | Procedures past TTL without validation | → mark stale, surface top 10 |
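
The first two rows of the table (candidate decay and suggestion expiry) reduce to an age check per lifecycle state. The sketch below illustrates that check under stated assumptions: the field name `state_entered_at` and the choice to measure age from the moment the entity entered its current state are hypothetical, since the table does not say which timestamp the threshold is measured from.

```ts
type LifecycleState = "observed" | "suggested" | "confirmed";

type CandidateEntity = {
  id: string;
  lifecycle_state: LifecycleState;
  state_entered_at: string; // ISO timestamp; hypothetical field name
};

const DAY_MS = 24 * 60 * 60 * 1000;

// Thresholds from the nightly cleanup table: observed 14 days, suggested 30 days.
const ARCHIVE_AFTER_DAYS: Partial<Record<LifecycleState, number>> = {
  observed: 14,
  suggested: 30,
};

function shouldArchive(entity: CandidateEntity, now: Date): boolean {
  const limitDays = ARCHIVE_AFTER_DAYS[entity.lifecycle_state];
  if (limitDays === undefined) return false; // confirmed entities are not decayed here
  const ageMs = now.getTime() - new Date(entity.state_entered_at).getTime();
  return ageMs >= limitDays * DAY_MS;
}
```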

### 42.3 Entropy Metrics (Knowledge Health Dashboard)

- Active node count by kind
- Archive rate this week
- Merge candidates pending
- Contradiction queue depth
- Orphan node count
- Average confidence by node kind
- Staleness distribution
- Extraction quality (rolling precision per candidate class from §6.5)

### 42.4 Operator Review Surfaces (Q Dashboard)

- Merge/split queue with preview
- Contradiction review with provenance comparison
- Stale node list with one-click archive or re-validate
- Suggestions inbox with bulk dismiss

### 42.5 SQLite VACUUM

Runs after all nightly cleanup jobs. See §3.6.

---


### 42.6 Semantic Folding

#### 42.6.1 Four-Step Architecture

**Step 1 — Delta vector sweep (recent vs all):**

```sql
WITH recent_nodes AS (
  SELECT id, node_kind FROM nodes
  WHERE updated_at >= datetime('now', '-1 day') AND is_active = 1
)
SELECT r.id AS node_a, b.id AS node_b,
       vec_distance_cosine(va.embedding, vb.embedding) AS distance
FROM recent_nodes r
JOIN vec_nodes va ON r.id = va.node_id
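JOIN nodes b ON r.node_kind = b.node_kind AND r.id <> b.id
JOIN vec_nodes vb ON b.id = vb.node_id
WHERE b.is_active = 1 AND distance < :threshold
  -- Compare each recent node against ALL active nodes of the same kind.
  -- When both nodes are recent, keep one ordering so the pair is not reported twice.
  AND (r.id < b.id OR b.id NOT IN (SELECT id FROM recent_nodes))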
ORDER BY distance ASC LIMIT 50;
```

Per-`node_kind` cosine-distance thresholds (bound to `:threshold`): domain concepts 0.08, preferences 0.10, entities 0.05, procedures 0.06.

**Step 2 — LLM discriminator:** FOLD / RELATED / CONFLICT verdicts.

**Step 3 — Canonical synthesis:** Merged node gets combined description, dominant provenance.

**Step 4 — Graph surgery:**

```ts
type FoldDecision = {
  verdict: "FOLD" | "RELATED" | "CONFLICT";
  node_a_id: string;
  node_b_id: string;
  canonical_name?: string;
  canonical_description?: string;
  reviewed_by: "llm" | "human";
  confidence: number;
};

type NodeMergeMap = {
  old_node_id: string;
  new_node_id: string;
  merged_at: string;
  merge_commit_id: string;
};
```

α/β merging: `merged_alpha = Math.max(a.alpha, b.alpha) + 0.5` (corroboration bonus), `merged_beta = Math.max(a.beta, b.beta)`. Edge rewiring uses the canonical temporal key and updates `confidence = MAX(edges.confidence, excluded.confidence)`. Reject merge if either node has unresolved corrections unless human-approved.

```sql
-- In semantic folding edge rewiring:
INSERT INTO edges (source_id, target_id, relation_type, confidence, start_date)
VALUES (?, ?, ?, ?, COALESCE(?, CURRENT_TIMESTAMP))
ON CONFLICT(source_id, target_id, relation_type, start_date)
DO UPDATE SET confidence = MAX(edges.confidence, excluded.confidence);
```
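
The α/β merge rule and the unresolved-corrections guard can be sketched as a single pure function. This is an illustration only: `has_unresolved_corrections` is a hypothetical flag standing in for however the implementation detects pending corrections, and `mergeBetaParams` is not a name from this spec.

```ts
type BetaNode = {
  id: string;
  alpha: number;
  beta: number;
  has_unresolved_corrections: boolean; // hypothetical flag
};

type MergeOutcome =
  | { ok: true; alpha: number; beta: number }
  | { ok: false; reason: "unresolved_corrections" };

const CORROBORATION_BONUS = 0.5;

function mergeBetaParams(a: BetaNode, b: BetaNode, humanApproved = false): MergeOutcome {
  // Reject the merge if either node has unresolved corrections, unless a human approved it.
  if ((a.has_unresolved_corrections || b.has_unresolved_corrections) && !humanApproved) {
    return { ok: false, reason: "unresolved_corrections" };
  }
  return {
    ok: true,
    alpha: Math.max(a.alpha, b.alpha) + CORROBORATION_BONUS,
    beta: Math.max(a.beta, b.beta),
  };
}
```

For example, merging a node with α = 4, β = 1 into a node with α = 2.5, β = 2 yields α = 4.5, β = 2.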

#### 42.6.2 Folding Constraints

- Same `node_kind` only. Both `confirmed` or higher.
- Never auto-fold across different active work contexts.
- Never fold `domain_concepts` whose earliest provenance dates are more than 24 months apart unless the LLM discriminator verifies that the standard has not been superseded.
- Maximum 50 pairs per night. "Never fold these" user override.

### 42.7 Hierarchical Summarization

Cluster members are soft-suppressed, NOT archived. `CommunitySummaryNode` stores `member_node_ids` so that suppressed members can be recovered. This keeps the graph from being cluttered with dozens of similar low-level nodes when a single cluster summary serves retrieval better.

### 42.8 Automated Regression Harness

Maintain a fixed set of gold notes, docs, and chats. Run a nightly extraction diff against that set, and block DOC8 threshold updates when the measured regression exceeds the configured limit. This prevents self-learning from silently degrading extraction quality.

### 42.9 Graph Health Practices

FK cascades (`ON DELETE CASCADE`). Incremental auto-vacuum. Incremental vector migration for embedding model upgrades (§3.7).


### 42.10 Nightly Job DAG and Failure Policy

```ts
type NightlyJobId =
  | "doc72_cleanup"
  | "doc72_consolidation"
  | "doc72_conflict_refresh"
  | "matrix_reward_and_attribution"
  | "matrix_bundle_compile"
  | "matrix_bundle_activate"
  | "doc24_runtime_refresh";

const NIGHTLY_DAG: Array<{ job_id: NightlyJobId; depends_on: NightlyJobId[] }> = [
  { job_id: "doc72_cleanup", depends_on: [] },
  { job_id: "doc72_consolidation", depends_on: ["doc72_cleanup"] },
  { job_id: "doc72_conflict_refresh", depends_on: ["doc72_consolidation"] },
  { job_id: "matrix_reward_and_attribution", depends_on: ["doc72_conflict_refresh"] },
  { job_id: "matrix_bundle_compile", depends_on: ["matrix_reward_and_attribution"] },
  { job_id: "matrix_bundle_activate", depends_on: ["matrix_bundle_compile"] },
  { job_id: "doc24_runtime_refresh", depends_on: ["matrix_bundle_activate"] },
];
```

If any job fails, keep the prior active runtime generation, mark the system degraded, and retry on the next nightly cycle. Jobs process incrementally from per-job high-water marks and checkpoint when they hit their duration cap.
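
The failure policy can be illustrated with a small synchronous runner. This is a simplified sketch under stated assumptions: jobs are run in declaration order (which is a valid topological order for `NIGHTLY_DAG`), downstream jobs of a failed job are skipped rather than attempted, and `runNightly`, `NightlyOutcome`, and `activate_new_generation` are hypothetical names. High-water marks and duration caps are omitted.

```ts
type NightlyJob = { job_id: string; depends_on: string[] };
type JobStatus = "succeeded" | "failed" | "skipped";

type NightlyOutcome = {
  statuses: Map<string, JobStatus>;
  degraded: boolean;
  activate_new_generation: boolean; // false means: keep the prior active runtime generation
};

function runNightly(dag: NightlyJob[], runJob: (jobId: string) => void): NightlyOutcome {
  const statuses = new Map<string, JobStatus>();
  for (const job of dag) {
    // A job is blocked if any dependency did not succeed in this cycle.
    const blocked = job.depends_on.some((dep) => statuses.get(dep) !== "succeeded");
    if (blocked) {
      statuses.set(job.job_id, "skipped");
      continue;
    }
    try {
      runJob(job.job_id);
      statuses.set(job.job_id, "succeeded");
    } catch {
      statuses.set(job.job_id, "failed");
    }
  }
  const degraded = Array.from(statuses.values()).some((s) => s !== "succeeded");
  return { statuses, degraded, activate_new_generation: !degraded };
}
```

In this sketch a failure in `doc72_consolidation` would leave every later job skipped, mark the cycle degraded, and leave the previous runtime generation active until the next nightly run.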

### 42.11 Link Density Governor and Graph Hygiene

```ts
type LinkDensityGovernor = {
  max_outgoing_edges_per_node_kind: Record<string, number>;
  max_new_edges_per_observation: number;
  min_edge_confidence_for_promotion: number;
  max_generic_semantic_edges_per_node: number;
};
```

Nightly hygiene rule: edges with `utility_score < 0.1` for 60 days and no authority or user verification are demoted to `observed` and excluded from hot retrieval.



## 42A. Graph Intelligence Enhancement V2.2 (Integrated)

The following module is incorporated directly from the accepted Graph Intelligence Enhancement V2.2 proposal. It is normative in R5.73 and supersedes the V2.1 absorption that landed in R5.7. V2.2 incorporates 29 normalized patches (NP01-NP29) drawn from a 5-reviewer red-team adjudication. Like V2.1, it supersedes earlier thinner novelty-gate, direct edge-candidate, hybrid retrieval, compiled-truth, and graph-hygiene text where conflicts exist.

### 42A.0 Governing principles for V2.2

1. **No phasing.** All subsystems built and running from day one. Empty ledgers and zero-result queries are natural zero states.
2. **DOC72 owns knowledge shape.** DOC24/KDA owns rendering templates and the rendering pipeline. This applies to compiled truth, constellations, and weekly digests: DOC72 owns mirror schemas and render-input builders; DOC24/KDA owns the markdown/email render templates.
3. **Mention observations are evidence, not canonical truth.** Raw mentions → evidence bundles → promoted canonical edges. This is the core architectural change from V1.
4. **Deterministic tiers run inline; LLM tiers are queued/nightly by default.** Tiers 1, 2, 3, 3.5 run synchronously (sub-500ms, no LLM). Tier 4 is queued for nightly batch by default, with bounded opt-in synchronous mode. Tier 5 produces review queue items only.
5. **Every config knob must be wired to a real consume path.** No ghost controls.
6. **Every user-facing action must map to a real command chain.** `user action → EC command → durable write → telemetry → refreshed read-model`.
7. **Resolution optimizes truth, not graph exploration.** Tied candidates → additional discriminators or abstain. No non-hub preference.
8. **Hub penalty at promotion time only.** Not at resolution time.
9. **Domain-agnostic architecture [NP09].** Relation types, span extraction patterns, and seed rules split into domain-agnostic core + domain-profile extensions loaded from DOC72 §35. Legal-specific vocabulary is a domain profile contribution, not a hardcoded default. A fresh ELNOR instance with no domain profile active SHALL use only domain-agnostic core patterns and core relation types.
10. **Typed relation vocabulary.** Relation types use a canonical enum with core + domain splits, not free strings.
11. **Maximal cliques for pattern clustering.** Not connected components (prevents hairball merging via bridging entities).
12. **Cross-context expansion is fallback-only.** Triggered when local resolution fails, not by regex detection of comparative language alone.
13. **Every schema field and scoring term must have a computation function.** No declared-but-never-computed fields.

#### 42A.0.1 Supersession statement

> **Supersession statement.** This V2.2 proposal supersedes the DOC72 Graph Intelligence Enhancement Proposal V2.1 in full. It incorporates all 29 normalized patches from the combined V2.1 red-team adjudication: domain-agnosticism restructuring (NP09), Tier 4 queue/output closure (NP05), evidence scoring and link-budget wiring (NP04), self-learning computation closure (NP11), derived-table infrastructure (NP01), mention-observation idempotency fix (NP02), evidence-bundle durability with monotonic watermark (NP03), alias-lexicon visibility and scale fixes (NP08/NP26), ordered motif mining algorithm (NP06), Tier 5 diversity gates and link-existing resolution (NP07), UI/route closure including retarget/rekey (NP10), retention/compaction (NP12), clique bounds (NP13), constellation scene quality and render-owner seam (NP14), graph-hygiene state model (NP15), promoted-edge visibility enforcement (NP16), regex runtime enforcement (NP17), relation directionality (NP18), participant matching improvement (NP19), relation-cue fallback noise suppression (NP20), mirror rebuild dedup (NP21), weekly digest content and render seam (NP22), helper function definitions (NP23), graph-context diversity consumption (NP25), compiled truth render polymorphism (NP27), cooldown reset semantics (NP28), and nightly DAG (NP29).

---

### 42A.1 Contextual P-0 Novelty Gate

#### 42A.1.1 What it replaces
The static ~80-token "ALREADY KNOWN" cap in DOC72 §20A.

#### 42A.1.2 Design
Priority-fill within a 500-token budget, with conditional diversity guardrails, hub suppression, and generic-span exclusion.

#### 42A.1.3 Configuration schema

```ts
export const ContextualNoveltyGateConfigSchema = z.object({
  max_tokens: z.number().int().positive().default(500),
  work_context_cap_tokens: z.number().int().positive().default(300),
  participant_cap_tokens: z.number().int().positive().default(120),
  recent_cap_tokens: z.number().int().positive().default(80),
  cross_context_cap_tokens: z.number().int().positive().default(80),
  participant_min_reserved_tokens: z.number().int().nonnegative().default(40),
  cross_context_min_reserved_tokens: z.number().int().nonnegative().default(30),
  include_relationship_hints: z.boolean().default(true),
  suppress_hub_nodes_above_edge_count: z.number().int().positive().default(150),
  exclude_generic_document_refs: z.boolean().default(true),
  recent_window_days: z.number().int().positive().default(14),
  max_recent_entities: z.number().int().positive().default(20),
  schema_version: z.literal(1),
});
```

**Storage:** `ELNOR_MEMORY/config/novelty_gate_config.json`

#### 42A.1.4 Gate builder algorithm

1. Gather candidates from four sources: work-context entities, participant entities, recently active entities, cross-context reusable entities.
2. Apply quality filters to each source: remove hub nodes above threshold, remove generic document references, apply visibility constraints, apply minimum salience threshold.
3. After filtering, activate minimum reservations for participant and cross-context sources IF filtered candidates exist. If a source has no candidates after filtering, its reservation releases to the general pool.
4. Fill remaining budget by priority score, capping each source category.
5. Surplus from unused caps flows to the next-highest-priority unfilled candidates from any source.
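
Steps 3 to 5 above can be sketched as three passes over one priority-ranked candidate list. This is a hedged illustration, not the normative builder: it assumes quality filtering (step 2) has already happened, that each candidate carries a precomputed `tokens` estimate and `priority` score, and that the type and function names (`GateCandidate`, `GateBudget`, `fillGate`) are hypothetical.

```ts
type GateSource = "work_context" | "participant" | "recent" | "cross_context";
type GateCandidate = { id: string; source: GateSource; tokens: number; priority: number };

type GateBudget = {
  max_tokens: number;
  caps: Record<GateSource, number>;
  min_reserved: Partial<Record<GateSource, number>>;
};

function fillGate(candidates: GateCandidate[], budget: GateBudget): GateCandidate[] {
  const ranked = [...candidates].sort((a, b) => b.priority - a.priority);
  const chosen = new Set<string>();
  const picked: GateCandidate[] = [];
  const used: Record<GateSource, number> = { work_context: 0, participant: 0, recent: 0, cross_context: 0 };
  let total = 0;

  // Take a candidate if it fits both the overall budget and the given per-source limit.
  const take = (c: GateCandidate, sourceLimit: number): void => {
    if (chosen.has(c.id)) return;
    if (total + c.tokens > budget.max_tokens) return;
    if (used[c.source] + c.tokens > sourceLimit) return;
    chosen.add(c.id);
    picked.push(c);
    used[c.source] += c.tokens;
    total += c.tokens;
  };

  // Step 3: minimum reservations. A source with no surviving candidates takes
  // nothing, so its reservation is implicitly released to the general pool.
  for (const source of ["participant", "cross_context"] as const) {
    const reserve = budget.min_reserved[source] ?? 0;
    for (const c of ranked) if (c.source === source) take(c, reserve);
  }
  // Step 4: priority fill under per-source caps.
  for (const c of ranked) take(c, budget.caps[c.source]);
  // Step 5: surplus flows to the next-highest-priority candidates from any source.
  for (const c of ranked) take(c, budget.max_tokens);
  return picked;
}
```

The reservation pass is what keeps a low-priority participant entity from being starved when higher-priority work-context and recent entities would otherwise consume the whole budget.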

**Source queries (verified against DOC72 R5.6 schema):**

Source 1 — work-context entities:
```sql
SELECT DISTINCT n.id, n.canonical_name, n.node_kind, n.confidence, e.relation_type
FROM edges e
JOIN nodes n ON (
  CASE WHEN e.source_id = ? THEN e.target_id ELSE e.source_id END
) = n.id
WHERE (e.source_id = ? OR e.target_id = ?)
  AND n.is_active = 1
  AND n.lifecycle_state IN ('active', 'observed')
ORDER BY n.confidence DESC
LIMIT 40
```

Source 2 — participant entities:
```sql
SELECT n.id, n.canonical_name
FROM aliases a
JOIN nodes n ON a.node_id = n.id
WHERE n.is_active = 1
  AND n.node_kind = 'world_entity'
  AND (
    LOWER(a.normalized_alias) LIKE LOWER(?)
    OR LOWER(n.canonical_name) LIKE LOWER(?)
  )
LIMIT 5
```

Source 3 — recently active entities:
```sql
SELECT id, canonical_name FROM nodes
WHERE is_active = 1
  AND lifecycle_state IN ('active', 'observed')
  AND updated_at > datetime('now', '-' || ? || ' days')  -- recent_window_days (default 14)
  AND id NOT IN (/* already selected */)
ORDER BY updated_at DESC
LIMIT ?  -- max_recent_entities (default 20)
```

**Output format:** Each entity renders as a line: `"Entity Name (relation type)"` if hints enabled. Token estimate: `Math.ceil(line.length / 3.5)`.
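
A minimal sketch of the line rendering and token estimate described above. The helper names are hypothetical, and rendering the raw `relation_type` identifier inside the parentheses is an assumption; a real renderer might humanize it.

```ts
type GateEntity = { canonical_name: string; relation_type?: string };

// Render one gate line; the relation hint is appended only when hints are enabled
// and the entity actually carries a relation type.
function renderGateLine(entity: GateEntity, includeHints: boolean): string {
  return includeHints && entity.relation_type
    ? `${entity.canonical_name} (${entity.relation_type})`
    : entity.canonical_name;
}

// Token estimate from the spec: ceil(characters / 3.5).
function estimateTokens(line: string): number {
  return Math.ceil(line.length / 3.5);
}
```

For example, `Henderson (party_to)` is 20 characters long and is therefore estimated at 6 tokens.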

#### 42A.1.5 Telemetry

```ts
export const NoveltyGateTelemetrySchema = z.object({
  observation_id: z.string().max(120),
  total_candidates_before_filter: z.number().int().nonnegative(),
  total_candidates_after_filter: z.number().int().nonnegative(),
  tokens_used: z.number().int().nonnegative(),
  source_token_share: z.record(z.string(), z.number().int().nonnegative()).default({}),
  unused_reserved_tokens: z.number().int().nonnegative().default(0),
  reservation_released_tokens: z.number().int().nonnegative().default(0),
  starvation_events: z.array(z.object({
    source: z.string().max(40),
    requested_tokens: z.number().int().nonnegative(),
    granted_tokens: z.number().int().nonnegative(),
  })).default([]),
  hub_nodes_suppressed: z.number().int().nonnegative().default(0),
  generic_refs_excluded: z.number().int().nonnegative().default(0),
  gate_hit_rate: z.number().min(0).max(1).optional(),
  entities_referenced_from_gate: z.number().int().nonnegative().optional(),
  schema_version: z.literal(1),
});
```

**Gate effectiveness feedback [Enhancement]:** After each extraction, compare P-0's output entity references against the novelty gate contents. Track `gate_hit_rate = entities_referenced_from_gate / entities_in_gate`, where `entities_in_gate` is the number of entities rendered into the gate for that observation. If the hit rate is consistently below 10%, the gate is wasting tokens on irrelevant entities. This feeds back into gate tuning and the self-learning lane.
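
The hit-rate computation can be sketched as follows. The zero-gate guard and the reading of "consistently" as a mean over a minimum number of recent runs are assumptions, as are the names `gateHitRate`, `isGateWasteful`, and the `minSamples` default.

```ts
// Fraction of gate entities that P-0 actually referenced in its output.
// Returns undefined for an empty gate, matching the optional telemetry field.
function gateHitRate(gateEntityIds: string[], referencedEntityIds: string[]): number | undefined {
  const gate = new Set(gateEntityIds);
  if (gate.size === 0) return undefined;
  const hits = new Set(referencedEntityIds.filter((id) => gate.has(id)));
  return hits.size / gate.size;
}

// Assumption: "consistently below 10%" means the mean over at least
// `minSamples` recent runs is under the threshold.
function isGateWasteful(recentRates: number[], threshold = 0.1, minSamples = 20): boolean {
  if (recentRates.length < minSamples) return false;
  const mean = recentRates.reduce((sum, r) => sum + r, 0) / recentRates.length;
  return mean < threshold;
}
```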

#### 42A.1.6 Cost impact

| Gate size | Extra tokens/day (25 runs) | % of 500K cap | $/day |
|---|---|---|---|
| 80 (current) | 0 | 0% | $0 |
| 500 (default) | 10,500 | 2.1% | ~$0.013 |

#### 42A.1.7 Normative rules
1. Replaces the static 80-token cap in §20A.
2. `max_tokens` is user-configurable.
3. Gate builder is SQL-only, no LLM cost.
4. Reservations are conditional on candidates surviving quality filters.

---

### 42A.2 Back-Link Enforcement Pipeline

#### 42A.2.1 Core architecture (V2 change from V1)

- **V1:** Scan → resolve → create edge candidates directly.
- **V2:** Scan → create mention observations → aggregate into evidence bundles → promote qualifying bundles to canonical edges.

#### 42A.2.2 Canonical relation type vocabulary [NP09, NP18]

All relation types use a canonical enum split into domain-agnostic core types and domain-profile extensions. Learned relation types from user corrections extend the enum through a governed migration path.

```ts
// Domain-agnostic core relation types (always available)
export const CoreRelationTypeSchema = z.enum([
  "linked_to_context",
  "affiliated_with",
  "produced_by",
  "referenced_in",
  "managed_by",
  "created_by",
  "collaborated_with",
  "used_in",
  "depends_on",
  "part_of",
  "semantically_related",
]);

// Domain-specific relation types (loaded from domain profiles)
// Legal domain profile contributes these:
export const LegalDomainRelationTypeSchema = z.enum([
  "adjudicated_by",
  "represented_by",
  "expert_in",
  "party_to",
]);

// Combined canonical type: core + active domain extensions
export const CanonicalRelationTypeSchema = z.union([
  CoreRelationTypeSchema,
  LegalDomainRelationTypeSchema,
  // Additional domain schemas merged at profile load time
]);

export const ExtendedRelationTypeSchema = z.union([
  CanonicalRelationTypeSchema,
  z.string().max(80).refine(
    (val) => val.startsWith("learned_"),
    "Extended relation types must be prefixed with 'learned_'",
  ),
]);
```

**Relation directionality [NP18]:**

```ts
export const UndirectedRelationTypes = new Set([
  "affiliated_with", "collaborated_with", "semantically_related",
]);

function normalizeEdgePairOrdering(
  sourceId: string, targetId: string, relationType: string,
): { source_id: string; target_id: string } {
  if (UndirectedRelationTypes.has(relationType)) {
    return sourceId < targetId
      ? { source_id: sourceId, target_id: targetId }
      : { source_id: targetId, target_id: sourceId };
  }
  return { source_id: sourceId, target_id: targetId };
}

// Domain-profile relation definitions declare directionality
export const DomainRelationDefinitionSchema = z.object({
  domain_id: z.string().max(80),
  relation_type: z.string().max(80),
  directionality: z.enum(["directed", "undirected"]),
  reverse_relation_type: z.string().max(80).optional(),
  schema_version: z.literal(1),
});
```

At profile load time, EC SHALL merge all `directionality = "undirected"` relation types from domain profiles into the pair-normalization set. Evidence bundles and promoted edges SHALL normalize pair ordering for all undirected relation types before write.

Seed rules use canonical types. User-learned types use `learned_` prefix. Learned types may be promoted to canonical via a governed migration when support count exceeds a threshold.
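
The profile-load merge of undirected relation types can be sketched without Zod as plain TypeScript. This is illustrative only: `buildUndirectedSet` and `normalizePair` are hypothetical names, and `co_counsel_with` in the usage note is an invented relation type used purely as an example, not part of any domain profile defined here.

```ts
type DomainRelationDefinition = {
  domain_id: string;
  relation_type: string;
  directionality: "directed" | "undirected";
};

// Core undirected types, as listed in UndirectedRelationTypes.
const CORE_UNDIRECTED = ["affiliated_with", "collaborated_with", "semantically_related"];

// At profile load time, fold every undirected domain relation into the set.
function buildUndirectedSet(activeDefinitions: DomainRelationDefinition[]): Set<string> {
  const undirected = new Set<string>(CORE_UNDIRECTED);
  for (const def of activeDefinitions) {
    if (def.directionality === "undirected") undirected.add(def.relation_type);
  }
  return undirected;
}

// Same ordering rule as normalizeEdgePairOrdering, but driven by the merged set.
function normalizePair(
  sourceId: string,
  targetId: string,
  relationType: string,
  undirected: Set<string>,
): { source_id: string; target_id: string } {
  if (undirected.has(relationType) && targetId < sourceId) {
    return { source_id: targetId, target_id: sourceId };
  }
  return { source_id: sourceId, target_id: targetId };
}
```

With a hypothetical legal-profile definition `{ relation_type: "co_counsel_with", directionality: "undirected" }` loaded, the pair `("n9", "n2")` would be written as `("n2", "n9")`, while a directed type such as `party_to` keeps its original order.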

#### 42A.2.3 Data layer schemas

##### 42A.2.3A Derived table infrastructure [NP01]

The back-link pipeline relies on three derived SQLite tables rebuilt from canonical JSONL logs. These are queryable projections only, NOT second truth stores.

```sql
CREATE TABLE mention_observations_derived (
  append_seq INTEGER PRIMARY KEY AUTOINCREMENT,
  mention_id TEXT NOT NULL UNIQUE,
  observation_id TEXT NOT NULL,
  source_node_id TEXT NOT NULL,
  resolved_node_id TEXT,
  normalized_mention TEXT NOT NULL,
  span_class TEXT NOT NULL,
  state TEXT NOT NULL,
  visibility_class TEXT NOT NULL,
  work_context_id TEXT,
  surface TEXT NOT NULL,
  occurred_at DATETIME NOT NULL,
  start_offset INTEGER NOT NULL,
  end_offset INTEGER NOT NULL
);
CREATE INDEX idx_mod_observation ON mention_observations_derived(observation_id);
CREATE INDEX idx_mod_resolved ON mention_observations_derived(resolved_node_id);
CREATE INDEX idx_mod_state ON mention_observations_derived(state);
CREATE INDEX idx_mod_work_context ON mention_observations_derived(work_context_id);
-- No separate index on append_seq: INTEGER PRIMARY KEY is the rowid and is already indexed.

CREATE TABLE backlink_rejections_derived (
  rejection_id TEXT PRIMARY KEY,
  source_node_id TEXT NOT NULL,
  rejected_target_node_id TEXT NOT NULL,
  matched_text_span TEXT NOT NULL,
  rejection_reason TEXT NOT NULL,
  correct_target_node_id TEXT,
  created_at DATETIME NOT NULL
);
CREATE INDEX idx_brd_source_target ON backlink_rejections_derived(source_node_id, rejected_target_node_id);

CREATE TABLE review_queue_derived (
  queue_item_id TEXT PRIMARY KEY,
  item_kind TEXT NOT NULL,
  primary_ref TEXT NOT NULL,
  title TEXT,
  visibility_class TEXT NOT NULL DEFAULT 'cloud_allowed',
  status TEXT NOT NULL DEFAULT 'pending',
  created_at DATETIME NOT NULL
);
CREATE INDEX idx_review_queue_status ON review_queue_derived(status, item_kind);
CREATE INDEX idx_review_queue_created ON review_queue_derived(created_at DESC);
```

**TypeScript row schemas:**
```ts
export const MentionObservationsDerivedRowSchema = z.object({
  append_seq: z.number().int().positive(),
  mention_id: z.string().max(120),
  observation_id: z.string().max(120),
  source_node_id: z.string().max(120),
  resolved_node_id: z.string().max(120).nullable(),
  normalized_mention: z.string().max(500),
  span_class: MentionSpanClassSchema,  // defined under "Mention observations" below
  state: z.enum(["observed_mention","candidate_relation","reinforced_relation","user_verified","suppressed"]),
  visibility_class: z.enum(["local_only","cloud_allowed","cloud_warn","blocked"]),
  work_context_id: z.string().max(120).nullable(),  // nullable SQL column: rows carry null, not undefined
  surface: z.string().max(80),
  occurred_at: z.string().datetime(),
  start_offset: z.number().int().nonnegative(),
  end_offset: z.number().int().nonnegative(),
  schema_version: z.literal(1),
});

export const BackLinkRejectionDerivedRowSchema = z.object({
  rejection_id: z.string().max(120),
  source_node_id: z.string().max(120),
  rejected_target_node_id: z.string().max(120),
  matched_text_span: z.string().max(500),
  rejection_reason: z.enum(["wrong_entity","no_relationship","wrong_relation_type","spurious_mention"]),
  correct_target_node_id: z.string().max(120).nullable(),  // nullable SQL column: rows carry null, not undefined
  created_at: z.string().datetime(),
  schema_version: z.literal(1),
});
```

**Rebuild rule:** EC SHALL rebuild all derived tables at startup from canonical JSONL logs if they are absent or marked invalid. During normal operation, EC SHALL append/update these tables transactionally after each new write. Rebuilds SHALL be idempotent.
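
The idempotency requirement on rebuilds can be illustrated with an in-memory projection. This is a sketch under stated assumptions: a `Map` stands in for the derived SQLite table, only three columns are modeled, and `rebuildMentionProjection` is a hypothetical name. Mutable-field updates such as later resolution are not modeled.

```ts
type MentionRow = { mention_id: string; observation_id: string; normalized_mention: string };

// Replay canonical JSONL lines into a keyed projection.
// The first record for a mention_id wins, mirroring INSERT OR IGNORE.
function rebuildMentionProjection(jsonlLines: string[]): Map<string, MentionRow> {
  const table = new Map<string, MentionRow>();
  for (const line of jsonlLines) {
    if (line.trim() === "") continue; // tolerate blank lines in the log
    const row = JSON.parse(line) as MentionRow;
    if (!table.has(row.mention_id)) table.set(row.mention_id, row);
  }
  return table;
}
```

Because the projection is keyed by `mention_id`, replaying the same log twice, or replaying a log that contains duplicate records, produces the same table.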

##### Mention observations (event/history lane) [NP02]

```ts
export const MentionSpanClassSchema = z.enum([
  "canonical_alias",
  "shorthand_alias",
  "role_titled_person",
  "organization_name",
  "formal_reference",
  "specific_document_reference",
  "generic_document_reference",
  "definite_description",
  "cross_context_reference",
  "product_or_model_reference",
]);

export const MentionObservationSchema = z.object({
  mention_id: z.string().max(120),
  observation_id: z.string().max(120),
  source_node_id: z.string().max(120),
  resolved_node_id: z.string().max(120).optional(),
  mention_text: z.string().max(500),
  normalized_mention: z.string().max(500),
  span_class: MentionSpanClassSchema,
  is_nested: z.boolean().default(false),
  parent_mention_id: z.string().max(120).optional(),
  start_offset: z.number().int().nonnegative(),
  end_offset: z.number().int().nonnegative(),
  relation_hypothesis: ExtendedRelationTypeSchema.optional(),
  resolution_method: z.enum([
    "alias_exact", "alias_prefix", "fts5_bm25", "canonical_substring",
    "vec_semantic", "vec_coreference", "llm_closed_set", "user_corrected", "unresolved",
  ]),
  resolution_confidence: z.number().min(0).max(1).default(0),
  graph_context_score: z.number().min(0).max(1).default(0),
  context_window: z.string().max(1000),
  work_context_id: z.string().max(120).optional(),
  surface: z.string().max(80),
  visibility_class: z.enum(["local_only", "cloud_allowed", "cloud_warn", "blocked"]),
  state: z.enum([
    "observed_mention", "candidate_relation", "reinforced_relation",
    "user_verified", "suppressed",
  ]).default("observed_mention"),
  suppression_reason: z.string().max(120).optional(),
  occurred_at: z.string().datetime(),
  schema_version: z.literal(1),
});
```

**Storage:** `ELNOR_MEMORY/system/backlink/mention_observations.jsonl` (append-only)

**Idempotency rule [NP02]:** `mention_id` is computed as `sha256(observation_id | source_node_id | normalized_mention | start_offset | end_offset | span_class)`. This key is resolution-independent — `resolved_node_id` is MUTABLE on the record and is updated in-place when Tier 4 resolves a previously unresolved mention. The UNIQUE constraint is on `mention_id` alone. Duplicate `mention_id` values are skipped via `INSERT OR IGNORE`.
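
A minimal sketch of the `mention_id` computation using Node's built-in `crypto` module. Joining the fields with a literal `|` and hex-encoding the digest are assumptions about how the formula above is serialized; `computeMentionId` and `MentionKey` are hypothetical names.

```ts
import { createHash } from "node:crypto";

// Only resolution-independent fields participate in the key.
type MentionKey = {
  observation_id: string;
  source_node_id: string;
  normalized_mention: string;
  start_offset: number;
  end_offset: number;
  span_class: string;
};

function computeMentionId(key: MentionKey): string {
  const material = [
    key.observation_id,
    key.source_node_id,
    key.normalized_mention,
    String(key.start_offset),
    String(key.end_offset),
    key.span_class,
  ].join("|"); // assumption: literal pipe delimiter
  return createHash("sha256").update(material, "utf8").digest("hex");
}
```

Because `resolved_node_id` is not part of the key material, a mention keeps the same `mention_id` before and after it is resolved. The 64-character hex digest fits within the 120-character limit on `mention_id`.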

Evidence aggregation SHALL count distinct mention groups per observation so nested parent/child spans sharing the same target do not double-inflate support for one target/relation.

##### Evidence bundles (aggregation lane) [NP03]

```ts
export const LinkEvidenceBundleSchema = z.object({
  evidence_bundle_id: z.string().max(120),
  source_node_id: z.string().max(120),
  target_node_id: z.string().max(120),
  relation_type: ExtendedRelationTypeSchema,
  support_mention_ids: z.array(z.string().max(120)).default([]),
  contradiction_mention_ids: z.array(z.string().max(120)).default([]),
  support_count: z.number().int().nonnegative().default(0),
  contradiction_count: z.number().int().nonnegative().default(0),
  distinct_observation_count: z.number().int().nonnegative().default(0),
  distinct_surface_count: z.number().int().nonnegative().default(0),
  distinct_work_context_count: z.number().int().nonnegative().default(0),
  visibility_class: z.enum(["local_only", "cloud_allowed", "cloud_warn", "blocked"]).default("cloud_allowed"),
  last_supported_at: z.string().datetime().optional(),
  hub_penalty: z.number().min(0).max(1).default(0),
  salience_score: z.number().min(0).max(1).default(0),
  user_verdict: z.enum(["none", "approved", "rejected", "relation_corrected"]).default("none"),
  promotion_state: z.enum([
    "candidate_relation", "reinforced_relation", "user_verified", "suppressed",
  ]).default("candidate_relation"),
  schema_version: z.literal(1),
});
```

**Storage [NP03]:** Evidence bundles are materialized in a derived SQLite table `evidence_bundles_derived`, rebuilt incrementally using a monotonic `append_seq` watermark from `mention_observations_derived`. A full rebuild runs when `force_full_scan = true` or when the aggregation state file is missing. During promotion, the promotion function reads from and writes to this table.

**Bundle ID [NP03]:** `evidence_bundle_id = sha256(source_node_id | target_node_id | relation_type)`. Bundle IDs are deterministic and stable across rebuilds given the same key tuple. Pair ordering for undirected relation types MUST be normalized before computing the bundle ID.
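
A hedged sketch of the bundle ID derivation. The set of undirected relation types and the lexicographic ordering rule are assumptions made for illustration; the spec delegates ordering to `normalizeEdgePairOrdering`, whose rule is not shown here. `UNDIRECTED_RELATION_TYPES` and `computeEvidenceBundleId` are hypothetical names.

```ts
import { createHash } from "node:crypto";

// Assumption: which relation types are undirected is not specified in this
// section; this set is a placeholder.
const UNDIRECTED_RELATION_TYPES = new Set<string>(["semantically_related"]);

function computeEvidenceBundleId(
  sourceNodeId: string,
  targetNodeId: string,
  relationType: string,
): string {
  let a = sourceNodeId;
  let b = targetNodeId;
  // Assumption: undirected pairs are normalized by lexicographic order.
  if (UNDIRECTED_RELATION_TYPES.has(relationType) && a > b) {
    const tmp = a;
    a = b;
    b = tmp;
  }
  return createHash("sha256").update(`${a}|${b}|${relationType}`, "utf8").digest("hex");
}
```

With this normalization, an undirected pair yields the same bundle ID regardless of which endpoint was observed as the source, while a directed pair keeps two distinct IDs.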

**Incremental aggregation state [NP03]:**
```ts
export const EvidenceAggregationStateSchema = z.object({
  last_processed_append_seq: z.number().int().nonnegative().default(0),
  last_aggregated_at: z.string().datetime(),
  schema_version: z.literal(2),
});
```

Incremental evidence aggregation SHALL advance by `append_seq`, not by `mention_id`. `mention_id` is deterministic identity; `append_seq` is ingestion order. EC SHALL update `last_processed_append_seq` only after the aggregation transaction commits successfully.
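
The watermark rule can be sketched in memory as follows. This is an illustration under stated assumptions, not the EC implementation: `commitBatch` stands in for the aggregation transaction and is assumed to throw on failure, and `DerivedMentionRow`, `AggregationState`, and `aggregateIncrementally` are hypothetical names.

```ts
interface DerivedMentionRow {
  append_seq: number; // ingestion order
  mention_id: string; // deterministic identity, not used for ordering
}

interface AggregationState {
  last_processed_append_seq: number;
}

// Processes rows strictly after the watermark and returns the new state.
// If commitBatch throws, no new state is returned and the caller keeps the
// old watermark, so the same rows are retried on the next run.
function aggregateIncrementally(
  rows: DerivedMentionRow[],
  state: AggregationState,
  commitBatch: (batch: DerivedMentionRow[]) => void,
): AggregationState {
  const batch = rows
    .filter((r) => r.append_seq > state.last_processed_append_seq)
    .sort((a, b) => a.append_seq - b.append_seq);
  if (batch.length === 0) return state;
  commitBatch(batch);
  return { last_processed_append_seq: batch[batch.length - 1].append_seq };
}
```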

**Visibility aggregation rule:** `visibility_class` on a bundle SHALL be the most restrictive class among its supporting mention observations. `blocked` bundles MAY be stored for audit but MUST NOT be promoted into canonical edges or returned on disallowed surfaces.

##### Promotion policy

```ts
export const BackLinkPromotionPolicySchema = z.object({
  min_support_count_for_promotion: z.number().int().positive().default(2),
  min_distinct_observations_for_promotion: z.number().int().positive().default(2),
  min_salience_for_promotion: z.number().min(0).max(1).default(0.58),
  max_promoted_generic_relations_per_source: z.number().int().nonnegative().default(2),
  hard_block_relation_types: z.array(ExtendedRelationTypeSchema).default(["semantically_related"]),
  allow_single_observation_promotion: z.boolean().default(true),
  single_observation_requires_alias_exact: z.boolean().default(true),
  single_observation_min_graph_context: z.number().min(0).max(1).default(0.7),
  single_observation_blocked_relation_types: z.array(ExtendedRelationTypeSchema).default([
    "semantically_related",
  ]),
  schema_version: z.literal(1),
});
```

**Storage:** `ELNOR_MEMORY/config/backlink_promotion_policy.json`

#### 42A.2.4 Evidence scoring and promotion [NP04, NP23]

**Hub penalty computation [NP04]:**
```ts
async function computeBundleHubPenalty(
  targetNodeId: string,
  hubThreshold: number = 150,
): Promise<number> {
  const edgeCount = await db.get(`
    SELECT COUNT(*) as cnt FROM edges
    WHERE source_id = ? OR target_id = ?
  `, [targetNodeId, targetNodeId]);
  const degree = edgeCount?.cnt || 0;
  if (degree <= hubThreshold) return 0;
  return Math.min(0.8, (degree - hubThreshold) / (hubThreshold * 2));
}
```

**Salience computation [NP04]:**
```ts
function computeBundleSalienceScore(
  bundle: z.infer<typeof LinkEvidenceBundleSchema>,
): number {
  let salience = 0.5;
  if (bundle.distinct_surface_count >= 2) salience += 0.15;
  if (bundle.distinct_work_context_count >= 2) salience += 0.15;
  if (!["semantically_related"].includes(bundle.relation_type)) salience += 0.10;
  if (bundle.contradiction_count === 0) salience += 0.10;
  return Math.min(1.0, salience);
}
```

Both `hub_penalty` and `salience_score` SHALL be computed during evidence bundle aggregation (nightly) and stored on the bundle BEFORE `computeEvidenceScore` is called for promotion decisions.

**Evidence score formula:**
```ts
export function computeEvidenceScore(
  bundle: z.infer<typeof LinkEvidenceBundleSchema>,
): number {
  const support = Math.min(1, bundle.support_count * 0.18);
  const observationSpread = Math.min(1, bundle.distinct_observation_count * 0.14);
  const surfaceSpread = Math.min(1, bundle.distinct_surface_count * 0.10);
  const contradictionPenalty = Math.min(1, bundle.contradiction_count * 0.25);
  const hubPenalty = bundle.hub_penalty;

  return Math.max(0, Math.min(1,
    0.45 * support +
    0.25 * observationSpread +
    0.10 * surfaceSpread +
    0.20 * bundle.salience_score -
    0.25 * contradictionPenalty -
    0.15 * hubPenalty,
  ));
}
```
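
As a worked check of the arithmetic, the sketch below restates the salience and evidence score formulas over a plain-number stand-in and evaluates two sample bundles. `BundleCounts`, `salienceOf`, and `evidenceScoreOf` are illustrative names, and `is_generic_relation` is a hypothetical flag replacing the `relation_type` check. Salience is computed inline here, whereas the spec stores it on the bundle beforehand.

```ts
interface BundleCounts {
  support_count: number;
  distinct_observation_count: number;
  distinct_surface_count: number;
  distinct_work_context_count: number;
  contradiction_count: number;
  hub_penalty: number;
  is_generic_relation: boolean; // hypothetical stand-in for the relation_type test
}

function salienceOf(b: BundleCounts): number {
  let s = 0.5;
  if (b.distinct_surface_count >= 2) s += 0.15;
  if (b.distinct_work_context_count >= 2) s += 0.15;
  if (!b.is_generic_relation) s += 0.10;
  if (b.contradiction_count === 0) s += 0.10;
  return Math.min(1, s);
}

function evidenceScoreOf(b: BundleCounts): number {
  const support = Math.min(1, b.support_count * 0.18);
  const observationSpread = Math.min(1, b.distinct_observation_count * 0.14);
  const surfaceSpread = Math.min(1, b.distinct_surface_count * 0.10);
  const contradictionPenalty = Math.min(1, b.contradiction_count * 0.25);
  const raw =
    0.45 * support +
    0.25 * observationSpread +
    0.10 * surfaceSpread +
    0.20 * salienceOf(b) -
    0.25 * contradictionPenalty -
    0.15 * b.hub_penalty;
  return Math.max(0, Math.min(1, raw));
}

const threeMentions: BundleCounts = {
  support_count: 3, distinct_observation_count: 3, distinct_surface_count: 2,
  distinct_work_context_count: 1, contradiction_count: 0, hub_penalty: 0,
  is_generic_relation: false,
};
const fourMentions: BundleCounts = {
  ...threeMentions, support_count: 4, distinct_observation_count: 4,
};

// 0.45*0.54 + 0.25*0.42 + 0.10*0.20 + 0.20*0.85 = 0.538
console.log(evidenceScoreOf(threeMentions).toFixed(3));
// 0.45*0.72 + 0.25*0.56 + 0.10*0.20 + 0.20*0.85 = 0.654
console.log(evidenceScoreOf(fourMentions).toFixed(3));
```

Against the default `min_salience_for_promotion` of 0.58, the first sample bundle falls just short on the multi-observation path and the second clears it, which suggests the defaults require roughly four supporting mentions when there are no contradictions or hub penalty.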

**Helper functions [NP23]:**
```ts
async function getFirstSupportingMention(
  bundle: z.infer<typeof LinkEvidenceBundleSchema>,
): Promise<z.infer<typeof MentionObservationSchema> | null> {
  if (bundle.support_mention_ids.length === 0) return null;
  const firstId = bundle.support_mention_ids[0];
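  // db.get returns a Promise, so it must be awaited before the null-coalescing fallback applies.
  const row = await db.get(`SELECT * FROM mention_observations_derived WHERE mention_id = ? LIMIT 1`, [firstId]);
  return row ?? null;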
}

async function loadCandidateEvidenceBundles(): Promise<z.infer<typeof LinkEvidenceBundleSchema>[]> {
  return db.all(`SELECT * FROM evidence_bundles_derived WHERE promotion_state != 'suppressed'`);
}
```

**Promotion logic:**
```ts
export async function shouldPromoteEdge(
  bundle: z.infer<typeof LinkEvidenceBundleSchema>,
  policy: z.infer<typeof BackLinkPromotionPolicySchema>,
): Promise<boolean> {
  if (bundle.visibility_class === "blocked") return false;
  if (policy.hard_block_relation_types.includes(bundle.relation_type)) return false;
  if (bundle.user_verdict === "rejected") return false;
  if (bundle.user_verdict === "approved") return true;

  if (
    policy.allow_single_observation_promotion &&
    bundle.support_count === 1 &&
    bundle.distinct_observation_count === 1
  ) {
    // getFirstSupportingMention is async; await it before reading its fields.
    const firstMention = await getFirstSupportingMention(bundle);
    if (!firstMention) return false;
    // Consume the policy flag instead of hard-coding the alias_exact requirement.
    if (policy.single_observation_requires_alias_exact && firstMention.resolution_method !== "alias_exact") return false;
    if (firstMention.graph_context_score < policy.single_observation_min_graph_context) return false;
    if (policy.single_observation_blocked_relation_types.includes(bundle.relation_type)) return false;
    return true;
  }

  if (bundle.support_count < policy.min_support_count_for_promotion) return false;
  if (bundle.distinct_observation_count < policy.min_distinct_observations_for_promotion) return false;
  // Note: the evidence score (not the stored salience_score) is compared against min_salience_for_promotion.
  if (computeEvidenceScore(bundle) < policy.min_salience_for_promotion) return false;
  return true;
}
```

**Promotion with budget enforcement [NP04]:**
```ts
export async function materializePromotedEdges(
  policy: z.infer<typeof BackLinkPromotionPolicySchema>,
  budget: z.infer<typeof NodeLinkBudgetPolicySchema>,
): Promise<void> {
  const bundles = await loadCandidateEvidenceBundles();
  const promotedCountBySource = new Map<string, number>();
  const genericCountBySource = new Map<string, number>();

  for (const bundle of bundles) {
    if (!(await shouldPromoteEdge(bundle, policy))) continue;

    // Enforce per-source budget [NP04]
    const sourceCount = promotedCountBySource.get(bundle.source_node_id) || 0;
    if (sourceCount >= budget.max_promoted_edges_per_source_node) continue;

    // Enforce generic relation limit [NP04]
    if (bundle.relation_type === "semantically_related" || bundle.relation_type.startsWith("learned_")) {
      const genericCount = genericCountBySource.get(bundle.source_node_id) || 0;
      if (genericCount >= budget.max_generic_relations_per_source_node) continue;
      genericCountBySource.set(bundle.source_node_id, genericCount + 1);
    }

    const { source_id, target_id } = normalizeEdgePairOrdering(
      bundle.source_node_id, bundle.target_node_id, bundle.relation_type,
    );
    const score = computeEvidenceScore(bundle);
    const lifecycleState = bundle.user_verdict === "approved" ? "active" : "observed";

    await db.run(`
      INSERT INTO edges (source_id, target_id, relation_type, confidence, lifecycle_state, payload)
      VALUES (?, ?, ?, ?, ?, json(?))
      ON CONFLICT(source_id, target_id, relation_type)
      DO UPDATE SET confidence = excluded.confidence, lifecycle_state = excluded.lifecycle_state, payload = excluded.payload
    `, [
      source_id, target_id, bundle.relation_type, score, lifecycleState,
      JSON.stringify({
        learned_by: "backlink_pipeline_v2",
        evidence_bundle_id: bundle.evidence_bundle_id,
        support_count: bundle.support_count,
        contradiction_count: bundle.contradiction_count,
        last_supported_at: bundle.last_supported_at,
        visibility_class: bundle.visibility_class,
      }),
    ]);

    promotedCountBySource.set(bundle.source_node_id, sourceCount + 1);
  }
}
```

#### 42A.2.5 Pipeline configuration

```ts
export const BackLinkEnforcementConfigSchema = z.object({
  enabled: z.boolean().default(true),
  alias_matching_enabled: z.boolean().default(true),
  fts_matching_enabled: z.boolean().default(true),
  fts_min_score: z.number().min(0).max(1).default(0.4),
  embedding_matching_enabled: z.boolean().default(true),
  embedding_min_similarity: z.number().min(0).max(1).default(0.78),
  embedding_coreference_enabled: z.boolean().default(true),
  coreference_min_similarity: z.number().min(0).max(1).default(0.82),
  coreference_min_graph_context: z.number().min(0).max(1).default(0.4),
  llm_resolution_enabled: z.boolean().default(true),
  llm_resolution_mode: z.enum(["synchronous", "nightly", "both"]).default("nightly"),
  llm_max_unresolved_for_sync: z.number().int().positive().default(5),
  entity_creation_enabled: z.boolean().default(true),
  entity_creation_min_unresolved_observations: z.number().int().positive().default(2),
  cross_context_expansion_enabled: z.boolean().default(true),
  cross_context_max_candidates: z.number().int().positive().default(10),
  cross_context_allowed_node_kinds: z.array(z.string().max(40)).default([
    "domain_concept", "procedure", "standing_procedure", "work_product",
  ]),
  text_substrate: z.enum(["bounded_extraction_text", "raw_document_fallback"]).default("bounded_extraction_text"),
  regex_engine: z.enum(["native", "re2", "worker_isolated"]).default("worker_isolated"),
  regex_eval_timeout_ms: z.number().int().positive().default(50),
  max_input_text_chars: z.number().int().positive().default(10000),
  max_mentions_per_observation: z.number().int().positive().default(60),
  max_new_evidence_bundles_per_observation: z.number().int().positive().default(30),
  cooldown_days_per_node: z.number().int().positive().default(7),
  schema_version: z.literal(2),
});
```

**Storage:** `ELNOR_MEMORY/config/backlink_enforcement_config.json`

**Consume path for every config field:**
- `enabled` → checked at pipeline entry; if false, return empty.
- `alias_matching_enabled` → gates Pass A in resolution.
- `fts_matching_enabled` / `fts_min_score` → gates Pass B and filters results.
- `embedding_matching_enabled` / `embedding_min_similarity` → gates Tier 3 embedding pass.
- `embedding_coreference_enabled` / `coreference_min_similarity` / `coreference_min_graph_context` → gates Tier 3.5 coreference pass.
- `llm_resolution_enabled` / `llm_resolution_mode` / `llm_max_unresolved_for_sync` → gates Tier 4 dispatch and mode selection.
- `entity_creation_enabled` / `entity_creation_min_unresolved_observations` → gates Tier 5 candidate creation.
- `cross_context_expansion_enabled` / `cross_context_max_candidates` / `cross_context_allowed_node_kinds` → gates fallback cross-context candidate expansion.
- `text_substrate` → selects bounded extraction text by default.
- `regex_engine` / `regex_eval_timeout_ms` → choose the Stage 1 regex runtime and enforce timeout.
- `max_input_text_chars` → span extraction truncates input beyond this limit.
- `max_mentions_per_observation` → span extraction stops after this many spans.
- `max_new_evidence_bundles_per_observation` → pipeline caps new bundles per extraction.
- `cooldown_days_per_node` → pipeline skips source nodes processed within this window.

#### 42A.2.6 Stage 1 — Mention span extraction (deterministic, sub-50ms) [NP09, NP17]

**Substrate rule:** Stage 1 SHALL prefer the bounded extraction text produced by the intake/extraction pipeline, not arbitrary full raw documents. Raw-document fallback is permitted only when the bounded substrate is unavailable and the runtime config allows it.

**Runtime safety rule [NP17]:** For `worker_isolated`, regex execution SHALL occur in a dedicated worker thread/process with one-batch input only. If execution exceeds `regex_eval_timeout_ms`, the worker SHALL be terminated and the batch SHALL be marked degraded. For `native`, timeout-based enforcement is forbidden; `native` MAY be used only with bounded extraction text and only when no dynamic or user-learned regex is executed.

**Input cap:** Span extraction SHALL run on at most `max_input_text_chars` characters (default 10,000). If input exceeds the cap, process the first `max_input_text_chars` characters.

**Domain-configurable span patterns [NP09]:**

Core domain-agnostic patterns (always active):

| Span class | Pattern | Examples |
|---|---|---|
| `role_titled_person` | Title + name (Dr., Prof., Mr., Ms.) | "Dr. Chen", "Prof. Williams" |
| `organization_name` | Capitalized + LLC/Inc/Corp/etc. | "Morrison & Foerster LLP", "Mutable Instruments" |
| `specific_document_reference` | Named artifacts | "Exhibit A", "README.md", "Chapter 3" |
| `generic_document_reference` | Contextual doc refs | "the draft", "the document", "the report" |
| `definite_description` | Contextual entity refs | "the client", "the project", "the team" |
| `canonical_alias` | 2+ capitalized words (fallback) | "Sarah Chen", "Pacific Ventures" |
| `product_or_model_reference` | Product/model identifiers | "HP-200A", "Plaits", "Pro-Q 3" |

Domain-profile patterns (loaded from `ELNOR_MEMORY/config/domain_span_patterns/{domain_id}.json`):

```ts
export const DomainSpanPatternSetSchema = z.object({
  domain_id: z.string().max(80),
  formal_reference_patterns: z.array(z.object({
    regex: z.string().max(500),
    description: z.string().max(200),
  })).default([]),
  role_title_prefixes: z.array(z.string().max(40)).default([]),
  generic_document_terms: z.array(z.string().max(80)).default([]),
  definite_description_terms: z.array(z.string().max(80)).default([]),
  product_reference_patterns: z.array(z.object({
    regex: z.string().max(500),
    description: z.string().max(200),
  })).default([]),
  schema_version: z.literal(1),
});
```

Stage 1 loads all active domain pattern sets and applies them in order. Domain-agnostic patterns run regardless. Domain-specific patterns augment extraction.

**Nested span support:** When a wider span (e.g., "Mutable Instruments Plaits") contains a narrower span (e.g., "Mutable Instruments"), BOTH are emitted. The narrower span carries `is_nested: true` and `parent_mention_id` pointing to the wider span. Stage 2 resolves both independently.
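
A minimal sketch of nested-span annotation, assuming containment is decided purely by character offsets and that a narrower span points at its smallest strictly wider container (the spec says only "the wider span"). `SpanLike` and `annotateNestedSpans` are hypothetical names.

```ts
interface SpanLike {
  mention_id: string;
  text: string;
  start_offset: number;
  end_offset: number;
  is_nested?: boolean;
  parent_mention_id?: string;
}

const spanLength = (s: SpanLike): number => s.end_offset - s.start_offset;

// Emits every span; narrower spans contained in a wider one are flagged and
// linked to that wider span. No span is dropped.
function annotateNestedSpans(spans: SpanLike[]): SpanLike[] {
  return spans.map((child) => {
    let parent: SpanLike | undefined;
    for (const cand of spans) {
      if (cand === child) continue;
      const contains =
        cand.start_offset <= child.start_offset && cand.end_offset >= child.end_offset;
      if (!contains || spanLength(cand) <= spanLength(child)) continue;
      // Assumption: prefer the tightest enclosing span as the parent.
      if (!parent || spanLength(cand) < spanLength(parent)) parent = cand;
    }
    return parent
      ? { ...child, is_nested: true, parent_mention_id: parent.mention_id }
      : { ...child, is_nested: false };
  });
}
```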

#### 42A.2.6A Alias lexicon snapshot and batched alias scanning [NP08, NP26]

The backlink pipeline SHALL maintain a derived alias lexicon table built from `nodes` + `aliases`. This is a rebuildable local optimization for Stage 1/2 discovery, not a second canonical alias store.

**Derived SQLite table [NP26]:**
```sql
CREATE TABLE alias_lexicon_derived (
  node_id TEXT NOT NULL,
  canonical_name TEXT NOT NULL,
  normalized_alias TEXT NOT NULL,
  alias_type TEXT NOT NULL,
  node_kind TEXT NOT NULL,
  confidence REAL NOT NULL,
  visibility_class TEXT NOT NULL DEFAULT 'cloud_allowed',
  PRIMARY KEY (node_id, normalized_alias)
);
CREATE INDEX idx_alias_lexicon_lookup ON alias_lexicon_derived(normalized_alias);
```

```ts
export const AliasLexiconConfigSchema = z.object({
  enabled: z.boolean().default(true),
  min_alias_length: z.number().int().positive().default(2),
  max_alias_tokens: z.number().int().positive().default(6),
  prefer_exact_over_fuzzy: z.boolean().default(true),
  schema_version: z.literal(1),
});
```

**Visibility derivation [NP08]:**
```ts
async function deriveAliasVisibilityClass(nodeId: string): Promise<"local_only" | "cloud_allowed" | "cloud_warn" | "blocked"> {
  // Derive from DOC72 node scope if present, else from PropA classification state
  const nodeScope = await db.get(`
    SELECT json_extract(payload, '$.scope') AS scope FROM nodes WHERE id = ?
  `, [nodeId]);
  if (nodeScope?.scope === 'private') return "local_only";
  if (nodeScope?.scope === 'matter_scoped') return "cloud_warn";
  // If DOC72 nodes do not carry scope in payload, this function must consume
  // PropA's sensitivity classification state via cross-doc seam function.
  return "cloud_allowed";
}
```

**Alias matching with visibility enforcement and config consumption [NP08]:**
```ts
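async function findAliasMatchCandidates(
  mentionText: string,
  executionVisibility: "local_only" | "cloud_allowed" | "cloud_warn",
  config: z.infer<typeof AliasLexiconConfigSchema>,
): Promise<AliasMatchCandidate[]> {
  // Consume the config kill switch before touching the lexicon.
  if (!config.enabled) return [];
  const normalized = normalizeMentionForAliasMatch(mentionText);
  // db.all is asynchronous elsewhere in this section, so the rows must be awaited before filtering.
  const rows = await db.all(`
    SELECT * FROM alias_lexicon_derived WHERE normalized_alias = ?
  `, [normalized]);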
  return rows
    .filter((entry) => {
      if (entry.visibility_class === "blocked") return false;
      if (entry.visibility_class === "local_only" && executionVisibility !== "local_only") return false;
      return true;
    })
    .filter(e => e.normalized_alias.length >= config.min_alias_length)
    .filter(e => e.normalized_alias.split(/\s+/).length <= config.max_alias_tokens)
    .map((entry) => ({
      mention_text: mentionText,
      normalized_mention: normalized,
      matched_node_id: entry.node_id,
      matched_alias: entry.normalized_alias,
      alias_type: entry.alias_type,
      visibility_class: entry.visibility_class,
      raw_alias_score: entry.alias_type === "canonical_name" ? 0.99 : 0.96,
      schema_version: 1 as const,
    }))
    .sort((a, b) => b.raw_alias_score - a.raw_alias_score);
}
```

**Rebuild triggers [NP08]:** A lexicon rebuild SHALL be triggered when any of the following change: node activation state, canonical_name, any alias row for the node, visibility_class derivation input, or node_kind. Confidence-only changes SHALL NOT trigger a lexicon rebuild.

#### 42A.2.7 Resolution context

```ts
export const BackLinkResolutionContextSchema = z.object({
  observation_id: z.string().max(120),
  source_node_id: z.string().max(120),
  work_context_id: z.string().max(120).optional(),
  surface: z.string().max(80),
  participants: z.array(z.string().max(240)).default([]),
  visibility_class: z.enum(["local_only", "cloud_allowed", "cloud_warn", "blocked"]).default("cloud_allowed"),
  novelty_gate_summary: z.string().max(2000).optional(),
  schema_version: z.literal(1),
});
```

`novelty_gate_summary` is carried so it can be injected into the Tier 4 LLM prompt for global context awareness.

#### 42A.2.8 Stage 2 — Graph-aware resolution (SQL, sub-200ms per span)

**Helper: Alias normalization (verified against DOC72 `normalized_alias` column):**

```ts
function normalizeForLookup(input: string): string {
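  // Trim last: replacing leading/trailing separators with spaces would otherwise leave stray whitespace.
  return input.toLowerCase().replace(/[._\-]+/g, " ").replace(/\s+/g, " ").trim();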
}
```

**Pass A — Exact alias (against `aliases.normalized_alias`):**
```sql
SELECT n.id, n.canonical_name
FROM aliases a JOIN nodes n ON a.node_id = n.id
WHERE n.is_active = 1 AND a.normalized_alias = ?
LIMIT 10
```

**Pass B — FTS5 BM25 (against `fts_nodes`):**
```sql
SELECT n.id, n.canonical_name, bm25(fts_nodes) AS rank
FROM fts_nodes JOIN nodes n ON fts_nodes.rowid = n.rowid
WHERE fts_nodes MATCH ? AND n.is_active = 1
ORDER BY rank LIMIT 10
```
Query construction: terms joined with OR, each quoted: `'"henderson" OR "pacific"'`.
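
A small sketch of that query construction, assuming terms are split on whitespace and lowercased as in the example. Embedded double quotes are doubled, which is how FTS5 escapes quotes inside a quoted string. `buildFtsOrQuery` is a hypothetical name.

```ts
// Builds an FTS5 MATCH expression of the form "a" OR "b" from a mention.
function buildFtsOrQuery(mentionText: string): string {
  return mentionText
    .toLowerCase()
    .split(/\s+/)
    .filter((term) => term.length > 0)
    .map((term) => `"${term.replace(/"/g, '""')}"`)
    .join(" OR ");
}

console.log(buildFtsOrQuery("Henderson Pacific")); // "henderson" OR "pacific"
```

The function returns an empty string for blank input; a caller would likely want to skip Pass B in that case rather than send an empty MATCH expression.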

**Pass C — Vector semantic (against `vec_nodes`):**
```sql
SELECT n.id, n.canonical_name, vec_distance_cosine(v.embedding, ?) AS distance
FROM vec_nodes v JOIN nodes n ON v.node_id = n.id
WHERE n.is_active = 1
ORDER BY distance ASC LIMIT 10
```

**Combined score weights:**

| Tier | Raw weight | Graph context weight |
|---|---|---|
| Alias exact (1) | 0.55 | 0.45 |
| FTS5 BM25 (2) | 0.45 | 0.55 |
| Embedding semantic (3) | 0.35 | 0.65 |
| Embedding coreference (3.5) | 0.35 | 0.65 |
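
The table can be read as a per-tier linear blend, which matches the Tier 3.5 formula given in §42A.2.10 (`similarity * 0.35 + graphContextScore * 0.65`). The sketch below assumes the same linear form applies to every tier and keys the weights by the `resolution_method` values from the mention schema. `TIER_WEIGHTS` and `combinedResolutionScore` are hypothetical names.

```ts
type ResolutionTier = "alias_exact" | "fts5_bm25" | "vec_semantic" | "vec_coreference";

// Weights copied from the table above; each row sums to 1.
const TIER_WEIGHTS: Record<ResolutionTier, { raw: number; graph: number }> = {
  alias_exact: { raw: 0.55, graph: 0.45 },
  fts5_bm25: { raw: 0.45, graph: 0.55 },
  vec_semantic: { raw: 0.35, graph: 0.65 },
  vec_coreference: { raw: 0.35, graph: 0.65 },
};

// Assumption: rawScore and graphContextScore are both already in [0, 1].
function combinedResolutionScore(
  tier: ResolutionTier,
  rawScore: number,
  graphContextScore: number,
): number {
  const w = TIER_WEIGHTS[tier];
  return w.raw * rawScore + w.graph * graphContextScore;
}
```

For example, an exact alias hit with raw score 0.99 and graph context 0.4 combines to 0.7245, while a coreference candidate at similarity 0.82 with the same graph context combines to 0.547, so graph context matters more as the textual match gets weaker.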

#### 42A.2.9 Graph context disambiguation [NP19, NP25]

```ts
async function computeGraphContextScore(
  candidateNodeId: string,
  ctx: z.infer<typeof BackLinkResolutionContextSchema>,
): Promise<number> {
  let score = 0.0;

  // 1. Same work-context link
  if (ctx.work_context_id) {
    const shared = await db.get(`
      SELECT 1 FROM edges
      WHERE ((source_id = ? AND target_id = ?) OR (source_id = ? AND target_id = ?))
      LIMIT 1
    `, [candidateNodeId, ctx.work_context_id, ctx.work_context_id, candidateNodeId]);
    if (shared) score += 0.40;
  }

  // 2. Prior evidence support (confidence-weighted + diversity bonus) [NP25]
  const priorEvidence = await db.get(`
    SELECT SUM(confidence) as weighted_score,
           COUNT(DISTINCT json_extract(payload, '$.learned_by')) as diversity
    FROM edges
    WHERE ((source_id = ? AND target_id = ?) OR (source_id = ? AND target_id = ?))
      AND confidence > 0.3
  `, [candidateNodeId, ctx.source_node_id, ctx.source_node_id, candidateNodeId]);
  if (priorEvidence?.weighted_score) {
    score += Math.min(0.20, priorEvidence.weighted_score * 0.05);
    if (priorEvidence.diversity >= 2) score += 0.05;
  }

  // 3. Participant match [NP19]
  if (ctx.participants.length > 0) {
    const normalizedParticipants = ctx.participants.map(p =>
      p.includes('@') ? p.split('@')[0].replace(/[._-]/g, ' ').toLowerCase() : p.toLowerCase()
    );
    const aliases = await db.all(
      `SELECT normalized_alias FROM aliases WHERE node_id = ?`, [candidateNodeId],
    );
    for (const p of normalizedParticipants) {
      for (const a of aliases) {
        if (tokenOverlapScore(p, a.normalized_alias) >= 0.7) {
          score += 0.20;
          break;
        }
      }
      if (score >= 0.60) break;
    }
  }

  // 4. Recency bonus
  const recency = await db.get(`SELECT updated_at FROM nodes WHERE id = ?`, [candidateNodeId]);
  if (recency?.updated_at) {
    const daysAgo = (Date.now() - new Date(recency.updated_at).getTime()) / 86400000;
    score += Math.max(0, 0.1 * (1 - daysAgo / 365));
  }

  // 5. Negative signal: prior rejection penalty
  const rejected = await db.get(`
    SELECT 1 FROM backlink_rejections_derived WHERE source_node_id = ? AND rejected_target_node_id = ? LIMIT 1
  `, [ctx.source_node_id, candidateNodeId]);
  if (rejected) score -= 0.8;

  return Math.max(0, Math.min(1.0, score));
}

function tokenOverlapScore(a: string, b: string): number {
  const tokensA = new Set(a.toLowerCase().split(/\s+/).filter(t => t.length >= 3));
  const tokensB = new Set(b.toLowerCase().split(/\s+/).filter(t => t.length >= 3));
  const intersection = new Set([...tokensA].filter(t => tokensB.has(t)));
  const union = new Set([...tokensA, ...tokensB]);
  return union.size > 0 ? intersection.size / union.size : 0;
}
```

#### 42A.2.10 Tier 3.5 — Embedding coreference resolution

Tier 3.5 applies to `definite_description` spans and to spans that failed Tiers 1-2. It uses Qwen3-Embedding-0.6B embeddings queried through sqlite-vec.

**Context window:** Embed the span WITH surrounding context. Use sentence-boundary detection when available; fall back to ±120 characters (expanded from V1's ±80 to reduce cross-paragraph contamination). Sentence-boundary detection: split on `.!?` followed by whitespace + uppercase letter, take the sentence containing the span plus the preceding sentence.
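
A hedged sketch of that context-window rule. It treats sentence-boundary detection as "unavailable" when the text contains no boundary, which is one possible reading, and it recognizes only ASCII uppercase letters after the terminator. `contextWindowFor` is a hypothetical name.

```ts
// Returns the sentence containing the span plus the preceding sentence, or a
// fixed-radius character window when no sentence boundary is found.
function contextWindowFor(
  text: string,
  spanStart: number,
  spanEnd: number,
  fallbackRadius: number = 120,
): string {
  // Sentence starts: offset 0, plus each offset that follows [.!?] and
  // whitespace and is itself an uppercase letter.
  const starts: number[] = [0];
  const boundary = /[.!?]\s+(?=[A-Z])/g;
  let m: RegExpExecArray | null;
  while ((m = boundary.exec(text)) !== null) {
    starts.push(m.index + m[0].length);
  }

  if (starts.length === 1) {
    const from = Math.max(0, spanStart - fallbackRadius);
    const to = Math.min(text.length, spanEnd + fallbackRadius);
    return text.slice(from, to);
  }

  // Index of the sentence that contains the start of the span.
  let current = 0;
  for (let k = 0; k < starts.length; k++) {
    if (starts[k] <= spanStart) current = k;
  }
  const from = starts[Math.max(0, current - 1)];
  // The window ends where the next sentence after the span begins.
  const nextStart = starts.find((s) => s >= spanEnd);
  const to = nextStart ?? text.length;
  return text.slice(from, to).trim();
}
```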

**Resolution:** Query `vec_nodes`, filter by `coreference_min_similarity` and `coreference_min_graph_context`. Combined score: `similarity * 0.35 + graphContextScore * 0.65`.

**Escalation:** If the best Tier 3.5 candidate has `similarity ≥ coreference_min_similarity` (default 0.82) AND `graph_context_score ≥ coreference_min_graph_context` (default 0.4), accept it. Otherwise queue the span for Tier 4.

#### 42A.2.11 Tier 4 — LLM-grounded resolution [NP05]

**Execution policy:**
- Default: queued for nightly batch.
- Sync opt-in: only when configured AND unresolved count ≤ `llm_max_unresolved_for_sync` AND subject to PropA execution trust.
- The LLM selects from a CLOSED candidate set and MUST NOT return an entity outside that set.

**Prompt contract:**

```text
You are resolving an entity mention in a document.

MENTION: "{span_text}"
CONTEXT: "{context_window}"
SOURCE: {surface_type}, context: {work_context_name}
KNOWN ENTITIES IN THIS CONTEXT: "{novelty_gate_summary}"

CANDIDATES:
{candidates_formatted}

Which candidate does this mention refer to?
- If one matches clearly, return its candidate_id, the relationship type, and your confidence.
- If none match, return matched_candidate_id = "none".
- If ambiguous, return matched_candidate_id = "ambiguous".

Relationship types: {canonical_relation_type_enum_values}

Return strict JSON only.
```

The novelty gate summary is injected as `KNOWN ENTITIES IN THIS CONTEXT`, giving the LLM global awareness of the work context without expanding the context window for each span.

**Mode semantics [NP05]:**
- `"synchronous"`: sync only, no nightly queue.
- `"nightly"`: queue only, no sync.
- `"both"`: sync for top-N unresolved by priority (N = `llm_max_unresolved_for_sync`), then queue remainder excluding already-resolved mentions. Idempotency key: `(mention_id, tier4_generation_id)`.
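
The `"both"` mode can be sketched as two selection steps. This follows one reading of the rule, in which every mention still unresolved after the synchronous pass is queued, including sync attempts that did not resolve. The `priority` field and both function names are hypothetical.

```ts
interface UnresolvedMention {
  mention_id: string;
  priority: number; // hypothetical ranking signal; higher runs first
}

// Step 1: pick the top-N unresolved mentions for the synchronous pass.
function selectSyncBatch(
  unresolved: UnresolvedMention[],
  maxSync: number,
): UnresolvedMention[] {
  return [...unresolved].sort((a, b) => b.priority - a.priority).slice(0, maxSync);
}

// Step 2: queue everything the synchronous pass did not resolve.
function selectNightlyRemainder(
  unresolved: UnresolvedMention[],
  resolvedMentionIds: Set<string>,
): UnresolvedMention[] {
  return unresolved.filter((m) => !resolvedMentionIds.has(m.mention_id));
}
```

Deduplication on replays would still rely on the `(mention_id, tier4_generation_id)` idempotency key, which this sketch does not model.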

**Unresolved queue schema [NP05]:**
```ts
export const Tier4UnresolvedQueueItemSchema = z.object({
  queue_item_id: z.string().max(120),
  mention_id: z.string().max(120),
  observation_id: z.string().max(120),
  source_node_id: z.string().max(120),
  span_text: z.string().max(500),
  context_window: z.string().max(1200),
  candidate_set: z.array(z.object({
    candidate_id: z.string().max(120),
    node_id: z.string().max(120),
    score: z.number().min(0).max(1),
  })).max(20),
  execution_requirement: z.enum(["same_machine_local_only", "cloud_or_local"]),
  status: z.enum(["queued", "running", "succeeded", "failed", "dead_letter"]).default("queued"),
  retry_count: z.number().int().nonnegative().default(0),
  next_retry_at: z.string().datetime().optional(),
  last_error_code: z.string().max(80).optional(),
  created_at: z.string().datetime(),
  updated_at: z.string().datetime(),
  schema_version: z.literal(1),
});
```

**Storage:** `ELNOR_MEMORY/system/backlink/tier4_queue.jsonl`

Dead-letter threshold: `retry_count >= 5`. Dead-letter items are retained for audit but not reprocessed unless manually retried.
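
A minimal sketch of the failure transition, assuming `retry_count` is incremented once per failed attempt and that a failed item returns to `queued` until it crosses the threshold. The spec does not state how `next_retry_at` is scheduled, so that field is left untouched. `QueueItemLike` and `recordTier4Failure` are hypothetical names.

```ts
type Tier4Status = "queued" | "running" | "succeeded" | "failed" | "dead_letter";

// Hypothetical subset of a Tier 4 queue item.
interface QueueItemLike {
  status: Tier4Status;
  retry_count: number;
  last_error_code?: string;
}

const DEAD_LETTER_THRESHOLD = 5;

// Returns an updated copy of the item after a failed attempt.
function recordTier4Failure(item: QueueItemLike, errorCode: string): QueueItemLike {
  const retry_count = item.retry_count + 1;
  const status: Tier4Status =
    retry_count >= DEAD_LETTER_THRESHOLD ? "dead_letter" : "queued";
  return { ...item, retry_count, status, last_error_code: errorCode };
}
```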

**Resolution result schema [NP05]:**
```ts
export const Tier4ResolutionResultSchema = z.object({
  mention_id: z.string().max(120),
  matched_candidate_id: z.union([z.string().max(120), z.literal("none"), z.literal("ambiguous")]),
  relation_type: ExtendedRelationTypeSchema.optional(),
  confidence: z.number().min(0).max(1),
  rationale_short: z.string().max(280).optional(),
  schema_version: z.literal(1),
});
```

Only outputs that validate against `Tier4ResolutionResultSchema` are accepted. Non-conforming outputs are treated as `failed_parse` and queued for retry.

**Routes [NP05]:**
- `POST /api/system/backlink/tier4/resolve-batch` — triggers batch resolution
- `GET /api/system/backlink/tier4/queue-status` — returns queue depth, dead-letter count, last-run timestamp

**Cross-context expansion (FALLBACK ONLY):**
1. Resolve the span through Tiers 1-3.5 within the local work context.
2. If NO candidates are found, THEN drop the `work_context_id` filter and search globally.
3. Global search is restricted to `cross_context_allowed_node_kinds` and capped at `cross_context_max_candidates`.

#### 42A.2.12 Tier 5 — Unresolved span → entity creation candidates [NP07]

```ts
export const EntityCreationCandidateSchema = z.object({
  candidate_id: z.string().max(120),
  suggested_canonical_name: z.string().max(200),
  suggested_node_kind: z.enum(["world_entity", "domain_concept", "work_product", "obligation"]),
  source_observation_ids: z.array(z.string().max(120)).default([]),
  source_surface: z.string().max(80),
  linked_work_context_id: z.string().max(120).optional(),
  detection_method: z.literal("backlink_unresolved_span"),
  span_class: MentionSpanClassSchema,
  visibility_class: z.enum(["local_only", "cloud_allowed", "cloud_warn"]),
  confidence: z.number().min(0).max(1).default(0.3),
  status: z.enum(["pending_review", "approved", "rejected", "linked_existing"]).default("pending_review"),
  created_at: z.string().datetime(),
  schema_version: z.literal(1),
});
```

**Storage:** `ELNOR_MEMORY/system/backlink/entity_creation_candidates.jsonl`

**Gate policy [NP07]:**
```ts
export const EntityCreationCandidatePolicySchema = z.object({
  allowed_span_classes: z.array(MentionSpanClassSchema).default([
    "canonical_alias", "role_titled_person", "organization_name",
    "formal_reference", "product_or_model_reference",
  ]),
  min_distinct_observations: z.number().int().positive().default(2),
  min_distinct_surfaces: z.number().int().positive().default(2),
  min_normalized_span_length: z.number().int().positive().default(4),
  block_all_caps_short_aliases_under: z.number().int().positive().default(5),
  daily_candidate_queue_cap: z.number().int().positive().default(100),
  historical_sweep_candidate_queue_cap: z.number().int().positive().default(500),
  stop_phrases: z.array(z.string().max(200)).default([]),
  schema_version: z.literal(1),
});
```

**Storage:** `ELNOR_MEMORY/config/entity_creation_candidate_policy.json`

**Gates (all must pass):**
1. `span_class` is in `allowed_span_classes` (default: `{canonical_alias, role_titled_person, organization_name, formal_reference, product_or_model_reference}`).
2. The same normalized span has been seen unresolved in ≥ `min_distinct_observations` (default 2) distinct observations AND ≥ `min_distinct_surfaces` (default 2) distinct surfaces (unless `formal_reference`).
3. `visibility_class` is not `"blocked"`.
4. `local_only` visibility carries through to the created entity.
5. The span is not an all-uppercase alias shorter than `block_all_caps_short_aliases_under` (default 5) characters (unless it matches a known formal-reference pattern).
6. The span is not in the configurable `stop_phrases` suppression list.
7. Normalized span length ≥ `min_normalized_span_length` (default 4) characters.
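
The gates above can be sketched as a single predicate. This is an illustration under stated assumptions: thresholds are read from the policy fields with matching names, gate 7 is checked against the normalized span, and gate 4 is treated as a propagation rule rather than a pass/fail check. `GatePolicy`, `UnresolvedSpanStats`, and `passesEntityCreationGates` are hypothetical names.

```ts
interface GatePolicy {
  allowed_span_classes: string[];
  min_distinct_observations: number;
  min_distinct_surfaces: number;
  min_normalized_span_length: number;
  block_all_caps_short_aliases_under: number;
  stop_phrases: string[];
}

// Hypothetical pre-aggregated view of one unresolved normalized span.
interface UnresolvedSpanStats {
  span_text: string;
  normalized_span: string;
  span_class: string;
  visibility_class: "local_only" | "cloud_allowed" | "cloud_warn" | "blocked";
  distinct_observation_count: number;
  distinct_surface_count: number;
  matches_formal_reference_pattern: boolean;
}

function passesEntityCreationGates(s: UnresolvedSpanStats, p: GatePolicy): boolean {
  // Gate 1: span class allow-list.
  if (!p.allowed_span_classes.includes(s.span_class)) return false;
  // Gate 2: repetition across observations and surfaces, waived for formal references.
  if (
    s.span_class !== "formal_reference" &&
    (s.distinct_observation_count < p.min_distinct_observations ||
      s.distinct_surface_count < p.min_distinct_surfaces)
  ) return false;
  // Gate 3: blocked content never yields a candidate.
  if (s.visibility_class === "blocked") return false;
  // Gate 5: short all-uppercase aliases, unless a formal-reference pattern matches.
  const isAllCaps = /[A-Z]/.test(s.span_text) && s.span_text === s.span_text.toUpperCase();
  if (
    isAllCaps &&
    s.span_text.length < p.block_all_caps_short_aliases_under &&
    !s.matches_formal_reference_pattern
  ) return false;
  // Gate 6: stop-phrase suppression (assumed to hold normalized phrases).
  if (p.stop_phrases.includes(s.normalized_span)) return false;
  // Gate 7: minimum length (assumed to apply to the normalized span).
  if (s.normalized_span.length < p.min_normalized_span_length) return false;
  return true;
}
```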

**Queue caps [NP07]:** Daily candidate queue cap: 100. Historical sweep candidate queue cap: 500. Excess candidates are deferred to the next day's processing window, prioritized by salience.

**Resolution command [NP07]:**
```ts
export const ResolveEntityCreationCandidateCommandSchema = z.discriminatedUnion("action", [
  z.object({
    action: z.literal("create_new"),
    command_id: z.string().max(120),
    candidate_id: z.string().max(120),
    canonical_name: z.string().max(200),
    approved_node_kind: CanonicalNodeKindSchema,
    issued_at: z.string().datetime(),
    schema_version: z.literal(1),
  }),
  z.object({
    action: z.literal("link_existing"),
    command_id: z.string().max(120),
    candidate_id: z.string().max(120),
    existing_node_id: z.string().max(120),
    issued_at: z.string().datetime(),
    schema_version: z.literal(1),
  }),
  z.object({
    action: z.literal("dismiss"),
    command_id: z.string().max(120),
    candidate_id: z.string().max(120),
    reason: z.enum(["noise", "duplicate_of_existing", "generic_reference", "not_durable"]),
    issued_at: z.string().datetime(),
    schema_version: z.literal(1),
  }),
]);
```

**Route:** `POST /api/knowledge/review/entity-creation-candidates/resolve`

**Approval-time re-resolve [NP07]:** Before an approved Tier 5 candidate may create a canonical node, EC SHALL rerun current-graph exact alias, FTS, and canonical-name collision checks. If a likely existing node is found above threshold, approval SHALL convert into a `link_existing` result. Q SHALL surface `link_existing` as an explicit option.

#### 42A.2.12A Visibility propagation closure [NP16]

`visibility_class` SHALL be carried across mention observations, evidence bundles, promoted edge payloads, entity creation candidates, review queue items, co-occurrence patterns, procedure motifs, compiled-truth inputs/read-models, and work-context constellations.

**Promoted edge visibility enforcement [NP16]:**

```ts
export function mostRestrictiveVisibility(
  values: Array<"cloud_allowed" | "cloud_warn" | "local_only" | "blocked">
): "cloud_allowed" | "cloud_warn" | "local_only" | "blocked" {
  if (values.includes("blocked")) return "blocked";
  if (values.includes("local_only")) return "local_only";
  if (values.includes("cloud_warn")) return "cloud_warn";
  return "cloud_allowed";
}

export function canReadPromotedEdge(input: {
  execution_visibility: "cloud_allowed" | "cloud_warn" | "local_only";
  edge_visibility: "cloud_allowed" | "cloud_warn" | "local_only" | "blocked";
  source_node_visibility: "cloud_allowed" | "cloud_warn" | "local_only" | "blocked";
  target_node_visibility: "cloud_allowed" | "cloud_warn" | "local_only" | "blocked";
  source_classification_state: "classified" | "provisional_source_only" | "deferred_unavailable" | "quarantined_review" | "unclassified";
  target_classification_state: "classified" | "provisional_source_only" | "deferred_unavailable" | "quarantined_review" | "unclassified";
}): boolean {
  if (["deferred_unavailable", "quarantined_review", "unclassified"].includes(input.source_classification_state)) return false;
  if (["deferred_unavailable", "quarantined_review", "unclassified"].includes(input.target_classification_state)) return false;
  const effective = mostRestrictiveVisibility([
    input.edge_visibility, input.source_node_visibility, input.target_node_visibility,
  ]);
  if (effective === "blocked") return false;
  if (effective === "local_only" && input.execution_visibility !== "local_only") return false;
  return true;
}
```

All consumers reading backlink-promoted edges SHALL use `canReadPromotedEdge(...)` or a derived `visible_edges_current` view. This enforces effective visibility across edge + both endpoint nodes + PropA classification states.

#### 42A.2.13 Relation type inference — Cue DSL [NP09, NP20]

```ts
export const RelationCueSchema = z.object({
  cue_id: z.string().max(120),
  span_class: MentionSpanClassSchema,
  lexical_cues: z.array(z.string().max(80)).max(12).default([]),
  negative_cues: z.array(z.string().max(80)).max(12).default([]),
  required_candidate_node_kinds: z.array(z.string().max(40)).default([]),
  forbidden_candidate_node_kinds: z.array(z.string().max(40)).default([]),
  inferred_relation_type: ExtendedRelationTypeSchema,
  base_modifier: z.number().min(-0.5).max(0.5),
  source: z.enum(["seed_rule", "user_correction", "self_review"]).default("seed_rule"),
  support_count: z.number().int().nonnegative().default(1),
  schema_version: z.literal(1),
});
```

**Domain-agnostic seed rules (always active) [NP09]:**

| Span class | Lexical cues | Relation type | Modifier |
|---|---|---|---|
| `organization_name` | (any) | `affiliated_with` | 0.0 |
| `specific_document_reference` | created, wrote, built, made, produced, filed, submitted | `produced_by` | +0.05 |
| `specific_document_reference` | cited, referenced, discussed, mentioned, used, based on | `referenced_in` | 0.0 |
| `canonical_alias` | (any) | `semantically_related` | -0.05 |

**Domain-profile seed rules (loaded from profile):**

Legal domain profile contributes:

| Span class | Lexical cues | Relation type | Modifier |
|---|---|---|---|
| `role_titled_person` | judge, hon, justice | `adjudicated_by` | +0.10 |
| `role_titled_person` | expert, witness | `expert_in` | +0.05 |
| `role_titled_person` | counsel, attorney | `represented_by` | +0.05 |
| `formal_reference` | (any) | `linked_to_context` | +0.15 |

**Fallback noise suppression [NP20]:** Mention observations with `relation_hypothesis = "semantically_related"` SHALL NOT create new evidence bundles unless a second corroborating mention from a different observation exists with the same source-target pair. This prevents single-mention `semantically_related` bundles from accumulating.
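
A minimal sketch of the corroboration gate described above. The `MentionLike` shape and its field names other than `relation_hypothesis` are assumptions made for illustration. The sketch also assumes the corroborating mention may carry any relation hypothesis, which the text does not specify.

```ts
// Hypothetical minimal mention shape for illustration only.
interface MentionLike {
  observation_id: string;
  source_node_id: string;
  target_node_id: string;
  relation_hypothesis: string;
}

// True when a new evidence bundle may be created for `mention`, given mentions seen so far.
export function mayCreateBundle(mention: MentionLike, seen: MentionLike[]): boolean {
  // Only the fallback relation type is subject to the corroboration requirement.
  if (mention.relation_hypothesis !== "semantically_related") return true;
  // Require another mention of the same source-target pair from a different observation.
  return seen.some(
    (m) =>
      m.observation_id !== mention.observation_id &&
      m.source_node_id === mention.source_node_id &&
      m.target_node_id === mention.target_node_id,
  );
}
```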

#### 42A.2.14 Negative signal tracking

```ts
export const BackLinkRejectionSchema = z.object({
  rejection_id: z.string().max(120),
  source_node_id: z.string().max(120),
  rejected_target_node_id: z.string().max(120),
  matched_text_span: z.string().max(500),
  rejection_reason: z.enum([
    "wrong_entity", "no_relationship", "wrong_relation_type", "spurious_mention",
  ]),
  correct_target_node_id: z.string().max(120).optional(),
  correct_relation_type: ExtendedRelationTypeSchema.optional(),
  created_at: z.string().datetime(),
  schema_version: z.literal(1),
});
```

**Storage:** `ELNOR_MEMORY/system/backlink/rejections.jsonl` (append-only JSONL)
**Queryable view:** `backlink_rejections_derived` SQLite table, rebuilt from JSONL at startup, appended on each new rejection.

#### 42A.2.15 Per-node cooldown [NP28]

Before running the back-link pipeline on a source node, check whether it was processed within `cooldown_days_per_node` (default 7). If yes, skip. Track via `last_backlinked_at` in a lightweight lookup table:

```ts
export const BackLinkCooldownSchema = z.object({
  node_id: z.string().max(120),
  last_backlinked_at: z.string().datetime(),
  schema_version: z.literal(1),
});
```

**Storage:** `ELNOR_MEMORY/system/backlink/cooldown.json` (atomic JSON, keyed by node_id)

Updated after successful pipeline completion for a source node. The historical sweep ignores cooldowns (it's a one-time operation on existing nodes).

**Cooldown reset semantics [NP28]:** The cooldown for a source node SHALL be reset (allowing immediate re-linking) when any of the following occur:
1. A user correction (relation change, entity merge, rejection reversal) affects the source node or any of its direct edges.
2. A new alias is added to the source node.
3. A Tier 5 entity creation candidate is approved that would create a node related to this source.
4. The source node's `lifecycle_state` changes.

Confidence-only changes and nightly decay SHALL NOT reset cooldown.
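
The skip check and the reset triggers can be sketched together. This is an illustration under stated assumptions: the cooldown store is simplified to an in-memory `Map` from `node_id` to `last_backlinked_at`, the event labels are hypothetical names for the triggers listed above, and "within" is read as strictly less than the cooldown window.

```ts
const DAY_MS = 24 * 60 * 60 * 1000;

// Illustrative labels for the four reset triggers and the two explicit non-triggers.
type CooldownEvent =
  | "user_correction"
  | "alias_added"
  | "related_tier5_candidate_approved"
  | "lifecycle_state_changed"
  | "confidence_changed"
  | "nightly_decay";

const RESETTING_EVENTS = new Set<CooldownEvent>([
  "user_correction",
  "alias_added",
  "related_tier5_candidate_approved",
  "lifecycle_state_changed",
]);

// Returns true when the node was processed inside the cooldown window and should be skipped.
export function shouldSkipNode(
  cooldown: Map<string, string>, // node_id -> last_backlinked_at (ISO 8601)
  nodeId: string,
  now: Date,
  cooldownDays = 7,
  isHistoricalSweep = false,
): boolean {
  if (isHistoricalSweep) return false; // the historical sweep ignores cooldowns
  const last = cooldown.get(nodeId);
  if (last === undefined) return false;
  return now.getTime() - Date.parse(last) < cooldownDays * DAY_MS;
}

// Clears the cooldown entry only for events that the spec lists as resetting.
export function applyCooldownEvent(
  cooldown: Map<string, string>,
  nodeId: string,
  event: CooldownEvent,
): void {
  if (RESETTING_EVENTS.has(event)) cooldown.delete(nodeId);
}
```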

#### 42A.2.16 Review commands [NP10]

```ts
export const ReviewBackLinkCandidateCommandSchema = z.object({
  command_id: z.string().max(120),
  evidence_bundle_id: z.string().max(120),
  action: z.enum(["approve", "reject"]),
  rejection_reason: z.enum([
    "wrong_entity", "no_relationship", "wrong_relation_type", "spurious_mention",
  ]).optional(),
  correct_target_node_id: z.string().max(120).optional(),
  correct_relation_type: ExtendedRelationTypeSchema.optional(),
  issued_at: z.string().datetime(),
  schema_version: z.literal(1),
});

export const UpdateBackLinkRelationCommandSchema = z.object({
  command_id: z.string().max(120),
  source_node_id: z.string().max(120),
  target_node_id: z.string().max(120),
  from_relation_type: ExtendedRelationTypeSchema,
  to_relation_type: ExtendedRelationTypeSchema,
  supporting_mention_id: z.string().max(120).optional(),
  issued_at: z.string().datetime(),
  schema_version: z.literal(1),
});

export const RetargetBackLinkCandidateCommandSchema = z.object({
  command_id: z.string().max(120),
  mention_id: z.string().max(120),
  source_node_id: z.string().max(120),
  old_target_node_id: z.string().max(120).nullable(),
  new_target_node_id: z.string().max(120),
  relation_type: z.string().max(80),
  reason: z.enum(["wrong_entity", "merge_to_existing", "reviewer_correction"]),
  issued_at: z.string().datetime(),
  schema_version: z.literal(1),
});

export const SuppressMentionPatternCommandSchema = z.object({
  command_id: z.string().max(120),
  normalized_mention: z.string().max(500),
  span_class: MentionSpanClassSchema,
  scope: z.enum(["global", "work_context", "surface_specific"]),
  work_context_id: z.string().max(120).optional(),
  surface: z.string().max(80).optional(),
  reason: z.string().max(240),
  issued_at: z.string().datetime(),
  schema_version: z.literal(1),
});

export const ApproveProcedureMotifCommandSchema = z.object({
  command_id: z.string().max(120),
  motif_id: z.string().max(120),
  action: z.enum(["confirm_as_procedure", "confirm_as_standing_procedure", "dismiss"]),
  issued_at: z.string().datetime(),
  schema_version: z.literal(1),
});
```

**Retarget/rekey function [NP10]:**

```ts
export async function rekeyEvidenceBundleAfterRetarget(input: {
  mention_id: string;
  source_node_id: string;
  old_target_node_id: string | null;
  new_target_node_id: string;
  old_relation_type: string;
  new_relation_type: string;
}): Promise<void> {
  const oldBundleId = stableSha256(
    `${input.source_node_id}|${input.old_target_node_id ?? "null"}|${input.old_relation_type}`
  );
  const newBundleId = stableSha256(
    `${input.source_node_id}|${input.new_target_node_id}|${input.new_relation_type}`
  );
  await db.transaction(async () => {
    await updateMentionResolution(input.mention_id, input.new_target_node_id, input.new_relation_type, "user_corrected");
    if (input.old_target_node_id) {
      await removeMentionFromBundle(oldBundleId, input.mention_id);
      await deleteBundleIfEmpty(oldBundleId);
    }
    await addMentionToBundle(newBundleId, input.mention_id, {
      source_node_id: input.source_node_id,
      target_node_id: input.new_target_node_id,
      relation_type: input.new_relation_type,
    });
    await rebuildPromotedEdgeForAffectedBundles([oldBundleId, newBundleId]);
  });
}
```
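
`stableSha256` is called above but not defined in this section. A minimal sketch, assuming it is a hex-encoded SHA-256 over the UTF-8 bytes of the key string; the `bundleIdFor` helper is hypothetical and only restates the key composition used in `rekeyEvidenceBundleAfterRetarget`.

```ts
import { createHash } from "node:crypto";

// Assumed behaviour: hex-encoded SHA-256 of the UTF-8 input.
export function stableSha256(input: string): string {
  return createHash("sha256").update(input, "utf8").digest("hex");
}

// Hypothetical helper: builds the bundle key as source|target|relation_type.
// An unresolved (null) target is keyed with the literal string "null", as in the function above.
export function bundleIdFor(
  sourceNodeId: string,
  targetNodeId: string | null,
  relationType: string,
): string {
  return stableSha256(`${sourceNodeId}|${targetNodeId ?? "null"}|${relationType}`);
}
```

Because the ID is a pure function of the triple, retargeting or changing the relation type always yields a different bundle ID, which is why the mention has to be moved between bundles rather than edited in place.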

#### 42A.2.17 Review queue and routes [NP10]

```ts
export const BackLinkReviewQueueItemSchema = z.object({
  queue_item_id: z.string().max(120),
  item_kind: z.enum([
    "candidate_relation",
    "entity_creation_candidate",
    "suppression_candidate",
    "procedure_motif",
  ]),
  primary_ref: z.string().max(120),
  title: z.string().max(240),
  description: z.string().max(1000),
  evidence_refs: z.array(z.string().max(120)).default([]),
  available_actions: z.array(z.string().max(80)).default([]),
  visibility_class: z.enum(["local_only", "cloud_allowed", "cloud_warn", "blocked"]).default("cloud_allowed"),
  status: z.enum(["pending", "resolved", "dismissed"]).default("pending"),
  created_at: z.string().datetime(),
  schema_version: z.literal(1),
});
```

**Storage:** `ELNOR_MEMORY/system/backlink/review_queue.jsonl`

```ts
export const GetBackLinkReviewQueueQuerySchema = z.object({
  item_kind: z.enum(["candidate_relation", "entity_creation_candidate", "suppression_candidate", "procedure_motif"]).optional(),
  status: z.enum(["pending", "resolved", "dismissed"]).optional(),
  page: z.number().int().positive().default(1),
  page_size: z.number().int().positive().max(100).default(25),
  schema_version: z.literal(1),
});

export const BackLinkReviewQueuePageSchema = z.object({
  items: z.array(BackLinkReviewQueueItemSchema),
  page: z.number().int().positive(),
  page_size: z.number().int().positive(),
  total_items: z.number().int().nonnegative(),
  schema_version: z.literal(1),
});
```

Pagination queries use the `review_queue_derived` SQLite table from §42A.2.3A.

**Routes:**
- `GET /api/knowledge/review/backlink-queue` → `BackLinkReviewQueuePageSchema`
- `POST /api/knowledge/review/backlink-candidates/:evidenceBundleId/action` → consumes `ReviewBackLinkCandidateCommandSchema`
- `POST /api/knowledge/review/backlink-relations/update` → consumes `UpdateBackLinkRelationCommandSchema`
- `POST /api/knowledge/review/backlink-candidates/retarget` → consumes `RetargetBackLinkCandidateCommandSchema`
- `POST /api/knowledge/review/entity-creation-candidates/resolve` → consumes `ResolveEntityCreationCandidateCommandSchema`
- `POST /api/knowledge/review/procedure-motifs/:motifId/action` → consumes `ApproveProcedureMotifCommandSchema`
- `POST /api/knowledge/review/suppressions` → consumes `SuppressMentionPatternCommandSchema`

All GET routes SHALL return `ready|empty|blocked|degraded|stale` envelopes. All POST routes SHALL return success, blocked, or degraded receipts and SHALL report validation errors. Review-queue items SHALL be sorted by `impact × uncertainty × recency`, not FIFO.
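
The queue ordering can be sketched as below. `BackLinkReviewQueueItemSchema` does not define `impact`, `uncertainty`, or `recency` fields, so this sketch assumes they are computed elsewhere and supplied as non-negative numbers; the `ScoredQueueItem` shape and function names are hypothetical.

```ts
// Hypothetical scored view of a queue item; the three factors are assumed inputs.
interface ScoredQueueItem {
  queue_item_id: string;
  impact: number;
  uncertainty: number;
  recency: number;
}

// Priority is the plain product of the three factors.
export function reviewPriority(item: ScoredQueueItem): number {
  return item.impact * item.uncertainty * item.recency;
}

// Highest priority first; the input array is left unmodified.
export function sortReviewQueue(items: ScoredQueueItem[]): ScoredQueueItem[] {
  return [...items].sort((a, b) => reviewPriority(b) - reviewPriority(a));
}
```

A multiplicative score means an item with any factor at zero drops to the bottom of the queue, regardless of the other two factors.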

#### 42A.2.18 Integration points

1. **After every deep extraction (§20A):** Run Tiers 1-3.5 synchronously. Queue unresolved mentions for nightly Tier 4 resolution. Check Tier 5 gates. Respect `cooldown_days_per_node`.
2. **After every sonar sweep Pass 2 (§21):** Same pipeline.
3. **Nightly:** Tier 4 LLM resolution, evidence bundle aggregation, promotion evaluation, co-occurrence detection.
4. **Historical sweep (one-time):** Full graph re-link. Ignores cooldowns. Subject to review queue caps (max 500 candidates). Registered as `backlink_historical_sweep`.

#### 42A.2.19 Orchestrator task registration and nightly DAG [NP29]

| task_type_id | owner_doc | priority_class | requires_llm | agent_tier |
|---|---|---|---|---|
| `backlink_rebuild_derived_tables` | DOC72 | low | false | none |
| `backlink_alias_lexicon_rebuild` | DOC72 | low | false | none |
| `backlink_nightly_resolution` | DOC72 | low | true | tier1_analyst |
| `backlink_evidence_aggregation` | DOC72 | low | false | none |
| `backlink_promotion` | DOC72 | low | false | none |
| `backlink_enrichment_sweep` | DOC72 | low | false | none |
| `backlink_pattern_detection` | DOC72 | low | false | none |
| `backlink_ordered_motif_detection` | DOC72 | low | false | none |
| `backlink_graph_hygiene` | DOC72 | low | false | none |
| `backlink_constellation_refresh` | DOC72 | low | false | none |
| `backlink_compiled_truth_refresh` | DOC72 | low | false | none |
| `backlink_compile_learning_bundle` | DOC72 | low | false | none |
| `backlink_weekly_digest` | DOC72 | low | false | none |
| `backlink_compaction_gc` | DOC72 | low | false | none |
| `backlink_historical_sweep` | DOC72 | low | true | tier1_analyst |

All background work SHALL register with the unified `BackgroundJobOrchestrator` from EC Core Addendum A. No local scheduler, embedded cron, or doc-private worker loop is allowed.

**Nightly DAG [NP29]:**

```ts
export const BackLinkNightlyDagSchema = z.object({
  dependencies: z.array(z.object({
    task_id: z.string().max(80),
    depends_on: z.array(z.string().max(80)).default([]),
    schema_version: z.literal(1),
  })).default([
    { task_id: "rebuild_derived_tables", depends_on: [], schema_version: 1 },
    { task_id: "aggregate_evidence_bundles", depends_on: ["rebuild_derived_tables"], schema_version: 1 },
    { task_id: "resolve_tier4_queue", depends_on: ["aggregate_evidence_bundles"], schema_version: 1 },
    { task_id: "compile_backlink_learning_bundle", depends_on: ["resolve_tier4_queue"], schema_version: 1 },
    { task_id: "detect_cooccurrence_cliques", depends_on: ["aggregate_evidence_bundles"], schema_version: 1 },
    { task_id: "detect_ordered_motifs", depends_on: ["aggregate_evidence_bundles"], schema_version: 1 },
    { task_id: "refresh_compiled_truth", depends_on: ["aggregate_evidence_bundles"], schema_version: 1 },
    { task_id: "refresh_constellations", depends_on: ["detect_cooccurrence_cliques", "detect_ordered_motifs", "refresh_compiled_truth"], schema_version: 1 },
    { task_id: "run_graph_hygiene", depends_on: ["refresh_constellations"], schema_version: 1 },
    { task_id: "assemble_weekly_digest", depends_on: ["refresh_constellations", "run_graph_hygiene"], schema_version: 1 },
    { task_id: "backlink_compaction_gc", depends_on: ["assemble_weekly_digest"], schema_version: 1 },
  ]),
  schema_version: z.literal(1),
});
```

The backlink nightly path SHALL honor `BackLinkNightlyDagSchema`. No subsystem-local schedulers for mirror rebuilds, clique detection, motif mining, learning compilation, or hygiene are permitted.
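
One way to turn the `dependencies` list into an execution order is a layered topological sort (Kahn's algorithm). The sketch below is illustrative only: the `DagTask` type and `topologicalOrder` name are hypothetical, and the actual scheduling is owned by the `BackgroundJobOrchestrator`.

```ts
// Plain-TypeScript mirror of one entry in BackLinkNightlyDagSchema.dependencies.
interface DagTask {
  task_id: string;
  depends_on: string[];
}

// Returns task IDs in an order where every task follows all of its dependencies.
export function topologicalOrder(tasks: DagTask[]): string[] {
  const pending = new Map<string, Set<string>>();
  tasks.forEach((t) => pending.set(t.task_id, new Set(t.depends_on)));

  // Reject references to tasks that are not part of the DAG.
  pending.forEach((deps) =>
    deps.forEach((d) => {
      if (!pending.has(d)) throw new Error(`unknown dependency: ${d}`);
    }),
  );

  const order: string[] = [];
  while (pending.size > 0) {
    // Tasks whose dependencies have all completed are ready in this round.
    const ready = Array.from(pending.keys()).filter((id) => pending.get(id)!.size === 0);
    if (ready.length === 0) throw new Error("dependency cycle");
    ready.forEach((id) => {
      order.push(id);
      pending.delete(id);
    });
    pending.forEach((deps) => ready.forEach((id) => deps.delete(id)));
  }
  return order;
}
```

Applied to the default DAG above, `rebuild_derived_tables` comes first and `backlink_compaction_gc` last; tasks that become ready in the same round (for example the clique, motif, and compiled-truth tasks) have no ordering constraint between them.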

#### 42A.2.20 Self-learning substrate (day one, naturally empty) [NP11]

The backlink learning lane SHALL exist from day one. Empty ledgers and empty compiled bundles are natural zero states.

```ts
export const BackLinkFeedbackEventSchema = z.discriminatedUnion("event_kind", [
  z.object({
    event_kind: z.literal("edge_user_confirmed"),
    evidence_bundle_id: z.string().max(120),
    source_node_id: z.string().max(120),
    target_node_id: z.string().max(120),
    relation_type: ExtendedRelationTypeSchema,
    observed_at: z.string().datetime(),
    schema_version: z.literal(1),
  }),
  z.object({
    event_kind: z.literal("edge_user_rejected"),
    mention_id: z.string().max(120),
    source_node_id: z.string().max(120),
    target_node_id: z.string().max(120).optional(),
    rejection_reason: z.string().max(80),
    observed_at: z.string().datetime(),
    schema_version: z.literal(1),
  }),
  z.object({
    event_kind: z.literal("relation_corrected"),
    source_node_id: z.string().max(120),
    target_node_id: z.string().max(120),
    from_relation_type: ExtendedRelationTypeSchema,
    to_relation_type: ExtendedRelationTypeSchema,
    observed_at: z.string().datetime(),
    schema_version: z.literal(1),
  }),
  z.object({
    event_kind: z.literal("span_suppressed"),
    normalized_span: z.string().max(500),
    span_class: MentionSpanClassSchema,
    scope: z.enum(["global", "work_context", "surface_specific"]),
    observed_at: z.string().datetime(),
    schema_version: z.literal(1),
  }),
]);

export const BackLinkLearningCountersSchema = z.object({
  total_mentions_observed: z.number().int().nonnegative().default(0),
  total_bundles_created: z.number().int().nonnegative().default(0),
  total_edges_promoted: z.number().int().nonnegative().default(0),
  total_edges_rejected: z.number().int().nonnegative().default(0),
  total_relations_corrected: z.number().int().nonnegative().default(0),
  total_spans_suppressed: z.number().int().nonnegative().default(0),
  total_entity_candidates_created: z.number().int().nonnegative().default(0),
  total_entity_candidates_approved: z.number().int().nonnegative().default(0),
  last_updated_at: z.string().datetime(),
  schema_version: z.literal(1),
});

export const EntityResolutionUtilityRecordSchema = z.object({
  normalized_span: z.string().max(500),
  candidate_node_id: z.string().max(120),
  context_class_key: z.string().max(240),
  utility_alpha: z.number().nonnegative(),
  utility_beta: z.number().nonnegative(),
  support_count: z.number().int().nonnegative(),
  last_updated_at: z.string().datetime(),
  schema_version: z.literal(1),
});

export const RelationInferenceUtilityRecordSchema = z.object({
  span_class: MentionSpanClassSchema,
  candidate_node_kind: z.string().max(40),
  relation_type: ExtendedRelationTypeSchema,
  context_class_key: z.string().max(240),
  utility_alpha: z.number().nonnegative(),
  utility_beta: z.number().nonnegative(),
  support_count: z.number().int().nonnegative(),
  last_updated_at: z.string().datetime(),
  schema_version: z.literal(1),
});

export const SpanSuppressionUtilityRecordSchema = z.object({
  normalized_span: z.string().max(500),
  span_class: MentionSpanClassSchema,
  context_class_key: z.string().max(240),
  suppress_alpha: z.number().nonnegative(),
  suppress_beta: z.number().nonnegative(),
  support_count: z.number().int().nonnegative(),
  last_updated_at: z.string().datetime(),
  schema_version: z.literal(1),
});

export const CrossContextExpansionUtilityRecordSchema = z.object({
  trigger_family: z.string().max(120),
  target_node_kind: z.string().max(40),
  context_class_key: z.string().max(240),
  utility_alpha: z.number().nonnegative(),
  utility_beta: z.number().nonnegative(),
  support_count: z.number().int().nonnegative(),
  last_updated_at: z.string().datetime(),
  schema_version: z.literal(1),
});
```

**Context class key definition [NP11]:**
```ts
export const ContextClassKeySchema = z.object({
  domain: z.string().max(40),
  task_type: z.string().max(60),
  artifact_class: z.string().max(60),
}).transform(v => `${v.domain}:${v.task_type}:${v.artifact_class}`.toLowerCase());

export const BackLinkContextClassGuardrailSchema = z.object({
  min_support_for_artifact_specific_key: z.number().int().positive().default(10),
  fallback_artifact_class: z.literal("unknown").default("unknown"),
  max_distinct_artifact_classes_per_task: z.number().int().positive().default(20),
  schema_version: z.literal(1),
});
```

Artifact-specific context keys SHALL NOT be emitted into compiled policy bundles until support meets `min_support_for_artifact_specific_key`. Low-support artifact classes collapse to `"unknown"`.
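
A minimal sketch of the collapse rule, assuming the caller already knows the support count for the artifact class. The `contextClassKey` function and its `artifactSupport` parameter are hypothetical; `max_distinct_artifact_classes_per_task` is not modelled here.

```ts
// Builds the "domain:task_type:artifact_class" key, collapsing low-support artifact classes.
export function contextClassKey(
  parts: { domain: string; task_type: string; artifact_class: string },
  artifactSupport: number,
  minSupport = 10, // min_support_for_artifact_specific_key default
): string {
  const artifact = artifactSupport >= minSupport ? parts.artifact_class : "unknown";
  return `${parts.domain}:${parts.task_type}:${artifact}`.toLowerCase();
}
```

For example, a `Brief` artifact class with 3 supporting observations yields `legal:drafting:unknown`, and the artifact-specific key `legal:drafting:brief` only appears once support reaches 10.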

**Update math [NP11]:**
```ts
// Nightly update for each utility ledger record
alpha_new = alpha + positive_weight;
beta_new = beta + negative_weight;
posterior_mean = alpha_new / (alpha_new + beta_new);

// Context smoothing (when enable_context_class_smoothing = true)
smoothed = (k * prior_mean + n * observed_mean) / (k + n);  // k = 20
```
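
The update above can be written as two small functions with a worked example. This is a sketch under assumptions: `n` is read as the number of observations behind `observed_mean`, and the function names are hypothetical.

```ts
// Nightly Beta-style update; parameter names mirror the pseudocode above.
export function updateUtility(
  alpha: number,
  beta: number,
  positiveWeight: number,
  negativeWeight: number,
): { alpha: number; beta: number; posteriorMean: number } {
  const alphaNew = alpha + positiveWeight;
  const betaNew = beta + negativeWeight;
  // Precondition (assumed): alphaNew + betaNew > 0, otherwise the mean is undefined.
  return { alpha: alphaNew, beta: betaNew, posteriorMean: alphaNew / (alphaNew + betaNew) };
}

// Shrinks an observed mean toward a prior mean; larger k means stronger pull to the prior.
export function smoothTowardPrior(
  priorMean: number,
  observedMean: number,
  n: number,
  k = 20,
): number {
  return (k * priorMean + n * observedMean) / (k + n);
}

// Worked example: alpha = 2, beta = 2, then 3 positive and 1 negative unit of weight.
const updatedUtility = updateUtility(2, 2, 3, 1); // alpha 5, beta 3, mean 5/8 = 0.625
// With prior mean 0.5 and n = 8: (20 * 0.5 + 8 * 0.625) / 28 = 15/28, about 0.536.
const smoothedUtility = smoothTowardPrior(0.5, updatedUtility.posteriorMean, 8);
```

With only 8 observations against `k = 20`, the smoothed value stays much closer to the prior (0.5) than to the observed mean (0.625).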

**Skew protection [NP11]:**
```ts
export const BackLinkLearningSkewGuardSchema = z.object({
  max_single_work_context_support_share: z.number().min(0).max(1).default(0.6),
  require_cross_context_support_for_global_promotion: z.boolean().default(true),
  schema_version: z.literal(1),
});
```

Any compiled policy rule intended for global application SHALL either satisfy the cross-context support requirement, or be scoped to a `context_class_key` and never promoted to a global rule.
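
A minimal sketch of the skew check, assuming the cross-context support requirement means that no single work context may contribute more than `max_single_work_context_support_share` of a rule's total support. That reading, and the `passesSkewGuard` name, are assumptions.

```ts
// supportByWorkContext: work_context_id -> support count for one candidate rule.
export function passesSkewGuard(
  supportByWorkContext: Map<string, number>,
  maxShare = 0.6, // max_single_work_context_support_share default
): boolean {
  let total = 0;
  let largest = 0;
  supportByWorkContext.forEach((count) => {
    total += count;
    if (count > largest) largest = count;
  });
  if (total === 0) return false; // no support at all: never eligible for global promotion
  return largest / total <= maxShare;
}
```

A rule that fails this check would stay scoped to its `context_class_key` instead of being promoted globally.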

**Staged generation and rollback [NP11]:**
```ts
export const BackLinkCompiledPolicyGenerationStatusSchema = z.object({
  generation_id: z.string().max(120),
  status: z.enum(["staged", "active", "rolled_back"]),
  staged_at: z.string().datetime(),
  activated_at: z.string().datetime().optional(),
  schema_version: z.literal(1),
});
```

Runtime consumes only the `active` generation pointer. Rollback reverts to prior active generation. If a learned cue family experiences three consecutive negative utility windows or falls below the support floor for two nightly cycles, DOC8 SHALL quarantine it from compiled policy output.
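
The quarantine rule can be sketched as a check over a cue family's recent history. Assumptions: one history entry per nightly cycle, a "negative utility window" is an entry with a negative utility delta, and the two below-floor cycles must be consecutive. The `CueWindow` shape and the `supportFloor` parameter are hypothetical.

```ts
// One entry per nightly cycle, oldest first (assumed representation).
interface CueWindow {
  utility_delta: number;
  support_count: number;
}

export function shouldQuarantine(history: CueWindow[], supportFloor: number): boolean {
  const lastThree = history.slice(-3);
  const threeNegative =
    lastThree.length === 3 && lastThree.every((w) => w.utility_delta < 0);

  const lastTwo = history.slice(-2);
  const twoBelowFloor =
    lastTwo.length === 2 && lastTwo.every((w) => w.support_count < supportFloor);

  return threeNegative || twoBelowFloor;
}
```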

**Consume path mapping [NP11]:**
- `gather_backlink_feedback` gates appends to `feedback_events.jsonl`.
- `apply_backlink_learning` gates reading and use of `compiled_policy_bundle.json`.
- `enable_context_class_smoothing` gates the smoothing stage only.
- `enable_cross_context_learning` gates cross-context allowances only.

```ts
export const BackLinkCompiledPolicyBundleSchema = z.object({
  generation_id: z.string().max(120),
  active: z.boolean().default(true),
  relation_promotions: z.array(z.object({
    span_class: MentionSpanClassSchema,
    candidate_node_kind: z.string().max(40),
    favored_relation_type: ExtendedRelationTypeSchema,
    min_support_count: z.number().int().positive(),
  })).default([]),
  relation_demotions: z.array(z.object({
    relation_type: ExtendedRelationTypeSchema,
    context_class_key: z.string().max(240),
    max_promotion_score: z.number().min(0).max(1),
  })).default([]),
  span_suppressions: z.array(z.object({
    normalized_span: z.string().max(500),
    span_class: MentionSpanClassSchema,
    context_class_key: z.string().max(240),
  })).default([]),
  cross_context_allowances: z.array(z.object({
    trigger_family: z.string().max(120),
    target_node_kind: z.string().max(40),
    context_class_key: z.string().max(240),
  })).default([]),
  generated_at: z.string().datetime(),
  schema_version: z.literal(1),
});

export const BackLinkLearningRuntimeSettingsSchema = z.object({
  gather_backlink_feedback: z.boolean().default(true),
  apply_backlink_learning: z.boolean().default(true),
  enable_context_class_smoothing: z.boolean().default(true),
  enable_cross_context_learning: z.boolean().default(true),
  schema_version: z.literal(1),
});
```

**Owner split:** DOC72 owns schemas. DOC8 computes nightly. EC writes artifacts. The artifacts are derived, rebuildable, kill-switchable, and MUST NOT become a second canonical confidence system.

#### 42A.2.21 Retention and compaction [NP12]

```ts
export const BackLinkRetentionPolicySchema = z.object({
  mentions_ttl_days: z.number().int().positive().default(180),
  review_queue_ttl_days: z.number().int().positive().default(365),
  feedback_events_ttl_days: z.number().int().positive().default(365),
  keep_user_verified_forever: z.boolean().default(true),
  compaction_interval_days: z.number().int().positive().default(7),
  schema_version: z.literal(1),
});
```

**Storage:** `ELNOR_MEMORY/config/backlink_retention_policy.json`

Resolved mention observations older than `mentions_ttl_days` are compacted into per-pair summary records. Raw records older than the TTL are archived to `mention_observations_archive.jsonl.gz` and removed from the active JSONL. The derived table only loads non-archived records. User-verified mentions are retained forever if `keep_user_verified_forever = true`.

Registered as orchestrator task `backlink_compaction_gc`.

---

### 42A.3 RRF Hybrid Retrieval

#### 42A.3.1 Placement
DOC72 §12 (new §12A). Consumed by DOC24 packet assembly.

#### 42A.3.2 Configuration

```ts
export const HybridRetrievalConfigSchema = z.object({
  fusion_method: z.literal("rrf"),
  rrf_k: z.number().int().positive().default(60),
  max_candidates_per_lane: z.number().int().positive().default(50),
  normalize_output: z.boolean().default(true),
  schema_version: z.literal(1),
});
```

**Storage:** `ELNOR_MEMORY/config/hybrid_retrieval_config.json`

#### 42A.3.3 Implementation

```ts
function calculateRRF(
  keywordResults: { id: string }[],
  vectorResults: { id: string }[],
  k: number,
  normalize: boolean,
): { id: string; rrfScore: number }[] {
  const scoreMap = new Map<string, number>();

  for (let i = 0; i < keywordResults.length; i++) {
    const prev = scoreMap.get(keywordResults[i].id) || 0;
    scoreMap.set(keywordResults[i].id, prev + 1 / (k + i + 1));
  }
  for (let i = 0; i < vectorResults.length; i++) {
    const prev = scoreMap.get(vectorResults[i].id) || 0;
    scoreMap.set(vectorResults[i].id, prev + 1 / (k + i + 1));
  }

  const sorted = Array.from(scoreMap.entries())
    .map(([id, rrfScore]) => ({ id, rrfScore }))
    .sort((a, b) => b.rrfScore - a.rrfScore);

  // Normalize to 0-1 range for DOC24 consumption
  if (normalize && sorted.length > 0) {
    const maxScore = sorted[0].rrfScore;
    if (maxScore > 0) {
      return sorted.map(r => ({ id: r.id, rrfScore: r.rrfScore / maxScore }));
    }
  }

  return sorted;
}
```

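**Score normalization:** Raw RRF scores are small. With the default `rrf_k = 60`, two lanes, and `max_candidates_per_lane = 50`, they range from about 0.009 (rank 50 in one lane) to about 0.033 (rank 1 in both lanes). DOC24's injection thresholds expect scores normalized to 0-1. When `normalize_output = true` (default), the output is scaled so that the top result is 1.0 and the others scale proportionally, which keeps DOC24's threshold logic working as intended.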
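
A short worked example of the arithmetic, using the default `rrf_k = 60`. The `rrfTerm` helper is hypothetical and only restates the per-lane term `1 / (k + i + 1)` from `calculateRRF` with a 1-based rank.

```ts
// One lane's contribution for a 1-based rank; equals 1 / (k + i + 1) with i = rank - 1.
export function rrfTerm(k: number, rank: number): number {
  return 1 / (k + rank);
}

const rrfK = 60; // default rrf_k
const topInBothLanes = rrfTerm(rrfK, 1) + rrfTerm(rrfK, 1); // 2/61, about 0.0328
const topInOneLane = rrfTerm(rrfK, 1); // 1/61, about 0.0164
const lastInOneLane = rrfTerm(rrfK, 50); // 1/110, about 0.0091

// After normalization the best result is 1.0 and the others scale proportionally.
const normalizedOneLane = topInOneLane / topInBothLanes; // 0.5
const normalizedLast = lastInOneLane / topInBothLanes; // 61/220, about 0.277
```

Note that normalization is relative to the best result in the current query, so a normalized score of 1.0 indicates the top-ranked candidate, not an absolute level of relevance.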

---

### 42A.4 Compiled Truth Mirror and Conversational Inspectability

#### 42A.4.1 Owner split (CONVERGED — do not relitigate)
- **DOC72 owns:** `CompiledTruthMirrorSchemaV2`, `KnowledgeSummaryRenderInputSchema`, `buildKnowledgeSummaryRenderInput()`.
- **DOC24/KDA owns:** `renderKnowledgeSummaryMarkdown()`, `KnowledgeSummaryRenderContractSchema`.
- **EC:** Calls DOC72 input builder → passes to DOC24 renderer → writes mirror file.

#### 42A.4.2 Mirror schema with multi-hash invalidation [NP21]

```ts
export const CompiledTruthMirrorSchemaV2 = z.object({
  node_id: z.string().max(120),
  canonical_name: z.string().max(200),
  node_kind: z.string().max(40),
  visibility_class: z.enum(["local_only", "cloud_allowed", "cloud_warn", "blocked"]).default("cloud_allowed"),
  markdown: z.string(),
  render_contract_id: z.string().max(80),
  node_hash: z.string().length(64),
  relationship_hash: z.string().length(64),
  provenance_hash: z.string().length(64),
  render_input_hash: z.string().length(64),
  rendered_at: z.string().datetime(),
  stale: z.boolean().default(false),
  stale_reason: z.enum([
    "node_changed", "relationships_changed", "provenance_changed", "renderer_changed",
  ]).optional(),
  schema_version: z.literal(2),
});

export const CompiledTruthReadModelSchema = CompiledTruthMirrorSchemaV2.extend({
  status: z.enum(["ready", "stale", "building"]).default("ready"),
  schema_version: z.literal(3),
});
```

**Storage:** `ELNOR_MEMORY/mirror/{node_id}.json`

**Deterministic hashing:** Do NOT rely on raw object coercion or unsorted `JSON.stringify`. Hash from a stable, recursively key-sorted representation:

```ts
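import { createHash } from 'node:crypto';

function stableJson(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(stableJson).join(',')}]`;
  }
  if (value && typeof value === 'object') {
    const entries = Object.entries(value as Record<string, unknown>)
      // Mirror JSON.stringify: keys whose value is undefined are omitted.
      .filter(([, v]) => v !== undefined)
      // Code-unit comparison keeps key order independent of the runtime locale.
      .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
      .map(([k, v]) => `${JSON.stringify(k)}:${stableJson(v)}`);
    return `{${entries.join(',')}}`;
  }
  // JSON.stringify(undefined) yields undefined at runtime; emit null instead.
  return JSON.stringify(value) ?? 'null';
}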

function computeNodeHash(node: NodeRecord): string {
  const canonical = stableJson({
    id: node.id,
    canonical_name: node.canonical_name,
    node_kind: node.node_kind,
    confidence: node.confidence,
    lifecycle_state: node.lifecycle_state,
    updated_at: node.updated_at,
    payload: node.payload ?? null,
  });
  return createHash('sha256').update(canonical).digest('hex');
}

function computeRelationshipHash(nodeId: string, edges: EdgeRecord[]): string {
  const canonical = stableJson(
    edges
      .map((e) => ({
        source_id: e.source_id,
        target_id: e.target_id,
        relation_type: e.relation_type,
        confidence: e.confidence,
        lifecycle_state: e.lifecycle_state,
        payload: e.payload ?? null,
      }))
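      .sort((a, b) => {
        // Code-unit comparison keeps the order independent of the runtime locale.
        const ka = `${a.source_id}|${a.target_id}|${a.relation_type}`;
        const kb = `${b.source_id}|${b.target_id}|${b.relation_type}`;
        return ka < kb ? -1 : ka > kb ? 1 : 0;
      }),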
  );
  return createHash('sha256').update(`${nodeId}|${canonical}`).digest('hex');
}

function computeProvenanceHash(entries: ProvenanceEntry[]): string {
  const canonical = stableJson(
    entries
      .map((e) => ({
        id: e.id,
        entry_type: e.entry_type,
        source_ref: e.source_ref,
        authority_type: e.authority_type,
        still_current: e.still_current,
        created_at: e.created_at,
      }))
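      .sort((a, b) => {
        // Code-unit comparison keeps the order independent of the runtime locale.
        const ka = `${a.id}|${a.created_at}`;
        const kb = `${b.id}|${b.created_at}`;
        return ka < kb ? -1 : ka > kb ? 1 : 0;
      }),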
  );
  return createHash('sha256').update(canonical).digest('hex');
}
```
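
The stored hashes make it possible to decide why a mirror is stale. The sketch below is one possible mapping onto `stale_reason`, not a confirmed rule: it assumes a changed `render_contract_id` signals `renderer_changed`, and that reasons are checked in the order node, relationships, provenance, renderer. The `MirrorFingerprint` type and `detectStaleReason` name are hypothetical.

```ts
type StaleReason =
  | "node_changed"
  | "relationships_changed"
  | "provenance_changed"
  | "renderer_changed";

// Subset of CompiledTruthMirrorSchemaV2 fields used for invalidation (illustrative).
interface MirrorFingerprint {
  node_hash: string;
  relationship_hash: string;
  provenance_hash: string;
  render_contract_id: string;
}

// Returns the first reason the stored mirror no longer matches current state, if any.
export function detectStaleReason(
  stored: MirrorFingerprint,
  current: MirrorFingerprint,
): StaleReason | undefined {
  if (stored.node_hash !== current.node_hash) return "node_changed";
  if (stored.relationship_hash !== current.relationship_hash) return "relationships_changed";
  if (stored.provenance_hash !== current.provenance_hash) return "provenance_changed";
  if (stored.render_contract_id !== current.render_contract_id) return "renderer_changed";
  return undefined; // mirror is fresh
}
```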

**Mirror rebuild dedup [NP21]:**
```ts
export const MirrorBuildLockSchema = z.object({
  node_id: z.string().max(120),
  status: z.enum(["building", "idle"]),
  build_started_at: z.string().datetime(),
  owner_job_id: z.string().max(120),
  schema_version: z.literal(1),
});
```

Acquire the lock before scheduling a rebuild. If a lock already exists and is fresh (less than 5 minutes old), return the stale mirror with `status="building"` instead of scheduling another rebuild. This prevents stampede rebuilds on popular nodes.

**Async rendering:** If a mirror is stale when requested, return the stale markdown immediately with `stale: true` and a visual indicator in Q (`[Updating...]`). Fire the re-render asynchronously. Stream the updated markdown to Q via state update when complete. Never block the read path.
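
The lock check can be sketched as follows. An in-memory `Map` stands in for whatever store holds `MirrorBuildLockSchema` records, and the `tryAcquireBuildLock` name is hypothetical; a real implementation would need an atomic compare-and-set against that store.

```ts
const LOCK_FRESH_MS = 5 * 60 * 1000; // locks younger than 5 minutes are treated as live

// Plain-TypeScript mirror of MirrorBuildLockSchema (schema_version omitted for brevity).
interface MirrorBuildLock {
  node_id: string;
  status: "building" | "idle";
  build_started_at: string;
  owner_job_id: string;
}

// Returns true if the caller now owns the build; false if a fresh build is already running.
export function tryAcquireBuildLock(
  locks: Map<string, MirrorBuildLock>,
  nodeId: string,
  jobId: string,
  now: Date,
): boolean {
  const existing = locks.get(nodeId);
  const isFresh =
    existing !== undefined &&
    existing.status === "building" &&
    now.getTime() - Date.parse(existing.build_started_at) < LOCK_FRESH_MS;
  if (isFresh) return false; // caller should serve the stale mirror with status "building"

  locks.set(nodeId, {
    node_id: nodeId,
    status: "building",
    build_started_at: now.toISOString(),
    owner_job_id: jobId,
  });
  return true;
}
```

When the function returns `false`, the read path serves the stale markdown and waits for the state update; a lock older than 5 minutes is assumed abandoned and is taken over.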

#### 42A.4.3 Render-input contract (DOC72-owned) [NP27]

```ts
export const GetKnowledgeSummaryQuerySchema = z.object({
  node_id: z.string().max(120),
  include_relationships: z.boolean().default(true),
  include_timeline: z.boolean().default(false),
  include_low_confidence: z.boolean().default(true),
  max_related_entities: z.number().int().min(0).max(20).default(10),
  schema_version: z.literal(1),
});

export const KnowledgeSummaryRenderInputSchema = z.object({
  node_id: z.string().max(120),
  canonical_name: z.string().max(200),
  node_kind: z.string().max(40),
  visibility_class: z.enum(["local_only", "cloud_allowed", "cloud_warn", "blocked"]).default("cloud_allowed"),
  confidence: z.number().min(0).max(1),
  lifecycle_state: z.string().max(40),
  payload: z.record(z.string(), z.unknown()),
  related_entities: z.array(z.object({
    node_id: z.string().max(120),
    canonical_name: z.string().max(200),
    node_kind: z.string().max(40),
    relation_type: ExtendedRelationTypeSchema,
    confidence: z.number().min(0).max(1),
  })).default([]),
  low_confidence_items: z.array(z.object({
    node_id: z.string().max(120),
    canonical_name: z.string().max(200),
    relation_type: ExtendedRelationTypeSchema,
    confidence: z.number().min(0).max(1),
  })).default([]),
  recent_provenance: z.array(z.object({
    date: z.string().max(10),
    summary: z.string().max(500),
  })).default([]),
  schema_version: z.literal(1),
});
```

**Builder function** (`buildKnowledgeSummaryRenderInput`) queries `nodes`, `edges`, and `provenance_entries` using the verified DOC72 R5.6 schema, assembles the structured input, and returns it for the DOC24-owned renderer.

**Render polymorphism by node_kind [NP27]:** The `KnowledgeSummaryRenderInputSchema` includes `node_kind` and `payload`. DOC24/KDA render template SHALL be polymorphic by `node_kind`:
- `world_entity`: key facts + relationships + confidence
- `procedure`: intent + applicability + steps (if available in payload)
- `obligation`: obligee + deadline + status
- `standing_procedure`: trigger + action + frequency
- Other kinds: generic layout

This is a DOC24/KDA rendering obligation, not a DOC72 schema change.
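The dispatch described above might look like the following on the DOC24/KDA side. This is only a sketch: `selectLayout`, `LayoutId`, and the section identifiers in `LAYOUT_SECTIONS` are hypothetical names chosen for illustration and are not part of either document's contract.

```ts
type LayoutId =
  | "world_entity"
  | "procedure"
  | "obligation"
  | "standing_procedure"
  | "generic";

const KNOWN_LAYOUTS: readonly LayoutId[] = [
  "world_entity",
  "procedure",
  "obligation",
  "standing_procedure",
];

// Sections each layout is expected to show, following the list above.
// The identifiers themselves are illustrative.
const LAYOUT_SECTIONS: Record<LayoutId, string[]> = {
  world_entity: ["key_facts", "relationships", "confidence"],
  procedure: ["intent", "applicability", "steps"],
  obligation: ["obligee", "deadline", "status"],
  standing_procedure: ["trigger", "action", "frequency"],
  generic: [],
};

// Any node_kind without a dedicated layout falls back to the generic layout.
function selectLayout(nodeKind: string): LayoutId {
  const match = KNOWN_LAYOUTS.find((k) => k === nodeKind);
  return match ?? "generic";
}
```

Because `node_kind` is a free string in `KnowledgeSummaryRenderInputSchema`, the fallback keeps unknown or future kinds renderable without a schema change.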

#### 42A.4.4 `inspect_knowledge_summary` conversational tool [NP10]

```ts
export const InspectKnowledgeSummaryToolSchema = z.object({
  tool_id: z.literal("inspect_knowledge_summary"),
  description: z.literal(
    "Retrieve what Elnor currently knows about a person, matter, organization, " +
    "concept, or any entity. Returns a structured summary with key facts, " +
    "relationships, confidence levels, and timeline. Use this when the user " +
    "asks 'what do you know about X' or when you want to check your own knowledge.",
  ),
  parameters: z.object({
    query: z.string().max(500),
    include_relationships: z.boolean().default(true),
    include_timeline: z.boolean().default(false),
    include_low_confidence: z.boolean().default(true),
    max_related_entities: z.number().int().min(0).max(20).default(10),
  }),
});
```

**Inspect query resolver [NP10]:**
```ts
export const ResolveKnowledgeNodeQuerySchema = z.object({
  query: z.string().max(500),
  max_candidates: z.number().int().min(1).max(10).default(3),
  schema_version: z.literal(1),
});
```

Route: `POST /api/knowledge/resolve-node-query` → returns candidates with normalized RRF scores and disambiguation requirement.

**Inspect correction command [NP10]:**
```ts
export const SubmitKnowledgeSummaryCorrectionCommandSchema = z.object({
  command_id: z.string().max(120),
  node_id: z.string().max(120),
  correction_kind: z.enum([
    "durable_fact_correction", "relation_correction",
    "naming_correction", "temporary_context_note",
  ]),
  correction_text: z.string().max(1000),
  target_ref: z.string().max(120).optional(),
  issued_at: z.string().datetime(),
  schema_version: z.literal(1),
});
```

Route: `POST /api/knowledge/nodes/:nodeId/summary/corrections`

**Correction authority mapping:**

| Correction kind | Authority outcome |
|---|---|
| Durable fact correction | `directive_authority` |
| Relation correction | `graph_correction_only` |
| Naming correction | `graph_correction_only` |
| Temporary context note | `transient_only` |

No blanket rule that all inspect-summary corrections become directive authority.
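The table above can be expressed as a total lookup keyed by `correction_kind`, so that adding a new kind without deciding its authority fails type-checking. This is a sketch under the assumption that the table rows correspond one-to-one to the enum values in `SubmitKnowledgeSummaryCorrectionCommandSchema`; `CORRECTION_AUTHORITY` and `authorityForCorrection` are hypothetical names.

```ts
type CorrectionKind =
  | "durable_fact_correction"
  | "relation_correction"
  | "naming_correction"
  | "temporary_context_note";

type AuthorityOutcome =
  | "directive_authority"
  | "graph_correction_only"
  | "transient_only";

// Record<CorrectionKind, ...> forces every correction kind to have an explicit outcome.
const CORRECTION_AUTHORITY: Record<CorrectionKind, AuthorityOutcome> = {
  durable_fact_correction: "directive_authority",
  relation_correction: "graph_correction_only",
  naming_correction: "graph_correction_only",
  temporary_context_note: "transient_only",
};

function authorityForCorrection(kind: CorrectionKind): AuthorityOutcome {
  return CORRECTION_AUTHORITY[kind];
}
```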

#### 42A.4.5 Work-context constellation read-model [NP14]

```ts
export const WorkContextConstellationSchema = z.object({
  work_context_id: z.string().max(120),
  work_context_name: z.string().max(200),
  visibility_class: z.enum(["local_only", "cloud_allowed", "cloud_warn", "blocked"]).default("cloud_allowed"),
  scene_confidence: z.number().min(0).max(1).default(0),
  primary_entities: z.array(z.object({
    node_id: z.string().max(120),
    canonical_name: z.string().max(200),
    node_kind: z.string().max(40),
    relevance_score: z.number().min(0).max(1),
  })).default([]),
  active_goals: z.array(z.object({
    node_id: z.string().max(120),
    canonical_name: z.string().max(200),
    status: z.string().max(40),
  })).default([]),
  active_obligations: z.array(z.object({
    node_id: z.string().max(120),
    canonical_name: z.string().max(200),
    due_hint: z.string().max(120).optional(),
  })).default([]),
  recent_changes: z.array(z.object({
    kind: z.string().max(80),
    ref_id: z.string().max(120),
    summary: z.string().max(300),
    occurred_at: z.string().datetime(),
  })).default([]),
  unresolved_items: z.array(z.string().max(300)).default([]),
  likely_next_procedures: z.array(z.object({
    node_id: z.string().max(120),
    canonical_name: z.string().max(200),
    confidence: z.number().min(0).max(1),
  })).default([]),
  expected_but_missing_concepts: z.array(z.string().max(200)).default([]),
  updated_at: z.string().datetime(),
  stale: z.boolean().default(false),
  schema_version: z.literal(2),
});
```

**Render-owner seam [NP14]:** `rendered_summary_markdown` has been removed from the DOC72-owned schema. The constellation follows the same render-input / render-template split as compiled truth: DOC72 builds the structured `WorkContextConstellationSchema` as the render input, and DOC24/KDA owns the constellation markdown render template.

**Build policy [NP14]:**
```ts
export const WorkContextConstellationBuildPolicySchema = z.object({
  max_primary_entities: z.number().int().positive().default(12),
  max_recent_changes: z.number().int().positive().default(8),
  max_likely_next_procedures: z.number().int().positive().default(5),
  require_repeated_pattern_support_for_missing_concepts: z.boolean().default(true),
  schema_version: z.literal(1),
});
```

**Scene confidence [NP14]:** `scene_confidence` is computed from coverage of primary entities, goals, and obligations. High confidence = strong entity coverage + recent activity + confirmed motifs. Low confidence = sparse entities + no recent changes + no motifs.

**`expected_but_missing_concepts` computation [NP14]:** Populated only from confirmed motifs. Empty until confirmed motifs exist. Algorithm: for each confirmed motif whose `work_context_id` matches, check whether all step entities are present in the current work context. Missing entities from confirmed motifs are surfaced as "expected but missing."
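A minimal sketch of that algorithm follows. It assumes the surfaced strings are the `canonical_name` values of the missing step entities and that "present" means membership in a set of node IDs for the current work context; `ReviewedMotif` and `expectedButMissingConcepts` are hypothetical names, and the type is a trimmed-down stand-in for `ProcedureMotifSchema`.

```ts
type ReviewedMotif = {
  motif_id: string;
  work_context_id?: string;
  status: "detected" | "pending_review" | "confirmed" | "dismissed";
  steps: { node_id: string; canonical_name: string }[];
};

function expectedButMissingConcepts(
  motifs: ReviewedMotif[],
  workContextId: string,
  presentNodeIds: Set<string>,
): string[] {
  const missing: string[] = [];
  const alreadyReported = new Set<string>();
  for (const motif of motifs) {
    // Only confirmed motifs bound to this work context contribute.
    if (motif.status !== "confirmed" || motif.work_context_id !== workContextId) continue;
    for (const step of motif.steps) {
      if (presentNodeIds.has(step.node_id) || alreadyReported.has(step.node_id)) continue;
      alreadyReported.add(step.node_id);
      missing.push(step.canonical_name);
    }
  }
  return missing; // empty until confirmed motifs exist
}
```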

**`likely_next_procedures` computation [NP14]:** Populated from confirmed motifs matching current active entities. If the user has started but not completed a motif sequence (some steps are active), suggest the next unmatched step.
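One way to read the rule above is sketched below. Several things here are assumptions rather than spec: "next unmatched step" is interpreted as the first step in `sequence_index` order that is not currently active, the `confidence` value is a placeholder (fraction of steps already matched) because this section does not define how it is computed, and `likelyNextProcedures` and its types are hypothetical names.

```ts
type SequencedStep = { sequence_index: number; node_id: string; canonical_name: string };
type SequencedMotif = {
  motif_id: string;
  status: "detected" | "pending_review" | "confirmed" | "dismissed";
  steps: SequencedStep[];
};
type NextProcedure = { node_id: string; canonical_name: string; confidence: number };

function likelyNextProcedures(
  motifs: SequencedMotif[],
  activeNodeIds: Set<string>,
  maxResults = 5, // mirrors max_likely_next_procedures default
): NextProcedure[] {
  const best = new Map<string, NextProcedure>();
  for (const motif of motifs) {
    if (motif.status !== "confirmed") continue;
    const ordered = motif.steps.slice().sort((a, b) => a.sequence_index - b.sequence_index);
    const matched = ordered.filter((s) => activeNodeIds.has(s.node_id)).length;
    // Skip motifs that are not started or are already complete.
    if (matched === 0 || matched === ordered.length) continue;
    const next = ordered.find((s) => !activeNodeIds.has(s.node_id));
    if (!next) continue;
    const confidence = matched / ordered.length; // placeholder scoring, not normative
    const prior = best.get(next.node_id);
    if (!prior || prior.confidence < confidence) {
      best.set(next.node_id, { node_id: next.node_id, canonical_name: next.canonical_name, confidence });
    }
  }
  return Array.from(best.values())
    .sort((a, b) => b.confidence - a.confidence)
    .slice(0, maxResults);
}
```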

**Routes:**
- `GET /api/knowledge/work-contexts/:workContextId/constellation` → `WorkContextConstellationReadModelSchema`
- `POST /api/knowledge/work-contexts/:workContextId/constellation/refresh` → `RefreshWorkContextConstellationCommandSchema`
- `POST /api/knowledge/work-contexts/:workContextId/constellation/review-unresolved` → `QueueConstellationUnresolvedItemReviewCommandSchema`

#### 42A.4.6 Weekly knowledge digest [NP22]

```ts
export const WeeklyDigestConfigSchema = z.object({
  enabled: z.boolean().default(true),
  delivery_day: z.enum(["monday","tuesday","wednesday","thursday","friday","saturday","sunday"]).default("monday"),
  max_entities_per_context: z.number().int().positive().default(15),
  include_new_entities: z.boolean().default(true),
  include_confidence_changes: z.boolean().default(true),
  include_corrections: z.boolean().default(true),
  include_approaching_deadlines: z.boolean().default(true),
  include_stale_knowledge: z.boolean().default(true),
  destination: z.enum(["q_inbox", "email"]).default("q_inbox"),
  schema_version: z.literal(1),
});

export const WeeklyDigestContentSchema = z.object({
  digest_id: z.string().max(120),
  week_start: z.string().datetime(),
  week_end: z.string().datetime(),
  work_context_sections: z.array(z.object({
    work_context_id: z.string().max(120),
    work_context_name: z.string().max(200),
    new_entities: z.array(z.object({
      node_id: z.string().max(120),
      canonical_name: z.string().max(200),
      node_kind: z.string().max(40),
      created_at: z.string().datetime(),
    })).default([]),
    confidence_changes: z.array(z.object({
      node_id: z.string().max(120),
      canonical_name: z.string().max(200),
      old_confidence: z.number().min(0).max(1),
      new_confidence: z.number().min(0).max(1),
    })).default([]),
    corrections_applied: z.number().int().nonnegative().default(0),
    approaching_deadlines: z.array(z.object({
      node_id: z.string().max(120),
      canonical_name: z.string().max(200),
      deadline_hint: z.string().max(200),
    })).default([]),
    stale_knowledge_count: z.number().int().nonnegative().default(0),
  })).default([]),
  schema_version: z.literal(1),
});

export const WeeklyDigestRenderInputSchema = z.object({
  digest_id: z.string().max(120),
  generated_at: z.string().datetime(),
  visibility_class: z.enum(["cloud_allowed", "cloud_warn", "local_only", "blocked"]),
  content: WeeklyDigestContentSchema,
  schema_version: z.literal(1),
});
```

**Storage:**
- Config: `ELNOR_MEMORY/config/weekly_digest_config.json`
- Digest artifacts: `ELNOR_MEMORY/system/backlink/digests/{digest_id}.json`

**Render-owner seam [NP22]:** DOC72 SHALL assemble `WeeklyDigestContentSchema` and `WeeklyDigestRenderInputSchema`. DOC24/KDA SHALL own the markdown/email render template and delivery formatting. EC SHALL orchestrate assembly, render, and destination dispatch. Email delivery is opt-in only; `q_inbox` is the default destination.

Visibility filtering SHALL occur at digest assembly time, not only at delivery time.

**Routes [NP10]:**
- `GET /api/settings/knowledge/weekly-digest` → `WeeklyDigestConfigSchema`
- `PATCH /api/settings/knowledge/weekly-digest` → partial update
- `GET /api/knowledge/digests/latest` → latest digest render-input
- `GET /api/knowledge/digests/history` → digest history list

---

### 42A.5 Co-Occurrence Pattern and Procedure Discovery

#### 42A.5.1 Architecture change from V1

V1 clustered on canonical edges and used `edges.created_at`, which does not exist. V2 clusters on the mention observation ledger, which has timestamps and surface metadata.

V1 used connected components (union-find). V2 uses **maximal clique detection** (Bron-Kerbosch algorithm). Connected components merge unrelated workflows via bridging entities (e.g., the user's name connects every task into one mega-cluster). Maximal cliques ensure every entity in a detected pattern co-occurs with every OTHER entity.
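To make the difference concrete, here is a small, unbounded sketch of Bron-Kerbosch with pivoting over an undirected co-occurrence graph. It is illustrative only: `buildGraph` and `maximalCliques` are hypothetical names, and the size, runtime, and hub bounds that the detection config imposes are deliberately left out.

```ts
type CoOccurrenceGraph = Map<string, Set<string>>;

function buildGraph(pairs: [string, string][]): CoOccurrenceGraph {
  const g: CoOccurrenceGraph = new Map();
  const link = (a: string, b: string): void => {
    const nbrs = g.get(a) ?? new Set<string>();
    nbrs.add(b);
    g.set(a, nbrs);
  };
  for (const [a, b] of pairs) {
    link(a, b);
    link(b, a);
  }
  return g;
}

function maximalCliques(g: CoOccurrenceGraph): string[][] {
  const out: string[][] = [];
  const neighbours = (v: string): Set<string> => g.get(v) ?? new Set<string>();

  // r: current clique, p: candidates that extend it, x: vertices already explored.
  const expand = (r: string[], p: Set<string>, x: Set<string>): void => {
    if (p.size === 0 && x.size === 0) {
      out.push(r.slice().sort());
      return;
    }
    // Pivot on the vertex of P ∪ X with the most neighbours in P to prune branches.
    let pivot = "";
    let best = -1;
    for (const u of Array.from(p).concat(Array.from(x))) {
      const n = Array.from(neighbours(u)).filter((w) => p.has(w)).length;
      if (n > best) {
        best = n;
        pivot = u;
      }
    }
    const pivotNbrs = neighbours(pivot);
    for (const v of Array.from(p)) {
      if (pivotNbrs.has(v)) continue;
      const nv = neighbours(v);
      expand(
        r.concat(v),
        new Set(Array.from(p).filter((w) => nv.has(w))),
        new Set(Array.from(x).filter((w) => nv.has(w))),
      );
      p.delete(v);
      x.add(v);
    }
  };

  expand([], new Set(Array.from(g.keys())), new Set<string>());
  return out;
}
```

For the pairs A-B, B-C, A-C, C-D, D-E, C-E, connected components would return a single group of all five entities because C bridges the two triangles, whereas `maximalCliques` yields `[A, B, C]` and `[C, D, E]`. The bridging entity still appears in both cliques, which is the case hub exclusion is meant to handle.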

#### 42A.5.2 Pattern schema

```ts
export const CoOccurrencePatternSchema = z.object({
  pattern_id: z.string().max(120),
  entity_ids: z.array(z.string().max(120)).min(2),
  entity_names: z.array(z.string().max(200)),
  co_occurrence_count: z.number().int().positive(),
  visibility_class: z.enum(["local_only", "cloud_allowed", "cloud_warn", "blocked"]).default("cloud_allowed"),
  distinct_observation_count: z.number().int().positive(),
  surfaces: z.array(z.string().max(80)).default([]),
  work_context_ids: z.array(z.string().max(120)).default([]),
  temporal_span_days: z.number().nonnegative(),
  temporal_regularity: z.enum(["none", "weekly", "monthly", "irregular"]),
  temporal_regularity_confidence: z.number().min(0).max(1).default(0),
  suggested_procedure_kind: z.enum([
    "tool_workflow", "document_pattern", "communication_pattern",
    "analysis_pattern", "research_pattern", "composite_unknown",
  ]),
  status: z.enum(["detected", "pending_review", "confirmed", "dismissed"]).default("detected"),
  created_at: z.string().datetime(),
  schema_version: z.literal(1),
});
```

**Storage:** `ELNOR_MEMORY/system/backlink/co_occurrence_patterns.jsonl`

#### 42A.5.3 Detection algorithm and bounds [NP13]

```ts
export const CliqueDetectionConfigSchema = z.object({
  max_nodes_considered: z.number().int().positive().default(5000),
  min_pair_support: z.number().int().positive().default(3),
  max_clique_size: z.number().int().positive().default(6),
  max_cliques_emitted: z.number().int().positive().default(2000),
  hub_exclusion_degree_threshold: z.number().int().positive().default(50),
  use_degeneracy_ordering: z.boolean().default(true),
  max_runtime_ms: z.number().int().positive().default(30000),
  min_lift_threshold: z.number().positive().default(1.5),
  schema_version: z.literal(1),
});
```

**Pre-filters [NP13]:**
1. Remove hub nodes with degree > `hub_exclusion_degree_threshold` from the co-occurrence mining graph (not from canonical graph).
2. Before a frequent pair may participate in clique construction, it SHALL satisfy both `co_count >= min_pair_support` and `lift >= min_lift_threshold`, where lift compares observed co-occurrence against expected independent frequency.
3. Only search for cliques of size ≤ `max_clique_size`.
4. Stop after emitting `max_cliques_emitted` cliques.
5. Abort if runtime exceeds `max_runtime_ms`. Emit partial results with `truncated: true`.
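Pre-filter 2 can be illustrated with the standard association-rule definition of lift, `P(a,b) / (P(a) * P(b))`, which reduces to `co_count * N / (count_a * count_b)`. That definition is an assumption on top of the text above, which only says lift compares observed against expected independent frequency; `pairLift`, `filterEligiblePairs`, and the input shapes are hypothetical.

```ts
type PairCount = { a: string; b: string; co_count: number };
type PairFilterConfig = { min_pair_support: number; min_lift_threshold: number };

// Lift > 1 means the pair co-occurs more often than independence would predict.
function pairLift(
  coCount: number,
  countA: number,
  countB: number,
  totalObservations: number,
): number {
  if (countA === 0 || countB === 0 || totalObservations === 0) return 0;
  return (coCount * totalObservations) / (countA * countB);
}

// A pair must pass both the raw support floor and the lift floor.
function filterEligiblePairs(
  pairs: PairCount[],
  entityCounts: Map<string, number>,
  totalObservations: number,
  config: PairFilterConfig,
): PairCount[] {
  return pairs.filter((p) => {
    const lift = pairLift(
      p.co_count,
      entityCounts.get(p.a) ?? 0,
      entityCounts.get(p.b) ?? 0,
      totalObservations,
    );
    return p.co_count >= config.min_pair_support && lift >= config.min_lift_threshold;
  });
}
```

With 100 observations, two entities that each appear in 80 and 90 of them and co-occur 72 times have a lift of exactly 1.0, so the pair is dropped despite its high raw count. This is how the lift check suppresses ubiquitous entities that fall below the hub degree threshold.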

#### 42A.5.4 Temporal regularity

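The algorithm is unchanged: compute the intervals between consecutive occurrences, then their mean and coefficient of variation (CV). Classify in order:
- CV < 0.3 and mean interval of 5-9 days → `weekly`
- CV < 0.3 and mean interval of 25-35 days → `monthly`
- CV < 0.5 → `irregular`
- Otherwise → `none`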
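The thresholds above can be sketched as a small classifier. The following are assumptions not stated in this section: the population standard deviation is used for the CV, the day ranges are treated as inclusive, and fewer than three timestamps (fewer than two intervals) yields `none`. `classifyTemporalRegularity` is a hypothetical name.

```ts
type TemporalRegularity = "none" | "weekly" | "monthly" | "irregular";

const MS_PER_DAY = 24 * 60 * 60 * 1000;

function classifyTemporalRegularity(timestampsMs: number[]): TemporalRegularity {
  if (timestampsMs.length < 3) return "none"; // assumption: need at least two intervals
  const sorted = timestampsMs.slice().sort((a, b) => a - b);

  // Intervals between consecutive occurrences, in days.
  const intervals: number[] = [];
  for (let i = 1; i < sorted.length; i++) {
    intervals.push((sorted[i] - sorted[i - 1]) / MS_PER_DAY);
  }

  const mean = intervals.reduce((sum, d) => sum + d, 0) / intervals.length;
  if (mean === 0) return "none";
  const variance =
    intervals.reduce((sum, d) => sum + (d - mean) * (d - mean), 0) / intervals.length;
  const cv = Math.sqrt(variance) / mean; // coefficient of variation

  if (cv < 0.3 && mean >= 5 && mean <= 9) return "weekly";
  if (cv < 0.3 && mean >= 25 && mean <= 35) return "monthly";
  if (cv < 0.5) return "irregular";
  return "none";
}
```

Note that under these rules a very steady cadence outside both day ranges (for example every two days) is labelled `irregular`, since only the weekly and monthly bands have their own labels.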

#### 42A.5.5 Standing procedure candidacy

A pattern whose `temporal_regularity` is `weekly` or `monthly` with `temporal_regularity_confidence >= 0.6` is additionally flagged as a standing procedure candidate. The user confirms via `ApproveProcedureMotifCommandSchema` with `confirm_as_standing_procedure`.

#### 42A.5.6 Ordered procedure motif lane [NP06]

```ts
export const OrderedMotifStepSchema = z.object({
  sequence_index: z.number().int().nonnegative(),
  node_id: z.string().max(120),
  canonical_name: z.string().max(200),
  node_kind: z.string().max(40),
  median_offset_ms: z.number().nonnegative().optional(),
  schema_version: z.literal(1),
});

export const ProcedureMotifSchema = z.object({
  motif_id: z.string().max(120),
  work_context_id: z.string().max(120).optional(),
  visibility_class: z.enum(["local_only", "cloud_allowed", "cloud_warn", "blocked"]).default("cloud_allowed"),
  steps: z.array(OrderedMotifStepSchema).min(2),
  support_count: z.number().int().positive(),
  distinct_observation_count: z.number().int().positive(),
  success_support_count: z.number().int().nonnegative().default(0),
  failure_support_count: z.number().int().nonnegative().default(0),
  surface_families: z.array(z.string().max(80)).default([]),
  temporal_regularity: z.enum(["none", "weekly", "monthly", "irregular"]),
  temporal_regularity_confidence: z.number().min(0).max(1).default(0),
  suggested_node_kind: z.enum(["procedure", "standing_procedure", "undetermined"]),
  status: z.enum(["detected", "pending_review", "confirmed", "dismissed"]).default("detected"),
  created_at: z.string().datetime(),
  schema_version: z.literal(1),
});
```

**Mining configuration [NP06]:**
```ts
export const OrderedMotifDetectionConfigSchema = z.object({
  enabled: z.boolean().default(true),
  mining_mode: z.enum(["contiguous_only"]).default("contiguous_only"),
  dedupe_consecutive_duplicates: z.boolean().default(true),
  min_support_count: z.number().int().positive().default(3),
  min_step_count: z.number().int().positive().default(2),
  max_step_count: z.number().int().positive().default(6),
  min_success_ratio_for_standing_procedure: z.number().min(0).max(1).default(0.7),
  schema_version: z.literal(1),
});
```

**Normative algorithm [NP06]:**

V2.2 motif mining SHALL use contiguous subsequence mining only. Gapped subsequences are out of scope. Within a single observation sequence, consecutive duplicate `resolved_node_id` values SHALL be collapsed to one occurrence before mining. Sequences longer than `max_step_count` SHALL be processed as sliding windows.

```ts
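// Enumerates every contiguous subsequence of length min_step_count..max_step_count
// in each observation window and counts support once per distinct observation_id.
// This function does not itself collapse consecutive duplicates; per the rule above,
// that must happen before the windows are passed in.
function mineContiguousSubsequences(
  observationWindows: { observation_id: string; ordered_node_ids: string[] }[],
  config: z.infer<typeof OrderedMotifDetectionConfigSchema>,
): Map<string, { steps: string[]; support: number; observations: Set<string> }> {
  const candidates = new Map<string, { steps: string[]; support: number; observations: Set<string> }>();
  for (const window of observationWindows) {
    const seq = window.ordered_node_ids;
    // Capping len at max_step_count yields the sliding windows over longer sequences.
    for (let len = config.min_step_count; len <= Math.min(config.max_step_count, seq.length); len++) {
      for (let start = 0; start <= seq.length - len; start++) {
        const subseq = seq.slice(start, start + len);
        const key = subseq.join('|');
        // Typed Set keeps the fallback assignable to the Map's value type.
        const existing = candidates.get(key) ?? { steps: subseq, support: 0, observations: new Set<string>() };
        // Count each observation at most once per candidate, however often it repeats.
        if (!existing.observations.has(window.observation_id)) {
          existing.support++;
          existing.observations.add(window.observation_id);
        }
        candidates.set(key, existing);
      }
    }
  }
  // Drop candidates below the support floor (deleting during Map iteration is safe).
  for (const [key, val] of candidates) {
    if (val.support < config.min_support_count) candidates.delete(key);
  }
  return candidates;
}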
```

Support is counted per distinct `observation_id`, not per raw occurrence. `success_support_count` and `failure_support_count` are populated from DOC72 §34 `experience_records` when linked to supporting observations. Observations without execution traces contribute to neither count.
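The pre-mining collapse of consecutive duplicates required by the normative algorithm is small enough to show directly. `collapseConsecutiveDuplicates` is a hypothetical helper name. The sketch assumes its input is the ordered list of `resolved_node_id` values for one observation sequence.

```ts
// Collapses runs of the same node ID to a single occurrence.
// Non-adjacent repeats are kept, because they are distinct steps in the sequence.
function collapseConsecutiveDuplicates(orderedNodeIds: string[]): string[] {
  const out: string[] = [];
  for (const id of orderedNodeIds) {
    if (out.length === 0 || out[out.length - 1] !== id) {
      out.push(id);
    }
  }
  return out;
}
```

For example, `["a", "a", "b", "a", "a", "c"]` becomes `["a", "b", "a", "c"]`: the two runs of `a` each shrink to one entry, but the second `a` survives because it follows `b`.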

---

### 42A.6 Graph Density Health Metrics [NP15]

```ts
export const GraphDensityMetricsSchema = z.object({
  total_nodes: z.number().int().nonnegative(),
  total_edges: z.number().int().nonnegative(),
  promoted_edge_count: z.number().int().nonnegative(),
  observed_mention_count: z.number().int().nonnegative(),
  mean_edges_per_node: z.number().nonnegative(),
  median_edges_per_node: z.number().nonnegative(),
  isolated_node_count: z.number().int().nonnegative(),
  weak_relation_ratio: z.number().min(0).max(1),
  generic_relation_ratio: z.number().min(0).max(1),
  hub_node_count: z.number().int().nonnegative(),
  median_support_per_promoted_edge: z.number().nonnegative(),
  suppression_rate: z.number().min(0).max(1),
  promotion_rate: z.number().min(0).max(1),
  llm_resolution_rate: z.number().min(0).max(1),
  mean_candidates_per_observation: z.number().nonnegative(),
  low_utility_edge_count: z.number().int().nonnegative(),
  top_hub_nodes: z.array(z.object({
    node_id: z.string().max(120),
    canonical_name: z.string().max(200),
    edge_count: z.number().int().nonnegative(),
  })).max(10).default([]),
  computed_at: z.string().datetime(),
  schema_version: z.literal(1),
});

export const NodeLinkBudgetPolicySchema = z.object({
  max_promoted_edges_per_source_node: z.number().int().positive().default(25),
  max_generic_relations_per_source_node: z.number().int().positive().default(3),
  max_new_targets_per_observation: z.number().int().positive().default(12),
  hub_penalty_threshold: z.number().int().positive().default(150),
  schema_version: z.literal(1),
});
```

**Storage:**
- `ELNOR_MEMORY/system/graph_density_metrics.json` (nightly)
- `ELNOR_MEMORY/config/node_link_budget_policy.json`

`NodeLinkBudgetPolicy` is consumed in §42A.2.4 `materializePromotedEdges`.

**Edge lifecycle state [NP15]:**
```ts
export const EdgeLifecycleStateSchema = z.enum(["active", "observed", "suppressed", "archived"]);
```

#### 42A.6.1 Graph hygiene, demotion, and archive-review lane [NP15]

The backlink system SHALL include a conservative hygiene lane so weak backlink-derived artifacts do not accumulate forever.

```ts
export const BackLinkGraphHygieneConfigSchema = z.object({
  enabled: z.boolean().default(true),
  edge_decay_half_life_days: z.number().int().positive().default(90),
  min_confidence_for_active_retention: z.number().min(0).max(1).default(0.18),
  archive_isolated_candidate_nodes_after_days: z.number().int().positive().default(30),
  never_auto_archive_user_verified: z.boolean().default(true),
  never_auto_archive_authority_backed: z.boolean().default(true),
  schema_version: z.literal(1),
});

export const GraphHygieneMutationReceiptSchema = z.object({
  receipt_id: z.string().max(120),
  demoted_edge_count: z.number().int().nonnegative(),
  archived_candidate_node_count: z.number().int().nonnegative(),
  reviewed_bundle_count: z.number().int().nonnegative(),
  executed_at: z.string().datetime(),
  schema_version: z.literal(1),
});

export const RunBackLinkGraphHygieneCommandSchema = z.object({
  command_id: z.string().max(120),
  force_full_scan: z.boolean().default(false),
  issued_at: z.string().datetime(),
  schema_version: z.literal(1),
});

export const GetGraphHealthQuerySchema = z.object({
  include_top_hubs: z.boolean().default(true),
  schema_version: z.literal(1),
});

export const GraphHealthReadModelSchema = GraphDensityMetricsSchema.extend({
  last_hygiene_receipt: GraphHygieneMutationReceiptSchema.optional(),
  status: z.enum(["ready", "stale", "building"]).default("ready"),
  schema_version: z.literal(1),
});
```
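The `edge_decay_half_life_days` and `min_confidence_for_active_retention` knobs suggest an exponential half-life decay, although this section does not spell out the formula. The sketch below assumes `confidence * 0.5^(age / half_life)` measured from the edge's last supporting observation; that formula, the "last support" reference point, and the names `decayedConfidence` and `isDemotionCandidate` are all assumptions for illustration.

```ts
type HygieneDecayConfig = {
  edge_decay_half_life_days: number;
  min_confidence_for_active_retention: number;
};

// Assumed decay: confidence halves every `halfLifeDays` without fresh support.
function decayedConfidence(
  confidence: number,
  daysSinceLastSupport: number,
  halfLifeDays: number,
): number {
  const age = Math.max(0, daysSinceLastSupport);
  return confidence * Math.pow(0.5, age / halfLifeDays);
}

// An edge becomes a demotion candidate once its decayed confidence
// drops below the active-retention floor.
function isDemotionCandidate(
  confidence: number,
  daysSinceLastSupport: number,
  config: HygieneDecayConfig,
): boolean {
  const decayed = decayedConfidence(
    confidence,
    daysSinceLastSupport,
    config.edge_decay_half_life_days,
  );
  return decayed < config.min_confidence_for_active_retention;
}
```

Under the defaults (90-day half-life, 0.18 floor), an edge at 0.8 is still retained after 90 days without support (0.4), while an edge at 0.3 would fall to 0.15 and become a demotion candidate. The user-verified and authority-backed exemptions in the config would be checked separately and are not modelled here.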

**Async first-run [NP15]:** `Run full prune now` SHALL dispatch an asynchronous EC command, return immediately with a queued receipt, and update Graph Health via polling or subscription. Full hygiene runs SHALL be chunked, checkpointable, and resumable. The UI SHALL never block on a full hygiene run.

**Confidence-decayed promotion recall:** When graph hygiene demotes a backlink-promoted edge, the corresponding evidence bundle SHALL revert to `promotion_state = "candidate_relation"` with its support history intact. If subsequent observations reinforce the bundle above promotion threshold, the edge SHALL be re-promoted without requiring fresh evidence accumulation.

**Routes:**
- `GET /api/knowledge/graph-health` → `GraphHealthReadModelSchema`
- `POST /api/knowledge/graph-health/run-hygiene` → consumes `RunBackLinkGraphHygieneCommandSchema`, returns `GraphHygieneMutationReceiptSchema`

---

### 42A.7 Normative Rules Summary

1. Mention observations are evidence, not canonical truth. Only promoted evidence bundles become `edges`.
2. Contextual novelty gate replaces the static 80-token cap. Default 500 tokens, configurable.
3. Compiled truth is a derived read-model. DOC72 owns the mirror and the input builder; DOC24/KDA owns the render template. Compiled truth is never stored on the node, and its markdown is never injected directly into LLM runtime context.
4. RRF is transparent to DOC24. Output is normalized to 0-1 for DOC24 threshold compatibility.
5. Co-occurrence detection runs on the mention-observation ledger using maximal clique detection, not on canonical edges using connected components.
6. Ordered procedure motifs use contiguous-subsequence mining and require review before canonical node creation.
7. Tier 4 respects PropA execution trust classes. High-risk content is `same_machine_local_only` or blocked.
8. Tier 5 entity creation requires repeated unresolved support across multiple surfaces, visibility enforcement, review queue gating, approval-time re-resolution, and support for `create_new`, `link_existing`, and `dismiss` actions.
9. Alias-lexicon scanning runs before generic fallback patterns and remains a rebuildable derived optimization, not a second alias store. Stored as a derived SQLite table.
10. Relation types use a governed enum split into domain-agnostic core and domain-profile extensions. Learned extensions require `learned_` prefix.
11. Hub penalty applies at promotion time only, not at resolution time. `hub_penalty` and `salience_score` are computed during evidence aggregation and stored on the bundle.
12. Resolution ties SHALL use additional discriminators or abstain. No non-hub preference is allowed.
13. All subsystems are present from day one. Empty ledgers and zero-result bundles are natural zero states.
14. Every config knob MUST have a real consume path or be removed.
15. Every user-facing action MUST map to a typed EC command chain.
16. Cross-context expansion is fallback-only and type-scoped.
17. Stage 1 SHALL prefer bounded extraction text. Raw-document fallback requires caps and an approved runtime isolation strategy.
18. Nested spans are supported via `is_nested` and `parent_mention_id`.
19. Tier 4 prompt receives novelty-gate context for global awareness without unbounded prompt growth.
20. Compiled-truth mirrors render asynchronously with build-lock dedup; stale content may be returned immediately with visible stale state.
21. Work-context constellations are derived read-models. DOC72 owns structured schema; DOC24/KDA owns markdown render template.
22. Graph hygiene archives or demotes weak backlink-derived artifacts conservatively; demoted bundles revert to `candidate_relation` for potential re-promotion.
23. Visibility class SHALL propagate across all artifacts. Promoted-edge visibility enforcement checks edge + both endpoint nodes + PropA classification states.
24. Inspect-summary corrections SHALL use correction-kind authority mapping. No blanket `directive_authority` rule.
25. Backlink learning artifacts are derived, rebuildable, kill-switchable with staged generation, and MUST NOT become a second canonical confidence system.
26. All background work SHALL register with EC Core's `BackgroundJobOrchestrator` and honor the nightly DAG.
27. Q surfaces for Review Queue, Summary, Constellation, Graph Health, and settings MUST define loading, empty, blocked, degraded, stale-data, and command-failed states.
28. No coding agent may claim timeout-safe regex behavior unless the actual implementation enforces it.
29. Span extraction patterns, seed relation cues, and domain-specific relation type extensions SHALL be loaded from active domain profile(s) defined in DOC72 §35. A fresh instance with no profile SHALL use only domain-agnostic core patterns and relation types.
30. Undirected relation types SHALL normalize pair ordering before evidence-bundle and edge writes. Domain profiles declare directionality for their relation type extensions.
31. Cooldown resets on user corrections, new aliases, Tier 5 approvals, or lifecycle state changes. It does not reset on confidence-only changes.
32. Mention observations with `relation_hypothesis = "semantically_related"` SHALL NOT create new evidence bundles unless a corroborating mention from a different observation exists.
33. `NodeLinkBudgetPolicySchema` fields are enforced in the promotion pipeline, not decorative.
34. Mention-observation idempotency uses resolution-independent SHA-256 keys. `resolved_node_id` is mutable. Incremental aggregation uses monotonic `append_seq`, not content-addressed `mention_id`.
35. Evidence-bundle retarget/rekey operations are transactional, audit-trailed, and correctly cascade to affected promoted edges.
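As an illustration of rule 30, pair-order normalization for undirected relation types can be a pure function applied before any evidence-bundle or edge write. The ordering criterion is not specified in this section; the sketch assumes plain lexicographic comparison of node IDs, and `normalizeRelationPair` is a hypothetical name.

```ts
// Returns the (source, target) pair to persist.
// Directed relations keep their order; undirected ones are put in a stable order
// so that (a, b) and (b, a) land in the same bundle or edge.
function normalizeRelationPair(
  sourceNodeId: string,
  targetNodeId: string,
  isDirected: boolean,
): [string, string] {
  if (isDirected || sourceNodeId <= targetNodeId) {
    return [sourceNodeId, targetNodeId];
  }
  return [targetNodeId, sourceNodeId];
}
```

Any total order works as long as every writer uses the same one; lexicographic order on the ID string is simply the easiest to reproduce across components.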

### 42A.8 Explicit Non-Goals

1. Storing compiled truth on the node schema.
2. Injecting compiled truth markdown into LLM context.
3. File-watcher-triggered correction pipelines from edited markdown.
4. Unbounded regex in relation cues or runtime-learned raw regex.
5. Back-link edges bypassing DOC1 Write Gate.
6. LLM back-linking on the hot path for every extraction.
7. Non-hub tie-breaking at resolution time.
8. Embedding-space policy enforcement as the primary privacy boundary.
9. Federated learning across ELNOR instances.
10. Full Shapley attribution for every backlink decision in V2.
11. Shadow node speculative merging.
12. Full temporal DVR / point-in-time graph queries.
13. MMR diversity rerank after RRF.
14. Letta-style memory tiering.
15. Leiden/community detection for graph clustering.
16. White-space probing / Socratic autonomous questioning.
17. Contradiction mining or ambient desktop/screen work-context inference.
18. Butterfly-effect dependency tracing.
19. Blind canonical deletion of weak backlink-derived truth instead of reviewable demotion/archive.

### Appendix 42A.A — DOC24/KDA Cross-Doc Obligations

#### 42A.A.1 Capability registration
Register `inspect_knowledge_summary` in DOC24 capability registry. Always-available, JIT-mounted.

#### 42A.A.2 Render template ownership
DOC24/KDA SHALL own render templates for: compiled truth summaries, work-context constellations, and weekly digests. Each consumes a DOC72-owned render-input schema.

#### 42A.A.3 Polymorphic compiled truth rendering [NP27]
DOC24/KDA render template SHALL check `node_kind` on the render input and select an appropriate rendering layout. At minimum: `world_entity` renders key facts + relationships + confidence. `procedure` renders intent + applicability + steps. `obligation` renders obligee + deadline + status. `standing_procedure` renders trigger + action + frequency. Other kinds use a generic layout.

#### 42A.A.4 Onboarding integration
DOC24 §21: Elnor SHOULD proactively call `inspect_knowledge_summary` on the top 3-5 entities in a new work context.

#### 42A.A.5 Weekly digest delivery
DOC24 owns delivery (Q inbox, email via DOC16). DOC72 owns content assembly and render-input building.

#### 42A.A.6 RRF consumption
DOC24 packet assembly calls `hybridRetrieve`. Normalized output ensures threshold compatibility.

---

### Appendix 42A.B — Storage Paths

| Artifact | Path | Format |
|---|---|---|
| Novelty gate config | `ELNOR_MEMORY/config/novelty_gate_config.json` | Atomic JSON |
| Back-link config | `ELNOR_MEMORY/config/backlink_enforcement_config.json` | Atomic JSON |
| Alias lexicon config | `ELNOR_MEMORY/config/alias_lexicon_config.json` | Atomic JSON |
| Promotion policy | `ELNOR_MEMORY/config/backlink_promotion_policy.json` | Atomic JSON |
| Hybrid retrieval config | `ELNOR_MEMORY/config/hybrid_retrieval_config.json` | Atomic JSON |
| Weekly digest config | `ELNOR_MEMORY/config/weekly_digest_config.json` | Atomic JSON |
| Node link budget policy | `ELNOR_MEMORY/config/node_link_budget_policy.json` | Atomic JSON |
| Graph hygiene config | `ELNOR_MEMORY/config/backlink_graph_hygiene_config.json` | Atomic JSON |
| Back-link learning settings | `ELNOR_MEMORY/config/backlink_learning_settings.json` | Atomic JSON |
| Retention policy | `ELNOR_MEMORY/config/backlink_retention_policy.json` | Atomic JSON |
| Entity creation policy | `ELNOR_MEMORY/config/entity_creation_candidate_policy.json` | Atomic JSON |
| Constellation build policy | `ELNOR_MEMORY/config/constellation_build_policy.json` | Atomic JSON |
| Clique detection config | `ELNOR_MEMORY/config/clique_detection_config.json` | Atomic JSON |
| Domain span patterns | `ELNOR_MEMORY/config/domain_span_patterns/{domain_id}.json` | Per-domain atomic JSON |
| Mention observations | `ELNOR_MEMORY/system/backlink/mention_observations.jsonl` | Append-only JSONL |
| Entity creation candidates | `ELNOR_MEMORY/system/backlink/entity_creation_candidates.jsonl` | Append-only JSONL |
| Relation cues | `ELNOR_MEMORY/system/backlink/relation_cues.json` | Atomic JSON |
| Rejections (durable) | `ELNOR_MEMORY/system/backlink/rejections.jsonl` | Append-only JSONL |
| Tier 4 queue | `ELNOR_MEMORY/system/backlink/tier4_queue.jsonl` | Append-only JSONL |
| Review queue | `ELNOR_MEMORY/system/backlink/review_queue.jsonl` | Append-only JSONL |
| Co-occurrence patterns | `ELNOR_MEMORY/system/backlink/co_occurrence_patterns.jsonl` | Append-only JSONL |
| Ordered procedure motifs | `ELNOR_MEMORY/system/backlink/procedure_motifs.jsonl` | Append-only JSONL |
| Feedback events | `ELNOR_MEMORY/system/backlink/feedback_events.jsonl` | Append-only JSONL |
| Learning counters | `ELNOR_MEMORY/system/backlink/learning_counters.json` | Atomic JSON |
| Utility ledgers | `ELNOR_MEMORY/system/backlink/{type}_utility.json` | Atomic JSON |
| Compiled policy bundle | `ELNOR_MEMORY/system/backlink/compiled_policy_bundle.json` | Atomic JSON |
| Policy generation status | `ELNOR_MEMORY/system/backlink/policy_generation_status.json` | Atomic JSON |
| Aggregation state | `ELNOR_MEMORY/system/backlink/aggregation_state.json` | Atomic JSON |
| Graph density metrics | `ELNOR_MEMORY/system/graph_density_metrics.json` | Atomic JSON |
| Graph hygiene receipts | `ELNOR_MEMORY/system/backlink/graph_hygiene_receipts.jsonl` | Append-only JSONL |
| Compiled truth mirrors | `ELNOR_MEMORY/mirror/{node_id}.json` | Per-node atomic JSON |
| Work-context constellations | `ELNOR_MEMORY/mirror/work_context/{work_context_id}.json` | Per-context atomic JSON |
| Cooldown tracker | `ELNOR_MEMORY/system/backlink/cooldown.json` | Atomic JSON |
| Weekly digests | `ELNOR_MEMORY/system/backlink/digests/{digest_id}.json` | Per-digest atomic JSON |
| Mention obs. (queryable) | `mention_observations_derived` SQLite table | Derived from JSONL |
| Rejections (queryable) | `backlink_rejections_derived` SQLite table | Derived from JSONL |
| Review queue (queryable) | `review_queue_derived` SQLite table | Derived from JSONL |
| Alias lexicon (queryable) | `alias_lexicon_derived` SQLite table | Derived from nodes+aliases |
| Evidence bundles (queryable) | `evidence_bundles_derived` SQLite table | Derived from mention observations |

### Appendix 42A.C — Cross-Doc Obligations Beyond DOC24

#### 42A.C.1 EC Core Addendum A

| Obligation | Priority | Detail |
|---|---|---|
| Register all backlink tasks in the unified task registry | High | Includes alias lexicon rebuild, nightly resolution, evidence aggregation, promotion, enrichment sweep, clique pattern detection, ordered motif detection, graph hygiene, constellation refresh, and historical sweep. |
| Token accounting for Tier 4 LLM resolution | High | Nightly resolution and historical sweep consume LLM budget and SHALL be visible in EC cost governance. |
| Settings wiring manifest entries | High | Entries for novelty gate, backlink enforcement, alias lexicon, promotion policy, graph hygiene, learning runtime settings, hybrid retrieval, and weekly digest. |
| Unified background orchestration only | High | No backlink-private scheduler or embedded cron loop may be introduced. |

#### 42A.C.2 PropA

| Obligation | Priority | Detail |
|---|---|---|
| Execution trust for Tier 4 | High | `resolveExecutionRequirement()` SHALL classify backlink resolution tasks as extraction/classification work that may see raw content. High-risk tags/findings → `same_machine_local_only`. |
| Visibility propagation | High | Mention observations, evidence bundles, promoted edges, entity creation candidates, compiled truth inputs, and constellations SHALL preserve `blocked` / `local_only` / `cloud_warn` / `cloud_allowed`. |
| Sensitive-content defer/block policy | High | When `local_only` work is deferred because no local model is available, visibility MUST be re-evaluated at processing time, not only at deferral time. |
| No hot-path LLM expansion | High | Tier 4 remains nightly/queued by default. No new general-ingest live-turn LLM gate is introduced. |
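
The defer/block rule above can be illustrated with a minimal sketch. Only the four visibility levels come from this section. The `DeferredTask` shape, the `processDeferred` function, the `evaluateVisibility` callback, and the outcome names are hypothetical, and preferring local execution whenever a local model is available is an assumption.

```ts
type Visibility = "blocked" | "local_only" | "cloud_warn" | "cloud_allowed";

// Hypothetical shape of a deferred backlink task.
interface DeferredTask {
  task_id: string;
  visibility_at_deferral: Visibility;
}

// Hypothetical outcome labels for this sketch.
type Outcome = "run_local" | "run_cloud" | "stay_deferred" | "drop";

// Re-evaluates visibility when the task is picked up instead of trusting
// the value recorded at deferral time.
function processDeferred(
  task: DeferredTask,
  evaluateVisibility: (taskId: string) => Visibility,
  localModelAvailable: boolean,
): Outcome {
  const current = evaluateVisibility(task.task_id);
  switch (current) {
    case "blocked":
      return "drop"; // assumption: blocked work is discarded, not retried
    case "local_only":
      return localModelAvailable ? "run_local" : "stay_deferred";
    case "cloud_warn":
    case "cloud_allowed":
      return localModelAvailable ? "run_local" : "run_cloud";
  }
}
```

The point of the sketch is that `visibility_at_deferral` is never consulted when deciding where to run: a task deferred as `cloud_allowed` that has since become `local_only` stays deferred rather than going to the cloud.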

#### 42A.C.3 DOC8 and DOC24 Addendum A (BDSM)

| Obligation | Priority | Detail |
|---|---|---|
| Compute nightly backlink utility ledgers | High | DOC8 computes entity-resolution, relation-inference, suppression, and cross-context utility records from feedback events and evidence outcomes. |
| Compile backlink policy bundle | High | DOC8 compiles promotions, demotions, suppressions, and cross-context allowances into `BackLinkCompiledPolicyBundleSchema`; EC writes the active generation. |
| Derived-only posture | High | These artifacts remain rebuildable, kill-switchable, and MUST NOT store DOC72 canonical alpha/beta confidence as a second epistemic system. |
| Runtime consumption boundary | Medium | DOC24 may consume backlink compiled policies only through settings/read-model surfaces or approved runtime policy hooks; this does not create a second prompt-control language. |

#### 42A.C.4 DOC1

| Obligation | Priority | Detail |
|---|---|---|
| Recognize promoted edges as legitimate intake path | Medium | Promoted edges enter at `observation` or `observed` lifecycle state with confidence from the evidence score. |
| Entity creation candidates through Write Gate | Medium | Approved Tier 5 candidates SHALL pass through the DOC1/DOC72 write gate before becoming canonical nodes. |
| Authority correction discipline | Medium | Only durable factual corrections from `inspect_knowledge_summary` become directive authority; relation/naming corrections remain graph corrections. |

#### 42A.C.5 DOC3

| Obligation | Priority | Detail |
|---|---|---|
| Recognize confirmed patterns as procedure source | Medium | Confirmed clique patterns or ordered motifs MAY seed `procedure` or `standing_procedure` nodes with source `backlink_pattern_detection`. |
| Keep semantic-procedure posture | Medium | Pattern promotion SHALL produce semantic intent, not mechanical UI click-paths. |

#### 42A.C.6 DOC11

| Obligation | Priority | Detail |
|---|---|---|
| Route Tier 4 LLM sessions | Medium | DOC11 routes Tier 4 sessions according to `same_machine_local_only` vs `cloud_or_local` execution trust requirements. |
| No silent cloud fallback | Medium | Sensitive Tier 4 work MUST NOT fall back to cloud when local execution is required and unavailable. |
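
A minimal sketch of the routing rule, assuming a hypothetical `routeTier4Session` function and `RouteDecision` shape. Only the two execution trust values come from this section; the reason strings and the preference for local execution when it is available are illustrative assumptions.

```ts
type ExecutionRequirement = "same_machine_local_only" | "cloud_or_local";

// Hypothetical decision shape for this sketch.
type RouteDecision =
  | { target: "local" }
  | { target: "cloud" }
  | { target: "deferred"; reason: string };

function routeTier4Session(
  requirement: ExecutionRequirement,
  localAvailable: boolean,
  cloudAvailable: boolean,
): RouteDecision {
  if (localAvailable) {
    return { target: "local" };
  }
  // Cloud is only reachable when the requirement explicitly permits it.
  if (requirement === "cloud_or_local" && cloudAvailable) {
    return { target: "cloud" };
  }
  return {
    target: "deferred",
    reason:
      requirement === "same_machine_local_only"
        ? "local_execution_required_but_unavailable"
        : "no_executor_available",
  };
}
```

Because the cloud branch is guarded by the requirement rather than by availability alone, a `same_machine_local_only` session has no code path that reaches the cloud target.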

#### 42A.C.7 DOC20 / DOC21 / DOC22

| Obligation | Priority | Detail |
|---|---|---|
| Review Queue surface | High | Knowledge Manager SHALL expose candidate relations, entity creation candidates, suppression controls, and motif confirmations with full blocked/degraded state handling. |
| Summary tab | High | Entity detail SHALL expose a Summary tab backed by the compiled truth mirror. |
| Constellation tab | High | Work-context detail SHALL expose a Constellation tab backed by the derived constellation read-model. |
| Graph Health panel | High | Knowledge Manager/System Health SHALL expose graph density and hygiene metrics plus manual hygiene trigger. |
| Settings surfaces | Medium | Settings SHALL expose Knowledge & Feedback controls for backlink runtime and Retrieval controls for novelty gate and RRF. |

#### 42A.C.8 DOC72 §35 Domain Profiles [NP09]

| Obligation | Priority | Detail |
|---|---|---|
| Domain profiles supply span patterns, relation cues, and relation type extensions | High | Legal domain profile supplies the current legal-specific seed rules. Other domain profiles supply their own. |
| Domain relation definitions include directionality | High | `DomainRelationDefinitionSchema` with `directed` or `undirected`. EC merges undirected types into pair-normalization at profile load. |
| Fresh instance behavior | High | No domain profile active → domain-agnostic core patterns and relation types only. |
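
The pair-normalization merge can be sketched as follows. `DomainRelationDefinition` here is a simplified, hypothetical stand-in for `DomainRelationDefinitionSchema`; the field names, the `Edge` shape, and the lexicographic ordering rule are assumptions made for illustration.

```ts
// Simplified stand-in for DomainRelationDefinitionSchema (field names assumed).
interface DomainRelationDefinition {
  relation_type: string;
  directionality: "directed" | "undirected";
}

// Hypothetical edge shape.
interface Edge {
  source_id: string;
  target_id: string;
  relation_type: string;
}

// Collects the undirected relation types declared by the loaded profiles.
function buildUndirectedSet(defs: DomainRelationDefinition[]): Set<string> {
  return new Set(
    defs.filter((d) => d.directionality === "undirected").map((d) => d.relation_type),
  );
}

// Orders the endpoints of undirected edges so (a, b) and (b, a) collapse
// to one canonical pair. Directed edges pass through unchanged.
function normalizeEdge(edge: Edge, undirected: Set<string>): Edge {
  if (!undirected.has(edge.relation_type)) {
    return edge;
  }
  if (edge.source_id <= edge.target_id) {
    return edge;
  }
  return { ...edge, source_id: edge.target_id, target_id: edge.source_id };
}
```

With no domain profile active, `buildUndirectedSet([])` yields an empty set, so only whatever undirected types the domain-agnostic core registers separately would be normalized.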

### Appendix 42A.D — Rejected Enhancement Ideas (from red team reviews)

| Idea | Source | Rejection reason |
|---|---|---|
| Shadow node speculative merging | Gemini | Too speculative for V2. Evidence bundles already provide a governed convergence path. |
| White-space probing / Socratic agent | Gemini | Requires domain-specific completeness models and autonomous questioning policy not ready for V2. |
| Temporal DVR / point-in-time graph queries | Gemini | Requires point-in-time storage/query model beyond current DOC72 scope. |
| MMR diversity rerank after RRF | Grok | Adds complexity without closing a current blocker. |
| Letta-style memory tiering | Grok | Organizational polish, not a functional gap in this proposal. |
| Community detection (Leiden) | Grok | Premature at current scale; clique + ordered motifs are sufficient. |
| Butterfly-effect / dependency tracing | Gemini | Requires a far richer causal dependency graph than V2 defines. |
| Ambient work-context inference from broad desktop/screen sensing | Gemini | Different feature family, too invasive, and not needed for this proposal. |
| Contradiction mining / devil's advocate engine | Gemini | Valuable idea, but depends on authorial-voice attribution and polarity analysis not in scope here. |
| Friction-map failure zones | Gemini | Interesting but requires structured failure-node instrumentation not yet defined. |
| Procedural ghosting / UI anticipation | Gemini | UX feature for a later phase, not a blocker for V2 graph intelligence. |

---

*End of DOC72 Graph Intelligence Enhancement Proposal — V2.2 (Integrated)*

## 43. Graph Query API

### 43.1 Typed Query Operations

```ts
type GraphQuery =
  | { kind: "entity_lookup"; lookup: "id" | "alias" | "exact_name"; value: string }
  | { kind: "entity_neighbors"; entity_id: string; edge_types?: string[]; depth: 1 | 2; max_results: number }
  | { kind: "entity_cards"; entity_ids: string[]; card_tier: "compact" | "full" }
  | { kind: "procedure_bundle"; app_entity_id?: string; composite_id?: string; max_nodes: number }
  | { kind: "goal_context"; entity_id: string; max_tokens: number }
  | { kind: "knowledge_brief"; entity_id: string }
  | { kind: "related_threads"; entity_id: string; max_results: number }
  | { kind: "entity_scoped_search"; entity_id: string; query: string; max_results: number };

type GraphQueryResult = {
  query_kind: string;
  results: unknown[];
  latency_ms: number;
  truncated: boolean;
  truncation_reason?: string;
};
```

### 43.2 Latency Ceilings

A query that exceeds its hard ceiling must fail or degrade:

| Query kind | p50 target | p95 target | Hard ceiling |
|---|---|---|---|
| entity_lookup | <1ms | <3ms | 10ms |
| entity_neighbors (depth 1) | <3ms | <8ms | 25ms |
| entity_neighbors (depth 2) | <8ms | <20ms | 50ms |
| entity_cards (≤10) | <2ms | <5ms | 15ms |
| procedure_bundle | <3ms | <8ms | 25ms |
| goal_context | <2ms | <5ms | 15ms |

**Owner:** DOC72 defines query types and contracts. DOC24 implements via SQLite queries.
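
As an illustration of the degrade path, the sketch below checks a finished query against the hard ceilings from the table in §43.2. The `CeilingKey` names, the `enforceCeiling` function, and the `truncation_reason` string format are hypothetical. Keeping the partial result set while flagging it as truncated is one possible reading of "degrade", not a rule stated by this section.

```ts
type CeilingKey =
  | "entity_lookup"
  | "entity_neighbors_depth_1"
  | "entity_neighbors_depth_2"
  | "entity_cards"
  | "procedure_bundle"
  | "goal_context";

// Hard ceilings in milliseconds, copied from the §43.2 table.
const HARD_CEILING_MS: Record<CeilingKey, number> = {
  entity_lookup: 10,
  entity_neighbors_depth_1: 25,
  entity_neighbors_depth_2: 50,
  entity_cards: 15,
  procedure_bundle: 25,
  goal_context: 15,
};

// Same shape as GraphQueryResult in §43.1.
type GraphQueryResult = {
  query_kind: string;
  results: unknown[];
  latency_ms: number;
  truncated: boolean;
  truncation_reason?: string;
};

// Marks a result as degraded when its measured latency exceeds the ceiling.
function enforceCeiling(key: CeilingKey, result: GraphQueryResult): GraphQueryResult {
  const ceiling = HARD_CEILING_MS[key];
  if (result.latency_ms <= ceiling) {
    return result;
  }
  return {
    ...result,
    truncated: true,
    truncation_reason: `hard_ceiling_exceeded:${ceiling}ms`,
  };
}
```

This is a post-hoc check only. An implementation that must actually stop at the ceiling would need to interrupt the underlying query rather than inspect `latency_ms` afterwards.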

---

## 44. DOC3 Evolution

### 44.1 What DOC3 Keeps Owning

Observation / signal capture, candidate synthesis, proposal / review / promotion pipeline, skill categories, executable skill artifact management, deployment and enablement state, micro-skill chaining, skill/connector distinction.

### 44.2 What DOC3 Gains

- **Graph-backed procedure substrate:** DOC3's learning pipeline writes procedure nodes to the SQLite graph.
- **Shared procedure reuse:** Check graph for existing procedures before creating new ones.
- **Execution trace ingestion:** Capture traces as graph nodes.
- **Semantic intent procedures:** All procedures store intent, not UI steps.
- **Confidence linkage:** Beta α/β from the graph informs promotion decisions.
- **Correction propagation:** Corrected procedures flag DOC3 artifacts for regeneration.
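
The confidence linkage can be sketched with the standard Beta mean, α / (α + β). The `BetaConfidence` shape, the `recordTrace` and `readyForPromotion` helpers, and the default thresholds (mean ≥ 0.8, at least 5 units of evidence) are hypothetical values chosen for illustration, not parameters defined by this document.

```ts
interface BetaConfidence {
  alpha: number; // accumulated supporting evidence
  beta: number; // accumulated contradicting evidence
}

// Mean of a Beta(alpha, beta) distribution.
function betaMean(c: BetaConfidence): number {
  return c.alpha / (c.alpha + c.beta);
}

// A successful validation trace increments alpha; a failed one increments beta.
function recordTrace(c: BetaConfidence, success: boolean): BetaConfidence {
  return success
    ? { alpha: c.alpha + 1, beta: c.beta }
    : { alpha: c.alpha, beta: c.beta + 1 };
}

// Hypothetical promotion rule: enough total evidence and a high enough mean.
function readyForPromotion(
  c: BetaConfidence,
  minMean: number = 0.8,
  minEvidence: number = 5,
): boolean {
  return c.alpha + c.beta >= minEvidence && betaMean(c) >= minMean;
}
```

Requiring a minimum amount of evidence alongside the mean keeps a procedure with one lucky trace from being promoted on the strength of a high but poorly supported estimate.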

### 44.3 What SKILL.md Becomes

SKILL.md files are NOT eliminated. They evolve into:
- Human-readable documentation of reviewed/promoted skills
- Execution-facing artifacts for complex workflows
- Onboarding seed content for pre-built knowledge
- Exportable/shareable procedure packs

The canonical KNOWLEDGE is in the graph. SKILL.md files are PROJECTIONS optimized for human readability and agent execution. DOC3 generates/updates SKILL.md from the graph when skills are promoted or corrections propagate.
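
A minimal sketch of the projection direction (graph to file), assuming a hypothetical `ProcedureNode` shape and a hypothetical SKILL.md layout. Neither the field names nor the Markdown layout are defined by this section.

```ts
// Hypothetical subset of a procedure node's fields.
interface ProcedureNode {
  node_id: string;
  title: string;
  intent: string;
  steps: string[];
}

// Renders a procedure node as SKILL.md text. The file is derived output:
// regenerating it from the graph always overwrites manual edits.
function renderSkillMd(p: ProcedureNode): string {
  const lines: string[] = [
    `# ${p.title}`,
    "",
    `<!-- projected from graph node ${p.node_id} -->`,
    "",
    p.intent,
    "",
    ...p.steps.map((step, i) => `${i + 1}. ${step}`),
  ];
  return lines.join("\n") + "\n";
}
```

Embedding the source node ID in the output is one way to keep a self-contained file traceable back to the graph without making it depend on the graph at read time.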

---

## 45. The Microsoft Word Example — Full Trace (Updated for v5)

### Day 1: First TOC creation

Will says: "Create a table of contents for the Henderson brief."

**During execution:** Elnor opens Word, navigates to references section, inserts TOC, applies heading styles, saves.

**After execution (async, background):**
1. "Microsoft Word" application entity created (auto-confirmed)
2. LLM bootstrap runs → creates `known_capabilities` list: "references section, styles, track changes, comments, find and replace, headers and footers, TOC, footnotes, mail merge"
3. Execution trace captured
4. Semantic-intent procedure "Insert Table of Contents" extracted. Steps: "Open the references section and insert a table of contents. Ensure document headings use proper heading styles. Update the TOC."
5. "Apply Heading Styles" extracted as separate sub-procedure (reusable intent)

**Result:** Graph has: Word entity with 9 known capabilities, 2 semantic-intent procedures, 1 trace.

### Day 3: Proofreading

Will says: "Proofread the Henderson opposition brief."

**During execution:** Elnor opens Word, checks heading styles (RETRIEVES "Apply Heading Styles" procedure from graph), enables track changes, makes corrections, saves.

**After execution:**
1. "Enable Track Changes" procedure extracted (semantic intent: "Turn on change tracking to record all edits")
2. "Apply Heading Styles" gets second validation trace → α incremented
3. Composite procedure candidate: "Proofread Document" = Apply Heading Styles + Enable Track Changes + review

**Key moment:** When checking heading styles, Elnor retrieved the existing procedure. Shared knowledge compounded.
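
The retrieve-before-create behavior in this trace can be sketched with an in-memory store. `ProcedureStore`, its `getOrCreate` method, and matching on a normalized name are all simplifying assumptions; the actual graph lookup is not specified here and would likely match on more than the name.

```ts
// Hypothetical minimal procedure record.
interface Procedure {
  name: string;
  intent: string;
  validation_traces: number;
}

class ProcedureStore {
  private byKey = new Map<string, Procedure>();

  // Assumption: procedures are matched by case-insensitive, trimmed name.
  private static key(name: string): string {
    return name.trim().toLowerCase();
  }

  // Returns the existing procedure when one matches, otherwise creates it.
  getOrCreate(name: string, intent: string): { procedure: Procedure; reused: boolean } {
    const k = ProcedureStore.key(name);
    const existing = this.byKey.get(k);
    if (existing !== undefined) {
      existing.validation_traces += 1;
      return { procedure: existing, reused: true };
    }
    const created: Procedure = { name, intent, validation_traces: 1 };
    this.byKey.set(k, created);
    return { procedure: created, reused: false };
  }
}
```

In terms of the trace above, Day 1 takes the create branch for "Apply Heading Styles" and Day 3 takes the reuse branch, which is where the second validation trace is recorded.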

### Day 7: Track Changes Cleanup

By Day 7, Elnor has 5 validated procedures, 3 composite candidates, and 7 traces — all with semantic intent. Any future Word task starts from this base, not from zero. The graph is smaller and more resilient than v4's UI affordance model.

---

## 46. Evaluation Harness

**R5.7 note:** latency claims remain validated on an M4 Pro MacBook with 32GB unified memory. Lower-tier hardware is expected to achieve materially lower throughput. An embedding-quality benchmark harness and explicit hardware support tiers remain future work, but the re-embed migration path and scale notes in this revision are now explicit.

### 46.1 Purpose

The micro-instruction tagging system (§DOC24) and injection selection algorithm (§DOC24) must be continuously validated. Tags are only useful if the LLM reliably follows them. The evaluation harness measures tag adherence, detects regressions on model upgrades, and identifies tags that need reinforcement.

### 46.2 Benchmark Task Set

100+ representative tasks across domains:
- Legal: brief drafting, case analysis, motion preparation, client communication
- Personal: grocery ordering, scheduling, health decisions, music production
- Operational: file management, email triage, calendar management
- Strategic: settlement positioning, goal review, cross-matter analysis
- Mixed: tasks combining multiple domains and injection types

Each benchmark specifies: input request, expected injected knowledge, expected tags, expected LLM behavior.

### 46.3 Evaluation Metrics

| Metric | What it measures | Minimum threshold |
|---|---|---|
| `[enforce]` adherence | Constraint respected in output | 85% |
| `[cite]` adherence | Citation present for legal assertion | 85% |
| `[apply]` adherence | Preference silently applied | 75% |
| `[follow]` adherence | Procedure steps followed | 75% |
| `[caution]` adherence | Warning surfaced to user | 70% |
| `[consider]` adherence | Goal context reflected in reasoning | 70% |
| Missed constraint rate | `[enforce]` injected but constraint violated | <5% |
| Wrong silent application | `[apply]` knowledge applied when it shouldn't be | <10% |
| Citation accuracy | Citations match injected authorities | 90% |
| Proportional injection | Casual requests don't get strategic injection | 80% |

### 46.4 A/B Comparison

Three conditions tested:
- **No tags:** Raw knowledge injected without behavioral micro-instructions
- **Tags:** Natural-language bracket tags (e.g., `[apply]`, `[enforce]`)
- **Tags + XML:** Structured XML wrapper (§DOC24) with natural-language card content

Results determine which format produces the highest adherence for each tag class.

### 46.5 Regression Testing

Run the evaluation suite on:
- Model upgrades (new Gemini, Claude, or local model versions)
- KOI baseline revisions
- Injection algorithm parameter changes
- New tag additions

If any tag drops below its minimum threshold after a change, the change is flagged for review before deployment.
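
The threshold check can be sketched as below, using the per-tag minimums from the §46.3 table. The `tagsBelowThreshold` function and the input shape are hypothetical, and treating a tag with no measurement as failing is an assumption.

```ts
// Minimum adherence thresholds as fractions, from the §46.3 table.
const MIN_ADHERENCE: Record<string, number> = {
  enforce: 0.85,
  cite: 0.85,
  apply: 0.75,
  follow: 0.75,
  caution: 0.7,
  consider: 0.7,
};

// Returns the tags whose measured adherence is below the minimum.
// Assumption: a tag with no measurement counts as 0 and is flagged.
function tagsBelowThreshold(measured: Record<string, number>): string[] {
  return Object.entries(MIN_ADHERENCE)
    .filter(([tag, minimum]) => (measured[tag] ?? 0) < minimum)
    .map(([tag]) => tag)
    .sort();
}
```

A non-empty return value is the signal to flag the change for review. A value exactly at the threshold passes, which matches the "drops below" wording.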

### 46.6 Manual Validation During Development

During initial build, test tag adherence manually across 20-30 representative scenarios. Document which tags work reliably. If any tag has <70% adherence, strengthen with XML structure, redundant placement, or KOI baseline reinforcement before first use.

---



## 47. Build Dependency Order

**R5.7 additional build order:** the absorbed KDA payload-contract module, the integrated Knowledge Intelligence module, and the integrated Graph Intelligence V2.1 module are normative dependencies for a complete R5.7 implementation. See §§4A, 34A, and 42A plus their appendices.

This is a dependency graph — what depends on what — NOT a phasing commitment. All features are end-state scope. The sequencing reflects structural dependencies (you can't build experience tracking before the entity graph exists), not deferral decisions.

| Step | What | Dependencies |
|---|---|---|
| 1 | Entity graph (10 types) + SQLite + FTS5 + sqlite-vec + DOC1 memory migration + embedding infrastructure (§3.7) | None |
| 2 | Knowledge-to-LLM delivery (rendering templates + XML injection + tagging + evaluation harness) — DOC24 | Step 1 |
| 3 | Domain knowledge (legal vertical with citations) + work products + authority provenance + domain signal profiles (§35) | Steps 1-2 |
| 4 | Conversation corpus / episodic recall | Steps 1-2 |
| 5 | Conversational inspectability + knowledge brief + Text-to-SQL + Knowledge Manager UI (§41) | Steps 1-4 |
| 6 | Basic experience tracking + Beta distribution confidence (§5) | Steps 1-2 |
| 7 | Standing procedures + trigger filter language + dedup + ActionSafetyClass | Steps 1-2, 6 |
| 8 | Goals + goal-linked injection | Steps 1-3 |
| 9 | Agent profiles + delegation protocol + auto-evolution | Steps 1-2 |
| 10 | Browser integration + domain consent model + research session capture | Steps 1-5 |
| 11 | Graph cleanup / entropy control + knowledge health dashboard + semantic folding (§42) | Steps 1, 6 |
| 12 | Knowledge intake pipeline — observation + significance gating + surface contracts (§§20A-20C) | Steps 1-3 |
| 13 | Self-learning extraction feedback loop (§36) + versioned commits (§37) + audit trail (§38) | Steps 1, 6, 12 |
| 14 | Retroactive knowledge sonar (§21) | Steps 1, 3, 12 |
| 15 | Daily extraction budgets (§23) + critical non-drop queue (§24) + folder attention tiers (§22) | Steps 1, 12 |

Steps within the same dependency tier can be built in parallel (e.g., steps 3, 4, and 6 can proceed simultaneously once steps 1-2 are done).
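
The parallel tiers can be derived mechanically from the table. The sketch below transcribes the Dependencies column and assigns each step a tier equal to the length of its longest dependency chain. `DEPENDS_ON` and `buildTiers` are hypothetical names, and the function assumes the dependency data is acyclic.

```ts
// Direct dependencies per step, transcribed from the table above.
const DEPENDS_ON: Record<number, number[]> = {
  1: [],
  2: [1],
  3: [1, 2],
  4: [1, 2],
  5: [1, 2, 3, 4],
  6: [1, 2],
  7: [1, 2, 6],
  8: [1, 2, 3],
  9: [1, 2],
  10: [1, 2, 3, 4, 5],
  11: [1, 6],
  12: [1, 2, 3],
  13: [1, 6, 12],
  14: [1, 3, 12],
  15: [1, 12],
};

// Groups steps by the length of their longest dependency chain.
// Steps in the same group have no dependency on each other.
function buildTiers(deps: Record<number, number[]>): number[][] {
  const depth = new Map<number, number>();
  const depthOf = (step: number): number => {
    const cached = depth.get(step);
    if (cached !== undefined) return cached;
    const d = (deps[step] ?? []).reduce(
      (max, dep) => Math.max(max, depthOf(dep) + 1),
      0,
    );
    depth.set(step, d);
    return d;
  };

  const tiers: number[][] = [];
  for (const key of Object.keys(deps)) {
    const step = Number(key);
    const d = depthOf(step);
    while (tiers.length <= d) tiers.push([]);
    const tier = tiers[d];
    if (tier !== undefined) tier.push(step);
  }
  return tiers;
}
```

Applied to the table, `buildTiers(DEPENDS_ON)` yields five tiers: `[1]`, `[2]`, `[3, 4, 6, 9]`, `[5, 7, 8, 11, 12]`, and `[10, 13, 14, 15]`. The third tier contains the steps 3, 4, and 6 named in the example above, plus step 9.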

---

## 48. Open Questions

### Resolved by Adjudication (R5.3)

The following open questions from v4 are CLOSED:

| # | Question | Resolution |
|---|---|---|
| 1 | Trace granularity | Semantic intent summaries, not step-by-step UI logs (ISS-04) |
| 6 | Token budget for procedure injection | 80-350 tokens proportional; ≤100 for goals (DOC24 delivery) |
| 12 | Standing procedure limits | <5ms for structured filter matching; soft cap 50 procedures (§19.2) |
| 20 | Graph size governance | Entropy control with 8 nightly jobs + archival + compaction (§42) |
| 25 | Experience record variant taxonomy | Governed by `variant_label: string` with context_distribution (§34) |
| 26 | Provenance chain depth | Three-tier model: Full/Compact/Minimal (§33.2) |
| 29 | Browser domain allowlist UX | Per-domain toggles with four privacy levels (§39.4) |
| 32 | Six-dimension storage efficiency | Tiered sparsity policy + SQLite column storage (§2.2) |
| 34 | Micro-tag vocabulary | Current set confirmed sufficient; extensible through config (DOC24) |
| 37 | Agent profile inheritance | Live link — profiles adapt from usage via auto_evolve (§40) |
| 38 | Inter-agent delegation protocol | Scoped payload defined (§40) |
| 39 | inspect_knowledge scope | Human-readable summaries + Text-to-SQL for raw access (§41) |
| 40 | Conversational adjustment authority | All agents can propose; Elnor-only for domain additions (§40) |
| 41 | Injection feedback attribution | Hysteresis + decay (DOC24) |
| 42 | Self-learning injection convergence | Damping mechanisms prevent oscillation (DOC24) |

### Resolved by R4 Knowledge Intake Architecture

| # | Question | Resolution |
|---|---|---|
| R4-1 | Note timeout | 20 min (configurable). OS focus as early trigger. Double-extraction guard. Extraction lock pointer. (§20B.2) |
| R4-2 | Folding threshold | Per node_kind. DOC8 tunes. Temporal constraint (>24 months → verify). (§42.6.2) |
| R4-3 | Opposing work classification | Default external. Scoped classification rules (sender + subject + mime + TTL). Ask once if ambiguous. (§20B.4) |
| R4-4 | Long documents | Full doc with caching for deep. Chunked for sonar Pass 2. |
| R4-5 | Re-extraction | Deep v1 + diff-only thereafter. Full re-extract if structure >25% change. |
| R4-6 | To-do lifetime | Today Note: same-day. Obligation-matching items persist. Non-obligation to-dos pass significance gating. (§20B.2) |
| R4-7 | Agent-uncontested weight | 0.40 default. 3+ agents → 0.55 review-needed. Still not directive. (§20B.6) |
| R4-8 | Westlaw Client ID | Never prompt. Optional bonus. (§20B.9) |
| R4-9 | Sonar scope | Global index for Pass 1. Tier 2 priority Pass 2. (§21.2) |
| R4-10 | Multi-user | principal_id + scope fields added now. Full multi-user deferred. (§4.0) |
| R4-11 | Research summary | Always for explicit sessions. Budget-subject for auto-detected dwell. (§39.2) |
| R4-12 | Cross-context folding | Fold context-agnostic parents. Link context-specific applications. (§42.6.2) |
| R4-13 | Context-specific incognito | excluded_matter_ids on MemoryModeState. (§13.4) |
| R4-14 | Per-surface controls | Global toggles default. Advanced per-surface override behind Settings > Advanced. |
| R4-15 | Embedding model | Qwen3 primary, locked config, migration pipeline. (§3.7) |
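
The re-extraction resolution (R4-5) reduces to a small decision rule. In this sketch the function name, the mode labels, and the idea that structural change arrives as a precomputed fraction between 0 and 1 are assumptions; how that fraction is measured is not defined in this table.

```ts
type ExtractionMode = "deep" | "diff_only" | "full_reextract";

// Threshold from R4-5: full re-extract if structure changes by more than 25%.
const STRUCTURE_CHANGE_THRESHOLD = 0.25;

// previousExtractions: how many times this document has been extracted before.
// structuralChangeFraction: share of the document structure that changed (0..1).
function chooseExtractionMode(
  previousExtractions: number,
  structuralChangeFraction: number,
): ExtractionMode {
  if (previousExtractions === 0) {
    return "deep"; // first version always gets a deep extraction
  }
  return structuralChangeFraction > STRUCTURE_CHANGE_THRESHOLD
    ? "full_reextract"
    : "diff_only";
}
```

The comparison is strict, so a document whose structure changed by exactly 25% still takes the diff-only path.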

### Remaining Open Questions

1. **Cross-application procedures.** How should cross-app procedures (e.g., "copy from Excel, paste into Word") link to multiple application entities? Single procedure with multiple `part_of_application` edges, or separate procedures per app chained via composite?

2. **Procedure versioning.** When a procedure's semantic intent changes, archive as `specializes` variant or replace? Current answer leans toward `specializes` with temporal metadata.

3. **Community skill packs.** Import community-contributed procedure knowledge? Confidence floor for community content? Trust model?

4. **Skill file generation.** Reference graph node IDs or self-contained? Current direction: SKILL.md files are projections from graph, self-contained for portability.

5. **Failure-driven learning.** Auto-extract corrected procedure from recovery actions? Or require explicit teaching?

6. **Change detection cadence.** Filesystem watchers or scheduled scans? Tradeoff: watchers are real-time but consume resources; scans are batch but delayed.

7. **Change history depth.** Full history or windowed with archive? Current: append-only, Tier A gets full, Tier B gets minimal.

8. **Decision node formality.** Explicit capture or auto-detect from conversation? Current: auto-detect through conversation mining with explicit confirmation.

9. **Research lineage depth.** Full query logs or kept/rejected with reasoning?

10. **Cross-entity change cascading.** Maximum cascade depth confirmed at 2 hops. Is this sufficient for complex workflows?

11. **Environment migration assistant.** Explicit "new computer" mode for bulk re-validation?

12. **Goal abstraction ladder.** Discourage overly abstract goals during onboarding?

13. **Goal scope for personal domains.** When to suggest personal goals?

14. **Goal evolution triggers.** What signals prompt goal review? Milestone events only, or also time-based?

15. **DOC8 experience processing cadence.** Real-time for critical, nightly for trends? Or configurable per-signal?

16. **Domain concept hierarchy depth.** How many nesting levels are practical? 3-4 seems right.

17. **Domain knowledge portability.** Shareable domain concept packs as starters?

18. **Authority verification cadence.** How often to prompt for Shepardization? Quarterly seems reasonable. Weekly background job when browser integration is active (§39).

19. **SQLite scaling ceiling.** At what node count does sqlite-vec brute-force search become a bottleneck? ~100K seems safe. What then?

20. **DOC1 migration rollback.** If SQLite migration from JSONL fails partway, what's the recovery strategy?

21. **Multi-agent transaction isolation.** When two agents write to the graph simultaneously, how does SQLite WAL mode handle contention?

22. **Nightly job ordering.** Is there a dependency graph between the cleanup jobs? Can they run in parallel?

23. **Note extraction granularity.** Should every note save trigger extraction, or only saves with significant content changes? Answer: >100 chars changed, plus semantic diff detection (§20B.2).

24. **Document Viewer extraction for opposing work.** Resolved: multi-signal classification with scoped rules (§20B.4).

25. **Multi-agent room extraction cost.** Room conversations can be long (100+ turns, 3-5 agents). Extraction cost scales with transcript length. Should long room conversations be chunked for extraction, or extracted as a whole?

26. **CANDOR finding severity calibration.** Resolved: severity-weighted β increments (critical: 2.0, major: 1.5, minor: 0.5, observation: 0.25) (§20B.8).

27. **Panel recommendation expiry.** Resolved: PENDING → `stale_suggested` after 7 days (§20B.7).

28. **Cross-surface knowledge consolidation.** Resolved: precedence ordering with nightly consolidation (§20C).
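
The severity-weighted β increments recorded under item 26 can be sketched as follows. The four weights are the ones listed there; the `applyFindings` function and the `BetaState` shape are hypothetical.

```ts
type Severity = "critical" | "major" | "minor" | "observation";

// Severity-weighted beta increments as listed under item 26 (§20B.8).
const BETA_INCREMENT: Record<Severity, number> = {
  critical: 2.0,
  major: 1.5,
  minor: 0.5,
  observation: 0.25,
};

interface BetaState {
  alpha: number;
  beta: number;
}

// Adds the weighted increment for each finding to beta; alpha is untouched.
function applyFindings(state: BetaState, findings: Severity[]): BetaState {
  const added = findings.reduce((sum, s) => sum + BETA_INCREMENT[s], 0);
  return { alpha: state.alpha, beta: state.beta + added };
}
```

For example, a node at α = 4, β = 1 that receives one critical and one minor finding moves to β = 3.5, which lowers its Beta mean from 0.8 to 4 / 7.5.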

---


## 49A. Cross-Document Obligations Triggered by R5.7

1. MultiDoc PropA `CanonicalNodeKindSchema` SHALL include `tool_capability` in the same revision.
2. DOC24/KDA rendering templates SHALL be updated for every newly integrated knowledge type or payload field this revision makes canonical.
3. DOC20 §6.18.2, DOC21, and DOC22 SHALL register all new or expanded UI surfaces, stored content types, health panels, routes, and settings introduced by this revision.
4. BDSM §5.2 and the DOC24 runtime sections SHALL carry the same matrix-off baseline fallback sentence used in DOC72 §14.9.


## 49. Items Rejected

For the record, the following proposals from the red team were REJECTED:

| Item | Proposal | Reason for rejection |
|---|---|---|
| REJ-01 | Merge standing procedures into DOC23 | Different complexity, effort, and reliability. Four-layer taxonomy is correct. Promotion path exists. |
| REJ-02 | Merge Temporal + Experience dimensions | Answer different questions. Temporal = system state. Experience = behavioral evidence. Shared data point, not merge reason. |
| REJ-03 | Move §33 delivery entirely to DOC24 | Originally rejected as a wholesale move. In R5.4, the split is refined: rendering CONTRACT (schemas, shapes) remains in DOC72 as part of each node type definition. Rendering IMPLEMENTATION and delivery architecture (packet assembly, injection selection, rendering templates, injection tags, retrieval lanes) move to DOC24. The original rejection's principle is preserved: DOC72 still defines the shape of knowledge cards. |
| REJ-04 | Drop UI awareness entirely | Modified: don't track UI elements as durable nodes (accepted). But app entities keep lightweight `known_capabilities` metadata. Semantic intent replaces UI tracking. |
| REJ-05 | Lazy knowledge only | Modified: eager for connected providers (email, calendar, files). Lazy for personal/one-off/new domains. Combine eager + lazy. |
| REJ-06 | Evaluation harness as build prerequisite | Modified: evaluation harness is fully specified (§46) but is not a build gate. Manual testing applies during development; the automated harness runs continuously after first deployment. |
| REJ-07 | Create opposing counsel intelligence profiles | Too granular, not enough data, not domain-agnostic. Actor domain overlays (§4.6) provide sufficient per-actor knowledge without dedicated profiles. |
| REJ-08 | Proactive sonar result injection | Sonar results enrich the graph in background but are surfaced ON DEMAND only. Exception: contradictions go to weekly digest. Prevents information overload. |
| REJ-09 | Standalone decay notifications | No per-item decay alerts. Decay surfaced only in weekly digest, only for deadline-linked items. Prevents notification fatigue. |