ELNOR REPO READER TEXT MIRROR
Original path: Current Specs/DOC80 Memory Control Plane/DAMS_V4_1_CONSOLIDATED_ARCHITECTURE_PROPOSAL.md
Source repo: /Users/OpenClaw1/Elnor/Elnor Specs
Git branch: main
Git commit: dbaa25962edc11ab30e8d4ca1715f9ae5bf77331
Generated: 2026-06-09T01:23:58.539Z
---
# DAMS V4.1 — Consolidated Memory Architecture Proposal
**Working title.** "DAMS V4.1" is a filename, not a claim. This document's central recommendation is that DAMS — the Dynamic Attenuator Memory System — is *one demoted substrate* inside a larger memory architecture. The title is retained only as the successor in the proposal lineage (V2 → meta-review → V3.1 → V4); it will be renamed when the architecture crystallizes.
**Document type.** Proposal, for red-team review. It supersedes `DAMS_PROPOSAL_V2.md`, `DAMS_V2_CONSOLIDATED_META_REVIEW.md`, and the V3.1 proposal as the current architectural statement. It carries reasoning, open questions, and process notes, and it is built to be red-teamed again before the architecture is settled.
**V4.1 vs V4.** Two content additions, no structural change: §B.8 now carries the worked three-Topics walkthrough that establishes the multi-parent acyclic DAG (V4 stated only the conclusion); §C.1 sharpens the "the merge is itself the audit" epistemics (discovery of overlaps is not separable from their resolution).
**What changed from V3.1.** V3.1 corrected the DAMS defects and proposed the 5-kernel architecture. V4 keeps all of that and adds the substantial reframe developed since: (1) the organizing principle is now explicit — memory is a *control problem*, the system's job is to preserve, update, and compile the user's live working state; (2) **Premise Families** become the base semantic object; (3) the retrieval/render kernel is reframed as a **Context Compiler** emitting warranted packets; (4) **recompile-on-resume + cognitive diff** is added; (5) the **thick floor / thin swappable judgment layer** invariant is stated; (6) the **domain-pack** structure is adopted; (7) a new **Knowledge Topic / grouping primitive** is specified; (8) the substrate is explicitly **polytypic** — several first-class memory object kinds, not one universal one. V4 also folds Projects in as a `project`-kind scope object.
**Status of prior documents.** The V3.1 red-team reviews have been received (in summarized form) and are reflected here. The consolidated meta-review and V3.1 are superseded and move to archive on adoption of V4.
---
## How to read this document
V4 keeps the three-part de-fusion that has structured this process — three questions with different evidential status that must not carry each other:
- **Part A — defects (settled).** DAMS V2 bugs plus the bugs found in the meta-review's own fixes. Corrected here.
- **Part B — architecture (strong, with open questions).** The control-problem reframe; the 5-kernel placement discipline; the polytypic substrate; Premise Families; the Context Compiler; Knowledge Topics; the thick/thin invariant.
- **Part C — rebuild scope (narrowed).** The flatten-and-unify direction; the five-phase plan.
A reviewer must not wave Part C through on Part A's strength, nor treat Part B's structure as ratified because Part A's bugs are real. Disagreement is preserved deliberately, including one disagreement an earlier synthesis erased (see Part E).
---
# Part 0 — Executive Summary
**The reframe.** Memory for an agentic personal OS is not a retrieval problem. A retrieval system answers "find me the relevant past item." The hard problem in sustained high-stakes work is different: *preserve, update, and compile the user's live working state* — the current theory of the problem, what is settled, what is ruled out, what is still open — and deliver to the model, each turn, the smallest safe, warranted, admissible context. Memory is a **control problem**: at each step the system chooses to capture, ignore, inject, suppress, verify, interrupt, or consolidate, under uncertainty and under risk. Eight independent blank-slate models converged on this framing; the current agent-memory research literature is converging on it too (memory as a write–manage–read control loop, not RAG-plus-a-graph). V4 adopts it as the organizing principle.
**DAMS is demoted.** DAMS V2 asked one numeric attenuator to carry safety policy, capture priority, injection ranking, delegation, egress, export, rendering, preference, learning signal, category ontology, and mode. That is too much for one scalar. DAMS becomes one substrate — facets, modes, capacity priors, cold-start defaults — inside a ~5-kernel architecture. This was settled in V3.1 and is unchanged.
**The defects are real and include the meta-review's own fixes.** The capacity composition produces broken numbers — but the root cause is a missing primitive (`ScopeRoot`), not the operator. The safety composition is inverted (a floor on a MIN-composed safety axis enforces *minimum permissiveness*). Categorization fails open. And the meta-review's *own* proposed fixes carried three further defects: an additive operator unbounded below, a fail-closed fix that fails *stuck*, and a conflation of write-time and session-time policy. All corrected in Part A.
**The substrate is polytypic.** There is no single universal memory object. The substrate has several first-class kinds, each shaped to what it is: **Premise Families** (true/false/temporal propositions — the new base object for knowledge that can go stale), **Directives** (DOC1-style standing instructions — not claims, not temporal), **Procedures/skills** (capabilities, which *cite* premises for their conditional parts), **evidence/raw records** (the floor), and **goals/obligations/entities**. Forcing everything into one schema is the over-specification this architecture explicitly rejects.
**Two distinct new objects, often confused, kept separate.** A **working-state object** is the live, in-flight model of *one problem being actively worked* — prospective, hot, mostly not extracted from anything. A **Knowledge Topic** is a persistent, accreting *grouping of settled knowledge* around a subject — a curation lens plus an optional standing collection instruction. One is the workbench; the other is the library. They interact (work pulls from Topics; durable premises from work land in Topics) but neither contains the other. Both are first-class.
**The thick/thin invariant.** Build the *floor* — evidence ledger, Premise Families, provenance, temporal validity, authority hierarchy — heavy, durable, and permanent. Build the *judgment layer* — the decisions about what to inject, suppress, warrant, or surface — thin, declarative, and swappable, so it can be a deterministic function now and a learned policy or a stronger model later, without touching the floor. This is the defense against over-specifying the part that next-generation models will absorb.
**The phantom learning engine.** The document the whole stack treats as its learning/attribution engine is a legacy friction logger; the real engine was never written. It must be written from scratch, with its charter widened to also own the memory lifecycle (managed forgetting, consolidation, episode segmentation).
**The rebuild.** The memory specs already overlap each other before any new work is added. The direction is settled: a flatten-and-unify into one function-organized memory specification set, executed as a managed five-phase process (settle architecture → audit → outline → merge plan → execute). The merge is itself the design pass — the moment overlaps become visible and the better of two designs is chosen.
**Disposition:** substantial rework before any build commitment; DAMS demoted; the control-problem reframe adopted as the spine; the polytypic substrate, Premise Families, the Context Compiler, Knowledge Topics, and the thick/thin invariant adopted into Part B; the phantom learning engine written from scratch; the flatten-and-unify plan executed in sequence; a further red-team round on this document before the architecture is settled.
---
# Part 0A — Plain-Language Change List (for the architect)
*Written for the architect, not the reviewers. Functional summary: what each change is called, what it does, why it helps, how valuable it is. Detailed reasoning is in Parts A–C. Reviewers should read Parts A–E; this is the index.*
**Rating scale.** **Tier 1 — Core/high-leverage:** fixes a real defect or fills a real gap; the system is unsafe, broken, or meaningfully worse without it. **Tier 2 — Strong value-add:** genuine improvement, clear benefit, but the system functions without it. **Tier 3 — Worth doing/lower-stakes:** real but modest, or cheap insurance.
## Group 1 — The reframe (new in V4)
- **Memory as a control problem.** `[Tier 1]` The system's job is restated: not "retrieve relevant past items" but "preserve, update, and compile the user's live working state, and deliver the smallest safe context each turn." *Why it helps:* it reorganizes every other decision around the thing that actually matters for deep work — keeping the problem warm and the reasoning uncontaminated — instead of around search. (§B.1)
- **The thick floor / thin swappable judgment layer.** `[Tier 1]` A hard architectural rule: the durable record (evidence, premises, provenance, temporal validity, authority) is built heavy and permanent; the *decisions* on top (inject/suppress/warrant/surface) are built thin and replaceable. *Why it helps:* it future-proofs the system — the judgment layer can be a simple function now and a learned policy later without touching the floor — and it stops the project from over-engineering the part that next-generation models will do for free. (§B.2)
## Group 2 — Composition fixes (corrections to math that is currently wrong; all Tier 1)
- **`ScopeRoot` — a canonical object for "the matter."** `[Tier 1]` The entity, the document corpus, and the project tag for one matter are currently three things the system can't tell are the same matter, so it triple-counts. `ScopeRoot` is one shared object all three reference; co-reference becomes a lookup, not a guess. *Why it helps:* your most-active, best-modeled work stops being penalized by its own good modeling. (§A.1)
- **Bounded composition operator.** `[Tier 1]` The composition math currently runs away in both directions — too high (everything saturates, ranking dies) and, in the meta-review's own fix, too low (a negative score sorts good content *below* junk). Fixed with a symmetric bounded curve. (§A.1)
- **Ceiling-only safety axes.** `[Tier 1]` A "floor" on a safety control currently enforces *minimum* permissiveness — it can force a privileged document to leak. Safety controls become ceilings that can only ever be lowered. *Why it helps:* "this never leaves the device" becomes structural, not a fragile convention. (§A.2)
- **Typed safety decisions (target).** `[Tier 2]` A safety control as a single number can't express "allow but redact" or "reference-only." The target is a typed decision that can. (§A.2)
- **`intake` → `intake_aperture`, direction fixed.** `[Tier 1]` The capture control currently runs backwards. Rename and invert it. (§A.1)
## Group 3 — Safety gaps (the system currently fails in the dangerous direction; all Tier 1)
- **The suppression invariant.** `[Tier 1]` Learning is currently blocked from wrongly *boosting* bad content but not from wrongly *suppressing* good content toward silence — the failure mode that matters most for "forget nothing of value." Fixed: learning may not suppress high-confidence content without a real negative signal. (§A.3)
- **False-suppression measurement.** `[Tier 1]` A wrongly-suppressed memory is invisible. The fix re-tests suppressed content at a sampled rate and logs engagement, producing a real false-suppression rate. (§A.3)
- **Categorization fails closed, not open — and not stuck.** `[Tier 1]` Categorization failure currently defaults to permissive (a leak). Fixed to quarantine from egress until classified — with a terminal fallback so content that never classifies gets a safe conservative category instead of being frozen forever. (§A.4)
## Group 4 — The architecture (the 5 kernels and the substrate)
- **The 5-kernel placement discipline.** `[Tier 1]` The memory system is organized into five concern-areas — policy, temporal/lifecycle, retrieval-and-rendering, learning-and-audit, control surface — as a placement *rule*, not necessarily five documents. *Why it helps:* stops one concept being smeared across many specs, which is what makes every change ripple. (§B.3)
- **The polytypic substrate.** `[Tier 1]` There is no universal memory object. Several first-class kinds — Premise Families, Directives, Procedures, evidence, goals/obligations/entities — each shaped to what it is. *Why it helps:* directives ("call me Will") and procedures don't get forced into a claim schema that doesn't fit them; the system stays flexible. (§B.4)
- **Premise Families — the base object for knowledge.** `[Tier 1]` Knowledge that can go stale is stored not as a flat "fact" but as a scoped, source-bound, temporally-valid proposition with variants over time. *Why it helps:* directly attacks the worst long-term memory failure — obsolete knowledge resurfacing as current. (§B.4)
- **The WorkEpisode layer.** `[Tier 1]` A coherent unit of work ("the Marex MTD drafting session"), crosscutting matters rather than nested in them. *Why it helps:* "where did we land on loss causation" returns one orientation card, not 47 fragments; learning gets real work-units instead of a flat clock. (§B.5)
- **Widen DOC8 to a Learning, Attribution, and Lifecycle Engine.** `[Tier 1]` The learning engine doesn't exist as a spec; since it's being written from scratch, widen its charter to also own the unowned memory lifecycle. (§B.6)
## Group 5 — Delivery (the Context Compiler)
- **The Context Compiler with warrants.** `[Tier 1]` The delivery layer stops "retrieving top-k" and instead *compiles a cognitive packet*: each item carries a role, a warrant (assert/hedge/verify/ask/suppress), a validity window, a source chain, and a reason for inclusion. *Why it helps:* the model gets a safe, minimal, admissible reasoning environment instead of a similarity-ranked pile. (§B.7)
- **Admissibility over relevance.** `[Tier 1]` The retrieval question changes from "is this similar?" to "is this *usable here* — given phase, audience, source authority, temporal status, access rules?" *Why it helps:* structurally prevents the quiet context-contamination that plagues frontier RAG. (§B.7)
- **Recompile-on-resume + cognitive diff.** `[Tier 2]` On returning to a work thread, the system rebuilds state from the ledger and shows a "what changed while you were away / what's still open / what went stale" diff. *Why it helps:* work never goes cold; it's also what makes the working-state object cheap — you recompile it, you don't maintain it live. (§B.7)
- **Cap-and-trade selection with a membership rule.** `[Tier 1]` The injection budget is allocated across active scope buckets so one class of content can't crowd out another; the membership rule (a card bids in its single most-specific scope) prevents double-counting. (§B.7)
## Group 6 — Knowledge Topics and the grouping primitive (new in V4)
- **The Knowledge Topic / grouping primitive.** `[Tier 2]` One general grouping primitive — a named, hierarchical, reference-based *lens* over memory — with a `kind` discriminator. Its flagship kind, "Knowledge Topic," is a standing topical collection (e.g. "Ninth Circuit loss causation law"); the same primitive also expresses workflows, practice areas, and goal-initiatives. *Why it helps:* knowledge stops being trapped in the matter it was first developed in; it becomes reusable across matters. (§B.8)
- **The two faces — lens and standing collection.** `[Tier 2]` A Topic is retrospective (groups what you already know) *and* optionally prospective (carries a standing instruction to the extraction agent to gather matching premises going forward). *Why it helps:* you can say "start remembering Ninth Circuit loss causation law" and have it both fold in old research and catch new material. (§B.8)
- **Extraction-time candidate creation.** `[Tier 2]` Topics are *not* created by the live agent (he doesn't write to memory and shouldn't be interrupted to "notice" groupings). The extraction agent emits Topic Candidates when a multi-signal cluster threshold is met; candidates accrete regardless of review; a `candidate_only` field gates their use. *Why it helps:* the feature "just works" without confirmation friction, and without jamming off-task reasoning into every chat. (§B.8)
## Group 7 — The phantom and what's owed regardless (all Tier 1)
- **Write the learning engine (the real DOC8).** `[Tier 1 — owed regardless]` The document the stack treats as its learning engine is a legacy friction logger; the real engine was never written. Write it from scratch, charter widened to lifecycle. (§A.6, §B.6)
- **Split the over-ceiling documents; extract shared-contract documents.** `[Tier 1]` Local surgery and de-duplication; the actual cure for "every change ripples across five specs." (§C)
## Supporting mechanisms (Tier 3, condensed)
Negative memory / null-result memory; prediction-error-gated capture; reconsolidation as the living-memory trigger; page-fault instrumentation; provenance bisect (a later target); the procedural self-model; evidence-role labels; risk/provenance budgets; anti-learning zones; the Background Memory Coordinator; the policy-snapshot invariant; Attention Budget; Review-Debt Budget; Evidence Support Density. All at §B.9.
## Explicitly dropped
The parametric intuition tier (a whole added layer with a fine-tuning cadence, to buy context savings that high attenuator gains already buy inspectably); the 50-field continuously-maintained working-state kernel (a local model cannot maintain dense state per turn — kept only as a *recompiled projection*); the turn-by-turn Context Compiler (latency and false-positive death — made event-triggered instead); "semantic gravity" as math; literal ADSR envelopes; soft-prompt / KV-cache injection (model-specific; breaks the multi-model design); "memory as social actor"; "the system should sometimes lie to the model." (§B.9)
---
# Part A — The Defects (Settled)
Findings typed `[BUG]` / `[GAP]` / `[SUGGESTION]` / `[CONFIRMED]`. Where the meta-review's *own fix* carried a defect, it is corrected here and labelled.
## A.1 — Capacity composition: the operator is not the root cause
**`[BUG]` — DAMS V2's capacity composition produces broken numbers because correlated nested contributors are not collapsed.** `composeGain` multiplies contributor gains. The §5.3 worked example multiplies `in_project[Marex]` (5.0), the Marex entity (1.5), and the Marex corpus (1.2) into 9.0, saturating the cap. Once two cards both sit near the cap, DAMS contributes nothing to their ordering. The example is genuinely broken.
**Correction to the meta-review — the operator is not the bug.** The meta-review labelled this "the capacity operator is broken" and made an operator swap (multiply → add) its headline fix. That is wrong: additive composition over the *same un-collapsed contributor set* still double-counts, linearly instead of multiplicatively. Product over genuinely independent contributors is fine; additive over genuinely independent contributors is fine. The defect is that the data model has no canonical object that the entity, the corpus, and the project tag all reference.
**Not unanimous.** The meta-review presented this finding as "settled" with reviewers "converged." Grok dissented and endorsed product composition. Grok was wrong on the merits (missed the double-count), but the meta-review's own process discipline (preserve disagreement) required that the dissent be recorded, and it was not. V4 records it. (See Part E.)
**The fix — the `ScopeRoot` primitive.** Introduce a first-class `ScopeRoot` object (the Marex matter is one). Every attenuator-bearing object carries an explicit `scope_root_ref?`. Co-reference becomes a hash-equality check, reliable by construction, not a graph-similarity guess. Contributors sharing a `scope_root_ref` collapse to one before scoring; contributors with different roots or no root contribute independently — which is the correct answer for a genuinely shared corpus used across matters. Legacy objects with no `scope_root_ref`: do not collapse (conservative — a missed collapse slightly over-boosts; a wrong collapse silently mis-attributes). `ScopeRoot` also supplies cap-and-trade its bucket key (§B.7). **Open question:** is `scope_root_ref` reliably *populated* in practice — by matter-creation and corpus-creation templates? If population is unreliable the fix degrades to the conservative fallback. (Open Question 2.)
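The collapse step can be sketched as follows. This is a minimal illustration, not the specified behavior: the `Contributor` record, the `collapse_by_scope_root` helper, and the example `scope:` identifiers are hypothetical, and the rule for *which* co-referent contributor survives (here, the one deviating most from the neutral gain of 1.0) is an assumption the proposal does not settle.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Contributor:
    name: str
    gain: float
    scope_root_ref: Optional[str] = None  # None models a legacy object with no root


def collapse_by_scope_root(contributors):
    """Collapse contributors that share a scope_root_ref into one representative.

    ASSUMPTION: the representative is the contributor whose gain deviates most
    from the neutral 1.0. The proposal only says co-referent contributors
    "collapse to one"; it does not say which one survives.
    """
    by_root = {}
    independent = []
    for c in contributors:
        if c.scope_root_ref is None:
            # Conservative fallback from the text: rootless objects never collapse.
            independent.append(c)
            continue
        best = by_root.get(c.scope_root_ref)
        if best is None or abs(c.gain - 1.0) > abs(best.gain - 1.0):
            by_root[c.scope_root_ref] = c
    return list(by_root.values()) + independent


# The three Marex contributors share one root; the shared corpus and the
# legacy tag do not, so they keep contributing independently.
marex = [
    Contributor("in_project[Marex]", 5.0, "scope:marex"),
    Contributor("entity:Marex", 1.5, "scope:marex"),
    Contributor("corpus:Marex", 1.2, "scope:marex"),
    Contributor("corpus:shared", 1.3, "scope:shared"),
    Contributor("legacy-tag", 1.1),
]
collapsed = collapse_by_scope_root(marex)
```

Because co-reference is a plain key comparison on `scope_root_ref`, the triple-count in the Marex example disappears before any operator runs, whichever operator is chosen.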
**The operator, corrected and bounded both ends.** With co-referent contributors collapsed, the operator over the remaining independent contributors is additive-deviation with **symmetric soft-knee saturation**: `S = Σ wᵢ·(gainᵢ−1.0); effective = 1.0 + softknee(S); effective = clamp(effective, 0.0, MAX)`. **Correction to the meta-review — its additive operator was unbounded below:** fifty suppressive tags at gain 0.5 sum to a deviation of −25 at unit weights, so the unclamped gain is negative. A negative gain flips the sign of a card's score and inverts the ordering among the affected cards, so that stronger content sorts *below* weaker content. The soft knee must apply symmetrically and the result must clamp at 0.
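A minimal sketch of the bounded operator follows. The text does not define `softknee`, so the scaled `tanh` used here is an assumed stand-in chosen only because it is symmetric and saturates at both ends; `MAX_GAIN`, `KNEE`, and the function name `compose_capacity_gain` are likewise illustrative values and names, not specified ones.

```python
import math

MAX_GAIN = 4.0  # ASSUMPTION: illustrative value for MAX
KNEE = 3.0      # ASSUMPTION: illustrative saturation scale


def softknee(s, knee=KNEE):
    """Symmetric saturation. ASSUMPTION: a scaled tanh stands in for the
    unspecified soft-knee curve; it is odd-symmetric and bounded by +/- knee."""
    return knee * math.tanh(s / knee)


def compose_capacity_gain(gains, weights=None):
    """Additive-deviation composition over independent contributors.

    S = sum(w_i * (gain_i - 1.0)); effective = 1.0 + softknee(S);
    the result is clamped to [0.0, MAX_GAIN].
    """
    if weights is None:
        weights = [1.0] * len(gains)
    s = sum(w * (g - 1.0) for w, g in zip(weights, gains))
    effective = 1.0 + softknee(s)
    return min(max(effective, 0.0), MAX_GAIN)


# Fifty suppressive tags at gain 0.5: S = -25, but the composed gain
# bottoms out at 0.0 instead of going negative.
heavily_suppressed = compose_capacity_gain([0.5] * 50)
# Many boosting tags saturate at the cap instead of running away.
heavily_boosted = compose_capacity_gain([5.0] * 10)
```

With these assumed constants, the soft knee does most of the bounding and the final clamp guarantees the floor of 0 regardless of how the knee is tuned.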
**`intake` → `intake_aperture`.** `intake` and `injection` are both labelled "capacity" but act on opposite sides of the threshold inequality. Rename `intake` to `intake_aperture` with intuitive direction (higher = more capture). `incognito` setting intake toward 0 currently *captures all noise* — this makes it a `[BUG]`, not a cosmetic rename.
## A.2 — Safety composition: inverted
**`[BUG]` — A `floor` on a MIN-composed safety axis enforces minimum permissiveness.** On a safety axis, higher gain = more permissive; a `floor` is therefore a *minimum permissiveness*. Privileged content pinned to zero egress can be dragged up by an unrelated scope tag's nonzero floor. Independently triangulated by three reviewers — the most-confirmed finding in the set.
**Fix — ceiling-only safety axes now; typed decisions as the target.** Make `AxisConfiguration` axis-class-aware. Capacity axes: `{baseline_gain, floor, ceiling}` with the soft-knee operator. Safety axes: `{permit_ceiling}` only, composition `min` over ceilings — MIN over ceilings cannot raise a value, so inviolability is structural. The safety-pin pattern collapses to `permit_ceiling: 0`. **This is a minimal patch.** A scalar ceiling cannot express `redact_then_allow`, `reference_only`, `local_only` vs `redacted_cloud`. The target is a typed `MemoryPolicyDecision` (`deny | local_only | redact_then_allow | reference_only | allow_with_obligations | allow`). If the Part B architecture is adopted, the typed decision is the target and ceiling-only should be skipped. (Open Question 1.)
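The structural property claimed for ceiling-only axes can be illustrated in a few lines. This is a sketch under assumptions: the function name `compose_permit_ceiling` is hypothetical, and treating an empty contributor set as `0.0` is an assumption made here to match the fail-closed direction of §A.4. The enum simply lists the decision values named in the text, without implying any ordering among them.

```python
from enum import Enum


class MemoryPolicyDecision(Enum):
    """The typed decision values named in the text (no ordering implied)."""
    DENY = "deny"
    LOCAL_ONLY = "local_only"
    REDACT_THEN_ALLOW = "redact_then_allow"
    REFERENCE_ONLY = "reference_only"
    ALLOW_WITH_OBLIGATIONS = "allow_with_obligations"
    ALLOW = "allow"


def compose_permit_ceiling(ceilings):
    """Compose a safety axis as MIN over permit ceilings.

    ASSUMPTION: an empty contributor set composes to 0.0 (fail closed)
    rather than to a permissive default.
    """
    return min(ceilings, default=0.0)


# A privileged item pinned to permit_ceiling 0 stays at 0 no matter what
# other scope tags contribute.
privileged = compose_permit_ceiling([0.0, 1.0, 0.8])
```

Adding a contributor to a `min` can only keep the result or lower it, which is why the safety pin reduces to `permit_ceiling: 0` with no special casing. What the scalar cannot do is distinguish, say, `redact_then_allow` from `reference_only`, which is the gap the typed decision is meant to close.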
## A.3 — The dangerous direction is unguarded — and unmeasured
**`[GAP]` — No invariant bounds capacity-axis suppression.** DAMS V2 forbids learning from *raising* gain for low-confidence content; nothing stops learning, drift, or sycophantic feedback driving a valuable category's `injection` gain toward 0. For a system whose north star is "forget nothing of value," false suppression is the dominant failure mode.
**Fix — the suppression invariant.** The learning engine may not drive a content item's composed `injection` gain below `SUPPRESSION_FLOOR` (≈0.5) when confidence ≥ 0.7, unless there is a corroborating negative signal — defined as an explicit user dismissal or down-rank, **not** mere non-engagement. Suppressed content gets a periodic low-probability re-test injection.
**Extension — the re-test quota is also the measurement instrument.** False suppression is invisible by construction: a wrongly-suppressed card never appears and leaves no trace. Run the re-test quota as a **sampling estimator**: sample suppressed content at a known rate and log engagement on the re-tests; the fraction of re-tested cards the user engages with is a measured false-suppression rate. This is the missing arm of the measurement story (§B.9).
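The invariant and the estimator can be sketched together. The thresholds come from the text (`SUPPRESSION_FLOOR` is given only approximately); the function names, the boolean encoding of a negative signal, and the per-item Bernoulli sampling are assumptions made for illustration.

```python
import random

SUPPRESSION_FLOOR = 0.5     # from the text, stated as approximate
CONFIDENCE_THRESHOLD = 0.7  # from the text


def apply_learned_gain(proposed_gain, confidence, has_negative_signal):
    """Return the composed injection gain the learning engine may set.

    has_negative_signal means an explicit dismissal or down-rank;
    mere non-engagement must be passed as False.
    """
    if (proposed_gain < SUPPRESSION_FLOOR
            and confidence >= CONFIDENCE_THRESHOLD
            and not has_negative_signal):
        return SUPPRESSION_FLOOR
    return proposed_gain


def sample_for_retest(suppressed_ids, rate, rng):
    """Pick suppressed items for a low-probability re-test injection.

    ASSUMPTION: independent per-item sampling at a known rate.
    """
    return [item_id for item_id in suppressed_ids if rng.random() < rate]


def estimate_false_suppression_rate(retest_outcomes):
    """retest_outcomes: booleans, True when the user engaged with a re-tested card.

    Returns None when there are no re-tests to measure.
    """
    if not retest_outcomes:
        return None
    return sum(retest_outcomes) / len(retest_outcomes)


rng = random.Random(0)  # seeded only so the sketch is repeatable
retest_batch = sample_for_retest(range(1000), 0.02, rng)
```

Keeping the sampling rate explicit is what turns the re-test quota from a safety valve into an estimator: the engagement fraction is only interpretable if the rate at which suppressed content was sampled is known.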
## A.4 — Categorization fails open — and the meta-review's fix fails stuck
**`[BUG]` — The empty-contributor branch returns a permissive 1.0 on safety axes.** An un-categorized entity composes to permissive egress. Categorization failure fails open. Independently triangulated.
**Fix, corrected — quarantine with a terminal fallback.** An entity is created immediately, capacity-neutral but **egress-quarantined** (`permit_ceiling: 0` on `network_egress` and `agent_delegation`) until categorization completes. **Correction to the meta-review — its quarantine fix fails *stuck*:** it had no terminal branch for categorization that never completes (LLM budget exhausted, repeated failure), permanently quarantining usable content. The corrected fix adds a terminal fallback: after N failed attempts, assign a deterministic conservative category from cheap signals (corpus, surface, source domain). Fail-closed is right; fail-stuck is not.
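The difference between fail-closed and fail-stuck is easiest to see as a small state step. The sketch below is illustrative only: `Entity`, `step_categorization`, the callables passed in, and the value of `MAX_ATTEMPTS` are hypothetical (the text says "N failed attempts" without fixing N), and how ceilings are derived from a category is left to a caller-supplied lookup.

```python
from dataclasses import dataclass
from typing import Optional

MAX_ATTEMPTS = 3  # ASSUMPTION: the text leaves N unspecified


@dataclass
class Entity:
    name: str
    category: Optional[str] = None
    attempts: int = 0
    # Quarantined at creation: permit_ceiling 0 on both egress-type axes.
    egress_ceiling: float = 0.0
    delegation_ceiling: float = 0.0


def step_categorization(entity, llm_classify, cheap_fallback, ceilings_for):
    """Run one categorization attempt; lift quarantine only once a category exists.

    llm_classify(entity)   -> category or None; may raise on failure
    cheap_fallback(entity) -> deterministic conservative category
    ceilings_for(category) -> (egress_ceiling, delegation_ceiling)
    """
    if entity.category is not None:
        return entity
    entity.attempts += 1
    try:
        category = llm_classify(entity)
    except Exception:
        # Broad catch keeps the sketch short; a real system would be narrower.
        category = None
    if category is None and entity.attempts >= MAX_ATTEMPTS:
        # Terminal branch: without it the entity would stay quarantined forever.
        category = cheap_fallback(entity)
    if category is not None:
        entity.category = category
        entity.egress_ceiling, entity.delegation_ceiling = ceilings_for(category)
    return entity
```

Until a category is assigned, both ceilings stay at 0, so a classifier outage cannot leak content; after the assumed attempt limit, the deterministic fallback guarantees the entity leaves quarantine with a conservative category instead of staying frozen.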
**`[BUG]` — The per-category `intake` axis has a sequencing paradox.** `intake` keys off a category produced by deep extraction, but the significance gate that `intake` modifies runs *first*. Circular. Fix: `intake_aperture` may key only off scope cheap to determine at gate time (corpus membership, alias-matched entity link, surface, source domain). DAMS needs an explicit **axis-applicability matrix** stating which scope kinds and lifecycle stages each axis may key off.
**Categorization as a two-writer pipeline.** Run categorization as two independent writers — a deterministic writer (corpus, alias, surface, source domain; high precision/low recall) and the LLM writer (high recall/lower precision) — and **log every disagreement as a first-class signal**. Disagreement is a near-free detector of a bad category definition, a drifting prompt, or adversarial content, and yields the learning engine a labeled training stream.
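A rough sketch of the two-writer pattern, with disagreement captured as data rather than discarded. The function name, the log record shape, and in particular the tie-break (the deterministic writer wins whenever it produces an answer) are assumptions; the text specifies only that both writers run independently and that every disagreement is logged.

```python
def categorize_two_writer(item, deterministic_writer, llm_writer, disagreement_log):
    """Run both writers independently and log any disagreement.

    deterministic_writer(item) -> category or None (high precision, low recall)
    llm_writer(item)           -> category or None (high recall, lower precision)
    """
    det = deterministic_writer(item)
    llm = llm_writer(item)
    if det is not None and llm is not None and det != llm:
        # First-class signal: a bad category definition, a drifting prompt,
        # or adversarial content would all show up here.
        disagreement_log.append({"item": item, "deterministic": det, "llm": llm})
    # ASSUMPTION: prefer the high-precision writer when it has an answer.
    return det if det is not None else llm


log = []
by_corpus = lambda item: "privileged" if "privileged" in item else None
by_llm = lambda item: "general"  # stand-in for a real model call
first = categorize_two_writer("privileged memo", by_corpus, by_llm, log)
second = categorize_two_writer("lunch note", by_corpus, by_llm, log)
```

Each logged record pairs the two outputs for the same item, which is what makes the log usable as a labeled stream for the learning engine.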
## A.5 — Schema and specification defects (carried, condensed)
Temporal-window fields are dead schema or carry up-to-24h latency against a nightly bundle (resolve: runtime composition takes a clock, or short-lived windows are excluded and the latency documented); a "pin" pins a factor not the composed result; threshold re-calibration after the DAMS factor widens the score distribution is unowned; `ResolvedAttenuator` is referenced four times and never defined (and the meta-review built `CompiledAttenuatorBundle` on it without noticing); `share_export` conflates internal carryover with external export; §6 egress seeds are on the wrong abstraction (per entity *kind* when egress risk is a function of *sensitivity*); the `memory_directive` egress default of 1.0 sends correction and vocabulary directives to cloud by default; the 22 acceptance tests are names without inputs, oracles, or pass-conditions; the `user_principal` axis is omitted from the default-for-kind table. All owed in rework.
## A.6 — The DOC8 phantom
**`[CONFIRMED]` — The DOC8 phantom finding is correct and is the proposal's best work.** The document under that number is a ~730-line legacy friction logger; a grep for `shapley | attribution | utility scor | policy bundle` returns zero hits, yet DOC24, DOC72, DOC73, DOC23, and BDSM all reference "DOC8" as the attribution/learning engine. That engine was never written. The proposal's response — writing the dependency as a requirements contract rather than as "DOC8 integration" — is the no-phantom discipline working. **Stale sub-claim corrected:** the "unresolved DOC8/BDSM Shapley boundary" sub-claim cites BDSM V6.4; current BDSM V6.5 has drawn that boundary. The phantom stands; the boundary sub-claim does not.
---
# Part B — The Architecture (Strong, With Open Questions)
## B.1 — The organizing principle: memory is a control problem
The system's job is **not** "retrieve relevant past information." It is: *preserve, update, and compile the user's live working state, and deliver to the model each turn the smallest safe, warranted, admissible context.* A cutting-edge memory system is modelled as a controller — at each step it chooses among capture, ignore, ask, inject, suppress, verify, interrupt, consolidate, under uncertainty and risk. The criterion for "memory working" becomes *did it reduce bad reasoning-state transitions* — not "did it retrieve something relevant."
This reframe is adopted as the spine. It was independently converged on by eight blank-slate models and tracks the current research direction (agent memory as a write–manage–read control loop). It does not discard V3.1's content — it reorganizes it: the five kernels still hold, `ScopeRoot` still holds, the demotion of DAMS still holds. What changes is the *center of gravity*: the working-state object moves from a flagged candidate to the thing the architecture is organized around.
## B.2 — The thick floor / thin swappable judgment layer
A hard architectural invariant. Every memory operation has a **floor** part and a **judgment** part.
The **floor** is the durable factual record — this evidence exists, from this source, this claim was valid over this interval, this was superseded by that, the user corrected this on this date, this is privileged. It is *recorded*, not *decided*. Build it heavy, inspectable, permanent.
The **judgment layer** is every *decision* on top — is this worth injecting, asserted or hedged, admissible here, would it distract, should the system interrupt. Build it **thin, declarative, and swappable**: a defined component behind a defined interface, so it can be a deterministic function today and a learned policy or a stronger model tomorrow, *without touching the floor*.
This is the defense against the failure mode that sank DAMS V2 and that the follow-up ideation nearly repeated: over-specifying the part that next-generation models will absorb for free. Invest permanently in the floor — no model "absorbs" your private institutional record. Keep the judgment layer replaceable. The self-learning system (§B.6) is *how* the judgment layer improves over time — learning and the swap mechanism are the same seam.
## B.3 — The 5-kernel placement discipline
DAMS V2 asked one scalar to carry eleven kinds of decision. The architecture is organized into five kernels — **policy**, **temporal/lifecycle**, **retrieval-and-render planning**, **learning-and-audit**, **control surface** — adopted as a **placement discipline**, not a document architecture: every mechanism is assigned to exactly one kernel; a mechanism that wants two kernels is a design smell to resolve. This captures the genuine value of the "Memory Control Plane" reframe (it correctly names that DAMS is flow-control and lifecycle is a separate concern) without ratifying an eight-subsystem document suite — which, as the meta-review itself admitted, "out-accreted its alternatives rather than out-reasoning them." Reviewers' independent collapses converged near five. (Open Question 5: the kernel count and boundaries remain open to red-team.)
## B.4 — The polytypic substrate
**There is no single universal memory object.** The substrate has several first-class kinds, each shaped to what it actually is. Forcing one schema onto all of them is the over-specification this architecture rejects.
**Premise Families** — the base object for *knowledge that can go stale*. Not a flat "fact" but a scoped, source-bound, temporally-valid proposition with variants over time: a canonical question, a current statement, domain, scope (work objects, people, jurisdictions), temporal fields (`valid_from/to`, `learned_at`, `superseded_by`, `stale_after`), an authority block (source refs, source rank, contrary sources), applicability conditions, a use policy (default reliance: assert/hedge/suggest/verify-first/orientation-only/suppress), and a governance lifecycle (`candidate | active | contested | superseded | historical_only | withdrawn`). This directly attacks the worst long-term-memory failure: obsolete knowledge resurfacing as current. It also resolves V3.1's `field_path`/static-fact tangle — a Premise Family is *natively* temporal and scoped. Research backing: forgetting-aware memory benchmarks (Memora and similar) penalize exactly the obsolete-reuse failure this object prevents.
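One way to picture the fields listed above is as a typed record. This is a sketch, not the schema: the class and attribute names are hypothetical spellings of the fields the paragraph names, the authority block is reduced to `source_refs` for brevity, and `usable_as_current` encodes one plausible reading of "current" that the proposal does not define.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import Optional


class Lifecycle(Enum):
    CANDIDATE = "candidate"
    ACTIVE = "active"
    CONTESTED = "contested"
    SUPERSEDED = "superseded"
    HISTORICAL_ONLY = "historical_only"
    WITHDRAWN = "withdrawn"


class Reliance(Enum):
    ASSERT = "assert"
    HEDGE = "hedge"
    SUGGEST = "suggest"
    VERIFY_FIRST = "verify-first"
    ORIENTATION_ONLY = "orientation-only"
    SUPPRESS = "suppress"


@dataclass
class PremiseFamily:
    canonical_question: str
    current_statement: str
    domain: str
    learned_at: datetime
    lifecycle: Lifecycle = Lifecycle.CANDIDATE
    default_reliance: Reliance = Reliance.HEDGE
    scope: list = field(default_factory=list)        # work objects, people, jurisdictions
    source_refs: list = field(default_factory=list)  # simplified authority block
    valid_from: Optional[datetime] = None
    valid_to: Optional[datetime] = None
    superseded_by: Optional[str] = None
    stale_after: Optional[datetime] = None

    def usable_as_current(self, now):
        """ASSUMPTION: 'current' means active, not superseded, inside its
        validity window, and not past stale_after."""
        if self.lifecycle is not Lifecycle.ACTIVE or self.superseded_by is not None:
            return False
        if self.valid_from is not None and now < self.valid_from:
            return False
        if self.valid_to is not None and now > self.valid_to:
            return False
        if self.stale_after is not None and now > self.stale_after:
            return False
        return True
```

The point of the check is that staleness is answered from fields on the record itself, so an obsolete premise cannot be presented as current simply because it is similar to the query.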
**Directives** — DOC1-style standing instructions ("call me Will," "never do X"). *Not* claims: no truth value, no counter-evidence, no supersession-by-better-claim. Changed only by the user, governed by DOC1's existing approval discipline. The Premise Family framing must **not** be forced onto directives — filling `valid_from`/`superseded_by` on a directive is schema cosplay. Directives already work; the reframe does not touch them.
**Procedures / skills** — capabilities (how-to). Not claims. A procedure can go stale ("does this still work") but that is not "is this still true." A procedure's *steps* are capability content; a procedure's *applicability conditions* are premise-like. So a procedure is its own object that **cites** Premise Families for its conditional parts — it is not itself a Premise Family.
**Evidence / raw records** — the floor (§B.2). Append-only, source-preserving.
**Goals, obligations, entities** — DOC72's other node kinds. Some are premise-adjacent (an obligation has temporal character), some are not (an entity is just a thing). They remain distinct kinds.
The discipline: the substrate is polytypic. Premise Families are the most important *new* kind, not the *only* kind. "Everything is a Premise Family" is a creep to resist.
## B.5 — The WorkEpisode layer
A **WorkEpisode** is a coherent unit of work — "the Marex MTD drafting session," "the research burst on price impact." It gives the learning loop event-segmented structure instead of a flat 14-day clock and gives retrieval a temporal handle.
**Episodes crosscut the scope graph; they do not nest in it.** Every prior episode proposal implicitly treated an episode as "smaller than a matter." A litigator's morning touching Marex, then Henderson, then a new-intake call is *one* episode over three matters. Model an episode as an independent temporal partition carrying a *weighted set* of touched `scope_root_ref`s — not a parent matter. The active episode's weighted scope-root set is what drives cap-and-trade allocation.
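A minimal sketch of the crosscutting model, assuming an episode simply accumulates raw touch weights per `scope_root_ref` and normalizes them on read. The class name, the `touch` method, and the weighting scheme are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class WorkEpisode:
    episode_id: str
    # scope_root_ref -> accumulated raw weight; there is no parent matter field
    touches: dict[str, float] = field(default_factory=dict)

    def touch(self, scope_root_ref: str, weight: float = 1.0) -> None:
        self.touches[scope_root_ref] = self.touches.get(scope_root_ref, 0.0) + weight

    def scope_weights(self) -> dict[str, float]:
        """Normalized weights, usable as bucket shares by the selection step."""
        total = sum(self.touches.values())
        if total == 0:
            return {}
        return {ref: w / total for ref, w in self.touches.items()}
```

In the litigator example, one episode touching Marex twice and Henderson and the intake call once each would yield shares of 0.5, 0.25, and 0.25.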
**Disciplines:** episode summaries are `framing_only`, not citable evidence, unless explicitly source-backed; boundary detection is conservative — hard boundaries (mode enter/exit, task-run start/end) are reliable, soft boundaries (topic shift, idle gap) produce a checkpoint and a user-confirmable suggestion only, never autonomous restructuring; a compressed episode summary must always render *with* its child source-backed cards (a summary that silently drops a failed path or unresolved loop is a missed-issue exposure).
**Placement:** boundary detection is an intake-time concern; episode storage is a DOC72 node kind; salience replay and consolidation are learning-engine work (§B.6).
## B.6 — The Learning, Attribution, and Lifecycle Engine
The phantom DOC8 must be written from scratch. Its charter is **widened** beyond learning/attribution to also own the **memory lifecycle** — managed forgetting, consolidation, episode-salience replay, conflict-resolution coordination — because that lifecycle function currently has no coordinating owner and DOC8 is green-field, so widening its charter costs nothing and avoids standing up another subsystem.
It owns: trace ingestion, counterfactual baselines, reliance attribution, absence learning, pattern detection, canaries, candidate promotion, rollback proposals — and salience-based consolidation, replay, and lifecycle-transition coordination. It does *not* own raw truth (DOC72). Lifecycle *state* lives on the memory nodes; the engine owns the *transition logic*; EC executes writes. Forgetting decomposes into five distinct operations — do-not-inject, demote-to-explicit-recall-only, archive-audit-only, supersede, delete/redact-where-policy-permits — not one vague "managed forgetting."
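The ownership split (state on the node, transition logic in the engine, writes through EC) can be sketched as follows. All class names are hypothetical stand-ins, and the two rules inside `propose` are placeholder assumptions; the sketch only shows that the engine returns a proposal and never mutates a node.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ForgetOp(Enum):
    DO_NOT_INJECT = "do_not_inject"
    DEMOTE_TO_EXPLICIT_RECALL_ONLY = "demote_to_explicit_recall_only"
    ARCHIVE_AUDIT_ONLY = "archive_audit_only"
    SUPERSEDE = "supersede"
    DELETE_OR_REDACT = "delete_or_redact"


@dataclass
class MemoryNode:
    node_id: str
    lifecycle_state: str = "active"   # lifecycle state lives on the node


@dataclass(frozen=True)
class TransitionProposal:
    node_id: str
    op: ForgetOp
    reason: str


class LifecycleEngine:
    """Owns transition logic only. It reads nodes and returns proposals."""

    def propose(self, node: MemoryNode, *, superseded_by: Optional[str] = None,
                delete_requested: bool = False,
                policy_permits_delete: bool = False) -> Optional[TransitionProposal]:
        if delete_requested:
            if not policy_permits_delete:
                return None   # delete/redact only where policy permits
            return TransitionProposal(node.node_id, ForgetOp.DELETE_OR_REDACT,
                                      "delete requested and permitted by policy")
        if superseded_by is not None:
            return TransitionProposal(node.node_id, ForgetOp.SUPERSEDE,
                                      f"superseded by {superseded_by}")
        return None


class ExecutionController:
    """Stand-in for EC: the only component that writes."""

    def __init__(self, nodes: list[MemoryNode]):
        self.nodes = {n.node_id: n for n in nodes}

    def apply(self, proposal: TransitionProposal) -> None:
        self.nodes[proposal.node_id].lifecycle_state = proposal.op.value
```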
The learning loop's *target* sharpens under the control reframe: learn from **counterfactual contribution** (did injecting this memory improve the reasoning state — replay tasks with and without it) rather than from "was this relevant." The engine's outputs feed the **swappable judgment layer** (§B.2): self-learning *is* how the judgment layer improves.
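As a sketch of the counterfactual signal, assume a hypothetical `run_task` callable that replays a task against a given set of injected memory ids and returns a quality score. The contribution of one memory is then the score difference between the two replays.

```python
from typing import Callable, FrozenSet


def counterfactual_contribution(
    run_task: Callable[[FrozenSet[str]], float],
    injected: FrozenSet[str],
    memory_id: str,
) -> float:
    """Score delta from replaying one task with and without a single memory."""
    with_memory = run_task(injected | {memory_id})
    without_memory = run_task(injected - {memory_id})
    return with_memory - without_memory
```

A memory can be highly "relevant" and still score zero here if the reasoning state is no better with it than without it, which is the distinction the paragraph above draws.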
## B.7 — The Context Compiler
The retrieval-and-render kernel is reframed: it does not "retrieve top-k memories." It **compiles a cognitive packet** for the turn.
**Admissibility over relevance.** The question is not "is this similar?" but, in order: is it *admissible* here (phase, audience, access, temporal status)? what is its *role* (fact, constraint, hypothesis, warning, source, procedure)? what is its *temporal status*? what is its *allowed action* (assert, hedge, verify, ask, suppress)? what is the *cost* of injecting it (anchoring, crowd-out, contamination)? what goes wrong if *omitted*? The right packet is often not the most relevant packet — it is the one with the highest expected improvement to reasoning integrity.
**Warranted packet.** Every injected item carries a role, a **warrant** (`assert | hedge | suggest | verify | ask | suppress`), a validity window, a source chain, and a reason for inclusion. The compiler also emits explicit `suppress` entries (with reason) and `ask_user` entries. Default-to-non-injection when a memory is relevant-but-risky: in serious work the failure mode is quiet contamination.
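A minimal sketch of a warranted packet, assuming admissibility and risk have already been judged upstream and arrive as boolean flags on each candidate. The type names and the `risky` flag are hypothetical; the warrant vocabulary is the one listed above.

```python
from dataclasses import dataclass
from typing import Optional

WARRANTS = ("assert", "hedge", "suggest", "verify", "ask", "suppress")


@dataclass(frozen=True)
class Candidate:
    memory_id: str
    role: str                   # fact, constraint, hypothesis, warning, source, procedure
    proposed_warrant: str
    admissible: bool            # phase, audience, access, temporal status
    risky: bool                 # assumed flag: anchoring, crowd-out, contamination
    source_chain: tuple = ()
    valid_from: Optional[str] = None
    valid_to: Optional[str] = None


@dataclass(frozen=True)
class PacketItem:
    memory_id: str
    role: str
    warrant: str
    valid_from: Optional[str]
    valid_to: Optional[str]
    source_chain: tuple
    reason: str


def compile_packet(candidates: list[Candidate]) -> list[PacketItem]:
    """Every candidate yields an entry; suppressed items stay visible with a reason."""
    packet = []
    for c in candidates:
        if c.proposed_warrant not in WARRANTS:
            raise ValueError(f"unknown warrant: {c.proposed_warrant}")
        if not c.admissible:
            warrant, reason = "suppress", "not admissible in this context"
        elif c.risky:
            warrant, reason = "suppress", "relevant but risky: default to non-injection"
        else:
            warrant, reason = c.proposed_warrant, "admissible and within risk budget"
        packet.append(PacketItem(c.memory_id, c.role, warrant,
                                 c.valid_from, c.valid_to, c.source_chain, reason))
    return packet
```

Keeping explicit `suppress` entries in the output, instead of silently dropping items, is what makes a "why not remembered?" diagnostic answerable later.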
**Event-triggered, not per-turn.** The compiler does **not** run a full belief-consistency check on every turn — that is latency death and false-positive death (a compiler that cries wolf gets disabled within a week). Heavy recompilation is triggered by high-signal events: a change in the active goal, a task-mode shift (explore→decide), a long time gap, a detected contradiction, a new controlling source. Ordinary turns get light cached patches.
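The trigger logic can be sketched as a cheap predicate over per-turn signals. The signal names and the four-hour gap threshold are illustrative assumptions, not proposed values.

```python
from dataclasses import dataclass

LONG_GAP_SECONDS = 4 * 3600   # hypothetical threshold for "a long time gap"


@dataclass(frozen=True)
class TurnSignals:
    goal_changed: bool = False
    mode_shift: bool = False              # e.g. explore -> decide
    idle_seconds: float = 0.0
    contradiction_detected: bool = False
    new_controlling_source: bool = False


def recompile_mode(s: TurnSignals) -> str:
    """Heavy recompilation only on a high-signal event; otherwise a cached patch."""
    high_signal = (
        s.goal_changed
        or s.mode_shift
        or s.idle_seconds >= LONG_GAP_SECONDS
        or s.contradiction_detected
        or s.new_controlling_source
    )
    return "heavy_recompile" if high_signal else "light_cached_patch"
```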
**Recompile-on-resume + cognitive diff.** The working-state object is *not* a continuously maintained 50-field kernel — a local model cannot maintain dense compounding state per turn. It is a **compiled projection**: on resume (or a high-signal event), the system rebuilds state from the append-only ledger and presents a **cognitive diff** — what changed while you were away, what is still open, what went stale, where you left off. This is what makes the working-state object cheap enough to build.
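A minimal sketch of recompile-on-resume, assuming a ledger of ordered events with a small hypothetical set of event kinds. State is never stored; it is projected from the ledger at two points and the two projections are compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LedgerEvent:
    seq: int
    kind: str      # hypothetical kinds: open_loop, close_loop, went_stale, checkpoint
    subject: str


def project(ledger: list[LedgerEvent], upto_seq: int) -> dict:
    """Rebuild working state from the append-only ledger (assumed ordered by seq)."""
    open_loops, stale, checkpoint = set(), set(), None
    for e in ledger:
        if e.seq > upto_seq:
            continue
        if e.kind == "open_loop":
            open_loops.add(e.subject)
        elif e.kind == "close_loop":
            open_loops.discard(e.subject)
        elif e.kind == "went_stale":
            stale.add(e.subject)
        elif e.kind == "checkpoint":
            checkpoint = e.subject
    return {"open_loops": open_loops, "stale": stale, "checkpoint": checkpoint}


def cognitive_diff(ledger: list[LedgerEvent], last_seen_seq: int) -> dict:
    before = project(ledger, last_seen_seq)
    latest = max((e.seq for e in ledger), default=0)
    now = project(ledger, latest)
    return {
        "changed_while_away": [f"{e.kind}:{e.subject}" for e in ledger if e.seq > last_seen_seq],
        "still_open": sorted(now["open_loops"]),
        "went_stale": sorted(now["stale"] - before["stale"]),
        "left_off_at": before["checkpoint"],
    }
```

Because the projection is a pure function of the ledger, nothing has to be kept consistent turn by turn, which is the cost argument made above.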
**Cap-and-trade selection with a membership rule.** Score with the corrected additive operator (§A.1), then *select* by allocating the injection budget across active scope-root buckets, so one content class cannot crowd out another. The membership rule (the piece no prior document closed): a card bids in the bucket of its **single most-specific scope root**, precedence scope-root > operational > role facet; sensitivity is a **gate, not a bucket**. Override lanes (`must_include_authority_update`, `must_include_user_correction`, `must_include_unresolved_loop`) and a small reserved analogy/transfer lane prevent starvation of rare critical content.
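The membership rule and bucketed allocation can be sketched as below. This is an illustration under several assumptions: the budget is counted in cards, not tokens; scoring has already happened; specificity is a precomputed integer depth; bucket shares are floored, so rounding slack goes unused; and override-lane cards are admitted first and assumed to be few. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Card:
    card_id: str
    score: float
    scope_roots: tuple = ()               # ((scope_root_ref, specificity_depth), ...)
    operational: Optional[str] = None
    role_facet: Optional[str] = None
    sensitivity_ok: bool = True
    override_lane: Optional[str] = None   # e.g. "must_include_user_correction"


def bucket_of(card: Card) -> str:
    """Single bucket per card: scope root > operational > role facet."""
    if card.scope_roots:
        return max(card.scope_roots, key=lambda r: r[1])[0]   # most specific root
    if card.operational:
        return "op:" + card.operational
    if card.role_facet:
        return "role:" + card.role_facet
    return "unscoped"


def select(cards: list[Card], bucket_weights: dict[str, float], budget: int) -> list[Card]:
    eligible = [c for c in cards if c.sensitivity_ok]          # a gate, not a bucket
    chosen = [c for c in eligible if c.override_lane]          # override lanes first
    remaining = max(budget - len(chosen), 0)
    by_bucket: dict[str, list[Card]] = {}
    for c in eligible:
        if not c.override_lane:
            by_bucket.setdefault(bucket_of(c), []).append(c)
    total = sum(bucket_weights.get(b, 0.0) for b in by_bucket)
    for bucket, members in by_bucket.items():
        share = int(remaining * bucket_weights.get(bucket, 0.0) / total) if total else 0
        members.sort(key=lambda c: c.score, reverse=True)
        chosen.extend(members[:share])                          # capped per bucket
    return chosen
```

With two equally weighted matters and a remaining budget of four, each bucket gets two slots even if every card in one matter outscores every card in the other, which is the crowd-out protection the paragraph describes.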
## B.8 — The Knowledge Topic and the grouping primitive
A new first-class object. **It is distinct from the working-state object and neither contains the other.** A working-state object is the live, in-flight model of *one problem being worked* — prospective, hot, mostly not extracted. A Knowledge Topic is a persistent, accreting *grouping of settled knowledge* — a curation lens. The workbench versus the library. They interact: work pulls from Topics; durable premises produced during work land in Topics. Both are first-class; the polytypic substrate (§B.4) is what lets the system have both rather than forcing one to be the other.
**One general grouping primitive, many kinds.** The primitive is a named, hierarchical, reference-based *lens* over memory, with a `kind` discriminator and optional kind-specific fields. Its flagship kind — "Knowledge Topic" (working name) — is a standing topical knowledge collection ("Ninth Circuit loss causation law"). The *same primitive* expresses **workflows** (groupings of procedures), **practice areas** (groupings of matters), and **goal-initiatives** (groupings around a goal). Building one primitive solves the grouping problem once — one inclusion model, one hierarchy model, one proliferation defense, one UI — instead of four times. The over-proliferation concern applies to *this primitive*; the working-state object is not a grouping and does not count against that budget.
**Lens, not container.** A Topic owns no raw truth. It *references* premises, sources, memories, episodes, matters. Shared members are therefore free — one Premise Family referenced by three Topics is three lenses on one object, not duplication.
**Two faces.** *Retrospective (lens):* groups what you already know. *Prospective (standing collection instruction):* carries an instruction, handed to the extraction agent, to actively gather matching premises going forward. The prospective face is distinct from a corpus — a corpus is a batch ingestion of a known document pile; a standing collection is a filter on the ongoing stream. The prospective face is the more dangerous one (a too-broad instruction over-collects) and needs a confidence threshold and a review surface.
**Membership is both explicit and rule-based.** A curated core of explicitly-pinned members *plus* an inclusion rule that auto-proposes matching members. The prospective face *requires* rule-based membership. Every member carries a provenance tag: `user_pinned | agent_proposed | rule_matched | standing_collection_caught`. The pinned core is the stable trusted part; the rule-matched part is the living part; the UI shows them distinctly.
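A small sketch of lens-style membership with provenance tags. The `Topic` class, the `apply_rule` method, and the rule that an explicit pin is never downgraded by a later rule match are assumptions for illustration; the provenance vocabulary is the one given above.

```python
from dataclasses import dataclass, field
from typing import Callable

PROVENANCE = {"user_pinned", "agent_proposed", "rule_matched", "standing_collection_caught"}


@dataclass
class Topic:
    name: str
    # member reference -> provenance tag; the Topic holds references, never the objects
    members: dict[str, str] = field(default_factory=dict)

    def add(self, ref: str, provenance: str) -> None:
        if provenance not in PROVENANCE:
            raise ValueError(f"unknown provenance: {provenance}")
        if self.members.get(ref) == "user_pinned":
            return   # assumed rule: a pin is not overwritten by weaker provenance
        self.members[ref] = provenance

    def apply_rule(self, store: dict[str, dict], rule: Callable[[dict], bool]) -> None:
        for ref, obj in store.items():
            if rule(obj):
                self.add(ref, "rule_matched")

    def pinned_core(self) -> list[str]:
        return sorted(r for r, p in self.members.items() if p == "user_pinned")
```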
**Containment is a multi-parent acyclic DAG; other relationships are a free typed web.**
*The walkthrough that establishes this.* Take three Topics a securities litigator would plausibly have: **A** — "Ninth Circuit loss causation law"; **B** — "Loss causation, all jurisdictions"; **C** — "Marex matter knowledge." Now place one Premise Family — *"9th Cir. requires loss causation to be pleaded with [specific elements]."* It genuinely belongs in A (Ninth Circuit loss causation), in B (loss causation), and in C (it arose in and matters to Marex). As for the Topics themselves: A is `child_of` B (Ninth Circuit loss causation narrows all-jurisdiction loss causation); a fourth Topic, "9th Circuit securities law," would *also* be a parent of A; but C is not a parent of A (a matter is not a sub-topic of a legal area) — A and C overlap heavily, sharing members, without either containing the other. A single-parent tree therefore fails twice over: A has two legitimate containment parents, and A↔C is a real relationship that is not containment at all. Branches of different spines touch — the architect's instinct was correct.
But the reason an acyclic rule was wanted still holds, and the fix is narrower than abandoning it. The rule was never really "one parent" — it was "when you ask for everything under B, that query must terminate." Multiple parents do **not** break termination: A having two parents just means A appears in two result sets, done. Only a *cycle* breaks termination. So: `parent_of`/`child_of` containment edges form a **multi-parent DAG with one rule — it must be acyclic** (a DAG is acyclic by definition; that is the whole constraint). *All other* relationship types (`matter_link` for the A↔C case, `related_to`, `narrows`, `informs`) form a free unconstrained web — they are never traversed by containment recursion, so they need no constraint. Relationship types are extensible; a small core set ships defined. Display renders a tree spine and lists the other edges. Filter queries — "9th Circuit falsity law" (everything under *falsity* ∩ jurisdiction filter), "falsity law for Marex" (everything under *falsity* with a `matter_link` to Marex), "all 9th Circuit securities law" (everything under *securities law* with the jurisdiction filter, transitively reaching falsity, scienter, loss causation) — are queries over the DAG plus typed edges plus attribute filters.
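The constraint can be sketched directly: containment edges are checked for cycles at insert time, and typed edges are stored separately and never followed by the containment traversal. The `TopicGraph` class and its method names are hypothetical.

```python
from collections import defaultdict


class TopicGraph:
    def __init__(self) -> None:
        self.children = defaultdict(set)   # containment: parent -> children
        self.typed_edges = set()           # free web: (src, kind, dst)

    def descendants(self, topic: str) -> set:
        """Everything under a topic. Terminates because containment is acyclic."""
        seen, stack = set(), [topic]
        while stack:
            for child in self.children.get(stack.pop(), ()):
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        return seen

    def add_containment(self, parent: str, child: str) -> None:
        # The one rule: reject any edge that would close a cycle.
        if parent == child or parent in self.descendants(child):
            raise ValueError(f"{parent} -> {child} would create a containment cycle")
        self.children[parent].add(child)

    def add_typed_edge(self, src: str, kind: str, dst: str) -> None:
        # Unconstrained: containment recursion never traverses these.
        self.typed_edges.add((src, kind, dst))
```

Using the walkthrough's Topics, A can be added under both B and "9th Circuit securities law" without complaint, A and C can be joined by a `matter_link`, and only an attempt to put B under A is rejected.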
**Creation is extraction-time, not live-agent.** Elnor — the live conversational agent — does **not** create Topics and is not asked to "notice" that one should exist. He does not write to memory; EC and the extraction agent do. For Elnor to recognize a Topic live he would have to interrupt the conversation with off-task cross-memory matching reasoning jammed into every prompt — there is no good thing to inject for that. Instead: the **extraction agent**, running KIE/GIE backlinking plus the new collection instructions, emits a **Topic Candidate** when a cluster threshold is met. Candidates accrete members regardless of review; a `candidate_only` field gates their *use*. The user approves/edits in the Knowledge UI, or never looks and the candidate keeps growing. Approval gates *use*, not *collection*.
**Auto-creation at a high multi-signal threshold.** At a strong enough signal a candidate should auto-create or be unmissably raised. The trigger is a **cluster-validation problem** and the data-science literature is mature: a composite weighted score over cluster cohesion/separation (silhouette-style — the "likeness" signal), size/density, **temporal burst structure** (Kleinberg-style burst detection distinguishes a meaningful sustained burst from one-off noise — the "timing" signal), and definitional clarity. Not a raw count — a composite score, with auto-create above a high band, propose-loudly in a middle band, silent-candidate below.
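A sketch of the banded decision, assuming each signal has already been normalized to the range 0 to 1 by its own detector. The weights and band thresholds here are placeholders only; the real values are an open question in this proposal.

```python
def topic_candidate_action(
    cohesion: float,
    density: float,
    burst: float,
    clarity: float,
    *,
    weights: tuple = (0.4, 0.2, 0.25, 0.15),   # placeholder weights
    auto_band: float = 0.8,                     # placeholder thresholds
    propose_band: float = 0.5,
) -> str:
    signals = (cohesion, density, burst, clarity)
    if any(not 0.0 <= s <= 1.0 for s in signals):
        raise ValueError("signals are assumed pre-normalized to [0, 1]")
    score = sum(w * s for w, s in zip(weights, signals))
    if score >= auto_band:
        return "auto_create"
    if score >= propose_band:
        return "propose_loudly"
    return "silent_candidate"
```

A large but incoherent cluster (high density, low everything else) stays a silent candidate under these placeholder weights, which is the "not a raw count" property.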
**Injection — DOC24 proactively delivers the pertinent slice plus a handle.** When a Topic is applicable, DOC24 does not merely hand Elnor a recognition handle and make him fetch. DOC24 assesses applicability, *itself* injects the directly-pertinent compiled slice (the relevant premises, warranted, via the Context Compiler), **and** flags "there is more — call the Memory Agent." The common case arrives proactively with no round-trip; the long tail is one Memory-Agent call away. Topic Candidates are injected only as lightweight handles, flagged as proposals.
**The collection instruction is PropA-governed.** The prospective face's instruction is a structured extraction directive, not free-text, and it should be a **PropA-governed object** — PropA already owns extraction-agent governance and has sharp self-learning and prompt-improvement machinery (a preset DOC23 task, DSPy, agent review), strengthened further by the new DOC23 extractor/judge/evaluator modules. The collection instruction rides PropA's existing improvement loop rather than needing a new one. Extraction *depth* is inferred from the natural-language topic request and managed by PropA + BDSM — explicitly **not** by forcing the user into a settings panel. How the instruction is authored and compiled, and how depth is inferred and tuned, is flagged load-bearing (Open Question 9).
**Lifecycle is type-aware.** A Topic does not "go stale" the way a claim does. Each Topic carries a `lifecycle_type`: `reference` (knowledge topics, practice areas — date-staleness does not apply; case law does not rot; archived only by explicit user action — *no automatic dormancy degradation*); `terminal` (goal-initiatives — become `completed` when the goal completes); `maintained` (currency genuinely matters — a gap in new material signals the *collection instruction* may be wrong, not that the knowledge decayed). Staleness lives at the *member* level (a Premise Family's job). Only un-adopted Candidates expire. Lifecycle exists only to drive three real decisions — default surfacing, review nudges, archiving — and is sized to exactly that.
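The type-aware rules can be sketched as one decision function. The status strings, parameter names, and the day thresholds are hypothetical; only the three `lifecycle_type` values and the rule that un-adopted Candidates alone expire come from the text above.

```python
def lifecycle_decision(
    lifecycle_type: str,
    *,
    is_candidate: bool = False,
    user_archived: bool = False,
    goal_completed: bool = False,
    days_since_new_member: int = 0,
    days_unadopted: int = 0,
    maintained_gap_days: int = 90,     # placeholder
    candidate_ttl_days: int = 180,     # placeholder
) -> str:
    if is_candidate:
        # Only un-adopted Candidates expire.
        return "expired" if days_unadopted > candidate_ttl_days else "candidate"
    if user_archived:
        return "archived"
    if lifecycle_type == "reference":
        return "active"                # no automatic dormancy, however old
    if lifecycle_type == "terminal":
        return "completed" if goal_completed else "active"
    if lifecycle_type == "maintained":
        # A gap questions the collection instruction, not the knowledge.
        if days_since_new_member > maintained_gap_days:
            return "review_collection_instruction"
        return "active"
    raise ValueError(f"unknown lifecycle_type: {lifecycle_type}")
```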
**UI home.** The DOC73 Libraries UI area: a browsable, hierarchical view of Topics, each member shown with source, provenance, and dates. This browse view is itself valuable independent of injection, and is the proliferation safety valve — an over-proliferated set is at least inspectable and prunable.
## B.9 — New mechanisms folded in (condensed)
Each marked `[NEW]` (not previously proposed) or `[CORROBORATING]` (a blank-slate exercise independently surfaced something the review set already had — raising confidence). Each placed at its kernel, each owing a no-phantom test in the spec-grade rework.
**Negative memory / null-result memory** `[CORROBORATING]` — a first-class store of what was tried and abandoned, rejected, or searched-for-and-not-found; surfaced informatively ("you explored this and dropped it three weeks ago"), never as a prohibition; carries a TTL and `evidence_status: search_exhaustion_record`.
**Reconsolidation as the living-memory trigger** `[CORROBORATING]` — when a memory is retrieved and *used*, mark it briefly labile, watch the usage, propose a patch at session close, gated by an authority-weighted confirmation; the guard against retrieval-driven drift.
**Prediction-error-gated capture** `[NEW]` — capture priority set by the divergence between a cheap model's prediction and what actually happened; needs per-domain surprise normalization or it floods (investigate before adopting).
**Page-fault instrumentation** `[NEW]` — log every moment the model reaches past its given context (clarifying question, mid-turn lookup) as a free training label for the retrieval planner; scope to the unambiguous faults.
**Provenance bisect** `[CORROBORATING]` — every durable belief carries a causal lineage; on a correction, binary-search the lineage to the introducing source and flag downstream; a *target* for the lifecycle kernel, not near-term.
**The procedural self-model** `[CORROBORATING]` — memory carries the user's characteristic processes and decision patterns, and the system is proactive against them; memories can carry attached obligations ("reverify before filing").
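For the provenance-bisect mechanism, a minimal sketch of the search step. It assumes the lineage is an ordered list of derivation steps and that the error is monotone: once introduced, every later step carries it. Both function names and the `is_bad` predicate are hypothetical.

```python
from typing import Callable, Optional, Sequence


def first_bad(lineage: Sequence[str], is_bad: Callable[[str], bool]) -> Optional[int]:
    """Index of the first step carrying the error, or None if the tip is clean."""
    if not lineage or not is_bad(lineage[-1]):
        return None
    lo, hi = 0, len(lineage) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(lineage[mid]):
            hi = mid          # error already present at mid: look earlier
        else:
            lo = mid + 1      # still clean at mid: look later
    return lo


def introducing_and_downstream(lineage: Sequence[str], is_bad: Callable[[str], bool]):
    idx = first_bad(lineage, is_bad)
    if idx is None:
        return None, []
    return lineage[idx], list(lineage[idx + 1:])   # source to correct, steps to flag
```

If the monotonicity assumption does not hold for a real derivation graph, bisection is not valid and a full scan of the lineage would be needed instead.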
**Adopted from the review set:** evidence-role labels on every card; the "why not remembered?" diagnostic; risk and provenance budgets; anti-learning zones (`do_not_learn`); the Background Memory Coordinator (a background job maintaining a plain-language anticipation summary as a `framing_only` card); faceted memory projection; the policy-snapshot invariant (every durable card records the policy generation in force at write time); Memory Flow Certificates; an Attention Budget (governing proactive interruptions); a Review-Debt Budget; Evidence Support Density (low-support summaries cannot render as authority); explicit memory SLOs (stale-injection rate, unwarranted-assertion rate, false-suppression rate, work-thread-resume success); a "do-not-know" memory (unresolved ignorance as a first-class object).
**Dropped:** the parametric intuition tier (a whole added layer to buy what high attenuator gains already buy inspectably; collides with the inspectability invariant); the 50-field continuously-maintained kernel (kept only as a recompiled projection); the turn-by-turn compiler (made event-triggered); "semantic gravity" as math (re-imports the multiplicative bug); literal ADSR envelopes; soft-prompt / KV-cache injection (model-specific; breaks the multi-model design); "memory as social actor"; "the system should sometimes lie to the model."
## B.10 — Domain-agnostic core with domain packs
The architecture is **domain-neutral with domain-specific applicability lenses**. The base system answers a universal question — *for this task, in this work context, at this phase, for this audience, under these source and access rules, what may the model rely on, how strongly, and in what form?* — and legal is the strictest *lens*, not the foundation.
The base primitives are domain-neutral: Premise Families, corpora (bounded source universes with policies), the grouping primitive, Work Objects (the agnostic generalization of "matter"), the Context Compiler. A **domain pack** supplies a phase taxonomy, a source-authority hierarchy, applicability rules, audience rules, and validators. The LawPack renders "admissibility," "privilege," "procedural posture," "binding authority"; a CodePack renders branch/commit/test semantics; a FinancePack renders time-horizon and tax-year semantics. Legal concepts are *stricter versions of universal professional-work concepts* (matter → work object; privilege → access boundary; procedural posture → work phase; case-time → work-object-lifecycle time); encoding the general form in the substrate and the legal form in a pack gives legal-grade rigor without forcing non-law users into legal categories. Build the substrate against agnostic interfaces; build LawPack first because it is the primary use; if a CodePack later feels natural the substrate is right, and if it feels like "litigation memory with code labels" the substrate is too legal.
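The substrate-versus-pack split can be sketched as an interface the substrate depends on and a pack that implements it. The `DomainPack` protocol and its two methods are hypothetical and far narrower than a real pack; the term mappings come from the paragraph above, while the source ranks other than "binding authority" are invented for illustration.

```python
from typing import Protocol


class DomainPack(Protocol):
    name: str

    def term(self, universal_concept: str) -> str: ...
    def source_rank(self, source_kind: str) -> int: ...


class LawPack:
    name = "LawPack"
    _terms = {
        "work_object": "matter",
        "access_boundary": "privilege",
        "work_phase": "procedural posture",
    }
    # Lower rank means stronger authority; ranks below the first are assumptions.
    _ranks = {"binding_authority": 0, "persuasive_authority": 1, "secondary_source": 2}

    def term(self, universal_concept: str) -> str:
        return self._terms.get(universal_concept, universal_concept)

    def source_rank(self, source_kind: str) -> int:
        return self._ranks.get(source_kind, len(self._ranks))


def strongest_source(pack: DomainPack, source_kinds: list[str]) -> str:
    """Substrate code: knows only the agnostic interface, never legal vocabulary."""
    return min(source_kinds, key=pack.source_rank)
```

The test proposed above maps onto this directly: if a CodePack can implement the same interface without the substrate function changing, the interface is agnostic enough.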
## B.11 — Hard architectural requirements (carried)
The Policy Engine runs **mostly at write time** — but content-intrinsic policy (sensitivity-derived, fixed at write time, stamped as a facet) and session-intrinsic policy (mode-derived, computed at retrieval, and able to *override* a write-time stamp toward more restrictive) are distinct and must compose, safety-MIN, at retrieval. The Policy Engine is **not a new engine** — it is a memory-domain ruleset compiled into EC's existing `PolicyDecisionEngine`. The **hot-path invariant:** every operation on the memory hot path must have a locally-enforceable bounded worst case — no LLM calls, no unbounded scans; any unbounded operation (a network read, a cold-cache miss) may appear only behind a bounded local fallback plus a hard timeout. Shard loading should be predictive (load the next episode's shard ahead of need, with the previous shard as fallback) rather than lazy load-on-miss.
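Two of these requirements are small enough to sketch. The first function shows safety-MIN composition over a hypothetical ordered permit scale; the second shows an unbounded read placed behind a hard timeout with a local fallback, using the standard library thread pool. The `Permit` levels, function names, and the `slow_network_read` stand-in are all assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from enum import IntEnum


class Permit(IntEnum):          # hypothetical scale, least to most permissive
    DENY = 0
    EXPLICIT_RECALL_ONLY = 1
    INJECT_HEDGED = 2
    INJECT = 3


def compose_permit(write_time_stamp: Permit, session_policy: Permit) -> Permit:
    """Safety-MIN: the more restrictive of the two policies always wins."""
    return min(write_time_stamp, session_policy)


def bounded_read(pool: ThreadPoolExecutor, fetch, fallback, timeout_s: float):
    """Run a possibly slow read, but never wait longer than timeout_s for it."""
    future = pool.submit(fetch)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        future.cancel()          # best effort; a running task is not interrupted
        return fallback          # e.g. the previous episode's shard


def slow_network_read() -> str:
    """Stand-in for an unbounded operation such as a cold-cache miss."""
    time.sleep(0.3)
    return "fresh shard"
```

Note that the timeout bounds the caller's wait, not the worker's runtime; a real hot path would also need the worker pool itself to be bounded so that abandoned reads cannot pile up.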
---
# Part C — The Rebuild Scope (Flatten-and-Unify)
## C.1 — Why flatten
ELNOR's memory specs already overlap *each other*, before any new work is added. Two verified examples: DOC72 §25 ("Conversation Corpus / Episodic Recall") already defines an episode-like primitive, directly overlapping the proposed WorkEpisode layer; and DOC72's integrated GIE already contains a token-budgeted, capped, priority-fill allocation algorithm — a fourth budget allocator alongside DOC24's injection budget, BDSM's eligibility layer, and the proposed cap-and-trade. There are likely a hundred such overlaps.
Three reasons to merge into one unified, function-organized specification set:
1. **Eliminate duplication** — overlapping systems doing the same thing slightly differently, merged so one survives.
2. **Pay the coordination cost once** — federating via shared contracts eliminates *schema* divergence but not *behavioral* duplication (three allocators importing one type are still three allocators) and converts coordination into permanent cross-doc overhead. A genuine flatten pays the cost once.
3. **The merge makes the specs better — the primary reason.** Flattening is the one moment the whole picture is visible at once, and therefore the moment to ask the real design questions: which of two overlapping designs is better, what should be lifted from the GIE and applied system-wide, what is dead weight. The merge is a design pass disguised as a consolidation. And the discovery is not separable from the resolution: the hundred overlaps cannot be enumerated by inspection from outside — they become visible only when two pieces are forced into one place and one of them is found to contradict, duplicate, or silently re-implement the other. The act of merging *is* the act of auditing; the audit document (Phase B) is an index that scopes the close reading, not a substitute for the merge surfacing what inspection alone would miss.
The honest counter is that a flatten of ~40,000–50,000 lines, executed by AI agents across many sessions, is exactly how an uncontrolled rewrite turns into a "half-migrated mess." The answer is to make the merge a *managed*, sequenced, gated process.
## C.2 — The five-phase plan
**Phase A — Settle the architecture.** Harden this proposal to spec-grade through continued red-teaming. Endpoint: the primitives and kernel structure are stable enough that they will not move during the merge. Phases B–E cannot begin until Phase A is done, because the audit classifies the existing specs against the architecture Phase A settles.
**Phase B — Triage & Audit.** Two AI agents (Claude Code and Codex) independently audit every relevant spec and addendum, section by section, against a checklist derived from the settled architecture, producing a structured per-section record (function, schemas defined/consumed, cross-doc obligations, overlap flags, disposition hypothesis, confidence). The two auditors' disagreements are themselves signal. The corpus is broad — most specs, all memory-relevant addenda, and older foundational specs (EC Core) with memory functionality trapped in them by lineage.
**Phase C — Outline.** A hard, structure-only outline of the unified memory system: what the new documents are, what functions live where, how it is organized — no spec prose. The architect's primary decision gate: cut, add, relocate, fold in. Reorganizing an outline is cheap; reorganizing a merged 50,000-line spec is the half-migrated mess.
**Phase D — Merge Plan.** A self-driving execution package (an instructions file of ordered discrete steps, a prompts file of one prompt per step, audit steps built into the sequence) handed to an AI coding agent. Writing it requires close reading of the flagged sections, not just the audit inventory.
**Phase E — Execution.** The coding agent runs the package; the merge is performed as a design pass; the "which design wins" questions are explicit, adjudicated steps.
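The Phase B per-section record and the auditor-disagreement signal can be sketched as a record type and a comparison. Field names follow the list in Phase B; the disposition values, the example records in the usage note, and the choice to compare only disposition and overlap flags are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SectionAudit:
    spec: str
    section: str
    function: str
    schemas_defined: tuple = ()
    schemas_consumed: tuple = ()
    cross_doc_obligations: tuple = ()
    overlap_flags: tuple = ()
    disposition: str = "undecided"     # hypothetical values: keep, merge, drop
    confidence: float = 0.0


def disagreements(auditor_a: list[SectionAudit], auditor_b: list[SectionAudit]) -> list[tuple]:
    """Sections where the two independent audits differ, or only one covered."""
    a = {(r.spec, r.section): r for r in auditor_a}
    b = {(r.spec, r.section): r for r in auditor_b}
    flagged = []
    for key in sorted(a.keys() | b.keys()):
        ra, rb = a.get(key), b.get(key)
        if ra is None or rb is None:
            flagged.append(key)                      # coverage gap
        elif (ra.disposition != rb.disposition
              or set(ra.overlap_flags) != set(rb.overlap_flags)):
            flagged.append(key)                      # substantive disagreement
    return flagged
```

The flagged list is a natural input to Phase D, since those are the sections that need close reading before a merge step is written for them.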
## C.3 — What proceeds regardless
Independent of the residual partition question: rework DAMS per Part A; write the Learning, Attribution, and Lifecycle Engine; split the over-ceiling documents; extract the smeared concepts into shared-contract documents. The flatten direction is settled; the merge is staged and gated so the half-migrated state is structurally prevented rather than risked.
---
# Part D — Open Questions for the Next Red-Team Round
1. **Safety-axis schema** — ceiling-only `permit_ceiling` (minimal patch) vs. typed `MemoryPolicyDecision` (complete fix). If Part B is adopted, the typed decision is the target — confirm.
2. **`ScopeRoot` population** — is `scope_root_ref` reliably populated by matter/corpus-creation templates? If not, the fix degrades to the conservative no-collapse fallback.
3. **The control-problem reframe** — is "memory as a control problem / preserve-and-compile working state" the right spine, or does it over-rotate away from retrieval, which is still genuinely needed?
4. **The working-state object** — it is the spine's central object and the one most at risk of being a phantom. Can the no-phantom test (what writes it, what reads it, what it concretely changes per turn) be met crisply? The recompile-on-resume framing is the proposed answer — is it sufficient?
5. **The 5-kernel count and boundaries** — adopt the layer names as a placement discipline; is five right, or is there a cleaner cut?
6. **The polytypic substrate** — is "several first-class kinds, no universal object" right, or does it leave too many object types to coordinate? Are Premise Families / Directives / Procedures / evidence / goals-obligations-entities the right set?
7. **Knowledge Topics — scope creep.** The architecture demotes and simplifies, yet adds a grouping primitive with a candidate lifecycle, a DAG, typed relationships, and an extraction-instruction subsystem. Does it earn its place? The claim is that it *replaces* scattered ad-hoc grouping (workflows, practice areas) rather than adding alongside it — is that claim sound?
8. **Knowledge Topic injection** — DOC24 proactively injects the pertinent slice plus a Memory-Agent handle. Is the proactive-slice model right, or should it be handle-only? This is where the feature's value rides.
9. **The PropA-governed collection instruction** — how is the extraction directive authored from a natural-language topic request, compiled, and its depth inferred and tuned via PropA + BDSM? Flagged load-bearing.
10. **The cluster-validation trigger** — the composite score (cohesion, density, burst structure, definitional clarity) for Topic-candidate auto-creation needs a real specification; what are the weights and the band thresholds?
11. **DOC8's widened charter** — is folding lifecycle, salience replay, and consolidation into the green-field learning engine right, or does lifecycle want its own document?
12. **The build→spec feedback protocol** — what happens when a Phase-E checkpoint invalidates upstream spec? The bounded revision loop and its escalation threshold need specifying.
13. **GIE/KIE and DOC73** — the GIE and KIE were drafted before DOC73 was complete; should they be expanded into DOC73, given DOC73's focus is libraries? And the retroactive Topic fold-in must reuse the GIE/KIE backlinking engine, not a parallel matcher — confirm as a unification requirement.
14. **Provenance bisect / prediction-error capture** — both flagged as investigate-or-later; confirm the deferral and the derivation-graph cost.
15. **Latency arithmetic** — the next spec must *compute* the per-card hot-path cost for the full architecture (policy read + retrieval planning + DAMS composition + Context Compiler), not assert it.
---
# Part E — Process Notes
**Self-review refines within a frame; only a fresh window questions the frame.** When a model red-teams its own prior output, the original framing sits in context as established fact. The blank-slate ideation exercise was run to break that funnel — eight models, no DAMS frame. It corroborated several review-set ideas (negative memory, reconsolidation, episodes) and surfaced two the framed process missed (prediction-error capture; the working-state reframe that is now this document's spine). The next round should keep routing to models that authored neither this document nor its predecessors.
**Agreement is the right metric for the settled part and the wrong one for the contested part.** On the Part A defects, convergence is real triangulation. On the rebuild scope and the contested architecture questions, routing until the models "agree" manufactures consensus rather than discovering truth. The honest end-state of a genuinely contested question may be "two coherent positions, here are the tradeoffs, the architect decides."
**A self-correction this document carries forward.** The consolidated meta-review stated the discipline "preserve disagreement, do not manufacture consensus" and then violated it in its own Part A — it labelled the capacity-operator finding "settled" and reported reviewer "convergence" while Grok had explicitly dissented and endorsed product composition. V4 records that dissent (§A.1). The standard the next round should hold *this* document to: if a reviewer finds a place where V4 has smoothed a real disagreement into false consensus, that is a finding, and it should be raised as one.
**What this document is for.** It supersedes DAMS V2, the consolidated meta-review, and V3.1, and is built to be red-teamed again — on Part D specifically — before the architecture is settled. It is a proposal, not a decision. Nothing in it is final.
---
# Document Status
**Supersedes:** `DAMS_PROPOSAL_V2.md`, `DAMS_V2_CONSOLIDATED_META_REVIEW.md`, the V3.1 proposal. All move to archive on adoption.
**Status:** Proposal for red-team review. Not operative. The architecture is not settled; this document exists to be pressure-tested on its Open Questions (Part D) before any build commitment.
**Critical item owed regardless of every other decision:** the learning engine the memory stack depends on does not exist as a written spec and must be built — as a Learning, Attribution, and Lifecycle Engine — satisfying the corrected DAMS contract and the BDSM V6.5 ownership table.
**Next step:** red-team this document — both the architecture and the flatten-and-unify plan — routed to fresh windows and prior reviewers, before the architecture is settled and before any merge begins.