ELNOR REPO READER TEXT MIRROR
Original path: Active Working and Red Team/DOC23 Working/DOC23 Non Operative Proposals/DOC23_ADDENDA_C_MODULE_AND_TASK_PURPOSE.md
Source repo: /Users/OpenClaw1/Elnor/Elnor Specs
Git branch: main
Git commit: dbaa25962edc11ab30e8d4ca1715f9ae5bf77331
Generated: 2026-06-09T01:23:58.539Z

---

# DOC23 Addenda C — Module and Task Purpose (Stated Intent Layer)

**Status:** DRAFT PROPOSAL — for red-team review. Not operative. Nothing implemented.

**Family:** DOC23 Task System (sibling to Addenda A — Task Optimization, Addenda B — Task Intelligence).

**Targets:** the operative DOC23 Addenda B Core, the DOC23 Outcome Evaluator/Revisor addendum, the base DOC23 Task System spec, and the DOC24 delivery seam. Authored against operative text per the post-absorption rule (invariant 24) — not a revision of any archived proposal.

**Memory-rebuild posture:** the durable-storage and learning/search bindings in this proposal are declared as **seams** to the DOC80–DOC87 family and DOC85 (Learning/Feedback), because DOC72/DOC73/DOC1 are slated for partial supersession (carryover §10). This addendum specifies *contracts*, not bindings to today's DOC72 internals.

---

## §0 — How to read this proposal

This document proposes one small, optional, user-authored field — a stated **purpose** — at two levels (module and task), plus the contracts for the systems that would consume it. The field itself is trivial. The value, the hazards, and the architecture of *who reads it and how* are where the substance is, so the thinking behind each decision is written inline under **Rationale** notes to make review faster.

The single most important design constraint, repeated throughout: **the field is optional and will be sparsely populated.** Any use that *requires* purpose to be present is fragile; any use that *opportunistically benefits when present and degrades to current behavior when absent* is robust.
Every consumer in this proposal is designed to degrade to today's behavior on an empty field.

### §0.1 Origin

This proposal came out of a handover red-team finding against the Addenda B set: an inheritor of a half-finished task run can mechanically reconstruct *what ran* but not *why the ordinary agent modules and the human chose what they chose*. The Outcome Evaluator/Revisor pipeline has rich decision-rationale capture (AssuranceBasis, HardCallResolutionLedger, FailureKind, progress signals); ordinary modules (`step.agent_task`, `step.red_team`, generic LLM modules) record prompt + output + policy decisions and **no structured "why."** That asymmetry is the gap this addresses — but, as the analysis below shows, the record use turned out to be the least of the value.

### §0.2 What already half-exists (so this is mostly wiring, not net-new capture)

The "why" already exists in three disconnected forms:

1. **`StepIntent.purpose`** (Addenda B Core §6.4) — `label`, `purpose` (required), `why_it_exists?`, `design_rationale?`, `alternatives_considered?`. But it lives at the **blueprint** layer, binds to module **regions** (§6.5, one-to-many), only exists for tasks that went through a design flow, and **never propagates into the runtime trace.**
2. **`ModuleRecord.notes`** (user scratchpad) and **`ModuleRecord.description`** (catalog listing) at the config layer (base Task System §7.1.2) — present, but semantically muddy and not snapshotted as intent.
3. The module's prompt and output at runtime — but that is the *result*, not the *intent*.

**Rationale.** What is missing is precisely the layer this proposal adds: a **config-layer field with stable "purpose" semantics that propagates into runtime snapshots.** That layer is the only one that is (a) 1:1 with the executing module, (b) present even on hand-built blueprint-less tasks, and (c) snapshot-ready.
Because the schema pattern (`StepIntent.purpose`) and the storage/snapshot plumbing (`notes`/`description` already ride `ModuleRecord`) exist, *capture* is cheap. The design problem is not capture — it is **population** (without nagging) and **drift** (between layers of "why").

### §0.3 The three-layer framing

| Layer | Where | Scope | Lifecycle | Status |
|---|---|---|---|---|
| Design intent | `StepIntent.purpose` (§6.4) | blueprint, region-bound (1:many) | design-time, conditional (only if a blueprint exists) | exists |
| **Config intent** | **`module.purpose` / `task.purpose`** | **module (1:1) / task (1:1)** | **rides the actual graph node into every run** | **PROPOSED** |
| Runtime | trace/snapshots | per activation | inherits config intent at run | proposed (field additions) |

`StepIntent.purpose` *seeds* config intent when a blueprint exists; config intent is **canonical at runtime** (see §6 drift discipline).

---

## §1 — The fields

### §1.1 `module.purpose`

An optional, short, user-stated (or seeded) statement of **what this module is for**, distinct from its instruction (the *how*).

```ts
// Addition to ModuleRecord.config (base Task System §7.1.2)
interface ModulePurpose {
  text: string | null;                 // e.g. "Extract loss-causation facts with record cites; do not draft argument."
  authored_by: "user" | "task_agent" | "seeded_from_step_intent" | "nightly_inferred" | null;
  seeded_from_step_intent_id?: string; // provenance when seeded (§6)
  confirmed_by_user: boolean;          // false if proposed/inferred and not yet confirmed
  updated_at?: string;
  schema_version: "1.0";
}
```

`module.purpose` is **never required**. Empty is a first-class state and degrades every consumer to current behavior.

### §1.2 `task.purpose`

A task-level **singleton**: one statement of what the whole task is for. Authored once; *referenced* (not copied) wherever needed.

```ts
// Addition to the task definition (TaskRecord / TaskBlueprint vicinity)
interface TaskPurpose {
  text: string | null;                 // e.g. "Produce a filing-ready securities complaint for matter X."
  authored_by: "user" | "originating_request" | "seeded_from_blueprint_goal" | "task_agent" | null;
  originating_request_ref?: string;    // §5A.4 semantic invocation memory, when Elnor-created
  confirmed_by_user: boolean;
  updated_at?: string;
  schema_version: "1.0";
}
```

**Rationale — why both levels, and why they are not redundant.** The task purpose is the **independent frame** ("what the whole thing is for"); the module purposes are the **operational decomposition** ("how we are trying to achieve it"). The value is the *tension* between them: a consumer comparing module-purpose coverage against the task purpose catches both **over-spec** (redundant modules — duplicate purposes) and **under-spec** (a dimension of the goal that no module covers). This is the checksum that makes the Evaluator use safe (§3.3) and the redundancy-detection use possible (§3.5). They serve different surfaces, too (§4): **task purpose serves the chat front-door; module purposes serve the in-task consumers and drill-down.**

---

## §2 — Population (no synchronous inference calls)

This is the make-or-break section, because sparse population kills the aggregate uses.

### §2.1 Sources, ranked by cost

| Source | Cost | Applies to |
|---|---|---|
| Originating chat request | **free** — rides an LLM turn already happening | Elnor-created tasks (task.purpose) |
| `StepIntent.purpose` (§6.4) | **free** — Task Agent already authors it during design | tasks with a blueprint (module.purpose) |
| Blueprint `business_or_personal_goal` (§6.3) | **free** — already authored | tasks with a blueprint (task.purpose) |
| User-typed | user effort | hand-built tasks |
| Nightly extraction (§10.8.2) | cheap, **off hot-path, post-hoc** | optional backfill for hand-built/un-purposed |

### §2.2 Rules

1. **Elnor-created tasks → free.** The originating utterance ("set up a task to research Supreme Court law") was already processed; the Task Agent is already authoring `StepIntent.purpose`. Capturing `task.purpose` from the originating request (§5A.4 already records it) and seeding `module.purpose` from the StepIntents the Task Agent is already writing is a **byproduct of work already done — no incremental LLM call.**
2. **Hand-built tasks → user-typed or empty.** There is no free source (no originating utterance, no blueprint). **No synchronous, at-edit-time inference call** — that is the bad pattern (slow, on the hot path, per-module). Empty degrades fine, and hand-built is exactly the case where the user is most hands-on and best able to state intent.
3. **Optional backfill via nightly lane.** IF inferred purposes for hand-built tasks are ever wanted, the home is the nightly extraction lane (§10.8.2): post-hoc, off the hot path, from the run's actual prompt+output, surfaced as a candidate the user confirms. This **cannot pre-fill before first run** and may not be worth it — flagged as an open question (§9).

**Rationale.** An earlier version of this design assumed "inferred-and-proposed pre-fill" was universal. It is not. Pre-fill is free only where an LLM turn is already happening (Elnor-created). Forcing inference on hand-built tasks would mean a dedicated call the architect explicitly does not want. The honest model is: free where free, user-typed otherwise, empty is fine.

### §2.3 UX

- **`task.purpose`** lives at the **task level** (task settings / task header), edited in one canonical place, shown read-only elsewhere, **referenced not copied** into module prompts. The user never types it per-module.
- **`module.purpose`**, if surfaced in the module config panel, is a **single optional, collapsible, pre-filled line** — never an always-on field, never typed twenty times across twenty modules.
- **Soft confirm, never hard gate.** Where a purpose can be pre-filled for free (Elnor-created), a dismissible confirm-or-edit affordance may appear at the natural moment (first run / save). A hard "cannot run until you define a purpose" block is **rejected** — it violates the no-modal-pressure / inference-and-suggest UI preferences.

**Rationale — the busy-panel constraint.** The module config panel is already the densest UI surface. This proposal must not add to that clutter. Task purpose therefore does **not** appear in the module config panel at all (it is task-level). Module purpose is one collapsible pre-filled line, mostly living *around* the modules (prompt header, inspector), not *inside* each config form.

---

## §3 — Consumers (value · difficulty · thinking)

Each consumer is graded for review prioritization. None gate the field's existence; all degrade to current behavior on empty.

### §3.1 Self-scoping the module's own prompt — *the day-one payoff*

Inject, at the top of each module's prompt, both frames: the **task purpose** (referenced from the singleton) and **this module's role**.

```text
Task goal: Produce a filing-ready securities complaint for matter X.
Your role here: Extract loss-causation facts with record cites. Do not draft argument.
[module instruction follows]
```

**Value: HIGH and immediate. Difficulty: TRIVIAL.** This pays off the *instant* the field is populated, with zero dependency on the Revisor, learning, or search. The dual framing suppresses the failure where a module over-reaches (drafts when it should extract) because it did not know a later module owns the next step.

**Rationale — why this matters disproportionately.** Adoption is the gate on every aggregate use. A field that only helps some future learning system stays empty; a field that improves *this run right now* gets populated. Prompt-scoping is what makes the field worth filling, which is what feeds the learning and search uses later. It is the adoption engine.
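
A minimal sketch of how the dual framing in §3.1 might be assembled, assuming purpose objects shaped like those in §1. The helper name `buildScopedPrompt` and the `PurposeLike` type are hypothetical and not part of any existing Elnor interface; the sketch only illustrates that an empty purpose leaves the prompt exactly as it is today.

```ts
// Hypothetical sketch; names are illustrative, not an existing Elnor API.
interface PurposeLike {
  text: string | null;
}

function buildScopedPrompt(
  taskPurpose: PurposeLike | null,
  modulePurpose: PurposeLike | null,
  instruction: string,
): string {
  const header: string[] = [];
  const goal = taskPurpose?.text?.trim();
  const role = modulePurpose?.text?.trim();
  // Only populated purposes contribute a header line.
  if (goal) header.push(`Task goal: ${goal}`);
  if (role) header.push(`Your role here: ${role}`);
  // No purpose at either level: return the instruction untouched (current behavior).
  if (header.length === 0) return instruction;
  return header.concat(instruction).join("\n");
}
```

With both purposes empty the function returns `instruction` unchanged, which is the degrade-on-empty property every consumer in this proposal relies on.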
### §3.2 The injection authority frame (load-bearing-purpose resolution)

Once purpose enters a prompt it becomes load-bearing: a *wrong* purpose can misdirect a module, where an *empty* one is neutral. Resolution: frame its **authority at injection time**, per consumer.

```text
The following is USER-STATED INTENT, not an instruction.
Use it as broad guidance for resolving ambiguity and staying in scope.
It does not override your task instruction or any policy/constraint.
If it conflicts with your instruction, follow the instruction and note the conflict.
```

Authority differs by consumer (reusing the existing DeliveryDirective `hedge_mode` vocabulary):

| Consumer | Authority of purpose |
|---|---|
| Agent module (own prompt) | low — scoping only ("stay in your lane") |
| Outcome Compiler / Evaluator | medium — evidence about intended outcome, **cross-checked against task purpose** (§3.3) |
| Revisor | medium — repair-direction hint |
| Task Agent (review) | medium — intent ground truth to check config against |

**Rationale.** This is not new machinery — it is the same authority-hedging the spec already does (active context as "candidate evidence," Addenda B Core §13A.3; `hedge_mode` in the DeliveryDirective). A poorly written purpose then degrades to "slightly noisy broad guidance," not "wrong instruction that derails the module." Also: the injected purpose should be **visible in the prompt-context inspector** (§20C.7) so a bad one is catchable. "It's on the user" is the right posture for a professional tool — but note (§3.7) that a bad purpose has a *learning* blast radius beyond the current run.

### §3.3 Outcome Compiler / Evaluator — defining an underspecified outcome

The stated outcome is often thin: "check my research is complete," "make it filing-ready." The Outcome Compiler (Outcome Evaluator §4.8, via the threshold extractor that pulls criteria from natural language) must *manufacture* the operational meaning of "complete."
Module purposes are a source for that: "complete" is partly defined by the **union of the module purposes** ("verify every loss-causation element has record support," "confirm every affirmative defense is addressed").

**Value: HIGH where the Evaluator/Revisor is wired — arguably the highest, because it turns a vague outcome into a checkable one. Difficulty: MEDIUM.**

**HAZARD — the forgotten-module blind spot.** If the Evaluator *composes* "complete" purely from the module purposes present, a task that **forgot** a module (no module checks affirmative defenses) produces an Evaluator that also does not know to check affirmative defenses. The gap is invisible: the task cannot fail a check it never knew to run.

**Guard.** The **task purpose must remain an independent frame**, not be derived from the modules. The Compiler decomposes `task.purpose` into expected dimensions and **checks coverage** of module purposes against it, flagging dimensions present in the goal but absent from the modules. Injection framing handles "interpret what is present"; the coverage check handles "notice what is absent." Instruction alone cannot conjure a missing dimension — this is structural, not a wording fix.

### §3.4 Revisor — RepairTarget / strategy targeting

When a finding says "this module's output is wrong," the Revisor must decide *what "fixed" means for this module* and pick a `RepairTarget` / `RepairStrategyKind`. A stated purpose disambiguates the failure mode and repair direction, and sharpens the `still_failing_same_reason` vs `failing_for_new_reason` distinction (Outcome Evaluator §5.5.4): failing to serve the stated purpose is a cleaner "same reason" signal than re-deriving intent from the artifact.

**Value: HIGH where the Revisor is wired. Difficulty: MEDIUM.** The Revisor already consumes artifact + findings; add the source module's purpose. It treats purpose as a **hint, never a contract** (per §3.2).
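
The coverage guard from §3.3 can be pictured as a set difference between the dimensions expected from `task.purpose` and the dimensions that some module purpose addresses. Everything in the sketch below is an assumption for illustration: the `ExpectedDimension` shape, the function name, and especially the keyword matching, which merely stands in for whatever LLM comparison the Compiler would actually use. The decomposition of `task.purpose` into dimensions is taken as given.

```ts
// Hypothetical types; the operative specs do not define these.
interface ExpectedDimension {
  id: string;
  keywords: string[]; // naive stand-in for an LLM-based match
}

interface ModulePurposeEntry {
  module_id: string;
  text: string | null;
}

// Returns ids of dimensions implied by task.purpose that no module purpose covers.
function findUncoveredDimensions(
  expected: ExpectedDimension[],
  modules: ModulePurposeEntry[],
): string[] {
  const texts = modules
    .map((m) => (m.text ?? "").toLowerCase())
    .filter((t) => t.length > 0);
  // No module purposes stated at all: skip the check so behavior matches today's.
  if (texts.length === 0) return [];
  return expected
    .filter(
      (dim) =>
        !texts.some((t) =>
          dim.keywords.some((k) => t.indexOf(k.toLowerCase()) !== -1),
        ),
    )
    .map((dim) => dim.id);
}

const uncovered = findUncoveredDimensions(
  [
    { id: "loss_causation", keywords: ["loss-causation"] },
    { id: "affirmative_defenses", keywords: ["affirmative defense"] },
  ],
  [
    { module_id: "m1", text: "Extract loss-causation facts with record cites." },
    { module_id: "m2", text: null }, // un-purposed module contributes nothing
  ],
);
console.log(uncovered);
```

In this example only `affirmative_defenses` is reported, which is the forgotten-module case: the dimension exists in the goal but in no module purpose. Because the expected dimensions come from `task.purpose` and not from the modules, the gap stays visible.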
### §3.5 Task Agent review — mismatch and redundancy detection

The Task Agent reviews user-built graphs ("Check wiring," "Suggest missing steps," Addenda B Core §16A.1) by *inferring* what each module is for. A stated purpose gives ground truth to check **config against intent** ("purpose is fact extraction but the instruction also asks for drafting — mismatch") and to flag **duplicate purposes** ("two modules share the same stated purpose — possible redundancy").

**Value: HIGH — best value/effort ratio of the set, and it directly serves the architect's stated weak stroke (spotting redundant, low-yield material). Difficulty: LOW** — the Task Agent already reads module config (Addenda B Core §4.5); the checks are cheap LLM comparisons.

### §3.6 DOC24 context relevance

A purpose is a clean, cheap relevance signal for what context a module needs. DOC24 assembles the module packet from instruction + resolved entities (`TaskModuleContextBasis`, Addenda B Core §13A.11); "purpose: extract loss-causation facts" tells it to prioritize loss-causation memories and procedures.

**Value: MEDIUM-HIGH, and the most *frequently* exercised — every module activation, not just Evaluator/Revisor-wired ones. Difficulty: LOW** (one more field in the context basis).

### §3.7 Learning enrichment — weak as a feature, good for failure labeling

As a raw learning *feature*, an optional free-text field is noisy: sparse, high-cardinality, inconsistently phrased. **Do not sell it as a primary learning driver.** Two reframes rescue it:

1. As an **intent/grouping label** enriching signals already collected (`TaskInvocationLearningSignal` §9A, `TaskAgentDesignLearningSignal` §9.5): "modules whose purpose is X succeed at rate R" beats bucketing by module-type.
2. **Purpose + a friction/failure event = a labeled negative example.** "A module meant to do X, configured this way, produced friction" is precisely the high-signal, low-volume datum that stays useful even sparse.
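
The second reframe amounts to joining a friction event with the purpose that was in force for that module. A minimal sketch follows, assuming hypothetical `FrictionEvent` and `LabeledNegativeExample` shapes (neither is defined in the operative specs). It also shows the join yielding nothing when no purpose was stated, so the existing learning signal is unaffected in that case.

```ts
// Hypothetical shapes for illustration only.
interface FrictionEvent {
  run_id: string;
  module_id: string;
  kind: string; // e.g. "user_rejected_output"
}

interface StatedPurpose {
  text: string | null;
  confirmed_by_user: boolean;
}

interface LabeledNegativeExample {
  run_id: string;
  module_id: string;
  purpose_text: string;
  purpose_confirmed: boolean; // lets a consumer down-weight unconfirmed purposes
  friction_kind: string;
}

function labelFriction(
  event: FrictionEvent,
  purposesByModule: Record<string, StatedPurpose>,
): LabeledNegativeExample | null {
  const purpose = purposesByModule[event.module_id];
  // No stated purpose: emit no label; the raw friction event is still recorded elsewhere.
  if (!purpose || !purpose.text || purpose.text.trim() === "") return null;
  return {
    run_id: event.run_id,
    module_id: event.module_id,
    purpose_text: purpose.text.trim(),
    purpose_confirmed: purpose.confirmed_by_user,
    friction_kind: event.kind,
  };
}
```

Carrying `purpose_confirmed` through is one possible hook for weighting purpose by a quality signal instead of ingesting it raw.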

**Value: moderate as enrichment, genuinely good for failure labeling. Difficulty: LOW-MEDIUM** (the learning envelopes already carry evidence refs; add an optional purpose label).

**Blast-radius note.** "It's on the user" is fine for the *prompt* use (bad purpose → slightly worse output, this run only). For the *learning* use, a bad purpose pollutes the corpus that trains future task design — its effect outlives the run. Not a reason to childproof; a reason the learning consumers should **weight purpose by a consistency/quality signal** rather than ingesting it raw.

### §3.8 Recognition (chat) — entity resolution surfaces an existing task

User: "help me create a task that researches Supreme Court law." A task scoped to that already exists. Elnor: "you have one of those." This is **entity resolution, not search** — tasks are `world_entity` nodes, resolved via the existing DOC72 routing cascade (alias → FTS5 → vector → model-assisted). See §4.

**Value: HIGH for the "just tell Elnor" vision. Difficulty: LOW** — rides existing entity resolution; the only addition is that the task's card carries its purpose.

### §3.9 Discovery (chat) — Task Agent learns from past designs

The Task Agent, designing a new task, wants "how have I built research tasks before, which module purposes worked, what produced friction." This is **semantic search over the purpose/case corpus**, ranked by outcome. See §4. **This is the long-term, highest-ceiling use, gated on the learning/search layer (DOC85 / sidecar index).**

**Value: HIGH long-term; the auto-design vision depends on it. Difficulty: MEDIUM-HIGH.**

### §3.10 Minor / future uses

Preflight gap detection (coverage of `task.purpose` by module purposes at design time, §15); artifact naming/importance hints; self-documenting graphs for multi-window red-team carryover; drift detection (`TaskKnowledgeDrift` §8.16) when config drifts from a stable stated purpose.

---

## §4 — Access patterns: recognition vs discovery (the core architecture)

The two chat-surface jobs want different mechanisms. Conflating them is what made this feel tangled. **The cost to minimize is the number of LLM tool-call round-trips, not retrieval speed** (local FTS5/vector is sub-15ms per DOC72 §14.4; the cost is the LLM stopping, the round-trip, and resuming).

### §4.1 Recognition — zero tool calls, one compact card

DOC24 entity-resolves the user's request against existing tasks (rides the routing cascade — no LLM tool call) and injects **one compact task card** per high-confidence match.

```ts
interface TaskRecognitionCard {
  task_id: string;
  task_name: string;
  task_purpose_text: string | null;   // the task.purpose
  last_run_summary?: string;          // e.g. "ran 3 days ago, completed"
  module_count: number;
  drill_in_ref: StorageRef;           // points to full task; fetched only on explicit "look at it"
  schema_version: "1.0";
}
```

Module detail is **not** injected. "Elnor could look at it and advise" → "look at it" is a deliberate drill-down (one user-justified tool call, expected pause).

### §4.2 Discovery — one search, compact summaries, selective drill-down

The Task Agent issues **one** semantic search; results return **task/case-level purposes inline**, bounded by top-k (not by corpus size).

```ts
interface PurposeDiscoveryResultItem {
  task_or_case_id: string;
  name: string;
  purpose_text: string | null;        // task.purpose or TaskDesignCase intent
  outcome_summary: string;            // "worked well" / "had friction on X"
  friction_flags: string[];           // links to the negative-example store
  drill_in_ref: StorageRef;           // full module detail fetched only if drilled
  relevance: number;
  schema_version: "1.0";
}
```

The LLM triages the compact list **in-context** and fetches full module detail only for the one or two worth drilling into.
**One search round-trip, then maybe one drill-down — never a tool call per module, never a bulk module dump.**

### §4.3 The clean-context rule

**On the chat surface, the unit that ever enters context is the task/case-level purpose, never module-level detail.** Injecting the module detail of 50 search hits kills the clean response — that is the failure mode being designed against. The lightweight-purpose / heavy-detail split is exactly what prevents it: the purpose is the triage layer; module details sit behind a reference, fetched only on intent.

### §4.4 In-task consumers are different — and the only place the packet model applies

The Evaluator / Revisor / Task-Agent-reviewing-*this*-task operate over a **bounded set** (one task's ~10–20 modules) and are **latency-tolerant** (two minutes added to a task run is fine). For them, DOC24 pre-assembles the **sibling-module descriptors** (§5) into the context packet at assembly time — zero search, zero per-module calls. **This packet-descriptor approach is correct here and wrong for chat** (§4.3).

### §4.5 Division of labor

| Surface | Latency budget | Cleanliness budget | Unit | Mechanism |
|---|---|---|---|---|
| Chat (Elnor) — recognition | tight | high | one task card | entity resolution + inject |
| Chat (Elnor) — discovery | one round-trip OK | high | top-k purpose summaries | one search, inline purposes |
| In-task (Evaluator/Revisor) | minutes OK | n/a | sibling descriptors | pre-assembled packet |

---

## §5 — The module descriptor (the read-unit)

"Read the module purposes" really means "read the module **descriptors**." There is currently **no single static, per-instance module descriptor**: type-level docs exist (`ModuleConfigSchema` / `ConfigFieldDef` in the base spec; `TaskModuleCard` §8.9), and a per-activation runtime descriptor exists (`TaskModuleContextBasis` §13A.11; `EffectivePromptSnapshot` §12.3), but nothing static lets a human or the Evaluator understand "the full module" before it runs.
Proposal: a **static sibling of `TaskModuleContextBasis`** — the module design descriptor — anchored by purpose.

```ts
interface ModuleDesignDescriptor {
  module_id: string;
  module_type: string;
  purpose: ModulePurpose;               // §1.1 — the anchor
  instruction_summary: string | null;   // the "how", summarized
  io_contract_summary: string | null;
  role_in_task: string | null;          // derived: where this sits relative to siblings + task.purpose
  side_effect_class?: string;
  schema_version: "1.0";
}
```

This is what the Evaluator reads (bounded set, §4.4), what a drill-down returns, and what surfaces in the inspector. **Rationale:** the bare purpose string is not enough; the descriptor is the unit, and purpose is the field that makes the descriptor worth reading.

---

## §6 — Drift discipline (anti-phantom-wiring)

With `task.purpose` added, there are now up to five "why/what" surfaces: blueprint `business_or_personal_goal`, `StepIntent.purpose`, `task.purpose`, `module.purpose`, and the instruction itself. Without discipline this becomes phantom wiring. Rules:

1. **Config-level `task.purpose` / `module.purpose` are canonical at runtime** (1:1, ride the trace).
2. **Blueprint fields *seed* config fields** (§2.1) with a **visible diff**, never silent reconciliation. If a blueprint StepIntent and a module's config purpose diverge, surface it; do not auto-merge.
3. **Purpose-vs-instruction mismatch is a *signal*** (the Task Agent check, §3.5), not something to auto-resolve.
4. The instruction is the "how"; purpose is the "what/why." They are allowed to differ; the difference is information.

---

## §7 — Storage (declared as seams to the memory rebuild)

DOC72/DOC73/DOC1 are slated for partial supersession by the DOC80–DOC87 family (carryover §10). This proposal therefore specifies **contracts**, with the binding deferred to DOC80/DOC85.
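
Looking back at rule 2 of §6 (blueprint fields seed config fields, with a visible diff and no auto-merge), the behavior could look roughly like the following. The function name, the `SeedResult` union, and the plain string comparison are assumptions for illustration; `SeedablePurpose` is a trimmed, hypothetical stand-in for the `ModulePurpose` shape in §1.1.

```ts
// Hypothetical, trimmed stand-in for ModulePurpose (§1.1).
interface SeedablePurpose {
  text: string | null;
  authored_by: "user" | "task_agent" | "seeded_from_step_intent" | "nightly_inferred" | null;
  seeded_from_step_intent_id?: string;
  confirmed_by_user: boolean;
}

type SeedResult =
  | { kind: "seeded"; purpose: SeedablePurpose }
  | { kind: "unchanged"; purpose: SeedablePurpose }
  | { kind: "diverged"; purpose: SeedablePurpose; blueprint_text: string };

function seedFromStepIntent(
  current: SeedablePurpose,
  stepIntentId: string,
  stepIntentPurpose: string,
): SeedResult {
  const existing = current.text?.trim() ?? "";
  const incoming = stepIntentPurpose.trim();
  if (existing === "") {
    // Empty config purpose: seed it, unconfirmed until the user accepts or edits.
    return {
      kind: "seeded",
      purpose: {
        text: incoming,
        authored_by: "seeded_from_step_intent",
        seeded_from_step_intent_id: stepIntentId,
        confirmed_by_user: false,
      },
    };
  }
  if (existing === incoming) return { kind: "unchanged", purpose: current };
  // Divergence: keep the config value (canonical at runtime) and surface the diff.
  return { kind: "diverged", purpose: current, blueprint_text: incoming };
}
```

The `diverged` branch never overwrites the config purpose; it returns both texts so a UI can show the difference and let the user decide.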
### §7.1 Task purpose → resolvable entity property

`task.purpose` is a property of the task entity (today a `world_entity` node; tomorrow whatever the DOC80 family makes tasks). **Contract:** task purpose must be a resolvable property of the task entity so the recognition path (§4.1) rides entity resolution for free. **No new store.**

### §7.2 Module purposes → config + run records + case aggregation

Module purposes are **not** individual graph nodes. Promoting every module to a node explodes the graph with low-value entries (DOC72 §42 entropy / cleanup concern). They live in:

- `ModuleRecord.config` (static, §1.1);
- run snapshots (`EffectivePromptSnapshot`, `TaskRunStepRow`, `ModuleActivationReplayRecord` — §8);
- aggregated into `TaskDesignCase` (§9.6) at **case granularity** — the right unit for the graph.

### §7.3 Discovery search index → derived sidecar

The cross-corpus discovery search (§4.2) is a **derived sidecar index** (embeddings + LLM rerank) over the case corpus — the LlamaIndex-style sidecar role (invariant 15: sidecar retrieval provider, not canonical memory). The **canonical record stays graph-governed** (invariant 18: entity graph is a first-class durable store, not a rebuildable cache); the index is rebuildable and non-canonical.

### §7.4 Friction → purpose linkage

The negative-example store (§3.7) links friction/failure events to the responsible module's purpose at case granularity. **Contract:** purpose must be retrievable alongside outcome and friction for a given run/case, so the learning layer (DOC85) inherits labeled positive/negative examples when it lands.

**Rationale — capture must precede the consumer.** The active learning/search consumers are far off (DOC8 is undersized; DOC85 is in design; the Phase B audit has not run). But the **record and the friction→purpose linkage should be specified now**, because data has to accumulate *ahead* of the engine that reads it — otherwise DOC85 launches data-poor and learns from a cold start.
Gating the *consumer* behind DOC85 does not mean gating the *capture*.

---

## §8 — Snapshot / trace field additions

Add an optional `purpose` (or `module_purpose` + `task_purpose`) field to the runtime records so the why rides into the trace:

- `EffectivePromptSnapshot` (§12.3) — record the injected purpose text (so the inspector can show it, §3.2).
- `TaskRunStepRow` (§20C.3) — optional purpose for the inspector row.
- `ModuleActivationReplayRecord` (§20E.4) — purpose in the replay bundle.
- `TaskModuleContextBasis` (§13A.11) — purpose feeds relevance (§3.6) and seeds the static descriptor (§5).

All optional; absence = current behavior.

---

## §9 — Open questions for review

1. **`task.purpose` distinct field vs derived.** Proposal leans **distinct but auto-seeded** (so it can stand alone on hand-built tasks). Confirm.
2. **Evaluator coverage guard strength (§3.3).** Soft warning ("module purposes don't cover sub-goal Y") vs gate-capable? Proposal leans soft warning surfaced in Preflight + Run Inspector; not a hard block.
3. **Hand-built backfill via nightly lane (§2.2.3).** Worth it, or drop? Proposal leans optional/deferred.
4. **Descriptor artifact (§5).** Static sibling of `TaskModuleContextBasis`, or extend `TaskModuleContextBasis` itself with a static-mode flag? Naming TBD.
5. **Per-consumer authority-hedge wording (§3.2).** Exact injection-frame text per consumer needs drafting and testing.
6. **DOC80/DOC85 binding (§7).** Once the memory family lands, bind task-purpose-as-entity-property, the case corpus, and the friction linkage to the actual DOC80/DOC85 schemas.
7. **Does the field risk over-engineering?** Honest self-check requested from reviewers: is the near-term value (§3.1, §3.5, §3.6, §3.8) sufficient on its own to justify the field, independent of the speculative learning/search uses (§3.7, §3.9)? The proposal's position: **yes** — prompt-scoping and Task Agent review pay off immediately; learning/search are a bonus the substrate enables later.

---

## §10 — Review-prioritization summary (not build phases)

This addendum describes the **complete end-state** per the design philosophy that specs are complete end-state products and phasing comes later (Design Philosophy §9.1). The following is a *review* aid, not a build gate:

| Tier | Consumers | Available when | New machinery |
|---|---|---|---|
| Immediate | prompt-scoping (§3.1), Task Agent review (§3.5), record (§3.1/§8), recognition (§3.8), DOC24 relevance (§3.6) | the moment the field is populated | **none** beyond field + packet/snapshot additions |
| Wired-pipeline | Outcome Compiler/Evaluator (§3.3), Revisor (§3.4) | when Evaluator/Revisor is in the graph | small input additions + coverage guard |
| Learning/search | enrichment (§3.7), discovery/auto-design (§3.9) | when DOC85 / sidecar index lands | the sidecar search index (the only genuinely new layer) |

**Bottom line for the architect.** The capture is cheap and the high-value uses need no new layer. The only complicated piece — the discovery search index — is far off, gated on the memory rebuild, and optional to the near-term value. The honest risk is sparse population, mitigated by free seeding on Elnor-created tasks and by the immediate prompt-scoping payoff that gives users a reason to fill the field at all.

---

## §11 — Cross-doc obligations (to add to OP-A on absorption)

- **DOC23 base / Addenda B Core:** add `module.purpose` to `ModuleRecord.config`; add `task.purpose` to the task definition; add purpose to `TaskModuleContextBasis`, `EffectivePromptSnapshot`, `TaskRunStepRow`, `ModuleActivationReplayRecord`; add the `ModuleDesignDescriptor`; add the blueprint→config seeding/diff rule.
- **DOC23 Outcome Evaluator/Revisor:** Compiler consumes task.purpose (coverage guard) + module purposes; Revisor consumes module purpose for RepairTarget; define the authority hedge per consumer.
- **DOC24:** recognition card injection; the clean-context rule (§4.3); per-consumer hedge via DeliveryDirective `hedge_mode`; purpose as a relevance input.
- **DOC8 / DOC85 (seam):** purpose as enrichment label on learning signals; friction→purpose negative-example linkage; quality-weighting of purpose before ingestion.
- **DOC80 family (seam):** task-purpose-as-resolvable-entity-property; case corpus storage; the derived sidecar discovery index.
- **DOC20 / DOC21 / DOC22 (UI):** task-purpose editor in task settings; collapsible module-purpose line; purpose display in Run Inspector / prompt-context inspector; register any new content type/route.

---

*End of DOC23 Addenda C draft proposal. For red-team review across reviewers. Not operative; nothing implemented; storage/learning bindings are seams to the DOC80–DOC87 memory rebuild.*