ELNOR REPO READER TEXT MIRROR
Original path: Active Working and Red Team/DOC23 Working/DOC23 Non Operative Proposals/DOC23_ADDENDA_C_MODULE_AND_TASK_PURPOSE.md
Source repo: /Users/OpenClaw1/Elnor/Elnor Specs
Git branch: main
Git commit: dbaa25962edc11ab30e8d4ca1715f9ae5bf77331
Generated: 2026-06-09T01:23:58.539Z

---

# DOC23 Addenda C — Module and Task Purpose (Stated Intent Layer)

**Status:** DRAFT PROPOSAL — for red-team review. Not operative. Nothing implemented.
**Family:** DOC23 Task System (sibling to Addenda A — Task Optimization, Addenda B — Task Intelligence).
**Targets:** the operative DOC23 Addenda B Core, the DOC23 Outcome Evaluator/Revisor addendum, the base DOC23 Task System spec, and the DOC24 delivery seam. Authored against operative text per the post-absorption rule (invariant 24) — not a revision of any archived proposal.
**Memory-rebuild posture:** the durable-storage and learning/search bindings in this proposal are declared as **seams** to the DOC80–DOC87 family and DOC85 (Learning/Feedback), because DOC72/DOC73/DOC1 are slated for partial supersession (carryover §10). This addendum specifies *contracts*, not bindings to today's DOC72 internals.

---

## §0 — How to read this proposal

This document proposes one small, optional, user-authored field — a stated **purpose** — at two levels (module and task), plus the contracts for the systems that would consume it. The field itself is trivial. The value, the hazards, and the architecture of *who reads it and how* are where the substance is, so the thinking behind each decision is written inline under **Rationale** notes to make review faster.

The single most important design constraint, repeated throughout: **the field is optional and will be sparsely populated.** Any use that *requires* purpose to be present is fragile; any use that *opportunistically benefits when present and degrades to current behavior when absent* is robust. Every consumer in this proposal is designed to degrade to today's behavior on an empty field.

### §0.1 Origin

This proposal came out of a handover red-team finding against the Addenda B set: an inheritor of a half-finished task run can mechanically reconstruct *what ran* but not *why the ordinary agent modules and the human chose what they chose*. The Outcome Evaluator/Revisor pipeline has rich decision-rationale capture (AssuranceBasis, HardCallResolutionLedger, FailureKind, progress signals); ordinary modules (`step.agent_task`, `step.red_team`, generic LLM modules) record prompt + output + policy decisions and **no structured "why."** That asymmetry is the gap this proposal addresses, although, as the analysis below shows, the record use turned out to be the least of the value.

### §0.2 What already half-exists (so this is mostly wiring, not net-new capture)

The "why" already exists in three disconnected forms:

1. **`StepIntent.purpose`** (Addenda B Core §6.4) — `label`, `purpose` (required), `why_it_exists?`, `design_rationale?`, `alternatives_considered?`. But it lives at the **blueprint** layer, binds to module **regions** (§6.5, one-to-many), only exists for tasks that went through a design flow, and **never propagates into the runtime trace.**
2. **`ModuleRecord.notes`** (user scratchpad) and **`ModuleRecord.description`** (catalog listing) at the config layer (base Task System §7.1.2) — present, but semantically muddy and not snapshotted as intent.
3. The module's prompt and output at runtime — but that is the *result*, not the *intent*.

**Rationale.** What is missing is precisely the layer this proposal adds: a **config-layer field with stable "purpose" semantics that propagates into runtime snapshots.** That layer is the only one that is (a) 1:1 with the executing module, (b) present even on hand-built blueprint-less tasks, and (c) snapshot-ready. Because the schema pattern (`StepIntent.purpose`) and the storage/snapshot plumbing (`notes`/`description` already ride `ModuleRecord`) exist, *capture* is cheap. The design problem is not capture — it is **population** (without nagging) and **drift** (between layers of "why").

### §0.3 The three-layer framing

| Layer | Where | Scope | Lifecycle | Status |
|---|---|---|---|---|
| Design intent | `StepIntent.purpose` (§6.4) | blueprint, region-bound (1:many) | design-time, conditional (only if a blueprint exists) | exists |
| **Config intent** | **`module.purpose` / `task.purpose`** | **module (1:1) / task (1:1)** | **rides the actual graph node into every run** | **PROPOSED** |
| Runtime | trace/snapshots | per activation | inherits config intent at run | proposed (field additions) |

`StepIntent.purpose` *seeds* config intent when a blueprint exists; config intent is **canonical at runtime** (see §6 drift discipline).

---

## §1 — The fields

### §1.1 `module.purpose`

An optional, short, user-stated (or seeded) statement of **what this module is for**, distinct from its instruction (the *how*).

```ts
// Addition to ModuleRecord.config (base Task System §7.1.2)
interface ModulePurpose {
  text: string | null;                 // e.g. "Extract loss-causation facts with record cites; do not draft argument."
  authored_by: "user" | "task_agent" | "seeded_from_step_intent" | "nightly_inferred" | null;
  seeded_from_step_intent_id?: string; // provenance when seeded (§6)
  confirmed_by_user: boolean;          // false if proposed/inferred and not yet confirmed
  updated_at?: string;
  schema_version: "1.0";
}
```

`module.purpose` is **never required**. Empty is a first-class state and degrades every consumer to current behavior.
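
**Illustrative sketch (non-normative).** One way a consumer could honor the empty-is-first-class rule is to normalize the field through a single helper before using it. The helper name `usablePurposeText` is hypothetical, and treating whitespace-only text as empty is an assumption of this sketch, not something the proposal specifies.

```typescript
// Compact copy of the ModulePurpose shape from §1.1, repeated so the sketch stands alone.
interface ModulePurpose {
  text: string | null;
  authored_by: "user" | "task_agent" | "seeded_from_step_intent" | "nightly_inferred" | null;
  seeded_from_step_intent_id?: string;
  confirmed_by_user: boolean;
  updated_at?: string;
  schema_version: "1.0";
}

// Hypothetical helper: returns trimmed purpose text, or null when the field is
// absent, null, or whitespace-only (the whitespace rule is an assumption).
function usablePurposeText(purpose: ModulePurpose | undefined): string | null {
  const trimmed = (purpose?.text ?? "").trim();
  return trimmed.length > 0 ? trimmed : null;
}
```

A consumer that receives `null` here would simply take its current code path, which is the degrade-to-today behavior the proposal requires.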

### §1.2 `task.purpose`

A task-level **singleton**: one statement of what the whole task is for. Authored once; *referenced* (not copied) wherever needed.

```ts
// Addition to the task definition (TaskRecord / TaskBlueprint vicinity)
interface TaskPurpose {
  text: string | null;                 // e.g. "Produce a filing-ready securities complaint for matter X."
  authored_by: "user" | "originating_request" | "seeded_from_blueprint_goal" | "task_agent" | null;
  originating_request_ref?: string;    // §5A.4 semantic invocation memory, when Elnor-created
  confirmed_by_user: boolean;
  updated_at?: string;
  schema_version: "1.0";
}
```

**Rationale — why both levels, and why they are not redundant.** The task purpose is the **independent frame** ("what the whole thing is for"); the module purposes are the **operational decomposition** ("how we are trying to achieve it"). The value is the *tension* between them: a consumer comparing module-purpose coverage against the task purpose catches both **over-spec** (redundant modules — duplicate purposes) and **under-spec** (a dimension of the goal that no module covers). This is the checksum that makes the Evaluator use safe (§3.3) and the redundancy-detection use possible (§3.5). They serve different surfaces, too (§4): **task purpose serves the chat front-door; module purposes serve the in-task consumers and drill-down.**

---

## §2 — Population (no synchronous inference calls)

This is the make-or-break section, because sparse population kills the aggregate uses.

### §2.1 Sources, ranked by cost

| Source | Cost | Applies to |
|---|---|---|
| Originating chat request | **free** — rides an LLM turn already happening | Elnor-created tasks (task.purpose) |
| `StepIntent.purpose` (§6.4) | **free** — Task Agent already authors it during design | tasks with a blueprint (module.purpose) |
| Blueprint `business_or_personal_goal` (§6.3) | **free** — already authored | tasks with a blueprint (task.purpose) |
| User-typed | user effort | hand-built tasks |
| Nightly extraction (§10.8.2) | cheap, **off hot-path, post-hoc** | optional backfill for hand-built/un-purposed |

### §2.2 Rules

1. **Elnor-created tasks → free.** The originating utterance ("set up a task to research Supreme Court law") was already processed; the Task Agent is already authoring `StepIntent.purpose`. Capturing `task.purpose` from the originating request (§5A.4 already records it) and seeding `module.purpose` from the StepIntents the Task Agent is already writing is a **byproduct of work already done — no incremental LLM call.**
2. **Hand-built tasks → user-typed or empty.** There is no free source (no originating utterance, no blueprint). **No synchronous, at-edit-time inference call** — that is the bad pattern (slow, on the hot path, per-module). Empty degrades fine, and hand-built is exactly the case where the user is most hands-on and best able to state intent.
3. **Optional backfill via nightly lane.** IF inferred purposes for hand-built tasks are ever wanted, the home is the nightly extraction lane (§10.8.2): post-hoc, off the hot path, from the run's actual prompt+output, surfaced as a candidate the user confirms. This **cannot pre-fill before first run** and may not be worth it — flagged as an open question (§9).
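
**Illustrative sketch (non-normative).** The three rules above can be read as a source-selection function for `task.purpose`. Everything named here (`PopulationInputs`, `seedTaskPurpose`, the field names) is hypothetical, and the precedence order (user-typed first, then originating request, then blueprint goal) is an assumption of the sketch. It also assumes the purpose text was already captured as a byproduct of the existing LLM turn; the function itself makes no inference call.

```typescript
type TaskPurposeSource = "user" | "originating_request" | "seeded_from_blueprint_goal" | null;

// Hypothetical inputs: whichever free or user-supplied sources exist for this task.
interface PopulationInputs {
  purposeFromOriginatingRequest?: string; // Elnor-created tasks only
  blueprintGoalText?: string;             // business_or_personal_goal, if a blueprint exists
  userTypedText?: string;                 // hand-typed by the user, if anything
}

interface SeededTaskPurpose {
  text: string | null;
  authored_by: TaskPurposeSource;
  confirmed_by_user: boolean;
}

function seedTaskPurpose(inputs: PopulationInputs): SeededTaskPurpose {
  const clean = (s?: string): string | null => {
    const trimmed = (s ?? "").trim();
    return trimmed.length > 0 ? trimmed : null;
  };
  const typed = clean(inputs.userTypedText);
  if (typed !== null) return { text: typed, authored_by: "user", confirmed_by_user: true };
  const fromRequest = clean(inputs.purposeFromOriginatingRequest);
  if (fromRequest !== null) {
    // Seeded for free; left unconfirmed so the soft confirm-or-edit affordance can show.
    return { text: fromRequest, authored_by: "originating_request", confirmed_by_user: false };
  }
  const fromGoal = clean(inputs.blueprintGoalText);
  if (fromGoal !== null) {
    return { text: fromGoal, authored_by: "seeded_from_blueprint_goal", confirmed_by_user: false };
  }
  // Hand-built with nothing typed: stay empty. No synchronous inference happens here.
  return { text: null, authored_by: null, confirmed_by_user: false };
}
```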

**Rationale.** An earlier version of this design assumed "inferred-and-proposed pre-fill" was universal. It is not. Pre-fill is free only where an LLM turn is already happening (Elnor-created). Forcing inference on hand-built tasks would mean a dedicated call the architect explicitly does not want. The honest model is: free where free, user-typed otherwise, empty is fine.

### §2.3 UX

- **`task.purpose`** lives at the **task level** (task settings / task header), edited in one canonical place, shown read-only elsewhere, **referenced not copied** into module prompts. The user never types it per-module.
- **`module.purpose`**, if surfaced in the module config panel, is a **single optional, collapsible, pre-filled line** — never an always-on field, never typed twenty times across twenty modules.
- **Soft confirm, never hard gate.** Where a purpose can be pre-filled for free (Elnor-created), a dismissible confirm-or-edit affordance may appear at the natural moment (first run / save). A hard "cannot run until you define a purpose" block is **rejected** — it violates the no-modal-pressure / inference-and-suggest UI preferences.

**Rationale — the busy-panel constraint.** The module config panel is already the densest UI surface. This proposal must not add to that clutter. Task purpose therefore does **not** appear in the module config panel at all (it is task-level). Module purpose is one collapsible pre-filled line, mostly living *around* the modules (prompt header, inspector), not *inside* each config form.

---

## §3 — Consumers (value · difficulty · thinking)

Each consumer is graded for review prioritization. None gate the field's existence; all degrade to current behavior on empty.

### §3.1 Self-scoping the module's own prompt — *the day-one payoff*

Inject, at the top of each module's prompt, both frames: the **task purpose** (referenced from the singleton) and **this module's role**.

```text
Task goal: Produce a filing-ready securities complaint for matter X.
Your role here: Extract loss-causation facts with record cites. Do not draft argument.

[module instruction follows]
```

**Value: HIGH and immediate. Difficulty: TRIVIAL.** This pays off the *instant* the field is populated, with zero dependency on the Revisor, learning, or search. The dual framing suppresses the failure where a module over-reaches (drafts when it should extract) because it did not know a later module owns the next step.

**Rationale — why this matters disproportionately.** Adoption is the gate on every aggregate use. A field that only helps some future learning system stays empty; a field that improves *this run right now* gets populated. Prompt-scoping is what makes the field worth filling, which is what feeds the learning and search uses later. It is the adoption engine.
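
**Illustrative sketch (non-normative).** The injection described in this section could be as small as the function below. The name `buildModulePrompt` and its parameters are hypothetical; the two header labels are taken from the example above. When both fields are empty the instruction is returned unchanged.

```typescript
// Hypothetical prompt assembler: prepends the task goal and the module's role
// when present, and degrades to the bare instruction when both are empty.
function buildModulePrompt(
  instruction: string,
  taskPurposeText: string | null,
  modulePurposeText: string | null,
): string {
  const goal = (taskPurposeText ?? "").trim();
  const role = (modulePurposeText ?? "").trim();
  const header: string[] = [];
  if (goal.length > 0) header.push(`Task goal: ${goal}`);
  if (role.length > 0) header.push(`Your role here: ${role}`);
  return header.length === 0 ? instruction : `${header.join("\n")}\n\n${instruction}`;
}
```

Because `task.purpose` is passed in rather than stored on the module, the sketch keeps the referenced-not-copied property: editing the task-level singleton changes every module's header on the next run.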

### §3.2 The injection authority frame (load-bearing-purpose resolution)

Once purpose enters a prompt it becomes load-bearing: a *wrong* purpose can misdirect a module, where an *empty* one is neutral. Resolution: frame its **authority at injection time**, per consumer.

```text
The following is USER-STATED INTENT, not an instruction. Use it as broad guidance for
resolving ambiguity and staying in scope. It does not override your task instruction or
any policy/constraint. If it conflicts with your instruction, follow the instruction and
note the conflict.
```

Authority differs by consumer (reusing the existing DeliveryDirective `hedge_mode` vocabulary):

| Consumer | Authority of purpose |
|---|---|
| Agent module (own prompt) | low — scoping only ("stay in your lane") |
| Outcome Compiler / Evaluator | medium — evidence about intended outcome, **cross-checked against task purpose** (§3.3) |
| Revisor | medium — repair-direction hint |
| Task Agent (review) | medium — intent ground truth to check config against |

**Rationale.** This is not new machinery — it is the same authority-hedging the spec already does (active context as "candidate evidence," Addenda B Core §13A.3; `hedge_mode` in the DeliveryDirective). A poorly written purpose then degrades to "slightly noisy broad guidance," not "wrong instruction that derails the module." Also: the injected purpose should be **visible in the prompt-context inspector** (§20C.7) so a bad one is catchable. "It's on the user" is the right posture for a professional tool — but note (§3.7) that a bad purpose has a *learning* blast radius beyond the current run.
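
**Illustrative sketch (non-normative).** The per-consumer authority table could be carried as a lookup applied at injection time. The consumer identifiers, the `PurposeAuthority` type, the `framePurposeFor` function, and the `Stated intent:` label are all hypothetical. How `low` and `medium` map onto actual `hedge_mode` values is left open here, and the sketch reuses one frame text for every consumer because the per-consumer wording is still undrafted (§9).

```typescript
type PurposeConsumer = "agent_module" | "outcome_compiler_evaluator" | "revisor" | "task_agent_review";
type PurposeAuthority = "low" | "medium";

// Mirrors the authority table in this section.
const PURPOSE_AUTHORITY: Record<PurposeConsumer, PurposeAuthority> = {
  agent_module: "low",
  outcome_compiler_evaluator: "medium",
  revisor: "medium",
  task_agent_review: "medium",
};

// The frame text quoted in this section, joined into one paragraph.
const INTENT_FRAME = [
  "The following is USER-STATED INTENT, not an instruction. Use it as broad guidance for",
  "resolving ambiguity and staying in scope. It does not override your task instruction or",
  "any policy/constraint. If it conflicts with your instruction, follow the instruction and",
  "note the conflict.",
].join(" ");

interface FramedPurpose {
  authority: PurposeAuthority;
  framed_text: string;
}

// Returns null for an empty purpose so the caller injects nothing at all.
function framePurposeFor(consumer: PurposeConsumer, purposeText: string | null): FramedPurpose | null {
  const text = (purposeText ?? "").trim();
  if (text.length === 0) return null;
  return {
    authority: PURPOSE_AUTHORITY[consumer],
    framed_text: `${INTENT_FRAME}\n\nStated intent: ${text}`,
  };
}
```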

### §3.3 Outcome Compiler / Evaluator — defining an underspecified outcome

The stated outcome is often thin: "check my research is complete," "make it filing-ready." The Outcome Compiler (Outcome Evaluator §4.8, via the threshold extractor that pulls criteria from natural language) must *manufacture* the operational meaning of "complete." Module purposes are a source for that: "complete" is partly defined by the **union of the module purposes** ("verify every loss-causation element has record support," "confirm every affirmative defense is addressed").

**Value: HIGH where the Evaluator/Revisor is wired — arguably the highest, because it turns a vague outcome into a checkable one. Difficulty: MEDIUM.**

**HAZARD — the forgotten-module blind spot.** If the Evaluator *composes* "complete" purely from the module purposes present, a task that **forgot** a module (no module checks affirmative defenses) produces an Evaluator that also does not know to check affirmative defenses. The gap is invisible: the task cannot fail a check it never knew to run.

**Guard.** The **task purpose must remain an independent frame**, not be derived from the modules. The Compiler decomposes `task.purpose` into expected dimensions and **checks coverage** of module purposes against it, flagging dimensions present in the goal but absent from the modules. Injection framing handles "interpret what is present"; the coverage check handles "notice what is absent." Instruction alone cannot conjure a missing dimension — this is structural, not a wording fix.
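
**Illustrative sketch (non-normative).** The structural part of the guard is a set difference. In the proposal, decomposing `task.purpose` into dimensions and mapping module purposes onto those dimensions is LLM work; this sketch takes both as already-computed inputs and only shows the "notice what is absent" step. `ModuleCoverageInput` and `findUncoveredDimensions` are hypothetical names.

```typescript
// Hypothetical input: the dimensions a module's purpose was judged to cover.
interface ModuleCoverageInput {
  module_id: string;
  covered_dimensions: string[];
}

// Returns the expected dimensions (from task.purpose) that no module covers.
function findUncoveredDimensions(
  expectedDimensions: string[],
  modules: ModuleCoverageInput[],
): string[] {
  const norm = (s: string): string => s.trim().toLowerCase();
  const covered = new Set<string>();
  for (const m of modules) {
    for (const d of m.covered_dimensions) covered.add(norm(d));
  }
  return expectedDimensions.filter((d) => !covered.has(norm(d)));
}
```

With expected dimensions `["loss causation", "affirmative defenses"]` and a single module covering loss causation, the function reports `affirmative defenses` as uncovered, which is the forgotten-module case. A caller would presumably skip the check when no module carries a purpose, so an unpopulated task produces no warnings.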

### §3.4 Revisor — RepairTarget / strategy targeting

When a finding says "this module's output is wrong," the Revisor must decide *what "fixed" means for this module* and pick a `RepairTarget` / `RepairStrategyKind`. A stated purpose disambiguates the failure mode and repair direction, and sharpens the `still_failing_same_reason` vs `failing_for_new_reason` distinction (Outcome Evaluator §5.5.4): failing to serve the stated purpose is a cleaner "same reason" signal than re-deriving intent from the artifact.

**Value: HIGH where the Revisor is wired. Difficulty: MEDIUM.** The Revisor already consumes artifact + findings; add the source module's purpose. It treats purpose as a **hint, never a contract** (per §3.2).

### §3.5 Task Agent review — mismatch and redundancy detection

The Task Agent reviews user-built graphs ("Check wiring," "Suggest missing steps," Addenda B Core §16A.1) by *inferring* what each module is for. A stated purpose gives ground truth to check **config against intent** ("purpose is fact extraction but the instruction also asks for drafting — mismatch") and to flag **duplicate purposes** ("two modules share the same stated purpose — possible redundancy").

**Value: HIGH — best value/effort ratio of the set, and it directly serves the architect's stated weak stroke (spotting redundant, low-yield material). Difficulty: LOW** — the Task Agent already reads module config (Addenda B Core §4.5); the checks are cheap LLM comparisons.
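
**Illustrative sketch (non-normative).** The proposal describes the redundancy check as a cheap LLM comparison. As a stand-in, the sketch below groups modules by a normalized purpose string, which only catches near-verbatim duplicates; semantic duplicates would still need the LLM pass. `normalizePurpose` and `findDuplicatePurposes` are hypothetical names.

```typescript
// Lowercase, replace punctuation with spaces, collapse whitespace.
function normalizePurpose(text: string): string {
  return text.toLowerCase().replace(/[^a-z0-9\s]/g, " ").replace(/\s+/g, " ").trim();
}

// Returns groups of module ids whose stated purposes normalize to the same string.
// Modules with an empty purpose are ignored, so an unpopulated task yields no flags.
function findDuplicatePurposes(
  modules: { module_id: string; purpose_text: string | null }[],
): string[][] {
  const groups = new Map<string, string[]>();
  for (const m of modules) {
    if (m.purpose_text === null) continue;
    const key = normalizePurpose(m.purpose_text);
    if (key.length === 0) continue;
    const ids = groups.get(key) ?? [];
    ids.push(m.module_id);
    groups.set(key, ids);
  }
  return Array.from(groups.values()).filter((ids) => ids.length > 1);
}
```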

### §3.6 DOC24 context relevance

A purpose is a clean, cheap relevance signal for what context a module needs. DOC24 assembles the module packet from instruction + resolved entities (`TaskModuleContextBasis`, Addenda B Core §13A.11); "purpose: extract loss-causation facts" tells it to prioritize loss-causation memories and procedures.

**Value: MEDIUM-HIGH, and the most *frequently* exercised — every module activation, not just Evaluator/Revisor-wired ones. Difficulty: LOW** (one more field in the context basis).

### §3.7 Learning enrichment — weak as a feature, good for failure labeling

As a raw learning *feature*, an optional free-text field is noisy: sparse, high-cardinality, inconsistently phrased. **Do not sell it as a primary learning driver.** Two reframes rescue it:

1. As an **intent/grouping label** enriching signals already collected (`TaskInvocationLearningSignal` §9A, `TaskAgentDesignLearningSignal` §9.5): "modules whose purpose is X succeed at rate R" beats bucketing by module-type.
2. **Purpose + a friction/failure event = a labeled negative example.** "A module meant to do X, configured this way, produced friction" is precisely the high-signal, low-volume datum that stays useful even sparse.

**Value: moderate as enrichment, genuinely good for failure labeling. Difficulty: LOW-MEDIUM** (the learning envelopes already carry evidence refs; add an optional purpose label).

**Blast-radius note.** "It's on the user" is fine for the *prompt* use (bad purpose → slightly worse output, this run only). For the *learning* use, a bad purpose pollutes the corpus that trains future task design — its effect outlives the run. Not a reason to childproof; a reason the learning consumers should **weight purpose by a consistency/quality signal** rather than ingesting it raw.
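
**Illustrative sketch (non-normative).** The second reframe (purpose plus a friction event yields a labeled negative example) and the weighting advice could combine as below. All names are hypothetical. Using `confirmed_by_user` as the quality signal, and the weights `1.0` and `0.5`, are placeholders chosen for the sketch; the proposal only says purpose should be weighted by some consistency/quality signal.

```typescript
// Hypothetical friction event shape; the friction_kind vocabulary is not defined here.
interface FrictionEvent {
  run_id: string;
  module_id: string;
  friction_kind: string;
}

interface PurposeLabeledNegativeExample {
  run_id: string;
  module_id: string;
  purpose_text: string;
  friction_kind: string;
  weight: number; // placeholder quality weight, not a value from the proposal
}

// Returns null when there is no purpose: the event stays unlabeled, as today.
function labelFrictionEvent(
  frictionEvent: FrictionEvent,
  purposeText: string | null,
  confirmedByUser: boolean,
): PurposeLabeledNegativeExample | null {
  const text = (purposeText ?? "").trim();
  if (text.length === 0) return null;
  return {
    run_id: frictionEvent.run_id,
    module_id: frictionEvent.module_id,
    purpose_text: text,
    friction_kind: frictionEvent.friction_kind,
    weight: confirmedByUser ? 1.0 : 0.5,
  };
}
```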

### §3.8 Recognition (chat) — entity resolution surfaces an existing task

User: "help me create a task that researches Supreme Court law." A task scoped to that already exists. Elnor: "you have one of those." This is **entity resolution, not search**: tasks are `world_entity` nodes, resolved via the existing DOC72 routing cascade (alias → FTS5 → vector → model-assisted). See §4.

**Value: HIGH for the "just tell Elnor" vision. Difficulty: LOW** — rides existing entity resolution; the only addition is that the task's card carries its purpose.

### §3.9 Discovery (chat) — Task Agent learns from past designs

The Task Agent, designing a new task, wants "how have I built research tasks before, which module purposes worked, what produced friction." This is **semantic search over the purpose/case corpus**, ranked by outcome. See §4. **This is the long-term, highest-ceiling use, gated on the learning/search layer (DOC85 / sidecar index).**

**Value: HIGH long-term; the auto-design vision depends on it. Difficulty: MEDIUM-HIGH.**

### §3.10 Minor / future uses

Preflight gap detection (coverage of `task.purpose` by module purposes at design time, §15); artifact naming/importance hints; self-documenting graphs for multi-window red-team carryover; drift detection (`TaskKnowledgeDrift` §8.16) when config drifts from a stable stated purpose.

---

## §4 — Access patterns: recognition vs discovery (the core architecture)

The two chat-surface jobs want different mechanisms. Conflating them is what made this feel tangled. **The cost to minimize is the number of LLM tool-call round-trips, not retrieval speed** (local FTS5/vector is sub-15ms per DOC72 §14.4; the cost is the LLM stopping, the round-trip, and resuming).

### §4.1 Recognition — zero tool calls, one compact card

DOC24 entity-resolves the user's request against existing tasks (rides the routing cascade — no LLM tool call) and injects **one compact task card** per high-confidence match.

```ts
interface TaskRecognitionCard {
  task_id: string;
  task_name: string;
  task_purpose_text: string | null;   // the task.purpose
  last_run_summary?: string;          // e.g. "ran 3 days ago, completed"
  module_count: number;
  drill_in_ref: StorageRef;           // points to full task; fetched only on explicit "look at it"
  schema_version: "1.0";
}
```

Module detail is **not** injected. If Elnor offers to "look at it and advise," the "look at it" step is a deliberate drill-down: one user-justified tool call, with an expected pause.
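
**Illustrative sketch (non-normative).** Card construction from entity-resolution output might look like the following. `ResolvedTaskMatch`, its `confidence` score, the `0.8` threshold, and the `task:<id>` reference format are all assumptions of the sketch, and `StorageRef` is aliased to `string` only because its real shape is not defined in this addendum.

```typescript
type StorageRef = string; // assumption: real StorageRef shape is defined elsewhere

interface TaskRecognitionCard {
  task_id: string;
  task_name: string;
  task_purpose_text: string | null;
  last_run_summary?: string;
  module_count: number;
  drill_in_ref: StorageRef;
  schema_version: "1.0";
}

// Hypothetical output of the existing entity-resolution cascade.
interface ResolvedTaskMatch {
  task_id: string;
  task_name: string;
  task_purpose_text: string | null;
  last_run_summary?: string;
  module_count: number;
  confidence: number; // 0..1, hypothetical resolver score
}

// One compact card per high-confidence match; module detail is never copied in.
function toRecognitionCards(matches: ResolvedTaskMatch[], minConfidence = 0.8): TaskRecognitionCard[] {
  return matches
    .filter((m) => m.confidence >= minConfidence)
    .map((m): TaskRecognitionCard => ({
      task_id: m.task_id,
      task_name: m.task_name,
      task_purpose_text: m.task_purpose_text,
      ...(m.last_run_summary !== undefined ? { last_run_summary: m.last_run_summary } : {}),
      module_count: m.module_count,
      drill_in_ref: `task:${m.task_id}`, // hypothetical reference format
      schema_version: "1.0",
    }));
}
```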

### §4.2 Discovery — one search, compact summaries, selective drill-down

The Task Agent issues **one** semantic search; results return **task/case-level purposes inline**, bounded by top-k (not by corpus size).

```ts
interface PurposeDiscoveryResultItem {
  task_or_case_id: string;
  name: string;
  purpose_text: string | null;        // task.purpose or TaskDesignCase intent
  outcome_summary: string;            // "worked well" / "had friction on X"
  friction_flags: string[];           // links to the negative-example store
  drill_in_ref: StorageRef;           // full module detail fetched only if drilled
  relevance: number;
  schema_version: "1.0";
}
```

The LLM triages the compact list **in-context** and fetches full module detail only for the one or two worth drilling into. **One search round-trip, then maybe one drill-down — never a tool call per module, never a bulk module dump.**
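
**Illustrative sketch (non-normative).** In the proposal the LLM performs the triage in-context. The function below is a numeric stand-in that only demonstrates the bound: however many results come back, at most a small fixed number of `drill_in_ref` values are ever fetched. `pickDrillIns`, the default of two drill-ins, and the `0.5` relevance floor are assumptions; `StorageRef` is aliased to `string` for the sketch.

```typescript
type StorageRef = string; // assumption: real StorageRef shape is defined elsewhere

interface PurposeDiscoveryResultItem {
  task_or_case_id: string;
  name: string;
  purpose_text: string | null;
  outcome_summary: string;
  friction_flags: string[];
  drill_in_ref: StorageRef;
  relevance: number;
  schema_version: "1.0";
}

// Picks at most maxDrillIns references, highest relevance first.
// filter() copies the array, so the caller's result list is not reordered.
function pickDrillIns(
  results: PurposeDiscoveryResultItem[],
  maxDrillIns = 2,
  minRelevance = 0.5,
): StorageRef[] {
  return results
    .filter((r) => r.relevance >= minRelevance)
    .sort((a, b) => b.relevance - a.relevance)
    .slice(0, maxDrillIns)
    .map((r) => r.drill_in_ref);
}
```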

### §4.3 The clean-context rule

**On the chat surface, the unit that ever enters context is the task/case-level purpose, never module-level detail.** Injecting the module detail of 50 search hits kills the clean response — that is the failure mode being designed against. The lightweight-purpose / heavy-detail split is exactly what prevents it: the purpose is the triage layer; module details sit behind a reference, fetched only on intent.

### §4.4 In-task consumers are different — and the only place the packet model applies

The Evaluator / Revisor / Task-Agent-reviewing-*this*-task operate over a **bounded set** (one task's ~10–20 modules) and are **latency-tolerant** (two minutes added to a task run is fine). For them, DOC24 pre-assembles the **sibling-module descriptors** (§5) into the context packet at assembly time — zero search, zero per-module calls. **This packet-descriptor approach is correct here and wrong for chat** (§4.3).

### §4.5 Division of labor

| Surface | Latency budget | Cleanliness budget | Unit | Mechanism |
|---|---|---|---|---|
| Chat (Elnor) — recognition | tight | high | one task card | entity resolution + inject |
| Chat (Elnor) — discovery | one round-trip OK | high | top-k purpose summaries | one search, inline purposes |
| In-task (Evaluator/Revisor) | minutes OK | n/a | sibling descriptors | pre-assembled packet |

---

## §5 — The module descriptor (the read-unit)

"Read the module purposes" really means "read the module **descriptors**." There is currently **no single static, per-instance module descriptor**: type-level docs exist (`ModuleConfigSchema` / `ConfigFieldDef` in the base spec; `TaskModuleCard` §8.9), and a per-activation runtime descriptor exists (`TaskModuleContextBasis` §13A.11; `EffectivePromptSnapshot` §12.3), but nothing static lets a human or the Evaluator understand "the full module" before it runs.

Proposal: a **static sibling of `TaskModuleContextBasis`** — the module design descriptor — anchored by purpose.

```ts
interface ModuleDesignDescriptor {
  module_id: string;
  module_type: string;
  purpose: ModulePurpose;              // §1.1 — the anchor
  instruction_summary: string | null;  // the "how", summarized
  io_contract_summary: string | null;
  role_in_task: string | null;         // derived: where this sits relative to siblings + task.purpose
  side_effect_class?: string;
  schema_version: "1.0";
}
```

This is what the Evaluator reads (bounded set, §4.4), what a drill-down returns, and what surfaces in the inspector. **Rationale:** the bare purpose string is not enough; the descriptor is the unit, and purpose is the field that makes the descriptor worth reading.
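
**Illustrative sketch (non-normative).** Assembling the descriptor from config might look like this. `ModuleConfigInput` and `toDesignDescriptor` are hypothetical. Truncating the instruction is only a placeholder for real summarization, and `io_contract_summary` and `role_in_task` are left `null` because the proposal does not yet say how they are derived.

```typescript
interface ModulePurpose {
  text: string | null;
  authored_by: "user" | "task_agent" | "seeded_from_step_intent" | "nightly_inferred" | null;
  seeded_from_step_intent_id?: string;
  confirmed_by_user: boolean;
  updated_at?: string;
  schema_version: "1.0";
}

interface ModuleDesignDescriptor {
  module_id: string;
  module_type: string;
  purpose: ModulePurpose;
  instruction_summary: string | null;
  io_contract_summary: string | null;
  role_in_task: string | null;
  side_effect_class?: string;
  schema_version: "1.0";
}

// Hypothetical slice of a module's config, enough to build a descriptor.
interface ModuleConfigInput {
  module_id: string;
  module_type: string;
  purpose?: ModulePurpose;
  instruction: string;
}

const EMPTY_PURPOSE: ModulePurpose = {
  text: null, authored_by: null, confirmed_by_user: false, schema_version: "1.0",
};

function toDesignDescriptor(input: ModuleConfigInput, maxSummaryChars = 120): ModuleDesignDescriptor {
  const instruction = input.instruction.trim();
  return {
    module_id: input.module_id,
    module_type: input.module_type,
    purpose: input.purpose ?? { ...EMPTY_PURPOSE }, // empty purpose is a valid descriptor state
    instruction_summary: instruction.length === 0 ? null : instruction.slice(0, maxSummaryChars),
    io_contract_summary: null, // derivation not specified in this addendum
    role_in_task: null,        // derivation not specified in this addendum
    schema_version: "1.0",
  };
}
```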

---

## §6 — Drift discipline (anti-phantom-wiring)

With `task.purpose` added, there are now up to five "why/what" surfaces: blueprint `business_or_personal_goal`, `StepIntent.purpose`, `task.purpose`, `module.purpose`, and the instruction itself. Without discipline this becomes phantom wiring.

Rules:

1. **Config-level `task.purpose` / `module.purpose` are canonical at runtime** (1:1, ride the trace).
2. **Blueprint fields *seed* config fields** (§2.1) with a **visible diff**, never silent reconciliation. If a blueprint StepIntent and a module's config purpose diverge, surface it; do not auto-merge.
3. **Purpose-vs-instruction mismatch is a *signal*** (the Task Agent check, §3.5), not something to auto-resolve.
4. The instruction is the "how"; purpose is the "what/why." They are allowed to differ; the difference is information.
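
**Illustrative sketch (non-normative).** Rule 2 (seed with a visible diff, never auto-merge) can be expressed as a function whose result type has no "merged" case. `SeedOutcome` and `seedFromStepIntent` are hypothetical names, and comparing trimmed strings for equality is a simplification of whatever divergence test the real implementation would use.

```typescript
interface ModulePurpose {
  text: string | null;
  authored_by: "user" | "task_agent" | "seeded_from_step_intent" | "nightly_inferred" | null;
  seeded_from_step_intent_id?: string;
  confirmed_by_user: boolean;
  updated_at?: string;
  schema_version: "1.0";
}

// There is deliberately no "merged" outcome: divergence is surfaced, not resolved.
type SeedOutcome =
  | { kind: "seeded"; purpose: ModulePurpose }
  | { kind: "unchanged" }
  | { kind: "diverged"; config_text: string; step_intent_text: string };

function seedFromStepIntent(
  current: ModulePurpose | undefined,
  stepIntentId: string,
  stepIntentPurpose: string,
): SeedOutcome {
  const incoming = stepIntentPurpose.trim();
  if (incoming.length === 0) return { kind: "unchanged" }; // nothing to seed from
  const existing = (current?.text ?? "").trim();
  if (existing.length === 0) {
    return {
      kind: "seeded",
      purpose: {
        text: incoming,
        authored_by: "seeded_from_step_intent",
        seeded_from_step_intent_id: stepIntentId,
        confirmed_by_user: false, // seeded values wait for the soft confirm
        schema_version: "1.0",
      },
    };
  }
  if (existing === incoming) return { kind: "unchanged" };
  // Config stays canonical; the caller shows both texts as a diff.
  return { kind: "diverged", config_text: existing, step_intent_text: incoming };
}
```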

---

## §7 — Storage (declared as seams to the memory rebuild)

DOC72/DOC73/DOC1 are slated for partial supersession by the DOC80–DOC87 family (carryover §10). This proposal therefore specifies **contracts**, with the binding deferred to DOC80/DOC85.

### §7.1 Task purpose → resolvable entity property

`task.purpose` is a property of the task entity (today a `world_entity` node; tomorrow whatever the DOC80 family makes tasks). **Contract:** task purpose must be a resolvable property of the task entity so the recognition path (§4.1) rides entity resolution for free. **No new store.**

### §7.2 Module purposes → config + run records + case aggregation

Module purposes are **not** individual graph nodes. Promoting every module to a node explodes the graph with low-value entries (DOC72 §42 entropy / cleanup concern). They live in:

- `ModuleRecord.config` (static, §1.1);
- run snapshots (`EffectivePromptSnapshot`, `TaskRunStepRow`, `ModuleActivationReplayRecord` — §8);
- aggregated into `TaskDesignCase` (§9.6) at **case granularity** — the right unit for the graph.

### §7.3 Discovery search index → derived sidecar

The cross-corpus discovery search (§4.2) is a **derived sidecar index** (embeddings + LLM rerank) over the case corpus — the LlamaIndex-style sidecar role (invariant 15: sidecar retrieval provider, not canonical memory). The **canonical record stays graph-governed** (invariant 18: entity graph is a first-class durable store, not a rebuildable cache); the index is rebuildable and non-canonical.

### §7.4 Friction → purpose linkage

The negative-example store (§3.7) links friction/failure events to the responsible module's purpose at case granularity. **Contract:** purpose must be retrievable alongside outcome and friction for a given run/case, so the learning layer (DOC85) inherits labeled positive/negative examples when it lands.

**Rationale — capture must precede the consumer.** The active learning/search consumers are far off (DOC8 is undersized; DOC85 is in design; the Phase B audit has not run). But the **record and the friction→purpose linkage should be specified now**, because data has to accumulate *ahead* of the engine that reads it — otherwise DOC85 launches data-poor and learns from a cold start. Gating the *consumer* behind DOC85 does not mean gating the *capture*.

---

## §8 — Snapshot / trace field additions

Add an optional `purpose` (or `module_purpose` + `task_purpose`) field to the runtime records so the "why" rides into the trace:

- `EffectivePromptSnapshot` (§12.3) — record the injected purpose text (so the inspector can show it, §3.2).
- `TaskRunStepRow` (§20C.3) — optional purpose for the inspector row.
- `ModuleActivationReplayRecord` (§20E.4) — purpose in the replay bundle.
- `TaskModuleContextBasis` (§13A.11) — purpose feeds relevance (§3.6) and seeds the static descriptor (§5).

All optional; absence = current behavior.

---

## §9 — Open questions for review

1. **`task.purpose` distinct field vs derived.** Proposal leans **distinct but auto-seeded** (so it can stand alone on hand-built tasks). Confirm.
2. **Evaluator coverage guard strength (§3.3).** Soft warning ("module purposes don't cover sub-goal Y") vs gate-capable? Proposal leans soft warning surfaced in Preflight + Run Inspector; not a hard block.
3. **Hand-built backfill via nightly lane (§2.2.3).** Worth it, or drop? Proposal leans optional/deferred.
4. **Descriptor artifact (§5).** Static sibling of `TaskModuleContextBasis`, or extend `TaskModuleContextBasis` itself with a static-mode flag? Naming TBD.
5. **Per-consumer authority-hedge wording (§3.2).** Exact injection-frame text per consumer needs drafting and testing.
6. **DOC80/DOC85 binding (§7).** Once the memory family lands, bind task-purpose-as-entity-property, the case corpus, and the friction linkage to the actual DOC80/DOC85 schemas.
7. **Does the field risk over-engineering?** Honest self-check requested from reviewers: is the near-term value (§3.1, §3.5, §3.6, §3.8) sufficient on its own to justify the field, independent of the speculative learning/search uses (§3.7, §3.9)? The proposal's position: **yes** — prompt-scoping and Task Agent review pay off immediately; learning/search are a bonus the substrate enables later.

---

## §10 — Review-prioritization summary (not build phases)

This addendum describes the **complete end-state** per the design philosophy that specs are complete end-state products and phasing comes later (Design Philosophy §9.1). The following is a *review* aid, not a build gate:

| Tier | Consumers | Available when | New machinery |
|---|---|---|---|
| Immediate | prompt-scoping (§3.1), Task Agent review (§3.5), record (§0.1/§8), recognition (§3.8), DOC24 relevance (§3.6) | the moment the field is populated | **none** beyond field + packet/snapshot additions |
| Wired-pipeline | Outcome Compiler/Evaluator (§3.3), Revisor (§3.4) | when Evaluator/Revisor is in the graph | small input additions + coverage guard |
| Learning/search | enrichment (§3.7), discovery/auto-design (§3.9) | when DOC85 / sidecar index lands | the sidecar search index (the only genuinely new layer) |

**Bottom line for the architect.** The capture is cheap and the high-value uses need no new layer. The only complicated piece — the discovery search index — is far off, gated on the memory rebuild, and optional to the near-term value. The honest risk is sparse population, mitigated by free seeding on Elnor-created tasks and by the immediate prompt-scoping payoff that gives users a reason to fill the field at all.

---

## §11 — Cross-doc obligations (to add to OP-A on absorption)

- **DOC23 base / Addenda B Core:** add `module.purpose` to `ModuleRecord.config`; add `task.purpose` to the task definition; add purpose to `TaskModuleContextBasis`, `EffectivePromptSnapshot`, `TaskRunStepRow`, `ModuleActivationReplayRecord`; add the `ModuleDesignDescriptor`; add the blueprint→config seeding/diff rule.
- **DOC23 Outcome Evaluator/Revisor:** Compiler consumes task.purpose (coverage guard) + module purposes; Revisor consumes module purpose for RepairTarget; define the authority hedge per consumer.
- **DOC24:** recognition card injection; the clean-context rule (§4.3); per-consumer hedge via DeliveryDirective `hedge_mode`; purpose as a relevance input.
- **DOC8 / DOC85 (seam):** purpose as enrichment label on learning signals; friction→purpose negative-example linkage; quality-weighting of purpose before ingestion.
- **DOC80 family (seam):** task-purpose-as-resolvable-entity-property; case corpus storage; the derived sidecar discovery index.
- **DOC20 / DOC21 / DOC22 (UI):** task-purpose editor in task settings; collapsible module-purpose line; purpose display in Run Inspector / prompt-context inspector; register any new content type/route.

---

*End of DOC23 Addenda C draft proposal. For red-team review across reviewers. Not operative; nothing implemented; storage/learning bindings are seams to the DOC80–DOC87 memory rebuild.*