DOC10_Unified_Engagement_Orchestration_R11_Consolidated.md

Current Specs/DOC10/DOC10_Unified_Engagement_Orchestration_R11_Consolidated.md

Generated 2026-06-09T01:23:58.539Z from commit dbaa25962edc11ab30e8d4ca1715f9ae5bf77331. Worktree: clean.

# DOC10 Unified Engagement Orchestration — R11 [Consolidated Current]

## Revision Lineage (must persist in all later versions)

Based on DOC10 Unified Engagement Orchestration v1.11.10 R10 plus DOC10 v1.11.10 R10.1 (Retrieval Receipts and Topology Consumption). This consolidated current version fully subsumes those prior operative versions.


## Consolidation Rule

This file is the current single operative DOC10 orchestration specification. Later merged revision blocks govern over earlier baseline statements on overlapping subjects.


## Included Source Chain

1. Inherited Baseline — DOC10 Unified Engagement Orchestration v1.11.10 R10 — source file: `DOC10_UNIFIED_ENGAGEMENT_ORCHESTRATION_v1_11_10_R10.md`
2. Merged Revision — DOC10 v1.11.10 R10.1 (Retrieval Receipts and Topology Consumption) — source file: `DOC10_UNIFIED_ENGAGEMENT_ORCHESTRATION_v1_11_10_R10_1_Retrieval_Receipts_and_Topology_Consumption.md`



---

# Part 1 — Inherited Baseline — DOC10 Unified Engagement Orchestration v1.11.10 R10


# DOC10 - Unified Engagement, Orchestration, Gateway-Aware Decision Routing, Capability Learning, and System-Agent Management

Version: v1.11.10 R10  
Status: active draft  
Supersedes: DOC10 v1.11.10 R9  
Companion ledger: DOC10 Orchestration Integration Ledger R8 (merged master ledger)

## Why this document exists

DOC10 defines how ELNOR decides what to do, which system should do it, which context and signals should be consulted, how those decisions are bounded by latency and policy, and how the user can see and control the results in Q.

DOC10 is not the owner of all behavior. It is the orchestration layer that sits above and between other subsystem specs.

This revision incorporates:
- the full DOC10 R9 red-team synthesis packet, including the Claude Code / Codex implementation-grounded corrections;
- a stronger coexistence story for legacy entry points, `OperationEnvelope`, and the orchestration spine so implementation does not drift into parallel submission paths;
- a concrete effective-mode contract so dispatch, UI, and telemetry speak the same mode language even during degrade/override conditions;
- a real producer-side Gateway reverse-telemetry and abort interop contract instead of a consumer-side wish list;
- explicit STOP schema reconciliation and migration guidance so existing stop-controller reality converges on one canonical durable representation;
- stronger typed read-model response schemas and owner-doc proxy result/refusal envelopes so Q and EC do not invent responses ad hoc;
- narrower, read-only transaction-manifest semantics so provenance stays strong without implying distributed rollback orchestration;
- stronger running-job persistence, cleanup, and restart expectations;
- sharper scanner, capability-cache, and bridge-derivation rules so capability awareness stays honest even before DOC3 fully catches up;
- provisional but explicit usage / token / cost hooks that can later be unified by DOC13 rather than reinvented incompatibly.

## Instruction

This document is normative for:
- execution-spine behavior
- mode enforcement
- routing and dispatch contracts
- event intake and routing policy
- system-agent lifecycle posture
- Q route transparency and orchestration controls
- artifact traceability and orchestration telemetry
- orchestration read-model contracts
- producer-side Gateway interop expectations that DOC11 must satisfy

This document is not allowed to:
- redefine OpenClaw native runtime behavior already owned by OpenClaw / DOC4 / DOC11
- redefine DOC1 memory ownership
- redefine DOC2 freshness ownership
- redefine DOC3 skill or wrapper artifact ownership
- redefine DOC7 bucket storage ownership
- redefine DOC8 friction ownership
- redefine DOC9 repair ownership
- redefine DOC12 room / participant / room-turn behavior beyond shared-contract hooks explicitly referenced here
- invent repo-path truth that conflicts with the live codebase; physical layout details belong in implementation appendices, not canonical semantics

## Spec pinning

This revision assumes the following surrounding active docs exist and remain canonical for their own domains:
- DOC1 memory resilience / memory lifecycle / nightly memory infrastructure
- DOC2 Freshness Manager
- DOC3 App Skills and capability artifacts
- DOC4 OpenClaw bridge / gateway integration baseline
- DOC6 panels and forums sidecar architecture
- DOC7 context buckets and system buckets
- DOC8 friction / regression / self-learning
- DOC9 self-repair pipeline
- DOC11 OpenClaw Gateway-first interactive chat and dynamic model controls
- DOC12 inter-agent communication / rooms / ACP
- DOC13 costs / usage / token accounting
- Running Brief / OCM remediation spec

If any surrounding spec conflicts with DOC10 on ownership, the surrounding spec wins on storage/behavior ownership and DOC10 must consume or register hooks rather than re-owning the behavior.

---

## 0) Guiding principles

### 0.1 DOC10 is a registrar and consumer, not a sovereign replacement
DOC10 coordinates, routes, traces, and mode-gates behavior. It does not replace subsystem ownership.

### 0.2 Deterministic fast path first
Interactive use must remain viable when advanced orchestration is disabled. Most turns must succeed using deterministic routing, cached context, cheap lookups, and zero additional synchronous advisor LLM calls.

### 0.3 Thin orchestration, thick runtime
OpenClaw remains the native execution intelligence. DOC10 decides which path/system applies and what bounded augmentation should be added. DOC10 does not micromanage OpenClaw's step-by-step tool reasoning once a task is routed into the Gateway runtime.

### 0.4 Gateway-first for Q interactive chat
For interactive chat originating from Q, the authoritative runtime path is the OpenClaw Gateway except in explicit external-bypass or gateway-degraded fallback modes. This preserves OpenClaw-native files, skills, agent loop, and tool/runtime behavior.

### 0.5 External bypass remains real
The user must be able to disable advanced orchestration and still use the system. External/direct OpenClaw usage remains a valid posture and must not be silently disabled by DOC10.

### 0.6 Manager agents are allowed, but bounded
OCM, CSM, SHUA, and DLA may exist as bounded system agents or logical system-agent services. They enrich or advise; they do not become hidden executors.

### 0.7 Capability learning is first-class, but not hot-path
Discovery, teaching, synthesis, and capability growth are central goals. They must run as bounded jobs or clearly mode-gated operations, not as hidden per-turn latency multipliers.

### 0.8 Reference-first context exchange
Decisioning should use refs, compact facts, bounded summaries, and fetch-on-demand, not giant context dumps.

### 0.9 The user must understand and control the system
Modes, route choices, discovery runs, system agents, memory/self-learning events, and repair state must be visible and controllable in Q.

### 0.10 Artifact traceability is mandatory
If DOC10 creates, influences, or wakes a durable artifact, the user must be able to inspect where it came from.

### 0.11 The integration ledger is permanent
All cross-doc obligations generated by DOC10 and DOC11 belong in the master ledger and must not live only in conversational memory.


### 0.12 Free text defaults to Gateway-first unless a strict fast path applies
Natural-language free text from Q chat defaults to Gateway-first chat handling. Exact high-confidence fast paths are allowed only when the route is clearly a Q/EC-owned structured action or when the fast path merely supplies a route hint before Gateway handoff.

### 0.13 Capability awareness must be explicit
ELNOR must have a live capability-awareness view combining DOC3 capability bridge metadata, DOC11/OpenClaw catalog/session data, health/quarantine state, route blocklist state, and page/entity context so routing and UI can truthfully answer "what can I do here?"

### 0.14 No dual live execution
Shadow reasoning, shadow routing, and preflight metadata fetch are allowed. Sending the same user request into multiple live execution runtimes at once is forbidden.

### 0.15 One execution path, one context authority
For any given live operation, there must be exactly one authoritative context-assembly owner for the user-visible execution path.

Normative authority rules:
- DOC10 owns routing facts, mode, route trace, and bounded decision-packet construction.
- Running Brief / OCM owns generation of cached brief outputs and conversation-context summaries.
- DOC11 owns lean EC annotation assembly for Gateway-routed chat.
- OpenClaw owns native prompt/runtime assembly once a turn has entered the Gateway runtime.

For Q free-text Gateway-first chat, this means:
1. DOC10 builds routing facts and the DecisionPacket.
2. Running Brief / OCM may provide a cached brief ref or compact summary ref.
3. DOC11 builds the lean EC annotation block from DOC10 packet outputs plus approved refs.
4. OpenClaw performs the final native runtime turn.

The full Running Brief unified prompt assembler does not run on the Gateway-first hot path unless a future revision explicitly amends all four docs together.

Enforcement rule:
- any new live execution path must add or update a row in the mode x operation authority matrix before the path may ship;
- any surrounding doc introducing a new live context assembler for an existing path without updating this matrix is non-compliant;
- EC must either reject non-compliant paths at startup or emit `context_authority.violation_detected` with enough detail to make the violation visible in engineering surfaces.

Required validation concept:
- `assertPathAuthorityCompliant(path_id, selected_owner, active_context_contributors)`
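
The validation concept above could be sketched roughly as follows. This is a minimal illustration, not the required implementation: the row shape, the path ids, and the owner labels in `AUTHORITY_MATRIX` are assumptions made up for the example, and throwing an `Error` stands in for either rejecting the path at startup or emitting `context_authority.violation_detected`.

```typescript
// Hypothetical row shape for the mode x operation authority matrix.
interface PathAuthorityRow {
  context_owner: string;
  allowed_contributors: string[];
}

// Illustrative rows only; path ids and owner labels are assumptions.
const AUTHORITY_MATRIX: Record<string, PathAuthorityRow> = {
  q_free_text_gateway_first: {
    context_owner: "DOC11",
    allowed_contributors: ["DOC10", "DOC11", "OCM_cached_brief"],
  },
  q_structured_reroute: {
    context_owner: "DOC10",
    allowed_contributors: ["DOC10"],
  },
};

const VIOLATION_EVENT = "context_authority.violation_detected";

function assertPathAuthorityCompliant(
  path_id: string,
  selected_owner: string,
  active_context_contributors: string[],
): void {
  // A path with no matrix row is non-compliant by definition.
  const row: PathAuthorityRow | undefined =
    Object.prototype.hasOwnProperty.call(AUTHORITY_MATRIX, path_id)
      ? AUTHORITY_MATRIX[path_id]
      : undefined;
  if (!row) {
    throw new Error(`${VIOLATION_EVENT}: path "${path_id}" has no authority matrix row`);
  }
  if (row.context_owner !== selected_owner) {
    throw new Error(`${VIOLATION_EVENT}: "${selected_owner}" is not the context owner for "${path_id}"`);
  }
  // Any contributor not listed for the path counts as a new, undeclared assembler.
  const unlisted = active_context_contributors.filter(
    (c) => row.allowed_contributors.indexOf(c) === -1,
  );
  if (unlisted.length > 0) {
    throw new Error(`${VIOLATION_EVENT}: unlisted contributors on "${path_id}": ${unlisted.join(", ")}`);
  }
}
```

Failing closed on unknown paths mirrors the enforcement rule that a path must have a matrix row before it may ship.
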

---

## 1) Operating modes and enforceable latency budgets (normative)

### 1.1 Why modes exist
Modes exist to keep ELNOR usable under real latency, cost, and reliability constraints.

### 1.2 Mode overview
- **Mode -1: external_bypass** - user is intentionally using direct/native OpenClaw posture or comparable minimal orchestration path
- **Mode 0: baseline** - deterministic orchestration only
- **Mode 1: assisted** - deterministic first plus bounded single-advisor escalation when necessary
- **Mode 2: shadow** - assisted live path plus extra shadow analysis for learning/tuning
- **Mode 3: discovery** - explicit teach/discovery/synthesis jobs
- **Mode 4: lab** - controlled experimental mode

### 1.3 Mode semantics

#### Mode -1: external_bypass
Allowed:
- direct/native OpenClaw use
- no DOC10 advisory routing requirement
- no DOC10 route-trace requirement for direct external usage unless explicitly bridged

Forbidden:
- pretending the user is in baseline/assisted mode when they are not

Purpose:
- preserve the ability to use native OpenClaw without advanced orchestration
- provide a regression oracle for "did DOC10 hamstring OpenClaw?"

#### Mode 0: baseline
Allowed:
- deterministic intake
- deterministic intent classification
- deterministic entity routing
- deterministic capability routing
- cheap local registry and cache reads
- cheap entity summary reads
- route trace logging
- event intake/accumulator
- cached OCM brief consumption if already present

Forbidden by default:
- DLA synchronous consultation
- live OCM query
- live CSM advisory routing
- SHUA
- discovery jobs triggered automatically
- deep ref expansion

#### Mode 1: assisted
Mode 0 plus:
- at most one synchronous advisor consultation per eligible operation
- DLA may be consulted only if guard conditions pass
- CSM or OCM cached outputs may contribute to decision packet if already available
- discovery/repair may be offered, but not silently launched in chat fast path

#### Mode 2: shadow
Mode 1 live behavior plus:
- additional shadow analysis allowed after or beside live path
- shadow work must never block live response
- shadow analysis must obey its own budgets

#### Mode 3: discovery
Used for:
- teach mode
- discovery jobs
- synthesis
- explicit capability-learning operations

May use more expensive reasoning, but only as job-mode activity visible in Q.

#### Mode 4: lab
For evaluation and experiments only. Must be clearly labeled as unstable and not assumed to meet normal latency expectations.

### 1.4 Single source of truth for mode state (normative)

```ts
import { z } from "zod";

const OrchestrationModeSchema = z.enum([
  "external_bypass",
  "baseline",
  "assisted",
  "shadow",
  "discovery",
  "lab",
]);

const OrchestrationModeStateSchema = z.object({
  current_mode: OrchestrationModeSchema,
  set_at: z.string(),
  set_by: z.enum(["user", "system_degrade", "policy"]),
  previous_mode: OrchestrationModeSchema.optional(),
  degrade_reason: z.string().max(200).optional(),
  schema_version: z.literal(1),
});

const EffectiveModeStateSchema = z.object({
  current_mode: OrchestrationModeSchema,
  persisted_mode: OrchestrationModeSchema,
  effective_reason: z.enum([
    "user_selected",
    "policy_selected",
    "system_degrade",
    "gateway_degraded_fallback",
    "stop_constrained",
    "legacy_adapter_default",
  ]),
  source_of_truth: z.enum(["mode_state_file", "degrade_overlay", "legacy_adapter"]),
  previous_mode: OrchestrationModeSchema.optional(),
  degrade_reason: z.string().max(200).optional(),
  stop_active: z.boolean().default(false),
  gateway_health: z.enum(["healthy", "degraded", "offline", "unknown"]).default("unknown"),
  set_at: z.string(),
  evaluated_at: z.string(),
  schema_version: z.literal(1),
});
```

Canonical durable path:
- `ELNOR_MEMORY/system/orchestration/mode_state.json`

Q may request changes. Q is never the source of truth.

#### 1.4A Effective mode rules (normative)
- `OrchestrationModeStateSchema` is the durable persisted mode record.
- `EffectiveModeStateSchema` is the runtime-derived mode object consumed by dispatch, read-models, route banners, and acceptance tests.
- If degrade overlays, STOP constraints, or legacy-adapter compatibility rules alter behavior for a specific operation, the system must expose that through `EffectiveModeStateSchema` rather than silently mutating `mode_state.json`.
- Any Q surface that displays mode must be able to show both persisted mode and effective mode when they diverge.
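
One way to read these rules is that the effective mode is a pure function of the persisted record plus runtime overlays, and the persisted record is never mutated. The sketch below uses plain TypeScript types that mirror a subset of the two schemas. The `RuntimeOverlay` input, the function name `deriveEffectiveMode`, and the choice to clamp advanced modes to baseline under STOP or gateway trouble are assumptions for illustration only.

```typescript
type OrchestrationMode =
  | "external_bypass" | "baseline" | "assisted" | "shadow" | "discovery" | "lab";
type GatewayHealth = "healthy" | "degraded" | "offline" | "unknown";

// Trimmed-down mirrors of the two schemas above (optional fields omitted).
interface PersistedMode {
  current_mode: OrchestrationMode;
  set_at: string;
  set_by: "user" | "system_degrade" | "policy";
}

interface EffectiveMode {
  current_mode: OrchestrationMode;
  persisted_mode: OrchestrationMode;
  effective_reason:
    | "user_selected" | "policy_selected" | "system_degrade"
    | "gateway_degraded_fallback" | "stop_constrained";
  source_of_truth: "mode_state_file" | "degrade_overlay";
  stop_active: boolean;
  gateway_health: GatewayHealth;
  set_at: string;
  evaluated_at: string;
}

// Hypothetical runtime inputs; not part of the persisted record.
interface RuntimeOverlay {
  stop_active: boolean;
  gateway_health: GatewayHealth;
}

function deriveEffectiveMode(
  persisted: PersistedMode,
  overlay: RuntimeOverlay,
  evaluated_at: string,
): EffectiveMode {
  const base = {
    persisted_mode: persisted.current_mode,
    stop_active: overlay.stop_active,
    gateway_health: overlay.gateway_health,
    set_at: persisted.set_at,
    evaluated_at,
  };
  // Assumption: overlays only clamp the advanced modes; bypass and baseline pass through.
  const clampable =
    persisted.current_mode !== "external_bypass" && persisted.current_mode !== "baseline";
  if (clampable && overlay.stop_active) {
    return { ...base, current_mode: "baseline", effective_reason: "stop_constrained", source_of_truth: "degrade_overlay" };
  }
  const gatewayDown =
    overlay.gateway_health === "degraded" || overlay.gateway_health === "offline";
  if (clampable && gatewayDown) {
    return { ...base, current_mode: "baseline", effective_reason: "gateway_degraded_fallback", source_of_truth: "degrade_overlay" };
  }
  const reason =
    persisted.set_by === "user" ? "user_selected"
    : persisted.set_by === "policy" ? "policy_selected"
    : "system_degrade";
  return { ...base, current_mode: persisted.current_mode, effective_reason: reason, source_of_truth: "mode_state_file" };
}
```

A Q surface can then detect divergence by comparing `current_mode` with `persisted_mode` on the returned object.
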

### 1.5 Authoritative mode service (required)
Required module:
- `apps/ec-service/src/orchestration/orchestration-modes.ts`

Required exported functions:

```ts
export function getCurrentMode(): OrchestrationModeState;
export function setMode(
  mode: OrchestrationMode,
  actor: "user" | "system_degrade" | "policy",
  reason?: string,
): OrchestrationModeState;
export function assertModeAllows(action: ModeGatedAction, mode?: OrchestrationModeState): void;
```

No subsystem may independently reinterpret the current mode.

### 1.6 Mode-gated actions (normative)

```ts
const ModeGatedActionSchema = z.enum([
  "dladvise_sync",
  "ocm_live_query",
  "csm_live_route_help",
  "shua_run",
  "discovery_run_start",
  "teach_record_start",
  "capability_synthesis_start",
  "gateway_interactive_chat",
  "direct_provider_background",
]);
```

Mode rules:
- `gateway_interactive_chat` is allowed in baseline/assisted/shadow/discovery/lab and bypassed in `external_bypass`
- `dladvise_sync` is allowed only in assisted/shadow/discovery/lab and only through the DLA guard
- `ocm_live_query` is disallowed in baseline; cached OCM brief consumption remains allowed
- `direct_provider_background` is allowed only for explicit background jobs and must not handle interactive Q chat
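
As a rough illustration, the mode rules above can be held in a single lookup table that `assertModeAllows` consults. The sketch simplifies the required signature by taking the mode string directly instead of an `OrchestrationModeState`. Two things here are assumptions rather than stated rules: actions without a table entry fail closed, and `ocm_live_query` is treated as disallowed in `external_bypass`. `direct_provider_background` is left out of the table because its rule depends on job context, not on mode alone, and the DLA guard still has to be applied separately for `dladvise_sync`.

```typescript
type OrchestrationMode =
  | "external_bypass" | "baseline" | "assisted" | "shadow" | "discovery" | "lab";

type ModeGatedAction =
  | "dladvise_sync" | "ocm_live_query" | "csm_live_route_help" | "shua_run"
  | "discovery_run_start" | "teach_record_start" | "capability_synthesis_start"
  | "gateway_interactive_chat" | "direct_provider_background";

const ADVANCED_MODES: OrchestrationMode[] = ["assisted", "shadow", "discovery", "lab"];

// Only the rules spelled out in this section are encoded here.
const ALLOWED_MODES: Partial<Record<ModeGatedAction, OrchestrationMode[]>> = {
  gateway_interactive_chat: ["baseline", "assisted", "shadow", "discovery", "lab"],
  dladvise_sync: ADVANCED_MODES, // the DLA guard is a separate, additional check
  ocm_live_query: ADVANCED_MODES, // external_bypass exclusion is an assumption
};

function assertModeAllows(action: ModeGatedAction, mode: OrchestrationMode): void {
  const allowed = ALLOWED_MODES[action];
  if (!allowed) {
    // Assumption: actions without an explicit rule fail closed.
    throw new Error(`no mode rule registered for "${action}"`);
  }
  if (allowed.indexOf(mode) === -1) {
    throw new Error(`"${action}" is not allowed in mode "${mode}"`);
  }
}
```

Keeping the table in one module supports the rule that no subsystem may independently reinterpret the current mode.
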

### 1.7 Sync budgets by mode (normative)

```ts
const TurnAdvisorBudgetSchema = z.object({
  sync_calls_max: z.number().int().min(0).max(2).default(1),
  sync_calls_used: z.number().int().min(0).default(0),
  sync_ms_budget: z.number().int().min(0).max(3000).default(1500),
  sync_ms_used: z.number().int().min(0).default(0),
  schema_version: z.literal(1),
});
```

Defaults:
- Mode 0: `sync_calls_max = 0`
- Mode 1: `sync_calls_max = 1`, `sync_ms_budget = 1500`
- Mode 2: live path same as Mode 1; shadow budget separate
- Mode 3: operation-specific
- Mode 4: configurable but must be explicitly visible
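
A minimal sketch of how the live-path defaults might be applied per turn, using a plain interface that mirrors `TurnAdvisorBudgetSchema`. The helper names (`defaultLiveBudget`, `canConsultAdvisor`, `recordAdvisorCall`) are hypothetical, and Modes 3 and 4 are left out because their budgets are operation-specific or configurable.

```typescript
interface TurnAdvisorBudget {
  sync_calls_max: number;
  sync_calls_used: number;
  sync_ms_budget: number;
  sync_ms_used: number;
}

// Live-path defaults for Modes 0-2; shadow uses the Mode 1 live budget.
function defaultLiveBudget(mode: "baseline" | "assisted" | "shadow"): TurnAdvisorBudget {
  return {
    sync_calls_max: mode === "baseline" ? 0 : 1,
    sync_calls_used: 0,
    sync_ms_budget: 1500, // schema default; irrelevant in baseline since no calls are allowed
    sync_ms_used: 0,
  };
}

// True only while both the call count and the time budget have headroom.
function canConsultAdvisor(budget: TurnAdvisorBudget): boolean {
  return (
    budget.sync_calls_used < budget.sync_calls_max &&
    budget.sync_ms_used < budget.sync_ms_budget
  );
}

// Returns an updated copy so the caller can attach it to route trace telemetry.
function recordAdvisorCall(budget: TurnAdvisorBudget, elapsed_ms: number): TurnAdvisorBudget {
  return {
    ...budget,
    sync_calls_used: budget.sync_calls_used + 1,
    sync_ms_used: budget.sync_ms_used + elapsed_ms,
  };
}
```

For example, an assisted turn starts with one call available, and `canConsultAdvisor` returns `false` as soon as that call has been recorded.
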

### 1.8 Engineering / Orchestration Panel (required in Q)
Phase 0 must show:
- current mode
- global advanced-orchestration enable/disable state
- DLA enabled/disabled state
- current sync-call budget and usage
- degraded status and degrade reason
- gateway status / EC status

Phase 1 must add:
- per-agent enable/disable / self-test controls
- running jobs table
- route trace viewer (basic)
- event accumulator backlog/health

Phase 2 may add:
- cost/latency charts
- shadow comparison panels
- route tuning diagnostics

### 1.9 Automatic degrade-to-lean behavior
EC may auto-clamp from assisted/shadow/discovery/lab down to baseline when:
- repeated DLA failures or timeouts trip the circuit breaker
- gateway degraded fallback requires minimalism
- budgets are exhausted
- explicit policy says degrade

The system must surface the clamp event in Q and in route trace telemetry.
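
The first trigger, repeated DLA failures tripping a circuit breaker, could look something like the sketch below. The consecutive-failure policy, the threshold parameter, and the shape of `ClampEvent` are all assumptions for illustration; the document does not fix a breaker algorithm.

```typescript
type AdvancedMode = "assisted" | "shadow" | "discovery" | "lab";

interface BreakerState {
  consecutive_failures: number;
}

// Hypothetical event shape that Q and route trace telemetry could both consume.
interface ClampEvent {
  from_mode: AdvancedMode;
  to_mode: "baseline";
  degrade_reason: string;
}

// Counts consecutive failures or timeouts; any success resets the counter.
function recordDlaOutcome(state: BreakerState, ok: boolean): BreakerState {
  return { consecutive_failures: ok ? 0 : state.consecutive_failures + 1 };
}

// Produces a clamp event once the breaker trips; returns null otherwise.
function clampIfTripped(
  state: BreakerState,
  current_mode: AdvancedMode,
  threshold: number,
): ClampEvent | null {
  if (state.consecutive_failures < threshold) {
    return null;
  }
  return {
    from_mode: current_mode,
    to_mode: "baseline",
    degrade_reason: `dla_circuit_breaker: ${state.consecutive_failures} consecutive failures`,
  };
}
```

Returning an explicit event object, rather than silently switching modes, keeps the clamp visible to both Q and telemetry.
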

### 1.10 Mode x operation authority matrix (normative)
This matrix exists to stop cross-doc seam drift.

| Operation/path | Selected handler owner | User-visible execution owner | Context authority | Active memory paths | Abort owner | Required reverse telemetry |
|---|---|---|---|---|---|---|
| Q free-text chat -> Gateway-first | DOC10 selects `gateway_interactive_chat` | DOC11 -> OpenClaw | DOC10 route packet + DOC11 lean annotations | DOC1 selector -> DOC10 refs/hints only; OCM cached brief ref allowed; no full Running Brief assembler | DOC10 dispatch-abort service via DOC11 abort channel | `gateway.chat.*`, `gateway.tool.*`, `gateway.skill.*`, `gateway.approval.*`, abort ack/timeouts |
| Q structured reroute / route override | DOC10 | DOC10 | DOC10 only | none unless route requires it | DOC10 | `route.override.*`, `ui.control.*` |
| Entity lookup / deterministic UI action | DOC10 | DOC10 | DOC10 only | only if entity-linked memory changes actionability | DOC10 | `dispatch.*`, `proxy_mutation.*` |
| Background repair/discovery job | DOC10 selects, DOC9/DOC3 own execution semantics | owner doc pipeline | owner doc plus DOC10 route trace | owner-doc scoped only | owner doc with DOC10 job facade | `job.*`, owner-doc completion/failure hooks |
| External bypass | outside DOC10 | OpenClaw/native path | OpenClaw / external path | DOC10 none | external path | capability watermark route state only |

Normative:
- A path not listed here must be added before it can ship.
- A control that requires an abort owner or reverse telemetry but has none is non-compliant.

---

## 2) Ownership Matrix and cross-doc control boundaries (normative)

### 2.1 Ownership Matrix

| Concern | Canonical owner | DOC10 role | Notes |
|---|---|---|---|
| Running brief / OCM behavior | Running Brief / OCM spec | consume + mode-gate | DOC10 may not redefine OCM extraction/query behavior |
| Freshness / temporal truth | DOC2 | consume + hook | DOC10 does not replace freshness logic |
| Skills / wrappers / page knowledge | DOC3 | route + request + trace | DOC10 consumes capability metadata bridge |
| Bucket storage / source_read | DOC7 | consume + route hints | DOC10 does not own bucket storage |
| Friction / regression evaluation | DOC8 | feed + consume | DOC10 does not become a second friction brain |
| Repair session / patch pipeline | DOC9 | wake + proposal + trace | DOC10 does not own repair execution semantics |
| Native OpenClaw runtime loop | OpenClaw / DOC4 / DOC11 | preserve + integrate | DOC10 must not micromanage step-level runtime reasoning |
| Gateway-first Q chat controls | DOC11 | cross-link + route | DOC10 chooses handler; DOC11 owns chat-control integration details |
| Mode state | DOC10 | own | server-side only |
| Route trace | DOC10 | own | canonical orchestration telemetry |
| Artifact origin traceability | shared contract via DOC10 | define + require | artifacts remain owned by other docs |

### 2.2 Manager Agent Trust Boundary (normative)
All manager agents and logical manager-agent services:
- are read-only or recommendation-only by default
- may emit structured advice/events
- may not directly write durable state except through canonical command/proxy mutation paths owned elsewhere
- may not bypass EC command queue or policy checks
- may not silently execute code changes or durable memory edits

This applies to:
- OCM
- CSM
- SHUA
- DLA
- future manager agents

### 2.3 DOC10 "must not" rules
DOC10 must not:
- replace OpenClaw's native agent loop with step-by-step external micromanagement
- overwrite or duplicate OpenClaw-native workspace identity files every turn
- become the owner of DOC1 memory lifecycle semantics
- become a second source of truth for entity discovery, inbox, or STOP once canonical schemas are chosen
- silently force all external/direct OpenClaw usage through the execution spine

---

## 3) Hook Registry (normative)

### 3.1 Hook registry entry schema

```ts
const HookRegistryEntrySchema = z.object({
  hook_id: z.string().max(120),
  producer: z.string().max(120),
  trigger: z.string().max(200),
  payload_schema_ref: z.string().max(160),
  consumer: z.string().max(160),
  sync_or_async: z.enum(["sync", "async"]),
  hot_path_allowed: z.boolean(),
  retry_policy: z.string().max(120).optional(),
  debounce_policy: z.string().max(120).optional(),
  fallback_behavior: z.string().max(200),
  schema_version: z.literal(1),
});
```

### 3.2 Required hooks
Required hooks include at minimum:
- `operation.received`
- `operation.mode_resolved`
- `route.decision.made`
- `route.trace.recorded`
- `artifact.created`
- `artifact.transaction_manifest.updated`
- `capability.installed`
- `capability.updated`
- `capability.removed`
- `capability.quarantined`
- `capability.bridge.refresh.failed`
- `capability.health.changed`
- `memory.proposal.created`
- `memory.mutation.completed`
- `decision.feedback.received`
- `event.group.flushed`
- `event.group.priority_flush`
- `freshness.result.available`
- `repair.wake.requested`
- `repair.session.completed`
- `gateway.degraded`
- `gateway.interactive.completed`
- `gateway.abort.acknowledged`
- `gateway.abort.timeout`
- `alias.misfire_detected`
- `system_agent.selftest.failed`
- `system_agent.message.failed`
- `ui.control.failed`
- `model_provider.health.changed`
- `context_authority.violation_detected`
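
A startup check that every required hook has a registry entry might be sketched as follows. The interface mirrors the required fields of `HookRegistryEntrySchema`. `REQUIRED_HOOK_IDS` is abbreviated to four ids to keep the example short, and the producer, consumer, and schema-ref values in the sample entry are hypothetical.

```typescript
interface HookRegistryEntry {
  hook_id: string;
  producer: string;
  trigger: string;
  payload_schema_ref: string;
  consumer: string;
  sync_or_async: "sync" | "async";
  hot_path_allowed: boolean;
  fallback_behavior: string;
}

// Abbreviated for the example; the full required list is the one above.
const REQUIRED_HOOK_IDS: string[] = [
  "operation.received",
  "route.decision.made",
  "gateway.degraded",
  "context_authority.violation_detected",
];

// Returns the required hook ids that have no registry entry.
function findMissingRequiredHooks(
  registry: HookRegistryEntry[],
  required: string[] = REQUIRED_HOOK_IDS,
): string[] {
  const registered: Record<string, boolean> = {};
  for (const entry of registry) {
    registered[entry.hook_id] = true;
  }
  return required.filter((id) => registered[id] !== true);
}

// Sample entry; every value except hook_id is a placeholder.
const sampleEntry: HookRegistryEntry = {
  hook_id: "gateway.degraded",
  producer: "gateway-health-monitor",
  trigger: "gateway health leaves the healthy state",
  payload_schema_ref: "SystemEventEnvelopeSchema",
  consumer: "orchestration-modes",
  sync_or_async: "async",
  hot_path_allowed: false,
  fallback_behavior: "treat gateway health as unknown",
};
```

Calling `findMissingRequiredHooks([sampleEntry])` would report the three abbreviated ids that still lack an entry.
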

#### 3.2A Hook delivery, retry, and phase tiering
Required delivery contract:
- each required hook family must have a documented payload schema reference;
- hook producers must specify whether delivery is sync-blocking, async-required, or async-best-effort;
- retry, debounce, dead-letter, and fallback behavior may not be left implicit;
- if the current EventBus cannot satisfy a hook's durability/retry expectations, the hook must be labeled degraded rather than silently treated as reliable.

Telemetry phase tiering:
- **Phase A / hot path**: must be cheap, bounded, and emitted synchronously if required for correctness;
- **Phase B / near-real-time**: may arrive through stream/push infrastructure after dispatch returns;
- **Phase C / retained analytics**: may be delayed, batched, or written by background retention workers.

This tiering exists so implementation can preserve observability without turning baseline interactive chat into a logging tax.

### 3.3 Hook rules
- Raw high-volume events do not directly enter routing or DOC8; they pass through Event Intake and Accumulator first.
- Friction-worthy normalized events must be visible to DOC8 even if DOC10 also acts on them.
- Any hook that creates or mutates a durable artifact must carry `ArtifactOriginSchema`.

---

## 4) Execution Spine (normative)

### 4.1 Components
The execution spine consists of:
- Operation Intake
- Mode Resolution
- Intent Resolution
- Actionability Computation
- Decision Context Builder
- Deterministic Route Selection
- Optional DLA escalation
- Dispatch
- Route Trace and Artifact Origin binding

### 4.2 Operation Intake / Q Interaction Contract

```ts
const OperationEnvelopeSchema = z.object({
  operation_id: z.string().max(160),
  idempotency_key: z.string().max(200).optional(),
  correlation_id: z.string().max(160).optional(),
  operation_type: z.enum([
    "chat",
    "task",
    "project",
    "panel_run",
    "forum",
    "system_event",
    "inbox_action",
    "memory_action",
    "learning_action",
  ]),
  source_surface: z.enum([
    "chat_input",
    "task_page",
    "project_page",
    "panel_page",
    "forum_page",
    "agents_page",
    "capabilities_page",
    "inbox",
    "memory_page",
    "engineering_panel",
  ]),
  page_context: z.object({
    page_type: z.enum([
      "general",
      "task",
      "project",
      "panel",
      "forum",
      "agents",
      "capabilities",
      "inbox",
      "memory",
      "engineering",
    ]).optional(),
    selected_entity_refs: z.array(z.string().max(200)).max(20).optional(),
    active_task_id: z.string().max(160).optional(),
    active_project_id: z.string().max(160).optional(),
    active_panel_id: z.string().max(160).optional(),
    active_forum_id: z.string().max(160).optional(),
  }).optional(),
  user_text: z.string().max(20000).optional(),
  triggering_event_ref: z.string().max(200).optional(),
  requested_mode_hint: OrchestrationModeSchema.optional(),
  requested_route_hint: z.string().max(160).optional(),
  requested_behavior_hint: z.string().max(160).optional(),
  actor_type: z.enum(["user", "system"]),
  agent_id: z.string().max(160),
  run_id: z.string().max(160),
  session_hint: z.string().max(200).optional(),
  can_write_durable: z.boolean(),
  schema_version: z.literal(1),
});
```

The client may send hints. The server computes the effective mode and route authority.

#### 4.2A Entry-point convention, coexistence, and IDs (normative)
Canonical intake rule:
- all orchestration-aware operations must pass through one canonical service boundary:
  - `handleOperationEnvelope(envelope: OperationEnvelope)`
- legacy chat routes, legacy `processCommand()` flows, and owner-doc UI actions may continue to exist temporarily, but they must adapt into `OperationEnvelopeSchema` before DOC10 routing/dispatch logic runs.

ID and correlation rules:
- `operation_id` is the canonical request identity for the orchestration spine;
- `correlation_id` defaults to `operation_id` unless a downstream handoff contract explicitly derives another value;
- `idempotency_key` is required for retry-prone proxy mutations and strongly recommended for chat submission in any environment where duplicate posts are possible;
- downstream objects derive from the operation:
  - `decision_id`
  - `route_trace_id`
  - `job_id`
  - `transaction_manifest_id`
- `session_key` is never client-authored truth; it is assigned or echoed by the runtime/Gateway layer.

Migration rule:
- until Q and EC fully converge on envelope-based submission, any non-envelope endpoint must declare its adapter function and its field-mapping into `OperationEnvelopeSchema`.
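
To make the migration rule concrete, an adapter for a legacy chat endpoint might look like the sketch below. `LegacyChatRequest` and its field names are invented for the example, `ChatEnvelope` is a trimmed plain-type mirror of `OperationEnvelopeSchema`, and defaulting `can_write_durable` to `false` is an assumption. Note that the adapter never sets a session key, since that is assigned or echoed by the runtime/Gateway layer.

```typescript
// Trimmed mirror of OperationEnvelopeSchema for the chat case only.
interface ChatEnvelope {
  operation_id: string;
  idempotency_key?: string;
  correlation_id: string;
  operation_type: "chat";
  source_surface: "chat_input";
  user_text: string;
  actor_type: "user";
  agent_id: string;
  run_id: string;
  can_write_durable: boolean;
  schema_version: 1;
}

// Hypothetical legacy request body; field names are illustrative.
interface LegacyChatRequest {
  text: string;
  agentId: string;
  runId: string;
  clientMessageId?: string;
}

// Declared adapter: maps each legacy field into the envelope explicitly.
function adaptLegacyChatRequest(req: LegacyChatRequest, operation_id: string): ChatEnvelope {
  const envelope: ChatEnvelope = {
    operation_id,
    correlation_id: operation_id, // defaults to operation_id per the ID rules
    operation_type: "chat",
    source_surface: "chat_input",
    user_text: req.text,
    actor_type: "user",
    agent_id: req.agentId,
    run_id: req.runId,
    can_write_durable: false, // assumption: legacy chat submissions start read-only
    schema_version: 1,
  };
  if (req.clientMessageId !== undefined) {
    // Reuse the client message id so duplicate posts can be detected.
    envelope.idempotency_key = `chat:${req.clientMessageId}`;
  }
  return envelope;
}
```

The resulting object is what a legacy route would hand to `handleOperationEnvelope` instead of calling routing or dispatch logic directly.
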

### 4.3 SystemEventEnvelope

```ts
const SystemEventEnvelopeSchema = z.object({
  event_id: z.string().max(160),
  event_type: z.string().max(120),
  source_component: z.string().max(120),
  related_ref: z.string().max(200).optional(),
  severity: z.enum(["info", "warn", "error", "critical"]).default("info"),
  payload_ref: z.string().max(200).optional(),
  ts: z.string(),
  schema_version: z.literal(1),
});
```

### 4.4 Intent Resolution

#### Intent classes
```ts
const IntentClassSchema = z.enum([
  "general_chat",
  "entity_lookup",
  "status_inspection",
  "capability_execution",
  "capability_discovery",
  "capability_repair",
  "proposal_review",
  "system_control",
]);
```

#### Intent resolution schema
```ts
const IntentResolutionSchema = z.object({
  intent_class: IntentClassSchema,
  target_domain: z.enum([
    "tasks",
    "projects",
    "panels",
    "forums",
    "email",
    "calendar",
    "browser",
    "word",
    "research",
    "memory",
    "repair",
    "system",
    "capability",
  ]).optional(),
  requested_action: z.string().max(120).optional(),
  confidence: z.number().min(0).max(1),
  reasons: z.array(z.string().max(80)).max(10),
  schema_version: z.literal(1),
});
```

#### Classifier tiers by mode (normative)
Tier 1 - structural:
- operation_type
- page_context
- explicit route or behavior hints
- known button-generated action intents

Tier 2 - pattern:
- deterministic phrase families
- explicit aliases
- route aliases
- app/entity names

Tier 3 - local index:
- LocalIntentIndex over capability metadata, entity labels, aliases, skill names, wrapper names, project/task/forum/panel summaries

Tier 4 - DLA escalation:
- only if allowed by mode and DLA guard

Mode behavior:
- Mode 0: tiers 1-3 only
- Mode 1: tiers 1-3, DLA only if below confidence threshold and guard passes
- Mode 2: same live path as Mode 1; optional shadow analysis
- Mode 3: job-specific
- Mode 4: configurable but must remain visible
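
For the live modes, the tier gating above reduces to a small predicate. This sketch covers Modes 0 to 2 only, since Modes 3 and 4 are job-specific or configurable. The function name and the default threshold of `0.6` are assumptions; the document does not state a threshold value.

```typescript
type LiveMode = "baseline" | "assisted" | "shadow";

// Tier 4 is reachable only outside baseline, below the confidence threshold,
// and when the DLA guard passes. Tiers 1-3 always run first.
function shouldEscalateToDla(
  mode: LiveMode,
  tier3_confidence: number,
  dla_guard_passes: boolean,
  confidence_threshold: number = 0.6, // illustrative value, not from the spec
): boolean {
  if (mode === "baseline") {
    return false; // Mode 0: tiers 1-3 only
  }
  return tier3_confidence < confidence_threshold && dla_guard_passes;
}
```
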


#### Free-text natural-language default rule (normative)
For `source_surface = "chat_input"`:
- natural-language free text defaults to **Gateway-first** handling
- exact high-confidence fast paths may directly dispatch only if the matched action is a Q/EC-owned structured action, entity lookup, or owner-doc command that does not require OpenClaw-native runtime latitude
- if an exact match identifies an **OpenClaw-native** task (terminal, browser, desktop/app control, native skill, or comparable runtime problem solving), the broker may attach a route hint or capability hint but must still dispatch to `gateway_interactive_chat`
- free text must never bypass Gateway directly into structured capability execution for OpenClaw-native runtime work
- structured button/menu/form actions generated by Q are exempt and may dispatch to owner-doc handlers directly

#### classifyIntent composition pseudocode (normative)
```ts
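// Composes tiers 1-3 only. These tiers are deterministic and never call a cloud LLM.
// modeState is not consulted here; it gates the separate, guarded DLA escalation step.
function classifyIntent(envelope: OperationEnvelope, modeState: OrchestrationModeState): IntentResolution {
  const structural = scoreStructuralSignals(envelope); // tier 1
  const pattern = scorePatternSignals(envelope); // tier 2
  const localIndex = scoreLocalIntentIndex(envelope); // tier 3
  const merged = mergeIntentTierScores(structural, pattern, localIndex);

  // Free text from chat stays Gateway-first unless a strict structured fast path matched.
  if (isNaturalLanguageChatInput(envelope) && !matchesStrictStructuredFastPath(merged)) {
    return {
      intent_class: "general_chat",
      confidence: merged.confidence,
      reasons: ["gateway_first_chat"],
      schema_version: 1,
    };
  }

  return merged;
}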
```

Classifier implementation notes:
- `scoreStructuralSignals()` must prioritize source surface, explicit button actions, slash/structured commands, selected entities, and page context
- `scorePatternSignals()` may use exact aliases, deterministic phrase families, route aliases, and obvious app/entity names
- `scoreLocalIntentIndex()` must consult the cheap local index built from DOC3 bridge metadata, route aliases, entity labels, and selected summaries
- classifier tiers may not call a cloud LLM directly; DLA escalation remains a separate guarded step

### 4.5 Actionability Ladder

```ts
const ActionabilityLevelSchema = z.enum([
  "answer_only",
  "lookup_or_inspect",
  "navigate_or_open",
  "execute_existing_capability",
  "offer_repair",
  "offer_discovery",
  "automate_via_policy",
]);

const ActionabilityDecisionSchema = z.object({
  level: ActionabilityLevelSchema,
  requires_confirmation: z.boolean(),
  requires_supervision: z.boolean(),
  reasons: z.array(z.string().max(80)).max(10),
  schema_version: z.literal(1),
});
```

Canonical rule:
- `ActionabilityDecision.level` is the only valid actionability discriminator.
- Any code, pseudocode, or adapter referencing `actionability.kind` is non-compliant and must be migrated.
- DOC10 does not define a `strict_structured_ui_command` actionability kind. Structured UI commands are identified through `operation_type`, `source_surface`, and intent class.

#### computeActionability decision table (normative)

| intent_class | condition | result |
|---|---|---|
| general_chat | always | answer_only |
| entity_lookup | entity exists or can be listed | lookup_or_inspect |
| status_inspection | always | lookup_or_inspect |
| capability_execution | healthy viable capability exists | execute_existing_capability |
| capability_execution | no viable capability but quarantined/failed candidate exists | offer_repair |
| capability_execution | no viable capability and explicit learning/discovery ask or repeated gap | offer_discovery |
| capability_discovery | always | offer_discovery |
| capability_repair | always | offer_repair |
| proposal_review | always | lookup_or_inspect |
| system_control | valid command and policy allows | automate_via_policy |
| system_control | otherwise | blocked/ask confirmation via dispatch |
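
The table above can be read as a pure function. The following is a minimal sketch, not the normative implementation: `ActionabilityFacts` and its field names are hypothetical stand-ins for facts the broker would already have computed, the intent classes are limited to those that appear in the table, and `null` is an assumed way to signal "no level from the table; dispatch blocks or asks for confirmation".

```typescript
type IntentClass =
  | "general_chat" | "entity_lookup" | "status_inspection"
  | "capability_execution" | "capability_discovery" | "capability_repair"
  | "proposal_review" | "system_control";

type ActionabilityLevel =
  | "answer_only" | "lookup_or_inspect" | "navigate_or_open"
  | "execute_existing_capability" | "offer_repair" | "offer_discovery"
  | "automate_via_policy";

// Hypothetical pre-computed facts; the spec does not name these fields.
interface ActionabilityFacts {
  entityResolvable: boolean;
  healthyCapabilityExists: boolean;
  quarantinedOrFailedCandidateExists: boolean;
  discoveryAskedOrRepeatedGap: boolean;
  validCommandAndPolicyAllows: boolean;
}

// Returns the level from the decision table, or null when the table yields no level.
function levelFromTable(intent: IntentClass, f: ActionabilityFacts): ActionabilityLevel | null {
  switch (intent) {
    case "general_chat":
      return "answer_only";
    case "entity_lookup":
      // The table only covers the resolvable case; null here is an assumption.
      return f.entityResolvable ? "lookup_or_inspect" : null;
    case "status_inspection":
    case "proposal_review":
      return "lookup_or_inspect";
    case "capability_execution":
      // Rows are checked in table order: healthy, then repair, then discovery.
      if (f.healthyCapabilityExists) return "execute_existing_capability";
      if (f.quarantinedOrFailedCandidateExists) return "offer_repair";
      if (f.discoveryAskedOrRepeatedGap) return "offer_discovery";
      return null; // combination not covered by the table
    case "capability_discovery":
      return "offer_discovery";
    case "capability_repair":
      return "offer_repair";
    case "system_control":
      return f.validCommandAndPolicyAllows ? "automate_via_policy" : null;
  }
}
```

Keeping the table as one exhaustive `switch` means a newly added intent class fails type-checking until a row is written for it.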

### 4.6 Decision Policy Stack
Tier 1 hard rules:
- STOP
- write capability
- remote read-only
- quarantined/disabled routes
- mode restrictions
- policy-approved automation only

Tier 2 routing rules:
- entity questions go to Entity Router
- chat from Q goes to Gateway-first handler after routing
- prefer healthy existing capability over raw fallback
- prefer repair over repeatedly failing route
- discovery should be offered, not silently launched, in interactive chat

Tier 3 preferences:
- standing orders
- user route preferences
- preferred model/session controls from DOC11/OpenClaw layer

Tier 4 advisor hints:
- cached OCM decision brief
- route hints store
- capability utility metrics
- process-memory hint summaries

### 4.7 Decision Broker (authoritative)
Required module:
- `apps/ec-service/src/orchestration/decision-broker.ts`

Authoritative exported functions:

```ts
export async function handleOperation(envelope: OperationEnvelope): Promise<DispatchResult>;
export async function handleSystemEvent(event: SystemEventEnvelope): Promise<void>;
```

### 4.8 DLA as advisory-only escalation layer
DLA is a logical system agent / advisory service. It is not a durable-action owner.

DLA may be consulted only when:
- current mode allows `dladvise_sync`
- DLA is enabled
- `TurnAdvisorBudget.sync_calls_used < sync_calls_max`
- route confidence is below threshold OR tie spread is below threshold OR explicit ambiguity condition is present
- operation type is eligible

DLA may not be consulted when:
- mode is `external_bypass` or `baseline`
- STOP is active
- the DLA circuit breaker is tripped
- the route is already blocked by hard policy


#### shouldEscalateToDLA positive trigger contract (normative)
`shouldEscalateToDLA()` may return `true` only when at least one of the following positive triggers exists:
- route confidence is below `ROUTE_THRESHOLDS.dla_trigger_threshold`
- top viable candidate tie spread is below `ROUTE_THRESHOLDS.tie_spread`
- packet context was marked insufficient after allowed deterministic fetch
- explicit ambiguity signal exists (for example conflicting page context vs alias hit)
- the route is action-like and viable deterministic candidates disagree across route families

`shouldEscalateToDLA()` must return `false` when:
- hard policy already blocks the route
- the route is a Q/EC-owned exact fast path above confidence threshold
- the operation is already forced to Gateway-first free-text handling with no actionable ambiguity
- mode or budget disallow consultation
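
One way to combine the 4.8 eligibility conditions with the trigger contract above is to evaluate every negative condition first and only then look for a positive trigger. This is a sketch under assumptions: `EscalationInput` is a hypothetical flattened view of the packet, decision, mode, and budget, and the numeric values in `ROUTE_THRESHOLDS` are placeholders because this section names the thresholds without giving values.

```typescript
// Hypothetical flattened input; field names are illustrative only.
interface EscalationInput {
  mode: string; // e.g. "baseline", "assisted", "external_bypass"
  modeAllowsSyncDla: boolean;
  dlaEnabled: boolean;
  operationTypeEligible: boolean;
  stopActive: boolean;
  circuitBreakerTripped: boolean;
  blockedByHardPolicy: boolean;
  syncCallsUsed: number;
  syncCallsMax: number;
  exactFastPathAboveThreshold: boolean;
  forcedGatewayFirstWithoutAmbiguity: boolean;
  routeConfidence: number;
  tieSpread: number;
  contextInsufficientAfterFetch: boolean;
  explicitAmbiguity: boolean;
  routeFamiliesDisagree: boolean;
}

// Placeholder values, not taken from the spec.
const ROUTE_THRESHOLDS = { dla_trigger_threshold: 0.6, tie_spread: 0.1 };

function shouldEscalateToDLA(i: EscalationInput): boolean {
  // Negative conditions always win over positive triggers.
  if (i.mode === "external_bypass" || i.mode === "baseline") return false;
  if (!i.modeAllowsSyncDla || !i.dlaEnabled || !i.operationTypeEligible) return false;
  if (i.stopActive || i.circuitBreakerTripped || i.blockedByHardPolicy) return false;
  if (i.syncCallsUsed >= i.syncCallsMax) return false;
  if (i.exactFastPathAboveThreshold || i.forcedGatewayFirstWithoutAmbiguity) return false;

  // At least one positive trigger must be present.
  return (
    i.routeConfidence < ROUTE_THRESHOLDS.dla_trigger_threshold ||
    i.tieSpread < ROUTE_THRESHOLDS.tie_spread ||
    i.contextInsufficientAfterFetch ||
    i.explicitAmbiguity ||
    i.routeFamiliesDisagree
  );
}
```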

### 4.9 Event Intake and Event Action Router
Raw events enter Event Intake first. Event Intake groups/debounces/normalizes them before any routing or DOC8 friction handling occurs.


### 4.10 Dispatch
Dispatch may target:
- `chat_responder`
- `entity_router`
- `gateway_interactive_chat`
- `capability_executor`
- `discovery_runner`
- `repair_runner`
- `proposal_creator`
- `system_control_handler`
- `blocked_response`

#### 4.10A DispatchHandler interface (normative)

All handlers implement a common interface so dispatch is not a black box.

```ts
export type DispatchKind =
  | "chat_responder"
  | "entity_router"
  | "gateway_interactive_chat"
  | "capability_executor"
  | "discovery_runner"
  | "repair_runner"
  | "proposal_creator"
  | "system_control_handler"
  | "blocked_response";

const ProvisionalUsageSummarySchema = z.object({
  provider: z.string().max(80).optional(),
  model_id: z.string().max(160).optional(),
  prompt_tokens: z.number().int().min(0).optional(),
  completion_tokens: z.number().int().min(0).optional(),
  total_tokens: z.number().int().min(0).optional(),
  estimated_cost_usd: z.number().min(0).optional(),
  source: z.enum(["gateway", "local_provider", "capability", "unknown"]).default("unknown"),
  updated_at: z.string().optional(),
  schema_version: z.literal(1),
});

const DispatchErrorSchema = z.object({
  code: z.enum([
    "DISABLED_BY_MODE",
    "BUDGET_EXCEEDED",
    "HANDLER_UNAVAILABLE",
    "HANDLER_TIMEOUT",
    "HANDLER_FAILED",
    "ABORT_UNSUPPORTED",
    "ABORT_TIMEOUT",
    "CLEANUP_FAILED",
    "DOWNSTREAM_DEGRADED",
    "VALIDATION_FAILED",
    "READ_MODEL_UNAVAILABLE",
  ]),
  message: z.string().max(240),
  retryable: z.boolean().default(false),
  suggested_fallback: z.enum([
    "none",
    "gateway_interactive_chat",
    "blocked_response",
    "proposal_creator",
    "repair_runner",
  ]).default("none"),
  telemetry_reason_code: z.string().max(80).optional(),
  schema_version: z.literal(1),
});

const RunningJobSchema = z.object({
  job_id: z.string().max(160),
  route_trace_id: z.string().max(160),
  operation_id: z.string().max(160),
  correlation_id: z.string().max(160),
  family: z.enum(["gateway_chat", "capability", "discovery", "repair", "optimization", "shadow"]),
  handler_kind: z.enum([
    "gateway_interactive_chat",
    "capability_executor",
    "discovery_runner",
    "repair_runner",
    "proposal_creator",
  ]),
  priority: z.enum(["user_blocking", "interactive", "normal", "background"]).default("normal"),
  state: z.enum([
    "queued",
    "starting",
    "running",
    "awaiting_approval",
    "abort_requested",
    "aborted",
    "completed",
    "failed",
    "timed_out",
    "cleanup_pending",
    "cleanup_completed",
    "cleanup_failed",
    "orphaned",
  ]),
  owner_component: z.string().max(120),
  session_key: z.string().max(200).optional(),
  stream_channel: z.string().max(200).optional(),
  abort_supported: z.boolean().default(false),
  abort_state: z.enum([
    "not_requested",
    "requested",
    "acknowledged",
    "timeout",
    "unsupported",
    "cleanup_pending",
    "cleanup_failed",
    "completed",
  ]).default("not_requested"),
  cleanup_timeout_ms: z.number().int().min(0).optional(),
  cleanup_started_at: z.string().optional(),
  cleanup_completed_at: z.string().optional(),
  cleanup_failed_reason: z.string().max(200).optional(),
  started_at: z.string(),
  updated_at: z.string(),
  completed_at: z.string().optional(),
  usage_summary: ProvisionalUsageSummarySchema.optional(),
  schema_version: z.literal(1),
});

export interface DispatchHandler {
  kind: DispatchKind;
  handle(args: {
    envelope: OperationEnvelope;
    decision: RouteDecision;
    effectiveMode: EffectiveModeState;
    trace: RouteTraceRecord;
  }): Promise<DispatchResult>;

  abort?(args: {
    job: z.infer<typeof RunningJobSchema>;
    requested_by: string;
    reason: string;
  }): Promise<{
    acknowledged: boolean;
    cleanup_pending: boolean;
    cleanup_timeout_ms?: number;
  }>;
}
```

Normative:
- Every handler must return either a success result or a structured error conforming to `DispatchErrorSchema`. Silent failure is forbidden.
- Any handler that can create a long-lived job must emit `RunningJobSchema` and register that job before returning control to Q.
- `capability_executor` may not remain an active-path handler without a concrete contract. If implementation cannot satisfy it yet, it must stay registry-present but policy-disabled.
- Jobs must survive restart through persisted reconciliation state. On boot, EC must reconcile `running_jobs.json` and convert abandoned jobs to `orphaned`, `failed`, or `cleanup_pending` rather than silently dropping them.

#### 4.10A1 Running-job lifecycle, persistence, and restart rules
- `running_jobs.json` is the canonical current-state read model for active and recently-finished jobs visible in Q.
- append-only lifecycle evidence for jobs may be persisted separately, but Q must read a stable current-state snapshot rather than replaying raw logs.
- if EC restarts while a job is `starting`, `running`, `awaiting_approval`, or `abort_requested`, implementation must perform startup reconciliation:
  - reattach if the downstream runtime can still be queried;
  - otherwise mark the job `orphaned` with a visible reason;
  - if cleanup is required and supported, transition to `cleanup_pending`.
- `DispatchResultSchema`, `RunningJobSchema`, and any stream transport must agree on `job_id`, `route_trace_id`, `correlation_id`, and `session_key`.
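
The startup reconciliation rules above might be sketched as a pure pass over the persisted snapshot. The assumptions are: `JobSnapshot` is a reduced stand-in for `RunningJobSchema`, `orphan_reason` and the `RuntimeProbe` shape are hypothetical, and the order "reattach, else cleanup, else orphan" is one reading of the three sub-rules.

```typescript
type JobState =
  | "queued" | "starting" | "running" | "awaiting_approval" | "abort_requested"
  | "aborted" | "completed" | "failed" | "timed_out" | "cleanup_pending"
  | "cleanup_completed" | "cleanup_failed" | "orphaned";

// Reduced stand-in for RunningJobSchema; orphan_reason is a hypothetical field.
interface JobSnapshot {
  job_id: string;
  state: JobState;
  orphan_reason?: string;
}

// Hypothetical result of asking the downstream runtime about a job.
interface RuntimeProbe {
  reachable: boolean;
  cleanupRequiredAndSupported: boolean;
}

// States that require reconciliation if EC restarted while they were current.
const IN_FLIGHT: ReadonlySet<JobState> = new Set<JobState>([
  "starting", "running", "awaiting_approval", "abort_requested",
]);

function reconcileOnBoot(
  jobs: JobSnapshot[],
  probe: (jobId: string) => RuntimeProbe,
): JobSnapshot[] {
  return jobs.map((job): JobSnapshot => {
    if (!IN_FLIGHT.has(job.state)) return job; // already terminal or queued
    const result = probe(job.job_id);
    if (result.reachable) return job; // reattach: keep the persisted state
    if (result.cleanupRequiredAndSupported) return { ...job, state: "cleanup_pending" };
    return { ...job, state: "orphaned", orphan_reason: "runtime unreachable after EC restart" };
  });
}
```

Because the function returns a new array instead of mutating in place, the caller can write the reconciled snapshot back to `running_jobs.json` in a single step.
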

#### 4.10B Handler registry invariants (normative)

- EC must build a handler registry at startup.
- A missing required handler is a hard startup error (fail-fast).
- If a handler is disabled by mode/policy, its registry entry remains present, but the handler must return a structured `DISABLED_BY_MODE` result (not crash).

Required registry:
- `chat_responder`
- `entity_router`
- `gateway_interactive_chat`
- `system_control_handler`
- `blocked_response`

Optional (may be stubbed initially but must exist in registry before UI exposes controls):
- `capability_executor`
- `discovery_runner`
- `repair_runner`
- `proposal_creator`
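
A minimal sketch of these registry invariants follows. `RegisteredHandler` is a simplified stand-in for `DispatchHandler`, and its `enabled` flag is a hypothetical way to represent a mode/policy gate; mapping an absent optional handler to `HANDLER_UNAVAILABLE` is also an assumption.

```typescript
type DispatchKind =
  | "chat_responder" | "entity_router" | "gateway_interactive_chat"
  | "capability_executor" | "discovery_runner" | "repair_runner"
  | "proposal_creator" | "system_control_handler" | "blocked_response";

// Simplified stand-in for DispatchHandler; `enabled` is hypothetical.
interface RegisteredHandler {
  kind: DispatchKind;
  enabled: boolean;
}

const REQUIRED_KINDS: readonly DispatchKind[] = [
  "chat_responder", "entity_router", "gateway_interactive_chat",
  "system_control_handler", "blocked_response",
];

// Builds the registry and fails fast when a required handler is absent.
function buildHandlerRegistry(handlers: RegisteredHandler[]): Map<DispatchKind, RegisteredHandler> {
  const registry = new Map<DispatchKind, RegisteredHandler>();
  for (const handler of handlers) registry.set(handler.kind, handler);
  const missing = REQUIRED_KINDS.filter((kind) => !registry.has(kind));
  if (missing.length > 0) {
    throw new Error(`handler registry incomplete: ${missing.join(", ")}`);
  }
  return registry;
}

type GateResult =
  | { ok: true }
  | { ok: false; code: "DISABLED_BY_MODE" | "HANDLER_UNAVAILABLE" };

// A disabled handler yields a structured result instead of crashing.
function gateDispatch(registry: Map<DispatchKind, RegisteredHandler>, kind: DispatchKind): GateResult {
  const handler = registry.get(kind);
  if (handler === undefined) return { ok: false, code: "HANDLER_UNAVAILABLE" };
  if (!handler.enabled) return { ok: false, code: "DISABLED_BY_MODE" };
  return { ok: true };
}
```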

#### 4.10C Dispatch contract table (normative)

| Handler kind | Invocation | Output type | Time budget (baseline) | Failure behavior | Route trace update | Telemetry minimum |
|---|---|---|---:|---|---|---|
| `gateway_interactive_chat` | sync handoff + async stream | streaming + running job + reverse telemetry | 250ms handoff | degrade banner; do not retry side-effects; may fall back only when gateway degraded policy is active | mark `handed_to_gateway`; later update executed-route from reverse events | `route.gateway_handoff`, `job.started`, `gateway.*` |
| `chat_responder` | sync | sync | 2s | fallback to `gateway_interactive_chat` if uncertainty or tool need appears | record chosen local response | `dispatch.completed/failed` |
| `entity_router` | sync | sync | 2s | return partial results; surface degraded | record query + response count | `entity.discovery.*` |
| `system_control_handler` | sync | sync | 2s | no-op + error message; no silent fallback | record command outcome | `proxy_mutation.*` |
| `capability_executor` | sync or async | job | 500ms start | quarantine capability on repeated failure; may create repair/proposal signal | record start + job id | `job.*`, `capability.*` |
| `discovery_runner` | async job | job | 500ms start | surface failure + optional repair proposal; no auto-retry loops | record job | `discovery.*`, `job.*` |
| `repair_runner` | async job | job | 500ms start | surface failure + stop | record job | `repair.*`, `job.*` |
| `proposal_creator` | sync | sync | 2s | dedupe suppression | record proposal id | `proposal.*` |
| `blocked_response` | sync | sync | 200ms | n/a | record block reason | `route.blocked` |

Notes:
- Baseline budgets are dispatch budgets, not total wall-clock runtime. Streaming and jobs continue beyond the initial budget.
- If a handler returns a job, the job must be registered in Running Jobs and emit a completion notice.

#### 4.10D GatewayHandoffPayload (normative)

When routing to `gateway_interactive_chat`, DOC10 must produce a concrete handoff payload so:
- DOC11 can assemble lean annotations consistently;
- reverse telemetry can be correlated;
- Q can render executed-route state;
- the abort path has enough state to target the correct runtime session;
- implementation can validate producer-side field coverage rather than inferring missing handoff state.

```ts
const GatewayHandoffPayloadSchema = z.object({
  operation_id: z.string().max(160),
  route_trace_id: z.string().max(160),
  correlation_id: z.string().max(160), // default = route_trace_id
  session_key: z.string().max(200),
  effective_mode: EffectiveModeStateSchema,
  route_hints: z.array(z.string().max(160)).max(12).default([]),
  capability_hints: z.array(z.string().max(160)).max(12).default([]),
  decision_reason_codes: z.array(z.string().max(60)).max(20).default([]),
  context_ref_ids: z.array(z.string().max(200)).max(40).default([]),
  memory_ref_ids: z.array(z.string().max(200)).max(12).default([]),
  ocm_brief_ref: z.string().max(200).optional(),
  annotation_budget_tokens: z.number().int().min(0).max(1800),
  brief_excerpt_max_tokens: z.number().int().min(0).max(600).optional(),
  lean_ec_annotations: z.string().max(3000).default(""),
  expected_reverse_event_families: z.array(z.string().max(80)).max(20).default([]),
  abort_supported: z.boolean().default(false),
  artifact_origin_seed: z.object({
    origin_operation_id: z.string().max(160),
    origin_route_trace_id: z.string().max(160),
    origin_signal_type: z.enum(["user_prompt", "system_event", "nightly", "repair", "discovery", "manual"]),
    created_by_component: z.string().max(120),
    created_at: z.string(),
    schema_version: z.literal(1),
  }),
  schema_version: z.literal(1),
});
```

Normative:
- `correlation_id` MUST be present and MUST equal `route_trace_id` unless explicitly overridden.
- DOC11/Gateway must echo `correlation_id` back in reverse telemetry events.
- `lean_ec_annotations` must obey the DOC11 lean contract and must not duplicate OpenClaw-native identity files.
- `annotation_budget_tokens` must reflect the effective mode budget after trim, not a wishful upper bound.
- if DOC10 cannot produce a field required by `GatewayHandoffPayloadSchema`, dispatch must fail visibly rather than sending a partial wish-object into the runtime.
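
The `correlation_id` default and the "fail visibly" rule could be enforced at the point where the payload is finalized. The sketch below covers only a handful of fields: `HandoffDraft` and `HandoffCore` are hypothetical reductions of `GatewayHandoffPayloadSchema`, and reporting the failure as `VALIDATION_FAILED` is an assumed mapping onto `DispatchErrorSchema`.

```typescript
// Hypothetical partial view of the payload while it is being assembled.
interface HandoffDraft {
  operation_id?: string;
  route_trace_id?: string;
  correlation_id?: string;
  session_key?: string;
}

interface HandoffCore {
  operation_id: string;
  route_trace_id: string;
  correlation_id: string;
  session_key: string;
  annotation_budget_tokens: number;
}

type HandoffResult =
  | { ok: true; payload: HandoffCore }
  | { ok: false; code: "VALIDATION_FAILED"; missing: string[] };

function finalizeHandoff(draft: HandoffDraft, trimmedBudgetTokens: number): HandoffResult {
  const { operation_id, route_trace_id, session_key } = draft;
  if (!operation_id || !route_trace_id || !session_key) {
    // Fail visibly and name the missing fields instead of sending a partial payload.
    const required = ["operation_id", "route_trace_id", "session_key"] as const;
    const missing = required.filter((key) => !draft[key]);
    return { ok: false, code: "VALIDATION_FAILED", missing };
  }
  return {
    ok: true,
    payload: {
      operation_id,
      route_trace_id,
      // Defaults to route_trace_id unless explicitly overridden.
      correlation_id: draft.correlation_id ?? route_trace_id,
      session_key,
      // Budget after trim, clamped to the schema range 0..1800.
      annotation_budget_tokens: Math.min(1800, Math.max(0, Math.floor(trimmedBudgetTokens))),
    },
  };
}
```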

#### 4.10D1 DOC10 -> DOC11 annotation handshake (normative)
DOC10 produces routing facts, refs, and bounded hints. DOC11 owns final lean-annotation assembly.

Required handshake expectations:
- DOC10 hands off:
  - `route_trace_id`
  - `correlation_id`
  - effective mode
  - approved refs
  - trim-aware annotation budget
  - decision reason codes
  - route/capability hints
  - optional OCM brief ref
- DOC11 returns or emits:
  - accepted/rejected handoff state
  - effective session key
  - annotation-consumed token count if available
  - gateway health reason when handoff is refused or degraded

If DOC11 cannot yet emit the full annotation-consumed token count, it must at least emit an explicit `annotation_usage_unknown` marker rather than pretending the value is known.

#### 4.10E Gateway abort cascade (normative)

```ts
const GatewayAbortRequestSchema = z.object({
  job_id: z.string().max(160),
  operation_id: z.string().max(160),
  route_trace_id: z.string().max(160),
  correlation_id: z.string().max(160),
  session_key: z.string().max(200),
  requested_by: z.string().max(160),
  reason: z.string().max(200),
  requested_at: z.string(),
  schema_version: z.literal(1),
});

const GatewayAbortInteropSchema = z.object({
  request: GatewayAbortRequestSchema,
  transport: z.enum(["gateway_ws", "gateway_http", "gateway_internal_bridge"]),
  accepted_for_delivery: z.boolean(),
  delivery_acknowledged: z.boolean().default(false),
  downstream_abort_supported: z.boolean().default(false),
  cleanup_pending: z.boolean().default(false),
  cleanup_timeout_ms: z.number().int().min(0).optional(),
  schema_version: z.literal(1),
});
```

Required lifecycle:
1. User invokes a terminate/cancel control.
2. Q emits `JobTerminateRequest`.
3. EC validates policy and job ownership.
4. If the job is Gateway-backed and `abort_supported = true`, DOC10 must call the real DOC11 abort/cancel path.
5. Running job state moves to `abort_requested`.
6. DOC11/Gateway must either:
   - acknowledge delivery and later emit a terminal reverse event, or
   - explicitly refuse / report unsupported state.
7. Completion requires either:
   - `gateway.chat.aborted` / `gateway.abort.acknowledged`, or
   - `gateway.abort.timeout` followed by visible degraded cleanup state.
8. If cleanup is required, job state must remain `cleanup_pending` until cleanup completes, fails, or times out.
9. EC must not mark the job fully terminated before downstream acknowledgment or timeout.

A job cancel control that only mutates EC local state is non-compliant.

Abort cleanup states that must be representable:
- `cleanup_pending`
- `cleanup_completed`
- `cleanup_failed`

If cleanup fails, Q must show that failure explicitly rather than collapsing to a generic "aborted" badge.
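
The lifecycle above can be viewed as a small state machine in which EC state only advances on a downstream signal. This is a sketch under assumptions: the `cleanup.completed` and `cleanup.failed` signal names are hypothetical, and mapping `gateway.abort.timeout` to `cleanup_pending` is one reading of "visible degraded cleanup state".

```typescript
type AbortJobState =
  | "running" | "abort_requested" | "aborted"
  | "cleanup_pending" | "cleanup_completed" | "cleanup_failed";

type AbortSignalName =
  | "gateway.chat.aborted"
  | "gateway.abort.acknowledged"
  | "gateway.abort.timeout"
  | "cleanup.completed" // hypothetical signal name
  | "cleanup.failed"; // hypothetical signal name

// Steps 4-5: only a running, abort-capable job moves to abort_requested.
function requestAbort(state: AbortJobState, abortSupported: boolean): AbortJobState {
  return state === "running" && abortSupported ? "abort_requested" : state;
}

// Steps 6-9: state advances only on downstream acknowledgment, timeout, or cleanup outcome.
function onAbortSignal(
  state: AbortJobState,
  signal: AbortSignalName,
  cleanupRequired: boolean,
): AbortJobState {
  if (state === "abort_requested") {
    if (signal === "gateway.chat.aborted" || signal === "gateway.abort.acknowledged") {
      return cleanupRequired ? "cleanup_pending" : "aborted";
    }
    if (signal === "gateway.abort.timeout") return "cleanup_pending"; // degraded path (assumption)
  }
  if (state === "cleanup_pending") {
    if (signal === "cleanup.completed") return "cleanup_completed";
    if (signal === "cleanup.failed") return "cleanup_failed";
  }
  return state; // any other combination leaves the state unchanged
}
```

Since `cleanup_failed` is a distinct terminal state here, Q can render it directly instead of inferring it from a generic aborted flag.
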

#### 4.10F Dispatch fallback chain (normative)

| Primary handler | Failure / refusal condition | Allowed fallback | Forbidden fallback |
|---|---|---|---|
| `chat_responder` | low confidence, tool need, blocked local action | `gateway_interactive_chat` | direct provider call |
| `entity_router` | partial data / stale entity facade | `chat_responder` or `blocked_response` | capability executor |
| `gateway_interactive_chat` | gateway degraded and policy allows | `chat_responder` or `blocked_response` | second live runtime race |
| `capability_executor` | quarantined / repeated failure | `proposal_creator` and optionally `repair_runner` | silent retry storm |
| `discovery_runner` | policy block / STOP / timeout | `blocked_response` or proposal | hidden continued exploration |
| `repair_runner` | STOP / policy block / timeout | `blocked_response` or proposal | silent auto-repair loop |
| `system_control_handler` | validation failure | `blocked_response` | alternate mutating owner-doc path |

Retry / backoff rule:
- DOC10 may retry only idempotent dispatch preparation steps, not already-handed-off downstream side effects;
- retries must declare max attempts and backoff policy in handler implementation notes;
- repeated handler failure must converge to a visible terminal error, not a silent retry storm.
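
The fallback table and the bounded-retry rule might be encoded as data plus two small helpers. In this sketch the table's "proposal" entries are interpreted as `proposal_creator`, and `backoffScheduleMs` with its capped exponential policy is a hypothetical example of a declared backoff policy, not one the spec prescribes.

```typescript
// Allowed fallbacks per primary handler, transcribed from the table above.
const ALLOWED_FALLBACKS: Readonly<Record<string, readonly string[]>> = {
  chat_responder: ["gateway_interactive_chat"],
  entity_router: ["chat_responder", "blocked_response"],
  gateway_interactive_chat: ["chat_responder", "blocked_response"],
  capability_executor: ["proposal_creator", "repair_runner"],
  discovery_runner: ["blocked_response", "proposal_creator"],
  repair_runner: ["blocked_response", "proposal_creator"],
  system_control_handler: ["blocked_response"],
};

// Anything not listed for the primary handler is treated as forbidden.
function isFallbackAllowed(primary: string, fallback: string): boolean {
  return (ALLOWED_FALLBACKS[primary] ?? []).includes(fallback);
}

// Delays between attempts for idempotent preparation steps only.
// maxAttempts attempts produce maxAttempts - 1 waits, each capped at capMs.
function backoffScheduleMs(maxAttempts: number, baseMs: number, capMs: number): number[] {
  const delays: number[] = [];
  for (let attempt = 1; attempt < maxAttempts; attempt++) {
    delays.push(Math.min(capMs, baseMs * Math.pow(2, attempt - 1)));
  }
  return delays;
}
```

A finite schedule makes the terminal condition explicit: once the array is exhausted, the handler reports a visible error instead of retrying again.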

#### 4.10G Dispatch lifecycle telemetry (normative)

Every dispatch attempt must emit:
- `route.handler_dispatched`
- `dispatch.completed` OR `dispatch.failed`
- `dispatch.fallback_invoked` when a second handler is chosen
- `job.started` when a job is created
- `job.abort_requested` / `job.abort_acknowledged` / `job.abort_timeout` where relevant

### 4.11 Main flow pseudocode (normative example)

```ts
export async function handleOperation(envelope: OperationEnvelope): Promise<DispatchResult> {
  const modeState = getCurrentMode();
  const effectiveMode = resolveEffectiveMode(envelope, modeState);
  const budget = initTurnAdvisorBudget(effectiveMode);

  if (effectiveMode.current_mode === "external_bypass" && envelope.operation_type === "chat") {
    return passThroughOpenClawOrGateway(envelope);
  }

  const intent = classifyIntent(envelope, effectiveMode);
  const actionability = computeActionability(intent, envelope, effectiveMode);
  const packet = buildDecisionPacket(envelope, intent, actionability, effectiveMode);

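  // 4.12A defines this guard as taking a single args object that includes gatewayHealth.
  // getGatewayHealth() is a placeholder; the spec does not name the health source.
  if (shouldDefaultToGatewayFirstChat({
    envelope, intent, actionability,
    modeState: effectiveMode,
    gatewayHealth: getGatewayHealth(),
  })) {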
    const gatewayDecision = buildGatewayFirstRouteDecision(envelope, intent, actionability, packet, effectiveMode);
    return dispatchDecision(gatewayDecision, envelope, effectiveMode);
  }

  let decision = deterministicRoute(envelope, intent, actionability, packet, effectiveMode, budget);

  if (shouldEscalateToDLA(decision, packet, effectiveMode, budget)) {
    decision = await consultDLA(packet, decision, effectiveMode, budget);
  }

  return dispatchDecision(decision, envelope, effectiveMode);
}
```

### 4.12 Interactive chat routing rule (normative)
For `operation_type = "chat"` originating from Q:
- `handleOperation()` must ultimately dispatch to `gateway_interactive_chat`
- except in explicit `external_bypass` or gateway-degraded fallback conditions

DOC11 owns the concrete Gateway-first interactive chat contract. DOC10 owns the rule that this handler must be selected.

#### 4.12A `shouldDefaultToGatewayFirstChat()` (required mid-level contract)

```ts
function shouldDefaultToGatewayFirstChat(args: {
  envelope: OperationEnvelope;
  intent: IntentResolution;
  actionability: ActionabilityDecision;
  modeState: EffectiveModeState;
  gatewayHealth: "healthy" | "degraded" | "offline" | "unknown";
}): boolean {
  if (args.modeState.current_mode === "external_bypass") return false;
  if (args.envelope.source_surface !== "chat_input") return false;
  if (args.envelope.operation_type !== "chat") return false;
  if (args.gatewayHealth === "offline") return false;
  // source_surface and operation_type are already checked above, so no combined check is needed.
  if (args.intent.intent_class === "system_control") return false;
  return true;
}
```

Normative:
- Alias hits for OpenClaw-native work may increase confidence and route hints, but they do not make this function return `false`.
- This function is the only allowed gateway-first default guard. Duplicate gateway-first decision logic elsewhere is non-compliant.
- Existing code paths that previously referenced `actionability.kind` must be migrated to the canonical fields used above.
- Gateway degraded fallback must be surfaced through `EffectiveModeStateSchema` and route-trace telemetry rather than by silently bypassing this guard.

### 4.13 Parallelism rule (normative)
Allowed:
- shadow route comparison
- shadow DLA reasoning in eligible modes
- preflight metadata fetch
- session warm-up / telemetry subscription setup

Forbidden:
- dual live execution of the same user request through multiple runtimes
- racing Gateway execution against a second live capability executor
- duplicate side-effecting runs for the same `operation_id`

If parallel work occurs, only one path may hold execution authority for the live request.
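
The single-authority rule could be enforced with a per-`operation_id` claim that shadow or preflight work never takes. The sketch below is in-memory only; the `ExecutionAuthority` class and its method names are hypothetical, and a real implementation would need persistence and expiry.

```typescript
// In-memory sketch: at most one path holds live execution authority per operation_id.
class ExecutionAuthority {
  private readonly holders = new Map<string, string>();

  // Returns true if `path` now holds (or already held) authority for the operation.
  acquire(operationId: string, path: string): boolean {
    const holder = this.holders.get(operationId);
    if (holder !== undefined) return holder === path;
    this.holders.set(operationId, path);
    return true;
  }

  // Only the current holder can release; other callers are ignored.
  release(operationId: string, path: string): void {
    if (this.holders.get(operationId) === path) this.holders.delete(operationId);
  }

  holderOf(operationId: string): string | undefined {
    return this.holders.get(operationId);
  }
}
```

Shadow comparison and preflight fetches would run without calling `acquire`, so they can proceed in parallel while a second live executor is refused.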

---

## 5) Decision context architecture: refs first, summary seeded, fetch on demand

### 5.1 Canonical context references
Every retrievable object must expose a stable ref form.
Examples:
- `run:{runId}:msg:{seq}`
- `task:{taskId}`
- `project:{projectId}`
- `panel:{panelId}:post:{seq}`
- `forum:{forumId}:post:{seq}`
- `memory:{memoryId}`
- `bucket:{bucketId}:file:{fileId}:section:{sectionId}`
- `capability:{capabilityId}`
- `repair:{sessionId}`
- `proposal:{proposalId}`
- `gateway_session:{sessionKey}`

### 5.2 ContextRef schema

```ts
const ContextRefSchema = z.object({
  ref_id: z.string().max(200),
  ref_type: z.enum([
    "chat_message",
    "task",
    "project",
    "panel_post",
    "forum_post",
    "memory",
    "bucket_section",
    "capability",
    "repair_session",
    "proposal",
    "gateway_session",
  ]),
  label: z.string().max(200).optional(),
  ts: z.string().optional(),
  agent_id: z.string().max(160).optional(),
  run_id: z.string().max(160).optional(),
  trust: z.enum(["user", "system", "derived", "llm_summary"]),
  tags: z.array(z.string().max(40)).max(12).optional(),
  token_estimate: z.number().int().min(0).optional(),
  schema_version: z.literal(1),
});
```

### 5.3 Decision Context Builder (DCB)
Required module:
- `apps/ec-service/src/orchestration/decision-context-builder.ts`

The DCB assembles:
- operation facts
- policy facts
- a small decision brief
- context refs
- a tiny number of inline snippets if needed
- no synchronous summarizer LLM calls

The DCB is a routing packet builder, not a second full prompt assembler.

#### 5.3A Path authority clarifications (normative)

For Gateway-first chat:
- DCB output is consumed by the Decision Broker and by DOC11's annotation builder.
- DCB does not call the Running Brief full context assembler.
- DCB may include `ocm_brief_ref` or a compact OCM cached brief excerpt, but not a second transcript-style block.
- DCB may emit memory refs and one-line hints, not full memory payloads.

For structured Q actions:
- DCB may be the only context builder on the hot path.

For owner-doc jobs:
- DCB may provide route facts, but owner-doc execution context remains owner-doc controlled.

#### 5.3B DecisionPacket usage rule (normative)
The `DecisionPacket` is for:
- route selection
- DLA consultation when mode allows
- DOC11 annotation input
- route trace explanation
- downstream proposal / repair / learning signals

The `DecisionPacket` is not injected verbatim into the user-visible runtime prompt.

### 5.4 DecisionPacket schema

```ts
const DecisionPacketSchema = z.object({
  packet_id: z.string().max(160),
  operation_id: z.string().max(160),
  decision_goal: z.string().max(200),
  operation: OperationEnvelopeSchema,
  hard_facts: z.array(z.string().max(240)).max(20),
  decision_brief: z.string().max(1200).optional(),
  context_refs: z.array(ContextRefSchema).max(20),
  inline_snippets: z.array(z.object({
    ref_id: z.string().max(200),
    text: z.string().max(600),
  })).max(4).default([]),
  fetch_budget_calls: z.number().int().min(0).max(2).default(1),
  fetch_budget_tokens: z.number().int().min(0).max(2000).default(1000),
  current_resolve_turn: z.number().int().min(0).default(0),
  max_resolve_turns: z.number().int().min(0).max(1).default(1),
  token_estimate: z.number().int().min(0).max(3500),
  schema_version: z.literal(1),
});
```

### 5.5 Hard packet caps (normative)
- `DECISION_PACKET_MAX_REFS = 20`
- `DECISION_PACKET_MAX_INLINE_SNIPPETS = 4`
- `DECISION_PACKET_TOKEN_CAP = 3500`
- inline snippets should be dropped before hard facts
- low-priority refs should be dropped before inline snippets are expanded
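
A trimming pass that respects these caps might look like the sketch below. It is one possible reading, not the normative order: it drops inline snippets first, then the lowest-ranked refs, and never drops hard facts; snippet expansion is not modeled. `PacketDraft`, `rank_score`, and the per-item `token_estimate` on snippets are hypothetical.

```typescript
const DECISION_PACKET_MAX_REFS = 20;
const DECISION_PACKET_MAX_INLINE_SNIPPETS = 4;
const DECISION_PACKET_TOKEN_CAP = 3500;

// Hypothetical reduced shapes; rank_score is assumed to come from ref ranking.
interface RankedRef { ref_id: string; rank_score: number; token_estimate: number; }
interface InlineSnippet { ref_id: string; token_estimate: number; }
interface PacketDraft {
  hard_fact_tokens: number;
  refs: RankedRef[];
  snippets: InlineSnippet[];
}

function trimPacket(draft: PacketDraft): PacketDraft {
  // Apply the count caps first, keeping the highest-ranked refs.
  const refs = [...draft.refs]
    .sort((a, b) => b.rank_score - a.rank_score)
    .slice(0, DECISION_PACKET_MAX_REFS);
  const snippets = draft.snippets.slice(0, DECISION_PACKET_MAX_INLINE_SNIPPETS);

  const sum = (items: { token_estimate: number }[]): number =>
    items.reduce((total, item) => total + item.token_estimate, 0);
  const totalTokens = (): number => draft.hard_fact_tokens + sum(refs) + sum(snippets);

  // Then enforce the token cap: snippets go first, then lowest-ranked refs.
  while (totalTokens() > DECISION_PACKET_TOKEN_CAP && (snippets.length > 0 || refs.length > 0)) {
    if (snippets.length > 0) snippets.pop();
    else refs.pop(); // refs are sorted descending, so pop() removes the lowest-ranked
  }
  return { hard_fact_tokens: draft.hard_fact_tokens, refs, snippets };
}
```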

### 5.6 OCM decision brief contract
The OCM behavior remains owned by Running Brief / OCM remediation. DOC10 consumes a derived compact brief only.

```ts
const OCMDecisionBriefSchema = z.object({
  brief_text: z.string().max(1200),
  brief_kind: z.enum(["chat", "task", "project", "panel", "forum", "system"]),
  generated_at: z.string(),
  source_mode: z.enum(["cached", "live"]),
  schema_version: z.literal(1),
});
```

Default posture:
- baseline/assisted consume cached brief only;
- live OCM query requires mode allowance and is never mandatory for the fast path;
- live OCM path remains defined but gated and may not become a hidden hot-path dependency.

#### 5.6A OCM brief excerpt limits (normative)
- `brief_excerpt_max_tokens` must be honored before brief text is rendered into lean annotations;
- if the cached brief exceeds the current path budget, DOC10 must prefer a ref or title-only hint over inline excerpt text;
- DOC10 may not request a full Running Brief unified assembler output on the Gateway-first hot path.

### 5.7 Context sufficiency logic

```ts
const ContextSufficiencySchema = z.object({
  sufficient: z.boolean(),
  missing: z.array(z.enum([
    "none",
    "recent_turns",
    "selected_entity_detail",
    "capability_detail",
    "memory_hint",
    "freshness_check",
    "repair_status",
    "bucket_section",
  ])).max(6),
  confidence: z.number().min(0).max(1),
  suggested_fetches: z.array(z.string().max(200)).max(8).optional(),
  schema_version: z.literal(1),
});
```

### 5.8 Controlled context fetch services
Required services:
- `ctx_fetch(ref_id, mode)`
- `ctx_fetch_range(runId, fromSeq, toSeq)`
- `ctx_search(scope, query, topK)`
- `ctx_expand(ref_id, neighbors)`

Rules:
- requested ref must exist in `DecisionPacket.context_refs` unless broker explicitly authorizes wider search
- `MAX_RESOLVE_TURNS = 1`
- `ctx_expand` must obey a hard truncation bound and record when truncation occurred
- exceeding resolve budget forces deterministic fallback and route-trace note
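
The fetch rules above amount to a guard evaluated before each `ctx_*` call, plus a truncation helper that records when it fired. In this sketch `FetchContext`, `FetchVerdict`, and the character-based bound in `truncateExpansion` are hypothetical; the spec does not say whether the bound is measured in characters or tokens.

```typescript
const MAX_RESOLVE_TURNS = 1;

type FetchVerdict = "allow" | "deny_ref_not_in_packet" | "deterministic_fallback";

// Hypothetical view of the packet state relevant to a fetch request.
interface FetchContext {
  packetRefIds: ReadonlySet<string>;
  currentResolveTurn: number;
  fetchBudgetCallsLeft: number;
  brokerAuthorizedWiderSearch: boolean;
}

function authorizeCtxFetch(refId: string, ctx: FetchContext): FetchVerdict {
  // Budget checks come first: an exhausted budget forces deterministic fallback.
  if (ctx.currentResolveTurn >= MAX_RESOLVE_TURNS || ctx.fetchBudgetCallsLeft <= 0) {
    return "deterministic_fallback";
  }
  // The ref must already be in the packet unless the broker widened the search.
  if (!ctx.packetRefIds.has(refId) && !ctx.brokerAuthorizedWiderSearch) {
    return "deny_ref_not_in_packet";
  }
  return "allow";
}

// Hard truncation bound that reports whether truncation occurred.
function truncateExpansion(text: string, maxChars: number): { text: string; truncated: boolean } {
  return text.length > maxChars
    ? { text: text.slice(0, maxChars), truncated: true }
    : { text, truncated: false };
}
```

The caller would write a route-trace note whenever the verdict is `deterministic_fallback` or `truncated` is `true`.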

### 5.9 ContextRef relevance scoring and decay
Context refs are ranked, not blindly included.

Ranking factors:
- page-context alignment
- active-task or active-entity boost
- recent-touch boost
- trust/source weight
- freshness/recency score
- explicit user-selected boost

Rules:
- refs must not be hard-dropped purely because they are older
- age should decay relevance score, not erase potentially important refs
- Freshness Manager signals may further penalize stale refs for freshness-sensitive tasks
- packet trimming must drop the lowest-ranked refs first
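
A scoring function consistent with these rules lets age shrink only a recency term, so older refs keep whatever boosts they earned. All weights, the half-life, and the `RefSignals` shape below are hypothetical placeholders chosen for illustration.

```typescript
// Hypothetical per-ref signals; the spec lists the factors but not their encoding.
interface RefSignals {
  ageHours: number;
  pageAligned: boolean;
  activeEntity: boolean;
  recentlyTouched: boolean;
  userSelected: boolean;
  trustWeight: number; // 0..1, derived from trust/source
  flaggedStale: boolean; // set by Freshness Manager signals
}

const RECENCY_HALF_LIFE_HOURS = 72; // placeholder value

function relevanceScore(s: RefSignals, freshnessSensitive: boolean): number {
  // Exponential decay: age lowers the recency term but never removes the ref.
  const recency = Math.pow(0.5, Math.max(0, s.ageHours) / RECENCY_HALF_LIFE_HOURS);
  let score = 0.1 * s.trustWeight + 0.2 * recency;
  if (s.pageAligned) score += 0.2;
  if (s.activeEntity) score += 0.2;
  if (s.recentlyTouched) score += 0.1;
  if (s.userSelected) score += 0.3;
  // Stale refs are penalized only for freshness-sensitive tasks.
  if (freshnessSensitive && s.flaggedStale) score *= 0.5;
  return score;
}
```

With these placeholder weights, a very old but user-selected ref still outranks a fresh ref with no other signals, which matches the rule that age decays relevance without erasing it.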

### 5.10 Route hints are not DOC1 process memory
DOC10 owns its own thin route-hint history derived from route traces.

Canonical path:
- `ELNOR_MEMORY/system/orchestration/route_hints.jsonl`
- `ELNOR_MEMORY/system/orchestration/route_hints_current.json`

DOC1 process memory may provide hint-like summaries, but it is not the canonical route-history store.


### 5.11 Memory selection and injection contract (normative)

DOC10 does not own durable memory semantics (DOC1 does). DOC10 does own:
- when memory should be consulted for decisioning
- how memory refs/hints are included without context bloat
- how memory is kept additive relative to OpenClaw-native workspace memory
- how memory injection stays bounded by mode and operation type

The objective is: higher correctness with minimal added latency.

#### 5.11A MemoryInjectionPlan schema

```ts
const MemoryInjectionPlanSchema = z.object({
  operation_id: z.string().max(160),
  route_trace_id: z.string().max(160),
  effective_mode: OrchestrationModeSchema,
  selector_owner: z.enum(["doc1_memory_selector", "dcb_rules", "user_selected"]).default("dcb_rules"),

  memory_refs: z.array(z.string().max(200)).max(12).default([]),
  memory_snippets: z.array(z.object({
    memory_ref: z.string().max(200),
    snippet: z.string().max(420),
  })).max(2).default([]),

  injection_targets: z.array(z.enum([
    "decision_packet",
    "lean_ec_annotations",
    "none",
  ])).max(2).default(["decision_packet"]),

  budget_tokens: z.number().int().min(0).max(900).default(300),
  reason_codes: z.array(z.string().max(60)).max(16).default([]),
  dedupe_policy: z.enum(["avoid_openclaw_native", "allow_overlap"]).default("avoid_openclaw_native"),
  created_at: z.string(),
  schema_version: z.literal(1),
});
```

Normative:
- Memory injection must never require synchronous LLM calls.
- Memory refs must be preferred over snippets.
- If memory is injected into lean EC annotations, it must be ref-form plus a one-line hint, not full memory payload.
- For Gateway-first chat, memory selection must be deterministic and cheap enough to stay inside the DOC10 hot-path budget.

#### 5.11B Memory Injection Matrix (normative)

| Operation type | Default memory targets | Max refs | Max snippets | Budget tokens | Notes |
|---|---|---:|---:|---:|---|
| `chat` from `chat_input` (free text) | `lean_ec_annotations` + `decision_packet` | 6 | 1 | 250 | Free-text defaults to Gateway-first; memory should be hints only. |
| `chat` structured UI action | `decision_packet` | 4 | 0 | 150 | Use memory only if action touches user preferences or persistent workflows. |
| `task` | `decision_packet` | 8 | 2 | 450 | Task context may justify more memory refs; still bounded. |
| `project` | `decision_packet` | 8 | 1 | 350 | Project scope memory allowed; avoid large snippets. |
| `panel_run` / `forum` | `decision_packet` | 6 | 1 | 250 | Prefer thread refs and bucket refs over memory snippets. |
| `system_event` | `decision_packet` | 4 | 0 | 150 | Only attach memory if it changes diagnosis/actionability. |
| `inbox_action` / `memory_action` / `learning_action` | `none` | 0 | 0 | 0 | These are already owner-doc scoped; do not spam memory. |

Normative:
- These are defaults. Modes may clamp down further.
- In `baseline` mode, memory injection must never exceed the matrix maximums.
- In `assisted` mode, a single extra memory ref may be allowed if it avoids DLA escalation.
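
The matrix and its mode rules can be kept as a lookup table so limits are never computed ad hoc. In this sketch the row keys (`MatrixRow`) are hypothetical names for the table rows, and not granting the assisted-mode extra ref to rows whose default target is `none` is an assumption.

```typescript
type MemoryTarget = "decision_packet" | "lean_ec_annotations" | "none";

// Hypothetical row keys for the matrix above.
type MatrixRow =
  | "chat_free_text" | "chat_structured_ui" | "task" | "project"
  | "panel_run_or_forum" | "system_event" | "owner_doc_action";

interface MemoryLimits {
  targets: MemoryTarget[];
  maxRefs: number;
  maxSnippets: number;
  budgetTokens: number;
}

const MEMORY_INJECTION_MATRIX: Record<MatrixRow, MemoryLimits> = {
  chat_free_text: { targets: ["lean_ec_annotations", "decision_packet"], maxRefs: 6, maxSnippets: 1, budgetTokens: 250 },
  chat_structured_ui: { targets: ["decision_packet"], maxRefs: 4, maxSnippets: 0, budgetTokens: 150 },
  task: { targets: ["decision_packet"], maxRefs: 8, maxSnippets: 2, budgetTokens: 450 },
  project: { targets: ["decision_packet"], maxRefs: 8, maxSnippets: 1, budgetTokens: 350 },
  panel_run_or_forum: { targets: ["decision_packet"], maxRefs: 6, maxSnippets: 1, budgetTokens: 250 },
  system_event: { targets: ["decision_packet"], maxRefs: 4, maxSnippets: 0, budgetTokens: 150 },
  owner_doc_action: { targets: ["none"], maxRefs: 0, maxSnippets: 0, budgetTokens: 0 },
};

// Baseline returns the matrix row as-is; assisted may add one ref when it avoids DLA escalation.
function memoryLimits(
  row: MatrixRow,
  mode: "baseline" | "assisted",
  extraRefAvoidsDla = false,
): MemoryLimits {
  const base = MEMORY_INJECTION_MATRIX[row];
  const bonus = mode === "assisted" && extraRefAvoidsDla && base.maxRefs > 0 ? 1 : 0;
  return { ...base, maxRefs: base.maxRefs + bonus };
}
```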

#### 5.11C Memory selection rules (deterministic)

DCB must select memory using deterministic rules, in priority order:
1. user-selected memory
2. entity-linked memory
3. workflow memory
4. recent high-confidence repair/learning outcomes if relevant
5. recency-biased general memory only if it clearly reduces ambiguity

Selection must use metadata, tags, and scope rules (DOC1-owned), not LLM summarization.

#### 5.11D Dedup with OpenClaw-native workspace memory (anti-bloat)

OpenClaw may already inject workspace memory files. DOC10 must avoid duplicating that content.

Normative dedup policy:
- If DOC1 marks a memory entry as `source = "openclaw_native"` (or equivalent), DOC10 should not inject it again.
- If that metadata does not exist yet, DOC10 must still prefer memory refs/hints and keep budgets small.
- If uncertain, DOC10 should attach only a single line:
  - `Memory hint: see memory:{id} ({title})`

#### 5.11E Active memory paths by execution path (normative)

| Path | Allowed memory paths | Forbidden duplicate paths | Notes |
|---|---|---|---|
| Gateway-first free-text chat | DOC1 deterministic selector -> DOC10 refs; optional OCM brief ref; DOC11 one-line memory hints | full Running Brief memory overlay + DOC10 duplicate hints + OpenClaw-native duplication | keep additive and lean |
| Structured Q action | DOC1 selector -> DOC10 refs only | OCM chat brief injection | action-oriented path |
| Discovery / repair job | owner-doc scoped memory only | chat-turn memory injection | avoid accidental hot-path bloat |
| External bypass | DOC10 none | any DOC10 memory injection | preserve bypass reality |

#### 5.11F Combined DOC10 chat contribution ceiling (normative)

For Gateway-first chat, the total DOC10-contributed material that can reach DOC11's annotation builder for a single turn must not exceed:
- `DOC10_CHAT_CONTEXT_CAP_BASELINE = 700` tokens
- `DOC10_CHAT_CONTEXT_CAP_ASSISTED = 1200` tokens
- `DOC10_CHAT_CONTEXT_CAP_DISCOVERY = 1500` tokens, and only when the operation has explicitly entered a discovery-heavy path

This cap includes:
- route hints
- OCM brief excerpt or ref titles
- memory hints
- bucket titles / compact summaries
- freshness / repair / friction notes
- control/debug notes

The `DecisionPacket` itself does not count toward this cap unless parts of it are rendered into lean annotations.

#### 5.11G Cross-turn dedupe (required)

DOC10 must maintain a short-lived per-thread dedupe cache for annotation-level memory hints.
Rules:
- if the same memory ref was injected in either of the previous two turns and no new reason code was added, suppress the duplicate hint
- if a memory ref remains important, compress to a stable short label rather than re-emitting the full one-line hint
- emit `memory.injection_deduped` when suppression occurs
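
The suppression rule can be sketched as a small per-thread cache. This is an illustration under stated assumptions: `ThreadHintDedupe` is a hypothetical class, one instance is assumed per thread, turns are assumed to be numbered consecutively, and cache expiry is omitted. The caller is expected to emit `memory.injection_deduped` when the method returns `true`.

```typescript
interface HintRecord {
  lastInjectedTurn: number;
  reasonCodes: Set<string>;
}

class ThreadHintDedupe {
  private readonly seen = new Map<string, HintRecord>();

  // Returns true when the hint for `memoryRef` should be suppressed this turn.
  shouldSuppress(
    memoryRef: string,
    turn: number,
    reasonCodes: readonly string[],
  ): boolean {
    const prev = this.seen.get(memoryRef);
    if (prev === undefined) {
      this.seen.set(memoryRef, {
        lastInjectedTurn: turn,
        reasonCodes: new Set<string>(reasonCodes),
      });
      return false;
    }
    const known = prev.reasonCodes;
    const age = turn - prev.lastInjectedTurn;
    const injectedRecently = age >= 1 && age <= 2; // previous two turns
    const hasNewReason = reasonCodes.some((c) => !known.has(c));
    if (injectedRecently && !hasNewReason) {
      return true; // duplicate hint, nothing new to justify re-emitting it
    }
    // Record the injection and remember every reason code seen so far.
    const merged = new Set<string>(reasonCodes);
    known.forEach((c) => merged.add(c));
    this.seen.set(memoryRef, { lastInjectedTurn: turn, reasonCodes: merged });
    return false;
  }
}
```

Only actual injections update `lastInjectedTurn`, so a hint suppressed for two turns becomes eligible again on the third.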

#### 5.11H Memory telemetry (required)

DOC10/Q must expose:
- `memory.injection_selected`
- `memory.injection_trimmed`
- `memory.injection_skipped`
- `memory.injection_deduped`
- `memory.origin_viewed`

This is required to debug stale, wrong, or overemphasized memory behavior.

#### 5.11I Trim order and lean-return expectations (required)
When the combined DOC10 chat contribution must be trimmed, the default trim order is:
1. debug/control notes
2. route-hint prose
3. long bucket labels / summaries
4. OCM brief excerpt text
5. memory one-line hints
6. freshness / repair / friction notes
7. required refs and compact reason codes last
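
A minimal sketch of the trim order applied against a token cap such as `DOC10_CHAT_CONTEXT_CAP_BASELINE`. The `ContributionKind` labels and `trimToCap` are hypothetical, token counts are assumed to be precomputed, and dropping whole items (rather than shortening them) is a simplifying assumption.

```typescript
type ContributionKind =
  | "debug_control"
  | "route_hint_prose"
  | "bucket_summary"
  | "ocm_brief_excerpt"
  | "memory_hint"
  | "freshness_repair_friction"
  | "required_ref";

interface Contribution {
  kind: ContributionKind;
  tokens: number;
  text: string;
}

// Mirrors the numbered trim order: earliest entries are dropped first.
const TRIM_ORDER: readonly ContributionKind[] = [
  "debug_control",
  "route_hint_prose",
  "bucket_summary",
  "ocm_brief_excerpt",
  "memory_hint",
  "freshness_repair_friction",
  "required_ref",
];

function trimToCap(
  items: readonly Contribution[],
  capTokens: number,
): Contribution[] {
  const kept = [...items];
  let total = kept.reduce((sum, item) => sum + item.tokens, 0);
  for (const kind of TRIM_ORDER) {
    if (total <= capTokens) break;
    // Walk backwards so splice does not disturb the indices still to visit.
    for (let i = kept.length - 1; i >= 0 && total > capTokens; i--) {
      const item = kept[i];
      if (item !== undefined && item.kind === kind) {
        total -= item.tokens;
        kept.splice(i, 1);
      }
    }
  }
  return kept;
}
```

An implementation would also be expected to emit `memory.injection_trimmed` when a memory hint is dropped; that is left out of the sketch.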

DOC1 / memory-layer expectation:
- DOC10 expects lean memory query results from DOC1: refs, titles, tags, trust, scope, and compact snippet metadata;
- DOC10 does not expect DOC1 to return large free-form text blobs on the hot path.

OpenClaw dedupe expectation:
- if memory metadata indicates the item is already represented in OpenClaw-native workspace memory, DOC10 must suppress duplicate inline injection and prefer a stable ref hint only.

## 6) DLA, deterministic routing, and route-trace infrastructure

### 6.1 DLA config, guard, and circuit breaker

```ts
const DLACircuitBreakerSchema = z.object({
  failure_window_ms: z.number().int().default(300000),
  failure_threshold: z.number().int().min(1).max(10).default(3),
  cooldown_ms: z.number().int().default(600000),
  current_failures: z.number().int().default(0),
  tripped_at: z.string().optional(),
  tripped: z.boolean().default(false),
  schema_version: z.literal(1),
});
```

Guard constants:
- `DLA_MAX_SYNC_CALLS_PER_TURN = 1`
- `DLA_MAX_RESOLVE_TURNS = 1`
- `DLA_TIMEOUT_MS_DEFAULT = 1500`
- `DLA_TIMEOUT_MS_HARD_MAX = 2000`
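
One way the breaker fields and timeout constants could interact is sketched below. The trip and reset semantics are assumptions, since the schema names the fields but does not define them: here the breaker trips when `failure_threshold` failures fall inside `failure_window_ms`, and it resets once `cooldown_ms` has elapsed. `DlaCircuitBreaker` and `clampDlaTimeout` are hypothetical names.

```typescript
const DLA_TIMEOUT_MS_DEFAULT = 1500;
const DLA_TIMEOUT_MS_HARD_MAX = 2000;

interface BreakerConfig {
  failure_window_ms: number;
  failure_threshold: number;
  cooldown_ms: number;
}

class DlaCircuitBreaker {
  private readonly cfg: BreakerConfig;
  private failures: number[] = []; // epoch ms of failures inside the window
  private trippedAtMs: number | undefined;

  constructor(cfg: BreakerConfig) {
    this.cfg = cfg;
  }

  recordFailure(nowMs: number): void {
    const cutoff = nowMs - this.cfg.failure_window_ms;
    this.failures = this.failures.filter((t) => t > cutoff);
    this.failures.push(nowMs);
    if (this.failures.length >= this.cfg.failure_threshold) {
      this.trippedAtMs = nowMs;
    }
  }

  // True when a DLA call may be attempted at `nowMs`.
  allowsCall(nowMs: number): boolean {
    if (this.trippedAtMs === undefined) return true;
    if (nowMs - this.trippedAtMs < this.cfg.cooldown_ms) return false;
    // Assumed behavior: cooldown elapsed, so reset and allow a new attempt.
    this.trippedAtMs = undefined;
    this.failures = [];
    return true;
  }
}

// A requested timeout is never allowed to exceed the hard maximum.
function clampDlaTimeout(requestedMs?: number): number {
  return Math.min(
    requestedMs ?? DLA_TIMEOUT_MS_DEFAULT,
    DLA_TIMEOUT_MS_HARD_MAX,
  );
}
```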

### 6.2 Local intent and capability index
DOC10 depends on a cheap local index backed by capability metadata, entity labels, aliases, and selected summaries.

Required module:
- `apps/ec-service/src/orchestration/local-intent-index.ts`

The DOC3 capability metadata bridge must feed this index.

#### 6.2A Phase 1 bridge fallback: `SKILL.md` and artifact scanner
Until DOC3 exports the full Capability Registry Bridge, DOC10 may build a provisional bridge cache by scanning existing `SKILL.md` files, wrapper metadata, and known artifact descriptors at startup and on capability-change hooks.

Rules:
- this fallback is provisional, not canonical;
- provisional entries must be marked `source = "provisional_skill_scanner"`;
- once DOC3 canonical bridge entries exist, canonical entries override provisional scanner output;
- the scanner may not invent fields that it cannot derive;
- built-in OpenClaw skills and workspace-provided capabilities must be tagged as `origin_owner = "openclaw_runtime"` when that can be determined;
- scanner refresh must run on startup and on explicit bridge refresh, not on every user turn;
- low-confidence scanner outputs must default to the most conservative interpretation until confirmed:
  - unknown route tier defaults low;
  - unknown supervision posture defaults conservative;
  - unknown health defaults `degraded` or `unknown` rather than `healthy`;
  - unknown alias sets may not be treated as high-confidence route triggers.

#### 6.2B CapabilityAwarenessSnapshot schema

```ts
const CapabilityAwarenessSnapshotSchema = z.object({
  snapshot_id: z.string().max(160),
  scope: z.enum(["workspace", "session", "page", "operation"]),
  top_capability_ids: z.array(z.string().max(160)).max(12).default([]),
  top_route_family_ids: z.array(z.string().max(120)).max(8).default([]),
  unavailable_capabilities: z.array(z.object({
    capability_id: z.string().max(160),
    reason_code: z.string().max(80),
  })).max(12).default([]),
  quarantined_capability_ids: z.array(z.string().max(160)).max(12).default([]),
  blocked_route_ids: z.array(z.string().max(160)).max(12).default([]),
  bridge_state: z.enum(["canonical", "provisional_scanner", "missing", "stale"]),
  scanner_confidence: z.enum(["high", "mixed", "low"]).optional(),
  gateway_available: z.boolean(),
  effective_mode: OrchestrationModeSchema,
  updated_at: z.string(),
  schema_version: z.literal(1),
});
```

This snapshot is a derived view built from:
- DOC3 bridge metadata or provisional scanner output;
- DOC11/OpenClaw catalog and session controls;
- route blocklist;
- quarantine/health state;
- current page/entity context.

#### 6.2C Local intent query contract

```ts
const LocalIntentIndexQuerySchema = z.object({
  operation_type: z.string().max(80),
  source_surface: z.string().max(80),
  utterance: z.string().max(2000).optional(),
  selected_entity_refs: z.array(z.string().max(200)).max(20).default([]),
  page_context_label: z.string().max(120).optional(),
  effective_mode: OrchestrationModeSchema,
  schema_version: z.literal(1),
});

const LocalIntentIndexResultSchema = z.object({
  top_route_family_ids: z.array(z.string().max(120)).max(12),
  top_capability_ids: z.array(z.string().max(160)).max(12),
  bridge_missing_family_ids: z.array(z.string().max(120)).max(12).default([]),
  blocked_route_ids: z.array(z.string().max(160)).max(12).default([]),
  explanation_codes: z.array(z.string().max(80)).max(20).default([]),
  schema_version: z.literal(1),
});
```

Implementation constraints:
- LocalIntentIndex must remain cheap and deterministic; it may not perform synchronous remote fetches or cloud LLM calls.
- cache invalidation must run on capability installed/updated/removed, bridge refresh success/failure, route alias changes, and capability health changes.
- a stale index must be surfaced as stale, not silently treated as accurate.

#### 6.2D RouteAliasSeed schema and governance

```ts
const RouteAliasSeedSchema = z.object({
  alias_id: z.string().max(160),
  phrase: z.string().max(160),
  normalized_phrase: z.string().max(160),
  match_type: z.enum(["exact", "surface_exact"]),
  target_kind: z.enum(["entity_route", "gateway_route", "ui_command"]),
  target_id: z.string().max(160),
  required_surface: z.array(z.string().max(80)).max(8).default([]),
  required_mode: z.array(OrchestrationModeSchema).max(6).default([]),
  requires_confirmation: z.boolean().default(false),
  source: z.enum(["manual", "suggested_from_trace", "promoted_learning"]),
  enabled: z.boolean().default(true),
  priority: z.number().int().min(0).max(100).default(50),
  hit_count: z.number().int().min(0).default(0),
  misfire_count: z.number().int().min(0).default(0),
  last_used_at: z.string().optional(),
  created_from_trace_id: z.string().max(160).optional(),
  created_from_feedback_id: z.string().max(160).optional(),
  schema_version: z.literal(1),
});
```

Phase 1 rules:
- only exact or surface-constrained exact matches are allowed;
- free-text alias hits may directly dispatch only for Q/EC-owned actions or entity lookups;
- alias hits for OpenClaw-native work must still route through `gateway_interactive_chat` with a route hint;
- learned alias seeds must be reviewable and editable in Q before broad activation unless explicitly marked low-risk by policy;
- aliases may not override explicit quarantine or route blocklist state.
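
The Phase 1 rules can be sketched as an exact-match lookup followed by a dispatch decision. This is an illustration only: `normalizePhrase`, `matchAlias`, and `resolveDispatch` are hypothetical helpers, the normalization (trim, lowercase, collapse whitespace) is an assumption, `target_id` is assumed to be comparable with blocked route ids, and `target_kind = "gateway_route"` is assumed to stand for OpenClaw-native work.

```typescript
interface AliasSeed {
  alias_id: string;
  normalized_phrase: string;
  target_kind: "entity_route" | "gateway_route" | "ui_command";
  target_id: string;
  required_surface: string[];
  enabled: boolean;
  priority: number;
}

// Assumed normalization; the spec does not define how phrases are normalized.
function normalizePhrase(s: string): string {
  return s.trim().toLowerCase().replace(/\s+/g, " ");
}

function matchAlias(
  utterance: string,
  surface: string,
  seeds: readonly AliasSeed[],
  blockedRouteIds: ReadonlySet<string>,
): AliasSeed | undefined {
  const phrase = normalizePhrase(utterance);
  const hits = seeds
    .filter(
      (s) =>
        s.enabled &&
        s.normalized_phrase === phrase && // exact matches only in Phase 1
        !blockedRouteIds.has(s.target_id) && // aliases never override blocks
        (s.required_surface.length === 0 ||
          s.required_surface.indexOf(surface) !== -1),
    )
    .sort(
      (a, b) =>
        b.priority - a.priority ||
        (a.alias_id < b.alias_id ? -1 : a.alias_id > b.alias_id ? 1 : 0),
    );
  return hits[0];
}

// Q/EC-owned targets may dispatch directly; Gateway targets keep Gateway-first.
function resolveDispatch(hit: AliasSeed): "direct" | "gateway_interactive_chat" {
  return hit.target_kind === "gateway_route"
    ? "gateway_interactive_chat"
    : "direct";
}
```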

#### 6.2E Capability health mutation (required)
DOC10 must expose a canonical command for bridge-derived capability health changes:

```ts
const CapabilityHealthMutationSchema = z.object({
  capability_id: z.string().max(160),
  new_health_status: z.enum(["healthy", "degraded", "quarantined", "disabled"]),
  reason_code: z.string().max(80),
  source_component: z.string().max(120),
  route_trace_id: z.string().max(160).optional(),
  created_at: z.string(),
  schema_version: z.literal(1),
});
```

Applying this mutation must trigger:
- capability snapshot refresh;
- LocalIntentIndex refresh;
- route scoring cache invalidation;
- visible Q refresh where the capability card is open;
- quarantine cooldown / reprobe scheduling if the new state is `degraded` or `quarantined`.

#### 6.2F Capability registry derivation rules (required)
DOC10 may consume a richer capability registry view than the raw runtime capability snapshot, but the derivation rules must be explicit.

Required posture:
- runtime/Gateway/OpenClaw capability facts may be flat and execution-oriented;
- DOC3 bridge entries may be richer and user-facing;
- DOC10 may derive route-facing composite entries, but it must preserve provenance:
  - runtime fact
  - DOC3 metadata
  - provisional scanner fallback
- when fields conflict, the registry facade must label which source won and why.

### 6.3 Decision reason codes

```ts
const DecisionReasonCodeSchema = z.enum([
  "page_context_task",
  "page_context_project",
  "page_context_panel",
  "page_context_forum",
  "explicit_teach_request",
  "explicit_repair_request",
  "matched_entity_name",
  "matched_capability_name",
  "healthy_existing_route",
  "quarantined_route_blocked",
  "read_only_mode_blocked",
  "low_confidence_escalation",
  "fallback_no_viable_capability",
  "gateway_first_chat",
  "external_bypass_mode",
]);
```

Mode 0/1 intent cards and banners must use deterministic reason-code templates, not an LLM.

### 6.4 Capability Routing Decision Function (normative)
Inputs:
- `IntentResolution`
- `ActionabilityDecision`
- `DecisionPacket`
- capability registry facade
- capability health
- route blocklist
- route hints
- policy stack
- mode/budgets

### 6.5 deterministicRoute signature and thresholds

```ts
function deterministicRoute(
  envelope: OperationEnvelope,
  intent: IntentResolution,
  actionability: ActionabilityDecision,
  packet: DecisionPacket,
  modeState: OrchestrationModeState,
  budget: TurnAdvisorBudget,
): RouteDecision
```

Threshold constants:

```ts
const ROUTE_THRESHOLDS = {
  min_viable_score: 0.40,
  tie_spread: 0.15,
  fallback_to_chat_threshold: 0.30,
  dla_trigger_threshold: 0.75,
} as const;
```

Weighted scoring guidance:

```ts
score =
  0.30 * intent_match +
  0.15 * alias_match +
  0.15 * health_score +
  0.10 * route_priority +
  0.10 * utility_score +
  0.10 * page_context_alignment +
  0.05 * recent_success -
  0.05 * latency_penalty -
  0.10 * supervision_penalty -
  0.10 * quarantine_penalty -
  0.20 * user_blocklist_penalty;
```

Edge-case rules:
- if no viable candidates and discovery is not indicated -> blocked or chat fallback depending on actionability and threshold
- if tie spread is under threshold -> DLA may be consulted if allowed, else use deterministic tiebreakers
- if candidate is user-blocked -> drop before scoring


Phase 1 scoring guidance:
- if an exact `RouteAliasSeed` hit exists and targets a Q/EC-owned action, it may receive a deterministic boost sufficient to win without DLA
- if an exact `RouteAliasSeed` hit points to an OpenClaw-native route family, it should increase route confidence but still result in `gateway_interactive_chat`
- if capability bridge metadata is missing for a family, add an explicit `bridge_missing_penalty` and emit telemetry rather than silently pretending the capability does not exist
- if free-text default applies, deterministic scoring should still produce route hints and candidate ordering for telemetry/debugging, but execution authority remains Gateway-first
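
The weighted formula and two of the edge-case rules can be sketched as runnable code. This is a minimal illustration, assuming every signal is already normalized to the range 0 to 1. `RouteCandidate`, `scoreRoute`, and `rankCandidates` are hypothetical names, and the sketch deliberately returns `no_viable` and `tie` outcomes to the caller instead of interpreting `fallback_to_chat_threshold` or `dla_trigger_threshold`, whose exact semantics are not restated here.

```typescript
interface RouteSignals {
  intent_match: number;
  alias_match: number;
  health_score: number;
  route_priority: number;
  utility_score: number;
  page_context_alignment: number;
  recent_success: number;
  latency_penalty: number;
  supervision_penalty: number;
  quarantine_penalty: number;
  user_blocklist_penalty: number;
}

interface RouteCandidate {
  route_id: string;
  user_blocked: boolean;
  signals: RouteSignals;
}

const THRESHOLDS = { min_viable_score: 0.40, tie_spread: 0.15 } as const;

function scoreRoute(s: RouteSignals): number {
  return (
    0.30 * s.intent_match +
    0.15 * s.alias_match +
    0.15 * s.health_score +
    0.10 * s.route_priority +
    0.10 * s.utility_score +
    0.10 * s.page_context_alignment +
    0.05 * s.recent_success -
    0.05 * s.latency_penalty -
    0.10 * s.supervision_penalty -
    0.10 * s.quarantine_penalty -
    0.20 * s.user_blocklist_penalty
  );
}

type RankOutcome =
  | { kind: "no_viable" }
  | { kind: "tie"; route_ids: string[] }
  | { kind: "winner"; route_id: string; score: number };

function rankCandidates(candidates: readonly RouteCandidate[]): RankOutcome {
  const scored = candidates
    .filter((c) => !c.user_blocked) // user-blocked candidates drop before scoring
    .map((c) => ({ route_id: c.route_id, score: scoreRoute(c.signals) }))
    .filter((c) => c.score >= THRESHOLDS.min_viable_score)
    .sort(
      (a, b) =>
        b.score - a.score ||
        (a.route_id < b.route_id ? -1 : a.route_id > b.route_id ? 1 : 0),
    );
  const best = scored[0];
  if (best === undefined) return { kind: "no_viable" };
  const bestScore = best.score;
  const close = scored.filter((c) => bestScore - c.score < THRESHOLDS.tie_spread);
  if (close.length > 1) {
    // Caller decides: consult DLA if allowed, else deterministic tiebreakers.
    return { kind: "tie", route_ids: close.map((c) => c.route_id) };
  }
  return { kind: "winner", route_id: best.route_id, score: bestScore };
}
```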

### 6.6 RouteBlocklistSchema

```ts
const RouteBlocklistSchema = z.object({
  block_id: z.string().max(160),
  scope: z.enum(["user", "workspace", "session"]),
  route_id: z.string().max(160),
  reason: z.string().max(200).optional(),
  created_at: z.string(),
  created_by: z.enum(["user", "system"]),
  active: z.boolean().default(true),
  schema_version: z.literal(1),
});
```

Canonical path:
- `ELNOR_MEMORY/system/orchestration/route_blocklist.jsonl`
- `.../route_blocklist_current.json`

### 6.7 RouteDecision schema

```ts
const RouteDecisionSchema = z.object({
  decision_id: z.string().max(160),
  operation_id: z.string().max(160),
  intent_class: IntentClassSchema,
  actionability: ActionabilityLevelSchema,
  route_type: z.enum([
    "chat",
    "entity",
    "capability",
    "repair",
    "discovery",
    "proposal",
    "blocked",
  ]),
  selected_handler: z.string().max(120),
  selected_capability_id: z.string().max(160).optional(),
  selected_entity_route: z.string().max(160).optional(),
  confidence: z.number().min(0).max(1),
  consulted_dla: z.boolean().default(false),
  fallback_route_ids: z.array(z.string().max(160)).max(8).default([]),
  reasons: z.array(DecisionReasonCodeSchema).max(10),
  blocked_reason: z.string().max(200).optional(),
  schema_version: z.literal(1),
});
```

### 6.8 DispatchResult schema

```ts
const DispatchResultSchema = z.object({
  operation_id: z.string().max(160),
  decision_id: z.string().max(160),
  route_trace_id: z.string().max(160).optional(),
  correlation_id: z.string().max(160).optional(),
  job_id: z.string().max(160).optional(),
  session_key: z.string().max(200).optional(),
  stream_channel: z.string().max(200).optional(),
  result_type: z.enum([
    "chat_response",
    "entity_result",
    "action_result",
    "proposal_created",
    "discovery_started",
    "repair_started",
    "blocked",
    "error",
  ]),
  route_banner: z.string().max(240).optional(),
  intent_card: z.object({
    mode: z.enum(["templated", "detailed_debug"]),
    summary: z.string().max(300),
  }).optional(),
  confirmation_required: z.boolean().default(false),
  warnings: z.array(z.string().max(200)).default([]),
  owner_doc_result_ref: z.string().max(200).optional(),
  owner_doc_refusal_ref: z.string().max(200).optional(),
  usage_summary: ProvisionalUsageSummarySchema.optional(),
  schema_version: z.literal(1),
});
```

Normative:
- any dispatch that creates a visible running job must populate `job_id`;
- any dispatch that expects a live stream/push path must populate `stream_channel` or an equivalent read-model-backed stream identity;
- if the result proxies into an owner doc, the dispatch result must point to either a result artifact or a refusal artifact; an indeterminate outcome ("mutation maybe happened") is non-compliant.

### 6.9 Route trace subsystem

```ts
const RouteTraceRecordSchema = z.object({
  trace_id: z.string().max(160),
  operation_id: z.string().max(160),
  mode: OrchestrationModeSchema,
  effective_mode_reason: z.string().max(80).optional(),
  intent_class: z.string().max(80),
  selected_route_type: z.string().max(80),
  selected_handler: z.string().max(120),
  consulted_dla: z.boolean().default(false),
  confidence: z.number().min(0).max(1),
  candidate_route_ids: z.array(z.string().max(120)).max(10).default([]),
  decision_reason_codes: z.array(DecisionReasonCodeSchema).max(10).default([]),
  executed_route: z.string().max(120).optional(),
  gateway_session_key: z.string().max(200).optional(),
  effective_model_id: z.string().max(160).optional(),
  effective_thinking: z.string().max(40).optional(),
  effective_reasoning_visibility: z.string().max(40).optional(),
  latency_ms: z.number().int().min(0),
  user_override: z.boolean().default(false),
  outcome: z.enum(["success", "fallback", "blocked", "error"]),
  usage_summary: ProvisionalUsageSummarySchema.optional(),
  ts: z.string(),
  schema_version: z.literal(1),
});
```

Route traces are the canonical user-visible execution-decision record.
Each trace must be recorded first, when the route is chosen, and enriched later by executed-behavior / reverse-telemetry events.

Usage note:
- any cost/usage fields in `RouteTraceRecordSchema` are provisional seed fields until DOC13 freezes the canonical shared cost language;
- route traces must still carry those seed fields now so that cost does not become invisible on the most important execution artifact.

### 6.10 Decision feedback and self-learning

```ts
const DecisionFeedbackEventSchema = z.object({
  ts: z.string(),
  decision_id: z.string().max(160),
  operation_id: z.string().max(160),
  trace_id: z.string().max(160).optional(),
  run_id: z.string().max(160),
  agent_id: z.string().max(160),
  feedback_type: z.enum([
    "thumbs_up",
    "thumbs_down",
    "user_route_override",
    "manual_takeover",
    "too_many_turns",
    "failed_to_execute",
    "wrong_intent_class",
    "wrong_capability_choice",
    "needed_discovery",
    "needed_repair",
  ]),
  user_explanation: z.string().max(600).optional(),
  selected_route_type: z.string().max(120).optional(),
  replacement_route_type: z.string().max(120).optional(),
  schema_version: z.literal(1),
});
```

This signal set must bridge into DOC8 friction/self-learning.

---

## 7) Event intake, debounce, STOP, and accumulated event routing

### 7.1 Event intake service
Required modules:
- `apps/ec-service/src/orchestration/event-intake.ts`
- `apps/ec-service/src/orchestration/event-accumulator.ts`
- `apps/ec-service/src/orchestration/gateway-telemetry-bridge.ts`
- `apps/ec-service/src/orchestration/capability-awareness.ts`
- `apps/ec-service/src/orchestration/route-aliases.ts`

### 7.2 Event accumulator schemas

```ts
const EventAccumulatorConfigSchema = z.object({
  default_debounce_ms: z.number().int().min(100).max(30000).default(2000),
  max_accumulation_window_ms: z.number().int().min(1000).max(60000).default(10000),
  max_hold_time_ms: z.number().int().min(500).max(10000).default(5000),
  max_queued_events: z.number().int().min(10).max(500).default(100),
  schema_version: z.literal(1),
});

const SharedNormalizedEventKeySchema = z.object({
  source_family: z.string().max(80),
  target_ref: z.string().max(200).optional(),
  normalized_reason: z.string().max(120),
  time_bucket: z.string().max(80),
  schema_version: z.literal(1),
});

const AccumulatedEventGroupSchema = z.object({
  group_id: z.string().max(160),
  group_key: z.string().max(200),
  dedupe_key: SharedNormalizedEventKeySchema,
  event_type: z.string().max(120),
  priority: z.enum(["low", "normal", "high", "critical"]).default("normal"),
  count: z.number().int(),
  first_seen: z.string(),
  last_seen: z.string(),
  sample_event_refs: z.array(z.string().max(200)).max(10),
  schema_version: z.literal(1),
});
```

Normative:
- priority `critical` must flush immediately
- `max_hold_time_ms` must be enforced even under constant event churn
- normalized events sent to DOC8/DOC9/DOC10 proposals must reuse the same shared dedupe key so one failure does not create multiple duplicate artifacts across subsystems

### 7.3 Flush and routing rules
- trailing-edge flush on debounce pause
- forced flush at `max_hold_time_ms`
- immediate flush allowed for critical active-operation failures
- all normalized/accumulated groups persist before routing decisions
- DOC8 is the sole raw-event sequencer for friction-worthy failure streams
- friction-worthy normalized events must be forwarded to DOC8
- DOC10 may also act on normalized groups, but may not consume them privately
- if hook/debounce/retry infrastructure is not yet strong enough to guarantee delivery, the degraded state must be surfaced rather than silently swallowing event groups
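
The three flush triggers can be sketched as a pure decision function evaluated on each tick or incoming event. `PendingGroup`, `FlushReason`, and `flushReason` are hypothetical names, timestamps are assumed to be epoch milliseconds, and the ordering of checks (critical, then hold limit, then debounce pause) is an assumption.

```typescript
interface AccumulatorTiming {
  default_debounce_ms: number;
  max_hold_time_ms: number;
}

interface PendingGroup {
  priority: "low" | "normal" | "high" | "critical";
  firstSeenMs: number;
  lastSeenMs: number;
}

type FlushReason = "critical" | "max_hold" | "debounce_pause";

// Returns why the group should flush now, or undefined to keep accumulating.
function flushReason(
  group: PendingGroup,
  cfg: AccumulatorTiming,
  nowMs: number,
): FlushReason | undefined {
  if (group.priority === "critical") return "critical";
  // Enforced even when events keep arriving and the debounce never settles.
  if (nowMs - group.firstSeenMs >= cfg.max_hold_time_ms) return "max_hold";
  // Trailing-edge flush: no new event for a full debounce interval.
  if (nowMs - group.lastSeenMs >= cfg.default_debounce_ms) return "debounce_pause";
  return undefined;
}
```

Per the rules above, the group would be persisted before any routing decision is made on the flushed result; persistence is outside this sketch.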


#### Event Action Router mapping table (normative)
| Normalized event family | Default DOC10 action | Other required consumers |
|---|---|---|
| gateway/tool failure during active operation | update route trace, emit repair/learning consideration | DOC8 friction, DOC9 if repair-worthy |
| repeated unavailable capability | emit capability-gap signal or discovery offer | DOC8 learning |
| successful discovery/teach completion | create completion notice and learning/proposal candidates as allowed | DOC8 learning, DOC3/DOC9 proposal owners |
| mode/advisor/circuit state changes | update engineering telemetry and system pulse | none |
| user override / manual takeover | update route trace and feedback event | DOC8 routing-quality friction |

#### Proposal/repair/learning dedupe key rule
Any proposal, repair wake, learning candidate, or capability suggestion derived from normalized events must compute a shared dedupe key:

`dedupe_key = hash(group_id + subject_ref + proposal_family)`

Consumers may multicast the signal, but they must suppress duplicate open items that share the same active dedupe key.
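
A minimal sketch of computing and checking the shared key. The spec does not name a hash function, so 32-bit FNV-1a is used here purely for illustration; a real implementation would likely want a wider hash to reduce collision risk. `fnv1a32`, `proposalDedupeKey`, and `isDuplicateOpenItem` are hypothetical helpers, and joining the parts with a separator character (rather than plain concatenation) is an assumption made so that different field splits cannot produce the same input string.

```typescript
// 32-bit FNV-1a over UTF-16 code units, returned as 8 hex characters.
function fnv1a32(input: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return ("00000000" + h.toString(16)).slice(-8);
}

function proposalDedupeKey(
  groupId: string,
  subjectRef: string,
  proposalFamily: string,
): string {
  // Unit separator keeps ("ab", "c") and ("a", "bc") from concatenating alike.
  return fnv1a32([groupId, subjectRef, proposalFamily].join("\u001f"));
}

// Consumers suppress a new item when an open item already holds the same key.
function isDuplicateOpenItem(
  openKeys: ReadonlySet<string>,
  key: string,
): boolean {
  return openKeys.has(key);
}
```

Usage: each consumer computes the key from the same three fields, checks it against its open items, and only creates a new proposal, repair wake, or learning candidate when `isDuplicateOpenItem` returns `false`.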

### 7.4 STOP canonicalization (normative)
Canonical STOP path:
- `ELNOR_MEMORY/system/stop_request.json`

```ts
const STOPStateSchema = z.object({
  active: z.boolean(),
  activated_at: z.string().optional(),
  activated_by: z.string().max(160).optional(),
  scope: z.enum(["global", "write_actions", "discovery_only"]).default("global"),
  reason: z.string().max(200).optional(),
  schema_version: z.literal(1),
});
```

Rules:
1. STOP is checked before any write/mutating action and before any discovery/teach start.
2. STOP must propagate to in-flight jobs and system agents as a cancellation/clamp signal.
3. STOP must not corrupt durable writes; in-progress writes must complete atomically or roll back under the owner doc's rules.
4. STOP persists until explicitly cleared.
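
Rule 1 can be sketched as a gate evaluated before each checked action. The mapping from `scope` to blocked action kinds is an assumption (the schema lists the scopes without defining them): `global` is taken to block both writes and discovery/teach starts, `write_actions` only writes, and `discovery_only` only discovery/teach starts. `CheckedAction` and `blockedByStop` are hypothetical names.

```typescript
type StopScope = "global" | "write_actions" | "discovery_only";

interface StopState {
  active: boolean;
  scope: StopScope;
}

// Only the action kinds that Rule 1 requires a STOP check for.
type CheckedAction = "write" | "discovery";

function blockedByStop(stop: StopState, action: CheckedAction): boolean {
  if (!stop.active) return false;
  switch (stop.scope) {
    case "global":
      return true;
    case "write_actions":
      return action === "write";
    case "discovery_only":
      return action === "discovery";
  }
}
```

Because STOP persists until explicitly cleared, the gate reads the current canonical state on every check rather than caching a previous answer.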

#### 7.4A STOP migration / collision rule
If current implementation code uses a different stop-controller schema or auxiliary stop files, those paths must adapt into `STOPStateSchema` at the canonical durable path above.
- DOC10 canonical semantics are defined by `STOPStateSchema`.
- adapter layers may temporarily read legacy shapes, but they must write or mirror the canonical shape.
- Q and read-models must read one canonical effective STOP state rather than attempting to reconcile multiple files client-side.

---

## 8) Artifact traceability, proxy mutation, and durable-origin binding

### 8.1 ArtifactOriginSchema (normative)

```ts
const ArtifactOriginSchema = z.object({
  origin_operation_id: z.string().max(160).optional(),
  origin_route_trace_id: z.string().max(160).optional(),
  origin_decision_id: z.string().max(160).optional(),
  origin_event_group_id: z.string().max(160).optional(),
  origin_signal_type: z.enum([
    "user_prompt",
    "system_event",
    "nightly",
    "repair",
    "discovery",
    "manual",
  ]),
  origin_source_ref: z.string().max(200).optional(),
  created_by_component: z.string().max(120),
  created_at: z.string(),
  schema_version: z.literal(1),
});
```

Any DOC10-originated or DOC10-influenced durable artifact request must carry this structure.

Applies to at minimum:
- DOC1 memory candidates / memory proposals
- DOC8 learning/proposal artifacts if applicable
- DOC9 improvement proposals / repair wakes / repair sessions / capability repair requests
- DOC3 capability install/update proposals
- inbox items created by orchestration flows

### 8.2 Proxy mutation rule
DOC10 may not directly write foreign durable stores. It must use owner-document-approved command/proxy mutation paths.

### 8.3 Proxy mutation status schema

```ts
const ProxyMutationStatusSchema = z.object({
  mutation_id: z.string().max(160),
  target_owner: z.enum(["DOC1", "DOC3", "DOC8", "DOC9"]),
  status: z.enum(["open", "applying", "applied", "failed", "cancelled"]),
  error: z.string().max(240).optional(),
  created_at: z.string(),
  updated_at: z.string(),
  schema_version: z.literal(1),
});

const OwnerDocProxyResultSchema = z.object({
  mutation_id: z.string().max(160),
  target_owner: z.enum(["DOC1", "DOC3", "DOC8", "DOC9"]),
  status: z.enum(["applied", "refused", "failed"]),
  artifact_refs: z.array(z.string().max(200)).max(20).default([]),
  refusal_code: z.string().max(80).optional(),
  refusal_message: z.string().max(240).optional(),
  updated_at: z.string(),
  schema_version: z.literal(1),
});
```

### 8.4 User-facing provenance requirement
Q must support at least:
- `View Origin Trace`
- `View Origin Signal`
- `View Origin Route`

For route traces that created or influenced durable artifacts, Q should also render a trace-scoped transaction manifest containing:
- route_trace_id
- owner-doc commands/proxy calls issued
- resulting artifact refs
- resulting inbox items
- resulting repair / learning / memory signals
- reverse Gateway events linked to the same trace

If an owner-doc artifact is revert-eligible, the manifest should link to the owner-doc revert or rollback command rather than inventing a DOC10-owned revert path.

#### 8.4A TransactionManifestSchema (normative)
```ts
const TransactionManifestSchema = z.object({
  transaction_manifest_id: z.string().max(160),
  route_trace_id: z.string().max(160),
  operation_id: z.string().max(160),
  owner_doc_calls: z.array(z.object({
    target_owner: z.enum(["DOC1", "DOC3", "DOC8", "DOC9"]),
    mutation_id: z.string().max(160),
    command_ref: z.string().max(200).optional(),
    status: z.enum(["queued", "applied", "refused", "failed"]),
  })).max(40).default([]),
  artifact_refs: z.array(z.string().max(200)).max(40).default([]),
  inbox_refs: z.array(z.string().max(200)).max(40).default([]),
  linked_gateway_event_refs: z.array(z.string().max(200)).max(80).default([]),
  lifecycle_state: z.enum(["open", "stable", "closed"]).default("open"),
  created_at: z.string(),
  updated_at: z.string(),
  schema_version: z.literal(1),
});
```

Manifest rule:
- transaction manifests are read-only provenance groupings;
- they are not a distributed rollback planner;
- DOC10 may link owner-doc rollback or revert affordances, but it may not imply a one-click cross-owner revert-all operation unless a future spec explicitly defines it.

### 8.5 Artifact-origin propagation rule
If DOC10 influences a durable artifact, the owner-doc command/proxy path must preserve:
- `origin_operation_id`
- `origin_route_trace_id`
- `origin_decision_id`
- `origin_event_group_id` where applicable
- `origin_signal_type`
- `origin_source_ref` where applicable

Owner-doc handlers may add their own provenance, but they may not drop DOC10 origin fields without an explicit compatibility note in the ledger.

---

## 9) Permanent system agents: bootstrap, persistence, lifecycle, and visibility

### 9.1 System agents
- OCM
- CSM
- SHUA
- DLA (logical system agent / runtime-light advisory service)

### 9.2 Persistent identity vs continuous activity
- OCM: persistent identity, often active, but cached output preferred on fast path
- CSM: persistent identity, event-driven/bursty execution
- SHUA: persistent identity, recommendation-focused, off or nightly/shadow by default
- DLA: visible logical system agent, consulted only through the Decision Broker guard

### 9.3 Durable registry and event log
Canonical paths:
- `ELNOR_MEMORY/system/agents/system_agents_registry.json`
- `ELNOR_MEMORY/system/agents/system_agent_events.jsonl`

### 9.4 System agent schema

```ts
const SystemAgentSchema = z.object({
  agent_id: z.string().max(160),
  role: z.enum(["ocm", "csm", "shua", "dla"]),
  desired_state: z.enum(["enabled", "disabled"]),
  runtime_state: z.enum(["ready", "running", "degraded", "stopped", "error"]),
  auto_start: z.boolean().default(true),
  restart_policy: z.enum(["always", "on_failure", "never"]),
  heartbeat_at: z.string().optional(),
  last_selftest_at: z.string().optional(),
  last_error: z.string().max(240).optional(),
  max_restarts_per_minute: z.number().int().min(1).max(10).default(3),
  instruction_ref: z.string().max(200).optional(),
  model_profile_ref: z.string().max(200).optional(),
  schema_version: z.literal(1),
});
```

### 9.5 Heartbeat specification
A system agent heartbeat is stale if older than twice its configured interval.
Heartbeat data must be consumable by the System Agents page and Engineering Panel.
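
The staleness rule reduces to a single comparison. In this sketch `isHeartbeatStale` is a hypothetical helper, `heartbeat_at` is assumed to be an ISO 8601 timestamp (the schema only says string), and treating a missing or unparseable heartbeat as stale is an assumption.

```typescript
function isHeartbeatStale(
  heartbeatAt: string | undefined,
  intervalMs: number,
  nowMs: number,
): boolean {
  if (heartbeatAt === undefined) return true; // assumed: never seen means stale
  const seenMs = Date.parse(heartbeatAt);
  if (isNaN(seenMs)) return true; // assumed: unparseable means stale
  // Stale only when strictly older than twice the configured interval.
  return nowMs - seenMs > 2 * intervalMs;
}
```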

### 9.6 State transition table (normative)
| From | To | Trigger |
|---|---|---|
| stopped | ready | bootstrap complete |
| ready | running | work begins |
| running | ready | work ends |
| any | degraded | self-test fail / dependency degraded |
| degraded | ready | health restored and self-test passes |
| any | error | repeated failure beyond breaker |
| error | stopped | manual stop |
| error | ready | manual reset + self-test passes |
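
The table can be encoded directly as data and checked before a state change is applied. This sketch encodes only the rows listed above; `TRANSITIONS` and `canTransition` are hypothetical names, and the trigger strings are kept for display rather than enforced.

```typescript
type RuntimeState = "ready" | "running" | "degraded" | "stopped" | "error";

interface TransitionRow {
  from: RuntimeState | "any";
  to: RuntimeState;
  trigger: string;
}

// One entry per row of the 9.6 state transition table.
const TRANSITIONS: readonly TransitionRow[] = [
  { from: "stopped", to: "ready", trigger: "bootstrap complete" },
  { from: "ready", to: "running", trigger: "work begins" },
  { from: "running", to: "ready", trigger: "work ends" },
  { from: "any", to: "degraded", trigger: "self-test fail / dependency degraded" },
  { from: "degraded", to: "ready", trigger: "health restored and self-test passes" },
  { from: "any", to: "error", trigger: "repeated failure beyond breaker" },
  { from: "error", to: "stopped", trigger: "manual stop" },
  { from: "error", to: "ready", trigger: "manual reset + self-test passes" },
];

function canTransition(from: RuntimeState, to: RuntimeState): boolean {
  return TRANSITIONS.some(
    (row) => (row.from === "any" || row.from === from) && row.to === to,
  );
}
```

Any transition not present in the table, for example `stopped` to `running`, is rejected by this check.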

### 9.7 Mandatory self-tests
Each system agent must expose a self-test that validates its minimum dependencies without performing durable mutations.

### 9.8 Required commands
- `system_agent_start`
- `system_agent_stop`
- `system_agent_restart`
- `system_agent_reconfigure`
- `system_agent_run_selftest`

### 9.9 Q System Agents page
Must show:
- desired state
- runtime state
- restart policy
- heartbeat freshness
- last self-test result
- queue/activity summary
- start/stop/restart/self-test/configure controls


### 9.10 SystemAgentDirectory snapshot (normative)

To prevent “agents that exist but nobody knows how to use,” EC must expose a deterministic directory snapshot.

This directory is used by:
- Decision Context Builder (to include a tiny “system map”)
- Q (to render what system agents exist, their status, and how they can be queried)
- sub-agents/system agents (to know who owns what and how to request help)

Canonical path:
- `ELNOR_MEMORY/system/agents/system_agent_directory.json` (atomic JSON)

```ts
const SystemAgentDirectorySchema = z.object({
  updated_at: z.string(),
  entries: z.array(z.object({
    agent_id: z.string().max(160),
    role: z.enum(["ocm", "dla", "csm", "shua", "other"]),
    owner_doc: z.string().max(40),
    callable_query_types: z.array(z.string().max(60)).max(20).default([]),
    write_authority: z.enum(["none", "proposal_only", "command_router_only"]).default("none"),
    health_state: z.enum(["starting", "active", "degraded", "disabled", "offline"]),
    mode_visibility: z.array(OrchestrationModeSchema).max(6).optional(),
  })).max(32),
  schema_version: z.literal(1),
});
```

Normative:
- The directory is generated by EC (deterministic).
- A system agent that is not listed here must be treated as non-existent by other agents.
- The directory must be small enough to inject as a “map & compass” in ≤120 tokens.

### 9.11 Inter-agent/system-agent communication contract (normative)

DOC10 does not create a free-form agent mesh. Inter-agent coordination must be:
- bounded
- traceable
- deduped
- mode-aware
- preferably mediated by EC
- and never required for OpenClaw-native runtime success on ordinary Gateway chat turns

#### 9.11A SystemAgentQueryType

```ts
const SystemAgentQueryTypeSchema = z.enum([
  "ocm_cached_brief",
  "ocm_cross_agent_context",
  "capability_gap_assessment",
  "route_explanation",
  "repair_readiness_check",
  "shadow_evaluation_request",
]);
```

#### 9.11B AgentQueryEnvelope

```ts
const AgentQueryEnvelopeSchema = z.object({
  query_id: z.string().max(160),
  from_agent_id: z.string().max(160),
  to_agent_id: z.string().max(160),
  query_type: SystemAgentQueryTypeSchema,
  payload_ref: z.string().max(200).optional(),
  payload_scope: z.enum(["operation", "session", "workspace"]).default("operation"),
  origin_operation_id: z.string().max(160),
  origin_route_trace_id: z.string().max(160),
  dedupe_key: z.string().max(160).optional(),
  ttl_ms: z.number().int().min(1000).max(600000).default(60000),
  max_depth: z.number().int().min(1).max(3).default(2),
  hop_count: z.number().int().min(0).max(3).default(0),
  reply_expected: z.boolean().default(true),
  created_at: z.string(),
  schema_version: z.literal(1),
});
```

#### 9.11C AgentQueryResponseSchema

```ts
const AgentQueryResponseSchema = z.object({
  query_id: z.string().max(160),
  status: z.enum(["completed", "dropped", "timed_out", "failed", "rejected_by_mode"]),
  responding_agent_id: z.string().max(160).optional(),
  response_ref: z.string().max(200).optional(),
  summary: z.string().max(400).optional(),
  reason_code: z.string().max(80).optional(),
  latency_ms: z.number().int().min(0).optional(),
  schema_version: z.literal(1),
});
```

#### 9.11D Transport and routing

Required internal transport:
- `apps/ec-service/src/orchestration/system-agent-bus.ts`

Required deterministic endpoint/service surface:
- `orchestration/system-agents/query`
- `orchestration/system-agents/directory`
- `orchestration/system-agents/notices`

Normative rule:
- If an agent needs help, it must consult `SystemAgentDirectory` and choose based on `callable_query_types`.
- If multiple candidates match, prefer:
  1. local deterministic logic
  2. OCM cached brief or cached query output if relevant
  3. DLA escalation only if mode allows and deterministic options are exhausted

Transport rules:
- transport must be mediated through an explicit queue/service boundary, not direct class invocation;
- sends must return an enqueue acknowledgment or structured refusal;
- retries, dedupe, timeout, and dead-letter behavior must be explicit and testable.
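
The transport rules can be illustrated with a small in-memory queue whose `enqueue` always returns either an acknowledgment or a structured refusal. This is a sketch only: `SystemAgentQueue`, `EnqueueOutcome`, the capacity limit, and the reason codes `duplicate_dedupe_key` and `queue_full` are hypothetical, and retries and dead-letter handling are left out.

```typescript
// Hypothetical subset of AgentQueryEnvelope that the queue needs.
interface QueuedQuery {
  query_id: string;
  to_agent_id: string;
  dedupe_key?: string;
}

// A send always yields either an acknowledgment or a structured refusal.
type EnqueueOutcome =
  | { accepted: true; query_id: string; position: number }
  | { accepted: false; query_id: string; reason_code: string };

class SystemAgentQueue {
  private pending: QueuedQuery[] = [];
  private seenDedupeKeys: Record<string, boolean> = Object.create(null);
  private capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  enqueue(q: QueuedQuery): EnqueueOutcome {
    if (q.dedupe_key !== undefined && this.seenDedupeKeys[q.dedupe_key] === true) {
      return { accepted: false, query_id: q.query_id, reason_code: "duplicate_dedupe_key" };
    }
    if (this.pending.length >= this.capacity) {
      return { accepted: false, query_id: q.query_id, reason_code: "queue_full" };
    }
    if (q.dedupe_key !== undefined) {
      this.seenDedupeKeys[q.dedupe_key] = true;
    }
    this.pending.push(q);
    return { accepted: true, query_id: q.query_id, position: this.pending.length - 1 };
  }

  // The consumer side pulls from the queue; senders never call an agent class directly.
  dequeue(): QueuedQuery | undefined {
    return this.pending.shift();
  }
}
```

Returning a value for every send, rather than throwing, keeps refusals explicit and easy to assert on in tests.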

#### 9.11E Guardrails

Allowed:
- `agent_query` (bounded request/response)
- `agent_notice` (one-way, non-blocking)
- proposal creation via owner-doc pipeline
- event emission via event intake

Forbidden:
- ad hoc peer-to-peer durable writes
- ad hoc "call random agent" without directory entry
- recursive unbounded agent chains
- using agent messaging as a prerequisite for ordinary Gateway chat execution

Depth and cycle rules:
- `max_depth` may not exceed 3
- an envelope may not visit the same `to_agent_id` more than once
- if `ttl_ms` expires, the query must end in `timed_out` or `dropped`, not hang
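
A forwarding check for these rules might look like the sketch below. It assumes that `hop_count` counts hops already taken (so a hop is allowed only while `hop_count < max_depth`) and that the bus tracks a `visitedAgentIds` list alongside the envelope; neither the list nor the reason codes are defined by this spec.

```typescript
// Hypothetical subset of AgentQueryEnvelope used by the hop check.
interface EnvelopeHop {
  to_agent_id: string;
  ttl_ms: number;
  max_depth: number;
  hop_count: number;
  created_at: string; // ISO-8601 timestamp
}

type HopDecision =
  | { allowed: true }
  | { allowed: false; terminal_status: "dropped" | "timed_out"; reason_code: string };

function checkHop(env: EnvelopeHop, visitedAgentIds: string[], nowMs: number): HopDecision {
  // max_depth may not exceed 3.
  if (env.max_depth > 3) {
    return { allowed: false, terminal_status: "dropped", reason_code: "max_depth_exceeds_limit" };
  }
  // An expired query ends in a terminal state instead of hanging.
  const expiresAtMs = Date.parse(env.created_at) + env.ttl_ms;
  if (nowMs >= expiresAtMs) {
    return { allowed: false, terminal_status: "timed_out", reason_code: "ttl_expired" };
  }
  // Assumption: hop_count is the number of hops already taken.
  if (env.hop_count >= env.max_depth) {
    return { allowed: false, terminal_status: "dropped", reason_code: "depth_exhausted" };
  }
  // Cycle rule: the same target agent may not be visited again.
  if (visitedAgentIds.indexOf(env.to_agent_id) !== -1) {
    return { allowed: false, terminal_status: "dropped", reason_code: "agent_revisit" };
  }
  return { allowed: true };
}
```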

#### 9.11F Startup and shutdown ordering

Startup order:
1. mode authority
2. route-trace and telemetry sinks
3. capability registry facade
4. OCM
5. DLA
6. CSM
7. SHUA
8. optional later agents

Shutdown rule:
- `system_agent_stop` must either drain in-flight queries or mark them `dropped` with telemetry within a bounded timeout
- graceful shutdown semantics must be implemented before auto-restart loops are enabled
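
The ordering and drain rules above can be sketched as two small functions. The component identifiers in `STARTUP_ORDER` are hypothetical names for the listed items, and `finalizeStop` assumes it is invoked once the bounded drain timeout has already elapsed; the timer itself is out of scope here.

```typescript
// Fixed ranks taken from the startup list; the identifiers are hypothetical.
const STARTUP_ORDER: string[] = [
  "mode_authority",
  "route_trace_and_telemetry_sinks",
  "capability_registry_facade",
  "ocm",
  "dla",
  "csm",
  "shua",
];

// Orders components for startup; unknown (optional later) agents go last, alphabetically.
function startupSequence(components: string[]): string[] {
  const rank = (c: string): number => {
    const i = STARTUP_ORDER.indexOf(c);
    return i === -1 ? STARTUP_ORDER.length : i;
  };
  return components.slice().sort((a, b) => {
    const diff = rank(a) - rank(b);
    if (diff !== 0) return diff;
    return a < b ? -1 : a > b ? 1 : 0;
  });
}

interface InFlightQuery {
  query_id: string;
  finished: boolean;
}

// Called after the drain timeout: anything unfinished is marked dropped,
// and one telemetry record is emitted per dropped query.
function finalizeStop(
  inFlight: InFlightQuery[],
  emitTelemetry: (eventName: string, queryId: string) => void,
): string[] {
  const dropped = inFlight.filter((q) => !q.finished).map((q) => q.query_id);
  dropped.forEach((id) => emitTelemetry("agent_message.dropped", id));
  return dropped;
}
```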

#### 9.11G Telemetry and audit trail

Every agent query must emit:
- `agent_message.sent`
- `agent_message.acknowledged` OR `agent_message.rejected`
- `agent_message.completed` OR `agent_message.dropped` OR `agent_message.timed_out` OR `agent_message.failed`
- correlation to `origin_route_trace_id`

Agent failures that create user-visible impact should also be eligible for Unified Inbox surfacing.

Required audit rule:
- agent-message audit logs must preserve `from_agent_id`, `to_agent_id`, `query_type`, `query_id`, enqueue outcome, terminal outcome, and latency;
- direct synchronous in-memory calls that bypass the audit surface are non-compliant.

## 10) Environment awareness, entity discovery, and system pulse

### 10.1 Layer 1 remains map-and-compass only
Layer 1 is not a transcript dump. It is a compact orientation surface.

### 10.2 Layer 1 budget
Target total:
- 150 to 180 tokens including:
  - temporal grounding
  - environment map line
  - compact system pulse when non-empty

### 10.3 Entity discovery ownership contract
DOC10 owns the public orchestration facade contract. Subsystem docs own their underlying data and summary generation.

Running Brief / OCM and DOC7 must consume/produce summaries compatible with the canonical facade contract rather than define competing public endpoints.

### 10.4 Canonical entity discovery request/response schemas

```ts
const EntityDiscoverRequestSchema = z.object({
  entity_type: z.array(z.enum([
    "task",
    "project",
    "panel",
    "forum",
    "workspace",
    "agent",
    "memory",
    "standing_order",
    "bucket",
    "capability",
    "repair",
  ])).max(12).optional(),
  active_only: z.boolean().default(false),
  status_filter: z.array(z.string().max(40)).max(10).optional(),
  cursor: z.string().max(160).optional(),
  limit: z.number().int().min(1).max(50).default(20),
  schema_version: z.literal(1),
});

const EnvironmentEntitiesResponseSchema = z.object({
  items: z.array(z.object({
    ref_id: z.string().max(200),
    entity_type: z.string().max(80),
    title: z.string().max(200),
    state: z.string().max(80).optional(),
    updated_at: z.string().optional(),
    source_path: z.string().max(240).optional(),
  })).max(50),
  next_cursor: z.string().max(160).optional(),
  schema_version: z.literal(1),
});
```

### 10.5 System pulse

```ts
const SystemPulseSchema = z.object({
  current_mode: OrchestrationModeSchema,
  active_repairs_count: z.number().int().min(0),
  awaiting_approvals_count: z.number().int().min(0),
  degraded_subsystem_count: z.number().int().min(0),
  queue_pressure: z.enum(["low", "normal", "high"]).optional(),
  freshness_state: z.enum(["clear", "warning", "stale", "unknown"]).default("unknown"),
  context_health: z.enum(["good", "degraded", "unknown"]),
  bridge_state: z.enum(["canonical", "provisional_scanner", "missing", "stale"]),
  updated_at: z.string(),
  schema_version: z.literal(1),
});
```

The System Pulse is a compact, user-facing summary including:
- current mode
- active repairs count
- awaiting approvals count
- degraded subsystem count
- queue pressure if meaningful
- freshness alert if meaningful
- context health
- capability bridge state

#### 10.5A System pulse production cadence
- baseline expectation: refresh on meaningful state change and at least every 30 seconds while Q is active;
- if production cadence degrades, Q must show staleness rather than pretending pulse data is live;
- `SystemPulseSchema` without a production cadence contract is non-compliant.
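
One way Q could decide whether to show pulse data as live is a simple age check against the 30-second baseline. This is a sketch: `pulseDisplay` and the `"live"`/`"stale"` labels are hypothetical, and treating an unparseable `updated_at` as stale is an assumption.

```typescript
// Baseline refresh expectation from the cadence rule: 30 seconds.
const PULSE_MAX_AGE_MS = 30000;

type PulseDisplay = "live" | "stale";

// Decides how Q should label a pulse given its updated_at and the current time.
function pulseDisplay(
  updatedAt: string,
  nowMs: number,
  maxAgeMs: number = PULSE_MAX_AGE_MS,
): PulseDisplay {
  const updatedMs = Date.parse(updatedAt);
  // Assumption: an unparseable timestamp is shown as stale, never as live.
  if (isNaN(updatedMs)) return "stale";
  return nowMs - updatedMs <= maxAgeMs ? "live" : "stale";
}
```

Passing `nowMs` in explicitly, rather than calling `Date.now()` inside, keeps the check deterministic and testable.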

### 10.6 Context health
Context health is one of the following values, accompanied by short reason codes:
- `good`
- `degraded`
- `unknown`


### 10.7 Capability awareness snapshot and Context Card
DOC10 must expose a compact capability-awareness view for both routing and UI.

Required outputs:
- `CapabilityAwarenessSnapshot`
- deterministic `WhatCanIDoHereCard`

The Context Card should be derivable without an LLM from:
- current page context and selected entities
- capability-awareness snapshot
- current mode
- top route families / top healthy capabilities
- blocked or unavailable capability reasons

```ts
const WhatCanIDoHereCardSchema = z.object({
  context_label: z.string().max(160),
  effective_mode: OrchestrationModeSchema,
  bridge_state: z.enum(["canonical", "provisional_scanner", "missing", "stale"]),
  top_actions: z.array(z.object({
    label: z.string().max(120),
    operation_template_id: z.string().max(120),
    route_family_id: z.string().max(120).optional(),
    capability_id: z.string().max(160).optional(),
    availability: z.enum(["available", "degraded", "blocked"]),
    reason_code: z.string().max(80).optional(),
  })).max(12),
  unavailable_actions: z.array(z.object({
    label: z.string().max(120),
    reason_code: z.string().max(80),
  })).max(12).default([]),
  suggested_next_steps: z.array(z.string().max(160)).max(8).default([]),
  schema_version: z.literal(1),
});
```

Minimum card fields:
- current context label
- top relevant actions/capabilities
- unavailable actions with short reason codes
- mode-aware suggested next steps
- buttons that generate structured `OperationEnvelope`s

Normative:
- if `bridge_state != "canonical"`, the card must surface that state somewhere visible
- unavailable actions may not be silently omitted when their absence would misrepresent capability reality
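
The two normative rules can be sketched as a deterministic view builder that needs no LLM. `buildCardView`, the `bridge_notice` field, and the fallback code `reason_unspecified` are hypothetical; moving blocked candidates into `unavailable_actions` is one possible reading of the schema, not the only one.

```typescript
type BridgeState = "canonical" | "provisional_scanner" | "missing" | "stale";

interface CandidateAction {
  label: string;
  availability: "available" | "degraded" | "blocked";
  reason_code?: string;
}

interface CardView {
  bridge_notice: string | null;
  top_actions: CandidateAction[];
  unavailable_actions: { label: string; reason_code: string }[];
}

function buildCardView(bridgeState: BridgeState, candidates: CandidateAction[]): CardView {
  const usable = candidates.filter((a) => a.availability !== "blocked");
  const blocked = candidates.filter((a) => a.availability === "blocked");
  return {
    // A non-canonical bridge state must be visible somewhere on the card.
    bridge_notice: bridgeState === "canonical" ? null : "Capability bridge state: " + bridgeState,
    top_actions: usable.slice(0, 12),
    // Blocked actions are listed with a reason code instead of being silently omitted.
    unavailable_actions: blocked.slice(0, 12).map((a) => ({
      label: a.label,
      reason_code: a.reason_code !== undefined ? a.reason_code : "reason_unspecified",
    })),
  };
}
```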

## 11) Capability and skills orchestration

### 11.1 DOC3 remains canonical for capability artifacts
DOC3 owns:
- skill packages
- wrappers
- page knowledge
- connector descriptors if defined there
- capability-specific artifact semantics

DOC10 owns:
- route family selection
- capability registry facade contract
- route scoring and learning around capability use
- discovery/repair/proposal orchestration around capability artifacts

### 11.2 Capability stack ordering (normative)
1. native integration / adapter / Gateway-native route
2. connector / MCP / structured bridge
3. skill / wrapper
4. site or page-knowledge-guided workflow
5. raw fallback actions
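
A selection helper that honors this ordering might look like the following sketch. It assumes that the `route_tier` values 1 to 5 map onto the five positions of the list above, and that only `healthy` or `degraded` capabilities with `routing_eligible = true` are considered; `preferredCandidate` is a hypothetical name.

```typescript
interface RouteCandidate {
  capability_id: string;
  route_tier: number; // assumed mapping: 1 = native/Gateway-native ... 5 = raw fallback
  routing_eligible: boolean;
  health_status: "healthy" | "degraded" | "quarantined" | "disabled";
}

function preferredCandidate(candidates: RouteCandidate[]): RouteCandidate | undefined {
  const usable = candidates.filter(
    (c) => c.routing_eligible && (c.health_status === "healthy" || c.health_status === "degraded"),
  );
  const ordered = usable.slice().sort((a, b) => {
    // Lower tier wins; within a tier, healthy beats degraded; then a stable id tiebreak.
    if (a.route_tier !== b.route_tier) return a.route_tier - b.route_tier;
    if (a.health_status !== b.health_status) return a.health_status === "healthy" ? -1 : 1;
    return a.capability_id < b.capability_id ? -1 : a.capability_id > b.capability_id ? 1 : 0;
  });
  return ordered.length > 0 ? ordered[0] : undefined;
}
```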

### 11.3 Capability Registry Bridge (normative cross-doc requirement)
DOC3 must expose capability metadata consumable by DOC10.

```ts
const CapabilityRegistryBridgeEntrySchema = z.object({
  capability_id: z.string().max(160),
  family: z.string().max(80),
  title: z.string().max(200),
  aliases: z.array(z.string().max(80)).max(20).default([]),
  action_verbs: z.array(z.string().max(80)).max(20).default([]),
  route_tier: z.number().int().min(1).max(5),
  requires_supervision: z.boolean().default(false),
  dry_run_supported: z.boolean().default(false),
  routing_eligible: z.boolean().default(true),
  health_status: z.enum(["healthy", "degraded", "quarantined", "disabled"]),
  health_reason_code: z.string().max(80).optional(),
  origin_owner: z.enum(["doc3", "openclaw_runtime", "provisional_scanner"]),
  built_in_openclaw: z.boolean().default(false),
  metadata_ref: z.string().max(200),
  updated_at: z.string(),
  schema_version: z.literal(1),
});
```

Hooks required:
- `capability.installed`
- `capability.updated`
- `capability.removed`
- `capability.quarantined`
- `capability.health.changed`

These hooks must trigger:
- LocalIntentIndex refresh
- routing-cache invalidation
- capability-awareness snapshot refresh
- Q capability-awareness refresh where active

Canonical durable bridge cache path expected by DOC10:
- `ELNOR_MEMORY/system/capabilities/bridge_entries_current.json`

If DOC3 has not yet been amended to export the canonical bridge, DOC10 may maintain a provisional cache derived from the startup scanner, but canonical DOC3 bridge entries must replace provisional entries when available.
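
The replacement rule could be read per capability, as in the sketch below: a canonical entry supersedes a provisional entry with the same `capability_id`, while provisional entries with no canonical counterpart are kept. That per-entry reading is an assumption, and `mergeBridgeEntries` and `BridgeEntryLite` are hypothetical names.

```typescript
// Hypothetical subset of CapabilityRegistryBridgeEntrySchema.
interface BridgeEntryLite {
  capability_id: string;
  origin_owner: "doc3" | "openclaw_runtime" | "provisional_scanner";
  title: string;
}

function mergeBridgeEntries(
  provisional: BridgeEntryLite[],
  canonical: BridgeEntryLite[],
): BridgeEntryLite[] {
  const canonicalIds = canonical.map((e) => e.capability_id);
  // Keep only provisional entries that no canonical entry supersedes.
  const keptProvisional = provisional.filter((e) => canonicalIds.indexOf(e.capability_id) === -1);
  return canonical
    .concat(keptProvisional)
    .sort((a, b) => (a.capability_id < b.capability_id ? -1 : a.capability_id > b.capability_id ? 1 : 0));
}
```

Because `origin_owner` is preserved on every entry, a reader of the merged cache can still tell which entries remain provisional.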

#### 11.3A Capability registry facade query contract

```ts
const CapabilityRegistryFacadeQuerySchema = z.object({
  family_filter: z.array(z.string().max(80)).max(20).optional(),
  capability_ids: z.array(z.string().max(160)).max(50).optional(),
  health_filter: z.array(z.enum(["healthy", "degraded", "quarantined", "disabled"])).max(4).optional(),
  include_openclaw_builtins: z.boolean().default(true),
  schema_version: z.literal(1),
});

const CapabilityRegistryFacadeResponseSchema = z.object({
  entries: z.array(CapabilityRegistryBridgeEntrySchema).max(200),
  bridge_state: z.enum(["canonical", "provisional_scanner", "missing", "stale"]),
  updated_at: z.string(),
  schema_version: z.literal(1),
});
```

This facade is the only supported query surface for deterministic routing and UI capability-awareness reads.

#### 11.3B Bridge derivation, hot-reload, and cache invalidation (required)
- DOC10 may keep a derived capability cache, but it must preserve source provenance for each field.
- capability installed/updated/removed, bridge refresh success/failure, route alias changes, and capability health mutation must invalidate or refresh the route-facing cache.
- if hot-reload fails, the cache must be labeled stale and surfaced as stale in both telemetry and read-models.

### 11.4 Capability registry facade
DOC10 consumes bridge entries and exposes a route-facing registry facade. It does not replace DOC3 as artifact owner.

### 11.5 Keyboard shortcut registry
Shortcut metadata may improve route quality and capability synthesis. It belongs to the capability registry facade, not to primary route family selection.

### 11.6 App accessibility profiles
Accessibility profiles may affect synthesis and routing fallback choice.

### 11.7 Capability versioning and rollback
Capability artifacts must support versioning and rollback through DOC3/DOC9 owner paths.

### 11.8 Capability cost accounting
Capability usage must contribute to route trace and cost telemetry.

---

## 12) Capability discovery, teaching, learning, and repair hooks

### 12.1 DiscoveryRun schema

```ts
const DiscoveryRunSchema = z.object({
  discovery_id: z.string().max(160),
  target_domain: z.string().max(80),
  goal_summary: z.string().max(300),
  state: z.enum([
    "queued",
    "waiting_for_dependencies",
    "observing",
    "exploring",
    "evaluating",
    "synthesizing",
    "proposal_ready",
    "budget_exhausted",
    "timed_out",
    "quarantined",
    "cancelled",
    "failed",
    "completed",
  ]),
  requires_supervision: z.boolean().default(true),
  dry_run: z.boolean().default(false),
  started_at: z.string().optional(),
  completed_at: z.string().optional(),
  schema_version: z.literal(1),
});
```

### 12.2 Exploration Execution Loop (phase-later, non-normative)
Preserved idea only. Do not treat this as a required implementation contract for Phase 0-4.

Any future implementation must first define:
- deterministic loop entry criteria
- supervision and approval boundaries
- mutation permission model
- explicit stop/cancel semantics
- completion criteria
- failure / rollback semantics

Until then, discovery work should use bounded owner-doc jobs and explicit proposals rather than an open-ended exploration loop.

### 12.3 Speculative action proposal schema

```ts
const SpeculativeActionProposalSchema = z.object({
  proposal_id: z.string().max(160),
  proposed_action: z.string().max(120),
  target_ref: z.string().max(200).optional(),
  confidence: z.number().min(0).max(1),
  read_only_safe: z.boolean(),
  expected_assertion_refs: z.array(z.string().max(160)).max(10).default([]),
  schema_version: z.literal(1),
});
```

### 12.4 Write-action gating table (normative)
| Mode | Read-only safe actions | Mutating actions |
|---|---|---|
| baseline | no discovery loop | blocked |
| assisted | blocked except explicit discovery start | blocked |
| shadow | shadow only, no mutating execution | blocked |
| discovery | allowed if policy and supervision permit | only if explicitly approved by discovery policy and STOP inactive |
| lab | explicit policy only | explicit policy only |

The deterministic runner must always cross-check tool/action classes against a hard denylist for destructive tools, regardless of any LLM flag.
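
The gating table and the denylist cross-check can be sketched as one deterministic function. The mode names come from the table; everything else (`gateAction`, the request fields, and the two denylist entries) is hypothetical, and the baseline row is read here as blocking both columns.

```typescript
type GatingMode = "baseline" | "assisted" | "shadow" | "discovery" | "lab";
type GateDecision = "allow" | "shadow_only" | "block";

interface GateRequest {
  mode: GatingMode;
  mutating: boolean;
  tool_class: string;
  explicit_discovery_start: boolean;
  policy_allows: boolean;
  supervision_ok: boolean;
  stop_active: boolean;
}

// Hypothetical denylist entries; the real list is owned by policy, not by this sketch.
const DESTRUCTIVE_TOOL_DENYLIST: string[] = ["filesystem_wipe", "credential_delete"];

function gateAction(req: GateRequest): GateDecision {
  // The hard denylist is checked first and cannot be overridden by any flag.
  if (DESTRUCTIVE_TOOL_DENYLIST.indexOf(req.tool_class) !== -1) return "block";
  switch (req.mode) {
    case "assisted":
      return !req.mutating && req.explicit_discovery_start ? "allow" : "block";
    case "shadow":
      return req.mutating ? "block" : "shadow_only";
    case "discovery":
      if (req.mutating) return req.policy_allows && !req.stop_active ? "allow" : "block";
      return req.policy_allows && req.supervision_ok ? "allow" : "block";
    case "lab":
      return req.policy_allows ? "allow" : "block";
    default:
      // baseline: no discovery loop, mutating actions blocked.
      return "block";
  }
}
```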

### 12.5 Success assertions and evaluation split

```ts
const SuccessAssertionSchema: z.ZodTypeAny = z.lazy(() => z.discriminatedUnion("kind", [
  z.object({
    kind: z.literal("ui_element_present"),
    selector: z.string(),
    timeout_ms: z.number().int().positive().default(3000),
  }),
  z.object({
    kind: z.literal("window_title_match"),
    pattern: z.string(),
    timeout_ms: z.number().int().positive().default(3000),
  }),
  z.object({
    kind: z.literal("file_created"),
    path: z.string(),
    timeout_ms: z.number().int().positive().default(5000),
  }),
  z.object({
    kind: z.literal("text_match"),
    target: z.string(),
    pattern: z.string(),
    timeout_ms: z.number().int().positive().default(3000),
  }),
  z.object({
    kind: z.literal("all_of"),
    children: z.array(SuccessAssertionSchema).min(1),
  }),
  z.object({
    kind: z.literal("any_of"),
    children: z.array(SuccessAssertionSchema).min(1),
  }),
]));
```
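
A recursive evaluator for this assertion tree could be sketched as follows. The `Assertion` type is a dependency-free mirror of the schema above without the `timeout_ms` fields, and the `probe` callback that checks a single leaf is a hypothetical injection point; timeouts and real UI or file probing are omitted.

```typescript
type Assertion =
  | { kind: "ui_element_present"; selector: string }
  | { kind: "window_title_match"; pattern: string }
  | { kind: "file_created"; path: string }
  | { kind: "text_match"; target: string; pattern: string }
  | { kind: "all_of"; children: Assertion[] }
  | { kind: "any_of"; children: Assertion[] };

// Leaf assertions are every variant except the two combinators.
type LeafAssertion = Exclude<Assertion, { kind: "all_of" | "any_of" }>;

function evaluateAssertion(a: Assertion, probe: (leaf: LeafAssertion) => boolean): boolean {
  switch (a.kind) {
    case "all_of":
      return a.children.every((child) => evaluateAssertion(child, probe));
    case "any_of":
      return a.children.some((child) => evaluateAssertion(child, probe));
    default:
      // Remaining kinds are leaves; the injected probe decides their outcome.
      return probe(a);
  }
}
```

Injecting `probe` keeps the tree logic deterministic and lets tests substitute canned results for real environment checks.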

### 12.6 Demonstration / Teach mode (phase-later, non-normative)
Teach mode must be explicit, visible, quota-bound, and separately specified before implementation.
Required future limits:
- max duration
- max storage quota
- STOP abort handling
- target app/window scoping
- user-visible countdown/stop controls

This section is preserved as a future-phase requirement, not a current implementation contract.

### 12.7 Workflow Pattern Detector (phase-later, non-normative)
Workflow pattern detection should propose, not silently install.

Current rule:
- preserve the idea
- do not implement broad auto-pattern learning until route misfire tracking, review UX, and owner-doc acceptance paths are fully wired

### 12.8 Process memory integration
Process-memory summaries may inform discovery or route hints, but they are not the canonical route-history store.

### 12.9 Capability synthesis pipeline (phase-later, non-normative)
Preserved future pipeline only:
1. trace normalization
2. literal replay draft
3. parameter extraction
4. structural generalization
5. assertion binding
6. packaging
7. verification
8. proposal generation

Current rule:
- do not treat this as normative for Phase 0-4
- capability growth should currently land as proposals, repair offers, wrapper updates, or known-unautomatable notes

### 12.10 Capability outputs
Outputs may include:
- capability install proposal
- capability update proposal
- wrapper update proposal
- page knowledge update proposal
- repair offer
- known-unautomatable note

### 12.11 Capability quarantine / graveyard
Quarantined capability artifacts must be excluded from routing via `routing_eligible = false` and surfaced in Q with repair/proposal options.

### 12.12 Supervision and dry-run defaults for new capabilities
Newly synthesized or heavily changed capabilities must default to:
- `requires_supervision = true` for the first 3 runs
- `dry_run = true` where supported until approved for real execution
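
These defaults reduce to a small pure function, sketched here under the assumption that the caller tracks a count of completed runs and an approval flag. The names `runPolicyFor`, `completedRuns`, and `approvedForRealExecution` are hypothetical.

```typescript
interface CapabilityRunPolicy {
  requires_supervision: boolean;
  dry_run: boolean;
}

// Defaults for a newly synthesized or heavily changed capability.
function runPolicyFor(
  completedRuns: number,
  dryRunSupported: boolean,
  approvedForRealExecution: boolean,
): CapabilityRunPolicy {
  return {
    // Supervision is required for the first 3 runs (runs 0, 1, and 2 completed so far).
    requires_supervision: completedRuns < 3,
    // Dry-run applies only where supported, and only until real execution is approved.
    dry_run: dryRunSupported && !approvedForRealExecution,
  };
}
```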


### 12.13 Route optimization and proactive suggestions
Nightly or batched optimization may propose:
- route alias seed candidates
- route preference adjustments
- capability suggestions for repeated unmet intents
- discovery offers for repeated gaps

Rules:
- optimization produces proposals or low-risk learned hints with provenance
- optimization may not silently rewrite core routing policy without an owner-doc path and audit trail
- repeated unmet intents should be thresholded and deduped before surfacing
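
The threshold-and-dedupe rule for repeated unmet intents might be sketched like this. The `intent_key` grouping field, the `gapsToSurface` helper, and the idea of passing in a list of already-surfaced keys are assumptions made for illustration.

```typescript
interface UnmetIntentEvent {
  intent_key: string; // hypothetical normalized key for "the same unmet intent"
  observed_at: string;
}

// Returns the intent keys that crossed the threshold and have not been surfaced yet.
function gapsToSurface(
  events: UnmetIntentEvent[],
  threshold: number,
  alreadySurfaced: string[],
): string[] {
  const counts: Record<string, number> = Object.create(null);
  events.forEach((e) => {
    counts[e.intent_key] = (counts[e.intent_key] || 0) + 1;
  });
  return Object.keys(counts)
    .filter((key) => (counts[key] || 0) >= threshold)
    .filter((key) => alreadySurfaced.indexOf(key) === -1)
    .sort();
}
```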

### 12.14 Learning receipts and job completion notices
Discovery, synthesis, repair-adjacent learning, and major route-learning jobs must emit a compact completion notice consumable by Q.

```ts
const JobCompletionNoticeSchema = z.object({
  notice_id: z.string().max(160),
  job_id: z.string().max(160),
  family: z.enum(["discovery", "repair", "learning", "optimization", "gateway_chat"]),
  title: z.string().max(200),
  summary: z.string().max(400),
  related_trace_id: z.string().max(160).optional(),
  related_artifact_ref: z.string().max(200).optional(),
  created_at: z.string(),
  schema_version: z.literal(1),
});
```

---

## 13) OpenClaw and Gateway integration contract (DOC11 / DOC4 cross-link)

### 13.1 Preserve OpenClaw's native runtime
OpenClaw remains the native execution intelligence for Gateway-routed chat and tool-enabled runtime. DOC10 must not replace or micromanage OpenClaw's internal agent loop.

### 13.2 Gateway-first interactive chat cross-link
For interactive chat originating from Q:
- `selected_handler` MUST be `gateway_interactive_chat`, except under `external_bypass` or gateway-degraded fallback conditions
- direct provider calls are background-only and must not replace the interactive Gateway path

DOC11 owns the concrete Gateway-first chat assembly and control hierarchy.

### 13.3 Modes influence annotations and advisor posture, not Gateway-first routing
Modes may affect:
- whether DLA/OCM/CSM contribute to annotations or route hints
- annotation size and content
- whether discovery/repair/teaching are offered

Modes must not silently change the fact that Q interactive chat is Gateway-first unless explicit bypass/degraded fallback is active.
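
The handler rule in 13.2 and the mode rule in 13.3 can be sketched together. Only `gateway_interactive_chat` is named as a handler by this document; the other two handler values below borrow the names of the bypass and fallback conditions and are hypothetical, as is the context shape.

```typescript
type InteractiveHandler =
  | "gateway_interactive_chat"
  | "external_bypass" // hypothetical handler name for the bypass condition
  | "gateway_degraded_fallback"; // hypothetical handler name for the degraded condition

interface InteractiveChatContext {
  mode: string;
  external_bypass_active: boolean;
  gateway_degraded: boolean;
}

function selectInteractiveHandler(ctx: InteractiveChatContext): InteractiveHandler {
  if (ctx.external_bypass_active) return "external_bypass";
  if (ctx.gateway_degraded) return "gateway_degraded_fallback";
  // ctx.mode is deliberately not consulted: modes shape annotations and
  // advisor posture, not the Gateway-first routing decision.
  return "gateway_interactive_chat";
}
```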

### 13.4 Lean EC annotations only
Lean EC annotations for Gateway-routed chat must:
- be additive
- be bounded
- avoid duplicating OpenClaw-native identity/context files
- follow the DOC11 Lean EC Annotation Contract

Normative:
- DOC10 may provide refs, compact facts, and reason codes to DOC11's annotation builder
- DOC10 may not pre-render a second full prompt block
- OCM brief use on this path must be cached-brief or compact-summary only
- the total DOC10 contribution must obey section 5.11F before DOC11 performs final annotation trim

#### 13.4A Annotation handshake and health source rules
- DOC10 owns the handoff payload and its routing facts;
- DOC11 owns final annotation assembly and Gateway dispatch;
- gateway health shown in DOC10/Q must come from a declared DOC11 / Gateway health source, not from UI guesswork;
- if DOC11 cannot consume a DOC10 annotation field, that incompatibility must be surfaced through explicit handoff refusal or degraded-state telemetry rather than silent field dropping.

### 13.5 Executed-route and reverse-telemetry contract from Gateway
Gateway completion and tool/runtime events must feed back into:
- route trace
- capability watermark
- running jobs state
- memory/self-learning signals where appropriate
- repair/discovery/wrong-route signals where appropriate
- Q executed-route displays

Minimum reverse-telemetry event families:
- `gateway.chat.started`
- `gateway.chat.completed`
- `gateway.chat.failed`
- `gateway.chat.aborted`
- `gateway.tool.started`
- `gateway.tool.completed`
- `gateway.tool.failed`
- `gateway.skill.invoked`
- `gateway.skill.failed`
- `gateway.approval.requested`
- `gateway.approval.resolved`
- `gateway.lifecycle.degraded`
- `gateway.lifecycle.recovered`
- `gateway.abort.acknowledged`
- `gateway.abort.timeout`

Each reverse-telemetry event must include at minimum:
- `operation_id`
- `route_trace_id`
- `correlation_id` (must match handoff; default = `route_trace_id`)
- `session_key`
- timestamps
- success/failure state
- `tool_name` / `capability_id` where relevant
- compact summary or refs for UI and learning systems

#### 13.5A GatewayReverseEventSchema

```ts
const GatewayReverseEventSchema = z.object({
  family: z.string().max(80),
  operation_id: z.string().max(160),
  route_trace_id: z.string().max(160),
  correlation_id: z.string().max(160),
  session_key: z.string().max(200),
  capability_id: z.string().max(160).optional(),
  tool_name: z.string().max(160).optional(),
  summary: z.string().max(240).optional(),
  success: z.boolean().optional(),
  emitted_at: z.string(),
  schema_version: z.literal(1),
});
```

#### 13.5B ExecutedBehaviorBundleSchema

```ts
const ExecutedBehaviorBundleSchema = z.object({
  route_trace_id: z.string().max(160),
  correlation_id: z.string().max(160),
  session_key: z.string().max(200),
  executed_route: z.enum(["gateway_first", "gateway_degraded_fallback", "external_bypass", "local_structured"]),
  effective_model_id: z.string().max(160).optional(),
  effective_thinking_level: z.string().max(80).optional(),
  effective_reasoning_visibility: z.string().max(80).optional(),
  effective_verbose_level: z.string().max(80).optional(),
  ec_annotations_used: z.boolean(),
  ec_annotation_tokens: z.number().int().min(0).optional(),
  tool_names: z.array(z.string().max(160)).max(50).default([]),
  skill_ids: z.array(z.string().max(160)).max(50).default([]),
  approval_events: z.array(z.string().max(160)).max(20).default([]),
  abort_reason: z.string().max(200).optional(),
  usage_summary: ProvisionalUsageSummarySchema.optional(),
  completed_at: z.string().optional(),
  schema_version: z.literal(1),
});
```

This bundle is the canonical executed-truth payload for:
- route banner updates
- engineering panel drill-down
- capability watermark refresh
- trace manifest linking

#### 13.5C Reverse transport, echo, and truthfulness rules
Producer-side contract:
- DOC11 / Gateway integration must emit or normalize reverse events into `GatewayReverseEventSchema`;
- `correlation_id` and `session_key` must be echoed from handoff or explicitly labeled unknown;
- if a desired reverse-event family cannot be emitted natively, the bridge must surface capability limitations explicitly rather than fabricating event certainty.

Required interop behaviors:
- reverse events must be delivered over a declared stream/push transport or a documented durable-queue fallback;
- duplicate events must be safe to consume via correlation + sequence handling;
- EC must be able to distinguish:
  - no event yet
  - event unsupported
  - transport degraded
  - terminal failure
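
Duplicate-safe consumption could be sketched as below. Note that `GatewayReverseEventSchema` as shown does not define a sequence number, so the `sequence` field here is a hypothetical addition, as is the `ReverseEventDeduper` class; the sketch also drops late out-of-order events, which a real consumer may prefer to buffer instead.

```typescript
interface SequencedReverseEvent {
  correlation_id: string;
  sequence: number; // hypothetical per-correlation monotonic counter
  family: string;
}

class ReverseEventDeduper {
  private lastSequence: Record<string, number> = Object.create(null);

  // Returns true if the event is new for its correlation id and should be processed.
  accept(ev: SequencedReverseEvent): boolean {
    const last = this.lastSequence[ev.correlation_id];
    if (last !== undefined && ev.sequence <= last) {
      return false; // duplicate or stale redelivery
    }
    this.lastSequence[ev.correlation_id] = ev.sequence;
    return true;
  }
}
```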

#### 13.5D Q rendering minimums
Q must render at minimum:
- a live indicator when Gateway/tool execution is in progress
- a compact post-completion executed-route banner
- an expandable executed-behavior / tool-events panel in engineering mode
- visible degraded state when Gateway fallback or abort timeout occurred
- explicit "telemetry degraded" state if reverse events are missing or transport is impaired

### 13.6 Parallel shadow/preflight allowed, dual live execution forbidden
Allowed:
- shadow route comparison
- shadow DLA reasoning where mode allows
- preflight capability-awareness or session-control fetch
- telemetry subscription setup

Forbidden:
- dual live execution of the same request across Gateway and a second runtime
- racing structured capability execution against Gateway for the same `operation_id`
- duplicate side-effecting actions for a single live user request
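
A guard against dual live execution might claim each `operation_id` for exactly one runtime, as in this sketch. `LiveExecutionGuard` and its method names are hypothetical, and shadow or preflight work is assumed to bypass the guard because it is not live execution.

```typescript
class LiveExecutionGuard {
  private owners: Record<string, string> = Object.create(null);

  // Returns true only for the first claimant of an operation_id.
  claim(operationId: string, runtime: string): boolean {
    if (this.owners[operationId] !== undefined) {
      return false; // a live execution already owns this operation
    }
    this.owners[operationId] = runtime;
    return true;
  }

  ownerOf(operationId: string): string | undefined {
    return this.owners[operationId];
  }
}
```

A second claim is refused even when it comes from the same runtime, which also covers the duplicate side-effect case.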

### 13.7 Anti-hamstringing rule (normative)
DOC10/DOC11/EC/Q must guide and enrich OpenClaw, not constrain it into a passive transport. Once a task is intentionally routed into OpenClaw's runtime, OpenClaw retains latitude to use its native terminal/browser/computer/tools/skills/runtime loop under the current mode and policy.

Corollary:
- deterministic fast paths may accelerate selection
- deterministic fast paths must not strip OpenClaw-native problem-solving latitude from tasks that genuinely belong in the Gateway runtime

---

## 14) UI, telemetry, route transparency, and user controls

### 14.1 Required surfaces
Q must provide at minimum:
- inline route banner for executed action-like routes
- intent card
- `What can I do here?` capability Context Card
- route override controls
- Route Trace viewer / orchestration log surface
- Route Alias manager
- System Agents page
- Engineering / Orchestration Panel
- Unified Inbox
- Running Jobs view
- memory management surfaces with provenance controls
- learning receipts / job completion notices

### 14.2 Route banner
The user-facing route banner should show the **executed route**, not just the planned route.
For trivial chat/entity lookups it may be hidden or compacted.

### 14.3 Intent card
Mode 0/1 intent cards must be deterministic/templated from reason codes. They must not invoke an LLM solely to explain routing.

### 14.4 Route controls
At minimum:
- `Use different route`
- `Just answer, do not act`
- `Teach this`
- `Repair this`
- `Do not use this route again`

These controls must generate new `OperationEnvelope`s or canonical commands, not merely decorative local state.

#### 14.4A Reroute lifecycle (normative)
`Use different route` must:
1. create a new `OperationEnvelope`
2. reference the previous `route_trace_id`
3. include `requested_route_hint`
4. produce a new route trace rather than mutating the old one in place
5. emit `route.override.requested` and `route.override.resolved`

A reroute control that only toggles local UI state is non-compliant.
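
The five lifecycle steps can be sketched as a pair of functions. The `OperationEnvelope` shape is not shown in this section, so `RerouteEnvelope` and its `previous_route_trace_id` field are hypothetical; only `requested_route_hint` and the two event names are taken from the list above.

```typescript
interface PriorOperation {
  operation_id: string;
  route_trace_id: string;
}

// Hypothetical minimal envelope for a reroute request.
interface RerouteEnvelope {
  operation_id: string;
  route_trace_id: string;
  previous_route_trace_id: string;
  requested_route_hint: string;
}

type RerouteEmitter = (eventName: string, routeTraceId: string) => void;

function requestReroute(
  prior: PriorOperation,
  hint: string,
  newIds: { operation_id: string; route_trace_id: string },
  emit: RerouteEmitter,
): RerouteEnvelope {
  // A reroute yields a new trace; the old trace is never mutated in place.
  if (newIds.route_trace_id === prior.route_trace_id) {
    throw new Error("reroute must produce a new route trace");
  }
  emit("route.override.requested", newIds.route_trace_id);
  return {
    operation_id: newIds.operation_id,
    route_trace_id: newIds.route_trace_id,
    previous_route_trace_id: prior.route_trace_id,
    requested_route_hint: hint,
  };
}

function resolveReroute(envelope: RerouteEnvelope, emit: RerouteEmitter): void {
  emit("route.override.resolved", envelope.route_trace_id);
}
```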

#### 14.4B `What can I do here?` Context Card
The Context Card must be deterministic and mode-aware. It should show:
- current page/context label
- top relevant capabilities or actions
- unavailable actions with short reason codes
- suggested next steps
- buttons that emit structured `OperationEnvelope`s or canonical owner-doc commands

The Context Card is required for at minimum:
- chat thread surface
- task detail
- project detail
- capability/skills page
- engineering panel debug view

It must render from `WhatCanIDoHereCardSchema`, not ad hoc per-page logic.

#### 14.4C Route Alias manager
Q must expose a Route Alias manager that lets the user:
- view alias seeds
- see hit/misfire counts
- enable/disable aliases
- edit phrase/target/scope
- inspect alias provenance from route traces or learning suggestions

Alias activation must never be hidden from the user in baseline deployments.
Alias management must include real persist/list/update/delete round-trips.

### 14.5 Running Jobs view
The Engineering Panel must expose a running jobs table with at least:
- discovery runs
- synthesis/proposal jobs
- repair jobs
- shadow analysis jobs
- Gateway chat jobs when an active runtime session is being tracked

Each row must include:
- job id
- type
- state
- started_at
- mode
- terminate/cancel control if allowed
- abort state
- route_trace_id
- owner component
- session / stream identity where present
- provisional usage / cost rollup when available

Normative:
- rows must be backed by `RunningJobSchema`
- a terminate button may only render when policy allows and the job declares whether abort is supported
- if abort is unsupported, the row must say so explicitly rather than pretending the control exists
- rows must persist or reconcile across EC restart according to section 4.10A1
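
The terminate-control rules can be sketched as a three-way decision. `RunningJobSchema` is defined elsewhere, so the `abort_supported` field name, the `terminateControlFor` helper, and the notice text are assumptions of this sketch.

```typescript
type TerminateControl =
  | { kind: "button" }
  | { kind: "unsupported_notice"; text: string }
  | { kind: "none" };

interface JobAbortInfo {
  abort_supported?: boolean; // undefined means the job has not declared abort support
}

function terminateControlFor(job: JobAbortInfo, policyAllowsTerminate: boolean): TerminateControl {
  // Unsupported abort is stated explicitly rather than hidden.
  if (job.abort_supported === false) {
    return { kind: "unsupported_notice", text: "Abort is not supported for this job" };
  }
  // The button renders only when policy allows and the job declared support.
  if (job.abort_supported === true && policyAllowsTerminate) {
    return { kind: "button" };
  }
  return { kind: "none" };
}
```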

### 14.6 Unified Inbox canonical container schema

```ts
const InboxItemSchema = z.object({
  item_id: z.string().max(160),
  item_kind: z.enum([
    "memory_candidate",
    "memory_promotion",
    "learning_candidate",
    "rule_candidate",
    "repair_offer",
    "capability_proposal",
    "discovery_result",
    "improvement_proposal",
    "system_alert",
  ]),
  title: z.string().max(200),
  summary: z.string().max(400),
  status: z.enum(["open", "applying", "applied", "failed", "dismissed"]),
  owner_doc: z.enum(["DOC1", "DOC8", "DOC9", "DOC10", "DOC11", "DOC3"]),
  artifact_origin: ArtifactOriginSchema.optional(),
  created_at: z.string(),
  updated_at: z.string(),
  schema_version: z.literal(1),
});
```

Legacy decision-card-like surfaces may be served through a compatibility adapter over this canonical container.

Compatibility note:
- if existing implementation still uses `PendingItem` or another pre-Inbox shape, a compatibility adapter is allowed;
- the adapter must be explicit and temporary, not silent schema drift.

### 14.7 Memory and self-learning controls
Q must eventually support wired controls for:
- promote memory
- demote memory
- edit memory payload
- delete or archive memory per owner-doc rules
- approve/reject memory proposal
- inspect why a memory/self-learning artifact occurred
- inspect origin route/trace
- revert eligible self-learning changes or recommendations through owner-doc commands

DOC10 does not own the mutation semantics, but it must require the routing/telemetry/provenance plumbing that makes these controls possible.

### 14.8 Tooltips and help text
Modes, route controls, discovery/teach buttons, and engineering-panel controls must include explanatory text sufficient for a normal user to understand what is enabled/disabled and why.

### 14.9 Route Trace viewer and orchestration log
Q must provide a route-trace/orchestration log surface that shows at minimum:
- operation id
- selected route
- executed route
- consulted DLA yes/no
- route reason codes
- gateway session / capability watermark when relevant
- linked artifact origin refs
- user override history where present
- related running job rows
- linked transaction manifest items
- visible degrade / abort / fallback reasons where relevant
- provisional usage / cost summary when present

`View Origin Trace` should render at minimum:
- route_trace_id
- created_at / updated_at
- reason codes
- selected vs executed route
- handler fallbacks
- downstream owner-doc commands issued
- resulting artifact refs
- reverse Gateway events for the same correlation id

### 14.10 Learning receipts and completion notices
Q must show compact completion notices for:
- discovery jobs
- synthesis jobs
- repair-adjacent learning jobs
- nightly optimization suggestions
- gateway-routed jobs that produced notable learned outcomes

These notices must link to:
- related route trace
- related artifact(s)
- next recommended action

### 14.11 Telemetry event families that must be visible somewhere in Q

Additional required telemetry families (non-exhaustive; see the wiring audit in section 14.13):
- `operation.intake`
- `operation.mode_resolved`
- `route.handler_dispatched`
- `dispatch.completed`
- `dispatch.failed`
- `dispatch.fallback_invoked`
- `route.gateway_handoff`
- `gateway.reverse_event_received`
- `gateway.abort.acknowledged`
- `gateway.abort.timeout`
- `proxy_mutation.requested`
- `proxy_mutation.completed`
- `proxy_mutation.failed`
- `memory.injection_selected`
- `memory.injection_trimmed`
- `memory.injection_skipped`
- `memory.injection_deduped`
- `learning.signal_created`
- `learning.applied`
- `learning.reverted`
- `repair.wake_requested`
- `repair.session.completed`
- `repair.duplicate_suppressed`
- `capability.inventory_snapshot_built`
- `capability.bridge_missing`
- `capability.bridge.refresh.failed`
- `capability.index_rebuilt`
- `capability.health.changed`
- `alias.matched`
- `alias.misfire_detected`
- `job.started`
- `job.terminated`
- `job.cleanup_completed`
- `advisor.skipped_due_to_mode`
- `advisor.skipped_due_to_budget`
- `agent_message.failed`
- `hook.consumer_failed`
- `ui.control_invoked`
- `ui.control.failed`
- `context_authority.violation_detected`

At minimum, Q must have drill-down visibility for:
- `operation.intake`
- `operation.mode_resolved`
- `operation.path_selected`
- `decision.packet_built`
- `decision.packet_trimmed`
- `decision.context_insufficient`
- `route.alias_matched`
- `route.blocked`
- `route.gateway_handoff`
- `gateway.chat.*`
- `gateway.tool.*`
- `gateway.abort.*`
- `capability.index_rebuilt`
- `capability.bridge_missing`
- `memory.candidate_created`
- `learning.signal_created`
- `repair.wake_requested`
- `mode.changed`
- `advisor.consulted`
- `ui.control_invoked`

The Engineering Panel may aggregate or filter these, but they must exist as observable telemetry.
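
As an illustration of how an audit might check that required families are observable, here is a minimal sketch. The wildcard semantics are an assumption: this document lists entries such as `gateway.chat.*` without defining how they match, so the sketch treats a trailing `.*` as "any event under that prefix". The function names are hypothetical.

```ts
// Returns true when eventName belongs to the family pattern.
// Assumption: a trailing ".*" matches any event under that prefix;
// every other pattern must match exactly.
function matchesTelemetryFamily(pattern: string, eventName: string): boolean {
  if (pattern.endsWith(".*")) {
    const prefix = pattern.slice(0, -1); // keeps the trailing dot, e.g. "gateway.abort."
    return eventName.startsWith(prefix) && eventName.length > prefix.length;
  }
  return pattern === eventName;
}

// Lists required families for which no observed event was found.
function missingTelemetryFamilies(required: string[], observed: string[]): string[] {
  return required.filter(
    (pattern) => !observed.some((eventName) => matchesTelemetryFamily(pattern, eventName)),
  );
}
```

A check like this could run in a wiring-audit test against a recorded telemetry sample; an empty result would indicate that every required family was seen at least once.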

### 14.12 Q proxy endpoint and owner-doc command mapping table

#### 14.12A Proxy request schemas (minimum exact set)

Each proxy must have an explicit JSON schema, stored in the Q API layer and mirrored in EC validation.

```ts
const MemoryPromoteRequestSchema = z.object({
  memory_ref: z.string().max(200),
  source_inbox_item_id: z.string().max(160).optional(),
  route_trace_id: z.string().max(160).optional(),
  schema_version: z.literal(1),
});

const MemoryDemoteRequestSchema = z.object({
  memory_ref: z.string().max(200),
  reason_code: z.string().max(80).optional(),
  schema_version: z.literal(1),
});

const MemoryEditPatchOperationSchema = z.discriminatedUnion("op", [
  z.object({
    op: z.literal("replace"),
    path: z.string().max(200),
    value: z.any(),
  }),
  z.object({
    op: z.literal("remove"),
    path: z.string().max(200),
  }),
  z.object({
    op: z.literal("add"),
    path: z.string().max(200),
    value: z.any(),
  }),
]);

const MemoryEditRequestSchema = z.object({
  memory_ref: z.string().max(200),
  patch: z.array(MemoryEditPatchOperationSchema).max(30),
  schema_version: z.literal(1),
});

const MemoryArchiveOrDeleteRequestSchema = z.object({
  memory_ref: z.string().max(200),
  action: z.enum(["archive", "delete"]),
  schema_version: z.literal(1),
});

const MemoryOriginInspectRequestSchema = z.object({
  memory_ref: z.string().max(200),
  schema_version: z.literal(1),
});

const RouteOverrideRequestSchema = z.object({
  prior_route_trace_id: z.string().max(160),
  requested_route_hint: z.string().max(120),
  reason_code: z.string().max(80).optional(),
  schema_version: z.literal(1),
});

const RouteBlocklistAddRequestSchema = z.object({
  route_id: z.string().max(160),
  scope: z.enum(["workspace", "session"]),
  reason_code: z.string().max(80),
  schema_version: z.literal(1),
});

const RouteBlocklistRemoveRequestSchema = z.object({
  route_id: z.string().max(160),
  scope: z.enum(["workspace", "session"]),
  schema_version: z.literal(1),
});

const AliasCreateOrUpdateRequestSchema = z.object({
  alias_id: z.string().max(160).optional(),
  phrase: z.string().max(160),
  target_kind: z.enum(["entity_route", "gateway_route", "ui_command"]),
  target_id: z.string().max(160),
  required_surface: z.array(z.string().max(80)).max(8).default([]),
  required_mode: z.array(OrchestrationModeSchema).max(6).default([]),
  enabled: z.boolean().default(true),
  schema_version: z.literal(1),
});

const AliasEnableDisableRequestSchema = z.object({
  alias_id: z.string().max(160),
  enabled: z.boolean(),
  schema_version: z.literal(1),
});

const JobTerminateRequestSchema = z.object({
  job_id: z.string().max(160),
  reason: z.string().max(200).optional(),
  schema_version: z.literal(1),
});

const RepairWakeRequestSchema = z.object({
  target_ref: z.string().max(200),
  reason_code: z.string().max(80),
  route_trace_id: z.string().max(160).optional(),
  schema_version: z.literal(1),
});

const LearningAcceptRejectRevertRequestSchema = z.object({
  artifact_ref: z.string().max(200),
  action: z.enum(["accept", "reject", "revert"]),
  schema_version: z.literal(1),
});

const ModeSetRequestSchema = z.object({
  requested_mode: OrchestrationModeSchema,
  schema_version: z.literal(1),
});
```

Normative: a UI control may not ship unless its proxy schema exists and is validated by EC.

#### 14.12B Required mapping rows (normative)

| UI surface + control | Proxy endpoint | EC handler | Owner-doc command / path | Durable paths mutated | Telemetry emitted | User-visible degraded behavior |
|---|---|---|---|---|---|---|
| Memory card -> Promote | `POST /api/orchestration/memory/promote` | `system_control_handler` | DOC1 promote memory command | DOC1-owned | `ui.control_invoked`, `proxy_mutation.*` | disable button + error toast |
| Memory card -> Demote | `POST /api/orchestration/memory/demote` | `system_control_handler` | DOC1 demote memory command | DOC1-owned | `ui.control_invoked`, `proxy_mutation.*` | disable button + error toast |
| Memory card -> Edit | `POST /api/orchestration/memory/edit` | `system_control_handler` | DOC1 edit memory command | DOC1-owned | `ui.control_invoked`, `proxy_mutation.*` | form error with no local pretend update |
| Memory card -> Archive/Delete | `POST /api/orchestration/memory/archive-delete` | `system_control_handler` | DOC1 archive/delete | DOC1-owned | `ui.control_invoked`, `proxy_mutation.*` | explicit refusal if owner doc blocks delete |
| Memory / artifact -> View Origin | `POST /api/orchestration/origin/inspect` | `system_control_handler` | DOC10 trace/origin lookup | none | `ui.control_invoked`, `origin.viewed` | show unavailable state |
| Route banner -> Use different route | `POST /api/orchestration/route/override` | `system_control_handler` | DOC10 reroute lifecycle | DOC10 route traces | `route.override.*`, `ui.control_invoked` | route remains unchanged with visible reason |
| Route settings -> Block route | `POST /api/orchestration/route/blocklist/add` | `system_control_handler` | DOC10 route blocklist add | DOC10 blocklist stores | `proxy_mutation.*`, `route.blocked` | visible refusal |
| Route settings -> Unblock route | `POST /api/orchestration/route/blocklist/remove` | `system_control_handler` | DOC10 route blocklist remove | DOC10 blocklist stores | `proxy_mutation.*` | visible refusal |
| Alias Manager -> Upsert alias | `POST /api/orchestration/aliases/upsert` | `system_control_handler` | DOC10 alias upsert | DOC10 alias stores | `proxy_mutation.*`, `alias.updated` | form error |
| Alias Manager -> Enable/disable | `POST /api/orchestration/aliases/set-enabled` | `system_control_handler` | DOC10 alias enable toggle | DOC10 alias stores | `proxy_mutation.*`, `alias.updated` | toggle snaps back from real state |
| Running Jobs -> Terminate | `POST /api/orchestration/jobs/terminate` | `system_control_handler` | DOC10 abort cascade or owner-doc abort | running jobs state + owner-doc job state | `job.abort_requested`, `job.abort_*` | explicit unsupported/degraded state |
| Repair card -> Wake repair | `POST /api/orchestration/repair/wake` | `system_control_handler` | DOC9 repair wake command | DOC9-owned | `repair.wake_requested`, `proxy_mutation.*` | visible refusal |
| Inbox / learning item -> Accept/Reject/Revert | `POST /api/orchestration/learning/decision` | `system_control_handler` | DOC8/DOC9/DOC3 owner-doc decision path | owner-doc owned | `learning.*`, `proxy_mutation.*` | explicit refusal |
| Mode switcher | `POST /api/orchestration/mode/set` | `system_control_handler` | DOC10 mode service | DOC10 mode state | `mode.changed`, `ui.control_invoked` | old mode remains visible |

#### 14.12C Required read endpoints and canonical response schemas

```ts
const OrchestrationStateResponseSchema = z.object({
  effective_mode: EffectiveModeStateSchema,
  stop_state: STOPStateSchema,
  system_pulse: SystemPulseSchema.optional(),
  schema_version: z.literal(1),
});

const RunningJobsResponseSchema = z.object({
  jobs: z.array(RunningJobSchema).max(500),
  updated_at: z.string(),
  schema_version: z.literal(1),
});

const RouteTraceResponseSchema = z.object({
  trace: RouteTraceRecordSchema,
  manifest: TransactionManifestSchema.optional(),
  executed_behavior: ExecutedBehaviorBundleSchema.optional(),
  schema_version: z.literal(1),
});

const SystemAgentsResponseSchema = z.object({
  agents: z.array(SystemAgentSchema).max(100),
  directory: SystemAgentDirectorySchema.optional(),
  schema_version: z.literal(1),
});

const ContextCardResponseSchema = z.object({
  card: WhatCanIDoHereCardSchema,
  schema_version: z.literal(1),
});

const CapabilitySnapshotResponseSchema = z.object({
  snapshot: CapabilityAwarenessSnapshotSchema,
  schema_version: z.literal(1),
});
```

Required endpoints:
- `GET /api/orchestration/state`
- `GET /api/orchestration/jobs`
- `GET /api/orchestration/traces/:routeTraceId`
- `GET /api/orchestration/system-agents`
- `GET /api/orchestration/context-card`
- `GET /api/orchestration/capabilities/snapshot`

#### 14.12D Owner-doc result / refusal contract
Every proxy-driven mutation into another owner doc must resolve to one of:
- `OwnerDocProxyResultSchema` with `status = "applied"`
- `OwnerDocProxyResultSchema` with `status = "refused"`
- structured failure with `DispatchErrorSchema`

A Q control without a documented backend command/proxy mapping is non-compliant.
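
The three-way contract above might be consumed on the Q side along these lines. The field names on the result and error objects are hypothetical, because `OwnerDocProxyResultSchema` and `DispatchErrorSchema` are defined elsewhere in DOC10; only the `applied` and `refused` status values come from this section.

```ts
// Hypothetical shapes: only the status values are taken from this section.
type OwnerDocProxyResult =
  | { status: "applied"; owner_doc: string }
  | { status: "refused"; owner_doc: string; refusal_reason: string };

type DispatchError = { error_code: string };

type ProxyOutcome =
  | { kind: "owner_doc_result"; result: OwnerDocProxyResult }
  | { kind: "dispatch_error"; error: DispatchError };

// What the UI should do: never pretend a change happened unless it was applied.
type UiReaction = { state_changed: boolean; message: string | null };

function toUiReaction(outcome: ProxyOutcome): UiReaction {
  if (outcome.kind === "dispatch_error") {
    return { state_changed: false, message: "Failed: " + outcome.error.error_code };
  }
  const result = outcome.result;
  if (result.status === "refused") {
    return {
      state_changed: false,
      message: "Refused by " + result.owner_doc + ": " + result.refusal_reason,
    };
  }
  return { state_changed: true, message: null };
}
```

Modeling the outcome as a discriminated union means a fourth, undocumented outcome cannot be handled silently: the compiler forces each branch to be addressed.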

### 14.13 Wiring audit requirement
Before shipping any new surface, implementation must confirm:
1. UI control emits telemetry (`ui.control_invoked`)
2. an `OperationEnvelope` or canonical command is generated
3. a backend handler receives it
4. the owner-doc durable store changes or explicitly refuses
5. UI refreshes from real state
6. any expected stream/push channel is subscribed before the UI claims live state

## 15) Shared infrastructure, module plan, and integration notes

### 15.1 Required new or extended modules
- `apps/ec-service/src/orchestration/orchestration-modes.ts`
- `apps/ec-service/src/orchestration/mode-enforcer.ts`
- `apps/ec-service/src/orchestration/context-authority.ts`
- `apps/ec-service/src/orchestration/intent.ts`
- `apps/ec-service/src/orchestration/local-intent-index.ts`
- `apps/ec-service/src/orchestration/capability-registry-facade.ts`
- `apps/ec-service/src/orchestration/decision-context-builder.ts`
- `apps/ec-service/src/orchestration/decision-broker.ts`
- `apps/ec-service/src/orchestration/event-intake.ts`
- `apps/ec-service/src/orchestration/event-accumulator.ts`
- `apps/ec-service/src/orchestration/entity-discovery.ts`
- `apps/ec-service/src/orchestration/dispatch.ts`
- `apps/ec-service/src/orchestration/dispatch-abort.ts`
- `apps/ec-service/src/orchestration/running-jobs.ts`
- `apps/ec-service/src/orchestration/system-agent-bus.ts`
- `apps/ec-service/src/orchestration/route-trace.ts`
- `apps/ec-service/src/orchestration/transaction-manifest.ts`
- `apps/ec-service/src/orchestration/orchestration-read-models.ts`
- `apps/ec-service/src/orchestration/hook-dispatcher.ts`
- `apps/ec-service/src/orchestration/gateway-watchdog.ts`
- `apps/ec-service/src/orchestration/telemetry-retention.ts`
- `apps/q-dashboard/src/api/orchestration-proxy.ts`
- `apps/q-dashboard/src/api/orchestration-stream.ts`
- `apps/ec-service/src/orchestration/legacy-adapter.ts`
- gateway/chat integration modules per DOC11

### 15.2 Durable paths
Canonical orchestration storage root:
- `ELNOR_MEMORY/system/orchestration/`

Required current-state/read-model artifacts:
- `ELNOR_MEMORY/system/orchestration/mode_state.json`
- `ELNOR_MEMORY/system/orchestration/route_trace_recent.json`
- `ELNOR_MEMORY/system/orchestration/running_jobs.json`
- `ELNOR_MEMORY/system/orchestration/system_pulse_current.json`
- `ELNOR_MEMORY/system/orchestration/telemetry_retention_state.json`

Required append-only evidence artifacts:
- `ELNOR_MEMORY/system/orchestration/route_traces.jsonl`
- `ELNOR_MEMORY/system/orchestration/route_trace_manifest.jsonl`
- `ELNOR_MEMORY/system/orchestration/route_hints.jsonl`
- `ELNOR_MEMORY/system/orchestration/route_blocklist.jsonl`
- `ELNOR_MEMORY/system/orchestration/route_aliases.jsonl`

Required current-state route artifacts:
- `ELNOR_MEMORY/system/orchestration/route_hints_current.json`
- `ELNOR_MEMORY/system/orchestration/route_blocklist_current.json`
- `ELNOR_MEMORY/system/orchestration/route_aliases_current.json`

Other required current-state artifacts:
- `ELNOR_MEMORY/system/capabilities/bridge_entries_current.json`
- `ELNOR_MEMORY/system/capabilities/capability_snapshot_current.json`
- `ELNOR_MEMORY/system/agents/system_agents_registry.json`
- `ELNOR_MEMORY/system/agents/system_agent_directory.json`
- `ELNOR_MEMORY/system/agents/system_agent_events.jsonl`
- `ELNOR_MEMORY/system/stop_request.json`

Canonicalization rule:
- if current code uses multiple orchestration-like roots or historical parallel files, implementation may temporarily mirror them, but the canonical read-model surfaces must converge on the paths above.

### 15.3 Data ownership notes
- route hints are DOC10-owned
- memory lifecycle remains DOC1-owned
- capability artifacts remain DOC3-owned
- repair execution remains DOC9-owned
- OpenClaw-native runtime/session/model control semantics remain OpenClaw/DOC11-owned

### 15.4 End-to-end data flow (normative reference)

This diagram is intentionally simple. It exists to prevent features that are described but never wired.

```text
[Q UI]
  | (OperationEnvelope / exact proxy payload)
  v
[EC handleOperationEnvelope]
  |-> operation.intake / mode_resolved telemetry
  |-> build DecisionPacket (routing facts only)
  |-> build MemoryInjectionPlan (bounded refs/hints only)
  |-> deterministicRoute / optional DLA (mode-gated)
  |-> write RouteTrace + transaction manifest seed
  v
[Dispatch]
  |-> gateway_interactive_chat (most free-text)
  |     |-> build GatewayHandoffPayload
  |     |-> DOC11 assembles lean annotations from packet + refs
  |     v
  |   [OpenClaw Gateway Runtime]
  |     |-> native tool/skill/runtime loop
  |     |-> emits gateway.* reverse events with correlation_id
  |     v
  |   [EC Reverse Telemetry Bridge]
  |     |-> updates RouteTrace executed fields
  |     |-> updates Running Jobs
  |     |-> updates ExecutedBehaviorBundle
  |     |-> publishes stream events / updates read-models
  |     |-> emits learning/repair/memory signals as needed
  |     v
  |   [Q Route Banner / Tool Panel / Engineering Panel]
  |
  |-> entity_router / system_control_handler / proposal_creator (structured actions)
  |-> discovery_runner / repair_runner (jobs -> completion receipts)
  v
[Owner-doc pipelines]
  |-> DOC1 memory
  |-> DOC8 learning/friction
  |-> DOC9 repair
  |-> DOC3 capability artifacts
```

### 15.5 Defaults and constants (normative starting points)

These constants exist so that coding agents do not invent their own values. Tune them later, but do not omit them.

- `SYSTEM_AGENT_HEARTBEAT_MS = 15000`
- `SYSTEM_AGENT_OFFLINE_AFTER_MS = 60000`
- `SYSTEM_AGENT_SELFTEST_STARTUP_TIMEOUT_MS = 5000`
- `SYSTEM_AGENT_QUERY_MAX_DEPTH = 3`

- `LEAN_EC_ANNOTATIONS_TOKEN_CAP_BASELINE = 700`
- `LEAN_EC_ANNOTATIONS_TOKEN_CAP_ASSISTED = 1200`
- `LEAN_EC_ANNOTATIONS_TOKEN_CAP_DISCOVERY = 1500`
- `DECISION_PACKET_TOKEN_CAP = 3500`

- `MEMORY_INJECTION_TOKEN_BUDGET_CHAT = 250`
- `MEMORY_INJECTION_TOKEN_BUDGET_TASK = 450`
- `MEMORY_INJECTION_MAX_REFS_CHAT = 6`
- `MEMORY_INJECTION_MAX_REFS_TASK = 8`
- `BRIEF_EXCERPT_MAX_TOKENS = 400`
- `CTX_EXPAND_MAX_INJECTED_TOKENS = 600`

- `DOC10_CHAT_CONTEXT_CAP_BASELINE = 700`
- `DOC10_CHAT_CONTEXT_CAP_ASSISTED = 1200`
- `DOC10_CHAT_CONTEXT_CAP_DISCOVERY = 1500`

- `CONTEXTREF_DECAY_HALFLIFE_HOURS = 24`
- `OCM_CACHED_BRIEF_TTL_MS = 3600000`
- `CAPABILITY_AWARENESS_SNAPSHOT_TTL_MS = 300000`

- `ADVISOR_BUDGET_MAX_CALLS_BASELINE = 0`
- `ADVISOR_BUDGET_MAX_CALLS_ASSISTED = 1`
- `ADVISOR_BUDGET_MAX_CALLS_SHADOW = 1`

- `GATEWAY_HANDOFF_BUDGET_MS = 250`
- `DISPATCH_SYNC_BUDGET_MS = 2000`
- `GATEWAY_ABORT_TIMEOUT_MS = 8000`
- `GATEWAY_WATCHDOG_TIMEOUT_MS = 12000`

- `EVENT_ACCUMULATOR_MAX_HOLD_MS = 5000`
- `EVENT_ACCUMULATOR_CRITICAL_FLUSH_MS = 0`

- `ALIAS_MATCH_PHASE1 = "exact_normalized_only"`
- `ALIAS_REQUIRES_USER_ENABLE = true`

- `ROUTE_TRACE_JSONL_RETENTION_DAYS = 30`
- `SYSTEM_AGENT_EVENTS_JSONL_RETENTION_DAYS = 30`
- `GATEWAY_REVERSE_EVENT_RETENTION_DAYS = 14`
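
A few of these constants could be applied as in the sketch below. The constant values are copied from the list above; the helper function names, the grouping of the three annotation caps into one object, and the strict "greater than" offline comparison are assumptions made for illustration.

```ts
const SYSTEM_AGENT_HEARTBEAT_MS = 15000;
const SYSTEM_AGENT_OFFLINE_AFTER_MS = 60000;

// The three LEAN_EC_ANNOTATIONS_TOKEN_CAP_* values, grouped by mode (grouping is an assumption).
const LEAN_EC_ANNOTATIONS_TOKEN_CAP = {
  baseline: 700,
  assisted: 1200,
  discovery: 1500,
} as const;
type CapMode = keyof typeof LEAN_EC_ANNOTATIONS_TOKEN_CAP;

// Number of whole heartbeat intervals elapsed since the last heartbeat.
function missedHeartbeats(lastHeartbeatMs: number, nowMs: number): number {
  return Math.max(0, Math.floor((nowMs - lastHeartbeatMs) / SYSTEM_AGENT_HEARTBEAT_MS));
}

// Assumption: an agent is offline once strictly more than the threshold has elapsed.
function isAgentOffline(lastHeartbeatMs: number, nowMs: number): boolean {
  return nowMs - lastHeartbeatMs > SYSTEM_AGENT_OFFLINE_AFTER_MS;
}

// Clamps a requested annotation size to the cap for the given mode.
function clampAnnotationTokens(mode: CapMode, requestedTokens: number): number {
  return Math.min(requestedTokens, LEAN_EC_ANNOTATIONS_TOKEN_CAP[mode]);
}

// Oldest timestamp still inside a retention window expressed in days.
function retentionCutoffMs(nowMs: number, retentionDays: number): number {
  return nowMs - retentionDays * 24 * 60 * 60 * 1000;
}
```

With the listed defaults, the offline threshold corresponds to four heartbeat intervals, which leaves room for a few dropped heartbeats before an agent is marked offline.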

### 15.6 Coexistence and migration notes (normative)
These notes exist because the current codebase is not a clean-sheet implementation.

Required migration posture:
- legacy `processCommand()` and chat routes may coexist temporarily, but they must adapt into `OperationEnvelopeSchema` before DOC10 routing/dispatch logic;
- existing golden-turn or lightweight intent classifiers may remain temporarily, but DOC10 must declare whether they feed Tier 1 classification or are being replaced;
- existing PendingItem-like inbox precursors may remain temporarily behind an explicit compatibility adapter;
- EventBus / hook infrastructure may be upgraded incrementally, but required hook families must declare their degraded posture if current infrastructure is insufficient;
- route traces, chat history, running jobs, and transaction manifests must coordinate writes so the same operation does not create inconsistent durable evidence;
- remote-write / durable-write permission enforcement in current code must interoperate with DOC10 decisions rather than being silently bypassed;
- physical file/module paths in the current repo may differ from the idealized module plan above; canonical semantics win, but implementation appendices must name the real code locations.

## 16) Acceptance tests

### 16.1 Fast-path and mode enforcement
1. Baseline mode never calls DLA.
2. Assisted mode calls DLA only when guard conditions pass.
3. External-bypass mode preserves direct/native OpenClaw use.
4. Auto-degrade to baseline visibly clamps advanced behavior.
5. `shouldDefaultToGatewayFirstChat()` is the only gateway-first default gate.
6. `EffectiveModeStateSchema` correctly reflects degrade overlays without mutating the persisted mode file.
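
Tests 1, 2, 4, and 6 could be exercised against something like the sketch below. The field names, the two-mode union, and the rule that any active degrade reason clamps to baseline are assumptions; the real `EffectiveModeStateSchema` is defined elsewhere in DOC10.

```ts
// Simplified to two modes for illustration; the full mode set lives in OrchestrationModeSchema.
type Mode = "baseline" | "assisted";

// Stand-in for the persisted mode file; read-only so the overlay cannot mutate it.
type PersistedModeFile = Readonly<{ mode: Mode; updated_at: string }>;

// Hypothetical field names for the effective-mode read model.
type EffectiveModeState = {
  persisted_mode: Mode;
  effective_mode: Mode;
  degrade_reasons: string[];
};

// Computes the effective mode as an overlay; the persisted file is only read.
function resolveEffectiveMode(
  persisted: PersistedModeFile,
  activeDegradeReasons: string[],
): EffectiveModeState {
  const degraded = activeDegradeReasons.length > 0;
  return {
    persisted_mode: persisted.mode,
    effective_mode: degraded ? "baseline" : persisted.mode,
    degrade_reasons: activeDegradeReasons.slice(),
  };
}

// Baseline never consults DLA; assisted does so only when guard conditions pass.
function mayConsultDla(state: EffectiveModeState, guardsPass: boolean): boolean {
  return state.effective_mode === "assisted" && guardsPass;
}
```

Keeping the degrade overlay in a derived object means the user's chosen mode comes back automatically once the degrade reasons clear, without a second write to the mode file.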

### 16.2 Routing correctness
7. Entity lookup routes to Entity Router.
8. Capability execution with healthy route does not require DLA.
9. Route blocklist excludes blocked route on next attempt.
10. Gateway-first chat path is selected for Q interactive chat except explicit bypass/degraded fallback.
11. Alias hits for OpenClaw-native work still route through Gateway with hints.
12. Legacy non-envelope submissions adapt into `OperationEnvelopeSchema` before routing.

### 16.3 Context authority and memory discipline
13. Gateway-first chat does not run the full Running Brief prompt assembler.
14. DOC10 chat contribution never exceeds section 5.11F caps.
15. Repeated memory hints across adjacent turns are deduped.
16. OpenClaw-native memory is not duplicated by DOC10 when metadata allows detection.
17. `context_authority.violation_detected` is emitted when a non-compliant second assembler is introduced.

### 16.4 Event storm resistance
18. 5,000 raw file/system events do not create 5,000 decision packets.
19. Accumulator forced flush works under continuous streams.
20. DOC8 still receives friction-worthy normalized events.
21. Shared normalized dedupe keys prevent duplicate DOC8/DOC9 proposal fan-out.
22. If hook infrastructure is degraded, degraded state is surfaced instead of silently dropping required hooks.
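
Tests 18 and 19 describe coalescing behavior that could look roughly like this. The class, the event and batch field names, and the choice to measure hold time from the first event in a batch are assumptions; the two timing values mirror `EVENT_ACCUMULATOR_MAX_HOLD_MS` and `EVENT_ACCUMULATOR_CRITICAL_FLUSH_MS` from section 15.5.

```ts
const EVENT_ACCUMULATOR_MAX_HOLD_MS = 5000;

// Hypothetical shapes for raw input and coalesced output.
type RawEvent = { dedupe_key: string; critical: boolean; at_ms: number };
type NormalizedBatch = {
  dedupe_key: string;
  count: number;
  first_at_ms: number;
  last_at_ms: number;
};

class EventAccumulator {
  private pending = new Map<string, NormalizedBatch>();

  // Critical events are returned immediately (flush delay 0); others are coalesced.
  add(ev: RawEvent): NormalizedBatch | null {
    if (ev.critical) {
      return { dedupe_key: ev.dedupe_key, count: 1, first_at_ms: ev.at_ms, last_at_ms: ev.at_ms };
    }
    const existing = this.pending.get(ev.dedupe_key);
    if (existing !== undefined) {
      existing.count += 1;
      existing.last_at_ms = ev.at_ms;
    } else {
      this.pending.set(ev.dedupe_key, {
        dedupe_key: ev.dedupe_key,
        count: 1,
        first_at_ms: ev.at_ms,
        last_at_ms: ev.at_ms,
      });
    }
    return null;
  }

  // Hold time is measured from the first event, so a continuous stream
  // cannot postpone the forced flush indefinitely.
  flushDue(nowMs: number): NormalizedBatch[] {
    const due: NormalizedBatch[] = [];
    this.pending.forEach((batch) => {
      if (nowMs - batch.first_at_ms >= EVENT_ACCUMULATOR_MAX_HOLD_MS) due.push(batch);
    });
    due.forEach((batch) => this.pending.delete(batch.dedupe_key));
    return due;
  }
}
```

Under this sketch, 5,000 events sharing one dedupe key yield a single batch with `count = 5000`, which is the kind of reduction test 18 expects before any decision packet is built.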

### 16.5 Traceability and provenance
23. Memory candidate created from orchestration path carries `ArtifactOriginSchema`.
24. Repair offer/proposal created from orchestration path carries `ArtifactOriginSchema`.
25. Q can render `View Origin Trace` when origin trace exists.
26. Trace manifest links owner-doc commands and resulting artifacts for a single route trace.
27. Transaction manifest never claims distributed rollback or revert-all semantics.

### 16.6 Capability bridge and awareness
28. Installing/updating/removing a DOC3 capability refreshes LocalIntentIndex.
29. Provisional `SKILL.md` scanner fallback populates bridge cache when DOC3 bridge export is absent.
30. Canonical DOC3 bridge entries replace provisional entries when available.
31. Quarantining a capability removes it from routing eligibility immediately.
32. `WhatCanIDoHereCard` visibly reflects bridge state and unavailable reasons.
33. Hot-reload failure marks capability cache stale and surfaces that state.
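
Tests 29 through 31 imply a merge rule between provisional scanner entries and canonical DOC3 bridge entries. A minimal sketch, assuming a hypothetical `BridgeEntry` shape keyed by `capability_id` (the real bridge entry schema is not shown in this section):

```ts
// Hypothetical bridge entry shape.
type BridgeEntry = {
  capability_id: string;
  source: "doc3_canonical" | "skill_md_scanner";
  quarantined: boolean;
};

// Canonical entries replace provisional entries that share a capability_id.
function mergeBridgeEntries(canonical: BridgeEntry[], provisional: BridgeEntry[]): BridgeEntry[] {
  const byId = new Map<string, BridgeEntry>();
  provisional.forEach((entry) => byId.set(entry.capability_id, entry));
  canonical.forEach((entry) => byId.set(entry.capability_id, entry)); // canonical wins
  const merged: BridgeEntry[] = [];
  byId.forEach((entry) => merged.push(entry));
  return merged;
}

// Quarantined capabilities are excluded from routing eligibility.
function routingEligible(entries: BridgeEntry[]): BridgeEntry[] {
  return entries.filter((entry) => !entry.quarantined);
}
```

Applying the quarantine filter at read time, rather than deleting entries, keeps the quarantined capability visible for surfaces such as `WhatCanIDoHereCard` that need to show why something is unavailable.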

### 16.7 OpenClaw preservation
34. Q chat through Gateway still reflects OpenClaw-native files/skills/runtime behavior under DOC11 contract.
35. EC annotations remain bounded and additive.
36. DOC10 does not require step-by-step supervision of OpenClaw runtime tool decisions.
37. Dual live execution is prevented.

### 16.8 Reverse Gateway telemetry and abort correctness
38. Route trace updates when Gateway starts/completes/fails/aborts.
39. Tool events from Gateway are visible in Q telemetry.
40. Gateway failure can trigger DOC8/DOC9 pathways through normalized events.
41. Terminate/cancel on a Gateway-backed job sends a real downstream abort request and waits for ack or timeout.
42. Abort timeout is visible in Running Jobs and Route Trace.
43. Cleanup failure is visible in Running Jobs and Route Trace.
44. Reverse telemetry transport outage yields explicit degraded-state UI, not false completion.
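
The ack-or-timeout rule in tests 41 and 42 can be expressed as a small pure function. The function name and the treatment of a late acknowledgement as a timeout are assumptions; the timeout value and the two telemetry names come from sections 15.5 and 14.11.

```ts
const GATEWAY_ABORT_TIMEOUT_MS = 8000;

type AbortTelemetry = "gateway.abort.acknowledged" | "gateway.abort.timeout";

// Decides which abort telemetry event (if any) applies at time nowMs.
// Returns null while the abort is still pending.
// Assumption: an ack that arrives after the deadline counts as a timeout.
function resolveAbortTelemetry(
  requestedAtMs: number,
  ackAtMs: number | null,
  nowMs: number,
): AbortTelemetry | null {
  const deadlineMs = requestedAtMs + GATEWAY_ABORT_TIMEOUT_MS;
  if (ackAtMs !== null && ackAtMs <= deadlineMs) return "gateway.abort.acknowledged";
  if (nowMs >= deadlineMs) return "gateway.abort.timeout";
  return null;
}
```

Keeping the decision pure (timestamps in, event name out) makes the timeout path testable without real timers; the caller remains responsible for actually sending the downstream abort request.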

### 16.9 UI wiring and provenance
45. Every memory/learning/repair control maps to a real backend command or owner-doc proxy.
46. `View Origin Trace` works for memory, learning, repair, and capability artifacts.
47. Route Alias manager edits persist and affect subsequent routing only after confirmation or saved change.
48. A Q control cannot ship without both `ui.control_invoked` telemetry and a real backend round-trip.
49. Read endpoints return the canonical response schemas in section 14.12C.

### 16.10 Persistence, restart, and migration
50. Running jobs reconcile correctly on EC restart.
51. STOP adapters converge on the canonical STOP read-model.
52. Existing legacy intake paths cannot bypass `handleOperationEnvelope`.
53. Existing PendingItem-like inbox data is either adapted or refused explicitly.
54. Route trace, chat history, and running job writes remain coherent for a single operation.

## 17) Phasing

### Phase 0 - guardrails and authority
- mode service
- single-authority matrix
- STOP canonicalization
- event accumulator + shared dedupe key
- canonical schemas for route decision / dispatch / route trace / artifact origin / running jobs / system pulse
- `EffectiveModeStateSchema`
- `OwnerDocProxyResultSchema`
- canonical read-model response pack

### Phase 1 - usable lean execution spine
- intake
- envelope/legacy adapter coexistence
- classifier tiers 1-3
- entity router / capability router
- route trace + transaction manifest seed
- gateway-first chat dispatch cross-link
- reverse Gateway telemetry bridge
- Gateway abort cascade
- Phase 1 capability-bridge fallback scanner
- basic Context Card and Route Trace viewer
- live stream / push path for running jobs and reverse telemetry

### Phase 2 - bounded advisor integration
- DLA consultation path
- cached OCM decision brief consumption
- engineering panel phase 0/1
- capability registry facade query path
- watchdog / dead-letter handling

### Phase 3 - capability bridge and learning foundation
- DOC3 canonical capability registry bridge
- quarantine / repair hooks
- running jobs view
- real alias management + misfire telemetry
- hook delivery hardening / queue upgrade if needed

### Phase 4 - stronger user-facing traceability and controls
- origin trace modal
- route override round-trip
- memory/self-learning provenance surfacing
- unified inbox compatibility adapter
- trace-scoped transaction manifest rendering
- cost / usage exposure via DOC13 canonical schemas when available

### Phase 5+ - preserved advanced ideas
- exploration execution loop
- teach mode
- workflow pattern detector
- capability synthesis pipeline
- richer shadow analysis
- deeper lab-mode experiments
- more advanced route tuning and analytics

## 18) Honest constraints and non-goals
- DOC10 does not make cloud LLM latency disappear.
- DOC10 does not turn every raw event into an intelligent action.
- DOC10 does not replace OpenClaw's native runtime intelligence.
- DOC10 does not own the full memory UI or memory mutation semantics.
- DOC10 does not guarantee that every proposed capability can be learned safely or automatically.
- DOC10 does not promise transport features that DOC11 / OpenClaw cannot actually emit; unsupported event families must be marked unsupported, not faked.

---

## 19) Build notes for Claude Code (do not skip)
1. Implement the mode service first.
2. Do not let the client authoritatively set mode.
3. Implement event intake/accumulator before wiring raw system events into routing.
4. Implement `ArtifactOriginSchema` before building more memory/proposal UI.
5. Implement `RouteDecisionSchema`, `DispatchResultSchema`, and `RouteTraceRecordSchema` before wiring route banners or intent cards.
6. Treat DOC11 as the owner of Gateway-first chat assembly and model/thinking/reasoning control specifics.
7. Do not invent OpenClaw protocol method names. Verify them against actual source/type definitions and use the real ones.
8. Preserve external-bypass/native OpenClaw use.
9. Keep orchestration additive and bounded. Do not smother OpenClaw with giant context injections or step-level supervision.
10. Do not route free-text Q chat around Gateway unless the request is an explicit Q/EC-owned structured action.
11. Do not let an exact alias for an OpenClaw-native task bypass Gateway; use it as a route hint only.
12. Do not implement dual live execution for the same request.
13. Do not ship a new UI control without a documented backend command or owner-doc proxy mapping.
14. Do not treat provisional SKILL.md scanner output as canonical once DOC3 bridge metadata exists.
15. Do not drop artifact-origin fields when proxying into owner-doc commands.
16. Do not silently suppress Gateway reverse telemetry; if the bridge is degraded, surface it.
17. Do not let DLA or other advisors become baseline hot-path dependencies.
18. Do not let legacy intake paths bypass `OperationEnvelopeSchema` once the orchestration spine is active.
19. Do not assume the current EventBus, inbox model, or capability registry shape already satisfy DOC10; add explicit adapters/migration notes where needed.
20. Do not invent repo paths in canonical behavior sections; use implementation appendices for physical file mapping.
21. Do not invent a separate cost language here. Use provisional usage fields only until DOC13 freezes the canonical shared model.


---

# Part 2 — Merged Revision — DOC10 v1.11.10 R10.1 (Retrieval Receipts and Topology Consumption)


# DOC10 — Unified Engagement, Orchestration, Gateway-Aware Decision Routing, Capability Learning, and System-Agent Management
## v1.11.10 R10.1 — Retrieval Receipts and Topology Consumption Alignment

**Date:** March 10, 2026  
**Status:** targeted revision draft — Wave C consumer alignment  
**Supersedes:** DOC10 v1.11.10 R10 only for the subjects covered here  
**Companion trackers:** DOC10 Orchestration Integration Ledger R9.1; DOC15 Cross-Document Integration Contract v1.1.1

---

## Why this revision exists

R10 already made DOC10 the orchestration owner for routing facts, bounded context exchange, route traces, user-visible controls, and provenance. Wave A and Wave B then clarified three missing owner seams:

1. retrieval lanes and provider truth belong to DOC3 / DOC18,
2. the broader graph/topology layer is a **derived read-model**, not a second truth store,
3. DOC10 is the main **consumer-side presenter and packager** for retrieval receipts, graph-aware expansion, and user-visible route explanation.

R10.1 therefore does **not** redefine storage ownership for canonical memory, DocIndex, LlamaIndex, or topology state. It adds the missing consumer rules so Q and EC can:

- preserve retrieval-lane truth inside route traces,
- render coherent provider receipts in Q,
- perform bounded relation-aware expansion without unbounded graph walks,
- resolve graph-aware document recommendations and support-pack suggestions,
- degrade honestly when topology snapshots are unavailable or stale.

---

## 0) Non-negotiable interpretation rules

### 0.1 DOC10 consumes retrieval truth; it does not invent it

DOC10 must consume `search_lane`, `provider_kind`, `route_reason`, corpus identity, freshness state, and degraded reasons from owner docs. DOC10 may normalize the information for route traces and UI, but may not fabricate provider truth.

### 0.2 DOC10 consumes topology read models; it does not own graph truth

DOC10 may call bounded relation/topology read seams and may present their outputs in route traces, search UIs, and context packaging. DOC10 may not:

- write canonical graph nodes or edges,
- silently modify relationship-index truth,
- run unbounded graph traversal in the hot path,
- infer contradiction or supersession as canonical truth without owner-doc support.

### 0.3 Hot-path boundedness still wins

Graph-aware retrieval does **not** justify giant context dumps. Route-time use must stay bounded, reference-first, and mode-aware. If topology expansion would exceed budget, DOC10 must preserve a receipt and fallback summary instead of silently inflating the prompt.
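
One way to honor this rule is to admit related refs only while they fit the budget and to record everything else in a receipt. The shapes, names, and the skip-and-continue policy below are assumptions, not a defined DOC10 contract.

```ts
// Hypothetical shapes for candidate refs and the expansion result.
type NeighborRef = { ref_id: string; est_tokens: number };

type ExpansionResult = {
  included: NeighborRef[];
  omitted_ref_ids: string[]; // preserved as a receipt instead of being dropped silently
  fallback_summary: string | null;
};

// Includes refs in the given (already ranked) order while they fit the token budget.
// Refs that do not fit are skipped and recorded; later, smaller refs may still fit.
function expandWithinBudget(neighbors: NeighborRef[], tokenBudget: number): ExpansionResult {
  const included: NeighborRef[] = [];
  const omitted: string[] = [];
  let usedTokens = 0;
  for (const ref of neighbors) {
    if (usedTokens + ref.est_tokens <= tokenBudget) {
      included.push(ref);
      usedTokens += ref.est_tokens;
    } else {
      omitted.push(ref.ref_id);
    }
  }
  const fallbackSummary =
    omitted.length > 0
      ? omitted.length + " related ref(s) omitted (budget " + tokenBudget + " tokens)"
      : null;
  return { included, omitted_ref_ids: omitted, fallback_summary: fallbackSummary };
}
```

The omitted ids give the route trace something concrete to show, so a user can see that expansion was truncated rather than assume the graph had nothing more to offer.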

### 0.4 Partial deployment must be honest

If topology snapshots, relation queries, or provider receipts are unavailable, DOC10 must render explicit degraded states such as:

- `no_topology_data_available`
- `topology_snapshot_stale`
- `provider_receipt_unavailable`
- `semantic_provider_degraded`

Never imply graph awareness or receipt truth that was not actually present.
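
The degraded states listed above could be derived from simple availability facts, as in this sketch. The input field names and the `maxSnapshotAgeMs` parameter are hypothetical; only the four state strings come from this section.

```ts
type DegradedState =
  | "no_topology_data_available"
  | "topology_snapshot_stale"
  | "provider_receipt_unavailable"
  | "semantic_provider_degraded";

// Hypothetical availability facts gathered from owner-doc read seams.
type ConsumerInputs = {
  topology_snapshot_built_at_ms: number | null; // null means no snapshot exists
  provider_receipt_present: boolean;
  semantic_provider_healthy: boolean;
};

// Returns every degraded state that applies; an empty array means nothing is degraded.
function collectDegradedStates(
  inputs: ConsumerInputs,
  nowMs: number,
  maxSnapshotAgeMs: number,
): DegradedState[] {
  const states: DegradedState[] = [];
  if (inputs.topology_snapshot_built_at_ms === null) {
    states.push("no_topology_data_available");
  } else if (nowMs - inputs.topology_snapshot_built_at_ms > maxSnapshotAgeMs) {
    states.push("topology_snapshot_stale");
  }
  if (!inputs.provider_receipt_present) states.push("provider_receipt_unavailable");
  if (!inputs.semantic_provider_healthy) states.push("semantic_provider_degraded");
  return states;
}
```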

---

## 1) Consumer architecture additions

### 1.1 Retrieval-lane consumption model

DOC10 consumes four retrieval lanes defined upstream:

1. **exact_live_lookup** — known-path or permission-truth retrieval
2. **semantic_corpus_search** — corpus-scoped semantic retrieval, including LlamaIndex sidecar use
3. **canonical_memory_search** — EC-owned memory retrieval / future `MemorySearchService`
4. **runtime_local_search** — OpenClaw-native local/runtime retrieval

DOC10 owns:

- lane selection display,
- route-trace preservation,
- user-visible receipt rendering,
- bounded expansion policy,
- handoff of selected refs/hints into context packaging.

DOC10 does **not** own provider internals or graph storage.

### 1.2 Topology-aware consumption model

The topology layer is consumed in three bounded ways:

1. **neighbor expansion** — add nearby related refs when explicitly requested and still within budget,
2. **result explanation** — explain why an item matched, was grouped, was replaced, or was suppressed,
3. **document recommendation resolution** — turn relation-aware hints into support packs or grouped document suggestions.

### 1.3 Support-pack resolution model

A support pack is an **ephemeral resolved document grouping**, not a new canonical truth store.

DOC10 may resolve support packs from:

- CIL `document_priority_hints`,
- corpus/provider receipts,
- topology relation metadata,
- active review-target pin state.

DOC10 may present and load support packs, but durable learning about them belongs to DOC15 / related owner docs.

---

## 2) Contracts and schema extensions

### 2.1 Imported upstream contracts

R10.1 consumes the following upstream contracts by reference:

- `RetrievalProviderReceiptSchema` (DOC3 / DOC18)
- `SearchLaneSchema` (DOC3)
- `DocumentPriorityHintSchema` (DOC15)
- bounded topology / relationship read contracts (DOC1 / Core / DocIndex family)

DOC10 may extend route-trace and UI response schemas using these types. It must not fork them.

### 2.2 Route-trace retrieval slice

```ts
// packages/contracts/src/orchestration/retrieval-trace.ts
import { z } from "zod";
import {
  RetrievalProviderReceiptSchema,
  SearchLaneSchema,
} from "../retrieval/provider-receipts";

export const TopologyReasonCodeSchema = z.enum([
  "same_matter",
  "same_issue",
  "same_motion_type",
  "support_pack_member",
  "active_review_target_neighbor",
  "references_target",
  "supports_target",
  "contradicts_target",
  "supersedes_target",
  "derived_from_target",
  "fallback_no_topology_data",
]);

export const TopologyExpansionSummarySchema = z.object({
  used: z.boolean().default(false),
  degraded: z.boolean().default(false),
  degraded_reason: z.string().max(120).optional(),
  source_ref_id: z.string().max(200).optional(),
  source_node_id: z.string().max(200).optional(),
  neighbor_limit: z.number().int().min(0).max(20).default(0),
  included_neighbor_refs: z.array(z.string().max(200)).max(20).default([]),
  suppressed_neighbor_refs: z.array(z.object({
    ref_id: z.string().max(200),
    reason_code: TopologyReasonCodeSchema.or(z.string().max(120)),
  })).max(20).default([]),
  relation_types_used: z.array(z.string().max(120)).max(12).default([]),
});

export const RouteTraceRetrievalSliceSchema = z.object({
  query_kind: z.enum([
    "route_selection",
    "search_surface",
    "context_resolution",
    "document_recommendation",
    "support_pack_resolution",
  ]),
  selected_lane: SearchLaneSchema.optional(),
  selected_provider_kind: z.string().max(120).optional(),
  selected_corpus_id: z.string().max(200).optional(),
  route_reason: z.string().max(200).optional(),
  provider_receipts: z.array(RetrievalProviderReceiptSchema).max(12).default([]),
  topology_expansion: TopologyExpansionSummarySchema.optional(),
  support_pack_id: z.string().max(160).optional(),
  support_pack_member_ids: z.array(z.string().max(200)).max(20).default([]),
  created_at: z.string(),
  schema_version: z.literal(1),
});
```

### 2.3 Search result display receipt

```ts
export const SearchResultDisplayReceiptSchema = z.object({
  provider_kind: z.string().max(120),
  search_lane: SearchLaneSchema,
  corpus_id: z.string().max(200).optional(),
  route_reason: z.string().max(200).optional(),
  freshness_state: z.string().max(80).optional(),
  degraded_reason: z.string().max(160).optional(),
  topology_reason_codes: z.array(TopologyReasonCodeSchema).max(8).default([]),
  support_pack_id: z.string().max(160).optional(),
  why_this_matched: z.array(z.string().max(240)).max(6).default([]),
});
```

### 2.4 Document recommendation resolution schema

```ts
export const DocumentRecommendationResolutionSchema = z.object({
  recommendation_id: z.string().uuid(),
  origin: z.enum(["cil_hint", "search_surface", "support_pack", "advisor", "manual_compare"]),
  doc_ids: z.array(z.string().max(200)).min(1).max(20),
  support_pack_id: z.string().max(160).optional(),
  reason_codes: z.array(TopologyReasonCodeSchema).max(12).default([]),
  provider_receipt_refs: z.array(z.string().max(200)).max(12).default([]),
  route_trace_id: z.string().max(160),
  created_at: z.string(),
  schema_version: z.literal(1),
});
```

### 2.5 Route-trace response extension

R10.1 extends `RouteTraceResponseSchema` as follows:

```ts
const RouteTraceResponseSchema = z.object({
  trace: RouteTraceRecordSchema,
  manifest: TransactionManifestSchema.optional(),
  executed_behavior: ExecutedBehaviorBundleSchema.optional(),
  retrieval: RouteTraceRetrievalSliceSchema.optional(),
  schema_version: z.literal(1),
});
```

---

## 3) Bounded topology-expansion behavior

### 3.1 Allowed uses of `ctx_expand`

DOC10 already exposes `ctx_expand(ref_id, neighbors)`. R10.1 clarifies that topology-backed expansion may only be used when:

- a selected result or ref already exists,
- the caller requests expansion or the route policy explicitly allows it,
- the operation mode permits the added cost,
- the expansion stays within hard result and token bounds.
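
The four conditions above can be sketched as a single gate that runs before any topology read. This is an illustrative sketch, not the `ctx_expand` implementation: the input shape, the refusal codes, and the choice to model the bounds check as "some budget remains" are all assumptions.

```ts
// Hypothetical refusal codes; none of these names are defined by DOC10.
export type ExpansionRefusal =
  | "no_selected_ref"
  | "not_requested_or_allowed"
  | "mode_disallows_cost"
  | "budget_exhausted";

export interface ExpansionGateInput {
  selectedRefId?: string;            // a selected result or ref must already exist
  callerRequestedExpansion: boolean;
  routePolicyAllowsExpansion: boolean;
  modePermitsAddedCost: boolean;
  remainingRefBudget: number;        // assumed: refs still allowed under the hard bounds
  remainingTokenBudget: number;      // assumed: tokens still allowed under the hard bounds
}

export type ExpansionGateDecision =
  | { allowed: true }
  | { allowed: false; refusal: ExpansionRefusal };

export function gateTopologyExpansion(input: ExpansionGateInput): ExpansionGateDecision {
  if (!input.selectedRefId) {
    return { allowed: false, refusal: "no_selected_ref" };
  }
  if (!input.callerRequestedExpansion && !input.routePolicyAllowsExpansion) {
    return { allowed: false, refusal: "not_requested_or_allowed" };
  }
  if (!input.modePermitsAddedCost) {
    return { allowed: false, refusal: "mode_disallows_cost" };
  }
  if (input.remainingRefBudget <= 0 || input.remainingTokenBudget <= 0) {
    return { allowed: false, refusal: "budget_exhausted" };
  }
  return { allowed: true };
}
```

Returning the first failing condition, rather than a bare boolean, keeps the skip reason available for the route trace.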

### 3.2 Hard bounds

Default bounds:

- maximum source refs expanded per request: **3**
- maximum neighbors per source ref: **5**
- maximum total added refs from expansion: **8**
- maximum additional inline token budget consumed by expansion summaries: **500 tokens**

If an expansion would exceed these bounds, DOC10 must:

- preserve only the top-ranked neighbors,
- write the suppression decisions into `topology_expansion.suppressed_neighbor_refs`,
- emit the `retrieval.topology_expansion.truncated` telemetry event.
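
A minimal sketch of how the ref-count bounds above might be applied, assuming neighbors arrive with a numeric relevance score. The `RankedNeighbor` shape, the helper name, and the three suppression reason strings are hypothetical; the token budget is not modeled here.

```ts
export interface ExpansionBounds {
  maxSourceRefs: number;
  maxNeighborsPerSource: number;
  maxTotalAddedRefs: number;
}

// Defaults taken from the list above.
export const DEFAULT_EXPANSION_BOUNDS: ExpansionBounds = {
  maxSourceRefs: 3,
  maxNeighborsPerSource: 5,
  maxTotalAddedRefs: 8,
};

// Assumed shape: a higher score means a more relevant neighbor.
export interface RankedNeighbor { refId: string; sourceRefId: string; score: number; }
export interface SuppressedNeighbor { ref_id: string; reason_code: string; }
export interface BoundedExpansion {
  included: string[];
  suppressed: SuppressedNeighbor[];
  truncated: boolean; // true when the truncated telemetry event should be emitted
}

export function applyExpansionBounds(
  sourceRefIds: string[],
  neighbors: RankedNeighbor[],
  bounds: ExpansionBounds = DEFAULT_EXPANSION_BOUNDS,
): BoundedExpansion {
  const allowedSources = sourceRefIds.slice(0, bounds.maxSourceRefs);
  const perSourceCount: Record<string, number> = Object.create(null);
  const included: string[] = [];
  const suppressed: SuppressedNeighbor[] = [];
  // Walk neighbors best-first so only the top-ranked ones survive.
  const ranked = neighbors.slice().sort((a, b) => b.score - a.score);
  for (const n of ranked) {
    const used = perSourceCount[n.sourceRefId] || 0;
    let reason: string | null = null;
    if (allowedSources.indexOf(n.sourceRefId) === -1) reason = "source_ref_limit_exceeded";
    else if (used >= bounds.maxNeighborsPerSource) reason = "neighbor_limit_exceeded";
    else if (included.length >= bounds.maxTotalAddedRefs) reason = "total_ref_limit_exceeded";
    if (reason !== null) {
      suppressed.push({ ref_id: n.refId, reason_code: reason });
      continue;
    }
    perSourceCount[n.sourceRefId] = used + 1;
    included.push(n.refId);
  }
  return { included, suppressed, truncated: suppressed.length > 0 };
}
```

The `suppressed` entries have the same `ref_id` / `reason_code` shape as `suppressed_neighbor_refs` in the expansion summary schema, which accepts free-form reason strings up to 120 characters.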

### 3.3 Contradiction and supersession behavior

If topology metadata marks a candidate as `contradicts_target` or `supersedes_target`, DOC10 must not blindly inline both old and new materials without explanation.

Rules:

- `supersedes_target` may replace the older ref in the suggested set while preserving a visible note.
- a candidate marked `contradicts_target` may appear only if the surface is analytical/comparative or the user requested a comparison.
- decisions to hide or suppress a candidate because of contradiction or supersession must remain visible in receipts or route trace details.
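
One way to read these rules is as a small resolution pass over the suggested set. The sketch below is an assumption-laden illustration: `RelatedCandidate`, `SurfaceContext`, and the note strings are hypothetical, and only the two relation codes come from the schema in 2.2.

```ts
export interface RelatedCandidate {
  refId: string;
  relation: "supersedes_target" | "contradicts_target";
  targetRefId: string; // the ref this candidate supersedes or contradicts
}

export interface SurfaceContext {
  analyticalOrComparative: boolean;
  userRequestedComparison: boolean;
}

export interface ResolutionNote { ref_id: string; reason_code: string; note: string; }
export interface ConflictResolution { suggested: string[]; notes: ResolutionNote[]; }

export function resolveConflicts(
  baseRefIds: string[],
  candidates: RelatedCandidate[],
  surface: SurfaceContext,
): ConflictResolution {
  let suggested = baseRefIds.slice();
  const notes: ResolutionNote[] = [];
  const comparisonOk = surface.analyticalOrComparative || surface.userRequestedComparison;
  for (const c of candidates) {
    if (c.relation === "supersedes_target") {
      // Replace the older ref, but keep a visible note about the swap.
      suggested = suggested.filter((id) => id !== c.targetRefId);
      if (suggested.indexOf(c.refId) === -1) suggested.push(c.refId);
      notes.push({ ref_id: c.targetRefId, reason_code: c.relation, note: "replaced by " + c.refId });
    } else if (comparisonOk) {
      // Contradicting material is shown only on comparative surfaces or on request.
      if (suggested.indexOf(c.refId) === -1) suggested.push(c.refId);
      notes.push({ ref_id: c.refId, reason_code: c.relation, note: "shown for comparison with " + c.targetRefId });
    } else {
      // Suppressed, but the decision is still recorded for the receipt or trace.
      notes.push({ ref_id: c.refId, reason_code: c.relation, note: "suppressed on non-comparative surface" });
    }
  }
  return { suggested, notes };
}
```

Every branch writes a note, so no replacement or suppression happens without a record that the receipt or route trace can display.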

### 3.4 Fallback behavior

If no topology snapshot is available, DOC10 must still complete the route using retrieval/provider truth only, with `fallback_no_topology_data` reason codes where relevant.

---

## 4) Document recommendation and support-pack orchestration

### 4.1 Support-pack assembly rules

DOC10 may assemble an ephemeral support pack when all of the following are true:

- at least two candidate documents share a common relation-aware reason,
- the grouped recommendation remains under the configured doc-count cap,
- the source docs are accessible and not stale-blocked,
- the active route/surface benefits from grouped loading.

Recommended caps:

- default support-pack size: **3–6 docs**
- hard max: **8 docs**
- if more than 8 qualify, show top set plus “view more” in UI.
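
The assembly rules and caps above can be sketched as follows. This is a hypothetical illustration: `PackCandidate`, the `score` field, and the "largest shared reason wins" tie-break are assumptions, not part of the spec.

```ts
// Assumed candidate shape; accessibility and staleness are decided by owner docs.
export interface PackCandidate {
  docId: string;
  reasonCodes: string[];
  accessible: boolean;
  staleBlocked: boolean;
  score: number;
}

export interface SupportPackSketch {
  reasonCode: string;    // the shared relation-aware reason
  docIds: string[];      // capped at the hard max
  overflowCount: number; // how many more qualified ("view more" in the UI)
}

export const SUPPORT_PACK_HARD_MAX = 8;

export function assembleSupportPack(
  candidates: PackCandidate[],
  surfaceBenefitsFromGrouping: boolean,
  hardMax: number = SUPPORT_PACK_HARD_MAX,
): SupportPackSketch | null {
  if (!surfaceBenefitsFromGrouping) return null;
  // Bucket usable docs by each reason code they carry.
  const buckets: Record<string, PackCandidate[]> = Object.create(null);
  for (const c of candidates) {
    if (!c.accessible || c.staleBlocked) continue;
    for (const reason of c.reasonCodes) {
      const bucket = buckets[reason] || [];
      if (bucket.indexOf(c) === -1) bucket.push(c);
      buckets[reason] = bucket;
    }
  }
  // Pick the reason shared by the most docs; at least two docs must share it.
  let bestReason: string | null = null;
  let bestBucket: PackCandidate[] = [];
  for (const reason of Object.keys(buckets)) {
    const bucket = buckets[reason] || [];
    if (bucket.length >= 2 && bucket.length > bestBucket.length) {
      bestReason = reason;
      bestBucket = bucket;
    }
  }
  if (bestReason === null) return null;
  const ranked = bestBucket.slice().sort((a, b) => b.score - a.score);
  return {
    reasonCode: bestReason,
    docIds: ranked.slice(0, hardMax).map((c) => c.docId),
    overflowCount: Math.max(0, ranked.length - hardMax),
  };
}
```

A `null` result corresponds to falling back to individual recommendation cards.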

### 4.2 Grouping reasons

Grouping reasons may include:

- same matter
- same issue
- same motion type
- support-pack member
- active review target neighbor
- supersedes target

### 4.3 No silent auto-load in interactive chat

For interactive chat surfaces, DOC10 may recommend or pre-stage a support pack, but must not silently load an expanded pack unless:

- the route policy explicitly authorizes it,
- the user accepted a suggestion,
- or the operation is a non-interactive structured workflow that already carries approved doc refs.
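
The three exceptions above reduce to a small decision function. In this sketch the `AutoLoadContext` fields and the `"load"` / `"recommend_only"` outcomes are hypothetical names chosen for illustration.

```ts
export interface AutoLoadContext {
  surface: "interactive_chat" | "structured_workflow";
  routePolicyAuthorizesAutoLoad: boolean;
  userAcceptedSuggestion: boolean;
  carriesApprovedDocRefs: boolean; // only meaningful for structured workflows
}

export type PackLoadDecision = "load" | "recommend_only";

export function decideSupportPackLoad(ctx: AutoLoadContext): PackLoadDecision {
  // Exception 1: the route policy explicitly authorizes loading.
  if (ctx.routePolicyAuthorizesAutoLoad) return "load";
  // Exception 2: the user accepted the suggestion.
  if (ctx.userAcceptedSuggestion) return "load";
  // Exception 3: non-interactive structured workflow with already-approved doc refs.
  if (ctx.surface === "structured_workflow" && ctx.carriesApprovedDocRefs) return "load";
  // Default: recommend or pre-stage only, never a silent load.
  return "recommend_only";
}
```

Keeping `recommend_only` as the fall-through means a new surface type gets the conservative behavior unless it is explicitly granted an exception.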

### 4.4 Provenance rule

Every support-pack recommendation must preserve:

- originating route trace,
- originating provider receipt refs,
- source hint/node refs where applicable,
- relation-aware reason codes.

---

## 5) Endpoint and command amendments

### 5.1 Route trace read endpoint

Existing route-trace read endpoints must return the new `retrieval` block when available.

### 5.2 Search surface response amendment

Where DOC10 returns search/deep-search responses, each result should include an optional `display_receipt` field that conforms to `SearchResultDisplayReceiptSchema`.

If the retrieval owner doc cannot provide a provider receipt, DOC10 must either:

- map the response into a compatibility receipt, or
- return `display_receipt_unavailable=true` with a reason.
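
A sketch of the two fallbacks, assuming the upstream response still reports a provider and lane even when no formal receipt exists. The `UnreceiptedHit` shape, the `reason` field name, and the decision to stamp compatibility receipts with a `degraded_reason` are assumptions for illustration.

```ts
// Subset of the display receipt fields from 2.3, typed loosely for the sketch.
export interface CompatDisplayReceipt {
  provider_kind: string;
  search_lane: string;
  corpus_id?: string;
  degraded_reason?: string;
}

// Hypothetical shape of a search hit that arrived without a provider receipt.
export interface UnreceiptedHit {
  providerKind?: string;
  searchLane?: string;
  corpusId?: string;
}

export type DisplayReceiptOutcome =
  | { display_receipt: CompatDisplayReceipt }
  | { display_receipt_unavailable: true; reason: string };

export function toDisplayReceiptOutcome(hit: UnreceiptedHit): DisplayReceiptOutcome {
  if (hit.providerKind && hit.searchLane) {
    // Compatibility receipt: built only from fields the response actually carried,
    // and marked degraded so the UI does not imply a real provider receipt.
    const receipt: CompatDisplayReceipt = {
      provider_kind: hit.providerKind,
      search_lane: hit.searchLane,
      degraded_reason: "provider_receipt_unavailable",
    };
    if (hit.corpusId !== undefined) receipt.corpus_id = hit.corpusId;
    return { display_receipt: receipt };
  }
  return {
    display_receipt_unavailable: true,
    reason: "provider_kind or search_lane not reported",
  };
}
```

The compatibility path never fills in a provider or lane that the response did not report, in line with rule 0.1.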

### 5.3 Document recommendation action endpoints

```text
POST /api/orchestration/documents/recommendation/accept
POST /api/orchestration/documents/recommendation/dismiss
POST /api/orchestration/documents/recommendation/compare
```

#### Request

```ts
const DocumentRecommendationDecisionRequestSchema = z.object({
  recommendation_id: z.string().uuid(),
  route_trace_id: z.string().max(160),
  action: z.enum(["accept", "dismiss", "compare"]),
  schema_version: z.literal(1),
});
```

#### Response

```ts
const DocumentRecommendationDecisionResponseSchema = z.object({
  ok: z.boolean(),
  recommendation: DocumentRecommendationResolutionSchema.optional(),
  refusal_code: z.string().max(80).optional(),
  refusal_message: z.string().max(240).optional(),
  schema_version: z.literal(1),
});
```

DOC10 owns the proxy endpoints and provenance. Owner docs still own any durable writes triggered downstream.

---

## 6) Q/UI amendments

### 6.1 Route trace drawer — retrieval section

Add a dedicated **Retrieval** section showing:

- selected search lane,
- selected provider,
- corpus ID if present,
- route reason,
- degraded reason if any,
- topology expansion summary,
- support-pack grouping summary,
- link to compare alternate routes where supported.

States:

- **loading** — spinner with “Loading retrieval details…”
- **empty** — “No retrieval receipts recorded for this trace.”
- **degraded** — warning badge with explicit reason
- **populated** — receipts + relation-aware notes
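
The four states can be derived from a small view-model. The sketch below is hypothetical: the input fields are assumed, and the precedence (degraded before empty) is a design choice made here so that a degraded warning is never hidden behind the empty message.

```ts
export type RetrievalSectionState = "loading" | "empty" | "degraded" | "populated";

// Assumed view-model fields for the Retrieval section.
export interface RetrievalSectionInput {
  isLoading: boolean;
  receiptCount: number;
  degradedReason?: string;
}

export function retrievalSectionState(input: RetrievalSectionInput): RetrievalSectionState {
  if (input.isLoading) return "loading";
  // Checked before "empty" so an explicit degraded reason always reaches the user.
  if (input.degradedReason) return "degraded";
  if (input.receiptCount === 0) return "empty";
  return "populated";
}
```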

### 6.2 Search / deep-search result cards

Each result card should render a compact receipt row:

- provider badge
- lane badge
- freshness badge
- degraded badge if any
- “Why this matched” expandable section
- topology reason pills when present (`same matter`, `same issue`, `supersedes target`, etc.)

### 6.3 Document recommendation banner / support-pack card

When DOC10 resolves document suggestions from CIL or search:

- show grouped support-pack card when multiple docs belong together,
- show single-doc recommendation card otherwise,
- provide actions: `Load`, `Compare`, `Dismiss`, `Why this pack?`.

### 6.4 Partial deployment messaging

If topology data is missing, the UI must render the message “Related-doc logic unavailable; showing retrieval-only results.” rather than a blank or falsely confident graph-aware explanation.

### 6.5 Mobile behavior

On narrow widths, collapse the receipt row into a single tappable “Route details” chip; do not remove degraded warnings.

---

## 7) Code implementation plan

### 7.1 New or amended files

```text
packages/contracts/src/retrieval/provider-receipts.ts          # imported owner contracts
packages/contracts/src/orchestration/retrieval-trace.ts        # new DOC10 consumer contracts
apps/ec-service/src/orchestration/retrieval/route-receipts.ts  # parse/store provider receipts
apps/ec-service/src/orchestration/retrieval/topology-expand.ts # bounded relation-aware expansion
apps/ec-service/src/orchestration/retrieval/support-packs.ts   # ephemeral support-pack assembly
apps/ec-service/src/orchestration/read-models/route-trace.ts   # include retrieval block in trace views
apps/ec-service/src/server.ts                                  # wire recommendation decision endpoints
apps/q-backend/src/orchestration/routes.ts                     # proxy new read/decision routes
apps/q-frontend/src/components/orchestration/RouteTraceRetrievalCard.tsx
apps/q-frontend/src/components/search/SearchResultReceiptRow.tsx
apps/q-frontend/src/components/search/SupportPackRecommendationCard.tsx
```

### 7.2 Required functions

```ts
export async function persistRouteTraceRetrievalSlice(input: {
  routeTraceId: string;
  retrieval: z.infer<typeof RouteTraceRetrievalSliceSchema>;
}): Promise<void>;

export async function boundedTopologyExpand(input: {
  refIds: string[];
  neighborLimit?: number;
  operationId: string;
  routeTraceId: string;
  allowRelationTypes?: string[];
}): Promise<z.infer<typeof TopologyExpansionSummarySchema>>;

export async function resolveSupportPack(input: {
  candidateDocIds: string[];
  reasonCodes: string[];
  providerReceiptRefs: string[];
  routeTraceId: string;
}): Promise<z.infer<typeof DocumentRecommendationResolutionSchema> | null>;
```

### 7.3 Failure handling

- If provider receipt parsing fails, preserve the route result and emit `provider_receipt_unavailable`.
- If topology expansion fails, return a degraded expansion summary instead of failing the route.
- If support-pack grouping fails validation, fall back to individual recommendation cards.
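
All three rules share one pattern: catch the failure at the seam and substitute an explicit degraded value. The sketch below shows that pattern synchronously for brevity; `attemptOrFallback`, `degradedExpansionSummary`, and `expandOrDegrade` are hypothetical helpers, and the real `boundedTopologyExpand` is asynchronous, so the same idea would wrap an awaited call.

```ts
// Loose stand-in for a few fields of the expansion summary in 2.2.
export interface ExpansionSummaryLike {
  used: boolean;
  degraded: boolean;
  degraded_reason?: string;
  included_neighbor_refs: string[];
}

// Builds an explicit degraded summary; the reason is cut to the schema's 120-char cap.
export function degradedExpansionSummary(reason: string): ExpansionSummaryLike {
  return {
    used: false,
    degraded: true,
    degraded_reason: reason.slice(0, 120),
    included_neighbor_refs: [],
  };
}

// Runs `attempt`; if it throws, returns the fallback value instead of propagating.
export function attemptOrFallback<T>(attempt: () => T, fallback: (error: unknown) => T): T {
  try {
    return attempt();
  } catch (error) {
    return fallback(error);
  }
}

// Topology expansion that can never fail the route.
export function expandOrDegrade(expand: () => ExpansionSummaryLike): ExpansionSummaryLike {
  return attemptOrFallback(expand, (error) =>
    degradedExpansionSummary(error instanceof Error ? error.message : "topology_expansion_failed"),
  );
}
```

The same wrapper fits receipt parsing (fall back to a `provider_receipt_unavailable` marker) and support-pack validation (fall back to individual cards).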

---

## 8) Telemetry additions

Add at minimum:

- `retrieval.route.selected`
- `retrieval.receipt.rendered`
- `retrieval.topology_expansion.used`
- `retrieval.topology_expansion.skipped`
- `retrieval.topology_expansion.truncated`
- `document_recommendation.accepted`
- `document_recommendation.dismissed`
- `support_pack.accepted`
- `support_pack.dismissed`

Every event must include `route_trace_id` and `correlation_id`.
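
One way to enforce this requirement is to route every event through a builder that refuses to construct one without both IDs. The event names below are the ones listed above; the `RetrievalTelemetryEvent` shape and the `buildRetrievalEvent` helper are hypothetical.

```ts
export type RetrievalEventName =
  | "retrieval.route.selected"
  | "retrieval.receipt.rendered"
  | "retrieval.topology_expansion.used"
  | "retrieval.topology_expansion.skipped"
  | "retrieval.topology_expansion.truncated"
  | "document_recommendation.accepted"
  | "document_recommendation.dismissed"
  | "support_pack.accepted"
  | "support_pack.dismissed";

// Assumed event envelope; the actual telemetry sink and schema are not defined here.
export interface RetrievalTelemetryEvent {
  name: RetrievalEventName;
  route_trace_id: string;
  correlation_id: string;
  attributes: Record<string, string | number | boolean>;
}

export function buildRetrievalEvent(
  name: RetrievalEventName,
  routeTraceId: string,
  correlationId: string,
  attributes: Record<string, string | number | boolean> = {},
): RetrievalTelemetryEvent {
  // Both identifiers are mandatory on every event.
  if (routeTraceId.length === 0 || correlationId.length === 0) {
    throw new Error("route_trace_id and correlation_id are required on every event");
  }
  return {
    name,
    route_trace_id: routeTraceId,
    correlation_id: correlationId,
    attributes,
  };
}
```

Typing `name` as a closed union also catches misspelled event names at compile time.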

---

## 9) Acceptance scenarios

1. **Semantic corpus result with provider truth**  
   A LlamaIndex-assisted result reaches Q. Route trace and result card both show `semantic_corpus_search`, `llamaindex_index`, `corpus_id`, route reason, and freshness state.

2. **Topology unavailable fallback**  
   Topology read-model is unavailable. Search still completes. UI shows retrieval receipts and an explicit “no topology data” fallback note.

3. **Support-pack grouping**  
   Three documents from the same matter and motion type are recommended together. DOC10 renders one support-pack card with relation reasons and preserves provider receipt refs.

4. **Supersession handling**  
   A newer memo supersedes an older one. DOC10 recommends the newer memo, records the supersession in the receipt, and does not silently present both as equivalent.

5. **Bounded expansion**  
   A user expands related refs. DOC10 includes up to the configured neighbor bound, records suppressed neighbors, and stays inside the inline budget.

6. **Compare path**  
   User clicks `Compare` on a document recommendation. DOC10 routes to the appropriate comparative surface while preserving the original route trace and recommendation provenance.

---

## 10) Manifest reconciliation for this revision

R10.1 covers the Wave C DOC10 consumer obligations from the control packet and Waves A/B:

- retrieval-lane truth consumption
- provider receipt propagation into route traces and Q
- bounded topology expansion
- graph-aware document recommendations / support packs
- honest degraded-state handling
- implementation-ready schemas, UI states, endpoints, and code seams

No canonical owner truth is moved in this revision. DOC10 remains the consumer/presenter/orchestrator for these seams.