Elnor Repo Reader

EC_CORE_ADDENDUM_A_INTAKE_ROUTING_FOR_CORPUS_BINDINGS_PROPOSAL_V1.md

Current Specs/EC Core/EC_CORE_ADDENDUM_A_INTAKE_ROUTING_FOR_CORPUS_BINDINGS_PROPOSAL_V1.md

Short text page 0a0361bd8d85. Generated 2026-06-09T01:23:58.539Z from commit dbaa25962edc11ab30e8d4ca1715f9ae5bf77331. Worktree: clean.

ELNOR REPO READER TEXT MIRROR
Source repo: /Users/OpenClaw1/Elnor/Elnor Specs
Git branch: main

---

# EC Core Addendum A — Intake Routing for Corpus Source Bindings (Proposal V1)

**Source:** Cross-doc need surfaced during PACER plugin spec V1.1 review (2026-04-27).
**Target:** Increment to EC Core Addendum A V3.3 (operative as of 2026-04-16) — adds an intake-routing step for DOC73 Corpus Source Bindings.
**Status:** Proposal, companion to `DOC73_CORPUS_SOURCE_BINDINGS_PROPOSAL_V1.md`. Cannot land independently; it must be absorbed together with the DOC73 addendum.
**Date:** 2026-04-27
**Companion:** `DOC73_CORPUS_SOURCE_BINDINGS_PROPOSAL_V1.md` (the contract this proposal implements).

---

## 1. Purpose

DOC73 owns the corpus model, including the new "source binding" lifecycle path proposed in `DOC73_CORPUS_SOURCE_BINDINGS_PROPOSAL_V1.md` §3. EC Core owns the intake pipeline — the runtime that takes events from plugins and surfaces, validates them against intake contracts (DOC72 §20A), and writes durable nodes.

This proposal adds **one routing step** to EC Core's intake pipeline so that bindings are evaluated against every inbound event before the resulting node is written. The change is small and additive: bindings are looked up; matching `corpus_id`s are appended to the node's scope-tag set; extraction tasks are enqueued where applicable. EC remains the sole writer.

This is a stub proposal. It does not redesign the intake pipeline. It identifies the precise insertion point and specifies the lookup/evaluation/write contract. Full pseudocode for the binding lookup is in DOC73 Corpus Source Bindings §3; this proposal references that and adds EC-specific concerns (transactional boundaries, error propagation, telemetry, schema migration).

---

## 2. Where the Routing Step Goes

EC Core Addendum A V3.3 §X defines the intake sequence. (§X is a placeholder: the exact section reference is TBD, and the integrator locates the current operative spec section on absorption.) The new step is inserted as follows:

### 2.1 Existing intake sequence (paraphrased; verify against V3.3 §X on absorption)

```
1. Receive event from plugin / surface.
2. Resolve event to intake contract per DOC72 §20A.
3. Validate event payload against contract schema.
4. Resolve referenced entities (entity-resolution sub-pipeline).
5. Construct candidate node payload.
6. Apply DOC25 ingestion contract for Knowledge-side writes.
7. WRITE node to SQLite (Layer 1).
8. Project to DOC72 entity graph (Layer 2).
9. Emit JSONL audit entry.
10. Trigger downstream agent-instruction routing per DOC23.
```

### 2.2 New step — inserted between (5) and (6)

```
5.5 Corpus Source Binding evaluation:
    a. Query corpus_source_binding nodes WHERE
         source_kind = event.source_kind
         AND source_id = event.source_id
         AND active = true
       (in created_at ASC order — deterministic).
    b. For each matching binding:
       - If content_type_filter is set, check event.content_type membership.
         If not present in filter, skip this binding.
       - If inclusion_predicate is set, evaluate against event payload via
         the DOC72 predicate sandbox. On evaluation error: log to
         binding.last_evaluation_error; treat as no-match; continue.
       - If both checks pass, append binding.corpus_id to the candidate
         node's scope-tag set (deduplicated).
       - If binding.contribution_mode == "scope_and_extract", queue a
         deferred extraction task targeting the corpus's extraction_profile
         (deferred = enqueued in step 10's downstream-routing batch, not
         executed inline; preserves intake latency budget).
       - Increment binding.contribution_count and update
         binding.last_contribution_at (these are durable writes, batched
         with the main node write in step 7).
    c. If no bindings match, no-op. Step 5.5 has zero effect on events
       from sources with no active bindings.
```
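
The per-binding logic in step 5.5(b) can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the DOC73 pseudocode: the `Binding` and `RoutingResult` shapes, the `evaluate_bindings` name, and the `toy_predicate` evaluator (a stand-in for the DOC72 predicate sandbox, whose real interface is not specified here) are all hypothetical.

```python
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, List, Optional


@dataclass
class Binding:  # hypothetical in-memory view of a corpus_source_binding row
    binding_id: str
    corpus_id: str
    contribution_mode: str                           # 'scope_only' | 'scope_and_extract'
    content_type_filter: Optional[List[str]] = None  # None = no filter
    inclusion_predicate: Optional[str] = None        # None = no predicate
    last_evaluation_error: Optional[str] = None      # JSON {timestamp, error}


@dataclass
class RoutingResult:
    scope_tags: List[str] = field(default_factory=list)         # deduplicated, order kept
    extraction_tasks: List[dict] = field(default_factory=list)  # deferred to step 10
    fired: List[str] = field(default_factory=list)              # counters updated in step 7


def evaluate_bindings(event: dict, bindings: List[Binding], scope_tags: List[str],
                      evaluate_predicate: Callable[[str, dict], bool]) -> RoutingResult:
    result = RoutingResult(scope_tags=list(scope_tags))
    for b in bindings:  # caller supplies created_at ASC order, per 5.5(a)
        if (b.content_type_filter is not None
                and event["content_type"] not in b.content_type_filter):
            continue  # content type not in filter: skip this binding
        if b.inclusion_predicate is not None:
            try:
                matched = evaluate_predicate(b.inclusion_predicate, event["payload"])
            except Exception as exc:  # evaluation error: record it, treat as no-match
                b.last_evaluation_error = json.dumps(
                    {"timestamp": datetime.now(timezone.utc).isoformat(),
                     "error": str(exc)})
                continue
            if not matched:
                continue
        if b.corpus_id not in result.scope_tags:
            result.scope_tags.append(b.corpus_id)
        if b.contribution_mode == "scope_and_extract":
            result.extraction_tasks.append(
                {"binding_id": b.binding_id, "corpus_id": b.corpus_id})
        result.fired.append(b.binding_id)
    return result


def toy_predicate(source: str, payload: dict) -> bool:
    # Toy evaluator (assumption): "court=<value>" matches payload["court"].
    key, _, value = source.partition("=")
    if key != "court" or not value:
        raise ValueError(f"cannot parse predicate: {source!r}")
    return payload.get("court") == value


event = {"content_type": "docket_entry", "payload": {"court": "nysd"}}
bindings = [
    Binding("b1", "c1", "scope_only", content_type_filter=["docket_entry"]),
    Binding("b2", "c1", "scope_and_extract", inclusion_predicate="court=nysd"),
    Binding("b3", "c2", "scope_only", inclusion_predicate="not a predicate"),
    Binding("b4", "c3", "scope_only", content_type_filter=["order"]),
]
result = evaluate_bindings(event, bindings, ["workspace"], toy_predicate)
```

In this run `b1` and `b2` fire (both tag `c1`, which is appended once), `b3` records an evaluation error and is treated as a no-match, and `b4` is skipped by its content-type filter. The function only collects results; the durable counter updates are left to the step 7 transaction.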

The trust-posture interaction (DOC73 Corpus Source Bindings §5) is enforced by DOC73 §3.1 logic, not here. EC writes the node with `corpus_id` scope tags; DOC73's per-corpus trust posture determines how the resulting member is presented (confirmed vs candidate) at read time. EC does not need to know about trust posture at intake.

### 2.3 Latency budget

V1 of this addendum targets intake latency within +5ms of the V3.3 baseline for events with zero matching bindings (expected to be ~99% of intake volume, since most sources are not bound to any corpus). For events with N matching bindings, the added latency is approximately `N × (predicate_eval_time + scope_tag_append_time)`, expected to be sub-millisecond per binding for trivial predicates. Both figures are estimates pending benchmarking (see §9, item 2).
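
Read as a formula, the estimate above is roughly the following. The symbols are shorthand introduced for this sketch, not names from V3.3, and treating the +5ms figure as a fixed lookup overhead is an assumption:

$$
\Delta t(N) \approx t_{\text{lookup}} + N\,\bigl(t_{\text{pred}} + t_{\text{tag}}\bigr), \qquad t_{\text{lookup}} \le 5\ \text{ms}
$$

For instance, if the per-binding cost $t_{\text{pred}} + t_{\text{tag}}$ stays below 1 ms, as expected for trivial predicates, an event matching $N = 10$ bindings adds less than 10 ms on top of the lookup.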

The binding lookup at step 5.5(a) is the only mandatory new query. EC Core caches active bindings in memory keyed by `(source_kind, source_id)`, with write-through cache invalidation on every EC-mediated binding mutation and a TTL-based refresh (default 60s, per §5) as a backstop for out-of-band edits. Cold-cache cost: one indexed SQLite read per `(source_kind, source_id)` key seen during the cache lifetime.
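
One way to picture the cache described above is the sketch below. It assumes a hypothetical `BindingCache` class with an injected loader and clock; the 60s TTL default is the value §5 gives, and the `"pacer"` source kind is only an illustrative key. Empty results are cached too, so unbound sources cost one read per cache lifetime.

```python
import time
from typing import Callable, Dict, List, Tuple

Key = Tuple[str, str]  # (source_kind, source_id)


class BindingCache:
    """Hypothetical in-memory cache of active bindings per source key."""

    def __init__(self, load: Callable[[str, str], List[dict]],
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load          # performs the indexed SQLite read
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Key, Tuple[float, List[dict]]] = {}

    def get(self, source_kind: str, source_id: str) -> List[dict]:
        key = (source_kind, source_id)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]                      # warm hit, no query
        rows = self._load(source_kind, source_id)  # cold or expired entry
        self._entries[key] = (now, rows)       # empty lists are cached as well
        return rows

    def invalidate(self, source_kind: str, source_id: str) -> None:
        # Called from the same code path that mutates a binding (write-through).
        self._entries.pop((source_kind, source_id), None)


# Usage with a fake store and a controllable clock.
calls: List[Key] = []
store: Dict[Key, List[dict]] = {("pacer", "case-1"): [{"binding_id": "b1"}]}
now = [0.0]


def load(source_kind: str, source_id: str) -> List[dict]:
    calls.append((source_kind, source_id))
    return list(store.get((source_kind, source_id), []))


cache = BindingCache(load, ttl_seconds=60.0, clock=lambda: now[0])
first = cache.get("pacer", "case-1")
second = cache.get("pacer", "case-1")          # served from memory
unbound = cache.get("pacer", "case-2")         # cached empty result
cache.get("pacer", "case-2")
loads_before_mutation = len(calls)

store[("pacer", "case-1")].append({"binding_id": "b2"})
cache.invalidate("pacer", "case-1")            # EC-mediated mutation
after_mutation = cache.get("pacer", "case-1")

store[("pacer", "case-1")] = []                # out-of-band edit, no invalidation
stale = cache.get("pacer", "case-1")
now[0] = 61.0                                  # TTL elapses
refreshed = cache.get("pacer", "case-1")
```

The last three lines show the divergence window that the TTL bounds: the out-of-band edit stays invisible until the entry expires.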

### 2.4 Transactional boundary

The candidate node write (step 7), the binding metadata updates (`contribution_count`, `last_contribution_at` — multiple bindings, multiple updates), and the JSONL audit entry (step 9) MUST be in a single transaction. Either the node lands with all corpus tags AND all binding counters increment, or nothing changes. Partial states ("node written but binding counter not updated") are forbidden — they would corrupt the audit trail and the binding utility computation downstream.

V3.3's existing transactional model already wraps steps 7–9. This addendum extends the wrap to include the binding metadata updates from step 5.5(b).
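
The all-or-nothing requirement can be illustrated with Python's `sqlite3` module, whose connection context manager commits on success and rolls back on an exception. The `node` and `audit_entry` tables, the reduced `corpus_source_binding` table, and the `write_node_with_bindings` helper are assumptions made for this sketch; in particular, modelling the JSONL audit entry as a table row is a simplification, not V3.3's actual mechanism.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE node (node_id TEXT PRIMARY KEY, scope_tags TEXT NOT NULL);
CREATE TABLE audit_entry (node_id TEXT NOT NULL, line TEXT NOT NULL);
CREATE TABLE corpus_source_binding (
  binding_id TEXT PRIMARY KEY,
  contribution_count INTEGER NOT NULL DEFAULT 0,
  last_contribution_at TEXT
);
INSERT INTO corpus_source_binding (binding_id) VALUES ('b1');
""")


def write_node_with_bindings(conn, node_id, scope_tags, fired_binding_ids, now):
    """Steps 7-9 plus the step 5.5(b) counter updates in one transaction."""
    with conn:  # commit if the block succeeds, roll back if it raises
        conn.execute("INSERT INTO node (node_id, scope_tags) VALUES (?, ?)",
                     (node_id, json.dumps(scope_tags)))
        for binding_id in fired_binding_ids:
            cur = conn.execute(
                "UPDATE corpus_source_binding "
                "SET contribution_count = contribution_count + 1, "
                "    last_contribution_at = ? "
                "WHERE binding_id = ?", (now, binding_id))
            if cur.rowcount != 1:  # metadata update failed: abort everything
                raise RuntimeError(f"binding {binding_id} not updated")
        conn.execute("INSERT INTO audit_entry (node_id, line) VALUES (?, ?)",
                     (node_id, json.dumps({"node_id": node_id,
                                           "bindings": fired_binding_ids})))


# Success: node, counter, and audit entry land together.
write_node_with_bindings(conn, "n1", ["c1"], ["b1"], "2026-04-27T00:00:00Z")

# Failure: the second counter update fails, so nothing from this call persists.
try:
    write_node_with_bindings(conn, "n2", ["c1"], ["b1", "missing"],
                             "2026-04-27T00:01:00Z")
    rolled_back = False
except RuntimeError:
    rolled_back = True

node_ids = [r[0] for r in conn.execute("SELECT node_id FROM node")]
b1_count = conn.execute(
    "SELECT contribution_count FROM corpus_source_binding "
    "WHERE binding_id = 'b1'").fetchone()[0]
audit_rows = conn.execute("SELECT COUNT(*) FROM audit_entry").fetchone()[0]
```

After the failed call, `n2` is absent and `b1` still shows a single contribution, which is the "nothing changes" outcome the section requires.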

---

## 3. Schema Migration

EC Core's SQLite schema needs two new tables, plus two supporting indexes on the first. The existing node table is unchanged (§3.3).

### 3.1 New table: `corpus_source_binding`

Maps column-by-column to the `CorpusSourceBinding` interface in DOC73 Corpus Source Bindings §2.1. Indexed on `(source_kind, source_id, active)` for the step 5.5(a) lookup.

```sql
CREATE TABLE corpus_source_binding (
  binding_id              TEXT PRIMARY KEY,
  corpus_id               TEXT NOT NULL,
  source_kind             TEXT NOT NULL,
  source_id               TEXT NOT NULL,
  content_type_filter     TEXT,           -- JSON array, NULL = no filter
  inclusion_predicate     TEXT,           -- DOC72 predicate DSL, NULL = no predicate
  contribution_mode       TEXT NOT NULL,  -- 'scope_only' | 'scope_and_extract'
  active                  INTEGER NOT NULL DEFAULT 1,
  created_at              TEXT NOT NULL,
  created_by              TEXT NOT NULL,  -- 'user' | 'system' | 'suggested_acceptance'
  last_contribution_at    TEXT,
  contribution_count      INTEGER NOT NULL DEFAULT 0,
  last_evaluation_error   TEXT,           -- JSON {timestamp, error}
  CHECK (contribution_mode IN ('scope_only', 'scope_and_extract')),
  CHECK (created_by IN ('user', 'system', 'suggested_acceptance')),
  CHECK (active IN (0, 1))
);

CREATE INDEX idx_csb_lookup
  ON corpus_source_binding (source_kind, source_id, active);

CREATE INDEX idx_csb_corpus
  ON corpus_source_binding (corpus_id);
```
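
As a rough check that the step 5.5(a) lookup is served by `idx_csb_lookup`, the sketch below loads the table into an in-memory database and inspects the query plan. The DDL is condensed from the block above (CHECK constraints and the `corpus_id` index omitted for brevity), the sample rows and the `"pacer"` source kind are invented, and the `binding_id` tie-break in `ORDER BY` is an assumption added here so that bindings sharing a `created_at` value still sort deterministically.

```python
import sqlite3

SCHEMA = """
CREATE TABLE corpus_source_binding (
  binding_id TEXT PRIMARY KEY, corpus_id TEXT NOT NULL,
  source_kind TEXT NOT NULL, source_id TEXT NOT NULL,
  content_type_filter TEXT, inclusion_predicate TEXT,
  contribution_mode TEXT NOT NULL, active INTEGER NOT NULL DEFAULT 1,
  created_at TEXT NOT NULL, created_by TEXT NOT NULL,
  last_contribution_at TEXT, contribution_count INTEGER NOT NULL DEFAULT 0,
  last_evaluation_error TEXT
);
CREATE INDEX idx_csb_lookup
  ON corpus_source_binding (source_kind, source_id, active);
"""

LOOKUP_SQL = """
SELECT binding_id, corpus_id, content_type_filter,
       inclusion_predicate, contribution_mode
FROM corpus_source_binding
WHERE source_kind = ? AND source_id = ? AND active = 1
ORDER BY created_at ASC, binding_id ASC
"""

rows = [  # (binding_id, corpus_id, kind, source_id, mode, active, created_at, by)
    ("b2", "c2", "pacer", "case-1", "scope_only", 1, "2026-04-27T10:00:00Z", "user"),
    ("b1", "c1", "pacer", "case-1", "scope_and_extract", 1, "2026-04-27T09:00:00Z", "user"),
    ("b3", "c3", "pacer", "case-1", "scope_only", 0, "2026-04-27T08:00:00Z", "user"),
    ("b4", "c1", "pacer", "case-2", "scope_only", 1, "2026-04-27T07:00:00Z", "system"),
]

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO corpus_source_binding (binding_id, corpus_id, source_kind, "
    "source_id, contribution_mode, active, created_at, created_by) "
    "VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)

# Active bindings for one source, oldest first; inactive b3 is excluded.
found = [r[0] for r in conn.execute(LOOKUP_SQL, ("pacer", "case-1"))]

# The last column of each EXPLAIN QUERY PLAN row is the human-readable detail.
plan = " ".join(r[3] for r in conn.execute("EXPLAIN QUERY PLAN " + LOOKUP_SQL,
                                           ("pacer", "case-1")))
print(found)
print(plan)
```

The exact plan wording differs between SQLite versions, so the useful signal is simply that the plan names `idx_csb_lookup` rather than a full table scan.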

### 3.2 New table: `source_kind_registration`

```sql
CREATE TABLE source_kind_registration (
  source_kind             TEXT PRIMARY KEY,
  owning_plugin_or_doc    TEXT NOT NULL,
  human_label             TEXT NOT NULL,
  description             TEXT,
  emits_content_types     TEXT NOT NULL,  -- JSON array
  example_source_ids      TEXT,           -- JSON array, optional
  registered_at           TEXT NOT NULL
);
```

### 3.3 Extension to existing node-write path

The candidate-node write at step 7 already emits a list of scope tags. The binding evaluation at step 5.5(b) appends to that list. No schema change to the node table itself — `corpus_id` scope tags are already part of DOC72's existing scope-tag mechanism per DOC73 §3.1.

### 3.4 Migration safety

Both new tables are additive. An EC instance running against a database that has not yet been migrated continues to function: the binding lookup at step 5.5(a) returns zero matches if the table is absent (graceful degradation). A partial rollout, where some EC instances have the migration and others do not, is therefore safe. The operative version that absorbs this proposal (V3.4 or later, per §8) ships the migration as a forward-only schema change.
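
The graceful-degradation behaviour could look like the following sketch, which treats SQLite's "no such table" error as zero matches and re-raises anything else. The `lookup_bindings` helper and the reduced table definition are hypothetical; matching on the error message text is a pragmatic assumption, since `sqlite3.OperationalError` covers other failures as well.

```python
import sqlite3
from typing import List, Tuple


def lookup_bindings(conn: sqlite3.Connection, source_kind: str,
                    source_id: str) -> List[Tuple[str, str]]:
    """Step 5.5(a) lookup that degrades to 'no bindings' before migration."""
    try:
        return conn.execute(
            "SELECT binding_id, corpus_id FROM corpus_source_binding "
            "WHERE source_kind = ? AND source_id = ? AND active = 1 "
            "ORDER BY created_at ASC", (source_kind, source_id)).fetchall()
    except sqlite3.OperationalError as exc:
        if "no such table" in str(exc):
            return []  # schema not migrated yet: behave as if nothing is bound
        raise          # any other operational error is a real problem


# Pre-migration database: the table does not exist.
old_db = sqlite3.connect(":memory:")
before = lookup_bindings(old_db, "pacer", "case-1")

# Post-migration database (reduced columns for brevity).
new_db = sqlite3.connect(":memory:")
new_db.executescript("""
CREATE TABLE corpus_source_binding (
  binding_id TEXT PRIMARY KEY, corpus_id TEXT NOT NULL,
  source_kind TEXT NOT NULL, source_id TEXT NOT NULL,
  active INTEGER NOT NULL DEFAULT 1, created_at TEXT NOT NULL
);
INSERT INTO corpus_source_binding
  (binding_id, corpus_id, source_kind, source_id, created_at)
  VALUES ('b1', 'c1', 'pacer', 'case-1', '2026-04-27T09:00:00Z');
""")
after = lookup_bindings(new_db, "pacer", "case-1")
```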

---

## 4. Telemetry and Observability

### 4.1 Per-event telemetry

EC's existing intake telemetry (event count, validation failures, write latency) is extended with three new counters per event:

- `bindings_evaluated` — how many bindings were considered (post `(source_kind, source_id, active)` lookup, pre filter / predicate evaluation).
- `bindings_matched` — how many fired (filter + predicate passed).
- `predicate_eval_errors` — how many bindings failed predicate evaluation.

Per-binding telemetry is captured in `corpus_source_binding.contribution_count` and `last_contribution_at` (DOC73 contract). EC additionally writes a JSONL audit entry per fired binding.
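
To make the relationship between the three counters concrete, here is a small sketch that tallies them from per-binding outcomes. The `BindingCounters` type, the `tally` function, and the outcome labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class BindingCounters:
    bindings_evaluated: int = 0     # bindings returned by the 5.5(a) lookup
    bindings_matched: int = 0       # filter and predicate both passed
    predicate_eval_errors: int = 0  # predicate evaluation raised


def tally(outcomes: Iterable[str]) -> BindingCounters:
    """Fold one outcome label per looked-up binding into the event counters."""
    counters = BindingCounters()
    for outcome in outcomes:
        counters.bindings_evaluated += 1
        if outcome == "matched":
            counters.bindings_matched += 1
        elif outcome == "predicate_error":
            counters.predicate_eval_errors += 1
        # 'filtered_out' and 'predicate_false' count only as evaluated
    return counters


counters = tally(["matched", "filtered_out", "predicate_error",
                  "predicate_false", "matched"])
```

Here five bindings were considered, two fired, and one failed predicate evaluation; the remaining two were ordinary non-matches.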

### 4.2 Aggregate dashboards

Out of scope for this proposal but flagged for the operations dashboard owner: surface "bindings most active in last 24h," "bindings with rising predicate error rates," "bindings with zero contributions in last 30 days (candidates for cleanup)." These read directly from `corpus_source_binding` columns plus EC's JSONL audit.

---

## 5. Failure Mode Handling (EC-specific)

DOC73 Corpus Source Bindings §3.3 enumerates user-visible failure modes. EC-internal failure modes:

| Failure | EC handling |
|---|---|
| Binding lookup query times out (DB pressure) | Treat as "no bindings matched"; log warning; continue intake. Better to drop corpus tagging on a single event than to fail the whole intake. |
| Predicate evaluation throws unexpected exception | Catch, log to `last_evaluation_error`, treat as no-match. Surface to user only after N consecutive failures (default 5). |
| Binding metadata update fails inside transaction | Roll back entire transaction (per §2.4). Surface to telemetry as `intake_transaction_rollback`. Event re-queued for retry per existing retry policy. |
| Cache and DB diverge (binding mutated externally) | Cache invalidation is write-through, so this should not happen via EC-mediated writes. If it does (manual DB edit), TTL-based cache refresh (default 60s) corrects the divergence. Logged as warning. |
| Source-kind registration missing for an inbound event | Per DOC73 Corpus Source Bindings §3.3, this is upstream — DOC20 §6.18.2 content-type registration validation rejects the event before binding evaluation. EC does NOT need to handle this case at step 5.5. |
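
The "surface only after N consecutive failures" rule in the table can be sketched as a per-binding streak counter. The `PredicateFailureTracker` class is hypothetical, it keeps streaks in memory only, and it surfaces once when a streak reaches the threshold (one reading of the rule). Note that the §3.1 schema stores only the most recent error, so where a durable streak count would live is left open here.

```python
from collections import defaultdict
from typing import DefaultDict


class PredicateFailureTracker:
    """Tracks consecutive predicate failures per binding (in memory only)."""

    def __init__(self, threshold: int = 5) -> None:
        self._threshold = threshold
        self._streak: DefaultDict[str, int] = defaultdict(int)

    def record(self, binding_id: str, ok: bool) -> bool:
        """Record one evaluation; return True when the user should be told."""
        if ok:
            self._streak[binding_id] = 0  # any success resets the streak
            return False
        self._streak[binding_id] += 1
        # Surface exactly once, at the moment the streak hits the threshold.
        return self._streak[binding_id] == self._threshold


tracker = PredicateFailureTracker(threshold=5)
first_four = [tracker.record("b1", ok=False) for _ in range(4)]
fifth = tracker.record("b1", ok=False)
sixth = tracker.record("b1", ok=False)
tracker.record("b1", ok=True)              # streak resets
after_reset = [tracker.record("b1", ok=False) for _ in range(5)]
other_binding = tracker.record("b2", ok=False)
```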

---

## 6. What This Proposal Does NOT Do

- Does NOT redesign the intake pipeline. Adds one step at a defined insertion point.
- Does NOT specify the DOC72 predicate DSL referenced in DOC73 Corpus Source Bindings §2.1. EC consumes whatever DSL DOC72 provides; if no DSL is yet specified, EC defers `inclusion_predicate` evaluation to V2 (predicate-set bindings would simply no-op gracefully until DSL ships — backward compatible).
- Does NOT specify retroactive backfill of `corpus_id` onto historical nodes. DOC73 Corpus Source Bindings §12.6 flags this as out of scope for V1; if V2 introduces a backfill operation, that requires its own EC Core proposal.
- Does NOT touch DOC25 ingestion contract semantics. The binding step is at 5.5; DOC25 application is at step 6, unchanged.
- Does NOT introduce new agent-instruction or task-queue mechanics. Extraction tasks queued at step 5.5(b) flow into the existing DOC23 task queue at step 10 of the intake sequence (§2.1).

---

## 7. Cross-Doc Obligations

Mirror of DOC73 Corpus Source Bindings §8, scoped to EC's responsibilities:

- **DOC73 Corpus Source Bindings absorption** must land at the same time as this addendum or before. EC cannot implement step 5.5 without the contract.
- **DOC72 §20A intake contracts** must declare `source_kind` enums on each contract. EC's step 5.5(a) lookup keys on these.
- **DOC20 §6.18.2 content-type registration** must validate event `content_type` upstream of step 5.5.
- **DOC72 predicate DSL** is a soft dependency — bindings without `inclusion_predicate` work without it; bindings with predicates require it. If absent, V1 of this addendum ships with `inclusion_predicate` evaluation gated behind a feature flag, defaulting off.
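
The feature-flag gating in the last bullet might reduce to a decision like the one below. The `predicate_gate` function and its return labels are hypothetical, and reading "no-op" as "the binding contributes nothing while the flag is off" is an interpretation, not something the specs state outright.

```python
from typing import Optional


def predicate_gate(inclusion_predicate: Optional[str],
                   predicates_enabled: bool = False) -> str:
    """Decide how step 5.5(b) treats a binding's predicate under the flag."""
    if inclusion_predicate is None:
        return "no_predicate"   # binding works without the DSL
    if not predicates_enabled:
        return "skip_binding"   # assumed no-op: predicate-set binding does not fire
    return "evaluate"           # hand the predicate to the DOC72 sandbox


decisions = [
    predicate_gate(None),
    predicate_gate("court=nysd"),
    predicate_gate("court=nysd", predicates_enabled=True),
]
```

With the flag at its default, only bindings without a predicate take part in routing.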

---

## 8. Versioning and Absorption

Per the post-absorption versioning rule, when this proposal is absorbed into the next operative EC Core Addendum A version (V3.4 or later), this V1 proposal is archived. Future revisions are authored as a fresh proposal against the absorbed text, not as a re-edit of this V1.

This proposal's filename intentionally namespaces it (`EC_CORE_ADDENDUM_A_INTAKE_ROUTING_FOR_CORPUS_BINDINGS_PROPOSAL_V1.md`) so it is unambiguous in the addenda folder alongside other EC Core proposals.

---

## 9. Red-Team Targets

For fresh-window review:

1. **Insertion point precision.** §2.1 paraphrases V3.3's intake sequence; the integrator must verify the actual section reference and confirm step 5.5 inserts cleanly without re-ordering existing steps or breaking transactional invariants.
2. **Latency budget realism.** §2.3 estimates +5ms for unbound events. Under realistic load (PACER plugin alone could emit ~50 events/cycle across 20 watched cases) this needs benchmarking, not estimation.
3. **Cache invalidation correctness.** Write-through cache (§2.3) plus 60s TTL refresh (§5) is belt-and-suspenders — verify that's intentional and not redundant. If write-through is reliable, TTL is unneeded.
4. **Transaction scope.** §2.4 requires node write + N binding metadata updates + JSONL audit in one transaction. For high-fan-out events (one event hitting 10+ bindings), transaction size grows. Worth verifying SQLite WAL behavior under this load.
5. **Predicate DSL coupling.** §6 says predicate-set bindings "no-op gracefully" if the DSL is missing, and §7 gates predicate evaluation behind a feature flag that defaults off. Worth verifying that "no-op" is the right default: some users would expect predicate-set bindings to error loudly during binding creation if the DSL isn't available, rather than silently failing intake-time evaluation.
6. **Migration rollback.** §3.4 says forward-only. If the migration causes problems, rollback requires a separate down-migration. Consider whether a reversible migration is worth the extra spec complexity here.