DOC19_ORIG_SPEC_FILLS_AND_FIXES.md

Current Specs/DOC19/DOC19_ORIG_SPEC_FILLS_AND_FIXES.md
Generated 2026-06-09T01:23:58.539Z from commit dbaa25962edc11ab30e8d4ca1715f9ae5bf77331. Worktree: clean.
# DOC19 — Orig Spec Fills and Fixes

Fixes, additions, and undocumented behaviors implemented outside the original DOC specs.
These need to be folded into the next spec rebuild so they are not lost during future
code-generation passes.

---

## Staging Rule (Read First)

`DOC19` is a staging ledger. Entries here are preserved implementation knowledge
and promotion packets, not operative owner truth by themselves.

Rules:

1. Treat every entry as `staging-only` until it is promoted into a versioned
   owner doc.
2. Do not assume owner-doc adoption just because an item exists in `DOC19`.
3. Keep entries in `DOC19` until promotion is completed and verified against
   release gates.
4. During promotion, update owner docs and ledgers first; then mark the `DOC19`
   entry as promoted/deprecated if desired.

---

## Table of Contents

0. [Staging Rule (Read First)](#staging-rule-read-first)
1. [Markdown Renderer Component](#1-markdown-renderer-component)
2. [Forum / Panel Text Whitespace Normalization](#2-forum--panel-text-whitespace-normalization)
3. [Forum Orchestrator Proactive Tick System](#3-forum-orchestrator-proactive-tick-system)
4. [Orchestrator Mode Gate Fixes](#4-orchestrator-mode-gate-fixes)
5. [Orchestrator Dedupe Key Fix](#5-orchestrator-dedupe-key-fix)
6. [Entity Sync Race Condition Fixes](#6-entity-sync-race-condition-fixes)
7. [Frontend Tick Response Application](#7-frontend-tick-response-application)
8. [Orchestrator Debug Logging](#8-orchestrator-debug-logging)
9. [New Dependencies](#9-new-dependencies)
10. [Backend-Side Proactive Tick Timer](#10-backend-side-proactive-tick-timer)
11. [Frontend Entity Sync Architecture Refactor](#11-frontend-entity-sync-architecture-refactor)
12. [Queue Processor Concurrency Fix](#12-queue-processor-concurrency-fix)
13. [Orchestrator Error Logging and Cost Cap Logging](#13-orchestrator-error-logging-and-cost-cap-logging)
14. [Interactive Chat / Forum / Panel Latency and Composer Architecture](#14-interactive-chat--forum--panel-latency-and-composer-architecture)

---

## Thematic Groups

| Group | Focus | Sections |
|-------|-------|----------|
| Group A | UI rendering and presentation behavior | 1 |
| Group B | Forum/panel orchestration and sync core fixes | 2, 3, 4, 5, 6, 7, 8 |
| Group C | Dependency/build + reliability placement block | 9, Summary, 10, 11, 12, 13 |
| Group D | Interactive latency architecture and cross-doc promotion packet | 14 |

---

## Group A — UI Rendering and Presentation

## 1. Markdown Renderer Component

**Problem:** All model-generated text across the entire dashboard rendered as raw
plaintext. No formatting for headings, bold, lists, code blocks, tables, etc.

**Spec gap:** No DOC spec defined a shared Markdown rendering component, element-level
styles, or the remark/rehype plugin chain.

### What was built

New file: `apps/q-frontend/src/components/MarkdownRenderer.jsx`

A shared, memoized React component using `react-markdown` with the following
plugin chain:

| Layer | Plugin | Purpose |
|-------|--------|---------|
| Remark | `remark-gfm` | GitHub Flavored Markdown (tables, task lists, strikethrough) |
| Remark | `remark-breaks` | Treats single `\n` as `<br>` (standard Markdown ignores a single `\n` unless the line ends with two spaces or a backslash) |
| Rehype | `rehype-highlight` | Syntax highlighting for fenced code blocks |

### Design tokens

The component mirrors the Q Dashboard design tokens so all styled elements
match the rest of the UI. These must stay in sync with `QDashV11.jsx`:

```
font.sans  = "Sohne, Helvetica Neue, -apple-system, BlinkMacSystemFont, sans-serif"
font.mono  = "Sohne Mono, SF Mono, Menlo, monospace"

c.textPri     = #1A1D21
c.textSec     = #5E6570
c.textTer     = #8B919A
c.accent      = #4A5060
c.accentBtn   = #31588c
c.border      = #E0E2E5
c.borderLight = #ECEEF0
c.bgPanelAlt  = #F9FAFB

R.sm = 6px   R.md = 10px   R.lg = 14px
```

### Styled element overrides

Every Markdown element has a custom `components` override with inline styles.
These are the design decisions that need to be in the spec:

| Element | Font Size | Line Height | Color | Margins | Notes |
|---------|-----------|-------------|-------|---------|-------|
| `p` | 13px | 1.7 | textSec | 0 0 10px | Base paragraph |
| `strong` | inherit | inherit | textPri | -- | fontWeight 650 |
| `h1` | 20px | -- | textPri | 24px 0 12px | Bottom border, borderLight |
| `h2` | 17px | -- | textPri | 20px 0 10px | Bottom border, borderLight |
| `h3` | 15px | -- | textPri | 16px 0 8px | No border |
| `h4` | 13.5px | -- | textPri | 14px 0 6px | No border |
| `ul` | -- | 1.7 | -- | 8px 0 12px | paddingLeft 22, disc |
| `ol` | -- | 1.7 | -- | 8px 0 12px | paddingLeft 22, decimal |
| `li` | 13px | -- | textSec | 0 0 4px | paddingLeft 4 |
| Inline `code` | 0.88em | -- | #1F2937 | -- | bg #F3F4F6, 1px border borderLight, radius 4 |
| Block `code` | 12px | 1.6 | #1F2937 | -- | Inside pre wrapper |
| `pre` wrapper | -- | -- | -- | 10px 0 14px | Rounded border, borderLight, overflow hidden |
| `pre` inner | 12px | 1.6 | #1F2937 | 0 | bg #F3F4F6, mono font, overflowX auto |
| Code lang label | 10px | -- | textTer | -- | Uppercase, bg #F3F4F6, bottom border, weight 600 |
| `blockquote` | 13px | 1.6 | textSec | 10px 0 | Left border accentBtn 80% opacity, italic, bg bgPanelAlt |
| `a` (link) | inherit | -- | #2563EB | -- | No underline, 1px bottom border rgba blue |
| `hr` | -- | -- | -- | 16px 0 | Top border only, color border |
| `table` wrapper | -- | -- | -- | 10px 0 | overflowX auto |
| `th` | 12px | -- | -- | padding 8x12 | fontWeight 650, bg #F3F4F6, 2px bottom border |
| `td` | 13px | -- | -- | padding 8x12 | 1px bottom border borderLight |
| Striped rows | -- | -- | -- | -- | odd bg #F9FAFB via CSS nth-child |
| `img` | -- | -- | -- | 8px 0 | maxWidth 100%, radius sm |
| checkbox `input` | -- | -- | -- | marginRight 6 | accentColor accentBtn, readOnly, cursor default |

### Syntax highlighting theme

Inline `<style>` block with `.hljs` classes:

| Token | Color | Weight |
|-------|-------|--------|
| Default | #1F2937 | -- |
| keyword, selector-tag, title | #7C3AED | 600 |
| string, addition | #059669 | -- |
| number, literal | #D97706 | -- |
| comment, quote | #9CA3AF | italic |
| built_in, type | #2563EB | -- |
| attr, variable | #DC2626 | -- |
| meta | #6B7280 | -- |
| function title | #2563EB | -- |
| params | #1F2937 | -- |
| selector-class | #D97706 | -- |
| selector-id | #DC2626 | -- |
| deletion | #DC2626 | bg #FEE2E2 |
| addition (bg) | -- | bg #D1FAE5 |

### Integration points (16 locations)

The component replaces raw text rendering at these locations in `QDashV11.jsx`.
Each previously used either `renderMessageTextWithLinks()` or raw `{text}`:

| Surface | Content Source | Old Rendering |
|---------|---------------|---------------|
| Chat messages | `msg.text` | `renderMessageTextWithLinks()` |
| Task descriptions | `task.description` | Raw text |
| Task review context summary | `review.contextSummary` | Raw text |
| Forum consensus conclusion | `conclusion` | Raw text |
| Task review chat thread | `message.text` | Raw text |
| Task review summary | `review.summary` | Raw text |
| Task error messages | `error.message` | Raw text |
| Notification descriptions | `notification.description` | Raw text |
| Memory items | `memory.text` | Raw text |
| CRS event descriptions | `event.description` | Raw text |
| Standing orders | `order.text` | Raw text |
| Correction rules | `rule.text` | Raw text |
| Agent memory items | `memory.text` | Raw text |
| Activity event descriptions | `event.description` | Raw text |
| Forum thread posts | `post.text` | Raw text |
| Panel timeline posts | `post.text` | Raw text |

> **Note:** `renderMessageTextWithLinks()` still exists and is still used for
> system messages that contain `q://` deep links. Those should NOT use
> MarkdownRenderer because they have special link handling.

### Spec section needed

A new section (e.g., DOC spec "Shared Components" or within the existing
Q Dashboard spec) should define:

- Component name, file path, and import pattern
- The remark/rehype plugin chain and why each plugin is needed
- All element style overrides with exact values
- The syntax highlighting theme
- The 16 integration points and the rule about `q://` deep links
- The `remark-breaks` requirement (LLMs typically use single `\n`, not `\n\n`)
- The memoization strategy (`memo()` + `useMemo()`)
- Graceful null/undefined handling

### Promotion packet

- Primary owner doc: Q master UI spec
- Secondary owner doc: frontend dependency/build manifest if maintained separately
- Proposed amendment title: `Shared Markdown Rendering Component Contract`
- Carry-forward rule: keep this entry in `DOC19` until the shared component,
  plugin chain, styling contract, and dependency list are formally captured in
  the Q UI owner docs

---

## Group B — Forum/Panel Orchestration, Sync, and Reliability

## 2. Forum / Panel Text Whitespace Normalization

**Problem:** Forum posts rendered as one giant text blob with no paragraph breaks,
even though the LLM was generating proper `\n` newlines.

**Root cause:** The backend `normalizeWhitespace()` function replaces every run
of whitespace, including `\n`, with a single space (`/\s+/ -> " "`). Forum post
text passed through this function at three points, listed in the table below.

**Spec gap:** No spec defined how whitespace normalization should handle newlines
in multi-paragraph content like forum posts.

### What was built

New function in `server.ts`:

```
normalizeWhitespacePreserveNewlines(value: string): string
```

**Behavior:**
1. Split input on `\n`
2. For each line, collapse horizontal whitespace (`/[ \t]+/g -> " "`) and trim
3. Rejoin with `\n`
4. Collapse runs of 3 or more consecutive newlines to 2 (`/\n{3,}/g -> "\n\n"`), leaving at most one blank line between paragraphs
5. Final `.trim()`
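
The five steps above can be sketched as follows. This is a minimal illustration of the described behavior, not the code in `server.ts`; the flat `normalizeWhitespace` shown for comparison assumes a global regex and a final trim, which the source does not confirm.

```typescript
// Paragraph-safe normalization, following the five steps listed above.
function normalizeWhitespacePreserveNewlines(value: string): string {
  return value
    .split("\n") // 1. split on newlines
    .map((line) => line.replace(/[ \t]+/g, " ").trim()) // 2. collapse spaces/tabs, trim each line
    .join("\n") // 3. rejoin with newlines
    .replace(/\n{3,}/g, "\n\n") // 4. cap newline runs at two (one blank line)
    .trim(); // 5. final trim
}

// Assumed shape of the flat variant, for single-line fields only.
function normalizeWhitespaceFlat(value: string): string {
  return value.replace(/\s+/g, " ").trim();
}

const sample = "  First   paragraph \n\n\n\nSecond\tparagraph  \n";
console.log(JSON.stringify(normalizeWhitespacePreserveNewlines(sample)));
console.log(JSON.stringify(normalizeWhitespaceFlat(sample)));
```

Because each line is trimmed individually, a trailing `\r` from CRLF input is also removed in step 2.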

### Where it replaced `normalizeWhitespace()`

| Location | Function | When it runs |
|----------|----------|--------------|
| `normalizeForumPostRecords()` | Post text normalization | Reading posts from store |
| `generateForumScopedReply()` | LLM reply normalization | After LLM generates a forum reply |
| `createForumUserPost()` | User post text normalization | When user submits a forum post |

### Spec section needed

The spec should define:

- The difference between `normalizeWhitespace` (flat) and
  `normalizeWhitespacePreserveNewlines` (paragraph-safe)
- A rule: any text field that may contain multi-paragraph content (forum posts,
  panel posts, task descriptions, memory items) MUST use the newline-preserving
  variant
- The original `normalizeWhitespace` is still correct for single-line fields
  (names, tags, titles, slugs)

### Promotion packet

- Primary owner doc: `DOC6`
- Secondary owner doc: Q backend/server behavior appendix if maintained separately
- Proposed amendment title: `Paragraph-Safe Text Normalization Contract`
- Carry-forward rule: do not remove this entry until multiline content fields
  are explicitly split between flat normalization and newline-preserving
  normalization in the owner docs

---

## 3. Forum Orchestrator Proactive Tick System

**Problem:** Agents configured with timed intervals (e.g., "respond every 1 minute")
never responded proactively. They only responded when @mentioned.

**Spec gap:** The original spec described timed responses as a participant
configuration option but did not fully specify:
- The frontend polling mechanism
- The proactive job enqueueing path (separate from mention-triggered path)
- How the tick response feeds back into the UI
- How the entity sync interacts with orchestrator-generated data

### What was built

#### 3a. Backend-side proactive tick timer (primary)

The backend runs a `setInterval` timer that fires every 20 seconds. This is the
**primary orchestration driver** and runs independently of whether any frontend
is connected. See section 10 for full details.

#### 3b. Frontend entity sync poll (secondary)

The frontend polls the tick endpoint to **fetch** updated entities (not to drive
orchestration). It uses adaptive intervals: 20 seconds when the tab is visible,
60 seconds when hidden. It also re-fetches immediately on `visibilitychange`
when the user returns to the tab. See section 11 for full details.

#### 3c. API client method

New method in `api.ts`:

```
orchestratorTick(opts?: { scope_kind, scope_id, trigger_post_id })
  -> POST /api/q/orchestrator/tick
```

#### 3d. Backend proactive scan function

New function `enqueueDueProactiveForumJobs(store)`:

- Iterates ALL forum threads and panels (not just the one that triggered)
- For each participant:
  - Skips `mode === "observe"`
  - Calls `isForumParticipantDue(participant)` to check interval
  - Creates a `ForumAutoReplyJob` with `reason: "proactive_due"`
  - Uses time-bucket dedupe keys (see section 5)
- This function intentionally bypasses `buildForumJobsForTriggerPost()` so
  that @mention checks on the last user post do not block proactive jobs

#### 3e. Tick endpoint

```
POST /api/q/orchestrator/tick
```

**Request body (all optional):**
- `scope_kind`: "thread" or "panel" (if provided with scope_id, runs targeted scan)
- `scope_id`: specific scope to check
- `trigger_post_id`: specific trigger post
- `include_entities`: boolean, default true

**Response:**
```json
{
  "ok": true,
  "tick": {
    "enqueued": <number>,
    "queue_size": <number>,
    "processed": <number>,
    "failed": <number>
  },
  "orchestrator": {
    "running": <boolean>,
    "queued_jobs": <number>,
    "last_tick_at": <iso>,
    "last_error": <string|null>,
    "total_processed": <number>,
    "total_failed": <number>
  },
  "entities": <QEntityStore>
}
```

When called with no scope parameters (the default for the frontend entity sync
poll), the endpoint runs the full proactive scan across all threads and panels.
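
The request and response shapes above can be restated as TypeScript types. This is a sketch under stated assumptions: the interface names, the `isTargetedTick` helper, and the `baseUrl` parameter are hypothetical, the real `api.ts` method may use a shared request helper instead of `fetch`, and treating `entities` as optional when `include_entities` is false is a guess.

```typescript
interface OrchestratorTickRequest {
  scope_kind?: "thread" | "panel";
  scope_id?: string;
  trigger_post_id?: string;
  include_entities?: boolean; // server default is true
}

interface OrchestratorTickResponse {
  ok: boolean;
  tick: { enqueued: number; queue_size: number; processed: number; failed: number };
  orchestrator: {
    running: boolean;
    queued_jobs: number;
    last_tick_at: string; // ISO timestamp
    last_error: string | null;
    total_processed: number;
    total_failed: number;
  };
  entities?: unknown; // QEntityStore in the real code
}

// A targeted scan needs both scope fields; otherwise the full proactive scan runs.
function isTargetedTick(req: OrchestratorTickRequest): boolean {
  return Boolean(req.scope_kind && req.scope_id);
}

// Hypothetical client wrapper around POST /api/q/orchestrator/tick.
async function orchestratorTick(
  opts: OrchestratorTickRequest = {},
  baseUrl = "",
): Promise<OrchestratorTickResponse> {
  const res = await fetch(`${baseUrl}/api/q/orchestrator/tick`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(opts),
  });
  if (!res.ok) throw new Error(`orchestrator tick failed with status ${res.status}`);
  return (await res.json()) as OrchestratorTickResponse;
}
```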

### Spec section needed

- The backend 20-second proactive tick timer (NOT frontend-driven)
- The frontend entity sync poll (adaptive intervals, visibility listener)
- The full tick endpoint contract (request/response)
- The proactive scan function and how it differs from mention-triggered jobs
- The `isForumParticipantDue()` logic:
  - Returns `false` if `interval_review_enabled` is false
  - Returns `true` if `last_auto_reply_at` is empty (never replied = always due)
  - Otherwise returns `nowMs >= (last_auto_reply_at + intervalMs)`
- The `forumParticipantIntervalMs()` logic:
  - Returns 0 if `interval_review_enabled` is false
  - Returns 0 if `check_mode !== "interval"`
  - Otherwise converts `check_interval_value` + `check_interval_unit` to ms
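
The two rules above can be sketched as pure functions. The participant shape, the set of interval units, and the extra `nowMs` parameter (added so the check is testable) are assumptions; only the field names and the branch logic come from this entry.

```typescript
type IntervalUnit = "seconds" | "minutes" | "hours" | "days"; // assumed unit names

interface TimedParticipant {
  interval_review_enabled: boolean;
  check_mode: string; // "interval" enables timed replies
  check_interval_value: number;
  check_interval_unit: IntervalUnit;
  last_auto_reply_at: string; // ISO timestamp, or "" if never replied
}

const UNIT_TO_MS: Record<IntervalUnit, number> = {
  seconds: 1_000,
  minutes: 60_000,
  hours: 3_600_000,
  days: 86_400_000,
};

function forumParticipantIntervalMs(p: TimedParticipant): number {
  if (!p.interval_review_enabled) return 0;
  if (p.check_mode !== "interval") return 0;
  return Math.max(0, p.check_interval_value) * UNIT_TO_MS[p.check_interval_unit];
}

function isForumParticipantDue(p: TimedParticipant, nowMs: number = Date.now()): boolean {
  if (!p.interval_review_enabled) return false;
  if (!p.last_auto_reply_at) return true; // never replied = always due
  const lastMs = Date.parse(p.last_auto_reply_at);
  if (Number.isNaN(lastMs)) return true; // assumption: unparsable timestamp counts as "never replied"
  return nowMs >= lastMs + forumParticipantIntervalMs(p);
}
```

Read literally, a participant with `interval_review_enabled` true but `check_mode` other than `"interval"` gets an interval of 0 and is therefore always due once it has replied. Whether the real code guards against that combination is not stated here and is worth pinning down during promotion.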

### Promotion packet

- Primary owner doc: `DOC6`
- Secondary owner doc: Q UI spec for display refresh behavior
- Proposed amendment title: `Forum/Panel Proactive Tick and Scheduling Contract`
- Carry-forward rule: keep this entry until backend scheduling, frontend
  display refresh, and the tick endpoint contract are all captured in the
  owner-doc set

---

## 4. Orchestrator Mode Gate Fixes

**Problem:** The `buildForumJobsForTriggerPost()` function had a timed candidate
check that required `mode === "proactive"`, but the UI sets mode to "respond"
when the user enables timed responses. The execute function also had a redundant
mode gate.

**Spec gap:** The spec did not clearly define which `mode` values should allow
timed responses.

### What was changed

| Location | Before | After | Why |
|----------|--------|-------|-----|
| `buildForumJobsForTriggerPost` timed candidate | `mode === "proactive"` | `mode !== "observe"` | "respond" mode participants with timed intervals should still get timed responses |
| `executeForumAutoReplyJob` mode gate | `participant.mode !== "proactive"` causes skip | Removed (redundant) | The `observe` check at the top of the function already filters out observe-mode participants; the narrower check was blocking "respond" mode |

### Spec section needed

Define which mode values allow which behaviors:

| Mode | @Mention replies | Timed interval replies | Agent-determined reviews |
|------|------------------|----------------------|------------------------|
| `observe` | No | No | No |
| `respond` | Yes | Yes (if interval configured) | No |
| `proactive` | Yes | Yes (if interval configured) | Yes (if enabled) |
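
The table can be encoded as a single lookup so that every gate in the orchestrator consults the same rule. This is an illustrative sketch; `modeCapabilities` and its option names are hypothetical and do not exist in the current code.

```typescript
type ParticipantMode = "observe" | "respond" | "proactive";

interface ModeCapabilities {
  mentionReplies: boolean;
  timedReplies: boolean;
  agentReviews: boolean;
}

function modeCapabilities(
  mode: ParticipantMode,
  opts: { intervalConfigured: boolean; agentReviewEnabled: boolean },
): ModeCapabilities {
  if (mode === "observe") {
    // observe never replies, regardless of other settings
    return { mentionReplies: false, timedReplies: false, agentReviews: false };
  }
  return {
    mentionReplies: true,
    timedReplies: opts.intervalConfigured, // both respond and proactive
    agentReviews: mode === "proactive" && opts.agentReviewEnabled,
  };
}
```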

### Promotion packet

- Primary owner doc: `DOC6`
- Proposed amendment title: `Forum Participant Mode Gate Semantics`
- Carry-forward rule: do not remove until mode semantics for `observe`,
  `respond`, and `proactive` are stated explicitly in the participant contract

---

## 5. Orchestrator Dedupe Key Fix

**Problem:** The dedupe key for proactive jobs included the `trigger_post_id`,
which meant once a proactive job was created for a given trigger post, no
future proactive job could be created for the same trigger post. Since the
"trigger post" in proactive mode is just the latest post (and it might not
change between ticks), this permanently blocked future proactive responses.

**Spec gap:** The deduplication strategy for proactive jobs was not specified.

### What was changed

| Before | After |
|--------|-------|
| Dedupe key: `scope:id:participant:proactive_due:triggerPostId` | Dedupe key: `scope:id:participant:proactive_due:tick-{bucket}` |

The `bucket` is calculated as:
```
intervalMs = forumParticipantIntervalMs(participant) || 60000
bucket = Math.floor(Date.now() / intervalMs)
```

This allows one proactive job per interval window per participant, regardless
of whether the trigger post has changed.
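
A small sketch of the key construction, assuming the key segments are plain strings joined by `:`. The function name and the explicit `nowMs` parameter are hypothetical and exist only to make the bucket arithmetic visible.

```typescript
function proactiveDedupeKey(
  scopeKind: string,
  scopeId: string,
  participantId: string,
  intervalMs: number,
  nowMs: number = Date.now(),
): string {
  const effectiveIntervalMs = intervalMs || 60_000; // fall back to one minute when interval is 0
  const bucket = Math.floor(nowMs / effectiveIntervalMs);
  return `${scopeKind}:${scopeId}:${participantId}:proactive_due:tick-${bucket}`;
}

// Two ticks inside the same one-minute window share a key; the next window gets a new one.
console.log(proactiveDedupeKey("thread", "t1", "p1", 60_000, 120_000)); // bucket 2
console.log(proactiveDedupeKey("thread", "t1", "p1", 60_000, 179_999)); // bucket 2
console.log(proactiveDedupeKey("thread", "t1", "p1", 60_000, 180_000)); // bucket 3
```

Buckets are aligned to wall-clock time rather than to the participant's last reply, so the key alone does not enforce a full interval between two replies that straddle a bucket boundary. That spacing comes from the `isForumParticipantDue()` check.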

### Spec section needed

- Dedupe key format for each reason type
- The time-bucket strategy for proactive jobs
- The 5-minute dedupe window for other job types (existing behavior)

### Promotion packet

- Primary owner doc: `DOC6`
- Proposed amendment title: `Forum Orchestrator Dedupe Strategy`
- Carry-forward rule: keep this entry until proactive dedupe keys and the
  time-bucket rule are written into the orchestrator internals contract

---

## 6. Entity Sync Race Condition Fixes

**Problem:** The frontend entity sync (`PUT /api/q/entities`) was a full
replacement. When the backend orchestrator generated an agent reply and
updated `last_auto_reply_at`, the frontend's next sync would overwrite the
backend store with stale data:
1. Agent reply post deleted
2. `last_auto_reply_at` reset to empty string
3. `runtime_reply_count` and `runtime_spend_usd` reset to zero

This created a loop: the participant appeared "always due" (empty timestamp)
but their replies were immediately destroyed.

**Spec gap:** The entity sync protocol did not define merge semantics for
backend-generated data.

### What was built

Two new helper functions in `server.ts`, used by the `PUT /api/q/entities`
handler:

#### 6a. `mergeForumPostsByScope(incoming, existing)`

Merges post arrays by scope ID. For each scope:
- Builds a Set of incoming post IDs
- Finds backend-only posts (IDs not in incoming set)
- Appends backend-only posts to the merged result

This ensures agent-generated posts survive even when the frontend hasn't
received them yet.
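
A minimal sketch of the merge described above. The post shape and the `Record<string, ForumPost[]>` container are assumptions, as is the choice to keep scopes that exist only on the backend.

```typescript
interface ForumPost {
  id: string;
  text: string;
}
type PostsByScope = Record<string, ForumPost[]>;

function mergeForumPostsByScope(incoming: PostsByScope, existing: PostsByScope): PostsByScope {
  const merged: PostsByScope = {};
  const scopeIds = Object.keys(incoming).concat(Object.keys(existing));
  for (const scopeId of scopeIds) {
    if (Object.prototype.hasOwnProperty.call(merged, scopeId)) continue; // scope already merged
    const incomingPosts = incoming[scopeId] ?? [];
    const existingPosts = existing[scopeId] ?? [];
    const incomingIds = new Set(incomingPosts.map((post) => post.id));
    // Backend-only posts: present in the store but unknown to the frontend payload.
    const backendOnly = existingPosts.filter((post) => !incomingIds.has(post.id));
    merged[scopeId] = incomingPosts.concat(backendOnly); // incoming wins on shared IDs
  }
  return merged;
}
```

One consequence of a union-by-ID merge is that the backend cannot tell "the frontend deleted this post" apart from "the frontend has not seen this post yet". If post deletion is supported, the owner doc will likely need an explicit deletion signal.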

#### 6b. `preserveParticipantRuntimeFields(incomingScopes, existingScopes)`

For each scope (thread/panel), for each participant:
- If incoming `last_auto_reply_at` is empty but existing has a value, keep existing
- If incoming `runtime_reply_count` is 0 but existing has a value, keep existing
- If incoming `runtime_spend_usd` is 0 but existing has a value, keep existing

This prevents the frontend sync from clobbering orchestrator-set fields.
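
The three rules can be sketched like this. Modeling scopes as arrays of objects with `id` and `participants` is an assumption about the store layout; the three runtime field names come from this entry.

```typescript
interface ScopeParticipant {
  id: string;
  mode: string;
  last_auto_reply_at: string;
  runtime_reply_count: number;
  runtime_spend_usd: number;
}

interface ForumScope {
  id: string;
  participants: ScopeParticipant[];
}

function preserveParticipantRuntimeFields(
  incomingScopes: ForumScope[],
  existingScopes: ForumScope[],
): ForumScope[] {
  const existingScopeById = new Map<string, ForumScope>();
  for (const scope of existingScopes) existingScopeById.set(scope.id, scope);

  return incomingScopes.map((scope) => {
    const previous = existingScopeById.get(scope.id);
    if (!previous) return scope; // new scope: nothing to preserve
    const previousById = new Map<string, ScopeParticipant>();
    for (const p of previous.participants) previousById.set(p.id, p);

    const participants = scope.participants.map((p) => {
      const old = previousById.get(p.id);
      if (!old) return p;
      return {
        ...p, // frontend-authoritative fields (mode, model, intervals, ...) pass through
        // `||` keeps the backend value whenever the incoming one is "" or 0.
        last_auto_reply_at: p.last_auto_reply_at || old.last_auto_reply_at,
        runtime_reply_count: p.runtime_reply_count || old.runtime_reply_count,
        runtime_spend_usd: p.runtime_spend_usd || old.runtime_spend_usd,
      };
    });
    return { ...scope, participants };
  });
}
```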

#### 6c. Updated PUT handler

The `PUT /api/q/entities` handler now uses these helpers:

```
forum_threads:          preserveParticipantRuntimeFields(incoming, existing)
forum_posts_by_thread:  mergeForumPostsByScope(incoming, existing)
forum_panels:           preserveParticipantRuntimeFields(incoming, existing)
panel_posts_by_id:      mergeForumPostsByScope(incoming, existing)
```

`projects` is still a direct replacement (no merge needed).

### Spec section needed

The entity sync spec should define:

- Merge strategy for posts: union by ID, incoming wins on conflicts
- Merge strategy for participant runtime fields: backend wins when frontend
  value is empty/zero
- The three runtime fields that are backend-authoritative:
  - `last_auto_reply_at` (set by orchestrator after successful reply)
  - `runtime_reply_count` (incremented by orchestrator)
  - `runtime_spend_usd` (accumulated by orchestrator)
- Frontend-authoritative fields (everything else: mode, model, intervals, etc.)

### Promotion packet

- Primary owner doc: Q master UI spec
- Secondary owner doc: `DOC6`
- Proposed amendment title: `Entity Sync Merge Semantics for Forum/Panel State`
- Carry-forward rule: do not remove until backend-authoritative runtime fields
  and union/merge rules are explicit in both the sync contract and the forum/panel
  owner doc

---

## 7. Frontend Tick Response Application

**Problem:** The frontend's orchestrator tick polling was fire-and-forget.
The backend returned updated entities (including new agent posts) in the
tick response, but the frontend discarded them. New agent posts were
invisible until the next full backend sync (every 20 seconds).

**Spec gap:** The spec did not define how the tick response should feed
back into the UI.

### What was changed

The tick `useEffect` now processes the response:

```jsx
const tickResult = await api.orchestratorTick();
if (tickResult?.entities && typeof tickResult.entities === "object") {
  startTransition(() => {
    applyQEntitySnapshot(tickResult.entities);
  });
}
```

Key behaviors:
- Uses `startTransition()` to avoid blocking the UI during state updates
- `applyQEntitySnapshot()` handles the suppress-sync mechanism
  (`qEntitySyncSuppressRef`) to prevent the entity sync useEffect from
  immediately syncing back the just-received data
- Cancellation flag prevents applying stale tick responses after unmount

### Spec section needed

- The tick response must be applied via `applyQEntitySnapshot()`
- The suppress mechanism prevents sync loops
- `startTransition` ensures non-blocking UI updates
- The cancelled flag pattern for cleanup

### Promotion packet

- Primary owner doc: Q master UI spec
- Secondary owner doc: `DOC6`
- Proposed amendment title: `Tick Response Application and Non-Blocking Entity Hydration`
- Carry-forward rule: keep this entry until tick responses, transition-wrapped
  application, and stale-response cleanup are captured in the UI/state owner docs

---

## 8. Orchestrator Debug Logging

**What was added:** `console.log` statements throughout the orchestrator flow,
all prefixed with `[orchestrator-tick]`. They are diagnostic; document them as a
debugging aid rather than removing them.

### Log points (11 total)

| Location | Message | Purpose |
|----------|---------|---------|
| `enqueueDueProactiveForumJobs` entry | Thread/panel count | Verify scan scope |
| Per-participant (observe skip) | Name, mode=observe | Why participant skipped |
| Per-participant (full status) | mode, interval_review, check_mode, intervalMs, last_reply_at, dueAt, due | Full participant state for debugging |
| Job enqueued | Participant, scope, dedupe key | Confirm job creation |
| Job deduped | Participant, scope, dedupe key | Explain why job skipped |
| `enqueueDueProactiveForumJobs` exit | Jobs enqueued, queue size | Summary |
| `executeForumAutoReplyJob` entry | Scope, participant, reason | Track execution |
| Scope inactive | Scope name | Why execution skipped |
| Reply posted | Participant, scope, post ID, timestamp | Confirm successful reply |
| `runForumOrchestratorTick` entry | Options or "(proactive scan)" | Track tick invocation |
| `runForumOrchestratorTick` exit | enqueued, queue_size, processed, failed | Tick summary |

### Spec section needed

Document the `[orchestrator-tick]` log prefix and the expected log output
for successful and unsuccessful tick cycles. These logs are essential for
diagnosing "agents not responding" issues.

### Promotion packet

- Primary owner doc: `DOC6`
- Secondary owner doc: operations/debug appendix if maintained separately
- Proposed amendment title: `Forum Orchestrator Debug Logging Contract`
- Carry-forward rule: do not remove until required log points and the shared
  `[orchestrator-tick]` prefix are written into the debugging/operations material

---

## 9. New Dependencies

**Spec gap:** The original spec did not list all required npm dependencies
for the frontend.

### Added to `apps/q-frontend/package.json`

| Package | Version | Purpose |
|---------|---------|---------|
| `react-markdown` | ^10.1.0 | Core Markdown rendering engine |
| `remark-gfm` | ^4.0.1 | GitHub Flavored Markdown (tables, task lists, strikethrough) |
| `remark-breaks` | ^4.0.0 | Treat single `\n` as `<br>` (critical for LLM output) |
| `rehype-highlight` | ^7.0.2 | Syntax highlighting for fenced code blocks |

> **Note:** These are installed at the monorepo root `node_modules/` via
> hoisting, not in `apps/q-frontend/node_modules/`. This is normal for
> the workspace setup but should be documented.

### Promotion packet

- Primary owner doc: Q master UI spec
- Secondary owner doc: workspace/dependency manifest if maintained separately
- Proposed amendment title: `Q Frontend Dependency Additions for Markdown Rendering`
- Carry-forward rule: keep this entry until these packages are explicitly listed
  in the frontend dependency/build contract

---

## Group C — Dependency/Build and Placement Map

## Summary: Where Each Fix Belongs in Specs

| Fix | Suggested Spec Location |
|-----|------------------------|
| MarkdownRenderer component | New "Shared Components" section or Q Dashboard UI spec |
| Whitespace normalization | Backend data normalization spec (alongside existing `normalizeWhitespace`) |
| Orchestrator tick system | Forum/Panel orchestrator spec (new section for proactive polling) |
| Mode gate definitions | Forum participant configuration spec |
| Dedupe key strategy | Forum orchestrator internals spec |
| Entity sync merge semantics | Entity sync protocol spec (new merge rules section) |
| Tick response application | Frontend state management spec (entity hydration section) |
| Debug logging | Operations/debugging appendix |
| Dependencies | Frontend build/dependency manifest |
| Backend-side proactive tick timer | Backend server lifecycle spec |
| Frontend entity sync refactor | Frontend state management spec |
| Queue processor concurrency fix | Forum orchestrator internals spec |
| Error/cost cap logging | Operations/debugging appendix |

---

## 10. Backend-Side Proactive Tick Timer

**Problem:** The proactive tick was originally driven by frontend `setInterval`
polling. When the user's browser tab is hidden (switched to another tab or
minimized), browsers throttle or pause timers. Chrome, for example, throttles
background-tab timers to roughly once per second, and to as little as once per
minute under intensive throttling, and may suspend a hidden tab entirely. As a
result, agents responded while the user was actively looking at the page
but stopped responding as soon as they switched away.

**Root cause:** Browser tab lifecycle constraints. A `setInterval` running in a
browser tab is not a reliable driver for backend scheduling.

**Spec gap:** The spec assumed the frontend would drive the polling loop.
The correct architecture is backend-driven scheduling with frontend
entity fetching for display purposes only.

### What was built

New server-side timer in the backend `start()` method:

```
FORUM_ORCHESTRATOR_TICK_INTERVAL_MS = 20_000 (20 seconds)

On server start:
  - setTimeout(runBackendProactiveTick, 5000)   // initial after 5s boot delay
  - setInterval(runBackendProactiveTick, 20000)  // then every 20s

On server stop:
  - clearInterval(forumOrchestratorTimer)
```

The `runBackendProactiveTick` function:
1. Reads the entity store from disk
2. Calls `enqueueDueProactiveForumJobs(store)` to find due participants
3. If any jobs were enqueued, calls `processForumAutoReplyQueue()` to process them
4. Errors are caught and logged but never crash the timer
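
The timer lifecycle and the four steps above can be sketched as follows. The dependency-injection shape (`TickDeps`), the boot-delay constant name, and the assumption that `enqueueDueProactiveForumJobs` returns the number of jobs enqueued are all hypothetical. The sketch also clears the boot `setTimeout` on stop, which the pseudocode above does not mention.

```typescript
const FORUM_ORCHESTRATOR_TICK_INTERVAL_MS = 20_000;
const FORUM_ORCHESTRATOR_BOOT_DELAY_MS = 5_000; // hypothetical constant name

interface TickDeps {
  readStore(): Promise<unknown>;
  enqueueDueProactiveForumJobs(store: unknown): number; // assumed to return the enqueued count
  processForumAutoReplyQueue(): Promise<void>;
}

async function runBackendProactiveTick(deps: TickDeps): Promise<void> {
  try {
    const store = await deps.readStore(); // 1. read entity store
    const enqueued = deps.enqueueDueProactiveForumJobs(store); // 2. find due participants
    if (enqueued > 0) await deps.processForumAutoReplyQueue(); // 3. process only if needed
  } catch (err) {
    // 4. catch and log so a failed tick never kills the timer
    console.error("[orchestrator-tick] backend proactive tick failed:", err);
  }
}

// Returns a cleanup function to call from the server's stop path.
function startForumOrchestratorTimer(deps: TickDeps): () => void {
  const bootTimer = setTimeout(() => void runBackendProactiveTick(deps), FORUM_ORCHESTRATOR_BOOT_DELAY_MS);
  const forumOrchestratorTimer = setInterval(
    () => void runBackendProactiveTick(deps),
    FORUM_ORCHESTRATOR_TICK_INTERVAL_MS,
  );
  return () => {
    clearTimeout(bootTimer);
    clearInterval(forumOrchestratorTimer);
  };
}
```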

### Key architectural principle

The backend is the **sole authority** for scheduling and executing agent responses.
The frontend only **reads** the results. This means:

- Agents respond on schedule even when no browser is open
- Agents respond on schedule even when the browser tab is in the background
- The frontend can poll at a relaxed rate for display purposes
- No frontend-to-backend coordination is needed for scheduling

### Spec section needed

- The 20-second backend tick interval as a server lifecycle timer
- The `forumOrchestratorTimer` variable and cleanup in `stop()`
- The 5-second initial boot delay before the first tick
- The error isolation (catch-and-log, never crash the timer)
- The architectural rule: backend schedules, frontend displays

### Promotion packet

- Primary owner doc: `DOC6`
- Proposed amendment title: `Backend-Side Proactive Tick Timer`
- Carry-forward rule: do not remove until the backend-owned scheduling timer,
  boot delay, and cleanup rules are explicitly captured in the owner doc

---

## 11. Frontend Entity Sync Architecture Refactor

**Problem:** The frontend's orchestrator tick `useEffect` originally:
1. Drove the orchestration by calling the tick endpoint (which triggered agent responses)
2. Had a hard `document.hidden` guard that completely blocked ticks on hidden tabs
3. Used `setInterval` which browsers throttle on background tabs

The result: agents appeared to respond when the user was looking at the page
but stopped as soon as they switched away. From the user's perspective,
"agents respond once but then stop."

### What was changed

The frontend tick was refactored from an **orchestration driver** to a
**display-only entity fetcher**:

**Before:**
```
setInterval(tick, 20000)  // drives orchestration
if (document.hidden) return  // stops ALL ticks when tab hidden
```

**After:**
```
setTimeout-based chain with adaptive intervals:
  - 20s when tab visible (responsive display updates)
  - 60s when tab hidden (light keep-alive for state freshness)
visibilitychange event listener:
  - Immediate fetch when user returns to tab
```

**Key behaviors:**
- Uses `setTimeout` chains instead of `setInterval` to avoid overlapping requests
- Each fetch completes before the next one is scheduled
- When the user switches back to the tab, they see fresh data immediately
- The tick endpoint still triggers the backend orchestrator, but the backend
  timer is the primary driver; the frontend call is supplementary
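
The `setTimeout` chain with adaptive intervals can be sketched independently of React. `startEntityPoller`, `nextPollDelayMs`, and the injected `isHidden` callback are hypothetical names; in the browser, `isHidden` would typically be `() => document.hidden`, and `refreshNow` would be called from a `visibilitychange` listener when the tab becomes visible.

```typescript
const VISIBLE_POLL_MS = 20_000;
const HIDDEN_POLL_MS = 60_000;

function nextPollDelayMs(hidden: boolean): number {
  return hidden ? HIDDEN_POLL_MS : VISIBLE_POLL_MS;
}

interface EntityPollerOptions {
  fetchEntities: () => Promise<void>; // e.g. call the tick endpoint and apply the snapshot
  isHidden: () => boolean; // e.g. () => document.hidden
}

function startEntityPoller(opts: EntityPollerOptions): { refreshNow(): void; cancel(): void } {
  let cancelled = false;
  let inFlight = false;
  let timer: ReturnType<typeof setTimeout> | undefined;

  const run = async (): Promise<void> => {
    if (cancelled || inFlight) return; // never overlap requests
    inFlight = true;
    if (timer !== undefined) clearTimeout(timer);
    try {
      await opts.fetchEntities();
    } catch (err) {
      console.error("entity fetch failed:", err);
    } finally {
      inFlight = false;
      // Schedule the next fetch only after this one has finished.
      if (!cancelled) timer = setTimeout(run, nextPollDelayMs(opts.isHidden()));
    }
  };

  void run();
  return {
    refreshNow: () => void run(),
    cancel: () => {
      cancelled = true;
      if (timer !== undefined) clearTimeout(timer);
    },
  };
}
```

Choosing the delay at scheduling time (rather than once at startup) is what makes the interval adapt: a tab that becomes hidden mid-cycle moves to the 60-second cadence on its next reschedule.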

### Spec section needed

- The adaptive interval strategy (20s visible / 60s hidden)
- The `visibilitychange` event listener for immediate refresh
- The `setTimeout` chain pattern (no overlapping requests)
- The distinction: backend drives scheduling, frontend drives display refresh
- The `startTransition()` wrapper for non-blocking entity application

### Promotion packet

- Primary owner doc: Q master UI spec
- Secondary owner doc: `DOC6`
- Proposed amendment title: `Frontend Entity Sync Refresh Architecture`
- Carry-forward rule: keep this entry until adaptive refresh intervals,
  visibility-return refresh, and non-overlapping timeout-chain behavior are
  captured in the UI/state owner docs

---

## 12. Queue Processor Concurrency Fix

**Problem:** The `processForumAutoReplyQueue` function had a concurrency
issue. When multiple callers (frontend tick + backend timer + user post handler)
called it simultaneously:
1. All callers would await the in-flight promise
2. When it resolved, ALL callers would fall through and create new runs
3. Multiple concurrent runs would execute against the same queue
4. While `splice()` is atomic in single-threaded JS, the `running` flag
   and `inFlight` promise would get corrupted

**Previous fix (incorrect):** "Fall through after await" — allowed all waiters
to start new runs after the current one finished.

### What was changed

The queue processor now re-checks both the queue and the `running` flag after waiting:

```
if (running && inFlight) {
    await inFlight;           // Wait for current run to finish
    if (queue.length === 0)   // If nothing left, no need for a new run
        return;
    if (running)              // If another waiter already started a run
        return;
}
// Start a new run (only one waiter reaches here)
```

This is safe because waiters resume one at a time: continuations waiting on the
same promise run as microtasks in registration order, and each one runs
synchronously until its next `await`. Provided the new run sets
`running = true` before its first `await`, the first waiter to resume claims the
run, and every later waiter sees `running = true` and returns.
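
A self-contained sketch of the check-after-wait pattern follows. It is an illustration under assumptions, not the actual `processForumAutoReplyQueue`: `processQueue`, `drainQueue`, `pendingJobs`, and the `runsStarted` counter are hypothetical, while `running` and `inFlight` mirror the names used above.

```typescript
// Sketch only: a simplified stand-in for the real queue processor.
type Job = () => Promise<void>;

const pendingJobs: Job[] = [];
let running = false;
let inFlight: Promise<void> | null = null;
let runsStarted = 0;   // instrumentation for this example only

async function drainQueue(): Promise<void> {
  try {
    while (pendingJobs.length > 0) {
      const job = pendingJobs.shift();   // the text above uses splice(); same effect here
      if (job === undefined) break;
      try {
        await job();
      } catch (err) {
        console.error("job failed", err);   // one failed job does not end the run
      }
    }
  } finally {
    running = false;   // cleared before the run's promise settles, so waiters see it
  }
}

async function processQueue(): Promise<void> {
  if (running && inFlight) {
    await inFlight;                          // wait for the current run to finish
    if (pendingJobs.length === 0) return;    // nothing left, no new run needed
    if (running) return;                     // another waiter already started a run
  }
  // No `await` between the checks above and this assignment: that is what
  // hands the next run to exactly one waiter.
  running = true;
  runsStarted += 1;
  inFlight = drainQueue();
  await inFlight;
}
```

If three callers invoke `processQueue()` while a run is in flight and two more jobs arrive just as that run settles, only the first waiter to resume starts the second run; the other waiter returns at the `running` check.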

### Spec section needed

- The concurrency model for `processForumAutoReplyQueue`
- The "check after wait" pattern to prevent duplicate runs
- The guarantee: at most ONE run executes at any given time
- The contract: jobs remain in the queue until processed; the backend timer
  ensures they will be picked up within 20 seconds

### Promotion packet

- Primary owner doc: `DOC6`
- Proposed amendment title: `Forum Auto-Reply Queue Concurrency Contract`
- Carry-forward rule: do not remove until the single-run guarantee and
  check-after-wait concurrency pattern are explicit in the queue processor spec

---

## 13. Orchestrator Error Logging and Cost Cap Logging

**Problem:** The `executeForumAutoReplyJob` function had a catch block that
stored the error in `forumOrchestratorState.lastError` but never logged it
to the console. Failures were completely silent. Additionally, cost cap blocks
were not logged, making it impossible to diagnose "agents stopped responding."

### What was added

| Location | Log Type | Message |
|----------|----------|---------|
| `executeForumAutoReplyJob` entry | `console.log` | Scope, participant, reason |
| Catch block (retriable) | `console.error` | Stack trace, retry details |
| Catch block (permanent failure) | `console.error` | Stack trace, permanent failure, removed from queue |
| Scope cost cap block | `console.log` | Participant, scope, cap amount, spend |
| Preflight cost limit block | `console.log` | Participant, scope, blocker, blocker scope |
| Successful reply | `console.log` | Participant, scope, post ID, timestamp |
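
The log points in the table could look roughly like the sketch below. Everything except the `[orchestrator-tick]` prefix is an assumption made for illustration: `executeJobWithLogging`, `JobInfo`, `CostCheck`, and the retry rule based on `attempt` and `maxAttempts` are hypothetical and do not describe the real `executeForumAutoReplyJob`.

```typescript
// Sketch only: hypothetical shapes; only the log prefix comes from this document.
interface JobInfo { scope: string; participant: string; reason: string; attempt: number; maxAttempts: number; }
interface CostCheck { blocked: boolean; cap: number; spend: number; }
interface JobLogger { log: (...args: unknown[]) => void; error: (...args: unknown[]) => void; }
type JobOutcome = "replied" | "cost_blocked" | "retry" | "failed";

const LOG_PREFIX = "[orchestrator-tick]";

async function executeJobWithLogging(
  job: JobInfo,
  checkCost: () => CostCheck,
  sendReply: () => Promise<{ postId: string }>,
  logger: JobLogger = console,
): Promise<JobOutcome> {
  const { scope, participant, reason } = job;
  logger.log(`${LOG_PREFIX} job start`, { scope, participant, reason });

  const cost = checkCost();
  if (cost.blocked) {
    // Cost cap blocks are logged so "agents stopped responding" can be diagnosed.
    logger.log(`${LOG_PREFIX} scope cost cap reached`, { participant, scope, cap: cost.cap, spend: cost.spend });
    return "cost_blocked";
  }

  try {
    const { postId } = await sendReply();
    logger.log(`${LOG_PREFIX} reply posted`, { participant, scope, postId, at: new Date().toISOString() });
    return "replied";
  } catch (err) {
    const stack = err instanceof Error ? err.stack : String(err);
    if (job.attempt < job.maxAttempts) {
      logger.error(`${LOG_PREFIX} job failed, will retry`, { participant, scope, attempt: job.attempt, stack });
      return "retry";
    }
    logger.error(`${LOG_PREFIX} job failed permanently, removed from queue`, { participant, scope, stack });
    return "failed";
  }
}
```

Passing the logger as a parameter is a choice made for this sketch so the log calls can be asserted in tests; the table above only requires that the messages reach the console.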

### Spec section needed

- Error logging requirements for all orchestrator job execution paths
- Cost cap blocks must be logged with enough context (participant, scope, cap
  amount, spend) to diagnose why an agent stopped responding
- The `[orchestrator-tick]` log prefix convention for all orchestrator logs

### Promotion packet

- Primary owner doc: `DOC6`
- Secondary owner doc: `DOC13` for cost-cap visibility
- Proposed amendment title: `Forum Orchestrator Error and Cost-Cap Logging Contract`
- Carry-forward rule: keep this entry until both error logging and cost-cap
  block visibility are represented in the owner-doc set

---

## Group D — Interactive Latency Architecture and Promotion Packet

## 14. Interactive Chat / Forum / Panel Latency and Composer Architecture

**Problem:** The previous build fixed several hot-path latency problems, but the
interactive surfaces still stopped short of a truthful low-latency architecture.
Chat used optimistic UI and bounded context assembly, yet still returned the
assistant reply through a blocking `fetch()` response. Forum/panel posting
persisted the post first, then waited for orchestrator follow-up work before
returning success. Forum and panel composers also used single-line `<input>`
controls, so longer drafts ran on horizontally instead of wrapping.

This section preserves both:

1. the low-latency patterns that actually worked in code, and
2. the architectural fixes still required in the rebuild so those gains are not
   lost or partially reimplemented.

### What was actually built

#### Frontend patterns that reduced perceived latency

The chat path in `apps/q-frontend/src/QDashV11.jsx` already implemented several
useful hot-path behaviors:

| Area | What was built | Why it mattered |
|------|----------------|-----------------|
| User send path | Immediate optimistic append of the user message with a local `message_id` / `client_message_id` | Send felt instant instead of waiting on backend round-trip |
| Pending state | Immediate local `thinkingState` | User got visible acknowledgement that the run started |
| Composer performance | Uncontrolled chat textarea + ref buffering + scheduled autoresize | Reduced typing lag from high-frequency controlled re-renders |
| Background state merges | `startTransition(...)` for lower-priority merges | Reduced UI contention while typing |
| Hydration suppression | Signature-based entity snapshot suppression and optimistic/local merge preservation | Prevented backend hydration from stomping active local state |
| History isolation | Per-conversation message scoping and merge logic | Prevented cross-thread churn |
| Forum/panel isolation | Separate `forumPostsByThread` and `panelPostsById` stores | Kept forum/panel traffic from polluting chat state |

#### Backend/runtime patterns that reduced actual latency

The backend and EC hot path also kept some important work bounded:

| Area | What was built | Why it mattered |
|------|----------------|-----------------|
| Gateway session reuse | Conversation/model keyed session reuse in `apps/q-backend/src/server.ts` | Avoided repeated cold-start model priming |
| Gateway context budget | `buildGatewayInlineContextMessage(...)` clipped sections and capped the total injected block | Prevented unbounded prompt growth on the hot path |
| Retry/backoff | Gateway transient retry and quota backoff state | Prevented tight retry storms and reduced user-visible flapping |
| Forum/panel history isolation | `/api/chat/send` supports `persist_history:false` for non-chat model calls | Prevented forum/panel model work from bloating chat history |
| EC hot-path assembly | `assembleContextCard(...)` in `apps/ec-service/src/server.ts` is deterministic, bounded, and parallelized | Kept slow or model-driven work out of the EC hot path |

### What was still wrong

#### 1. Chat was still final-only

The visible chat experience still waited for the full `/api/chat/send` response.
The assistant message was appended only after the request completed. This
meant:

- no true streaming,
- no visible time to first token (TTFT),
- no truthful stream gap handling,
- and no real active-run read model.

The old `EventFileStreamer` existed, but it was not the normative chat token
stream.

#### 2. STOP was cosmetic

The current chat stop button only clears `thinkingState`. It does not send a
backend abort and does not reflect true runtime abort state.

#### 3. Forum/panel post acceptance was blocked on orchestrator follow-up

`createForumUserPost(...)` in `apps/q-backend/src/server.ts` currently:

1. appends the user post,
2. writes the entity store,
3. awaits `runForumOrchestratorTick(...)`,
4. then returns the HTTP response.

That is the wrong contract for an interactive submit.

If the post is durably written but the tick is slow or throws:

- the user sees delay,
- the post may appear anyway,
- the composer may not clear,
- and the user may retry into duplication/confusion.

#### 4. Forum/panel composers were the wrong control type

The forum reply box and panel inject box are single-line `<input>` controls in
`apps/q-frontend/src/QDashV11.jsx`, so long text continues horizontally instead
of wrapping into a multiline drafting surface.

#### 5. Interactive state still competed with dashboard hydration

The old build added useful protections, but they were implementation accidents
instead of spec-owned obligations. A rebuild could easily regress typing
latency, transcript churn, or scroll jank unless the rules are written down.

### What must be preserved in the rebuild

These behaviors should become explicit spec obligations:

1. Optimistic local append for user-originated messages and posts.
2. Local IDs plus reconciliation against persisted IDs.
3. Immediate local pending/thinking state on accepted send.
4. Deterministic bounded context assembly.
5. Parallel hot-path prep where results are independent.
6. Gateway session reuse where session semantics allow it.
7. Retry/backoff/fallback logic that is honest about degraded mode.
8. Hydration suppression while typing or while an active stream is running.
9. Lower-priority background merges via `startTransition(...)` or equivalent.
10. Forum/panel scoped model work must not pollute ordinary chat history.
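
Items 1 and 2 can be illustrated with two pure functions. This is a hedged sketch: `PostRow`, `appendOptimistic`, and `reconcilePersisted` are hypothetical names, and the real stores may track more fields.

```typescript
// Sketch only: field and function names are hypothetical.
interface PostRow {
  id: string;         // canonical id once persisted, otherwise the local id
  clientId: string;   // local id generated at submit time
  text: string;
  state: "pending" | "persisted";
}

// Show the user's post immediately, before the backend has answered.
function appendOptimistic(rows: PostRow[], clientId: string, text: string): PostRow[] {
  const row: PostRow = { id: clientId, clientId, text, state: "pending" };
  return [...rows, row];
}

// Swap the optimistic row for the persisted one instead of appending a duplicate.
function reconcilePersisted(
  rows: PostRow[],
  persisted: { id: string; clientId: string; text: string },
): PostRow[] {
  const confirmed: PostRow = { ...persisted, state: "persisted" };
  const matches = (row: PostRow): boolean =>
    row.clientId === persisted.clientId || row.id === persisted.id;
  if (!rows.some(matches)) return [...rows, confirmed];   // e.g. a post from another client
  return rows.map((row) => (matches(row) ? confirmed : row));
}
```

Matching on either the local or the canonical ID makes reconciliation idempotent, so a later hydration snapshot carrying the same post does not duplicate the row.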

### Rebuild architecture required

#### A. Separate accepted from completed

For chat, forum post, and panel post, the system must distinguish:

- `accepted`
- `completed`
- `failed`

User-visible acceptance must be fast and must not wait on secondary work such
as orchestrator ticks, extraction, recommendation refresh, or other follow-up
automation.

#### B. Build true streaming, not blocking final-only chat

Required rebuild behavior:

1. Backend emits accepted, delta, completed, failed, and aborted events.
2. Frontend keeps a dedicated stream reducer for the active run.
3. UI batches delta application in small intervals, rather than re-rendering
   per token.
4. Transcript truth comes from stream state plus runtime read models, not from a
   broad dashboard snapshot feed.
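
A minimal sketch of items 2 and 3, assuming the five event kinds from item 1. The type and function names (`StreamEvent`, `RunView`, `reduceStream`, `coalesceDeltas`) are hypothetical. The UI is assumed to buffer incoming events and flush them on a short timer; the flush interval is left to the implementation and is not shown here.

```typescript
// Sketch only: hypothetical types for the active-run stream reducer.
type StreamEvent =
  | { kind: "accepted"; runId: string }
  | { kind: "delta"; runId: string; text: string }
  | { kind: "completed" | "failed" | "aborted"; runId: string };

type RunPhase = "idle" | "accepted" | "streaming" | "completed" | "failed" | "aborted";
interface RunView { runId: string | null; phase: RunPhase; text: string; }

const IDLE_RUN: RunView = { runId: null, phase: "idle", text: "" };

const isTerminal = (phase: RunPhase): boolean =>
  phase === "completed" || phase === "failed" || phase === "aborted";

function reduceStream(state: RunView, event: StreamEvent): RunView {
  if (event.kind === "accepted") {
    return { runId: event.runId, phase: "accepted", text: "" };
  }
  // Ignore events for other runs and anything that arrives after a terminal event.
  if (event.runId !== state.runId || isTerminal(state.phase)) return state;
  if (event.kind === "delta") {
    return { ...state, phase: "streaming", text: state.text + event.text };
  }
  return { ...state, phase: event.kind };
}

// Merge consecutive deltas of the same run so one flush causes one state update.
function coalesceDeltas(events: StreamEvent[]): StreamEvent[] {
  const out: StreamEvent[] = [];
  for (const ev of events) {
    const last = out[out.length - 1];
    if (ev.kind === "delta" && last !== undefined && last.kind === "delta" && last.runId === ev.runId) {
      out[out.length - 1] = { ...last, text: last.text + ev.text };
    } else {
      out.push(ev);
    }
  }
  return out;
}
```

On each flush the UI would apply `coalesceDeltas(buffer).reduce(reduceStream, current)`, so a burst of tokens costs one render rather than one per token.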

#### C. Use truthful abort state

Abort needs a real state machine:

- `abort_pending`
- `abort_verified`
- `abort_unknown`
- `abort_failed`

Do not clear the pending state and pretend the run stopped unless real abort
truth arrives.
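
One way to model this is a transition table. The four `abort_*` states come from the list above; the `running` state, the input names, and the specific transitions (for example, a timeout leading to `abort_unknown`) are assumptions made for illustration.

```typescript
// Sketch only: the transitions are assumed, not taken from an owner doc.
type AbortState = "running" | "abort_pending" | "abort_verified" | "abort_unknown" | "abort_failed";
type AbortInput = "stop_requested" | "abort_confirmed" | "abort_rejected" | "abort_timeout";

const ABORT_TRANSITIONS: Record<AbortState, Partial<Record<AbortInput, AbortState>>> = {
  running: { stop_requested: "abort_pending" },
  abort_pending: {
    abort_confirmed: "abort_verified",
    abort_rejected: "abort_failed",
    abort_timeout: "abort_unknown",
  },
  // A late backend answer can still resolve an unknown abort.
  abort_unknown: { abort_confirmed: "abort_verified", abort_rejected: "abort_failed" },
  abort_failed: { stop_requested: "abort_pending" },   // the user may try again
  abort_verified: {},
};

// Inputs that are not valid in the current state leave it unchanged.
function nextAbortState(state: AbortState, input: AbortInput): AbortState {
  return ABORT_TRANSITIONS[state][input] ?? state;
}

// The UI may present the run as stopped only once the abort is verified.
function showsRunAsStopped(state: AbortState): boolean {
  return state === "abort_verified";
}
```

With this shape, pressing STOP moves the UI to `abort_pending` and nothing else; the pending indicator stays until the backend confirms, rejects, or the wait times out.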

#### D. Make forum/panel posting follow the same low-latency contract

Required behavior:

1. Append a local optimistic post row immediately.
2. Clear the composer on `accepted`, not on downstream completion.
3. Return a fast acceptance response from the backend after durable post write.
4. Run orchestrator follow-up asynchronously after acceptance.
5. Reconcile later if secondary work fails.
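
Steps 3 to 5 might look like the following on the backend. This is a sketch under assumptions: `acceptPost` and its callback parameters are hypothetical and do not describe the current `createForumUserPost(...)`.

```typescript
// Sketch only: hypothetical handler shape for accept-then-follow-up.
interface AcceptedPost { status: "accepted"; postId: string; }

async function acceptPost(
  text: string,
  appendDurably: (text: string) => Promise<string>,            // resolves to the canonical post id
  runFollowUp: (postId: string) => Promise<void>,              // e.g. the orchestrator tick
  onFollowUpFailed: (postId: string, error: unknown) => void,  // record degraded follow-up
): Promise<AcceptedPost> {
  // Acceptance boundary: the durable write is the only thing the response waits for.
  const postId = await appendDurably(text);

  // Deliberately not awaited, so slow or failing follow-up cannot delay acceptance.
  void Promise.resolve()
    .then(() => runFollowUp(postId))
    .catch((error: unknown) => onFollowUpFailed(postId, error));

  return { status: "accepted", postId };
}
```

A failure in `runFollowUp` reaches `onFollowUpFailed` rather than the HTTP response, which matches the rule that follow-up problems are reconciled later instead of being reported as a failed submit.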

#### E. Replace single-line composers with autosizing multiline drafting

Chat, forum, and panel composers should all use autosizing multiline
`textarea` controls with:

- wrap by default,
- `Enter` to submit where appropriate,
- `Shift+Enter` for newline,
- IME-safe composition handling,
- and per-scope draft persistence.
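
The key-handling rules can be kept in a small pure function. `composerKeyAction` and its types are hypothetical; the `isComposing` flag and the legacy `keyCode` value 229 are standard browser signals that an IME composition is in progress.

```typescript
// Sketch only: a hypothetical helper for a textarea keydown handler.
interface ComposerKeyEvent {
  key: string;
  shiftKey: boolean;
  isComposing: boolean;
  keyCode?: number;   // legacy field; some browsers report 229 during IME composition
}

// "submit": the caller prevents the default action and sends the draft.
// "default": the caller does nothing and lets the textarea handle the key.
type ComposerAction = "submit" | "default";

function composerKeyAction(ev: ComposerKeyEvent): ComposerAction {
  if (ev.key !== "Enter") return "default";
  // During IME composition, Enter confirms the candidate text and must not submit.
  if (ev.isComposing || ev.keyCode === 229) return "default";
  // Shift+Enter falls through to the textarea, which inserts a newline.
  if (ev.shiftKey) return "default";
  return "submit";
}
```

Validation of empty drafts is left to the caller in this sketch, so the function only decides whether the key press is a submit gesture.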

#### F. Keep rendering cheap while a run is active

Required UI rules:

1. Stream as plain text first.
2. Finalize heavy Markdown rendering after completion.
3. Virtualize longer transcripts.
4. Preserve bottom-stickiness only when the user is already near the bottom.
5. Do not let background hydration rewrite the active transcript while the user
   is typing or while a stream is active.
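
Rule 4 reduces to a small calculation on scroll metrics. The helper names and the 64px threshold below are assumptions for illustration.

```typescript
// Sketch only: hypothetical helpers; the threshold value is an assumption.
interface ScrollMetrics { scrollTop: number; clientHeight: number; scrollHeight: number; }

function isNearBottom(m: ScrollMetrics, thresholdPx = 64): boolean {
  return m.scrollHeight - (m.scrollTop + m.clientHeight) <= thresholdPx;
}

// Decide the scroll position after content has been appended to the transcript.
function scrollTopAfterAppend(before: ScrollMetrics, newScrollHeight: number, thresholdPx = 64): number {
  return isNearBottom(before, thresholdPx)
    ? Math.max(0, newScrollHeight - before.clientHeight)   // stay pinned to the bottom
    : before.scrollTop;                                     // leave the reader where they are
}
```

The metrics must be captured before the new content is rendered; measuring afterwards would make every append look like the user had scrolled up.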

#### G. Measure latency as separate budgets

The rebuild should record at least:

- submit -> accepted
- accepted -> first token
- first token -> completed
- completed -> rendered

Without separate budgets, a report that "chat is slow" cannot be attributed to a
specific stage.
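
As a sketch, the four budgets can be derived from per-run timestamps. The `RunTimestamps` shape and the `latencyBudgets` helper are hypothetical; the first three metric names follow the `DOC13` amendment list later in this section, and `completed_to_rendered_ms` is an assumed name for the fourth budget.

```typescript
// Sketch only: hypothetical timestamp record, in epoch milliseconds.
interface RunTimestamps {
  submittedAt: number;
  acceptedAt?: number;
  firstTokenAt?: number;
  completedAt?: number;
  renderedAt?: number;
}

// A budget is null until both of its endpoints have been recorded.
const elapsedMs = (from?: number, to?: number): number | null =>
  from === undefined || to === undefined ? null : to - from;

function latencyBudgets(t: RunTimestamps) {
  return {
    submit_to_accepted_ms: elapsedMs(t.submittedAt, t.acceptedAt),
    accepted_to_first_token_ms: elapsedMs(t.acceptedAt, t.firstTokenAt),
    first_token_to_completed_ms: elapsedMs(t.firstTokenAt, t.completedAt),
    completed_to_rendered_ms: elapsedMs(t.completedAt, t.renderedAt),
  };
}
```

Reporting `null` for a stage that never happened, such as a run that failed before the first token, keeps incomplete runs from being counted as zero-latency ones.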

### Additional performance enhancements that should be specified now

These were not all fully built before, but they should be part of the rebuild
spec so the next implementation does not stop at optimistic append:

1. Dedicated `chat-store` / `chat-stream-store` instead of burying transcript
   logic inside one giant component.
2. Dedicated `forum-post-store` and `panel-post-store` with the same accepted /
   completed reconciliation pattern.
3. `composer-draft-store` keyed by conversation/thread/panel so drafts survive
   navigation.
4. Transcript virtualization / windowing for long conversations.
5. Revision-keyed caching of deterministic context artifacts so hot-path prep is
   not recomputed unnecessarily.
6. Attachment staging before dispatch so uploads do not sit inside the critical
   submit round-trip.
7. Background provider health/catalog refresh that stays off the send path.
8. A real stream adapter instead of treating broad dashboard polling as a chat
   transport.
9. A `markdown finalization` rule so code fences/tables are not re-parsed on
   every delta.
10. Scroll preservation rules that prevent jumpiness during background merges.
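
Item 5 can be illustrated with a cache that recomputes only when the source revision changes. `RevisionKeyedCache` is a hypothetical name, and the sketch assumes each context source exposes a monotonically changing revision number.

```typescript
// Sketch only: assumes every cached artifact has a key and a source revision.
class RevisionKeyedCache<T> {
  private entries = new Map<string, { revision: number; value: T }>();

  getOrCompute(key: string, revision: number, compute: () => T): T {
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.revision === revision) return hit.value;
    const value = compute();                      // deterministic by assumption
    this.entries.set(key, { revision, value });   // replaces any older revision
    return value;
  }
}
```

Because the cached artifacts are deterministic, a matching revision is sufficient proof that the cached value is still valid, so no time-based expiry is needed in this sketch.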

### Where this belongs in the rebuilt specs

| Owner doc | What must be added |
|-----------|--------------------|
| `DOC11` | Chat transport, streaming events, accepted/completed truth, abort lifecycle, latency metrics, stream gap handling |
| `DOC10` / Core orchestration sections | Accepted-vs-completed mutation semantics and async follow-up work after acceptance |
| EC Core rebuild spec | Deterministic bounded context assembly, parallel hot-path prep, cache boundaries, no hot-path LLM |
| `DOC6` | Forum/panel post acceptance semantics, optimistic posting, async orchestrator follow-up |
| Q master UI spec | Optimistic append, stream reducer, multiline autosizing composers, draft persistence, virtualization, markdown finalization, bottom-stick rules |
| `DOC13` | Latency and abort telemetry visibility |

### Spec section needed

- Interactive low-latency surface contract for chat, forum, and panel posting
- Accepted vs completed vs failed mutation semantics
- True streaming and abort state machines
- Composer contract: multiline, autosizing, draft persistence, input-clear rules
- Transcript virtualization and markdown-finalization rules
- Latency budget metrics and dashboards

### Owner-doc amendment packet

This section is a **staging packet**, not owner truth by itself. It should stay
in `DOC19` until the material is formally promoted into versioned owner-doc
revisions.

Do not delete this entry merely because the ideas are understood. Delete or
shrink it only after the owner-doc promotion pass is complete and verified.

#### Promotion map

| Owner doc | Proposed amendment title | What moves from this section |
|-----------|--------------------------|------------------------------|
| `DOC11` | `Interactive Low-Latency Transport and Streaming Truth` | accepted/completed runtime truth, stream event family, final-only honesty, abort truth, TTFT/latency fields |
| `DOC10` | `Interactive Mutation Low-Latency Closure` | accepted vs completed semantics, secondary work after acceptance, follow-up degradation semantics |
| EC Core rebuild spec | `Interactive Hot-Path and Post-Acceptance Follow-Up Contract` | bounded context assembly, no hot-path LLM, async follow-up queue, revision-keyed caches |
| `DOC6` | `Interactive Posting Latency Contract` | forum/panel acceptance boundary, optimistic local post, clear-on-accepted, multiline composer, follow-up status |
| `DOC13` | `Interactive Latency and Abort Visibility` | `submit -> accepted`, TTFT, total latency, abort-resolution metrics, partial usage on abort/fail |
| Q master UI spec | `Interactive Chat / Forum / Panel Low-Latency UI Contract` | optimistic append, stream reducer, markdown finalization, virtualization, draft persistence, scroll behavior |

#### Exact amendment contents by owner

##### `DOC11`

Add a runtime-owned section covering:

1. `accepted`, `streaming`, `completed`, `failed`, `aborted` truth states.
2. Required stream event family:
   - `gateway.chat.accepted`
   - `gateway.chat.stream.delta`
   - `gateway.chat.completed`
   - `gateway.chat.failed`
   - `gateway.chat.aborted`
3. Required timing fields:
   - `accepted_at`
   - `first_token_at`
   - `completed_at`
   - `ttft_ms`
   - `stream_duration_ms`
   - `latency_ms`
4. Rule that final-only surfaces are allowed only if they still expose
   `accepted` and terminal completion honestly.
5. Rule that STOP cannot claim success before abort truth arrives.

##### `DOC10`

Add an orchestration-owned section covering:

1. `accepted` vs `completed` vs `completed_degraded` vs `failed`.
2. Rule that orchestrator ticks, extraction, recommendation refresh,
   moderation side work, and similar follow-up jobs cannot block acceptance.
3. Interactive mutation lifecycle schema for:
   - `chat_dispatch`
   - `forum_post`
   - `panel_post`
4. Rule that follow-up failures after acceptance are rendered as degraded
   follow-up, not failed submit.

##### EC Core rebuild spec

Add a hot-path/backend-owned section covering:

1. Acceptance boundary for interactive actions.
2. Deterministic, bounded, parallel context assembly.
3. Explicit prohibition on hot-path LLM calls in EC context planning.
4. Async follow-up queue after accepted durable write.
5. Revision-keyed caching for deterministic hot-path artifacts.

##### `DOC6`

Add a panel/forum-owned section covering:

1. Acceptance boundary = durable post append + canonical post id.
2. Post routes return accepted immediately.
3. Orchestrator follow-up runs after acceptance.
4. Thread/panel composers are multiline autosizing `textarea` controls.
5. Forum/panel post rows support optimistic local append and later canonical
   reconciliation.
6. Follow-up failure is shown at the post level, not by leaving the composer
   uncleared.

##### `DOC13`

Add a telemetry/visibility section covering:

1. Interactive latency metrics:
   - `submit_to_accepted_ms`
   - `accepted_to_first_token_ms`
   - `first_token_to_completed_ms`
   - `submit_to_completed_ms`
   - `abort_resolution_ms`
2. Partial usage handling on fail/abort.
3. Engineering/runtime visibility requirements for TTFT, stream mode, total
   latency, and abort resolution.

##### Q master UI spec

Add a UI-owned section covering:

1. optimistic append for chat/forum/panel,
2. local-id to canonical-id reconciliation,
3. dedicated stream reducer,
4. multiline autosizing composers,
5. per-scope draft persistence,
6. clear on `accepted`,
7. markdown finalization after completion,
8. transcript virtualization/windowing,
9. scroll preservation and bottom-stick rules,
10. hydration suppression while typing or streaming.

#### Release-gate checklist for later promotion

When this entry is promoted into owner docs, the promotion is not complete
unless all of the following are true:

1. Chat send has a truthful accepted state even when the surface is final-only.
2. STOP/abort is real, not cosmetic.
3. Forum/panel post acceptance no longer waits on orchestrator follow-up.
4. Forum and panel composers are multiline and autosizing.
5. Follow-up failures after an accepted post do not leave the original draft in
   the composer.
6. TTFT and total interactive latency are measurable from runtime truth.
7. Transcript rendering rules prevent per-delta markdown thrash and long-DOM
   slowdown.

#### Carry-forward rule

Until the owner-doc promotion pass happens:

1. `DOC19` remains the preserved source for this latency package.
2. Owner docs should be treated as not yet amended.
3. Future rebuild work should reference this section, not assume the owner docs
   already contain it.