ELNOR REPO READER TEXT MIRROR
Original path: Current Specs/Connector and Integration Specs/SPOTIFY_INTEGRATION_PART_2_SCHEDULED_INTAKE.md
Source repo: /Users/OpenClaw1/Elnor/Elnor Specs
Git branch: main
Git commit: dbaa25962edc11ab30e8d4ca1715f9ae5bf77331
Generated: 2026-06-09T01:23:58.539Z

---

# Spotify Integration - Part 2: Scheduled Intake & Knowledge Graph (Backend Required)

**Date:** 2026-04-11
**Status:** Spec for future build; requires EC, DOC3 extraction pipeline, DOC72 write path, DOC23 task scheduler
**Scope:** Automated listening data capture, preference extraction, pattern learning, proactive music intelligence
**Prerequisites:** Part 1 complete (MCP server running, Spotify connected), backend infrastructure operational
**Depends on:** DOC3 R11.3 (KnowledgeExtractionBundle), DOC72 R5.6+ (entity graph write path), DOC23 (task scheduler), DOC8 (nightly dream cycle)

---

## 1. What This Adds Beyond Part 1

Part 1 gives Elnor on-demand access to Spotify: ask and Elnor checks. Part 2 gives Elnor continuous, ambient understanding of your music behavior without being asked:

| Capability | Part 1 (On-Demand) | Part 2 (Scheduled Intake) |
|-----------|-------------------|--------------------------|
| "What am I listening to now?" | ✅ Calls API on request | ✅ Same |
| "What have I listened to this week?" | ✅ Calls API, gets last 50 tracks | ✅ Has full week's history in DOC72 |
| "What kind of music do I like?" | ❌ No memory across sessions | ✅ Rich preference model in entity graph |
| "Play something for trial prep" | ⚠️ Generic search, no personal context | ✅ Knows what you played during past trial prep |
| "My listening changed this month" | ❌ Cannot compare over time | ✅ Trend analysis from stored history |
| "Make me a playlist like my Wednesday focus sessions" | ❌ No pattern data | ✅ Temporal listening patterns stored |

---

## 2. Architecture Overview

```
┌─────────────────────────────────────────────────────────────────────────┐
│                          SCHEDULED INTAKE FLOW                          │
│                                                                         │
│  DOC23 Task Scheduler                                                   │
│  ┌───────────────────┐                                                  │
│  │ "Spotify Intake"  │ ── runs weekly (configurable) ──┐                │
│  │ Cron: Sun 3:00 AM │                                 │                │
│  └───────────────────┘                                 ▼                │
│                                                 ┌──────────────┐        │
│                                                 │ Spotify MCP  │        │
│                                                 │ Server       │        │
│                                                 │              │        │
│                                                 │ Fetch:       │        │
│                                                 │ - recently   │        │
│                                                 │   played     │        │
│                                                 │ - top tracks │        │
│                                                 │ - top artists│        │
│                                                 │ - playlists  │        │
│                                                 └──────┬───────┘        │
│                                                        │                │
│                                                        ▼                │
│                                                 ┌──────────────────┐    │
│                                                 │ DOC3 Extraction  │    │
│                                                 │ Pipeline         │    │
│                                                 │                  │    │
│                                                 │ LLM extracts:    │    │
│                                                 │ - entities       │    │
│                                                 │ - preferences    │    │
│                                                 │ - patterns       │    │
│                                                 │ - associations   │    │
│                                                 └──────┬───────────┘    │
│                                                        │                │
│                                                        ▼                │
│                                                 ┌────────────────────┐  │
│                                                 │ DOC72 Entity Graph │  │
│                                                 │                    │  │
│                                                 │ Stores:            │  │
│                                                 │ - artist nodes     │  │
│                                                 │ - genre nodes      │  │
│                                                 │ - preference nodes │  │
│                                                 │ - pattern nodes    │  │
│                                                 │ - association nodes│  │
│                                                 └────────────────────┘  │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
```

---

## 3. DOC23 Task Definition: Spotify Intake

### 3.1 Task Registration

Register a recurring task in the DOC23 task system:

```typescript
const spotifyIntakeTask: TaskDefinition = {
  task_id: "spotify_weekly_intake",
  display_name: "Spotify Listening Intake",
  description:
    "Fetches recent listening data from Spotify and extracts preferences, patterns, and entities into the knowledge graph.",
  category: "system_maintenance",
  schedule: {
    type: "cron",
    expression: "0 3 * * 0", // Every Sunday at 3:00 AM
    timezone: "America/Los_Angeles",
  },
  trigger: "scheduled", // Also triggerable manually from Q
  agent_id: "elnor",
  priority: "low", // Background task, non-urgent
  estimated_duration_seconds: 120,
  max_retries: 2,
  retry_delay_seconds: 300,

  // Task steps (DOC23 module chain)
  steps: [
    {
      step_id: "fetch_recent",
      tool: "spotify_get_recently_played",
      params: { limit: 50 },
      output_key: "recent_tracks",
    },
    {
      step_id: "fetch_top_tracks_short",
      tool: "spotify_get_top_tracks",
      params: { time_range: "short_term", limit: 50 },
      output_key: "top_tracks_short",
    },
    {
      step_id: "fetch_top_tracks_medium",
      tool: "spotify_get_top_tracks",
      params: { time_range: "medium_term", limit: 50 },
      output_key: "top_tracks_medium",
    },
    {
      step_id: "fetch_top_artists_short",
      tool: "spotify_get_top_artists",
      params: { time_range: "short_term", limit: 30 },
      output_key: "top_artists_short",
    },
    {
      step_id: "fetch_top_artists_medium",
      tool: "spotify_get_top_artists",
      params: { time_range: "medium_term", limit: 30 },
      output_key: "top_artists_medium",
    },
    {
      step_id: "fetch_playlists",
      tool: "spotify_get_playlists",
      params: { limit: 50 },
      output_key: "playlists",
    },
    {
      step_id: "extract_knowledge",
      tool: "doc3_extract_knowledge",
      params: {
        source_type: "spotify_intake",
        // All fetched data flows in as context
      },
      output_key: "extraction_bundle",
    },
    {
      step_id: "write_to_graph",
      tool: "doc72_write_nodes",
      params: {
        // Write extracted nodes to entity graph
      },
    },
  ],
};
```

### 3.2 Schedule Options

Configurable in Q Settings:

```
Settings > Integrations > Spotify > Listening Intelligence
├─ Scheduled intake: ◉ Enabled  ○ Disabled
├─ Frequency
│   ○ Daily (3:00 AM)
│   ● Weekly (Sunday 3:00 AM) - recommended
│   ○ Monthly (1st of month, 3:00 AM)
├─ [Run now] - manual trigger
└─ Last run: Sun Apr 6, 3:01 AM - 142 tracks processed, 8 new entities, 3 preference updates
```

### 3.3 Manual Trigger

The user can trigger the intake manually from:

- Q Settings > Spotify > [Run now]
- Chat: "Elnor, update your knowledge of my music"
- Floating Palette command: "Sync Spotify data"

---

## 4. DOC3 Extraction Pipeline: Spotify Data

### 4.1 KnowledgeExtractionBundle for Spotify

The fetched Spotify data is packaged as a `KnowledgeExtractionBundle` (DOC3 R11.3) and sent through the standard extraction pipeline. The LLM receives the raw data and extracts structured knowledge.

**Extraction prompt (system-level, appended to DOC3 extraction instructions):**

```
You are analyzing Spotify listening data for knowledge extraction.

Extract:

1. ENTITIES: Artists, albums, songs, genres, and playlists that appear frequently
   or are explicitly saved/liked. Include Spotify URIs for precise identification.

2. PREFERENCES: Infer music preferences from listening patterns.
   - Genre preferences (with strength: strong/moderate/weak)
   - Artist affinity (frequency-based)
   - Tempo/energy preferences (from audio features if available)
   - Explicit dislikes (genres/artists never appearing despite being popular)

3. TEMPORAL PATTERNS: When does the user listen to what?
   - Time-of-day patterns (morning music vs evening music)
   - Day-of-week patterns (workday vs weekend)
   - Activity associations (if inferable from playlist names or listening context)

4. CHANGES: Compare with previous intake data (if provided). What's new?
   - New artists discovered
   - Genres gaining or losing share
   - Playlist additions/removals
   - Listening volume changes (more or less music overall)

5. ASSOCIATIONS: Link music entities to other known entities in the user's world.
   - Playlist names referencing cases, projects, or activities
   - Temporal correlation with known calendar events or work periods

Output as a KnowledgeExtractionBundle with typed nodes.
```

### 4.2 Input Data Shape

The extraction pipeline receives this data from the MCP fetches:

```typescript
interface SpotifyIntakeData {
  // From spotify_get_recently_played
  recent_tracks: {
    track: {
      name: string;
      uri: string;
      artists: { name: string; uri: string }[];
      album: { name: string; uri: string };
    };
    played_at: string; // ISO timestamp
  }[];

  // From spotify_get_top_tracks (short + medium term)
  top_tracks_short: { name: string; uri: string; artists: { name: string }[]; popularity: number }[];
  top_tracks_medium: { name: string; uri: string; artists: { name: string }[]; popularity: number }[];

  // From spotify_get_top_artists (short + medium term)
  top_artists_short: { name: string; uri: string; genres: string[]; popularity: number }[];
  top_artists_medium: { name: string; uri: string; genres: string[]; popularity: number }[];

  // From spotify_get_playlists
  playlists: { name: string; uri: string; track_count: number; public: boolean; description: string }[];

  // Previous intake summary (for change detection)
  previous_intake_summary?: string;
}
```

### 4.3 LLM Cost Estimate

The extraction payload is small:

- 50 recent tracks × ~100 tokens each = ~5,000 tokens
- 100 top tracks (50 short-term + 50 medium-term) × ~60 tokens each = ~6,000 tokens
- 60 top artists (30 short-term + 30 medium-term) × ~80 tokens each = ~4,800 tokens
- 50 playlists × ~40 tokens each = ~2,000 tokens
- Extraction prompt: ~500 tokens
- Output: ~2,000-4,000 tokens

**Total per run: ~20,000 tokens input (18,300 by the figures above, rounded up) + ~3,000 tokens output.**

At Gemini 2.5 Pro rates, that's roughly $0.02-0.05 per weekly run. Negligible.

---

## 5. DOC72 Entity Graph: Music Knowledge Nodes

### 5.1 Node Types for Music Data

All music data maps to DOC72's existing 10 canonical node types. No new node types needed.

| DOC72 Node Type | Music Usage | Example |
|----------------|-------------|---------|
| `entity` | Artists, albums, songs, playlists, genres | `{type: "entity", subtype: "music_artist", name: "John Coltrane", spotify_uri: "spotify:artist:..."}` |
| `preference` | Music taste declarations | `{type: "preference", domain: "music", subject: "jazz_fusion", valence: "positive", strength: 0.85}` |
| `observation` | Listening events and patterns | `{type: "observation", domain: "music", content: "Listened to ambient electronic 80% of morning sessions in March 2026"}` |
| `association` | Links between music and other life entities | `{type: "association", entity_a_ref: "henderson_trial_prep", entity_b_ref: "ambient_electronic", relation: "activity_music"}` |
| `fact` | Stable music facts | `{type: "fact", domain: "music", content: "Will's Spotify account has 47 playlists, 1,200 saved tracks"}` |
| `procedure` | Learned music workflows | `{type: "procedure", trigger: "focus_music_request", steps: "1. Check DOC72 for focus genre preference 2. Search Spotify 3. Play"}` |

### 5.2 Entity Schemas

```typescript
// Artist entity
interface MusicArtistNode {
  node_id: string;
  type: "entity";
  subtype: "music_artist";
  name: string;
  spotify_uri: string;
  genres: string[];
  affinity_score: number;      // 0-1, based on listening frequency
  first_observed: string;      // ISO date
  last_observed: string;       // ISO date
  play_count_estimate: number; // Approximate from intake data
  principal_id: "will";
  scope: "personal";
}

// Genre preference
interface MusicPreferenceNode {
  node_id: string;
  type: "preference";
  domain: "music";
  subject: string;        // "jazz", "ambient_electronic", "classic_rock"
  valence: "positive" | "negative" | "neutral";
  strength: number;       // 0-1, Beta distribution confidence
  context?: string;       // "for_focus", "for_relaxation", "morning", "weekend"
  evidence_count: number; // How many data points support this
  last_updated: string;
  principal_id: "will";
  scope: "personal";
}

// Temporal pattern
interface MusicPatternNode {
  node_id: string;
  type: "observation";
  domain: "music_pattern";
  pattern_type: "temporal" | "activity" | "mood" | "trend";
  content: string; // Natural language description
  data: {
    time_period?: string;                        // "morning", "evening", "weekend"
    genre_distribution?: Record<string, number>; // {"jazz": 0.4, "ambient": 0.3, ...}
    trend_direction?: "increasing" | "decreasing" | "stable";
  };
  observed_period: string; // "2026-W14" or "2026-03"
  principal_id: "will";
  scope: "personal";
}

// Activity association
interface MusicAssociationNode {
  node_id: string;
  type: "association";
  entity_a_ref: string; // DOC72 node_id of activity/case/project
  entity_b_ref: string; // DOC72 node_id of genre/playlist/artist
  relation: "activity_music" | "case_playlist" | "mood_genre";
  strength: number;     // 0-1
  evidence: string;     // "Playlist 'Henderson Prep' played during Henderson case work"
  principal_id: "will";
  scope: "personal";
}
```

### 5.3 Deduplication

The entity graph must deduplicate across intakes.
An artist node for "John Coltrane" created in week 1 should be UPDATED in week 2, not duplicated.

**Dedup strategy:**

- `spotify_uri` is the unique key for Spotify entities (artists, tracks, albums, playlists)
- On each intake, check if a node with that `spotify_uri` already exists
- If yes: update `affinity_score`, `last_observed`, `play_count_estimate`
- If no: create new node
- Preference nodes deduplicate on `{domain, subject, context}` tuple
- Pattern nodes are temporal: each intake period gets its own observation, building a time series

### 5.4 Confidence Scoring

DOC72 uses Beta distribution confidence scoring. For music preferences:

- A genre that appears in 90% of recent plays with 200+ data points → high confidence (α=180, β=20, mean 0.90)
- A genre that appeared 3 times last week → low confidence (α=3, β=47, mean 0.06)
- Confidence naturally decays if a genre stops appearing in subsequent intakes

This means Elnor can express uncertainty: "You've been listening to a lot of classical lately, but I'm not sure if it's a lasting preference or just a phase."

---

## 6. Nightly Dream Cycle Integration (DOC8)

### 6.1 Music in the Dream Cycle

DOC8's nightly dream cycle runs lightweight consolidation tasks.
Music knowledge participates:

**Consolidation tasks:**

- Merge redundant artist/genre nodes (e.g., "electronic" and "electronica" → single node)
- Decay affinity scores for artists not listened to in 30+ days
- Promote provisional preferences to confirmed (if evidence_count > threshold)
- Generate weekly/monthly listening summary nodes
- Detect significant changes: "Listening shifted from jazz to classical this month"

**Weekly consolidation (heavier):**

- Compare current top artists/tracks with previous month
- Generate trend analysis: "New discovery: Nils Frahm (first appeared 2 weeks ago, now in top 10)"
- Cross-reference with calendar: "Classical listening spiked during the week of the Henderson motion deadline"

### 6.2 Dream Cycle Task Definition

```typescript
const musicDreamCycleTask: DreamCycleTask = {
  task_id: "music_knowledge_consolidation",
  frequency: "nightly",
  priority: "low",
  steps: [
    "decay_stale_affinities",          // Reduce affinity for artists not seen in 30+ days
    "merge_redundant_genres",          // Deduplicate genre nodes
    "promote_provisional_preferences", // Move from provisional to confirmed
    "generate_period_summary",         // Create observation node for the day/week
  ],
  weekly_extra_steps: [
    "trend_analysis",               // Compare with last month
    "calendar_cross_reference",     // Correlate with DOC72 calendar entities
    "generate_weekly_digest_entry", // Add music section to weekly digest
  ],
};
```

---

## 7. What Elnor Can Do With This Knowledge

Once the scheduled intake is running and DOC72 has music data, Elnor can:

### 7.1 Proactive Recommendations

- "You usually listen to ambient during morning focus sessions. Want me to put something on?" (based on temporal pattern nodes)
- "You haven't listened to Coltrane in a while. Want me to queue up A Love Supreme?"
  (based on decaying affinity + historical high affinity)

### 7.2 Context-Aware Playback

- "Play something for trial prep" → Elnor checks DOC72 for association nodes linking trial prep to music → finds "ambient electronic" pattern → searches Spotify → plays
- "Play what I usually listen to on Sunday mornings" → temporal pattern lookup → genre match → play

### 7.3 Trend Awareness

- "How has my music taste changed this year?" → Elnor queries pattern nodes across months → generates trend narrative
- "I feel like I've been in a rut musically" → Elnor can suggest genres/artists outside your recent patterns

### 7.4 Cross-Domain Intelligence

- Elnor notices you play aggressive music before deadlines and calm music after → stores this as a mood/activity pattern
- Elnor can correlate music changes with case milestones: "Your listening got more intense during the Paramount expert discovery phase"

### 7.5 Playlist Intelligence

- "Make me a playlist of my top discoveries from the last 3 months" → Elnor queries DOC72 for artists with `first_observed` in last 90 days + high affinity → creates Spotify playlist
- "What playlist was I playing during the Henderson depo prep?" → DOC72 association lookup → finds playlist entity → plays it

---

## 8. Settings (Backend Section)

Additions to the Spotify settings from Part 1:

```
Settings > Integrations > Spotify > Listening Intelligence
├─ Knowledge capture
│   ├─ Scheduled intake: ◉ Enabled  ○ Disabled
│   ├─ Frequency: ○ Daily  ● Weekly  ○ Monthly
│   ├─ Include in nightly dream cycle: ☑ Yes
│   ├─ [Run now]
│   └─ Last run: Sun Apr 6, 3:01 AM - 142 tracks, 8 entities, 3 preferences
├─ What Elnor remembers
│   ├─ ☑ Artists and genres I listen to
│   ├─ ☑ Listening patterns (time of day, day of week)
│   ├─ ☑ Playlist contents and changes
│   ├─ ☑ Music-activity associations (e.g., trial prep → ambient)
│   └─ ☐ Exact play counts and timestamps (detailed, more storage)
├─ Proactive suggestions
│   ├─ ☑ Offer to play music based on time/activity context
│   └─ ☑ Mention music trends in weekly digest
└─ Data management
    ├─ Music entities in knowledge graph: 247 nodes
    ├─ [View music knowledge] - opens DOC72 filtered to music domain
    └─ [Clear all music knowledge] - removes all music nodes from graph
```

---

## 9. Cross-Document Obligations

### 9.1 DOC3 (Semantic Skill Learning)

- Register `spotify_intake` as a recognized `source_type` in the `KnowledgeExtractionBundle` pipeline
- Add music-domain extraction instructions to the skill library (§4.1 prompt above)
- The conversational learning path (Part 1 §5.2) already uses DOC3's standard extraction, so no changes are needed

### 9.2 DOC72 (Entity Graph)

- No new node types needed: all music data fits existing canonical types
- Add `spotify_uri` as a recognized external identifier field on entity nodes (alongside any existing external ID fields)
- Add `domain: "music"` as a recognized domain tag for filtering/querying
- Deduplication logic must handle `spotify_uri` as a unique key (§5.3)
- Beta confidence scoring applies unchanged (§5.4)

### 9.3 DOC23 (Task System)

- Register `spotify_weekly_intake` as a system task template
- Support `cron` schedule type (if not already supported)
- The task should appear in the Tasks page as a system maintenance
  task, not a user-created task
- Manual trigger via "Run now" button or chat command

### 9.4 DOC8 (Dream Cycle)

- Register `music_knowledge_consolidation` as a dream cycle participant
- Nightly: affinity decay, dedup, promotion
- Weekly: trend analysis, calendar cross-reference, digest entry

### 9.5 DOC24 (Knowledge Delivery)

- Music knowledge nodes participate in standard DOC24 retrieval (RRF, three-lane retrieval)
- When Elnor receives a music-related query, DOC24's semantic routing should include music domain nodes in the retrieval context
- The `inspect_knowledge_summary` tool (GBrain proposal) should include a music section when queried: "What does Elnor know about my music?"

---

## 10. Implementation Sequence

```
 1. Part 1 complete (MCP server running, web player working)   ← prerequisite
 2. EC + DOC72 write path operational                          ← prerequisite
 3. DOC3 extraction pipeline operational                       ← prerequisite
 4. Register spotify_intake source_type in DOC3                ← 30 min
 5. Create SpotifyIntakeService in EC                          ← 2-3 hours
    - Fetches data via MCP tools
    - Packages as KnowledgeExtractionBundle
    - Sends through DOC3 pipeline
    - Writes results to DOC72
 6. Register task in DOC23                                     ← 30 min
 7. Add music consolidation to DOC8 dream cycle                ← 1 hour
 8. Add Listening Intelligence settings to Q                   ← 1 hour
 9. Test end-to-end: scheduled run → extraction → graph write  ← 1-2 hours
10. Test proactive recommendations: "play something for focus" ← testing
```

**Total estimated build time (after prerequisites):** 6-8 hours
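
Step 5 ends by writing results to DOC72, and that write has to respect the dedup rule in §5.3 (update on a known `spotify_uri`, create otherwise). The following is a minimal sketch of that upsert, not the DOC72 write path itself. It assumes an in-memory `Map` as a stand-in for the graph store. `ArtistObservation`, `StoredArtist`, `upsertArtists`, the `plays` field, and the placeholder URI are hypothetical names introduced for illustration, and accumulating `play_count_estimate` across intakes is an assumption the spec does not state.

```typescript
// Hypothetical shapes; only the field names shared with §5.2 come from the spec.
interface ArtistObservation {
  name: string;
  spotify_uri: string;
  genres: string[];
  affinity_score: number; // 0-1
  plays: number;          // plays seen in this intake (assumed field)
}

interface StoredArtist {
  name: string;
  spotify_uri: string;
  genres: string[];
  affinity_score: number;
  first_observed: string; // ISO date
  last_observed: string;  // ISO date
  play_count_estimate: number;
}

interface UpsertResult {
  created: number;
  updated: number;
}

// Upsert one intake's artists into a Map keyed by spotify_uri.
// The Map stands in for the DOC72 graph store (assumption).
function upsertArtists(
  graph: Map<string, StoredArtist>,
  incoming: ArtistObservation[],
  intakeDate: string,
): UpsertResult {
  let created = 0;
  let updated = 0;
  for (const obs of incoming) {
    const existing = graph.get(obs.spotify_uri);
    if (existing) {
      // Known URI: refresh the mutable fields, keep first_observed as is.
      existing.affinity_score = obs.affinity_score;
      existing.last_observed = intakeDate;
      existing.play_count_estimate += obs.plays; // assumed: counts accumulate
      updated++;
    } else {
      // Unknown URI: create a new node.
      graph.set(obs.spotify_uri, {
        name: obs.name,
        spotify_uri: obs.spotify_uri,
        genres: obs.genres,
        affinity_score: obs.affinity_score,
        first_observed: intakeDate,
        last_observed: intakeDate,
        play_count_estimate: obs.plays,
      });
      created++;
    }
  }
  return { created, updated };
}

// Usage: two weekly intakes that both contain the same artist.
const graph = new Map<string, StoredArtist>();
const coltrane: ArtistObservation = {
  name: "John Coltrane",
  spotify_uri: "spotify:artist:example-coltrane", // placeholder, not a real URI
  genres: ["jazz"],
  affinity_score: 0.6,
  plays: 12,
};
const week1 = upsertArtists(graph, [coltrane], "2026-04-05");
const week2 = upsertArtists(graph, [{ ...coltrane, affinity_score: 0.7, plays: 8 }], "2026-04-12");
console.log(week1, week2, graph.size);
```

In this sketch the second intake updates the existing node instead of adding a second one, so the graph still holds a single Coltrane entry after two runs. A real implementation would apply the same check inside the DOC72 write path and would need equivalent keys for preference nodes (`{domain, subject, context}`).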