# Issue Triage — 2026-01-07 (elizaOS)

## 1) Priority Issues (Detailed Triage)

### A. “Anthropic + MCP server fails with `No handler found for delegate type: TEXT_EMBEDDING` (requires OpenAI key + plugin ordering workaround)” — **Discord-reported (no GitHub issue yet)**
- **Current Status:** Reported on Discord (2026-01-05). A workaround exists: set any OpenAI key and load the OpenAI plugin after the Anthropic plugin. No tracked GitHub issue identified in provided data.
- **Impact Assessment**
  - **User Impact:** **High** (affects any Anthropic-only setup that uses MCP or embeddings)
  - **Functional Impact:** **Yes** (blocks embedding flow; can block agent/tool flows that rely on embeddings)
  - **Brand Impact:** **High** (users perceive “Anthropic integration is broken / requires fake keys”)
- **Technical Classification**
  - **Category:** Bug / UX
  - **Component:** Model Integration + Plugin System (embedding delegate routing), MCP server integration path
  - **Complexity:** **Moderate effort**
- **Resource Requirements**
  - **Required Expertise:** Provider/plugin routing, embedding pipeline, MCP server integration, TypeScript
  - **Dependencies:** Clarify expected behavior when no TEXT_EMBEDDING provider is configured; decide fallback policy (error vs. auto-disable features vs. alternate provider)
  - **Estimated Effort:** **3/5**
- **Recommended Priority:** **P0**
- **Actionable Next Steps**
  1. **Open GitHub issue** in `elizaos/eliza`: document repro steps (Anthropic-only config + MCP server), logs, and current workaround.
  2. Implement **explicit embedding-provider resolution**:
     - If no TEXT_EMBEDDING handler exists, either (a) select a default embedding provider if configured, or (b) fail fast with a **clear actionable error** (which env var / plugin needed).
  3. Add **config validation** at startup: detect “Anthropic configured but no embedding provider configured” and warn/error before runtime.
  4. Add a small doc snippet: “Using Anthropic requires an embedding provider (OpenAI or other) unless you disable features X/Y.”
- **Potential Assignees**
  - **standujar** (messaging/transport + core integration experience)
  - **wtfsayo** (runtime robustness + tests)
  - **madjin** (cross-system integration, if available)
  - (Support/verification) **Stan ⚡ / sayonara** (reported workaround; can validate fix)
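
A minimal sketch of the explicit resolution described in step 2, assuming each plugin can declare which delegate types it handles. `PluginSketch`, `resolveDelegate`, and `MissingDelegateError` are hypothetical names for illustration, not the elizaOS API; the point is only that lookup goes by declared capability rather than plugin load order, and that a missing handler produces an actionable message.

```typescript
// Hypothetical types for illustration; not the actual elizaOS plugin interface.
type DelegateType = "TEXT_EMBEDDING" | "TEXT_LARGE" | "TEXT_SMALL";

interface PluginSketch {
  name: string;
  // Delegate types this plugin declares it can handle.
  handles: DelegateType[];
}

class MissingDelegateError extends Error {
  constructor(delegate: DelegateType, loadedPlugins: string[]) {
    super(
      `No loaded plugin handles ${delegate}. ` +
        `Loaded plugins: ${loadedPlugins.join(", ") || "(none)"}. ` +
        `Add a plugin that provides ${delegate} or disable the features that need it.`
    );
    this.name = "MissingDelegateError";
  }
}

// Resolve a handler by declared capability, independent of plugin order.
function resolveDelegate(
  plugins: PluginSketch[],
  delegate: DelegateType,
  preferred?: string
): PluginSketch {
  const candidates = plugins.filter((p) => p.handles.indexOf(delegate) !== -1);
  if (candidates.length === 0) {
    // Fail fast with a message that says what is missing and how to fix it.
    throw new MissingDelegateError(delegate, plugins.map((p) => p.name));
  }
  // An explicitly configured provider wins; otherwise use the first capable plugin.
  const explicit = candidates.filter((p) => p.name === preferred)[0];
  return explicit ?? candidates[0];
}
```

With a resolver of this shape, an Anthropic-only configuration would fail at the first embedding request with a message naming the missing capability, instead of requiring a placeholder OpenAI key.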

---

### B. “Claude code review functionality appears to be failing” — **Discord-reported; partial CI remediation landed**
- **Current Status:** Reported on Discord (2026-01-05). Recent CI workflow updates exist:
  - `elizaos/eliza#6324` (upgrade Claude workflows) **merged**
  - `elizaos/eliza#6328` (allow cursor bot to trigger Claude workflows) **merged**
  - Still reported as failing/unstable in practice (no failure logs included in provided data).
- **Impact Assessment**
  - **User Impact:** **Medium** (contributors/review pipeline)
  - **Functional Impact:** **Partial** (doesn’t block runtime, but degrades review quality and velocity)
  - **Brand Impact:** **Medium** (public CI instability reduces trust for contributors)
- **Technical Classification**
  - **Category:** Bug / DevEx
  - **Component:** CI / GitHub Actions
  - **Complexity:** **Moderate effort**
- **Resource Requirements**
  - **Required Expertise:** GitHub Actions, Claude action configuration, secrets/permissions, bot event triggers
  - **Dependencies:** Need actual failing run links + error messages; confirm whether failures are permissions, rate limits, model availability, or workflow conditions
  - **Estimated Effort:** **2–3/5**
- **Recommended Priority:** **P1**
- **Actionable Next Steps**
  1. Collect 3–5 failing workflow run URLs; categorize failures (auth, throttling, action version, permissions).
  2. Add a **diagnostic step** to print relevant context (event name, actor, permissions snapshot) without leaking secrets.
  3. Confirm the Claude action supports bot triggers across all workflows (not just the cursor bot) and standardize trigger conditions.
  4. Document “How to re-run / how to trigger @claude” for maintainers.
- **Potential Assignees**
  - **rejected-l** (repo maintenance/merges)
  - **standujar** (recently involved in large infra changes; can triage quickly)
  - **wtfsayo** (test/infra hardening mindset)
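
Once failing run logs are collected (step 1), a rough classifier can make the categorization repeatable. The sketch below is heuristic only: the regular expressions are assumptions about typical log text, not patterns taken from actual elizaOS workflow failures, and should be adjusted after reviewing real runs.

```typescript
type FailureCategory = "auth" | "permissions" | "throttling" | "action-version" | "unknown";

// Heuristic patterns; these strings are assumptions and need tuning against real logs.
const FAILURE_RULES: Array<[FailureCategory, RegExp]> = [
  ["auth", /401|unauthorized|invalid api key|bad credentials/i],
  ["permissions", /403|resource not accessible|permission denied/i],
  ["throttling", /429|rate limit|overloaded/i],
  ["action-version", /unable to resolve action|deprecated version/i],
];

// Return the first category whose pattern matches the log excerpt.
function categorizeFailure(logExcerpt: string): FailureCategory {
  for (const [category, pattern] of FAILURE_RULES) {
    if (pattern.test(logExcerpt)) return category;
  }
  return "unknown";
}

// Count failures per category across a batch of log excerpts.
function tallyFailures(logs: string[]): Record<FailureCategory, number> {
  const tally: Record<FailureCategory, number> = {
    auth: 0,
    permissions: 0,
    throttling: 0,
    "action-version": 0,
    unknown: 0,
  };
  for (const log of logs) tally[categorizeFailure(log)] += 1;
  return tally;
}
```

A large `unknown` count is itself a signal that the diagnostic step in item 2 is not yet surfacing enough context.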

---

### C. “Turbo build tool uses extreme memory (21GB+) during builds” — **Discord-reported (no GitHub issue yet)**
- **Current Status:** Reported on Discord (2026-01-04). Not tracked in provided GitHub issues.
- **Impact Assessment**
  - **User Impact:** **Medium** (contributors and self-hosting devs; especially laptops/CI runners)
  - **Functional Impact:** **Partial** (blocks local development on constrained machines; can break CI if memory-limited)
  - **Brand Impact:** **Medium** (perception of “project is heavy / hard to build”)
- **Technical Classification**
  - **Category:** Performance / DevEx
  - **Component:** Build Tooling (Turbo/Bun/monorepo build graph)
  - **Complexity:** **Complex solution** (profiling + config changes; possibly architectural build adjustments)
- **Resource Requirements**
  - **Required Expertise:** Monorepo build tooling (Turbo), Bun tooling, TS build pipeline, CI resource profiling
  - **Dependencies:** Need reproduction details: OS, node/bun version, turbo version, command used, workspace scope
  - **Estimated Effort:** **4/5**
- **Recommended Priority:** **P1**
- **Actionable Next Steps**
  1. Open GitHub issue in `elizaos/eliza` with baseline metrics and repro steps.
  2. Add a CI job to capture build memory peak (even coarse sampling) to prevent regressions.
  3. Review Turbo config: caching, pipeline concurrency, task outputs, pruning scope, incremental builds.
  4. Identify largest memory consumers: TS typecheck, bundler, test runner; split pipelines if needed.
- **Potential Assignees**
  - **standujar** (core repo familiarity)
  - **wtfsayo** (performance + reliability fixes)
  - (If available) a maintainer focused on tooling/CI
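
For the coarse memory sampling in step 2, the comparison logic can stay very small. The sketch below assumes RSS samples (in bytes) have already been collected for the whole build process tree by some external sampler; the function names and the 20% default tolerance are illustrative choices, not project settings.

```typescript
interface MemorySummary {
  peakMb: number;
  meanMb: number;
  sampleCount: number;
}

// Summarize RSS samples given in bytes. How the samples are collected is out of
// scope here; they should cover child processes, not only the parent.
function summarizeRss(samplesBytes: number[]): MemorySummary {
  if (samplesBytes.length === 0) return { peakMb: 0, meanMb: 0, sampleCount: 0 };
  const toMb = (bytes: number): number => bytes / (1024 * 1024);
  let peak = 0;
  let total = 0;
  for (const sample of samplesBytes) {
    peak = Math.max(peak, sample);
    total += sample;
  }
  return {
    peakMb: toMb(peak),
    meanMb: toMb(total / samplesBytes.length),
    sampleCount: samplesBytes.length,
  };
}

// Flag a regression when the current peak exceeds the baseline by more than
// the tolerance (0.2 means 20%; an assumed default).
function isMemoryRegression(baselinePeakMb: number, currentPeakMb: number, tolerance = 0.2): boolean {
  return currentPeakMb > baselinePeakMb * (1 + tolerance);
}
```

A CI job could store the baseline peak as an artifact and fail, or merely warn, when `isMemoryRegression` returns true.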

---

### D. “Need to refresh for conversation to actually show as deleted …” — **`elizaos/eliza#6322`**
- **Current Status:** **Open** (mentioned as newly created on 2026-01-04; not shown as closed in provided data)
- **Impact Assessment**
  - **User Impact:** **High** (common UI workflow; deletion is a core UX action)
  - **Functional Impact:** **Partial** (data likely deleted but UI state stale; users think it failed)
  - **Brand Impact:** **High** (erodes confidence; “basic UI is buggy”)
- **Technical Classification**
  - **Category:** Bug / UX
  - **Component:** Client GUI (chat/session list state management) + API cache invalidation
  - **Complexity:** **Moderate effort**
- **Resource Requirements**
  - **Required Expertise:** Frontend state management, query invalidation (hooks), session/chat routing
  - **Dependencies:** Confirm whether deletion is optimistic and missing invalidation, or server returns inconsistent data
  - **Estimated Effort:** **2–3/5**
- **Recommended Priority:** **P1**
- **Actionable Next Steps**
  1. Reproduce: delete conversation; confirm server state vs UI state.
  2. Ensure the delete mutation **invalidates** the correct query keys (session list + current session).
  3. Add E2E test for “delete conversation updates sidebar immediately.”
  4. Verify behavior across all transports (HTTP/SSE/WebSocket) after `#6300`.
- **Potential Assignees**
  - **standujar** (recent client hooks + transport work in `#6300`)
  - **borisudovicic** (raised UX issues; can verify acceptance criteria)
  - **wtfsayo** (if server-side event emission/invalidation is involved)
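
Step 2 can be illustrated with a small pure function. This is a sketch under the assumption that the client keeps a cached conversation list plus a set of stale query keys; `ClientCache` and the key names are hypothetical and do not reflect the actual elizaOS client hooks.

```typescript
interface Conversation {
  id: string;
  title: string;
}

// Hypothetical client-side cache shape for illustration only.
interface ClientCache {
  conversationList: Conversation[];
  activeConversationId: string | null;
  staleKeys: string[]; // query keys to refetch on the next render
}

// Apply a confirmed deletion: drop the item from the cached list right away,
// clear the active selection if it pointed at the deleted conversation, and
// mark the related queries stale so the next fetch reconciles with the server.
function applyConversationDeleted(cache: ClientCache, deletedId: string): ClientCache {
  const staleKeys = cache.staleKeys.slice();
  for (const key of ["conversationList", `conversation:${deletedId}`]) {
    if (staleKeys.indexOf(key) === -1) staleKeys.push(key);
  }
  return {
    conversationList: cache.conversationList.filter((c) => c.id !== deletedId),
    activeConversationId:
      cache.activeConversationId === deletedId ? null : cache.activeConversationId,
    staleKeys,
  };
}
```

Keeping the update pure makes the "sidebar updates immediately" behavior testable without a browser, which complements the E2E test in step 3.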

---

### E. “Agent sorting doesn’t work” — **`elizaos/eliza#6319`**
- **Current Status:** **Open** (created 2026-01-04; not listed as closed)
- **Impact Assessment**
  - **User Impact:** **Medium** (organization/discoverability; impacts daily use with many agents)
  - **Functional Impact:** **No** (not blocking core messaging)
  - **Brand Impact:** **Medium** (polish issue; visible to end users)
- **Technical Classification**
  - **Category:** Bug / UX
  - **Component:** Client GUI (agent list), possibly API ordering
  - **Complexity:** **Simple fix → Moderate effort** (depends on whether server supports sort)
- **Resource Requirements**
  - **Required Expertise:** Frontend list rendering, API query params, persistence of sort preference
  - **Dependencies:** Define intended sort (alpha, last used, pinned); confirm source of truth
  - **Estimated Effort:** **2/5**
- **Recommended Priority:** **P2**
- **Actionable Next Steps**
  1. Identify current sort logic and expected sort.
  2. Add unit test for sort comparator + integration test for agent list order.
  3. If server-driven: add `sort=` parameter + default ordering contract.
- **Potential Assignees**
  - **standujar**
  - **borisudovicic** (product expectation)
  - Any frontend contributor familiar with agent list page
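
The comparator in step 2 might look like the sketch below. It assumes the three orderings named under Dependencies (alpha, last used, pinned) and hypothetical field names (`lastUsedAt`, `pinned`); the real agent model and the intended default are still to be confirmed.

```typescript
// Hypothetical agent shape; field names are assumptions for illustration.
interface AgentSummary {
  name: string;
  lastUsedAt?: number; // epoch milliseconds
  pinned?: boolean;
}

type AgentSortMode = "alpha" | "lastUsed";

// Pinned agents come first; then the selected mode; name is the tie-breaker.
function compareAgents(a: AgentSummary, b: AgentSummary, mode: AgentSortMode): number {
  const pinDiff = Number(!!b.pinned) - Number(!!a.pinned);
  if (pinDiff !== 0) return pinDiff;
  if (mode === "lastUsed") {
    const recencyDiff = (b.lastUsedAt ?? 0) - (a.lastUsedAt ?? 0);
    if (recencyDiff !== 0) return recencyDiff;
  }
  // Case-insensitive alphabetical comparison.
  return a.name.localeCompare(b.name, undefined, { sensitivity: "base" });
}

// Sort a copy so the caller's array (for example, cached state) is not mutated.
function sortAgents(agents: AgentSummary[], mode: AgentSortMode): AgentSummary[] {
  return agents.slice().sort((a, b) => compareAgents(a, b, mode));
}
```

Sorting a copy matters if the list lives in shared client state, since an in-place sort can leave the rendered order and the stored order out of step.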

---

### F. “Change free credits from $5 to $1” — **`elizaos/eliza#6315`**
- **Current Status:** **Open**
- **Impact Assessment**
  - **User Impact:** **High** (affects all new/guest users’ ability to try platform)
  - **Functional Impact:** **No** (policy/monetization change, not a runtime break)
  - **Brand Impact:** **Medium** (could reduce goodwill; needs careful comms)
- **Technical Classification**
  - **Category:** Feature Request / Product
  - **Component:** Billing/Credits logic (API + UI copy)
  - **Complexity:** **Simple fix**
- **Resource Requirements**
  - **Required Expertise:** Product + billing config, minimal engineering
  - **Dependencies:** Align with the related growth-strategy issue cluster (including `#6312`) and confirm analytics targets
  - **Estimated Effort:** **1–2/5**
- **Recommended Priority:** **P2**
- **Actionable Next Steps**
  1. Confirm decision with product owners; define rollout strategy (grandfathering?).
  2. Update server-side default credit grant + UI messaging + docs/FAQ.
  3. Add a regression test that asserts default credit values for new accounts.
- **Potential Assignees**
  - **borisudovicic** (opened; product direction)
  - A maintainer with access to billing/credits configuration

---

### G. “Public agent ecosystem roadmap items (agent discovery / forking / knowledge sharing)” — **`elizaos/eliza#6302`, `#6305`, `#6303` (and related)**
- **Current Status:** **Open** (roadmap/epics; not immediate bugfixes)
- **Impact Assessment**
  - **User Impact:** **Medium → High** (strategic growth; not an immediate break)
  - **Functional Impact:** **No** (new capabilities)
  - **Brand Impact:** **High** (major differentiation if executed)
- **Technical Classification**
  - **Category:** Feature Request
  - **Component:** Core Framework + GUI + API (agent registry/discovery)
  - **Complexity:** **Architectural change**
- **Resource Requirements**
  - **Required Expertise:** Backend API design, auth/permissions, content moderation considerations, UI/UX, database schema
  - **Dependencies:** Requires clear data model for “public agents,” versioning, and permission boundaries
  - **Estimated Effort:** **5/5**
- **Recommended Priority:** **P3** (keep planning active, but do not preempt stability/P0-P1 issues)
- **Actionable Next Steps**
  1. Convert roadmap issues into a sequenced milestone: schema → API → UI listing → forking → sharing.
  2. Define security/privacy constraints (what gets published; secret stripping).
  3. Draft minimal “v0 discovery” spec to ship incrementally.
- **Potential Assignees**
  - **borisudovicic** (roadmap owner)
  - **madjin** (architecture/data pipelines)
  - **standujar** (core/server implementation)
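
As a starting point for the secret-stripping constraint in step 2, a recursive filter over an agent definition could drop sensitive-looking fields before publication. The key pattern below is an assumption, and a deny-list like this is easy to bypass; an allow-list of publishable fields would be the safer design once the public-agent data model exists.

```typescript
// Assumed pattern for sensitive key names; a real policy needs review.
const SENSITIVE_KEY = /(key|token|secret|password)/i;

// Return a deep copy of the value with sensitive-looking object keys removed.
function stripSecrets(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((item) => stripSecrets(item));
  }
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const cleaned: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      if (SENSITIVE_KEY.test(key)) continue; // drop the whole field
      cleaned[key] = stripSecrets(source[key]);
    }
    return cleaned;
  }
  return value; // primitives pass through unchanged
}
```

This only inspects key names. Secrets embedded in free-text values (prompts, example messages) would need separate scanning.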

---

### H. “CachedDatabaseAdapter + embedding dimension caching (DRAFT, has syntax/TS issues)” — **`elizaos/eliza#6329` (PR, DRAFT)**
- **Current Status:** **Open / DRAFT / Do not merge**; automated review notes TypeScript syntax issues and potential logic risks.
- **Impact Assessment**
  - **User Impact:** **Medium** (potential large perf win, especially serverless; but not yet shipped)
  - **Functional Impact:** **No** (not in production, but may unblock performance roadmap)
  - **Brand Impact:** **Medium** (large DRAFT PR lingering can slow velocity)
- **Technical Classification**
  - **Category:** Performance (future) / Bug (within PR)
  - **Component:** Plugin System (`plugin-sql`), Core Runtime
  - **Complexity:** **Complex solution**
- **Resource Requirements**
  - **Required Expertise:** TypeScript, caching correctness, DB adapter semantics, concurrency, invalidation
  - **Dependencies:** Align with existing `plugin-sql` fixes merged in `#6316` and `#6323` to avoid conflicts
  - **Estimated Effort:** **4/5**
- **Recommended Priority:** **P2** (stabilize/trim PR so it can progress, but not at expense of P0/P1)
- **Actionable Next Steps**
  1. Fix TypeScript syntax issues called out in review; ensure compilation passes.
  2. Reduce blast radius: split into two PRs (runtime embedding-dim cache vs SQL cached adapter).
  3. Add benchmarks (before/after) and define safe defaults (cache TTLs, what not to cache).
  4. Threat-model invalidation bugs (stale reads) and document consistency guarantees.
- **Potential Assignees**
  - **0xbbjoker** (author)
  - **wtfsayo** (plugin-sql reliability; can co-review)
  - **standujar** (core runtime changes review)
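
To make the TTL and invalidation discussion in steps 3 and 4 concrete, here is a minimal read-through cache with an injectable clock. It is a sketch under simplifying assumptions (synchronous loads, string keys, single process) and is not the design in `#6329`; real adapter calls are asynchronous and would also need write-path invalidation.

```typescript
interface CacheEntry<V> {
  value: V;
  expiresAt: number;
}

class TtlCache<V> {
  private entries: Record<string, CacheEntry<V>> = {};

  // `now` is injectable so tests can control time; it defaults to Date.now.
  constructor(private ttlMs: number, private now: () => number = Date.now) {}

  get(key: string): V | undefined {
    const entry = this.entries[key];
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      delete this.entries[key]; // expired: treat as a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries[key] = { value, expiresAt: this.now() + this.ttlMs };
  }

  // Call on every write to the underlying store to avoid stale reads.
  invalidate(key: string): void {
    delete this.entries[key];
  }
}

// Read-through helper: return the cached value or load and cache it.
function cachedRead<V>(cache: TtlCache<V>, key: string, load: () => V): V {
  const hit = cache.get(key);
  if (hit !== undefined) return hit;
  const value = load();
  cache.set(key, value);
  return value;
}
```

The consistency guarantee here is "stale for at most `ttlMs`, or until `invalidate` is called". Writing that sentence down for each cached read is one way to document the guarantees asked for in step 4.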

---

## 2) Summary: Top 5–10 Issues to Address Immediately

1. **P0:** Anthropic + MCP **TEXT_EMBEDDING handler missing** causes runtime failure; replace “fake OpenAI key + ordering” workaround with explicit provider resolution + validation. *(Discord-reported; needs GitHub issue)*
2. **P1:** **Claude code review CI failing/unstable**; gather failing runs and harden workflows beyond the already-merged updates (`#6324`, `#6328`). *(Discord-reported)*
3. **P1:** **Turbo build memory blow-up (21GB+)**; open issue and begin profiling/config work to protect contributors/CI. *(Discord-reported)*
4. **P1:** `elizaos/eliza#6322` **Conversation deletion requires refresh**; fix cache invalidation/state sync; add tests.
5. **P2:** `elizaos/eliza#6319` **Agent sorting doesn’t work**; fix ordering and test.
6. **P2:** `elizaos/eliza#6329` **DRAFT caching adapter PR**: resolve TS/syntax issues, split PR, benchmark and validate invalidation.
7. **P2:** `elizaos/eliza#6315` **Free credits policy change**; confirm decision + implement safely with clear comms.
8. **P3:** `elizaos/eliza#6302/#6305/#6303` **Public agent ecosystem roadmap**; keep milestone planning, but don’t preempt stability work.

---

## 3) Patterns / Themes Suggesting Deeper Issues

- **Provider/Plugin dependency ambiguity:** The Anthropic+embedding issue indicates the framework does not clearly express *required capabilities* (e.g., embeddings) per feature path, leading to brittle “order-dependent” setups.
- **State synchronization regressions in UI:** “Refresh to see deletion” and prior chat/session issues suggest query invalidation and real-time updates (across HTTP/SSE/WebSocket) need stronger contracts and automated coverage.
- **Tooling/CI fragility:** Claude workflow instability and Turbo memory spikes both point to insufficient observability and guardrails in the developer pipeline (resource caps, diagnostics, and reproducible build profiles).

---

## 4) Process Recommendations (Prevention)

1. **Add startup configuration validation** (“capability matrix”): explicitly validate that required delegates exist (TEXT_EMBEDDING, image, audio) for enabled features; fail fast with actionable messages.
2. **Transport-parity test suite requirement:** for any messaging/session UI change, require tests across **HTTP + SSE + WebSocket** to prevent stale state and deletion/rename inconsistencies.
3. **CI observability standards:** standardize debug logging for workflow failures (sanitized), and maintain a small “CI health” dashboard (e.g., weekly failure categories).
4. **Build resource regression checks:** add a lightweight job that measures build peak RSS (or approximate memory) and flags large regressions.
5. **Enforce “DRAFT PR exit criteria”:** require a checklist (build passes, TS passes, benchmarks, split PRs) to avoid large DRAFTs stalling and accumulating merge conflicts.
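
Recommendation 1 could be prototyped as a lookup from enabled features to required delegates. The feature names and the requirements table below are hypothetical placeholders; only the delegate kinds (TEXT_EMBEDDING, image, audio) come from the recommendation itself.

```typescript
type Delegate = "TEXT_EMBEDDING" | "IMAGE" | "AUDIO";

// Hypothetical capability matrix: which delegates each feature needs.
const FEATURE_REQUIREMENTS: Record<string, Delegate[]> = {
  knowledge: ["TEXT_EMBEDDING"],
  imageGeneration: ["IMAGE"],
  voice: ["AUDIO"],
};

// Return one actionable message per unmet requirement; empty means valid.
function validateCapabilities(enabledFeatures: string[], availableDelegates: Delegate[]): string[] {
  const problems: string[] = [];
  for (const feature of enabledFeatures) {
    const required = FEATURE_REQUIREMENTS[feature] ?? [];
    for (const delegate of required) {
      if (availableDelegates.indexOf(delegate) === -1) {
        problems.push(
          `Feature "${feature}" needs a ${delegate} handler, but no loaded plugin provides one.`
        );
      }
    }
  }
  return problems;
}
```

Running a check like this once at startup and refusing to boot on a non-empty result would surface the Anthropic-only embedding gap before the first request rather than mid-conversation.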