## Issue Triage — 2026-04-17

### 1) IGNORE responses silently not persisted when `ALLOW_MEMORY_SOURCE_IDS` is configured (regression from merged PR)
- **Issue Title & ID:** Bug: IGNORE memory persistence dropped when `ALLOW_MEMORY_SOURCE_IDS` is set — **UNFILED (regression from elizaos/eliza PR #6562, merged 2026-04-08)**
- **Current Status:** Not tracked as a GitHub issue; identified in automated review notes. The regression is likely present on main and in any release cut after the merge.
- **Impact Assessment:**
  - **User Impact:** **High** (any deployment using memory allowlisting + relying on IGNORE persistence/auditing)
  - **Functional Impact:** **Partial** (memory/audit correctness; behavior diverges from documented intent)
  - **Brand Impact:** **High** (silent data loss/inconsistency; hard to diagnose)
- **Technical Classification:**
  - **Issue Category:** Bug
  - **Component Affected:** Core Framework (TypeScript runtime) — `DefaultMessageService` memory pipeline
  - **Complexity:** Simple fix
- **Resource Requirements:**
  - **Required Expertise:** TypeScript, message/memory pipeline familiarity
  - **Dependencies:** None. A regression test should accompany the fix to prevent recurrence.
  - **Estimated Effort:** **2/5**
- **Recommended Priority:** **P0**
- **Specific Actionable Next Steps:**
  1. **File a GitHub issue** documenting exact conditions: `DISABLE_MEMORY_CREATION=false`, `ALLOW_MEMORY_SOURCE_IDS` set, and IGNORE path taken.
  2. Patch `packages/typescript/src/services/message.ts` IGNORE persistence logic to **not** require `allowedSources.includes("agent_response")` (or align with the normal response persistence rules).
  3. Decide the intended behavior explicitly, then encode it in a unit test: with `ALLOW_MEMORY_SOURCE_IDS` set to e.g. `["user_message"]`, assert that IGNORE decision persistence matches that design.
  4. Ship as a patch release with hotfix notes, because the breakage is silent and affects correctness.
- **Potential Assignees:** `odilitime` (core/runtime author), `NubsCarson` (message-service changes), with review help from `greptile-apps` findings.
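
The decision in step 3 can be pinned down as a small predicate before touching the service code. The sketch below is illustrative only: `MemoryPolicy`, `shouldPersistIgnoreRegressed`, and `shouldPersistIgnore` are hypothetical names, not taken from `message.ts`, and the "regressed" variant is a reconstruction from the review note (its behavior when no allowlist is set is an assumption). The candidate fix assumes the chosen design is "persist IGNORE whenever memory creation is enabled"; if the team decides otherwise, the predicate changes but the test shape stays the same.

```typescript
// Hypothetical policy shape; the real values are read inside DefaultMessageService.
interface MemoryPolicy {
  disableMemoryCreation: boolean; // mirrors DISABLE_MEMORY_CREATION
  allowedSources: readonly string[] | null; // mirrors ALLOW_MEMORY_SOURCE_IDS (null = unset)
}

// Reconstruction of the regressed rule described in the review notes:
// once an allowlist is set, IGNORE is persisted only if the list happens
// to contain "agent_response". The unset case (null) is an assumption.
function shouldPersistIgnoreRegressed(policy: MemoryPolicy): boolean {
  if (policy.disableMemoryCreation) return false;
  if (policy.allowedSources === null) return true;
  return policy.allowedSources.includes("agent_response");
}

// Candidate fix (assumed design): the allowlist does not gate the agent's
// own IGNORE decision, so only the global kill switch applies.
function shouldPersistIgnore(policy: MemoryPolicy): boolean {
  return !policy.disableMemoryCreation;
}
```

With `allowedSources` set to `["user_message"]`, the regressed rule drops the IGNORE record while the candidate rule keeps it, which is exactly the case the regression test should cover.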

---

### 2) Zero-vector embedding fallback corrupts semantic memory retrieval (silent data corruption)
- **Issue Title & ID:** Bug: embedding failure persists memories with zero vector, making them unretrievable — **UNFILED (regression from elizaos/eliza PR #6562, merged 2026-04-08)**
- **Current Status:** Not tracked as a GitHub issue; identified in review notes; likely present in main.
- **Impact Assessment:**
  - **User Impact:** **Critical** (anyone with intermittent embedding failures: rate limits, model outages, network issues)
  - **Functional Impact:** **Yes** (semantic recall quality and correctness; “stored” memories become effectively lost)
  - **Brand Impact:** **High** (trust in memory system; “it saved but can’t recall” reports)
- **Technical Classification:**
  - **Issue Category:** Bug (Data integrity)
  - **Component Affected:** Core Framework — runtime embeddings + memory persistence (`packages/typescript/src/runtime.ts`)
  - **Complexity:** Moderate effort (must decide correct failure-mode semantics)
- **Resource Requirements:**
  - **Required Expertise:** TypeScript runtime, embeddings, memory store schema/adapter behavior
  - **Dependencies:** May need coordination with DB adapter and vector store maintainers to confirm how null embeddings are handled
  - **Estimated Effort:** **3/5**
- **Recommended Priority:** **P0**
- **Specific Actionable Next Steps:**
  1. **File an issue** with reproduction: force embedding provider failure and verify memory persisted with zero vector.
  2. Change behavior to one of:
     - **Do not persist** semantic memory when embedding fails; or
     - Persist with a **nullable embedding** and mark `embedding_status=failed` for later backfill; or
     - Queue a retry job and **only finalize** persistence once embedding is available.
  3. Add tests covering embedding failure and ensuring no “valid-looking but unretrievable” memory entries are created.
  4. Add migration/backfill guidance: identify existing zero-vectors and re-embed or delete.
- **Potential Assignees:** `odilitime` (runtime), optionally `0xSolace` / `dutchiono` if they have DB/vector-store context.
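
To make the "nullable embedding plus status" option from step 2 concrete, here is a minimal sketch. A zero vector has zero norm, so cosine similarity against it is undefined, which is why such a row looks valid but cannot be ranked. All names here (`StoredMemory`, `embeddingStatus`, `buildMemory`, `needsBackfill`) are hypothetical and do not reflect the actual schema or `runtime.ts`; the `embed` callback stands in for whatever embedding provider call the runtime makes.

```typescript
type EmbeddingStatus = "ok" | "failed";

// Hypothetical record shape: embedding is nullable instead of zero-filled.
interface StoredMemory {
  text: string;
  embedding: number[] | null;
  embeddingStatus: EmbeddingStatus;
}

// True for a non-empty vector whose components are all exactly zero.
function isZeroVector(v: readonly number[]): boolean {
  return v.length > 0 && v.every((x) => x === 0);
}

// Builds a memory record without ever substituting a zero vector on failure.
async function buildMemory(
  text: string,
  embed: (input: string) => Promise<number[]>,
): Promise<StoredMemory> {
  try {
    const embedding = await embed(text);
    // Treat empty or all-zero output as a failure rather than storing it.
    if (embedding.length === 0 || isZeroVector(embedding)) {
      return { text, embedding: null, embeddingStatus: "failed" };
    }
    return { text, embedding, embeddingStatus: "ok" };
  } catch {
    return { text, embedding: null, embeddingStatus: "failed" };
  }
}

// Backfill predicate for step 4: catches new failed rows and legacy zero vectors.
function needsBackfill(m: StoredMemory): boolean {
  return m.embedding === null || m.embeddingStatus === "failed" || isZeroVector(m.embedding);
}
```

The same `needsBackfill` predicate can drive both the migration scan for existing zero-vector rows and a periodic re-embedding job, so the failure semantics are defined in one place.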

---

### 3) Dev/install break risk: submodule workspace paths and `bun.lock` inconsistencies merged into main
- **Issue Title & ID:** Bug/Infra: Fresh clone may fail due to committed submodule workspaces + lockfile mismatch — **elizaos/eliza PR #6702 (merged 2026-04-09) follow-up**
- **Current Status:** PR merged; automated review flagged that committed repo state may break `bun install` on machines without submodules initialized and/or resolve wrong dependency sources.
- **Impact Assessment:**
  - **User Impact:** **High** (contributors and anyone building from source; potential CI failures)
  - **Functional Impact:** **Yes** (cannot install/build reliably)
  - **Brand Impact:** **High** (onboarding friction; “repo is broken” perception)
- **Technical Classification:**
  - **Issue Category:** Bug / Documentation (developer workflow), potentially Release/Packaging
  - **Component Affected:** Repo tooling / workspace configuration (`package.json`, `bun.lock`, submodules under `plugins/`)
  - **Complexity:** Moderate effort
- **Resource Requirements:**
  - **Required Expertise:** Bun workspaces/lockfiles, monorepo tooling, CI setup
  - **Dependencies:** CI reproduction matrix (with/without submodules); confirm intended supported flow
  - **Estimated Effort:** **3/5**
- **Recommended Priority:** **P0**
- **Specific Actionable Next Steps:**
  1. Create a tracking issue: “Fresh clone + bun install succeeds without submodules” (or explicitly require submodules and enforce it).
  2. Add CI job that runs:
     - `git clone` (no submodule init) → `bun install` → ensure failure is intentional with clear message OR ensure success.
     - `git submodule update --init` → `bun install` → `bun run build/test`.
  3. Decide policy:
     - If submodules are optional: ensure root `package.json` committed state **does not** include submodule workspace entries.
     - If submodules are required: add a **preinstall guard** that prints actionable instructions.
  4. Regenerate and commit `bun.lock` consistent with chosen policy (avoid `workspace:*` vs `alpha` mismatches).
- **Potential Assignees:** `odilitime` (author of PR #6702), with CI help from a maintainer familiar with release/build pipelines.
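
If the policy in step 3 ends up being "submodules are required", the preinstall guard could look roughly like the sketch below. It is kept pure so it can be unit tested: the caller passes the text of `.gitmodules` and a callback that reports whether a path is populated. The function names and the `isPopulated` callback are hypothetical; how the guard is wired into `package.json` is left open.

```typescript
// Extracts submodule paths from the text of a .gitmodules file
// (lines of the form "path = some/dir" inside [submodule "..."] sections).
function parseSubmodulePaths(gitmodules: string): string[] {
  const paths: string[] = [];
  for (const line of gitmodules.split(/\r?\n/)) {
    const match = /^\s*path\s*=\s*(.+?)\s*$/.exec(line);
    if (match && match[1] !== undefined) paths.push(match[1]);
  }
  return paths;
}

// Returns the submodule paths that the caller reports as not populated.
function findUninitializedSubmodules(
  gitmodules: string,
  isPopulated: (path: string) => boolean,
): string[] {
  return parseSubmodulePaths(gitmodules).filter((p) => !isPopulated(p));
}

// Builds an actionable message, or null when nothing is missing.
function guardMessage(missing: readonly string[]): string | null {
  if (missing.length === 0) return null;
  return [
    `Submodules not initialized: ${missing.join(", ")}`,
    "Run: git submodule update --init",
  ].join("\n");
}
```

An uninitialized submodule directory usually exists but is empty, so `isPopulated` should check for real content (for example, the presence of a `package.json`) rather than mere existence of the directory. The guard script would print `guardMessage(...)` and exit non-zero when it is not null.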

---

### 4) `elizaos create` fails on macOS: “Bun's postinstall script was not run” (CLI bootstrap blocker)
- **Issue Title & ID:** elizaos create fails with "Bun's postinstall script was not run" on macOS — **elizaos/eliza #6704 (OPEN)**
- **Current Status:** Open, no comments yet.
- **Impact Assessment:**
  - **User Impact:** **Critical** (blocks new-user onboarding on a major dev platform)
  - **Functional Impact:** **Yes** (`elizaos create` is a primary entrypoint)
  - **Brand Impact:** **High** (first-run failure, project appears unusable)
- **Technical Classification:**
  - **Issue Category:** Bug / UX (developer onboarding)
  - **Component Affected:** CLI + bootstrap templates (`@elizaos/cli`, `@elizaos/plugin-bootstrap`), dependency graph (`bun` package)
  - **Complexity:** Simple fix
- **Resource Requirements:**
  - **Required Expertise:** Node/Bun packaging, CLI release process, dependency hygiene
  - **Dependencies:** Verify behavior across bun/pnpm/npm; ensure templates don’t rely on bun-as-a-node-module
  - **Estimated Effort:** **2/5**
- **Recommended Priority:** **P0**
- **Specific Actionable Next Steps:**
  1. Confirm minimal repro on Apple Silicon with latest bun + latest CLI.
  2. Remove `bun` from **runtime dependencies** in `@elizaos/cli` and `@elizaos/plugin-bootstrap` (use `@types/bun` in devDependencies if needed).
  3. Add CI coverage for `elizaos create` on macOS (at minimum, a smoke test that the project directory is left in place and the build completes).
  4. Improve failure mode: do not delete the created directory on failure; print explicit remediation.
  5. Release patched CLI version and announce in docs.
- **Potential Assignees:** `odilitime` (core tooling), plus the reporter `dirtybits` for validation/testing on macOS.
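
Step 2 can be backed by a small manifest check in CI so `bun` does not creep back into runtime dependencies. This is a sketch under assumptions: `PackageManifest` covers only the fields used here, `findForbiddenRuntimeDeps` is a hypothetical helper, and the default forbidden list contains only `bun` as named in the issue.

```typescript
// Minimal view of a package.json; only the fields this check reads.
interface PackageManifest {
  name?: string;
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

// Returns forbidden package names that appear under runtime "dependencies".
// devDependencies are deliberately ignored, so "@types/bun" there is fine.
function findForbiddenRuntimeDeps(
  manifest: PackageManifest,
  forbidden: readonly string[] = ["bun"],
): string[] {
  const runtimeNames = Object.keys(manifest.dependencies ?? {});
  return forbidden.filter((name) => runtimeNames.includes(name));
}
```

A CI step would load each published package's manifest (and each template's), call this helper, and fail the job if the result is non-empty.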

---

### 5) Group addressee routing PR has a logic bug that defeats name resolution when `agentId !== entityId`
- **Issue Title & ID:** Bug: `aliasEntity` ambiguity breaks addressee resolution in group routing — **elizaos/eliza PR #6712 (OPEN)**
- **Current Status:** Open PR; automated review flagged a P1 logic bug; not yet merged.
- **Impact Assessment:**
  - **User Impact:** **High** for multi-agent/group-room deployments (Discord/Telegram/group chats)
  - **Functional Impact:** **Partial** (routing/shouldRespond correctness; can cause missed replies or unwanted replies)
  - **Brand Impact:** **Medium/High** (agents “act weird” in group chats)
- **Technical Classification:**
  - **Issue Category:** Bug
  - **Component Affected:** Core Framework — addressee resolution utilities + shouldRespond pipeline
  - **Complexity:** Moderate effort
- **Resource Requirements:**
  - **Required Expertise:** TypeScript, entity/agent identity model, group chat adapter metadata
  - **Dependencies:** Needs expanded tests specifically covering aliasing cases (`agentId` vs `entityId`)
  - **Estimated Effort:** **3/5**
- **Recommended Priority:** **P1** (must fix before merge)
- **Specific Actionable Next Steps:**
  1. Update `NameVariationRegistry.aliasEntity` / token mapping so aliasing does not introduce multi-entity ambiguity for a single token.
  2. Add unit tests covering:
     - Agents with `agentId !== entityId`
     - Two agents with overlapping name tokens
     - Correct behavior for `isAddressedToSelf` / `isAddressedToOther`
  3. Re-run group-room simulations (Discord threads + reply-to-author metadata) to confirm the determinism improvements hold and nothing regresses.
- **Potential Assignees:** `odilitime` (PR author), with review from `NubsCarson` (message service / connector behavior).
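
One way to avoid the ambiguity described in step 1 is to canonicalize IDs before counting how many participants own a token, so that an agent registered under both its `agentId` and its `entityId` still counts as one owner. The class below is a hypothetical, self-contained sketch of that idea; it is not the `NameVariationRegistry` from the PR, and its tokenization (lowercase, whitespace split) is a simplifying assumption.

```typescript
// Hypothetical index: maps name tokens to canonical participant IDs.
class AliasAwareNameIndex {
  // alias id -> canonical id
  private readonly canonical = new Map<string, string>();
  // token -> set of canonical ids that own it
  private readonly owners = new Map<string, Set<string>>();

  private resolve(id: string): string {
    return this.canonical.get(id) ?? id;
  }

  // Declares that aliasId and canonicalId refer to the same participant.
  alias(aliasId: string, canonicalId: string): void {
    const target = this.resolve(canonicalId);
    this.canonical.set(aliasId, target);
    // Repoint older aliases that resolved to aliasId.
    this.canonical.forEach((value, key) => {
      if (value === aliasId) this.canonical.set(key, target);
    });
    // Re-home tokens that were registered under aliasId before aliasing.
    this.owners.forEach((ids) => {
      if (ids.delete(aliasId)) ids.add(target);
    });
  }

  // Registers every token of a display name for the given participant.
  register(id: string, name: string): void {
    const owner = this.resolve(id);
    for (const token of name.toLowerCase().split(/\s+/).filter(Boolean)) {
      let ids = this.owners.get(token);
      if (!ids) {
        ids = new Set<string>();
        this.owners.set(token, ids);
      }
      ids.add(owner);
    }
  }

  // Returns the single owner of a token, or null if unknown or truly ambiguous.
  lookup(token: string): string | null {
    const ids = this.owners.get(token.toLowerCase());
    if (!ids || ids.size !== 1) return null;
    return Array.from(ids)[0] ?? null;
  }
}
```

With this shape, the first two test cases in step 2 map directly onto calls: registering the same agent under both IDs must still resolve its name token to one owner, while a token shared by two distinct agents must resolve to `null`.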

---

### 6) Provider timeout default change may increase tail latency for existing deployments
- **Issue Title & ID:** Performance/Behavior change: `PROVIDERS_TOTAL_TIMEOUT_MS` default raised 1s → 5s — **UNFILED (introduced in elizaos/eliza PR #6562, merged 2026-04-08)**
- **Current Status:** Not tracked; highlighted as a potential silent behavior change.
- **Impact Assessment:**
  - **User Impact:** **Medium** (latency-sensitive deployments; depends on provider mix)
  - **Functional Impact:** **No** (system still functions; may feel slower)
  - **Brand Impact:** **Medium** (perceived responsiveness)
- **Technical Classification:**
  - **Issue Category:** Performance
  - **Component Affected:** Core Framework — provider composition / state building
  - **Complexity:** Simple fix (config/defaults + docs), possibly moderate if needing adaptive timeouts
- **Resource Requirements:**
  - **Required Expertise:** Runtime performance, provider architecture
  - **Dependencies:** Need baseline metrics and guidance for overriding defaults
  - **Estimated Effort:** **2/5**
- **Recommended Priority:** **P2**
- **Specific Actionable Next Steps:**
  1. Decide whether to revert default to 1s, or keep 5s but clearly document + expose config prominently.
  2. Add a perf note in CHANGELOG/README: “provider total timeout affects P99.”
  3. Add a test ensuring provider timeouts don’t stall response generation beyond expected bounds.
- **Potential Assignees:** `odilitime` (runtime), with input from maintainers operating production agents.
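
The latency concern is simple arithmetic: the provider stage can add up to the total timeout to a response, so moving the default from 1 s to 5 s raises the worst-case contribution of that stage by 4 s. The sketch below shows one way to keep the default explicit and the bound testable. `resolveProvidersTimeoutMs` and `withTotalTimeout` are hypothetical helpers, not runtime APIs; only the setting name and the 1 s and 5 s values come from the notes above.

```typescript
// Default per the notes above (raised from 1000 in PR #6562).
const DEFAULT_PROVIDERS_TOTAL_TIMEOUT_MS = 5000;

// Parses a raw setting value; falls back to the default on missing,
// blank, non-numeric, or non-positive input.
function resolveProvidersTimeoutMs(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") return DEFAULT_PROVIDERS_TOTAL_TIMEOUT_MS;
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_PROVIDERS_TOTAL_TIMEOUT_MS;
}

// Resolves with the work's value, or with `fallback` once timeoutMs elapses,
// whichever happens first. Rejections from `work` are passed through.
function withTotalTimeout<T>(work: Promise<T>, timeoutMs: number, fallback: T): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => resolve(fallback), timeoutMs);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}
```

The test in step 3 can then wrap a deliberately slow fake provider in `withTotalTimeout` and assert that the call settles with the fallback no later than the configured bound plus a small tolerance.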

---

### 7) plugin-mnemopay PR has multiple correctness/safety gaps (persistence, NaN config, null-deref, unbounded growth)
- **Issue Title & ID:** feat: add plugin-mnemopay — economic memory for AI agents — **elizaos/eliza PR #6701 (OPEN)**
- **Current Status:** Open PR; automated review indicates not merge-ready.
- **Impact Assessment:**
  - **User Impact:** **Low/Medium** (new plugin; not core unless merged)
  - **Functional Impact:** **Partial** (plugin’s value proposition fails without persistence; could crash via null deref)
  - **Brand Impact:** **Medium** (if merged prematurely, “official plugin is unreliable” perception)
- **Technical Classification:**
  - **Issue Category:** Feature / Bug (merge readiness)
  - **Component Affected:** Plugin System (new plugin implementation)
  - **Complexity:** Complex solution (needs persistence design, bounds, tests)
- **Resource Requirements:**
  - **Required Expertise:** Plugin patterns, persistence/storage integration, testing discipline
  - **Dependencies:** Define persistence approach (SQL adapter? existing memory store? file-based?) and plugin policy
  - **Estimated Effort:** **4/5**
- **Recommended Priority:** **P3** (do not merge until addressed; not a fire unless it’s on a release train)
- **Specific Actionable Next Steps:**
  1. Require persistence implementation or explicitly scope plugin as “ephemeral demo” (and label it accordingly).
  2. Add guards for `MNEMOPAY_REPUTATION_DELTA` parsing (reject NaN, default safely).
  3. Add null checks in all handlers/evaluator (no unsafe casts).
  4. Add memory eviction / cap policy.
  5. Add test suite covering actions + evaluator behavior.
- **Potential Assignees:** `t49qnsx7qt-kpanks` (author) with guidance/review from `odilitime` on plugin standards.
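
Steps 2 and 4 are small enough to sketch directly. The parser below rejects NaN and infinities and falls back to a caller-supplied value (the plugin's real default is not stated in the review, so it is a parameter here). The bounded store evicts the oldest entry once a cap is reached, relying on the fact that a JavaScript `Map` iterates in insertion order. Both `parseReputationDelta` and `BoundedStore` are hypothetical names, not code from the PR.

```typescript
// Parses MNEMOPAY_REPUTATION_DELTA-style input; never returns NaN or Infinity.
function parseReputationDelta(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  return Number.isFinite(parsed) ? parsed : fallback;
}

// Map-backed store with a hard cap; evicts the least recently written key.
class BoundedStore<V> {
  private readonly entries = new Map<string, V>();
  private readonly maxEntries: number;

  constructor(maxEntries: number) {
    if (!Number.isInteger(maxEntries) || maxEntries < 1) {
      throw new RangeError("maxEntries must be a positive integer");
    }
    this.maxEntries = maxEntries;
  }

  set(key: string, value: V): void {
    // Delete first so a rewritten key moves to the newest position.
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next();
      if (!oldest.done) this.entries.delete(oldest.value);
    }
  }

  get(key: string): V | undefined {
    return this.entries.get(key);
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Eviction by write order is the simplest cap policy; if the plugin needs recency of reads or value-based eviction, the policy would differ, but the invariant to test stays the same: `size` never exceeds the cap.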

---

### 8) Marketplace/safety/capability-token plugin proposals (triage for roadmap alignment)
- **Issue Title & ID:** Plugin: MAXIA AI Marketplace — **elizaos/eliza #6700 (OPEN)**
  - **Priority:** **P4** (roadmap discussion; evaluate security + scope + maintenance commitments)
- **Issue Title & ID:** Plugin: SafeAgent — token safety checks — **elizaos/eliza #6706 (OPEN)**
  - **Priority:** **P3** (valuable, but needs security review + registry process; not core blocker)
- **Issue Title & ID:** Plugin proposal: capability token enforcement — **elizaos/eliza #6707 (OPEN)**
  - **Priority:** **P2** (strategically aligned with AgentID/capability auth; worth design review soon)
- **Issue Title & ID:** AIGEN Protocol incentives — **elizaos/eliza #6708 (OPEN)**
  - **Priority:** **P4** (ecosystem/coordination; ensure legal/brand implications reviewed)

---

## Immediate Focus Summary (Top 5–10 priorities)
1. **P0:** Fix **IGNORE persistence bug** under `ALLOW_MEMORY_SOURCE_IDS` (unfiled, from PR #6562).
2. **P0:** Fix **zero-vector embedding fallback** (unfiled, from PR #6562) + provide backfill guidance.
3. **P0:** Validate and correct **fresh-clone install/workspace/lockfile** integrity after PR #6702 merge.
4. **P0:** Resolve **macOS `elizaos create` failure** (Issue **#6704**) and ship CLI patch release.
5. **P1:** Fix **aliasEntity/addressee resolution** bug before merging PR **#6712**; add missing tests.
6. **P2:** Decide/document **provider timeout default** change and its latency implications (unfiled).
7. **P3:** Keep **plugin-mnemopay PR #6701** blocked until persistence/null-safety/tests are implemented.
8. **P2:** Schedule design review for **capability-token enforcement plugin proposal #6707** (aligned with AgentID direction).

---

## Patterns / Themes Suggesting Deeper Issues
- **Silent correctness failures in core memory subsystem:** multiple cases where the system “succeeds” but produces unusable or missing memory (IGNORE persistence + zero-vector embeddings). This indicates missing negative-path design and incomplete test coverage for failure modes.
- **High-risk changes landing without issue tracking for follow-ups:** key regressions were identified in review notes rather than tracked issues, increasing the chance they ship unnoticed.
- **Repo/dev workflow fragility:** submodules + workspace rewriting can easily break fresh installs unless enforced by CI and documented as a supported path.

---

## Process Improvement Recommendations
1. **Add a “post-merge regression checklist” for core runtime PRs** (memory, embeddings, shouldRespond, install) that requires explicit tests for failure paths (embedding failure, allowlists, IGNORE/NO_REPLY).
2. **CI matrix for onboarding paths:**
   - `elizaos create` smoke tests on macOS + Linux
   - Fresh clone install with and without submodules
3. **Require issue links (or “follow-up issue created”) for medium/high-risk PRs** touching runtime/message pipeline defaults (timeouts, persistence semantics).
4. **Define and document memory failure-mode semantics** (what happens when embeddings fail; when to persist; how to backfill) and encode them as contract tests.