# Issue Triage — 2026-02-05

## 1) Skill invocation fails in ~56% of eval cases (skills not triggered)
- **Issue Title & ID:** Skill invocation reliability regression (Discord: 2026-02-02 / internal ID DISC-2026-02-02-SKILL-INVOKE)
- **Current Status:** Reported with a working workaround (UserPromptSubmit hook enforcing 3-step activation). Not yet tracked as a GitHub issue.
- **Impact Assessment:**
  - **User Impact:** **Critical** (affects most agent workflows relying on tools/skills)
  - **Functional Impact:** **Yes** (blocks core “agent uses skills/tools” behavior)
  - **Brand Impact:** **High** (agents appear “dumb”/non-functional despite available skills)
- **Technical Classification:**
  - **Category:** Bug / UX (agent behavior reliability)
  - **Component:** Core Framework (runtime prompting/tool-calling), Cloud orchestration
  - **Complexity:** **Moderate effort** (prompting + routing + evaluation changes; needs tests)
- **Resource Requirements:**
  - **Required Expertise:** LLM tool-calling/prompt engineering, evaluation harnesses, runtime hooks, TypeScript
  - **Dependencies:** Define “skill selection contract” (system prompt / tool policy); align with Cloud behavior (Stan noted similar pattern already used)
  - **Estimated Effort:** **4/5**
- **Recommended Priority:** **P0**
- **Specific Actionable Next Steps:**
  1. Create GitHub issue in `elizaos/eliza` capturing: reproduction prompts, eval set, and 56% failure metric.
  2. Implement a first-class “Skill Activation Pass” (pre-response step) modeled on the workaround's 3-step sequence:
     - enumerate candidate skills, record a YES/NO decision with rationale for each, then issue the tool calls immediately.
  3. Add telemetry: % prompts where skills were available vs called; per-model breakdown.
  4. Add regression tests in CI with a fixed eval corpus (include “docs available but not invoked” cases).
  5. Document best-practice skill descriptions and required schema to reduce ambiguity.
- **Potential Assignees:** **Stan ⚡** (Cloud pattern), **R0am** (proposed workaround), **Odilitime** (core guidance), support from **0xbbjoker** (tooling/tests)
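
The three-step activation sequence in next step 2 could be sketched roughly as follows. This is a minimal illustration, not the existing workaround or any elizaOS API: `Skill`, `SkillDecision`, `SkillJudge`, `skillActivationPass`, and `keywordJudge` are all hypothetical names. The judge is injected so that production code can back it with an LLM call while tests use a deterministic stub.

```typescript
// Hypothetical types: none of these names come from the elizaOS codebase.
interface Skill {
  name: string;
  description: string;
}

interface SkillDecision {
  skill: string;
  activate: boolean; // the explicit YES/NO verdict
  rationale: string;
}

// Injected judge: an LLM call in production, a stub in tests.
type SkillJudge = (prompt: string, skill: Skill) => Promise<SkillDecision>;

interface ActivationResult {
  decisions: SkillDecision[]; // steps 1 and 2: every candidate with its verdict
  toCall: string[];           // step 3: skills to invoke before responding
}

async function skillActivationPass(
  prompt: string,
  skills: Skill[],
  judge: SkillJudge,
): Promise<ActivationResult> {
  // Step 1: enumerate every candidate. Step 2: record a YES/NO for each one.
  const decisions = await Promise.all(skills.map((s) => judge(prompt, s)));
  // Step 3: the caller invokes these immediately, before drafting a reply.
  const toCall = decisions.filter((d) => d.activate).map((d) => d.skill);
  return { decisions, toCall };
}

// Stub judge for illustration only: says YES when the prompt mentions the skill name.
const keywordJudge: SkillJudge = async (prompt, skill) => {
  const activate = prompt.toLowerCase().includes(skill.name.toLowerCase());
  return {
    skill: skill.name,
    activate,
    rationale: activate ? "skill name mentioned in prompt" : "skill name not mentioned",
  };
};
```

Keeping the full `decisions` list, including the NO verdicts, gives the telemetry in step 3 something to log when a skill was available but not called.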

---

## 2) Malicious skills risk on clawhub / skill ecosystem security
- **Issue Title & ID:** Malicious skills distribution & execution hardening (Discord: 2026-02-03 / internal ID DISC-2026-02-03-SKILL-SECURITY)
- **Current Status:** Threat identified; proposed mitigations include scanner skills, code rewriting/adaptation phase, LLM review, and sandboxing.
- **Impact Assessment:**
  - **User Impact:** **High** (any user installing third-party skills is at risk)
  - **Functional Impact:** **Partial** (core can run, but unsafe to extend)
  - **Brand Impact:** **High** (security incident potential)
- **Technical Classification:**
  - **Category:** **Security**
  - **Component:** Plugin System / Skill Execution / Registry integration
  - **Complexity:** **Complex solution** (multi-layer defense; potentially architectural sandbox)
- **Resource Requirements:**
  - **Required Expertise:** AppSec, sandboxing (VM/containers/isolates), supply-chain security, static analysis, policy enforcement
  - **Dependencies:** Skill packaging standard (agentskills.io), cskill/plugin pipeline, registry trust model
  - **Estimated Effort:** **5/5**
- **Recommended Priority:** **P0**
- **Specific Actionable Next Steps:**
  1. Open a security tracking issue + private disclosure workflow if not already covered by `SECURITY.md`.
  2. Define a minimum “Skill Trust & Permissions Model” (network/file/process/tool access).
  3. Implement baseline scanners:
     - static checks (dangerous APIs, exec, network exfil patterns)
     - dependency allow/deny lists
  4. Add an execution sandbox option (default-on for untrusted skills).
  5. Add LLM-based review as a *supplement*, not a primary gate.
- **Potential Assignees:** **Odilitime** (proposed layered approach), **jin** (raised concern), **Stan ⚡** (typing/standards), security-oriented contributors from core
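
A minimal sketch of how next steps 2 to 4 might fit together, assuming a skill ships a manifest that declares its permissions. The manifest shape, the rule set, and every function name here are hypothetical. The regex rules stand in for a real static analyzer and are easy to evade, so they serve only as a first filter.

```typescript
// Hypothetical permission model; categories mirror the ones listed in step 2.
type Permission = "network" | "filesystem" | "process" | "tools";

interface SkillManifest {
  name: string;
  trusted: boolean;
  permissions: Permission[];
}

interface Finding {
  rule: string;
  requires: Permission;
}

// Deliberately small regex rule set. A production scanner should parse the AST.
const RULES: { rule: string; pattern: RegExp; requires: Permission }[] = [
  { rule: "child process execution", pattern: /\bchild_process\b|\bexec(Sync)?\s*\(|\bspawn\s*\(/, requires: "process" },
  { rule: "dynamic code evaluation", pattern: /\beval\s*\(|\bnew\s+Function\s*\(/, requires: "process" },
  { rule: "outbound network call", pattern: /\bfetch\s*\(|\bhttps?\.request\s*\(/, requires: "network" },
  { rule: "filesystem access", pattern: /\bfs\.(read|write|unlink|rm)\w*\s*\(/, requires: "filesystem" },
];

// Returns one finding per rule that matches anywhere in the source text.
function scanSource(source: string): Finding[] {
  return RULES.filter((r) => r.pattern.test(source)).map(({ rule, requires }) => ({ rule, requires }));
}

// Findings whose required permission the manifest never declared.
function undeclared(manifest: SkillManifest, findings: Finding[]): Finding[] {
  return findings.filter((f) => !manifest.permissions.some((p) => p === f.requires));
}

// Default-on sandbox: untrusted skills, or skills using more than they declare, are sandboxed.
function mustSandbox(manifest: SkillManifest, findings: Finding[]): boolean {
  return !manifest.trusted || undeclared(manifest, findings).length > 0;
}
```

Under this sketch a trusted skill that declares `network` but also writes to disk would still be sandboxed, because the scan finds a capability the manifest does not cover.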

---

## 3) elizacloud.ai account duplication causes “missing agent” after Proton email variant login
- **Issue Title & ID:** Account merge/identity normalization bug (Discord: 2026-02-02 / internal ID DISC-2026-02-02-ACCOUNT-DUP)
- **Current Status:** Reported; suspected duplicate accounts from `@proton.me` vs `@protonmail.com`.
- **Impact Assessment:**
  - **User Impact:** **High** (users can “lose” agents / appear deleted)
  - **Functional Impact:** **Yes** (blocks access to created agents)
  - **Brand Impact:** **High** (trust/reliability issue for Cloud product)
- **Technical Classification:**
  - **Category:** Bug
  - **Component:** Cloud Auth / Account Management / Dashboard
  - **Complexity:** **Moderate effort**
- **Resource Requirements:**
  - **Required Expertise:** Auth/identity, database migrations, account linking, customer support remediation
  - **Dependencies:** Auth provider behavior; existing user model constraints
  - **Estimated Effort:** **3/5**
- **Recommended Priority:** **P0**
- **Specific Actionable Next Steps:**
  1. Add canonicalization rules (email normalization) *or* implement explicit account linking.
  2. Backfill: detect duplicates by verified ownership signals; provide a merge tool (admin + self-serve).
  3. Add UI warning when logging in with a different identifier that could create a new account.
  4. Add monitoring: alert on spikes in “new account created” events for the same user/session.
- **Potential Assignees:** **Sam** (app integrations/auth context), **Stan ⚡** (Cloud), support from whoever owns Cloud DB/auth
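
The canonicalization option in step 1 and the duplicate detection in step 2 could look something like the sketch below. It rests on two loudly stated assumptions: that `protonmail.com` and `proton.me` addresses with the same local part belong to the same user (the suspected cause above, not a confirmed one), and that lowercasing the local part is acceptable. The `Account` shape and all function names are hypothetical.

```typescript
// ASSUMPTION: these domains are aliases for the same mailbox. Confirm before relying on it.
const DOMAIN_ALIASES = new Map<string, string>([["protonmail.com", "proton.me"]]);

// Normalizes case and whitespace, then maps known alias domains onto one canonical domain.
function canonicalizeEmail(email: string): string {
  const trimmed = email.trim().toLowerCase();
  const at = trimmed.lastIndexOf("@");
  if (at <= 0 || at === trimmed.length - 1) {
    throw new Error(`Not a valid email address: ${email}`);
  }
  const local = trimmed.slice(0, at);
  const domain = trimmed.slice(at + 1);
  return `${local}@${DOMAIN_ALIASES.get(domain) ?? domain}`;
}

// Hypothetical account record.
interface Account {
  id: string;
  email: string;
}

// Groups accounts by canonical email and keeps only groups with more than one member.
// These are merge *candidates*; ownership still has to be verified before merging.
function findDuplicateCandidates(accounts: Account[]): Map<string, Account[]> {
  const groups = new Map<string, Account[]>();
  for (const account of accounts) {
    const key = canonicalizeEmail(account.email);
    const group = groups.get(key) ?? [];
    group.push(account);
    groups.set(key, group);
  }
  const duplicates = new Map<string, Account[]>();
  groups.forEach((group, key) => {
    if (group.length > 1) duplicates.set(key, group);
  });
  return duplicates;
}
```

A matching canonical email is only a hint. The merge tool in step 2 should still require a verified ownership signal, such as a confirmation link sent to both addresses, before combining accounts.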

---

## 4) Babylon.market internal launch: infinite loading spinner / waitlist position not loading (network switching)
- **Issue Title & ID:** Babylon.market spinner + waitlist loading failure (Discord: 2026-02-04 / internal ID DISC-2026-02-04-BABYLON-SPINNER)
- **Current Status:** Multiple internal testers stuck during signup/testing; identified by ziflie as a fix target.
- **Impact Assessment:**
  - **User Impact:** **Critical** (blocks onboarding/testing; likely impacts public users)
  - **Functional Impact:** **Yes** (cannot progress through core flows)
  - **Brand Impact:** **High** (launch-quality perception)
- **Technical Classification:**
  - **Category:** Bug / Performance (frontend state machine / RPC / wallet)
  - **Component:** Babylon.market Web App (wallet connect, network switching, waitlist API)
  - **Complexity:** **Moderate effort**
- **Resource Requirements:**
  - **Required Expertise:** Frontend (React/Next), wallet integration, RPC/network handling, observability
  - **Dependencies:** Farcaster/wallet providers; backend waitlist service availability
  - **Estimated Effort:** **3/5**
- **Recommended Priority:** **P0**
- **Specific Actionable Next Steps:**
  1. Instrument loading states with timeouts + error surfaces (no infinite spinner).
  2. Capture failing requests (Sentry + network logs) and correlate with chain/network switching.
  3. Add a deterministic repro checklist for testers (wallet type, browser, chain, steps).
  4. Patch: ensure waitlist endpoint returns within SLA; add retries/backoff and fallback UI.
- **Potential Assignees:** **ziflie** (named), **s** (launch owner), frontend engineer familiar with wallet flows
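
Steps 1 and 4 amount to bounding every load with a timeout, retrying with backoff, and always ending in a renderable state. A minimal sketch under those assumptions follows; `withTimeout`, `backoffDelay`, `loadWithRetry`, and the default values are illustrative and not taken from the Babylon.market codebase.

```typescript
// Rejects if the wrapped promise does not settle within `ms` milliseconds.
// Note: this does not cancel the underlying work; a fetch would also need an AbortController.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

// Exponential backoff delay for a zero-based attempt index, capped at maxMs.
function backoffDelay(attempt: number, baseMs = 250, maxMs = 4000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// The UI only ever sees one of these two terminal states, never an endless spinner.
type LoadState<T> =
  | { status: "ready"; data: T }
  | { status: "error"; message: string };

async function loadWithRetry<T>(
  task: () => Promise<T>,
  opts = { attempts: 3, timeoutMs: 5000, baseMs: 250 },
): Promise<LoadState<T>> {
  let lastError = "unknown error";
  for (let attempt = 0; attempt < opts.attempts; attempt++) {
    try {
      return { status: "ready", data: await withTimeout(task(), opts.timeoutMs) };
    } catch (error) {
      lastError = error instanceof Error ? error.message : String(error);
      // Wait before the next attempt, but not after the final one.
      if (attempt < opts.attempts - 1) {
        await new Promise((r) => setTimeout(r, backoffDelay(attempt, opts.baseMs)));
      }
    }
  }
  return { status: "error", message: lastError };
}
```

The waitlist-position request would be passed in as `task`, and the `error` branch drives the fallback UI from step 4.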

---

## 5) Babylon.market Farcaster login failure
- **Issue Title & ID:** Farcaster auth/login broken (Discord: 2026-02-04 / internal ID DISC-2026-02-04-BABYLON-FARCASTER)
- **Current Status:** Reported by multiple testers; 0xbbjoker flagged as action item.
- **Impact Assessment:**
  - **User Impact:** **High** (blocks a key login method)
  - **Functional Impact:** **Partial** (if alternate login exists; otherwise Yes)
  - **Brand Impact:** **High** (auth failures at launch)
- **Technical Classification:**
  - **Category:** Bug
  - **Component:** Auth / OAuth / Farcaster integration
  - **Complexity:** **Moderate effort**
- **Resource Requirements:**
  - **Required Expertise:** OAuth/auth flows, Farcaster SDK, redirect URIs, session handling
  - **Dependencies:** Farcaster service status; correct app credentials
  - **Estimated Effort:** **3/5**
- **Recommended Priority:** **P0**
- **Specific Actionable Next Steps:**
  1. Validate redirect URIs and environment-specific callback domains.
  2. Add structured auth error reporting to UI + logs (capture provider error codes).
  3. Add integration test covering login on fresh session + wallet connected/disconnected.
- **Potential Assignees:** **0xbbjoker**, **Sam** (OAuth experience), Babylon auth owner

---

## 6) Token migration & bridge UX failures: detection issues, “Max amount reached”, user loss reports
- **Issue Title & ID:** ai16z → elizaOS migration friction & errors (Discord: 2026-02-02/03/04 / internal ID DISC-2026-02-MIGRATION-ERRORS)
- **Current Status:** Ongoing user confusion; multiple error modes:
  - Bridge site not detecting pre-Nov 2025 tokens
  - “Max amount reached” migration error
  - Users reporting significant losses and missed notifications
- **Impact Assessment:**
  - **User Impact:** **High** (financial impact + many affected)
  - **Functional Impact:** **Partial** (not core framework, but ecosystem-critical)
  - **Brand Impact:** **High** (trust and reputational risk)
- **Technical Classification:**
  - **Category:** UX / Bug / Documentation
  - **Component:** Migration tooling, bridge website, support process
  - **Complexity:** **Moderate effort** (triage + fixes + comms)
- **Resource Requirements:**
  - **Required Expertise:** Web3 token migration mechanics, frontend, on-chain indexing, support ops
  - **Dependencies:** Snapshot rules, indexers, bridge contract constraints
  - **Estimated Effort:** **4/5**
- **Recommended Priority:** **P1** (escalate to **P0** if active loss is ongoing and fixable)
- **Specific Actionable Next Steps:**
  1. Publish a single “Migration Status” page: deadlines, “bridging optional”, post-snapshot guidance, and known errors.
  2. Add self-serve diagnostics on the bridge site (token eligibility checker; clear error reasons).
  3. Investigate “Max amount reached” root cause and implement a user-safe workaround.
  4. Improve notifications: in-app banner + email digests (so critical notices are not lost to “notification overload”).
- **Potential Assignees:** **Odilitime** (support guidance), web3/migration maintainer, community mods for comms

---

## 7) Billing implementation placeholder is open and unscoped
- **Issue Title & ID:** Billing — `elizaos/eliza` **#6448**
- **Current Status:** **OPEN**, empty body; no acceptance criteria.
- **Impact Assessment:**
  - **User Impact:** **Medium** (depends on product timeline)
  - **Functional Impact:** **Partial** (blocks monetization flows; aligns with revenue directive)
  - **Brand Impact:** **Medium**
- **Technical Classification:**
  - **Category:** Feature Request (but needs definition)
  - **Component:** Cloud / API / Payments
  - **Complexity:** **Architectural change** (entitlements, metering, plans, invoicing)
- **Resource Requirements:**
  - **Required Expertise:** Payments (Stripe/etc), metering, entitlement systems, security/compliance
  - **Dependencies:** Product packaging decisions (plans, limits), OAuth/integration rollout, identity model fixes
  - **Estimated Effort:** **5/5**
- **Recommended Priority:** **P1**
- **Specific Actionable Next Steps:**
  1. Replace empty issue with a scoped spec: billing provider, plan types, quotas (tokens, tool calls, seats), and success metrics.
  2. Define architecture: entitlement checks at API gateway; audit logs; webhook handling.
  3. Deliver MVP milestone: “paid plan enables X; free plan limited to Y.”
- **Potential Assignees:** **borisudovicic** (opened), product lead aligned with **Borko** revenue focus, backend engineer with payments experience
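
To make the scoping discussion concrete, the entitlement check from step 2 might reduce to something like the sketch below. The plan shape, meter names, and `checkEntitlement` are hypothetical. The three meters simply mirror the quota dimensions named in step 1, and no billing provider is assumed.

```typescript
// Hypothetical quota dimensions, mirroring the ones listed in step 1.
type Meter = "tokens" | "toolCalls" | "seats";

interface Plan {
  name: string;
  limits: Record<Meter, number>;
}

interface EntitlementResult {
  allowed: boolean;
  exceeded: Meter[]; // which limits the request would break, for user-facing errors
}

// Checks whether a request consuming `requested` units still fits within the plan limits.
function checkEntitlement(
  plan: Plan,
  used: Record<Meter, number>,
  requested: Partial<Record<Meter, number>>,
): EntitlementResult {
  const meters: Meter[] = ["tokens", "toolCalls", "seats"];
  const exceeded = meters.filter((m) => used[m] + (requested[m] ?? 0) > plan.limits[m]);
  return { allowed: exceeded.length === 0, exceeded };
}
```

Returning the list of exceeded meters, rather than a bare boolean, lets the gateway tell the user which limit was hit and write a more useful audit log entry.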

---

## 8) OAuth provider rollout completion + MCP testing plan
- **Issue Title & ID:** OAuth provider rollout + MCP validation (Discord: 2026-02-04 / internal ID DISC-2026-02-04-OAUTH-MCP)
- **Current Status:** X/GitHub/Slack/Linear integrated; Notion next; MCP testing pending.
- **Impact Assessment:**
  - **User Impact:** **High** (integrations are core to “agent does work” value)
  - **Functional Impact:** **Partial** (core runs, but real workflows depend on integrations)
  - **Brand Impact:** **Medium**
- **Technical Classification:**
  - **Category:** Feature / Reliability
  - **Component:** App Integrations, OAuth adapters, MCP
  - **Complexity:** **Moderate effort**
- **Resource Requirements:**
  - **Required Expertise:** OAuth/OIDC, provider quirks, secret management, redirect URI management
  - **Dependencies:** Provider credentials; MCP adapter readiness
  - **Estimated Effort:** **3/5**
- **Recommended Priority:** **P2** (keep moving, but address P0/P1 failures first)
- **Specific Actionable Next Steps:**
  1. Finish Notion integration; standardize provider onboarding checklist (credentials, scopes, redirects).
  2. Add automated integration smoke tests per provider.
  3. Run MCP end-to-end test now that OAuth foundation exists; document outcomes and gaps.
- **Potential Assignees:** **Sam** (already driving), support from **Stan ⚡** (providers/typing), **0xbbjoker** (plugin/integration experience)

---

## 9) PR review queue risks: PR #6457 (eliza) and PR #278 (eliza-cloud-v2)
- **Issue Title & ID:** Review & merge pending PRs (Discord: 2026-02-03 / internal IDs DISC-2026-02-03-PR6457, DISC-2026-02-03-PR278)
- **Current Status:** Awaiting review.
- **Impact Assessment:**
  - **User Impact:** **Medium** (scope of the changes is not yet assessed, but review delays compound)
  - **Functional Impact:** **Partial**
  - **Brand Impact:** **Low/Medium** (slower velocity perception)
- **Technical Classification:**
  - **Category:** Process / Maintenance
  - **Component:** Core repo + Cloud repo
  - **Complexity:** **Simple fix** (review bandwidth)
- **Resource Requirements:**
  - **Required Expertise:** Repo maintainers; context on touched areas
  - **Dependencies:** None
  - **Estimated Effort:** **2/5**
- **Recommended Priority:** **P2**
- **Specific Actionable Next Steps:**
  1. Assign reviewers today; enforce 48-hour first-review SLA.
  2. If blocked, request author to split PR or add tests/repro steps.
- **Potential Assignees:** Core maintainers; **Stan ⚡** (Cloud), **Odilitime** (core), **0xbbjoker** (author of #6457)

---

## 10) elizacloud.ai image generation lacks visual consistency across iterations
- **Issue Title & ID:** Visual consistency / character continuity for image gen (Discord: 2026-02-02 / internal ID DISC-2026-02-02-IMG-CONSISTENCY)
- **Current Status:** Reported; suggested solution path includes LoRA-based workflows or app-level feature.
- **Impact Assessment:**
  - **User Impact:** **Medium** (high for storytellers/brand users; not universal)
  - **Functional Impact:** **No** (enhancement)
  - **Brand Impact:** **Medium** (creative tooling quality perception)
- **Technical Classification:**
  - **Category:** Feature Request / UX
  - **Component:** Model Integration / Media pipeline
  - **Complexity:** **Complex solution** (model-side + product UX)
- **Resource Requirements:**
  - **Required Expertise:** Diffusion pipelines, LoRA training/hosting, asset management
  - **Dependencies:** Model/provider capabilities; storage; UI for “character profiles”
  - **Estimated Effort:** **5/5**
- **Recommended Priority:** **P3**
- **Specific Actionable Next Steps:**
  1. Decide: first-party feature vs “app built by agents/3rd parties.”
  2. Prototype “character kit” workflow (upload references → train/attach LoRA → reuse token).
  3. Add UX concept: “edit element” vs regenerate-all.
- **Potential Assignees:** Media/model engineer; **DorianD** (direction), Cloud product owner

---

# Conclusion

## 1) Top 5–10 highest priority issues to address immediately
1. **P0:** Skill invocation reliability (~56% not triggered) — DISC-2026-02-02-SKILL-INVOKE  
2. **P0:** Malicious skills / sandboxing & scanning — DISC-2026-02-03-SKILL-SECURITY  
3. **P0:** Cloud account duplication (Proton email variants) — DISC-2026-02-02-ACCOUNT-DUP  
4. **P0:** Babylon.market infinite spinner / waitlist loading — DISC-2026-02-04-BABYLON-SPINNER  
5. **P0:** Babylon.market Farcaster login failure — DISC-2026-02-04-BABYLON-FARCASTER  
6. **P1:** Token migration/bridge errors + clearer comms — DISC-2026-02-MIGRATION-ERRORS  
7. **P1:** `elizaos/eliza` **#6448 Billing** (scope + MVP architecture)  
8. **P2:** OAuth: Notion completion + MCP testing — DISC-2026-02-04-OAUTH-MCP  
9. **P2:** Review PR #6457 (eliza) and PR #278 (eliza-cloud-v2)  
10. **P3:** Image generation visual consistency — DISC-2026-02-02-IMG-CONSISTENCY  

## 2) Patterns/themes indicating deeper architectural problems
- **Reliability gap between “features exist” and “features fire”:** Skills/tools are present but not invoked consistently, suggesting the runtime needs a stronger deterministic tool-selection contract and better evaluation coverage.
- **Trust & safety not yet first-class in the skill ecosystem:** Open distribution of skills without robust permissioning/sandboxing creates supply-chain risk.
- **Identity and onboarding fragility:** Both Babylon and ElizaCloud show onboarding blockers (auth/login/spinners; duplicate accounts), pointing to missing guardrails, weak error handling, and insufficient observability.
- **Operational comms debt:** Token migration confusion shows that critical lifecycle events need a dedicated comms channel, not ad-hoc Discord discovery.

## 3) Recommendations for process improvements
- **Create “P0 intake” pipeline:** Any report that blocks login, onboarding, tool invocation, or security becomes a GitHub issue within 24 hours with an owner and repro steps.
- **Add observability-by-default:** Standardize Sentry/structured logs for auth flows, tool calls, and long-running loading states (ban infinite spinners; require timeout + user-visible errors).
- **Introduce a skill security baseline:** Permission model + sandbox default for untrusted skills + automated scanning in CI/registry publishing.
- **Establish evaluation gates for agent behavior:** A fixed “tool invocation” eval suite that must pass before releases (track tool-call rate and correctness per model/provider).
- **Tighten PR review SLAs:** 48-hour first-review rule with explicit reviewer assignment so that fixes do not stall in the review queue.
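
The evaluation gate recommended above could be approximated with a per-model summary like the one below. This is a sketch under assumed names: the `EvalCase` record, `summarize`, `failingModels`, and the threshold are all hypothetical and do not describe an existing harness.

```typescript
// Hypothetical eval record; field names are illustrative.
interface EvalCase {
  model: string;
  expectedSkill: string;  // the skill this case is designed to trigger
  calledSkills: string[]; // the skills the agent actually invoked
}

interface ModelReport {
  total: number;
  invoked: number; // at least one skill was called
  correct: number; // the expected skill was among those called
}

// Aggregates eval cases into one report per model.
function summarize(cases: EvalCase[]): Map<string, ModelReport> {
  const reports = new Map<string, ModelReport>();
  for (const c of cases) {
    const r = reports.get(c.model) ?? { total: 0, invoked: 0, correct: 0 };
    r.total += 1;
    if (c.calledSkills.length > 0) r.invoked += 1;
    if (c.calledSkills.some((s) => s === c.expectedSkill)) r.correct += 1;
    reports.set(c.model, r);
  }
  return reports;
}

// Returns the models whose correct-invocation rate falls below the release threshold.
function failingModels(reports: Map<string, ModelReport>, minCorrectRate: number): string[] {
  const failing: string[] = [];
  reports.forEach((r, model) => {
    if (r.correct / r.total < minCorrectRate) failing.push(model);
  });
  return failing;
}
```

A CI job could fail the release whenever `failingModels` returns a non-empty list. Tracking `invoked` separately from `correct` distinguishes "no skill fired" from "the wrong skill fired".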