## Issue Triage — 2026-05-10 (elizaOS)

### 1) Cloud monetized chat: auth errors reported as 500 + credit reconciliation refund/charge inconsistencies
- **Issue Title & ID:** Cloud app-scoped chat endpoint returns incorrect status codes + billing reconciliation edge cases (Follow-up to **PR #7376**, unfiled)
- **Current Status:** Reported by automated review notes; PR merged; needs immediate validation and patch release.
- **Impact Assessment:**
  - **User Impact:** **Critical** (Cloud customers + any monetized app users)
  - **Functional Impact:** **Yes** (monetized chat reliability + correct auth semantics)
  - **Brand Impact:** **High** (billing correctness + “500 on auth” looks unstable/untrustworthy)
- **Technical Classification:**
  - **Category:** Bug / Reliability / Billing correctness
  - **Component Affected:** Cloud API (app chat endpoint), Billing/Credits reconciliation
  - **Complexity:** **Complex solution** (streaming/non-streaming lifecycle, reconciliation failure handling)
- **Resource Requirements:**
  - **Required Expertise:** Cloud API (Hono/Next routes), streaming responses, billing/credits ledger semantics, error classification
  - **Dependencies:** None, but should coordinate with any ongoing Cloud deploys/releases to avoid regressions
  - **Estimated Effort (1-5):** **5**
- **Recommended Priority:** **P0**
- **Specific Actionable Next Steps:**
  1. **Create GitHub issue** capturing all sub-bugs with reproduction cases and expected behaviors (401/403 vs 500; refund rules).
  2. In `cloud/apps/api/v1/apps/[id]/chat/route.ts`:
     - Move `requireAuthOrApiKeyWithOrg` **out of `Promise.all`** and map known auth errors to **401/403**.
     - For **streaming**: ensure a reconciliation failure **cannot** trigger a full refund after content has been delivered. Persist `delivered=true` before closing the writer, and reconcile in a protected block with a retry/compensation policy.
     - For **non-streaming**: if the provider response succeeded but reconciliation fails, return the provider response and enqueue a reconciliation retry (or issue a safe partial refund with an explicit audit log).
  3. Add unit/integration tests:
     - API-key invalid ⇒ 401 (not 500)
     - Streaming delivered + DB failure ⇒ no full refund
     - Non-streaming reconcile failure ⇒ response returned + deterministic credit outcome
  4. Add structured logging + alerting around reconciliation exceptions (include appId, orgId, requestId).
- **Potential Assignees:** **NubsCarson** (PR owner/context), **standujar** (Cloud auth/stability), backup: **0xSolace** (bugfix focus)
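
The auth step above could look roughly like the following. This is a minimal sketch, not the actual route code: `AuthFailure`, its `code` values, `authErrorToStatus`, and `handleChat` are hypothetical names, and the real `requireAuthOrApiKeyWithOrg` may signal failures differently.

```typescript
// Hypothetical failure shape; the real auth helper may throw something else.
type AuthErrorCode = "missing_credentials" | "invalid_api_key" | "forbidden_org";

interface AuthFailure {
  kind: "auth";
  code: AuthErrorCode;
}

function isAuthFailure(err: unknown): err is AuthFailure {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { kind?: unknown }).kind === "auth"
  );
}

// Map known auth failures to 401/403; anything else is a genuine 500.
function authErrorToStatus(err: unknown): 401 | 403 | 500 {
  if (isAuthFailure(err)) {
    return err.code === "forbidden_org" ? 403 : 401;
  }
  return 500;
}

interface ChatResult {
  status: number;
  body: string;
}

// Authenticate first, on its own, so an auth rejection cannot be caught by a
// handler shared with unrelated lookups and reported as a 500.
async function handleChat(
  authenticate: () => Promise<{ orgId: string }>,
  loadApp: () => Promise<{ id: string }>,
  loadCredits: (orgId: string) => Promise<number>,
): Promise<ChatResult> {
  let orgId: string;
  try {
    ({ orgId } = await authenticate());
  } catch (err) {
    const status = authErrorToStatus(err);
    return { status, body: status === 500 ? "internal_error" : "auth_failed" };
  }
  // Independent lookups can still run in parallel once auth has succeeded.
  const [app, credits] = await Promise.all([loadApp(), loadCredits(orgId)]);
  return { status: 200, body: `app=${app.id} credits=${credits}` };
}
```

Keeping the mapping in one small function makes the "API-key invalid ⇒ 401" test case a direct assertion on `authErrorToStatus`, independent of the route wiring.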

---

### 2) Cloud managed domains: `/sync` never flips `verified=true` after Cloudflare zone provisioning (CORS origins remain empty)
- **Issue Title & ID:** Domain sync does not mark Cloudflare domains as verified (Follow-up to **PR #7376**, unfiled)
- **Current Status:** Reported by automated review notes; PR merged; likely production-impacting for managed domains.
- **Impact Assessment:**
  - **User Impact:** **High** (any app using managed domains)
  - **Functional Impact:** **Yes** (CORS origin list remains empty ⇒ app chat/auth can break)
  - **Brand Impact:** **High** (“domain is active but doesn’t work”)
- **Technical Classification:**
  - **Category:** Bug
  - **Component Affected:** Cloud API (domains sync), CORS/origin allowlisting
  - **Complexity:** **Moderate effort**
- **Resource Requirements:**
  - **Required Expertise:** Cloudflare domain lifecycle, Cloud DB schema/managed-domains service
  - **Dependencies:** May require awareness of pending DB migrations, but the fix appears to be at the route/service level
  - **Estimated Effort (1-5):** **3**
- **Recommended Priority:** **P0**
- **Specific Actionable Next Steps:**
  1. Patch `cloud/apps/api/v1/apps/[id]/domains/sync/route.ts` to set `verified: true` when status transitions to active/live.
  2. Add a regression test that:
     - starts with a purchased domain where `verified=false`
     - calls `/sync` after the provider reports active and asserts the stored record has `verified=true`
     - asserts `listVerifiedAppOrigins` includes the domain
  3. Validate end-to-end: new managed domain → verify → CORS origin list updated without manual intervention.
- **Potential Assignees:** **NubsCarson**, **standujar**
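
The intended `/sync` behavior can be sketched as a pure function that is easy to cover with the regression test described above. The `ManagedDomain` shape, the status strings, and both function names are assumptions for illustration; the real schema and `listVerifiedAppOrigins` implementation may differ.

```typescript
// Hypothetical record shape; the real managed-domains schema may differ.
interface ManagedDomain {
  hostname: string;
  status: string;
  verified: boolean;
}

// Provider statuses treated as "live". The exact strings are an assumption.
const LIVE_STATUSES: readonly string[] = ["active", "live"];

// Return the record as it should be stored after a /sync call.
function applySyncResult(domain: ManagedDomain, providerStatus: string): ManagedDomain {
  const isLive = LIVE_STATUSES.includes(providerStatus);
  return {
    ...domain,
    status: providerStatus,
    // Flip to verified once the provider reports live; never un-verify here.
    verified: domain.verified || isLive,
  };
}

// Stand-in for listVerifiedAppOrigins: only verified domains become CORS origins.
function listVerifiedOrigins(domains: ManagedDomain[]): string[] {
  return domains.filter((d) => d.verified).map((d) => `https://${d.hostname}`);
}
```

Whether a domain should ever be un-verified on a later sync is a policy decision this sketch deliberately leaves out.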

---

### 3) Slack connector: missing try/catch around `users.info` can drop incoming messages silently
- **Issue Title & ID:** Slack plugin can silently drop messages when Slack API user lookup fails (Follow-up to **PR #7375**, unfiled)
- **Current Status:** Reported by automated review notes; PR merged; risk is on critical inbound event handler path.
- **Impact Assessment:**
  - **User Impact:** **High** (Slack connector users; message loss is user-visible but hard to diagnose)
  - **Functional Impact:** **Partial** (Slack still works until API errors/rate limits occur; then messages vanish)
  - **Brand Impact:** **High** (silent drops undermine trust)
- **Technical Classification:**
  - **Category:** Bug / Reliability
  - **Component Affected:** Plugin System → `plugin-slack` service event handlers
  - **Complexity:** **Simple fix** (guard + fallback behavior) to **Moderate** (add backoff/telemetry)
- **Resource Requirements:**
  - **Required Expertise:** Slack Bolt/Socket Mode, connector event handling patterns, error handling/observability
  - **Dependencies:** None
  - **Estimated Effort (1-5):** **2**
- **Recommended Priority:** **P1**
- **Specific Actionable Next Steps:**
  1. Add try/catch around `getUser()` in `handleMessage` and `handleAppMention`.
  2. On failure, continue processing with a minimal identity (userId only) and log a structured warning that distinguishes rate-limit, network, and invalid-user errors.
  3. Add tests that simulate Slack API failure and assert message is still persisted + agent response attempted.
  4. Consider a small retry/backoff for transient `ratelimited` responses.
- **Potential Assignees:** **2-A-M** (plugin migration author), backup: **0xSolace**
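
A guarded lookup along the lines of the steps above might look like this. It is a sketch under assumptions: `resolveSender`, `SlackUser`, and the injected `getUser` and `warn` functions are hypothetical stand-ins for whatever `plugin-slack` actually uses, and the classifier relies on the Slack error strings `ratelimited` and `user_not_found` appearing in the error message.

```typescript
// Hypothetical shapes; the real plugin-slack types may differ.
interface SlackUser {
  id: string;
  displayName?: string;
}

type UserLookup = (userId: string) => Promise<SlackUser>;

interface ResolvedSender {
  user: SlackUser;
  degraded: boolean; // true when the lookup failed and only the id is known
}

// Classify the failure for structured logging. Anything that is not clearly a
// rate limit or an unknown user (including network errors) falls into "other".
function classifyLookupError(err: unknown): "rate_limit" | "invalid_user" | "other" {
  const message = err instanceof Error ? err.message : String(err);
  if (message.includes("ratelimited")) return "rate_limit";
  if (message.includes("user_not_found")) return "invalid_user";
  return "other";
}

// Never let a failed user lookup abort message handling: fall back to an
// id-only identity and record a structured warning instead.
async function resolveSender(
  userId: string,
  getUser: UserLookup,
  warn: (entry: Record<string, string>) => void,
): Promise<ResolvedSender> {
  try {
    return { user: await getUser(userId), degraded: false };
  } catch (err) {
    warn({
      event: "slack_user_lookup_failed",
      userId,
      reason: classifyLookupError(err),
    });
    return { user: { id: userId }, degraded: true };
  }
}
```

Both `handleMessage` and `handleAppMention` could call the same helper, so the fallback and the log format stay consistent across the two handlers.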

---

### 4) elizaos.github.io daily summary page stuck since 2026-05-04 (site publishing pipeline regression)
- **Issue Title & ID:** `elizaos.github.io/summary/day` not updating (Discord report 2026-05-07; unfiled)
- **Current Status:** Open; suspected GitHub account/config or pipeline failure; no fix recorded.
- **Impact Assessment:**
  - **User Impact:** **Medium** (community relies on summaries; affects visibility)
  - **Functional Impact:** **No** (not core runtime), but affects comms/docs surface
  - **Brand Impact:** **Medium** (project appears stale/broken publicly)
- **Technical Classification:**
  - **Category:** Bug / Infrastructure
  - **Component Affected:** Docs/Website pipeline (GitHub Pages + summary generation)
  - **Complexity:** **Moderate effort**
- **Resource Requirements:**
  - **Required Expertise:** GitHub Actions, GitHub Pages publishing, permissions/tokens, content generation job
  - **Dependencies:** Access to the publishing repo/settings; credentials
  - **Estimated Effort (1-5):** **3**
- **Recommended Priority:** **P1**
- **Specific Actionable Next Steps:**
  1. Identify which workflow generates `summary/day` and check last successful run + failure logs.
  2. Validate token permissions (Pages deploy, workflow permissions, org policy changes).
  3. Add a “freshness monitor” (simple scheduled job that alerts/opens issue if no update in 24h).
  4. Document ownership + runbook for restoring the pipeline.
- **Potential Assignees:** **odilitime** (ops/core), backup: **0xSolace** (workflow/CI bugfixing)
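
The core of the freshness monitor from step 3 is a small staleness check. The sketch below assumes the last-update time is available as an ISO 8601 string (how it is obtained, for example from the page or the last deploy, is left open), and `isStale` and `hoursSince` are hypothetical names.

```typescript
const MS_PER_HOUR = 3_600_000;

// Hours elapsed between the last update and `now`. NaN if the input is unparseable.
function hoursSince(lastUpdatedIso: string, now: Date): number {
  return (now.getTime() - new Date(lastUpdatedIso).getTime()) / MS_PER_HOUR;
}

// Treat the page as stale if it is older than the threshold, or if the
// timestamp cannot be parsed (failing closed, so a broken page also alerts).
function isStale(lastUpdatedIso: string, now: Date, maxAgeHours = 24): boolean {
  const age = hoursSince(lastUpdatedIso, now);
  return Number.isNaN(age) || age > maxAgeHours;
}
```

A scheduled job would call `isStale` with the current time and open an issue or ping maintainers when it returns `true`.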

---

### 5) Hyperfy plugin removed / 404 + core/plugin version mismatch (breaks 3D starter onboarding)
- **Issue Title & ID:** Hyperfy plugin missing from GitHub + incompatible with current core (Discord report 2026-05-07; unfiled)
- **Current Status:** Plugin removed (404); workaround shared privately as a zip; underlying compatibility issue unresolved.
- **Impact Assessment:**
  - **User Impact:** **Medium** (subset building 3D/Hyperfy experiences; onboarding friction)
  - **Functional Impact:** **Partial** (blocks that integration path)
  - **Brand Impact:** **High** (dead links + removed plugin looks chaotic/unreliable)
- **Technical Classification:**
  - **Category:** Bug / Plugin ecosystem maintenance
  - **Component Affected:** Plugin System, 3D/Hyperfy integration, release/versioning
  - **Complexity:** **Architectural change** if versioning/compat policy is missing; otherwise **Moderate effort**
- **Resource Requirements:**
  - **Required Expertise:** Plugin API compatibility, packaging/versioning, CI publishing, repo management
  - **Dependencies:** Need the last known good plugin source and current core/plugin API expectations
  - **Estimated Effort (1-5):** **4**
- **Recommended Priority:** **P2** (promote to P1 if Hyperfy is a strategic demo/partner path)
- **Specific Actionable Next Steps:**
  1. Restore the repository (or publish an archived read-only mirror) so existing links stop returning 404.
  2. Create a compatibility matrix: core version ↔ plugin version; add CI check that fails on API drift.
  3. Update plugin to current core:
     - fix build/ESM expectations
     - bump peer deps
     - add minimal smoke test that loads the plugin in a sample agent runtime
  4. Replace “DM zip” with a documented release artifact.
- **Potential Assignees:** **odilitime** (plugin coordination), **2-A-M** (monorepo/plugin integration patterns), community collaborator: **binarycookies** (reporter; can validate)
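
The compatibility matrix from step 2 could be checked in CI with something like the sketch below. Everything here is illustrative: the `CompatEntry` shape and function names are hypothetical, and the parser only accepts plain `MAJOR.MINOR.PATCH` strings (no pre-release tags).

```typescript
type Version = [number, number, number];

// Hypothetical matrix row: a plugin version and the core range it supports.
interface CompatEntry {
  plugin: string;
  minCore: string; // inclusive
  maxCoreExclusive: string; // exclusive upper bound
}

function parseVersion(v: string): Version {
  const parts = v.split(".").map((p) => Number(p));
  if (parts.length !== 3 || parts.some((n) => !Number.isInteger(n) || n < 0)) {
    throw new Error(`unsupported version string: ${v}`);
  }
  const [major = 0, minor = 0, patch = 0] = parts;
  return [major, minor, patch];
}

// Negative if a < b, zero if equal, positive if a > b.
function compareVersions(a: string, b: string): number {
  const [aMajor, aMinor, aPatch] = parseVersion(a);
  const [bMajor, bMinor, bPatch] = parseVersion(b);
  return aMajor - bMajor || aMinor - bMinor || aPatch - bPatch;
}

// A plugin version with no matrix entry is treated as incompatible.
function isCompatible(
  matrix: CompatEntry[],
  pluginVersion: string,
  coreVersion: string,
): boolean {
  const entry = matrix.find((e) => e.plugin === pluginVersion);
  if (!entry) return false;
  return (
    compareVersions(coreVersion, entry.minCore) >= 0 &&
    compareVersions(coreVersion, entry.maxCoreExclusive) < 0
  );
}
```

Treating unknown plugin versions as incompatible forces every release to add a matrix row, which is what makes the CI check fail on undeclared drift.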

---

### 6) Twitter/X integration: X API now required; docs + UX must reflect pricing and setup expectations
- **Issue Title & ID:** Docs/UX mismatch: Twitter integration requires paid X API (Discord report 2026-05-07; unfiled)
- **Current Status:** Requirement clarified in Discord; docs likely outdated; risk of user confusion and failed setups.
- **Impact Assessment:**
  - **User Impact:** **High** (common connector; many users try Twitter/X first)
  - **Functional Impact:** **Partial** (integration fails without X API)
  - **Brand Impact:** **Medium** (confusion feels like “broken integration”)
- **Technical Classification:**
  - **Category:** Documentation / UX
  - **Component Affected:** Plugin docs, Settings UI copy, onboarding
  - **Complexity:** **Simple fix**
- **Resource Requirements:**
  - **Required Expertise:** Docs + connector UX copy; basic knowledge of X auth modes
  - **Dependencies:** Confirm the enforced requirement in code paths; ensure no legacy login remains
  - **Estimated Effort (1-5):** **1**
- **Recommended Priority:** **P2**
- **Specific Actionable Next Steps:**
  1. Update docs to state that the X API is required, which access tiers are expected, and which actions need posting (write) access versus read-only access.
  2. Update Settings UI to show a clear “Requires X API key” message and link to setup guide.
  3. Add runtime error mapping: if missing/invalid X credentials, surface actionable guidance (not generic failure).
- **Potential Assignees:** **2-A-M** (UI/doc-adjacent work), **odilitime** (connector guidance)
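
The runtime error mapping in step 3 might be sketched as follows. The setting names `X_API_KEY` and `X_API_SECRET`, the function names, and the message wording are all assumptions; the real plugin's configuration keys and the exact responses returned by X should be confirmed before adopting anything like this.

```typescript
// Hypothetical setting names; the real plugin may use different keys.
const REQUIRED_X_SETTINGS = ["X_API_KEY", "X_API_SECRET"] as const;

type Settings = Record<string, string | undefined>;

// A setting counts as missing when it is absent, empty, or only whitespace.
function findMissingSettings(settings: Settings): string[] {
  return REQUIRED_X_SETTINGS.filter((name) => !settings[name]?.trim());
}

// Return actionable guidance for a setup problem, or null if none is detected.
// `httpStatus` is the status of a failed X API call, if one was made.
function explainXSetupProblem(settings: Settings, httpStatus?: number): string | null {
  const missing = findMissingSettings(settings);
  if (missing.length > 0) {
    return `X API credentials are required. Missing: ${missing.join(", ")}. See the X setup guide.`;
  }
  if (httpStatus === 401) {
    return "X rejected the configured credentials. Check that the API key and secret are correct.";
  }
  if (httpStatus === 403) {
    return "X refused this action. Check that your X API access level permits it.";
  }
  return null;
}
```

The Settings UI copy and the runtime error could share these messages so users see the same guidance in both places.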

---

### 7) Security/community: scam warning posted without details (needs incident capture + moderation response pattern)
- **Issue Title & ID:** Untriaged “scam” report in Discord (Discord report 2026-05-09; unfiled)
- **Current Status:** A user flagged “scam” with no context; unknown whether active phishing is ongoing.
- **Impact Assessment:**
  - **User Impact:** **Medium → Critical** (unknown scope; could be widespread if active)
  - **Functional Impact:** **No** (not code), but community safety risk
  - **Brand Impact:** **High** (security incidents in community harm trust)
- **Technical Classification:**
  - **Category:** Security / Community Ops
  - **Component Affected:** Discord moderation processes; possibly link filtering/bot automation
  - **Complexity:** **Moderate effort** (process + tooling)
- **Resource Requirements:**
  - **Required Expertise:** Moderation, Discord audit logs, basic incident response
  - **Dependencies:** Access to message context; moderation permissions
  - **Estimated Effort (1-5):** **2**
- **Recommended Priority:** **P1**
- **Specific Actionable Next Steps:**
  1. Ask reporter for the message link/screenshot; locate the flagged content in logs.
  2. If malicious: remove content, warn/ban user, post a short advisory with indicators.
  3. Add a lightweight “scam report” template and a mod-only channel for intake.
  4. Consider enabling stronger anti-phishing automod rules (new account link posting limits, domain allow/deny lists).
- **Potential Assignees:** **odilitime** (Community Ops), **dankvr** (Moderator/Core Dev), **dieantwoord1337** (reporter, to provide context)

---

### 8) Developer support: “movement” functionality blocked in a chat bubble UI project (community request)
- **Issue Title & ID:** Help request: movement implementation blocked in chat app with bubble UI (Discord 2026-05-09; unfiled)
- **Current Status:** Unanswered; unclear relation to elizaOS core (likely user project).
- **Impact Assessment:**
  - **User Impact:** **Low**
  - **Functional Impact:** **No**
  - **Brand Impact:** **Low**
- **Technical Classification:**
  - **Category:** UX / Developer Support
  - **Component Affected:** N/A (likely external project)
  - **Complexity:** **Unknown**
- **Resource Requirements:**
  - **Required Expertise:** Frontend/game UI movement patterns
  - **Dependencies:** Need code snippet/repo
  - **Estimated Effort (1-5):** **1**
- **Recommended Priority:** **P3**
- **Specific Actionable Next Steps:**
  1. Ask for minimal repro (repo/video), tech stack, and what “movement” means (scrolling? dragging? avatar movement?).
  2. Route to a community help thread; capture FAQ if common.
- **Potential Assignees:** Community volunteers; **binarycookies** (requester) to provide repro details

---

## Top Priority Summary (Immediate Focus: Top 5–10)
1. **P0:** Cloud monetized chat auth/billing reconciliation correctness (Follow-up to **PR #7376**)
2. **P0:** Cloud domain sync never marking `verified=true` (CORS/origins break) (Follow-up to **PR #7376**)
3. **P1:** Slack connector can silently drop inbound messages on Slack API lookup errors (Follow-up to **PR #7375**)
4. **P1:** `elizaos.github.io/summary/day` stuck since 2026-05-04 (publishing pipeline regression)
5. **P1:** Untriaged Discord scam warning (confirm/contain + improve incident intake)
6. **P2:** Hyperfy plugin removed/404 + version mismatch (restore + compatibility work)
7. **P2:** Twitter/X requires X API — update docs/UX to avoid failed setups
8. **P3:** Community “movement” help request (non-core)

---

## Patterns / Themes Indicating Deeper Issues
- **Post-merge critical-path regressions escaping review:** Multiple P1/P0 findings are in **merged** PRs (Cloud monetization + Slack connector), suggesting missing gating on automated review warnings and/or insufficient pre-merge production-like tests.
- **Silent failure modes:** Slack “drop message on exception”, Cloud “auth becomes 500”, and the website summary pipeline “stuck without alerting” are all cases where failures are **not surfaced clearly** to users or maintainers.
- **Ecosystem/versioning drift:** Hyperfy plugin removal + compatibility mismatch signals a broader need for **plugin API compatibility contracts** and publishing discipline.
- **Operational observability gaps:** Lack of freshness monitors (website summaries) and missing structured error classification (Cloud endpoints) increases time-to-diagnosis.

---

## Process Improvement Recommendations
1. **Introduce “P0/P1 review gating” for merges:** If an automated reviewer (e.g., Greptile) flags a P0 or P1 finding, require explicit resolution or documented risk acceptance before merge.
2. **Add connector reliability smoke tests:** Minimal integration tests for Slack/Telegram/Discord event ingestion that assert “inbound message ⇒ stored memory ⇒ attempted reply” even when upstream APIs error.
3. **Add Cloud “billing invariants” test suite:** Property/invariant tests ensuring reconciliation failures cannot yield “delivered response with full refund” or “charged without response/refund”.
4. **Improve error taxonomy and response mapping:** Standardize auth/billing error handling so auth failures (401/403) are never returned as 500; add structured error codes for clients.
5. **Plugin compatibility CI checks:** Enforce a compatibility matrix and automated build/load tests for plugins against current `develop`; publish deprecation/removal policy to avoid 404 surprises.
6. **Ops monitoring for public surfaces:** Add a scheduled job that opens an issue/pings maintainers if the daily summary page hasn’t updated within an SLA window.
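
The two billing invariants in recommendation 3 can be expressed as a check over every possible outcome. The sketch below uses exhaustive enumeration rather than a property-testing library, and `ChatOutcome`, `CreditAction`, and `decideCreditAction` are hypothetical names that model the policy, not the actual ledger code.

```typescript
type CreditAction = "settle_actual_usage" | "full_refund" | "queue_reconcile_retry";

interface ChatOutcome {
  delivered: boolean; // any content reached the client
  reconcileOk: boolean; // ledger reconciliation succeeded
}

// Hypothetical policy: refund only when nothing was delivered; a failed
// reconciliation after delivery is retried instead of refunded.
function decideCreditAction(outcome: ChatOutcome): CreditAction {
  if (!outcome.delivered) return "full_refund";
  return outcome.reconcileOk ? "settle_actual_usage" : "queue_reconcile_retry";
}

// Enumerate every outcome and describe each violated invariant.
function findInvariantViolations(
  decide: (outcome: ChatOutcome) => CreditAction,
): string[] {
  const violations: string[] = [];
  for (const delivered of [true, false]) {
    for (const reconcileOk of [true, false]) {
      const action = decide({ delivered, reconcileOk });
      if (delivered && action === "full_refund") {
        violations.push("delivered response with full refund");
      }
      if (!delivered && action !== "full_refund") {
        violations.push("charged without response or refund");
      }
    }
  }
  return violations;
}
```

Because the checker takes the policy as a parameter, the same test can run against the real decision logic once it is extracted from the route handler.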