## Issue Triage — 2026-05-09

### 1) Cloud monetized chat: credit reconciliation can refund after content delivered / overcharge on reconciliation failure (post-merge follow-up to PR #7376)
- **Issue Title & ID:** `cloud/apps: /api/v1/apps/:appId/chat reconciliation edge cases enable free inference or user overcharge` (elizaos/eliza **PR #7376 follow-up issue — needs filing**)
- **Current Status:** **Known risk identified in review; PR merged; no tracked issue found in data**
- **Impact Assessment:**
  - **User Impact:** **Critical** (affects paying traffic on monetized apps)
  - **Functional Impact:** **Yes** (billing correctness + response delivery)
  - **Brand Impact:** **High** (trust + monetization integrity)
- **Technical Classification:**
  - **Issue Category:** Bug / Security (economic abuse)
  - **Component Affected:** Cloud API (Apps chat/billing)
  - **Complexity:** **Complex solution** (streaming lifecycle + transactional billing + error handling)
- **Resource Requirements:**
  - **Required Expertise:** Cloud API, billing/credits, streaming response handling (Hono/Workers), transactional design
  - **Dependencies:** Deployment pipeline for Cloud API; alignment with billing service; test harness for streaming failures
  - **Estimated Effort (1–5):** **5**
- **Recommended Priority:** **P0**
- **Specific Actionable Next Steps:**
  1. **File a tracking issue** capturing the two failure modes from review:
     - Streaming: if post-stream reconciliation fails, the current catch path may **issue a full refund after output has been delivered**.
     - Non-streaming: if reconciliation fails after the provider response is consumed, the user can be **charged and receive a 500 or no response**.
  2. Refactor reconciliation into a **two-phase accounting model**:
     - Reserve (pre-call) → deliver response → reconcile (post-call) with **idempotency key** and “delivered=true” guard.
  3. Ensure catch paths **never set actual cost to 0** if content was delivered; instead mark the charge `reconcile_pending`.
  4. Add tests simulating:
     - writer closed then reconcile throws
     - reconcile throws after provider JSON read
     - retries/idempotency behavior
  5. Add operational alerting/dashboard for `reconcile_pending` and automatic retry job.
- **Potential Assignees:**
  - **NubsCarson** (authored PR #7376)
  - **standujar** (cloud stabilization/auth work)
  - **0xSolace** (cloud hotfixes during migration)
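
The two-phase model in step 2 can be sketched as below. This is a minimal in-memory illustration, not the Cloud API's billing code: `CreditLedger`, `Charge`, `Settle`, and the state names other than `reconcile_pending` are hypothetical, and a real implementation would persist charges transactionally.

```typescript
// Hypothetical in-memory ledger illustrating reserve -> deliver -> reconcile.
type ChargeState = "reserved" | "reconciled" | "reconcile_pending" | "refunded";

interface Charge {
  idempotencyKey: string;
  reserved: number;      // credits held before the provider call
  actual: number | null; // final cost, set only when settlement succeeds (or on refund)
  delivered: boolean;    // true once any content has reached the caller
  state: ChargeState;
}

// Stand-in for the real billing call; it may throw.
type Settle = (charge: Charge, actualCost: number) => void;

class CreditLedger {
  private charges = new Map<string, Charge>();

  // Phase 1: hold credits before calling the provider. Reusing a key returns the same charge.
  reserve(idempotencyKey: string, estimate: number): Charge {
    const existing = this.charges.get(idempotencyKey);
    if (existing) return existing;
    const charge: Charge = {
      idempotencyKey,
      reserved: estimate,
      actual: null,
      delivered: false,
      state: "reserved",
    };
    this.charges.set(idempotencyKey, charge);
    return charge;
  }

  // Call as soon as the response (or the first streamed chunk) has been sent.
  markDelivered(idempotencyKey: string): void {
    this.get(idempotencyKey).delivered = true;
  }

  // Phase 2: settle the actual cost. Safe to retry; terminal states are never rewritten.
  reconcile(idempotencyKey: string, actualCost: number, settle: Settle): Charge {
    const charge = this.get(idempotencyKey);
    if (charge.state === "reconciled" || charge.state === "refunded") return charge;
    try {
      settle(charge, actualCost);
      charge.actual = actualCost;
      charge.state = "reconciled";
    } catch {
      if (charge.delivered) {
        // Delivered guard: never zero the cost after content went out.
        charge.state = "reconcile_pending";
      } else {
        // Nothing was delivered, so a full refund is safe.
        charge.actual = 0;
        charge.state = "refunded";
      }
    }
    return charge;
  }

  private get(idempotencyKey: string): Charge {
    const charge = this.charges.get(idempotencyKey);
    if (!charge) throw new Error(`unknown charge: ${idempotencyKey}`);
    return charge;
  }
}
```

A retry job would call `reconcile` again for every charge left in `reconcile_pending`; because terminal states short-circuit, repeated retries cannot double-charge or refund delivered content.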

---

### 2) Cloud monetized chat: API-key auth errors returned as HTTP 500 instead of 401/403 (post-merge follow-up to PR #7376)
- **Issue Title & ID:** `cloud/apps: /api/v1/apps/:appId/chat masks AuthenticationError/ForbiddenError as 500 for API-key callers` (elizaos/eliza **PR #7376 follow-up issue — needs filing**)
- **Current Status:** **Known risk identified in review; PR merged**
- **Impact Assessment:**
  - **User Impact:** **High** (breaks API clients; misleads retries/backoff logic)
  - **Functional Impact:** **Partial** (auth works but error contract is wrong)
  - **Brand Impact:** **High** (appears as platform instability)
- **Technical Classification:**
  - **Issue Category:** Bug / UX (API contract)
  - **Component Affected:** Cloud API middleware + apps chat route
  - **Complexity:** **Moderate effort**
- **Resource Requirements:**
  - **Required Expertise:** Cloud API error handling patterns, auth middleware ordering
  - **Dependencies:** None (localized fix) but should be coordinated with billing fix above
  - **Estimated Effort (1–5):** **3**
- **Recommended Priority:** **P1**
- **Specific Actionable Next Steps:**
  1. Move `requireAuthOrApiKeyWithOrg()` out of `Promise.all`, or wrap the call so that its **typed errors are preserved** when they reach the route's error handler.
  2. Add an error mapper that converts known auth errors to **401/403** with stable `code`.
  3. Add unit tests for:
     - missing/invalid API key
     - forbidden org access
     - cookie-auth path still returns expected results
- **Potential Assignees:**
  - **standujar**
  - **NubsCarson**
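
The error mapper in step 2 could look roughly like this. The two error classes are local stand-ins for the `AuthenticationError`/`ForbiddenError` types named in the issue title, and `mapAuthError`, the `code` strings, and the response shape are assumptions made for illustration.

```typescript
// Local stand-ins for the Cloud API's typed auth errors (assumed shapes).
class AuthenticationError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "AuthenticationError";
    Object.setPrototypeOf(this, new.target.prototype); // keeps instanceof working on ES5 targets
  }
}

class ForbiddenError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ForbiddenError";
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

interface ErrorResponse {
  status: 401 | 403 | 500;
  body: { code: string; message: string };
}

// Maps known auth errors to 401/403 with a stable `code`; everything else stays a 500.
function mapAuthError(err: unknown): ErrorResponse {
  if (err instanceof AuthenticationError) {
    return { status: 401, body: { code: "unauthenticated", message: err.message } };
  }
  if (err instanceof ForbiddenError) {
    return { status: 403, body: { code: "forbidden", message: err.message } };
  }
  // Do not leak internal error details to API clients.
  return { status: 500, body: { code: "internal_error", message: "Internal server error" } };
}
```

Keeping the mapper as a pure function makes the three test cases in step 3 straightforward to cover without a running server.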

---

### 3) Cloud domains: `/sync` never marks Cloudflare domain as `verified=true`, breaking CORS origins (post-merge follow-up to PR #7376)
- **Issue Title & ID:** `cloud/domains: sync leaves verified=false after zone provisioning; verified origins list remains empty` (elizaos/eliza **PR #7376 follow-up issue — needs filing**)
- **Current Status:** **Known risk identified in review; PR merged**
- **Impact Assessment:**
  - **User Impact:** **High** (custom domains may never work due to CORS/origin gating)
  - **Functional Impact:** **Yes** (blocks custom domain usage for apps)
  - **Brand Impact:** **High** (paid feature appears broken)
- **Technical Classification:**
  - **Issue Category:** Bug
  - **Component Affected:** Cloud API (domains sync) + CORS origin resolution
  - **Complexity:** **Moderate effort**
- **Resource Requirements:**
  - **Required Expertise:** Cloudflare domain lifecycle, DB schema/state machine
  - **Dependencies:** Might require backfill/migration for existing purchased domains stuck unverified
  - **Estimated Effort (1–5):** **3**
- **Recommended Priority:** **P1**
- **Specific Actionable Next Steps:**
  1. Update sync path to set `verified: true` when registrar status indicates active + zone ready.
  2. Add a one-time **backfill job** or admin script to fix existing records.
  3. Add tests for status transition: pending → active implies verified + appears in `listVerifiedAppOrigins`.
- **Potential Assignees:**
  - **NubsCarson**
  - **standujar**
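
The status transition from step 1 and the test in step 3 can be illustrated with a small pure function. `DomainRecord`, `applySync`, and `listVerifiedOriginsSketch` are hypothetical names (the last one only mimics what `listVerifiedAppOrigins` is described as doing), and keeping `verified` sticky once granted is an assumption, not confirmed behavior.

```typescript
// Hypothetical record shape; the real schema may differ.
interface DomainRecord {
  domain: string;
  registrarStatus: "pending" | "active";
  zoneReady: boolean;
  verified: boolean;
}

// Applies one sync result. Verification is granted when the registrar reports
// "active" and the zone is ready; once granted it is kept (assumption).
function applySync(
  record: DomainRecord,
  registrarStatus: "pending" | "active",
  zoneReady: boolean,
): DomainRecord {
  const verified = record.verified || (registrarStatus === "active" && zoneReady);
  return { ...record, registrarStatus, zoneReady, verified };
}

// Stand-in for the origin list used by CORS gating.
function listVerifiedOriginsSketch(records: DomainRecord[]): string[] {
  return records.filter((r) => r.verified).map((r) => `https://${r.domain}`);
}
```

The same function could drive the backfill in step 2 by re-applying each stored record's own status: `records.map((r) => applySync(r, r.registrarStatus, r.zoneReady))`.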

---

### 4) Container control plane: internal token enforcement becomes a no-op when the env var is missing (post-merge follow-up to PR #7376)
- **Issue Title & ID:** `cloud/container-control-plane: requireInternalToken bypass when CONTAINER_CONTROL_PLANE_TOKEN unset` (elizaos/eliza **PR #7376 follow-up issue — needs filing**)
- **Current Status:** **Known risk identified in review; PR merged**
- **Impact Assessment:**
  - **User Impact:** **Critical** (potential unauthorized control-plane actions)
  - **Functional Impact:** **Yes** (security boundary)
  - **Brand Impact:** **High** (cloud security posture)
- **Technical Classification:**
  - **Issue Category:** Security
  - **Component Affected:** Cloud services (container control plane)
  - **Complexity:** **Moderate effort**
- **Resource Requirements:**
  - **Required Expertise:** Cloud service auth, infra/env management, secret provisioning
  - **Dependencies:** Deployment config must ensure token is always set; secrets management integration
  - **Estimated Effort (1–5):** **4**
- **Recommended Priority:** **P0**
- **Specific Actionable Next Steps:**
  1. Change `requireInternalToken` behavior to **fail closed**:
     - If the token env var is missing outside development, reject every request with 503 (or 500) and emit a high-severity log.
  2. Ensure CI/deploy manifests require the secret (Terraform/K8s/Workers config).
  3. Add security regression tests (request without token must be rejected even if env misconfigured).
- **Potential Assignees:**
  - **standujar**
  - **0xSolace**
  - (consult) **Dexploarer** (vault/secrets patterns if needed)
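
A fail-closed check along the lines of step 1 might look like the sketch below. `checkInternalToken`, its parameters, and the result shape are hypothetical; only the `CONTAINER_CONTROL_PLANE_TOKEN` variable and the 503 response come from the notes above, and the development bypass is an assumption drawn from the "non-dev" wording.

```typescript
interface TokenCheckResult {
  allowed: boolean;
  status: 200 | 401 | 503;
  reason?: string;
}

// Compares two strings without exiting early on the first mismatch.
// The length check still reveals length, which is acceptable for fixed-size tokens.
function constantTimeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}

// `expected` would come from CONTAINER_CONTROL_PLANE_TOKEN; `presented` from the request.
function checkInternalToken(
  expected: string | undefined,
  presented: string | undefined,
  isDev: boolean,
): TokenCheckResult {
  if (!expected) {
    if (isDev) {
      // Assumed local-development bypass; remove if dev should also fail closed.
      return { allowed: true, status: 200, reason: "dev mode: token not configured" };
    }
    // Fail closed: a missing secret is a deployment error, never a free pass.
    return { allowed: false, status: 503, reason: "internal token not configured" };
  }
  if (!presented || !constantTimeEqual(presented, expected)) {
    return { allowed: false, status: 401, reason: "invalid internal token" };
  }
  return { allowed: true, status: 200 };
}
```

The caller would log the `reason` at high severity whenever the 503 branch is hit, which also gives the regression test in step 3 a concrete signal to assert on.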

---

### 5) Slack connector: unhandled `users.info` failures can drop inbound messages silently (post-merge follow-up to PR #7375)
- **Issue Title & ID:** `plugin-slack: missing try/catch around users.info causes Bolt handler to throw; messages dropped` (elizaos/eliza **PR #7375 follow-up issue — needs filing**)
- **Current Status:** **Identified in review; PR merged**
- **Impact Assessment:**
  - **User Impact:** **High** (Slack is a major connector; dropped messages are hard to diagnose)
  - **Functional Impact:** **Yes** (core connector reliability)
  - **Brand Impact:** **High** (“bot ignores messages” perception)
- **Technical Classification:**
  - **Issue Category:** Bug / Reliability
  - **Component Affected:** Plugin System (plugin-slack)
  - **Complexity:** **Simple fix** (guard error path) + **moderate** (behavior decisions)
- **Resource Requirements:**
  - **Required Expertise:** Slack Bolt, connector runtime event handling, error/reporting conventions
  - **Dependencies:** None
  - **Estimated Effort (1–5):** **2**
- **Recommended Priority:** **P1**
- **Specific Actionable Next Steps:**
  1. Wrap `getUser()` / `client.users.info` calls in try/catch; on failure:
     - continue with a minimal user profile (id-only) and still create memory + respond
     - log structured warning with Slack error code (rate_limited, user_not_found, etc.)
  2. Add tests for simulated Slack API failure ensuring message is still processed.
  3. Add connector-level metric: “inbound events processed vs failed”.
- **Potential Assignees:**
  - **2-A-M** (connector-related architecture familiarity)
  - **0xSolace** (stability/hotfix work)
  - (reviewer) **NubsCarson**, or **2-A-M** if not the primary assignee
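
The guard described in step 1 can be sketched independently of Bolt. `resolveUser`, `UserLookup`, and `SlackUserProfile` are hypothetical names; the lookup callback stands in for `getUser()` / `client.users.info`, and reading the Slack error string from `err.data.error` is an assumption about the error shape, with `"unknown"` as the fallback.

```typescript
interface SlackUserProfile {
  id: string;
  name?: string;
}

type UserLookup = (userId: string) => Promise<SlackUserProfile>;
type WarnFn = (message: string, meta: Record<string, unknown>) => void;

// Best-effort extraction of a Slack error code (assumed to live at err.data.error).
function slackErrorCode(err: unknown): string {
  if (typeof err === "object" && err !== null && "data" in err) {
    const data = (err as { data?: { error?: unknown } }).data;
    if (data && typeof data.error === "string") return data.error;
  }
  return "unknown";
}

// Never throws: on lookup failure it logs a structured warning and
// returns an id-only profile so the message can still be processed.
async function resolveUser(
  lookup: UserLookup,
  userId: string,
  warn: WarnFn,
): Promise<SlackUserProfile> {
  try {
    return await lookup(userId);
  } catch (err) {
    warn("slack user lookup failed; continuing with id-only profile", {
      userId,
      code: slackErrorCode(err),
    });
    return { id: userId };
  }
}
```

Because the fallback path returns a usable profile, the handler that calls `resolveUser` can go on to create memory and respond, and the warning count doubles as input for the processed-versus-failed metric in step 3.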

---

### 6) Hyperfy plugin removed / 404 + core version mismatch (Discord-reported)
- **Issue Title & ID:** `Hyperfy plugin missing (404) and incompatible with current elizaOS core` (Discord report **2026-05-07**, repo likely `elizaos-plugins/eliza-3d-hyperfy-starter` — **needs GitHub issue**)
- **Current Status:** **Untracked in GitHub data; workaround distributed via DM zip**
- **Impact Assessment:**
  - **User Impact:** **Medium** (subset of 3D/Hyperfy users, but blocks that integration entirely)
  - **Functional Impact:** **Partial** (plugin ecosystem credibility + specific integration blocked)
  - **Brand Impact:** **Medium/High** (plugin disappearance undermines trust)
- **Technical Classification:**
  - **Issue Category:** Bug / Documentation (distribution)
  - **Component Affected:** Plugin System / 3D integration
  - **Complexity:** **Moderate effort** (version alignment + packaging + release)
- **Resource Requirements:**
  - **Required Expertise:** Plugin API compatibility, packaging/release, Hyperfy integration knowledge
  - **Dependencies:** Determine target core version/API surface; repository ownership/archival policy
  - **Estimated Effort (1–5):** **3**
- **Recommended Priority:** **P2**
- **Specific Actionable Next Steps:**
  1. Create a public GitHub issue in the appropriate repo (or elizaos/eliza if monorepo migration planned) documenting:
     - last known working commit/tag
     - current breakage points (API changes)
     - expected install path
  2. Restore plugin availability:
     - re-add repo (or migrate into monorepo) and publish a tagged release compatible with current core
  3. Add a “plugin compatibility matrix” doc/page so removals are explicit and discoverable.
- **Potential Assignees:**
  - **odilitime** (actively engaged in troubleshooting; provided workaround)
  - **da4tner** (offered help)
  - (backup) **lalalune** (repo consolidation experience)

---

### 7) elizaos.github.io daily summary page stuck since May 4 (Discord-reported)
- **Issue Title & ID:** `Website: /summary/day page stopped updating (stuck since 2026-05-04)` (Discord report **2026-05-07** — **needs GitHub issue**)
- **Current Status:** **Unresolved; suspected GitHub account/config issue**
- **Impact Assessment:**
  - **User Impact:** **Medium** (community visibility, project transparency)
  - **Functional Impact:** **No** (doesn’t block runtime)
  - **Brand Impact:** **Medium** (looks like neglected automation)
- **Technical Classification:**
  - **Issue Category:** Bug / DevOps
  - **Component Affected:** Docs/Website pipeline (summary generation)
  - **Complexity:** **Moderate effort**
- **Resource Requirements:**
  - **Required Expertise:** GitHub Pages, Actions/cron, permissions/tokens, artifact publishing
  - **Dependencies:** Access to org secrets/tokens; verify scheduled workflows
  - **Estimated Effort (1–5):** **3**
- **Recommended Priority:** **P2**
- **Specific Actionable Next Steps:**
  1. Identify the job that publishes `/summary/day` and check:
     - schedule triggers firing
     - token permissions and rate limits
     - recent changes around May 4
  2. Add monitoring: on missed publish for 24h, open an issue or send Discord alert.
  3. Add a “last updated” badge and a fallback link to raw JSON endpoint if available.
- **Potential Assignees:**
  - **0xSolace** (ops-style fixes)
  - **lalalune** (workflow-heavy changes in repo)
  - **odilitime** (community triage)
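
The staleness check behind the monitoring in step 2 reduces to a date comparison. This sketch assumes summaries are keyed by UTC dates in `YYYY-MM-DD` form and that the newest expected summary covers the previous UTC day; `expectedSummaryDate` and `isSummaryStale` are hypothetical helpers, and how the latest published date is fetched is left out.

```typescript
// Returns yesterday's date (UTC) as YYYY-MM-DD, the newest summary we expect to exist.
function expectedSummaryDate(now: Date): string {
  const yesterday = new Date(
    Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate() - 1),
  );
  return yesterday.toISOString().slice(0, 10);
}

// ISO dates sort lexicographically, so a plain string comparison is enough.
function isSummaryStale(latestSummaryDate: string, now: Date): boolean {
  return latestSummaryDate < expectedSummaryDate(now);
}
```

A scheduled job could call `isSummaryStale` with the latest published date and open an issue or post a Discord alert when it returns `true`.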

---

### 8) Twitter/X integration: X API now required; docs and UX need update (Discord-reported)
- **Issue Title & ID:** `Docs/UX: Twitter/X connector now requires paid X API; users unclear about auth/cost` (Discord **2026-05-07** — **needs GitHub issue**)
- **Current Status:** **Clarified verbally on Discord; not reflected in docs**
- **Impact Assessment:**
  - **User Impact:** **High** (common connector; surprises new users)
  - **Functional Impact:** **Partial** (integration still possible but expectation mismatch blocks adoption)
  - **Brand Impact:** **Medium** (perceived “bait-and-switch” if not communicated)
- **Technical Classification:**
  - **Issue Category:** Documentation / UX
  - **Component Affected:** Connector docs + onboarding/settings copy
  - **Complexity:** **Simple fix**
- **Resource Requirements:**
  - **Required Expertise:** Docs writing, connector configuration knowledge
  - **Dependencies:** Confirm current connector implementation and required API tier
  - **Estimated Effort (1–5):** **1**
- **Recommended Priority:** **P2**
- **Specific Actionable Next Steps:**
  1. Update docs: “X API is required for posting” + pricing notes + alternative “read-only” modes if any.
  2. Update Settings UI copy to explicitly state requirements before users attempt setup.
  3. Add a troubleshooting section for auth errors and common 401/403 cases.
- **Potential Assignees:**
  - **odilitime**
  - **2-A-M** (UI copy placement)
  - Docs maintainers (unlisted)

---

### 9) DeFi plugin registry security docs: SafeAgent entry awaiting review (docs PR)
- **Issue Title & ID:** `Add SafeAgent to DeFi plugin registry` (elizaos/docs **PR #84**)
- **Current Status:** **Open / awaiting review**
- **Impact Assessment:**
  - **User Impact:** **Medium** (improves safety posture for DeFi plugin users)
  - **Functional Impact:** **No** (documentation/registry)
  - **Brand Impact:** **Medium** (signals security diligence)
- **Technical Classification:**
  - **Issue Category:** Documentation / Security (guidance)
  - **Component Affected:** Docs / DeFi plugin ecosystem registry
  - **Complexity:** **Simple fix**
- **Resource Requirements:**
  - **Required Expertise:** DeFi plugin ecosystem context, security review literacy
  - **Dependencies:** Reviewer availability; confirm SafeAgent scope and link accuracy
  - **Estimated Effort (1–5):** **1**
- **Recommended Priority:** **P2**
- **Specific Actionable Next Steps:**
  1. Assign reviewers and merge/iterate within 48 hours.
  2. Add a lightweight policy: security-tool additions should have an SLA (e.g., 72h review).
- **Potential Assignees:**
  - Docs maintainers (unlisted)
  - **odilitime** (as coordinator if appropriate)

---

### 10) Community safety / moderation: request for USDT receiver + WhatsApp contact posted in coders channel
- **Issue Title & ID:** `Discord moderation: financial solicitation (USDT receiver / WhatsApp) in #coders` (Discord **2026-05-08**)
- **Current Status:** **No action logged**
- **Impact Assessment:**
  - **User Impact:** **Medium** (risk of scams targeting builders)
  - **Functional Impact:** **No**
  - **Brand Impact:** **High** (community trust/safety)
- **Technical Classification:**
  - **Issue Category:** Security / Community Ops
  - **Component Affected:** Discord moderation policies
  - **Complexity:** **Simple fix**
- **Resource Requirements:**
  - **Required Expertise:** Moderation, community guidelines enforcement
  - **Dependencies:** Moderator availability; clear policy on solicitation
  - **Estimated Effort (1–5):** **1**
- **Recommended Priority:** **P1**
- **Specific Actionable Next Steps:**
  1. Review and remove/flag the message if it violates rules; warn/ban if repeated.
  2. Add/clarify a pinned rule: no payment handling solicitations, no off-platform payment brokering.
  3. Add an AutoMod filter for “USDT”, “WhatsApp”, wallet addresses in dev channels (tunable).
- **Potential Assignees:**
  - Discord moderators/admins (unlisted)
  - **odilitime** (if acting in a mod capacity)

---

## Highest-Priority Focus (Top 5–10 to address immediately)
1. **P0:** Cloud monetized chat reconciliation correctness (free inference / overcharge) — **PR #7376 follow-up**
2. **P0:** Container control-plane token enforcement fail-open — **PR #7376 follow-up**
3. **P1:** Cloud chat auth errors misreported as 500 — **PR #7376 follow-up**
4. **P1:** Cloud domain sync never sets `verified=true` (breaks CORS/custom domains) — **PR #7376 follow-up**
5. **P1:** Slack connector message drops on `users.info` failure — **PR #7375 follow-up**
6. **P1:** Discord moderation: financial solicitation in dev channels
7. **P2:** Hyperfy plugin missing/404 + compatibility drift
8. **P2:** elizaos.github.io daily summary publishing stuck
9. **P2:** X API requirement documentation + Settings UI copy update
10. **P2:** docs#84 SafeAgent registry PR review/merge

---

## Patterns / Themes Indicating Deeper Architectural Problems
- **Post-merge “known P1/P0” findings not converted into tracked issues:** Greptile flagged P1 findings, yet the PRs merged without tracked follow-ups, which indicates a gap between automated review signals and release gating.
- **State-machine drift in Cloud features (domains/billing):** The stuck `verified` flag and the billing reconciliation edge cases both suggest incomplete invariants and missing idempotency.
- **Connector reliability hinges on external API calls without defensive guards:** Slack user lookup failure dropping events mirrors prior Telegram reliability themes (silent message loss harms trust).
- **Plugin ecosystem lifecycle management is ad hoc:** Hyperfy plugin removal + DM zip workaround implies missing policy for deprecation, compatibility, and distribution continuity.
- **Automation/observability gaps for project-facing reporting:** The stuck daily summary page suggests fragile publishing pipelines without alerting.

---

## Process Improvements (to prevent repeats)
1. **Introduce “P0/P1 must-have issue” gating:** If automated review (Greptile/CI) flags P0/P1, require either a fix before merge or a **linked issue with owner + SLA**.
2. **Add Cloud monetization invariants + idempotency standards:**
   - mandatory idempotency keys for billing operations
   - explicit “delivered” vs “charged” vs “reconciled” state model
   - chaos tests for reconciliation failures
3. **Connector resilience checklist:** All inbound event handlers must be “no-throw” at top level; external API calls need fallbacks; add metrics for dropped events.
4. **Plugin compatibility + deprecation policy:** Maintain a compatibility matrix and a “soft-deprecate” flow instead of repo removal; provide archived tags/releases and migration notes.
5. **Website/docs automation monitoring:** Add a scheduled check that validates “latest summary date == yesterday” and alerts (Discord + issue) when stale.