# Council Briefing: 2025-04-02

## Monthly Goal

December 2025: Execution excellence—complete token migration with high success rate, launch ElizaOS Cloud, stabilize flagship agents, and build developer trust through reliability and clear documentation.

## Daily Focus

- Core stability fixes (DB deadlocks, migration safety, memory duplication) improved the fleet's reliability posture, while newly surfaced CLI and documentation usability gaps are now the primary threat to developer trust.

## Key Points for Deliberation

### 1. Topic: V2 Reliability Hardening (DB + Memory)

**Summary of Topic:** Core stability work landed across database migrations, transaction safety, and memory/interaction handling—directly serving the Execution Excellence principle, but indicating the system is still in a consolidation phase rather than feature expansion.

#### Deliberation Items (Questions):

**Question 1:** Do we prioritize a short reliability freeze (no new features) to burn down database/memory risks before any wider distribution push?

  **Context:**
  - `GitHub Daily Update (2025-04-02): "Resolved a migration issue with pglite" (PR #4158) and "resolved a database transaction deadlock" (PR #4142).`
  - `GitHub Daily Update (2025-04-02): "Fixed memory duplication and cursor caching issues related to Twitter interactions" (PR #4155).`

  **Multiple Choice Answers:**
    a) Yes—declare a reliability freeze until DB + memory invariants are verified and regression-tested.
        *Implication:* Increases confidence and reduces support burden, but delays ecosystem-visible features.
    b) Partial freeze—allow only features that reduce operational risk or improve observability.
        *Implication:* Maintains momentum while protecting stability, but requires strong triage discipline.
    c) No—continue parallel feature and stability work with best-effort testing.
        *Implication:* Speeds perceived progress, but risks compounding bugs that erode developer trust.
    d) Other / More discussion needed / None of the above.

**Question 2:** What is the Council’s minimum acceptable “data safety standard” for releases (migrations, deadlocks, and memory duplication) before we call the framework reliable?

  **Context:**
  - `PR #4142 summary: DB connections stuck "idle in transaction" causing unresponsiveness; fix shipped.`
  - `PR #4158 summary: pglite migration risk due to inconsistent Datadir usage; fix shipped.`

  **Multiple Choice Answers:**
    a) Formal bar: migration rollback strategy + deadlock tests + memory dedupe guarantees in CI.
        *Implication:* Sets a high reliability signal to builders; increases engineering and CI investment.
    b) Pragmatic bar: a curated suite of integration tests and documented recovery playbooks.
        *Implication:* Balances speed and safety; relies on operational discipline and clear docs.
    c) Market bar: ship when critical bugs are fixed and community reports decline.
        *Implication:* Optimizes for velocity, but leaves “reliability” as a moving target.
    d) Other / More discussion needed / None of the above.

**Question 3:** Should Twitter/interaction memory correctness be treated as a core runtime concern (first-class) rather than plugin-local logic?

  **Context:**
  - `Issue #4127: Twitter plugin repeatedly checks the same tweets/mentions in a loop; suggested cursor/TTL caching.`
  - `PR #4155: "caches the cursor of the interaction to avoid repeatedly checking the same interaction or mentioned tweets" and fixes duplicate memory creation.`

  **Multiple Choice Answers:**
    a) Yes—promote interaction cursors/idempotency primitives into core runtime utilities.
        *Implication:* Creates reusable correctness patterns across all clients, reducing repeated plugin bugs.
    b) No—keep it plugin-local but establish a mandatory plugin standard (cursor + idempotency checklist).
        *Implication:* Preserves modularity while improving quality, but depends on plugin maintainer compliance.
    c) Hybrid—core provides optional helpers; plugins adopt them as needed.
        *Implication:* Minimizes disruption, but risks uneven reliability across the ecosystem.
    d) Other / More discussion needed / None of the above.
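
To make the "cursor + idempotency" idea in this question concrete, here is a minimal sketch of what such primitives could look like. All names (`InteractionCursor`, `MemoryStore`, `ingest`) are hypothetical and are not taken from the ElizaOS codebase or from PR #4155. The sketch also assumes interaction IDs are numeric strings without leading zeros that increase over time.

```typescript
// Hypothetical shapes for illustration only; not the ElizaOS API.
interface Interaction {
  id: string; // assumed: numeric string, larger means newer
  text: string;
}

// Compares numeric-string IDs without converting to number (IDs may exceed 2^53).
function compareIds(a: string, b: string): number {
  if (a.length !== b.length) return a.length - b.length;
  return a < b ? -1 : a > b ? 1 : 0;
}

// Cursor primitive: remembers the newest ID handled so older items are skipped.
class InteractionCursor {
  private last: string | null = null;

  /** Returns items strictly newer than the cursor, then advances the cursor. */
  advance(batch: Interaction[]): Interaction[] {
    const last = this.last;
    const fresh = batch.filter(
      (item) => last === null || compareIds(item.id, last) > 0
    );
    for (const item of fresh) {
      if (this.last === null || compareIds(item.id, this.last) > 0) {
        this.last = item.id;
      }
    }
    return fresh;
  }
}

// Idempotency primitive: a write keyed by interaction ID can succeed only once.
class MemoryStore {
  private byKey = new Map<string, string>();

  /** Returns true if a memory was created, false if the key already existed. */
  createOnce(key: string, text: string): boolean {
    if (this.byKey.has(key)) return false;
    this.byKey.set(key, text);
    return true;
  }

  get size(): number {
    return this.byKey.size;
  }
}

/** Processes one polling batch and returns how many memories were created. */
function ingest(
  batch: Interaction[],
  cursor: InteractionCursor,
  store: MemoryStore
): number {
  let created = 0;
  for (const item of cursor.advance(batch)) {
    if (store.createOnce(`interaction:${item.id}`, item.text)) created++;
  }
  return created;
}
```

The two pieces address different failure modes: the cursor reduces repeated checks of the same mentions, while the keyed write prevents duplicate memories even when the same item appears twice in one batch. A production version would likely need to persist the cursor and decide how to treat late-arriving older items, which this sketch simply drops.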

---


### 2. Topic: Developer Trust Surface: CLI + Docs Accuracy

**Summary of Topic:** New issues signal that onboarding remains fragile—CLI availability/behavior and command documentation accuracy are now strategic trust bottlenecks, directly impacting our Developer First and Trust Through Shipping principles.

#### Deliberation Items (Questions):

**Question 1:** Do we treat the CLI as a “critical system interface” with release-blocking tests for every documented command path?

  **Context:**
  - `New issue #4143: "Users reported a need to test every command in the CLI documentation for accuracy."`
  - `New issue #4159: "How to run Eliza CLI?" questioning current functionality.`

  **Multiple Choice Answers:**
    a) Yes—every docs command becomes an automated CLI test; docs and CLI ship as one artifact.
        *Implication:* Greatly improves trust and reduces support load, but adds CI complexity and maintenance.
    b) Partially—only Quickstart + top 10 workflows are release-blocking; the rest are best-effort.
        *Implication:* Targets highest-impact DX paths while keeping velocity reasonable.
    c) No—community-driven verification with bounties; core team focuses on features and fixes.
        *Implication:* Moves effort outward, but may slow trust-building if inconsistencies persist.
    d) Other / More discussion needed / None of the above.
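
As a rough illustration of option (a), the first step of turning documented commands into tests is extracting them from the docs. The sketch below assumes the docs are Markdown and that commands live in fenced blocks tagged `bash`, `sh`, or `shell`; the function name `extractDocCommands` is hypothetical, and how the extracted commands are then executed in CI is left open.

```typescript
// Built at runtime so this example contains no literal fence marker.
const FENCE = "`".repeat(3);

/**
 * Collects command lines from shell-tagged fenced blocks in a Markdown string.
 * Skips blank lines and comment lines, and strips a leading "$ " prompt.
 */
function extractDocCommands(markdown: string): string[] {
  const commands: string[] = [];
  let inShellFence = false;
  let inOtherFence = false;

  for (const rawLine of markdown.split("\n")) {
    const line = rawLine.trim();

    if (line.startsWith(FENCE)) {
      if (inShellFence || inOtherFence) {
        // Closing fence of whichever block we were in.
        inShellFence = false;
        inOtherFence = false;
      } else {
        // Opening fence: check the language tag.
        const lang = line.slice(FENCE.length).trim().toLowerCase();
        if (/^(bash|sh|shell)$/.test(lang)) inShellFence = true;
        else inOtherFence = true;
      }
      continue;
    }

    if (!inShellFence || line === "" || line.startsWith("#")) continue;
    commands.push(line.replace(/^\$\s+/, ""));
  }
  return commands;
}
```

Each extracted command could then become one test case, so a docs change and a CLI change fail together rather than drifting apart. Multi-line commands, commands that need credentials, and destructive commands would need an allowlist or annotations that this sketch does not handle.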

**Question 2:** What “single source of truth” should the Council designate for onboarding: docs.eliza.how, the CLI help output, or the starter templates?

  **Context:**
  - ``Discord dev logs (2025-04-01): "V2 is about to be published to the main branch... simplify the startup process to just `npx elizaos start`" (shaw).``
  - `GitHub issue #4159 indicates confusion about CLI usage in practice.`

  **Multiple Choice Answers:**
    a) Docs-first: docs.eliza.how is canonical; CLI and templates must conform to it.
        *Implication:* Optimizes discoverability and learning, but requires strict docs governance.
    b) CLI-first: `elizaos --help` output is canonical; docs are generated from it.
        *Implication:* Reduces drift and makes local UX authoritative; requires disciplined CLI UX design.
    c) Template-first: starter templates encode truth; docs and CLI reference template behavior.
        *Implication:* Maximizes “it just works” onboarding, but risks under-documenting non-template flows.
    d) Other / More discussion needed / None of the above.

**Question 3:** How aggressively should we deprecate or hide workflows that are “technically possible but unreliable” to reduce cognitive load for new builders?

  **Context:**
  - `New issue #4164: "Compatibility concerns... plugins not yet updated for Eliza v2."`
  - `Discord dev logs: "not 100% backwards compatible" due to clients→services architectural change (Ritvik S).`

  **Multiple Choice Answers:**
    a) Aggressive curation: hide/unlist incompatible or flaky plugins and paths until certified.
        *Implication:* Improves first impressions and reduces churn, but may frustrate power users.
    b) Soft warnings: keep everything visible but clearly label compatibility and risk levels.
        *Implication:* Maintains openness while guiding users, but still exposes them to footguns.
    c) Open frontier: no curation; rely on community knowledge and rapid iteration.
        *Implication:* Maximizes experimentation, but undermines “reliable framework” positioning.
    d) Other / More discussion needed / None of the above.

---


### 3. Topic: Client & Plugin Operational Integrity (Twitter/Telegram/Security)

**Summary of Topic:** Operational fixes are landing (Telegram sync improvements, Twitter crash/memory fixes), and a security issue (Farcaster sensitive logging) highlights the need for a stricter plugin security posture as the ecosystem scales.

#### Deliberation Items (Questions):

**Question 1:** Should the Council mandate a baseline security checklist for all official plugins (secrets handling, logging redaction, dependency scanning) before registry inclusion?

  **Context:**
  - `Dev Discord (2025-04-01): "Security fix for Farcaster plugin that was logging sensitive data" (merged PRs mentioned).`
  - `GitHub PR summary: dependency security bumps (dompurify, katex) shipped (PR #4141).`

  **Multiple Choice Answers:**
    a) Yes—enforce a formal security gate (lint rules + secret scanning + review requirements).
        *Implication:* Reduces catastrophic trust failures, but may slow plugin publishing throughput.
    b) Moderate—apply the gate only to plugins in the default install path; community plugins are labeled.
        *Implication:* Protects most users while keeping ecosystem velocity, but creates a two-tier perception.
    c) No—security remains advisory; rely on fast patching and community oversight.
        *Implication:* Keeps shipping fast, but increases risk of high-visibility incidents.
    d) Other / More discussion needed / None of the above.
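
One item on such a checklist, logging redaction, can be sketched as a small helper that plugins call before writing structured data to logs. This is an illustrative assumption, not the fix that was applied to the Farcaster plugin: the function name `redact`, the key pattern, and the field names in the usage note are all hypothetical.

```typescript
// Hypothetical key pattern; a real checklist would define its own list.
const SENSITIVE_KEY =
  /(secret|token|password|api[_-]?key|private[_-]?key|authorization)/i;

/**
 * Returns a deep copy of `value` in which any object property whose name
 * looks sensitive is replaced by "[REDACTED]". The input is not mutated.
 * Note: circular references are not handled in this sketch.
 */
function redact(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map((inner) => redact(inner));
  }
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      out[key] = SENSITIVE_KEY.test(key) ? "[REDACTED]" : redact(source[key]);
    }
    return out;
  }
  return value;
}
```

For example, `redact({ user: "a", apiKey: "k" })` yields an object whose `apiKey` is `"[REDACTED]"` while `user` is unchanged. Key-name matching only catches secrets stored under recognizable names, so a security gate would probably pair it with secret scanning of log output and review rules.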

**Question 2:** Do we unify “client vs plugin” terminology and packaging to eliminate repeated integration confusion (Twitter/Telegram), even if it requires breaking changes?

  **Context:**
  - `Discord coders channel: "Confusion between 'client' and 'plugin' naming conventions was resolved" (historical summary).`
  - `Dev Discord (2025-04-01): "clients have been replaced with plugins + services" (architecture migration discussion).`

  **Multiple Choice Answers:**
    a) Yes—perform a structured rename/migration with codemods and a compatibility layer.
        *Implication:* Improves long-term DX and reduces support overhead, but requires careful transition planning.
    b) Partially—keep naming but publish an authoritative mapping guide and update error messages.
        *Implication:* Low disruption and quick relief, but confusion may recur as the ecosystem grows.
    c) No—accept terminology pluralism; power users will adapt.
        *Implication:* Minimizes engineering effort, but conflicts with “developer-friendly” positioning.
    d) Other / More discussion needed / None of the above.

**Question 3:** How should we measure “operational integrity” of social clients (Twitter/Telegram) to match Execution Excellence—API call efficiency, reply correctness, or uptime under rate limits?

  **Context:**
  - `Issue #4127: repeated checks increased API calls/log spam/system load.`
  - `Discord logs: users report Twitter client issues "getting agents to reply" and Telegram connection issues; fixes and enhancements landed via PRs (#4124, #4125, #4128).`

  **Multiple Choice Answers:**
    a) Primary metric: correctness (reply/mention handling) with regression suites and golden tests.
        *Implication:* Aligns with user expectations and trust; may increase test engineering effort.
    b) Primary metric: efficiency (API calls, cursor use, rate-limit resilience) with telemetry.
        *Implication:* Reduces bans and costs; correctness issues may still frustrate end users.
    c) Balanced scorecard: correctness + efficiency + uptime, weighted by client importance.
        *Implication:* Most aligned with “reliable platform,” but requires instrumentation and clear ownership.
    d) Other / More discussion needed / None of the above.
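
If the Council leans toward option (c), the scorecard could be computed along these lines. This is a sketch under stated assumptions: each metric is already normalized to the range 0 to 1, and the weights and importance values are placeholders the Council would need to choose. The names `integrityScore` and `fleetScore` are hypothetical.

```typescript
// Assumed: every metric is normalized to 0..1 before scoring.
interface ClientMetrics {
  correctness: number; // e.g. share of mentions handled correctly
  efficiency: number;  // e.g. 1 minus the share of redundant API calls
  uptime: number;      // e.g. share of time the client was responsive
}

interface Weights {
  correctness: number;
  efficiency: number;
  uptime: number;
}

/** Weighted average of the three metrics for a single client. */
function integrityScore(m: ClientMetrics, w: Weights): number {
  const total = w.correctness + w.efficiency + w.uptime;
  if (total <= 0) throw new Error("weights must sum to a positive number");
  return (
    (m.correctness * w.correctness +
      m.efficiency * w.efficiency +
      m.uptime * w.uptime) /
    total
  );
}

/** Combines per-client scores, weighted by each client's importance. */
function fleetScore(
  clients: { metrics: ClientMetrics; importance: number }[],
  w: Weights
): number {
  let weighted = 0;
  let totalImportance = 0;
  for (const client of clients) {
    weighted += integrityScore(client.metrics, w) * client.importance;
    totalImportance += client.importance;
  }
  if (totalImportance <= 0) throw new Error("importance must sum to a positive number");
  return weighted / totalImportance;
}
```

Options (a) and (b) correspond to putting all weight on `correctness` or on `efficiency` respectively, so the same function can express all three choices. The harder work is the instrumentation needed to produce the inputs, which this sketch does not address.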