# Council Briefing: 2025-03-23

## Monthly Goal

December 2025: Execution excellence—complete token migration with high success rate, launch ElizaOS Cloud, stabilize flagship agents, and build developer trust through reliability and clear documentation.

## Daily Focus

- V2-era composability advanced with RedPill model access and Groq integration maintenance, but developer trust remains gated by unresolved “last-mile” reliability issues (Twitter media, plugin install friction, and inconsistent runtime behavior across clients).

## Key Points for Deliberation

### 1. Topic: Model Provider Expansion & Composability (RedPill + Groq)

**Summary of Topic:** We shipped new provider surface area (RedPill) and stabilized Groq integration, strengthening the “open & composable” thesis—yet each new provider increases testing burden and multiplies failure modes during the V2 launch window.

#### Deliberation Items (Questions):

**Question 1:** Do we prioritize rapid provider expansion (breadth) or enforce a compatibility gate (depth) to protect reliability as we approach launch and Cloud rollout?

  **Context:**
  - `GitHub Daily 2025-03-23: "Added support for RedPill to access additional models" (PR #4045).`
  - `GitHub Daily 2025-03-23: "Rebased changes related to the groq plugin" (PR #4044).`

  **Multiple Choice Answers:**
    a) Breadth-first: keep adding providers quickly to win ecosystem mindshare.
        *Implication:* Faster narrative momentum and builder attraction, but higher support load and higher risk of launch instability.
    b) Depth-first: freeze new providers until a standardized conformance suite passes for each provider.
        *Implication:* Improves trust-through-shipping and Cloud readiness, but may slow community excitement and partner integrations.
    c) Hybrid: allow new providers behind an “experimental” flag with clear SLAs and separate docs.
        *Implication:* Preserves innovation while protecting the default path, but requires disciplined labeling, docs, and telemetry.
    d) Other / More discussion needed / None of the above.

**Question 2:** What is our canonical definition of “LLM compatibility” in V2 (tool-calling strictness, JSON schema adherence, streaming, and memory actions), and where should it be enforced?

  **Context:**
  - `Discord #discussion 2025-03-22: winded4752 asked: "Can we use any LLM model with eliza os or does it have to be trained to tool call?" (unanswered).`
  - `Discord #spartan_holders 2025-03-22: Odilitime: "the LLM model isn't changing, just how it experiences the world".`

  **Multiple Choice Answers:**
    a) Enforce in core: a strict runtime contract (schemas + validators) that all providers must satisfy.
        *Implication:* Maximizes reliability and predictable behavior, but may exclude some models or require adapters.
    b) Enforce in plugins: provider-specific adapters handle quirks; core stays minimal.
        *Implication:* Speeds provider onboarding, but risks fragmented behavior and inconsistent DX across plugins.
    c) Enforce in Cloud: Cloud becomes the “compatibility layer” while self-hosted remains best-effort.
        *Implication:* Strengthens Cloud value proposition, but could dilute open-source trust if self-hosted feels second-class.
    d) Other / More discussion needed / None of the above.

**Question 3:** Should the Council mandate a single streaming standard (SSE/token streaming semantics) across providers now, or defer until after launchpad/Cloud stabilization?

  **Context:**
  - `Discord 💻-coders 2025-03-22: scooper asked about streaming responses like ChatGPT/Claude (unanswered).`
  - `GitHub Issue 2025-03-23: "client-twitter functionality... enable the Eliza agent to post images along with tweets" (Issue #4050).`

  **Multiple Choice Answers:**
    a) Mandate now: standardize streaming semantics across all supported providers immediately.
        *Implication:* Improves UX consistency and developer expectations, but increases short-term engineering scope before launch.
    b) Defer: ship launch without full streaming uniformity; stabilize core first.
        *Implication:* Reduces launch risk, but leaves a visible UX gap that users explicitly request.
    c) Incremental: standardize streaming for the top 1–2 providers first; publish an interface for others.
        *Implication:* Balances speed and consistency, but requires careful comms to avoid “why doesn’t my provider stream?” churn.
    d) Other / More discussion needed / None of the above.

---


### 2. Topic: V2 Launch Readiness: Installation Friction, Plugin Reliability, and Cross-Client Consistency

**Summary of Topic:** User-reported failures cluster around plugin installation (notably plugin-sql), Docker character loading, and divergent behavior across Discord vs Twitter—these are trust killers that directly oppose Execution Excellence and the December directive’s “reliability + clear docs” posture.

#### Deliberation Items (Questions):

**Question 1:** What should be treated as “launch-blocking” for V2: install failures (plugin-sql), character persistence inconsistencies, or client integration regressions (Twitter/Discord voice/media)?

  **Context:**
  - `Discord 💻-coders 2025-03-22: "Multiple users reported issues with plugin loading failures, particularly with plugin-sql" (beta.7).`
  - `Discord 💻-coders 2025-03-22: "Agent kept talking like a pirate in Discord while behaving correctly on Twitter" (bobo bixby / jin).`

  **Multiple Choice Answers:**
    a) Installability is the launch gate: any common “npm create / cli start” failure is blocking.
        *Implication:* Maximizes developer-first trust, but may delay launch to chase edge cases.
    b) Behavioral correctness is the launch gate: cross-client personality/memory consistency is blocking.
        *Implication:* Protects flagship agent perception and governance credibility, but may ship with rough setup edges.
    c) Client integrations are the launch gate: Twitter/Discord core flows must be stable; other issues are post-launch.
        *Implication:* Optimizes for visible demos and growth channels, but risks alienating builders who can’t run locally.
    d) Other / More discussion needed / None of the above.

**Question 2:** How do we reduce “tribal knowledge” support load: invest in automated diagnostics (doctor command + log capture) or in curated docs/tutorials (Windows, Docker, multi-account Twitter)?

  **Context:**
  - `Discord 💻-coders 2025-03-22: workaround for plugin-sql error: "rm -rf node_modules && rm package-lock.json && bun install" (Pedro).`
  - `Discord 2025-03-22 Action Items: "Create guide for setting up ElizaOS on Windows" and "Fix Docker configuration to properly load custom characters in Web UI".`

  **Multiple Choice Answers:**
    a) Diagnostics-first: ship a CLI “doctor” that detects Node/Bun versions, missing modules, DB URLs, and plugin registry issues.
        *Implication:* Scales support and improves reliability narratives, but requires engineering effort and careful UX design.
    b) Docs-first: produce platform-specific guides and short tutorial videos as the primary lever.
        *Implication:* Fastest to deploy and aligns with developer-first, but risks staleness and still leaves hard-to-debug failures.
    c) Dual track: minimal “doctor” for top failures + docs that link to exact remediation paths.
        *Implication:* Balances speed and scalability, but requires coordination to keep docs and diagnostics in lockstep.
    d) Other / More discussion needed / None of the above.

**Question 3:** Do we formalize a “character contract” that guarantees consistent personality/memory across clients (Discord/Twitter/Web), even if it restricts flexibility?

  **Context:**
  - `Discord 💻-coders 2025-03-22: character fix: add "never uses emoji" / "hates pirate talk" to character file (community report).`
  - `Discord 💻-coders 2025-03-22: reports of personality persistence despite configuration changes (multiple users).`

  **Multiple Choice Answers:**
    a) Yes: define a strict character schema + deterministic prompt assembly; clients cannot override without explicit flags.
        *Implication:* Improves predictability and flagship stability, but may constrain advanced customization and experimentation.
    b) No: keep characters flexible; document best practices and accept some cross-client variance.
        *Implication:* Preserves creativity and speed, but risks reputational damage when agents behave unpredictably in public channels.
    c) Tiered: “Certified Characters” for flagship/Cloud templates vs “Experimental Characters” for local development.
        *Implication:* Creates a trust ladder for users while retaining openness, but adds product complexity and labeling overhead.
    d) Other / More discussion needed / None of the above.

---


### 3. Topic: Ecosystem Trust & Distribution: Twitter Media, Automated News Ops, and Token Surface Area

**Summary of Topic:** Operational momentum is converging on outward-facing channels (X automation, image tweets), while parallel token discourse and scam token alerts highlight that growth without trust controls can backfire; we need a coherent security-and-comms posture that matches “trust through shipping.”

#### Deliberation Items (Questions):

**Question 1:** Should we treat X/Twitter as a core product surface (first-class reliability + media support) or as a best-effort plugin while Cloud and framework stabilize?

  **Context:**
  - `GitHub 2025-03-23: new issue: "client-twitter functionality... enable the Eliza agent to post images" (Issue #4050).`
  - `Discord dao-organization 2025-03-22: Hubert building automated AI16Z news posting system using JSON + Puppeteer + X API.`

  **Multiple Choice Answers:**
    a) Core surface: prioritize Twitter media/reply reliability as a top-tier KPI before launch marketing.
        *Implication:* Strengthens distribution and flagship credibility, but may pull resources from foundational stability work.
    b) Best-effort: keep Twitter as a plugin with community support; focus internal effort on Cloud + core runtime.
        *Implication:* Protects engineering focus, but risks undermining marketing efficacy if the most visible channel is flaky.
    c) Phased: stabilize minimal posting + text replies now; schedule media/threads after launch with clear roadmap.
        *Implication:* Delivers dependable baseline distribution while managing expectations and controlling scope.
    d) Other / More discussion needed / None of the above.

**Question 2:** What governance and comms protocol do we adopt for token-related risk (scam tokens, partner allocations) to avoid fragmentation during launch hype?

  **Context:**
  - `Discord 🥇-partners 2025-03-22: "A fraudulent ElizaOS token on BSC was identified" (Zolo).`
  - `Discord 🥇-partners 2025-03-22: debate on partner allocations vs requiring investment; proposal for "stake-weight-reputation system" (yikesawjeez).`

  **Multiple Choice Answers:**
    a) Central authority protocol: an official verification page + rapid takedown playbook + single source of truth for contract addresses.
        *Implication:* Reduces scam impact and confusion, but requires disciplined updates and may feel less decentralized to some.
    b) Community-led protocol: delegate verification and alerts to a partner council with on-chain attestations.
        *Implication:* Aligns with decentralized ethos, but increases coordination complexity and may slow response time.
    c) Hybrid protocol: official canonical registry + community watch network feeding alerts into it.
        *Implication:* Balances speed and decentralization, but needs clear operational ownership and escalation rules.
    d) Other / More discussion needed / None of the above.

**Question 3:** Do we operationalize “Taming Information” into a formal pipeline now (JSON→editorial pass→auto-post→archive), or keep it ad-hoc until V2 stabilizes?

  **Context:**
  - `Discord dao-organization 2025-03-22: Jin: "JSON updates daily at midnight UTC" and suggested a filter/editing pass using HackMD API.`
  - `Discord 2025-03-22 Action Items: "Implement daily X post system" and "Add filter/editing pass before publishing".`

  **Multiple Choice Answers:**
    a) Formalize now: treat information ops as production infrastructure with owners, SLAs, and audit trails.
        *Implication:* Improves consistency and trust, and accelerates developer education, but adds process overhead during a busy launch cycle.
    b) Keep ad-hoc: allow experimentation; avoid locking into tooling/process prematurely.
        *Implication:* Maximizes flexibility, but risks fragmented messaging and missed opportunities to build developer trust.
    c) Minimum viable pipeline: automate collection + archival now, add editorial governance when V2 reaches stability thresholds.
        *Implication:* Creates compounding documentation value with low risk, while deferring heavier governance until readiness is higher.
    d) Other / More discussion needed / None of the above.