# Council Briefing: 2025-01-17

## Monthly Goal

December 2025: Execution excellence—complete token migration with high success rate, launch ElizaOS Cloud, stabilize flagship agents, and build developer trust through reliability and clear documentation.

## Daily Focus

- A high-velocity shipping cycle is strengthening cross-platform stability (tests, Windows support, critical startup fixes), but rising operational issues (Mac client connectivity, Telegram deployment behavior, Twitter auth/parsing) threaten the reliability narrative unless triaged as “fleet blockers.”

## Key Points for Deliberation

### 1. Topic: Reliability Front: Cross-Platform Stability & Test Coverage

**Summary of Topic:** Engineering throughput remains exceptional (dozens of PRs/day and expanding tests), yet the issue stream highlights fragility on macOS and in database adapters—directly challenging our Execution Excellence mandate and developer trust.

#### Deliberation Items (Questions):

**Question 1:** Which problems become “Fleet Blockers” that pause feature intake until resolved (to protect the reliability story)?

  **Context:**
  - `GitHub issues cited in holo-logs: macOS client startup/connectivity failures (#2360, #2471) and ARM64 Docker tokenizer module error (#2432).`
  - `Daily update: new requests for tests on Redis/SQLite/Supabase adapters after structure changes (#2469, #2467).`

  **Multiple Choice Answers:**
    a) Treat macOS client connectivity loops and ARM64 Docker breakages as immediate Fleet Blockers; freeze new features until fixed.
        *Implication:* Maximizes developer trust and reduces support load, but temporarily slows ecosystem expansion.
    b) Prioritize database adapter correctness/testing (Redis + SQLite/Supabase) as Fleet Blockers; macOS issues handled as best-effort patches.
        *Implication:* Protects core persistence guarantees, but risks reputational damage among Mac-heavy builders.
    c) No hard freeze; run parallel workstreams with a rotating strike team that closes top reliability issues weekly.
        *Implication:* Maintains momentum, but risks recurring instability if triage discipline degrades.
    d) Other / More discussion needed / None of the above.

**Question 2:** Do we enforce a minimum test/CI standard for new clients/plugins before merge to prevent regressions at the current contribution volume?

  **Context:**
  - `Daily report: new tests added for GitHub client (#2407), Slack client (#2404), Instagram client (#2454), plugin-solana (#2345).`
  - `Repo activity: 46 PRs/33 merged (Jan 16-17) and 45 PRs/37 merged (Jan 17-18), indicating high merge throughput.`
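
  If the Council moves toward a baseline gate, the per-contribution cost can stay small. The sketch below is a minimal smoke test, assuming a Vitest runner and a plugin export with `name` and `actions` fields; the plugin name and exact interface are illustrative, not the repo's current requirements.

  ```typescript
  // Hypothetical smoke test for a newly contributed plugin.
  // Assumes the package exports an object with `name` and `actions`;
  // adjust the import path and field names to the real plugin interface.
  import { describe, expect, it } from "vitest";
  import { myExamplePlugin } from "../src/index";

  describe("plugin smoke test", () => {
    it("exports a name and at least one action", () => {
      expect(myExamplePlugin.name).toBeTruthy();
      expect(Array.isArray(myExamplePlugin.actions)).toBe(true);
      expect(myExamplePlugin.actions.length).toBeGreaterThan(0);
    });

    it("every action declares a name and a handler", () => {
      for (const action of myExamplePlugin.actions) {
        expect(typeof action.name).toBe("string");
        expect(typeof action.handler).toBe("function");
      }
    });
  });
  ```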

  **Multiple Choice Answers:**
    a) Yes—require a baseline test harness + smoke tests for every new client/plugin before merge.
        *Implication:* Raises merge friction now but stabilizes long-term reliability under massive contributor scale.
    b) Partial—require tests only for core runtime, DB adapters, and flagship clients; allow experimental plugins with looser gates.
        *Implication:* Balances innovation with stability, but may create a “two-tier” quality perception.
    c) No—optimize for speed; rely on rapid rollback and community bug reports.
        *Implication:* Maximizes shipping velocity, but undermines the ‘most reliable’ positioning and increases operator pain.
    d) Other / More discussion needed / None of the above.

**Question 3:** What is the Council’s preferred stabilization cadence for releases while issue volume is rising?

  **Context:**
  - `Holo-log monthly stats (Jan): 1039 new PRs (735 merged), 401 new issues, 694 active contributors—high change rate.`
  - `Daily update includes multiple bug fixes and new compatibility issues appearing simultaneously.`

  **Multiple Choice Answers:**
    a) Adopt a strict release train: scheduled cutoffs + stabilization week with bugfix-only merges.
        *Implication:* Improves predictability and quality, but may frustrate fast-moving contributors.
    b) Continue continuous delivery, but introduce a “stability branch” for Cloud/flagship users.
        *Implication:* Preserves momentum while protecting production users, at the cost of branch management overhead.
    c) Move to fewer, larger releases tied to Cloud milestones to reduce churn.
        *Implication:* Reduces operational noise but delays user-visible improvements and community feedback loops.
    d) Other / More discussion needed / None of the above.

---


### 2. Topic: Flagship Surface Area: Twitter/Telegram Client Reliability & Anti-Spam Controls

**Summary of Topic:** Social clients are a primary adoption funnel and public credibility layer, but authentication failures, reply formatting bugs, and deployment incompatibilities are actively impairing agent uptime and perceived competence.

#### Deliberation Items (Questions):

**Question 1:** What is our “minimum viable reliability bar” for social clients (Twitter/Telegram) before we position them as flagship-ready?

  **Context:**
  - `Issues cited: Twitter auth failures on AWS EC2 (Error 399, #2372) and unexpected JSON metadata in bot replies (#2423).`
  - `Daily update: Telegram client polling may conflict with cloud/blue-green deployments (#2466).`
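
  As an illustration of what fixing the reply-formatting issue could look like operationally, a defensive sanitizer could strip an accidental JSON envelope before anything is posted. This is a minimal sketch; the `text` field and the wrapper shape are assumptions, not the confirmed structure of the #2423 payloads.

  ```typescript
  // Hypothetical guard against leaking structured metadata into a public reply.
  // Assumes the generator occasionally returns a JSON object whose `text`
  // field holds the intended reply; that shape is illustrative only.
  export function sanitizeReply(raw: string): string {
    const trimmed = raw.trim();
    if (trimmed.startsWith("{") && trimmed.endsWith("}")) {
      try {
        const parsed = JSON.parse(trimmed) as { text?: unknown };
        if (typeof parsed.text === "string" && parsed.text.length > 0) {
          return parsed.text.trim();
        }
      } catch {
        // Not valid JSON after all; fall through and post the raw text.
      }
    }
    return trimmed;
  }
  ```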

  **Multiple Choice Answers:**
    a) Flagship-ready only when auth is robust across common hosts, replies are clean, and deployment mode is Cloud-compatible.
        *Implication:* Stronger trust-through-shipping, but delays marketing/visibility for agent showcases.
    b) Flagship-ready if core posting works; publish known-issues + recommended hosting recipes (VPN/login steps, rate limits).
        *Implication:* Ships faster while reducing surprises, but may normalize fragile behavior as “expected.”
    c) Flagship readiness is per-agent, not per-client; allow DegenSpartan/AIXVC to proceed with guardrails even if clients are imperfect.
        *Implication:* Maintains narrative momentum, but risks public failures being attributed to ElizaOS itself.
    d) Other / More discussion needed / None of the above.

**Question 2:** How should we reduce social-client harm (spam, scams, loops) while keeping autonomy high?

  **Context:**
  - `Discord holo-log: “Fix Twitter client to prevent responding to scam replies” was mentioned, alongside rate-limiting and mention-handling challenges across channels.`
  - `Discord Q&A: control posting frequency via env vars (ENABLE_ACTION_PROCESSING=false; POST_INTERVAL_MIN/MAX).`
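
  Those controls can also be read defensively at startup rather than trusted as raw strings. A minimal sketch follows; only the variable names come from the Q&A, while the fallback values and the min/max clamp are illustrative assumptions.

  ```typescript
  // Reads the posting-frequency controls quoted from the Discord Q&A.
  // Fallback values and the min <= max clamp are illustrative assumptions.
  export function readPostingConfig(env: NodeJS.ProcessEnv = process.env) {
    const actionProcessing =
      (env.ENABLE_ACTION_PROCESSING ?? "false").toLowerCase() === "true";
    const min = Number(env.POST_INTERVAL_MIN ?? 90); // minutes
    const max = Number(env.POST_INTERVAL_MAX ?? 180); // minutes
    const postIntervalMin = Number.isFinite(min) ? min : 90;
    const postIntervalMax = Math.max(Number.isFinite(max) ? max : 180, postIntervalMin);
    return { actionProcessing, postIntervalMin, postIntervalMax };
  }
  ```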

  **Multiple Choice Answers:**
    a) Default-safe autonomy: conservative rate limits, target user allowlists, and scam-reply filters enabled by default.
        *Implication:* Reduces platform bans and reputational damage, but limits viral growth and responsiveness.
    b) Operator-driven autonomy: ship tooling and docs, but keep defaults permissive; builders assume responsibility.
        *Implication:* Maximizes flexibility, but increases support burden and inconsistent user experiences.
    c) Introduce an optional “approval workflow” mode for high-stakes accounts (human-in-the-loop for posts).
        *Implication:* Protects key brands/agents while preserving autonomy elsewhere, at the cost of extra UX complexity.
    d) Other / More discussion needed / None of the above.

**Question 3:** Do we standardize a first-party uptime/ops pattern (watchdog, cron, health checks) as part of the core framework or keep it external?

  **Context:**
  - `Discord action item: “Implement cron job to monitor agent uptime” (Cipher); suggestions to auto-restart agents.`
  - `Community reports: running multiple agents and keeping them alive post-logout remains confusing/unanswered in some channels.`
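
  The cron action item can be met with a very small watchdog loop. In the sketch below, the `/health` endpoint, the port, and the `pm2 restart eliza-agent` command are all assumptions about how an operator runs the agent, not shipped defaults.

  ```typescript
  // Minimal watchdog sketch: poll a health endpoint and restart the agent
  // process when it stops answering. Endpoint, port, and the pm2 process
  // name are illustrative assumptions.
  import { exec } from "node:child_process";

  const HEALTH_URL = process.env.AGENT_HEALTH_URL ?? "http://localhost:3000/health";
  const CHECK_INTERVAL_MS = 60_000;

  async function checkOnce(): Promise<void> {
    try {
      const res = await fetch(HEALTH_URL, { signal: AbortSignal.timeout(5_000) });
      if (res.ok) return; // agent is healthy, nothing to do
      console.warn(`health check returned ${res.status}; restarting agent`);
    } catch (err) {
      console.warn(`health check failed (${String(err)}); restarting agent`);
    }
    exec("pm2 restart eliza-agent", (error) => {
      if (error) console.error("restart failed:", error.message);
    });
  }

  setInterval(checkOnce, CHECK_INTERVAL_MS);
  ```

  Whether a loop like this ships in core, in ElizaOS Cloud, or only as a published recipe is precisely the choice below.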

  **Multiple Choice Answers:**
    a) Build first-party ops primitives (health endpoints + supervised restart) into ElizaOS Cloud and recommended self-host templates.
        *Implication:* Improves reliability and DX; increases scope and maintenance responsibility.
    b) Publish official recipes (systemd, pm2, docker compose) and keep ops outside core.
        *Implication:* Faster and simpler, but produces fragmented operator experience across environments.
    c) Make ops a plugin/adapter layer (community-maintained) with optional installation via registry.
        *Implication:* Aligns with composability while avoiding core bloat, but quality may vary.
    d) Other / More discussion needed / None of the above.

---


### 3. Topic: Composable Future: Plugin Registry, Modularization, and V2 Secrecy vs Trust

**Summary of Topic:** Signals point toward a modular future (plugin registry, moving plugins out of core, dynamic plugin loading), while V2 is incubating privately—creating a strategic tension between rapid architectural evolution and community trust/coordination.

#### Deliberation Items (Questions):

**Question 1:** How aggressively should we move plugins out of core to reduce maintenance pain while keeping the developer experience seamless?

  **Context:**
  - `Completed items: “A new plugin registry has been created… move plugins out of core and add dynamic plugin loading” (Shaw, via X update in holo-logs).`
  - `Daily report shows rapid growth in features/integrations across many plugins, increasing surface area.`
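
  To ground “dynamic plugin loading,” the mechanism can be as simple as a runtime import keyed off the registry name. This sketch assumes a package naming convention and a `Plugin`-like default export; neither is the registry's confirmed contract.

  ```typescript
  // Hypothetical dynamic loader: resolve a registry plugin by package name at
  // runtime instead of bundling it into core. The naming convention and the
  // expected default-export shape are illustrative assumptions.
  interface LoadedPlugin {
    name: string;
    actions?: unknown[];
  }

  export async function loadPlugin(pluginName: string): Promise<LoadedPlugin> {
    // e.g. "solana" -> "@elizaos-plugins/plugin-solana" (assumed convention)
    const pkg = pluginName.startsWith("@")
      ? pluginName
      : `@elizaos-plugins/plugin-${pluginName}`;
    const mod = await import(pkg);
    const plugin = (mod.default ?? mod) as LoadedPlugin;
    if (!plugin || typeof plugin.name !== "string") {
      throw new Error(`Package ${pkg} does not export a recognizable plugin`);
    }
    return plugin;
  }
  ```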

  **Multiple Choice Answers:**
    a) Fast migration: deprecate core-bundled plugins quickly; make registry + dynamic loading the default path.
        *Implication:* Reduces core bloat and accelerates composability, but risks breaking changes and onboarding confusion.
    b) Hybrid: keep a curated “core set” (flagship-quality) and move everything else to registry with clear tiering.
        *Implication:* Preserves a stable DX baseline while enabling ecosystem growth, but requires governance and curation effort.
    c) Slow migration: prioritize stability; only extract plugins once APIs and docs are mature.
        *Implication:* Minimizes churn, but prolongs maintenance load and slows scaling to many platforms/chains.
    d) Other / More discussion needed / None of the above.

**Question 2:** What level of transparency should we maintain around V2 while it remains in a private repository?

  **Context:**
  - `Completed items: “ElizaOS v2 is currently in a private repository with limited access… finalizing details before merging back.” (Shaw, via X update in holo-logs).`
  - `Discord logs show builders asking “Where is V2 being developed?” with unanswered questions in the coders channel.`

  **Multiple Choice Answers:**
    a) Publish a public V2 roadmap + API intent notes now, even if code stays private temporarily.
        *Implication:* Improves alignment and reduces rumor load, while protecting unfinished implementation details.
    b) Selective access program: grant V2 repo access to high-signal contributors under guidelines, keep broader details limited.
        *Implication:* Accelerates development with trusted builders, but may create perceived gatekeeping.
    c) Keep V2 mostly opaque until a merge-ready milestone to avoid thrash and external pressure.
        *Implication:* Reduces coordination overhead, but increases community uncertainty and speculative narratives.
    d) Other / More discussion needed / None of the above.

**Question 3:** How do we prevent knowledge/embedding confusion from becoming a chronic DX tax as we scale multi-provider support?

  **Context:**
  - `Discord coders: “dimension mismatches when trying to use different embedding models” and recurring RAG/knowledge management confusion (multiple users).`
  - `Suggested workaround: “use OpenAI for embedding since it uses 1536 dimensions” (Titan | Livepeer-Eliza.com).`
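
  The mismatch above is exactly the failure a strict compatibility check would surface before anything is written. A minimal sketch, assuming the caller can obtain the dimension of vectors already stored by whichever adapter is in use:

  ```typescript
  // Hypothetical guard: fail fast, with a clear message, when the configured
  // embedding model's dimension differs from what the database already holds.
  export function assertEmbeddingCompatible(
    configuredDim: number, // e.g. 1536 for OpenAI embeddings, per the workaround above
    storedDim: number | null // dimension of vectors already persisted, if any
  ): void {
    if (storedDim !== null && storedDim !== configuredDim) {
      throw new Error(
        `Embedding dimension mismatch: the database holds ${storedDim}-dimensional ` +
          `vectors but the configured model produces ${configuredDim}-dimensional ones. ` +
          `Re-embed the knowledge base or switch back to the original embedding model.`
      );
    }
  }
  ```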

  **Multiple Choice Answers:**
    a) Enforce strict embedding compatibility checks with clear errors and an automatic migration/reset flow.
        *Implication:* Reduces silent failures and support time, but may require opinionated constraints.
    b) Standardize on a default embedding dimension/provider for ‘happy path’ and document advanced overrides.
        *Implication:* Improves onboarding and reliability; advanced users still can customize with informed tradeoffs.
    c) Keep flexibility; focus on documentation and community recipes rather than enforcing constraints.
        *Implication:* Maximizes configurability but risks ongoing friction and perceived instability for new developers.
    d) Other / More discussion needed / None of the above.