# Council Briefing: 2025-03-02

## Monthly Goal

December 2025: Execution excellence—complete token migration with high success rate, launch ElizaOS Cloud, stabilize flagship agents, and build developer trust through reliability and clear documentation.

## Daily Focus

- Core platform hardening accelerated via agent/character consolidation, CLI and API stabilization, and Cloud auth fixes, while RAG ingestion exposed a new scalability fault line: large-file embedding.

## Key Points for Deliberation

### 1. Topic: Release Hardening: Agent/Character Merge + API/CLI Stability

**Summary of Topic:** Engineering throughput remains strong, with key stability work merged (server startup/API fixes, agent endpoint updates, CLI dependency handling) and structural simplification (agent+character merge) improving long-term maintainability—at the cost of short-term integration friction for developers.

#### Deliberation Items (Questions):

**Question 1:** Do we treat the agent+character merge as a freeze point for stabilization (optimize reliability), or continue rapid structural refactors (optimize future velocity)?

  **Context:**
  - `GitHub Daily (2025-03-02): "Merged agent and character functionalities to streamline operations" (PR #3742).`
  - `GitHub Daily (2025-03-02): "Resolved issues with API and server startup" (PR #3743).`

  **Multiple Choice Answers:**
    a) Freeze core structures for one release cycle; prioritize regression fixes, docs, and migration tooling.
        *Implication:* Maximizes reliability and developer trust, but may slow delivery of deeper architectural corrections.
    b) Continue refactors, but gate them behind feature flags and strict CI/regression suites (a minimal gating sketch follows this list).
        *Implication:* Preserves momentum while containing blast radius, but requires disciplined engineering process and test investment.
    c) Push forward aggressively; accept churn as the price of reaching the next architecture quickly.
        *Implication:* May accelerate long-term platform power, but risks eroding DX and community confidence during a critical trust-building phase.
    d) Other / More discussion needed / None of the above.
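
A minimal sketch of what option (b) could look like in practice, assuming nothing about ElizaOS internals: the refactored code path ships behind an environment flag, the legacy path stays the default, and CI runs the regression suite with the flag both on and off. All names below are illustrative, not actual ElizaOS APIs.

```typescript
// Illustrative only: the flag name, loader functions, and AgentDefinition shape are assumptions.
interface AgentDefinition {
  name: string;
  bio: string[];
}

// Refactored path, e.g. a loader that reads the merged agent+character shape.
function loadMergedAgent(raw: { name?: string; bio?: string[] }): AgentDefinition {
  return { name: raw.name ?? "agent", bio: raw.bio ?? [] };
}

// Legacy path, kept intact and used by default until the flag is flipped.
function loadLegacyAgent(raw: { name?: string; bio?: string[] }): AgentDefinition {
  return { name: raw.name ?? "agent", bio: raw.bio ?? [] };
}

// Read once at startup; CI exercises the suite with the flag on and off and diffs results.
const useMergedLoader = process.env.ELIZA_USE_MERGED_LOADER === "true";

export function loadAgent(raw: { name?: string; bio?: string[] }): AgentDefinition {
  return useMergedLoader ? loadMergedAgent(raw) : loadLegacyAgent(raw);
}
```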

**Question 2:** What is the Council’s minimum definition of “release-ready” for v0.25.9-style transitions (plugin/client architecture shifts) to protect Developer First and Trust Through Shipping?

  **Context:**
  - `Discord (2025-03-01, coders): "clients now need to be added as plugins" (v0.25.8 changes).`
  - `Discord (2025-03-01, coders): repeated questions on CLI/plugin install and client integration.`

  **Multiple Choice Answers:**
    a) Release-ready = migration guide + one-click template + passing e2e tests for top clients (Discord/Twitter); a smoke-test sketch follows this list.
        *Implication:* Creates consistent onboarding outcomes and reduces support load, but delays releases until docs/testing are complete.
    b) Release-ready = core passes CI and smoke tests; docs can follow within 72 hours.
        *Implication:* Maintains shipping cadence, but risks community confusion and repeated support escalations.
    c) Release-ready = stable API contracts only; treat templates/docs as community-driven add-ons.
        *Implication:* Maximizes core velocity but shifts burden to builders, undermining the project’s “developer-friendly” mandate.
    d) Other / More discussion needed / None of the above.
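
For options (a) and (b), the minimum gating artifact can be a smoke test that boots the agent and waits for a healthy response before a release is tagged. The start command, port, and `/health` path below are assumptions for illustration, not documented ElizaOS behavior.

```typescript
// Minimal release smoke test (sketch; start command, port, and health path are assumptions).
import { spawn } from "node:child_process";
import { setTimeout as sleep } from "node:timers/promises";

async function smokeTest(): Promise<void> {
  const server = spawn("npm", ["start"], { stdio: "inherit", shell: true });
  try {
    // Poll the assumed health endpoint for up to 60 seconds.
    for (let attempt = 0; attempt < 30; attempt++) {
      try {
        const res = await fetch("http://localhost:3000/health");
        if (res.ok) {
          console.log("smoke test passed");
          return;
        }
      } catch {
        // Server not up yet; keep polling.
      }
      await sleep(2000);
    }
    throw new Error("server did not become healthy within 60s");
  } finally {
    server.kill("SIGTERM");
  }
}

smokeTest().catch((err) => {
  console.error(err);
  process.exit(1);
});
```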

**Question 3:** Should the CLI become the primary reliability surface (opinionated golden path), or remain a thin wrapper over a flexible but complex system?

  **Context:**
  - `GitHub Daily (2025-03-02): "Fixed CLI handling of plugin dependencies" (PR #3737).`
  - `Discord (2025-03-01, coders): CLI install/use confusion and repeated setup questions (odilitime, pinecone_magg).`

  **Multiple Choice Answers:**
    a) Make CLI the golden path with strict validation, guided flows, and auto-fixes (a preflight-validation sketch follows this list).
        *Implication:* Improves DX and reduces support burden, but constrains advanced workflows unless escape hatches are designed.
    b) Keep CLI thin; invest in docs and templates instead of increasing CLI complexity.
        *Implication:* Avoids maintaining a heavy CLI, but leaves more failure modes exposed during onboarding.
    c) Split: a “simple mode” CLI for most users plus an “expert mode” for power builders.
        *Implication:* Balances DX and flexibility, but increases product surface area and testing requirements.
    d) Other / More discussion needed / None of the above.
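
As one concrete piece of option (a)'s golden path, and in the spirit of PR #3737, a preflight check can verify that every plugin a character file references is actually installed before the agent starts. The `plugins` array and file path below are assumptions about the schema, used only for illustration.

```typescript
// Illustrative CLI preflight: confirm referenced plugins resolve before startup.
// Assumes a character JSON with a `plugins: string[]` field; adjust to the real schema.
import { readFileSync } from "node:fs";
import { createRequire } from "node:module";

const resolver = createRequire(import.meta.url);

function findMissingPlugins(characterPath: string): string[] {
  const character = JSON.parse(readFileSync(characterPath, "utf8"));
  const plugins: string[] = Array.isArray(character.plugins) ? character.plugins : [];
  return plugins.filter((name) => {
    try {
      resolver.resolve(name); // Throws if the package is not installed.
      return false;
    } catch {
      return true;
    }
  });
}

const missing = findMissingPlugins(process.argv[2] ?? "./character.json");
if (missing.length > 0) {
  console.error(`Missing plugins: ${missing.join(", ")}. Install them before starting the agent.`);
  process.exit(1);
}
console.log("All referenced plugins resolve.");
```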

---


### 2. Topic: Knowledge & Memory at Scale: RAG Large-File Failures and Heap Pressure

**Summary of Topic:** Field signals and a new GitHub issue confirm that knowledge ingestion is still fragile: users hit Node heap limits, PDFs are unreliable, and the RAG pipeline can attempt to embed entire files in one pass, threatening reliability as builders scale beyond toy corpora.

#### Deliberation Items (Questions):

**Question 1:** What is our official stance on knowledge ingestion limits in the near term: enforce strict constraints now, or invest immediately in chunking/streaming pipelines to meet real-world document sizes?

  **Context:**
  - `GitHub Issue (2025-03-02): "RAG processFile attempts to embed entire files causing errors for large documents" (Issue #3745).`
  - `Discord (2025-03-01, coders): memory OOM workaround: NODE_OPTIONS="--max-old-space-size=6144" (CARSON.ts).`

  **Multiple Choice Answers:**
    a) Enforce strict limits (file size, token counts) with clear error messages and docs until a robust pipeline ships.
        *Implication:* Improves predictability and reduces crash reports, but caps adoption for serious RAG use cases.
    b) Prioritize immediate pipeline fixes (chunking + incremental embedding + backpressure) as a top reliability initiative (see the chunking sketch after this list).
        *Implication:* Unlocks real workloads and strengthens trust, but diverts bandwidth from other launches and features.
    c) Hybrid: ship limits now plus an experimental “large-doc mode” behind a flag for early adopters.
        *Implication:* Reduces user pain quickly while iterating in production-like environments, but increases complexity and support variance.
    d) Other / More discussion needed / None of the above.
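
To make option (b) concrete: the failure mode in Issue #3745 is a single embedding call over the whole file, so the sketch below splits a document on paragraph boundaries, caps each chunk at a rough size budget, and embeds chunks one at a time so peak memory stays bounded. `embedText` stands in for whatever embedding call the pipeline actually uses; the chunk size is an arbitrary placeholder, not a tuned value.

```typescript
// Illustrative chunking + incremental embedding; the embed function and sizes are assumptions.
type Embedding = number[];

// Split on paragraph boundaries, then cap each chunk at a rough character budget
// (a stand-in for a proper token counter). Very long single paragraphs would still
// need a secondary split in a real pipeline.
function chunkDocument(text: string, maxChars = 2000): string[] {
  const chunks: string[] = [];
  let current = "";
  for (const paragraph of text.split(/\n\s*\n/)) {
    if (current.length > 0 && current.length + paragraph.length > maxChars) {
      chunks.push(current);
      current = "";
    }
    current += (current ? "\n\n" : "") + paragraph;
  }
  if (current) chunks.push(current);
  return chunks;
}

// Embed one chunk at a time so peak memory no longer scales with file size.
export async function embedDocument(
  text: string,
  embedText: (chunk: string) => Promise<Embedding>,
): Promise<Array<{ chunk: string; embedding: Embedding }>> {
  const results: Array<{ chunk: string; embedding: Embedding }> = [];
  for (const chunk of chunkDocument(text)) {
    results.push({ chunk, embedding: await embedText(chunk) });
  }
  return results;
}
```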

**Question 2:** Do we standardize an officially supported set of document formats (e.g., TXT/MD first) and explicitly de-scope PDFs until reliable parsing/embedding is guaranteed?

  **Context:**
  - `Discord (2025-02-27): "It didn't work with PDFs, converting to txt format worked instead" (Ale | AutoRujira).`
  - `Discord (2025-03-01): request: "Provide documentation on handling PDF files in Eliza" (andy4net).`

  **Multiple Choice Answers:**
    a) Yes—publish a supported-format matrix and de-scope PDF until we can guarantee quality.
        *Implication:* Sets clear expectations and reduces churn, but may frustrate users who view PDF as table-stakes for RAG.
    b) No—treat PDF as a first-class requirement and build the parsing + chunking stack now.
        *Implication:* Aligns with real-world usage, but is a multi-surface engineering effort (parsing, chunking, embeddings, memory).
    c) Support PDF via integrations (external converters) while we build native support gradually (a conversion sketch follows this list).
        *Implication:* Gets users unstuck quickly, but risks inconsistent results and complicates debugging/support.
    d) Other / More discussion needed / None of the above.
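
Option (c) in practice usually means converting outside the pipeline, which is what the cited Discord workaround already does by hand. A sketch that shells out to poppler's `pdftotext` (an external tool the user must install; not part of ElizaOS):

```typescript
// Convert a PDF to plain text with poppler's pdftotext before knowledge ingestion.
// pdftotext must be installed separately (e.g. `apt install poppler-utils` or `brew install poppler`).
import { execFile } from "node:child_process";
import { readFile } from "node:fs/promises";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

export async function pdfToText(pdfPath: string): Promise<string> {
  const txtPath = pdfPath.replace(/\.pdf$/i, ".txt");
  // -layout keeps rough column structure, which tends to chunk better than raw extraction.
  await execFileAsync("pdftotext", ["-layout", pdfPath, txtPath]);
  return readFile(txtPath, "utf8");
}
```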

**Question 3:** How should we address memory persistence issues driving repetitive outputs (e.g., Twitter repetition): prioritize core memory architecture improvements or deliver pragmatic “best practices” configs first?

  **Context:**
  - `Discord (2025-03-01, coders): "Implement proper memory storage to prevent repetitive tweets" (Redvoid).`
  - `Discord (2025-03-01): model behavior tuning via temperature/frequency_penalty/presence_penalty (artzy).`

  **Multiple Choice Answers:**
    a) Core-first: invest in memory architecture/persistence semantics before recommending tuning hacks.
        *Implication:* Produces durable correctness and better agents, but may delay immediate relief for builders shipping bots now.
    b) Pragmatic-first: publish a “Twitter anti-repetition” playbook (memory settings + evaluator patterns + modelConfig defaults); a repetition-guard sketch follows this list.
        *Implication:* Reduces user pain quickly and improves ecosystem outputs, but may ossify around workarounds if core fixes lag.
    c) Both: release playbook immediately and schedule memory architecture milestones with measurable outcomes.
        *Implication:* Balances near-term ecosystem quality with long-term platform integrity, but requires strong ownership and timeline discipline.
    d) Other / More discussion needed / None of the above.
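
If option (b) or (c) carries, the playbook can pair generation-side settings (the temperature / frequency_penalty / presence_penalty tuning cited above) with a memory-side guard. Below is a self-contained sketch of the guard: reject a candidate post that overlaps too heavily with recent ones. The similarity metric and threshold are illustrative choices, not ElizaOS defaults.

```typescript
// Illustrative repetition guard: skip a candidate post if it overlaps too much with recent posts.
function wordSet(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/\W+/).filter(Boolean));
}

function jaccard(a: Set<string>, b: Set<string>): number {
  const intersection = [...a].filter((word) => b.has(word)).length;
  const union = new Set([...a, ...b]).size;
  return union === 0 ? 0 : intersection / union;
}

export function isTooRepetitive(candidate: string, recentPosts: string[], threshold = 0.6): boolean {
  const candidateWords = wordSet(candidate);
  return recentPosts.some((post) => jaccard(candidateWords, wordSet(post)) >= threshold);
}

// Usage idea: load the agent's last N posts from memory, gate publishing on this check,
// and regenerate (or skip the slot) when it trips.
```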

---


### 3. Topic: Cloud & Client Reliability: Social Integrations, Auth, and Public Surface Area

**Summary of Topic:** Cloud-facing reliability improved with the fix for Twitter authentication failures on Cloud, but user reports still point to fragility (image posting, hosted embedding-model initialization errors), and our public web surface (eliza.gg) remains broken, undermining trust-through-shipping and onboarding.

#### Deliberation Items (Questions):

**Question 1:** What is the Council’s priority order for reliability fixes: (1) Cloud auth + core connectivity, (2) social posting features (images/polls), or (3) hosted embedding/model initialization stability?

  **Context:**
  - `GitHub Daily (2025-03-02): "Resolved Twitter authentication failures on Cloud" (Issue #2225).`
  - `Discord (2025-03-01, coders): "Fix Twitter image generation and posting capability" (Abderahman).`

  **Multiple Choice Answers:**
    a) Prioritize Cloud auth + core connectivity first; everything else is secondary.
        *Implication:* Protects platform foundation and reduces systemic outages, but may disappoint builders focused on visible social demos.
    b) Prioritize social posting features (especially images) to maximize flagship agent impact and virality.
        *Implication:* Improves public perception and adoption loops, but risks papering over deeper infrastructure weaknesses.
    c) Prioritize hosted model initialization stability to prevent hard-to-debug deployment failures.
        *Implication:* Reduces high-friction failures for builders deploying anywhere, improving trust, but may delay feature polish.
    d) Other / More discussion needed / None of the above.

**Question 2:** How should we govern third-party hosting failures (e.g., Fleek BGE init errors): treat as “out of scope” or build a compatibility certification + diagnostic toolkit?

  **Context:**
  - `Discord (2025-03-01): "failed to initialize BGE model" on fleek.xyz (Ordinal Watches).`
  - `Discord (2025-03-01, coders): response indicated it is a hosting issue needing Fleek fixes (jintern).`

  **Multiple Choice Answers:**
    a) Out of scope: document known-good hosts and redirect hosting-specific issues to providers.
        *Implication:* Keeps team focused, but leaves a fractured deployment story and inconsistent user experience.
    b) Build diagnostics + environment checks in CLI to detect common host constraints (memory, filesystem, native deps); see the environment-check sketch after this list.
        *Implication:* Improves self-serve debugging and reduces support load, but requires ongoing maintenance across host ecosystems.
    c) Launch an “ElizaOS Certified Deployments” program (compat matrix + automated validation workflows).
        *Implication:* Strengthens trust and ecosystem standards, but adds governance overhead and potential politics with hosting partners.
    d) Other / More discussion needed / None of the above.
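
A small version of option (b), using only Node built-ins: a doctor-style preflight that surfaces the host constraints most often behind reports like the Fleek BGE failure (available memory, Node version, writable disk). The thresholds below are illustrative, not validated minimums.

```typescript
// Illustrative "doctor" preflight: report common host constraints before startup.
import os from "node:os";
import { accessSync, constants } from "node:fs";

function checkEnvironment(): string[] {
  const warnings: string[] = [];

  // Local embedding models typically want several GB of RAM; 4 GB is an illustrative floor.
  const totalGb = os.totalmem() / 1024 ** 3;
  if (totalGb < 4) warnings.push(`Low memory: ${totalGb.toFixed(1)} GB total (4+ GB recommended).`);

  // Node major version check (example threshold).
  const major = Number(process.versions.node.split(".")[0]);
  if (major < 18) warnings.push(`Node ${process.versions.node} detected; 18+ recommended.`);

  // Some hosts mount read-only filesystems, which breaks model downloads and caches.
  try {
    accessSync(os.tmpdir(), constants.W_OK);
  } catch {
    warnings.push(`Temp directory ${os.tmpdir()} is not writable.`);
  }

  return warnings;
}

const warnings = checkEnvironment();
if (warnings.length > 0) {
  console.warn("Environment warnings:\n - " + warnings.join("\n - "));
} else {
  console.log("Environment looks OK.");
}
```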

**Question 3:** What is our strategic response to a broken public domain (eliza.gg): rapid replacement now, or consolidate all public entry points into a single canonical docs/Cloud portal?

  **Context:**
  - `Discord (2025-03-01): "eliza.gg website was reported broken" (Teng Yan); "jin confirmed they would set up a new site since the previous maintainer went AWOL."`
  - `Discord (2025-03-01): new showcase page added; "needs further polish" (jin).`

  **Multiple Choice Answers:**
    a) Replace eliza.gg immediately with a minimal stable landing page and clear redirects to docs/showcase (a redirect stopgap is sketched after this list).
        *Implication:* Stops reputational bleed quickly, but may create another surface to maintain.
    b) Consolidate: one canonical portal (docs + Cloud + showcase) and deprecate legacy domains aggressively.
        *Implication:* Reduces fragmentation long-term, but requires careful migration/redirect strategy to avoid breaking community links.
    c) Keep multiple surfaces but formalize ownership (maintainers, SLAs, automation) for each public endpoint.
        *Implication:* Supports marketing flexibility, but increases operational complexity and risk of future “maintainer vanished” incidents.
    d) Other / More discussion needed / None of the above.
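
For scale, option (a) can be as small as a static redirect in front of the canonical docs/showcase, so the legacy domain stops serving a broken page while ownership is sorted out. The target URL below is a placeholder, not a confirmed destination.

```typescript
// Minimal stopgap: redirect every request on the legacy domain to the canonical portal.
import { createServer } from "node:http";

const CANONICAL = "https://docs.example.org"; // placeholder for the canonical docs/Cloud portal

createServer((req, res) => {
  // 302 (temporary) keeps options open until the canonical structure is decided.
  res.writeHead(302, { Location: new URL(req.url ?? "/", CANONICAL).toString() });
  res.end();
}).listen(8080, () => console.log("redirect server listening on :8080"));
```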