---
sidebar_position: 3
title: 'Discord Development Stream'
description: "Complete technical walkthrough of Eliza's architecture, systems, and implementation details"
---

# Discord Dev Stream 11-6-24

## Part 1

<div className="responsive-iframe">
  <iframe
    src="https://www.youtube.com/embed/oqq5H0HRF_A"
    title="YouTube video player"
    allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
    allowFullScreen
  />
</div>
Watch: https://www.youtube.com/watch?v=oqq5H0HRF_A

00:00:00 - Overview

- Eliza is moving to a plugin architecture to enable developers to easily add integrations (e.g. Ethereum wallets, NFTs, Obsidian, etc.) without modifying core code
- Plugins allow devs to focus on specific areas of interest
- Core changes will focus on enabling more flexibility and features to support plugins

00:01:27 - Core abstractions

- Characters: a way to define an agent's identity and information, enabling multi-agent systems
- Actions, evaluators, providers
- Existing capabilities: Document reading, audio transcription, video summarization, long-form context, timed message summarization

00:02:50 - Eliza as an agent, not just a chatbot

- Designed to act human-like and interact with the world using human tools
- Aim is to enable natural interactions without reliance on slash commands

00:04:44 - Advanced usage and services

- Memory and vector search DB (SQLite, or Postgres with pgvector)
- Browser service to summarize website content and get through CAPTCHAs
- Services are tools leveraged by actions, attached to the runtime
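As a quick illustration of that relationship, an action might pull a service off the runtime. A hypothetical sketch; the `ServiceType` name and the service's method are assumptions, not confirmed API:

```typescript
import { IAgentRuntime, ServiceType } from "@ai16z/eliza";

// Hypothetical: an action handler grabbing the browser service from the
// runtime to summarize a page (service and method names assumed).
async function summarizePage(runtime: IAgentRuntime, url: string) {
  const browser: any = runtime.getService(ServiceType.BROWSER);
  // The service does the heavy lifting (fetching, CAPTCHA handling);
  // the action just consumes its output.
  return browser.getPageContent(url);
}
```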

00:06:06 - Character-centric configuration

- Moving secrets, API keys, and the model provider into the character config
- Clients will become plugins, selectable per character
- Allows closed-source custom plugins while still contributing to open-source
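Directionally, that points at character configs like the following sketch (field names follow Eliza's character config where known; the values are placeholders):

```typescript
// Illustrative character config: secrets and the model provider live on
// the character, and clients/plugins are selected per character.
const character = {
  name: "trader",
  modelProvider: "openai",
  clients: ["discord"],
  plugins: [], // closed-source custom plugins could be listed here
  settings: {
    secrets: {
      OPENAI_API_KEY: process.env.OPENAI_API_KEY,
    },
  },
};
```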

00:10:13 - Providers

- Inject dynamic, real-time context into the agent
- Examples: Time, wallet, marketplace trust score, token balances, boredom/cringe detection
- Easy to add and register with the agent
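For example, a provider is just an object with an async `get` that returns text to inject into the agent's context. A minimal sketch, assuming Eliza's `Provider` interface:

```typescript
import { IAgentRuntime, Memory, Provider, State } from "@ai16z/eliza";

// Minimal time provider sketch; whatever `get` returns is injected
// into the agent's context during state composition.
const timeProvider: Provider = {
  get: async (_runtime: IAgentRuntime, _message: Memory, _state?: State) => {
    return `The current time is ${new Date().toUTCString()}.`;
  },
};

export default timeProvider;
```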

00:15:12 - Setting up providers and default actions

- Default providers are imported in runtime.ts
- CLI loads characters and default actions (to be made more flexible)
- Character config will define custom action names to load

00:18:13 - Actions
Q: How does each client decide which action to call?
A: The agent's response can include text, an action, or both. `processActions` matches the action name/similes and executes the corresponding handler. The action's description is injected into the agent's context to guide usage.

00:22:27 - Action execution flow

- Check if action should be taken (validation)
- Determine action outcome
- Compose context and send follow-up if continuing
- Execute desired functionality (mint token, generate image, etc.)
- Use callback to send messages back to the connector (Discord, Twitter, etc.)
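Put together, a handler following that flow might look like this sketch (the "work" is a placeholder; the field set and `HandlerCallback` shape assume Eliza's `Action` interface):

```typescript
import {
  Action,
  HandlerCallback,
  IAgentRuntime,
  Memory,
  State,
} from "@ai16z/eliza";

// Sketch of the execution flow above: validate, do the work, then use
// the callback to send the result back through the originating client.
export const generateImageAction: Action = {
  name: "GENERATE_IMAGE",
  similes: ["CREATE_IMAGE", "DRAW"],
  description: "Generates an image from the conversation's request.",
  examples: [],
  validate: async (runtime: IAgentRuntime, _message: Memory) =>
    Boolean(runtime.getSetting("IMAGE_API_KEY")), // hypothetical setting
  handler: async (
    _runtime: IAgentRuntime,
    _message: Memory,
    _state?: State,
    _options?: Record<string, unknown>,
    callback?: HandlerCallback
  ) => {
    const imageUrl = "https://example.com/generated.png"; // placeholder work
    // The callback routes the reply to whichever connector the
    // message came from (Discord, Twitter, etc.).
    callback?.({ text: `Here's your image: ${imageUrl}` });
    return true;
  },
};
```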

00:24:47 - Choosing actions
Q: How does it choose which action to run?
A: The "generate method response" includes the action to run. Message handler template includes action examples, facts, generated dialogue actions, and more to guide the agent.

00:28:22 - Custom actions
Q: How to create a custom action (e.g. send USDC to a wallet)?
A: Use existing actions (like the token swap) as a template. Actions don't have input fields; instead, secondary prompts gather parameters, and `generateObject` converts natural language into structured API calls.
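A hypothetical SEND_USDC sketch of that pattern: a secondary prompt extracts the parameters, and a `generateObject`-style call turns the conversation into a structured object (signatures assumed; the transfer itself is left as a stub):

```typescript
import {
  Action,
  IAgentRuntime,
  Memory,
  ModelClass,
  State,
  composeContext,
  generateObject,
} from "@ai16z/eliza";

// Secondary prompt used to gather parameters; actions have no input
// fields, so the model extracts them from the recent conversation.
const transferTemplate = `Extract the USDC transfer details from the conversation.
Respond with JSON: { "recipient": string, "amount": number }

{{recentMessages}}`;

export const sendUsdcAction: Action = {
  name: "SEND_USDC",
  similes: ["TRANSFER_USDC", "PAY_USDC"],
  description: "Sends USDC to a wallet address mentioned in the chat.",
  examples: [],
  validate: async (runtime: IAgentRuntime, _message: Memory) =>
    Boolean(runtime.getSetting("WALLET_PRIVATE_KEY")), // hypothetical setting
  handler: async (runtime: IAgentRuntime, message: Memory, state?: State) => {
    const currentState = state ?? (await runtime.composeState(message));
    const context = composeContext({
      state: currentState,
      template: transferTemplate,
    });
    // generateObject converts the conversation into a structured object.
    const params = await generateObject({
      runtime,
      context,
      modelClass: ModelClass.SMALL,
    });
    // ...perform the transfer with params.recipient / params.amount
    return true;
  },
};
```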

00:32:21 - Limitations of action-only approaches

- Shaw believes half of the PhD papers on action-only models are not reproducible
- Many public claims of superior models are exaggerated; compare for yourself and use Eliza if it works better

00:36:40 - Next steps

- Shaw to make a tutorial to better communicate key concepts
- Debugging and improvements based on the discussion
- Attendee to document their experience and suggest doc enhancements

## Part 2

Watch: https://www.youtube.com/watch?v=yE8Mzq3BnUc

00:00:00 - Dealing with OpenAI rate limits for new accounts

- New accounts have very low rate limits
- Options to increase limits:
  1. Have a friend at OpenAI age your account
  2. Use an older account
  3. Consistently use the API and limits will increase quickly
- Can also email OpenAI to request limit increases

00:00:43 - Alternatives to OpenAI to avoid rate limits

- Amazon Bedrock or Google Vertex likely have the same models without strict rate limits
- Switching to these is probably a one-line change
- Project 89 got unlimited free access to Vertex

00:01:25 - Memory management best practices
Q: Suggestions for memory management best practices across users/rooms?
A: Most memory systems are user-agent based, with no room concept. Eliza uses a room abstraction (like a Discord channel/server or Twitter thread) to enable multi-agent simulation. Memories are stored per-agent to avoid collisions.

00:02:57 - Using memories in Eliza

- Memories are used in the `composeState` function
- Pulls memories from various sources (recent messages, facts, goals, etc.) into a large state object
- State object is used to hydrate templates
- Custom memory providers can be added to pull from other sources (Obsidian, databases)
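A minimal sketch of that flow, assuming a runtime and an incoming message are available (`composeState` and `composeContext` are the functions named above; the template keys are common Eliza placeholders):

```typescript
import { IAgentRuntime, Memory, composeContext } from "@ai16z/eliza";

// Template placeholders are hydrated from the composed state object.
const template = `{{recentMessages}}

What should {{agentName}} say next?`;

async function buildContext(runtime: IAgentRuntime, message: Memory) {
  // Pulls recent messages, facts, goals, provider output, etc.
  // into one large state object.
  const state = await runtime.composeState(message);
  return composeContext({ state, template });
}
```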

00:05:11 - Evaluators vs. Action validation

- Actions have a `validate` function to check if the action is valid to run (e.g., check if agent has a wallet before a swap)
- Evaluators are a separate abstraction that run a "reflection" step
- Example: Fact extraction evaluator runs every N messages to store facts about the user as memories
- Allows agent to "get to know" the user without needing full conversation history
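A fact-extraction style evaluator might be sketched like this (interface shape assumed; the extraction itself is stubbed out):

```typescript
import { Evaluator, IAgentRuntime, Memory } from "@ai16z/eliza";

// Reflection step: runs after messages and persists durable facts so the
// agent can "know" the user without the full conversation history.
export const factEvaluator: Evaluator = {
  name: "EXTRACT_FACTS",
  similes: [],
  description: "Extracts lasting facts about the user from recent chat.",
  examples: [],
  // Gate how often the reflection runs, e.g. every N messages.
  validate: async (_runtime: IAgentRuntime, _message: Memory) => true,
  handler: async (_runtime: IAgentRuntime, _message: Memory) => {
    // A real implementation would prompt a small model to extract facts,
    // then store each one as a memory for later recall.
    return [];
  },
};
```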

00:07:58 - Example use case: Order book evaluator

- The evaluator looks at chats sent to an agent and extracts token recommendations ("shills")
- Uses this to build an order book and "marketplace of trust"

00:09:15 - Mapping Eliza abstractions to OODA loop

- Providers: Observe/Orient stages (merged since agent is a data machine)
- Actions & response handling: Decide stage
- Action execution: Act stage
- Evaluators: Update state, then loop back to Decide

00:10:03 - Wrap up

- Shaw considers making a video to explain these concepts in depth

## Part 3

Watch: https://www.youtube.com/watch?v=7FiKJPyaMJI

00:00:00 - Managing large context sizes

- State object can get very large, especially with long user posts
- Eliza uses "trim tokens" and a maximum content length (120k tokens) to cap context size
  - New models have 128k-200k context, which is a lot (equivalent to 10 YouTube videos + full conversation)
- Conversation length is typically capped at 32 messages
  - Fact extraction allows recalling information beyond this window
  - Conversation access is per channel
- Increasing conversation length risks more aggressive token trimming from the top of the prompt
  - Keep instructions at the bottom to avoid trimming them
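The trimming idea, as a naive character-based sketch (Eliza's actual trimming uses a tokenizer; this just shows why instructions belong at the bottom):

```typescript
// Keep the *end* of the prompt when over budget; truncation eats the
// top first, so instructions placed at the bottom survive trimming.
function trimContext(context: string, maxTokens: number): string {
  const approxCharsPerToken = 4; // rough heuristic, not a real tokenizer
  const maxChars = maxTokens * approxCharsPerToken;
  return context.length <= maxChars ? context : context.slice(-maxChars);
}
```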

00:01:53 - Billing costs for cloud/GPT models
Q: What billing costs have you experienced with cloud/GPT model integration?
A:

- OpenRouter has a few always-free models, limited to 8k context and rate-limited
  - Plan to re-implement and use these for the tiny/check model, with fallback handling for rate limits
- An 8k context is unlikely to make a good agent; a smaller model with more context is preferable to the largest model capped at 8k
- Locally-run models are free on MacBooks with 16GB RAM, but not feasible for Linux/AMD users

00:03:35 - Cost management strategies

- Costs scale with model size, so the setup is very cost-flexible
- Use a very cheap model (~1000x cheaper than GPT-4) for the should_respond handler
  - It runs on every incoming message, so its cost matters most
- Consider running a local Llama 3B model for should_respond to minimize costs
  - Then you only pay for valid generations
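In code, that routing is just a matter of which model class the check uses. A sketch, assuming Eliza's `generateShouldRespond` helper:

```typescript
import { IAgentRuntime, ModelClass, generateShouldRespond } from "@ai16z/eliza";

// Route the per-message check to the cheapest model class; reserve the
// large models for actual response generation.
async function checkShouldRespond(runtime: IAgentRuntime, context: string) {
  return generateShouldRespond({
    runtime,
    context,
    modelClass: ModelClass.SMALL, // or a local Llama 3B behind this class
  });
}
```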

00:04:32 - Model provider and class configuration

- `ModelProvider` class with `ModelClass` (small, medium, large, embedding)
- Configured in `models.ts`
- Example: OpenAI small = gpt-4o-mini, medium = gpt-4o
- Approach: check whether a task fits in less than 8k context
  - If it does (e.g. should_respond), default to the free tier
  - Otherwise, use the big models
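The mapping in `models.ts` looks roughly like this sketch (abbreviated; enum names follow the stream's naming, and the model names follow the OpenAI example above):

```typescript
import { ModelClass, ModelProvider } from "@ai16z/eliza";

// Abbreviated sketch of the provider -> model-class mapping.
const models = {
  [ModelProvider.OPENAI]: {
    model: {
      [ModelClass.SMALL]: "gpt-4o-mini",
      [ModelClass.MEDIUM]: "gpt-4o",
      [ModelClass.LARGE]: "gpt-4o",
      [ModelClass.EMBEDDING]: "text-embedding-3-small",
    },
  },
};
```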

00:06:23 - Fine-tuned model support

- Extend `ModelProvider` to support fine-tuned instances of small Llama models for specific tasks
- In progress, to be added soon
- Model endpoint override exists; will add per-model provider override
  - Allows pointing small model to fine-tuned Llama 3.1B for should_respond

00:07:10 - Avoiding cringey model loops

- Fine-tuning is a form of anti-slop (avoiding low-quality responses)
- For detecting cringey model responses, use the "boredom provider"
  - Has a list of cringe words; if detected, agent disengages
- A JSON file exists listing words that are disproportionately frequent in the dataset
  - To be shared as the basis for a more comprehensive solution
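The detection itself can be as simple as a word-list scan. An illustrative sketch (the words here are made up, pending the JSON file mentioned above):

```typescript
// Illustrative cringe check; a boredom-style provider could run this
// over recent messages and signal the agent to disengage on a hit.
const CRINGE_WORDS = ["tapestry", "delve", "furthermore"]; // placeholder list

function containsCringe(text: string): boolean {
  const lower = text.toLowerCase();
  return CRINGE_WORDS.some((word) => lower.includes(word));
}
```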

## Part 4

Watch: https://www.youtube.com/watch?v=ZlzZzDU1drM

00:00:00 - Setting up an autonomous agent loop
Q: How to set up an agent to constantly loop and explore based on objectives/goals?
A: Create a new "autonomous" client:

1. Initialize with just the runtime (no Express app needed)
2. Set a timer to call a `step` function every 10 seconds
3. In the `step` function:
   - Compose state
   - Decide on action
   - Execute action
   - Update state
   - Run evaluators
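A minimal sketch of that client; the class and method names are illustrative, not a final API:

```typescript
import { IAgentRuntime } from "@ai16z/eliza";

// Hypothetical autonomous client: no Express app, just a timer that
// steps the runtime every 10 seconds.
export class AutoClient {
  private interval: ReturnType<typeof setInterval>;

  constructor(private runtime: IAgentRuntime) {
    this.interval = setInterval(() => void this.step(), 10_000);
  }

  private async step() {
    // compose state -> decide on an action -> execute it ->
    // update state -> run evaluators (fleshed out below)
  }

  stop() {
    clearInterval(this.interval);
  }
}
```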

00:01:56 - Creating an auto template

- Create an `autoTemplate` with agent info (bio, lore, goals, actions)
- Prompt: "What does the agent want to do? Your response should only be the name of the action to call."
- Compose state using `runtime.composeState`
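A minimal version of that template (the placeholder keys assume standard Eliza template variables):

```typescript
const autoTemplate = `About {{agentName}}:
{{bio}}
{{lore}}

{{goals}}

Available actions: {{actions}}

What does {{agentName}} want to do? Your response should only be the name
of the action to call.`;
```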

00:03:38 - Passing a message object

- Need to pass a message object with `userId`, `agentId`, `content`, and `roomId`
- Create a unique `roomId` for the autonomous agent using `crypto.randomUUID()`
- Set `userId` and `agentId` using the runtime
- Set `content` to a default message
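Assembled, the message might look like this sketch (the `Memory` field set is assumed; the fresh `roomId` keeps the autonomous loop's memories separate from real conversations):

```typescript
import crypto from "node:crypto";
import { IAgentRuntime, Memory, UUID } from "@ai16z/eliza";

// One room for the whole autonomous session.
const autoRoomId = crypto.randomUUID() as UUID;

function makeAutoMessage(runtime: IAgentRuntime): Memory {
  return {
    userId: runtime.agentId, // the agent is "talking" to itself here
    agentId: runtime.agentId,
    roomId: autoRoomId,
    content: { text: "What should I do next?" },
  };
}
```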

00:04:33 - Composing context

- Compose context using the runtime, state, and auto template
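Inside the step function, that is just (reusing the pieces from the previous sections):

```typescript
import { IAgentRuntime, Memory, composeContext } from "@ai16z/eliza";

// Hydrate the auto template with freshly composed state each step.
async function buildAutoContext(
  runtime: IAgentRuntime,
  message: Memory,
  autoTemplate: string
) {
  const state = await runtime.composeState(message);
  return composeContext({ state, template: autoTemplate });
}
```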

00:05:02 - Type error

- Getting a TypeScript error: "... is missing the following properties from type 'State'"
- (Transcript ends before resolution)

The key steps are:

1. Create a dedicated autonomous client
2. Set up a loop to continuously step through the runtime
3. In each step, compose state, decide & execute actions, update state, and run evaluators
4. Create a custom auto template to guide the agent's decisions
5. Pass a properly formatted message object
6. Compose context using the runtime, state, and auto template
