Full example — see the Generalist Voice Agent cookbook for the complete runnable demo with sample docs and persona config.
Why Moss for voice agents?
Voice agents are latency-sensitive. A retrieval step that takes 200–500 ms is audible as a pause. Moss loads indexes into local memory at session start, so every moss_search call during the conversation returns in roughly 1–10 ms, fast enough to stay below the perceptual threshold.
The persona model also means you can run a single agent binary against many knowledge bases. Add hr_policies as a new index, register it in personas.json, and the next call routed to that persona immediately has access to it.
Architecture
Required tools
- Moss account with project credentials
- LiveKit account (URL, API key, API secret)
- OpenAI API key
- Deepgram API key
- Cartesia API key
- Python 3.10+
Integration guide
Index your documents
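As a rough illustration of the folder-to-index convention used in this step, the sketch below collects the .txt and .md files in one folder and hands them to an uploader under the folder's name. It is not the cookbook's create_index.py: the `upload` callable, the record shape (`id`, `text`), and the function names are hypothetical placeholders for whatever Moss SDK call your project uses.

```python
from pathlib import Path
from typing import Callable, Dict, List

# Hypothetical uploader signature: (index_name, documents) -> None.
# Swap in the real Moss SDK call; this is NOT the actual SDK API.
Uploader = Callable[[str, List[Dict[str, str]]], None]


def collect_documents(folder: Path) -> List[Dict[str, str]]:
    """Read every .txt and .md file directly inside `folder`."""
    docs = []
    for path in sorted(folder.iterdir()):
        if path.is_file() and path.suffix.lower() in {".txt", ".md"}:
            docs.append({"id": path.stem, "text": path.read_text(encoding="utf-8")})
    return docs


def index_folder(folder: Path, upload: Uploader) -> str:
    """Index one folder; the folder name doubles as the index name."""
    index_name = folder.name
    docs = collect_documents(folder)
    if not docs:
        raise ValueError(f"no .txt or .md files found in {folder}")
    upload(index_name, docs)
    return index_name
```

Calling `index_folder(Path("customer_support"), upload)` would, under these assumptions, produce an index named customer_support, which is the name the agent later routes to.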
Place .txt or .md files in a folder named after the knowledge domain. The folder name becomes the Moss index name and is what the agent routes to.
Run create_index.py once per folder. To add a new knowledge base later, index a new folder; no agent restart is needed.
Register personas
personas.json maps a persona ID to an index name and a system prompt. Each persona is a distinct voice and knowledge domain.
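To make the mapping concrete, here is a minimal sketch of what a personas.json file might contain and how it could be parsed and validated. The field names `index` and `system_prompt`, the prompt wording, and the `Persona` and `parse_personas` names are assumptions for illustration; check the cookbook for the actual schema.

```python
import json
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Persona:
    persona_id: str
    index: str          # Moss index name (the folder name used when indexing)
    system_prompt: str


# Illustrative file content; keys and prompts are assumed, not the real schema.
EXAMPLE_PERSONAS_JSON = """
{
  "customer_support": {
    "index": "customer_support",
    "system_prompt": "You are a friendly customer support agent."
  },
  "product_faq": {
    "index": "product_faq",
    "system_prompt": "You are a product specialist."
  },
  "hr_policies": {
    "index": "hr_policies",
    "system_prompt": "You are an HR assistant."
  }
}
"""


def parse_personas(raw: str) -> Dict[str, Persona]:
    """Parse persona config and fail early on incomplete entries."""
    personas = {}
    for persona_id, entry in json.loads(raw).items():
        missing = {"index", "system_prompt"} - set(entry)
        if missing:
            raise ValueError(f"persona {persona_id!r} is missing {sorted(missing)}")
        personas[persona_id] = Persona(persona_id, entry["index"], entry["system_prompt"])
    return personas
```

Validating at load time means a typo in the config surfaces when the agent starts rather than in the middle of a call.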
Build the voice agent
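The shape of the agent in this section (load the index once when the session starts, then answer every search from local memory) can be sketched without any SDK. Everything below is a stand-in: `InMemoryIndex` uses naive keyword overlap rather than Moss's retrieval, and `VoiceAgentSketch`, `on_session_start`, and the `loader` callable are hypothetical names. In the real agent the class would extend livekit.agents.Agent and moss_search would carry the @function_tool decorator.

```python
import asyncio
from typing import Dict, List, Optional


class InMemoryIndex:
    """Toy stand-in for an index already loaded into local memory."""

    def __init__(self, docs: Dict[str, str]):
        self._docs = docs

    def query(self, text: str, top_k: int = 3) -> List[str]:
        # Naive keyword-overlap scoring; NOT how Moss ranks results.
        terms = set(text.lower().split())
        scored = []
        for doc_id, body in self._docs.items():
            score = len(terms & set(body.lower().split()))
            if score:
                scored.append((score, doc_id, body))
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [body for _, _, body in scored[:top_k]]


class VoiceAgentSketch:
    """Load once at session start, then serve tool calls from memory."""

    def __init__(self, index_name: str):
        self.index_name = index_name
        self._index: Optional[InMemoryIndex] = None

    async def on_session_start(self, loader) -> None:
        # `loader` is a hypothetical async callable: index name -> loaded index.
        self._index = await loader(self.index_name)

    async def moss_search(self, query: str) -> str:
        # No network round trip here: the index is already in memory.
        if self._index is None:
            raise RuntimeError("index not loaded; call on_session_start first")
        hits = self._index.query(query)
        return "\n".join(hits) if hits else "No relevant documents found."
```

The point of the design is that the slow step (loading) happens once, before the caller speaks, so the per-question path contains only an in-memory lookup.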
MossVoiceAgent extends livekit.agents.Agent. The @function_tool decorator exposes moss_search to the LLM. The index is pre-loaded at session start so retrieval stays in local memory.
Start the agent and connect
Switching personas at call time
The persona is resolved from room metadata at the start of each session. This means you can route different callers to different knowledge bases (support line, product FAQ, HR helpdesk) from a single running agent process.
| Room metadata | Persona loaded | Moss index searched |
|---|---|---|
| {"persona": "customer_support"} | Customer support | customer_support |
| {"persona": "product_faq"} | Product specialist | product_faq |
| {"persona": "hr_policies"} | HR assistant | hr_policies |
To add a persona, index a new folder with create_index.py, and add a new entry to personas.json. No restart required.
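The metadata-based routing described in this section can be sketched as a small lookup. This is a minimal sketch, assuming the room metadata arrives as a JSON string with a `persona` key (as in the table above); the `PERSONAS` dictionary, the `resolve_persona` name, and the fallback to a default persona on missing or malformed metadata are illustrative assumptions, not confirmed behavior of the cookbook agent.

```python
import json
from typing import Optional

# Mirrors the table above; in practice this would come from personas.json.
PERSONAS = {
    "customer_support": {"name": "Customer support", "index": "customer_support"},
    "product_faq": {"name": "Product specialist", "index": "product_faq"},
    "hr_policies": {"name": "HR assistant", "index": "hr_policies"},
}

# Assumption: fall back to a default instead of rejecting the call.
DEFAULT_PERSONA = "customer_support"


def resolve_persona(room_metadata: Optional[str]) -> dict:
    """Pick the persona (and therefore the index) for one session."""
    persona_id = DEFAULT_PERSONA
    if room_metadata:
        try:
            parsed = json.loads(room_metadata)
        except json.JSONDecodeError:
            parsed = {}
        if isinstance(parsed, dict):
            candidate = parsed.get("persona")
            # Only accept IDs that are actually registered.
            if isinstance(candidate, str) and candidate in PERSONAS:
                persona_id = candidate
    return {"id": persona_id, **PERSONAS[persona_id]}
```

Because the lookup runs per session, two simultaneous calls with different metadata can search different indexes from the same process.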