LiveKit provides the real-time audio pipeline (speech-to-text, LLM, text-to-speech); Moss supplies retrieval. The agent loads a Moss index (and optionally opens a session for the live conversation), then queries it in-process to ground its answers. Each lookup is a local call, not a network request.

Pipeline

Voice agent pipeline from mic to speech-to-text, Moss retrieval, LLM, and speaker

Two ways to build it

LiveKit integration (DIY)

Add Moss to your own LiveKit agent: load a knowledge index, open a per-call session, and expose search as function tools. Full, copy-pasteable recipe.

Managed Voice Agents

Let Moss host the agent: the Agent SDK, backend token minting, and the deploy CLI.
Prefer to start from a working app? Clone the LiveKit voice agent sample and add your own keys.

How retrieval fits

  • Load a Moss index into memory at startup; retrieval then runs in-process, in single-digit milliseconds.
  • For live conversation memory, open a session and index transcript turns as they arrive (see Live-call context).
  • Expose retrieval to the LLM as a function tool so it searches only when it needs to.
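
The three steps above can be sketched in a few lines of Python. This is only an illustration of the load-then-query shape, not the Moss SDK: `InMemoryIndex`, `search_knowledge`, the sample documents, and the word-overlap scoring are all hypothetical stand-ins, and a real agent would register the function with its LLM framework as a tool rather than call it directly.

```python
# Illustrative sketch only. InMemoryIndex is a hypothetical stand-in for a
# retrieval index and does NOT reflect the Moss SDK's actual API.
import re
from collections import Counter
from dataclasses import dataclass, field


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


@dataclass
class InMemoryIndex:
    """Toy index: ranks documents by word overlap with the question."""
    docs: list[str] = field(default_factory=list)

    def add(self, text: str) -> None:
        self.docs.append(text)

    def query(self, question: str, top_k: int = 2) -> list[str]:
        q = Counter(tokenize(question))
        scored = []
        for doc in self.docs:
            overlap = sum((q & Counter(tokenize(doc))).values())
            if overlap:
                scored.append((overlap, doc))
        # Highest overlap first; ties keep insertion order (stable sort).
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in scored[:top_k]]


# 1. Load the knowledge index once, when the agent starts.
knowledge = InMemoryIndex([
    "Refunds are issued within 5 business days.",
    "Support is available on weekdays from 9am to 5pm.",
])

# 2. Open a per-call session and index transcript turns as they arrive.
session = InMemoryIndex()
session.add("Caller: my order number is 1042.")


# 3. Expose retrieval as a function the LLM can call when it needs context.
def search_knowledge(question: str) -> str:
    """Query long-term knowledge and the live session, both in-process."""
    hits = knowledge.query(question) + session.query(question)
    return "\n".join(hits) if hits else "No relevant context found."


print(search_knowledge("When are refunds issued?"))
```

Both lookups run against in-memory structures, so no network request is made per query. Keeping the knowledge index and the session separate lets the session be discarded when the call ends while the knowledge index stays loaded.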

Live-call context

Short-term session context plus long-term knowledge during a call.

Sub-10ms knowledge retrieval

The load-then-query retrieval pipeline.