agora-moss package. Moss is exposed as a single MCP tool (search_knowledge_base) over streamable HTTP — wire it into ConvoAI’s llm.mcp_servers join-body field and your voice agent can look up knowledge base answers in under 10ms during a live call.
Note: For a complete working example, see the agora-moss app.
Why use Moss with Agora?
Agora ConvoAI agents accept MCP servers as tools the LLM can call mid-conversation. Moss drops in as one of those servers: your agent keeps whichever LLM, ASR, and TTS vendors you already use, and gains fast, hallucination-free knowledge base lookups with no LLM-side plumbing.
Required tools
- Moss account with project credentials
- Agora account with Conversational AI enabled (App ID, App Certificate, Customer ID, Customer Secret)
- An OpenAI-compatible LLM endpoint (OpenAI, Groq, Together, vLLM, etc.) plus ASR/TTS vendor keys (Deepgram, Cartesia, or any other Agora-supported provider)
- A public URL for your MCP server (production host, or ngrok/cloudflared for local dev)
- Python 3.10+
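Before starting the server, it can help to confirm that the credentials listed above are actually present. The following is a minimal preflight sketch: MOSS_PROJECT_ID and MOSS_PROJECT_KEY are the variable names this page uses, while the four AGORA_* names are hypothetical placeholders chosen for illustration.

```python
import os
from typing import Mapping

# Variable names used elsewhere on this page.
MOSS_VARS = ("MOSS_PROJECT_ID", "MOSS_PROJECT_KEY")

# Hypothetical names for the Agora credentials; substitute whatever
# your deployment actually uses.
AGORA_VARS = (
    "AGORA_APP_ID",
    "AGORA_APP_CERTIFICATE",
    "AGORA_CUSTOMER_ID",
    "AGORA_CUSTOMER_SECRET",
)


def missing_settings(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in MOSS_VARS + AGORA_VARS if not env.get(name)]
```

Calling `missing_settings()` at startup and exiting when the list is non-empty gives a clearer failure than an authentication error partway through a call.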
Integration guide
Run the MCP server
Build a FastMCP app from MossAgoraSearch and serve it at a public HTTPS URL. The index is preloaded into memory during the server’s lifespan so every tool call runs in-process.
server.py
Run it and expose /mcp publicly:
Wire into the Agora ConvoAI join body
Point Agora’s ConvoAI REST /join endpoint at your MCP server by adding one mcp_servers entry under llm and flipping advanced_features.enable_tools on. Everything else (vendor, LLM URL, ASR, TTS) stays exactly as you already have it.
Agora rules to watch:
- Server-entry name must be ≤48 characters and alphanumeric only (no hyphens, underscores, or dots).
- Transport must be streamable_http.
- advanced_features.enable_tools must be true.
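The three rules above can be checked in code before the join request is sent. This is a sketch under stated assumptions, not Agora's reference payload: the `endpoint` key for the server URL and the helper name `mcp_join_fragment` are hypothetical, the name check assumes ASCII letters and digits, and the fragment should be merged wherever `llm` and `advanced_features` already sit in your existing join body.

```python
import json
import re

# Rule from this page: at most 48 characters, alphanumeric only.
# Restricting to ASCII is an assumption of this sketch.
NAME_RULE = re.compile(r"[A-Za-z0-9]{1,48}")


def mcp_join_fragment(name: str, url: str) -> dict:
    """Build the llm / advanced_features pieces that enable one MCP server."""
    if not NAME_RULE.fullmatch(name):
        raise ValueError(f"invalid MCP server name: {name!r}")
    if not url.startswith("https://"):
        raise ValueError("the MCP server must be reachable over HTTPS")
    return {
        "llm": {
            "mcp_servers": [
                {
                    "name": name,
                    "endpoint": url,  # hypothetical key name for the URL
                    "transport": "streamable_http",
                }
            ]
        },
        "advanced_features": {"enable_tools": True},
    }


fragment = mcp_join_fragment("mossKb1", "https://example.com/mcp")
print(json.dumps(fragment, indent=2))
```

Validating locally surfaces a bad name such as `moss-kb` as a `ValueError` before the request is made.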
Configuration
MossAgoraSearch
| Parameter | Type | Default | Description |
|---|---|---|---|
| project_id | str \| None | None | Your Moss Project ID. Read it from MOSS_PROJECT_ID and pass it in. |
| project_key | str \| None | None | Your Moss Project Key. Read it from MOSS_PROJECT_KEY and pass it in. |
| index_name | str | Required | The name of the Moss index to query. |
| top_k | int | 5 | Number of results to retrieve per query. |
| alpha | float | 0.8 | Hybrid search weighting. 0.0 = keyword only, 1.0 = semantic only. |
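To build intuition for alpha, the sketch below blends a semantic score and a keyword score by linear interpolation. This page does not specify how Moss combines the two internally, so the formula is an illustrative assumption that only matches the documented endpoints (0.0 = keyword only, 1.0 = semantic only); `hybrid_score` is a hypothetical helper, not part of the package.

```python
def hybrid_score(semantic: float, keyword: float, alpha: float = 0.8) -> float:
    """Blend two relevance scores; alpha is the weight on the semantic score."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    # Assumed linear interpolation between the two signals.
    return alpha * semantic + (1.0 - alpha) * keyword
```

With the default of 0.8, a document that matches semantically but shares few keywords still ranks well, while exact keyword hits receive a smaller boost.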
MossAgoraSearch.search() returns an AgoraSearchResult with documents: list[dict] ({"content": str, "similarity": float}) and time_taken_ms: int | None.
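The result shape described above is easy to turn into a compact text block for the LLM. The sketch below uses a stand-in dataclass that mirrors the documented fields rather than importing the package's class; `to_tool_text` and its `min_similarity` cutoff are hypothetical helpers added for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AgoraSearchResult:
    """Stand-in mirroring the documented fields; not the package's class."""

    documents: list[dict] = field(default_factory=list)
    time_taken_ms: Optional[int] = None


def to_tool_text(result: AgoraSearchResult, min_similarity: float = 0.0) -> str:
    """Render matching documents as numbered lines, best-effort."""
    kept = [d for d in result.documents if d["similarity"] >= min_similarity]
    if not kept:
        return "No matching knowledge base entries."
    lines = [
        f"[{i}] ({d['similarity']:.2f}) {d['content']}"
        for i, d in enumerate(kept, start=1)
    ]
    # time_taken_ms may be None, so only report it when present.
    if result.time_taken_ms is not None:
        lines.append(f"(retrieved in {result.time_taken_ms} ms)")
    return "\n".join(lines)
```

Filtering on similarity keeps weak matches out of the prompt, which matters in a voice call where the agent reads its answer aloud.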
create_mcp_app
| Argument | Type | Description |
|---|---|---|
| search | MossAgoraSearch | A configured adapter. The returned FastMCP app awaits search.load_index() in its lifespan before accepting tool calls. |
Returns a FastMCP instance exposing a single tool: search_knowledge_base(query: str). Exceptions from the adapter are surfaced to the LLM as MCP tool-errors.
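The two behaviors described here, preloading the index in the lifespan and reporting adapter exceptions as tool errors, can be illustrated with the standard library alone. The sketch below is not the package's implementation: `FakeSearch` is a hypothetical stand-in for a configured adapter, and the `isError` flag is loosely modeled on the MCP tool result shape.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeSearch:
    """Hypothetical stand-in for a configured MossAgoraSearch adapter."""

    def __init__(self) -> None:
        self.loaded = False

    async def load_index(self) -> None:
        # A real adapter would pull the index into memory here.
        self.loaded = True

    async def search(self, query: str) -> list[dict]:
        if not self.loaded:
            raise RuntimeError("index not loaded")
        if not query.strip():
            raise ValueError("query must not be empty")
        return [{"content": f"stub answer for {query!r}", "similarity": 1.0}]


@asynccontextmanager
async def lifespan(search: FakeSearch):
    # Preload once at startup so no tool call pays the load cost.
    await search.load_index()
    yield search


async def search_knowledge_base(search: FakeSearch, query: str) -> dict:
    # Turn adapter exceptions into an error payload instead of crashing.
    try:
        docs = await search.search(query)
    except Exception as exc:
        return {"isError": True, "text": str(exc)}
    return {"isError": False, "documents": docs}


async def demo() -> tuple[dict, dict]:
    search = FakeSearch()
    async with lifespan(search) as ready:
        ok = await search_knowledge_base(ready, "refund policy")
        bad = await search_knowledge_base(ready, "   ")
    return ok, bad


ok, bad = asyncio.run(demo())
```

Returning an error payload lets the LLM recover in conversation, for example by rephrasing the query, rather than ending the call on a server fault.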