Moss provides three packages for building voice agents: a Python SDK for agent logic, a Node.js SDK for generating session tokens in your web app's backend, and a CLI for deployment.

Agent SDK

moss-voice-agent-manager: Python package for building voice agents. Handles STT, LLM, and TTS provider configuration automatically.
pip install moss-voice-agent-manager

Basic Agent

from moss_voice_agent_manager import (
    Agent, MossAgentSession, MossConfig, JobContext, AutoSubscribe,
    WorkerOptions, WorkerType, RunContext, cli, llm,
)

class MyAgent(Agent):
    instructions = "You are a helpful assistant."

    @llm.function_tool
    async def lookup(self, context: RunContext, query: str) -> str:
        """Look up information."""
        return "results..."

async def entrypoint(ctx: JobContext):
    await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)

    session = MossAgentSession(userdata=None, ctx=ctx, max_tool_steps=10)
    await session.start(agent=MyAgent(), room=ctx.room)

def run():
    moss_config = MossConfig.from_platform()
    cli.run_app(WorkerOptions(
        entrypoint_fnc=entrypoint,
        ws_url=moss_config.platform_ws_url,
        api_key=moss_config.platform_api_key,
        api_secret=moss_config.platform_api_secret,
        agent_name=moss_config.voice_agent_name,
        worker_type=WorkerType.ROOM,
        prewarm_fnc=MossAgentSession.prewarm,
    ))

if __name__ == "__main__":
    run()

Multi-Agent Transfers

Define multiple agents and transfer between them using function tools:
from moss_voice_agent_manager import Agent, RunContext, llm

class GreeterAgent(Agent):
    instructions = "Welcome the user, then call transfer_to_support."

    def __init__(self):
        super().__init__(instructions=self.instructions, tools=[transfer_to_support])

class SupportAgent(Agent):
    instructions = "Help the user with their issue."

@llm.function_tool()
async def transfer_to_support(context: RunContext) -> Agent:
    """Hand off to the support agent."""
    # userdata.agents is a dict you populate at session start
    return context.userdata.agents["support"]
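
The transfer tool above reads context.userdata.agents, so the session needs a userdata object that exposes an agents mapping. A minimal sketch of such a container follows, assuming the session accepts an arbitrary object as userdata. SessionData and register are hypothetical names, not SDK exports, and placeholder strings stand in for real Agent instances so the snippet runs without the SDK installed.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SessionData:
    """Hypothetical userdata container; not part of the SDK."""

    # Maps a short name to an agent instance, e.g. "support" -> SupportAgent().
    agents: dict[str, Any] = field(default_factory=dict)

    def register(self, name: str, agent: Any) -> None:
        """Add an agent under a unique name, refusing accidental overwrites."""
        if name in self.agents:
            raise ValueError(f"agent {name!r} is already registered")
        self.agents[name] = agent


# Placeholders; in a real worker these would be GreeterAgent() and SupportAgent().
userdata = SessionData()
userdata.register("greeter", "greeter-agent-placeholder")
userdata.register("support", "support-agent-placeholder")

# This is the lookup that transfer_to_support performs through context.userdata.
print(userdata.agents["support"])
```

In the entrypoint, an object like this would be passed as userdata=userdata in place of userdata=None.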

TTS Customization

Override platform TTS defaults per session:
from moss_voice_agent_manager import MossAgentSession, SessionOptions, TTSOptions

session = MossAgentSession(
    userdata=None,
    options=SessionOptions(
        tts=TTSOptions(model="sonic-2", voice="custom-voice-id", language="en")
    ),
)

Key Exports

| Export | Description |
| --- | --- |
| MossAgentSession | Agent session with auto-configured providers |
| Agent | Base class for agent behavior and instructions |
| RunContext | Context passed to tool functions |
| llm.function_tool | Decorator for agent tools |
| SessionOptions / TTSOptions | TTS override options |
| MossConfig | Platform configuration |
| WorkerOptions / cli | Worker lifecycle management |

Session Transcripts

Store session transcripts so they can be downloaded later via moss-agent transcripts download. Call submit_session_report in a shutdown callback to capture the transcript when a session ends. Requires moss-voice-agent-manager>=1.0.0b14.
# Inside entrypoint(ctx), where ctx is the JobContext:
session = MossAgentSession(userdata=None, ctx=ctx, max_tool_steps=10)

async def on_shutdown():
    await session.submit_session_report(ctx, ctx.room.name)

ctx.add_shutdown_callback(on_shutdown)

Frontend SDK

@moss-tools/voice-server: Node.js package for generating session tokens from your backend. Use it in a Next.js, Express, or other server-side app.
npm install @moss-tools/voice-server

Next.js API Route

import { MossVoiceServer } from "@moss-tools/voice-server";
import { NextResponse } from "next/server";

let server: Awaited<ReturnType<typeof MossVoiceServer.create>> | null = null;

async function getServer() {
  if (!server) {
    server = await MossVoiceServer.create({
      projectId: process.env.MOSS_PROJECT_ID!,
      projectKey: process.env.MOSS_PROJECT_KEY!,
      voiceAgentId: process.env.MOSS_VOICE_AGENT_ID!,
    });
  }
  return server;
}

export async function POST() {
  const srv = await getServer();
  const roomName = `session-${Date.now()}`;
  const identity = `user-${Math.random().toString(36).slice(2, 8)}`;

  const token = await srv.createParticipantToken(
    { identity, name: "User" },
    roomName,
    srv.getAgentName()
  );

  return NextResponse.json({ token, serverUrl: srv.getServerUrl() });
}

API Reference

| Method | Description |
| --- | --- |
| MossVoiceServer.create(config) | Initialize with Moss credentials. Caches credentials. |
| server.createParticipantToken(userInfo, roomName, agentName?) | Generate a signed JWT (15-min TTL) for a participant. |
| server.getServerUrl() | Returns the WebSocket URL for client connections. |
| server.getAgentName() | Returns the configured agent name. |
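
Because participant tokens expire after 15 minutes, a long-lived client or test script may want to know when to request a fresh one. The sketch below reads the expiry from a token without verifying its signature. It assumes the token carries the standard exp claim defined for JWTs; token_expiry, needs_refresh, and the unsigned demo token are illustrative and not part of @moss-tools/voice-server.

```python
import base64
import json
import time
from typing import Optional


def token_expiry(token: str) -> int:
    """Return the exp claim (Unix seconds) from a JWT without verifying it."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload_b64 = parts[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    return int(claims["exp"])


def needs_refresh(token: str, margin_seconds: int = 60,
                  now: Optional[float] = None) -> bool:
    """Return True when the token expires within margin_seconds."""
    current = time.time() if now is None else now
    return token_expiry(token) - current <= margin_seconds


def _b64url(data: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment (demo helper)."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Unsigned demo token that mimics a 15-minute lifetime.
issued_at = 1_700_000_000
demo_token = ".".join([
    _b64url({"alg": "HS256", "typ": "JWT"}),
    _b64url({"iat": issued_at, "exp": issued_at + 15 * 60}),
    "signature-placeholder",
])

print(token_expiry(demo_token) - issued_at)            # 900
print(needs_refresh(demo_token, now=issued_at))        # False
print(needs_refresh(demo_token, now=issued_at + 870))  # True
```

Reading claims this way is only suitable for scheduling a refresh; it does not validate the token.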

Agent CLI

moss-agent-cli: Deploy and manage voice agents from the command line.
pip install moss-agent-cli

Deploy

cd your-agent-directory
moss-agent deploy
The CLI validates your agent, fetches deployment credentials, and builds and deploys it.

Stream Logs

moss-agent logs
moss-agent logs --log-type build

Transcripts

Requires moss-agent-cli>=0.3.0. List recent voice agent sessions:
moss-agent transcripts list
moss-agent transcripts list --period 3d
moss-agent transcripts list --from 2026-03-01 --to 2026-03-15
Download session transcripts:
moss-agent transcripts download
moss-agent transcripts download --session-id <session-id>
moss-agent transcripts download --output ./transcripts --format json
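
For scheduled exports, the download command can be assembled from a script. The sketch below only builds the argument list, using the flags shown above. Whether --session-id can be combined with --output and --format is an assumption, and transcripts_download_cmd is a hypothetical helper, not part of the CLI.

```python
import shlex
from typing import Optional


def transcripts_download_cmd(session_id: Optional[str] = None,
                             output: Optional[str] = None,
                             fmt: Optional[str] = None) -> list[str]:
    """Build the argv for moss-agent transcripts download."""
    cmd = ["moss-agent", "transcripts", "download"]
    if session_id:
        cmd += ["--session-id", session_id]
    if output:
        cmd += ["--output", output]
    if fmt:
        cmd += ["--format", fmt]
    return cmd


cmd = transcripts_download_cmd(output="./transcripts", fmt="json")
print(shlex.join(cmd))
# To execute it: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when a session ID or path contains unusual characters.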

Agent Directory Structure

my-agent/
├── agent.py           # Entry point (or main.py)
├── requirements.txt   # Must include moss-voice-agent-manager
└── .env               # Moss credentials
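
A quick local check of this layout can catch mistakes before running moss-agent deploy. The sketch below covers only the two requirements stated in the tree above (an entry point and the SDK in requirements.txt). check_agent_dir is a hypothetical helper and does not reproduce the CLI's own validation.

```python
import tempfile
from pathlib import Path

REQUIRED_PACKAGE = "moss-voice-agent-manager"
ENTRY_POINTS = ("agent.py", "main.py")


def check_agent_dir(root: Path) -> list[str]:
    """Return a list of problems found in an agent directory (empty if none)."""
    problems = []
    if not any((root / name).is_file() for name in ENTRY_POINTS):
        problems.append("missing entry point: agent.py or main.py")

    requirements = root / "requirements.txt"
    if not requirements.is_file():
        problems.append("missing requirements.txt")
    else:
        # Drop comments and whitespace, then match the package name at line start.
        lines = [ln.split("#", 1)[0].strip()
                 for ln in requirements.read_text().splitlines()]
        if not any(ln.lower().startswith(REQUIRED_PACKAGE) for ln in lines):
            problems.append(f"requirements.txt does not list {REQUIRED_PACKAGE}")
    return problems


# Demo in a throwaway directory so nothing on disk is touched.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "agent.py").write_text("# entry point\n")
    (root / "requirements.txt").write_text("requests\n")
    print(check_agent_dir(root))  # one problem: the SDK is not listed

    (root / "requirements.txt").write_text("moss-voice-agent-manager>=1.0.0b14\n")
    print(check_agent_dir(root))  # []
```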

Environment Variables

All three packages use the same credentials:
| Variable | Description |
| --- | --- |
| MOSS_PROJECT_ID | Your Moss project ID |
| MOSS_PROJECT_KEY | Your Moss project API key |
| MOSS_VOICE_AGENT_ID | The voice agent ID |
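
Failing fast when a credential is missing is usually easier to debug than a connection error later on. The sketch below checks the three variables listed above at startup; missing_moss_vars is a hypothetical helper, not part of any Moss package.

```python
import os

REQUIRED_VARS = ("MOSS_PROJECT_ID", "MOSS_PROJECT_KEY", "MOSS_VOICE_AGENT_ID")


def missing_moss_vars(env=None) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    source = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not source.get(name, "").strip()]


# Demo with an explicit mapping so the result does not depend on the real environment.
demo_env = {"MOSS_PROJECT_ID": "proj_123", "MOSS_PROJECT_KEY": ""}
print(missing_moss_vars(demo_env))  # ['MOSS_PROJECT_KEY', 'MOSS_VOICE_AGENT_ID']
```

Called with no argument, the function reads os.environ, so it can run as the first line of an agent's run() function and raise if the returned list is non-empty.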

Requirements

  • Agent SDK / CLI: Python 3.10+
  • Frontend SDK: Node.js 18+