Moss is the runtime for real-time semantic search in conversational apps. It delivers sub-10 ms lookups and instant index updates without extra infrastructure. It runs in the browser, on-device, or in the cloud - wherever your agent lives - so search feels native. Connect your data once; Moss packages, distributes, and keeps indexes fresh. Because the index lives next to your agent, retrieval is a function you call, not a service you query.
  • Sub-10 ms lookups with instant updates
  • Real-time sessions (Python, JavaScript, Swift, Elixir, C): index and query locally during a conversation, then sync to the cloud
  • No infra to run; local-first with optional sync
  • Browser, device, or cloud - same API

Use cases

  • Sub-10 ms answers for docs, FAQ, and in-app search
  • Grounding agents with your data without centralizing user info
  • Local or hybrid embeddings with minimal infrastructure

Capabilities

Real-Time Local Indexing

Index and query on-device in milliseconds, with no network round trip.

Live-Call Context

Combine short-term session context with long-term knowledge during a call.
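
As a rough illustration of blending session context with long-term knowledge, the sketch below ranks notes captured during a call alongside entries from a standing knowledge base. It is plain Python and does not use the Moss SDK: `score`, `search_with_session`, and the `session_boost` parameter are hypothetical names invented for this example, and the word-overlap score is only a stand-in for real semantic similarity.

```python
def score(query: str, text: str) -> float:
    """Hypothetical relevance score: fraction of query words found in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def search_with_session(query, session_notes, knowledge_base, top_k=3, session_boost=0.25):
    """Rank session notes and knowledge-base entries together.

    Session notes get a small boost (an assumption of this sketch) so that
    what was just said in the call outranks equally relevant static content.
    """
    candidates = []
    for text in session_notes:
        s = score(query, text)
        if s > 0:
            candidates.append((s + session_boost, "session", text))
    for text in knowledge_base:
        s = score(query, text)
        if s > 0:
            candidates.append((s, "knowledge", text))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return [(source, text) for _, source, text in candidates[:top_k]]


knowledge_base = [
    "refund policy allows returns within 30 days",
    "shipping takes 5 business days",
]
session_notes = ["caller wants a refund for order 1182"]

hits = search_with_session("refund order", session_notes, knowledge_base)
for source, text in hits:
    print(f"[{source}] {text}")
```

Keeping the two sources separate until ranking time means the session notes can be discarded or synced at the end of the call without touching the long-term data.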

Data Hydration & Sync

Hydrate from the cloud and stay fresh with zero-downtime hot-swaps.
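
One common way to get a zero-downtime hot-swap is to build the fresh index off to the side and then replace a single reference, so readers always see either the old snapshot or the new one. The sketch below shows that general pattern in plain Python; it is not Moss's implementation, and `HotSwappableIndex`, `lookup`, and `swap` are hypothetical names used only for illustration.

```python
import threading


class HotSwappableIndex:
    """Holds one immutable snapshot at a time; swapping replaces the reference."""

    def __init__(self, snapshot: dict):
        self._snapshot = snapshot
        self._lock = threading.Lock()  # serializes writers only

    def lookup(self, key):
        # Readers grab the current reference once and never take the lock,
        # so a swap in progress cannot block or corrupt a lookup.
        snapshot = self._snapshot
        return snapshot.get(key)

    def swap(self, new_snapshot: dict):
        # The new snapshot is fully built before this call; the swap itself
        # is a single reference assignment.
        with self._lock:
            self._snapshot = new_snapshot


live = HotSwappableIndex({"pricing": "v1 pricing text"})
before = live.lookup("pricing")

# Simulate hydrating a fresher copy (for example, one downloaded from the cloud).
fresh = {"pricing": "v2 pricing text", "refunds": "refund policy"}
live.swap(fresh)
after = live.lookup("pricing")

print(before, "->", after)
```

The design choice here is that snapshots are never mutated in place: an update is always "build new, then swap", which is what keeps queries available throughout.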

Cross-Agent Handoff

Carry full context across agents, channels, and devices.

SDKs

One model across every surface: JavaScript (Node), Python, Swift (iOS), Elixir, C, and an in-browser/WASM build.

Using Moss Portal

  • Sign up at Moss, confirm your email, and sign in
  • From the portal, click Create Index and copy your Project ID and Project Key for your SDK
  • Join our Discord to get onboarded: Moss Discord
Moss Portal walkthrough

Samples

  • View samples repo: moss on GitHub
  • JavaScript: javascript/comprehensive_sample.ts, javascript/load_and_query_sample.ts
  • Python: python/comprehensive_sample.py, python/load_and_query_sample.py
  • Adapt the samples by swapping the FAQ data for your own, or plug the Moss calls into your app

How it works (at a glance)

  • Index: Convert your data into an efficient local index
  • Embeddings: Generate vectors on-device (moss-minilm, moss-mediumlm) or supply your own
  • Sessions: Index and query locally in real time during a live interaction, then sync to the cloud
  • Retrieval: Load the index, then query in-memory with semantic or hybrid search
  • Storage: Persist indexes locally and optionally sync to cloud
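
To make the index, embed, and retrieve steps above concrete, here is a toy in-memory version in plain Python. It does not call the Moss SDK (see the samples repo for real usage): `ToyIndex` and `embed` are hypothetical names, and the bag-of-words "embedding" is a deliberately simple stand-in for an on-device model such as moss-minilm.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: lowercase token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyIndex:
    """Keeps documents and their vectors in memory and answers top-k queries."""

    def __init__(self):
        self.docs = {}  # doc_id -> (text, vector)

    def add(self, doc_id: str, text: str) -> None:
        # Index step: embed once at insert time, not at query time.
        self.docs[doc_id] = (text, embed(text))

    def query(self, text: str, top_k: int = 3):
        # Retrieval step: embed the query, score every document, keep the best.
        q = embed(text)
        scored = [(cosine(q, vec), doc_id) for doc_id, (_, vec) in self.docs.items()]
        scored.sort(reverse=True)
        return [(doc_id, round(s, 3)) for s, doc_id in scored[:top_k] if s > 0]


index = ToyIndex()
index.add("faq-1", "How do I reset my password?")
index.add("faq-2", "Where can I download my invoice?")
index.add("faq-3", "How do I change my email address?")

results = index.query("reset password", top_k=2)
print(results)
```

Because everything lives in the same process, a query is an ordinary function call with no network hop, which is the property the steps above describe. A real index would use dense vectors and an approximate nearest-neighbor structure rather than a linear scan.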

Next steps