- Sub-10 ms lookups with instant updates
- Real-time sessions (Python, JavaScript, Swift, Elixir, C): index and query locally during a conversation, then sync to the cloud
- No infra to run; local-first with optional sync
- Browser, device, or cloud: same API
Use cases
- Sub-10 ms answers for docs, FAQ, and in-app search
- Grounding agents with your data without centralizing user info
- Local or hybrid embeddings with minimal infrastructure
Capabilities
Real-Time Local Indexing
Index and query on-device in milliseconds, with no network round trip.
Live-Call Context
Combine short-term session context with long-term knowledge during a call.
Data Hydration & Sync
Hydrate from the cloud and stay fresh with zero-downtime hot-swaps.
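To make the hot-swap idea concrete, here is a minimal sketch of one common way to replace a live index without interrupting readers. It is not the Moss SDK: `HotSwapIndex`, `query`, and `swap` are hypothetical names, and a plain `dict` stands in for a real index.

```python
import threading


class HotSwapIndex:
    """Hypothetical holder that serves queries while the index is replaced."""

    def __init__(self, initial):
        self._current = initial          # the index readers see right now
        self._lock = threading.Lock()    # serializes writers only

    def query(self, key):
        # Take a reference once so a concurrent swap cannot change
        # the index halfway through this lookup.
        index = self._current
        return index.get(key)

    def swap(self, new_index):
        # Build new_index fully before calling swap; the swap itself is
        # just a reference rebind, so readers never wait on a rebuild.
        with self._lock:
            old, self._current = self._current, new_index
        return old


# Usage: a dict stands in for a hydrated index (an assumption for illustration).
live = HotSwapIndex({"greeting": "v1"})
before = live.query("greeting")
previous = live.swap({"greeting": "v2"})
after = live.query("greeting")
```

The design choice here is that the fresh index is built off to the side and only the final pointer change touches the live path, which is what keeps queries available during a refresh.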
Cross-Agent Handoff
Carry full context across agents, channels, and devices.
SDKs
One model across every surface: JavaScript (Node), Python, Swift (iOS), Elixir, C, and an in-browser/WASM build.
Using Moss Portal
- Sign up at Moss, confirm your email, and sign in
- From the portal, click Create Index and copy your Project ID and Project Key for your SDK
- Join our Discord to get onboarded: Moss Discord
Samples
- View samples repo: moss on GitHub
- JavaScript: javascript/comprehensive_sample.ts, javascript/load_and_query_sample.ts
- Python: python/comprehensive_sample.py, python/load_and_query_sample.py
- Adapt by swapping the FAQ data with your own, or plug Moss calls into your app
How it works (at a glance)
- Index: Convert your data into an efficient local index
- Embeddings: Generate vectors on-device (moss-minilm, moss-mediumlm) or supply your own
- Sessions: Index and query locally in real time during a live interaction, then sync to the cloud
- Retrieval: Load the index, then query in-memory with semantic or hybrid search
- Storage: Persist indexes locally and optionally sync to cloud
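The index, embed, and retrieve steps above can be sketched in a few lines of plain Python. This is an illustration of in-memory hybrid search in general, not the Moss API: `LocalIndex`, `embed`, `keyword_score`, and the `alpha` blend weight are all hypothetical, and the bag-of-words `embed` is a toy stand-in for a real embedding model.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split text into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Toy stand-in for an embedding model: a sparse token-count vector."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_score(query, doc):
    """Fraction of distinct query tokens that appear in the document."""
    query_tokens = set(tokenize(query))
    if not query_tokens:
        return 0.0
    return len(query_tokens & set(tokenize(doc))) / len(query_tokens)


class LocalIndex:
    """Hypothetical in-memory index: vectors are computed once at build time."""

    def __init__(self, docs):
        self.docs = list(docs)
        self.vectors = [embed(doc) for doc in self.docs]

    def query(self, text, top_k=3, alpha=0.5):
        # Hybrid score: alpha weights vector similarity, the rest keyword overlap.
        query_vec = embed(text)
        scored = [
            (alpha * cosine(query_vec, vec) + (1 - alpha) * keyword_score(text, doc), doc)
            for doc, vec in zip(self.docs, self.vectors)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]


# Usage with made-up FAQ entries.
faq = [
    "How do I reset my password?",
    "What is the refund policy?",
    "How to contact support",
]
index = LocalIndex(faq)
results = index.query("reset password", top_k=1)
```

Setting `alpha=1.0` in this sketch gives pure vector search and `alpha=0.0` gives pure keyword matching; values in between blend the two.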