Moss - Real-time Semantic Search for Conversational AI

Moss embeds documents on-device with a built-in model, so the text and the resulting vectors stay on the machine and embedding never goes over the network. You choose the model when you create the index.

How it works

Text stays on the device: it flows through a local embedding model, becomes vectors, and is stored in a local index.
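To make the flow above concrete, here is a minimal, self-contained sketch of the same shape: text goes into an embedding function, comes out as a vector, and lands in an in-memory index that can be searched. Everything in it is an assumption for illustration. `toyEmbed` is a hashed bag-of-words stand-in, not the model Moss ships, and `LocalIndex` is a hypothetical class, not part of the Moss SDK.

```typescript
// Toy stand-in for an on-device embedding model (NOT Moss's model):
// hashes each token into one of `dims` buckets, then L2-normalizes.
function toyEmbed(text: string, dims = 64): number[] {
  const vec = new Array<number>(dims).fill(0)
  for (const token of text.toLowerCase().split(/\W+/).filter(Boolean)) {
    let h = 0
    for (let i = 0; i < token.length; i++) {
      h = (h * 31 + token.charCodeAt(i)) >>> 0 // keep as unsigned 32-bit
    }
    const bucket = h % dims
    vec[bucket] = (vec[bucket] ?? 0) + 1
  }
  const norm = Math.sqrt(vec.reduce((s, x) => s + x * x, 0)) || 1
  return vec.map((x) => x / norm)
}

interface Entry {
  id: string
  vector: number[]
}

// Hypothetical in-memory index: stores vectors and ranks by cosine similarity.
class LocalIndex {
  private entries: Entry[] = []

  // Embed the text locally and store only the resulting vector.
  add(id: string, text: string): void {
    this.entries.push({ id, vector: toyEmbed(text) })
  }

  // Embed the query with the same function and return the closest entries.
  search(query: string, topK = 3): { id: string; score: number }[] {
    const q = toyEmbed(query)
    return this.entries
      .map((e) => ({
        id: e.id,
        // Vectors are unit length, so the dot product is the cosine similarity.
        score: e.vector.reduce((s, x, i) => s + x * (q[i] ?? 0), 0),
      }))
      .sort((a, b) => b.score - a.score)
      .slice(0, topK)
  }
}

const index = new LocalIndex()
index.add('a', 'reset your password from the settings page')
index.add('b', 'shipping takes three to five business days')
const hits = index.search('how do I reset my password', 1)
```

Nothing in this sketch touches the network, which is the property the pipeline above describes; a real embedding model would replace `toyEmbed` while the surrounding flow stays the same.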

Setup

Pick the on-device model at index creation: moss-minilm (fast, lightweight) or moss-mediumlm (higher accuracy). Moss then embeds your documents on-device with that model. If you’d rather supply precomputed vectors from your own pipeline, see Custom embeddings.

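import { MossClient } from '@moss-dev/moss'

// Credentials come from the environment; `docs` is your array of documents, defined elsewhere.
const client = new MossClient(process.env.MOSS_PROJECT_ID!, process.env.MOSS_PROJECT_KEY!)
// modelId selects the on-device embedding model: 'moss-minilm' or 'moss-mediumlm'.
await client.createIndex('local-embeddings', docs, { modelId: 'moss-minilm' })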
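If the model choice depends on a deployment setting rather than being hard-coded, one option is a small helper that maps a priority to one of the two model IDs named above. This is a sketch only: `pickModelId`, `indexOptions`, and the `MossModelId` type are hypothetical names, not part of the Moss SDK.

```typescript
// The two model IDs named in this section, as a literal union type
// so that a typo fails at compile time. (Hypothetical type name.)
type MossModelId = 'moss-minilm' | 'moss-mediumlm'

// Hypothetical helper: trade speed for accuracy based on a single setting.
function pickModelId(priority: 'speed' | 'accuracy'): MossModelId {
  return priority === 'accuracy' ? 'moss-mediumlm' : 'moss-minilm'
}

// Builds the options object in the shape used by the createIndex call above.
function indexOptions(priority: 'speed' | 'accuracy'): { modelId: MossModelId } {
  return { modelId: pickModelId(priority) }
}
```

The result can be passed as the third argument, for example `client.createIndex('local-embeddings', docs, indexOptions('accuracy'))`.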

Tips

Batch inputs for speed
Cache vectors for unchanged content
Use hybrid retrieval for best relevance
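The first two tips can be sketched generically: split inputs into fixed-size batches, and keep a fingerprint of each document's text so unchanged documents are skipped on the next run. The code below is an illustration under stated assumptions, not Moss's implementation. The `Doc` shape, `toBatches`, `fingerprint`, and `VectorCache` are all hypothetical names, and how they map onto Moss's own APIs is not specified here. The fingerprint uses 32-bit FNV-1a for brevity; a stronger hash such as SHA-256 would be a safer choice in practice.

```typescript
// Hypothetical document shape for this sketch.
interface Doc {
  id: string
  text: string
}

// Split items into consecutive batches of at most `size` elements.
function toBatches<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error('batch size must be at least 1')
  const out: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size))
  }
  return out
}

// 32-bit FNV-1a hash of the text, returned as a hex string.
function fingerprint(text: string): string {
  let h = 0x811c9dc5
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i)
    h = Math.imul(h, 0x01000193) >>> 0
  }
  return h.toString(16)
}

// Remembers the vector and content fingerprint last stored for each id.
class VectorCache {
  private store = new Map<string, { hash: string; vector: number[] }>()

  // Documents that are new, or whose text changed since the last put().
  stale(docs: Doc[]): Doc[] {
    return docs.filter((d) => this.store.get(d.id)?.hash !== fingerprint(d.text))
  }

  // Record the vector computed for this document's current text.
  put(doc: Doc, vector: number[]): void {
    this.store.set(doc.id, { hash: fingerprint(doc.text), vector })
  }

  // Cached vector for an id, if any.
  get(id: string): number[] | undefined {
    return this.store.get(id)?.vector
  }
}
```

On each run, only the documents returned by `stale` need to be embedded again, and `toBatches` groups them so they are processed a batch at a time rather than one by one.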

Related

Custom embeddings: Bring your own precomputed vectors.

Sub-10ms knowledge retrieval: The local retrieval pipeline.
