How it works
Setup
Pick the on-device model at index creation: moss-minilm (fast, lightweight) or moss-mediumlm (higher accuracy). Moss embeds your documents on-device with the model you choose. If you’d rather supply precomputed vectors from your own pipeline, see Custom embeddings.
Tips
- Batch inputs for speed
- Cache vectors for unchanged content
- Use hybrid retrieval for best relevance
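The first two tips can be combined when you compute vectors in your own pipeline. The sketch below is illustrative only and is not part of the Moss API: embed_batch is a hypothetical stand-in for whatever embedding function you use, and the cache is a plain in-memory dict keyed by a SHA-256 hash of each text. Only texts whose content is new or changed are embedded, and they are sent in batches.

```python
import hashlib
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def content_key(text: str) -> str:
    """Stable cache key: identical text always maps to the same key."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def embed_with_cache(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[Vector]],  # hypothetical embedder
    cache: Dict[str, Vector],
    batch_size: int = 32,
) -> List[Vector]:
    """Return one vector per input text, embedding only uncached content."""
    # Collect each uncached text once, preserving first-seen order.
    missing = []
    seen = set()
    for text in texts:
        key = content_key(text)
        if key not in cache and key not in seen:
            seen.add(key)
            missing.append((key, text))

    # Embed the uncached texts in batches and store the results.
    for start in range(0, len(missing), batch_size):
        chunk = missing[start:start + batch_size]
        vectors = embed_batch([text for _, text in chunk])
        for (key, _), vector in zip(chunk, vectors):
            cache[key] = vector

    # Every input now has a cached vector; return them in input order.
    return [cache[content_key(text)] for text in texts]
```

Pass the same cache on later runs so that only new or edited documents are re-embedded. To keep the cache across processes, the dict could be swapped for any persistent key-value store.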
Related
Custom embeddings
Bring your own precomputed vectors.
Sub-10ms knowledge retrieval
The local retrieval pipeline.