moss) brings semantic search to Python. It wraps a high-performance
Rust core and exposes an async API. Documents are embedded and queried locally, with optional
cloud sync.
Requirements
- Python 3.10 or higher
Install
Two ways to search
- MossClient - the entry point. Manage cloud indexes, load one into memory, and query it.
- SessionIndex - a local, in-process index for real-time indexing during a live interaction; push to the cloud when done.
Quick start
Indexes
Create, inspect, and delete cloud indexes. Mutations run as async jobs and return a MutationResult with a job_id and doc_count.
Documents
Add, update, fetch, and remove documents on an existing index.
Load and query
Load an index into memory, then query it in-process. Call load_index before querying.
Hybrid search
Blend semantic and keyword scoring with alpha (1.0 = semantic, 0.0 = keyword; default 0.8).
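To make the role of alpha concrete, here is a minimal sketch of a linear blend. It is only an illustration of the weighting idea: the function name `blend_scores` is hypothetical, and the text above does not say which formula Moss uses internally.

```python
def blend_scores(semantic: float, keyword: float, alpha: float = 0.8) -> float:
    """Illustrative linear blend (assumption, not the SDK's documented formula).

    alpha weights the semantic score and (1 - alpha) weights the keyword score,
    so alpha=1.0 is purely semantic and alpha=0.0 is purely keyword.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0.0 and 1.0")
    return alpha * semantic + (1.0 - alpha) * keyword


# A document that matches keywords strongly but is semantically weak
# gains rank as alpha moves toward 0.0.
semantic_only = blend_scores(semantic=0.9, keyword=0.1, alpha=1.0)
keyword_only = blend_scores(semantic=0.9, keyword=0.1, alpha=0.0)
default_mix = blend_scores(semantic=0.5, keyword=1.0)
```

With the default of 0.8, keyword matches nudge the ranking without dominating it.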
Metadata filtering
Narrow results by document metadata on a loaded index.
Supported operators: $eq, $ne, $gt, $gte, $lt, $lte, $in, $nin, $near, composed with
$and / $or. See Metadata filtering.
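The semantics of these operators can be sketched with a small stand-alone evaluator. This is an assumption-laden illustration: it uses a Mongo-style filter shape (`{"field": {"$op": value}}`), which may differ from the syntax the SDK actually accepts, the helper `matches` is hypothetical, and `$near` is left out.

```python
from typing import Any

# Comparison operators; a missing field (None) never satisfies an ordering test.
_COMPARATORS = {
    "$eq": lambda v, t: v == t,
    "$ne": lambda v, t: v != t,
    "$gt": lambda v, t: v is not None and v > t,
    "$gte": lambda v, t: v is not None and v >= t,
    "$lt": lambda v, t: v is not None and v < t,
    "$lte": lambda v, t: v is not None and v <= t,
    "$in": lambda v, t: v in t,
    "$nin": lambda v, t: v not in t,
}


def matches(metadata: dict[str, Any], flt: dict[str, Any]) -> bool:
    """Return True if the metadata satisfies every clause in the filter."""
    for key, cond in flt.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in cond):
                return False
        elif key == "$or":
            if not any(matches(metadata, sub) for sub in cond):
                return False
        else:
            value = metadata.get(key)
            for op, target in cond.items():
                if not _COMPARATORS[op](value, target):
                    return False
    return True


doc = {"category": "faq", "price": 12}
cheap_or_faq = {"$or": [{"price": {"$lt": 5}}, {"category": {"$in": ["faq", "blog"]}}]}
is_match = matches(doc, cheap_or_faq)
```

Here `is_match` is true because the second branch of `$or` is satisfied even though the price test fails.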
Custom embeddings
Supply your own vectors with model_id="custom" (each document carries embedding, and
queries pass embedding).
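As a rough picture of what "documents carry embedding, queries pass embedding" means, the sketch below ranks documents by cosine similarity against a query vector. Plain dicts stand in for the SDK's document type, the vectors are made up for illustration, and cosine similarity is an assumption about the metric rather than something stated above.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Each document carries its own precomputed vector (toy 3-dimensional values).
docs = [
    {"id": "a", "text": "refund policy", "embedding": [1.0, 0.0, 0.0]},
    {"id": "b", "text": "shipping times", "embedding": [0.0, 1.0, 0.0]},
]

# The query supplies a vector from the same embedding space.
query_embedding = [0.9, 0.1, 0.0]

ranked = sorted(
    docs,
    key=lambda d: cosine(d["embedding"], query_embedding),
    reverse=True,
)
```

The point to take away is that document and query vectors must come from the same embedding model and have the same dimension.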
Multi-index search
Query several loaded indexes in one call; each result is tagged with its source index_name.
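Conceptually, a multi-index query amounts to gathering hits from each index, tagging them, and ranking them together. The sketch below shows that idea in plain Python; the helper `merge_results` and the `id`/`score` keys are hypothetical, and the way Moss merges results internally is not described here.

```python
def merge_results(per_index: dict[str, list[dict]], top_k: int = 5) -> list[dict]:
    """Tag each hit with its source index, then keep the top_k by score."""
    merged = [
        {**hit, "index_name": name}
        for name, hits in per_index.items()
        for hit in hits
    ]
    merged.sort(key=lambda h: h["score"], reverse=True)
    return merged[:top_k]


# Hypothetical per-index hits, as if two loaded indexes had been queried.
hits = {
    "faq": [{"id": "f1", "score": 0.9}, {"id": "f2", "score": 0.4}],
    "docs": [{"id": "d1", "score": 0.7}],
}
top = merge_results(hits, top_k=2)
```

The index_name tag lets the caller trace each result back to the index it came from after the lists are interleaved.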
Sessions
Index and query locally in real time with a SessionIndex, then
push to the cloud. session() resumes an existing cloud index by name, or starts empty.
Keeping indexes fresh
Auto-refresh a loaded index (poll the cloud and hot-swap newer versions automatically), and track async jobs.
Models
- moss-minilm (default) - fast, lightweight
- moss-mediumlm - higher accuracy
- custom - supply your own embedding vectors via DocumentInfo.embedding