Skip to main content
Moss supports three retrieval strategies, all over the same query call:
  • Vector similarity (semantic) - matches on meaning
  • Keyword / BM25 - matches on exact terms
  • Hybrid - blends both, tuned with alpha
An index must be loaded before you query it. Call load_index() (or open a session) first; queries then run entirely in-memory (~1-10 ms).

Basic query

import { MossClient } from '@moss-dev/moss'
const client = new MossClient(process.env.MOSS_PROJECT_ID!, process.env.MOSS_PROJECT_KEY!)
await client.loadIndex('my-index')
const results = await client.query('my-index', 'getting started latency', { topK: 5 })

Go deeper

Hybrid search

Blend semantic and keyword scoring with alpha.

Metadata filtering

Narrow results by document metadata.

Custom embeddings

Bring your own query and document vectors.

Multi-index search

Query several loaded indexes in one call.

Tuning

  • Adjust topK / top_k and score thresholds
  • Layer metadata filters to narrow candidate sets
  • Group queries by intent (returns, billing, onboarding) and tune per index
  • Choose model per index: moss-minilm (fast) or moss-mediumlm (more accurate)