
Strategies

  • Vector similarity (semantic)
  • Keyword/BM25
  • Hybrid (best of both)

// Create a client from the project credentials and load the index.
import { MossClient } from '@inferedge/moss'
const client = new MossClient(process.env.MOSS_PROJECT_ID!, process.env.MOSS_PROJECT_KEY!)
await client.loadIndex('my-index')
// Fetch the five best matches for the query.
const byVector = await client.query('my-index', 'getting started latency', { topK: 5 })

Hybrid weighting (alpha)

  • alpha = 1.0: pure semantic (embeddings)
  • alpha = 0.0: pure keyword
  • Values between 0 and 1 blend the two; the default is semantic-heavy (around 0.8)

// No alpha is passed here, so the default weighting applies.
const results = await client.query('my-index', 'return policy', { topK: 3 })
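
To make the effect of alpha concrete, here is a minimal sketch of a linear blend between a semantic score and a keyword score. It illustrates the idea only and is not Moss's internal scoring: the `ScoredDoc` shape, `blendScore`, and `rankHybrid` are hypothetical names, and both scores are assumed to be normalized to the range 0 to 1.

```typescript
// Illustrative only: a linear blend of two scores, not Moss's internal code.
interface ScoredDoc {
  id: string
  semantic: number // embedding similarity, assumed normalized to [0, 1]
  keyword: number // keyword/BM25 score, assumed normalized to [0, 1]
}

// alpha = 1.0 keeps only the semantic score; alpha = 0.0 keeps only the keyword score.
function blendScore(semantic: number, keyword: number, alpha: number): number {
  if (alpha < 0 || alpha > 1) {
    throw new RangeError('alpha must be between 0 and 1')
  }
  return alpha * semantic + (1 - alpha) * keyword
}

// Return document ids ordered by blended score, highest first.
function rankHybrid(docs: ScoredDoc[], alpha: number): string[] {
  return docs
    .map((d) => ({ id: d.id, score: blendScore(d.semantic, d.keyword, alpha) }))
    .sort((a, b) => b.score - a.score)
    .map((d) => d.id)
}

const docs: ScoredDoc[] = [
  { id: 'paraphrase', semantic: 0.9, keyword: 0.1 },
  { id: 'exact-term', semantic: 0.3, keyword: 0.95 },
]

const semanticFirst = rankHybrid(docs, 1.0)
const keywordFirst = rankHybrid(docs, 0.0)
```

With these sample scores, a document that paraphrases the query ranks first at alpha = 1.0, while a document that repeats the exact query terms ranks first at alpha = 0.0.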

Reranking

Apply a reranker to reorder the top-k results and improve precision.
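
As a rough sketch of what reranking does, the snippet below rescores a list of retrieved candidates with a second scoring function and keeps the best few. Everything here is hypothetical: the `Candidate` shape, the `rerank` helper, and the toy `termOverlap` scorer, which only stands in for whatever reranking model is actually used.

```typescript
// Illustrative only: rescore retrieved candidates, then keep the best few.
interface Candidate {
  id: string
  text: string
}

type Scorer = (query: string, text: string) => number

// Toy scorer (a stand-in for a real reranking model):
// the fraction of query terms that appear in the candidate text.
const termOverlap: Scorer = (query, text) => {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean)
  if (terms.length === 0) return 0
  const words = text.toLowerCase().split(/\W+/).filter(Boolean)
  const hits = terms.filter((t) => words.indexOf(t) !== -1).length
  return hits / terms.length
}

// Rescore every candidate, sort by the new score, and return the top `keep`.
function rerank(query: string, candidates: Candidate[], scorer: Scorer, keep: number): Candidate[] {
  return candidates
    .map((c) => ({ c, score: scorer(query, c.text) }))
    .sort((a, b) => b.score - a.score)
    .map((x) => x.c)
    .slice(0, keep)
}

const retrieved: Candidate[] = [
  { id: 'a', text: 'Shipping times vary by region' },
  { id: 'b', text: 'Our return policy allows refunds within 30 days' },
  { id: 'c', text: 'Return shipping labels' },
]

const reranked = rerank('return policy', retrieved, termOverlap, 2)
```

A common pattern is to retrieve more candidates than needed and let the reranker choose the final few, since the second pass only sees what the first pass returned.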

Tuning

  • Adjust k (topK) and score thresholds
  • Use metadata filters
  • Group queries by intent (e.g., returns, billing, onboarding) and tune per index
  • Choose model per index: moss-minilm (fast) or moss-mediumlm (more accurate)
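
The first two tuning knobs above can be sketched as a small post-processing step over query results. This is a client-side illustration under stated assumptions, not the Moss API: the `Hit` shape (with `score` and `metadata` fields), `SelectOptions`, and `selectHits` are hypothetical names.

```typescript
// Illustrative only: apply a metadata filter, a score threshold, and a k cutoff.
interface Hit {
  id: string
  score: number
  metadata: Record<string, string>
}

interface SelectOptions {
  k: number // maximum number of hits to keep
  minScore: number // drop hits scoring below this threshold
  where?: Record<string, string> // exact-match metadata filter
}

// A hit matches when every key in `where` has the same value in its metadata.
function matches(hit: Hit, where: Record<string, string>): boolean {
  return Object.keys(where).every((key) => hit.metadata[key] === where[key])
}

function selectHits(hits: Hit[], opts: SelectOptions): Hit[] {
  const where = opts.where || {}
  return hits
    .filter((h) => h.score >= opts.minScore && matches(h, where))
    .sort((a, b) => b.score - a.score)
    .slice(0, opts.k)
}

const hits: Hit[] = [
  { id: 'a', score: 0.9, metadata: { intent: 'returns' } },
  { id: 'b', score: 0.4, metadata: { intent: 'returns' } },
  { id: 'c', score: 0.8, metadata: { intent: 'billing' } },
  { id: 'd', score: 0.7, metadata: { intent: 'returns' } },
]

const selected = selectHits(hits, { k: 5, minScore: 0.5, where: { intent: 'returns' } })
```

Here hit `b` is dropped by the score threshold and hit `c` by the metadata filter, leaving `a` and `d`. Raising `minScore` trades recall for precision, so it is worth checking against real queries for each intent group.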