Skip to main content
2026-02-26
v1.0.0-beta.17
  • Metadata Filtering: query() now accepts an optional filter dict to narrow results by document metadata on locally loaded indexes
    • Comparison operators: $eq, $ne, $gt, $gte, $lt, $lte
    • Set operators: $in, $nin
    • Composable with $and / $or for complex predicates (supports arbitrary nesting)
    • Numeric coercion: int and float filter values are automatically converted to strings for consistent matching
  • Geo-distance filtering: new $near operator filters documents by haversine distance from a "lat,lng,radiusMeters" value
  • When filter is passed to query() but the index is not loaded locally, a warning is logged and the filter is skipped (cloud query API does not yet support filtering)
  • Updated inferedge-moss-core dependency to 0.6.0
2026-02-23
v1.0.0-beta.16
  • Bumped inferedge-moss-core dependency to 0.5.0 to support session index telemetry and push_index improvements
2026-02-17
v1.0.0-beta.15
  • All index mutations and reads now go through the Rust ManageClient, replacing the Python HTTP layer
  • Index creation uses an async bulk pipeline: binary upload → server-side build → poll until completion
  • load_index supports both V1 and V2 binary formats, with cloud query fallback when index isn’t loaded locally
  • New return type MutationResult (with job_id, index_name, doc_count) for create_index, add_docs, delete_docs
  • get_docs takes doc_ids directly instead of wrapping in GetDocumentsOptions
2026-02-06
v1.0.0-beta.14
  • Query latency reduced from ~2,300ms to ~10ms for 100K vectors
  • Optimized search pipeline reducing memory allocations
  • Significantly reduced memory overhead for large indexes (100K+ documents) in the context of hybrid search (keyword + semantic)
  • Enhanced performance across all index sizes
2026-02-02
v1.0.0-beta.13
  • Hot Reload & Auto-Refresh: Indexes can now automatically detect and reload when updated in the cloud.
    • load_index() now accepts optional auto_refresh and polling_interval_in_seconds parameters
    • When auto_refresh is enabled, the SDK polls for updates at the configured interval (default: 600 seconds)
    • To stop auto-refresh, call load_index() again without the auto_refresh option
  • load_index() now allows reloading an already-loaded index (previously threw an error)
  • Index management now uses Rust core for improved performance and reliability
2026-01-30
v1.0.0-beta.12
  • Adds partial support for Python 3.14 by disabling local embedding service functionality. Full support coming soon.
2026-01-29
v1.0.0-beta.11
  • Adds support for user-supplied embeddings.
  • query() now automatically falls back to the cloud API when the index is not loaded locally, enabling queries without requiring load_index() first.
  • Adds better scoring evaluation for search results
2026-01-26
v1.0.0-beta.10
  • Removes the ‘<2’ upper bound on numpy dependency.
2026-01-14
v1.0.0-beta.9
  • Drops support for Python 3.9 and below.
  • Bug fix: Keyword search now functions correctly after load_index().
  • New service endpoint with significant infrastructure upgrades. Management operations are now ~3× faster across most real-world use cases, providing faster index operations while also supporting larger payloads.
2025-12-15
v1.0.0-beta.8
  • Updates inferedge-moss-core dependency to version 0.2.3 for new ARM64 wheel support.
2025-12-01
v1.0.0-beta.7
Adds IntelliSense support in all the IDEs
2025-11-29
v1.0.0-beta.6
Adds support for keyword search and alpha blending between keyword and semantic search.
2025-10-23
v1.0.0-beta.5
Removes Pipecat integration and MossContextRetriever from the SDK. Will be offered as a pipecat extension instead soon.
2025-10-09
v1.0.0-beta.4
Performance improvements for query() calls.
2025-10-09
v1.0.0-beta.3
  • MossContextRetriever: Added Pipecat integration for real-time voice AI applications
    • Automatically enhances LLM conversations with semantic search results from Moss indexes
    • Seamless integration with OpenAI LLM context frames
2025-09-14
v1.0.0-beta.1
Initial release of inferedge-moss with core features:
  • Semantic search using transformer-based embeddings
  • Lightweight embedding models for edge computing; supports proprietary “moss-minilm” model
  • API key validation with secure host access
  • Cloudflare CDN support for fast model loading
  • Multi-index support for isolated search spaces
  • Add, update, and remove items across indexes
  • Query interface with configurable result count
  • Performance metrics tracking