Local-first does not mean disconnected. Moss keeps on-device indexes and the cloud in step in both directions: it hydrates a local index with existing data when you open it, and syncs changes afterward without ever restarting your service. The naive way to refresh an in-memory index is to restart - which drops in-flight requests and forces downtime. Moss avoids that entirely.

Hydration: cloud to device

When you open a session or load an index, Moss downloads the index binary and deserializes it into memory, so the agent starts warm instead of empty. No re-embedding happens; the prebuilt vectors are loaded directly.
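import os

from moss import MossClient

# Credentials are read from the environment here; the variable names are an assumption.
MOSS_PROJECT_ID = os.environ["MOSS_PROJECT_ID"]
MOSS_PROJECT_KEY = os.environ["MOSS_PROJECT_KEY"]

client = MossClient(MOSS_PROJECT_ID, MOSS_PROJECT_KEY)

# Load an index and hydrate it for querying...
await client.load_index("support-faqs")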

# ...or hydrate a session by name (auto-loads the cloud index if it exists).
session = await client.session(index_name="call-123")
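To see why hydration needs no re-embedding, here is a minimal standard-library sketch. The binary layout (a small header followed by packed float32 vectors) and the names `serialize_index` and `hydrate_index` are assumptions made for illustration; they are not Moss's actual file format or API.

```python
import struct

def serialize_index(vectors: dict[str, list[float]]) -> bytes:
    """Pack document ids and prebuilt vectors into one blob (hypothetical layout)."""
    dim = len(next(iter(vectors.values()))) if vectors else 0
    parts = [struct.pack("<II", len(vectors), dim)]  # header: count, dimension
    for doc_id, vec in vectors.items():
        raw_id = doc_id.encode("utf-8")
        parts.append(struct.pack("<H", len(raw_id)))  # id length prefix
        parts.append(raw_id)
        parts.append(struct.pack(f"<{dim}f", *vec))   # float32 vector
    return b"".join(parts)

def hydrate_index(blob: bytes) -> dict[str, list[float]]:
    """Rebuild the in-memory index by reading vectors straight out of the blob."""
    count, dim = struct.unpack_from("<II", blob, 0)
    offset = 8
    index: dict[str, list[float]] = {}
    for _ in range(count):
        (id_len,) = struct.unpack_from("<H", blob, offset)
        offset += 2
        doc_id = blob[offset:offset + id_len].decode("utf-8")
        offset += id_len
        index[doc_id] = list(struct.unpack_from(f"<{dim}f", blob, offset))
        offset += 4 * dim
    return index

# The embedding model is never called: vectors are read back as stored.
blob = serialize_index({"faq-1": [0.5, 0.25], "faq-2": [1.0, -2.0]})
index = hydrate_index(blob)
```

The cost of hydration in this sketch is one pass over the bytes, which is why a warm start can be much cheaper than embedding every document again.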

Two layers of freshness

Once an index is live, freshness has two independent handles - one for getting changes into the index, and one for propagating those changes to running agents.

Layer 1: mutations

add_docs and delete_docs are the write API. They are asynchronous: you submit the change, and Moss rebuilds the index server-side without touching your live service. A mutation moves through statuses (pending_upload, uploading, building, completed); while building, it reports finer-grained phases such as generating_embeddings and building_index. Once completed, the new version is available in the cloud.
from moss import DocumentInfo, MutationOptions

await client.add_docs(
    "product-catalog",
    [DocumentInfo(id="item-9182", text="Wireless headphones, midnight blue, 40-hour battery.",
                  metadata={"category": "audio", "in_stock": "true"})],
    MutationOptions(upsert=True),
)
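Because mutations are asynchronous, a caller that needs to know when a change has landed has to wait for the `completed` status. The sketch below shows one way to do that, assuming a hypothetical `fetch_status` coroutine that returns the current status string; how Moss actually exposes mutation status is not covered in this section, so a scripted stand-in is used.

```python
import asyncio
from typing import Awaitable, Callable

# Statuses named in the text above, in the order a mutation moves through them.
STATUSES = ("pending_upload", "uploading", "building", "completed")

async def wait_for_completion(
    fetch_status: Callable[[], Awaitable[str]],  # hypothetical status getter
    poll_seconds: float = 1.0,
    timeout_seconds: float = 300.0,
) -> list[str]:
    """Poll until the mutation reports 'completed'; return the distinct statuses seen."""
    seen: list[str] = []
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout_seconds
    while True:
        status = await fetch_status()
        if status not in STATUSES:
            raise ValueError(f"unexpected status: {status!r}")
        if not seen or seen[-1] != status:
            seen.append(status)
        if status == "completed":
            return seen
        if loop.time() >= deadline:
            raise TimeoutError(f"mutation still {status!r} after {timeout_seconds}s")
        await asyncio.sleep(poll_seconds)

def scripted(statuses):
    """Stand-in for a real status call: replays a fixed sequence."""
    it = iter(statuses)
    async def fetch() -> str:
        return next(it)
    return fetch

history = asyncio.run(wait_for_completion(
    scripted(["pending_upload", "uploading", "building", "building", "completed"]),
    poll_seconds=0.0,
))
```

The live service is never involved in this loop; it keeps serving the previous index version until the new one is picked up.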

Layer 2: runtime hot-swap

Load an index with auto_refresh and Moss keeps the in-memory copy current automatically:
await client.load_index("product-catalog", auto_refresh=True, polling_interval_in_seconds=120)
At each polling interval, Moss checks the cloud for a newer version. If one exists, it downloads in the background while the current version keeps serving queries; when the download finishes, the index is hot-swapped atomically. In-flight queries finish against the old version, and new queries immediately see the updated one. No restart, no dropped requests, no coordination.
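The hot-swap behavior can be illustrated with a small snapshot-swap sketch. This is not Moss's implementation; `HotSwapIndex` and its methods are hypothetical names showing the general technique of replacing a whole immutable snapshot in one step.

```python
import threading

class HotSwapIndex:
    """Holds the live index snapshot; readers never see a half-updated index."""

    def __init__(self, version: int, docs: dict[str, str]):
        self._lock = threading.Lock()
        self._snapshot = (version, docs)  # replaced wholesale, never mutated in place

    def acquire(self) -> tuple[int, dict[str, str]]:
        # A query takes one snapshot and uses it for its whole lifetime.
        with self._lock:
            return self._snapshot

    def swap(self, version: int, docs: dict[str, str]) -> None:
        # Called once the new version is fully downloaded and built in memory.
        with self._lock:
            self._snapshot = (version, docs)

live = HotSwapIndex(1, {"item-1": "old description"})

in_flight = live.acquire()                    # query starts against version 1
live.swap(2, {"item-1": "new description"})   # background download finished
new_query = live.acquire()                    # query that starts after the swap
```

The in-flight query keeps its reference to version 1 until it finishes, while every query started after `swap` sees version 2.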

Tuning the refresh interval

polling_interval_in_seconds controls how often Moss checks for a new version - not how fast a build completes. Match it to how quickly your data changes:
| Data change frequency | Suggested interval | Notes |
| --- | --- | --- |
| Near-real-time (live inventory, pricing) | 30-60 s | Frequent polls; size your build time accordingly |
| Regular updates (daily policy changes) | 300-600 s | The default of 600 s fits here |
| Infrequent (quarterly docs, stable FAQs) | 1800 s+ | Low overhead; practically always fresh |
To force an immediate update after a known critical change, reload the index explicitly. The call returns only after the latest version has been downloaded and installed:
# Force an immediate refresh after a known critical update.
await client.load_index("compliance-rules")

Independent per-index refresh

Each index manages its own refresh lifecycle on its own timer, so you can tune staleness tolerance per knowledge domain. A slow rebuild on one index never blocks queries or refreshes on another.
await client.load_index("live-inventory",  auto_refresh=True, polling_interval_in_seconds=30)
await client.load_index("support-policies", auto_refresh=True, polling_interval_in_seconds=600)
await client.load_index("legal-archive",    auto_refresh=True, polling_interval_in_seconds=3600)
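The independence of per-index timers can be sketched with one `asyncio` task per index. The loop below is a simulation with hypothetical names and timings (`refresh_loop`, `download_seconds`), not Moss's scheduler; it only shows that a slow refresh on one index does not delay another.

```python
import asyncio

async def refresh_loop(name: str, interval: float, download_seconds: float,
                       log: list[str], rounds: int) -> None:
    """One refresh loop per index; each sleeps on its own timer."""
    for _ in range(rounds):
        await asyncio.sleep(interval)          # wait for this index's next poll
        await asyncio.sleep(download_seconds)  # simulated background download
        log.append(name)                       # refresh finished for this index

async def main() -> list[str]:
    log: list[str] = []
    await asyncio.gather(
        refresh_loop("fast", interval=0.01, download_seconds=0.0, log=log, rounds=3),
        refresh_loop("slow", interval=0.01, download_seconds=0.2, log=log, rounds=1),
    )
    return log

log = asyncio.run(main())
```

The fast index completes all three refreshes while the slow index is still downloading its first one.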

Persist: device to cloud

A session accumulates context locally during an interaction. push_index() syncs it back to the cloud - at the end, or at checkpoints - so it survives the session and any agent can resume it (see Cross-agent context & omni-channel handoff).
await session.push_index()
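One way to combine checkpoints with a final push is sketched below. `FakeSession` is a stand-in that only mimics `push_index()`, and `run_call` and `checkpoint_every` are hypothetical names; the pattern is what matters, not the specific numbers.

```python
import asyncio

class FakeSession:
    """Stand-in for a session object; only mimics push_index() for this sketch."""

    def __init__(self) -> None:
        self.turns: list[str] = []
        self.pushed_at: list[int] = []  # number of turns persisted at each push

    async def push_index(self) -> None:
        self.pushed_at.append(len(self.turns))

async def run_call(session: FakeSession, utterances: list[str],
                   checkpoint_every: int = 3) -> None:
    try:
        for i, text in enumerate(utterances, start=1):
            session.turns.append(text)
            if i % checkpoint_every == 0:
                await session.push_index()  # periodic checkpoint
    finally:
        await session.push_index()          # final push, even if the call errors out

session = FakeSession()
asyncio.run(run_call(session, ["hi", "order status?", "#4521", "thanks", "bye"]))
```

Putting the final push in `finally` means context gathered before an unexpected error still reaches the cloud, at the cost of an occasional redundant push when the last turn also lands on a checkpoint.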

The operational guarantee

With auto_refresh enabled, the index converges to the latest version within one polling interval after the build completes (plus the time the background download takes), with zero service interruption. You update your documents; the service updates itself; users get current answers and never know a refresh happened.
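The guarantee translates into simple arithmetic for end-to-end staleness. The helpers below are a rough sketch with hypothetical names; they assume the worst case is a build that finishes just after a poll, and that build and download times are known estimates.

```python
def worst_case_staleness(build_s: float, polling_interval_s: float,
                         download_s: float = 0.0) -> float:
    """Upper bound, in seconds, from submitting a change to serving it."""
    # The build must finish, then at most one full interval passes before the
    # next poll notices it, then the new version has to be downloaded.
    return build_s + polling_interval_s + download_s

def max_polling_interval(target_s: float, build_s: float,
                         download_s: float = 0.0) -> float:
    """Largest polling interval that still meets a staleness target."""
    budget = target_s - build_s - download_s
    if budget <= 0:
        raise ValueError("target is unreachable: build and download alone exceed it")
    return budget

default_bound = worst_case_staleness(build_s=45, polling_interval_s=600)
with_download = worst_case_staleness(build_s=45, polling_interval_s=600, download_s=5)
interval_for_two_minutes = max_polling_interval(target_s=120, build_s=45, download_s=5)
```

With these example numbers, the default 600 s interval gives a bound of 645 s (650 s including download), and a two-minute freshness target would need an interval of 70 s or less.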
