

Overview

createIndexFromFiles / create_index_from_files lets you index raw files directly. The server handles parsing, chunking, and embedding, so you don’t need to extract text or split documents yourself. Use it instead of createIndex when your source content is files (PDFs, documents) rather than pre-extracted text strings.

Supported formats

Format | MIME type
PDF | application/pdf
Additional formats are planned. For unsupported types, extract text yourself and use createIndex.
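For an unsupported format, that fallback looks roughly like this: run your own extractor, split the text into chunks, and hand the strings to createIndex. A minimal sketch only — the chunker is deliberately naive, and the createIndex call shape (name, array of text strings, options) is an assumption based on the overview above, not a confirmed signature:

```typescript
// Naive fixed-size chunker; swap in whatever splitting strategy suits your content.
function chunkText(text: string, maxChars = 1000): string[] {
  const chunks: string[] = []
  for (let i = 0; i < text.length; i += maxChars) {
    chunks.push(text.slice(i, i + maxChars))
  }
  return chunks
}

// Hypothetical fallback for an unsupported format. The caller supplies text it
// extracted itself; the createIndex signature here is an assumption, not
// documented on this page.
async function indexExtractedText(
  client: { createIndex: (name: string, texts: string[], opts: object) => Promise<unknown> },
  indexName: string,
  extractedText: string
) {
  return client.createIndex(indexName, chunkText(extractedText), { modelId: 'moss-minilm' })
}
```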

Usage

import { MossClient } from '@moss-dev/moss'

const client = new MossClient(
  process.env.MOSS_PROJECT_ID!,
  process.env.MOSS_PROJECT_KEY!
)

const result = await client.createIndexFromFiles('knowledge-base', [
  { name: 'report.pdf',  contentType: 'application/pdf', path: '/docs/report.pdf' },
  { name: 'manual.pdf',  contentType: 'application/pdf', path: '/docs/manual.pdf' },
], { modelId: 'moss-minilm' })

console.log(`Index ready - ${result.docCount} chunks created`)

Query the index

Once created, use the index exactly like any other, with loadIndex followed by query:

await client.loadIndex('knowledge-base')
const results = await client.query('knowledge-base', 'refund policy', { topK: 5 })
results.docs.forEach(doc => console.log(doc.text, doc.score))

Each file is described by a ParseFileInput; see the JavaScript or Python reference for the full field list.

Options

Option | JS | Python | Default | Notes
Model | modelId | model_id | "moss-minilm" | "moss-minilm" or "moss-mediumlm" only; "custom" is not supported
Progress | onProgress | - | - | JS only: callback fired roughly every 2s during processing

The "custom" model is not supported because the parse pipeline generates embeddings server-side. If you need to supply your own embedding vectors, extract the text first and use createIndex.
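The shape of the onProgress payload is not documented on this page, so the { processed, total } interface below is an assumption. The sketch keeps the formatting logic in a pure function, separate from the callback, so it is easy to reuse and test:

```typescript
// Hypothetical progress payload -- the real onProgress argument shape is not
// specified here, so { processed, total } is an assumption.
interface Progress {
  processed: number
  total: number
}

// Pure formatter, reusable in a CLI spinner or a log line.
function formatProgress(p: Progress): string {
  const pct = p.total > 0 ? Math.round((p.processed / p.total) * 100) : 0
  return `${p.processed}/${p.total} files (${pct}%)`
}

// Usage sketch, wiring the formatter into the documented onProgress option:
// await client.createIndexFromFiles('knowledge-base', files, {
//   modelId: 'moss-minilm',
//   onProgress: (p) => console.log(formatProgress(p)),
// })
```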

Limits

  • Maximum 20 files per call
  • Each ParseFileInput must provide at least one of path or data. If both are given, data takes precedence.
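The limits above can be checked client-side before uploading. A minimal sketch; the ParseFileInput type here is reduced to the fields mentioned on this page (name, contentType, path, data), so consult the JavaScript or Python reference for the authoritative shape:

```typescript
// Reduced ParseFileInput: only the fields this page mentions.
interface ParseFileInput {
  name: string
  contentType: string
  path?: string
  data?: Uint8Array
}

const MAX_FILES_PER_CALL = 20

// Enforces the documented limits: at most 20 files per call, and each entry
// must provide at least one of path or data.
function validateFiles(files: ParseFileInput[]): string[] {
  const errors: string[] = []
  if (files.length > MAX_FILES_PER_CALL) {
    errors.push(`too many files: ${files.length} > ${MAX_FILES_PER_CALL}`)
  }
  for (const f of files) {
    if (f.path === undefined && f.data === undefined) {
      errors.push(`${f.name}: provide at least one of path or data`)
    }
  }
  return errors
}
```

Running this before the createIndexFromFiles call surfaces limit violations locally instead of as a server-side error after the upload.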