BM25

Defined in: packages/core/src/vector-store/bm25.ts:24

A simple BM25 (Best Matching 25) implementation for in-memory search.

BM25 is a bag-of-words retrieval function that ranks documents by how often the query terms appear in each document. It refines TF-IDF by saturating the contribution of repeated terms and by normalizing for document length.

Key parameters:

  • k1 (default 1.5): Controls term frequency saturation. Higher values let repeated occurrences of a term keep raising the score for longer before the contribution levels off.
  • b (default 0.75): Controls document length normalization. 0 = no normalization, 1 = full normalization relative to the average document length.

const bm25 = new BM25(documents);
const results = bm25.search("search query", 10);
// Returns top 10 documents with their BM25 scores
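
To make the roles of k1 and b concrete, here is a minimal sketch of the standard BM25 score for a single query term. It is not taken from bm25.ts: the function name bm25TermScore, its argument names, and the choice of a non-negative (Lucene-style) IDF are assumptions made for illustration, and the library's own IDF variant and tokenization may differ.

```typescript
// Hypothetical helper (not part of the library): BM25 contribution of one
// query term to one document's score.
function bm25TermScore(
  tf: number,           // occurrences of the term in the document
  docLength: number,    // length of the document, in tokens
  avgDocLength: number, // average document length across the index
  docCount: number,     // total number of documents in the index
  docFreq: number,      // number of documents containing the term
  k1 = 1.5,
  b = 0.75,
): number {
  // Assumed IDF variant: always non-negative, rarer terms score higher.
  const idf = Math.log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
  // b = 0 gives lengthNorm = 1 (length ignored); b = 1 scales fully by
  // the document's length relative to the average.
  const lengthNorm = 1 - b + b * (docLength / avgDocLength);
  // The fraction approaches (k1 + 1) as tf grows, so the score saturates.
  return (idf * (tf * (k1 + 1))) / (tf + k1 * lengthNorm);
}
```

A document's full score for a query would be the sum of this value over the query terms. With b = 0 two documents with the same term frequency score identically regardless of length; with the default b = 0.75 the shorter one scores higher.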

new BM25(nodes, options?): BM25

Defined in: packages/core/src/vector-store/bm25.ts:41

Creates a new BM25 index from the given nodes.

Parameters:

  • nodes (BaseNode<Metadata>[]): Array of nodes to index
  • options (optional): Optional BM25 parameters
  • options.k1 (number): Term frequency saturation parameter (default: 1.5)
  • options.b (number): Document length normalization parameter (default: 0.75)

Returns: BM25

search(query, topK): object[]

Defined in: packages/core/src/vector-store/bm25.ts:96

Searches the index for documents matching the query.

Parameters:

  • query (string): The search query string
  • topK (number): Maximum number of results to return

Returns: object[]

Array of document IDs with their BM25 scores, sorted by score descending
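
The ordering and truncation described here can be sketched as a sort followed by a slice. The ScoredDoc shape and its field names id and score are hypothetical, since this page only documents the return type as object[]; the helper topK is likewise an illustration and not the library's code.

```typescript
// Hypothetical result shape; the real field names are not documented here.
interface ScoredDoc {
  id: string;
  score: number;
}

// Return at most k entries, highest score first, without mutating the input.
function topK(scored: ScoredDoc[], k: number): ScoredDoc[] {
  return [...scored]
    .sort((a, b) => b.score - a.score)
    .slice(0, Math.max(0, k));
}
```

If fewer than k documents are available, the result is simply shorter than k.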


getDocumentCount(): number

Defined in: packages/core/src/vector-store/bm25.ts:130

Returns the number of documents in the index.

Returns: number


getAverageDocumentLength(): number

Defined in: packages/core/src/vector-store/bm25.ts:137

Returns the average document length in the index.

Returns: number