BM25
Defined in: packages/core/src/vector-store/bm25.ts:24
A simple BM25 (Best Matching 25) implementation for in-memory search.
BM25 is a bag-of-words retrieval function that ranks documents based on the query terms appearing in each document. It improves on TF-IDF by saturating term frequency and normalizing for document length.
Key parameters:
- k1 (default 1.5): Controls term frequency saturation. Higher values let repeated occurrences of a term keep raising the score; lower values make the score level off sooner.
- b (default 0.75): Controls document length normalization. 0 = no normalization, 1 = full normalization.
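To make the roles of k1 and b concrete, here is a minimal sketch of the per-term score in the commonly cited Okapi BM25 form. The helper name `bm25TermScore` and its argument list are hypothetical and are not part of this class; the library's internal formula may differ in detail. The sketch assumes `avgDocLen` is greater than zero.

```typescript
// Hypothetical helper, not part of the documented BM25 class.
// Scores one query term against one document using a common Okapi BM25 form.
function bm25TermScore(
  tf: number,        // occurrences of the term in the document
  docLen: number,    // document length in tokens
  avgDocLen: number, // average document length across the index (assumed > 0)
  idf: number,       // inverse document frequency of the term
  k1: number = 1.5,
  b: number = 0.75,
): number {
  // Length normalization: b = 0 ignores length, b = 1 applies it fully.
  const norm = 1 - b + b * (docLen / avgDocLen);
  // Saturation: as tf grows, the score approaches idf * (k1 + 1) but never reaches it.
  return (idf * tf * (k1 + 1)) / (tf + k1 * norm);
}
```

With b set to 0 the document length drops out of the score entirely, and with the defaults a document longer than average scores lower than a shorter one with the same term count.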
Example
const bm25 = new BM25(documents);
const results = bm25.search("search query", 10);
// Returns top 10 documents with their BM25 scores
Constructors
Constructor
new BM25(nodes, options?): BM25
Defined in: packages/core/src/vector-store/bm25.ts:41
Creates a new BM25 index from the given nodes.
Parameters
nodes
Array of nodes to index
options?
Optional BM25 parameters
k1: number
Term frequency saturation parameter (default: 1.5)
b: number
Document length normalization parameter (default: 0.75)
Returns
BM25
Methods
search()
search(query, topK): object[]
Defined in: packages/core/src/vector-store/bm25.ts:96
Searches the index for documents matching the query.
Parameters
query: string
The search query string
topK: number
Maximum number of results to return
Returns
object[]
Array of document IDs with their BM25 scores, sorted by score descending
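The ranking behavior described above can be illustrated with a small standalone sketch. This is not the library's implementation: the `Doc` and `Scored` shapes, the `tokenize` helper, the `bm25Search` function, and the particular IDF variant are all assumptions chosen for illustration. It shows the general flow of tokenizing, scoring each document against the query terms, sorting by score descending, and truncating to `topK`.

```typescript
// Illustrative only: a tiny in-memory BM25 ranker, not the library's code.
type Doc = { id: string; text: string };     // hypothetical document shape
type Scored = { id: string; score: number }; // assumed result shape

// Assumed tokenizer: lowercase, split on non-alphanumeric characters.
const tokenize = (text: string): string[] =>
  text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);

function bm25Search(
  docs: Doc[],
  query: string,
  topK: number,
  k1: number = 1.5,
  b: number = 0.75,
): Scored[] {
  const tokenized = docs.map((d) => tokenize(d.text));
  const n = docs.length;
  const totalLen = tokenized.reduce((sum, t) => sum + t.length, 0);
  const avgLen = n > 0 ? totalLen / n : 0;
  // Deduplicate query terms so each one contributes once.
  const queryTerms = tokenize(query).filter((t, i, all) => all.indexOf(t) === i);

  const results: Scored[] = [];
  docs.forEach((doc, i) => {
    const tokens = tokenized[i];
    let score = 0;
    queryTerms.forEach((term) => {
      const tf = tokens.filter((t) => t === term).length;
      if (tf === 0) return; // term absent from this document
      // Document frequency: how many documents contain the term.
      const df = tokenized.filter((ts) => ts.indexOf(term) !== -1).length;
      // One common IDF variant; the 1 inside the log keeps it non-negative.
      const idf = Math.log(1 + (n - df + 0.5) / (df + 0.5));
      const norm = 1 - b + b * (tokens.length / avgLen);
      score += (idf * tf * (k1 + 1)) / (tf + k1 * norm);
    });
    // Documents matching no query term are omitted from the results.
    if (score > 0) results.push({ id: doc.id, score });
  });

  // Highest score first, then keep at most topK entries.
  return results.sort((x, y) => y.score - x.score).slice(0, topK);
}
```

Recomputing document frequency per term keeps the sketch short; a real index would typically precompute term statistics once at construction time.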
getDocumentCount()
getDocumentCount(): number
Defined in: packages/core/src/vector-store/bm25.ts:130
Returns the number of documents in the index.
Returns
number
getAverageDocumentLength()
getAverageDocumentLength(): number
Defined in: packages/core/src/vector-store/bm25.ts:137
Returns the average document length in the index.
Returns
number
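The two statistics reported by getDocumentCount() and getAverageDocumentLength() are the inputs to length normalization. As a rough sketch of how such values could be derived, the hypothetical helper below assumes that document length means the number of whitespace-separated tokens; the source does not state the unit the library actually uses.

```typescript
// Hypothetical helper, not part of the documented BM25 class.
// Assumes "length" is the token count after splitting on whitespace.
function indexStats(texts: string[]): { count: number; averageLength: number } {
  const lengths = texts.map(
    (t) => t.split(/\s+/).filter((w) => w.length > 0).length,
  );
  const total = lengths.reduce((sum, len) => sum + len, 0);
  // Guard against division by zero for an empty index.
  const averageLength = texts.length > 0 ? total / texts.length : 0;
  return { count: texts.length, averageLength };
}
```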