Vector Stores

Vector stores save the embedding vectors of your ingested document chunks.

Search Modes

Most vector stores in this library support the following search modes:

Default: Standard vector similarity search using embeddings.
BM25: Keyword search using the BM25 algorithm.
Hybrid: A combination of vector similarity search and BM25 keyword search.
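
To make the Default mode concrete, here is a minimal brute-force sketch of vector similarity search. It is an illustration only, not the library's implementation: the helper names `cosineSimilarity` and `topKBySimilarity` and the toy two-dimensional embeddings are assumptions made for this example, and real vector stores typically use an index rather than scanning every vector.

```typescript
// Illustrative only: hypothetical helpers, not part of the library API.

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as "no similarity".
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored embedding against the query and keep the k best.
function topKBySimilarity(
  query: number[],
  embeddings: number[][],
  k: number,
): { index: number; score: number }[] {
  return embeddings
    .map((embedding, index) => ({ index, score: cosineSimilarity(query, embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy data: three chunk embeddings and one query embedding.
const chunkEmbeddings = [[1, 0], [0.6, 0.8], [0, 1]];
const hits = topKBySimilarity([1, 0], chunkEmbeddings, 2);
console.log(hits);
```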

Scoring Formats

The score returned for each node depends on the search mode used:

Default (Vector): Scores are based on cosine similarity and range from 0.0 to 1.0, where 1.0 indicates a perfect semantic match.
BM25: Scores are calculated using the BM25 algorithm. These scores are unbounded (can be greater than 1.0) and represent keyword relevance. Higher scores indicate better matches.
Hybrid: Scores are calculated using Reciprocal Rank Fusion (RRF). RRF scores are typically very small (e.g., a rank 1 result has a score of $\approx 0.0164$, or $1.64\%$). This method is robust because it considers only the rank position from each sub-search, not the raw similarity or BM25 scores.
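
The rank-only nature of RRF can be sketched in a few lines. This is a minimal illustration, not the library's implementation: the helper `reciprocalRankFusion` is hypothetical, and the constant `k = 60` is an assumption (it is the commonly used default, and it is consistent with the figure above, since 1 / (60 + 1) ≈ 0.0164 for a rank 1 result in a single sub-search). The sketch does not model the `alpha` weighting.

```typescript
// Illustrative only: hypothetical helper, not part of the library API.
// Each input list holds node IDs ordered best-first by one sub-search.
function reciprocalRankFusion(lists: string[][], k = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      // Only the rank contributes; raw similarity or BM25 scores are ignored.
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return scores;
}

// Toy rankings from a vector sub-search and a BM25 sub-search.
const vectorRanking = ["a", "b", "c"];
const bm25Ranking = ["b", "d", "a"];

const fused = reciprocalRankFusion([vectorRanking, bm25Ranking]);
const fusedOrder = Array.from(fused.entries())
  .sort((x, y) => y[1] - x[1])
  .map(([id]) => id);
console.log(fusedOrder);
```

Nodes that appear in both rankings ("a" and "b" here) accumulate two contributions, so they rise above nodes found by only one sub-search.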

You can specify the search mode in the query method of the vector store:

const result = await vectorStore.query({
  queryStr: "your keyword query",
  queryEmbedding: [0.1, 0.2, ...],
  mode: "hybrid",
  similarityTopK: 5,
  alpha: 0.5, // Importance of vector search vs keyword search (0.0 to 1.0)
});

Native vs Fallback Support

Where possible, this library uses the native hybrid and BM25 implementations of the underlying vector store (e.g., Weaviate, Elasticsearch, MongoDB Atlas, PostgreSQL). If a vector store does not support these modes natively, a fallback implementation is used where possible.
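
For intuition about what a non-native BM25 fallback has to compute, here is a minimal in-memory sketch of the standard BM25 formula. It is not the library's fallback code: the helpers `tokenize` and `bm25Scores`, the naive tokenizer, and the parameter defaults `k1 = 1.2` and `b = 0.75` are all assumptions chosen for this example.

```typescript
// Illustrative only: hypothetical helpers, not part of the library API.

// Naive tokenizer: lowercase and split on non-word characters.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/\W+/).filter((t) => t.length > 0);
}

// Returns one BM25 score per document, in the same order as `docs`.
function bm25Scores(query: string, docs: string[], k1 = 1.2, b = 0.75): number[] {
  const tokenized = docs.map(tokenize);
  const totalLength = tokenized.reduce((sum, d) => sum + d.length, 0);
  const avgLength = totalLength / Math.max(tokenized.length, 1);
  const queryTerms = Array.from(new Set(tokenize(query)));

  return tokenized.map((doc) => {
    let score = 0;
    for (const term of queryTerms) {
      const tf = doc.filter((t) => t === term).length; // term frequency
      if (tf === 0) continue;
      // Number of documents that contain the term at least once.
      const docFreq = tokenized.filter((d) => d.some((t) => t === term)).length;
      const idf = Math.log(1 + (tokenized.length - docFreq + 0.5) / (docFreq + 0.5));
      const lengthNorm = 1 - b + b * (doc.length / avgLength);
      score += (idf * tf * (k1 + 1)) / (tf + k1 * lengthNorm);
    }
    return score;
  });
}

const sampleDocs = [
  "vector stores save embeddings",
  "keyword search with bm25",
  "bananas are yellow",
];
const keywordScores = bm25Scores("bm25 keyword search", sampleDocs);
console.log(keywordScores);
```

In this toy corpus only the second document contains any query term, and its score is greater than 1.0, which illustrates why BM25 scores should be treated as unbounded.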

Available Vector Stores

Available vector stores are listed in the sidebar on the left. Additionally, the following integrations exist without separate documentation:

SimpleVectorStore: A simple in-memory vector store with optional persistence to disk.
AstraDBVectorStore: A cloud-native, scalable Database-as-a-Service built on Apache Cassandra, see datastax.com
ChromaVectorStore: An open-source vector database, focused on ease of use and performance, see trychroma.com
MilvusVectorStore: An open-source, high-performance, highly scalable vector database, see milvus.io
MongoDBAtlasVectorSearch: A cloud-based vector search solution for MongoDB, see mongodb.com
PGVectorStore: An open-source vector store built on PostgreSQL, see pgvector GitHub
PineconeVectorStore: A managed, cloud-native vector database, see pinecone.io
WeaviateVectorStore: An open-source, AI-native vector database, see weaviate.io

Check the vectorstores GitHub for the most up-to-date overview of integrations.