Vercel
This guide demonstrates how to build an agentic RAG (Retrieval-Augmented Generation) application using vectorstores with Vercel’s AI SDK. The example shows how to create a vector store index from documents and use it as a tool within Vercel’s streaming text generation.
Overview
The agentic RAG example combines:
- vectorstores for document indexing and retrieval
- Vercel AI SDK for streaming text generation with tool calling
- OpenAI as the LLM provider
The application allows the LLM to autonomously query your knowledge base by providing it with a retrieval tool, enabling multi-step reasoning and information gathering.
Complete Example
Here’s the full example code:
```typescript
import { openai } from "@ai-sdk/openai";
import {
  Document,
  formatLLM,
  Settings,
  VectorStoreIndex,
} from "@vectorstores/core";
import { stepCountIs, streamText, tool } from "ai";
import fs from "node:fs/promises";
import { fileURLToPath } from "node:url";
import { OpenAI } from "openai";
import { z } from "zod";

async function main() {
  // Ensure OpenAI API key is available
  if (!process.env.OPENAI_API_KEY) {
    console.error("Error: OpenAI API key not found in environment variables.");
    return;
  }

  // Configure OpenAI embeddings with vectorstores
  const openaiClient = new OpenAI();
  Settings.embedFunc = async (input) => {
    const { data } = await openaiClient.embeddings.create({
      model: "text-embedding-3-small",
      input,
    });
    return data.map((d) => d.embedding);
  };

  const filePath = fileURLToPath(
    new URL("../shared/data/abramov.txt", import.meta.url),
  );
  const essay = await fs.readFile(filePath, "utf-8");
  const document = new Document({ text: essay, id_: filePath });

  const index = await VectorStoreIndex.fromDocuments([document]);
  console.log("Successfully created index");

  const retriever = index.asRetriever();
  const result = streamText({
    model: openai("gpt-5.1-mini"),
    prompt: "Cost of moving cat from Russia to UK?",
    tools: {
      queryTool: tool({
        description:
          "get information from your knowledge base to answer questions.",
        inputSchema: z.object({
          query: z
            .string()
            .describe("The query to get information about your documents."),
        }),
        execute: async ({ query }) => {
          return (
            formatLLM(await retriever.retrieve({ query })) ||
            "No result found in documents"
          );
        },
      }),
    },
    stopWhen: stepCountIs(5),
  });

  for await (const textPart of result.textStream) {
    process.stdout.write(textPart);
  }
}

main().catch(console.error);
```

Step-by-Step Explanation
1. Setup and Configuration
The example starts by ensuring the OpenAI API key is available and configuring the embedding model:
```typescript
// Ensure OpenAI API key is available
if (!process.env.OPENAI_API_KEY) {
  console.error("OpenAI API key not found in environment variables.");
  return;
}

// Configure OpenAI embeddings
const openaiClient = new OpenAI();
Settings.embedFunc = async (input) => {
  const { data } = await openaiClient.embeddings.create({
    model: "text-embedding-3-small",
    input,
  });
  return data.map((d) => d.embedding);
};
```
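`Settings.embedFunc` is what vectorstores calls whenever it needs vectors. As a quick sanity check you can invoke it directly; a minimal sketch, assuming it accepts an array of strings and returns one embedding per string:

```typescript
// Illustrative sanity check: N input strings should yield N embedding vectors.
const vectors = await Settings.embedFunc(["hello world"]);
console.log(vectors.length, vectors[0].length); // 1 vector, 1536 dims for text-embedding-3-small
```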
2. Loading and Indexing Documents

A document is loaded from a file and indexed:
```typescript
const filePath = fileURLToPath(
  new URL("../shared/data/abramov.txt", import.meta.url),
);
const essay = await fs.readFile(filePath, "utf-8");
const document = new Document({ text: essay, id_: filePath });

const index = await VectorStoreIndex.fromDocuments([document]);
```

- The document is read from the filesystem
- A `Document` object is created with the text content and a unique ID
- `VectorStoreIndex.fromDocuments()` creates a searchable vector index from the document
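Since `VectorStoreIndex.fromDocuments()` takes an array, the same pattern extends to multiple sources; a short sketch (the file paths are hypothetical):

```typescript
// Illustrative: build one index over several files. Paths are placeholders.
const paths = ["./data/essay-1.txt", "./data/essay-2.txt"];
const documents = await Promise.all(
  paths.map(async (p) => {
    const text = await fs.readFile(p, "utf-8");
    return new Document({ text, id_: p });
  }),
);
const multiIndex = await VectorStoreIndex.fromDocuments(documents);
```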
3. Creating a Retriever
A retriever is created from the index to enable querying:
```typescript
const retriever = index.asRetriever();
```

The retriever can search the indexed documents and return relevant chunks based on semantic similarity.
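To see what the retriever returns, you can also call it directly, outside the agent loop; a quick sketch using the same `retrieve()` and `formatLLM()` calls as the example (the query string is just an illustration):

```typescript
// Illustrative: query the index directly and render the hits as plain text.
const chunks = await retriever.retrieve({ query: "cost of moving a cat" });
console.log(formatLLM(chunks)); // retrieved chunks formatted for an LLM
```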
4. Defining the Query Tool
A tool is defined that allows the LLM to query the knowledge base:
```typescript
queryTool: tool({
  description:
    "get information from your knowledge base to answer questions.",
  inputSchema: z.object({
    query: z
      .string()
      .describe("The query to get information about your documents."),
  }),
  execute: async ({ query }) => {
    return (
      formatLLM(await retriever.retrieve({ query })) ||
      "No result found in documents"
    );
  },
}),
```

Key components:
- `description`: Tells the LLM when and how to use this tool
- `inputSchema`: Defines the tool’s input parameters using Zod
- `execute`: The function that runs when the tool is called:
  - Retrieves relevant document chunks using the retriever
  - Formats the results using `formatLLM()` for LLM consumption
  - Returns a fallback message if no results are found
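Because `tools` is a plain object, the agent is not limited to one tool. A hedged sketch adding a second, hypothetical `todayTool` alongside `queryTool`, using the same `tool()` helper:

```typescript
// Illustrative: tools is a plain object, so extra tools can sit alongside
// queryTool. todayTool is hypothetical, not part of the original example.
tools: {
  queryTool,
  todayTool: tool({
    description: "Get today's date for time-sensitive questions.",
    inputSchema: z.object({}),
    execute: async () => new Date().toDateString(),
  }),
},
```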
5. Streaming Text Generation
The `streamText` function generates responses with tool-calling capabilities:
```typescript
const result = streamText({
  model: openai("gpt-5.1-mini"),
  prompt: "Cost of moving cat from Russia to UK?",
  tools: { queryTool },
  stopWhen: stepCountIs(5),
});
```

- `model`: Selects the OpenAI model via Vercel’s AI SDK (here `gpt-5.1-mini`, matching the complete example above)
- `prompt`: The user’s question
- `tools`: Makes the query tool available to the LLM
- `stopWhen: stepCountIs(5)`: Limits the agent to 5 reasoning steps to prevent infinite loops
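While developing, it helps to see when the agent actually calls the tool. The AI SDK exposes an `onStepFinish` callback for this; a hedged sketch (field names follow AI SDK v5, where tool-call arguments are exposed as `input`):

```typescript
const result = streamText({
  model: openai("gpt-5.1-mini"),
  prompt: "Cost of moving cat from Russia to UK?",
  tools: { queryTool },
  stopWhen: stepCountIs(5),
  // Log each completed step so tool activity is visible during development.
  onStepFinish: (step) => {
    for (const call of step.toolCalls) {
      console.log(`tool call: ${call.toolName}`, call.input);
    }
  },
});
```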
6. Streaming the Response
The response is streamed to the console:
```typescript
for await (const textPart of result.textStream) {
  process.stdout.write(textPart);
}
```

This allows the user to see the response as it’s generated, providing a better user experience.
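If incremental output isn’t needed (for example, in a batch script), the stream result also exposes the complete answer as a promise; a small sketch using the AI SDK’s `result.text`:

```typescript
// Illustrative: resolve the full answer once generation finishes,
// instead of writing chunks as they arrive.
const fullText = await result.text;
console.log(fullText);
```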
How It Works
- The LLM receives the user’s question
- It decides whether to use the `queryTool` to search the knowledge base
- If it calls the tool, the retriever searches the indexed documents
- The retrieved information is formatted and returned to the LLM
- The LLM uses this information to generate a response
- The process can repeat for multi-step reasoning (up to 5 steps)
- The final response is streamed to the user
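You can observe this loop on a real run by inspecting the finished steps after the stream has been consumed; a hedged sketch, assuming the AI SDK’s `result.steps` promise of per-step results:

```typescript
// Illustrative: after the for-await loop completes, inspect which tools
// were called in each reasoning step.
const steps = await result.steps;
steps.forEach((step, i) => {
  console.log(`step ${i + 1}:`, step.toolCalls.map((c) => c.toolName));
});
```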
Key Benefits
- Autonomous Information Retrieval: The LLM decides when to query the knowledge base
- Multi-step Reasoning: Can perform multiple queries to gather comprehensive information
- Streaming Responses: Provides real-time feedback to users
- Flexible Tool Usage: The LLM uses tools only when needed
Next Steps
- Experiment with different retrieval strategies and tool configurations to improve the agent’s performance
- Try using different Vercel AI model providers (see the sketch below)
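Switching providers is typically a one-line change to the `model` option; a hedged sketch using the `@ai-sdk/anthropic` package (the model id shown is illustrative):

```typescript
import { anthropic } from "@ai-sdk/anthropic";

// Only the model changes; the prompt, tools, and stop condition stay the same.
const result = streamText({
  model: anthropic("claude-sonnet-4-5"),
  prompt: "Cost of moving cat from Russia to UK?",
  tools: { queryTool },
  stopWhen: stepCountIs(5),
});
```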