If you’ve been following this series, you’ll know we’ve been incrementally building out the Iron Mind AI personal trainer agent.
We’ve added function tools, agents as function tools, human-in-the-loop checkpoints, contextual memory, background responses, MCP tool support, and email marketing capabilities.
In this post, we take a different direction and look at how to give your AI agent access to your own documents using Retrieval Augmented Generation (RAG).
Instead of relying solely on the LLM’s training data, the agent searches a vector store before each response and grounds its answers in your content.
~
Why RAG Matters for AI Agents
LLMs are powerful, but they have a fundamental limitation: they only know what they were trained on.
They can’t access your internal documentation, product guides, company policies, or domain-specific knowledge.
RAG solves this by retrieving relevant documents at runtime and injecting them into the model’s context. The result is an agent that answers questions using your data, not generic knowledge.
This is useful when you want your agent to:
- Answer questions grounded in specific documentation
- Cite sources so users can verify information
- Avoid hallucination by constraining responses to known content
RAG also helps your agent stay current without retraining the underlying model.
~
How It Works
The Microsoft Agent Framework provides a TextSearchProvider that hooks into the agent’s execution pipeline. Before each model invocation, it runs a vector search against your document store and injects the matching results into the conversation as additional context messages.
The flow works like this:
- User asks a question
- TextSearchProvider converts the question into an embedding
- The embedding is compared against document embeddings in the vector store
- The top matching documents are injected into the model context
- The LLM generates a response grounded in the retrieved documents
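The retrieval step in this flow can be sketched in plain C#. This is a toy, framework-free illustration only — tiny hand-made vectors stand in for real 3072-dimension embeddings, and a hand-rolled cosine similarity stands in for the vector store's search; the actual project delegates all of this to the TextSearchProvider and the embedding model:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy illustration of the retrieval step: rank documents by cosine
// similarity against a query embedding and keep the top matches.
// Real embeddings come from the embedding model; these are hand-made.
var documents = new Dictionary<string, float[]>
{
    ["beginner-strength"] = new[] { 0.9f, 0.1f, 0.0f },
    ["nutrition-basics"]  = new[] { 0.1f, 0.9f, 0.1f },
    ["recovery-sleep"]    = new[] { 0.0f, 0.2f, 0.9f },
};

// Stand-in for the embedded user question "how should a beginner train?"
float[] queryEmbedding = { 0.8f, 0.2f, 0.1f };

// Rank by similarity and keep the top 2, mirroring the search adapter later.
var topMatches = documents
    .OrderByDescending(d => CosineSimilarity(queryEmbedding, d.Value))
    .Take(2)
    .Select(d => d.Key)
    .ToList();

Console.WriteLine(string.Join(", ", topMatches));
// beginner-strength, nutrition-basics

static double CosineSimilarity(float[] a, float[] b)
{
    double dot = 0, magA = 0, magB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        magA += a[i] * a[i];
        magB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(magA) * Math.Sqrt(magB));
}
```

The documents closest to the query win, regardless of exact keyword overlap — which is what makes vector search a better fit for RAG than plain text matching.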
For this example, we use the InMemoryVectorStore from the Microsoft.SemanticKernel.Connectors.InMemory NuGet package.
We’re using this because at the time of writing, the Microsoft Agent Framework doesn’t ship with its own vector store connector.
Instead, the Microsoft Agent Framework delegates storage to the Microsoft.Extensions.VectorData abstractions.
These define the interfaces but contain no implementations. The Semantic Kernel connector packages ship with standard implementations of these interfaces.
Note: We’re not using the Semantic Kernel framework itself, just this one connector package for its in-memory vector store implementation.
In a production scenario, you could swap this out for any vector store that implements these same abstractions, such as Azure AI Search, Qdrant, or Pinecone.
~
Project Structure
The project follows the same structure we’ve used throughout this series, with an Agents/ folder to keep agent logic separate from the entry point:
├── Agent-Framework-9-Basic-RAG.csproj
├── Program.cs
├── Agents/
│   └── IronMindRagAgent.cs
├── TextSearchStore.cs
└── TextSearchDocument.cs
- Program.cs is lean and handles only the OpenAI client setup and the chat loop
- IronMindRagAgent.cs encapsulates the vector store setup, search logic, and agent configuration
- TextSearchStore.cs wraps the in-memory vector store with a clean API for upserting documents and running searches
- TextSearchDocument.cs defines the document model
With the project structure covered, let's look at the project definition.
~
Setting Up the Project
The project targets .NET 8.0 and uses the following NuGet packages:
<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net8.0</TargetFramework>
    <RootNamespace>Agent_Framework_9_Basic_RAG</RootNamespace>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="Microsoft.Agents.AI" Version="1.0.0-preview.260205.1" />
    <PackageReference Include="Microsoft.Agents.AI.OpenAI" Version="1.0.0-preview.260205.1" />
    <PackageReference Include="Microsoft.Extensions.AI.OpenAI" Version="10.2.0-preview.1.26063.2" />
    <PackageReference Include="Microsoft.Extensions.VectorData.Abstractions" Version="9.7.0" />
    <PackageReference Include="Microsoft.SemanticKernel.Connectors.InMemory" Version="1.66.0-preview" />
    <PackageReference Include="OpenAI" Version="2.8.0" />
    <PackageReference Include="System.ClientModel" Version="1.8.1" />
  </ItemGroup>

</Project>
Some key packages to note include:
- Microsoft.Agents.AI provides the TextSearchProvider, AIAgent, and ChatClientAgentOptions
- Microsoft.Agents.AI.OpenAI provides the .AsAIAgent() extension method for OpenAI’s ChatClient
- Microsoft.Extensions.AI.OpenAI provides .AsIEmbeddingGenerator() to convert OpenAI’s embedding client into the standard IEmbeddingGenerator interface
- Microsoft.Extensions.VectorData.Abstractions defines the vector store abstractions (VectorStoreCollection, etc.)
- Microsoft.SemanticKernel.Connectors.InMemory provides the InMemoryVectorStore implementation
Now let’s dig into the required models.
~
Defining the Document Model
First, we need a simple model to represent the documents we want to store and search:
public class TextSearchDocument
{
public string SourceId { get; set; } = string.Empty;
public string SourceName { get; set; } = string.Empty;
public string SourceLink { get; set; } = string.Empty;
public string Text { get; set; } = string.Empty;
}
Each document has an identifier, a human-readable source name, a link for citation purposes, and the actual text content.
~
Building the TextSearchStore
The TextSearchStore wraps the in-memory vector store and provides a clean API for upserting documents and running searches:
using Microsoft.Extensions.VectorData;
using Microsoft.SemanticKernel.Connectors.InMemory;
namespace Agent_Framework_9_Basic_RAG;
public class TextSearchStore
{
private readonly VectorStoreCollection<string, TextSearchRecord> _collection;
public TextSearchStore(InMemoryVectorStore vectorStore, string collectionName, int dimensions)
{
var definition = new VectorStoreCollectionDefinition
{
Properties =
[
new VectorStoreKeyProperty("SourceId", typeof(string)),
new VectorStoreDataProperty("SourceName", typeof(string)),
new VectorStoreDataProperty("SourceLink", typeof(string)),
new VectorStoreDataProperty("Text", typeof(string)),
new VectorStoreVectorProperty("Embedding", typeof(string), dimensions),
]
};
_collection = vectorStore.GetCollection<string, TextSearchRecord>(collectionName, definition);
}
public async Task UpsertDocumentsAsync(IEnumerable<TextSearchDocument> documents)
{
await _collection.EnsureCollectionExistsAsync();
foreach (var doc in documents)
{
var record = new TextSearchRecord
{
SourceId = doc.SourceId,
SourceName = doc.SourceName,
SourceLink = doc.SourceLink,
Text = doc.Text,
Embedding = doc.Text
};
await _collection.UpsertAsync(record);
}
}
public async Task<IEnumerable<TextSearchDocument>> SearchAsync(
string query, int topK, CancellationToken cancellationToken = default)
{
var results = _collection.SearchAsync(query, topK, cancellationToken: cancellationToken);
var documents = new List<TextSearchDocument>();
await foreach (var result in results)
{
documents.Add(new TextSearchDocument
{
SourceId = result.Record.SourceId,
SourceName = result.Record.SourceName,
SourceLink = result.Record.SourceLink,
Text = result.Record.Text
});
}
return documents;
}
}
A few important things to note here:
- The VectorStoreCollectionDefinition defines the schema for the collection, including the vector property with its dimensions (3072 for text-embedding-3-large)
- The Embedding property is typed as string, not ReadOnlyMemory<float>. This is important. When you set the Embedding to the document text and the vector store has an EmbeddingGenerator configured, it automatically converts the text to a vector embedding during upsert. The same happens during search when you pass a string query. If you use ReadOnlyMemory<float> instead, the embeddings won’t be generated and you’ll get a dimension mismatch error at search time
- The TextSearchRecord is an internal model that includes the Embedding property, while TextSearchDocument is the public-facing model
The internal record type:
public class TextSearchRecord
{
public string SourceId { get; set; } = string.Empty;
public string SourceName { get; set; } = string.Empty;
public string SourceLink { get; set; } = string.Empty;
public string Text { get; set; } = string.Empty;
public string Embedding { get; set; } = string.Empty;
}
This is used to represent vectorised content.
~
Creating the Sample Documents
For this example, we use three Iron Mind AI personal trainer documents covering beginner training, nutrition, and recovery. These are defined as a static method on TextSearchStore:
public static IEnumerable<TextSearchDocument> GetSampleDocuments()
{
yield return new TextSearchDocument
{
SourceId = "beginner-strength-001",
SourceName = "Iron Mind AI - Beginner Strength Training Guide",
SourceLink = "https://ironmindai.com/tips/beginner-strength",
Text = "For beginners, focus on compound movements like squats, deadlifts, bench press, " +
"and overhead press. Train 3-4 days per week with at least one rest day between " +
"sessions. Start with a weight you can control for 8-12 reps with good form. " +
"Progressive overload is key - aim to gradually increase weight, reps, or sets " +
"over time. Consistency beats intensity in the early stages."
};
yield return new TextSearchDocument
{
SourceId = "nutrition-basics-001",
SourceName = "Iron Mind AI - Nutrition for Muscle Growth",
SourceLink = "https://ironmindai.com/tips/nutrition-muscle-growth",
Text = "To support muscle growth, aim for 1.6 to 2.2 grams of protein per kilogram of " +
"body weight per day. Spread protein intake across 3-5 meals for optimal muscle " +
"protein synthesis. Prioritize whole food sources like chicken, fish, eggs, Greek " +
"yogurt, and legumes. Don't neglect carbohydrates - they fuel your workouts and " +
"aid recovery. A slight caloric surplus of 200-300 calories above maintenance is " +
"ideal for lean muscle gain."
};
yield return new TextSearchDocument
{
SourceId = "recovery-sleep-001",
SourceName = "Iron Mind AI - Recovery and Sleep Guide",
SourceLink = "https://ironmindai.com/tips/recovery-sleep",
Text = "Sleep is when your body repairs and builds muscle tissue. Aim for 7-9 hours of " +
"quality sleep per night. Poor sleep increases cortisol levels, which can impair " +
"muscle recovery and promote fat storage. Establish a consistent sleep schedule, " +
"limit screen time before bed, and keep your room cool and dark. Active recovery " +
"on rest days - such as walking, stretching, or light yoga - also helps reduce " +
"soreness and improve circulation."
};
}
In a real-world scenario, these documents would come from a database, CMS, file system, or an external API.
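As one example of that, a small loader could pull the documents from a JSON file rather than hard-coding them. This is a hedged sketch — the DocumentLoader class, the file path, and the JSON shape are illustrative assumptions, not part of the sample project:

```csharp
using System.Text.Json;

// Hypothetical loader: reads TextSearchDocument entries from a JSON array
// on disk. The class name and file format are illustrative assumptions.
public static class DocumentLoader
{
    private static readonly JsonSerializerOptions Options = new()
    {
        PropertyNameCaseInsensitive = true
    };

    public static async Task<List<TextSearchDocument>> LoadFromJsonAsync(string path)
    {
        await using var stream = File.OpenRead(path);
        var documents = await JsonSerializer
            .DeserializeAsync<List<TextSearchDocument>>(stream, Options);
        return documents ?? new List<TextSearchDocument>();
    }
}
```

The resulting list could then be passed straight to UpsertDocumentsAsync in place of GetSampleDocuments().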
~
The IronMindRagAgent
This is where the RAG logic lives. The IronMindRagAgent class encapsulates the vector store setup, the search adapter, and the agent configuration.
This keeps Program.cs clean and follows the same Agents/ folder convention used across the other projects in this series.
using Microsoft.Agents.AI;
using Microsoft.Extensions.AI;
using Microsoft.SemanticKernel.Connectors.InMemory;
using OpenAI;
using OpenAI.Chat;
namespace Agent_Framework_9_Basic_RAG.Agents;
public class IronMindRagAgent
{
private const int EmbeddingDimensions = 3072;
private const string CollectionName = "iron-mind-ai-tips";
private readonly TextSearchStore _textSearchStore;
public IronMindRagAgent(OpenAIClient openAIClient, string embeddingModel)
{
var vectorStore = new InMemoryVectorStore(new()
{
EmbeddingGenerator = openAIClient.GetEmbeddingClient(embeddingModel).AsIEmbeddingGenerator()
});
_textSearchStore = new TextSearchStore(vectorStore, CollectionName, EmbeddingDimensions);
}
public async Task<AIAgent> CreateAgentAsync(OpenAIClient openAIClient, string model)
{
await _textSearchStore.UpsertDocumentsAsync(TextSearchStore.GetSampleDocuments());
var textSearchOptions = new TextSearchProviderOptions
{
SearchTime = TextSearchProviderOptions.TextSearchBehavior.BeforeAIInvoke,
CitationsPrompt = "Always cite sources at the end of your response using the format: " +
"**Source:** [SourceName](SourceLink)",
};
return openAIClient
.GetChatClient(model)
.AsAIAgent(new ChatClientAgentOptions
{
ChatOptions = new()
{
Instructions = "You are Iron Mind AI, a knowledgeable personal trainer. " +
"You MUST base your answers on the provided context documents. " +
"Always cite your sources by name and link at the end of your response. " +
"If the context does not contain relevant information, say so."
},
AIContextProviderFactory = (ctx, ct) => new ValueTask<AIContextProvider>(
new TextSearchProvider(SearchAsync, ctx.SerializedState,
ctx.JsonSerializerOptions, textSearchOptions)),
ChatHistoryProviderFactory = (ctx, ct) => new ValueTask<ChatHistoryProvider>(
new InMemoryChatHistoryProvider().WithAIContextProviderMessageRemoval()),
});
}
private async Task<IEnumerable<TextSearchProvider.TextSearchResult>> SearchAsync(
string text, CancellationToken ct)
{
var searchResults = await _textSearchStore.SearchAsync(text, 2, ct);
return searchResults.Select(r => new TextSearchProvider.TextSearchResult
{
SourceName = r.SourceName,
SourceLink = r.SourceLink,
Text = r.Text,
RawRepresentation = r
});
}
}
Let’s walk through the key parts.
~
The Constructor
The constructor takes the OpenAIClient and the embedding model name. It creates the in-memory vector store with the OpenAI embedding generator and initialises the TextSearchStore:
public IronMindRagAgent(OpenAIClient openAIClient, string embeddingModel)
{
var vectorStore = new InMemoryVectorStore(new()
{
EmbeddingGenerator = openAIClient.GetEmbeddingClient(embeddingModel).AsIEmbeddingGenerator()
});
_textSearchStore = new TextSearchStore(vectorStore, CollectionName, EmbeddingDimensions);
}
The .AsIEmbeddingGenerator() extension converts OpenAI’s embedding client into the standard IEmbeddingGenerator interface from Microsoft.Extensions.AI.
This embedding generator is used automatically during both upsert (to convert document text into vectors) and search (to convert the query into a vector).
~
CreateAgentAsync
This method loads the sample documents into the vector store, configures the TextSearchProvider, and builds the agent:
public async Task<AIAgent> CreateAgentAsync(OpenAIClient openAIClient, string model)
{
await _textSearchStore.UpsertDocumentsAsync(TextSearchStore.GetSampleDocuments());
var textSearchOptions = new TextSearchProviderOptions
{
SearchTime = TextSearchProviderOptions.TextSearchBehavior.BeforeAIInvoke,
CitationsPrompt = "Always cite sources at the end of your response using the format: " +
"**Source:** [SourceName](SourceLink)",
};
return openAIClient
.GetChatClient(model)
.AsAIAgent(new ChatClientAgentOptions { ... });
}
Two options control the RAG behaviour:
- SearchTime = BeforeAIInvoke tells the provider to automatically run a search before every model invocation and inject the results as context messages
- CitationsPrompt provides explicit instructions to the model on how to format source citations
An alternative to BeforeAIInvoke is OnDemandFunctionCalling, which exposes the search as a function tool that the model can choose to invoke when it decides it needs more information.
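Switching to that mode is a one-line change to the options. This fragment reuses only the names already shown above, with the citation prompt unchanged:

```csharp
// On-demand mode: the search is exposed as a function tool the model can
// choose to call, rather than running automatically before each invocation.
var textSearchOptions = new TextSearchProviderOptions
{
    SearchTime = TextSearchProviderOptions.TextSearchBehavior.OnDemandFunctionCalling,
    CitationsPrompt = "Always cite sources at the end of your response using the format: " +
                      "**Source:** [SourceName](SourceLink)",
};
```

On-demand mode trades guaranteed grounding for fewer searches: the model only pays the retrieval cost when it decides it needs more information.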
~
The Agent Configuration
The AIContextProviderFactory creates a new TextSearchProvider instance for each session. The SearchAsync method is passed as a method group reference rather than an inline delegate, keeping the code clean and readable.
The ctx.SerializedState and ctx.JsonSerializerOptions parameters support session serialisation, allowing the provider’s state to be persisted and restored.
The ChatHistoryProviderFactory creates an InMemoryChatHistoryProvider with .WithAIContextProviderMessageRemoval().
This is important because without it, every search result message would accumulate in the chat history, bloating the context window over a multi-turn conversation.
Notice the system instructions explicitly tell the model to base answers on the provided context and cite sources.
Without strong instructions, the model may fall back to its general training data and ignore the injected documents.
~
The Search Adapter
The TextSearchProvider doesn’t talk to the vector store directly. Instead, it calls the SearchAsync method that you provide.
This gives you full control over how searches are executed:
private async Task<IEnumerable<TextSearchProvider.TextSearchResult>> SearchAsync(
    string text, CancellationToken ct)
{
var searchResults = await _textSearchStore.SearchAsync(text, 2, ct);
return searchResults.Select(r => new TextSearchProvider.TextSearchResult
{
SourceName = r.SourceName,
SourceLink = r.SourceLink,
Text = r.Text,
RawRepresentation = r
});
}
We return the top 2 results per query. You can adjust this based on how much context you want to provide to the model.
~
Program.cs
With all the RAG logic encapsulated in the agent class, Program.cs is minimal:
using Agent_Framework_9_Basic_RAG.Agents;
using OpenAI;
string apiKey = "your-openai-api-key";
string model = "gpt-4o-mini";
string embeddingModel = "text-embedding-3-large";
OpenAIClient openAIClient = new(apiKey);
var ironMind = new IronMindRagAgent(openAIClient, embeddingModel);
var agent = await ironMind.CreateAgentAsync(openAIClient, model);
var session = await agent.CreateSessionAsync();
Console.WriteLine("Iron Mind AI - Personal Trainer");
Console.WriteLine("Ask me anything about training, nutrition, or recovery. Type 'exit' to quit.\n");
while (true)
{
Console.Write("You: ");
string? input = Console.ReadLine();
if (string.IsNullOrWhiteSpace(input) || input.Equals("exit", StringComparison.OrdinalIgnoreCase))
break;
Console.WriteLine();
Console.WriteLine(await agent.RunAsync(input, session));
Console.WriteLine();
}
A few lines to set up the agent and session; the rest is the chat loop. All vector store configuration, document loading, search logic, and prompt engineering lives inside IronMindRagAgent.
~
Example Output
Here’s what a conversation with the RAG-enabled agent looks like in the console:
Iron Mind AI - Personal Trainer
Ask me anything about training, nutrition, or recovery. Type 'exit' to quit.

You: as a beginner, what can i do?

As a beginner in strength training, here are some key steps you can follow:

**Focus on Compound Movements**: Start with exercises like squats, deadlifts, bench presses, and overhead presses. These movements engage multiple muscle groups, which is great for building strength effectively.

**Training Frequency**: Aim to train 3-4 days per week. Ensure you have at least one rest day between sessions to allow your muscles to recover.

**Start with Manageable Weights**: Choose a weight that you can control for 8-12 repetitions while maintaining good form.

**Progressive Overload**: Gradually increase the weight, number of repetitions, or sets over time.

**Consistency Over Intensity**: In the beginning, it's better to focus on being consistent with your workouts rather than pushing yourself too hard.

**Prioritize Recovery and Sleep**: Aim for 7-9 hours of quality sleep per night, as this is when your body repairs and builds muscle tissue.

**Source:** [Iron Mind AI - Beginner Strength Training Guide](https://ironmindai.com/tips/beginner-strength), [Iron Mind AI - Recovery and Sleep Guide](https://ironmindai.com/tips/recovery-sleep)
The agent pulls from the relevant documents, synthesises a coherent answer, and cites the sources at the end. This is the key benefit of RAG: the response is grounded in your content, not hallucinated from general training data.
~
Demo
In the following demo, we can see the “RAG-fied” Iron Mind AI personal trainer agent in action:
~
Wrapping Up
In this post we looked at how to add RAG capabilities to a Microsoft Agent Framework agent using TextSearchProvider and an in-memory vector store. The key takeaways:
- TextSearchProvider integrates directly into the agent pipeline, running searches before each model invocation
- The InMemoryVectorStore with an EmbeddingGenerator handles embedding generation automatically during both upsert and search
- The vector property should be typed as string (not ReadOnlyMemory<float>) to enable automatic embedding generation
- Strong system instructions are essential to ensure the model uses the provided context rather than falling back to general knowledge
- CitationsPrompt on TextSearchProviderOptions tells the model how to format source citations
- WithAIContextProviderMessageRemoval() prevents search result messages from bloating the chat history
- Encapsulating the RAG logic in a dedicated agent class keeps Program.cs lean and follows good separation of concerns
The in-memory vector store used here is great for prototyping and demos.
For production workloads, you would swap it for a persistent vector store such as Azure AI Search, Qdrant, Weaviate, or Pinecone.
The Microsoft.Extensions.VectorData abstractions make this a straightforward change.
If you have any questions, feel free to reach out.
~
Enjoy what you’ve read, have questions about this content, or would like to see another topic?
Drop me a note below.
You can schedule a call using my Calendly link to discuss consulting and development services.
