Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

Building an Enterprise Knowledge Copilot with Foundry IQ and Agentic Retrieval on Azure AI


Every enterprise has the same problem: knowledge scattered across SharePoint, file shares, wikis, and email. This article walks through building a knowledge copilot that unifies that data behind a single conversational interface — using Microsoft's Foundry IQ knowledge bases and the agentic retrieval engine in Azure AI Search.

The Problem: Fragmented Knowledge, Fragmented Answers

Enterprise AI projects today share a common pain point. Each new agent or copilot that needs to answer questions from company data must rebuild its own retrieval pipeline from scratch — data connections, chunking logic, embeddings, routing, permissions — all duplicated project after project. The result is a tangle of fragmented, siloed pipelines that are expensive to maintain and inconsistent in quality.

Consider a field technician troubleshooting equipment. The answer might span a vendor manual stored in OneLake, a company repair policy on SharePoint, and a public electrical standard on the web. Traditional single-index RAG cannot orchestrate across those sources in one pass. The technician waits, the issue escalates, and productivity drops.

Foundry IQ, announced in public preview in November 2025, addresses this directly. It provides a unified knowledge layer for agents — a single endpoint that replaces per-project RAG pipelines with a reusable, topic-centric knowledge base that any number of agents can consume.

What Is Foundry IQ?

Foundry IQ introduces four capabilities built on top of Azure AI Search:

  • Knowledge Bases — Reusable, topic-centric collections (e.g., "employee policies," "product documentation") available directly in the Foundry portal. Rather than wiring retrieval logic into every agent, you define a knowledge base once and ground multiple agents through a single API.
  • Indexed and Federated Knowledge Sources — A knowledge base can draw from Azure Blob Storage, OneLake, SharePoint, Azure AI Search indexes, the web, and MCP servers (MCP in private preview). Developers do not need to manage different retrieval strategies per source; the knowledge base presents a unified endpoint.
  • Agentic Retrieval Engine — A self-reflective query engine that uses AI to plan, search, and synthesize answers with configurable retrieval reasoning effort.
  • Enterprise-Grade Security — Document-level access control and alignment with existing permissions models. Microsoft Purview sensitivity labels are respected through the indexing and retrieval pipeline, so classified content remains governed as it flows into knowledge bases.

For indexed sources, Foundry IQ automatically manages the full indexing pipeline: content is ingested, chunked, vectorized, and prepared for hybrid retrieval. When Azure Content Understanding is enabled, complex documents gain layout-aware enrichment — tables, figures, and headers are extracted and structured without extra engineering work.
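Conceptually, that managed pipeline resembles the following minimal Python sketch. Every name here (the chunking parameters, the placeholder embedding function) is an illustrative assumption, not the service's actual implementation; Foundry IQ runs its own chunking and vectorization internally.

```python
# Illustrative ingest -> chunk -> vectorize pipeline (all names hypothetical).

def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping word-window chunks."""
    words = text.split()
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(words), step):
        chunk = " ".join(words[start:start + chunk_size])
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(words):
            break
    return chunks

def embed(chunk: str) -> list[float]:
    """Placeholder for a real embedding call (e.g., text-embedding-3-large)."""
    return [float(len(chunk)), float(len(chunk.split()))]

document = "word " * 500
index = [{"content": c, "vector": embed(c)} for c in chunk_text(document)]
print(len(index))  # number of chunks prepared for hybrid retrieval
```

The overlap between adjacent chunks is a common technique to keep sentences that straddle a chunk boundary retrievable from either side.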

How Agentic Retrieval Works

Single-shot RAG — one query, one index, one pass — breaks down when questions are ambiguous, multi-hop, or span several data silos. Foundry IQ's agentic retrieval engine treats retrieval as a multi-step reasoning task rather than a keyword lookup:

  1. Plan — The engine analyzes the conversation and decomposes the query into focused sub-queries, deciding which knowledge sources to consult.
  2. Search — Sub-queries run concurrently against selected sources using keyword, vector, or hybrid techniques.
  3. Rank — Semantic reranking identifies the most relevant results.
  4. Reflect — If the information gathered is insufficient, the engine iterates — issuing follow-up queries autonomously.
  5. Synthesize — Results are unified into a natural-language answer with source references.

Developers control this behaviour through a high-level retrieval reasoning effort setting. Lower effort suits fast, lightweight lookups; higher effort enables iterative search and richer planning across the entire data estate.
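The loop above can be sketched in a few lines of Python. Everything in this sketch is a hypothetical stand-in: the real engine runs inside Azure AI Search and is driven only by the reasoning-effort setting, so the effort-to-iteration mapping, the naive query planner, and the keyword-only search are assumptions for illustration.

```python
# Illustrative plan/search/rank/reflect/synthesize loop (all names hypothetical).

EFFORT_TO_MAX_ITERATIONS = {"minimal": 1, "low": 2, "medium": 4}  # assumed mapping

def plan(question: str) -> list[str]:
    # Decompose the question into focused sub-queries.
    return [q.strip() for q in question.split(" and ")]

def search(sub_query: str, corpus: dict[str, str]) -> list[tuple[str, str]]:
    # Keyword lookup standing in for hybrid keyword/vector search.
    return [(doc_id, text) for doc_id, text in corpus.items()
            if sub_query.lower() in text.lower()]

def agentic_retrieve(question: str, corpus: dict[str, str],
                     effort: str = "medium") -> dict:
    results: list[tuple[str, str]] = []
    queries = plan(question)
    for _ in range(EFFORT_TO_MAX_ITERATIONS[effort]):
        for q in queries:
            results.extend(search(q, corpus))
        if results:            # reflect: stop once grounded context is found
            break
        queries = [question]   # otherwise retry with a broader follow-up query
    # synthesize: unify results into an answer with source references
    sources = sorted({doc_id for doc_id, _ in results})
    return {"answer": " ".join(text for _, text in results), "sources": sources}

corpus = {"manual-7": "Reset the breaker before replacing the relay.",
          "policy-2": "Repairs over $500 need supervisor approval."}
print(agentic_retrieve("reset the breaker and supervisor approval", corpus))
```

Note how a multi-part question fans out into sub-queries that hit different sources, and how a higher effort level simply permits more reflect-and-retry rounds before giving up.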

Real-world impact: AT&T integrated Azure AI Search and retrieval-augmented generation into its multi-agent framework, reducing customer resolution times by 33 percent, cutting average handle time by nearly 10 percent, and scaling 71 AI solutions to 100,000 employees. Ontario Power Generation used agentic retrieval to sift through more than 40 years of nuclear operating experience, enabling data-driven decision-making and helping new staff learn from decades of institutional knowledge.

Architecture Overview

Step-by-Step: Setting Up the Knowledge Copilot

  1. Provision Resources

You need an Azure AI Search service (Basic tier or above), a Microsoft Foundry project, an embedding model deployment (e.g., text-embedding-3-large), and an LLM deployment (e.g., gpt-4.1) for query planning and answer generation. .NET 8 or later is required for the C# SDK.

  2. Create a Knowledge Base in Azure AI Search

Using the Azure.Search.Documents preview SDK, define an index, a knowledge source pointing to your data, and a knowledge base with OutputMode set to AnswerSynthesis for natural-language answers with citations. The following C# snippet (adapted from the official Azure AI Search quickstart) shows the knowledge base creation:

using Azure;
using Azure.Identity;
using Azure.Search.Documents.Indexes;

var searchEndpoint = "https://<your-service>.search.windows.net";
var aoaiEndpoint = "https://<your-resource>.openai.azure.com/";

var indexClient = new SearchIndexClient(
    new Uri(searchEndpoint),
    new DefaultAzureCredential());

// Configure the LLM for query planning and answer synthesis
var openAiParameters = new AzureOpenAIVectorizerParameters
{
    ResourceUri = new Uri(aoaiEndpoint),
    DeploymentName = "gpt-4.1",
    ModelName = "gpt-4.1"
};
var model = new KnowledgeBaseAzureOpenAIModel(openAiParameters);

// Create the knowledge base with answer synthesis enabled
var knowledgeBase = new KnowledgeBase("<knowledge-base-name>")
{
    OutputMode = KnowledgeBaseOutputMode.AnswerSynthesis,
    AnswerInstructions = "Provide a concise answer based on the retrieved documents.",
    Models = { model }
};
await indexClient.CreateOrUpdateKnowledgeBaseAsync(knowledgeBase);
  3. Connect an Agent to the Knowledge Base via MCP

Each knowledge base exposes a Model Context Protocol (MCP) endpoint that MCP-compatible agents can call. The Foundry IQ-specific agent SDK currently offers full code samples for Python and REST API, but you can use the general-purpose MCP tooling in C# to achieve the same connection. The following pattern is drawn from the official Microsoft Learn documentation on MCP tools with Foundry Agents:

using Azure.AI.Projects;
using Azure.Identity;

var endpoint = "https://<your-resource>.services.ai.azure.com/api/projects/<your-project>";
var model = "gpt-4.1-mini";

// Point the MCP tool at the knowledge base's MCP endpoint
var mcpTool = new MCPToolDefinition(
    serverLabel: "enterprise_kb",
    serverUrl: "https://<search-service>.search.windows.net" +
        "/knowledgebases/<kb-name>/mcp?api-version=2025-11-01-preview");
mcpTool.AllowedTools.Add("knowledge_base_retrieve");

// Create the agent with the MCP tool attached
var projectClient = new AIProjectClient(new Uri(endpoint), new DefaultAzureCredential());
var agentVersion = await projectClient.AgentAdministrationClient
    .CreateAgentVersionAsync(
        "enterprise-copilot",
        new ProjectsAgentVersionCreationOptions(
            new DeclarativeAgentDefinition(model)
            {
                Instructions = "You are a company knowledge assistant. " +
                    "Always search the knowledge base before answering. " +
                    "If the knowledge base has no answer, say so clearly.",
                Tools = { mcpTool }
            }));

The agent instructions are critical — explicitly requiring the agent to use the knowledge base prevents it from answering purely from the LLM's training data.

  4. Query the Copilot

Once the agent is published, your application layer simply sends user questions via the Azure AI Projects SDK or REST API. The agent autonomously invokes the knowledge base tool, retrieves grounded context, and returns an answer with citations referencing the original documents.
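At the application layer this reduces to a thin wrapper around one agent call. In the sketch below, `invoke_agent` is a stub standing in for the real invocation (the Azure AI Projects SDK or a REST call to the published agent); its response shape and the citation IDs are assumptions for illustration only.

```python
# Application-layer sketch; `invoke_agent` is a hypothetical stub.

def invoke_agent(question: str) -> dict:
    # Stand-in for the real agent invocation. The published agent decides on
    # its own to call the knowledge base tool and returns a grounded answer
    # plus citations referencing the original documents.
    return {"answer": "Reset the breaker, then file a repair ticket.",
            "citations": ["manual-7", "policy-2"]}

def ask_copilot(question: str) -> str:
    response = invoke_agent(question)
    refs = ", ".join(response["citations"])
    return f'{response["answer"]} (sources: {refs})'

print(ask_copilot("How do I reset the breaker?"))
```

The key point is that the application never orchestrates retrieval itself; it sends the question and renders the answer with its citations.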

 

Trade-offs and Considerations

  • Maturity: Foundry IQ is in public preview and not recommended for production workloads without accepting preview SLA terms.
  • Cost: Agentic retrieval has two billing streams: token-based billing from Azure AI Search for retrieval, and billing from Azure OpenAI for query planning and answer synthesis.
  • Latency vs. Quality: Higher retrieval reasoning effort produces better answers but adds latency due to iterative search. For sub-second lookups, use minimal effort; for complex multi-hop questions, use medium.
  • C# SDK Coverage: The Foundry IQ-specific agent connection SDK currently supports Python and REST API. C# support is available for the underlying agentic retrieval queries and for general MCP tool integration.
  • Security: Document-level ACLs from SharePoint are enforced at query time. For per-user authorization in Foundry Agent Service, the current preview does not support per-request MCP headers; use the Azure OpenAI Responses API as an alternative.

 

Key Takeaways

Foundry IQ transforms enterprise RAG from a bespoke, per-project exercise into a managed, reusable knowledge layer. You define a knowledge base once, connect it to your data sources, and any number of agents or apps can consume it. The agentic retrieval engine handles query planning, multi-source search, semantic reranking, and iterative refinement — capabilities that previously required significant custom engineering. For .NET developers, the Azure AI Search C# SDK and the MCP tooling in the Agent Framework provide the building blocks to integrate this into your applications today.

 



The Evolution of Microservices: Agents, Monoliths, and the Patterns That Never Die

Recorded live at AWS Summit London, Matheus Guimaraes — Senior Developer Advocate at AWS and microservices specialist with over 25 years in tech — joins Romain to explore how agentic AI is reshaping the way we think about distributed systems architecture. From Martin Fowler's 2014 definition to agentic microservices in 2026, Matheus unpacks why the same distributed systems patterns — single responsibility, context dilution, failure modes — keep resurfacing in every new wave of architecture.

The conversation covers the monolith vs. microservices debate as a deliberate architectural choice rather than accidental spaghetti, modular monoliths with Spring Modulith, and how AI coding assistants like Kiro are changing the architect's role from writing boilerplate to making higher-order design decisions. Matheus introduces his concepts of 'smart APIs,' 'monolithic agentic microservices,' and 'specialized agentic microservices' — and explains his talk 'Is It Agent?' on when to reach for agents vs. traditional applications.

We dig into the serverless primitives purpose-built for agentic workloads: Amazon Bedrock AgentCore Runtime for long-running agent processes, AWS Lambda Durable Functions for multi-step workflows, and the AWS DevOps Agent for autonomous incident response. We also explore integration patterns with MCP and Google's A2A protocol, the 'lost in the middle' problem with context dilution, and why critical thinking about AI adoption matters more than ever. Whether you are decomposing a monolith or designing your first agentic system, this conversation connects the dots between a decade of microservices wisdom and the agentic future.

With Matheus Guimaraes, Senior Developer Advocate, AWS





  • Download audio: https://op3.dev/e/dts.podtrac.com/redirect.mp3/developers.podcast.go-aws.com/media/206.mp3

    Vanishing Culture


    In Vanishing Culture, editors Luca Messarra, Chris Freeland and Juliya Ziskina bring together voices exploring what it means to lose access to our shared cultural record in the digital age. From disappearing websites and delisted music to fragile licensing agreements and platform shutdowns, the book traces how corporate control, technological change, and neglect are reshaping what survives... and what vanishes.

    In this episode, Messarra and Freeland are joined by contributor Katie Livingston to discuss the forces driving cultural loss today, the stakes for libraries and public memory, and what it will take to build a more durable, accessible digital future.

    Read Vanishing Culture for free at the Internet Archive: https://archive.org/details/vanishing-culture-2026
    Purchase in print from Better World Books or your favorite local bookstore: https://www.betterworldbooks.com/product/detail/vanishing-culture-a-report-on-our-fragile-cultural-record-9798995425014/new

    This conversation was recorded on 4/17/2026.

    Check out all of the Future Knowledge episodes at https://archive.org/details/future-knowledge





    Download audio: https://media.transistor.fm/76e5c574/92c6cf7a.mp3

    AI-Proofing Your Skillset - High-Meaning, High-Specificity Vocabulary is the Path to Growth

    • Why I'm Not "Picking a Fight" on AI: A listener asked if I'm intentionally stoking a flame war by treating agentic coding as a foregone conclusion. The honest answer is that I've used it, the data points one direction, and a show built around pretending otherwise would slowly drift away from reality — and away from being useful to you.
    • Respecting the Misgivings, Without Getting Stuck in Them: Ethical concerns, skill atrophy worries, and questions about long-term effects are all legitimate. But the goal of this show is practical applicability, so we focus on mental models you can use Monday morning rather than litigating every angle of the debate.
    • The "Minecraft" Principle: If I ask you to "build Minecraft," I've handed you several chapters of specification in a single word. That's meaning-rich abstraction — language that points at a huge amount of shared context with very little token cost.
    • Meaning-Rich AND Specific: "Human history" is meaning-rich but uselessly broad. "Block-building game" is specific but loses fidelity. The sweet spot is vocabulary that is both compact and unambiguous — sitting in the top right of the meaning-density / specificity graph.
    • A Real Example — Strategy Pattern: When working on authorization rules, I didn't want a pipeline. Instead of describing base classes, shared interfaces, and parallel execution to the LLM, I used the words "strategy pattern." Three words did the work of three paragraphs, and the output landed where I wanted it.
    • Vocabulary as Leverage: Named patterns, named algorithms (Monte Carlo, etc.), named architectural concepts — these act like compressed pointers. The more of them you genuinely understand, the higher the leverage of every prompt you write and every conversation you have with another engineer.
    • How to Build This Vocabulary: Have conversations with senior engineers. Ask an LLM what patterns are at play in a codebase, which ones you're using incorrectly, and which ones you're tricked into thinking you're using. Learn the abstraction layer that sits one step above your day-to-day implementation work.
    • The Asterisk — Shared Context Required: This only works when both sides know the term. Public, well-documented concepts (patterns, papers, algorithms) translate immediately to LLMs. Private or organization-specific concepts need to be loaded into context — via CLAUDE.md, AGENTS.md, or skills — before that compression kicks in.
    • Episode Homework: Pick one area of your current codebase. Ask an LLM to name the patterns in play, the patterns you're using incorrectly, and the ones you might be missing. Use that conversation to add at least one new piece of meaning-rich vocabulary to your working set.

    🙏 Today's Episode is Brought To You by: Unblocked

    Your coding agents have access to your code base — and probably more — but access isn't the same as context. Agents can't reason well across MCPs on their own, they don't know your architecture decisions, and they don't know which docs are reliable versus written by someone in their free time two years ago.
    • Unblocked is the context layer your agents are missing.
    • It synthesizes your PRs, docs, Slack messages, and Jira issues into organizational context that agents actually understand.
    • That means better plans, higher quality code, fewer tokens, and fewer correction loops.
    • Whether you're running Claude Code, Cursor, Codex, or any agentic workflow, it's worth a look.
    Get a free three-week trial at getunblocked.com/developertea.

    📮 Ask a Question

    If you enjoyed this episode and would like me to discuss a question that you have on the show, drop it over at: developertea.com.

    📮 Join the Discord

    If you want to be a part of a supportive community of engineers (non-engineers welcome!) working to improve their lives and careers, join us on the Developer Tea Discord community today!

    🗞️ Subscribe to The Tea Break

    We are developing a brand new newsletter called The Tea Break! You can be the first in line to receive it by entering your email directly over at developertea.com.

    🧡 Leave a Review

    If you're enjoying the show and want to support the content head over to iTunes and leave a review!





    Download audio: https://dts.podtrac.com/redirect.mp3/cdn.simplecast.com/audio/c44db111-b60d-436e-ab63-38c7c3402406/episodes/88cbb506-fb73-45fe-8f99-32bd247e077d/audio/adaea5bb-fb27-42e4-bcf8-0ea1ee34f3ac/default_tc.mp3?aid=rss_feed&feed=dLRotFGk

    ASP.NET Core Cookie Size Limits in Production: Causes and Fixes

    1 Share

    Everything works fine in development. You log in, you get a cookie, life is good. You deploy to production, and a portion of your users can't authenticate. No useful error message. The browser just silently fails or shows a generic "something went wrong" page. Your logs contain 431 Request Header Fields Too Large or requests that arrive with a valid session cookie that your app promptly rejects.

    The culprit is almost always cookie size. This post walks through why auth cookies grow out of control, how to spot the symptoms, and three concrete solutions you can apply today.


    Musk Testifies OpenAI Was Created As Nonprofit To Counter Google

    Elon Musk testified on day two of his trial against OpenAI, saying he helped create the company as a nonprofit counterweight to Google and would not have backed it if the goal had been private profit. CNBC reports: Musk on Tuesday was the first witness called to testify in the trial. He spoke about his upbringing, his many companies, his role in founding OpenAI and his understanding of its structure. Musk said in his testimony that he was not opposed to the creation of a small for-profit subsidiary, "as long as the tail didn't wag the dog." He repeatedly emphasized that he founded OpenAI to serve as a counterweight to Google, saying he got the idea after an argument about AI safety with Google co-founder Larry Page, who Musk said called him "a speciesist for being pro-human." Musk said he was concerned Page was not taking AI safety seriously, so he wanted there to be a nonprofit, open-source alternative to Google. "I could have started it as a for profit and I chose not to," Musk said on the stand. Earlier, attorneys for Musk and OpenAI presented their opening arguments to the jury. Musk's lead trial lawyer, Steven Molo, delivered the opening statement for the Tesla and SpaceX CEO. OpenAI lawyer William Savitt gave the opening statement for the AI company, Altman and Brockman. OpenAI has characterized Musk's lawsuit as a baseless "harassment campaign." The company said Monday in a post on X that it "can't wait to make our case in court where both the truth and the law are on our side." Further reading: Elon Musk and OpenAI CEO Sam Altman Head To Court

    Read more of this story at Slashdot.
