A couple of months ago, we published a three-part series showing how to build multi-agent AI systems on Azure App Service using preview packages from the Microsoft Agent Framework (MAF) (formerly AutoGen / Semantic Kernel Agents). The series walked through async processing, the request-reply pattern, and client-side multi-agent orchestration — all running on App Service.
Since then, Microsoft Agent Framework has reached 1.0 GA — unifying AutoGen and Semantic Kernel into a single, production-ready agent platform. This post is a fresh start with the GA bits. We'll rebuild our travel-planner sample on the stable API surface, call out the breaking changes from preview, and get you up and running fast.
All of the code is in the companion repo: seligj95/app-service-multi-agent-maf-otel.
What Changed in MAF 1.0 GA
The 1.0 release is more than a version bump. Here's what moved:
- Unified platform. AutoGen and Semantic Kernel agent capabilities have converged into Microsoft.Agents.AI. One package, one API surface.
- Stable APIs with long-term support. The 1.0 contract is now locked for servicing. No more preview churn.
- Breaking change — instructions on options removed. In preview, you set instructions through ChatClientAgentOptions.Instructions. In GA, pass them directly to the ChatClientAgent constructor.
- Breaking change — RunAsync parameter rename. The thread parameter is now session (type AgentSession). If you were using named arguments, this is a compile error.
- Microsoft.Extensions.AI upgraded. The framework moved from the 9.x preview of Microsoft.Extensions.AI to the stable 10.4.1 release.
- OpenTelemetry integration built in. The builder pipeline now includes UseOpenTelemetry() out of the box — more on that in Blog 2.
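Taken together, the two breaking changes make for a short migration. Here's a hedged before/after sketch (the exact preview signatures varied across preview builds, so treat the commented-out lines as representative rather than exact):

```csharp
// Preview (pre-1.0): instructions lived on the options object,
// and RunAsync took a `thread` parameter.
// var agent = new ChatClientAgent(chatClient,
//     new ChatClientAgentOptions { Instructions = "You are a travel agent." });
// var response = await agent.RunAsync(messages, thread: null);

// GA (1.0): instructions go straight to the constructor,
// and the RunAsync parameter is now `session` (type AgentSession).
var agent = new ChatClientAgent(
    chatClient,
    instructions: "You are a travel agent.",
    name: "TravelAgent");
var response = await agent.RunAsync(messages, session: null);
```

If you were passing these arguments positionally rather than by name, the session rename may compile silently, so it's worth a quick audit of every RunAsync call site.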
Our project references reflect the GA stack:
<PackageReference Include="Microsoft.Agents.AI" Version="1.0.0" />
<PackageReference Include="Microsoft.Extensions.AI" Version="10.4.1" />
<PackageReference Include="Azure.AI.OpenAI" Version="2.1.0" />
Why Azure App Service for AI Agents?
If you're building with Microsoft Agent Framework, you need somewhere to run your agents. You could reach for Kubernetes, containers, or serverless — but for most agent workloads, Azure App Service is the sweet spot. Here's why:
- No infrastructure management — App Service is fully managed. No clusters to configure, no container orchestration to learn. Deploy your .NET or Python agent code and it just runs.
- Always On — Agent workflows can take minutes. App Service's Always On feature (on Premium tiers) ensures your background workers never go cold, so agents are ready to process requests instantly.
- WebJobs for background processing — Long-running agent workflows don't belong in HTTP request handlers. App Service's built-in WebJob support gives you a dedicated background worker that shares the same deployment, configuration, and managed identity — no separate compute resource needed.
- Managed Identity everywhere — Zero secrets in your code. App Service's system-assigned managed identity authenticates to Azure OpenAI, Service Bus, Cosmos DB, and Application Insights automatically. No connection strings, no API keys, no rotation headaches.
- Built-in observability — Native integration with Application Insights and OpenTelemetry means you can see exactly what your agents are doing in production (more on this in Part 2).
- Enterprise-ready — VNet integration, deployment slots for safe rollouts, custom domains, auto-scaling rules, and built-in authentication. All the things you'll need when your agent POC becomes a production service.
- Cost-effective — A single P0v4 instance (~$75/month) hosts both your API and WebJob worker. Compare that to running separate container apps or a Kubernetes cluster for the same workload.
The bottom line: App Service lets you focus on building your agents, not managing infrastructure. And since MAF supports both .NET and Python — both first-class citizens on App Service — you're covered regardless of your language preference.
Architecture Overview
The sample is a travel planner that coordinates six specialized agents to build a personalized trip itinerary. Users fill out a form (destination, dates, budget, interests), and the system returns a comprehensive travel plan complete with weather forecasts, currency advice, a day-by-day itinerary, and a budget breakdown.
The Six Agents
- Currency Converter — calls the Frankfurter API for real-time exchange rates
- Weather Advisor — calls the National Weather Service API for forecasts and packing tips
- Local Knowledge Expert — cultural insights, customs, and hidden gems
- Itinerary Planner — day-by-day scheduling with timing and costs
- Budget Optimizer — allocates spend across categories and suggests savings
- Coordinator — assembles everything into a polished final plan
Four-Phase Workflow
| Phase | Agents | Execution |
|---|---|---|
| 1 — Parallel Gathering | Currency, Weather, Local Knowledge | Task.WhenAll |
| 2 — Itinerary | Itinerary Planner | Sequential (uses Phase 1 context) |
| 3 — Budget | Budget Optimizer | Sequential (uses Phase 2 output) |
| 4 — Assembly | Coordinator | Final synthesis |
Infrastructure
- Azure App Service (P0v4) — hosts the API and a continuous WebJob for background processing
- Azure Service Bus — decouples the API from heavy AI work (async request-reply)
- Azure Cosmos DB — stores task state, results, and per-agent chat histories (24-hour TTL)
- Azure OpenAI (GPT-4o) — powers all agent LLM calls
- Application Insights + Log Analytics — monitoring and diagnostics
ChatClientAgent Deep Dive
At the core of every agent is ChatClientAgent from Microsoft.Agents.AI. It wraps an IChatClient (from Microsoft.Extensions.AI) with instructions, a name, a description, and optionally a set of tools. This is client-side orchestration — you control the chat history, lifecycle, and execution order. No server-side Foundry agent resources are created.
Here's the BaseAgent pattern used by all six agents in the sample:
// BaseAgent.cs — constructor for agents with tools
Agent = new ChatClientAgent(
        chatClient,
        instructions: Instructions,
        name: AgentName,
        description: Description,
        tools: chatOptions.Tools?.ToList())
    .AsBuilder()
    .UseOpenTelemetry(sourceName: AgentName)
    .Build();
Notice the builder pipeline: .AsBuilder().UseOpenTelemetry(...).Build(). This opts every agent into the framework's built-in OpenTelemetry instrumentation with a single line. We'll explore what that telemetry looks like in Blog 2.
Invoking an agent is equally straightforward:
// BaseAgent.cs — InvokeAsync
public async Task<ChatMessage> InvokeAsync(
    IList<ChatMessage> chatHistory,
    CancellationToken cancellationToken = default)
{
    var response = await Agent.RunAsync(
        chatHistory, session: null, options: null, cancellationToken);
    return response.Messages.LastOrDefault()
        ?? new ChatMessage(ChatRole.Assistant, "No response generated.");
}
Key things to note:
- session: null — this is the renamed parameter (was thread in preview). We pass null because we manage chat history ourselves.
- The agent receives the full chatHistory list, so context accumulates across turns.
- Simple agents (Local Knowledge, Itinerary Planner, Budget Optimizer, Coordinator) use the tool-less constructor; agents that call external APIs (Currency, Weather) use the constructor that accepts ChatOptions with tools.
Tool Integration
Two of our agents — Weather Advisor and Currency Converter — call real external APIs through the MAF tool-calling pipeline. Tools are registered using AIFunctionFactory.Create() from Microsoft.Extensions.AI.
Here's how the WeatherAdvisorAgent wires up its tool:
// WeatherAdvisorAgent.cs
private static ChatOptions CreateChatOptions(
    IWeatherService weatherService, ILogger logger)
{
    var chatOptions = new ChatOptions
    {
        Tools = new List<AITool>
        {
            AIFunctionFactory.Create(
                GetWeatherForecastFunction(weatherService, logger))
        }
    };
    return chatOptions;
}
GetWeatherForecastFunction returns a Func<double, double, int, Task<string>> that the model can call with latitude, longitude, and number of days. Under the hood, it hits the National Weather Service API and returns a formatted forecast string. The Currency Converter follows the same pattern with the Frankfurter API.
This is one of the nicest parts of the GA API: you write a plain C# method, wrap it with AIFunctionFactory.Create(), and the framework handles the JSON schema generation, function-call parsing, and response routing automatically.
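As an illustration, a currency tool can be little more than an annotated method. This is a hypothetical sketch of the pattern, not code from the repo — the method name, parameter names, and exact Frankfurter query string are assumptions:

```csharp
// Hypothetical sketch — a plain C# method exposed as a tool.
// The [Description] attributes feed the JSON schema the model sees.
[Description("Gets the current exchange rate between two currencies.")]
public static async Task<string> GetExchangeRateAsync(
    [Description("ISO currency code to convert from, e.g. USD")] string from,
    [Description("ISO currency code to convert to, e.g. EUR")] string to)
{
    using var http = new HttpClient();
    // Query shape is illustrative — check the Frankfurter docs for the
    // exact parameters your version of the API expects.
    var json = await http.GetStringAsync(
        $"https://api.frankfurter.app/latest?from={from}&to={to}");
    return json; // return the raw payload and let the model interpret it
}

// Registration mirrors the weather agent above:
var tools = new List<AITool> { AIFunctionFactory.Create(GetExchangeRateAsync) };
```

Returning a string (rather than a typed object) keeps the tool contract simple; the model is generally good at extracting the rate it needs from a small JSON payload.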
Multi-Phase Workflow Orchestration
The TravelPlanningWorkflow class coordinates all six agents. The key insight is that the orchestration is just C# code — no YAML, no graph DSL, no special runtime. You decide when agents run, what context they receive, and how results flow between phases.
// Phase 1: Parallel Information Gathering
var gatheringTasks = new[]
{
    GatherCurrencyInfoAsync(request, state, progress, cancellationToken),
    GatherWeatherInfoAsync(request, state, progress, cancellationToken),
    GatherLocalKnowledgeAsync(request, state, progress, cancellationToken)
};
await Task.WhenAll(gatheringTasks);
After Phase 1 completes, results are stored in a WorkflowState object — a simple dictionary-backed container that holds per-agent chat histories and contextual data:
// WorkflowState.cs
public Dictionary<string, object> Context { get; set; } = new();
public Dictionary<string, List<ChatMessage>> AgentChatHistories { get; set; } = new();
Phases 2–4 run sequentially, each pulling context from the previous phase. For example, the Itinerary Planner receives weather and local knowledge gathered in Phase 1:
var localKnowledge = state.GetFromContext<string>("LocalKnowledge") ?? "";
var weatherAdvice = state.GetFromContext<string>("WeatherAdvice") ?? "";
var itineraryChatHistory = state.GetChatHistory("ItineraryPlanner");
itineraryChatHistory.Add(new ChatMessage(ChatRole.User,
    $"Create a detailed {days}-day itinerary for {request.Destination}..."
    + $"\n\nWEATHER INFORMATION:\n{weatherAdvice}"
    + $"\n\nLOCAL KNOWLEDGE & TIPS:\n{localKnowledge}"));
var itineraryResponse = await _itineraryAgent.InvokeAsync(
    itineraryChatHistory, cancellationToken);
This pattern — parallel fan-out followed by sequential context enrichment — is simple, testable, and easy to extend. Need a seventh agent? Add it to the appropriate phase and wire it into WorkflowState.
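The GetFromContext and GetChatHistory helpers used above aren't shown in the WorkflowState snippet. A minimal implementation over the two dictionaries might look like this (assumed shape, not copied from the repo):

```csharp
// Sketch of the WorkflowState helpers used in the workflow code above.
public T? GetFromContext<T>(string key) =>
    Context.TryGetValue(key, out var value) && value is T typed
        ? typed
        : default;

public List<ChatMessage> GetChatHistory(string agentName)
{
    // Lazily create a per-agent history so each agent accumulates
    // its own conversation context across phases.
    if (!AgentChatHistories.TryGetValue(agentName, out var history))
    {
        history = new List<ChatMessage>();
        AgentChatHistories[agentName] = history;
    }
    return history;
}
```

Keeping histories per agent (rather than one shared transcript) is what lets each specialist see only the context it needs, with cross-agent data flowing explicitly through Context.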
Async Request-Reply Pattern
A multi-agent workflow with six LLM calls (some with tool invocations) can easily run 30–60 seconds. That's well beyond typical HTTP timeout expectations and not a great user experience for a synchronous request. We use the Async Request-Reply pattern to handle this:
- The API receives the travel plan request and immediately queues a message to Service Bus.
- It stores an initial task record in Cosmos DB with status queued and returns a taskId to the client.
- A continuous WebJob (running as a separate process on the same App Service plan) picks up the message, executes the full multi-agent workflow, and writes the result back to Cosmos DB.
- The client polls the API for status updates until the task reaches completed.
This pattern keeps the API responsive, makes the heavy work retriable (Service Bus handles retries and dead-lettering), and lets the WebJob run independently — you can restart it without affecting the API. We covered this pattern in detail in the previous series, so we won't repeat the plumbing here.
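For orientation, here's a condensed sketch of the API side of the pattern. Endpoint routes, type names, and the task document shape are illustrative, not the repo's actual code:

```csharp
// Illustrative sketch of the enqueue-and-poll API surface.
app.MapPost("/api/travel-plans", async (
    TravelPlanRequest request,
    ServiceBusSender sender,
    Container tasksContainer) =>
{
    var taskId = Guid.NewGuid().ToString();

    // 1. Record the task as queued so the client can start polling immediately.
    await tasksContainer.CreateItemAsync(
        new { id = taskId, status = "queued", request },
        new PartitionKey(taskId));

    // 2. Hand the heavy multi-agent work to the WebJob via Service Bus.
    await sender.SendMessageAsync(new ServiceBusMessage(
        JsonSerializer.Serialize(new { taskId, request })));

    // 3. Return right away — the workflow runs in the background.
    return Results.Accepted($"/api/travel-plans/{taskId}", new { taskId });
});

app.MapGet("/api/travel-plans/{taskId}", async (
    string taskId, Container tasksContainer) =>
{
    // The WebJob updates this document as the workflow progresses
    // (queued → running → completed), so polling is a cheap point read.
    var task = await tasksContainer.ReadItemAsync<JsonElement>(
        taskId, new PartitionKey(taskId));
    return Results.Ok(task.Resource);
});
```

The 202 Accepted response with a Location-style URL is the conventional HTTP shape for this pattern: it tells the client both that the work was accepted and where to poll for the result.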
Deploy with azd
The repo is wired up with the Azure Developer CLI for one-command provisioning and deployment:
git clone https://github.com/seligj95/app-service-multi-agent-maf-otel.git
cd app-service-multi-agent-maf-otel
azd auth login
azd up
azd up provisions the following resources via Bicep:
- Azure App Service (P0v4 Windows) with a continuous WebJob
- Azure Service Bus namespace and queue
- Azure Cosmos DB account, database, and containers
- Azure AI Services (Azure OpenAI with GPT-4o deployment)
- Application Insights and Log Analytics workspace
- Managed Identity with all necessary role assignments
After deployment completes, azd outputs the App Service URL. Open it in your browser, fill in the travel form, and watch six agents collaborate on your trip plan in real time.
What's Next
We now have a production-ready multi-agent app running on App Service with the GA Microsoft Agent Framework. But how do you actually observe what these agents are doing? When six agents are making LLM calls, invoking tools, and passing context between phases — you need visibility into every step.
In the next post, we'll dive deep into how we instrumented these agents with OpenTelemetry and the new Agents (Preview) view in Application Insights — giving you full visibility into agent runs, token usage, tool calls, and model performance. You already saw the .UseOpenTelemetry() call in the builder pipeline; Blog 2 shows what that telemetry looks like end to end and how to light up the new Agents experience in the Azure portal.
Stay tuned!
Resources