A few weeks ago, we announced the public preview refresh of hosted agents in Foundry Agent Service — a fundamentally re-imagined agent runtime built for operationalizing production-grade AI agents in enterprise systems. Today at Microsoft Build, we are excited to share several updates that make hosted agents easier to deploy, more capable across modalities, and seamless to optimize through the agent loop. This article covers what’s new, and what’s coming.
The Problems We Set Out to Solve
Developers who want to take agents to production find themselves managing a list of things that have nothing to do with the agent’s actual intelligence:
- Containerization and infrastructure — building images, pushing to registries, managing versions
- Security and identity — provisioning managed identities, scoping access, preventing cross-session data leakage
- State persistence — storing files, memory, and context across turns without building your own persistence layer
- Scaling — right-sizing compute for variable concurrency without paying for idle capacity
- Observability — knowing what your agent is doing, when it fails, and why
- Evaluation – measuring the quality, safety, and reliability of your agent continuously
Hosted agents in Foundry Agent Service were designed to take all of this off your plate. Each agent session runs in its own hypervisor-isolated sandbox with a dedicated persistent file system, an automatically provisioned Microsoft Entra ID (agent identity), and built-in OpenTelemetry tracing. You bring your code and your framework; the platform handles the rest.
Since the preview launched, we’ve shipped four additional capabilities that expand what you can do with hosted agents.
What’s New
1. Deploy Directly from Source Code — No Container Required
Previously, deploying a hosted agent required packaging your application into a container image, pushing that image to Azure Container Registry, and configuring the agent to run it. This container-based approach gives you full control over the runtime environment, including OS dependencies, system libraries, startup behavior, and how your code is packaged and executed — it continues to be fully supported. However, for many teams, especially during early development, building and managing containers can introduce unnecessary friction. Direct code deployment removes that friction. You zip your Python or .NET project, upload it to the Foundry Agent Service, and the platform either installs your dependencies at provision time (remote_build mode) or runs your pre-bundled output directly (bundled mode).
The result is a significantly shorter path from local development to deployed agent. For developers using the Azure Developer CLI (azd) or the Foundry Toolkit for VS Code, source code deployment is even simpler — the tooling handles packaging, uploading, polling for active status, and configuring role-based access control automatically.
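With azd, deploying the agent takes just two commands, init and deploy; the invoke and monitor commands that follow are optional ways to test and inspect the result: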
# Initialize — configure your agent project for source code deployment
azd ai agent init \
--src ./src/my-agent \
--agent-name my-unique-agent \
--deploy-mode code \
--runtime python_3_13 \
--entry-point main.py \
--dep-resolution remote_build
# Deploy — package, upload, and wait for Active state
azd deploy
# Invoke Agent
azd ai agent invoke "message"
# Inspect recent agent logs (for example, to troubleshoot a failure)
azd ai agent monitor --tail 100
The init command generates the azure.yaml and agent.yaml configuration needed for deployment. The deploy command handles zip packaging, SHA verification, and upload, then polls until the agent reaches the active state, with no manual curl calls, token management, or container registry required.
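To make the packaging step concrete, here is a minimal Python sketch of zipping a project folder and computing a digest of the archive. It is an illustration only: the article does not specify the archive layout or the hash algorithm azd uses, so SHA-256 and the function name `package_agent` are assumptions.

```python
import hashlib
import zipfile
from pathlib import Path


def package_agent(src_dir: str, zip_path: str) -> str:
    """Zip every file under src_dir and return the archive's SHA-256 hex digest.

    Hypothetical helper: azd does this for you, and its real layout and
    hash algorithm may differ. Write the archive outside src_dir so it is
    not picked up while it is being created.
    """
    src = Path(src_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # Sort for a stable entry order; store paths relative to the project root.
        for path in sorted(src.rglob("*")):
            if path.is_file():
                archive.write(path, path.relative_to(src).as_posix())
    return hashlib.sha256(Path(zip_path).read_bytes()).hexdigest()
```

Calling `package_agent("./src/my-agent", "./agent.zip")` would return a digest that a client could compare against the value reported after upload, if the service exposes one.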
Key flags for azd ai agent init:
| Flag | Description |
| --- | --- |
| --src, -s | Path to your agent source code directory |
| --agent-name | Unique agent name (reusing a name creates a new version) |
| --deploy-mode | code (source upload) or container (Docker image) |
| --runtime | python_3_13, python_3_14, or dotnet_10 |
| --entry-point | Application entry point (e.g., main.py, MyAgent.dll) |
| --dep-resolution | remote_build (default) or bundled |
| --project-id, -p | Existing Foundry Project resource ID (skips interactive selection) |
| --model | AI model to use (defaults to gpt-4.1-mini) |
| --protocol | invocations or responses |
For CI/CD pipelines, add --no-prompt for fully non-interactive execution:
azd ai agent init --no-prompt \
--project-id "/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.CognitiveServices/accounts/<account>/projects/<project>" \
--src ./my-agent \
--agent-name my-unique-agent \
--deploy-mode code \
--runtime python_3_13 \
--entry-point main.py
azd deploy
Supported runtimes: python_3_13, python_3_14, and dotnet_10. Runtime support follows each language’s upstream end-of-life schedule.
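If your pipeline is driven from Python rather than a shell script, the same non-interactive call can be assembled as an argument list. The sketch below uses only the flags and values listed in this article; the helper name `build_init_command` and its validation are illustrative assumptions, not part of azd.

```python
SUPPORTED_RUNTIMES = {"python_3_13", "python_3_14", "dotnet_10"}
DEP_RESOLUTION_MODES = {"remote_build", "bundled"}


def build_init_command(project_id, src, agent_name, runtime, entry_point,
                       dep_resolution="remote_build"):
    """Return the argument list for a non-interactive `azd ai agent init`.

    Hypothetical helper: it fails fast on values this article does not list,
    before azd is ever invoked.
    """
    if runtime not in SUPPORTED_RUNTIMES:
        raise ValueError(f"unsupported runtime: {runtime}")
    if dep_resolution not in DEP_RESOLUTION_MODES:
        raise ValueError(f"unknown dependency resolution mode: {dep_resolution}")
    return [
        "azd", "ai", "agent", "init", "--no-prompt",
        "--project-id", project_id,
        "--src", src,
        "--agent-name", agent_name,
        "--deploy-mode", "code",
        "--runtime", runtime,
        "--entry-point", entry_point,
        "--dep-resolution", dep_resolution,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, followed by `subprocess.run(["azd", "deploy"], check=True)`.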
Full guide: Deploy a hosted agent from source code (preview)
2. Built-In Guardrails for Responsible Agentic AI
Production agents interact with real users, and real users sometimes send harmful content. Whether it’s a customer service agent being probed with violent instructions or a coding assistant receiving inappropriate prompts, agents need a safety layer that blocks harmful inputs before they reach agent logic and filters unsafe outputs before they reach end users.
Previously, teams building responsible AI agents had to integrate content filtering themselves: standing up a separate Content Safety in Foundry Control Plane endpoint, writing middleware to intercept requests and responses, handling streaming edge cases, and managing policy configurations across environments. That’s table-stakes safety work that every production agent needs, but no team wants to build from scratch.
Hosted agents now include built-in content safety guardrails powered by Content Safety, integrated directly into the agent runtime. When enabled, every user prompt is evaluated in real time before it reaches your agent code, and every response is also evaluated before it reaches your users.
To create guardrail policies that you can later pass in the hosted agent definition, see How to configure guardrails and controls in Microsoft Foundry on Microsoft Learn.

Content safety guardrails are available today in public preview across all hosted agents regions.
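The order of checks described above, input before agent code and output before the user, can be pictured with a small Python sketch. The platform performs this for you inside the runtime; `is_unsafe` below is a hypothetical stand-in for a safety classifier, not the Content Safety API, and the blocked message is invented for illustration.

```python
from typing import Callable

# Hypothetical placeholder text; the real runtime's response may differ.
BLOCKED_MESSAGE = "This request was blocked by a content safety policy."


def guarded(agent: Callable[[str], str],
            is_unsafe: Callable[[str], bool]) -> Callable[[str], str]:
    """Wrap an agent so both the prompt and the response pass a safety check."""
    def run(prompt: str) -> str:
        if is_unsafe(prompt):        # input check: agent code never sees the prompt
            return BLOCKED_MESSAGE
        response = agent(prompt)
        if is_unsafe(response):      # output check: the user never sees the response
            return BLOCKED_MESSAGE
        return response
    return run
```

With built-in guardrails enabled, none of this wrapper code lives in your agent; the sketch only shows where the two evaluation points sit relative to your logic.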
3. Voice Live Integration and WebSocket Support
Agents that reason and act are valuable. Agents that can also speak and listen in real time open an entirely different class of applications — customer service, accessibility tooling, voice-first interfaces, and more.
For developers building text-based hosted agents, Voice Live integration is now in public preview. It enables real-time voice experiences over both the Responses and Invocations protocols and can be turned on with one click.
For developers building native speech-to-speech hosted agents, hosted agents now support WebSocket and WebRTC for real-time voice scenarios. By combining the Invocations (WebSocket) protocol with frameworks such as Voice Live (as a speech-to-speech model API), Pipecat, or LiveKit within your container, you can deliver fully real-time voice agents — from microphone input to natural speech output, on the same secure and scalable platform that powers your text-based agents.
Leading telephony providers are already leveraging this capability to enable seamless phone-call interactions with hosted agents using speech.
The WebSocket endpoint exposes a persistent bidirectional connection:
wss://{account}.services.ai.azure.com/api/projects/agents/endpoint/protocols/invocations_ws?project_name={project}&agent_name={name}
This is distinct from the Responses and Invocations (HTTP) protocols. A single persistent connection handles both inbound audio and outbound speech synthesis without the overhead of repeated HTTP handshakes.
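As a small illustration, the endpoint above can be assembled in Python with the query parameters properly encoded. Only the URL template comes from this article; the function name is hypothetical, and authentication and message framing are not covered here.

```python
from urllib.parse import urlencode


def invocations_ws_url(account: str, project: str, agent_name: str) -> str:
    """Build the Invocations (WebSocket) endpoint URL from the template above."""
    # urlencode escapes characters that are not safe in a query string.
    query = urlencode({"project_name": project, "agent_name": agent_name})
    return (
        f"wss://{account}.services.ai.azure.com"
        f"/api/projects/agents/endpoint/protocols/invocations_ws?{query}"
    )
```

The resulting string is what a WebSocket client library would connect to, after attaching whatever credentials your project requires.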
The Invocations (WebSocket) protocol is currently only available in North Central US. Support in additional regions is coming soon.
The addition of the Invocations (WebSocket) protocol completes the protocol triad for hosted agents, each relevant for different real-world scenarios:
| Protocol | Use cases | Key characteristics |
| --- | --- | --- |
| Responses | Conversational agents, RAG, Publish to Teams / M365 | Platform manages history, streaming, and session lifecycle; any OpenAI-compatible SDK works as the client |
| Invocations (HTTP) | Webhooks, structured data, custom streaming (AG-UI, etc.) | Arbitrary JSON in/out; you control the schema and SSE stream |
| Invocations (WebSocket) | Real-time voice, bidirectional streaming | Persistent connection; pair with Pipecat, LiveKit, or Voice Live |
A single hosted agent can expose multiple protocols simultaneously. The platform automatically bridges the Responses protocol to the Activity protocol for Teams and Microsoft 365 channel delivery — no additional configuration required. Agent2Agent (A2A) has also been added to allow for agent delegation.
4. Agent Optimizer: A Closed-Loop Improvement Engine
“Live” and “production-ready” aren’t the same thing, and the gap shows up quickly.
Your customer support agent handles requests — but it forgets to ask for an order number before looking up status. It answers warranty questions without checking the purchase date. It gives electrical wiring advice when it should decline and recommend a professional. Each fix means rewriting your system prompt, testing by hand, and hoping you didn’t break something else.
For one agent, that’s manageable. For a team running ten agents across different domains, it’s a bottleneck that doesn’t scale.
Agent optimizer in Foundry Agent Service solves this by automating the agent improvement loop. It evaluates your hosted agent against defined criteria, generates better configurations, and ranks the results so you can deploy the best one — all in a few minutes, with no additional infrastructure to provision.
How It Works
The optimizer runs a closed-loop cycle:
- Evaluate the baseline — your agent processes tasks with pass/fail criteria, producing a composite score from 0.0 to 1.0
- Generate candidates — guided by what failed, the optimizer produces new configurations for your chosen target
- Evaluate candidates — each candidate runs against the same task set
- Rank and recommend — results sorted by score, with per-task breakdowns and token costs visible before you commit
- Deploy the winner — one command promotes the winning configuration as a new versioned deployment
No model retraining. No code changes. The optimizer uses evaluation signals to identify where the agent fell short, then rewrites its instructions to strengthen return policies, escalation procedures, troubleshooting frameworks, and safety boundaries.
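The evaluate, generate, evaluate, rank cycle can be sketched in a few lines of Python. This is a conceptual toy under stated assumptions, not the optimizer's implementation: the score here is a plain pass fraction, and `propose` is a hypothetical stand-in for the candidate generator.

```python
from typing import Callable, List, Tuple

Task = Tuple[str, Callable[[str], bool]]   # (prompt, pass/fail check on the reply)
Agent = Callable[[str, str], str]          # (instructions, prompt) -> reply


def score(agent: Agent, instructions: str, tasks: List[Task]) -> float:
    """Fraction of tasks passed, a value from 0.0 to 1.0."""
    passed = sum(1 for prompt, check in tasks if check(agent(instructions, prompt)))
    return passed / len(tasks)


def optimize(agent: Agent, baseline: str, tasks: List[Task],
             propose: Callable[[str, List[str]], List[str]]) -> List[Tuple[float, str]]:
    """Score the baseline and each proposed candidate; return them ranked best-first."""
    results = [(score(agent, baseline, tasks), baseline)]
    # Failed prompts guide candidate generation.
    failed = [prompt for prompt, check in tasks if not check(agent(baseline, prompt))]
    for candidate in propose(baseline, failed):
        results.append((score(agent, candidate, tasks), candidate))
    return sorted(results, key=lambda item: item[0], reverse=True)
```

The first entry of the returned list plays the role of the recommended configuration; deploying it remains a separate, explicit step.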
Optimization Targets
- Instruction — rewrites your agent’s system prompt to address observed gaps. This is where most teams start.
- Skill — generates reusable, named procedures (escalation steps, troubleshooting sequences) appended to your instructions.
- Model — evaluates your agent across multiple model deployments in a single run, scoring quality/cost trade-offs.
- Tool Descriptions — refines how your agent understands and invokes external tools: when to call each, parameter requirements, fallback behavior.
Solving the Cold-Start Problem
Most teams don’t have evaluation datasets on day one. The eval init command solves this by generating both a dataset and evaluation criteria from your agent’s existing instructions, so no manual test-writing is required.
$ azd ai agent eval init
Eval suite created
Dataset: customer-support (2.0), 15 tasks
Evaluator: customer-support (1)
Evaluator dimensions (6):
Weight Dimension
────── ─────────
10 policy_compliance
6 resolution_accuracy
5 troubleshooting_structure
4 communication_clarity
3 safety_boundaries
5 general_quality
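One plausible way to turn weighted dimensions like these into a single 0.0 to 1.0 score is a weight-normalized pass rate. The article does not state how the service combines the weights, so the formula below is an assumption for illustration; only the dimension names and weights are taken from the output above.

```python
# Weights copied from the sample `eval init` output above.
WEIGHTS = {
    "policy_compliance": 10,
    "resolution_accuracy": 6,
    "troubleshooting_structure": 5,
    "communication_clarity": 4,
    "safety_boundaries": 3,
    "general_quality": 5,
}


def composite_score(passed: dict) -> float:
    """Weight-normalized share of passing dimensions (assumed formula)."""
    total = sum(WEIGHTS.values())
    # Dimensions missing from `passed` count as failed.
    earned = sum(weight for name, weight in WEIGHTS.items() if passed.get(name, False))
    return earned / total
```

Under this assumption, an agent that fails only policy_compliance would score 23/33, which shows how a heavily weighted dimension dominates the result.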
The Full Workflow
azd ai agent init # scaffold your agent
azd deploy # ship to Foundry
azd ai agent eval init # generate evaluation criteria
azd ai agent eval run # score your agent
azd ai agent optimize # improve it
azd ai agent optimize apply --candidate <id>
azd deploy # deploy the optimized agent as a new version
Each promoted candidate becomes a new versioned hosted agent deployment — auditable, rollback-ready, and captured in tracing.
Agent optimizer is currently in private preview, with public preview rolling out in 30 days. Sign up for early access →
The Road to General Availability
Hosted agents are in public preview today, available across 20 Azure regions globally. As we approach General Availability by end of June 2026, our focus areas will include:
- Agent optimizer in Foundry Agent Service public preview: rolling out to all hosted agents regions
- Private ACR in BYO virtual network: enabling container images in private Azure Container Registries to be deployed to hosted agents
- Managed virtual network: extending support for Microsoft-managed virtual networks to hosted agents
- Voice Live / WebSocket coverage: expanding beyond North Central US to additional regions
- Durable, long-running agents: enabling hosted agents to survive container crashes, redeployments, and periods of inactivity with automatic recovery, state persistence across turns, and multi-session context accumulation
Get Started
Developers can get started in minutes by following the quickstart or the code samples (Python · C#), which walk through setting up, testing, and deploying a production-ready hosted agent end to end.
Check out AI Agents for Beginners for a 12-lesson curriculum, then go deeper with guided labs: Develop AI Agents in Azure, Hosted Agents Workshop (.NET), and the ZavaShop Supply Chain Workshop.
Watch: Foundry Agent Service + Microsoft Agent Framework Explained — Jeff Hollan walks through how to operationalize AI agents from deployment to real-world impact.
If you’re attending Microsoft Build 2026, or watching on-demand content later, be sure to check out these sessions:
The post What’s New in Hosted Agents in Foundry Agent Service appeared first on Microsoft Foundry Blog.