Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

Foundry Agent Service is GA: private networking, Voice Live, and enterprise-grade evaluations


The hardest part of shipping production AI agents isn’t the prototype — it’s everything after. Network isolation requirements. Compliance audits. Voice channels your operations team actually wants to use. Evaluations that aren’t just a pre-ship checkbox.

Today’s GA release of the next-gen Foundry Agent Service addresses all of these directly. Here’s what shipped and what it means for your builds.

What’s new

  • Foundry Agent Service (GA): Responses API-based runtime, wire-compatible with OpenAI agents, open model support across Meta, Mistral, DeepSeek, xAI, LangChain, LangGraph, and more
  • End-to-end private networking: BYO VNet with no public egress, extended to cover tool connectivity — MCP servers, Azure AI Search, and Fabric data agents
  • MCP authentication expansion: Key-based, Entra Agent Identity, Managed Identity, and OAuth Identity Passthrough in a single service
  • Voice Live (preview) + Foundry Agents: Real-time speech-to-speech, fully managed, wired natively to your agent’s prompt, tools, and tracing
  • Evaluations (GA): Out-of-the-box evaluators, custom evaluators, and continuous production monitoring piped into Azure Monitor
  • Hosted agents (preview) in six new Azure regions: East US, North Central US, Sweden Central, Southeast Asia, Japan East, and more

Foundry Agent Service GA: built on the Responses API

The next-gen Foundry Agent Service is built on the OpenAI Responses API — the same agentic wire protocol developers are already building on. If you’re building with the Responses API today, migrating to Foundry requires only minimal code changes. What you gain immediately: Foundry’s enterprise security layer, private networking, Entra RBAC, full tracing, and evaluation — on top of your existing agent logic.

The architecture is intentionally open. You’re not locked to a single model provider or orchestration framework. Use a Llama model for planning, an OpenAI model for generation, LangGraph for orchestration — the runtime handles the consistency layer. Agents, tools, and the surrounding infrastructure all speak the same protocol.

import os
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient
from azure.ai.projects.models import PromptAgentDefinition

with (
    DefaultAzureCredential() as credential,
    AIProjectClient(endpoint=os.environ["AZURE_AI_PROJECT_ENDPOINT"], credential=credential) as project_client,
    project_client.get_openai_client() as openai_client,
):
    agent = project_client.agents.create_version(
        agent_name="my-enterprise-agent",
        definition=PromptAgentDefinition(
            model=os.environ["AZURE_AI_MODEL_DEPLOYMENT_NAME"],
            instructions="You are a helpful assistant.",
        ),
    )

    conversation = openai_client.conversations.create()

    response = openai_client.responses.create(
        conversation=conversation.id,
        input="What are best practices for building AI agents?",
        extra_body={"agent_reference": {"name": agent.name, "type": "agent_reference"}},
    )
    print(response.output_text)

Note: If you’re coming from the azure-ai-agents package, agents are now first-class operations on AIProjectClient in azure-ai-projects. Remove your standalone azure-ai-agents pin and use get_openai_client() to drive responses.


End-to-end private networking

Unmanaged network paths are a showstopper for enterprises operating under data classification policies that prohibit external routing of query content or retrieved documents. Every retrieval call, every tool invocation, every model round-trip is a potential exposure vector if it crosses the public internet.

Foundry Agent Service now supports Standard Setup with private networking, where you bring your own virtual network (BYO VNet):

  • No public egress — agent traffic never traverses the public internet
  • Container/subnet injection into your network for local communication to Azure resources
  • Access to private resources via the platform network with appropriate authorization

More importantly, private networking is extended to tool connectivity. MCP servers, Azure AI Search indexes, and Fabric data agents can all operate over private network paths — so retrieval and action surfaces sit inside your network boundary, not just inference calls.


MCP authentication: the full spectrum

MCP as a connection primitive is only as secure as its auth model. Enterprise MCP deployments span org-wide shared services, user-delegated access, and service-to-service connections — and they need different auth patterns for each.

Foundry now supports the full spectrum for MCP server connections:

Auth methods and when to use them:

  • Key-based: simple shared access for org-wide internal tools
  • Entra Agent Identity: service-to-service; the agent authenticates as itself
  • Entra Foundry Project Managed Identity: per-project permission isolation; no credential management overhead
  • OAuth Identity Passthrough: user-delegated access; the user authenticates to the MCP server and grants the agent their credentials

OAuth Identity Passthrough is the one worth calling out. When users need to grant an agent access to their personal data or permissions — their OneDrive, their Salesforce org, a SaaS API that scopes by user — the agent should act on their behalf, not as a shared system identity. Passthrough enables exactly that with standard OAuth flows.
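The list above reads naturally as a small decision function. The scenario names below are this post's framing, not SDK identifiers; in the SDK, the choice is expressed through the connection you attach to the MCP tool:

```python
# Illustrative only: a tiny helper encoding the auth decision guide above.
# The scenario keys are made up for this sketch, not Foundry API values.

AUTH_GUIDE = {
    "org_shared_tool": "Key-based",
    "service_to_service": "Entra Agent Identity",
    "per_project_isolation": "Entra Foundry Project Managed Identity",
    "user_delegated": "OAuth Identity Passthrough",
}

def pick_mcp_auth(scenario: str) -> str:
    """Return the recommended MCP auth method for a scenario."""
    try:
        return AUTH_GUIDE[scenario]
    except KeyError:
        raise ValueError(f"Unknown scenario: {scenario!r}")

print(pick_mcp_auth("user_delegated"))  # OAuth Identity Passthrough
```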

For key-based auth, add a Custom Keys connection in your Foundry project with an Authorization: Bearer <token> header, then reference it via project_connection_id:

import os

from azure.ai.projects.models import MCPTool, PromptAgentDefinition

# project_connection_id: resource ID of a Custom Keys connection
# storing Authorization: Bearer <your-pat-token>
tool = MCPTool(
    server_label="github-api",
    server_url="https://api.githubcopilot.com/mcp",
    require_approval="always",
    project_connection_id=os.environ["MCP_PROJECT_CONNECTION_ID"],
)

agent = project_client.agents.create_version(
    agent_name="my-mcp-agent",
    definition=PromptAgentDefinition(
        model=os.environ["AZURE_AI_MODEL_DEPLOYMENT_NAME"],
        instructions="Use MCP tools as needed.",
        tools=[tool],
    ),
)

Voice Live (preview): a managed speech channel for your agents

Adding voice to an agent used to mean stitching together three separate services (STT, LLM, TTS) — three latency hops, three billing surfaces, three failure modes, all synchronized by hand. Voice Live is a fully managed, real-time speech-to-speech runtime that collapses that into a single managed API.

What Voice Live handles:

  • Semantic voice activity detection — knows when you’ve stopped speaking based on meaning, not just silence or audio level
  • Semantic end-of-turn detection — understands conversational context to determine when the agent should respond
  • Server-side noise suppression and echo cancellation — no post-processing pipeline required
  • Barge-in support — users can interrupt mid-response

With this integration, you connect Voice Live directly to an existing Foundry agent. The agent’s prompt, tool definitions, and configuration are managed in Foundry; Voice Live handles the audio pipeline. Voice interactions go through the same agent runtime as text — which means the same evaluators, the same traces, the same cost visibility. Voice doesn’t get a second-class observability story.

For customer support, field service, accessibility, and any hands-free workflow where spoken dialogue is the primary interface, this replaces what previously required a custom audio pipeline.

Connecting Voice Live to a Foundry agent uses AgentSessionConfig at connection time — point it at an agent name and project, and the session is immediately voice-enabled:

import asyncio
import os

from azure.ai.voicelive.aio import connect, AgentSessionConfig
from azure.identity.aio import DefaultAzureCredential

async def run():
    agent_config: AgentSessionConfig = {
        "agent_name": "my-enterprise-agent",
        "project_name": "my-foundry-project",
        # "agent_version": "v1",  # optional — defaults to latest
    }

    async with DefaultAzureCredential() as credential:
        async with connect(
            endpoint=os.environ["AZURE_VOICELIVE_ENDPOINT"],
            credential=credential,
            agent_config=agent_config,
        ) as connection:
            # Optionally update session settings (modalities, voice, VAD,
            # echo cancellation) by passing your own session config here:
            # await connection.session.update(session=session_config)
            # Process audio events
            async for event in connection:
                ...

asyncio.run(run())

The agent’s prompt, tool definitions, and safety configuration stay in Foundry. Voice Live owns the audio I/O. The full working sample — including audio capture/playback via PyAudio and interrupt handling — is in the SDK repo.


Evaluations: GA with continuous production monitoring

Running a test suite before shipping is not a production quality strategy — it’s a snapshot. Quality degrades in production as traffic patterns shift, retrieved documents go stale, and new edge cases emerge that never appeared in your eval dataset.

Foundry Evaluations are now generally available with three layers that together enable a proper quality lifecycle:

Out-of-the-box evaluators cover the standard RAG and generation scenarios: coherence, relevance, groundedness, retrieval quality, and safety. No custom configuration required — connect them to a dataset or live traffic and get quantitative scores back.

Custom evaluators let you encode your own criteria: business logic, internal tone standards, domain-specific compliance rules, or any quality signal that doesn’t map cleanly to a general evaluator.

Continuous evaluation closes the production loop. Foundry samples live traffic automatically, runs your evaluator suite against it, and surfaces results through integrated dashboards. Configure Azure Monitor alerts to fire when groundedness drops, safety thresholds breach, or performance degrades — before users notice.
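The monitoring pattern is worth sketching in plain Python to make it concrete. This is illustrative only: in production, Foundry performs the sampling and Azure Monitor fires the alerts, and the 0.7 groundedness floor and score shape here are assumed values, not platform defaults.

```python
# Illustrative sketch of the continuous-evaluation pattern: score a rolling
# window of sampled traffic and raise an alert when a metric crosses a
# threshold. Not the Foundry API; the 0.7 floor is an assumed value.

GROUNDEDNESS_FLOOR = 0.7

def rolling_mean(scores: list[float], window: int = 50) -> float:
    """Mean of the most recent `window` scores."""
    recent = scores[-window:]
    return sum(recent) / len(recent)

def check_alerts(scores: list[float]) -> list[str]:
    """Return alert messages for any breached thresholds."""
    alerts = []
    if rolling_mean(scores) < GROUNDEDNESS_FLOOR:
        alerts.append(f"groundedness below {GROUNDEDNESS_FLOOR}")
    return alerts

# Healthy traffic: no alerts. Drifting scores: the alert fires.
print(check_alerts([0.9] * 50))               # []
print(check_alerts([0.9] * 10 + [0.4] * 40))  # ['groundedness below 0.7']
```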

All evaluation results, traces, and red-teaming runs publish to Azure Monitor Application Insights. You get full-stack observability that spans agent quality, infrastructure health, cost, and traditional app telemetry in one place.

Evaluations in azure-ai-projects run through the OpenAI-compatible evals API on AIProjectClient. The pattern: define the schema and evaluators in openai_client.evals.create(), then run against an agent target with openai_client.evals.runs.create().

import os, time
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient
from azure.ai.projects.models import PromptAgentDefinition
from openai.types.eval_create_params import DataSourceConfigCustom

with (
    DefaultAzureCredential() as credential,
    AIProjectClient(endpoint=os.environ["AZURE_AI_PROJECT_ENDPOINT"], credential=credential) as project_client,
    project_client.get_openai_client() as openai_client,
):
    agent = project_client.agents.create_version(
        agent_name=os.environ["AZURE_AI_AGENT_NAME"],
        definition=PromptAgentDefinition(
            model=os.environ["AZURE_AI_MODEL_DEPLOYMENT_NAME"],
            instructions="You are a helpful assistant.",
        ),
    )

    eval_object = openai_client.evals.create(
        name="Agent Quality Evaluation",
        data_source_config=DataSourceConfigCustom(
            type="custom",
            item_schema={"type": "object", "properties": {"query": {"type": "string"}}, "required": ["query"]},
            include_sample_schema=True,
        ),
        testing_criteria=[
            {
                "type": "azure_ai_evaluator",
                "name": "fluency",
                "evaluator_name": "builtin.fluency",
                "initialization_parameters": {"deployment_name": os.environ["AZURE_AI_MODEL_DEPLOYMENT_NAME"]},
                "data_mapping": {"query": "{{item.query}}", "response": "{{sample.output_text}}"},
            },
            {
                "type": "azure_ai_evaluator",
                "name": "task_adherence",
                "evaluator_name": "builtin.task_adherence",
                "initialization_parameters": {"deployment_name": os.environ["AZURE_AI_MODEL_DEPLOYMENT_NAME"]},
                "data_mapping": {"query": "{{item.query}}", "response": "{{sample.output_items}}"},
            },
        ],
    )

    run = openai_client.evals.runs.create(
        eval_id=eval_object.id,
        name=f"Run for {agent.name}",
        data_source={
            "type": "azure_ai_target_completions",
            "source": {
                "type": "file_content",
                "content": [{"item": {"query": "What is the capital of France?"}},
                             {"item": {"query": "How do I reverse a string in Python?"}}],
            },
            "input_messages": {
                "type": "template",
                "template": [{"type": "message", "role": "user",
                               "content": {"type": "input_text", "text": "{{item.query}}"}}],
            },
            "target": {"type": "azure_ai_agent", "name": agent.name, "version": agent.version},
        },
    )

    while run.status not in ["completed", "failed"]:
        run = openai_client.evals.runs.retrieve(run_id=run.id, eval_id=eval_object.id)
        time.sleep(5)

    print(f"Status: {run.status}, Results: {run.result_counts}")

Hosted agents (preview) in six new regions

Hosted agents — containerized agent code running as managed services on Foundry Agent Service — are now available in six additional Azure regions: East US, North Central US, Sweden Central, Southeast Asia, Japan East, and more.

This is relevant for two concrete scenarios: data residency requirements that mandate processing stays within a geographic boundary, and latency that compresses when your agent runs close to its data sources and users. Foundry handles container orchestration, scaling, networking, and endpoint management — you own the agent behavior and business logic.


Learn more

For a hands-on walkthrough of the Foundry Agent Service capabilities, watch the session below — covering building a basic conversational agent, adding custom skills, grounding with documents, code execution, real-time internet access, connecting to external servers via MCP, and combining multiple tools:


Get started

The next-gen Foundry Agent Service is available now. Install the SDK, open the portal, and go:

pip install azure-ai-projects azure-identity

The Foundry portal has an updated agents experience with visual workflow building, a unified Tools tab for MCP, A2A, and Azure AI Search connections, and the separated v1/v2 resource view. If you’re coming from Foundry Classic, the new experience is the default.

For a hands-on introduction, the agents quickstart takes you from zero to a running, tool-using agent in a few minutes.

The post Foundry Agent Service is GA: private networking, Voice Live, and enterprise-grade evaluations appeared first on Microsoft Foundry Blog.

Read the whole story
alvinashcraft
3 minutes ago
reply
Pennsylvania, USA
Share this story
Delete

Securing our codebase with autonomous agents

1 Share
Cursor's security team built a fleet of security agents to find and fix vulnerabilities across a fast-changing codebase.

Microsoft Outlook and 365 Hit by Widespread Outages, Users Report Login and Email Failures


Microsoft is investigating issues with Outlook and Microsoft 365 after users report login failures and missing emails, alongside known bugs in classic Outlook.

The post Microsoft Outlook and 365 Hit by Widespread Outages, Users Report Login and Email Failures appeared first on TechRepublic.


Android 17 Leaks Reveal Major Redesign, AI Features, and Privacy Upgrades


Android 17 beta is here. Here’s what is confirmed so far, what leaks suggest, and which rumored features may arrive later in 2026.

The post Android 17 Leaks Reveal Major Redesign, AI Features, and Privacy Upgrades appeared first on TechRepublic.


Microsoft teases image support in Notepad for Windows 11 ahead of roll out


Windows Latest previously exclusively reported that Microsoft is internally testing image support in Notepad for Windows 11. A month later, Notepad’s image integration is now being quietly teased in an email sent to the Windows Insiders ahead of the rollout. But it’s unclear when the update will roll out to everyone.

Notepad image support teased

In an email spotted by Windows Latest, there’s a screenshot of an unreleased Notepad version, and it has a toggle to insert images.

We don’t know how the feature works, but you should be able to insert multiple images in Notepad. It’ll be similar to how WordPad handled images.

Notepad image insert

Microsoft sources have already told Windows Latest that Notepad’s image support is real and has been in the works for the past few months. Some of you might argue that Notepad is a text editor and it doesn’t need image support, which is a fair point. However, it’s part of the company’s efforts to fill the gap left by WordPad.

Like the rest of the formatting options, Windows Latest has learned that Notepad’s image support does not use a lot of resources, and it’s barely noticeable in most cases.

Microsoft is bridging the gap between Notepad and WordPad at the cost of Notepad’s simplicity

Microsoft has always maintained multiple text editors on Windows, including MS Word, WordPad, and Notepad.

While Microsoft Word is a flagship, paid product, Notepad has always been a simple text editor that only allows you to type text. This changed after Microsoft retired WordPad, as the company decided to bring WordPad-like text formatting to Notepad.

Notepad is no longer a simple text editor, and it’s always getting new features, including full-fledged support for markdown.

Insert a table in Notepad

Microsoft argues that markdown in Notepad is lightweight, and it allows applying text formatting, such as italics, underline, bold, links, and even creating a table.

Notepad also has other formatting syntax, including strikethrough and nested lists, so you can make cleaner notes without switching to another app. You can use the toolbar, keyboard shortcuts, or just type the Markdown symbols directly.

Finally, tables are available more widely after a recent update. You can insert a table from the toolbar and pick the size using a small grid, then type inside the cells like normal text.

Because it’s Markdown-style, the table is still stored as plain text with separators, which keeps it lightweight and easy to edit even in a basic .txt file.
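That plain-text claim is easy to demonstrate: a Markdown-style table is just pipe-separated lines, so it survives intact in any .txt file. A generic sketch of the format, not Notepad’s actual implementation:

```python
# A Markdown-style table is plain text with pipe separators, which is why
# it stays lightweight and editable in a basic .txt file. Generic sketch,
# not Notepad's actual implementation.

def make_table(headers: list[str], rows: list[list[str]]) -> str:
    """Build a Markdown table string from headers and rows."""
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(row) + " |")
    return "\n".join(lines)

table = make_table(["Feature", "Status"], [["Tables", "Rolling out"], ["Images", "Teased"]])
print(table)
```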

Another change is how Notepad handles AI text tools like Write, Rewrite, and Summarize. Instead of waiting for the full result to finish, Notepad now starts showing text sooner, line by line, similar to ChatGPT.

Notepad text streaming

As always, if you dislike one of these features, you can turn them off from Settings in Notepad.

The post Microsoft teases image support in Notepad for Windows 11 ahead of roll out appeared first on Windows Latest


Microsoft at NVIDIA GTC: New solutions for Microsoft Foundry, Azure AI infrastructure and Physical AI


Microsoft combines accelerated computing with cloud scale engineering to bring advanced AI capabilities to our customers. For years, we’ve worked with NVIDIA to integrate hardware, software and infrastructure to power many of today’s most important AI breakthroughs.

What’s new at NVIDIA GTC

  • Expanded Microsoft Foundry capabilities to build, deploy and operate production-ready AI agents on NVIDIA accelerators and open NVIDIA Nemotron models
  • New Azure AI infrastructure optimized for inference-heavy, reasoning-based workloads, including the first hyperscale cloud to power on next-generation NVIDIA Vera Rubin NVL72 systems
  • Deeper integration across Microsoft Foundry, Microsoft Fabric and NVIDIA Omniverse libraries and open frameworks to support Physical AI systems from simulation to real‑world operations

From Frontier models to production-ready agents

At the foundation of this system is Microsoft Foundry: serving as the operating system for building, deploying and operating AI at enterprise scale. Foundry builds on Azure to bring together models, tools, data and observability into a single system designed for production agents. Today we’re expanding those capabilities across Foundry Agent Service and NVIDIA Nemotron models.

The next-generation Foundry Agent Service and Observability in Foundry Control Plane are now generally available, enabling organizations to build and operate AI agents at production scale. Foundry Agent Service allows teams to quickly develop agents that reason, plan and act across tools, data and workflows. Once created, Foundry Control Plane provides the developer end-to-end visibility into agent behavior, unlocking both developer productivity as well as enterprise trust. Companies such as Corvus Energy are already using Foundry to replace manual inspection workflows with agent-driven operational intelligence across their global fleet.

We are further simplifying the path from prototype to production with the availability of Voice Live API integration with Foundry Agent Service, in public preview, which enables developers to build voice-first, multimodal, real-time agentic experiences. This pairs with the general availability of a refreshed Microsoft Foundry portal and expanded integrations for Palo Alto Networks’ Prisma AIRS and Zenity, delivering deeper builder experiences and runtime security across the entire agent lifecycle.

NVIDIA Nemotron models are also now available through Microsoft Foundry, joining the widest selection of models on any cloud, including the latest reasoning, frontier and open models. This bolsters our recent partnership announcement bringing Fireworks AI to Microsoft Foundry, enabling customers to fine-tune open-weight models like NVIDIA Nemotron into low-latency assets that can be distributed to the edge.

Scaling AI infrastructure for the world’s most demanding workloads

Inference AI workloads are reshaping cost, performance and system design requirements. To operationalize agentic AI at scale, customers need purpose-built infrastructure for inference‑heavy, reasoning‑based workloads that can be deployed and operated consistently across global and regulated environments.

Microsoft’s AI infrastructure approach is engineered to seamlessly bring next-generation NVIDIA systems into Azure datacenters that are designed for power, cooling, networking and rapid generational upgrades. This allows our customers to move with speed and agility and stay at the leading edge from generation to generation.

In less than a year, we’ve deployed hundreds of thousands of liquid-cooled Grace Blackwell GPUs across our global datacenter footprint, and now we are excited to be the first hyperscale cloud to power on NVIDIA’s newest Vera Rubin NVL72 in our labs. Over the next few months, Vera Rubin NVL72 will be rolled out into our modern, liquid-cooled Azure datacenters.

Microsoft’s infrastructure innovation with NVIDIA also extends to sovereign and regulated environments to give customers control of both where AI runs and how it evolves over time. Recently, we announced Foundry Local support for modern infrastructure and large AI models, and today we now have initial support for NVIDIA Vera Rubin platform on Azure Local, extending accelerated AI capabilities to customer-controlled environments. This approach allows organizations to plan for next-generation AI workloads, including reasoning-based and agentic systems, while maintaining Azure-consistent operations, governance and security through our unified software layer with Azure Arc and Foundry Local.

YouTube Video

Bringing AI into the physical world

As AI moves beyond digital experiences, Microsoft and NVIDIA are collaborating to support the next wave of Physical AI. At GTC, this work centers on NVIDIA Physical AI Data Factory Blueprint, with Microsoft Foundry as the platform for hosting and operating Physical AI systems on Azure at cloud scale.

By integrating this blueprint with Azure services as part of a Physical AI Toolchain, Microsoft enables developers to build, train and operate physical AI and robotics workflows that connect physical assets, simulation and cloud training environments into repeatable, enterprise-grade pipelines. To support this effort, we are introducing a public Azure Physical AI Toolchain GitHub repository integrated with the NVIDIA Physical AI Data Factory and with core Azure services.

To further the impact of AI in real‑world, physical environments, today Microsoft and NVIDIA are deepening the integration between Microsoft Fabric and NVIDIA Omniverse libraries, connecting live operational data with physically accurate digital twins and simulation. This allows organizations to see what’s happening across their physical systems, understand it in real time and use AI to decide what to do next. In practice, customers in manufacturing and operations and beyond are using this approach to move beyond dashboards and alerts to coordinated, AI‑driven action across machines, facilities and workflows.

From innovation to impact

Microsoft is delivering reliable, production‑scale AI by bringing together its global AI infrastructure, platforms and real‑world systems with the latest innovation from NVIDIA. For customers, this means the ability to operate intelligence continuously, running inference-heavy, reasoning-based and physical AI workloads with the performance, security and governance required for real businesses and regulated industries.

Whether powering always-on agents, scaling next-generation AI infrastructure or deploying intelligent systems in factories, energy facilities and sovereign environments, Microsoft and NVIDIA are helping customers move faster from insight to action.

Yina Arenas leads product strategy and execution for Microsoft Foundry, overseeing the end-to-end AI product portfolio, infrastructure, developer experiences and foundation model integration across OpenAI, Anthropic, Mistral, DeepSeek and others. She delivers an enterprise-ready, production-grade AI platform trusted by global customers for secure, reliable and scalable AI.

The post Microsoft at NVIDIA GTC: New solutions for Microsoft Foundry, Azure AI infrastructure and Physical AI appeared first on The Official Microsoft Blog.
