AI has arrived in the enterprise, and the shift is happening all at once. Every function, every role, every workflow is being reshaped. At the same time, a new class of organizations is emerging, one that will look fundamentally different from the companies that defined the last era of business. The winners won’t be those with the most demos, but those that turn AI into a governed, continuously improving system for running real work.
This isn’t just about chatbots, either. Those experiences are useful, but they don’t transform how large organizations operate. The real opportunity is teams of agents executing long running work across functions like software delivery, support, finance, HR, and operations — with the identity, context, policy, and human oversight required to trust them in production.
To make this possible, enterprises need more than access to a powerful AI model or scalable compute. What determines success is the system around the AI: how agents are built and deployed by engineering teams, how they’re contextualized in the enterprise, how they’re governed and observed in production, and how they improve safely over time. Without that system, AI remains fragmented, fragile, and difficult to trust at scale.
We’re taking a fundamentally different approach. We are building a comprehensive agent platform: one that supports many models, is open, and gives you choice and flexibility at every layer of the stack. And we are purposefully designing it with developers at the center. Today, the next pieces of that platform are clicking into place.
Building a system for the agentic enterprise
To succeed in this new era, an agent platform must meet a higher bar. It must run real production workloads, map real organizational complexity, and manage real business responsibility.
We’re building around three key principles:
First, it must be a single, integrated system, with support for a wide range of models.
Enterprises can’t afford to assemble their agent strategy one piece at a time. Disconnected tools stitched together after the fact can slow teams down and introduce unnecessary risk. Building, contextualizing, running, governing, and improving agents should happen within one coherent system. That’s why we’re bringing together Azure, GitHub, Microsoft IQ, Fabric, Foundry, Windows, Microsoft Security, and Microsoft 365 to operate as a single system you can use to deploy agents at enterprise scale. Enterprises also need the flexibility to choose the right model for the task, balancing quality, speed, and cost — including Microsoft models, partner models, and open models.
Second, it must be secured and governed by design.
Governance is easy to claim and much harder to deliver. Making it real means starting with a single stack that spans development through production, built on the identity, access, compliance, and security foundations enterprises already trust. By extending Entra, Purview, Defender, Agent 365, and the broader Microsoft Security stack, governance becomes native to the system rather than bolted on later, supporting the ambitions of an AI first enterprise without compromising control.
Third, it must improve continuously.
Enterprise AI systems can’t be static. Agent behavior, outcomes, and human feedback must flow back into the system, so it can improve safely over time under human oversight. As the system runs, models, workflows, and agents become more capable and more specific to an enterprise’s unique business processes. The result is a system that compounds in value the longer it’s in use.
These properties are becoming must-haves, and enterprises that align their AI ambitions with these three principles will pull ahead in quarters, not years.
So how does a system like this actually take shape inside a real enterprise? It starts where work begins, with how agents are built. Let’s walk through what that looks like on the platform we’ve built.

1. Build in GitHub
GitHub is where your developers already work. It’s where your dependencies live, where your application and code context is kept, where you collaborate with the open source community you depend on, and where you drive innovation. Building agents anywhere else means leaving all that behind.
Agents should be built the same way production software is built. You write code with GitHub Copilot to move faster. You bring together the assets that matter most: codebases, work items, agent skills, and tools. And because agents aren’t just code, you bring your evals and observability assets alongside them, all versioned the way any production system should be.
Agents must follow a lifecycle: source, test, deploy, observe, and improve. GitHub sets up that lifecycle and provides the necessary controls from day one. The result is a workflow designed for building agents with the right guardrails from the start. And you can do all this in one place, in a new app built for this system.
2. Contextualize with Microsoft IQ
Code is only part of an agent. To be useful, an agent also has to understand your business: your customers, your products, your contracts, your processes. Without enterprise context and intelligence you can trust, even the most capable model is guessing.
Enterprises require a wide variety of models and the ability to match the right model to the right job, but model choice alone is not enough. Microsoft IQ grounds agents in enterprise context by connecting to your business data wherever it lives, across Microsoft 365, your core business systems (such as customer and revenue data), and other systems your enterprise already relies on, like knowledge bases and your website. With Web IQ, the latest addition to the IQ platform, agents can also incorporate relevant information from the web when appropriate.
Contextualizing agents in enterprise data isn’t just about access. Pointing AI at raw information is inefficient and brittle. Microsoft IQ organizes, secures, and surfaces the right information in forms agents can actually use, so they can reach accurate insight without drowning in noise or hallucinating answers.
Once agents are grounded in the right context, enterprises can go further. With Frontier Tuning, you don’t just call AI models. You improve how they behave using your data and real-world workflows.
That includes Microsoft’s seven new MAI models, spanning image, voice, transcription, coding, and reasoning. Together, this model family is designed to work across the kinds of tasks that matter in the real world, and critically, these models are not static endpoints. They’re built to learn from how work actually gets done in your business.
Our reinforcement learning environments allow our models to be reinforced through actual outcomes in your environment. Think of them as training gyms for AI. Here the agent learns your very specific processes, standards, and way of working. It becomes specialized and adapted to you, delivering a measurable and better ROI.
Moreover, your custom or post-trained models all stay in your environment. Your intellectual property, your proprietary data, and the way work actually gets done become part of how your agents reason and act. The resulting intelligence runs in your environment, under your control, and the learning stays yours.
Without context and Frontier Tuning, agents are capable generalists. With it, they become a customized partner that understands the business they’re operating in.
3. Run in Foundry
Once agents are built and contextualized, they need a place to run. Not as an experiment. In production.
Agents and teams of agents place very different demands on a runtime than traditional applications do. They need to reason, act, call tools, coordinate with other agents, and adapt over time, all while operating under enterprise controls. Foundry is the runtime designed for that reality.
- The largest collection of models: Different agents need to be good at different things at different price points. Whatever the task, whatever the cost profile, Foundry provides access to the right model, and an optimized model router helps you balance quality, speed, and cost for each agent.
- Optimized performance for open models: With Fireworks AI on Foundry, enterprises get faster, more efficient inference directly into the platform.
- Support for any agent, including those not built on our stack: Bring in agents built on the Microsoft Agent Framework, LangGraph, GitHub Copilot SDK, Claude Agent SDK, or a custom harness.
- Tools and actions: Agents act on enterprise systems through MCP, connectors, APIs, and workflows, with safe execution by default.
- Evals and traces: Observability and traces make agent behavior measurable. If you can’t measure it, you can’t improve it.
- Continuous optimization: Foundry enables tuning of models, harnesses, IQs, tools, and actions over time, improving performance as agents operate in your world.
A trust, security, and policy rail wraps the entire runtime. Policy applies consistently across context access, tool calls, optimization updates, traces, and response delivery. The agent doesn’t just work. It works the way your enterprise requires.
This is where your agent stops being a project and starts becoming a production system.
4. Govern with Agent 365
Now multiply that agent by hundreds. Then thousands. That’s what happens as different teams build agents across an enterprise. Some are well designed. Some aren’t. Some have access they shouldn’t. Others are doing valuable work that no one else in the organization benefits from.
Enterprise governance isn’t optional. Enterprises need a way to see what’s running, understand what it can access, monitor task adherence, and enforce policies across their entire agent estate.
Agent 365, along with Entra, Purview, Defender, and the broader Microsoft Security stack, come together to do just this. And if you’re interested in AI for security in addition to securing your AI, there’s “MDASH.”
Every agent in your organization shows up in a single catalog, whether it was built in Foundry or elsewhere. IT sees who deployed an agent, what data and tools it can access, how it’s behaving, and what it costs. They can enforce policy or take action when required.
One place. Full visibility. Real control over what your agents do and don’t do.
5. Improve continuously
Agents can’t be static. Every agent action generates signal: trajectories, outcomes, feedback. The system captures it, refines it, and feeds it back. Observe. Evaluate. Improve. Roll out safely. Repeat.
This learning loop runs continuously, in production.
Most gains start with eval-driven improvements to the agent itself: prompts, context, skills, and tools. As clear patterns emerge, learning can extend into model routing across multiple models, fine-tuning, or reinforcement learning. But it all stays anchored in evaluation, improving agent quality and ROI to the level the business requires.
The loop is governed, not closed. Enterprises need to audit it, correct it, and control how to roll out changes. The system becomes more capable over time, guided by human oversight and increasingly autonomous, but never beyond your reach.
This is the hill-climbing model in action: system-level improvement, happening continuously while the system runs.
6. Surface where people work, and scale on Azure
Of course, none of this matters if it doesn’t reach the people doing the work.
Agents surface directly in the flow of work, in Teams, across Microsoft 365, and inside your own applications and experiences. Identity, security, and compliance are built in from the start, so the agents that your teams rely on day to day inherit the same trust model as the rest of your environment.
We support multiple platforms, but your agents can be developed and run in an optimized and secure way on Windows. You can run models both in the cloud and locally on your machine, and best-in-class sandboxing lets you run always-on agents safely.
When you need compute optimized for AI, global and sovereign infrastructure, or a route to market, the system scales on Azure, the same enterprise foundation customers have trusted for decades.
The system compounds
Every leading enterprise will converge on this model: a central AI platform that orchestrates work across the business, bringing together data, models, agents, and human judgment into a continuously improving and secure system.
As that system runs, its value compounds. Velocity increases and the bottleneck shifts from effort to human creativity and coordination. People are able to do more work independently, guided by shared context and fewer handoffs, while the business moves faster without adding friction.
We’re in a time of profound disruption. The enterprises that lead in this moment will be those that adapt as conditions change, simplify how work is coordinated across the business, and consistently turn intelligence into real outcomes. Microsoft’s agent platform is designed to do exactly that: it unlocks the ability to build, contextualize, run, govern, and improve agents as a single, integrated system.
At that point, the platform becomes more than a build layer. It becomes the operating system for enterprise AI at scale, where intelligence and trust are built in by design.
The post AI alone won’t change your business. The system running it will. appeared first on The Official Microsoft Blog.