We're building three interfaces for Azure SRE Agent: an interactive CLI for humans at a terminal, an agent mode for coding agents that spawn it as a subprocess, and an MCP server for humans inside coding agents and for remote agents in other ecosystems.
The CLI and agent mode are coming. The MCP server ships first. That ordering wasn't obvious at first, and this post is about why we landed there.
Three interfaces, one question: who's actually calling this?
When we started designing the Azure SRE Agent CLI, the question we kept running into was deceptively simple: who's the caller?
A human at 2 AM during an incident? Yes. But also a coding agent mid-session that wants SRE Agent capabilities without leaving VS Code. And a PagerDuty SRE agent running an automated triage loop with no human in the picture. And another Azure SRE Agent instance that wants to delegate a sub-task.
Four callers. Same backend. None of them want the same thing.
The three interfaces map to these callers:
- Interactive CLI: humans at a terminal, terse and incident-optimized
- Agent Mode: coding agents like Copilot CLI that spawn it as a subprocess
- MCP Server: humans inside coding agents, and remote agents in other ecosystems
The MCP server is the one shipping now. Here's why.
The CLI requires someone to reach for it
The interactive CLI and agent mode have something in common: the caller has to know Azure SRE Agent exists and decide to invoke it. A human types a command. A coding agent spawns a subprocess. Either way, it's a deliberate call.
The MCP server works differently. It surfaces itself as tools inside whatever environment the caller is already in. The model decides when to use them. An SRE working in Copilot CLI doesn't open a separate terminal and type a command. They ask a question and the right tool fires. A remote agent in a PagerDuty loop doesn't spawn a subprocess. It speaks a protocol and gets a response.
That's the difference. The CLI requires intent. The MCP server meets callers where they already are.
Two callers, one protocol
The MCP surface has two audiences. They speak the same protocol but come from completely different contexts.
Humans inside coding agents. An SRE in VS Code Copilot, Claude Desktop, GitHub Copilot CLI, or Cursor is already in a session: writing a deployment script, reviewing a runbook, debugging a failing service. They don't want a context switch. They want SRE Agent capabilities alongside the work they're already doing. Connect the MCP server once and it's just there.
Remote agents in other ecosystems. An AWS DevOps agent handling a cross-cloud incident might need to check Azure resource health without bouncing the call to a human. A PagerDuty SRE agent might pull an incident summary as part of its triage loop. One Azure SRE Agent instance might delegate work to another. MCP is what makes any of this work without custom integrations on both sides. Both sides agree on the protocol. Neither side has to know the other's internals.
| Caller | Context | What they need |
|---|---|---|
| Human in Copilot CLI / VS Code Copilot | Mid-workflow, coding session | Readable summary, minimal overhead |
| Human in Claude Desktop / Cursor | Agentic session | SRE tools available in the conversation |
| AWS DevOps Agent | Automated incident loop | Defined schema, stable fields |
| PagerDuty SRE Agent | Triage pipeline | Parseable, sparse, no narrative |
| Other Azure SRE Agent | Delegated sub-task | Agent-to-agent contract |
Tool descriptions are product decisions
Each MCP tool maps to a specific SRE Agent capability. Tools have names, descriptions in natural language, and JSON input schemas. The descriptions do more work than they look like they should.
When someone in Copilot CLI asks "what's wrong with my API gateway," the model reads tool descriptions to decide which tool to call. A description that says "Returns health status for an Azure resource" gets invoked less reliably than one that says "Check whether an Azure resource (VM, gateway, database, container) is healthy, degraded, or unreachable. Use this when diagnosing an active outage or validating state after a deployment."
The second version tells the model when to reach for the tool, not just what it does. PM, engineering, and content design reviewed descriptions together. When an invocation misfired in testing, the fix was almost always the description, not the schema.
We iterated on tool descriptions the same way you'd iterate on a system prompt, because that's what they are.
One output shape, two callers
The human-in-a-coding-agent and the remote-agent-in-an-automated-loop want different things from the same tool response. A human wants something readable. A remote agent wants something parseable. The obvious answer is to return different shapes based on who's calling.
We didn't do that.
Every tool response follows the same contract: defined fields, stable semantics, no preamble, no internal reasoning, plus one summary field with a plain-language sentence. A human reads the summary. A remote agent ignores it and parses the structured fields. The overhead is negligible in both directions.
We briefly considered branching on a caller-type hint in the request header. The problem: it added surface area to maintain and created subtle failure modes when the hint was wrong or missing. One shape, always.
What's harder than it looks
Statelessness is a feature for remote agents and friction for humans. MCP tools are stateless by design. Each invocation is independent. Remote agents love this; they don't want to manage session state across calls. Humans working interactively want context to carry forward. We handled it by making every response self-sufficient: the tool returns enough context that the model can construct a coherent follow-up call without re-explaining the situation. The tool doesn't remember. It returns enough that memory is cheap for whoever holds it.
You can't test the remote agent use case the same way you test the human use case. Spin up Copilot CLI, connect the server, ask a question and you can watch what happens. You can't easily simulate an AWS agent calling you cold with no prior context about what Azure SRE Agent does. Designing for that caller meant writing descriptions and schemas that work for a model meeting your tools for the first time, with no assumed vocabulary and no assumed workflow.
What's next
The interactive CLI and agent mode follow the same three-node architecture. The interactive CLI is for humans at the terminal: terse, incident-optimized, with progressive disclosure. Agent Mode is for coding agents that spawn the CLI as a subprocess and want direct access to SRE Agent capabilities without a protocol layer in between. Both are in progress.
In the meantime connect SRE Agent to your MCP client and it will show up where you're already working.
Part of a series on the design decisions behind Azure SRE Agent. Companion posts on the CLI and agent mode will follow when they ship.


