Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

Introducing Command Line, and the new rules for builders


Welcome to Command Line, a new blog where we’ll share what Microsoft builds, how our technical teams operate, and what we learn along the way. Build 2026 felt like the right time to roll this out, as we bring together engineering leaders from around the world to dive deep on building, deploying, and operating scalable AI systems. For those on the ground in San Francisco, you’ll notice that the vibe has shifted along with the locale. And that seems only fitting since the entire SDLC has changed dramatically, too. 

To kick things off officially on Command Line, I thought I’d share some of the things we’ve learned about building in the agentic AI era. Things are evolving quickly, and the old rules no longer apply. Manual review processes are breaking under a flood of PRs that our workflows weren’t designed to accommodate. Build-time iteration is being met with runtime learning loops as agents improve post-deployment. And the focus is shifting from shipping code to orchestrating systems.

> Things are evolving quickly, and the old rules no longer apply.

This is more than a productivity boost. It’s an entirely different relationship with code, tools, and decision-making. Throw out the old playbook. We need to develop a new set of rules for builders. 

Here are 10 things that feel important and true today. Time will tell how durable they prove to be. We’re in a time of seismic disruption, so we all need to constantly challenge our assumptions.  

1. Build agent-first by default 

As you develop your proficiency with AI tools, you’ll probably find yourself reaching for GitHub Copilot CLI or agent mode in VS Code more often than not. Many senior engineers on our own teams and across the industry no longer write code by hand, and even small tweaks are made via agents when it’s more efficient.

2. Context and skills are your most important asset 

When you first start using an agent in your repo, you’ll notice long sessions with higher failure rates. Start by prompting the agent to populate the knowledge base of Markdown files about your repo, verify the output to ensure correctness, and then set up a continuous improvement agent so that knowledge base memory is updated after each agent session. Things start to compound rapidly. 

If you’re doing something repeatedly with an agent, wrap it into a reusable skill and share it. Team skills compound the same way code libraries do. If the same agent failure shows up twice, promote the correction into a reusable skill, test, eval, prompt, or workflow with a clear trigger. 

3. Plans are the real work 

When you invest in a good plan, the agent can often one-shot the implementation. Shift your energy from typing out code to shaping clear, scoped roadmaps. Human judgment lives in the plan. Execution runs on autopilot. 

4. Prototypes replace detailed PRDs 

Learning by doing should become the default. Experiment with live demos and prototypes to establish ground truth and guide your decisions before you commit to building. Think demos, not memos. 

5. Taste, not time, is the crucial limited resource 

When building is cheap, the discipline of deciding what’s worth building becomes more critical, not less. Product judgment and prioritization are the highest-leverage skills on the team. Additionally, with near-zero cost to prototype, deciding what to build now includes seeing concrete options upfront. 

6. Tackle the important but overlooked 

AI-forward velocity reclaims bandwidth for critical, high-value engineering debt and operational tasks that usually get pushed aside. That includes finding and fixing high-value Sentry errors, repairing broken telemetry dashboards, running sentiment analysis on dogfooding feedback, and operationalizing more rigor in your data quality.

7. Tests are your safety net 

At high velocity, good test coverage is what prevents you from shipping regressions. Invest in test quality the same way you invest in feature velocity. 

8. Don’t let code review become a bottleneck 

When small teams can ship hundreds of PRs every month, staying close to the codebase requires deliberate effort. Start using and trusting agentic code review tools like CCR, shifting first-pass code reviews to agents, and keeping humans in the loop for architectural oversight. 

9. Everything’s changing, not just code 

Build pipelines, verification, triage, planning, and team rituals all need to evolve. Longer shipping cycles gave you more time to uncover bugs. Shorter shipping cycles and a higher pace of code turnover decrease that buffer. As AI-assisted PR volume increases, you need automated, fast quality gates to keep up. 

10. Code is disposable 

Don’t be afraid to throw code away or rewrite it. For well-bounded features, the spec might become the durable artifact. And because code is disposable, you can be brutally honest in review and ditch what doesn’t serve the product. Embrace an egoless culture. 

> Taste should be the sharpest tool in your arsenal.

Bonus: Another word on taste 

In the agentic AI era, taste is the ultimate differentiator and should be the sharpest tool in your arsenal. But don’t just set it and forget it. Keep coming back and feeding the flywheel. Human taste can be captured once in curated examples, then enforced continuously across every agent trajectory. The compounding improvement loop means that as you give feedback, the bar keeps rising. 

That’s how you get high-quality code without humans writing every line. 

The post Introducing Command Line, and the new rules for builders appeared first on Command Line.


Returning to Work After Maternity Leave: Day 1 as a New Mom and Remote Tech Worker


I started work back up again yesterday, June 1, as a remote tech worker for an open source consultancy. My husband started his paternity leave today. I'm only going back part-time for the moment, so my rough plan is to work 3 days a week. This week is the exception: I wanted to open my laptop, clear out my inbox, set up my Notion pages for 2026, change my away message, and just try to get a vague sense of what's happened while I've been away. I'll work Monday-Thursday, but Monday and Tuesday will be half days.

My goal for the day was to get up with Chloe at 6:00 AM like normal. Put her down for her nap at 7:30. Have Jhey take over while I take the dog to the vet for a heart scan and then log on for a little bit this afternoon.

The universe said: a plan with an almost-5-month-old? Nice try.

Chloe woke up at 3:00 AM which is unusual for her. I rocked her back to sleep. Put her down. Tried to crawl back into bed. She woke up, rolled onto her tummy, and cried because she can't get onto her back because she's seemingly forgotten how to do so. She started rolling tummy to back weeks ago, but now that she can go back to tummy...that's all she does.

Rinse and repeat until 6 AM.

So I'm at that shattered level of tired. So much so that when I take my dog Vogue to the vet and they're telling me the results of her heart scan, it's not processing how poorly her heart actually is until I get home and talk to Jhey.

She'll be on medication for the rest of her life. She's 14. We'll do another scan in a month to see if the meds help slow the progression of her heart enlargement and then I'll ask for a potential timeline...how much time they think she has left.

Jhey goes to the gym and then someone shows up to pick up his special edition Mustang that he's sold. I get Chloe down for her nap and log on at 2:00 PM. I had briefly logged on for a half hour in the morning to just do a quick check and update.

My inbox is a disaster: it seems my out-of-office reply responded to spam messages that would normally not be in my inbox, and I have hundreds to delete. I try to be ruthless and delete as much as possible but leave the more company-wide discussions for follow-up.

I try and skim the chat. But there's too much. By the time it's 5:00 PM I have barely managed to get my Notion pages set back up. I'll try again tomorrow.

I remind myself that today was particularly difficult with scheduling and is not indicative of my days going forward. Tomorrow I'll try for more routine.

I go over to Jhey's parents' house and chat, we bring Chloe back for her nap, then bath and bedtime routine. I make a quick dinner of shrimp stir fry with a nice veggie stir fry pack from Waitrose.

I almost make it to the end of the Euphoria finale but pass out briefly and then go up to bed. Exhausted but accomplished.

Day one of this new life and routine and figuring it out is complete.


What’s New in Hosted Agents in Foundry Agent Service


A few weeks ago, we announced the public preview refresh of hosted agents in Foundry Agent Service — a fundamentally re-imagined agent runtime built for operationalizing production-grade AI agents in enterprise systems. Today at Microsoft Build, we are excited to share several updates that make hosted agents easier to deploy, more capable across modalities, and seamless to optimize through the agent loop. This article covers what’s new, and what’s coming.

The Problems We Set Out to Solve

Developers who want to take agents to production find themselves managing a list of things that have nothing to do with the agent’s actual intelligence:

  • Containerization and infrastructure — building images, pushing to registries, managing versions
  • Security and identity — provisioning managed identities, scoping access, preventing cross-session data leakage
  • State persistence — storing files, memory, and context across turns without building your own persistence layer
  • Scaling — right-sizing compute for variable concurrency without paying for idle capacity
  • Observability — knowing what your agent is doing, when it fails, and why
  • Evaluation – measuring the quality, safety, and reliability of your agent continuously

Hosted agents in Foundry Agent Service were designed to take all of this off your plate. Each agent session runs in its own hypervisor-isolated sandbox with a dedicated persistent file system, an automatically provisioned Microsoft Entra ID (agent identity), and built-in OpenTelemetry tracing. You bring your code and your framework; the platform handles the rest.

Since the preview launched, we’ve shipped four additional capabilities that expand what you can do with hosted agents. 

What’s New

1. Deploy Directly from Source Code — No Container Required

Previously, deploying a hosted agent required packaging your application into a container image, pushing that image to Azure Container Registry, and configuring the agent to run it. This container-based approach gives you full control over the runtime environment, including OS dependencies, system libraries, startup behavior, and how your code is packaged and executed — it continues to be fully supported. However, for many teams, especially during early development, building and managing containers can introduce unnecessary friction. Direct code deployment removes that friction. You zip your Python or .NET project, upload it to the Foundry Agent Service, and the platform either installs your dependencies at provision time (remote_build mode) or runs your pre-bundled output directly (bundled mode).

The result is a significantly shorter path from local development to deployed agent. For developers using the Azure Developer CLI (azd) or the Foundry Toolkit for VS Code, source code deployment is even simpler — the tooling handles packaging, uploading, polling for active status, and configuring role-based access control automatically.

With azd, deploying the agent takes just two commands, init and deploy; invoke and monitor then let you test the agent and inspect its logs:

# Initialize — configure your agent project for source code deployment
azd ai agent init \
  --src ./src/my-agent \
  --agent-name my-unique-agent \
  --deploy-mode code \
  --runtime python_3_13 \
  --entry-point main.py \
  --dep-resolution remote_build

# Deploy — package, upload, and wait for Active state
azd deploy

# Invoke the agent
azd ai agent invoke "message"

# Inspect recent agent logs
azd ai agent monitor --tail 100

The init command generates the azure.yaml and agent.yaml configuration needed for deployment. The deploy command handles zip packaging, SHA verification, and upload, then polls until the agent reaches the active state. No manual curl calls, token management, or container registry are required.
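As a rough illustration of the "zip packaging, SHA verification" step, the sketch below zips a project folder and computes a digest of the archive. The function name package_project, the choice of SHA-256, and the archive layout are assumptions made for illustration; this post does not describe the exact format or hash that azd and the service use.

```python
import hashlib
import zipfile
from pathlib import Path


def package_project(src_dir: str, zip_path: str) -> str:
    """Zip every file under src_dir and return the archive's SHA-256 hex digest.

    Hypothetical helper: the real packaging is done by `azd deploy`.
    """
    src = Path(src_dir)
    target = Path(zip_path).resolve()
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(src.rglob("*")):
            # Skip directories, and skip the archive itself if it lives in src_dir.
            if path.is_file() and path.resolve() != target:
                # Store paths relative to the project root so the entry point
                # (for example main.py) sits at the top level of the archive.
                archive.write(path, path.relative_to(src).as_posix())
    # Hash the finished archive so a receiver could verify the upload.
    return hashlib.sha256(Path(zip_path).read_bytes()).hexdigest()
```

Calling package_project("./src/my-agent", "agent.zip") returns a 64-character hex string that a receiver could compare against its own hash of the uploaded bytes.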

Key flags for azd ai agent init:

| Flag | Description |
| --- | --- |
| --src, -s | Path to your agent source code directory |
| --agent-name | Unique agent name (reusing a name creates a new version) |
| --deploy-mode | code (source upload) or container (Docker image) |
| --runtime | python_3_13, python_3_14, or dotnet_10 |
| --entry-point | Application entry point (e.g., main.py, MyAgent.dll) |
| --dep-resolution | remote_build (default) or bundled |
| --project-id, -p | Existing Foundry Project resource ID (skips interactive selection) |
| --model | AI model to use (defaults to gpt-4.1-mini) |
| --protocol | invocations or responses |

For CI/CD pipelines, add --no-prompt for fully non-interactive execution:

azd ai agent init --no-prompt \
  --project-id "/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.CognitiveServices/accounts/<account>/projects/<project>" \
  --src ./my-agent \
  --agent-name my-unique-agent \
  --deploy-mode code \
  --runtime python_3_13 \
  --entry-point main.py

azd deploy

Supported runtimes: python_3_13, python_3_14, and dotnet_10. Runtime support follows each language’s upstream end-of-life schedule.
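In a pipeline script, it can help to validate the runtime and deploy mode before calling azd at all. The sketch below only assembles the argument list from the flags shown above; build_init_command is a hypothetical helper name, and nothing is executed until you pass the result to something like subprocess.run.

```python
# Values taken from the flag table and the "Supported runtimes" note above.
SUPPORTED_RUNTIMES = {"python_3_13", "python_3_14", "dotnet_10"}
DEPLOY_MODES = {"code", "container"}


def build_init_command(project_id, src, agent_name, runtime, entry_point,
                       deploy_mode="code"):
    """Return the argv list for a non-interactive `azd ai agent init` call."""
    if runtime not in SUPPORTED_RUNTIMES:
        raise ValueError(f"unsupported runtime: {runtime}")
    if deploy_mode not in DEPLOY_MODES:
        raise ValueError(f"unsupported deploy mode: {deploy_mode}")
    return [
        "azd", "ai", "agent", "init", "--no-prompt",
        "--project-id", project_id,
        "--src", src,
        "--agent-name", agent_name,
        "--deploy-mode", deploy_mode,
        "--runtime", runtime,
        "--entry-point", entry_point,
    ]
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with the long project resource ID.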

Full guide: Deploy a hosted agent from source code (preview)

2. Built-In Guardrails for Responsible Agentic AI

Production agents interact with real users, and real users sometimes send harmful content. Whether it’s a customer service agent being probed with violent instructions or a coding assistant receiving inappropriate prompts, agents need a safety layer that blocks harmful inputs before they reach agent logic and filters unsafe outputs before they reach end users.

Previously, teams building responsible AI agents had to integrate content filtering themselves: standing up a separate Content Safety in Foundry Control Plane endpoint, writing middleware to intercept requests and responses, handling streaming edge cases, and managing policy configurations across environments. That’s table-stakes safety work that every production agent needs, but no team wants to build from scratch.

Hosted agents now include built-in content safety guardrails powered by Content Safety, integrated directly into the agent runtime. When enabled, every user prompt is evaluated in real time before it reaches your agent code, and every response is also evaluated before it reaches your users.
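To make the "check the prompt, then check the response" flow concrete, here is a minimal sketch of the kind of middleware teams previously wrote by hand. Everything in it is hypothetical: is_harmful is a toy keyword check standing in for a real content safety classifier, and guarded is not part of any Foundry SDK.

```python
from typing import Callable

BLOCKED_MESSAGE = "This request was blocked by a content safety policy."


def is_harmful(text: str) -> bool:
    """Toy stand-in for a real content safety classifier (hypothetical)."""
    blocked_terms = {"build a weapon"}
    lowered = text.lower()
    return any(term in lowered for term in blocked_terms)


def guarded(agent: Callable[[str], str],
            check: Callable[[str], bool] = is_harmful) -> Callable[[str], str]:
    """Wrap an agent so prompts and responses both pass through the check."""
    def run(prompt: str) -> str:
        if check(prompt):
            # Harmful input never reaches the agent logic.
            return BLOCKED_MESSAGE
        response = agent(prompt)
        if check(response):
            # Unsafe output never reaches the end user.
            return BLOCKED_MESSAGE
        return response
    return run
```

With built-in guardrails, this wrapping happens inside the agent runtime, so your agent code stays free of filtering logic.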

To create guardrail policies that you can later pass in the hosted agent definition, see: How to configure guardrails and controls in Microsoft Foundry (Microsoft Learn)

[Figure: Hosted Agents guardrails]

Content safety guardrails are available today in public preview across all hosted agents regions.

3. Voice Live Integration and WebSocket Support

Agents that reason and act are valuable. Agents that can also speak and listen in real time open an entirely different class of applications — customer service, accessibility tooling, voice-first interfaces, and more.

For developers building text-based hosted agents, Voice Live integration is now in public preview. It can be turned on with one click and enables real-time voice experiences through both the Responses and Invocations protocols.

For developers building native speech-to-speech hosted agents, hosted agents now support WebSocket and WebRTC for real-time voice scenarios. By combining the Invocations (WebSocket) protocol with frameworks such as Voice Live (as a speech-to-speech model API), Pipecat, or LiveKit within your container, you can deliver fully real-time voice agents — from microphone input to natural speech output, on the same secure and scalable platform that powers your text-based agents.

Leading telephony providers are already leveraging this capability to enable seamless phone-call interactions with hosted agents using speech.

The WebSocket endpoint exposes a persistent bidirectional connection:

wss://{account}.services.ai.azure.com/api/projects/agents/endpoint/protocols/invocations_ws?project_name={project}&agent_name={name}

This is distinct from the Responses and Invocations (HTTP) protocols. A single persistent connection handles both inbound audio and outbound speech synthesis without the overhead of repeated HTTP handshakes.

The Invocations (WebSocket) protocol is currently only available in North Central US. Support in additional regions is coming soon.
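If you build the WebSocket URL in code, a small helper keeps the query string correctly encoded. This sketch fills in the endpoint template shown above; the function name invocations_ws_url is an assumption, and authentication is left out because this post does not describe it.

```python
from urllib.parse import urlencode


def invocations_ws_url(account: str, project: str, agent_name: str) -> str:
    """Fill in the Invocations (WebSocket) endpoint template for one agent."""
    # urlencode escapes characters such as "&" that would break the query string.
    query = urlencode({"project_name": project, "agent_name": agent_name})
    return (
        f"wss://{account}.services.ai.azure.com"
        f"/api/projects/agents/endpoint/protocols/invocations_ws?{query}"
    )
```

The resulting URL can be handed to whichever WebSocket client your voice framework uses.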

The addition of the Invocations (WebSocket) protocol completes the protocol triad for hosted agents, each relevant for different real-world scenarios:

| Protocol | Use cases | Key characteristics |
| --- | --- | --- |
| Responses | Conversational agents, RAG, Publish to Teams / M365 | Platform manages history, streaming, and session lifecycle; any OpenAI-compatible SDK works as the client |
| Invocations (HTTP) | Webhooks, structured data, custom streaming (AG-UI, etc.) | Arbitrary JSON in/out; you control the schema and SSE stream |
| Invocations (WebSocket) | Real-time voice, bidirectional streaming | Persistent connection; pair with Pipecat, LiveKit, or Voice Live |

A single hosted agent can expose multiple protocols simultaneously. The platform automatically bridges the Responses protocol to the Activity protocol for Teams and Microsoft 365 channel delivery — no additional configuration required. Agent2Agent (A2A) has also been added to allow for agent delegation.

4. Agent Optimizer: A Closed-Loop Improvement Engine

“Live” and “production-ready” aren’t the same thing, and the gap shows up quickly.

Your customer support agent handles requests — but it forgets to ask for an order number before looking up status. It answers warranty questions without checking the purchase date. It gives electrical wiring advice when it should decline and recommend a professional. Each fix means rewriting your system prompt, testing by hand, and hoping you didn’t break something else.

For one agent, that’s manageable. For a team running ten agents across different domains, it’s a bottleneck that doesn’t scale.

Agent optimizer in Foundry Agent Service solves this by automating the agent improvement loop. It evaluates your hosted agent against defined criteria, generates better configurations, and ranks the results so you can deploy the best one — all in a few minutes, with no additional infrastructure to provision.

How It Works

The optimizer runs a closed-loop cycle:

  1. Evaluate the baseline — your agent processes tasks with pass/fail criteria, producing a composite score from 0.0 to 1.0
  2. Generate candidates — guided by what failed, the optimizer produces new configurations for your chosen target
  3. Evaluate candidates — each candidate runs against the same task set
  4. Rank and recommend — results sorted by score, with per-task breakdowns and token costs visible before you commit
  5. Deploy the winner — one command promotes the winning configuration as a new versioned deployment

No model retraining. No code changes. The optimizer uses evaluation signals to identify where the agent fell short, then rewrites its instructions to strengthen return policies, escalation procedures, troubleshooting frameworks, and safety boundaries.
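The cycle above can be sketched in a few lines. This is a toy model of the idea, not the optimizer's implementation: the agent and propose callables are hypothetical stand-ins, and the score here is simply the fraction of tasks that pass, which is only one way to produce a value from 0.0 to 1.0.

```python
from typing import Callable, List, Tuple

# A task is an input plus a pass/fail check on the agent's output.
Task = Tuple[str, Callable[[str], bool]]
Agent = Callable[[str, str], str]  # (instructions, task_input) -> output


def score(agent: Agent, instructions: str, tasks: List[Task]) -> float:
    """Fraction of tasks whose check passes, from 0.0 to 1.0."""
    if not tasks:
        return 0.0
    passed = sum(1 for task_input, check in tasks
                 if check(agent(instructions, task_input)))
    return passed / len(tasks)


def optimize(agent: Agent, baseline: str, tasks: List[Task],
             propose: Callable[[str, List[str]], List[str]]
             ) -> List[Tuple[float, str]]:
    """Evaluate the baseline, generate candidates from failures, rank all."""
    failures = [task_input for task_input, check in tasks
                if not check(agent(baseline, task_input))]
    candidates = [baseline] + propose(baseline, failures)
    ranked = [(score(agent, c, tasks), c) for c in candidates]
    # Highest score first; the stable sort keeps the baseline ahead on ties.
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return ranked
```

Keeping the baseline in the ranked list means a candidate is only recommended when it actually beats what you already have.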

Optimization Targets

  • Instruction — rewrites your agent’s system prompt to address observed gaps. This is where most teams start.
  • Skill — generates reusable, named procedures (escalation steps, troubleshooting sequences) appended to your instructions.
  • Model — evaluates your agent across multiple model deployments in a single run, scoring quality/cost trade-offs.
  • Tool Descriptions — refines how your agent understands and invokes external tools: when to call each, parameter requirements, fallback behavior.

Solving the Cold-Start Problem

Most teams don’t have evaluation datasets on day one. The eval init command solves this by generating both a dataset and evaluation criteria from your agent’s existing instructions, so no manual test-writing is required.

$ azd ai agent eval init

Eval suite created
  Dataset:    customer-support (2.0), 15 tasks
  Evaluator:  customer-support (1)

  Evaluator dimensions (6):
    Weight  Dimension
    ──────  ─────────
        10  policy_compliance
         6  resolution_accuracy
         5  troubleshooting_structure
         4  communication_clarity
         3  safety_boundaries
         5  general_quality
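This post does not say how the dimension weights combine into a single score. One plausible reading, shown here purely as an assumption, is a weighted average of per-dimension scores that each range from 0.0 to 1.0. The weights are copied from the sample output above; composite_score is a hypothetical name.

```python
# Weights copied from the sample `eval init` output above.
WEIGHTS = {
    "policy_compliance": 10,
    "resolution_accuracy": 6,
    "troubleshooting_structure": 5,
    "communication_clarity": 4,
    "safety_boundaries": 3,
    "general_quality": 5,
}


def composite_score(scores: dict, weights: dict = WEIGHTS) -> float:
    """Weighted average of per-dimension scores (assumed to be 0.0 to 1.0)."""
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[dim] * scores[dim] for dim in weights)
    return weighted_sum / total_weight
```

Under this reading, a failure in policy_compliance costs more than three times as much as a failure in safety_boundaries, so it is worth reviewing the generated weights before trusting the ranking.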

The Full Workflow

azd ai agent init               # scaffold your agent
azd deploy                      # ship to Foundry
azd ai agent eval init          # generate evaluation criteria
azd ai agent eval run           # score your agent
azd ai agent optimize           # improve it
azd ai agent optimize apply --candidate <id>
azd deploy                      # deploy the optimized agent as a new version

Each promoted candidate becomes a new versioned hosted agent deployment — auditable, rollback-ready, and captured in tracing.
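If you want to script this workflow, a thin wrapper that runs the steps in order and stops at the first failure is usually enough. The sketch below is an assumption-laden example: run_workflow is a hypothetical helper, it defaults to a dry run, and it deliberately stops before optimize apply because the candidate ID has to come from the ranked results.

```python
import subprocess

# Commands copied from the workflow above, up to the point where a person
# picks a candidate ID from the ranked results.
WORKFLOW = [
    ["azd", "ai", "agent", "init"],
    ["azd", "deploy"],
    ["azd", "ai", "agent", "eval", "init"],
    ["azd", "ai", "agent", "eval", "run"],
    ["azd", "ai", "agent", "optimize"],
]


def run_workflow(steps, dry_run=True, runner=subprocess.run):
    """Run each step in order and return the commands that were processed."""
    processed = []
    for step in steps:
        if not dry_run:
            # check=True makes subprocess.run raise on a non-zero exit code,
            # so later steps never run against a failed deployment.
            runner(step, check=True)
        processed.append(" ".join(step))
    return processed
```

Calling run_workflow(WORKFLOW) with the default dry run returns the planned command strings without touching azd, which is handy for reviewing a pipeline change.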

Agent optimizer is currently in private preview, with public preview rolling out in 30 days. Sign up for early access →

The Road to General Availability

Hosted agents are in public preview today, available across 20 Azure regions globally. As we approach General Availability by end of June 2026, our focus areas will include:

  • Agent optimizer in Foundry Agent Service public preview: rolling out to all hosted agents regions
  • Private ACR in BYO virtual network: enabling container images in private Azure Container Registries to be deployed to hosted agents
  • Managed virtual network: extending support for Microsoft-managed virtual networks to hosted agents
  • Expanding Voice Live / WebSocket coverage beyond North Central US to additional regions
  • Durable, long-running agents: enabling hosted agents to survive container crashes, redeployments, and periods of inactivity with automatic recovery, state persistence across turns, and multi-session context accumulation

Get Started

Developers can get started in minutes by following the QuickStart or with code samples (Python  ·  C#) which walk through setting up, testing, and deploying a production-ready hosted agent end to end.

Check out AI Agents for Beginners for a 12-lesson curriculum, then go deeper with guided labs: Develop AI Agents in Azure, Hosted Agents Workshop (.NET), and the ZavaShop Supply Chain Workshop.

📺 Watch: Foundry Agent Service + Microsoft Agent Framework Explained — Jeff Hollan walks through how to operationalize AI agents from deployment to real-world impact.

If you’re attending Microsoft Build 2026, or watching on-demand content later, be sure to check out these sessions:

The post What’s New in Hosted Agents in Foundry Agent Service appeared first on Microsoft Foundry Blog.


Holo3.1: Fast & Local Computer Use Agents


Codex for every role, tool, and workflow

Discover new Codex plugins, sites, and annotations that help analysts, marketers, designers, investors, and other teams get more done with AI.

1.0.58


2026-06-02

  • Actionable error message shown when GitHub API rate limit is hit during copilot update
  • Add /rubber-duck command for adversarial feedback on code and designs
  • Plugin slash commands (/plugin install, uninstall, update, marketplace add/remove/browse) now show immediate feedback while the operation is in progress
  • Canceling a running shell command (Ctrl+C on a !command, or aborting an agent command — including in sandboxed and background-promoted shells) now terminates the whole process tree instead of leaving orphaned processes running
  • Canvas providers can return file:// URLs in open results for local file previews
  • Symlinked directories appear in /cwd completion suggestions
  • In Azure DevOps-only repositories, the built-in GitHub MCP server now exposes only the web_search tool instead of being fully disabled
  • Quota footer shows remaining requests as a rounded percentage
  • /lsp show, /lsp test, and /lsp reload correctly discover project LSP config when the CLI is launched from a subdirectory
  • MCP server timeout configuration is preserved after tools list changes
  • /skills add and /skills remove correctly handle paths wrapped in quotes (e.g., from Windows Explorer "Copy as path")
  • Running copilot with an unquoted multi-word prompt now shows a helpful "quote your prompt" hint instead of a raw commander error
  • Default networking transport is now HTTP/1.1, improving reliability on some network paths. Opt into HTTP/2 with COPILOT_ENABLE_HTTP2=1.
  • Plugins auto-installed from repository settings no longer leak into user global config
  • Grep tool correctly handles tsx and jsx as file type filters
  • COPILOT_HOME is honored for the server discovery registry directory
  • Click a diff line with the mouse to select it in diff mode
  • Ctrl+C and other modified keys work correctly inside tmux
  • @-mention file search matches files regardless of query letter casing
  • copilot plugin marketplace list now honors repo-level extraKnownMarketplaces settings from .github/copilot/settings.json
  • Queued prompts in the footer are capped to a single line, preventing them from pushing session messages off screen
  • MCP servers configured with npx --registry are no longer incorrectly blocked by policy
  • Session no longer hangs indefinitely after an error occurs during internal event processing
  • Installed plugins no longer include the .git directory from the plugin source repository
  • New reasoning after tool calls appears at the bottom of the timeline instead of above earlier output
  • Pasting text copied from a browser, editor, or terminal no longer leaves a stray empty line, broken box-drawing lines, or a misplaced cursor in the prompt
  • preToolUse hook errors now deny the tool call instead of silently allowing execution
  • Session resume works correctly after a crash that left partial data in the session log
  • High-contrast diff backgrounds use darker colors to improve text readability
  • Add showTipsOnStartup setting to control whether startup tips are shown
  • Surface the underlying reason (e.g. GitHub API rate limit) when SDK auth-token validation fails, instead of the misleading "Session was not created with authentication info or custom provider" message.
  • /diff defaults to branch diff when there are no unstaged changes