Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.
156868 stories
·
33 followers

Proving application resilience on Azure with Chaos Studio

1 Share

Takeaway: Azure Chaos Studio helps organizations validate application resilience by simulating outages, failovers, network disruptions, and infrastructure failures before they impact production.


You don’t know with certainty that your application is resilient until that resilience is tested. Better to learn it isn’t by deliberately breaking it in a test environment and watching how it reacts, than by a failure in production. Azure Chaos Studio is our managed service for doing exactly that, safely and on purpose.

Today, Azure Chaos Studio Workspaces is in public preview: a scenario-focused approach that lets you test the failure modes Azure customers actually see in production. We’ve been hard at work making Workspaces easy to use, with broad fault support and named scenarios that mirror real outages, instead of isolated faults.

Why designing for resilience isn’t enough

Azure customers have invested in resilient design: multi-zone deployments, geo-redundant storage, automatic database failover, retry logic, load-balanced front ends. However, the real question is when an incident begins: when the failure arrives, do those mechanisms recover your application in the time you assumed they would?

Real outages don’t read the architecture diagram. A zone-redundant deployment can fail because a health probe was misconfigured years ago. A database with automatic failover can leave the application dead because a connection string is hard coded to a single region. Geo-redundant storage can briefly produce stale reads the application code never expected. These mistakes are common, and they only show up when the failure happens.

Reliability and resiliency on Azure are a shared responsibility. Microsoft is responsible for the platform and the resilience built into Azure services. Customers are responsible for configuring that resilience and the code that uses it. No layer makes up for a gap in another. The only way to know whether your architecture, configuration, and application logic will hold up in production is to prove they hold under failure before an outage tests them for you.

How Chaos Studio Workspaces changes resilience testing

Chaos Studio is Azure’s managed chaos engineering service for validating how applications behave under failure. By simulating controlled disruptions across infrastructure, networking, databases, and application dependencies, it helps teams uncover resilience gaps before customers experience them. Chaos Studio Workspaces focuses on scenarios that match what happens in production, so you start from a real outage pattern instead of assembling individual faults. You begin with a named scenario like Zone Down, DNS Outage, or SQL failover, already sequenced against the resources in a Workspace.

Most outages exercise two layers at once. There’s the platform layer: did the service come back, did failover complete within your Recovery Time Objective, did traffic reroute. And there’s the application layer: did your code maintain data integrity, pick up in-flight transactions, retry the right things, degrade gracefully. A chaos test that only stops a Virtual Machine (VM) tells you about the platform layer. The scenarios in Chaos Studio Workspaces are designed to validate the entire stack.

Workspaces reduce the burden of getting started. The most common reason resilience testing stalls is that teams don’t know where to start. The Workspace is the new top-level resource: you point it at a subscription or resource group, and its managed identity discovers what’s in scope and recommends the scenarios that apply. Those scenarios show up inside the Workspace, ready to configure and run, and a refresh, updates the recommendations whenever your infrastructure changes.

A library of real outage scenarios. Chaos Studio Workspaces ships with curated scenarios informed by patterns observed in real Azure incidents, so the patterns you test against are the patterns customers actually experience. Think of these as resilience templates, a fast path to the failure modes most teams need to test, and when you need something different, design your own from the same fault library.

Available today:

  • Availability Zone Down: Virtual Machine Scale Sets (VMSS) shutdown with per-zone targeting to validate cross-zone routing and recovery.
  • Availability Zone Down and Database failover: Compute Zone Down composed with Azure Database for PostgreSQL (Flexible Server) failover, to observe failover behavior against your configured recovery objectives and application-side connection handling.
  • DNS Outage: a full DNS resolution outage via NSG rules that block resolver traffic, to validate how your application behaves when name resolution fails.
  • Microsoft Entra ID Outage: identity-provider failure that exercises authentication retry, token caching, and fallback paths.
  • Cache Stampede: Redis flush combined with database restart and an App Service process crash, to validate behavior under a cache-miss storm and the resulting database surge. The App Service process-crash variant currently supports Windows App Service plans.
  • Event-Driven Messaging Disruption: Azure Service Bus and Event Hubs disable, to validate dead-letter handling and backpressure.

Behind every scenario are granular API-level actions built for Workspaces:

Each scenario composes the right faults automatically. And when a curated scenario doesn’t match your workload, you can build your own. The new Scenario Designer is a drag-and-drop experience in the Azure portal for composing any of these faults into a custom scenario arranging steps, branches, and faults with the same flexibility as classic Chaos Studio experiments, now available directly inside Workspaces. Start with a curated template, or design from scratch using the full fault library.

VM agent faults such as Central Processing Unit (CPU) and memory pressure also run in Workspaces. Each scenario sequences the right combination of faults automatically, so running Zone Down + Database Failover doesn’t mean thinking in terms of “shut down VMSS instances in zone 1, then force-failover the database primary.” The library will keep growing through public preview and into GA, with plans to explore additional scenarios over time, such as:

  • Storage account failover
  • Microsoft Azure SQL Managed Instance failover
  • Microsoft Azure Front Door and Microsoft Azure Application Gateway
  • Partial zone degradation
  • Microsoft Azure Kubernetes Service (AKS)-native pod chaos
  • Customer-observed region down

That same foundation is also relevant for AI applications moving into production. Copilots, agents, retrieval-augmented generation pipelines, and inference endpoints may introduce new AI-specific failure modes, but they still rely on the same Azure building blocks as other distributed applications: compute, databases, caches, search indexes, identity, networking, messaging, and storage. Chaos Studio Workspaces can validate that foundation today through scenarios like Zone Down, Database Failover, DNS Outage, Cache Stampede, and Event-Driven Messaging Disruption, while the catalog continues to evolve toward AI-specific behaviors such as retrieval drift, token throttling, and model behavior shifts under load as more insights are gathered fromworking closely with customers building AI on Azure.

Scenario reports. When a run finishes, Chaos Studio Workspaces generates a structured drill report. It lays out what the scenario injected, which resources it affected, how the recovery timeline played out, which signals were attributable to the drill versus the normal baseline, and where the workload behaved differently than expected. The report reads like an internal post-incident review, which makes it useful both for the team that ran the drill and for the leaders who want to see resilience being validated regularly. Teams can export it and attach it to change tickets, audit evidence, or service health reviews.

Bringing resilience testing into AI-powered operations

Alongside the product, we’re shipping two ways to drive Chaos Studio from the tools engineers already work in. The first is the Chaos Studio Skill for GitHub Copilot: it walks you through the whole loop in a conversation. Point a Workspace at a subscription, see the scenarios it recommends, run a drill, and get back a report of what actually happened, correlated against your Azure Monitor signals.

The second is an Model Context Protocol (MCP) server that exposes the same Chaos Studio operations as typed tools, so other assistants and autonomous agents: Claude, Cursor, Codex, or your own, can provision a Workspace, run a scenario, and query the signals around it without a person in the loop. Both run against the same Chaos Studio APIs and your own Azure sign-in, and you can try them today.

We’re shipping this on day one for one reason: When a customer asks an AI assistant about Chaos Studio, the experience should be shaped by us, not improvised by a large language model (LLM) reading our REST API. In our experience, one of the hardest parts of resilience testing is often deciding to run the drill in the first place, and that decision increasingly lives in the chat tools engineers already use, so this needs to live there too.

Where this is headed: The Skill becoming a step inside automated operations flows on Microsoft Foundry, and one of the ways an Azure SRE agent validates its own assumptions about how a workload fails. Try it and tell us what’s missing; we’ll close the gaps through public preview.

Get started

Azure Chaos Studio Workspaces is in public preview today. General availability is currently targeted for late 2026, subject to change.

To start:

  1. Create a Workspace scoped to a subscription or resource group you want to test.
  2. Let discovery populate the recommended scenarios for the resources it finds. Prefer to build your own? Open the Scenario Designer and compose a custom scenario from the fault library, no scripting required.
  3. Run your first drill. If you’ve never run a chaos experiment, run Zone Down. A full availability-zone failure surfaces how compute placement, database failover, DNS resolution, and application-layer retry logic respond under stress. If your workload recovers within an acceptable time, you’ve gained evidence about how it responds to one of the most common causes of extended cloud downtime. If it doesn’t, you’ve found the gap on your terms instead of your customers’.

Resilience isn’t something a single feature, a single redundancy mechanism, or a single architecture decision will give you. It’s an engineering discipline, and the discipline requires verification. Azure Chaos Studio Workspaces is how we’re making that verification the default for Azure workloads, including the AI workloads more of our customers are putting into production.

Run your first resilience testing today

With Azure Chaos Studio Workspaces, you can simulate failures across your stack and gain practical insight into recovery behavior

The post Proving application resilience on Azure with Chaos Studio appeared first on Microsoft Azure Blog.

Read the whole story
alvinashcraft
13 seconds ago
reply
Pennsylvania, USA
Share this story
Delete

Building C# and C++ Apps with GitHub Copilot CLI and Visual Studio 2026

1 Share

Most engineers learned Visual Studio through a mouse-driven workflow — New Project — open a project, click through a few dialogs, press F5. The GitHub Copilot CLI inverts that workflow: you describe the desired outcome in plain English (or any other language Copilot supports), and the agent scaffolds the project, drives MSBuild, resolves its own compiler errors, and explains the result. By the way: this style of working — you describe what you want and the AI does the actual programming and takes care of the rest — is also known as Vibe Coding. For Microsoft FTEs, this translates into measurably faster proofs of concept for customer escalations, reproduction builds, and internal tooling. Engineers outside Microsoft benefit in exactly the same way, as long as their organization — or their personal subscription — provides the corresponding Copilot entitlement.

This article documents the setup and workflow recommended for engineers onboarding to the CLI. The guidance is deliberately opinionated, assumes a clean Windows 11 installation, and has been verified end-to-end against the current Visual Studio 2026 Community (18.6.2) and Copilot CLI 1.0.56.

Why use the CLI at all?

Copilot is available inside Visual Studio, inside VS Code, in the browser, and on the command line. These front-ends share the same underlying model, but the user experiences differ significantly.

The CLI is the preferred choice in the following scenarios:

  • You want a single tool that drives the whole machine — installer flags, MSBuild, winget, certificates, registry, even vswhere — rather than only the files open in an editor.
  • You are working on a throwaway repro for a customer issue, where opening the full IDE is disproportionate.
  • You want repeatable, scriptable workflows. The CLI accepts instructions, plans and prompts that can be checked in, reviewed, and shared with the rest of your team.
  • You need to build small tools or short projects without deep programming expertise.

Prerequisites

  • Windows 11 (or Windows 10 22H2) with administrator rights for the install step.
  • Microsoft FTE GitHub account linked through the Open Source @ Microsoft portal and a Copilot Business or Enterprise seat assigned by your org. Engineers at other organizations use whichever GitHub identity their employer has assigned a Copilot seat to; individuals can substitute a personal GitHub account that holds a Copilot Pro or Pro+ subscription.
  • A modern terminal — Windows Terminal with PowerShell 7 is the supported combination.

Note Everything in this guide is built on supported, GA components. Nothing here requires private previews, MSIT exceptions, or non-public flags. If your tenant has additional Conditional Access policies, you may be prompted for MFA the first time you sign in.

Recommended: Run it all in a virtual machine

Before installing anything, an important recommendation: don't run the agent on your primary, company-managed laptop. The CLI — especially with --yolo — has full read, write and execute access to any directory you have trusted. A well-intentioned prompt such as "clean up the temp files in my projects folder" may delete production work if it is pointed at the wrong root. Compliance tooling, OneDrive Known Folder Move, and on-access antivirus on the corporate image also routinely conflict with build outputs and cause agent runs to fail in opaque ways.

The safe, repeatable pattern is to host Visual Studio 2026 and the Copilot CLI inside an isolated environment that can be snapshotted, reset, or discarded. The following options are all suitable:

  • A Microsoft Dev Box (preferred — already domain-joined, already monitored, and easily recreated from an image).
  • An Azure VM in your sponsored subscription, ideally with the Visual Studio image from the Marketplace.
  • A local Hyper-V or WSL2/Windows Sandbox-based Windows 11 VM for offline experimentation.

Important Treat the VM as ephemeral. Retain customer artifacts, screenshots and reproductions only within the VM (or in a dedicated OneDrive/SharePoint location outside the trusted Copilot directory). If something goes wrong, you reset the VM — not your corporate laptop.

Setting up the workspace

Select a workspace folder before performing any other step, and observe two rules:

  • Use a local path under your profile.
  • Do not place it on OneDrive, Known Folders Move, or any other sync root.

The rationale is mundane but consequential: the Copilot agent writes files, MSBuild writes obj\ and bin\, and OneDrive's file-system filter will conflict with all of them. You will observe ghost lock files, partially rewritten .cs files, and "the process cannot access the file" errors that are extremely difficult to diagnose.

# Recommended workspace root mkdir "$env:USERPROFILE\AppData\Local\Projects" cd "$env:USERPROFILE\AppData\Local\Projects"

Warning Also avoid %TEMP% for build output. MSBuild emits warning MSB8029, and cleanup tasks in some scheduled jobs may delete your intermediates mid-build.

Installing the Copilot CLI

Open an elevated PowerShell window — the package needs to register PowerShell 7 as a dependency on first run:

winget install GitHub.Copilot

Winget downloads the CLI itself (currently 1.0.56) and, on a fresh installation, also installs PowerShell 7 as a dependency. Once the installation completes, the elevated shell can be closed — do not run Copilot as administrator for day-to-day work.

Confirm the install in a normal (non-elevated) terminal:

copilot --version

Installing Visual Studio 2026 Community

The CLI does not require Visual Studio — it strictly requires only MSBuild and the appropriate toolchain. However, the Community edition is the most straightforward way to obtain a known-good, fully Microsoft-signed installation of MSBuild 18.6.x, the MSVC v145 toolset, and the Windows 10/11 SDK, all in a single step. Select the variant that matches your target workload:

Variant A — C# only (≈ 2.4 GB)

winget install --id Microsoft.VisualStudio.Community -e --source winget ` --override "--add Microsoft.VisualStudio.Workload.ManagedDesktop \ --add Microsoft.Net.Component.4.8.1.SDK \ --add Microsoft.Net.Component.4.8.1.TargetingPack \ --add Microsoft.VisualStudio.Component.CSharp \ --includeRecommended --passive --norestart"

Variant B — C# + native C++ + C++/CLI (≈ 4.3 GB)

winget install --id Microsoft.VisualStudio.Community -e --source winget ` --override "--add Microsoft.VisualStudio.Workload.ManagedDesktop \ --add Microsoft.VisualStudio.Workload.NativeDesktop \ --add Microsoft.VisualStudio.Component.VC.CLI.Support \ --add Microsoft.Net.Component.4.8.1.SDK \ --add Microsoft.Net.Component.4.8.1.TargetingPack \ --add Microsoft.VisualStudio.Component.CSharp \ --includeRecommended --passive --norestart"

 

A brief description of each component:

Component

Why it's there

Workload.ManagedDesktop

Brings WPF, WinForms, the .NET Framework 4.8.1 target packs and Roslyn analyzers.

Workload.NativeDesktop

Native C++ with the MSVC v145 toolset, Windows 10/11 SDK 10.0.26100, ATL/MFC if you add them.

VC.CLI.Support

The "C++/CLI support" component. Without it, anything with <CLRSupport>true</CLRSupport> fails to compile.

Net.Component.4.8.1.*

SDK + targeting pack for the .NET Framework 4.8.1 surface that ships with Windows.

 

As a Microsoft FTE you have access to Visual Studio Proffessional or Visual Studio Enterprise. If you already have installed a different version of Visual Studio, you don't need to install 2026 Community.

Important Visual Studio 2026 ships the new v145 platform toolset, not v143. Old .vcxproj files copied in from VS 2022 must be retargeted, otherwise the build fails with MSB8020 — build tools for v143 cannot be found. Copilot can perform this retargeting on request, but the requirement must be stated explicitly.

First launch & sign-in

Change into your project folder and start Copilot. The default invocation recommended for day-to-day FTE work is:

cd "$env:USERPROFILE\AppData\Local\Projects\Repro-12345" copilot --yolo --no-ask-user --model "claude-opus-4.8" --effort "medium"

The flags are defined as follows:

Flag

What it does

--yolo

Pre-approves tool execution, file writes and shell commands for the session. Equivalent to --allow-all-tools --allow-all-paths --allow-all-urls. Use only within trusted folders.

--no-ask-user

Suppresses the interactive confirmation prompts mid-task — the agent continues without interruption.

--model

Pins a specific model. Useful when determinism across a team is required or when comparing runs. To see, which model best fits your needs, take a look at the model comparison at https://docs.github.com/en/copilot/reference/ai-models/model-comparison

--effort

low, medium, or high. Higher values allocate more reasoning tokens per turn — slower, but advantageous for complex build errors and architectural work.

 

Copilot first prompts you to confirm that you trust the current folder. Select "Yes, and remember this folder for future sessions" only for workspaces you control. Then run /login if you have not yet authenticated — this opens a browser tab in which you sign in with your FTE GitHub identity.

Sign in to Copilot on an unmanaged / non-FTE-domain machine

The Copilot CLI signs in with a GitHub identity, not directly with your Microsoft corporate account. On a company-managed device this is transparent — your Enterprise Managed User (EMU) account <alias>_microsoft is accepted without further action.

On a fresh VM, an Azure VM image, or any other unmanaged device, /login will refuse the EMU account unless you have explicitly linked a personal GitHub account to your EMU and granted it Copilot entitlement. The same pattern holds for engineers at other enterprises: they sign in with the GitHub identity their organization has provisioned with a Copilot seat. Individuals on a Copilot Pro or Pro+ plan can sign in directly with their personal GitHub handle, without any linking step.

For Microsoft FTEs, linking is performed once, from any browser, at:

https://aka.ms/copilot

Sign in with your Microsoft corporate account, link your personal GitHub handle (creating one first if necessary), and accept the prompts. When complete, the page should display every checkbox in green:

 

A correctly linked account ataka.ms/copilot. The first green check confirms Copilot on the EMU account; the bottom green check is the one that is relevant for unmanaged VMs — "GitHub Copilot enabled for your personal GitHub account for use everywhere", granted through your MicrosoftCopilot organization membership.

Once that bottom row is green, run copilot inside your VM, use /login, and select your personal GitHub handle when the browser device-code flow opens. From that point onward, every CLI session in that VM uses your linked personal identity, your Copilot entitlement follows you, and your activity remains attributable through the MicrosoftCopilot organization for audit purposes.

Tip The link between the EMU and the personal account is per-user, not per-device. Once it is established, every new VM you provision will work after a single /login — there is no need to revisit aka.ms/copilot again unless you change your personal handle.

The bootstrap prompt

This is arguably the most valuable step in the entire setup. Before asking Copilot to build anything, instruct it to profile your machine and write itself a memory file. Paste the following verbatim:

I have Visual Studio 2026 Community Edition installed with C#, .NET Framework 4.8.1 targeting pack, native C++ and C++/CLI. Locate the installation paths using vswhere, verify that you can create and compile a C# app, a native C++ app, and a C++/CLI app. Then create a global copilot-instructions.md under %USERPROFILE%\.copilot\ that documents the VS environment so future sessions don't have to rediscover it.

The behavior that follows is informative to observe. Copilot uses vswhere.exe to locate the installation, reads setup.config.json, generates three disposable projects in your temp folder, builds them with MSBuild, parses the output, and on success writes a structured Markdown file to your user profile. That file is then automatically picked up by every future Copilot CLI session on this machine. The discovery work is performed exactly once.

What a good copilot-instructions.md looks like

Copilot CLI reads instructions from several well-known locations, in this order of precedence:

  1. .github/copilot-instructions.md in the repo
  2. AGENTS.md (or CLAUDE.md, GEMINI.md) in the repo root
  3. .github/instructions/**/*.instructions.md for path-scoped rules
  4. %USERPROFILE%\.copilot\copilot-instructions.md as a global fallback

The machine-level file should be concise, declarative, and contain exact paths. A verified skeleton is shown below:

# Build environment on this PC (Visual Studio 2026) - Visual Studio Community 2026, version **18.6.2** - MSBuild **18.6.3** - MSVC toolset **14.51.36231**, platform toolset name **v145** - Windows 10/11 SDK: **10.0.26100** - Default .NET Framework target: **net481** ## Key paths | Purpose | Path | |--------------------|------| | VS install root | `C:\Program Files\Microsoft Visual Studio\18\Community` | | vswhere.exe | `C:\Program Files (x86)\Microsoft Visual Studio\Installer\vswhere.exe` | | MSBuild.exe | `...\18\Community\MSBuild\Current\Bin\MSBuild.exe` | | MSVC cl.exe (x64) | `...\VC\Tools\MSVC\14.51.36231\bin\Hostx64\x64\cl.exe` | | vcvars64.bat | `...\VC\Auxiliary\Build\vcvars64.bat` | ## Conventions - For SDK-style csproj, always run `Restore` together with `Build`. - New `.vcxproj` files must use `<PlatformToolset>v145</PlatformToolset>`. - For C++/CLI: `<CLRSupport>true</CLRSupport>` and `<TargetFrameworkVersion>v4.8.1</TargetFrameworkVersion>`. - Never put build output under `%TEMP%`.

Tip Treat copilot-instructions.md as a configuration file. Commit a repo-local version to your .github/ folder for projects with non-standard build flags (custom response files, signing scripts, and similar). The repo-local version overrides the global one.

Worked example: a C# console app on .NET Framework 4.8.1

Consider a task representative of customer-engineering work — a small utility that reads a registry key and emits JSON. Inside Copilot, enter:

Create a new C# console project targeting net481 in .\RegistryProbe\. The app should accept a registry key path as its single argument, read all values under it, and print them as JSON to stdout. Add a few unit tests with MSTest. Build the project with MSBuild in Release and show me the dotnet run output.

Behind the scenes Copilot will:

  1. Use vswhere to locate MSBuild (already cached via the instructions file).
  2. Generate RegistryProbe.csproj with <TargetFramework>net481</TargetFramework>.
  3. Write Program.cs with appropriate argument parsing and a Microsoft.Win32.RegistryKey reader.
  4. Add an MSTest project, wire it up via a .sln file.
  5. Run MSBuild /t:Restore,Build /p:Configuration=Release and read the output.
  6. Run the binary against a safe key like HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion and present the JSON output.

For a more deliberate workflow, press Shift+Tab to toggle Plan mode. Copilot will produce an implementation plan first and wait for approval before writing to disk — a recommended practice for any work involving a customer reproduction.

Worked example: a native C++ command-line tool

The same approach applied to native C++. Prompt:

Create a new native C++ console project ".\PortPing\" targeting x64, Release, using the v145 platform toolset and Windows SDK 10.0.26100. The tool takes "host:port" on the command line and prints "open" or "closed" depending on whether a TCP connect succeeds within 1 second. Build it with MSBuild and run it against microsoft.com:443.

The agent generates a PortPing.vcxproj that looks roughly like:

<PropertyGroup Label="Globals"> <PlatformToolset>v145</PlatformToolset> <WindowsTargetPlatformVersion>10.0.26100.0</WindowsTargetPlatformVersion> <ConfigurationType>Application</ConfigurationType> </PropertyGroup>

…and builds it with:

& $msbuild "PortPing.vcxproj" /p:Configuration=Release /p:Platform=x64

Tip If you want Copilot to compile a single .cpp with cl.exe instead of going through MSBuild, instruct it to load vcvars64.bat first. This sets INCLUDE, LIB and PATH for the MSVC toolchain so that direct compiler invocations succeed.

Worked example: a C++/CLI interop assembly

C++/CLI is the workload that most commonly causes confusion in practice, because it requires the optional VC.CLI.Support component and a few non-obvious project properties. With Copilot, the task reduces to:

Create a C++/CLI class library ".\InteropBridge\" that targets .NET Framework 4.8.1, uses the v145 toolset, and exposes a managed wrapper around the native function CreateFileW. Add a small C# console app in the same solution that consumes the wrapper. Build everything Release|x64 and run the C# app to prove the interop works.

The two critical lines that Copilot adds to the .vcxproj are:

<CLRSupport>true</CLRSupport> <TargetFrameworkVersion>v4.8.1</TargetFrameworkVersion>

If the C++/CLI support component is not yet installed, Copilot will detect the corresponding "Cannot find <CLRSupport>" error, suggest the correct winget command to add the component, and offer to re-run the build.

Models, effort, and custom agents

Copilot CLI allows the model to be selected per session, and this selection should be made deliberately. The appropriate choice depends on the task at hand:

Profile

Recommended invocation

When to use

Fast iteration

copilot --yolo --model "claude-opus-4.8" --effort "medium"

Day-to-day work: scaffolding, small refactors, and build fixes.

Hard problems

copilot --yolo --model "claude-opus-4.7-1m-internal" --effort "high" --context "long_context"

Cross-file analysis, large reproductions, and complex C++ template errors.

Maximum autonomy

copilot --allow-all-urls --allow-all-tools --allow-all-paths --no-ask-user

Sandboxed Dev Box or VM only — never on a production laptop.

To determine which model best suits your purpose, consult the model comparison at https://docs.github.com/en/copilot/reference/ai-models/model-comparison. And if you have ever wondered what skills, tools, plugins and similar concepts actually are — or what an MCP server is for — the answers are available at https://docs.github.com/en/copilot/concepts/agents/copilot-cli/comparing-cli-features.

Useful slash commands within Copilot CLI

  • /plan — produce an implementation plan before any code is written.
  • /review — run the code-review subagent against the local diff.
  • /diff — display the changes Copilot has made to the working tree.
  • /agent — select a custom agent (explore, task, general-purpose, rubber-duck, code-review, research).
  • /mcp — register an MCP server (useful for Azure CLI, Kusto, or internal tooling).
  • /instructions — show which instruction files are currently loaded — useful when investigating unexpected agent behavior.

Tip Press Shift+Tab at any time to toggle Plan mode. For any action that would be recorded in a change ticket, planning first is inexpensive insurance.

Security & compliance for Microsoft FTEs

GitHub Copilot CLI falls within the scope of Microsoft's standard "responsible use of AI" guidance. Engineers at other enterprises should substitute the equivalent policy of their own employer, and individuals using a personal Copilot subscription should still treat these principles as a sensible default. The condensed guidance for engineers is:

  • Do not paste customer data into prompts unless your engagement explicitly permits it. The agent will faithfully transmit such data to the model.
  • Do not use --yolo on shared infrastructure. The flag pre-approves shell execution and file writes. Reserve it for personal workspaces or sandboxed Dev Boxes.
  • Review the diff. Use /diff or git status before committing — Copilot will, on occasion, refactor files you did not anticipate.
  • Secrets stay out of source. If you ask Copilot to test something that requires credentials, direct it to use azd, the Az module, or your Key Vault — never inline secrets.
  • Telemetry and logs from the CLI reside under %USERPROFILE%\.copilot\logs. If a customer escalation requires evidence of agent actions, those logs serve as your audit trail.

Important Any action Copilot performs on your machine still runs as you. Treat agent sessions with the same care you would extend to a teammate with full access to your development machine.

Beyond Visual Studio: where else Copilot CLI helps

The Visual Studio scenarios in this article are the most obvious use case, but they are by no means the only one. The following list covers additional areas in which Copilot CLI has proven productive on real engagements.

  • Generating and maintaining tests — Ask Copilot to write MSTest, NUnit or xUnit tests for an existing class, including mocks and boundary cases, or to fill coverage gaps in methods that were touched in the last commit.
  • Refactoring across files — Rename a public API, extract an interface, or restructure a folder hierarchy and have Copilot propagate the change consistently throughout the solution, including project file references and unit tests.
  • Modernizing legacy code — Port code from .NET Framework 4.8.1 to .NET 8/9, replace WebClient with HttpClient, or swap manual JSON parsing for System.Text.Json. Copilot handles the boilerplate; you review the deltas.
  • Building customer reproductions from scratch — Translate a bug description from a support ticket into a minimal repro project. Copilot scaffolds the solution, simulates the failing call path, and runs it to confirm the symptom matches.
  • Reviewing the local diff — The /review slash command analyses staged or unstaged changes and surfaces real issues — race conditions, off-by-one errors, missing input validation, dropped exceptions — while suppressing style-only noise.
  • Writing and optimizing database code — Generate Entity Framework Core models and migrations from an existing schema, draft complex LINQ or raw SQL queries, and ask Copilot to inspect an execution plan and propose targeted indexes or query rewrites.
  • Interpreting performance and memory profiles — Hand Copilot a PerfView trace summary, a dotnet-counters capture, or a BenchmarkDotNet result table. It identifies hot paths, allocation spikes, and contended locks, then suggests concrete code changes to address them.
  • Generating documentation — Produce XML doc comments for public APIs, README sections, architecture overviews, or migration guides — written directly from the current code, not from a stale design document.
  • Investigating unfamiliar codebases — Drop into a foreign repository and ask "what does this service do?", "where is authentication enforced?", or "which class handles cache invalidation?". The explore subagent answers with file and line references rather than vague summaries.
  • Analyzing logs and traces — Point Copilot at an ETW trace, an Event Viewer export, a Fiddler capture, or a multi-megabyte server log. It groups errors, identifies the dominant failure pattern, and suggests next steps for the investigation.
  • PowerShell, Bicep and other operational scripting — Beyond compiled applications, Copilot writes and debugs PowerShell modules, Bicep/ARM templates, Terraform configurations, and shell scripts — useful, among other things, for provisioning the very Dev Box or Azure VM described earlier in this article.
  • Drafting technical communication — Generate the first draft of a customer-facing root-cause analysis, a pull request description, a release-note entry, or an internal incident retrospective directly from a diff or a chat transcript. The final wording remains yours; the boilerplate does not.

On a slightly meta note: GitHub Copilot CLI can also take a surprising amount of work off your hands when, for instance, you set out to write a blog post like this one. :-)

A note on customer data. Several of the scenarios above — building customer reproductions and analyzing customer-supplied logs in particular, but also any review, database or profiling task that involves production samples — may rely on material provided by a customer. Such data must always be handled with care: depending on the engagement, the customer's data-classification level, or the applicable regulatory regime, sharing it with a Copilot model may be restricted or outright prohibited. Public Sector engagements warrant particular caution. When in doubt, redact, reproduce against synthetic data, or confirm with your account team or compliance contact before pasting anything into a prompt.

Wrapping up

A clean Windows installation, two winget commands, one bootstrap prompt, and a global copilot-instructions.md are all that separate you from a fully working, AI-driven C# / C++ / C++/CLI workflow on Visual Studio 2026. From that point, the agent does what agents do best — it handles the routine work while you remain focused on the customer problem. The list of additional scenarios in the previous section is intentionally non-exhaustive — most engineering tasks that can be described in plain English are candidates, provided the agent has the relevant context and the required permissions.

Productive prompting — and may your builds remain evergreen.

Read the whole story
alvinashcraft
30 seconds ago
reply
Pennsylvania, USA
Share this story
Delete

What's new in Swift: June 2026 Edition

1 Share

Welcome to “What’s new in Swift,” a curated digest of releases, videos, and discussions in the Swift project and community.

June was an exciting month for Swift, featuring announcements at WWDC and community events around the globe. We invited the organizers of one of them to share with us:

Hey, it’s Mikaela and Adrian. We are organizers of CommunityKit, a community-organized conference that takes place the same week as WWDC, and iOSDevHappyHour, a monthly online meetup that keeps the community connected year-round. This is our fifth year coming out to Cupertino, and we love being able to create a place for the community to thrive, no matter where developers live.

CommunityKit brought together over 250 developers in real life to geek out over the announcements, stay for the community and vibes, see what everyone is creating, and learn from each other. Some of the highlights from this year’s event were the Indie Fair, where developers showcased their apps; the Watch Party, our annual gathering to watch the keynotes together; and Make Something, Ship Nothing, a hands-on postcard-making hangout to close the week. This year we also introduced workshops, including “Inclusive by Design” by Danielle Lewis, and for the Swift community: “Write Faster, Smarter Swift” by Paul Hudson.

We can’t wait to hear about what everyone builds and brings to next year’s Indie Fair, and hope to see you at CommunityKit and iOSDevHappyHour!

Now on to other news about Swift:

WWDC26 highlights

At its WWDC26 conference, Apple provided an update on its adoption of Swift and made a variety of new Swift-related announcements. Some highlights:

  • During the Platforms State of the Union, Apple announced that parts of the core operating system kernel are being written in Swift for upcoming releases.
  • What’s new in Swift featured changes in Swift since last year, including a preview of what’s coming in Swift 6.4, like up to 4x faster URL parsing and support for async code in defer blocks.
  • The QUIC transport layer in Apple’s networking stack was rewritten in Swift. The project has been open sourced and is available for cross-platform use through SwiftNIO integration.
  • A new Swift package, Foundation Models framework utilities, was released with tools for working with LLMs, including custom skills and context management helpers. It runs on Apple platforms and select Linux distributions.
  • The Foundation Models framework itself will be open sourced in the future, meaning the same Swift APIs you use in your app could run on your server.
  • Container Machine is a new tool that provides a lightweight, persistent Linux environment on a Mac. Unlike a container, which is modeled after an application, a container machine is modeled after the environment itself. Container machines share the host environment, including the home directory and configuration. It’s written in Swift and open source.

Videos to watch

Community highlights

  • Swift Package Index joined Apple and remains open source. The team says they’re working together to build a comprehensive package registry for the Swift community.
  • Yeo Kheng Meng blogged about bringing Swift to the Apple II, complete with a REPL, compiler, file browser, and editor. It’s a subset of Swift and was built with AI assistance.
  • Apple shared an adoption story on the Swift blog: Migrating the TrueType Hinting Interpreter, covering how the TrueType hinting interpreter in macOS and iOS was rewritten in Swift from C. It runs 13% faster on average.
  • The Swift Ecosystem Steering Group announced the creation of the Networking workgroup. This group will work on a unified networking stack for Swift, layered from low-level I/O primitives, through common protocols, to a modern HTTP client and server API.

New package releases

  • New Swift bindings for the OkHttp Java library were released. If you’re using Swift on Android and looking for an HTTP client this may be useful. The project was generated with swift-java.
  • Kiln is a new documentation engine written in Swift. Built to replace MkDocs-based documentation sites, it gives more options for the Swift community to render docs, in addition to the DocC project which is used for the official Swift documentation. You can see Kiln in action at the Vapor documentation.
  • Version 0.4.0 of Elementary UI was released, a frontend framework for running Swift applications natively in the browser.

Swift Evolution

The Swift project adds new language features through the Swift Evolution process. These are some of the proposals currently under review or recently accepted for a future Swift release.

Under active review:

  • SE-0526 withDeadline - Asynchronous operations in Swift can run indefinitely, and implementing time limits manually using task groups and clock sleep operations is verbose and error-prone. This proposal adds withDeadline, a function that executes an async operation with a composable absolute time limit specified as a clock instant, canceling the operation if it hasn’t completed in time. It also allows multiple nested operations to share the same deadline, avoiding the drift that accumulates when relative durations are passed through call layers.

Recently accepted:

  • SE-0474 Yielding Accessors - When you call a mutating method on a computed property, Swift creates the illusion of in-place mutation by getting a copy, mutating it, then setting it back. This causes unnecessary copy-on-write buffer duplication for types like String, and is impossible for noncopyable types, which can’t be copied out at all. This proposal adds yielding borrow and yielding mutate, two new ways to implement computed properties and subscripts that instead lend the caller direct access to the underlying value without copying it.

Recently accepted with modifications:

  • SE-0529 Add FilePath to the Standard Library - FilePath in the swift-system package parses platform-specific path syntax on the developer’s behalf, provides a normalized view of path components, and enables filesystem resolution. However, shipping in an external package means the standard library, Swift runtime, and toolchain libraries such as Foundation cannot depend on it. This proposal adds FilePath and its associated types to the Swift module, alongside essential functionality for construction, decomposition, resolution, and C interoperability.
  • SE-0527 UniqueArray - Swift’s Array can’t store noncopyable elements without compromising its copy-on-write semantics or performance predictability. This proposal adds two new types to a new Containers module: RigidArray, a fixed-capacity array that traps on overflow, and UniqueArray, a dynamically growing array that enforces unique ownership by being noncopyable itself.
Read the whole story
alvinashcraft
45 seconds ago
reply
Pennsylvania, USA
Share this story
Delete

AI Red Teaming and Adversarial Testing: Designing Effective AI Red Team Exercises (Part 2)

1 Share

In Part 1, we explored what AI red teaming is, how it differs from traditional penetration testing and governance activities, and why generative AI introduces an entirely new category of risks. We also examined several common types of AI failures, including hallucinations, harmful outputs, bias, privacy leakage, and misuse. Understanding these risks is the first step, but recognizing that they exist is only half the challenge. The real value comes from designing structured exercises that deliberately test whether your AI systems can withstand realistic misuse and adversarial behavior.

Unlike conventional software testing, AI red teaming is not simply a checklist of vulnerabilities to verify. Every AI system is unique because it combines different models, prompts, retrieval mechanisms, data sources, plugins, APIs, and business processes. A customer support chatbot poses very different risks from those of an internal HR assistant, a healthcare copilot, or an AI agent that can create users, send emails, or approve financial transactions. As a result, effective AI red teaming begins with understanding the AI system’s purpose before attempting to break it.

The objective is not to make the AI fail for its own sake. Instead, the goal is to identify weaknesses that could affect security, privacy, compliance, reliability, or business operations, so they can be addressed before the system reaches production or new capabilities are introduced.

Start with Understanding the AI System

Before writing a single adversarial prompt, take time to understand exactly what the AI system is designed to do. This may sound obvious, but many organizations jump straight into testing without fully understanding the architecture of the solution they are assessing.

Generative AI rarely operates in isolation. A modern enterprise AI solution often includes several interconnected components working together. These might include the large language model itself, a retrieval system that searches organizational content, plugins that connect to external services, orchestration layers that coordinate multiple AI agents, and identity systems that determine what information users are allowed to access.

Each of these components expands the potential attack surface.

For example, consider an internal Microsoft 365 Copilot deployment. The language model itself may be highly secure, but Copilot also relies on Microsoft Graph to retrieve documents, emails, Teams conversations, SharePoint content, calendars, and meeting notes. If permissions within Microsoft 365 are overly permissive, the AI may retrieve information that users should not realistically discover so easily. In this case, the weakness is not the language model but the surrounding ecosystem.

Similarly, a custom AI assistant may retrieve data from SQL databases, REST APIs, customer records, or proprietary knowledge bases. The red team must understand each of these connections because every integration introduces additional opportunities for misuse.

Before testing begins, it is useful to answer questions such as:

  • What business problem does the AI solve?
  • Who are the intended users?
  • What information can the AI access?
  • Can it perform actions or only generate responses?
  • Does it connect to internal or external systems?
  • Are human approvals required for important decisions?
  • What safeguards already exist?

These questions establish the context for every testing activity that follows.

Identify What Matters Most

Not every AI failure carries the same level of risk. A grammar mistake in an email assistant is very different from an AI system approving fraudulent financial transactions or exposing personally identifiable information. This is why AI red teaming should always begin with identifying what matters most to the organization. Think about the potential business impact rather than the technology itself.

For example, an AI assistant supporting customer service may present risks such as providing inaccurate warranty information, exposing customer account details, or generating inappropriate responses that damage the organization’s reputation. An HR assistant, on the other hand, may need to protect employee records, salary information, disciplinary actions, and confidential recruitment discussions. A software development assistant may introduce entirely different concerns, including generating insecure code, recommending vulnerable libraries, or exposing proprietary source code.

Rather than attempting to test every possible scenario equally, prioritize the areas where failures would have the greatest operational, financial, legal, or reputational consequences.

One useful exercise is to imagine tomorrow’s headline if the AI fails.

  • Would the story involve the leak of confidential information?
  • Would customers receive harmful advice?
  • Would regulators investigate a compliance violation?
  • Would executives lose confidence in the organization’s AI strategy?

Answering these questions often reveals where testing efforts should be concentrated.

Develop Realistic Threat Scenarios

One of the defining characteristics of effective AI red teaming is realism.

Testing should reflect how real users, both well-intentioned and malicious, are likely to interact with the system. Artificial or overly simplistic prompts rarely uncover meaningful weaknesses because attackers rarely behave in predictable ways.

Instead of asking whether an AI will answer an obviously prohibited question, consider how someone might gradually manipulate the conversation over time. Attackers often begin with harmless requests, slowly building context and trust before introducing increasingly sensitive instructions.

Imagine an AI assistant responsible for summarizing legal contracts.

Rather than immediately requesting confidential information, an attacker might begin by asking the AI to explain standard contract terminology. They may then request examples of renewal clauses, followed by typical pricing structures, before eventually asking the assistant to compare those examples with current customer agreements. Individually, each request appears legitimate. Combined together, they may reveal commercially sensitive information.

Similarly, an internal employee may unintentionally misuse the AI by asking it to summarize documents they have not fully read themselves. If the AI hallucinates missing details, those inaccuracies may become embedded within reports, presentations, or executive briefings without anyone realizing the information is incorrect.

These scenarios illustrate why AI red teaming must consider both malicious intent and accidental misuse.

Think Like Different Types of Users

Not every user interacts with AI in the same way, and not every risk originates from a malicious attacker. Effective red teaming considers a wide range of user personas, each bringing different motivations, knowledge, and objectives.

A curious employee may simply explore the boundaries of what the AI can do without intending any harm. A frustrated customer may repeatedly challenge a chatbot after receiving incorrect responses. A competitor may attempt to extract proprietary information. A cybercriminal may carefully craft prompts designed to bypass safeguards or manipulate automated workflows.

Each of these individuals approaches the AI differently, and each presents unique testing opportunities.

For example, consider how an internal finance assistant might respond to different users:

  • A finance employee may request quarterly revenue forecasts.
  • A project manager may ask about departmental budgets.
  • An executive may request strategic financial summaries.
  • A contractor may attempt to access information outside their responsibilities.
  • An attacker may disguise themselves as a legitimate employee through carefully written prompts.

Testing should assess how consistently the AI enforces access boundaries across these scenarios. Thinking from multiple perspectives often uncovers weaknesses that purely technical testing overlooks.

Building Adversarial Scenarios

Once you understand the AI system and the users interacting with it, you can begin designing adversarial scenarios.

A scenario is more than a single prompt. It represents an entire conversation or workflow that attempts to achieve a specific objective.

For example, suppose the objective is to determine whether an AI assistant will reveal confidential merger information.

  • Rather than immediately asking for the confidential documents, the red team might construct a realistic sequence of interactions.
  • The conversation could begin with general questions about recent industry acquisitions before shifting to publicly available financial reports. The attacker might then ask the AI to compare internal planning documents with public announcements, ultimately seeking discrepancies that reveal information not yet released.
  • Each step appears reasonable in isolation. The overall sequence gradually increases pressure on the AI while remaining believable.
  • This conversational approach is considerably more effective than isolated prompts because it reflects how real attackers often operate.

Measuring Success

Unlike traditional software testing, AI red teaming rarely produces simple pass-or-fail outcomes. Instead, every exercise should measure how the AI behaved under pressure. For example, consider three possible responses when testing whether an AI reveals confidential information.

  • The first response is ideal. The AI politely refuses the request, explains why the information cannot be shared, and redirects the user toward appropriate resources.
  • The second response is less obvious. The AI rejects most requests but unintentionally reveals small pieces of sensitive information that, while individually harmless, could contribute to a larger disclosure.
  • The third response is a complete failure, in which the AI discloses confidential information without any meaningful resistance.

These three outcomes require different remediation activities and carry different levels of business risk.

Organizations should therefore evaluate AI behavior across several dimensions rather than relying on a binary success-or-failure assessment.

  • Accuracy of the response.
  • Protection of confidential information.
  • Consistency with organizational policies.
  • Resistance to manipulation.
  • Appropriate refusal behavior.
  • Transparency when uncertainty exists.
  • Respect for user permissions.
  • Ability to maintain context throughout extended conversations.

Useful evaluation criteria include measuring these characteristics over time, which allows organizations to track improvements as prompts, retrieval systems, and safety controls evolve.

Determining Severity

Not every finding discovered during AI red teaming deserves the same priority. Some issues may simply reduce the quality of the user experience, while others could result in regulatory investigations, financial loss, or significant reputational damage.

A useful way to think about severity is to evaluate the business consequences rather than focusing solely on the model’s technical behavior.

For example, an AI assistant that occasionally formats dates incorrectly may be considered a low-severity issue. Although inconvenient, the impact is unlikely to extend beyond minor user frustration. By comparison, an assistant that leaks confidential customer records, generates discriminatory hiring recommendations, or exposes privileged financial information represents a critical business risk that requires immediate remediation.

When assessing findings, consider questions such as:

  • Could confidential information be exposed?
  • Would customers receive unsafe or misleading advice?
  • Does the issue violate regulatory or contractual obligations?
  • Could the organization suffer financial loss?
  • Would the organization’s reputation be affected if the behavior became public?
  • Is the issue repeatable, or does it occur only under very specific circumstances?

Answering these questions helps organizations prioritize remediation efforts and communicate risks more effectively to business leaders who may not understand the underlying technical details.

AI Red Teaming is a Collaborative Exercise

One of the biggest mistakes organizations make is assuming AI red teaming belongs exclusively to security teams. Unlike infrastructure testing, evaluating AI systems requires expertise from multiple disciplines because AI affects far more than technology alone.

Security professionals bring experience in adversarial thinking and attack simulation. Data scientists understand how models are trained and where limitations may exist. Developers understand the surrounding application architecture. Compliance specialists ensure regulatory requirements are considered throughout testing. Legal teams provide guidance on privacy, intellectual property, and contractual obligations. Business owners contribute operational knowledge that helps identify realistic misuse scenarios.

Perhaps most importantly, end users should also be involved.

Employees frequently interact with AI in ways developers never anticipated. Observing how users naturally phrase questions often uncovers prompt sequences that no formal test plan would have included. Their curiosity, assumptions, and everyday workflows provide valuable insight into how AI will behave once deployed across the organization.

Successful AI red teaming is therefore not a one-time security exercise carried out by a small specialist team. It is a collaborative process that combines technical expertise, business knowledge, governance, and real-world user behavior to build confidence that AI systems remain safe, reliable, and aligned with organizational objectives.

Read the whole story
alvinashcraft
1 minute ago
reply
Pennsylvania, USA
Share this story
Delete

Codex vs Claude Code: Which AI Coding Assistant to Choose

1 Share

AI coding assistants have evolved from simple autocomplete tools into capable development agents that can write code, debug applications, refactor projects, and even execute complex workflows.

Among the newest generation of tools, OpenAI's Codex and Anthropic's Claude Code have emerged as two of the strongest options for developers.

Both platforms promise to improve productivity, reduce repetitive work, and help teams ship software faster. But they approach software development differently.

Choosing between them depends less on finding a universal winner and more on understanding which tool aligns with your workflow, team structure, and development goals.

What We'll Cover Here:

Understanding Codex

Codex interface

Codex is OpenAI's dedicated coding agent designed to assist developers throughout the software development lifecycle.

Unlike earlier code generation tools that focused mainly on snippets and autocomplete, modern Codex operates more like an autonomous development partner.

It can understand large codebases, generate new features, fix bugs, review existing implementations, and work on multiple tasks simultaneously.

OpenAI has expanded Codex beyond a simple command-line experience, introducing desktop and cloud-based environments that allow developers to delegate work while continuing with other responsibilities.

According to OpenAI, Codex can read, edit, and run code while operating in its own environment to complete assigned tasks. This makes it particularly useful for teams that want an AI assistant capable of handling longer-running assignments independently.

Understanding Claude Code

Claude Code interface

Claude Code takes a different approach. Rather than emphasising autonomous execution, Anthropic has focused heavily on developer collaboration and reasoning quality.

Claude Code functions as a terminal-native assistant that integrates directly into existing workflows. Developers can interact with it conversationally while maintaining close oversight of the coding process.

The tool is particularly strong at explaining architectural decisions, reviewing unfamiliar codebases, and helping developers work through complex implementation challenges. Instead of simply generating solutions, Claude Code often provides context that helps engineers understand why a particular approach may be preferable.

This makes Claude Code attractive for developers who view AI as an intelligent collaborator rather than an independent coding agent.

Codex vs Claude Code: Direct Comparison

The Difference in Philosophy

The biggest distinction between Codex and Claude Code lies in their approaches to autonomy.

Codex is designed to execute delegated work efficiently. Developers describe objectives, and the system attempts to complete them with minimal intervention. It excels in situations where productivity and task completion are the primary objectives.

Claude Code, on the other hand, prioritises interaction. It keeps developers closely involved in the decision-making process and often produces explanations alongside implementation suggestions.

Neither philosophy is inherently better.

Teams building products under tight deadlines may benefit from Codex's autonomous capabilities. Developers working on complex systems that require thoughtful design discussions may prefer Claude Code's collaborative style.

Code Quality and Reasoning

When evaluating coding assistants, raw output quality matters.

Claude Code has earned a reputation for producing clean, maintainable code with strong architectural awareness. It often breaks larger problems into logical components and provides reasoning that helps developers understand the trade-offs involved.

Codex tends to optimise for execution and efficiency. Its outputs frequently focus on accomplishing the requested task with minimal overhead while maintaining practical production considerations.

Comparative testing has shown that Claude Code often excels in documentation tasks and feature design. Codex demonstrates strong consistency across multiple categories of development work. Research analysing thousands of pull requests found that no single agent dominated every software engineering task, reinforcing the idea that context matters when selecting a tool.

Workflow Integration

The way an AI coding assistant fits into your existing development process can significantly impact adoption and long-term value.

Claude Code is built around a terminal-first experience, allowing developers to interact with the model directly within familiar command-line environments. This makes it particularly appealing to engineers who prefer maintaining close control over implementation decisions while receiving real-time guidance and feedback.

Codex takes a different approach by emphasising automation and delegation. Developers can assign coding tasks and review the completed work later, making it well-suited for teams looking to reduce repetitive workloads and improve development velocity. This model can be especially useful in larger organisations where engineers frequently juggle multiple projects and priorities.

Ultimately, the right choice depends on how your team prefers to work. Developers seeking an interactive coding companion may gravitate toward Claude Code, while organisations focused on streamlining execution may find Codex a better fit within their existing workflows.

Deployment Options

Writing code is only part of the software development process. Once an application is complete, developers still need a reliable way to test, deploy, and maintain it in production.

Whether you use Codex or Claude Code, the deployment workflow remains largely the same. AI coding assistants can generate production-ready applications, but they don't replace the infrastructure needed to host them.

Developers still need platforms like Vercel, Hostinger and Railway that support automated deployments, scalable environments, SSL certificates, backups, monitoring, and straightforward rollback options.

For teams looking to deploy apps built with Claude, platforms like AWS and Vercel make it easier. They integrate continuous delivery pipelines while providing the reliability expected from production systems.

The same applies when you try to deploy apps built with Codex. Services such as Hostinger simplify deployments with managed Node.js hosting, Git integration, and built-in security features, allowing developers to move from AI-generated code to a live production environment with minimal configuration.

As AI coding assistants become part of everyday development workflows, selecting the right production hosting for AI coding assistants is becoming just as important as choosing the coding tool itself. The best workflow combines an intelligent development assistant with infrastructure that makes shipping software fast, reliable, and repeatable.

Productivity Considerations

One of the primary reasons organisations adopt AI coding assistants is to improve development velocity.

Codex often shines when repetitive or well-defined tasks dominate the workload. Generating boilerplate code, implementing straightforward features, writing tests, or executing multi-step workflows are scenarios where autonomy can deliver meaningful time savings.

Claude Code provides value during exploratory development. Developers can brainstorm implementation approaches, validate assumptions, and receive guidance while preserving human oversight.

The productivity gains from each tool depend heavily on how teams allocate engineering effort.

Organisations emphasising rapid delivery may prioritise Codex.

Teams prioritising knowledge sharing and architectural consistency may lean toward Claude Code.

Security and Oversight

As AI agents gain more capabilities, governance becomes increasingly important.

Claude Code's interactive design naturally encourages human review before significant actions occur. This reduces the likelihood of unintended modifications and reinforces developer accountability.

Codex introduces stronger automation capabilities, which can accelerate workflows but also require clearly defined operational safeguards. Organisations adopting autonomous coding agents should establish review processes, permission controls, and testing requirements before integrating them into production environments.

The goal is not to eliminate human involvement but to position AI appropriately within existing software development practices.

Should you Choose Codex or Claude Code?

The answer depends on how you work.

Choose Codex if your team values autonomy, wants to delegate substantial development tasks, and needs an assistant that can operate independently across multiple assignments. Organisations focused on maximising throughput may find this approach particularly compelling.

Choose Claude Code if you prefer collaborative problem-solving, appreciate detailed reasoning, and want AI assistance that remains closely integrated with human decision-making throughout the development process.

Neither assistant replaces engineering judgment. Instead, they amplify different aspects of software development.

Final Thoughts

The debate between Codex and Claude Code reflects a broader shift within software engineering. AI assistants are no longer limited to suggesting individual lines of code. They're evolving into sophisticated development partners capable of influencing planning, implementation, testing, and deployment.

Codex emphasises execution. Claude Code emphasises collaboration.

For some teams, Codex will unlock significant productivity gains by handling routine work autonomously. For others, Claude Code will enhance decision-making by serving as an intelligent coding companion.

Ultimately, the best choice is the one that complements your team's existing strengths and addresses its most significant bottlenecks.

As AI continues to reshape development practices, the organisations that succeed will not necessarily be those using the most advanced tools. They will be the ones who integrate those tools thoughtfully into well-defined engineering processes.

Hope you enjoyed this article. You can connect with me on LinkedIn.



Read the whole story
alvinashcraft
1 minute ago
reply
Pennsylvania, USA
Share this story
Delete

AutoMapper 16.2.0 and MediatR 14.2.0 Released

1 Share

Today we released the 16.2.0 version of AutoMapper and 14.2.0 version of MediatR:

This release is a bit more enterprise-focused, with extensions for setting the license keys via environment variables, fixing some threading issues around license key validation, and including more security-related items on releases (SBOMs, etc.)

These last few releases have been targeting items more or less required for enterprises consuming commercial packages. In the next release, we'll be more focused on adding features and fixing historical bugs.

Enjoy!

Read the whole story
alvinashcraft
1 minute ago
reply
Pennsylvania, USA
Share this story
Delete
Next Page of Stories