ESLint v10.0.0 released


Highlights

ESLint v10.0.0 is a major release that includes several new features and breaking changes. Here are some of the most notable updates.

Installing

Because this is a major release, you may not automatically be upgraded by npm. To ensure you are using this version, run:

npm i eslint@10.0.0 --save-dev

Node.js < v20.19.0, v21.x, v23.x no longer supported

As of this post, Node.js v24.x is the LTS release, and as such we are dropping support for all versions of Node.js prior to v20.19.0 as well as v21.x and v23.x.

Migration Guide

As there are a lot of changes, we’ve created a migration guide describing the breaking changes in great detail along with the steps you should take to address them. We expect that most users should be able to upgrade without any build changes, but the migration guide should be a useful resource if you encounter problems.

New configuration file lookup algorithm

ESLint v10.0.0 locates eslint.config.* by starting from the directory of each linted file rather than the current working directory as was the case with ESLint v9.x. The new behavior allows for using multiple configuration files in the same run and can be particularly useful in monorepo setups.

In ESLint v9.x, this config lookup behavior could be enabled with the v10_config_lookup_from_file feature flag. In ESLint v10.0.0, this behavior is now the default and the v10_config_lookup_from_file flag has been removed.
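
As a hedged sketch of what this enables in a monorepo, each package can carry its own eslint.config.js, and linting a file picks up the configuration nearest to it (the file paths and rules below are illustrative, not from the release notes):

// packages/app/eslint.config.js — used for files under packages/app
export default [
    {
        rules: {
            "no-console": "error"
        }
    }
];

// packages/lib/eslint.config.js — used for files under packages/lib
export default [
    {
        rules: {
            "no-console": "off"
        }
    }
];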

Removed eslintrc functionality

As announced in Flat config rollout plans, the eslintrc config system has been completely removed in ESLint v10.0.0. Specifically, this means:

  1. The ESLINT_USE_FLAT_CONFIG environment variable is no longer honored.
  2. The CLI no longer supports eslintrc-specific arguments (--no-eslintrc, --env, --resolve-plugins-relative-to, --rulesdir, --ignore-path).
  3. .eslintrc.* and .eslintignore files will no longer be honored.
  4. /* eslint-env */ comments are reported as errors.
  5. The loadESLint() function now always returns the ESLint class (see the sketch after this list).
  6. The Linter constructor configType argument can only be "flat" and will throw an error if "eslintrc" is passed.
  7. The following Linter eslintrc-specific methods are removed:
    • defineParser()
    • defineRule()
    • defineRules()
    • getRules()
  8. The following changes have been made to the /use-at-your-own-risk entrypoint:
    • LegacyESLint is removed
    • FileEnumerator is removed
    • shouldUseFlatConfig() function will always return true
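
A minimal sketch of the loadESLint() change from item 5 above (usage follows ESLint's Node.js API; the file pattern is illustrative):

import { loadESLint } from "eslint";

// In v10, loadESLint() always resolves to the flat-config ESLint class.
const DefaultESLint = await loadESLint();
const eslint = new DefaultESLint();
const results = await eslint.lintFiles(["src/**/*.js"]);
console.log(results.flatMap((result) => result.messages));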

JSX references are now tracked

ESLint v10.0.0 now tracks JSX references, enabling correct scope analysis of JSX elements.

Previously, JSX identifiers weren’t tracked as references, which could lead to incorrect results in rules relying on scope information. For example:

import { Card } from "./card.jsx";

export function createCard(name) {
  return <Card name={name} />;
}

Prior to v10.0.0:

  • False positives: <Card> could be reported as “defined but never used” (no-unused-vars).
  • False negatives: Removing the import might not trigger an “undefined variable” error (no-undef).

Starting with v10.0.0, <Card> is treated as a normal reference to the variable in scope. This eliminates confusing false positives/negatives, aligns JSX handling with developer expectations, and improves the linting experience in projects using JSX.
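
For projects that rely on this, a hedged config sketch that enables JSX parsing so the core rules can track those references (the parser options are standard espree options; the file glob is illustrative):

// eslint.config.js (sketch)
export default [
    {
        files: ["**/*.jsx"],
        languageOptions: {
            parserOptions: {
                ecmaFeatures: { jsx: true }
            }
        },
        rules: {
            "no-unused-vars": "error",
            "no-undef": "error"
        }
    }
];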

Espree and ESLint Scope now include types

Beginning with Espree v11.1.0 and ESLint Scope v9.1.0, these packages now contain built-in type definitions.

Previously, type definitions were provided by Definitely Typed packages @types/espree and @types/eslint-scope. There are several differences between the old and new type definitions, mostly bug fixes. If your code relies on types for the Espree and ESLint Scope packages, check if there are any updates needed.

Enhancements to RuleTester

Since its earliest days, ESLint has provided the RuleTester API to help plugin authors test their rules against custom test cases and configurations. This release introduces several enhancements to RuleTester to enforce more robust test definitions and improve debugging.

Assertion options

The RuleTester#run() method now supports assertion options, specifically requireMessage, requireLocation, and requireData, to let developers enforce stricter requirements in rule tests. These options enforce that every invalid test case explicitly checks violation messages, locations, and data, ensuring that a test fails if it doesn’t meet the requirements.

  • requireMessage

    • Ensures every test case includes a message check.
    • Accepts:
      • true: Must use an array of objects for errors, rather than a numeric count shorthand, to check the problems reported by a rule. Each object must include a message or messageId property as usual to check the message of a reported problem.
      • "message": Must check using message only.
      • "messageId": Must check using messageId only.
    • Purpose: Prevents tests from passing without verifying the actual message.
  • requireLocation

    • Ensures every test case includes a location check.
    • Accepts: true
    • Requires line and column in each object of the errors array.
    • endLine and endColumn are optional if the actual report doesn’t include them.
    • Purpose: Guarantees that tests validate the location of an error.
  • requireData

    • Ensures every test case includes a data check.
    • Accepts: true
    • When set to true, RuleTester will require invalid test cases to include a data object whenever a messageId references a message with placeholders. This helps ensure that tests remain consistent with rule messages that rely on placeholder substitution.

Example Usage:

ruleTester.run("my-rule", rule, {
  valid: [
    { code: "var foo = true;" }
  ],
  invalid: [
    {
      code: "var invalidVariable = true;",
      errors: [
        { message: "Unexpected invalid variable.", line: 1, column: 5 }
      ]
    }
  ],
  assertionOptions: {
    requireMessage: true,
    requireLocation: true
  }
});

Improved location reporting for failing tests

RuleTester now decorates stack traces with information that makes it easier to locate failing test cases in your source code. For example, the test output will now include stack trace lines indicating the index of the failing test case in the invalid array and the file and line number where that test case is defined.

Note that these line numbers may not always be included, depending on how your tests are structured. When the lines cannot be determined precisely, the failing test index and the printed code snippet are still available to locate the test case.

countThis option in max-params rule

The max-params rule now supports the new countThis option, which supersedes the deprecated countVoidThis. With the setting countThis: "never", the rule will now ignore any this annotation in a function’s argument list when counting the number of parameters in a TypeScript function. For example:

function doSomething(this: SomeType, first: string, second: number) {
 // ...
}

will be considered a function taking only 2 parameters.
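
A hedged config sketch for the new option (the countThis value comes from this post; the max value is illustrative):

// eslint.config.js (sketch)
export default [
    {
        rules: {
            "max-params": ["error", { max: 3, countThis: "never" }]
        }
    }
];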

color property in formatter context

When the --color or --no-color option is specified on the command line, ESLint sets an additional color property on the context object passed to a formatter (the second argument of the format() method). This property is true for --color and false for --no-color. Custom formatters can use this value to decide whether to apply color styling, treating the option as an indication of whether the terminal supports colors.
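
A hedged sketch of a custom formatter that honors the new property (the (results, context) signature is ESLint's standard custom-formatter API; the styling helper is illustrative):

// my-formatter.js (sketch)
export default function (results, context) {
    // Apply ANSI color codes only when ESLint indicates color is enabled.
    const red = (text) => (context.color ? `\u001b[31m${text}\u001b[0m` : text);
    return results
        .map((result) => `${result.filePath}: ${red(String(result.errorCount))} error(s)`)
        .join("\n");
}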

Updated eslint:recommended

The eslint:recommended configuration is updated to include new rules that we feel are important.

Removed deprecated rule context members

The following rule context members are no longer available:

  • context.getCwd() - Use context.cwd instead
  • context.getFilename() - Use context.filename instead
  • context.getPhysicalFilename() - Use context.physicalFilename instead
  • context.getSourceCode() - Use context.sourceCode instead
  • context.parserOptions - Use context.languageOptions or context.languageOptions.parserOptions instead
  • context.parserPath - No replacement
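
The replacements are direct property reads. A minimal rule sketch (the rule shape follows the standard rule API; the rule itself is illustrative):

// my-rule.js (sketch)
export default {
    create(context) {
        const cwd = context.cwd;                    // was context.getCwd()
        const filename = context.filename;          // was context.getFilename()
        const physical = context.physicalFilename;  // was context.getPhysicalFilename()
        const sourceCode = context.sourceCode;      // was context.getSourceCode()
        return {};
    }
};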

Removed deprecated SourceCode methods

The following SourceCode methods are no longer available:

  • getTokenOrCommentBefore() - Use getTokenBefore() with the { includeComments: true } option instead
  • getTokenOrCommentAfter() - Use getTokenAfter() with the { includeComments: true } option instead
  • isSpaceBetweenTokens() - Use isSpaceBetween() instead
  • getJSDocComment() - No replacement
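
A minimal sketch of the token-method replacements listed above (options per the SourceCode API; the visitor is illustrative):

// my-rule.js (sketch)
export default {
    create(context) {
        const { sourceCode } = context;
        return {
            Identifier(node) {
                const before = sourceCode.getTokenBefore(node, { includeComments: true }); // was getTokenOrCommentBefore()
                const after = sourceCode.getTokenAfter(node, { includeComments: true });   // was getTokenOrCommentAfter()
                if (before && after && sourceCode.isSpaceBetween(before, after)) {         // was isSpaceBetweenTokens()
                    // ...
                }
            }
        };
    }
};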

Program AST node range spans entire source text

Starting with ESLint v10.0.0, the Program AST node’s range spans the entire source text. Previously, leading and trailing comments and whitespace were not included in the range.
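
A hedged way to observe the change, assuming your installed espree reflects the v10 behavior (the parse options are standard espree options):

import * as espree from "espree";

const code = "// leading comment\nvar a = 1;\n// trailing comment\n";
const ast = espree.parse(code, { ecmaVersion: "latest", range: true });

// With the new behavior, Program.range covers the whole source text:
console.log(ast.range); // expected: [0, code.length]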

Jiti < v2.2.0 no longer supported

ESLint v10.0.0 drops support for jiti versions prior to 2.2.0 when loading TypeScript configuration files due to known issues that can cause compatibility problems when configurations load certain plugins.
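
If you load a TypeScript configuration file (eslint.config.ts), updating jiti in place should be enough (a sketch; the version range is taken from this post):

npm i jiti@^2.2.0 --save-dev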

Breaking Changes

  • f9e54f4 feat!: estimate rule-tester failure location (#20420) (ST-DDT)
  • a176319 feat!: replace chalk with styleText and add color to ResultsMeta (#20227) (루밀LuMir)
  • c7046e6 feat!: enable JSX reference tracking (#20152) (Pixel998)
  • fa31a60 feat!: add name to configs (#20015) (Kirk Waiblinger)
  • 3383e7e fix!: remove deprecated SourceCode methods (#20137) (Pixel998)
  • 501abd0 feat!: update dependency minimatch to v10 (#20246) (renovate[bot])
  • ca4d3b4 fix!: stricter rule tester assertions for valid test cases (#20125) (唯然)
  • 96512a6 fix!: Remove deprecated rule context methods (#20086) (Nicholas C. Zakas)
  • c69fdac feat!: remove eslintrc support (#20037) (Francesco Trotta)
  • 208b5cc feat!: Use ScopeManager#addGlobals() (#20132) (Milos Djermanovic)
  • a2ee188 fix!: add uniqueItems: true in no-invalid-regexp option (#20155) (Tanuj Kanti)
  • a89059d feat!: Program range span entire source text (#20133) (Pixel998)
  • 39a6424 fix!: assert ‘text’ is a string across all RuleFixer methods (#20082) (Pixel998)
  • f28fbf8 fix!: Deprecate "always" and "as-needed" options of the radix rule (#20223) (Milos Djermanovic)
  • aa3fb2b fix!: tighten func-names schema (#20119) (Pixel998)
  • f6c0ed0 feat!: report eslint-env comments as errors (#20128) (Francesco Trotta)
  • 4bf739f fix!: remove deprecated LintMessage#nodeType and TestCaseError#type (#20096) (Pixel998)
  • 523c076 feat!: drop support for jiti < 2.2.0 (#20016) (michael faith)
  • 454a292 feat!: update eslint:recommended configuration (#20210) (Pixel998)
  • 4f880ee feat!: remove v10_* and inactive unstable_* flags (#20225) (sethamus)
  • f18115c feat!: no-shadow-restricted-names report globalThis by default (#20027) (sethamus)
  • c6358c3 feat!: Require Node.js ^20.19.0 || ^22.13.0 || >=24 (#20160) (Milos Djermanovic)

Features

Bug Fixes

  • 436b82f fix: update eslint (#20473) (renovate[bot])
  • 1d29d22 fix: detect default this binding in Array.fromAsync callbacks (#20456) (Francesco Trotta)
  • 727451e fix: fix regression of global mode report range in strict rule (#20462) (ntnyq)
  • e80485f fix: remove fake FlatESLint and LegacyESLint exports (#20460) (Francesco Trotta)
  • 9eeff3b fix: update esquery (#20423) (cryptnix)
  • b34b938 fix: use Error.prepareStackTrace to estimate failing test location (#20436) (Francesco Trotta)
  • 51aab53 fix: update eslint (#20443) (renovate[bot])
  • 23490b2 fix: handle space before colon in RuleTester location estimation (#20433) (Francesco Trotta)
  • f244dbf fix: use MessagePlaceholderData type from @eslint/core (#20348) (루밀LuMir)
  • d186f8c fix: update eslint (#20427) (renovate[bot])
  • 2332262 fix: error location should not modify error message in RuleTester (#20421) (Milos Djermanovic)
  • ab99b21 fix: ensure filename is passed as third argument to verifyAndFix() (#20405) (루밀LuMir)
  • 8a60f3b fix: remove ecmaVersion and sourceType from ParserOptions type (#20415) (Pixel998)
  • eafd727 fix: remove TDZ scope type (#20231) (jaymarvelz)
  • 39d1f51 fix: correct Scope typings (#20404) (sethamus)
  • 2bd0f13 fix: update verify and verifyAndFix types (#20384) (Francesco Trotta)
  • ba6ebfa fix: correct typings for loadESLint() and shouldUseFlatConfig() (#20393) (루밀LuMir)
  • e7673ae fix: correct RuleTester typings (#20105) (Pixel998)
  • 53e9522 fix: strict removed formatters check (#20241) (ntnyq)
  • b017f09 fix: correct no-restricted-import messages (#20374) (Francesco Trotta)

Documentation

  • e978dda docs: Update README (GitHub Actions Bot)
  • 4cecf83 docs: Update README (GitHub Actions Bot)
  • c79f0ab docs: Update README (GitHub Actions Bot)
  • 773c052 docs: Update README (GitHub Actions Bot)
  • f2962e4 docs: document meta.docs.frozen property (#20475) (Pixel998)
  • 8e94f58 docs: fix broken anchor links from gerund heading updates (#20449) (Copilot)
  • 1495654 docs: Update README (GitHub Actions Bot)
  • 0b8ed5c docs: document support for :is selector alias (#20454) (sethamus)
  • 1c4b33f docs: Document policies about ESM-only dependencies (#20448) (Milos Djermanovic)
  • 3e5d38c docs: add missing indentation space in rule example (#20446) (fnx)
  • 63a0c7c docs: Update README (GitHub Actions Bot)
  • 65ed0c9 docs: Update README (GitHub Actions Bot)
  • b0e4717 docs: [no-await-in-loop] Expand inapplicability (#20363) (Niklas Hambüchen)
  • fca421f docs: Update README (GitHub Actions Bot)
  • d925c54 docs: update config syntax in no-lone-blocks (#20413) (Pixel998)
  • 7d5c95f docs: remove redundant sourceType: "module" from rule examples (#20412) (Pixel998)
  • 02e7e71 docs: correct .mts glob pattern in files with extensions example (#20403) (Ali Essalihi)
  • 264b981 docs: Update README (GitHub Actions Bot)
  • 5a4324f docs: clarify "local" option of no-unused-vars (#20385) (Milos Djermanovic)
  • e593aa0 docs: improve clarity, grammar, and wording in documentation site README (#20370) (Aditya)
  • 3f5062e docs: Add messages property to rule meta documentation (#20361) (Sabya Sachi)
  • 9e5a5c2 docs: remove Examples headings from rule docs (#20364) (Milos Djermanovic)
  • 194f488 docs: Update README (GitHub Actions Bot)
  • 0f5a94a docs: [class-methods-use-this] explain purpose of rule (#20008) (Kirk Waiblinger)
  • df5566f docs: add Options section to all rule docs (#20296) (sethamus)
  • adf7a2b docs: no-unsafe-finally note for generator functions (#20330) (Tom Pereira)
  • ef7028c docs: Update README (GitHub Actions Bot)
  • fbae5d1 docs: consistently use “v10.0.0” in migration guide (#20328) (Pixel998)
  • 778aa2d docs: ignoring default file patterns (#20312) (Tanuj Kanti)
  • 4b5dbcd docs: reorder v10 migration guide (#20315) (Milos Djermanovic)
  • 5d84a73 docs: Update README (GitHub Actions Bot)
  • 37c8863 docs: fix incorrect anchor link in v10 migration guide (#20299) (Pixel998)
  • 077ff02 docs: add migrate-to-10.0.0 doc (#20143) (唯然)
  • 3822e1b docs: Update README (GitHub Actions Bot)
  • 9f08712 Build: changelog update for 10.0.0-rc.2 (Jenkins)
  • 1e2c449 Build: changelog update for 10.0.0-rc.1 (Jenkins)
  • c4c72a8 Build: changelog update for 10.0.0-rc.0 (Jenkins)
  • 7e4daf9 Build: changelog update for 10.0.0-beta.0 (Jenkins)
  • a126a2a build: add .scss files entry to knip (#20389) (Francesco Trotta)
  • f5c0193 Build: changelog update for 10.0.0-alpha.1 (Jenkins)
  • 165326f Build: changelog update for 10.0.0-alpha.0 (Jenkins)

Chores

  • 1ece282 chore: ignore /docs/v9.x in link checker (#20452) (Milos Djermanovic)
  • 034e139 ci: add type integration test for @html-eslint/eslint-plugin (#20345) (sethamus)
  • f3fbc2f chore: set @eslint/js version to 10.0.0 to skip releasing it (#20466) (Milos Djermanovic)
  • afc0681 chore: remove scopeManager.addGlobals patch for typescript-eslint parser (#20461) (fnx)
  • 3e5a173 refactor: use types from @eslint/plugin-kit (#20435) (Pixel998)
  • 11644b1 ci: rename workflows (#20463) (Milos Djermanovic)
  • 2d14173 chore: fix typos in docs and comments (#20458) (o-m12a)
  • 6742f92 test: add endLine/endColumn to invalid test case in no-alert (#20441) (경하)
  • 3e22c82 test: add missing location data to no-template-curly-in-string tests (#20440) (Haeun Kim)
  • b4b3127 chore: package.json update for @eslint/js release (Jenkins)
  • f658419 refactor: remove raw parser option from JS language (#20416) (Pixel998)
  • 2c3efb7 chore: remove category from type test fixtures (#20417) (Pixel998)
  • 36193fd chore: remove category from formatter test fixtures (#20418) (Pixel998)
  • e8d203b chore: add JSX language tag validation to check-rule-examples (#20414) (Pixel998)
  • bc465a1 chore: pin dependencies (#20397) (renovate[bot])
  • 703f0f5 test: replace deprecated rules in linter tests (#20406) (루밀LuMir)
  • ba71baa test: enable strict mode in type tests (#20398) (루밀LuMir)
  • f9c4968 refactor: remove lib/linter/rules.js (#20399) (Francesco Trotta)
  • 6f1c48e chore: updates for v9.39.2 release (Jenkins)
  • 54bf0a3 ci: create package manager test (#20392) (루밀LuMir)
  • 3115021 refactor: simplify JSDoc comment detection logic (#20360) (Pixel998)
  • 4345b17 chore: update @eslint-community/regexpp to 4.12.2 (#20366) (루밀LuMir)
  • 772c9ee chore: update dependency @eslint/eslintrc to ^3.3.3 (#20359) (renovate[bot])
  • 0b14059 chore: package.json update for @eslint/js release (Jenkins)
  • d6e7bf3 ci: bump actions/checkout from 5 to 6 (#20350) (dependabot[bot])
  • 139d456 chore: require mandatory headers in rule docs (#20347) (Milos Djermanovic)
  • 3b0289c chore: remove unused .eslintignore and test fixtures (#20316) (Pixel998)
  • a463e7b chore: update dependency js-yaml to v4 [security] (#20319) (renovate[bot])
  • ebfe905 chore: remove redundant rules from eslint-config-eslint (#20327) (Milos Djermanovic)
  • 88dfdb2 test: add regression tests for message placeholder interpolation (#20318) (fnx)
  • 6ed0f75 chore: skip type checking in eslint-config-eslint (#20323) (Francesco Trotta)
  • 1e2cad5 chore: package.json update for @eslint/js release (Jenkins)
  • 9da2679 chore: update @eslint/* dependencies (#20321) (Milos Djermanovic)
  • 0439794 refactor: use types from @eslint/core (#20235) (jaymarvelz)
  • cb51ec2 test: cleanup SourceCode#traverse tests (#20289) (Milos Djermanovic)
  • 897a347 chore: remove restriction for type in rule tests (#20305) (Pixel998)
  • d972098 chore: ignore prettier updates in renovate to keep in sync with trunk (#20304) (Pixel998)
  • a086359 chore: remove redundant fast-glob dev-dependency (#20301) (루밀LuMir)
  • 564b302 chore: install prettier as a dev dependency (#20302) (michael faith)
  • 8257b57 refactor: correct regex for eslint-plugin/report-message-format (#20300) (루밀LuMir)
  • e251671 refactor: extract assertions in RuleTester (#20135) (唯然)
  • 2e7f25e chore: add legacy-peer-deps to .npmrc (#20281) (Milos Djermanovic)
  • 39c638a chore: update eslint-config-eslint dependencies for v10 prereleases (#20278) (Milos Djermanovic)
  • 8533b3f chore: update dependency @eslint/json to ^0.14.0 (#20288) (renovate[bot])
  • 796ddf6 chore: update dependency @eslint/js to ^9.39.1 (#20285) (renovate[bot])

8 Things You Didn't Know About Code Mode


Agents fundamentally changed how we program. They enable developers to move faster by disintermediating the traditional development workflow. This means less time switching between specialized tools and fewer dependencies on other teams. Now that agents can execute complicated tasks, developers face a new challenge: using them effectively over long sessions.

The biggest challenge is context rot. Because agents have limited memory, a session that runs too long can cause them to "forget" earlier instructions. This leads to unreliable outputs, frustration, and subtle but grave mistakes in your codebase. One promising solution is Code Mode.

Instead of describing dozens of separate tools to an LLM, Code Mode allows an agent to write code that calls those tools programmatically, reducing the amount of context the model has to hold at once. While many developers first heard about Code Mode through Cloudflare's blog post, fewer understand how it works in practice.

I have been using Code Mode for a few months and recently ran a small experiment. I asked goose to fix its own bug where the Gemini model failed to process images in the CLI but worked in the desktop app, then open a PR. The fix involved analyzing model configuration, tracing image input handling through the pipeline, and validating behavior across repeated runs. I ran the same task twice: once with Code Mode enabled and once without it.

Here is what I learned from daily use and my experiment.

1. Code Mode is Not an MCP-Killer

In fact, it uses MCP under the hood. MCP is a standard that lets AI agents connect to external tools and data sources. When you install an MCP server in an agent, that MCP server exposes its capabilities as MCP tools. For example, goose's primary MCP server, the developer extension, exposes tools like shell (which lets goose run commands) and text_editor (which lets goose view and edit files).

Code Mode wraps your MCP tools as JavaScript modules, allowing the agent to combine multiple tool calls into a single step. Code Mode is a pattern for how agents interact with MCP tools more efficiently.

2. goose Supports Code Mode

Code Mode support landed in goose v1.17.0 in December 2025. It ships as a platform extension called "Code Execution" that you can enable in the desktop app or CLI.

To enable it:

  • Desktop app: Click the extensions icon and toggle on "Code Execution"
  • CLI: Run goose configure and enable the Code Execution extension

Since its initial implementation, we've added so many improvements!

3. Code Mode Keeps Your Context Window Clean

Every time you install an MCP server (or "extension" in the goose ecosystem), it adds a significant amount of data to your agent's memory. Every tool comes with a tool definition describing what the tool does, the parameters it accepts, and what it returns. This helps the agent understand how to use the tool.

These definitions consume space in your agent's context window. For example, if a single definition takes 500 tokens and an extension has five tools, that is 2,500 tokens gone before you even start. If you use multiple extensions, you could easily double or even decuple that number.

Without Code Mode, your context window could look like this:

[System prompt: ~1,000 tokens]
[Tool: developer__shell - 500 tokens]
[Tool: developer__text_editor - 600 tokens]
[Tool: developer__analyze - 400 tokens]
[Tool: slack__send_message - 450 tokens]
[Tool: slack__list_channels - 400 tokens]
[Tool: googledrive__search - 500 tokens]
[Tool: googledrive__download - 450 tokens]
... and so on for every tool in every extension

As your session progresses, useful context gets crowded out by tool definitions you aren't even using: the code you are discussing, the problem you are solving, or the instructions you previously gave. This leads to performance degradation and memory loss. While I used to recommend disabling unused MCP servers, Code Mode offers a better fix. It uses three tools that help the agent discover what tools it needs on demand rather than having every tool definition loaded upfront:

  1. search_modules - Find available extensions
  2. read_module - Learn what tools an extension offers
  3. execute_code - Run JavaScript that uses those tools

I wanted to see how true this was, so I ran an experiment: I had goose solve a user's bug and put up a PR with and without Code Mode. Code Mode used 30% fewer tokens for the same task.

Metric         With Code Mode   Without Code Mode
Total tokens   23,339           33,648
Input tokens   23,128           33,560

4. Code Mode Batches Operations Into a Single Tool Call

The token savings do not just come from loading fewer tool definitions upfront. Code Mode also handles the "active" side of the conversation through a method called batching.

When you ask an agent to do something, it typically breaks your request into individual steps, each requiring a separate tool call. You can see these calls appear in your chat as the agent executes the tasks. For example, if you ask goose to "check the current branch, show me the diff, and run the tests," it might run four individual commands:

▶ developer__shell → git branch --show-current

▶ developer__shell → git status

▶ developer__shell → git diff

▶ developer__shell → cargo test

Each of these calls adds a new layer to the conversation history that goose has to track. Batching combines these into a single execution. When you turn Code Mode on and give that same prompt, you will see just one tool call:

▶ Code Execution: Execute Code
generating...

Inside that one execution, it batches all the commands into a script:

import { shell } from "developer";

const branch = shell({ command: "git branch --show-current" });
const status = shell({ command: "git status" });
const diff = shell({ command: "git diff" });
const tests = shell({ command: "cargo test" });

As a user, you see the same results, but the agent only has to remember one interaction instead of four. By reducing these round trips, Code Mode keeps the conversation history concise so the agent can maintain focus on the task at hand.

5. Code Mode Makes Smarter Tool Choices

When an agent has access to dozens of tools, it sometimes makes a "logical" choice that is technically wrong for your environment. This happens because, in a standard setup, the agent picks tools from a flat list based on short text descriptions. This can lead to a massive waste of time and tokens when the agent picks a tool that sounds right but lacks the necessary context.

I saw this firsthand during my experiments. I had an extension enabled called agent-task-queue, which is designed to run background tasks with timeouts.

When I asked goose to run the tests for my PR, it looked at the available tools and saw agent-task-queue. The LLM reasoned that a test suite is a "long-running task," making that extension a perfect fit. It chose the specialized tool over the generic shell.

However, the tool call failed immediately:

FAILED exit=127 0.0s
/bin/sh: cargo: command not found

My environment was not configured to use that specific extension for my toolchain. goose made a reasonable choice based on the description, but it was the wrong tool for my actual setup.

In the Code Mode session, this mistake never happened. Code Mode changes how the agent interacts with its capabilities by requiring explicit import statements.

Instead of browsing a menu of names, goose had to be intentional about which module it was using. It chose to import from the developer module:

import { shell } from "developer";

const test = shell({ command: "cargo test -p goose --lib formats::google" });

By explicitly importing developer, Code Mode ensured the tests ran in my actual shell environment.

6. Code Mode Is Portable Across Editors

goose is more than an agent; it's also an ACP (Agent Client Protocol) server. This means you can connect it to any editor that supports ACP, like Zed or Neovim. Plus, any MCP server you use in goose will work there, too.

I wanted to try this myself, so I set up Neovim to connect to goose with Code Mode enabled. Here's the configuration I used:

{
  "yetone/avante.nvim",
  build = "make",
  event = "VeryLazy",
  opts = {
    provider = "goose",
    acp_providers = {
      ["goose"] = {
        command = "goose",
        args = { "acp", "--with-builtin", "code_execution,developer" },
      },
    },
  },
  dependencies = {
    "nvim-lua/plenary.nvim",
    "MunifTanjim/nui.nvim",
  },
}

The key line is the one where I enable Code Mode right inside the editor config:

args = { "acp", "--with-builtin", "code_execution,developer" },

To test it, I asked goose to list my Rust files and count the lines of code. Instead of a long stream of individual shell commands cluttering my Neovim buffer, I saw a single tool call: Code Execution. It worked exactly like it does in the desktop app. This portability means you can build a powerful, efficient agent workflow and take it with you to whatever environment you're most comfortable in.

Neovim with Code Mode enabled

7. Code Mode Performs Differently Across LLMs

I ran my experiments using Claude Opus 4.5. Your results may vary depending on which model you use.

Code Mode requires the LLM to do things that not all models do equally well:

  • Write valid JavaScript - The model has to generate syntactically correct code. Models with stronger code generation capabilities will produce fewer errors.
  • Follow the import pattern - Code Mode expects the LLM to import tools from modules, as in import { shell } from "developer". Some models might try to call tools directly without importing, which will fail.
  • Use the discovery tools - Before writing code, the LLM should call search_modules and read_module to learn what tools are available. Some models skip this step and guess, leading to hallucinated tool names.
  • Handle errors gracefully - When a code execution fails, the model needs to read the error, understand what went wrong, and try again. Some models are better at this feedback loop than others.

If Code Mode is not working well for you, try switching models. A model that excels at code generation and instruction following will generally perform better with Code Mode than one optimized for other tasks.

8. Code Mode Is Not for Every Task

Code Mode adds overhead. Before executing anything, the LLM has to:

  1. Call search_modules to find available extensions
  2. Call read_module to learn what tools an extension offers
  3. Write JavaScript code
  4. Call execute_code to run it

For simple, single-tool tasks, this overhead is not worth it. If you just need to run one shell command or view one file, regular tool calling is faster.

Based on my experiments, here is when Code Mode makes sense:

Use Code Mode when:

  • You have multiple extensions enabled
  • Your task involves multi-step orchestration
  • You want longer sessions without context rot
  • You are working across multiple editors

Skip Code Mode when:

  • You only have 1-2 extensions
  • Your task is a single tool call
  • Speed matters more than context longevity
  • You are doing a quick one-off task

Try It Out

If you want to experiment with Code Mode, here are some resources:

Community:

  • Join our Discord to share what you learn
  • File issues on GitHub if something does not work as expected

Run your own experiments and let us know what you find.


Salesforce Shelves Heroku

1 Share
Salesforce is essentially shutting down Heroku as an evolving product, moving the cloud platform that helped define modern app deployment to a "sustaining engineering model" focused entirely on stability, security and support. Existing customers on credit card billing see no changes to pricing or service, but enterprise contracts are no longer available to new buyers. Salesforce said it is redirecting engineering investment toward enterprise AI.

Read more of this story at Slashdot.


RFK Jr. Has Packed an Autism Panel With Cranks and Conspiracy Theorists

1 Share
Among those Robert F. Kennedy Jr. recently named to a federal autism committee are people who tout dangerous treatments and say vaccine manufacturers are “poisoning children.”

What’s New in vcpkg (Nov 2025 – Jan 2026)

1 Share

This blog post summarizes changes to the vcpkg package manager as part of the 2025.12.12 and 2026.01.16 registry releases and the 2025-11-13, 2025-11-18, 2025-11-19, 2025-12-05, and 2025-12-16 tool releases. These updates include support for targeting the Xbox GDK October 2025 update, removing a misleading and outdated output message, and other minor improvements and bug fixes.

Some stats for this period:

  • There are now 2,750 total ports available in the vcpkg curated registry. A port is a versioned recipe for building a package from source, such as a C or C++ library.
  • 82 new ports were added to the curated registry.
  • 504 ports were updated by December and 584 ports were updated in January. As always, we validate each change to a port by building all other ports that depend on, or are depended on by, the library being updated, across our 15 main triplets.
  • 182 community contributors made commits.
  • The main vcpkg repo has over 7,300 forks and 26,600 stars on GitHub.

vcpkg changelog (2025.12.12, 2026.01.16 releases)

  • Removed an outdated output message after running vcpkg upgrade that could mislead users (PR: Microsoft/vcpkg-tool#1802).
  • Updated vcpkg to understand new layout structure and environment variables for targeting Xbox as of the October 2025 Microsoft GDK update. (PRs: Microsoft/vcpkg-tool#1834, thanks @walbourn!).
    • GameDKLatest is associated with the ‘old’ layouts and only exists when they are optionally installed by the October 2025 GDK or by earlier GDKs. The October 2024 GDK and later are still in service.
    • GameDKXboxLatest is associated with the ‘new’ layouts which are always present for October 2025 or later.
  • Other minor improvements and bug fixes.

Total ports available for tested triplets

Triplet                   Ports available
x86-windows               2549
x64-windows               2678
x64-windows-release       2678
x64-windows-static        2557
x64-windows-static-md     2614
x64-uwp                   1506
arm64-windows             2304
arm64-windows-static-md   2290
arm64-uwp                 1475
arm64-osx                 2484
x64-linux                 2688
arm-neon-android          2106
x64-android               2167
arm64-android             2134

While vcpkg supports a much larger variety of target platforms and architectures (as community triplets), the list above is validated exhaustively to ensure updated ports don’t break other ports in the catalog.

Thank you to our contributors

vcpkg couldn’t be where it is today without contributions from our open-source community. Thank you for your continued support! The following people contributed to the vcpkg, vcpkg-tool, or vcpkg-docs repos in this release (listed by commit author or GitHub username):

a-alomran Christopher Lee jreichel-nvidia Richard Powell
Aaron van Geffen Chuck Walbourn Kadir Rimas Misevičius
Aditya Rao Colden Cullen Kai Blaschke RobbertProost
Adrien Bourdeaux Connor Broyles Kai Pastor Rok Mandeljc
Ajadaz CQ_Undefine Kaito Udagawa RPeschke
Alan Jowett Craig Edwards kedixa Saikari
Alan Tse Crindzebra Sjimo Kevin Ring Scranoid
albertony cuihairu Kiran Chanda Sean Farrell
Aleks Tuchkov Dalton Messmer Kyle Benesch Seth Flynn
Aleksandr Orefkov Daniel Collins kzhdev shixiong2333
Aleksi Sapon David Fiedler Laurent Rineau Silvio Traversaro
Alex Emirov deadlightreal LE GARREC Vincent Simone Gasparini
Alexander Neumann Dennis lemourin Sina Behmanesh
Alexis La Goutte Dr. Patrick Urbanke lithrad Stephen Webb
Alexis Placet Dzmitry Baryshau llm96 Steven
Allan Hanan eao197 Lukas Berbuer SunBlack
Anders Wind Egor Tyuvaev Lukas Schwerdtfeger Sylvain Doremus
Andre Nguyen Ethan J. Musser Marcel Koch Szabolcs Horvát
Andrew Kaster Eviral Martin Moene Takatoshi Kondo
Andrew Tribick Fidel Yin Matheus Gomes talregev
Ankur Verma freshthinking matlabbe Theodore Tsirpanis
Argentoz Fyodor Krasnov Matthias Kuhn Thomas Arcila
Attila Kovacs galabovaa Michael Hansen Thomas1664
autoantwort GioGio Michele Caini TLescoatTFX
ayeteadoe Giuseppe Roberti Mikhail Titov Tobias Markus
Ayush Acharjya Glyn Matthews miyan Toby
Barak Shoshany Gordon Smith miyanyan toge
Benno Waldhauer hehanjing Morcules Tom Conder
Bernard Teo Hiroaki Yutani myd7349 Tom M.
Bertin Balouki SIMYELI Hoshi Mzying2001 Tom Tan
bjovanovic84 huangqinjin Nick D’Ademo Tommy-Xavier Robillard
blavallee i-curve Nikita UlrichBerndBecker
bwedding Igor Kostenko Osyotr Vallabh Mahajan
Byoungchan Lee ihsan demir PARK DongHa Vincent Le Garrec
Cappecasper03 Ioannis Makris pastdue Vitalii Koshura
Carson Radtke Ivan Maidanski Pasukhin Dmitry Vladimir Shaleev
cDc Jaap Aarts Patrick Colis Waldemar Kornewald
Charles Cabergs JacobBarthelmeh Paul Lemire Wentsing Nee
Charles Dang James Grant Pavel Kisliak wentywenty
Charles Karney Janek Bevendorff Pedro López-Cabanillas xavier2k6
chausner Jeremy Dumais Raul Metsma ycdev1
chenjunfu2 Jesper Stemann Andersen RealChuan Yunze Xu
Chris Birkhold Jinwoo Sung RealTimeChris Yury Bura
Chris Leishman JoergAtGithub Rémy Tassoux zuhair-naqvi
Chris Sarbora John Wason Riccardo Ressi
Chris W Jonatan Nevo Richard Barnes

Learn more

You can find the main release notes on GitHub. Recent updates to the vcpkg tool can be viewed on the vcpkg-tool Releases page. To contribute to vcpkg documentation, visit the vcpkg-docs repo. If you’re new to vcpkg or curious about how a package manager can make your life easier as a C/C++ developer, check out the vcpkg website – vcpkg.io.

If you would like to contribute to vcpkg and its library catalog, or want to give us feedback on anything, check out our GitHub repo. Please report bugs or request updates to ports in our issue tracker or join more general discussion in our discussion forum.

The post What’s New in vcpkg (Nov 2025 – Jan 2026) appeared first on C++ Team Blog.


Building an AI Skills Executor in .NET: Bringing Anthropic’s Agent Pattern to the Microsoft Ecosystem

1 Share

When Anthropic released their Agent Skills framework, they published a blueprint for how enterprise organizations should structure AI agent capabilities. The pattern is straightforward: package procedural knowledge into composable skills that AI agents can discover and apply contextually. Microsoft, OpenAI, Cursor, and others have already adopted the standard, making skills portable across the AI ecosystem.

But here’s the challenge for .NET shops: most implementations assume Python or TypeScript. If your organization runs on the Microsoft stack, you need an implementation that speaks C#.

This article walks through building a proof-of-concept AI Skills Executor in .NET 10 that combines Azure AI Foundry for LLM capabilities with the official MCP C# SDK for tool execution. I want to be upfront about something: what I’m showing here is a starting point, not a production-ready framework. The goal is to demonstrate the pattern and the key integration points so you can evaluate whether this approach makes sense for your organization, and then build something more robust on top of it.

The complete working code is available on GitHub if you want to follow along.

The Scenario: Why You’d Build This

Before we get into any code, I want to ground this in a real problem. Otherwise, every pattern looks like a solution searching for a question.

Imagine you’re the engineering lead at a mid-size financial services firm. Your team manages about forty .NET microservices across Azure. You’ve got a mature CI/CD pipeline, established coding standards, and a healthy backlog of technical debt that nobody has time to address. Sound familiar?

Your developers are already using AI assistants like GitHub Copilot and Claude to write code faster. That’s great. But you keep running into the same frustrations. A junior developer asks the AI to set up a new microservice, and it generates a project structure that doesn’t match your organization’s conventions. A senior developer crafts a detailed prompt for your specific deployment pipeline, shares it in a Slack thread, and within a month there are fifteen variations floating around with no way to standardize or improve any of them. Your architecture review board has patterns they want enforced, but those patterns live in a Confluence wiki that no AI assistant knows about.

This is the problem skills solve. Instead of every developer independently teaching their AI assistant how your organization works, you encode that knowledge once into a skill. A “New Service Scaffolding” skill that knows your project structure, your required NuGet packages, your logging conventions, and your deployment configuration. A “Code Review” skill that checks against your actual standards, not generic best practices. A “Tech Debt Assessment” skill that can scan a repo and produce a prioritized report using your team’s severity criteria.

The Skills Executor is the engine that makes these skills operational. It loads the right skill, connects it to an LLM via Azure AI Foundry, gives the LLM access to tools through MCP servers, and runs the agentic loop until the job is done. Keep this financial services scenario in mind as we walk through the architecture. Every component maps back to making this kind of organizational knowledge usable.

Where Azure AI Foundry Fits

If you’ve been tracking Microsoft’s AI platform evolution, you know that Azure AI Foundry (recently rebranded as Microsoft Foundry) has become the unified control plane for enterprise AI. The reason it matters for a skills executor is that it gives you a single endpoint for model access, agent management, evaluation, and observability, all under one roof with enterprise-grade security.

For this project, Foundry provides two things we need. First, it’s the gateway to Azure OpenAI models with function calling support, which is what drives the agentic loop at the core of the executor. You deploy a model like GPT-4.1 to your Foundry project, and the executor calls it through the Azure OpenAI SDK using your Foundry endpoint. Second, as you mature beyond this proof of concept, Foundry gives you built-in evaluation, red teaming, and monitoring capabilities that you’d otherwise have to build from scratch. That path from prototype to production is a lot shorter when your orchestration layer already speaks Foundry’s language.

The Azure AI Foundry .NET SDK (currently at version 1.2.0-beta.1) provides the Azure.AI.Projects client library for connecting to a Foundry project endpoint. In our executor, we use the Azure.AI.OpenAI package to interact with models deployed through Foundry, which means the integration is mostly about pointing your OpenAI client at your Foundry-provisioned endpoint instead of a standalone Azure OpenAI resource.

Understanding the Architecture

The Skills Executor has four cooperating components. The Skill Loader discovers and parses SKILL.md files from a configured directory, pulling metadata from YAML frontmatter and instructions from the markdown body. The Azure OpenAI Service handles all LLM interactions through your Foundry-provisioned endpoint, including chat completions with function calling. The MCP Client Service connects to one or more MCP servers, discovers their available tools, and routes execution requests. And the Skill Executor itself orchestrates the agentic loop: taking user input, managing the conversation with the LLM, executing tool calls when requested, and returning final responses.

Skills Executor Architecture DiagramFigure: The Skills Executor architecture showing how skills, Azure OpenAI (via Foundry), and MCP servers work together.

The important design decision here is that the orchestrator contains zero business logic about when to use specific tools. It provides the LLM with available tools, executes whatever the LLM requests, feeds results back, and repeats until the LLM produces a final response. All the intelligence comes from the skill’s instructions guiding the LLM’s decisions. This is what makes the pattern composable. Swap the skill, and the same executor does completely different work.

Setting Up the Project

The solution uses three projects targeting .NET 10 (the current LTS release):

dotnet new sln -n SkillsQuickstart
dotnet new classlib -n SkillsCore -f net10.0
dotnet new console -n SkillsQuickstart -f net10.0
dotnet new console -n SkillsMcpServer -f net10.0

dotnet sln add SkillsCore
dotnet sln add SkillsQuickstart
dotnet sln add SkillsMcpServer

The key NuGet packages for the orchestrator are Azure.AI.OpenAI for LLM interactions through your Foundry endpoint and ModelContextProtocol --prerelease for MCP client/server capabilities. The MCP C# SDK is maintained by Microsoft in partnership with Anthropic and is currently working toward its 1.0 stable release, so the prerelease flag is still needed. For the MCP server project, you just need the ModelContextProtocol and Microsoft.Extensions.Hosting packages.

Skills as Markdown Files

A skill is a folder containing a SKILL.md file with YAML frontmatter for metadata and markdown body for instructions. Think back to our financial services scenario. Here’s what a tech debt assessment skill might look like:

---
name: Tech Debt Assessor
description: Scans codebases and produces prioritized tech debt reports.
version: 1.0.0
author: Platform Engineering
category: quality
tags:
  - tech-debt
  - analysis
  - reporting
---

# Tech Debt Assessor

You are a technical debt analyst for a .NET microservices environment.
Your job is to scan a codebase and produce a prioritized assessment.

## Severity Framework

- **Critical**: Security vulnerabilities, deprecated APIs with known exploits
- **High**: Missing test coverage on business-critical paths, outdated packages
  with available patches
- **Medium**: Code style violations, TODO/FIXME accumulation, copy-paste patterns
- **Low**: Documentation gaps, naming convention inconsistencies

## Workflow

1. Use analyze_directory to understand the project structure
2. Use count_lines to gauge project scale by language
3. Use find_patterns to locate TODO, FIXME, HACK, and BUG markers
4. Synthesize findings into a report organized by severity

**ALWAYS use tools to gather real data. Do not guess about the codebase.**

Notice how the skill encodes your organization’s specific severity framework. A generic AI assistant would apply some default notion of tech debt priority. This skill applies yours. And because it’s just a markdown file in a git repo, your architecture review board can review changes to it the same way they review code.

The Skill Loader parses these files by splitting the YAML frontmatter from the markdown body. The implementation uses YamlDotNet for deserialization and a simple string-splitting approach for frontmatter extraction. I won’t paste the full loader code here since it’s fairly standard file I/O and YAML parsing. You can see the complete implementation in the GitHub repository, but the core idea is that each SKILL.md file becomes a SkillDefinition object with metadata properties (name, description, tags) and an Instructions property containing the full markdown body.

Connecting to MCP Servers

The MCP Client Service manages connections to MCP servers, discovers their tools, and routes execution requests. The core flow is: connect to each configured server using StdioClientTransport, call ListToolsAsync() to discover available tools, and maintain a lookup dictionary mapping tool names to the client that owns them.

When the executor needs to call a tool, it looks up the tool name in the dictionary and routes the call to the right MCP server via CallToolAsync(). This means you can have multiple MCP servers, each with different tools. A custom server with your internal tools, a GitHub MCP server for repository operations, a filesystem server for file access. The executor doesn’t care where a tool lives.

Server configuration lives in appsettings.json:

{
  "McpServers": {
    "Servers": [
      {
        "Name": "skills-mcp-server",
        "Command": "dotnet",
        "Arguments": ["run", "--project", "../SkillsMcpServer"],
        "Enabled": true
      },
      {
        "Name": "github-mcp-server",
        "Command": "npx",
        "Arguments": ["-y", "@modelcontextprotocol/server-github"],
        "Environment": {
          "GITHUB_PERSONAL_ACCESS_TOKEN": ""
        },
        "Enabled": true
      }
    ]
  }
}

The empty GITHUB_PERSONAL_ACCESS_TOKEN is intentional. The service resolves empty environment values from .NET User Secrets at runtime, keeping sensitive tokens out of source control.

The Agentic Loop

This is the core of the executor, and it’s where the pattern earns its keep. The agentic loop is the conversation cycle between the user, the LLM, and the available tools. Here’s the essential logic, stripped of error handling and logging:

public async Task<SkillResult> ExecuteAsync(SkillDefinition skill, string userInput)
{
    var messages = new List<ChatMessage>
    {
        new SystemChatMessage(skill.Instructions ?? "You are a helpful assistant."),
        new UserChatMessage(userInput)
    };

    var tools = BuildToolDefinitions(_mcpClient.GetAllTools());
    var iterations = 0;

    while (iterations++ < MaxIterations)
    {
        var response = await _openAI.GetCompletionAsync(messages, tools);
        messages.Add(new AssistantChatMessage(response));

        var toolCalls = response.ToolCalls;
        if (toolCalls == null || toolCalls.Count == 0)
        {
            // No tool calls means the LLM has produced its final answer
            return new SkillResult
            {
                Response = response.Content.FirstOrDefault()?.Text ?? "",
                ToolCallCount = iterations - 1
            };
        }

        // Execute each requested tool and feed results back
        foreach (var toolCall in toolCalls)
        {
            var args = JsonSerializer.Deserialize<Dictionary<string, object?>>(
                toolCall.FunctionArguments);
            var result = await _mcpClient.ExecuteToolAsync(toolCall.FunctionName, args);
            messages.Add(new ToolChatMessage(toolCall.Id, result));
        }
    }

    throw new InvalidOperationException("Max iterations exceeded");
}

The loop keeps going until the LLM responds without requesting any tool calls (meaning it’s done) or until a safety limit on iterations is reached. Each tool result gets added to the conversation history, so the LLM has full context of what it’s discovered.

Let’s trace through our financial services scenario. A developer selects the Tech Debt Assessor skill and asks “Assess the tech debt in our OrderService at C:\repos\order-service.” The executor loads the skill’s instructions as the system prompt, sends the request to Azure OpenAI through Foundry with the available MCP tools, and the LLM (guided by the skill’s workflow) starts calling tools. First analyze_directory to understand the project structure, then count_lines for scale metrics, then find_patterns to locate debt markers. After each tool call, the results come back into the conversation, and the LLM decides what to do next. Eventually, it synthesizes everything into a severity-prioritized report using your organization’s framework.

The BuildToolDefinitions method bridges MCP and Azure OpenAI by converting MCP tool schemas into ChatTool function definitions. It’s a one-liner per tool using ChatTool.CreateFunctionTool(), mapping the tool’s name, description, and JSON schema.

Building Custom MCP Tools

The MCP C# SDK makes exposing custom tools simple. You create a class with methods decorated with the [McpServerTool] attribute, and the SDK handles discovery and protocol communication:

[McpServerToolType]
public static class ProjectAnalysisTools
{
    [McpServerTool, Description("Analyzes a directory structure and returns a tree view")]
    public static string AnalyzeDirectory(
        [Description("Path to the directory to analyze")] string path,
        [Description("Maximum depth to traverse")] int maxDepth = 3)
    {
        // Walk the directory tree, return a formatted string representation
        // Full implementation in the GitHub repo
    }

    [McpServerTool, Description("Counts lines of code by file extension")]
    public static string CountLines(
        [Description("Path to the directory to analyze")] string path,
        [Description("File extensions to include (e.g., .cs,.js)")] string? extensions = null)
    {
        // Enumerate files, count lines per extension, return summary
    }

    [McpServerTool, Description("Finds TODO, FIXME, and HACK comments in code")]
    public static string FindPatterns(
        [Description("Path to the directory to search")] string path)
    {
        // Scan files for debt markers, return locations and context
    }
}

The server’s Program.cs is minimal. Five lines to register the MCP server with stdio transport and auto-discover tools from the assembly:

var builder = Host.CreateApplicationBuilder(args);

builder.Services
    .AddMcpServer()
    .WithStdioServerTransport()
    .WithToolsFromAssembly();

await builder.Build().RunAsync();

When the Skills Executor starts your MCP server, the SDK automatically discovers all [McpServerTool] methods and exposes them through the protocol. Any MCP-compatible client can use these tools, not just your executor. That’s the portability of the standard at work.

Three Skills, Three Patterns

The architecture supports different tool-usage patterns depending on what the skill needs. Back to our financial services firm:

The Code Explainer skill uses no tools at all. A developer pastes in a complex LINQ query from the legacy monolith, and the skill relies entirely on the LLM’s reasoning to explain what it does. No tool calls needed. The skill instructions just tell the LLM to start with a high-level summary, walk through step by step, and flag any design decisions worth discussing.

The Tech Debt Assessor from our earlier example uses custom MCP tools. It can’t just reason about a codebase in the abstract. It needs to actually inspect the file structure, count lines, and find patterns. The skill instructions lay out a specific workflow and explicitly tell the LLM to always use tools rather than guessing.

The GitHub Assistant skill uses the external GitHub MCP server. When a developer asks “What open issues are tagged as P0 in the order-service repo?”, the skill maps that to the GitHub MCP server’s list_issues tool. The skill instructions explain which tools are available and how to translate user requests into tool calls.

The key thing to notice: the executor code is identical across all three cases. The only thing that changes is the SKILL.md file. That’s the whole point. Swap the skill, swap the behavior.

What This Architecture Gives You

I keep coming back to the question of “why.” Why build a custom executor when you could just use Claude or Copilot directly? Three reasons stand out for enterprise teams.

Standardization without rigidity. Skills let you standardize how AI performs common tasks without hardcoding business logic into application code. When your code review standards change, you update the SKILL.md file, not the orchestrator. Domain experts can write skills without understanding the execution infrastructure. Platform teams can enhance the executor without touching individual skills.

Tool reusability across contexts. MCP servers expose tools that any skill can use. The project analysis tools work whether invoked by a tech debt assessor, a documentation generator, or a migration planner. You build the tools once and compose them differently through skills.

Ecosystem portability. Because skills follow Anthropic’s open standard, they work in VS Code, GitHub Copilot, Claude, and any other tool that supports the format. Skills you create for this executor also work in those environments. Your investment compounds across your development toolchain rather than getting locked into one vendor.

What This Doesn’t Give You (Yet)

I want to be honest about the gaps, because shipping something like this to production would require more work.

There’s no authentication or authorization layer. In a real deployment, you’d want to control which users can access which skills and which tools. There’s no retry logic or circuit-breaking on MCP server connections. The error handling is minimal. There’s no telemetry or observability beyond basic console output, though Azure AI Foundry’s built-in monitoring would help close that gap as you mature the solution. There’s no skill chaining (one skill invoking another), no versioning strategy for skill updates, and no caching of skill metadata.

Think of this as the architectural proof that the pattern works in .NET. The production hardening is a separate effort, and it’ll look different depending on your organization’s requirements.

Where to Go From Here

If the pattern resonates, here’s how I’d suggest approaching it. Start by identifying two or three repetitive tasks your team does that involve organizational knowledge an AI assistant wouldn’t have on its own. Write those as SKILL.md files. Get the executor running locally and test whether the skills actually produce useful output with your real codebases and workflows.

From there, the natural extension points are: skill chaining to allow complex multi-step workflows, a centralized skill registry so teams across your organization can share and discover skills, and observability hooks that feed into Azure AI Foundry’s evaluation and monitoring capabilities. If you’re running Foundry Agents, there’s also a path to wrapping the executor as a Foundry agent that can be managed through the Foundry portal.

The real value isn’t in the executor code itself. It’s in the skills your organization creates. Every procedure, standard, and piece of institutional knowledge that you encode into a skill is one less thing that lives only in someone’s head or a Slack thread.

Get the Code

The complete implementation is available on GitHub at github.com/MCKRUZ/DotNetSkills. Clone the repository to explore the full source, run the examples, or use it as a foundation for building your own skills executor. All the plumbing code I skipped in this article (the full Skill Loader, the MCP Client Service, the Azure OpenAI Service wrapper) is there and commented.

References

The post Building an AI Skills Executor in .NET: Bringing Anthropic’s Agent Pattern to the Microsoft Ecosystem appeared first on Microsoft Foundry Blog.
