Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

NuGet Package Metadata Best Practices: README, Icon, Tags, and License


Learn NuGet package metadata best practices. Configure README files, icons, license expressions, tags, and repository URLs to make your .NET package shine on NuGet.org.

Read the whole story
alvinashcraft
just a second ago
reply
Pennsylvania, USA
Share this story
Delete

Run the Right AI Model for the Right Copilot Task — No Cloud Credits Wasted


⚠ This blog post was created with the help of AI tools. Yes, I used a bit of magic from language models to organize my thoughts and automate the boring parts, but the geeky fun and the 🤖 in C# are 100% mine.

Hi!

Part 1 of the CopilotHarness series


Hero: local AI running on your machine

Big models are great for heavy thinking. But what about simple questions — “rename this variable”, “write a short docstring”, “what does this function do”? Those don’t need GPT-5 across the internet. They can be answered instantly by a model running on your own machine, offline, for free.

This post shows you how to wire GitHub Copilot in VS Code to local models using three minimal proxies — one for Ollama, one for Foundry Local, one for Azure OpenAI — and how to run all three together with a single command.


1. How BYOK Works in VS Code

VS Code Copilot supports a Bring Your Own Model mechanism. You register any OpenAI-compatible endpoint in a config file, and Copilot treats it as just another model in the picker — no extension, no plugin.

Official docs: Bring your own model to GitHub Copilot Chat

The config lives in a file called chatLanguageModels.json in your VS Code user folder:

{
  "providers": [
    {
      "name": "Ollama (local)",
      "vendor": "customendpoint",
      "url": "http://localhost:5099/v1",
      "modelId": "llama3.1:8b",
      "chatModelId": "copilot-chat-model"
    }
  ]
}
That's it: point url at any OpenAI-compatible endpoint, give it a name, and it appears in Copilot's model picker.

BYOK flow: VS Code → config → proxy → model

2. The Three Proxy Flavors

A local model provider isn’t always OpenAI-compatible out of the box. Each proxy in this repo acts as a thin translation layer — it speaks OpenAI on one side and the local backend on the other.

Proxy overview: three proxies connecting VS Code to different backends
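
Proxy             | Port | Backend                          | Best for
------------------|------|----------------------------------|---------------------------------------------
OllamaProxy       | 5099 | Ollama (local)                   | Quickest start, huge model catalog
FoundryLocalProxy | 5101 | Foundry Local SDK (offline, NPU) | Offline/air-gapped, hardware acceleration
FoundryProxy      | 5100 | Azure OpenAI / Foundry cloud     | Production-grade, secret-managed cloud models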

Each proxy is a single ASP.NET Core Minimal API file — no frameworks, no abstractions. The entire proxy fits on one screen. That’s intentional: these are teaching samples, not production middleware.
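
To make "thin translation layer" concrete, here is a minimal sketch of the two mappings such a proxy performs. It is written in Python for brevity and is not the repo's C# code. It assumes the backend is Ollama's native /api/chat endpoint, and the function names openai_to_ollama and ollama_to_openai are hypothetical.

```python
def openai_to_ollama(request: dict) -> dict:
    """Map an OpenAI-style chat completion request to an Ollama /api/chat body."""
    return {
        "model": request["model"],
        # Keep only the fields Ollama needs from each message.
        "messages": [
            {"role": m["role"], "content": m["content"]}
            for m in request["messages"]
        ],
        "stream": bool(request.get("stream", False)),
    }


def ollama_to_openai(response: dict) -> dict:
    """Map a non-streaming Ollama /api/chat response to an OpenAI-style body."""
    prompt_tokens = response.get("prompt_eval_count", 0)
    completion_tokens = response.get("eval_count", 0)
    return {
        "object": "chat.completion",
        "model": response["model"],
        "choices": [
            {
                "index": 0,
                "message": {
                    "role": response["message"]["role"],
                    "content": response["message"]["content"],
                },
                "finish_reason": "stop",
            }
        ],
        "usage": {
            "prompt_tokens": prompt_tokens,
            "completion_tokens": completion_tokens,
            "total_tokens": prompt_tokens + completion_tokens,
        },
    }
```

Everything else a proxy does (HTTP hosting, streaming, logging) sits around these two pure functions, which is roughly why the whole thing can fit on one screen.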


3. The Shared Secret: Unwrapping Copilot’s Envelope

Here’s something most developers don’t know: when GitHub Copilot Chat sends “hi”, it doesn’t send just "hi". It sends this:

<attachments>...file contents...</attachments>
<context>...editor state...</context>
<reminderInstructions>...workspace instructions...</reminderInstructions>
<userRequest>hi</userRequest>

The actual message is buried inside <userRequest>. A proxy that naively reads the last user message sees ~3 KB of boilerplate instead of the word "hi".

All three proxies share one class — CopilotMessageExtractor — that unwraps this envelope. It lives in the shared Proxies.Common library:

// The key method — finds <userRequest>...</userRequest> and returns its content.
// Falls back gracefully for non-Copilot clients (curl, SDK) that send plain text.
public static string ExtractTypedUserMessage(string rawUserMessage)
{
    // Look for <userRequest> first (VS Code Copilot always uses this)
    var userRequest = ExtractTagContent(rawUserMessage, "userRequest")
                   ?? ExtractTagContent(rawUserMessage, "user-request");

    if (!string.IsNullOrWhiteSpace(userRequest))
        return userRequest.Trim();

    // No tag — strip all known wrapper blocks and return what's left
    var stripped = rawUserMessage;
    foreach (var tag in CopilotWrapperTags)
        stripped = RemoveTagBlock(stripped, tag);

    // If stripping removed everything, fall back to the raw message
    // (this path is hit for plain curl/SDK clients — they send plain text)
    return string.IsNullOrWhiteSpace(stripped.Trim())
        ? rawUserMessage.Trim()
        : stripped.Trim();
}

This class is why the proxies work correctly with both Copilot Chat and direct API calls. The logging in each proxy shows the real ask — not the 3 KB envelope.
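
The method above calls two helpers, ExtractTagContent and RemoveTagBlock, that the post does not show. The following is a rough Python sketch of how the whole flow could work, not the repo's C# implementation. The regex-based matching is an assumption, and the wrapper tag list is taken only from the envelope example above, so the real CopilotWrapperTags may contain more entries.

```python
import re

# Assumed wrapper tags, copied from the envelope example; the real list may differ.
COPILOT_WRAPPER_TAGS = ("attachments", "context", "reminderInstructions")


def extract_tag_content(text, tag):
    """Return the text inside the first <tag>...</tag> pair, or None if absent."""
    name = re.escape(tag)
    match = re.search(rf"<{name}>(.*?)</{name}>", text, re.DOTALL)
    return match.group(1) if match else None


def remove_tag_block(text, tag):
    """Delete every <tag>...</tag> block, including the tags themselves."""
    name = re.escape(tag)
    return re.sub(rf"<{name}>.*?</{name}>", "", text, flags=re.DOTALL)


def extract_typed_user_message(raw):
    """Mirror the C# method: prefer <userRequest>, else strip known wrappers."""
    user_request = extract_tag_content(raw, "userRequest")
    if user_request is None:
        user_request = extract_tag_content(raw, "user-request")
    if user_request is not None and user_request.strip():
        return user_request.strip()

    stripped = raw
    for tag in COPILOT_WRAPPER_TAGS:
        stripped = remove_tag_block(stripped, tag)

    # If stripping removed everything, fall back to the raw message.
    return stripped.strip() or raw.strip()
```

With this sketch, the four-line envelope shown earlier reduces to "hi", while a plain curl-style message such as "rename this variable" passes through unchanged.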


4. OllamaProxy — 5 Minutes to Your First Local Model

Pre-requisite: Ollama running with at least one model pulled.

# Pull a model
ollama pull llama3.1:8b

# Start the proxy
cd src/proxies/OllamaProxy
dotnet run
# → http://localhost:5099

The proxy auto-discovers your installed Ollama models and passes the model ID through. Add it to VS Code:

// chatLanguageModels.json  (Windows: %APPDATA%\Code\User\)
{
  "providers": [
    {
      "name": "Ollama — llama3.1:8b",
      "vendor": "customendpoint",
      "url": "http://localhost:5099/v1",
      "modelId": "llama3.1:8b",
      "chatModelId": "copilot-chat-model"
    }
  ]
}

Verify it’s running:

curl http://localhost:5099/health
# → {"status":"ok","backend":"ollama","models":["llama3.1:8b",...]}

Proxies test app: health dashboard showing all three proxies green
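
If you prefer a script to curl, a small health checker might look like the sketch below. It assumes every proxy exposes the same /health route and JSON shape as the Ollama example above (the post only shows port 5099), and the helper names parse_health and check are hypothetical.

```python
import json
import urllib.request

# Ports from the proxy table; /health on 5100 and 5101 is an assumption.
PROXIES = {"OllamaProxy": 5099, "FoundryProxy": 5100, "FoundryLocalProxy": 5101}


def parse_health(body: str) -> dict:
    """Reduce a /health JSON body to the fields a dashboard would show."""
    data = json.loads(body)
    return {
        "healthy": data.get("status") == "ok",
        "backend": data.get("backend"),
        "models": data.get("models", []),
    }


def check(port: int, timeout: float = 2.0) -> dict:
    """Call /health on localhost:port; report unhealthy on any failure."""
    url = f"http://localhost:{port}/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return parse_health(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers connection errors and timeouts, ValueError covers bad JSON.
        return {"healthy": False, "backend": None, "models": []}
```

Looping over PROXIES and calling check(port) for each entry gives a quick view of which proxies are up.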

5. FoundryLocalProxy — Offline + NPU Inference

Microsoft Foundry Local runs models fully offline using the ONNX Runtime with hardware acceleration (CPU, GPU, NPU on Windows).

No pre-requisites — the SDK downloads the model on first run and caches it locally.

cd src/proxies/FoundryLocalProxy
dotnet run
# First run: downloads phi-4-mini (~2.5 GB) automatically
# → http://localhost:5101

The Models page in the test app shows which models are cached, lets you load/unload (frees GPU RAM instantly), and delete models from disk:

Foundry Local model management — load, unload, delete cached models

💡 Tip: Use the Models page to download a model before chatting with it. If you send a chat request to an unloaded model, you get a clear error explaining the model needs to be loaded first — not a cryptic 500.

Add it to VS Code alongside Ollama — Copilot lets you pick which model to use per conversation:

{
  "name": "Foundry Local — phi-4-mini",
  "vendor": "customendpoint",
  "url": "http://localhost:5101/v1",
  "modelId": "phi-4-mini",
  "chatModelId": "copilot-chat-model"
}

6. FoundryProxy — Azure OpenAI with Proper Secret Management

For cloud models, FoundryProxy uses .NET User Secrets so your API key never touches the repo.

cd src/proxies/FoundryProxy

# Store credentials locally (never committed to git)
dotnet user-secrets set "Foundry:Endpoint"   "https://your-resource.openai.azure.com"
dotnet user-secrets set "Foundry:ApiKey"     "your-key"
dotnet user-secrets set "Foundry:Deployment" "gpt-4o-mini"

dotnet run
# → http://localhost:5100

7. All Three Together — One Command with Aspire

The fastest way to run everything is via the .NET Aspire CLI. One command starts all three proxies, the Blazor test app, and the Aspire dashboard with logs, traces, and health checks:

cd src/proxies
aspire start

What starts:

Service             | URL                   | What it is
--------------------|-----------------------|--------------------------------------
ollama-proxy        | http://localhost:5099 | OllamaProxy
foundry-proxy       | http://localhost:5100 | FoundryProxy
foundry-local-proxy | http://localhost:5101 | FoundryLocalProxy
proxies-test-app    | http://localhost:5102 | Blazor test UI
Aspire dashboard    | printed in console    | Logs, traces, health for all services

Requires Aspire CLI: dotnet workload install aspire

The Blazor test app at http://localhost:5102 gives you a browser UI to test all three proxies without writing any code:

Chat page — pick a proxy, model, and send a message with streaming support

Compare page: same prompt sent to all three proxies simultaneously

The Aspire dashboard shows traces for every request, including custom LlmActivity spans with prompt text, model ID, token counts, and latency. You can see exactly what Copilot sent and what the model returned.

Aspire dashboard — all four services running and healthy
Aspire traces — LLM spans with latency and token counts

To stop everything:

aspire stop

8. Wire It to VS Code Copilot

The /setup page at http://localhost:5102/setup generates the exact chatLanguageModels.json snippet for each running proxy, with the correct port and model ID. Copy and paste into your VS Code user config folder:

Setup page: auto-generated VS Code config snippets for each proxy

  • Windows: %APPDATA%\Code\User\chatLanguageModels.json
  • macOS: ~/Library/Application Support/Code/User/chatLanguageModels.json
  • Linux: ~/.config/Code/User/chatLanguageModels.json

After saving, reload VS Code. Open Copilot Chat, click the model picker, and your local models appear alongside the built-in cloud models.
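
For readers who would rather script this step, here is a minimal Python sketch that resolves the per-OS path listed above and adds or replaces a provider entry. The field names come from the config examples earlier in the post. The helper names config_path, upsert_provider, and write_config are hypothetical and are not part of the CopilotHarness tooling.

```python
import json
import os
import sys
from pathlib import Path
from typing import Mapping, Optional


def config_path(platform: str = sys.platform,
                env: Mapping[str, str] = os.environ,
                home: Optional[Path] = None) -> Path:
    """Return the chatLanguageModels.json location for the given platform."""
    home = home if home is not None else Path.home()
    if platform.startswith("win"):
        base = Path(env["APPDATA"]) / "Code" / "User"
    elif platform == "darwin":
        base = home / "Library" / "Application Support" / "Code" / "User"
    else:
        base = home / ".config" / "Code" / "User"
    return base / "chatLanguageModels.json"


def upsert_provider(config: dict, name: str, port: int, model_id: str) -> dict:
    """Return a copy of config with one provider entry added or replaced by name."""
    entry = {
        "name": name,
        "vendor": "customendpoint",
        "url": f"http://localhost:{port}/v1",
        "modelId": model_id,
        "chatModelId": "copilot-chat-model",
    }
    others = [p for p in config.get("providers", []) if p.get("name") != name]
    return {**config, "providers": others + [entry]}


def write_config(path: Path, config: dict) -> None:
    """Write the config as indented JSON, creating parent folders if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
```

Replacing by name rather than appending keeps the file free of duplicates when the script is run more than once.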

💡 Shortcut: If you have the CopilotHarness CLI tool installed, running harness init writes this file automatically.


9. What’s Next — Smart Routing

The proxies shown here are static: you pick a model manually per conversation. The next level is automatic routing — where every Copilot request is analyzed and sent to the best model automatically.

“Is this a simple rename? → local llama3.1:8b. Is this a complex architecture question? → cloud GPT-5. Is this about GitHub Actions? → a specialist agent.”

That’s what the full CopilotHarness router does — policy-based routing with semantic matching, local classifiers, and per-request telemetry. Part 2 of this series walks through building and using it.
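
As a taste of the idea only, the sketch below routes by plain keyword matching. It is a toy in Python and not the CopilotHarness router, which the post describes as using semantic matching and local classifiers. The rule table, the target names, and the route function are all assumptions made for illustration.

```python
# Hypothetical routing rules: (target, keywords). First match wins.
ROUTES = [
    ("github-actions-agent", ("github actions",)),
    ("llama3.1:8b", ("rename", "docstring", "what does")),
]
DEFAULT_TARGET = "gpt-5"  # anything unmatched goes to the large cloud model


def route(message: str) -> str:
    """Pick a target model or agent for a message using keyword rules."""
    text = message.lower()
    for target, keywords in ROUTES:
        if any(keyword in text for keyword in keywords):
            return target
    return DEFAULT_TARGET
```

Rule order matters here: the specialist rule is listed first so that a request like "rename this GitHub Actions job" goes to the agent instead of the small local model.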

Repo: github.com/elbruno/ElBruno.CopilotHarness


Quick Reference

Goal                        | Command
----------------------------|----------------------------------------------
Start just Ollama proxy     | cd src/proxies/OllamaProxy && dotnet run
Start all three + test UI   | cd src/proxies && aspire start
Stop all                    | aspire stop
Generate VS Code config     | Open http://localhost:5102/setup
View traces                 | Open Aspire dashboard URL printed in console
Manage Foundry Local models | Open http://localhost:5102/models
Test proxy health           | curl http://localhost:5099/health

This is Part 1 of the CopilotHarness series.
Next: Part 2 — Smart Routing: Sending Each Request to the Right Model Automatically

Code: github.com/elbruno/ElBruno.CopilotHarness/tree/main/src/proxies

Happy coding!

Greetings

El Bruno

More posts in my blog ElBruno.com.

More info in https://beacons.ai/elbruno







Better Models: Worse Tools


Armin reports on a weird problem he ran into while hacking on Pi:

The short version is that newer Claude models sometimes call Pi’s edit tool with extra, invented fields in the nested edits[] array. And not Haiku or some small model: Opus 4.8. The edit itself is usually correct but the arguments do not match the schema as the model invents made-up keys and Pi thus rejects the tool call and asks to try again.

That alone is not too surprising as models emit malformed tool calls sometimes. Particularly small ones. What surprised me is that this is getting worse with newer Anthropic models as both Opus 4.8 and Sonnet 5 show it but none of the older models. In other words, the SOTA models of the family are worse at this specific tool schema than their older siblings.

Armin theorizes that this is because more recent Anthropic models have been specifically trained (presumably via Reinforcement Learning) to better use the edit tools that are baked into Claude Code. This has the unfortunate effect that other coding harnesses, such as Pi, may find that their own custom edit tools are more likely to be used incorrectly.
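
To make the failure mode concrete, here is a minimal Python sketch of strict validation over a nested edits[] array, plus a lenient alternative that drops unknown keys. Pi's real edit schema is not shown in the quoted post, so the field names oldText and newText and both function names are hypothetical.

```python
# Hypothetical schema: each edit may contain exactly these keys.
ALLOWED_EDIT_KEYS = {"oldText", "newText"}


def validate_edits(arguments: dict) -> list:
    """Return a list of schema errors for the nested edits[] array."""
    errors = []
    for index, edit in enumerate(arguments.get("edits", [])):
        extra = sorted(set(edit) - ALLOWED_EDIT_KEYS)
        missing = sorted(ALLOWED_EDIT_KEYS - set(edit))
        if extra:
            errors.append(f"edits[{index}]: unexpected keys {extra}")
        if missing:
            errors.append(f"edits[{index}]: missing keys {missing}")
    return errors


def strip_unknown_keys(arguments: dict) -> dict:
    """Lenient variant: drop invented keys instead of rejecting the call."""
    cleaned = [
        {key: value for key, value in edit.items() if key in ALLOWED_EDIT_KEYS}
        for edit in arguments.get("edits", [])
    ]
    return {**arguments, "edits": cleaned}
```

A strict harness rejects the call and asks the model to retry, as described above. The lenient variant accepts an otherwise correct edit, at the cost of silently ignoring whatever the model meant by the extra field.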

Claude's edit tool uses search and replace. OpenAI's Codex uses an apply_patch mechanism instead, and OpenAI have talked in the past about how their models are trained to use that tool effectively.

Does this mean third-party coding harnesses like Pi should implement multiple edit tools just so they can use the one with the best performance for the underlying model the user has selected?
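
If a harness did go that way, the selection step itself could be tiny. The Python sketch below is one possible shape, under the assumption that model IDs start with a recognizable family prefix. The prefixes, tool names, and pick_edit_tool function are hypothetical and do not describe how Pi or any other harness works.

```python
# Hypothetical mapping from model family prefix to the edit tool style
# that family is reportedly trained on.
EDIT_TOOL_BY_FAMILY = {
    "claude": "search_replace",
    "gpt": "apply_patch",
}
DEFAULT_EDIT_TOOL = "search_replace"


def pick_edit_tool(model_id: str) -> str:
    """Choose which edit tool to expose for the selected model."""
    lowered = model_id.lower()
    for prefix, tool in EDIT_TOOL_BY_FAMILY.items():
        if lowered.startswith(prefix):
            return tool
    return DEFAULT_EDIT_TOOL
```

The harder part is not the lookup but maintaining and testing several edit tool implementations side by side.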

Tags: armin-ronacher, ai, openai, generative-ai, llms, anthropic, llm-tool-use, coding-agents, pi


Azure landing zone (ALZ) enters its next chapter


Hey all 👋

I’ve got a genuinely exciting update to share today, and it’s one that’s been a long time coming: Azure landing zone (ALZ) is now an official Microsoft product, owned by the Azure Migrate product team.

A bit of history first

For the past five or so years, ALZ has been built completely in the open, in the open source repos, in community calls, in issues, in PRs, together with an incredible group of customers, partners, and Microsoft folks who cared enough to keep showing up and making it better. That community effort is the entire reason ALZ is what it is today, and it deserves a moment of recognition before we talk about what’s changing.

So, what’s actually changing?

ALZ is graduating from a community-driven, open-source initiative into a fully-fledged, officially owned Microsoft product, with a dedicated product team behind it in the Azure Migrate team.

For you, practically? Nothing changes. The GitHub repos, the modules, the way you consume ALZ today all stay exactly as they are. What’s changing is who’s steering the ship, and that it now has the backing, investment, and roadmap of an official product team. If you’ve got an issue to raise, that still happens exactly where it always has, over at aka.ms/alz/issues.

What about us?

I want to be upfront about this part: myself (Jack Tracey), Matt White, Jared Holgate, and Zach Trocinski are no longer involved in the day-to-day of ALZ. No more issue triage, no more day-to-day operations from us. That responsibility now sits with the Azure Migrate team.

We’re not disappearing entirely, though. Over the last couple of months we’ve been running a proper handover, and we’ll continue to be around behind the scenes for those moments when the Azure Migrate team needs a bit of extra context we didn’t manage to pass on during that process.

And honestly? You’re in great hands. The Azure Migrate team already know ALZ inside and out. They’ve been working alongside us building the Azure Migrate agent’s platform landing zone creation experience, which uses ALZ under the hood. This isn’t a handover to strangers, it’s a handover to people who’ve already been in the engine room with us.

What can you expect from the Azure Migrate team?

The Azure Migrate team are keen to keep the community engagement up and active, just as we did in the ALZ team of old. They want to run community calls and be just as visible and active as we’ve always tried to be.

So, keep an eye out for blog posts and announcements from them over the coming months. This is very much a “watch this space” moment, and we’re confident you’ll see the same energy and openness from them that you’ve come to expect from ALZ.

A personal thank you

Before I move on, I wanted to add something a bit more personal.

ALZ is one of the things I’m proudest of from my time at Microsoft so far. I’ve built and led it over the past five or six years, surrounded by genuinely great people who helped shape it, and backed by an amazing community, customer, and partner base who supported us every step of the way to make it the success it is today.

So, while I’m stepping away and I’m no longer involved in the day-to-day, ALZ will always hold a special place for me. I’ll forever be happy to chat about it socially. It’s something I still have real passion for, and that’s not going away just because my day job has moved on.

That said, I’m taking those learnings and that passion into other things at Microsoft, including now focusing on AVM (Azure Verified Modules) alongside Jared and several other great folks. We’ll have some announcements of a similar nature to share on that front soon, so watch this space 😁

Thank you 🙌

And finally, the wider thanks. Alongside Matt White, Jared Holgate, and Zach Trocinski, huge thanks to: Paul Grimley, Rob Kuehfus, Sacha Narinx, Seif Bassem, Arjen Huitema, Nelson Pereira, Paulo Alves Oliveira Jr., Vishal Mehrotra, Charlie Grabiaud, Simona Tarantola, Bruno Gabrielli, Luke Taylor, Adam Tuckwell, and Kevin Rowlandson.

A special shout-out too to Remo Leone Laudo, Rhys Ash, Jamie Pla, Igor Jovovic, and Haflidi Fridthjofsson, who will continue to contribute to the ALZ IaC modules alongside their day jobs as CSAs, as and when they can 🙂

And to everyone else who’s contributed to ALZ over the past five years or so, through code, issues, feedback, conversations, or just using it and telling us what worked and what didn’t, thank you. This milestone is yours as much as anyone’s.

Here’s to the next chapter for ALZ. 🎉


Ex-Microsoft engineer rebuilds Notepad in 2.5KB using nothing but stuff Windows already had


Dave Plummer, the retired Microsoft engineer who built Task Manager and helped ship Space Cadet Pinball, has recreated Notepad in roughly 2.5 kilobytes. The project is called TinyRetroPad, and despite the size (or lack of it), it still has Open, Save, Find and Replace, printing, font selection, word wrap, and the unsaved changes prompt, packed into an executable that is significantly smaller than the featured image above this paragraph.

TinyRetroPad by Dave Plummer
Credit: WindowsLatest.com

Plummer has spent recent months telling Microsoft what they do not want to hear about Windows 11. He argued the OS needs its own Windows XP SP2 moment, a stretch where Microsoft drops new features and only fixes what is broken. He has also said Windows 11 has turned into a sales channel for Microsoft’s other products, nudging users toward Edge, OneDrive, and Copilot.

At a time when memory and storage cost a fortune, what we’re interested in is how an app was created with an install size that mocks the entire fabric of software development.

TinyRetroPad in action

How does TinyRetroPad fit an entire Notepad into 2.5KB?

Plummer explains this isn’t really a magic trick. Windows already contains most of what makes up a Windows application: a window manager, menus, common dialogues, clipboard handling, edit controls, font selection, and file open and save dialogues, along with printing infrastructure. A tiny native Windows program doesn’t have to bring along its own entire civilization.

Size of TinyRetroPad
Credit: WindowsLatest.com

As Plummer puts it, “it arrives with a lunchbox and a map of the city.” A mature operating system is also a giant library of already solved problems, and because that machinery is already installed on the machine, a tiny executable can call into it and appear to perform miracles.

RAM usage in TinyRetroPad
Credit: WindowsLatest.com

TinyRetroPad is a fork of Matt Power’s Dave’s Tiny Editor, itself built on tiny.asm, a project Plummer wrote years ago to prove what the smallest complete Windows application could look like. It’s a thin wrapper around RICHEDIT50W, the rich text control Windows has carried for decades. Drawing characters, managing the cursor, handling selection, cut, copy, paste, undo history: Windows already does all of it inside that one control.

Early versions used the plainer EDIT control and got down to 890 bytes, though Windows Defender wasn’t a fan of how aggressively that build was compressed. Later versions moved to RICHEDIT for cheap access to the Courier font and bigger file support, settling at 981 bytes before a single menu existed.

Context menu in TinyRetroPad
Credit: WindowsLatest.com

The growth log Plummer kept shows what each addition cost:

  • The File menu brought it to 1,375 bytes.
  • The unsaved changes prompt, which needed a real dirty flag and a close, pushed it to 1,622 bytes.
  • Find and Replace came in at 2,143 bytes.
  • Printing was the biggest single jump, bringing the whole thing to 2,476 bytes.

Fonts in TinyRetroPad
Credit: WindowsLatest.com
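
The per-feature cost can be read off the sizes quoted above by subtracting each milestone from the one before it. The short Python sketch below does that arithmetic. It uses only the figures listed in this article, starting from the 981-byte RICHEDIT build, and the milestone labels are shorthand. Going by these listed sizes alone, the Find and Replace step shows the largest difference, which suggests the quoted list omits intermediate entries from Plummer's log (font selection and word wrap are not itemized) or counts the jumps differently.

```python
# Sizes in bytes, as quoted in the article; labels are shorthand.
MILESTONES = [
    ("RICHEDIT build, no menus", 981),
    ("File menu", 1375),
    ("Unsaved changes prompt", 1622),
    ("Find and Replace", 2143),
    ("Printing", 2476),
]


def deltas(milestones):
    """Return (label, bytes added since the previous listed milestone)."""
    return [
        (name, size - previous_size)
        for (_, previous_size), (name, size) in zip(milestones, milestones[1:])
    ]


for name, added in deltas(MILESTONES):
    print(f"{name}: +{added} bytes")
```

Because Crinkler compresses the whole executable together, these differences are only a rough guide to what each feature costs in isolation.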

None of this works without Crinkler, a compression linker built for the demoscene that squeezes and rearranges the executable instead of just linking it. Sometimes a whole feature adds nothing to the file size because the code happens to compress well. Sometimes a clean function ends up bigger than an ugly, repetitive one, since Crinkler compresses repetition far more efficiently than a lookup table full of branches.

It’s also not a finished product. There’s no Releases page for some reason, and Crinkler-built executables may trigger antivirus false positives. The open GitHub issues read like a list of what a 2.5KB program gives up. One user reported it chewing through around 500MB of RAM on 64-bit Windows 7, and others found it won’t run on Windows XP SP3 at all.

Why the internet keeps calling Windows 11’s Notepad bloated

Modern Notepad has spent the last couple of years turning into a case study in feature creep. The notepad.exe on a typical Windows 11 install comes in at around 352KB, with an install size closer to 808KB, because that exe is really a stub pointing at a UWP and WinUI app adding up to roughly 5MB on disk. The original XP-era Notepad was about 65KB in total.

Windows 11 Notepad vs Windows 10 Notepad RAM comparison

Of course, you’re not losing any precious memory because of the bloated Notepad, but the way Microsoft moved it away from being a simple text editor is what created all this backlash.

Tabs and autosave were welcome additions, and now I can’t think of Notepad without these. But in June 2025, Notepad gained Markdown formatting, and users pointed out that Windows already had WordPad for that job before Microsoft killed it off.

By August, the right-click menu had grown so cluttered with Copilot options that Microsoft had to redesign it just to make cut and paste findable again. A Create a table tool arrived in January 2026, and image support followed in February, built on that same Markdown engine.

Notepad image insert

That month gave us proof that this feature creep costs something real. Microsoft confirmed an 8.8 rated remote code execution flaw, tracked as CVE-2026-20841, where a malicious Markdown link could let an attacker run code with the victim’s own permissions just by getting them to click it inside Notepad. A plain text editor with no link handling could never have that problem.

By March, Microsoft scaled back Copilot branding across several apps, and by April, Microsoft mostly just renamed Copilot to Writing Tools in Notepad instead of pulling the AI features out.

New AI tools in Notepad

The real argument is about Windows, not Notepad

Windows 11 LTSC, the long-term servicing edition Microsoft builds for enterprises that can’t tolerate constant change, still ships the classic Notepad with no Copilot and no Markdown, and so does Windows 10. The plain Notepad that TinyRetroPad recreates was never deleted. Microsoft just quietly retired it from Windows 11.

Windows 11 LTSC has only Microsoft Edge as a modern app

Plummer has said the point was never to get anyone to use a hand-assembled 2.5KB editor. It’s to show how much untapped potential already sits inside Windows, because modern app development defaults to bundling everything an app might need instead of asking what the OS already provides.

In a recent test, Windows Latest found that Windows 11’s Media Player takes a few seconds to open a video and uses 377MB idle, against 103.4MB and instant playback on the legacy version, which predates HEVC yet plays it better than the modern app does without a $0.99 Store add-on.

Windows Media Player RAM usage when compared with Legacy

Sure, we need modern-looking apps in Windows 11, but that mustn’t come at the cost of efficiency and control. We’re not saying that Microsoft isn’t allowed to bundle subscription plans in their inbox apps, but Windows 11 itself isn’t free. It’s paid software. Microsoft’s decades-old classic apps still look good and are robust. Also, the software giant built Calculator, Notepad, and Media Player decades ago without today’s tools and infrastructure. What needs to change isn’t the hardware. It’s the mindset that every rewrite needs to be as efficient as possible, just for the sake of it being possible.

The post Ex-Microsoft engineer rebuilds Notepad in 2.5KB using nothing but stuff Windows already had appeared first on Windows Latest


Valve Open-Sources Steam Machine's E-Ink Display

Valve has open-sourced the design for a customizable e-ink front panel for the Steam Machine, dubbed the "Inkterface." "All of it is available on their GitLab under the MIT license, which goes over everything you need to make your own and stick it on the front of your fancy new Steam Machine," reports GamingOnLinux. From the report:

They're now calling it the "Inkterface" and there's a good few things you'll need to make it including:

  • 1 x Adafruit ESP32 Feather with 2MB PSRAM
  • 1 x Adafruit eInk Breakout Friend
  • 1 x Adafruit 5.83" Monochrome eInk Panel
  • 13 x M2.5 x 5mm Pan Head Machine Screws
  • 4 x 1/4" x 1/4" x 3/16" Stepped Magnet SB443-OUT

Valve even provided a video on the GitLab showing it being put together [...].

Read more of this story at Slashdot.
