Read more of this story at Slashdot.
Agents League was designed to showcase what agentic AI can look like when developers move beyond single‑prompt interactions and start building systems that plan, reason, verify, and collaborate.
Across three competitive tracks—Creative Apps, Reasoning Agents, and Enterprise Agents—participants had two weeks to design and ship real AI agents using production‑ready Microsoft and GitHub tools, supported by live coding battles, community AMAs, and async builds on GitHub.
Today, we’re excited to spotlight the winning project for the Reasoning Agents track, built on Microsoft Foundry: CertPrep Multi‑Agent System — Personalised Microsoft Exam Preparation by Athiq Ahmed.
The Reasoning Agents Challenge Scenario
The goal of the Reasoning Agents track challenge was to design a multi‑agent system capable of effectively assisting students in preparing for Microsoft certification exams. Participants were asked to build an agentic workflow that could understand certification syllabi, generate personalized study plans, assess learner readiness, and continuously adapt based on performance and feedback. The suggested reference architecture modeled a realistic learning journey: starting from free‑form student input, a sequence of specialized reasoning agents collaboratively curated Microsoft Learn resources, produced structured study plans with timelines and milestones, and maintained learner engagement through reminders. Once preparation was complete, the system shifted into an assessment phase to evaluate readiness and either recommend the appropriate Microsoft certification exam or loop back into targeted remediation—emphasizing reasoning, decision‑making, and human‑in‑the‑loop validation at every step.
All details are available here: agentsleague/starter-kits/2-reasoning-agents at main · microsoft/agentsleague.
The Winning Project: CertPrep Multi‑Agent System
The CertPrep Multi‑Agent System is an AI solution for personalized Microsoft certification exam preparation, supporting nine certification exam families.
At a high level, the system turns free‑form learner input into a structured certification plan, measurable progress signals, and actionable recommendations—demonstrating exactly the kind of reasoned orchestration this track was designed to surface.
Inside the Multi‑Agent Architecture
At its core, the system is designed as a multi‑agent pipeline that combines sequential reasoning, parallel execution, and human‑in‑the‑loop gates, with traceability and responsible AI guardrails.
The solution is composed of eight specialized reasoning agents, each focused on a specific stage of the learning journey:
Throughout the pipeline, a 17‑rule Guardrails Pipeline enforces validation checks at every agent boundary, and two explicit human‑in‑the‑loop gates ensure that decisions are made only when sufficient learner confirmation or data is present.
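To make the idea of guardrails at an agent boundary concrete, here is a minimal sketch. It is not taken from the CertPrep codebase: the `StudyPlan` shape, the three rules, and the `confirm` callback are all hypothetical, and the real pipeline uses 17 rules whose details are not listed here.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class StudyPlan:
    """Hypothetical payload handed from one agent to the next."""
    exam_code: str
    weekly_hours: int
    topics: List[str] = field(default_factory=list)


# A guardrail rule returns an error message, or None when the payload passes.
Rule = Callable[[StudyPlan], Optional[str]]


def has_exam_code(plan: StudyPlan) -> Optional[str]:
    return None if plan.exam_code.strip() else "exam_code is empty"


def hours_in_range(plan: StudyPlan) -> Optional[str]:
    return None if 1 <= plan.weekly_hours <= 60 else "weekly_hours out of range"


def has_topics(plan: StudyPlan) -> Optional[str]:
    return None if plan.topics else "no topics selected"


# Illustrative rule set; the names and thresholds are assumptions.
RULES: List[Rule] = [has_exam_code, hours_in_range, has_topics]


def check_boundary(plan: StudyPlan, rules: List[Rule] = RULES) -> List[str]:
    """Run every rule and collect all violations rather than stopping early."""
    return [msg for rule in rules if (msg := rule(plan)) is not None]


def human_gate(plan: StudyPlan, confirm: Callable[[StudyPlan], bool]) -> bool:
    """Proceed only when the guardrails pass and the learner confirms."""
    return not check_boundary(plan) and confirm(plan)
```

Collecting every violation, instead of failing on the first, gives the learner (or a reviewer reading the trace) one complete list of what to fix before the next agent runs.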
CertPrep leverages Microsoft Foundry Agent Service and related tooling to run this reasoning pipeline reliably and observably:
Notably, the full pipeline is designed to run in under one second in mock mode, enabling reliable demos without live credentials.
User Experience: From Onboarding to Exam Readiness
Beyond its backend architecture, CertPrep places strong emphasis on clarity, transparency, and user trust through a well‑structured front‑end experience. The application is built with Streamlit and organized as a 7‑tab interactive interface, guiding learners step‑by‑step through their preparation journey.
From a user’s perspective, the flow looks like this:
The UI also includes an Admin Dashboard and demo‑friendly modes, enabling judges, reviewers, or instructors to inspect reasoning traces, switch between live and mock execution, and demonstrate the system reliably without external dependencies.
Why This Project Stood Out
This project embodies the spirit of the Reasoning Agents track in several ways:
It demonstrates how agentic patterns translate cleanly into maintainable architectures when supported by the right platform abstractions.
Try It Yourself
Explore the project, architecture, and demo here:
As we shared in the announcement, Microsoft Foundry Toolkit for Visual Studio Code is now generally available. In this deep dive, we walk through everything that’s in the GA release — from the rebrand and extension consolidation, to model experimentation, agent development, evaluations, and on-device AI for scientists and engineers pushing the boundaries of edge hardware.
Whether you’re exploring your first model, shipping a production agent, or squeezing performance from edge hardware, Foundry Toolkit meets you where you are.
You’ve heard about a new model and want to try it right now — not after spinning up infrastructure or writing boilerplate API code. That’s exactly what Microsoft Foundry Toolkit is built to deliver.
With a Model Catalog spanning 100+ models — cloud-hosted from GitHub, Microsoft Foundry, OpenAI, Anthropic, and Google, plus local models via ONNX, Foundry Local, or Ollama — you go from curiosity to testing in minutes.
The Model Playground is where experimentation lives: compare two models side by side, attach files for multimodal testing, enable web search, adjust system prompts, and watch streaming responses come in.
When something works, View Code generates ready-to-use snippets in Python, JavaScript, C#, or Java — the exact API call you just tested, translated into your language of choice and ready to paste.
Foundry Toolkit supports the full agent development journey with two distinct paths and a clean bridge between them.
Agent Builder is a low-code interface that lets you take an idea, define instructions, attach tools, and start a conversation — all without writing a line of code. It’s the fastest way to validate whether an agent concept actually works. You can:
The result is a working, testable agent in minutes — perfect for validating use cases or prototyping features before investing in a full codebase.
For teams building complex systems — multi-agent workflows, domain-specific orchestration, production deployments — code gives you control. Foundry Toolkit scaffolds production-ready code structures for Microsoft Agent Framework, LangGraph, and other popular orchestration frameworks. You’re not starting from scratch; you’re starting from a solid foundation.
Once your agent is running, Agent Inspector turns debugging from guesswork into real engineering:
When you’re ready to ship, one-click deployment packages your agent and deploys it to a production-grade runtime on Microsoft Foundry Agent Service as a hosted agent. The Hosted Agent Playground lets you test it directly from the VS Code sidebar, keeping the feedback loop tight.
These paths aren’t silos — they’re stages. When your Agent Builder prototype is ready to grow, export it directly to code with a single click. The generated project includes the agent’s instructions, tool configurations, and scaffolding — giving your engineering team a real starting point rather than a rewrite.
GitHub Copilot with the Microsoft Foundry Skill keeps momentum going once you’re in code. The skill knows the Agent Framework patterns, evaluation APIs, and Foundry deployment model. Ask it to generate an agent, write an evaluation, or scaffold a multi-agent workflow, and it produces code that works with the rest of the toolkit.
At every stage — prototype or production — integrated evaluations let you measure agent quality without switching tools. Define evaluations using familiar pytest syntax, run them from VS Code Test Explorer alongside your unit tests, and analyze results in a tabular view with Data Wrangler integration. When you need scale, submit the same definitions to run in Microsoft Foundry. Evaluations become versioned, repeatable, and CI-friendly — not one-off scripts you hope to remember.
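As a rough illustration of what a pytest-style evaluation can look like, the sketch below uses only plain test functions and asserts. It does not use the toolkit's evaluation API, which this post does not show: `run_agent`, the canned answers, and the keyword metric are all hypothetical stand-ins.

```python
from typing import Dict, List


def run_agent(prompt: str) -> str:
    """Hypothetical stand-in for the agent under test.

    A real evaluation would call the deployed agent here instead of
    looking up a canned answer.
    """
    canned: Dict[str, str] = {
        "What is the capital of France?": "The capital of France is Paris.",
        "What is 2 + 2?": "2 + 2 equals 4.",
    }
    return canned.get(prompt, "I don't know.")


# Each case pairs a prompt with keywords the answer is expected to contain.
CASES: List[dict] = [
    {"prompt": "What is the capital of France?", "expected": ["Paris"]},
    {"prompt": "What is 2 + 2?", "expected": ["4"]},
]


def keyword_score(answer: str, expected: List[str]) -> float:
    """Fraction of expected keywords found in the answer (case-insensitive)."""
    hits = sum(1 for kw in expected if kw.lower() in answer.lower())
    return hits / len(expected)


def test_agent_answers_contain_expected_keywords() -> None:
    # pytest collects functions named test_*; a failed assert fails the test.
    for case in CASES:
        score = keyword_score(run_agent(case["prompt"]), case["expected"])
        assert score == 1.0, f"low score for {case['prompt']!r}"
```

Because the cases live in data rather than in the test body, the same list can be versioned alongside the code and rerun in CI.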
AI running on your device — at your pace, without data leaving your machine.
Cloud-hosted AI is convenient — but it's not always the right fit. Local models offer:
That's why we're bringing a complete end-to-end workflow for discovering, running, converting, profiling, and fine-tuning AI models directly on Windows. Whether you're a developer exploring what models can do, an engineer optimizing models for production, or a researcher training domain-specific model adapters, Foundry Toolkit gives you the tools to work with local AI without compromise.
As we mentioned at the beginning of this article, the Model Playground is your starting point for local models as well as cloud models. It includes Microsoft's full catalog of models, including the Phi open model family and Phi Silica, Microsoft's local language model optimized for Windows. As you go deeper, the Playground also supports any LLM you've converted locally through the Conversion workflow: add it to My Resources and try it immediately in the same chat experience.
Getting a model from a research checkpoint to something that runs efficiently on your specific hardware is non-trivial. Foundry Toolkit's conversion pipeline handles the full transformation for a growing selection of popular Hugging Face models: Hugging Face → Conversion → Quantization → Evaluation → ONNX.
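To give a feel for what the quantization step trades away, here is a small sketch of symmetric per-tensor int8 quantization, one of the simplest schemes. This is an assumption chosen for illustration; the post does not say which quantization methods the conversion pipeline applies, and the function names are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using a single shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Avoid dividing by zero when every weight is 0.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate floats; the difference is the quantization error."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
# Rounding to the nearest step bounds the per-weight error by half a step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight now fits in one byte instead of four, at the cost of an error of at most half a quantization step, which is why an evaluation stage after quantization is worth having.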
The result: a model optimized for Windows ML — Microsoft's unified runtime for local AI on Windows.
All supported hardware targets are aligned with Windows ML's execution provider ecosystem:
Why Windows ML matters for you: Windows ML lets your app automatically acquire and use hardware-specific EPs at runtime — no device-specific code required. Your converted model runs across the full range of supported Windows hardware.
Once your model has been converted successfully, Foundry Toolkit gives you everything you need to validate, share, and ship:
Converting a local model is one thing. Understanding how it uses your hardware is another. Foundry Toolkit’s profiling tools give you real-time visibility into CPU, GPU, NPU, and memory consumption — with per-second granularity and a 10-minute rolling window.
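A 10-minute rolling window at one sample per second is simply a fixed-size buffer of 600 entries. The sketch below shows that idea with a bounded `deque`; it is not the toolkit's implementation, and the `Sample` fields and class name are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Optional


@dataclass
class Sample:
    timestamp: float    # seconds since profiling started
    cpu_percent: float
    memory_mb: float


class RollingWindow:
    """Keeps only the most recent `window_seconds` one-per-second samples."""

    def __init__(self, window_seconds: int = 600) -> None:
        # A deque with maxlen silently drops the oldest entry when full.
        self._samples: Deque[Sample] = deque(maxlen=window_seconds)

    def add(self, sample: Sample) -> None:
        self._samples.append(sample)

    def __len__(self) -> int:
        return len(self._samples)

    def oldest_timestamp(self) -> Optional[float]:
        return self._samples[0].timestamp if self._samples else None

    def average_cpu(self) -> float:
        if not self._samples:
            return 0.0
        return sum(s.cpu_percent for s in self._samples) / len(self._samples)


window = RollingWindow()
for second in range(700):  # simulate 700 seconds of sampling
    window.add(Sample(float(second), cpu_percent=50.0, memory_mb=1024.0))
```

After 700 simulated seconds the window holds only the latest 600 samples, so memory use stays constant however long the profiling session runs.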
Three profiling modes cover different workflows:
For example, when you run a local model in the Playground, you get detailed visibility into what's happening under the hood during inference — far beyond basic resource usage. Windows ML Event Breakdown surfaces how execution time is spent: a single model execution is broken down into phases — such as session initialization versus active inference — so you know whether slowness is a one-time startup cost or a per-request bottleneck.
When you profile any ONNX model directly, operator-level tracing shows exactly which graph nodes and operators are dispatched to the NPU, CPU, or GPU, and how long each one takes. This makes it straightforward to identify which parts of your model are underutilizing available hardware — and where quantization, graph optimization, or EP changes will have the most impact.
Generic models are capable; domain-specific models are precise. With LoRA (Low-Rank Adaptation), Foundry Toolkit's fine-tuning workflow lets you train adapters for Phi Silica using your own data, with no ML infrastructure required.
Bring your data, customize your LoRA parameters, and submit a job to the cloud. Foundry Toolkit spins up Azure Container Apps in your own subscription to train your adapter. To validate fine-tuning quality, the workflow tracks training and evaluation loss curves for your LoRA adapter, and cloud inference lets you check the adapter’s behavior, helping you confirm learning progress and output quality before shipping.
Once satisfied, download the adapter and incorporate it into your app for use at runtime.
This is the full loop: train in the cloud → run at the edge. Domain adaptation for local AI, without standing up your own training infrastructure.
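Why is a LoRA adapter small enough to download and ship with an app? LoRA freezes a weight matrix W of shape d × k and trains two thin matrices, B (d × r) and A (r × k), so the adapted weight is W + (alpha / r) · B·A. The sketch below works through the arithmetic; the dimensions are illustrative assumptions, not Phi Silica's actual sizes, and the helper names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def full_params(d: int, k: int) -> int:
    """Trainable parameters when fine-tuning the whole d x k matrix."""
    return d * k


def lora_params(d: int, k: int, r: int) -> int:
    """Trainable parameters for B (d x r) plus A (r x k)."""
    return r * (d + k)


def lora_delta(B: Matrix, A: Matrix, alpha: float, r: int) -> Matrix:
    """Compute the low-rank update (alpha / r) * B @ A with plain lists."""
    scale = alpha / r
    return [
        [scale * sum(B[i][t] * A[t][j] for t in range(r)) for j in range(len(A[0]))]
        for i in range(len(B))
    ]


# Illustrative sizes: a 4096 x 4096 layer adapted with rank 8.
d, k, r = 4096, 4096, 8
reduction = full_params(d, k) // lora_params(d, k, r)

# Tiny worked example: rank-1 update of a 2 x 2 matrix.
delta = lora_delta(B=[[1.0], [2.0]], A=[[3.0, 4.0]], alpha=2.0, r=1)
```

With these example sizes the adapter trains 65,536 parameters for the layer instead of 16,777,216, a 256-fold reduction.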
Foundry Toolkit for VS Code GA supports every stage of serious AI development:
All of it, inside VS Code. All of it, now generally available. Install Foundry Toolkit from the VS Code Marketplace →
Get Started with Hands-on Labs and Samples:
We'd love to hear what you build. Share feedback and file issues on GitHub, and join the broader conversation in the Microsoft Foundry Community.
This episode of The Modern .NET Show is supported, in part, by RJJ Software's Strategic Technology Consultation Services. If you're an SME (Small to Medium Enterprise) leader wondering why your technology investments aren't delivering, or you're facing critical decisions about AI, modernization, or team productivity, let's talk.
"Artificial intelligence is nothing new. It enables machines to simulate human cognitive functions such as reasoning, learning, problem solving and all using algorithms and vast data data sets to recognise patterns. And then it makes predictions and performs, you know, language processing, image recognition, and all those stuff."— Joydip Kanjilal
Hey everyone, and welcome back to The Modern .NET Show; the premier .NET podcast, focusing entirely on the knowledge, tools, and frameworks that all .NET developers should have in their toolbox. I'm your host Jamie Taylor, bringing you conversations with the brightest minds in the .NET ecosystem.
Today, we're joined by Joydip Kanjilal to talk about GitHub Copilot, agentic workflows for developers, and the benefits (and drawbacks) of having an AI agent help you write code.
Note that I didn't say, "write all the code for you," because an AI agent is simply helping you to be more productive.
"You want to you know, convert, I mean uh migrate a legacy application to a modern-day enterprise application, there will be a lot of redundant code that you will otherwise have to write. So that all that code can be automatically generated by Copilot, provided you have provided the right context."— Joydip Kanjilal
Along the way, we talked about the importance of the context that you give to an AI agent, security best practises (spoiler: you wouldn't give a new junior the keys to the castle on day one, so don't give them to your AI agents either), and the most important things to remember when using AI agents.
Before we jump in, a quick reminder: if The Modern .NET Show has become part of your learning journey, please consider supporting us through Patreon or Buy Me A Coffee. Every contribution helps us continue bringing you these in-depth conversations with industry experts. You'll find all the links in the show notes.
Anyway, without further ado, let's sit back, open up a terminal, type in `dotnet new podcast` and we'll dive into the core of Modern .NET.
The full show notes, including links to some of the things we discussed and a full transcription of this episode, can be found at: https://dotnetcore.show/season-8/context-is-everything-getting-the-most-from-github-copilot-with-joydip-kanjilal/
Remember to rate and review the show on Apple Podcasts, Podchaser, or wherever you find your podcasts; this will help the show's audience grow. Or you can just share the show with a friend.
And don't forget to reach out via our Contact page. We're very interested in your opinion of the show, so please get in touch.
You can support the show by making a monthly donation on the show's Patreon page at: https://www.patreon.com/TheDotNetCorePodcast.
Music created by Mono Memory Music, licensed to RJJ Software for use in The Modern .NET Show.
Editing and post-production services for this episode were provided by MB Podcast Services.