Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

Microsoft isn’t launching a subscription-based Windows 12 AI OS in 2026. The rumors are just AI hallucinations.


Microsoft’s enshittification of Windows 11 got so bad in 2025 that hating on the company and its products became the new cool thing to do on social media. But even though the company brought much of that on itself, spreading fake rumors about a supposedly AI-focused, subscription-based Windows 12 coming in 2026 doesn’t fix anything.

The fact is that Microsoft is not releasing Windows 12 in 2026, and there is no credible evidence that the company is preparing a subscription-based version of Windows either. And since Windows 12 doesn’t exist as of now, claims about it having AI as its foundation, with a minimum 40 TOPS NPU requirement, are pure speculation, or, more accurately, “hallucination”.

Copilot asking permission to use your files in the known six folders

Much of the discussion traces back to AI-generated articles published on forums and tech publications.

Windows Forum article about Windows 12 is generated by ChatGPT

Our investigation found that multiple AI-driven websites were referencing one another as sources, creating a loop of AI hallucinations that made fabricated claims appear credible.

Either way, the damage was already done once the story got pushed onto Reddit, where, predictably, users flooded the comments with their long-standing aversion to Microsoft.

Interestingly, those Reddit threads and posts on X also became “sources” for AI tools, which increased the models’ confidence and caused the hallucinations to spread further.

The Windows 12 rumor is built on old leaks and outdated concepts, probably by AI

The rumor describes Hudson Valley as an upcoming Windows 12 release. In reality, Hudson Valley was the internal codename for Windows 11 version 24H2, which has already shipped. And as you already know, 24H2 looks nothing like the redesign described in the rumor, and none of the supposed UI changes or architectural shifts ever appeared.

The supposed “leaks” also mention CorePC, which was a concept discussed in actual leaks several years ago. CorePC was to be a modular Windows architecture that could separate system components, improve update reliability, and scale the OS for different device categories. However, despite years of speculation, CorePC has never appeared publicly. As things stand today, there is no evidence that CorePC is part of Microsoft’s current Windows roadmap.

Another familiar talking point in the rumor is the idea that Windows 12 could become subscription-based. Back in 2023, some internal references to subscription status flags had people concerned that Microsoft might move Windows to a recurring payment model.

However, it later turned out that those internal flags referred to a cloud-based service for enterprises, not a consumer OS.

Then there is the so-called “new” Windows 12 interface described in the leak: a floating taskbar with rounded corners, system indicators moved to the top-right corner, and a large search bar centered at the top of the screen. These descriptions are identical to a concept interface Microsoft showed internally and during Ignite 2022, which leaked online at the time.

That design prototype never shipped and has not appeared in any modern Windows builds.

Windows UI Concept Microsoft accidentally showed at Ignite 2022. Source: Microsoft

Windows enthusiast phantomofearth noted that the rumor “reads like it’s straight out of 2023 when Panos [Panay] was around.” The references to Hudson Valley, CorePC, and subscription-based OS were all happening several years ago.

phantomofearth X repost about Windows 12 rumors

The story is that at one point during the Panos Panay era, there were internal plans to ship a new Windows generation around 2024. But after leadership changes inside Microsoft’s Windows division, that direction was scrapped, and the work ultimately became Windows 11 version 24H2 instead.

Panos Panay introduced Windows Copilot at Microsoft Build in May. Source: Dan DeLong for Microsoft

Despite these inconsistencies, the rumor still spread widely online, largely because users have so little trust left in Microsoft.

When will Windows 12 actually launch?

With the rumor debunked, the obvious question is, will Microsoft release Windows 12 anytime soon? Well, the answer is almost certainly not in 2026.

The company’s focus right now is on fixing Windows 11 itself. Internally, the priority appears to be addressing long-standing complaints about performance, reliability, and the AI overload.

Ask Copilot in taskbar

The next major version on the roadmap is Windows 11 26H2, which will build on platform work already underway. Note that Windows 11 26H1 focused on enabling improvements for ARM devices, specifically the Snapdragon X2 series.

Even if Microsoft eventually decides to move forward with a new Windows generation, it is unlikely to happen soon. A Windows 12 release would not arrive before 2027 at the earliest, and it almost certainly would not resemble the AI-heavy redesign described in the viral rumor.

Ironically, the reason the rumor spread so quickly says a lot about the current state of the Windows ecosystem. Microsoft’s aggressive AI push, controversial features like Recall, and the perception that the company prioritizes Copilot over everything have eroded user trust.

Recall home

When a rumor suggested Windows could become subscription-based or radically redesigned with AI running the show, many people just assumed the worst.

With Windows 10 support ended, millions of users have reluctantly moved to Windows 11. At the same time, competition is heating up with Apple’s newly introduced $599 MacBook Neo. Windows and PC manufacturers could face serious pressure unless Microsoft improves the Windows experience and regains at least some of the trust that has eroded.

MacBook Neo. Source: Apple

The real challenge for Microsoft isn’t launching a new version of Windows. It’s rebuilding confidence in the one people are already using.

As things stand, it would be unwise of Microsoft to launch Windows 12, or any major feature update to the existing Windows 11, unless it’s one that fixes the OS from the ground up.

The post Microsoft isn’t launching a subscription-based Windows 12 AI OS in 2026. The rumors are just AI hallucinations. appeared first on Windows Latest

Read the whole story
alvinashcraft
50 minutes ago
reply
Pennsylvania, USA
Share this story
Delete

A new video from the White House mixes Call of Duty footage with actual video of Iran strikes

A screenshot of the Call of Duty footage in the White House’s video.

On Wednesday, the White House posted a video of actual military strikes on Iran in the style usually seen in Call of Duty highlight videos, and started the video with a clip from Call of Duty. The real-life footage of missiles and other munitions hitting targets in Iran shows clips seen in other Trump administration videos, like this one posted to the U.S. Central Command X account.

As noted by The Washington Post's Drew Harwell, the animation at the start appears to be from Call of Duty: Modern Warfare III when a player activates a …

Read the full story at The Verge.


Amazon lays off robotics staff in latest cuts

A worker waits for a Hercules drive to move another fabric shelving unit into place at a workstation inside Amazon’s AUS2 fulfillment center. (GeekWire File Photo / Todd Bishop)

Amazon is laying off an undisclosed number of employees from its robotics division. Business Insider first reported the news, and the company confirmed the cuts in a statement to GeekWire.

“We regularly review our organizations to make sure teams are best set up to innovate and deliver for our customers,” a company spokesperson said. “Following a recent review, we’ve made the difficult decision to eliminate a relatively small number of robotics roles. We don’t make these decisions lightly, and we’re committed to supporting employees whose roles are affected with severance pay, health insurance benefits, and job placement support.”

The layoffs are separate from Amazon’s broader cuts announced in January that impacted more than 16,000 corporate workers — the second phase in a restructuring that totals 30,000 positions, the largest workforce reduction in the company’s history.

In a memo to employees in January, Beth Galetti, Amazon’s senior vice president of people experience and technology, said the company did not plan to make regular rounds of massive cuts. “Some of you might ask if this is the beginning of a new rhythm — where we announce broad reductions every few months,” she wrote. “That’s not our plan.” 

However, Galetti added that teams will continue to evaluate their operations and “make adjustments as appropriate,” saying that’s “never been more important than it is today in a world that’s changing faster than ever.”

Amazon’s robotics unit supports the company’s growing robot fleet that helps move products around its fulfillment centers. The company deployed its 1 millionth robot last year. In January, Amazon shut down its new Blue Jay warehouse robotic system, according to Business Insider.

Amazon also announced in January that it will close all of its Amazon Go and Amazon Fresh grocery store locations. The “Just Walk Out” technology originally developed for Amazon Go convenience stores, which uses overhead cameras and sensors to avoid traditional checkout, will live on as a licensing business.

Amazon previously slashed 27,000 positions in 2023 across multiple rounds of layoffs.

The company’s corporate roles numbered around 350,000 people in early 2023, the last time Amazon provided a public figure. Its overall workforce stands at 1.58 million, which includes warehouse employees.


VS Code 1.110 Ships with Agent Plugins, Browser Tools and Session Memory

Visual Studio Code 1.110 (February 2026) adds new agent extensibility, browser-driving chat tools, and expanded chat accessibility.

Evaluating AI Agents: Techniques to Reduce Variance and Boost Alignment for LLM Judges


In an ideal world, an LLM judge would behave like an experienced Subject Matter Expert (SME). To achieve this, we must align the judge with SME preferences and minimize any systematic biases it exhibits. (See our previous article for a detailed overview of common bias types in LLM judges.) We begin with techniques to improve alignment.

Pre-calibration of LLM judges to human preferences

Choosing the right models to calibrate

When choosing which model, models, or model family to use as the underlying driver of your LLM judge, it effectively comes down to a trade-off between cost and capability. Larger models tend to be more effective than smaller ones [1]; they also tend to be more expensive and slower. When deploying a suite of judges (as most projects require), carefully decide which areas of your system fundamentally need stronger alignment and consider deploying more expensive models there, while using smaller, cheaper models in less critical areas. Randomly sampling data points and exposing them intermittently to the bulkier judges is another useful tactic. Fundamentally, model choice is a design choice, and systematic testing is required to decide which models to use where.
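The intermittent-sampling tactic can be sketched as a tiny router that scores every item with a cheap judge and sends a random fraction on to a premium judge for spot-checking. The judge callables and the `fraction` parameter here are illustrative stand-ins for real LLM calls, not any Foundry API:

```python
import random

def route_to_judges(items, cheap_judge, premium_judge, fraction=0.1, seed=0):
    """Score every item with the cheap judge; spot-check a random
    fraction with the premium judge so the two can later be compared."""
    rng = random.Random(seed)
    results = []
    for item in items:
        record = {"item": item, "cheap_score": cheap_judge(item)}
        if rng.random() < fraction:  # intermittent premium sampling
            record["premium_score"] = premium_judge(item)
        results.append(record)
    return results

# Hypothetical judges: trivial stand-ins for real LLM calls.
cheap = lambda text: min(5, max(1, len(text) // 20))
premium = lambda text: 3
scored = route_to_judges(["short answer", "a much longer answer " * 5],
                         cheap, premium, fraction=0.5)
```

Raising `fraction` buys more alignment data at higher cost; the spot-checked pairs are exactly what the calibration steps below consume.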

Calibrating models to align with human preferences

LLM judges are delicate and highly responsive to system prompts. The most important thing when using LLM judges is consistency: once you have chosen a system prompt that has proven to align with human preferences, you must stick with it for the duration of your evaluation. Any fiddling with the system prompt should happen in advance, and only in advance, of evaluation; otherwise you risk moving the goalposts to make sure your shot goes in.

Therefore, the goal of calibration is to adjust the system prompt of a model to best align with responses that would be given by an SME. Unfortunately, this does mean that we need to collect and label SME responses such that we can evaluate the alignment.

Consider the example below where we wish to train an LLM judge to score a response from 1-5. This judge could be used to score a plethora of AI applications. Here’s how we can successfully align the judge.

  1. Create a stratified sample of diverse responses.
    • Ensure the full range of potential values is covered (e.g. 1–5 or 1–10).
    • Be sure to include edge cases and ambiguous samples.
    • Ensure diversity across content length, quality, tone, and so on.
    • Hold out validation and test sets as standard.

An example:

| Response | Tone | Length |
| --- | --- | --- |
| ‘This is response A…’ | Clear and concise | 300 |
| ‘This is response Z…’ | Unclear and directionless | 150 |
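A minimal sketch of step 1, assuming each response is a dict tagged with a stratum key (the `band` field is invented for illustration): responses are bucketed by stratum and sampled evenly, so rare strata such as edge cases are not drowned out.

```python
import random
from collections import defaultdict

def stratified_sample(responses, stratum_key, per_stratum, seed=0):
    """Group responses by stratum and draw the same number from each
    group, so no stratum dominates the calibration set."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for resp in responses:
        strata[stratum_key(resp)].append(resp)
    sample = []
    for key in sorted(strata):
        bucket = strata[key]
        sample.extend(rng.sample(bucket, min(per_stratum, len(bucket))))
    return sample

# Illustrative pool: responses tagged with a length band.
pool = [{"text": f"r{i}", "band": "short" if i % 3 else "long"}
        for i in range(30)]
subset = stratified_sample(pool, lambda r: r["band"], per_stratum=5)
```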

 

2. Have human labelers annotate or score each response.

    • Agree on clear and consistent scoring criteria, ideally as a group.
    • Have SMEs score the responses independently.
    • Calculate inter-annotator agreement using either Cohen’s Kappa (two annotators) [2] or Fleiss’ Kappa (three or more annotators) [3]. Use weighted kappa calculations if required; you may want to penalize large disagreements more severely.
    • Target κ > 0.6; if agreement falls well short of this, a joint discussion or adjudication round may be needed, as there may be severe ambiguity in the questions or the scoring criteria.
    • The SMEs should score responses blind to accompanying information such as tone and length.

Example extended:

| Response | Tone | Length | SME 1 score | SME 2 score | SME 3 score |
| --- | --- | --- | --- | --- | --- |
| ‘This is response A…’ | Clear and concise | 100 | 3 | 2 | 4 |
| ‘This is response Z…’ | Unclear and directionless | 200 | 4 | 3 | 4 |
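Cohen’s Kappa for two annotators needs no dependencies; this sketch implements the unweighted statistic (for weighted variants, `sklearn.metrics.cohen_kappa_score` takes a `weights` argument, and `statsmodels.stats.inter_rater.fleiss_kappa` covers three or more annotators). The scores below are made up:

```python
def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa: (observed - chance) / (1 - chance)."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    labels = set(rater_a) | set(rater_b)
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    p_chance = sum(
        (rater_a.count(l) / n) * (rater_b.count(l) / n) for l in labels
    )
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical 1-5 scores from two SMEs on five responses.
kappa = cohen_kappa([3, 3, 4, 4, 5], [3, 4, 4, 4, 5])
# kappa = 0.6875, above the 0.6 target mentioned earlier
```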

 

3. Create a baseline system prompt and compare with human scores.

    • Create an initial system prompt for the judge.
    • Pass the validation-set responses to the judging LLM and retrieve its scores.
    • Compare the LLM judge’s scores to the human-annotated scores with correlation metrics such as Spearman’s rank correlation or the Pearson coefficient.
    • Compare the agreement rate between the SMEs’ consensus score (rounded mean, median, or mode) and the LLM judge’s score, again using the appropriate kappa measure.
    • Conduct line-by-line error analysis of discrepancies between human and LLM judges, and explore whether any systematic bias exists.
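For the correlation step, a dependency-free Spearman implementation (Pearson correlation over average ranks) is enough; with SciPy available, `scipy.stats.spearmanr` does the same job. The example scores are invented:

```python
def _average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    rx, ry = _average_ranks(x), _average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sd_x = sum((a - mx) ** 2 for a in rx) ** 0.5
    sd_y = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sd_x * sd_y)

# Hypothetical SME vs. LLM-judge scores on six responses.
rho = spearman([1, 2, 3, 3, 4, 5], [2, 1, 3, 4, 4, 5])
```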

4. Iterate and improve.

    • Repeat the process by adjusting the system prompt according to deductions made from error analysis and metric observation.

5. Final validation.

    • Once improvement has plateaued, evaluate against the initially withheld test set to confirm that the gains hold.

6. Set and forget.

    • After the final system prompts are set for the LLM judges, leave them constant throughout experimentation to avoid any bias in the evaluation pipeline.

Post-calibration to mitigate bias

Further analysis of alignment

Stress testing alignment

Stress testing serves to rigorously assess whether the alignment between LLM judges and human evaluators remains robust under varying conditions and across different subpopulations. For instance, while an LLM judge may closely match human scores for short responses, it might consistently misjudge longer ones. If the dataset is dominated by short responses, this can artificially inflate overall correlation metrics and obscure critical weaknesses in the evaluator.

  • Stratified agreement analysis: Evaluate human–LLM agreement separately for distinct categories, such as short versus long responses, simple versus complex queries, different tones or writing styles, and diverse content domains. This helps to pinpoint where alignment may falter.
  • Counterfactual perturbations: Introduce minor modifications to the inputs—such as shuffling candidate order, shortening answers, or substituting synonyms—to observe whether the LLM judge’s scores change in a meaningful way. Such tests uncover sensitivity and potential bias in the evaluation process.
  • Permutation tests: Randomly permute answer labels or scoring assignments to ensure that observed patterns of alignment are not artifacts of dataset structure or chance.
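The permutation-test bullet can be prototyped in a few lines: shuffle the human labels repeatedly and measure how often chance alone matches the observed human-LLM agreement. The scores here are synthetic:

```python
import random

def permutation_p_value(human, llm, n_permutations=2000, seed=0):
    """P-value for H0: the observed human-LLM agreement is merely an
    artifact of the label distribution (i.e. chance)."""
    rng = random.Random(seed)
    def agreement(a, b):
        return sum(x == y for x, y in zip(a, b)) / len(a)
    observed = agreement(human, llm)
    shuffled = list(human)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if agreement(shuffled, llm) >= observed:
            at_least_as_good += 1
    return at_least_as_good / n_permutations

# Synthetic scores: the LLM judge matches the human on every item,
# so the observed agreement should be far beyond chance.
human_scores = [1, 2, 3, 4, 5] * 4
p = permutation_p_value(human_scores, list(human_scores))
```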

When stress testing exposes deficiencies in the current evaluator, the next step is to iteratively refine the LLM judge to achieve stronger and more consistent alignment with human judgment.

Statistical validation of improvements

It is essential to confirm that any observed improvements are substantive and not merely statistical artifacts. This is where robust statistical validation is critical:

  • Paired significance tests: Use methods such as paired t-tests or Wilcoxon signed-rank tests to compare human–LLM deviations before and after calibration, ensuring that improvements are statistically supported.
  • Multiple testing corrections: Apply procedures like the Benjamini–Hochberg correction when evaluating numerous metrics or subgroups, reducing the risk of false positives.
  • Confidence intervals: Report confidence intervals for agreement estimates to quantify uncertainty and avoid overinterpreting marginal differences.
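For the confidence-interval bullet, a percentile bootstrap over the exact-agreement rate is a minimal dependency-free sketch; the score lists are invented:

```python
import random

def bootstrap_agreement_ci(human, llm, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the human-LLM exact-agreement rate."""
    rng = random.Random(seed)
    n = len(human)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        stats.append(sum(human[i] == llm[i] for i in idx) / n)
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Invented scores: the judge agrees with the human on 8 of 10 items.
human_scores = [3, 4, 2, 5, 1, 3, 4, 2, 5, 1]
llm_scores   = [3, 4, 2, 5, 1, 3, 4, 1, 5, 2]
low, high = bootstrap_agreement_ci(human_scores, llm_scores)
```

A wide interval here is exactly the "marginal difference" warning the bullet describes: with only ten items, the point estimate of 0.8 is highly uncertain.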

By systematically applying these practices, stakeholders gain clearer, data-driven assurances regarding the reliability of the LLM judge and the durability of any measured improvements. This approach ensures that alignment is not only statistically sound but also meaningful and stable across all relevant scenarios.

Regression to test for bias

Once the LLM judge is calibrated to align with human preferences and its performance meets our evaluation criteria, the final step is to quantify the presence and magnitude of residual biases that may persist.

Just like humans, LLM judges can exhibit systematic biases. Prior research has shown that LLM judges often display consistent patterns of favoritism when evaluated across many examples. Three well‑documented types of bias include:

  • Positional bias – Preferring the first or second option in a comparison, independent of quality.
  • Verbosity bias – Favoring longer answers over shorter, more concise ones.
  • Self‑bias – Giving higher scores to responses generated by the same model family as the judge.

Understanding these systematic effects allows us to diagnose limitations in the evaluator and iteratively reduce bias during further tuning. While aligning an LLM judge with human preferences is essential, achieving lower bias makes the evaluator even more reliable.

A straightforward yet powerful method to measure bias is regression modelling. Consider verbosity bias as an example.

Take the example of building an agent that we want to evaluate on how well it answers specific questions, compared against a standard out-of-the-box LLM. We would like to know whether our agent produces answers of more substance. However, we know in advance that LLM judges tend to favor longer answers. So how do we ensure that our LLM judge isn’t biased toward our agent simply because it gives longer answers? We can control for that confounding influence by using linear regression.

Score = β₀ + β₁·Agent + β₂·LengthNormalized + ε

This formulation lets us isolate the separate effects of who generated the answer and how long the answer is, while holding all other factors fixed. All other bias types can be modelled in a similar fashion, given the standard assumptions of linear regression apply.

The sign, magnitude, and statistical significance of the regression coefficients quantify the extent of each bias. For instance, a large positive and statistically significant β₂ in this specification would indicate strong verbosity bias.
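Assuming the standard OLS setup, the regression can be fit via the normal equations; the toy data below is constructed so the true coefficients are known (β₀ = 2, β₁ = 1, β₂ = 0.5). In practice a library such as `statsmodels` would also report the significance of each coefficient:

```python
def fit_ols(X, y):
    """Solve the normal equations (X'X)b = X'y by Gauss-Jordan elimination.
    Each row of X is [1, agent_indicator, normalized_length]."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    aug = [r[:] + [b] for r, b in zip(xtx, xty)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

# Toy data with a known generating process: score = 2 + 1*agent + 0.5*length.
rows = [(0, 0), (0, 2), (0, 4), (1, 0), (1, 2), (1, 4)]
X = [[1, agent, length] for agent, length in rows]
y = [2 + 1 * agent + 0.5 * length for agent, length in rows]
beta0, beta_agent, beta_length = fit_ols(X, y)
```

Because length enters the model explicitly, `beta_agent` estimates the agent effect at a fixed answer length, which is exactly the deconfounding the paragraph describes.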

Bringing statistical rigor into practice with Microsoft Foundry

A lot of this work can be repeated easily using Microsoft Foundry and the open-source package judgesync.

In practice, statistical validation is most valuable when it is tightly integrated into the evaluation workflow. Microsoft Foundry natively supports paired statistical testing, enabling developers to directly quantify pre‑ and post‑calibration improvements.

Especially useful is Microsoft Foundry’s evaluation cluster analysis feature (currently in public preview). It helps you understand and compare evaluation results by grouping samples with similar patterns, making it easier to surface alignment gaps where LLM judges diverge from human evaluators across response lengths, complexity, and content styles; such gaps are often hidden by aggregate metrics.


References

[1] Zheng, L., Chiang, W.-L., Sheng, Y., Zhuang, S., Wu, Z., Zhuang, Y., Lin, Z., Li, Z., Li, D., Xing, E. P., Zhang, H., Gonzalez, J. E., & Stoica, I. (2023). Judging LLM-as-a-judge with MT-Bench and Chatbot Arena [arXiv preprint]. arXiv. https://doi.org/10.48550/arXiv.2306.05685

[2] Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1), 37–46. https://doi.org/10.1177/001316446002000104

[3] Fleiss, J. L., & Cohen, J. (1973). The equivalence of weighted kappa and the intraclass correlation coefficient as measures of reliability. Educational and Psychological Measurement, 33(3), 613–619. https://doi.org/10.1177/001316447303300309


🚀 The Great Foundry Shift: Microsoft Foundry New vs Classic Explained


Introduction

If you've been working in the Azure AI ecosystem, you've likely noticed a seismic shift happening at ai.azure.com. What was once Azure AI Studio, then became Azure AI Foundry, and has now been rebranded as Microsoft Foundry — but the rebrand is the least interesting part. The architecture, portal experience, agent capabilities, and developer workflows have been fundamentally redesigned.

Microsoft now ships two portal experiences side by side — Foundry (New) and Foundry (Classic) — accessible via a toggle in the portal banner. But this isn't just a UI facelift. Under the hood, the resource model, project hierarchy, agent framework, tool ecosystem, and governance surface have all changed in meaningful ways.

This article breaks down every major difference across every layer of the stack so you can make an informed decision about when to migrate, what you'll gain, and what still requires the Classic experience.

1. The Branding & Naming Evolution

| Era | Name | Portal URL |
| --- | --- | --- |
| Pre-2024 | Azure AI Studio | ai.azure.com |
| Mid-2024 to Late 2025 | Azure AI Foundry | ai.azure.com |
| 2026 (Current) | Microsoft Foundry | ai.azure.com |

💡 Key Point: "Azure AI Foundry" is now "Microsoft Foundry." Screenshots across Microsoft Learn documentation are still being updated. Both portals live at the same URL — a toggle in the top banner lets you switch between (New) and (Classic).

2. Portal Philosophy: When to Use Which

| Portal | Best For |
| --- | --- |
| Foundry (Classic) | Working with multiple resource types: Azure OpenAI resources, Foundry resources, hub-based projects, and Foundry projects. Use this when you need features not yet available in the new experience (e.g., Prompt Flow, open-source model deployments on managed compute, Azure ML workloads). |
| Foundry (New) | A streamlined, agent-first experience for building, managing, and scaling multi-agent applications. Only Foundry projects are visible. Hub-based projects, Azure OpenAI standalone resources, and legacy project types are not shown. |

The new portal is not a superset of the classic portal — it is a focused reimagining for the agentic AI era. If you need legacy ML capabilities, you still switch back to Classic.

3. Resource Architecture & Project Hierarchy

Foundry (Classic): The Hub-Based Model

Classic supports two project types:

  • Hub-Based Projects — Built on the Microsoft.MachineLearningServices resource provider. A "Hub" (Azure AI Hub) acts as the parent resource, and projects are children of that hub. Hubs required provisioning Azure Storage and Azure Key Vault as mandatory sibling resources, with Azure Container Registry as an optional third.
  • Foundry Projects — A newer project type introduced under the Microsoft.CognitiveServices provider. These are child resources of a Foundry Resource (kind: AIServices).

Foundry (New): One Project Type to Rule Them All

The new portal only surfaces Foundry Projects. The resource hierarchy is simplified:

Foundry Resource (Microsoft.CognitiveServices/account, kind: AIServices)
  └── Foundry Project (Microsoft.CognitiveServices/account/project)
        └── Project Assets (agents, evaluations, files, indexes)

| Aspect | Classic (Hub-Based) | New (Foundry Project) |
| --- | --- | --- |
| Resource Provider | Microsoft.MachineLearningServices | Microsoft.CognitiveServices |
| Parent Resource | AI Hub | Foundry Resource (AIServices) |
| Required Sibling Resources | Storage Account, Key Vault (mandatory) | None required by default |
| Project Isolation | Via Hub RBAC | Native project-level RBAC |
| Agent Service GA | Preview only | General Availability |
| Foundry SDK & API | Limited | Full support |
| ML Training (AutoML, Pipelines) | Yes | No (use hub-based project) |
| Prompt Flow | Yes | No |
| Managed Compute (HuggingFace) | Yes | No |

💡 Critical Takeaway: New generative AI and model-centric features are available only through the Foundry Resource and its Foundry projects. Hub-based projects will not receive new agent or model features.

4. Resource Provider Unification

One of the most impactful architectural changes is the consolidation under the Microsoft.CognitiveServices provider namespace:

| Resource | Provider | Kind |
| --- | --- | --- |
| Microsoft Foundry | Microsoft.CognitiveServices/account | AIServices |
| Foundry Project | Microsoft.CognitiveServices/account/project | AIServices |
| Azure Speech | Microsoft.CognitiveServices/account | Speech |
| Azure Language | Microsoft.CognitiveServices/account | Language |
| Azure Vision | Microsoft.CognitiveServices/account | Vision |

This means:

  • Unified RBAC: The same Azure RBAC actions work across Foundry, Azure OpenAI, Speech, Vision, and Language.
  • Unified Azure Policy: Existing custom Azure Policies continue to apply if you're upgrading from Azure OpenAI to Foundry.
  • Unified Networking: Private Link, VNet configuration, and network isolation share the same management patterns.

In contrast, hub-based projects under Microsoft.MachineLearningServices had a completely separate RBAC model, networking stack, and policy surface.

5. Security & Governance: Separation of Concerns

Foundry (New) — Clear Control Plane vs. Data Plane Separation

Layer

Scope

Who

Examples

Control Plane

Foundry Resource (top-level)

IT Admins

Create deployments, configure networking, manage projects, set encryption

Data Plane

Foundry Project (child)

Developers

Build agents, run evaluations, upload files, test in playground

This means IT can set up governance once at the resource level, and developers can self-serve by creating projects as isolated workspaces without needing admin intervention.

Starter RBAC Assignments:

  • Azure AI User for each developer at the Foundry Resource scope
  • Azure AI User for each project managed identity at the Foundry Resource scope

Foundry (Classic) — Hub-Centric Governance

In Classic, hub-based projects relied on the Hub as the governance boundary. This worked but required IT to manage the Hub and its dependent resources (Storage, Key Vault). Foundry projects under Classic had the same new governance model as above, but the portal experience was merged with hub-based projects, adding confusion.

6. Agent Service: The Biggest Leap Forward

The Foundry Agent Service is arguably the reason Microsoft rebuilt the portal experience. Here's how agent capabilities differ:

Foundry (Classic) — Agent Service in Preview

  • Agents available in preview only within hub-based projects
  • Agents in GA within Foundry projects (accessed through Classic portal)
  • Single-agent interactions primarily
  • Limited tool selection
  • Basic observability
  • Required connection strings for SDK authentication

Foundry (New) — Agent Service, Fully Realized

a) Multi-Agent Orchestration & Workflows

Build advanced automation with the visual workflow builder using SDKs for C# and Python. Supports:

  • Sequential workflows — Agent A → Agent B → Agent C in defined order
  • Group Chat — Dynamic control passing between agents based on context
  • Human-in-the-loop — Approval steps, clarifying questions mid-workflow
  • Visual YAML editor — Edit workflows visually or in YAML; changes sync in real-time
  • Power Fx integration — Excel-like formulas for conditional logic, variable handling, data transformations
  • Versioning — Every save creates an immutable version with full history

b) Agent Types

| Type | Kind | Description |
| --- | --- | --- |
| Prompt Agent | prompt | LLM-backed agent defined declaratively with model config, instructions, tools, and prompts |
| Hosted Agent | hosted | Containerized agent running custom code, deployed and managed by Foundry |
| Workflow | YAML-based | Orchestrates multiple agents together using agentic patterns |

c) Memory (Preview — New Only)

Long-term agent memory is a brand-new capability:

  • User Profile Memory: Stores preferences, dietary restrictions, language preferences — persists across sessions
  • Chat Summary Memory: Distilled summaries of conversation topics for cross-session continuity
  • Three Phases: Extraction → Consolidation → Retrieval
  • Memory search tool or low-level Memory Store APIs
  • Supports up to 10,000 memories per scope, 100 scopes per store

d) Foundry IQ — Knowledge Integration (Preview — New Only)

A managed, multi-source knowledge base for enterprise content:

  • Connects to Azure Blob Storage, SharePoint, OneLake, and public web
  • Automated document chunking, vector embedding generation, metadata extraction
  • Agentic retrieval engine: Decomposes complex questions into subqueries, executes in parallel, semantically reranks, returns unified responses with citations
  • Permission-aware: Synchronizes ACLs, honors Microsoft Purview sensitivity labels, enforces permissions at query time
  • One knowledge base can serve multiple agents

e) Expanded Tool Catalog (Preview — New Only)

The new portal introduces a Foundry Tool Catalog with 1,400+ tools:

| Tool Category | Examples |
| --- | --- |
| Built-in | Azure AI Search, Code Interpreter, File Search, Grounding with Bing, Image Generation, Computer Use, SharePoint, Microsoft Fabric, Browser Automation, Web Search |
| MCP Servers (Remote) | Publisher-hosted servers using Model Context Protocol |
| MCP Servers (Local) | Self-hosted MCP servers connected to Foundry |
| Custom | OpenAPI 3.0 specs, Agent-to-Agent (A2A) endpoints, custom MCP endpoints |
| Private Catalog | Organization-scoped tools visible only to your team |

Classic had a much smaller tool surface: primarily Azure AI Search, File Search, Code Interpreter, and custom functions.

f) Integration & Publishing Capabilities

The new portal supports:

  • Publish agents to Microsoft 365, Teams, and BizChat
  • Containerized deployments for portability
  • Open protocol support: MCP and A2A with full authentication
  • AI Gateway integration (Azure API Management)
  • Azure Policy integration for agent governance

7. Model Deployment: What Changed?

The model deployment story is shared across both portals, but the new portal streamlines the experience significantly.

Deployment Types (Available in Both)

| Deployment Type | SKU | Data Processing | Billing |
| --- | --- | --- | --- |
| Global Standard | GlobalStandard | Any Azure region | Pay-per-token |
| Global Provisioned | GlobalProvisionedManaged | Any Azure region | Reserved PTU |
| Global Batch | GlobalBatch | Any Azure region | 50% discount, 24-hr target |
| Data Zone Standard | DataZoneStandard | Within data zone (EU/US) | Pay-per-token |
| Data Zone Provisioned | DataZoneProvisionedManaged | Within data zone | Reserved PTU |
| Data Zone Batch | DataZoneBatch | Within data zone | 50% discount |
| Standard (Regional) | Standard | Single region | Pay-per-token |
| Regional Provisioned | ProvisionedManaged | Single region | Reserved PTU |
| Developer | DeveloperTier | Any Azure region | Fine-tuned eval only, no SLA |
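The deployment-type-to-SKU mapping in the table above is what you pass programmatically (e.g., in an ARM/Bicep template's `sku.name`). A small lookup helper captures it — the SKU strings come from the table; the dictionary and function names are ours:

```python
# Deployment type -> SKU name, per the table above (helper names are illustrative).
DEPLOYMENT_SKUS = {
    "Global Standard": "GlobalStandard",
    "Global Provisioned": "GlobalProvisionedManaged",
    "Global Batch": "GlobalBatch",
    "Data Zone Standard": "DataZoneStandard",
    "Data Zone Provisioned": "DataZoneProvisionedManaged",
    "Data Zone Batch": "DataZoneBatch",
    "Standard (Regional)": "Standard",
    "Regional Provisioned": "ProvisionedManaged",
    "Developer": "DeveloperTier",
}

def sku_for(deployment_type: str) -> str:
    """Return the SKU string for a given deployment type."""
    try:
        return DEPLOYMENT_SKUS[deployment_type]
    except KeyError:
        raise ValueError(f"Unknown deployment type: {deployment_type}") from None

print(sku_for("Data Zone Batch"))  # DataZoneBatch
```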

Key Differences by Portal

| Capability | Classic | New |
| --- | --- | --- |
| Models sold directly by Azure (Azure OpenAI, DeepSeek, xAI) | Via connections in hub-based; native in Foundry projects | Native |
| Partner/Community Models (via Marketplace) | Via connections in hub-based; native in Foundry projects | Native |
| Models on Managed Compute (HuggingFace etc.) | Hub-based projects only | Not supported |
| Serverless API Endpoints | Hub-based projects | Standard deployment only |
| Model Catalog Browsing | Available without sign-in | Requires project context |

8. Observability & Monitoring

| Capability | Classic | New |
| --- | --- | --- |
| Azure Monitor metrics | Scoped to resource level | Scoped to resource + project level |
| Application Insights integration | Manual setup | Built-in for Agent Service |
| Conversation-level tracing | SDK-based (manual) | Real-time in portal with built-in metrics |
| Evaluation workflows | Available (preview) | Available with continuous evaluation via Python SDK |
| Agent monitoring dashboard | Not available | Built-in "Operate" section |
| Model tracking | Basic | Enhanced with centralized AI asset management |

The new portal introduces a dedicated "Operate" section for centralized AI asset management — agents, models, and tools in one place. You can register agents from other clouds, get alerts when agents or models need attention, and manage fleet health at scale.

9. SDK & API Experience

| Aspect | Classic | New |
| --- | --- | --- |
| Authentication | Connection strings | Project endpoint + DefaultAzureCredential |
| SDK | azure-ai-projects (preview) | azure-ai-projects (GA), unified Foundry SDK |
| Languages | Python, C# (GA); JS/TS, Java (preview) | Same |
| API Surface | Foundry API (limited for hub-based) | Full Foundry API — agents, evaluations, models, indexes, data |
| VS Code Extension | Available | Available (enhanced) |

Migration Example

Classic (Hub-Based, Preview SDK):

# Classic: authenticate with a project connection string
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient

client = AIProjectClient.from_connection_string(
    conn_str="your_connection_string",
    credential=DefaultAzureCredential()
)

New (Foundry Project, GA SDK):

from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient

project = AIProjectClient(
    endpoint="your_project_endpoint",
    credential=DefaultAzureCredential()
)

10. Data Storage & Encryption

| Feature | Classic | New |
| --- | --- | --- |
| Default Storage | Microsoft-managed (logical separation) | Microsoft-managed (logical separation) |
| Bring Your Own Storage | Supported | Supported |
| BYOS for Agent State | Standard setup available | Standard setup available |
| Customer-Managed Key Encryption | Supported (FIPS 140-2, 256-bit AES) | Supported (same) |
| Bring Your Own Key Vault | Supported | Supported |
| BCDR for Agents | Customer-provisioned Cosmos DB | Customer-provisioned Cosmos DB |

The storage layer is largely unchanged between portals — both support the same bring-your-own patterns for compliance.

11. Networking & Private Access

Feature

Classic

New

Private Link

Fully supported

Supported (some limitations for end-to-end isolation)

VNet Integration (Container Injection)

Supported

Supported

End-to-End Network Isolation

Fully supported (SDK, CLI, Portal)

Partially supported — use Classic, SDK, or CLI for fully isolated deployments

💡 Important: If you require end-to-end network isolation in production, Microsoft currently recommends using the Classic experience, SDK, or CLI until the new portal reaches full parity.

12. Navigation & UX Differences

Classic Portal

  • Left pane navigation organized by development stages: Define & Explore → Build & Customize → Observe & Improve
  • Customizable left pane per project, per user (pin/unpin items)
  • Management Center — centralized hub for projects, quotas, permissions, usage metrics
  • Breadcrumb navigation showing project type (Hub vs. Foundry)
  • Supports browsing model catalog without signing in

New Portal

  • Top menu bar with Build/Operate sections
  • Project selector in upper-left corner — switch between recently used projects
  • "Operate" section for centralized AI asset management (agents, models, tools)
  • Streamlined navigation with redesigned interface
  • Faster load times with dynamic prefetching
  • Only shows default project per Foundry resource; "View all resources" opens Classic portal

13. Foundry Local — Runs Everywhere

A capability that works across both experiences: Foundry Local lets you run LLMs on your own device for free. It integrates with inference SDKs, supports HuggingFace model compilation, and provides a local development loop.

14. Feature Availability Matrix (Comprehensive)

| Feature | Classic — Hub-Based | Classic — Foundry Project | New — Foundry Project |
| --- | --- | --- | --- |
| Agents (GA) | Preview only | GA | GA |
| Multi-Agent Workflows | No | No | Yes |
| Memory (Long-Term) | No | No | Yes (Preview) |
| Foundry IQ Knowledge Base | No | No | Yes (Preview) |
| Tool Catalog (1,400+ tools) | No | No | Yes (Preview) |
| MCP & A2A Protocol Support | No | No | Yes |
| Publish to M365/Teams/BizChat | No | No | Yes |
| Centralized AI Asset Mgmt | No | No | Yes |
| Models (Azure OpenAI, etc.) | Via connections | Native | Native |
| Models on Managed Compute | Yes | No | No |
| Prompt Flow | Yes | No | No |
| AutoML / ML Pipelines | Yes | No | No |
| Evaluations | Yes | Yes (Preview) | Yes (Preview) |
| Playgrounds | Yes | Yes | Yes |
| Content Understanding | Yes | Yes | Yes |
| Fine-Tuning | Yes | Yes | Yes |
| Datasets & Indexes | Yes | Yes | Yes |
| Full Foundry SDK & API | Limited | Full | Full |
| E2E Network Isolation | Full | Full | Partial |
| RBAC (Resource + Project) | Hub-level | Resource + Project | Resource + Project |
| Azure Policy Integration | Yes | Yes | Enhanced |
| Disable Preview Features | RBAC or Tags | RBAC or Tags | RBAC or Tags |

15. Migration Path: Classic → New

Microsoft provides a clear migration guide:

  • Step 1: Locate your existing Foundry resource (the AIServices kind resource created alongside your hub)
  • Step 2: Create a new Foundry project under that resource
  • Step 3: What transfers automatically: Model deployments, data files, fine-tuned models, assistants, vector stores
  • Step 4: What doesn't transfer: Preview agent state (threads, messages, files), open-source model deployments, hub project access
  • Step 5: Update SDK code — replace connection strings with project endpoints
  • Step 6: Optionally recreate connections for tools and data sources
  • Step 7: Optionally clean up hub-based projects (keep them if you still need ML training or Prompt Flow)

Estimated migration time: 5–10 minutes for project creation; additional time for agent code migration depending on complexity.

16. When Should You Move to Foundry (New)?

Move now if:

  • You're building agentic applications and want GA Agent Service
  • You need multi-agent orchestration (workflows, sequential, group chat)
  • You want the Tool Catalog, Memory, or Foundry IQ
  • You want to publish agents to Microsoft 365/Teams
  • You're starting greenfield and want simplified governance

Stay on Classic (for now) if:

  • You depend on Prompt Flow for orchestration
  • You deploy open-source models on managed compute (HuggingFace, etc.)
  • You need Azure Machine Learning features (AutoML, ML Pipelines, training)
  • You require fully isolated end-to-end networking in the portal (not just SDK/CLI)
  • You have extensive hub-based project investments you're not ready to migrate

Final Thoughts

Microsoft Foundry (New) isn't just a portal redesign — it's a platform pivot from "AI Studio for building chatbots" to "the enterprise AI agent factory." The introduction of multi-agent workflows, long-term memory, Foundry IQ knowledge bases, a 1,400+ tool catalog with MCP/A2A support, and centralized fleet management represents a generational leap.

But it's also an honest work-in-progress. Network isolation parity, Prompt Flow, and managed compute for open-source models are still reasons to keep the Classic experience bookmarked. The good news is that both portals coexist at the same URL, and switching between them is a single toggle.

The direction is clear: Foundry (New) is the future. Start building there. Fall back to Classic only when you must.


Note: All information sourced from Microsoft Learn documentation as of March 2026. Feature availability may change as Microsoft continues updating both portal experiences.
