Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

Microsoft Foundry on Windows Server


As organizations embrace AI, new opportunities exist for Windows Server customers who want to leverage on-premises AI. While Azure remains the best place for cutting-edge models and AI inference hardware accelerators, certain industries, such as healthcare, finance, manufacturing, and retail, require on-premises AI to improve and accelerate existing business workflows.

Microsoft Foundry on Windows helps harness the power of AI on existing server deployments. It includes Foundry Local and Windows ML, which enable server customers to build local AI experiences and perform real-time inferencing.

Leveraging AI on your own infrastructure gives you control over data residency, compliance, and latency.

This blog details how Microsoft Foundry on Windows brings local AI capabilities to Windows Server deployments. It explores why Foundry Local and Windows ML are a strong fit for on-premises AI, highlights technical considerations, and shows how customers can easily build generative AI applications with the Foundry Local catalog or with proprietary models of any type via Windows ML.

Windows Server as a local AI platform

Windows Server 2025 reached general availability (GA) last year and introduced significant enhancements, including advanced storage capabilities, GPU partitioning (GPU-P) and Discrete Device Assignment (DDA) for assigning GPU resources to virtual machines, and massive Hyper-V scalability with support for up to 2,048 vCPUs per Gen2 VM. These capabilities combine to make Windows Server 2025 ideal for AI-intensive workloads. Built to power mission-critical environments where compliance and continuity are non-negotiable, Windows Server offers a robust, enterprise-grade infrastructure that enables AI inferencing on-premises without leaving your datacenter.

Scenarios for On-Premises AI 

Although many organizations are investing in AI on Azure to leverage the latest innovations, we understand there are several situations where on-premises AI capabilities are required. Below are a few examples of such scenarios. 

Healthcare 

Meet regulatory requirements. Maintain Protected Health Information (PHI) and clinical records within your on-premises perimeter to satisfy compliance obligations while enabling AI-powered insights locally.

Finance 

Act on insights instantly. Process financial reports and transaction logs near the source to reduce latency and avoid round trips to external endpoints, ensuring speed and confidentiality. 

Manufacturing 

Operate in disconnected environments. Run AI workflows in air-gapped or intermittently connected plants to support predictive maintenance and quality control without relying on cloud connectivity. 

Retail offices 

Operate in latency-sensitive environments. Run AI models for basic inferencing to improve point-of-sale efficiency and deliver personalized experiences. 

Technical Snapshot 

Microsoft Foundry on Windows supports a two-pronged approach to make the Windows Server platform AI-ready:

Windows ML enables application service owners to introduce AI workflows or inferencing within existing server applications. It automatically identifies available processors (CPU or GPU) based on the server hardware, downloads the optimal execution providers (EPs), and allows the application to use AI models locally. Windows ML uses ONNX Runtime under the hood, ensuring compatibility with popular frameworks and optimized execution providers.
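Because Windows ML uses ONNX Runtime under the hood, the basic flow, selecting an execution provider and running an ONNX model locally, looks roughly like the sketch below. It uses the standalone onnxruntime Python package purely for illustration; the Windows ML API itself is exposed to Windows applications under different names, and the model path and input shape here are hypothetical.

# Illustrative sketch of the ONNX Runtime flow that Windows ML builds on.
# The model file and input shape are hypothetical placeholders.
import numpy as np
import onnxruntime as ort

# Prefer a GPU execution provider when one is available, otherwise fall back to CPU.
available = ort.get_available_providers()
providers = [ep for ep in ("CUDAExecutionProvider", "DmlExecutionProvider") if ep in available]
providers.append("CPUExecutionProvider")

session = ort.InferenceSession("model.onnx", providers=providers)
input_name = session.get_inputs()[0].name
sample = np.random.rand(1, 3, 224, 224).astype(np.float32)  # example image-shaped input
outputs = session.run(None, {input_name: sample})
print(outputs[0].shape)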

Foundry Local enables seamless discovery, download, and orchestration of AI models directly on Windows Server, including support for hardware acceleration on servers with GPUs. It also streamlines deployment of foundation models on virtual machines that use GPU-P, ensuring hardware isolation and optimized resource sharing for compliance-sensitive environments.

The Foundry model catalog will continue to evolve with more models and APIs, such as support for embedding models.

Simple steps to get started! 

  1. Onboard Foundry Local on your existing server infrastructure: Install Foundry Local on Windows Server 2025  
  2. Identify a practical use case for AI inferencing: Start with a simple scenario, such as summarizing reports or translating content into local languages.
  3. Pilot with existing prebuilt models in the catalog for rapid results. Validate performance and compatibility with your hardware. 
  4. Integrate with existing workflows: Connect inference endpoints to your current applications or automation pipelines. Keep data local while enhancing processes with AI insights. Foundry Local provides an SDK, a Command Line Interface (CLI), and a REST API for easy integration into existing workflows and applications (see the sketch after this list).
  5. Measure performance: Track latency, throughput, and resource utilization to optimize deployment. Use these insights to fine-tune and iterate.  
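As a rough illustration of steps 3 and 4, the sketch below pairs the Foundry Local Python SDK with the standard OpenAI client to call the local, OpenAI-compatible endpoint. It follows the published SDK pattern, but treat it as a hedged starting point: the model alias is only an example from the catalog, and package and property names may differ across SDK versions.

# Minimal sketch, assuming Foundry Local is installed on the server and the
# foundry-local-sdk and openai Python packages are available.
# "phi-3.5-mini" is just an example catalog alias.
from foundry_local import FoundryLocalManager
from openai import OpenAI

alias = "phi-3.5-mini"
manager = FoundryLocalManager(alias)  # starts the local service and downloads the model if needed

client = OpenAI(
    base_url=manager.endpoint,  # local OpenAI-compatible endpoint
    api_key=manager.api_key,    # key for the local service
)

response = client.chat.completions.create(
    model=manager.get_model_info(alias).id,  # resolve the alias to the local model id
    messages=[{"role": "user", "content": "Summarize this maintenance report in three bullet points."}],
)
print(response.choices[0].message.content)

The same local endpoint can also be called directly over REST from existing applications, which keeps prompts and data on the server.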

Deep dive: Unlock the power of BYOM + Windows ML on Windows Server 

Bring Your Own Model (BYOM): This gives organizations the freedom to choose custom AI models tailored to their domain and business needs. For instance, a manufacturing company might bring a predictive maintenance model trained on its own sensor data to anticipate equipment failures and reduce downtime.  

Windows ML allows proprietary models to run seamlessly on Windows Server. It automatically discovers, downloads, and registers the latest version of all compatible execution providers (EPs). Tools like the AI Toolkit extension for VS Code can be used for model optimization and quantization to prepare models for efficient local execution.
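As a concrete example of that preparation step, the sketch below applies post-training dynamic quantization to an ONNX model with onnxruntime's quantization tooling; the file names are hypothetical, and the AI Toolkit extension offers comparable optimization flows from inside VS Code.

# Minimal sketch: compress an ONNX model's weights to int8 with dynamic quantization
# so it runs more efficiently on server CPUs. File names are hypothetical.
from onnxruntime.quantization import quantize_dynamic, QuantType

quantize_dynamic(
    model_input="predictive_maintenance.onnx",
    model_output="predictive_maintenance.int8.onnx",
    weight_type=QuantType.QInt8,
)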

In summary, with BYOM and Windows ML on Windows Server, customers can deploy custom AI models that provide local inferencing for existing business workloads.

Resources: 

 


Azure SRE Agents with Deepthi Chelupati


How can Azure Site Reliability Engineering Agents help your Azure infrastructure? While at Techorama in Utrecht, Richard chatted with Deepthi Chelupati about the LLM service that helps you build and maintain more reliable applications and infrastructure in Azure. Deepthi talks about monitoring deployment problems, handling errors in production, and application telemetry in operation as areas where Azure SRE Agents can help catch issues earlier and let you respond faster. The conversation dives into the array of capabilities and where you can provide information to administrators so they can decide how to proceed—perhaps pushing information to development or starting a recovery process of some kind. How you use Azure SRE Agents is up to you!

Links

Recorded October 28, 2025





Download audio: https://cdn.simplecast.com/audio/c2165e35-09c6-4ae8-b29e-2d26dad5aece/episodes/4016d1c4-ea56-461c-b600-397bee8b3e97/audio/25633b46-b7c7-42ec-a39b-83ca8cc4488a/default_tc.mp3?aid=rss_feed&feed=cRTTfxcT

Migrating from Microsoft Learn to aspire.dev


When my team (the Aspire team) decided to migrate all Aspire content from Microsoft Learn to the shiny new aspire.dev site, we knew we’d signed up for a marathon. You may have noticed banners atop both Microsoft Learn: Aspire and aspire.dev announcing the migration…

[Image: banner on Microsoft Learn (Aspire) announcing the migration of .NET Aspire documentation to aspire.dev]




Daily Reading List – November 18, 2025 (#668)


Amazing launch day for Gemini 3 and more. Check out more about that, and other interesting happenings.

[blog] A new era of intelligence with Gemini 3. Super overview of the monster set of announcements today.

[blog] Start building with Gemini 3. Text and media are important parts of any modern LLM, but coding is what most folks were watching from us. Logan explains how we’ve stepped up our game for builders.

[blog] Introducing Google Antigravity, a New Era in AI-Assisted Software Development. Speaking of that, this is the most important release of the day (to me). It’s an entirely new way to build modern software.

[blog] Bringing Gemini 3 to Enterprise. We’re good at ensuring that new models are available simultaneously in the places that matter most.

[blog] Gemini 3 brings upgraded smarts and new capabilities to the Gemini app. These are some legit upgrades to the experience, and once again open the door to new use cases.

[blog] 5 things to try with Gemini 3 Pro in Gemini CLI. Fantastic list of new things to try. That fifth use case is eye-opening.

[blog] What Technical Debt Means To IT Professionals. How do you define tech debt? Most realize that it goes far beyond old code. This post digs deeper and offers some ways to escape.

[blog] Only three kinds of AI products actually work. Early days. We’ll go from “doesn’t work” to “works” fairly quickly.

[blog] Rich and dynamic user interfaces with Flutter and generative UI. Use a library that’s connected to what we’re doing with the Gemini app interface.

[blog] How to build your own resume reviewer with Google AI Studio in minutes. There are few things that can one-shot an application better than Google AI Studio.

[blog] Chamber 🏰 of Tech Secrets #55: Issue-Driven Development. When you’re hands-on with these tools (versus simply observing) you tend to develop pragmatic practices. I like how Brian is thinking about directing his agentic workflows.

[blog] On Cursor, Erich Gamma, VS Code forks and the surprising role of the Eclipse Foundation. Open extension ecosystems (outside of VS Code) will play a big part in the upcoming years. While lots of tools depend on VS Code forks, you don’t want to be dependent on Microsoft’s control of an ecosystem.

[blog] Local code meets cloud compute: Using Google Colab in VS Code. This is low-key a big deal. Millions of people use Colab regularly, and this gives them an additional surface to work in.

[article] Go team to improve support for AI assistants. Go is already an excellent choice if you want to generate quality code with AI. Or build agents. We’re going to keep investing there.

Want to get this update sent to you every day? Subscribe to my RSS feed or subscribe via email.




New T-SQL AI Features are now in Public Preview for Azure SQL and SQL database in Microsoft Fabric


At the start of this year, we released a new set of T-SQL AI features for embedding your relational data for AI applications. Today, we have brought those features to Azure SQL and SQL database in Microsoft Fabric.

This post will help you get started using the new AI functions of Azure SQL.

Prerequisites

Set up your environment

The following section guides you through setting up the environment and installing the necessary software and utilities.

Set up the database

The following section guides you through using the embeddings model to create vector arrays on relational data and using the new vector similarity search functionality in Azure SQL and SQL database in Microsoft Fabric.

Create database scoped credentials

Use the following sample code to create a set of database scoped credentials for calling our Azure OpenAI Endpoint and providing the key in the header:

Note: Your endpoint URLs and key will be different from those shown in this blog post.

-- Create a master key for the database
if not exists(select * from sys.symmetric_keys where [name] = '##MS_DatabaseMasterKey##')
begin
create master key encryption by password = N'V3RYStr0NGP@ssw0rd!';
end
go

-- Create the database scoped credential for the Azure OpenAI endpoint
if not exists(select * from sys.database_scoped_credentials where [name] = 'https://azure.cognitiveservices.azure.com/')
begin
create database scoped credential [https://azure.cognitiveservices.azure.com/]
with identity = 'HTTPEndpointHeaders', secret = '{"api-key":"YOUR_AZURE_OPEN_AI_KEY"}';
end
go

Create the EXTERNAL MODEL in the database

1. Using SSMS or VS Code, log in to the database.

2. Open a new query window.

3. Next, run the following SQL to create an EXTERNAL MODEL that points to an Azure OpenAI embedding model (here I'll be using text-embedding-3-small):

Note: Your endpoint URLs will be different from those shown in this blog post.

CREATE EXTERNAL MODEL text3small
WITH (
LOCATION = 'https://azure.cognitiveservices.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2023-05-15',
API_FORMAT = 'Azure OpenAI',
MODEL_TYPE = EMBEDDINGS,
MODEL = 'text-embedding-3-small',
CREDENTIAL = [https://azure.cognitiveservices.azure.com/]
);

Test the EXTERNAL MODEL

To test the embeddings endpoint, run the following SQL:

select AI_GENERATE_EMBEDDINGS(N'test text' USE MODEL text3small);

You should see a JSON vector array returned similar to the following:

[0.1529204398393631,0.4368368685245514,-3.6136839389801025,-0.7697131633758545…

Embed Product Data

This next section of the tutorial will alter the Adventure Works Product table to add a new vector column and a text chunk column.

1. Run the following SQL to add the columns to the Product table:

ALTER TABLE [SalesLT].[Product]
ADD embeddings VECTOR (768),
chunk NVARCHAR (2000);

2. Next, we are going to use the EXTERNAL MODEL and AI_GENERATE_EMBEDDINGS to create embeddings for text we supply as an input.

Run the following code to create the embeddings:

-- create the embeddings
SET NOCOUNT ON;

DROP TABLE IF EXISTS #MYTEMP;

DECLARE @ProductID int;
DECLARE @text NVARCHAR (MAX);

SELECT * INTO #MYTEMP FROM [SalesLT].Product WHERE embeddings IS NULL;

-- seed the loop with the first product that still needs an embedding
SELECT TOP(1) @ProductID = ProductID FROM #MYTEMP;

WHILE @@ROWCOUNT <> 0
BEGIN
    SET @text = (
        SELECT p.Name + ' ' + ISNULL(p.Color, 'No Color') + ' ' + c.Name + ' ' + m.Name + ' ' + ISNULL(d.Description, '')
        FROM [SalesLT].[ProductCategory] c,
             [SalesLT].[ProductModel] m,
             [SalesLT].[Product] p
        LEFT OUTER JOIN [SalesLT].[vProductAndDescription] d
            ON p.ProductID = d.ProductID
            AND d.Culture = 'en'
        WHERE p.ProductCategoryID = c.ProductCategoryID
          AND p.ProductModelID = m.ProductModelID
          AND p.ProductID = @ProductID
    );

    UPDATE [SalesLT].[Product]
    SET [embeddings] = AI_GENERATE_EMBEDDINGS(@text USE MODEL text3small),
        [chunk] = @text
    WHERE ProductID = @ProductID;

    DELETE FROM #MYTEMP WHERE ProductID = @ProductID;

    -- move to the next product; @@ROWCOUNT = 0 ends the loop once the temp table is empty
    SELECT TOP(1) @ProductID = ProductID FROM #MYTEMP;
END

3. Use the following query to see if any embeddings were missed:

SELECT *
FROM SalesLT.Product
WHERE embeddings IS NULL;

4. And use this query to see a sample of the new columns and the data within:

SELECT TOP 10 chunk,
embeddings
FROM SalesLT.Product;

Use VECTOR_DISTANCE

Vector similarity searching is a technique used to find and retrieve data points that are similar to a given query, based on their vector representations. The similarity between two vectors is measured using a distance metric, such as cosine similarity or Euclidean distance. These metrics quantify the similarity between two vectors by calculating the angle between them or the distance between their coordinates in the vector space.

Vector similarity searching has numerous applications, such as recommendation systems, search engines, image and video retrieval, and natural language processing tasks. It allows for efficient and accurate retrieval of similar items, enabling users to find relevant information or discover related items quickly and effectively.
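For reference, the cosine distance used in the examples below is 1 minus the cosine of the angle between the two vectors: distance(a, b) = 1 - (a . b) / (||a|| ||b||). Vectors pointing in the same direction have a distance near 0, while unrelated vectors drift toward 1.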

This section of the tutorial will be using the new function VECTOR_DISTANCE.

VECTOR_DISTANCE

VECTOR_DISTANCE performs an exact K-Nearest Neighbors (KNN) search: the query vector is compared against every stored vector, and TOP(n) with ORDER BY returns the closest matches.

Use the following SQL to run similarity searches using VECTOR_DISTANCE.

declare @search_text nvarchar(max) = 'I am looking for a red bike and I dont want to spend a lot'
declare @search_vector vector(768) = AI_GENERATE_EMBEDDINGS(@search_text USE MODEL text3small);
SELECT TOP(4)
p.ProductID, p.Name , p.chunk,
vector_distance('cosine', @search_vector, p.embeddings) AS distance
FROM [SalesLT].[Product] p
ORDER BY distance;
GO

declare @search_text nvarchar(max) = 'I am looking for a safe helmet that does not weigh much'
declare @search_vector vector(768) = AI_GENERATE_EMBEDDINGS(@search_text USE MODEL text3small);
SELECT TOP(4)
p.ProductID, p.Name , p.chunk,
vector_distance('cosine', @search_vector, p.embeddings) AS distance
FROM [SalesLT].[Product] p
ORDER BY distance;
GO

declare @search_text nvarchar(max) = 'Do you sell any padded seats that are good on trails?'
declare @search_vector vector(768) = AI_GENERATE_EMBEDDINGS(@search_text USE MODEL text3small);
SELECT TOP(4)
p.ProductID, p.Name , p.chunk,
vector_distance('cosine', @search_vector, p.embeddings) AS distance
FROM [SalesLT].[Product] p
ORDER BY distance;


Chunk with embeddings

This section uses the `AI_GENERATE_CHUNKS` function with `AI_GENERATE_EMBEDDINGS` to break a large section of text into smaller, fixed-size chunks and embed each one.

1. First, create a table to hold the text:

CREATE TABLE textchunk
(
text_id INT IDENTITY (1, 1) PRIMARY KEY,
text_to_chunk NVARCHAR (MAX)
);
GO

2. Next, insert the text into the table:

INSERT INTO textchunk (text_to_chunk)
VALUES ('All day long we seemed to dawdle through a country which was full of beauty of every kind. Sometimes we saw little towns or castles on the top of steep hills such as we see in old missals; sometimes we ran by rivers and streams which seemed from the wide stony margin on each side of them to be subject to great floods.'),
('My Friend, Welcome to the Carpathians. I am anxiously expecting you. Sleep well to-night. At three to-morrow the diligence will start for Bukovina; a place on it is kept for you. At the Borgo Pass my carriage will await you and will bring you to me. I trust that your journey from London has been a happy one, and that you will enjoy your stay in my beautiful land. Your friend, DRACULA');
GO

3. Finally, create chunks of text to be embedded using both functions:

SELECT c.*, AI_GENERATE_EMBEDDINGS(c.chunk USE MODEL text3small)
FROM textchunk t
CROSS APPLY
AI_GENERATE_CHUNKS(source = text_to_chunk, chunk_type = N'FIXED', chunk_size = 50, overlap = 10) c

The post New T-SQL AI Features are now in Public Preview for Azure SQL and SQL database in Microsoft Fabric appeared first on Azure SQL Devs’ Corner.


Level up design-to-code collaboration with GitHub’s open source Annotation Toolkit


If you’ve ever been handed a design file and thought, “Wait—what exactly is this supposed to do?” you’re not alone. 

The handoff between designers and developers is one of the most common points where product workflows break down. You are looking at components and trying to figure out what’s interactive, what’s responsive, what happens when text gets bigger. The designer is trying to express something that isn’t directly stated on the canvas. Somewhere in that gap, accessibility considerations get missed. Knowledge walks out the door in lost Slack threads. Then it all comes back later as a bug that could have been prevented if messages weren’t missed or if expectations had been clearer upfront.

GitHub’s accessibility design team ran into this exact problem internally. They looked at their own accessibility audit data and realized something striking: nearly half of accessibility audit issues (48%) could have been prevented if design intent had been better documented upfront by integrating WCAG (Web Content Accessibility Guidelines) considerations directly into annotations. So they built something to fix it. And now they’ve open sourced it.

It’s called the Annotation Toolkit, and it’s a Figma library designed to make the handoff easier. The framework brings structure, clarity, and accessibility-first practices into every design-to-code interaction.

What the Annotation Toolkit is (and isn’t)

At its core, the Annotation Toolkit is a Figma library of stamps (annotations) that you can drop into your designs. Each annotation lets you:

  • Express design intent beyond what’s visually on the canvas.
  • Document accessibility behaviors like responsive reflow or table handling.
  • Guide engineers clearly by linking numbered stamps to descriptions.

Instead of documenting all this in Figma comments (which get lost), Slack threads (which disappear), or scattered one-off clarifications (which nobody can remember later), the annotations live right inside your design file. They’re numbered, they’re portable, and they stay with your work.

Think of it like embedding clarity directly into the handoff.

Why it matters: Accessibility by default

The toolkit was built by GitHub’s accessibility design team specifically so that accessibility considerations aren’t something you bolt on at the end. They’re baked into the design workflow from the start.

Each annotation comes with built-in guidance. Want to mark a table? The toolkit addresses nearly every design-preventable accessibility issue under WCAG guidelines, including things like reflow behavior. Adding an image? It prompts you to document the context so developers can write proper alt text. The toolkit doesn’t just let you document accessibility—it teaches you as you go.

That’s not a small thing. It means developers stop guessing. It means accessibility isn’t a specialist concern anymore, but is part of the conversation from day one.

Real-world application: From pain points to productivity

Before this toolkit, GitHub teams relied on a patchwork of Figma comments, Slack threads, and one-off clarifications. This patchwork approach resulted in knowledge gaps and repeated accessibility oversights.

But now, annotations provide:

  • Clarity at scale: engineers no longer guess at intended behaviors.
  • Consistency across teams: designers, product managers (PMs), and developers all share a common language.
  • Preventative QA: many issues are resolved at the design stage instead of post-build.

Annotations enable Figma to become more than just a canvas. It’s a tool for expressing a much deeper level of information.

@hellojanehere, product manager at GitHub

Tutorial: How to use the Annotation Toolkit

How to get started

You’ve got two paths here, so pick whichever feels easier:

Option 1: From Figma Community (fastest)

  1. Head to the @github profile on Figma (figma.com/@github).
  2. Find the Annotation Toolkit and click the link to duplicate it.
  3. It goes straight to your drafts.
  4. Access the components anytime from your Assets tab.

Option 2: From GitHub (if you want all the docs at once)

  1. Visit github.com/github/annotation-toolkit.
  2. Download the exported Figma file from the repo.
  3. Open it in Figma and duplicate it to your workspace.
  4. Same deal—find components in your Assets tab.

Once you’ve got the toolkit, adding your first annotation is straightforward. Open any design file, drag an annotation stamp into it (say, the Image annotation on a profile picture), and you’ll see a numbered label appear. Pair that number with a description block and write what you need. That’s it. You’ve just documented something that would normally disappear into a Slack thread.

The toolkit comes with design checkpoints, which are basically interactive checklists that keep accessibility top of mind as you work. If you want to go deeper, everything is documented. The repo has tutorials for every annotation type, deep dives on WCAG compliance, and guidance on avoiding common handoff friction. Check it out and contribute back if you find gaps.

The bigger picture

The Annotation Toolkit is a shift in how we think about collaboration. By embedding intent, accessibility, and clarity directly into Figma, GitHub is giving the developer-designer partnership a new foundation.

It’s not about replacing conversations. It’s about making them more meaningful. When intent is clear, work flows faster, and the end result is better for everyone.

The toolkit is actively maintained by GitHub staff and open to contributions. If you spot something that could be better, head over to github.com/github/annotation-toolkit and open an issue. Report bugs, suggest features, or contribute new annotation types. The team is actively looking for feedback on how you’re using it and what’s missing.

👉 Explore the toolkit on Figma at @GitHub or dive into the repository on GitHub. If you want to see it in action first, check out the walkthrough. Try it, contribute, and help shape the future of accessible, collaborative design.

The post Level up design-to-code collaboration with GitHub’s open source Annotation Toolkit appeared first on The GitHub Blog.
