Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

SE Radio 715: Sahaj Garg on Designing for Ambiguity in Human Input


Sahaj Garg, co-founder and CTO of Wispr, a voice-to-text AI that turns speech into polished writing, talks with host Amey Ambade about designing systems for the ambiguity that's inherent in human input (text, voice, multimodal). Sahaj focuses on concrete architectural and training strategies for building robust AI systems. This episode examines the problem of ambiguity: where it shows up, how to build robust systems, personalization, communicating uncertainty, and evaluation. The conversation starts by exploring the difference between inherent and reducible ambiguity; the major categories of ambiguity, including lexical, syntactic, and pragmatic; and the additional sources of ambiguity in voice, such as homophones and accents. Garg details how to make systems robust through model training, including providing additional context and constructing datasets with high-quality annotations. They discuss personalization with a focus on "revealed preferences"—learning from user behavior without explicit feedback—and fighting the tendency of AI writing to "regress to the mean." Finally, they consider how to communicate uncertainty to users without degrading the experience, as well as methods for evaluating ambiguity resolution through offline and online signals.





Download audio: https://traffic.libsyn.com/secure/seradio/715-sahaj-garg-ambiguity-human-input.mp3?dest-id=23379
Read the whole story
alvinashcraft
33 seconds ago
reply
Pennsylvania, USA
Share this story
Delete

Beyond Copilot: AI Agents That Build, Debug and Govern Power Automate Flows

From: Microsoft Developer
Duration: 36:16
Views: 131

What if AI agents could not only suggest Power Automate flows - but actually build, debug, optimize, and govern them for you?

In this episode of The Low Code Revolution show, we go beyond the boundaries of Copilot with real, end‑to‑end demos showing how AI agents interact directly with Power Automate using the Flow Studio MCP (Model Context Protocol) server created by John Liu and Catherine Han.

John and Catherine demonstrate how AI agents powered by Flow Studio MCP can securely access Power Platform environments, understand flow runs, apply enterprise standards and advanced patterns, fix errors, and even implement approval escalation - all using natural language.

✅ Learn more:
🔗 Flow Studio MCP - https://mcp.flowstudio.app
🔗 Flow Studio MCP Guides: Getting Started, Debug, Build, Tools, Copilot Skills - https://learn.flowstudio.app

✅ Chapter Markers:
00:00 - Introduction
01:20 - Background: why Flow Studio MCP server was built for Power Automate
02:54 - Challenges with APIs & flow visibility
06:47 - Demo 1: Listing environments and existing flows
08:42 - Demo 2: Building and testing a complex approval flow
20:22 - Demo 3: Enforcing naming conventions using enterprise standards
23:56 - Demo 4: Approval escalation and timeout patterns
27:25 - Demo 5: Debugging failed flows and proactive error handling
30:34 - Future vision: Copilot Studio + multi‑agent scenarios
31:42 - Where to get Flow Studio MCP
35:27 - Wrap‑up

✅ Resources:
Flow Studio - https://flowstudio.app
Blog - https://johnliu.net/
John Liu LinkedIn - https://www.linkedin.com/in/johnnliu/
Catherine Han LinkedIn - https://www.linkedin.com/in/catherinetzungnihan/


WW 978: Pre-Peated - "Copilot Is for Entertainment Purposes Only"


Julia Liuson is leaving Microsoft. Liuson joined Microsoft in 1992, the same year as CEO Satya Nadella (she worked on Access at first). She helped build the first version of Visual Studio and was the first female corporate vice president at Microsoft. Liuson has been president of Microsoft's Developer Division since 2021. Also, curious about life on the other side of the fence? Paul has a tip for finding games that are optimized for Linux. Plus, Chrome joins the 21st century with vertical tabs and a real reading view. Just be sure to install those anti-tracking extensions.

Windows

  • Microsoft promises more native apps for Windows 11, but... which apps? New apps? Replacements for existing apps?
  • Thanks for making us revisit the web app vs. native app thing yet again, Microsoft
  • Windows 11 version 25H2 is now being pushed to all compatible PCs
  • Compatibility milestone, not a big deal because 24H2/25H2 features are identical, same underlying codebase - but some will complain that Microsoft is "forcing" 25H2 on them
  • Secure Boot certificate notifications are now available so you can see where your PC is at
  • Another month, another emergency Windows Update patch
  • New Dev/Beta builds add Xbox Mode, new haptic effects, etc., plus a new Canary build with features we've seen before
  • Microsoft is taking the Insider Program on the road
  • Component shortages trigger another Raspberry Pi price hike, but also a promise for the future
  • The AMD Ryzen 9 9950X3D2 Dual Edition processor will be available from leading retailers starting Apr. 22 with a retail price of $899

AI

  • Microsoft's terms of service for Copilot say it's for entertainment purposes only. Yes, really.
  • Microsoft AI releases new foundational models for transcription, voice, and images
  • Word on iPhone gets Copilot co-create capabilities - used to be AI Mode, you need a Microsoft 365 Copilot subscription
  • Anthropic has hired away a key AI executive from Microsoft, and what he has to say about the opportunity is interesting
  • Anthropic brings Computer Use to Windows
  • Google: Seriously, we are not training AI with your Gmail
  • Google AI Pro plans now offer 5 TB of cloud storage, yikes

Xbox & gaming

  • Xbox is refreshing the look of achievements on the console
  • Call of Duty: Modern Warfare, more coming to Game Pass this month
  • Was this the best COD ever? In search of greatness
  • Also: Forza Horizon 6 launches May 19 and will be available on Xbox Series X|S, Xbox on PC and Xbox Cloud as an Xbox Play Anywhere title, and playable day one with Xbox Game Pass
  • Xbox will hold FanFest events around the world

Tips & picks

  • Tip of the week: So you want to try gaming on Linux
  • App pick of the week: Google Chrome
  • RunAs Radio this week: Securing AI Agents with Niall Merrigan
  • Brown liquor pick of the week: Corowa Peated Single Barrel 521

Hosts: Leo Laporte, Paul Thurrott, and Richard Campbell

Download or subscribe to Windows Weekly at https://twit.tv/shows/windows-weekly

Check out Paul's blog at thurrott.com

The Windows Weekly theme music is courtesy of Carl Franklin.

Join Club TWiT for Ad-Free Podcasts!
Support what you love and get ad-free audio and video feeds, a members-only Discord, and exclusive content. Join today: https://twit.tv/clubtwit


Download audio: https://pdst.fm/e/pscrb.fm/rss/p/mgln.ai/e/294/cdn.twit.tv/megaphone/ww_978/ARML2762476630.mp3

From 10 Failed Stacks to Production: How a Data Scientist Built a Job Board with Wasp, a Full-stack Framework for the Agentic Era

Note: Hireveld is currently down while Marcel works on a major refactor - but it's real, we swear! It'll be back up soon.

Marcel Coetzee is a data scientist and AI consultant based in South Africa. With a background in actuarial science and data science, he runs his own consultancy. He also builds SaaS products on the side. His latest project, Hireveld, is a job board tackling South Africa's broken hiring market. He built it entirely with Wasp after trying nearly every other stack out there.

Tell us about yourself. How did you end up building web apps as a Python developer?

My path has been a bit unconventional. I started in actuarial science, which involves insurance, mathematical statistics and risk modeling. From there I moved into data science, then data engineering, and eventually into building products. Python has been my main language through all of that.

Today I run my own consultancy doing data engineering and AI work. But I've always wanted to build my own things too, so I started learning the JavaScript ecosystem and working on SaaS products on the side. I'm not a JS native by any means, but with the rise of agentic coding tools, I realized I could finally turn my ideas into real full-stack applications without spending years mastering every corner of Node and React.

What's Hireveld, and what problem are you solving?

Hireveld homepage showing 'Hire without the markup' headline
Hireveld's landing page - hire without the markup

Hiring in South Africa is expensive and opaque. Recruitment agencies take a massive cut of annual salary. The established job boards charge thousands of rands just to post a single listing. And too many roles still get filled through personal connections rather than merit.

I built Hireveld to change that. Employers post for free, applicants get ranked anonymously, and employers pay a flat fee to reveal candidates. It's simple, it's cheap, and it puts merit first. The whole thing runs on Wasp - auth, background jobs for expiring old listings, transactional email, payment integration, the works.

You mentioned trying about 10 different stacks before landing on Wasp. What happened?

Yeah, I went through quite the journey. I started with PocketBase because I liked the idea of owning my code and not being locked into a cloud platform. It's a solid tool, but I quickly realized I needed PostgreSQL for search, background jobs, and a frontend that wasn't stitched together by hand. It just didn't scale to what I was building.

Then I tried Next.js, Nuxt, Svelte - they're decent, but those codebases can grow extremely quickly. As someone who's still relatively new to the JS ecosystem, I'd hit the limits of my knowledge fast. I even tried Django, thinking I'd stick with Python, but it's accumulated so much over the years. Too much magic, too much stuff.

My philosophy is: the projects that succeed expose as few abstractions as possible to the user. I try to keep myself at the highest level of abstraction I can. When I found Wasp on GitHub, the config file clicked for me immediately. You declare what you want - auth, database, jobs, email - and it all works together. I was writing actual product code on day one instead of gluing infrastructure together.
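The declarative config he's describing looks roughly like this. To be clear, this is an illustrative sketch, not Hireveld's actual code: the app name, entities, file paths, and version pin are all made up, and exact syntax varies between Wasp releases:

```wasp
app jobBoard {
  wasp: { version: "^0.14.0" },
  title: "Job Board",
  auth: {
    userEntity: User,
    methods: { usernameAndPassword: {} },
    onAuthFailedRedirectTo: "/login"
  }
}

route JobsRoute { path: "/jobs", to: JobsPage }
page JobsPage {
  component: import { JobsPage } from "@src/pages/JobsPage"
}

job expireListings {
  executor: PgBoss,
  perform: { fn: import { expireOldListings } from "@src/jobs/expire" },
  schedule: { cron: "0 3 * * *" }
}
```

Everything declared here - the auth setup, the page wiring, the scheduled job - is wired together by the Wasp compiler, which is what makes the whole app legible at a glance.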

Don't prioritize the important over the urgent. With other stacks I was spending time on infrastructure decisions that felt important but weren't getting me closer to a product. Wasp let me focus on the urgent thing: shipping.

You're a big advocate for agentic coding. How does Wasp fit into that workflow?

This is where Wasp really shines, and honestly I think more people need to know about it. I've been building Hireveld almost entirely through agentic coding - Claude Code in the terminal - and after trying 10 different things, Wasp is by far the best experience for AI-assisted development.

Here's why: context is the precious commodity. Every line of code in your project takes a chunk of the model's context window. Wasp keeps the codebase tight and small.

The .wasp config file means the AI can understand your entire app's architecture at a glance - your routes, your auth setup, your jobs, your entities. Instead of the agent crawling through hundreds of files trying to figure out how things connect, it's all declared in one place. And because Wasp is opinionated and constrained, the agent doesn't try to do 50 different things. When something is wrong, the compiler screams. That tight feedback loop is exactly what agentic coding needs.

Wasp respects the model's context length. It keeps things tidy. The constraint is the feature - it's what keeps both you and the AI from spiraling into a 20,000-line mess.

I should say - I'm not blindly vibe coding. I know where my files are, I know my routes, I hand-edit the main.wasp file when I need to. I take testing seriously, both e2e and unit tests. QA is the layer where you, the human, decide what you actually want to build. But Wasp gives me the structure to stay at a high level and be productive, even as someone whose main language is Python. I also bring my data science background to bear by simulating data to gauge how the system would react to real traffic.

You also contributed back to Wasp - tell us about the Microsoft Auth integration.

Hireveld job search showing filters and a Junior Web Developer listing
Hireveld's job search interface

Hireveld targets the South African enterprise market, and enterprises run on Microsoft. They need Entra ID (Azure AD) for single sign-on - it's non-negotiable. When I started building, Wasp didn't have a Microsoft OAuth provider. With most frameworks, that would mean either paying a fortune for a third-party service or building a fragile custom integration that becomes tech debt.

But Wasp's codebase is approachable enough that I could build the provider myself and contribute it back. The PR process was great - Carlos and the team were welcoming and helpful. That's the sweet spot I was looking for: a framework that's batteries-included enough that I'm not rebuilding auth from scratch, but open enough that when I need something custom, I can add it without fighting the framework.

The community in general has been one of the best parts. The developers are genuinely friendly, my contributions felt valued, and I can tell the team takes agentic coding seriously - they maintain a Claude Code skill, they keep their prompts updated, they engage with the tooling ecosystem. That's an unusual level of involvement for a framework team.

What would you say to a developer considering Wasp for their next project?

If you're building a full-stack web app in 2026 and you're using AI tools to code - which you should be - try Wasp. Seriously. I went through PocketBase, Next.js, Nuxt, Svelte, Django, and more. Wasp is the only one where I felt like I was building my product from day one instead of fighting my tools.

It gave me auth, type-safe full-stack operations, background jobs, and transactional email - all wired together from a single config file. Everything else - the ranking algorithm, payments, file storage - I built on top of what Wasp provided. That separation is what made it possible to ship as a solo developer.

And if you're coming from Python or another ecosystem and you're intimidated by the JavaScript world - don't be. Wasp abstracts away enough of the complexity that you can stay at a high level and be productive. I'm proof of that.

Wasp is the full-stack framework for the agentic era. It's the one that lets you focus on what you're building, not how you're building it.


Marcel Coetzee is a data scientist, AI consultant, and SaaS builder based in South Africa. You can find him on GitHub and reach him at coetzee.marcel2@gmail.com.


The WebCodecs Handbook: Native Video Processing in the Browser


If you've ever tried to process video in the browser, like for a video editing or streaming app, your options were either to process video on a server (expensive) or to use ffmpeg.js (clunky). With the WebCodecs API, there's now a better way to do this.

WebCodecs is a relatively new API that allows browser applications to process video efficiently with very low-level control.

In the past, if you wanted to build, say, a video-editing app or live-streaming studio or anything that required 'heavy lifting', you needed to build a native desktop application. Many SaaS tools like Canva got around this with server-side video processing, which provided a much better UX, but which is much more complex and expensive.

With WebCodecs, it's now possible to build these apps entirely in the browser, without requiring users to download and install software, and without expensive, complex server infrastructure.

This isn't theoretical. Video editing tools like CapCut saw an 83% boost in traffic after switching to WebCodecs + WebAssembly [1]. Utility apps like Remotion Convert and Free AI Video Upscaler (both open source) process thousands of videos a day with zero server costs and no installation required [2].

Remotion Convert

WebCodecs is even being used for entirely new use cases, like generating videos programmatically [3].

If you're building any kind of video app, it's worthwhile to at least know about WebCodecs as an option for working with video in the browser.

In this guide, we will:

  1. Review the basics of Video Processing

  2. Introduce the WebCodecs API

  3. Discuss Muxing + Demuxing to read and write video files

  4. Build our own video conversion utility to convert videos between webm + mp4, and apply basic transformations

  5. Cover some production-level concerns

  6. Discuss additional resources

The goal of this article is to be a practical entry point and introduction to the WebCodecs API for frontend developers. It'll teach you how the API works and what you can do with it. I'll assume you know the basics of JavaScript, but you don't need to be a senior developer or a video engineer to follow along.

At the end, I'll mention additional learning resources and references. In future tutorials, I'll go more in-depth on specific topics like building a video editor, or doing live-streaming with WebCodecs. But this handbook should provide a solid starting point for what WebCodecs is, what it can do, and how to build a basic application with it.

Table of Contents

Prerequisites

You don't need to be a video engineer to follow along, but you should be comfortable with:

  • Core JavaScript, including async/await and callbacks

  • Basic browser APIs like fetch and the DOM

  • What a File object is and how file inputs work in HTML

  • A general sense of what HTML5 is (we'll use it briefly, but won't go deep)

No prior knowledge of video processing, codecs, or media APIs is required — that's what the first half of this handbook covers.

Primer on Video Processing

Hold your bunnies, because before getting into WebCodecs, I want to make sure you're aware of what codecs are before we even consider putting codecs on the web.

Video Frames

I presume you know what a video is. Ironically the 'video' below is actually a gif, but you get the idea.

Big Buck Bunny, an opensource video

Videos are just a series of images, shown one after the other, in quick succession. Each image is called a Video Frame, and each frame is associated with a timestamp. When a video player plays back the video, it displays each video frame at the time indicated by the timestamp.

Video Frames
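As a small aside that will matter later: the WebCodecs API expresses frame timestamps in microseconds. A tiny helper (the function name is mine) for computing the timestamp of the n-th frame of a constant-frame-rate video might look like:

```javascript
// Timestamp, in microseconds (the unit WebCodecs uses), of frame n
// in a constant-frame-rate video. Helper name is illustrative.
function frameTimestampUs(n, fps) {
  return Math.round(n * 1_000_000 / fps);
}

console.log(frameTimestampUs(0, 30));  // first frame: 0
console.log(frameTimestampUs(30, 30)); // one second in: 1000000
```
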

Every frame in the video is made of pixels, with a 4K video frame containing approximately 8 million pixels (3840 × 2160 = 8,294,400).

VideoFrames have pixels

Each pixel itself is actually made of 3 components: a Red, Green, and Blue value (also called RGB value).

RGB Channels

Each of the R, G and B color values is stored as an 8-bit integer, ranging from 0 to 255, with the number indicating the intensity of the red, green, or blue color component.

uint8 color channel

Combining the intensity of each of the R, G, and B components lets you represent any arbitrary color on the color spectrum:

RGB Color value examples
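The familiar hex colors from CSS are exactly these three bytes packed together (the helper name here is mine):

```javascript
// Pack 8-bit R, G and B components into a single 24-bit value,
// formatted as a CSS-style hex color.
function rgbToHex(r, g, b) {
  return '#' + ((r << 16) | (g << 8) | b).toString(16).padStart(6, '0');
}

console.log(rgbToHex(255, 0, 0));     // '#ff0000' - pure red
console.log(rgbToHex(255, 255, 255)); // '#ffffff' - white
```
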

So for each pixel, we need 3 bytes of data: 1 byte for each of the R, G, and B color values (1 byte = 8 bits). A 4K video frame therefore would contain ~25 Megabytes of data.

RGB Channnels

At 30 frames per second (a typical frame rate), that's roughly 746 megabytes every second, or about 2.7 terabytes for a 1-hour 4K video. If you've ever downloaded a large video or recorded HD video with your phone camera, you'll know that video files can be large, but they're never that large.
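You can check the arithmetic yourself:

```javascript
// Raw (uncompressed) size of 4K RGB video, from the numbers above.
const bytesPerFrame = 3840 * 2160 * 3;     // 3 bytes per pixel
const bytesPerSecond = bytesPerFrame * 30; // 30 fps
const bytesPerHour = bytesPerSecond * 3600;

console.log((bytesPerFrame / 1e6).toFixed(1));  // ~24.9 MB per frame
console.log((bytesPerSecond / 1e6).toFixed(0)); // ~746 MB per second
console.log((bytesPerHour / 1e12).toFixed(1));  // ~2.7 TB per hour
```
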

In reality, actual video files you might watch on YouTube, record on your phone camera, or download from the internet are ~100x smaller than that. The reason actual video files are much smaller is because of video compression, a family of very sophisticated algorithms that help reduce the data by ~100x.

Without this video compression, you wouldn't be able to record more than 10 minutes of video on the latest high-end smartphones, and you wouldn't be able to stream anything HD on a high-end home internet connection.

As sophisticated as our modern devices and internet connections are, without aggressive video compression, we wouldn't be able to watch, record, or stream anything in HD.

Codecs

A codec is a fancy word for a video compression algorithm. There are a few established codecs / compression algorithms, such as:

  • h264: The most common codec. If you see an mp4 file, it most likely uses the h264 codec.

  • vp9: An open source codec used commonly by YouTube and in video conferencing, often found in webm files.

  • av1: A new open source codec, increasingly being used by platforms like YouTube and Netflix.

How these algorithms work is too complex and out of scope for this handbook. But at a very high level, here are some major ways these algorithms compress video:

Removing detail

All these algorithms use a technique called the Discrete Cosine Transform to "remove details". As you remove "detail" from the video frame, the frame starts looking "blockier". This technique is so effective, though, that you can compress a video frame by ~10x before the differences start becoming visible to the human eye.

For the curious, you can see this video by Computerphile on how the DCT algorithm works.

DCT algorithm removing details
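To get a feel for why "removing detail" saves space, here's a toy sketch. Real codecs quantize DCT coefficients rather than raw pixel values, but the core idea is the same: divide by a step size and round, trading fine detail for smaller numbers:

```javascript
// Toy illustration of lossy quantization - the "remove detail" step.
// Real codecs quantize DCT coefficients, not raw pixels; this just
// shows why coarser steps mean less data but blockier results.
function quantize(values, step) {
  return values.map(v => Math.round(v / step)); // smaller numbers to store
}

function dequantize(quantized, step) {
  return quantized.map(q => q * step); // reconstruction loses fine detail
}

const pixels = [200, 203, 198, 52, 49, 55];
const q = quantize(pixels, 10);  // [20, 20, 20, 5, 5, 6]
console.log(dequantize(q, 10));  // [200, 200, 200, 50, 50, 60]
```

The subtle variations between neighboring pixels are gone after the round trip, but the overall picture survives.
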

Encoding frame differences

When you actually look at a sequence of video frames, you'll notice that visually they're quite similar, with only small portions of the video changing, depending on how much movement there is.

These codecs/compression algorithms use sophisticated math and computer vision techniques to encode just the differences between frames.

Frame Differences

You therefore only need to send the first frame (a Key Frame) – then for subsequent frames you can send the "frame differences", also called Delta Frames, to reconstruct each full frame.

Key Frames vs Delta frames

In practice, for an hour-long video, we don't just encode the first frame and store millions of delta frames. Instead, algorithms encode every 60th frame or so as a Key Frame, and then the next 59 frames are delta frames.

This technique is also highly effective, reducing data used by another ~10x. The distinction between Key Frames and Delta Frames is one of the few bits of "how these algorithms work" that you actually need to be aware of.
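Here's a toy sketch of the key/delta idea. Real codecs use motion-compensated differences, but conceptually each delta frame is added to the previous frame to reconstruct the next one:

```javascript
// Toy key/delta reconstruction. A "frame" here is just an array of
// pixel values, and a delta frame stores per-pixel differences from
// the previous frame - mostly zeros, so it's cheap to store.
function applyDelta(prevFrame, delta) {
  return prevFrame.map((pixel, i) => pixel + delta[i]);
}

const keyFrame = [10, 10, 10, 10];
const deltas = [[0, 0, 5, 0], [0, 0, 0, -3]];

let frame = keyFrame;
const video = [frame];
for (const d of deltas) {
  frame = applyDelta(frame, d); // each delta builds on the last frame
  video.push(frame);
}
console.log(video);
// [[10,10,10,10], [10,10,15,10], [10,10,15,7]]
```
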

There are a number of other details and compression techniques that go into these algorithms, but they're out of scope for an intro article.

Encoding & Decoding

For video compression to work, we need to be able to both compress video (turn raw video into compressed binary data) and then decompress video (turn the compressed binary data back into raw video frames).

Turning raw video frames into compressed binary data is called encoding, and turning compressed binary data back into raw video frames is called decoding. The word codec is just a contraction of "coder-decoder".

VideoEncoder and VideoDecoder

From a practical, developer perspective, you don't need to know how these codecs work, but you do need to know that:

  1. There are different video codecs, like h264, vp9, and av1

  2. When you encode a video with a codec (like h264), you need a video player that can support the same codec to play back the video.

  3. Encoding video takes a lot more computation than decoding video, so playing 4K video on a low-end phone is fine, but encoding 4K video on it would be super slow.

  4. Most consumer devices (phones, laptops) have specialized chips designed specifically for encoding and decoding video, making encoding/decoding much faster than if run on the CPU like a normal software program. This is called hardware acceleration.

In practice, there are only a handful of video codecs, because the entire world needs to agree on standards so that video recorded on an iPhone can be played back on a Windows device.

Containers

Most people haven't heard of h264 or vp9. When you think of video files, you typically think of file formats like MP4 or MKV. These are also relevant, but they're a separate thing called containers.

A video file typically has encoded audio, encoded video, and metadata about the video file. A file format like MP4 describes a specific format for storing the encoded audio and video data, as well as the metadata.

Video Container

Video compression software stores the encoded audio/video and metadata into a file according to the file format / specs. This is called muxing.

Likewise, video players follow the file format specs to read the metadata and find the encoded audio/video. This is called demuxing.

When compressing a video file, you need to both encode it and mux it (in that order). These are two separate stages of the process. Likewise, when playing a video file, you need to both demux it and then decode it (in that order).

When a video player opens, say, an mp4 file, the logic flow is as follows:

  • Ok, the file ends in .mp4, so it must be an mp4 file. Let me load the library for parsing mp4 files and then parse the file.

  • Great, I've parsed the mp4 file. I now have the metadata and know the byte offsets at which to fetch the encoded audio and video.

  • I'll start fetching the first encoded video frames, decode them, and start displaying the decoded video frame to the user.

If you ever see a "video file is corrupt" message from a video player, it's likely that the video file doesn't follow the file format spec and there was an error while trying to parse / demux the video.

What is WebCodecs?

Now that we've covered codecs, let's put them on the Web.

WebCodecs = Web + Codecs

WebCodecs is an API that allows frontend developers to encode and decode video in the browser efficiently (using hardware acceleration), and with very low level control (encode/decode on a frame by frame basis).

The hardware acceleration bit is important, as you can't just polyfill or re-implement the API yourself. WebCodecs gives direct access to specialized hardware for encoding/decoding, making it as performant as a desktop video app.
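Because the API can't be polyfilled, a sensible first step in any WebCodecs app is feature detection with a fallback path (the function name here is mine):

```javascript
// WebCodecs can't be polyfilled, so detect it at runtime and choose a
// fallback path (e.g. ffmpeg.wasm) when it's unavailable - older
// browsers, Node, or workers without the API.
function supportsWebCodecs() {
  return typeof VideoEncoder !== 'undefined'
      && typeof VideoDecoder !== 'undefined';
}

const pipeline = supportsWebCodecs() ? 'webcodecs' : 'fallback';
console.log(pipeline);
```
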

Before WebCodecs

It's worth taking a moment to understand why WebCodecs exists. Before the WebCodecs API existed, there were several alternatives you could use for video operations in the browser.

  • HTMLVideoElement: You can still create a <video> element and use it for decoding a video. It's easy to use, but you lack frame-level control. Your only control is setting the video.currentTime property and waiting for it to seek, often leading to dropped/missing frames.

  • Media Recorder API: Essentially allows you to 'screen record' any canvas element or video stream. While it works, it's functionally equivalent to screen recording Adobe Premiere Pro instead of clicking render. For editing scenarios, you lose frame-level control and can only process video at real-time speed.

  • FFMPEG.js: A port of the popular video processing tool ffmpeg, which runs ffmpeg in the browser. Many tools used this in the past, but it lacks hardware acceleration, making it much slower than WebCodecs. It also has file size restrictions stemming from the fact that it runs in WebAssembly, making it difficult to work with videos that are larger than 100 MB.

WebCodecs was built and released in 2021 to enable low-level, hardware accelerated video decoding and encoding. It's great for high-performance streaming and video editing, which were use cases not well-served by the existing APIs.

Core API

The core API for WebCodecs consists of two new "data types", the VideoFrame and EncodedVideoChunk, as well as the VideoEncoder and VideoDecoder interfaces.

VideoFrame

The Javascript VideoFrame object conceptually contains both pixel data and metadata about the video frame.

VideoFrame object

You can actually create a new VideoFrame object from any image source, as long as you include the metadata:

const bitmapFrame = new VideoFrame(imgBitmap, {timestamp: 0});

const imageFrame = new VideoFrame(htmlImageEl, {timestamp: 0});

const videoFrame = new VideoFrame(htmlVideoEl, {timestamp: 0});

const canvasFrame = new VideoFrame(canvasEl, {timestamp: 0});

For a video editing app, for example, you would typically perform image editing operations on each frame on a canvas, and then you would grab each VideoFrame from the canvas.

You can also draw a VideoFrame to a canvas using the Canvas 2D rendering context:

ctx.drawImage(frame, 0, 0);

You would typically do this when rendering / playing back a video in the browser.

EncodedVideoChunk

An EncodedVideoChunk is just the compressed version of a VideoFrame, containing the binary data as well as the same metadata as the frame.

EncodedVideoChunk

You would typically get EncodedVideoChunks from a library which extracts them from a File object.

import { getVideoChunks } from 'webcodecs-utils'

const chunks: EncodedVideoChunk[] = await getVideoChunks(file);

Alternatively, it's the output you get from a VideoEncoder object.

There's not much useful stuff you can do with EncodedVideoChunks – it's just the binary data that you read from files, write to files, or stream over the internet.

Video streaming with encode and decode

The value in EncodedVideoChunk is that it's ~100x smaller than raw video data, which is why you'd send EncodedVideoChunks instead of raw video when streaming (and writing to a file).

VideoEncoder

A VideoEncoder turns VideoFrame objects into EncodedVideoChunk objects.

VideoEncoder

The core API looks something like this, where you define the callback where the VideoEncoder returns EncodedVideoChunk objects.

const encoder = new VideoEncoder({
    output: (chunk: EncodedVideoChunk, metadata: EncodedVideoChunkMetadata) => {
        // Do something with the chunk
    },
    error: (e: DOMException) => console.warn(e)
});

Keep in mind that this is an async process, and not even a typical async process. You can't just treat this as a per-frame operation.

// Does not work like this
const chunk = await encoder.encode(frame);

This is because of how video encoding actually works under the hood. So you have to accept that the outputs are returned via callback, and you get the outputs when you get them.

Once you define your encoder, you can then configure the VideoEncoder with your choice of codec (we'll get to this), as well as other parameters like width, height, framerate and bitrate.

encoder.configure({
    codec: 'vp09.00.10.08', // We'll get to this
    width: 1280,
    height: 720,
    bitrate: 1000000, // 1 Mbps
    framerate: 25
});

You can then start encoding frames. Here we assume we already have VideoFrame objects, and we make every 60th frame a Key Frame.

for (let i = 0; i < frames.length; i++){
    encoder.encode(frames[i], {keyFrame: i % 60 === 0});
}

VideoDecoder

The VideoDecoder does the reverse, turning EncodedVideoChunk objects into VideoFrame objects.

VideoDecoder

Here's a simplified example of how to set up the VideoDecoder. First, extract the EncodedVideoChunk objects and the decoder config from the video file. Here, we don't choose the config – the config was chosen by whoever encoded the file. When decoding, we extract the config from the file.

import { demuxVideo } from 'webcodecs-utils';

const {chunks, config} = await demuxVideo(file);

Next, we set up the VideoDecoder by specifying the callback when VideoFrame objects are generated, and we configure it with the config.

const decoder = new VideoDecoder({
    output: (frame: VideoFrame) => {
        // Do something with the VideoFrame
    },
    error: (e: DOMException) => console.warn(e)
});

decoder.configure(config)

Again, like with VideoEncoder, it returns frames in a callback. Finally we can start decoding chunks.

for (const chunk of chunks){
    decoder.decode(chunk);
}

Putting it all together

At its core, the WebCodecs API is just the two data types (EncodedVideoChunk, VideoFrame) and the VideoEncoder and VideoDecoder interfaces which convert between the two data types.

The core of WebCodecs

Keep in mind that the WebCodecs API doesn't actually work with video files. It only applies the encoding and decoding, and EncodedVideoChunk objects just represent binary data.

Reading video files and writing video files are their own, separate thing called muxing/demuxing.

Muxing and Demuxing

To write to a video file, you'll also need to mux the video. And to play a video file, you need to demux the video. This involves following the file format of the video container, parsing the video file (in the case of demuxing), or placing encoded video data in the right place in the file you are writing to (muxing).

Muxing and Demuxing are not included in the WebCodecs API, so you'll need to use a separate library to handle muxing and demuxing.

Demuxing

To play a video back in the browser, we need to both demux the video and decode the video, in that order.

Demuxing and decoding

There are several libraries you can use to demux videos, including MediaBunny or web-demuxer. For the purposes of this tutorial, I put a very simplified wrapper around these libraries and exposed it in the webcodecs-utils package, so that demuxing is a very simple 2-liner:

import { demuxVideo } from 'webcodecs-utils'
const {chunks, config} = await demuxVideo(file);

This reads the entire video into memory, so don't do this in practice. But it's helpful in making a simple, readable hello world for WebCodecs.

The following snippet will take in a video file (File object), decode it, and paint the result to a canvas. Here, we get the frames from the output callback, and run the draw calls directly from the callback.

import { demuxVideo } from 'webcodecs-utils'

async function playFile(file: File){

    const {chunks, config} = await demuxVideo(file);
    const canvas = document.createElement('canvas');
    canvas.width = config.codedWidth ?? 1280;
    canvas.height = config.codedHeight ?? 720;
    document.body.appendChild(canvas);
    const ctx = canvas.getContext('2d')!;

    const decoder = new VideoDecoder({
        output(frame: VideoFrame) {
            ctx.drawImage(frame, 0, 0);
            frame.close();
        },
        error(e) { console.warn(e); }
    });

    decoder.configure(config);

    for (const chunk of chunks){
        decoder.decode(chunk);
    }

}

Here's our super barebones demo for playing back an actual video:

For a more 'correct' demuxing example, here is what demuxing looks like with MediaBunny, where you can extract chunks in an iterative fashion.

import { EncodedPacketSink, Input, ALL_FORMATS, BlobSource } from 'mediabunny';

const input = new Input({
  formats: ALL_FORMATS,
  source: new BlobSource(<File> file),
});

const videoTrack = await input.getPrimaryVideoTrack();
const sink = new EncodedPacketSink(videoTrack);

for await (const packet of sink.packets()) {
  const chunk = <EncodedVideoChunk> packet.toEncodedVideoChunk();
  // hand each chunk to your decoder here
}

Muxing

To write a video file, you not only need to encode it (with the VideoEncoder) you also need to mux it. This involves taking the encoded chunks and placing them in the right place in the output binary file that you're writing to.

Muxing and Encoding

Again, you need a library to mux videos (e.g., MediaBunny), but for demo purposes I created a simple wrapper, ExampleMuxer.

import { ExampleMuxer } from 'webcodecs-utils'

const muxer = new ExampleMuxer('video');

for (const chunk of encodedChunks){
    muxer.addChunk(chunk);
}

const outputBlob = await muxer.finish();

As a full encoding + muxing demo, we'll create an encoder, and we'll set it to mux the output encoded chunks as soon as they are returned.

const encoder = new VideoEncoder({
    output: function(chunk, meta){
        muxer.addChunk(chunk, meta);
    },
    error: function(e){ console.warn(e); }
});

encoder.configure({
    codec: 'avc1.4d0034', // We'll get to this
    width: 1280,
    height: 720,
    bitrate: 1_000_000, // 1 Mbps
    framerate: 30 // matches the fps used in the animation below
});

We'll then define a canvas animation, which will draw the current frame number to the screen, just to prove it's working.

const canvas = new OffscreenCanvas(1280, 720); // match the encoder config
const ctx = canvas.getContext('2d');
const TOTAL_FRAMES = 300;
let frameNumber = 0;
const fps = 30;


function renderFrame(){
    ctx.fillStyle = '#000';
    ctx.fillRect(0, 0, canvas.width, canvas.height);
    ctx.fillStyle = 'white';
    ctx.font = `bold ${Math.min(canvas.width / 10, 72)}px Arial`;
    ctx.textAlign = 'center';
    ctx.textBaseline = 'middle';
    ctx.fillText(`Frame ${frameNumber}`, canvas.width / 2, canvas.height / 2);
}

Finally we'll create the encode loop, which will draw the current frame, and then encode it.


let flushed = false;

async function encodeLoop(){

    renderFrame();

    const frame = new VideoFrame(canvas, {timestamp: frameNumber / fps * 1e6});
    encoder.encode(frame, {keyFrame: frameNumber % 60 === 0});
    frame.close();

    frameNumber++;

    if (frameNumber === TOTAL_FRAMES) {
        if (!flushed) {
            flushed = true;
            await encoder.flush(); // wait for remaining chunks, then finish the muxer
        }
    }
    else return requestAnimationFrame(encodeLoop);
}

Putting it all together, you can encode the canvas animation to a video file with frame-level accuracy.

You can download the video and use any video inspection tool to verify that every single frame number is included.

Videos with frame level accuracy

This is one of the critical distinctions that separates this from other web APIs like MediaRecorder which can also encode video, but has no frame-level accuracy. WebCodecs makes sure that you can control and guarantee the consistency of each frame.
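That frame-level guarantee rests on the explicit microsecond timestamps assigned during encoding. Here is the timestamp formula from the encode loop above, pulled out as a standalone helper:

```javascript
// WebCodecs timestamps are in microseconds. For a constant-frame-rate
// animation, frame i at `fps` gets this timestamp:
function frameTimestampUs(i, fps) {
  return Math.round((i / fps) * 1e6);
}

console.log(frameTimestampUs(0, 30));  // 0
console.log(frameTimestampUs(30, 30)); // 1000000 (exactly one second in)
```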

Finally, a proper full, muxing example using MediaBunny would look like this:

import {
  EncodedPacket,
  EncodedVideoPacketSource,
  BufferTarget,
  Mp4OutputFormat,
  Output
} from 'mediabunny';

async function muxChunks(chunks: EncodedVideoChunk[]): Promise<Blob> {

    const output = new Output({
        format: new Mp4OutputFormat(),
        target: new BufferTarget(),
    });

    const source = new EncodedVideoPacketSource('avc');
    output.addVideoTrack(source);

    await output.start();

    for (const chunk of chunks){
        source.add(EncodedPacket.fromEncodedChunk(chunk))
    }

    await output.finalize();
    const buffer = <ArrayBuffer> output.target.buffer;
    return new Blob([buffer], { type: 'video/mp4' });

}

Building a Video Converter Utility

Now that we've covered the basics of WebCodecs as well as Muxing, we'll move towards actually building an MVP of something useful: a video converter utility. We'll be able to use it to convert between mp4 and webm, and do some basic operations like resizing and flipping the video.

Transcoding

Before we do resizing and flipping, let's first handle a basic conversion decoding a video, and encoding the video to a new format. This is called transcoding.

To transcode video, we need to set up a pipeline with the following processes:

  • Demuxing: Read EncodedVideoChunks from a video file

  • Decoding: Convert EncodedVideoChunks to VideoFrames

  • Encoding: Convert VideoFrames to new EncodedVideoChunks

  • Muxing: Write the EncodedVideoChunks to a new video file

Our pipeline looks something like this:

Transcoding pipeline

Using everything we've covered so far, we could build a full working demo with just VideoEncoder and VideoDecoder. But then state management and frame tracking become complicated and error-prone.

We're going to add one more abstraction, using the Streams API, which will make our pipeline look like the below. It ties directly to our mental model of our pipeline and simplifies a ton of details like state management.

const transcodePipeline = demuxerReader
    .pipeThrough(new VideoDecoderStream(videoDecoderConfig))
    .pipeThrough(new VideoEncoderStream(videoEncoderConfig))
    .pipeTo(createMuxerWriter(muxer));

await transcodePipeline;

To do this, we'll create a TransformStream for the VideoDecoder and VideoEncoder.

class VideoDecoderStream extends TransformStream<{ chunk: EncodedVideoChunk; index: number }, { frame: VideoFrame; index: number }> {
  constructor(config: VideoDecoderConfig) {
    let decoder: VideoDecoder;
    const pendingIndices: number[] = [];
    super(
      {
        start(controller) {
          decoder = new VideoDecoder({
            output: (frame) => {
              const index = pendingIndices.shift()!;
              controller.enqueue({ frame, index });
            },
            error: (e) => controller.error(e),
          });

          decoder.configure(config);
        },

        transform(item) {
          pendingIndices.push(item.index);
          decoder.decode(item.chunk);
        },

        async flush() {
          await decoder.flush();
          if (decoder.state !== 'closed') decoder.close();
        },
      }
    );
  }
}

I won't bore you with the full code, but I've packaged these utilities in the webcodecs-utils package, which can be used as such:

import {
  SimpleDemuxer,
  VideoDecodeStream,
  VideoEncodeStream,
  SimpleMuxer,
} from "webcodecs-utils";

Our code for transcoding a file then becomes this:

const demuxer = new SimpleDemuxer(videoFile);
await demuxer.load();
const decoderConfig = await demuxer.getVideoDecoderConfig();

const encoderConfig = {/*Whatever we decide*/};

// Set up muxer
const muxer = new SimpleMuxer({ video: "avc" });

// Build the transcoding pipeline
await demuxer.videoStream()
  .pipeThrough(new VideoDecodeStream(decoderConfig))
  .pipeThrough(new VideoEncodeStream(encoderConfig))
  .pipeTo(muxer.videoSink());

// Get output
const blob = await muxer.finalize();

For this intermediate demo, just to actually get transcoding to work, we'll download a pre-built file, and we'll introduce a toggle to output an mp4 file (using h264) or a webm file (using vp9).

We'll use avc1.4d0034 for h264 (most widely supported h264 codec string) and vp09.00.40.08.00 for vp9 (most widely supported vp9 string).

Here's a basic transcoding demo on CodePen:

Transformations

If we want to do any kind of transformations to the video, like flips, crops, rotations, resizing, and so on, we can't just work with pure VideoFrame objects.

The simplest way to accomplish this would be to introduce a Canvas element, where we'll use a 2d Canvas Context to manipulate our source frame and draw that to a canvas.

const canvas = new OffscreenCanvas(width, height);
const ctx = canvas.getContext('2d');

// Very easy to do transformations
ctx.drawImage(sourceFrame, 0, 0);

We'll then use the Canvas as a source image for our output video frame.

const outFrame = new VideoFrame(canvas, {timestamp: sourceFrame.timestamp});

To apply a resize operation, we'll first set the canvas dimensions to our output height and width.

const canvas = new OffscreenCanvas(outputWidth, outputHeight);
const ctx = canvas.getContext('2d');

// Resize sourceFrame to fit output dimensions
ctx.drawImage(sourceFrame, 0, 0, outputWidth, outputHeight);

To apply a horizontal flip operation with canvas2d, we can do the following:

ctx.scale(-1, 1);
ctx.translate(-outputWidth, 0);
ctx.drawImage(sourceFrame, 0, 0, outputWidth, outputHeight);
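To see why this flips the image: scale(-1, 1) followed by translate(-outputWidth, 0) maps each source x-coordinate to outputWidth − x, so the left edge lands on the right edge and vice versa. The same mapping as plain arithmetic:

```javascript
// The effective coordinate mapping of scale(-1, 1) + translate(-w, 0):
// a point drawn at x ends up at (w - x), y is unchanged.
function flippedX(x, outputWidth) {
  return outputWidth - x;
}

console.log(flippedX(0, 1280));    // 1280 (left edge -> right edge)
console.log(flippedX(1280, 1280)); // 0    (right edge -> left edge)
```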

You can create a full render function that applies these transformations which looks like this:

function render(videoFrame, outW, outH, flipped) {

  canvas.width  = outW;
  canvas.height = outH;

  if (flipped) {
    ctx.scale(-1, 1);
    ctx.translate(-outW, 0);
  }
  ctx.drawImage(videoFrame, 0, 0, outW, outH);

}

Here's an interactive demo of what these transformations look like:

Transform Pipeline

With these transformations, we need to adjust our pipeline to include a transformation step. It will take in a VideoFrame, apply the transforms, and return a transformed frame.

Transcoding pipeline with transforms

In the webcodecs-utils package, there is a VideoProcessStream object for this purpose, which takes in an async function which takes in a VideoFrame and returns a VideoFrame:

import { VideoProcessStream} from "webcodecs-utils";
 
new VideoProcessStream(async (frame) => {
      // Apply transformations
      return processedFrame;
});

So to apply our transformations, we can set it up as so:

import { VideoProcessStream} from "webcodecs-utils";
 

const canvas = new OffscreenCanvas(outW, outH);
const ctx = canvas.getContext('2d');

const processStream = new VideoProcessStream(async (frame) => {

  ctx.setTransform(1, 0, 0, 1, 0, 0); // reset any transform left over from the previous frame
  if (flipped) {
    ctx.scale(-1, 1);
    ctx.translate(-outW, 0);
  }
  ctx.drawImage(frame, 0, 0, outW, outH);

  return new VideoFrame(canvas, {timestamp: frame.timestamp});

});

And then our full pipeline looks like this:

const demuxer = new SimpleDemuxer(videoFile);
await demuxer.load();
const decoderConfig = await demuxer.getVideoDecoderConfig();

const encoderConfig = {/*Whatever we decide*/};

// Set up muxer
const muxer = new SimpleMuxer({ video: "avc" });

// Build the transcoding pipeline
await demuxer.videoStream()
  .pipeThrough(new VideoDecodeStream(decoderConfig))
  .pipeThrough(processStream) // Just defined this
  .pipeThrough(new VideoEncodeStream(encoderConfig))
  .pipeTo(muxer.videoSink());

// Get output
const blob = await muxer.finalize();

Here's a full working demo with the process pipeline:

Complete Demo

Now, for the complete tool, we'll make some key changes:

  • You can upload your own video

  • We'll preview the transformations by extracting a frame

  • We'll add progress measurement

For the input, that's trivial:

<input type="file" onchange="handler(event)" />

For frame previews, we could use WebCodecs to generate a preview, but because the preview doesn't need frame-level accuracy or high performance, it's easier to just use the HTML5 VideoElement to grab a video frame from the source file.

async function getFirstFrame(file) {
  const video = document.createElement("video");
  video.src = URL.createObjectURL(file);
  video.muted = true;

  await new Promise((resolve) => video.addEventListener("loadeddata", resolve, { once: true }));
  video.currentTime = 0;
  await new Promise((resolve) => video.addEventListener("seeked", resolve, { once: true }));

  return new VideoFrame(video, {timestamp: 0});
}

Finally, we can calculate progress in the process function by dividing the frame timestamp by the video duration.

const {duration} = await demuxer.getMediaInfo();


const processStream = new VideoProcessStream(async (frame) => {

  ctx.setTransform(1, 0, 0, 1, 0, 0); // reset transforms from the previous frame
  if (flipped) {
    ctx.scale(-1, 1);
    ctx.translate(-outW, 0);
  }
  ctx.drawImage(frame, 0, 0, outW, outH);

  // Frame timestamps are in microseconds, duration in seconds
  const progress = frame.timestamp / (duration * 1e6);
  // ...update your progress UI with `progress` here

  return new VideoFrame(canvas, {timestamp: frame.timestamp});

});

Putting this all together, we can finally put together a full working video converter utility:

And that's it! We've built an MVP of something actually useful with WebCodecs 🎉, with Demuxing, Decoding, Canvas Transforms, Encoding, and Muxing.

The only difference between this and a full-fledged browser editing suite like Capcut is the scale and scope of transformations. But the video processing logic would be nearly identical.

Production Concerns

It's great that we've been able to create something useful, but before we wrap up, it's important to cover some production-level concerns.

Codecs

You might have noticed strings like vp09.00.10.08 in the demos, but I glossed over the details. We'll cover that now:

First, WebCodecs works with specific codec strings like vp09.00.10.08, not just 'vp9'. The following won't work:

encoder.configure({
    codec: 'vp9', // This won't work!
    //...
});

As discussed previously, when decoding video, you don't really get a choice of codec. The video is already encoded, and so you need to get the codec from the video, as shown in the previous demos.

The demuxing libraries mentioned will identify the correct codec string, so you don't need to worry about that.

const decoderConfig = await demuxer.getVideoDecoderConfig();
//decoderConfig.codec = exact codec string for the video

When encoding a video, you can choose your codec. Some people care a lot about codec choice, but from a practical, pragmatic perspective, these rules of thumb should work for most developers:

  • If the videos your app generates will be downloaded by users and/or you want to output mp4 files, use h264.

  • If the videos generated are for internal use or you control video playback, and you don't care about format, use vp9 with webm (open source, better compression, broad browser support).

  • For most apps, these two options will cover you — deeper codec selection is a rabbit hole you don't need to go down yet.
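A hypothetical helper condensing those rules of thumb, using the exact codec strings this article's demos use (avc1.4d0034 for h264/mp4, vp09.00.40.08.00 for vp9/webm):

```javascript
// Pick a codec string by output container, following the rules of thumb
// above. This is a sketch, not an exhaustive codec selector.
function pickCodec(container) {
  if (container === 'mp4') return 'avc1.4d0034';       // h264 main profile
  if (container === 'webm') return 'vp09.00.40.08.00'; // vp9
  throw new Error(`unsupported container: ${container}`);
}

console.log(pickCodec('mp4'));  // 'avc1.4d0034'
console.log(pickCodec('webm')); // 'vp09.00.40.08.00'
```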

Once you have a codec family chosen, you need to choose a specific codec string such as avc1.42001f.

The other numbers in the string specify codec parameters that matter less from a developer perspective. If your goal is maximum compatibility, here's a cheat sheet for which codec strings to use:

h264 (for mp4 files)
  • avc1.42001f - base profile, most compatible, supports up to 720p (99.6% support)

  • avc1.4d0034 - main profile, level 5.2 (supports up to 4K) (98.9% support)

  • avc1.42003e - base profile, level 6.2 (supports up to 8k) (86.8% support)

  • avc1.64003e - high profile - level 6.2 (supports up to 8k) (85.9% support)

vp9 (for webm files)

  • vp09.00.40.08.00 - profile 0, level 4.0, 8-bit (the most widely supported vp9 string, used in the demos above)

You can also use the getCodecString function from the webcodecs-utils package:

import { getCodecString } from 'webcodecs-utils'

const codec_string = getCodecString('vp9', width, height, bitrate)

You can find a comprehensive list of what codecs and codec strings you can use in WebCodecs here.

Bit rate

On top of height and width (which you presumably know from your content) and a codec string (which we just discussed), you also need to specify a bit rate when encoding video.

Video compression algorithms trade quality against file size: higher quality means bigger files, lower quality means smaller files.

Here's a quick visualization of what different quality levels look like for a 1080p video encoded at different bit rates:

300 kbps

300kbps frame

1 Mbps

1Mbps frame

3 Mbps

3 Mbps frame

10 Mbps

10 Mbps frame

Here's a quick lookup table for bitrate guidance:

Resolution Bitrate (30fps) Bitrate (60fps)
4K 13-20 Mbps 20-30 Mbps
1080p 4.5-6 Mbps 6-9 Mbps
720p 2-4 Mbps 3-6 Mbps
480p 1.5-2 Mbps 2-3 Mbps
360p 0.5-1 Mbps 1-1.5 Mbps
240p 300-500 kbps 500-800 kbps

You can also use this utility function in your own app as a quick approximation:

function getBitrate(width, height, fps, quality = 'good') {
    const pixels = width * height;

    const qualityFactors = {
      'low': 0.05,
      'good': 0.08,
      'high': 0.10,
      'very-high': 0.15
    };

    const factor = qualityFactors[quality] || qualityFactors['good'];

    // Returns bitrate in bits per second
    return pixels * fps * factor;
  }
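As a sanity check, the 'good' quality factor reproduces the lookup table above. Restating the same heuristic, 1080p at 30 fps lands at roughly 5 Mbps, inside the table's 4.5-6 Mbps row:

```javascript
// Same heuristic as getBitrate above, restated for a quick cross-check
// against the bitrate table.
function getBitrate(width, height, fps, quality = 'good') {
  const qualityFactors = { 'low': 0.05, 'good': 0.08, 'high': 0.10, 'very-high': 0.15 };
  const factor = qualityFactors[quality] || qualityFactors['good'];
  return width * height * fps * factor; // bits per second
}

const bps = Math.round(getBitrate(1920, 1080, 30));
console.log(bps); // 4976640, i.e. ~5 Mbps
```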

The same function is also available in the webcodecs-utils package:

import { getBitrate } from 'webcodecs-utils'

GPU vs CPU

Most user devices have some type of graphics card (typically called integrated graphics). These are specialized chips with specific silicon architectures optimized for encoding and decoding video, as well as for basic graphics.

You might hear "GPU" and think AI data centers and gamers. But as far as web applications are concerned, almost everyone has a GPU.

This is important because while most frontend development deals almost exclusively with the CPU, WebCodecs and video processing run primarily on the GPU.

Here's a quick guide for what kind of data is stored where:

Data Type Location
VideoFrame GPU
EncodedVideoChunk CPU
ImageBitmap GPU
ArrayBuffer CPU
File CPU + Disk

There's a performance cost to moving data around, and this also becomes important for managing memory.

Memory

VideoFrame objects can be quite large – roughly 30MB for a single uncompressed 4K frame. A user's graphics card typically reserves some portion of RAM as "Video Memory" or "VRAM", which is where VideoFrame objects are stored.

So if a user has 8GB of RAM, they would typically have 2GB of VRAM (how much is decided by the operating system).

If the amount of video data exceeds VRAM, your application will crash. For a typical user, that means holding more than about 67 4K frames in memory (~2 seconds of video) is enough to bring the tab down.
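The arithmetic behind that estimate, as a sketch (assuming 4 bytes per RGBA pixel; the ~67-frame figure uses a rounded 30 MB per frame, exact math gives slightly fewer):

```javascript
// How many uncompressed frames fit in a given VRAM budget.
function maxFramesInVram(width, height, vramBytes) {
  const bytesPerFrame = width * height * 4; // 4 bytes per RGBA pixel
  return Math.floor(vramBytes / bytesPerFrame);
}

const frames = maxFramesInVram(3840, 2160, 2e9); // 4K frames in ~2 GB of VRAM
console.log(frames);                   // 60 frames
console.log((frames / 30).toFixed(1)); // "2.0" seconds of 30 fps video
```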

When VideoFrames are generated

VideoFrame objects are created whenever you call new VideoFrame(source), and also by the VideoDecoder's output callback. Every frame generated increases memory usage.

How to remove VideoFrames

You can't rely on standard garbage collection for VideoFrame objects. You have to explicitly call close() on a frame when you're done:

frame.close()

In the Streams/pipeline code and demos shown earlier, frames are closed as soon as they are consumed by the VideoProcessStream and VideoEncodeStream interfaces.

The other reason Streams are helpful for WebCodecs is the highWaterMark property, which the webcodecs-utils streams set to 10 by default. What this means is that when you run:

await demuxer.videoStream()
  .pipeThrough(new VideoDecodeStream(decoderConfig))
  .pipeThrough(processStream) 
  .pipeThrough(new VideoEncodeStream(encoderConfig))
  .pipeTo(muxer.videoSink());

You ensure that no more than 10 video frames are in memory at any given time. The Streams API allows you to specify that limit while the browser itself deals with the logic of how to make that happen.

If you don't use the Streams API, you'll need to make sure you manage keeping track of memory limits and number of open video frames yourself.
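If you do manage it yourself, the bookkeeping can start as simply as a counter gating your decode() calls. A hypothetical sketch, not a complete backpressure implementation:

```javascript
// Track how many frames are currently open, and refuse to decode more
// once a limit is reached. Wire onFrame() into the decoder's output
// callback and onClose() right after each frame.close().
class FrameBudget {
  constructor(limit) { this.limit = limit; this.open = 0; }
  canDecode() { return this.open < this.limit; } // gate decode() calls on this
  onFrame()   { this.open++; }                   // a frame was emitted
  onClose()   { this.open--; }                   // a frame was closed
}

const budget = new FrameBudget(10);
budget.onFrame();
budget.onFrame();
console.log(budget.canDecode()); // true (2 of 10 frames open)
```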

Further Resources

In this article we've covered the basics of video processing, introduced the core concepts of the WebCodecs API, and built an MVP of a video converter utility. This is one of the simplest possible demos that actually touches all parts of the API. We also covered some basic production concerns.

This is just an introduction, and only scratches the surface of WebCodecs. As simple as the API looks, building a proper, production-ready WebCodecs application requires moving well beyond hello-world demos.

To learn more about WebCodecs, you can check out MDN and WebCodecs Fundamentals, a comprehensive online textbook that goes much more in depth on WebCodecs.

You can also examine the source code of existing, production tested apps like Remotion Convert (source code) which is most similar to the demo app we covered, and Free AI Video Upscaler (source code, processing pipeline) which is the inspiration for the design patterns presented here and implemented in webcodecs-utils.

Finally, while WebCodecs is harder than it looks, you can make your life a lot easier by using a library like MediaBunny, which simplifies a lot of the details of things like memory management, file I/O, and other details. I use it in my own production WebCodecs applications.

Whether or not you actually build a full, production grade WebCodecs application, you now at least know that it's an option – one that's relatively new, provides better UX with lower server costs, and which is increasingly being adopted by prominent video applications like Capcut and Descript for its benefits.




What's new for the Microsoft Fluent UI Blazor library 5.0 RC2


We just shipped the second release candidate for v5, and boy did we manage to squeeze in a slew of new stuff...

Since RC1, we’ve worked hard on delivering two new components — AutoComplete and Toast — along with a powerful Theme API (including a Theme Designer), major DataGrid enhancements such as pinned columns, and dozens of improvements across the board.

AutoComplete

The FluentAutocomplete component brings a completely rebuilt, fully-featured AutoComplete experience to v5, replacing the v4 implementation with a more modern and better designed component.

Key capabilities:

  • Search-as-you-type — Filter options dynamically as the user types, with built-in debounce support.
  • Multi-select — Select multiple items displayed as dismissible badges inside the input area.
  • Keyboard navigation — Full arrow-key navigation, Enter to select, Escape to close, and Backspace to remove the last selected item.
  • Custom option templates — Use OptionTemplate to render rich, custom content for each suggestion.
  • Progress indicator — Show a loading indicator while fetching results asynchronously.
  • MaxAutoHeight / MaxSelectedWidth — Control the layout and overflow behavior of selected items.

Toast & ToastService

The new FluentToast service provides a feature-complete experience, including support for live-updating progress toasts.

The library supports four toast scenarios through ToastType:

  • Communication — General notifications and messages
  • Confirmation — Success / failure confirmations
  • IndeterminateProgress — Long-running operations without progress tracking
  • DeterminateProgress — Operations with measurable progress (e.g. upload)

Simple usage:

@inject IToastService ToastService

// Fire-and-forget
await ToastService.ShowToastAsync(options =>
{
    options.Title = "File saved";
    options.Intent = ToastIntent.Success;
    options.Timeout = 3000;
});

Advanced: Use a live toast instance

For scenarios like uploads or long-running operations, use ShowToastInstanceAsync to get a live instance reference. You can then update the content of the Toast while it is being shown:

var toast = await ToastService.ShowToastInstanceAsync(options =>
{
    options.Title = "Uploading document...";
    options.Type = ToastType.DeterminateProgress;
});

// Update progress while visible
await toast.UpdateAsync(t => t.Progress = 50);

// Complete and dismiss
await toast.UpdateAsync(t => t.Progress = 100);
await toast.CloseAsync();

Other highlights

  • Queueing — The FluentToastProvider manages maximum visible toasts, queue promotion, and positioning.
  • Pause on hover — Toast timeout pauses when the user hovers over it, or when the browser window loses focus.
  • Animated transitions — Smooth open/close animations.
  • Accessibility — ARIA attributes and politeness levels are applied based on toast intent.

Theme API & Designer

We're introducing a comprehensive Theme API that gives you full control (within the bounds of the Fluent Design System) over your application’s visual identity — from a simple brand color to a fully customized design token set, with built-in persistence and a live Theme Designer.

Set the brand color declaratively

Add data-theme and data-theme-color attributes to your body tag:

<body data-theme="light" data-theme-color="#0078D4">

The library automatically detects these attributes, generates a color ramp, and applies it to the application.

Set the brand color with code

A full API is available through the IThemeService. A simple example on how to use this:

@inject IThemeService ThemeService

// Set a custom brand color
await ThemeService.SetThemeAsync("#6B2AEE");

// Switch to dark mode
await ThemeService.SetDarkThemeAsync();

// Toggle light ↔ dark
await ThemeService.SwitchThemeAsync();

// Apply the Teams theme
await ThemeService.SetTeamsLightThemeAsync();

// Full control with settings
await ThemeService.SetThemeAsync(new ThemeSettings
{
    Color = "#6B2AEE",
    Mode = ThemeMode.Dark,
    HueTorsion = 0.1f,
    Vibrancy = 0.2f,
});

Theme Designer

The demo site includes a Theme Designer page where you can interactively pick a brand color, adjust hue torsion and vibrancy, preview the generated color ramp, and see your theme applied to actual components in real time. When you’re happy with the result, you can apply the settings to the demo site with a click on a button.

Key features

  • Brand color ramp — Automatic generation of a full color ramp from a single hex color (with the option to use the exact specified color!)
  • Light / Dark / System — Support for all three modes, with automatic system preference detection
  • Teams themes — Built-in Teams Light and Teams Dark themes
  • localStorage — Theme settings are cached and restored across sessions automatically
  • Per-element theming — Apply a custom theme to a specific ElementReference without changing global settings
  • RTL support — SwitchDirectionAsync() to toggle between LTR and RTL

DataGrid Enhancements

The FluentDataGrid has been significantly enhanced in RC2.

Pinned (frozen/sticky) columns

Columns can now be pinned to the left or right edge of the grid, so they remain visible during horizontal scrolling:

<FluentDataGrid Items="@people" Style="overflow-x: auto; max-width: 800px;">
    <PropertyColumn Property="@(p => p.Id)" Width="80px" Pin="DataGridColumnPin.Left" />
    <PropertyColumn Property="@(p => p.Name)" Width="200px" Pin="DataGridColumnPin.Left" />
    <PropertyColumn Property="@(p => p.Email)" Width="300px" />
    <PropertyColumn Property="@(p => p.City)" Width="200px" />
    <PropertyColumn Property="@(p => p.Country)" Width="200px" />
    <PropertyColumn Property="@(p => p.Actions)" Width="100px" Pin="DataGridColumnPin.Right" />
</FluentDataGrid>

Pinned columns require an explicit width and must be contiguous (all start-pinned columns must come first, all end-pinned columns must come last). The DataGrid validates these rules and throws a descriptive exception when detecting an invalid configuration.

HierarchicalSelectColumn

Besides the hierarchical DataGrid option added in RC1, a new column type has now been added that provides parent-child selection behavior, allowing users to select groups of related rows through a hierarchical checkbox.

And more

  • StripedRows parameter for alternating row styling
  • DisableCellFocus parameter
  • OnSortChanged event callback
  • Skip debounce delay on first provider call when using Virtualize
  • Fix SelectedItems getting unselected when using pagination/virtualization
  • Some Width issues fixed

Calendar & DatePicker

We've added MinDate and MaxDate parameters so it is now possible to constrain the selectable date range:

<FluentCalendar @bind-Value="@selectedDate"
                MinDate="@DateTime.Today"
                MaxDate="@DateTime.Today.AddMonths(3)" />

Other changes include:

  • Year view: current year centered — The year picker now places the current year in the middle row for better usability.
  • Fix: month/year navigation getting stuck — Resolved an issue where clicking month or year could leave the calendar in a stuck state.
  • Width forwarded — The Width parameter is now properly forwarded to the underlying FluentTextInput.

Other component improvements

Some other noteworthy improvements and fixes:

  • FluentLink — Support clickable links with OnClick events and improved hover styles
  • Badge — Added .fluent-badge CSS classes for custom styling
  • AppBar — Allow hiding active bar, render active bar when horizontal
  • AppBar — Calculate height of active bar dynamically
  • Nav — Enhanced accessibility (a11y) support
  • Nav — Refactoring and issue fixes
  • DragContainer/DropZone — Switch from Action to EventCallback for event handlers
  • Checkbox — Fix “checked” logic to respect ThreeState parameter
  • Accordion — Fix change event to only trigger for fluent-accordion elements
  • List — Refactor OptionSelectedComparer to use IEqualityComparer
  • Placeholder — Fix placeholder rendering error
  • Custom events — Rename custom events to avoid .NET 11 exception

Localization

Building on the localization system we introduced in RC1:

  • Paginator localizable strings — All Paginator strings (page labels, navigation buttons) are now localizable through IFluentLocalizer.
  • Translation key-value pairs reference — The documentation now includes a complete table of all default translation keys and their values, making it easier to implement IFluentLocalizer.

MCP Server & Tooling

  • Migration Service for v4 → v5 — A new migration service and resources to help automate the transition from v4 to v5 with the MCP Server.
  • ModelContextProtocol updated to v1.1.0 — The MCP Server now uses the latest MCP protocol version.
  • AI Skills docs and download UI — Documentation for the AI Skills available through the library, with a download UI.
  • Version compatibility tools — New tools and documentation to check version compatibility across the Fluent UI Blazor packages family.

Help us and try it now

We are still in the Release Candidate phase. The APIs are mostly stabilized, but we would still very much like the community to help us identify any remaining issues before the final release. Please file issues on GitHub, and don’t hesitate to contribute.

To see what we still want to complete before the v5 final release, see our dev-v5 - TODO List

Thank you to everyone who has already contributed, tested, and provided feedback. We hope you will continue doing so.

A special shout-out goes to the community contributors who made significant contributions to this release!
