Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

Windows stack limit checking retrospective: arm64, also known as AArch64


Our survey of stack limit checking wraps up with arm64, also known as AArch64.

The stack limit checking takes two forms, one simple version for pure arm64 processes, and a more complex version for Arm64EC. I’m going to look at the simple version. The complex version differs in that it has to check whether the code is running on the native arm64 stack or the emulation stack before calculating the stack limit. That part isn’t all that interesting.

; on entry, x15 is the number of paragraphs to allocate
;           (bytes divided by 16)
; on exit, stack has been validated (but not adjusted)
; modifies x16, x17

chkstk:
    subs    x16, sp, x15, lsl #4
                            ; x16 = sp - x15 * 16
                            ; x16 = desired new stack pointer
    csel    x16, xzr, x16, lo   ; clamp to 0 on underflow

    mov     x17, sp
    and     x17, x17, #-PAGE_SIZE   ; round down to nearest page
    and     x16, x16, #-PAGE_SIZE   ; round down to nearest page

    cmp     x16, x17        ; on the same page?
    beq     done            ; Y: nothing to do

probe:
    sub     x17, x17, #PAGE_SIZE ; move to next page¹
    ldr     xzr, [x17]      ; probe
    cmp     x17, x16        ; done?
    bne     probe           ; N: keep going

done:
    ret

The inbound value in x15 is the number of bytes desired divided by 16. Since the arm64 stack must be kept 16-byte aligned, we know that the division by 16 will not produce a remainder. Passing the amount in paragraphs expands the number of bytes expressible in a single constant load from 0xFFF0 to 0xFFFF0 (via the movz instruction), allowing convenient allocation of stack frames up to just shy of a megabyte in size. Since the default stack size is a megabyte, this is sufficient to cover all typical usages.
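The arithmetic in that paragraph can be checked directly; movz loads a 16-bit immediate, so the largest constant is 0xFFFF:

```python
MOVZ_MAX = 0xFFFF                     # largest 16-bit immediate movz can load

# Passing bytes: the amount must stay 16-byte aligned
max_bytes_if_bytes = MOVZ_MAX & ~0xF

# Passing paragraphs: each unit is worth 16 bytes
max_bytes_if_paragraphs = MOVZ_MAX * 16

print(hex(max_bytes_if_bytes))        # 0xfff0
print(hex(max_bytes_if_paragraphs))   # 0xffff0, just shy of 1 MiB (0x100000)
```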

Here’s an example of how a function might use chkstk in its prologue:

    mov     x15, #17328/16      ; desired stack frame size divided by 16
    bl      chkstk              ; ensure enough stack space available
    sub     sp, sp, x15, lsl #4 ; reserve the stack space
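To see which pages the probe loop actually touches, here is a Python model of the routine above. It assumes 4 KiB pages; addresses and frame sizes are illustrative:

```python
PAGE_SIZE = 0x1000  # assumed 4 KiB pages

def chkstk_probes(sp, frame_bytes):
    """Mirror the asm: clamp on underflow, round both pointers down
    to a page boundary, then touch each intervening page in turn."""
    new_sp = sp - frame_bytes
    if new_sp < 0:
        new_sp = 0                      # csel clamp on underflow
    cur = sp & ~(PAGE_SIZE - 1)         # round down to nearest page
    target = new_sp & ~(PAGE_SIZE - 1)  # round down to nearest page
    probes = []
    while cur != target:                # same page? nothing to do
        cur -= PAGE_SIZE                # move to next page
        probes.append(cur)              # ldr xzr, [x17]
    return probes

# A small frame on the same page needs no probes at all.
assert chkstk_probes(0x7FFF0, 0x10) == []

# The 17328-byte frame from the prologue example crosses several pages.
print([hex(a) for a in chkstk_probes(0x80000, 17328)])
```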

Okay, so let’s summarize all of the different stack limit checks into a table, because people like tables.

                                        x86-32  MIPS   PowerPC         Alpha AXP  x86-64  AArch64
unit requested                          Bytes   Bytes  Negative bytes  Bytes      Bytes   Paragraphs
adjusts stack pointer before returning  Yes     No     No              No         No      No
detects stack placement at runtime      No      Yes    Yes             Yes        Yes     Yes
short-circuits                          No      Yes    Yes             Yes        Yes     No
probe operation                         Read    Write  Read            Write      Either  Read

As we discussed earlier, if the probe operation is a write, then short-circuiting is mandatory.

¹ If you’re paying close attention, you may have noticed that PAGE_SIZE is too large to fit in a 12-bit immediate constant. No problem, because the assembler rewrites it as

    sub x17, x17, #PAGE_SIZE/4096, lsl #12
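Assuming 4 KiB pages, the encoding constraint the footnote describes is easy to verify: add/sub immediates are 12 bits wide, optionally shifted left by 12, and 0x1000 only fits in the shifted form:

```python
PAGE_SIZE = 0x1000                    # assumed 4 KiB
IMM12_MAX = 0xFFF                     # add/sub immediates are 12 bits wide

assert PAGE_SIZE > IMM12_MAX          # too large for a plain immediate

# The assembler's rewrite: a 12-bit value shifted left by 12
shifted = PAGE_SIZE // 4096
assert shifted <= IMM12_MAX and (shifted << 12) == PAGE_SIZE
print("encodes as #%d, lsl #12" % shifted)
```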

The post Windows stack limit checking retrospective: arm64, also known as AArch64 appeared first on The Old New Thing.

Read the whole story
alvinashcraft
30 minutes ago
reply
Pennsylvania, USA
Share this story
Delete

Scaling SignalR With a Redis Backplane



I ran into this one the hard way.

I built a real-time notification feature with SignalR, tested it locally, everything worked great. Then I scaled to two instances behind a load balancer, and notifications started disappearing for some users.

The code was fine. The problem was that SignalR connections are bound to the server process that accepted them. Each instance only knows about its own connections. So when an API request lands on Server 1 but the user is connected to Server 2, the notification just... doesn't get delivered.

This is the SignalR scale-out problem, and it bites almost everyone who goes from one instance to more.

Why SignalR Breaks When You Scale Out

With a single instance, everything just works.

SignalR single server deployment showing all clients connected to the same server.

The server holds the full map of who's connected, so sending a message to a user, a group, or all clients works because that map is complete.

But scale out to two or more instances, and that map fractures.

SignalR multi-server deployment showing clients connected to different servers.

Server 1 has no idea Client 3 or Client 4 even exist. So when an order status change happens on Server 1 and needs to reach Client 3, Server 1 checks its connection map, finds nothing, and the message is quietly dropped.

The Backplane Pattern

The fix is a backplane - a shared messaging layer that sits between all your server instances.

Every server publishes outgoing messages to a central channel, and every server subscribes to that same channel. When a message comes in, each server checks if any of its local connections should receive it.

SignalR backplane deployment showing clients connected to different servers.

When Server 1 wants to notify Client 3:

  1. Server 1 publishes the message to the backplane
  2. All servers receive the message
  3. Server 2 recognizes Client 3 as one of its connections and delivers the notification

From your code's perspective, it looks like every server can see every connection. Redis works really well for this because its Pub/Sub delivers messages to all subscribers in near real-time. And if you're already using Redis for distributed caching, you don't even need to spin up anything new.
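The publish/subscribe flow above can be sketched with an in-memory stand-in for Redis. The class and method names here are illustrative, not SignalR's actual API:

```python
class Backplane:
    """In-memory stand-in for Redis Pub/Sub: every server subscribes,
    and every published message fans out to all of them."""
    def __init__(self):
        self.servers = []

    def subscribe(self, server):
        self.servers.append(server)

    def publish(self, user, message):
        for server in self.servers:
            server.on_backplane_message(user, message)


class Server:
    def __init__(self, name, backplane):
        self.name = name
        self.connections = {}          # user -> inbox; local connections only
        self.backplane = backplane
        backplane.subscribe(self)

    def connect(self, user):
        self.connections[user] = []

    def send_to_user(self, user, message):
        # Never deliver directly: always publish to the backplane, so
        # whichever server holds the connection can do the delivery.
        self.backplane.publish(user, message)

    def on_backplane_message(self, user, message):
        if user in self.connections:   # ignore users connected elsewhere
            self.connections[user].append(message)


backplane = Backplane()
server1 = Server("s1", backplane)
server2 = Server("s2", backplane)

server2.connect("client3")                       # client3 lives on server 2
server1.send_to_user("client3", "order shipped") # sent from server 1
print(server2.connections["client3"])            # ['order shipped']
```

Server 1 has no local connection for client3, yet the message arrives, because the backplane fan-out lets Server 2 recognize and deliver it.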

Setting It Up

Let me walk through how I set this up in an order notification system. Clients connect to a SignalR hub, authenticate via JWT, and get real-time status updates when an order changes.

Install the NuGet Package

dotnet add package Microsoft.AspNetCore.SignalR.StackExchangeRedis

Register the Backplane

Chain .AddStackExchangeRedis() onto your AddSignalR() call:

builder.Services.AddSignalR()
    .AddStackExchangeRedis(builder.Configuration.GetConnectionString("cache")!);

If you're running with .NET Aspire, the Redis connection string is already registered via environment variables, so you can pull it straight from configuration. Just reference the same named connection:

builder.AddRedisDistributedCache("cache");

builder.Services.AddSignalR()
    .AddStackExchangeRedis(builder.Configuration.GetConnectionString("cache")!);

If you have multiple SignalR apps sharing the same Redis instance, you'll want to add a channel prefix. Otherwise, messages from one app will reach subscribers in every app on that Redis server.

builder.Services.AddSignalR()
    .AddStackExchangeRedis(connectionString, options =>
    {
        options.Configuration.ChannelPrefix = RedisChannel.Literal("OrderNotifications");
    });

The nice thing is that your IHubContext<> call site doesn't change at all. Clients.User(...) works the same whether you have one instance or ten - the backplane handles routing behind the scenes.

When I tested this with two replicas in .NET Aspire, I tagged each notification with the sending instance's ID. A client on Replica 1 received a notification stamped with Replica 2's ID, which confirmed the message was crossing instances through Redis.

The Sticky Sessions Requirement

This is something you should know before you set up the backplane: you still need sticky sessions.

The Redis backplane solves message routing, but it does not remove the need for sticky sessions.

SignalR's connection negotiation is a two-step process:

  1. The client sends a POST to /hub/negotiate to obtain a connection token
  2. The client uses that token to establish the WebSocket connection

Both requests must land on the same server. If your load balancer routes the negotiation to Server 1 but the WebSocket upgrade to Server 2, the connection fails.

SignalR connection negotiation showing the need for sticky sessions.

Make sure sticky sessions are enabled in your load balancer. Most load balancers support this via IP hash or cookie affinity - check the docs for whichever you're using.
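Cookie or IP-hash affinity boils down to a deterministic mapping from client to server. Here is a minimal IP-hash sketch of the idea, not any particular load balancer's actual algorithm:

```python
import hashlib

def pick_server(client_ip, servers):
    """Deterministic IP-hash affinity: the same client always maps to
    the same server, so the negotiate POST and the WebSocket upgrade
    land on the same instance."""
    digest = hashlib.sha256(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

servers = ["server-1", "server-2"]
negotiate = pick_server("203.0.113.7", servers)  # POST /hub/negotiate
upgrade = pick_server("203.0.113.7", servers)    # WebSocket connect
assert negotiate == upgrade                      # both hit the same server
```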

What Happens When Redis Goes Down?

One thing worth knowing: SignalR does not buffer messages when Redis is unavailable.

If Redis stops responding, any messages sent during the outage are simply lost. SignalR may throw exceptions, but your existing WebSocket connections stay open - clients don't get disconnected. Once Redis comes back, SignalR reconnects automatically.

For most real-time scenarios like order updates or live dashboards, this is fine. The next state change triggers a fresh notification anyway, or the user can just reload. If you're dealing with something more critical (financial data, operational alerts), you'll want a reconciliation strategy on reconnect or a durable queue running alongside.

Redis Backplane vs. Azure SignalR Service

If you're on Azure, the managed Azure SignalR Service is worth considering. It proxies all client connections through the service, so sticky sessions aren't required and your servers only hold a small number of constant connections to the service.

The Redis backplane is the better fit when you're self-hosted, latency-sensitive, or already running Redis. For everything else, Azure SignalR Service is the cleaner option.

Summary

Honestly, the Redis backplane is almost too simple to set up. One method call on AddSignalR() and your app goes from silently dropping messages to routing them across every instance. You don't need to make changes to your hub code, client code, or application logic.

Just remember two things:

  • You still need sticky sessions
  • Messages aren't buffered if Redis goes down temporarily

Get those two right and SignalR scales out just as smoothly as the rest of your stack.

If you want to go deeper on building real-time features and APIs in .NET, check out my Pragmatic REST APIs course.

Hope this was useful. See you next week.





File Based Document Repository with Version Control in .NET with TX Text Control

1 Share
In this article, we will explore how to implement a file-based document repository with version control in .NET using TX Text Control. This solution allows you to manage and track changes to your documents effectively.


2.7.1: Fix wsl stuck when misconfigured cifs mount presents (#14466)

  • detach terminal before running mount -a

  • Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI 175728472+Copilot@users.noreply.github.com

  • use _exit on error before execv in child process to avoid unintentional resource release

  • Add regression test

  • Fix clang format issue

  • fix all clang format issue

  • Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI 175728472+Copilot@users.noreply.github.com

  • resolve ai comments

  • move test to unit test

  • Fix string literal

  • Overwrite fstab to resolve pipeline missing file issue


Co-authored-by: Feng Wang wangfen@microsoft.com
Co-authored-by: Copilot Autofix powered by AI 175728472+Copilot@users.noreply.github.com
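The `_exit`-before-`execv` bullet above is about making sure a failed exec in the forked child never unwinds the parent's state. A Python sketch of the same pattern (path and exit code here are illustrative):

```python
import os

def spawn(path, argv):
    """Fork and exec. If exec fails in the child, call os._exit so no
    atexit handlers or buffered-IO flushes run in the child -- those
    resources belong to the parent and must not be released twice."""
    pid = os.fork()
    if pid == 0:                       # child
        try:
            os.execv(path, argv)
        except OSError:
            os._exit(127)              # exit immediately, no cleanup
    _, status = os.waitpid(pid, 0)     # parent waits for the child
    return os.waitstatus_to_exitcode(status)

# exec of a nonexistent binary: the child reports failure cleanly
assert spawn("/definitely/not/here", ["x"]) == 127
```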


Daily Reading List – March 20, 2026 (#746)


Project Hail Mary is maybe my favorite fiction book from the past few years. I read it twice, which I never do. So, I’m super excited to go see the movie tonight. If you see it too, let me know.

[article] What Is the PARK Stack? I know there have been other acronym-stacks since LAMP, but I struggle to remember them. This one is about PyTorch, AI models, Ray, and Kubernetes. Might stick.

[blog] Kubernetes v1.36 — Sneak Peek. Speaking of Kubernetes, there always seems to be another version around the corner. Scale to zero is interesting!

[article] Anthropic just shipped an OpenClaw killer called Claude Code Channels, letting you message it over Telegram and Discord. Neat. Expect a lot of “OpenClaw killers” this year as people experiment with multi-agent orchestrators.

[blog] How I overhauled my app UI in minutes with Stitch and AI Studio. Great example. Take those existing apps and let these smart tools help with redesign, rearchitecture, and deployment.

[article] 9 reasons Java is still great. Java is doing fine. I’m not sure it’s the default choice for many startups, but it’s well-established and constantly improving.

[blog] Next-gen caching with Memorystore for Valkey 9.0, now GA. If you like open software, fast performance, and reliable databases, Valkey could be on your radar.

[blog] Building an MCP Ecosystem at Pinterest. Here we go. Let’s get some real-world practices from users, not just messages from vendors and thought-leaders.

[blog] Streamline read scalability with Cloud SQL autoscaling read pools. One smart way to scale relational databases is to use read replicas. Now we’re offering a clean way to autoscale your read replicas without requiring any changes to your apps.

[article] State of JavaScript 2025: Survey Reveals a Maturing Ecosystem with TypeScript Cementing Dominance. This is a dense report, so InfoQ rolled up some of the highlights. But dig into the source material and see what stands out to you.

Want to get this update sent to you every day? Subscribe to my RSS feed or subscribe via email below:




Live from Replit HQ: Agent 4 Launch Pt. 1

Summary: Manny and Raymmar opened the doors at HQ for a two-hour behind-the-scenes look at our team, the energy after launch, and what Agent 4 actually means for builders. Manny demoed a taste-development app built in Agent 4 with a landing page, web app, and mobile-native version, all inside one unified project on the Infinite Canvas. Raymmar showed Replitopolis, a live 3D city pulling from our BigQuery data, where every building is a real user and every orange glow is a commit. Eight other team members joined the stream, from designers and engineers to our marketing team and Amjad, our CEO and co-founder.

Last week was Agent 4 launch week, and the team could finally breathe. Our very own community managers Manny Bernabe and Raymmar Tirado decided the best way to mark it was to bring the community inside HQ: pull teammates on camera, share what everyone has been building, and have fun along the way.
