
From latency to instant: Modernizing GitHub Issues navigation performance


When you’re working through a backlog—opening an issue, jumping to a linked thread, then back to the list—latency isn’t just a metric. It’s a context switch. Even small delays add up, and they hit hardest at the exact moments developers are trying to stay in flow. It’s not that GitHub Issues was “slow” in isolation; it’s that too many navigations still paid the cost of redundant data fetching, breaking flow again and again.

Earlier this year, we set out to fix that—not by chasing marginal backend wins, but by changing how issue pages load end-to-end. Our approach was to shift work to the client and optimize perceived latency: render instantly from locally available data, then revalidate in the background. To make that work, we built a client-side caching layer backed by IndexedDB, added a preheating strategy to improve cache hit rates without spamming requests, and introduced a service worker so cached data remains usable even on hard navigations.

In this post, we’ll walk through how the system works and what changed in practice. We’ll cover the metric we optimized for; the caching and preheating architecture; how the service worker speeds up navigation paths that used to be slow; and the results across real-world usage. We’ll also dig into the tradeoffs—because this approach isn’t free—and what still needs to happen to make “fast” the default across every path into Issues. If you’re building a data-heavy web app, these patterns are directly transferable: you can apply the same model to reduce perceived latency in your own system without waiting for a full rewrite.

The speed of thought: Web performance in 2026

In 2026, “fast enough” is not a competitive bar. For developer tools, latency is product quality. When someone is triaging multiple issues, reviewing a feature request or reporting a bug, every avoidable wait breaks flow.

Modern local-first tools and aggressively optimized clients have moved the standard from “loads in a second” to “feels instant.” In this world, users do not benchmark us against old web apps. They benchmark us against the fastest experience they use every day.

GitHub Issues is not a small surface area. Every week millions of people around the world rely on Issues to keep their codebase running smoothly. As Issues also becomes the planning layer for AI-assisted work, perceived performance becomes even more critical: if the loop between intent and feedback is slow, the entire system feels slow.

We heard the same problems from both internal teams and the community: Issues felt too heavy compared to tools built with speed as a first principle. The bottleneck was not feature depth or correctness. It was architecture and request lifecycle. Too many common paths still paid the full cost of server rendering, network fetches, and client boot, even when data had effectively been seen before.

Our Issues Performance team’s job was to close that gap. The objective was straightforward and technical: redesign data flow and navigation behavior so the product feels instant by default.

Before changing architecture, we needed to align on what “fast” means in user terms and how to measure it. Generic page metrics are useful, but they are not sufficient for a complex product surface like Issues.

We use HPC (Highest Priority Content), an internal metric closely aligned with Web Vitals LCP, to measure when the primary content (the content users care about) on the page is first rendered. Like LCP, this is anchored to a single HTML element selected by the browser, which on issue pages is most often the issue title or the issue body. If that element is rendered quickly, the experience feels responsive even if non-critical page regions are still loading.

Operationally, we bucket navigations using HPC thresholds:

  • Instant: HPC < 200 ms
  • Fast: HPC < 1000 ms
  • Slow: HPC >= 1000 ms

These thresholds give us a practical model for user-perceived speed, not just raw backend latency. The <200 ms bucket maps to interactions that feel immediate in real workflows, while the <1000 ms bucket captures experiences that are still acceptable but no longer invisible to users.

This is also the point at which our measurement philosophy evolved. Historically, we dedicated significant effort to tracking the p90 and p99 of HPC and minimizing the worst tail of the distribution. While this work remains important, it does not inherently ensure that the product feels fast for the majority of users. It is possible to improve the p99 of HPC while still leaving the median experience feeling sluggish.

For this initiative, we shifted focus toward distribution quality: how many navigations land in our fast and instant buckets across the whole population? The goal is not just fewer terrible outliers. It’s to make speed the default path for the majority of sessions.
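To make this concrete, here is a small TypeScript sketch of the bucketing and the distribution-quality view described above (the function names are illustrative, not our production instrumentation):

type HpcBucket = "instant" | "fast" | "slow";

function bucketForHpc(hpcMs: number): HpcBucket {
  if (hpcMs < 200) return "instant"; // feels immediate
  if (hpcMs < 1000) return "fast";   // acceptable, but no longer invisible
  return "slow";
}

// Distribution quality: what share of navigations lands in each bucket?
function bucketShares(hpcSamplesMs: number[]): Record<HpcBucket, number> {
  const counts: Record<HpcBucket, number> = { instant: 0, fast: 0, slow: 0 };
  for (const ms of hpcSamplesMs) counts[bucketForHpc(ms)]++;
  const total = hpcSamplesMs.length || 1;
  return {
    instant: counts.instant / total,
    fast: counts.fast / total,
    slow: counts.slow / total,
  };
}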

The baseline: Navigation mix before we changed anything

Before implementing optimizations, we needed a clear model of how users were actually reaching issues#show (the route for viewing an issue). Treating all navigations as one class of traffic would hide the real bottlenecks.

We identified three primary navigation types:

  • Hard navigation: a full browser load (cold start or refresh) where we pay the full cost of network, server rendering, asset loading, JavaScript boot and React hydration.
  • Turbo navigation: a Rails Turbo transition that updates targeted page regions without a full reload. It avoids some hard-navigation overhead but still depends heavily on server-rendered responses.
  • Soft navigation (React): a client-side transition inside the existing React runtime, where we can often avoid full page bootstrap costs.

Our measured distribution at the start of the workstream was:

Graph showing navigation mix for issues show route (57.6% hard, 37.5% react).

That distribution made one thing obvious: the dominant path was also the slowest. Any strategy focused only on React soft navigations could improve part of the experience, but it could not move overall perceived performance enough on its own.

Graph showing HPC distribution by navigation type (2.05 hard, 1.76 turbo, 1.04 react).

This baseline shaped our next architecture decisions: improve the fast paths and reduce the hard-navigation penalty, because that’s where most users were seeing the most latency.

One thing to note: GitHub is still in the middle of moving from Rails-rendered pages to a React frontend. During that transition, many user journeys cross the Rails/React boundary. When that happens—for example, navigating from a Rails page into Issues—the browser often has to do a full hard navigation and cold boot. That boundary crossing is a big reason hard navigations made up the largest share of our baseline.

We expect that share of hard navigations to decrease over time as more surfaces become React-native. But we could not wait for platform migration alone to solve our problem. We started by optimizing React soft navigations first, where we had immediate architectural leverage and could ship improvements quickly.

Once we aligned on the target, our strategy became clear: build a local-first application model with stale-while-revalidate. That means rendering immediately from locally available data to minimize user-visible latency, then asynchronously revalidating against the server and reconciling the UI if newer data exists.

Step 1: Client-side caching with IndexedDB

We started where we had the most leverage and where we want to move most traffic in the future: React soft navigations. In this path, the runtime is already alive, so the dominant cost is usually data fetch latency, not application boot. If we could remove network from repeated visits, we could move a large slice of traffic into the instant bucket.

Our pre-workstream analysis showed a strong repeated-access pattern: users reopen the same issues frequently during triage and collaboration loops. Based on that behavior, we estimated a potential cache-hit ratio of roughly 30% for issues#show and used that as the initial viability threshold.

Architectural diagram showing the client cache layer.

The implementation was to extend our current in-memory store with a persistent client cache in IndexedDB.

Why we chose IndexedDB for this layer:

  • Durable browser storage that survives tab closes and browser restarts, unlike memory-only stores.
  • Indexed object-store model, which gives efficient key-based lookups for issue query payloads.
  • Larger practical quota than localStorage, making it appropriate for real working sets.

On top of that storage layer, we implemented stale-while-revalidate semantics:

  • Read path: on soft navigation, attempt to hydrate from local cache first and render immediately.
  • Revalidation path: issue a background network request for freshness and reconcile the in-memory store if data changed.
  • Failure behavior: when the network is degraded, users still get a usable page from cache, with freshness reconciled once connectivity recovers, giving us a new graceful-degradation model.

The architectural point is that this is not “cache or correctness.” It is latency-first rendering with asynchronous consistency checks on the same navigation.
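As a rough illustration of that read/revalidate flow, here is a TypeScript sketch of a stale-while-revalidate read path (the payload shape and helper functions are assumptions for the example, not our actual code):

interface IssuePayload {
  updatedAt: number; // last-modified timestamp used for reconciliation
  body: string;
}

// Stand-ins for the IndexedDB layer and the network/render plumbing.
declare function readFromIndexedDb(key: string): Promise<IssuePayload | undefined>;
declare function writeToIndexedDb(key: string, value: IssuePayload): Promise<void>;
declare function fetchIssue(key: string): Promise<IssuePayload>;
declare function render(issue: IssuePayload): void;

async function loadIssue(issueKey: string): Promise<void> {
  const cached = await readFromIndexedDb(issueKey);
  if (cached) render(cached); // latency-first: paint immediately from local data

  try {
    const fresh = await fetchIssue(issueKey); // background revalidation
    if (!cached || fresh.updatedAt > cached.updatedAt) {
      render(fresh); // reconcile the UI only if the server has newer data
    }
    await writeToIndexedDb(issueKey, fresh);
  } catch {
    // Degraded network: the cached render stays usable; freshness
    // is reconciled on a later navigation once connectivity recovers.
  }
}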

Initial production results validated the model. After broad rollout to all users, approximately 22% of React navigations became instant—up from 4% pre-launch—representing about 15% of total request volume. Observed cache-hit ratio landed around one-third (~33%), which was consistent with the earlier revisit analysis.

Graph showing HPC distribution after cache rollout.

The main tradeoff is controlled staleness. We measured server/cache divergence at about 4.7% and treated that as an explicit operating envelope: acceptable for the perceived speed gains on soft navigations, with safeguards to limit user-visible inconsistency.

Moving the needle on cache-hit ratios

Caching is only as good as its cache-hit ratio. The IndexedDB-backed SWR (Stale-While-Revalidate) layer gave us a strong first step, but a one-third hit rate also exposed the next limitation: most navigations still arrived before the data did.

The naive answer was obvious: prefetch every likely next issue as early as possible. We explored that direction and quickly ran into the real constraint, which was not implementation complexity but capacity. On high-fanout surfaces such as issue lists, dashboards, and projects, eager prefetching amplifies request volume, creates N+1-style access patterns and pushes unnecessary compute onto the system for pages a user may never open.

So we changed the objective. Instead of trying to make prefetched data always fresh, we optimized for a cheaper and more scalable condition: make sure some usable data is already local by the time the user clicks.

Flow diagram showing the preheating process: look at the issues index; for each issue in the list, trigger a preheat request; if the issue’s data is already present in the client cache, stop; otherwise, fetch the data and add it to IndexedDB.

That is preheating. Preheating proactively walks high-intent issue references and prepares cache entries ahead of navigation, but it only hits the network when the issue is not already present in the client cache. If usable data already exists, preheating stops. This makes it fundamentally different from traditional preloading. It is cache-population logic, not freshness-enforcement logic.
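A minimal sketch of that rule, with hypothetical helper names, looks like this; preheating is purely cache-population, so a cache hit ends the work for that issue:

declare function isInClientCache(key: string): Promise<boolean>;
declare function fetchIssue(key: string): Promise<unknown>;
declare function writeToIndexedDb(key: string, value: unknown): Promise<void>;

async function preheat(issueKeys: string[]): Promise<void> {
  for (const key of issueKeys) {
    if (await isInClientCache(key)) continue; // usable data exists: stop here
    const payload = await fetchIssue(key);    // only cache misses hit the network
    await writeToIndexedDb(key, payload);
  }
}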

This is an explicit tradeoff between freshness and capacity usage. We are willing to serve data that may be slightly stale if that allows the navigation itself to complete near-instantaneously, because once the user opens the issue, we can still revalidate in the background and converge to the latest server state.

To support that model efficiently, we introduced an in-memory cache layer in front of IndexedDB. IndexedDB gives persistence across tabs and sessions, but it is still asynchronous and therefore not free on the critical path. The in-memory layer sits between the active in-memory store and persistent storage, allowing hot issue payloads to be served synchronously without paying even the IndexedDB read cost. In practice, this removes another async boundary from soft navigation and materially increases the probability of rendering directly from memory.
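Sketched in TypeScript, the read-through behavior of that two-tier cache might look like this (names are illustrative):

declare function readFromIndexedDb(key: string): Promise<unknown | undefined>;

const memoryCache = new Map<string, unknown>();

async function readThroughCache(key: string): Promise<unknown | undefined> {
  const hot = memoryCache.get(key);
  if (hot !== undefined) return hot; // hot hit: no IndexedDB read cost at all

  const persisted = await readFromIndexedDb(key); // async, but survives restarts
  if (persisted !== undefined) memoryCache.set(key, persisted); // promote to hot tier
  return persisted;
}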

Diagram showing the in-memory cache layer.

Operationally, preheating is triggered from high-intent surfaces such as issue lists, dashboards, projects, and dependency views. Requests run on low-priority workers, are strictly rate-limited and are guarded by circuit breakers, so the mechanism backs off under pressure. User-initiated work always takes precedence over speculative fetches, allowing us to avoid the noisy-neighbor problem and keep the system stable while still improving cache-hit ratios for real user navigations.
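The guardrails can be as simple as a concurrency cap plus a failure-driven circuit breaker; the sketch below uses illustrative thresholds rather than our real tuning:

const MAX_CONCURRENT = 2; // assumed cap: keep preheats from crowding out user traffic
const FAILURE_LIMIT = 5;  // assumed breaker threshold

let inFlight = 0;
let recentFailures = 0;

async function guardedPreheat(
  key: string,
  preheatOne: (k: string) => Promise<void>,
): Promise<void> {
  if (recentFailures >= FAILURE_LIMIT) return; // breaker open: back off under pressure
  if (inFlight >= MAX_CONCURRENT) return;      // strict rate limit on speculative work
  inFlight++;
  try {
    await preheatOne(key);
    recentFailures = 0; // success closes the breaker
  } catch {
    recentFailures++;
  } finally {
    inFlight--;
  }
}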

Graph showing HPC distribution after preheating rollout.

The result was a large shift in distribution. After rolling out preheating broadly, instant navigations for issues#show increased to roughly 30% overall. For React navigations specifically, up to ~70% became instant. Cache-hit ratio rose to roughly 96%.

That tradeoff was acceptable. We spent a small amount of controlled background capacity to move a large percentage of real user navigations out of the network-bound path.

Expanding the fast path: Optimizing turbo and hard navigations

We were happy with the React navigation gains, but soft navigations aren’t the whole story. Even as more of GitHub moves from Rails to React, hard navigations will always exist—refreshes, new tabs, direct URLs, and inbound links. Those cold starts still matter, so we wanted cached data to help there too.

The mechanism we chose was a service worker.

A service worker is a browser-managed script that runs outside the page itself and can intercept network requests before they reach the server. Conceptually, it sits between the browser and the origin as a programmable middleman. That makes it one of the few web platform primitives that can influence hard navigations without requiring the page’s JavaScript runtime to already be active.

For issues#show, our service worker extends the same local-first model we built for React navigations. When the browser starts a navigation request for an issue page, the service worker intercepts it and checks whether the issue data is already available in local cache. If it is, the worker annotates the outgoing request with a specific header that tells the server it can skip a substantial amount of work.

Diagram showing service worker interception flow.

When the service worker detects a cache hit, it signals to the server via a request header. From there, the navigation splits into two paths:

  • Cache hit path: return a thin HTML shell (layout + minimal markup + JS), and let React render from the locally cached issue payload.
  • Cache miss path: return the normal response (server loads data and SSRs the page).

This is a strict optimization: if the cache is cold, stale, or the service worker isn’t available, behavior falls back to the standard server-rendered path.
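In service-worker terms, the interception can be sketched like this (the header name and cache check are illustrative; the real wiring differs):

// Runs in a service worker context (TypeScript "webworker" lib).
declare const self: ServiceWorkerGlobalScope;
declare function issueDataIsCached(url: string): Promise<boolean>;

self.addEventListener("fetch", (event: FetchEvent) => {
  if (event.request.mode !== "navigate") return; // only intercept hard navigations

  event.respondWith((async () => {
    if (await issueDataIsCached(event.request.url)) {
      const headers = new Headers(event.request.headers);
      headers.set("X-Client-Cache-Hit", "1"); // hypothetical header: skip data load + SSR
      // A navigation request can't be re-constructed with modified headers,
      // so re-issue it by URL with the annotation attached.
      return fetch(event.request.url, { headers, credentials: "include" });
    }
    return fetch(event.request); // cache miss: standard server-rendered path
  })());
});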

This had an especially strong effect on Turbo navigations, because Turbo paths are still heavily constrained by server response time. Once the service worker can signal that issue data is already present, the server spends much less time computing the application fragment, and Turbo benefits almost immediately from that reduction in backend work.

Graph showing HPC distribution for Turbo navigations after service worker rollout.

Hard-navigation gains are real, but they are less immediately visible than Turbo gains: on cache-hit hard navigations, we trade SSR time for client-side rendering, so the critical path becomes JavaScript download and execution.

To reduce that cost, we split code by route using React.lazy and dynamic route preloading, so only the code required for the current route is fetched up front. We apply the same principle at the component level, loading only what’s necessary for the initial view and deferring non-critical modules. For example, we only fetch the issue editor bundle when a user enters edit mode, and use intent-based prefetching (like hover) to hide that latency without bloating the initial bundle.
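As a sketch of that pattern ("./IssueEditor" is a hypothetical module path):

import { lazy } from "react";

// Component-level split: the editor bundle is only fetched on demand.
const IssueEditor = lazy(() => import("./IssueEditor"));

// Intent-based prefetch: start the download on hover so that entering
// edit mode doesn't pay the network cost on the critical path.
function prefetchIssueEditor(): void {
  void import("./IssueEditor");
}

// Usage: <button onMouseEnter={prefetchIssueEditor}>Edit</button>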

Distribution showing HPC for Hard navs.

The results

After deploying these changes, we wanted to step back and look at the cumulative impact. We analyzed the HPC metric across the entire rollout period—from the initial IndexedDB cache through preheating, in-memory layering, and the service worker—and the trend is clear and sustained: the distribution is shifting toward fast.

Chart showing the HPC drop over various percentiles.

Rather than cherry-pick a single good week, we looked at the full window to share some concrete wins from recent months. Below are the HPC percentiles across all issues#show traffic:

  • P10: ~600 ms → 70 ms — the fastest navigations moved firmly into the instant bucket, well below 200 ms.
  • P25: ~800 ms → 120 ms — a quarter of all navigations now complete in under 120 ms, down from nearly a full second.
  • P50: ~1,200 ms → 700 ms — the median experience crossed below the one-second threshold, moving from the slow bucket into fast.
  • P75: 1,800 ms → 1,400 ms — the upper quartile dropped by over 400 ms, shrinking the long tail of perceptible latency.
  • P90: 2,400 ms → 2,100 ms — even the slowest navigations improved, though this tail remains the clearest signal of where further work is needed.

The pattern that stands out is the outsized improvement in the lower percentiles. P10 and P25 compressed dramatically because cached and preheated navigations now dominate that part of the distribution. The median improved meaningfully but is still shaped by cold-start traffic. And the upper tail, while better, reflects the hard-navigation paths where JavaScript boot and client rendering are now the bottleneck—exactly the area we are targeting next.

Numbers tell the optimization story, but what ultimately matters is the user impact. The video below shows what these changes feel like in practice—navigating between issues at full speed in a real session:

The work ahead

GitHub Issues is faster today than it has ever been. Across soft navigations, preheated paths, and service-worker-accelerated flows, we have materially changed the distribution of user-perceived latency and moved a much larger share of traffic into the instant bucket.

At the same time, we are not done. Cold starts that rely on SSR are still a real hurdle, especially when client boot and JavaScript execution become the dominant cost after server work is reduced.

The next phase is about moving bigger rocks. We are planning targeted rewrites of parts of our backend stack optimized explicitly for low-latency delivery and are investing in a modern UI delivery layer closer to the edge to reduce round trips and improve response time further.

Performance remains a continuous systems investment, not a one-time project. The architecture is improving, the bottlenecks are changing, and we will keep iterating until fast is the default experience across all navigation paths.

Check out the Quickstart guide for GitHub Issues >

The post From latency to instant: Modernizing GitHub Issues navigation performance appeared first on The GitHub Blog.


Securing Azure apps with Aspire enterprise networking


Network security has a funny way of showing up late.

You start with a web app, an API, storage, and Key Vault. The app works. The demo works. Everyone is happy. Then the production checklist shows up:

  • Can storage and secrets be taken off the public internet?
  • Can this app run inside our virtual network?
  • What outbound IPs do we give to the partner allowlist?
  • Can we prove only the right subnets can talk to the right services?
  • What happens when someone accidentally opens a storage account to the world?

That is usually the point where the application model and the infrastructure model start drifting apart. The app says “I need storage and Key Vault.” The networking template says “I have seven subnets, two private DNS zones, a NAT gateway, a network security group, and a growing collection of comments explaining why nobody should touch any of it.”

Aspire makes this a lot nicer. The new Azure networking support lets you describe the network shape next to the resources that use it. Virtual networks (VNets), subnets, delegated subnets, NAT gateways, private endpoints, Network Security Groups (NSGs), and Network Security Perimeters (NSPs) can all be modeled from your AppHost.

This post is not going to pretend every enterprise network is simple. They are not. But the common building blocks are easier to reason about once you know what each one is for.

The short version

Here is a rough mental model for each Azure networking feature:

  • Virtual network: use it when you need a private address space for the app; it is the network boundary your subnets, private endpoints, and routing attach to.
  • Subnet: use it when you need to separate workloads inside the VNet; it gives each part of the system its own address range and policy surface.
  • Delegated subnet: use it when a platform service needs to manage a subnet, like Azure Container Apps (ACA); it lets the service place its managed infrastructure in your VNet safely.
  • NAT gateway: use it when you need predictable outbound public IPs; it gives outbound traffic a stable address for allowlists and auditing.
  • Private endpoint: use it when you want a platform as a service (PaaS) resource reachable privately; it puts a private IP for that service inside your VNet and removes public exposure.
  • NSG: use it when you need subnet-level traffic rules; it is the basic allow/deny control for traffic entering or leaving a subnet.
  • NSP: use it when you need PaaS-layer guardrails across services; it lets you lock down Azure PaaS services to approved networks or subscriptions instead of the entire internet, even when those services still need public network access.

The important part is that these are not competing features. A good production setup usually uses several of them together.

If you want the docs open while you read, start with Azure security best practices for Aspire deployments and the Aspire Azure Virtual Network integration reference.

Start with the network shape

Aspire 13.2 introduced the Aspire Azure Virtual Network integration. It is provided by the Aspire.Hosting.Azure.Network package, and it is the piece that lets your AppHost describe Azure networking primitives directly. Add it with:

aspire add azure-network

Then you can start modeling the network in code:

#:package Aspire.Hosting.Azure.AppContainers@13.3.0
#:package Aspire.Hosting.Azure.Network@13.3.0

var builder = DistributedApplication.CreateBuilder(args);

var network = builder.AddAzureVirtualNetwork(
    "app-network",
    addressPrefix: "10.20.0.0/16");

var appsSubnet = network.AddSubnet("apps", "10.20.0.0/23");
var privateEndpointSubnet = network.AddSubnet(
    "private-endpoints",
    "10.20.10.0/27");
var dataSubnet = network.AddSubnet("data", "10.20.20.0/24");

That is already useful. You have a single place where the application and its network layout live together.

The VNet is the private address space. The subnets are where you start drawing boundaries. It’s a good idea to keep separate subnets for:

  • compute, such as Azure Container Apps
  • private endpoints
  • data or service-specific infrastructure
  • shared egress, if the environment needs a dedicated outbound path

Could you put everything in one subnet? Sometimes. Should you? Usually no. The minute you want different routing, different NSG rules, or a clean way to explain the design to your security team, separate subnets pay for themselves.

Azure Container Apps wants a delegated subnet

ACA can run in your virtual network. When you do that, the Container Apps environment needs a subnet that is delegated to Microsoft.App/environments.

In Aspire, that becomes part of the same resource graph:

var containerApps = builder.AddAzureContainerAppEnvironment("apps")
    .WithDelegatedSubnet(appsSubnet);

builder.AddProject<Projects.Api>("api");

builder.AddProject<Projects.Web>("web");

That WithDelegatedSubnet call is doing something important. It tells Azure that this subnet is for the managed Container Apps environment, not a random subnet where anything can be dropped later.

This is one of those network details that is easy to miss when you are writing infrastructure by hand. ACA owns infrastructure inside that subnet. It needs room to scale. It needs the delegation. And you generally should not mix private endpoints or unrelated resources into the same subnet.

My practical rule: give ACA its own delegated subnet and size it with growth in mind. The Azure Container Apps docs have recommendations for choosing the subnet size.

Use NAT gateways when someone asks, “What IP does this come from?”

Many apps eventually call something outside Azure:

  • a payment provider
  • a legacy API
  • a partner endpoint
  • a corporate firewall
  • a SaaS service with an IP allowlist

If that external system needs to allowlist your outbound traffic, “whatever IP Azure happens to use today” is not a satisfying answer.

A NAT gateway gives outbound traffic from a subnet a stable public IP.

var builder = DistributedApplication.CreateBuilder(args);

var network = builder.AddAzureVirtualNetwork(
    "app-network",
    addressPrefix: "10.20.0.0/16");
var appsSubnet = network.AddSubnet("apps", "10.20.0.0/23");

var egressIp = builder.AddPublicIPAddress("egress-ip");

var natGateway = builder.AddNatGateway("egress")
    .WithPublicIPAddress(egressIp);

appsSubnet.WithNatGateway(natGateway);

Now resources in appsSubnet have a deterministic outbound path.

This does not make the app private by itself. It is about egress, not ingress. Use it when outbound identity matters: partner allowlists, firewall rules, centralized auditing, or incident response.

One small warning: NAT gateway is not a policy engine. It does not decide which domains your app is allowed to call. It gives the traffic a predictable public IP. Pair it with routing, NSGs, Azure Firewall, or other controls when you need actual outbound filtering.

Private endpoints are for taking PaaS off the public internet

Private endpoints are one of the biggest wins for securing Azure resources. Instead of reaching a storage account or Key Vault through a public endpoint, Azure gives that service a private IP inside your VNet.

Aspire models this from the subnet that will hold the private endpoint:

var builder = DistributedApplication.CreateBuilder(args);

var network = builder.AddAzureVirtualNetwork(
    "app-network",
    addressPrefix: "10.20.0.0/16");
var privateEndpointSubnet = network.AddSubnet(
    "private-endpoints",
    "10.20.10.0/27");

var storage = builder.AddAzureStorage("storage");
var blobs = storage.AddBlobs("documents");
var keyVault = builder.AddAzureKeyVault("secrets");

privateEndpointSubnet.AddPrivateEndpoint(blobs);
privateEndpointSubnet.AddPrivateEndpoint(keyVault);

Aspire handles the annoying parts that usually come with private endpoints:

  • creating the private endpoint resource
  • creating and linking the private DNS zone
  • wiring DNS zone groups
  • configuring the target resource to deny public network access when deployed

That last point is the one that matters the most. A private endpoint that still leaves the public endpoint wide open is only half a security story.

Storage and Key Vault are great examples because they are so common. Almost every production app needs durable data and secrets, and neither one needs to be broadly reachable from the public internet.

Private endpoints are the right tool when the question is:

Can this service be reachable only through my private network?

If the answer should be yes, reach for a private endpoint.

NSGs are the subnet traffic rules

Network Security Groups (NSGs) are the familiar allow/deny rules for traffic entering or leaving a subnet. They are not glamorous, but they are still one of the most useful controls in Azure networking.

Aspire gives you shorthand helpers on subnets:

using Azure.Provisioning.Network;
...

appsSubnet
    .AllowInbound(
        port: "443",
        from: AzureServiceTags.AzureLoadBalancer,
        protocol: SecurityRuleProtocol.Tcp)
    .DenyInbound(from: AzureServiceTags.Internet);

For more explicit control, you can create an NSG resource and attach it:

using Azure.Provisioning.Network;
...

var appNsg = builder.AddNetworkSecurityGroup("apps-nsg")
    .WithSecurityRule(new AzureSecurityRule
    {
        Name = "allow-https-from-load-balancer",
        Priority = 100,
        Direction = SecurityRuleDirection.Inbound,
        Access = SecurityRuleAccess.Allow,
        Protocol = SecurityRuleProtocol.Tcp,
        SourceAddressPrefix = AzureServiceTags.AzureLoadBalancer,
        SourcePortRange = "*",
        DestinationAddressPrefix = "*",
        DestinationPortRange = "443"
    });

appsSubnet.WithNetworkSecurityGroup(appNsg);

Use NSGs for broad traffic rules at the subnet boundary. For example, you can:

  • allow HTTPS traffic to the app subnet only from Azure’s load-balancing infrastructure
  • deny direct inbound traffic from the internet to subnets that should not accept it
  • keep SSH, RDP, and other admin ports closed unless you have an explicit management path
  • limit traffic between subnets so one compromised workload cannot freely reach everything else

NSGs are stateful L3/L4 rules. They are not a replacement for authentication, authorization, WAF policies, or application-level security. Think of them as the network’s first “nope” before traffic gets anywhere near your code.

NSPs protect PaaS resources as a group

Private endpoints are fantastic, but they are not the only way to think about PaaS security. NSPs add a logical boundary around PaaS resources. Instead of only asking “what subnet is this traffic from?”, an NSP lets you group resources like Storage, Key Vault, Cosmos DB, and SQL and define access rules for the perimeter.

Aspire 13.3 adds first-class NSP support:

var builder = DistributedApplication.CreateBuilder(args);

var perimeter = builder.AddNetworkSecurityPerimeter("data-boundary")
    .WithAccessRule(new AzureNspAccessRule
    {
        Name = "allow-corp-network",
        Direction = NetworkSecurityPerimeterAccessRuleDirection.Inbound,
        AddressPrefixes = { "203.0.113.0/24" }
    });

var secrets = builder.AddAzureKeyVault("secrets")
    .WithNetworkSecurityPerimeter(
        perimeter,
        NetworkSecurityPerimeterAssociationAccessMode.Learning);

var database = builder.AddAzureCosmosDB("catalog")
    .WithNetworkSecurityPerimeter(perimeter);

var storage = builder.AddAzureStorage("storage")
    .WithNetworkSecurityPerimeter(perimeter);

Learning mode is powerful because you can attach a resource to the perimeter and observe what would be blocked before flipping to Enforced. That gives you a safer rollout path: measure first, then tighten.

NSPs are not a replacement for VNets or private endpoints. They complement them. Use NSPs when you want a PaaS-layer guardrail that follows the resource grouping, not just the network path.

Good places to use NSPs:

  • A storage account or Key Vault that should be reachable by approved apps, build agents, or corporate networks, but not the whole internet.
  • Shared data services used by apps in different VNets or subscriptions, where private endpoints alone do not describe every allowed caller.
  • Production hardening work where Learning mode lets you find legitimate callers before you start blocking traffic.

Putting it together

The smaller snippets above are focused on one idea at a time. If you want a copy/paste starting point, use this complete AppHost shape. It includes the package directives and using statements needed for the networking APIs.

#:sdk Aspire.AppHost.Sdk@13.3.0
#:package Aspire.Hosting.Azure.AppContainers@13.3.0
#:package Aspire.Hosting.Azure.KeyVault@13.3.0
#:package Aspire.Hosting.Azure.Network@13.3.0
#:package Aspire.Hosting.Azure.Storage@13.3.0

#pragma warning disable ASPIREAZURE003

using Aspire.Hosting.Azure;
using Azure.Provisioning.Network;

var builder = DistributedApplication.CreateBuilder(args);

var network = builder.AddAzureVirtualNetwork(
    "app-network",
    addressPrefix: "10.20.0.0/16");

var appsSubnet = network.AddSubnet("apps", "10.20.0.0/23");
var privateEndpointSubnet = network.AddSubnet(
    "private-endpoints",
    "10.20.10.0/27");

var egressIp = builder.AddPublicIPAddress("egress-ip");
var natGateway = builder.AddNatGateway("egress")
    .WithPublicIPAddress(egressIp);

appsSubnet
    .WithNatGateway(natGateway)
    .AllowInbound(
        port: "443",
        from: AzureServiceTags.AzureLoadBalancer,
        protocol: SecurityRuleProtocol.Tcp)
    .DenyInbound(from: AzureServiceTags.Internet);

var containerApps = builder.AddAzureContainerAppEnvironment("apps")
    .WithDelegatedSubnet(appsSubnet);

var perimeter = builder.AddNetworkSecurityPerimeter("data-boundary")
    .WithAccessRule(new AzureNspAccessRule
    {
        Name = "allow-corp-network",
        Direction = NetworkSecurityPerimeterAccessRuleDirection.Inbound,
        AddressPrefixes = { "203.0.113.0/24" }
    });

var storage = builder.AddAzureStorage("storage")
    .WithNetworkSecurityPerimeter(perimeter);
var blobs = storage.AddBlobs("documents");

var keyVault = builder.AddAzureKeyVault("secrets");

privateEndpointSubnet.AddPrivateEndpoint(keyVault);

builder.AddProject("api", "api/api.csproj")
    .WithReference(blobs)
    .WithReference(keyVault);

builder.Build().Run();

A simplified diagram for that AppHost looks like this:

network architecture image

Putting the network in the AppHost does not remove the need for good network design. It makes the design visible at the same level as the app itself. In the example above, the API references storage and Key Vault, while the network model says how those resources are exposed, where the app runs, what subnet ACA uses, what outbound path it takes, and which resource is protected by the perimeter.

How to roll this out

If you already have an Aspire app and want to harden the Azure side, don’t try to do everything at once. Instead, try breaking each piece into its own step:

  1. Add a VNet and give your app a real subnet plan.
  2. Move Azure Container Apps into a delegated subnet.
  3. Add a private endpoint subnet.
  4. Put Storage and Key Vault behind private endpoints.
  5. Add a NAT gateway if anything outside Azure needs stable outbound IPs.
  6. Add NSG rules that document and enforce the subnet boundaries.
  7. Add NSPs for PaaS resources, start in Learning mode, then move to Enforced when the logs look right.

That sequence keeps each step understandable. It also gives you a clean rollback story if you discover some dependency you did not know existed.

A few mistakes to avoid

Do not put private endpoints in the ACA delegated subnet. Give private endpoints their own subnet. It keeps the design cleaner and avoids mixing platform-managed compute infrastructure with service endpoints.

Do not make your subnets too small. ACA scales. Private endpoints accumulate. Future you will not be impressed by a perfectly packed IP plan that has no room for the next service.

Do not use NAT gateway as if it were a firewall. It gives you stable outbound IPs. It does not decide whether calling example.com is okay.

Do not assume private endpoints automatically solve every access path. Check the target resource’s public network access settings and DNS behavior. Aspire helps here, but the architecture still needs to be intentional.

Do not jump straight to enforced NSP rules on a busy production environment. Learning mode exists for a reason.

Try it

Upgrade to Aspire 13.3 and start small:

aspire update --self
aspire update

Then start with the network shape: add a VNet, decide which subnets you need, and give Azure Container Apps its own delegated subnet. From there, add private endpoints, NAT gateway, NSG rules, and NSPs as needed.

Networking can still be complex. But with Aspire, it does not have to be a separate story from the app. That is the part that changes the day-to-day experience: the secure architecture is right there in the code you already use to describe the system.

The post Securing Azure apps with Aspire enterprise networking appeared first on Aspire Blog.


.NET Agent Skills


Python 3.15 Is Feature Frozen


Segment Heap support for C++ projects in Visual Studio


Visual Studio 2026 version 18.6 makes it easier to take advantage of modern Windows memory management improvements. Segment Heap is a modern heap implementation in Windows that delivers stronger protection against common memory vulnerabilities, higher allocation throughput, lower memory fragmentation, better scalability across cores, and more predictable performance under load. Starting with this release, new C++ projects are now configured to use Segment Heap by default.

Onboarding your project to use the Segment Heap

New C++ projects come with Segment Heap enabled by default. For existing projects, follow the steps below to enable it.

For MSBuild solution-based C++ projects, the project property is located at Project -> Properties -> Manifest Tool -> Input and Output -> Enable Segment Heap. You can opt into the segment heap on a per-project basis, allowing you to onboard at your own pace.

Property Pages

For CMake users, Visual Studio provides a helper script, SegmentHeap.cmake, that integrates Segment Heap into your build automatically. If you manage your configuration through CMakePresets, you can enable Segment Heap by setting CMAKE_PROJECT_TOP_LEVEL_INCLUDES. Optionally, you can use the VS_SEGMENT_HEAP_ALLOWLIST and VS_SEGMENT_HEAP_EXCLUDE environment variables in the same preset to control which targets opt in:

{
    "name": "foo",
    "displayName": "Foo",
    "inherits": "",
    "environment": {
        "VS_SEGMENT_HEAP_ALLOWLIST": "target1;target2;",
        "VS_SEGMENT_HEAP_EXCLUDE": "target3;"
    },
    "cacheVariables": {
        "CMAKE_PROJECT_TOP_LEVEL_INCLUDES": "$env{VSINSTALLDIR}Common7/IDE/CommonExtensions/Microsoft/CMake/cmake/Microsoft/SegmentHeap.cmake"
    }
}

In this example, the optional VS_SEGMENT_HEAP_ALLOWLIST variable limits Segment Heap to target1 and target2, while the optional VS_SEGMENT_HEAP_EXCLUDE variable keeps it disabled for target3. This gives you fine-grained control over which targets in the project use Segment Heap when you need it.

Segment Heap integration is designed to coexist cleanly with existing toolchains and build configurations. It integrates into the standard linker + manifest tool flow, and it avoids introducing custom build steps or requiring changes to your toolchain configuration. This design ensures that Segment Heap adoption is low-risk and does not interfere with existing build logic.

How to check if Segment Heap is enabled

You can verify whether Segment Heap is enabled by checking the final application manifest embedded in your executable. Open the executable directly in Visual Studio and inspect the RT_MANIFEST resource for the following entry:

<heapType>SegmentHeap</heapType>

This indicates that the Segment Heap is active for your application.

Alternatively, you can open a Developer Command Prompt for Visual Studio, extract the embedded manifest with mt.exe, and then open the generated manifest file in Visual Studio to locate the same entry.

For example:

mt.exe -inputresource:YourApp.exe;#1 -out:YourApp.manifest

Because Segment Heap is enabled via manifest embedding, the presence of this declaration in the final binary confirms that the feature is in effect.

Get started and share your feedback

We encourage you to download Visual Studio 2026 version 18.6 Stable to start using Segment Heap in your C++ projects. Whether you’re creating a new project or onboarding an existing one, we’d love to hear how it goes. You can reach us through Help > Send Feedback in the Visual Studio IDE or by posting on Developer Community.

The post Segment Heap support for C++ projects in Visual Studio appeared first on C++ Team Blog.


A constant-space linear-time algorithm for deleting all but the 10 most recent files in a directory

1 Share

Say you have a directory full of files, and you want to delete all but the 10 most recent files. Is there a way to tell Find­First­File to enumerate the files in date order?

No, there is no way to tell Find­First­File to enumerate the files in date order. The files enumerated by Find­First­File are produced in whatever order the file system driver wants. For example, FAT typically enumerates them in the order the files appear in the directory listing, which could be in order of creation if the files were added sequentially, or some mishmash order if there were renames or deletions mixed in.

Since you can’t control the order in which the files are enumerated, you’ll have to do the sorting yourself. The naïve solution is to read in all the entries, sort them by last-modified date, and then delete all but the last ten. This is O(n) space and O(n log n) running time.

But you can do better.

This job calls for a priority queue. A priority queue is a data structure that supports these operations, where n is the number of items in the priority queue.

  • Add sorted: O(log n)
  • Find largest: O(1)
  • Remove largest: O(log n)

The above description is for a max-priority queue. There is also a min-priority queue where the final two operations are “find smallest” and “remove smallest”. The two versions are equivalent because you can just use a reverse-sense comparison to switch from one to the other.

What we can do is enumerate all the files and add them one by one to a min-priority queue sorted by modified date. The priority queue holds the newest items. If the priority queue size exceeds 10, then we delete the file corresponding to the “smallest” (earliest) entry in the priority queue, and then remove that entry from the priority queue.

Since the priority queue size has a fixed cap, all of the operations run in O(1) time because the value of n is bounded by a predetermined constant. (Of course, the larger the cap, the larger the constant in O(1).) The overall algorithm then runs in O(n) time, where n is the number of files in the directory.

Here’s a sketch of a solution. To get a min-priority heap, we have to reverse the sense of the comparison in dateAscending.

constexpr int files_to_keep = 10;

auto dateAscending = [](const WIN32_FIND_DATA& a, const WIN32_FIND_DATA& b) {
    return CompareFileTime(&a.ftLastWriteTime, &b.ftLastWriteTime) > 0;
};

std::priority_queue<WIN32_FIND_DATA,
        std::vector<WIN32_FIND_DATA>, decltype(dateAscending)>
        names(dateAscending);

WIN32_FIND_DATA wfd;
wil::unique_hfind findHandle( FindFirstFileW(L"*.*", &wfd));
if (findHandle.is_valid())
{
    do
    {
        if (wfd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) {
            // Skip directories
            continue;
        }

        names.push(wfd);
        if (names.size() > files_to_keep) {
            // The queue now holds files_to_keep + 1 entries, and the top
            // is the oldest. Delete that file and evict it from the queue.
            DeleteFileW(names.top().cFileName);
            names.pop();
        }
    } while (FindNextFileW(findHandle.get(), &wfd));
}

It’s unfortunate that std::priority_queue doesn’t have a deduction guide that deduces the Comparator. We have to specify it explicitly, and since it comes after the Container, we have to write out the container type manually instead of allowing it to be deduced.

It’s also unfortunate that it’s hard to call reserve() on the vector hiding inside the priority_queue. This means that the names.push() could throw an exception. At least we use an RAII type (wil::unique_hfind) to ensure that the find handle is not leaked.

If you have access to std::inplace_vector, you could use a

std::priority_queue<WIN32_FIND_DATA,
        std::inplace_vector<WIN32_FIND_DATA, files_to_keep + 1>,
        decltype(dateAscending)> names(dateAscending);

to avoid memory allocations entirely. (It also makes it clearer that the algorithm is constant-space.)

This is an example of a so-called online algorithm, an algorithm that does its work incrementally rather than requiring all of the input before it can start working.

Exercise: What if the task was to delete the 10 oldest files?

The post A constant-space linear-time algorithm for deleting all but the 10 most recent files in a directory appeared first on The Old New Thing.
