Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

Azure Application Gateway Now Supports TCP and TLS Termination – A Game Changer?


Disclaimer: This post was originally published on Azure with Danidu and has been reproduced here with permission. You can find the original post here.

Big news from Microsoft! Azure Application Gateway has achieved general availability (GA) for TCP and TLS protocol termination. This is a significant enhancement that many of us have been waiting for. But what does this really mean, and is it the game changer we’ve been hoping for?

Let’s dive in!

What’s New?

Azure Application Gateway has traditionally been a Layer 7 (application layer) load balancer, handling HTTP, HTTPS, WebSockets, and HTTP/2 traffic. Now, with the GA release of TCP and TLS termination, Application Gateway can also function as a Layer 4 (transport layer) proxy.

In simple terms, Application Gateway can now support non-HTTP/HTTPS traffic!

What Does “Termination” Mean?

Application Gateway operates as a terminating proxy. Here’s how it works:

  1. Client Connection: A client establishes a TCP or TLS connection directly with Application Gateway using its frontend listener’s IP address and port
  2. Gateway Termination: Application Gateway terminates this incoming connection at the proxy
  3. New Backend Connection: The gateway establishes a separate new connection with one of the backend servers selected by its distribution algorithm
  4. Request Forwarding: The client’s request is forwarded to the backend server through this new connection
  5. Response Handling: The backend response is sent back to the client via Application Gateway

This differs from Azure Load Balancer, which is a pass-through load balancer where clients establish direct connections with backend servers. With Application Gateway, you get that extra layer of control and management capabilities.

Architecture: Before and After

Before: Limited to HTTP(S) Only

Previously, if you wanted to load balance non-HTTP traffic (like databases, custom TCP applications, or other protocols), you had to use:

  • Azure Load Balancer for Layer 4 traffic
  • Application Gateway for Layer 7 HTTP(S) traffic
  • Separate firewall solutions for security

This meant multiple entry points and more complex architectures. Not ideal!

Now: Unified Gateway for All Traffic

Now, Application Gateway can serve as a single endpoint for:

  • HTTP/HTTPS traffic (Layer 7)
  • TCP traffic (Layer 4)
  • TLS traffic (Layer 4)
  • WebSockets
  • HTTP/2

All through the same frontend IP address!

Key Capabilities

Let’s talk about what this new feature brings to the table.

1. Hybrid Mode Support

You can now use a single Application Gateway instance to handle both HTTP and non-HTTP workloads simultaneously. Configure different listeners for different protocols, all using the same gateway resource.

Use cases:

  • Front-end web traffic (HTTPS) + backend database connections (TCP/TLS)
  • API Gateway (HTTPS) + message queue connections (TCP)
  • Multiple services with different protocols behind one entry point

2. Flexible Backend Options

Your backends can be located anywhere:

  • Azure Resources: VMs, VM Scale Sets, App Services, Event Hubs, SQL databases
  • On-Premises Servers: Accessible via FQDN or IP addresses
  • Remote Services: Any accessible TCP/TLS endpoint

This is huge for hybrid cloud scenarios!

3. Centralized TLS Certificate Management

With TLS termination support, you can:

  • Offload TLS processing from backend servers
  • Manage certificates centrally through Application Gateway
  • Integrate with Azure Key Vault for secure certificate storage
  • Use custom domains with your own certificates (even from private CAs!)
  • Simplify compliance by managing certificates in one place

No more certificate sprawl across multiple servers.
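As a sketch of what centralized certificate management can look like in Bicep (the resource names, Key Vault secret URI, and identity wiring here are illustrative assumptions, not from the original configuration), a TLS listener references a certificate stored in Key Vault:

```bicep
// Sketch: TLS termination with a Key Vault certificate (illustrative names)
// Assumes the gateway has a user-assigned managed identity with
// secret 'Get' access on the Key Vault.
sslCertificates: [
  {
    name: 'tlsCert'
    properties: {
      // A versionless secret ID lets the gateway pick up rotated certificates
      keyVaultSecretId: 'https://my-keyvault.vault.azure.net/secrets/my-cert'
    }
  }
]
listeners: [
  {
    name: 'tlsListener'
    properties: {
      frontendIPConfiguration: {
        id: resourceId('Microsoft.Network/applicationGateways/frontendIPConfigurations', applicationGatewayName, 'appGatewayFrontendIP')
      }
      frontendPort: {
        id: resourceId('Microsoft.Network/applicationGateways/frontendPorts', applicationGatewayName, 'tlsPort')
      }
      protocol: 'Tls'
      sslCertificate: {
        id: resourceId('Microsoft.Network/applicationGateways/sslCertificates', applicationGatewayName, 'tlsCert')
      }
    }
  }
]
```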

4. Autoscaling

Application Gateway supports autoscaling up to 125 instances for both Layer 7 and Layer 4 traffic, ensuring your infrastructure scales with demand.

5. Private Gateway Support

TCP and TLS proxy works with private-only Application Gateway deployments, enabling isolated environments with enhanced security for sensitive workloads.
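For a private-only deployment, the frontend IP configuration uses a static private address from the gateway subnet instead of a public IP. A minimal sketch (the address and names are illustrative):

```bicep
// Sketch: private-only frontend for Application Gateway (illustrative values)
frontendIPConfigurations: [
  {
    name: 'appGatewayPrivateFrontendIP'
    properties: {
      privateIPAllocationMethod: 'Static'
      privateIPAddress: '10.0.1.10' // must fall within the gateway subnet
      subnet: {
        id: subnet.id
      }
    }
  }
]
```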

Is This a Game Changer?

The Good News

Yes, in many ways! This feature allows you to:

  1. Simplify Architecture: Use Application Gateway as the single entry point for all external traffic (both HTTP and non-HTTP)
  2. Reduce Costs: Potentially consolidate multiple load balancers into one solution
  3. Centralize Management: One place for routing, certificates, and backend health monitoring
  4. Hybrid Workloads: Support modern and legacy applications through the same gateway
  5. Custom Domains: Front any backend service with your custom domain name

The Reality Check

But here’s the important caveat: The WAF doesn’t inspect TCP/TLS traffic.

Critical Limitation: “A WAF v2 SKU gateway allows the creation of TLS or TCP listeners and backends to support HTTP and non-HTTP traffic through the same resource. However, it does not inspect traffic on TLS and TCP listeners for exploits and vulnerabilities.”

What this means:

  • WAF (Web Application Firewall) rules only protect HTTP(S) traffic
  • TCP and TLS traffic passes through without WAF inspection
  • Application Gateway provides routing and load balancing for TCP/TLS, not security inspection

Security Options for TCP/TLS Traffic:

Now, does this mean you must have Azure Firewall? Not necessarily! You have options:

  1. Network Security Groups (NSGs): You can use NSGs for basic Layer 3/Layer 4 security (IP-based filtering, port restrictions). This is often sufficient for many scenarios and is much more cost-effective.
  2. Azure Firewall: Provides advanced features like threat intelligence, FQDN filtering, and centralized logging. You get more capabilities, but it comes at a higher cost.
  3. No Additional Firewall: In some controlled environments (like private-only gateways with strict network segmentation), you might rely solely on Application Gateway + NSGs.

The Trade-off:

  • NSGs give you basic Layer 3/4 protection (IP filtering, port control) – similar to what you’d get with traditional firewall rules
  • Azure Firewall gives you advanced threat detection, deep packet inspection, and more sophisticated filtering
  • You get fewer features with just NSGs than with Azure Firewall, but for many workloads, that’s perfectly fine!

So even though we can pass traffic through Application Gateway, it will only process the HTTP/HTTPS traffic using its WAF. The TCP/TLS traffic? It just goes through without any inspection. Choose your additional security layer based on your requirements and budget!
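As a minimal sketch of the NSG option (the resource names and address ranges are mine, not from the post), an inbound rule on the Application Gateway subnet can restrict the SQL listener port to a known client range:

```bicep
// Sketch: NSG allowing SQL traffic only from a known client range
// (names and address ranges are illustrative)
resource appGwNsg 'Microsoft.Network/networkSecurityGroups@2023-11-01' = {
  name: 'appgw-tcp-nsg'
  location: location
  properties: {
    securityRules: [
      {
        name: 'Allow-Sql-From-KnownClients'
        properties: {
          priority: 100
          direction: 'Inbound'
          access: 'Allow'
          protocol: 'Tcp'
          sourceAddressPrefix: '203.0.113.0/24' // example client range
          sourcePortRange: '*'
          destinationAddressPrefix: '*'
          destinationPortRange: '1433'
        }
      }
    ]
  }
}
```

Keep in mind that v2 gateways with a public frontend also require inbound rules from the GatewayManager service tag on ports 65200-65535, so check the current documentation before locking the subnet down.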

Important Considerations

Before you jump in and start configuring TCP/TLS listeners, here are some things you need to know:

1. Connection Draining

  • Default draining timeout: 30 seconds (not user-configurable)
  • Any configuration update (PUT operation) will terminate active connections after this timeout
  • Plan your maintenance windows accordingly!

This is important – every time you make a configuration change, those active TCP connections will be dropped after 30 seconds. So be mindful when making changes in production.

2. AGIC Not Supported

Application Gateway Ingress Controller (AGIC) for Kubernetes does not support TCP/TLS proxy. AGIC works only with Layer 7 HTTP(S) listeners.

If you’re running Kubernetes and using AGIC, this feature won’t help you for non-HTTP workloads. Stick with HTTP(S) for AGIC scenarios.

3. SKU Requirements

TCP/TLS proxy is available only on:

  • Standard v2 SKU
  • WAF v2 SKU

Remember: On WAF v2, the firewall only protects HTTP(S) traffic, not TCP/TLS. You can use a WAF v2 SKU for this, but don’t expect WAF protection on your TCP/TLS traffic.

Configuration Overview

Let’s look at how to configure TCP/TLS proxy on Application Gateway. I’ll walk you through the components you need.

Required Components

  1. Frontend Listener
    • Protocol: TCP or TLS
    • Frontend IP: Public or private IPv4
    • Port: Application-specific (e.g., 1433 for SQL Server)
    • Priority: Required for routing rules
  2. Backend Pool
    • Target type: IP address or FQDN
    • Backend servers: Azure VMs, on-premises servers, or any accessible endpoint
  3. Backend Settings
    • Backend protocol: TCP or TLS
    • Backend port: Application-specific
    • Timeout: Configurable in seconds
  4. Routing Rule
    • Links listener to backend pool
    • Connects backend settings
    • Requires priority value

Example: SQL Server Configuration

Here’s a sample scenario for TCP traffic: SQL Server.

Listener Configuration:

  • Protocol: TCP
  • Port: 1433 (standard SQL Server port)
  • Frontend IP: Public or private

Backend Settings:

  • Protocol: TCP
  • Port: 1433
  • Timeout: 20 seconds

Backend Pool:

  • SQL Server VM IP addresses or FQDNs

Routing Rule:

  • Priority: 100
  • Links all components together

Note: You must select TCP or TLS in the listener settings before the corresponding options become available in the backend settings.

For detailed step-by-step GUI configuration, refer to the official Microsoft documentation.

Infrastructure as Code: Bicep Template

Alright, now for the fun part! If you prefer to automate your deployments using infrastructure as code (and you should!), here’s a working Bicep template to create an Application Gateway with TCP proxy support.

Important Note: This example uses native Bicep resources rather than Azure Verified Modules (AVM). Since the TCP/TLS proxy feature just reached GA, AVM support may still be maturing. Once you verify the latest AVM module version supports these new properties, you can migrate to AVM for cleaner, more maintainable code. Check the AVM Application Gateway module for updates.

This template creates an Application Gateway specifically configured for TCP traffic – perfect for scenarios like SQL Server load balancing.

// Application Gateway with TCP proxy support
resource applicationGateway 'Microsoft.Network/applicationGateways@2023-11-01' = {
  name: applicationGatewayName
  location: location
  tags: {
    Environment: environment
    Purpose: 'TCP Proxy Gateway'
  }
  properties: {
    // SKU - Must be Standard_v2 or WAF_v2 for TCP/TLS support
    sku: {
      name: 'Standard_v2'
      tier: 'Standard_v2'
    }

    // Autoscaling configuration
    autoscaleConfiguration: {
      minCapacity: 2
      maxCapacity: 10
    }

    // Gateway IP configuration - connects Application Gateway to subnet
    gatewayIPConfigurations: [
      {
        name: 'appGatewayIpConfig'
        properties: {
          subnet: {
            id: subnet.id
          }
        }
      }
    ]

    // Frontend IP configuration - the public IP clients connect to
    frontendIPConfigurations: [
      {
        name: 'appGatewayFrontendIP'
        properties: {
          publicIPAddress: {
            id: publicIp.id
          }
        }
      }
    ]

    // Frontend port - the port clients connect to
    frontendPorts: [
      {
        name: 'tcpPort'
        properties: {
          port: tcpListenerPort
        }
      }
    ]

    // Backend address pool - your actual backend servers
    backendAddressPools: [
      {
        name: 'tcpBackendPool'
        properties: {
          backendAddresses: [for ip in backendServerIPs: {
            ipAddress: ip
          }]
        }
      }
    ]

    // NEW: Backend settings for TCP traffic
    // This is where you configure the TCP protocol and backend port
    backendSettingsCollection: [
      {
        name: 'tcpBackendSettings'
        properties: {
          port: backendTcpPort
          protocol: 'Tcp'  // This is the key! Use 'Tcp' or 'Tls'
          timeout: 60      // Connection timeout in seconds
        }
      }
    ]

    // NEW: TCP listener (Layer 4)
    // This replaces httpListeners for TCP/TLS traffic
    listeners: [
      {
        name: 'tcpListener'
        properties: {
          frontendIPConfiguration: {
            id: resourceId('Microsoft.Network/applicationGateways/frontendIPConfigurations', applicationGatewayName, 'appGatewayFrontendIP')
          }
          frontendPort: {
            id: resourceId('Microsoft.Network/applicationGateways/frontendPorts', applicationGatewayName, 'tcpPort')
          }
          protocol: 'Tcp'  // TCP or TLS protocol
        }
      }
    ]

    // NEW: Routing rules for TCP traffic
    // This replaces requestRoutingRules for TCP/TLS traffic
    routingRules: [
      {
        name: 'tcpRoutingRule'
        properties: {
          ruleType: 'Basic'
          priority: 100  // Priority is required for routing rules
          listener: {
            id: resourceId('Microsoft.Network/applicationGateways/listeners', applicationGatewayName, 'tcpListener')
          }
          backendAddressPool: {
            id: resourceId('Microsoft.Network/applicationGateways/backendAddressPools', applicationGatewayName, 'tcpBackendPool')
          }
          backendSettings: {
            id: resourceId('Microsoft.Network/applicationGateways/backendSettingsCollection', applicationGatewayName, 'tcpBackendSettings')
          }
        }
      }
    ]
  }
}

Understanding the New TCP/TLS Properties

Let me highlight the key differences from traditional HTTP configuration:

1. Backend Settings Collection (new)

  • Replaces backendHttpSettingsCollection for TCP/TLS traffic
  • Uses protocol: 'Tcp' or protocol: 'Tls' instead of HTTP/HTTPS
  • Simpler configuration – no cookie affinity, no request timeout (uses connection timeout instead)

2. Listeners (new)

  • Replaces httpListeners for TCP/TLS traffic
  • Protocol field accepts Tcp or Tls
  • No hostname or requireServerNameIndication settings needed

3. Routing Rules (new)

  • Replaces requestRoutingRules for TCP/TLS traffic
  • Links TCP listeners to backend pools and settings
  • Must include priority value (100-20000)

Hybrid Mode: HTTP + TCP Together

Want to handle both HTTP and TCP traffic? You can combine both in one gateway:

Notice how HTTP uses httpListeners and requestRoutingRules, while TCP uses listeners and routingRules. They coexist peacefully in the same gateway!

// Add HTTP(S) components alongside the TCP configuration
properties: {
  // ... SKU, autoscaling, gateway IP, and frontend IP configuration ...

  frontendPorts: [
    {
      name: 'port_80'
      properties: {
        port: 80
      }
    }
    {
      name: 'port_1433'
      properties: {
        port: 1433
      }
    }
  ]

  backendAddressPools: [
    {
      name: 'web'
      properties: {
        backendAddresses: [] // web server IPs or FQDNs
      }
    }
    {
      name: 'sql'
      properties: {
        backendAddresses: [] // SQL Server IPs or FQDNs
      }
    }
  ]

  // Layer 7: HTTP settings, listener, and request routing rule
  backendHttpSettingsCollection: [
    {
      name: 'webHttpSettings'
      properties: {
        port: 80
        protocol: 'Http'
        requestTimeout: 20
      }
    }
  ]
  httpListeners: [
    {
      name: 'webListener'
      properties: {
        frontendIPConfiguration: {
          id: resourceId('Microsoft.Network/applicationGateways/frontendIPConfigurations', applicationGatewayName, 'appGatewayFrontendIP')
        }
        frontendPort: {
          id: resourceId('Microsoft.Network/applicationGateways/frontendPorts', applicationGatewayName, 'port_80')
        }
        protocol: 'Http'
      }
    }
  ]
  requestRoutingRules: [
    {
      name: 'webRoutingRule'
      properties: {
        ruleType: 'Basic'
        priority: 200
        httpListener: {
          id: resourceId('Microsoft.Network/applicationGateways/httpListeners', applicationGatewayName, 'webListener')
        }
        backendAddressPool: {
          id: resourceId('Microsoft.Network/applicationGateways/backendAddressPools', applicationGatewayName, 'web')
        }
        backendHttpSettings: {
          id: resourceId('Microsoft.Network/applicationGateways/backendHttpSettingsCollection', applicationGatewayName, 'webHttpSettings')
        }
      }
    }
  ]

  // Layer 4: TCP settings, listener, and routing rule
  backendSettingsCollection: [
    {
      name: 'sqlBackendSettings'
      properties: {
        port: 1433
        protocol: 'Tcp'
        timeout: 20
      }
    }
  ]
  listeners: [
    {
      name: 'sqlListener'
      properties: {
        frontendIPConfiguration: {
          id: resourceId('Microsoft.Network/applicationGateways/frontendIPConfigurations', applicationGatewayName, 'appGatewayFrontendIP')
        }
        frontendPort: {
          id: resourceId('Microsoft.Network/applicationGateways/frontendPorts', applicationGatewayName, 'port_1433')
        }
        protocol: 'Tcp'
      }
    }
  ]
  routingRules: [
    {
      name: 'sqlRoutingRule'
      properties: {
        ruleType: 'Basic'
        priority: 100
        listener: {
          id: resourceId('Microsoft.Network/applicationGateways/listeners', applicationGatewayName, 'sqlListener')
        }
        backendAddressPool: {
          id: resourceId('Microsoft.Network/applicationGateways/backendAddressPools', applicationGatewayName, 'sql')
        }
        backendSettings: {
          id: resourceId('Microsoft.Network/applicationGateways/backendSettingsCollection', applicationGatewayName, 'sqlBackendSettings')
        }
      }
    }
  ]
}

Use Cases and Scenarios

Let’s talk about some real-world scenarios where this feature shines:

1. Hybrid Application Architecture

Modern web applications (HTTPS) combined with legacy TCP-based services through a single gateway. This is perfect for those scenarios where you’re modernizing but still need to support that old legacy app that runs on TCP.

2. Custom Protocol Applications

Applications using proprietary TCP or TLS protocols that previously couldn’t leverage Application Gateway. Now you can bring them under the same umbrella.

3. Multi-Tier Applications

Front-end APIs (HTTP/HTTPS) and backend message queues or cache servers (TCP/TLS) all accessible through one entry point, simplifying your overall architecture.

4. Cross-Premises Connectivity

Load balance traffic to on-premises servers or remote services using FQDN or IP addressing. Great for hybrid cloud scenarios.

Comparison: Application Gateway vs. Load Balancer

Now you might be wondering, “Should I use Application Gateway or Azure Load Balancer for my TCP traffic?” Let me break it down:

Application Gateway (TCP/TLS) vs. Azure Load Balancer:

  • Type: terminating proxy vs. pass-through
  • Layer: Layer 4 & Layer 7 vs. Layer 4 only
  • Connection: separate frontend/backend connections vs. direct client-to-backend
  • Latency: moderate (proxy overhead) vs. microsecond-level
  • Throughput: high (up to 125 instances) vs. millions of flows
  • Certificate management: centralized with Key Vault integration vs. on backend servers
  • Autoscaling: yes (2-125 instances) vs. N/A (always available)
  • Custom domains: yes vs. no
  • Use case: versatility and centralized management vs. performance and simplicity

When to use Application Gateway:

  • You need centralized certificate management
  • You want a single-entry point for HTTP and non-HTTP traffic
  • You need custom domain support
  • You want autoscaling capabilities
  • Management and versatility are priorities

When to use Azure Load Balancer:

  • You need microsecond-level latency
  • You need to handle millions of flows
  • You want the simplest possible setup
  • Performance is the top priority
  • You don’t need certificate management

Best Practices

Based on my experience and the documentation, here are some best practices to follow:

1. Plan for Connection Draining

  • Be aware of the 30-second default timeout
  • Schedule configuration changes during maintenance windows
  • Test connection handling before production deployment
  • Consider the impact on long-running connections

2. Use Autoscaling

  • Configure appropriate min/max instance counts
  • Monitor metrics to tune autoscaling thresholds
  • Start with at least 2 instances for high availability
  • Be mindful of max instances to control costs

3. Leverage Azure Key Vault

  • Store TLS certificates in Key Vault
  • Use managed identities for Application Gateway to access certificates
  • Implement certificate rotation policies
  • Maintain proper certificate security practices

4. Implement Proper Monitoring

  • Enable diagnostic logs
  • Configure Azure Monitor alerts for health probe failures
  • Track connection metrics and backend response times
  • Set up dashboards for visibility

5. Security Considerations

  • Remember: WAF does not protect TCP/TLS traffic (I can’t stress this enough!)
  • At minimum, use Network Security Groups (NSGs) for Layer 3/4 IP-based filtering
  • Consider Azure Firewall if you need advanced threat protection, FQDN filtering, or centralized logging
  • Implement proper network segmentation regardless of your firewall choice
  • Use private endpoints where appropriate
  • Don’t assume Application Gateway gives you complete security – it’s a load balancer, not a firewall!

6. Hybrid Mode Design

  • Separate HTTP(S) and TCP/TLS workloads into different backend pools
  • Use priority-based routing for complex scenarios
  • Document your listener and routing rule configurations thoroughly
  • Maintain organized configurations as hybrid mode can become complex

Conclusion

The general availability of TCP and TLS termination on Azure Application Gateway is indeed a significant enhancement that brings several benefits to the table.

Key Advantages:

  • Single entry point for all traffic types
  • Centralized certificate and configuration management
  • Support for hybrid workloads (HTTP + non-HTTP)
  • Flexible backend options (Azure, on-premises, remote)
  • Autoscaling for both Layer 4 and Layer 7 traffic

Important Limitations:

  • WAF protection applies only to HTTP(S) traffic
  • You need additional security for TCP/TLS traffic (NSGs at minimum, Azure Firewall for advanced features)
  • AGIC (Kubernetes Ingress) not supported for TCP/TLS
  • Fixed 30-second connection draining timeout

So, is it a game changer?

For many scenarios, yes! This feature significantly simplifies architectures where you need to support both HTTP and non-HTTP workloads. However, it’s not a complete replacement for Azure Firewall or other security solutions. Think of it as a powerful addition to your toolbox rather than a silver bullet.

The ability to use a single Application Gateway instance as a unified entry point for diverse protocols is valuable for:

  • Cost optimization (consolidating multiple load balancers)
  • Simplified management (one place for routing and certificates)
  • Architectural flexibility (supporting legacy and modern apps together)

Just remember to complement it with appropriate security controls for comprehensive protection.

Overall, this is a welcome addition to Azure’s networking capabilities, and I’m excited to see how it evolves. The fact that we can now front non-HTTP services with Application Gateway opens up a lot of interesting architectural possibilities.


Platform Engineering at scale: Lessons from the conference circuit


Q4 is conference season in the US, with GitHub Universe, KubeCon (including ArgoCon and Platform Engineering Day), and AWS re:Invent all taking place in November and December.

In our latest episode of CD Office Hours, Bob Walker and Steve Fenton shared some of the topics that came up at these events. You can get some interesting insights into where the industry is right now with Platform Engineering. The TL;DR is that most organizations are on version 2.0 or 3.0 of their platforms, and there are some hard-won lessons to share along the way.

Differing audiences for GitHub Universe and KubeCon

It’s worth calling out an interesting observation that Bob noticed between the audiences at GitHub Universe and KubeCon. The audience at GitHub Universe has a much higher ratio of organization leadership to individual contributors compared to KubeCon. This leads to more strategic conversations at GitHub Universe and more tactical conversations at KubeCon.

However, the same themes resonated at both events: Platform Hub, templating, and blueprints and guardrails. Key initiatives at many organizations include seeking solutions to standardize their deployment processes while maintaining flexibility.

KubeCon + GitHub Universe intro

Platform 2.0 and 3.0: Learning the hard way

Platform Engineering Day is an event that happens on Day 0 of KubeCon week, and Octopus Deploy is one of its sponsors.

It was at Platform Engineering Day that the theme of teams working on the second or third iteration of their internal platforms really came to light.

Bob describes the typical theme that came up in conversation: “They’ve kind of like, oh, we’ve tried this and we’ve run into this issue. It’s like, okay, now we need to kind of pull this back and rebuild it a bit because it’s just not working out for what we have.”

It’s essential not to frame this as a failure story, but rather to view it as a maturity story. The first iteration of the platform works well for the initial teams that adopt it. However, when reality hits, and you need to scale to hundreds or thousands of teams, new requirements and considerations come into play. Version 2.0, 3.0, and beyond are then born.

Challenges and insights from early adopters

The early adopters of a new platform usually fall into a few categories:

  • They’re more forward-thinking and cutting-edge
  • They’re often working on greenfield applications
  • They can get away with shortcuts that don’t scale

Bob explains: “When you start rolling it out at scale to hundreds or thousands of teams, you’re probably just not going to be able to do those shortcuts. And so that’s where people are starting to see some of these challenges at scale.”

Scaling challenges: Insights from early teams

With one of the major events of the last few weeks being KubeCon, it’s no surprise that many Platform Engineering conversations centered around Kubernetes and Argo CD. Some organizations are opting to standardize on GitOps solutions, but this presents some challenges.

While Argo CD is built around the concept of declarative GitOps deployments, there are some gaps that enterprises often bump into:

  • Limited environment concept: You cannot natively model dev, staging, and production in Argo CD - they are often set up as separate applications.
  • No concept of environment progression: Given the above point, you cannot push immutable versioned releases through your environment lifecycle.
  • Limited orchestration: There is limited support for pre and post-deploy steps. E.g. ServiceNow integration, database deployments, etc.

All of these gaps become more apparent when teams move beyond the early adopters’ ‘happy path’ deployments.

The ‘How many Argos?’ question

How do the above Argo pains manifest themselves when talking to attendees at an event like KubeCon?

There’s usually a sign in how they answer the ‘How many Argos?’ question.

Steve explains: “If you have one Argo, it is actually superbly simple and you just don’t have many problems. But as soon as you have an answer that’s greater than one, you start bumping into those things where it gets a bit more tricky.”

The answer also isn’t limited to the number of Argo instances in play. How many applications you have within an Argo instance can also reveal a few curveballs, as Bob Walker discovered: “We have one Argo instance, but we have 30,000 applications in that Argo instance.”

Argo CD: Scaling challenges and best practices explained

Bob goes on to explain that the team in question has adopted a classic hub-and-spoke model. One Argo CD instance that is talking to multiple Kubernetes clusters.

This approach simplifies where you look for things as they are all contained within one Argo CD instance, but it does create security concerns, as Steve explains, “You’re effectively giving people this one room. If you can get into this one room, all the vaults come off of that one room, and you just have access to everything.”

Bob’s response brings us back to what it means from the perspective of platform architecture decisions: “Does it work for you? Does it satisfy your company’s policies and requirements?”

Which, as Steve reinforces, often leads to discovering that “people don’t know what their organizational policy and requirements are. So it’s kind of back to the drawing board.”

Argo CD: Hub-and-spoke model risks

Make policies mandatory, not platforms

That segues us nicely across the Atlantic Ocean to a London DevOps meetup where the topic of Platform Engineering was still high on the agenda.

Steve walked us through the conversation around his presentation at London DevOps that challenges the typical approach to platform adoption.

The typical approach: Make the platform mandatory

The better approach: Make policies mandatory, the platform is optional

The compliance disconnect

Steve describes the typical pattern that happens within an organization: “Teams that are on the platform have got all of this compliance and security stuff, all of these requirements, and they’re hitting them because they’re on the platform. And if you’re a team that’s not on the platform, you kind of get away with it. You just don’t do them.”

That leads to teams avoiding the platform so they can avoid compliance requirements and take some shortcuts. If the goal is changed to ensure everyone meets the compliance requirements, the platform should naturally become a more appealing option.

Creating demand instead of mandates

Steve explains how this mindset shift can make an impact: “Instead of making your platform mandatory, you need to make your policies mandatory. Every pipeline should include security scanning and artifact attestation or whatever it is. That’s what should be mandatory. But if the team solves it without the platform, you’re happy.”

Mandatory policies over platform: A smart strategy?

This approach creates natural demand for the platform. Teams realize: “Yeah, we can do CI/CD. But then there’s all these other things that we don’t really want to do, but if we use the platform, we’ll get those for free.”

AI is now everywhere - where does it fit in the conversation with platform teams?

The bridge between AI and platform teams was a common pain point: platform teams are dealing with an increasing number of ‘support tickets’ for the platform.

Bob describes the problem: “When I talked to some Platform Engineers at Platform Engineering Day, they’re like, yes, that happens a lot more than we think because we’ve effectively become a ticketing system. We have all these templates, and we’re the ones supplying them to the developers. And if something goes wrong immediately, they just turn around and say, this isn’t working, fix it.”

AI-powered triage and remediation

The solution focuses on enabling developers to self-serve and resolve common issues:

  1. AI interprets deployment logs to explain why a step failed
  2. AI provides recommendations for fixing the issue
  3. Identifies when it’s a transient error and suggests retrying it
  4. Escalates only when it’s a template bug requiring platform team intervention

Bob explains: “Providing that self-service remediation like, yes, this was caused because there was a network issue. Go ahead and retry it. I retried it, it worked. Happy days. I didn’t have to bug anyone about that.”

This use of AI fits perfectly in the non-deterministic failure resolution space. Steve notes: “That’s effectively what you would be going and searching this stuff up online and trawling through it to find answers. So it can shortcut that process.”

Self-service remediation: Empowering developers

The deterministic versus non-deterministic line

When it comes to CI/CD and Platform Engineering, there’s a clear boundary between where AI can help and where AI really doesn’t belong.

Many vendors are falling over themselves with AI-powered everything messaging, but with little substance underneath the buzzwords. Our conversation on the usefulness of AI centered around non-deterministic and deterministic tasks.

Non-deterministic tasks (AI helps):

  • Failure analysis and remediation recommendations
  • Prospecting and research
  • Getting started on scripts or configurations
  • Documentation summaries

Deterministic tasks (AI doesn’t belong):

  • Deployment execution
  • Build processes
  • Compliance attestation
  • Anything requiring audit trails

Bob emphasizes: “When it comes back to CI/CD, I’m not going to use it to generate a complete CI/CD pipeline, or have AI make the determination as to what steps to run. I want that to be deterministic. I want it to be consistent every single time. Also, it has to be deterministic if you have to have any sort of compliance like SOX compliance, Dodd-Frank, PCI, HIPAA, because you have to attest to those things.”

The realistic expectation

Bob summarizes the practical approach: “Go into it going, I think what it’s going to produce is 90% there, 80% correct, but I still need to check the other 20%. I think you’re okay with doing something like that. Use AI to speed up the non-thinking parts of your day - the repetitive, all that extra stuff - but learn how things work.”

Catch the full episode in the video below.

Full episode

Happy deployments!

Unlocking the Power of Web with Copilot Chat’s New URL Context

There are many scenarios where Copilot Chat can feel limited by the built-in model training data. Maybe you want guidance on the latest web framework, documentation, or project-specific resources—but Copilot’s responses just aren’t specific enough. For developers who rely on up-to-date or esoteric answers, this gap can be a real frustration.  

URL Context: Bringing the web into Copilot Chat 

With the new URL context feature, Copilot Chat can now access and use information directly from web pages you specify. By pasting a URL into your Copilot Chat prompt, you empower Copilot to pull real-time, relevant information from the source. This means more tailored responses.

How to use reference URLs in Copilot Chat 

Getting started is easy. When you need Copilot to answer using a specific web resource, simply paste the desired URL into your chat prompt. Copilot will then process the contents of that page, giving you context-aware answers that go beyond its original training. This opens the door to more personalized support. 

Limitations to keep in mind 

While the URL context feature is powerful, there are a few things to keep in mind. Copilot's ability to extract and understand content depends on the accessibility of the web page and the clarity of its information. Some sites require authentication or serve dynamic content that limits Copilot's ability to read the page. Always review responses for accuracy and completeness, especially when referencing complex or highly technical sources.

Check out the new Visual Studio Hub 

Stay connected with everything Visual Studio in one place! Visit the Visual Studio Hub for the latest release notes, YouTube videos, social updates, and community discussions. 

Appreciation for your feedback 

Your feedback helps us improve Visual Studio, making it an even more powerful tool for developers. We are immensely grateful for your contributions and look forward to your continued support. By sharing your thoughts, ideas, and any issues you encounter through Developer Community, you help us improve and shape the future of Visual Studio. 

The post Unlocking the Power of Web with Copilot Chat’s New URL Context appeared first on Visual Studio Blog.

Growing Yourself as a Software Engineer, Using AI to Develop Software

Sharing your work as a software engineer inspires others, invites feedback, and fosters personal growth, Suhail Patel said at QCon London. Normalizing and owning incidents builds trust, and it supports understanding the complexities. AI enables automation but needs proper guidance, context, and security guardrails.

By Ben Linders

We Got Claude to Fine-Tune an Open Source LLM


Software in the Age of AI

In 2025 AI reshaped how teams think, build, and deliver software. We’re now at a point where “AI coding assistants have quickly moved from novelty to necessity [with] up to 90% of software engineers us[ing] some kind of AI for coding,” Addy Osmani writes. That’s a very different world to the one we were in 12 months ago. As we look ahead to 2026, here are three key trends we have seen driving change and how we think developers and architects can prepare for what’s ahead.

Evolving Coding Workflows

New AI tools changed coding workflows in 2025, enabling developers to write and work with code faster than ever before. This doesn’t mean AI is replacing developers. It’s opening up new frontiers to be explored and skills to be mastered, something we explored at our first AI Codecon in May.

AI tools in the IDE and on the command line have revived the debate about the IDE’s future, echoing past arguments (e.g., VS Code versus Vim). It’s more useful to focus on the tools’ purpose. As Kent Beck and Tim O’Reilly discussed in November, developers are ultimately responsible for the code their chosen AI tool produces. We know that LLMs “actively reward existing top tier software engineering practices” and “amplify existing expertise,” as Simon Willison has pointed out. And a good coder will “factor in” questions that AI doesn’t. Does it really matter which tool is used?

The critical transferable skill for working with any of these tools is understanding how to communicate effectively with the underlying model. AI tools generate better code if they’re given all the relevant background on a project. Managing what the AI knows about your project (context engineering) and communicating it (prompt engineering) are going to be key to doing good work.

The core skills for working effectively with code won’t change in the face of AI. Understanding code review, design patterns, debugging, testing, and documentation and applying those to the work you do with AI tools will be the differential.

The Rise of Agentic AI

With the rise of agents and Model Context Protocol (MCP) in the second half of 2025, developers gained the ability to use AI not just as a pair programmer but as an entire team of developers. The speakers at our Coding for the Agentic World live AI Codecon event in September 2025 explored new tools, workflows, and hacks that are shaping this emerging discipline of agentic AI.

Software engineers aren’t just working with single coding agents. They’re building and deploying their own custom agents, often within complex setups involving multi-agent scenarios, teams of coding agents, and agent swarms. This shift from conducting AI to orchestrating AI elevates the importance of truly understanding how good software is built and maintained.

We know that AI generates better code with context, and this is also true of agents. As with coding workflows, this means understanding context engineering is essential. However, the differential for senior engineers in 2026 will be how well they apply intermediate skills such as product thinking, advanced testing, system design, and architecture to their work with agentic systems.

AI and Software Architecture

We began 2025 with our January Superstream, Software Architecture in the Age of AI, where speaker Rebecca Parsons explored the architectural implications of AI, dryly noting that “given the pace of change, this could be out of date by Friday.” By the time of our Superstream in August, things had solidified a little more and our speakers were able to share AI-based patterns and antipatterns and explain how they intersect with software architecture. Our December 9 event will look at enterprise architecture and how architects can navigate the impact of AI on systems, processes, and governance. (Registration is still open—save your seat.) As these events show, AI has progressed from being something architects might have to consider to something that is now essential to their work.

We’re seeing successful AI-enhanced architectures using event-driven models, enabling AI agents to act on incoming triggers rather than fixed prompts. This means it’s more important than ever to understand event-driven architecture concepts and trade-offs. In 2026, topics that align with evolving architectures (evolutionary architectures, fitness functions) will also become more important as architects look to find ways to modernize existing systems for AI without derailing them. AI-native architectures will also bring new considerations and patterns for system design next year, as will the trend toward agentic AI.
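
The trigger-driven pattern above can be illustrated with a minimal sketch. The `EventBus`, the `incident.opened` event, and the `summarize_incident` handler are hypothetical; in a real system the handler would invoke an AI agent rather than append to a list:

```python
from collections import defaultdict
from typing import Callable, Dict, List

class EventBus:
    """Minimal publish/subscribe bus: events trigger handlers as they arrive."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        # Fan the event out to every subscribed handler.
        for handler in self._handlers[event_type]:
            handler(payload)

actions: List[str] = []

def summarize_incident(payload: dict) -> None:
    # A real handler would call out to an AI agent here; we just record the trigger.
    actions.append(f"summarize incident for {payload['service']}")

bus = EventBus()
bus.subscribe("incident.opened", summarize_incident)
bus.publish("incident.opened", {"service": "checkout"})
print(actions)  # → ['summarize incident for checkout']
```

The point of the pattern is that the agent reacts to events flowing through the system rather than waiting for a fixed prompt, which is why event-driven design concepts and their trade-offs matter for AI-enhanced architectures.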

As was the case for their engineer coworkers, architects still have to know the basics: when to add an agent or a microservice, how to consider cost, how to define boundaries, and how to act on the knowledge they already have. They also need to understand how an AI element relates to other parts of their system: What are the inputs and outputs? And how can they measure performance, scalability, cost, and other cross-functional requirements?

Companies will continue to decentralize responsibilities across different functions this year, and AI brings new sets of trade-offs to be considered. It’s true that regulated industries remain understandably wary of granting access to their systems. They’re rolling out AI more carefully with greater guardrails and governance, but they are still rolling it out. So there’s never been a better time to understand the foundations of software architecture. It will prepare you for the complexity on the horizon.

Strong Foundations Matter

AI has changed the way software is built, but it hasn’t changed what makes good software. As we enter 2026, the most important developer and architecture skills won’t be defined by the tool you know. They’ll be defined by how effectively you apply judgment, communicate intent, and handle complexity when working with (and sometimes against) intelligent assistants and agents. AI rewards strong engineering; it doesn’t replace it. It’s an exciting time to be involved.


Join us at the Software Architecture Superstream on December 9 to learn how to better navigate the impact of AI on systems, processes, and governance. Over four hours, host Neal Ford and our lineup of experts including Metro Bank’s Anjali Jain and Philip O’Shaughnessy, Vercel’s Dom Sipowicz, Intel’s Brian Rogers, Microsoft’s Ron Abellera, and Equal Experts’ Lewis Crawford will share their hard-won insights about building adaptive, AI-ready architectures that support continuous innovation, ensure governance and security, and align seamlessly with business goals.

O’Reilly members can register here. Not a member? Sign up for a 10-day free trial before the event to attend—and explore all the other resources on O’Reilly.


