Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

A dual strategy: legal action and new legislation to fight scammers

An overview of how Google is taking legal action, supporting strong bipartisan legislation and releasing new features to fight scammers.

Microsoft adds new screen recording capabilities to Sysinternals ZoomIt

Microsoft’s suite of Sysinternals tools is an interesting collection. While not as exciting – or as frequently updated – as PowerToys, it is home to a number of incredibly useful utilities, especially for power users and system administrators. But there are also instances of crossover, and the screen recording tool ZoomIt is a perfect example. Although ZoomIt has been merged into the PowerToys utility suite, a standalone version remains available for Sysinternals fans who want to steer clear of PowerToys. In an interesting move, Microsoft has pushed an update to the standalone version of ZoomIt with new features which are… [Continue Reading]

Announcing Microsoft cloud security benchmark v2 (public preview)


Overview

Since its first introduction in 2019 as the Azure Security Benchmark, succeeded by the Microsoft cloud security benchmark ("the Benchmark") announced in 2023, the Benchmark has been widely used by our customers to secure their Azure environments. It serves as a security bible and toolkit for Azure security implementation planning and for meeting compliance with various industry and government regulatory standards.

 

What’s new?

We’re thrilled to announce the Microsoft cloud security benchmark v2 (public preview), a new version of the Benchmark with enhancements in the following areas:

  • Adding artificial intelligence security to our scope to address the threats and risks in this emerging domain.
  • Expanding the previous basic control guidelines into a more comprehensive, risk- and threat-based control guide with more granular technical implementation examples and reference details.
  • Expanding the Azure Policy based control measurements from ~220 to ~420, covering new security controls and adding measurements to existing ones.
  • Expanding the control mappings to more industry regulatory standards, such as NIST CSF, PCI DSS v4, and ISO 27001.
  • Aligning with SFI (Secure Future Initiative) objectives to introduce Microsoft internal security best practices to our customers.

 

Microsoft Defender for Cloud update

In addition, the Benchmark dashboard will soon be embedded in Microsoft Defender for Cloud, with 200+ additional Azure Policy definitions mapped to the respective controls, allowing you to monitor your Azure resources against the controls in the Benchmark.

Value proposition recap

Please also refer to How Microsoft cloud security benchmark helps you succeed in your cloud security journey to learn more about the value proposition of the Microsoft cloud security benchmark.


Real‑Time AI Streaming with Azure OpenAI and SignalR


TL;DR

We’ll build a real-time AI app where Azure OpenAI streams responses and SignalR broadcasts them live to an Angular client. Users see answers appear incrementally, just like ChatGPT, while Azure SignalR Service handles scale. You’ll learn the architecture, streaming code, Angular integration, and optional enhancements like typing indicators and multi-agent scenarios.

Why This Matters

Modern users expect instant feedback. Waiting for a full AI response feels slow and breaks engagement. Streaming responses:

  • Reduces perceived latency: Users see content as it’s generated.
  • Improves UX: Mimics ChatGPT’s typing effect.
  • Keeps users engaged: Especially for long-form answers.
  • Scales for enterprise: Azure SignalR Service handles thousands of concurrent connections.

What you’ll build

  • A SignalR Hub that calls Azure OpenAI with streaming enabled and forwards partial output to clients as it arrives.
  • An Angular client that connects over WebSockets/SSE to the hub and renders partial content with a typing indicator.
  • An optional Azure SignalR Service layer for scalable connection management (thousands to millions of long‑lived connections).
    References: SignalR hosting & scale; Azure SignalR Service concepts.

Architecture

 

  • The hub calls Azure OpenAI with streaming enabled (await foreach over updates) and broadcasts partials to clients.
  • Azure SignalR Service (optional) offloads connection scale and removes sticky‑session complexity in multi‑node deployments.
    References: Streaming code pattern; scale/ARR affinity; Azure SignalR integration.

Prerequisites

  • Azure OpenAI resource with a deployed model (e.g., gpt-4o or gpt-4o-mini)
  • .NET 8 API + ASP.NET Core SignalR backend
  • Angular 16+ frontend (using @microsoft/signalr)

Step‑by‑Step Implementation

1) Backend: ASP.NET Core + SignalR

Install packages

 

dotnet add package Microsoft.AspNetCore.SignalR
dotnet add package Azure.AI.OpenAI --prerelease
dotnet add package Azure.Identity
dotnet add package Microsoft.Extensions.AI
dotnet add package Microsoft.Extensions.AI.OpenAI --prerelease
dotnet add package Microsoft.Azure.SignalR

Using DefaultAzureCredential (Entra ID) avoids storing raw keys in code and is the recommended auth model for Azure services.

Program.cs

 

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddSignalR();
// To offload connection management to Azure SignalR Service, uncomment:
// builder.Services.AddSignalR().AddAzureSignalR();

builder.Services.AddSingleton<AiStreamingService>();

var app = builder.Build();
app.MapHub<ChatHub>("/chat");
app.Run();

AiStreamingService.cs  - streams content from Azure OpenAI

using Azure.AI.OpenAI;
using Azure.Identity;
using Microsoft.Extensions.AI;

public class AiStreamingService
{
    private readonly IChatClient _chatClient;

    public AiStreamingService(IConfiguration config)
    {
        var endpoint = new Uri(config["AZURE_OPENAI_ENDPOINT"]!);
        var deployment = config["AZURE_OPENAI_DEPLOYMENT"]!; // e.g., "gpt-4o-mini"

        var azureClient = new AzureOpenAIClient(endpoint, new DefaultAzureCredential());
        _chatClient = azureClient.GetChatClient(deployment).AsIChatClient();
    }

    public async IAsyncEnumerable<string> StreamReplyAsync(string userMessage)
    {
        var messages = new List<ChatMessage>
        {
            new(ChatRole.System, "You are a helpful assistant."),
            new(ChatRole.User, userMessage)
        };

        // Stream updates as they arrive via the IChatClient abstraction;
        // only text content is forwarded (tool calls/annotations carry no Text and are skipped).
        await foreach (var update in _chatClient.GetStreamingResponseAsync(messages))
        {
            if (!string.IsNullOrEmpty(update.Text))
                yield return update.Text;
        }
    }
}

Modern .NET AI extensions (Microsoft.Extensions.AI) expose a unified streaming pattern via IChatClient.GetStreamingResponseAsync, regardless of the underlying provider.

ChatHub.cs - pushes partials to the caller

using Microsoft.AspNetCore.SignalR;

public class ChatHub : Hub
{
    private readonly AiStreamingService _ai;

    public ChatHub(AiStreamingService ai) => _ai = ai;

    // Client calls: connection.invoke("AskAi", prompt)
    public async Task AskAi(string prompt)
    {
        var messageId = Guid.NewGuid().ToString("N");

        await Clients.Caller.SendAsync("typing", messageId, true);

        await foreach (var partial in _ai.StreamReplyAsync(prompt))
        {
            await Clients.Caller.SendAsync("partial", messageId, partial);
        }

        await Clients.Caller.SendAsync("typing", messageId, false);
        await Clients.Caller.SendAsync("completed", messageId);
    }
}

2) Frontend: Angular client with @microsoft/signalr

Install the SignalR client

 

npm i @microsoft/signalr

Create a SignalR service (Angular)

// src/app/services/ai-stream.service.ts
import { Injectable } from '@angular/core';
import * as signalR from '@microsoft/signalr';
import { BehaviorSubject, Observable } from 'rxjs';

@Injectable({ providedIn: 'root' })
export class AiStreamService {
  private connection?: signalR.HubConnection;
  private typing$ = new BehaviorSubject<boolean>(false);
  private partial$ = new BehaviorSubject<string>('');
  private completed$ = new BehaviorSubject<boolean>(false);

  get typing(): Observable<boolean> { return this.typing$.asObservable(); }
  get partial(): Observable<string> { return this.partial$.asObservable(); }
  get completed(): Observable<boolean> { return this.completed$.asObservable(); }

  async start(): Promise<void> {
    this.connection = new signalR.HubConnectionBuilder()
      .withUrl('/chat') // same origin; use absolute URL if CORS
      .withAutomaticReconnect()
      .configureLogging(signalR.LogLevel.Information)
      .build();

    this.connection.on('typing', (_id: string, on: boolean) => this.typing$.next(on));
    this.connection.on('partial', (_id: string, text: string) => {
      // Append incremental content
      this.partial$.next((this.partial$.value || '') + text);
    });
    this.connection.on('completed', (_id: string) => this.completed$.next(true));

    await this.connection.start();
  }

  async ask(prompt: string): Promise<void> {
    // Reset state per request
    this.partial$.next('');
    this.completed$.next(false);
    await this.connection?.invoke('AskAi', prompt);
  }
}

Angular component

// src/app/components/ai-chat/ai-chat.component.ts
import { Component, OnInit } from '@angular/core';
import { AiStreamService } from '../../services/ai-stream.service';

@Component({
  selector: 'app-ai-chat',
  templateUrl: './ai-chat.component.html',
  styleUrls: ['./ai-chat.component.css']
})
export class AiChatComponent implements OnInit {
  prompt = '';
  output = '';
  typing = false;
  done = false;

  constructor(private ai: AiStreamService) {}

  async ngOnInit() {
    await this.ai.start();
    this.ai.typing.subscribe(on => this.typing = on);
    this.ai.partial.subscribe(text => this.output = text);
    this.ai.completed.subscribe(done => this.done = done);
  }

  async send() {
    this.output = '';
    this.done = false;
    await this.ai.ask(this.prompt);
  }
}

HTML Template

<!-- src/app/components/ai-chat/ai-chat.component.html -->
<div class="chat">
  <div class="prompt">
    <input [(ngModel)]="prompt" placeholder="Ask me anything…" />
    <button (click)="send()">Send</button>
  </div>
  <div class="response">
    <pre>{{ output }}</pre>
    <div class="typing" *ngIf="typing">Assistant is typing…</div>
    <div class="done" *ngIf="done">✓ Completed</div>
  </div>
</div>

Streaming modes, content filters, and UX

Azure OpenAI streaming interacts with content filtering in two ways:

  • Default streaming: The service buffers output into content chunks and runs content filters before each chunk is emitted; you still stream, but not necessarily token‑by‑token.
  • Asynchronous Filter (optional): The service returns token‑level updates immediately and runs filters asynchronously. You get ultra‑smooth streaming but must handle delayed moderation signals (e.g., redaction or halting the stream); see the sketch after this list.
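
If you enable the Asynchronous Filter, partial text can reach the browser before a moderation verdict does, so the client needs a way to retract or stop content it has already rendered. Below is a minimal client‑side sketch of that idea; the moderationBlocked event name, its payload, and the assumption that the hub forwards the filter outcome are all illustrative and not part of the hub shown earlier.

import * as signalR from '@microsoft/signalr';
import { BehaviorSubject } from 'rxjs';

// Hypothetical wiring for delayed moderation signals (Asynchronous Filter).
// Assumes the hub emits a "moderationBlocked" event when a filter verdict
// arrives after partial content has already been streamed to this client.
export function handleDelayedModeration(
  connection: signalR.HubConnection,
  partial$: BehaviorSubject<string>,
  completed$: BehaviorSubject<boolean>
): void {
  connection.on('moderationBlocked', (_messageId: string, reason: string) => {
    // Replace whatever has already been rendered for this message and end the stream.
    partial$.next(`[Response removed by content filter: ${reason}]`);
    completed$.next(true);
  });
}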

Best practices

  • Append partials in small batches client‑side to avoid DOM thrash; finalize formatting on "completed". A batching sketch follows this list.
  • Log full messages server‑side only after completion to keep histories consistent (mirrors agent frameworks).
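
A minimal sketch of the batching idea from the first bullet: buffer incoming partials and flush them once per animation frame instead of touching the DOM for every chunk. This is illustrative glue code, not part of the service above; requestAnimationFrame and the flush callback are one possible choice of batching mechanism.

// Sketch: batch streamed partials and flush them once per animation frame
// to avoid re-rendering the DOM for every chunk.
export class PartialBatcher {
  private buffer = '';
  private scheduled = false;

  constructor(private flush: (batch: string) => void) {}

  // Call from the SignalR "partial" handler instead of updating the view directly.
  append(chunk: string): void {
    this.buffer += chunk;
    if (!this.scheduled) {
      this.scheduled = true;
      requestAnimationFrame(() => {
        this.scheduled = false;
        const batch = this.buffer;
        this.buffer = '';
        this.flush(batch);
      });
    }
  }
}

// Possible usage: const batcher = new PartialBatcher(text => this.output += text);
// connection.on('partial', (_id, text) => batcher.append(text));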

Security & compliance

  • Auth: Prefer Microsoft Entra ID (DefaultAzureCredential) to avoid key sprawl; use RBAC and Managed Identities where possible.
  • Secrets: Store Azure SignalR connection strings in Key Vault and rotate periodically; never hardcode.
  • CORS & cross‑domain: When hosting the frontend and hub on different origins, configure CORS on the server and use absolute URLs in withUrl(...); a client‑side sketch follows this list.
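
On the client side of that cross‑origin setup, the hub URL becomes absolute and credentials are opted into explicitly. A sketch follows, assuming the hub is hosted at https://api.example.com/chat (a placeholder) and that the server's CORS policy allows this origin with credentials.

import * as signalR from '@microsoft/signalr';

// Sketch: cross-origin connection to a hub on a different origin.
// The URL is a placeholder; the server must allow this origin in its CORS policy.
const connection = new signalR.HubConnectionBuilder()
  .withUrl('https://api.example.com/chat', {
    withCredentials: true // send cookies/auth headers cross-origin if the server expects them
  })
  .withAutomaticReconnect()
  .build();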

Connection management & scaling tips

  • Persistent connection load: SignalR consumes TCP resources; separate heavy real‑time workloads or use Azure SignalR to protect other apps.
  • Sticky sessions (self‑hosted): Required in most multi‑server scenarios unless a WebSockets‑only + SkipNegotiation setup applies (sketched below); Azure SignalR removes this requirement.
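
For reference, the WebSockets‑only client configuration that the SkipNegotiation note refers to might look like the sketch below. It only works when every hop supports WebSockets; keep negotiation enabled when Azure SignalR Service (or any non‑WebSocket transport) is in play.

import * as signalR from '@microsoft/signalr';

// Sketch: WebSockets-only connection that skips the negotiate request,
// which removes the need for sticky sessions in self-hosted multi-server setups.
const connection = new signalR.HubConnectionBuilder()
  .withUrl('/chat', {
    skipNegotiation: true,
    transport: signalR.HttpTransportType.WebSockets
  })
  .withAutomaticReconnect()
  .build();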

Learn more


Retirement Date for Office Online Server (OOS) Announced


We would like to inform our SharePoint Server customers that Microsoft has announced the retirement of Office Online Server (OOS), effective December 31, 2026 (see Announcing the retirement for Office Online Server). Office Online Server (formerly known as Office Web Apps Server) is a product that can optionally be integrated with SharePoint Server. It enables users to work with supported document types in various scenarios without first having to download them.

What does this mean?

  • Users will lose the ability to view, edit, and collaborate on Office documents directly in the browser. Office client applications can still be used to view and edit Office documents. 
  • SharePoint Server relies on OOS to generate previews for Word, Excel, PowerPoint, and PDF documents. After OOS support ends, these thumbnails will no longer be actively supported in libraries and search results.
  • Office Online Server will not be supported in any configuration after retirement, including scenarios where it is paired with SharePoint Server Subscription Edition.

Microsoft will not actively block the use of OOS in SharePoint Server after its retirement. However, no further updates will be provided, including security updates. 

This announcement does not impact Microsoft's support for SharePoint Server, and there are no plans to retire SharePoint Server Subscription Edition. Microsoft has publicly committed to supporting SharePoint Server Subscription Edition until at least December 31, 2035. 

 

The SharePoint Server Team 


329: Azure Front Door: Please Use the Side Entrance


Welcome to episode 329 of The Cloud Pod, where the forecast is always cloudy! Justin, Jonathan, and special guest Elise are in the studio to bring you all the latest in AI and cloud news, including – you guessed it – more outages, and more OpenAI team-ups. We’ve also got GPUs, K8s news, and Cursor updates. Let’s get started!

Titles we almost went with this week

  • Azure Front Door: Please Use the Side Entrance – el -jb
  • Azure and NVIDIA: A Match Made in GPU Heaven – mk
  • Azure Goes Down Under the Weight of Its Own Configuration – el
  • GitHub Turns Your Copilot Subscription Into an All-You-Can-Eat Agent Buffet – mk, el
  • Microsoft Goes Full Blackwell: No Regrets, Just GPUs
  • Jules Verne Would Be Proud: Google’s CLI Goes 20,000 Bugs Under the Codebase
  • RAG to Riches: AWS Makes Retrieval Augmented Generation Turnkey
  • Kubectl Gets a Gemini Twin: Google Teaches AI to Speak Kubernetes
  • I’m Not a Robot: Azure WAF Finally Learns to Ask the Important Questions
  • OpenAI Puts 38 Billion Eggs in Amazon’s Basket: Multi-Cloud Gets Complicated
  • The Root Cause They’ll Never Root Out: Why Attrition Stays Off the RCA
  • Google’s New Extension Lets You Deploy Kubernetes by Just Asking Nicely
  • Cursor 2.0: Now With More Agents Than a Hollywood Talent Agency

Follow Up 

04:46 Massive Azure outage is over, but problems linger – here’s what happened | ZDNET 

  • Azure experienced a global outage on October 29, affecting all regions simultaneously, unlike the recent AWS outage that was limited to a single region. 
  • The incident lasted approximately eight hours from noon to 8 PM ET, impacting major services including Microsoft 365, Teams, Xbox Live, and critical infrastructure for Alaska Airlines, Vodafone UK, and Heathrow Airport, among others.
  • The root cause was an inadvertent tenant configuration change in Azure Front Door that bypassed safety validations due to a software defect. Microsoft’s protection mechanisms failed to catch the erroneous deployment, allowing invalid configurations to propagate across the global fleet and cause HTTP timeouts, server errors, and elevated packet loss at network edges.
  • Recovery required rolling back to the last known good configuration and gradually rebalancing traffic across nodes to prevent overload conditions. 
  • Some customers experienced lingering issues even after the official recovery time, with Microsoft temporarily blocking configuration changes to Azure Front Door while completing the restoration process.
  • The incident highlights concentration risk in cloud infrastructure, as this marks the second major cloud provider outage in October 2025. 
  • Despite Azure revenue growing 40 percent in the latest quarterly report, Microsoft’s stock declined in after-hours trading as the company acknowledged capacity constraints in meeting AI and cloud demands.
  • Affected Azure services included App Service, Azure SQL Database, Microsoft Entra ID, Container Registry, Azure Databricks, and approximately 15 other core platform services. Microsoft has implemented additional validation and rollback controls to prevent similar configuration deployment failures, though the full post-incident report remains pending.

07:06 Matt – “The fact that you’re plus one week and still can’t actually make changes or even do simple things like purge a cache makes me think this is a lot bigger on the backend than they let on at the beginning.”

AI Is Going Great – Or How ML Makes Money

08:30 AWS and OpenAI announce multi-year strategic partnership | OpenAI

  • AWS and OpenAI formalized a 38 billion dollar multi-year partnership providing OpenAI immediate access to hundreds of thousands of NVIDIA GPUs (GB200s and GB300s) clustered via Amazon EC2 UltraServers, with capacity deployment targeted by the end of 2026. 
  • The infrastructure supports both ChatGPT inference serving and next-generation model training with the ability to scale to tens of millions of CPUs for agentic workloads.
  • The partnership builds on existing integration where OpenAI’s open weight foundation models became available on Amazon Bedrock earlier this year, making OpenAI one of the most popular model providers on the platform. Thousands of customers, including Thomson Reuters, Peloton, and Verana Health, are already using these models for agentic workflows, coding, and scientific analysis.
  • AWS positions this as validation of their large-scale AI infrastructure capabilities, noting they have experience running clusters exceeding 500,000 chips with the security, reliability, and scale required for frontier model development. 
  • The low-latency network architecture of EC2 UltraServers enables optimal performance for interconnected GPU systems.
  • This represents a significant shift in OpenAI’s infrastructure strategy, moving substantial compute workloads to AWS while maintaining its existing Microsoft Azure relationship. 
  • The seven-year commitment timeline with continued growth provisions indicates long-term capacity planning for increasingly compute-intensive AI model development.

09:53 Elise – “It sort of feels like OpenAI has a strategic partnership with everyone right now, so I’m sure this will help them, just like everything else that they have done will help them. We’re banking a lot on OpenAI being very successful.” 

17:11 Google removes Gemma models from AI Studio after GOP senators complaint – Ars Technica

  • Google removed its open Gemma AI models from AI Studio following a complaint from Senator Marsha Blackburn, who reported the model hallucinated false sexual misconduct allegations against her when prompted with leading questions. 
  • The model allegedly fabricated detailed false claims and generated fake news article links, demonstrating the persistent hallucination problem across generative AI systems.
  • The removal only affects non-developer access through AI Studio’s user interface, where model behavior tweaking tools could increase hallucination likelihood. 
  • Developers can still access Gemma through the API and download models for local development, suggesting Google is limiting casual experimentation rather than pulling the model entirely.
  • This incident highlights the ongoing challenge of AI hallucinations in production systems, which no AI firm has successfully eliminated despite mitigation efforts. 
  • Google’s response indicates a shift toward restricting open model access when inflammatory outputs could result from user prompting, potentially setting a precedent for how cloud providers handle politically sensitive AI failures.
  • The timing follows congressional hearings where Google defended its hallucination mitigation practices, with the company’s representative acknowledging these issues are widespread across the industry. 
  • This creates a tension between open model availability and liability concerns when models generate defamatory content, particularly affecting cloud-based AI platforms.

23:00 Matt – “That’s everything on the internet, though. When Wikipedia first came out and you started using it, we were told you can’t reference Wikipedia, because who knows what was put on there…you can’t blindly trust.”  

Cloud Tools  

26:53 Introducing Agent HQ: Any agent, any way you work – The GitHub Blog

  • GitHub launches Agent HQ as a unified platform to orchestrate multiple AI coding agents from Anthropic, OpenAI, Google, Cognition, and xAI directly within GitHub and VS Code, all included in paid Copilot subscriptions. 
  • This eliminates the fragmented experience of juggling different AI tools across separate interfaces and subscriptions.
  • Mission Control provides a single command center across GitHub, VS Code, mobile, and CLI to assign work to different agents in parallel, track their progress, and manage agent identities and permissions just like human team members. 
  • The system maintains familiar Git primitives like pull requests and issues while adding granular controls over when CI runs on agent-generated code.
  • VS Code gets Plan Mode for building step-by-step task approaches with clarifying questions before code generation, plus AGENTS.md files for creating custom agents with specific rules like preferred logging frameworks or testing patterns. 
  • It’s the only editor supporting the full Model Context Protocol specification with one-click access to the GitHub MCP Registry for integrating tools like Stripe, Figma, and Sentry.
  • GitHub Code Quality in public preview now provides org-wide visibility into code maintainability and reliability, with Copilot automatically reviewing its own generated code before developers see it to catch technical debt early. 
  • Enterprise admins get a new control plane for governing AI access, setting security policies, and viewing Copilot usage metrics across the organization.
  • The platform keeps developers on GitHub’s existing compute infrastructure, whether using GitHub Actions or self-hosted runners, avoiding vendor lock-in while OpenAI Codex becomes available this week in VS Code Insiders for Copilot Pro+ users as the first partner agent.

27:20 Jonathan – “I like the different interfaces; they all bring something a little different.”

31:55 Cursor introduces its coding model alongside multi-agent interface – Ars Technica

  • Cursor launches version 2.0 of its IDE with Composer, its first competitive in-house coding model built using reinforcement learning and mixture-of-experts architecture. 
  • The company claims Composer is 4x faster than similarly intelligent models while maintaining competitive intelligence levels with frontier models from OpenAI, Google, and Anthropic.
  • The new multi-agent interface in Cursor 2.0 allows developers to run multiple AI agents in parallel for coding tasks, expanding beyond the single-agent workflow that has been standard in AI-assisted development environments. 
    • This represents a shift toward more complex, distributed AI assistance within the IDE.
  • Cursor’s internal benchmarking shows Composer prioritizes speed over raw intelligence, outperforming competitors significantly in tokens per second while slightly underperforming the best frontier models in intelligence metrics. 
    • This positions it as a practical option for developers who need faster code generation and iteration cycles.
  • The IDE maintains its Visual Studio Code foundation while deepening LLM integration for what Cursor calls vibe coding, where AI assistance is more directly embedded in the development workflow. 
  • Previously, Cursor relied entirely on third-party models, making this its first attempt at vertical integration in the AI coding assistant space.

33:03 Elise- “Cursor had an agent built, and I thought it was ok, but it was wrong a lot. The 2.0 agent seems fabulous, comparatively, and a lot faster.” 

AWS 

43:25 The Model Context Protocol (MCP) Proxy for AWS is now generally available

  • AWS has released the Model Context Protocol (MCP) Proxy for AWS, a client-side proxy that enables MCP clients to connect to remote AWS-hosted MCP servers using AWS SigV4 authentication. 
  • The proxy works with popular AI development tools like Amazon Q Developer CLI, Cursor, and Kiro, allowing developers to integrate AWS service interactions directly into their agentic AI workflows.
  • The proxy enables developers to access AWS resources like S3 buckets and RDS tables through MCP servers while maintaining AWS security standards through SigV4 authentication. 
  • It includes built-in safety controls such as read-only mode to prevent accidental changes, configurable retry logic for reliability, and logging capabilities for troubleshooting issues.
  • The MCP Proxy bridges the gap between local AI development tools and AWS-hosted MCP servers, particularly those built on Amazon Bedrock AgentCore Gateway or Runtime. 
  • This allows AI agents and developers to extend their workflows to include AWS service interactions without manually handling authentication and protocol communications.
  • Installation options are flexible, supporting deployment from source, Python package managers, or containers, making it straightforward to integrate with existing MCP-supported development environments. 
  • The proxy is open-source and available now through the AWS GitHub repository at https://github.com/aws/mcp-proxy-for-aws with no additional cost beyond standard AWS service usage.

44:10 Matt – “This is a nice little tool to help with production…and easier stepping stone than having to build all this stuff yourself.” 

47:07 Amazon ECS now supports built-in Linear and Canary deployments

  • Amazon ECS now includes native linear and canary deployment strategies alongside existing blue/green deployments, eliminating the need for external tools like AWS CodeDeploy for gradual traffic shifting. 
  • Linear deployments shift traffic in equal percentage increments with configurable step sizes and bake times, while canary deployments route a small percentage to the new version before completing the shift.
  • The feature integrates with CloudWatch alarms for automatic rollback detection and supports deployment lifecycle hooks for custom validation steps. 
  • Both strategies include a post-deployment bake time that keeps the old revision running after full traffic shift, enabling quick rollback without downtime if issues emerge.
  • Available now in all commercial AWS regions where ECS operates, the deployment strategies work with Application Load Balancer and ECS Service Connect configurations. 
  • Customers can implement these strategies through Console, SDK, CLI, CloudFormation, CDK, and Terraform for both new and existing ECS services without additional cost beyond standard ECS pricing.
  • This brings ECS deployment capabilities closer to parity with Kubernetes native deployment options and reduces dependency on CodeDeploy for teams running containerized workloads. 
  • The built-in approach simplifies deployment pipelines for organizations that previously needed separate deployment orchestration tools.

48:45 Jonathan – “I always wonder why they haven’t built these things previously, and I guess it was possible through CodeDeploy, but if it was possible through CodeDeploy, then why add it to ECS now? I feel like we kind of get this weird sprawl.” 

50:35 Amazon Route 53 Resolver now supports AWS PrivateLink

  • Route 53 Resolver now supports AWS PrivateLink, allowing customers to manage DNS resolution features entirely over Amazon’s private network without traversing the public internet. 
  • This includes all Resolver capabilities like endpoints, DNS Firewall, Query Logging, and Outposts integration.
  • The integration addresses security and compliance requirements for organizations that need to keep all AWS API calls within private networks. Operations like creating, deleting, and editing Resolver configurations can now be performed through VPC endpoints instead of public endpoints.
  • Available immediately in all regions where Route 53 Resolver operates, including AWS GovCloud (US) regions. 
  • No pricing changes were announced, so standard Route 53 Resolver pricing applies, plus PrivateLink endpoint costs (typically $0.01 per hour per AZ plus data processing charges).
  • Primary use case targets enterprises with strict network isolation policies, particularly in regulated industries like finance and healthcare, where DNS management traffic must remain on private networks. 
  • This complements existing hybrid DNS architectures using Resolver endpoints for on-premises connectivity.

51:04 Jonathan – “Good for anyone who wanted this!” 

54:05 Mountpoint for Amazon S3 and Mountpoint for Amazon S3 CSI driver add monitoring capability

  • Mountpoint for Amazon S3 now emits near real-time metrics using the OpenTelemetry Protocol, allowing customers to monitor operations through CloudWatch, Prometheus, and Grafana instead of parsing log files manually. 
  • This addresses a significant operational gap for teams running data-intensive workloads that mount S3 buckets as file systems on EC2 instances or Kubernetes clusters.
  • The new monitoring capability provides granular metrics, including request counts, latency, and error types at the EC2 instance level, enabling proactive troubleshooting of issues like permission errors or performance bottlenecks. Customers can now set up alerts and dashboards using standard observability tools rather than building custom log parsing solutions.
  • Integration works through CloudWatch agent or OpenTelemetry collector, making it compatible with existing monitoring infrastructure that many organizations already have deployed. The feature is available immediately for both the standalone Mountpoint client and the Mountpoint for Amazon S3 CSI driver used in Kubernetes environments.
  • This update is particularly relevant for machine learning workloads, data analytics pipelines, and containerized applications that treat S3 as a file system and need visibility into storage layer performance. Setup instructions are available in the Mountpoint GitHub repository with configuration examples for common observability platforms.

GCP

58:31 New Log Analytics query builder simplifies writing SQL code | Google Cloud Blog

  • Google Cloud has released the Log Analytics query builder to general availability, providing a UI-based interface that generates SQL queries automatically for users who need to analyze logs without deep SQL expertise. 
  • The tool addresses the common challenge of extracting insights from nested JSON payloads in log data, which typically requires complex SQL functions like JSON_VALUE and JSON_EXTRACT that many DevOps engineers and SREs find time-consuming to write.
  • The query builder includes intelligent schema discovery that automatically detects and suggests JSON fields and values from your datasets, along with a real-time SQL preview so users can see the generated code and switch to manual editing when needed. 
    • Key capabilities include search across all fields, automatic aggregations and grouping, and one-click visualization to dashboards, making it practical for incident troubleshooting and root cause analysis workflows.
  • Google plans to expand the feature with cross-project log scopes, trace data integration for joining logs and traces, query saving and history, and natural language to SQL conversion using Gemini AI. 
  • The query builder works with existing Log Analytics pricing, which is based on the amount of data scanned during queries, similar to BigQuery’s on-demand pricing model.
  • The tool integrates directly with Google Cloud’s observability stack, allowing users to query logs alongside BigQuery datasets and other telemetry types in a single interface. 
  • This consolidation reduces context switching for teams managing complex distributed systems across multiple GCP services and projects.

1:00:01 Jonathan- “I think this is where everything is going. Why spend half an hour crafting a perfect SQL query…when you can have it figure it all out for you.” 

1:01:12 GKE and Gemini CLI work better together | Google Cloud Blog  

  • Google has open-sourced a GKE extension for Gemini CLI that integrates Kubernetes Engine operations directly into the command-line AI agent. The extension works as both a Gemini CLI extension and a Model Context Protocol server compatible with any MCP client, allowing developers to manage GKE clusters using natural language commands instead of verbose kubectl syntax.
  • The integration provides three main capabilities: GKE-specific context resources for more natural prompting, pre-built slash command prompts for complex workflows, and direct access to GKE tools, including Cloud Observability integration. Installation requires a single command for Gemini CLI users, with separate instructions available for other MCP clients.
  • The primary use case targets ML engineers deploying inference models on GKE who need help selecting appropriate models and accelerators based on business requirements like latency targets. 
  • Gemini CLI can automatically discover compatible models, recommend accelerators, and generate deployable Kubernetes manifests through conversational interaction rather than manual configuration.
  • This builds on Gemini CLI’s extension architecture that bundles MCP servers, context files, and custom commands into packages that teach the AI agent how to use specific tools. 
  • The GKE extension represents Google’s effort to make Kubernetes operations more accessible through AI assistance, particularly for teams managing AI workload deployments.
  • The announcement includes no pricing details as both Gemini CLI and the GKE extension are open source projects, though standard GKE cluster costs and any Gemini API usage charges would still apply during operation.

1:02:10 Matt – “Anything to make Kubernetes easier to manage, I’m on board for it.” 

1:05:06 Master multi-tasking with the Jules extension for Gemini CLI | Google Cloud Blog

  • Google has launched the Jules extension for Gemini CLI, which acts as an autonomous coding assistant that handles background tasks like bug fixes, security patches, and dependency updates while developers focus on primary work. 
  • Jules operates asynchronously using the /jules command, working in isolated environments to address multiple issues in parallel and creating branches for review.
  • The extension integrates with other Gemini CLI extensions to create automated workflows, including the Security extension for vulnerability analysis and remediation, and the Observability extension for crash investigation and automated unit test generation. 
    • This modular approach allows developers to chain together different capabilities for comprehensive task automation.
  • Jules addresses common developer productivity drains by handling routine maintenance tasks that typically interrupt deep work sessions. The tool can process multiple GitHub issues simultaneously, each in its own environment, and prepares fixes for human review rather than automatically committing changes.
  • The extension is available now as an open source project on GitHub at github.com/gemini-cli-extensions/jules, with no pricing information provided, as it appears to be a free developer tool. 
  • Google is building an ecosystem of Gemini CLI extensions that can be combined with Jules for various development workflows.

1:06:16 Jonathan – “Google obviously listens to their customers because it was only half an hour ago when I said something like this would be pretty useful.”

1:11:36 Announcing GA of Cost Anomaly Detection | Google Cloud Blog

  • Google’s Cost Anomaly Detection has reached general availability with AI-powered alerts now enabled by default for all GCP customers across all projects, including new ones. 
  • The service automatically monitors spending patterns and sends alerts to Billing Administrators when unusual cost spikes are detected, with no configuration required.
  • The GA release introduces AI-generated anomaly thresholds that adapt to each customer’s historical spending patterns, reducing alert noise by flagging only significant, unexpected deviations. 
  • Customers can override these intelligent baselines with custom values if needed, and the system now supports both absolute-dollar thresholds and percentage-based deviation filters to accommodate projects of different sizes and sensitivities.
  • The improved algorithm solves the cold start problem that previously required six months of spending history, now providing immediate anomaly protection for brand new accounts and projects from day one. 
    • This addresses a key limitation from the public preview phase and ensures comprehensive cost monitoring regardless of account age.
  • Cost Anomaly Detection remains free as part of GCP’s cost management toolkit and integrates with Cloud Budgets to create a layered approach for preventing, detecting, and containing runaway cloud spending. 
  • The anomaly dashboard provides root cause analysis to help teams quickly understand and address cost spikes when they occur.
  • Interested in pricing details? Check out the billing console here

1:14:01 Elise – “I just wonder, there’s so many third-party companies that specialize in this kind of thing. So I wonder if they realized that they could just do a little bit better.”

Azure

1:16:37 Building the future together: Microsoft and NVIDIA announce AI advancements at GTC DC | Microsoft Azure Blog

  • Microsoft and NVIDIA are expanding their AI partnership with several infrastructure and model updates. 
  • Azure Local now supports NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs, enabling organizations to run AI workloads at the edge with cloud-like management through Azure Arc, targeting healthcare, retail, manufacturing, and government sectors requiring data residency and low-latency processing.
  • Azure AI Foundry adds NVIDIA Nemotron models for agentic AI and enterprise reasoning, plus NVIDIA Cosmos models for physical AI applications like robotics and autonomous vehicles. 
  • Microsoft also introduced TRELLIS for 3D asset generation, all deployable as NVIDIA NIM microservices with enterprise-grade security and scalability.
  • Microsoft deployed the first production-scale cluster of NVIDIA GB300 NVL72 systems with over 4,600 Blackwell Ultra GPUs in the new NDv6 GB300 VM series. 
  • Each rack delivers 130 TB/s of NVLink bandwidth and up to 136 kW of compute power, designed for training and deploying frontier models with integrated liquid cooling and Azure Boost for accelerated I/O.
  • Also, NVIDIA Run:ai is now available on Azure Marketplace, providing GPU orchestration and workload management across Azure NC and ND series instances. The platform integrates with AKS, Azure Machine Learning, and Azure AI Foundry to help enterprises dynamically allocate GPU resources, reduce costs, and improve utilization across teams.
  • Azure Kubernetes Service now supports NVIDIA Dynamo framework on ND GB200-v6 VMs, demonstrating 1.2 million tokens per second with the gpt-oss 120b model. 
  • Microsoft reports up to 15x throughput improvement over Hopper generation for reasoning models, with deployment guides available for production implementations.

1:21:53 Jonathan – “That’s a really good salesy number to quote, though, 1.2 million tokens a second – that’s great, but that’s not an individual user. One individual user will not get 1.2 million tokens a second out of any model. That is, at full capacity with as many users running inference as possible on that cluster. The total generation output might be 1.2 million tokens a second, which is still phenomenal, but as far as the actual user experience, you know, if you were a business that wanted really fast inference, you’re not going to get 1.2 million tokens a second.”

1:23:26 Public Preview: Azure Functions zero-downtime deployments with rolling Updates in Flex Consumption 

  • Azure Functions in the Flex Consumption plan now supports rolling updates for zero-downtime deployments through a simple configuration change. 
  • This eliminates the need for forceful instance restarts during code or configuration updates, allowing the platform to gracefully transition workloads across instances.
  • Rolling updates work by gradually replacing old instances with new ones while maintaining active request handling, similar to deployment strategies used in container orchestration platforms. 
  • This brings enterprise-grade deployment capabilities to serverless functions without requiring additional infrastructure management.
  • The capability is currently in public preview for the Flex Consumption plan specifically, which is Azure’s newer consumption-based pricing model that offers more flexibility than the traditional Consumption plan. 
  • Pricing follows the standard Flex Consumption model based on execution time and memory usage, with no additional cost for the rolling update feature itself.

1:24:42 Matt – “It’s a nice quality of life feature that they’re adding to everything. It’s in preview, though, so don’t deploy production workloads leveraging this.” 

1:25:06 The Azure PAYG API Shift: What’s Actually Changing (and Why It Matters) 

  • Microsoft is deprecating the legacy Consumption API for Azure Pay-As-You-Go cost data retrieval and replacing it with two modern approaches: the Cost Details API for Enterprise and Microsoft Customer Agreement subscriptions, and the Exports API for PAYG and Visual Studio subscriptions. 
  • This shifts from a pull model, where teams constantly query APIs, to a subscribe model where Azure delivers cost data directly to Azure Storage Accounts as CSV files.
  • The change addresses significant scalability and consistency issues with the old API that struggled with throttling, inconsistent schemas across different subscription types, and handling large enterprise-scale datasets. 
  • The new APIs support FOCUS-compliant schemas, include reservations and savings plans data in single exports, and integrate better with Power BI and Azure Data Factory for FinOps automation.
  • FinOps teams need to audit existing scripts that call the Microsoft.Commerce/UsageAggregates endpoint and migrate to storage-based data ingestion instead of direct API calls. 
  • While the legacy endpoint remains live but unsupported, Microsoft strongly recommends immediate migration, though the deprecation timeline may extend based on customer adoption rates.
  • The practical impact for cloud teams is more reliable cost data pipelines with fewer failed jobs, predictable scheduled exports eliminating API throttling issues, and consistent field mappings across all subscription types. 
  • Teams should review Microsoft’s field mapping reference documentation, as column names have changed between the old and new APIs.
  • PAYG customers currently must use the Exports API with storage-based retrieval, though Microsoft plans to eventually extend Cost Details API support to PAYG subscriptions. 
  • The transition requires updating data flow architecture but provides an opportunity to standardize FinOps processes across different Azure billing models.

1:27:12 Matt – “A year or two ago, we did an analysis at my day job, and we were trying to figure out the savings plan’s amount if we buy X amount, how much do we need to buy everything along those lines. And we definitely ran into like throttling issues, and it was just bombing out on us at a few points, and a lot of weird loops we had to do because the format just didn’t make sense with moderate stuff. It’s a great way. I would suggest you move not because they’re trying to get rid of it, but because it will make your life better.”

1:28:05 Generally Available: Azure WAF CAPTCHA Challenge for Azure Front Door

  • Azure WAF now includes CAPTCHA challenge capabilities for Front Door deployments, allowing organizations to distinguish between legitimate users and automated bot traffic. 
  • This addresses common threats like credential stuffing, web scraping, and DDoS attacks that traditional WAF rules may miss.
  • The CAPTCHA feature integrates directly into Azure Front Door’s WAF policy engine, enabling administrators to trigger challenges based on custom rules, rate limits, or anomaly detection patterns.
  • Organizations can configure CAPTCHA thresholds and exemptions without requiring changes to backend application code.
  • This capability targets e-commerce sites, financial services, and any web application experiencing bot-driven abuse or account takeover attempts. 
  • The CAPTCHA challenge adds a human verification layer that complements existing WAF protections like OWASP rule sets and custom security policies.
  • Pricing follows the standard Azure Front Door WAF model with per-policy charges plus request-based fees, though specific CAPTCHA-related costs were not detailed in the announcement. 
  • Organizations already using Front Door Premium can enable this feature through policy configuration updates.
  • The general availability means this protection is now production-ready across all Azure regions where Front Door operates, removing the need for third-party CAPTCHA services or custom bot mitigation solutions for many Azure customers.
  • We just wonder what we’re going to replace reCAPTCHA with when AI can click the button like a human can.

1:31:04 Public Preview: Instant Access Snapshots for Azure Premium SSD v2 and Ultra Disk Storage

  • Azure now offers Instant Access Snapshots in public preview for Premium SSD v2 and Ultra Disks, eliminating the traditional wait time for snapshot restoration. Previously, customers had to wait for snapshots to fully hydrate before using restored disks, but this feature allows immediate disk restoration with high performance right after snapshot creation.
  • This capability addresses a critical operational need for enterprises running high-performance workloads on Azure’s fastest storage tiers. 
  • Premium SSD v2 and Ultra Disks are typically used for mission-critical databases, SAP HANA, and other latency-sensitive applications where downtime during recovery operations directly impacts business operations.
  • The feature reduces recovery time objectives for disaster recovery and backup scenarios, particularly valuable for customers who need rapid failover capabilities. Organizations can now create point-in-time copies and immediately spin up test environments or recover from failures without the performance penalty of background hydration processes.
  • This positions Azure’s premium storage offerings more competitively against AWS’s EBS snapshots with fast snapshot restore and Google Cloud’s instant snapshots. 
  • The preview status means customers should test thoroughly before production use, and Microsoft has not yet announced general availability timing or any pricing changes specific to this snapshot capability.

Closing

And that is the week in the cloud! Visit our website, the home of the Cloud Pod, where you can join our newsletter and Slack team, send feedback, or ask questions at theCloudPod.net, or tweet at us with the hashtag #theCloudPod.





Download audio: https://episodes.castos.com/5e2d2c4b117f29-10227663/2203823/c1e-2okobm679os8op5j-6zqzxkwpf9xg-ucqxd6.mp3