Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.
156555 stories
·
33 followers

The Download: GitHub Copilot app, Anthropic news, and Minecraft

1 Share
From: GitHub
Duration: 4:39
Views: 189

Welcome to another episode of The Download with GPS! This week, we cover the general availability of the new GitHub Copilot desktop app, designed to be your control center for managing multiple AI agents safely. We also discuss the US government directive that led Anthropic to pull their Fable 5 and Mythos 5 models offline. Finally, we explore Arnis, an open-source project that lets you turn real-world geography into accurate Minecraft worlds.

#Minecraft #GitHubCopilot #Anthropic

— CHAPTERS —

00:00 Welcome to The Download
00:30 GitHub Copilot app is now generally available
02:19 Anthropic disables Fable 5 and Mythos 5
03:10 Turn your hometown into Minecraft with Arnis
04:14 Outro

— RESOURCES —

GitHub Copilot app https://github.blog/changelog/2026-06-17-github-copilot-app-generally-available/
Fable and Mythos offline https://www.anthropic.com/news/fable-mythos-access
Arnis: https://github.com/louis-e/arnis
video demo of Arnis by YouTuber KasaiSora: https://www.youtube.com/watch?v=Ujy37wk_EcE&t=63s

Stay up-to-date on all things GitHub by connecting with us:

YouTube: https://gh.io/subgithub
Blog: https://github.blog
X: https://twitter.com/github
LinkedIn: https://linkedin.com/company/github
Insider newsletter: https://resources.github.com/newsletter/
Instagram: https://www.instagram.com/github
TikTok: https://www.tiktok.com/@github

About GitHub
It’s where over 180 million developers create, share, and ship the best code possible. It’s a place for anyone, from anywhere, to build anything—it’s where the world builds software. https://github.com

Read the whole story
alvinashcraft
54 minutes ago
reply
Pennsylvania, USA
Share this story
Delete

359: Tokenomicon Sounds Metal, but it's Just Cloud Budgets

1 Share

 Welcome to episode 359 of The Cloud Pod, where the weather is always cloudy! Justin and Ryan are in the studio this week to bring you all the latest in cloud and AI news, including AI governance, FinOps’ final conference, and even an earnings story courtesy of Oracle. These and so much more – so let’s get started! 

Titles we almost went with this week

  • You Shall Not Pass Unless Your Network Policy Says So
  • One CLI Wizard to Rule All AWS Agents
  • AWS WAF Turns AI Crawlers Into Cash Cows
  • No More Delete and Pray for AWS Cost Reports
  • Stop Rolling Your Own Certificate Rotation AWS Did It
  • Tux Gets a Security Checkup, Microsoft Antivirus Style
  • Coal Plant to Cloud Plant Google’s Billion Dollar Glow Up
  • FinOps Grows Up and Gets an AI Spending Problem
  • Tokenomics Foundation Wants to Bill AI by the Word
  • Sweet Home Alabama Now Runs on Google Cloud Infrastructure

A big thanks to this week’s sponsors:

We’re sponsorless! Want to get your brand, company, or service in front of a very enthusiastic group of cloud news seekers? You’ve come to the right place! Send us an email or hit us up on our Slack channel for more info.

General News

02:53 Microsoft restricts Claude Fable for employees over data retention concerns 

  • Microsoft has restricted Claude Fable 5 from its internal GitHub Copilot model picker, even though the model is available to external GitHub Copilot and Azure Foundry customers. 
  • All other Claude models remain available internally because they operate under Zero Data Retention rules.
  • The core issue is that Claude Fable 5 requires data retention to power Anthropic’s new safety classifiers, meaning prompts and outputs are stored for up to 30 days by default, and up to two years if flagged for policy violations. 
  • This creates a meaningful conflict with enterprise data handling expectations.
  • This situation highlights a broader tension cloud enterprises face when adopting frontier AI models that bundle safety mechanisms requiring data retention, since those requirements may conflict with internal legal and compliance policies around confidential information.
  • The restriction is notable because Microsoft is both a distribution partner for Anthropic through Azure and a direct competitor via its own AI offerings, so internal adoption decisions carry weight beyond typical enterprise procurement concerns.
  • For developers and businesses evaluating Claude Fable 5 through Azure Foundry or GitHub Copilot, this serves as a reminder to review the specific data retention terms for Mythos-class models before deploying them in workflows that handle sensitive or proprietary information.

04:23 Statement on the US government directive to suspend access to Fable 5 and Mythos 5

  • The US government issued an export control directive requiring Anthropic to immediately suspend access to Fable 5 and Mythos 5 for all foreign nationals, forcing a full customer shutdown to ensure compliance. 
  • All other Anthropic models remain unaffected.
  • The government’s concern centers on a reported narrow, non-universal jailbreak involving asking the model to read a codebase and identify software flaws. Anthropic reviewed the technique and found that the capability level is already available in other publicly deployed models, including OpenAI’s GPT-5.5.
  • Anthropic’s defense-in-depth strategy for Fable 5 included thousands of hours of red-teaming with the US government, UK AISI, and third-party organizations, plus a mandatory 30-day customer data retention policy specifically to detect and mitigate jailbreak attempts.
  • Anthropic is complying with the directive but publicly disagrees with the standard applied, arguing that requiring recall of a commercial model over a narrow non-universal jailbreak would effectively halt all frontier model deployments across the industry.
  • This situation raises a practical question for cloud and enterprise customers about deployment risk when government directives can abruptly disable access to production AI services without advance notice or detailed technical disclosure.

05:36 Justin – “…the government sounds like they overreacted as they like to do in this era, and Anthropic doesn’t agree, and they’re going back and forth, and hopefully they get it back. That’s the goal.”

08:51 A frontier without an ecosystem is not stable

  • Satya Nadella published an essay arguing that a frontier AI model without a surrounding ecosystem is inherently unstable, suggesting Microsoft’s strategic focus remains on platform and developer ecosystem building, rather than standalone model capabilities.
  • The core argument positions this AI transition as distinct from previous platform shifts, such as mobile and cloud, in which digital systems augmented human work rather than potentially replacing organizational structures and firm boundaries.
  • For cloud practitioners, this framing matters because it signals where Microsoft is likely to invest, specifically in tooling, APIs, and developer infrastructure that ties AI capabilities into existing enterprise workflows rather than competing purely on model benchmarks.
  • The post generated substantial engagement with 35,000 reposts and 49,000 likes, indicating the topic of AI economic impact on business structures is drawing broad attention from the technology community.
  • Podcast hosts may want to discuss whether the ecosystem argument favors established cloud providers like Microsoft, Google, and AWS, who already have developer ecosystems in place, and what that means for independent AI labs trying to build commercial businesses.

09:59 Ryan – “AI is not particularly useful unless you give it access to data, and you can’t give a frontier model access to data, right? You can train it on data and build your own model, but you need some sort of mechanism or platform to set up the rag, to set up any kind of localization or customer state, you know, grounding; like you can’t just put it in there.”

20:07 FinOps X 2026 Day 1 Keynote: The Wild West of AI, Token Economics and the Evolving Role of FinOps

  • The FinOps Foundation and the Linux Foundation announced their intent to form the Tokenomics Foundation, with backing from Oracle, Google, Microsoft, IBM, JPMorgan Chase, and others, to create open standards for AI billing and token cost attribution.
  • Tokens are being positioned as the atomic unit of AI spend, and the core challenge FinOps practitioners face is answering two questions: what does AI actually cost, and how do you measure the value of intelligence produced?
  • AWS announced several concrete feature updates, including Target Coverage for Savings Plans, Granular Cost Attribution for Amazon Bedrock, and an AI-powered FinOps Agent that supports natural language queries and autonomous savings execution.
  • Microsoft confirmed plans to support FOCUS 1.4 in 2026 and highlighted Microsoft Fabric and Foundry as tools for unifying data and AI cost management across the enterprise.
  • The FinOps Foundation updated its certification path with a new Technology Value credential covering public cloud, SaaS, data platforms, and data centers, and FinOps X will fold into a broader Tokenomicon conference in San Diego in June 2027.

23:10 FinOps X 2026 Day 2 Keynote: From Alerts to Agents

  • The FinOps Foundation introduced a Crawl, Walk, Run maturity model for agentic FinOps, framing the progression from automated alerts to autonomous cost management as a structured path practitioners can follow using community-validated best practices.
  • FOCUS 1.4 was announced at the conference, continuing the Foundation’s push to standardize cloud cost and usage data, with Oracle also announcing FOCUS 1.3 support and Flexera and Google Cloud both adding FOCUS compatibility to their platforms.
  • Google Cloud presented an AI Explainability Agent alongside Automated Spend Caps and full-stack AI cost visibility in Cost Reports, representing a shift from post-billing reactive alerts toward proactive cost control for AI workloads.
  • Pinterest shared a Tokenomic Layer Cake model that separates Product AI costs from Internal AI infrastructure costs, offering practitioners a practical framework for stacking optimizations so efficiency gains compound over time.
  • The Foundation announced Tokenomicon, a new event series dedicated specifically to the economics of AI, with the first event scheduled for Amsterdam in September 2026, signaling that AI cost management is becoming a distinct discipline within the broader FinOps practice.

24:41 Justin – “I also don’t know that tokens are going to last forever. It’s lasted longer than I thought it was going to, but I think there’s a lot of pressure for people to explain tokens and how they’re measured, and once you try to do that, it’s very difficult.”

AI Is Going Great – or How ML Makes Money 

25:53 OpenAI to acquire Ona

  • OpenAI is acquiring Ona, a cloud execution platform that has served 2 million developers with secure, reproducible cloud environments, to expand the Codex agentic coding ecosystem.
  • The core technical addition is Ona’s customer-controlled execution model, which allows agents to run persistently inside an organization’s own cloud infrastructure rather than being tied to a single device or active session.
  • Codex currently has 5 million weekly users, up 400% from earlier this year, and the acquisition addresses a specific gap where longer-running agent tasks spanning hours or days need persistent, session-independent execution environments.
  • For enterprise deployments, Ona’s technology provides controls over where agents run, credential scoping, activity logging, and workflow review gates, which are requirements organizations need before moving agents from experimentation into production.
  • Developers and IT teams should note that this positions Codex as a persistent background worker across the software lifecycle, handling tasks like running tests, resolving issues, and modernizing applications without requiring an active user session.

26:43 Justin – “I’d guess this is nice that there’s a third party involved that if OpenAI ever wants to build something on top of all these data centers they’re building for Stargate, this is nice for them. But all the cloud providers are basically giving you this, too. So it’s sort of interesting.”

29:26 Results from first Anthropic Public Record

  • Anthropic surveyed nearly 52,000 Americans in late 2025 to establish a public baseline on AI attitudes, finding that 64% fear job displacement and 56% fear cognitive dependency, with both concerns dropping notably among daily AI users (54% and 46% respectively).
  • Only 15% of Americans trust AI companies to make decisions about AI development, the lowest of any institution tested, falling below the federal government at 20% and well below independent experts at 43%. This trust deficit is a notable data point for cloud and AI vendors building enterprise relationships.
  • Support for government AI regulation reached 71% overall and was bipartisan, with 79% of Democrats and 68% of Republicans in favor. Privacy, child safety, and liability for harm were the top areas where Americans want regulatory action.
  • The survey found that daily AI users support government oversight at nearly the same rate as the general public (74% vs 71%), suggesting that hands-on experience with AI does not reduce appetite for accountability and regulation.
  • Anthropic is pairing this public survey data with its Anthropic Economic Index and the 81,000-person Claude user interview study to build a multi-source picture of AI adoption, which the company says will inform its policy frameworks around mandatory safety testing and worker displacement support.

32:52 Ryan – “I think that trying to make sure that you have a broader exposure to how people are feeling and how people are using it is important, because I think that – especially when you start talking about regulation – you don’t want to regulate it for one type of person or one industry.” 

37:53 Kimi K2.7 Code: Open-Source Agentic Coding Model

  • Moonshot AI has released Kimi K2.7 Code, an open-source coding-focused model built on a Mixture-of-Experts architecture with 1 trillion total parameters and 32 billion activated per token, supporting a 256K context window with full weights available on Hugging Face.
  • Benchmark improvements over K2.6 are notable, with gains of 21.8% on Kimi Code Bench v2, 11% on Program Bench, and 31.5% on MLS Bench Lite, plus roughly 10% improvement on agentic task benchmarks measuring autonomous execution.
  • A key efficiency improvement is a 30% reduction in thinking-token usage compared to K2.6, which translates directly to lower API costs and faster responses without sacrificing benchmark scores, an important consideration for production agentic workflows.
  • The model is purpose-built for long-horizon coding tasks like multi-file refactoring and extended debugging sessions, and always runs with thinking mode enabled, meaning non-thinking requests automatically fall back to K2.6.
  • Pricing starts at $19/month through Kimi Code membership, while API access uses per-token billing at $0.95 per million input tokens, with cache misses dropping to $0.19 on cache hits, making it worth evaluating for teams running high-volume coding agent pipelines.

39:06 Justin – “Kimi 2.6 is one of my favorite open source models for coding, and I use it all the time.” 

41:43 Anthropic “pauses” token-based billing for its Claude Agent SDK

  • Anthropic announced then quickly paused a billing change for its Claude Agent SDK that would have shifted outside SDK usage to standard API rates starting June 15, with subscribers receiving only a monthly credit equal to their subscription price.
  • Under the current model, Agent SDK usage counts against weekly subscription caps rather than per-token API rates, which analysis suggests can make a Claude subscription worth many multiples of its cost compared to equivalent API spending.
  • The pause affects third-party apps and programmatic use via the claude -p command, meaning developers building automation workflows on top of Claude subscriptions can continue operating under existing limits for now.
  • For developers and businesses evaluating build-vs-buy decisions around AI agents, this situation highlights the pricing risk of building on consumption models that sit outside standard API contracts, since the underlying economics can shift with limited notice.
  • No revised timeline or alternative pricing structure has been announced, leaving Agent SDK users in a holding pattern that complicates longer-term cost planning for agentic. 

23:34 Ryan – “I imagine there are real business problems behind these; capacity and end costs, I’m sure, are a factor. I don’t think it’s all just trying to squeeze every last dollar out of consumers, but I’m not really a big fan of this pricing model.” 

Cloud Tools

44:12  Route public traffic to private applications with Cloudflare

  • Cloudflare is launching Application Services for Private Origins in closed beta for Enterprise customers, allowing public traffic to reach private applications without exposing those origins to the public internet, public IPs, or inbound firewall rules.
  • The feature works by adding a use_private_routing flag to standard DNS records, which signals Cloudflare’s proxy to route the final hop through existing private network connectivity like IPsec, GRE, CNI, or Cloudflare Mesh rather than over the public internet.
  • All of Cloudflare’s existing application services, including WAF, bot management, rate limiting, caching, and Workers, apply normally to this traffic, meaning private internal APIs and tools get the same security stack as public-facing applications without additional infrastructure.
  • The routing model extends beyond HTTP through Spectrum for TCP and UDP services and Workers VPC bindings, so databases, SSH endpoints, and AI agent backends on private IPs can all be fronted by Cloudflare without a load balancer or connector software on the origin.
  • Cloudflare is targeting general availability in Q4 2026 and has stated private-to-private traffic flows as the next milestone, where users and services on private networks would reach other private applications through the same Cloudflare security layer.

46:16 Ryan – “I’m a big fan of having gateway access to SSH endpoints for managing jump hosts and bastion, and having stuff that’s quickly stood up and taken down so that you can build inside environments, which is neat.”

AWS

47:04 Try the new console experience in Amazon Bedrock, optimized for Anthropic- and OpenAI-compatible APIs

  • AWS launched a new Amazon Bedrock console experience built around the bedrock-mantle endpoint, which supports OpenAI Chat Completions API, OpenAI Responses API, and Anthropic Messages API, making it easier for teams already using those SDKs to route requests through Bedrock without rewriting code.
  • The project-based workflow is the most practical addition here, letting developers group models, API keys, usage metrics, and code snippets under a single project, which reduces the context-switching that typically slows down the evaluation-to-production cycle.
  • Live documentation that auto-populates with your project’s model ID, region, endpoint URL, and API key is a notable developer experience improvement, since you can copy a code snippet directly from the console and run it without manual edits.
  • The console includes direct integration instructions for AI coding agents, including Claude Code, Cline, Codex, Cursor, and OpenCode, allowing teams to route those tools through Bedrock using IAM credentials or Bedrock API keys rather than direct vendor endpoints.
  • The new console is available now across 15 regions, including US East, US West, several Asia Pacific locations, and multiple European regions, though fully managed Bedrock features like Agents, Knowledge Bases, and Guardrails remain in the existing console on the bedrock-runtime endpoint. Pricing follows standard Bedrock inference rates for the underlying models used.

48:07 Justin – “I will tell you that the Bedrock Console – today – is terrible. It is not great.”  

52:10 AWS Cost and Usage Report 2.0 now supports table configurations update

  • CUR 2.0 now allows customers to update table configurations like column selection, time granularity, and export format directly through the AWS Console or SDK/CLI, eliminating the previous requirement to delete and recreate exports when adopting new features.
  • This change is particularly useful for teams running ETL pipelines against CUR data, since they previously had no in-place update path and had to manage export recreation carefully to avoid disrupting downstream jobs.
  • The update takes effect on the next scheduled export delivery, so customers should plan schema changes with their data engineering teams to avoid breaking existing cost reporting workflows.
  • There is no additional charge for this capability since CUR 2.0 exports are priced based on S3 storage and data transfer costs, though customers should review the AWS Data Exports documentation at the link in the announcement for configuration details.
  • The feature is available in all commercial AWS Regions except AWS GovCloud (US) and China Regions, which is a notable gap for customers operating in those environments who will still need to manage exports through the old delete-and-recreate approach.

52:53 Ryan – “Man, where were all these cost usage optimizations when I had to generate all the reports?” 

54:11 Amazon OpenSearch Service launches MCP Apps for agentic observability

  • Amazon OpenSearch Service now supports MCP Apps, letting AI agents in local IDEs like Claude Desktop and VS Code investigate incidents using logs, traces, metrics, and alerts stored in OpenSearch domains and Amazon Managed Service for Prometheus without switching tools.
  • Each MCP App tool call returns a dual response: a text summary for the agent to reason over and an interactive visualization rendered in the same conversation thread, keeping human review integrated into the agentic workflow.
  • The feature covers a broad set of observability use cases, including root cause analysis, distributed trace exploration, service maps, PromQL metric charts, and cross-signal correlations, all within a single conversation context.
  • Available MCP App tools span log, metrics, and trace investigation, service performance, topology, agent health, cluster health, and instrumentation scoring, giving teams a fairly complete observability toolkit through the agentic interface.
  • The feature is available in all AWS regions where the Amazon OpenSearch UI is supported, and pricing follows existing OpenSearch Service consumption costs with no separate MCP App charge mentioned in the announcement.

54:32 Ryan – “The open search API, if it’s anything like the Elastic Search API, is cumbersome to use. I find it very difficult to sort of query Elastic Search natively. And then you add all that sort of reliability and performance problems we’ve had with log ingestion and that kind of stuff – which still makes me a little twitchy. Enough time has passed where I can sort of talk about it openly now. So I kind of liked the idea of having MCP front that and having an easy way to query that data, where I can just have AI do it for me… which is great.”

56:00 Evaluate AI agents systematically with Agent-EvalKit 

  • Agent-EvalKit is an open-source Apache 2.0 toolkit that evaluates AI agents by tracing their full execution path, including tool calls and intermediate state, rather than just checking final output quality. It integrates directly with AI coding assistants like Claude Code and Kiro CLI, keeping evaluation inside the development environment.
  • The toolkit organizes evaluation into six phases: Plan, Data, Trace, Run, Eval, and Report. Each phase produces artifacts that feed into the next, and developers invoke them through natural language slash commands, with results stored in an eval directory for reuse across evaluation cycles.
  • A travel research agent case study illustrates the practical value: Response Quality scored 83.9% and looked acceptable on the surface, but Faithfulness scored only 32.3%, revealing the agent was fabricating exchange rates and temperatures whenever tool calls returned empty results. This kind of failure is invisible to output-only testing.
  • The toolkit supports OpenTelemetry-compatible tracing and integrates with frameworks including Strands, LangGraph, and CrewAI, plus evaluation libraries like DeepEval and the Strands Evals SDK
  • For production monitoring beyond pre-deployment testing, AWS recommends pairing it with Amazon Bedrock AgentCore Observability and AgentCore Evaluation.
  • Costs are not fixed since Agent-EvalKit itself is free, but LLM-as-judge metrics require Amazon Bedrock foundation model inference, so teams should review Bedrock pricing based on their model selection and test case volume before running large evaluations.

58:13 AWS announces AWS Workload Credentials Provider

  • AWS Workload Credentials Provider is an open source, client-side tool that automates certificate deployment from ACM and secrets caching from Secrets Manager, replacing custom EventBridge automation that customers previously had to build and maintain themselves.
  • The tool is particularly relevant given the CA/B Forum mandate to reduce public certificate lifetimes, which increases the operational burden of certificate rotation at scale and raises the risk of expiry-related outages.
  • It runs on Windows and Linux with built-in support for Apache and NGINX, handling certificate export, file placement, and server reload behavior through simple configuration rather than custom scripting.
  • For secrets management, it maintains full backwards compatibility with the existing Secrets Manager Agent, so teams can consolidate both use cases into a single provider without reworking existing integrations.
  • The provider is available now across all AWS Regions, works for both AWS and non-AWS workloads, and is open source on GitHub
  • Pricing follows standard ACM and Secrets Manager rates with no additional charge for the provider itself.

59:15 Ryan – “It’s great when you’re using ACM natively with Amazon, you know, and the load balancer, and it’s just sort of handled.” 

1:00:00 AWS DevOps Agent expands with custom SRE agents and MCP/A2A protocols

  • AWS DevOps Agent now supports custom SRE agents that run on a schedule within Agent Spaces, enabling teams to automate recurring tasks like daily database health checks or log anomaly detection without manual intervention.
  • The addition of MCP and A2A protocol support lets developers invoke DevOps Agent from tools they already use, including Kiro, Claude, and other coding assistants, reducing context switching during incident investigation.
  • Teams can now connect their own sub-agents built on Amazon Bedrock or third-party frameworks via A2A, effectively extending DevOps Agent with custom capabilities rather than being limited to built-in functionality.
  • Additional updates include incident-skip rules, Git-managed skills, persistent memories, human labeling for task quality tracking, and customer-created dashboards, suggesting the service is maturing toward production-grade SRE use cases.
  • The service is now available in five additional regions, though pricing details are not specified in the announcement, and teams should review the supported regions table and recent improvements page in the AWS docs before planning adoption.

1:00:52 Ryan – “Agents like this can be expensive too because there’s a lot of data… so as long as these things can remain affordable, it’s great.” 

42:46 Amazon CloudWatch Query Studio is now generally available

  • Amazon CloudWatch Query Studio is now generally available, offering a unified interface for querying and visualizing metrics using either PromQL or Metrics Insights SQL from a single workspace within the CloudWatch console.
  • Teams managing services across multiple AWS accounts and regions can use per-query cross-account and cross-region selectors to correlate metrics like latency and error rates without switching between consoles or tools.
  • The visualization options are notably broad, including line, bar, scatter plot, heatmap, histogram, pie, gauge, and number widgets, with dual y-axis support and series overrides, making it more capable than basic CloudWatch charting.
  • Query Studio integrates with CloudWatch dashboards and supports Grafana imports, which gives teams already using Grafana a migration path or a way to work across both toolsets without rebuilding queries from scratch.
  • The service is available in all commercial AWS regions except Middle East UAE, Middle East Bahrain, and Israel Tel Aviv. 
  • Pricing follows standard CloudWatch metrics query costs, so teams should review their existing CloudWatch pricing tier before adopting it at scale.

1:03:04 Justin – “…this is a cool feature. I’m glad it exists, but please stop calling everything a studio.”

1:03:41 AWS launches Cost Explorer historical data retention for accounts in billing groups

  • AWS Cost Explorer now retains historical billing data at original AWS billable rates for accounts that are part of billing groups in AWS Billing Conductor or Billing Transfer, closing a gap that previously cut off access to pre-enrollment cost history.
  • Before this change, accounts mapped to billing groups could only see pro forma rates set by the payer account, making it difficult to compare costs or run reports that spanned the period before and after joining a billing group.
  • Existing Billing Conductor and Billing Transfer customers automatically gain access to their historical data with no migration or configuration steps required, which is a practical benefit for teams already managing multi-account environments.
  • The feature supports reporting continuity for enterprises using Billing Transfer to centralize cost management across multiple AWS organizations, a common pattern for large companies or managed service providers handling consolidated billing.
  • Billing Transfer is available in all AWS Regions except GovCloud, China Beijing, and China Ningxia, and there is no additional cost mentioned for this historical data retention capability specifically.

1:04:47 Ryan – “For the 12 people that need this, they’re gonna love it.” 

1:06:01 Introducing the Kiro merch store

  • Kiro, AWS’s AI-powered IDE, has launched an official merchandise store at shop.kiro.dev with 15 items, including apparel, accessories, and developer-focused gear like mechanical keyboard keycaps and a ghost plushie for rubber duck debugging.
  • The store launch is notable mainly as a community-building effort rather than a technical announcement, signaling that Kiro is investing in developer identity and brand presence beyond the product itself.
  • From a practical standpoint, this is not an AWS infrastructure or tooling update, so listeners looking for technical developments should note this is purely a marketing and community initiative with no direct impact on cloud workflows or costs.
  • The merchandise details do reflect some developer-specific thinking, such as PBT keycaps chosen for durability and a roll-top backpack with a dedicated 16-inch laptop compartment with side zipper access, suggesting the product team is targeting working developers rather than casual fans.

1:08:14 AWS WAF adds AI traffic monetization capability to help content owners charge AI bots for content access 

  • AWS WAF now lets content publishers charge AI bots for access to their content directly at the network edge, using HTTP 402 responses and the x402 open protocol for machine-to-machine payments settled in USDC stablecoins. 
  • This addresses a real cost problem since AI bot traffic now exceeds 50% of web traffic for many publishers, with AI-specific crawlers growing over 300% year-over-year, while returning little to no referral traffic back to publishers.
  • Payment settlement is handled through Coinbase’s x402 Facilitator, with Stripe and Machine Payments Protocol support coming soon. AWS does not take a cut of content revenue, and the feature is available at no additional charge beyond standard WAF pricing.
  • The feature builds on existing AWS WAF Bot Control, which already classifies over 650 distinct AI bot types, including GPTBot, Claude-Web, and Perplexity-Bot, assigning each a verified or unverified status using cryptographic signatures or IP reputation matching. 
  • Publishers can set per-request pricing by content path, bot category, or verification tier without modifying origin infrastructure.
  • A notable constraint is that the Monetize action only works with CloudFront-associated web ACLs and is not supported for regional web ACLs, so publishers need to route through CloudFront to use this capability.
  • A test mode using blockchain testnets like Base Sepolia and Solana Devnet lets teams validate the full payment flow before going live, which is a practical consideration given the novelty of stablecoin-based payment flows in production infrastructure.

1:09:28 Ryan – “I understand a news media site – all of that content – you’re going to pay for all of the hosting and all the hits and there’s going to be zero benefit, you’re not going to have any eyes on your page. It’s someone just getting an answer on some other tool, right? So I really do understand it. And they’re already struggling for money. So it does sort of make sense to me that they would need to do something like that. And I think the alternative to this is just blocking that traffic.”

1:14:26 AWS Sign-in now supports resource-based policies and resource control policies

  • AWS Sign-in now supports resource-based policies at the account level and resource control policies (RCPs) at the organization level, giving teams a way to restrict console sign-in to specific trusted networks.
  • Policies are evaluated both at sign-in and whenever the console session requests new credentials, meaning network restrictions are enforced continuously rather than just at the initial login point.
  • RCPs integrate with AWS Organizations, so security teams can enforce consistent sign-in network controls across all accounts in an org without configuring each account individually.
  • This feature pairs with AWS Management Console Private Access to create layered controls, letting organizations define both which networks users can sign in from and which accounts those users can reach.
  • The feature is available at no additional cost in all AWS commercial regions, making it a straightforward addition to existing preventive security controls for organizations already using SCPs and Organizations.

1:15:21 Ryan – “…for SCPs, that’s where the security really wanted just these big overall ban hammers, right? Resource control is something that, I think it brings it a little bit more down to like the cloud team or someone who’s kind of more in line with the runtime, because it allows you to do contextual access based off of resource, right? Instead of granting all of the permissions to any resource, this allows you to specify the resources specifically.“

GCP

1:16:37  Introducing DiffusionGemma

  • Google released DiffusionGemma, a 26B Mixture of Experts open model under Apache 2.0 that generates text using diffusion rather than sequential token-by-token processing, producing up to 4x faster output on GPUs like the NVIDIA H100 at 1000+ tokens per second.
  • The speed advantage is specifically designed for local and low-concurrency inference scenarios, not high-traffic cloud serving where autoregressive models remain more cost-efficient. 
  • Developers building real-time interactive tools like inline editors or code infilling tools are the primary target audience.
  • The model activates only 3.8B of its 26B parameters during inference, fitting within 18GB VRAM when quantized, making it compatible with consumer GPUs like the RTX 4090 and 5090. This is a notable accessibility consideration for developers without enterprise hardware.
  • Bi-directional attention is a key technical differentiator, allowing the model to generate 256 tokens simultaneously where every token can reference all others. This enables use cases that autoregressive models handle poorly, such as code infilling, Sudoku-style constraint solving, and structured format generation.
  • DiffusionGemma is available now on Hugging Face and through GCP Model Garden, with toolchain support from vLLM, MLX, Hugging Face Transformers, Unsloth, and NVIDIA NeMo
  • Google explicitly notes output quality is lower than standard Gemma 4, so production quality-sensitive workloads should stay on the standard model

1:19:18 Choosing your surface: Antigravity 2.0, Antigravity CLI, Antigravity IDE, or Antigravity SDK 

  • Google announced Antigravity, an AI agent platform available in four distinct surfaces: a desktop app (Antigravity 2.0), a terminal-based CLI, an IDE integration, and a Python SDK. All four interfaces run on the same underlying agent harness, meaning plugins, skills, and core logic are consistent regardless of which surface you choose.
  • Antigravity 2.0 is the default recommendation for most users, offering a standalone desktop app that can manage multiple autonomous agents working across independent projects simultaneously, including scheduled tasks for things like code quality checks.
  • The CLI surface is built in Go for speed and supports headless execution, making it a practical option for SSH workflows, remote containers, or CI/CD pipelines where a GUI is not available.
  • The Python SDK is notable for teams wanting to build custom agents, as it runs on the same shared harness as Google’s official tools and allows local development with deployment to Google Cloud requiring no code changes.
  • No pricing information was provided in the announcement. 
  • Documentation and downloads are available at antigravity.google for teams evaluating which surface fits their workflow.

1:19:29 Justin – “Basically, it’s anti-gravity with the IDE, which is what they started with. And then they came out with a CLI. And now they’ve got an SDK and Anti-Gravity 2.0 as the as the desktop app all available to you now.”

1:20:53 Google expands Alabama data center campus, funds community efforts

  • Google is investing $1.5 billion in 2026 and 2027 to expand its existing data center campus in Jackson County, Alabama, a facility that has operated since 2019 on a repurposed former coal-plant site. 
  • This expansion signals continued growth in Google’s physical infrastructure footprint in the southeastern United States.
  • The expansion is notable for its self-funded model, with Google covering 100% of its own power and infrastructure costs, which is worth noting for GCP customers thinking about how hyperscaler investments translate to regional capacity and reliability.
  • Google is pairing the infrastructure investment with a $2 million Energy Impact Fund in partnership with TVA and CAANEAL, focused on local energy efficiency and weatherization programs, reflecting a broader pattern of data center operators addressing community energy concerns alongside capacity growth.
  • Community-facing commitments include $550,000 in STEM kits for fourth through eighth graders and digital skills training that has reached over 130,000 Alabamians to date, which speaks to the workforce pipeline considerations that often accompany large-scale data center expansions.
  • For GCP customers, the practical takeaway is that continued infrastructure investment in this region supports long-term availability and capacity for workloads running in Google’s US-based regions, though specific new region or zone announcements were not part of this particular update.

1:22:22 Justin – “…sustainability now becomes how do we make people not mad at us?”

1:22:34 Brazos liquid cooling system for air-cooled data centers

  • Google developed Brazos, a rack-mounted closed-loop liquid-to-air cooling system designed to handle chips exceeding 1000W thermal design power without requiring full facility retrofits. It installs one rack at a time into existing air-cooled data centers, separating the internal liquid loop from facility water supplies.
  • Each Brazos unit supports 60 kW of thermal load per rack across three modular chassis, runs on deionized water or a 25% propylene glycol mixture, and operates on 40-60V DC input connecting directly to standard rack busbars. Pumps and fans are hot-swappable field-replaceable units to reduce repair time.
  • Brazos uses OCP ORv3 form-factor racks and Google plans to open-source the full technical specifications through the Open Compute Project forum in the coming months, inviting manufacturers and thermal engineers to produce and market the design independently.
  • The primary audience for this announcement is data center operators running legacy air-cooled facilities who need to support high-density AI or HPC workloads without the capital expense and time required for full chilled water infrastructure upgrades.
  • No pricing information was provided in the announcement. 
  • Organizations interested in adopting the design should monitor the Open Compute Project forum at opencompute.org for upcoming specification releases and engage with Google’s manufacturing suppliers directly.

1:24:22 Ryan – “Back when I was building data centers, like it was one of those things, everal data centers that we had were half empty, right? Because we didn’t have the power density. And so it’s l millions of square feet of just empty space with a little like rack mount sticking out of the floor… so I like seeing these kinds of announcements”

Azure

1:25:32 Stop wasting time and use Custom Extensions for PIM approvals

  • Custom Extensions for PIM allow organizations to inject their own approval logic into the Privileged Identity Management workflow via a standard REST API, replacing manual approval steps with automated validation against external systems like ServiceNow, Workday, or Dynamics.
  • When a user submits a PIM activation request, the system pauses its internal checks and sends an HTTP payload to your custom API endpoint, which then returns an approved or denied response that PIM executes automatically, supporting both pre-approval and post-approval configurations.
  • The licensing requirement is a notable consideration: Custom Extensions require Entra ID Governance licenses or Entra Suite, not just the Entra P2 licenses that cover standard PIM functionality, which adds cost for organizations looking to automate their approval workflows.
  • This feature is best suited for organizations that already have a mature PIM process in place and want to reduce admin overhead through ticket validation automation, rather than those still working on basic PIM adoption.
  • Setup involves creating the custom extension in the Entra Admin Center under ID Governance, linking it to specific roles or groups, and connecting it to an App Registration with the requestedAccessTokenVersion set to 2, which is a non-obvious configuration step worth noting for teams planning implementation.

1:26:58 Ryan – “I think we’re going to see a lot more of this as everyone is trying to deal with Agentic identity.” 

1:27:14 AI 200 – Azure Container Apps Express: Blazing-Fast Deployments Without the Overhead

  • Azure Container Apps Express App is a new preview creation mode that eliminates the need to pre-provision a Container Apps Environment, reducing deployment time to under 3 minutes from zero to a publicly accessible URL. 
  • It is currently only accessible via containerapps.azure.com and the Azure CLI, not the main Azure Portal.
  • The Express mode auto-provisions its own environment, requires only three inputs (app name, resource group, and region), and defaults to a public endpoint on port 80 with 0.5 vCPU and 1 GB memory, making it well-suited for rapid prototyping and CI/CD pipelines.
  • Express Apps support scale-to-zero and up to 300 maximum replicas with KEDA-based scale rules, putting it on par with standard Container Apps for burst scenarios despite the simplified setup experience.
  • Regional availability is currently limited to East Asia and West Central US during preview, which is a notable constraint for teams with data residency requirements or latency-sensitive workloads in other regions.
  • Pricing details are not explicitly covered in the announcement, so teams should verify costs at the Azure pricing page before adopting Express mode for production workloads, particularly given the auto-provisioned environment model which may differ from standard Container Apps billing.

1:27:55 Justin – “This sounds like a good way to waste a lot of money…” 

1:29:25 Introducing scheduled antivirus scans on Microsoft Defender Linux

  • Microsoft Defender for Endpoint on Linux now supports scheduled antivirus scans, a capability that security teams have long relied on for consistent threat coverage across device fleets. 
  • This addresses a notable gap for organizations running Linux workloads under compliance frameworks that require periodic full-system scans.
  • The feature helps catch dormant or previously missed threats that real-time protection may not surface, making it particularly relevant for servers handling sensitive workloads where periodic deep scans are part of audit requirements.
  • This addition brings Linux endpoint protection closer to feature parity with the Windows version of Defender, which matters for organizations managing mixed OS environments through a single security platform like Microsoft Defender XDR.
  • Target customers are enterprise security and compliance teams running Linux servers in regulated industries such as finance, healthcare, or government, where scheduled scan logs serve as evidence for audits.
  • Pricing is tied to existing Microsoft Defender for Endpoint licensing rather than being a separate add-on, so current Linux Defender customers should be able to adopt this without additional cost. 
  • Organizations not yet licensed should check the Microsoft Defender for Endpoint plan details at microsoft.com/security for current pricing tiers.

Oracle

1:32:43 Oracle Announces Record Q4 and FY 2026 Results Driven by Cloud Infrastructure & Cloud Applications

  • Oracle reported Q4 FY2026 total revenue of $19.2 billion, up 21% year-over-year, with cloud infrastructure (IaaS) growing 93% and total cloud revenue reaching $9.9 billion. 
  • The growth is notable but worth watching given that free cash flow was negative $23.7 billion as Oracle continues heavy datacenter investment.
  • The Remaining Performance Obligations figure of $638 billion, up 363% year-over-year, sounds striking until you read the fine print: a substantial portion comes from large AI contracts where customers either prepaid for GPUs or supplied their own hardware, totaling $75 billion. This structure shifts capital burden to customers rather than Oracle.
  • Oracle Multicloud AI Database reportedly grew 404% in Q4, which the company calls its fastest growing product ever, though it is growing from a smaller base and the metric reflects early adoption momentum rather than established scale.
  • Oracle is guiding for $90 billion in total FY2027 revenue and expects cloud revenue growth of 57 to 64% in Q1 FY2027, which would require sustaining the current infrastructure buildout funded by roughly $40 billion in planned debt and equity financing next fiscal year.
  • The Oracle Health AI rewrite of the Cerner system is positioned as a near-term growth driver, with Oracle projecting double-digit growth for that business in FY2027. Given Cerner’s historically troubled integration, listeners should watch whether execution matches the projection.

1:33:59 Justin – “They’re spending a lot of money on AI, so I hope it works out for everybody…” 

Cloud Journey 

1:32:43 Running an AI-native engineering org 

  • Anthropic’s engineering director describes how agentic coding shifted the primary bottleneck from writing code to verifying it, meaning code review, security checks, and correctness validation now consume the time that implementation used to take.
  • The team replaced traditional sprint planning and design docs with just-in-time planning built around prototypes and PR discussions, reflecting that long-horizon roadmaps became obsolete when execution speed increased substantially.
  • Human review is now reserved for specific high-stakes areas like security-sensitive code, legal risk, and product judgment, while Claude handles style, linting, bug catching, and test generation automatically.
  • Role boundaries have blurred noticeably, with product managers writing more code and engineers taking on design and content work, which has practical implications for how teams hire and structure responsibilities.
  • The article suggests engineering leaders track three metrics as they adopt agentic workflows and cautions against treating throughput as the primary success measure, since the real goal is solving the underlying problem faster, not just generating more output.

Closing

And that is the week in the cloud! Visit our website, the home of the Cloud Pod, where you can join our newsletter, Slack team, send feedback, or ask questions at theCloudPod.net or tweet at us with the hashtag #theCloudPod





Download audio: https://episodes.castos.com/5e2d2c4b117f29-10227663/2506949/c1e-p8j8uwdmm5s11jo6-6z8vrgn6c7xk-vpdxyw.mp3
Read the whole story
alvinashcraft
54 minutes ago
reply
Pennsylvania, USA
Share this story
Delete

Designing Opal: Building a Personal AI Pet

1 Share
From: Microsoft Developer
Duration: 9:27
Views: 2,348

What happens when a designer builds their own AI assistant from scratch?

In this episode of Cozy AI Kitchen, Thoa Nguyen walks through Opal—a personal AI “pet” running on a Raspberry Pi, powered by agents, GPT-4, and browser automation. From real-time web browsing to Discord integrations and even updating GitHub projects, this isn’t just a demo—it’s a glimpse at how designers are building full AI systems.

If you’re curious about agents, personal AI, or how to actually build something like this yourself, this one’s packed with ideas.

#AI #Agents #RaspberryPi #GPT4 #DeveloperTools

Speakers:
Thoa Nguyen, Product Design @ Microsoft CoreAI
https://www.linkedin.com/in/thoanguyen/

John Maeda - Host, Cozy AI Kitchen
VP of Design and Artificial Intelligence, Microsoft
https://www.linkedin.com/in/johnmaeda/

🎯 Explore more Cozy AI Kitchen episodes
https://aka.ms/CAIK-YTPlaylist

🔔 Subscribe for more Microsoft Developer content

Read the whole story
alvinashcraft
55 minutes ago
reply
Pennsylvania, USA
Share this story
Delete

OpenSpec - with Tabish Bidiwale

1 Share

In this episode, I was joined by Tabish Bidiwale, the creator of OpenSpec! As a daily user (and big fan) of OpenSpec, this was a great honour to geek-out about it with the creator, and chat in general about coding with AI agents.

For a full list of show notes, or to add comments, please see the website here





Download audio: https://www.buzzsprout.com/978640/episodes/19407103-openspec-with-tabish-bidiwale.mp3
Read the whole story
alvinashcraft
55 minutes ago
reply
Pennsylvania, USA
Share this story
Delete

Why does Windows sometimes say a file is ‘in use’ even when I’ve closed the app that was using it?

1 Share
From: Microsoft Developer
Duration: 1:54
Views: 440

That mysterious "file is in use" error isn't always what it seems. Mark Russinovich breaks down why Windows locks files, what might actually be using them, and how developers can investigate the cause.

More from Mark: https://msft.it/6050vqV92

#windows #developers #sysinternals #softwareengineering #debugging

Read the whole story
alvinashcraft
55 minutes ago
reply
Pennsylvania, USA
Share this story
Delete

156: Slate is America's Cheapest EV Truck: No One Will Buy the Base

1 Share
In this episode:
  • Slate reveals official truck price, opens orders
  • Lucid cuts large chunk of employees. Again.
  • Rivian kills cheapest R1 trims 
  • Much, much more!

Regular cohosts:
Tom Moloughney from State of Charge and EVchargingstations.com
https://evchargingstations.com/ https://www.youtube.com/StateOfChargeWithTomMoloughney
Kyle Conner from Out of Spec Studios
https://outofspecstudios.com/
Martyn Lee from EV News Daily
https://www.evnewsdaily.com/
Domenick Yoney from Drive Electric with Domenick
https://www.youtube.com/@DriveElectricWithDomenick





Download audio: https://dts.podtrac.com/redirect.mp3/audioboom.com/posts/8921230.mp3?modified=1782474166&sid=5141110&source=rss
Read the whole story
alvinashcraft
55 minutes ago
reply
Pennsylvania, USA
Share this story
Delete
Next Page of Stories