Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

Above the Cloud: Building Data Centers in Space - Richard Campbell - NDC Copenhagen 2026

1 Share
From: NDC
Duration: 55:07
Views: 40

This talk was recorded at NDC Copenhagen in Copenhagen, Denmark. #ndccopenhagen #ndcconferences #developer #softwaredeveloper

Attend the next NDC conference near you:
https://ndcconferences.com
https://ndccopenhagen.com/

Subscribe to our YouTube channel and learn every day:
/ @NDC

Follow our Social Media!

https://www.facebook.com/ndcconferences
https://twitter.com/NDC_Conferences
https://www.instagram.com/ndc_conferences/

#ai #cloud #space

The demand for AI data centers has reached a fever pitch - stressing the power grid and people's nerves. Does putting those data centers in space make any sense?

Join Richard Campbell as he explores the challenges and potential of putting data centers into orbit. The balance of cost, performance, latency, power, cooling, and reliability makes for a difficult mix - but the potential for the future is significant!

Read the whole story
alvinashcraft
31 minutes ago
reply
Pennsylvania, USA
Share this story
Delete

You don’t understand DNS like you think you do

Ryan welcomes Cricket Liu, DNS expert and Chief Evangelist at Infoblox, to the show to talk all things DNS. They cover the evolution of one of the oldest DNS server implementations, BIND, and what the future holds for protected DNS configurations; the realities of security threats like DDoS and DNS spoofing; and why outages often trace back to a lack of understanding of DNS’s fundamental role.

Microsoft Highlights Visual Studio Live! Event Lineup and Longtime Developer Community Role

A Microsoft MVP Blog post on Visual Studio Live!'s longevity arrives as the 2026 conference series continues with upcoming stops at Microsoft HQ, San Diego and Orlando.

Upcoming June 2026 Microsoft 365 Champion community call

Join our next community call on June 23, 2026, to explore Microsoft 365 Copilot Cowork and learn how it can really help you get stuff done.

Reminder: Our community calls are in the Teams webinar format, so you must register to receive the link to join. The join link will be sent to you in email with your webinar registration confirmation.

The calls will still start at 5 minutes past the hour for both sessions (at 8:05 AM and 5:05 PM PT), and they will still end at the top of the hour (9:00 AM and 6:00 PM PT, respectively).

While our calls are open to everyone, you must be a member of the Microsoft 365 Champion Program in order to access the presentation materials - the access link is in the initial welcome email and the monthly newsletter emails sent the week before the community calls.

An on-demand recording will still be available on our Driving Adoption > Events pages, as well as on our Microsoft Community Learning YouTube channel.

If you have not yet joined our Champion community, sign up here to get access to the monthly newsletters, calendar invites, and program assets (e.g., the presentations).

Episode 577: Let’s go to Buc-ee’s

This week, we discuss the Fable ban, SpaceX's $60B Cursor buy, and why Lovable wins when AI picks your stack. Plus, Europeans are at the World Cup and already drank Boston dry.

Watch the YouTube Live Recording of Episode 577

Runner-up Titles

  • Waited out the storm
  • Maybe we should build some castles or something
  • Years of lawsuits ahead of us
  • The ultimate dream
  • AI SEO
  • I hope you’re paid by the hour
  • Always be monitoring to me


Download audio: https://aphid.fireside.fm/d/1437767933/9b74150b-3553-49dc-8332-f89bbbba9f92/8f1b7e9e-84d1-4c00-be69-93fc0aef149a.mp3

358: AI Spend Limits Because Frontier Models Aren't Free Therapy

Welcome to episode 358 of The Cloud Pod, where the weather is always cloudy! 

Justin, Matt, and Ryan (who, rumour has it, was working on an Eagles music podcast) are in the studio this week to bring you all the latest in AI and cloud news (and begging for an AI spend limit increase), including Anthropic wanting everyone – except themselves – to slow down AI development, GitHub’s insane number of commits, and even an announcement from CoreWeave, plus so much more. Let’s get started! 

Titles we almost went with this week

  • Stop Configuring Domains One by One Like a Peasant
  • SSH Into Your AI Agent Like It’s 1999
  • Your AWS Bill Finally Has an AI Babysitter
  • Stop Blaming Engineering, the AI Will Do It Now
  • GPU Queue Anxiety Meet Your Serverless Spark Therapist
  • One Wildcard Certificate to Rule All Subdomains
  • One PTU Reservation to Rule All Regions
  • Twelve Billion Parameters Walk Into a Laptop
  • Squeezing Gemma 4 Until the Bits Cry
  • Azure Cobalt 200 VMs Are Really Arm-ed and Dangerous
  • AI has gone all Fables and Myth
  • Arm-ed she blows: but probably not to a region near you
  • Dash to change your password as Dashlane gets owned
  • Siri AI shows just how slow Gemini is
  • AI Announces going public, and then spreads Myths about AI development

A big thanks to this week’s sponsors:

There are many cloud cost management tools out there, but only Archera provides insured commitments. It sounds fancy, but it’s really simple. Archera gives you the cost savings of a 1 or 3-year AWS Savings Plan with a commitment as short as 30 days. If you do not use all the cloud resources you have committed to, Archera will literally cover the difference. Other cost management tools may say they offer “insured commitments”, but remember to ask: Will you actually give me my rebate? Because Archera will. 

Check out thecloudpod.net/archera to schedule a demo today. 

General News

01:27 How GitHub plans to win developers back

  • GitHub’s scale challenge has grown substantially beyond earlier projections. 
  • The platform processed 1 billion commits in all of 2025, but now handles 1.4 billion commits per month, with AI agents alone generating over 17 million pull requests monthly.
  • The technical remediation work has shifted from surface-level scaling to architectural rebuilding. GitHub has addressed MySQL contention, moved webhooks off MySQL entirely, rewritten the GitHub Actions job dispatch system, and is migrating performance-sensitive code from its Ruby monolith to Go.
  • GitHub’s migration to Microsoft Azure, previously reported as a capacity move, is now described as a deeper infrastructure overhaul. 
  • The goal is service isolation so that a degraded subsystem like Actions does not cascade failures to Git or other core services.
  • Microsoft is providing engineering support from teams with experience scaling systems at comparable load levels, which represents a more direct operational involvement than what was previously discussed.
  • New feature releases like the Copilot CLI app are being developed outside the core GitHub.com infrastructure, which GitHub says allows continued product work without adding risk to the systems currently under remediation.

03:0 Ryan – “I’d actually like to see AI coding take this up a little bit, because I think it is a ridiculous sort of growth that I don’t think is sustainable, and so much of vibe-coded garbage is really bloated…But there are definitely functionality things that it can do a lot more efficiently, and doesn’t.” 

AI Is Going Great – or How ML Makes Money 

07:44 The Interoperable Lakehouse: Agency Over Your Data

  • Snowflake’s Interoperable Lakehouse is now generally available, built on Apache Iceberg v3, Apache Polaris, and a new Open Semantic Interchange spec with 54 participating vendors. 
  • The Iceberg v3 support adds VARIANT for semi-structured data, row lineage, deletion vectors, nanosecond timestamps, and geospatial types, closing gaps that previously made Iceberg impractical for many workloads.
  • Horizon Catalog now supports full bidirectional read and write access from external engines like Spark, Trino, and PyIceberg via vended credentials, meaning teams can define governance policies once in Snowflake and have them enforced across any compatible engine without data migration.
  • Zero-copy integrations with SAP (GA), Salesforce, and Workday (private preview) bring enterprise system data into Snowflake without ETL pipelines, preserving semantic context so AI agents reason over current, governed data rather than stale copies.
  • Managed Iceberg replication and failover are coming soon to GA, and an Optimized Refresh feature in public preview showed 1.6x to 22x faster replication in testing, which directly reduces Recovery Point Objectives for mission-critical workloads.
  • Horizon Context and Semantic View Autopilot (GA) address the semantic fragmentation problem by automatically generating and maintaining shared business definitions across databases, BI tools, and data pipelines, giving AI agents a consistent semantic layer rather than conflicting definitions across systems.

09:25 Snowflake CoCo: AI Coding Agent for the Modern Data Stack

  • Snowflake announced CoCo at Summit 2026, expanding it from an AI coding agent into a full AI development platform with a native desktop app for Windows and macOS, Cloud Agents running inside Snowsight, an Agent SDK, and upcoming Slack and mobile integrations. 
  • Each Cloud Agent session provisions an isolated container that can run Python, shell commands, dbt builds, and web searches with no local setup required.
  • On the ADE-Bench framework from dbt Labs, CoCo achieved a 72.1% pass rate on real-world data engineering tasks, outperforming both Claude Code and OpenAI Codex, which each scored 65.1%. 
  • CoCo also used 51% fewer tokens and completed tasks 8% faster than Claude Code on Opus 4.7, attributed to targeted data exploration and native tool integrations with Snowflake, dbt, and Airflow instead of bash-based workflows.
  • The CoCo Agent SDK packages the same agentic engine as an installable library for JavaScript and Python, giving developers programmatic access to Snowflake querying, SQL execution, codebase search, and file editing without building that infrastructure themselves. This allows platform engineers to embed CoCo into CI/CD pipelines, backend services, and internal tools.
  • Governance is enforced at the infrastructure level, with every CoCo operation running under the user’s existing Snowflake RBAC, LLM inference staying within Snowflake’s security perimeter, and full prompt logging, query tagging, and audit trails available for admin oversight. 
  • This addresses a common gap where generic coding agents generate plausible-looking code that fails in governed production environments.

09:59 Snowflake CoWork: The Personal Work Agent for Every Knowledge Worker

  • Snowflake rebranded Snowflake Intelligence to CoWork, positioning it as a personal work agent for knowledge workers that combines proactive task automation, multi-agent orchestration, and persistent memory across sessions. 
  • The system moves beyond reactive Q&A toward background monitoring, scheduled analysis, and direct action in tools like Slack, Gmail, Salesforce, and Jira via MCP connectors.
  • The upcoming Cortex Sense context layer is a notable technical addition, automatically learning business definitions from query history, dashboards, and metadata without manual configuration. Internal testing showed 83% accuracy on complex enterprise queries with Cortex Sense enabled, compared to 47% without it and 23% for frontier coding agents using Snowflake MCP.
  • Deep Research, moving to general availability soon, uses a multi-agent swarm orchestration system to analyze both structured and unstructured enterprise data in parallel, outperforming single-agent systems by over a third on Snowflake’s Hybrid Deep Research Benchmark.
  • This allows users to get fully cited analytical reports in minutes on questions that previously required days of analyst work.
  • Several features are entering public or private preview, including Memory for persistent user preferences, User Skills for recording reusable multi-step workflows, Async Agent API for long-running background tasks, and an iOS mobile app for full CoWork access on the go.
  • The governance model is worth noting for enterprise buyers: every agent action is scoped by role-based access controls, admin-defined policies determine what agents can do autonomously versus what requires human approval, and a complete audit trail logs all actions with policy reasons.

10:29 Justin – “I assume Anthropic will be suing them any moment for trademark infringement, but nice to see that you’re getting some smartness for the data friends who desperately need all the DevOps help they can get. So I appreciate they’re getting these tools.”

16:00 Anthropic urges global pause in AI development

  • Anthropic published a blog post calling on major AI labs to consider slowing development, citing the risk of recursive self-improvement, where AI systems could enhance their own capabilities without human intervention. Co-author Jack Clark estimated this could occur within two years.
  • The proposal draws a direct parallel to nuclear arms control, suggesting a global agreement and verification regime. Anthropic noted a key challenge: training runs are far easier to conceal than missile silos, raising practical questions about enforcement.
  • The call for a slowdown comes as Anthropic reported an annualized revenue run rate on track for $50 billion by the end of June 2026, up from $9 billion at the end of 2025, and filed confidential IPO paperwork at a valuation near $1 trillion.
  • Critics, including David Sacks, characterized the move as regulatory capture, arguing that established players advocating for development restrictions could disadvantage newer or smaller competitors in the AI space.
  • For cloud practitioners, the broader implication is that compute governance and training run transparency may become compliance considerations, particularly if international frameworks modeled on arms treaties gain traction among governments.

16:41 Ryan – “This has been what people have been sort of warning for ages with AI development, and this isn’t anything new. I’m surprised by the timing of it because it doesn’t make sense to me that they’re doing this now, but this is a huge concern. And I know just from trying to secure workloads in my day job, you try to put human-in-the-loop flows in place, but you know, people don’t really want to be in the loop. The whole advantage of using AI is the velocity gains. So having a human that does all the approval is problematic.”

20:04 Claude Fable 5 and Claude Mythos 5

  • Anthropic launched Claude Fable 5 for general availability and Claude Mythos 5 for restricted access, both priced at $10 per million input tokens and $50 per million output tokens, which is less than half the cost of the previous Mythos Preview model. 
  • Fable 5 is the general-use version with safety classifiers active, while Mythos 5 is the same underlying model with certain safeguards lifted for vetted cybersecurity and biology partners.
  • The models introduce a tiered safety classifier system that automatically routes flagged requests in cybersecurity, biology/chemistry, and distillation categories to Claude Opus 4.8 instead of refusing outright. 
  • Anthropic reports this fallback triggers in fewer than 5% of sessions, and external red-teaming found zero successful universal jailbreaks on harmful cyber queries across 30 different public jailbreak techniques.
  • On the software engineering side, Stripe reported Fable 5 completed a codebase-wide migration across a 50-million-line Ruby codebase in one day, a task estimated to take a full team over two months manually. The model also scores highest among frontier models on Cognition’s FrontierCode evaluation for production-quality coding standards.
  • Mythos 5 demonstrated autonomous scientific research capabilities, including outperforming a recently published Science journal model on a genomics task despite being 100 times smaller, and accelerating drug design workflows roughly ten times in internal protein design testing.
  • Anthropic is requiring 30-day data retention for all Mythos-class model traffic, including on third-party surfaces, specifically to detect novel jailbreaks and cross-request attacks, with explicit commitments not to use this data for model training.
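
To make the quoted per-token prices concrete, here is a minimal Python sketch of the arithmetic. The two rates come from the notes above; the function name and the sample workload are hypothetical, and the sketch ignores any caching, batching, or volume discounts, none of which are described here.

```python
# Prices quoted above, in USD per million tokens.
INPUT_PRICE_PER_MILLION = 10.0
OUTPUT_PRICE_PER_MILLION = 50.0


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the quoted rates."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical workload: 200k input tokens and 20k output tokens.
    print(f"${request_cost(200_000, 20_000):.2f}")
```

Because output tokens cost five times as much as input tokens at these rates, long generations dominate the bill well before long prompts do.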

23:34 Matt – “I would also say you gotta get the foundation of your house set up. So if you are patching, it’s not that you’re patching, it’s how you’re patching… I don’t want somebody, to use a very simple example, I have fifty EC2 instances or VMs, and to do patching, I can’t have somebody log into fifty VMs. That’s not sustainable, and that’s not gonna work. Ryan in security here will check the box saying you are doing patching, but I’ve wasted three people’s days on this. But if you build it out so that each thing is an auto scaling group and everything else, which is where you’re going with the CICD stuff, and you build that proper workflow out, then patching is just release the new image.”

Security 

29:46 Dashlane explains how attackers managed to download encrypted password vaults

  • Attackers exploited Dashlane’s device enrollment API by brute-forcing six-digit one-time tokens sent to user email addresses, successfully registering new devices on fewer than 20 accounts and downloading encrypted vaults before automated lockouts stopped the campaign.
  • The attack highlights a known tradeoff in OTP-based authentication: six-digit numeric codes have only one million possible values, making them vulnerable to brute force if rate-limiting and lockout mechanisms are not sufficiently aggressive.
  • Downloaded vaults remain encrypted and unreadable without the user’s master password, which Dashlane never stores, so the practical risk to affected users depends entirely on the strength of their master password.
  • This incident is a useful case study for developers building device enrollment or account linking flows, as it demonstrates how API endpoints handling authentication tokens need strict rate limiting, anomaly detection, and account lockout thresholds to prevent automated abuse.
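
The one-million-value point is easy to quantify. The sketch below is a rough model, not a description of Dashlane's actual controls: it assumes codes are uniformly random, that the attacker never repeats a guess for the same code, and that a lockout caps the guesses per issued code. The function name and the sample limits are hypothetical.

```python
CODE_SPACE = 10 ** 6  # six decimal digits give 1,000,000 possible codes


def success_probability(attempts_per_code: int, codes_issued: int = 1) -> float:
    """Chance that an attacker guesses at least one code.

    Assumes uniformly random codes, distinct guesses per code, and a
    lockout after `attempts_per_code` tries on each issued code.
    """
    per_code = min(attempts_per_code, CODE_SPACE) / CODE_SPACE
    return 1 - (1 - per_code) ** codes_issued


if __name__ == "__main__":
    # Hypothetical limits: compare a strict lockout with a loose one.
    for attempts in (5, 1_000, 100_000):
        print(attempts, success_probability(attempts))
```

Under this model the per-code odds scale linearly with the number of guesses allowed, which is why the lockout threshold matters more than the code length once the code space is fixed.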

30:55 Ryan – “And right now, it’s the strength of that master password. But with quantum encryption, it’s going to be able to break through the algorithm generally.” 

Cloud Tools

36:30 HashiCorp: rethinking infrastructure access in the age of agentic AI

  • HashiCorp Boundary addresses a growing security gap where AI agents need infrastructure access, but traditional IAM models were designed for human users with predictable access patterns. 
  • The core value is giving each AI agent a unique identity with just-in-time credentials rather than static long-lived secrets.
  • Boundary’s credential injection feature means AI agents never directly handle or see credentials at any point during a session. 
  • When paired with HashiCorp Vault, it generates short-lived dynamic credentials that expire after use, which limits the blast radius if an agent or orchestration layer is compromised.
  • The session-focused control plane enforces identity-aware authorization at the connection layer before infrastructure access is established, rather than relying on application-layer gateways. This means the entire network is abstracted away from agents, and all connections route through a Boundary proxy so only authorized identities can establish sessions.
  • The incident response use case in the article is worth noting because it shows each discrete action getting its own ephemeral session account that is deactivated once its purpose is fulfilled. This means standing privileges are continuously revoked rather than persisting across an agent’s entire operational lifetime.
  • Complete session recording and audit logging give security teams the ability to replay and review every action an AI agent took, tied to a specific operator, intent, and timeframe. 
  • This addresses the compliance challenge organizations face when they cannot see or verify what autonomous agents are doing across their infrastructure.
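
The short-lived credential idea can be illustrated without any vendor API. The following Python sketch is purely hypothetical and is not how Boundary or Vault are implemented or called: it only shows a credential that carries its own expiry, so a leaked token stops working once its time-to-live has passed. All names and the default TTL are assumptions.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EphemeralCredential:
    """Hypothetical per-agent credential with a built-in expiry."""
    agent_id: str
    token: str
    expires_at: float  # Unix timestamp

    def is_valid(self, now: Optional[float] = None) -> bool:
        current = time.time() if now is None else now
        return current < self.expires_at


def issue_credential(agent_id: str, ttl_seconds: float = 300.0,
                     now: Optional[float] = None) -> EphemeralCredential:
    """Mint a random token for one agent that expires after ttl_seconds."""
    issued_at = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    return EphemeralCredential(agent_id, token, issued_at + ttl_seconds)


if __name__ == "__main__":
    cred = issue_credential("agent-1", ttl_seconds=60)
    print(cred.agent_id, cred.is_valid())
```

The `now` parameter exists only so expiry can be checked deterministically; a real broker would also need revocation and would keep the token away from the agent entirely, as the notes above describe.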

37:52 Ryan – “I’m so annoyed by this because they’re like, this is rethinking an age of agentic AI. No, this is what we should do for all authentication, not just AI. It doesn’t treat anything about AI. It doesn’t identify AI agents. And it’s just setting up a user within HashiCorp Boundary and then assigning that user to an agentic AI, just like a human. So this doesn’t actually address anything agentic. And these things should be patterns we need to be moving to in general.” 

AWS

42:46 Improve your application resilience with Amazon Cognito multi-Region replication 

  • Amazon Cognito now supports multi-region replication, automatically synchronizing user profiles, credentials, and pool configurations from a primary region to a secondary region of your choice. 
  • This eliminates the need for custom-built replication solutions that previously created security risks and operational overhead.
  • The feature is read-only on the secondary side, meaning authentication continues during failover, but new user registrations and profile updates are unavailable. Teams should note that Lambda triggers, WAF configurations, and log streaming must be manually configured in the target Region separately.
  • A notable requirement is that customers must configure a multi-region customer-managed KMS key before enabling replication, and OIDC issuer endpoints must be updated across all client applications, including mobile app store resubmissions. This upfront migration work is a practical consideration before adoption.
  • Pricing is an add-on to the existing Essentials and Plus tiers, costing $0.0045 per monthly active user per replica Region on Essentials and $0.006 on Plus, with M2M authentication adding a 30% surcharge on standard token pricing. The feature is available across 20-plus Regions spanning North America, Europe, Asia Pacific, and South America.
  • For regulated industries like healthcare and financial services, the companion customer-managed keys feature provides encryption control that can help meet compliance requirements, and is available in a broader set of regions, including AWS GovCloud.
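
For budgeting, the add-on arithmetic is simple enough to sketch in Python. The two per-user rates are the ones quoted above; the function name, the `replica_regions` parameter, and the sample user count are hypothetical, and the sketch leaves out the M2M surcharge and any free tier or volume discount, which are not detailed here.

```python
# Quoted add-on rates in USD per monthly active user per replica Region.
RATE_PER_MAU = {"essentials": 0.0045, "plus": 0.006}


def replication_addon_cost(monthly_active_users: int, tier: str,
                           replica_regions: int = 1) -> float:
    """Estimate the monthly replication add-on for a user pool."""
    return monthly_active_users * RATE_PER_MAU[tier] * replica_regions


if __name__ == "__main__":
    # Hypothetical pool of 100,000 monthly active users.
    for tier in RATE_PER_MAU:
        print(tier, round(replication_addon_cost(100_000, tier), 2))
```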

43:54 Matt – “… it’s just a nice quality of life improvement to actually get this out.”

45:36 Customize federated sign-in with new Amazon Cognito Lambda trigger

  • Amazon Cognito now supports an inbound federation Lambda trigger that intercepts federated authentication responses from external identity providers before user attributes are written to the user pool, giving developers programmatic control over attribute transformation, filtering, and enrichment.
  • For B2B and SaaS applications, the trigger solves a practical problem: enterprise SAML providers can send hundreds of group memberships that exceed Cognito’s 2,048-character attribute limit, and the trigger lets developers filter and normalize groups without coordinating changes with customer IT departments.
  • For B2C applications, the trigger enables automatic account linking across multiple sign-in methods by matching federated email addresses to existing local Cognito accounts, preventing duplicate user records when customers forget they already registered with a different provider.
  • The trigger runs on every federated sign-in rather than only at initial account creation, which means linking logic and attribute transformations apply continuously, and developers always have access to the latest IdP attributes.
  • Key implementation constraints to note: the Lambda function must complete within 5 seconds, errors in the function can block authentication for legitimate users, and automatic email-based account linking will not work with Apple’s Hide My Email feature.
  • The trigger is available now across all regions where Cognito is supported, with no separate pricing beyond standard Cognito and Lambda costs.
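
The group-filtering use case boils down to a small piece of string handling. The Python sketch below shows only that core logic and deliberately avoids the trigger's event format, which is not described in these notes: the function name, the prefix convention, and the comma-joined output are all assumptions made for illustration.

```python
MAX_ATTRIBUTE_CHARS = 2048  # attribute limit cited above


def filter_groups(groups: list[str], allowed_prefix: str,
                  max_chars: int = MAX_ATTRIBUTE_CHARS) -> str:
    """Keep groups with the given prefix and join them within max_chars.

    Groups are kept in their original order; once the next group would
    push the comma-joined string past the limit, the rest are dropped.
    """
    kept: list[str] = []
    length = 0
    for group in groups:
        if not group.startswith(allowed_prefix):
            continue
        extra = len(group) + (1 if kept else 0)  # +1 for the comma separator
        if length + extra > max_chars:
            break
        kept.append(group)
        length += extra
    return ",".join(kept)


if __name__ == "__main__":
    # Hypothetical group names from an enterprise identity provider.
    print(filter_groups(["app-admin", "hr-all", "app-viewer"], "app-"))
```

Given the 5-second limit and the note that errors can block sign-in, logic like this is best kept free of network calls and wrapped so that a bad input degrades to an empty group list rather than an exception.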

47:26 AWS Step Functions adds AgentCore-powered agentic reasoning step

  • AWS Step Functions now supports AI agent reasoning steps via an optimized integration with Amazon Bedrock AgentCore harness, currently in preview, allowing you to embed configurable AI agents directly into visual workflows without managing the underlying agent loop infrastructure.
  • Practical use cases include document classification, unstructured form extraction, and multi-agent pipelines where agents run in parallel or sequence with optional human approval gates at critical decision points.
  • Per-invocation overrides for model, system prompt, and tools let teams reuse a single harness configuration across different workflow contexts, and a session ID parameter enables agent context persistence within or across workflow executions.
  • Observability is built in through workflow execution history showing agent input, output, token usage, and duration, with links to detailed agent turn logs in Amazon CloudWatch for auditing every decision.
  • The integration is available in four regions (US East N. Virginia, US West Oregon, Europe Frankfurt, Asia Pacific Sydney) and follows standard Step Functions pricing with no additional integration charges, though standard Amazon Bedrock and AgentCore pricing still applies for model inference.

48:25 Ryan – “You know I lust over state machines, so I find it funny because this is all I think about when I’m putting an agent workflow together. This would be so much easier in a state machine. And so now they’ve done it. I will absolutely use this so much, because it’s something I already kind of do with Lambda functions. It’s just now that I won’t have to define the logic as specifically. It’ll just be like four pages of markdown in my lab.”

51:29 Amazon Bedrock AgentCore Runtime introduces interactive shells for terminal access into agent sessions

  • Amazon Bedrock AgentCore Runtime now supports interactive shells via a new InvokeAgentRuntimeCommandShell API, giving developers a PTY-backed terminal over WebSocket directly into a running agent session, complementing the existing one-shot command execution API.
  • This is particularly useful for developers running coding agents like Claude Code or Amazon Kiro, allowing them to inspect files, run ad-hoc commands, and debug environment state as if working in a local terminal, with persistent state for environment variables and working directory across commands.
  • Each shell session is identified by a runtime session ID and shell ID, enabling manual reconnection after network drops, and a single agent runtime supports up to 10 concurrent shells for watching agents work across multiple branches simultaneously.
  • The CLI entry point is straightforward: agentcore exec –it –runtime followed by the runtime ARN, lowering the barrier for developers already familiar with standard terminal workflows to adopt the feature.
  • Pricing details are not specified in the announcement, so teams evaluating this feature should check the AgentCore Runtime pricing page directly before building workflows that depend on concurrent shell sessions at scale.

52:46 Matt – “Somebody needed it to debug some environment variable or working directory, and they were like, we could just quickly do this thing because it’s running ECS under the hood. We’ll just literally change the CLI call from AWS ECS exec to AWS Agent Core exec, and we’ve added a whole new feature, guys.” 

53:12 AWS Cost Explorer launches intelligent cost explanations powered by Amazon Q

  • AWS Cost Explorer now includes an “Analyze with Amazon Q” button that generates automatic cost analysis covering trends, top drivers, and anomalies based on whatever filters and time period you have configured, eliminating the need to manually cross-reference multiple data points.
  • The feature adapts its output based on the date range selected, providing historical analysis for past periods, forecasts for future dates, or a combined view for mixed ranges, and maintains conversation context so you can ask follow-up questions to dig deeper.
  • This continues AWS’s pattern of embedding Amazon Q capabilities directly into existing console tools rather than requiring users to switch contexts, similar to integrations seen in services like CloudWatch and Security Hub.
  • From a practical standpoint, this is available in all commercial AWS Regions at no additional charge, meaning customers already using Cost Explorer can access it without budget considerations, though standard Cost Explorer usage costs still apply.
  • The most immediate use case is for teams doing monthly or quarterly cost reviews who previously had to manually build narratives around their spend data, as Q can now generate that explanation automatically as a starting point for optimization conversations.

54:10 Matt – “That will forever be my goal in life – understand what’s an EC2 other.” 

54:20 AWS FinOps Agent is now available in preview

  • AWS FinOps Agent is now in preview at no additional charge, offering an AI-driven tool that answers cost questions, surfaces optimization recommendations, and runs scheduled FinOps workflows directly from the AWS Management Console.
  • The agent integrates with AWS Cost Optimization Hub and AWS Compute Optimizer to surface rightsizing, idle resource, and Savings Plans recommendations, and can automatically open Jira tickets to route action items to engineering teams.
  • Automated anomaly investigation is a notable capability here, where the agent detects cost spikes, investigates root cause, and posts findings to Slack without requiring manual triage from FinOps or engineering staff.
  • Preview availability is limited to US East (N. Virginia) for the agent itself, though cost and usage data cover all standard AWS Regions, excluding GovCloud and China regions.
  • Teams currently spending significant time on manual cost reporting and anomaly triage are the most likely to benefit, as the agent can generate reports for finance teams and handle recurring workflows on a user-defined schedule.

55:02 Justin – “This is kind of nice. I don’t know if it’s a full-featured solution for everybody, but it’s definitely something that’s gonna help you get started.”

GCP

56:52 Introducing Gemma 4 12B

  • Google released Gemma 4 12B, a multimodal model that runs locally on consumer hardware with 16GB of VRAM, positioning it between the smaller E4B and the larger 26B MoE model in the Gemma 4 family.
  • The model uses an encoder-free architecture, meaning vision inputs are processed through a lightweight embedding module and audio is projected directly into the same dimensional space as text tokens, reducing memory usage and latency compared to traditional separate encoder approaches.
  • Gemma 4 12B is the first mid-sized Gemma model to support native audio input, and it includes Multi-Token Prediction drafters to reduce inference latency for agentic workloads.
  • For GCP customers, the model can be deployed through Model Garden, Cloud Run with GPU support, and GKE, giving teams flexibility in how they operationalize it in production environments.
  • The model is released under Apache 2.0 and is available through Hugging Face, Kaggle, Ollama, and LM Studio, with fine-tuning support via Unsloth and inference support through vLLM, llama.cpp, and SGLang.
  • Google also released a Gemma Skills repository on GitHub to support agentic development patterns.
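
To put the 16GB VRAM figure in context, here is a rough back-of-the-envelope estimate of memory for the weights alone. This is only a sketch: it assumes a nominal 12 billion parameters (taken from the model name), the helper weight_memory_gb is hypothetical, and it ignores KV cache, activations, and runtime overhead, so real usage will be higher.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Assumed precisions for illustration; the announcement summary does not say
# which format the 16GB figure refers to.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_memory_gb(12, bits):.1f} GB")
```

Under these assumptions, the precision of the weights is what decides whether a model of this size fits on a 16GB card, which is why the quantized checkpoints covered next matter for local use.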

57:36 Gemma 4 with quantization-aware training

  • Google released Quantization-Aware Training (QAT) checkpoints for Gemma 4. QAT integrates quantization directly into the training process rather than applying it afterward, which preserves quality better than standard Post-Training Quantization approaches.
  • The mobile-specialized quantization schema reduces the Gemma 4 E2B model to under 1GB of memory by combining static activations, channel-wise quantization, targeted 2-bit compression for token generation layers, and embedding plus KV cache optimization.
  • For desktop and server use cases, QAT checkpoints are available in Q4_0 format with GGUF files ready for llama.cpp and compressed tensors for vLLM, with weights downloadable directly from Hugging Face at no cost for the model weights themselves.
  • Developers can selectively deploy only the modalities they need, such as text-only without audio or vision encoders, which further reduces the memory footprint and makes the models practical for constrained edge environments using Google’s LiteRT-LM runtime or Transformers.js for web deployment.
  • The release supports fine-tuning of QAT weights through Hugging Face Transformers and Unsloth, and also preserves the inference speedup from Multi-Token Prediction when using MTP QAT checkpoints, giving developers flexibility to optimize for both quality and throughput simultaneously.
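
The difference between quantization-aware training and post-training quantization comes down to when the rounding happens. Below is a minimal sketch of the "fake quantization" step that QAT-style training typically runs in the forward pass, so the model learns weights that survive rounding. This is not Google's actual recipe: the fake_quantize helper is hypothetical, and the symmetric one-scale-per-block scheme is a simplification rather than the exact Q4_0 layout.

```python
from typing import List


def fake_quantize(block: List[float], bits: int = 4) -> List[float]:
    """Round a block of weights to signed integers with one shared scale,
    then map them back to floats (quantize followed by dequantize)."""
    qmax = 2 ** (bits - 1) - 1      # largest integer level, 7 for 4-bit
    qmin = -(2 ** (bits - 1))       # smallest integer level, -8 for 4-bit
    max_abs = max(abs(w) for w in block)
    if max_abs == 0.0:
        return [0.0] * len(block)
    scale = max_abs / qmax          # one scale shared by the whole block
    result = []
    for w in block:
        q = max(qmin, min(qmax, round(w / scale)))  # clamp to the integer range
        result.append(q * scale)
    return result


# Made-up weights for illustration only.
weights = [0.70, -0.33, 0.10, 0.02]
approx = fake_quantize(weights)
max_error = max(abs(a - b) for a, b in zip(weights, approx))
print(approx, max_error)
```

With post-training quantization this rounding error is applied once to a finished model. With QAT, the model sees the rounded values during training and can adjust to them.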

58:17 Ryan – “These are things we need Jonathan for.” 

58:45 Gemini models for Apple developers

  • Google announced that Apple developers can now access cloud-hosted Gemini models through Apple’s Foundation Models framework via the Firebase Apple SDK, starting with iOS 27, macOS 27, and related platforms. 
  • The integration allows developers to swap between on-device Apple models and cloud-hosted Gemini models using the same API surface, which simplifies building agentic app experiences.
  • The integration is built on Firebase AI Logic, which removes the need to build and maintain a separate backend server for Gemini model access. 
  • Firebase App Check is included to protect service APIs from abuse, addressing a common production security concern.
  • Gemini is also being integrated directly into Xcode as an agentic coding assistant for multi-step development tasks like code review, bug fixing, and feature building. Authentication supports both individual developers using a self-serve Gemini API key from Google AI Studio and enterprise teams using the Gemini Enterprise Agent Platform for dedicated quotas and data privacy controls.
  • Pricing has two tiers: individual developers can start with a free tier through Google AI Studio at ai.google.dev, while enterprise developers access dedicated corporate quotas through the Gemini Enterprise Agent Platform. 
  • The preview release of the Foundation Models framework integration was set to begin the day after the WWDC announcement.
  • This is a practical option for iOS and macOS developers who want to add cloud AI capabilities without leaving the Apple development ecosystem or managing separate infrastructure. 
  • The shared API surface between local and cloud inference is particularly useful for managing latency and cost tradeoffs in production apps.

1:00:01 Ryan – “I love the Apple Google partnership on this. You know, I’m really happy that Apple didn’t decide to develop its own frontier model and just muddy that space.” 

Azure

1:03:27 New Azure Cobalt 200 VMs deliver 50% performance improvement, fully optimized for modern agentic AI workloads

  • Azure Cobalt 200 Arm-based VMs are now in early access preview, built on the Arm Neoverse V3 core and fabricated on TSMC’s 3nm process, delivering up to 50% better CPU performance over Cobalt 100 with up to 128 vCPUs per VM. 
  • Real workload benchmarks show up to 135% better performance for database workloads and up to 80% better performance for caching workloads compared to the previous generation.
  • The VMs are specifically designed for agentic AI workloads, where continuous reasoning and sequential decision-making require sustained per-core performance and low latency. 
  • Each physical core gets 3 MB of dedicated L2 cache, and the chip has a 192 MB system-level L3 cache, allowing more agent sandboxes per VM without sacrificing throughput.
  • Cobalt 200 expands the Arm VM portfolio with two new families beyond what Cobalt 100 offered: High-Memory Optimized Mpsv4 VMs and Dense Local Storage Lpsv5 VMs, with all series delivering up to 85 Gbps network bandwidth and 70 Gbps remote storage throughput. 
  • Memory encryption is enabled by default through a custom memory controller with negligible performance impact.
  • Microsoft’s own services, including Dataverse and Azure SQL Database, are already validating Cobalt 200, with Dataverse reporting up to 60% better performance over Cobalt 100. 
  • Migration from Cobalt 100 is described as seamless, with full compatibility across existing workloads and support for AKS Arm nodes, GitHub Actions runners, and major languages including Python, Java, and .NET.
  • The preview is currently available in eight regions, including West US 3, East US 2, Central US, and Sweden Central, with additional regions to follow. Pricing is not yet publicly specified, so teams evaluating cost should sign up at aka.ms/Cobalt200VMs-signup for early access details.

1:04:44 Matt – “It’s great that they added this; I feel like they’re finally getting into the game of ARM. Getting capacity for them might require some twisting of your account team’s arm, especially if you want them at any scale. But the other problem is, which I still find comical, is that you can’t run Windows Server on ARM.”

1:06:58 Foundry IQ: Build smarter agents faster with unified knowledge and serverless retrieval

  • Foundry IQ is Microsoft’s unified knowledge platform for AI agents, now generally available with full SLA coverage, stable APIs, and compliance certifications. 
  • It lets developers connect multiple data sources like Azure Blob Storage, OneLake, and web content into a single knowledge base without building custom connectors for each system.
  • The new Serverless Developer tier, in public preview, scales to zero when idle and bills by Compute Units measured in 0.25 CU increments per minute. Billing is not expected to begin until late 2026, so developers can experiment at no cost for now through the Foundry portal at ai.azure.com.
  • Agentic retrieval quality improvements show up to 20% better answer quality in benchmarks and up to 54% improved recall compared to single-shot RAG, achieved through better query batching, semantic ranking, and server-side token caching to reduce redundant token consumption across multi-turn conversations.
  • The Foundry IQ MCP server exposes knowledge bases as a remote Model Context Protocol server, making them accessible from Claude, ChatGPT, LangChain, and the Microsoft Agent Framework without framework-specific integrations.
  • New security capabilities in preview include cross-tenant customer-managed keys using federated identity credentials, Purview sensitivity-label auditing, and incremental SharePoint permissions sync, keeping enterprise data governance intact as content flows into agent workflows.
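
To make the Serverless Developer tier's metering concrete, here is a rough sketch of how per-minute usage in 0.25 CU increments might add up. The 0.25 CU increment comes from the announcement. Everything else is an assumption for illustration: the billable_cu_minutes helper is hypothetical, usage is assumed to round up to the next increment each minute, and no price is applied because billing has not started.

```python
import math
from typing import Iterable

CU_INCREMENT = 0.25  # increment size stated in the announcement


def billable_cu_minutes(per_minute_usage: Iterable[float]) -> float:
    """Sum CU usage minute by minute, rounding each active minute UP to the
    next 0.25 CU (assumed rounding rule). Idle minutes count as zero."""
    total_steps = 0
    for cu in per_minute_usage:
        if cu > 0:
            total_steps += math.ceil(cu / CU_INCREMENT)
    return total_steps * CU_INCREMENT


# Made-up six-minute window: idle, idle, light use, moderate use, busy, idle.
usage = [0.0, 0.0, 0.1, 0.6, 1.0, 0.0]
total = billable_cu_minutes(usage)
print(total)
```

The idle minutes contribute nothing, which is the scale-to-zero behavior the tier is built around.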

1:10:26 Generally Available: Azure Database for PostgreSQL – Flexible Server: DuckDB extension

  • Azure Database for PostgreSQL Flexible Server now supports the DuckDB extension in general availability, allowing users to run analytical workloads directly within their PostgreSQL environment without moving data to a separate system.
  • DuckDB is an in-process analytical database engine optimized for OLAP queries, so this extension lets PostgreSQL users run fast column-oriented analytics alongside their transactional workloads in the same managed service.
  • This is particularly useful for data engineers and developers who want to avoid the overhead of spinning up separate analytics infrastructure, since DuckDB can query large datasets efficiently using familiar SQL syntax.
  • The feature falls under the Databases and Hybrid plus multicloud categories, suggesting Microsoft sees this as relevant for customers running mixed workloads or integrating PostgreSQL with other data sources across environments.
  • Pricing for this extension was not specified in the announcement, so customers should check Azure Database for PostgreSQL Flexible Server pricing pages directly, as costs will likely depend on existing compute and storage tiers rather than a separate charge for the extension itself.

1:10:50 Justin – “I remember when there were companies that made nothing but columnar databases. Now you just get it as an extension on top of PostgreSQL. Kind of impressive. I bet those companies aren’t doing well these days.”

51:03 Global PTU Reservations Are Now Region-Agnostic

  • Azure’s Global PTU (Provisioned Throughput Unit) reservations are now region-agnostic as of June 2026, meaning a single reservation can cover AI model deployments across multiple regions instead of requiring separate per-region commitments.
  • The practical benefit here is reduced stranded capacity. Previously, if you over-provisioned in one region and under-utilized in another, you were paying for unused reservations. Now a single pool covers wherever your workload actually runs.
  • This is specifically tied to Microsoft Foundry (formerly Azure OpenAI Service infrastructure), so it targets customers running high-throughput AI inference workloads who need predictable performance and cost at scale.
  • From a cost management standpoint, consolidating reservations simplifies billing and procurement, which matters for enterprises managing AI spend across multiple geographic deployments. Specific pricing still depends on model type and throughput tier, so customers should check the Azure pricing calculator for their specific use case.
  • The flexibility to deploy where capacity is available without reservation constraints is a practical operational improvement, particularly useful during regional capacity crunches that have been a known pain point for provisioned throughput customers.
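
The stranded-capacity point can be shown with simple arithmetic. This is a sketch with made-up numbers: the region names, PTU counts, and the per_region and pooled helpers are all hypothetical, and it does not model actual Azure reservation rules or pricing.

```python
from typing import Dict, Tuple


def per_region(reserved: Dict[str, int], used: Dict[str, int]) -> Tuple[int, int]:
    """Return (stranded, uncovered) PTUs when every region has its own
    reservation, so spare capacity in one region cannot offset another."""
    regions = set(reserved) | set(used)
    stranded = sum(max(reserved.get(r, 0) - used.get(r, 0), 0) for r in regions)
    uncovered = sum(max(used.get(r, 0) - reserved.get(r, 0), 0) for r in regions)
    return stranded, uncovered


def pooled(reserved: Dict[str, int], used: Dict[str, int]) -> Tuple[int, int]:
    """Return (stranded, uncovered) PTUs when one reservation pool covers
    usage wherever it runs."""
    total_reserved = sum(reserved.values())
    total_used = sum(used.values())
    return max(total_reserved - total_used, 0), max(total_used - total_reserved, 0)


# Hypothetical figures: over-provisioned in one region, short in another.
reserved = {"region-a": 300, "region-b": 100}
used = {"region-a": 200, "region-b": 200}

print(per_region(reserved, used))  # per-region commitments
print(pooled(reserved, used))      # single region-agnostic pool
```

In this example, the per-region model leaves 100 PTUs unused in one region while another runs 100 PTUs over its reservation. The pooled model covers the same usage with nothing stranded.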

1:12:02 Justin – “Good! Glad you learned what the word ‘global’ means.” 

1:15:30 Generally Available: Azure API Management Premium v2 and Standard v2 now support wildcard custom hostnames

  • Azure API Management Premium v2 and Standard v2 now support wildcard custom hostnames, meaning a single entry like *.api.contoso.com and one wildcard certificate can cover all subdomains automatically instead of requiring separate configuration per subdomain.
  • The practical benefit is reduced operational overhead at scale. A team onboarding ten new API surfaces previously needed ten separate domain and certificate management tasks, and wildcard support eliminates that repetitive work.
  • This capability is now available on both Standard v2 and Premium v2 tiers, which means organizations do not need to move to higher-tier deployments just to get flexible domain management. Pricing details are not specified in the announcement, so listeners should check the Azure API Management pricing page for tier comparisons.
  • Target use cases include rapidly growing API environments with dynamic subdomains, such as microservices architectures or multi-tenant platforms, where new API surfaces are frequently added and consistent branded endpoints matter.
  • The feature reached general availability in June 2026 and was announced at Microsoft Build. Teams currently managing large API estates with manual per-subdomain hostname configurations would benefit most from evaluating this update.
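
For a sense of what a wildcard entry like *.api.contoso.com covers, here is a minimal matching sketch. It assumes the wildcard stands for exactly one leftmost label, which is how wildcard TLS certificates are matched. Whether API Management applies the same rule to hostnames is an assumption here, and the matches_wildcard helper and the subdomain names are hypothetical.

```python
def matches_wildcard(hostname: str, pattern: str) -> bool:
    """Return True if hostname matches pattern, where a leading '*' label
    stands for exactly one DNS label (assumed rule, as in TLS certificates)."""
    host_labels = hostname.lower().rstrip(".").split(".")
    pattern_labels = pattern.lower().rstrip(".").split(".")
    if pattern_labels[0] != "*":
        # No wildcard: require an exact, case-insensitive match.
        return host_labels == pattern_labels
    if len(host_labels) != len(pattern_labels):
        # The wildcard does not span multiple labels or zero labels.
        return False
    return host_labels[0] != "" and host_labels[1:] == pattern_labels[1:]


pattern = "*.api.contoso.com"
for host in ["orders.api.contoso.com", "api.contoso.com", "v2.orders.api.contoso.com"]:
    print(host, matches_wildcard(host, pattern))
```

Under this rule, each new single-label subdomain is covered without extra configuration, while the bare parent domain and deeper nested subdomains are not.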

Emerging Clouds 

1:22:25 Full Stack Observability for AI | CoreWeave Solution Brief

  • CoreWeave Mission Control is an AI-native observability platform that provides end-to-end visibility across infrastructure, clusters, and workloads, addressing a gap that general-purpose monitoring tools often miss in GPU-heavy environments.
  • The platform combines real-time telemetry with GPU utilization analytics, which is particularly relevant as organizations struggle to justify and optimize the cost of large-scale GPU deployments.
  • Audit-ready logging and automated operational insights suggest the platform is targeting enterprise customers who need compliance documentation alongside performance monitoring, not just raw metrics.
  • The full-stack framing here is notable because AI workloads span multiple layers simultaneously, from bare metal GPU performance up through cluster orchestration and individual job execution, making siloed monitoring tools less effective.
  • For teams running inference or training at scale on CoreWeave, tighter observability tooling built into the platform could reduce the engineering overhead of stitching together third-party solutions like Prometheus, Grafana, and custom GPU exporters.

Closing

And that is the week in the cloud! Visit our website, the home of the Cloud Pod, at theCloudPod.net, where you can join our newsletter or Slack team, send feedback, or ask questions. You can also tweet at us with the hashtag #theCloudPod.





Download audio: https://episodes.castos.com/5e2d2c4b117f29-10227663/2498469/c1e-p8j8uwq63kh17707-v6vgqnrdhkdr-3eakne.mp3