
344: Amazon’s Coding Bot Bites the Hand That Runs It


Welcome to episode 344 of The Cloud Pod, where the forecast is always cloudy! Justin is out of the office at a World of Warcraft Tournament (not really), and Ryan is pursuing his lifelong dream of becoming a roadie for The Eagles (maybe?), so it’s Jonathan and Matt holding down the fort this week, and they’ve got a ton of cloud news for you! From security to AI assistants, we’ve got all the news you need. Let’s get started! 

Titles we almost went with this week

  • Zero Bus, All Gas, No Kafka Brakes
  • AI Coding Bot Bites the Hand That Runs It
  • When Your Robot Developer Goes Rogue on AWS
  • Kubernetes VPA Finally Stops Evicting Your Database Pods
  • Google Trains 100 Million People, Still No One Reads the Docs 
  • MCP Walks Into a Bar Not Enterprise Ready Yet
  • No More Pod Evictions Kubernetes 1.35 Scales In Place
  • No Keys No Drama Just IAM and Cloud SQL
  • One Agent to Rule Them All in Kubernetes
  • IAM Tired of Writing Policies Manually
  • When Your AI Coding Tool Has Delete Permissions
  • One Dashboard to Rule All Your GPU Clusters
  • Serverless Reservations Prove Nothing Is Truly Free Range
  • Kiro Takes the Wheel on AWS IAM Policies
  • Stop Blaming Backups for Your Bad Architecture
  • AI Agent Goes Rogue, Takes AWS Down With It
  • Everything is Bigger in Texas Except the Water Usage
  • OpenAI launches the college basketball of Inference. Pro service – low cost

General News 

1:05 Code Mode: give agents an entire API in 1,000 tokens

  • Cloudflare's Code Mode MCP server reduces token consumption by 99.9% compared to a traditional MCP implementation. It exposes the entire Cloudflare API (over 2,500 endpoints) through just two tools, search() and execute(), using roughly 1,000 tokens versus 1.17 million for a conventional approach.
  • The architecture works by having the AI agent write JavaScript code against a typed OpenAPI spec representation, rather than loading tool definitions into context, with code executing inside a sandboxed V8 isolate (Dynamic Worker) that restricts file system access, environment variables, and external fetches by default.
  • This approach addresses a fundamental constraint in agentic AI systems: adding more tools to give agents broader capabilities directly competes with the available context space for the task at hand.
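A minimal sketch of that two-tool surface. The catalog, endpoint names, and sandboxing here are hypothetical stand-ins for Cloudflare's actual implementation; the point is that the agent pulls endpoint specs on demand and runs generated code in a restricted scope, rather than holding thousands of tool definitions in context:

```python
# Illustrative sketch of the two-tool pattern: the agent is given only
# search() and execute(). All names below are hypothetical.

CATALOG = {
    "dns.records.list": "GET /zones/{zone_id}/dns_records",
    "dns.records.create": "POST /zones/{zone_id}/dns_records",
    "workers.scripts.upload": "PUT /accounts/{account_id}/workers/scripts/{name}",
}

def search(query: str) -> dict:
    """Return only the endpoint specs matching the query, on demand."""
    return {name: spec for name, spec in CATALOG.items() if query in name}

def execute(code: str) -> dict:
    """Run agent-written code with no builtins and only search() exposed,
    loosely mimicking a sandbox that blocks filesystem and network access."""
    scope = {"__builtins__": {}, "search": search, "result": None}
    exec(code, scope)  # the real system uses a V8 isolate, not exec()
    return scope["result"]

# The agent writes code against the spec instead of loading 2,500 tool
# definitions into its context window:
out = execute('result = search("dns.records")')
print(sorted(out))  # ['dns.records.create', 'dns.records.list']
```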

01:41 Jonathan- “It’s good. I’m not sure I could imagine 2 ½ thousand MCP tool definitions in a context window and still actually use it for anything.”   

AI Is Going Great – Or How ML Makes Money 

03:58 OpenClaw creator Peter Steinberger joins OpenAI

  • Peter Steinberger, creator of viral AI assistant OpenClaw (formerly Clawdbot/Moltbot), has joined OpenAI to lead development of next-generation personal agents. 
  • OpenClaw gained attention for its ability to perform real-world tasks like calendar management, flight booking, and autonomous social network participation.
  • OpenAI will maintain OpenClaw as an open source project through a foundation structure, allowing the community to continue development while Steinberger focuses on building similar capabilities into OpenAI’s product suite. 
  • This acquisition-to-open-source model differs from typical tech company acquisitions, where projects are absorbed or shut down.
  • The move signals OpenAI’s strategic focus on agentic AI systems that can execute multi-step tasks autonomously rather than just responding to prompts. Steinberger’s experience building practical automation workflows could accelerate OpenAI’s development of agent capabilities that compete with offerings from Anthropic, Google, and Microsoft.
  • For developers, this represents a shift in how personal AI assistants may be deployed, moving from standalone applications to integrated agent frameworks within larger platforms. 
  • The open source continuation of OpenClaw provides a reference implementation for building task-oriented AI systems.

04:19 Matt – “This is kind of where I see Anthropic Cowork slowly going to, being your personal assistant, and having this be your ability to manage your real-world tasks. It’s great, and if they can build that into OpenAI, then it becomes a lot more of a personal assistant than just a general tool that you’re using.” 

09:11 Making frontier cybersecurity capabilities available to defenders

  • Anthropic launched Claude Code Security in a limited research preview for Enterprise and Team customers, with free expedited access for open-source maintainers. 
  • Unlike traditional static analysis tools that match known vulnerability patterns, it reasons through code contextually, the way a human security researcher would, catching logic flaws and access control issues that rule-based tools miss.
  • The tool uses a multi-stage verification process where Claude re-examines its own findings to filter false positives, assigns severity ratings, and provides confidence scores. 
  • Critically, no patches are applied without human approval, keeping developers in the decision loop.
  • For cloud and enterprise teams, this integrates directly into Claude Code on the web, meaning security review happens within existing developer workflows rather than requiring separate tooling. The dashboard surfaces validated findings alongside suggested patches for team review.
  • Want to request access? You can do that here

09:35 Preview, review, and merge with Claude Code

  • Claude Code on desktop now closes the full development loop by adding live app preview, inline code review, and GitHub PR monitoring in a single interface, reducing the need to switch between tools during development.
  • The new auto-fix and auto-merge features allow Claude to monitor PRs in the background, automatically attempt to fix CI failures, and merge PRs once all checks pass, letting developers move on to new tasks without manually tracking PR status.
  • The inline code review feature via the Review Code button lets Claude examine local diffs and leave comments directly in the desktop diff view before any code leaves the machine, functioning as an automated pre-push review step.
  • Session portability is now built in, allowing developers to start a session in the CLI using /desktop to bring context into the desktop app, or push local sessions to the web or Claude mobile app using the Continue with Claude Code on the web button.
  • These updates are available now to all users and represent a shift toward agentic, background-running development workflows where the AI continues working on tasks like CI remediation while the developer focuses elsewhere.

11:20 Jonathan – “It’s a very human way of going back and self-reflecting on the work that you’ve just done.”  

18:08 Announcing General Availability of Zerobus Ingest, part of Lakeflow Connect

  • Databricks has announced General Availability of Zerobus Ingest, part of Lakeflow Connect, a serverless streaming service that pushes data directly into Delta tables without intermediate message buses like Kafka.
  • It supports thousands of concurrent connections and achieves over 10GB per second of aggregate throughput with data landing in under 5 seconds.
  • The core architectural difference is a single-sink design versus Kafka’s multi-sink approach, reducing a traditional five-system streaming stack down to two components. 
  • This eliminates dedicated compute and storage for the message bus itself, along with the engineering overhead to manage it, at a fraction of the cost per gigabyte compared to self-managed Kafka.
  • Developers can integrate via gRPC, REST APIs, or language-specific SDKs, and every write is automatically governed through Unity Catalog for lineage tracking and access control. 
  • This means streaming data gets the same governance treatment as the rest of the lakehouse from the moment it arrives.
  • Real-world deployments include Toyota using it to detect factory overheating conditions in minutes rather than hours, and Joby Aviation reducing aircraft telemetry resolution latency from days to minutes. 
  • Both cases highlight manufacturing and IoT as strong use cases where low-latency ingestion has a direct operational impact.
  • Zerobus Ingest is now GA on AWS and Azure, with Google Cloud support coming soon, priced under the Lakeflow Jobs Serverless SKU with a 6-month promotional pricing period currently active.

20:05 Jonathan – “I’m not a fan of Kafka in general, but I am a fan of doing things at massive scale, so it’s kind of cool.” 

07:27 OpenAI prepares new ChatGPT Pro Lite tier at $100 monthly

  • OpenAI appears to be preparing a ChatGPT Pro Lite tier at $100 per month, slotting between the existing Plus plan at $20 and the full Pro plan at $200, based on findings from engineer Tibor Blaho, who has a consistent track record of uncovering unreleased features.
  • The new tier would address a notable pricing gap for users who regularly hit Plus rate limits but cannot justify the full Pro cost, with freelancers, researchers, and developers as the likely target audience.
  • The plan may be structured around compute-heavy use cases, including Codex and persistent agentic workloads, where background-running agents carry substantially higher infrastructure costs than standard chat interactions.
  • OpenAI recently hired Peter Steinberger, creator of the open-source agent framework OpenClaw, and has signaled a multi-agent direction for ChatGPT, suggesting the Pro Lite tier could serve as an entry point for always-on agentic capabilities rather than just increased chat limits.
  • No release date or confirmed feature set exists yet, but the addition of a mid-tier option would create competitive pressure on Google, which currently lacks an equivalent individual plan at this price point.

21:56 Matt – “I just think they needed a different naming convention.” 

Cloud Tools 

23:11 HCP Packer adds SBOM vulnerability scanning

  • HCP Packer now includes SBOM vulnerability scanning in public beta, allowing platform teams to scan software bills of materials against MITRE’s CVE database and classify findings by severity directly within the artifact registry.
  • The feature builds on last year’s SBOM storage capabilities, which are now generally available, meaning teams can generate, store, and now actively scan SBOMs for known vulnerabilities in a single workflow.
  • This addresses a supply chain security gap by surfacing vulnerability data at the image level, covering AMIs, Docker containers, and virtual machines before they reach production environments.
  • Teams can see which specific package versions are affected and when vulnerabilities were detected, giving them the information needed to prioritize remediation without leaving the HCP Packer interface.
  • The feature is available in public beta at no cost through the free HCP Packer tier, making it accessible for teams looking to add CVE scanning to their image management process without additional tooling.

24:15 Jonathan – “It’s only as current as the time you built it though…” 

25:43 Why Kubernetes 1.35 is a game-changer for stateful workload scaling 

  • Kubernetes 1.35 brings two notable autoscaling milestones: In-Place Pod Resize graduating to GA and Vertical Pod Autoscaler’s InPlaceOrRecreate update mode reaching beta, allowing VPA to adjust CPU and memory on running pods without evicting them.
  • The practical benefit for stateful workloads is substantial. 
  • Previously, VPA had to evict and recreate pods to apply new resource requests, which caused disruption for databases, caches, and other restart-sensitive applications. In-place resizing preserves the pod UID, container ID, and restart count throughout the adjustment.
  • VPA operates in three stages worth understanding: a recommendation-only mode for passive observation, an InPlaceOrRecreate mode that attempts live resizing first and falls back to eviction only when node resources are insufficient, and configurable policies using minAllowed and maxAllowed to bound what VPA can actually set.
  • VPA controllers are not bundled with Kubernetes itself. 
  • Engineers need to clone the kubernetes/autoscaler repository and run the vpa-up.sh script to deploy the Recommender, Updater, and Admission Controller components, the last of which registers as a mutating admission webhook.
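The three stages map to a handful of fields on the VerticalPodAutoscaler resource. A minimal sketch, assuming a hypothetical StatefulSet named postgres; field names follow the autoscaling.k8s.io/v1 API, with the beta InPlaceOrRecreate mode and the minAllowed/maxAllowed bounds described above:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: postgres-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: postgres                     # hypothetical workload name
  updatePolicy:
    updateMode: "InPlaceOrRecreate"    # try a live resize first; evict only as fallback
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed:                    # floor and ceiling on what VPA may set
          cpu: 250m
          memory: 512Mi
        maxAllowed:
          cpu: "4"
          memory: 8Gi
```

Setting updateMode to "Off" instead gives the recommendation-only mode for passive observation.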

26:09 Jonathan – “I think the practical benefit for stable workloads are fairly substantial, if you’re one of those crazy people who like to run databases or SQL server on Kubernetes (like Cody) because previously those pods would be evicted and new resources requested, which would obviously cause disruption, stale caches, and other issues.” 

AWS 

31:20 Amazon service was taken down by AI coding bot

  • Listener note: paywall article
  • Amazon’s Kiro AI coding tool caused a 13-hour outage of an AWS cost exploration service in December after engineers granted it broad permissions, and it autonomously decided to delete and recreate the environment rather than patch it. 
  • A second outage involved Amazon Q Developer, though Amazon says neither event impacted core customer-facing AWS services.
  • Amazon’s official position is that both incidents were user error stemming from improper access controls, not failures of the AI tools themselves. 
  • Kiro is designed to request authorization before acting, but the engineer involved had been granted broader permissions than intended, bypassing that safeguard.
  • The incidents highlight a practical risk with agentic AI tools in production environments: when an AI agent is given the same permissions as a human operator without requiring peer review, it can take destructive autonomous actions that a second set of eyes might have caught. AWS has since added mandatory peer review and staff training as corrective measures.
  • AWS is pushing for 80 percent of its developers to use AI coding tools at least once weekly, which means these tools are being adopted at scale internally before the risk patterns are fully understood. 
  • Listeners running their own AI agents in production should treat permission scoping and human-in-the-loop approval gates as non-optional controls, not optional defaults.
  • Kiro launched in July 2025 and is positioned as a specification-driven coding assistant meant to go beyond simple vibe coding. 
  • The December incident was limited to mainland China, and the second incident had no customer-facing impact, but the pattern of two production disruptions in a few months is worth tracking as agentic tools become more common in enterprise workflows.
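The permission-scoping and approval-gate controls mentioned above can be sketched as a simple deny-by-default wrapper. Everything here is hypothetical (the action names, the exception, the approval mechanism); the point is that destructive operations fail closed unless a second party is on record:

```python
# Minimal sketch of a human-in-the-loop approval gate for agent actions.
# Destructive verbs are denied by default and require an explicit
# second-party approval before the agent may proceed.

DESTRUCTIVE = {"delete", "recreate", "terminate"}

class ApprovalRequired(Exception):
    pass

def run_agent_action(action: str, target: str, approvals: set) -> str:
    verb = action.split(":")[0]
    if verb in DESTRUCTIVE and not approvals:
        # The agent cannot proceed on its own permissions alone.
        raise ApprovalRequired(f"{action} on {target} needs peer approval")
    return f"executed {action} on {target}"

# Read-only actions pass; destructive ones need a reviewer on record.
print(run_agent_action("describe:stacks", "cost-explorer", approvals=set()))
print(run_agent_action("delete:environment", "cost-explorer",
                       approvals={"reviewer@example.com"}))
```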

33:24 Matt – “…if you’re letting the AI tool start to do things inside of production environments, that’s where you need to watch it, and you need to probably have it be a little bit more specific, so the human needs to kind of be watching what’s going on and peer reviewing it.” 

35:49 Amazon pushes back on Financial Times report blaming AI coding tools for AWS outages 

  • Amazon issued a public rebuttal to a Financial Times report claiming its Kiro AI coding tool caused multiple AWS outages, acknowledging one limited incident in December but attributing it to a misconfigured access control role rather than a flaw in the AI tool itself.
  • The confirmed disruption affected only AWS Cost Explorer in a single China region for roughly 13 hours, with no customer inquiries received, and did not touch core services like compute, storage, or databases.
  • Amazon’s core defense is that the issue was user error, not AI error, noting that a misconfigured role could result from any developer tool or manual action, AI-powered or not.
  • In response to the incident, AWS has added safeguards, including mandatory peer review for production access, which is a practical governance consideration for any organization deploying agentic AI tools in production environments.
  • The broader takeaway for AWS customers is that agentic AI tools capable of autonomous actions, like deleting and recreating environments, require clear human oversight policies and access control guardrails before being used in production systems.

37:00 AWS IAM Policy Autopilot is now available as a Kiro Power

  • AWS IAM Policy Autopilot, an open source static code analysis tool launched at re:Invent 2025, is now available as a Kiro Power, allowing developers to generate baseline IAM policies directly within the Kiro IDE without manual policy writing.
  • The integration uses a one-click installation model that removes the need for manual MCP server configuration, streamlining how developers access policy generation tools during AI-assisted development workflows.
  • Key use cases include rapid prototyping of AWS applications, baseline policy creation for new projects, and keeping developers in their coding environment rather than switching to the IAM console or documentation.
  • This fits into the broader trend of embedding security and permissions tooling earlier in the development cycle, helping teams start with least-privilege policies that can be refined over time rather than retrofitting permissions after the fact.
  • The tool is open source and available on GitHub at github.com/awslabs/iam-policy-autopilot, with no additional cost mentioned beyond standard Kiro and AWS service usage, making it accessible for teams already using the Kiro IDE.

38:18 Jonathan – “I’m really on the fence about this. Because on one hand, I know the pain, especially with things like deployment policies…and just trying to figure out every permission that has to be added so that Terraform can just do a deployment – it becomes very complicated. At the same time, if you have a machine that looks at your code and says ‘this is the policy you need for it,’ I don’t think that’s any security at all unless there’s another check at the end.” 

-Honorable Mentions- 

41:52 Amazon Redshift Serverless introduces 3-year Serverless Reservations

  • Amazon Redshift Serverless now offers 3-year Serverless Reservations, providing up to 45% cost savings compared to standard on-demand RPU pricing while maintaining the serverless model’s flexibility.
  • The reservations are managed at the AWS payer account level and can be shared across multiple AWS accounts, making this useful for organizations running Redshift Serverless workloads across linked accounts.
  • Billing runs 24/7 on an hourly basis, metered per second, meaning you pay for reserved RPUs continuously, regardless of actual usage, so this option makes most sense for consistently active workloads rather than sporadic ones.
  • Any RPU consumption beyond the reserved amount falls back to standard on-demand rates, so customers need to size their reservations carefully to avoid negating the savings.
  • Reservations can be purchased through the Redshift console or via the create-reservation API and are available in all regions where Redshift Serverless is currently supported.
  • More information is available on the Amazon Redshift Management Guide, which you can find here
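Because reserved RPUs bill every hour whether used or not, the sizing decision comes down to utilization. A back-of-the-envelope sketch, using hypothetical placeholder rates (not actual AWS pricing) and the "up to 45%" discount from the announcement:

```python
# Rates below are hypothetical placeholders, not actual AWS pricing.
ON_DEMAND_PER_RPU_HOUR = 0.375
RESERVED_DISCOUNT = 0.45            # "up to 45%" per the announcement
RESERVED_PER_RPU_HOUR = ON_DEMAND_PER_RPU_HOUR * (1 - RESERVED_DISCOUNT)

def monthly_cost(reserved_rpus: int, avg_used_rpus: float, hours: int = 730) -> float:
    """Reserved RPUs bill 24/7; usage above the reservation
    falls back to on-demand rates."""
    overflow = max(0.0, avg_used_rpus - reserved_rpus)
    return hours * (reserved_rpus * RESERVED_PER_RPU_HOUR
                    + overflow * ON_DEMAND_PER_RPU_HOUR)

# A steady 64-RPU workload: reserving all of it beats on-demand...
steady = monthly_cost(reserved_rpus=64, avg_used_rpus=64)
on_demand = 730 * 64 * ON_DEMAND_PER_RPU_HOUR
print(f"reserved: ${steady:,.0f}  on-demand: ${on_demand:,.0f}")

# ...but a mostly idle reservation can cost more than simply paying
# on-demand for the hours actually used.
idle = monthly_cost(reserved_rpus=64, avg_used_rpus=8)
sporadic = 730 * 8 * ON_DEMAND_PER_RPU_HOUR
print(f"idle reservation: ${idle:,.0f}  vs on-demand for actual use: ${sporadic:,.0f}")
```

This is the "size carefully" caveat in numbers: the reservation wins only when average utilization stays high enough that the 24/7 commitment is cheaper than metered on-demand usage.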

42:03 Amazon Says It Will Spend $12 billion On Louisiana Data Centers 

  • Amazon has announced a $12 billion investment in data center campuses in Louisiana, aimed at expanding infrastructure capacity for AI and cloud computing workloads.
  • A notable aspect of the deal is Amazon’s commitment to covering its own power costs directly, working with regional utility Southwestern Electric Power Company to avoid passing energy expenses onto local consumers.
  • Amazon is pairing the infrastructure investment with solar energy projects in Louisiana, which aligns with its broader sustainability commitments and addresses concerns about grid strain from large-scale data center operations.
  • This announcement reflects a broader industry trend where cloud providers are proactively addressing public and political concerns about data center energy consumption, following a similar commitment from Microsoft last month regarding higher electricity rate payments.
  • For AWS customers, this expansion signals continued investment in US-based infrastructure capacity, which could translate to improved regional availability and lower latency for workloads in the southern United States over time.

42:18 Announcing AWS Elemental Inference

  • AWS Elemental Inference is a fully managed AI service that automatically generates vertical video crops and highlight clips from live and on-demand broadcasts in parallel with encoding, targeting broadcasters who need to distribute content across TikTok, Instagram Reels, YouTube Shorts, and similar platforms without dedicated production staff.
  • The service uses an agentic AI approach with no prompts or human-in-the-loop intervention required, handling both vertical video cropping and metadata-based highlight detection automatically, which reduces the manual workflow overhead typically associated with multi-platform content distribution.
  • Beta testing with large media companies showed 34% or more cost savings on AI-powered live video workflows compared to using multiple point solutions, making this a notable consolidation option for media organizations already using AWS Elemental encoding services.
  • A practical sports broadcasting use case is highlighted where highlight clips can be identified and distributed to social platforms during live games rather than hours after the fact, addressing a real operational gap in live content workflows.
  • The service is available in four regions at launch: US East N. Virginia, US West Oregon, Asia Pacific Mumbai, and Europe Ireland. 
  • Pricing details are not specified in the announcement, so listeners should check the AWS Elemental Inference documentation at docs.aws.amazon.com/elemental-inference for current pricing information.

GCP

57:25 Managed MCP servers for Google Cloud databases

  • Google Cloud expanded its managed MCP server support to cover AlloyDB, Spanner, Cloud SQL, Bigtable, and Firestore, allowing AI agents to interact with these databases through natural language without requiring infrastructure deployment or complex configuration.
  • The security model relies entirely on IAM for authentication rather than shared keys, and all agent actions are logged in Cloud Audit Logs, which addresses a practical concern for teams worried about giving AI agents access to production databases.
  • A new Developer Knowledge MCP server connects IDEs directly to Google’s official documentation, letting agents reference best practices in real time during tasks like database migrations or app development troubleshooting.
  • Because these servers follow the open MCP standard, they work with third-party clients like Anthropic’s Claude in addition to Gemini, which broadens the practical appeal beyond teams already committed to Google’s AI tooling.
  • Google has signaled plans to extend managed MCP support to Looker, Memorystore, Pub/Sub, Kafka, and migration services in the coming months, suggesting this is an ongoing buildout rather than a one-time release. 
  • Pricing is not separately listed for MCP access and likely falls under existing database service costs.

44:12 Matt – “Anything that makes databases easier, I’m all for.” 

45:12 Gemini 3.1 Pro: Announcing our latest Gemini AI model

  • Gemini 3.1 Pro is now available in preview for developers via Google AI Studio, Gemini CLI, Vertex AI, and Android Studio, with enterprise access through Vertex AI and Gemini Enterprise. Pricing details have not been publicly announced for the preview period.
  • The model scores 77.1% on the ARC-AGI-2 benchmark, which tests reasoning on novel logic patterns, representing more than double the score of the previous Gemini 3 Pro model. 
  • This positions it as a stronger option for complex problem-solving tasks compared to its predecessor.
  • Practical use cases highlighted include generating animated SVGs from text prompts, building live data dashboards by connecting to public APIs, and prototyping interactive 3D interfaces with hand-tracking and generative audio. These examples suggest the model is particularly suited for developers working on data visualization and creative coding projects.
  • Consumer access is rolling out through the Gemini app and NotebookLM, but the 3.1 Pro tier is restricted to Google AI Pro and Ultra plan subscribers. This tiered access model means free-tier users will not have access during the preview phase.
  • Google notes the model is still in preview while they validate performance for agentic workflows before a general availability release. GCP customers evaluating it for production use should factor in that capabilities and pricing may shift before the full release.

46:23 Matt – “It’s just amazing to me how fast these models are improving. This one is saying it scored a 77%, where models a year ago were 40 and 50%. Seeing how fast everything is moving is insane.” 

47:36 Understanding the Firefly clock synchronization protocol

  • Google’s Firefly is a software-based clock synchronization protocol that achieves sub-10-nanosecond NIC-to-NIC synchronization across data center hardware, without requiring specialized or expensive dedicated timing equipment.
  • The protocol uses a distributed consensus algorithm built on random graphs rather than a traditional hierarchical time server model, which improves convergence speed, scalability, and resilience to network path asymmetries.
  • Firefly decouples internal synchronization from external UTC synchronization, meaning external time server jitter does not degrade the precision of clock alignment within the data center fabric itself.
  • Financial services workloads are a primary beneficiary, as regulatory requirements mandate sub-100 microsecond external UTC synchronization and sub-10 nanosecond internal synchronization, both of which Firefly meets on standard cloud infrastructure.
  • Beyond finance, the protocol has practical implications for distributed database consistency, ML workload coordination, and fine-grained network telemetry, potentially enabling workloads that previously required on-premises dedicated hardware to run on cloud infrastructure instead. No specific pricing details were provided in the announcement.
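A toy simulation (not the Firefly algorithm itself, whose details are not in the announcement) illustrates why peer-to-peer averaging over a random graph converges quickly: each round, every node nudges its clock toward the mean of a few randomly chosen peers instead of chasing a single hierarchical time server, and the spread of offsets collapses geometrically:

```python
import random

# Toy illustration of decentralized clock averaging on a random graph.
random.seed(42)
N = 100
offsets = [random.uniform(-1000, 1000) for _ in range(N)]  # initial offsets, ns

def spread(xs):
    return max(xs) - min(xs)

before = spread(offsets)
for _ in range(30):                              # rounds of gossip-style averaging
    nxt = []
    for i in range(N):
        peers = random.sample(range(N), 3)       # random-graph neighbors
        est = sum(offsets[j] for j in peers) / 3
        nxt.append((offsets[i] + est) / 2)       # move halfway to the peer mean
    offsets = nxt
after = spread(offsets)

print(f"spread before: {before:.0f} ns, after: {after:.6f} ns")
```

Because every node converges toward a shared internal consensus rather than an external reference, jitter on the UTC uplink does not disturb the alignment inside the fabric, which is the decoupling the bullet points describe.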

48:52 Jonathan – “The fact that you need to guarantee sub-hundred-microsecond synchronization for financial systems is crazy.” 

-Honorable Mentions- 

50:32 America-India Connect infrastructure connects four continents

  • Google is investing $15 billion in AI infrastructure in India and launching America-India Connect, a multi-continent subsea cable initiative that establishes new fiber-optic routes connecting the United States, India, Singapore, South Africa, and Australia. 
  • The project creates Visakhapatnam as a new international subsea gateway on India’s east coast, adding network diversity beyond existing Mumbai and Chennai landing points.
  • The infrastructure combines multiple subsea cable systems, including Equiano, Nuvem, Bosun, Tabua, TalayLink, and Honomoana, to create redundant high-capacity routes between American coasts and India through both African and Pacific paths. 
  • This approach provides network resilience for over 1 billion people in India while improving connectivity across the Southern Hemisphere.
  • Google Cloud is serving as the primary cloud infrastructure provider for India’s iGOT Karmayogi platform, which delivers training to over 20 million public servants across 800+ districts. 
  • The platform will use AI to digitize legacy training content and enable access in 18+ Indian languages, supporting the government’s Mission Karmayogi initiative for civil service modernization.
  • The announcement positions these subsea cables as critical infrastructure to prevent an AI divide, with documented evidence that subsea cable connectivity improves internet affordability and reliability while driving productivity and economic growth. 
  • The initiative builds on Google’s existing infrastructure investments in Africa, Australia, and the Pacific region.
  • Added this one just for you, Justin. 

52:20 Wilbarger County data center

  • Google is building a new data center in Wilbarger County, Texas, expanding its existing infrastructure footprint in the state. 
  • This is primarily an infrastructure capacity announcement rather than a new GCP service or feature.
  • The facility will use air-cooling technology instead of traditional water cooling, limiting water consumption to only essential campus operations like kitchens. This is a notable operational choice given ongoing concerns about data center water usage in drought-prone regions.
  • Google has contracted to add more than 7,800 MW of net-new energy generation and capacity to the Texas electricity grid, with the Wilbarger facility co-located alongside new clean power developed in partnership with AES.
  • Google announced a $30 million Energy Impact Fund in November to support energy affordability, school weatherization, and energy workforce development across Texas. Details on the fund are available here
  • For GCP customers, additional Texas-based infrastructure generally signals potential improvements in latency and redundancy for workloads serving the south-central US region, though Google has not announced specific new GCP regions or zones tied to this facility.

52:55 Use Lyria 3 to create music tracks in the Gemini app

  • Google DeepMind’s Lyria 3 model is now available in beta within the Gemini app, letting users generate 30-second music tracks with lyrics, custom cover art, and style controls from text prompts or uploaded photos and videos. 
  • This is available to users 18 and older in 8 languages, with higher usage limits for Google AI Plus, Pro, and Ultra subscribers.
  • Lyria 3 improves on previous versions by auto-generating lyrics from prompts, offering more control over style, vocals, and tempo, and producing more musically complex outputs without requiring users to provide their own creative assets.
  • All generated tracks are embedded with SynthID, Google DeepMind’s imperceptible watermark, and the Gemini app now extends its AI content verification to audio files, allowing users to upload audio and check whether it was generated by Google AI.
  • The feature is also rolling out to YouTube creators via Dream Track for Shorts soundtracks, connecting Lyria 3 to a broader content creation workflow beyond the Gemini app itself.
  • On the responsible AI side, Google states Lyria 3 was trained with copyright and partner agreements in mind, artist-specific prompts are treated as stylistic inspiration rather than direct mimicry, and output filters check against existing content, though Google acknowledges this approach is not guaranteed to catch all issues.

Azure

57:25 A milestone achievement in our journey to carbon negative

  • Microsoft has achieved its 2025 goal of matching 100 percent of global electricity consumption with renewable energy, contracting 40 gigawatts of new renewable capacity across 26 countries since 2020. 
  • This represents enough energy to power approximately 10 million US homes, with 19 GW currently online and the remainder coming online over the next five years.
  • The renewable energy procurement has reduced Microsoft’s reported Scope 2 carbon emissions by an estimated 25 million tons and mobilized billions in private investment through over 400 contracts with 95 utilities and developers. This directly impacts Azure datacenter operations globally, supporting the infrastructure that runs customer workloads while advancing toward the company’s 2030 carbon negative commitment.
  • Microsoft is expanding beyond renewable energy to include nuclear power and other carbon-free technologies, including a 50 MW fusion project with Helion in Washington state and restarting the 835 MW Crane Clean Energy Center in Pennsylvania with Constellation Energy. The Climate Innovation Fund has allocated $806 million to 67 investees, with 38 percent directed toward energy systems innovation.
  • The company is deploying AI-driven tools to accelerate clean energy deployment, including collaborations with Idaho National Laboratory for nuclear licensing and the Midcontinent Independent System Operator for grid optimization. 
  • These tools aim to streamline the design, permitting, and deployment of new power technologies to expand grid capacity more efficiently.
  • Azure customers benefit indirectly through more sustainable cloud infrastructure, though Microsoft notes the shift to an all-of-the-above decarbonization strategy recognizes that rising electricity demand from datacenters, AI workloads, and digital services requires diverse carbon-free energy sources beyond renewables alone.

55:58 Generally Available: Quota and deployment troubleshooting tools for Azure Functions Flex Consumption 

  • Azure Functions Flex Consumption now has generally available quota and deployment troubleshooting tools built directly into the platform, giving developers clearer visibility into quota limits and constraints without needing to dig through documentation or support tickets.
  • The quota troubleshooting experience surfaces Flex Consumption-specific limits in context, which is useful for teams hitting scaling walls and trying to understand why deployments are behaving unexpectedly.
  • This is a quality-of-life improvement aimed at developers and platform engineers who use Flex Consumption for its per-execution billing model and fast scaling, helping reduce time spent diagnosing deployment failures.
  • Pricing for Flex Consumption remains consumption-based, so there is no additional cost for these troubleshooting tools themselves. More details are available at the Azure updates page here
  • Teams already invested in Azure Functions should note this reduces reliance on external monitoring or support escalations for common quota-related issues, keeping troubleshooting within the Azure portal workflow.

56:32 Matt – “This is a great quality of life improvement because you can see why things are breaking when you’re using flexible consumption.” 

-Honorable Mentions-

1:01:07 Public Preview Announcement: Empower Real-Time Security with Microsoft Sentinel’s CCF Push Feature | Microsoft Community Hub

  • Microsoft Sentinel’s CCF Push feature, now in public preview, allows security data providers to send logs directly to a Sentinel workspace without the traditional setup overhead of manually configuring Data Collection Endpoints, Data Collection Rules, Entra app registrations, and RBAC assignments. Pressing Deploy handles all resource provisioning automatically.
  • The feature is built on Sentinel’s Log Ingestion API, which supports high-throughput data ingestion, pre-ingestion data transformation, and direct targeting of system tables, making it more flexible than the older polling-based connector model.
  • For partners and ISVs building Sentinel integrations, CCF Push reduces time to market by consolidating connector deployment through the Content Hub as a single interface, rather than requiring customers to configure multiple Azure resources independently.
  • Early adopters include security vendors like Obsidian Security and Varonis, suggesting the feature is already being validated in real-world security workflows. 
  • Developers can reference the MS Learn documentation here to get started.
  • No specific pricing details were provided in the announcement, but since CCF Push feeds data into Sentinel workspaces, standard Sentinel and Log Analytics ingestion costs would apply. 
  • Organizations evaluating this feature should factor in their existing Sentinel pricing tier when estimating costs.

1:01:24 Microsoft Sovereign Cloud adds governance, productivity and support for large AI models securely running even when completely disconnected

  • Azure Local disconnected operations are now generally available, allowing organizations to run mission-critical infrastructure with full Azure governance and policy enforcement even when completely isolated from cloud connectivity. This targets government, defense, and regulated industries where external dependencies are either unacceptable or prohibited.
  • Microsoft 365 Local disconnected brings Exchange Server, SharePoint Server, and Skype for Business Server into fully air-gapped sovereign environments running on Azure Local, with Microsoft committing support for these workloads through at least 2035. 
  • This keeps productivity tools available under the same governance boundary as infrastructure workloads.
  • Foundry Local now supports large multimodal AI models running on-premises hardware, including NVIDIA GPUs, within fully disconnected sovereign environments. This extends local AI inferencing capabilities beyond the smaller models Foundry Local previously supported, with Microsoft providing deployment, update, and operational health support.
  • The three components together form a full-stack sovereign private cloud covering infrastructure, productivity, and AI inferencing, all manageable through consistent Azure governance tooling regardless of connectivity state. 
  • Pricing is not publicly listed and appears to vary based on deployment scale and customer qualification, so organizations should contact Microsoft directly for specifics.
  • Target customers include public sector agencies, classified environments, and regulated industries in regions where data residency and operational autonomy are legal or contractual requirements. 
  • Azure Local disconnected operations and Microsoft 365 Local are available worldwide, while large model support on Foundry Local is currently limited to qualified customers.

Emerging Clouds 

1:03:04 Introducing Command Center:  The unified operations platform  for AI workloads

  • Crusoe Command Center is a unified operations platform that consolidates GPU cluster monitoring, orchestration, and support into a single interface, addressing the common problem of engineers context-switching between fragmented dashboards during AI training runs.
  • The platform integrates with Crusoe Managed Kubernetes and supports Managed Slurm, allowing long-running multi-week training jobs to operate continuously across large GPU clusters without manual intervention.
  • AutoClusters is a key component that automatically detects GPU performance degradation, evicts compromised nodes, and replaces them with healthy instances from a reserve pool, reducing the need for around-the-clock manual oversight.
  • On the observability side, Command Center supports multiple access methods, including a UI, Grafana via PromQL API, and a Prometheus endpoint, while a Telemetry Relay feature streams infrastructure metrics directly to external tools to reduce data silos.
  • The Crusoe Watch Agent, paired with Telemetry Relay, extends visibility to custom application-level metrics, allowing teams to correlate workload performance with underlying GPU health data for more precise troubleshooting.

1:04:04 Matt – “The whole stack here is what I kind of find nice. The smaller clouds are trying to attack that whole vertical a lot more, where they’re giving you that depth all the way down, so if you are training your own model, you get the CPU, you get the GPU, you can see that whole stack of what’s going on, and really start to fine-tune.”     

1:05:09 Expanding our Agentic Inference Cloud: Introducing GPU Droplets Powered by AMD Instinct MI350X GPUs

  • DigitalOcean is adding AMD Instinct MI350X GPUs to its GPU Droplets lineup, built on the CDNA 4 architecture and optimized for inference workloads, including prefill phase compute, low-latency token generation, and larger context windows.
  • The platform has demonstrated measurable results with existing customers, including a 2x increase in production request throughput and 50% reduction in inference costs for Character.AI, giving potential adopters concrete performance benchmarks to evaluate.
  • DigitalOcean is positioning these offerings toward AI-native companies and developers who need enterprise features like HIPAA eligibility and SOC 2 compliance without the complexity of larger cloud providers, with provisioning available in a few clicks.
  • The GPUs are currently available in the Atlanta datacenter, with AMD Instinct MI355X GPUs planned for next quarter, which will introduce liquid-cooled rack infrastructure to support larger models and datasets.
  • For smaller businesses and developers, the predictable usage-based pricing and simplified deployment model represent a meaningful alternative to the more complex pricing and configuration requirements typical of hyperscaler GPU offerings.

Closing

And that is the week in the cloud! Visit our website, the home of the Cloud Pod, where you can join our newsletter, Slack team, send feedback, or ask questions at theCloudPod.net or tweet at us with the hashtag #theCloudPod





Download audio: https://episodes.castos.com/5e2d2c4b117f29-10227663/2382918/c1e-7nknsv3m59bwj41v-rk2qm0wkfwon-ykry8x.mp3
Read the whole story
alvinashcraft
just a second ago
reply
Pennsylvania, USA
Share this story
Delete

Battery Technical Quality Enforcement is Here: How to Optimize Common Wake Lock Use Cases


Posted by Alice Yuan, Senior Developer Relations Engineer


In recognition that excessive battery drain is top of mind for Android users, Google has been taking significant steps to help developers build more power-efficient apps. On March 1st, 2026, Google Play Store began rolling out wake lock technical quality treatments to address excessive battery drain. These treatments will roll out gradually to impacted apps over the following weeks. Apps that consistently exceed the "Excessive Partial Wake Lock" threshold in Android vitals may see tangible impacts on their store presence, including warnings on their store listing and exclusion from discovery surfaces such as recommendations.

Users may see a warning on your store listing if your app exceeds the bad behavior threshold.

This initiative elevates battery efficiency to a core vital metric alongside stability metrics like crashes and ANRs. The "bad behavior threshold" is defined as holding a non-exempted partial wake lock for at least two hours on average while the screen is off in more than 5% of user sessions in the past 28 days. A wake lock is exempted if it is a system held wake lock that offers clear user benefits that cannot be further optimized, such as audio playback, location access, or user-initiated data transfer. You can view the full definition of excessive wake locks in our Android vitals documentation.
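As a rough illustration of that definition, the threshold logic can be sketched in a few lines of Python. This is a simplification I'm adding for clarity, not how Android vitals actually computes the metric: the function name and session structure here are hypothetical, and the real computation happens server-side over the 28-day window.

```python
# Illustrative sketch of the "Excessive Partial Wake Lock" bad-behavior
# threshold described above. Hypothetical names; Android vitals performs
# the real computation server-side over a 28-day window.

def exceeds_wake_lock_threshold(sessions) -> bool:
    """Each session dict records how many hours non-exempted partial
    wake locks were held while the screen was off."""
    if not sessions:
        return False
    # A session counts as "bad" when wake locks were held for 2+ hours.
    bad = sum(1 for s in sessions if s["wake_lock_hours"] >= 2.0)
    # The treatment applies when more than 5% of sessions are bad.
    return bad / len(sessions) > 0.05
```

With 6 bad sessions out of 100 (6%, above the 5% bar), this sketch reports the threshold as exceeded; at exactly 5% it does not.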

As part of our ongoing initiative to improve battery life across the Android ecosystem, we have analyzed thousands of apps and how they use partial wake locks. While wake locks are sometimes necessary, we often see apps holding them inefficiently or unnecessarily, when more efficient solutions exist. This blog will go over the most common scenarios where excessive wake locks occur and our recommendations for optimizing wake locks.  We have already seen measurable success from partners like WHOOP, who leveraged these recommendations to optimize their background behavior.

Using a foreground service vs partial wake locks

We’ve often seen developers struggle to understand the difference between two concepts when doing background execution: foreground service and partial wake locks.

A foreground service is a lifecycle API that signals to the system that an app is performing user-perceptible work and should not be killed to reclaim memory, but it does not automatically prevent the CPU from sleeping when the screen turns off. In contrast, a partial wake lock is a mechanism specifically designed to keep the CPU running even while the screen is off. 

While a foreground service is often necessary to continue a user-initiated action, manually acquiring a partial wake lock is only needed in conjunction with a foreground service, and only for the duration of the CPU activity. In addition, you don't need to use a wake lock if you're already utilizing an API that keeps the device awake. 

Refer to the flow chart in Choose the right API to keep the device awake to ensure you have a strong understanding of what tool to use to avoid acquiring a wake lock in scenarios where it’s not necessary.

Third party libraries acquiring wake locks

It is common for an app to discover that it is flagged for excessive wake locks held by a third-party SDK or system API acting on its behalf. To identify and resolve these wake locks, we recommend the following steps:

  • Check Android vitals: Find the exact name of the offending wake lock in the excessive partial wake locks dashboard. Cross-reference this name with the Identify wake locks created by other APIs guidance to see if it was created by a known system API or Jetpack library. If it is, you may need to optimize your usage of the API and can refer to the recommended guidance.

  • Capture a System Trace: If the wake lock cannot be easily identified, reproduce the wake lock issue locally using a system trace and inspect it with the Perfetto UI. You can learn more about how to do this in the Debugging other types of excessive wake locks section of this blog post.

  • Evaluate Alternatives: If an inefficient third-party library is responsible and cannot be configured to respect battery life, consider communicating the issue with the SDK's owners, finding an alternative SDK or building the functionality in-house.

Common wake lock scenarios

Below is a breakdown of some of the specific use cases we have reviewed, along with the recommended path to optimize your wake lock implementation.

User-Initiated Upload or Download

Example use cases: 

  • Video streaming apps where the user triggers a download of a large file for offline access.

  • Media backup apps where the user triggers uploading their recent photos via a notification prompt.

How to reduce wake locks: 

  • Do not acquire a manual wake lock. Instead, use the User-Initiated Data Transfer (UIDT) API. This is the designated path for long running data transfer tasks initiated by the user, and it is exempted from excessive wake lock calculations.

One-Time or Periodic Background Syncs

Example use cases: 

  • An app performs periodic background syncs to fetch data for offline access. 

  • Pedometer apps that fetch step count periodically.

How to reduce wake locks: 

  • Do not acquire a manual wake lock. Use WorkManager configured for one-time or periodic work.  WorkManager respects system health by batching tasks and has a minimum periodic interval (15 minutes), which is generally sufficient for background updates. 

  • If you identify wake locks created by WorkManager or JobScheduler with high wake lock usage, it may be because you’ve misconfigured your worker to not complete in certain scenarios. Consider analyzing the worker stop reasons, particularly if you’re seeing high occurrences of STOP_REASON_TIMEOUT.

workManager.getWorkInfoByIdFlow(syncWorker.id)
  .collect { workInfo ->
      if (workInfo != null) {
        val stopReason = workInfo.stopReason
        logStopReason(syncWorker.id, stopReason)
      }
  }
  • In addition to logging worker stop reasons, refer to our documentation on debugging your workers. Also, consider collecting and analyzing system traces to understand when wake locks are acquired and released.

  • Finally, check out our case study with WHOOP, where they were able to discover an issue with configuration of their workers and reduce their wake lock impact significantly.

Bluetooth Communication

Example use cases: 

  • Companion device app prompts the user to pair their Bluetooth external device.

  • Companion device app listens for hardware events on an external device and user visible change in notification.

  • Companion device app’s user initiates a file transfer between the mobile and bluetooth device.

  • Companion device app performs occasional firmware updates to an external device via Bluetooth.

How to reduce wake locks: 

  • Use companion device pairing to pair Bluetooth devices to avoid acquiring a manual wake lock during Bluetooth pairing. 

  • Consult the Communicate in the background guidance to understand how to do background Bluetooth communication. 

  • Using WorkManager is often sufficient if there is no user impact to a delayed communication. If a manual wake lock is deemed necessary, only hold the wake lock for the duration of Bluetooth activity or processing of the activity data.

Location Tracking

Example use cases: 

  • Fitness apps that cache location data for later upload such as plotting running routes

  • Food delivery apps that pull location data at a high frequency to update progress of delivery in a notification or widget UI.

How to reduce wake locks: 

  • Consult our guidance to Optimize location usage. Consider implementing timeouts, leveraging location request batching, or utilizing passive location updates to ensure battery efficiency.

  • When requesting location updates using the FusedLocationProvider or LocationManager APIs, the system automatically triggers a device wake-up during the location event callback. This brief, system-managed wake lock is exempted from excessive partial wake lock calculations.

  • Avoid acquiring a separate, continuous wake lock for caching location data, as this is redundant. Instead, persist location events in memory or local storage and leverage WorkManager to process them at periodic intervals.

override fun onCreate(savedInstanceState: Bundle?) {
    locationCallback = object : LocationCallback() {
        override fun onLocationResult(locationResult: LocationResult?) {
            locationResult ?: return
            // System wakes up CPU for short duration
            for (location in locationResult.locations){
                // Store data in memory to process at another time
            }
        }
    }
}

High Frequency Sensor Monitoring

Example use cases: 

  • Pedometer apps that passively collect steps, or distance traveled. 

  • Safety apps that monitor the device sensors for rapid changes in real time, to provide features such as crash detection or fall detection.

How to reduce wake locks: 

  • If using SensorManager, reduce usage to periodic intervals and only when the user has explicitly granted access through a UI interaction. High frequency sensor monitoring can drain the battery heavily due to the number of CPU wake-ups and processing that occurs.

  • If you’re tracking step counts or distance traveled, rather than using SensorManager, leverage Recording API or consider utilizing Health Connect to access historical and aggregated device step counts to capture data in a battery-efficient manner.

  • If you’re registering a sensor with SensorManager, specify a maxReportLatencyUs of 30 seconds or more to leverage sensor batching to minimize the frequency of CPU interrupts. When the device is subsequently woken by another trigger such as a user interaction, location retrieval, or a scheduled job, the system will immediately dispatch the cached sensor data.

val accelerometer = sensorManager.getDefaultSensor(Sensor.TYPE_ACCELEROMETER)

sensorManager.registerListener(this,
                 accelerometer,
                 samplingPeriodUs, // How often to sample data
                 maxReportLatencyUs // Key for sensor batching 
              )

  • If your app requires both location and sensor data, synchronize their event retrieval and processing. By piggybacking sensor readings onto the brief wake lock the system holds for location updates, you avoid needing a wake lock to keep the CPU awake. Use a worker or a short-duration wake lock to handle the upload and processing of this combined data.

Remote Messaging

Example use cases: 

  • Video or sound monitoring companion apps that need to monitor events that occur on an external device connected using a local network.

  • Messaging apps that maintain a network socket connection with the desktop variant.

How to reduce wake locks: 

  • If the network events can be processed on the server side, use FCM to receive information on the client. You may choose to schedule an expedited worker if additional processing of FCM data is required. 

  • If events must be processed on the client side via a socket connection, a wake lock is not needed to listen for event interrupts. When data packets arrive at the Wi-Fi or Cellular radio, the radio hardware triggers a hardware interrupt in the form of a kernel wake lock. You may then choose to schedule a worker or acquire a wake lock to process the data.

  • For example, if you’re using ktor-network to listen for data packets on a network socket, you should only acquire a wake lock when packets have been delivered to the client and need to be processed.

val readChannel = socket.openReadChannel()
while (!readChannel.isClosedForRead) {
    // CPU can safely sleep here while waiting for the next packet
    val packet = readChannel.readRemaining(1024) 
    if (!packet.isEmpty) {
         // Data Arrived: The system woke the CPU and we should keep it awake via manual wake lock (urgent) or scheduling a worker (non-urgent)
         performWorkWithWakeLock { 
              val data = packet.readBytes()
              // Additional logic to process data packets
         }
    }
}

Summary

By adopting these recommended solutions for common use cases like background syncs, location tracking, sensor monitoring and network communication, developers can work towards reducing unnecessary wake lock usage. To continue learning, read our other technical blog post or watch our technical video on how to discover and debug wake locks: Optimize your app battery using Android vitals wake lock metric. Also, consult our updated wake lock documentation. To help us continue improving our technical resources, please share any additional feedback on our guidance in our documentation feedback survey.
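The scenarios above boil down to a small decision table. As a condensed, illustrative summary (my paraphrase of this post's recommendations, not an official API; consult the linked guidance for the authoritative details):

```python
# Condensed summary of this post's recommendations: each common scenario
# mapped to the preferred alternative to holding a manual wake lock.
# Illustrative only; see the linked Android guidance for full details.

RECOMMENDED_API = {
    "user_initiated_transfer": "User-Initiated Data Transfer (UIDT) API",
    "periodic_background_sync": "WorkManager (one-time or periodic work)",
    "bluetooth_pairing": "Companion device pairing",
    "location_caching": "Location callback + WorkManager (no extra wake lock)",
    "high_frequency_sensors": "Sensor batching, Recording API, or Health Connect",
    "remote_messaging": "FCM, or radio interrupts plus short-lived work",
}

def recommend(scenario: str) -> str:
    fallback = "See 'Choose the right API to keep the device awake'"
    return RECOMMENDED_API.get(scenario, fallback)
```

The common thread: in every row, a system-managed mechanism replaces a manually held wake lock.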


Get started with GitHub Copilot SDK


Did you know GitHub Copilot now has an SDK, and that you can leverage your existing license to build AI integrations into your app? No? Well, I hope I have your attention now.

Install

You need two pieces here to get started:

  • GitHub Copilot CLI
  • A supported runtime, which at present means either Node.js, .NET, Python or Go

Then you need to install the SDK for your chosen runtime like so:

pip install github-copilot-sdk

The parts

So what do you need to know to get started? There are three concepts:

  • Client: you need to create an instance of it. Additionally, you need to start it, and stop it when you're done with it.
  • Session: the session takes an options object where you can set things like the model, system prompt and more. Also, the session is what you talk to when you want to carry out a request.
  • Response: the response contains your LLM response.

Below is an example program using these three concepts. As you can see, we choose "gpt-4.1" as the model, but this can be changed. See also how we pass the prompt to the send_and_wait function.

import asyncio
from copilot import CopilotClient

async def main():
    client = CopilotClient()
    await client.start()

    session = await client.create_session({"model": "gpt-4.1"})
    response = await session.send_and_wait({"prompt": "What is 2 + 2?"})

    print(response.data.content)

    await client.stop()

asyncio.run(main())
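One detail worth noting about the lifecycle above: if the code between start() and stop() raises, stop() never runs. Wrapping the client in an async context manager avoids that. The sketch below uses a stand-in client class so it runs anywhere; the only assumption made about the real CopilotClient is that it exposes async start() and stop() methods, as shown in the example above.

```python
import asyncio
from contextlib import asynccontextmanager

class FakeClient:
    """Stand-in for CopilotClient; assumes only async start()/stop()."""
    def __init__(self):
        self.started = False

    async def start(self):
        self.started = True

    async def stop(self):
        self.started = False

@asynccontextmanager
async def managed_client(factory=FakeClient):
    # stop() is guaranteed to run, even if the body raises.
    client = factory()
    await client.start()
    try:
        yield client
    finally:
        await client.stop()

async def demo():
    async with managed_client() as client:
        print("client started:", client.started)

asyncio.run(demo())
```

Swap `FakeClient` for the real client factory and the try/finally bookkeeping disappears from your application code.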

Ok, now that we know what a simple program looks like, let's make something interesting, an FAQ responder.

Your first app

An FAQ on a web page is often a pretty boring read. A way to make it more interesting for the end user is to let them chat with the FAQ instead. Let's make that happen.

Here's the plan:

  • Define a static FAQ.
  • Add the FAQ as part of the prompt.
  • Make a request to the LLM and print out the response.

Let's build out the code little by little. First, let's define the FAQ information.

-1- FAQ information

# faq.py

faq = {
  "warranty": "Our products come with a 1-year warranty covering manufacturing defects. Please contact our support team for assistance.",
  "return_policy": "We offer a 30-day return policy for unused products in their original packaging. To initiate a return, please visit our returns page and follow the instructions.",     
  "shipping": "We offer free standard shipping on all orders over $50. Expedited shipping options are available at checkout for an additional fee.",
}

Next, let's add the call to the Copilot SDK

-2- Adding the LLM call


# faq.py (continued): this code lives in the same file as the faq dict above
import asyncio
from copilot import CopilotClient

def faq_to_string(faq: dict) -> str:
    return "\n".join([f"{key}: {value}" for key, value in faq.items()])

async def main(user_prompt: str = "Tell me about shipping"):
    client = CopilotClient()
    await client.start()

    prompt = f"Here's the FAQ, {faq_to_string(faq)}\n\nUser question: {user_prompt}\nAnswer:"   

    session = await client.create_session({"model": "gpt-4.1"})
    response = await session.send_and_wait({"prompt": prompt})

    print(response.data.content)

    await client.stop()

if __name__ == "__main__":
    print("My first app using the GitHub Copilot SDK!")
    print("[LOG] Asking the model about shipping information...")
    asyncio.run(main("Tell me about shipping"))

Note how we concatenate the FAQ data with the user's prompt:

 prompt = f"Here's the FAQ, {faq_to_string(faq)}\n\nUser question: {user_prompt}\nAnswer:"   
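Since prompt assembly is plain string work, it can be factored into a pure function and unit tested without ever calling the model. A quick sketch (the helper name is mine, not part of the SDK):

```python
def build_faq_prompt(faq: dict, user_prompt: str) -> str:
    """Flatten the FAQ dict and append the user's question, mirroring
    the concatenation shown above. Pure function, so it's easy to test."""
    faq_text = "\n".join(f"{key}: {value}" for key, value in faq.items())
    return f"Here's the FAQ, {faq_text}\n\nUser question: {user_prompt}\nAnswer:"

print(build_faq_prompt({"shipping": "Free over $50."}, "Tell me about shipping"))
```

Keeping prompt construction separate from the client call also makes it easier to iterate on wording without touching the session code.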

-3- Let's run it

Now run it:

uv run faq.py

You should see output like so:

My first app using the GitHub Copilot SDK!
[LOG] Asking the model about shipping information...
We offer free standard shipping on all orders over $50. Expedited shipping options are available at checkout for an additional fee.

What's next

Check out the official docs


Copilot App on Windows: Opening web links alongside your conversations begins rolling out to Windows Insiders

Hello Windows Insiders,

In today's update, we are introducing a new way to get things done in the Copilot App on Windows. Now, when you click to open a link, Copilot opens the content in a sidepane next to your conversation instead of a separate browser window, so you don't lose context.

With your permission, Copilot will also have access to the context of the tabs you open in that conversation (and only in that conversation), so you can ask clarifying questions, summarize information across tabs, or ask Copilot's help in drafting exactly the right words needed for the task. Tabs you open will be saved with the conversation, so you can return to them when you come back to that conversation. Additionally, if you choose to enable it, you can sync passwords and form data so it's easier to work within Copilot.

[Image caption: View web content in a sidepane next to your conversation without leaving the app.]

Today's release also comes with an update to the Copilot App that makes it faster, more reliable, and packed with all of the latest Copilot features that make the PC experience even better. As part of this update, some features like Podcasts and Study and Learn mode from Copilot.com are getting added, while others may be pulled back while we iterate on the experience; we will add priority features back in before the updated app is generally available.

These updates (Copilot app version 146.0.3856.39 and higher) are beginning to roll out to all Insider Channels. Availability may vary as we gradually expand the rollout to Insiders worldwide. We're excited to preview these improvements with our Insider community and ensure we deliver a great Copilot experience for all Windows customers. With the power of the web, you'll be able to get even more things done with Copilot.
FEEDBACK: Please provide feedback directly within the Copilot app by clicking on your profile icon and choosing "Give feedback".

Thanks,
Microsoft Copilot Team

python-1.0.0rc3


[1.0.0rc3] - 2026-03-04

Added

  • agent-framework-core: Add Shell tool (#4339)
  • agent-framework-core: Add file_ids and data_sources support to get_code_interpreter_tool() (#4201)
  • agent-framework-core: Map file citation annotations from TextDeltaBlock in Assistants API streaming (#4316, #4320)
  • agent-framework-claude: Add OpenTelemetry instrumentation to ClaudeAgent (#4278, #4326)
  • agent-framework-azure-cosmos: Add Azure Cosmos history provider package (#4271)
  • samples: Add auto_retry.py sample for rate limit handling (#4223)
  • tests: Add regression tests for Entry JoinExecutor workflow input initialization (#4335)

Changed

  • samples: Restructure and improve Python samples (#4092)
  • agent-framework-orchestrations: [BREAKING] Tighten HandoffBuilder to require Agent instead of SupportsAgentRun (#4301, #4302)
  • samples: Update workflow orchestration samples to use AzureOpenAIResponsesClient (#4285)

Fixed

  • agent-framework-bedrock: Fix embedding test stub missing meta attribute (#4287)
  • agent-framework-ag-ui: Fix approval payloads being re-processed on subsequent conversation turns (#4232)
  • agent-framework-core: Fix response_format resolution in streaming finalizer (#4291)
  • agent-framework-core: Strip reserved kwargs in AgentExecutor to prevent duplicate-argument TypeError (#4298)
  • agent-framework-core: Preserve workflow run kwargs when continuing with run(responses=...) (#4296)
  • agent-framework-core: Fix WorkflowAgent not persisting response messages to session history (#4319)
  • agent-framework-core: Fix single-tool input handling in OpenAIResponsesClient._prepare_tools_for_openai (#4312)
  • agent-framework-core: Fix agent option merge to support dict-defined tools (#4314)
  • agent-framework-core: Fix executor handler type resolution when using from __future__ import annotations (#4317)
  • agent-framework-core: Fix walrus operator precedence for model_id kwarg in AzureOpenAIResponsesClient (#4310)
  • agent-framework-core: Handle thread.message.completed event in Assistants API streaming (#4333)
  • agent-framework-core: Fix MCP tools duplicated on second turn when runtime tools are present (#4432)
  • agent-framework-core: Fix PowerFx eval crash on non-English system locales by setting CurrentUICulture to en-US (#4408)
  • agent-framework-orchestrations: Fix StandardMagenticManager to propagate session to manager agent (#4409)
  • agent-framework-orchestrations: Fix IndexError when reasoning models produce reasoning-only messages in Magentic-One workflow (#4413)
  • agent-framework-azure-ai: Fix parsing oauth_consent_request events in Azure AI client (#4197)
  • agent-framework-anthropic: Set role="assistant" on message_start streaming update (#4329)
  • samples: Fix samples discovered by auto validation pipeline (#4355)
  • samples: Use AgentResponse.value instead of model_validate_json in HITL sample (#4405)
  • agent-framework-devui: Fix .NET conversation memory handling in DevUI integration (#3484, #4294)

dotnet-1.0.0-rc3


What's Changed

  • .NET: Support hosted code interpreter for skill script execution by @SergeyMenshykh in #4192
  • .NET: AgentThread serialization alternatives ADR by @westey-m in #3062
  • .NET: Add helpers to more easily access in-memory ChatHistory and make ChatHistoryProvider management more configurable. by @westey-m in #4224
  • .Net: Add additional Hosted Agent Samples by @rogerbarreto in #4325
  • .NET: Revert ".NET: Support hosted code interpreter for skill script execution" by @SergeyMenshykh in #4385
  • .NET: Fixing issue with invalid node Ids when visualizing dotnet workflows. by @alliscode in #4269
  • .NET: Fix FileAgentSkillsProvider custom SkillsInstructionPrompt silently dropping skills by @SergeyMenshykh in #4388
  • .NET: AuthN & AuthZ sample with asp.net service and web client by @westey-m in #4354
  • .NET: Update GroupChat workflow builder to support name and description by @peibekwe in #4334
  • .NET: Skip OffThread observability test by @rogerbarreto in #4399
  • .NET: AzureAI Package - Skip tool validation when UseProvidedChatClientAsIs is true by @rogerbarreto in #4389
  • [BREAKING] Add response filter for store input in *Providers by @westey-m in #4327
  • .NET: [BREAKING] Change *Provider StateKey to list of StateKeys by @westey-m in #4395
  • .NET: Updated Copilot SDK to the latest version by @dmytrostruk in #4406
  • .NET: Disable OpenAIAssistant structured output integration tests by @SergeyMenshykh in #4451
  • .NET: Update Azure.AI.Projects 2.0.0-beta.1 by @rogerbarreto in #4270
  • .NET: Skip flacky UT + (Attempt) Merge Gatekeeper fix by @rogerbarreto in #4456
  • .NET: Discover skill resources from directory instead of markdown links by @SergeyMenshykh in #4401
  • .NET: Update package versions by @dmytrostruk in #4468
  • .NET: Fixed CA1873 warning by @dmytrostruk in #4479

Full Changelog: dotnet-1.0.0-rc2...dotnet-1.0.0-rc3
