Sr. Content Developer at Microsoft, working remotely in PA, TechBash conference organizer, former Microsoft MVP, Husband, Dad and Geek.

GitHub Copilot’s Agentic Memory: Teaching AI to Remember and Learn Your Codebase

1 Share

Disclaimer: This post was originally published on Azure with AJ and has been reproduced here with permission. You can find the original post here.

One of the biggest challenges with AI coding assistants has been their stateless nature: every interaction starts from scratch, requiring developers to repeatedly explain coding conventions, architectural patterns, and repository-specific knowledge. GitHub has just changed the game with the public preview of agentic memory for GitHub Copilot, a capability that allows AI agents to remember and learn from your codebase over time.

This isn’t just another incremental improvement; it’s a fundamental shift toward truly intelligent AI assistants that grow smarter with every interaction. Let’s dive into how this groundbreaking feature works and why it’s set to transform how we collaborate with AI in our development workflows.

The Problem: Context Lost in Translation

Picture this scenario: You’re working on a complex enterprise application with specific coding conventions, database connection patterns, and synchronised configuration files. Every time you interact with GitHub Copilot, you find yourself explaining the same architectural decisions and coding standards. The AI produces decent code, but it lacks the deep understanding of your repository’s unique patterns and requirements.

Traditional AI assistants suffer from “contextual amnesia” – they can’t retain knowledge between sessions. This means:

  • Repetitive explanations: Constantly re-explaining coding conventions and patterns
  • Inconsistent suggestions: AI recommendations that don’t align with established codebase patterns
  • Missed relationships: Failing to understand dependencies between files that must stay synchronised
  • Generic responses: One-size-fits-all solutions that don’t respect repository-specific best practices

While we can leverage GitHub Copilot’s instructions to provide context, this approach is limited. Instructions can become lengthy, hard to maintain, and still don’t solve the problem of retaining knowledge across sessions.

Introducing Agentic Memory: AI That Learns and Remembers

Agentic memory represents a paradigm shift in how AI assistants work with code. Instead of starting fresh with each interaction, Copilot now builds and maintains a persistent understanding of your repository through “memories”: tightly scoped pieces of knowledge that it discovers and validates over time.

How Memories Are Created

The memory system works through what GitHub calls “just-in-time verification”. Here’s the elegant process:

graph TD
    A[Copilot Agent Working] --> B{Discovers Actionable Pattern}
    B -->|Yes| C[Create Memory with Citations]
    B -->|No| D[Continue Work]
    C --> E[Store Memory in Repository]
    E --> F[Available for Future Sessions]

    style A fill:#2563eb,stroke:#1e40af,stroke-width:3px,color:#fff
    style C fill:#059669,stroke:#047857,stroke-width:3px,color:#fff
    style E fill:#94a3b8,stroke:#64748b,stroke-width:2px,color:#0f172a
    style F fill:#94a3b8,stroke:#64748b,stroke-width:2px,color:#0f172a

When Copilot discovers something worth remembering, it creates a structured memory entry:

{
  "subject": "API version synchronisation",
  "fact": "API version must match between client SDK, server routes, and documentation.",
  "citations": [
    "src/client/sdk/constants.ts:12",
    "server/routes/api.go:8",
    "docs/api-reference.md:37"
  ],
  "reason": "If the API version is not kept properly synchronised, the integration can fail or exhibit subtle bugs. Remembering these locations will help ensure they are kept synchronised in future updates."
}
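For illustration, a memory entry like the one above can be modelled as a small data class. This is a hypothetical sketch: GitHub hasn’t published a client-side schema, so the field names here simply mirror the JSON shown.

```python
import json
from dataclasses import dataclass


@dataclass
class Memory:
    """A repository-scoped memory entry, mirroring the JSON structure above."""
    subject: str
    fact: str
    citations: list  # "path/to/file.ext:line" strings pointing at the evidence
    reason: str


raw = """
{
  "subject": "API version synchronisation",
  "fact": "API version must match between client SDK, server routes, and documentation.",
  "citations": [
    "src/client/sdk/constants.ts:12",
    "server/routes/api.go:8",
    "docs/api-reference.md:37"
  ],
  "reason": "Keeping these locations in sync avoids subtle integration bugs."
}
"""

memory = Memory(**json.loads(raw))
print(memory.subject)         # API version synchronisation
print(len(memory.citations))  # 3
```

The citations are the important part: they anchor the fact to concrete locations in the repository, which is what makes the validation step below possible.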

Memory Validation and Self-Healing

The brilliant aspect of this system is its self-healing nature. Before applying any stored memory, Copilot validates it against the current codebase:

graph TD
    A[Agent Starts New Session] --> B[Retrieve Repository Memories]
    B --> C[Check Citations Against Current Code]
    C --> D{Citations Valid?}
    D -->|Yes| E[Apply Memory Knowledge]
    D -->|No| F[Store Corrected Memory or Discard]
    E --> G[Continue with Enhanced Context]
    F --> G

    style A fill:#2563eb,stroke:#1e40af,stroke-width:3px,color:#fff
    style C fill:#f59e0b,stroke:#d97706,stroke-width:3px,color:#fff
    style E fill:#059669,stroke:#047857,stroke-width:3px,color:#fff
    style F fill:#94a3b8,stroke:#64748b,stroke-width:2px,color:#0f172a
    style G fill:#94a3b8,stroke:#64748b,stroke-width:2px,color:#0f172a

This real-time verification ensures that memories remain accurate even as code evolves, branches change, and files are refactored.
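As a rough sketch of what just-in-time verification could look like (purely illustrative; GitHub hasn’t published its actual validation logic), checking a citation can be as simple as confirming that the cited file still exists and the cited line is still there:

```python
import os


def citation_valid(repo_root: str, citation: str) -> bool:
    """Check a 'path:line' citation against the current working tree."""
    path, _, line_str = citation.rpartition(":")
    try:
        line_no = int(line_str)
    except ValueError:
        return False  # malformed citation
    full_path = os.path.join(repo_root, path)
    if not os.path.isfile(full_path):
        return False  # file was deleted or moved, so the citation is stale
    with open(full_path, encoding="utf-8") as f:
        line_count = sum(1 for _ in f)
    return 1 <= line_no <= line_count  # the cited line must still exist


def validate_memory(repo_root: str, memory: dict) -> bool:
    """A memory is applied only if every citation still checks out."""
    return all(citation_valid(repo_root, c) for c in memory["citations"])
```

A real implementation would go further, comparing the cited line’s content rather than just its existence, and rewriting or discarding citations when code moves, which is what gives the system its self-healing character.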

Privacy and Security: Is it safe to use?

One of the first questions that comes to mind with any AI memory system is privacy and security. Is it safe to let an AI remember details about my codebase?

The first thing to understand is that Copilot Memory stores repository-scoped memories only. This means memories are tied to a specific repository and can only be used by Copilot operations on that same repository.

Key points:

  • Repository Isolation: Memories are strictly scoped to individual repositories (not shared across repositories or orgs)
  • Permission-Based Creation: Only contributors with write permissions can create memories
  • Access Control: Memories can be used by other users with appropriate repository access, but not outside that repository
  • Management Tools: Repository owners can view and delete stored memories via repository settings
  • Automatic Expiry: Memories automatically delete after 28 days unless refreshed through validation

This ensures that sensitive repository knowledge stays within the appropriate boundaries while enabling powerful AI assistance.
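The 28-day expiry described above behaves like a sliding time-to-live: each successful validation refreshes the clock. A minimal sketch (the timestamp field and refresh rule here are assumptions for illustration, not GitHub’s implementation):

```python
from datetime import datetime, timedelta, timezone

MEMORY_TTL = timedelta(days=28)  # memories lapse unless refreshed


def is_expired(last_validated: datetime, now=None) -> bool:
    """A memory expires 28 days after its last successful validation."""
    now = now or datetime.now(timezone.utc)
    return now - last_validated > MEMORY_TTL


def refresh(memory: dict) -> dict:
    """A successful validation slides the expiry window forward."""
    memory["last_validated"] = datetime.now(timezone.utc)
    return memory
```

The practical upshot is that memories about actively exercised code keep getting revalidated and survive, while knowledge about abandoned patterns quietly ages out.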

Current Availability and Getting Started

Agentic memory is currently available in public preview for:

  • Copilot Coding Agent: Enhanced task completion with repository-specific knowledge
  • Copilot Code Review: Smarter pull request reviews based on learned patterns
  • Copilot CLI: Context-aware command-line assistance

Enabling Memory for Your Team

The feature is opt-in and available for all paid Copilot plans:

For Individual Users (Copilot Pro/Pro+):

  1. Navigate to Personal Copilot Settings
  2. Under “Features”, find “Copilot Memory”
  3. Select “Enabled” from the dropdown

For Organisations and Enterprises:

  1. Go to Organisation/Enterprise Settings
  2. Navigate to Copilot policies
  3. Enable Copilot Memory for your team

Repository Management: Repository owners can review and manage stored memories via: Repository Settings > Copilot > Memory

Implementation Best Practices

To maximise the value of agentic memory in your development workflows:

1. Start with High-Impact Repositories

Enable memory on repositories with:

  • Complex coding conventions
  • Synchronised configuration files
  • Specific architectural patterns
  • Multiple team contributors

2. Monitor Memory Quality

Regularly review stored memories to:

  • Remove outdated or incorrect memories
  • Validate that learned patterns align with current practices
  • Ensure memories reflect your team’s coding standards

3. Leverage Cross-Agent Benefits

Use multiple Copilot features together:

  • Let Code Review agents learn from expert developer patterns
  • Allow Coding Agent to benefit from review insights
  • Use CLI with enhanced repository context

4. Educate Your Team

Ensure team members understand:

  • How memories are created and validated
  • The privacy and security model
  • How to review and manage repository memories

Conclusion: A New Era of Intelligent Development

GitHub Copilot’s agentic memory represents a fundamental evolution in AI-assisted development. By solving the “contextual amnesia” problem, it enables AI agents to become true collaboration partners that grow more valuable over time.

The beauty lies not just in the technical implementation, but in how it transforms the developer experience. No more explaining the same patterns repeatedly. No more generic suggestions that miss repository-specific context. Instead, you get AI assistance that truly understands your codebase and respects your team’s established practices.

As we embrace this new era of agentic AI, the question isn’t whether to adopt these capabilities, but how quickly we can integrate them into our development workflows to unlock their full potential.

Have you enabled agentic memory in your repositories yet? What patterns do you hope Copilot will learn from your codebase? Share your experiences and thoughts in the comments below.


Read the whole story
alvinashcraft
45 minutes ago
reply
Pennsylvania, USA
Share this story
Delete

'America Is Slow-Walking Into a Polymarket Disaster'

In an opinion piece for The Atlantic, senior editor Saahil Desai argues that media outlets are increasingly treating prediction markets like Polymarket and Kalshi as legitimate signals of reality. The risk, as Desai warns, is a future where news coverage amplifies manipulable betting odds and turns politics, geopolitics, and even tragedy into speculative gambling theater. Here's an excerpt from the report: [...] The problem is that prediction markets are ushering in a world in which news becomes as much about gambling as about the event itself. This kind of thing has already happened to sports, where the language of "parlays" and "covering the spread" has infiltrated every inch of commentary. ESPN partners with DraftKings to bring its odds to SportsCenter and Monday Night Football; CBS Sports has a betting vertical; FanDuel runs its own streaming network. But the stakes of Greenland's future are more consequential than the NFL playoffs. The more that prediction markets are treated like news, especially heading into another election, the more every dip and swing in the odds may end up wildly misleading people about what might happen, or influencing what happens in the real world. Yet it's unclear whether these sites are meaningful predictors of anything. After the Golden Globes, Polymarket CEO Shayne Coplan excitedly posted that his site had correctly predicted 26 of 28 winners, which seems impressive -- but Hollywood awards shows are generally predictable. One recent study found that Polymarket's forecasts in the weeks before the 2024 election were not much better than chance. These markets are also manipulable. In 2012, one bettor on the now-defunct prediction market Intrade placed a series of huge wagers on Mitt Romney in the two weeks preceding the election, generating a betting line indicative of a tight race. The bettor did not seem motivated by financial gain, according to two researchers who examined the trades. 
"More plausibly, this trader could have been attempting to manipulate beliefs about the odds of victory in an attempt to boost fundraising, campaign morale, and turnout," they wrote. The trader lost at least $4 million but might have shaped media attention of the race for less than the price of a prime-time ad, they concluded. [...] The irony of prediction markets is that they are supposed to be a more trustworthy way of gleaning the future than internet clickbait and half-baked punditry, but they risk shredding whatever shared trust we still have left. The suspiciously well-timed bets that one Polymarket user placed right before the capture of Nicolas Maduro may have been just a stroke of phenomenal luck that netted a roughly $400,000 payout. Or maybe someone with inside information was looking for easy money. [...] As Tarek Mansour, Kalshi's CEO, has said, his long-term goal is to "financialize everything and create a tradable asset out of any difference in opinion." (Kalshi means "everything" in Arabic.) What could go wrong? As one viral post on X recently put it, "Got a buddy who is praying for world war 3 so he can win $390 on Polymarket." It's a joke. I think.

Read more of this story at Slashdot.


Satya Nadella’s new metaphor for the AI Age: We are becoming ‘managers of infinite minds’

Microsoft CEO Satya Nadella and former UK Prime Minister Rishi Sunak at the World Economic Forum in Davos. (Screenshot via LinkedIn)

Bicycles for the mind. … Information at your fingertips. … Managers of infinite minds?

Microsoft CEO Satya Nadella riffed on some famous lines from tech leaders past this week in an appearance at the World Economic Forum in Davos, Switzerland, and offered up his own trippy candidate to join the canon of computing metaphors. 

Nadella traced the lineage in a conversation with former UK Prime Minister Rishi Sunak. “Computers are like a bicycle for the mind,” was the famous line from Apple’s Steve Jobs. “Information at your fingertips,” Bill Gates’ frequent refrain back in the day, was more practical in the classic Microsoft style.

And now? “All of us are going to be managers of infinite minds,” Nadella said. “And so if we have that as the theory, then the question is, what can we do with it?”

He was referring to AI agents — the autonomous software that can take on tasks, work through problems, and keep going while you sleep. Microsoft and others have been talking for the better part of a year now about people starting to oversee large fleets of them. 

Nadella said it’s already reshaping how teams are structured. At Microsoft-owned LinkedIn, the company has merged design, program management, product management, and front-end engineering into a single new role: full-stack builders. Overall, he called it the biggest structural change to software teams he’s seen in a career that started at Microsoft in the 1990s.

“The jobs of the future are here,” Nadella said, putting his own spin on a famous line often attributed to sci-fi writer William Gibson. “They’re just not evenly distributed.”

Nadella’s comments came during a live stream for LinkedIn Premium members, hosted from Davos by LinkedIn VP and Editor in Chief Daniel Roth, after Sunak mentioned his two teenage daughters, and the world they’ll enter. Young people may not manage lots of people at age 20 or 21, he said, “but they will be managing a team of agents.” 

Sunak was referencing an essay by Goldman Sachs CIO Marco Argenti in Time. 

The agentic shift, Argenti wrote, requires “moving from being a sole performer to an orchestra conductor” — your team now includes AI agents that “must be guided and supervised with the same approach you would apply to a new, junior colleague.”

Nadella agreed, saying “we do need a new theory of the mind” to navigate what’s coming, before he offered up his new metaphor about managing infinite minds.

In other remarks at Davos, Nadella made headlines with his warning that AI’s massive energy demands risk eroding its “social permission” unless it delivers tangible benefits in health, education, and productivity. Energy costs, he added, will decide the AI race’s winners, with GDP growth tied to cheap power for processing AI tokens.

Whether “infinite minds” catches on like “bicycles” and “fingertips” remains to be seen. But it’s definitely more psychedelic. And if this shift is stranger than what came before, maybe we do need a mind-expanding metaphor to make sense of it all.


338: T5Gemma Says "AI’ll be Back”


Welcome to episode 338 of The Cloud Pod, where the forecast is always cloudy! Justin, Ryan, Matt, and Jonathan are in the studio today to bring you all the latest in cloud and AI news, including a bit of a buying spree (including whole power companies), Veo 3.1, Cowork, and more – today in the cloud!

Titles we almost went with this week

  • Snowflake’s Ironic Timing: Buying Downtime Prevention Tool While Experiencing Downtime
  • Flexera Buys ProsperOps and Chaos Genius, Promises Less Chaos and More Prosperity
  • Flexera Goes Shopping: Two FinOps Acquisitions to Prosper and Reduce Chaos
  • Token of Appreciation: Gemini CLI Now Tracks Every Penny of Your AI Spend
  • Snowflake Buys Observe to Stop Its Own Services from Melting Down
  • Google’s Veo 3.1 Goes Vertical: Finally Understanding How People Actually Hold Their Phones
  • Alphabet’s New Power Move: Buying the Company That Literally Powers Data Centers
  • Dashboard Confessional: Gemini CLI Gets Transparent About Its Usage
  • Microsoft’s New Agent Works 24/7 and Never Asks for a Raise
  • From Robot Vacuums That Climb Stairs to TVs You Can’t Feel: CES Gets Weird
  • Agent Shopping: When Your AI Has Better Taste Than You Do
  • The Cloud Pod hosts do not like any stories this week
  • AWS took a nap on announcements this week
  • Claude is my new co-worker
  • Wake up, AWS, and give us some fun news
  • The $200 Assistant: Is Cowork the End of Workplace Admins?
  • Azure has more interesting announcements than AWS oh noooo
  • If you can’t beat them in AI, just acquire everyone
  • Notebook LM turns the Data Tables on you

AI Is Going Great – Or How ML Makes Money 

01:11 Anthropic launches Cowork, a Claude Code-like for general computing – Ars Technica

  • Anthropic launches Cowork, a new feature in the macOS Claude desktop app that extends Claude Code‘s agentic capabilities to general office work tasks. 
  • Users can grant Claude access to specific folders and use plain language instructions to automate tasks like filling expense reports from receipt photos, writing reports from notes, or reorganizing files.
  • Cowork lowers the technical barrier compared to Claude Code by making AI-assisted file operations accessible to non-developer knowledge workers, including marketers and office staff. 
  • The feature was developed after Anthropic observed users already applying Claude Code to general knowledge work despite its developer-focused positioning.
  • The tool provides similar functionality to what was possible through Model Context Protocol integrations, but offers a more streamlined interface with Claude Code-style usability improvements. 
  • Users can submit new requests or modifications to ongoing tasks without waiting for the initial assignment to complete.
  • Cowork represents a strategic expansion of Anthropic’s agentic AI approach beyond software development into broader productivity workflows. The feature demonstrates how AI agents with file system access can automate routine knowledge work tasks that previously required manual processing of documents and data.

02:15 Ryan – “This week is the first time I actually tried to use AI to generate a PowerPoint presentation. It did not go well. It did generate some cool images, though.” 

07:42 Enhanced Veo 3.1 capabilities are now available in the Gemini API.

  • Google has released Veo 3.1 updates in the Gemini API and Google AI Studio, adding enhanced Ingredients to Video capabilities that maintain character identity and background consistency across generated videos. 
  • The model now supports native 9:16 vertical format generation optimized for mobile-first applications, eliminating the need to crop from landscape orientation.
  • The updated model delivers professional-grade output with new 4K resolution support and improved 1080p quality using state-of-the-art enhancement techniques. All generated videos include SynthID digital watermarking for content provenance tracking.
  • These capabilities are available today through the Gemini API for developers and Vertex AI for enterprise customers. Google AI Studio provides a demo app for testing the new features at ai.studio/apps/bundled/veo_studio.
  • The vertical video format addresses the growing demand for social media content creation, while the 4K output positions Veo 3.1 for professional video production workflows. The character consistency improvements reduce the need for manual editing and post-processing in multi-shot video projects.

08:20 Justin – “Don’t make the same mistakes that I do, and go try this and then get a $35 bill, which I did the first time I tried Veo out. So, do be cautious with this one!”  

11:08 Snowflake Announces Intent to Acquire Observe to Deliver AI-Powered Observability

  • Snowflake is acquiring Observe to integrate AI-powered observability directly into its data platform, allowing customers to analyze telemetry data like logs, metrics, and traces alongside their business data. 
  • This consolidation eliminates the need for separate observability tools and reduces data movement between systems.
  • The acquisition addresses the growing challenge of managing observability data at scale, which has become increasingly expensive and complex as organizations generate massive volumes of telemetry information. 
  • Observe’s approach stores data in a structured format that enables more efficient querying and analysis compared to traditional observability platforms.
  • By bringing observability into Snowflake’s platform, customers can correlate operational metrics with business outcomes using the same SQL-based tools they already use for analytics. 
  • This unified approach should help teams identify how application performance issues directly impact revenue, customer experience, and other business metrics.
  • The deal positions Snowflake to compete more directly with observability vendors like Datadog, Splunk, and New Relic by offering native capabilities rather than requiring third-party integrations. 
  • Organizations already using Snowflake for data warehousing can now consolidate their observability spend and simplify their tool stack.

12:08 Ryan – “I don’t know how to feel about this; I feel like Snowflake is a part of an application, but it’s not the entirety of an application. I definitely see a use for this for data warehousing and visualizing, but I don’t think it replaces your traditional observability tools because you have too many data sources that are outside of Snowflake.” 

Cloud Tools

13:58 Flexera acquires ProsperOps and Chaos Genius to expand its FinOps solution with agentic and AI-enabled cost optimization

  • Flexera acquires two FinOps companies to add autonomous AI-driven cost optimization across major cloud platforms and data analytics services: ProsperOps brings automated commitment management for AWS, Azure, and Google Cloud with over $6B in annual cloud usage under management, while Chaos Genius focuses specifically on Snowflake and Databricks optimization with reported cost reductions up to 30%.
  • The acquisitions shift Flexera’s FinOps approach from passive recommendations to active autonomous execution through agentic AI. 
  • This means the platform can automatically purchase and manage cloud commitments and optimize data workloads without requiring manual human intervention, addressing the challenge of dynamic cloud usage patterns that don’t align well with static commitment purchases.
  • ProsperOps will continue operating as a separate brand while integrating with Flexera’s existing FinOps capabilities. The company was growing at over 90% and has generated more than $3 billion in lifetime savings for customers, suggesting strong market demand for automated rate optimization solutions.
  • The Chaos Genius acquisition specifically targets the emerging problem of runaway costs in data analytics platforms like Snowflake and Databricks as AI workloads scale. 
  • This addresses a gap in traditional FinOps tools that primarily focused on compute and storage optimization but lacked specialized capabilities for modern data cloud platforms.
  • These moves position Flexera to cover the complete FinOps Framework defined by the FinOps Foundation, combining cost visibility, workload optimization, and rate optimization in a single platform. 
  • This matters for enterprises struggling to manage costs across an increasingly complex mix of traditional cloud services, AI infrastructure, and specialized data platforms.

15:35 Matt – “It definitely needs some pretty strong guardrails of what your business objective is, like don’t go over 90% savings plan or look at the secondary market for short term if you see a random burst for a few months. But it’s not a terrible idea…”      

AWS

19:12 Weirdly enough, there are no AWS stories this week. 

GCP

20:06 Instant insights: Gemini CLI’s New Pre-Configured Monitoring Dashboards | Google Cloud Blog

  • Google has added pre-configured monitoring dashboards to Gemini CLI that provide immediate visibility into usage metrics like monthly active users, token consumption, and code changes without requiring custom query writing. 
  • The dashboards integrate with Google Cloud Monitoring and use OpenTelemetry for standardized data collection, allowing teams to track CLI adoption and performance across their organization.
  • The implementation uses direct GCP exporters that bypass intermediate OTLP collector configurations, simplifying setup to three steps: setting the project ID, authenticating with proper IAM roles, and updating the settings.json file. This reduces infrastructure complexity compared to traditional OpenTelemetry deployments that require separate collector services.
  • Organizations can analyze raw OpenTelemetry logs and metrics to answer specific questions like identifying power users by token consumption, tracking budget allocation by command type, and monitoring tool reliability through status codes. The data follows GenAI OpenTelemetry conventions, ensuring compatibility with other observability backends like Prometheus, Jaeger, or Datadog if teams want to switch platforms.
  • The feature targets development teams using Gemini CLI who need to understand tool adoption patterns and justify AI tooling investments through concrete usage metrics. 
  • Engineering managers can track which developers benefit most from AI assistance and where token budgets are being allocated across different command types.

21:55 Ryan – “As long as there’s no metric for how stupid a question is, because that. That I don’t want.”   

22:40 We’re advancing U.S. energy innovation with Intersect.

  • Alphabet announced a definitive agreement to acquire Intersect, a company specializing in data center and energy infrastructure solutions. 
  • This acquisition aims to accelerate the deployment of data center capacity and energy generation infrastructure in the United States.
  • The deal addresses a critical bottleneck in AI and cloud infrastructure expansion by bringing expertise in energy development and data center deployment under Alphabet’s umbrella. Intersect’s capabilities will help Google bring more computing capacity online faster, which is essential given the substantial power requirements of AI workloads and hyperscale cloud operations.
  • This acquisition reflects the growing importance of energy infrastructure as a limiting factor for cloud providers, particularly as AI training and inference workloads drive unprecedented power demands. By acquiring energy infrastructure expertise, Google positions itself to better control the full stack from power generation through data center operations.
  • The announcement provides limited technical details about integration timelines or specific projects, but signals Google’s commitment to vertical integration in the infrastructure space. This move follows similar investments by other hyperscalers in power generation and energy partnerships to support their expanding data center footprints.

22:50 Justin – “If you can’t get the capacity from the vendor, just buy them – and then force them to do it. Good move!”   

25:00 Google’s NotebookLM introduces Data Tables feature

  • NotebookLM now includes Data Tables, a feature that automatically synthesizes information from multiple sources into structured tables that can be exported directly to Google Sheets
  • The feature is available today for Pro and Ultra users, with rollout to all users planned for the coming weeks.
  • The feature addresses a common workflow challenge where valuable information is scattered across multiple documents, requiring manual compilation. Data Tables automates this process by extracting and organizing key facts into clean, structured formats without manual data entry.
  • Use cases span professional and personal applications, including converting meeting transcripts into action item tables with owners and priorities, synthesizing research data like clinical trial outcomes across multiple papers, creating competitor analysis tables with pricing and strategy comparisons, and building study guides organized by relevant categories.
  • The feature represents Google’s continued integration of AI capabilities into productivity tools, positioning NotebookLM as a research and synthesis tool rather than just a note-taking application. 
  • This builds on NotebookLM’s existing source analysis capabilities by adding structured data output.
  • The tiered rollout strategy, with Pro and Ultra users receiving immediate access, suggests Google is testing the feature with power users before broader deployment, likely to gather usage patterns and refine the table generation algorithms.

25:52 Justin – “I love creating spreadsheets; my budgets, all of my tracking of things, tasks I’m doing, vacation planning – it all lives in spreadsheets. And you’re going to take that away from me, Google? How dare you. AI is coming for my passion for spreadsheets.”   

29:53 T5Gemma 2: The next generation of encoder-decoder models

  • Google releases T5Gemma 2, a new generation of encoder-decoder models based on Gemma 3, available now in pre-trained checkpoints at three sizes: 270M-270M (370M total), 1B-1B (1.7B total), and 4B-4B (7B total) parameters. The models use tied word embeddings and merged decoder attention to reduce parameter count while maintaining capabilities, making them suitable for on-device applications and rapid experimentation.
  • T5Gemma 2 adds multimodal vision capabilities using an efficient vision encoder for visual question answering and reasoning tasks, extends context windows to 128K tokens using Gemma 3’s alternating local and global attention mechanism, and supports over 140 languages out of the box. 
  • These represent the first multi-modal and long-context encoder-decoder models in the Gemma family.
  • The architecture merges decoder self-attention and cross-attention into a single unified layer, reducing model complexity and improving parallelization for better inference performance. 
  • This structural change, combined with tied embeddings, allows more active capabilities within the same memory footprint compared to the original T5Gemma.
  • Benchmarks show T5Gemma 2 outperforms Gemma 3 on several multimodal tasks, delivers substantial quality gains on long-context problems compared to both Gemma 3 and T5Gemma, and shows improved performance on coding, reasoning, and multilingual tasks. Post-training results indicate better performance than decoder-only counterparts, making these models suitable for both research and production applications.
  • The models are designed for developers to post-train for specific tasks before deployment, continuing the approach from the original T5Gemma of adapting pre-trained decoder-only models into an encoder-decoder architecture without the computational cost of training from scratch.
  •  Pre-trained checkpoints are available across multiple platforms for broad developer access.

31:14 Jonathan – “I’m actually looking forward to playing with the T5Gemma model because the encoder part of it is what’s going to make it really special. Transformers have always had these two halves, encoder and decoder, and most LMs only use the decoder. And what that means is that as the attention is calculated for each token in the context window, it only ever attends to previous tokens in the message. So if you have a word, that word can only ever be related to something that you’ve already said in the conversation. But people aren’t like that. People go back and forth, and they refer back to things they said… people just suck at communication most of the time. And so what the encoder model does is it looks at the entire message holistically. It doesn’t only look at the last word by the time it gets to the last word, it looks at everything and encodes the meaning of the entire text. And then from there, it passes it to the decoder, and the decoder starts generating text based on the entire knowledge of the whole thing.”
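Jonathan’s point about causal versus bidirectional attention can be made concrete with the attention masks themselves. This is a toy illustration in plain NumPy, not T5Gemma’s actual code: a decoder-only model masks out future positions, while an encoder lets every token attend to every other token.

```python
import numpy as np

n = 4  # a toy sequence of four tokens

# Decoder-only (causal) mask: token i may attend only to positions <= i,
# so a word can only relate to what came before it in the text.
causal_mask = np.tril(np.ones((n, n), dtype=bool))

# Encoder (bidirectional) mask: every token attends to the whole sequence,
# which is what lets the encoder represent the message holistically.
encoder_mask = np.ones((n, n), dtype=bool)

print(causal_mask.astype(int))
# [[1 0 0 0]
#  [1 1 0 0]
#  [1 1 1 0]
#  [1 1 1 1]]
print(int(causal_mask.sum()), int(encoder_mask.sum()))  # 10 vs 16 allowed attention pairs
```

The lower-triangular mask is exactly why a decoder’s first tokens never “see” the end of the message, while the all-ones encoder mask encodes the meaning of the entire text before any generation starts.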

33:39 New tech and tools for retailers to succeed in an agentic shopping era

  • Google launches Universal Commerce Protocol (UCP), an open standard for agentic commerce co-developed with Shopify, Etsy, Wayfair, Target, and Walmart. 
  • UCP enables AI agents to interact across the entire shopping journey from discovery to post-purchase support, working alongside existing protocols like A2A, AP2, and MCP. The protocol is endorsed by over 20 companies, including Adyen, American Express, Mastercard, Stripe, and Visa.
  • New agentic checkout feature goes live in AI Mode in Search and Gemini app, allowing shoppers to purchase from eligible U.S. retailers directly within Google’s AI surfaces. 
  • The integration uses Google Pay and PayPal for payments, with retailers maintaining seller of record status and the ability to customize the implementation. Global expansion and additional capabilities like loyalty rewards and product discovery are planned for the coming months.
  • Business Agent launches tomorrow as a branded AI assistant that appears directly in Search results for retailers like Lowe’s, Michaels, Poshmark, and Reebok. U.S. retailers can activate and customize this agent through Merchant Center, with future capabilities including training on retailer data, customer insights, product offers, and direct agentic checkout within the chat experience.
  • Google introduces Direct Offers pilot in AI Mode, allowing advertisers to present exclusive discounts and deals to shoppers during AI-powered searches. The system uses AI to determine when offers are relevant to display, initially focusing on discounts with plans to expand to bundles and free shipping. Early partners include Petco, e.l.f. Cosmetics, Samsonite, Rugs USA, and Shopify merchants.
  • Merchant Center adds dozens of new data attributes designed for conversational commerce discovery across AI Mode, Gemini, and Business Agent. These attributes extend beyond traditional keywords to include product Q&A, compatible accessories, and substitutes, rolling out first to a small group of retailers before broader expansion.

35:20 Ryan – “I think it’s important to standardize. In a web transaction where you’re doing shopping, there’s so many handoffs to different things, I can see, as more and more AI and agent-based or agent-assisted transactions happen, being able to talk a common language is super important.” 

33:38 Read Sundar Pichai’s remarks at the 2026 National Retail Federation

  • Google announced Universal Commerce Protocol (UCP), an open standard for agentic commerce built with Shopify, Etsy, Wayfair, Target, and Walmart. The protocol enables native checkout directly in Google Search AI Mode and Gemini, allowing retailers to maintain merchant of record status and own customer relationships while offering personalized pricing and loyalty enrollment at checkout.
  • Gemini Enterprise for Customer Experience is now available in preview, providing retailers with integrated shopping assistants, support bots, and agentic search capabilities. 
  • The Home Depot and McDonald’s are already using these agents for customer service, while Kroger is testing a shopping agent that brings AI Mode functionality directly into retailer apps.
  • Google processed over 90 trillion tokens through its API in December 2025, representing an 11x increase from 8.3 trillion tokens in December 2024. This growth demonstrates the rapid adoption of AI capabilities by retailers and the scale at which Google’s infrastructure is supporting commercial AI applications.
  • Wing delivery service expanded to Houston, with Orlando, Tampa, and Charlotte coming soon, after doubling deliveries in existing markets during 2025 through its Walmart partnership. 
  • The expansion addresses the high cost and logistical challenges of last-mile delivery for retailers.

38:35 Jonathan – “So is this how Google is going to make money in the future? Because obviously serving ads through AI is both controversial and a very lame customer experience. Are they going to start skimming off a percentage of sales for sales they direct to these retailers through their AI interface?”  

Azure 

39:58 Announcing public preview: Uncovering hidden threats with the Dynamic Threat Detection Agent | Microsoft Community Hub

  • Microsoft launches the Dynamic Threat Detection Agent in public preview, an AI-powered backend service that runs continuously within Defender to identify hidden threats across Defender and Sentinel environments. 
  • The agent operates autonomously with no setup required, automatically generating alerts with natural language explanations, MITRE technique mappings, and remediation steps directly into existing XDR workflows.
  • The agent achieves over 85% precision across thousands of alerts and 28 threat types by combining adaptive GenAI detection with hyperscale threat intelligence from TITAN and UEBA behavioral analytics. 
  • It runs a five-step investigation loop at machine scale, starting from high-priority incidents, building unified activity timelines, testing hypotheses through automated Q&A, and closing detection gaps with explainable alerts that include transparent reasoning traces.
  • Public preview is free for Security Copilot customers and enabled by default for eligible organizations, with general availability planned for late 2026 when it transitions to Security Copilot’s SCU-based consumption model. 
  • Starting July 2026, the agent will be included with Microsoft 365 E5 licenses that have Security Copilot entitlement, and customers can disable it or monitor usage through detailed consumption reporting at any time.
  • The service respects data residency by running region-local and integrates deeply with the Microsoft security ecosystem, using Sentinel to correlate third-party and native telemetry while surfacing Copilot-sourced detections in Defender. 
  • Built on Azure Synapse for massive scale, it can run thousands of parallel investigations and deliver near-real-time detections while continuously learning from analyst feedback to improve detection quality and reduce alert noise.
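The five-step investigation loop described above could be sketched roughly as follows. This is a hypothetical illustration of the control flow only; every function and field name here is a placeholder, not part of any actual Defender or Security Copilot API.

```python
from dataclasses import dataclass, field

@dataclass
class Alert:
    summary: str                                   # natural-language explanation
    mitre_techniques: list = field(default_factory=list)
    reasoning_trace: list = field(default_factory=list)

def investigate(incidents, build_timeline, test_hypothesis):
    """Hypothetical sketch of the loop: prioritize incidents, build a unified
    timeline, test hypotheses via automated Q&A, and emit explainable alerts."""
    alerts = []
    # 1. Start from high-priority incidents.
    for incident in sorted(incidents, key=lambda i: i["priority"], reverse=True):
        # 2. Build a unified activity timeline across telemetry sources.
        timeline = build_timeline(incident)
        # 3-4. Generate hypotheses and test each through automated Q&A.
        confirmed = [h for h in incident["hypotheses"]
                     if test_hypothesis(h, timeline)]
        # 5. Close the detection gap with an explainable alert that
        #    carries its reasoning trace and MITRE mappings.
        if confirmed:
            alerts.append(Alert(
                summary=f"Threat confirmed in {incident['id']}",
                mitre_techniques=[h["technique"] for h in confirmed],
                reasoning_trace=[f"confirmed: {h['question']}" for h in confirmed],
            ))
    return alerts
```

The point of the sketch is the shape of the loop: alerts only fire once a hypothesis survives automated questioning, which is how the agent keeps precision high while still running thousands of investigations in parallel.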

43:54 Jonathan – “You don’t want to block a potential customer who’s about to press a button to spend tens of thousands of dollars either. I guess false positives are almost as bad as false negatives.”

45:26 Generally Available: Geo-Replication for Azure Service Bus Premium

  • Azure Service Bus Premium now includes generally available Geo-Replication, allowing customers to replicate messaging infrastructure across regions for disaster recovery. 
  • This addresses a critical need for enterprises running mission-critical messaging workloads that require protection against regional outages.
  • The feature provides active replication of Service Bus entities, including queues, topics, and subscriptions, between paired regions, maintaining message ordering and metadata consistency. 
  • Organizations can now implement cross-region failover strategies without building custom replication logic or managing multiple Service Bus namespaces manually.
  • This capability is exclusive to the Premium tier of Service Bus, which starts at approximately $677 per month for the base messaging unit. Customers should factor in additional costs for cross-region data transfer and the secondary namespace when planning their disaster recovery architecture.
  • The geo-replication option complements existing Service Bus disaster recovery features like Geo-Disaster Recovery (metadata-only failover), giving customers flexibility in choosing between cost-optimized metadata replication or full data replication based on their recovery time objectives. 
  • This is particularly relevant for financial services, healthcare, and retail sectors, where message loss during regional failures is unacceptable.
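The client-side half of a cross-region failover strategy is simple in shape: try the primary namespace, fall back to the replica. A minimal sketch of that logic follows; it is generic failover plumbing, not the azure-servicebus SDK, and in real code each `send_fn` would wrap a Service Bus sender for one regional namespace.

```python
def send_with_failover(senders, message):
    """Try each (name, send_fn) pair in order; return the name of the
    namespace that accepted the message. Sketch of the failover logic only --
    production code would catch the SDK's ServiceBusError rather than
    a bare Exception, and add retry/backoff per namespace."""
    errors = []
    for name, send_fn in senders:
        try:
            send_fn(message)
            return name
        except Exception as exc:
            errors.append((name, exc))
    raise RuntimeError(f"all namespaces failed: {errors}")
```

With full geo-replication the secondary already holds the message data and entity metadata, so a client that fails over this way resumes against a warm replica instead of an empty namespace.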

46:23 Justin – “I’m surprised this wasn’t already part of premium, but I’m also sort of intrigued that they think people’s messaging strategies only involve two regions, because some of the cost architectures I’ve seen are like multiple regions with active replication across these things for geodistributed applications that need to have globally low latency for user populations everywhere – and I guess I just can’t run that on this service. So I guess, screw you? Or wait for Azure Service Bus Ultra?” 

After Show 

46:38 CES 2026: The best tech announced so far | The Verge

  • CES 2026 showcased significant infrastructure innovations, including Wi-Fi 8 routers from Asus and others, despite the standard not being finalized until 2028, plus solid-state battery breakthroughs from Donut Lab claiming 400 Wh/kg energy density that could give EVs 30 percent more range. These developments signal major shifts in networking and power infrastructure that cloud and edge computing deployments will eventually leverage.
  • Smart home and IoT devices are getting serious upgrades with Matter compatibility becoming standard across Ikea and Philips Hue products, while spatial awareness features like Hue’s SpatialAware use AR to map rooms for better lighting distribution. For cloud professionals, this represents the maturation of IoT protocols and edge AI processing that will drive increased demand for home automation backend services.
  • The display technology race is heating up with Samsung showing creaseless foldable OLED panels, Dell launching a 52-inch 6K Thunderbolt hub monitor, and LG reviving its Wallpaper TV with wireless video transmission. These advances in display tech and connectivity standards like Thunderbolt 5, delivering 120Gbps speeds, will impact how professionals design workspaces and remote work setups.
  • AI wearables are moving beyond glasses with Razer’s Project Motoko headphones featuring 4K cameras, on-device AI processing via Qualcomm chips, and 36-hour battery life that eclipses current smart glasses. This shift toward headphone-based AI assistants could influence how voice interfaces and edge AI applications are developed for consumer devices.
  • Robotics took center stage with practical home automation like Roborock’s stair-climbing Saros Rover vacuum and LG’s CLOiD dual-arm robot that can fold laundry and handle kitchen tasks. While still in development, these robots represent the convergence of computer vision, edge AI, and mechanical engineering that will require robust cloud backends for training and coordination.

Closing

And that is the week in the cloud! Visit our website, the home of the Cloud Pod, where you can join our newsletter, Slack team, send feedback, or ask questions at theCloudPod.net or tweet at us with the hashtag #theCloudPod





Download audio: https://episodes.castos.com/5e2d2c4b117f29-10227663/2333121/c1e-qx4xb7om6kfkq90r-2504d6g6ijjx-t2kywf.mp3

Daily Reading List – January 21, 2026 (#704)


It’s been a fourteen meeting day (with one more this evening) so my battery is drained. On the plus side, lots of great things going on around here.

[article] The Palantirization of everything. Many companies are enamored with high-touch, forward-deployed engineers. But is that a playbook others can copy?

[blog] Architecture for Disposable Systems. I like the thought exercise behind this idea. What if that app doesn’t need careful engineering?

[blog] Code Is Cheap Now. Software Isn’t. No barrier to entry, and virtually no cost to produce code. But software is still expensive, and doing it with taste and timing will remain a differentiator.

[article] How Google’s ‘internal RL’ could unlock long-horizon AI agents. This space is so far from “done.” Don’t assume that any shortcoming of the current approach is going to stay that way!

[blog] A Software Library with No Code. I screwed around with this idea a couple of years ago and Drew does a more sophisticated take with today’s more powerful tools.

[blog] Welcome to MCP-P-A-looza. You can use MCP from basically any language. Heck, even Haskell. William gathers a lot of the work in one place.

[article] Why Everyone Should Still Use an RSS Reader in 2026. Still my most relied upon learning tool. Without Feedly, I’d be stuck.

[blog] Agent Psychosis: Are We Going Insane? Armin wonders if we’re losing the plot, getting addicted to prompts, or need better tools as we figure out the new norms of software engineering.

[article] AI coding requires developers to become better managers. Good take from Matt on specs and planning. It’s time to grow those skills around slowing down, exploring problem spaces, and capturing the right intent.

[blog] Sawasdee Thailand! Google Cloud launches new region in Bangkok. Hmm, I may need to find an excuse to go visit this year.

[blog] A Brief History of Ralph. A few months ago, “Ralph Wiggum” was just a sweet idiot kid from The Simpsons. Now? It’s a hot AI engineering approach.

[blog] AI Agent Engineering in Go with the Google ADK. My product area is actively working to make Go the best language for devs building AI apps. See here how to build out some AI agents in Go.

[article] ServiceNow positions itself as the control layer for enterprise AI execution. None of the big enterprise SaaS vendors wants to be reduced to an API used by an agent. Expect more pushes like this one.

[blog] Software engineers can no longer neglect their soft skills. If you want to be great at software in 2026, focus your skills training on communication.

Want to get this update sent to you every day? Subscribe to my RSS feed or subscribe via email below:




dotnet-1.0.0-preview.260121.1


What's Changed

  • .NET: [BREAKING] Change GetNewThread and DeserializeThread to async by @westey-m in #3152
  • .NET: Improve resolving AITool from DI by @DeagleGross in #3175
  • .NET: Properly point agentCard to agent endpoint by @DeagleGross in #3176
  • .NET: Implement IReadOnlyList on InMemoryChatMessageStore by @westey-m in #3205
  • .NET: Make ChatMessageStore and AIContextProvider context props settable by @westey-m in #3196
  • .NET: [Breaking] RenameAgentRunResponse and AgentRunResponseUpdate classes by @SergeyMenshykh in #3197
  • .NET: [Breaking] Rename AgentRunResponseEvent and AgentRunUpdateEvent classes by @SergeyMenshykh in #3214
  • .NET: Merge AgentRunOptions.AdditionalProperties into ChatOptions.AdditionalProperties by @westey-m in #3184
  • .NET: Update Google.GenAI to 0.11.0 and remove polyfill implementations by @Copilot in #3232
  • .NET: [BREAKING] Renamed CreateAIAgent/GetAIAgent to AsAIAgent by @dmytrostruk in #3222
  • .NET Purview Middleware: Improve Background Job Runner Injection by @eoindoherty1 in #3256
  • .NET: Delete sync extension methods for agent by @westey-m in #3291
  • .NET: Update Microsoft.Extensions.AI.* packages to 10.2.0 by @Copilot in #3211
  • .NET: Pass AdditionalProperties from parent to child when exposing an agent as a FunctionTool by @westey-m in #3219
  • .NET: Durable Agent samples and automated validation for non-Azure Functions by @cgillum in #3042
  • .NET: Add sample to show multiple AIContextProvider usage by @westey-m in #3284
  • .Net: Fix DebuggerDisplay attribute to reference existing property by @Copilot in #3326
  • .NET: Update Conversation Sample to use Conversation Id instead by @rogerbarreto in #3180
  • .NET: Improve readme for agents V2 by @rogerbarreto in #3285
  • .NET: Fix DebuggerDisplay attribute in AIAgent.cs to reference existing properties by @Copilot in #2985

Full Changelog: dotnet-1.0.0-preview.260108.1...dotnet-1.0.0-preview.260121.1
