
Mary Jo Foley: What’s a consumer-focused outsider doing at the helm of Microsoft’s AI push?

1 Share
Jacob Andreou speaks onstage during TechCrunch Disrupt 2023. (Photo by Kimberly White/Getty Images for TechCrunch, CC By 2.0)

It’s not surprising that Microsoft is looking to turn its Copilot platform into a “Super App,” given that its rivals are doing the same. But Microsoft is going about the task in a way that doesn’t follow its usual playbook, by putting a big bet on a consumer-savvy hire from the outside with some feather-ruffling ways.

The company’s newly minted Copilot Executive Vice President Jacob Andreou came to Microsoft from Greylock Partners and before that, Snapchat-maker Snap. Andreou currently oversees more than 11,000 Microsoft employees, according to a recent profile in Fortune.

Microsoft is bringing onboard another former Snap (and Discord) vice president, Peter Sellis, to help, GeekWire has learned. Sources say Sellis will be leading Copilot Design, Growth and Engineering, reporting to Andreou.

Andreou is part of a recently formed Copilot Leadership Team. His charter is to lead the “Copilot experience” by driving design, product, growth and engineering, as outlined in a March 2026 reorg memo from CEO Satya Nadella. He is one of a small group charged with shaping the future of Copilot, alongside others focused on the underlying Copilot platform and AI models.

Given Andreou’s Snap background, his plan to meld Microsoft’s consumer and enterprise Copilot experiences makes sense. It won’t be a snap, however. (See what I did there?)

Even though both share the Copilot brand, consumer Copilot and Microsoft 365 Copilot don’t work the same way or use the same data sources or architecture. To boot, Microsoft hasn’t had a lot of luck with this kind of consumer-enterprise unification, as evidenced by the low interest in and uptake of its free, consumer-focused Teams product compared to its business-focused Teams collaboration offering.

The 33-year-old, Los Angeles-based Andreou seemingly is undaunted by the challenge and is pushing some employees to clock 12-hour days to keep up with younger, AI-focused companies, Fortune reports.

Microsoft was infamous for requiring employees to work long hours and weekends during crunch times leading up to delivering Windows NT and Windows 95, but not so much in recent years. Microsoft is known as a place where outsiders often struggle to thrive compared to those who climb the corporate ladder for years, making Andreou’s approach feel even riskier.

Andreou has been a big backer of the Tasks productivity layer in consumer Copilot, which is still in public preview. Tasks, which enables Copilot to handle actionable items, is similar to the recently released Copilot Cowork layer that is part of Microsoft 365 Copilot. (I asked Microsoft if the two would merge as a single Cowork-type offering at some point but was told the company had no comment.)

However, the holy grail remains the “Super App.” With the Copilot Super App, Microsoft is looking to give consumers and business users a reason to stay within Copilot regardless of the AI task with which they – or their agents – are engaging.

“Come summer, we will be bringing coding to all knowledge work within one Copilot Super App. That’s really exciting. So you’re going to have Chat, Cowork, and Code all in Copilot,” Nadella told Microsoft Build conference attendees in early June.

Microsoft isn’t the only AI-focused company working on extending its AI coding capability beyond just developers. Nor is it the only one betting on the Super App concept.

  • OpenAI is working to turn ChatGPT into a Super App that brings together ChatGPT and Codex into a single environment that operates like a personal assistant.
  • Anthropic is extending Claude to become a Super App (though it hasn’t used that terminology), as well, by creating a single environment that combines productivity, development and automation tools.

The Copilot Super App isn’t Andreou’s only focus. He tells Fortune that AI model choice and home-grown AI model excellence also are among his key priorities.

Microsoft is expanding model choice in the Copilot Cowork feature beyond Anthropic to include OpenAI and soon, Microsoft’s own Cowork 1 model – which may be based on Microsoft’s hosted version of the open-source DeepSeek model. Cowork 1 will be the newest addition to Microsoft’s growing pool of Microsoft-developed models, seven of which debuted at Build this year. Microsoft is seeking to position itself as the champion of lower cost, efficient models built for those who are token-maxxed out.

Andreou definitely has his work cut out for him as a consumer guy in a heavily enterprise-centric company.

Microsoft 365 Copilot and consumer Copilot are just two of more than two dozen different “Copilot”-branded commercial offerings available across the various Microsoft product teams, which can feel overwhelming.

Microsoft also needs to give users a clearer way to find and use the quickly expanding stable of first- and third-party agents, like the OpenClaw-based Microsoft Scout personal assistant. Will Andreou and his Super App quest bring at least some order to the Copilot and agent madness? We’ll know more sometime this summer.

Read the whole story
alvinashcraft
52 minutes ago
reply
Pennsylvania, USA
Share this story
Delete

The Gemini app is bringing personalized image creation to more users.

Personal Intelligence makes the Gemini app feel tailored to you. With your permission, it pulls from Google tools like Gmail, Google Photos, YouTube and Search to provid…

Why intent prediction needs more than an LLM

Ryan sits down with Frank Portman, CTO at Yobi, to talk about why next-token prediction, though great for language, isn’t the right inductive bias for forecasting human behavior. They discuss how Yobi builds a “foundation model of behavior” using transformers and graph neural networks instead of chat-style LLMs, and what it takes to run millions of personalization decisions per second while keeping consumer data private.

Memora: A Harmonic Memory Representation Balancing Abstraction and Specificity


At a glance

  • Today’s AI agents don’t remember past interactions. They must repeatedly be fed relevant information or retrieve it from external sources, which becomes less efficient as they handle longer and more complex tasks. To scale agent capabilities, we need a more efficient way to retain and access information over time.
  • Memora is a scalable memory system that dramatically increases agent productivity on long-horizon tasks by decoupling what is stored (rich memory content) from how it’s retrieved (lightweight abstractions and cue anchors), balancing abstraction and specificity.
  • Memora sets new state-of-the-art on LoCoMo and LongMemEval, outperforming Mem0, RAG, and full-context inference while using up to 98% fewer context tokens.
  • The Memora paper is published at ICML 2026. Memora code is available at https://github.com/microsoft/Memora.

Imagine a workplace AI assistant helping you run a multi-month project. Over weeks of conversations, you share constraints, agree on milestones, revise deadlines, and surface dozens of stakeholder preferences. When you later ask it to draft an update for a colleague, it should recall not just the latest decision but the journey that got you there: what was tried, what was ruled out, who weighed in. Today’s AI agents struggle with this. Modern large language models (LLMs) are powerful reasoners, but they are effectively stateless: every session starts from zero, every long conversation forces the model to re-read its entire history, and every new piece of information is either stored as raw text (fragmented and noisy) or compressed into a vague summary (precise details lost). As AI assistants and autonomous agents move into long-horizon deployments, such as copilots that track a project for many months or research agents that build up domain expertise through long-horizon use, the absence of a principled memory system has become the critical bottleneck.

A growing line of work has begun to fill this gap. Systems like Mem0 extract atomic facts from conversations; retrieval-augmented (RAG) approaches index raw text fragments for later recall; and graph-based memory systems such as Zep and GraphRAG impose structure through entity relations. Each represents real progress, yet each runs into the same wall: existing designs force an unavoidable tradeoff between specificity (preserving fine-grained detail) and abstraction (organizing memory efficiently as it grows). Memora is built to give agents both.

What is Memora

Memora is an agentic memory framework designed for long-horizon AI agents. Memora’s central insight is to decouple what is stored from how it is retrieved. Memory content can remain rich and expressive, such as a project timeline or a multi-turn discussion about constraints, while a separate, lightweight structural layer handles indexing and retrieval. The result is a memory system that scales: it consolidates related information into stable units, surfaces fine-grained details when they matter, and lets the agent navigate its own history without re-reading everything. On standard long-conversation benchmarks, Memora sets new state-of-the-art performance while using up to 98% fewer tokens than would be consumed by dumping the full history into context.

Why this is hard: the abstraction–specificity tension

Existing memory systems fall into two extremes. Content-fragmentation systems, such as RAG and Mem0, embed extracted facts or text fragments directly. This preserves detail but produces brittle, isolated entries that lose narrative coherence. Coarse-abstraction systems compress experience into compact summaries. They are efficient, but summarization strips away the constraints, edge cases, and numeric details that make memory useful in the first place. Graph-based systems add structure on top of content, yet still rely on the content itself for retrieval and typically require rigid ontologies that don’t generalize across domains. None of these resolves the underlying tension between abstraction (which keeps memory efficient) and specificity (which gives memory utility).

Overview of the Memora architecture showing how multimodal data is segmented, converted into structured memory entries and an implicit memory graph, then retrieved through a policy-driven process optimized with group-relative learning to return relevant episodic memories.
Figure 1: Architecture overview of Memora.

How Memora works

Memora resolves this tension through a harmonic organization. Each memory entry has two components: a primary abstraction, which is a short phrase (6–8 words) that captures what the memory is fundamentally about, and a memory value holding the rich content itself. Crucially, only the primary abstraction is embedded for similarity search; the value is never directly retrieved through its own content. This separation means new information about an evolving topic merges into the existing memory entry under the same primary abstraction, rather than fragmenting into a chain of partial duplicates. Complementing primary abstractions, cue anchors are short, context-aware tags extracted from each memory’s value, providing alternative access paths to the same memory. They function as flexible, organically generated metadata.

To make this concrete: suppose a user says, “Dave and Sarah agreed to push the prototype to April 1, the pilot to May 2, and the MVP to May 30.” A knowledge-graph system would need predefined entity types and relation schemas: Person → agreed_on → Milestone → has_date → Date, and any new relation type would require schema extension. In Memora, the primary abstraction “Updated Project Orion timeline agreed by Dave and Sarah” serves as the canonical access point, while cue anchors like “Dave Project Orion update,” “Project Orion prototype schedule,” and “Project Orion pilot timeline” provide alternative retrieval paths, all without committing to an ontology. A later query about Dave’s recent contributions, or the prototype schedule, or pilot timing can all route to the same underlying memory through different cues, with the full detail preserved in the memory value.
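The split between what is stored and how it is found can be sketched in a few lines of Python. This is a minimal illustration, not the Memora implementation: the `MemoryEntry` and `MemoryStore` names, the merge threshold, and the word-overlap `similarity` function (a crude stand-in for embedding similarity) are all assumptions made for the example.

```python
from dataclasses import dataclass, field


def similarity(a: str, b: str) -> float:
    """Word-overlap (Jaccard) score; a stand-in for embedding similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


@dataclass
class MemoryEntry:
    abstraction: str                               # short phrase used for lookup
    value: list = field(default_factory=list)      # rich content, never searched
    cues: set = field(default_factory=set)         # alternative access paths


class MemoryStore:
    def __init__(self, merge_threshold: float = 0.6):  # hypothetical threshold
        self.entries = []
        self.merge_threshold = merge_threshold

    def write(self, abstraction: str, content: str, cues) -> MemoryEntry:
        # New information about the same topic merges into the existing entry.
        for entry in self.entries:
            if similarity(entry.abstraction, abstraction) >= self.merge_threshold:
                entry.value.append(content)
                entry.cues.update(cues)
                return entry
        entry = MemoryEntry(abstraction, [content], set(cues))
        self.entries.append(entry)
        return entry

    def lookup(self, query: str):
        # Score against the abstraction and the cue anchors only.
        def score(entry: MemoryEntry) -> float:
            keys = [entry.abstraction, *entry.cues]
            return max(similarity(query, key) for key in keys)

        return max(self.entries, key=score, default=None)


store = MemoryStore()
orion = store.write(
    "Updated Project Orion timeline agreed by Dave and Sarah",
    "Prototype moved to April 1, pilot to May 2, MVP to May 30.",
    ["Dave Project Orion update", "Project Orion prototype schedule",
     "Project Orion pilot timeline"],
)
store.write("Office relocation budget estimate", "Placeholder content.",
            ["relocation budget"])
```

In this sketch, a query such as "pilot timeline" reaches the Orion entry through a cue anchor even though it shares few words with the primary abstraction, and the dates stay intact in the value.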

On top of this representation, Memora introduces a policy-guided retriever that treats memory access as an active reasoning process. Rather than returning the top-k semantically similar items in a single shot, the policy retriever iteratively refines its query, expands through cue anchors to surface related-but-not-similar memories, and decides when to stop. This lets the agent navigate to relevant non-local context that pure semantic search would miss, chasing multi-hop dependencies the way a human would when recalling connected events. The retrieval policy can be either hand-prompted with a strong LLM or distilled into a much smaller model via reinforcement learning.
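As a rough sketch of that iterative behavior, the loop below seeds retrieval with the closest primary abstraction and then expands to memories that share a cue anchor with anything already selected. It is an assumption-laden toy: in Memora the refine, expand, and stop decisions come from an LLM or a learned policy, whereas here a fixed hop limit and budget stand in for that policy, and `similarity`, `retrieve`, and the sample memories are hypothetical.

```python
def similarity(a: str, b: str) -> float:
    """Word-overlap (Jaccard) score; a stand-in for embedding similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def retrieve(memories, query, max_hops=2, budget=3):
    """Seed with the best abstraction match, then expand through shared cues."""
    seed = max(memories, key=lambda m: similarity(query, m["abstraction"]))
    selected = [seed]
    for _ in range(max_hops):
        known_cues = set().union(*(m["cues"] for m in selected))
        related = [m for m in memories
                   if m not in selected and m["cues"] & known_cues]
        # Fixed stopping rule standing in for the policy's "decide when to stop".
        if not related or len(selected) >= budget:
            break
        selected.extend(related[: budget - len(selected)])
    return selected


memories = [
    {"abstraction": "Updated Project Orion timeline agreed by Dave and Sarah",
     "cues": {"Project Orion pilot timeline", "Dave Project Orion update"}},
    {"abstraction": "Pilot customer list for spring launch",
     "cues": {"Project Orion pilot timeline", "pilot customer shortlist"}},
    {"abstraction": "Office relocation budget estimate",
     "cues": {"relocation budget"}},
]

hits = retrieve(memories, "timeline agreed by Dave and Sarah")
```

Here the second memory shares no words with the query, so a single top-k similarity pass would skip it, but it is reached through the cue anchor it shares with the seed.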


Results

Bar chart comparing LoCoMo overall scores across memory systems using LLM-judge, F1, and BLEU metrics. Memora (P) achieves the highest LLM-judge score (0.863), followed by Memora (S) (0.849) and Full Context (0.825). Memora variants outperform other memory-based approaches across all three metrics.
Figure 2: Memora performance on LoCoMo dataset.

We evaluate Memora on two long-context benchmarks: LoCoMo, where dialogues average 600 turns, and LongMemEval, with 115,000-token contexts. Memora achieves new state-of-the-art performance on both: 86.3% LLM-judge accuracy on LoCoMo and 87.4% on LongMemEval, outperforming RAG, Mem0, Nemori, Zep, LangMem, and even full-context inference. The gap is largest on multi-hop reasoning, where Memora’s ability to traverse cue anchors pays the biggest dividends. The efficiency story is just as striking: Memora stores roughly half the memory entries per conversation that Mem0 does (344 vs. 651) and reduces token consumption by up to 98% relative to full-context inference. Less to read, less to store, better answers.

Looking forward

Memora’s design has implications beyond benchmark performance. We see this work as a step toward AI agents that can sustain long-term collaboration with users and accumulate organizational knowledge over months and years, not just within a single session. Building on this foundation, we are pursuing several complementary directions. MemLoop explores how memory systems can learn from retrieval and task failures, attribute errors to specific stages of the memory pipeline, and improve themselves over time. Deferred Memory investigates when memory construction should be postponed until sufficient context, evidence, or future utility becomes available, rather than committing prematurely to what should be stored. Group Memory examines how knowledge can be shared across teams and agents while preserving provenance, access boundaries, ownership, and sensitive context. We release our code alongside the paper and invite the community to build on this representation and explore what becomes possible when AI agents are no longer stateless.

Acknowledgements

We would like to thank Shantanu Dixit (Research Fellow), Paramaguru Harimurugan (Research Fellow), Rujia Wang, Victor Rühle, and Robert Sim for contributing to this project.


The post Memora: A Harmonic Memory Representation Balancing Abstraction and Specificity appeared first on Microsoft Research.


Highlights from Git 2.55


The open source Git project just released Git 2.55 with features and bug fixes from over 100 contributors, 33 of them new. We last caught up with you on the latest in Git back when 2.54 was released.

To celebrate this most recent release, here is GitHub’s look at some of the most interesting features and changes introduced since last time.

Repacking with incremental multi-pack indexes

Returning readers of this series may recall our coverage of incremental multi-pack indexes and incremental multi-pack reachability bitmaps. In case you could use a refresher, here’s the short version.

Git stores the contents of your repository as individual objects: commits, trees, and blobs. Those objects usually live in packfiles, which are compressed collections of objects. A packfile has a corresponding pack index that lets Git locate any object inside the pack quickly. But large repositories do not usually have just one packfile: over time, fetches, pushes, maintenance tasks, and repacks can leave many packs behind.

A multi-pack index (or MIDX) gives Git a single index over many packs. Instead of opening and searching each pack’s individual index, Git can ask the MIDX which pack contains a given object and at which offset. This is especially useful for large repositories, and it is one of the building blocks behind GitHub’s repository maintenance strategy.

As we covered when Git 2.47 introduced the incremental MIDX format, a repository can store its MIDX as a chain of layers instead of as a single MIDX covering every pack. A single-file MIDX is simple and efficient to read, but it has an important maintenance cost; since that file includes every pack it covers, even a small update can require a large write in an already-large repository.

Incremental MIDXs address that by storing a chain of MIDX layers. Each layer covers some collection of packs, and the chain file records the order of those layers. Appending a new layer to the tip of the chain does not invalidate the older layers, so Git can index newly created packs without rewriting a single MIDX that covers the entire repository.

Git 2.55 teaches git repack how to write those incremental MIDX chains directly:

$ git repack --write-midx=incremental 

Without any other options, that mode is append-only: Git writes a new layer for the packs created by the repack and leaves the existing layers alone. That is already useful when you want to minimize how much metadata gets rewritten during a maintenance run.

But an append-only chain cannot grow forever. If each maintenance run adds a new layer, then eventually the chain itself becomes the thing you need to maintain. Git 2.55 also supports combining --write-midx=incremental with geometric repacking:

$ git repack --write-midx=incremental --geometric=2 -d

When those modes are used together, each repack creates a new tip layer, then decides whether adjacent layers should be compacted together. The default rule is controlled by repack.midxSplitFactor: if the accumulated object count in newer layers grows large enough relative to the next older layer, Git merges those layers into a single replacement layer. Otherwise, the older layers are left untouched.

At a high level, the algorithm works like this. Below, N refers to the repack.midxNewLayerThreshold value, and f refers to the repack.midxSplitFactor value:

  1. Pick the un-MIDX’d packs as geometric repacking candidates. If the tip MIDX layer has at least N packs, include those as candidates too.
  2. Apply the usual geometric repacking rule to that candidate set, and write a new tip MIDX layer covering the resulting packs.
  3. Compact adjacent MIDX layers while the accumulated object count of the newer layer(s) exceeds 1/f of the next deeper layer’s object count.
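Steps 1 and 3 can be modeled with a short Python sketch. It only mirrors the rules as stated in the list above and is not Git's implementation: the function names are hypothetical, layers are reduced to bare object counts, and the values chosen for repack.midxNewLayerThreshold and repack.midxSplitFactor are illustrative rather than Git's defaults.

```python
def pick_candidates(unindexed_packs, tip_layer_packs, new_layer_threshold):
    """Step 1: un-MIDX'd packs are always candidates; packs in the tip layer
    join only once that layer holds at least `new_layer_threshold` packs."""
    candidates = list(unindexed_packs)
    if len(tip_layer_packs) >= new_layer_threshold:
        candidates += list(tip_layer_packs)
    return candidates


def compact_layers(layer_objects, split_factor):
    """Step 3: `layer_objects` lists object counts per MIDX layer, oldest
    first and tip last. Merge the newest layers into their deeper neighbor
    while the accumulated count exceeds 1/split_factor of that neighbor."""
    layers = list(layer_objects)
    while len(layers) > 1:
        newer, deeper = layers[-1], layers[-2]
        if newer * split_factor <= deeper:
            break  # newer <= deeper / f: leave the deeper layers untouched
        layers[-2:] = [newer + deeper]  # metadata-only merge in this model
    return layers


# Illustrative numbers only.
chain = compact_layers([1_000_000, 100_000, 40_000, 30_000], split_factor=2)
small_tip = pick_candidates(["new-1", "new-2"], ["tip-1"], new_layer_threshold=2)
large_tip = pick_candidates(["new-1"], ["tip-1", "tip-2"], new_layer_threshold=2)
```

With these numbers the two newest layers fold into the 100,000-object layer, and the loop stops at the 1,000,000-object layer because 170,000 is no more than half of it.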

To see how the pieces fit together, let’s start with a repository that already has an incremental MIDX chain. The older layers are on the left, and the tip layer is on the right. Meanwhile, normal repository activity keeps producing new packs. Those packs are not covered by any MIDX layer yet, which means the next maintenance run has two jobs: decide what to repack, and decide how much of the MIDX chain to rewrite.

Diagram showing a chain of MIDX layers with newly written packs.

Ordinarily, those un-MIDX’d packs are the only geometric repacking candidates: Git can write a new pack and a new tip MIDX layer without disturbing any existing layer. The figure below shows the more interesting case, where the current tip layer has accumulated enough packs to meet the configured repack.midxNewLayerThreshold. Once that threshold is met, packs from the tip layer can join the newly written packs as geometric repacking candidates.

Geometric repacking then asks a local question about the newest candidate packs: is the pack immediately to the left of some suffix of packs (𝒫) large enough to preserve the geometric progression if Git rolls up 𝒫? In the first attempt below, 𝒫 contains the smallest pack from the current tip layer along with the new un-MIDX’d packs. But the pack to the left of the split has only 30,000 objects, which is smaller than twice the size of 𝒫, so this split is too far to the right.

Diagram indicating the first geometric split is too small.

So Git moves the split one pack earlier and asks the same question again. Now 𝒫 includes one more pack from the tip layer. The pack immediately to the left has 100,000 objects, which is at least twice the size of the selected suffix. That is the point where the geometric invariant holds, so Git can roll up exactly those packs into a new pack.

Diagram showing how moving the split left finds a geometric roll-up.
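The split search in this walkthrough can be written as a small loop. The sketch below follows the prose rather than Git's source: `find_rollup_start` is a hypothetical name, packs are reduced to object counts ordered left to right, and every size other than the 30,000 and 100,000 figures from the text is invented so the arithmetic works out.

```python
def find_rollup_start(pack_sizes, start, factor=2):
    """Return the index of the first pack to roll up.

    `pack_sizes` lists object counts left to right (largest, oldest packs
    first). `start` marks the initial suffix. The split moves left until the
    pack just left of it holds at least `factor` times the suffix total.
    """
    while start > 0:
        suffix_total = sum(pack_sizes[start:])
        if pack_sizes[start - 1] >= factor * suffix_total:
            break  # geometric progression is preserved at this split
        start -= 1  # split is too far right; take in one more pack
    return start


# 100,000 and 30,000 come from the walkthrough; the rest are made up.
packs = [100_000, 30_000, 12_000, 3_000, 2_000, 1_000]
split = find_rollup_start(packs, start=2)
rolled_up_objects = sum(packs[split:])
```

In the first pass the suffix holds 18,000 objects, and 30,000 is less than twice that, so the split moves left. In the second pass the suffix holds 48,000 objects, and 100,000 is at least 96,000, so the loop stops with the 100,000-object pack left alone.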

After writing that new pack, Git writes a new tip MIDX layer over the surviving pack from the previous tip layer and the newly written roll-up pack. At this point, the packfiles themselves are in good shape, but the MIDX chain may still have accumulated too many small adjacent layers. Git applies the same “newer compared to older” instinct to the MIDX layers themselves: if the newer layer is large enough relative to its neighbor, compact their metadata into a replacement layer.

Diagram showing the new tip layer cannot compact with its neighbor.

That compaction step is deliberately metadata-only. Git does not repack the objects from those layers again; it writes a new MIDX layer that covers the same packfiles. Then it considers the next older layer. Here, the compacted layer is still smaller than half of the deeper layer, so Git stops. The older layer remains untouched, which is the key property that keeps this maintenance incremental.

Diagram showing compaction stops before rewriting the deeper layer.

The result is a compromise between two extremes. A single-file MIDX minimizes lookup complexity, but can require large rewrites during maintenance. A purely append-only incremental MIDX minimizes each write but allows the chain to grow without bound. Geometric incremental repacking keeps the number of layers logarithmic in the total number of objects, while ensuring that the newest, smallest layers are rewritten more often than older, larger ones.

This also integrates with Git’s existing repack machinery. Newly written packs that are not yet covered by the MIDX chain are always candidates for the geometric repack; packs in deeper MIDX layers are left alone. Packs in the tip MIDX layer join the candidate set only after the tip layer has at least repack.midxNewLayerThreshold packs. If the tip layer is still smaller than that threshold, Git skips disturbing it entirely and simply appends a new layer for the newly written packs.

For repositories that receive a steady stream of new objects, this means routine maintenance can update the repository’s pack metadata incrementally, without forcing each maintenance run to rewrite a single MIDX covering the entire object store.

[source]

Fixing up earlier commits with git history

Anyone who has polished a commit series before sending it for review has probably had this experience: you notice that a change in your working tree really belongs in an earlier commit, not at the tip of the branch.

Today, one common way to handle that is to create a fixup commit and then autosquash it:

$ git commit --fixup=<commit> 
$ git rebase --autosquash <commit>^

That works, but it asks you to spell out the mechanism instead of the intent. Git 2.55 builds on the experimental git history command, which Git 2.54 introduced, by adding a new fixup subcommand. It applies the changes currently staged in the index to an earlier commit:

$ git history fixup <commit>

Here is a small example. The first commit introduced a pancake recipe, followed by a few more commits on top. Later, we realize that the recipe was missing maple syrup. After staging that one-line change, git history fixup <commit> folds it into the original recipe commit and replays the descendant commits on top.


Here the staged change becomes part of the target commit itself. The target commit keeps its message and authorship by default, unless you pass --reedit-message, and Git rewrites the commits that follow so the branch ends at an equivalent history with the fix in the right place.

Like the rest of git history, this command is still experimental. It is also intentionally conservative. Because fixup reads from the index, it needs a working tree and cannot operate in a bare repository; if applying the staged change would produce a conflict, the command aborts instead of leaving you in the middle of a stateful rewrite.

[source]

The tip of the iceberg…

Now that we’ve covered the largest changes in more detail, let’s take a look at a selection of some other new features and updates in this release.

  • Returning readers of this series may remember our coverage of config-based hooks from Git 2.54, which let you define hooks in your Git configuration rather than only as executable files in $GIT_DIR/hooks. Hooks are the scripts Git runs at well-known points in your workflow, like before creating a commit or after receiving a push. Moving them into configuration makes those hooks easier to share, compose, and selectively disable without copying scripts into each repository’s hooks directory.

    Git 2.55 extends that work by allowing compatible configured hooks to run in parallel. For example, a project might have independent pre-commit hooks for linting and unit tests; if both declare hook.<name>.parallel = true, Git can run them at the same time. The number of concurrent jobs can be controlled globally with hook.jobs, per event with hook.<event>.jobs, or on the command line with git hook run -j. Hooks that need shared state, like commit-message hooks or other hooks that inspect the index or working tree, continue to run serially.

    [source]

  • If you have ever run git status only to be greeted by a long pause at your terminal, you may have used Git’s built-in filesystem monitor to speed things back up. When core.fsmonitor is enabled, commands like git status can ask a long-running daemon which paths have changed instead of scanning the entire working tree.

    Until now, that built-in daemon was available only on macOS and Windows. Git 2.55 adds support for Linux, where the implementation uses inotify. That works without elevated privileges, but requires one watch per directory, so very large repositories may need to raise the fs.inotify.max_user_watches limit. As on other platforms, the daemon is conservative around network-mounted repositories, which remain opt-in.

    [source]

  • Reachability bitmaps are one of the tricks Git uses to answer questions like “which objects are reachable from this commit?” without walking the entire object graph from scratch. They make object traversals faster, but Git still has to build and update those bitmaps during maintenance tasks like git repack --write-midx-bitmaps.

    Git 2.55 makes that generation path faster by avoiding unnecessary tree recursion, reusing already-computed selected bitmaps, caching object positions, and sorting bitmaps before XORing them together. In benchmarks from the patch series, those general improvements reduced bitmap generation time in one large repository from about 612 seconds to about 294 seconds.

    The same series also improves pseudo-merge bitmaps, which group related references together so Git can combine precomputed bit arrays during a traversal instead of rediscovering the same objects repeatedly. In one benchmark, pseudo-merges made a full git rev-list --objects --use-bitmap-index traversal nearly 20 times faster, but previously nearly doubled bitmap generation time. After these changes, pseudo-merges keep most of their traversal speedup while adding much less work to the bitmap generation path.

    [source, source]

  • If you use partial clones, filtered packs, or other workflows where Git intentionally omits some objects, pack size still matters. The git pack-objects --path-walk mode, introduced in Git 2.51, groups objects by path before performing a second compression pass, which can produce better deltas when path locality matters.

    In Git 2.55, --path-walk can be combined with filters including blob:none, blob:limit=<n>, tree:0, object:type=<type>, sparse:<oid>, and compatible combine: filters. That makes packing using --path-walk available in more partial-clone and filtered-pack workflows. In one benchmark on Git’s own repository, a blob-less path-walk repack produced a pack roughly 16% smaller, at the cost of a slower fresh-delta computation.

    [source]

  • Git learned a new experimental command, git format-rev, for pretty-formatting revisions from standard input. Unlike git log, which walks a range of history, git format-rev is designed for cases where you encounter commits one at a time or embedded in other text.

    For example, suppose you’re using git last-modified to print the commit that last modified each path in some directory. What if you wanted to know who last modified each path, not just which commit did it? You could replace those commits with author names by piping its output through something like this:

    $ git last-modified | perl -F'\t' -lane ' 
      chomp($F[0] = qx(git show -s --format=%an $F[0])); 
      print join "\t", @F 
    ' 
    Junio C Hamano	builtin/commit.c 
    [...]

    That works, but it has to start a new Git process for each row just to format the commit. In Git 2.55, git format-rev can handle that part as a normal pipeline:

    $ git last-modified | 
      git format-rev --stdin-mode=text --format=%an 
    Junio C Hamano	builtin/commit.c 
    [...]

    The command’s text mode can also rewrite full commit object names found in freeform text, which makes it useful for commit-message hooks or other scripting workflows.

    [source]

  • When you push your repository somewhere, you may have noticed output prefixed with remote: like the last line below:

    $ git push origin main 
    Enumerating objects: 5, done. 
    [...] 
    remote: Resolving deltas: 100% (2/2), completed with 1 local object.

    When a client fetches or pushes, Git can multiplex different streams over the same connection: one sideband carries packfile data (the actual objects being transferred), another carries progress messages that the client usually prints to stderr, and a third carries errors from the remote.

    Those progress messages are useful, but they also come from the other side of the connection and may be printed directly to your terminal. Before Git 2.55, that meant a server could include arbitrary terminal control sequences in sideband output, including sequences that move the cursor or erase text. Git now masks most of those control characters by default while still allowing ANSI color sequences, so colored progress output continues to work.

    [source]
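The masking described above can be sketched roughly as follows. This is a minimal illustration, not Git's implementation: the function name `mask_sideband`, the choice to keep newline, tab, and carriage return, and the caret notation used for masked characters are all assumptions made for the example.

```python
import re

# An ANSI SGR (color/style) sequence such as "\x1b[31m" or "\x1b[1;32m".
SGR = re.compile(r"\x1b\[[0-9;]*m")


def mask_sideband(text: str) -> str:
    """Mask control characters but let color sequences through unchanged."""
    out = []
    i = 0
    while i < len(text):
        match = SGR.match(text, i)
        if match:
            # Color sequences are allowed, so copy them verbatim.
            out.append(match.group())
            i = match.end()
            continue
        ch = text[i]
        if ch in "\n\t\r" or (ord(ch) >= 0x20 and ord(ch) != 0x7F):
            out.append(ch)
        else:
            # Assumed rendering: caret notation, e.g. ESC becomes "^[".
            out.append("^" + chr(ord(ch) ^ 0x40))
        i += 1
    return "".join(out)
```

With this sketch, a colored message such as `"\x1b[31mred\x1b[0m"` passes through untouched, while an erase-line sequence such as `"\x1b[2K"` comes out as the visible text `^[[2K` instead of being interpreted by the terminal.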

  • Suppose you are halfway through editing a file when you realize that you started from the wrong branch. If the branch you want to switch to changed the same path, a plain git checkout <branch> will refuse to move rather than risk clobbering your work. git checkout -m <branch> is the “try to carry my local edits with me” version of that operation.

    But what happens when the other side has modifications against that same path? Previously, git checkout -m gave you one chance to resolve the resulting conflicts immediately. Git 2.55 makes that safer by using an autostash internally, so the conflicted local changes are saved as a stash entry that you can either resolve right away or reapply later.

    [source]

  • Some projects need to publish the same branch to more than one place, like a primary host and one or more mirrors. Remote groups have long been available to git fetch, where a group is configured with remotes.<name> as a whitespace-separated list of remotes. Git 2.55 lets git push use the same shorthand:

    $ git config remotes.publish "github gitlab mirror" 
    $ git push publish main

    This is equivalent to pushing to each remote in the group in sequence. Since atomicity can only be guaranteed for a single transport connection, --atomic is not supported when pushing to a group.

    [source]
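To make "pushing to each remote in the group in sequence" concrete, here is a hedged sketch of roughly the same loop written by hand, as one might on an older Git. The helper names `group_push_commands` and `push_to_group` are hypothetical, and stopping at the first failed push is a choice made for this example, not a statement about how Git itself handles failures.

```python
import subprocess
from typing import List


def group_push_commands(group_value: str, refspecs: List[str]) -> List[List[str]]:
    """Expand a whitespace-separated remotes.<name> value into push commands."""
    return [["git", "push", remote, *refspecs] for remote in group_value.split()]


def push_to_group(group_name: str, refspecs: List[str]) -> None:
    """Read remotes.<group_name> from Git config and push to each remote in turn."""
    value = subprocess.run(
        ["git", "config", "--get", f"remotes.{group_name}"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    for command in group_push_commands(value, refspecs):
        # check=True stops at the first failing push (an assumption of this sketch).
        subprocess.run(command, check=True)
```

For the configuration shown above, `group_push_commands("github gitlab mirror", ["main"])` yields three separate `git push` invocations. Each one opens its own connection, which is why a single atomic transaction across the group is not possible.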

  • git log --graph is great for visualizing branch structure, right up until the graph itself gets too wide to read. In repositories with many parallel branches, the graph lanes can consume most of the terminal before you get to the commit subject.

    Git 2.55 adds --graph-lane-limit=<n> to git log --graph and related commands. Lanes beyond the limit are replaced with `~`, making graph output more manageable in repositories with very wide histories.

    [source]

  • Suppose you want to list the 10 most recent commits on your branch. That is easy enough: git log -n 10 does exactly that. But what if you want the 10 oldest commits? If you are thinking, “it surely isn’t git log --reverse -10,” then congratulations: you’re a veteran Git user! Instead of reversing the history and then printing 10 commits, Git takes the 10 most recent commits and reverses their order.

    You can get there by post-processing the whole range (for example, with git log --oneline <range> | tail -n 10), but doing so still asks Git to print and format all of the commits that the shell is going to throw away. Git 2.55 adds a new --max-count-oldest=<n> option to git rev-list and the git log family of commands, which selects the oldest n commits in a range instead.

    [source]
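The difference between the two selections can be modeled on a plain list. This is only a sketch of the semantics described above: `history` is a hypothetical newest-first list of commit names, and the order in which Git prints the selected oldest commits is not modeled here.

```python
from typing import List


def newest_n_reversed(history: List[str], n: int) -> List[str]:
    """Model of `git log --reverse -n N`: limit to the newest N, then reverse."""
    return list(reversed(history[:n]))


def oldest_n(history: List[str], n: int) -> List[str]:
    """Model of selecting the N oldest commits in the range."""
    return history[-n:] if n > 0 else []


# Newest first, the way `git log` walks history by default.
history = ["c5", "c4", "c3", "c2", "c1"]
```

With this history, `newest_n_reversed(history, 2)` gives `["c4", "c5"]`, the two newest commits in reverse order, while `oldest_n(history, 2)` gives `["c2", "c1"]`, which is what the question was actually asking for.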

  • During a fetch, the client and server negotiate by having the client advertise commits it already has as have lines. That lets the server avoid sending objects the client can already reach. But in repositories with many references, the negotiation algorithm may skip a ref that is especially important for finding common history.

    Git 2.55 adds new controls for which references participate in negotiation. The new include and restrict options, along with corresponding remote.* configuration, allow users to require certain refs to be sent as have lines or to limit negotiation to a specific set of refs.

    [source]

…the rest of the iceberg

That’s just a sample of changes from the latest release. For more, check out the release notes for 2.55, or any previous version in the Git repository.

The post Highlights from Git 2.55 appeared first on The GitHub Blog.


What You Bring to AI Determines the Result


Harper Carroll came to AI education through a CS background at Stanford, machine learning engineering at Meta, and a brief stint at a small GPU compute startup in late 2023, where she noticed that almost no one understood how to fine-tune open source models. She started writing and teaching to help drive signups for the startup’s platform. Her first guide, posted right after Mistral 7B was released, when she had about 50 followers, got 50,000 views. In March 2024, a video explaining the difference between AI and machine learning got 5 million views, with 1 in 20 viewers following her afterward. She now has more than 500,000 followers across multiple platforms and is a full-time AI educator.

We covered fine-tuning versus prompting, what it actually means to learn to code in 2025, and what the AI field gets wrong when it talks to the public.

Understanding the world with math

We started with Harper’s own AI learning journey, and it contained a wonderful insight. She grew up loving math and came to computer science at Stanford because algorithms seemed like wonderful math puzzles. Eventually she realized that AI is “understand[ing] the world around us with math.” Text-based LLMs are only one branch. The field as a whole is “the math of the world.” That seems like a deep intuition that all of us need to internalize.

AI as a medium

A study that circulated last year found that people who used AI to write essays showed reduced brain activity compared to people who write unaided. The reaction in many quarters was alarm. People said, “We’re outsourcing cognition and our brains will atrophy.” Harper’s smart response was that those users must have given the AI a one-sentence prompt and accepted whatever came back.

As she put it, that’s the equivalent of just telling Alexa to order you the most popular book this week. Of course less brain activity is being measured! Contrast that with the difference between shopping for a book by browsing and searching at Amazon versus driving to a physical bookstore. There’s certainly a difference, but it isn’t outsourcing cognition. It’s saving time, and that time might well be spent on other demanding cognitive tasks.

My framing is that AI is a medium, the way language is a medium, or photography. Anyone can take a photograph or write a book. The words available to every writer are the same; what differs is what they do with them, just as some photographers do something with a camera that others can’t. The same is true of software. There’s a line in Aaron Sorkin’s movie The Social Network where the Zuckerberg character says about the Winklevosses, “If you guys were the inventors of Facebook, you’d have invented Facebook.” An idea and its execution aren’t the same thing. One person gives AI a prompt and the output is bad. Another builds a process around AI and the output is great. What you bring to the medium is what determines the result. Harper agreed.

Fine-tuning is like psychedelics for AI

I’ve been trying to figure out how we can use AI for writing and editing at O’Reilly. We want skills and workflows that accelerate our productivity but don’t produce copy that reads as whatever the base model sounds like when nobody’s putting in any effort.

Takeaway posts like this one are a great use case for AI-assisted writing. As source material we have a transcript, with the actual conversation between the participants (or in the case of one of our online conferences, their presentations). We want a structured summary that captures the high points and suggests possible clips for social media. I (or whoever is using this AI-assisted workflow) can then rewrite, rearrange, elaborate, or delete from that first draft. It might not be as good as a draft written from scratch, but quite frankly, it’s far better than the alternative, which is no summary at all. I just don’t have time to write them all unaided.

When I’m writing an article, I generate a similar “transcript” by recording myself talking about the ideas I’m wrestling with and trying to put into the world. Then I ask Claude to put it together into something a bit more structured.

I’ve been improving Claude’s ability to produce prose that we can use by rewriting its output, showing it the differences, and then asking it to construct a skill that captures what it’s learned. Over time, it’s gotten closer and closer to something that I’m comfortable with, and I’m now generalizing that into a system that learns any author’s voice, respects the various conventions of the target content type (which can be very different across books, articles and blog posts, social media, and marketing materials like back cover copy and course descriptions), and applies editing suggestions from my favorite books on good writing, including Strunk and White and On Writing Well by William Zinsser.

Harper attacked the same problem from a different angle. She built a dataset of roughly 1,000 of her Instagram captions, video transcripts, and X posts, then fed them to Claude as context and asked it to write in her style. Unfortunately, the output tested 100% AI by a detection tool, even with 1,000 examples of her real voice in the prompt. She then fine-tuned an open source Llama model on the same data. The fine-tuned output tested 100% human. She gave a compelling demo at South by Southwest showing how easy this is to do. It took her about 20 minutes.

After Harper said that prompting doesn’t shift the output distribution the way fine-tuning does, I told her the story about the French writer Marcel Proust that I first used in my conversation with Steve Wilson, which I picked up from Alain de Botton’s How Proust Can Change Your Life. A friend comes to visit the bedridden Proust and, making polite conversation, begins to tell him about the train trip to Paris. “More slowly,” Proust replies. This cycle repeats several times until the friend is telling him small details like the old man feeding pigeons on the steps of the station.

Harper got it, and broke it down more slowly in her inimitable way. Here’s why in-context prompting fails where fine-tuning succeeds:

Basically AI models are these massive mathematical equations, and the parameters are variables when you’re training, and then they become constants in those equations when you’re running inference. . . . So what you’re doing when you’re training the model is you’re learning how to map, by adjusting those constants when they’re variables during training, . . . input to desired output.

Once the model is deployed, the probability distribution over output tokens is fixed. You can put 1,000 examples in a prompt and ask the model to pattern-match, but you’re asking it to do that with frozen weights. The surface behavior bends a little, but the underlying distribution doesn’t shift. Fine-tuning lets you actually modify the weights and how the model wants to write.
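A toy example may make the variables-versus-constants point concrete. The sketch below is an assumption-laden stand-in, not how an LLM is trained: it uses a one-parameter model `y = w * x`, a hypothetical dataset `DATA`, and plain gradient descent on mean squared error. During `train`, `w` is a variable that gets adjusted. During `infer`, it is a constant, and nothing about the input can change it.

```python
from typing import List, Tuple


def train(data: List[Tuple[float, float]], steps: int = 50, lr: float = 0.1) -> float:
    """Fit y = w * x by gradient descent; w is a variable while training."""
    w = 0.0
    for _ in range(steps):
        # Gradient of mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def infer(w: float, x: float) -> float:
    """Apply the frozen weight; the input cannot modify w."""
    return w * x


# Hypothetical training data drawn from y = 2x.
DATA = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
W = train(DATA)
```

After training, `W` settles very close to 2, and every call to `infer(W, x)` uses that same value. Changing what the model does with a given input means running `train` again on new data, which is the toy analogue of fine-tuning.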

Her suggested approach for building the training dataset is to take your own writing, have AI rewrite it with its characteristic tics, then train with the AI version as input and your original as the target output. You’re teaching the model to undo the tells.
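The dataset layout described above might look something like this. It is a minimal sketch: the field names `input` and `output`, the helper names, and the JSONL layout are assumptions for illustration, and the AI rewrites themselves would come from a separate model call that is not shown.

```python
import json
from typing import Dict, List


def build_pairs(originals: List[str], ai_rewrites: List[str]) -> List[Dict[str, str]]:
    """Pair each AI rewrite (input) with the human original (target output)."""
    if len(originals) != len(ai_rewrites):
        raise ValueError("each original needs exactly one AI rewrite")
    return [
        {"input": ai_text, "output": human_text}
        for human_text, ai_text in zip(originals, ai_rewrites)
    ]


def to_jsonl(pairs: List[Dict[str, str]]) -> str:
    """Serialize the pairs one JSON object per line."""
    return "\n".join(json.dumps(pair, ensure_ascii=False) for pair in pairs)
```

The direction of the pairing is the point: the model sees the AI-flavored text and is trained to produce the human version, so what it learns is the mapping from the tells back to the author's voice.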

Should people still learn to code?

We also spent time on the inevitable question of whether people should still learn to code. We both agree they should, but not necessarily the way they used to: first learning the detailed syntax of a programming language, then discovering by painful trial and error how hard it is to get the desired behavior.

Harper’s take (which I also agree with) is that vibe coding has lowered the floor. People who could never afford to hire someone to build a product can now do so themselves. But it has also raised the ceiling, because people who actually understand systems can build vastly more sophisticated things with the same tools, which takes us back to the case for AI as a medium.

Perhaps more important to the question of how much coding you should learn, experienced developers will also see failure modes that pure vibe coders miss. Harper gave an example that came from watching a friend use an agent tool that had, at some point, started storing its data in a Word document and using it as a makeshift database, probably because the session started with a Word doc. It was extremely slow and extremely inefficient. An engineer sees the problem immediately. A vibe coder might run that system for months before noticing something is wrong.

So yes, you should learn enough about coding to understand what’s happening. The art of teaching programming to the next generation will be developing useful projects that also highlight underlying concepts of software architecture and engineering.

Intuition as differentiator

Silicon Valley runs heavily on logic and on the idea that good decisions come from better data, more rigorous analysis, and sharper models. In this environment, intuition can get dismissed as something “soft and fuzzy,” Harper noted. And that’s the wrong mindset for AI.

AI is getting better and better at exactly the things the logical axis does well, but intuition remains a challenge because it often contradicts what the data says. Good intuition “goes against the input,” to use Harper’s phrase. A model that’s been trained to recognize patterns in data will, almost by definition, struggle with making decisions that run counter to those patterns. Just as skills-informed judgment supercharges AI-assisted engineers, intuition could be a uniquely human skill for a long time. Elevating it as a concern might bring the industry more of an attitude of humility towards ourselves and our place in the world.

What the field gets wrong

I closed by asking Harper what the AI field most consistently gets wrong in how it talks to the public. She said that too much of the public-facing discourse leads with fear: of job displacement, of rapidly approaching AGI, and of a rocky transition that requires a universal basic income to cushion the blow. She’s not calling those impossible futures, but she thinks they’re the wrong introduction to the technology.

A lot of companies are using AI to ask how to do the same things at lower cost. The better question is how to raise ambitions. AI doesn’t just scale individual capabilities. It scales what organizations can attempt. But for it to work out that way, everybody has to actually learn AI. We can’t have AI haves and have-nots. That means lower-cost models, serious open source investment, and companies that don’t just become serfs to the major platforms.

Harper has been making this point for a while, to audiences ranging from engineers to people who’ve never written a line of code. “There is not really much to fear right now,” she says. “AI is this incredible productivity tool.” The people who will struggle, in her view, are the ones who refuse to engage with it at all.

At O’Reilly, we’ve been working on a version of the same narrative at an organizational level. The fear-first narrative produces avoidance, and avoidance is the one thing that will actually leave someone behind. So we’re building a corporate AI transformation practice that starts with people’s existing jobs, and figures out how to “mix in” AI to make them more impactful. We’re learning how to teach both the humans and the agents at the same time to make them more productive together.

On July 9, I’ll be speaking with Trail of Bits cofounder and CEO Dan Guido about the playbook his company used to go AI native, which he first outlined at this year’s [un]prompted. He’ll give a version of the same talk, then take about 40 minutes of audience questions on what worked, what didn’t, and what is still unsolved. I hope you join us to find out what’s changed since [un]prompted and where the playbook is heading next. Register here; it’s free and open to all.


