The Git project recently released Git 2.52. After a relatively short 8-week release cycle for 2.51, due to summer in the Northern Hemisphere, this release is back to the usual 12-week cycle. Let’s look at some notable changes, including contributions from the GitLab Git team and the wider Git community.
New git-last-modified(1) command
Many Git forges like GitLab display files in a tree view like this:
| Name | Last commit | Last update |
|---|---|---|
| README.md | README: *.txt -> *.adoc fixes | 4 months ago |
| RelNotes | Start 2.51 cycle, the first batch | 4 weeks ago |
| SECURITY.md | SECURITY: describe how to report vulnerabilities | 4 years |
| abspath.c | abspath: move related functions to abspath | 2 years |
| abspath.h | abspath: move related functions to abspath | 2 years |
| aclocal.m4 | configure: use AC_LANG_PROGRAM consistently | 15 years ago |
| add-patch.c | pager: stop using the_repository | 7 months ago |
| advice.c | advice: allow disabling default branch name advice | 4 months ago |
| advice.h | advice: allow disabling default branch name advice | 4 months ago |
| alias.h | rebase -m: fix serialization of strategy options | 2 years |
| alloc.h | git-compat-util: move alloc macros to git-compat-util.h | 2 years ago |
| apply.c | apply: only write intents to add for new files | 8 days ago |
| archive.c | Merge branch 'ps/parse-options-integers' | 3 months ago |
| archive.h | archive.h: remove unnecessary include | 1 year |
| attr.h | fuzz: port fuzz-parse-attr-line from OSS-Fuzz | 9 months ago |
| banned.h | banned.h: mark strtok() and strtok_r() as banned | 2 years |
Next to the files themselves, we also display which commit last modified each respective file. This information is easy to extract from Git by executing the following command:
$ git log --max-count=1 HEAD -- <filename>
While nice and simple, this has a significant catch: Git does not have a way to extract this information for each of these files in a single command. So to get the last commit for all the files in the tree, we'd need to run this command for each file separately. This results in a command pipeline similar to the following:
$ git ls-tree HEAD --name-only | xargs --max-args=1 git log --max-count=1 HEAD --
Naturally, this isn't very efficient:
- We need to spin up a fresh Git command for each file.
- Git has to step through history for each file separately.
As a consequence, this whole operation is quite costly and generates significant load for GitLab.
To overcome these issues, a new Git subcommand, git-last-modified(1), has been introduced. For each file in the tree of a given commit, it returns the commit that last modified that file:
$ git last-modified HEAD
e56f6dcd7b4c90192018e848d0810f091d092913 add-patch.c
373ad8917beb99dc643b6e7f5c117a294384a57e advice.h
e9330ae4b820147c98e723399e9438c8bee60a80 advice.c
5e2feb5ca692c5c4d39b11e1ffa056911dd7dfd3 alloc.h
954d33a9757fcfab723a824116902f1eb16e05f7 RelNotes
4ce0caa7cc27d50ee1bedf1dff03f13be4c54c1f apply.c
5d215a7b3eb0a9a69c0cb9aa43dcae956a0aa03e archive.c
c50fbb2dd225e7e82abba4380423ae105089f4d7 README.md
72686d4e5e9a7236b9716368d86fae5bf1ae6156 attr.h
c2c4138c07ca4d5ffc41ace0bfda0f189d3e262e archive.h
5d1344b4973c8ea4904005f3bb51a47334ebb370 abspath.c
5d1344b4973c8ea4904005f3bb51a47334ebb370 abspath.h
60ff56f50372c1498718938ef504e744fe011ffb banned.h
4960e5c7bdd399e791353bc6c551f09298746f61 alias.h
2e99b1e383d2da56c81d7ab7dd849e9dab5b7bf0 SECURITY.md
1e58dba142c673c59fbb9d10aeecf62217d4fc9c aclocal.m4
The obvious benefit is that we now only have to execute a single Git process to derive all of that information. Even more importantly, Git only has to walk the history once for all files together instead of once per file. The walk works as follows:
- Start walking the history from the specified commit.
- For each commit:
  - If it doesn't modify any of the paths we're interested in we continue to the next commit.
  - If it does, we print the commit ID together with the path. Furthermore, we remove the path from the set of interesting paths.
- When the list of interesting paths becomes empty we stop.
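The steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Git implements it: it assumes a linear toy history (no merges), and the commit IDs, paths, and the `last_modified` function name are all made up for the example.

```python
from typing import Dict, List, Set, Tuple

# Toy linear history, newest commit first. Each entry is
# (commit_id, paths_modified_by_that_commit). IDs and paths are invented.
History = List[Tuple[str, Set[str]]]


def last_modified(history: History, paths: Set[str]) -> Dict[str, str]:
    """Walk the history once and map each path to the newest commit touching it."""
    interesting = set(paths)          # paths we still need an answer for
    result: Dict[str, str] = {}
    for commit_id, modified in history:
        if not interesting:           # every path is resolved: stop walking
            break
        hits = interesting & modified
        if not hits:                  # commit touches nothing we care about
            continue
        for path in hits:
            result[path] = commit_id  # record the commit ID for this path
        interesting -= hits           # drop resolved paths from the set
    return result


history: History = [
    ("c4", {"apply.c"}),
    ("c3", {"README.md", "advice.c"}),
    ("c2", {"apply.c", "advice.c"}),
    ("c1", {"README.md", "apply.c", "advice.c"}),
]

answers = last_modified(history, {"README.md", "apply.c", "advice.c"})
for path, commit_id in sorted(answers.items()):
    print(commit_id, path)
```

In this toy history the walk stops after the second commit, because by then all three paths have been resolved. The per-file approach would instead start a separate walk for each of the three paths.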
Gitaly has already been adjusted to use the new command, but the logic is still guarded by a feature flag. Preliminary testing has shown that git-last-modified(1) is in most situations at least twice as fast compared to using git log --max-count=1.
These changes were originally written by multiple developers from GitHub and were upstreamed into Git by Toon Claes.
git-fast-export(1) and git-fast-import(1) signature-related improvements
The git-fast-export(1) and git-fast-import(1) commands are designed to be used mostly by interoperability or history-rewriting tools. The goal of interoperability tools is to make Git interact nicely with other software, usually a different version control system, that stores data in a different format than Git. For example, hg-fast-export.sh is a “Mercurial to Git converter using git-fast-import.”
Alternatively, history-rewriting tools let users (usually admins) make changes to the history of their repositories that are not possible or not allowed by the version control system. For example, reposurgeon says in its introduction that its purpose is “to enable risky operations that version-control systems don't want to let you do, such as (a) editing past comments and metadata, (b) excising commits, (c) coalescing and splitting commits, (d) removing files and subtrees from repo history, (e) merging or grafting two or more repos, and (f) cutting a repo in two by cutting a parent-child link, preserving the branch structure of both child repos.”
Within GitLab, we use git-filter-repo to let admins perform some risky operations on their Git repositories. Unfortunately, until Git 2.50 (released last June), neither git-fast-export(1) nor git-fast-import(1) handled cryptographic commit signatures at all. So, although git-fast-export(1) had a --signed-tags=<mode> option that allows users to change how cryptographic tag signatures are handled, commit signatures were simply ignored.
Cryptographic signatures are very fragile because they are based on the exact commit or tag data that was signed. When the signed data or any of its preceding history changes, the cryptographic signature becomes invalid. This is a fragile but necessary requirement to make these signatures useful.
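To make this fragility concrete, here is a small Python sketch. It is a simplified model, not Git's object format: the payload layout, the `KEY` constant, and the use of HMAC as a stand-in for a real GPG, SSH, or X.509 signature are all assumptions made for illustration. What it preserves is the property that matters here: the signed data of a commit includes the ID of its parent.

```python
import hashlib
import hmac
from typing import List, Optional

KEY = b"demo-signing-key"  # hypothetical key, stands in for a real signing key


def payload(parent_id: Optional[str], message: str) -> bytes:
    """The data that gets signed. Note that it includes the parent's ID."""
    return f"parent {parent_id or 'none'}\n\n{message}\n".encode()


def object_id(data: bytes) -> str:
    """Derive a commit ID from the commit's data."""
    return hashlib.sha256(data).hexdigest()


def sign(data: bytes) -> str:
    """Stand-in for a cryptographic signature over the commit data."""
    return hmac.new(KEY, data, hashlib.sha256).hexdigest()


def build(messages: List[str]) -> List[bytes]:
    """Build a linear chain of commit payloads, oldest first."""
    payloads: List[bytes] = []
    parent: Optional[str] = None
    for message in messages:
        data = payload(parent, message)
        payloads.append(data)
        parent = object_id(data)
    return payloads


original = build(["initial import", "add feature", "fix typo"])
signatures = [sign(data) for data in original]

# Rewrite history: only the message of the middle commit changes.
rewritten = build(["initial import", "add feature (reworded)", "fix typo"])

still_valid = [
    hmac.compare_digest(sign(new), old_signature)
    for new, old_signature in zip(rewritten, signatures)
]
print(still_valid)  # [True, False, False]
```

The first commit keeps a valid signature because nothing leading up to it changed. The last commit was not edited at all, yet its signature breaks because its parent now has a different ID.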
But in the context of rewriting history this is a problem:
- We may want to keep cryptographic signatures for both commits and tags that are still valid after the rewrite (e.g. because the history leading up to them did not change).
- We may want to create new cryptographic signatures for commits and tags where the previous signature has become invalid.
Neither git-fast-import(1) nor git-fast-export(1) allows for these use cases, though, which limits what tools like git-filter-repo or reposurgeon can achieve.
We have made some significant progress:
- In Git 2.50 we added a --signed-commits=<mode> option to git-fast-export(1) for exporting commit signatures, and support in git-fast-import(1) for importing them.
- In Git 2.51 we improved the format used for exporting and importing commit signatures, and we made it possible for git-fast-import(1) to import both a signature made on the SHA-1 object ID of the commit and one made on its SHA-256 object ID.
- In Git 2.52 we added the --signed-commits=<mode> and --signed-tags=<mode> options to git-fast-import(1), so the user has control over how to handle signed data at import time.
There is still more to be done. We need to add the ability to:
- Retain, in git-fast-import(1), only those commit signatures that are still valid.
- Re-sign data where the signature became invalid.
We have already started to work on these next steps and expect this to land in Git 2.53. Once done, tools like git-filter-repo(1) will finally start to handle cryptographic signatures more gracefully. We will keep you posted in our next Git release blog post.
This project was led by Christian Couder.
New and improved git-maintenance(1) strategies
Git repositories require regular maintenance to ensure that they perform well. This maintenance performs a bunch of different tasks: references get optimized, objects get compressed, and stale data gets pruned.
Until Git 2.28, these maintenance tasks were performed by git-gc(1). The problem with this command is that it wasn't built with customizability in mind: While certain parameters can be configured, it is not possible to control which parts of a repository should be optimized. This means that it may not be a good fit for all use cases. Even more importantly, it made it very hard to iterate on how exactly Git performs repository maintenance.
To fix this issue and allow us to iterate again, Derrick Stolee introduced git-maintenance(1). In contrast to git-gc(1), it is built with customizability in mind and allows the user to configure which tasks specifically should be running in a certain repository. This new tool was made the default for Git’s automated maintenance in Git 2.29, but, by default, it still uses git-gc(1) to perform the maintenance.
While this default maintenance strategy works well in small or even medium-sized repositories, it is problematic in the context of large monorepos. The biggest limiting factor is how git-gc(1) repacks objects: Whenever there are more than 50 packfiles, the tool will merge all of them together into a single packfile. This operation is quite CPU-intensive and causes a lot of I/O operations, so for large monorepos this operation can easily take many minutes or even hours to complete.
Git already knows how to minimize these repacks via “geometric repacking.” The idea is simple: The packfiles that exist in the repository must follow a geometric progression where every packfile must contain at least twice as many objects as the next smaller one. This allows Git to amortize the number of repacks required while still ensuring that there is only a relatively small number of packfiles overall. This mode was introduced by Taylor Blau in Git 2.32, but it was not wired up as part of the automated maintenance.
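The rule can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not Git's implementation: packfiles are represented only by their object counts, the factor of 2 comes from the description above, and the function names `is_geometric` and `plan_repack` are invented for this example.

```python
from typing import List, Tuple


def is_geometric(sizes: List[int], factor: int = 2) -> bool:
    """True if every pack has at least `factor` times the objects of the next smaller one."""
    ordered = sorted(sizes)
    return all(big >= factor * small for small, big in zip(ordered, ordered[1:]))


def plan_repack(sizes: List[int], factor: int = 2) -> Tuple[List[int], List[int]]:
    """Split packs into (to_merge, to_keep) so that the result is geometric again."""
    ordered = sorted(sizes)
    # Scan from the largest pack down until two neighbours break the rule.
    i = len(ordered) - 1
    while i > 0 and ordered[i] >= factor * ordered[i - 1]:
        i -= 1
    if i <= 0:
        return [], ordered            # already a geometric progression
    split = i + 1                     # merge both offending packs and all smaller ones
    total = sum(ordered[:split])
    # The merged pack may now be too close in size to the next larger pack.
    while split < len(ordered) and ordered[split] < factor * total:
        total += ordered[split]
        split += 1
    return ordered[:split], ordered[split:]


packs = [100, 120, 500, 4000]         # object counts, made up for the example
to_merge, to_keep = plan_repack(packs)
after = [sum(to_merge)] + to_keep
print(to_merge, to_keep, is_geometric(after))  # [100, 120] [500, 4000] True
```

Only the two small packs are rewritten here, while the large packs that hold most of the objects are left untouched. That is where the savings come from compared to merging everything into a single packfile.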
All the parts exist to make repository maintenance way more scalable for large monorepos: We have the flexible git-maintenance(1) tool that can be extended to have a new maintenance strategy, and we have a better way to repack objects. All that needs to be done is to combine these two.
And that's exactly what we did with Git 2.52! We have introduced a new “geometric” maintenance strategy that you can configure in your Git repositories. This strategy is intended as a full replacement for the old strategy based on git-gc(1). You can enable it with the following command:
$ git config set maintenance.strategy geometric
From here on, Git will use geometric repacking to optimize your objects. This should lead to less churn while ensuring that your objects are in a better-optimized state, especially in large monorepos.
In Git 2.53, we aim to make this the default strategy. So stay tuned!
This project was led by Patrick Steinhardt.
New subcommand for git-repo(1) to display repository metrics
The performance of Git operations in a repository often depends on certain characteristics of its underlying structure. At GitLab, we host some extremely large repositories, and having insight into the general structure of a repository is critical to understanding performance. While it is possible to compose various Git commands and other tools together to surface certain repository metrics, Git lacks a means to surface info about a repository's shape/structure via a single command. This has led to the development of other external tools, such as git-sizer(1), to fill this gap.
With the release of Git 2.52, a new “structure” subcommand has been added to git-repo(1) with the aim to surface information about a repository's structure. Currently, it displays info about the number of references and objects in the repository in the following form:
$ git repo structure
| Repository structure | Value |
| -------------------- | ------ |
| * References | |
| * Count | 1772 |
| * Branches | 3 |
| * Tags | 1025 |
| * Remotes | 744 |
| * Others | 0 |
| | |
| * Reachable objects | |
| * Count | 418958 |
| * Commits | 87468 |
| * Trees | 168866 |
| * Blobs | 161632 |
| * Tags | 992 |
In subsequent releases we hope to expand on this and provide other interesting data points like the largest objects in the repository.
This project was led by Justin Tobler.
Improvements related to the Google Summer of Code 2025
We had three successful projects with the Google Summer of Code.
Refactoring in order to reduce Git's global state
Git contains several global variables used throughout the codebase. This increases the complexity of the code and reduces the maintainability. As part of this project, Ayush Chandekar worked on reducing the usage of the the_repository global variable via a series of patches.
The project was mentored by Christian Couder and Ghanshyam Thakkar.
Machine-readable Repository Information Query Tool
Git lacks a centralized way to retrieve repository information, requiring users to piece it together from various commands. While git-rev-parse(1) has become the de-facto tool for accessing much of this information, doing so falls outside its primary purpose.
As part of this project, Lucas Oshiro introduced a new command, git-repo(1), which will house all repository-level information. Users can now use git repo info to obtain repository information:
$ git repo info layout.bare layout.shallow object.format references.format
layout.bare=false
layout.shallow=false
object.format=sha1
references.format=reftable
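Because the output is a list of key=value lines, it is easy to consume from scripts. The following Python sketch parses the sample output shown above into a dictionary. The `parse_repo_info` helper is hypothetical, and it assumes simple values without quoting or embedded newlines, as in the sample.

```python
from typing import Dict


def parse_repo_info(output: str) -> Dict[str, str]:
    """Parse key=value lines into a dictionary (assumes simple, unquoted values)."""
    info: Dict[str, str] = {}
    for line in output.splitlines():
        if not line.strip():
            continue                          # skip blank lines
        key, separator, value = line.partition("=")
        if not separator:
            raise ValueError(f"not a key=value line: {line!r}")
        info[key] = value
    return info


# Sample output copied from the example above.
sample_output = """\
layout.bare=false
layout.shallow=false
object.format=sha1
references.format=reftable
"""

info = parse_repo_info(sample_output)
print(info["object.format"], info["references.format"])  # sha1 reftable
```

In a real script, the text passed to the parser would come from running the git repo info command shown above, for example via Python's subprocess module.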
The project was mentored by Patrick Steinhardt and Karthik Nayak.
Consolidate ref-related functionality into git-refs
Git offers multiple commands for managing references, namely git-for-each-ref(1), git-show-ref(1), git-update-ref(1), and git-pack-refs(1). This makes them harder to discover and creates overlapping functionality. To address this, we introduced the git-refs(1) command to consolidate these operations under a single interface. As part of this project, Meet Soni extended the command by adding the following subcommands:
- git refs optimize to optimize the reference backend
- git refs list to list all references
- git refs exists to verify the existence of a reference
The project was mentored by Patrick Steinhardt and shejialuo.
What's next?
Ready to experience these improvements? Update to Git 2.52.0 and start using git last-modified.
At GitLab, we will of course ensure that all of these improvements will eventually land in a GitLab instance near you!
Learn more in the official Git 2.52.0 release notes and explore our complete archive of Git development coverage.

