Welcome to the 127th edition of Git Rev News, a digest of all things Git. For our goals, the archives, the way we work, and how to contribute or to subscribe, see the Git Rev News page on git.github.io.
This edition covers what happened during the months of August and September 2025.
Doing blobless clone by default; switching between blobless, treeless and full clones by a command
Dilyan Palauzov (Дилян Палаузов) sent an email to the Git mailing list where he proposed making blobless cloning (--filter=blob:none) the default behavior for git clone via a global configuration option. He also suggested adding a command to download all locally missing history, a command to convert a repository to a pure treeless or pure blobless clone, and a config option to make blobless clone the default behavior when running just git clone <URL>.
He said that most users used git clone to build or change software, not to immediately analyze history with commands like git log. Therefore, a reduced data download would speed up initialization, save bandwidth, and reduce server load.
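For readers who have not tried partial clones, here is a minimal sketch of what a blobless clone leaves out. It builds a throwaway repository in a temporary directory and clones it over file://. The directory names, file name and commit messages are made up for illustration, and the two uploadpack.* settings are an assumption needed only because the "server" here is a plain local repository.

```shell
#!/bin/sh
set -eu

# Scratch area; every path and name below is illustrative only.
work=$(mktemp -d)
cd "$work"

# Build a tiny "server" repository with two commits touching one file.
git init -q server
git -C server config user.name "Example"
git -C server config user.email "example@example.com"
echo "v1" > server/file.txt
git -C server add file.txt
git -C server commit -q -m "first"
echo "v2" > server/file.txt
git -C server commit -q -am "second"

# A plain local repository ignores filters unless explicitly allowed.
# The second setting lets later on-demand object fetches be served.
git -C server config uploadpack.allowFilter true
git -C server config uploadpack.allowAnySHA1InWant true

# Blobless clone: commits and trees are transferred, blobs are deferred.
# --no-checkout keeps Git from fetching HEAD's blobs right away.
git clone -q --no-checkout --filter=blob:none "file://$work/server" blobless

# Reachable objects that are absent locally are printed with a leading "?".
missing_blobs=$(git -C blobless rev-list --objects --missing=print HEAD | grep -c '^?')
commits=$(git -C blobless rev-list --count HEAD)
echo "commits: $commits, missing blobs: $missing_blobs"
```

In this toy history the clone holds the full commit graph while both versions of file.txt stay on the "server" until something, such as a checkout, asks for them.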
Kristoffer Haugsbakk replied saying the proposed command to “download all locally missing history” for treeless and blobless clones “sounds like git-backfill(1)”. He also noted that he had “never used blob/treeless” clones himself.
Derrick Stolee, who likes to be called just “Stolee”, and who contributed the git backfill command, replied to Kristoffer confirming that git backfill is intended to assist with downloading the missing blobs in a blobless partial clone.
About treeless clones though, he noted that git backfill is not optimized for them, and that treeless clones are generally not intended for “refilling,” as downloading missing trees is “particularly expensive”.
Stolee suggested using scalar clone, which is already shipped with Git, instead of making blobless cloning the default, as scalar clone was contributed partly to allow users to opt into a version of git clone that incorporates “best practices and advanced features as they are developed”, while git clone maintains backward compatibility. He recognized that scalar clone might not be “discoverable enough” though.
Junio Hamano replied to Stolee’s suggestion that a future command like git big-clone could emerge from the feedback on scalar clone. He said a separate command like git big-clone would not be discoverable enough either. Instead, as a new feature matures, it should be a welcome change for git clone to borrow it as a new option. Such optimizations (like those for large repos) could be automatically enabled based on the repository’s size, provided it was done with end-user consent.
Patrick Steinhardt replied to Stolee about treeless clones. He agreed that the existing command git backfill is not optimized for refilling treeless clones, and proposed an idea to backfill trees by batching based on depth, but concluded that this method was “definitely not ideal” and would perform “way worse compared to backfilling blobs”.
Patrick also said that for these reasons he generally recommends not using treeless clones at all.
Stolee replied to Patrick agreeing with the general caution regarding treeless clones, and that they were “not a good approach for doing ongoing work as a human”.
However he noted that they are useful if a user needs the speed of a shallow clone combined with the ability to analyze commit history (though with no path history) for an “ephemeral scenario like a CI build”. But they are a “tool for a very narrow case” and should only be used by those who understand how to avoid their pitfalls. Patrick then agreed with that point of view.
Konstantin Ryabitsev, the system administrator for kernel.org, replied to the original email from Dilyan about making blobless clones the default behavior for git clone. He said a counter-rationale to this proposal was that partial clones (which include blobless clones) generate significantly more load on the server side.
The reason is that for these partial clones, no pre-existing packs are available for the operation, requiring more computation from the server. So changing the default behavior for git clone could likely result in slower clones for everyone and lead to more unavailable servers due to the high load.
Ben Knoble also replied to Dilyan’s original email by opposing the proposal to make blobless clones the default behavior while agreeing that managing this preference via a config option was a reasonable compromise.
Ben’s opinion was that such a default behavior would defeat the “tremendous advantage of distributed version control”, which is about having the whole repository available independently. It would also make some of his use cases more difficult as he frequently clones repositories specifically to run “history spelunking searches”.
He noted that he primarily deals with repositories where the issue isn’t about clones, but about mismanaging large binary files in history, which causes large blobs and long clone times.
Who are you and what do you do?
I’m Toon from Belgium. My name is pronounced like “tone” (rhymes with “bone”), and not like the “toon” in “cartoon”, but usually I’m already happy if people remember my name 😉.
I’ve been employed by GitLab for more than 8 years, and since late 2024 I’ve been part of the Git team, contributing to the Git project. I started my professional career in 2008 building software for a payment terminal running embedded GNU/Linux using C & C++. Later I transitioned into doing web and mobile development for a while. And now recently, I’ve been circling back to more lower-level programming, contributing to Git using C.
What would you name your most important contribution to Git?
I’m fairly new in the Git community, but recently I’ve been working on adding git last-modified(1). It’s a sub-command that will be released in Git v2.52. This command finds the commit that last modified each path in a tree. It can be used on forges (like GitLab, GitHub, Codeberg), to show commit data in a tree view.
What are you doing on the Git project these days, and why?
The subcommand git last-modified(1) was recently merged into ‘master’. But there’s more work to be done to improve its performance.
If you could get a team of expert developers to work full time on something in Git for a full year, what would it be?
Once data is committed to Git, and it’s made part of the history (i.e. committed or merged into the default branch), it’s trapped forever. This is a core principle of Git: you cannot rewrite history without changing commit hashes. This is very powerful, but also problematic in some scenarios.
For example, at my $DAYJOB we commonly receive requests from customers to remove confidential or sensitive information from a Git repository. This is not possible without rewriting history. Or when, by accident, large files are committed to Git, you cannot get them out (without rewriting history). Or people might want to remove/change their personal information in a repository, for example when they transition genders.
Can we (and should we?) build something that removes and overwrites pieces of history, without changing commit hashes? It’s a slippery slope, because from experience I know Git users are very creative and might use this feature in ways it was not intended for.
If you could remove something from Git without worrying about backwards compatibility, what would it be?
The use of the double .. and triple ... dot notation in gitrevisions(7) and git diff(1). I even once ranted about it in a video.
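As background for readers, the sketch below illustrates one reason the notation trips people up: the dots mean different things to git log and to git diff. It uses a throwaway repository in a temporary directory. The branch names main and topic, the file names and the commit messages are assumptions made up for this example.

```shell
#!/bin/sh
set -eu

# Throwaway repository; all names below are illustrative only.
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name "Example"
git config user.email "example@example.com"
git symbolic-ref HEAD refs/heads/main   # name the initial branch explicitly

echo base > base.txt
git add base.txt
git commit -q -m "base"

git checkout -q -b topic
echo topic > topic.txt
git add topic.txt
git commit -q -m "topic work"

git checkout -q main
echo main > main.txt
git add main.txt
git commit -q -m "main work"

# git log, two dots: commits reachable from topic but not from main.
log_two=$(git log --format=%s main..topic)
# git log, three dots: commits reachable from either side but not both.
log_three_count=$(git rev-list --count main...topic)
# git diff, two dots: compares the two endpoints, same as "git diff main topic".
diff_two_count=$(git diff --name-only main..topic | grep -c .)
# git diff, three dots: compares the merge base of main and topic with topic.
diff_three=$(git diff --name-only main...topic)

echo "log   main..topic  : $log_two"
echo "log   main...topic : $log_three_count commits"
echo "diff  main..topic  : $diff_two_count files"
echo "diff  main...topic : $diff_three"
```

With git log, three dots select more (both sides of the fork) than two dots do. With git diff, three dots select less (only what topic changed since the fork point), which is the opposite of what the log behavior suggests.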
What is your favorite Git-related tool/library, outside of Git itself?
I’m a big fan of Magit. It’s arguably the best tool to interact with Git and I also learned a lot from it. I consider myself an advanced Git user, but I wouldn’t be able to split up changes in several commits without Magit.
Do you happen to have any memorable experience w.r.t. contributing to the Git project? If yes, could you share it with us?
One of my earliest contributions to Git was a bug fix in the code used by git bundle create. We noticed sometimes references didn’t end up in the bundle. After a lot of digging I submitted a patch that removed about 30 lines of code written way back in 2007. The code from 2007 caused references to be skipped if they were modified while the git bundle create process was running. But it wasn’t needed anymore due to some changes made in 2013, although no one ever realized that. You can read more about it in the patch.
It was really satisfying to submit a patch that was nothing more than code deletion of really old code (and adding some tests). And it taught me to write a good commit message, which I was praised for by the maintainer. It was a very nice experience as a newcomer in the community.
What is your toolbox for interacting with the mailing list and for development of Git?
I mostly live in Emacs and my terminal (zsh). I consume email in Emacs using notmuch. To submit patches I use b4, which I also sometimes use to pull in patches. But I also sometimes pull in the branches from Junio’s fork or the fork shared across my colleagues.
In Git, I compile and unit test changes using Meson. Its use was introduced in the Git project around the end of 2024. It’s reliable because it prevents me from forgetting to recompile before running tests; it’s fast because it parallelizes compilation by default and automatically uses Ccache; it allows out-of-tree builds, which is really convenient if you want to benchmark various revisions of Git.
What is your advice for people who want to start Git development? Where and how should they start?
Learn to navigate the mailing list archive. It lacks structure so things can be hard to find, but there’s so much information up there. If you’re interested in a topic, or you think you’ve found a bug, start digging. Use git blame(1) to find the commit that introduced the changes and look up the conversation around it in the mailing list archive. This will help you understand why some decisions are made. Also it familiarizes you with the people in the community, how they think, how they communicate, and what’s expected from you. Having the knowledge from those conversations can help you build a strong case whenever you’re submitting a feature change or bug fix.
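The blame-then-search step described above could look roughly like the sketch below. It runs against a throwaway repository with a made-up file (notes.txt) and made-up commit messages. In practice you would run the last two commands inside a clone of Git itself and then search the mailing list archive for the resulting subject line, since patches are sent with the commit subject in the email subject.

```shell
#!/bin/sh
set -eu

# Throwaway repository; file name and commit messages are illustrative only.
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name "Example"
git config user.email "example@example.com"

printf 'line one\nline two\n' > notes.txt
git add notes.txt
git commit -q -m "notes: add initial notes"

printf 'line one\nline two, revised\n' > notes.txt
git commit -q -am "notes: revise second line"

# Which commit last touched line 2? With --porcelain, the first field
# of the first output line is the commit ID.
commit=$(git blame --porcelain -L 2,2 notes.txt | head -n 1 | cut -d ' ' -f 1)

# The subject line of that commit is a good search term for the archive.
subject=$(git log -1 --format=%s "$commit")
echo "search the archive for: $subject"
```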
Various
Link: tags (for Linux), by Jonathan Corbet on LWN.net.
Light reading
git-greb script that feeds git grep to git blame (with appropriate options) in order to blame matching lines.
what-changed-twice needs a new name by Mark Dominus (陶敏修) on his The Universe of Discourse blog, about the tool to help get related changes into the same commit.
git clone operations in any environment, reducing clone time and disk space, with the Git Much Faster script.
jj) is a Git-compatible version control system written in Rust, first mentioned in Git Rev News Edition #85.
git log -S.
.git/info/exclude.
git push and git fetch with multiple URLs, and jj git push --all-remotes.
sourcegraph/sourcegraph repository went private (there is a public snapshot available at sourcegraph/sourcegraph-public-snapshot, which is a read-only archived repository).
Slightly heavier reading
Easy watching
Git tools and sites
Written in Scala, under an MIT license.
git-sqlite is a collection of shell scripts, including a custom diff and merge driver for SQLite, that allows a SQLite database to be tracked using the Git version control system. Under MIT license.
This edition of Git Rev News was curated by Christian Couder <christian.couder@gmail.com>, Jakub Narębski <jnareb@gmail.com>, Markus Jansen <mja@jansen-preisler.de> and Kaartic Sivaraam <kaartic.sivaraam@gmail.com> with help from Toon Claes, Johannes Schindelin, Bruno Brito, Gerard Murphy, Jack Lot, Ben Knoble and Štěpán Němec.