Welcome to the 72nd edition of Git Rev News, a digest of all things Git. For our goals, the archives, the way we work, and how to contribute or to subscribe, see the Git Rev News page on git.github.io.
This edition covers what happened during the month of January 2021.
[PATCH 0/5] Support for commits signed by multiple algorithms
brian m. carlson sent a patch series to allow verifying signed commits and tags when using multiple hash algorithms. This is a follow up from brian’s multi-year work on supporting the SHA-256 hash algorithm in Git, to deal with the fact that the original SHA-1 algorithm is becoming more and more outdated and insecure.
One of the trickiest parts of supporting a new hash algorithm is that when Git objects (except blobs) are converted to the new hash, their contents change, because the hashes they contain (to reference other Git objects) are converted too. So the old signatures they contain become invalid.
A way to overcome this issue is to add a new signature, one that signs the converted object, to each signed object that is converted. This way such an object would have two signatures, and could always be verified using one of them, even if it gets converted back and forth.
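To make the problem concrete, here is a minimal Python sketch of why a signature made over the SHA-1 form of a commit cannot cover its SHA-256 form. It relies on the fact that Git names an object by hashing its type, its size, a NUL byte, and then its contents. The `object_id` and `commit_body` helpers, as well as the author identity and timestamp, are made up for illustration and are not part of Git or of brian's series.

```python
import hashlib


def object_id(obj_type: str, body: bytes, algo: str) -> str:
    """Hash '<type> <size>\\0<body>', which is how Git names an object."""
    header = f"{obj_type} {len(body)}".encode() + b"\0"
    return hashlib.new(algo, header + body).hexdigest()


def commit_body(tree_id: str) -> bytes:
    """Build a minimal commit whose only object reference is its tree."""
    # The identity and timestamp below are invented for this example.
    who = "A U Thor <author@example.com> 1612137600 +0000"
    text = f"tree {tree_id}\nauthor {who}\ncommitter {who}\n\nInitial commit\n"
    return text.encode()


# The empty tree is a convenient object to hash under both algorithms.
tree_sha1 = object_id("tree", b"", "sha1")
tree_sha256 = object_id("tree", b"", "sha256")

# Same logical commit, but the embedded tree ID differs, so the bytes differ.
commit_sha1_form = commit_body(tree_sha1)
commit_sha256_form = commit_body(tree_sha256)

print(tree_sha1)
print(commit_sha1_form == commit_sha256_form)
```

Since a signature is computed over the object's bytes, the two forms of the commit each need their own signature, which is the situation the series deals with.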
brian’s patch series addressed the issue that for SHA-256 tags it was initially planned to have the signature in a Git object header (which is called a header signature) instead of at the end of the tag message (which is called a trailing signature), but unfortunately the patch implementing that got lost. So we use trailing signatures.
brian then explained “We can’t change this now, because otherwise it would be ambiguous whether the trailing signature on a SHA-256 object was for the SHA-256 contents or whether the contents were a rewritten SHA-1 object with no SHA-256 signature at all.” So the solution he implemented was to “use the trailing signature for the preferred hash algorithm and use a header for the other variant”.
brian thinks this solution is the best we can do in the current situation, as it still allows converting back and forth between hashes, and verifying signatures created with older versions of Git, though tags signed with multiple algorithms can’t be verified with older versions of Git.
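As a rough illustration of what a trailing signature looks like from a tool's point of view, the following Python sketch splits a raw tag into the signed payload and the signature appended to the end of the tag message. It assumes an ASCII-armored OpenPGP signature only; the `split_trailing_signature` helper and the sample tag are hypothetical, and Git's own parsing handles more cases than this.

```python
PGP_BEGIN = b"-----BEGIN PGP SIGNATURE-----\n"


def split_trailing_signature(raw_tag: bytes):
    """Return (payload, signature); signature is b'' if the tag is unsigned."""
    # Only accept the marker at the start of a line, and take the last one
    # so that a marker quoted earlier in the message is left in the payload.
    idx = raw_tag.rfind(b"\n" + PGP_BEGIN)
    if idx == -1:
        return raw_tag, b""
    cut = idx + 1  # keep the newline that ends the payload
    return raw_tag[:cut], raw_tag[cut:]


# A made-up tag; the object ID and the signature body are placeholders.
sample_tag = (
    b"object 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    b"type tree\n"
    b"tag v0.0-example\n"
    b"tagger A U Thor <author@example.com> 1612137600 +0000\n"
    b"\n"
    b"Example release\n"
    b"-----BEGIN PGP SIGNATURE-----\n"
    b"\n"
    b"not-a-real-signature\n"
    b"-----END PGP SIGNATURE-----\n"
)

payload, signature = split_trailing_signature(sample_tag)
print(payload.decode(), end="")
```

In the scheme described above, the signature for the other hash algorithm would live in a header rather than in this trailing position, so a verifier has to know which of the two it is looking at.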
For commits, brian’s patch series fixes the bug that old header signatures weren’t stripped off before verifying new signatures, so verification always failed.
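The idea of stripping header signatures before verification can be sketched in Python as follows. This is not brian's implementation (Git is written in C); it only assumes that commit signatures are stored in headers named `gpgsig` and `gpgsig-sha256`, that continuation lines of a multi-line header start with a space, and that the payload to verify is the commit with those headers removed. The `strip_headers` helper and the sample commit are hypothetical.

```python
# Header names assumed for the SHA-1 and SHA-256 signatures respectively.
SIGNATURE_HEADERS = (b"gpgsig", b"gpgsig-sha256")


def strip_headers(raw_commit: bytes, names=SIGNATURE_HEADERS) -> bytes:
    """Return the commit with the named headers (and continuations) removed."""
    # Headers end at the first blank line; everything after is the message.
    head, sep, message = raw_commit.partition(b"\n\n")
    kept = []
    skipping = False
    for line in head.split(b"\n"):
        if line.startswith(b" "):
            # Continuation line: it shares the fate of the header above it.
            if not skipping:
                kept.append(line)
            continue
        skipping = line.split(b" ", 1)[0] in names
        if not skipping:
            kept.append(line)
    return b"\n".join(kept) + sep + message


# A made-up commit carrying one signature per hash algorithm.
sample_commit = (
    b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    b"author A U Thor <author@example.com> 1612137600 +0000\n"
    b"committer A U Thor <author@example.com> 1612137600 +0000\n"
    b"gpgsig -----BEGIN PGP SIGNATURE-----\n"
    b" \n"
    b" placeholder-for-sha1-signature\n"
    b" -----END PGP SIGNATURE-----\n"
    b"gpgsig-sha256 -----BEGIN PGP SIGNATURE-----\n"
    b" \n"
    b" placeholder-for-sha256-signature\n"
    b" -----END PGP SIGNATURE-----\n"
    b"\n"
    b"Initial commit\n"
)

signed_payload = strip_headers(sample_commit)
print(signed_payload.decode(), end="")
```

If only one of the two headers were removed, the other signature would remain inside the payload being checked, and verification would fail in the way the bug report describes.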
The result of his series is then that signing both commits and tags can now be round-tripped through both SHA-1 and SHA-256 conversions.
Junio Hamano, the Git maintainer, replied to a patch in the series suggesting using the size_t type for byte lengths, instead of unsigned long, as unsigned long was breaking 32-bit builds.
brian agreed and sent a version 2 of the series with Junio’s fix.
Junio replied to the cover letter of this series asking “How widely are SHA-256 tags in use in the real world, though?”, and if it was really too late to use a header signature for tags, as was originally planned.
brian replied:
I don’t know. I don’t know of any major hosting platform that supports them, but of course many people may be using them independently on self-hosted instances.
He also explained why he thought the choice of solution didn’t matter much: he had just noticed that old Git versions don’t properly strip header signatures, so they wouldn’t be able to verify tags or commits with multiple signatures anyway.
He ended his reply saying “there’s a lot more prep work (surprise) before we get to anything interesting.” To which Junio replied: “Uncomfortably excited to hear this ;-)”.
brian replied with an interesting summary of his in-progress work.
Gábor Szeder then reported a Clang warning, while Junio suggested more unsigned long to size_t changes.
brian then sent a version 3 of his patch series with fixes for the issues reported by Gábor and Junio, and then a few weeks later version 4 to fix another small issue.
This patch series is scheduled to be merged in the master branch soon.
Who are you and what do you do?
I am Taylor. I work at GitHub, where I spend most of my time contributing to the Git project.
What would you name your most important contribution to Git?
I’m still submitting many of the patches, but I think multi-pack reachability bitmaps will be my most important contribution to Git so far.
If they work, they’ll allow hosting providers who rely on reachability bitmaps more flexibility to control when and how they repack their repository. Today, if you want to use bitmaps, you have no choice but to repack your repository into one enormous pack. Multi-pack reachability bitmaps mean that you won’t have to, and can instead repack your repository however you want.
If I’m limited to things that I have finished, I would say that the on-disk reverse index is another good one. But that’s kind of cheating, since it’s related to multi-pack bitmaps ;).
What are you doing on the Git project these days, and why?
Most of my time is spent working on multi-pack bitmaps. There are a lot of different topics in my fork that all need to get merged in order to make this feature work. So, I’m spending my time in all of the usual ways: submitting patches, responding to review, submitting more patches, and so on.
When I’m not doing that, I enjoy reading and reviewing patches from other folks on the list. I feel like there is a lot of exciting work going on recently, and so I’m always interested.
Every release or so I have the pleasure of writing a “release highlights” blog post that GitHub publishes. We’re still a few weeks away from a release (at the time of writing), so it’s not something that I’m working on yet, but it will come up soon enough.
If you could get a team of expert developers to work full time on something in Git for a full year, what would it be?
I’m not sure :). I care a lot about performance on large repositories, so I think that if I were in charge of such a team, I would just set them loose to explore the boundaries of what’s possible, and push them further.
If you could remove something from Git without worrying about backwards compatibility, what would it be?
Gosh, I would love to get rid of .keep packs, or at least the distinction between in-core and on-disk ones. They’re incredibly useful, but they are subtly different, in ways that are sort of hard to reason about.
What is your favorite Git-related tool/library, outside of Git itself?
I use tig often, particularly its blame and log view. Michael Haggerty’s git when-merged is another indispensable tool in my workflow. I’m curious to try Stephen Jung’s git absorb tool, but I haven’t gotten to it yet.
Various
At FOSDEM 2021 (this year the event happened in a virtual format) there was a lightning talk, “Building a Git learning game: A playful approach to version control” (video available), initiated by two students who wanted to understand Git themselves… This Oh My Git! game is Open Source and written using the Godot game engine. There are binaries for Linux, macOS, and Windows, but currently no web version, as the game uses real Git as a part of its backend (with some sandboxing).
The similar interactive online Learn Git Branching game was mentioned in Git Rev News Edition #30.
Light reading
Git tools and sites
This edition of Git Rev News was curated by Christian Couder <christian.couder@gmail.com>, Jakub Narębski <jnareb@gmail.com>, Markus Jansen <mja@jansen-preisler.de> and Kaartic Sivaraam <kaartic.sivaraam@gmail.com> with help from Taylor Blau.