Welcome to the 76th edition of Git Rev News, a digest of all things Git. For our goals, the archives, the way we work, and how to contribute or to subscribe, see the Git Rev News page on git.github.io.
This edition covers what happened during the month of May 2021.
[BUG] Unix Builds Requires Pthread Support
Not long after 2.32.0-rc0 was released, Randall S. Becker reported
to the mailing list that
a patch series
from Jeff Hostetler, which introduced a “Simple IPC Mechanism”,
broke the build for the NonStop x86 and ia64 platforms. This build
defines NO_PTHREADS
, as supporting pthreads on these platforms is
considered to cause “a bunch of other issues”.
It seems that Jeff patch series has introduced a file, called
ipc-unix-socket.c
, which contains a call to the
pthread_sigmask()
function part of the pthreads library, which is
of course not linked to when NO_PTHREADS
is defined.
Randall suggested a “simple, but probably wrong fix” which just
surrounds the call to pthread_sigmask()
with #ifndef NO_PTHREADS
and #endif
.
Peff, alias Jeff King, replied to Randall that usually an “async” mechanism is used for tasks that can be performed by several workers in parallel, and that underneath this mechanism can be implemented both using different processes and using different threads. At build time, depending on the availability of pthread, one or the other implementation is selected.
Peff couldn’t tell though, if the “async” interface that the IPC
mechanism defines can actually be implemented using
processes. Anyway he proposed an improved patch to fix the build by
just removing the files implementing the mechanism from the build if
NO_PTHREADS
if defined, similar to dealing with NO_UNIX_SOCKETS
.
Jeff Hostetler, who had implemented the IPC mechanism, then replied to Peff that the mechanism was heavily threaded and that there was no point in trying to “fake it” with processes. So he agreed with Peff’s patch, which conditionally removes it from the build.
Peff replied to Jeff asking if he wanted to pick his patch up from there and produce a polished patch before the 2.32.0 release. Jeff then agreed to do that, and sent a more elaborate patch based on Peff’s patch and on some discussions about it that happened in the meantime between Peff and Randall.
Junio reviewed Jeff’s patch and made some suggestions, which after further discussion were integrated in version 2 of the patch.
A version 3 soon followed to fix the build for people using CMake instead of make. This version was merged before 2.32.0-rc1.
Who are you and what do you do?
I’m Han-Wen Nienhuys. I work at Google, where I manage the Gerrit team. If you don’t know yet, Gerrit is the Git-based code review system that Google uses for large open source projects, such as Android and Chrome.
For a while, Shawn Pearce was my boss. I got to know him because of previous Git adventures at Google: I created git5, a wrapper around our internal Perforce (‘p4’) deployment as a side project. You can still see some traces of this, if you look at the commit history of git-p4.
What would you name your most important contribution to Git?
I didn’t contribute many patches to Git directly, but here is my biggest one so far:
Gerrit accepts code for review through a git push
command, and user
studies showed that Gerrit’s error messages didn’t stand out among
the verbose terminal output. I added colorization for keywords like
“ERROR” and “WARNING” to draw more attention.
What are you doing on the Git project these days, and why?
Gerrit uses Git as a database, storing all review metadata (comments, LGTMs) in Git itself, which is extremely cool, but it also creates extremely ..interesting.. scaling problems for Git and JGit.
One area of scaling is the ref database: Gerrit stores both metadata and each version of a code review in a separate ref. As a result, large projects, such as Chrome, now have several millions of refs in their repository. Handling those efficiently is a challenge. Shawn designed reftable to solve these problems, and partly implemented it in JGit, but he never got round to updating the Git project itself to use the format.
At the end 2019, I thought it would be an interesting and fun project to drive that project further. I did severely underestimate how complicated it would be to do brain surgery on Git itself, though, so 1.5 years later, it still hasn’t merged. Working on Git itself (as opposed to Gerrit, and managing the team) is a side project, so progress hasn’t been as fast as I’d like it to be.
If you could get a team of expert developers to work full time on something in Git for a full year, what would it be?
Rewrite Git on top of libgit2. Or even better, rewrite it in Go. Hacking on Git is a fun trip down memory lane, because the last time I wrote C seriously, I was half my current age. It doesn’t feel very productive though, in part because of limitations of the language (memory management, lack of strings/maps/GC etc.).
I realize that’s ambitious, and would turn off a lot of the current contributors, however, I think much could also be gained by structuring the program better (e.g. banish global variables, introduce more unit tested abstraction boundaries), and that could be achieved by rewriting parts on top of libgit2.
If you could remove something from Git without worrying about backwards compatibility, what would it be?
The packed/loose ref backend. All of Git storage is user-accessible
(under the .git
directory), but the packed/loose ref storage is
extra insidious, because you read/write it using cat
and echo
,
which invites people to break abstraction boundaries.
What is your favorite Git-related tool/library, outside of Git itself?
It’s Gerrit, of course.
Various
The 2021 Gerrit Community Survey results have been published, showing a robust increase of the overall number of users and the adoption of the most recent versions of Gerrit Code Review.
Got typos? Git 2.32 lands, finally offers way to reword commits [easily].
Light reading
Crypto Mining is Killing All Free CI/CD Platforms by Davide ‘CoderDave’ Benvegnù is a blog post on Dev.to and a video on YouTube. The platforms affected include GitLab, TravisCI, CircleCI and GitHub Actions; the CI providers have limited their free tiers, or have to spend much effort on combating crypto miners abusing the platform in response.
Pulling GitHub into the kernel process by Jake Edge on LWN.net (free link for not subscribed) describes attempts to make it possible to submit patches to the Linux kernel via GitHub pull requests (PR). While at it, it also mentions various attempts and tools to improve an email-based workflow.
A Random Walk Through Git by Bakken & Bæck is a weird in-depth tour through Git and some of its internals, including reflog, interactive rebase, bisect and rerere.
How to never type passwords when using Git by Michelle Mannering on Dev.to.
Confusing Terms in the Git Terminology by Pragati Verma on Dev.to.
Git Good - The magic of keeping a clean Git history post by Chris Manson is designed to help you form a solid mental model while working with Git both professionally and in an open source project, and how to ensure you are following best practices to make the process easier for everyone.
Merge vs rebase by Evandro Myller on Dev.to.
Git Log: How to Use It by James Gallagher (2020), one of Git tutorials on Career Karma blog.
Fixing basic mistakes with Git by Abhinav Pandey on Dev.to.
After CRUD: Intro to Git and basic workflows by John Mosesman (2019).
Git tools and sites
A Git history visualization page by Jeff Palmer <jeffrey.palmer@acm.org> shows “An Interactive Development History” of Git in three columns: project and contributor statistics, relative cumulative contributions by contributor, and aggregated commits by contributor by month with milestone annotations. Jeff wrote an associated blog post about how he created the visualization, and he’s also looking for feedback and ideas for milestones or features he could add.
git undo: We can do better
describes the git-undo
tool that is
a part of the git-branchless suite of tools;
it can undo bad merges and rebases with ease, and even some rarer operations.
This git undo
tool was made possible by a recent addition to Git:
the reference-transaction hook
(not yet present on the GitHooks.com site, mentioned
in Git Rev News Edition #43;
the reference-transaction hook was described in #75).
The GitUp client also supports undo/redo via snapshots, also by adding additional plumbing on top of Git. This tool was first mentioned in Git Rev News #5.
The undo
Git alias described in Git Undo
post by Enrico Campidoglio simply uses the reflog, and is less capable. This post
was mentioned in Git Rev News #19.
git-branchless is a suite of tools, written in Rust, to help you visualize, navigate, manipulate, and repair your commit history. It’s based off of the branchless Mercurial workflows at large companies such as Google and Facebook.
This edition of Git Rev News was curated by Christian Couder <christian.couder@gmail.com>, Jakub Narębski <jnareb@gmail.com>, Markus Jansen <mja@jansen-preisler.de> and Kaartic Sivaraam <kaartic.sivaraam@gmail.com> with help from Han-Wen Nienhuys, Luca Milanesio and Jeffrey Palmer.