Welcome to the 85th edition of Git Rev News, a digest of all things Git. For our goals, the archives, the way we work, and how to contribute or to subscribe, see the Git Rev News page on git.github.io.
This edition covers what happened during the month of February 2022.
[PATCH] http API: fix dangling pointer issue noted by GCC 12.0
Ævar Arnfjörð Bjarmason sent a patch to the mailing list to fix an issue that appears when compiling Git with the gcc 12.0 development branch.
The issue is that this gcc version emits a warning on the last line of the following code:
int finished = 0;
slot->finished = &finished;
where slot
is an instance of struct active_request_slot
which
has an int *finished
member.
The warning is “storing the address of local variable ‘finished’ in ‘*slot.finished’ [-Werror=dangling-pointer=]”.
Ævar first thought about making the struct active_request_slot
member an
int finished
member, so it wouldn’t be a pointer anymore, and then about
removing the int *finished
member entirely and using the
int in_use
member instead. His patch implemented the latter.
Taylor Blau reviewed Ævar’s patch but wasn’t sure about one of the
places where the int *finished
member was removed.
Junio Hamano also reviewed the patch and wondered if the new gcc
version was correct in flagging the code in the first place, as it
appeared that slot->finished
wasn’t used outside the function
where it’s assigned the address of a local variable.
Junio also wondered if it was correct to remove the int *finished
member entirely and use the int in_use
member instead, as that
looked like reversing the patch that introduced int *finished
without fully explaining why it was correct to do so.
To address the issue Junio then proposed implementing the following:
if (slot->finished == &finished)
slot->finished = NULL;
so that hopefully gcc would understand that the slot->finished
field isn’t using a local address outside the current function.
Ævar replied to Junio, that assigning slot->finished
to NULL
at
the end of the function would indeed quiet gcc. So he sent
a version 2 of his patch
that only added slot->finished = NULL;
at the end of the function.
Junio wasn’t sure if the change was completely correct as he said he no longer trusted his own reading of the http code, but he agreed that it should be merged if gcc wasn’t fixed soon.
Taylor took another look at the patch and found other functions
where slot->finished
is assigned, so he wondered if a better fix
would be to change the int *finished
member into a tri-state
“UNKNOWN/TRUE/FALSE enum”.
Ævar then sent a version 3 of his patch
where he only improved the commit message compared to the previous
version. In the new commit message he explains why the other places
where slot->finished
is assigned are already correct and don’t
need any change.
Taylor then agreed with Ævar’s explanations, but Junio wasn’t sure
about them, so he changed the commit message a bit, explaining how
we could make sure that the change to clear slot->finished
is
correct by adding the following code before it:
if (slot->finished)
BUG("the unconditional clearing at the end is wrong");
For now the patch has still not been merged into the next
branch
as Junio hopes that gcc 12.0 will be fixed before it’s actually
released. But in case it isn’t, at least we are ready for it.
Who are you and what do you do?
Hi! My name is Victoria Dye, and I work as a software developer at GitHub. I
spend a lot of my time contributing to Git & some of its downstream forks
(Git for Windows and microsoft/git
), and occasionally Git Credential
Manager (although it’s been a while). I really enjoy solving tough
performance problems in software, so I’m hoping to do more of that in Git in
the not-so-distant future!
What would you name your most important contribution to Git?
I haven’t been contributing to Git for all that long, so it’s hard to judge
the importance of anything I’ve done. That said, I’m pretty proud of the
work I’ve done around sparse index and sparse-checkout
. It’s been exciting
to contribute and see those features mature so rapidly over the past couple
months, and I hope it continues to gain traction with users looking for
better performance in Git!
What are you doing on the Git project these days, and why?
My focus pretty much since I started working on Git has been on integrating
commands with sparse index. Most recently that’s been git stash
, which not
only requires its own integration, but also the integration of all the
commands it shells out to (update-index
, read-tree
, etc.). The goal of
these initial sparse index integrations (including those by Derrick Stolee &
Lessley Dennington) is to cover the most popular commands as reported by
users at $DAYJOB (git add
, git reset
, etc.), so the theoretical
performance gains of the sparse index can turn into a real improvement to
user experience.
Beyond my own work, I’m looking forward to mentoring a GSoC contributor in more sparse index integration! It’s an admittedly complicated topic, but — based on my own experience – working on it can give new contributors an incredible depth of knowledge in Git that’s hard to get elsewhere. Plus, I love mentoring developers and creating a positive environment for them, so I’m excited to have the chance to do that here!
If you could get a team of expert developers to work full time on something in Git for a full year, what would it be?
It would be great to see the shell script-based test suite rewritten into a
unit/integration/end-to-end testing stack. The tests we have are almost
exclusively end-to-end (excepting hackery with test-tool
or trace2
logs
exposing internal functions to test them), so we often have incomplete
coverage of some of the trickier code paths (e.g., unpack_trees
). And,
especially for developers that like test-driven development, it can be a
lot easier working with code that has a robust unit testing suite.
If you could remove something from Git without worrying about backwards compatibility, what would it be?
In general, I don’t think there’s a feature that I’d want to completely remove. Git is a fantastic tool in that it lets developers structure their work in whatever way works best for them; something I personally don’t have a great experience with (looking at you, submodules) might just happen to be someone else’s favorite tool.
However, I would absolutely remove an implementation pattern in Git itself: shell script builtins and Git commands shelling out to other Git commands. They’re prone to bugs, difficult to troubleshoot, and can almost always be implemented more efficiently by commonizing the code they’re trying to get out of another command.
What is your favorite Git-related tool/library, outside of Git itself?
I really, really like my current setup of VSCode + GitLens. I used to have a bit of a “text editor is for editing text, Git command line is for source control” attitude and avoided using any integrations between the two, but recently I gave in and installed GitLens. I tend to heavily use the commit history when investigating bugs & learning a new area of the code, so the inline ‘blame’ and commit-message-on-hover in particular make that process about as fast it could be.
What does your mailing list workflow look like?
For submissions, I use GitGitGadget.
It really simplifies the process of submitting to Git (especially re-rolls)
and, if anything, it prevents me from making silly mistakes with
git send-email
.
For everything else, I have two email inbox folders: things addressed to me directly (submission feedback, patches I’ve been CC’d on) and “everything else”. For direct feedback/CCs, I try to respond within a couple days if I think I can contribute something valuable to the discussion. The “everything else” folder is set to “automatically mark as read” to avoid a distracting barrage of “unread”s in my inbox, so I typically scan it a couple times a day to stay up-to-date and check whether there’s anything I can help with or review.
Do you think contributing to Git has a high entry barrier? If yes, do you have any suggestions to improve the experience for new contributors?
In my experience, not really (at least, not so high that it excludes would-be contributors). But I think it might seem that way because the mailing list can be super intimidating when you’re looking to get started.
The mailing list is constantly active with complex patches and lots of experienced developers iterating on complicated feedback, so it can be easy to feel as though you can’t contribute much to it or that your code won’t meet the project’s standards. There have been efforts in the past to guide new contributors and help them technically learn the project, but it ultimately comes down to the first interactions someone has with the rest of the mailing that that determines whether or not they stick with it.
In terms of suggestions for improvement, I think it’s important for reviewers to maintain a supportive tone in even highly constructive reviews. A lot can get lost in text communication, so sometimes it’s worth explicitly calling out your appreciation for a contribution. This is especially true for new contributors, but is just as applicable to any reviews.
Another (more concrete) suggestion would be to make it easier for new contributors to find appropriate “starter” features/bugs of different sizes. Other open source projects have issue trackers with a “good first project” tag (and, as a new contributor on other projects, I’ve definitely appreciated them). Although there have been discussions about creating public issue trackers outside the mailing list, to my knowledge nothing like that exists yet for Git, let alone a subset for new contributors.
What is your advice for people who want to start Git development? Where and how should they start?
If you’re not coming in with a clear project or bugfix in mind, it can be really difficult to figure out where to start. If you’re in that position, something that has helped me in the past has been to just start testing things until they break. Read up on a command you find interesting, look for scenarios (e.g. options, sequences) that aren’t well-tested in our existing tests, and start experimenting! Worst case, you can improve test coverage and help us avoid issues in the future; a lot of times, though, you’ll find a bug or an opportunity to add a new option. In any event, the experience will teach you a ton about Git and make you more comfortable with contributing.
And once you have a working change, don’t be afraid to submit it out of fear that it’s not “good enough”. The goal we all share is to make Git as good as it can possibly be, so your contribution – no matter how small or trivial you think it is – is a valuable part of what makes this project succeed.
If there’s one tip you would like to share with other Git developers, what would it be?
Keep learning and trying new things! The Git development community has a vast range of experience across its members – from people who worked on it since the beginning to people making their first open source contribution ever. But it’s also an easy place to get stuck in “your niche” (if you’re lucky, you get stuck with a couple of other people so you can review each others’ work).
Of course, none of us have infinite time to spend on Git. But every once in a while, take a minute to read a contribution from someone you don’t know on a topic you’re not familiar with. Or investigate a bug report on a command you’ve never worked on! Or share your idea for a process we could try (like the Git Review Club)! Not only will you learn something new and improve yourself as a developer, but the collaboration you foster will help all of us improve Git in the long term.
Various
Light reading
Git tools and sites
split-patch.py
tool, which is git add -p
on steroids for patches (*.patch
files),
allowing to choose in which bucket to put each hunk, and
the git-split-commit
command (which can be used from interactive rebase)
which uses split-patch.py
to split the HEAD commit into multiple commits.jj
) is an experimental Git-compatible DVCS.
It combines features from Git (data model, speed),
Mercurial (anonymous branching, simple CLI free from “the index”, revsets, powerful history-rewriting),
and Pijul/Darcs (first-class conflicts), with features not found in either of them.This edition of Git Rev News was curated by Christian Couder <christian.couder@gmail.com>, Jakub Narębski <jnareb@gmail.com>, Markus Jansen <mja@jansen-preisler.de> and Kaartic Sivaraam <kaartic.sivaraam@gmail.com> with help from Victoria Dye and Bruno Brito.