Welcome to the 91st edition of Git Rev News, a digest of all things Git. For our goals, the archives, the way we work, and how to contribute or to subscribe, see the Git Rev News page on git.github.io.
This edition covers what happened during the month of September 2022.
Git Merge conference and Contributor’s Summit
The Git Merge conference happened on September 14th and 15th. On the afternoon of the first day there were optional workshops and the Git Contributor’s Summit, while the main conference took place on the second day.
As usual, the topics discussed during the Contributor’s Summit were proposed and voted on before the summit started. The discussions began with the topics that received the most votes.
Taylor Blau sent an email summarizing what happened and asking for feedback, followed by a separate email thread for each topic that had a note-taker, each starting with the topic’s broken-out notes.
Gerrit User Summit 2022
Gerrit User Summit is the event that brings together Gerrit and JGit maintainers, contributors and users, as an opportunity to network face-to-face and share news and experiences. It is back on 10-11 November 2022 in hybrid mode, with a physical venue at CodeNode in London and online attendance.
The summit is going to be recorded and published on the GerritForge YouTube channel, together with roundtables and discussions between the community members.
rev-parse: -- is sometimes a flag and sometimes an arg?
Tim Hockin sent an email to the mailing list containing a series of git rev-parse commands with some arguments that he ran on the command line, along with their results and his comments.
First he ran git rev-parse unknown-tag, which errored out after printing unknown-tag. The error message said that unknown-tag is an ambiguous argument and suggested using -- to separate paths from revisions.
So he tried git rev-parse unknown-tag --, which just errored saying that unknown-tag was a bad revision, as expected.
Unfortunately, when he then tried git rev-parse HEAD --, there was no error, as expected, but instead of printing only the SHA1 hash corresponding to HEAD, the command also printed -- on its own line after the SHA1 hash.
This made Tim wonder why -- was treated as a regular argument. He looked at the Git source code and said that it seemed intentional to treat it that way, but he didn’t understand the reason.
Junio Hamano, the Git maintainer, replied that git rev-parse was mostly a “plumbing” command designed to be used by higher level “porcelain” commands. By default, it should be able to “parse” command line arguments, and then dump them all to its output after translating “revs” into raw object names (SHA1 hashes).
As -- is a valid option for “porcelain” commands or scripts that would use git rev-parse to parse their command line arguments, it makes sense for git rev-parse to just pass -- along.
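The pass-through behavior described above can be reproduced in a scratch repository. This is only a minimal sketch: the temporary repository, the placeholder identity and the README path are assumptions made for illustration, and they are not taken from Tim's email.

```shell
set -e

# Scratch repository with a single empty commit.
# The user name and email are placeholders for this example only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.name=Example -c user.email=example@example.com \
    commit -q --allow-empty -m "initial commit"

# Default mode: the revision is translated into an object ID, while "--"
# and the path after it are echoed as they are, one argument per line.
# README does not need to exist, because arguments after "--" are not checked.
parsed=$(git rev-parse HEAD -- README)
printf '%s\n' "$parsed"
```

A script wrapping another Git command could read these lines back and forward them, which is why dropping -- from the output would lose information.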
Tim then asked if there was “a more friendly way” to do what he wanted to achieve. But Junio replied that it wasn’t clear what Tim actually wanted to do.
Tim replied that his goal was to convert a string that could contain a tag name, a branch name, or a SHA1 hash (abbreviated or not) into a canonical SHA1.
Junio suggested using git rev-parse --verify <string>, as it would either convert <string> into an object ID (a SHA1 hash by default), or it would error out. Junio also mentioned that the “EXAMPLES” section of the git rev-parse documentation has more elaborate examples.
brian m. carlson chimed in to say that git rev-parse --verify <string> would print a full object ID whether it exists in the repo or not, if <string> already contains one (for example, the all-zeros object ID).
He suggested using git rev-parse --verify <string>^{object} if Tim wanted to also verify that the object exists.
Tim thanked brian and Junio, saying that their answers helped a lot.
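The two suggestions can be combined into a small helper. The following is a sketch under assumptions: the function name resolve_rev, the tag v1.0, the scratch repository and the placeholder identity are all hypothetical, and --quiet is added here only to keep failures silent.

```shell
set -e

# Scratch repository with one commit and one lightweight tag.
# The identity and the tag name are placeholders for this example only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.name=Example -c user.email=example@example.com \
    commit -q --allow-empty -m "initial commit"
git tag v1.0

# Hypothetical helper: print the full object ID for a tag name, branch name
# or (possibly abbreviated) hash, and fail if it does not name an existing
# object. The ^{object} suffix is what adds the existence check.
resolve_rev() {
    git rev-parse --verify --quiet "$1^{object}"
}

head_id=$(git rev-parse --verify HEAD)
short_id=$(git rev-parse --short HEAD)
# An all-zeros ID of the same length as a real one in this repository.
zero_id=$(printf '%s' "$head_id" | sed 's/./0/g')

resolve_rev v1.0          # prints the full ID of the tagged commit
resolve_rev "$short_id"   # expands the abbreviated hash to the same ID

# Without ^{object}, a full-length ID is accepted even though no such
# object exists; with it, the lookup fails.
git rev-parse --verify "$zero_id"
if ! resolve_rev "$zero_id"; then
    echo "all-zeros ID rejected: no such object"
fi
```

In a script, the exit status of the helper is usually more useful than its error text, which is why the sketch silences the message and tests the result with if.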
Who are you and what do you do?
My given name is Jeff, but most people call me Peff. Even in real life. I’ve been working on Git since early 2006. For a while it was for fun and to scratch my own itches (and maybe to avoid doing my school work), but I joined GitHub in 2011, where my job was mostly about improving Git. I stopped being a full-time employee earlier this year, but I’m still working a few hours a week on Git.
How has your journey been as a long-time Git contributor? Do you happen to have any memorable experience w.r.t. contributing to the Git project?
One thing I’ve found with contributing to Git is that it sneaks up on you over time.
I still remember one moment in 2008 or 2009. In my mind, Git was something I did to procrastinate on “real” work. Shawn Pearce was organizing an in-person meeting of developers, and emailed me specifically to say that I was one of the core developers and should consider coming. I was really confused. Wasn’t this just a thing I did in my spare time? But running git shortlog showed that I was one of the top few contributors. That really changed my mindset; I realized I was part of a larger community, and that it was something I did care about.
And I have that same sense looking at how far Git has come. Day to day (and especially when you’re fixing a bug in code from 2005) it can seem like nothing changes. But when I look back over the span of 10 or 15 years, I’m amazed at the progress. Not just in terms of features in Git, but at the overall development process. The way we work and communicate has matured so much in that time. Some of that is from technical tools (new Git features, new internal APIs and data structures to avoid whole classes of bugs), but some of it is in what the people do. In my opinion, our standards for testing and commit messages have gone up considerably over the years.
Git Merge ended a few days ago. Any takeaways from the conference that you would like to share?
To me, the most important part of Git Merge is making connections between developers. I’m not convinced that sticking 30 people in a room is the best way to have a technical discussion, and the real work later happens solo, or on the list. But I think seeing people in person, and especially chatting with them over lunch, etc, is so helpful to that later work. We all know intellectually that there’s another person on the end of every email, but I think having met them face to face helps us empathize at a more gut level.
Of course, there were some talks, too. I tend to prefer the more technical ones, but being so involved in Git development, there doesn’t tend to be anything too surprising for me there. I thought the talks from Taylor and Elijah were nice dives into new technical material (though they both also have great blog posts that go even deeper!). Martin’s Jujutsu talk gave a lot of food for thought on different ways for people to interact with Git.
Could you share a few words regarding your experience while you were a member of the Git PLC?
I was the person who led the initial effort in 2010 to join Software Freedom Conservancy. We had gotten some money for the project as part of Google’s Summer of Code program. It was being passed around like a hot potato (between countries, even!) as somebody took responsibility for handling GSoC each year. I don’t even want to think of what we were supposed to do with it, tax-wise, but we knew it would be better with some actual structure. So that led to us joining, which led to the PLC as the committee in control of the project as an entity (and the money), and that led to handling more assets (the git-scm.com domain, donated hosting agreements from various places, the trademark).
Since the Conservancy entity isn’t directly related to code development, being on the PLC is long periods of nothing, punctuated by big threads full of boring non-coding stuff. Some of it is fun-ish, like handing out travel funds so people can come to Git Merge. Some of it I found very tedious, like discussing trademark enforcement, or code of conduct issues. I was happy to serve on the PLC for many years, but I’m also happy that other people are doing that work now.
What would you name your most important contribution to Git?
I think my biggest contribution is not any one thing, but rather being there for all of the things. There’s hardly a C file in the repository that I haven’t touched at some point, and when fixing a bug I’d often try to find solutions we could apply to the whole code base (e.g., improving an API to be less error-prone and using it consistently in other callers).
I do sometimes work on bigger features. One of the earliest things I did after starting at GitHub was overhaul our HTTP authentication and introduce the credential-helper protocol. I occasionally see other tools using a similar protocol, proving that it was either a great idea, or a seductively bad one!
What are you doing on the Git project these days, and why?
One of my favorite things in Git is to wake up, read an email on the list that says “why does Git do X when I say Y?”, dig it down to some bug or missing feature, and end up with a nice, tidy patch by lunchtime. Of course it doesn’t always go that way, but I do often enjoy these little fixes. It’s like solving a puzzle.
I also have a backlog of half-finished ideas. Some of them are garbage that I’ll probably throw away, but many of them just need a little polishing. One of them is more tunable knobs for repacking (which has been in use on GitHub’s servers for a few years already!), and another is handling negative commit timestamps (so we can finally import pre-1970 Apollo code).
If you could get a team of expert developers to work full time on something in Git for a full year, what would it be?
Arguably I had that already, so maybe past work speaks for itself. Or maybe I squandered it.
If you could remove something from Git without worrying about backwards compatibility, what would it be?
Trees should be sorted in order strictly by name, rather than directories sorting as if “/” was appended. It’s a little thing, I know, but it’s one of the few things that’s really impossible to fix because it’s baked so deep into Git’s logical model.
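The sorting rule mentioned in this answer can be observed with a tiny tree. This is only an illustrative sketch: the scratch repository and the entry names a.c, a and a0 are made up for the example, chosen because “.” sorts before “/” and “/” sorts before “0” in byte order.

```shell
set -e

# Scratch repository containing a directory "a" and two files whose names
# share its prefix. All names here are made up for the illustration.
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir a
: > a/x
: > a.c
: > a0
git add .
tree=$(git write-tree)

# Order stored in the tree object: the directory "a" is compared as if it
# were named "a/", so it lands between "a.c" and "a0".
git_order=$(git ls-tree --name-only "$tree")

# Order obtained when sorting strictly by name, byte by byte.
name_order=$(printf '%s\n' a a.c a0 | LC_ALL=C sort)

echo "tree order:     $(echo $git_order)"    # a.c a a0
echo "strict by name: $(echo $name_order)"   # a a.c a0
```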
What is your favorite Git-related tool/library, outside of Git itself?
Definitely tig. Its “blame” functionality, and especially the “re-blame from parent” feature, are so useful. I almost never run a bare git blame.
What is your toolbox for interacting with the mailing list and for developing Git itself?
I read the mailing list via mutt. I keep a local archive which I index with notmuch. I used to actually subscribe to the list, but these days I just pull the archive every few minutes from lore.kernel.org’s public-inbox Git repository.
I do all of my development with a fairly vanilla vim setup. I have a few niceties, like terminal hotkeys to cut and paste object hashes, and a vim function to inline output from a Git command (like converting hashes into --format=reference).
I try to share my scripts when they’re not too gross or specific to my workflows. An example there is contrib/git-jump.
I keep some other Git-specific scripts in the meta branch which I check out as the directory Meta inside my Git repository (I stole the name from Junio, who has a similar tree of scripts). I use it to rebase my topics and make my daily-driver build of Git. There’s probably not much of use there for most people, but some of it has led to useful features (e.g., our test suite’s --stress option started as a script there, though SZEDER Gábor did all the heavy lifting to integrate it).
What is your advice for people who want to start Git development? Where and how should they start?
There are a lot of ways to get involved in open source, but I think the best one is scratching your own itch. Pick something you want the tool to do, and work on it. That’s probably harder with Git these days than it was when I started, just because the system is larger and more complex, and so much of the low-hanging fruit has already been picked.
A similar way is just reading the list and looking for bug reports. Once you learn about a problem, then it becomes your itch.
Of course it’s fine to start work on a much larger project if you like. But following my “sneaks up on you” philosophy from above, if you work on enough small things, you will eventually find yourself quite comfortable with the code base, and able to work on larger things.
If there’s one tip you would like to share with other Git developers, what would it be?
Re-read your emails before sending! Obviously it’s nice to catch typos and other simple proofreading errors. But it’s also a final chance to make sure you are saying what you want clearly and concisely, and that you understand what the other person is saying.
I can’t count the number of times that I’ve almost sent out a very confused explanation in a commit message, and upon re-reading realized that not only was there a better way to explain it, but a better way to write the code. It’s also one of the reasons I like writing verbose commit messages. Trying to justify the decisions you’ve made in writing a patch is often the moment you realize that your arguments are weak.
Likewise, there have been many times when I’m about to respond to somebody along the lines of “I think you’re wrong, and here’s why”. And upon re-reading I realize that I did not understand their point in the first place. Of course if everybody remains polite, then hopefully the error works its way to a shared understanding eventually. But besides saving everybody time, catching a misunderstanding before sending means you’re wrong on the Internet one less time!
Of course, ending the interview with this tip gives an almost certain probability that I have a typo somewhere above. So maybe one more tip: be humble. And remember to have fun. Oops, that’s two tips.
Various
Light reading
git subtree command, or use the subtree merge strategy, or the ort merge strategy with subtree[=<path>] strategy option.
linguist-generated gitattribute to exclude generated files from language statistics.
Git tools and sites
b4 send to send patch series to a mailing list; see Konstantin Ryabitsev’s announcement.
ghq provides a way to organize remote repository clones, like go get does; for example, when cloning, it makes a directory under a specific root directory. Written in Go.
git-of-theseus is a set of scripts to analyze how a Git repo grows over time.
This edition of Git Rev News was curated by Christian Couder <christian.couder@gmail.com>, Jakub Narębski <jnareb@gmail.com>, Markus Jansen <mja@jansen-preisler.de> and Kaartic Sivaraam <kaartic.sivaraam@gmail.com> with help from Peff (Jeff King), Bruno Brito and Luca Milanesio.