Git Rev News Edition 91 (September 30th, 2022)

Git Rev News: Edition 91 (September 30th, 2022)

Welcome to the 91st edition of Git Rev News, a digest of all things Git. For our goals, the archives, the way we work, and how to contribute or to subscribe, see the Git Rev News page on git.github.io.

This edition covers what happened during the month of September 2022.

Discussions

General

Git Merge conference and Contributor’s Summit

The Git Merge conference happened on September 14th and 15th. On the afternoon of the first day there were optional workshops and the Git Contributor’s Summit, while the main conference took place on the second day.

As usual the topics that were discussed during the Contributor’s Summit were proposed and voted on before the summit started. The discussions started with the topics with the most votes.

Taylor Blau sent an email summarizing what happened and asking for feedback, followed by an own email thread for each topic that had a note-taker, starting with the topic’s broken-out notes.
Gerrit User Summit 2022

Gerrit User Summit is the event that brings together Gerrit and JGit maintainers, contributors and users, as an opportunity to network face-to-face and share news and experiences. It is now back on the 10-11 November 2022 in hybrid mode with a physical venue in London at CodeNode and online.

The summit is going to be recorded and published on the GerritForge YouTube channel, together with roundtables and discussions between the community members.

Support

rev-parse: -- is sometimes a flag and sometimes an arg?

Tim Hockin sent an email to the mailing list containing a series of git rev-parse commands with some arguments that he ran on the command line, along with their results and his comments.

First he ran git rev-parse unknown-tag, which errored out after printing unknown-tag. The error message said that unknown-tag is an ambiguous argument and suggested to use -- to separate paths from revisions.

So he tried git rev-parse unknown-tag --, which just errored saying that unknown-tag was a bad revision, as expected.

Unfortunately when he then tried git rev-parse HEAD --, there was no error, as expected, but instead of printing only the SHA1 hash corresponding to HEAD, the command also printed -- on its own line after the SHA1 hash.

This made Tim wonder why -- was treated as a regular argument. He looked at Git source code and said that it seemed intentional to treat it that way, but he didn’t understand the reason.

Junio Hamano, the Git maintainer, replied that git rev-parse was mostly a “plumber” command designed to be used by higher level “porcelain” commands. By default, it should be able to “parse” command line arguments, and then dump them all to its output after translating “revs” into raw object names (SHA1 hashes).

As -- is a valid option for “porcelain” commands or scripts that would use git rev-parse to parse their command line arguments, it makes sense for git rev-parse to just pass -- along.

Tim then asked if there was “a more friendly way” to do what he wanted to achieve. But Junio replied that it wasn’t clear what Tim actually wanted to do.

Tim replied that his goal was to convert a string that could contain a tag name, a branch name, or a SHA1 hash (abbreviated or not) into a canonical SHA1.

Junio suggested using git rev-parse --verify <string>, as it would either convert <string> into an object ID (a SHA1 hash by default), or it would error out. Junio also mentioned that the “EXAMPLES” section has more elaborate examples.

brian m. carlson chimed in to say that git rev-parse --verify <string> would print a full object ID whether it exists in the repo or not, if <string> already contains one (for example, the all-zeros object ID). He suggested using git rev-parse --verify <string>^{object} if Tim wanted to also verify that the object exists.

Tim thanked brian and Junio saying that their answer helps a lot.

Developer Spotlight: Jeff King (alias Peff)

Who are you and what do you do?

My given name is Jeff, but most people call me Peff. Even in real life. I’ve been working on Git since early 2006. For a while it was for fun and to scratch my own itches (and maybe to avoid doing my school work), but I joined GitHub in 2011, where my job was mostly about improving Git. I stopped being a full-time employee earlier this year, but I’m still working a few hours a week on Git.
How has your journey been as a long-time Git contributor? Do you happen to have any memorable experience w.r.t. contributing to the Git project?

One thing I’ve found with contributing to Git is that it sneaks up on you over time.

I still remember one moment in 2008 or 2009. In my mind, Git was something I did to procrastinate on “real” work. Shawn Pearce was organizing an in-person meeting of developers, and emailed me specifically to say that I was one of the core developers and should consider coming. I was really confused. Wasn’t this just a thing I did in my spare time? But running git shortlog showed that I was one of the top few contributors. That really changed my mindset; I realized I was part of a larger community, and that it was something I did care about.

And I have that same sense looking at how far Git has come. Day to day (and especially when you’re fixing a bug in code from 2005) it can seem like nothing changes. But when I look back over the span of 10 or 15 years, I’m amazed at the progress. Not just in terms of features in Git, but at the overall development process. The way we work and communicate has matured so much in that time. Some of that is from technical tools (new Git features, new internal APIs and data structures to avoid whole classes of bugs), but some of it is in what the people do. In my opinion, our standards for testing and commit messages have gone up considerably over the years.
Git Merge got over a few days ago. Any takeaways from the conference that you would like to share?

To me, the most important part of Git Merge is making connections between developers. I’m not convinced that sticking 30 people in a room is the best way to have a technical discussion, and the real work later happens solo, or on the list. But I think seeing people in person, and especially chatting with them over lunch, etc, is so helpful to that later work. We all know intellectually that there’s another person on the end of every email, but I think having met them face to face helps us empathize at a more gut level.

Of course, there were some talks, too. I tend to prefer the more technical ones, but being so involved in Git development, there doesn’t tend to be anything too surprising for me there. I thought the talks from Taylor and Elijah were nice dives into new technical material (though they both also have great blog posts that go even deeper!). Martin’s Jujutsu talk gave a lot of food for thought on different ways for people to interact with Git.
Could you share a few words regarding your experience while you were a member of the Git PLC?

I was the person who led the initial effort in 2010 to join Software Freedom Conservancy. We had gotten some money for the project as part of Google’s Summer of Code program. It was being passed around like a hot potato (between countries, even!) as somebody took responsibility for handling GSoC each year. I don’t even want to think of what we were supposed to do with it, tax-wise, but we knew it would be better with some actual structure. So that led to us joining, which led to the PLC as committee in control of the project as an entity (and the money), and that led to handling more assets (the git-scm.com domain, donated hosting agreements from various places, the trademark).

Since the Conservancy entity isn’t directly related to code development, being on the PLC is long periods of nothing, punctuated by big threads full of boring non-coding stuff. Some of it is fun-ish, like handing out travel funds so people can come to Git Merge. Some of it I found very tedious, like discussing trademark enforcement, or code of conduct issues. I was happy to serve on the PLC for many years, but I’m also happy that other people are doing that work now.
What would you name your most important contribution to Git?

I think my biggest contribution is not any one thing, but rather being there for all of the things. There’s hardly a C file in the repository that I haven’t touched at some point, and when fixing a bug I’d often try to find solutions we could apply to the whole code base (e.g., improving an API to be less error-prone and using it consistently in other callers).

I do sometimes work on bigger features. One of the earliest things I did after starting at GitHub was overhaul our HTTP authentication and introduce the credential-helper protocol. I occasionally see other tools using a similar protocol, proving that it was either a great idea, or a seductively bad one!
What are you doing on the Git project these days, and why?

One of my favorite things in Git is to wake up, read an email on the list that says “why does Git do X when I say Y?”, dig it down to some bug or missing feature, and end up with a nice, tidy patch by lunchtime. Of course it doesn’t always go that way, but I do often enjoy these little fixes. It’s like solving a puzzle.

I also have a backlog of half-finished ideas. Some of them are garbage that I’ll probably throw away, but many of them just need a little polishing. One of them is more tunable knobs for repacking (which has been in use on GitHub’s servers for a few years already!), and another is handling negative commit timestamps (so we can finally import pre-1970 Apollo code).
If you could get a team of expert developers to work full time on something in Git for a full year, what would it be?

Arguably I had that already, so maybe past work speaks for itself. Or maybe I squandered it.
If you could remove something from Git without worrying about backwards compatibility, what would it be?

Trees should be sorted in order strictly by name, rather than directories sorting as if “/” was appended. It’s a little thing, I know, but it’s one of the few things that’s really impossible to fix because it’s baked so deep into Git’s logical model.
What is your favorite Git-related tool/library, outside of Git itself?

Definitely tig. Its “blame” functionality, and especially the “re-blame from parent” feature, are so useful. I almost never run a bare git blame.
What is your toolbox for interacting with the mailing list and for developing Git itself?

I read the mailing list via mutt. I keep a local archive which I index with notmuch. I used to actually subscribe to the list, but these days I just pull the archive every few minutes from lore.kernel.org’s public-inbox Git repository.

I do all of my development with a fairly vanilla vim setup. I have a few niceties, like terminal hotkeys to cut and paste object hashes, and a vim function to inline output from a Git command (like converting hashes into --format=reference).

I try to share my scripts when they’re not too gross or specific to my workflows. An example there is contrib/git-jump. I keep some other Git-specific scripts in the meta branch which I check out as the directory Meta inside my Git repository (I stole the name from Junio, who has a similar tree of scripts). I use it to rebase my topics and make my daily-driver build of Git. There’s probably not much of use there for most people, but some of it has led to useful features (e.g., our test suite’s --stress option started as a script there, though SZEDER Gábor did all the heavy lifting to integrate it).
What is your advice for people who want to start Git development? Where and how should they start?

There are a lot of ways to get involved in open source, but I think the best one is scratching your own itch. Pick something you want the tool to do, and work on it. That’s probably harder with Git these days than it was when I started, just because the system is larger and more complex, and so much of the low-hanging fruit has already been picked.

A similar way is just reading the list and looking for bug reports. Once you learn about a problem, then it becomes your itch.

Of course it’s fine to start work on a much larger project if you like. But following my “sneaks up on you” philosophy from above, if you work on enough small things, you will eventually find yourself quite comfortable with the code base, and able to work on larger things.
If there’s one tip you would like to share with other Git developers, what would it be?

Re-read your emails before sending! Obviously it’s nice to catch typos and other simple proofreading errors. But it’s also a final chance to make sure you are saying what you want clearly and concisely, and that you understand what the other person is saying.

I can’t count the number of times that I’ve almost sent out a very confused explanation in a commit message, and upon re-reading realized that not only was there a better way to explain it, but a better way to write the code. It’s also one of the reasons I like writing verbose commit messages. Trying to justify the decisions you’ve made in writing a patch is often the moment you realize that your arguments are weak.

Likewise, there have been many times when I’m about to respond to somebody along the lines of “I think you’re wrong, and here’s why”. And upon re-reading I realize that I did not understand their point in the first place. Of course if everybody remains polite, then hopefully the error works its way to a shared understanding eventually. But besides saving everybody time, catching a misunderstanding before sending means you’re wrong on the Internet one less time!

Of course, ending the interview with this tip gives an almost certain probability that I have a typo somewhere above. So maybe one more tip: be humble. And remember to have fun. Oops, that’s two tips.

Releases

Git 2.38.0-rc2, 2.38.0-rc1, 2.38.0-rc0, 2.37.3
Git for Windows 2.38.0-rc2(1), 2.38.0-rc1(1), 2.38.0-rc0(1), 2.37.3(1)
Bitbucket Server 8.4
GitHub Enterprise 3.6.2, 3.5.6, 3.4.9, 3.3.14, 3.2.19, 3.6.1, 3.5.5, 3.4.8, 3.3.13, 3.2.18
GitLab 15.4 15.3.3, 15.3.2, 15.2.4 and 15.1.6
GitHub Desktop 3.0.8, 3.0.7
Tower for Mac 9.0 (What’s New in Tower 9 video)
Tower for Windows 3.4

Other News

Various

SSH commit verification now supported in GitHub.
1Password now allows you to set up and use SSH keys to sign Git commits.
- See also SSH and Git, meet 1Password, mentioned in Git Rev News Edition #85.
New options for controlling the default commit message when merging a pull request via web interface in GitHub.
Merge commits now created using the merge-ort strategy on GitHub.
35,000 code repos not hacked—but clones flood GitHub to serve malware, and GitHub blighted by “researcher” who created thousands of malicious projects.
Gerrit Code Review v3.7.0 release plan has been published.

Light reading

Scaling Git’s garbage collection by Taylor Blau on GitHub Blog: a tour of recent work to re-engineer Git’s garbage collection process to scale to GitHub’s largest and most active repositories.
A primer on Roaring bitmaps: what they are and how they work by Vikram Oberoi.
- Adding support for Roaring Bitmaps to Git was a part of Reachability bitmap improvements Google Summer of Code 2022 project by Abhradeep Chakraborty, see for example their GSoC Final Report on their Medium blog.
Enable Gitsign Today and Start Signing your Commits with so called keyless signing, that is signing that relies on ephemeral keys. Article by Erika Heidi on DEV.to.
Switching git back to GPG signing from SSH key signing, by Seth Michael Larson on his own blog.
- See also Signing Git Commits with SSH Keys, mentioned in Git Rev News Edition #83.
Git signatures with SSH certificates by Matthew Garrett on his Dreamwidth journal.
Merging two GitHub repositories without losing commit history by Schalk Neethling on Mozilla Hacks blog; a simpler solution might be to use the git subtree command, or use the subtree merge strategy, or the ort merge strategy with subtree[=<path>] strategy option.
.gitignore File – How to Ignore Files and Folders in Git by Dionysia Lemonaki on freeCodeCamp.
SSH Tips and Tricks by Carlos Alexandro Becker, including how to avoid having to touch Yubikey.
Do (not) Self-Host your repos by Kay Rhodes (@masukomi), explaining pros and cons of self hosting, with a simple suggestion on what to do.
The Git Commands I Use Every Day by Wade Zimmerman on DEV.to, originally published at devmap.org, a Medium blog.
Git Best Practices – How to Write Meaningful Commits, Effective Pull Requests, and Code Reviews by Grant Weatherston on freeCodeCamp.
Error: src refspec master does not match any – How to Fix in Git by Dillion Megida on freeCodeCamp.
Rewrite your git history in 4 friendly commands, making the git history of a demo project to start (again) with an “Initial commit”. By Salma Alam-Naylor on whitep4nth3r.com (also on DEV.to).
Fixing Some Bugs in My GitHub Profile Generator by Dave Rolsky on his House Absolute(ly Pointless) blog, among others on how to use linguist-generated gitattribute to exclude generated files from languages statistics.
Things I wish everyone knew about Git (Part I) and (Part II) by Mark Dominus (陶敏修) on The Universe of Discourse blog.
Why We Built an Open Source ML (Machine Learning) Model Registry with git by Dmitry Petrov of Iterative.AI (authors of DVC, CML and MLEM) on The New Stack.
- See Git-backed Machine Learning Model Registry to bring order to chaos, mentioned in Git Rev News Edition #89.
Git – Comparing Visual Studio 2022 with MeGit/EGit and SourceTree by Mark Pelf on CodeProject.
Kaleidoscope + Tower: the perfect Git setup by Florian Albrecht.
How we built Tower 3 for Windows by Kristian Lumme on Tower’s blog.
Mastering Google (for Developers) by Bruno Brito on Tower’s blog.
10 Useful Git Commands You Should Know by Bruno Brito on Tower’s blog.
Git Interview Questions - The essential list, including answers, a part of Git Tower’s Git FAQ (with tips for the Tower Git client).
Semantic Diff for SQL by Iaroslav Zeigerman; implementation discussed in this post is now a part of the SQLGlot library.

Git tools and sites

b4: has gained contributor oriented features, like b4 send to send patch series to a mailing list; see Konstantin Ryabitsev’s announce.
Gitsign: Keyless Git signing with Sigstore, with your own GitHub / OIDC (OpenID Connect) identity. Written in Go.
ghq provides a way to organize remote repository clones, like go get does; for example when cloning it makes a directory under a specific root directory. Written in Go.
Revup provides command-line tools that allow developers to iterate faster on parallel changes and reduce the overhead of creating and maintaining code reviews; it creates multiple independent chains of branches for you in the background and then creates and manages GitHub pull requests for all those branches. Written in Python.
git-of-theseus is a set of scripts to analyze how a Git repo grows over time.
- See The half-life of code & the ship of Theseus by Erik Bernhardsson (2016).
git_dash is a command-line shell script for generating a Git metrics dashboard directly in your terminal.
GitHub does dotfiles: Your unofficial guide to dotfiles on GitHub.

Credits

This edition of Git Rev News was curated by Christian Couder <christian.couder@gmail.com>, Jakub Narębski <jnareb@gmail.com>, Markus Jansen <mja@jansen-preisler.de> and Kaartic Sivaraam <kaartic.sivaraam@gmail.com> with help from Peff (Jeff King), Bruno Brito and Luca Milanesio.