Git Rev News: Edition 75 (May 27th, 2021)

Welcome to the 75th edition of Git Rev News, a digest of all things Git. For our goals, the archives, the way we work, and how to contribute or to subscribe, see the Git Rev News page on git.github.io.

This edition covers what happened during the month of April 2021.

Discussions

General

  • Git participates in GSoC (Google Summer of Code) 2021

    The following two students have been officially accepted to work on Git as part of the GSoC 2021:

    Thanks to the students who applied and worked on micro-projects, but couldn’t be selected! We hope to continue to see you in the community!

  • The top 1% of commit trailers

    Felipe Contreras posted a fun analysis of how often various commit trailers (reviewed-by, tested-by, etc) appear in the git.git project.

    Setting aside signed-off-by (which all contributions must include), the most common trailers are acked-by (1945 occurrences) and reviewed-by (1729 occurrences), together accounting for almost half of all trailers.

    The next 4 most common trailers give a great insight into just how much collaboration goes on in the git.git project: helped-by (1336), reported-by (960), mentored-by (379), and suggested-by (281).

    Perhaps most interesting is the long list of trailers that have only been seen once, though now that this list is out there we may see more of deemed-obviously-correct-by, worriedly-acked-by, and cheered-on-by in the future.

    One can also note that this distribution roughly follows Zipf’s law; the 10th most popular line (“improved-by”) is about 1/10 as popular as the 1st.

    This script can be used to replicate the analysis.
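As a rough illustration of this kind of counting (an independent sketch, not the script referred to above), the Python below tallies trailer keys in raw commit message text. The function names, the restriction to keys ending in "-by", and the default of skipping signed-off-by are all assumptions made for this example.

```python
import re
from collections import Counter

# Matches trailer-like lines such as "Reviewed-by: A U Thor <a@example.com>".
# Only keys ending in "-by" are considered; that restriction is an assumption
# of this sketch, not a rule taken from the analysis.
TRAILER_RE = re.compile(r"^([A-Za-z][A-Za-z-]*-by):[ \t]+\S", re.MULTILINE)


def count_trailers(log_text, skip=("signed-off-by",)):
    """Count trailer keys (lower-cased) found in raw commit message text."""
    counts = Counter(key.lower() for key in TRAILER_RE.findall(log_text))
    for key in skip:
        counts.pop(key, None)
    return counts


def zipf_ratio(counts, rank):
    """Ratio of the most common count to the count at the given 1-based rank."""
    ordered = [n for _, n in counts.most_common()]
    return ordered[0] / ordered[rank - 1]
```

The input text could come from something like `git log --format=%B`. Under Zipf's law, `zipf_ratio(counts, 10)` would be expected to land somewhere near 10.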

Reviews

  • [PATCH] [GSOC] pretty: provide human date format

    ZheNing Hu sent a patch to the mailing list to add the new %ah and %ch formatting options to the “pretty formats”. The “pretty formats” are the main way for users to customize the output of the git log, git show, git rev-list, and git diff-tree commands.

    These formats are specified by the --pretty[=<format>] or --format=<format> command line flags, where <format> is the actual “pretty format”, and can be either a “built-in format”, like oneline, raw, short, medium, etc, or format:<string>, which is called a “format string”.

    These format strings work in a similar way to printf() formats, as they can contain placeholders starting with a % character, that will be expanded by the command. For example %H will be expanded to print the commit hash, %an the author name, etc.

    A lot of placeholders already exist. For the author date, there are: %ad, %aD, %ar, %at, %ai, %aI and %as. For the committer date, there are the corresponding %cd, %cD, %cr, %ct, %ci, %cI and %cs ones. Each pair of these placeholders uses a different date format. For example, %aI and %cI use the “strict ISO 8601 format”.

    Formats %ad and %cd, though, are special as they use the format specified by the --date=<format> command line flag, so for example with --date=iso-strict, %ad and %cd will behave in the same way as %aI and %cI.

    ZheNing’s patch added the new %ah and %ch placeholders that would behave in the same way as %ad and %cd with --date=human. The rationale for the patch was that placeholders already exist for most of the --date=<format> options, but not for --date=human.
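The relationship between fixed-format placeholders and the --date-dependent %ad can be pictured with a toy expander. This is only a sketch of the idea and not how Git implements it: the commit dictionary keys, the function names and the three supported date modes are assumptions, and the exact strings Git prints for each date format may differ from what Python produces here.

```python
import re
from datetime import datetime, timezone


def format_date(when, mode):
    """Render a datetime in one of a few date modes (toy approximations)."""
    if mode == "iso-strict":
        return when.isoformat()
    if mode == "short":
        return when.strftime("%Y-%m-%d")
    if mode == "unix":
        return str(int(when.timestamp()))
    raise ValueError(f"unsupported date mode in this sketch: {mode}")


def expand(fmt, commit, date_mode):
    """Expand a small subset of placeholders in a format string."""
    date = commit["author_date"]
    table = {
        "H": commit["hash"],
        "an": commit["author_name"],
        "ad": format_date(date, date_mode),      # follows the chosen date mode
        "aI": format_date(date, "iso-strict"),   # fixed format
        "as": format_date(date, "short"),        # fixed format
        "at": format_date(date, "unix"),         # fixed format
    }
    # Longer names first, so "%an" is not read as a shorter placeholder.
    names = sorted(table, key=len, reverse=True)
    pattern = re.compile("%(" + "|".join(names) + "|%)")
    return pattern.sub(
        lambda m: "%" if m.group(1) == "%" else table[m.group(1)], fmt
    )
```

In this model, `expand("%ad", commit, "iso-strict")` and `expand("%aI", commit, "short")` give the same string. A placeholder like %ah would be one more table entry tied to the "human" mode, which this sketch does not implement.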

    Taylor Blau was the first to review ZheNing’s patch and found it “pretty good”, as it was similar to a previous patch by René Scharfe that added the %as and %cs placeholders for dates in the “short date format”. ZheNing acknowledged that he indeed learned from René’s patch.

    Philip Oakley, though, commented on the documentation part of the patch, suggesting adding an example similar to YYYY-MM-DD for the short format. ZheNing replied that in the “human format” a date could take many forms, so he would rather add links to the documentation of the “human format”.

    ZheNing then sent a version 2 of his patch where he had added the links. Philip, though, suggested further small, superficial changes to the link and the related text added in this version.

    Meanwhile Ævar Arnfjörð Bjarmason sent a small patch series that made a “couple of trivial changes” to the tests related to %aI and %cI, and at the same time suggested that ZheNing make similar changes to the tests in his patch.

    ZheNing then sent a version 3 of his patch, taking Philip’s and Ævar’s suggestions into account. This patch contained a typo, though, so ZheNing sent a version 4 of his patch.

    As the version 3 of the patch had already been merged to the “next” branch before ZheNing sent the version 4, the typo got noticed by Martin Ågren who sent a small patch series fixing this typo as well as another unrelated one.

    Eventually both ZheNing’s patch and Martin’s patches were merged into the “master” branch, so their improvements will appear in the upcoming Git v2.32.0.

Developer Spotlight: Patrick Steinhardt

  • Who are you and what do you do?

    I’m a software developer working at GitLab, more specifically in the team working on Gitaly. Gitaly is our RPC interface to all Git repositories, so it is the backbone to all things Git at GitLab.

    In my own free time, I love to tinker with my Gentoo-based systems and tailor them to my own needs, which results in occasional drive-by patches to all kinds of open source projects to scratch my own itches.

  • What would you name your most important contribution to Git?

    To me, this is the introduction of the reference-transaction hook, which gets executed whenever a reference is about to be updated. This allows tight control over all reference updates happening in a given repository in a command-agnostic way. At GitLab, we use this hook to coordinate reference updates across multiple replicas of the same repository such that we can be sure that all nodes have the same state and move to the same state.

    My most important contributions I’d not locate in the Git project itself though, but instead in libgit2. While I unfortunately haven’t found the time to contribute to it lately, I’ve done a lot more work on libgit2 than I did on Git. And there it’s probably the initial introduction of support for worktrees, maintenance of the CMake build system and work on the gitconfig subsystem.
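For readers unfamiliar with the reference-transaction hook mentioned above, here is a minimal sketch of what such a hook could look like in Python. It relies on the documented interface (the transaction state is passed as the single argument, and each update arrives on standard input as "<old-value> <new-value> <ref-name>"). The veto policy, the `protected_prefix` value and all function names are hypothetical, and this is unrelated to how GitLab actually uses the hook.

```python
def parse_updates(stdin_text):
    """Parse '<old-value> <new-value> <ref-name>' lines into tuples."""
    updates = []
    for line in stdin_text.splitlines():
        if not line.strip():
            continue
        old, new, ref = line.split(" ", 2)
        updates.append((old, new, ref))
    return updates


def decide(state, updates, protected_prefix="refs/heads/protected/"):
    """Return the hook's exit code.

    A non-zero exit code only matters in the 'prepared' state, where it
    aborts the transaction; the prefix policy is a made-up example.
    """
    if state != "prepared":
        return 0
    for _old, _new, ref in updates:
        if ref.startswith(protected_prefix):
            return 1
    return 0


def main(argv, stdin):
    """Entry point: argv[1] carries the transaction state."""
    state = argv[1] if len(argv) > 1 else ""
    return decide(state, parse_updates(stdin.read()))

# An actual hook file would import sys and end with:
#   sys.exit(main(sys.argv, sys.stdin))
```

Installed as an executable file named reference-transaction in the repository's hooks directory, a script along these lines would see every reference update regardless of which command triggered it.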

  • What are you doing on the Git project these days, and why?

    My current work is mostly focussed on tuning performance of some areas we have found to be slow for gitlab.com. This has motivated the recent introduction of a new git-rev-list(1) filter which allows filtering by object type via --filter=object:type=<type>. This makes it easy to find, for example, all blobs introduced between two revisions.

    And right now I’m trying to devise a new implementation of the object connectivity check performed by git-receive-pack(1) whenever a push gets accepted on the server side. Depending on the repository’s shape, the current implementation can be a major bottleneck and take dozens of seconds to compute even for small pushes. You may have noticed this check when it says “Checking connectivity” on a push.
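The object type filter described above might be used along these lines. This is a hedged sketch: the helper names are made up, the `--objects` flag is added because object filters apply to object listings, the range only roughly corresponds to "blobs introduced between two revisions", and it assumes a Git version recent enough to include the filter.

```python
import subprocess


def blobs_between_command(old_rev, new_rev):
    """Build a rev-list command that lists only blobs in old_rev..new_rev."""
    return [
        "git", "rev-list", "--objects",
        "--filter=object:type=blob",
        f"{old_rev}..{new_rev}",
    ]


def parse_objects(output):
    """Split 'rev-list --objects' output into (object id, path) pairs."""
    pairs = []
    for line in output.splitlines():
        if not line:
            continue
        # Each line is "<oid>" or "<oid> <path>"; the path may contain spaces.
        oid, _, path = line.partition(" ")
        pairs.append((oid, path))
    return pairs


def blobs_between(old_rev, new_rev, repo="."):
    """Run the command in the given repository and parse its output."""
    result = subprocess.run(
        blobs_between_command(old_rev, new_rev),
        cwd=repo, capture_output=True, text=True, check=True,
    )
    return parse_objects(result.stdout)
```

Calling `blobs_between("v1.0", "v2.0")` inside a repository would then return the blob IDs and paths, where the two tag names are placeholders for any pair of revisions.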

  • If you could get a team of expert developers to work full time on something in Git for a full year, what would it be?

    I’m obviously biased coming from the libgit2 project, but I’d really love to further push the libification of Git. There has been great progress on this front already to make internal C interfaces look more like the typical interfaces you’d see from a linkable library. But my dream would be to merge the efforts of Git and libgit2 such that Git also provides an official library which can be linked against in your own program.

  • If you could remove something from Git without worrying about backwards compatibility, what would it be?

    Tough question. There are many user-facing commands which could benefit from a more consistent design, but my take is that these probably could provide an improved user interface while still retaining backwards compatibility.

    But what I’d really love to get rid of is the file-based reference backend. It works reasonably well to represent references as file paths in smallish repositories, but even there it imposes limitations which are only a result of its implementation. It’s also inefficient for bigger repositories and does not really allow for atomic modification of multiple references at once. There luckily is ongoing work on the reftable backend, which fixes many of the shortcomings, but it will likely still take some time to land.

  • What is your favorite Git-related tool/library, outside of Git itself?

    I guess the answer to that question is going to be obvious by now: libgit2.

Releases

Other News

Various

Light reading

Git tools and sites

  • Komit is a small Node.js-based command-line application providing an interactive prompt, designed to be run as a Git hook to help follow the Conventional Commit message standard. This standard was mentioned in Git Rev News Edition #52 and #54; another tool that helps follow this standard is Sailr (also mentioned in edition #52).
  • Flat Data explores how to make it easy to work with data in Git and GitHub. The Flat Data project incorporates three different pieces: the Flat Action (GitHub Action), the Flat Editor VS Code extension, and the Flat Viewer website.
  • git-split-diffs, a Node.js command-line application, provides side-by-side split diffs with syntax highlighting in your terminal, and can be used via core.pager or pager.diff.
  • daff: data diff is a library and a tool for comparing tables, producing a summary of their differences, and using such a summary as a patch file. It is optimized for comparing tables that share a common origin, in other words multiple versions of the “same” table. With daff, you can also make Git diffs and merges table-aware.
  • github-csv-diff and CSVHub are both Chrome extensions to show CSV diffs on GitHub.
  • Semgrep is a fast, Open Source, static analysis tool for finding bugs and enforcing code standards at editor, commit, or CI time; rules look like the code you’re searching.

Credits

This edition of Git Rev News was curated by Christian Couder <christian.couder@gmail.com>, Jakub Narębski <jnareb@gmail.com>, Markus Jansen <mja@jansen-preisler.de> and Kaartic Sivaraam <kaartic.sivaraam@gmail.com> with help from Patrick Steinhardt, Andrew Ardill, Felipe Contreras and Jonas Bernoulli.