Welcome to the 69th edition of Git Rev News, a digest of all things Git. For our goals, the archives, the way we work, and how to contribute or to subscribe, see the About Git Rev News page on git.github.io.
This edition covers what happened during the month of October 2020.
[PATCH] userdiff: support Bash
Victor Engmark sent a patch to add support for Bash and POSIX shell to the userdiff mechanism. This mechanism is used by the diff code to make diffs more informative and better suited to the content, in this case Bash or POSIX shell programs.
As explained in the documentation, diffs contain sections called hunks that look like:
@@ -k,l +n,m @@ TEXT
where k, l, n and m are numbers indicating the lines that are concerned, and TEXT, which is called the “hunk-header”, is a part of a line of the file to further help identify the related context of the diff in the file.
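To make the hunk format above concrete, here is a minimal Python sketch that splits such a line into its parts. It is an illustration only, not how Git itself parses diffs; the names HUNK_RE and parse_hunk_header are made up for this example, and the sketch assumes the usual unified diff convention that an omitted line count means 1.

```python
import re

# Matches a hunk line such as "@@ -12,7 +12,9 @@ my_func() {".
# Assumption: the counts (",l" and ",m") may be omitted, in which case
# they default to 1, as in the usual unified diff format.
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@ ?(.*)$")


def parse_hunk_header(line):
    """Return (k, l, n, m, text) for a hunk line, or None if it is not one."""
    match = HUNK_RE.match(line)
    if match is None:
        return None
    k, l, n, m, text = match.groups()
    return (
        int(k),
        int(l) if l is not None else 1,
        int(n),
        int(m) if m is not None else 1,
        text,  # the "hunk-header" part, possibly empty
    )
```

For a line like "@@ -12,7 +12,9 @@ my_func() {", the last element of the returned tuple is the hunk-header text that the userdiff mechanism is responsible for choosing.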
The best thing for programs is often to have the name of the enclosing function, or method, in the hunk-header. As detecting functions is programming language specific, it’s the role of the userdiff mechanism to provide a regex (regular expression) that can be used to do that.
Another role of the userdiff mechanism is to provide a regex to customize word diffs.
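As a rough illustration of what a word regex does: when one is in use, every non-overlapping match is treated as a word, and whatever lies between matches is treated as whitespace. The Python sketch below mimics that splitting step. The regex is a hypothetical one written for this example, not the one from Victor’s patch, and Python’s re module is not the regex engine Git uses.

```python
import re

# Hypothetical word regex, for illustration only: a shell-style name with
# an optional leading "$", a run of digits, or any single non-blank character.
WORD_RE = re.compile(r"\$?[a-zA-Z_][a-zA-Z0-9_]*|[0-9]+|[^ \t\n]")


def split_words(line):
    """Return the list of words that the word regex finds in a line."""
    return WORD_RE.findall(line)
```

With this regex, a variable reference such as $HOME stays together as one word, while quotes and slashes become words of their own, so a word diff would highlight a changed variable name as a single unit.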
Victor’s patch then mainly consisted of adding regexes for Bash and POSIX shells to userdiff.c, along with some documentation and a lot of tests.
Junio Hamano, the Git maintainer, replied to Victor, commenting a bit on the tests and a lot on the complex regex used to detect a function. He wondered whether it correctly accepted whitespace where it is allowed, and suggested breaking it down like this for clarity:
"^[ \t]*" /* (op) leading indent */
"(" /* here comes the whole thing */
"(function[ \t]+)?" /* (op) noiseword "function" */
"[a-zA-Z_][a-zA-Z0-9_]*" /* identifier - function name */
"[ \t]*(\\([ \t]*\\))?" /* (op) start of func () */
"[ \t]*(\\{|\\(\\(?|\\[\\[)" /* various "opening" of body */
")",
Victor then sent a version 2 of his patch implementing Junio’s suggestions and answering his comments.
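The broken-down regex that Junio suggested can be tried out with a short Python sketch. This is only an approximation: Git compiles these patterns with the C library’s regex functions, not with Python’s re module, and the regex that was finally merged may differ from this suggestion. The names FUNC_RE and function_header are made up for this example.

```python
import re

# Junio's pieces, joined in the same order. In C each backslash is doubled
# ("\\(") because it sits inside a string literal; a Python raw string
# needs only one.
FUNC_RE = re.compile(
    r"^[ \t]*"                   # optional leading indent
    r"("                         # start of the captured text
    r"(function[ \t]+)?"         # optional "function" keyword
    r"[a-zA-Z_][a-zA-Z0-9_]*"    # function name
    r"[ \t]*(\([ \t]*\))?"       # optional "()"
    r"[ \t]*(\{|\(\(?|\[\[)"     # one of the possible body openers
    r")"
)


def function_header(line):
    """Return the part of the line captured as a function header, or None."""
    match = FUNC_RE.match(line)
    return match.group(1) if match else None
```

For instance, function_header("foo() {") returns the captured text "foo() {", while a plain command line such as "echo hello" gives None.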
Johannes Sixt then suggested a number of improvements, especially to the tests and to the regex used to customize word diffs. Victor replied about the latter, asking for pointers to look at, as it seemed that there were no tests for that.
Victor nevertheless sent a version 3 implementing Johannes’ suggestions, and Johannes indeed replied that he was happy with the result. Junio was also happy with the result after fixing a typo in the commit message.
The patch was later merged into the next and then master branches, so Git should soon support shell scripts in a better way, even though Git itself has long been developed using shell scripts.
Who are you and what do you do?
I’m a scientist at the Canadian Center for Meteorological and Environmental Prediction, in Montréal, Québec, Canada. I’m part of the development team for the numerical model we use to predict the movement, growth and melt of sea-ice, the frozen parts of the Arctic and Antarctic oceans. Like most computer models in the field of weather forecasting, this software is written in good old Fortran, and runs on our supercomputers. We use Git extensively to track changes across all layers of our technological stack, and I quickly developed an interest in Git’s inner workings.
What would you name your most important contribution to Git?
Since I started contributing to the project a little more than one year ago, I’ve mostly been trying to fix the bugs I encounter in the course of my daily work. So I’m not sure which one of my topics has had the biggest impact on other users. However, I can say that the contribution I’m most proud of is fixing git checkout --recurse-submodules to correctly switch between branches when one branch has nested submodules and the other branch has no submodules at all. I learned enormously during the process of developing this fix, not only about how Git “unpacks trees” to keep the index, working directory and HEAD consistent, but also about how fork and exec calls work, and especially how to debug such spawned processes using GDB and LLDB.
What are you doing on the Git project these days, and why?
Right now I’m working on a fix to prevent git checkout --recurse-submodules from losing uncommitted work in submodules. Although the documentation says this shouldn’t happen, I’ve found a few cases where it does, and since it’s never a nice experience to lose work, I’d like to fix that.
If you could get a team of expert developers to work full time on something in Git for a full year, what would it be?
Apart from rewriting the whole thing in C++, you mean? Jokes aside, I would like for more work to be put towards better submodule support. There was a colossal effort a couple of years ago to add --recurse-submodules flags to several Git commands, so that submodule worktrees stay up to date when switching between superproject commits. Unfortunately this effort has died off due to core contributors to the submodule code changing jobs, and some porcelain commands still lack that capability.
If you could remove something from Git without worrying about backwards compatibility, what would it be?
I think it’s unfortunate that the dot-dot and dot-dot-dot syntaxes mean different things in git diff than in the rest of the Git commands. It’s another one of those tricky things that users have to remember. The fact that checkout and reset have so many different modes of operation also makes them confusing for beginners. The introduction of git switch and git restore should help with that, though.
What is your favorite Git-related tool/library, outside of Git itself?
Since I’ve heard of it I’ve been using diff-so-fancy as a way to make Git diffs more easily readable. Apart from that I mostly stick to the Git command line. Recently I discovered git-crecord, an ncurses interface for, among other features, interactive line-by-line staging.
Various
Light reading
Git tools and sites
git subcommand (written in Python), which allows users to interactively select changes to commit or stage using an ncurses-based text user interface.
This edition of Git Rev News was curated by Christian Couder <christian.couder@gmail.com>, Jakub Narębski <jnareb@gmail.com>, Markus Jansen <mja@jansen-preisler.de> and Kaartic Sivaraam <kaartic.sivaraam@gmail.com> with help from Philippe Blain, Semyon Kirnosenko, Tarun Batra, Philip Oakley and Luca Milanesio.