Git Rev News: Edition 33 (November 22nd, 2017)

Welcome to the 33rd edition of Git Rev News, a digest of all things Git. For our goals, the archives, the way we work, and how to contribute or to subscribe, see the Git Rev News page on git.github.io.

This edition covers what happened during the month of October 2017.

Discussions

Reviews

Jacob Keller sent a patch adding a test that fails. He wrote in the commit message that the “git rebase” interactive mode causes exec commands to be run with GIT_DIR set, and that afterwards running a git command in a subdirectory fails because GIT_DIR=".git". He suspected the regression was introduced in some recent rebase--helper changes to speed up the interactive rebase and convert some shell scripts to C code.

Johannes Schindelin, alias Dscho, replied to Jacob and suggested a fix as well as a number of improvements to Jacob’s patch. He also asked if Jacob could take care of creating a proper patch for the fix. Jacob agreed with Dscho’s comments and agreed to create a proper patch.

Phillip Wood then chimed in stating that Dscho’s suggested fix might not be right:

Just clearing GIT_DIR does not match the behavior of the shell version (tested by passing -p to avoid rebase--helper) as that passes GIT_DIR to exec commands if it has been explicitly set. I think that users that set GIT_DIR on the command line would expect it to be propagated to exec commands.

At that point Junio Hamano, the Git maintainer, Jacob and Phillip started discussing the possible impact of the bug and whether it was worth delaying the release to get a chance to properly test a fix for some time.

Then Dscho gave an explanation about where the bug could come from:

When you look at git_dir_init in git-sh-setup, you will see that Unix shell scripts explicitly get their GIT_DIR turned into an absolute path.

He then suggested a fix in the rebase--helper code in C that has replaced the shell code in git-sh-setup. The fix is about turning the content of the GIT_DIR environment variable into an absolute path before running the exec command.
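The idea behind that fix can be sketched in a few lines of Python. This is only an illustration of the technique, not the C code that went into rebase--helper; the function name env_with_absolute_git_dir and its parameters are hypothetical.

```python
import os


def env_with_absolute_git_dir(env, cwd):
    """Return a copy of env in which GIT_DIR, if set, is an absolute path.

    Hypothetical helper: `cwd` is the directory in which a relative
    GIT_DIR (such as ".git") was meaningful.
    """
    new_env = dict(env)  # never mutate the caller's environment
    git_dir = new_env.get("GIT_DIR")
    if git_dir:
        # os.path.join keeps git_dir unchanged when it is already absolute.
        new_env["GIT_DIR"] = os.path.normpath(os.path.join(cwd, git_dir))
    return new_env
```

A caller could pass the result as the env argument of subprocess.run when spawning the exec command, so that a command that changes into a subdirectory still finds the repository.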

Jacob again agreed to turn Dscho’s fix into a proper patch, and then sent it. The patch has subsequently been merged into the master branch.

Support

Lars Schneider realized after migrating a large repository to Git that “all text files in the index of the repo have CRLF line endings”. He then asked:

In general this seems not to be a problem as the project is developed exclusively on Windows.

However, I wonder if there are any “hidden consequences” of this setup?

Jonathan Nieder answered:

There are no hidden consequences that I’m aware of. If you later decide that you want to become a cross-platform project, then you may want to switch to LF endings, in which case I suggest the “single fixup commit” strategy.

He did suggest, though, explicitly declaring all the files as non-text files in .gitattributes using the -text flag, so that Git will not be tempted to change line endings.

Torsten Bögershausen agreed with Jonathan saying:

If you don’t specify .gitattributes, then all people who have core.autocrlf=true will suffer from a runtime penalty.

because:

At each checkout Git needs to figure out that the file has CRLF in the repo, so that there is no conversion done.

and also:

Those who have core.autocrlf=false would produce commits with CRLF for new files, and those developers who have core.autocrlf=true would produce files with LF in the index and CRLF in the worktree. This may (most probably will) cause confusion later, when things are pushed and pulled.
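To give a rough idea of the kind of content inspection Torsten describes, here is a minimal Python sketch that classifies the line endings of a blob. It is an assumption-laden illustration and not Git's actual heuristic; the function name classify_line_endings is hypothetical.

```python
def classify_line_endings(data: bytes) -> str:
    """Classify content as "crlf", "lf", "mixed" or "none".

    Illustration only: the whole content has to be scanned, which hints
    at why this kind of check has a cost at each checkout.
    """
    crlf = data.count(b"\r\n")
    # Every CRLF pair contains one LF, so subtract the pairs to get lone LFs.
    lone_lf = data.count(b"\n") - crlf
    if crlf and lone_lf:
        return "mixed"
    if crlf:
        return "crlf"
    if lone_lf:
        return "lf"
    return "none"
```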

Lars thanked Jonathan for the idea of using the -text flag but wondered about its implications saying:

For whatever reason I always thought this is the way to tell Git that a particular file is binary with the implication that Git should not attempt to diff it.

To this Jonathan replied:

No other implications. You’re thinking of -diff. There is also a shortcut “binary” which simply means -text -diff.

Jonathan in his first email also asked his own related question:

I’d be interested to hear what happens when diff-ing across a line ending fixup commit. Is this an area where Git needs some improvement? “git merge” knows an -Xrenormalize option to deal with a related problem — it’s possible that “git diff” needs to learn a similar trick.

To that, Torsten replied:

That is a tricky thing. Sometimes you want to see the CRLF - LF as a diff, (represented as “^M”), and sometimes not.

Junio Hamano then also gave his “knee-jerk reaction” on this, saying that “the end user definitely wants to see preimage and postimage lines are different in such a commit by default, one side has and the other side lacks ^M at the end” and also that when one does not want to see those changes “one of the ‘whitespace ignoring’ options […] may suffice, but if not, it should be easy to invent a new one”.

Junio then posted a sample patch to implement --ignore-cr-at-eol.

Stefan Beller reviewed this patch, which was further improved by Junio and then discussed a few times, so that this new flag is likely to appear in the next Git release.
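What “ignoring a carriage return at the end of a line” means when comparing two lines can be sketched as follows. This is a hedged Python illustration of the concept only, not the xdiff implementation from Junio’s patch; both function names are hypothetical.

```python
def strip_cr_at_eol(line: bytes) -> bytes:
    """Drop a carriage return only when it sits at the very end of the line."""
    if line.endswith(b"\r\n"):
        return line[:-2] + b"\n"
    if line.endswith(b"\r"):
        return line[:-1]
    return line


def lines_equal_ignoring_cr(a: bytes, b: bytes) -> bool:
    """Compare two lines, treating CRLF and LF line endings as the same."""
    return strip_cr_at_eol(a) == strip_cr_at_eol(b)
```

With such a comparison, a line ending fixup commit would show no changed lines, while a carriage return in the middle of a line would still count as a difference.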

A sub-thread of the discussion then started about making big changes to the xdiff code, which was originally “borrowed” from a separate open source project. There was no clear result from this discussion though.

Johannes Sixt also replied directly to Lars’ first email:

I’ve been working on a project with CRLF in every source file for a decade now. It’s C++ source, and it isn’t even Windows-only: when checked out on Linux, there are CRs in the files, with no bad consequences so far. GCC is happy with them.

To that Johannes Schindelin, alias Dscho, replied:

I envy you for the blessing of such a clean C++ source that you do not have any, say, Unix shell script in it.

and posted an example showing “Unix shell not handling CR/LF gracefully”.

In a separate reply to Torsten’s first email, Dscho also confirmed that completely switching off line ending conversions can give “around 5-15% speed improvement”.

A discussion then started about the merits of having an entry like “*.sh text eol=lf” in the .gitattributes for shell scripts, compared to having Git change strictly no file. In the end it looks like such an entry could help, though there could be shell scripts that don’t use the .sh extension.

Developer Spotlight: Torsten Bögershausen

Originally a hardware developer, these days are filled with software development for embedded systems.

The precomposeunicode feature for Mac OS was an important thing to go cross-platform, but the Git users may have a different point of view.

The last years it was CRLF handling, also known as EOL or line ending. Mainly because I am using it myself.

The Git code base is in a pretty good shape. Improve the on-disk or even over-the-wire protocol to include information if a file is binary or text with CRLF (2 bits). Please let me know, when you have the team.

“git checkout -b” is certainly good for experienced people, hard to understand for beginners. “git add -A” or --all is certainly my favorite thing to be removed… Don’t accept commit messages which are not unicode any more. Remove the core.autocrlf from the code base, demand that people set up a .gitattributes file on Windows.

Probably Gerrit, even if I like the pull-request workflow which allows people to collaborate.

Releases

Other News

Various

Light reading

Git tools and sites

Credits

This edition of Git Rev News was curated by Christian Couder <christian.couder@gmail.com>, Thomas Ferris Nicolaisen <tfnico@gmail.com>, Jakub Narębski <jnareb@gmail.com> and Markus Jansen <mja@jansen-preisler.de> with help from Torsten Bögershausen, Johannes Schindelin, Luca Milanesio and Jérôme Reybert.