Git Rev News: Edition 25 (March 15th, 2017)

Welcome to the 25th edition of Git Rev News, a digest of all things Git. For our goals, the archives, the way we work, and how to contribute or to subscribe, see the Git Rev News page on git.github.io.

This edition covers what happened during the month of February 2017.

Discussions

General

  • SHA1 collisions found

    On February 23rd it was publicly announced that a collision had been found against SHA-1, the cryptographic hash function that Git uses to identify Git objects (blobs, trees, commits, annotated tags).
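
    As a rough illustration of how an object name depends on SHA-1, the Python sketch below hashes a blob the way Git does: a header of the form “blob <size>” followed by a NUL byte, then the raw content. The function name git_blob_id is made up for this example; it is not part of Git.

    ```python
    import hashlib


    def git_blob_id(content: bytes) -> str:
        """Return the object name Git would assign to a blob with this content."""
        # Git hashes "<type> <size>\0" followed by the raw bytes of the object.
        header = b"blob " + str(len(content)).encode("ascii") + b"\0"
        return hashlib.sha1(header + content).hexdigest()


    if __name__ == "__main__":
        print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
        print(git_blob_id(b"hello world\n"))
    ```

    The first value is the well-known name of the empty blob. Since the name is derived from nothing but this hash, two different contents with the same SHA-1 would get the same object name, which is why a practical collision matters to Git.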

    Details about the collision, how it was performed, as well as algorithms and code to detect such a collision attack were published simultaneously.

    This prompted numerous news articles about Git and SHA-1, for example on LWN.net, as well as many discussions on the mailing list.

    There have also been patch series floating around. Moreover, plans to move Git away from SHA-1 have been shared and discussed.

    Linus Torvalds, for example, sent a “Typesafer git hash” patch as a first step toward fixing implicit dependencies on SHA-1. This one-big-patch approach, though, is not consistent with the way Brian Carlson has been working on the same issue for a long time. Junio Hamano has not commented on this patch yet. Hence, for the time being, it is far from certain that this topic will move much faster.

    Some work on integrating the collision-detection code into a new SHA-1 implementation in Git was started by Jeff King, who added a USE_SHA1DC knob to the Makefile, and was then picked up by Linus. The original code was written by Marc Stevens, working for CWI, and Dan Shumow, working for Microsoft. Interestingly, both Marc and Dan chimed in on the discussion. Dan agreed to work on adaptations and performance improvements for Git, and on upstreaming this work into the original code base. Junio participated in the discussions, too, and it looks as if the resulting patch series could be merged for the next Git release; currently the ‘jk/sha1dc’ topic is in the ‘pu’ branch.

    One of the plans to move Git away from SHA-1 was contributed by Jonathan Nieder, Stefan Beller, Jonathan Tan and Brandon Williams, who all work on the same team at Google. The latest version of this plan is available in a Google document where it can be commented on. It has also been discussed in several threads.

    Another plan was posted by Ian Jackson; it also generated some discussion.

    It’s interesting to note that Git is not the only version control system affected by the issue, as a few related posts show.

Support

  • body-CC-comment regression

    Johan Hovold noticed that git send-email in Git v2.10.2 no longer accepts patches whose commit message contains lines like:

    Cc: <stable@vger.kernel.org>	# 4.4
    

    Apparently it parses the above as “stable@vger.kernel.org#4.4” and then aborts.

    Researching the problem, Johan found a mailing list thread which resulted in some “fixes” that seem to be the root cause of the problem.

    He claimed the format of the line that triggers the problem “has been documented at least since 2009” in the Linux kernel and “has been supported by git since 2012”. It is used to tag commits that should be backported into the “stable” Linux kernel versions.

    Johan then asked for a way for Git to revert to the old behavior.

    Junio wondered if installing the Mail::Address Perl module would make git send-email work by avoiding the “non-parsing-but-paste-address-looking-things-together code” that Git uses when Mail::Address is not installed. Johan replied that it doesn’t work.

    Matthieu Moy, who worked on the patch that is responsible for the problem, remarked that “a proper fix is far from obvious”, because we want our own parser to work the same way as Mail::Address does, and we don’t want to regress for people who want to get back two email addresses from lines like:

    Cc: <foo@example.com> # , <boz@example.com>
    

    as this has been working since September 2015.

    In another email, Matthieu suggested that we should always use our own parser, as “we now have something essentially as good as Mail::Address”, and change that parser to discard anything after “>” in the email address. Matthieu’s email also contained a patch implementing the latter.
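
    The idea of discarding anything after “>” can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: git send-email is written in Perl, the helper name address_from_trailer is hypothetical, and the sketch handles a single address per line.

    ```python
    import re
    from typing import Optional


    def address_from_trailer(line: str) -> Optional[str]:
        """Extract one address from a 'Cc:'-style line of a commit message.

        Hypothetical helper, not Git's implementation: everything after the
        first '>' (for example a '# 4.4' comment) is dropped.
        """
        match = re.match(r"^\s*cc:\s*(.*)$", line, flags=re.IGNORECASE)
        if match is None:
            return None  # not a Cc: line at all
        value = match.group(1)
        end = value.find(">")
        if end != -1:
            value = value[: end + 1]  # keep up to and including '>'
        return value.strip() or None


    if __name__ == "__main__":
        print(address_from_trailer("Cc: <stable@vger.kernel.org>\t# 4.4"))
    ```

    With this rule, the trailing “# 4.4” comment is ignored instead of being glued onto the address. The price is that a line carrying a second address after the first “>” yields only the first one.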

    Johan agreed with Matthieu’s plan, tested the patch and found that it worked. Unfortunately, he found another breakage: the --suppress-cc=self option misbehaves when more than one email address per line is allowed.

    It looked as if the discussion was going to continue for some time, but Linus replied to Matthieu stating that Cc: lines in commit messages are not like Cc: lines in email headers. Consequently, we should not accept more than one email address in them. He concluded as follows:

    So this notion that the bottom of the commit message is some email header crap is WRONG.

    Stop it. It caused bugs. It’s wrong. Don’t do it.

    Finally, after Junio had discussed possible breakages with Matthieu’s patch, Matthieu agreed that it was safer to just revert to not accepting more than one email address in the Cc: lines. Junio then accepted a patch submitted by Johan implementing this proposal. In the meantime, this patch was merged to the “next” branch, so it is very likely to appear in the next Git release. Let’s just hope that no one will complain about it.

Releases

Other News

Various

Light reading

Git tools and sites

Credits

This edition of Git Rev News was curated by Christian Couder <christian.couder@gmail.com>, Thomas Ferris Nicolaisen <tfnico@gmail.com>, Jakub Narębski <jnareb@gmail.com> and Markus Jansen <mja@jansen-preisler.de> with help from Lars Schneider, Luca Milanesio and Junio Hamano.