Git Rev News: Edition 26 (April 19th, 2017)

Welcome to the 26th edition of Git Rev News, a digest of all things Git. For our goals, the archives, the way we work, and how to contribute or to subscribe, see the Git Rev News page on git.github.io.

This edition covers what happened during the month of March 2017.

Discussions

General

Ævar Arnfjörð Bjarmason sent an email saying that OpenSSL is changing its license to the Apache 2 license, which is considered incompatible with the GPL v2 that Git uses for most of its code.

By default Git uses OpenSSL both for its implementation of the SHA-1 algorithm and in git imap-send.

Yves Orton replied by quoting the GPL compatibility page on the Apache web site, which is not very clear about the incompatibilities between the Apache 2 license and the different GPL versions.

Theodore Ts’o then chimed in to “suggest that we not play amateur lawyer on the mailing list” and leave it to the distributions to decide on their own.

Ævar agreed with that but proposed adding a new Makefile flag to declare “yes I’m OK with combining AL2 + GPLv2”.

Brian Carlson wrote “that most distros don’t link against OpenSSL” already, and suggested using Nettle, an LGPL crypto library that also has SHA-3 which could be used to replace SHA-1 in the long run.

But Ævar replied that we also use OpenSSL in git imap-send for its TLS implementation, so it is not enough to use a different SHA-1 implementation.

On that point, Jeff King, alias Peff, wrote that when building with NO_OPENSSL, git imap-send uses the curl IMAP implementation instead of our custom IMAP implementation that can optionally use OpenSSL. Curl itself may be compiled to use either OpenSSL or GnuTLS.

Support

Bernhard Reiter emailed the Git mailing list, with the gnupg-devel mailing list in Cc, to suggest using libgpgme for interfacing to GnuPG, instead of using “an pipe-and-exec approach to running a GnuPG binary”.

He referred to the documentation of the “gpg.program” config option and warned that “the gpg command-line interface is not considered an official API to GnuPG by the GnuPG-devs and thus potentially unstable”, though “Git is only using a small part of the interface, so the risk when keeping the current way is small.”

Bernhard, who is involved in GnuPG development, nevertheless wrote that “using libgpgme is the way what GnuPG-devs would recommend”, as can be seen on the GPGME API web page.

He also reported a problem with gpg2 vs gpg. He has both gpg and gpg2 installed, but his “.gnupg” configuration file is gpg2 specific, and Git by default uses gpg for signing, which fails with his “.gnupg”.

Ævar Arnfjörð Bjarmason asked for more information about the actual bug and the versions of gpg and gpg2 involved, hoping that it could be possible to write patches to fix compatibility problems.

To that Michael J Gruber later replied:

The problem is the “difficult” upgrade path and mixed installations with gpg and gpg2.1+ that some distributions force upon you…

and after describing possible problems:

In short: Users will run into problems anyway; git provides the quick way out (git config gpg.program gpg2), users won’t be as lucky with other things that require gpg.

Michael also wrote that “while - technically speaking - the command line is not a stable API for gpg, it does work across versions of gpg”, though maybe this will change with gpg 2.2.
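The “quick way out” Michael mentions can be sketched as follows. This is only an illustration, assuming a throwaway repository created with mktemp so that no real configuration is touched; the variable names repo and configured are made up for the example, and gpg2 does not need to be installed for the setting itself to be stored.

```shell
# Create a throwaway repository so the sketch does not touch real settings.
repo=$(mktemp -d)
git init --quiet "$repo"

# Tell Git to run gpg2 instead of the default gpg for signing and verifying.
git -C "$repo" config gpg.program gpg2

# Read the setting back to confirm what Git will use.
configured=$(git -C "$repo" config --get gpg.program)
echo "gpg.program = $configured"
```

In everyday use the same setting would more likely be made once per user with git config --global gpg.program gpg2, as in the command Michael quotes.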

Linus Torvalds replied to Bernhard with:

Quite frankly, I will NAK this just based on previous bad experiences with using “helpful” libraries.

and then explained what “the problems with libraries in this context tend to be”. For each problem (“hard to personalize”, “existing configuration”, “UI” and “library versioning”) he gave interesting explanations and examples.

He concluded with:

Of course, maybe gpgme is a world first, and actually does read your .gnupg/config file trivially, and has all the gpg agent integration that it picks up automatically, and allows various per-user configurations, and all my worries are bogus.

But that would literally be the first time I’ve ever seen that.

About the “library versioning”, Linus wrote:

I don’t know why, but I’ve never ever met a library developer who realized that libraries were all about stable APIs, and the library users don’t want to fight different versions.

but Ted Ts’o replied to the above:

Actually, you have. (Raises hand :-)

libext2fs has a stable API and ABI. We add new functions instead of changing function parameters (so ext2fs_block_iterate2() is implemented in terms of ext2fs_block_iterate3(), and so on). And structures have magic numbers that have served as versioning signal.

and:

I do have to agree with your general point, that most developers tend to be incredibly sloppy with their interfaces. That being said, not all library developers are as bad as GNOME. :-)

Brian Carlson also replied to Bernhard, asking “whether gpgme supports gpgsm”, and wondering “what happens to the git verify-* --raw output” as this output can be used when scripting signature verification.

Brian further explained his concerns with:

This is not an idle consideration; we have automated systems at work that update software automatically and submit it for human review, including verifying signatures and hashes. This saves hundreds of hours of staff time and results in better security.

The possibilities to write a wrapper script around gpg, to mimic the gpg API, and to filter the gpg output worried him too.

Following all the above discussions, Bernhard replied first to Ævar. He gave the requested information about the versions he has been using, and the error messages displayed when gpg accessed his “~/.gnupg/gpg.conf”, which contains gpg2-only config options.

As workarounds, Bernhard suggested using a “~/.gnupg/gpg.conf-2” config file, or having Git “try gpg2 first and then fall back to gpg or gpgv in case only these versions are available”, but he still thought that it was worth using libgpgme.
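The “try gpg2 first and then fall back” idea can be approximated today from the user side. The sketch below is not something Git does itself: pick_first_available is a hypothetical helper written for this example, and gpgv is left out of the candidate list because it only verifies signatures and cannot create them.

```shell
# Hypothetical helper: print the first candidate command found on PATH.
pick_first_available() {
    for candidate in "$@"; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Prefer gpg2, then fall back to gpg, as in Bernhard's suggestion.
if gpg_program=$(pick_first_available gpg2 gpg); then
    # Only print the command; running it would change the user's configuration.
    echo "would run: git config --global gpg.program $gpg_program"
else
    echo "no gpg binary found on PATH"
fi
```

The helper only chooses a binary name; whether that binary accepts the options in an existing “~/.gnupg/gpg.conf” is a separate question, which is the problem Bernhard originally reported.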

To Linus, Bernhard wrote that “it is too early to say that libgpgme would be right choice for git”, but “it should be seriously considered”, and:

Grateful that you have written down some of your concern, let me try to give you some pointers.

Bernhard then replied to each of the points Linus had raised. About “library versioning” his reply was:

In my experience Werner (the lead GnuPG developers) is quite reasonable about keeping APIs stable (he often goes out of his way to keep even the command line version stable, maybe he shouldn’t do that to the command line options so you are more motivated to go to this official API gpgme. >:) )

and he concluded with:

Your concerns are understandable, I’ve seen similar problems with “library vs program” and the unix tools box approach gives a number of lessons on how to loosely couple components. Thanks again for taking the time and writing them down. I’ve given you some pointers why gpgme indeed could be different and may be an improvement for git (or other applications). I guess one of the next steps would be for someone to look for specific points or try gpgme for git purposes. Me and gnupg-devel@ are happy to take your questions or get feedback.

Bernhard also replied to Brian and Michael, addressing at least some of their concerns.

But Jeff King, alias Peff, in turn replied to Bernhard, stressing that “the existing config option will have to stay” to avoid breaking existing “exotic” setups.

Bernhard, Michael as well as Christian Neukirchen, Peter Lebbing and Werner Koch, the lead GnuPG developer, further discussed some points, but in the end it doesn’t look like things will change much in this area for the foreseeable future.

Developer Spotlight: Ævar Arnfjörð Bjarmason

I’m a programmer currently working at Booking.com in Amsterdam since 2010. I’ve worn a lot of hats there, but most of it’s had to do with core infrastructure, I’m currently a senior dev there in an SRE role.

I’m originally from Iceland, self-taught, 31, got an 18 month old son with my girlfriend, and a daughter on the way. My hobbies aside from programming include power-lifting.

Unlike a lot of self-taught programmers I started relatively late. I had access to computers since I was a kid, but wasn’t using them to program. Mostly to play & tweak games (e.g. StarCraft map “programming”).

I eventually got tired of gaming, got a Gentoo Linux install CD from someone. It took me around a month and a half to install it. I only had access to some really old Mac monitor whose refresh configuration XFree86 didn’t support, so I had to go from knowing nothing at all about Linux to debugging X bugs with intermittent Internet access before I could start X.

I still wasn’t doing any non-trivial programming, but I became a very active editor on Wikipedia, both in Icelandic & English, and very interested in the free software movement. I taught myself a lot of topics by translating articles from English to Icelandic, and eventually started running bots to update Wikipedia itself. That led to needing to shell-script, use Perl, and eventually I started contributing to MediaWiki itself (the software that runs Wikipedia).

Initially I mostly worked on translation-related code, mostly bugs that hurt the Icelandic UX translation, but eventually became more interested in the core software itself. I worked on a lot of things, but the one most people will recognize is that I wrote the software plugin powering the “Notes and references” section at the bottom of pretty much every Wikipedia article.

This was back in 2010. I’d started my first software development job at f-prot.com and become mostly inactive in MediaWiki development, but I was very interested in source control & was involved with that at work. I added the git commit --allow-empty-message feature to use in snerp-vortex, a Subversion to Git migration script I was using to migrate the MediaWiki repository to Git.

Shortly after that I started maintaining a mirror of the main MediaWiki Subversion repositories on GitHub using git svn. The MediaWiki project eventually did end up migrating to Git about a year later, but someone else drove that effort.

Definitely adding localization support. The git command-line client is translated to over 10 languages now. I don’t use that feature myself, I wrote it because adding translation support was a very effective way to get to know the code-base, but hopefully that’s very useful to many users, particularly complete newbies.

Jiang Xin gets all the credit for actually maintaining and coordinating that effort, I just drive-by added the initial support.

I was quite active in 2010 & 2011 but had a long period of inactivity since then, and have only recently started submitting patches again:

$ git log --author=Ævar --date=format:%Y --pretty=format:%ad @{u}|sort|uniq -c
        121 2010
        152 2011
          8 2012
          1 2013
          7 2016
         29 2017

My main contribution for the last couple of years has been indirect. I’m one of the two main people responsible for the Git infrastructure at Booking.com (along with Dennis Kaarsemaker, interviewed in a previous iteration of Git Rev News).

In late 2015 I made the case internally at Booking.com that we should contribute time directly to Git development, which we’re doing by contracting Git performance to Christian Couder, who’s maintaining this newsletter & doing this interview, and no I didn’t put him up to interviewing me :)

I manage that project at Booking.com, which mainly consists of occasionally touching base with Christian on what performance-related thing he and I think is worthwhile to do next.

Then I send the occasional patches myself, mainly small isolated things to scratch this or that itch I personally run into, or that someone reports as a bothersome misfeature or bug at work.

But to actually answer the question: Right now I have a series in the works to support PCRE v2, some other related fixes to git grep, and a couple of miscellaneous code cleanup submissions I need to pick up again.

The Git data model is so much more flexible in theory than any of the Git clients out there allow for.

Microsoft’s made the first real step to advancing that with GVFS. I’d have my hypothetical team work on something like that, but on steroids.

I.e. a git client/server pair that could trade off as much work between the client & server as you wanted, and which could scale to cloning & working on something like the result of git merge-ing every single repository on GitHub into one ginormous project, with every command that needed to walk/search/checkout a large part of that set able to either download the data it needed locally or ask a server to produce the results it needed.

Right now support for anything except PCRE regular expressions, but that’s just because right now I’m hacking in PCRE v2 support to grep and supporting POSIX regexes going forward for some of the things I’d like to do is a pain.

More generally, more than deprecating any one thing I think Git could really use deprecation cycles as part of its release process. I.e. we should make an exhaustive list of things we have now which suck, then at some point we break all of those, release v3.0, and start the process again by accumulating things we’ll break in v4.0.

The git-rebase.el plugin for Emacs, which is the only non-core Git interface I use which I’d miss enough to personally write a replacement for myself if it didn’t exist.

It’s sort of like a little git add --interactive, except for git rebase, which of course has an --interactive option already, but which isn’t very interactive at all. It would be neat to add something like git-rebase.el to Git itself, but we’d have to call it git rebase --actually-interactive, which brings us back to the backwards compatibility discussion.

Releases

Other News

Various

Light reading

Git tools and sites

Credits

This edition of Git Rev News was curated by Christian Couder <christian.couder@gmail.com>, Thomas Ferris Nicolaisen <tfnico@gmail.com>, Jakub Narębski <jnareb@gmail.com> and Markus Jansen <mja@jansen-preisler.de> with help from Ævar Arnfjörð Bjarmason and Michael J Gruber.