Git Rev News: Edition 9 (November 11th, 2015)

Welcome to the 9th edition of Git Rev News, a digest of all things Git. For our goals, the archives, the way we work, and how to contribute or to subscribe, see the Git Rev News page on git.github.io.

This edition covers what happened during the month of October 2015.

Discussions

Reviews

Stefan Beller started posting patch series to “finish the on going efforts of parallelizing submodule network traffic”.

This followed previous work by Stefan to make it possible to launch many git submodule fetches in parallel.

Interestingly, a few weeks before posting the first version of his patch series, Stefan had been involved in a discussion started by Kannan Goundan, who asked whether it would be possible to “Make ‘git checkout’ automatically update submodules?”.

In this previous discussion, Stefan pointed Kannan to a wiki that contains a lot of information about the submodule implementation, including pointers to some current developments that have not been posted to the mailing list yet. This wiki has been maintained since September 2010 by Jens Lehmann and Heiko Voigt, who have been working on git submodule for a long time.

When Stefan posted his patch series, it attracted the attention of many reviewers like Eric Sunshine, Ramsay Jones, Jonathan Nieder and Junio Hamano. As usual the reviewers made sensible suggestions and Stefan soon posted another version of his patch series.

Hopefully the tremendous work by Stefan and the reviewers will soon make it possible to have improved submodule performance.

On Saturday November 7th and Sunday November 8th the annual GerritSummit took place. In a discussion there, Martin Fick pointed out that parallelism may hurt the user: when he used repo -j 8 to sync many Git repositories down to disk, subsequent disk operations took longer than expected because the repositories had not been written to disk in a linear fashion. This will be an interesting benchmark for submodules as well.
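As a rough illustration of what fetching several repositories in parallel with a bounded number of jobs looks like (in the spirit of the -j 8 option mentioned above), here is a minimal Python sketch. It is not Stefan's implementation, which lives in Git's own code base. The names fetch_all and git_fetch and the jobs parameter are hypothetical, and the only Git command assumed is a plain git fetch run through git -C <path>.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def git_fetch(path):
    """Run `git fetch` inside the repository at `path`; return its exit code."""
    return subprocess.run(["git", "-C", path, "fetch"]).returncode


def fetch_all(paths, jobs=8, fetch=git_fetch):
    """Fetch every repository in `paths`, running at most `jobs` fetches at once.

    `fetch` is injectable (an assumption of this sketch) so the scheduling
    logic can be exercised without touching the network.
    Returns a dict mapping each path to whatever `fetch` returned for it.
    """
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # pool.map preserves input order, so results line up with paths.
        results = list(pool.map(fetch, paths))
    return dict(zip(paths, results))
```

The jobs limit is the knob that matters for the concern Martin raised: a higher value may shorten the network phase while interleaving writes from many repositories on disk, so the best setting would have to be measured rather than assumed.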

Support

At the end of last September, Karsten Blees sent an email starting with the following:

I think I found a few nasty problems with racy detection, as well as performance issues when using git implementations with different file time resolutions on the same repository (e.g. git compiled with and without USE_NSEC, libgit2 compiled with and without USE_NSEC, JGit executed in different Java implementations…).

He listed and detailed some interesting “Notable file time facts” about how file time is implemented in Linux, Windows, Java and the different Git implementations (git, libgit2 and JGit).

Karsten noted 4 problems related to the above facts.

Karsten proceeded to suggest several possible solutions for these, all detailed and well written. A few days later he followed up with an RFC patch called “read-cache: fix file time comparisons with different precisions” to take care of some of the problems he described.
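To see why mixed precisions are a problem, consider a timestamp recorded once with whole-second resolution and once with nanosecond resolution. The Python sketch below is only an illustration and is not Karsten's patch. The function names are hypothetical, and the rule of treating a zero nanosecond field as "recorded with whole-second precision" is an assumption made for this example.

```python
def naive_same_time(a, b):
    """Compare two (seconds, nanoseconds) timestamps field by field."""
    return a == b


def precision_aware_same_time(a, b):
    """Compare two (seconds, nanoseconds) timestamps of possibly different precision.

    Assumption of this sketch: a nanosecond field of 0 means the writer only
    had whole-second resolution, so only the seconds can be compared.
    """
    a_sec, a_nsec = a
    b_sec, b_nsec = b
    if a_sec != b_sec:
        return False
    if a_nsec == 0 or b_nsec == 0:
        # At least one side lacks sub-second information: fall back to seconds.
        return True
    return a_nsec == b_nsec


# Same file time, stored once without and once with nanoseconds.
stored_in_index = (1446000000, 0)
read_from_disk = (1446000000, 123456789)
```

With the naive comparison the two values above differ, so an unchanged file would look modified and its content would have to be re-checked. The precision-aware version accepts them as equal, at the cost of being less able to notice a change made within the same second, which is the kind of trade-off the racy detection discussion is about.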

Junio Hamano and Johannes Schindelin both reviewed the suggested solutions as well as the RFC patch, and found it all sensible.

There is still some way to go, as the patch has not been merged yet. Hopefully some progress will be made in this area soon, using Karsten’s detailed emails as a reference for future work.

Developer Spotlight: Matthieu Moy

Q: Who are you, and what do you do?

A: I’m an occasional contributor to Git, and I maintain several Git-related tools like git-multimail, git-latexdiff and to some extent Git-Mediawiki. I also teach Git (to students at Ensimag and in lifelong learning). In 2014 and 2015, I mentored GSoC projects for the Git organization, and I’ve been co-administrator for Git in 2015.

Q: How did you start getting involved with Git?

A: My first non-trivial contribution to free software was on version-control, before Git existed. I got involved in GNU Arch, then its fork Bazaar 1.x. And then GNU Arch and Bazaar 1.x died, and I moved on to something else. At the same time, I started teaching version-control, and hated centralized systems enough to migrate to Git. Teaching Git in 2009 was a funny experience: the tool was starting to get a decent user-interface, but was lacking a lot of polish. One of my favorite examples is what happens when you push to a remote that has commits that you don’t have locally. Initially, users were getting a message like:

! [rejected]        master -> master (non-fast-forward)

I wrote a rather straightforward patch to change it to:

! [rejected]        master -> master (non-fast-forward)
To prevent you from losing history, non-fast-forward updates were rejected.
Merge the remote changes before pushing again.
See 'non-fast forward' section of 'git push --help' for details.

The students went from “Huh?” to “Wow, a 3-line-long message, that’s long. What shall I do now (given that reading the actual doc is not an option)?”. Then, I added an explicit mention of “git pull” in the message, and the situation became manageable for most students. Many of my contributions to Git follow this principle: see what users have difficulties with, and try to improve the tool to help them. In many cases, a straightforward patch to improve the error message was sufficient: in case of error, explain what’s going on to the user, and give the way out (“did you mean: …?” or “use … to …”).

Q: What would you name your most important contribution to Git?

A: In general, most of my contributions are to be found in the user-interface and in the documentation. To define which is the most important, we’d have to define “important” first.

In terms of impact on Git’s usability, my biggest contribution is probably my involvement in the change of the default value of push.default from matching to simple (i.e. roughly “push only the current branch by default”). I was not alone in the discussion, and this was really more teamwork than a personal contribution, but I think I played an important role in the discussion to understand what the new default behavior should be, and in defining the migration path (this was a backward incompatible change, which Git avoids as much as possible, and we had to find a way to do this without hurting users).

In terms of amount of work, my biggest contribution is certainly supervising students, both as a teacher, as I offer my students a “contribute to free software” project every year, and as a GSoC mentor. The most visible change done by my students is probably the advice in the output of git status (like “You are currently bisecting”, …).

Q: What are you doing on the Git project these days, and why?

A: These days, I’m taking a break after having spent a lot of time contributing to Git and git-multimail. I’m continuing my Git activities by following the mailing-list, occasionally helping users and reviewing code when I get time, but I’m limited by the good old “days have only 24h” issue ;-).

I hope to get more time to work on git-multimail. Since I became the maintainer after discussing with Michael Haggerty at Git Merge, I’m happy I managed to merge or close all the pending pull-requests, set up a better test-suite, port to Python 3, … The todo-list is still long, and there are a lot more funny things to write!

Q: If you could get a team of expert developers to work full time on something in Git for a full year, what would it be?

A: For my personal use of Git, the tool is already good enough. Still, I could use the help of a team of experts to help Git. I would probably ask them to work in priority on scalability (yes, we went from “Git is crazy fast” when Linus wrote the initial version 10 years ago to “What makes Git so slow?” given the size of projects people use it on), and on gathering some Git forks and related tools in the same codebase. Currently, git.git and libgit2 are two separate projects, and I think they would benefit from more code sharing. There are forks of Git in several companies, and tools like repo which were designed partly to compensate for some weaknesses of Git, but having these features directly in Git would be better both for the community and for users IMHO.

Q: If you could remove something from Git without worrying about backwards compatibility, what would it be?

A: I’m geek enough to like tools that have too many features ;-). But I’d remove any instance of “cache” or “cached” referring to the Git index in the user-interface and documentation. “index” is not such a good term in my opinion, but it’s already much better than “cache” (which suggests that it’s a performance improvement that doesn’t change the functionality, while it’s not).

Q: What is your favourite Git-related tool/library, outside of Git itself?

A: That would be Magit. I’ve stayed away from Emacs Git interfaces for a while because I wanted to force myself to use the command-line for two reasons: as a Git contributor, to see the drawbacks of the cli and get a chance to improve it, and as a teacher, to put myself in the same position as my students. Still, I like Emacs, and I like using a VCS from within Emacs (once upon a time, I was even the maintainer of an Emacs interface for GNU Arch, memories, sweet memories…). I recently started to use Magit, and I really like it. It doesn’t try to hide Git from me, but gives me a lot of shortcuts and interactive features on top of it.

Releases

Other News

Light reading

Git tools and sites

Credits

This edition of Git Rev News was curated by Christian Couder <christian.couder@gmail.com>, Thomas Ferris Nicolaisen <tfnico@gmail.com> and Nicola Paolucci <npaolucci@atlassian.com>, with help from Matthieu Moy.