Git Rev News: Edition 21 (November 16th, 2016)

Welcome to the 21st edition of Git Rev News, a digest of all things Git. For our goals, the archives, the way we work, and how to contribute or to subscribe, see the Git Rev News page on git.github.io.

This edition covers what happened during the month of October 2016.

Discussions

General

There were a number of Git-related discussions at the Google Summer of Code (GSoC) Mentor Summit, which took place at Google's offices in Sunnyvale, California, on October 29th and 30th.

298 mentors from 149 organizations attended the event. There were some talks that had been planned by the GSoC team at Google to talk mostly about the past GSoC and the future one. But most of the event was organized in an “unconference” style by the mentors who were there.

The first “unconference” style Git-related talk was led by Brendan Forster and Parker Moore from GitHub. Their goal was to get input from mentors about what they like and don’t like about GitHub.

Most of the GitHub features discussed were related to GitHub Pages and to discussion threads in issues and pull requests. Parker, who is also the Jekyll maintainer, often replied to requests with “you’ll be happy… soon”. So it looked like GitHub had been working on a number of new features in those areas and was planning to release some of them soon, though Parker said they couldn’t promise anything concrete.

Some people specifically asked if it would be possible to have better integration between email and discussion threads in issues and pull requests. Discussions about this subject often happen on the Git mailing list; for example, there was a long one last August under the title “Working with public-inbox.org”. But it looks like it is a complex and sensitive subject, and it is not likely that good solutions will appear soon.

Ironically the next unconference talk, led by Martin Braun from GNU Radio in the same room, called “The Closed-Source Proliferation”, was about the fact that many open source projects now use and depend on closed source tools like GitHub and Slack.

A number of mentors said that they are using GitHub because of the network effect and also because they don’t want to spend time, and maybe money, managing their own servers and a number of different tools on them.

Some people replied that it should be possible to have projects hosted by university-related organizations like OSU Open Source Lab using open source tools. It also appears that Canadian universities are now required to host their software on servers located in Canada, which excludes GitHub, so some universities there have started to set up their own solutions.

People mentioned that GitLab-CE, the GitLab Community Edition, was a good solution for them, but others were not happy that there is GitLab-EE, the GitLab Enterprise Edition, which is not open source.

The last Git-related talk, called “Git/Gerrit”, had been planned by the Google team and was given by Shawn Pearce. Shawn used to work on Git a lot, and created JGit, EGit and Gerrit. He is now leading a large team at Google working on version-control-related things. Four members of his team have been contributing to Git. Stefan Beller and Jonathan Nieder have been contributing for a long time, while Brandon Williams and Jonathan Tan started contributing more recently.

In his talk, Shawn described how Git has been developing a big test suite since its beginning in 2005 and that it’s “worth its weight when you have 1438 contributors”. It has helped Junio set a “consistent bar about quality” and has been a “huge success” that has “prevented too many regressions to be counted”.

In contrast, Shawn said that he started Gerrit in 2008, but they didn’t really test its REST API until 2013, and didn’t do any UI tests until 2016, “shame on me”.

For a long time Gerrit tests relied on “monkeys testing everything” and there were a lot of regressions. It was hard to have confidence in releases.

Gerrit now has 284 contributors, 1847 JUnit tests and 524 Polymer tests. The tests give confidence in the quality of new releases. They are run on every commit, which is easy to do with Gerrit as it can be linked with tools like Travis CI, Circle CI or Jenkins, and the results of the tests can be displayed in the interface along with the review of each commit.

He said, though, that automation has its limits, as it is difficult to test all configurations.

Support

Aaron Pelly asked for a new feature on the mailing list:

I want git to be able to include, in its gitignore files, sub-files of ignores or have it understand a directory of ignore files. Or both.

He wanted to be able to pull from https://github.com/github/gitignore and “include relevant bits project by project and/or system wide”, without having to “update many projects manually if that, or any other, repo changes”.

And after discussing possible implementations, he asked:

I would like to know the desirability/practicality/stupidity of such a feature as I believe it is within my skillset to implement it.

Stefan Beller suggested, as a hack, (sym)linking .git/info/exclude to an up-to-date version of the gitignore repo. But Aaron said he would still need to copy stuff from one file to another by hand, as there would be sections that are project, language, editor, machine, or whatever specific.

Then Alexei Lozovsky, Jacob Keller, Jeff King, Duy Nguyen and Aaron discussed possible implementations. One possibility was to add “include” directives in .gitignore files, but that was considered complex and dangerous.

The other possibility, favored by the reviewers, was to add either “.gitignore.d” or “.git/info/exclude.d” directories that would contain many files. The content of those files would be concatenated by Git to get the actual information about what should be ignored.

Jacob Keller, alias Jake, said that the reading of files in such a directory should “exclude reading .git or other hidden files in some documented manor so as to avoid problems when linking to a git directory for its contents”.

Jeff King, alias Peff, said that the “.git/info/exclude.d” approach could be implemented without any changes to Git by doing:

  path=.git/info/exclude
  cat $path.d/* >$path

and if we actually implement something like that, then

  cd .git/info
  git clone /my/exclude/repo exclude ;# or exclude.d

should work.

Duy replied that there could be complications with negative patterns though.
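One plausible complication, sketched below, is ordering: in ignore files the last matching pattern wins, so the order in which fragments are concatenated changes what a negative pattern does. This is only an illustration under stated assumptions. The summary above does not say which complication Duy had in mind, Git itself does not read an “exclude.d” directory, and the fragment names 10-logs and 20-keep are hypothetical.

```shell
#!/bin/sh
# Sketch: emulate a hypothetical .git/info/exclude.d by concatenation,
# as in Peff's snippet, and show that fragment order matters.
set -e

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Hypothetical fragments: one ignores all logs, one re-includes keep.log.
mkdir -p .git/info/exclude.d
printf '%s\n' '*.log' >.git/info/exclude.d/10-logs
printf '%s\n' '!keep.log' >.git/info/exclude.d/20-keep

touch debug.log keep.log

# Glob order (10-logs, then 20-keep): the negative pattern comes last.
cat .git/info/exclude.d/* >.git/info/exclude
in_order=$(git ls-files --others --exclude-standard)

# Reversed order: '*.log' now comes last and overrides '!keep.log'.
cat .git/info/exclude.d/20-keep .git/info/exclude.d/10-logs >.git/info/exclude
reversed=$(git ls-files --others --exclude-standard)

echo "untracked, not ignored (glob order): ${in_order:-<none>}"
echo "untracked, not ignored (reversed):   ${reversed:-<none>}"
```

With glob order only keep.log shows up as untracked; with the reversed order both files are ignored. Any real implementation would therefore need a documented, stable ordering of the files in the directory.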

Junio Hamano wrote that he does “not see the point of making in-tree .gitignore to a forest of .gitignore.d/ at all, compatibility complications is not worth even thinking about”. But it looks like the possibility of having a “.git/info/exclude.d” directory is still open.

Developer Spotlight: Jacob Keller, alias Jake

My name is Jake, I’m an avid contributor to various open source projects. I currently work for Intel doing Linux network programming for their 10GbE/40GbE Ethernet networking driver. I have contributed to Git to help resolve specific issues I’ve had in the past, and continue to contribute as I like to give back to the communities that I depend on as a software developer.

I would say my most important contribution was modifying the refspec globs for fetching to allow globbing past / (slash) boundaries. However, the largest contribution is probably the diff --submodule=inline-diff format for displaying the full diff between a submodule change.

Currently I have mostly spent time trying to find areas where my review could be helpful. There are a few projects I wouldn’t mind working on, but as my employer has not hired me directly to work on Git, that limits the amount of time that I spend during work hours. I’m certainly open to new opportunities to contribute in the future.

Hmm. This is a tough question. I think the biggest things I would work towards is implementing something like git-series as a core part of git, at the very least providing the tools needed to make storing the meta-history easier. For example, using git series today is pretty good, but I often mess up and use regular git commands to look at branches or status, and it can tend to make the experience very brittle. Additionally there is the effort to implement gitrefs for storing reachability information about objects. Having this sort of tool built into git would allow more commands and areas to recognize them easily, making the entire experience much smoother.

Honestly? I would re-write much of the interface to be more consistent, removing aspects which aren’t consistent with some of our more modern design. A good example, was a colleague of mine who is not very fluent in git recently tried to checkout a new branch from our main remote, and ended up doing something like:

$ git branch -a
# copy the "remotes/origin/branch"
$ git checkout <copied text>

which unfortunately didn’t actually end up doing what she had intended. She then accidentally included commits into a release tag without realizing they weren’t on that particular branch. These sort of pain points exist in a lot of places. Most of the time, they exist because the tool provides additional expressibility and power at some expense of cost. It may also be simply a documentation issue.

However, there are many warts on the user interface that I would love to be able to deprecate and remove as they cause issues for co-workers who are new to Git. I love using Git, and I think the extra tools that we have created are very beneficial, but I have many coworkers who haven’t gotten over that initial road block.
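As an editorial illustration of the checkout mishap described above: the interview does not say exactly what went wrong, but a likely explanation (an assumption here) is that checking out the name copied from “git branch -a” leaves HEAD detached instead of creating a local branch. The throwaway repositories and the branch name “topic” below are hypothetical.

```shell
#!/bin/sh
# Sketch: checking out a remote-tracking ref detaches HEAD;
# creating a local branch from it does not.
set -e

work=$(mktemp -d)
git init -q "$work/upstream"
git -C "$work/upstream" -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial"
git -C "$work/upstream" branch topic

git clone -q "$work/upstream" "$work/clone"
cd "$work/clone"

# Checking out the copied "remotes/origin/topic" name: HEAD becomes detached,
# so new commits made here would not land on any branch.
git checkout -q remotes/origin/topic
state_after_copy=$(git symbolic-ref -q --short HEAD || echo "detached")

# Creating a local branch that starts at (and tracks) the remote branch.
git checkout -q -b topic origin/topic
state_after_branch=$(git symbolic-ref -q --short HEAD || echo "detached")

echo "after checking out remotes/origin/topic: $state_after_copy"
echo "after checkout -b topic origin/topic:    $state_after_branch"
```

The first state is reported as detached, the second as the local branch topic.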

I think right now it is git-series. I have also extensively used Stacked Git in the past. Both of these tools help to manage a patch series, and this is something that I think core git is currently lacking in.

Definitely interactive addition, and the ability to rebase local history. Ever since I realized that I could re-write history, I have changed my development model to develop and commit fast, then re-write to look good later. Additionally, I aim to make my work easily bisectable, since I have used git-bisect “run” to varying degrees of success. The number of times that I’ve tried to do some history archeology and ended up on an “Import from CVS” commit that meant nothing to my issue, has led me to heavily make use of tools to make my history as presentable as possible.

I think pull-requests, with avid local rebasing and squashing of commits. Followed shortly by the use of a tool like Gerrit or a mailing list for review.

So, at my job, we used to use CVS to store our project history. We have a lot of tooling around this history, including build servers. We eventually migrated to Git, but one of the funny things that we did carried over. When you have to compare two “versions” in CVS, the only reasonable way to do so is to tag, since tags are the only thing combining multiple files together.

So, the people who migrated tools for Git decided that Git builds would create the same style of tags of the form “tag_<YYMMDD>_<HHMMSS>”, which are these ridiculous tags that we need for CVS comparing of different build versions.

I tried to make them change the tool, but they refused. I found out that I could exclude certain refs when fetching, by not fetching tags and instead fetching specific “refs/tags/*”. However, this would still fetch all these tags I didn’t care about. So I modified Git to allow using * with less than a full section, so I could say fetch “refs/tags/driver-*” which would fetch human readable tags that were meaningful without fetching hundreds of tags created by nightly builds.
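As an editorial illustration of the partial-glob fetch described above, the following sketch sets up a throwaway upstream with one human-readable tag and one build-style tag, then fetches only the “driver-*” tags. The repository layout and tag names are hypothetical, and the configuration shown is one possible setup rather than necessarily the one Jake uses; it assumes a Git version that accepts globs covering less than a full path component in refspecs.

```shell
#!/bin/sh
# Sketch: fetch only tags matching refs/tags/driver-* and skip the rest.
set -e

work=$(mktemp -d)
git init -q "$work/upstream"
cd "$work/upstream"
git -c user.name=Demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial"
git tag driver-1.0            # hypothetical meaningful tag
git tag tag_161016_010203     # hypothetical nightly-build tag

git init -q "$work/clone"
cd "$work/clone"
git remote add origin "$work/upstream"

# Turn off automatic tag following, then ask only for the tags we want.
git config remote.origin.tagOpt --no-tags
git config --add remote.origin.fetch '+refs/tags/driver-*:refs/tags/driver-*'

git fetch -q origin
tags=$(git tag --list)

echo "tags in clone: $tags"
```

Only driver-1.0 ends up in the clone; the build-style tag stays on the upstream side.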

Releases

Other News

Various

Events

Light reading

Watching

Git tools and sites

Credits

This edition of Git Rev News was curated by Christian Couder <christian.couder@gmail.com>, Thomas Ferris Nicolaisen <tfnico@gmail.com> and Jakub Narębski <jnareb@gmail.com>, with help from Jacob Keller, Johannes Schindelin, Markus Jansen, Gábor Szeder and Jeff King.