Git Rev News Edition 65 (July 29th, 2020)

Welcome to the 65th edition of Git Rev News, a digest of all things Git. For our goals, the archives, the way we work, and how to contribute or to subscribe, see the Git Rev News page on git.github.io.

This edition covers what happened during the month of June 2020.

Discussions

General

The history of master in Git (written by Andrew Ardill)

Amidst all the discussion around changing the default branch from master to something else, many people have asked why master was chosen in the first place. As master has a few different meanings in English, just which meaning was intended?

Konstantin Ryabitsev was the first to discuss the meaning of master, saying

Git doesn’t use “master-slave” terminology – the “master” comes from the concept of having a “master” from which copies (branches) are made

This post from the GNOME mailing list was then linked by Simon Pieters with the claim that

Git’s master is in fact a reference to master/slave

That post points out that the first use of master was in a CVS helper script, links that to BitKeeper (the version control system used to manage Linux development when Linus Torvalds first wrote Git), and claims BitKeeper used the “master and slave” meaning of master.

Many people considered master to mean a “master copy”, so this connection to slavery was very surprising.

Andrew Ardill investigated the BitKeeper source code and came to the conclusion that “the overwhelming majority of [the usages of master in BitKeeper] are of the “Master Copy” variant”, or as Michal Suchánek said “even in BitKeeper the use of master/slave is the exception rather than the norm.”

Off-list discussions were also ongoing, and Petr Baudis wrote on Twitter about naming the master branch in Git

I picked the names “master” (and “origin”) in the early Git tooling back in 2005.

(this probably means you shouldn’t give much weight to my name preferences :) )

I have wished many times I would have named them “main” (and “upstream”) instead.

Glad it’s happening @natfriedman

When asked which meaning of master was intended, Petr replied

“master” as in e.g. “master recording”. Perhaps you could say the original, but viewed from the production process perspective.

A clueless Central European youngster whose command of English was mostly illusory came up with the term, which is why it isn’t very obvious…

In a follow-up to that original GNOME mailing list post, Bastien Nocera retracted their claims from the original post, saying

I emailed Linus Torvalds recently… and he told me that it was unlikely that the “Git master” branch name was influenced by BitKeeper, and that “master” was “fairly standard naming” for this sort of thing and “more likely to be influenced by the CVS master repository”

Going on, Bastien discusses Petr Baudis’ tweets and then concludes “it doesn’t matter where the name comes from… The fact that it has bad connotations, or inspires dread for individuals and whole communities, is reason enough to change it.”

This is something that Brian M. Carlson had also pointed out on the Git mailing list, saying

“master”, even though from a different origin, brings the idea of human bondage and suffering to mind for a non-trivial number of people, which of course was not the intention and is undesirable. I suspect if we were making the decision today, we’d pick another name, since that’s not what we want people to think of when they use Git.

Brian goes on to lay out changes required in Git to rename master as the default, suggesting that there is a decent amount of work and that due to compatibility concerns “we’d probably want to make it in a [Git version] 3.0”.

Around the web, the discussion about renaming master continues. The incorrect claims around the history of master persist, even in our own Git Rev News: Edition 64, but seem to be quickly corrected where possible, such as on GitLab’s discussion on the topic.

Reviews

More commit-graph/Bloom filter improvements

Derrick Stolee, who prefers to be called Stolee, sent a patch series to the mailing list, based on a previous experimental patch series sent a few weeks earlier by Gábor Szeder.

When he sent his patch series, Gábor said that his work was a proof of concept started more than a year ago that he had not yet had time to finish. He was motivated to send it as-is, with changes to commit messages, after he recently took a look at the current changed-path Bloom filter implementation. This implementation had been developed over a long period, mainly by Garima Singh, and was merged at the beginning of May. He saw that it had some of the same issues that he had stumbled upon, and that it missed some optimization opportunities.

Gábor listed a lot of very interesting benefits from his work, but also a lot of drawbacks that would prevent it from being merged as is. Many of the benefits are linked to a new format used to store the changed-path Bloom filter. This new format was justified by an impressive commit message.

Stolee, Taylor Blau, Johannes Schindelin and Junio Hamano, when reviewing Gábor’s work, were disappointed that Gábor was not trying to contribute to the current implementation. It appeared though that a number of Gábor’s 34 patches and ideas could be applied on top of the current implementation.

That’s what Stolee did by first sending 10 patches from Gábor’s series at the beginning of June. This patch series required a bit of work, but Stolee left out what would have been more difficult to apply to the current code. René Scharfe, Stolee, Gábor and Junio commented a bit on it, but didn’t find anything that would require a new version of this patch series. So it is now “cooking” in the ‘next’ branch.

Stolee’s next patch series called “More commit-graph/Bloom filter improvements” was about adding a few extra improvements, several of which are rooted in Gábor’s original series. Even though Gábor’s patches did not apply or cherry-pick at all, Stolee still credited Gábor as the author of 4 patches out of 8.

This new series contained 2 changes that reduce the false-positive rate, which improves performance, and one change that improves usability. René and Taylor suggested improvements and bug fixes. Taylor even sent a patch.

Stolee then sent a version 2 of the series, taking into account the feedback and adding the patch from Taylor to the series. René, Gábor, Junio and Stolee discussed a few more points.

That led to Stolee sending a version 3 in which Gábor reported a bug that Stolee subsequently fixed.

So Stolee sent a version 4 which is now cooking in the ‘next’ branch, along with the first series that has 10 patches from Gábor.

In the meantime though Gábor commented on this first series saying that it has a number of issues. Hopefully these issues will be addressed soon, and these 2 patch series will be merged in the near future.
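For readers unfamiliar with the data structure behind this review, here is a minimal Python sketch of how a changed-path Bloom filter can answer "did this commit possibly touch this path?" and why false positives matter. It is a toy under stated assumptions, not Git's implementation: the class name `ChangedPathFilter`, the use of SHA-256, and the values `bits_per_entry=10` and `num_hashes=7` are all illustrative choices, and Git's actual on-disk format and hash function are not reproduced here.

```python
import hashlib


class ChangedPathFilter:
    """Toy Bloom filter over the paths changed by one commit (illustrative only)."""

    def __init__(self, paths, bits_per_entry=10, num_hashes=7):
        # Size the bit array in proportion to the number of changed paths.
        self.num_bits = max(8, bits_per_entry * len(paths))
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array
        for path in paths:
            for pos in self._positions(path):
                self.bits |= 1 << pos

    def _positions(self, path):
        # Derive num_hashes bit positions by double hashing: h1 + i * h2.
        # SHA-256 is an assumption made for convenience in this sketch.
        digest = hashlib.sha256(path.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big")
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def maybe_changed(self, path):
        # False means "definitely not changed by this commit".
        # True means "possibly changed": the caller must still compute the diff.
        return all((self.bits >> pos) & 1 for pos in self._positions(path))


# Hypothetical commit that touched three files.
commit_filter = ChangedPathFilter(["Makefile", "bloom.c", "commit-graph.c"])
print(commit_filter.maybe_changed("bloom.c"))    # True: added paths always match
print(commit_filter.maybe_changed("README.md"))  # usually False, occasionally a false positive
```

A path that was added to the filter always matches, so a "no" answer lets a history walk skip the commit without computing a diff. A false positive only costs an unnecessary diff, which is why lowering the false-positive rate translates into better performance.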

Developer Spotlight: Jonathan Tan

Who are you and what do you do?

I’m a Software Engineer at Google who works on Git. I also contribute to JGit (a Java implementation of Git) as one of its committers.

What would you name your most important contribution to Git?

I would say “partial clone” - the ability to clone a repository, but not necessarily have all of that repository’s objects (accumulated throughout its history) in your clone. Quite a few articles have been written about it, but in summary, it improves Git performance especially for large repositories.

What are you doing on the Git project these days, and why?

The thing that immediately comes to mind is “partial clone”. The fundamentals are there, but some Git commands still operate under the assumption that objects are only a disk read away (instead of a network fetch - in a partial clone, if an object is needed but missing, it is automatically fetched). I’m improving those commands to be more cognizant of this fact - typically, this means batching the fetch of all the objects it will need once it realizes that it does not have some of them, instead of “I need this object, so go fetch it; OK let me process it; oops I need another one, so go fetch that”.
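The difference between fetching missing objects one at a time and batching them can be sketched in a few lines of Python. This is only a model of the idea described above: `FakeRemote`, `read_one_by_one` and `read_batched` are hypothetical names invented for the illustration, objects are plain strings, and a "round trip" is just a counter rather than a real network request.

```python
class FakeRemote:
    """Stand-in for a Git server; counts round trips instead of using a network."""

    def __init__(self, objects):
        self.objects = dict(objects)
        self.round_trips = 0

    def fetch(self, object_ids):
        # One call models one network round trip, however many objects it carries.
        self.round_trips += 1
        return {oid: self.objects[oid] for oid in object_ids}


def read_one_by_one(needed, local, remote):
    # "I need this object, so go fetch it; oops, I need another one."
    for oid in needed:
        if oid not in local:
            local.update(remote.fetch([oid]))
    return [local[oid] for oid in needed]


def read_batched(needed, local, remote):
    # Work out everything that is missing first, then fetch it in one request.
    missing = [oid for oid in needed if oid not in local]
    if missing:
        local.update(remote.fetch(missing))
    return [local[oid] for oid in needed]


# Hypothetical repository contents; the local clone only has "a1".
server = {"a1": "blob A", "b2": "blob B", "c3": "blob C"}
needed = ["a1", "b2", "c3"]

slow = FakeRemote(server)
read_one_by_one(needed, {"a1": "blob A"}, slow)

fast = FakeRemote(server)
read_batched(needed, {"a1": "blob A"}, fast)

print(slow.round_trips, fast.round_trips)  # 2 1
```

Both functions return the same objects. The batched version simply pays for one round trip instead of one per missing object, which is the saving that matters when each fetch crosses the network.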

If you could get a team of expert developers to work full time on something in Git for a full year, what would it be?

Along the lines of “partial clone” and large repositories, I would like them to investigate the feasibility of having Git servers be able to serve results of computations (thus, not just objects). One case is git blame - if a Git client could ask a Git server to send the results of such a command, it could offload most of the computation to the server, only needing to build upon the server’s results with the locally-created objects that the server does not know about. This is especially useful with partial clone, because the client does not even have most of the objects needed and would have to fetch them otherwise.

If you could remove something from Git without worrying about backwards compatibility, what would it be?

One small thing that I can think of: remove the ability of git reset to update the working tree and the objects staged in the index. The git restore command, relatively recently introduced, does this with more beginner-friendly parameter names (--worktree and --staged, respectively, instead of the --hard, --mixed, and --soft of git reset). This change would make it easier, for example, to read scripts written by other people - I would no longer need to think so much about what that reset in the script would do.

Releases

Git 2.28.0, 2.28.0-rc2, 2.28.0-rc1, 2.28.0-rc0
Git for Windows 2.28.0, 2.28.0-rc2, 2.28.0-rc1, 2.28.0-rc0
git-filter-repo 2.28.0
Bitbucket Server 7.4
Gerrit Code Review 3.1.8, 3.0.12, 2.16.22
GitHub Enterprise 2.21.3, 2.20.12, 2.19.18, 2.18.23, 2.21.2, 2.20.11, 2.19.17, 2.18.22
GitLab 13.1.5, 13.2.1, 13.2, 13.0.10, 13.1.4, 13.1.3, 13.0.9 and 12.10.14, 13.1.2, 13.0.8 and 12.10.13
GitKraken 7.1.0
GitHub Desktop 2.5.3

Other News

Events

Carmen Andoh, who works for Google, and Jonathan Nieder’s team at Google have volunteered to organize a Git Inclusion Summit. It would be a virtual contributor summit with the purpose of engaging core Git contributors as active participants in diversity and inclusion initiatives for the Git project. Interested Git contributors can vote on their preferred summit duration and times on a whenisgood poll by Thursday, July 30th.

Various

Junio Hamano, the Git maintainer, has renamed the pu branch of git.git to seen. This has been done to use a more meaningful name and make room for topics from those contributors whose two-letter name abbreviation needs to be ‘pu’. This was announced in “What’s cooking in git.git (Jun 2020, #04; Mon, 22)”.
The Git Project Leadership Committee has been briefly interviewed via email by Elizabeth Landau for an article in Wired about current changes to Git’s default name for the initial branch.
Highlights from Git 2.28 by Taylor Blau on GitHub Blog, mentioning among others init.defaultBranch, changed-path Bloom filters, the git bugreport command and git log’s new --show-pulls option.
The Tower Git client for Windows and macOS now supports CMD+Z for Git (a universal undo).
Exciting new updates to the Git experience in Microsoft Visual Studio 2019.
GitHub Archive Program: the journey of the world’s open source code to the Arctic by Julia Metcalf on GitHub Blog. The GitHub Archive Program along with the GitHub Arctic Code Vault were introduced at GitHub Universe 2019, and mentioned in Git Rev News #57 (November 20th, 2019).
Updating the Git protocol for SHA-256 [LWN.net] by John Coggeshall.

Light reading

Git Rebase - A Complete Guide by Brooke Kuhlmann at Alchemists.
How to safely use GitHub Actions in organizations by Nicholas C. Zakas, mainly about handling credentials and other secrets. Various tools for checking the repository for secrets and/or safely storing secrets were mentioned in Git Rev News Edition #25, #28, #36, #39 and #57.
Fedora Classroom: Git 101 with Pagure session was streamed on YouTube on the Fedora Project channel.
How To Create A GitHub Profile README by Monica Powell on Dev.to.
Top 13 GitHub Alternatives in 2020 [Free and Paid] by Momchil Koychev on DevOps Zone.
Git Best Practices – AFTER Technique by Rajeev Bera on DevOps Zone.
6 best practices for teams using Git by Ravi Chandran on OpenSource.com.
Use broot and meld to diff before commit by Denys Séguret (author of broot, which is a tool to navigate file trees).
Basic Git Analogy for Contributing to Open Source Project by Sagar Seghal on Medium.
Can You Restore A Deleted Commit on Git? by Dmytro Khmelenko on Hacker Noon (the answer is yes, with the help of the reflog).
Git Concepts I Wish I Knew Years Ago by Gabriel Abud on Dev.to.

Git tools and sites

Git Lint - A command line interface for analyzing Git commit quality and consistency for yourself and/or team. Can be used as a Git Hook and/or wired into your continuous integration build system.
git-assembler: update git branches using high-level instructions; it can perform automatic merge and rebase operations following a simple declarative script (like “make”, for branches).
git-manpages-l10n is a repository for translating Git manpages (the Git documentation).
icdiff is an improved colored diff. Instead of trying to be a diff replacement for all circumstances, the goal of icdiff is to be a tool you can reach for to get a better picture of what changed when it’s not immediately obvious from diff. Docs include examples on how to integrate it with Git, Mercurial and Subversion.
Guitar is a multi-platform graphical Git client under development, written in C++ and powered by Qt.
SCM Breeze is a set of shell scripts (for bash and zsh) that make it easier to use Git. It integrates with your shell to give you numbered file shortcuts, a repository index with tab completion, and a community driven collection of useful SCM functions. SCM Breeze lives on GitHub at https://github.com/ndbroadbent/scm_breeze
Gitpod - Prebuilt Dev Environments for GitLab, GitHub and Bitbucket; on the cloud or self-hosted, with a free tier.

Credits

This edition of Git Rev News was curated by Christian Couder <christian.couder@gmail.com>, Jakub Narębski <jnareb@gmail.com>, Markus Jansen <mja@jansen-preisler.de> and Kaartic Sivaraam <kaartic.sivaraam@gmail.com> with help from Andrew Ardill, Jonathan Tan, Brooke Kuhlmann, Eric Sunshine, Carlo Marcelo Arenas Belón and Gábor Szeder.