Welcome to the 68th edition of Git Rev News, a digest of all things Git. For our goals, the archives, the way we work, and how to contribute or to subscribe, see the Git Rev News page on git.github.io.
This edition covers what happened during the month of September 2020.
Apply Git bundle to source tree?
Andreas Grünbacher asked the mailing list if there was “a way to apply a particular head in a bundle to a source tree”. Using the Linux kernel as an example, he said he would like to create a bundle containing Git objects from version 5.8 to version 5.9-rc1:
$ git bundle create v5.9-rc1.bundle v5.8..v5.9-rc1
and then be able to “apply” this bundle either to an existing repository or only a source tree that would already contain code for version 5.8.
Taylor Blau replied to Andreas that there is no such thing as git bundle apply, but that git fetch and git pull can be used to update an existing repo from a bundle, for example:
$ git pull /path/to/bundle 'refs/tags/v5.9-rc1'
Andreas replied to Taylor that he was looking for a way to apply a bundle to an actual source tree, not a Git repository, but Taylor didn’t think it was possible.
Then another Andreas, Andreas Schwab, chimed in to discuss with Andreas Grünbacher what basis was needed to “apply” a bundle.
In the meantime Junio Hamano, the Git maintainer, replied to the original email from Andreas G. saying that a bundle created with “v5.8..v5.9-rc1” was not like a patch but rather “an equivalent of a (shallow) repository” that required having v5.8 for the bundle to be usable. Junio suggested ways to use such a bundle to update and then work in an existing repo that has v5.8, and asked for ideas to improve the git bundle documentation.
Andreas G. replied to Junio that the documentation was fine, and that he then saw that there’s simply not enough information in a bundle for what he wants to achieve on a source tree.
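Junio's point that a range bundle only works on top of its base can be tried out on a throwaway repository. The following is a minimal sketch, not taken from the thread: the tags v1 and v2 are hypothetical stand-ins for v5.8 and v5.9-rc1, and all paths live in a temporary directory.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)

# Source repository with a first tagged commit (v1 stands in for v5.8).
git init -q "$work/src"
git -C "$work/src" config user.name "Example"
git -C "$work/src" config user.email "example@example.com"
echo one > "$work/src/file"
git -C "$work/src" add file
git -C "$work/src" commit -q -m "first"
git -C "$work/src" tag v1

# A second repository that has v1, but nothing newer.
git clone -q "$work/src" "$work/dst"

# Add a second tagged commit (v2 stands in for v5.9-rc1) and bundle the range.
echo two > "$work/src/file"
git -C "$work/src" commit -q -a -m "second"
git -C "$work/src" tag v2
git -C "$work/src" bundle create "$work/v2.bundle" v1..v2

# The bundle verifies and can be fetched where the prerequisite v1 exists.
git -C "$work/dst" bundle verify "$work/v2.bundle"
git -C "$work/dst" fetch -q "$work/v2.bundle" 'refs/tags/v2:refs/tags/v2'

# In an empty repository the prerequisite is missing, so verification fails.
git init -q "$work/empty"
if git -C "$work/empty" bundle verify "$work/v2.bundle" >/dev/null 2>&1; then
    empty_status=accepted
else
    empty_status=rejected
fi
echo "empty repository: bundle $empty_status"
```

The fetch only adds objects and the tag to the second repository. It does not touch any checked-out files, which is why a bare source tree without a Git repository cannot make use of the bundle.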
Konstantin Ryabitsev, a maintainer of kernel.org, also replied to Andreas G.’s initial email suggesting the following command:
curl --header 'Accept-Encoding: gzip' -L https://git.kernel.org/torvalds/p/v5.9-rc1/v5.8 |
gunzip - | git apply
Andreas G. liked it, but said that the use case he was thinking about was to replace the patches that are, along with a baseline release, in source packages provided by Linux distributions. As bundles would provide actual Git history, they would be nicer than patches, if they could replace them.
Thomas Guyot-Sionnest replied that bundles could do that and that it was a “neat idea”. He said “bundles could be used for both the base release and patches”. The source packages would be “bigger initially than a single release”, but old bundles could help downloading just the additional bits needed for the next versions. That would help packagers as patches are “best maintained in Git” already.
Thomas also mentioned that he has been using bundles “since the very beginning” for backups as it’s “an efficient way to archive an entire bare repo into a single file and ship it offsite”.
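The backup use that Thomas describes could look roughly like the sketch below. It is an illustration under assumptions, not his actual setup: the repository, file names, and temporary directory are all hypothetical.

```shell
#!/bin/sh
set -e
backup_dir=$(mktemp -d)

# A small repository to back up (hypothetical content).
git init -q "$backup_dir/repo"
git -C "$backup_dir/repo" config user.name "Example"
git -C "$backup_dir/repo" config user.email "example@example.com"
echo hello > "$backup_dir/repo/file"
git -C "$backup_dir/repo" add file
git -C "$backup_dir/repo" commit -q -m "initial"

# Archive every ref, including HEAD, into a single file.
git -C "$backup_dir/repo" bundle create "$backup_dir/backup.bundle" --all

# Restoring is a plain clone from that file.
git clone -q "$backup_dir/backup.bundle" "$backup_dir/restored"
git -C "$backup_dir/restored" log --oneline
```

Since the bundle is an ordinary file, it can be copied offsite with any file transfer tool.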
Brian M. Carlson also replied to Andreas G. saying that “Debian considered using Git as part of the 3.0 (git) format”, but that there were issues with upstream having “non-free or undistributable material in their repositories”. “Tarballs can be repacked, but it’s harder to rewrite Git history to exclude objects.”
Who are you and what do you do?
I’m a software developer and MSc student, living in São Paulo, Brazil. I started contributing to Git just over a year ago, through Google Summer of Code (GSoC). During the program, I was fascinated with the internal mechanics of Git and with Git features that I had not known until then. So, at the end of GSoC, I definitely wanted to stick around and learn more. Today, I’m working as a contract developer for Amazon, seeking to parallelize checkout and other Git operations.
What would you name your most important contribution to Git?
The improvements in git grep’s parallelism, during GSoC. With these changes, we’ve got up to 3.3x faster git grep searches in the object store, using threads.
What are you doing on the Git project these days, and why?
I’m currently working on parallelizing checkout. The parallel version has proven to be particularly effective for repositories located on SSDs or over network file systems (here are some benchmark numbers). The idea is to make it available to all commands that perform checkout: from the plumbing git read-tree to git clone, git sparse-checkout, and git checkout itself.
It’s been an exciting and challenging project! And since it contains patches and ideas from two previous parallel checkout approaches (by Duy and Jeff Hostetler), I think it really goes to show the power of collaboration and open source.
If you could get a team of expert developers to work full time on something in Git for a full year, what would it be?
Hmm, it would be nice to try to minimize the global states in the codebase. We have a fair amount of thread-unsafe operations, due to reads and writes to global variables or function-scoped static variables. Such variables do offer quite practical mechanics, but they can also hinder the process of adding multi-thread support. Furthermore, they are sometimes hard to locate, which can potentially lead to race conditions. The alternative approach, using multi-processes, usually requires writing more code for communication and synchronization (besides having an extra cost for subprocess spawning).
If you could remove something from Git without worrying about backwards compatibility, what would it be?
Hmm, nothing comes to mind right now.
What is your favorite Git-related tool/library, outside of Git itself?
I rarely use Git-related tools besides Git itself. But I’ve always wanted to try and incorporate tig blame more in my daily life. I use git blame quite a lot, to dig through the code’s history and find answers regarding specific designs or implementations. In that sense, what I find most attractive about the tig interface is that it allows one to interactively load blame for parent commits, which is quite handy. I guess I still tend to go with the plain git blame out of habit.
Various
The default branch for newly-created repositories (on GitHub) is now ‘main’. The history of the term master in Git was covered in Git Rev News Edition #65, if one is interested.
Mercurial planning to transition away from SHA-1 [LWN.net]; discovering the problem with SHA-1 was discussed in Git Rev News Edition #25, and the state of SHA-1 transition in Git (in 2018) in Edition #41.
What’s cooking on Sourcehut? September 2020 and October 2020; Sourcehut, or sr.ht, is a software forge which was covered in Git Rev News Edition #46.
Google Summer of Code 2021 has been announced with significant changes compared to previous editions. Notably, the coding hours and period will be reduced from 350 hours and 12 weeks to 175 hours and 10 weeks, and there will be 2 evaluations (instead of 3). Additionally, eligibility requirements will be relaxed, among other things allowing people participating in a variety of different licensed academic programs, not just students of accredited university programs.
How Gitlab puts gRPC in the Real World talks about GitLab internals, especially access and management of Git repos through gRPC and a software layer written in Go called Gitaly.
Light reading
Git tools and sites
bit is an experimental modernized Git CLI, written in Go, built on top of git, that provides happy defaults and other niceties, with some commands taken from git-extras. It takes some inspiration from Gitless, which was covered in Git Rev News Edition #20.
nb is a command line note-taking, bookmarking, archiving, and knowledge base application with plain-text data storage, encryption, and Git-backed versioning and syncing, written in Bash.
Cambria is a JavaScript/TypeScript library for converting JSON or YAML data between related schemas, via lenses (bidirectionally specifying a transformation). This helps version-control data exchange formats.
Codeberg is a collaboration platform and Git hosting service for free and open source software, content, and projects. It is hosted in the EU.
Git exercises is an open-source platform and service where one can learn and practice Git and discover its features, with the help of 23 exercises.
git-send-email.io by Sourcehut is a step-by-step tutorial on how to contribute to email-driven projects like the Linux kernel, PostgreSQL, or Git.
This edition of Git Rev News was curated by Christian Couder <christian.couder@gmail.com>, Jakub Narębski <jnareb@gmail.com>, Markus Jansen <mja@jansen-preisler.de> and Kaartic Sivaraam <kaartic.sivaraam@gmail.com> with help from Matheus Tavares Bernardino.