Welcome to the 62nd edition of Git Rev News, a digest of all things Git. For our goals, the archives, the way we work, and how to contribute or to subscribe, see the Git Rev News page on git.github.io.
This edition covers what happened during the month of March 2020.
Happy birthday to all of us ;-)
On April 7, Junio Hamano, the Git maintainer, sent a happy birthday message to the mailing list to celebrate that “it was today 15 years ago that Linus announced the availability of the first tarball of Git”.
Junio thanked the contributors and everyone in the ecosystem, including people from the Software Freedom Conservancy and employers of contributors.
His email ended with “Thanks all, and let’s look forward to see the next 15 years be as wonderful years for Git as the past 15 years ;-)”
A number of people replied, especially to thank Junio for his work as the maintainer of the project. Edward Thomson, who is a maintainer of libgit2, also thanked everyone on behalf of libgit2.
Let’s all also thank Junio, Linus and every contributor in the Git ecosystem!
Regression in v2.26.0-rc0 and Magit
Jean-Noël Avila reported to the mailing list that git version 2.26.0-rc0 segfaulted under Magit with auto-revert enabled.
Magit is a popular Emacs interface to Git, and the auto-revert mode lets Emacs revert files that have changed on disk when a Git command has been run outside of Emacs.
Jean-Noël had bisected the issue to a commit that improved the error message Git issues when it dies because it was asked to process a path outside the repository. That commit, though, didn’t consider the case of a bare repo, which triggered the segfault.
Jonathan Nieder replied that the bug was fixed by another commit by Emily Shaffer that had not yet made it to the master branch. He asked Junio Hamano, the Git maintainer, if the commit could be fast-tracked, and Emily if she could add a test to her commit.
Junio replied, suggesting that a few tests should be added and noting that there were a few days left before v2.26.0-rc2 to add them. The next day, though, Junio replied to himself with a patch adding the tests and asking for comments.
Jonathan Nieder reviewed Junio’s tests, added his “Reviewed-by:”, and said that they were well timed as Emily was out of office.
Junio replied to himself again discussing one test he wrote that tested that both git log -- .. and git ls-files -- .. fail when the current working directory is the .git directory. He wondered why, if “.” instead of “..” is used in the above commands, Git should behave as if the current working directory was the top-level of the working tree instead of .git, and why Magit is expecting cd .git && git ls-files .. to show the entire working tree.
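The situation Junio's test covers can be reproduced with a short script. This is only a sketch, not the test that was added to Git: the helper name ls_files_from_git_dir is made up for illustration, it assumes a git executable is on the PATH (it returns None otherwise), and it only looks at the exit status, not at the error message.

```python
import os
import shutil
import subprocess
import tempfile


def ls_files_from_git_dir(pathspec):
    """Run `git ls-files -- <pathspec>` from inside a fresh repository's .git.

    Hypothetical helper for illustration. Returns git's exit status,
    or None when no git executable can be found.
    """
    if shutil.which("git") is None:
        return None
    with tempfile.TemporaryDirectory() as top:
        # Create an empty, non-bare repository in the temporary directory.
        subprocess.run(["git", "init", "--quiet", top], check=True)
        # Run the command with .git as the current working directory,
        # which is the setup described in the thread.
        result = subprocess.run(
            ["git", "ls-files", "--", pathspec],
            cwd=os.path.join(top, ".git"),
            capture_output=True,
            text=True,
        )
        return result.returncode


status = ls_files_from_git_dir("..")
print("git ls-files -- .. exited with status", status)
```

With a Git release that contains the fix, the command is expected to exit with a non-zero status because ".." points outside the repository. The affected release candidate crashed at this point instead of reporting an error.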
Kyle Meyer replied to Junio that internally Magit’s call which triggered the bug is running git ls-files from .git to ask whether the file used to edit a commit message is actually tracked, as it makes no distinction between files in .git and in the working tree. He said that he would propose a change in Magit to improve this.
Gábor Szeder also replied to Junio’s patch, suggesting a small improvement to the tests. Junio accepted it and sent an improved patch.
The fix with Emily’s code and Junio’s tests was then merged into v2.26.0-rc2.
Who are you and what do you do?
I’ve been an open-source hacker since long before that term was coined. Back around the turn of the century I wrote “The Cathedral and the Bazaar”, which helped reinvent the movement and gained it mainstream acceptance. I’m also the author of “The Art of Unix Programming”. More recently I’ve headed the GPSD and NTPsec projects. My most Git-relevant work is reposurgeon.
You’ve recently gifted us with an article about reposurgeon in Git Rev News edition 60. Is there something you would like to add about reposurgeon, its history, or the article?
Probably the most interesting additional thing I can say is that I discovered a fundamental strategy for designing good DSLs (Domain-Specific Languages) while working on reposurgeon.
Here it is: Whatever domain you’re trying to capture, first develop a way to do that capture in a declarative markup. Then write an editor for that markup, and you will implicitly have a DSL that spans the domain. It’s rather like the mathematical concept of a functor.
In reposurgeon’s case, the domain is Git repositories and the declarative markup is fast-import streams. Reposurgeon is all about exploiting that equivalence.
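To make the "declarative markup plus editor" idea concrete, here is a toy sketch in Python. It is not how reposurgeon is implemented: the function rewrite_committer, the sample stream, and the names and e-mail addresses in it are all invented for illustration. The sketch treats a fast-import stream as markup and applies a single edit to it, replacing the committer identity, while copying counted "data" payloads through untouched. It only handles the "data <count>" form, not the delimited "data <<EOF" form.

```python
import re

# A tiny hand-written fast-import stream: one blob and one commit adding it.
# Names, e-mail addresses and the timestamp are made up for illustration.
STREAM = (
    b"blob\n"
    b"mark :1\n"
    b"data 6\n"
    b"hello\n"
    b"commit refs/heads/master\n"
    b"mark :2\n"
    b"committer Old Name <old@example.com> 1586217600 +0000\n"
    b"data 8\n"
    b"initial\n"
    b"M 100644 :1 hello.txt\n"
    b"\n"
)


def rewrite_committer(stream: bytes, new_ident: bytes) -> bytes:
    """Return a copy of the stream with every committer identity replaced.

    Payloads announced by a `data <count>` line are copied by byte count,
    so file contents or commit messages that merely look like commands
    are never modified.
    """
    out = []
    pos = 0
    while pos < len(stream):
        end = stream.index(b"\n", pos) + 1
        line = stream[pos:end]
        pos = end
        counted = re.fullmatch(rb"data (\d+)\n", line)
        if counted:
            size = int(counted.group(1))
            out.append(line)
            out.append(stream[pos:pos + size])  # raw payload, copied as-is
            pos += size
            continue
        committer = re.fullmatch(rb"committer .*> (\d+ [+-]\d{4})\n", line)
        if committer:
            # Keep the original timestamp and timezone, swap the identity.
            line = b"committer " + new_ident + b" " + committer.group(1) + b"\n"
        out.append(line)
    return b"".join(out)


edited = rewrite_committer(STREAM, b"New Name <new@example.com>")
print(edited.decode())
```

Because the result is still a fast-import stream, it could in principle be fed to git fast-import in an empty repository to materialize the edited history. An editor with many such operations over the stream is, in effect, a small language for rewriting repositories.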
You used reposurgeon to migrate the GCC repository, which has about 280K commits, from SVN to Git. Can you tell us a bit more about the context in which such migrations happen? For example, what are the timelines, goals, tools, people, etc. involved in such migrations?
I’ve already written about this in some detail here. I’m going to be reworking that over the next couple of weeks based in part on the GCC experience and in part to reflect the unfortunate fact that Mercurial isn’t really a contender any more.
One thing that has been on my mind recently is the importance of having a “Mr. Inside”. The ideal team to do a major conversion pairs a reposurgeon expert (Mr. Outside) with a project member who knows the history of the repository intimately, can make policy decisions, and is willing to learn enough about reposurgeon to edit the recipe.
On GCC we had the ideal situation - the project lead chose to be Mr. Inside. And a good thing, too - this conversion was difficult.
Can you describe how Git is related to reposurgeon, other software you have been working on and your work in general?
Reposurgeon depends on the leverage offered by git fast-import streams. It uses them as an interchange format between different version-control streams. Conversely, Reposurgeon (and a front-end I also maintain, cvs-fast-export) enables higher-quality migrations from older VCSes than anyone has ever been able to do before.
As for how it relates to my other work, there is cvs-fast-export. I didn’t write the crucial analysis parts of cvs-fast-export myself; Keith Packard of X fame did that, and it used to be called parsecvs. I did rescue it and give it the ability to emit a fast-import stream, because reposurgeon needed a better CVS importer than I could find elsewhere.
More generally, I love writing DSLs and will take pretty much any opportunity to do that. I have especially enjoyed inventing and working on reposurgeon because - well, I used to be a mathematician. I like working on things where graph theory and abstract algebra are important. Reposurgeon scratches that itch.
What is something about Git or its ecosystem that you admire?
How freaking comprehensive it is. Pretty much anything you can imagine wanting to do with a version-control history there is tool support for somewhere.
If you could get a team of expert developers to work full time on something in Git for a full year, what would it be?
That’s easy. I’d port it to Go and toss out the C code. C still has its uses, but for anything without hard latency requirements C is now obsolete. Not something I would have said even three years ago, but times change.
If you could remove something from Git without worrying about backwards compatibility, what would it be?
There’s a weird feature I’ve forgotten the name of where you can pass around binary pack files rather than MRs or patches. Ah, git bundles, that’s it. Having worked with it once, I hate the lack of transparency, that you can’t easily eyeball a bundle before you apply it. I’d shoot that feature through the head in a heartbeat.
If I pick one thing to fix rather than remove without worrying about backwards compatibility, it would be git-cvsimport. That thing is very badly broken; the engine it uses misplaces branch joins. The Git devs know it’s broken but have stuck to it because of an incremental-conversion feature that I think is effectively useless. They should scrap it and rewrite the CVS import procedure to use cvs-fast-export instead.
What is your favorite Git-related tool/library, outside of Git itself?
You mean other than the ones I’ve written myself? I don’t think I have one, sorry.
Various
Light reading
Git tools and sites
This edition of Git Rev News was curated by Christian Couder <christian.couder@gmail.com>, Jakub Narębski <jnareb@gmail.com>, Markus Jansen <mja@jansen-preisler.de> and Kaartic Sivaraam <kaartic.sivaraam@gmail.com> with help from Eric S. Raymond, Junio Hamano and Philip Oakley.