Welcome to the 53rd edition of Git Rev News, a digest of all things Git. For our goals, the archives, the way we work, and how to contribute or to subscribe, see the Git Rev News page on git.github.io.
This edition covers what happened during the month of June 2019.
Easy bulk commit creation in tests
Derrick Stolee, who prefers to be called Stolee, has been sending “Git Test Coverage Report” emails to the mailing list since September 2018.
These emails contain reports made using contrib/coverage-diff.sh to combine the gcov output from make coverage-test; make coverage-report with the output from git diff A B to discover new lines of code that are not covered by the test suite.
In June 2019 Stolee sent a new report which stated that some new lines of code contributed by Jeff King, alias Peff, were not covered by any test. Peff replied that the test exercising the surrounding code hits an early return, located above the untested lines, whenever there are fewer than 100 commits to index.
One solution to that problem could be to create 100 commits, instead of just 10, in the test, though that would slow down the test. Peff then concluded:
It would be nice if we had a “test_commits_bulk” that used fast-import to create larger numbers of commits.
A few hours later Peff replied to himself by sending a six-patch series which implements the test_commit_bulk function he had suggested.
In this patch series Peff converted a few places in the code to use this function and reported that it shaves around 7.5 seconds off his test runs. He commented: “Not ground-breaking, but I think it’s nice to have a solution where we don’t have to be afraid to generate a bunch of commits.”
The comment above the function explains it like this:
# Similar to test_commit, but efficiently create <nr> commits, each with a
# unique number $n (from 1 to <nr> by default) in the commit message.
#
# Usage: test_commit_bulk [options] <nr>
#   -C <dir>:
#       Run all git commands in directory <dir>
#   --ref=<n>:
#       ref on which to create commits (default: HEAD)
#   --start=<n>:
#       number commit messages from <n> (default: 1)
#   --message=<msg>:
#       use <msg> as the commit message (default: "commit $n")
#   --filename=<fn>:
#       modify <fn> in each commit (default: "$n.t")
#   --contents=<string>:
#       place <string> in each file (default: "content $n")
#   --id=<string>:
#       shorthand to use <string> and $n in message, filename, and contents
#
# The message, filename, and contents strings are evaluated by the shell inside
# double-quotes, with $n set to the current commit number. So you can do:
#
#   test_commit_bulk --filename=file --contents='modification $n'
#
# to have every commit touch the same file, but with unique content. Spaces are
# OK, but you must escape any metacharacters (like backslashes or
# double-quotes) you do not want expanded.
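The idea behind the function, feeding one generated stream to a single git fast-import process instead of running several Git commands per commit, can be sketched roughly as follows. This is only an illustration, not Peff's actual implementation: gen_bulk_stream is a hypothetical helper, and the ref name, committer identity and timestamps are placeholder assumptions.

```shell
#!/bin/sh
# Hypothetical helper (not the function from Git's test suite): print a
# fast-import stream that creates <nr> commits on refs/heads/bulk-demo.
gen_bulk_stream () {
    nr=$1
    n=1
    while [ "$n" -le "$nr" ]
    do
        # Later commits in the same stream build on the previous one,
        # so no explicit parent is needed here.
        printf 'commit refs/heads/bulk-demo\n'
        # Placeholder identity and timestamp (assumptions for this sketch).
        printf 'committer A U Thor <author@example.com> %s +0000\n' \
            "$((1112911993 + n))"
        # Commit message, given as delimiter-terminated data.
        printf 'data <<MSG\ncommit %s\nMSG\n\n' "$n"
        # Each commit adds one file, "$n.t", containing "content $n".
        printf 'M 644 inline %s.t\n' "$n"
        printf 'data <<CONTENT\ncontent %s\nCONTENT\n\n' "$n"
        n=$((n + 1))
    done
}
```

In a scratch repository, something like gen_bulk_stream 100 | git fast-import would then create the commits with a single Git process, which is where the time savings come from.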
Johannes Schindelin replied to Peff that he likes the direction because “it would make it super easy to go one step further that would probably make a huge difference on Windows: to move test_commit_bulk to test-tool commit-bulk”.
Peff replied that “in the biggest case we dropped 900 processes to 4”, so he did not think that converting test_commit_bulk to C code and integrating it as test-tool commit-bulk would make a big difference.
Peff nevertheless suggested three different ways to get down to a single process. One of them was to add a feature to fast-import to say “build on top of ref X”. Elijah Newren replied to Peff that such a feature already exists, using something like from refs/heads/branch^0.
Peff thanked Elijah and used the feature in a patch he attached, which further reduces the number of processes used.
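As a hedged illustration of the feature Elijah pointed out, a stream can ask fast-import to continue from the current value of an existing branch by adding a from line with the ^0 suffix. The helper name append_commit_stream and the committer identity below are assumptions made for this sketch; only the from refs/heads/branch^0 form comes from the discussion.

```shell
#!/bin/sh
# Hypothetical helper: print a fast-import stream for one commit that
# builds on top of the existing tip of <branch>.
append_commit_stream () {
    branch=$1
    msg=$2
    printf 'commit refs/heads/%s\n' "$branch"
    # Placeholder identity and timestamp (assumptions for this sketch).
    printf 'committer A U Thor <author@example.com> 1112912000 +0000\n'
    printf 'data <<MSG\n%s\nMSG\n\n' "$msg"
    # The "^0" suffix makes fast-import look up the ref's current value
    # in the repository, because a branch may not start from itself.
    printf 'from refs/heads/%s^0\n\n' "$branch"
}
```

Piping the output of append_commit_stream topic 'one more' into git fast-import, in a repository where refs/heads/topic already exists, should add a commit on top of that branch without first running a separate command to find its tip.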
Ævar Arnfjörð Bjarmason also replied to Peff wondering if just having a few “template” repositories, that could just be copied in many tests, would be a better approach.
Peff replied to Ævar that just “seeing the end result of running a bunch of commands” is less instructive than following “the steps that the author was thinking about”, and that “it’s more annoying to update”. Peff would, however, find it cool if we could “allow caching of the on-disk state at certain points in a test script”, for example by annotating some test snippets as “SETUP”, like a prerequisite.
Stolee also replied to Peff, congratulating him on the quick turnaround, providing results from performance tests on Windows which were similar to those provided by Peff, and suggesting possible improvements to the test_commit_bulk function. These suggestions were then discussed by Junio Hamano, the Git maintainer, and Peff.
Junio, Ævar, Eric Sunshine and Gábor Szeder also discussed with Peff some aspects of the implementation and documentation of the new feature.
Peff then sent a version 2 of the patch that implements test_commit_bulk with many of the improvements that had been discussed. Junio discussed the implementation a bit further, but seemed happy with the patch series.
Peff recently sent a version 3, which will hopefully be merged soon.
Various
Light reading
My Personal Git Tricks Cheatsheet by Antonin Januska, with some improvements based on feedback in comments
Manage your dev.to blog posts from a Git repo and use continuous deployment to auto publish/update them by Maxime (@maxime1992)
Object Oriented Programming in C: A Case Study - Git and Linux Kernel: slides by Matheus Tavares and Renato Lui Geh (PDF)
C# or Java? TypeScript or JavaScript? Machine learning based classification of programming languages by Kavita Ganesan on GitHub Blog (OctoLingua project)
Git tools and sites
CallPath, by Matheus Tavares, “lets you see all the paths your code take until a specified function”. Matheus is a Google Summer of Code student currently working on making pack access code thread-safe.
diffr, by Nathan Moreau, is a new diff highlighting tool “trying to improve on the diff-highlight script distributed with Git”.
This edition of Git Rev News was curated by Christian Couder <christian.couder@gmail.com>, Jakub Narębski <jnareb@gmail.com>, Markus Jansen <mja@jansen-preisler.de> and Gabriel Alcaras <gabriel.alcaras@telecom-paristech.fr> with help from Jaime Rivas and Carlo Marcelo Arenas Belón.