This is the idea page for Summer of Code 2022 for Git.

Please read the general application information page in full before reading the idea list below.

Summer of Code main project ideas

Students: Please consider these ideas as starting points for generating proposals. We are also more than happy to receive proposals for other ideas related to Git. Make sure you have read the “Note about refactoring projects versus projects that implement new features” in the general application information page though.

More Sparse Index Integrations

The sparse index feature accelerates Git commands when using sparse-checkout in cone mode. It works by modifying the on-disk index file so that it includes “sparse directory” entries instead of only file entries. Enabling the sparse index for a given command requires care, as custom logic might be necessary. At minimum, interaction with the sparse index needs to be carefully tested in the Git test suite when enabling it.
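To make the idea of “sparse directory” entries concrete, here is a toy sketch in C. It is not Git's implementation: the function `sparse_entry_count`, the array `sample_paths`, and the simplified rule (keep top-level files and everything under a single cone directory, collapse every other top-level directory into one entry) are assumptions made purely for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sample of a full index: one entry per file, sorted by path. */
static const char *sample_paths[] = {
    "README", "a/x.c", "a/y.c", "b/1.c", "b/2.c", "b/sub/3.c",
};
#define SAMPLE_COUNT (sizeof(sample_paths) / sizeof(sample_paths[0]))

/*
 * Toy model, not Git's algorithm: count the entries a sparse index
 * would hold if top-level files and everything under `cone` (a prefix
 * such as "a/") stay as file entries, while each other top-level
 * directory collapses into a single "sparse directory" entry.
 * `paths` must be sorted so that a directory's files are adjacent.
 */
static size_t sparse_entry_count(const char **paths, size_t n, const char *cone)
{
    size_t count = 0, i;
    size_t cone_len = strlen(cone);
    const char *last_dir = NULL; /* path whose top-level directory was last collapsed */
    size_t last_len = 0;         /* length of that directory name, including '/' */

    for (i = 0; i < n; i++) {
        const char *slash = strchr(paths[i], '/');
        size_t dir_len;

        /* Top-level files and paths inside the cone keep their own entry. */
        if (!slash || !strncmp(paths[i], cone, cone_len)) {
            count++;
            last_dir = NULL;
            continue;
        }

        /* Outside the cone: one entry per top-level directory. */
        dir_len = (size_t)(slash - paths[i]) + 1;
        if (last_dir && dir_len == last_len &&
            !strncmp(paths[i], last_dir, dir_len))
            continue; /* already represented by its sparse directory entry */

        count++;
        last_dir = paths[i];
        last_len = dir_len;
    }
    return count;
}
```

With the sample paths and the cone "a/", the six file entries of the full index shrink to four entries in this model: README, a/x.c, a/y.c, and a single sparse directory entry for b/. Commands that assume every index entry is a file are the ones that need the custom logic mentioned above.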

The most-used commands have already been integrated with the sparse index feature. This process usually takes a few steps:

  1. Add tests to t1092-sparse-checkout-compatibility.sh for the builtin, with a focus on what happens for paths outside of the sparse-checkout cone.
  2. Disable the command_requires_full_index setting in the builtin and ensure the tests pass.
  3. If the tests do not pass, then alter the logic to work with the sparse index.
  4. Add tests to check that a sparse index stays sparse.
  5. Add performance tests to demonstrate speedup.

Here is a list of builtins that could be integrated with the sparse index. They are generally organized in order of least-difficult to most-difficult. This allows the student to achieve partial success early in the project and then complete as many integrations as possible in the timeframe (without expectation that all will be completed during the project).

Expected Project Size: 175 hours or 350 hours

Difficulty: Medium

Languages: C, shell (bash)

Possible mentors:

Unify ref-filter formats with other pretty formats

Git has a long-standing problem of duplicated implementations of some logic. For example, Git had at least 4 different implementations to format command output for different commands.

Our previous GSoC students and Outreachy interns unified some of the formatting logic into ref-filter and got rid of similar logic in some command-specific files. The current task is to continue this work and reuse ref-filter formatting logic in pretty.

See:

Expected Project Size: 175 hours or 350 hours

Difficulty: Medium

Languages: C, shell (bash)

Possible mentors:

Reachability bitmap improvements

Reachability bitmaps allow Git to quickly answer queries about which objects are reachable from a given commit. Instead of traversing a commit's parents and its root tree recursively, we can use a precomputed set of objects encoded as a bit-string and stored in a .bitmap file to answer the query nearly instantaneously.
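The contrast between walking the object graph and consulting a precomputed bit-string can be sketched with a toy example in C. This does not reflect Git's .bitmap file format or its code: the six-object graph `edges` and the functions `walk_reachable` and `bitmap_contains` are hypothetical names chosen for illustration, and a single 64-bit word stands in for the stored bitmap.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_OBJECTS 6

/*
 * Hypothetical object graph: edges[i] is a bitmask of the objects
 * that object i points to directly.
 */
static const uint64_t edges[NUM_OBJECTS] = {
    0,                                     /* 0: blob */
    0,                                     /* 1: blob */
    UINT64_C(1) << 0,                      /* 2: tree -> blob 0 */
    (UINT64_C(1) << 0) | (UINT64_C(1) << 1), /* 3: tree -> blobs 0 and 1 */
    UINT64_C(1) << 2,                      /* 4: commit A -> tree 2 */
    (UINT64_C(1) << 3) | (UINT64_C(1) << 4), /* 5: commit B -> tree 3, parent A */
};

/*
 * Slow path: traverse the graph from `start` and return the set of
 * reachable objects (including `start`) as a bitmask. In this toy
 * model, the result is what would be precomputed and stored.
 */
static uint64_t walk_reachable(int start)
{
    uint64_t seen = 0;
    uint64_t todo;
    int i;

    assert(start >= 0 && start < NUM_OBJECTS);
    todo = UINT64_C(1) << start;

    while (todo) {
        for (i = 0; i < NUM_OBJECTS; i++) {
            uint64_t bit = UINT64_C(1) << i;

            if (!(todo & bit))
                continue;
            todo &= ~bit;
            if (seen & bit)
                continue;
            seen |= bit;
            todo |= edges[i] & ~seen; /* queue unvisited children */
        }
    }
    return seen;
}

/* Fast path: with a stored bitmap, a reachability query is one bit test. */
static int bitmap_contains(uint64_t bitmap, int object)
{
    assert(object >= 0 && object < NUM_OBJECTS);
    return (int)((bitmap >> object) & 1);
}
```

In this sketch, computing `walk_reachable(4)` once and keeping the result lets later queries such as `bitmap_contains(bitmap, 1)` run without touching the graph again. Set operations between commits (for example, objects reachable from one commit but not another) likewise reduce to bitwise AND, OR, and NOT on the stored words.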

There are a couple of areas where bitmap performance itself could be improved:

This project will give GSoC students a broad overview of reachability bitmaps, with the goal of improving their performance in some way or another. Students can expect hands-on mentorship, but will have the agency to pick one or more of the above sub-projects (or create their own!) that interest them most.

Expected Project Size: 175 hours or 350 hours

Difficulty: Medium

Languages: C, shell

Possible mentors: