Keeping reviewers happy when developing larger changes

Apr 5, 2019 · 7-minute read software-engineering

The challenge

When developing a large change to a project, in a team that uses code review and frequent deployments, there is a conflict that often comes up:

There is uncertainty at the outset about what exactly will need to change and the best approach to take. This tends to demand being highly iterative while working on the code. You’ll find unforeseen complexities part-way through. You’ll figure out a better solution once you’ve got something working. You’ll make a mistake or a typo and need to make “fixup” commits to correct it.
At the end you want to present a result at the end which is easy to review, minimizes deployment risks, and provides a clear record of the rationale for past decisions for your future self and other team members. This demands delivering changes in small, clearly organized increments, as if you knew exactly what you needed to do beforehand and never made mistakes.

A development strategy

In this post I’m going to describe the workflow that I use to try and meet both these needs. The strategy is to optimize for being able to make changes quickly and easily at the start and get feedback that might result in significant changes early, yet makes the review and deployment steps easy later.

During the part of the process that comes before someone starts reviewing code changes in detail, this makes heavy use of git rebase --interactive, so it is a good idea to first get familiar with how to use it to re-organize, combine and split commits.

My workflow

Have a think at the outset about what will likely need to change and talk through the design with others. Time spent here can save a lot of time later.
Create a branch and work through the pieces based on initial expectations. Try to create independent commits for logical functional changes. The main focus though at this stage is on discovering what will need to change and experimenting with different approaches. Inevitably during this process unforeseen issues come up, mistakes get made and so on. Expect a certain messiness to the result.
Once you’ve got everything working and the high-level approach seems satisfactory, this can be a good point to run a draft of the code, or at least the high-level design, by someone else. This helps catch both code design and functional issues early.
Take a break. Don’t skip this step!
Review the code with fresh eyes, and try to identify subsets of the overall change which don’t have dependencies on other parts, as well as fixes to earlier commits which can be folded into the original commit.
Use git rebase --interactive {branch you started from} to reorder, combine and split commits. The goal is to make it so that:
- No commit has a dependency on a later commit in the sequence. In other words, the sequence of changes forms a logical “story” and ideally, the tests pass after each commit
- You maximize the number of independent changes which could be merged on their own without breaking anything. It is often helpful if commits that have no dependencies occur near the start of the sequence.
- There are no fixes to earlier commits in the final sequence. In other words, it looks as if you created the final commits with perfect foresight and without making any mistakes along the way 😉
  
  Remember that you can run a rebase as many times as you like, so don’t feel a need to make all the changes during one run of git rebase --interactive. In fact I tend to do things in a super-iterative style, eg. maybe combining just a couple of commits or all changes to a particular file in one run of git rebase.
Do another pass over the commits using git rebase, this time focusing on editing the commit messages. The goal is to make sure that the purpose of each commit is clear from the commit message, and the rationale for any changes is clear, as well as any subtleties that may not be obvious. Note that the commit message doesn’t need to extensively repeat material covered elsewhere (eg. in issue descriptions), it can just link to that material. See this post for further guidance on writing good commit messages.

At this point, you should have a tidy sequence of changes with many that could, if you wanted to, be submitted as separate pull requests. As well as making life easier for other contributors to the codebase in future with a clean and well documented history, you have also already made reviewing the overall change easier by breaking it into more easily digestible pieces.

At this point, what happens next will depend on the overall size of the change, the complexity of the code being changed, the level of deployment risk involved and the preferences of the people involved in reviewing the code. It may be fine to submit everything as one PR, or it may need to be split and deployed in smaller chunks.

Here are some rough rules of thumb that I use for a change of “average” cognitive complexity and deployment risk:

If the overall change is under 300 lines of code, it is probably fine in one PR.
If the change is 300-600 lines of code, consider breaking it into multiple PRs.
If the change is more than 600 lines of code, definitely consider breaking it into multiple PRs.

If the change is trivial (eg. automated change by a trusted tool) or complex, or the deployment risk is high, adjust these accordingly. Also, if you know who will be reviewing the change, do ask about their preferences.

If it is deemed better to split up the work into smaller PRs, you’ve now made the task much easier. For each subsequence of commits that you want to submit separately, create a new branch and use git rebase to remove all commits except the ones you want to keep. Run the build and tests and review the change again before submitting a new PR. I must emphasize the review step. I sometimes find it even helps just to look at the code in a different viewer with a different color scheme (eg. GitHub’s PR review vs my local Vim setup).

When submitting each PR, make it clear whether that PR can be merged on its own, or whether it has dependencies on other PRs that must be merged first. Typically you want to maximize the number of PRs that can be independently merged and deployed.

If you have discussed the overall design with your peer before submitting the code, this will tend to avoid being asked to make large structural changes during the review process. Keeping PRs small also limits the size of any changes that you’ll end up making. For any changes you are asked to make as a result of review, it is typically more convenient for the reviewer if they are pushed as additional commits on the branch so they can be inspected separately. A final cleanup rebase can then happen before merging. If you have good tools for showing the diff after a force-push, you might be able to cleanup as you go.

Summary

To sum up:

Learn to use git rebase --interactive to re-order, combine and split commits
Focus on being iterative and getting critical feedback early in the process
When the code works and you’re happy with the approach, take stock of what changed and the dependencies between different parts of the overall change
Re-organize and edit the commits with a focus on minimizing inter-dependencies and providing a clean “story” that makes review easy and provides clear context for future decision makers
Break the changes into an appropriate number of chunks for review and deployment. This will depend on factors such as change complexity, deployment risk, and team members’ preferences

The challenge

A development strategy

My workflow

Summary

Further reading