How to Split Commits and Remove Unwanted Files
Did you perhaps commit that log file or binary by accident? No worries! In this post I'll provide you with three ways to remove unwanted files from a series of commits.
As a developer you're occasionally faced with the need to undo (or rather redo) a series of commits – just like the original poster of this Stack Overflow question (with over 10M views). If you've ever accidentally committed unwanted files, such as logs, binaries, or libraries as part of a series of commits, you know what I'm talking about.
No one wants to bloat the repository by adding unintentional commit noise, ultimately causing confusion among your team. Whenever this happens, removing these unwanted files from each commit is necessary – but how can it be done?
In this post, I'll go through three different ways to remove unwanted files from a series of commits using reset, rebase, and filter-branch.
Start case
Before we look at the different options, let's contextualize the problem with a concrete example. Below is a short linear history consisting of three commits, where the latter two (C1 & C2) contain changes to a log file (tmp.log) that has been accidentally committed.
Notice that neither of the two commits (C1 & C2) has been pushed to a remote; both exist only in the local master branch, allowing us to rewrite history as we please. Now, let's see how the accidentally committed file tmp.log can be removed from the history and C1 & C2 redone.
If the whole concept of rewriting history sounds strange to you, make sure to first revisit this post on Immutable Snapshots - One of Git's Core Concepts to get your bearings.
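To follow along, the start case can be recreated in a throwaway repository. The script below is only a sketch under assumptions: the tracked file app.txt, the commit messages, and the committer identity are hypothetical stand-ins, and only tmp.log and the C1/C2 naming come from the example above.

```shell
#!/bin/sh
set -e

# Work in a temporary directory so no real repository is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the unborn branch master
git config user.name "Example"            # hypothetical identity
git config user.email "example@example.com"

# First commit: a clean starting point (app.txt is a made-up file).
echo "v0" > app.txt
git add app.txt
git commit -q -m "C0"

# C1: a wanted change plus the accidentally committed log file.
echo "v1" >> app.txt
echo "log entry 1" >> tmp.log
git add app.txt tmp.log
git commit -q -m "C1"

# C2: another wanted change, and tmp.log changes again.
echo "v2" >> app.txt
echo "log entry 2" >> tmp.log
git add app.txt tmp.log
git commit -q -m "C2"

# Show which commits touched which files.
git log --oneline --stat
```

Running it leaves three local commits on master, the last two of which include tmp.log, which mirrors the situation described above.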
Redoing a series of commits
There are several ways to "undo" or "redo" a series of commits, depending on the outcome you're after. Considering the start case above, reset, rebase and filter-branch can all be used to rewrite your history.
Alternative 1: reset
With reset, a branch can be moved back to a previous state, with the accumulated changes returned to the Staging Area, from where any unwanted changes can then be discarded. The illustration below shows what undoing the changes from our start case looks like using a "soft reset":
$ git reset --soft t56pi
Following the soft reset, any unwanted changes can be removed using restore, and a new commit can then be created containing only the desired changes.
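The whole sequence can be sketched end to end as follows. This is an illustration under assumptions rather than the exact commands behind the figures: it rebuilds the start case with a hypothetical app.txt, addresses the target commit as HEAD~2 instead of by its ID, and uses a made-up message for the new commit. It also assumes a Git version that ships the restore command (2.23 or later).

```shell
#!/bin/sh
set -e

# Rebuild the start case in a throwaway repository (names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

echo "v0" > app.txt
git add app.txt
git commit -q -m "C0"
for i in 1 2; do
  echo "v$i" >> app.txt
  echo "log entry $i" >> tmp.log
  git add app.txt tmp.log
  git commit -q -m "C$i"
done

# Move master back two commits; the changes from C1 + C2 stay staged.
git reset --soft HEAD~2

# Take tmp.log out of the Staging Area; the file itself stays on disk.
git restore --staged tmp.log

# Commit only what is still staged, i.e. the wanted changes.
git commit -q -m "C1 + C2 without tmp.log"

git log --oneline      # two commits remain
git status --short     # tmp.log now shows up as untracked
```

After this, tmp.log no longer appears in any commit reachable from master, although it still exists in the working directory. Adding it to .gitignore would be one way to keep it from being staged again.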
Note: As the soft reset collapses all previous changes (C1 + C2) into the Staging Area, individual commit metadata is lost, e.g. the separate commit messages and the boundary between each commit's changes. If this is not OK with you, you're probably better off with rebase or filter-branch instead.