How I Stopped Worrying And Learned To Love `git rebase`

N.B. Some of this revolves around puppet, but if you know nothing about puppet, you won’t miss anything.

I’ve been doing a bunch of stuff in puppet lately. Since my actual puppet runs are inside VMs which I nuke fairly frequently, doing the actual bits of development in the VM is impractical (and somewhat dangerous even).

I use git to do my normal version control anyway, so that’s nothing earth shattering, but I also use git to update /etc/puppet in the VM from what’s on my laptop because it works really well.

Since I do that, the number of commits I generate can get large, as some commits just add a close quote or colon, or what have you. Not only that, but I often have a few commits that I never want to actually get merged at the end of the day because they’re just there to set the node configuration, or tweak things that only apply because I’m fiddling in a dev VM, and not a real environment.

How to manage all of this?

I have a script that I copy to my VMs after I bring them up the first time which I call plunk. The script is really just a one-liner of:

git fetch origin && git reset --hard origin/$1 && puppet agent -t -v

Basically, it syncs /etc/puppet to the contents of my laptop’s puppet repo.
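Fleshed out slightly, the one-liner could look like the sketch below. The function wrapper and the usage check are assumptions added for illustration and are not part of the original script; it also assumes /etc/puppet in the VM is a clone whose origin is the laptop's repo.

```shell
#!/bin/sh
# plunk: make the current checkout match origin/<branch>, then run puppet.
# Sketch only: the function wrapper and usage check are illustrative additions.
plunk() {
    if [ -z "$1" ]; then
        echo "usage: plunk <branch>" >&2
        return 2
    fi
    # Discard any local state, match origin/<branch> exactly, then apply it.
    git fetch origin &&
        git reset --hard "origin/$1" &&
        puppet agent -t -v
}
```

The `git reset --hard` means anything edited directly in the VM is thrown away on every run, which fits the workflow here: the laptop repo is the only place changes are made.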

So after the VM comes up, I scp that to the VM, log in and do:

sudo bash
cd /etc/puppet
~drew/plunk my_local_git_dev_branch

(Yes, I sudo bash which is a bad practice, but it’s in a disposable test VM that never lives for more than a day and is not accessible outside my laptop. I never do this on a real box though).

Puppet will do its thing, and I’ll find something that’s not right or just needs tweaking. So I’ll edit the relevant bits on my laptop, and do what I call a ‘derp commit’ like this:

git commit -am 'derp'

When I’m iterating quickly, the commit messages are useless and of no lasting value, and as you’ll see later, they get removed anyway. While I’m in this tight loop, I’ve got two tabs in a shell window, one logged into the VM and the other at the top of my local puppet git repo.

Then after the commit is done, I run this in the tab that’s logged into the VM:

~drew/plunk my_local_git_dev_branch

Then lather, rinse, repeat for a while. Now I may have to make some changes that I don’t ever actually want merged into master at the end of all of this. It’s things like the “cluster” settings so that it connects to a zookeeper instance running on the VM instead of the normal dev instance, or what have you. When I need those kinds of changes, I just commit them with a message of “DO NOT MERGE” or something similar (I’ll call it DNM for short for the rest of this).

After a while of development, I wind up with a git log of:

commit 27bb61f05519de87a52e3007263481c87353b76e
Author: Drew Csillag <drew@workmarket.com>
Date:   Thu Oct 1 13:36:19 2015 -0400

    derp

commit a917bdc140d0a83af7e0706c4cb7ba9074c5e9af
Author: Drew Csillag <drew@workmarket.com>
Date:   Thu Oct 1 13:34:29 2015 -0400

    derp

commit a9bace0441dafd965762348d345221f3fa66e717
Author: Drew Csillag <drew@workmarket.com>
Date:   Thu Oct 1 13:28:19 2015 -0400

    DO NOT MERGE

commit d8040395945b69dbf56bd8ae3475c23df995b49d
Author: Drew Csillag <drew@workmarket.com>
Date:   Thu Oct 1 13:28:02 2015 -0400

    DO NOT MERGE

commit 301ab2b62cc555a94fb4feff6d78be7161d8208a
Author: Drew Csillag <drew@workmarket.com>
Date:   Thu Oct 1 13:27:28 2015 -0400

    DO NOT MERGE

commit df142f91ef771822c720c7e30afdba89847b4bd9
Author: Drew Csillag <drew@workmarket.com>
Date:   Thu Oct 1 13:23:35 2015 -0400

    DO NOT MERGE

commit 74267d7bd9e9070e072eb598851f28a55ea22e33
Author: Drew Csillag <drew@workmarket.com>
Date:   Thu Oct 1 13:20:54 2015 -0400

    derp

commit 0c5e71c80b845e4f7323bf3a50cb3fa7a72cbc04
Author: Drew Csillag <drew@workmarket.com>
Date:   Thu Oct 1 13:19:25 2015 -0400

    derp

commit c0cafd1bbdee3730f59139ddd5a4871705d63757
Author: Drew Csillag <drew@workmarket.com>
Date:   Thu Oct 1 13:19:01 2015 -0400

    derp

commit ec09446f4eb0536ed1b97906b6f74f1ffad14763
Author: Drew Csillag <drew@workmarket.com>
Date:   Thu Oct 1 13:18:39 2015 -0400

    derp

commit 673a850ec602e7debdaca2e44b2ba2bf316ae36f
Author: Drew Csillag <drew@workmarket.com>
Date:   Thu Oct 1 13:16:39 2015 -0400

    derp

commit b6ced1c44f359047dd9668fa74babf848c3e2acd
Author: Drew Csillag <drew@workmarket.com>
Date:   Thu Oct 1 13:15:54 2015 -0400

    derp

commit ac6b27eadc700195ba6b1cf0546256a0ea0ac94f
Author: Drew Csillag <drew@workmarket.com>
Date:   Thu Oct 1 13:14:45 2015 -0400

    derp

commit 8d626eb246778e4419fca48cf16acc5267b64bcd
Author: Drew Csillag <drew@workmarket.com>
Date:   Thu Oct 1 13:12:24 2015 -0400

    derp

commit e2f48d81f1d21f95ba935df30cd57bc5e8271632
Author: Drew Csillag <drew@workmarket.com>
Date:   Thu Oct 1 13:08:01 2015 -0400

    derpy

commit 433ad67b5dc37d70110a714e02af928e23e1fd6f
Author: Drew Csillag <drew@workmarket.com>
Date:   Fri Sep 25 16:48:33 2015 -0400

    Added smartstack nerve/synapse bits for kafka and graphite

commit 82493a570de100e8eeb989eb6ccc9d7b65d78b4c
...

After some period of time, I’ll decide to clean up the commits, so I’ll start the rebase. In particular, I’ll get the commit id of the commit just before the oldest one I want to touch (that is, its parent). In this case it would be 82493a570de100e8eeb989eb6ccc9d7b65d78b4c.

git rebase -i 82493a

which will bring up an editor with this: [Screenshot: editor with commit mess]

From there I’ll basically sort them with “derps” first, then the “DO NOT MERGE”, maintaining their original order within those two categories:

[Screenshot: sorted commits]

Then I’ll change pick to f (fixup) on the derp commits to squash them into the “main” commit. Then I’ll do the same for all of the DO NOT MERGE commits except the first one, so that they collapse into a single DNM commit.

[Screenshot: fixup commits]

From there, save and exit, and you have the following git log:

commit ad89f94b7c861d3834b653f4e541ff041a69cc8f
Author: Drew Csillag <drew@workmarket.com>
Date:   Thu Oct 1 13:09:57 2015 -0400

    DO NOT MERGE THIS

commit 6d1139e39572f16e1a6ec4e88be4ab5808fca01c
Author: Drew Csillag <drew@workmarket.com>
Date:   Fri Sep 25 16:48:33 2015 -0400

    Added smartstack nerve/synapse bits for kafka and graphite

Nice and clean.
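The sort-and-fixup steps above can be tried without touching a real repo. The sketch below builds a throwaway repository and uses GIT_SEQUENCE_EDITOR to stand in for the hand editing of the todo list. The file names, the `commit` helper and the `sort-todo.sh` script are all hypothetical; in real use you would just edit the list in your editor.

```shell
#!/bin/sh
set -e

# Throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

# Hypothetical helper: append a line to a file and commit it.
commit() {
    echo "$1" >> "$2"
    git add "$2"
    git commit -q -m "$3"
}

commit base   README      'initial commit'
commit real   feature.txt 'Added feature'
commit tweak1 feature.txt 'derp'
commit vm1    local.txt   'DO NOT MERGE'
commit tweak2 feature.txt 'derp'
commit vm2    local.txt   'DO NOT MERGE'

# Stand-in for the manual edit: main commit first, then the derps as fixups,
# then the DNM commits with every one after the first turned into a fixup.
editor="$repo/.git/sort-todo.sh"
cat > "$editor" <<'EOF'
awk '
    /^pick .* derp$/         { sub(/^pick/, "fixup"); derps = derps $0 "\n"; next }
    /^pick .* DO NOT MERGE$/ { if (dnms != "") sub(/^pick/, "fixup"); dnms = dnms $0 "\n"; next }
    /^pick /                 { keep = keep $0 "\n"; next }
    END                      { printf "%s%s%s", keep, derps, dnms }
' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
EOF

GIT_SEQUENCE_EDITOR="sh $editor" git rebase -i HEAD~5 >/dev/null 2>&1
git log --format='%s'
```

The final log lists DO NOT MERGE, Added feature and initial commit, newest first. Keeping the original order within each group matters: the derps are replayed in the order they were made, so they apply cleanly.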

Multiple Logical Commits

Now if there should be multiple logical commits, you can do it one of two ways (or even both) depending on how you work. If I want to track the commits as I go, I’ll commit with a message like ‘derp kafka thing’ or ‘derp graphite’ and then do a similar sorting process like I did above, just with two ‘derp’ categories instead of just the one.

Alternatively, I’ll split them out at the end, like so:

git rebase -i 82493a

Then I’ll change pick to edit for the commit I want to split. [Screenshot: edit commit]

Then I back up the state of the tree to the previous commit, leaving the changes in place in the working directory (a git reset HEAD^ does this). [Screenshot: reset head]

Now I create three commits, one with the kafka bits, one with graphite, and one with everything else.

git add services/s_kafka/manifests/init.pp services/s_smartstack/manifests/kafka.pp
git commit -m 'kafka bits'
git add services/s_graphite/manifests/init.pp services/s_smartstack/manifests/graphite_pickle.pp services/s_smartstack/manifests/graphite_plaintext.pp
git commit -m 'graphite bits'
git commit -am 'Added smartstack nerve/synapse base bits'

And lastly, I’ll finish up by letting the rebase continue until it finishes.

git rebase --continue

If all went well you should see:

Successfully rebased and updated refs/heads/example.
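As a self-contained sketch of the split, the script below runs the same sequence in a throwaway repository. The three file names are hypothetical stand-ins for the real manifests, and the sed one-liner in GIT_SEQUENCE_EDITOR stands in for changing pick to edit by hand.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

echo base > README
git add README
git commit -q -m 'initial commit'

# One oversized commit touching three (hypothetical) files.
echo kafka > kafka.pp
echo graphite > graphite.pp
echo common > base.pp
git add kafka.pp graphite.pp base.pp
git commit -q -m 'Added smartstack nerve/synapse bits for kafka and graphite'

# Stand-in for changing "pick" to "edit" on the first todo line by hand.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/edit/'" git rebase -i HEAD^ >/dev/null 2>&1

# The rebase is now stopped on that commit. A mixed reset moves HEAD back one
# commit but leaves the files on disk, ready to be re-committed in pieces.
git reset -q HEAD^

git add kafka.pp
git commit -q -m 'kafka bits'
git add graphite.pp
git commit -q -m 'graphite bits'
git add base.pp
git commit -q -m 'Added smartstack nerve/synapse base bits'

git rebase --continue >/dev/null 2>&1
git log --format='%s'
```

The sketch stages each file explicitly rather than relying on `git commit -a`, because files that were new in the commit being split are untracked after the reset and `-a` only picks up tracked files.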

I mess up editing commits all the time. Fortunately, there’s git rebase --abort which will restore the state of the world back to what it was before you entered interactive rebase.
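A small sketch of that safety net, again in a throwaway repository with hypothetical file contents: stop a rebase part-way, make a mess, then abort and check that the branch is back where it started.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

echo one > file.txt
git add file.txt
git commit -q -m 'first'
echo two >> file.txt
git commit -q -am 'second'
before=$(git rev-parse HEAD)

# Stop the rebase on the newest commit, then make a mess of it.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/edit/'" git rebase -i HEAD^ >/dev/null 2>&1
git reset -q HEAD^
echo oops > file.txt

# Bail out: the branch and the working tree go back to their starting state.
git rebase --abort
after=$(git rev-parse HEAD)
[ "$before" = "$after" ] && echo "back where we started"
```

Note that the abort also discards the uncommitted "oops" edit, so anything worth keeping from a botched rebase should be copied aside first.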

Reordering Commits

But now the commit list is a tad out of order. The base bits should be first and not kafka, right?

[Screenshot: out of order commits]

So we rebase again:

git rebase -i 82493a

And we see this: [Screenshot: out of order rebase]

Let’s put the base one first like this: [Screenshot: in order rebase]

And now our commit log looks like it ought to: [Screenshot: in order log]
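Reordering is just moving lines in the todo list. The sketch below does the move with a hypothetical `reorder-todo.sh` script in place of the editor, on a throwaway repository whose file names are made up for the example.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

# Hypothetical helper: append a line to a file and commit it.
commit() {
    echo "$1" >> "$2"
    git add "$2"
    git commit -q -m "$3"
}

commit base     README      'initial commit'
commit kafka    kafka.pp    'kafka bits'
commit graphite graphite.pp 'graphite bits'
commit common   base.pp     'Added smartstack nerve/synapse base bits'

# Stand-in for moving the "base bits" line to the top of the todo list.
editor="$repo/.git/reorder-todo.sh"
cat > "$editor" <<'EOF'
awk '
    /^pick .* base bits$/ { first = first $0 "\n"; next }
    /^pick /              { rest = rest $0 "\n"; next }
    END                   { printf "%s%s", first, rest }
' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
EOF

GIT_SEQUENCE_EDITOR="sh $editor" git rebase -i HEAD~3 >/dev/null 2>&1
git log --format='%s'
```

This works without conflicts here because each commit touches its own file. When reordered commits touch the same lines, the rebase will stop and ask for the conflict to be resolved.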

Time For The Pull Request

It’s time for code review. Changes will have to be made, and I’d like to preserve the DNM commits until it’s all done and the branch is merged, but I don’t want the DNM commits merged (duh).

This turns out to not be at all difficult. When you go to push your branch to be reviewed, instead of doing something like:

git push origin example

You do this instead:

git push origin HEAD^:refs/heads/example

The refs/heads/ part only needs to be done the first time, when creating the remote branch. After that you can just use:

git push origin HEAD^:example

And your reviewers won’t see the DNM commits in your pull request. [Screenshot: pull request with no DNM commits]

If you have more than one DNM commit you want to maintain, then adjust HEAD^ above as appropriate.
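To see what actually lands on the remote, here is a sketch that uses a local bare repository as a stand-in for origin. The paths and file names are hypothetical; only the push commands mirror the ones above.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
remote="$work/origin.git"
git init -q --bare "$remote"      # local stand-in for the real remote
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

echo base > README
git add README
git commit -q -m 'initial commit'

git checkout -q -b example
echo feature > feature.txt
git add feature.txt
git commit -q -m 'Added feature'
echo vm-only > local.txt
git add local.txt
git commit -q -m 'DO NOT MERGE'

git remote add origin "$remote"

# Push everything except the newest (DNM) commit. The full refs/heads/ name
# is needed because the remote branch does not exist yet.
git push -q origin HEAD^:refs/heads/example

# With two DNM commits on top, the source would be HEAD~2 instead of HEAD^.

# What the reviewers would see on the remote branch:
git -C "$remote" log --format='%s' example
```

The remote branch ends at Added feature, while the local example branch still carries the DNM commit on top.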

Now you have to be careful that what’s in the DNM commits doesn’t affect what should be merged, but that’s outside of the scope of this post.

Developing On Top

Sometimes you want to build on top of a commit like this, but without the DNM commit. It’s pretty simple from there. Just do a:

git checkout -b new_branch_name HEAD^

Then continue on with life. But if, say, one of your reviewers requests some changes that you want to make in the first pull request, a little rebasing is necessary. Assuming you’ve got the original branch checked out, let’s say you make a single commit. Then you interactively rebase to reorder the commits like we did earlier. Push again, this time with the -f flag or it won’t work.

But now we have this branch we made before the rebase. What do we do now? With the second branch checked out, we rebase it onto the old one in the proper place. Let’s say that the commit before our first commit on the new branch is a8db21. We then rebase everything after that commit onto example, skipping the DNM commit, like so. If there is more than one DNM commit on example, you may need to use more than one ^.

git rebase --onto example^ a8db21
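The whole sequence can be sketched end to end in a throwaway repository. Branch and file names other than example are hypothetical, `$fork` plays the role of a8db21, and a small awk script stands in for reordering the todo list by hand.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

# Hypothetical helper: append a line to a file and commit it.
commit() {
    echo "$1" >> "$2"
    git add "$2"
    git commit -q -m "$3"
}

commit base README 'initial commit'
git checkout -q -b example
commit feature feature.txt 'Added feature'
commit vm      local.txt   'DO NOT MERGE'

# Second branch, started from the commit below the DNM one.
git checkout -q -b new_branch HEAD^
fork=$(git rev-parse HEAD)        # plays the role of a8db21 in the text
commit next next.txt 'Follow-on work'

# Review feedback lands on example and is reordered below the DNM commit.
git checkout -q example
commit fix review.txt 'Review fixes'
editor="$repo/.git/dnm-last.sh"
cat > "$editor" <<'EOF'
awk '
    /^pick .* DO NOT MERGE$/ { dnm = dnm $0 "\n"; next }
    /^pick /                 { keep = keep $0 "\n"; next }
    END                      { printf "%s%s", keep, dnm }
' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
EOF
GIT_SEQUENCE_EDITOR="sh $editor" git rebase -i "$fork" >/dev/null 2>&1

# Replay new_branch's own commits onto example minus its DNM commit.
git checkout -q new_branch
git rebase --onto example^ "$fork" >/dev/null 2>&1
git log --format='%s'
```

Afterwards new_branch reads Follow-on work, Review fixes, Added feature, initial commit, with no DNM commit in its history, while example still ends in DO NOT MERGE.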

Summary

Being able to do things like this is really nice. It allows me to commit whenever I want and make the commit log at the end look like I want it to. Factoring in all of that and a doc that GitHub put out called “How to undo (almost) anything with Git”, I learned to stop worrying and love git rebase.