Advanced Git - Cherry-pick and Rebase

Jeff

Git has become the default version control solution for most software teams these days; there are a few alternatives (Subversion, Helix VCS, etc.), but none of them is as ubiquitous as git. And for good reason. Git is fab. It’s easy to get started with. The basic commands (push, pull, merge) are simple enough to pick up in an afternoon. But I find all too often developers never move past these basics into the more powerful commands git has to offer (or worse, they insist on using a GUI client! GASP!). So today, let’s take a tour of two of the most useful commands for manipulating your git history: cherry-pick and rebase.

[Image: XKCD comic, “Git”]

Photo Credit: The brilliant and hilarious XKCD: Git

Cherry-Pick

First things first, the cherry-pick command.

What is cherry-pick:

Cherry-pick essentially takes the changes introduced by a commit on a different branch and applies them as a new commit on the branch you are currently on. You can do this for one commit, or for a range of commits from a separate branch.
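To make the “one commit or a range” part concrete, here is a minimal sketch in a throwaway repository. The branch names (other, mine), the commit helper, and the identity settings are made up for illustration; only the git commands themselves are the real thing.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # name the first branch master regardless of local defaults
git config user.name "Example Dev"           # throwaway identity so commits work anywhere
git config user.email "dev@example.com"

# Hypothetical helper: one small file per commit
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b other
commit one; commit two; commit three         # three commits on the other branch

git checkout -q master
git checkout -q -b mine

# One commit: copy just "one" (two commits back from the tip of other)
git cherry-pick other~2

# A range: A..B means "everything after A, up to and including B",
# so this copies "two" and "three"
git cherry-pick other~2..other

git log --format=%s
```

Note that the range form leaves out its starting commit, which is why “one” had to be picked separately here.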

Why should I care?

Say you’re working on a team and you do code review as a part of your process. So at some point in your day, you move from a branch that you’re working on to a teammate’s branch to look at the code they wrote. Everything looks good; you put a comment and a ship emoji on their PR in GitHub and go on your merry way. Type-itty type, you go back to the feature you were working on. You finish it up and put in a commit. What’s that message back from git? Crap, you’re still on your co-worker’s branch?

Now you reach into your bag of git tricks, and pull out a cherry-pick. This basically lets you copy a specific commit and superglue it on the end of your current branch. In our example, you would check out your feature branch and cherry-pick the stray commit from there.

Example Time:

Whoo, we traveled through spacetime to an imaginary terminal. We’re currently in our lovely co-worker’s branch. Let’s investigate:

git log

And here we have some fancy logs of our git history:

commit 0d1dsomehashthingycopiedfromgithistory (origin/heman/master/ofUniverse)
Author: Person Smmith <personS@gmail.com>
Date:   Tue Oct 31 17:35:51 2017 -0400

     Wrote all the codes

Now we’re gonna steal that commit hash 0d1dsomehashthingycopiedfromgithistory and navigate back to our branch where we can super glue that fancy commit onto our branch, like a professional.

git checkout super-awesome-feature-thing
git cherry-pick 0d1dsomehashthingycopiedfromgithistory

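That’s it. That commit is copied onto your current branch (with a new hash of its own). Keep in mind that the original is still sitting on your local copy of your co-worker’s branch, so you’ll want to clean it up there before it gets pushed by accident.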
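Here is the whole rescue played out in a throwaway repository, as a rough sketch. The name coworker-branch, the commit helper, and the identity settings are hypothetical. The final git reset --hard step is my own assumption about how you would tidy up, and it is only safe if the stray commit was never pushed.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # name the first branch master regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Hypothetical helper: one small file per commit
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git branch super-awesome-feature-thing       # your branch
git checkout -q -b coworker-branch           # made-up name for the teammate's branch
commit their-work

# Oops: your feature commit lands on the co-worker's branch
commit my-feature
stray=$(git rev-parse HEAD)

# Copy the stray commit onto your own branch
git checkout -q super-awesome-feature-thing
git cherry-pick "$stray"

# Then drop it from your local copy of the co-worker's branch
# (assumes the stray commit was never pushed)
git checkout -q coworker-branch
git reset -q --hard HEAD~1
```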

Rebase:

When I started one of my first professional programming jobs, I was met with a surprise when I put in my first PR. Let’s set the scene: there I am, agonizing over my keyboard, putting the final touches on my masterpiece of a first feature — am I being too dramatic?! ;). I hit save, go to my terminal, and type in my commit message. Beaming with pride, I messaged my team lead that I had finished the first task handed to me… And he responded “looks good. Now rebase and squash that thing and we can merge it in.”

Rebase? Squash? WTF IS THAT!! Why can’t we just merge it?

The answer is kind of complicated and very dependent on style and sensibility. Sometimes, it’s best to just merge and let git do its thing. Git is an extremely sophisticated program that does its best to combine unlike things into what you (usually) expect. But it leaves behind a trace: “Branch so and so has been merged into branch Bob Loblaw.” And sometimes you need finer control over what is in your branch: you want to pull another feature into the branch you’re working on, or you’re just really anal about your git history. Rebasing instead of merging keeps those merge commits out of the log.

What is rebase:

Rebase essentially rewrites your git history so that your commits sit on top of another branch. You rebase “onto” the branch you want to catch up with: git sets your commits aside, moves your branch to the tip of that branch, and then replays your commits one at a time on top of it, as if you had started your work from there.

Because your commits are replayed one at a time, you can have several “rounds” (a term I came up with, not official) of rebasing: any commit that touches the same lines as the branch you are rebasing onto can stop the process with a merge conflict, so you may have the special privilege of multiple rounds of merge conflict resolution! After fixing each one, you git add the files and run git rebase --continue (or git rebase --abort to back out entirely). Your branch’s history is also rewritten, so getting back to the state it was in before the rebase is not as simple as undoing a merge.

When you merge a branch, your git history looks like this -> your commits + merged branch with all the things. This is nice, because if something goes terribly wrong during the merge, you can revert back to the previous state of your branch. With rebase, your commits are recreated as brand-new commits stacked on top of the other branch, and the originals vanish from the branch like milk in tea (sorry, English major moment there). They can still be fished out of git reflog for a while, but it takes more effort than undoing a merge.
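For the curious, here is a minimal sketch of fishing the pre-rebase state back out, again in a throwaway repository with made-up branch names and a hypothetical commit helper. It leans on ORIG_HEAD, which git rebase sets to where the branch pointed before it started. Treat it as an escape hatch to use right after a rebase, not a guarantee, since other commands also overwrite ORIG_HEAD.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # name the first branch master regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Hypothetical helper: one small file per commit
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b feature
commit feature-work
git checkout -q master
commit sarah-work                            # master moves on without us

git checkout -q feature
before=$(git rev-parse HEAD)
git rebase -q master
after=$(git rev-parse HEAD)                  # a new hash: the commit was recreated

git reflog -3                                # the old tip is still listed here

# Move the branch back to where it pointed before the rebase started
git reset -q --hard ORIG_HEAD
restored=$(git rev-parse HEAD)
```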

A side effect of this is changed commit hashes, which means if your branch was already pushed to a remote (like GitHub), you will have to use the -f flag to force push it back up, because the history on the remote no longer matches the history you have locally.
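A quick sketch of what that looks like, using a local bare repository as a stand-in for GitHub. The paths, branch names, and commit helper are all hypothetical. I am also swapping in --force-with-lease, a more cautious cousin of -f that the text above does not mention: it only overwrites the remote branch if nobody else has pushed to it since you last fetched.

```shell
set -e
work=$(mktemp -d) && cd "$work"
git init -q --bare origin.git                # stand-in for the remote
git init -q clone && cd clone
git symbolic-ref HEAD refs/heads/master      # name the first branch master regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"
git remote add origin "$work/origin.git"

# Hypothetical helper: one small file per commit
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git push -q origin master
git checkout -q -b feature
commit feature-work
git push -q origin feature                   # the remote now has the pre-rebase hashes

git checkout -q master
commit sarah-work
git checkout -q feature
git rebase -q master                         # feature-work gets a new hash

# A plain push is refused because the remote branch has diverged from ours
if git push -q origin feature 2>/dev/null; then
  echo "unexpectedly accepted"
else
  echo "rejected, as expected"
fi

# Force it, but only if the remote branch is still where we last saw it
git push -q --force-with-lease origin feature
```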

Why would I do this again?

True, merge is the safer option, which is why it is used much more often. Rebase is often referred to as the “unsafe” method of merging branches. The main benefit is a clean, readable git history. Without all the merge commits, your git history is much easier to read, and is a more accurate record of project progress.

You can also “squash” your commits during the rebase phase. So say you have a branch that you’ve been working on for a while, and you have a bunch of commits that say “WIP stepping out for lunch, blah blah blah.” You can get rid of those during the rebase process, and make your whole feature one commit, if you want. Or reword the commit messages to remove typos, etc. I, for one, am very skilled at making typos in commit messages.

The main drawback is the potential to lose a ton of work in the process of rebasing.

Rebase Example Time:

Say you have a feature branch that you’ve been working on for a day. In the meantime, your co-worker Sarah has not been compulsively checking Twitter and chugging coffee, so she’s got her feature done and merged into master (or staging or whatever your feature is branched from).

Anyways, your branch is at least one commit behind the origin branch, so when you finally get your act together and add your feature, it won’t be able to “fast forward” and just tack your commits on to the master branch. You’ll be creating one of those hideous merge commits. How do we avoid this, and simultaneously make everyone think we’re a time-traveling git wizard? Rebase, my friends.
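The fast-forward business is easier to see than to describe, so here is a small sketch in a throwaway repository. Branch names, commit messages, and the commit helper are made up. It uses a plain git rebase rather than the interactive one so it runs without opening an editor.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # name the first branch master regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Hypothetical helper: one small file per commit
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b feature
commit feature-work
git checkout -q master
commit sarah-work                            # Sarah's feature lands on master first

# master has moved on, so a fast-forward-only merge of feature is refused
if git merge -q --ff-only feature 2>/dev/null; then
  echo "fast-forwarded"
else
  echo "cannot fast-forward yet"
fi

# Replay feature on top of the new master...
git checkout -q feature
git rebase -q master

# ...and now master can simply be moved forward, with no merge commit
git checkout -q master
git merge -q --ff-only feature
git log --oneline --graph
```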

First, get all the things from remote git branches, with git fetch:

git fetch origin
git log origin/master

Then, look longingly at those missing commits in the master log… Time to smuggle those into your branch.

git rebase -i origin/master

The -i stands for interactive. It opens git’s configured text editor with a list of the commits from your branch that are about to be replayed onto the branch you’re rebasing onto. This is also where the mysterious “squash” command comes into play.

[Image: the interactive rebase commit list open in a text editor]

This screen shows us our commits along with a list of commands. These are the ways you can manipulate your commit history during the rebase. Let’s go through them to see what they do.

Pick and Drop:

Pick and drop are the simplest commands available in rebase. Pick just tells git to use the commit unchanged. Drop tells git to leave the commit out completely, and you will lose all the changes in that commit.

Reword and edit:

Reword and edit are very similar; they both let you change a commit. Reword only lets you edit the commit message, and is the more frequently used of the two. Edit pauses the rebase right after the commit is applied, so you can also EDIT THE CODE: make your changes, run git commit --amend, then git rebase --continue. This is a riskier move, so use this command with caution.

Squash and Fixup:

Squash and fixup are similar commands, as well. Squash keeps all of the changes made in that particular commit, but folds them into the commit listed just above it and appends its commit message to that commit’s description. This essentially means the commit has been sucked into the commit before it, and the distinct commit no longer exists. Fixup does the same thing, but throws the commit message away instead of appending it. These commands are often used when “squashing” many commits down to one “feature” commit, to clean up lots of “work in progress” commits.
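Here is a rough sketch of squashing WIP commits without touching an editor by hand. It relies on GIT_SEQUENCE_EDITOR, the environment variable git consults for the program that edits the rebase list; the sed one-liner standing in for you at the keyboard, the commit names, and the hashes in the comments are all made up. It uses fixup rather than squash so that no commit-message editor pops up.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master      # name the first branch master regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Hypothetical helper: one small file per commit
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b feature
commit feature-part-1
commit wip-lunch
commit wip-typo

# The list git opens looks roughly like this (hashes are invented):
#   pick a1b2c3d feature-part-1
#   pick d4e5f6a wip-lunch
#   pick 0718293 wip-typo
#
# The sed command below edits that list the way you would by hand:
# keep the first "pick" and change every later one to "fixup".
GIT_SEQUENCE_EDITOR="sed -i -e '2,\$s/^pick/fixup/'" git rebase -q -i master

git log --format=%s                          # one feature commit on top of base
```

Swapping fixup for squash in the sed expression would also work, but git would then open the commit-message editor so you can tidy up the combined message.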

Exec

That leaves us with “exec.” This command is the most complicated and deserves a blog post of its own, but we’ll briefly discuss it here. An exec line runs a shell command at that point in the rebase, and if the command fails, the rebase stops so you can fix things. Put one after every commit (git rebase -x does this for you) and you can do something like run a test suite for each commit. You could get super creative with this if you wanted and do some powerful things.
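As a small taste, here is a sketch using git rebase -x in a throwaway repository. The “test suite” is just a placeholder check that a file exists, and the log file, branch names, and commit helper are invented for the example; in real life you would put your actual test command there.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
log=$(mktemp)                                # scratch file recording each check
git init -q
git symbolic-ref HEAD refs/heads/master      # name the first branch master regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Hypothetical helper: one small file per commit
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b feature
commit step-1
commit step-2
git checkout -q master
commit sarah-work
git checkout -q feature

# -x adds an "exec" line after every commit being replayed.
# Placeholder "test suite": check a file exists, then note which commit passed.
git rebase -q -x "test -f base.txt && git log -1 --format=%s >> '$log'" master

cat "$log"                                   # one line per replayed commit
```

If the command exits with a non-zero status on any commit, the rebase stops right there, which is what makes this handy for finding the commit that broke the tests.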

Wrap Up

There you have it. Now you can go impress your coworkers and friends with your pristine git history. If you’re interested in reading a bit more about these concepts, here are some additional resources:

  1. Explanation from Git

  2. Merging vs. Rebasing

  3. Rebasing Golden Rule