Tagged "git"

Git Tip of the Day: --color-words

I’ve always felt that Git’s default diff output left something to be desired, especially when it was applied to Markdown text files (like the ones used to generate this blog). Line-by-line diffs aren’t very helpful if every line represents a paragraph of text.

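I was thus pleasantly surprised to stumble across this blog post, which (among other things) explains how to generate word-by-word diffs that wrap nicely. The first step is to configure the default Git pager to wrap lines using one of the following commands. Both pass less an option (-r or -R) that lets Git’s color escape sequences through; the second also resets whatever options the LESS environment variable sets before applying its own.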

git config --global core.pager "less -r"
git config --global core.pager "less -+\$LESS -FRX"

The second step is to tell Git to generate a (colored!) word-by-word diff with the --color-words option.

git diff --color-words

And that’s all there is to it! Note that --color-words is technically equivalent to --word-diff=color, so if you’re interested in reading about the other available options to --word-diff you can check out the git-diff man page.
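To get a feel for what the other --word-diff modes produce, here is a small sketch that compares two throwaway files with git diff --no-index, so it runs outside any repository. The file names and sample sentences are made up for illustration. The plain mode marks removed words as [-...-] and added words as {+...+}, which is handy when color is not available.

```shell
#!/bin/sh
# Hypothetical sample files, created in a scratch directory.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'The quick brown fox jumps over the lazy dog.\n' > old.md
printf 'The quick red fox jumps over the lazy dog.\n' > new.md

# --no-index compares two paths on disk, so no repository is required.
# git diff exits non-zero when the files differ, hence the "|| true".
word_diff=$(git diff --no-index --word-diff=plain old.md new.md || true)
printf '%s\n' "$word_diff"
```

Swapping --word-diff=plain for --color-words gives the colored variant of the same comparison, and adding --word-diff-regex=. should make Git treat every character as its own word.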

As helpful as this technique is, there is one caveat to keep in mind: as far as I can tell, the --color-words option only works with commands that print diffs themselves, such as git-diff, git-log, and git-show. If you happen to use git add -p to stage individual changes (like I do), you are stuck with the default line-by-line diff output of git-add. That’s really only a minor inconvenience, though, because you can still call git diff --color-words just before git add -p to get a nicely formatted overview of your changes before you stage them.

I hope you found this tip to be as helpful as I did!

Calculating Git SHA-1 hashes in Ruby

Although the process by which Git calculates SHA-1 hashes is well documented in Pro Git, I had a hard time finding it today and decided to write a blog post that will (hopefully) be a bit easier for myself and others to search for later.

First of all, use the hash-object command as follows to print the SHA-1 hash that Git calculates for an object. (You can also pass a filename as an argument to hash-object.)

$ echo 'test content' | git hash-object --stdin
d670460b4b4aece5915caf5c68d12f560a9fe3e4

Note that, by default, echo prints a trailing newline character so this command is actually computing the SHA-1 hash of "test content\n". Interestingly enough, though, if you try to reproduce this behavior in Ruby by computing the SHA-1 hash of the same string, you get a different result.

$ irb
>> require 'digest/sha1'
=> true
>> puts Digest::SHA1.hexdigest "test content\n"
4fe2b8dd12cd9cd6a413ea960cd8c09c25f19527
=> nil

The reason for this, as explained in Pro Git, is that Git actually prepends the following information to a file’s contents before it calculates a hash.

  1. The object’s type: blob for file contents, tree for a tree object, and commit for a commit object
  2. A space
  3. The size of the object’s data in bytes, written as a decimal number
  4. A null byte (\0)

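In other words, the string to hash is this header followed by the content. Here "test content\n" is 13 bytes long (12 characters plus the trailing newline), so the header is "blob 13\0" and the following command generates the appropriate hash.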

$ irb
>> require 'digest/sha1'
=> true
>> puts Digest::SHA1.hexdigest "blob 13\0test content\n"
d670460b4b4aece5915caf5c68d12f560a9fe3e4
=> nil
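The same header can be cross-checked from the shell, without Git or Ruby. This is only a sketch under a few assumptions: git_blob_sha1 is a hypothetical helper name, sha1sum comes from GNU coreutils (on macOS, shasum would likely take its place), and only blobs are covered, not trees or commits.

```shell
#!/bin/sh
# Hypothetical helper: print the SHA-1 that Git would assign to a file's contents.
git_blob_sha1() {
  # Size in bytes as a decimal string (tr strips padding that some wc versions add).
  size=$(wc -c < "$1" | tr -d '[:space:]')
  # Header "blob <size>\0" followed by the raw contents, hashed as one stream.
  { printf 'blob %s\0' "$size"; cat "$1"; } | sha1sum | cut -d ' ' -f 1
}

# Same input as the echo example: "test content" plus a trailing newline.
sample=$(mktemp)
printf 'test content\n' > "$sample"
blob_hash=$(git_blob_sha1 "$sample")
echo "$blob_hash"
```

Because the size is counted in bytes rather than characters, content with multi-byte characters needs the same care in Ruby (bytesize rather than length).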

Hope this helps!

GitHub Flow

A few months ago I shared a link to a successful Git branching model, also known as git-flow. I’ve always considered it to be a very robust and well-designed process for teams that collaborate via Git, but at the same time I’ve rarely used it for any of my personal projects. Why? I honestly never gave it too much thought, but after reading Scott Chacon’s take on the matter (GitHub Flow) I am inclined to agree with him: for many developers, the complexity of git-flow outweighs its benefits.

One of the bigger issues for me is that it’s more complicated than I think most developers and development teams actually require. It’s complicated enough that a big helper script was developed to help enforce the flow. Though this is cool, the issue is that it cannot be enforced in a Git GUI, only on the command line, so the only people who have to learn the complex workflow really well, because they have to do all the steps manually, are the same people who aren’t comfortable with the system enough to use it from the command line. This can be a huge problem.

So the complexity of git-flow is one issue; another is how frequently GitHub releases code.

So, why don’t we use git-flow at GitHub? Well, the main issue is that we deploy all the time. The git-flow process is designed largely around the “release”. We don’t really have “releases” because we deploy to production every day – often several times a day. We can do so through our chat room robot, which is the same place our CI results are displayed. We try to make the process of testing and shipping as simple as possible so that every employee feels comfortable doing it.

This makes sense—git-flow does appear to be designed for more traditional release schedules rather than for continuous delivery, as summarized below.

For teams that have to do formal releases on a longer term interval (a few weeks to a few months between releases), and be able to do hot-fixes and maintenance branches and other things that arise from shipping so infrequently, git-flow makes sense and I would highly advocate its use.

For teams that have set up a culture of shipping, who push to production every day, who are constantly testing and deploying, I would advocate picking something simpler like GitHub Flow.

I highly recommend that you read the full article. If you’re still interested in learning more about Git, I would also recommend Scott Chacon’s comprehensive book on the subject, Pro Git.

A Successful Git Branching Model


I came across this article on a standardized Git branching model some time ago, but I still find it quite useful to reference from time to time. If you’re using Git, this is definitely a worthwhile read.