Calculating Git SHA-1 hashes in Ruby
Although the process by which Git calculates SHA-1 hashes is well documented in Pro Git, I had a hard time finding it today and decided to write a blog post that will (hopefully) be a bit easier for myself and others to search for later.
First of all, use the hash-object command as follows to print the SHA-1 hash that Git calculates for an object. (You can also pass a filename as an argument to hash-object.)
$ echo 'test content' | git hash-object --stdin
d670460b4b4aece5915caf5c68d12f560a9fe3e4
Note that, by default, echo prints a trailing newline character, so this command is actually computing the SHA-1 hash of "test content\n". Interestingly enough, though, if you try to reproduce this behavior in Ruby by computing the SHA-1 hash of the same string, you get a different result.
$ irb
>> require 'digest/sha1'
=> true
>> puts Digest::SHA1.hexdigest "test content\n"
4fe2b8dd12cd9cd6a413ea960cd8c09c25f19527
=> nil
The reason for this, as explained in Pro Git, is that Git actually prepends the following information to a file’s contents before it calculates a hash.
- The object’s type: blob for a regular object, tree for a tree object, and commit for a commit object
- A space
- The (human-readable) number of bytes of data in the object
- A null byte (\0)
In other words, you need to run the following command to generate the appropriate hash. (Here, 13 is the number of bytes in "test content\n", including the trailing newline.)
$ irb
>> require 'digest/sha1'
=> true
>> puts Digest::SHA1.hexdigest "blob 13\0test content\n"
d670460b4b4aece5915caf5c68d12f560a9fe3e4
=> nil
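If you'd rather not count the bytes by hand, the same steps can be wrapped in a small method. This is only a minimal sketch: git_object_sha1 is a name I made up for illustration (it isn't part of Git or Ruby's standard library), and it assumes the object's content is already in memory as a string.

```ruby
require 'digest/sha1'

# Hypothetical helper, not part of Git or Ruby: builds the
# "<type> <size>\0" header described above and hashes header + content.
def git_object_sha1(content, type = 'blob')
  # Use bytesize rather than length, since Git counts bytes, not characters.
  header = "#{type} #{content.bytesize}\0"
  Digest::SHA1.hexdigest(header + content)
end

puts git_object_sha1("test content\n")
# prints d670460b4b4aece5915caf5c68d12f560a9fe3e4
```

Using bytesize instead of length matters as soon as the content contains multibyte characters, because the size in the header has to be the byte count.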
Hope this helps!