Data Structures: The Secret Life of Hashes

A drawing of the structure of a hash

SPOILER: A hash is secretly an array. Each position in the array contains another array. The inner array contains the key at position [0] and the value at position [1].

I’ve had a few people say they liked seeing my notes, so I’m going to try and make a few posts with what I think are the more key bits of a few lessons! I hope this will also make me write about more complicated topics instead of basic procedural outlines.

The lesson here was our introduction to understanding the inner workings of data structures. Specifically, arrays and hashes.

The cool thing I learned is that hashes are full of secrets.

Specifically, Hashes are secretly Arrays, but disguised by their curly brackets. The cool way Hashes disguise themselves as paired values in no particular order is through nested Arrays and a .hash method which does some secret math. As you might be able to tell from my drawing (click on it to make it bigger), a Hash is just an Array. But it is a fancy Array! The size of the Array is defined ahead of time (by the Ruby computer-brain), and each spot in the array is set to nil. Then, when we add a key => value pair to the Array, some magic happens.

  1. The Ruby computer-brain runs .hash on the key. This generates a pretty large and (mostly) unique number.
  2. However, the Ruby computer-brain don’t actually want (or need) a gigantic Array with hundreds of thousands of places filled with nil values, so it uses the modulo (or remainder) % to make it smaller. Specifically, it divides the giant hash number by the number of spots in the Array.
  3. The modulo gives the Ruby computer-brain the remainder of that division operation, which by definition, has to fit inside the array.
  4. Since it was generated by a (mostly) unique and very large number, the remainder will also be (mostly) unique, so the Ruby computer-brain uses it as the index position in the array to store our new key and value.
  5. The key and value are set as positions [0] and [1] in an array which is nested inside the Hash-Array at the index position calculated via the .hash and modulo operations.

So this is why Hash lookups are so fast! The Ruby computer-brain ‘knows’ where each key and value are because it can take the requested key, do the math for the .hash method really quick (because computers are really good at doing math quickly), calculate the index position with the modulo operation, and BAM! Find your value!