Data Structures: The Secret Life of Hashes

A drawing of the structure of a hash

SPOILER: A hash is secretly an array. Each position in the array contains another array. The inner array contains the key at position [0] and the value at position [1].

I’ve had a few people say they liked seeing my notes, so I’m going to try and make a few posts with what I think are the more key bits of a few lessons! I hope this will also make me write about more complicated topics instead of basic procedural outlines.

The lesson here was our introduction to understanding the inner workings of data structures. Specifically, arrays and hashes.

The cool thing I learned is that hashes are full of secrets.

Specifically, Hashes are secretly Arrays, but disguised by their curly brackets. The cool way Hashes disguise themselves as paired values in no particular order is through nested Arrays and a .hash method which does some secret math. As you might be able to tell from my drawing (click on it to make it bigger), a Hash is just an Array. But it is a fancy Array! The size of the Array is defined ahead of time (by the Ruby computer-brain), and each spot in the array is set to nil. Then, when we add a key => value pair to the Array, some magic happens.

  1. The Ruby computer-brain runs .hash on the key. This generates a pretty large and (mostly) unique number.
  2. However, the Ruby computer-brain don’t actually want (or need) a gigantic Array with hundreds of thousands of places filled with nil values, so it uses the modulo (or remainder) % to make it smaller. Specifically, it divides the giant hash number by the number of spots in the Array.
  3. The modulo gives the Ruby computer-brain the remainder of that division operation, which by definition, has to fit inside the array.
  4. Since it was generated by a (mostly) unique and very large number, the remainder will also be (mostly) unique, so the Ruby computer-brain uses it as the index position in the array to store our new key and value.
  5. The key and value are set as positions [0] and [1] in an array which is nested inside the Hash-Array at the index position calculated via the .hash and modulo operations.

So this is why Hash lookups are so fast! The Ruby computer-brain ‘knows’ where each key and value are because it can take the requested key, do the math for the .hash method really quick (because computers are really good at doing math quickly), calculate the index position with the modulo operation, and BAM! Find your value!

Arrays vs Hashes

Here is a quick rundown on the difference between arrays and hashes! This is a pretty basic overview.

Arrays

The array is basically a list of values or items separated by commas. (CSV! My favorite data format!) Arrays are pretty straightforward.

Array = [ “a cool string”, 5, “another cool string”, “word”, 19983578923485]

Key facts about arrays:

  • Denoted by square brackets [  ]
  • Values are separated by commas [ value1, value2]
  • Order matters!
  • Access individual items via their index. ex: Array[0] gives the first item in the array.

Hashes

Hashes are cool! (And not just because they wear bowties.) A hash looks a little bit like an array, but with different brackets and each position in a hash is made up of a pair of things instead of a single item. The left side of the pair is called the key and the right side is the value. As in, it is KEY to remember the KEY otherwise you will not be able to access the VALUE. Also, the order of items in the array does not matter, because instead of looking up things by their position or index, we look up values in an array by their key.

Hash = { “pumpkin pie” => “is delicious”, 42 => “life the universe and everything”, :dog => “cute!”, nailpolish: “green” }

Key facts about hashes:

  • Denoted by curly brackets { }
  • Key/values pairs are separated by commas { hash1: key1, hash2: key2 } (new syntax) OR { hash1 => key1, hash2 => key2 } (old syntax)
  • Order doesn’t matter!
  • Access individual items via their key. ex: Hash[“key”]
  • That little equals-greater-than symbol that ties the key and value together? It is called a hash rocket! => => =>
  • Can contain symbols as keys! Those are the guys that are either start (or sometimes end with) a colon.