Data Structures: The Secret Life of Hashes

A drawing of the structure of a hash

SPOILER: A hash is secretly an array. Each position in the array contains another array. The inner array contains the key at position [0] and the value at position [1].

I’ve had a few people say they liked seeing my notes, so I’m going to try and make a few posts with what I think are the more key bits of a few lessons! I hope this will also make me write about more complicated topics instead of basic procedural outlines.

The lesson here was our introduction to understanding the inner workings of data structures. Specifically, arrays and hashes.

The cool thing I learned is that hashes are full of secrets.

Specifically, Hashes are secretly Arrays, but disguised by their curly brackets. The cool way Hashes disguise themselves as paired values in no particular order is through nested Arrays and a .hash method which does some secret math. As you might be able to tell from my drawing (click on it to make it bigger), a Hash is just an Array. But it is a fancy Array! The size of the Array is defined ahead of time (by the Ruby computer-brain), and each spot in the array is set to nil. Then, when we add a key => value pair to the Array, some magic happens.

  1. The Ruby computer-brain runs .hash on the key. This generates a pretty large and (mostly) unique number.
  2. However, the Ruby computer-brain don’t actually want (or need) a gigantic Array with hundreds of thousands of places filled with nil values, so it uses the modulo (or remainder) % to make it smaller. Specifically, it divides the giant hash number by the number of spots in the Array.
  3. The modulo gives the Ruby computer-brain the remainder of that division operation, which by definition, has to fit inside the array.
  4. Since it was generated by a (mostly) unique and very large number, the remainder will also be (mostly) unique, so the Ruby computer-brain uses it as the index position in the array to store our new key and value.
  5. The key and value are set as positions [0] and [1] in an array which is nested inside the Hash-Array at the index position calculated via the .hash and modulo operations.

So this is why Hash lookups are so fast! The Ruby computer-brain ‘knows’ where each key and value are because it can take the requested key, do the math for the .hash method really quick (because computers are really good at doing math quickly), calculate the index position with the modulo operation, and BAM! Find your value!

Git: Working With Remote Repositories

In an earlier post, I covered how to add and remove remote repositories. Now that we’ve told our local git repo where all these remotes are, now what?

Push

We want to share all the changes we’ve made on our project with our collaborators. Or we just want a backup incase our computer catches fire. The next step is to push the changes we’ve made to our remote repo. The push command is pretty safe and it is pretty hard to mess anything up by using it.

Here is the general outline of the command:

git push <alias>

Typically, we will just be pushing to origin, the default remote repository:

git push origin

Sometimes, when we have multiple branches, we might need to specify which branch we want to push to the remote:

git push origin master

And actually, this formula can be generalized as:

git push <alias> <branch name>

With this, we can push to any remote (that we are authorized to access) and specify any branch we like!

Fetch

Our collaborators have been hard at work! We want to update our local repo with the changes they’ve pushed to our shared remote. If we want to download all of the changes, but not merge them into your files just yet, we want to use the ‘fetch’ command. This is the ‘safer’ command, because we can check on the changes before merging them into our own files.

If we only have one remote:

git fetch

If we have more than one remote:

git fetch <alias>

If we want to download from all remotes:

git fetch --all

Now we have all of our changes from the remote on our local machine, but our local files haven’t been updated yet.

I like to take a look at what git thinks is happening.

git diff <branch-name> <remote-alias>/<branch-name>

This should show us the changes that will be made to our files. Assuming everything looks about right, it’s time to merge.

Check we are on the correct branch:

git branch

Then merge:

git merge <remote-alias>/<branch-name>

Pull

Sometimes referred to as a ‘fast forward’, the ‘pull’ command does a ‘fetch’ and a ‘merge’ all in one command. I think of this as THE DANGER ZONE. Sometimes, it is exactly what I want, like when I am the only one making commits to a branch or repo, or I know that I want to merge all the changes in the remote with my own local changes. If this is the case, the pull command makes getting updates from the remote really easy.

git pull <alias> <branch-name>

And we’re done!

 

Other Resources:

  • I found this blog post on git fetch vs git pull to be very helpful. Spoiler: git fetch is the way to go!
  • As always, obsessively reading the documentation on push, fetch, and pull was really helpful.
  • And many thanks go Jeff Felchner for giving a talk at MakerSquare on using Git beyond the basics! Remember kids, NO FORCE PUSH!

Writing a Ruby gem!

For my final project, I’m working with Bonnie Mattson on a Ruby gem, currently called ‘Wikiwhat’, for the Wikipedia API! This is a really fun project, and also really challenging. The last few weeks of class have focused on working with frameworks and engines, and less on writing basic Ruby code, so it has been a nice change of pace.

One of the first challenges for this project has been getting the gem to build and install correctly. We use bundler to install and manage gems for our projects. Awesomely, we can also use it to build our gem too!

Here is a short outline of the process.

  1.  Use bundler to create a scaffold directory for the gem. This is nice because it takes care of a lot of details that are potentially easy to forget, like listing all the files that should be included in the gem.
bundle gem <GEMNAME>
  1. Write your gem! (no, it didn’t go that quickly, but you get the idea)

  2. Build your gem! A hint here is that you need to include the version number in your build command. Also, include the .gemspec file ending. (UPDATE: This is not where you include the version number. That is later!) EX: gemname.gemspec

gem build <GEMNAME>.gemspec

This will build your gem in the directory where you run this command.

  1. In the same directory, run the install command:
gem install <GEMNAME-version>.gem

It did take us quite a while to get to this point. We definitely had some malformed files and/or structure in our gem such that it either would not build or would not install. A few of the problems we had:

  • Bad require statements – It does not like it if things are named badly, or if you are trying to require '' or similar. I can’t remember why we thought we needed it, but you don’t! Don’t do it!
  • Incorrect naming or file structure – The gem builder/installer really really wants things to follow a semi-strict naming convention. Inside lib/, the main .rb file needs to be named the same as your gem name. If you have any additional code files, those need to be stored inside a folder that is named the same as your gem and main file.
    EXAMPLE:

    .
    ├── wikiwhat.gemspec
    └── lib
        ├── wikiwhat
        │   └── api_call.rb
        └── wikiwhat.rb

The final challenge was to be able to require it in IRB and run a command!

  1. Open IRB and require the gem:
require 'GEMNAME'

If it returns true, you’re 99% of the way there!

  1. Run a command! Obviously, this one is going to vary from gem to gem. If it works the way you expect, you are done! In our case, we ran our call method:
Wikiwhat.call("Albert Einstein")

And got back the first paragraph of the wiki article on Albert Einstein, just like we wanted. Yay!

 

Resources:

  1. So far, we have mainly used this great guide from RubyGems.org.

  2. I also really enjoyed this podcast on Ruby Gems from Ruby Rogues.

Listening for Events with Backbone Views

This is something that I’ve learned before, and I think I even have it in my notes from when we first learned this, but on Friday, we used two ways to listen for events in Backbone, and I wanted to really solidify this in my mind so I stop getting confused!

One kind of events we can listen for are browser events like click, hover, or focus. The way to ‘listen’ for these kinds of events in a Backbone View is to use an events property in the Backbone View, like so:

events: { '.delete click': 'removeComment' }

The property of the events object is what event we are looking for. In this example, we are listening for a click action on something with the class ‘delete’. The value of the property is what View property should be called when the event occurs.

The other kind of events we can listen for are Backbone events. These are events like ‘add’, ‘remove’, or ‘change’ in Views, Models, or Collections that can trigger further actions. They are set via the listenTo(); function in the initialize property in the View, like so:

initialize: function (options) { this.listenTo(this.collection, 'add', this.addCommentToWall); }

Here, we are saying that we want this instance of the Backbone View to listen to the instance of a Backbone Collection that has been assigned to it. It is listening for something to be added. When something is added, we want to call addCommentToWall, which is a property of the same instance of the Backbone View.

Recap: I have learned two ways to listen for events using Backbone.js. The events property is for listening for browser events. The listenTo() function in the initialize property is for listening for changes in backbone objects.

Git: Adding Remote Repositories

Git and version control are some of the wierdest and funnest things I’ve been introduced to. I love the idea of taking continuous snapshots of the small steps of my work, and then being able to sift through them and pick the ones I want. I think I’m still at a fairly basic level, but within that context, I’ve learned so much!

Remote Repositories

Remote repositories are basically copies of your git repo that are not stored on your hard drive. This seems kind of obvious, but didn’t really click for me until I learned that you could have as many remote repositories (remotes, for short) as you want!

Step 1: Take a look at the remote repositories you already have.

git remote -v

You will see a list that looks something like this:

origin https://github.com/cglinka/urdb-1.git (fetch)
origin https://github.com/cglinka/urdb-1.git (push)

Or like this, if you have multiple remotes already set:

mks https://github.com/makersquare/urdb.git (fetch)
mks https://github.com/makersquare/urdb.git (push)
origin https://github.com/cglinka/urdb-1.git (fetch)
origin https://github.com/cglinka/urdb-1.git (push)

You can see the remote name (or ‘alias’, if you want to get technical about it) followed by the URL that actually describes the location of each remote repository.

All of my remotes are on GitHub, but you can have remotes on other servers, or so I’m told. I have not tried to set up my own git server yet. :D

NOTE: The repo you clone from and/or the first remote you add will by default be called your origin. So when you are reading tips and tricks about Git, origin is just your ‘default’ remote alias, which may or may not be true for your particular repo. This may or may not cause a lot of confusion and/or frustration, but checking your remote names/aliases will save you a lot of troubleshooting time.

Step 2: Add a new remote.

git remote add <ALIAS> <url>

Fill in ALIAS with the name you want to use to access that remote. You can’t use origin if you already have an origin assigned. I try and give them easy-to-remember names based on the GitHub username. For the last several weeks, we’ve been working on the same repos in class, with new branches every day for that day’s lesson. Instead of making a million folders with a million copies of the same repo, I added a second remote as ‘mks’ that pointed to the correct repo on the makersquare account so I could just update my local copy of the repo every day! I added a third remote that pointed to my partner’s repo of the same assignment.

Step 3: But I want to change the remote that origin points to!

It’s cool, you can do that! First, remove the current origin:

git remote rm origin

Then, check make sure you removed the origin:

git remote -v

Last, add the new origin;

git remote add origin <URL>

And just to be sure, check your remotes again:

git remote -v

Ta-dah! New origin added!

More keyboard shortcuts

No matter how many cheat sheets I read, I seem to only manage to make one or two new keyboard shortcuts stick at a time. I hope you all are enjoying the fact that this is now a keyboard shortcut blog. :D

My new awesome trick is a general trick that appears to apply to all programs on OS X.

alt + ARROW

lets you jump by word instead of having to key past every single character. Hat tip to Greg, my partner this week for the trick!

And it gets even better! If you do

alt + delete

you can delete an entire word at a time!

I feel that it would also be important to mention that

⌘ + delete

works the same way as

⌘ + ARROW

which I mentioned earlier. It will delete an entire line, the way that you can jump to the end or beginning of a line!

Git Tricks: git diff

I think one of the first things you learn in git is the

git diff

command. This one is really important for me, because I forget exactly what I’ve changed since I last committed. But sometimes, I forget to run

git diff

before I add my files to the staging area. Whoops!

 

I feel like I should have figured this out sooner (that’s how learning works, right? obvious once you know it), but there is a very nice command that fixes this problem for me!

git diff --cached

show the git diff for files I’ve already staged! Huzzah! My commit messages will never be a series of “Ummmmm, I don’t remember what I just did.”