This is a direct continuation of the last post. That one had already gotten way too long, so I’m starting another for organization’s sake. Wow, that sounded professional for a second.
Spinning It Up
Okay, so I clicked on “Create Cluster” after we finished step 4 of the AWS EMR console.
It looks like the cluster is up. It took about 10 minutes to get to this stage, with the two Running statuses showing up. It still says “Waiting” up top… which, as far as I can tell, just means the cluster is idle and ready to accept work. If I click Spark under Connections, sure enough, I see the Spark web interface come up:
What does this page mean? No clue, man. That’s for everyone else to know and for me to find out. It looks like it’s all up. Let’s try to SSH into the master node (generally, the master node will have all the applications housed on it so as not to interfere with the processing power and RAM of the worker nodes).
The SSH link gives us a pre-constructed SSH command. Awesome. The command for my box looks something like:
ssh -i ~/ec2-user.pem email@example.com
I just have to match up my ec2-user.pem on my local machine and I should be in.
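One thing that can trip this step up: ssh refuses to use a private key file that is readable by anyone other than its owner. Here is a minimal sketch of locking the key down first, assuming it lives at ~/ec2-user.pem as in the command above. The lock_key helper name is made up for illustration.

```shell
# Hypothetical helper: restrict a private key so only its owner can read it.
# ssh ignores private keys that are readable by group or others.
lock_key() {
  key_path="$1"
  if [ ! -f "$key_path" ]; then
    echo "key not found: $key_path" >&2
    return 1
  fi
  chmod 400 "$key_path"
}

# Assumed key location, matching the ssh command above.
# If the file is not there, the helper only reports it and moves on.
lock_key "$HOME/ec2-user.pem" || true
```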
[ASCII art login banner]
Thank you so much for the ASCII art, Amazon! You’ve certainly sold me as a customer!
Anyway, the first thing I tried was opening Jupyter Notebook and, of course, it isn’t installed.
sudo pip install jupyter
After the install, starting Jupyter shows it coming up on port 8889 because 8888 has already been assigned to Zeppelin, I believe. When I go to the localhost link that Jupyter spits out, I don’t get anything. I forgot to forward port 8889. Well, “forgot” would be the wrong word… I never even knew it would be on 8889. Let’s re-SSH in with the port forwarding -L flag:
ssh -L 8889:localhost:8889 -i ~/ec2-user.pem firstname.lastname@example.org
Perfect! We’re in!
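For reference, the -L flag takes the form local_port:host:remote_port, meaning that connections to local_port on my laptop get sent through the SSH session to host:remote_port as seen from the master node. Here is a small sketch that assembles the command for whatever port Jupyter happens to report. The tunnel_cmd helper and the user@master-node placeholder are made up for illustration, so substitute the address from your own SSH link.

```shell
# Hypothetical helper: print an ssh command that forwards a local port to the
# same port on the remote machine. It only prints the command, so it can be
# read over before being run.
tunnel_cmd() {
  port="$1"    # port Jupyter reported, e.g. 8889
  key="$2"     # path to the private key
  target="$3"  # user@host of the master node (placeholder)
  printf 'ssh -L %s:localhost:%s -i %s %s\n' "$port" "$port" "$key" "$target"
}

# Example with the port from this post and a placeholder address.
tunnel_cmd 8889 "~/ec2-user.pem" "user@master-node"
```

Once the tunnel is up, the localhost:8889 link that Jupyter prints on the master node should open in a browser on the local machine.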
This would be the time I usually start to write some code… Before we do that, though, we should probably explore the other tool that we’re about to dive into: Spark.
Let’s review Spark in the next post.