NYPD Crime #18 – Clustering To Explore Neighbourhoods (Part III – Continued Because Spark Hates Me)

Review: To sum up the last post, our driver’s RAM was essentially the bottleneck and the reason our Spark application and the underlying JVM were crashing. Before, we were using 3 AWS m4.large (8GB RAM) boxes for our master + 2 workers. In this notebook, I’ve spawned a new cluster keeping my workers the […]
