How to install Apache Spark on Ubuntu using Apache Bigtop
Has this ever happened to you? A new version of Spark comes out and you want to try it. To do that, you have to remove the previous version, download and extract the new one, and hope that everything still works. Or you are just getting started with a Spark project and want the installation to be seamless. A one-line command would be nice, wouldn’t it? Making this process more organized and user-friendly is one of the goals of Apache Bigtop. This post is for ML and infrastructure engineers, data scientists, and anyone who simply wants to try Spark out.
Apache Bigtop aims to provide a convenient tool for packaging, deploying, and integrating Hadoop-related projects such as HDFS, MapReduce, Pig, Hive, HBase, ZooKeeper, Spark, and many others.
In this tutorial, I will show how to install Apache Bigtop and how to use it to install Apache Spark. Here, I will focus on Ubuntu. For other distributions, check out this link.
Bigtop installation
This tutorial is for Bigtop version 1.3.0. If you want to install another version, change the version number in the commands below accordingly.
- Make sure that you have a JDK installed on your system (so far, JDK 8 works well).
- Install the Apache Bigtop GPG key.
- Grab the repo file.
- Update the apt cache.
- Browse through the artifacts.
- Install bigtop-utils.
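The steps above map onto a handful of shell commands. Here is a minimal sketch that only prints them, so you can review each one before running it. The two URLs are placeholders, not values from this post: look up the GPG key and the bigtop.list file for your Ubuntu release on the Bigtop download page and pass them in. The function name bigtop_setup_commands and the name of the .list file are my own choices for illustration.

```shell
#!/bin/sh
# Sketch: print the Bigtop repository setup commands for review.
# Nothing is executed here; the function only writes text to stdout.
# ASSUMPTION: KEY_URL and REPO_FILE_URL are placeholders supplied by you.

BIGTOP_VERSION="${BIGTOP_VERSION:-1.3.0}"

# Usage: bigtop_setup_commands KEY_URL REPO_FILE_URL
bigtop_setup_commands() {
    key_url="$1"
    list_url="$2"
    # Hypothetical file name; any *.list file in this directory is read by apt.
    list_file="/etc/apt/sources.list.d/bigtop-${BIGTOP_VERSION}.list"

    # One line per step, in the same order as the list above.
    # Note: apt-key is deprecated on recent Ubuntu releases.
    # "apt-cache search hadoop" is just one way to browse the packages.
    cat <<EOF
java -version
wget -O- '${key_url}' | sudo apt-key add -
sudo wget -O '${list_file}' '${list_url}'
sudo apt-get update
apt-cache search hadoop
sudo apt-get install bigtop-utils
EOF
}
```

Call it as bigtop_setup_commands "$KEY_URL" "$REPO_FILE_URL", read the output, and then run the printed commands one at a time. To target a different Bigtop release, set BIGTOP_VERSION before calling the function.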
Now you can install Spark and other Hadoop-related projects.
Spark installation
- Install Spark.
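With the Bigtop repository in place, installing Spark should come down to a single apt-get call. The sketch below prints the commands instead of running them. It assumes that the Bigtop package is called spark-core, which is my assumption and not something this post states, so check the output of the search command first and override SPARK_PACKAGE if the name differs. The function name spark_install_commands is hypothetical.

```shell
#!/bin/sh
# Sketch: print the Spark install commands for review; nothing is executed.
# ASSUMPTION: the package name spark-core; confirm it with the search command.

SPARK_PACKAGE="${SPARK_PACKAGE:-spark-core}"

spark_install_commands() {
    # 1. list Spark-related packages, 2. install one, 3. confirm it is installed
    cat <<EOF
apt-cache search spark
sudo apt-get install ${SPARK_PACKAGE}
dpkg -s ${SPARK_PACKAGE}
EOF
}
```

Run spark_install_commands, review the three lines it prints, and execute them one by one.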
Take a look at the Wiki of the Bigtop project for more information concerning other Hadoop-related projects.
If you are looking for an easier way to try out Spark, check out another tutorial on how to create a Spark Scala project in IntelliJ IDEA. This way you do not have to install anything except IntelliJ IDEA.
Let me know what you think of this article on Twitter @mizvladimir or leave a comment below!