Saturday, July 25, 2015

How to set up Apache Spark (Java) - MLlib in Eclipse?

Apache Spark version: 1.3.0

Download the required pre-built version of Apache Spark from the following link:
http://spark.apache.org/downloads.html

Create a Maven project in Eclipse:
File > New > Maven Project

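Add the following dependencies to pom.xml:

<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-mllib_2.10</artifactId>
  <version>1.3.0</version>
  <scope>provided</scope>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <version>1.3.0</version>
  <scope>provided</scope>
</dependency>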

We set the scope to “provided” because these dependencies are already supplied by the Spark installation when the job runs, so they do not need to be packaged into the .jar.

Create a new class and add your Java source code for the required MLlib algorithm.

Right-click the project, choose Run As > Maven build…, and enter package as the goal.

Verify that the .jar file has been created in the ‘target’ folder of the Maven project.
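If you prefer to check this from code rather than from the file explorer, the following is a minimal sketch in plain Java. The class name TargetJarCheck and the method findJars are made up for illustration; it assumes the build output sits in a folder named target relative to where you run it, unless a different path is passed as the first argument.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Spark or Maven.
class TargetJarCheck {

    // Returns the sorted names of all entries ending in .jar directly inside targetDir.
    // An empty list means the folder is missing or contains no .jar yet.
    static List<String> findJars(Path targetDir) throws IOException {
        List<String> jars = new ArrayList<>();
        if (!Files.isDirectory(targetDir)) {
            return jars;
        }
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(targetDir, "*.jar")) {
            for (Path p : stream) {
                jars.add(p.getFileName().toString());
            }
        }
        Collections.sort(jars);
        return jars;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: "target" is the Maven output folder of the current project.
        Path target = Paths.get(args.length > 0 ? args[0] : "target");
        List<String> jars = findJars(target);
        if (jars.isEmpty()) {
            System.out.println("No .jar found in " + target.toAbsolutePath());
        } else {
            for (String jar : jars) {
                System.out.println(jar);
            }
        }
    }
}
```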

Change to the directory of the Spark installation you downloaded and unpacked, and run the following command:
./bin/spark-submit --class <main-class> --master local[2] <path-to-jar>
E.g.,

./bin/spark-submit --class fpgrowth --master local[2] /Users/XXX/target/uber-TestMLlib-0.0.1-SNAPSHOT.jar
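The command above can also be assembled from Java, for example if you want to launch the job from a small wrapper program. The following is only a sketch: SparkSubmitCommand and build are hypothetical names, and the values in main are simply the ones from the example command. It only builds and prints the argument list; actually starting the process is left as a comment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Spark.
class SparkSubmitCommand {

    // Builds the spark-submit argument list in the same order as the command line:
    // ./bin/spark-submit --class <main-class> --master <master> <path-to-jar>
    static List<String> build(String mainClass, String master, String jarPath) {
        return new ArrayList<>(Arrays.asList(
                "./bin/spark-submit",
                "--class", mainClass,
                "--master", master,
                jarPath));
    }

    public static void main(String[] args) {
        List<String> cmd = build(
                "fpgrowth",
                "local[2]",
                "/Users/XXX/target/uber-TestMLlib-0.0.1-SNAPSHOT.jar");
        System.out.println(String.join(" ", cmd));
        // To launch it, the list could be handed to java.lang.ProcessBuilder,
        // with the working directory set to the unpacked Spark folder
        // (the relative ./bin path only resolves from there).
    }
}
```

Running main prints the same command line as the example. Passing the arguments as a list rather than a single string avoids shell quoting problems with values such as local[2].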

3 comments:

  1. This comment has been removed by the author.

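  2. Superb, it's helpful for basic setup.
    But if anyone uses Maven, don't forget to add these dependencies:

    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>2.1.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-mllib_2.10</artifactId>
      <version>1.3.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.10</artifactId>
      <version>2.1.0</version>
    </dependency>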