Submit a Spark job to YARN from Java Code
In this post I will show you how to submit a Spark job to YARN directly from Java code. Typically, we submit Spark jobs to a Spark cluster or to Hadoop/YARN by using the $SPARK_HOME/bin/spark-submit
shell script. Submitting jobs through a shell script is limiting when you want to submit Spark jobs from within Java code, such as from a Java servlet or a REST server.
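If forking a child process is acceptable, the spark-submit script itself can also be driven from Java with ProcessBuilder. The sketch below is only an illustration: the main class and JAR path are hypothetical placeholders, and it assumes spark-submit is on the PATH.

```java
import java.util.ArrayList;
import java.util.List;

public class ForkSparkSubmit {

    // Build the spark-submit command line; the main class, JAR path,
    // and application arguments are supplied by the caller.
    static List<String> buildCommand(String mainClass, String appJar, String... appArgs) {
        List<String> cmd = new ArrayList<>();
        cmd.add("spark-submit");   // assumes $SPARK_HOME/bin is on the PATH
        cmd.add("--master");
        cmd.add("yarn-cluster");
        cmd.add("--class");
        cmd.add(mainClass);
        cmd.add(appJar);
        for (String a : appArgs) {
            cmd.add(a);
        }
        return cmd;
    }

    public static void main(String[] args) throws Exception {
        // hypothetical main class and JAR path, for illustration only
        List<String> cmd = buildCommand(
                "org.example.MyApp",
                "/tmp/my-app.jar",
                "/friends/input", "/friends/output");
        System.out.println(String.join(" ", cmd));
        // prints: spark-submit --master yarn-cluster --class org.example.MyApp /tmp/my-app.jar /friends/input /friends/output

        // To actually launch (requires spark-submit on the PATH):
        // Process p = new ProcessBuilder(cmd).inheritIO().start();
        // int exit = p.waitFor();
    }
}
```

The drawback of this approach is that each submission forks a new JVM for spark-submit; the Client-class approach described next keeps everything inside your own JVM.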
Use YARN's Client Class
Below is a complete Java program that submits a Spark job to YARN from Java code (no shell scripting is required). The main class used for submitting a Spark job to YARN is the org.apache.spark.deploy.yarn.Client class.
Complete Java Client Program
$ cat SubmitSparkJobToYARNFromJavaCode.java

// import required classes and interfaces
import org.apache.spark.deploy.yarn.Client;
import org.apache.spark.deploy.yarn.ClientArguments;
import org.apache.hadoop.conf.Configuration;
import org.apache.spark.SparkConf;

public class SubmitSparkJobToYARNFromJavaCode {

    public static void main(String[] arguments) throws Exception {
        // prepare arguments to be passed to the
        // org.apache.spark.deploy.yarn.Client object
        String[] args = new String[] {
            // the name of your application
            "--name", "myname",

            // memory for the driver (optional)
            "--driver-memory", "1000M",

            // path to your application's JAR file
            // (required in yarn-cluster mode)
            "--jar", "/Users/mparsian/zmp/github/data-algorithms-book/dist/data_algorithms_book.jar",

            // name of your application's main class (required)
            "--class", "org.dataalgorithms.bonus.friendrecommendation.spark.SparkFriendRecommendation",

            // comma-separated list of local JARs that you want
            // SparkContext.addJar to work with
            "--addJars", "/Users/mparsian/zmp/github/data-algorithms-book/lib/spark-assembly-1.5.2-hadoop2.6.0.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/log4j-1.2.17.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/junit-4.12-beta-2.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/jsch-0.1.42.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/JeraAntTasks.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/jedis-2.5.1.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/jblas-1.2.3.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/hamcrest-all-1.3.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/guava-18.0.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-math3-3.0.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-math-2.2.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-logging-1.1.1.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-lang3-3.4.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-lang-2.6.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-io-2.1.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-httpclient-3.0.1.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-daemon-1.0.5.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-configuration-1.6.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-collections-3.2.1.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/commons-cli-1.2.jar,/Users/mparsian/zmp/github/data-algorithms-book/lib/cloud9-1.3.2.jar",

            // argument 1 to your Spark program (SparkFriendRecommendation)
            "--arg", "3",

            // argument 2 to your Spark program (SparkFriendRecommendation)
            "--arg", "/friends/input",

            // argument 3 to your Spark program (SparkFriendRecommendation)
            "--arg", "/friends/output",

            // argument 4 to your Spark program (SparkFriendRecommendation):
            // this is a helper argument used to create a proper JavaSparkContext object;
            // make sure that you create the following in the SparkFriendRecommendation program:
            //   ctx = new JavaSparkContext("yarn-cluster", "SparkFriendRecommendation");
            "--arg", "yarn-cluster"
        };

        // create a Hadoop Configuration object
        Configuration config = new Configuration();

        // indicate that you will be running Spark in YARN mode
        System.setProperty("SPARK_YARN_MODE", "true");

        // create an instance of SparkConf
        SparkConf sparkConf = new SparkConf();

        // create ClientArguments, which will be passed to Client
        ClientArguments cArgs = new ClientArguments(args, sparkConf);

        // create an instance of the YARN Client
        Client client = new Client(cArgs, config, sparkConf);

        // submit the Spark job to YARN
        client.run();
    }
}
Running Complete Java Client Program
Now we can compile and run the program to submit the Spark job from Java code (note that both steps require the Spark and Hadoop JARs on the classpath):

$ javac SubmitSparkJobToYARNFromJavaCode.java
$ java SubmitSparkJobToYARNFromJavaCode
Note: before running your Java code, make sure that the output directory does not exist:
hadoop fs -rm -r /friends/output