Launching Spark on YARN
Source: Internet. Editor: 程序博客网. Time: 2024/05/12 06:36
A wordcount program for Spark on YARN
https://www.iteblog.com/archives/1028
Ensure that HADOOP_CONF_DIR or YARN_CONF_DIR points to the directory which contains the (client side) configuration files for the Hadoop cluster. These configs are used to write to HDFS and connect to the YARN ResourceManager. The configuration contained in this directory will be distributed to the YARN cluster so that all containers used by the application use the same configuration. If the configuration references Java system properties or environment variables not managed by YARN, they should also be set in the Spark application’s configuration (driver, executors, and the AM when running in client mode).
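Before submitting, it can help to confirm that one of these variables really points at a client configuration directory. The following is a minimal sketch, not part of Spark: `resolve_hadoop_conf_dir` is a hypothetical helper, and checking for `yarn-site.xml` is an assumption about a typical Hadoop client layout.

```shell
# Sketch: print the first of HADOOP_CONF_DIR / YARN_CONF_DIR that looks usable.
# Assumption: a usable directory contains yarn-site.xml (typical layout, not a Spark rule).
resolve_hadoop_conf_dir() {
  local dir
  for dir in "${HADOOP_CONF_DIR:-}" "${YARN_CONF_DIR:-}"; do
    if [ -n "$dir" ] && [ -f "$dir/yarn-site.xml" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  echo "Neither HADOOP_CONF_DIR nor YARN_CONF_DIR points to a directory with yarn-site.xml" >&2
  return 1
}
```

A launch script could call `resolve_hadoop_conf_dir >/dev/null || exit 1` before invoking spark-submit, so that a missing configuration fails early with a clear message.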
There are two deploy modes that can be used to launch Spark applications on YARN. In cluster mode, the Spark driver runs inside an application master process which is managed by YARN on the cluster, and the client can go away after initiating the application. In client mode, the driver runs in the client process, and the application master is only used for requesting resources from YARN.
Unlike Spark standalone and Mesos modes, in which the master’s address is specified in the --master parameter, in YARN mode the ResourceManager’s address is picked up from the Hadoop configuration. Thus, the --master parameter is yarn.
To launch a Spark application in cluster mode:
$ ./bin/spark-submit --class path.to.your.Class --master yarn --deploy-mode cluster [options] <app jar> [app options]
For example:
$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
    --master yarn \
    --deploy-mode cluster \
    --driver-memory 4g \
    --executor-memory 2g \
    --executor-cores 1 \
    --queue thequeue \
    lib/spark-examples*.jar \
    10
The above starts a YARN client program, which starts the default Application Master. SparkPi then runs as a child thread of the Application Master. The client periodically polls the Application Master for status updates and displays them in the console, and it exits once your application has finished running. Refer to the “Debugging your Application” section of the Spark documentation for how to see driver and executor logs.
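Because the client can exit while the application keeps running in cluster mode, it is often useful to capture the YARN application ID from the client's console output. The sketch below assumes that output contains an ID of the form application_<cluster timestamp>_<sequence number>; `extract_app_id` is a hypothetical helper name.

```shell
# Sketch: read text on stdin and print the first YARN application ID found.
# YARN IDs look like application_<cluster timestamp>_<sequence number>.
extract_app_id() {
  grep -oE 'application_[0-9]+_[0-9]+' | head -n 1
}
```

With the ID in hand, `yarn application -status <application ID>` reports the current state, and `yarn logs -applicationId <application ID>` fetches container logs after the application finishes, provided log aggregation is enabled on the cluster.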
To launch a Spark application in client mode, do the same, but replace cluster with client. The following shows how you can run spark-shell in client mode:
$ ./bin/spark-shell --master yarn --deploy-mode client
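The two launch commands differ only in the value passed to --deploy-mode. As a rough illustration, the hypothetical helper below prints (but does not run) the spark-submit command for a given mode and rejects anything other than cluster or client; the helper name and its dry-run behaviour are assumptions made for this sketch.

```shell
# Sketch: print the spark-submit command line for a given YARN deploy mode.
# yarn_submit_cmd is a hypothetical helper; it only validates the mode and echoes the command.
yarn_submit_cmd() {
  local mode=$1
  shift
  case "$mode" in
    cluster|client) ;;
    *) echo "deploy mode must be 'cluster' or 'client', got '$mode'" >&2; return 1 ;;
  esac
  echo "./bin/spark-submit --master yarn --deploy-mode $mode $*"
}
```

For example, `yarn_submit_cmd cluster --class path.to.your.Class app.jar` prints the cluster-mode command, and replacing `cluster` with `client` changes only the --deploy-mode value.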
Adding Other JARs
In cluster mode, the driver runs on a different machine than the client, so SparkContext.addJar won’t work out of the box with files that are local to the client. To make files on the client available to SparkContext.addJar, include them with the --jars option in the launch command.
$ ./bin/spark-submit --class my.main.Class \
    --master yarn \
    --deploy-mode cluster \
    --jars my-other-jar.jar,my-other-other-jar.jar \
    my-main-jar.jar \
    app_arg1 app_arg2
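The --jars option takes a single comma-separated value, as the command above shows. When the JARs are held as separate paths, a small helper can build that value. This is a sketch only; `join_jars` is a hypothetical name, and it assumes the paths contain no commas.

```shell
# Sketch: join separate JAR paths into the comma-separated form that --jars expects.
# Assumption: no path contains a comma.
join_jars() {
  local IFS=,
  printf '%s\n' "$*"
}
```

It could then be used as `--jars "$(join_jars my-other-jar.jar my-other-other-jar.jar)"` in the launch command.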