spark 2.1 on yarn -- container shell analysis
Source: Internet · Editor: 程序博客网 · 2024/05/16 16:04
I set the following content in spark-defaults.conf:
spark.serializer         org.apache.spark.serializer.KryoSerializer
spark.master             yarn
spark.executor.instances 2
spark.executor.cores     1
spark.executor.memory    512m
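For reference, spark-defaults.conf entries are whitespace-separated key/value pairs, one per line. A minimal sketch of that parsing (a hypothetical helper, not Spark's own loader) shows how the lines above become configuration entries:

```python
def parse_spark_defaults(text):
    """Parse spark-defaults.conf style text: one 'key value' pair per line.

    Blank lines and '#' comments are skipped; the key is the first
    whitespace-separated token and the value is the rest of the line.
    """
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, value = line.split(None, 1)
        conf[key] = value.strip()
    return conf

defaults = """
spark.serializer         org.apache.spark.serializer.KryoSerializer
spark.master             yarn
spark.executor.instances 2
spark.executor.cores     1
spark.executor.memory    512m
"""
conf = parse_spark_defaults(defaults)
```

With these values, `conf["spark.executor.instances"]` is the string "2", which is why spark-shell below starts exactly two executor containers.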
When spark-shell is executed, it creates two executors:
$ jps
32412 CoarseGrainedExecutorBackend
32444 CoarseGrainedExecutorBackend
Look at the command line of one executor:
$ ps aux | grep 32412
houzhiz+   374  0.0  0.0  112668    976 pts/1 R+ 14:08 0:00 grep --color=auto 32412
houzhiz+ 32412 15.1  4.3 2371448 342156 ?     Sl 14:03 0:46 /usr/local/java/bin/java -server -Xmx512m -Djava.io.tmpdir=/data/hadoop/data11/tmp/nm-local-dir/usercache/houzhizhen/appcache/application_1495532285542_0005/container_1495532285542_0005_01_000002/tmp -Dspark.driver.port=35736 -Dspark.yarn.app.container.log.dir=/home/houzhizhen/usr/local/hadoop/hadoop-2.7.2/logs/userlogs/application_1495532285542_0005/container_1495532285542_0005_01_000002 -XX:OnOutOfMemoryError=kill %p org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@192.168.122.1:35736 --executor-id 1 --hostname localhost --cores 1 --app-id application_1495532285542_0005 --user-class-path file:/data/hadoop/data11/tmp/nm-local-dir/usercache/houzhizhen/appcache/application_1495532285542_0005/container_1495532285542_0005_01_000002/__app__.jar
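Most of that command line is JVM boilerplate; the interesting part is the argument list handed to CoarseGrainedExecutorBackend. A small sketch (a hypothetical helper; the command string below is trimmed from the ps output above) pulls out the --flag value pairs:

```python
# Trimmed from the ps output shown above.
cmd = ("java -server -Xmx512m "
      "org.apache.spark.executor.CoarseGrainedExecutorBackend "
      "--driver-url spark://CoarseGrainedScheduler@192.168.122.1:35736 "
      "--executor-id 1 --hostname localhost --cores 1 "
      "--app-id application_1495532285542_0005")

def backend_args(command_line):
    """Collect the '--flag value' pairs that follow the backend class name."""
    tokens = command_line.split()
    start = tokens.index(
        "org.apache.spark.executor.CoarseGrainedExecutorBackend") + 1
    args = {}
    it = iter(tokens[start:])
    for flag in it:
        if flag.startswith("--"):
            args[flag[2:]] = next(it)  # value always follows its flag here
    return args

args = backend_args(cmd)
```

This makes it easy to see that the driver's scheduler endpoint, the per-executor id, and the core count are all passed on the command line rather than read from the configuration file.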
Look at the container directory:
$ cd /data/hadoop/data11/tmp/nm-local-dir/usercache/houzhizhen/appcache/application_1495532285542_0005/container_1495532285542_0005_01_000002
[houzhizhen@localhost container_1495532285542_0005_01_000002]$ ll
total 20
-rw-rw-r--. 1 houzhizhen houzhizhen   86 May 24 14:03 container_tokens
-rwx------. 1 houzhizhen houzhizhen  703 May 24 14:03 default_container_executor_session.sh
-rwx------. 1 houzhizhen houzhizhen  757 May 24 14:03 default_container_executor.sh
-rwx------. 1 houzhizhen houzhizhen 3590 May 24 14:03 launch_container.sh
lrwxrwxrwx. 1 houzhizhen houzhizhen   89 May 24 14:03 __spark_conf__ -> /data/hadoop/data11/tmp/nm-local-dir/usercache/houzhizhen/filecache/17/__spark_conf__.zip
lrwxrwxrwx. 1 houzhizhen houzhizhen  108 May 24 14:03 __spark_libs__ -> /data/hadoop/data11/tmp/nm-local-dir/usercache/houzhizhen/filecache/16/__spark_libs__7172508084572895679.zip
drwx--x---. 2 houzhizhen houzhizhen    6 May 24 14:03 tmp
Open the Spark configuration and you can see spark.executor.id=driver. Since __spark_conf__ is a symlink to /data/hadoop/data11/tmp/nm-local-dir/usercache/houzhizhen/filecache/17/__spark_conf__.zip in the NodeManager's filecache, we can safely conclude that the configuration file is shared across executors of the same Spark application on the node.
$ cat __spark_conf__/__spark_conf__.properties
#Spark configuration.
#Wed May 24 14:03:27 CST 2017
spark.yarn.cache.visibilities=PRIVATE
spark.yarn.cache.timestamps=1495605805866
spark.executor.memory=512m
spark.executor.id=driver
spark.driver.host=192.168.122.1
spark.yarn.cache.confArchive=hdfs\://localhost\:8020/user/houzhizhen/.sparkStaging/application_1495532285542_0005/__spark_conf__.zip
spark.files.ignoreCorruptFiles=true
spark.yarn.cache.sizes=200756074
spark.jars=
spark.sql.catalogImplementation=hive
spark.home=/usr/local/spark
spark.submit.deployMode=client
spark.executor.heartbeatInterval=2
spark.master=yarn
spark.yarn.cache.filenames=hdfs\://localhost\:8020/user/houzhizhen/.sparkStaging/application_1495532285542_0005/__spark_libs__7172508084572895679.zip\#__spark_libs__
spark.executor.cores=1
spark.yarn.cache.types=ARCHIVE
spark.driver.appUIAddress=http\://192.168.122.1\:4040
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.repl.class.outputDir=/tmp/spark-caaf86f0-267d-4b39-9bfe-833d97db838e/repl-e03f92dd-176d-42b5-9ebd-a1e3d66c7e1c
spark.executor.instances=2
spark.app.name=Spark shell
spark.repl.class.uri=spark\://192.168.122.1\:35736/classes
spark.driver.port=35736
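The file is in java.util.Properties format, which is why ':' and '#' inside values appear backslash-escaped. A simplified unescaping sketch (a hypothetical helper that handles only backslash-before-character escapes, not the \uXXXX sequences the real Properties loader also supports) recovers the plain URL:

```python
def unescape_properties_value(raw):
    """Undo simple java.util.Properties backslash escaping, e.g. '\\:' -> ':'."""
    out, i = [], 0
    while i < len(raw):
        if raw[i] == "\\" and i + 1 < len(raw):
            out.append(raw[i + 1])  # keep the escaped character literally
            i += 2
        else:
            out.append(raw[i])
            i += 1
    return "".join(out)

# Value of spark.yarn.cache.confArchive exactly as it appears in the file.
raw = ("hdfs\\://localhost\\:8020/user/houzhizhen/.sparkStaging/"
       "application_1495532285542_0005/__spark_conf__.zip")
url = unescape_properties_value(raw)
```

After unescaping, the value is the plain HDFS URL of the staged configuration archive, matching the SPARK_YARN_STAGING_DIR seen in launch_container.sh below.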
Open launch_container.sh and you can see that $PWD/__spark_conf__:$PWD/__spark_libs__/* is included in the CLASSPATH. From the last command in the script you can also see that the executor id is overridden with --executor-id 1.
launch_container.sh
$ cat launch_container.sh
#!/bin/bash
export SPARK_YARN_STAGING_DIR="hdfs://localhost:8020/user/houzhizhen/.sparkStaging/application_1495532285542_0005"
export HADOOP_CONF_DIR="/usr/local/hadoop/etc/hadoop"
export JAVA_HOME="/usr/local/java"
export SPARK_LOG_URL_STDOUT="http://localhost:8042/node/containerlogs/container_1495532285542_0005_01_000002/houzhizhen/stdout?start=-4096"
export NM_HOST="localhost"
export SPARK_HOME="/usr/local/spark"
export HADOOP_HDFS_HOME="/home/houzhizhen/usr/local/hadoop/hadoop-2.7.2"
export LOGNAME="houzhizhen"
export JVM_PID="$$"
export PWD="/data/hadoop/data11/tmp/nm-local-dir/usercache/houzhizhen/appcache/application_1495532285542_0005/container_1495532285542_0005_01_000002"
export HADOOP_COMMON_HOME="/home/houzhizhen/usr/local/hadoop/hadoop-2.7.2"
export LOCAL_DIRS="/data/hadoop/data11/tmp/nm-local-dir/usercache/houzhizhen/appcache/application_1495532285542_0005"
export NM_HTTP_PORT="8042"
export LOG_DIRS="/home/houzhizhen/usr/local/hadoop/hadoop-2.7.2/logs/userlogs/application_1495532285542_0005/container_1495532285542_0005_01_000002"
export NM_AUX_SERVICE_mapreduce_shuffle="AAA0+gAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA="
export NM_PORT="33996"
export USER="houzhizhen"
export HADOOP_YARN_HOME="/home/houzhizhen/usr/local/hadoop/hadoop-2.7.2"
export CLASSPATH="$PWD:$PWD/__spark_conf__:$PWD/__spark_libs__/*:$HADOOP_CONF_DIR:$HADOOP_COMMON_HOME/share/hadoop/common/*:$HADOOP_COMMON_HOME/share/hadoop/common/lib/*:$HADOOP_HDFS_HOME/share/hadoop/hdfs/*:$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*:$HADOOP_YARN_HOME/share/hadoop/yarn/*:$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*"
export SPARK_YARN_MODE="true"
export HADOOP_TOKEN_FILE_LOCATION="/data/hadoop/data11/tmp/nm-local-dir/usercache/houzhizhen/appcache/application_1495532285542_0005/container_1495532285542_0005_01_000002/container_tokens"
export SPARK_USER="houzhizhen"
export SPARK_LOG_URL_STDERR="http://localhost:8042/node/containerlogs/container_1495532285542_0005_01_000002/houzhizhen/stderr?start=-4096"
export HOME="/home/"
export CONTAINER_ID="container_1495532285542_0005_01_000002"
export MALLOC_ARENA_MAX="4"
ln -sf "/data/hadoop/data11/tmp/nm-local-dir/usercache/houzhizhen/filecache/17/__spark_conf__.zip" "__spark_conf__"
hadoop_shell_errorcode=$?
if [ $hadoop_shell_errorcode -ne 0 ]
then
  exit $hadoop_shell_errorcode
fi
ln -sf "/data/hadoop/data11/tmp/nm-local-dir/usercache/houzhizhen/filecache/16/__spark_libs__7172508084572895679.zip" "__spark_libs__"
hadoop_shell_errorcode=$?
if [ $hadoop_shell_errorcode -ne 0 ]
then
  exit $hadoop_shell_errorcode
fi
exec /bin/bash -c "$JAVA_HOME/bin/java -server -Xmx512m -Djava.io.tmpdir=$PWD/tmp '-Dspark.driver.port=35736' -Dspark.yarn.app.container.log.dir=/home/houzhizhen/usr/local/hadoop/hadoop-2.7.2/logs/userlogs/application_1495532285542_0005/container_1495532285542_0005_01_000002 -XX:OnOutOfMemoryError='kill %p' org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@192.168.122.1:35736 --executor-id 1 --hostname localhost --cores 1 --app-id application_1495532285542_0005 --user-class-path file:$PWD/__app__.jar 1>/home/houzhizhen/usr/local/hadoop/hadoop-2.7.2/logs/userlogs/application_1495532285542_0005/container_1495532285542_0005_01_000002/stdout 2>/home/houzhizhen/usr/local/hadoop/hadoop-2.7.2/logs/userlogs/application_1495532285542_0005/container_1495532285542_0005_01_000002/stderr"
hadoop_shell_errorcode=$?
if [ $hadoop_shell_errorcode -ne 0 ]
then
  exit $hadoop_shell_errorcode
fi
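The script's structure is simple: export the container environment, symlink the localized resources from the NodeManager filecache into the container working directory, then exec the executor JVM, exiting on any failure. That flow can be sketched in Python (illustrative only; launch_container is a hypothetical name, subprocess.call stands in for exec, and a failed symlink raises instead of checking $?):

```python
import os
import subprocess

def launch_container(workdir, localized, cmd):
    """Mimic the launch_container.sh flow: link filecache entries into the
    container working directory, then run the container command there.

    `localized` maps link names (e.g. '__spark_conf__') to the NodeManager
    filecache paths they should point at.
    """
    for link_name, target in localized.items():
        link = os.path.join(workdir, link_name)
        if os.path.lexists(link):
            os.remove(link)          # ln -sf semantics: replace existing link
        os.symlink(target, link)
    # the real script uses `exec` and redirects stdout/stderr to the
    # container log dir; subprocess.call keeps this sketch testable
    return subprocess.call(cmd, cwd=workdir)
```

In the real script each step is followed by the hadoop_shell_errorcode check shown above, so a failed localization aborts the container before the JVM ever starts.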