how-to-configure-and-use-spark-history-server
Source: Internet  Editor: 程序博客网  Time: 2024/05/18 01:19
References
spark configuration
spark monitoring Viewing After the Fact
Basics
how to configure spark?
locations to configure spark
- spark properties using val sparkConf=new SparkConf().set… in spark application code
- Dynamically Loading Spark Properties using spark-submit options
./bin/spark-submit --name "My app" --master local[4] --conf spark.shuffle.spill=false --conf "spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps" myApp.jar
- Viewing Spark Properties using spark app WebUI
http://<driver>:4040
in the “Environment” tab
- environment variable using conf/spark-env.sh
- logging using conf/log4j.properties
Available Properties about history server
* Environment Variables
(omitted)
enable spark eventlog
method 1:
bin/spark-submit --conf spark.eventLog.enabled=true --conf spark.eventLog.dir=file:/data01/data_tmp/spark-events ...
method 2:
vi conf/spark-env.sh
SPARK_DAEMON_JAVA_OPTS=
SPARK_MASTER_OPTS=
SPARK_WORKER_OPTS=
SPARK_JAVA_OPTS="-Dspark.eventLog.enabled=true -Dspark.eventLog.dir=file:/data01/data_tmp/spark-events"
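NOTE:
a spark.eventLog.dir=file:/data01/data_tmp/spark-events
- The eventlog is written by the driver. This is not a big problem for local/standalone/yarn-client, but spark-cluster needs special care: in cluster mode the driver runs on a cluster node, so a local file: path ends up on whichever node hosts the driver. It is best to set an HDFS path there.
- The directory given by spark.eventLog.dir is not created automatically; it must be created manually, with the appropriate permissions.
b these environment variables will be used in bin/spark-class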
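Since the event log directory has to exist and be writable before the driver starts, a small pre-flight check can catch the problem early. The following is only a sketch for a local file system path; the function name ensure_event_log_dir is made up for illustration and is not part of Spark.

```shell
# Hypothetical helper (not part of Spark): prepare a local event log directory.
ensure_event_log_dir() {
    dir="$1"
    # Spark does not create spark.eventLog.dir itself, so create it (and parents) here.
    mkdir -p "$dir" || return 1
    # The user that runs the driver must be able to write into the directory.
    if [ -w "$dir" ]; then
        echo "ok: $dir is writable"
    else
        echo "error: $dir is not writable by $(id -un)" >&2
        return 1
    fi
}
```

It could be called as `ensure_event_log_dir /data01/data_tmp/spark-events` before running spark-submit. For an HDFS location the directory would have to be created with the HDFS tools instead.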
test spark app after enable eventlog
test case:
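import org.apache.spark.{SparkConf, SparkContext}

object WordCount extends App {
  val sparkConf = new SparkConf().setAppName("WordCount")
  val sc = new SparkContext(sparkConf)
  // read the Spark README from the local file system
  val lines = sc.textFile("file:/data01/data/datadir_github/spark/README.md")
  // split every line on whitespace
  val words = lines.flatMap(_.split("\\s+"))
  // count the occurrences of each word
  val wordsCount = words.map(word => (word, 1)).reduceByKey(_ + _)
  wordsCount.foreach(println)
  sc.stop()
}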
Test problem 1: running Spark in local mode from IDEA generated no eventlog; the bundled examples.SparkPi behaved the same way
Approach 1: test from the CLI
bin/run-example SparkPi
Result: error
Exception in thread "main" java.lang.IllegalArgumentException: Log directory file:/data01/data_tmp/spark-events does not exist.
Approach 2:
mkdir -p /data01/data_tmp/spark-events
bin/run-example SparkPi
Result: the test confirmed that an eventlog was generated under /data01/data_tmp/spark-events
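The check "was an eventlog generated?" can also be scripted. This is a rough sketch for a local directory; count_event_logs is a hypothetical name, and the .inprogress suffix is assumed to mark logs of applications that are still running.

```shell
# Hypothetical helper (not part of Spark): summarize the entries in an event log directory.
count_event_logs() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir" >&2
        return 1
    fi
    # One entry per application run.
    total=$(ls -1 "$dir" | wc -l | tr -d ' ')
    # Assumption: logs that are still being written end in .inprogress.
    running=$(ls -1 "$dir" | grep -c '\.inprogress$' || true)
    echo "total=$total inprogress=$running"
}
```

Running `count_event_logs /data01/data_tmp/spark-events` before and after bin/run-example SparkPi should show the total increase by one if event logging is active.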
Testing again in IDEA with Spark running in local mode still generated no eventlog.
Cause: although the SPARK_CONF_DIR path was added to the dependencies, an app run inside IDEA does not read and parse the configuration in conf/spark-env.sh the way an app submitted with bin/spark-submit from the CLI does.
Approach 3:
set VM options in the IDEA run configuration
-Dspark.master="local[2]" -Dspark.eventLog.enabled=true -Dspark.eventLog.dir=file:/data01/data_tmp/spark-events
Result: local mode works normally
Problem 2: testing spark-on-yarn from IDEA reports an error (not yet resolved)
set VM options in the IDEA run configuration
-Dspark.master="yarn-client" -Dspark.eventLog.enabled=true -Dspark.eventLog.dir=file:/data01/data_tmp/spark-events
Result: error
configure, start and use spark history server
configure
vi conf/spark-env.sh
SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=file:/data01/data_tmp/spark-events" # set when you use spark-history-server
NOTE
- spark.history.fs.logDirectory and spark.eventLog.dir can be different, which means eventlog files can be moved elsewhere to help with diagnosis
- spark.history.ui.port
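To show how the two properties in this note fit together, here is a sketch of a conf/spark-env.sh fragment that sets both. The variable names HISTORY_LOG_DIR and HISTORY_UI_PORT are only illustrative, and the values are examples rather than recommendations.

```shell
# Sketch of a conf/spark-env.sh fragment; helper variable names are illustrative only.
HISTORY_LOG_DIR="file:/data01/data_tmp/spark-events"   # directory the history server reads event logs from
HISTORY_UI_PORT=18080                                   # port of the history server WebUI (18080 is the default)

SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=${HISTORY_LOG_DIR} -Dspark.history.ui.port=${HISTORY_UI_PORT}"
export SPARK_HISTORY_OPTS

# Print the resulting options so they can be checked before starting the server.
echo "$SPARK_HISTORY_OPTS"
```

Keeping the directory and port in separate variables makes it easier to point the history server at a copied set of event logs without touching spark.eventLog.dir.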
start
sbin/start-history-server.sh
access spark history WebUI at http://<server-url>:18080
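Besides opening the page in a browser, reachability of the WebUI can be probed from the command line. The helpers below are a sketch under assumptions: the function names are invented, the host is whatever machine runs the history server, and curl is assumed to be installed.

```shell
# Hypothetical helpers (not part of Spark): build the WebUI URL and probe it.
history_server_url() {
    host="$1"
    port="${2:-18080}"   # 18080 is the default history server port
    echo "http://${host}:${port}"
}

check_history_server() {
    url="$(history_server_url "$@")"
    # Print only the HTTP status code; curl reports 000 when no connection could be made.
    code="$(curl -s -o /dev/null -w '%{http_code}' "$url")"
    echo "$url -> HTTP $code"
}
```

For example, `check_history_server localhost` probes the default port, and `check_history_server localhost 18081` probes a port changed through spark.history.ui.port.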
spark history server WebUI applications
spark history server WebUI specific app