Learning Spark - SparkSQL - 02 - Spark History Server

Source: Internet | Editor: 程序博客网 | Date: 2024/06/05 07:12

Configuring and Using the Spark History Server
1. Background of the Spark History Server

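Take standalone mode as an example. While a Spark application is running, Spark provides a web UI that lists the application's runtime information. That web UI shuts down as soon as the application finishes, whether it succeeds or fails, so once the application is done its history can no longer be viewed.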

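The Spark History Server exists to handle this case. When event logging is configured, the application records its events to a log while it runs. After the application ends, the History Server can replay that log and re-render the web UI, showing the runtime information the application produced while it was executing.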

When Spark runs on YARN or Mesos, the History Server can still reconstruct the runtime information of a completed application, provided that the application's event log was recorded.
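The event logging described above is switched on through configuration. The following is a minimal sketch, assuming the settings live in `spark-defaults.conf` and that the log directory is the HDFS URI used later in this post. The fallback to a scratch directory when `SPARK_HOME` is unset is an assumption added only so the sketch can be tried safely.

```shell
#!/bin/sh
# Sketch: append event-log settings to spark-defaults.conf.
set -eu

# HDFS URI taken from this post; adjust it for your own cluster.
LOG_DIR="hdfs://mycluster:8020/spark_job_history"

# Assumption: use a scratch directory when SPARK_HOME is not set,
# so that trying the sketch never touches a real installation by accident.
CONF_DIR="${SPARK_HOME:-$(mktemp -d)}/conf"
CONF_FILE="$CONF_DIR/spark-defaults.conf"

mkdir -p "$CONF_DIR"

# spark.eventLog.*              -> where running applications write their events
# spark.history.fs.logDirectory -> where the History Server reads them back
cat >> "$CONF_FILE" <<EOF
spark.eventLog.enabled           true
spark.eventLog.dir               $LOG_DIR
spark.history.fs.logDirectory    $LOG_DIR
EOF

echo "wrote event-log settings to $CONF_FILE"
```

Applications submitted afterwards pick these values up from `spark-defaults.conf`. The same properties can also be passed per job, for example with `--conf spark.eventLog.enabled=true` on `spark-submit`.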

Configuring & Using the Spark History Server
Start the Spark History Server with its default configuration:

cd "$SPARK_HOME/sbin"
./start-history-server.sh

This fails with an error:

starting org.apache.spark.deploy.history.HistoryServer, logging to /home/spark/software/source/compile/deploy_spark/sbin/../logs/spark-spark-org.apache.spark.deploy.history.HistoryServer-1-hadoop000.out
failed to launch org.apache.spark.deploy.history.HistoryServer:
        at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:44)
        ... 6 more

Starting it again with an HDFS log directory as the argument, then checking the log file:

[root@biluos logs]# /opt/moudles/spark-2.2.0-bin-hadoop2.7/sbin/start-history-server.sh hdfs://mycluster:8020/spark_job_history
starting org.apache.spark.deploy.history.HistoryServer, logging to /opt/moudles/spark-2.2.0-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.history.HistoryServer-1-biluos.com.out
[root@biluos logs]# cat spark-root-org.apache.spark.deploy.history.HistoryServer-1-biluos.com.out
Spark Command: /opt/moudles/jdk1.8.0_121/bin/java -cp /opt/moudles/spark-2.2.0-bin-hadoop2.7/conf/:/opt/moudles/spark-2.2.0-bin-hadoop2.7/jars/*:/opt/moudles/hadoop-2.7.3/etc/hadoop/ -Xmx1g org.apache.spark.deploy.history.HistoryServer hdfs://mycluster:8020/spark_job_history
========================================
17/08/03 03:22:18 INFO HistoryServer: Started daemon with process name: 2666@biluos.com
17/08/03 03:22:18 INFO SignalUtils: Registered signal handler for TERM
17/08/03 03:22:18 INFO SignalUtils: Registered signal handler for HUP
17/08/03 03:22:18 INFO SignalUtils: Registered signal handler for INT
17/08/03 03:22:18 WARN HistoryServerArguments: Setting log directory through the command line is deprecated as of Spark 1.1.0. Please set this through spark.history.fs.logDirectory instead.
17/08/03 03:22:19 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/08/03 03:22:19 INFO SecurityManager: Changing view acls to: root
17/08/03 03:22:19 INFO SecurityManager: Changing modify acls to: root
17/08/03 03:22:19 INFO SecurityManager: Changing view acls groups to: 
17/08/03 03:22:19 INFO SecurityManager: Changing modify acls groups to: 
17/08/03 03:22:19 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
17/08/03 03:22:19 INFO FsHistoryProvider: History server ui acls disabled; users with admin permissions: ; groups with admin permissions
Exception in thread "main" java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.spark.deploy.history.HistoryServer$.main(HistoryServer.scala:278)
        at org.apache.spark.deploy.history.HistoryServer.main(HistoryServer.scala)
Caused by: java.io.FileNotFoundException: Log directory specified does not exist: hdfs://mycluster:8020/spark_job_history
        at org.apache.spark.deploy.history.FsHistoryProvider.org$apache$spark$deploy$history$FsHistoryProvider$$startPolling(FsHistoryProvider.scala:214)
        at org.apache.spark.deploy.history.FsHistoryProvider.initialize(FsHistoryProvider.scala:160)
        at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:156)
        at org.apache.spark.deploy.history.FsHistoryProvider.<init>(FsHistoryProvider.scala:78)
        ... 6 more
Caused by: java.io.FileNotFoundException: File does not exist: hdfs://mycluster:8020/spark_job_history
        at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309)
        at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317)
        at org.apache.spark.deploy.history.FsHistoryProvider.org$apache$spark$deploy$history$FsHistoryProvider$$startPolling(FsHistoryProvider.scala:204)
        ... 9 more

Solution: the log directory does not exist on HDFS, so create it.

[root@biluos logs]# hdfs dfs -mkdir /spark_job_history

After restarting, the History Server starts without errors.
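The fix above can be turned into a small pre-start check. This is only a sketch: the function name `ensure_log_dir` and the `HDFS_CMD` override are hypothetical helpers introduced here for illustration, while `hdfs dfs -test -d` and `hdfs dfs -mkdir -p` are standard Hadoop file system shell commands.

```shell
#!/bin/sh
# Sketch: make sure the event-log directory exists before starting the History Server.
set -eu

LOG_DIR="${LOG_DIR:-/spark_job_history}"   # HDFS path used in this post
HDFS_CMD="${HDFS_CMD:-hdfs}"                # hypothetical override hook, useful for dry runs

ensure_log_dir() {
    # `dfs -test -d` exits 0 only when the path is an existing directory.
    if "$HDFS_CMD" dfs -test -d "$1"; then
        echo "exists: $1"
    else
        # -p also creates missing parent directories and does not fail if the path exists.
        "$HDFS_CMD" dfs -mkdir -p "$1"
        echo "created: $1"
    fi
}

# Only touch HDFS when the client is actually installed on this machine.
if command -v "$HDFS_CMD" >/dev/null 2>&1; then
    ensure_log_dir "$LOG_DIR"
else
    echo "skipped: '$HDFS_CMD' not found on PATH"
fi
```

Running this before `start-history-server.sh` avoids the `FileNotFoundException` shown in the log, because the server refuses to start when its log directory is missing.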

The web UI then looks like this:
[Figure: Spark History Server web UI]
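To open the UI or check it from a terminal, the History Server listens on port 18080 unless `spark.history.ui.port` is changed. The sketch below assumes the server runs on the local machine; `HISTORY_HOST`, `HISTORY_PORT`, and the `history_url` function are hypothetical names used only for this example.

```shell
#!/bin/sh
# Sketch: build the History Server URL and probe it when curl is available.
set -eu

HISTORY_HOST="${HISTORY_HOST:-localhost}"   # assumption: server runs on this machine
HISTORY_PORT="${HISTORY_PORT:-18080}"       # default value of spark.history.ui.port

history_url() {
    # Print the base URL of the web UI for the given host and port.
    printf 'http://%s:%s\n' "$1" "$2"
}

URL="$(history_url "$HISTORY_HOST" "$HISTORY_PORT")"
echo "History Server UI: $URL"

# /api/v1/applications is the REST endpoint that lists the applications the server knows about.
if command -v curl >/dev/null 2>&1; then
    if curl -fsS --max-time 5 "$URL/api/v1/applications" >/dev/null 2>&1; then
        echo "reachable"
    else
        echo "not reachable (is the server running?)"
    fi
fi
```

Set `HISTORY_HOST` to the machine that runs the server (for example the host shown in the log above) when checking from another node.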
