Installing a Spark Standalone Cluster
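This article does not mix Spark with YARN. The goal is a pure Spark environment; stacking too many layers on top of each other makes the setup unreliable.
Create the account that runs the Spark service
# useradd smile
The smile account is the account the Spark service runs under.
Download the package and test it
As root, download the latest package. Note that this is the prebuilt bin package, not the source package, built for Hadoop 2.6 and later.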
wget http://mirrors.cnnic.cn/apache/spark/spark-1.5.1/spark-1.5.1-bin-hadoop2.6.tgz
Extract it in the installation directory (/data/slot0 in this walkthrough), set the owner and group to the smile account, and then create a symbolic link named spark.
tar zxvf spark-1.5.1-bin-hadoop2.6.tgz
chown -R smile:smile spark-1.5.1-bin-hadoop2.6
ln -s spark-1.5.1-bin-hadoop2.6 spark
chown -R smile:smile spark
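The extract-and-link steps above can be wrapped in a small function so that they are repeatable on every node. This is only a sketch: the function name install_spark and its arguments are made up for illustration, and it assumes the top-level directory inside the tarball has the same name as the tarball without the .tgz suffix.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Spark): unpack a Spark binary tarball and
# point a stable "spark" symlink at the unpacked directory.
# Usage: install_spark TARBALL TARGET_DIR [OWNER]
install_spark() {
  local tarball=$1 target=$2 owner=${3:-}
  local name
  # Assumption: the archive unpacks into a directory named after the tarball
  name=$(basename "$tarball" .tgz)
  mkdir -p "$target" || return 1
  tar -xzf "$tarball" -C "$target" || return 1
  # -n replaces an existing "spark" link instead of descending into its target
  ln -sfn "$name" "$target/spark" || return 1
  if [ -n "$owner" ]; then
    chown -R "$owner:$owner" "$target/$name" || return 1
    # -h changes the symlink itself rather than the directory it points to
    chown -h "$owner:$owner" "$target/spark" || return 1
  fi
}
```

For the layout in this article the call would be install_spark spark-1.5.1-bin-hadoop2.6.tgz /data/slot0 smile, run as root. Upgrading later only means unpacking a new version and moving the link.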
Enter the directory and start the master
cd /data/slot0/spark/
./sbin/start-master.sh
starting org.apache.spark.deploy.master.Master, logging to /data/slot0/spark-1.5.1-bin-hadoop2.6/sbin/../logs/spark-smile-org.apache.spark.deploy.master.Master-1-10-149-11-157.out
The master started successfully. Check the web UI at http://your-host:8080
The test succeeded. Stopping the master is just as simple:
$ sbin/stop-master.sh
stopping org.apache.spark.deploy.master.Master
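If the start step is scripted, it helps to wait until the web UI port actually accepts connections before moving on. The sketch below is one way to do that with bash alone; wait_for_port is a hypothetical helper name, and it relies on bash's /dev/tcp redirection, so it will not work in plain sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll a TCP port until it accepts a connection.
# Usage: wait_for_port HOST PORT [TRIES]   (one attempt per second, default 30)
wait_for_port() {
  local host=$1 port=$2 tries=${3:-30} i=0
  while [ "$i" -lt "$tries" ]; do
    # Opening /dev/tcp/HOST/PORT makes bash attempt a TCP connect; the
    # descriptor is closed again as soon as the subshell exits.
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

After ./sbin/start-master.sh, something like wait_for_port your-host 8080 returns 0 once the master UI is reachable and 1 if it never comes up within the given number of tries.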
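Building a highly available cluster with ZooKeeper
Use three nodes as masters
The plan is to build the master cluster from three servers and let ZooKeeper handle leader election, so that there is always exactly one leader master while the other two stay on standby.
On the first server, go to the spark/conf directory and copy spark-env.sh.template to spark-env.sh.
Then add the following settings:
SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=10.149.11.146:2181,10.149.11.147:2181,10.149.11.148:2181 -Dspark.deploy.zookeeper.dir=/vehicle_spark"
export SPARK_DAEMON_JAVA_OPTS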
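Since every master needs the same two lines, they can be generated from a list of ZooKeeper hosts instead of being typed by hand. The following is a minimal sketch; zk_daemon_opts is a hypothetical function name, and it assumes every ZooKeeper node listens on port 2181, as in the settings above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the spark-env.sh lines for ZooKeeper recovery mode.
# Usage: zk_daemon_opts ZK_DIR ZK_HOST...
zk_daemon_opts() {
  local zk_dir=$1 url="" host
  shift
  for host in "$@"; do
    # Join as host:2181,host:2181,... (assumes the default client port 2181)
    url="${url:+$url,}$host:2181"
  done
  printf 'SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=%s -Dspark.deploy.zookeeper.dir=%s"\n' "$url" "$zk_dir"
  printf 'export SPARK_DAEMON_JAVA_OPTS\n'
}
```

Running zk_daemon_opts /vehicle_spark 10.149.11.146 10.149.11.147 10.149.11.148 >> conf/spark-env.sh on each master appends the same settings as shown above.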
Start the service as a master:
./sbin/start-master.sh
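Apply the same spark-env.sh settings on the other two nodes and run start-master.sh on each of them in turn. At this point the master status page of all three nodes can be opened at http://ip:8080
Start the remaining nodes as slaves
On the other Spark servers, start a slave (worker):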
./sbin/start-slave.sh spark://host1:7077,host2:7077,host3:7077
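Notes:
1. host1, host2, and host3 must be the host names shown on the masters' 8080 pages. If IP addresses are used instead, the connection is refused.
2. Once a slave has started, its worker UI can be opened on port 8081, and it shows the current leader master.
Now port 8080 on all three masters shows the workers' status.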
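The master list is easy to mistype, and note 1 above is easy to forget. A small helper can build the URL and reject anything that looks like an IPv4 address. This is a sketch only: build_master_url is a hypothetical name, port 7077 is taken from the command above, and the IP check is a simple "digits and dots only" test, not a full address parser.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build spark://h1:7077,h2:7077,... from host names.
# Usage: build_master_url HOST...
build_master_url() {
  local url="" host
  [ "$#" -gt 0 ] || { echo "usage: build_master_url HOST..." >&2; return 1; }
  for host in "$@"; do
    case $host in
      *[!0-9.]*) ;;  # has a character other than digits and dots: a host name
      *) echo "refusing IP-like host: $host" >&2; return 1 ;;
    esac
    url="${url:+$url,}$host:7077"
  done
  printf 'spark://%s\n' "$url"
}
```

A worker would then be started with ./sbin/start-slave.sh "$(build_master_url host1 host2 host3)". Host names such as 10-149-11-157 pass the check because they contain hyphens.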
Test the connection to the masters with the shell
$ ./bin/spark-shell --master spark://10-149-11-*:7077,10-149-11-*:7077,10-149-11-*:7077
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.properties
To adjust logging level use sc.setLogLevel("INFO")
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.5.2
      /_/
Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_45)
Type in expressions to have them evaluated.
Type :help for more information.
15/11/16 13:22:37 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
Spark context available as sc.
15/11/16 13:22:39 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/11/16 13:22:39 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/11/16 13:23:15 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
15/11/16 13:23:15 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
15/11/16 13:23:21 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/11/16 13:23:21 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/11/16 13:23:22 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
SQL context available as sqlContext.
scala>
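Check the web UI and ZooKeeper; everything looks normal.
Set the master URL through an environment variable
spark-shell can read the Spark master from an environment variable, so for convenience there is no need to type the long argument every time.
Add the following to ~/.bashrc: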
export MASTER=spark://10-149-*-*:7077,10-149-*-*:7077,10-149-*-*:7077
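When this step is automated, appending blindly to ~/.bashrc leaves duplicate lines after a few runs. The sketch below keeps exactly one export MASTER line in a file; set_master_env is a hypothetical helper name, and the host names in the usage note are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper: make sure FILE contains exactly one "export MASTER=..." line.
# Usage: set_master_env FILE MASTER_URL
set_master_env() {
  local file=$1 url=$2 tmp
  touch "$file" || return 1
  tmp=$(mktemp) || return 1
  # Copy everything except a previous MASTER setting; grep exits non-zero
  # when nothing is left to copy, which is fine here.
  grep -v '^export MASTER=' "$file" > "$tmp" || true
  printf 'export MASTER=%s\n' "$url" >> "$tmp"
  # Write back through cat so the original file keeps its permissions
  cat "$tmp" > "$file" && rm -f "$tmp"
}
```

For example, set_master_env ~/.bashrc spark://host1:7077,host2:7077,host3:7077 can be run repeatedly without piling up entries; a new shell then picks up the MASTER variable.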