Spark Cluster Installation

Environment:
192.168.17.151
192.168.17.152
192.168.17.153
192.168.17.154
192.168.17.155




1. First, set up Scala


Download the Scala binary package from the official site at http://www.scala-lang.org/downloads. Here we use version 2.11.7:
tar zxvf scala-2.11.7.tgz -C /data/
cd /data/
 ll
mv scala-2.11.7 scala
vi /etc/profile   # add the following line
export PATH=$PATH:/data/scala/bin
source /etc/profile
[root@cheng5 bin]# pwd
/data/scala/bin
[root@cheng5 bin]# scala
Save and exit with :wq!, then reopen the terminal (or run source /etc/profile) and execute the scala command. If it prints the following, the installation succeeded:
$ scala
Welcome to Scala version 2.11.7 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_31).
Type in expressions to have them evaluated.
Type :help for more information.
Note: Chinese characters may come out garbled when compiling; see the separate article on fixing Scala Chinese-character encoding issues for a workaround.


Install Scala on all 5 machines in the same way; one way to push it out is sketched below.
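A minimal sketch, assuming passwordless SSH as root and that the remaining nodes are named cheng2 through cheng5 (hypothetical names following the cheng1/cheng5 prompts seen above; substitute your real hostnames):

for host in cheng2 cheng3 cheng4 cheng5; do
  scp -r /data/scala root@$host:/data/                                        # copy the Scala install
  ssh root@$host "echo 'export PATH=\$PATH:/data/scala/bin' >> /etc/profile"  # add it to PATH on the remote host
done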
2. Install Spark
Install it on the master node first. Download the prebuilt package from:
http://spark.apache.org/downloads.html
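If you prefer the command line, the 2.1.0 build for Hadoop 2.7 should also be available from the Apache archive; the URL below is offered as a convenience and is worth verifying against the download page:

wget https://archive.apache.org/dist/spark/spark-2.1.0/spark-2.1.0-bin-hadoop2.7.tgz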


tar zxvf spark-2.1.0-bin-hadoop2.7.tgz -C /data/hadoop-2.8.0/
[root@cheng1 data]# cd hadoop-2.8.0/spark-2.1.0-bin-hadoop2.7/
[root@cheng1 spark-2.1.0-bin-hadoop2.7]# cd ..
[root@cheng1 hadoop-2.8.0]# mv spark-2.1.0-bin-hadoop2.7 spark
[root@cheng1 conf]# pwd
/data/hadoop-2.8.0/spark/conf




vi /etc/profile   # add the following two lines
export SPARK_HOME=/data/hadoop-2.8.0/spark
export PATH=$SPARK_HOME/bin:$PATH
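After editing /etc/profile, reload it so the new variables take effect in the current shell, just as in the Scala step:

source /etc/profile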


Configure spark-env.sh


On the master node, go to the conf directory under the Spark installation directory and run the following:


cp spark-env.sh.template spark-env.sh
Add the Hadoop, Scala, and Java environment variables:
vi spark-env.sh




export HADOOP_HOME=/data/hadoop-2.8.0                 # Hadoop installation directory
export SCALA_HOME=/data/scala                         # Scala installation directory
export SPARK_MASTER_IP=192.168.17.151                 # IP of the Spark master
export SPARK_WORKER_MEMORY=2g                         # maximum memory a worker may use
#export SPARK_EXECUTOR_MEMORY=4g
#export SPARK_DRIVER_MEMORY=4G
#export SPARK_WORKER_CORES=8
export HADOOP_CONF_DIR=/data/hadoop-2.8.0/etc/hadoop
export SPARK_HOME=/data/hadoop-2.8.0/spark








3. Edit the slaves file; you can run the following:
[root@cheng1 conf]# pwd
/data/hadoop-2.8.0/spark/conf


cp slaves.template  slaves
 vi  slaves


192.168.17.151
192.168.17.152
192.168.17.153
192.168.17.154
192.168.17.155
List every host in the cluster here (note: each host's hostname-to-IP mapping must also be present in /etc/hosts; a sketch follows).
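For reference, a minimal /etc/hosts sketch, assuming the five machines are named cheng1 through cheng5 (only cheng1 and cheng5 appear in the prompts above; the other names are placeholders):

192.168.17.151 cheng1
192.168.17.152 cheng2
192.168.17.153 cheng3
192.168.17.154 cheng4
192.168.17.155 cheng5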
At this point, Spark on the master node is fully configured.


Copy the Spark installation and its configuration from the master to every worker node in the cluster. Note that the Spark directory must sit at the same path on all machines, because the master logs in to each worker to run commands and assumes the worker's Spark path matches its own. One way to copy it is sketched below.
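For example, again assuming passwordless SSH and the hypothetical hostnames cheng2 through cheng5, the whole Spark directory can be copied to the same path on each worker:

for host in cheng2 cheng3 cheng4 cheng5; do
  scp -r /data/hadoop-2.8.0/spark root@$host:/data/hadoop-2.8.0/   # same path as on the master
done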
[root@cheng1 sbin]# pwd
/data/hadoop-2.8.0/spark/sbin
[root@cheng1 sbin]# ./start-all.sh 
Be sure to run it as ./start-all.sh so that the copy in the current directory is used: Hadoop also ships a start-all.sh (in its sbin directory), and if you omit the ./ while Hadoop's scripts are on the PATH, Hadoop's start-all.sh will run instead.
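To sidestep the name clash altogether, the script can also be invoked by its full path (SPARK_HOME was exported in /etc/profile earlier):

$SPARK_HOME/sbin/start-all.sh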
Running jps afterwards shows two new processes on the master node: Master and Worker.


Testing
Open a browser and visit the master node's IP on the default port 8080. A page like the following indicates the installation succeeded.


http://192.168.17.151:8080/


Spark Master at spark://cheng1:7077 (version 2.1.0)
URL: spark://cheng1:7077
REST URL: spark://cheng1:6066 (cluster mode)
Alive Workers: 5
Cores in use: 5 Total, 0 Used
Memory in use: 10.0 GB Total, 0.0 B Used
Applications: 0 Running, 0 Completed
Drivers: 0 Running, 0 Completed
Status: ALIVE

Workers:
Worker Id                                    Address               State  Cores       Memory
worker-20170629162411-192.168.17.152-34976   192.168.17.152:34976  ALIVE  1 (0 Used)  2.0 GB (0.0 B Used)
worker-20170629162411-192.168.17.153-44525   192.168.17.153:44525  ALIVE  1 (0 Used)  2.0 GB (0.0 B Used)
worker-20170629162414-192.168.17.151-55731   192.168.17.151:55731  ALIVE  1 (0 Used)  2.0 GB (0.0 B Used)
worker-20170629162414-192.168.17.155-35865   192.168.17.155:35865  ALIVE  1 (0 Used)  2.0 GB (0.0 B Used)
worker-20170629162418-192.168.17.154-51786   192.168.17.154:51786  ALIVE  1 (0 Used)  2.0 GB (0.0 B Used)

Running Applications: none
Completed Applications: none


Here we can see all 5 worker nodes and their details.
Next, go into Spark's bin directory and start the spark-shell console.


[root@cheng1 bin]# spark-shell
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
17/06/29 16:30:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/06/29 16:30:33 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
17/06/29 16:30:33 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
17/06/29 16:30:36 WARN metastore.ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
Spark context Web UI available at http://192.168.17.151:4040
Spark context available as 'sc' (master = local[*], app id = local-1498725020847).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.1.0
      /_/
         
Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_131)
Type in expressions to have them evaluated.
Type :help for more information.


scala>
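Note that the banner above reports master = local[*], i.e. this shell is running in local mode rather than against the cluster. A minimal sketch of attaching to the standalone master and running a quick sanity check, assuming the master URL spark://cheng1:7077 shown on the web UI:

[root@cheng1 bin]# spark-shell --master spark://cheng1:7077

scala> sc.parallelize(1 to 1000).sum()
res0: Double = 500500.0

If the job runs, the application should also appear under Running Applications on the master web UI at port 8080.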