spark-1.5.2 Installation and Configuration Notes

This post supplements http://blog.csdn.net/nisjlvhudy/article/details/49338899.
1. Install the scala-2.11.6 environment
Follow the steps described in http://blog.csdn.net/nisjlvhudy/article/details/49338899.

2. Download and install Spark
wget http://d3kbcqa49mib13.cloudfront.net/spark-1.5.2-bin-hadoop2.6.tgz
tar -zxvf spark-1.5.2-bin-hadoop2.6.tgz
mv spark-1.5.2-bin-hadoop2.6 ~/opt/spark-1.5.2
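
To confirm the unpacked distribution runs at all (a quick sanity check, not part of the original write-up), ask it for its version:

~/opt/spark-1.5.2/bin/spark-submit --version     # should print the version 1.5.2 banner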

3. Spark configuration
3.1. Configure SPARK_HOME and related environment variables
Append the following to the login-shell profile (e.g. ~/.bash_profile):

PATH=$PATH:$HOME/.local/bin:$HOME/bin

export PATH

export HIVE_HOME=/home/hs/opt/hive-1.2.1
export PATH=$HIVE_HOME/bin:$PATH

export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.95-2.6.4.0.el7_2.x86_64
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/jre/lib:$JAVA_HOME/lib:$JAVA_HOME/lib/tools.jar
export SCALA_HOME=/home/hs/opt/scala-2.11.6
export HADOOP_HOME=/home/hs/opt/hadoop-2.7.2
export SPARK_HOME=/home/hs/opt/spark-1.5.2
PATH=$PATH:$HOME/bin:$JAVA_HOME/bin:${SCALA_HOME}/bin
export PATH=$PATH:$HADOOP_HOME/bin:$SPARK_HOME/bin
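
To apply the new variables to the current session and confirm they resolve (a quick check, assuming the lines above live in ~/.bash_profile):

source ~/.bash_profile
echo $SPARK_HOME     # expect /home/hs/opt/spark-1.5.2
which spark-shell    # expect /home/hs/opt/spark-1.5.2/bin/spark-shell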

3.2. Configure the slaves file in /home/hs/opt/spark-1.5.2/conf
[hs@master conf]$ cat slaves
# A Spark Worker will be started on each of the machines listed below.
slave1
slave2
slave3
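
start-all.sh contacts each host listed in slaves over SSH, so the master must be able to log in to every worker without a password. A quick check (assuming the key-based SSH already set up for Hadoop in the referenced post):

for h in slave1 slave2 slave3; do ssh $h hostname; done     # should print each hostname with no password prompt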

3.3. Configure spark-env.sh in /home/hs/opt/spark-1.5.2/conf
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.95-2.6.4.0.el7_2.x86_64
export SCALA_HOME=/home/hs/opt/scala-2.11.6
export SPARK_MASTER_IP=10.91.99.101
export SPARK_WORKER_MEMORY=6g
export HADOOP_CONF_DIR=/home/hs/opt/hadoop-2.7.2/etc/hadoop

HADOOP_CONF_DIR points to the Hadoop configuration directory, SPARK_MASTER_IP is the IP address of the master host, and SPARK_WORKER_MEMORY is the maximum amount of memory each worker may use.
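
Note that a fresh Spark distribution ships only templates for these files; spark-env.sh (and the slaves file from 3.2) are created by copying the templates first, a standard step not shown above:

cd $SPARK_HOME/conf
cp slaves.template slaves                 # then list the worker hostnames
cp spark-env.sh.template spark-env.sh     # then append the export lines above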

4. Copy the Spark directory to the slave machines
scp -r /home/hs/opt/spark-1.5.2  hs@slave1:~/opt/
scp -r /home/hs/opt/spark-1.5.2  hs@slave2:~/opt/
scp -r /home/hs/opt/spark-1.5.2  hs@slave3:~/opt/
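
Equivalently, a short loop avoids repeating the command (same effect as the three scp lines above):

for h in slave1 slave2 slave3; do scp -r /home/hs/opt/spark-1.5.2 hs@$h:~/opt/; done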

5. Start the Spark standalone cluster and check its status
cd $SPARK_HOME/sbin
sh start-all.sh

$SPARK_HOME/bin/spark-shell
jps
Check the cluster web UI at http://master:8080/.
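
With the daemons started, jps on the master should list a Master process, and jps on each slave a Worker process (alongside any Hadoop daemons already running). A minimal smoke test against the standalone master (a sketch; spark://master:7077 is the standalone master's default URL for this layout):

# Sums 1..1000 on the cluster; the REPL output should include res0: Double = 500500.0
echo 'sc.parallelize(1 to 1000).sum()' | $SPARK_HOME/bin/spark-shell --master spark://master:7077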

6. Spark SQL configuration
6.1. Copy $HIVE_HOME/conf/hive-site.xml into $SPARK_HOME/conf.
6.2. Copy $HADOOP_HOME/etc/hadoop/hdfs-site.xml into $SPARK_HOME/conf.
6.3. Run spark-sql, which connects to Hive, to inspect and test the setup (see the sketch below).
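
Steps 6.1 to 6.3 as concrete commands (a sketch, assuming the metastore described in hive-site.xml is reachable; show databases is just an illustrative test query):

cp $HIVE_HOME/conf/hive-site.xml $SPARK_HOME/conf/
cp $HADOOP_HOME/etc/hadoop/hdfs-site.xml $SPARK_HOME/conf/
$SPARK_HOME/bin/spark-sql --master spark://master:7077 -e "show databases;"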