Setting Up Spark + Hadoop in Standalone Mode

Source: Internet | Editor: 程序博客网 | Time: 2024/04/28 07:59

Installing and Using Spark (Standalone Mode)

 

Environment: Ubuntu Server, Java, Scala

1: Install the Java environment on Linux (install the JDK yourself)

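2: Install Scala 2.9.3

$ tar -zxf scala-2.9.3.tgz
$ sudo mv scala-2.9.3 /usr/lib
$ sudo vim /etc/profile
# add the following lines at the end
export SCALA_HOME=/usr/lib/scala-2.9.3
export PATH=$PATH:$SCALA_HOME/bin
# save and exit vim
# make the profile take effect immediately
$ source /etc/profile
# test
$ scala -version
# note: the pre-built Spark 1.5.1 packages are built against Scala 2.10 and bundle their own Scala library,
# so this separate Scala install mainly matters if you compile your own Scala programs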

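3: Install Spark

Download the latest Spark release from the official site. At the time of writing, the latest version is 1.5.1. Download page: http://spark.apache.org/downloads.html

Be sure to download a pre-built package: choose "Pre-built for Hadoop 2.6 and later". The downloaded file is spark-1.5.1-bin-hadoop2.6.tgz.

Extract it: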

$ tar -zxf spark-1.5.1-bin-hadoop2.6.tgz
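
A quick sanity check after extracting can save some head-scratching later. The sketch below is not part of the original steps: it assumes you run it from the directory where you extracted the archive, and the variable names `archive` and `spark_dir` are my own. It derives the directory name from the archive name and looks for the `bin/spark-shell` launcher inside it.

```shell
# Archive name taken from the download step above.
archive="spark-1.5.1-bin-hadoop2.6.tgz"

# tar extracts into a directory named after the archive, minus ".tgz".
spark_dir="${archive%.tgz}"
echo "expected directory: $spark_dir"

# Check that the launcher script is where we expect it (assumes the
# current directory is the one the archive was extracted in).
if [ -x "$spark_dir/bin/spark-shell" ]; then
  echo "Spark looks extracted correctly"
else
  echo "bin/spark-shell not found; run this from the directory where you extracted the archive"
fi
```

The same `spark_dir` value is what you would later point SPARK_HOME at.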

Set the SPARK_EXAMPLES_JAR environment variable

$ vim ~/.bashrc
# add the following lines at the end
# (this path comes from a Spark 0.7.2 source build; adjust it to match your own installation)
export SPARK_EXAMPLES_JAR=$HOME/spark-0.7.2/examples/target/scala-2.9.3/spark-examples_2.9.3-0.7.2.jar
# save and exit vim
# make the change take effect immediately
$ source ~/.bashrc

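This step is actually the most important one, and unfortunately neither the official documentation nor the blog posts I found online mention it. I only added it after stumbling on two threads, "Running SparkPi" and "Null pointer exception when running ./run spark.examples.SparkPi local". Before that, SparkPi simply refused to run.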

(Optional) Set the SPARK_HOME environment variable and add SPARK_HOME/bin to PATH

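$ vim ~/.bashrc
# add the following lines at the end
# (use the directory you actually extracted Spark into)
export SPARK_HOME=$HOME/spark-0.7.2
export PATH=$PATH:$SPARK_HOME/bin
# save and exit vim
# make the change take effect immediately
$ source ~/.bashrc
On later installs these two steps did not seem to be necessary, but I did them anyway. Spark is like Hadoop in this respect: extract it and it is ready to use.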
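
If you do keep the SPARK_HOME step, one thing to watch is that every `source ~/.bashrc` appends `$SPARK_HOME/bin` to PATH again. Here is a minimal sketch of a guard against that. The helper name `add_to_path` and the default install location are my own assumptions, not something from the steps above.

```shell
# Append a directory to PATH only if it is not already listed.
add_to_path() {
  case ":$PATH:" in
    *":$1:"*) ;;               # already present, nothing to do
    *) PATH="$PATH:$1" ;;      # not present yet, append it
  esac
}

# Assumed install location; keep your own value if SPARK_HOME is already set.
SPARK_HOME="${SPARK_HOME:-$HOME/spark-1.5.1-bin-hadoop2.6}"

add_to_path "$SPARK_HOME/bin"
add_to_path "$SPARK_HOME/bin"  # second call is a no-op
export SPARK_HOME PATH
```

Putting these lines in `~/.bashrc` instead of the plain `export PATH=...` line keeps PATH the same length no matter how often the file is sourced.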


Running Spark on a single machine
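
As a rough sketch of what a single-machine smoke test can look like with the pre-built 1.x package: `bin/run-example` is the launcher shipped in the distribution, and `SparkPi 10` is the example invocation used in the Spark documentation. The install location and the `pi_status` variable below are my own assumptions.

```shell
# Assumed install location; override SPARK_HOME if yours differs.
SPARK_HOME="${SPARK_HOME:-$HOME/spark-1.5.1-bin-hadoop2.6}"

if [ -x "$SPARK_HOME/bin/run-example" ]; then
  # Run the bundled SparkPi example, splitting the work into 10 tasks.
  if "$SPARK_HOME/bin/run-example" SparkPi 10; then
    pi_status="ok"
  else
    pi_status="failed"         # launcher exists but the job did not finish
  fi
else
  echo "run-example not found under $SPARK_HOME" >&2
  pi_status="missing"
fi
echo "SparkPi status: $pi_status"
```

If the status is "failed", checking that `java` is on PATH is usually the first thing I would try.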

4: Configure Spark

Configure the Spark environment variables

cd $SPARK_HOME/conf 
cp spark-env.sh.template spark-env.sh
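
One caveat with the plain `cp` above is that running it again overwrites any edits already made to spark-env.sh. A small guard avoids that. This is only a sketch: the function name `init_spark_env` is made up, and the demonstration uses a throwaway directory in place of `$SPARK_HOME/conf`.

```shell
# Copy the template into place unless spark-env.sh already exists.
# $1 is the Spark conf directory.
init_spark_env() {
  conf_dir="$1"
  if [ -e "$conf_dir/spark-env.sh" ]; then
    echo "keeping existing $conf_dir/spark-env.sh"
  elif [ -f "$conf_dir/spark-env.sh.template" ]; then
    cp "$conf_dir/spark-env.sh.template" "$conf_dir/spark-env.sh"
    echo "created $conf_dir/spark-env.sh"
  else
    echo "no template found in $conf_dir" >&2
    return 1
  fi
}

# Demonstration in a throwaway directory (stands in for $SPARK_HOME/conf).
demo_conf="$(mktemp -d)"
echo "# template" > "$demo_conf/spark-env.sh.template"
init_spark_env "$demo_conf"                          # first call copies the template
echo "export SPARK_MASTER_PORT=7077" >> "$demo_conf/spark-env.sh"
init_spark_env "$demo_conf"                          # second call leaves the edit alone
```

On a real install you would call `init_spark_env "$SPARK_HOME/conf"` instead of using the demo directory.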

Open the file with vi spark-env.sh and add the following:

<code class="prettyprint" style="box-sizing: border-box; font-family: Consolas, Menlo, Monaco, 'Lucida Console', 'Liberation Mono', 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', 'Courier New', monospace, serif;font-size:undefined; padding: 0px; color: inherit; border-top-left-radius: 0px; border-top-right-radius: 0px; border-bottom-right-radius: 0px; border-bottom-left-radius: 0px; margin: 0px; border: 0px; outline: 0px; vertical-align: baseline; position: relative; overflow: auto; word-break: break-word; max-height: 600px; display: block; background-color: rgb(247, 247, 247);">export JAVA_HOME=/usr/local/java-1.7.0
export HADOOP_HOME=/opt/hadoop-2.3.0-cdh5.0.0
export HADOOP_CONF_DIR=/etc/hadoop/conf
export SCALA_HOME=/usr/local/scala-2.11.4
export SPARK_HOME=/home/lxw1234/spark-1.3.1-bin-hadoop2.3
export SPARK_MASTER_IP=127.0.0.1
export SPARK_MASTER_PORT=7077
export SPARK_MASTER_WEBUI_PORT=8099
export SPARK_WORKER_CORES=3 # number of CPU cores each Worker uses
export SPARK_WORKER_INSTANCES=1 # number of Worker instances started on each slave
export SPARK_WORKER_MEMORY=10G # amount of memory each Worker uses
export SPARK_WORKER_WEBUI_PORT=8081 # Worker web UI port
export SPARK_EXECUTOR_CORES=1 # number of cores each Executor uses
export SPARK_EXECUTOR_MEMORY=1G # amount of memory each Executor uses
export SPARK_CLASSPATH=/opt/hadoop-lzo/current/hadoop-lzo.jar # needed because LZO is used
export SPARK_CLASSPATH=$SPARK_CLASSPATH:$CLASSPATH
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:$HADOOP_HOME<span class="pun" style="box-sizing: border-box; margin: 0px; padding: 0px; font-family: 'Microsoft YaHei'; border: 0px; outline: 0px; font-size: 12px; 
vertical-align: baseline; background-color: transparent;">/</span><span class="pln" style="box-sizing: border-box; margin: 0px; padding: 0px; font-family: 'Microsoft YaHei'; border: 0px; outline: 0px; font-size: 12px; vertical-align: baseline; background-color: transparent;">lib</span><span class="pun" style="box-sizing: border-box; margin: 0px; padding: 0px; font-family: 'Microsoft YaHei'; border: 0px; outline: 0px; font-size: 12px; vertical-align: baseline; background-color: transparent;">/</span><span class="kwd" style="box-sizing: border-box; margin: 0px; padding: 0px; font-family: 'Microsoft YaHei'; border: 0px; outline: 0px; font-size: 12px; vertical-align: baseline; color: rgb(0, 0, 139); background-color: transparent;">native</span></code>
  • Configure the Slave

cp slaves.template slaves 
vi slaves   # add the following line:
localhost
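
The two steps above can also be scripted. The following is a minimal sketch, assuming a POSIX shell; the helper name `add_slave` is hypothetical and not part of Spark. It creates the slaves file from the template when needed and appends a host only if it is not already listed.

```shell
# add_slave CONF_DIR HOST  (hypothetical helper, not part of Spark)
# Ensures CONF_DIR/slaves exists and lists HOST exactly once.
add_slave() {
    conf_dir=$1
    host=$2
    slaves_file="$conf_dir/slaves"
    # Start from the shipped template when there is no slaves file yet.
    if [ ! -f "$slaves_file" ]; then
        if [ -f "$conf_dir/slaves.template" ]; then
            cp "$conf_dir/slaves.template" "$slaves_file"
        else
            : > "$slaves_file"
        fi
    fi
    # Append the host only if no line already matches it exactly.
    grep -qxF "$host" "$slaves_file" || printf '%s\n' "$host" >> "$slaves_file"
}
```

For the single-machine setup described here, the call would be `add_slave "$SPARK_HOME/conf" localhost`; running it twice leaves the file unchanged.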

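5. Configure passwordless SSH login

Because the Master and the Slave run on the same machine, set up passwordless SSH login from the machine to itself. If there are other Slaves, passwordless SSH login from the Master to each Slave must be configured as well.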

cd ~/
ssh-keygen   # press Enter at every prompt
cd .ssh/
cat id_rsa.pub >> authorized_keys
chmod 600 authorized_keys
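
The key setup above is interactive and appends the public key again on every run. A non-interactive variant might look like the sketch below; it assumes OpenSSH's `ssh-keygen` is installed, and the function name `setup_local_ssh` is hypothetical.

```shell
# setup_local_ssh SSH_DIR  (hypothetical helper; SSH_DIR is normally ~/.ssh)
setup_local_ssh() {
    ssh_dir=$1
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
    # Generate an RSA key pair with an empty passphrase unless one exists.
    if [ ! -f "$ssh_dir/id_rsa" ]; then
        ssh-keygen -q -t rsa -N "" -f "$ssh_dir/id_rsa"
    fi
    touch "$ssh_dir/authorized_keys"
    # Authorize the public key once, even if the function runs repeatedly.
    grep -qxF "$(cat "$ssh_dir/id_rsa.pub")" "$ssh_dir/authorized_keys" ||
        cat "$ssh_dir/id_rsa.pub" >> "$ssh_dir/authorized_keys"
    chmod 600 "$ssh_dir/authorized_keys"
}
```

After `setup_local_ssh "$HOME/.ssh"`, running `ssh localhost` should no longer ask for a password (it may still ask to confirm the host key on first use).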

6. Start the Spark Master

cd $SPARK_HOME/sbin/ 
./start-master.sh

The startup log is written to the $SPARK_HOME/logs/ directory. A normal startup log looks like this:

15/06/05 14:54:16 INFO server.AbstractConnector: Started SelectChannelConnector@localhost:6066 
15/06/05 14:54:16 INFO util.Utils: Successfully started service on port 6066. 
15/06/05 14:54:16 INFO rest.StandaloneRestServer: Started REST server for submitting applications on port 6066 
15/06/05 14:54:16 INFO master.Master: Starting Spark master at spark://127.0.0.1:7077 
15/06/05 14:54:16 INFO master.Master: Running Spark version 1.3.1 
15/06/05 14:54:16 INFO server.Server: jetty-8.y.z-SNAPSHOT 
15/06/05 14:54:16 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:8099 
15/06/05 14:54:16 INFO util.Utils: Successfully started service 'MasterUI' on port 8099. 
15/06/05 14:54:16 INFO ui.MasterWebUI: Started MasterWebUI at http://127.1.1.1:8099 
15/06/05 14:54:16 INFO master.Master: I have been elected leader! New state: ALIVE
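
Rather than reading the log by eye, the two lines that matter (the master URL and the ALIVE state) can be checked with a small script. This is a sketch that assumes the log wording shown above; the function name `master_url_from_log` is hypothetical.

```shell
# master_url_from_log LOG_FILE  (hypothetical helper)
# Prints the spark:// URL announced in the log, and fails if the master
# never reported the ALIVE state.
master_url_from_log() {
    log_file=$1
    grep -q 'New state: ALIVE' "$log_file" || return 1
    sed -n 's|.*Starting Spark master at \(spark://[^ ]*\).*|\1|p' "$log_file" | tail -n 1
}
```

The printed URL is the value that Workers and `--master` need when connecting to this standalone Master.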

7. Start the Spark Slave


cd $SPARK_HOME/sbin/ 
./start-slaves.sh 


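The script reads the hosts configured in $SPARK_HOME/conf/slaves, connects to each one over SSH in turn, and starts a Spark Worker there.

After a successful start, the Master Web UI shows that the Worker has registered.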
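
To preview which hosts would be contacted, the slaves file can be parsed the same way by hand. The sketch below is a simplified illustration, not the actual Spark script; the function name `list_slave_hosts` is hypothetical.

```shell
# list_slave_hosts SLAVES_FILE  (hypothetical helper)
# Prints one host per line, skipping comments and blank lines.
list_slave_hosts() {
    sed -e 's/#.*$//' \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e '/^$/d' "$1"
}

# Dry run: show the hosts a start script would visit, without connecting.
# for host in $(list_slave_hosts "$SPARK_HOME/conf/slaves"); do
#     echo "would ssh to $host and start a Worker"
# done
```

Every host printed this way needs the passwordless SSH login configured earlier.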


Open this address in a browser: http://192.168.1.84:8080/   (the host part is the Master's IP address)



8. A simple example (the 50 most frequent words in a file)

Run ./spark-shell directly from the bin directory:

hadoop@Master:/usr/local/spark-1.5.1-bin-hadoop2.6/bin$ ./spark-shell
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.properties
To adjust logging level use sc.setLogLevel("INFO")
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.5.1
      /_/

Using Scala version 2.10.4 (OpenJDK 64-Bit Server VM, Java 1.7.0_79)
Type in expressions to have them evaluated.
Type :help for more information.
15/10/13 19:12:16 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
Spark context available as sc.
15/10/13 19:12:18 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/10/13 19:12:19 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/10/13 19:12:35 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
15/10/13 19:12:35 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
15/10/13 19:12:39 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/10/13 19:12:39 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/10/13 19:12:39 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
SQL context available as sqlContext.


I did not look into what all these warnings mean. Once inside spark-shell, enter the following statements in order:

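// Load the file as an RDD of lines
val srcFile = sc.textFile("/usr/local/kern.log")

// Split each line into words and count the occurrences of each word
val a = srcFile.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)

// Swap to (count, word), sort by count in descending order, swap back, and print the first 50
a.map(word => (word._2, word._1)).sortByKey(false).map(word => (word._2, word._1)).take(50).foreach(println)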
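
To sanity-check the Spark output on a small file, roughly the same count can be computed with standard Unix tools. This is a sketch for comparison only, not part of the Spark job; the function name `top_words` is hypothetical. Unlike `split(" ")`, it ignores the empty tokens produced by consecutive spaces, so counts for the empty string will differ.

```shell
# top_words FILE N  (hypothetical helper, plain Unix tools, no Spark)
# Prints the N most frequent space-separated words as "word count".
top_words() {
    tr -s ' ' '\n' < "$1" |     # one word per line, like flatMap + split(" ")
        grep -v '^$' |          # drop empty tokens
        sort | uniq -c |        # count identical words, like reduceByKey(_+_)
        sort -rn |              # highest count first, like sortByKey(false)
        head -n "$2" |          # keep the first N, like take(N)
        awk '{print $2, $1}'    # print the word before its count
}
```

For the file used above, the call would be `top_words /usr/local/kern.log 50`; the order of words with equal counts may differ from Spark's.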

The result is printed to the terminal.



The status of the job can be viewed on port 4040: http://192.168.1.84:4040/jobs/



9. Spark Java programming (Spark and Spark Streaming)

1: Spark batch processing: count the lines in a file that contain "a" and the lines that contain "b": SimpleApp.java
package org.apache.eagle.spark_streaming_kafka;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;

public class SimpleApp {

    public static void main(String[] args) {
        String logFile = "/var/log/boot.log"; // Should be some file on your system
        SparkConf conf = new SparkConf().setAppName("Simple Application");
        JavaSparkContext sc = new JavaSparkContext(conf);
        // Cache the lines because the RDD is scanned twice below
        JavaRDD<String> logData = sc.textFile(logFile).cache();

        // Count the lines that contain "a"
        long numAs = logData.filter(new Function<String, Boolean>() {
            private static final long serialVersionUID = 1L;

            public Boolean call(String s) { return s.contains("a"); }
        }).count();

        // Count the lines that contain "b"
        long numBs = logData.filter(new Function<String, Boolean>() {
            private static final long serialVersionUID = 1L;

            public Boolean call(String s) { return s.contains("b"); }
        }).count();

        System.out.println("Lines with a: " + numAs + ", lines with b: " + numBs);

        sc.stop();
    }

}

2: Spark Streaming: read data from Kafka and count words.
package org.apache.eagle.spark_streaming_kafka;

import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFunction;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

import com.google.common.collect.Lists;

import scala.Tuple2;

/**
 * spark-streaming-kafka
 */
public class JavaKafkaWordCount {
    private static final Pattern SPACE = Pattern.compile(" ");

    private JavaKafkaWordCount() {
    }

    public static void main(String[] args) {
        String zkQuorum = "10.64.255.161";
        String group = "test-consumer-group";
        SparkConf sparkConf = new SparkConf().setAppName("JavaKafkaWordCount");
        // Create the context with 2 seconds batch size
        JavaStreamingContext jssc = new JavaStreamingContext(sparkConf, new Duration(2000));
        // Topic name -> number of consumer threads
        Map<String, Integer> topicMap = new HashMap<String, Integer>();
        topicMap.put("noise", 1);
        JavaPairReceiverInputDStream<String, String> messages =
                KafkaUtils.createStream(jssc, zkQuorum, group, topicMap);

        // Keep only the message value of each (key, value) pair
        JavaDStream<String> lines = messages.map(new Function<Tuple2<String, String>, String>() {
            public String call(Tuple2<String, String> tuple2) {
                return tuple2._2();
            }
        });
        // Split each line into words
        JavaDStream<String> words = lines.flatMap(new FlatMapFunction<String, String>() {
            public Iterable<String> call(String x) {
                return Lists.newArrayList(SPACE.split(x));
            }
        });

        // Count each word within the current batch
        JavaPairDStream<String, Integer> wordCounts = words.mapToPair(
            new PairFunction<String, String, Integer>() {
                public Tuple2<String, Integer> call(String s) {
                    return new Tuple2<String, Integer>(s, 1);
                }
            }).reduceByKey(new Function2<Integer, Integer, Integer>() {
                public Integer call(Integer i1, Integer i2) {
                    return i1 + i2;
                }
            });

        wordCounts.print();
        jssc.start();
        jssc.awaitTermination();
    }
}

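Notes:
    1: Environment: make sure Spark is correctly installed on the local machine, following the steps above. The ZooKeeper cluster and the Kafka cluster must be installed, and the Kafka topic must already exist.
    2: An earlier run failed because a class could not be found (KafkaUtils). The cause was that the dependency jars had not all been packaged into the final jar. Add the following to pom.xml: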
<build>
    <sourceDirectory>src/main/java</sourceDirectory>
    <testSourceDirectory>src/test/java</testSourceDirectory>
    <plugins>
      <!--
        Bind the maven-assembly-plugin to the package phase.
        This creates a jar file that bundles the project's dependencies,
        suitable for deployment to a cluster.
       -->
      <plugin>
        <artifactId>maven-assembly-plugin</artifactId>
        <configuration>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
          <archive>
            <manifest>
              <mainClass>org.apache.eagle.spark_streaming_kafka.JavaKafkaWordCount</mainClass>
            </manifest>
          </archive>
        </configuration>
        <executions>
          <execution>
            <id>make-assembly</id>
            <phase>package</phase>
            <goals>
              <goal>single</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
</build>
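     This packages all required jars together, so the resulting file is large.
     3: How are jobs submitted? Spark and Spark Streaming jobs are submitted the same way, with the $SPARK_HOME/bin/spark-submit script. Change into the bin directory first.
           The Spark Streaming job is submitted as follows: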
./spark-submit  --master local[8] /home/zqin/workspace/spark-streaming-kafka/target/spark-streaming-kafka-0.0.1-SNAPSHOT-jar-with-dependencies.jar

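Because the entry class is specified in pom.xml, --class is not needed here. If it is not specified there, the entry class must be given with --class on the command line.

          The Spark job is submitted as follows: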
./spark-submit  --class org.apache.eagle.spark_streaming_kafka.SimpleApp --master local[8] /home/zqin/workspace/spark-streaming-kafka/target/spark-streaming-kafka-0.0.1-SNAPSHOT-jar-with-dependencies.jar
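          Here the class containing the program's main entry point must be specified, because the manifest names JavaKafkaWordCount rather than SimpleApp.
       4: When Spark Streaming runs, the console fills with log output and the results are hard to find. In Spark's conf directory, rename log4j.properties.template to log4j.properties, then change log4j.rootCategory=INFO, console to log4j.rootCategory=WARN, console. This stops Spark from printing INFO-level logs to the console. To show more detailed output instead, change INFO to DEBUG.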
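
The log-level change in note 4 can be applied with a short script. This is a sketch under two assumptions: the template is copied rather than renamed, so the original stays available, and the root logger line starts with `log4j.rootCategory=` as quoted above. The function name `set_root_log_level` is hypothetical.

```shell
# set_root_log_level CONF_DIR LEVEL  (hypothetical helper)
set_root_log_level() {
    conf_dir=$1
    level=$2
    props="$conf_dir/log4j.properties"
    # Create log4j.properties from the template on first use.
    [ -f "$props" ] || cp "$conf_dir/log4j.properties.template" "$props"
    # Rewrite only the rootCategory line; the old file is kept as .bak.
    sed -i.bak "s/^log4j\.rootCategory=.*/log4j.rootCategory=$level, console/" "$props"
}
```

`set_root_log_level "$SPARK_HOME/conf" WARN` quiets the console, and passing `DEBUG` instead produces the more detailed output.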



10. Shut down Spark

From the Spark directory, run: sbin/stop-all.sh