Running Spark on Mesos [Installation and Usage]
Source: Internet, Editor: 程序博客网, Date: 2024/05/06 02:53
Installing Scala
Extract the archive: tar -zxvf scala-2.9.2.tgz
Add the following lines to ~/.bashrc or ~/.profile:
export SCALA_HOME="/opt/scala"
export PATH="${SCALA_HOME}/bin:${JAVA_HOME}/bin:${PATH}"
Then reload the file: $ source ~/.bashrc
Verify that Scala is installed correctly:
$ scala
Installing Spark 0.6.1
Spark requires Scala 2.9.2. You will need to have Scala's bin directory in your PATH, or you will need to set the SCALA_HOME environment variable to point to where you've installed Scala. Scala must also be accessible through one of these methods on slave nodes on your cluster.
Spark uses Simple Build Tool, which is bundled with it. To compile the code, go into the top-level Spark directory and run
sbt/sbt package
Testing the Build
Spark comes with a number of sample programs in the examples directory. To run one of the samples, use ./run <class> <params> in the top-level Spark directory (the run script sets up the appropriate paths and launches that program). For example, ./run spark.examples.SparkPi will run a sample program that estimates Pi. Each of the examples prints usage help if no params are given.
Note that all of the sample programs take a <master> parameter specifying the cluster URL to connect to. This can be a URL for a distributed cluster, or local to run locally with one thread, or local[N] to run locally with N threads. You should start by using local for testing.
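The three master-string forms just described (local, local[N], and a cluster URL) can be told apart with a simple pattern match. This is only an illustrative sketch of the format, not Spark's actual parsing code:

```scala
// Illustrative sketch of how a <master> string might be interpreted.
// Not Spark's real parser; names are made up for this example.
object MasterString {
  sealed trait Master
  case object Local extends Master                 // one local thread
  case class LocalN(threads: Int) extends Master   // N local threads
  case class Cluster(url: String) extends Master   // e.g. mesos://host:5050

  private val LocalNPattern = """local\[(\d+)\]""".r

  def parse(s: String): Master = s match {
    case "local"          => Local
    case LocalNPattern(n) => LocalN(n.toInt)
    case url              => Cluster(url)
  }
}
```

When used as a pattern, the regex is anchored, so only strings of exactly the form local[N] take the second branch; anything else is treated as a cluster URL.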
Finally, Spark can be used interactively from a modified version of the Scala interpreter that you can start through ./spark-shell. This is a great way to learn Spark.
Running Spark on Mesos
Spark can run on private clusters managed by the Apache Mesos resource manager. Follow the steps below to install Mesos and Spark:
- Download and build Spark using the instructions here.
- Download Mesos 0.9.0-incubating from a mirror.
- Configure Mesos using the configure script, passing the location of your JAVA_HOME using --with-java-home. Mesos comes with "template" configure scripts for different platforms, such as configure.macosx, that you can run. See the README file in Mesos for other options. Note: If you want to run Mesos without installing it into the default paths on your system (e.g. if you don't have administrative privileges to install it), you should also pass the --prefix option to configure to tell it where to install. For example, pass --prefix=/home/user/mesos. By default the prefix is /usr/local.
- Build Mesos using make, and then install it using make install.
- Create a file called spark-env.sh in Spark's conf directory by copying conf/spark-env.sh.template, and add the following lines to it:
  - export MESOS_NATIVE_LIBRARY=<path to libmesos.so>. This path is usually <prefix>/lib/libmesos.so (where the prefix is /usr/local by default). Also, on Mac OS X, the library is called libmesos.dylib instead of .so.
  - export SCALA_HOME=<path to Scala directory>
- Copy Spark and Mesos to the same paths on all the nodes in the cluster (or, for Mesos, run make install on every node).
- Configure Mesos for deployment:
  - On your master node, edit <prefix>/var/mesos/deploy/masters to list your master and <prefix>/var/mesos/deploy/slaves to list the slaves, where <prefix> is the prefix where you installed Mesos (/usr/local by default).
  - On all nodes, edit <prefix>/var/mesos/conf/mesos.conf and add the line master=HOST:5050, where HOST is your master node.
  - Run <prefix>/sbin/mesos-start-cluster.sh on your master to start Mesos. If all goes well, you should see Mesos's web UI on port 8080 of the master machine.
  - See Mesos's README file for more information on deploying it.
- To run a Spark job against the cluster, when you create your SparkContext, pass the string mesos://HOST:5050 as the first parameter, where HOST is the machine running your Mesos master. In addition, pass the location of Spark on your nodes as the third parameter, and a list of JAR files containing your job's code as the fourth (these will automatically get copied to the workers). For example:
  new SparkContext("mesos://HOST:5050", "My Job Name", "/home/user/spark", List("my-job.jar"))
Running the SparkKMeans Example on Mesos
Start the Mesos daemons on every node and check in the web UI that all slaves have registered. Start Hadoop alongside Mesos, upload kmeansdata.txt to HDFS, then enter the Spark directory on the master and run the k-means example:
./run spark.examples.SparkKMeans 192.168.1.130:5050 hdfs://master:9000/user/liu/testdata/kmeansdata.txt 8 2.0
Make sure the following environment variables are set:
export JAVA_HOME=$HOME/jdk1.7.0_05
export HADOOP_VERSION=1.0.4
export HADOOP_HOME=$HOME/hadoop-$HADOOP_VERSION
export SCALA_HOME=$HOME/scala-2.9.2
export MESOS_HOME=$HOME/mesos-0.9.0
export MESOS_NATIVE_LIBRARY=$MESOS_HOME/src/.libs/libmesos.so
export SPARK_HOME=$HOME/spark-0.6.1
export LD_LIBRARY_PATH=$MESOS_HOME/src/.libs
export CLASSPATH=/home/hadoop/spark-0.6.1/core/target/spark-core-assembly-0.6.1.jar:.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$SCALA_HOME/bin
Note:
The last step above is not very specific, so here we use the SparkKMeans.scala bundled with Spark as an example of how to compile and run a program. All of the following steps only need to be performed on the master node. See also the Spark programming guide.
First, build the assembly jar containing Spark and its dependencies (core/target/spark-core-assembly-0.6.1.jar):
sbt/sbt assembly
Add this jar to the CLASSPATH:
export CLASSPATH=/home/hadoop/spark-0.6.1/core/target/spark-core-assembly-0.6.1.jar:.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
Add the following imports to your Scala source file:
import spark.SparkContext
import SparkContext._
Compile the Scala program:
scalac SparkKMeans.scala
Run the compiled SparkKMeans program:
scala spark.examples.SparkKMeans mesos://192.168.1.130:5050 hdfs://192.168.1.130:9000/dataset/Square-10m.txt 8 2.0
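The SparkKMeans invocation above passes k = 8 centers and a convergence threshold of 2.0. Its core computation is the standard k-means iteration: assign each point to its nearest center, then move each center to the mean of its assigned points. The sketch below shows one such iteration in plain Scala (illustrative only; it is not the code of the bundled example and does not use Spark):

```scala
// Plain-Scala sketch of one k-means iteration, the step that the
// SparkKMeans example parallelizes over the cluster. Points and
// centers are represented as Array[Double].
object KMeansSketch {
  def squaredDist(a: Array[Double], b: Array[Double]): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Index of the center closest to point p
  def closest(p: Array[Double], centers: Array[Array[Double]]): Int =
    centers.indices.minBy(i => squaredDist(p, centers(i)))

  // One iteration: group points by nearest center, then average each group
  def step(points: Seq[Array[Double]],
           centers: Array[Array[Double]]): Array[Array[Double]] =
    points.groupBy(p => closest(p, centers)).toSeq.sortBy(_._1).map {
      case (_, ps) =>
        val dim = ps.head.length
        Array.tabulate(dim)(d => ps.map(_(d)).sum / ps.length)
    }.toArray
}
```

The real example repeats this step until every center moves less than the convergence threshold between iterations.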
Writing a Spark Program
The first thing a Spark program must do is to create a SparkContext object, which tells Spark how to access a cluster. This is done through the following constructor:
new SparkContext(master, jobName, [sparkHome], [jars])
The master parameter is a string specifying a Mesos cluster to connect to, or a special "local" string to run in local mode, as described below. jobName is a name for your job, which will be shown in the Mesos web UI when running on a cluster. Finally, the last two parameters are needed to deploy your code to a cluster if running in distributed mode, as described later.
In the Spark shell, a special interpreter-aware SparkContext is already created for you, in the variable called sc. Making your own SparkContext will not work. You can set which master the context connects to using the MASTER environment variable. For example, to run on four cores, use
$ MASTER=local[4] ./spark-shell
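As just shown, spark-shell picks up the MASTER environment variable from the shell. The same lookup-with-default pattern can be sketched in plain Scala (the "local" default here is an assumption for illustration, not verified spark-shell behavior):

```scala
// Sketch of reading a master string from an environment map with a
// fallback. The "local" default is an assumption, not spark-shell's code.
object MasterEnv {
  def master(env: Map[String, String]): String =
    env.getOrElse("MASTER", "local")
}
```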