Installing Hadoop and Spark on Ubuntu 16.04
Installing the JDK
Download the JDK (jdk-8u91-linux-x64.tar.gz is used as the example)
Create the target directory
sudo mkdir /usr/lib/jvm
Extract the downloaded JDK archive into the new directory
sudo tar -xzvf jdk-8u91-linux-x64.tar.gz -C /usr/lib/jvm
Enter the jvm directory and rename the extracted folder
```bash
cd /usr/lib/jvm
sudo mv jdk1.8.0_91 jdk
```
Add the environment variables
```bash
sudo vim /etc/profile
# add the following lines
export JAVA_HOME=/usr/lib/jvm/jdk
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
```
Apply the changes
source /etc/profile
Test the installation
java -version
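The order of the PATH entries above matters because the shell searches them left to right. A minimal sketch (pure string manipulation, no JDK required) showing that the prepended JDK bin directory is consulted first:

```shell
# Sketch: prepending to PATH makes the JDK's bin directory the first
# place the shell looks for commands such as `java`.
JAVA_HOME=/usr/lib/jvm/jdk
PATH="$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH"
first_entry=$(echo "$PATH" | cut -d: -f1)
echo "$first_entry"   # /usr/lib/jvm/jdk/bin
```

If another JDK still shadows this one, `which java` reports which path actually wins.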
Installing Scala
The procedure mirrors the JDK installation.
Download Scala (scala-2.11.8.tgz is used as the example)
Extract the downloaded archive
sudo tar -xzvf scala-2.11.8.tgz -C /usr/local
Rename the extracted folder
```bash
cd /usr/local
sudo mv scala-2.11.8 scala
```
Add the environment variables
```bash
sudo vim /etc/profile
# append the following lines
export SCALA_HOME=/usr/local/scala
export PATH=$SCALA_HOME/bin:$PATH
```
Apply the changes
source /etc/profile
Test the installation
scala -version
Installing Hadoop
Spark uses HDFS as its persistence layer by default, so Hadoop should be installed first; strictly speaking, Spark can also run without it.
References
- single-node / pseudo-distributed installation
- cluster installation
Installation
Install ssh
sudo apt install openssh-server
Configure passwordless ssh login
```bash
ssh-keygen -t rsa   # press Enter at every prompt
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
```
Test the passwordless login
ssh localhost # configuration succeeded if no password prompt appears
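What the `>>` redirection accomplishes can be sketched with throwaway files (the real files live under ~/.ssh; the key string below is a fabricated placeholder): sshd accepts the login when the public key appears as a line in authorized_keys.

```shell
# Demo with a temporary directory; the key material is fake.
demo=$(mktemp -d)
echo "ssh-rsa AAAAB3NzaDEMO demo@host" > "$demo/id_rsa.pub"
cat "$demo/id_rsa.pub" >> "$demo/authorized_keys"
if grep -qxF "$(cat "$demo/id_rsa.pub")" "$demo/authorized_keys"; then
  echo "key installed"
fi
```

If the real login still prompts for a password, check that ~/.ssh is mode 700 and authorized_keys is mode 600, since sshd rejects group-writable key files.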
Download Hadoop (hadoop-2.7.2.tar.gz is used as the example)
Extract the archive
sudo tar -xzvf hadoop-2.7.2.tar.gz -C /usr/local
Rename the extracted folder
```bash
cd /usr/local
sudo mv hadoop-2.7.2 hadoop
```
Fix the ownership
```bash
cd /usr/local
sudo chown -R yourusername:yourusername hadoop
```
Configure the environment variables
```bash
sudo vim /etc/profile
# append the following lines
export HADOOP_HOME=/usr/local/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
```
Test the installation
hadoop version
Hadoop pseudo-distributed configuration
Edit the configuration file
core-site.xml
```bash
cd /usr/local/hadoop
vim ./etc/hadoop/core-site.xml
```
Change the configuration to:
```xml
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
```
Edit the configuration file
hdfs-site.xml
```bash
cd /usr/local/hadoop
vim ./etc/hadoop/hdfs-site.xml
```
Change the configuration to:
```xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/data</value>
    </property>
</configuration>
```
Edit the configuration file
hadoop-env.sh
```bash
cd /usr/local/hadoop
vim ./etc/hadoop/hadoop-env.sh
# change: export JAVA_HOME=${JAVA_HOME}
# to:
export JAVA_HOME=/usr/lib/jvm/jdk
```
Format the NameNode
hdfs namenode -format
Start HDFS
start-dfs.sh
Test
jps
The following processes should be listed:
```
5939 Jps
5636 DataNode
5493 NameNode
5814 SecondaryNameNode
```
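The jps check can be scripted so a missing daemon is caught immediately. A sketch with the sample output hard-coded so it runs anywhere; in practice you would use `jps_out=$(jps)`:

```shell
# Scan jps output for the daemons a pseudo-distributed HDFS needs.
jps_out="5939 Jps
5636 DataNode
5493 NameNode
5814 SecondaryNameNode"
missing=0
for d in NameNode DataNode SecondaryNameNode; do
  echo "$jps_out" | grep -q "$d" || missing=1
done
[ "$missing" -eq 0 ] && echo "HDFS daemons up"
```

If the DataNode is missing after a re-format, clearing /usr/local/hadoop/tmp and formatting again is a common fix, since a stale clusterID prevents it from joining.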
Check through a browser
Enter the following address in a browser:
localhost:50070
Configuring YARN
Edit the configuration file
mapred-site.xml
```bash
cd /usr/local/hadoop
cp ./etc/hadoop/mapred-site.xml.template ./etc/hadoop/mapred-site.xml
vim ./etc/hadoop/mapred-site.xml
```
Change the configuration to:
```xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
```
Edit the configuration file
yarn-site.xml
```bash
cd /usr/local/hadoop
vim ./etc/hadoop/yarn-site.xml
```
Change the configuration to:
```xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
```
Write a start script
```bash
#!/bin/bash
# start HDFS
start-dfs.sh
# start YARN
start-yarn.sh
# start the history server so finished jobs show up in the web UI
mr-jobhistory-daemon.sh start historyserver
```
Write a stop script
```bash
#!/bin/bash
# stop the history server
mr-jobhistory-daemon.sh stop historyserver
# stop YARN
stop-yarn.sh
# stop HDFS
stop-dfs.sh
```
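The scripts above need the executable bit before they can be invoked directly. A sketch with an illustrative filename and body:

```shell
# Save a script, mark it executable, then invoke it by path.
cat > /tmp/demo-start.sh <<'EOF'
#!/bin/bash
echo "cluster starting"
EOF
chmod +x /tmp/demo-start.sh
out=$(/tmp/demo-start.sh)
echo "$out"   # cluster starting
```

Without `chmod +x`, the script can still be run as `bash /tmp/demo-start.sh`, but not as a standalone command.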
Check job status through the web UI
Enter the following address in a browser:
localhost:8088
Installing Spark
Download Spark (spark-2.0.0-bin-hadoop2.7.tgz is used as the example)
Extract the downloaded archive
sudo tar -zxf spark-2.0.0-bin-hadoop2.7.tgz -C /usr/local
Rename the extracted folder
```bash
cd /usr/local
sudo mv spark-2.0.0-bin-hadoop2.7 spark
```
Add the environment variables
```bash
sudo vim /etc/profile
# append the following lines
export SPARK_HOME=/usr/local/spark
export PATH=$SPARK_HOME/bin:$SPARK_HOME/sbin:$PATH
```
Fix the ownership
```bash
cd /usr/local
sudo chown -R yourusername:yourusername ./spark
```
Copy the configuration template
```bash
cd /usr/local/spark
cp ./conf/spark-env.sh.template ./conf/spark-env.sh
```
Edit the configuration file
```bash
cd /usr/local/spark
vim ./conf/spark-env.sh
# add the following lines
export SPARK_DIST_CLASSPATH=$(/usr/local/hadoop/bin/hadoop classpath)
export JAVA_HOME=/usr/lib/jvm/jdk
```
Run a simple example
/usr/local/spark/bin/run-example SparkPi 2>&1 | grep "Pi is roughly"
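This pipeline works because run-example is very chatty: `2>&1` folds stderr into stdout, and grep keeps only the result line. The same filtering step, with the verbose output simulated so it runs without Spark installed:

```shell
# Simulated run-example output piped through the same grep filter.
result=$(printf 'INFO SparkContext: starting\nPi is roughly 3.1415926\nINFO SparkContext: stopped\n' | grep "Pi is roughly")
echo "$result"   # Pi is roughly 3.1415926
```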
Start Spark
/usr/local/spark/sbin/start-all.sh
Write helper scripts
Start Hadoop and Spark
```bash
#!/bin/bash
# start Hadoop and YARN
start-dfs.sh
start-yarn.sh
# start the history server
mr-jobhistory-daemon.sh start historyserver
# start Spark
/usr/local/spark/sbin/start-all.sh
```
Stop Hadoop and Spark
```bash
#!/bin/bash
# stop Spark
/usr/local/spark/sbin/stop-all.sh
# stop the history server
mr-jobhistory-daemon.sh stop historyserver
# stop YARN and Hadoop
stop-yarn.sh
stop-dfs.sh
```
Check through the web UI
Enter the following address in a browser:
localhost:8080