Hadoop 2.6 Installation
Hadoop 2.6.0 Cluster Setup
===============
Set up a Hadoop 2.6.0 platform across multiple servers running 64-bit CentOS/RHEL, compiling a 64-bit build from the Hadoop 2.6.0 source files.
===============
Notes:
1. Environment: a laptop running two CentOS 6.6 virtual machines, each with 1 GB of RAM and 20 GB of disk. The XML files were modified substantially; see the installation guide: http://blog.csdn.net/licongcong_0224/article/details/12972889
2. Run commands as the hadoop user; do not take the shortcut of running as root, or things will break.
3. Distinguish the root and hadoop users by their prompts (# vs $).
4. /etc/profile is system-wide and should be edited by root; ~/.bash_profile lives in each user's home directory (e.g. /home/hadoop, /home/hongyuqin).
5. If the configuration goes wrong and needs redoing, delete ~/hdfs and recreate it with mkdir, delete ~/hadoop-2.6.0/logs and recreate it, then run hadoop namenode -format again.
6. A wordcount test example is included.
===============
Step 1: Create the user
Add a hadoop user on CentOS, set its password, and add it to /etc/sudoers (which first requires chmod u+w).
# useradd hadoop
# passwd hadoop
To make it a sudo user:
# chmod u+w /etc/sudoers
# vi /etc/sudoers      (add the line: hadoop ALL=(ALL) ALL)
# chmod u-w /etc/sudoers
Additional step: change the hostnames
# vim /etc/hosts    (back it up first, then clear the old contents and add:)
192.168.136.131 master
192.168.136.130 slave
# vim /etc/selinux/config    (set SELINUX=disabled)
# setenforce 0
# vim /etc/sysconfig/network    (set HOSTNAME=slave on the slave, HOSTNAME=master on the master)
Step 2: Install Java
A Java version already ships with the Linux install; after installing the new version, select it as the default.
Download jdk-7u71-linux-x64.rpm
$ sudo rpm -ivh jdk-7u71-linux-x64.rpm
Edit /etc/profile (sudo vim /etc/profile) and add:
export JAVA_HOME=/usr/java/jdk1.7.0_71
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
Register the new JDK as an alternative:
$ sudo update-alternatives --install /usr/bin/java java /usr/java/jdk1.7.0_71/bin/java 300
$ sudo update-alternatives --install /usr/bin/javac javac /usr/java/jdk1.7.0_71/bin/javac 300
Switch the default version:
$ sudo update-alternatives --config java
A list of candidates appears; select the version just added.
Verify:
# source /etc/profile
# java -version
# echo $PATH
Step 3: Configure SSH keys
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
$ chmod 600 ~/.ssh/authorized_keys
$ chmod 700 ~/.ssh/
Configure and restart the sshd service:
# vi /etc/ssh/sshd_config
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
# service sshd restart
Test that logging in to the local machine works:
$ ssh localhost
For each slave machine, run:
$ ssh-copy-id hadoop@slave
Test the login:
$ ssh slave
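Before moving on, it can help to confirm passwordless login actually works for every node. A small sketch (the host names master and slave are the ones used in this setup; BatchMode makes ssh fail instead of prompting for a password):

```shell
#!/bin/sh
# Report whether passwordless SSH works for each host given on the command line.
check_ssh() {
  for host in "$@"; do
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
      echo "OK   $host"
    else
      echo "FAIL $host"
    fi
  done
}

check_ssh master slave
```

Any FAIL line means ssh-copy-id (or the sshd_config settings above) still needs attention on that host.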
Step 4: Build Hadoop
To run 64-bit Hadoop on a 64-bit machine, the only option is to download the source and build it yourself, because Apache ships only 32-bit binaries for this release. A build toolchain is needed: Maven, Findbugs, protobuf, CMake, zlib, and so on.
# yum -y update
# yum -y install svn ncurses-devel gcc*
# yum -y install lzo-devel zlib-devel autoconf automake libtool cmake openssl-devel
# yum -y install gcc make cmake zlib zlib-devel openssl glibc-headers gcc-c++
Then check that the basics are all in place:
# rpm -ql zlib-devel
/usr/include/zconf.h
/usr/include/zlib.h
/usr/lib64/libz.so
/usr/lib64/pkgconfig/zlib.pc
/usr/share/doc/zlib-devel-1.2.3
/usr/share/doc/zlib-devel-1.2.3/README
/usr/share/doc/zlib-devel-1.2.3/algorithm.txt
/usr/share/doc/zlib-devel-1.2.3/example.c
/usr/share/doc/zlib-devel-1.2.3/minigzip.c
/usr/share/man/man3/zlib.3.gz
# rpm -ql zlib
/lib64/libz.so.1
/lib64/libz.so.1.2.3
/usr/share/doc/zlib-1.2.3
/usr/share/doc/zlib-1.2.3/ChangeLog
/usr/share/doc/zlib-1.2.3/FAQ
/usr/share/doc/zlib-1.2.3/README
# vi + /etc/profile    (append the following)
export LD_LIBRARY_PATH=/lib64:/usr/lib64
export ZLIB_INCLUDE_DIR=/lib64:/usr/lib64
1. Install Maven
wget http://mirror.bit.edu.cn/apache/maven/maven-3/3.2.5/binaries/apache-maven-3.2.5-bin.tar.gz
Extract it under /usr/local (giving /usr/local/apache-maven-3.2.5).
# vi + /etc/profile and add:
export MAVEN_HOME=/usr/local/apache-maven-3.2.5
export PATH=$PATH:$MAVEN_HOME/bin
Run source /etc/profile to apply it, then test:
$ echo $MAVEN_HOME
$ mvn -v
2. Install Ant
Download from Baidu cloud: http://pan.baidu.com/s/1c0vjhBy
$ tar xvzf apache-ant-1.9.4-bin.tar.gz
$ sudo mv apache-ant-1.9.4 /usr/local
Then add the environment variables in /etc/profile:
export ANT_HOME=/usr/local/apache-ant-1.9.4
export PATH=$PATH:$ANT_HOME/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ANT_HOME/lib
3. Install Protocol Buffers
Baidu cloud download: http://pan.baidu.com/s/1pJlZubT
$ cd protobuf-2.5.0
$ ./configure
$ make
$ make check
$ sudo make install
By default it installs to /usr/local/bin/protoc and /usr/local/lib/*.so. Finally, check:
protoc --version
The output should be:
libprotoc 2.5.0
4. Install Findbugs
$ wget ...
$ tar xvzf findbugs-3.0.0.tar.gz
$ mv findbugs-3.0.0 findbugs
$ sudo mv findbugs /usr/local/
$ vim /etc/profile
export FINDBUGS_HOME=/usr/local/findbugs
export PATH=$PATH:$FINDBUGS_HOME/bin
(This adds $FINDBUGS_HOME/bin to PATH.)
5. Build Hadoop
# cd ~
Download from the official site: http://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.6.0/hadoop-2.6.0-src.tar.gz
# tar zxvf hadoop-2.6.0-src.tar.gz -C /home/hadoop/
# chown -R hadoop /home/hadoop/
# source /etc/profile
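Before launching the long Maven build, it is worth verifying that the toolchain installed above is actually on the PATH. A small sketch (reporting only; install anything listed as missing before building):

```shell
#!/bin/sh
# Report which of the required build tools can be found on the PATH.
check_tools() {
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
    fi
  done
}

check_tools mvn ant protoc cmake gcc
```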
Enter the root of the Hadoop source tree:
# cd /home/hadoop/hadoop-2.6.0-src
# mvn clean package -Pdist,native -DskipTests -Dtar
or, building the docs as well while skipping the tests:
# mvn package -Pdist,native,docs -DskipTests -Dtar -Dmaven.test.skip=true
After a successful build, hadoop-dist/target/hadoop-2.6.0.tar.gz under the source tree is the file you need.
Copy the built file to each slave machine:
# scp hadoop-2.6.0.tar.gz hadoop@slave:~
On both the master and slave machines:
# tar -xzvf hadoop-2.6.0.tar.gz -C /home/hadoop
# chown -R hadoop:hadoop /home/hadoop
Step 5: Configure Hadoop
First configure environment variables for the hadoop user.
(If JAVA_HOME was not set in /etc/profile, it can also be set in .bash_profile.)
$ cd ~
$ vi + .bash_profile
#Hadoop variables
export HADOOP_HOME=/home/hadoop/hadoop-2.6.0
export HADOOP_INSTALL=/home/hadoop/hadoop-2.6.0
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
alias h='cd $HADOOP_HOME'
alias etc='cd $HADOOP_HOME/etc/hadoop'
$ source .bash_profile
Add the following to hadoop-env.sh and yarn-env.sh:
$ vim hadoop-env.sh
# modify JAVA_HOME
export JAVA_HOME=/usr/java/jdk1.7.0_71
.bash_profile must be edited as the hadoop user and copied to the slaves; it lives under /home/hadoop/:
$scp .bash_profile slave:~
If there are more machines, copy it to each of them the same way.
Directory layout
HDFS NameNode metadata        dfs.namenode.name.dir    /home/hadoop/hdfs/name
HDFS data files               dfs.datanode.data.dir    /home/hadoop/hdfs/data
HDFS NameNode checkpoint dir  fs.checkpoint.dir        /home/hadoop/hdfs/checkpoint
Temporary files               hadoop.tmp.dir           /home/hadoop/hdfs/tmp
Create the directories:
$ mkdir -p /home/hadoop/hdfs/name /home/hadoop/hdfs/data /home/hadoop/hdfs/checkpoint \
    /home/hadoop/hdfs/tmp /home/hadoop/hdfs/tmp/nodemanager/local \
    /home/hadoop/hdfs/tmp/nodemanager/remote /home/hadoop/hdfs/tmp/nodemanager/logs
Check that everything is owned by hadoop:
$ ll /home/hadoop/hdfs
drwxrwxr-x 2 hadoop hadoop 4096 Aug 12 09:25 checkpoint
drwxrwxr-x 2 hadoop hadoop 4096 Aug 12 09:25 data
drwxrwxr-x 3 hadoop hadoop 4096 Aug 13 00:23 name
drwxrwxr-x 4 hadoop hadoop 4096 Aug 12 23:13 tmp
Run the same commands on each slave machine.
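Since the same layout has to be created on every machine, it can be wrapped in a small helper. A sketch (the base path is a parameter; in this setup it is /home/hadoop/hdfs):

```shell
#!/bin/sh
# Create the local HDFS directory layout from the table above under a base
# directory, then list the top-level entries so ownership can be checked.
make_hdfs_dirs() {
  base="$1"
  for d in name data checkpoint tmp \
           tmp/nodemanager/local tmp/nodemanager/remote tmp/nodemanager/logs; do
    mkdir -p "$base/$d"
  done
  ls -ld "$base"/*
}

demo_base=$(mktemp -d)    # stand-in for /home/hadoop/hdfs
make_hdfs_dirs "$demo_base"
```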
On the master, enter the configuration directory: cd /home/hadoop/hadoop-2.6.0/etc/hadoop
Eight files need to be configured: yarn-env.sh and hadoop-env.sh (JAVA_HOME was already added to these two above), slaves, masters, core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml.
1. Configure etc/masters and etc/slaves
In the etc/hadoop directory:
$ echo "master" > masters
$ echo -ne "slave\n" > slaves
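With more slaves, the two files can be generated from a host list instead of typed by hand. A sketch (the hostnames master/slave are this guide's; one hostname per line is the format Hadoop's start scripts expect):

```shell
#!/bin/sh
# Generate the masters and slaves files from a host list.
write_node_files() {
  dir="$1"; shift
  master_host="$1"; shift
  printf '%s\n' "$master_host" > "$dir/masters"
  printf '%s\n' "$@" > "$dir/slaves"
}

conf_dir=$(mktemp -d)    # stand-in for etc/hadoop
write_node_files "$conf_dir" master slave
cat "$conf_dir/masters" "$conf_dir/slaves"
# → master
#   slave
```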
2. Configure core-site.xml
The hadoop.tmp.dir directory must exist first (it was created with mkdir above).
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.logfile.size</name>
    <value>104857600</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/hadoop/hdfs/tmp</value>
  </property>
</configuration>
3. Configure hdfs-site.xml
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>file:/home/hadoop/hdfs/name</value>
    <description>Path on the local filesystem where the NameNode stores the namespace and transaction logs persistently.</description>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>file:/home/hadoop/hdfs/data</value>
    <description>Comma-separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
4. Configure mapred-site.xml
First copy mapred-site.xml.template to mapred-site.xml:
$ cp mapred-site.xml.template mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    <description>Execution framework set to Hadoop YARN.</description>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>master:10020</value>
    <description>MapReduce JobHistory Server host:port; the default port is 10020.</description>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master:19888</value>
    <description>MapReduce JobHistory Server Web UI host:port; the default port is 19888.</description>
  </property>
</configuration>
5. Configure yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
    <description>Shuffle service that needs to be set for MapReduce to run.</description>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8032</value>
    <description>The host is the hostname of the ResourceManager and the port is the port on which the clients can talk to the ResourceManager.</description>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
    <description>The host is the hostname of the ResourceManager and the port is the port on which the applications in the cluster talk to the ResourceManager.</description>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8031</value>
    <description>The host is the hostname of the ResourceManager and the port is the port on which the NodeManagers contact the ResourceManager.</description>
  </property>
</configuration>
Copy the configuration files to each slave:
$ scp mapred-site.xml core-site.xml hdfs-site.xml yarn-site.xml masters slaves hadoop-env.sh yarn-env.sh slave:~/hadoop-2.6.0/etc/hadoop/
Step 6: Start the Hadoop services
Enter /home/hadoop/hadoop-2.6.0/bin/ and format the namenode:
$ hadoop namenode -format
or
$ hdfs namenode -format
Format only once: if the DFS already holds data, reformatting will lose it.
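Because reformatting a NameNode that already holds data destroys its metadata, a guard can prevent accidents. A sketch (the name directory path is the dfs.name.dir chosen above; the format command is passed as the remaining arguments so the function stays testable):

```shell
#!/bin/sh
# Only run the format command if the NameNode's name directory is empty.
safe_format() {
  name_dir="$1"; shift
  if [ -d "$name_dir" ] && [ -n "$(ls -A "$name_dir" 2>/dev/null)" ]; then
    echo "refusing to format: $name_dir is not empty" >&2
    return 1
  fi
  "$@"
}

# Real usage would be: safe_format /home/hadoop/hdfs/name hdfs namenode -format
demo_dir=$(mktemp -d)
safe_format "$demo_dir" echo "would run: hdfs namenode -format"
```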
Starting the services only needs to be done on the master:
start-all.sh
DataNode information can be viewed at master:50070,
or from the command line:
$ hdfs dfsadmin -report
The output looks something like:
Configured Capacity: 18645180416 (17.36 GB)
Present Capacity: 12578476032 (11.71 GB)
DFS Remaining: 12578250752 (11.71 GB)
DFS Used: 225280 (220 KB)
DFS Used%: 0.00%
Under replicated blocks: 6
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Live datanodes (1):
Name: 192.168.136.130:50010 (slave)
Hostname: slave
Decommission Status : Normal
Configured Capacity: 18645180416 (17.36 GB)
DFS Used: 225280 (220 KB)
Non DFS Used: 6066704384 (5.65 GB)
DFS Remaining: 12578250752 (11.71 GB)
DFS Used%: 0.00%
DFS Remaining%: 67.46%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Aug 13 16:39:54 CST 2015
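The report is verbose; for a scripted health check, the live-DataNode count can be extracted from it. A sketch, parsing the `Live datanodes (N):` line shown above:

```shell
#!/bin/sh
# Extract the live DataNode count from `hdfs dfsadmin -report` output on stdin.
live_datanodes() {
  sed -n 's/^Live datanodes (\([0-9][0-9]*\)):.*/\1/p'
}

# Example on a captured report line; in practice:
#   hdfs dfsadmin -report | live_datanodes
echo "Live datanodes (1):" | live_datanodes
# → 1
```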
Step 7: Test and access the Hadoop services
On the master machine, the key is to have core-site.xml, hdfs-site.xml, yarn-site.xml, and mapred-site.xml configured correctly.
Run start-all.sh on the master:
$ start-all.sh
1. Create a local directory and files:
$ mkdir /home/hadoop/input
$ cd /home/hadoop/input
$ touch wordcount1.txt
$ touch wordcount2.txt
2. Add some content:
$ echo "Hello World" > wordcount1.txt
$ echo "Hello Hadoop" > wordcount2.txt
3. Create the input directory on HDFS:
$ hadoop fs -mkdir /input
4. Copy the files into /input:
$ hadoop fs -put /home/hadoop/input/* /input
5. Run the example job:
$ hadoop jar /home/hadoop/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /input /output
6. When it finishes, list the output directory:
$ hadoop fs -ls /output
7. Check the result; if everything worked it looks like:
$ hadoop fs -cat /output/part-r-00000
Hadoop  1
Hello   2
World   1
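To sanity-check the expected result without a running cluster, the same word count can be reproduced in plain shell over the two local files. A sketch:

```shell
#!/bin/sh
# Reproduce the wordcount example locally: split words onto separate lines,
# then count occurrences of each word, printing "word count" like part-r-00000.
dir=$(mktemp -d)
echo "Hello World"  > "$dir/wordcount1.txt"
echo "Hello Hadoop" > "$dir/wordcount2.txt"

result=$(cat "$dir"/*.txt | tr -s ' ' '\n' | sort | uniq -c | awk '{print $2, $1}')
echo "$result"
# → Hadoop 1
#   Hello 2
#   World 1
```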
Notes:
Contents of /etc/profile:
export JAVA_HOME=/usr/java/jdk1.7.0_71
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
export LD_LIBRARY_PATH=/lib64:/usr/lib64
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
export ZLIB_INCLUDE_DIR=/lib64:/usr/lib64
export MAVEN_HOME=/usr/local/apache-maven-3.2.5
export PATH=$PATH:$MAVEN_HOME/bin
export ANT_HOME=/usr/local/apache-ant-1.9.4
export PATH=$PATH:$ANT_HOME/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ANT_HOME/lib
export FINDBUGS_HOME=/usr/local/findbugs
export PATH=$PATH:$FINDBUGS_HOME/bin
Contents of ~/.bash_profile:
#Hadoop variables
export HADOOP_HOME=/home/hadoop/hadoop-2.6.0
export HADOOP_INSTALL=/home/hadoop/hadoop-2.6.0
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
alias h='cd $HADOOP_HOME'
alias etc='cd $HADOOP_HOME/etc/hadoop'
In the directory /home/hadoop/hadoop-2.6.0/etc/hadoop:
Contents of slaves:
slave
Contents of masters:
master
References:
Installation guide: http://blog.csdn.net/licongcong_0224/article/details/12972889
Hadoop 2.3 complete setup walkthrough: http://www.debugo.com/hadoop2-3-install/