Hadoop 2.7.0 Study Notes: Pseudo-Distributed Setup



These notes follow the Jikexueyuan video course:
http://www.jikexueyuan.com/course/2089_3.html?ss=1
Required materials
1. Red Hat Enterprise Linux 6.4 Server (rhel-server-6.4-x86_64-dvd) | download
2. hadoop-2.7.0
   1. Part 1
   2. Part 2
   3. Part 3

   After downloading, right-click Part 1 and extract it.

3. hbase-0.98.13-hadoop2-bin | download
4. jdk-7u80-linux-x64 | download

Disable the firewall

Open a terminal and run:
service iptables stop    (stops the firewall now; the change is lost after a reboot)
chkconfig iptables off   (disables the firewall permanently)
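To double-check that the firewall is really off, you can query its status (an optional verification step, not in the original):
service iptables status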

Disable SELinux

Run:
vim /etc/sysconfig/selinux
Press i to enter insert mode.
Set: SELINUX=disabled
Press Esc to leave insert mode, then type :wq! and press Enter to save and quit (or press Shift+Z+Z).
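SELINUX=disabled only takes effect after a reboot. To switch SELinux off for the current session right away, you can additionally run (standard commands, added here for convenience):
setenforce 0
getenforce    (should now print Permissive)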

Configure the host IP

Give the machine a static IP address (a configuration sketch follows below).
After configuring the IP, restart the network:
service network restart
Set the virtual machine's network adapter to bridged mode.
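The static-IP sketch referenced above, as one possible example (it assumes the interface is eth0 and the 192.168.20.x addresses used throughout this guide; the gateway and DNS values are placeholders to adjust for your own network):
vi /etc/sysconfig/network-scripts/ifcfg-eth0

DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.20.140
NETMASK=255.255.255.0
GATEWAY=192.168.20.1
DNS1=192.168.20.1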

Configure the hostname

Edit the network configuration:
vi /etc/sysconfig/network
Change the hostname to hbase02.pzr.com

Map the IP to the hostname:
vi /etc/hosts
Add: 192.168.20.140 hbase02.pzr.com hbase02
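A quick way to confirm the mapping resolves (assuming the entry above was added):
ping -c 1 hbase02.pzr.com    (should resolve to 192.168.20.140)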

Configure passwordless SSH login

Generate a key pair (note that ssh-keygen is a single word, with no space):
ssh-keygen -t rsa
Press Enter at each prompt to accept the defaults.

Copy the public key to this machine:
ssh-copy-id 192.168.20.140

Test it:
ssh 192.168.20.140
If you are logged in without being prompted for a password, it works.

Reboot:
reboot

Install the JDK

Download the RPM and install it:
rpm -ivh jdk-7u80-linux-x64.rpm
Java is installed under /usr/java.
Add the following to /etc/profile:

export JAVA_HOME=/usr/java/jdk1.7.0_80
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
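To apply the new variables to the current shell and verify the installation (standard follow-up steps, not shown in the original):
source /etc/profile
java -version    (should report version 1.7.0_80)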

Set up the Hadoop environment

  1. Upload the files
  2. Configure HDFS and YARN
  3. Format the NameNode
  4. Start and test

Upload the files

Upload the pre-built Hadoop tarball to the server, then extract it:
tar -zxf hadoop-2.7.0.tar.gz -C ../soft/

Configure Hadoop

Reference: http://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SingleCluster.html

Scroll down that page to see the recommended settings for each configuration file.

Configure the Java path.
Edit hadoop-2.7.0/etc/hadoop/hadoop-env.sh:
find JAVA_HOME and change it to the JAVA_HOME configured earlier:

export JAVA_HOME=/usr/java/jdk1.7.0_80

Edit hadoop-2.7.0/etc/hadoop/core-site.xml.
hadoop.tmp.dir is the directory Hadoop uses for temporary/working data.
Use the fixed IP address in fs.defaultFS rather than localhost; otherwise jobs may fail later.

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://192.168.20.140:8032</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/bigdata/soft/hadoop-2.7.0/data/tmp</value>
    </property>
</configuration>

Edit hadoop-2.7.0/etc/hadoop/hdfs-site.xml (dfs.replication sets the number of block replicas; 1 is enough for a pseudo-distributed setup):

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

Copy hadoop-2.7.0/etc/hadoop/mapred-site.xml.template to mapred-site.xml and edit it:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

Edit hadoop-2.7.0/etc/hadoop/yarn-site.xml.
This contains a few more properties than the video shows (NodeManager resource limits and the scheduler's minimum allocation).

<?xml version="1.0"?>
<configuration>
    <!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>127.0.0.1:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>127.0.0.1:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>127.0.0.1:8031</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>3072</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>2</value>
    </property>
    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>256</value>
    </property>
</configuration>

Start Hadoop

The first time Hadoop is started, the NameNode must be formatted.
To see the available commands:
bin/hdfs

Format command: bin/hdfs namenode -format

If the output reports that the storage directory was successfully formatted, formatting succeeded.

The following scripts are in the sbin directory; if it is on your PATH you can run them directly, otherwise run them from that directory.
start-dfs.sh    starts the Hadoop HDFS daemons: NameNode, SecondaryNameNode, and DataNode
start-yarn.sh   starts the Hadoop YARN daemons: ResourceManager and NodeManager
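A quick sanity check after running both scripts (typical process list for a pseudo-distributed setup):
jps    (should show NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager)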

测试是否启动成功

登录Hadoop 的管理页面:http://部署服务器ip:50070\
看到以下页面,说明部署成功

创建目录:-p说明创建多级目录
bin/hadoop fs -mkdir -p /user/root/my/in
上传文件
bin/hadoop fs -put /etc/profile /user/root/my/in
在管理页面可以看到创建的目录以及上传的文件


Run a job: count the words in the uploaded file and write the results to the specified output directory.
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar wordcount /user/root/my/in/profile /user/root/my/out
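When the job finishes, the word counts can be read straight from HDFS; part-r-00000 is the standard output file name for a single-reducer job:
bin/hadoop fs -cat /user/root/my/out/part-r-00000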

The job and its output directory are also visible in the web UI.

Set up the HBase environment

Download and extract

http://archive.apache.org/dist/hbase/0.98.13/hbase-0.98.13-hadoop2-bin.tar.gz

Extract the archive:

tar -zxf hbase-0.98.13-hadoop2-bin.tar.gz -C ../soft/

Configuration

Edit hbase-0.98.13-hadoop2/conf/hbase-env.sh:
find JAVA_HOME, uncomment it, and set it to the server's JAVA_HOME:

export JAVA_HOME=/usr/java/jdk1.7.0_80

Edit hbase-0.98.13-hadoop2/conf/hbase-site.xml.
The value of hbase.zookeeper.property.dataDir is a directory you create yourself:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://192.168.20.140:8032/hbase</value>
    </property>
    <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/usr/local/bigdata/soft/hbase-0.98.13-hadoop2/data/zkData</value>
    </property>
</configuration>

Edit hbase-0.98.13-hadoop2/conf/regionservers:
change localhost to the fixed IP:

192.168.20.140

Start HBase

The order matters:
hbase-daemon.sh start zookeeper
hbase-daemon.sh start master
hbase-daemon.sh start regionserver

Test whether startup succeeded

Run jps and check that the expected processes are running.
Run hbase to see the available commands.
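For reference, the typical process names in a setup like this one (in addition to the Hadoop daemons):
jps    (should also show HQuorumPeer, HMaster, and HRegionServer)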

Stop HBase

Stop in the reverse order of startup:
hbase-daemon.sh stop regionserver
hbase-daemon.sh stop master
hbase-daemon.sh stop zookeeper

Common commands

To see other commands, just run hbase.

Enter the shell

hbase shell

List tables

list

Create a table

create '<table name>','<column family>'
create 'mytest','info'

Delete a table

Disable the table first, then drop it:
disable 'mytest'   (disable)
drop 'mytest'      (drop)

Insert data

put '<table name>','<rowkey>','<column family:column>','<value>'
put 'mytest','rk0001','info:name','myname'

Update data

Put a new value to the same rowkey and column; the latest put becomes the current value.
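For example, putting a new (hypothetical) value to the same rowkey and column overwrites what get and scan return:
put 'mytest','rk0001','info:name','newname'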

Delete data

deleteall 'mytest','rk0001'              delete an entire row
deleteall 'mytest','rk0001','info:age'   delete a specific column
There are other delete commands; type del and press the Tab key to see them.

Query data

scan 'mytest'                        scan the whole table
scan 'mytest',{LIMIT=>10}            scan the first 10 rows
get 'mytest','rk0001'                get a row by rowkey
get 'mytest','rk0001','info:name'    get a specific column by rowkey and column name
More command usage: http://www.cnblogs.com/nexiyi/p/hbase_shell.html

HBase web UI address

http://192.168.20.140:60010/master-status
