Foolproof RPM install of hadoop-1.2.1: notes on standalone + HDFS + MapReduce
Source: Internet · Editor: 程序博客网 · Posted: 2024/05/16 08:54
System: RHEL 6.4 64-bit, running in VMware Player
Hostname: rhel64 (substitute your own)
-------------------------------------
Download:
wget http://mirror.esocc.com/apache/hadoop/common/stable1/hadoop-1.2.1-1.x86_64.rpm
wget http://download.oracle.com/otn-pub/java/jdk/7u45-b18/jdk-7u45-linux-x64.rpm
Install the RPMs first:
rpm -ivh jdk-7u45-linux-x64.rpm
rpm -ivh hadoop-1.2.1-1.x86_64.rpm
This automatically creates two system users, hdfs and mapred.
Hadoop ends up with three relevant users:
root and mapred are ordinary users;
hdfs is the privileged user that can administer the file system;
mapred is the user that runs jobs.
Environment variables (optional)
Add to /etc/profile (vi /etc/profile):
export JAVA_HOME=/usr/java/jdk1.7.0_45/
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
PATH=$PATH:$JAVA_HOME/bin
Then run:
source /etc/profile
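A quick way to confirm the variables took effect is to check that JAVA_HOME actually points at a JDK. This is only a sketch: `check_java_home` is a hypothetical helper, and the path assumes the jdk1.7.0_45 RPM layout above.

```shell
# check_java_home DIR: succeed only if DIR looks like a usable JDK home,
# i.e. it contains an executable bin/java
check_java_home() {
    test -x "$1/bin/java"
}

# After sourcing /etc/profile on the installed box:
# check_java_home "$JAVA_HOME" && echo "JAVA_HOME ok" || echo "JAVA_HOME broken"
```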
-------------------------------------
Final configuration (HDFS + MapReduce, minimal: 3 files, 6 properties to set):
JAVA_HOME in hadoop-env.sh was already set automatically by the RPM install.
vi /etc/hadoop/core-site.xml, contents:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://rhel64:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/var/tmp/hadoop</value>
  </property>
</configuration>
Note: hadoop.tmp.dir is interpreted relative to fs.default.name, i.e. it lives inside HDFS.
If you leave it unset, does it end up under /tmp? Hard to say; it may still sit under dfs.data.dir.
As long as dfs.data.dir is not under /tmp you are fine.
Left unset, it ends up as ${fs.default.name}/tmp/hadoop-mapred.
vi /etc/hadoop/hdfs-site.xml, contents:
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/hdfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/hdfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
vi /etc/hadoop/mapred-site.xml, contents:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>rhel64:9001</value>
  </property>
</configuration>
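With the three files in place, a quick grep can confirm that all six property names are actually set. This is only a sketch: `list_site_props` is a hypothetical helper, not part of Hadoop.

```shell
# list_site_props DIR: print the <name> entries found in the three
# *-site.xml files under DIR (on the RPM install, DIR is /etc/hadoop)
list_site_props() {
    for f in core-site.xml hdfs-site.xml mapred-site.xml; do
        echo "== $f =="
        grep -o '<name>[^<]*</name>' "$1/$f" | sed 's/<[^>]*>//g'
    done
}

# Usage on the installed box:
# list_site_props /etc/hadoop
```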
-------------------------------------
Directories to create:
mkdir -p /home/hdfs/data
chown -R hdfs:hadoop /home/hdfs
HDFS must be running before you can create directories on it:
/etc/init.d/hadoop-namenode start
/etc/init.d/hadoop-datanode start
mkdir -p /var/tmp/hadoop/dfs/namesecondary
chown -R hdfs:hadoop /var/tmp/hadoop/dfs
sudo -u hdfs hadoop dfs -mkdir /var/tmp/hadoop
sudo -u hdfs hadoop dfs -chown mapred /var/tmp/hadoop
sudo -u hdfs hadoop dfs -mkdir /user/mapred
sudo -u hdfs hadoop dfs -chown mapred /user/mapred
Note: the chown inside HDFS must be run as hdfs; even root lacks the permission.
-------------------------------------
Commands to run
First run only:
hadoop namenode -format
If it errors out, you can simply delete the name directory and try again.
Start:
/etc/init.d/hadoop-namenode start
/etc/init.d/hadoop-secondarynamenode start
/etc/init.d/hadoop-jobtracker start
/etc/init.d/hadoop-datanode start
/etc/init.d/hadoop-tasktracker start
Stop:
/etc/init.d/hadoop-namenode stop
/etc/init.d/hadoop-secondarynamenode stop
/etc/init.d/hadoop-jobtracker stop
/etc/init.d/hadoop-datanode stop
/etc/init.d/hadoop-tasktracker stop
Not needed:
/etc/init.d/hadoop-historyserver start
Running it fails, because it is started along with the jobtracker.
-------------------------------------
Verification
jps
(requires JAVA_HOME to be configured and added to PATH)
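On a healthy node, jps should list the five Hadoop 1.x daemons: NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker. A sketch that flags any missing one (`expect_daemons` is a hypothetical helper, not a Hadoop command):

```shell
# expect_daemons "OUTPUT": scan captured jps output for the five
# standard Hadoop 1.x daemon names and report each as up or MISSING
expect_daemons() {
    out="$1"
    for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        if echo "$out" | grep -qw "$d"; then
            echo "$d up"
        else
            echo "$d MISSING"
        fi
    done
}

# Usage on the installed box:
# expect_daemons "$(jps)"
```

Note that `grep -qw NameNode` matches only the whole word, so the SecondaryNameNode line does not count as the NameNode.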
Web UI ports (no configuration needed; they are bound automatically)
To browse from the console, install lynx:
hdfs:
lynx rhel64:50070
jobtracker:
lynx rhel64:50030
secondarynamenode:
lynx rhel64:50090
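From a script, the three UI ports can also be probed directly. This is a bash-only sketch using the /dev/tcp redirection; `check_port` is a hypothetical helper.

```shell
# check_port HOST PORT: succeed if something accepts a TCP connection
# (bash-specific: uses the /dev/tcp pseudo-device)
check_port() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# On the installed box, against the standard Hadoop 1.x UI ports:
# for p in 50070 50030 50090; do
#     check_port rhel64 "$p" && echo "port $p up" || echo "port $p down"
# done
```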
-------------------------------------
Viewing logs:
vi /var/log/hadoop/hdfs/hadoop-hdfs-*node-rhel64.log
vi /var/log/hadoop/mapred/hadoop-mapred-*-rhel64.log
-------------------------------------
Common HDFS commands:
hadoop fs -ls
sudo -u hdfs hadoop fs -mkdir my1
sudo -u hdfs hadoop fs -chown mapred /user/mapred
hadoop fs -cat /user/mapred/random-data/part-00001
-------------------------------------
Running a test job
sudo -u mapred hadoop jar /usr/share/hadoop/hadoop-examples-1.2.1.jar randomwriter /user/mapred/random-data
As a script:
vi hadoop-first-run.sh
Contents:
# hadoop-first-run.sh
rm /home/hdfs/* -rf
rm /var/log/hadoop/hdfs/* -f
rm /var/log/hadoop/mapred/* -f
hadoop namenode -format
mkdir /home/hdfs/data
chown hdfs:hadoop -R /home/hdfs
mkdir -p /var/tmp/hadoop/dfs/namesecondary
chown hdfs:hadoop -R /var/tmp/hadoop/dfs
echo 'starting namenode...'
/etc/init.d/hadoop-namenode start
echo 'starting datanode...'
/etc/init.d/hadoop-datanode start
hadoop dfs -mkdir /var
hadoop dfs -mkdir /var/tmp
hadoop dfs -mkdir /var/tmp/hadoop
sudo -u hdfs hadoop dfs -chown mapred /var/tmp/hadoop
echo 'starting secondary namenode...'
/etc/init.d/hadoop-secondarynamenode start
echo 'starting job tracker...'
/etc/init.d/hadoop-jobtracker start
echo 'starting task tracker...'
/etc/init.d/hadoop-tasktracker start
Note: the home directory and logs are deleted up front to make sure no leftover files interfere;
you can skip the deletions if you prefer.
Usage:
Install the jdk and hadoop RPMs first, set up the 3 config files, then run:
bash hadoop-first-run.sh
If all goes well, jps shows the 5 relevant processes, the 3 admin pages open in lynx or a browser, and the test job runs.
If something goes wrong, check the logs to find the cause.