Hadoop pseudo-distributed deployment and testing
Installing and running pseudo-distributed Hadoop on Ubuntu 10.04 (using release 0.20.2 as the example).
Download Hadoop from http://www.apache.org/dyn/closer.cgi/hadoop/common/ — pick a mirror, then pick a release.
All commands below are run from the Hadoop home directory.
Prerequisites
1. Install the JDK.
2. Unpack the downloaded Hadoop release. Edit conf/hadoop-env.sh and, at minimum, set JAVA_HOME to the root of your Java installation.
3. Install ssh. Open a terminal and run:
sudo apt-get install openssh-server
Then generate an ssh key pair:
cd ~
ssh-keygen -t rsa (press Enter at the passphrase prompts to create a key without a passphrase)
cd .ssh
cat id_rsa.pub >> authorized_keys (this authorizes the key for passwordless login; >> appends rather than overwriting)
If you hit the following ssh error during setup:
Agent admitted failure to sign using the key
run:
ssh-add ~/.ssh/id_rsa
and passwordless login should then work.
Note: if ssh-add itself fails with "Could not open a connection to your authentication agent", first run:
ssh-agent bash
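The key setup above can be exercised end to end. Here is a minimal sketch run against a throwaway directory, so it never touches your real ~/.ssh (the paths are illustrative only):

```shell
# Scratch directory standing in for ~/.ssh (illustrative path).
SSHDIR=$(mktemp -d)
# -N "" gives an empty passphrase, so login later needs no password.
ssh-keygen -q -t rsa -N "" -f "$SSHDIR/id_rsa"
# Append (>>), never overwrite, so any existing authorized keys survive.
cat "$SSHDIR/id_rsa.pub" >> "$SSHDIR/authorized_keys"
# sshd ignores authorized_keys files with loose permissions.
chmod 600 "$SSHDIR/authorized_keys"
```

With the real ~/.ssh in place of $SSHDIR, `ssh localhost` should then log in without prompting for a password.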
Pseudo-distributed operation
Hadoop can run on a single node in so-called pseudo-distributed mode, where each Hadoop daemon runs as a separate Java process.
Use the following configuration files.
conf/core-site.xml:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://192.168.0.101:9000</value>
</property>
</configuration>
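Note that with only the settings above, Hadoop keeps its working data under /tmp (the format log later shows /tmp/hadoop-root/dfs/name), which many systems clear on reboot. A commonly added property is hadoop.tmp.dir, placed inside the same <configuration> block; the path below is only an example:

```xml
<!-- Optional: keep HDFS data out of /tmp so a reboot does not
     wipe the formatted namenode. The path is an example. -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/opt/hadoop/tmp</value>
</property>
```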
conf/hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
conf/mapred-site.xml:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>192.168.0.101:9001</value>
</property>
</configuration>
First, ask the namenode to format a new distributed filesystem. This step may already have been done during installation, but it is useful to know how to produce a clean filesystem when needed.
bin/hadoop namenode -format
Output:
11/11/30 09:53:56 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = ubuntu1/192.168.0.101
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
11/11/30 09:53:56 INFO namenode.FSNamesystem: fsOwner=root,root
11/11/30 09:53:56 INFO namenode.FSNamesystem: supergroup=supergroup
11/11/30 09:53:56 INFO namenode.FSNamesystem: isPermissionEnabled=true
11/11/30 09:53:56 INFO common.Storage: Image file of size 94 saved in 0 seconds.
11/11/30 09:53:57 INFO common.Storage: Storage directory /tmp/hadoop-root/dfs/name has been successfully formatted.
11/11/30 09:53:57 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ubuntu1/192.168.0.101
************************************************************/
Run: bin/start-all.sh
Output:
starting namenode, logging to /usr/hadoop-0.20.2/bin/../logs/hadoop-root-namenode-ubuntu1.out
localhost: starting datanode, logging to /usr/hadoop-0.20.2/bin/../logs/hadoop-root-datanode-ubuntu1.out
localhost: starting secondarynamenode, logging to /usr/hadoop-0.20.2/bin/../logs/hadoop-root-secondarynamenode-ubuntu1.out
starting jobtracker, logging to /usr/hadoop-0.20.2/bin/../logs/hadoop-root-jobtracker-ubuntu1.out
localhost: starting tasktracker, logging to /usr/hadoop-0.20.2/bin/../logs/hadoop-root-tasktracker-ubuntu1.out
Check HDFS:
bin/hadoop fs -ls /
If this lists the directory contents, HDFS is up.
Basic Hadoop filesystem operations:
bin/hadoop fs -mkdir test
bin/hadoop fs -ls test
bin/hadoop fs -rmr test
Test Hadoop:
bin/hadoop fs -mkdir input
Create two text files, file1 and file2, and put them in /opt/hadoop/sourcedata.
Run: bin/hadoop fs -put /opt/hadoop/sourcedata/file* input
Run: bin/hadoop jar hadoop-0.20.2-examples.jar wordcount input output
Output:
11/11/30 10:15:38 INFO input.FileInputFormat: Total input paths to process : 2
11/11/30 10:15:52 INFO mapred.JobClient: Running job: job_201111301005_0001
11/11/30 10:15:53 INFO mapred.JobClient:  map 0% reduce 0%
11/11/30 10:19:07 INFO mapred.JobClient:  map 50% reduce 0%
11/11/30 10:19:14 INFO mapred.JobClient:  map 100% reduce 0%
11/11/30 10:19:46 INFO mapred.JobClient:  map 100% reduce 100%
11/11/30 10:19:54 INFO mapred.JobClient: Job complete: job_201111301005_0001
11/11/30 10:19:59 INFO mapred.JobClient: Counters: 17
11/11/30 10:19:59 INFO mapred.JobClient:   Job Counters
11/11/30 10:19:59 INFO mapred.JobClient:     Launched reduce tasks=1
11/11/30 10:19:59 INFO mapred.JobClient:     Launched map tasks=2
11/11/30 10:19:59 INFO mapred.JobClient:     Data-local map tasks=2
11/11/30 10:19:59 INFO mapred.JobClient:   FileSystemCounters
11/11/30 10:19:59 INFO mapred.JobClient:     FILE_BYTES_READ=146
11/11/30 10:19:59 INFO mapred.JobClient:     HDFS_BYTES_READ=64
11/11/30 10:19:59 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=362
11/11/30 10:19:59 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=60
11/11/30 10:19:59 INFO mapred.JobClient:   Map-Reduce Framework
11/11/30 10:19:59 INFO mapred.JobClient:     Reduce input groups=9
11/11/30 10:19:59 INFO mapred.JobClient:     Combine output records=13
11/11/30 10:19:59 INFO mapred.JobClient:     Map input records=2
11/11/30 10:19:59 INFO mapred.JobClient:     Reduce shuffle bytes=102
11/11/30 10:19:59 INFO mapred.JobClient:     Reduce output records=9
11/11/30 10:19:59 INFO mapred.JobClient:     Spilled Records=26
11/11/30 10:19:59 INFO mapred.JobClient:     Map output bytes=120
11/11/30 10:19:59 INFO mapred.JobClient:     Combine input records=14
11/11/30 10:19:59 INFO mapred.JobClient:     Map output records=14
11/11/30 10:19:59 INFO mapred.JobClient:     Reduce input records=13
The job ran successfully!
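As a sanity check, the same word counting can be reproduced locally with a standard shell pipeline. The two sample files below are hypothetical stand-ins for file1 and file2; your actual counts depend on your files:

```shell
# Hypothetical sample inputs standing in for file1 and file2.
printf 'hello hadoop\nhello world\n' > file1
printf 'hadoop runs hadoop jobs\n' > file2

# map: split each line into words; reduce: count occurrences per word.
cat file1 file2 | tr -s ' ' '\n' | sort | uniq -c | sort -rn
# With these inputs, "hadoop" appears 3 times, "hello" twice, the rest once.
```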
Other commands for inspecting the results:
bin/hadoop fs -ls /user/root/output
bin/hadoop fs -cat output/part-r-00000
bin/hadoop fs -cat output/part-r-00000 | head -13
bin/hadoop fs -get output/part-r-00000 output.txt
cat output.txt | head -5
bin/hadoop fs -rmr output
The results can also be viewed in a browser:
http://192.168.0.101:50030 (MapReduce web UI)
http://192.168.0.101:50070 (HDFS web UI)
Next, run the grep MapReduce example:
Run: bin/hadoop fs -rmr output
Run: bin/hadoop jar hadoop-0.20.2-examples.jar grep input output 'hadoop'
Output:
11/11/30 10:28:37 INFO mapred.FileInputFormat: Total input paths to process : 2
11/11/30 10:28:40 INFO mapred.JobClient: Running job: job_201111301005_0002
11/11/30 10:28:41 INFO mapred.JobClient:  map 0% reduce 0%
11/11/30 10:34:16 INFO mapred.JobClient:  map 66% reduce 0%
11/11/30 10:37:40 INFO mapred.JobClient:  map 100% reduce 11%
11/11/30 10:37:50 INFO mapred.JobClient:  map 100% reduce 22%
11/11/30 10:37:54 INFO mapred.JobClient:  map 100% reduce 66%
11/11/30 10:38:15 INFO mapred.JobClient:  map 100% reduce 100%
11/11/30 10:38:30 INFO mapred.JobClient: Job complete: job_201111301005_0002
11/11/30 10:38:32 INFO mapred.JobClient: Counters: 18
11/11/30 10:38:32 INFO mapred.JobClient:   Job Counters
11/11/30 10:38:32 INFO mapred.JobClient:     Launched reduce tasks=1
11/11/30 10:38:32 INFO mapred.JobClient:     Launched map tasks=3
11/11/30 10:38:32 INFO mapred.JobClient:     Data-local map tasks=3
11/11/30 10:38:32 INFO mapred.JobClient:   FileSystemCounters
11/11/30 10:38:32 INFO mapred.JobClient:     FILE_BYTES_READ=40
11/11/30 10:38:32 INFO mapred.JobClient:     HDFS_BYTES_READ=77
11/11/30 10:38:32 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=188
11/11/30 10:38:32 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=109
11/11/30 10:38:32 INFO mapred.JobClient:   Map-Reduce Framework
11/11/30 10:38:32 INFO mapred.JobClient:     Reduce input groups=1
11/11/30 10:38:32 INFO mapred.JobClient:     Combine output records=2
11/11/30 10:38:32 INFO mapred.JobClient:     Map input records=2
11/11/30 10:38:32 INFO mapred.JobClient:     Reduce shuffle bytes=46
11/11/30 10:38:32 INFO mapred.JobClient:     Reduce output records=1
11/11/30 10:38:32 INFO mapred.JobClient:     Spilled Records=4
11/11/30 10:38:32 INFO mapred.JobClient:     Map output bytes=30
11/11/30 10:38:32 INFO mapred.JobClient:     Map input bytes=64
11/11/30 10:38:32 INFO mapred.JobClient:     Combine input records=2
11/11/30 10:38:32 INFO mapred.JobClient:     Map output records=2
11/11/30 10:38:32 INFO mapred.JobClient:     Reduce input records=2
11/11/30 10:38:36 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
Run: bin/hadoop fs -cat output/part-00000
Output: 2 hadoop
(The grep example writes one line per matched string: the match count, then the matched text.)
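The value 2 is simply the total number of regex matches across the input. The equivalent check with ordinary Unix tools, using hypothetical sample files (the job's real count depends on what file1 and file2 contain):

```shell
# Hypothetical inputs; your actual count depends on your files.
printf 'hello hadoop\nhello world\n' > file1
printf 'see hadoop run\n' > file2
# -o prints each regex match on its own line; wc -l totals them.
grep -o 'hadoop' file1 file2 | wc -l
```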
Adapted, with corrections and additions, from: http://m.blog.csdn.net/blog/rjhym/8269977