Hadoop, Part 2: Setting up a Single Node Cluster
I. Local (Standalone) Mode
Run the example jar directly with the hadoop command. The following is the official example (note the regex is 'dfs[a-z.]+'; with a typo like 'dsf' the job finds nothing):
root@ubuntu:~/hadoop/output# cd $HADOOP_HOME
root@ubuntu:~/hadoop# mkdir input
root@ubuntu:~/hadoop# cp etc/hadoop/*.xml input
root@ubuntu:~/hadoop# hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar grep input output 'dfs[a-z.]+'
root@ubuntu:~/hadoop# cat output/*
II. Pseudo-Distributed Mode
1. Set up passwordless SSH login
root@ubuntu:~# ssh localhost
2. If you are prompted for a password, you need to install a public key:
root@ubuntu:~# ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
Generating public/private dsa key pair.
Your identification has been saved in /root/.ssh/id_dsa.
Your public key has been saved in /root/.ssh/id_dsa.pub.
The key fingerprint is:
SHA256:9xUa1hH5XJJQYr3G5AU5rapYcYXNICK8hIgshfI/SWQ root@ubuntu
(key randomart image omitted)
root@ubuntu:~# cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized.keys
root@ubuntu:~# chmod 0600 ~/.ssh/authorized.keys
root@ubuntu:~# ssh localhost
root@localhost's password:
PS: Following the official documentation here did not give passwordless login; switching to RSA made it work (which was puzzling at the time, since the two procedures look identical). Two things likely explain it: the transcript above appends the key to ~/.ssh/authorized.keys, while sshd only reads ~/.ssh/authorized_keys; and on Ubuntu 16.04 (OpenSSH 7.x) ssh-dss keys are disabled by default anyway, so a DSA key is rejected unless explicitly re-enabled.
root@ubuntu:~# ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
Generating public/private rsa key pair.
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:bc4qQR2NMHsGVL6NzNSJ5ycuUBiqXtS9jB1BQJGXxsM root@ubuntu
(key randomart image omitted)
root@ubuntu:~# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
root@ubuntu:~# chmod 0600 ~/.ssh/authorized_keys
root@ubuntu:~# ssh localhost
Welcome to Ubuntu 16.04 LTS (GNU/Linux 4.4.0-21-generic x86_64)
 * Documentation: https://help.ubuntu.com/
Last login: Mon Jun 27 16:57:53 2016 from 192.168.80.1
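Another common reason key-based login still prompts for a password is wrong file permissions, since sshd refuses keys whose files are too open. The sketch below checks the modes sshd expects; it assumes GNU coreutils' `stat -c`, and the `check_mode` helper name is my own:

```shell
#!/bin/sh
# check_mode FILE EXPECTED: succeed only if FILE's octal mode matches EXPECTED.
check_mode() {
    actual=$(stat -c %a "$1" 2>/dev/null) || return 1
    [ "$actual" = "$2" ]
}

# sshd is strict: ~/.ssh should be 700 and authorized_keys 600.
check_mode "$HOME/.ssh" 700                  || echo "fix: chmod 700 ~/.ssh"
check_mode "$HOME/.ssh/authorized_keys" 600  || echo "fix: chmod 600 ~/.ssh/authorized_keys"
```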
3. Format the filesystem
root@ubuntu:~# hdfs namenode -format
PS: according to the configuration reference (http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/core-default.xml), the default value of hadoop.tmp.dir is /tmp/hadoop-${user.name}. /tmp is typically cleared on reboot, which would wipe the filesystem data; to avoid that, edit etc/hadoop/core-site.xml and add the following configuration:
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop-${user.name}</value>
  </property>
</configuration>
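Hadoop expands ${user.name} from the JVM's user.name system property, so for root the directory becomes /data/hadoop-root, and its parent must exist and be writable before the NameNode is formatted. A small sketch that pre-creates it; the `make_tmp_dir` helper and the 750 mode are my own choices:

```shell
#!/bin/sh
# make_tmp_dir BASE: create BASE/hadoop-<current user>, matching what
# hadoop.tmp.dir=/data/hadoop-${user.name} resolves to, and print the path.
make_tmp_dir() {
    dir="$1/hadoop-$(id -un)"
    mkdir -p "$dir" && chmod 750 "$dir" && echo "$dir"
}

# On the real machine: make_tmp_dir /data
```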
4. Start the NameNode and DataNode daemons
root@ubuntu:~# start-dfs.sh
Incorrect configuration: namenode address dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not configured.
Starting namenodes on []
localhost: Error: JAVA_HOME is not set and could not be found.
localhost: Error: JAVA_HOME is not set and could not be found.
Starting secondary namenodes [0.0.0.0]
0.0.0.0: Error: JAVA_HOME is not set and could not be found.
ERROR 1: JAVA_HOME cannot be found, even though it was configured earlier.
Steps to resolve it:
1) Search for the error message. It turns out it comes from the hadoop-config.sh script.
root@ubuntu:~/hadoop# grep -R "JAVA_HOME is not set and could not be found" .
./libexec/hadoop-config.sh:    echo "Error: JAVA_HOME is not set and could not be found." 1>&2
2) The $JAVA_HOME variable in hadoop-config.sh is exported by etc/hadoop/hadoop-env.sh. Because start-dfs.sh launches the daemons over ssh in a non-interactive shell that does not source the profile where JAVA_HOME was set, the ${JAVA_HOME} placeholder resolves to nothing; change it to an absolute path:
# The java implementation to use.
#export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=/root/jdk
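Before restarting, it is worth sanity-checking that the configured path really points at a JDK. A small sketch; the `check_java_home` helper is my own:

```shell
#!/bin/sh
# check_java_home DIR: succeed if DIR looks like a usable JDK/JRE root,
# i.e. it contains an executable bin/java.
check_java_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

check_java_home /root/jdk || echo "warning: /root/jdk has no executable bin/java"
```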
root@ubuntu:~/hadoop# start-dfs.sh
Incorrect configuration: namenode address dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not configured.
Starting namenodes on []
localhost: starting namenode, logging to /root/hadoop/logs/hadoop-root-namenode-ubuntu.out
localhost: starting datanode, logging to /root/hadoop/logs/hadoop-root-datanode-ubuntu.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /root/hadoop/logs/hadoop-root-secondarynamenode-ubuntu.out
0.0.0.0: Exception in thread "main" java.lang.IllegalArgumentException: Invalid URI for NameNode address (check fs.defaultFS): file:/// has no authority.
0.0.0.0: 	at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:471)
0.0.0.0: 	at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:461)
0.0.0.0: 	at org.apache.hadoop.hdfs.server.namenode.NameNode.getServiceAddress(NameNode.java:454)
0.0.0.0: 	at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:229)
0.0.0.0: 	at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.<init>(SecondaryNameNode.java:192)
0.0.0.0: 	at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:671)
ERROR 2: on the next start, the NameNode URI is invalid (fs.defaultFS still points at the local file:/// filesystem).
Steps to resolve it:
root@ubuntu:~/hadoop# vi etc/hadoop/core-site.xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop-${user.name}</value>
  </property>
  <!-- add the following property -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
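A quick way to confirm which filesystem URI the tools will actually use is `hdfs getconf -confKey fs.defaultFS`. When the cluster tools are unavailable, the value can also be pulled straight out of core-site.xml; a rough sed sketch (the `get_prop` helper is my own, and the line-oriented parsing assumes the one-tag-per-line layout shown above):

```shell
#!/bin/sh
# get_prop FILE NAME: print the <value> on the line after <name>NAME</name>.
# Crude parsing; fine for hand-written site files like the one above.
get_prop() {
    sed -n "/<name>$2<\/name>/{n;s/.*<value>\(.*\)<\/value>.*/\1/p;}" "$1"
}

# Example: get_prop etc/hadoop/core-site.xml fs.defaultFS
```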
5. Successful startup
root@ubuntu:~/hadoop# start-dfs.sh
Starting namenodes on [localhost]
localhost: starting namenode, logging to /root/hadoop/logs/hadoop-root-namenode-ubuntu.out
localhost: starting datanode, logging to /root/hadoop/logs/hadoop-root-datanode-ubuntu.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /root/hadoop/logs/hadoop-root-secondarynamenode-ubuntu.out
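The JDK's `jps` tool is a convenient way to confirm the daemons really came up. A sketch that checks for each expected process; the `daemons_up` helper is my own, and the expected list is specific to this pseudo-distributed setup:

```shell
#!/bin/sh
# daemons_up JPS_OUTPUT: succeed if every expected HDFS daemon appears.
daemons_up() {
    for d in NameNode DataNode SecondaryNameNode; do
        printf '%s\n' "$1" | grep -qw "$d" || { echo "missing: $d"; return 1; }
    done
}

# Typical usage on the running machine:
#   daemons_up "$(jps)" && echo "HDFS daemons are up"
```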
6. Inspect the NameNode through its web interface: http://localhost:50070/
7. Create HDFS directories
root@ubuntu:~/hadoop# hdfs dfs -mkdir /user
root@ubuntu:~/hadoop# hdfs dfs -mkdir /user/root
8. Put files into HDFS
root@ubuntu:~/hadoop# hdfs dfs -put etc/hadoop input
root@ubuntu:~/hadoop# hdfs dfs -ls
Found 1 items
drwxr-xr-x   - root supergroup          0 2016-06-28 09:44 input
9. Run a MapReduce job with Hadoop's bundled examples jar, searching the input for strings that match a regex:
root@ubuntu:~/hadoop# hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar grep input output 'dfs[a-z.]+'
10. View the results
1) Directly in HDFS:
root@ubuntu:~/hadoop# hdfs dfs -cat output/*
6	dfs.audit.logger
4	dfs.class
3	dfs.server.namenode.
2	dfs.period
2	dfs.audit.log.maxfilesize
2	dfs.audit.log.maxbackupindex
1	dfsmetrics.log
1	dfsadmin
1	dfs.servers
1	dfs.file
2) Or copy them from HDFS to a local directory first, then view them:
root@ubuntu:~/hadoop# hdfs dfs -get output output
root@ubuntu:~/hadoop# cat output/*
6	dfs.audit.logger
4	dfs.class
3	dfs.server.namenode.
2	dfs.period
2	dfs.audit.log.maxfilesize
2	dfs.audit.log.maxbackupindex
1	dfsmetrics.log
1	dfsadmin
1	dfs.servers
1	dfs.file
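For intuition, the example's grep job essentially computes "count the occurrences of each regex match, sorted by frequency". A similar result can be produced locally with an ordinary shell pipeline; treat this only as a rough cross-check, since the MapReduce example applies the regex in its own mapper:

```shell
#!/bin/sh
# count_matches REGEX DIR: count each regex match across DIR's xml files,
# most frequent first, mirroring what the examples "grep" job computes.
count_matches() {
    grep -hoE "$1" "$2"/*.xml | sort | uniq -c | sort -rn
}

# Example: count_matches 'dfs[a-z.]+' input
```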
Summary
Even when you follow the documentation to the letter, unexpected things still happen, like the daemon startup failures in step 4 above. When that happens and there is an error message, work backwards from it, checking and fixing one step at a time. And if you cannot solve it yourself, there is plenty of material and help available online.
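In practice, "working backwards from the error" usually starts with the .log/.out files under $HADOOP_HOME/logs. A small sketch that surfaces error lines; the `scan_logs` name and the pattern list are my own choices:

```shell
#!/bin/sh
# scan_logs DIR: print ERROR/Exception lines (with line numbers)
# from the log files under DIR.
scan_logs() {
    grep -nE 'ERROR|Exception' "$1"/*.log "$1"/*.out 2>/dev/null
}

# Typical usage: scan_logs "$HADOOP_HOME/logs"
```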
TO ME: KEEP GOING, JUST DO IT!!!