Installation of Hadoop-1.2.1 Pseudo-distributed mode on Centos 7
Source: Internet. Editor: 程序博客网. Time: 2024/05/21 05:22
1 Hadoop Versions
On the official Apache website there are many Hadoop releases, from 0.10.1 to 2.7.2 (the most recent at the time of writing). Compared with the early releases, Hadoop 2.x introduced the global ResourceManager and the per-application ApplicationMaster, which are the core components of the YARN framework. Besides MapReduce, many other parallel computing models, such as in-memory, streaming, iterative, and graph computing, can run on the new Hadoop system. Apache also offers packages (.rpm/.deb) and compressed archives (-bin.tar.gz/.tar.gz) for different platforms, so choosing a suitable package for your OS takes some care.
At first I chose an .rpm Hadoop package for my CentOS 7 machine. A problem appeared when I tried to install it with the rpm package management tool:
rpm -ivh ./hadoop***.rpm
The error message shows that the package's default installation directory for hadoop/bin conflicts with the system directory /bin. You therefore have to pass an extra option (--relocate or --prefix) to specify the installation directory; both work only if the package is relocatable:
rpm -ivh --relocate /=/opt/temp xxx.rpm; or rpm -ivh --prefix=/opt/temp xxx.rpm
By contrast, the .tar.gz version of Hadoop is more convenient to handle. Just unpack it with the tar tool and copy it to any reasonable location.
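The unpacking step can be sketched as follows. This is a minimal sketch: `unpack_hadoop` is a hypothetical helper name, and the tarball name and the /opt destination in the comments are assumptions, not something the steps above prescribe.

```shell
# Sketch: unpack a Hadoop tarball into a directory of your choice.
# unpack_hadoop is a hypothetical helper, not part of Hadoop.
#   $1: path to the tarball, e.g. hadoop-1.2.1.tar.gz (assumed name)
#   $2: destination directory, e.g. /opt (assumed location)
unpack_hadoop() {
    # Create the destination if needed, then extract into it.
    mkdir -p "$2" && tar -xzf "$1" -C "$2"
}

# Usage (not run here):
#   unpack_hadoop hadoop-1.2.1.tar.gz /opt
```

If the destination is owned by root, run the call with sufficient privileges and then hand the extracted tree over to the Hadoop user.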
2 Prerequisites for Installation
It is suggested that you create a new Linux user for Hadoop and grant it sudo privileges by modifying the sudoers file in the /etc directory. Remember to restore the file's read-only attribute after you have finished:
1)Create a new user for hadoop
groupadd hadoop-user ----- create a user group
useradd -g hadoop-user hadoop ----- add the new user hadoop to the group
passwd hadoop ----- set a password for the new user
2)Modify permission for the new user
Switch to root, then make /etc/sudoers writable by its owner:
#chmod u+w /etc/sudoers
Edit the sudoers file and add a new line for the new user (hadoop here):
hadoop ALL=(ALL) NOPASSWD: ALL or hadoop ALL=(ALL) ALL
Finally, restore the sudoers file to read-only mode:
#chmod u-w /etc/sudoers
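As an alternative to editing /etc/sudoers in place, the entry can be generated and placed in a drop-in file. This is a different technique from the one described above, sketched under the assumption that your sudoers file includes the /etc/sudoers.d directory. `sudoers_entry` is a hypothetical helper.

```shell
# Sketch: print a sudoers entry for a given account.
# sudoers_entry is a hypothetical helper, not a system command.
#   $1: account name
#   $2: "nopasswd" to skip the password prompt (optional)
sudoers_entry() {
    if [ "${2:-}" = "nopasswd" ]; then
        printf '%s ALL=(ALL) NOPASSWD: ALL\n' "$1"
    else
        printf '%s ALL=(ALL) ALL\n' "$1"
    fi
}

# Usage (as root, not run here; assumes /etc/sudoers.d is included):
#   sudoers_entry hadoop nopasswd > /etc/sudoers.d/hadoop
#   chmod 0440 /etc/sudoers.d/hadoop
#   visudo -cf /etc/sudoers.d/hadoop    # syntax check only
```

Checking the file with `visudo -c` before relying on it helps avoid locking yourself out with a malformed entry.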
Since Hadoop runs on Java, you need Java 1.6 or higher on your machine. Fortunately, recent CentOS releases ship OpenJDK 1.8. You can also install an official Java release from Oracle. If you install Java from an .rpm package, you do not need to set the environment variables yourself. Otherwise, add JAVA_HOME to your profile configuration; that process is omitted here.
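One way to find a value for JAVA_HOME is to resolve the `java` binary through its symlinks and strip the trailing /bin/java. The sketch below assumes GNU `readlink -f` is available, as on CentOS; `java_home_from` is a hypothetical helper name.

```shell
# Sketch: derive a JAVA_HOME candidate from the path of a java binary.
# java_home_from is a hypothetical helper; it assumes GNU readlink.
#   $1: path to the java executable (possibly a symlink)
java_home_from() {
    # Follow all symlinks to the real binary.
    real=$(readlink -f "$1") || return 1
    # Drop the trailing /bin/java to get the installation root.
    printf '%s\n' "${real%/bin/java}"
}

# Usage (not run here):
#   export JAVA_HOME=$(java_home_from "$(command -v java)")
```

With an OpenJDK package the result is typically a directory under /usr/lib/jvm; verify it contains bin/java before using it.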
The nodes in a cluster communicate via SSH. In a multi-node cluster the communication is between individual nodes, while in a single-node cluster localhost acts as the server. The concrete configuration is as follows:
$ssh-keygen -t rsa ------ generate the key pair
$cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys ------- authorize the public key for the current user
$ssh localhost ------ test the password-less connection
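The key authorization step can be sketched a little more defensively. The version below appends the key instead of overwriting authorized_keys, and tightens permissions, since sshd with StrictModes enabled ignores key files that are too widely accessible. `authorize_key` is a hypothetical helper, and the paths in the usage comment are the usual defaults rather than something fixed by Hadoop.

```shell
# Sketch: authorize a public key for password-less login.
# authorize_key is a hypothetical helper.
#   $1: the .ssh directory, normally "$HOME/.ssh"
#   $2: the public key file, normally "$HOME/.ssh/id_rsa.pub"
authorize_key() {
    mkdir -p "$1" || return 1
    chmod 700 "$1" || return 1
    # Append, so that keys already authorized are preserved.
    cat "$2" >> "$1/authorized_keys" || return 1
    chmod 600 "$1/authorized_keys"
}

# Usage (not run here):
#   ssh-keygen -t rsa -P '' -f "$HOME/.ssh/id_rsa"   # empty passphrase
#   authorize_key "$HOME/.ssh" "$HOME/.ssh/id_rsa.pub"
#   ssh localhost
```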
If the connection should fail, these general tips might help:
Enable debugging with ssh -vvv localhost and investigate the error in detail.
Check the SSH server configuration in /etc/ssh/sshd_config, in particular the options PubkeyAuthentication and AllowUsers. If you made any changes to the SSH server configuration file, you can force a configuration reload with sudo systemctl reload sshd (CentOS 7 uses systemd, so /etc/init.d/ssh is not available).
3 Formal deployment
Adding HADOOP_HOME to your environment variables is not recommended, because it is deprecated. The deployment takes several steps:
Step1: Configuring the Hadoop environment
Just add the relevant settings to the following four files in hadoop-**/conf:
1)hadoop-env.sh
$export JAVA_HOME=/path/to/your/java/home
2)core-site.xml
3)hdfs-site.xml
4)mapred-site.xml
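The contents of the three XML files are not shown above. As a hedged sketch, the following writes the minimal pseudo-distributed settings used in the official Hadoop 1.x single-node setup guide: `fs.default.name`, `dfs.replication`, and `mapred.job.tracker`. The ports 9000 and 9001 follow that guide and are assumptions here, and `write_pseudo_conf` is a hypothetical helper. Note that it overwrites the three files rather than appending to them.

```shell
# Sketch: write minimal pseudo-distributed settings into a conf directory.
# write_pseudo_conf is a hypothetical helper; it OVERWRITES the three files.
#   $1: the Hadoop conf directory
write_pseudo_conf() {
    # Default file system: HDFS on localhost (port 9000 is an assumption).
    cat > "$1/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF
    # A single node can hold only one replica of each block.
    cat > "$1/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF
    # JobTracker address (port 9001 is an assumption).
    cat > "$1/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
EOF
}

# Usage (not run here): write_pseudo_conf /path/to/hadoop/conf
```

If you also want custom storage locations, add the corresponding directory properties yourself; they are left out of this sketch.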
Step2: Running Hadoop
1)Formatting the NameNode
$bin/hadoop namenode -format
2)Starting Hadoop
You can start Hadoop in two stages, which makes it easier to verify the cluster configuration, or start everything at once with start-all.
$bin/start-dfs.sh
$bin/start-mapred.sh
or:
$bin/start-all.sh
3)Checking the started hadoop process
If everything went well, the 'jps' command shows five Hadoop processes in addition to the jps process itself: NameNode, DataNode, SecondaryNameNode, JobTracker, and TaskTracker.
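The check can be automated with a small filter over the `jps` output. This is a sketch: `missing_daemons` is a hypothetical helper, and it assumes the usual `jps` output format of one "pid ClassName" pair per line.

```shell
# Sketch: report which Hadoop 1.x daemons are absent from jps output.
# missing_daemons is a hypothetical helper; it reads jps output on stdin
# and assumes lines of the form "<pid> <ClassName>".
missing_daemons() {
    jps_out=$(cat)
    for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # Exact match on the second field, so NameNode does not
        # accidentally match SecondaryNameNode.
        printf '%s\n' "$jps_out" |
            awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' ||
            printf '%s\n' "$d"
    done
}

# Usage (not run here): jps | missing_daemons
```

An empty result means all five daemons are running; otherwise each missing daemon is printed on its own line.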
Here are several tips if some of the processes fail to start:
a. Check the four configuration files in the Hadoop installation directory hadoop-***/conf, and make sure the directories you specified for 'tmp', 'namenode', and 'datanode' exist.
b. Check whether you have permission to manage those directories.
c. Look for the relevant error logs through the Hadoop web UI.
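Tips a and b can be checked with a short helper. This is a sketch: `check_dir` is a hypothetical name, and the paths in the usage comment are placeholders for whatever directories you specified in your configuration files.

```shell
# Sketch: verify that a configured directory exists and is usable
# by the current user. check_dir is a hypothetical helper.
#   $1: directory to check
check_dir() {
    if [ ! -d "$1" ]; then
        echo "missing: $1"
        return 1
    elif [ ! -w "$1" ] || [ ! -x "$1" ]; then
        echo "no write/search permission: $1"
        return 1
    fi
    echo "ok: $1"
}

# Usage (run as the Hadoop user, placeholder paths, not run here):
#   check_dir /path/to/tmp
#   check_dir /path/to/namenode
#   check_dir /path/to/datanode
```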
Step3: Testing instance
Here we use one of Hadoop's bundled examples, which estimates the value of pi. The first argument sets the number of map tasks, and the second sets the number of samples each map task takes. The command is listed below:
$bin/hadoop jar $HADOOP_HOME/hadoop-examples-1.2.1.jar \
>pi 2 5