Hadoop 2.5.0-cdh5.3.3 Pseudo-Distributed Installation (MySQL, Hive, Sqoop)

I. Prepare a machine (mine is CentOS 6.7, 64-bit)

-----------------------------------------------------------------------------------------------------------

Hostname-to-IP mapping:

[hadoop@hadoop ~]$ cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.38.100 hadoop
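
A quick optional check that the mapping resolves:

[hadoop@hadoop ~]$ ping -c 1 hadoop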

-----------------------------------------------------------------------------------------------------------

Configure a static IP:

[hadoop@hadoop ~]$ cat /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
HWADDR=00:0c:29:c3:70:cf
TYPE=Ethernet
UUID=707e0c2e-6550-4c50-a650-4b352f72c4b1
ONBOOT=yes
NM_CONTROLLED=yes
BOOTPROTO=none
IPADDR=192.168.38.100
NETMASK=255.255.255.0
IPV6INIT=no
USERCTL=no
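
After editing the interface file, restart the network service so the static address takes effect:

[root@hadoop ~]# service network restart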

---------------------------------------------------------------------------------------------------------------

Change the hostname:

[hadoop@hadoop ~]$ cat /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=hadoop
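
The setting in /etc/sysconfig/network takes effect at the next reboot; to apply it to the running session as well, set it directly:

[root@hadoop ~]# hostname hadoop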

---------------------------------------------------------------------------------------------------------------

II. Install JDK 1.7

--------------------------------------------------------------------------------------------------------------

Upload the tar package, create the target directory with mkdir -p /usr/local/jdk7, extract into it, and set group ownership:

e.g.:

[hadoop@hadoop ~]$ ls -ltr /usr/local/

drwxrwxrwx. 3 hadoop hadoop 4096 Jan 19 11:13 hadoop
drwxrwxrwx. 3 hadoop hadoop 4096 Jan 19 11:14 jdk7
drwxrwxrwx. 3 hadoop hadoop 4096 Jan 19 11:15 hive
drwxrwxrwx. 3 hadoop hadoop 4096 Jan 19 11:16 sqoop

Note:

Here I extracted all four packages this way and set user/group ownership on each, as sketched below.
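
A minimal sketch of the extract-and-own step (the tarball file names are assumptions; substitute the ones you actually uploaded):

[root@hadoop ~]# mkdir -p /usr/local/jdk7 /usr/local/hadoop /usr/local/hive /usr/local/sqoop
[root@hadoop ~]# tar -zxf jdk-7u71-linux-x64.tar.gz -C /usr/local/jdk7                # assumed file name
[root@hadoop ~]# tar -zxf hadoop-2.5.0-cdh5.3.3.tar.gz -C /usr/local/hadoop           # assumed file name
[root@hadoop ~]# chown -R hadoop:hadoop /usr/local/jdk7 /usr/local/hadoop /usr/local/hive /usr/local/sqoop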

--------------------------------------------------------------------------------------------------------------

Configure the JDK environment variables:

[hadoop@hadoop ~]$ cat /etc/profile

JAVA_HOME=/usr/local/jdk7/jdk1.7.0_71
HADOOP_HOME=/usr/local/hadoop/hadoop-2.5.0-cdh5.3.3
PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$PATH
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export JAVA_HOME
export HADOOP_HOME
export PATH
export CLASSPATH

Run source /etc/profile to make the environment variables take effect.
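
To confirm the JDK is picked up from the new PATH (it should report version 1.7.0_71):

[hadoop@hadoop ~]$ java -version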

--------------------------------------------------------------------------------------------------------------

III. Install Hadoop

Hadoop was already extracted above; as with the JDK it is just a matter of extracting and setting ownership, so the detailed steps are not repeated here.

1. Edit Hadoop's configuration files:

[hadoop@hadoop ~]$ cd /usr/local/hadoop/hadoop-2.5.0-cdh5.3.3/etc/hadoop

[hadoop@hadoop hadoop]$ vi hadoop-env.sh

Add this line:

export JAVA_HOME=/usr/local/jdk7/jdk1.7.0_71

2. Edit the slaves file and put the hostname in it:

[hadoop@hadoop hadoop]$ vi slaves
hadoop

Note: in each of the four XML configuration files below, the property entries go between the <configuration> and </configuration> tags.

3. Configuration file: mapred-site.xml (if only mapred-site.xml.template exists, copy it to mapred-site.xml first)

[hadoop@hadoop hadoop]$ vi mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

4. core-site.xml ("hadoop" here is the host set in /etc/hosts; if you have not set one, use localhost instead. fs.default.name is the old, deprecated name of fs.defaultFS and is still honored by Hadoop 2.x.)

[hadoop@hadoop hadoop]$ vi core-site.xml

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://hadoop:8020</value>
  </property>
</configuration>

5. Configuration file: yarn-site.xml

[hadoop@hadoop hadoop]$ vi yarn-site.xml

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>

6. Configuration file: hdfs-site.xml

[hadoop@hadoop hadoop]$ vi hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

7. Start the services:
Format HDFS (first run only):
bin/hadoop namenode -format
Start HDFS:
sbin/start-dfs.sh
Start YARN:
sbin/start-yarn.sh

Verify that the services started successfully:

[hadoop@hadoop sbin]$ jps
61770 DataNode
62189 NodeManager
62505 Jps
62095 ResourceManager
61950 SecondaryNameNode
61678 NameNode

If the five daemon processes above are all present (Jps itself is just the jps command), the startup succeeded.
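
You can also confirm through the web interfaces, using the Hadoop 2.x default ports: the HDFS NameNode UI at http://hadoop:50070 and the YARN ResourceManager UI at http://hadoop:8088.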

Problem 1: with Hadoop set up inside a virtual machine, it fails to start after every VM reboot.
Cause: by default the NameNode and DataNode keep their data under /tmp (via hadoop.tmp.dir), and /tmp is wiped on reboot.
Solution:
Add the following two properties (they are HDFS settings, so hdfs-site.xml is the conventional place, although Hadoop will also pick them up from core-site.xml):
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/hadoop/dfs/data</value>
  </property>
Both directories must be outside /tmp.
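
After changing these paths, create the directories and format the NameNode again (note that reformatting wipes any existing HDFS data):

[hadoop@hadoop ~]$ mkdir -p /home/hadoop/dfs/name /home/hadoop/dfs/data
[hadoop@hadoop ~]$ hadoop namenode -format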

--------------------------------------------------------------------------------------------------------------

IV. Install MySQL as the Hive metastore database

Start the system's bundled MySQL service:

[root@hadoop ~]# service mysqld start

Change the root password:

[root@hadoop ~]# mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.1.73 Source distribution

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| test               |
+--------------------+
3 rows in set (0.00 sec)

mysql> use mysql;

mysql> update user set password=PASSWORD('hadoop') where user='root';

mysql> flush privileges;

mysql> select Host,User,Password from user;
+-----------+------+-------------------------------------------+
| Host      | User | Password                                  |
+-----------+------+-------------------------------------------+
| localhost | root | *B34D36DA2C3ADBCCB80926618B9507F5689964B6 |
| hadoop    | root | *B34D36DA2C3ADBCCB80926618B9507F5689964B6 |
| 127.0.0.1 | root | *B34D36DA2C3ADBCCB80926618B9507F5689964B6 |
| localhost |      |                                           |
| hadoop    |      |                                           |
+-----------+------+-------------------------------------------+
5 rows in set (0.00 sec)

Create a MySQL account for Hive and grant it sufficient privileges:
① Log in as root: mysql -uroot -p
② Create the hive database:

mysql> create database hive;

③ Create the user hive, which can connect only from localhost and has full access to the hive database:

mysql> grant all on hive.* to hive@localhost identified by 'hive';
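
A quick sanity check that the new account works (the -e flag runs a single statement and exits):

[hadoop@hadoop ~]$ mysql -uhive -phive -e 'show databases;'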


--------------------------------------------------------------------------------------------------------------

V. Install Hive

After extracting the tar package, configure the environment variables:

Add to /etc/profile (as root):

export JAVA_HOME=/usr/local/jdk7/jdk1.7.0_71
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.5.0-cdh5.3.3
export HIVE_HOME=/usr/local/hive/hive-0.13.1-cdh5.3.3
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HIVE_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

As the hadoop user:

cd /usr/local/hive/hive-0.13.1-cdh5.3.3/conf

[hadoop@hadoop conf]$ cp hive-env.sh.template hive-env.sh

In hive-env.sh, set:

export HADOOP_HOME=/usr/local/hadoop/hadoop-2.5.0-cdh5.3.3

[hadoop@hadoop conf]$ cp hive-log4j.properties.template hive-log4j.properties

[hadoop@hadoop conf]$ cp hive-default.xml.template hive-site.xml

[hadoop@hadoop conf]$ vi hive-site.xml

Set the following properties (hive-site.xml was copied from the full default template, so edit the existing entries rather than adding duplicates):

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
  <description>username to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hive</value>
  <description>password to use against metastore database</description>
</property>

[hadoop@hadoop conf]$ hadoop fs -mkdir /tmp

[hadoop@hadoop conf]$ hadoop fs -mkdir -p /user/hive/warehouse

[hadoop@hadoop conf]$ hadoop fs -chmod g+w /tmp

[hadoop@hadoop conf]$ hadoop fs -chmod g+w /user/hive/warehouse

Note: these are Hive's default HDFS directories; /tmp is the scratch directory and /user/hive/warehouse is the default table storage location.
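
Hive also needs the MySQL JDBC driver on its classpath to reach the metastore. If it is not already present, copy the same connector jar used for Sqoop below into Hive's lib directory, then run a quick smoke test:

[hadoop@hadoop ~]$ cp mysql-connector-java-5.1.7-bin.jar /usr/local/hive/hive-0.13.1-cdh5.3.3/lib/
[hadoop@hadoop ~]$ hive
hive> show databases;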

--------------------------------------------------------------------------------------------------------------

VI. Install Sqoop

1. Configure the environment variables

[root@hadoop ~]# vi /etc/profile

export JAVA_HOME=/usr/local/jdk7/jdk1.7.0_71
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.5.0-cdh5.3.3
export HIVE_HOME=/usr/local/hive/hive-0.13.1-cdh5.3.3
export SQOOP_HOME=/usr/local/sqoop/sqoop-1.4.5-cdh5.3.3
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HIVE_HOME/bin:$SQOOP_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib:$SQOOP_HOME/lib:$CLASSPATH

2. Configuration file

[hadoop@hadoop conf]$ cp sqoop-env-template.sh sqoop-env.sh
[hadoop@hadoop conf]$ vi sqoop-env.sh

export HADOOP_COMMON_HOME=/usr/local/hadoop/hadoop-2.5.0-cdh5.3.3

export HIVE_HOME=/usr/local/hive/hive-0.13.1-cdh5.3.3

In this file only HADOOP_COMMON_HOME is mandatory; the HBase and Hive settings are needed only if you actually use those components.

3. Add the required jars to Sqoop's lib directory. These are the JDBC connector jars for the relational databases you connect to (e.g. MySQL, Oracle); they are not bundled with Sqoop, so you must copy them in yourself:

[hadoop@hadoop lib]$ cp mysql-connector-java-5.1.7-bin.jar /usr/local/sqoop/sqoop-1.4.5-cdh5.3.3/lib/

4. Test the database connection:

[hadoop@hadoop ~]$ sqoop list-databases --connect jdbc:mysql://127.0.0.1:3306/ --username root -P
Warning: /usr/local/sqoop/sqoop-1.4.5-cdh5.3.3/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /usr/local/sqoop/sqoop-1.4.5-cdh5.3.3/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/local/sqoop/sqoop-1.4.5-cdh5.3.3/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /usr/local/sqoop/sqoop-1.4.5-cdh5.3.3/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
16/01/19 18:35:30 INFO sqoop.Sqoop: Running Sqoop version: 1.4.5-cdh5.3.3
Enter password: 
16/01/19 18:35:36 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
information_schema
hive
mysql
test
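
As a further check, sqoop list-tables can inspect the hive metastore database created earlier (it will be empty until Hive first populates its schema):

[hadoop@hadoop ~]$ sqoop list-tables --connect jdbc:mysql://127.0.0.1:3306/hive --username hive -P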







