Installing Hive 2.1.1 on CentOS 7


1. Installation Notes:

Before installing Hive, make sure a Hadoop cluster (a Hadoop 2.x release) is installed and running, and Java 1.8 is strongly recommended (for details see the official guide: https://cwiki.apache.org/confluence/display/Hive/GettingStarted). By default Hive stores its metadata in an embedded Derby database; we usually replace Derby with MySQL (MySQL 5.7 is used here). Hive only needs to be installed on the Hadoop cluster's master node.

2. Software Download and Configuration:

(1) Download Hive:

[root@Clouder3 conf]# wget https://mirrors.tuna.tsinghua.edu.cn/apache/hive/hive-2.1.1/apache-hive-2.1.1-bin.tar.gz

Extract it:

[root@Clouder3 conf]# tar zxvf apache-hive-2.1.1-bin.tar.gz

Rename the extracted directory:

[root@Clouder3 conf]# mv apache-hive-2.1.1-bin hive-2.1.1


(2) Configure system environment variables:

[root@Clouder3 ~]# vim /etc/profile

export HADOOP_HOME=/usr/local/hadoop-2.7.4
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native
export HADOOP_OPTS="-Djava.library.path=${HADOOP_HOME}/lib"
export HIVE_HOME=/usr/local/hive-2.1.1
export HIVE_CONF_DIR=${HIVE_HOME}/conf
export JAVA_HOME=/usr/java/jdk1.8.0_144
export JRE_HOME=/usr/java/jdk1.8.0_144/jre
export CLASS_PATH=.:${JAVA_HOME}/lib:${HIVE_HOME}/lib:$CLASS_PATH
export PATH=.:$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin:$PATH

After saving, apply the changes: source /etc/profile
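Before moving on, it is worth confirming that the variables actually resolve. A minimal sanity-check sketch (the `check_env` helper is not part of the original setup, just an illustration):

```shell
# check_env: verify that each named environment variable is set and
# points at an existing directory; prints OK/ERROR per variable and
# returns non-zero if any check fails.
check_env() {
  local rc=0 var dir
  for var in "$@"; do
    dir=$(eval echo "\$$var")
    if [ -z "$dir" ] || [ ! -d "$dir" ]; then
      echo "ERROR: $var is unset or not a directory: '$dir'"
      rc=1
    else
      echo "OK: $var=$dir"
    fi
  done
  return $rc
}

# On the Hive host this would be:
# check_env JAVA_HOME HADOOP_HOME HIVE_HOME
```

If any variable reports ERROR, re-check /etc/profile and re-run `source /etc/profile`.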


(3) Edit hive-env.sh and append the following environment variables at the bottom of the file:

[root@Clouder3 conf]# cp hive-env.sh.template hive-env.sh

[root@Clouder3 conf]# vim hive-env.sh

export JAVA_HOME=/usr/java/jdk1.8.0_144
export HIVE_HOME=/usr/local/hive-2.1.1
export HIVE_CONF_DIR=/usr/local/hive-2.1.1/conf
export HIVE_AUX_JARS_PATH=/usr/local/hive-2.1.1/lib
export HADOOP_HOME=/usr/local/hadoop-2.7.4


(4) Edit hive-site.xml, replacing every ${system:java.io.tmpdir} with an absolute path:

[root@Clouder3 conf]# cp hive-default.xml.template hive-site.xml

[root@Clouder3 conf]# vim hive-site.xml 

The following two properties in this file specify the HDFS directories where Hive stores its data, so we need to create them in HDFS manually and grant permissions:

  <property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/hive</value>
    <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
  </property>

  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
  </property>
Create the corresponding directories in HDFS and grant permissions:

hadoop fs -mkdir -p /user/hive/warehouse     # create the warehouse directory (or: hdfs dfs -mkdir -p /user/hive/warehouse)
hadoop fs -chmod -R 777 /user/hive/warehouse # recursively grant read/write permissions
hadoop fs -mkdir -p /tmp/hive/               # create the /tmp/hive/ directory
hadoop fs -chmod -R 777 /tmp/hive            # grant read/write permissions

Verify:

hadoop fs -ls /user/hive
hadoop fs -ls /tmp/hive

To change the local temporary file paths, replace every ${system:java.io.tmpdir} with a local absolute path:

  <property>
    <name>hive.exec.local.scratchdir</name>
    <value>${system:java.io.tmpdir}/${system:user.name}</value>
    <description>Local scratch space for Hive jobs</description>
  </property>
  <property>
    <name>hive.downloaded.resources.dir</name>
    <value>${system:java.io.tmpdir}/${hive.session.id}_resources</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
  </property>
  <property>
    <name>hive.querylog.location</name>
    <value>${system:java.io.tmpdir}/${system:user.name}</value>
    <description>Location of Hive run time structured log file</description>
  </property>
  <property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>${system:java.io.tmpdir}/${system:user.name}/operation_logs</value>
    <description>Top level directory where operation logs are stored if logging functionality is enabled</description>
  </property>
After replacement:

  <property>
    <name>hive.exec.local.scratchdir</name>
    <value>/usr/local/hive-2.1.1/tmp/</value>
    <description>Local scratch space for Hive jobs</description>
  </property>
  <property>
    <name>hive.downloaded.resources.dir</name>
    <value>/usr/local/hive-2.1.1/tmp/${hive.session.id}_resources</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
  </property>
  <property>
    <name>hive.querylog.location</name>
    <value>/usr/local/hive-2.1.1/tmp/</value>
    <description>Location of Hive run time structured log file</description>
  </property>
  <property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>/usr/local/hive-2.1.1/tmp/root/operation_logs</value>
    <description>Top level directory where operation logs are stored if logging functionality is enabled</description>
  </property>
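Rather than editing each occurrence by hand, all of the ${system:java.io.tmpdir} and ${system:user.name} tokens can be rewritten in one pass with sed. A sketch (the target paths follow the layout above; a .bak backup of the original file is kept):

```shell
# Default paths, matching the install layout used in this guide.
HIVE_SITE=${HIVE_SITE:-/usr/local/hive-2.1.1/conf/hive-site.xml}
TMP_DIR=${TMP_DIR:-/usr/local/hive-2.1.1/tmp}

replace_tmpdir() {
  # $1: path to hive-site.xml; rewrites the tokens in place,
  # keeping an untouched copy at "$1.bak".
  sed -i.bak \
    -e "s|\${system:java.io.tmpdir}|${TMP_DIR}|g" \
    -e "s|\${system:user.name}|root|g" \
    "$1"
}

# On the Hive host:
# replace_tmpdir "$HIVE_SITE"
```

Remember to create the local directory itself afterwards (mkdir -p /usr/local/hive-2.1.1/tmp).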


Configure the metastore connection settings in hive-site.xml:
javax.jdo.option.ConnectionDriverName: set the value to the MySQL driver class;
javax.jdo.option.ConnectionURL: set the value to the MySQL JDBC URL;
javax.jdo.option.ConnectionUserName: set the value to the MySQL login user;
javax.jdo.option.ConnectionPassword: set the value to the MySQL login password.

After modification:

  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>Username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>root</value>
    <description>password to use against metastore database</description>
  </property>
Copy the MySQL JDBC driver jar into Hive's lib directory: mysql-connector-java-5.1.38-bin.jar
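A small helper makes this copy step fail loudly instead of silently when a path is wrong (a sketch; the jar filename and lib path are the ones assumed in this guide):

```shell
install_jdbc_driver() {
  # $1: path to the connector jar; $2: Hive lib directory.
  # Refuses to proceed if either path is missing.
  [ -f "$1" ] || { echo "driver jar not found: $1" >&2; return 1; }
  [ -d "$2" ] || { echo "Hive lib dir not found: $2" >&2; return 1; }
  cp "$1" "$2"/
}

# On the Hive host:
# install_jdbc_driver mysql-connector-java-5.1.38-bin.jar /usr/local/hive-2.1.1/lib
```

Without this jar on the classpath, schematool later fails with a "com.mysql.jdbc.Driver" ClassNotFoundException.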

(5) Install and configure MySQL:

Set up the MySQL community yum repository first (CentOS 7 ships MariaDB by default, so the release rpm is needed), then install the packages:

[root@Clouder3 local]# wget http://dev.mysql.com/get/mysql-community-release-el7-5.noarch.rpm

[root@Clouder3 local]# rpm -ivh mysql-community-release-el7-5.noarch.rpm

[root@Clouder3 local]# yum -y install mysql mysql-devel

[root@Clouder3 local]# yum install mysql-community-server

After installation succeeds, restart the MySQL service:
[root@Clouder3 local]# service mysqld restart
On a fresh install the root account has no password yet; log in with: mysql -uroot

Set a password:

set password for 'root'@'localhost' =password('root');

Enable remote connections:

mysql> grant all privileges on *.* to 'root'@'%' identified by 'root';

To use a new user instead of root, create the user first:

mysql> create user 'username'@'%' identified by 'password';

mysql> flush privileges;

Configure the MySQL character encoding: [root@Clouder3 local]# vim /etc/my.cnf
Add the encoding setting at the end of the [mysqld] section:
character-set-server=utf8  # the character set must match one listed in /usr/share/mysql/charsets/Index.xml
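For reference, a my.cnf fragment covering both the server and client side (note that MySQL 5.5 and later rejects `default-character-set` under `[mysqld]`; the server-side equivalent is `character-set-server`):

```ini
# /etc/my.cnf (fragment)
[mysqld]
character-set-server=utf8

[client]
default-character-set=utf8
```

Restart mysqld after editing, then `show variables like 'character%';` can confirm the change.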

3. Starting and Testing Hive:

(1) Initialize the Hive metastore (creates Hive's metadata tables and other schema objects in MySQL):

[root@Clouder3 hive-2.1.1]# bin/schematool -initSchema -dbType mysql

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive-2.1.1/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hbase-1.3.1/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL:        jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true
Metastore Connection Driver :    com.mysql.jdbc.Driver
Metastore connection User:       root
Starting metastore schema initialization to 2.1.0
Initialization script hive-schema-2.1.0.mysql.sql
Initialization script completed
schemaTool completed
Once initialization completes, type hive to enter the Hive interactive shell:

[root@Clouder3 hive-2.1.1]# hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive-2.1.1/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hbase-1.3.1/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in jar:file:/usr/local/hive-2.1.1/lib/hive-common-2.1.1.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> show databases;
OK
default
Time taken: 1.798 seconds, Fetched: 1 row(s)
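Beyond `show databases;`, a small smoke test exercises table creation end to end. A sketch (the table name and script path are made up for illustration; the `hive -f` step must be run on the Hive host itself):

```shell
# Generate a throwaway HiveQL script that creates a table, lists
# tables, and drops the table again.
SMOKE=/tmp/hive_smoke.sql
cat > "$SMOKE" <<'EOF'
CREATE TABLE IF NOT EXISTS smoke_test (id INT, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
SHOW TABLES;
DROP TABLE IF EXISTS smoke_test;
EOF

# On the Hive host:
# hive -f "$SMOKE"
```

If the CREATE succeeds, the metastore (MySQL) and the HDFS warehouse directory are both wired up correctly; failures here usually point back to the ConnectionURL settings or the /user/hive/warehouse permissions.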