Installing Hive on Linux: three deployment modes

Source: Internet · Editor: 程序博客网 (Programmer Blog Network) · Date: 2024/05/22 06:12

This guide uses apache-hive-1.2.1-bin.tar.gz as the example release.

Server node5: 192.168.13.135
Server node6: 192.168.13.136
Server node7: 192.168.13.137
Server node8: 192.168.13.138


I. Local mode with the embedded Derby metastore
1. Upload the Hive tarball to the Linux server (/opt/sxt/soft)
2. Extract it: tar -zxvf apache-hive-1.2.1-bin.tar.gz
3. Edit the configuration files (/opt/sxt/soft/apache-hive-1.2.1-bin/conf)

1) cp hive-default.xml.template hive-site.xml
2) vi hive-site.xml

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>org.apache.derby.jdbc.EmbeddedDriver</value>
</property>
<property>
  <name>hive.metastore.local</name>
  <value>true</value>
</property>
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/user/hive/warehouse</value>
</property>

4. Configure the environment variables (then reload the profile)
vi ~/.bash_profile
export HIVE_HOME=/opt/sxt/soft/apache-hive-1.2.1-bin
export PATH=$PATH:$HIVE_HOME/bin
source ~/.bash_profile
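After reloading the profile, the exports can be verified quickly. A minimal sketch, assuming the install path used in this guide:

```shell
# Same exports as above (path from this guide; adjust if yours differs)
export HIVE_HOME=/opt/sxt/soft/apache-hive-1.2.1-bin
export PATH=$PATH:$HIVE_HOME/bin
# Verify that Hive's bin directory made it onto PATH
echo "$PATH" | grep -q "$HIVE_HOME/bin" && echo "PATH ok"
```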
5. Start Hive
1) Before starting, HDFS and YARN must be running (start-all.sh)
        Check via a browser: node5/8:50070 (NameNode UI) and node5/8:8088 (ResourceManager UI)
2) Replace the jline-*.jar under $HADOOP_HOME's lib directory with the jline-2.12.jar from $HIVE_HOME/lib,
   so that Hadoop and Hive use the same jline version.
   The jar path on the Hadoop side is: share/hadoop/yarn/lib
3) Start: hive
Note: with the Derby storage mode, running hive creates a derby.log file and a metastore_db directory in the current working directory. The drawback of this mode is that, from a given directory, only one Hive client at a time can use the database.
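The jar swap in step 2) above can be rehearsed with throwaway directories standing in for the real $HADOOP_HOME/share/hadoop/yarn/lib and $HIVE_HOME/lib (the /tmp paths and the old jline version number here are illustrative):

```shell
# Stand-in directories for the Hadoop and Hive lib folders
mkdir -p /tmp/hadoop_yarn_lib /tmp/hive_lib
touch /tmp/hadoop_yarn_lib/jline-0.9.94.jar   # old jline shipped with Hadoop (version illustrative)
touch /tmp/hive_lib/jline-2.12.jar            # jline bundled with Hive 1.2.1
# Remove the old jar and copy over Hive's version
rm /tmp/hadoop_yarn_lib/jline-*.jar
cp /tmp/hive_lib/jline-2.12.jar /tmp/hadoop_yarn_lib/
ls /tmp/hadoop_yarn_lib
```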

II. Local MySQL mode (recommended; "local" here means any MySQL server reachable by IP)
1. Edit the configuration file
vi /opt/sxt/soft/apache-hive-1.2.1-bin/conf/hive-site.xml
Delete the Derby settings and replace them with:

<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/user/hive_remote/warehouse</value>
</property>
<property>
  <name>hive.metastore.local</name>
  <value>true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>1234</value>
</property>
2. Copy the MySQL driver jar
Copy the MySQL JDBC driver (here mysql-connector-java-5.1.32-bin.jar) into $HIVE_HOME/lib.
3. Start MySQL
service mysqld start
4. Start Hive (make sure the database is up)
hive
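Before the first start, the MySQL side usually needs a metastore database and privileges matching the ConnectionUserName/ConnectionPassword configured above (root/1234 in this guide). A sketch for MySQL 5.x, run as an admin user; adjust user, host, and password to your setup:

```sql
-- Create the metastore database (createDatabaseIfNotExist=true can also do this)
CREATE DATABASE IF NOT EXISTS hive;
-- Allow the configured user to reach it, including from other hosts (MySQL 5.x syntax)
GRANT ALL PRIVILEGES ON hive.* TO 'root'@'%' IDENTIFIED BY '1234';
FLUSH PRIVILEGES;
```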


III. Remote MySQL mode

1) Remote, single node (not personally tested)

In this storage mode, a MySQL server runs on a remote host, and a metastore service must be started on the Hive server. Here a MySQL test server is used:

its IP is 192.168.13.138; create a hive_remote database with the latin1 character set.

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://192.168.13.135:3306/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>1234</value>
  </property>
  <property>
    <name>hive.metastore.local</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://192.168.13.138:9083</value>
  </property>
</configuration>
Note: here the Hive server and client sit on the same machine; they can also be split across machines.
2) Remote, split server/client
1. Install Hive on node8 (or any other node) as described above.
2. Edit hive-site.xml on node8 (the server-side configuration):
vi /opt/sxt/soft/apache-hive-1.2.1-bin/conf/hive-site.xml, with the following content:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <!-- 192.168.13.135:3306 is the host running MySQL; it need not be node5 -->
    <value>jdbc:mysql://192.168.13.135:3306/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>1234</value>
  </property>
</configuration>
3. Start the Hive metastore service (on node8):
  hive --service metastore
4. Edit hive-site.xml on node5 (the client-side configuration):
vi /opt/sxt/soft/apache-hive-1.2.1-bin/conf/hive-site.xml, with the following content:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>hive.metastore.local</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://192.168.13.138:9083</value>
  </property>
</configuration>
5. On the client (node5), just run the hive command:
hive
hive> show tables;
OK
Time taken: 0.707 seconds
hive>
6. Start the HiveServer2 service
$HIVE_HOME/bin/hiveserver2, or $HIVE_HOME/bin/hive --service hiveserver2
7. Connect to HiveServer2
1) Via beeline
    beeline
    beeline> !connect jdbc:hive2://localhost:10000 root org.apache.hive.jdbc.HiveDriver
    or
    beeline> !connect jdbc:hive2://localhost:10000/default
2) Via a JDBC client
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class TestHive2 {
    public static void main(String[] args) {
        try {
            // Register the HiveServer2 JDBC driver
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            Connection conn = DriverManager.getConnection(
                    "jdbc:hive2://192.168.13.135:10000/default", "root", "");
            Statement stmt = conn.createStatement();
            ResultSet resultSet = stmt.executeQuery("select count(*) from people");
            if (resultSet.next()) {
                System.out.println(resultSet.getInt(1));
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
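For the JDBC client to compile and run, the Hive JDBC driver must be on the classpath. A Maven dependency sketch matching the 1.2.1 release used in this guide:

```xml
<!-- Hive JDBC driver; version matches the release installed above -->
<dependency>
  <groupId>org.apache.hive</groupId>
  <artifactId>hive-jdbc</artifactId>
  <version>1.2.1</version>
</dependency>
```

Depending on the cluster, Hadoop client jars (e.g. hadoop-common) may also be needed at runtime.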

Supplement:
1) Load local data into a table
   load data local inpath '/opt/sxt/temp/test.txt' into table people PARTITION (dt='2016-1-1');
   (copying this line straight from a web page fails because of the smart quotes; retype them as plain ASCII quotes)
2)举例建几个表(DDL)

1.CREATE TABLE page_view(
    page_url STRING,
    ip STRING
  )
  ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
  STORED AS TEXTFILE;
  Sample data (--> denotes a tab character, i.e. '\t'):
  Node1-->192.168.13.1
  Node2-->192.168.13.2
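The sample rows above can be materialized into a tab-delimited file ready for LOAD DATA; /tmp/page_view.txt is an illustrative path:

```shell
# Write the two sample rows with real tab characters
printf 'Node1\t192.168.13.1\nNode2\t192.168.13.2\n' > /tmp/page_view.txt
# Each row should split into 2 tab-separated fields, matching the DDL
awk -F'\t' '{print NF}' /tmp/page_view.txt
```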


2.CREATE TABLE people(
id STRING,
name STRING,
likes Array<String>,
addr Map<String,String>
  )
  PARTITIONED BY(dt STRING)
  ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
  COLLECTION ITEMS TERMINATED BY ','
  MAP KEYS TERMINATED BY ':'
  STORED AS TEXTFILE;
  Sample data:
  1-->zs-->game,girl,money-->stuAddr:nantong,workAddr:tongzhou-->2017-1-1
  2-->ls-->game,girl,money-->stuAddr:nantong,workAddr:tongzhou-->2017-1-1
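The delimiters declared in the DDL (tab between fields, ',' between collection items, ':' between map keys and values) can be checked on a sample row outside Hive; /tmp/people.txt is an illustrative path:

```shell
# One sample row, tab-delimited as in the data above
printf '1\tzs\tgame,girl,money\tstuAddr:nantong,workAddr:tongzhou\t2017-1-1\n' > /tmp/people.txt
# Field 4 is the addr map: split entries on ',' and take the value after ':'
cut -f4 /tmp/people.txt | tr ',' '\n' | cut -d: -f2
```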

  select addr['stuAddr'] from people where name='zs';

3. Copy query results from one table into another
  Internal vs. external tables: dropping an internal table deletes its data; dropping an external table leaves the data on HDFS.
  Internal (managed) table data is managed by Hive; external table data lives elsewhere.
  CREATE [EXTERNAL] TABLE people_test(
id STRING,
    name STRING,
likes Array<String>
  )
  ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
   COLLECTION ITEMS TERMINATED BY ','
  STORED AS TEXTFILE;

INSERT OVERWRITE TABLE people_test select id,name,likes FROM people where name='zs';


// Empty the table (a WHERE clause matching no rows overwrites it with nothing; TRUNCATE TABLE people_test; does the same since Hive 0.11)
INSERT OVERWRITE TABLE people_test select id,name,likes FROM people where 1=2;

// Note: UPDATE only works on transactional (ACID) tables, which plain text tables like these are not
UPDATE people_test SET name = 'ls' where name='zs';

// Drop a partition
ALTER TABLE people DROP IF EXISTS PARTITION (dt='2016-1-1');
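The inverse operation, pre-creating a partition before loading data into it, is a sketch of the matching ADD PARTITION statement:

```sql
-- Recreate (or pre-create) the partition dropped above
ALTER TABLE people ADD IF NOT EXISTS PARTITION (dt='2016-1-1');
```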