hive安装和使用相关问题解决方式

来源:互联网 发布:怎么看服务器端口 编辑:程序博客网 时间:2024/04/30 00:21
ubuntu下面:
(1)安装好java
设置 JAVA_HOME
在文件/etc/profile中设置
JAVA_HOME=/usr/
export JAVA_HOME

tianzhao@tianzhao-VirtualBox:~/apache/trunk$ echo $JAVA_HOME
/usr/

(2)ssh
tianzhao@tianzhao-VirtualBox:~$ ssh-keygen -t rsa -P ""
tianzhao@tianzhao-VirtualBox:~$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
ssh localhost
如果没有ssh就需要安装
tianzhao@tianzhao-VirtualBox:~$ sudo apt-get install openssh-server
tianzhao@tianzhao-VirtualBox:~$ ssh localhost


(3) hadoop的配置,  hadoop-env.sh  mapred-site.xml hdfs-site.xml core-site.xml
官方的版本里面要在 hadoop-env.sh 加上 :
export JAVA_HOME=/usr/

tianzhao@tianzhao-VirtualBox:~/hadoop-0.20.2-cdh3u0/conf$ cat core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
  <description>The name of the default file system. A URI whose
  scheme and authority determine the FileSystem implementation. The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class. The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
</property>

<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/tianzhao/hadoop-0.20.2-cdh3u0/tmp</value>
</property>

</configuration>



tianzhao@tianzhao-VirtualBox:~/hadoop-0.20.2-cdh3u0/conf$ cat hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>
<name>dfs.name.dir</name>
<value>/home/tianzhao/hadoop-0.20.2-cdh3u0/hdfs/name</value>
</property>

<property>
<name>dfs.data.dir</name>
<value>/home/tianzhao/hadoop-0.20.2-cdh3u0/hdfs/data</value>
</property>

</configuration>





tianzhao@tianzhao-VirtualBox:~/hadoop-0.20.2-cdh3u0/conf$ cat mapred-site.xml <?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>

</configuration>

bin/hadoop namenode -format

bin/start-all.sh
bin/stop-all.sh


(3.1)hbase   hbase-site.xml
<configuration>

  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:54310/hbase</value>
  </property>

<property> 
  <name>hbase.cluster.distributed</name> 
  <value>true</value> 
</property> 

  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>

</configuration>

// master
http://localhost:60010
// region server
http://localhost:60030

mysql 4不支持,mysql 5以上才支持。 外键问题。
(4)mysql安装
tianzhao@tianzhao-VirtualBox:~/hive-0.7.0-cdh3u0$ sudo apt-get install mysql-server
root用户密码:tianzhao1234

tianzhao@tianzhao-VirtualBox:~/hive-0.7.0-cdh3u0$ mysql -uroot -ptianzhao1234
mysql> create database hive;
Query OK, 1 row affected (0.00 sec)
mysql> create user 'hive'@'localhost' identified by '123456';
Query OK, 0 rows affected (0.00 sec)
mysql> grant all privileges on hive.* to 'hive'@'localhost' with grant option;
Query OK, 0 rows affected (0.00 sec)

flush privileges;

sudo /etc/init.d/mysql restart

http://www.pantz.org/software/mysql/mysqlcommands.html




用mysqldump命令行
命令格式
mysqldump -u用户名 -p 数据库名 > 数据库名.sql
范例:
mysqldump -uroot -p abc > abc.sql
(导出数据库abc到abc.sql文件)
提示输入密码时,输入该数据库用户名的密码。

用mysql命令行
命令格式
mysql -u用户名 -p 数据库名 < 数据库名.sql
范例:
mysql -uabc_f -p abc < abc.sql
(导入数据库abc从abc.sql文件)
提示输入密码时,输入该数据库用户名的密码。

drop database hive;
show databases;


(5)hive  hive-config.sh   hive-site.xml
1.修改bin/hive-config.sh
export HADOOP_HOME=/home/tianzhao/hadoop-0.20.2-cdh3u0/

conf/hive-site.xml

tianzhao@tianzhao-VirtualBox:~/hive-0.7.0-cdh3u0/conf$ cat hive-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>
  <name>hive.metastore.local</name>
  <value>true</value>
</property>

<property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
        <description>JDBC connect string FOR a JDBC metastore</description>
</property>

<property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
        <description>Driver class name FOR a JDBC metastore</description>
</property>

<property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>hive</value>
        <description>username TOUSE against metastore database</description>
</property>
<property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>123456</value>
        <description>password TOUSE against metastore database</description>
</property>

</configuration>

待修改


export HADOOP_HOME=/home/tianzhao/hadoop-0.19.2
export HADOOP_CONF_DIR=$HOME/config



echo $HOME 
/home/tianzhao
export HIVE_HOME=/home/tianzhao/hive
export HIVE_CONF_DIR=/home/tianzhao/hive-config
export HIVE_LIB=${HIVE_HOME}/lib
export HIVE_AUX_JARS_PATH=/home/tianzhao/jar




在 hive  cli中执行show tables,alter table等hivesql变得很慢,往往是因为mysql的原因,因为hive操作分几部分:表信息操作是通过连接的数据库,目录操作的是hdfs上面的目录,job提交给jobtracker。
show tables主要是从mysql中读取表的信息,然后显示。如果变得很慢就是hive从Mysql获取table信息很慢,可以重启mysql或者把mysql迁移到其他机器上。

http://forge.mysql.com/wiki/MySQL_Internals_Optimizer#The_Optimizer


hive> show tables;
FAILED: Error in metadata: javax.jdo.JDODataStoreException: Exception thrown obtaining schema column information from datastore
NestedThrowables:
java.sql.SQLException: Got error 28 from storage engine
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

java.sql.SQLException: Got error 28 from storage engine  这个是因为磁盘空间满了的原因


http://bugs.mysql.com/bug.php?id=3611
This issue still exist in MySQL server version 5.1.41
I did a sub-query in FROM clause and used WHERE clause to filter results.
The following query works fine without WHERE clause. If you add WHERE clause you will
receive an error code: "1267 Illegal mix of collations (latin1_swedish_ci, IMPLICIT) and
(utf8_general_ci, COERCIBLE) for operation '='"

This example is taken from the book I was reading:

SELECT PLAYERNO, NAME, PLAYERS.TOWN, NUMBER * 1000
FROM PLAYERS,
(SELECT 'Stratford' AS TOWN, 4 AS NUMBER
UNION
SELECT 'Plymouth', 6
UNION
SELECT 'Inglewood', 1
UNION
SELECT 'Douglas', 2) AS TOWNS
WHERE PLAYERS.TOWN = TOWNS.TOWN
ORDER BY PLAYERNO


脏数据产生了分区字段,然后出错。
Failed with exception javax.jdo.JDODataStoreException: Error executing JDOQL query "SELECT 'org.apache.hadoop.hive.metastore.model.MPartition' AS NUCLEUS_TYPE,`THIS`.`CREATE_TIME`,`THIS`.`LAST_ACCESS_TIME`,`THIS`.`PART_NAME`,`THIS`.`PART_ID` FROM `PARTITIONS` `THIS` LEFT OUTER JOIN `TBLS` `THIS_TABLE_TABLE_NAME` ON `THIS`.`TBL_ID` = `THIS_TABLE_TABLE_NAME`.`TBL_ID` LEFT OUTER JOIN `TBLS` `THIS_TABLE_DATABASE` ON `THIS`.`TBL_ID` = `THIS_TABLE_DATABASE`.`TBL_ID` LEFT OUTER JOIN `DBS` `THIS_TABLE_DATABASE_DATABASE_NAME` ON `THIS_TABLE_DATABASE`.`DB_ID` = `THIS_TABLE_DATABASE_DATABASE_NAME`.`DB_ID` WHERE `THIS_TABLE_TABLE_NAME`.`TBL_NAME` = ? AND `THIS_TABLE_DATABASE_DATABASE_NAME`.`NAME` = ? AND `THIS`.`PART_NAME` = ?" : Illegal mix of collations (latin1_bin,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='.
NestedThrowables:
java.sql.SQLException: Illegal mix of collations (latin1_bin,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask


Failed with exception javax.jdo.JDODataStoreException: Add request failed : INSERT INTO `COLUMNS` (`SD_ID`,`COMMENT`,`COLUMN_NAME`,`TYPE_NAME`,`INTEGER_IDX`) VALUES (?,?,?,?,?)
NestedThrowables:
java.sql.BatchUpdateException: The table 'columns' is full
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask

错误信息:
Caused by: java.sql.SQLException: The table 'xxx' is full
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1073)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3597)
        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3529)
解决办法:
请参阅如下地址
1. http://forums.dcm4che.org/jiveforums/message.jspa?messageID=2751
2. http://jeremy.zawodny.com/blog/archives/000796.html
3. http://dev.mysql.com/doc/refman/5.0/en/full-table.html
原创粉丝点击