Hive 基本操作

来源:互联网 发布:泉州电视台直播软件 编辑:程序博客网 时间:2024/06/16 23:26
目录
1.显示地展示当前使用的数据库
2.建表
3.导入本地文件
4.数据库多用户共享表
5.创建数据库
5.1仓库数据库判断数据库是否存在
5.2.显示现在有的数据库
5.3条件查询数据库
5.4创建指定存放文件位置 数据库
5.5创建数据库时 添加注释信息
5.6创建数据库 添加扩展信息
5.7使用数据库 
5.8删除数据库 
5.9当数据库存在表时,先要删除表 再能删除数据库  
5.10修改数据库
6.1创建表
6.2创建表-复制表结构
6.3显示某个数据库中的表
6.4显示指定筛选条件 表名
6.5显示表扩展信息
6.6指定显示某个字段的信息
7.1External Tables -- 外部表
7.2复制表结构仓库外部表
7.3Partitioned, Managed Tables --分区表
7.3.1建议安全措施
7.3.2查看现有分区
7.3.3查看分区详细 分区键
7.3.3通过 DESC 显示分区键
7.3.4从文件读入 分区表
7.4.1External Partitioned Tables 外部分区表
7.4.1.1.先创建外部表结构
7.4.2.为外部表增加指定分区
7.4.3.把数据表复制外部表目录结构中
7.4.3显示 表分区信息





1.显示地展示当前使用的数据库 :hive> set hive.cli.print.current.db=true; 
2.建表:create table t_order(id int,age int) row format delimited fields terminated by ',';
3.导入本地文件:load data local inpath '/home/hadoop/hivedata/data' into table t_order;
4.数据库多用户共享表(解决不同用户不能共享数据库):mysqld_safe --skip-grant-tables
注意:Hive 不支持行级插入,更新,删除。也不支持事务
5.创建数据库 hive> CREATE DATABASE financials;  
5.1仓库数据库判断数据库是否存在  hive> CREATE DATABASE IF NOT EXISTS financials;  
5.2.显示现在有的数据库 hive> SHOW DATABASES;  
                                        hive> CREATE DATABASE human_resources;  
                                        hive> SHOW DATABASES;  
5.3条件查询数据库  hive> SHOW DATABASES LIKE 'h.*';  
5.4创建指定存放文件位置 数据库
  hive> CREATE DATABASE financials  
    > LOCATION '/my/preferred/directory';  
5.5创建数据库时 添加注释信息
hive> CREATE DATABASE financials  
    > COMMENT 'Holds all financial tables';  
hive> DESCRIBE DATABASE financials;  
financials   Holds all financial tables  
  hdfs://master-server/user/hive/warehouse/financials.db  
5.6创建数据库 添加扩展信息
hive> CREATE DATABASE financials  
    > WITH DBPROPERTIES ('creator' = 'Mark Moneybags', 'date' = '2012-01-02');  
hive> DESCRIBE DATABASE financials;  
financials   hdfs://master-server/user/hive/warehouse/financials.db  
hive> DESCRIBE DATABASE EXTENDED financials;  
financials   hdfs://master-server/user/hive/warehouse/financials.db  
 {date=2012-01-02, creator=Mark Moneybags);  
5.7使用数据库  hive> USE financials;  
5.8删除数据库 hive> DROP DATABASE IF EXISTS financials;  
5.9当数据库存在表时,先要删除表 再能删除数据库  hive> DROP DATABASE IF EXISTS financials CASCADE;  
5.10修改数据库 hive> ALTER DATABASE financials SET DBPROPERTIES ('edited-by' = 'Joe Dba');  
        There is no way to delete or “unset” a DBPROPERTY  没有方法删除或重置 DBPROPERTY 
6.1创建表
CREATE TABLE IF NOT EXISTS mydb.employees (  
  name         STRING COMMENT 'Employee name',  
  salary       FLOAT  COMMENT 'Employee salary',  
  subordinates ARRAY<STRING> COMMENT 'Names of subordinates',  
  deductions   MAP<STRING, FLOAT>  
               COMMENT 'Keys are deductions names, values are percentages  
  address      STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>  
               COMMENT 'Home address')  
COMMENT 'Description of the table'  
TBLPROPERTIES ('creator'='me', 'created_at'='2012-01-02 10:00:00', ...)  
LOCATION '/user/hive/warehouse/mydb.db/employees';  

6.2创建表-复制表结构
CREATE TABLE IF NOT EXISTS mydb.employees2  
LIKE mydb.employees;  

6.3显示某个数据库中的表
hive> USE mydb;  
hive> SHOW TABLES;    

6.4显示指定筛选条件 表名
hive> USE mydb;  
hive> SHOW TABLES 'empl.*';  
employees  

6.5显示表扩展信息
hive> DESCRIBE EXTENDED mydb.employees;  
name    string  Employee name  
salary  float   Employee salary  
subordinates    array<string>   Names of subordinates  
deductions      map<string,float> Keys are deductions names, values are percentages  
address struct<street:string,city:string,state:string,zip:int>  Home address  
Detailed Table Information      Table(tableName:employees, dbName:mydb, owner:me,  
...  
location:hdfs://master-server/user/hive/warehouse/mydb.db/employees,  
parameters:{creator=me, created_at='2012-01-02 10:00:00',  
            last_modified_user=me, last_modified_time=1337544510,  
            comment:Description of the table, ...}, ...)  


6.6指定显示某个字段的信息
hive> DESCRIBE mydb.employees.salary;  
salary  float   Employee salary  

7.1External Tables -- 外部表
外部表,删除表不删除数据
CREATE EXTERNAL TABLE IF NOT EXISTS stocks (  
  exchange        STRING,  
  symbol          STRING,  
  ymd             STRING,  
  price_open      FLOAT,  
  price_high      FLOAT,  
  price_low       FLOAT,  
  price_close     FLOAT,  
  volume          INT,  
  price_adj_close FLOAT)  
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','  
LOCATION '/data/stocks';  

7.2复制表结构仓库外部表
CREATE EXTERNAL TABLE IF NOT EXISTS mydb.employees3  
LIKE mydb.employees  
LOCATION '/path/to/data';  

7.3Partitioned, Managed Tables --分区表
CREATE TABLE employees (  
  name         STRING,  
  salary       FLOAT,  
  subordinates ARRAY<STRING>,  
  deductions   MAP<STRING, FLOAT>,  
  address      STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>  
)  
PARTITIONED BY (country STRING, state STRING);  

However, Hive will now create subdirectories reflecting the partitioning structure. For
example:

...  
.../employees/country=CA/state=AB  
.../employees/country=CA/state=BC  
...  
.../employees/country=US/state=AL  
.../employees/country=US/state=AK  
...  

7.3.1建议安全措施
把HIVE 设置成“严格”模式,禁止分区表的查询没有
一个WHERE子句

hive> set hive.mapred.mode=strict;  
hive> SELECT e.name, e.salary FROM employees e LIMIT 100;  
FAILED: Error in semantic analysis: No partition predicate found for  
 Alias "e" Table "employees"  
hive> set hive.mapred.mode=nonstrict;  
hive> SELECT e.name, e.salary FROM employees e LIMIT 100;  

7.3.2查看现有分区

hive> SHOW PARTITIONS employees;  
...  
Country=CA/state=AB  
country=CA/state=BC  
...  
country=US/state=AL  
country=US/state=AK  

7.3.3查看分区详细 分区键

hive> SHOW PARTITIONS employees PARTITION(country='US');  
country=US/state=AL  
country=US/state=AK  
...  
hive> SHOW PARTITIONS employees PARTITION(country='US', state='AK');  
country=US/state=AK  

7.3.3通过 DESC 显示分区键

hive> DESCRIBE EXTENDED employees;  
name         string,  
salary       float,  
...  
address      struct<...>,  
country      string,  
state        string  
Detailed Table Information...  
partitionKeys:[FieldSchema(name:country, type:string, comment:null),  
FieldSchema(name:state, type:string, comment:null)],  
...  

7.3.4从文件读入 分区表
LOAD DATA LOCAL INPATH '${env:HOME}/california-employees'  
INTO TABLE employees  
PARTITION (country = 'US', state = 'CA');  


7.4.1External Partitioned Tables 外部分区表

1.先创建外部表结构
CREATE EXTERNAL TABLE IF NOT EXISTS log_messages (  
  hms             INT,  
  severity        STRING,  
  server          STRING,  
  process_id      INT,  
  message         STRING)  
PARTITIONED BY (year INT, month INT, day INT)  
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';  

7.4.2.为外部表增加指定分区
ALTER TABLE log_messages ADD PARTITION(year = 2012, month = 1, day = 2)  
LOCATION 'hdfs://master_server/data/log_messages/2012/01/02';  

7.4.3.把数据表复制外部表目录结构中
Copy the data for the partition being moved to S3. For example, you can use the
Hadoop distcp command:

hadoop distcp /data/log_messages/2011/12/02 s3n://ourbucket/logs/2011/12/02  

•Alter the table to point the partition to the S3 location:

ALTER TABLE log_messages PARTITION(year = 2011, month = 12, day = 2)  

SET LOCATION 's3n://ourbucket/logs/2011/01/02';
•Remove the HDFS copy of the partition using the hadoop fs -rmr command:

hadoop fs -rmr /data/log_messages/2011/01/02  

7.4.3显示 表分区信息
hive> SHOW PARTITIONS log_messages;  
...  
year=2011/month=12/day=31  
year=2012/month=1/day=1  
year=2012/month=1/day=2  

hive> DESCRIBE EXTENDED log_messages;  
...  
message         string,  
year            int,  
month           int,  
day             int  
Detailed Table Information...  
partitionKeys:[FieldSchema(name:year, type:int, comment:null),  
FieldSchema(name:month, type:int, comment:null),  
FieldSchema(name:day, type:int, comment:null)],  
...  

hive> DESCRIBE EXTENDED log_messages PARTITION (year=2012, month=1, day=2);  
...  
location:s3n://ourbucket/logs/2011/01/02,  
...  


8.1Customizing Table Storage Formats -- 表存储格式(表操作)

CREATE TABLE employees (  
  name         STRING,  
  salary       FLOAT,  
  subordinates ARRAY<STRING>,  
  deductions   MAP<STRING, FLOAT>,  
  address      STRUCT<street:STRING, city:STRING, state:STRING, zip:INT>  
)  
ROW FORMAT DELIMITED  
FIELDS TERMINATED BY '\001'  
COLLECTION ITEMS TERMINATED BY '\002'  
MAP KEYS TERMINATED BY '\003'  
LINES TERMINATED BY '\n'  
STORED AS TEXTFILE;  

Dropping Tables -- 删除表


DROP TABLE IF EXISTS employees;  

For external tables, the metadata is deleted but the data is not.

Alter Table --修改表结构

ALTER TABLE modifies  table metadata  only. The  data  for  the  table  is
untouched. It’s up to you to ensure that any modifications are consistent
with the actual data.

Renaming a Table -- 修改表名

ALTER TABLE log_messages RENAME TO logmsgs;  

Adding, Modifying, and Dropping a Table Partition -- 增加,修改,删除 表分区


ALTER TABLE log_messages ADD IF NOT EXISTS  
PARTITION (year = 2011, month = 1, day = 1) LOCATION '/logs/2011/01/01'  
PARTITION (year = 2011, month = 1, day = 2) LOCATION '/logs/2011/01/02'  
PARTITION (year = 2011, month = 1, day = 3) LOCATION '/logs/2011/01/03'  


ALTER TABLE log_messages PARTITION(year = 2011, month = 12, day = 2)  
SET LOCATION 's3n://ourbucket/logs/2011/01/02';  


ALTER TABLE log_messages DROP IF EXISTS PARTITION(year = 2011, month = 12, day = 2);  


Changing Columns --修改列


ALTER TABLE log_messages  
CHANGE COLUMN hms hours_minutes_seconds INT  
COMMENT 'The hours, minutes, and seconds part of the timestamp'  
AFTER severity;  

Adding Columns --增加列

ALTER TABLE log_messages ADD COLUMNS (  
 app_name   STRING COMMENT 'Application name',  
 session_id LONG   COMMENT 'The current session id');  

Deleting or Replacing Columns --删除 替换列


ALTER TABLE log_messages REPLACE COLUMNS (  
 hours_mins_secs INT    COMMENT 'hour, minute, seconds from timestamp',  
 severity        STRING COMMENT 'The message severity'  
 message         STRING COMMENT 'The rest of the message');  

This statement effectively renames the original hms column and removes the server and
process_id columns from the original schema definition. As for all ALTER statements,
only the table metadata is changed.

Alter Table Properties --修改表属性

ALTER TABLE log_messages SET TBLPROPERTIES (  
 'notes' = 'The process id is no longer captured; this column is always NULL');  

Alter Storage Properties --修改存储属性

ALTER TABLE log_messages  
PARTITION(year = 2012, month = 1, day = 1)  
SET FILEFORMAT SEQUENCEFILE;  

You can specify a new SerDe along with SerDe properties or change the properties for
the existing SerDe. The following example specifies that a table will use a Java class
named com.example.JSONSerDe to process a file of JSON-encoded records

ALTER TABLE table_using_JSON_storage  
SET SERDE 'com.example.JSONSerDe'  
WITH SERDEPROPERTIES (  
 'prop1' = 'value1',  
 'prop2' = 'value2');  
来源: http://blog.csdn.net/iquicksandi/article/details/8522691
原创粉丝点击