Hive Practice Exercise 1: Creating a Table and Querying It

First, a look at the first five lines of the raw data, then at the table-creation script:

[root@master exercise]# head -n 5 visits_data.txt
BUCKLEY SUMMER 10/12/2010 14:48 10/12/2010 14:45 WH
CLOONEY GEORGE 10/12/2010 14:47 10/12/2010 14:45 WH
PRENDERGAST JOHN 10/12/2010 14:48 10/12/2010 14:45 WH
LANIER JAZMIN 10/13/2010 13:00 WH BILL SIGNING/
MAYNARD ELIZABETH 10/13/2010 12:34 10/13/2010 13:00 WH BILL SIGNING/

[root@master exercise]# cat visits.hive
-- visits.hive: table over the tab-delimited visitor-log data
create table people_visits (
    last_name        string,
    first_name       string,
    arrival_time     string,
    scheduled_time   string,
    meeting_location string,
    info_comment     string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t';

At first this CREATE TABLE statement meant nothing to me. Read piece by piece: it defines a table with six string columns, and ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' tells Hive that the table's data files are plain text whose fields are separated by tab characters, which matches the layout of visits_data.txt above.


[root@master ~]# ./hive -f /opt/visits.hive
bash: ./hive: No such file or directory
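The ./hive form fails simply because there is no hive executable in the current directory. Two standard ways around it, with the install path assumed from the hive-common-1.2.2.jar log line further down:

# call the launcher by its full path
[root@master ~]# /opt/apache-hive-1.2.2-bin/bin/hive -f /opt/visits.hive

# or put Hive's bin directory on the PATH once, then call it from anywhere
[root@master ~]# export PATH=$PATH:/opt/apache-hive-1.2.2-bin/bin
[root@master ~]# hive -f /opt/visits.hive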

hive> ./hive -f /opt/exercise/visits.hive
> ;
NoViableAltException(17@[])



hive> -f /opt/exercise/visits.hive
> ;
NoViableAltException(299@[])
at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1074)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:202)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:397)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:309)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1145)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1193)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1082)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1072)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:736)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
FAILED: ParseException line 1:1 cannot recognize input near '-' 'f' '/'
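Both attempts fail for the same reason: -f is an option of the hive launcher script, not an HQL statement, so the SQL parser chokes on it. From inside an already-running CLI session, the equivalent of hive -f is the source command:

hive> source /opt/exercise/visits.hive;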






[root@master exercise]# hive -f /opt/exercise/visits.hive

Logging initialized using configuration in jar:file:/opt/apache-hive-1.2.2-bin/lib/hive-common-1.2.2.jar!/hive-log4j.properties
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:For direct MetaStore DB connections, we don't support retries at the client level.)
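This MetaException comes from the metastore rather than from the script itself. A common cause on Hive 1.2 is an uninitialized or mismatched metastore schema; one common fix (a sketch, assuming the default embedded Derby metastore; use the matching -dbType for MySQL and the like) is to initialize the schema with the bundled schematool:

[root@master ~]# /opt/apache-hive-1.2.2-bin/bin/schematool -dbType derby -initSchema

In any case, in a fresh session the table is there: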



[root@master ~]# hive

Logging initialized using configuration in jar:file:/opt/apache-hive-1.2.2-bin/lib/hive-common-1.2.2.jar!/hive-log4j.properties
hive (default)> show tables;
OK
tab_name
people_visits
Time taken: 1.281 seconds, Fetched: 1 row(s)
hive (default)> describe people_visits;
OK
col_name data_type comment
last_name string
first_name string
arrival_time string
scheduled_time string
meeting_location string
info_comment string
Time taken: 0.544 seconds, Fetched: 6 row(s)
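The next step is to copy the data file into the table's directory on HDFS. Rather than guessing where that is (the guess below turns out to be wrong), the table's actual location can be read straight from its metadata; the Location: field of this output names the exact HDFS directory:

hive (default)> describe formatted people_visits;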


[root@master exercise]# hadoop fs -put visits_data.txt /data/hive/warehouse/people_visits
put: `/data/hive/warehouse/people_visits/': No such file or directory


(An answer to a similar question online explains it: you are getting the error because no such directory exists at that path; -put does not create missing parent directories, and relative paths are resolved against the user's HDFS home directory. Create the directory first with:
bin/hadoop fs -mkdir input
and then re-run the -put command.)



[root@master data]# hadoop fs -mkdir /hive
--OK

[root@master data]# hadoop fs -mkdir /data/hive/warehouse/people_visits
mkdir: `/data/hive/warehouse/people_visits': No such file or directory
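This mkdir fails because the intermediate directories do not exist, and plain -mkdir will not create them. Two commands cover both problems here: -ls -R to see the whole tree, and -mkdir -p to create every missing parent in one go:

# list the HDFS directory tree recursively
[root@master data]# hadoop fs -ls -R /

# create the full nested path, including missing parents
[root@master data]# hadoop fs -mkdir -p /data/hive/warehouse/people_visits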

How do you inspect the HDFS directory tree, and how do you create nested HDFS directories? (The two commands above are the answer, which I did not know at the time.) For now I just put the file under the top-level /hive directory:
hadoop fs -put visits_data.txt /hive



17/08/23 10:56:03 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /hive/visits_data.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.


put: File /hive/visits_data.txt._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.



First I confirmed the firewall was already off. My cluster is two machines, master and slave, and dfs.replication was set to 2; changing it to 1 in hdfs-site.xml fixed it:

<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>
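Note that the error message itself says 0 datanode(s) running, so before touching replication it is worth confirming that the datanode daemons are actually up:

# per-node: list the running Hadoop JVMs (a DataNode should appear on each worker)
[root@master ~]# jps

# cluster-wide: live datanodes, capacity, and usage
[root@master ~]# hdfs dfsadmin -report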

[root@master exercise]# hadoop fs -ls /hive
Found 1 items
-rw-r--r-- 2 root supergroup 989239 2017-08-23 11:04 /hive/visits_data.txt


hive (default)> select * from people_visits limit 5;
OK
people_visits.last_name people_visits.first_name people_visits.arrival_time people_visits.scheduled_time people_visits.meeting_location people_visits.info_comment
Time taken: 2.319 seconds

The data I just put in cannot be found! Is this a configuration problem?

It is. Hive looks for a managed table's files under the warehouse directory set by hive.metastore.warehouse.dir in hive-site.xml, and the file had been copied to /hive instead:

<property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
</property>


Then copy the file into the table's directory under the warehouse:

[root@master exercise]# hadoop fs -put visits_data.txt /user/hive/warehouse/people_visits


[root@master exercise]# hadoop fs -ls /user/hive/warehouse/people_visits
Found 1 items
-rw-r--r-- 2 root supergroup 989239 2017-08-24 15:08 /user/hive/warehouse/people_visits/visits_data.txt
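For the record, Hive can also do this copy itself, which avoids having to know the warehouse path at all. A sketch, assuming the data file sits in the shell's current directory on the local filesystem:

hive (default)> LOAD DATA LOCAL INPATH 'visits_data.txt' INTO TABLE people_visits;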


hive (default)> select * from people_visits limit 5;
OK
people_visits.last_name people_visits.first_name people_visits.arrival_time people_visits.scheduled_time people_visits.meeting_location people_visits.info_comment
BUCKLEY SUMMER 10/12/2010 14:48 10/12/2010 14:45 WH
CLOONEY GEORGE 10/12/2010 14:47 10/12/2010 14:45 WH
PRENDERGAST JOHN 10/12/2010 14:48 10/12/2010 14:45 WH
LANIER JAZMIN 10/13/2010 13:00 WH BILL SIGNING/
MAYNARD ELIZABETH 10/13/2010 12:34 10/13/2010 13:00 WH BILL SIGNING/
Time taken: 0.631 seconds, Fetched: 5 row(s)


hive (default)> select count(*) from people_visits;
Query ID = root_20170824153813_390ff406-b9bb-4b83-99bf-a2f5bf9092dc
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1503560054683_0001, Tracking URL = http://master:8088/proxy/application_1503560054683_0001/
Kill Command = /opt/hadoop-2.6.5/bin/hadoop job -kill job_1503560054683_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2017-08-24 15:38:39,084 Stage-1 map = 0%, reduce = 0%
2017-08-24 15:38:47,164 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 2.88 sec
2017-08-24 15:38:53,717 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 5.32 sec
MapReduce Total cumulative CPU time: 5 seconds 320 msec
Ended Job = job_1503560054683_0001
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 5.32 sec HDFS Read: 996386 HDFS Write: 6 SUCCESS
Total MapReduce CPU Time Spent: 5 seconds 320 msec
OK
_c0
17977
Time taken: 43.436 seconds, Fetched: 1 row(s)
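With 17,977 rows in place, the table is ready for real queries. A natural follow-up that fits this exercise (column names from the DDL above; not run as part of this session) is to find the busiest meeting locations:

hive (default)> select meeting_location, count(*) as visit_count
              > from people_visits
              > group by meeting_location
              > order by visit_count desc
              > limit 5;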

