Hive architecture

Source: Internet | Published by: 网络与手机失泄密 | Editor: 程序博客网 | Time: 2024/04/26 23:55

HIVE

MetaStore element: table...

Driver

compiler parsing: get table... from MetaStore -> logical plan

ParseDriver -> abstract syntax tree

SemanticAnalyzer -> query block

logical plan generator -> logical plan

query plan generator -> physical plan (converts the logical plan into a physical plan)

optimizer: optimizes the logical plan using column pruning / predicate pushdown
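
A toy sketch of what column pruning and predicate pushdown buy you. This is not Hive's optimizer, only the idea: the table, its columns, and the query (`SELECT name FROM users WHERE age > 30`) are made up for illustration.

```python
# Toy illustration only: NOT Hive's optimizer, just the idea behind it.
# Query being modelled: SELECT name FROM users WHERE age > 30
# The table contents below are hypothetical.

ROWS = [
    {"id": 1, "name": "ann", "age": 35, "city": "Oslo"},
    {"id": 2, "name": "bob", "age": 28, "city": "Rome"},
    {"id": 3, "name": "cid", "age": 41, "city": "Lima"},
]


def naive_plan(rows):
    """Scan every column of every row, then filter, then project."""
    cells_read = 0
    scanned = []
    for row in rows:
        cells_read += len(row)        # all columns are materialised
        scanned.append(dict(row))
    filtered = [r for r in scanned if r["age"] > 30]
    return [{"name": r["name"]} for r in filtered], cells_read


def optimized_plan(rows):
    """Column pruning: only name and age are read.
    Predicate pushdown: age > 30 is checked during the scan itself."""
    needed = ("name", "age")
    cells_read = 0
    result = []
    for row in rows:
        cells_read += len(needed)     # pruned columns are never read
        if row["age"] > 30:           # predicate evaluated at scan time
            result.append({"name": row["name"]})
    return result, cells_read


naive_result, naive_cells = naive_plan(ROWS)
opt_result, opt_cells = optimized_plan(ROWS)
print(naive_result == opt_result, naive_cells, opt_cells)
```

Both plans return the same rows; the optimized one simply touches fewer cells and never builds the intermediate unfiltered list.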

executor: uses a DAG to generate a chain of jobs -> runs the jobs in order; each job is a MapReduce task (MapReduce script)

If jobs depend on one another, the parent job runs to completion before the child job starts.
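
The parent-before-child rule is a topological ordering of the job DAG. A minimal sketch using Python's standard `graphlib`; the job names are hypothetical and the real executor of course does much more than sort.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical job names; each key maps to the set of parent jobs it depends on.
job_parents = {
    "job_join": {"job_scan_a", "job_scan_b"},
    "job_aggregate": {"job_join"},
    "job_scan_a": set(),
    "job_scan_b": set(),
}


def execution_order(parents):
    """Return a job order in which every parent precedes its children."""
    return list(TopologicalSorter(parents).static_order())


order = execution_order(job_parents)
print(order)
```

The two scan jobs have no dependency on each other, so their relative order is not fixed; only the parent-before-child constraint is guaranteed.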

interface

CLI bin/hive --service cli

HWI bin/hive --service hwi port:9999

ThriftServer bin/hive --service hiveserver port:10000

DBS

Database (dir in Hive): root location set by hive.metastore.warehouse.dir in hive-site.xml

Table (dir in Hive): internal table or external table

Partition (dir in Hive)

Bucket (1 file in Hive)
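
A rough sketch of how the levels above map to paths. It assumes the usual conventions (a `<name>.db` directory per non-default database, one `column=value` directory per partition column) and that an integer bucket key hashes to itself; the database, table, and partition names are hypothetical.

```python
import posixpath


def table_dir(warehouse, database, table):
    """Directory of a table inside the warehouse. Assumes the usual convention
    that every database except `default` gets its own `<name>.db` directory."""
    if database == "default":
        return posixpath.join(warehouse, table)
    return posixpath.join(warehouse, database + ".db", table)


def partition_dir(tbl_dir, partition_spec):
    """Each partition column adds one `column=value` directory level."""
    parts = [f"{col}={val}" for col, val in partition_spec]
    return posixpath.join(tbl_dir, *parts)


def bucket_index(int_key, num_buckets):
    """Bucket for an integer key: hash modulo the bucket count.
    Assumption: the hash of an int is the int itself."""
    return (int_key & 0x7FFFFFFF) % num_buckets


# Commonly cited default value of hive.metastore.warehouse.dir
warehouse = "/user/hive/warehouse"
tdir = table_dir(warehouse, "sales", "orders")       # hypothetical db and table
pdir = partition_dir(tdir, [("dt", "2024-04-26")])   # hypothetical partition
print(pdir)                  # /user/hive/warehouse/sales.db/orders/dt=2024-04-26
print(bucket_index(42, 4))   # 2
```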

table:

internal table

Table metadata is stored in the metastore

external table

Stored on external storage media

Datatype

Numeric

Decimal

Float

double

Int(BIGINT,SMALLINT,TINYINT,INT)

Date/Time

TIMESTAMP

DATE

String

Char

varchar

Advanced

STRUCT struct('a','b')

MAP map('1','a','2','b')

ARRAY array('a','b')
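
To make the three constructors concrete, here are rough Python stand-ins. The function names are hypothetical helpers, not a Hive API; they only mirror how the arguments are interpreted, in particular that `map(...)` takes alternating keys and values and that `struct(...)` names its fields `col1`, `col2`, ... by position.

```python
def hive_struct(*values):
    """Mimics struct(v1, v2, ...): fields are named col1, col2, ... by position."""
    return {f"col{i}": v for i, v in enumerate(values, start=1)}


def hive_map(*args):
    """Mimics map(k1, v1, k2, v2, ...): arguments alternate key, value."""
    if len(args) % 2 != 0:
        raise ValueError("map() needs an even number of arguments")
    return dict(zip(args[0::2], args[1::2]))


def hive_array(*values):
    """Mimics array(v1, v2, ...): an ordered list of values."""
    return list(values)


print(hive_struct("a", "b"))          # {'col1': 'a', 'col2': 'b'}
print(hive_map("1", "a", "2", "b"))   # {'1': 'a', '2': 'b'}
print(hive_array("a", "b"))           # ['a', 'b']
```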

[graph]

Hadoop

JobTracker: receives the job and the metadata for the job

TaskTracker: runs the MapReduce execution; the result finally returns to the executor, and the executor returns it to the client

tips:

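1. The database that stores the metastore should be deployed for high availability, i.e. on multiple database servers, to avoid a single point of failure.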

0 0