SQL support for druid.io

Source: Internet  Editor: 程序博客网  Time: 2024/06/18 01:38
Reference: http://druidwithsql.tumblr.com/post/98578718282/a-first-look-at-druid-with-sql
 
 
download:  git clone git@git.corp.yahoo.com:srikalyan/Sql4D.git
 
 
make install:
mvn clean install -DskipTests=true
 
start:
java -jar Sql4DClient/target/Sql4DClient-4.1.0.jar -bh 10.13.4.45 -bp 8092 -ch 10.13.4.45 -cp 8091 -oh 10.13.4.45 -op 8061 -mh 10.210.136.64 -mp 3306 -mid druid -mpw diurd -mdb druid -i 50
 
-bh: broker node host
-bp: broker node port
-ch: coordinator node host
-cp: coordinator node port
-oh: overlord node host
-op: overlord node port
-mh: mysql host
-mp: mysql port
-mid: mysql username
-mpw: mysql password
-mdb: mysql db
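
The flags above can also be assembled programmatically, which makes it harder to drop a leading dash or split a flag across two lines. A minimal Python sketch, assuming the jar path and flag values from the start command; the helper name build_sql4d_command and the SETTINGS dict are hypothetical, and -i is passed through unchanged because its meaning is not documented here.

```python
import shlex

# Flag values copied from the start command; adjust them for your own cluster.
SETTINGS = {
    "-bh": "10.13.4.45",     # broker node host
    "-bp": "8092",           # broker node port
    "-ch": "10.13.4.45",     # coordinator node host
    "-cp": "8091",           # coordinator node port
    "-oh": "10.13.4.45",     # overlord node host
    "-op": "8061",           # overlord node port
    "-mh": "10.210.136.64",  # mysql host
    "-mp": "3306",           # mysql port
    "-mid": "druid",         # mysql username
    "-mpw": "diurd",         # mysql password
    "-mdb": "druid",         # mysql db
    "-i": "50",              # taken from the start command as-is
}


def build_sql4d_command(jar_path, settings):
    """Return the argv list for launching the client (hypothetical helper)."""
    argv = ["java", "-jar", jar_path]
    for flag, value in settings.items():
        if not flag.startswith("-"):
            raise ValueError(f"flag {flag!r} must start with '-'")
        argv.extend([flag, str(value)])
    return argv


command = build_sql4d_command("Sql4DClient/target/Sql4DClient-4.1.0.jar", SETTINGS)
print(shlex.join(command))
```

The resulting list can be handed to subprocess.run(command) on a machine where the jar has been built.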
 
 
help:
1. select/crud statements   (GroupBy, TimeSeries, TopN, Select, Search, Insert). See wiki for examples: https://github.com/srikalyc/Sql4D/wiki/Sql4DCompiler
 2. generatebean=BeanName (This command must precede a SQL statement; it generates a java source file BeanName.java which extends DruidBaseBean.)
 3. trace=[true|false]    (When enabled prints out compiled JSON query)
 4. querymode=[sql|json]  (Default is sql, when mode is json it is fired directly)
 5. show tables           (Displays all the datasources)
 6. describe TableName    (Displays the given datasource's schema)
 7. quit                  (Exits client)
 
Query syntax:
Queries can be written in either SQL or JSON; SQL is the default.
 
sql:
Basic commands such as show tables are supported.
desc table -> describe TableName
 
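Note: columns in a Druid table fall into three types
1: Implicit_Dimension (usually the timestamp column)
2: Dimension (used in query conditions; can only be selected through GROUP BY)
3: Metric (a measure, usually numeric; can be queried directly)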
 
 
select Metric
SELECT LONG_SUM(count) as num FROM weibovolence where interval between '2015-09-17T14:01:00.000Z' AND '2015-09-17T14:15:05.832Z' LIMIT 100;
 
select groupBy and order by
SELECT uid, LONG_SUM(count) AS count FROM weibovolence WHERE interval BETWEEN '2015-09-17T14:01:00.000Z' AND '2015-09-17T14:15:05.832Z' BREAK BY 'all' GROUP BY uid order by count desc limit 10;
 
 
 
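BREAK BY sets the aggregation granularity; common values include day, hour, all and none. GROUP BY and ORDER BY are both supported as usual.
HINT('') sets the query type, which can be GroupBy, TimeSeries, TopN and so on.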
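
To make the role of these clauses more concrete, here is a hand-written Python sketch of a native Druid groupBy query that roughly corresponds to the GROUP BY example above. That Sql4D compiles to exactly this shape is an assumption; run the client with trace=true to see the JSON it really produces. The helper name build_group_by is hypothetical.

```python
import json


def build_group_by(data_source, dimension, metric, start, end,
                   granularity="all", limit=10):
    """Hand-built native Druid groupBy query (not Sql4D's actual output)."""
    return {
        "queryType": "groupBy",      # the query type that HINT would select
        "dataSource": data_source,   # the table in the FROM clause
        "granularity": granularity,  # the value given to BREAK BY
        "dimensions": [dimension],   # the GROUP BY column
        "aggregations": [
            {"type": "longSum", "name": metric, "fieldName": metric}
        ],
        "intervals": [f"{start}/{end}"],  # the interval BETWEEN ... AND ...
        "limitSpec": {                    # ORDER BY ... DESC LIMIT n
            "type": "default",
            "limit": limit,
            "columns": [{"dimension": metric, "direction": "descending"}],
        },
    }


query = build_group_by(
    "weibovolence", "uid", "count",
    "2015-09-17T14:01:00.000Z", "2015-09-17T14:15:05.832Z",
)
print(json.dumps(query, indent=2))
```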
 
select Timeseries
SELECT  LONG_SUM(count) AS count FROM weibovolence WHERE interval BETWEEN '2015-09-17T14:01:00.000Z' AND '2015-09-17T14:15:05.832Z' BREAK BY 'all' HINT('timeseries');
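
For comparison with querymode=json, the sketch below builds a native timeseries query by hand and shows one way to send it straight to the broker over HTTP. It assumes the broker host and port from the start command and Druid's standard /druid/v2/ query endpoint; build_timeseries and post_to_broker are hypothetical helper names, and the JSON is not guaranteed to match what Sql4D generates.

```python
import json
import urllib.request


def build_timeseries(data_source, metric, start, end, granularity="all"):
    """Hand-built native Druid timeseries query (not Sql4D's actual output)."""
    return {
        "queryType": "timeseries",
        "dataSource": data_source,
        "granularity": granularity,
        "aggregations": [
            {"type": "longSum", "name": metric, "fieldName": metric}
        ],
        "intervals": [f"{start}/{end}"],
    }


def post_to_broker(query, host, port, timeout=30):
    """POST a native query to the broker and return the decoded JSON reply."""
    request = urllib.request.Request(
        f"http://{host}:{port}/druid/v2/",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


query = build_timeseries(
    "weibovolence", "count",
    "2015-09-17T14:01:00.000Z", "2015-09-17T14:15:05.832Z",
)
print(json.dumps(query, indent=2))
# Needs a reachable broker, so it is left commented out:
# results = post_to_broker(query, "10.13.4.45", 8092)
```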
 
select Timeseries BREAK BY 'minute' and limit
SELECT LONG_SUM(count) AS count FROM weibovolence WHERE interval BETWEEN '2015-09-17T14:01:00.000Z' AND '2015-09-17T14:15:05.832Z' BREAK BY 'minute' HINT('timeseries') limit 10;
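
What a 'minute' granularity means can be pictured with a small in-memory Python sketch: every timestamp is truncated to the start of its minute and the counts falling into the same bucket are summed. This only illustrates the idea and is not how Druid executes the query; the sample rows are made up and both function names are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def minute_buckets(start, end):
    """List the minute-aligned bucket starts that overlap [start, end)."""
    bucket = start.replace(second=0, microsecond=0)
    buckets = []
    while bucket < end:
        buckets.append(bucket)
        bucket += timedelta(minutes=1)
    return buckets


def long_sum_by_minute(events):
    """Sum counts per minute bucket; events are (timestamp, count) pairs."""
    totals = Counter()
    for timestamp, count in events:
        totals[timestamp.replace(second=0, microsecond=0)] += count
    return dict(sorted(totals.items()))


# The interval used in the query above.
start = datetime(2015, 9, 17, 14, 1, 0, tzinfo=timezone.utc)
end = datetime(2015, 9, 17, 14, 15, 5, 832000, tzinfo=timezone.utc)
buckets = minute_buckets(start, end)

# Made-up sample rows, purely for illustration.
events = [
    (datetime(2015, 9, 17, 14, 1, 10, tzinfo=timezone.utc), 3),
    (datetime(2015, 9, 17, 14, 1, 50, tzinfo=timezone.utc), 2),
    (datetime(2015, 9, 17, 14, 2, 5, tzinfo=timezone.utc), 7),
]
totals = long_sum_by_minute(events)

print(len(buckets), "minute buckets overlap the interval")
for bucket, total in totals.items():
    print(bucket.isoformat(), total)
```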
 
 
 
Note: aggregation is at the core of Druid queries; essentially every query has to aggregate, using functions such as LONG_SUM and DOUBLE_SUM together with GROUP BY.
 
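Summary: Druid SQL is broadly similar to ordinary SQL, but it distinguishes column types differently. In Druid, columns are roughly divided into three types: Implicit_Dimension, Dimension and Metric.
The reason Dimension columns cannot be queried directly has to do with Druid's underlying storage. Implicit_Dimension and Metric columns are generally compressed directly with the LZ4 algorithm, while Dimension columns are stored as bitmaps, which is why Dimension columns support AND and OR operations efficiently.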
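
The bitmap idea can be shown with a toy Python model: each distinct dimension value gets one bitmap whose set bits are the row ids containing that value, so AND and OR filters become single bitwise operations. Plain Python integers stand in for the bitmaps here; Druid's real segment format is more involved and is not modeled. The column names and rows are made up, and both helper names are hypothetical.

```python
def build_bitmap_index(values):
    """Map each distinct value to an int whose set bits are its row ids."""
    index = {}
    for row_id, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << row_id)
    return index


def rows_of(bitmap):
    """Decode a bitmap back into the sorted list of row ids it contains."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Made-up rows: two dimension columns of a hypothetical datasource.
city = ["bj", "sh", "bj", "gz", "sh", "bj"]
device = ["ios", "ios", "android", "ios", "android", "ios"]

city_index = build_bitmap_index(city)
device_index = build_bitmap_index(device)

# city = 'bj' AND device = 'ios'
both = city_index["bj"] & device_index["ios"]
# city = 'gz' OR device = 'android'
either = city_index["gz"] | device_index["android"]

print(rows_of(both))
print(rows_of(either))
```

Each filter touches one bitmap per value and never scans the rows themselves, which is the property the summary above attributes to Dimension columns.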