HBase shell operations: scan + filter
Source: Internet | Editor: 程序博客网 | Time: 2024/05/16 07:11
Create the table
create 'test1', 'lf', 'sf'
-- lf: column family of LONG values (binary value)
-- sf: column family of STRING values
Load the data
put 'test1', 'user1|ts1', 'sf:c1', 'sku1'
put 'test1', 'user1|ts2', 'sf:c1', 'sku188'
put 'test1', 'user1|ts3', 'sf:s1', 'sku123'
put 'test1', 'user2|ts4', 'sf:c1', 'sku2'
put 'test1', 'user2|ts5', 'sf:c2', 'sku288'
put 'test1', 'user2|ts6', 'sf:s1', 'sku222'
The rowkey combines a user (userX) and the time of the action (tsX).
The column name records what the user did, and the value records which product (skuXXX) it was done to. For example, c1: click from homepage; c2: click from ad; s1: search from homepage; b1: buy.
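The rowkey and column conventions described above can be sketched in plain Ruby (the language the HBase shell is built on). This is only an in-memory model that runs without a cluster; `CELLS`, `ACTIONS`, `parse_rowkey` and `describe` are hypothetical names introduced for illustration, not part of HBase.

```ruby
# Sample cells from the put statements above: [rowkey, column, value].
CELLS = [
  ['user1|ts1', 'sf:c1', 'sku1'],
  ['user1|ts2', 'sf:c1', 'sku188'],
  ['user1|ts3', 'sf:s1', 'sku123'],
  ['user2|ts4', 'sf:c1', 'sku2'],
  ['user2|ts5', 'sf:c2', 'sku288'],
  ['user2|ts6', 'sf:s1', 'sku222']
].freeze

# Meaning of each column qualifier, as listed in the text.
ACTIONS = {
  'c1' => 'click from homepage',
  'c2' => 'click from ad',
  's1' => 'search from homepage',
  'b1' => 'buy'
}.freeze

# Hypothetical helper: split a rowkey of the form "user|timestamp".
def parse_rowkey(rowkey)
  user, ts = rowkey.split('|', 2)
  { user: user, ts: ts }
end

# Hypothetical helper: turn one cell into a readable sentence.
def describe(cell)
  rowkey, column, value = cell
  key = parse_rowkey(rowkey)
  qualifier = column.split(':', 2).last
  "#{key[:user]} at #{key[:ts]}: #{ACTIONS.fetch(qualifier, qualifier)} on #{value}"
end

CELLS.each { |cell| puts describe(cell) }
```

Because the user comes first in the rowkey, all actions of one user sort next to each other, which is what the prefix and range scans later in this post rely on.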
Query examples
1. Which cells have a value equal to sku188
scan 'test1', FILTER=>"ValueFilter(=,'binary:sku188')"
ROW COLUMN+CELL
 user1|ts2 column=sf:c1, timestamp=1409122354918, value=sku188
2. Which cells have a value containing 88
scan 'test1', FILTER=>"ValueFilter(=,'substring:88')"
ROW COLUMN+CELL
 user1|ts2 column=sf:c1, timestamp=1409122354918, value=sku188
 user2|ts5 column=sf:c2, timestamp=1409122355030, value=sku288
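The difference between the `binary:` and `substring:` comparators in cases 1 and 2 can be modelled in a few lines of plain Ruby. This is a sketch of the semantics as I understand them (whole-value equality versus a case-insensitive containment test), not HBase code; `value_equals` and `value_contains` are hypothetical helper names.

```ruby
# In-memory copy of the six sample cells: [rowkey, column, value].
CELLS = [
  ['user1|ts1', 'sf:c1', 'sku1'],
  ['user1|ts2', 'sf:c1', 'sku188'],
  ['user1|ts3', 'sf:s1', 'sku123'],
  ['user2|ts4', 'sf:c1', 'sku2'],
  ['user2|ts5', 'sf:c2', 'sku288'],
  ['user2|ts6', 'sf:s1', 'sku222']
].freeze

# Model of ValueFilter(=,'binary:...'): the whole value must match.
def value_equals(cells, operand)
  cells.select { |_row, _column, value| value == operand }
end

# Model of ValueFilter(=,'substring:...'): the operand may appear anywhere
# in the value; the comparison is assumed to ignore case.
def value_contains(cells, operand)
  cells.select { |_row, _column, value| value.downcase.include?(operand.downcase) }
end

p value_equals(CELLS, 'sku188').map(&:first)
p value_contains(CELLS, '88').map(&:first)
```

Note that `binary:sku1` would match only the cell whose value is exactly `sku1`, not `sku188` or `sku123`.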
3. Users who came in through an ad click (column c2) with a value containing 88
scan 'test1', FILTER=>"ColumnPrefixFilter('c2') AND ValueFilter(=,'substring:88')"
ROW COLUMN+CELL
 user2|ts5 column=sf:c2, timestamp=1409122355030, value=sku288
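Filter strings compose with AND, OR and parentheses, as in case 3 above and case 4 that follows. A minimal plain-Ruby sketch of that composition, treating each filter as a predicate on one cell; the helper names (`column_prefix`, `value_substring`, `all_of`, `any_of`) are hypothetical, and the model ignores everything HBase does beyond the boolean logic.

```ruby
# In-memory copy of the six sample cells: [rowkey, column, value].
CELLS = [
  ['user1|ts1', 'sf:c1', 'sku1'],
  ['user1|ts2', 'sf:c1', 'sku188'],
  ['user1|ts3', 'sf:s1', 'sku123'],
  ['user2|ts4', 'sf:c1', 'sku2'],
  ['user2|ts5', 'sf:c2', 'sku288'],
  ['user2|ts6', 'sf:s1', 'sku222']
].freeze

# Predicate: the column qualifier (the part after "family:") starts with prefix.
def column_prefix(prefix)
  ->(cell) { cell[1].split(':', 2).last.start_with?(prefix) }
end

# Predicate: the value contains the given substring (case ignored).
def value_substring(text)
  ->(cell) { cell[2].downcase.include?(text.downcase) }
end

# AND: every predicate must accept the cell.
def all_of(*filters)
  ->(cell) { filters.all? { |f| f.call(cell) } }
end

# OR: at least one predicate must accept the cell.
def any_of(*filters)
  ->(cell) { filters.any? { |f| f.call(cell) } }
end

# Case 3: ColumnPrefixFilter('c2') AND ValueFilter(=,'substring:88')
CASE3 = all_of(column_prefix('c2'), value_substring('88'))
# Case 4: ColumnPrefixFilter('s') AND (substring:123 OR substring:222)
CASE4 = all_of(column_prefix('s'),
               any_of(value_substring('123'), value_substring('222')))

p CELLS.select(&CASE3).map(&:first)
p CELLS.select(&CASE4).map(&:first)
```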
4. Users who came in through search (column prefix s) with a value containing 123 or 222
scan 'test1', FILTER=>"ColumnPrefixFilter('s') AND ( ValueFilter(=,'substring:123') OR ValueFilter(=,'substring:222') )"
ROW COLUMN+CELL
 user1|ts3 column=sf:s1, timestamp=1409122354954, value=sku123
 user2|ts6 column=sf:s1, timestamp=1409122355970, value=sku222
5. Rowkeys that start with user1
scan 'test1', FILTER => "PrefixFilter ('user1')"
ROW COLUMN+CELL
 user1|ts1 column=sf:c1, timestamp=1409122354868, value=sku1
 user1|ts2 column=sf:c1, timestamp=1409122354918, value=sku188
 user1|ts3 column=sf:s1, timestamp=1409122354954, value=sku123
6. FirstKeyOnlyFilter: a rowkey can hold several columns, and one column of a rowkey can hold several versions; this filter returns only the first version of the first column for each rowkey.
KeyOnlyFilter: returns only the key, not the value.
scan 'test1', FILTER=>"FirstKeyOnlyFilter() AND ValueFilter(=,'binary:sku188') AND KeyOnlyFilter()"
ROW COLUMN+CELL
 user1|ts2 column=sf:c1, timestamp=1409122354918, value=
7. Starting from user1|ts2, find all rowkeys that start with user1
scan 'test1', {STARTROW=>'user1|ts2', FILTER => "PrefixFilter ('user1')"}
ROW COLUMN+CELL
 user1|ts2 column=sf:c1, timestamp=1409122354918, value=sku188
 user1|ts3 column=sf:s1, timestamp=1409122354954, value=sku123
8. Starting from user1|ts2, find all rows up to (but not including) the rowkeys that start with user2
scan 'test1', {STARTROW=>'user1|ts2', STOPROW=>'user2'}
ROW COLUMN+CELL
 user1|ts2 column=sf:c1, timestamp=1409122354918, value=sku188
 user1|ts3 column=sf:s1, timestamp=1409122354954, value=sku123
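The range scans in cases 7 and 8 depend on rowkeys being sorted lexicographically, with STARTROW inclusive and STOPROW exclusive. A plain-Ruby sketch of that behaviour follows; `scan_range` is a hypothetical helper, and Ruby's string comparison stands in for HBase's byte ordering.

```ruby
# Rowkeys of the six sample rows.
ROWKEYS = %w[user1|ts1 user1|ts2 user1|ts3 user2|ts4 user2|ts5 user2|ts6].freeze

# Model of a scan range: keys are visited in sorted order,
# the start key is included and the stop key is excluded.
def scan_range(rowkeys, startrow: nil, stoprow: nil)
  rowkeys.sort.select do |key|
    (startrow.nil? || key >= startrow) && (stoprow.nil? || key < stoprow)
  end
end

# Case 8: STARTROW=>'user1|ts2', STOPROW=>'user2'
p scan_range(ROWKEYS, startrow: 'user1|ts2', stoprow: 'user2')

# Case 7: STARTROW=>'user1|ts2' combined with a prefix test on 'user1'
p scan_range(ROWKEYS, startrow: 'user1|ts2').select { |key| key.start_with?('user1') }
```

Both calls return the same two rowkeys here because every key that starts with `user1` sorts before the string `user2`.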
9. Find rowkeys that contain ts3
import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.SubstringComparator
import org.apache.hadoop.hbase.filter.RowFilter
scan 'test1', {FILTER => RowFilter.new(CompareFilter::CompareOp.valueOf('EQUAL'), SubstringComparator.new('ts3'))}
ROW COLUMN+CELL
 user1|ts3 column=sf:s1, timestamp=1409122354954, value=sku123
10. Find rowkeys that contain ts
import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.SubstringComparator
import org.apache.hadoop.hbase.filter.RowFilter
scan 'test1', {FILTER => RowFilter.new(CompareFilter::CompareOp.valueOf('EQUAL'), SubstringComparator.new('ts'))}
ROW COLUMN+CELL
 user1|ts1 column=sf:c1, timestamp=1409122354868, value=sku1
 user1|ts2 column=sf:c1, timestamp=1409122354918, value=sku188
 user1|ts3 column=sf:s1, timestamp=1409122354954, value=sku123
 user2|ts4 column=sf:c1, timestamp=1409122354998, value=sku2
 user2|ts5 column=sf:c2, timestamp=1409122355030, value=sku288
 user2|ts6 column=sf:s1, timestamp=1409122355970, value=sku222
Add one row of test data
put 'test1', 'user2|err', 'sf:s1', 'sku999'
11. Find rowkeys that start with user and match a regular expression. The test row just added does not match the regular expression, so it is not returned.
import org.apache.hadoop.hbase.filter.RegexStringComparator
import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.SubstringComparator
import org.apache.hadoop.hbase.filter.RowFilter
scan 'test1', {FILTER => RowFilter.new(CompareFilter::CompareOp.valueOf('EQUAL'),RegexStringComparator.new('^user\d+\|ts\d+$'))}
ROW COLUMN+CELL
 user1|ts1 column=sf:c1, timestamp=1409122354868, value=sku1
 user1|ts2 column=sf:c1, timestamp=1409122354918, value=sku188
 user1|ts3 column=sf:s1, timestamp=1409122354954, value=sku123
 user2|ts4 column=sf:c1, timestamp=1409122354998, value=sku2
 user2|ts5 column=sf:c2, timestamp=1409122355030, value=sku288
 user2|ts6 column=sf:s1, timestamp=1409122355970, value=sku222
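Why `user2|err` drops out of case 11 can be checked with the same pattern in plain Ruby. This sketch assumes the Java pattern `^user\d+\|ts\d+$` is meant to cover the whole rowkey, so it is written with Ruby's `\A` and `\z` anchors (in Ruby, `^` and `$` anchor lines rather than the whole string); `ROW_PATTERN` and `matching_rowkeys` are hypothetical names.

```ruby
# Rowkeys present in the table after the extra put of 'user2|err'.
ROWKEYS = %w[
  user1|ts1 user1|ts2 user1|ts3
  user2|err user2|ts4 user2|ts5 user2|ts6
].freeze

# "user" + digits, a literal "|", then "ts" + digits, and nothing else.
ROW_PATTERN = /\Auser\d+\|ts\d+\z/

# Keep only the rowkeys that match the pattern.
def matching_rowkeys(rowkeys, pattern)
  rowkeys.select { |key| key =~ pattern }
end

p matching_rowkeys(ROWKEYS, ROW_PATTERN)
p ROWKEYS - matching_rowkeys(ROWKEYS, ROW_PATTERN)
```

The second line prints the rejected keys: `err` is not `ts` followed by digits, so the row is filtered out even though it starts with `user2`.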
Add test data
put 'test1', 'user1|ts9', 'sf:b1', 'sku1'
12. Cells in columns starting with b1 whose value is sku1:
scan 'test1', FILTER=>"ColumnPrefixFilter('b1') AND ValueFilter(=,'binary:sku1')"
ROW COLUMN+CELL
 user1|ts9 column=sf:b1, timestamp=1409124908668, value=sku1
13. Using SingleColumnValueFilter: rows whose column sf:b1 has the value sku1
import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
import org.apache.hadoop.hbase.util.Bytes
scan 'test1', {COLUMNS => 'sf:b1', FILTER => SingleColumnValueFilter.new(Bytes.toBytes('sf'), Bytes.toBytes('b1'), CompareFilter::CompareOp.valueOf('EQUAL'), Bytes.toBytes('sku1'))}
ROW COLUMN+CELL
 user1|ts9 column=sf:b1, timestamp=1409124908668, value=sku1
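One detail of SingleColumnValueFilter is easy to miss: as far as I know, by default it lets through rows that do not have the tested column at all, unless `setFilterIfMissing(true)` is called on the filter. In case 13 the `COLUMNS => 'sf:b1'` restriction hides this, because rows without that column have nothing to return. The plain-Ruby model below illustrates the assumed behaviour on a subset of the sample rows; `single_column_value_rows` and its `filter_if_missing` keyword are hypothetical names, not the HBase API.

```ruby
# A few sample rows, keyed by rowkey, each mapping column => value.
ROWS = {
  'user1|ts1' => { 'sf:c1' => 'sku1' },
  'user1|ts9' => { 'sf:b1' => 'sku1' },
  'user2|ts4' => { 'sf:c1' => 'sku2' }
}.freeze

# Model of SingleColumnValueFilter with an EQUAL comparison:
# - rows that have the column are kept only if its value matches;
# - rows that lack the column are kept unless filter_if_missing is true.
def single_column_value_rows(rows, column, expected, filter_if_missing: false)
  kept = rows.select do |_rowkey, cells|
    if cells.key?(column)
      cells[column] == expected
    else
      !filter_if_missing
    end
  end
  kept.keys
end

p single_column_value_rows(ROWS, 'sf:b1', 'sku1')
p single_column_value_rows(ROWS, 'sf:b1', 'sku1', filter_if_missing: true)
```

If a scan without a COLUMNS restriction returns unexpected rows, this default is worth checking first.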
Using hbase zkcli
hbase zkcli
ls /
[hbase, zookeeper]
[zk: hadoop000:2181(CONNECTED) 1] ls /hbase
[meta-region-server, backup-masters, table, draining, region-in-transition, running, table-lock, master, namespace, hbaseid, online-snapshot, replication, splitWAL, recovering-regions, rs]
[zk: hadoop000:2181(CONNECTED) 2] ls /hbase/table
[member, test1, hbase:meta, hbase:namespace]
[zk: hadoop000:2181(CONNECTED) 3] ls /hbase/table/test1
[]
[zk: hadoop000:2181(CONNECTED) 4] get /hbase/table/test1
?master:60000}l$??lPBUF
cZxid = 0x107
ctime = Wed Aug 27 14:52:21 HKT 2014
mZxid = 0x10b
mtime = Wed Aug 27 14:52:22 HKT 2014
pZxid = 0x107
cversion = 0
dataVersion = 2
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 31
numChildren = 0