HBase table creation: attributes, avoiding hotspots, region split

Source: Internet  Editor: 程序博客网  Date: 2024/06/06 13:59

1. View the help for table creation

help 'create'

Creates a table. Pass a table name, and a set of column family
specifications (at least one), and, optionally, table configuration.
Column specification can be a simple string (name), or a dictionary
(dictionaries are described below in main help output), necessarily
including NAME attribute.
Examples:

Create a table with namespace=ns1 and table qualifier=t1
  hbase> create 'ns1:t1', {NAME => 'f1', VERSIONS => 5}

Create a table with namespace=default and table qualifier=t1
  hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}
  hbase> # The above in shorthand would be the following:
  hbase> create 't1', 'f1', 'f2', 'f3'
  hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}
  hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}

Table configuration options can be put at the end.
Examples:

  hbase> create 'ns1:t1', 'f1', SPLITS => ['10', '20', '30', '40']
  hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40']
  hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe'
  hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' }
  hbase> # Optionally pre-split the table into NUMREGIONS, using
  hbase> # SPLITALGO ("HexStringSplit", "UniformSplit" or classname)
  hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}
  hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', REGION_REPLICATION => 2, CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}}

You can also keep around a reference to the created table:

  hbase> t1 = create 't1', 'f1'

Which gives you a reference to the table named 't1', on which you can then
call methods.
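
The SPLITS option in the help text takes explicit split keys. As a rough sketch of how such keys partition the row-key space, the plain Python below (not HBase code; the name region_index is made up for illustration) treats each split key as the inclusive start of the next region and compares row keys as byte strings:

```python
import bisect

# Split keys from the help example: SPLITS => ['10', '20', '30', '40'].
# Four split keys yield five regions:
#   (start, '10'), ['10', '20'), ['20', '30'), ['30', '40'), ['40', end)
SPLITS = [b"10", b"20", b"30", b"40"]


def region_index(row_key, splits=SPLITS):
    """Return the 0-based index of the region whose key range holds row_key.

    Hypothetical helper for illustration only. Keys are compared
    lexicographically as bytes, and a split key belongs to the region
    that starts at it.
    """
    return bisect.bisect_right(splits, row_key)


if __name__ == "__main__":
    for key in [b"05", b"10", b"15", b"40", b"5"]:
        print(key, "-> region", region_index(key))
```

Note that the key "5" lands in the last region, because keys sort lexicographically rather than numerically. Row keys that all share a common prefix therefore end up in the same region no matter how many split keys are defined.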

2. Avoiding hotspots: pre-split the table manually

create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}
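
To make the effect of NUMREGIONS with a hex-based split algorithm more concrete, here is a minimal Python sketch that divides an 8-digit hexadecimal key space into equal parts. It is a simplified approximation of the idea, not the HBase implementation; the function name hex_split_points and the 8-digit width are assumptions made for this example.

```python
def hex_split_points(num_regions, width=8):
    """Return num_regions - 1 evenly spaced hex split keys.

    Hypothetical helper: approximates how a hex key space of `width`
    digits could be cut into `num_regions` equal ranges. The actual
    HBase algorithm may differ in detail.
    """
    if num_regions < 2:
        return []  # a single region needs no split keys
    key_space = 16 ** width            # number of distinct hex keys
    step = key_space // num_regions    # size of each region's range
    return [format(step * i, "0{}x".format(width))
            for i in range(1, num_regions)]


if __name__ == "__main__":
    for split_key in hex_split_points(15):
        print(split_key)
```

For 15 regions this produces 14 split keys. Such a layout only spreads the write load if the row keys themselves are hex strings distributed roughly uniformly, for example keys that start with a hash of the natural key; sequential keys would still hit one region at a time.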

3. View the table description:

describe 'terminal_data_file_jn'

'terminal_data_file_jn', {NAME => 'cf', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}    true
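  • IN_MEMORY: gives the column family's blocks higher priority in the block cache, which can improve read performance
  • TTL: Time To Live, i.e. how long a cell is kept. FOREVER means cells are never deleted automatically; a finite value can be set instead, for example to expire data after three months
  • COMPRESSION: whether the data is compressed. The default is no compression; supported values include 'gz', 'lzo', 'snappy', or 'none'
  • MIN_VERSIONS: the minimum number of versions to keep, used together with TTL
  • BLOCKCACHE: whether blocks of this column family are cached; part of HBase's caching strategy
  • BLOCKSIZE: the block size in bytes (65536 here)
  • REPLICATION_SCOPE: replication; '0' means the column family is not replicated to other clusters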
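
The shell takes TTL in seconds (as in the help example TTL => 2592000) and BLOCKSIZE in bytes. The short Python sketch below works through the arithmetic; the helper names are invented for this example, and "three months" is approximated as 90 days, which is an assumption rather than anything HBase defines.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def ttl_seconds(days):
    """Convert a retention period in days to a TTL value in seconds."""
    return days * SECONDS_PER_DAY


def bytes_to_kib(size_in_bytes):
    """Convert a size in bytes to whole KiB."""
    return size_in_bytes // 1024


if __name__ == "__main__":
    print("30 days  ->", ttl_seconds(30), "seconds")
    print("90 days  ->", ttl_seconds(90), "seconds")  # roughly three months
    print("65536 B  ->", bytes_to_kib(65536), "KiB")
```

So the TTL of 2592000 in the help example corresponds to 30 days, and a three-month retention would be set with a value around 7776000.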

