HBase Compression Algorithms: Setting and Changing Them
Source: Internet  Editor: 程序博客网  Time: 2024/06/04 19:10
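Compression trades CPU time for I/O throughput and disk space. Unless there is a specific reason not to, setting compression on each Column Family is recommended. Three algorithms are mainly available: GZIP, LZO, and Snappy. The author recommends Snappy because it offers good encoding/decoding speed with an acceptable compression ratio.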
HBase comes with support for a number of compression algorithms that can be enabled at the column family level. Enabling compression is recommended unless you have a reason not to do so, for example, when storing already compressed content such as JPEG images. For every other use case, compression usually yields better overall performance, because the CPU overhead of compressing and decompressing is less than the cost of reading more data from disk.
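The point about already compressed content can be illustrated outside HBase. The following is a minimal sketch using Python's standard `zlib` module (the DEFLATE algorithm that GZIP is built on). The sample data, the `compression_ratio` helper, and the use of random bytes as a stand-in for JPEG data are assumptions made for illustration; they are not part of HBase.

```python
import os
import zlib


def compression_ratio(data: bytes, level: int = 6) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data, level)) / len(data)


# Repetitive, text-like rows: a stand-in for typical column values.
text_like = b"".join(
    b"row-%06d,status=OK,region=eu-west\n" % i for i in range(20000)
)

# Random bytes: a stand-in for already compressed content such as JPEG data.
already_compressed = os.urandom(len(text_like))

if __name__ == "__main__":
    print(f"text-like data ratio:       {compression_ratio(text_like):.2f}")
    print(f"already-compressed ratio:   {compression_ratio(already_compressed):.2f}")
```

The text-like data shrinks to a fraction of its size, while the random bytes do not shrink at all, so the CPU spent compressing them buys nothing.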
Available Codecs
You can choose from a fixed list of supported compression algorithms. They have different qualities when it comes to compression ratio, as well as CPU and installation requirements.
Table 11.1. Comparison between compression algorithms
Note that some of the algorithms have a better compression ratio, while others are faster at encoding and a lot faster at decoding. Depending on your use case, you can choose the one that suits you best.
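To get a feel for how codecs differ in ratio and speed, one can run a small measurement like the sketch below. It uses the codecs that ship with Python's standard library (`zlib`, `bz2`, `lzma`) purely as stand-ins: LZO and Snappy are not in the standard library, so this does not measure HBase's codecs, and the sample data and the `benchmark` helper are hypothetical.

```python
import bz2
import lzma
import time
import zlib

# Standard-library codecs used as stand-ins; only zlib (DEFLATE) is
# related to a codec named in the text (GZIP).
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}

# Hypothetical log-like sample data.
SAMPLE = b"".join(
    b"user-%05d|2024-06-04|click|/products/%03d\n" % (i, i % 97)
    for i in range(5000)
)


def benchmark(data: bytes) -> dict:
    """Measure compression ratio and encode/decode wall time per codec."""
    results = {}
    for name, (encode, decode) in CODECS.items():
        start = time.perf_counter()
        packed = encode(data)
        encode_s = time.perf_counter() - start

        start = time.perf_counter()
        restored = decode(packed)
        decode_s = time.perf_counter() - start

        if restored != data:
            raise RuntimeError(f"{name} round trip was not lossless")
        results[name] = {
            "ratio": len(packed) / len(data),
            "encode_s": encode_s,
            "decode_s": decode_s,
        }
    return results


if __name__ == "__main__":
    for name, r in benchmark(SAMPLE).items():
        print(f"{name:5s} ratio={r['ratio']:.3f} "
              f"encode={r['encode_s']:.4f}s decode={r['decode_s']:.4f}s")
```

The absolute numbers depend on the machine and the data, so they are only useful for comparing codecs against each other on a representative sample of your own data.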
Enabling Compression
Enabling compression requires installing the JNI and native compression libraries (unless you only want to use the Java-based GZIP compression), as described above, and specifying the chosen algorithm in the column family schema.
One way to accomplish this is during the creation of the table. The possible values are listed in the section called “Column Families”:
- hbase(main):001:0> create 'testtable', { NAME => 'colfam1', COMPRESSION => 'GZ' }
- 0 row(s) in 1.1920 seconds
- hbase(main):012:0> describe 'testtable'
- DESCRIPTION ENABLED
- {NAME => 'testtable', FAMILIES => [{NAME => 'colfam1', true
- BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS
- => '3', COMPRESSION => 'GZ', TTL => '2147483647', BLOCKSIZE
- => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}
- 1 row(s) in 0.0400 seconds
The describe shell command is used to read back the schema of the newly created table. You can see the compression is set to GZIP (using the shorter "GZ" value as required). Another option to enable, change, or disable the compression algorithm is to use the alter command on an existing table:
- hbase(main):013:0> create 'testtable2', 'colfam1'
- 0 row(s) in 1.1920 seconds
- hbase(main):014:0> disable 'testtable2'
- 0 row(s) in 2.0650 seconds
- hbase(main):016:0> alter 'testtable2', { NAME => 'colfam1', COMPRESSION => 'GZ' }
- 0 row(s) in 0.2190 seconds
- hbase(main):017:0> enable 'testtable2'
- 0 row(s) in 2.0410 seconds
Note how the table was first disabled. This is necessary to perform the alteration of the column family definition. The final enable command brings the table back online.
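When the same change has to be applied to several tables, the disable/alter/enable sequence above can be generated rather than typed by hand. The sketch below only builds the shell command strings; it does not talk to HBase. The helper name `alter_compression_commands` is hypothetical, and the set of accepted values is an assumption: only 'GZ' appears in the transcript above, while 'NONE', 'LZO', and 'SNAPPY' are assumed from the codecs the text names.

```python
from typing import List

# Assumed shell values for COMPRESSION; only 'GZ' is shown in the text.
SUPPORTED_CODECS = {"NONE", "GZ", "LZO", "SNAPPY"}


def alter_compression_commands(table: str, family: str, codec: str) -> List[str]:
    """Build the disable/alter/enable shell commands for one column family."""
    codec = codec.upper()
    if codec not in SUPPORTED_CODECS:
        raise ValueError(f"unsupported codec: {codec!r}")
    return [
        f"disable '{table}'",
        f"alter '{table}', {{ NAME => '{family}', COMPRESSION => '{codec}' }}",
        f"enable '{table}'",
    ]


if __name__ == "__main__":
    # Paste the printed lines into the HBase shell.
    for command in alter_compression_commands("testtable2", "colfam1", "GZ"):
        print(command)
```

Keep in mind that altering the schema changes how new store files are written; files already on disk keep their previous compression until a compaction rewrites them.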