Phoenix Storage Formats
Source: Internet. Editor: 程序博客网. Time: 2024/05/17 23:52
Storage Formats
As part of Phoenix 4.10, we have reduced the on-disk storage size to improve overall performance by implementing the following enhancements:
- Introduce a layer of indirection between Phoenix column names and the corresponding HBase column qualifiers.
- Support a new encoding scheme for immutable tables that packs all values into a single cell per column family.
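To make the first enhancement concrete, here is a toy sketch of a name-to-qualifier indirection. It only illustrates the idea of replacing a long column name with a short fixed-width number. The class `ColumnMapper`, the counter starting at 0, and the big-endian layout are all assumptions made for illustration, not Phoenix's actual byte format.

```python
from itertools import count


class ColumnMapper:
    """Toy name-to-qualifier mapping; NOT Phoenix's real byte layout."""

    def __init__(self, encoded_bytes):
        self.encoded_bytes = encoded_bytes
        self._next = count(0)   # hypothetical starting value for the counter
        self._qualifiers = {}   # column name -> fixed-width qualifier bytes

    def qualifier_for(self, column_name):
        # Reuse the qualifier if this column has already been mapped.
        if column_name not in self._qualifiers:
            number = next(self._next)
            # to_bytes raises OverflowError once the number no longer fits,
            # which is where the per-scheme column limit comes from.
            self._qualifiers[column_name] = number.to_bytes(self.encoded_bytes, "big")
        return self._qualifiers[column_name]


mapper = ColumnMapper(encoded_bytes=2)
name = "CUSTOMER_SHIPPING_ADDRESS"
qualifier = mapper.qualifier_for(name)
print(len(name.encode("utf-8")), "bytes as a name vs", len(qualifier), "bytes encoded")
```

Because the qualifier is stored with every cell, a shorter qualifier is one plausible reason the on-disk size shrinks.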
How to use column mapping?
The column mapping property can be set only when the table is created. Before deciding to use column mapping, think about how many columns you expect the table and its view hierarchy to have over their lifecycle. The limits on the number of columns for each mapping scheme are:
| Config/Property Value | Max # of columns |
|---|---|
| 1 | 255 |
| 2 | 65535 |
| 3 | 16777215 |
| 4 | 2147483647 |
| NONE | no limit (theoretically) |
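As a rough aid for picking a value, the sketch below returns the smallest mapping scheme whose limit covers an expected column count. The limits are copied from the table above; the helper name `smallest_encoded_bytes` is hypothetical, and the count you pass in should include the columns you expect from views over the table's lifetime.

```python
# Limits copied from the table above; "NONE" has no practical limit.
MAX_COLUMNS = {1: 255, 2: 65535, 3: 16777215, 4: 2147483647}


def smallest_encoded_bytes(expected_columns):
    """Return the smallest COLUMN_ENCODED_BYTES value whose limit fits."""
    for width in sorted(MAX_COLUMNS):
        if expected_columns <= MAX_COLUMNS[width]:
            return width
    raise ValueError("more columns than any encoded scheme allows; consider NONE")


# The 1 to 3 byte limits equal 2**(8*n) - 1; the 4 byte limit equals 2**31 - 1.
assert all(MAX_COLUMNS[n] == 2 ** (8 * n) - 1 for n in (1, 2, 3))
assert MAX_COLUMNS[4] == 2 ** 31 - 1

print(smallest_encoded_bytes(300))
```

Since the property cannot be changed after the table is created, it is safer to leave headroom than to pick the tightest fit.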
For mutable tables, this limit applies to the columns across all column families. For immutable tables, the limit applies per column family. By default, any new Phoenix table uses the column mapping feature.
This default can be overridden by setting the config below to the desired value in hbase-site.xml:
| Table type | Default column mapping | Config |
|---|---|---|
| Mutable/Immutable | 2 byte qualifiers | phoenix.default.column.encoded.bytes.attrib |
Keep in mind that this config sets a global default that applies to all tables. To use a different mapping scheme for a particular table, set the COLUMN_ENCODED_BYTES table property:
CREATE TABLE T
(
    a_string varchar not null,
    col1 integer
    CONSTRAINT pk PRIMARY KEY (a_string)
)
COLUMN_ENCODED_BYTES = 1;
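The precedence described above (table property first, then the global config, then the built-in two-byte default) can be sketched as follows. This is a minimal model under stated assumptions: `effective_encoded_bytes` is a hypothetical helper, plain dicts stand in for the table properties and for hbase-site.xml, and the config is assumed to accept the same values as the table property.

```python
CONFIG_KEY = "phoenix.default.column.encoded.bytes.attrib"
BUILT_IN_DEFAULT = "2"                     # two-byte qualifiers, per the defaults table
VALID_VALUES = {"1", "2", "3", "4", "NONE"}


def effective_encoded_bytes(table_props, site_conf):
    """Resolve the mapping scheme for one table (illustrative only)."""
    # 1. A per-table COLUMN_ENCODED_BYTES property wins if present.
    value = table_props.get("COLUMN_ENCODED_BYTES")
    # 2. Otherwise fall back to the global config, then the built-in default.
    if value is None:
        value = site_conf.get(CONFIG_KEY, BUILT_IN_DEFAULT)
    value = str(value).upper()
    if value not in VALID_VALUES:
        raise ValueError(f"unsupported column mapping value: {value}")
    return value


site_conf = {CONFIG_KEY: "3"}
print(effective_encoded_bytes({"COLUMN_ENCODED_BYTES": 1}, site_conf))
print(effective_encoded_bytes({}, site_conf))
```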
How to use immutable data encoding?
SINGLE_CELL_ARRAY_WITH_OFFSETS is not always the best choice. In the following scenarios it is better to use the ONE_CELL_PER_COLUMN encoding instead:
- The data is sparse, i.e. fewer than 50% of the columns have values.
- The data within a column family gets too big. Our general guidance is that, with the default HBase block size of 64K, SINGLE_CELL_ARRAY_WITH_OFFSETS should not be used once the data within a column family grows beyond 50K.
- The immutable table is going to have views on it.
Outside of those scenarios, SINGLE_CELL_ARRAY_WITH_OFFSETS generally provides a good performance improvement and space savings.
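The guidance above can be written down as a small decision helper. This is only a sketch: the function name and parameters are hypothetical, "50K" is read as 50 * 1024 bytes, and how you measure the filled fraction or the column family size is left to you.

```python
SPARSE_THRESHOLD = 0.5            # fewer than 50% of the columns have values
FAMILY_SIZE_LIMIT = 50 * 1024     # "50K", assumed here to mean bytes


def choose_immutable_storage_scheme(filled_fraction, family_bytes, has_views):
    """Suggest a storage scheme for an immutable table (illustrative only)."""
    if filled_fraction < SPARSE_THRESHOLD:
        return "ONE_CELL_PER_COLUMN"      # sparse data
    if family_bytes > FAMILY_SIZE_LIMIT:
        return "ONE_CELL_PER_COLUMN"      # column family data too large
    if has_views:
        return "ONE_CELL_PER_COLUMN"      # views will be created on the table
    return "SINGLE_CELL_ARRAY_WITH_OFFSETS"


print(choose_immutable_storage_scheme(0.9, 8 * 1024, has_views=False))
```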
By default, immutable non-multitenant tables are created using the two-byte column mapping and the SINGLE_CELL_ARRAY_WITH_OFFSETS data encoding.
| Immutable table type | Immutable storage scheme | Config |
|---|---|---|
| Multi-tenant | ONE_CELL_PER_COLUMN | phoenix.default.multitenant.immutable.storage.scheme |
| Non multi-tenant | SINGLE_CELL_ARRAY_WITH_OFFSETS | phoenix.default.immutable.storage.scheme |
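The defaults in this table can be modelled as a lookup keyed on whether the table is multi-tenant. The sketch assumes these configs are read from hbase-site.xml in the same way as the column mapping default; the helper name is hypothetical and a plain dict stands in for the site configuration.

```python
# (config key, built-in default) per immutable table type, from the table above.
IMMUTABLE_DEFAULTS = {
    True: ("phoenix.default.multitenant.immutable.storage.scheme",
           "ONE_CELL_PER_COLUMN"),
    False: ("phoenix.default.immutable.storage.scheme",
            "SINGLE_CELL_ARRAY_WITH_OFFSETS"),
}


def default_immutable_scheme(multi_tenant, site_conf):
    """Return the default storage scheme for a new immutable table."""
    key, built_in = IMMUTABLE_DEFAULTS[bool(multi_tenant)]
    # A value set in the site configuration replaces the built-in default.
    return site_conf.get(key, built_in)


print(default_immutable_scheme(True, {}))
print(default_immutable_scheme(False, {}))
```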