Setup phoenix index
Source: Internet | Editor: 程序博客网 | Date: 2024/06/18 11:08
Phoenix Index Setup
Non-transactional, mutable indexing requires special configuration options on the region server and master to run.
Add the following parameters to hbase-site.xml on each region server:
<property>
<name>hbase.regionserver.wal.codec</name>
<value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
</property>
The above property enables custom WAL edits to be written, ensuring proper writing/replay of the index updates. This codec supports the usual host of WALEdit options, most notably WALEdit compression.
<property>
<name>hbase.region.server.rpc.scheduler.factory.class</name>
<value>org.apache.hadoop.hbase.ipc.PhoenixRpcSchedulerFactory</value>
<description>Factory to create the Phoenix RPC Scheduler that uses separate queues for index and metadata updates</description>
</property>
<property>
<name>hbase.rpc.controllerfactory.class</name>
<value>org.apache.hadoop.hbase.ipc.controller.ServerRpcControllerFactory</value>
<description>Factory to create the Phoenix RPC Scheduler that uses separate queues for index and metadata updates</description>
</property>
The above properties prevent deadlocks from occurring during index maintenance for global indexes (HBase 0.98.4+ and Phoenix 4.3.1+ only) by ensuring index updates are processed with a higher priority than data updates. They also prevent deadlocks by ensuring metadata RPC calls are processed with a higher priority than data RPC calls.
From Phoenix 4.8.0 onward, no configuration changes are required to use local indexing. In Phoenix 4.7 and below, the following configuration changes are required in the server-side hbase-site.xml on the master and region server nodes:
<property>
<name>hbase.master.loadbalancer.class</name>
<value>org.apache.phoenix.hbase.index.balancer.IndexLoadBalancer</value>
</property>
<property>
<name>hbase.coprocessor.master.classes</name>
<value>org.apache.phoenix.hbase.index.master.IndexMasterObserver</value>
</property>
<property>
<name>hbase.coprocessor.regionserver.classes</name>
<value>org.apache.hadoop.hbase.regionserver.LocalIndexMerger</value>
</property>
Note: this configuration is not needed from 4.8.0 onward; it is required for 4.7.0 and earlier.
Upgrading Local Indexes created before 4.8.0
When upgrading Phoenix to 4.8.0+ on the server, remove the three local-indexing configurations above from hbase-site.xml if present. On the client side, both online upgrade (performed while initializing a connection from a 4.8.0+ Phoenix client) and offline upgrade (using the psql tool) are supported for local indexes created before 4.8.0. As part of the upgrade, the local indexes are recreated in ASYNC mode. After the upgrade, users need to build the indexes using IndexTool.
The following client-side configuration is used in the upgrade.
- phoenix.client.localIndexUpgrade
- A value of true means online upgrade; false means offline upgrade.
- Default: true
Command to run the offline upgrade using the psql tool:
$ psql [zookeeper] -l
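As a hedged illustration, selecting the offline path could look like the fragment below. The property name and its true/false meaning come from the list above; placing it in the client-side hbase-site.xml is an assumption based on where Phoenix client properties are usually set.

```xml
<!-- Client-side hbase-site.xml (assumed location for Phoenix client settings) -->
<property>
  <name>phoenix.client.localIndexUpgrade</name>
  <!-- false = offline upgrade: run it explicitly with the psql tool -->
  <value>false</value>
</property>
```

With the default of true, no fragment is needed and the upgrade runs when a 4.8.0+ client initializes its connection.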
Tuning
Out of the box, indexing is pretty fast. However, to optimize for your particular environment and workload, there are several properties you can tune.
All the following parameters must be set in hbase-site.xml - they are true for the entire cluster and all index tables, as well as across all regions on the same server (so, for instance, a single server would not write to too many different index tables at once).
- index.builder.threads.max
- Number of threads used to build the index update from the primary table update.
- Increasing this value overcomes the bottleneck of reading the current row state from the underlying HRegion. Setting it too high just moves the bottleneck to the HRegion, which cannot handle too many concurrent scan requests, and adds general thread-swapping overhead.
- Default: 10
- index.builder.threads.keepalivetime
- Amount of time in seconds after which we expire threads in the builder thread pool.
- Unused threads are released immediately after this amount of time and no core threads are retained (a small concern, as tables are expected to sustain a fairly constant write load). This also allows us to drop threads if we are not seeing the expected load.
- Default: 60
- index.writer.threads.max
- Number of threads to use when writing to the target index tables.
- This is the first level of parallelization, on a per-table basis; it should roughly correspond to the number of index tables.
- Default: 10
- index.writer.threads.keepalivetime
- Amount of time in seconds after which we expire threads in the writer thread pool.
- Unused threads are released immediately after this amount of time and no core threads are retained (a small concern, as tables are expected to sustain a fairly constant write load). This also allows us to drop threads if we are not seeing the expected load.
- Default: 60
- hbase.htable.threads.max
- Number of threads each index HTable can use for writes.
- Increasing this allows more concurrent index updates (for instance across batches), leading to higher overall throughput.
- Default: 2,147,483,647
- hbase.htable.threads.keepalivetime
- Amount of time in seconds after which we expire threads in the HTable's thread pool.
- Using the “direct handoff” approach, new threads will only be created if it is necessary and will grow unbounded. This could be bad but HTables only create as many Runnables as there are region servers; therefore, it also scales when new region servers are added.
- Default: 60
- index.tablefactory.cache.size
- Number of index HTables we should keep in cache.
- Increasing this number ensures that we do not need to recreate an HTable for each attempt to write to an index table. Conversely, you could see memory pressure if this value is set too high.
- Default: 10
- org.apache.phoenix.regionserver.index.priority.min
- Value specifying the bottom (inclusive) of the range in which index priority may lie.
- Default: 1000
- org.apache.phoenix.regionserver.index.priority.max
- Value specifying the top (exclusive) of the range in which index priority may lie.
- Higher priorities within the index min/max range do not mean updates are processed sooner.
- Default: 1050
- org.apache.phoenix.regionserver.index.handler.count
- Number of threads to use when serving index write requests for global index maintenance.
- The actual number of threads, however, is dictated by Max(number of call queues, handler count), where the number of call queues is determined by standard HBase configuration. To further tune the queues, you can adjust the standard RPC queue length parameters (currently, there are no special knobs for the index queues), specifically ipc.server.max.callqueue.length and ipc.server.callqueue.handler.factor. See the HBase Reference Guide for more details.
- Default: 30
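A minimal sketch of how a few of these settings might appear in hbase-site.xml on a region server. The property names and values are taken from the list above; the values shown are simply the listed defaults, not recommendations, so treat them as a starting point to adjust against observed load.

```xml
<!-- Server-side hbase-site.xml; values below are the defaults listed above -->
<property>
  <name>index.builder.threads.max</name>
  <!-- threads building index updates from primary table updates -->
  <value>10</value>
</property>
<property>
  <name>index.writer.threads.max</name>
  <!-- roughly one per index table -->
  <value>10</value>
</property>
<property>
  <name>index.tablefactory.cache.size</name>
  <!-- cached index HTables; larger values cost memory -->
  <value>10</value>
</property>
<property>
  <name>org.apache.phoenix.regionserver.index.handler.count</name>
  <!-- effective thread count is Max(number of call queues, this value) -->
  <value>30</value>
</property>
```

Because these settings apply to the whole cluster and to all index tables, it is probably safest to change one property at a time and compare write throughput before and after.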
Index Scrutiny Tool
Limitations
- If rows are actively being updated or deleted while the scrutiny is running, the tool may give you false positives for inconsistencies (PHOENIX-4277).
- Snapshot reads are not supported by the scrutiny tool (PHOENIX-4270).