Phoenix UDFs



As of Phoenix 4.4.0, users can create and deploy their own custom or domain-specific UDFs to the cluster.


Overview

Users can create temporary or permanent user-defined or domain-specific scalar functions. UDFs can be used the same way as built-in functions in queries such as SELECT, UPSERT, and DELETE, and when creating functional indexes.

Temporary functions are specific to a session/connection and are not accessible from other sessions/connections.

Metadata for permanent functions is stored in the system table SYSTEM.FUNCTION.

Tenant-specific functions are supported:

Functions created in a tenant-specific connection are not visible to other tenant-specific connections; only functions created in a global (no-tenant) connection are visible to all connections.


Phoenix leverages the HBase dynamic class loader to load the UDF jars from HDFS at the Phoenix client and region server without restarting the services.


Configuration

You will need to add the following parameters to hbase-site.xml on the Phoenix client:

<property>

  <name>phoenix.functions.allowUserDefinedFunctions</name>

  <value>true</value>

</property>

<property>

  <name>fs.hdfs.impl</name>

  <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>

</property>

<property>

  <name>hbase.rootdir</name>

  <value>${hbase.tmp.dir}/hbase</value>

  <description>The directory shared by region servers and into

    which HBase persists.  The URL should be 'fully-qualified'

    to include the filesystem scheme.  For example, to specify the

    HDFS directory '/hbase' where the HDFS instance's namenode is

    running at namenode.example.org on port 9000, set this value to:

    hdfs://namenode.example.org:9000/hbase.  By default, we write

    to whatever ${hbase.tmp.dir} is set too -- usually /tmp --

    so change this configuration or else all data will be lost on

    machine restart.</description>

</property>

<property>

  <name>hbase.dynamic.jars.dir</name>

  <value>${hbase.rootdir}/lib</value>

  <description>

    The directory from which the custom udf jars can be loaded

    dynamically by the phoenix client/region server without the need to restart. However,

    an already loaded udf class would not be un-loaded. See

    HBASE-1936 for more details.

  </description>

</property>

The last two configuration values must match the HBase server-side configuration.



As with other configuration properties, phoenix.functions.allowUserDefinedFunctions may be specified at JDBC connection time as a connection property.

Example:

Properties props = new Properties();

props.setProperty("phoenix.functions.allowUserDefinedFunctions", "true");

Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost", props);


The following optional parameter is used by the dynamic class loader to copy jars from HDFS into the local file system:

<property>

  <name>hbase.local.dir</name>

  <value>${hbase.tmp.dir}/local/</value>

  <description>Directory on the local filesystem to be used

    as a local storage.</description>

</property>



Creating Custom UDFs

To implement a custom UDF, you can follow these simple steps (for more detail on writing the function class itself, see this blog post):
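As a sketch, a scalar UDF is a class extending Phoenix's ScalarFunction, implementing getDataType() and evaluate(). The package, class name, and string-reversing logic below are illustrative assumptions, not from the official docs; compiling this requires the Phoenix client jars on the classpath:

```java
package com.example.udf; // hypothetical package

import java.sql.SQLException;
import java.util.List;

import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.phoenix.expression.Expression;
import org.apache.phoenix.expression.function.ScalarFunction;
import org.apache.phoenix.parse.FunctionParseNode.Argument;
import org.apache.phoenix.parse.FunctionParseNode.BuiltInFunction;
import org.apache.phoenix.schema.tuple.Tuple;
import org.apache.phoenix.schema.types.PDataType;
import org.apache.phoenix.schema.types.PVarchar;

// Hypothetical UDF that reverses a VARCHAR value.
@BuiltInFunction(name = MyReverseFunction.NAME,
        args = { @Argument(allowedTypes = { PVarchar.class }) })
public class MyReverseFunction extends ScalarFunction {
    public static final String NAME = "MY_REVERSE";

    public MyReverseFunction() {
    }

    public MyReverseFunction(List<Expression> children) throws SQLException {
        super(children);
    }

    @Override
    public String getName() {
        return NAME;
    }

    @Override
    public boolean evaluate(Tuple tuple, ImmutableBytesWritable ptr) {
        // Evaluate the argument expression; return false if its value is not yet available.
        Expression arg = children.get(0);
        if (!arg.evaluate(tuple, ptr)) {
            return false;
        }
        String source = (String) PVarchar.INSTANCE.toObject(ptr, arg.getSortOrder());
        if (source == null) {
            return true; // null in, null out (ptr already holds an empty value)
        }
        String reversed = new StringBuilder(source).reverse().toString();
        ptr.set(PVarchar.INSTANCE.toBytes(reversed));
        return true;
    }

    @Override
    public PDataType getDataType() {
        return PVarchar.INSTANCE;
    }
}
```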

After compiling your code to a jar, deploy the jar to HDFS, preferably into the HDFS folder configured for hbase.dynamic.jars.dir.
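For example, assuming hbase.dynamic.jars.dir points at an HDFS directory /hbase/lib and the jar is named my-udf.jar (both names are illustrative):

```shell
# Copy the compiled UDF jar into the HDFS directory configured as hbase.dynamic.jars.dir.
hadoop fs -mkdir -p /hbase/lib
hadoop fs -put my-udf.jar /hbase/lib/
```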

The final step is to run a CREATE FUNCTION query.
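A sketch of such a query, registering a hypothetical function MY_REVERSE backed by a class com.example.udf.MyReverseFunction (the function name, signature, class, and jar path are all illustrative assumptions):

```sql
CREATE FUNCTION MY_REVERSE(varchar) RETURNS varchar
AS 'com.example.udf.MyReverseFunction'
USING JAR 'hdfs://namenode.example.org:9000/hbase/lib/my-udf.jar';
```

The USING JAR clause can be omitted when the jar already resides in the directory configured as hbase.dynamic.jars.dir; use CREATE TEMPORARY FUNCTION for a session-scoped function.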


Dropping the UDFs

You can drop functions using the DROP FUNCTION query. DROP FUNCTION deletes the function's metadata from Phoenix.
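For example, dropping a hypothetical function named MY_REVERSE:

```sql
DROP FUNCTION IF EXISTS MY_REVERSE;
```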

