Titan - Using HBase
Source: Internet. Editor: 程序博客网. Time: 2024/05/21 23:39
HBase is the Hadoop database. Think of it as a distributed, scalable, big data store. Use HBase when you need random, realtime read/write access to your Big Data. This project’s goal is the hosting of very large tables — billions of rows X millions of columns — atop clusters of commodity hardware. HBase is an open-source, distributed, versioned, column-oriented store modeled after Google’s Bigtable. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Hadoop and HDFS. — Apache HBase Homepage
HBase Setup
The following sections outline the various ways in which Titan can be used in concert with HBase.
Local Server Mode
HBase can be run as a standalone database on the same local host as Titan and the end-user application. In this model, Titan and HBase communicate with one another via a localhost socket. Running Titan over HBase requires the following setup steps:
- Download and extract a stable HBase from http://www.apache.org/dyn/closer.cgi/hbase/stable/.
- Start HBase by invoking the start-hbase.sh script in the bin directory inside the extracted HBase directory. To stop HBase, use stop-hbase.sh.
$ ./bin/start-hbase.sh
starting master, logging to ../logs/hbase-master-machine-name.local.out
Now, you can create an HBase TitanGraph using the following Gremlin snippet:
// Gremlin
conf = new BaseConfiguration();
conf.setProperty("storage.backend","hbase");
g = TitanFactory.open(conf);
or the following Java snippet:
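// Java
import org.apache.commons.configuration.BaseConfiguration;
import org.apache.commons.configuration.Configuration;
import com.thinkaurelius.titan.core.TitanFactory;
import com.thinkaurelius.titan.core.TitanGraph;

Configuration conf = new BaseConfiguration();
conf.setProperty("storage.backend","hbase");
TitanGraph g = TitanFactory.open(conf);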
Note that you do not need to specify a hostname, since a localhost connection is attempted by default.
Remote Server Mode
When the graph needs to scale beyond the confines of a single machine, HBase and Titan are logically separated onto different machines. In this model, the HBase cluster maintains the graph representation and any number of Titan instances maintain socket-based read/write access to the HBase cluster. The end-user application can directly interact with Titan within the same JVM as Titan.
For example, suppose we have a running HBase cluster with two machines at IP addresses 77.77.77.77 and 77.77.77.78. Connecting Titan to the cluster is then accomplished as follows:
Configuration conf = new BaseConfiguration();
conf.setProperty("storage.backend","hbase");
conf.setProperty("storage.hostname","77.77.77.77,77.77.77.78");
TitanGraph g = TitanFactory.open(conf);
storage.hostname accepts a comma-separated list of IP addresses and hostnames for any subset of machines in the HBase cluster that Titan should connect to. Also, in the Gremlin shell you cannot declare the types of the variables conf and g, so simply leave off the type declarations.
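When the list of HBase machines comes from code rather than a literal string, the storage.hostname value can be assembled programmatically. The following is a minimal sketch using only the Java standard library; the class HBaseHostnames and its helper methods are hypothetical names introduced for illustration and are not part of Titan.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

public class HBaseHostnames {
    // Hypothetical helper: joins IP addresses or hostnames into the
    // comma-separated form that storage.hostname accepts.
    public static String joinHosts(List<String> hosts) {
        return String.join(",", hosts);
    }

    // Collects the two settings used in the remote server example above.
    public static Properties remoteHBaseSettings(List<String> hosts) {
        Properties props = new Properties();
        props.setProperty("storage.backend", "hbase");
        props.setProperty("storage.hostname", joinHosts(hosts));
        return props;
    }

    public static void main(String[] args) {
        Properties props = remoteHBaseSettings(
                Arrays.asList("77.77.77.77", "77.77.77.78"));
        System.out.println(props.getProperty("storage.hostname"));
    }
}
```

Each entry of the resulting Properties object could then be copied into the Configuration passed to TitanFactory.open.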
Remote Server Mode with Rexster
Finally, Rexster can be wrapped around each Titan instance defined in the previous subsection. In this way, the end-user application need not be a Java-based application as it can communicate with Rexster over REST. This type of deployment is great for polyglot architectures where various components written in different languages need to reference and compute on the graph.
http://rexster.titan.machine1/mygraph/vertices/1
http://rexster.titan.machine2/mygraph/tp/gremlin?script=g.v(1).out('follows').out('created')
In this case, each Rexster server would be configured to connect to the HBase cluster. The following shows the graph specific fragment of the Rexster configuration. Refer to the Rexster configuration page for a complete example.
<graph>
  <graph-name>mygraph</graph-name>
  <graph-type>com.thinkaurelius.titan.tinkerpop.rexster.TitanGraphConfiguration</graph-type>
  <graph-location></graph-location>
  <graph-read-only>false</graph-read-only>
  <properties>
    <storage.backend>hbase</storage.backend>
    <storage.hostname>77.77.77.77,77.77.77.78</storage.hostname>
  </properties>
  <extensions>
    <allows>
      <allow>tp:gremlin</allow>
    </allows>
  </extensions>
</graph>
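The Gremlin request URL shown earlier in this section carries a script containing parentheses and quotes, which a client would normally encode before sending. Below is a minimal sketch of building such a URL with the Java standard library. The class RexsterGremlinUrl and the method gremlinUrl are hypothetical names; the host and graph name are the placeholders from the example URLs, and no request is actually sent.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class RexsterGremlinUrl {
    // Hypothetical helper: builds a URL for the tp/gremlin extension and
    // form-encodes the Gremlin script for use as a query parameter value.
    public static String gremlinUrl(String baseUrl, String graphName, String script) {
        try {
            String encoded = URLEncoder.encode(script, "UTF-8");
            return baseUrl + "/" + graphName + "/tp/gremlin?script=" + encoded;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(gremlinUrl(
                "http://rexster.titan.machine2",
                "mygraph",
                "g.v(1).out('follows').out('created')"));
    }
}
```

URLEncoder applies form encoding, which is sufficient for a query parameter value but should not be used on the path portion of the URL.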
HBase Specific Configuration
In addition to the general Titan Graph Configuration, there are the following HBase specific Titan configuration options:
Please refer to the HBase configuration documentation for more HBase configuration options and their descriptions. Any HBase configuration option that is prefixed with storage.hbase-config in the Titan configuration is passed on to HBase at initialization time. This allows arbitrary HBase configuration options to be set through Titan.
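As an illustration of the prefixing scheme, the sketch below turns a map of HBase options into Titan-style keys. It assumes that the prefix and the HBase option name are joined with a dot, which the text above does not state explicitly. The class HBaseConfigPassThrough and its method are hypothetical, and hbase.zookeeper.property.clientPort is used only as a sample HBase option.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

public class HBaseConfigPassThrough {
    // Prefix named in the text above.
    public static final String PREFIX = "storage.hbase-config";

    // Hypothetical helper: prefixes each HBase option so that Titan can
    // hand it to HBase. The "." separator is an assumption.
    public static Properties withHBaseOptions(Map<String, String> hbaseOptions) {
        Properties props = new Properties();
        props.setProperty("storage.backend", "hbase");
        for (Map.Entry<String, String> option : hbaseOptions.entrySet()) {
            props.setProperty(PREFIX + "." + option.getKey(), option.getValue());
        }
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> hbaseOptions = new LinkedHashMap<>();
        hbaseOptions.put("hbase.zookeeper.property.clientPort", "2181");
        Properties props = withHBaseOptions(hbaseOptions);
        for (String key : props.stringPropertyNames()) {
            System.out.println(key + "=" + props.getProperty(key));
        }
    }
}
```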
Global Graph Operations
Titan over HBase supports global vertex and edge iteration. Note, however, that all of these vertices and/or edges will be loaded into memory, which can cause an OutOfMemoryError. Use Faunus to iterate over all vertices or edges in large graphs.
Build Graph using rexster-console.sh
graphNames = rexster.graphNames.toArray()
g = rexster.getGraph("titan")
v1 = g.addVertex()
v1.setProperty("name","Suspect A")
v2 = g.addVertex()
v2.setProperty("name","Victim 1")
v3 = g.addVertex()
v3.setProperty("name","Suspect B")
v4 = g.addVertex()
v4.setProperty("name","Victim 2")
v5 = g.addVertex()
v5.setProperty("name","Gang Leader")
e1 = g.addEdge(v1,v2, 'Robs')
e2 = g.addEdge(v3,v4, 'Robs')
e3 = g.addEdge(v5,v1, 'Controls')
e4 = g.addEdge(v5,v3, 'Controls')
Ref: https://github.com/thinkaurelius/titan/wiki/Using-HBase
https://github.com/thinkaurelius/titan/wiki/Rexster-Graph-Server