Trafodion: Transactional SQL Support for HBase

Introduction
Trafodion is an open source initiative from HP, incubated at HP Labs 
and HP-IT, to develop an enterprise-class SQL-on-HBase solution 
targeted for big data transactional or operational workloads. HP has 
developed transactional SQL technologies with more than 20 years of 
investment into database technology and solutions. Trafodion brings 
this core technology to the Hadoop ecosystem. The name 'Trafodion' 
(the Welsh word for transactions, pronounced 'Tra-vod-eee-on') was 
chosen specifically to emphasize the differentiation that Trafodion 
provides in closing a critical gap in the Hadoop ecosystem. To find out 
more about the origin and the name of the project, please visit 
www.hp.com/go/trafodion.
Target workloads
Hadoop workloads span from long-running batch mode to low-latency 
operational workloads as shown in the figure below. The three 
categories on the right side are analytic workloads and are regarded as 
well-suited for Hadoop and therefore have garnered the most attention. 
In contrast, the leftmost workload defined as “Operational” is a new class 
of workloads that encompasses OLTP workloads as well as transactions 
that include social and mobile data interactions and observations using 
a mixture of structured and semi-structured data. 


Traditionally, these workloads have been handled by relational 
databases. However, relational databases have scalability issues and do not 
provide the schema flexibility required in certain cases. Hadoop addresses 
these limitations. Combined with Hadoop’s perceived benefits of 
significantly reduced costs, there is growing interest and pressure to 
embrace these workloads in the Hadoop ecosystem. 
As operational workloads represent business needs, they typically consist 
of a constant flow of transactions requiring low-latency response times 
for read/write access. Additionally, these workloads are characterized by: 
• Data integrity with ACID-compliant protection 
• High availability, concurrency and scalability 
• Multi-structured data 
• Rapidly evolving data requirements

Features
Currently, no open source SQL-on-HBase solution 
adequately meets these requirements. Trafodion provides the 
following functionality to support transactional workloads in 
Hadoop: 
• ACID-compliant distributed transaction protection 
over multiple SQL statements, tables and rows 
• Rich, full-functioned ANSI SQL language support using 
ODBC/JDBC connectivity interfaces 
• Performance improvements for transactional 
workloads by leveraging compile-time and run-time 
optimizations 
• Support for large data sets using parallel-aware 
query optimizer 
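To make the first bullet concrete, here is a minimal sketch of one transaction spanning several SQL statements and two tables. It uses Python's built-in sqlite3 module purely as a stand-in engine, since connecting to Trafodion would require its ODBC/JDBC driver; the accounts and audit_log tables and the transfer helper are hypothetical names invented for this illustration.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in SQL engine.
# isolation_level=None disables implicit transactions, so the code
# controls them explicitly with BEGIN / COMMIT / ROLLBACK.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("CREATE TABLE audit_log (account_id INTEGER, delta INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 50)")


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst and log both sides, all or nothing."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?",
            (amount, dst),
        )
        conn.execute("INSERT INTO audit_log VALUES (?, ?)", (dst, amount))
        # This debit violates the CHECK constraint if funds are insufficient.
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?",
            (amount, src),
        )
        conn.execute("INSERT INTO audit_log VALUES (?, ?)", (src, -amount))
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        # Undo every statement issued since BEGIN, across both tables.
        conn.execute("ROLLBACK")
        return False


def balances(conn):
    """Return {account_id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
```

When the debit fails, the earlier credit and its audit row are rolled back with it, which is the behavior the "multiple SQL statements, tables and rows" bullet describes. A single-node SQLite database says nothing about how the same guarantee is achieved across a distributed cluster.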
Trafodion intends to leverage the full capabilities of the Hadoop 
ecosystem: 
• Schema flexibility provided by HBase column family 
structures 
• Snapshot capability with versioning support in Hadoop 
• High Availability and Disaster Recovery support with 
replication and snapshotting capabilities 
Benefits
Trafodion delivers a full-featured and optimized 
transactional SQL-on-HBase DBMS solution with full 
transactional data protection. These capabilities help 
overcome Hadoop’s weaknesses in terms of supporting 
transactional workloads. 


With Trafodion, customers gain the following benefits: 
• Ability to leverage SQL expertise versus complex MapReduce 
programming 
• Seamless support for existing transactional applications 
• Ability to develop next generation highly scalable, real-time 
transaction processing applications 
• Reduction in data latency for downstream analytic workloads 
They also gain the following benefits inherent in the Hadoop 
ecosystem: 
• Reduced infrastructure costs
• Massive scalability and granular elasticity
• Improved data availability and disaster recovery protection 

Architecture
The Trafodion software architecture consists of three distinct layers: 
the client layer, the SQL database services layer, and the storage 
engine layer, as shown in the figure below. 


The first layer is the Client Services layer, where the application resides 
and accesses the Trafodion database via a standard ODBC/JDBC 
interface using a Trafodion-supplied Windows or Linux client driver. 
The second layer is the SQL layer, where Trafodion provides a relational 
schema abstraction on top of HBase, encapsulating all of the services 
required for managing Trafodion database objects. These services 
include connection management, transaction management, optimized 
plan generation, and execution against Trafodion database objects. 
Trafodion features a mature query optimizer that can generate parallel 
query plans, eliminating the need for complex MapReduce programming. 
The third layer is the Storage Engine layer, which consists of standard 
Hadoop services including HBase, HDFS, and ZooKeeper. Trafodion 
database objects are stored in native Hadoop (HBase/HDFS) database 
structures. Trafodion handles the mapping of SQL requests into native 
HBase calls transparently on behalf of the application. 
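As a rough illustration of what mapping a relational row onto HBase's data model can look like, the sketch below turns one row into a row key plus a set of "family:qualifier" cells. This document does not describe Trafodion's actual encoding, so everything here is an assumption: the row_to_cells helper, the "|" key separator, the single "cf" column family, and the string encoding of values are all hypothetical.

```python
def row_to_cells(row, key_columns, family="cf"):
    """Map one relational row (a dict) to an HBase-style (row_key, cells) pair.

    Hypothetical encoding for illustration only:
    - primary-key columns are joined with "|" to form the row key
    - every other column becomes one cell named b"family:column"
    - all values are stored as UTF-8 strings
    """
    row_key = "|".join(str(row[c]) for c in key_columns).encode("utf-8")
    cells = {
        f"{family}:{name}".encode("utf-8"): str(value).encode("utf-8")
        for name, value in row.items()
        if name not in key_columns
    }
    return row_key, cells


# Example: an INSERT of this row would become one put on row key b"42".
key, cells = row_to_cells(
    {"order_id": 42, "customer": "acme", "total": 19.5},
    key_columns=["order_id"],
)
```

The point is only that the application keeps thinking in tables and columns, while the layer beneath it deals in row keys and cells.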
Key innovations
Trafodion’s Distributed Transaction Management (DTM) component 
provides protection to transactions spanning multiple SQL statements, 
multiple tables, or multiple rows of a single table. Additionally, 
Trafodion DTM provides protection in a distributed cluster configuration 
across multiple HBase regions using an inherent two-phase commit 
protocol. DTM provides support for implicit (auto-commit) and explicit 
(BEGIN, COMMIT, ROLLBACK WORK) transaction control. 
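The general idea behind a two-phase commit can be sketched with a small in-memory simulation. This is a textbook-style toy and not Trafodion's DTM implementation: the Participant class, its can_commit flag, and the two_phase_commit function are hypothetical names, and it ignores durability, timeouts, and coordinator failure.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """Hypothetical resource (for example, one region) in a transaction."""
    name: str
    can_commit: bool = True          # simulated vote in phase 1
    committed: dict = field(default_factory=dict)
    staged: dict = field(default_factory=dict)

    def prepare(self, writes):
        # Phase 1: stage the writes and vote yes, or vote no.
        if not self.can_commit:
            return False
        self.staged = dict(writes)
        return True

    def commit(self):
        # Phase 2 on success: make the staged writes visible.
        self.committed.update(self.staged)
        self.staged = {}

    def rollback(self):
        # Phase 2 on failure: discard the staged writes.
        self.staged = {}


def two_phase_commit(work):
    """Run a toy two-phase commit over a list of (participant, writes) pairs."""
    # Phase 1: every participant is asked to prepare and vote.
    votes = [participant.prepare(writes) for participant, writes in work]

    # Phase 2: commit only if all voted yes; otherwise roll everyone back.
    if all(votes):
        for participant, _ in work:
            participant.commit()
        return "committed"
    for participant, _ in work:
        participant.rollback()
    return "rolled back"
```

A single "no" vote leaves every participant's committed data unchanged, which is what lets a transaction that touches several regions appear atomic to the application.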
Trafodion provides many compile-time and run-time optimizations for 
varying transactional workloads ranging from singleton row accesses 
for OLTP-like transactions to highly complex SQL statements used for 
operational reporting purposes. 

