Better, Faster, Cheaper: Summer ’09 Data Warehousing Roundup


 

Netezza and Teradata news follows Sybase and IBM announcements. Vendors promise higher performance, lower cost and greater deployment flexibility.


By Doug Henschen
August 3, 2009

 

The better-faster-cheaper trend in data warehousing continues. Four leading vendors are introducing new software, new hardware and optimized integrations of the two realms that promise better performance, higher scalability and lower cost. The latest announcements are being made at this week's TDWI World Conference in San Diego. Netezza and IBM are arguably making the biggest news, while Teradata and Sybase focused on software upgrades supporting in-database processing.

The news from Netezza, which broke last week in the blogosphere ahead of tomorrow's formal announcement, is that it is moving off proprietary hardware onto industry-standard blade-server hardware. The move takes advantage of the low cost and steady performance improvements available on commodity hardware, yet the architecture continues to use the field programmable gate arrays (FPGA) that have been Netezza's performance differentiator.

"Rather than having a proprietary intelligent storage node, where the disk, the FPGA and the CPU are all on the same card, we'll use standard storage arrays connected to commodity blades amended with our FPGA accelerator cards," says Phil Francisco, Netezza's vice president of product management and product marketing.

Netezza says current customers will be able to use existing code and queries on the new platform. Getting off proprietary hardware is an important strategic step that is likely to free up research and development funds while also improving margins. "A proprietary, monolithic architecture will not survive in the long run in data warehousing because it's inherently more expensive," says Forrester Analyst Jim Kobielus. "If you can use cheap, off-the-shelf components wherever possible, you can build a cheaper appliance, and cheap is everything in a commodity market."

The Netezza TwinFin appliance, the first product to be released on this new blade-server-based platform, has been beta tested by several customers and is available immediately. The TwinFin scales from several hundred gigabytes to more than a petabyte, and it is said to deliver three- to five-times faster performance than Netezza's current platform. The vendor also emphasized that the new platform drops Netezza's prices under $20,000 per terabyte. Whether that "resets the bar on price-performance in the industry," as the vendor claims, is a matter of interpretation. Vendors including Greenplum and Kickfire are already below that threshold.

"Netezza has a lot of street credibility for delivering scalability and performance, but they've been at the middle of the market in terms of pricing," Kobielus says. "Now they can position themselves as a performance leader and as a price-performance leader. If what they are saying holds true, then they'll have a strong claim on both fronts."

Netezza also announced today that entry-level, high-capacity and memory-intensive appliances will be released on the new commodity-based platform. The entry-level product will target deployments ranging from tens to hundreds of gigabytes, but it will ultimately scale to more than a terabyte, Francisco says. The high-capacity model is aimed at archiving and record-retention applications where scalability trumps query speed -- a niche currently well served by both Greenplum and Teradata's Extreme Data Appliance. The memory-intensive model is designed for environments with complex queries, thousands to tens of thousands of users or both -- the high-end enterprise data warehouse (EDW) market commanded by Teradata and coveted by HP with its Neoview offering.

Netezza's exclusive storage and blade server hardware partner is IBM. The FPGA accelerator cards are attached on standard "side cars" available on IBM blades, but Francisco says the new architecture could be built on just about any industry-standard Intel-powered blade server. Netezza will continue to sell its current platform "for some time," Francisco says. But the long-term plan is to move entirely to the commodity-hardware-based platform.

Teradata Revs Its Database

Teradata's news this week is the general release of Teradata Database 13 and Teradata Tools and Utilities 13. The upgrades pack 93 new features and capabilities said to improve system performance by up to 30 percent. Highlights include a faster-running and more efficient query optimizer and a doubling of extract, load and transform (ELT) speeds while supporting simultaneous data analysis. But the key theme is in-database processing, with geospatial data and certain online analytical processing (OLAP) functions now supported while high-performance in-database data mining with partner SAS has been enhanced.

"The whole move to in-database processing is certainly a key theme that we've had for a long time," says Teradata development chief Scott Gnau. "The geospatial piece will let customers analyze not only what happened and who it happened to but also where it happened, and we can also do in-database predictive modeling of 'where will it happen, to whom and why?'"

Many databases support geospatial analysis, but Gnau says Teradata has integrated it into the parallel execution engine of the database. "Rather than just being a bolt-on solution -- which is how we implemented geospatial analysis previously and how it's generally implemented by competitors -- we've enabled those geospatial operators to execute alongside other operators in our massively parallel environment. That means performance, performance, performance."

Teradata has continued work with SAS to extend the depth and breadth of data mining functions that can be automatically handled within the database. Teradata has also increased the amount of memory available for such analyses for improved scalability, Gnau says.

Asked whether Netezza's less-than-$20,000-per-terabyte claim would draw attention from Teradata customers, Gnau countered, "our pricing and our product family are extremely competitive, and we provide a lot of choice... Customers don't have to be locked into a specific appliance and do a forklift [hardware] upgrade when they want to move to a different class of machine."

Gnau specifically touts Teradata's support for "coexistence," which is the capability to combine old and new hardware in a single system rather than replacing systems outright to scale up. That capability is unique to Teradata, says Gartner Analyst Don Feinberg, but it's not without compromise. "If I have a two-node Teradata system, I can add nodes on new hardware. The catch is that they will run as slow as the slowest node," he explains. "All nodes act as a single Teradata system, and you can't have parts of a query operating at different speeds."
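
Feinberg's point about the slowest node can be illustrated with a toy calculation. The sketch below is not Teradata's scheduler; it simply assumes that a query's work is split evenly across nodes and that the query finishes only when the last node does. The function name, the work units and the relative node speeds are all made up for illustration.

```python
def query_time(work_units, node_speeds):
    """Elapsed time for a query under a simplified model (an assumption,
    not Teradata's actual behavior): work is divided evenly across nodes,
    and the query completes only when the slowest node finishes its share."""
    share = work_units / len(node_speeds)
    return max(share / speed for speed in node_speeds)


# Hypothetical relative speeds: old nodes = 1.0, new nodes = 2.0
old_only = query_time(1200, [1.0, 1.0])              # two old nodes
mixed = query_time(1200, [1.0, 1.0, 2.0, 2.0])       # two old plus two new
as_if_all_old = query_time(1200, [1.0, 1.0, 1.0, 1.0])
all_new = query_time(1200, [2.0, 2.0, 2.0, 2.0])

print(old_only, mixed, as_if_all_old, all_new)
```

Under these assumptions, adding the two faster nodes still halves the elapsed time, but the mixed system performs no better than four old nodes would, which is the compromise Feinberg describes.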

Joining the In-Database Parade

Sybase, too, is championing in-database processing with the Sybase IQ 15.1 release, announced in mid-July (see "Sybase IQ Upgrade Adds Support for In-Database Analytics"). The release is the first column-oriented database to support in-database processing for data mining, predictive analysis, OLAP and other compute-intensive analytic functions. The release extended the product's built-in library of statistical and data mining analytic functions while also adding standard-SQL OLAP extensions for analysis of large data sets.

Sybase has also launched a partner program to enable third-party providers to plug their analytics into the Sybase IQ database. Only one vendor, Fuzzy Logix, has joined to date. Teradata has worked exclusively with analytics leader SAS, and the companies' two-year-old partnership yielded production-ready in-database offerings this spring. Netezza has a broader community of more than 100 in-database partners. Many are little-known developers of analytic applications and services, but SAS, SPSS, Catalina Marketing and other influential firms are also on the list.

Completing the Package

In contrast to Netezza's hardware-focused announcement and the Teradata and Sybase software announcements, IBM last week trumpeted the integration of hardware and software with optional business intelligence and analytic application modules. The company bills the IBM Smart Analytic System, to be released in September, as a single-vendor, single-order offering with all components pre-integrated and optimized to deliver maximum performance.

Some critics derided IBM's sketchy announcement as "markitecture" and a broader bundling of the IBM Balanced Warehouse, but Forrester's Kobielus says he's impressed: "IBM is alone in the market in taking things to this extent... It offers many more options [than competitors] for the packaging of its appliances for various size classes and functional profiles... And now they are adding verticalized and horizontalized solutions that are aligned with the delivery of consulting and professional services from IBM Global Business Services."

The short take on recent announcements is that IBM is stressing its breadth, Teradata and Sybase are accentuating their depth, and Netezza is lowering its cost while looking for broader deployments.

"What's really happening is that vendors are enhancing their software, and if they have hardware they are starting to take advantage of the faster chips, bigger disk drives and better interconnections," observes Gartner's Donald Feinberg. "If they're not doing so already, vendors are also moving to handle mixed workloads and generic data warehousing needs because they don't want their appliances to be pigeonholed as being just for data marts."
