Wednesday, September 15, 2010

Aster Data's newest offering provides row and column functionality for big data MPP analytics

Aster Data has taken big data management and analytics to the next level with the announcement today of its Aster Data nCluster 4.6, which includes a column data store and provides a universal SQL-MapReduce analytic framework on a hybrid row and column massively parallel processing (MPP) database management system (DBMS).

The San Carlos, Calif., company's new offering will allow users to choose the data format best suited to their needs and benefit from the power of Aster Data’s SQL-MapReduce analytic capabilities, as well as Aster Data’s suite of more than 1,000 MapReduce-ready analytic functions. [Disclosure: Aster Data is a sponsor of BriefingsDirect podcasts.]

Row stores traditionally have been optimized for look-up style queries, while column stores are traditionally optimized for scan-style queries. Providing both a row store and a column store within nCluster and delivering a unified SQL-MapReduce framework across both stores enables both query types.

Universal query framework

For example, a retailer using historical customer purchases to derive customer behavior indicators may store each customer purchase in a row store to ease retrieval of any individual customer order. This is a look-up style query. This same retailer can see a 5-15x performance improvement by using a column store to provide access to the data for a scan-style query, such as the number of purchases completed per brand or category of product. The Aster Data platform now supports both query types with natively optimized stores and a universal query framework.
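To make the retailer example concrete, here is a minimal Python sketch of the two layouts. It is only an illustration of the general row-versus-column idea, not Aster Data's implementation or API; the sample records, the field names, and the helper functions `lookup_order` and `purchases_per_brand` are all hypothetical.

```python
from collections import Counter

# Hypothetical purchase records; field names are illustrative only.
purchases = [
    {"order_id": 1, "customer": "ann", "brand": "Acme", "amount": 20.0},
    {"order_id": 2, "customer": "bob", "brand": "Zenith", "amount": 35.5},
    {"order_id": 3, "customer": "ann", "brand": "Acme", "amount": 12.0},
]

# Row layout: every field of a purchase is kept together, keyed by order.
row_store = {p["order_id"]: p for p in purchases}

# Column layout: one list per field, so a query can read a single field
# for all purchases without touching the others.
column_store = {field: [p[field] for p in purchases] for field in purchases[0]}


def lookup_order(order_id):
    """Look-up style query: fetch one complete order from the row layout."""
    return row_store[order_id]


def purchases_per_brand():
    """Scan-style query: count purchases per brand using only one column."""
    return Counter(column_store["brand"])


if __name__ == "__main__":
    print(lookup_order(2))
    print(purchases_per_brand())
```

The look-up reads one whole record, which suits the row layout, while the per-brand count touches only the `brand` column, which is the access pattern a column store is built to serve.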

Other features include:
  • Choice of storage format, selectable per table partition, which gives customers flexible performance optimization based on their analytical workloads.

  • Such services as dynamic workload management, fault tolerance, Online Precision Scaling on commodity hardware, compression, indexing, automatic partitioning, SQL-MapReduce, SQL constructs, and cross-storage queries, among others.

  • New statistical functions popular in decision analysis, operations research, and quality management, including decision trees and histograms.