Wednesday, August 27, 2008

Databases leverage MapReduce technology to radically juice data scale, performance, analytics

In what could best be termed a photo finish, Greenplum and Aster Data Systems have both announced that they have integrated MapReduce into their massively parallel processing (MPP) database engines.

MapReduce, pioneered by Google for analyzing the Web, now becomes available to enterprises and service providers, giving them more access and visibility into more data from more origins. Originally created to analyze massive amounts of unstructured data, the approach has been updated to analyze structured data as well.
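For readers new to the model, the canonical illustration is word counting: a map phase emits intermediate key-value pairs, a shuffle groups them by key, and a reduce phase aggregates each group. Here is a minimal single-process sketch in Python; real MapReduce engines run the same three phases in parallel across many machines, which this toy version does not attempt.

```python
from collections import defaultdict

# Map phase: each document emits (word, 1) pairs.
def map_phase(documents):
    for doc in documents:
        for word in doc.lower().split():
            yield (word, 1)

# Shuffle phase: group intermediate pairs by key.
def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

# Reduce phase: sum the counts for each word.
def reduce_phase(groups):
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["the web is big", "the web grows"]
counts = reduce_phase(shuffle(map_phase(docs)))
print(counts["the"])  # 2
```

Because the map and reduce functions are pure and the shuffle is key-based, the framework is free to partition the work however it likes -- which is what makes the model attractive for MPP databases.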

Greenplum, San Mateo, Calif., says that MapReduce will be part of its Greenplum Database beginning in September. Aster Data, Redwood Shores, Calif., says that MapReduce will be included in its Aster nCluster. [Disclosure: Greenplum is a sponsor of BriefingsDirect podcasts.]

Curt Monash, president of Monash Research, editor of DBMS2, and a leading authority on MapReduce, sees this as a major leap forward. He reports that both companies had completed adding MapReduce to their existing products and had been racing to the finish line to get their news out first. As it turned out, both made their announcements within hours of each other.

Curt lists some points on his blog about what this new technology marriage means.
  • Google’s internal use of MapReduce is impressive. So is Hadoop’s success. Now commercial implementations of MapReduce are getting their shots too.

  • The hardest part of data analysis is often the recognition of entities or semantic equivalences. The rest is arithmetic, Boolean logic, sorting, and so forth. MapReduce is already proven in use cases encompassing all of those areas.

  • MapReduce isn’t needed for tabular data management. That’s been efficiently parallelized in other ways. But, if you want to build non-tabular structures such as text indexes or graphs, MapReduce turns out to be a big help.

  • In principle, any alphanumeric data at all can be stuffed into tables. But in high-dimensional scenarios, those tables are super-sparse. That’s when MapReduce can offer big advantages by bypassing relational databases. Examples of such scenarios are found in CRM and relationship analytics.
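Curt's point about non-tabular structures can be made concrete with a small sketch: building an inverted text index, where the map phase emits (word, document-id) pairs and the reduce phase collects the posting list for each word. This is an illustrative toy, not Greenplum's or Aster's actual implementation.

```python
from collections import defaultdict

# Map phase: for each (doc_id, text), emit a (word, doc_id)
# pair for every distinct word in the document.
def map_phase(corpus):
    for doc_id, text in corpus.items():
        for word in set(text.lower().split()):
            yield (word, doc_id)

# Reduce phase: collect the sorted list of documents per word,
# producing the posting lists of an inverted index.
def reduce_phase(pairs):
    postings = defaultdict(set)
    for word, doc_id in pairs:
        postings[word].add(doc_id)
    return {word: sorted(ids) for word, ids in postings.items()}

corpus = {1: "graphs and indexes", 2: "text indexes help"}
index = reduce_phase(map_phase(corpus))
print(index["indexes"])  # [1, 2]
```

The output is a mapping from words to document lists -- a structure that fits awkwardly into a relational table but falls out naturally from a map/reduce pipeline.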
Greenplum customers have been involved in an early-access program using Greenplum MapReduce for advanced analytics. For example, LinkedIn is using Greenplum Database for new, innovative social networking features such as “People You May Know” and sees it as a way to develop compelling analytics products faster. A primary benefit of the new capability is that customers can combine SQL queries and MapReduce programs into unified tasks that are executed in parallel across hundreds or thousands of cores.
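The announcements don't spell out the programming interface, but the shape of the combination -- SQL producing rows that feed a MapReduce step -- can be approximated in plain Python with sqlite3. The table, column names, and degree-count analysis below are hypothetical stand-ins; in Greenplum's product both stages would run in parallel inside the database rather than in a client script like this.

```python
import sqlite3
from collections import defaultdict

# Stand-in for the SQL side: pull structured rows from a relational table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE connections (member TEXT, contact TEXT)")
conn.executemany("INSERT INTO connections VALUES (?, ?)",
                 [("ann", "bob"), ("ann", "cy"), ("bob", "cy")])
rows = conn.execute("SELECT member, contact FROM connections").fetchall()

# Map phase: emit a count of 1 for each endpoint of a connection.
def map_phase(rows):
    for member, contact in rows:
        yield (member, 1)
        yield (contact, 1)

# Reduce phase: sum per person, giving each member's connection
# degree -- a building block for "People You May Know"-style
# relationship analytics.
def reduce_phase(pairs):
    degrees = defaultdict(int)
    for person, n in pairs:
        degrees[person] += n
    return dict(degrees)

degrees = reduce_phase(map_phase(rows))
print(degrees["cy"])  # 2
```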

Part of the appeal of business intelligence, and of its huge ramp-up over the past five years, is that IT assets play an ever larger role in providing unprecedented strategic guidance and insights to leaders of enterprises, governments, telcos, and cloud providers. IT has gone from automating business functions to serving as an essential crystal ball of the highest order. By gaining access to larger data sets -- which, more than ever before, can be mined and analyzed for higher levels of process and business refinement -- IT has earned a seat at the board table.

With better data reach and inclusion come better results. BI lets leaders spot early the trends that will determine their future success or failure. In a fast-paced, global, hypercompetitive business landscape, these insights are the currency of future success. The better you do BI, the better you do business -- current, near-term, and long-term. There's no better way to know your customers, competitors, employees, and the variables that buffet and stir markets than effective BI.

Now, by expanding the role and reach of MapReduce technologies and methods, a powerful new tool is added to the BI arsenal. More data, more data types, more data sources -- all rolled into an analytical framework that can be directly targeted by developers, scripters, business analysts, executives, and investors.

These new MapReduce announcements mark a significant advancement that moves IT another notch higher in its utility and indispensable value to business. And it comes at a time when more data, metadata, complex events, transactions, and Internet-scale inferences demand tools that can do for enterprise BI what Google has done for Web search and indexing.

Comprehensive, deep analytics on massive data sets offers a new mantra: The database is dead, long live the data. Structured data and the containers that hold it are simply no longer enough to organize and access the intelligence lurking on modern networks, at Internet scale and Internet time.

Tuesday, August 26, 2008

Citrix makes virtualization splash with new version of XenApp to speed desktop applications delivery

Citrix Systems has overhauled its flagship presentation server product, promising IT operators higher performance and lower costs, while improving the end-user experience. The company this week announced Citrix XenApp 5, the next generation of its application virtualization solution.

The new version of XenApp, formerly the Citrix Presentation Server, combines with Citrix XenServer to create an "end-to-end" solution that spans servers, applications, and desktops. Companies using the new combined product can centralize applications in their datacenter and deliver them as on-demand services to both physical and virtual desktops.

Virtualization, while not a new technology, has lately been gaining a huge head of steam as companies realize the deployment, maintenance, and security benefits of central control across nearly all applications, while also gaining agile and flexible solutions for the business.

To my thinking, virtualization combines the best of the old (central command and control) with the best of the new (user flexibility and ease of innovation). Virtualizing broadly places more emphasis on the datacenter and less on the client, without the end user even knowing it.

What's more, from a productivity standpoint, end users gain because app and OS updates and fixes get done more easily and quickly (fewer help desk calls and waits), operators can exercise the security constraints they need (data stays on the server), and developers need only target the server deployments (local processing is overrated).

And, of course, virtualization far better aligns the supply of IT resources with demand, removing wasted capacity while allowing more flexibility in ramping up or down on specific applications or data demands. Works for me.

Currently, most IT operations are faced with managing myriad Windows-based applications, and are hampered by the demands of installing, patching, updating, and removing those applications. Many users have simplified the task and lowered cost by using server-based deployment. We'll see a lot more of this, and that includes more uptake in the use of desktop virtualization, but that's another topic for another day.

According to Fort Lauderdale, Fla.-based Citrix, version 5 of XenApp, which includes more than 50 major enhancements, can improve application start-up time by a factor of 10 and reduces applications preparation and maintenance by 25 percent.

Of the major new features, I like the support for more Windows apps, the compatibility with Microsoft App-V (formerly SoftGrid), the HTTP streaming support, the IPv6 support, and the improved performance monitoring and load balancing. Also very nice is the "inter-isolation communication," which allows each app to be isolated yet aggregated as if installed locally. Add to that the ability of the apps to communicate locally, such as via cut and paste. Think of it as OLE for the virtualized app set (finally).

I've been watching Citrix since it took the bold step of acquiring XenSource just a little over a year ago. At that time, I saw the potential for its move to gobble a piece of the virtualization pie:
The acquisition also sets the stage for Citrix to move boldly into the desktop as a service business, from the applications serving side of things. We’ve already seen the provider space for desktops as a service heat up with the recent arrival of venture-backed Desktone. One has to wonder whether Citrix will protect Windows by virtualizing the desktop competition, or threaten Windows by the reverse.
The new XenApp 5 release will be featured on Sept. 9 as part of a global, online launch event called Citrix Delivery Center Live! This virtual event is the first in a series taking place in the second half of 2008 to highlight the entire Citrix Delivery Center product family. The debut event features presentations, chat sessions, and online demos from Citrix, as well as participation from key partners such as Microsoft and Intel. I'm also looking forward to attending Citrix's annual analyst conference in Phoenix on Sept. 9.

XenApp 5, which runs on the Microsoft Windows Server platform, leverages all the enhancements in Windows Server 2008 and fully supports Windows Server 2003. This enables existing Windows Server 2003 customers to immediately deploy Windows Server 2008 into their existing XenApp environments in any mix.

XenApp 5 will be available Sept. 10. For North America, suggested retail pricing is per concurrent user (CCU) and includes one year of Subscription Advantage, the Citrix program that provides updates during the term of the contract:
  • Advanced Edition – $350

  • Enterprise Edition – $450

  • Platinum Edition – $600
Standalone pricing for client-side application streaming and virtualization begins as low as $60 per CCU. TCO for virtualized apps will continue to fall over time, a nice effect for all concerned.