Wednesday, January 7, 2009

Webinar: IT analysts delve into desktop as service/VDI cloud opportunities for enterprises and telcos

Read a full transcript of the webinar discussion. Listen and watch.

Many of us expect that delivery of the full PC desktop experience and applications as a service will grow in use and value. For many users inside of enterprises, at call centers and for point of sale uses, their requirements can be me with a low-cost thin client and desktop as a service (DaaS) approach. The technology is largely here today.

But stark economics may end up driving the adoption. The value and cost savings from virtualization techniques escalate as they extend from servers to applications to the PC experience itself.

We're also seeing a lot of churn in the concept of the client device itself. The notion of a full-fledged tower PC for every desktop use scenario -- and the associated costs of maintenance, support, security, and upgrades -- is giving way to the right device for the use case. Why not the right software service mix for the right use case too?

For hosting organizations, telcos, cloud services providers and software as a service providers, the allure of providing a complete PC desktop and the required applications as a subscription is enticing. But this is a big subject, and many people are still just wrapping their heads around the implications.

Consequently, I recently participated in a webinar with virtual desktop infrastructure vendor Desktone that I think really gets to some core issues and insights on VDI and the models that support it. Joining me in the discussion are Jeff Fisher, senior director of strategic development at Desktone; Rachel Chalmers, research director of infrastructure management, at The 451 Group, and Robin Bloor, analyst at Hurwitz & Associates.

Here are some excerpts:
Fisher: Clearly, everyone is talking about cloud computing. You can’t look anywhere within IT and not hear about it. It’s amazing to see it surpassing even the frenzy around virtualization. In fact, most of the conversations people are having today are around virtualization and how it can take place in the cloud. Everyone wants to focus on all the benefits, including anytime/anywhere access and subscription economics.

However, like any other major trend that unfolds in IT, there are a number of challenges with the cloud. When people talk about cloud computing with respect to the enterprise, in most cases they’re talking about virtualizing server workloads and moving those workloads into a service provider cloud.

Clearly, that shift introduces a number of challenges. Most notable is the challenge of data security. Because server workloads are very tightly-coupled with their data tier, when you move the server or the server instance, you have to move the data. Most IT folks are not really comfortable with having their data reside in a service provider or external data center.

For that reason Desktone believes that it’s actually going to be virtual desktops, not servers, that are the better place to start and are going to be what jump starts this whole enterprise adoption of cloud computing.

The reason is pretty simple. Most fixed corporate desktop environments -- those are desktops that have a permanent home within your enterprise – already probably have their application and user data abstracted away from the actual desktop. The data is not stored locally. It’s stored somewhere on the network, whether it’s security credentials within the Active Directory (AD), whether it’s home drives that store user data, or it’s the back end of client-server applications. All the back end systems run within your data center.

The other interesting thing is this notion of the service-provider cloud, which is that it actually can traverse both the enterprise and the service provider data centers.

So, depending on the use case, service providers can either keep the virtual infrastructure and the racks powering that virtual infrastructure in their data center or they can, in certain cases, put the physical infrastructure within the enterprise data center, what we call the customer premise equipment model. The most important thing is that it doesn’t break the model.

There is flexibility in the location of the actual hosting infrastructure. Yet, no matter where it resides, whether it’s in a service provider data center or an enterprise data center, the service provider still owns and operates it and the enterprise still pays for it as a subscription.

Chalmers: There are three sensible places to run a desktop virtual machine. One is on the physical client, which gives you a whole bunch of benefits around the ability to encrypt and lock down a laptop and manage it remotely. One is to run it on the server, which is the very similarly tried-and-tested VM with VDI or Citrix XenDesktop method. That’s appropriate for a lot of these cases, but when you run out of server capacity or storage in the server-hosted desktop virtualization model a lot of companies would like elastic access to off-site resources.

This is particularly appropriate, for example, for retailers who see a big balloon in staffing – short-term and temporary staffing around the holiday seasons, although possibly not this year -- or for companies that are doing things off-shore and want to provide developer desktops in a very flexible way, or in education where companies get big summer classes, for example, and want to fire up a whole bunch of desktops for their students.

This kind of elastic provisioning is exactly what we see on the server virtualization side around cloud bursting. On the desktop side, you might want to do cloud bursting. You might even want to permanently host those desktops up in the cloud with a hosting provider and you want exactly the same things that you want from a server cloud deployment. You want a very, very clean interface between the cloud resources and the enterprise resources and you want a very, very granular charge back in billing.

And so, we see cloud-hosted desktop virtualization as a special case of server-hosted desktop virtualization.

Gardner: I think we’re entering a new era in how people conceive of compute resources. ... What’s happening now is that organizations are starting to re-evaluate the notion that a one-size-fits-all PC paradigm makes sense.

We have lots of different slices of different types of productivity workers. As Rachel mentioned, some come and go on a seasonal basis, some come and go on a project basis. We’re really looking at a slice-and-dice productivity in a new way, and that forces the organization to really re-evaluate the whole notion of application delivery.

When you start moving toward virtualization and you start re-thinking about infrastructure, you start re-thinking the relationship between hardware and software. You start re-thinking the relationship between tools and the deployment platform, as you elevate the virtualization and isolate applications away from the platform, and you start re-thinking about delivery.

If you take the step toward terminal services and delivering some applications across the wire from a server-based host, that continues to tip this a little bit toward, “Okay, if I could do it with a couple of apps, why not look at more? If I could do it with apps, why not with desktop? If I can do it with one desktop, why not with a mobile tier?”

If we look at the cost pressures that organizations are under, recognizing that it’s maintenance and support, and risk management and patch management that end up being the lion’s share of the cost of these systems, we’re really at a compelling point where the cost and the availability of different alternatives has really sparked sort of a re-thinking.

So we’re really going through this period of transformation, and I think that virtualization has been a catalyst to VDI and that VDI is therefore a catalyst into cloud. If you can do it through your servers, somebody else can do it through theirs.

When we start really seeing total costs tip as a result, the delta between doing it yourself and then doing it through some of these newer approaches is just super-compelling. Now that we’re entering into an economic period, where we’re challenged with top-line and bottom-line growth, people are not going to take baby steps. They’re going to be looking for transformative, real game-changing types of steps.

This is particularly relevant if they’re commodity level types of applications and services. It could be communications and messaging, it could be certain accounting or back office functions. It just makes a lot of sense to start re-evaluating. What we haven’t seen, unfortunately, is some clear methodologies about how to make these decisions and boundaries inside of organizations with any sort of common framework or approach.

It’s still a one-off company by company approach -- which workers should we keep on a full-fledged PC? Who should we put on a mobile Internet device, for example? Who could go into a cloud-based applications hosting type of scenario that you’ve been describing?

It’s still up in the air and I’m hoping that professional services and systems integrators over the next months and years will actually come up with some standard methodologies for going in and examining the cost-benefit analysis, what types of users and what types of functions and what types of applications it makes sense to put into these different ... environments.

Bloor: One of the things that is really important about what’s happening here with the virtualization of the desktop is just the very simple fact that desktop costs have never been well under control.

The interesting thing is that with end users that we’ve been talking to earlier this year, when they look at their user populations, they normally come to the conclusion that something like 70 or 80 percent of PC users are actually using the PC in a really simple way. The virtualization of those particular units is an awful lot easier to contemplate than the sophisticated population of heavy workstation use and so on.

With the trend that’s actually in operation here, and especially with the cloud option where you no longer need to be concerned about whether your data center actually has the capacity to do that kind of thing, there’s an opportunity with a simple investment of time to make a real big difference in the way the desktop is managed.

... From the corporate point of view, if you’re somebody that’s running a thousand desktops or more it’s a problem. It is a problem in terms of an awful lot of things but mostly it’s a support issue and it’s a management issue. When you get an implementation that involves changing the desktop from a PC to a thin client and you don’t put anything into the data center, it improves.

You’ve now got a situation where you don’t need cages in the data center running PC blades or running virtualized blades to actually provide the service. You don’t need to implement the networking stuff, the brokering capability, boost the networking in case it’s clashing with anything else, or re-engineer networks.

All you do is you go straight into the cloud and you have control of the cloud from the cloud. It’s not going to be completely pain free obviously, but it’s a fairly pain-free implementation.

It absolutely stunned me that a cloud offering became available earlier this year because that meant that somebody would actually have had to have been thinking about this two years ago in order to put together the technology that would enable such an offering like that.

Gardner: This is a much more lucid and rational architecture. We’ve found ourselves, over the past 15 or 20 years, sort of the victim of a disjointed market roll out. We really didn’t anticipate the role of the Internet, when client-server came about. Client-server came about quickly just after local area networks (LANs) were established.

We really hadn’t even rationalized how a LAN should work properly, before we were off and running into bringing browsers in TCP/IP stacks. So, in a sense, we've been tripping over and bouncing around from one very rapid shift in technology to another. I think we’re finally starting to think back and say, “Okay, what’s the real rational, proper architectural approach to this?”

We recognize that it’s not just going to be a PC on every desktop. It’s going to be a broadband Internet connection in every coat pocket, regardless of where you are. That fundamentally changes things. We’re still catching up to that shift.

Bloor: From an architect’s point of view, if nobody had influenced you in any way and you were just asked to draw out a sense of a virtualization of services to end users, you would probably head in this direction. I have no doubt about it. I’ve been an architect in my time, and it’s just very appealing. It looks like what Desktone DaaS has here is resources under control, and we’ve never had that with a PC.
Read a full transcript of the webinar discussion. Listen and watch.

Monday, January 5, 2009

A technical look at how parallel processing brings vast new capabilities to large-scale BI and data analysis

Listen to the podcast. Download the podcast. Find it on iTunes/iPod. Learn more. Sponsor: Greenplum.

Read a full transcript of the discussion.

Internet-scale data collecting, swarms of sensors outputs, and content clouds from the mobile device fabric -- as well as enterprises piling up ever more kinds of analytics metadata to analyze -- have stretched traditional data-management models to the breaking point.

Yet advances in parallel processing using multi-core chipsets have prompted new software approaches such as MapReduce that can handle these data chores at surprisingly low total cost. The technical response to oceans of data is something that has been building for some time. But the time now seems ripe to bring the technical solutions of lower-cost parallel computing advances into play with the economic imperatives of huge data crunching requirements.

And so just what are the technical underpinnings that support the new demands being placed on, and by, extreme data sets? What economies of scale can we anticipate? How will these advances spur the movement of data to Internet cloud models?

BriefingsDirect's Dana Gardner put these and other questions to a panel of new data architecture experts, to plumb into how parallelism, modern data infrastructure, and MapReduce technologies come together. He spoke with Joe Hellerstein, professor of computer science at UC Berkeley; Robin Bloor, analyst at Hurwitz & Associates, and Luke Lonergan, CTO and co-founder at Greenplum.

Here are some excerpts:
Data growth has been following and exceeding Moore's Law over time. What we've been seeing is that the data sets that people are gathering and storing over time have been doubling at a rate of even faster than every 18 months. ... We're going to see all kinds of large organizations gathering data from all sorts of automated sources.

... What's changed in the last few years is that clock speeds on processors have stopped doubling every 18 months. ... Instead, what they are doing is putting more processing cores on every chip. You can expect the number of processors on your chip to double every 18 months, but they're not going to get any faster.

So data is growing faster, and we have chips basically standing still, but you're getting more of them. If you want to take advantage of that data, you're going to have to program in parallel to make use of all those processors on the chips. That's the confluence that's happening.

There are very many people storing and analyzing more data. We're very encouraged that most of our customers are finding new uses for data that are earning them more money. Consequently, the driver to analyze more and more data continues to grow. As our customers get more successful, this use of data is becoming really important.

It's easy to parallelize the data. You break it up into little chunks and you throw it out to different machines. What can we do cleverly in computing with that kind of a framework? There are a lot of ideas for how to move forward ... where you are taking this massively parallel data-flow approach.

One thing that's kind of invisible is that there is a lot of data out there that's not being analyzed fast enough to be analyzed effectively. That's something that I think parallelism is going to address. ... The only reason not to gather that data is when you run out of affordable processing and storage. Anybody with the budget will have as much data as they can budget for and will try to monetize that. It's going to be pervasive.

The core problem we've solved is the ability for our engine to redistribute the data and the computation on the fly, as these queries and analysis are being performed. ... The combination of the software-switch interconnect, which Greenplum built into the Greenplum product, and the underlying use of commodity parallel computers, is brought together in this database system that makes it possible to do SQL query and languages like MapReduce with automatic parallelism.

Businesses have invested a tremendous amount of their time over the last 15 to 25 years in SQL, and some of the more traditional kinds of business analysis that pay off very well are ensconced in that programming model. So, packaging a system that can do transactional, mixed workloads with large amounts of concurrency, with applications that use the SQL paradigm, is very important.

Packaging this together as software plus hardware, making that available as a reference architecture for customers, has been very important and has been very successful in our accounts at New York Stock Exchange, Fox, MySpace, and many others.

The combination of SQL and MapReduce in a unified way in programming environments ... is a very pragmatic [step] that can help with people's ability to get their hands on data in an organization. ... You want to have the same access to all your data via either an SQL interface or a MapReduce programming interface. ... You ought to be able to access those with whatever language suits you, mix and match.

Some things are easier to do in MapReduce, and some things are easier to do in SQL, even when you know both. Good programmers have a lot of tools in their tool belt. They like to be able to use whatever tool is appropriate for the task. Having both of these things interleaved is really quite helpful.

[The solution] is about users being able to gain access to all that power. What really turned the corner for general data analysis using SQL is the ability for a user to not to have to worry about what kind of table structure they have. They can have lots of small tables joining to lots of big tables, and big tables joining to each other.

What the developer needs is an engine that doesn't care how the data is distributed, per se, just being able to use all of that parallelism on the problems of interest. ... The physical model of how the database is distributed in a shared nothing architecture in a Greenplum system is not visible to the developer.

There are a couple of questions about how an individual organization's data will end up in the cloud. Inevitably it will, but in the short-term, people like to keep their data close, particularly database data that's traditionally been in the warehouses, very carefully managed. ... It's going to be some time until we really see everybody's data warehouses up in the cloud. ... How long will it be until you really get big volumes of data in the cloud[?] The answer is that certainly new applications will be up there. We may start to see old data getting uploaded in the cloud as well.

We'll start to see big data sets up there that don't necessarily belong to anyone, and they are going to be big. In that environment, you can imagine big data analytics will have to run in the cloud, because that's where the data will be. One of the fun things about the cloud that's really exciting is the elasticity of the resources. You don't buy yourself a data center full of machines, but you rent as many machines as you need for a task.

If you have a task that's going to look at a lot of data, you would rent a lot of machines for a few hours, and then you would shrink your pool. What this is going to allow people to do is that even small organizations may, for a short period of time, look at an enormous amount of data, which perhaps doesn't originate in their own data production environment, but is something that they want to utilize for their purposes.

Disk densities show no signs of slowing down. So, data is going to be essentially no cost. The data-gathering infrastructure is also going to be mechanized. We're going through what I call the industrial revolution of data production. We're just going to build machines to generate data, because we think we can get value out of that data, and we can store it essentially for free.

The compute cost of multi-core with parallelism is going to continue Moore's Law. It's just going to continue it in a parallel programming environment. If we can get all those cores looking at all that data, it won't cost much to do that, and the cost of that will continue to shrink by half.

The only real barrier to the process is to make those systems easy to program and manageable. Cloud helps somewhat with manageability, and programming environments like SQL and MapReduce are well-suited to parallelism. We're going to just see an enormous use of data analysis over time. It's just going to grow, because it gets cheaper and cheaper and bigger and bigger.
Read a full transcript of the discussion.

Listen to the podcast. Download the podcast. Find it on iTunes/iPod. Learn more. Sponsor: Greenplum.