Showing posts with label openstack. Show all posts

Tuesday, August 30, 2011

I come to use clouds, not to build them...

[Update: Thanks for all the comments and for Ryan Lawler's GigaOM summary. I would also like to credit James Urquhart's posting on Competing With Amazon Part 1 for giving me the impetus to write this.]

My question is what are the alternatives to AWS from a developer perspective, and when might they be useful? However I will digress into a little history to frame the discussion.

There are really two separate threads of development in cloud architectures: the one I care about is how to build applications on top of public cloud infrastructure; the other is how to build the cloud infrastructure itself.

In 1984 I didn't care about how the Zilog Z80 or the Motorola 6809 microprocessors were made, but I built my own home-brew 6809 machine and wrote a code generator for a C compiler because I thought it was the best architecture, and I needed something to distract me from a particularly mind-numbing project at work.

In 1988 I joined Sun Microsystems and was one of the people who could argue in detail how SPARC was better than MIPS or whatever as an instruction set, or how Solaris and OpenLook window system were better. However I never designed a RISC microprocessor, wrote kernel code or developed window systems libraries. I helped customers use them to solve their own problems.

In 1993 I moved to the USA and worked to help customers scale their applications on the new big multiprocessor systems that Sun had just launched. I didn't re-write the operating system myself, but I figured out how to measure it and explain how to get good performance in some books I wrote at the time.

In 1995 when Java was released and the WWW was taking off, I didn't work on the Java implementation or IETF standards body, I helped people to figure out how to use Java and to get the first web servers to scale, so they could build new kinds of applications.

In 2004 I made a career change to move from the Enterprise Computing market place with Sun to the Consumer Web Services space with eBay. At the time eBay was among the first sites to have a public web API. It seemed to me that the interesting innovation was now taking place in the creation and mash-up of web services and APIs. No-one cared about what operating system they ran, what hardware that ran on, or who sold those computers.

Over time, the interesting innovation that matters has moved up the food chain to higher and higher levels of abstraction, leveraging and taking for granted the layers underneath. A few years ago I had to explain to friends who still worked at Sun, how I was completely uninterested in whether my servers ran Linux or Solaris, but I did care what version of Java we were using.

Now I'm working on a cloud architecture for Netflix. We don't really care which Content Delivery Network is used to stream the TV shows over the Internet to the customers; we interchangeably use three vendors at present. I also don't care how the cloud works underneath. I hear that AWS uses Xen, but it's invisible to me. What I do care about is how the cloud behaves, i.e. does it scale and does it have the feature set that I need.

That brings me back to my original question: what are the alternatives to AWS, and when might they be useful?

Last week I attended an OpenStack meetup, thinking that I might learn about its feature set, scale and roadmap as an alternative to AWS. However the main objective of the presenters seemed to be to recruit the equivalent of chip designers and kernel developers to help them build out the project itself, and to appeal to IT operations people who want to build and run their own cloud. There was no explanation or outreach to developers who might want to build applications that run on OpenStack.

I managed to get the panel to spend a little while explaining what OpenStack consists of, and figured out a few things. The initial release is only really usable via the AWS clone APIs and doesn't have an integrated authentication system across the features. The "Diablo" release this fall should be better integrated and will have native APIs; it is probably suitable for proof of concept implementations by people building private clouds. The "Essex" version targeted at March next year is supposed to be the first production oriented release.

There are several topics that I would like to have seen discussed, perhaps people could discuss them in the comments to this blog post? One is a feature set comparison with AWS, and a discussion of whether OpenStack plans to continue to support the AWS clone APIs for equivalent features as it adds them. So far I think OpenStack has a basic EC2 clone and S3 clone, plus some networking and identity management that doesn't map to equivalent AWS APIs.

The point of my history lesson in the introduction is that a few very specialized engineers are needed to build microprocessors, operating systems, servers, datacenters, CDNs and clouds. It's difficult and interesting work, but in the end, if it's done right, it's a commodity that is invisible to developers and their customers. One of the slides proudly showed how many developers OpenStack had, a few hundred, mostly building it from scratch. There wasn't room on the slide to show how many developers AWS has on the same scale. Someone said recently that the far bigger AWS team has more open headcount than the total number of people working on OpenStack. When you consider the developer ecosystem around AWS, there must be hundreds of thousands of developers familiar with AWS concepts and APIs.

Some of the proponents of OpenStack argue that because it's an open source community project it will win in the end. I disagree: the most successful open source projects I can think of have a strong individual leader who spends a lot of time saying no to keep the project on track. Some of the least successful are large multi-vendor industry consortiums.

The analogy that seems to fit is Apple's iOS vs. Google's Android in the smartphone market. The parts of the analogy that resonate with me are that Apple came out first and dominates the market, taking most of the profit and forcing its competitors to try to band together to compete, while changing the rules of the game and creating new products like the iPad that leave the competition floundering. By adding up the whole incompatible, fragmented Android market, it's possible to claim that Android is selling in a similar volume to iPhone. However, it's far harder for developers to build Android apps that work on all devices, and they usually make much less money from them. Apple and its ecosystem are dominant, growing fast, and extremely profitable.

In the cloud space, OpenStack appears to be the consortium of people who can't figure out how to compete with AWS on their own. AWS is dominant, growing its feature set and installed capacity very fast. Every month that passes, AWS is refining and extending its products to meet real customer needs. Measured by the reserved IP address ranges used by its instances, AWS has more than doubled in the last year and now has over 800,000 IP addresses assigned to its regions worldwide.

The problem with a consortium is that it is hard to get it to agree on anything, and Brooks's law applies (The Mythical Man-Month: adding resources to a late software project makes it later). While it seems obvious that adding more members to OpenStack is a good thing, in practice it will slow the project down. I was once told that the way to kill a standards body or consortium is to keep inviting new people to join and adding to its scope. With the huge diversity of datacenter hardware, and many vendors with a lot to lose if they get sidelined, I expect OpenStack to fracture into multiple vendor-specific "stacks" with narrow test matrixes and extended features that lock customers in and don't interoperate well.

I come to use clouds, because I work for a developer oriented company that has decided that building and running infrastructure on a global scale is undifferentiated heavy lifting, and we can leverage outside investment from AWS and others to do a better job than we could do ourselves, while we focus on the real job of developing global streaming to TVs.

Operations oriented companies tend to focus on costs and ways to control their developers. They want to build clouds, and may use OpenStack, but their developers aren't going to wait. They may be allowed to use AWS "just for development and testing", but when the time comes to deploy on OpenStack, its lack of features is going to add a significant burden of complexity to the development team. OpenStack's lack of scale and immaturity compared to AWS is also going to make it harder to deploy products. I predict that the best developers will get frustrated and leave to work at places like Netflix (hint, we're hiring).

I haven't yet seen a viable alternative to AWS, but that doesn't mean I don't want to see one. My guess is that in about two to three years from now there may be a credible alternative. Netflix has already spent a lot of time helping AWS scale as we figured out our architecture. We don't want to do that again, so I'm also waiting for someone else (another large end-user) to kick the tires and prove that an alternative works.

Here's my recipe for a credible alternative that we could use:

AWS has too many features to list. We use almost all of them, because they were all built to solve real customer problems and make life easier for developers. The last slide of my recent cloud presentations at http://slideshare.net/adrianco contains a partial list as a terminology glossary. AWS is adding entirely new capabilities and additional detailed features every month, so this is a moving target that is accelerating fast away from the competition...

From a scale point of view Netflix has several thousand instances organized into hundreds of different instance types (services), and routinely allocates and deallocates over a thousand new instances each day as we autoscale to the traffic load and push new code. Often a few hundred instances are created in a few minutes. Some other cloud vendors we have talked to consider a hundred instances a large customer, and their biggest instances are too small for us to use. We mostly use m2.4xl and we need the 68GB RAM for memcached, Cassandra or our very large Java applications, so a 15GB max doesn't work.

In summary, although the CDN space is already a commodity with multiple interchangeable vendors, we are several years from having multiple cloud vendors that have enough features and scale to be interchangeable. The developer ecosystem around AWS concepts and APIs is dominant, so I don't see any value in alternative concepts and APIs. Please try to build AWS clones that scale. Good luck :-)

Tuesday, July 20, 2010

Can OpenStack catch up with AWS? Looks unlikely to me.

There has been a lot of chatter about the new OpenStack standard in the last few days, and how it could become a third option in the cloud infrastructure market alongside Amazon AWS and VMware vCloud. For example, Randy Bias/Cloudscaling has a good overview of the players.

The real question is whether OpenStack can catch up and become viable. Amazon is always described as being years ahead of the competition and having the lion's share of the market. Estimates of the size of their lead depend on who you talk to, but I don't see anyone disputing their lead. On top of that, Amazon is investing heavily, has big customers like Zynga and Netflix stretching and hardening their systems (along with a huge number of small customers), and has already added many features beyond the basic compute (EC2) and storage (S3) that have been copied by others including OpenStack.

So for OpenStack to catch up, they have to move faster than Amazon and leverage the slipstream effect: Amazon has had to educate the market and figure out what works, so competitors can copy its successes. However, when I look at the details of the OpenStack specification, they appear to be copying some of Amazon's problems as well, in particular the account and authentication model, which does not scale for enterprise use.

At the most basic level, it is infeasible to change the security model of a platform architecture; it's one of the most fundamental starting points that conditions the layers above. Changes to account and authentication management break all the layers of applications and tools that are built on the platform. One of the first problems that Netflix had with Amazon was the lack of sub-accounts and role based access control (RBAC), and while beating up Amazon on this point for the last two years, we have built our own platform layers and tools to compensate. As a result, we find it impossible to use any of the web consoles or tools produced by Amazon or the many cloud vendors, which assume that there is a single account owner who can do anything. We hope that Amazon will eventually support sub-accounts, and when they do, I expect it will break everyone's tools.

At this point OpenStack is just the base level of the platform. It doesn't really have layers of tools on top yet, but its account and authentication model appears to be exactly the same as Amazon's, so they will end up with layers of tooling that don't meet the needs of enterprise customers.

What's the difference between a startup and an enterprise? In a startup, everyone in IT knows the root password to every machine in their infrastructure. In an enterprise, root passwords are carefully controlled, they change when someone in-the-know leaves, and specialist groups manage different parts of the system (Network ops can only mess with the switches, DBAs can only mess with the database etc.). The problem with a single account for the cloud is that everyone who needs to do anything to that account can do everything to it, and the common tools are oriented to a single user managing several accounts, rather than a hierarchy of users managing parts of one account. All the systems in the account need an authentication key to access cloud services, and changing the password and key means you have to re-key every system. Get this wrong and your cloud will evaporate in an instant.
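To make the contrast concrete, here is a minimal sketch of what sub-accounts with role based access control might look like. It is not any vendor's actual API: the role names, permission strings, and the `SubAccount` and `authorize` helpers are all hypothetical, chosen only to mirror the network ops and DBA example above.

```python
from dataclasses import dataclass, field

# Hypothetical roles and permission strings, for illustration only.
ROLE_PERMISSIONS = {
    "network-ops": {"switch:configure"},
    "dba": {"database:admin"},
    # The single-account model is equivalent to everyone holding this role.
    "account-owner": {"switch:configure", "database:admin", "instance:terminate"},
}


@dataclass
class SubAccount:
    """A user within one cloud account, limited to the roles granted to it."""
    name: str
    roles: set = field(default_factory=set)

    def can(self, permission: str) -> bool:
        # A permission is allowed if any of the user's roles grants it.
        return any(
            permission in ROLE_PERMISSIONS.get(role, set())
            for role in self.roles
        )


def authorize(user: SubAccount, permission: str) -> None:
    """Raise PermissionError unless the user holds a role granting the permission."""
    if not user.can(permission):
        raise PermissionError(f"{user.name} may not perform {permission}")
```

With this shape, a sub-account holding only the `dba` role can administer the database but cannot reconfigure switches or terminate instances, and removing one person means revoking one sub-account rather than re-keying every system.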

So in my opinion, a necessary but not sufficient condition for OpenStack to eventually catch up with AWS is that they need to build sub-accounts and RBAC into their spec from the start. However it seems much more likely that Amazon will just disappear into the distance from what I've seen so far.