ACM SIGMOD Athens, Greece, 2011
sigmod pods logo

Keynote 1: James Hamilton, Amazon Web Services

Internet Scale Storage

Abstract

The pace of innovation in data center design has been rapidly accelerating over the last five years, driven by the mega-service operators. I believe we have seen more infrastructure innovation in the last five years than we did in the previous fifteen. Most very large service operators have teams of experts focused on server design, data center power distribution and redundancy, mechanical designs, real estate acquisition, and network hardware and protocols. At low scale, with only a data or center or two, it would be crazy to have all these full time engineers and specialist focused on infrastructural improvements and expansion. But, at high scale with tens of data centers, it would be crazy not to invest deeply in advancing the state of the art.

Looking specifically at cloud services, the cost of the infrastructure is the difference between an unsuccessful cloud service and a profitable, self-sustaining business. With continued innovation driving down infrastructure costs, investment capital is available, services can be added and improved, and value can be passed on to customers through price reductions. Amazon Web Services, for example, has had eleven price reductions in four years. I don't recall that happening in my first twenty years working on enterprise software. It really is an exciting time in our industry.

I started working on database systems twenty years ago during a period of incredibly rapid change. We improved DB2 performance measured using TPC-A by a factor of ten in a single release. The next release, we made a further four-fold improvement. It's rare to be able to improve a product by forty fold in three years but, admittedly, one of the secrets is to begin from a position where work is truly needed. Back then, the database industry was in its infancy. Customers loved the products and were using them heavily, but we were not anywhere close to delivering on the full promise of the technology.

That's exactly where cloud computing is today - just where the database world was twenty years ago. Customers are getting great value from cloud computing but, at the same time, we have much more to do and many of the most interesting problems are yet to be solved. I could easily imagine tenfold improvement across several dimensions in over the next five years. What ties these two problems from different decades together is that some of the biggest problems in cloud computing are problems in persistent state management. What's different is that we now have to tackle these problems in a multi-tenant, high-scale, multi-datacenter environment. It's a new vista for database and storage problems.

In this talk, we'll analyze an internet-scale data center looking at the cost of power distribution, servers, storage, networking, and cooling on the belief that understanding what drives cost helps us focus on the most valuable research directions. We'll look at some of the fundamental technology limits approached in cloud database and storage solutions on the belief that, at scale, these limits will constrain practical solutions. And we'll consider existing cloud services since they form the foundation on which future solutions might be built.

Bio

James is VP and Distinguished Engineer at Amazon Web Services where he focuses on infrastructure efficiency, reliability, and scaling. Prior to AWS, James held leadership roles on several high-scale products and services including Windows Live, Exchange Hosted Services, Microsoft SQL Server, and IBM DB2. He loves all things server related and is interested in optimizing all components from data center power and cooling infrastructure, through server design, and the distributed software systems they host. James maintains high-scale services blog at http://perspectives.mvdirona.com.

Credits