Saturday, April 21, 2012

Impressions from Amazon's AWS Summit in NYC

Yesterday (4/19) I attended the AWS Summit in NYC.

I'm a big fan and a heavy user of AWS, especially S3, EC2, and, naturally, RDS. At any point in time I have several dozen AWS machines running for me out there in the East region, and when we run special benchmarks and tests, the number of EC2 and RDS machines can easily reach three digits. As I said, I'm a fan...

A few quotes I was able to catch and document on my laptop, on my lap...:
"When you develop an app for Facebook, you must be prepared (and be afraid) that to your party, not no one will show up, but everybody will show up!"
So true! Simple and true. We all want to succeed, to have success with our app. We have to think about scaling from day 1.
"Database was bottleneck for building of sophisticated apps. This is no longer the case when building DynamoDB".
The quote above was about DynamoDB, which is an excellent new NoSQL service by AWS. But we can all think about YesSQL databases, and hope and wish for the same, and make it happen. Databases, good old RDBMSs, are great for applications: they offload a lot of complexity, SQL is a rich language and API for accessing data, it leverages existing skills, and it allows ACID. RDBMSs should not be a bottleneck for building sophisticated apps either! They should be able to scale.
"How people really want to interact with the database? Not by 'how many servers' but with 'give me a DB to handle 1000 reads, 10000 writes'. That's all. Users want a situation when you cannot run out of space, you cannot run out of capacity."
Inspiring. I couldn't agree more. A service is a good service when it hides away all complexities, gives me a URL and, boom, everything works. AWS is getting there, no doubt, and whoever provides a product or a service (including myself...) should work according to this quote!
"RDS has 2 push button scaling: Scale-Up or Scale-Out, read replica, or, sharding... 'have the application go to the right shard'"
This quote is from the excellent Solutions track seminar "Building scalable database application with Amazon RDS". As I said, RDS is an excellent service. Its capacity to be a "service", automatically tuned, backed up, upgraded, etc., is impressive. The ability to ensure transparent high availability across Availability Zones (Multi-AZ) and to have read replica(s) set up with a click is no less than phenomenal. However, in the scale-out department I think the solution is good, but not excellent. The support for read replicas is great, but it covers only the transportation of the data between the databases. It leaves the application with 2 or more IP addresses to deal with: the application has to route reads and writes, handle consistency under replication lag, and so on. In the sharding department it's even less complete: while I can spawn as many RDS servers as I like, the application needs to do all the command routing to the right shard and also handle the transportation of the data. It's quite far from the vision I see in the 3rd quote above, quoted from and inspired by Dr. Werner Vogels. I think this good service by AWS can be completed to become an excellent service with 3rd-party products, such as ScaleBase.
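To make concrete what "the application needs to do all the routing" means, here is a minimal sketch in Python. Everything in it is my own illustration and not anything RDS provides: the `ShardRouter` class, the hash-modulo sharding scheme, the `read_your_writes` flag, and the hostnames are all hypothetical. It only picks an endpoint; it does not open connections or move data between shards.

```python
import hashlib
import itertools


class ShardRouter:
    """Picks a database endpoint for a statement (illustrative only)."""

    def __init__(self, shards):
        # shards: list of {"primary": host, "replicas": [host, ...]}
        self.shards = shards
        # Round-robin over each shard's replicas; fall back to the
        # primary when a shard has no replicas.
        self._replica_cycles = [
            itertools.cycle(s["replicas"] or [s["primary"]]) for s in shards
        ]

    def shard_index(self, shard_key):
        # Use a stable hash: Python's built-in hash() of a str changes
        # between processes, which would scatter rows across shards.
        digest = hashlib.sha256(str(shard_key).encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.shards)

    def endpoint_for(self, shard_key, sql, read_your_writes=False):
        idx = self.shard_index(shard_key)
        is_read = sql.lstrip().upper().startswith("SELECT")
        # Writes, and reads that cannot tolerate replication lag,
        # must go to the shard's primary.
        if is_read and not read_your_writes:
            return next(self._replica_cycles[idx])
        return self.shards[idx]["primary"]


# Hypothetical endpoints, made up for this example.
router = ShardRouter([
    {"primary": "shard0-primary.example", "replicas": ["shard0-replica1.example"]},
    {"primary": "shard1-primary.example", "replicas": []},
])
write_host = router.endpoint_for(42, "UPDATE users SET name = 'x' WHERE id = 42")
read_host = router.endpoint_for(42, "SELECT name FROM users WHERE id = 42")
```

Even this toy version shows where the burden sits: the application has to know the shard key for every statement, decide when a lagging replica is acceptable, and live with the fact that a simple modulo scheme forces data to move whenever a shard is added.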

In addition to the above quotes, I enjoyed hearing a good scale-out case study from Pinterest, who invested in sharding themselves over almost 70 RDS databases. See here a good article about Pinterest's case:

I just love those case studies. Every one of those, especially from my prospects, customers, and partners, makes me much smarter and my products much much better. If you have a scale-out story, don't be shy to share!!

A quick update: look at this article and search for "Transformation three: We're moving from scaling by architecture to scaling by command". It's a good statement about database scale-out.

1 comment:

  1. Hi.
    I´m new in RDS AWS and i got tons of questions. Maybe you can help me:
    can i change the physical conf, of datafiles? can i create more than one that brings as default?, the databases that you set up and used in RDS has archive log on?, how can i change the location where the database write them? If i want to split data and index in separate datafiles, could i do it?

    Many thanks for your answers, and sorry for my poor poor english.

    Thanks in advance.