Wednesday, May 30, 2012

Scale-out your DB on ARM-based servers

Today, I think we witnessed a small sign for a big revolution...
"Dell announced a prototype low-power server with ARM processors, following a growing demand by Web companies for custom-built servers that can scale performance while reducing financial overhead on data centers"
In short, ARM (see Wikipedia definition here) is an architecture standard for processors. ARM processors are slower compared to good old x86 processors from Intel and AMD, but have power-efficiency, density and price attributes that intrigue customers, especially in our days of green data centers where carbon emissions is carefully measured, and of course, cost-saving economics.

Take iPhones and iPads for example, those amazing machines do fast real-time calculations with their relatively powerful ARM processors (Apple A4, A5, A5x), yet are extremely efficient with regards to power and stay relatively cold. See picture (credits to Wikipedia) of the newest Apple A5x chip, used in New iPad:

Today when true big web and cloud players build their data centers the question is not "how big are your servers?" but rather "how many servers do you carry?". Ask Facebook, Google, Netflix, and more to come... For those guys, no single server can be big enough anyway, so they're built from the ground up for scaling-out to numerous servers and performing small tasks, concurrently. Familiar with Google's Map-Reduce and Hadoop? So - why not parallelize on ARM based servers? Can you imagine 20 iPads, occupying the same space as a 1U rack-mounted "pizza" server, but with 5x parallel computing power, 20x electricity power efficiency, and 100x cooling costs efficiency?

So what all this has to do with database scalability you ask?

With this quote from the article, I don't agree: "But ARM still cannot match up chips from Intel and AMD for resource-heavy tasks such as databases.".

Oh I remember those days when, in every data center in every organization I gave consulting to, I saw the same picture... All machines were nice, neatly organized, tagged, blades, standard racks, mostly virtualized... But the DB? No... Those servers were non-standard, the biggest, ugliest, most-expensive, capex and opex. Why? It's the DB! It's special! It has needs!! Specialized HW, specialized storage, $$$. Those days are over, as organizations' need to save is arriving to the shores of the sacred database. Today, harder questions are being asked, more is being done in commoditization, more databases are virtualized, and the cloud...

Big Data is everywhere, in the web, in the cloud, in the enterprise, databases must scale, scale-out or else explode, it's a hard fact. Databases can be scaled with smart distribution and parallelism, and then can use commodity hardware, can be easily virtualized, and cloud-ified. If distribution is done correctly - the sum is greater than its parts, and the parts in this case can be low end... The lowest of the low... database machines can definitely be ARM based servers, each holds portion of the data, attracts a portion of the concurrent sessions, and contributes to the overall processing.

A database on an iPad? Naa, I prefer breaking another record in Fruit Ninja.
A database on 20 ARM-based servers? If it's 5x faster and costs 60x less in electricity and cooling - then yes, definitely.

Monday, May 21, 2012

Scaling OLTP is nothing like scaling Analytics

We're in the big data business. OLTP applications and Analytics.

Scaling OLTP applications is nothing like scaling Analytics, like I posted here: OLTP is a mixture of read and writes, heavy session concurrency and also growing amounts of data.

In my previous post,, I mentioned that Analytics can be scaled using: columnar storage, RAM and query parallelism.

Columnar storage cannot be used for OLTP, as while it makes read scans better, it hurts writes, especially INSERTs. Same goes for RAM, the approach of “let’s put everything in memory” is also problematic for writes that should be Durable (the D in ACID). There are databases that reach Durability with writing to memory of at least 2 machines, I'll get to that in a later post, but in the simpler view, RAM is great for reads (Analytics), very limited for writes (OLTP).

Query parallelism that worked for Analytics, is limited for OLTP. Mostly because of high concurrency and writes, OLTP is a mixture of read and writes, ratios today reach 50%-50% and more. Every write operation is eventually at least 5 operations for the database, including table, index(s), rollback segment, transaction log, row-level locking and more. Now multiply with 1000 concurrent transactions, and 1TB of data. The database engine itself becomes the bottleneck! It puts so many resources into buffer management, locking, thread locks/semaphores, and recovery tasks, no resources are left available for handling query data!

3 bullets why na├»ve parallelism is not a magic bullet for OLTP:
  1. Parallel query within the same database server will just turn the hard-to-manage 1000 concurrent transactions into impossible 1000000 concurrent sub-transactions… Good luck with that… 
  2. Parallelizing query on several database servers is a step in a good direction. However it can’t scale: if I have 10 servers and each one my 1000 concurrent transactions needs to gather data from all servers in parallel, how many concurrent transactions I’ll have on each server? That’s right, 1000. What did I solve? Can I scale to 2000 concurrent transactions? All my servers will die together. In that case what if I scale to 20 servers instead of 10? Then I’ll have 20 servers with 2000 concurrent transactions… that will all die together. 
  3. OLTP operations are not good candidates for parallelism:
    1. Scans, Full table/index scans and range scans, are parallelized all the time in Analytics, are seldom in OLTP. In OLTP most accesses are short, pinpointed, index-based small range and unique scans. Oracle’s optimizer mode FIRST_ROWS (OLTP) will almost always prefer index access and ALL_ROWS (Analytics) will have hard time give up its favorable full table scan.  So what exactly do we parallelize in OLTP? An index rebuild once a day (scan...)?
    2. 1000 concurrent 1-row INSERT commands a second - is a valid OLTP scenario. What exactly do I parallelize?
Parallelism cannot be the one and only complete solution. It serves a minor role in the solution, whose key factor is: distribution.

OLTP databases can scale only by a smart distribution of data but also the concurrent sessions among numerous database servers. It’s all in the distribution of the data, if data is distributed in a smart way, concurrent sessions will be also distribute across servers.

Go from 1 big fat database server dealing with 1TB of data and 1000 concurrent transactions, to 10 databases, each deal with easy 100GB and 100 concurrent transactions. I'll hit the jackpot if I'll manage to keep databases isolated, shared nothing, processing-wise, not only cables-wise. Best are transactions that start and finish on a single database.

And if I’m lucky and my business is booming, I can scale:
  1. Data grew from 1TB to 1.5TB? Add more databases servers.
  2. Concurrent sessions grew from 1000 to 1500? Add more databases servers.
  3. Parallel query/update? Sure! If a session does need to scan data from all servers, or need to perform an index rebuild, it can run in parallel on all servers, and will take a fraction of the time.
Ask Facebook (FB, as of today... ). Each of their 10,000s databases is handling a fraction of the data, in a way that only a fraction of all sessions are accessing it in any point in time.

Each of the databases is still doing hard work on every update/insert/delete and on buffer management, locking, thread locks/semaphores, and recovery tasks. What can we do? It's OLTP, it's read/write, it's ACID... It's heavy! I trust every one of my DBs to do what it does best, I just give it the optimal data size and session concurrency to do that.

Let's summarize here:

In my next post I'll dive more into implementations caveats (shared disk, shared memory, sharding) and pitfalls, do's and don't's... 

Stay tuned, join those who subscribed and get automatic updates, get involved!

Tuesday, May 15, 2012

Scale differences between OLTP and Analytics

In my previous post,, I reviewed the differences between OLTP and Analytics databases.

Scale challenges are different between those 2 worlds of databases.

Scale challenges in the Analytics world are with the growing amounts of data. Most solutions have been leveraging those 3 main aspects: Columnar storage, RAM and parallelism.
Columnar storage makes scans and data filtering more precise and focused. After that – it all goes down to the I/O - the faster the I/O is, the faster the query will finish and bring results. Faster disks and also SSD can play good role, but above all: RAM! Specialized Analytics databases (such as Oracle Exadata and Netezza) have TBs of RAM. Then, in order to bring results for queries, data needs to be scanned and filtered, a great fit for parallelism. A big data range is divided into many smaller ranges given to parallel worker threads that each performs his task in parallel, the entire scan will finish in a fraction of the time.

In the OLTP, scale challenges are in the growing transaction concurrency throughput and… growing amounts of data. Again? Didn't we just say growing data is the problem of Analytics? Well, today’s OLTP apps are required to hold more data to provide a larger span online functionality. In the last couple of years OLTP data archiving was changed dramatically. OLTP data now covers years and not just days or weeks. Facebook recently launched its “time line” feature (, can you imagine your timeline ends after 1 week? Facebook’s probably world’s largest OLTP database holds data of a billion users for years back. Today all data is required anywhere anytime, right here, right now, online. Many of today’s OLTP databases go well beyond the 1TB line. And what about transaction concurrency throughput? Applications today are bombarded by millions of users shooting transactions from browsers, smartphones, tablets… I personally checked my bank account 3 times today. Why? Because I can…

What can be done to solve OLTP scale challenges?

In my next post let's start answering this question with understanding why solutions proposed for the Analytics are limited in the OLTP, and start reviewing relevant approaches.

Stay tuned, subscribe, get involved!

Monday, May 14, 2012

OLTP vs. Analytics

This blog is all into database scaling, but before we dive into scaling challenges, here's a quick alignment around those important terms, if you want, an introduction to OLTP and Analytics...

Traditionally the RDBMS world was divided into 2 main categories:
OLTP, Online Transaction Processing (, data is stored and processed by operational applications throughout the organization. In an example of a supermarket, the cashiers application is clearly an operational application doing OLTP. Traditionally OLTP databases were characterized by:

  1. Large number of concurrent users
  2. High throughput of concurrent transactional activity of reads and writes mixture
  3. Smaller amounts of data

On the other side we could find the Data Warehouse ( or DSS (Decision Support Systems,, Oracle likes DSS very much...) or with their newer name: Analytics (
In our supermarket, the Data Warehouse was loaded from data from all operational system, the cahiers system and also from the logistics and suppliers system , ERP, CRM – and Business Intelligence tools are used to allow convenient visual ways to query and analyze the enormous amounts of data making aggregations, summaries, comparisons and so on. Traditionally Analytics databases were characterized by:

  1. Small number of concurrent users
  2. Non transactional, read-only activity
  3. Huge amounts of data

While RDBMS can be used for both OLTP and Analytics, each has different design methodologies, tuning parameters, hardware requirements. Every DBA will tell you that Oracle’s BITMAP INDEX speeds up Analytics since the schema is in a star-schema design, however it can be hell for OLTP when schema design is totally different, and concurrent write transactions will suffer from the BITMAP’s segment-level-locking. Columnar database
On the other hand, in OLTP the DBA will set a goal and eventually reach a cache hit-ratio of 95%, and “sql-area” hit-ratio of 99.99%. In Analytics the above numbers are impossible to reach just because of the different activity imposed on the database, resulting in different behaviors, and it’s OK. There is a difference.

Scale challenges and practices are also different. In future posts I'll address specific database scale issues relevant to OLTP from one hand and Analytics on the other.

Stay tuned.