
Friday, January 4, 2013

Partial replication and sharding

I came across this: http://stackoverflow.com/questions/14136633/difference-between-partial-replication-and-sharding
I was wondering whether sharding is an alternate name for partial replication or not. What I have figured out (with a toy placement sketch after the list) is that:
  • Partial Repl. – each data item has copies at some, but not all, of the nodes (‘Sharding’?)
  • Pure Partial Repl. – nodes hold copies of only a subset of the data items, and no node contains a full copy of the database
  • Hybrid Partial Repl. – one set of nodes are full replicas and another set of nodes are partial replicas
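
To make the three schemes concrete, here is a toy placement map; the nodes and the items A-D are mine, hypothetical, not from the original question:

    # Toy placement maps for the three schemes above (items "A".."D").
    partial_repl = {                  # each item is on some, but not all, nodes
        "node-1": {"A", "B", "C", "D"},   # one node may still hold everything
        "node-2": {"A", "B"},
        "node-3": {"C", "D"},
    }
    pure_partial_repl = {             # NO node holds a full copy of the database
        "node-1": {"A", "B"},
        "node-2": {"B", "C"},
        "node-3": {"C", "D"},
    }
    hybrid_partial_repl = {           # full replicas plus partial replicas
        "node-1": {"A", "B", "C", "D"},
        "node-2": {"A", "B", "C", "D"},
        "node-3": {"A", "C"},
    }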

I thought it was a good topic, and I wrote a really nice answer, but something went wrong when I pressed "post this answer", probably some error or mistake on their side. Anyway, this is what I have to say about partial replication and sharding:

Partial replication is an interesting approach: you distribute the data with replication from a master to slaves, each of which holds a portion of the data. Eventually you get an array of smaller, read-only DBs, each containing a portion of the data. Reads can be distributed and parallelized very well.
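
Here is a minimal sketch of the read routing this gives you; the catalog and node names are hypothetical, and a real replication layer would maintain this mapping itself:

    import random

    # Hypothetical catalog: which partial replica holds which tables.
    REPLICA_CATALOG = {
        "replica-1": {"users", "orders"},
        "replica-2": {"orders", "payments"},
        "replica-3": {"users", "payments"},
    }

    def route_read(table):
        """Send the read to any replica that holds a copy of the table."""
        candidates = [node for node, tables in REPLICA_CATALOG.items()
                      if table in tables]
        if not candidates:
            raise LookupError("no replica holds table %r" % table)
        return random.choice(candidates)   # naive load balancing

    def route_write(table):
        # Writes still all funnel into the single master (see below).
        return "master"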

But what about the writes? 

Those are still clogged in one big, fat, lazy master database. Tasks such as buffer management, locking, thread locks/semaphores, and recovery are the real bottleneck of OLTP; they make writes impossible to scale... As I wrote in many previous posts, for example here: http://database-scalability.blogspot.com/2012/08/scale-up-partitioning-scale-out.html.

Sharding is where every piece of data lives in only one place, within an array of DBs. Each database is the complete owner of its data: data is read from there, and data is written to there. This way, reads and writes are distributed and parallelized, and real scale-out can be achieved.
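
Again a minimal sketch, this time assuming a simple hash-on-key placement policy (real sharding layers may use range, hash, or directory-based policies); the point is that one function decides the single home of each piece of data, for reads and writes alike:

    import zlib

    SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]   # hypothetical DBs

    def shard_for(key):
        """Every key lives in exactly one shard; reads AND writes go there."""
        # crc32 is stable across processes, unlike Python's builtin hash().
        return SHARDS[zlib.crc32(str(key).encode()) % len(SHARDS)]

    print(shard_for("customer:42"))   # the one place this customer's data lives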

The idea behind sharding is great, the ultimate scaling solution (ask Facebook, Google, Twitter and all the other big guys), but it's a mess to handle and maintain. It's hard as hell if done by yourself. ScaleBase provides an automatic, transparent scale-out machine that does all of that, so you won't have to...



Wednesday, October 3, 2012

"(Cloud) is complete gibberish. It's insane. When is this idiocy going to stop?"


This is from Larry Ellison's keynote at Oracle OpenWorld in September 2008. Only 4 years ago.
"The computer industry is the only industry that is more fashion-driven than women's fashion. Maybe I'm an idiot, but I have no idea what anyone is talking about. What is it? It's complete gibberish. It's insane. When is this idiocy going to stop?"
"We'll make cloud computing announcements. I'm not going to fight this thing. But I don't understand what we would do differently in the light of cloud."
The above, along with additional marbles, is all here: http://news.cnet.com/8301-13953_3-10052188-80.html

Yesterday Larry stood on that same stage and announced Oracle 12c. The c stands for... Cloud!

And what makes Oracle 12c cloud ready?
12c is a "container database." Its function is to hold lots of other databases, keeping their data separate but allowing them to share underlying hardware resources like memory or file storage. This way, 12c suits software-as-a-service companies that need a way to let multiple customers access a single database. It's also geared toward large enterprises that may have hundreds of Oracle databases; it lets them consolidate those databases onto less hardware, saving money and making all of those databases easier to manage.
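
This is not Oracle's actual API, just a toy sketch (all names made up) of the multi-tenancy idea: one container sharing resources among tenants whose data is kept logically separate:

    # Hypothetical sketch: a container holding isolated tenant databases
    # that all share one pool of underlying resources (memory, storage).
    class ContainerDatabase:
        def __init__(self, shared_memory_gb):
            self.shared_memory_gb = shared_memory_gb   # shared by all tenants
            self.tenants = {}                          # name -> private data

        def create_tenant(self, name):
            self.tenants[name] = {}    # each tenant's data is kept separate...

        def put(self, tenant, key, value):
            self.tenants[tenant][key] = value   # ...inside one shared engine

        def get(self, tenant, key):
            return self.tenants[tenant].get(key)

    cdb = ContainerDatabase(shared_memory_gb=64)
    cdb.create_tenant("saas_customer_a")
    cdb.create_tenant("saas_customer_b")
    cdb.put("saas_customer_a", "plan", "gold")
    print(cdb.get("saas_customer_b", "plan"))   # None: tenants are isolated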

So, in short, two words: multi-tenancy.

Oracle is still shared-everything on big boxes, allowing virtualization through internal division and allocation of those shared resources to multiple smaller "virtual databases". Indeed, it's great for consolidation and also for multi-tenancy.

Cloud? IMHO, cloud is multi-tenancy, but also scale (out) and elasticity. Amazon calls its cloud service EC2; the E stands for Elastic. The new "DB made for the cloud" has no news about scale-out or about elasticity. Maybe we'll need to wait for Oracle 13e... :)

Also there's a new Exadata database machine, called X3... yet another bigger box to do all of the above. They say it's Oracle's answer to SAP HANA.

And finally, we have a new player in cloud services... Oracle! They'll have a public cloud offering (like Amazon, Rackspace, HP) and also a private cloud, which is a replica of Oracle's public cloud placed in the customer's own data center. Oracle would still own the hardware and be responsible for running it, securing it and updating it. "as a Service", in your premise. It's interesting and even rhymes...

So it seems someone regained his composure...


Friday, September 28, 2012

Being successful like Pinterest without its DB adventures...

I just came across this: "Scaling Pinterest and adventures in database sharding"  (http://gigaom.com/data/scaling-pinterest-and-adventures-in-database-sharding/)
"Pinterest has learned about scaling the way most popular sites do — the architecture works until one day it doesn’t"
Pinterest found out that "the architecture" is not scalable, so they turned to developing a scale-out mechanism, also called sharding.

I find it amazing that sharding, or in other words the idea of "scale out by splitting and parallelizing data across shared-nothing commodity hardware", is not supplied "out of the box" by "the architecture" (the database, the load balancer, any other IT stuff). I wonder who decided that an IT issue like scale-out should be outsourced from the database to the application developers...


Amazing!!

When was the last time you heard about a PHP or Ruby developer writing code to enable scale-out? NEVER! Scale-out in the application layer is enabled easily by a magical box called a load balancer, and you can get one from F5 or wherever for a low four-digit USD price. Commodity!

But to scale the database? To enjoy the obvious advantages of "scale out by splitting and parallelizing data across shared-nothing commodity hardware"? For this, the world still thinks developers need to stop investing effort in innovation, a better product, a competitive business. Instead, they need to harness their how-databases-really-work skills to write band-aid code to scale the DB.

Amazing... As you know, I took it personally, and I've been solving this paradox every day now, by bringing a complete, automatic, out-of-the-box "scale-out machine" that we like to call ScaleBase. I think the Pinterest story is great, with a great outcome, but that's not always the case with this complex matter. A generic, repeatable, IT-level solution for scale-out can make it much easier for all the other "Pinterests" out there to be as successful: to make the right choice and enjoy the great benefits, without the tremendous effort and labor of home-grown sharding.



Friday, August 31, 2012

Facebook makes big data look... big!

Oh I love these things: http://techcrunch.com/2012/08/22/how-big-is-facebooks-data-2-5-billion-pieces-of-content-and-500-terabytes-ingested-every-day/

Every day there are 2.5B content items shared and 2.7B "Like"s. I care less about the GiGo content itself, but the metadata, connections and relations are kept transactionally in a relational database. These 2 use-cases generate 5.2B transactions a day on the database, and since there are only 86,400 seconds in a day, we get over 60,000 write transactions per second on the database, from these 2 use-cases alone, not to mention all the other use-cases, such as new profiles, emails, queries...

And what's the size of the new data, on top of all the existing data that cannot be deleted so easily (remember why? Get a hint here: http://database-scalability.blogspot.com/2012/08/twitter-and-new-big-data-lifecycle.html)? A total of 500+TB is added every day. I'll exaggerate and assume 98% of it is pictures and other GiGo content, which leaves us with a fuzzy 10TB of new data daily. There were times when Oracle called a DB of over 1TB a VLDB, and here we have 10TB, every day.

So how does FB handle all this? They have a scaled-out grid of tens of thousands of MySQL servers.

The size alone is not the entire problem. Enough juice, memory, MPP, columnar storage will do the trick.
But if we put this throughput of hundreds of thousands of transactions per second on a single database, it'll rip the guts out of any database engine. Remember that the engine translates every write operation into at least 4 internal operations (table, index(es), undo, log) and also needs to do "buffer management, locking, thread locks/semaphores, and recovery tasks". Can't happen.

The only way to handle such data sizes and such load is scale-out: divide the big problem into 20,000 smaller problems, and this is what FB is doing with their grid of tens of thousands of MySQL servers.
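
Here's the back-of-the-envelope arithmetic from the last few paragraphs as a worked example; the 98% content ratio and the 4x write amplification are the assumptions stated above:

    shares        = 2.5e9    # content items shared per day
    likes         = 2.7e9    # "Like"s per day
    seconds_a_day = 86400

    writes_per_sec = (shares + likes) / seconds_a_day
    print(round(writes_per_sec))          # ~60185 write transactions/second

    # Each write is at least 4 internal operations (table, index, undo, log):
    print(round(writes_per_sec * 4))      # ~240741 internal operations/second

    # New relational data: assume (my guess) 98% of the daily 500TB is content:
    print(500 * 0.02)                     # ~10 TB of new database data per day

    # Divide the big problem into 20,000 smaller ones:
    print(round(writes_per_sec / 20000, 1))   # ~3 writes/second per shard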

You're probably thinking "naaa it's not my problem", "hey how many Facebooks are out there?". Take a look and try to put yourself, your organization, on the chart below:

Where are you today? Where will you be in 1 year? In 5 years? Things today go wild, and they go wild faster than ever. Big data is everywhere. New social apps aren't afraid of "what if no one shows up to my party?"; rather, they're afraid of "what if EVERYBODY shows up?"

You don't need to be Facebook to need a good solution for scaling out your database.

Tuesday, August 28, 2012

Scale Up, Partitioning, Scale Out

On 8/16 I conducted a webinar titled "Scale Up vs. Scale Out" (http://www.slideshare.net/ScaleBase/scalebase-webinar-816-scaleup-vs-scaleout):

The webinar was successful; we had many attendees and great participation in questions and answers, throughout the session and at the end. Only after the webinar did it occur to me that one specific graphic was missing from the webinar deck. It occurred to me after answering several audience questions about "the difference between partitioning and sharding" or "why partitioning doesn't qualify as scale-out".

If I were giving the webinar today, I would definitely include the following picture, describing the core difference between Scale Up, Partitioning, and Scale Out:

In the above (poor) graphics, I used the black server box as the database server machine, the good old cylinder as the disk or storage device, and the colorful square thingy as the database engine. Believe it or not, that is a real, complete architecture chart of the Oracle 10gR2 SGA, miniaturized to a small scale. Yes, all databases, including Oracle and also MySQL, are complex beasts; a lot of stuff goes on inside the database engine for every command.

If my DB is like the "starting point", then I'm either really small, or I'm in really bad shape by now.
Partitioning works wonders as data grows toward being "big data". It optimizes data placement on separate files or disks, and it makes every partition optimized, "thin", and less fragmented than you would expect from a gigantic, busy, monolithic table. Still, although the data is split across files, we're "stuck" with a busy monolithic database engine that relies on a single box's compute power.

While we distributed the data, we didn't distribute the "compute". 
When there is a heavy join operation, there is one busy monolithic database engine to collect data from all partitions and process this join. 
When there are 10000 concurrent transactions to handle right here and now, there is one busy monolithic database engine to do all database-engine activities such as buffer management, locking, thread locks/semaphores, and recovery tasks. Buffers, locking queues, transaction queues... are still the same for all partitions. 

This is where scale-out differs from partitioning. It enables distribution and parallelism of the data as well as the all-important compute: it brings the compute closer to the data and enables several database engines to process different sets of data, each handling a share of the overall session concurrency.
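
A minimal sketch of what "distributing the compute" means; the shard engines here are hypothetical stand-ins (plain Python functions, not real database connections), but the scatter-gather shape is the point:

    from concurrent.futures import ThreadPoolExecutor

    # Hypothetical stand-ins for four independent shard engines, each
    # owning a quarter of the rows.
    def make_shard_engine(rows):
        def local_aggregate():
            return sum(rows)   # runs inside the shard, close to its own data
        return local_aggregate

    shards = [make_shard_engine(range(i, 1000, 4)) for i in range(4)]

    # Scatter: every shard's engine computes on its own slice, in parallel.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda engine: engine(), shards))

    # Gather: only four small partial results are merged by the router.
    print(sum(partials))   # 499500, the same answer one big engine would give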

You can think of it as one step beyond partitioning, and it comes with great results. It's not a simple step, though: an abstraction layer is required to present the database grid as one database to the application, the same database it is used to.
In future posts I'll go into more detail on this "Scale-Out Abstraction Layer", and also on ScaleBase, which provides such a layer.

Monday, August 6, 2012

Twitter and the new big data lifecycle

Recently I came across this fine article in The New York Times: "Twitter Is Working on a Way to Retrieve Your Old Tweets". Dick Costolo, Twitter’s chief executive, said:
"It’s a different way of architecting search, going through all tweets of all time. You can’t just put three engineers on it."
Mr. Costolo is right, and he pointed the spotlight at a very important change we're experiencing today, in these interesting times. The word is expectations, and those are changing fast!

Not so long ago, Big Data was a synonym for Analytics, Data Warehouse, Business Intelligence. Traditionally, operational (OLTP) apps held limited amounts of data, only the "current" data relevant to ongoing operations. A cashier in a supermarket would hold only recent transactions, to enable lookup of a charge made 10 minutes ago if I need to return an item or dispute the charge while still at the cashier. When I come back to the store the day after, I won't go to the cashier; I'll go to "customer service", where, with a different application and a different database, I will get service for my returned items or disputes. A dispute after several months will not be handled by the customer service desk in the store, but by "the chain's dispute department", using a different, third app with a third, cumulative, aggregative DB. And on and on it goes.

In this simplified example, the organization invested many resources in 3 different DBs and apps, aggregating different levels of data and enabling similar, marginally additional functionality. Why? Data volume and concurrency.

At the cashiers, the only place where new data is really generated, there is also the highest concurrency. Globally, many thousands of items are "beeped" and sold through the cashiers every minute; the data is kept small, generated and extracted out shortly after. The customer service reps handle tens of customers a minute over larger data, and "the chain's dispute department" overlooks the biggest data but handles 1 or 2 cases an hour, and might also execute more "analytic-style" queries to determine the nature of a dispute...

This was, in a nutshell, the "lifecycle of the data" in the old world. But today, everything changes - it's all online, right here, right now!

Enormous amounts of (big) data are generated and also searched and analyzed at the same time. Everything is online, here and now. Every tweet (millions a day) is reported instantly to hundreds of followers, participates in saved searches, and is analyzed by numerous robots and engines throughout the web, and also by Twitter itself. The same goes for every search or e-mail I send in Google, and for every status or "like" in Facebook, which is reported to my hundreds of friends and also analyzed at the same time, here and now. Hey, it's their way to make money, to push the right ads at the right time.

And now we learn that users expect to see online data that is "old" in the terminology of the old days. I want to see statuses, likes and tweets from 2 and 4 months ago, in the same interface and with the same experience I'm used to; don't send me to the "customer service department"!

The bottom line: it requires scale. Scale your online database to handle online data volumes and throughput, as well as older data, on the same grid, without interference, with the same applications. This is what scale-out is all about. Think outside the (one database server) box. If you have 10 databases for current data, you can have 10 more with older data, and 100 more with even older data, and so on. Giving a transparent, unified view of (or virtualizing) this database grid is the solution that occupies most of my time, and it's the missing link in making database scale-out a commodity.
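
To close, a minimal sketch of such a unified view; the tier names, sizes and routing policy are all hypothetical. The application calls one route() function whether the data is from yesterday or from two years ago:

    import datetime as dt
    import zlib

    # Hypothetical grid: 10 "hot" DBs for current data, 10 "warm", 100 "cold".
    TIERS = [
        (dt.timedelta(days=30),  ["hot-%d" % i for i in range(10)]),
        (dt.timedelta(days=365), ["warm-%d" % i for i in range(10)]),
        (dt.timedelta.max,       ["cold-%d" % i for i in range(100)]),
    ]

    def route(user_id, item_date, today=None):
        """One interface, whether the tweet is from yesterday or 2 years ago."""
        age = (today or dt.date.today()) - item_date
        for max_age, shards in TIERS:
            if age <= max_age:
                return shards[zlib.crc32(str(user_id).encode()) % len(shards)]

    print(route(42, dt.date(2012, 8, 1), today=dt.date(2012, 8, 6)))  # hot DB
    print(route(42, dt.date(2010, 8, 1), today=dt.date(2012, 8, 6)))  # cold DB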