
Thursday, January 21, 2010

A new way to think of Data Storage for your Enterprise Application

A couple of posts earlier I had blogged about a real life case study of one of our projects where we are using a SQL store (Oracle) and a NoSQL store (MongoDB) in combination over a message based backbone. MongoDB was used to cater to a very specific subset of the application functionality, where we felt it made a better fit than a traditional RDBMS. This hybrid architecture of data organization is turning out to be an increasingly attractive option today with more and more specialized persistent storage structures being developed.

In many applications we need to process graph data structures, and Neo4J can be a viable option for this. You can keep your mainstream data storage in an RDBMS and use Neo4J only for the subset of functionality that needs graph data structures. If you need to sync back to your main storage, use messaging as the transport to talk back to your relational database.
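
To make the shape of this concrete, here is a minimal sketch, assuming entirely hypothetical interfaces (GraphStore, RelationalStore and MessageBus stand in for Neo4J, the RDBMS and the broker): the graph-facing service serves its use case locally and publishes a change event, and a separate consumer applies that event to the relational system of record.

```scala
// Hypothetical interfaces only: stand-ins for Neo4J, the RDBMS and the
// message broker. The point is the shape of the interaction, not the API.
case class RelationshipAdded(fromId: String, toId: String, relType: String)

trait GraphStore      { def addRelationship(e: RelationshipAdded): Unit }
trait RelationalStore { def syncRelationship(e: RelationshipAdded): Unit }
trait MessageBus {
  def publish(topic: String, event: AnyRef): Unit
  def subscribe(topic: String)(handler: AnyRef => Unit): Unit
}

// The graph-specific functionality writes to its own store and fires an event.
class RecommendationService(graph: GraphStore, bus: MessageBus) {
  def relate(from: String, to: String, relType: String) {
    val event = RelationshipAdded(from, to, relType)
    graph.addRelationship(event)         // serve the graph use case locally
    bus.publish("graph.updates", event)  // let the RDBMS catch up asynchronously
  }
}

// A separate consumer keeps the relational system of record in sync.
class RelationalSyncConsumer(rdbms: RelationalStore, bus: MessageBus) {
  bus.subscribe("graph.updates") {
    case e: RelationshipAdded => rdbms.syncRelationship(e)
    case _                    => () // ignore anything we do not understand
  }
}
```

The important point is that the graph-facing service never talks to the RDBMS directly; consistency between the two stores is eventual and mediated entirely by the messaging backbone.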

Using multiple data stores along with asynchronous messaging is an option that looks very potent today. Drizzle has its entire replication built on a RabbitMQ based transport. And using AMQP messaging, Drizzle replicates data to a host of key/value stores like Voldemort, MemcacheDB and Cassandra.

If we agree that messaging is going to be one of the most dominant paradigms in shaping application architectures, why not go one level up and look at some higher level abstractions for message based programming? Erlang programmers have been using the actor model for many years now and have demonstrated all the good qualities that the model embodies. Inspired by Erlang, Scala also offers a similar model on the JVM. In an earlier post I had discussed how we can use the actor model in Scala to scale out messaging applications with RabbitMQ as the transport.
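
As a reminder of what that model looks like, here is a minimal sketch using the (pre-Akka) scala.actors library; the AmqpPublisher trait below is a hypothetical stand-in for whichever AMQP client you wire in, not the RabbitMQ client API itself.

```scala
import scala.actors.Actor
import scala.actors.Actor._

// scala.actors is the actor library shipped with Scala at the time.
// AmqpPublisher is a hypothetical stand-in for a real AMQP client.
case class Trade(refNo: String, payload: String)

trait AmqpPublisher { def publish(routingKey: String, body: Array[Byte]): Unit }

// The actor serializes access to the publisher and decouples message
// production from whatever request path sends it the Trade.
class TradePublisher(amqp: AmqpPublisher) extends Actor {
  def act() {
    loop {
      react {
        case Trade(refNo, payload) =>
          amqp.publish("trade." + refNo, payload.getBytes("UTF-8"))
        case "stop" => exit()
      }
    }
  }
}

object PublisherExample {
  def main(args: Array[String]) {
    val amqp = new AmqpPublisher {   // console stub in place of a real AMQP client
      def publish(routingKey: String, body: Array[Byte]) {
        println("published " + body.length + " bytes to " + routingKey)
      }
    }
    val publisher = new TradePublisher(amqp)
    publisher.start()
    publisher ! Trade("T-1001", """{"instrument":"IBM","qty":100}""")
  }
}
```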

Now, with the developing ecosystem of polyglot storage, we can use the same model of actor based communication as the backbone for integrating the multiple data storage options that you may plug into your application. Have specific clients front the storage they need to work with, and use messaging to sync that up with the main data storage backend so that you have a consistent system of record. Have a look at the following diagram, which may not look that unreal today. You have a host of options that bring your data closer to the way you process it in your domain model, be it document oriented, graph based, key/value based or simple POJO based across a data grid like Terracotta.



When we have a bunch of architectural components loosely connected through a messaging infrastructure, we have a world of options for managing the interactions between them. In fact your options open up even more when you get to interact with data shaped the way you would like it to be. You can now think in terms of having a data model aligned with the model of your domain. You know that once your rule base gets updated in Neo4J, it will be synced up with the backend storage through some other service that makes the whole system eventually consistent.

In a future post I will explore some of the options that a higher order middleware service like Akka can add to your stack. With Akka providing abstractions like transactors, pluggable persistence and out of the box integration modules for AMQP, there are a number of ways you can think of modularizing your application's domain model and storage. You can use a peer to peer distributed actor based communication model that sets up synchronization with your databases, or you can use an AMQP based transport to do the same, much like what Drizzle does for replication. But that's some food for thought for yet another future post.

Friday, March 20, 2009

Now serving - Message Queuing in Web Applications

Looks like asynchronous messaging is becoming quite a common paradigm of distribution in Web applications and large Web sites. The general theory is to do only the absolutely essential user-facing stuff as part of the synchronous HTTP request/response cycle, and delegate the rest to an asynchronous queuing service to process offline. This leads to faster page loads, optimal resource utilization and a much better user experience. Gojko Adzic has a great post clarifying the misconception that messaging systems are only useful for big investment banks.
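
The shape of the pattern, as a minimal sketch with hypothetical interfaces (UploadHandler, JobQueue and PhotoStore are not any framework's API): persist what the user must see, queue the rest, respond.

```scala
// JobQueue and PhotoStore are hypothetical interfaces, not any particular
// framework's API; the point is only the split between synchronous and
// offline work.
case class NotifyContacts(photoId: String)
case class NotifyPartners(photoId: String)

trait JobQueue   { def enqueue(job: AnyRef): Unit }
trait PhotoStore { def save(userId: String, bytes: Array[Byte]): String }

class UploadHandler(photos: PhotoStore, queue: JobQueue) {
  // Invoked within the synchronous HTTP request/response cycle.
  def handleUpload(userId: String, bytes: Array[Byte]): String = {
    val photoId = photos.save(userId, bytes)   // the user must see this succeed
    queue.enqueue(NotifyContacts(photoId))     // everything else is queued and
    queue.enqueue(NotifyPartners(photoId))     // handled by offline workers
    "/photos/" + photoId                       // respond to the user right away
  }
}
```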

And here are some real life use cases ..

  • Flickr uses offline queuing systems to process notifications to contacts and third party partners when a user uploads a photo. Informing the user is of prime importance, and Flickr does that as part of the synchronous request/response processing, while the other notifications are processed by job queues running in parallel.

  • Digg uses Gearman to farm out similar jobs for parallel processing while serving only the absolutely essential stuff synchronously.

  • And as recently presented in QCon by Evan Weaver, Twitter is becoming more and more centered around messaging middleware. Every incoming tweet triggers the messaging system for processing notifications to all followers, and this is done asynchronously through Kestrel, an implementation of a message queue written in Scala.


All these use cases lead to the conclusion that message queuing is becoming an increasingly potent medium of distribution even in Web based applications. The key point is eventual consistency, which leads to a better scalability model. And, by the way, it looks to be about time that open source messaging platforms started getting more steam, so that every house does not have to reinvent the wheel and develop its own message queuing infrastructure.

Monday, March 02, 2009

Your data model can speak different languages too

Andrej Koelewijn writes ..

  • REST is about resources. Resource is just another word for object, or record.

  • REST is also about URLs. URLs that identify resources. Just like ids can identify objects or records.

  • And, often overlooked, REST is also about links. Resources use URLs to link to other resources. Just like foreign keys can link records to other records.


and finally concludes ..

REST is a distributed data model

There is no denying the fact that REST is based on the central concept of Resources, which are addressable entities that can be manipulated by all components of the communicating network using a standardized and uniform interface. Resources abstract the underlying data model from the user, and on top of them the application developer can design nice REST based APIs for the external world. REST abstracts the data model; it is NOT the data model - in fact the underlying data representations can change without any impact on the API.

In another related post, talking about the relational data model, he also mentions ..

This is why RDBMSes are so great: it doesn’t bind your data to a single application.

An RDBMS does bind your data to the specific application model. What it abstracts away from the application is the underlying physical representation of the data. In fact it is this specific binding, and the constraints that it imposes on the data model, that make it so difficult to work with semi-structured data. You design the schema upfront, define relationships and constraints along with the schema, and organize your data based on them. This is one of the reasons why we need separate data models for writes (normalized) and for queries and reports (denormalized).

In an earlier post, I had put down some of my thoughts on Data 2.0, as I see it in applications today. The fact is that data is no longer viewed as something to be abstracted into a uniform storage and dredged out using a single query language. Gone are the days when we used to think of the ORM as the grand unifier for all database platforms. Both the O and the R are fast losing ubiquity in today's application development context; at least they are not as universal as they used to be half a decade back.

Things are surely moving towards polyglotism in the data modeling world as well. Not all of your data needs to be in the relational database - distribute data to where it belongs. Sometime back Alex Miller had a post that suggested distributing and modeling data based on its lifetime. Gone are the days when you needed to persist your conversational data in the database to scale out your application. Modern day grid platforms like Terracotta and Gigaspaces offer network attached memory that will store this data for you in a form much closer to the application model, along with transparent clustering of your application.

Not all data needs a relational model. In a typical application, there is data that is document oriented and does not need a fixed schema attached to it. It can be stored as key/value pairs whose primary reason for existence is to support fast, really fast, inserts, updates and key based lookups. The semantics of the data inside the value is fairly opaque, and it does not need any relational constraints tying it to the rest of the data model. Why coerce such a simple model into one that forces you to pay the upfront tax of normalization, constraint enforcement, index rebuilding and joins? Think simple, think key/value pairs. And we have lots of them being used in the application space today. Rip such data out of your relational model into lightweight transactional stores that scale easily and dynamically. Long lived persistent data can, however, happily choose to stay around within the confines of your normalized relational model.
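
As a minimal sketch of that view of the data, assuming nothing more than a generic key/value interface (the KeyValueStore trait below is a placeholder, not any particular product's API):

```scala
// KeyValueStore is a placeholder trait for MemcacheDB, Voldemort or whichever
// store you plug in; the application treats the value as an opaque blob.
trait KeyValueStore {
  def put(key: String, value: Array[Byte]): Unit
  def get(key: String): Option[Array[Byte]]
  def delete(key: String): Unit
}

// No schema, no constraints, no joins: just a business identifier as the key
// and the serialized document as the value.
class DocumentStore(kv: KeyValueStore) {
  def save(docId: String, json: Array[Byte]) { kv.put("doc:" + docId, json) }
  def load(docId: String): Option[Array[Byte]] = kv.get("doc:" + docId)
  def remove(docId: String) { kv.delete("doc:" + docId) }
}
```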

One of the significant advantages that you get out of storing data in key/value stores is that your persistent data is now more closely mapped to the objects and classes of your application. This leads to less of the cognitive dissonance that the relational data model enforces upon the application.

But what about queries that can fetch relevant records from the key/value stores based on user defined criteria ?

Obviously primary key based fetch is not always that useful in practical applications. Many distributed key/value stores provide the capability to index based on custom defined filters and, in conjunction with full text search engines like Lucene, return collections of selected entries from the data store. Have a look at this article demonstrating how Sphinx, a full text search engine, can be integrated with MemcacheDB, a distributed key-value store which conforms to the memcached protocol and uses Berkeley DB as its storage back-end.
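
Purely as a conceptual sketch of what such indexing buys you (this is not the Sphinx or MemcacheDB API), the idea is to maintain a secondary index alongside the primary entries so that a filter lookup returns a collection of matching values:

```scala
import scala.collection.mutable

// A conceptual sketch only; this is neither the Sphinx nor the MemcacheDB API.
// Alongside the primary key/value entries, a secondary index maps a filter
// attribute (here, a tag) to the set of primary keys whose entries match it.
class IndexedStore {
  private val primary = mutable.Map.empty[String, String]              // key -> opaque value
  private val byTag   = mutable.Map.empty[String, mutable.Set[String]] // tag -> matching keys

  def put(key: String, value: String, tags: Set[String]) {
    primary(key) = value
    tags.foreach(t => byTag.getOrElseUpdate(t, mutable.Set.empty[String]) += key)
  }

  // A non-key lookup returns a collection of entries, not a single value.
  def findByTag(tag: String): List[String] =
    byTag.getOrElse(tag, mutable.Set.empty[String]).toList.flatMap(primary.get)
}
```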

CouchDB provides an interesting view model that offers the capability to aggregate and query documents stored in the database through the map/reduce paradigm. Users model the query as a map function and the subsequent aggregation as a reduce function, which together make the relevant data available to the user.
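
The real CouchDB views are JavaScript functions kept in a design document; purely as an illustration of the model, here is the same map/emit/reduce shape sketched in Scala:

```scala
// CouchDB views are actually JavaScript map/reduce functions stored in a
// design document; this Scala version is only an illustration of that shape.
case class Doc(docType: String, customer: String, amount: Double)

object ViewSketch {
  // map: emit a (key, value) pair for each document of interest
  def mapFn(doc: Doc): Seq[(String, Double)] =
    if (doc.docType == "order") Seq(doc.customer -> doc.amount) else Seq.empty

  // reduce: fold the values emitted under each key into an aggregate
  def reduceFn(values: Seq[Double]): Double = values.sum

  // querying the view groups the emitted pairs by key and reduces each group
  def view(docs: Seq[Doc]): Map[String, Double] =
    docs.flatMap(mapFn)
        .groupBy(_._1)
        .map { case (key, kvs) => key -> reduceFn(kvs.map(_._2)) }
}
```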

So, now that your data model is polyglot, there can be situations where you may need to synchronize data across multiple storage engines. Here is an example ..

In a real life trading application, huge volumes of trade messages need to be processed by the back office within a very short time interval. This needs scaling out to a throughput that is much easier to achieve with the light payloads that schemaless, amorphous key/value stores offer than with traditional relational databases. You just need to validate the message and store it against the trade reference number. Also, these peer based distributed key/value databases offer easy bi-directional replication and updates to shared data in disconnected mode, which can make offline message processing, followed by synchronization later on, quite affordable. Hence it makes every sense to offload this message processing from the database and deploy clusters of key/value stores.

However, there is a second level of processing that needs to be done, which updates all books and accounts and customer balances based on each individual trade. This information needs to be stored as a system of record for various queries, reports, audit trails and other downstream processing. One way this can be achieved is by pushing the second level of processing to scheduled queue jobs that asynchronously operate on the key/value store, do all the relevant heavy lifting on the data, and finally push the balances to the back end relational database. Currently we are doing everything as one synchronous ACID transaction against an RDBMS. Distribute the model based on data lifetime, rely on asynchronous processing, and what you get is eventual consistency with the goodness of better scalability.
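
Here is a minimal sketch of those two levels, with every interface a hypothetical placeholder rather than a real product API: the first level validates and stores the raw message against its trade reference number, and a scheduled job later computes and pushes the balances to the relational store.

```scala
// All traits here are hypothetical placeholders: KVStore stands in for the
// clustered key/value store and Rdbms for the back end relational database.
case class TradeMessage(refNo: String, raw: Array[Byte])

trait KVStore {
  def put(key: String, value: Array[Byte]): Unit
  def keys(prefix: String): Seq[String]
  def get(key: String): Option[Array[Byte]]
}
trait Rdbms { def updateBalances(refNo: String, raw: Array[Byte]): Unit }

// First level: validate and store the raw message against its reference number.
class FirstLevelProcessor(kv: KVStore) {
  def process(msg: TradeMessage) {
    require(msg.refNo.nonEmpty, "invalid trade message")   // validation only
    kv.put("trade:" + msg.refNo, msg.raw)                  // fast, schemaless write
  }
}

// Second level: a scheduled queue job drains the store asynchronously and
// pushes books, accounts and balances to the relational system of record.
class SecondLevelJob(kv: KVStore, db: Rdbms) {
  def run() {
    for {
      key <- kv.keys("trade:")
      raw <- kv.get(key)
    } db.updateBalances(key.stripPrefix("trade:"), raw)
  }
}
```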

A few days back I was listening to this Scaling DIGG episode with Joe Stump. He mentioned repeatedly that it's not the language; the bottleneck is the IO and the latency resulting from the IO. And the database is the single biggest bottleneck in your architecture. More specifically, it is NOT the database per se, but the synchronous communication between the application tier and the persistent data layer. This is the reason why DIGG is architected around Gearman, MemcacheDB and MogileFS. However, not all sites need the scalability of DIGG. But even for applications that need only a fraction of DIGG's scale, architecting them away from strictly synchronous, ACID transaction oriented data sources is the way to go.

Monday, October 27, 2008

Data 2.0 - Is your Application ready ?

Brian Aker talking about assumptions on Drizzle and future of database technologies ..

"Map/Reduce will kill every traditional data warehousing vendor in the market. Those who adapt to it as a design/deployment pattern will survive, the rest won't. Database systems that have no concept of being multiple node are pretty much dead. If there is no scale out story, then there is not future going forward."

Later on in the same post he goes on to mention the forces that he hopes will drive the performance of tomorrow's database technologies - map/reduce and asynchronous queues. Drizzle is a fork of MySQL, and it is revolutionary in many respects compared to all other forks of the same product or similar products of the same genre. Drizzle does away with the erstwhile dodos of database technology, viz. stored procedures, triggers etc., and is being planned exclusively for scaling out in the cloud. The development model is also ultra open source, aka organic open source, and is being driven by the community as a whole.

Drizzle is clearly a step towards N > 1, and quite a forceful step too. Erlang was the silent roadmaker towards true N > 1 with its entire ecosystem of OTP, mnesia and the supervisor hierarchies .. Erlang offers the platform and the correct set of primitives for delivering fault tolerance in applications, along with replicated mnesia storage. Erlang was not built for N = 1 .. it had N > 1 built into it right from day 0 ..

CouchDB is another instance that may hit the sweet spot of getting the combination right - asynchronous map/reduce along with replicated, distributed, loosely coupled document storage. It implements REST, BASE, MVCC .. what else .. all of which leads to eventual consistency of the system.

All the buzz about future database technologies has been somehow related to the cloud, or at least to planning with the scale out factor in mind. Erlang started it all, and the idea is now being actively driven in multiple dimensions by a host of complementary and competing technologies and platforms. No one talks about normalization or ACID or multi-phase commit these days. Call it Web 2.0 or Enterprise 2.0 or social networking, everything boils down to how enterprise IT can make data access easier, better, and more resilient to datacenter or network failures without compromising on the quality of service.

Middleware matters, data storage in the cloud matters, data processing in the cloud matters, and we are looking at lots of vendors fighting for a space therein. Applications of Enterprise 2.0 need to be smart and malleable enough to participate in this ecosystem. Make sure they are implemented as loosely coupled components that talk to middleware services asynchronously, can consume the Atom feeds that other enterprise applications generate, and do not depend on ACID properties or rely on synchronous 2-phase commit protocols from the data layer. Are we there yet ?

Friday, August 22, 2008

Facebook Scaling Out across Data Centers

Jason Sobel has an interesting post on scaling Facebook out to a new data center on the East Coast, in Virginia. It is a really interesting insight into some of the design decisions that have given us one of the most trafficked sites on the face of the planet today.

Here are two points that struck me on reading the post ..

  • Changing the SQL grammar in the replication stream to incorporate eviction of expired items from memcached looks like a hack. A more traditional implementation could have been to use triggers or MySQL UDFs to atomicize the entire transaction. But generic solutions always come ironclad with some performance overhead. It's no wonder that Facebook needs to do all these specializations, even if that amounts to no ceremony and all hack. (A conceptual sketch of the generic alternative appears after this list.)

  • I am just wondering why Facebook still writes to only one data center. With all the CAP theorem and eventual consistency stuff being solved by Amazon, why does Facebook still have this limitation ?
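
Purely as a conceptual sketch of that generic alternative (not Facebook's mechanism, which couples the eviction to the MySQL replication stream; the interfaces below are hypothetical), the idea is that every write to the system of record also evicts the affected memcached entries so that the next read repopulates them:

```scala
// A conceptual sketch of the generic alternative only; Facebook's actual
// solution ties the eviction to the MySQL replication stream. Cache and
// UserStore are hypothetical interfaces.
trait Cache     { def delete(key: String): Unit }
trait UserStore { def updateProfile(userId: Long, profile: String): Unit }

class ProfileService(db: UserStore, cache: Cache) {
  def updateProfile(userId: Long, profile: String) {
    db.updateProfile(userId, profile)   // the write goes to the system of record
    cache.delete("profile:" + userId)   // evict the cached copy so the next read
                                        // repopulates it from the database
  }
}
```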