Showing posts with label Design patterns. Show all posts
Showing posts with label Design patterns. Show all posts

Thursday, November 19, 2009

Cloud Computing Patterns

I have attended a presentation by Simon Guest from Microsoft on their cloud computing architecture. Although there was no new concept or idea introduced, Simon has provided an excellent summary on the major patterns of doing cloud computing.

I have to admit that I am not familiar with Azure and this is my first time hearing a Microsoft cloud computing presentation. I felt Microsoft has explained their Azure platform in a very comprehensible way. I am quite impressed.

Simon talked about 5 patterns of Cloud computing. Let me summarize it (and mix-in a lot of my own thoughts) ...

1. Use Cloud for Scaling
The key idea is to spin up and down machine resources according to workload so the user only pay for the actual usage. There is two types of access patterns: passive listener model and active worker model.

Passive listener model uses a synchronous communication pattern where the client pushes request to the server and synchronously wait for the processing result.
In the passive listener model, machine instances are typically sit behind a load balancer. To scale the resource according to the work load, we can use a monitor service that send NULL client request and use the measured response time to spin up and down the size of the machine resources.

On the other hand, Active worker model uses an asynchronous communication patterns where the client put the request to a queue, which will be periodically polled by the server. After queuing the request, the client will do some other work and come back later to pickup the result. The client can also provide a callback address where the server can push the result into after the processing is done.
In the active worker model, the monitor can measure the number of requests sitting in the queue and use that to determine whether machine instances (at the consuming end) need to be spin up or down.


2. Use Cloud for Multi-tenancy
Multi-tenancy is more a SaaS provider (rather than an enterprise) usage scenario. The key idea is to use the same set of code / software to host the application for different customers (tenants) who may have slightly different requirement in
  • UI branding
  • Business rules for decision criteria
  • Data schema
The approach is to provide sufficient "customization" capability for their customer. The most challenging part is to determine which aspects should be opened for customization and which shouldn't. After identifying these configurable parameters, it is straightforward to define configuration metadata to capture that.

3. Use Cloud for Batch processingThis is about running things like statistics computation, report generation, machine learning, analytics ... etc. These task is done in batch mode and so it is more economical to use the "pay as you go" model. On the other hand, batch processing has very high tolerance in latency and so is a perfect candidate of running in the cloud.
Here is an example of how to run Map/Reduce framework in the cloud. Microsoft hasn't provided a Map/Reduce solution at this moment but Simon mentioned that Dryad in Microsoft research may be a future Microsoft solution. Interestingly, Simon also recommended Hadoop.

Of course, one challenge is how to move the data from the cloud in the first place. In my earlier blog, I have describe some best practices on this.

4. Use Cloud for Storage
The idea of storing data into the cloud and no need to worry about DBA tasks. Most cloud vendor provide large scale key/value store as well as RDBMS services. Their data storage services will also take care of data partitioning, replication ... etc. Building cloud storage is a big topic involving many distributed computing concepts and techniques, I have covered it in a separate blog.

5. Use Cloud for Communication
A queue (or mailbox) service provide a mechanism for different machines to communicate in an asynchronous manner via message passing.

Azure also provide a relay service in the cloud which is quite useful for machines behind different firewall to communicate. In a typical firewall setup, incoming connection is not allowed so these machine cannot directly establish a socket to each other. In order for them to communicate, each need to open an on-going socket connection to the cloud relay, which will route traffic between these connections.

I have used the same technique in a previous P2P project where user's PC behind their firewall need to communicate, and I know this relay approach works very well.

Friday, April 11, 2008

REST design pattern

Based on the same architectural pattern of the web, "REST" has a growing dominance of the SOA (Service Oriented Architecture) implementation these days. In this article, we will discuss some basic design principles of REST.

SOAP : The Remote Procedure Call Model

Before the REST become a dominance, most of SOA architecture are built around WS* stack, which is fundamentally a RPC (Remote Procedure Call) model. Under this model, "Service" is structured as some "Procedure" exposed by the system.

For example, WSDL is used to define the procedure call syntax (such as the procedure name, the parameter and their structure). SOAP is used to define how to encode the procedure call into an XML string. And there are other WS* standards define higher level protocols such as how to pass security credentials around, how to do transactional procedure call, how to discover the service location ... etc.

Unfortunately, the WS* stack are getting so complicated that it takes a steep learning curve before it can be used. On the other hand, it is not achieving its original goal of inter-operability (probably deal to different interpretation of what the spec says).

In the last 2 years, WS* technology development has been slowed down and the momentum has been shifted to another model; REST.

REST: The Resource Oriented Model

REST (REpresentation State Transfer) is introduced by Roy Fielding when he captured the basic architectural pattern that make the web so successful. Observing how the web pages are organized and how they are linked to each other, REST is modeled around a large number of "Resources" which "link" among each other. As a significant difference with WS*, REST raises the importance of "Resources" as well as its "Linkage", on the other hand, it push down the importance of "Procedures".

Unlike the WS* model, "Service" in the REST is organized as large number of "Resources". Each resource will have a URI that make it globally identifiable. A resource is represented by some format of "Representation" which is typically extracted by an idempotent HTTP GET. The representation may embed other URI which refers to other resources. This emulates an HTML link between web pages and provide a powerful way for the client to discover other services by traversing its links. It also make building SOA search engine possible.

On the other hand, REST down play the "Procedure" aspect and define a small number of "action" based on existing HTTP Methods. As we discussed above, HTTP GET is used to get a representation of the resource. To modify a resource, REST use HTTP PUT with the new representation embedded inside the HTTP Body. To delete a resource, REST use HTTP DELETE. To get metadata of a resource, REST use HTTP HEAD. Notice that in all these cases, the HTTP Body doesn't carry any information about the "Procedure". This is quite different from WS* SOAP where the request is always made using HTTP POST.

At the first glance, it seems REST is quite limiting in terms of the number of procedures that it can supported. It turns out this is not the case, REST allows any "Procedure" (which has a side effect) to use HTTP POST. Effectively, REST categorize the operations by its nature and associate well-defined semantics with these categories (ie: GET for read-only, PUT for update, DELETE for remove, all above are idempotent) while provide an extension mechanism for application-specific operations (ie: POST for application procedures which may be non-idempotent).


URI Naming Convention

Since resource is usually mapped to some state in the system, analyzing its lifecycle is an important step when designing how a resource is created and how an URI should be structured.

Typically there are some eternal, singleton "Factory Resource" which create other resources. Factory resource typically represents the "type" of resources. Factory resource usually have a static, well-known URI, which is suffixed by a plural form of the resource type. Some examples are ...
http://xyz.com/books
http://xyz.com/users
http://xyz.com/orders

"Resource Instance", which are created by the "Factory Resource" usually represents an instance of that resource type. "Resource instances" typically have a limited life span. Their URI typically contains some unique identifier so that the corresponding instance of the resource can be located. Some examples are ...
http://xyz.com/books/4545
http://xyz.com/users/123
http://xyz.com/orders/2008/04/10/1001

If this object is a singleton object of that type, the id is not needed.
http://www.xyz.com/library

"Dependent Resource" are typically created and owned by an existing resource during part of its life cycle. Therefore "dependent resource" has an implicit life-cycle dependency on its owning parent. When a parent resource is deleted, all the dependent resource it owns will be deleted automatically. Dependent resource use an URI which has prefix of its parent resource URI. Some examples are ...
http://xyz.com/books/4545/tableofcontent
http://xyz.com/users/123/shopping_cart

Creating Resource

HTTP PUT is also used to create the object if the caller has complete control of assigning the object id, the request body contains the representation of the Object after successful creation.
PUT /library/books/668102 HTTP/1.1
Host: www.xyz.com
Content-Type: application/xml
Content-Length: nnn

<book>
<title>Restful design</title>
<author>Ricky</author>
</book>
HTTP/1.1 201 Created

If the caller has no control in the object id, HTTP POST is made to the object's parent container with the request body contains the representation of the Object. The response body should contain a reference to the URL of the created object.
POST /library/books HTTP/1.1
Host: www.xyz.com
Content-Type: application/xml
Content-Length: nnn

<book>
<title>Restful design</title>
<author>Ricky</author>
</book>
HTTP/1.1 301 Moved PermanentlyLocation: /library/books/668102

To create a resource instance of a particular resource type, make an HTTP POST to the Factory Resource URI. If the creation is successful, the response will contain a URI of the resource that has been created.

To create a book ...
POST /books HTTP/1.1
Host: xyz.com
Content-Type: application/xml; charset=utf-8
Content-Length: nnn

<book>
<title>...</title>
<author>Ricky Ho</author>
</book>
HTTP/1.1 201 Created
Content-Type: application/xml; charset=utf-8
Location: /books/4545

<ref>http://xyz.com/books/4545</ref>

To create a dependent resource, make an HTTP POST (or PUT) to its owning resource's URI

To upload the content of a book (using HTTP POST) ...
POST  /books/4545  HTTP/1.1
Host: example.org
Content-Type: application/pdf
Content-Length: nnnn

{pdf data}
HTTP/1.1 201 Created
Content-Type: application/pdf
Location: /books/4545/content

<ref>http://xyz.com/books/4545/tableofcontent</ref>

HTTP POST is typically used to create a resource when its URI is unknown to the client before its creation. However, if the URI is known to the client, then an idempotent HTTP PUT should be used with the URI of the resource to be created. For example, the

To upload the content of a book (using HTTP PUT) ...
PUT  /books/4545/tableofcontent  HTTP/1.1
Host: example.org
Content-Type: application/pdf
Content-Length: nnnn

{pdf data}
HTTP/1.1 200 OK

Finding Resources

Make an HTTP GET to the factory resource URI, criteria pass in as parameters.
(Note that it is up to the factory resource to interpret the query parameter).

To search for books with a certain author ...
GET /books?author=Ricky HTTP/1.1
Host: xyz.com
Content-Type: application/xml; charset=utf-8
HTTP/1.1 200 OK
Content-Type: application/xml; charset=utf-8
Content-Length: nnn

<books>
<book>
<ref>http://xyz.com/books/4545</ref>
<title>...</title>
<author>Ricky</author>
</book>
<book>
<ref>http://xyz.com/books/4546</ref>
<title>...</title>
<author>Ricky</author>
</book>
</books>

Another school of thoughts is to embed the criteria in the URI path, such as ...
http://xyz.com/books/author/Ricky

I personally prefers the query parameters mechanism because it doesn't imply any order of search criteria.


Lookup a particular resource

Make an HTTP GET to the resource object URI

Lookup a particular book...
GET /books/4545 HTTP/1.1
Host: xyz.com
Content-Type: application/xml; charset=utf-8
HTTP/1.1 200 OK
Content-Type: application/xml; charset=utf-8
Content-Length: nnn

<book>
<title>...</title>
<author>Ricky Ho</author>
</book>

In case the resource have multiple representation format. The client should specify within the HTTP header "Accept" of its request what format she is expecting.


Lookup a dependent resource

Make an HTTP GET to the dependent resource object URI

Download the table of content of a particular book...
GET /books/4545/tableofcontent HTTP/1.1
Host: xyz.com
Content-Type: application/pdf
HTTP/1.1 200 OK
Content-Type: application/pdf
Content-Length: nnn

{pdf data}

Modify a resource

Make an HTTP PUT to the resource object URI, pass in the new object representation in the HTTP body

Change the book title ...
PUT /books/4545 HTTP/1.1
Host: xyz.com
Content-Type: application/xml; charset=utf-8
Content-Length: nnn

<book>
<title>Changed title</title>
<author>Ricky Ho</author>
</book>
HTTP/1.1 200 OK

Delete a resource

Make an HTTP DELETE to the resource object URI

Delete a book ...
DELETE /books/4545 HTTP/1.1
Host: xyz.com
HTTP/1.1 200 OK

Resource Reference

In some cases, we do not want to create a new resource, but we want to add a "reference" to an existing resource. e.g. consider a book is added into a shopping cart, which is another resource.

Add a book into the shopping cart ...
POST  /users/123/shopping_cart  HTTP/1.1
Host: xyz.com
Content-Type: application/xml; charset=utf-8
Content-Length: nnn

<?xml version="1.0" ?>
<add>
<ref>http://xyz.com/books/4545</ref>
</add>
HTTP/1.1 200 OK

Show all items of the shopping cart ...
GET  /users/123/shopping_cart  HTTP/1.1
Host: xyz.com
Content-Type: application/xml; charset=utf-8
HTTP/1.1 200 OK
Content-Type: application/xml; charset=utf-8
Content-Length: nnn

<?xml version="1.0" ?>
<shopping_cart>
<ref>http://xyz.com/books/4545</ref>
...
<shopping_cart>
Note that the shopping cart resource contains "resource reference" which acts as links to other resources (which is the books). Such linkages create a resource web so that client can discovery and navigate across different resources.


Remove a book from the shopping cart ...
POST  /users/123/shopping_cart  HTTP/1.1
Host: xyz.com
Content-Type: application/xml; charset=utf-8
Content-Length: nnn

<?xml version="1.0" ?>
<remove>
<ref>http://xyz.com/books/4545</ref>
</remove>
HTTP/1.1 200 OK
Note that we are using HTTP POST rather than HTTP DELETE to remove a resource reference. This is because we are remove a link but not the actual resource itself. In this case, the book still exist after it is taken out from the shopping cart.

Note that what the book is deleted, that all the shopping cart that refers to that book need to be fixed in an application specific way. One way is to do lazy checking. In other words, wait until the shopping cart checking out to check the book existence and fix it at that point.

Checkout the shopping cart ...
POST  /orders  HTTP/1.1
Host: xyz.com
Content-Type: application/xml; charset=utf-8
Content-Length: nnn

<?xml version="1.0" ?>
<ref>http://xyz.com/users/123/shopping_cart</ref>
HTTP/1.1 201 Created
Content-Type: application/xml; charset=utf-8
Location: /orders/2008/04/10/1001

<?xml version="1.0" ?>
<ref>http://xyz.com/orders/2008/04/10/1001</ref>
Note that here the checkout is implemented by creating another resource "Order" which is used to keep track of the fulfillment of the purchase.

Asynchronous Request

In case when the operation takes a long time to complete, an asynchronous mode should be used. In a polling approach, a transient transaction resource is return immediately to the caller. The caller can then use GET request to poll for the result of the operation

We can also use a notification approach. In this case, the caller pass along a callback URI when making the request. The server will invoke the callback URI to POST the result when it is done.

The basic idea is to immediately create a "Transaction Resource" to return back to the client. While the actual processing happens asynchronously in the background, the client at any time, can poll the "Transaction Resource" for the latest processing status.

Lets look at an example to request for printing a book, which may take a long time to complete

Print a book

POST  /books/123  HTTP/1.1
Host: xyz.com
Content-Type: application/xml; charset=utf-8
Content-Length: nnn

?xml version="1.0" ?>
<print>http://xyz.com/printers/abc</print>
HTTP/1.1 200 OK
Content-Type: application/xml; charset=utf-8
Location: /transactions/1234

<?xml version="1.0" ?>
<ref>http://xyz.com/transactions/1234</ref>
Note that a response is created immediately which contains the URI of a transaction resource, even before the print job is started. Client can poll the transaction resource to obtain the latest status of the print job.

Check the status of the print Job ...
GET /transactions/1234 HTTP/1.1
Host: xyz.com
Content-Type: application/xml; charset=utf-8
HTTP/1.1 200 OK
Content-Type: application/xml; charset=utf-8
Content-Length: nnn

<transaction>
<type>PrintJob</type>
<status>In Progress</status>
</transaction>
It is also possible to cancel the transaction if it is not already completed.

Cancel the print job

POST  /transactions/1234  HTTP/1.1
Host: xyz.com
Content-Type: application/xml; charset=utf-8
Content-Length: nnn

?xml version="1.0" ?>
<cancel/>
HTTP/1.1 200 OK


Conclusion
The Resource Oriented Model that REST advocates provides a more natural fit for our service web. Therefore, I suggest that SOA implementation should take the REST model as a default approach.

Thursday, March 6, 2008

Web Site Scalability

A classical large scale web site typically have multiple data centers in geographically distributed locations. Each data center will typically have the following tiers in its architecture
  • Web tier : Serving static contents (static pages, photos, videos)
  • App tier : Serving dynamic contents and execute the application logic (dynamic pages, order processing, transaction processing)
  • Data tier: Storing persistent states (Databases, Filesystems)

















Content Delivery

Dynamic Content
  • Most of the content display is dynamic content. Some application logic will be executed at the web server which generate an HTML for the client browser. The efficiency of application logic will have a huge impact on the overall site's scalability. This is our main topic here.
  • Sometimes it is possible to pre-generate dynamic content and store it as static content. When the real request comes in, instead of re-running the application logic to generate the page, we just need to lookup the pre-generated page, which can be much faster
Static Content
  • Static content are typically the images, videos embedded inside the dynamic pages.
  • A typical HTML pages typically contains many static contents where the browser will make additional HTTP network round trips to fetch. So fetching static content efficiency also has a big impact to the overall response of dynamic page
  • Content Delivery Network is an effective solution for delivering static contents. CDN provider will cache the static content in their network and will return the cached copy for subsequent HTTP fetch request. This reduce the overall hits to your web site as well as improving the user's response time (because their cache is in closer proximity to the user)
Request dispatching and Load balancing

There are 2 layers of dispatching for a Client who is making an HTTP request to reach the application server

DNS Resolution based on user proximity
  • Depends on the location of the client (derived from the IP address), the DNS server can return an ordered list of sites according to the proximity measurement. Therefore client request will be routed to the data center closest to him/her
  • After that, the client browser will cache the server IP
Load balancer
  • Load balancer (hardware-based or software-based) will be sitting in front of a pool of homogeneous servers which provide same application services. The load balancer's job is to decide which member of the pool should handle the request
  • The decision can be based on various strategy, simple one include round robin or random, more sophisticated one involves tracking the workload of each member (e.g. by measuring their response time) and dispatch request to the least busy one
  • Members of the pool can also monitor its own workload and mark itself down (by not responding to the ping request of the load balancer)

Client communication

This is concerned about designing an effective mechanism to communicate with the client, which is typically the browser making some HTTP call (maybe AJAX as well)

Designing the granularity of service call
  • Reduce the number of round trips by using a coarse grain API model so your client is making one call rather than many small calls
  • Don't send back more data than your client need
  • Consider using an incremental processing model. Just send back sufficient result for the first page. Use a cursor model to compute more result for subsequent pages in case the client needs it. But it is good to calculate an estimation of the total matched result to return to the client.
Designing message format
  • If you have control on the client side (e.g. I provide the JavaScript library which is making the request), then you can choose a more compact encoding scheme and not worry about compatibility.
  • If not, you have to use a standard encoding mechanism such as XML. You also need to publish the XML schema of the message (the contract is the message format)
Consider data compression
  • If the message size is big, then we can apply compression technique (e.g. gzip) to the message before sending it.
  • You are trading off CPU for bandwidth savings, better to measure whether this is a gain first
Asynchronous communication
  • AJAX fits very well here. User can proceed to do other things while the server is working on the request
  • Consider not sending the result at all. Rather than sending the final order status to the client who is sending an order placement request, consider sending an email acknowledgment.
Session state handling
Typical web transaction involves multiple steps. Session state need to be maintained across multiple interactions

Memory-based session state with Load balancer affinity
  • One way is to store the state in the App Server's local memory. But we need to make sure subsequent request land on the same App Server instance otherwise it cannot access the previous stored session state
  • Load balancer affinity need to be turned on. Typically request with the same cookie will be routed to the same app server
Memory replication session state across App servers
  • Another way to have the App server sharing a global session state by replicating its changes to each other
  • Double check the latency of replication so we can make sure there is enough time for the replication to complete before subsequent request is made
Persist session state to a DB
  • Store the session state into a DB which can be accessed by any App Server inside the pool
On-demand session state migration
  • Under this model, the cookie will be used to store the IP address of the last app server who process the client request
  • When the next request comes in, the dispatcher is free to forward to any members of the pool. The app server which receive this request will examine the IP address of the last server and pull over the session state from there.
Embed session state inside cookies
  • If the session state is small, you don't need to store at the server side at all. You can just embed all information inside a cookie and send back to the client.
  • You need to digitally sign the cookie so that modification cannot happen

Caching
Remember the previous result can reuse them for future request can drastically reduce the workload of the system. But don't cache request which modifies the backend state

Sunday, March 2, 2008

Database Scalability

Database is typically the last piece of the puzzle of the scalability problem. There are some common techniques to scale the DB tire

Indexing

Make sure appropriate indexes is built for fast access. Analyze the frequently-used queries and examine the query plan when it is executed (e.g. use "explain" for MySQL). Check whether appropriate index exist and being used.

Data De-normalization

Table join is an expensive operation and should be reduced as much as possible. One technique is to de-normalize the data such that certain information is repeated in different tables.

DB Replication


For typical web application where the read/write ratio is high, it will be useful to maintain multiple read-only replicas so that read access workload can be spread across. For example, in a 1 master/N slaves case, all update goes to master DB which send a change log to the replicas. However, there will be a time lag for replication.


Table Partitioning

You can partition vertically or horizontally.

Vertical partitioning is about putting different DB tables into different machines or moving some columns (rarely access attributes) to a different table. Of course, for query performance reason, tables that are joined together inside a query need to reside in the same DB.

Horizontally partitioning is about moving different rows within a table into a separated DB. For example, we can partition the rows according to user id. Locality of reference is very important, we should put the rows (from different tables) of the same user together in the same machine if these information will be access together.



Transaction Processing

Avoid mixing OLAP (query intensive) and OLTP (update intensive) operations within the same DB. In the OLTP system, avoid using long running database transaction and choose the isolation level appropriately. A typical technique is to use optimistic business transaction. Under this scheme, a long running business transaction is executed outside a database transaction. Data containing a version stamp is read outside the database trsnaction. When the user commits the business transaction, a database transaction is started at that time, the lastest version stamp of the corresponding records is re-read from the DB to make sure it is the same as the previous read (which means the data is not modified since the last read). Is so, the changes is pushed to the DB and transaction is commited (with the version stamp advanced). In case the version stamp is mismatched, the DB transaction as well as the business transaction is aborted.

Object / Relational Mapping
Although O/R mapping layer is useful to simplify persistent logic, it is usually not friendly to scalability. Consider the performance overhead carefully when deciding to use O/R mapping.

There are many tuning parameters in O/R mapping. Consider these ...
  • When an object is dereferenced, how deep the object will be retrieved
  • If a collection is dereferenced, does the O/R mapper retrieve all the object contained in the collection ?
  • When an object is expanded, choose carefully between multiple "single-join" queries and single "multiple join" query