Showing posts with label couchdb. Show all posts
Showing posts with label couchdb. Show all posts

Sunday, February 14, 2010

Why I don't like ActiveRecord for Domain Model Persistence

When it comes to a rich domain modeling, I am not a big fan of the ActiveRecord model. The biggest problem that it entails is invasiveness - the persistence model invades into my domain model. And this was also my first reaction to the Lift-CouchDB integration module which was released recently.

The moment I say ..

class Person extends CouchRecord[Person] {
  //..
}


my domain model becomes tied to the persistence concerns.

When I started scouchdb, my very first thought was to make it non-invasive. The Scala objects must be pure and must remain pure and completely oblivious of the underlying persistence model. In the age of polyglot persistence there is every possibility that you may need to persist a domain model across multiple storage engines. I may be using a JPA backed relational store as the enterprise online database, which gets synchronized with some offline processing that comes from a CouchDB backend. I need the domain model persistence in both my Oracle and my CouchDB engines. The ActiveRecord pattern is difficult to scale to such requirements.

Here's what I implemented in scouchdb ..

// Scala abstraction : pure
case class ItemPrice(store: String, item: String, price: Number)

// specification of the db server running
val couch = Couch("127.0.0.1")
val item_db = Db("item_db")

// create the database
couch(item_db create)

// create the Scala object : a pure domain object
val s = ItemPrice("Best Buy", "mac book pro", 3000)

// create a document for the database with an id
val doc = Doc(item_db, "best_buy")

// add
couch(doc add s)

// query by id to get the id and revision of the document
val id_rev = couch(item_db by_id "best_buy")

// query by id to get back the object
// returns a tuple3 of (id, rev, object)
val sh = couch(item_db by_id("best_buy", classOf[ItemPrice]))

// got back the original object
sh._3.item should equal(s.item)
sh._3.price should equal(s.price)


It's a full cycle session of interaction with the CouchDB persistence engine without any intrusion into the domain abstraction. I am free to use ItemPrice domain abstraction for a relational storage as well.

CouchDB offers a model of persistence where the objects that we store should be close to the granularity of domain abstractions. I should be able to store the entire Aggregate Root of my model components directly as JSON. ActiveRecord model offers a lower level of abstraction and makes you think more in terms of persistence of the individual entities. The thought process is so relational that you ultimately end up with a relational model both in terms of persistence and domain. With CouchDB you need to think in terms of documents and views and NOT in terms of relations and tables. I blogged on this same subject some time back.

The philosophy that I adopted in scouchdb was to decouple the domain entities from the persistence layer. You hand over a pure Scala object to the driver, it will extract a JSON model from it and write it to CouchDB. I use sjson for this serialization. sjson works totally based on reflection and can transparently serialize and deserialize Scala objects that you hand over to it. From this point of view, the three aspects of managing domain abstractions, JSON serialization and persistence into CouchDB are totally orthogonal. I think this is difficult to get with an ActiveRecord based model.

Sunday, June 21, 2009

scouchdb Views now interoperable with Scala Objects

In one of the mail exchanges that I had with Dick Wall before the scouchdb demonstration at JavaOne ScriptBowl, Dick asked me the following ..

"Can I return an actual car object instead of a string description? It would be killer if I can actually show some real car sale item objects coming back from the database instead of the string description."

Yes, Dick, you can, now. scouchdb now offers APIs for returning Scala objects directly from couchdb views. Here's an example with Dick's CarSaleItem object model ..

// CarSaleItem class
@BeanInfo
case class CarSaleItem(make : String, model : String, 
  price : BigDecimal, condition : String, color : String) {

  def this(make : String, model : String, 
    price : Int, condition : String, color : String) =
    this(make, model, BigDecimal.int2bigDecimal(price), condition, color)

  private [db] def this() = this(null, null, 0, null, null)

  override def toString = "A " + condition + " " + color + " " + 
    make + " " + model + " for $" + price
}


The following map function returns the car make as the key and the car price as the value ..

// map function
val redCarsPrice =
  """(doc: dispatch.json.JsValue) => {
        val (id, rev, car) = couch.json.JsBean.toBean(doc, 
          classOf[couch.db.CarSaleItem]);
        if (car.color.contains("Red")) List(List(car.make, car.price)) else Nil
  }"""


This is exciting. The following map function returns the car make as the key and the car object as the value ..

// map function
val redCars =
  """(doc: dispatch.json.JsValue) => {
        val (id, rev, car) = couch.json.JsBean.toBean(doc, 
          classOf[couch.db.CarSaleItem]);
        if (car.color.contains("Red")) List(List(car.make, car)) else Nil
  }"""


And now some regular view setup code that registers the views in the CouchDB design document.

// view definitions
val redCarsView = new View(redCars, null)
val redCarsPriceView = new View(redCarsPrice, null)

// handling design document stuff
val cv = DesignDocument("car_views", null, Map[String, View]())
cv.language = "scala"

val rcv = 
  DesignDocument(cv._id, null, 
    Map("red_cars" -> redCarsView, "red_cars_price" -> redCarsPriceView))
rcv.language = "scala"
couch(Doc(carDb, rcv._id) add rcv)


The following query returns JSON corresponding to the car objects being returned from the view ..

val ls1 = couch(carDb view(
  Views builder("car_views/red_cars") build))


On the client side, we can do a simple map over the collection that converts the returned collection into a collection of the specific class objects .. Here we have a collection of CarSaleItem objects ..

import dispatch.json.Js._;
val objs =
  ls1.map { car =>
    val x = Symbol("value") ? obj
    val x(x_) = car
    JsBean.toBean(x_, classOf[CarSaleItem])._3
  }
objs.size should equal(3)
objs.map(_.make).sort((e1, e2) => (e1 compareTo e2) < 0) 
  should equal(List("BMW", "Geo", "Honda"))


But it gets better than this .. we can now have direct Scala objects being fetched from the view query directly through scouchdb API ..

// ls1 is now a list of CarSaleItem objects
val ls1 = couch(carDb view(
  Views builder("car_views/red_cars") build, classOf[CarSaleItem]))
ls1.map(_.make).sort((e1, e2) => (e1 compareTo e2) < 0) 
  should equal(List("BMW", "Geo", "Honda"))


Note the class being passed as an additional parameter in the view API. Similar stuff is also being supported for views having reduce functions. This makes scouchdb more seamless for interoperability between JSON storage layer and object based application layer.

Have a look at the project home page and the associated test case for details ..

Sunday, June 07, 2009

scouchdb Scala View Server gets "reduce"

scouchdb View Server gets reduce. After a fairly long hiatus, I finally got some time to do some hacking on scouchdb over the weekend. And this is what came out of a brief stint on Saturday evening ..

map was already supported in version 0.3. You could define map functions in Scala as ..

val mapfn = """(doc: dispatch.json.JsValue) => {
  val it = couch.json.JsBean.toBean(doc, classOf[couch.json.TestBeans.Item_1])._3;
  for (st <- it.prices)
    yield(List(it.item, st._2))
}"""


Now you can do reduce too ..

val redfn = """(key: List[(String, String)], values: List[dispatch.json.JsNumber], rereduce: Boolean) => {
  values.foldLeft(BigDecimal(0.00))
    ((s, f) => s + (match { case dispatch.json.JsNumber(n) => n }))
}"""


attach the map and reduce functions to a view ..

val view = new View(mapfn, redfn)


and finally fetch using the view query ..

val ls1 =
  couch(test view(
    Views.builder("big/big_lunch")
         .build))
ls1.size should equal(1)


reduce, by default returns only one row through a computation on the result set returned by map. The above query does not use grouping and returns 1 row as the result. You can also use view results grouping and return rows grouped by keys ..

val ls1 =
  couch(test view(
    Views.builder("big/big_lunch")
         .options(optionBuilder group(true) build) // with grouping
         .build))
ls1.size should equal(3)


For a more detailed discussion and examples have a look at the project home page documentation or browse through the test script ScalaViewServerSpec.

The current trunk is 0.3.1. The previous version has been tagged as 0.3 and available in tags folder.

Next up ..

  • JPA like collections of objects directly from scouchdb views

  • more capable reduce options (rereduce, collations etc.)

  • replication

  • advanced exception management with new dbDispatch


.. and lots of other features ..

Stay tuned!

Wednesday, June 03, 2009

scouchdb @ JavaOne

JavaOne script bowl was organized as a panel session to show off different scripting languages on the JVM. Tha languages considered were Jython, Groovy, Scala, JRuby and Clojure. As part of the Scala show, Dick Wall demonstrated scouchdb, the Scala driver for CouchDB. Cool .. and thanks Dick for choosing scouchdb ..

Alex Miller has more details here ..

Sunday, May 17, 2009

scouchdb gets View Server in Scala

CouchDB views are the real wings of the datastore that goes into every document and pulls out data exactly what you have asked for through your queries. The queries are different from the ones you do in an RDBMS using SQL - here you have all the state-of-the-art map/reduce being exercised through each of the cores that your server may have. One very good part of views in CouchDB is that the view server is a separate abstraction from the data store. Computation of views is delegated to an external server process that communicates with the main process over standard input/output using a simple line-based protocol. You can find more details about this protocol in the couchdb wiki.

The default implementation of the query server in CouchDB uses Javascript running via Mozilla SpiderMonkey. However, language aficionados always find a way to push their own favorite into any accessible option. People have developed query servers for Ruby, Php, Python and Common Lisp.

scouchdb gives one for Scala. You can now write map and reduce scripts for CouchDB views in Scala .. the reduce part is not yet ready. But the map functions actually do work in the repository. Here is a usual session using ScalaTest ..


// create some records in the store
couch(test doc Js("""{"item":"banana","prices":{"Fresh Mart":1.99,"Price Max":0.79,"Banana Montana":4.22}}"""))
couch(test doc Js("""{"item":"apple","prices":{"Fresh Mart":1.59,"Price Max":5.99,"Apples Express":0.79}}"""))
couch(test doc Js("""{"item":"orange","prices":{"Fresh Mart":1.99,"Price Max":3.19,"Citrus Circus":1.09}}"""))

// create a design document
val d = DesignDocument("power", null, Map[String, View]())
d.language = "scala"

// a sample map function in Scala
val mapfn1 = 
  """(doc: dispatch.json.JsValue) => {
    val it = couch.json.JsBean.toBean(doc, classOf[couch.json.TestBeans.Item_1])._3; 
    for (st <- it.prices)
      yield(List(it.item, st._2))
  }"""
    
// another map function
val mapfn2 = """(doc: dispatch.json.JsValue) => {
    import dispatch.json.Js._; 
    val x = Symbol("item") ? dispatch.json.Js.str;
    val x(x_) = doc; 
    val i = Symbol("_id") ? dispatch.json.Js.str;
    val i(i_) = doc;
    List(List(i_, x_)) ;
  }"""




Now the way the protocol works is that when the view functions are stored in the view server, CouchDB starts sending the documents one by one and every function gets invoked on every document. So once we create a design document and attach the view with the above map functions, the view server starts processing the documents based on the line based protocol with the main server. And if we invoke the views using scouchdb API as ..

couch(test view(
  Views builder("power/power_lunch") build))


and

couch(test view(
  Views builder("power/mega_lunch") build))


we get back the results based on the queries defined in the map functions. Have a look at the project home page for a complete description of the sample session that works with Scala view functions.

Setting up the View Server

The view server is an external program which will communicate with the CouchDB server. In order to set our scouchdb query server, here are the steps :

The common place to do custom settings for couchdb is local.ini. This can usually be found under /usr/local/etc/couchdb folder. There has been some changes in the configuration files since CouchDB 0.9 - check out the wiki for them. In my system, I set the view server path as follows in local.ini ..

[query_servers]
scala=$SCALA_HOME/bin/scala -classpath couch.db.VS "/tmp/vs.txt"

  • scala is the language of query server that needs to be registered with CouchDB. Once you start futon after registering scala as the language, you should be able to see "scala" registered as a view query language for writing map functions.

  • The classpath points to the jar where you deploy scouchdb.

  • couch.db.VS is the main program that interacts with the CouchDB server. Currently it takes as argument one file name where it sends all statements that it exchanges with the CouchDB server. If it is not supplied, all interactions are routed to the stderr.

  • another change that I needed to make was setting of the os_process_timeout value. The default is set to 5000 (5 seconds). I made the following changes in local.ini ..


[couchdb]
os_process_timeout=20000

Another thing that needs to be setup is an environment variable named CDB_VIEW_CLASSPATH. This should point to the classpath which needs to be passed to the Scala interpreter for executing the map/reduce functions.

You've been warned!

All the above stuff is very much development in progress and has been tested only to the limits of some unit test suites also recorded in the codebase. Use at your own risk, and please, please send feedbacks, patches, bug reports etc. in the project tracker.

Happy hacking!

P.S. Over the weekend I got a patch from Martin Kleppmann that adds the ability to store the type name of an object in the JSON blob when it is serialized (either as fully-qualified class name or as base name without the package component), and to automatically create a bean of the right type when that JSON blob is loaded from the database (without advance knowledge of what that type is going to be). Thanks Martin - I will have a look and integrate it in the trunk.

I have undertaken this as a side project and only get to work on it over the weekends. It is great to have contributory patches from the community that only goes on to enrich the framework. I need to work on the reduce part of the query server and then will launch into a major refactoring to incorporate 0.3 release of Nathan's dbDispatch. Nathan has made some fruitful changes on exceptions and response-code handling. I am itching to incorporate the goodness in scouchdb.

Monday, May 11, 2009

CouchDB and Scala - Updates on scouchdb

A couple of posts back, I introduced scouchdb, the Scala driver for CouchDB persistence. The primary goal of the framework is to offer non-intrusiveness in persistence, in the sense that the Scala objects can be absolutely oblivious to the underlying CouchDB existence. The last post discussed how Scala objects can be added, updated or deleted from CouchDB with the underlying JSON representation carefully veneered away from client APIs. Here is an example of the fetch API in scouchdb ..

val sh = couch(test by_id(s_id, classOf[Shop]))

The document is fetched as an instance of the Scala class Shop, which can then be manipulated using usual Scala machinery. The return type is a Tuple3, where the first two components are the id and revision that may be useful for doing future updates of the document, while sh._3 is the object retrieved from the data store. Returning tuples from a method is a typical Scala idiom that can give rise to some nice pattern matching code capsules ..

couch(test by_id(s_id, classOf[Shop])) match {
  case (id, rev, obj) =>
    //..
  //..
}


The last post also discussed the View APIs and the little builder syntax for View queries.

Over the weekend, scouchdb got some more features, hence a brief post introducing the new additions ..

Temporary Views

No frills, just shares the similar builder interface as ordinary views, with the addition of specifying the map and reduce functions. Here is the necessary spec for querying temporary views ..


describe("fetch from temporary views") {
  it("should fetch 3 rows with group option and 1 row without group option") {
    val mf = 
      """function(doc) {
           var store, price;
           if (doc.item && doc.prices) {
             for (store in doc.prices) {
               price = doc.prices[store];
               emit(doc.item, price);
             }
           }
         }"""
      
    val rf = 
      """function(key, values, rereduce) {
           return(sum(values))
         }"""
      
    // with grouping
    val aq = 
      Views.adhocBuilder(View(mf, rf))
           .options(optionBuilder group(true) build)
           .build
    val s = couch(
      test adhocView(aq))
    s.size should equal(3)
      
    // without grouping
    val aq_1 = 
      Views.adhocBuilder(View(mf, rf))
           .build
    val s_1 = couch(
      test adhocView(aq_1))
    s_1.size should equal(1)
  }
}




Attachment Handling

With each document, CouchDB allows attachments, much like emails. Along with creating a document, I can have a separate attachment associated with the document. However, when the document is retrieved, the attachment, by default is not fetched. It has to be fetched using a special URI. All these are now encapsulated in Scala APIs in scouchdb. Have a look at the following spec ..


describe("create a document and make an attachment") {
  val att = "The quick brown fox jumps over the lazy dog."
    
  val s = Shop("Sears", "refrigerator", 12500)
  val d = Doc(test, "sears")
  var ir:(String, String) = null
  var ii:(String, String) = null
    
  it("document creation should be successful") {
    couch(d add s)
    ir = couch(>%(Id._id, Id._rev))
    ir._1 should equal("sears")
  }
  it("query by id should fetch a row") {
    ii = couch(test by_id ir._1)
    ii._1 should equal("sears")
  }
  it("sticking an attachment should be successful") {
    couch(d attach("foo", "text/plain", att.getBytes, Some(ii._2)))
  }
  it("retrieving the attachment should equal to att") {
    val air = couch(>%(Id._id, Id._rev))
    air._1 should equal("sears")
    couch(d.getAttachment("foo") as_str) should equal(att)
  }
}




CouchDB also allows adding attachments to yet non-existing documents. Adding the attachment will create the document as well. scouchdb supports that as well. Have a look at the bdd specs in the test folder for details of the usage.

Bulk Documents

CouchDB has separate REST interfaces for handling editing of multiple documents at the same time. I can have multiple documents, some of which need to be added as new, some to be updated with specific revision information and some to be deleted from the existing database. And all these can be done using a single POST. scouchdb uses a small DSL for handling such requests. Here is how ..


describe("bulk updates of documents") {
  it("should create 3 documents with 1 post") {
    val cnt = couch(test all_docs).filter(_.startsWith("_design") == false).size 
      
    val s1 = Shop("cc", "refrigerator", 12500)
    val s2 = Shop("best buy", "macpro", 1500)
    val a1 = Address("Survey Park", "Kolkata", "700075")
    val a2 = Address("Salt Lake", "Kolkata", "700091")
      
    couch(test docs(List(s1, s2, a1, a2), false)).size should equal(4)
    couch(test all_docs).filter(_.startsWith("_design") == false).size should equal(cnt + 4)
  }
  it("should insert 2 new documents, update 1 existing document and delete 1 - all in 1 post") {
    val sz = couch(test all_docs).filter(_.startsWith("_design") == false).size
    val s = Shop("Shoppers Stop", "refrigerator", 12500)
    val d = Doc(test, "ss")
      
    val t = Address("Monroe Street", "Denver, CO", "987651")
    val ad = Doc(test, "add1")
      
    var ir:(String, String) = null
    var ir1:(String, String) = null
    
    couch(d add s)
    ir = couch(>%(Id._id, Id._rev))
    ir._1 should equal("ss")
      
    couch(ad add t)
    ir1 = couch(ad >%(Id._id, Id._rev))
    ir1._1 should equal("add1")
      
    val s1 = Shop("cc", "refrigerator", 12500)
    val s2 = Shop("best buy", "macpro", 1500)
    val a1 = Address("Survey Park", "Kolkata", "700075")
      
    val d1 = bulkBuilder(Some(s1)).id("a").build 
    val d2 = bulkBuilder(Some(s2)).id("b").build
    val d3 = bulkBuilder(Some(s)).id("ss").rev(ir._2).build
    val d4 = bulkBuilder(None).id("add1").rev(ir1._2).deleted(true).build

    couch(test bulkDocs(List(d1, d2, d3, d4), false)).size should equal(4)
    couch(test all_docs).filter(_.startsWith("_design") == false).size should equal(sz + 3)
  }
}




As can be found from the above, there are 2 levels of APIs for bulk updates. scouchdb already has an api for creating a document from a Scala object with auto id generation :

def doc[<: AnyRef](obj: T) = { //..

As an extension, I introduce the following which lets users add multiple new documents through a single API. Note here all of the documents will be added new ..

def docs(objs: List[<: AnyRef], allOrNothing: Boolean) = { //..

and the objects can be of any type, not necessarily the same. This is illustrated in the first of the 2 specs above.

But in case you need to use the full feature of bulk uploads and editing of multiple documents, I offer a builder based interface, which is illustrated in the second spec above. Here 2 new documents are added, 1 being updated and 1 deleted, all through one single API.

In case you are doing CouchDB and Scala stuff, give scouchdb a spin and post comments on your feedback. I am yet to write a meaningful application using scouchdb - any feedback will be immensely helpful.