Showing posts with label map-reduce. Show all posts

Sunday, June 07, 2009

scouchdb Scala View Server gets "reduce"

After a fairly long hiatus, I finally got some time to hack on scouchdb over the weekend. This is what came out of a brief stint on Saturday evening ..

map was already supported in version 0.3. You could define map functions in Scala as ..

val mapfn = """(doc: dispatch.json.JsValue) => {
  val it = couch.json.JsBean.toBean(doc, classOf[couch.json.TestBeans.Item_1])._3;
  for (st <- it.prices)
    yield(List(it.item, st._2))
}"""


Now you can do reduce too ..

val redfn = """(key: List[(String, String)], values: List[dispatch.json.JsNumber], rereduce: Boolean) => {
  // sum the numeric values emitted by map
  values.foldLeft(BigDecimal(0.00))((s, f) => s + (f match { case dispatch.json.JsNumber(n) => n }))
}"""
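The rereduce flag is there because CouchDB may call reduce again on the outputs of earlier reduce calls, passing rereduce = true. For a plain sum the flag can be ignored, since adding raw values and adding partial sums are the same operation. Here is a minimal plain-Scala sketch of that idea. It does not use scouchdb or dispatch, and the object name, the values and the chunk size of 2 are all made up for illustration.

```scala
object RereduceSketch {
  // Plain-Scala stand-in for the reduce function above; not scouchdb's API.
  // The rereduce flag is deliberately unused: summing is the same either way.
  def reduce(values: List[BigDecimal], rereduce: Boolean): BigDecimal =
    values.foldLeft(BigDecimal(0))(_ + _)

  // Hypothetical values, as if emitted by a map function.
  val values: List[BigDecimal] =
    List(BigDecimal("12.50"), BigDecimal("10.00"), BigDecimal("7.50"), BigDecimal("3.25"))

  // Reduce everything in a single call.
  def inOnePass: BigDecimal = reduce(values, rereduce = false)

  // Reduce in chunks of 2 (an arbitrary size), then reduce the partial sums again.
  def inTwoStages: BigDecimal = {
    val partials = values.grouped(2).map(chunk => reduce(chunk, rereduce = false)).toList
    reduce(partials, rereduce = true)
  }
}
```

Both paths give the same total here. A reduce that computes, say, an average would need to treat the rereduce = true case differently.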


attach the map and reduce functions to a view ..

val view = new View(mapfn, redfn)


and finally fetch using the view query ..

val ls1 =
  couch(test view(
    Views.builder("big/big_lunch")
         .build))
ls1.size should equal(1)


By default, reduce returns only one row, computed over the entire result set returned by map. The above query does not use grouping and so returns 1 row as the result. You can also group the view results and get back one reduced row per key ..

val ls1 =
  couch(test view(
    Views.builder("big/big_lunch")
         .options(optionBuilder group(true) build) // with grouping
         .build))
ls1.size should equal(3)
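To see why the row count changes between the two queries, here is a rough plain-Scala sketch of what grouping does to the reduce input. It runs without CouchDB; the keys, values and object name are hypothetical and chosen only so that there are three distinct keys.

```scala
object GroupingSketch {
  // Hypothetical (key, value) rows, as if emitted by a map function.
  val rows: List[(String, BigDecimal)] = List(
    "shop-a" -> BigDecimal("12.50"),
    "shop-b" -> BigDecimal("10.00"),
    "shop-a" -> BigDecimal("7.50"),
    "shop-c" -> BigDecimal("3.25")
  )

  // Same summing logic as the reduce function, on plain BigDecimals.
  def sum(values: List[BigDecimal]): BigDecimal =
    values.foldLeft(BigDecimal(0))(_ + _)

  // Without grouping: all values go through one reduce, giving a single result.
  def reduceAll(rs: List[(String, BigDecimal)]): BigDecimal =
    sum(rs.map(_._2))

  // With group = true: values are reduced separately for each distinct key.
  def reduceGrouped(rs: List[(String, BigDecimal)]): Map[String, BigDecimal] =
    rs.groupBy(_._1).map { case (k, kvs) => k -> sum(kvs.map(_._2)) }
}
```

With these rows, reduceAll collapses everything into one total, while reduceGrouped yields one entry per distinct key, which mirrors the 1 row versus 3 rows in the queries above.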


For a more detailed discussion and examples have a look at the project home page documentation or browse through the test script ScalaViewServerSpec.

The current trunk is 0.3.1. The previous version has been tagged as 0.3 and is available in the tags folder.

Next up ..

  • JPA-like collections of objects directly from scouchdb views

  • more capable reduce options (rereduce, collations etc.)

  • replication

  • advanced exception management with new dbDispatch


.. and lots of other features ..

Stay tuned!

Sunday, September 07, 2008

More Erlang with Disco

Over the weekend I was having a look at Disco, an open source map-reduce framework built atop Erlang/OTP that lets users write mapper/reducer jobs in Python. Disco does not demand any Erlang knowledge on the part of the user, who can use all the expressiveness of Python to implement map/reduce jobs.

One more addition to the stack of projects using Erlang as middleware.

As a programmer, you can concentrate on composing map/reduce jobs in Python. The Disco master receives jobs from clients, adds them to the job queue, and runs them on Erlang-powered clusters. Each node of the cluster runs the usual supervisor-worker hierarchy of Erlang processes, which fires up the concurrent processing of all client jobs. A server crash does not affect job execution, new servers can be added on the fly, and high availability can be ensured through a multiple Disco master configuration.

Disco overlaps quite a bit in functionality with CouchDB, one of the earliest adopters of Erlang at the backend, with JavaScript (and optionally a host of your favorite languages) for view processing and REST APIs.

As I had mentioned before, Erlang as middleware is catching on ..