
Monday, June 03, 2013

Endo is the new fluent API

I tweeted this over the weekend .. My last two blog posts have been about endomorphisms and how they combine with other functional structures to help you write expressive and composable code. In A DSL with an Endo - monoids for free, endos play with the Writer monad to implement a DSL for a sequence of activities through monoidal composition. And in An exercise in Refactoring - Playing around with Monoids and Endomorphisms, I discuss a refactoring exercise that exploits the monoid of an Endo to make composition easier.

Endomorphisms help you lift your computation into a data type that gives you an instance of a monoid, and the append operation of that monoid is function composition. Hence once you have the Endo for your type defined, you get a nice declarative syntax for the operations that you want to compose, resulting in a fluent API.

Just a quick recap .. endomorphisms are functions that map a type on to itself and compose as a monoid. Given an endomorphism we can define an implicit monoid instance ..
implicit def endoInstance[A]: Monoid[Endo[A]] = new Monoid[Endo[A]] {
  def append(f1: Endo[A], f2: => Endo[A]) = f1 compose f2
  def zero = Endo.idEndo
}
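
Just to get a feel of how this instance behaves, here's a minimal sketch that uses only the pieces appearing in this post - the .endo lift and the |+| append of the instance above ..

val inc: Endo[Int] = ((i: Int) => i + 1).endo
val dbl: Endo[Int] = ((i: Int) => i * 2).endo

// append is function composition: inc compose dbl
(inc |+| dbl) apply 10                      // 21

// zero is the identity endomorphism
(inc |+| endoInstance[Int].zero) apply 10   // 11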
I am not going into the details of this, which I discussed at length in my earlier posts. In this article I will sum up with yet another use case for making fluent APIs using the monoid instance of an Endo. Consider an example from the domain of securities trading, where a security trade goes through a sequence of transformations in its lifecycle through the trading process .. Here's a typical Trade model (very much simplified for demonstration) ..
sealed trait Instrument
case class Security(isin: String, name: String) extends Instrument

case class Trade(refNo: String, tradeDate: Date, valueDate: Option[Date] = None, 
  ins: Instrument, principal: BigDecimal, net: Option[BigDecimal] = None, 
  status: TradeStatus = CREATED)
Modeling a typical lifecycle of a trade is complex. But for illustration, let's consider these simple ones which need to be executed on a trade in sequence ..
  1. Validate the trade
  2. Assign value date to the trade, which will ideally be the settlement date
  3. Enrich the trade with tax/fees and net trade value
  4. Journalize the trade in books
Each of these functions takes a Trade and returns a copy of the Trade with some attributes modified. A naive way of doing that will be as follows ..
def validate(t: Trade): Trade = //..

def addValueDate(t: Trade): Trade = //..

def enrich(t: Trade): Trade = //..

def journalize(t: Trade): Trade = //..
and invoke these methods in sequence while modeling the lifecycle. Instead, we try to make it more composable and lift the function Trade => Trade into an Endo ..
type TradeLifecycle = Endo[Trade]
and here's the implementation ..
// validate the trade: business logic elided
def validate: TradeLifecycle = 
  ((t: Trade) => t.copy(status = VALIDATED)).endo

// add value date to the trade (for settlement)
def addValueDate: TradeLifecycle = 
  ((t: Trade) => t.copy(valueDate = Some(t.tradeDate), status = VALUE_DATE_ADDED)).endo

// enrich the trade: add taxes and compute net value: business logic elided
def enrich: TradeLifecycle = 
  ((t: Trade) => t.copy(net = Some(t.principal + 100), status = ENRICHED)).endo

// journalize the trade into book: business logic elided
def journalize: TradeLifecycle = 
  ((t: Trade) => t.copy(status = FINALIZED)).endo
Now Endo has an instance of Monoid defined by scalaz, and the append of Endo is function composition .. Hence here's our lifecycle model using the holy monoid of Endo ..
def doTrade(t: Trade) =
  (journalize |+| enrich |+| addValueDate |+| validate).apply(t)
It's almost the specification that we listed above in numbered bullets. Note the inside-out sequence that's required for the composition to take place in the proper order.
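
For completeness, here's roughly what a run looks like - the trade values below are made up purely for illustration ..

val t = Trade(refNo = "T-1001", tradeDate = new Date,
  ins = Security("US0378331005", "some security"), principal = BigDecimal(1000))

val result = doTrade(t)
// result.status    == FINALIZED
// result.valueDate == Some(t.tradeDate)
// result.net       == Some(t.principal + 100)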

Why not plain old composition ?


A valid question. The reason - abstraction. Abstracting the composition within types helps you compose the result with other types, as we saw in my earlier blog posts. In one of them we built larger abstractions using the Writer monad with Endo and in the other we used the mzero of the monoid as a fallback during composition thereby avoiding any special case branch statements.

One size doesn't fit all ..


The Endo and its monoid compose beautifully and give us a domain-friendly syntax that expresses the business functionality in a nice succinct way. But it's not a pattern which you can apply everywhere you need to compose a bunch of domain behaviors. Like every idiom, it has its shortcomings and you need different sets of solutions in your repertoire. For example the above solution doesn't handle any of the domain exceptions - what if the validation fails ? With the above strategy the only way you can handle this situation is to throw exceptions from the validate function. But exceptions are side-effects and in functional programming there are cleaner ways to tame the evil. And for that you need different patterns in practice. More on that in subsequent posts ..

Monday, July 25, 2011

Monad Transformers in Scala

Monads don't compose .. and hence Monad Transformers. A monad transformer maps monads to monads. It lets you transform a monad with additional computational effects. Stated simply, if you have a monadic computation in place you can enrich it incrementally with additional effects like states and errors without disturbing the whole structure of your program.

A monad transformer is represented by the kind T :: (* -> *) -> * -> *. The general contract that a monad transformer offers is ..

class MonadTrans t where
 lift :: (Monad m) => m a -> t m a

Here we lift a computation m a into the context of another effect t. We call t the monad transformer, which is itself a monad.

Well in this post, I will discuss monad transformers in Scala using scalaz 7. And here's what you get as the base abstraction corresponding to the Haskell typeclass shown above ..

trait MonadTrans[F[_[_], _]] {
  def lift[G[_] : Monad, A](a: G[A]): F[G, A]
}

It takes a G and lifts it into another computation F thereby combining the effects into the composed monad. Let's look at a couple of examples in scalaz that use lift to compose a new effect into an existing one ..

// lift an Option into a List
// uses transLift available from the main pimp of scalaz

scala> List(10, 12).transLift[OptionT].runT
res12: List[Option[Int]] = List(Some(10), Some(12))

// uses explicit constructor methods

scala> optionT(List(some(12), some(50))).runT
res13: List[Option[Int]] = List(Some(12), Some(50))

If you are like me, you must have already started wondering about the practical usage of this monad transformer thingy. Really, we want to see them in action in some meaningful code of the kind we write in our daily lives.

In the paper titled Monad Transformers Step by Step, Martin Grabmuller does this in Haskell, evolving a complete interpreter for a subset of a language using these abstractions and highlighting how they contribute towards an effective functional model of your code. In this post I render some of the Scala manifestations of those examples using scalaz 7 as the library of implementation. The important point that Martin also mentions in his paper is that you need to think functionally and organize your code upfront using monadic structures in order to take full advantage of incremental enrichment through monad transformers.

We will be writing an interpreter for a very small language. I will first define the base abstractions and start with functional code as the implementation base. It does not contain much of the useful stuff like state management, error handling etc., which I will add incrementally using monad transformers. We will see how the core model remains the same, transformers get added in layers and the static type of the interpreter function states explicitly what effects have been added to it.

The Language

Here's the language for which we will be writing the interpreter. Pretty basic stuff with literal integers, variables, addition, λ expressions (abstraction) and function application. By abstraction and application I mean lambda terms .. so a quick definition for the uninitiated ..


- a lambda term may be a variable, x
- if t is a lambda term, and x is a variable, then λx.t is a lambda term (called a lambda abstraction)
- if t and s are lambda terms, then ts is a lambda term (called an application)


and the Scala definitions for the language elements ..

// variable names
type Name = String
  
// Expression types
trait Exp
case class Lit(i: Int) extends Exp
case class Var(n: Name) extends Exp
case class Plus(e1: Exp, e2: Exp) extends Exp
case class Abs(n: Name, e: Exp) extends Exp
case class App(e1: Exp, e2: Exp) extends Exp
  
// Value types
trait Value
case class IntVal(i: Int) extends Value
case class FunVal(e: Env, n: Name, exp: Exp) extends Value

// Environment in which the λ-abstraction will be evaluated
type Env = collection.immutable.Map[Name, Value]

// defining additional data constructors because in Scala
// typeclass resolution and variance often give some surprises with type inferencing

object Values {
  def intval(i: Int): Value = IntVal(i)
  def funval(e: Env, n: Name, exp: Exp): Value = FunVal(e, n, exp)
}

The Reference Implementation

I start with the base implementation, which is a functional model of the interpreter. It contains only the basic stuff for evaluation and has no monadic structure. Incrementally we will start having fun with this ..

def eval0: Env => Exp => Value = { env => exp =>
  exp match {
    case Lit(i) => IntVal(i)
    case Var(n) => (env get n).get
    case Plus(e1, e2) => {
      val IntVal(i1) = eval0(env)(e1)
      val IntVal(i2) = eval0(env)(e2)
      IntVal(i1 + i2)
    }
    case Abs(n, e) => FunVal(env, n, e)
    case App(e1, e2) => {
      val val1 = eval0(env)(e1)
      val val2 = eval0(env)(e2)
      val1 match {
        case FunVal(e, n, exp) => eval0((e + ((n, val2))))(exp)
      }
    }
  }
}

Note we assume that we have the proper matches everywhere - the Map lookup in processing variables (Var) doesn't fail and we have the proper function value when we go for the function application. So things look happy for the correct paths of expression evaluation ..

// Evaluate: 12 + ((λx -> x)(4 + 2))

scala> val e1 = Plus(Lit(12), App(Abs("x", Var("x")), Plus(Lit(4), Lit(2))))
e1: Plus = Plus(Lit(12),App(Abs(x,Var(x)),Plus(Lit(4),Lit(2))))

scala> eval0(collection.immutable.Map.empty[Name, Value])(e1)
res4: Value = IntVal(18)
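
And for contrast, here's a sketch (not part of the original session) of what happens once that assumption breaks - an unbound variable blows up with an exception from the naked Map lookup ..

scala> eval0(collection.immutable.Map.empty[Name, Value])(Var("x"))
java.util.NoSuchElementException: None.get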

Go Monadic

Monad transformers give you layers of control over the various aspects of your computation. But for that to happen you need to organize your code in a monadic way. Think of it like this - if your code models the computations of your domain (aka the domain logic) as per the contracts of an abstraction, you can very well compose more such abstractions in layers without directly poking into the underlying implementation.

Let's do one thing - let's transform the above function into a monadic one that doesn't add any effect. It only sets up the base case for other monad transformers to prepare their playing fields. It's the Identity monad, which simply applies the bound function to its input without any additional computational effect. In scalaz 7 Identity simply wraps a value and provides a map and flatMap for bind.
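
As a minimal sketch of what Identity gives us - using only the operations mentioned above and the point / value syntax that appears below ..

val v = 10.point[Identity]                          // wrap a value
val w = v map (_ + 1)                               // map inside Identity
val x = w flatMap (i => (i * 2).point[Identity])    // bind
x.value                                             // 22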

Here's our next iteration of eval, this time with the Identity monad baked in .. eval0 was returning Value, eval1 returns Identity[Value] - the return type makes it explicit that we are now in the land of monads and have wrapped ourselves into a computational structure which can only be manipulated through the bounds of the contract that the monad allows.

type Eval1[A] = Identity[A]

def eval1: Env => Exp => Eval1[Value] = {env => exp =>
  exp match {
    case Lit(i) => intval(i).point[Eval1]
    case Var(n) => (env get n).get.point[Eval1]
    case Plus(e1, e2) => for {
      i <- eval1(env)(e1)
      j <- eval1(env)(e2)
    } yield {
      val IntVal(i1) = i
      val IntVal(i2) = j
      IntVal(i1 + i2)
    }
    case Abs(n, e) => funval(env, n, e).point[Eval1]
    case App(e1, e2) => for {
      val1 <- eval1(env)(e1)
      val2 <- eval1(env)(e2)
      // the result of the application is itself monadic, so bind it with a
      // generator instead of yielding it - otherwise we would end up with a
      // nested Identity[Identity[Value]]
      r    <- val1 match {
                case FunVal(e, n, exp) => eval1(e + ((n, val2)))(exp)
              }
    } yield r
  }
}

All returns are now monadic, though the basic computation remains the same. The Lit, Abs and the Var cases use the point function (pure in scalaz 6), which is equivalent to a Haskell return. Plus and App use the for comprehension to evaluate the monadic action. Here's the result on the REPL ..

scala> eval1(collection.immutable.Map.empty[Name, Value])(e1)
res7: Eval1[Value] = scalaz.Identity$$anon$2@18f67fc

scala> res7.value
res8: Value = IntVal(18)

So the Identity monad has successfully installed itself making our computational model like an onion peel on which we can now stack up additional effects.

Handling Errors

In eval1 we have a monadic functional model of our computation. But we have not yet handled any errors that may arise from the computation. And I promised that we will add such effects incrementally without changing the guts of your model.

As a very first step, let's use a monad transformer that helps us handle errors, not by throwing exceptions (exceptions are bad .. right?) but by wrapping the error conditions in yet another abstraction. Needless to say this also has to be monadic because we would like it to compose with our already implemented Identity monad and the others that we will work out later on.

scalaz 7 offers EitherT which we can use as the Error monad transformer. It is defined as ..

sealed trait EitherT[A, F[_], B] {
  val runT: F[Either[A, B]]
  //..
}

It adds the EitherT computation on top of F so that the composed monad will have both the effects. And as with Either we use the Left A for the error condition and the Right B for returning the result. The plus point of using the monad transformer is that this plumbing of the 2 monads is taken care of by the implementation of EitherT, so that we can simply define the following ..

type Eval2[A] = EitherT[String, Identity, A]

def eval2a: Env => Exp => Eval2[Value] = {env => exp =>
  //..
}

The error will be reported as a String and the Value will be returned in the Right constructor of Either. Our return type is also explicit about what the function does. You can simply change the return type to Eval2 and keep the rest of the function the same as eval1. It works perfectly like the earlier one. Since we have not yet coded explicitly for the error conditions, appropriate error messages will not appear, but the happy paths execute as earlier even with the changed return type. This is because Identity was a monad and so is the newly composed one consisting of Identity and EitherT.

We can run eval2a and the only difference in output will be that the result will be wrapped in a Right constructor ..

scala> val e1 = Plus(Lit(12), App(Abs("x", Var("x")), Plus(Lit(4), Lit(2))))
e1: Plus = Plus(Lit(12),App(Abs(x,Var(x)),Plus(Lit(4),Lit(2))))

scala> eval2a(collection.immutable.Map.empty[Name, Value])(e1)
res31: Eval2[Value] = scalaz.EitherTs$$anon$2@ad2f60

scala> res31.runT.value
res33: Either[String,Value] = Right(IntVal(18))

We can do a couple more iterations improving upon how we handle errors using EitherT and issue appropriate error messages to the user. Here's the final version that has all error handling implemented. Note however that the core model remains the same - we have only added the Left handling for error conditions ..

def eval2: Env => Exp => Eval2[Value] = {env => exp =>
  exp match {
    case Lit(i) => intval(i).point[Eval2]

    case Var(n) => (env get n).map(v => rightT[String, Identity, Value](v))
                              .getOrElse(leftT[String, Identity, Value]("Unbound variable " + n))
    case Plus(e1, e2) => 
      val r = 
        for {
          i <- eval2(env)(e1)
          j <- eval2(env)(e2)
        } yield((i, j))

      r.runT.value match {
        case Right((IntVal(i_), IntVal(j_))) => rightT(IntVal(i_ + j_))
        case Left(s) => leftT("type error in Plus" + "/" + s)
        case _ => leftT("type error in Plus")
      }

    case Abs(n, e) => funval(env, n, e).point[Eval2]

    case App(e1, e2) => 
      val r =
        for {
          val1 <- eval2(env)(e1)
          val2 <- eval2(env)(e2)
        } yield((val1, val2))

      r.runT.value match {
        case Right((FunVal(e, n, exp), v)) => eval2(e + ((n, v)))(exp)
        case _ => leftT("type error in App")
      }
  }
}
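
Here's a hypothetical run of eval2 with an unbound variable (not from the original session) - the error now surfaces as a Left instead of an exception, with the messages chained along the way ..

val bad = Plus(Lit(12), Var("y"))
eval2(collection.immutable.Map.empty[Name, Value])(bad).runT.value
// Left(type error in Plus/Unbound variable y)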

How about some State ?

Let's add some state to the function using the State monad. So now we need to stack up our pile of transformers with yet another effect. We would like to add some profiling capability that tracks the invocation of every pattern in the evaluator. For simplicity we just count the number of invocations as an integer and report it along with the final output. We define the new monad by wrapping a StateT constructor around the innermost monad, Identity. So now our return type becomes ..

type Eval3[A] = EitherT[String, StateTIntIdentity, A]

We layer the StateT between EitherT and Identity - hence we need to form a composition between StateT and Identity that goes as the constructor to EitherT. This is defined as StateTIntIdentity, where we make the state an Int. And we define it as a type lambda as follows ..

type StateTIntIdentity[α] = ({type λ[α] = StateT[Int, Identity, α]})#λ[α]

Intuitively our returned value in case of a successful evaluation will be a tuple2 (Either[String, Value], Int), as we will see shortly.

We write a couple of helper functions that manage the state by incrementing a counter, lifting the result into a StateT monad and finally lifting everything into the EitherT.

// thread the state: increment the counter and pair it with the Either result
def stfn(e: Either[String, Value]) = (s: Int) => id[(Either[String, Value], Int)](e, s+1)

// lift the Either into StateT and then the whole thing into EitherT
def eitherNStateT(e: Either[String, Value]) =
  eitherT[String, StateTIntIdentity, Value](stateT[Int, Identity, Either[String, Value]](stfn(e)))

And here's the eval3 function that does the evaluation along with profiling and error handling ..

def eval3: Env => Exp => Eval3[Value] = {env => exp => 
  exp match {
    case Lit(i) => eitherNStateT(Right(IntVal(i)))

    case Plus(e1, e2) =>
      def appplus(v1: Value, v2: Value) = (v1, v2) match {
        case ((IntVal(i1), IntVal(i2))) => eitherNStateT(Right(IntVal(i1 + i2))) 
        case _ => eitherNStateT(Left("type error in Plus"))
      }
      for {
        i <- eval3(env)(e1)
        j <- eval3(env)(e2)
        v <- appplus(i, j)
      } yield v

    case Var(n) => 
      val v = (env get n).map(Right(_))
                         .getOrElse(Left("Unbound variable " + n))
      eitherNStateT(v)

    case Abs(n, e) => eitherNStateT(Right(FunVal(env, n, e)))

    case App(e1, e2) => 
      def appfun(v1: Value, v2: Value) = v1 match {
        case FunVal(e, n, body) => eval3(e + ((n, v2)))(body)
        case _ => eitherNStateT(Left("type error in App"))
      }

      val s =
        for {
          val1 <- eval3(env)(e1)
          val2 <- eval3(env)(e2)
          v    <- appfun(val1, val2)
        } yield v

      val ust = s.runT.value.usingT((x: Int) => x + 1)
      eitherT[String, StateTIntIdentity, Value](ust)
  }
}

We run the above function through another helper runEval3 that also takes the seed value of the state ..

def runEval3: Env => Exp => Int => (Either[String, Value], Int) = { env => exp => seed => 
  eval3(env)(exp).runT.value.run(seed)
}

Here's the REPL session with runEval3 ..

scala> val e1 = Plus(Lit(12), App(Abs("x", Var("x")), Plus(Lit(4), Lit(2))))
e1: Plus = Plus(Lit(12),App(Abs(x,Var(x)),Plus(Lit(4),Lit(2))))
scala> runEval3(env)(e1)(0)
res25: (Either[String,Value], Int) = (Right(IntVal(18)),8)

// -- failure case --
scala> val e2 = Plus(Lit(12), App(Abs("x", Var("y")), Plus(Lit(4), Lit(2))))
e2: Plus = Plus(Lit(12),App(Abs(x,Var(y)),Plus(Lit(4),Lit(2))))

scala> runEval3(env)(e2)(0)
res27: (Either[String,Value], Int) = (Left(Unbound variable y),7)

In case you are interested, the whole code base is there in my github repository. Feel free to check it out. I will be adding a couple more transformers for hiding the environment (ReaderT) and logging (WriterT) and also IO.

Monday, March 28, 2011

Killing a var and Threading a State

In my earlier post on CQRS using functional domain models and Akka actors, I had implemented a data store that accumulates all events associated with individual trades. We call this Event Sourcing, which allows us to roll back our system in time and replay all events over an earlier snapshot to bring it up to date. This has many uses, e.g. to detect any problems that might have occurred in the past or to profile your system on a retroactive basis. This post is not about event sourcing or the virtues of CQRS.

In this post I start taking cue from the EventSourcing example and discuss some strategies of improving some aspects of the domain model. This is mostly related to raising the level of abstraction at which to program the solution domain. Consider this post to be a random rant to record some of my iterations in the evolution of the domain model.

The code snippets that I present below may sometimes look out of context since they're part of a bigger model. The entire prototype is there in my github repository ..

Story #1 : How to kill a var

Have a look at the event store that I implemented earlier ..

class EventStore extends Actor with Listeners {
  private var events = Map.empty[Trade, List[TradeEvent]]

  def receive = //..
  //..
}


With an actor based implementation the mutable var events is ok since the state is confined within the actor itself. In another implementation where I was using a different synchronous event store, I had to get around this mutable shared state. It was around this time that Tony Morris published his Writer Monad in Scala. This looked like a perfect fit .. Here's the implementation of the abstraction that logs all events, shamelessly adapted from Tony's Logging Without Side-effects example ..

import TradeModel._
object EventLog {
  type LOG = List[(Trade, TradeEvent)]
}

import EventLog._

case class EventLogger[A](log: LOG, a: A) {
  def map[B](f: A => B): EventLogger[B] =
    EventLogger(log, f(a))

  def flatMap[B](f: A => EventLogger[B]): EventLogger[B] = {
    val EventLogger(log2, b) = f(a)
    EventLogger(log ::: log2 /* accumulate */, b)
  }
}

object EventLogger {
  implicit def LogUtilities[A](a: A) = new {
    def nolog =
      EventLogger(Nil /* empty */, a)

    def withlog(log: (Trade, TradeEvent)) =
      EventLogger(List(log), a)

    def withvaluelog(log: A => (Trade, TradeEvent)) =
      withlog(log(a))
  }
}


and here's a snippet that exercises the logging process ..

import EventLogger._

val trd = makeTrade("a-123", "google", "r-123", HongKong, 12.25, 200).toOption.get

val r = for {
  t1 <- enrichTrade(trd) withlog (trd, enrichTrade)
  t2 <- addValueDate(t1) withlog (trd, addValueDate)
} yield t2


Now I can check what events have been logged and do some processing on the event store ..

// get the log from the EventLogger grouped by trade
val m = r.log.groupBy(_._1)

// play the event on the trades to get the current snapshot
val x =
  m.keys.map { t =>
    m(t).map(_._2).foldLeft(t)((a,e) => e(a))
  }

// check the results
x.size should equal(1)
x.head.taxFees.get.size should equal(2) 
x.head.netAmount.get should equal(3307.5000)


Whenever you're appending data to an abstraction consider using the Writer monad. With this I killed the var events from modeling my event store.

Story #2: Stateful a la carte

This is a story that gives a tip for handling changing state of a domain model in a functional way. When the state does not change and you're using an abstraction repeatedly for reading, you have the Reader monad.

// enrichment of trade
// Reader monad
val enrich = for {
  taxFeeIds      <- forTrade // get the tax/fee ids for a trade
  taxFeeValues   <- taxFeeCalculate // calculate tax fee values
  netAmount      <- enrichTradeWith // enrich trade with net amount
}
yield((taxFeeIds map taxFeeValues) map netAmount)

val trd = makeTrade("a-123", "google", "r-123", HongKong, 12.25, 200)
(trd map enrich) should equal(Success(Some(3307.5000)))


Note how we derive the enrichment information for the trade keeping the original abstraction immutable - Reader monad FTW.
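
The trick that makes the comprehension above tick is that each step is just a function reading from the same input, and scalaz gives Function1 a monad instance. Here's a minimal, domain-neutral sketch of that Reader behavior ..

val double: Int => Int = _ * 2
val square: Int => Int = x => x * x

// the function (Reader) monad threads the same input through every step
val both = for {
  d <- double
  s <- square
} yield (d, s)

both(3)    // (6, 9)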

But what happens when you need to handle state that changes in the lifecycle of the abstraction ? You can use the State monad itself. It allows you to thread a changing state across a sequence transparently at the monad definition level. Here's how I would enrich a trade using the State monad as implemented in scalaz ..

val trd = makeTrade("a-123", "google", "r-123", HongKong, 12.25, 200).toOption.get

val x =
  for {
    _ <- init[Trade]
    _ <- modify((t: Trade) => refNoLens.set(t, "XXX-123"))
    u <- modify((t: Trade) => taxFeeLens.set(t, some(List((TradeTax, 102.25), (Commission, 25.65)))))
  } yield(u)

x ~> trd == trd.copy(refNo = "XXX-123",
  taxFees = some(List((TradeTax, 102.25), (Commission, 25.65))))


In case you're just wondering what's going on above, here's a bit of an explanation. init initializes the state for Trade. init is defined as ..

def init[S]: State[S, S] = state[S, S](s => (s, s))


modify does the state change with the function passed to it as the argument ..

def modify[S](f: S => S) = init[S] flatMap (s => state(_ => (f(s), ())))


We apply the series of modify to enrich the trade. The actual threading takes place transparently through the magic of comprehensions.
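
If the magic feels too magical, here's roughly what the comprehension above desugars to - a chain of flatMaps over State, with the trade threaded implicitly from one step to the next ..

val x =
  init[Trade] flatMap { _ =>
    modify((t: Trade) => refNoLens.set(t, "XXX-123")) flatMap { _ =>
      modify((t: Trade) => taxFeeLens.set(t, some(List((TradeTax, 102.25), (Commission, 25.65)))))
    }
  }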

There is another way of securely encapsulating stateful computations that allow in-place updates - all in the context of functional programming. This is the ST monad which has very recently been inducted into scalaz. But that is the subject of another post .. sometime later ..

Monday, February 14, 2011

Applicatives for composable JSON serialization in Scala

After quite some time I have decided to play around with sjson once again. For the convenience of those who are not familiar with sjson, it's a tiny JSON serialization library that can serialize and de-serialize Scala objects. sjson offers two ways in which you can serialize your Scala objects :-
  1. typeclass based serialization, where you define your own protocol (typeclass instances) for your own objects. The standard ones, of course come out of the box.
  2. reflection based serialization, where you provide a bunch of annotations and sjson looks up reflectively and tries to get your objects serialized and de-serialized.
One of the things which bothered me in both the implementations is the way errors are handled. Currently I use exceptions to report errors in serializing / de-serializing. Exceptions, as you know, are side-effects and don't compose. Hence even if your input JSON value has many keys that don't match the names in your Scala class, errors are reported one by one.

scalaz is a Haskell-like library for Scala that offers myriad options towards pure functional programming. I have been playing around with scalaz recently, particularly the typeclasses for Applicatives. I have also blogged on some of the compositional features that scalaz offers that help make your code much more declarative, concise and composable.

The meat of scalaz is based on the two most potent forces that Scala offers towards data type generic programming :-
  1. typeclass encoding using implicits and
  2. ability to abstract over higher kinded types (type constructor polymorphism)
Using these features scalaz has made lots of operations available to a large family of data structures, which were otherwise available only for a smaller subset in the Scala standard library. Another contribution of scalaz has been to make many of the useful abstractions first class in Scala e.g. Applicative, Monad, Traversable etc. All of these are available in Haskell as typeclass hierarchies - so now you can use the goodness of these abstractions in Scala as well.

One of the areas which I focused on in sjson using scalaz is to make error reporting composable. Have a look at the following snippet ..

// an immutable value object in Scala
case class Address(no: Int, street: String, city: String, zip: String)
 
// typeclass instance for sjson serialization protocol for Address
object AddressProtocol extends DefaultProtocol {
 
 implicit object AddressFormat extends Format[Address] {
   def reads(json: JsValue): ValidationNEL[String, Address] = json match {
     case m@JsObject(_) => 
       (field[Int]("no", m)        |@| 
        field[String]("street", m) |@| 
        field[String]("city", m)   |@| 
        field[String]("zip", m)) { Address }
 
     case _ => "JsObject expected".fail.liftFailNel
   }
 //..
 }
}


In the current version of sjson, reads returns an Address. Now it returns an applicative, ValidationNEL[String, Address], which is a synonym for Validation[NonEmptyList[String], Address]. Validation is isomorphic to scala.Either in the sense that it has two separate types for error and success. But it has a much cleaner API and does not leave the choice to convention. In our case, since we will be accumulating errors, we choose a List type for the error part. As a general implementation strategy, when Validation is used as an Applicative, the error type is modeled as a Semigroup that offers an append operation. Have a look at scalaz for details of how you can use Validation as an applicative for cumulative error reporting.
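
To make the accumulation concrete, here's a tiny sketch using the same fail / liftFailNel combinators that appear in the snippets of this post ..

val e1: ValidationNEL[String, Int]    = "field no not found".fail.liftFailNel
val e2: ValidationNEL[String, String] = "field street not found".fail.liftFailNel

// composing the two failures through the applicative accumulates both errors
(e1 |@| e2) { (a, b) => (a, b) }
// Failure(NonEmptyList(field no not found, field street not found))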

Let's see what happens in the above snippet ..

1. field extracts the value of the relevant field (whose name is passed as the first argument) from the JsObject. Incidentally JsObject is from Nathan Hamblen's dispatch-json, which sjson uses under the covers. More on dispatch-json's awesomeness later :). Here's how I define field .. Note that if the name is not available, it gives us a Failure type on the Validation.

def field[T](name: String, js: JsValue)(implicit fjs: Reads[T]): ValidationNEL[String, T] = {
 val JsObject(m) = js
 m.get(JsString(name))
  .map(fromjson[T](_)(fjs))
  .getOrElse(("field " + name + " not found").fail.liftFailNel)
}


2. field invocations are composed using the |@| combinator of scalaz, which gives us an ApplicativeBuilder that allows us to play around with the elements that it composes. In the above snippet we simply pass these components to build up an instance of the Address class.

Since Validation is an Applicative, all errors that come up during composition of field invocations get accumulated in the final list that occurs as the error type of it.

Let's first look at the normal use case where things are happy and we get an instance of Address constructed from the parsed JSON. No surprises here ..

// test case
it ("should serialize an Address") {
 import Protocols._
 import AddressProtocol._ // typeclass instances
 val a = Address(12, "Tamarac Square", "Denver", "80231")
 fromjson[Address](tojson(a)) should equal(a.success)
}


But what happens if there are some errors in the typeclass instance that you created ? Things start to get interesting from here ..

implicit object AddressFormat extends Format[Address] {
 def reads(json: JsValue): ValidationNEL[String, Address] = json match {
   case m@JsObject(_) => 
     (field[Int]("number", m) |@| 
      field[String]("stret", m) |@| 
      field[String]("City", m) |@| 
      field[String]("zip", m)) { Address }
 
   case _ => "JsObject expected".fail.liftFailNel
 }
 //..
}


Note that the keys passed to the field API do not match the field names in the Address class. Deserialization fails and we get a nice list of all the errors reported as part of the Failure type ..

it ("address serialization should fail") {
  import Protocols._
  import IncorrectPersonProtocol._
  val a = Address(12, "Tamarac Square", "Denver", "80231")
  (fromjson[Address](tojson(a))).fail.toOption.get.list 
    should equal (List("field number not found", "field stret not found", "field City not found"))
}


Composability .. Again!

A layer of monads on top of your API makes your API composable with any other monad in the world. With sjson de-serialization returning a Validation, we get better composability when writing complex serialization code like the following. Consider this JSON string from which we need to pick up fields selectively and make a Scala object ..

val jsonString = 
  """{
       "lastName" : "ghosh", 
       "firstName" : "debasish", 
       "age" : 40, 
       "address" : { "no" : 12, "street" : "Tamarac Square", "city" : "Denver", "zip" : "80231" }, 
       "phone" : { "no" : "3032144567", "ext" : 212 },
       "office" :
        {
          "name" : "anshinsoft",
          "address" : { "no" : 23, "street" : "Hampden Avenue", "city" : "Denver", "zip" : "80245" } 
        }
     }"""


We would like to cherry-pick a few of the fields from here and create an instance of the Contact class ..

case class Contact(lastName: String, firstName: String, 
  address: Address, officeCity: String, officeAddress: Address)


Try this with the usual approach as shown above and you will find some boilerplate repetition in your implementation ..

import dispatch.json._
import Js._

val js = Js(jsonString) // js is a JsValue

(field[String]("lastName", js)    |@| 
 field[String]("firstName", js)   |@| 
 field[Address]("address", js)    |@| 
 field[String]("city", (('office ! obj) andThen ('address ? obj))(js)) |@|
 field[Address]((('office ! obj) andThen ('address ! obj)), js)) { Contact } should equal(c.success)


Have a look at how we need to repeatedly pass around js, though we never modify it at any time. Since our field API is monadic, we can compose all invocations of field together with a Reader monad. This is a very useful technique of API composition which I discussed in an earlier blog post. (Here is a trivia : How can we compose similar stuff when there's modification involved in the passed around state ? Hint: The answer is within the question itself :D)

But for that we need to make a small change in our field API. We need to make it curried .. Here are 2 variants of the curried field API ..

// curried version: for lookup of a String name
def field_c[T](name: String)(implicit fjs: Reads[T]) = { js: JsValue =>
  val JsObject(m) = js
  m.get(JsString(name)).map(fromjson[T](_)(fjs)).getOrElse(("field " + name + " not found").fail.liftFailNel)
}

// curried version: we need to get a complete JSON object out
def field_c[T](f: (JsValue => JsValue))(implicit fjs: Reads[T]) = { js: JsValue =>
  try {
    fromjson[T](f(js))(fjs)
  } catch {
    case e: Exception => e.getMessage.fail.liftFailNel
  }
}


Note how in the second variant of field_c, we use the extractors of dispatch-json to take out nested objects from a JsValue structure. We use it below to get the office address from within the parsed JSON.
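
Just to see a single curried lookup in isolation before we compose them all - a sketch against the js parsed above (middleName is a deliberately missing key) ..

val lastName = field_c[String]("lastName")
lastName(js)    // Success("ghosh")

val badField = field_c[String]("middleName")
badField(js)    // Failure(NonEmptyList("field middleName not found"))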

And here's how we compose all lookups monadically and finally come up with the Contact instance ..

// reader monad
val contact =
  for {
    last    <- field_c[String]("lastName")
    first   <- field_c[String]("firstName")
    address <- field_c[Address]("address")
    office  <- field_c[Address]((('office ! obj) andThen ('address ! obj)))
  }
  yield(last |@| first |@| address |@| office)

// city needs to be parsed separately since we are working on part of js
val city = field_c[String]("city")

// compose everything and build a Contact
(contact(js) |@| city((('office ! obj) andThen ('address ? obj))(js))) { 
  (last, first, address, office, city) => 
    Contact(last, first, address, city, office) } should equal(c.success)


I am still toying around with some of the monadic implementations of sjson APIs. It's offered as a separate package and will make a nice addition to the API families that sjson offers. You can have a look at my github repo for more details. I plan to finalize it soon, before I get to 1.0.