
Monday, January 14, 2013

A language and its interpretation - Learning free monads

I have been playing around with free monads of late and finding them more and more useful in implementing separation of concerns between pure data and its interpretation. Monads generally don't compose. But if you restrict monads to a particular form, then you can define a sum type that composes. In the paper Data Types a la carte, Wouter Swierstra describes this form as
data Term f a =
    Pure a
    | Impure (f (Term f a)) 
These monads consist of either pure values or an impure effect, constructed using f. When f is a functor, Term f is a monad. And in this case, Term f is the free monad on f - free in the sense that this construction is left adjoint to the forgetful functor that takes a monad to its underlying functor.

I am not going into the details of what makes a monad free. I once asked this question on Google+ and Edward Kmett came up with a beautiful explanation. So instead of trying to come up with a half-assed version of the same, have a look at Ed's response here.

In short, we can say that a free monad is the freest object possible that's still a monad.

Composition

Free monads compose and help you build larger abstractions which are pure data and yet manage to retain all properties of a monad. Hmm .. this sounds interesting, because now we can not only build abstractions but also let clients extend them through composition, using the fact that the result is still a monad.

A free monad is pure data, not yet interpreted, as we will see shortly. You can pass it to a separate interpreter (possibly multiple interpreters) which can do whatever you feel like with the structure. Your free monad remains pure while all impurities can be put inside your interpreters.

And so we interpret ..

In this post I will describe how I implemented an interpreter for a Joy like concatenative language that uses the stack for its computation. Of course it's just a prototype and addresses an extremely simplified subset, but you get the idea. The basic purpose is to explore the power of free monads and come up with something that can potentially be extended to a full blown implementation of the language.

When designing an interpreter there's always the risk of conflating the language with the concerns of interpreting it. Here we will keep the two completely decoupled by designing the core language constructs as free monads. Gabriel Gonzalez has written a series of blog posts [1,2,3] on the use of free monads which contain details of their virtues and usage patterns. My post is just an account of my learning experience. In fact after I wrote the post, I discovered a similar exercise done for embedding Forth in Haskell - so I guess I'm not the first one to learn free monads using a language interpreter.

Let's start with some code, which is basically a snippet of Joy like code that will run within our Haskell interpreter ..
p :: Joy ()
p = do push 5
       push 6
       add
       incr
       add2
       square
       cube
       end

This is our wish list and at the end of the post we will see if we can interpret this correctly and reason about some aspects of this code. Let's not bother much about the details of the above snippet. The important point is that if we fire it up in ghci, we will see that it's pure data!
*Joy> p
Free (Push 5 (Free (Push 6 (Free (Add (Free (Push 1 (Free (Add (Free (Push 1 (Free (Add (Free (Push 1 (Free (Add (Free (Dup (Free (Mult (Free (Dup (Free (Dup (Free (Mult (Free (Mult (Free End))))))))))))))))))))))))))))))
*Joy> 

We haven't yet executed anything of the above code. It's completely free to be interpreted and possibly in multiple ways. So, we have achieved this isolation - that the data is freed from the interpreter. You want to develop a pretty printer for this data - go for it. You want to apply semantics and give an execution model based on Joy - do it.

Building the pure data (aka Builder)

Let's first define the core operators of the language ..
data JoyOperator cont = Push Int cont
                      | Add      cont 
                      | Mult     cont 
                      | Dup      cont 
                      | End          
                      deriving (Show, Functor)

The interesting piece is the derivation of the Functor, which is what we need in order to turn JoyOperator into a free monad. Keeping the technical mumbo jumbo aside, free monads are just a general way of turning functors into monads. So if we have a core operator f as a functor, we can get a free monad Free f out of it. And knowing something is a free monad helps you transform an operation over the monad (a monad homomorphism) into an operation over the underlying functor. We will see how this helps later ..

The other point to note is that every operator except End takes a continuation argument that points to the next operation in the chain. End is the terminal symbol and we plan to ignore anything that the user enters after an End.

Push takes an Int and pushes it onto the stack, Add pops the top 2 elements of the stack and pushes the sum, Mult does the same for multiplication and Dup duplicates the top element of the stack. End signifies the end of the program.

Next we define the free monad over JoyOperator by using the Free data constructor, defined as part of Control.Monad.Free ..

data Free f a = Pure a | Free (f (Free f a))

-- | The free monad over JoyOperator
type Joy = Free JoyOperator

And then follow it up with some of the definitions of Joy operators as operations over free monads. Note that liftF lifts an operator (which is a Functor) into the context of a free monad. liftF has the following type ..

liftF :: Functor f => f a -> Free f a

As a property, the free monad construction is left adjoint to the forgetful functor that throws away the monadic structure. Going back from the free monad is done by the retract function ..

retract :: Monad f => Free f a -> f a

and needless to say

retract . liftF = id

-- | Push an integer to the stack
push :: Int -> Joy ()
push n = liftF $ Push n ()

-- | Add the top two numbers of the stack and push the sum
add :: Joy ()
add = liftF $ Add ()

.. and this can be done for all operators that we wish to support.

Not only this, we can also combine the above operators and build newer ones. Remember we are working with monads and hence the *do* notation based sequencing comes for free ..

-- | This combinator adds 1 to a number. 
incr :: Joy ()
incr = do {push 1; add}

-- | This combinator increments twice
add2 :: Joy ()
add2 = do {incr; incr}

-- | This combinator squares a number
square :: Joy ()
square = do {dup; mult}

Now we can have a composite program which sequences through the core operators as well as the ones we derive from them. And that's what we posted as our first example snippet of a target program.

An Interpreter (aka Visitor)

Once we have the pure data part done, let's try and build an interpreter that does the actual execution based on the semantics that we defined on the operators.

-- | Run a joy program. Result is either an Int or an error
runProgram :: Joy n -> Either JoyError Int
runProgram program = joy [] program
  where joy stack (Free (Push v cont))         = joy (v : stack) cont
        joy (a : b : stack) (Free (Add cont))  = joy (a + b : stack) cont
        joy (a : b : stack) (Free (Mult cont)) = joy (a * b : stack) cont
        joy (a : stack) (Free (Dup cont))      = joy (a : a : stack) cont
        joy _ (Free Add {})                    = Left NotEnoughParamsOnStack
        joy _ (Free Mult {})                   = Left NotEnoughParamsOnStack
        joy _ (Free Dup {})                    = Left NotEnoughParamsOnStack
        joy [] (Free End)                      = Left NotEnoughParamsOnStack
        joy [result] (Free End)                = Right result
        joy _ (Free End)                       = Left NotEmptyOnEnd
        joy _ Pure {}                          = Left NoEnd

runProgram is the interpreter that takes a free monad as its input. Its implementation is quite trivial - it just matches on the recursive structure of the data and pushes the appropriate results on the stack. Now if we run our program p using the above interpreter we get the correct result ..

*Joy> runProgram p
Right 7529536
*Joy> 

Equational Reasoning

Being Haskell and being pure, we obviously can prove some bits and pieces of our program as mathematical equations. At the beginning I said that End is the end of the program and anything after End needs to be ignored. What happens if we do the following ..

runProgram $ do {push 5; incr; incr; end; incr; incr}

If you have guessed correctly we get Right 7, which means that all operations after end have been ignored. But can we prove that our program indeed does this ? The following is a proof that end indeed ends our program. Consider some operation m that follows end ..

end >> m

-- definition of end
= liftF End >> m

-- m >> m' = m >>= \_ -> m'
= liftF End >>= \_ -> m

-- liftF f = Free (fmap Pure f), and fmap f End = End
= Free End >>= \_ -> m

-- Free m >>= f = Free (fmap (>>= f) m)
= Free (fmap (>>= \_ -> m) End)

-- fmap f End = End
= Free End

-- again, since fmap Pure End = End, Free End = liftF End
= liftF End

-- definition of end
= end

So this shows that any operation we do after end is never executed.

Improving the bind

Free monads offer improved modularity by allowing us to separate the building of pure data from the (possibly) impure concerns of interpreting it in the external world. But a naive implementation leads to a penalty in runtime efficiency, as Janis Voigtlander discusses in his paper Asymptotic Improvement of Computations over Free Monads. And the Haskell implementation of free monads had this inefficiency, where the asymptotic complexity of substitution was quadratic because of left-associative binds. Edward Kmett implemented the Janis trick and engineered a solution that gets over this inefficiency. I will discuss this in a future post. And if you would like to play around with the interpreter, the source is there in my github repository.

Monday, July 25, 2011

Monad Transformers in Scala

Monads don't compose .. and hence Monad Transformers. A monad transformer maps monads to monads. It lets you layer additional computational effects onto a monad. Stated simply, if you have a monadic computation in place you can enrich it incrementally with additional effects like state and errors without disturbing the whole structure of your program.

A monad transformer is represented by the kind T :: (* -> *) -> * -> *. The general contract that a monad transformer offers is ..

class MonadTrans t where
 lift :: (Monad m) => m a -> t m a

Here we lift a computation m a into the context of another effect t. We call t the monad transformer, and the transformed t m is itself a monad.

Well in this post, I will discuss monad transformers in Scala using scalaz 7. And here's what you get as the base abstraction corresponding to the Haskell typeclass shown above ..

trait MonadTrans[F[_[_], _]] {
  def lift[G[_] : Monad, A](a: G[A]): F[G, A]
}

It takes a G and lifts it into another computation F thereby combining the effects into the composed monad. Let's look at a couple of examples in scalaz that use lift to compose a new effect into an existing one ..

// lift an Option into a List
// uses transLift available from the main pimp of scalaz

scala> List(10, 12).transLift[OptionT].runT
res12: List[Option[Int]] = List(Some(10), Some(12))

// uses explicit constructor methods

scala> optionT(List(some(12), some(50))).runT
res13: List[Option[Int]] = List(Some(12), Some(50))

If you are like me, you must have already started wondering about the practical usage of this monad transformer thingy. Really we want to see them in action in some meaningful code which we write in our daily lives.

In the paper titled Monad Transformers Step by Step, Martin Grabmuller does this in Haskell, evolving a complete interpreter for a small language using these abstractions and highlighting how they contribute towards an effective functional model of your code. In this post I render some of the Scala manifestations of those examples using scalaz 7 as the library of implementation. The important point that Martin also mentions in his paper is that you need to think functionally and organize your code upfront using monadic structures in order to take full advantage of incremental enrichment through monad transformers.

We will be writing an interpreter for a very small language. I will first define the base abstractions and start with a purely functional implementation as the base. It does not contain much of the useful stuff like state management, error handling etc., which I will add incrementally using monad transformers. We will see how the core model remains the same, transformers get added in layers and the static type of the interpreter function states explicitly what effects have been added to it.

The Language

Here's the language for which we will be writing the interpreter. Pretty basic stuff with literal integers, variables, addition, λ expressions (abstraction) and function application. By abstraction and application I mean lambda terms .. so a quick definition for the uninitiated ..


- a lambda term may be a variable, x
- if t is a lambda term, and x is a variable, then λx.t is a lambda term (called a lambda abstraction)
- if t and s are lambda terms, then ts is a lambda term (called an application)


and the Scala definitions for the language elements ..

// variable names
type Name = String
  
// Expression types
trait Exp
case class Lit(i: Int) extends Exp
case class Var(n: Name) extends Exp
case class Plus(e1: Exp, e2: Exp) extends Exp
case class Abs(n: Name, e: Exp) extends Exp
case class App(e1: Exp, e2: Exp) extends Exp
  
// Value types
trait Value
case class IntVal(i: Int) extends Value
case class FunVal(e: Env, n: Name, exp: Exp) extends Value

// Environment in which the λ-abstraction will be evaluated
type Env = collection.immutable.Map[Name, Value]

// defining additional data constructors because in Scala
// typeclass resolution and variance often give some surprises with type inferencing

object Values {
  def intval(i: Int): Value = IntVal(i)
  def funval(e: Env, n: Name, exp: Exp): Value = FunVal(e, n, exp)
}

The Reference Implementation

I start with the base implementation, which is a functional model of the interpreter. It contains only the basic stuff for evaluation and has no monadic structure. Incrementally we will start having fun with this ..

def eval0: Env => Exp => Value = { env => exp =>
  exp match {
    case Lit(i) => IntVal(i)
    case Var(n) => (env get n).get
    case Plus(e1, e2) => {
      val IntVal(i1) = eval0(env)(e1)
      val IntVal(i2) = eval0(env)(e2)
      IntVal(i1 + i2)
    }
    case Abs(n, e) => FunVal(env, n, e)
    case App(e1, e2) => {
      val val1 = eval0(env)(e1)
      val val2 = eval0(env)(e2)
      val1 match {
        case FunVal(e, n, exp) => eval0((e + ((n, val2))))(exp)
      }
    }
  }
}

Note we assume that we have the proper matches everywhere - the Map lookup in processing variables (Var) doesn't fail and we have the proper function value when we go for the function application. So things look happy for the correct paths of expression evaluation ..

// Evaluate: 12 + ((λx -> x)(4 + 2))

scala> val e1 = Plus(Lit(12), App(Abs("x", Var("x")), Plus(Lit(4), Lit(2))))
e1: Plus = Plus(Lit(12),App(Abs(x,Var(x)),Plus(Lit(4),Lit(2))))

scala> eval0(collection.immutable.Map.empty[Name, Value])(e1)
res4: Value = IntVal(18)

Go Monadic

Monad transformers give you layers of control over the various aspects of your computation. But for that to happen you need to organize your code in a monadic way. Think of it like this - if your code models the computations of your domain (aka the domain logic) as per the contracts of an abstraction you can very well compose more of similar abstractions in layers without directly poking into the underlying implementation.

Let's do one thing - let's transform the above function into a monadic one that doesn't add any effect. It only sets up the base case for other monad transformers to prepare their playing fields. It's the Identity monad, which simply applies the bound function to its input without any additional computational effect. In scalaz 7 Identity simply wraps a value and provides a map and flatMap for bind.
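Just to get a feel for how little Identity does, here's a minimal sketch of such a monad - not the scalaz source, just the shape: a wrapped value with a map and a flatMap, which is all a for comprehension needs ..

// a minimal sketch of an identity monad (named Id here to avoid confusing it with scalaz's Identity)
case class Id[A](value: A) {
  def map[B](f: A => B): Id[B] = Id(f(value))
  def flatMap[B](f: A => Id[B]): Id[B] = f(value)
}

// map / flatMap do nothing except apply the function, so this evaluates to Id(18)
val r = for { a <- Id(12); b <- Id(6) } yield a + b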

Here's our next iteration of eval, this time with the Identity monad baked in .. eval0 was returning Value, eval1 returns Identity[Value] - the return type makes it explicit that we are now in the land of monads and have wrapped ourselves into a computational structure which can only be manipulated through the bounds of the contract that the monad allows.

type Eval1[A] = Identity[A]

def eval1: Env => Exp => Eval1[Value] = {env => exp =>
  exp match {
    case Lit(i) => intval(i).point[Eval1]
    case Var(n) => (env get n).get.point[Eval1]
    case Plus(e1, e2) => for {
      i <- eval1(env)(e1)
      j <- eval1(env)(e2)
    } yield {
      val IntVal(i1) = i
      val IntVal(i2) = j
      IntVal(i1 + i2)
    }
    case Abs(n, e) => funval(env, n, e).point[Eval1]
    case App(e1, e2) => for {
      val1 <- eval1(env)(e1)
      val2 <- eval1(env)(e2)
    } yield {
      val1 match {
        case FunVal(e, n, exp) => eval1((e + ((n, val2))))(exp)
      }
    }
  }
}

All returns are now monadic, though the basic computation remains the same. The Lit, Abs and the Var cases use the point function (pure in scalaz 6), the equivalent of Haskell's return. Plus and App use the for comprehension to evaluate the monadic action. Here's the result on the REPL ..

scala> eval1(collection.immutable.Map.empty[Name, Value])(e1)
res7: Eval1[Value] = scalaz.Identity$$anon$2@18f67fc

scala> res7.value
res8: Value = IntVal(18)

So the Identity monad has successfully installed itself making our computational model like an onion peel on which we can now stack up additional effects.

Handling Errors

In eval1 we have a monadic functional model of our computation. But we have not yet handled any errors that may arise from the computation. And I promised that we will add such effects incrementally without changing the guts of your model.

As a very first step, let's use a monad transformer that helps us handle errors, not by throwing exceptions (exceptions are bad .. right?) but by wrapping the error conditions in yet another abstraction. Needless to say this also has to be monadic because we would like it to compose with our already implemented Identity monad and the others that we will work out later on.

scalaz 7 offers EitherT which we can use as the Error monad transformer. It is defined as ..

sealed trait EitherT[A, F[_], B] {
  val runT: F[Either[A, B]]
  //..
}

It adds the EitherT computation on top of F so that the composed monad will have both effects. And as with Either we use the Left A for the error condition and the Right B for returning the result. The plus point of using the monad transformer is that the plumbing of the two monads is taken care of by the implementation of EitherT, so that we can simply define the following ..

type Eval2[A] = EitherT[String, Identity, A]

def eval2a: Env => Exp => Eval2[Value] = {env => exp =>
  //..
}

The error will be reported as a String and the Value will be returned in the Right constructor of Either. Our return type is also explicit about what the function does. You can simply change the return type to Eval2 and keep the rest of the function the same as eval1. It works perfectly like the earlier one. Since we have not yet coded explicitly for the error conditions, appropriate error messages will not appear, but the happy paths execute as earlier even with the changed return type. This is because Identity was a monad and so is the newly composed one consisting of Identity and EitherT.
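To make that concrete, here's a sketch of what eval2a looks like under that claim - the body is eval1's with Eval1 replaced by Eval2, and no Left is produced anywhere yet ..

def eval2a: Env => Exp => Eval2[Value] = {env => exp =>
  exp match {
    case Lit(i) => intval(i).point[Eval2]           // was point[Eval1]
    case Var(n) => (env get n).get.point[Eval2]     // still blows up on an unbound variable
    case Abs(n, e) => funval(env, n, e).point[Eval2]
    // Plus and App: the same for comprehensions as in eval1, now running in Eval2
    //..
  }
}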

We can run eval2a and the only difference in output will be that the result will be wrapped in a Right constructor ..

scala> val e1 = Plus(Lit(12), App(Abs("x", Var("x")), Plus(Lit(4), Lit(2))))
e1: Plus = Plus(Lit(12),App(Abs(x,Var(x)),Plus(Lit(4),Lit(2))))

scala> eval2a(collection.immutable.Map.empty[Name, Value])(e1)
res31: Eval2[Value] = scalaz.EitherTs$$anon$2@ad2f60

scala> res31.runT.value
res33: Either[String,Value] = Right(IntVal(18))

We can do a couple more iterations improving upon how we handle errors using EitherT and issue appropriate error messages to the user. Here's the final version that has all error handling implemented. Note however that the core model remains the same - we have only added the Left handling for error conditions ..

def eval2: Env => Exp => Eval2[Value] = {env => exp =>
  exp match {
    case Lit(i) => intval(i).point[Eval2]

    case Var(n) => (env get n).map(v => rightT[String, Identity, Value](v))
                              .getOrElse(leftT[String, Identity, Value]("Unbound variable " + n))
    case Plus(e1, e2) => 
      val r = 
        for {
          i <- eval2(env)(e1)
          j <- eval2(env)(e2)
        } yield((i, j))

      r.runT.value match {
        case Right((IntVal(i_), IntVal(j_))) => rightT(IntVal(i_ + j_))
        case Left(s) => leftT("type error in Plus" + "/" + s)
        case _ => leftT("type error in Plus")
      }

    case Abs(n, e) => funval(env, n, e).point[Eval2]

    case App(e1, e2) => 
      val r =
        for {
          val1 <- eval2(env)(e1)
          val2 <- eval2(env)(e2)
        } yield((val1, val2))

      r.runT.value match {
        case Right((FunVal(e, n, exp), v)) => eval2(e + ((n, v)))(exp)
        case _ => leftT("type error in App")
      }
  }
}

How about some State ?

Let's add some mutable state in the function using the State monad. So now we need to stack up our pile of transformers with yet another effect. We would like to add some profiling capabilities that track invocation of every pattern in the evaluator. For simplicity we just count the number of invocations as an integer and report it along with the final output. We define the new monad by wrapping a StateT constructor around the innermost monad, Identity. So now our return type becomes ..

type Eval3[A] = EitherT[String, StateTIntIdentity, A]

We layer the StateT between EitherT and Identity - hence we need to form a composition between StateT and Identity that goes as the constructor to EitherT. This is defined as StateTIntIdentity, where we make the state an Int. And we define this as a type lambda as follows ..

type StateTIntIdentity[α] = ({type λ[α] = StateT[Int, Identity, α]})#λ[α]

Intuitively our returned value in case of a successful evaluation will be a tuple2 (Either[String, Value], Int), as we will see shortly.
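Here's one way to see where that tuple comes from - peeling the layers of Eval3 one at a time (an informal sketch, ignoring the wrapper classes) ..

// Eval3[Value]
//   = EitherT[String, StateTIntIdentity, Value]       (EitherT wraps an F[Either[A, B]])
//   ~ StateTIntIdentity[Either[String, Value]]
//   = StateT[Int, Identity, Either[String, Value]]    (a state transition over the Int counter)
// which, stripped of the wrapper classes, is just a function from the seed to (result, final count)
type Eval3Unrolled = Int => (Either[String, Value], Int)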

We write a couple of helper functions that manage the state by incrementing a counter and lifting the result into a StateT monad and finally lifting everything into the EitherT.

def stfn(e: Either[String, Value]) = (s: Int) => id[(Either[String, Value], Int)](e, s+1)

def eitherNStateT(e: Either[String, Value]) =
  eitherT[String, StateTIntIdentity, Value](stateT[Int, Identity, Either[String, Value]](stfn(e)))

And here's the eval3 function that does the evaluation along with profiling and error handling ..

def eval3: Env => Exp => Eval3[Value] = {env => exp => 
  exp match {
    case Lit(i) => eitherNStateT(Right(IntVal(i)))

    case Plus(e1, e2) =>
      def appplus(v1: Value, v2: Value) = (v1, v2) match {
        case ((IntVal(i1), IntVal(i2))) => eitherNStateT(Right(IntVal(i1 + i2))) 
        case _ => eitherNStateT(Left("type error in Plus"))
      }
      for {
        i <- eval3(env)(e1)
        j <- eval3(env)(e2)
        v <- appplus(i, j)
      } yield v

    case Var(n) => 
      val v = (env get n).map(Right(_))
                         .getOrElse(Left("Unbound variable " + n))
      eitherNStateT(v)

    case Abs(n, e) => eitherNStateT(Right(FunVal(env, n, e)))

    case App(e1, e2) => 
      def appfun(v1: Value, v2: Value) = v1 match {
        case FunVal(e, n, body) => eval3(e + ((n, v2)))(body)
        case _ => eitherNStateT(Left("type error in App"))
      }

      val s =
        for {
          val1 <- eval3(env)(e1)
          val2 <- eval3(env)(e2)
          v    <- appfun(val1, val2)
        } yield v

      val ust = s.runT.value.usingT((x: Int) => x + 1)
      eitherT[String, StateTIntIdentity, Value](ust)
  }
}

We run the above function through another helper runEval3 that also takes the seed value of the state ..

def runEval3: Env => Exp => Int => (Either[String, Value], Int) = { env => exp => seed => 
  eval3(env)(exp).runT.value.run(seed)
}

Here's the REPL session with runEval3 ..

scala> val e1 = Plus(Lit(12), App(Abs("x", Var("x")), Plus(Lit(4), Lit(2))))
e1: Plus = Plus(Lit(12),App(Abs(x,Var(x)),Plus(Lit(4),Lit(2))))
scala> runEval3(env)(e1)(0)
res25: (Either[String,Value], Int) = (Right(IntVal(18)),8)

// -- failure case --
scala> val e2 = Plus(Lit(12), App(Abs("x", Var("y")), Plus(Lit(4), Lit(2))))
e2: Plus = Plus(Lit(12),App(Abs(x,Var(y)),Plus(Lit(4),Lit(2))))

scala> runEval3(env)(e2)(0)
res27: (Either[String,Value], Int) = (Left(Unbound variable y),7)

In case you are interested, the whole code base is there in my github repository. Feel free to check it out. I will be adding a couple more transformers for hiding the environment (ReaderT) and logging (WriterT) and also IO.

Monday, March 28, 2011

Killing a var and Threading a State

In my earlier post on CQRS using functional domain models and Akka actors, I had implemented a data store that accumulates all events associated with individual trades. We call this EventSourcing, which allows us to roll back our system in time and replay all events over an earlier snapshot to bring it up to date. This has many uses, like detecting problems that might have occurred in the past or profiling your system retroactively. This post is not about event sourcing or the virtues of CQRS.

In this post I take a cue from the EventSourcing example and discuss some strategies for improving some aspects of the domain model. This is mostly related to raising the level of abstraction at which to program the solution domain. Consider this post to be a random rant that records some of my iterations in the evolution of the domain model.

The code snippets that I present below may sometimes look out of context since they're part of a bigger model. The entire prototype is there in my github repository ..

Story #1 : How to kill a var

Have a look at the event store that I implemented earlier ..

class EventStore extends Actor with Listeners {
  private var events = Map.empty[Trade, List[TradeEvent]]

  def receive = //..
  //..
}


With an actor based implementation the mutable var events is ok, since the state is confined within the actor itself. In another implementation where I was using a different synchronous event store, I had to get around this mutable shared state. It was around this time that Tony Morris published his Writer Monad in Scala. This looked like a perfect fit .. Here's the implementation of the abstraction that logs all events, shamelessly adapted from Tony's Logging Without Side-effects example ..

import TradeModel._
object EventLog {
  type LOG = List[(Trade, TradeEvent)]
}

import EventLog._

case class EventLogger[A](log: LOG, a: A) {
  def map[B](f: A => B): EventLogger[B] =
    EventLogger(log, f(a))

  def flatMap[B](f: A => EventLogger[B]): EventLogger[B] = {
    val EventLogger(log2, b) = f(a)
    EventLogger(log ::: log2 /* accumulate */, b)
  }
}

object EventLogger {
  implicit def LogUtilities[A](a: A) = new {
    def nolog =
      EventLogger(Nil /* empty */, a)

    def withlog(log: (Trade, TradeEvent)) =
      EventLogger(List(log), a)

    def withvaluelog(log: A => (Trade, TradeEvent)) =
      withlog(log(a))
  }
}


and here's a snippet that exercises the logging process ..

import EventLogger._

val trd = makeTrade("a-123", "google", "r-123", HongKong, 12.25, 200).toOption.get

val r = for {
  t1 <- enrichTrade(trd) withlog (trd, enrichTrade)
  t2 <- addValueDate(t1) withlog (trd, addValueDate)
} yield t2


Now I can check what events have been logged and do some processing on the event store ..

// get the log from the EventLogger grouped by trade
val m = r.log.groupBy(_._1)

// play the event on the trades to get the current snapshot
val x =
  m.keys.map { t =>
    m(t).map(_._2).foldLeft(t)((a,e) => e(a))
  }

// check the results
x.size should equal(1)
x.head.taxFees.get.size should equal(2) 
x.head.netAmount.get should equal(3307.5000)


Whenever you're appending data to an abstraction, consider using the Writer monad. With this I killed the var events from my event store model.

Story #2: Stateful a la carte

This is a story that gives a tip for handling changing state of a domain model in a functional way. When the state does not change and you're using an abstraction repeatedly for reading, you have the Reader monad.

// enrichment of trade
// Reader monad
val enrich = for {
  taxFeeIds      <- forTrade // get the tax/fee ids for a trade
  taxFeeValues   <- taxFeeCalculate // calculate tax fee values
  netAmount      <- enrichTradeWith // enrich trade with net amount
}
yield((taxFeeIds map taxFeeValues) map netAmount)

val trd = makeTrade("a-123", "google", "r-123", HongKong, 12.25, 200)
(trd map enrich) should equal(Success(Some(3307.5000)))


Note how we derive the enrichment information for the trade keeping the original abstraction immutable - Reader monad FTW.

But what happens when you need to handle state that changes in the lifecycle of the abstraction ? You can use the State monad itself. It allows you to thread a changing state across a sequence transparently at the monad definition level. Here's how I would enrich a trade using the State monad as implemented in scalaz ..

val trd = makeTrade("a-123", "google", "r-123", HongKong, 12.25, 200).toOption.get

val x =
  for {
    _ <- init[Trade]
    _ <- modify((t: Trade) => refNoLens.set(t, "XXX-123"))
    u <- modify((t: Trade) => taxFeeLens.set(t, some(List((TradeTax, 102.25), (Commission, 25.65)))))
  } yield(u)

x ~> trd == trd.copy(refNo = "XXX-123")


In case you're just wondering what's going on above, here's a bit of an explanation. init initializes the state for Trade. init is defined as ..

def init[S]: State[S, S] = state[S, S](s => (s, s))


modify does the state change with the function passed to it as the argument ..

def modify[S](f: S => S) = init[S] flatMap (s => state(_ => (f(s), ())))


We apply the series of modify to enrich the trade. The actual threading takes place transparently through the magic of comprehensions.
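In case the magic feels too magical, here's roughly what the comprehension above desugars to (a hand-expanded sketch) - each step is a State over Trade, and flatMap hands the modified Trade from one step to the next ..

// a hand-expanded sketch of the for comprehension above
val enriched =
  init[Trade] flatMap { _ =>
    modify((t: Trade) => refNoLens.set(t, "XXX-123")) flatMap { _ =>
      modify((t: Trade) => taxFeeLens.set(t, some(List((TradeTax, 102.25), (Commission, 25.65)))))
    }
  }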

There is another way of securely encapsulating stateful computations that allow in-place updates - all in the context of functional programming. This is the ST monad which has very recently been inducted into scalaz. But that is the subject of another post .. sometime later ..

Monday, February 14, 2011

Applicatives for composable JSON serialization in Scala

It has been quite some time, and I have decided to play around with sjson once again. For the convenience of those who are not familiar with sjson, it's a tiny JSON serialization library that can serialize and de-serialize Scala objects. sjson offers two ways in which you can serialize your Scala objects :-
  1. typeclass based serialization, where you define your own protocol (typeclass instances) for your own objects. The standard ones, of course come out of the box.
  2. reflection based serialization, where you provide a bunch of annotations and sjson looks up reflectively and tries to get your objects serialized and de-serialized.
One of the things which bothered me in both the implementations is the way errors are handled. Currently I use exceptions to report errors in serializing / de-serializing. Exceptions, as you know, are side-effects and don't compose. Hence even though your input JSON value has many keys that don't match the names in your Scala class, errors are reported one by one.

scalaz is a Haskell-like library for Scala that offers a myriad of options towards pure functional programming. I have been playing around with scalaz recently, particularly the typeclasses for Applicatives. I have also blogged on some of the compositional features that scalaz offers that help make your code much more declarative, concise and composable.

The meat of scalaz is based on the two most potent forces that Scala offers towards data type generic programming :-
  1. typeclass encoding using implicits and
  2. ability to abstract over higher kinded types (type constructor polymorphism)
Using these features scalaz has made lots of operations available to a large family of data structures, which were otherwise available only for a smaller subset in the Scala standard library. Another contribution of scalaz has been to make many of the useful abstractions first class in Scala e.g. Applicative, Monad, Traversable etc. All of these are available in Haskell as typeclass hierarchies - so now you can use the goodness of these abstractions in Scala as well.

One of the areas which I focused on in sjson using scalaz is to make error reporting composable. Have a look at the following snippet ..

// an immutable value object in Scala
case class Address(no: Int, street: String, city: String, zip: String)
 
// typeclass instance for sjson serialization protocol for Address
object AddressProtocol extends DefaultProtocol {
 
 implicit object AddressFormat extends Format[Address] {
   def reads(json: JsValue): ValidationNEL[String, Address] = json match {
     case m@JsObject(_) => 
       (field[Int]("no", m)        |@| 
        field[String]("street", m) |@| 
        field[String]("city", m)   |@| 
        field[String]("zip", m)) { Address }
 
     case _ => "JsObject expected".fail.liftFailNel
   }
 //..
 }
}


In the current version of sjson, reads returns an Address. Now it returns an applicative, ValidationNEL[String, Address], which is a synonym for Validation[NonEmptyList[String], Address]. Validation is isomorphic to scala.Either in the sense that it has two separate types for error and success. But it has a much cleaner API and does not leave the choice to convention. In our case, since we will be accumulating errors, we choose to use a List type for the error part. As a general implementation strategy, when Validation is used as an Applicative, the error type is modeled as a Semigroup that offers an append operation. Have a look at scalaz for details of how you can use Validation as an applicative for cumulative error reporting.
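To see what the applicative gives us over vanilla Either, here's a toy sketch of the accumulation idea - this is not the scalaz implementation, just the intuition, with a List of error strings playing the role of the semigroup ..

// a toy sketch of applicative style error accumulation (not scalaz's Validation)
def map2[A, B, C](va: Either[List[String], A], vb: Either[List[String], B])
                 (f: (A, B) => C): Either[List[String], C] =
  (va, vb) match {
    case (Right(a), Right(b)) => Right(f(a, b))
    case (Left(e1), Left(e2)) => Left(e1 ::: e2)  // both failed: append the errors
    case (Left(e1), _)        => Left(e1)
    case (_, Left(e2))        => Left(e2)
  }

Chain a few of these and every failure in the chain survives to the end, which is exactly the behavior we want from reads.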

Let's see what happens in the above snippet ..

1. field extracts the value of the relevant field (whose name is passed as the first argument) from the JsObject. Incidentally JsObject is from Nathan Hamblen's dispatch-json, which sjson uses under the covers. More on dispatch-json's awesomeness later :). Here's how I define field .. Note that if the name is not available, it gives us a Failure on the Validation.

def field[T](name: String, js: JsValue)(implicit fjs: Reads[T]): ValidationNEL[String, T] = {
 val JsObject(m) = js
 m.get(JsString(name))
  .map(fromjson[T](_)(fjs))
  .getOrElse(("field " + name + " not found").fail.liftFailNel)
}


2. field invocations are composed using the |@| combinator of scalaz, which gives us an ApplicativeBuilder that allows me to play around with the elements that it composes. In the above snippet we simply pass these components to build up an instance of the Address class.

Since Validation is an Applicative, all errors that come up during composition of field invocations get accumulated in the final list that occurs as the error type of it.
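Here's a tiny standalone illustration of that behavior, with some hypothetical values built using the same combinators as above ..

// hypothetical values, just to show |@| accumulating failures
val ok: ValidationNEL[String, Int]   = 42.success
val bad1: ValidationNEL[String, Int] = "field no not found".fail.liftFailNel
val bad2: ValidationNEL[String, Int] = "field zip not found".fail.liftFailNel

(ok |@| ok) { _ + _ }      // Success(84)
(bad1 |@| bad2) { _ + _ }  // Failure(NonEmptyList("field no not found", "field zip not found"))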

Let's first look at the normal usecase where things are happy and we get an instance of Address constructed from the parsed json. No surprises here ..

// test case
it ("should serialize an Address") {
 import Protocols._
 import AddressProtocol._   // typeclass instances
 val a = Address(12, "Tamarac Square", "Denver", "80231")
 fromjson[Address](tojson(a)) should equal(a.success)
}


But what happens if there are some errors in the typeclass instance that you created ? Things start to get interesting from here ..

implicit object AddressFormat extends Format[Address] {
 def reads(json: JsValue): ValidationNEL[String, Address] = json match {
   case m@JsObject(_) => 
     (field[Int]("number", m) |@| 
      field[String]("stret", m) |@| 
      field[String]("City", m) |@| 
      field[String]("zip", m)) { Address }
 
   case _ => "JsObject expected".fail.liftFailNel
 }
 //..
}


Note that the key names passed to the field API do not match the field names in the Address class. Deserialization fails and we get a nice list of all errors reported as part of the Failure type ..

it ("address serialization should fail") {
  import Protocols._
  import IncorrectPersonProtocol._
  val a = Address(12, "Tamarac Square", "Denver", "80231")
  (fromjson[Address](tojson(a))).fail.toOption.get.list 
    should equal (List("field number not found", "field stret not found", "field City not found"))
}


Composability .. Again!

A layer of monads on top of your API makes your API composable with any other monad in the world. With sjson de-serialization returning a Validation, we can get better composability when writing complex serialization code like the following. Consider this JSON string from which we need to pick up fields selectively and make a Scala object ..

val jsonString = 
  """{
       "lastName" : "ghosh", 
       "firstName" : "debasish", 
       "age" : 40, 
       "address" : { "no" : 12, "street" : "Tamarac Square", "city" : "Denver", "zip" : "80231" }, 
       "phone" : { "no" : "3032144567", "ext" : 212 },
       "office" :
        {
          "name" : "anshinsoft",
          "address" : { "no" : 23, "street" : "Hampden Avenue", "city" : "Denver", "zip" : "80245" } 
        }
     }"""


We would like to cherry pick a few of the fields from here and create an instance of Contact class ..

case class Contact(lastName: String, firstName: String, 
  address: Address, officeCity: String, officeAddress: Address)


Try this with the usual approach as shown above and you will find some boilerplate repetition within your implementation ..

import dispatch.json._
import Js._

val js = Js(jsonString) // js is a JsValue

(field[String]("lastName", js)    |@| 
 field[String]("firstName", js)   |@| 
 field[Address]("address", js)    |@| 
 field[String]("city", (('office ! obj) andThen ('address ? obj))(js)) |@|
 field[Address]((('office ! obj) andThen ('address ! obj)), js)) { Contact } should equal(c.success)


Have a look at how we need to repeatedly pass around js, though we never modify it. Since our field API is monadic, we can compose all invocations of field together with a Reader monad. This is a very useful technique of API composition which I discussed in an earlier blog post. (Here's a bit of trivia: how can we compose similar stuff when there's modification involved in the passed-around state? Hint: the answer is within the question itself :D)

But for that we need to make a small change in our field API. We need to make it curried .. Here are 2 variants of the curried field API ..

// curried version: for lookup of a String name
def field_c[T](name: String)(implicit fjs: Reads[T]) = { js: JsValue =>
  val JsObject(m) = js
  m.get(JsString(name)).map(fromjson[T](_)(fjs)).getOrElse(("field " + name + " not found").fail.liftFailNel)
}

// curried version: we need to get a complete JSON object out
def field_c[T](f: (JsValue => JsValue))(implicit fjs: Reads[T]) = { js: JsValue =>
  try {
    fromjson[T](f(js))(fjs)
  } catch {
    case e: Exception => e.getMessage.fail.liftFailNel
  }
}


Note how in the second variant of field_c, we use the extractors of dispatch-json to take out nested objects from a JsValue structure. We use it below to get the office address from within the parsed JSON.

And here's how we compose all lookups monadically and finally come up with the Contact instance ..

// reader monad
val contact =
  for {
    last    <- field_c[String]("lastName")
    first   <- field_c[String]("firstName")
    address <- field_c[Address]("address")
    office  <- field_c[Address]((('office ! obj) andThen ('address ! obj)))
  }
  yield(last |@| first |@| address |@| office)

// city needs to be parsed separately since we are working on part of js
val city = field_c[String]("city")

// compose everything and build a Contact
(contact(js) |@| city((('office ! obj) andThen ('address ? obj))(js))) { 
  (last, first, address, office, city) => 
    Contact(last, first, address, city, office) } should equal(c.success)


I am still toying around with some of the monadic implementations of the sjson APIs. It's offered as a separate package and will make a nice addition to the API families that sjson offers. You can have a look at my github repo for more details. I plan to finalize it soon, before I get to 1.0.