Ruminations of a Programmer: cqrs

Showing posts with label cqrs. Show all posts

Sunday, January 15, 2012

Event Sourcing, Akka FSMs and functional domain models

I blogged on Event Sourcing and functional domain models earlier. In this post I would like to share more of my thoughts on the same subject and how with a higher level of abstraction you can make your domain aggregate boundary more resilient and decoupled from external references.

When we talk about a domain model, the Aggregate takes the centerstage. An aggregate is a core abstraction that represents the time invariant part of the domain. It's an embodiment of all states that the aggregate can be in throughout its lifecycle in the system. So, it's extremely important that we take every pain to distil the domain model and protect the aggregate from all unwanted external references. Maybe an example will make it clearer.

Keeping the Aggregate pure

Consider a Trade model as the aggregate. By Trade, I mean a security trade that takes place in the stock exchange where counterparties exchange securities and currencies for settlement. If you're a regular reader of my blog, you must be aware of this, since this is almost exclusively the domain that I talk of in my blog posts.

A trade can be in various states like newly entered, value date added, enriched with tax and fee information, net trade value computed etc. In a trading application, as a trade passes through the processing pipeline, it moves from one state to another. The final state represents the complete Trade object which is ready to be settled between the counterparties.

In the traditional model of processing we have the final snapshot of the aggregate - what we don't have is the audit log of the actual state transitions that happened in response to the events. With event sourcing we record the state transitions as a pipeline of events which can be replayed any time to rollback or roll-forward to any state of our choice. Event sourcing is coming up as one of the potent ways to model a system and there are lots of blog posts being written to discuss about the various architectural strategies to implement an event sourced application.

That's ok. But whose responsibility is it to manage these state transitions and record the timeline of changes ? It's definitely not the responsibility of the aggregate. The aggregate is supposed to be a pure abstraction. We must design it as an immutable object that can respond to events and transform itself into the new state. In fact the aggregate implementation should not be aware of whether it's serving an event sourced architecture or not.

There are various ways you can model the states of an aggregate. One option that's frequently used involves algebraic data types. Model the various states as a sum type of products. In Scala we do this as case classes ..

sealed abstract class Trade {
  def account: Account
  def instrument: Instrument
  //..
}

case class NewTrade(..) extends Trade {
  //..
}

case class EnrichedTrade(..) extends Trade {
  //..
}

Another option may be to have one data type to model the Trade and model states as immutable enumerations with changes being effected on the aggregate as functional updates. No in place mutation, but use functional data structures like zippers or type lenses to create the transformed object in the new state. Here's an example where we create an enriched trade out of a newly created one ..

// closure that enriches a trade
val enrichTrade: Trade => Trade = {trade =>
  val taxes = for {
    taxFeeIds      <- forTrade // get the tax/fee ids for a trade
    taxFeeValues   <- taxFeeCalculate // calculate tax fee values
  }
  yield(taxFeeIds ° taxFeeValues)
  val t = taxFeeLens.set(trade, taxes(trade))
  netAmountLens.set(t, t.taxFees.map(_.foldl(principal(t))((a, b) => a + b._2)))
}

But then we come back to the same question - if the aggregate is distilled to model the core domain, who handles the events ? Someone needs to model the event changes, effect the state transitions and take the aggregate from one state to the next.

Enter Finite State Machines

In one of my projects I used the domain service layer to do this. The domain logic for effecting the changes lies with the aggregate, but they are invoked from the domain service in response to events when the aggregate reaches specific states. In other words I model the domain service as a finite state machine that manages the lifecycle of the aggregate.

In our example a Trading Service can be modeled as an FSM that controls the lifecycle of a Trade. As the following ..

import TradeModel._

class TradeLifecycle(trade: Trade, timeout: Duration, log: Option[EventLog]) 
  extends Actor with FSM[TradeState, Trade] {
  import FSM._

  startWith(Created, trade)

  when(Created) {
    case Event(e@AddValueDate, data) =>
      log.map(_.appendAsync(data.refNo, Created, Some(data), e))
      val trd = addValueDate(data)
      notifyListeners(trd) 
      goto(ValueDateAdded) using trd forMax(timeout)
  }

  when(ValueDateAdded) {
    case Event(StateTimeout, _) =>
      stay

    case Event(e@EnrichTrade, data) =>
      log.map(_.appendAsync(data.refNo, ValueDateAdded, None,  e))
      val trd = enrichTrade(data)
      notifyListeners(trd)
      goto(Enriched) using trd forMax(timeout)
  }

  when(Enriched) {
    case Event(StateTimeout, _) =>
      stay

    case Event(e@SendOutContractNote, data) =>
      log.map(_.appendAsync(data.refNo, Enriched, None,  e))
      sender ! data
      stop
  }

  initialize
}

The snippet above contains a lot of other details which I did not have time to prune. It's actually part of the implementation of an event sourced trading application that uses asynchronous messaging (actors) as the backbone for event logging and reaching out to multiple consumers based on the CQRS paradigm.

Note that the FSM model above makes it very explicit about the states that the Trade model can reach and the events that it handles while in each of these states. Also we can use this FSM technique to log events (for event sourcing), notify listeners about the events (CQRS) in a very much declarative manner as implemented above.

Let me know in the comments what are your views on this FSM approach towards handling state transitions in domain models. I think it helps keep aggregates pure and helps design domain services that focus on serving specific aggregate roots.

I will be talking about similar stuff, Akka actor based event sourcing implementations and functional domain models in PhillyETE 2012. Please drop by if this interests you.

Monday, January 31, 2011

CQRS with Akka actors and functional domain models

Fighting with impedance mismatch has been quite a losing battle so far in the development of software systems. We fight mismatch to handle stateful interactions with a stateless protocol. We fight mismatch of paradigms between the user interface layers, domain layers and data layers. Nothing concrete has emerged till date, though there has been quite a number of efforts to keep the organization of persistent data as close as possible to the way the domain layer uses them.

Command Query separation is nothing new. It's yet another attempt to manage impedance mismatch between how your application uses data and how the underlying store manages data so that transactional updates can be served with equal agility as read-only queries. Bertrand Meyer made this distinction long back when he mentioned that ..

The features that characterize a class are divided into commands and queries. A command serves to modify objects, a query to return information about objects.

Processing a command involves manipulation of state - hence the underlying data model needs to be organized in a way that makes updation easier. A query needs to return data in the format the user wants to view them. Hence it makes sense to organize your storage likewise so that we don't need to process expensive joins in order to process queries. This leads to a dichotomy in the way the application, as a whole, requires processing of data. Command Query Separation (CQRS) endorses this separation. Commands update state - hence produce side-effects. Queries are like pure functions and should be designed using applicative, completely side-effect free approaches. So, the CQRS principle, as Bertrand Meyer said is ..

Functions should not produce abstract side-effects.

Greg Young has delivered some great sessions on DDD and CQRS. In 2008 he said "A single model cannot be appropriate for reporting, searching and transactional behavior". We have at least two models - one that processes commands and feeds changes to another model which serves user queries and reports. The transactional behavior of the application gets executed through the rich domain model of aggregates and repositories, while the queries are directly served from a de-normalized data model.

CQRS and Event Sourcing

One other concept that goes alongside CQRS is that of Event Sourcing. I blogged about some of the benefits that it has quite some time back and implemented event sourcing using Scala actors. The point where event sourcing meets CQRS is how we model the transactions of the domain (resulting from commands) as a sequence of events in the underlying persistent store. Modeling persistence of transactions as an event stream helps record updates as append only event snapshots that can be replayed as and when required. All updates in the domain model are now being translated into inserts in the persistence model. And this gives us an explicit view of all state changes in the domain model.

Over the last few days I have been playing around implementing CQRS and Event Sourcing within a domain model using the principles of functional programming and actor based asynchronous messaging. One of the big challenges is to model updates in a functional way and store them as sequences of event streams. In this post, I will share some of the experiences and implementation snippets that I came up with over the last few days. The complete implementation, so far, can be found in my github repository. It's very much a work in progress, which I hope to enrich more and more as I get some time.

A simple domain model

First the domain model and the aggregate root that will be used to publish events .. It's a ridiculously simple model for a security trade, with lots and lots of stuff elided for simplicity ..

// the main domain class

case class Trade(account: Account, instrument: Instrument, refNo: String, 

  market: Market, unitPrice: BigDecimal, quantity: BigDecimal, 

  tradeDate: Date = Calendar.getInstance.getTime, valueDate: Option[Date] = None, 

  taxFees: Option[List[(TaxFeeId, BigDecimal)]] = None, 

  netAmount: Option[BigDecimal] = None) {



  override def equals(that: Any) = refNo == that.asInstanceOf[Trade].refNo

  override def hashCode = refNo.hashCode

}

For simplicity, we make reference number a unique identifier for a trade. So all comparisons and equalities will be based on reference numbers only.

In a typical application, the entry point for users is the service layer that exposes facade methods that render use cases for the business. In a trading application, two of the most common services that need to be done on a security trade are it's value date computation and it's enrichment. So when the trade passes through its processing pipeline it gets its value date updated and then gets enriched with the applicable taxes and fees and finally its worth net cash value.

If you are a client using these services (again, overly elided for simplicity) you may have the following service methods ..

class TradingClient {

  // create a trade : wraps the model method

  def newTrade(account: Account, instrument: Instrument, refNo: String, market: Market,

    unitPrice: BigDecimal, quantity: BigDecimal, tradeDate: Date = Calendar.getInstance.getTime) =

      //..



  // enrich trade

  def doEnrichTrade(trade: Trade) = //..



  // add value date

  def doAddValueDate(trade: Trade) = //..



  // a sample query

  def getAllTrades = //..

}

In a typical implementation these methods will invoke the domain artifacts of repositories that will either query aggregate roots or do updates on them before being persisted in the underlying store. In a CQRS implementation, the domain model will be updated but the persistent store will record these updates as event streams.

So now we have the first problem - how do we represent updates in the functional world so that we can compose them later when we need to snapshot the persistent aggregate root?

Lenses FTW

I used type lenses for representing updates functionally. Lenses solve the problem of representing updates so that they can be composed. A lens between a set of source structures S and a set of target structures T is a pair of functions:
- get from S to T
- putback from T x S to S

For more on lenses, have a look at this presentation by Benjamin Pierce. scalaz contains lenses as part of its distribution and models a lens as a case class containing a pair of get and set functions ..

case class Lens[A,B](get: A => B, set: (A,B) => A) extends Immutable { //..

Here are some examples from my domain model for updating a trade with its value date or enriching it with tax/fee values and net cash value ..

// add tax/fees

val taxFeeLens: Lens[Trade, Option[List[(TaxFeeId, BigDecimal)]]] = 

  Lens((t: Trade) => t.taxFees, 

       (t: Trade, tfs: Option[List[(TaxFeeId, BigDecimal)]]) => t.copy(taxFees = tfs))



// add net amount

val netAmountLens: Lens[Trade, Option[BigDecimal]] = 

  Lens((t: Trade) => t.netAmount, 

       (t: Trade, n: Option[BigDecimal]) => t.copy(netAmount = n))



// add value date

val valueDateLens: Lens[Trade, Option[Date]] = 

  Lens((t: Trade) => t.valueDate, 

       (t: Trade, d: Option[Date]) => t.copy(valueDate = d))

We will use the above lenses for updation of our aggregate root and also wrap them into closures for subsequent feed into the event stream for persistent storage. In this example I have implemented in-memory persistence for both the command and the query store. Persistence into an on disk database will be available very soon at a github repository near you :)

Combinators that abstract state processing

Let's now define a couple of combinators that encapsulate our transactional service method implementations within the domain model. Note how the lenses have also been abstracted away from the client API as implementation artifacts. For details of these implementations please visit the github repo that contains a working model along with test cases.

// closure that enriches a trade with tax/fee information and net cash value

val enrichTrade: Trade => Trade = {trade =>

  val taxes = for {

    taxFeeIds      <- forTrade // get the tax/fee ids for a trade

    taxFeeValues   <- taxFeeCalculate // calculate tax fee values

  }

  yield(taxFeeIds map taxFeeValues)

  val t = taxFeeLens.set(trade, taxes(trade))

  netAmountLens.set(t, t.taxFees.map(_.foldl(principal(t))((a, b) => a + b._2)))

}



// closure for adding a value date

val addValueDate: Trade => Trade = {trade =>

  val c = Calendar.getInstance

  c.setTime(trade.tradeDate)

  c.add(Calendar.DAY_OF_MONTH, 3)

  valueDateLens.set(trade, Some(c.getTime))

}

We will now use these combinators to implement our transactional services which the TradingClient will invoke. Each of these service methods will do 2 things :-

1. effect the closure on the domain model and
2. as a side-effect stream the event into the command store

Sounds like a kestrel .. doesn't it ? Well here's a kestrel combinator and the above service methods realized in my CQRS implementation ..

// refer To Mock a Mockingbird

private[service] def kestrel[T](trade: T, proc: T => T)(effect: => Unit) = {

  val t = proc(trade)

  effect

  t

}



// enrich trade

def doEnrichTrade(trade: Trade) = 

  kestrel(trade, enrichTrade) { 

    ts ! TradeEnriched(trade, enrichTrade)

  }



// add value date

def doAddValueDate(trade: Trade) = 

  kestrel(trade, addValueDate) { 

    ts ! ValueDateAdded(trade, addValueDate)

Back to Akka!

It was only expected that I will be using Akka for transporting the event down to the command store. And this transport is implemented as a asynchronous side-effect of the service methods - just what the doctor ordered for an actor use case :)

With Event Sourcing and CQRS, one of the things that you would require is the ability to snapshot your persistent versions of the aggregate root. The current implementation is simple and does a zero based snapshotting i.e every time you ask for a snapshot, it replays the whole stream for that trade and gives you the current state. In typical real world systems, you do interval snapshotting and start replaying from the latest available snapshot in case you want to get the current state.

Here's our command store modeled as an Akka actor that processes the various events that it receives from an upstream server ..

// CommandStore modeled as an actor

class CommandStore(qryStore: ActorRef) extends Actor {

  private var events = Map.empty[Trade, List[TradeEvent]]



  def receive = {

    case m@TradeEnriched(trade, closure) => 

      events += ((trade, events.getOrElse(trade, List.empty[TradeEvent]) :+ closure))

      qryStore forward m

    case m@ValueDateAdded(trade, closure) => 

      events += ((trade, events.getOrElse(trade, List.empty[TradeEvent]) :+ closure))

      qryStore forward m

    case Snapshot => 

      self.reply(events.keys.map {trade =>

        events(trade).foldLeft(trade)((t, e) => e(t))

      })

  }

}

Note how the Snapshot message is processed as a fold over all the accumulated closures starting with the base trade. Also the command store adds the event to its repository (which is currently an in memory collection) and forwards the event to the query store. There we can have the trade modeled as per the requirements of the query / reporting client. For simplicity the current example assumes the model as the same as our domain model presented above.

Here's the query store, also an actor, that persists the trades on receiving relevant events from the command store. In effect the command store responds to messages that it receives from an upstream TradingServer and asynchronously updates the query store with the latest state of the trade.

// QueryStore modeled as an actor

class QueryStore extends Actor {

  private var trades = new collection.immutable.TreeSet[Trade]()(Ordering.by(_.refNo))



  def receive = {

    case TradeEnriched(trade, closure) => 

      trades += trades.find(_ == trade).map(closure(_)).getOrElse(closure(trade))

    case ValueDateAdded(trade, closure) => 

      trades += trades.find(_ == trade).map(closure(_)).getOrElse(closure(trade))

    case QueryAllTrades =>

      self.reply(trades.toList)

  }

}

And here's a sample sequence diagram that illustrates the interactions that take place for a sample service call by the client in the CQRS implementation ..

The full implementation also contains the complete wiring of the above abstractions along with Akka's fault tolerant supervision capabilities. A complete test case is also included along with the distribution.

Have fun!