Ruminations of a Programmer: smullyan

Sunday, September 27, 2009

The Thrush combinator in Scala

In his book To Mock a Mockingbird, Raymond Smullyan teaches combinatory logic using songbirds in a forest. He derives important results combining various combinators, all using the birds of the enchanted forest. Combinators are an effective tool in designing abstractions with functional programming principles. They are reusable units, make you code very concise, without losing on the expressivity. More than the implementation, the names can go a long way in establishing a common vocabulary of programming idioms and techniques. In an earlier post, I talked about Kestrel, and its implementation in Scala for handling side-effects in an abstraction.

In this post, I look at Thrush, a permuting combinator. A Thrush is defined by the following condition: Txy = yx. Thrush reverses the order of evaluation. Raganwald talks about Thrush and its implementations in Ruby in an excellent post quite some time back.

Why would you want to reverse the order of evaluation in a computation ? Well, if you value readability of your code, and have been cringing at how a mix of expression oriented pipelines and nested function calls can make your code less readable, then a Thrush may be for you.

Consider the Ruby example that Raganwald discusses in his first example of Thrush implementation.

lambda { |x| x * x }.call((1..100).select(&:odd?).inject(&:+))

The argument to the proc is a pipeline expression that reads nicely from left to right : get the list of numbers from 1 to 100, select the odd ones and add them up. What the proc does is it finds the square of its input number. But the proc invocation is a function call, which, though is the last in sequence to be executed, has to be the first one that you read. You can find the Ruby implementation in Raganwald's blog that transforms the above code to a left-to-right pipeline expression using Thrush.

Let's try to see how we can do the same in Scala ..

In Scala, I can write the above as ..

((x: Int) => (x * x))((1 to 100).filter(_ % 2 != 0).foldLeft(0)(_+_))

Almost the same as the Ruby code above, and has the similar drawback in readability.

Let's define the Scala Thrush combinator ..

case class Thrush[A](x: A) {
  def into[B](g: A => B): B = {
    g(x)
  }
}

Immediately we can write the above invocation as ..

Thrush((1 to 100)
  .filter(_ % 2 != 0)
  .foldLeft(0)(_ + _))
  .into((x: Int) => x * x)

A very simple combinator, a permuting one that pushes the function to where it belongs in the line of readability. If you want to be more succinct, commit the sin of defining an implicit for your use case ..

implicit def int2Thrush(x: Int) = Thrush(x)

and immediately the above transforms to ..

(1 to 100)
  .filter(_ % 2 != 0)
  .foldLeft(0)(_ + _)
  .into((x: Int) => x * x)

Does it read better ?

In fact with this implicit definition, you can go chaining into all the way ..

(1 to 100)
  .filter(_ % 2 != 0)
  .foldLeft(0)(_ + _)
  .into((x: Int) => x * x)
  .into(_ * 2)

While designing domain APIs that need to be expressive, this technique can often come very handy. Here's an example that uses the Thrush combinator to ensure a clean pipeline expression flowing into a piece of DSL.

//..
accounts.filter(_ belongsTo "John S.") 
        .map(_.calculateInterest)
        .filter(_ > threshold)
        .foldLeft(0)(_ + _)
        .into {x: Int =>
          updateBooks journalize(Ledger.INTEREST, x)
        }
//..

Sunday, September 06, 2009

Side-effects with Kestrel in Scala

Consider the following piece of logic that we frequently come across in codebases ..

val x = get an instance, either create it or find it

manipulate x with post-creation activities (side-effects)

use x

Step 2 is only for some side-effecting operations, maybe on the instance itself or for some other purposes like logging, registering, writing to database etc. While working on the serialization framework sjson, I have been writing pieces of code that follows exactly the above pattern to create Scala objects out of JSON structures. Now if you notice the above 3 steps, step 2 looks like being a part of the namespace which calls the function that sets up x. But logically step 2 needs to be completed before we can use x. Which means that step 2 is also a necessary piece of logic that needs to be completed before we hand over the constructed instance x to the calling context.

One option of course is to make Step 2 a part of the function in Step 1. But this is not always feasible, since Step 2 needs access to the context of the caller namespace.

Let's look at an idiom in Scala that expresses the above behavior more succinctly and leads us into implementing one of the most popularly used objects in combinatory logic.

Consider the following method in Scala that creates a new instance of a class with the arguments passed to it ..

def newInstance[T](args: Array[AnyRef])(implicit m: Manifest[T]): T = {
  val constructor = 
    m.erasure.getDeclaredConstructors.first
  constructor.newInstance(args: _*).asInstanceOf[T]
}

I can use it like ..

newInstance[Person](Array("ghosh", "debasish"))

for a class Person defined as ..

case class Person(lastName: String, firstName: String)

It's often the case that I would like to have some operations on the new instance after its creation which will be pure side-effects. It may or may not mutate the new instance, but will be done in the context of the new instance. I can very well do that like ..

val persons = new ListBuffer[Person]()
val p = newInstance[Person](Array("ghosh", "debasish"))
persons += p  // register to the list
persons.foreach(mail("New member has joined: " + p)) // new member mail to all
//.. other stuff

It works perfectly .. but we can make the code more expressive if we can somehow be explicit about the context of the block of code that needs to go with every new instance of Person being created. Maybe something like ..

newInstance[Person](Array("ghosh", "debasish")) { p =>
  persons += p
  persons.foreach(mail("New member has joined: " + p))
  //.. other stuff
}

This clearly indicates that the side-effecting steps of adding to the global list of persons or sending out a mail to every member is also part of the creation process of the new person. The effect is the same as the earlier example, only that it delineates the context more clearly. Though at the end of it all, it returns only the instance that it creates.

Consider another example of a good old Java bean ..

class Address {
  private String street;
  private String houseNumber;
  private String city;
  private String zip;

  //..
  //.. getters and setters
}

Working with a reflection based library it's not uncommon to see code that instantiates the bean using the default constructor and then allow clients to set the instance up with custom values .. something like ..

var p = Person(..)
val pAddress =
  newInstance[Address](null) {a =>
    a.setStreet("Market Street")
    a.setHouseNumber("B/102")
    a.setCity("San Francisco")
    a.setZip("98032")
    p.address = a
    p.mail("Your address has been changed to: " + a)
  }

Once again the block is only for side-effects, which can contain lots of other custom codes that depends on the caller's context. Make it more concise, DSLish using the object import syntax of Scala ..

var p = Person(..)
val pAddress =
  newInstance[Address](null) {a =>
    import a._
    setStreet("Market Street")
    setHouseNumber("B/102")
    setCity("San Francisco")
    setZip("98032")
    p.address = a
    p.mail("Your address has been changed to: " + a)
  }

Looks like a piece of idiom that can be effective as part of your programming repertoire. Here is the version of newInstance that allows you to make the above happen ..

def newInstance[T](args: Array[AnyRef])(op: T => Unit)(implicit m: Manifest[T]): T = {
  val constructor = 
    m.erasure.getDeclaredConstructors.first
  val v = constructor.newInstance(args: _*).asInstanceOf[T]
  op(v)
  v
}

Looking carefully at newInstance I realized that it is actually the Kestrel combinator that Raymond Smullyan explains so eloquently in his amazing book To Mock a Mockingbird. A bird K is called a Kestrel if for any birds x and y, (Kx)y = x. And that's exactly what's happening with newInstance. It does everything you pass onto the block, but ultimately returns the new instance that it creates. A nice way to plug in some side-effects. Reg has blogged about Kestrels in Ruby - the tap method in Ruby 1.9 and returning in Rails. These small code pieces may not be as ceremonious as the popularly used design patterns, but just as effective and provide deep insights into how our code needs to be structured for expressivity. Next time you discover any such snippet that you find useful for sharing, feel free to write a few lines of blog about it .. the community will love it ..