
Monday, February 20, 2012

It's the familiarity model!

James Iry recently blogged on code density, trying to pin down what "dense code" really means. As an example he takes up regular expressions, which despite being dense are not frowned upon in code and hardly ever get replaced with something more readable.

So the question is: what makes code so dense that programmers find it unacceptable and complain about its incomprehensibility ?

Later in the blog post James himself identifies unfamiliarity as one of the culprits. People don't complain about regexes since they are familiar with them, but will surely complain about something else that they are not familiar with.

Around the same time Ola Bini blogged about expressiveness in programming language syntax. He mentions that a well designed syntax should help programmers *read* the code easily. But he also asks who the target programmer really is ..

who is this person reading ? It makes a huge difference if we’re trying to design something that should be easy to read for a novice or we’re trying to design a syntax that makes it easier for an expert to understand what’s going on.

Once again we get into this territory of familiarity and mental models. An expressive piece of code becomes readable only to a person who is familiar with the underlying model. In my programming career I have come across this dichotomy a number of times, where programmers complain of something being too dense the moment it crosses the threshold of their familiarity level. I have seen developers take every pain to understand the nuances of a Spring XML configuration. Or developers who have spent zillions of hours mastering performance tuning of Hibernate, with stuff like query cache configuration. Believe me, it's not simple - there are tonnes of corner cases to take care of, and even today I am not sure it can be achieved in a deterministic way for all kinds of data models. But these same developers complain when they are faced with maintaining code that needs a basic understanding of functional programming, set theory or algebraic data types. I think it's purely because these fall outside the limits of their familiarity model.

For a programmer who is not familiar with higher order functions, combinators like map, fold or filter will look too dense. So when you say map (+1) [1..5], the code fragment looks much less comprehensible to him than the familiar variant of an imperative, mutating, index-based for-loop. To the unfamiliar the functional variant appears dense; to the expert it is succinct.
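
The same contrast is easy to reproduce in Scala. A small sketch of the two variants of the one-liner above, nothing more than an illustration ..

// the imperative variant: index arithmetic and mutation bury the intent
val xs = Array(1, 2, 3, 4, 5)
val ys = new Array[Int](xs.length)
var i = 0
while (i < xs.length) {
  ys(i) = xs(i) + 1
  i += 1
}

// the functional variant: succinct to the familiar, dense to the unfamiliar
val zs = (1 to 5).map(_ + 1)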

One of the challenges I face today is making programmers believe that learning new stuff will help them think better. It's not that they will need all of these tools as part of their day job. But broadening your mental model can only give your thought process a wider playground to leverage. Maybe in our part of the world big companies give no incentive to transform yourself from a billable offshore resource into a thinking programmer. But you really need to transcend the limits of your familiarity model in order to appreciate code that experts certify as succinct.

Sunday, January 10, 2010

A Case for Orthogonality in Design

In Learning, using and designing command paradigms, John M Carroll introduces the notion of lexical congruence. When you design a language, one of the things you do is lexicalize the domain. If we extrapolate the concept to software design in general, we go through the same process with our domain model. We identify artifacts or lexemes and choose to name them appropriately, so that the names are congruent with the semantics of the domain. This notion of lexical congruence is the essence of having good mnemonics for your domain language. I found the reference to Carroll's work in Testing the principle of orthogonality in language design, which discusses the same issue of organizing your language around an optimal set of orthogonal semantic concepts. This blog post tries to relate the same concepts of orthogonality to designing domain models, using the power that the newer languages of today offer.

The complexity of a modeling language depends on the number of lexemes, their congruence with the domain concepts being modeled, and the number of ways you can combine them to form higher order lexemes. The more decoupled these lower level lexemes are, the easier they are to compose. When overlapping concepts are modeled as part of your lexemes, mixing is not easy - you need to squeeze in special boundary conditions as part of the composition logic. Making your concepts independent yet composable makes your design orthogonal.

The first time I came across the concept of orthogonality in design, and consciously appreciated the power of unification that it brings to your model, was through Andrei Alexandrescu's idea of policy based design, which he evangelized in his book Modern C++ Design and in the making of the Loki library. You have orthogonal policies that are themselves reusable as independent abstractions. And at the same time you can use the language infrastructure to combine them when composing your higher order model. Consider this C++ example from Andrei's book ..

template <
  class T,
  template <class> class CheckingPolicy,
  template <class> class ThreadingModel
>
class SmartPtr;


CheckingPolicy enforces constraints that need to be satisfied by the pointee object. The ThreadingModel abstraction defines the concurrency semantics. These two concerns are not related to each other in any way. But you can use the power of C++ templates to plug in appropriate behaviors for these concerns when composing your own custom SmartPtr type ..

typedef SmartPtr<Widget, NoChecking, SingleThreaded>
  WidgetPtr;


This is orthogonal design - you have a minimal set of lexemes modeling otherwise unrelated concerns, and you use the power of C++ templates to evolve a larger abstraction by composing them. The policies themselves are independent and can be applied to construct arbitrary families of abstractions.

The crux of the idea is that you have m concepts that you can use with n types. There's no static relationship between the concepts and the types - that's what makes orthogonality an extensible concept. Consider Haskell typeclasses ..

class Eq a where 
  (==) :: a -> a -> Bool


The above typeclass defines the concept of equality. It's parameterized on the type and defines the constraint that the type has to define an equality operator in order to qualify itself as an instance of the Eq typeclass. The actual type is left open, which gives the typeclass unbounded extensibility.

For integers, we can do

instance Eq Integer where 
  x == y =  x `integerEq` y


For floats we can have ..

instance Eq Float where 
  x == y =  x `floatEq` y


We can define Eq for any custom data type, even recursive types like Tree ..

-- assuming the usual definition: data Tree a = Leaf a | Branch (Tree a) (Tree a)
instance (Eq a) => Eq (Tree a) where 
  Leaf a         == Leaf b          =  a == b
  (Branch l1 r1) == (Branch l2 r2)  =  (l1==l2) && (r1==r2)
  _              == _               =  False


Haskell typeclasses, like C++ templates, help implement orthogonality in abstractions through a form of parametric polymorphism. Programming languages offer facilities to promote orthogonal modeling of abstractions, though the extent varies with the power of abstraction that the language itself offers.

Let's consider a real world scenario. We have an abstraction named Address and modeled as a case class in Scala ..

case class Address(houseNo: Int, street: String, 
                   city: String, state: String, zip: String)


There can be many contexts in which you would like to use the Address abstraction. Consider printing labels for shipping, which needs your address printed in some specific label format, as per the following trait ..

trait LabelMaker {
  def toLabel: String
}


Note that printing addresses in the form of labels is not one of the primary concerns of your Address abstraction. Hence it makes no sense to model it as one of the methods of the class. It's only that in some situations we may need an Address to print itself in the form of a label, as per the specification mandated by LabelMaker.

One other concern is sorting. You may need to have your addresses sorted by zip code before submitting them to your Printer module for shipping. Sorting may be required in combination with label printing or on its own - the two are orthogonal concerns that should never have any dependency between themselves within your abstraction.

Depending on your use case, you can decide to compose your Address abstraction as

case class Address(houseNo: Int, street: String, 
  city: String, state: String, zip: String)
  extends Ordered[Address] with LabelMaker {
  //..
}


which makes your Address abstraction statically coupled with the other two.

Or you may like to make the composition per-object, which keeps the base abstraction free of any static coupling.

val a = new Address(..) with LabelMaker {
  override def toLabel = {
    //..
  }
}


As an alternative you can also choose to implement implicit conversions from Address using Scala views ..

object Address {
  implicit def AddressToLabelMaker(addr: Address) = new LabelMaker {
    def toLabel =
      "%d-%s, %s, %s-%s".format(
        addr.houseNo, addr.street, addr.city, addr.state, addr.zip)
  }
}
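
If we go the implicit conversion route, any context that expects a LabelMaker can now accept an Address, with the compiler inserting the conversion at the call site. A small hypothetical usage sketch (printShippingLabel is not part of the original design) ..

// a hypothetical client function that knows only about LabelMaker
def printShippingLabel(lm: LabelMaker) = println(lm.toLabel)

// the implicit view AddressToLabelMaker kicks in at the call site
printShippingLabel(Address(12, "Main Street", "Springfield", "IL", "62704"))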


Whatever the implementation, note that we are not polluting the basic Address abstraction with concerns that are orthogonal to it. Our model, which is the design language, treats orthogonal concerns as separate lexemes and encourages ways for the user to compose them non-invasively.

Sunday, January 03, 2010

Pragmatics of Impurity

James Hague, a long time Erlanger, drives home a point or two regarding purity of paradigms in a couple of his latest blog posts. Here's his take on being effective with pure functional languages ..

"My real position is this: 100% pure functional programing doesn't work. Even 98% pure functional programming doesn't work. But if the slider between functional purity and 1980s BASIC-style imperative messiness is kicked down a few notches--say to 85%--then it really does work. You get all the advantages of functional programming, but without the extreme mental effort and unmaintainability that increases as you get closer and closer to perfectly pure."

Purity is not necessarily pragmatic. In my last blog post I tangentially touched upon the notion of purity while discussing how a *hybrid* SQL-NoSQL database stack can be effective for large application deployments. Be it programming languages, databases or any other paradigm of computation, we need the right balance of purity and pragmatism.

Clojure introduced transients. Rich Hickey asks in the rationale: "If a pure function mutates some local data in order to produce an immutable return value, is that ok?". Transients in Clojure allow localized mutation when initializing or transforming a large persistent data structure. The mutation is only seen by the code that does the transformation - the client gets back an immutable version that can be shared. In no way does this invalidate the benefits that immutability brings to reasoning about Clojure programs. It's good to see Rich Hickey being flexible and pragmatic, even at the expense of injecting that little impurity into his creation.
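
Scala's collection builders embody the same principle. Here is a small sketch (the analogous idea, not Clojure's transient API) of localized mutation hidden behind a pure interface ..

// a pure function that mutates a local builder to produce an immutable result -
// the mutation is never observable outside this function
def squares(n: Int): Vector[Int] = {
  val b = Vector.newBuilder[Int]
  var i = 1
  while (i <= n) { b += i * i; i += 1 }
  b.result()   // callers share only the immutable Vector
}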

Just like the little compromise (and big pragmatism) with the purity of persistent data structures, Clojure made a similar compromise with laziness by introducing chunked sequences that optimize the overhead associated with lazy sequences. These are design decisions taken consciously by the creator of a language that values pragmatism over purity.

Enough has already been said about the virtues of purity in functional languages. Believe me, 99% of the programming world does not even care for purity. They do what works best for them and hybrid languages are mostly the ones that find the sweetest spots. Clojure is as impure as Scala is, considering the fact that both allow side-effecting with mutable references and uncontrolled IO. Even Erlang has uncontrolled IO and a mutable process dictionary, though its use is often frowned upon within the community. The important point is that all of them have proved to be useful to programmers at large.

Why do creators infuse impurity into their languages ? Why isn't every language created as pure as Haskell ? Well, it's mostly related to the larger goal that the language targets. Lisp started as an incarnation of the lambda calculus under the tutelage of John McCarthy and became the first significant language promoting the purely applicative model of programming without side-effects. Later on it added the impurities of mutation constructs, based on the von Neumann architecture of the machines on which Lisp was implemented. The obvious reason was improved performance over purely functional constructs. Scala and Clojure both decided to go for the JVM as the primary runtime platform - hence both languages are susceptible to the pitfalls of impurity that the JVM offers. Both of them decided to inherit all the impurities that Java has.

Consider the module system of Scala. You can compose modules using traits with deferred concrete definitions of types and objects. You can even compose mutually recursive modules using lazy vals, somewhat similar to what Newspeak and some dialects of ML offer. But because you have decided to bite the Java pill, you can also wreak havoc through shared mutable state in the top level object that you compose. In his post titled A Ban on Imports, Gilad Bracha discusses all the evil effects that an accessible global namespace can bring to the modularity of your code. Newspeak is being designed to be pure in this respect, with all dependencies kept abstract, to be plugged together explicitly as part of configuring the module. Scala is impure in this respect - it allows imports to bring the world into your module definitions, but at the same time opens up all possibilities of sharing the huge ecosystem that the Java community has built over the years. You can rightfully choose to be pure in Scala, but that's not enforced by the language.
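
A minimal sketch of what that style of module composition can look like in Scala (all names hypothetical), with the dependency kept abstract until the modules are wired together ..

trait Repository {
  def find(id: Int): String       // deferred - no global namespace involved
}

trait Service { self: Repository =>   // the dependency is declared explicitly
  def describe(id: Int): String = "item: " + find(id)
}

// the wiring happens at composition time
object App extends Service with Repository {
  def find(id: Int): String = "record-" + id
}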

When we talk about impurity in languages, it's mostly about how they handle side-effects and mutable state. And Haskell has a completely different take on this aspect than what we discussed with Lisp, Scala or Clojure. In Haskell you have to go through monads for any side-effecting operation. And people with a taste for the finer things in life are absolutely fine with that. You cannot just stick a printf into your program for debugging - the computation has to live within the IO monad before you can do a print. The Haskell philosophy looks at a program as a model of mathematical functions where side-effects are also implemented in a functional way. This makes reasoning and optimization by the compiler much easier - you can make your pure Haskell code run as fast as C code. But you need to think differently. Pragmatic ? What do you think ?

Gilad Bracha is planning to implement pure subsets of Newspeak. It will be really exciting to get to see languages which are pure, functional (note: not purely functional) and object-oriented at the same time. He observes in his post that (t)he world is slowly digesting the idea that object-oriented and functional programming are not contradictory concepts. They are orthogonal, and can be arranged to be rather complementary. This is an interesting trend where we can see families of languages built around the same philosophy but differing in aspects of purity. You need to be pragmatic to choose and even mix them depending on your requirements.

Sunday, September 13, 2009

Misconstrued Language Similarity Considered Harmful

Very frequently I come across posts of the form Language X for Language Y programmers. It's not that there is anything wrong with them, but, more often than not, the underlying tone of such posts is to highlight some apparent (and often misconstrued) similarities between the two languages.

Objects in Java and processes in Erlang have some similarity in the sense that both of them abstract some state of your application. But that's where the similarity ends - the rest is all gaping differences. Objects in a class oriented OO language like Java and processes in a concurrency oriented language like Erlang are miles apart in philosophy, usage, implementation and granularity. Read Joe Armstrong's Why OO sucks for details. But for the purpose of this post, the following statement from Joe suffices to nip in the bud any logic that claims objects in Java are similar to processes in Erlang ..

"As Erlang became popular we were often asked "Is Erlang OO" - well, of course the true answer was "No of course not" - but we didn't to say this out loud - so we invented a serious of ingenious ways of answering the question that were designed to give the impression that Erlang was (sort of) OO (If you waved your hands a lot) but not really (If you listened to what we actually said, and read the small print carefully)."

Similarity breeds Contentment

It's always comforting to have a base of similarity. It's human nature to seek out a familiar base and transform one's thought process with respect to it. With programming languages, this works only if the two languages share the same philosophy and implement similar ideas. But you can never learn a new language by pitching it against your favorite language and identifying apparent similarities. It is these apparent similarities that tend to influence how you think about the idioms of the new language, and you will be misled into believing and practising something that leads you down the path of antipatterns. Consider static typing and dynamic typing - a debate that has possibly been beaten to death. When you make the change, learn to think in the new paradigm. It's foolish to think in terms of concrete types in a dynamically typed setting. Think in terms of the contracts that the abstraction implements and organize your tests around them. Ola Bini wrote a fantastic post on this in response to a twitter discussion that originated from me.

The starting point should always be the philosophy and the philosophical differences that the two languages imbibe. Haskell typeclasses may seem similar in many respects to polymorphism in object-oriented languages. In fact they are more similar to parametric polymorphism, while the dominant form of polymorphism in OO is subtype polymorphism. And Haskell, being a functional language, does not support subtyping.

This post relates pattern matching in Erlang to conditionals in Java in the same vein. It's true that both offer some form of dispatch in program control flow. But the more significant and idiomatic difference that matters in this context is between conditional statements in Java and expressions in the functional setting of Erlang. It's this expression based programming that influences the way you structure your code in Erlang. Instead of highlighting upfront the similarity of both constructs as means of flow control, emphasize the difference in thought process that gives your program a different geometry. And finally, the most important use of pattern matching is in programming with algebraic data types, a whole new idiom that gets unveiled.

Ok, if you want to write a post on language X for prospective programmers of language Y, go ahead and highlight the differences in philosophy and idioms. And then try to implement your favorite feature from language Y in X. Misconstrued similarities often bias programmers new to language X the wrong way.

Sunday, August 16, 2009

5 Reasons why you should learn a new language NOW!

There have been quite a few murmurs in the blogosphere today regarding the ways Java programming paradigms have changed since the language's inception in the mid 90s. A clear mandate and recommendation towards immutable abstractions, DSL like interfaces and actor based concurrency models indicates a positive movement towards a trend that nicely aligns with all the language research that has been going on in the community for quite some time. Language platforms are also improving by the day - efforts are on to make the platforms better hosts for multi-paradigm languages. Now is the time for you to learn a new language - here are some of my thoughts on why you should invest in learning a new language of your choice .. NOW!

#1


Language barriers are going down - polyglot programming is on the way up. Two of the big enablers of this movement are:

  • Middleware inter-operability using document formats like JSON. You can implement persistent actors in Scala or Java that use MongoDB or CouchDB as the storage of JSON documents, which interoperate nicely with your payment gateway system hosted on MochiWeb, developed on an Erlang stack.

  • Easier language inter-operability using DSLs. While you are on a specific platform like the Java Virtual Machine you can design better APIs in an alternative language that interoperates with the core language of your application. Here's how I got hooked on to Scala in an attempt to make my Java objects smarter and publish better APIs to my clients. Even Google, known for their selective set of languages to use in production applications, have been using s-expressions as an intermediate language expressed as a set of Scheme macros for their Android platform.



#2


Learning a different language helps you look at a problem in a different way. Maybe the new way models your domain more expressively and succinctly, and you will need to write and maintain less code in the new language. Once you're familiar with the paradigms of the new language, idiomatic code will look more expressive to you, and you will no longer excuse verbose snippets in defence of the average programmer. What you flaunt today as design patterns will come out as natural idiomatic expressions in your new language - you will be programming at a higher level of abstraction.

#3


Playing on the strengths that the new language offers. Long back I blogged on Erlang becoming mainstream as a middleware language. You do not have to use Erlang for the chores of application development that you do in your day job. Nor will you have to be an Erlang expert to use Erlang based solutions like RabbitMQ or CouchDB. But look at the spurt of development that has been going on using the strengths of Erlang's concurrency, distribution and fault tolerance capabilities. As of today, Erlang is unmatched in this regard. And Erlang has the momentum, both as a language and as a platform that delivers robust middleware. Learning Erlang will give you more insight into the platform's capabilities, and the edge to make a rational decision when your client asks you to select Webmachine as the REST based platform for your next Web application talking to the Riak datastore.

#4


The Java Virtual Machine is now the cynosure of performance optimization and language research. Initially touted as the platform for hosting statically typed languages, the JVM is now adding capabilities to make itself a better host for dynamically typed languages as well. Anything that runs on the JVM is now a candidate for being integrated into your enterprise application architecture tomorrow. Learning a new JVM language will give you a head start - and it will safeguard your long-acquired Java expertise too. JRuby is a classic example. From a really humble beginning, JRuby today offers you the best of dynamic language capabilities by virtue of being a 100% compatible Ruby interpreter and a solid player on the JVM. JRuby looks to be the future of Ruby in the enterprise application space. Groovy has acquired the mindshare of lots of Java professionals by virtue of its solid integration with the Java platform. Clojure is bringing about the revival of Lisp on the JVM. And the list continues .. Amongst the statically typed ones, Scala is fast emerging as the next mainstream language for the JVM (after Java) and can match the performance of Java as of today. And the best part is that your erstwhile investment in Java will only continue to grow - you will be able to freely interoperate any of these languages with your Java application.

#5


This is my favorite. Learn a language for the fun of it. Learn something that is radically different from what you do in your day job. Maybe Factor, maybe some other concatenative language like Forth or Joy. Or Lua, which is coming up fast as a scripting language to extend your database or application. A couple of days ago I discovered JKat, a dynamically typed, stack-based (concatenative) language similar to Forth but implemented as an interpreter on top of the JVM. You can write neat DSLs and embed the JKat interpreter very much like Lua within your application. Indulge in the sinful feeling that programming in such languages offers - you will never regret it.

Sunday, June 14, 2009

Code Reading for fun and profit

I still remember those days when APIs were not so well documented, and we didn't have the goodness that Javadocs bring us today. I was struggling to understand the APIs of the C++ Standard Library by going through the source code. Before that, my only exposure to code reading had been a big struggle to plough through reams of Cobol code that we were trying to migrate to RDBMS based platforms. Code reading was not so enjoyable (at least to me) in those days. Still, I found it a more worthwhile exercise than trying to navigate inconsistent pieces of crappy paperwork and half-assed diagrams that project managers passed on in the name of design documentation.

Exploratory Code Reading ..

The C++ Standard library and Boost changed it all. C++ was considered macho enough those days, particularly if you could boast of your understanding of the template meta-programming that Andrei Alexandrescu first brought to the mainstream through his columns in C++ Report and his innocuously titled Modern C++ Design. Code reading became a pleasure to me, and code understanding became more satisfying, particularly when I could reuse some of those snippets in my own creations. It was my first taste of how dense C++ code could be - as if every sentence had some hidden idiom waiting to be unravelled. That was exploratory code reading - I was exploring the horizons of the language and its idioms as the experts had documented them with great care. I subscribed to the view that Code is the Design.

Collaborating with xUnit ..

Then came unit testing and the emergence of the xUnit frameworks, which proved to be the most complete vehicle for the virtues of code reading. Code reading changed from a passive learning exercise to an active reification of thoughts. Just fire up your editor, load the unit testing framework, and validate your understanding through testXXX() methods. It was then that I realized the wonders of code reading in collaboration with unit testing frameworks. It was as if you were pair programming with xUnit - together you and your xUnit framework are trying to understand the library that you're exploring. TDD was destined to be the next step, the only change being that instead of code understanding you're now into real world code writing.

Code Reading on the GO ..

Sometimes I enjoy reading code when I'm traveling or on a long commute. It's not painstaking - you don't have a specific agenda and you're not working against a strict project timeline. I have found this habit very productive, and have in fact learnt quite a few tricks of the trade in some of these sessions. I still remember discovering my first instance of the Strategy pattern implemented through Java enums while browsing Guice code on one of my flights to Portland.

Code Reading towards Polyglotism ..

When you're learning a new language, it helps a lot to look at existing programs in the languages you've been programming in for long, and to think about how you would model them in the new language. It's not transliteration - often it results in learning new idioms and lots of aha! moments along the way. This is one of the most invaluable side-effects of code reading: reading programs in language X makes you a better programmer in language Y. Stuart Halloway, in his book on Clojure programming, gives a couple of excellent examples of how thinking functionally while reading Java code teaches you lots of idioms of the new paradigm.

Reading bad code ..

This is important too, since it makes you aware of the anti-patterns of a language. It's a common misconception that using recursion makes functional programs more idiomatic. Recursion has its own problems, and explicit recursion is best hidden behind the combinators and libraries that the language offers. Whenever you see explicit recursion in a non trivial code snippet that can potentially receive a large data set, think twice. You may be better off refactoring it some other way, particularly when the underlying runtime does not support tail call optimization. Code that does not read well does not communicate to its users. Code reading makes you aware of the importance of expressiveness; you realize that you would not write code that you cannot read well.
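
A small Scala illustration of the point, assuming a simple summation over a list ..

// explicit recursion: one stack frame per element, blows up on large lists
// when the runtime does not optimize the (non-tail) recursive call
def sum(xs: List[Int]): Int = xs match {
  case Nil    => 0
  case h :: t => h + sum(t)
}

// the same intent, with the recursion hidden inside a library combinator
def sum2(xs: List[Int]): Int = xs.foldLeft(0)(_ + _)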

Well, that was a drunken rant .. that I wrote as a side-effect in the midst of reading the Scala source for 2.8 Collections ..

Monday, March 30, 2009

Commodity Programming

Some musings from the International Lisp Conference 09 ..

From Jao Ortega, noted Lisper and Haskeller, in his Sussmaniana report from ILC 09 ..

"Next day we were in a kind of tongue-in-check debate provocatively entitled Are macros a menace?. Richard Gabriel was on the wrong side, and arguing along the lines that macros were akin to language design and that he’d rather not suffer the consequences of letting your average software engineer undertake such a complex task. Gerry’s intervention at this point made me again nod like as i was mad: if we cannot trust our software enginneers to proficiently use the really powerful tools of our trade, there must be something wrong in the way we educate them; only those able to judiciously use them should get a diploma, to begin with."

From the blogs of Andy Wingo, reporting on ILC 09 .. on the same discussion "Macros: Are they a menace?" ..

"More seriously, the arguments against macros centered on control: control by bosses on workers, with the idea that macros make custom languages, thus making individual programmers less replaceable."

and by this time all of us know why MIT switched away from Scheme to Python for their introductory programming course, 6.001. This discussion was also kicked off during the interlude to the macro debate, when Pascal Costanza asked Gerry Sussman the reason. Read Andy's post for more details ..

Sigh! The dumbing down of powerful language features in favor of strawman arguments, debated in Lisp conferences. The fangs of enterprise software development ?

Sunday, April 27, 2008

Greedy Coin Changer in Scala

OnLamp has published this Python implementation of the Greedy Coin Changer. Here is my version of a functional implementation of the same in Scala.


object GreedyCoinChanger {
  def divMod(dividend: Int, divisor: Int) = {
    (dividend / divisor, dividend % divisor)
  }

  // the list has to be sorted in descending order of denomination
  def change(coins: List[Int], amount: Int) = {
    def changeOne(pair: (List[Int], Int), div: Int) = {
      val dm = divMod(pair._2, div)
      ((dm._1 :: pair._1), dm._2)
    }
    // /: is foldLeft - thread (coinsSoFar, remainingAmount) through the denominations
    (((List[Int](), amount) /: coins)(changeOne(_,_)))._1.reverse
  }

  def main(args: Array[String]) = {
    println(change(List(25, 10, 5, 1), 71))
  }
}



Running this will print :

>> List(2, 2, 0, 1)

indicating 2 quarters, 2 dimes, 0 nickels and 1 penny. To make things a little more explanatory and verbose, here is a slightly more decorated version :


object GreedyCoinChanger {
  def divMod(dividend: Int, divisor: Int) = {
    (dividend / divisor, dividend % divisor)
  }

  def pluralize(no: Int, phrase: String) = phrase match {
    case "penny" if no > 1 =>
        "pennies"
    case something if no > 1 =>
        something + "s"
    case other => other
  }

  // the list has to be sorted in descending order of denomination
  def change(coins: List[(Int,String)], amount: Int) = {
    def changeOne(pair: (List[String], Int), denomination: (Int,String)) = {
      val (div, mod) = divMod(pair._2, denomination._1)
      div match {
        case 0 => (pair._1, mod)
        case _ => ((div + " " + pluralize(div, denomination._2) :: pair._1), mod)
      }
    }
    (((List[String](), amount) /: coins)(changeOne(_,_)))._1.reverse.mkString("(", ", ", ")")
  }

  def main(args: Array[String]) = {
    println(change(List((25,"quarter"), (10,"dime"), (5,"nickel"), (1,"penny")), 71))
  }
}



Running this will print :

>> (2 quarters, 2 dimes, 1 penny)

I am not an expert in Scala or functional programming. Any suggestion to make it more idiomatic is most welcome.

Tuesday, April 22, 2008

Syntactic Sugars - What makes them sweet ?

Stephen Colebourne talks about implementing for-each for Java maps. He has proposed changes to be made to javac and queued up his request for approval by the appropriate authorities. It is good to see Java community leads taking serious steps towards syntactic sugars in the language. I am always for intention revealing syntactic sugars - after all, programs must be written for people to read, and only incidentally for machines to execute.

Syntactic sugars, when properly designed, reduce the semantic distance between the problem domain and the solution domain. Syntactic sugars do not add new features or capabilities to an existing language. Still we value them mainly for social reasons - they can make your abstractions much more explicit, thereby making your intentions much more direct. And syntactic sugars often lead to concise and succinct code much pleasing to your eyes.

Java is not a language that boasts a concise syntax. Yet the enhanced for loop introduced in Java 5 reduces a lot of accidental complexity and makes the programmer's intention much more explicit ..

for(String name : names) {
  // process name
}


is much more succinct than

for(Iterator<String> it = names.iterator(); it.hasNext(); ) {
  String name = it.next();
  // process name
}


since the latter snippet has its intention buried in the verbosity of structures not directly related to what the programmer wants to express.

names foreach println


is better, though not Java.

Many of the languages in use today offer lots of syntactic sugar abstracting rich capabilities of the underlying language. Take, for example, the Groovy builder syntax, which exploits the mechanics of meta-programming, closures and named arguments to implement elegant, concise, intuitive APIs. Java developers use binding frameworks to manipulate XML and bind it to the object model or the relational database schema. Not that there is anything wrong with that. But the developer has to go through all the hoops of mapping the object structure to the XML schema and use an external framework like JAXB, ending up with a much longer version of the same solution than with Groovy MarkupBuilders.

Syntactic sugars are nothing new in the landscape of programming languages. It all started (possibly) with Lisp, which offers macros as the means to design syntactic abstractions. To add a little sugar to the syntax the language offers, you need not wait for the next official release. In Lisp, the syntax of the program is a direct representation of the AST, and with macros you can manipulate the parse tree directly. Languages like Lisp are known for syntax extensibility, allowing developers to implement their own syntactic sugar.

Ruby offers runtime meta-programming, another technique for adding your own syntactic sugar. Ruby does not have a macro system that lets you play around with the abstract syntax tree, though Ryan Davis has released a ruby parser written entirely in Ruby. The standard meta object protocol offered by Ruby gives the developer control over the language semantics (not the syntax), with the capability to generate classes and methods dynamically at runtime. Meta-programming, method_missing, open classes and optional parentheses are some of the features that make Ruby a great language for building syntax abstractions for runtime processing.

A language built on the philosophy of bottom up programming offers extensible syntax (be it through the syntactic abstractions of Lisp or the semantic customizations of Ruby), on which developers can construct their own syntactic sugar. Java believes in democratization of all syntax offered by the language, and it may take quite a few years to officialize the little sugar that you have been yearning for. Remember the explosion in the number of blog posts celebrating the for-each loop when it came out with Java 5. In other languages, people build new syntax by the day and evolve new vocabulary within the same language that maps into the domain they are modeling. If you miss a feature that you enjoyed in your earlier language, just build it on top of the new language. And it does not necessarily have to involve hooking into the compilation cycle or plugging customized modules into the language parser.

Many of today's languages offer capabilities strong enough to build structures that look like syntax extensions. Scala is an example that makes the cut in this category. The advanced type system of Scala enables developers to write control structures within the syntax of the language that look like syntactic abstractions. Max likes deterministic finalization in C# and its idiomatic usage with the "using" keyword. He has implemented the same syntax in Scala using closures, view bounds and implicit conversions. Besides eliminating lots of boilerplate, his extension looks charmingly useful for the domain he applies it to.
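
I have not looked at Max's implementation in detail, but a minimal sketch of such a "using" construct in Scala, assuming any resource that structurally exposes a close() method, could look like this ..

// works for any type with a close() method - no common interface required
// (newer versions of Scala want: import scala.language.reflectiveCalls)
def using[A <: { def close(): Unit }, B](resource: A)(body: A => B): B =
  try body(resource) finally resource.close()

// hypothetical usage
using(new java.io.FileReader("data.txt")) { reader =>
  reader.read()
}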

Syntax extensibility is a necessity if you want your language to support the evolution of DSLs. Extensible syntax scales much better than the framework based approach popularized by Java. When you add a new syntactic structure to your language, it meshes so nicely with the rest of the language constructs that you never feel it has been bolted on from outside. Although in reality it is nothing more than syntactic sugar, it is presented in a form that makes more sense when coding for the particular problem at hand. When we talk about a language, we think in terms of parsers - and this is no different when we think about DSLs. Implementing an external DSL is hard, considering the enormous complexity that parser generators make you go through. Scala offers monadic parser combinators with which you can map your EBNF syntactic structures directly into implementations. And all of this is done through syntactic sugar on top of the closures and higher order functions that the language offers.
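
As an illustration, the classic arithmetic expression grammar maps almost one-to-one from its EBNF onto the standard library's parser combinators - a sketch following the JavaTokenParsers idiom ..

import scala.util.parsing.combinator.JavaTokenParsers

// each production of the EBNF becomes one method
object Arith extends JavaTokenParsers {
  def expr: Parser[Any]   = term ~ rep("+" ~ term | "-" ~ term)
  def term: Parser[Any]   = factor ~ rep("*" ~ factor | "/" ~ factor)
  def factor: Parser[Any] = floatingPointNumber | "(" ~ expr ~ ")"
}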

Higher Order Functions - The Secret Sauce ?

There have been lots of debates on whether object-oriented interfaces scale better than syntax extension capabilities in language design. While OO certainly has its place in modularizing components and abstracting away the relationships between them, there are situations when objects force us to fit a round peg into a square hole. How many times have you cursed Java for forcing you to define an unnecessary interface just to apply a function over a set of abstractions that share a specific contract ? You can do the same in Scala using structural typing (aka anonymous types) and higher order functions. Higher order functions seem to be the secret sauce for offering syntax extensibility in programming languages.
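
A hedged sketch of that combination (all names hypothetical): the higher order function below applies over anything that structurally exposes the contract, with no named interface in sight ..

// any object with a draw(): String method qualifies - no interface needed
type Drawable = { def draw(): String }

class Circle { def draw() = "circle" }
class Box    { def draw() = "box" }

// a higher order function applied over the structural type
def render(shapes: List[Drawable]): List[String] = shapes.map(_.draw())

// render(List[Drawable](new Circle, new Box)) yields List("circle", "box")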

Wednesday, February 27, 2008

Open Classes in Ruby - Too easy to be misused ?

I read this ..
Open classes makes invention possible in real time. No committee. No waiting for the next language revision. If you think of it, you can try it. Does Symbol#to_proc make it worthwhile? How about andand? Why don’t you tell me? I was serious when I asked you what you think of it.

and soon after .. this ..

I am all with Raganwald that "dangerous features enable bottom-up language evolution". My main concern, however, is that I have seen too much Ruby code where people open up classes without even considering other design choices. On many of these occasions I felt that mixins would have been a more pragmatic alternative. Maybe one reason is that open classes in Ruby are too easy to implement, and developers give in to the temptation of using this "dangerous feature" without giving the design decision the amount of thought it deserves. Scala's lexically scoped alternative is less sexy, but looks more pragmatic.

Wednesday, January 30, 2008

Hitting the Sweet Spot

Do you have to be a better X (for all X mainstream) to be a successful mainstream programming language ? Smalltalk lost out to C++ back in the 80s even though Smalltalk had a purer object model (objects all the way down) with lots of powerful abstractions, espoused the virtues of garbage collection, byte codes and JIT (only later to be hijacked by Java), and provided a solid refactoring browser based IDE. On the other hand, C++ was positioned as a better C, and played the familiarity card of curly brace syntax - syntactic compatibility with C, but with better type-safety. Even today we see the impact that Smalltalk, both as a language and as a platform, has had on the market. Ruby is strongly influenced by Smalltalk - many dynamic language gurus even feel that Ruby should run on the highly optimized Strongtalk VM rather than labour through carving out its own, or try to run on the JVM through JRuby. Gemstone's object server runs Smalltalk and provides a state-of-the-art platform for developing, deploying and managing scalable, high-performance, multi-tier applications based on business objects. The recently announced Web programming environment from Sun Labs, Lively Kernel, was inspired in part by the success of the Squeak Smalltalk programming environment.

Why did Smalltalk lose out to C++ ?

Eventually Java hit the sweetest of spots as a better and easier-to-use C++. Java adapted the Smalltalk VM and roped in the very features that people had rejected with Smalltalk in the 80s. The difference Java created was that the community focused on building a strong ecosystem to support the average programmer, better than any of its predecessors. This included richer libraries and frameworks, great tooling, a uniform runtime environment, JITs that generated efficient code, and of course very warm and supportive community participation. Lawrence Kesteloot makes a strong point when he emphasizes that helping the average programmer creates the necessary strength and durability of the ecosystem for the language to thrive.

Enterprise projects thrive on the ecosystem.

No matter how elegant your language is, unless it has a strong ecosystem that lives up to the demand / supply economics of developing enterprise software, it will not move into the ranks of BigCo projects. Even the most beautiful piece of code that you write has a life directly proportional to the skillset of the programmer who will maintain it. Of course there are plenty of good programmers working on BigCo enterprise projects, but it is the machinery assuring a copious supply of average programmers that keeps the economics ticking.

And only one language has so far been able to create this ecosystem !

Monday, January 07, 2008

Language Explorations on the JVM - An Application Developer's perspective

Sometime ago I reported on our first experience of using Rhino scripting in a Java EE application for a large client. It was exactly what Ola Bini suggests in his post on language explorations. Some of the modules of the application needed the dynamism - they were required to be hot swappable and customizable by the domain users. And the compilation cycle was getting in the way of meeting these requirements with the standard server side language. We went for Rhino scripting for all such controllers, using the new scripting engine available for executing JavaScript within the JVM.

Since that application has been successfully deployed, we have been fiddling around with some more options towards polyglotism. This post is a brief summary of some of the languages / language bridges we explored in the process. All of what we did so far has been on the JVM as the underlying Polyglot platform - we have not yet explored anything on the .NET world.

Web controllers are an area that may need lots of dynamic behavior, since they deal with user interactions, page flows, stateful storage across requests and many other control flow structures for realizing a single complex use case. Spring Web Flow provides one viable option for modeling this. Another option from the scripting world is Rhino in Spring, which integrates the Mozilla Rhino JavaScript interpreter with the Spring Framework. The value add is the flexibility of a dynamic language for modeling the dynamic parts of the application on the Java platform, while integrating with the dependency injection principles of the Spring framework. Spring also offers nice support for plugging in managed script based controllers in multiple languages - this will surely provide more options for the evolution of polyglot programming in today's applications.

Another area where we explored the possible use of a more expressive language is application configuration. Applications today mostly use XML based configurations, which feel too noisy for human consumption. SISC offers a lightweight Scheme scripting engine atop the JVM and comes bundled with a small footprint of around 230 KB. I had blogged before on using Scheme as an executable XML :
In SISC bridging is accomplished by a Java API for executing Scheme code and evaluating Scheme expressions, and a module that provides Scheme-level access to Java objects and implementation of Java interfaces in Scheme.

Talking about what Ola Bini calls the "stable layer", I fully agree that static type safety helps here, since the entire application infrastructure is built upon this layer. As of today Java is my #1 choice of language, and Spring my only choice of framework, for this layer. I have talked about this a number of times before, but I guess it is worth repeating that I love the non-intrusiveness of Spring as far as declarative programming on the JVM is concerned. As it stands now, I will not forego Spring if I am developing on the JVM platform.

It will be really interesting to see how Scala shapes up as a potential candidate for this layer. Scala is a feature rich language with an advanced type system, nice syntax, less verbosity and more elegance than Java. Where Scala lags is in tooling, documentation and industry patronage, all of which should improve as more and more users join the community.

In the domain layer, most applications rely on pure Java to model business rules. As Ola has mentioned, this layer is a strong candidate for a DSL based implementation. Irrespective of what language(s) you use to implement your DSL, the business rules should always be expressed through the DSL only. My feeling is that in today's scenario, Java is not really an ideal language for designing a DSL. Hence we find almost all applications implementing the domain layer at a lower level of abstraction, which makes the domain layer of today more verbose and less maintainable.

Powerful and expressive languages with conciseness of syntax are better fit for designing DSLs. While JRuby and Scala make suitable candidates for designing DSLs for the domain layer, I think the static typing of Scala makes it a better fit here. I may be biased, but when I am thinking of reusable API design to be used by big teams, somehow static typing (possibly done better than Java) makes me more comfortable. However, considering the state of enterprise software development today, there is a significant entry barrier for average programmers to both Scala and JRuby. Idiomatic Scala or Ruby is primarily based on functional paradigms, something which is still not intuitive to a Java programmer today. With most of today's new generation languages embracing FP, this may be the single most deciding factor that will determine the amount of acceptability that polyglot programming will find within the behemoth called enterprise software development. But there is no doubt that a well designed DSL using languages like Scala or JRuby will find tomorrow's domain model at a much higher level of abstraction than what it is today.

Monday, November 26, 2007

Productivity, Team Size and the Blub Paradox

Reginald Braithwaite is one of my favorite bloggers. Each of his postings makes me think; some of them leave a lasting impression. One of them is this one, which talks about small teams, productivity and the power of abstraction in programming languages.

Do we necessarily have to build large teams to solve complex programming problems ? Here is what Reginald has to say ..
All of our experience in the last sixty years has suggested that productivity drops off a cliff as team size increases. So, if you want more code from a larger team, you have to invest heavily in ways of extracting value out of unproductive people in an unproductive environment.

We cannot build our project execution infrastructure for the unproductive mass of average developers. Glenn Vanderburg once noted (via Neal Ford's JRuby podcast) that bad developers will move heaven and earth to do the wrong thing. And by having a restricted environment all you are doing is constraining the power of good developers, not making bad developers any better.

And this is where the expressive power of the programming language comes in. Paul Graham has talked a lot about succinctness, blub programmers and metrics for the continuum of abstractness of programming languages. Good developers love to program in languages higher up the power continuum, while the blub programmer looks for the most average feature-set that aligns well with his comfort zone. Hence, as Reginald says ..
If we know that bug per line of code remains amazingly constant, why do we try to scale code out in verbosity rather than up in abstraction?

Scaling out in verbosity is encouraging those language features that add to the lines of code at the cost of the levels of abstraction. Which in turn is encouraging the blub paradox.

Java is still the most dominant language of the enterprise, and the JVM is undoubtedly the ubiquitous platform. It is difficult for an enterprise to move away from a platform overnight. But today we have a number of alternative programming languages that run on the Java Virtual Machine, many of them much higher than Java in the power continuum of abstractness. Isn't it time we started embracing some of them, at least incrementally ? Neal Ford makes a great point in his JRuby podcast while talking about polyglot programming. Tests and builds are two areas that do not ship to the customer - they make great candidates for bootstrapping other JVM friendly languages into the enterprise. Next time, try using JRuby for writing your tests and Raven for writing the build scripts. Soon you may start feeling that you could have collapsed your multi-level Strategy hierarchy in Java using higher order functions, as the sketch below shows.
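
To make that last point concrete, a hedged sketch (hypothetical names) of how a Java-style Strategy hierarchy collapses into plain higher order functions ..

// the Java way: an interface plus one class per pricing algorithm
trait PricingStrategy { def price(amount: Double): Double }
class FlatDiscount extends PricingStrategy { def price(a: Double) = a * 0.9 }
class FestiveDiscount extends PricingStrategy { def price(a: Double) = a * 0.8 }

// with higher order functions the hierarchy collapses into plain values
val flatDiscount: Double => Double    = _ * 0.9
val festiveDiscount: Double => Double = _ * 0.8

def checkout(amount: Double, pricing: Double => Double): Double = pricing(amount)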

Thursday, October 11, 2007

Defensive Programming - What is that ?

Another rant on how to think in the programming language that you are using. In most of the languages we use, we strive to handle exceptions as close to the site of occurrence as possible. Be it runtime or checked, we take great care to ensure that the system does not crash. This is a form of defensive programming. Erlang does not espouse this idea - the philosophy of Erlang is to "let it crash". Of course it is not as trivial as that. You have a recovery plan, but the recovery semantics is totally decoupled (yes, I mean it .. physically) from the crash itself.

I found this posting from the Erlang mailing list, where Joe Armstrong, the inventor of Erlang, explains the philosophy. Some of the highlights of his premises ..
In C etc. you have to write *something* if you detect an error - in Erlang it's easy - don't even bother to write code that checks for errors - "just let it crash".

Of course he explains how to handle a crash in Erlang. It is very much tied to the basic idioms of concurrency and fault tolerance that form the backbone of Erlang's process structure. In Erlang you can link processes, so that a linked process can keep an eye on the health of the other process. Once linked, the processes implicitly monitor each other, and if one of them crashes, the other process is signalled. To handle the crash, Erlang suggests using linked processes to correct the error. These linked processes need not run on the same processor as the original process, and this is where the Erlang philosophy of make-everything-distributable comes in. Joe mentions in this thread ..
Why was error handling designed like this?

Easy - to make fault-tolerant systems you need TWO processors. You can never ever make a fault tolerant system using just one processor - because if that processor crashes you are scomblonked.

One physical processor does the job - another separated physical processor watches the first processor fixes errors if the first processor crashes - this is the simplest possible way of making a fault-tolerant system.

This is also an example of separation of concerns, where the handling of a crash is dealt with separately through a distribution mechanism. You do not code for checking of errors - let it crash, and a built-in recovery mechanism within the process structure takes over. To a layman it feels a bit unsettling to deploy a production system on a language that follows the "let-it-crash" philosophy, but Erlang's amazing track record in designing fault tolerant distributed systems speaks volumes about the robustness and reliability of the underlying engine.
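
Erlang's linked processes have no direct JVM analogue, but as a loose Scala sketch of the underlying idea - the worker does no error checking at all, and recovery lives entirely in a separate observer ..

// the worker just does its job and is allowed to crash
def superviseForever(work: () => Unit): Unit = {
  val worker = new Thread(new Runnable { def run() = work() })
  worker.start()
  worker.join()                 // returns when the worker dies, crash or not
  println("worker died - restarting")
  superviseForever(work)        // recovery is decoupled from the crash site
}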

Tuesday, October 02, 2007

Refactoring - Only for Boilerplates ?

There are some bloggers who make you think, even if you subscribe to an orthogonal view of the world. Needless to say, Steve Yegge is one of them. Quite some time back he introduced us to Fowler's Refactoring bible through an extremely thought provoking essay. Even if you are a diehard Java fan, his post forces you to think about the Caterpillar Butterfly conundrum. Seven months later, another great blogger, Raganwald, built upon Yegge's post and discussed the various agonies that mutable local variables bring to Extract Method refactoring. Raganwald's post set me thinking when I first read it. Last weekend I came back to both links accidentally, through my favorite search engine (as if there are others!), and re-read them both. This post is an involuntary rant from that weekend's reading.

Yegge says ..
Automated code-refactoring tools work on caterpillar-like code. You have some big set of entities — objects, methods, names, anything patterned. All nearly identical. You have to change them all in a coordinated way, like a caterpillar’s crawl, moving all the legs or lines this way or that.

Raganwald ends his post with the scriptum ..
This is exactly why languages with more powerful abstractions are more important than adding push-button variable renaming to less powerful languages. Where's the button that refactors a method with lots of mutable variables into one with no mutable variables? No button? Okay, until you invent one, don't you want a language that makes it easy to write methods and functions without mutable local variables?

Jokes apart, sentiments on the wayside, both of them target Java as the language of the blub programmers - a language with less powerful abstractions, a language that generates caterpillar-like code for push-button refactoring. They make you think deeply about why you don't come across the term refactoring as often in other *powerful* languages like Lisp and Ruby, while you, the Java programmer, consider yourself agile when the Refactor menu drops down in your favorite IDE and renames a public method correctly and with complete elan.

Is Refactoring only for Java ?

Martin Fowler defines refactoring as the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure. We all want to do that to the software we write - correct ? Sure, then why the heck do the Java guys boast about something that should, in principle, be the normal way of life irrespective of the language you use ? Can it be that with other languages programmers write the best optimized code the very first time, and never need to improve the design and code organization later ? Too optimistic a claim, even for Paul Graham.

OO Organization and Modularization

Java is an OO language and offers a wealth of ways to organize your modules and subsystems. Being a class based language, Java offers various relationships to be established between your abstractions, organized through polymorphic hierarchies. You can have inheritance of interface as well as implementation, containment and delegation, and a slew of frameworks to work with various instantiation models and factories. You can have flexible packaging at the development level as well as the deployment level. Java compiles into bytecode and runs in the most powerful VM on the planet - apart from pure Java you can drop into your codebase snippets of other scripting languages that reside and run within the JVM. The point is that all this flexibility provides you, the Java programmer, with a slew of options at every stage of design and development. As clients' requirements change and your codebase evolves, it is only natural to optimize your code organization using the best possible modularization strategy. Doesn't this establish refactoring as a necessity for well-designed software, and not a mere tool ? More so when you are using a programming language that offers a rich repertoire of code and module organization. Sure, you may need to promote some state into an inner class to raise the level of abstraction, or extract a slice of an existing method into a separate reusable function.

And Automated Refactoring Tools ?

So, we now agree that refactoring is necessary and leads us to the holy grail of well-designed software. Do we need automated refactoring tools ? Maybe not, if we are working on a codebase small enough to cache in your memory all at once. The moment you start having page faults, you feel the need for automation. And typical Java enterprise systems are bound to cross the limits of your memory segment and lead to continuous thrashing. Obviously not something that you would want.

But what do you need to have automated refactoring capabilities in your IDE ? Type information, which unfortunately is missing from code written in most dynamic languages. Without type information, it is impossible to have automated, foolproof refactoring. Cedric has a nice post that drums this topic to the fullest.

In short, with Java's rich platter of code organization policies you have the flexibility of merciless refactoring, and with Java's static type system you have the option of plugging automated refactoring tools into your IDE - so nice.

What about Mutable Local Variables ?

It is a known fact that mutable variables, at any level, are an antipattern in functional programming. And a purely functional program is a mathematician's dream for all sorts of analyses. In Java, however, we are programming real world problems, which have enough state to model. And Java is a language that offers the power of assignment and mutability. Assignments are not always bad; they are a natural idiom of imperative languages. Try modeling every real world problem with a stack containing objects with nested lifetimes and constant values throughout those lifetimes. Dijkstra made the following observation while talking about the virtues of assignment and goto statements ..
.. the only way to store a newly formed result is by putting it on top of the stack; we have no way of expressing that an earlier value becomes now obsolete and the latter's life time will be prolonged, although void of interest. Summing up: it is elegant but inadequate. A second objection --which is probably a direct consequence of the first one-- is that such programs become after a certain, quickly attained degree of nesting, terribly hard to read.

It is all but natural that mutable local variables make automated refactoring of methods difficult. But, to me, the more important point is the locality of reference of those local variables. It all depends on the way the mutation is used and the closure of code that is affected by it. No process can handle ill-designed code - but disciplined usage of mutability on local variables can be handled quite well by the refactoring process. It all boils down to how well your code is organized and the criteria used in decomposing systems into modules (reference Parnas). After all, we are modeling state changes using mutable local variables - Raganwald suggests coarser grained state management using objects. This is what the State design pattern gives us. He mentions in one of his comments that object oriented programming is, at its heart, all about state: if you write objects without state, you are basically using objects as namespaces. So true. But when you do need to handle state changes through mutability, always apply them at the lowest level of abstraction, so that even if you need to synchronize for concurrent access, you lock at the minimum level of granularity. So, why not mutable local variables, as long as the closure is well understood and justified by the domain in which they are applied.
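
To see the connection with Extract Method concretely, a tiny Scala sketch (hypothetical example) of the same computation with and without a mutable local ..

// a mutable accumulator weaves state through the method body,
// which is what trips up push-button Extract Method
def totalLength(words: List[String]): Int = {
  var total = 0
  for (w <- words) total += w.length
  total
}

// the same computation with no mutable locals - trivially extractable
def totalLength2(words: List[String]): Int =
  words.foldLeft(0)(_ + _.length)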

Mutable local variables, when inappropriately woven into the spaghetti of a method, make refactoring very difficult. At the same time, when the programmer discovers this difficulty, she can redesign her method taking advantage of the numerous levels of abstraction that Java offers. From this point of view, refactoring is also a great teacher on the road to the ultimate objective of well-designed software.