
Friday, December 14, 2007

Closures in Java - the debate continues ..

Neal Gafter mentions in his blog, while referring to Josh Bloch's presentation at Javapolis ..
The point he (Josh) makes is that function types enable (he says "encourage") an "exotic" style of programming - functional programming - which should be discouraged, otherwise the entire platform will become infected with unreadable code.

Huh! What exactly is the implication here ?

It could be either ..

  • Functional programming leads to unreadable code.


OR

  • The mass of today's Java programmers, soaked in the trenches of imperative programming, is likely to generate unreadable code with the powerful tools of functional programming.


I sincerely hope it is not the first one that Josh meant.

What exactly is unreadable code ? To someone not familiar with the idioms of a particular language, perfectly idiomatic code may seem unreadable. To someone not familiar with the functional paradigms, higher order functions, list comprehensions and closures will look illegible.

The question is, do we constrain a language down to the lowest common denominator only to cater to the masses, or do we try to uplift developers by encouraging them to program at a higher level of abstraction using the more powerful idioms of functional programming ? Closures in Java definitely introduce more powerful abstractions, and coupled with tail call optimization, will help us write more concise programs. To someone familiar with functional programming, the resultant code will definitely be more readable and enjoyable.

I am all for having true closures in Java 7 !

Monday, December 03, 2007

Will Closures in Java 7 help in making Functional Programming mainstream ?

Of late, there has been a lot of discussion in the developer community on the usage of advanced programming idioms. Are we getting too confined within the limits of strongly typed object-oriented paradigms and ignoring the power that dynamic, non object oriented languages (read functional languages) have to offer ? Some blogs put forward the strawman argument, dismissing the usage of dynamic languages and more powerful abstractions as somewhat elitist and not suitable for the mass of programmers. C# has added lots of functional paradigms to the language, and Microsoft has positioned F# as a mainstream functional language for the .NET platform. Ruby has lots of functional features, Erlang has started being in the limelight, and Haskell still reigns supreme amongst all favorites in reddit postings.

Still, the adoption rate is not heartening enough, and there are enough indications of apprehension from the quarters that make up the masses. Is it the fear of dynamic typing or the lack of tools for refactoring support that's holding us back ? Or can the functional programming support in today's most widely used enterprise language act as the catalyst towards easier adoption of folds and unfolds ?

I am not much of a believer in the strawman argument. But I think the push has to come from the medium which controls the balance and there is no denying the fact that today's enterprise programmers do Java for a living.

How close are we to getting closures in Java 7 ?

Do I get ..


public static <T,U> U fold(List<T> us, {T,U=>U} fun, U init) {
    // iterative left fold - accumulates from the front of the list
    U acc = init;
    for(T elem : us) {
        acc = fun.invoke(elem, acc);
    }
    return acc;
}



even though I would like to have ..


public static <T,U> U fold(List<T> us, {T,U=>U} fun, U init) {
    // recursive left fold - one stack frame per element without TCO
    if (us.isEmpty()) {
        return init;
    }
    return fold(us.subList(1, us.size()),
                fun,
                fun.invoke(us.get(0), init));
}



Either way I should be able to write ..


List<Integer> ints = new ArrayList<Integer>();
//.. populate ints
fold(ints,
     {Integer elem, Integer acc => elem + acc},
     0);



With Neal Gafter's prototype, I can already write this, though we need TCO in order to play around with meaningful functional control structures.
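
To make the TCO point concrete - a rough sketch of mine, not from the prototype docs: the recursive fold above burns one stack frame per list element, so without tail call elimination something like the following dies with a StackOverflowError (the list size is only indicative) ..


List<Integer> ints = new ArrayList<Integer>();
for (int i = 0; i < 1000000; i++) {
    ints.add(i);
}
// one stack frame per element - deep recursion needs TCO
fold(ints, {Integer elem, Integer acc => elem + acc}, 0);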

Friday, November 09, 2007

Infinite Streams using Java Closures

Neal Gafter's prototype of the closures implementation in Java has given us enough of a playground to fool around with. Of late, I have been dabbling with a couple of functional programming idioms, trying to implement them in Java. Many of them have already been tried using functors, anonymous inner classes and the like. Many of them work too, but at the cost of high accidental complexity. The following is an attempt at a clean implementation of infinite streams in Java.

Infinite streams give you the illusion that they can contain an infinite number of objects. The real trick behind infinite streams is lazy evaluation. SICP introduces the term delayed evaluation, which enables us to represent very large (even infinite) sequences as streams. Functional languages like Haskell and Miranda employ laziness as the default paradigm of evaluation, while languages like Scheme implement the same concept as library functions (delay and force).
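
In the closures prototype, a delayed computation can be sketched simply as a nullary closure, with invoke() playing the role of Scheme's force. A minimal sketch - expensiveComputation() is just a hypothetical placeholder ..


// delay : wrap the computation in a {=>T} closure - nothing runs yet
{=>Integer} delayed = {=>expensiveComputation()};

// force : invoke() evaluates the body, here and only here
Integer result = delayed.invoke();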

Dominik Gruntz implements infinite streams in Java using the functor paradigm and inner classes. The obvious problem is the verbosity resulting from the accidental complexity that these constructs lend to the implementation. In this post, I attempt the same using Neal's closures prototype. So, without further ado ..

The Stream Interface

Here's the contract for lazy evaluation ..


class StreamTest {
  interface Stream<E> {
    E car();               // the head of the stream
    Stream<E> cdr();       // the rest of the stream, evaluated lazily
    E get(int index);      // element at position index
    <R> Stream<R> map(Unary<? super E, R> f);        // transform every element
    Stream<E> filter(Unary<? super E, Boolean> f);   // keep matching elements
  }
  //..
}



and a lazy implementation using Java closures ..


class StreamTest {
  interface Stream<E> {
    //.. as above
  }

  static class LazyStream<E> implements Stream<E> {
    private E car;
    private {=>Stream<E>} cdr;

    // constructor
    public LazyStream(E car, {=>Stream<E>} cdr) {
      this.car = car;
      this.cdr = cdr;
    }

    // accessors
    public E car() { return car; }
    public Stream<E> cdr() { return cdr.invoke(); }

    // access at position
    public E get(int index) {
      Stream<E> stream = this;
      while (index-- > 0) {
        stream = stream.cdr();
      }
      return stream.car();
    }

    // map over the stream
    public <R> Stream<R> map(Unary<? super E, R> f) {
      return cons(f.invoke(car), {=>cdr().map(f)});
    }

    // filter the stream
    public Stream<E> filter(Unary<? super E, Boolean> f) {
      if (car() != null) {
        if (f.invoke(car())) {
          return cons(car(), {=>cdr().filter(f)});
        } else {
          return cdr().filter(f);
        }
      }
      return null;
    }

    // factory method cons
    public static <E> LazyStream<E> cons(E val, {=>Stream<E>} c) {
      return new LazyStream<E>(val, c);
    }
  }
}



A couple of points still bug me ..

I had to make up the Unary class since the closure did not allow me to specify ? super E in the left hand side. Ricky has clarified with Neal that this is due to the fact that things on the left hand side of a closure automatically have ? super in their types. Hence a little noise ..


static class Unary<T,R> {
  private {T=>R} u;

  public Unary({T=>R} u) {
      this.u = u;
  }

  public R invoke(T arg) {
    return u.invoke(arg);
  }
}



and now some tests ..


class StreamTest {
  //.. all above stuff
  //.. and the tests

  // helper function generating sequence of natural numbers
  static LazyStream<Integer> integersFrom(final int start) {
    return LazyStream.cons(start, {=>integersFrom(start+1)});
  }

  // helper function for generating fibonacci sequence
  static LazyStream<Integer> fibonacci(final int a, final int b) {
    return LazyStream.cons(a, {=>fibonacci(b, a+b)});
  }

  public static void main(String[] args) {

    // natural numbers
    Stream<Integer> integers = integersFrom(0);
    Stream<Integer> s = integers;
    for(int i=0; i<20; i++) {
      System.out.print(s.car() + " ");
      s = s.cdr();
    }
    System.out.println("...");

    // a map example over the stream
    Stream<Integer> t = integers;
    Stream<Integer> u = t.map(new Unary<Integer, Integer>({Integer i=>i*i}));
    for(int i=0; i<20; i++) {
      System.out.print(u.car() + " ");
      u = u.cdr();
    }
    System.out.println("...");

    // a filter over stream
    Stream<Integer> x = integers;
    Stream<Integer> y = x.filter(new Unary<Integer, Boolean>({Integer i=>i%2==0}));
    for(int i=0; i<20; i++) {
      System.out.print(y.car() + " ");
      y = y.cdr();
    }
    System.out.println("...");
  }
}



Closures in Java will surely bring in a new paradigm of programming among developers. The amount of excitement that the prototype has already generated is phenomenal. It'll be too bad if they do not appear in Java 7.

Update: Ricky Clarkson points out in the Comments that {? super E=>? extends R} is the same as {E=>R}. The covariance / contravariance stuff had completely slipped my mind when I compiled the post. Hence the Unary class is not required. Just remove the class and substitute the closure in map() and filter(). e.g.

public <R> Stream<R> map({E=>R} f) {
  return cons(f.invoke(car), {=>cdr().map(f)});
}
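
and the same substitution for filter() ..

public Stream<E> filter({E=>Boolean} f) {
  if (car() != null) {
    if (f.invoke(car())) {
      return cons(car(), {=>cdr().filter(f)});
    }
    return cdr().filter(f);
  }
  return null;
}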

Monday, November 05, 2007

Towards more Powerful Abstractions in the Javaland

Way back in 2000, Scott Meyers suggested improving encapsulation through increased usage of non-member, non-friend functions in C++. He had this punchline as the starting point of his column :
If you're writing a function that can be implemented as either a member or as a non-friend non-member, you should prefer to implement it as a non-member function. That decision increases class encapsulation. When you think encapsulation, you should think non-member functions.

While discussing the degrees of encapsulation, he explains that the amount of encapsulation in a class is inversely proportional to the number of functions that may break if the implementation of the class has to change. And that being the case, it becomes clear that a class with n member functions is more encapsulated than a class with n+1 member functions.

I was then an avid C++ programmer and this article hit me like a bolt. This was possibly the first instance of an exposition (that I found) on OO that makes you think away from the kingdom of nouns. Keep the number of verbs hanging off the nouns to a minimum, use the C++ namespace as the module of encapsulation, so as to keep the class interfaces complete and minimal.

Shortly afterwards, Herb Sutter, in one of his Guru-of-the-Week columns, Monoliths "Unstrung", dissected std::basic_string to identify the minimal set of functions that genuinely need to be members, as opposed to the current count of 103 member functions.
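
A rough Javaland analogue of the same idea (mine, not from either column): java.util.Collections is essentially a namespace of non-member functions over the collection interfaces ..


List<Integer> xs = new ArrayList<Integer>(Arrays.asList(3, 1, 2));
// the "verbs" live outside the class - sorting and reversing are
// non-member functions in Collections, not methods on ArrayList
Collections.sort(xs);
Collections.reverse(xs);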

Very recently, Reg Braithwaite and Buko Obele discussed the inappropriateness of applying the noun/verb metaphor to object oriented programming. It is not correct to model all verbs hanging off the nouns - Obele goes on to say ..
The question of who verbs 'belong to' is always the wrong question; instead it is always a matter of how a new concept can be meaningfully introduced into our existing system.

In fact, long ago, the design of the C++ Standard Library made the algorithms (the verbs) first class citizens, implemented generically over the containers on which they operate.

It is not objects carrying functions that makes a design object oriented - the language has to support powerful abstractions that help programmers write extensible code.

The Expression Problem

This has been one of the classical examples used to demonstrate how mainstream OO languages lack the abstractions for modeling an extensible solution. The problem has been described simply in this paper by Zenger and Odersky :
Suppose we have a datatype which is defined by a set of cases and we have processors which operate on this datatype. There are primarily two directions along which we can extend such a system:
• The extension of the datatype with new data variants,
• The addition of new processors.

The classical OO approach to solve this problem involves designing a polymorphic hierarchy of datatypes, with each concrete subclass modeling a data variant and implementing the set of processors. With this approach, extending datatypes is easy, just add a variant as another subclass. But extending processors is invasive, violates the Open/Closed principle and forces you to dig into every existing data variant. A typical fallout of embracing the noun/verb paradigm of modeling.
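
To make this concrete, here is a tiny hypothetical expression language modeled the classical OO way (the class names are made up for the illustration) ..


// the datatype: each variant is a subclass carrying every processor
abstract class Expr {
    abstract int eval();    // processor #1
}

class Num extends Expr {
    private final int value;
    Num(int value) { this.value = value; }
    int eval() { return value; }
}

class Add extends Expr {
    private final Expr left, right;
    Add(Expr left, Expr right) { this.left = left; this.right = right; }
    int eval() { return left.eval() + right.eval(); }
}

// adding a new variant (say Mul) is one new subclass - easy.
// adding a new processor (say prettyPrint()) means touching Expr,
// Num, Add and every other existing variant - invasive.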

The alternative, not-strictly-OO approach that programmers adopt to solve these problems is to implement some sort of double dispatch using the Visitor design pattern. This is an example where the solution gets overly complex and forces you to write code that simulates the run time type dispatch mechanism of the underlying programming language. The Visitor design pattern is yet another workaround for the noun/verb metaphor in mainstream OO languages. The Visitor abstractions allow a physical separation of the processors from the datatype, but the individual processors still match one-to-one with each individual datatype. In this paper (Matching Objects with Patterns) discussing the Scala language capabilities, Burak Emir et al. state this problem with Visitors more succinctly :
The visitor design pattern causes a relatively high notational overhead for framework construction, because a visitor class has to be defined and matchWith methods have to be provided in all data variants. The pattern matching itself is disciplined but very verbose, especially for deep patterns. Visitors in their standard setting do not maintain representation independence, because case methods correspond one-to-one to data alternatives.
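
A minimal sketch of the Visitor workaround on the same hypothetical expression language (redefined here with the dispatch hook) - note how the case methods correspond one-to-one to the data variants, exactly the representation leakage the paper points out ..


interface ExprVisitor<R> {
    R visitNum(Num n);    // one case method ..
    R visitAdd(Add a);    // .. per data variant
}

abstract class Expr {
    abstract <R> R accept(ExprVisitor<R> v);    // the double dispatch hook
}

class Num extends Expr {
    final int value;
    Num(int value) { this.value = value; }
    <R> R accept(ExprVisitor<R> v) { return v.visitNum(this); }
}

class Add extends Expr {
    final Expr left, right;
    Add(Expr left, Expr right) { this.left = left; this.right = right; }
    <R> R accept(ExprVisitor<R> v) { return v.visitAdd(this); }
}

// a new processor is now just a new visitor ..
class Evaluator implements ExprVisitor<Integer> {
    public Integer visitNum(Num n) { return n.value; }
    public Integer visitAdd(Add a) {
        return a.left.accept(this) + a.right.accept(this);
    }
}
// .. but a new data variant forces a new case method into
// every existing visitor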

How can we get around this accidental complexity and make our OO code more succinct ? The answer is simple - more powerful abstractions.

Scala Pattern Matching

Scala offers pattern matching as a solution to the expression problem. Pattern matching has been one of the key features in many functional languages like Erlang - Scala has brought this feature to the OO paradigm. Martin Odersky has been quite a strong advocate of pattern matching, a feature which many purists consider orthogonal to encapsulation of abstractions. But definitely pattern matching in Scala provides a solution away from the typical noun/verb syndrome in today's OO modeling paradigms. Have a look here for a complete example of pattern matching in Scala to solve a problem which would have been much more verbose and complex using visitors.

And Java ?

Java has so long been the kingpin of the noun kingdom, and there are tons of Java code that model verbs hanging off the nouns. However, with more and more functional paradigms crying out on the sidelines for their entry into the noun kingdom, things look to be changing. I have been dabbling a bit with the prototype of Closures released by Neal Gafter, and things have already started looking interesting. Closures in Java are definitely a very welcome feature, and the community has already started battling for their introduction in Java 7. Ricky Clarkson has implemented pattern matching in Java using Neal's prototype. Though at a very early stage, it is indeed heartening to see more powerful forms of abstraction making their way into the Java programming language. Java, as a platform, has already been enriched by languages like Scala and JRuby contributing many powerful abstractions of functional programming.

Reg has demonstrated the Equivalence relationship using the visitor pattern in a manner conforming to the tradition of verbosity in Java. With pattern matching and closures, things will improve towards more succinctness. Here is a version based on the pattern matching engine of Ricky :


import java.util.*;
// isEqualCollection() comes from the Apache Commons Collections library
import org.apache.commons.collections.CollectionUtils;

class Main
{
  public static void main(String[] args) {
    Collection coll1 = new ArrayList<Integer>() {{ add(12); add(15); add(10); }};
    Collection coll2 = new TreeSet<Integer>() {{ add(12); add(15); add(10); }};

    // equivalence of List-List and List-Set
    System.out.println(equivalent(coll1, coll2)
        .add(List.class, List.class, {List l1, List l2 => CollectionUtils.isEqualCollection(l1, l2)})
        .add(List.class, Set.class, {List l1, Set s1 => CollectionUtils.isEqualCollection(l1, s1)})
        .done());

    Map<Integer, Integer> coll3 = new HashMap<Integer, Integer>() {{
      put(1, 12);
      put(2, 15);
      put(3, 10);
    }};

    // equivalence of List-List, List-Set, List-Map
    System.out.println(equivalent(coll2, coll3)
        .add(List.class, List.class, {List l1, List l2 => CollectionUtils.isEqualCollection(l1, l2)})
        .add(List.class, Set.class, {List l1, Set s1 => CollectionUtils.isEqualCollection(l1, s1)})
        .add(List.class, Map.class, {List l1, Map m1 => CollectionUtils.isEqualCollection(l1, m1.values())})
        .done());
  }

  public static <T,R> Matcher<T,R> equivalent(T t1, T t2) {
    return new Matcher<T,R>(t1, t2);
  }

  public static <T,R> Matcher<T,R> match(T t1, T t2) {
    return new Matcher<T,R>(t1, t2);
  }

  public static class Matcher<T,R> {
    public final T t1;
    public final T t2;
    public R r;

    public Matcher(T t1, T t2) {
      this.t1 = t1;
      this.t2 = t2;
    }

    public <U1 extends T, U2 extends T> Matcher<T,R> add(Class<U1> aCase, Class<U2> bCase, {U1,U2=>R} f) {
      if (aCase.isInstance(t1) && bCase.isInstance(t2))
        r=f.invoke(aCase.cast(t1), bCase.cast(t2));

      return this;
    }

    public R done() {
      return r;
    }
  }
}



Definitely a more succinct way to model equivalence than what we are used to in the Javaland. And this is only a start. There are quite a few rough edges to smooth out, particularly with respect to handling generics and type erasure issues. But I am sure we will see more and more powerful abstractions coming into Java with people like Neal Gafter and Doug Lea pitching in. A tonne of thanks to these thoughtleaders for keeping the Java language an interesting platform to play around with.

Wednesday, October 10, 2007

I didn't know Javascript can do this !

The best part of learning a new programming language is the bag of surprises that awaits you at every line of your program. And it's almost always a mixed bag - things that you were confident of often do not work the way all the blogs claimed. Your favorite idiom, that you have so long used in every design, starts falling apart. And you start thinking in a new way, in terms of the language you have started learning. Sometimes you feel bad when your favorite design pattern finds no place in the laws of the new land, but soon you realize that there is a more idiomatic way of doing the same thing.

I have been a passionate Java programmer for the last eight years, and started that journey with the same sinking feeling, coming from the land of C and C++. I missed the sleek idioms of memory management using smart pointers, the black magic of template metaprogramming and a host of other techniques and patterns evangelized by the language called C++. Now, when I learn new idioms from languages like Javascript and Ruby, I always try to think how they map to equivalent constructs in Java. Not that there is always a mapping between the dynamically typed world of Ruby or Javascript and the land of static typing that Java implements. Still, it's always an enlightening exercise to try to emulate the best practices of one language in the other.

This post is about one such recent experience with programming in Javascript. I was trying to get some values from the front end, and I had to write an abstraction which would do some computation based on those values. Some parts of the computation had a fixed algorithm, while some parts varied depending upon the environment and context.

Strategy ! I thought .. this was my first reaction to the problem. Since I am not using a statically typed language like Java, I need not implement the delegation-backed-by-the-polymorphic-hierarchy-of-strategy-classes idiom. And I knew that Javascript supports closures as a higher level of abstraction, which I could use to model the pluggability that the variable part of the abstraction demands. Depending upon the context, a different closure would be passed - I could make each variable component a closure and pass it along into the main computation routine. But the problem comes up when you have multiple such variabilities to model in the same abstraction. For every such variable point, you need to pass a closure as an argument.


// closures as arguments
function WageCalculation(taxCalculator, allowanceCalculator, ...) {
    //..
}



Works, but may not be the best way to model in Javascript. After some more digging, I came across the details of the Execution Context in Javascript. All Javascript code gets executed in an execution context, and all identifiers are resolved based on the scope chain and an activation object. And Javascript offers the keyword this, which gets assigned as part of setting up the execution context. The value assigned to this refers to an object whose properties become visible to accessors prefixed with the this keyword in the main function. Hence all variabilities of the abstraction can be made part of this context and made available to the main computation routine through the this keyword. And this is exactly what Function.call() and Function.apply() do for us!

Simplifying my example somewhat for brevity, let us say that we are trying to abstract a routine for computing the wages of employees. The wage calculation contains a few components : the basic wage, the tax and the allowance. Of these, the percentage of tax varies depending upon the category of the employee, while the other components follow a fixed algorithm. For simplicity, I assume hard coded values for each of them, while applying the variation where necessary.

The basic abstraction looks like the following :


WageCalc = function() {

    // privileged method
    this.taxCalc = function() {
        // 20% of basic wage
        return 0.2;
    };

    // private - closed over by wage(), not visible outside
    // (note the var : without it, allowanceCalc would leak
    // into the global scope)
    var allowanceCalc = function() {
        // 50% of basic wage
        return 0.5;
    };

    this.wage = function(basic) {
        // and the calculation
        return basic - this.taxCalc() * basic + allowanceCalc() * basic;
    }
};



and I invoke the calculation method as :


var w = new WageCalc();
print(w.wage(1000));



for a basic wage of $1000. The tax calculation component (taxCalc()) is a variable part and has been tagged with the this keyword. This is the candidate that I need to implement as a strategy and plug in different implementations depending on the context. If I want to have another implementation where the tax percentage is 30, instead of 20, I just do the following ..


MyWageCalc = function() {

    this.taxCalc = function() {
        return 0.3;
    };
};



and invoke the wage calculation function, passing in the new function object as the context. This is what the apply method of Javascript Function does :


var w = new WageCalc();
print(w.wage.apply(new MyWageCalc(), [5000]));



Here new MyWageCalc() sets up the new context on which the this.taxCalc() of the wage() method operates. Hence the tax calculation now works with 30% of the basic wage, and not the 20% defined in the WageCalc() function.

I have only started to learn Javascript, hence I am not sure if this is the most idiomatic way of solving the problem at hand. But to me it appears to be exciting enough to share in an entire blog post.