Friday, January 11, 2008

Is the Java Language Dying?

We tend to think of programming languages in two categories: "living languages" in which we should seriously consider developing new code, and "legacy languages" that we mainly use, if at all, because we have to maintain an existing code base. The act of classifying a language into one or the other category helps us decide what, if anything, we might consider doing to change the language. If a language is primarily a legacy language, changes should be aimed at making it easier to maintain and modify existing bodies of code. A living language, on the other hand, also benefits from changes that make it easier to design, develop, and maintain new code. Living languages evolve to reduce accidental complexity.

"What does a high-level language accomplish? It frees a program from much of its accidental complexity. An abstract program consists of conceptual constructs: operations, datatypes, sequences, and communication. The concrete machine program is concerned with bits, registers, conditions, branches, channels, disks, and such. To the extent that the high-level language embodies the constructs wanted in the abstract program and avoids all lower ones, it eliminates a whole level of complexity that was never inherent in the program at all." -No Silver Bullet - Fred Brooks

Programs written in legacy languages tend to exhibit a high degree of accidental complexity [Code's Worst Enemy, Steve Yegge] [Mr. Yegge meets Mr. Brooks, Brian C Cunningham]. Early in the life of a language, the complexity of programs written in that language may appear to be essential, but as we learn more about software engineering and programming languages, we find patterns of complexity appearing in the code that can be eliminated by improved languages.

A good example of this is garbage collection. In C and C++, memory management is a pervasive concern. Smart pointers and destructors help, but they do not significantly reduce the complexity of memory management. In languages with garbage collection, most of the complexity of memory management is assumed by the implementation of the language. Most languages that have been introduced in the past ten years support garbage collection.

Another example is concurrency. The threads-and-locks-and-semaphores primitives of Java enable parallel programming, but require that programmers express concurrency at a fairly low level. This has been "good enough" for some time, as most programs are not deployed on highly concurrent hardware. But that is changing [The Free Lunch Is Over, Herb Sutter]. Libraries such as java.util.concurrent and Doug Lea's fork-join framework help somewhat, but in many cases they introduce complexities of their own. Other languages that support closures, such as Scala, make fork-join-like libraries much easier to use. Scala supports control abstraction [Crowl and LeBlanc], which allows the libraries to manage much of the complexity associated with concurrency [Debasish Ghosh]. Support for the Actors model [Haller and Odersky], for example, can be expressed cleanly as a library in Scala.

Besides raising the level of abstraction of concurrent code, control abstraction also raises the level of abstraction for sequential code by eliminating whole categories of boilerplate, which can instead be moved into common library code. This kind of boilerplate cannot be significantly reduced by adding one or two custom statements to the language, because such built-in forms necessarily make assumptions about the use cases that narrow their applicability. For example, ARM blocks don't document how they handle exceptions arising from the close() method. One example in the proposal suggests they are silently swallowed at runtime, which may work for many cases involving I/O streams, but another example given is a transactional API, in which ignoring such exceptions is precisely wrong. Without a specification for the syntax and semantics, the reader is welcome to imagine the most favorable treatment of each use case. But an attempt to reconcile these and other conflicting requirements may show that the approach cannot be salvaged. Perhaps that is why no progress has been made since mid-2006.
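To make the boilerplate argument concrete, here is a minimal sketch of how a close-the-resource pattern might be moved into library code in today's Java, with an anonymous inner class standing in where a closure would go. The names ResourceBlock, Resources, withStrictClose, and withQuietClose are hypothetical; they come from neither the ARM proposal nor the closures spec. What it tries to show is that a library can offer both policies for exceptions thrown by close(), where a single built-in statement would have to pick one.

```java
import java.io.BufferedReader;
import java.io.Closeable;
import java.io.IOException;
import java.io.StringReader;

/** Hypothetical callback interface: a block of code that uses a resource. */
interface ResourceBlock<R, T> {
    T run(R resource) throws IOException;
}

/** Hypothetical library class offering two close() policies. */
final class Resources {
    private Resources() {}

    /** Exceptions from close() propagate, unless the block itself already failed. */
    static <R extends Closeable, T> T withStrictClose(
            R resource, ResourceBlock<? super R, T> block) throws IOException {
        boolean succeeded = false;
        try {
            T result = block.run(resource);
            succeeded = true;
            return result;
        } finally {
            if (succeeded) {
                resource.close(); // a failure here reaches the caller
            } else {
                // Keep the block's original exception; drop any close() failure.
                try { resource.close(); } catch (IOException ignored) { }
            }
        }
    }

    /** Exceptions from close() are always swallowed. */
    static <R extends Closeable, T> T withQuietClose(
            R resource, ResourceBlock<? super R, T> block) throws IOException {
        try {
            return block.run(resource);
        } finally {
            try { resource.close(); } catch (IOException ignored) { }
        }
    }
}

final class CloseDemo {
    public static void main(String[] args) throws IOException {
        String first = Resources.withQuietClose(
            new BufferedReader(new StringReader("hello\nworld")),
            new ResourceBlock<BufferedReader, String>() {
                public String run(BufferedReader in) throws IOException {
                    return in.readLine();
                }
            });
        System.out.println(first); // prints "hello"
    }
}
```

A caller reading a stream might reasonably choose withQuietClose, while a transactional API would want withStrictClose so that a failed commit-on-close is never lost. The anonymous inner class at the call site is the part that closures would shrink.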

What about Java? Is it a living language, or a legacy language like Cobol? This question underlies much of the debate about how to move the Java programming language forward, if at all. Carl Quinn asked at the December 14, 2007, JavaPolis Future of Computing Panel (to be published on http://www.parleys.com): "How can we address the issue of evolving the [Java] platform, language, and libraries without breaking things?"

Neal Gafter: "If you don't want to change the meaning of anything ever, you have no choice but to not do anything. The trick is to minimize the effect of the changes while enabling as much as possible. I think there's still a lot of room for adding functionality without breaking existing stuff..."
Josh Bloch: "My view of what really happens is a little bit morbid. I think that languages and platforms age by getting larger and clunkier until they fall over of their own weight and die very very slowly, like over ... well, they're all still alive (though not many are programming Cobol anymore). I think it's a great thing, I really love it. I think it's marvelous. It's the cycle of birth, and growth, and death. I remember James saying to me [...] eight years ago 'It's really great when you get to hit the reset button every once and a while.'"

Josh may well be right. If so, we should place Java on life support and move our development to new languages such as Scala. The fork-join framework itself is an example of higher-order functional programming, which Josh argues is a style that we should neither encourage nor support in Java. Is it really time to move on?

Personally, I believe rumors of Java's demise are greatly exaggerated. We should think of Java as a living language, and strive to eliminate much of the accidental complexity of Java programs. I believe it is worth adding support for closures and control abstraction, to reduce such complexity of both the sequential and concurrent aspects of our programs. At the same time, for completely new code bases, we should also consider (and continue to develop) newer languages such as Scala, which benefit from the lessons of Java.

Thursday, December 13, 2007

What flavor of closures?

I just attended Josh Bloch's presentation at JavaPolis, where he asked the community whether they want Java to support function types, or whether they'd prefer that people keep writing these things the way they do today. His examples were carefully selected from the most twisted cases in the test suite. Compiler test suites are a good place to find the most twisted but unrealistic uses of any given language feature. I thought it would be interesting to look at the question in the context of a real API. You probably know my opinion, but just to be clear, here is an excerpt from Doug Lea's fork-join framework:

/**
 * An object with a function accepting pairs of objects, one of
 * type T and one of type U, returning those of type V
 */
interface Combiner<T,U,V> {
  V combine(T t, U u);
}
class ParallelArray<T> {
  /**
   * Returns a ParallelArray containing results of applying
   * combine(thisElement, otherElement) for each element.
   */
  <U,V> ParallelArray<V> combine(
    ParallelArray<U> other,
    Combiner<? super T, ? super U, ? extends V> combiner) { ... }
}

And the equivalent code ported to use the features of the closures spec:

class ParallelArray<T> {
  /**
   * Returns a ParallelArray containing results of applying
   * combine(thisElement, otherElement) for each element.
   */
  <U,V> ParallelArray<V> combine(
    ParallelArray<U> other,
    { T, U => V } combiner) { ... }
}

The question Josh asks is this: which version of this API would you prefer to see?

The point he makes is that function types enable (he says "encourage") an "exotic" style of programming - functional programming - which should be discouraged, otherwise the entire platform will become infected with unreadable code. Although functional programming is just as possible with or without function types - they are just shorthand for interface types, after all - Josh prefers the language provide syntactic vinegar for these techniques.
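For readers who want to see what "possible without function types" looks like at a call site, here is a minimal sketch in today's Java. Combiner has the same shape as the interface quoted above; SimpleArray is a hypothetical sequential stand-in written for this illustration and is not Doug Lea's ParallelArray, whose implementation is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Same shape as the Combiner interface in the excerpt. */
interface Combiner<T, U, V> {
    V combine(T t, U u);
}

/** Hypothetical sequential stand-in; NOT the fork-join ParallelArray. */
final class SimpleArray<T> {
    private final List<T> elements;

    SimpleArray(List<T> elements) {
        this.elements = new ArrayList<T>(elements);
    }

    /** Applies combiner to elements at matching indices, up to the shorter length. */
    <U, V> SimpleArray<V> combine(
            SimpleArray<U> other,
            Combiner<? super T, ? super U, ? extends V> combiner) {
        int n = Math.min(elements.size(), other.elements.size());
        List<V> result = new ArrayList<V>(n);
        for (int i = 0; i < n; i++) {
            result.add(combiner.combine(elements.get(i), other.elements.get(i)));
        }
        return new SimpleArray<V>(result);
    }

    List<T> toList() {
        return new ArrayList<T>(elements);
    }
}

final class CombinerDemo {
    static List<String> labelPrices() {
        SimpleArray<String> names = new SimpleArray<String>(Arrays.asList("tea", "milk"));
        SimpleArray<Integer> prices = new SimpleArray<Integer>(Arrays.asList(3, 2));
        // The "function" is passed as an anonymous inner class.
        SimpleArray<String> labels = names.combine(prices,
            new Combiner<String, Integer, String>() {
                public String combine(String name, Integer price) {
                    return name + "=" + price;
                }
            });
        return labels.toList();
    }

    public static void main(String[] args) {
        System.out.println(labelPrices()); // prints [tea=3, milk=2]
    }
}
```

The functional style is already there; the only thing the interface-based version adds is ceremony at the declaration and at the call site.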

Part of his talk was about the problems of being able to use nonlocal return by default in a closure. See my previous blog post for a description of how this theoretical problem won't exist in the next version of the spec, and doesn't exist in the prototype today.

Finally, Josh showed that if you want to use something like eachEntry to loop over a map, and you want to be able to use primitive types for the loop variables, autoboxing doesn't work and you'd have to define 81 different versions of the eachEntry method (one for each possible primitive type in each position). That's true, just as it's true that you'd have to define 81 different versions of the Map API if you want to be able to handle primitives in them. If it turns out to be a good idea to make autoboxing work for the incoming arguments to a closure, that is a small tweak to the closure conversion. These kinds of issues can be addressed in a JSR.
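The same multiplication shows up with plain interfaces in today's Java, which may make the trade-off easier to see. The sketch below assumes hypothetical names (EntryBlock, IntValueEntryBlock, EntryLoops); none of them are from the closures spec or from Josh's slides. The figure of 81 presumably counts nine choices (the eight primitive types plus a reference type) in each of the two positions; the sketch shows the general version and just one specialized variant.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical callback taking a reference-typed key and value. */
interface EntryBlock<K, V> {
    void apply(K key, V value);
}

/** Hypothetical callback specialized for int values; one of many possible variants. */
interface IntValueEntryBlock<K> {
    void apply(K key, int value);
}

final class EntryLoops {
    private EntryLoops() {}

    /** General version: the block sees boxed values. */
    static <K, V> void eachEntry(Map<K, V> map, EntryBlock<? super K, ? super V> block) {
        for (Map.Entry<K, V> e : map.entrySet()) {
            block.apply(e.getKey(), e.getValue());
        }
    }

    /** Separate method so that the block can declare a primitive int parameter. */
    static <K> void eachIntEntry(Map<K, Integer> map, IntValueEntryBlock<? super K> block) {
        for (Map.Entry<K, Integer> e : map.entrySet()) {
            // Unboxing happens here, in the library; a null value would throw.
            block.apply(e.getKey(), e.getValue());
        }
    }
}

final class EachEntryDemo {
    static int total(Map<String, Integer> stock) {
        final int[] sum = {0}; // mutable holder, since the anonymous class cannot assign to a local
        EntryLoops.eachIntEntry(stock, new IntValueEntryBlock<String>() {
            public void apply(String key, int value) {
                sum[0] += value;
            }
        });
        return sum[0];
    }

    public static void main(String[] args) {
        Map<String, Integer> stock = new LinkedHashMap<String, Integer>();
        stock.put("a", 1);
        stock.put("b", 2);
        System.out.println(total(stock)); // prints 3
    }
}
```

If the conversion from a block to the callback type performed boxing and unboxing itself, the caller could use eachEntry with an int parameter and the specialized variants would be unnecessary.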

Josh's vision for an alternative is Concise Instance Creation Expressions along with adding a moderate number of new statement forms.