
Monday, April 26, 2010

DSL Interoperability and Language Cacophony

Many times I hear people say that DSL based development often leads to a situation where you have to manage a mix of code written in various languages that are barely interoperable and often need to be integrated using glue like ScriptEngine, making your life more difficult than easier. Well, if you are in this situation, you are in the world of language cacophony. You have somehow managed to pitchfork yourself into it through immature decisions - selecting an improper implementation language for your DSL, or the wrong means of integrating it with your core application.

Internal DSLs are nothing but a disciplined way of designing APIs that speak the Ubiquitous Language of the domain you are modeling. Hence when you design an internal DSL, stick to the idioms and best practices that your host language offers. For domain rules that you think could be better modeled using a more expressive language, select one that has *good* interoperability with your host language. If you cannot find one, work within the limitations of your host language. It's better to be slightly limited in syntax and flexibility with your DSL than to force integration with a language that does not interoperate seamlessly with your host language. Otherwise you will not only have problems developing the interfaces between the DSL and the host language, you will also face endless interoperability problems when exceptions are thrown.

Don't let the language cacophony invade the horizons of your DSL implementation!

Consider the following example where we have a DSL in Groovy that models an Order that a client places to a securities broker firm for trading of stocks or bonds. The DSL is used to populate orders into the database when a client calls up the broker and asks for buy/sell transactions.


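Something along these lines - a sketch only, since the actual DSL implementation is not shown here; names like newOrder, limitPrice and valueAs are illustrative of the kind of fluent Groovy API, not the exact vocabulary ..

newOrder.to.buy(100.shares.of('IBM')) {
  limitPrice   300
  allOrNone    true
  valueAs      { qty, unitPrice -> qty * unitPrice - 500 }
}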

I am not going into the details of implementation of this DSL. But consider that the above script, on execution, returns an instance of Order, which is an abstraction that you developed in Groovy. Now you have your main application written in Java, where you have the complete domain model of the order processing component. Your core application Order abstraction may be different from what you have in the DSL. Your DSL only constructs the abstraction needed to populate the order details that the broker receives from the clients.

It's extremely important that the Order abstraction that comes out of executing the Groovy script be available within your Java code. You can use the following mechanism to execute your Groovy code ..


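A minimal sketch, assuming the DSL script is saved in a file order.groovy and evaluated through groovy.lang.GroovyShell ..

GroovyShell shell = new GroovyShell();
shell.evaluate(new File("order.groovy"));   // runs the script, return value ignored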

This runs the script but does not give you any integration with your Java application. Another option may be to use the Java 6 ScriptEngine for talking back to your Java application - something like the following snippet, which you can have within your main application to execute the Groovy script using the ScriptEngine ..


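A sketch using the JSR 223 javax.script API, assuming as before that the script lives in order.groovy and leaves a collection of orders as its return value ..

ScriptEngineManager mgr = new ScriptEngineManager();
ScriptEngine engine = mgr.getEngineByName("groovy");
Object result = engine.eval(new FileReader("order.groovy"));
// result holds whatever the script returned - you can iterate over it,
// but it remains opaque to the Java type system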

Here you have some form of integration, but it's not ideal. You get back some objects into your Java code and can iterate over the collection .. still, the objects that you get back from Groovy are opaque at best. Also, since the script executes in the sandbox of the ScriptEngine, in case of an exception the line numbers mentioned in the stack trace will not match those of the source file. This can make debugging exceptions thrown from the DSL script difficult.

Groovy has excellent interoperability with Java even at the scripting level. When integrating DSLs with your main application always prefer the language specific integration features over any generic script engine based support. Have a look at the following that does the same job of executing the Groovy script from within a Java application. But this time we use GroovyClassLoader, a ClassLoader that extends URLClassLoader and can load Groovy classes from within your Java application ..


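A minimal sketch, assuming once again that the DSL script is in order.groovy and that its last expression is a Closure ..

GroovyClassLoader loader = new GroovyClassLoader(getClass().getClassLoader());
Class scriptClass = loader.parseClass(new File("order.groovy"));
Script script = (Script) scriptClass.newInstance();
Closure closure = (Closure) script.run();   // the script returns a Closure
Object order = closure.call();              // a first-class Groovy object, fully visible to Java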

You will have to make your DSL script return a Closure that then gets called within the Java application. Note that within Java we now have a complete handle on the Order classes that we defined in Groovy.

This is an example that demonstrates the use of proper integration techniques while designing your DSL. In my upcoming book DSLs In Action I discuss many other integration techniques for DSLs developed on the JVM. The above example is also an adaptation of what I discuss in the book in the context of integrating Groovy DSLs with a Java application.

Thanks to Guillaume Laforge and John Wilson for many suggestions on improving the Groovy DSL and its interoperability with Java.

Sunday, February 01, 2009

Asynchronous Write Behinds and the Repository Pattern

The following is a typical implementation of the service methods of the domain model of an application. The Repository is injected and is used to persist the domain model or look up objects from the underlying store. The entire storage and the mechanics of the underlying retrieval are abstracted within the DAO / Repository layer.

public class RestaurantServiceImpl implements RestaurantService {

  @Autowired
  public RestaurantServiceImpl(..) {
    //..
  }

  // injected
  private final RestaurantRepository restaurantRepo;

  public void storeRestaurants(List<Restaurant> restaurants) {
    restaurantRepo.store(restaurants);
  }
}


In a typical layered architecture, the database often proves to be the hardest layer to scale. And in the above implementation, restaurantRepo.store() is a synchronous method that keeps you in abeyance till the data gets persisted across all the layers of your architecture down to the bits and pieces of the underlying relational store. Of course it can be any other store as well - after all, the repository is an abstraction, so it doesn't matter to the application whether you use a relational database, a native file system or a document database underneath. But you get the idea, synchronous communication with the database / hard disk often turns out to be the bottleneck here.

Terracotta provides a nice option for virtualizing your interaction with the database. The Async TIM (Terracotta Integration Module) provides asynchronous write-behind to the database, while the application works on in-memory data structures. Terracotta offers network attached memory with transparent JVM clustering that allows data structures to be *declaratively* clustered. The value proposition here is that the user can work on the object model, using POJOs, delegating the concerns of persistence to an asynchronous Terracotta process.

Here is an example of the above service extended to handle asynchronous write behinds ..

public class AsyncRestaurantServiceImpl extends RestaurantServiceImpl {

  // need to be clustered
  @Root
  private final AsyncCoordinator<Restaurant> asyncCommitter =
    new AsyncCoordinator<Restaurant>(new RestaurantAsyncConfig(), new NeverStealPolicy<Restaurant>());

  // dependency injected
  private final RestaurantCommitHandler handler;

  @Autowired
  public AsyncRestaurantServiceImpl(..) {
    super();
    asyncCommitter.start(handler, ..);
  }

  @Override
  public void storeRestaurants(List<Restaurant> restaurants) {
    asyncCommitter.add(restaurants);
  }

  //.. other methods
}


The AsyncCoordinator<> is the agent that handles the persistence asynchronously in the background. The class RestaurantCommitHandler contains the actual code that writes the collection of Restaurants to the database. RestaurantCommitHandler implements ItemProcessor<> - the added items get bucketed and throttled asynchronously for database commits through the ItemProcessor, while the application moves on, simply adding the POJOs to be persisted.

@Service
public class RestaurantCommitHandler implements ItemProcessor<Restaurant> {
  //..
}


Now, we can take this one step further. The Repository is supposed to abstract the handling of storage and retrieval - so why not abstract the asynchronous persistence within the repository itself and keep the service implementation clean. Then enabling asynchrony at the service layer becomes simply a matter of injecting the proper repository ..

interface RestaurantRepository {
  void store(List<Restaurant> restaurants);
}

class RestaurantRepositoryImpl implements RestaurantRepository {
  public void store(List<Restaurant> restaurants) {
    //.. standard DAO based implementation
  }
}

class AsyncRestaurantRepositoryImpl implements RestaurantRepository {
  @Root
  private final AsyncCoordinator<Restaurant> asyncCommitter =
    new AsyncCoordinator<Restaurant>(new RestaurantAsyncConfig(), new NeverStealPolicy<Restaurant>());

  // dependency injected
  private final RestaurantCommitHandler handler;

  public AsyncRestaurantRepositoryImpl() {
    super();
    asyncCommitter.start(handler, ..);
  }

  public void store(List<Restaurant> restaurants) {
    asyncCommitter.add(restaurants);
  }

  //.. other methods
}


I have not yet used the above in any production application. But the idea of decoupling the main processing from the underlying database decreases the write latency of domain objects. Couple this with Terracotta's original value proposition of cluster-wide in-process distributed coherent caching, and I think it can prove to be a really wicked cool platform for scaling out your application. The system of record (SOR) is now closer to the application, and the database can act as a snapshot for audit trails and reporting purposes. Of course this asynchronous write-behind is not suitable as a plug-in to an existing architecture where you have lots of loosely coupled systems interconnected through databases. But I guess there can be many use cases for which this can be a viable solution.

However, looking at the current state of the Terracotta async write-behind framework, one area that concerns me is the lack of out-of-the-box support for cases when the database may be down for an extended period. The framework leaves it to the client to implement any such failover support. The ItemProcessor is a non-clustered local instance - hence the user can very well catch the ProcessingException and act upon it according to business needs. Still, it would be nice to have some support from the framework, whereby the application can continue to run in-memory and sync up later when the database comes back up.

Would love to hear some real life stories from anyone with experience to share on usage of Terracotta Async module ..

Sunday, January 18, 2009

Generic Repository and DDD - Revisited

Greg Young talks about the generic repository pattern and how to reduce the architectural seam of the contract between the domain layer and the persistence layer. The Repository is the contract of the domain layer with the persistence layer - hence it makes sense to have the contract of the repository as close to the domain as possible. Instead of a contract as opaque as Repository.FindAllMatching(QueryObject o), it is always recommended that the domain layer look at something as self-revealing as CustomerRepository.getCustomerByName(String name) that explicitly states the participating entities of the domain. +1 on all his suggestions.

However, he suggests using composition, instead of inheritance, to encourage reuse along with encapsulation of the implementation details within the repository itself .. something like the following (Java-ized)

public class CustomerRepository implements ICustomerRepository {
  private Repository<Customer> internalGenericRepository;

  public List<Customer> getCustomersWithFirstNameOf(String name) {
    return internalGenericRepository.fetchByQueryObject(
      new CustomerFirstNameOfQuery(name)); // could be hql or whatever
  }
}



Quite some time ago, I had a series of blogs on DDD, JPA and how to use generic repositories as an implementation artifact. I had suggested the use of the Bridge pattern to allow independent evolution of the interface and the implementation hierarchies. The interface side of the bridge will model the domain aspect of the repository and will ultimately terminate at the contracts that the domain layer will use. The implementation side of the bridge will allow for multiple implementations of the generic repository, e.g. JPA, native Hibernate or even, with some tweaking, some other storage technologies like CouchDB or the file system. After all, the premise of the Repository is to offer a transparent storage and retrieval engine, so that the domain layer always has the feel that it is operating on an in-memory collection.

// root of the repository interface
public interface IRepository<T> {
  List<T> read(String query, Object[] params);
}

public class Repository<T> implements IRepository<T> {

  private RepositoryImpl repositoryImpl;

  public List<T> read(String query, Object[] params) {
    return repositoryImpl.read(query, params);
  }

  //..
}



Base class of the implementation side of the Bridge ..

public abstract class RepositoryImpl {
  public abstract <T> List<T> read(String query, Object[] params);
}


One concrete implementation using JPA ..

public class JpaRepository extends RepositoryImpl {

  // to be injected through DI in Spring
  private EntityManagerFactory factory;

  @Override
  public <T> List<T> read(String query, Object[] params) {
    //.. JPA based implementation
  }
}


Another implementation using Hibernate. We can have similar implementations for a file system based repository as well ..

public class HibernateRepository extends RepositoryImpl {
  @Override
  public <T> List<T> read(String query, Object[] params) {
    // .. hibernate based implementation
  }
}


Domain contract for the repository of the entity Restaurant. It is neither opaque nor narrow - it uses the Ubiquitous Language and is self-revealing to the domain user ..

public interface IRestaurantRepository {
  List<Restaurant> restaurantsByEntreeName(final String entreeName);
  //..
}


A concrete implementation of the above interface. Implemented in terms of the implementation artifacts of the Bridge pattern. At the same time the implementation is not hardwired with any specific concrete repository engine (e.g. JPA or filesystem). This wiring will be done during runtime using dependency injection.

public class RestaurantRepository extends Repository<Restaurant>
  implements IRestaurantRepository {

  public List<Restaurant> restaurantsByEntreeName(String entreeName) {
    Object[] params = new Object[1];
    params[0] = entreeName;
    return read(
      "select r from Restaurant r where r.entrees.name like ?1",
      params);
  }
  // .. other methods implemented
}


One argument could be that the query string passed to the read() method is dependent on the specific engine used. But it can very easily be abstracted using a factory that returns the appropriate metadata required for the query (e.g. named queries for JPA).
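
A hypothetical sketch of such a factory - the names QueryFactory and queryFor are purely illustrative ..

public interface QueryFactory {
  // returns the engine specific query metadata for a logical query name
  String queryFor(String logicalQueryName);
}

public class JpaQueryFactory implements QueryFactory {
  private final Map<String, String> queries = new HashMap<String, String>();

  public JpaQueryFactory() {
    queries.put("restaurantsByEntreeName",
      "select r from Restaurant r where r.entrees.name like ?1");
  }

  public String queryFor(String logicalQueryName) {
    return queries.get(logicalQueryName);
  }
}

RestaurantRepository would then call read(queryFactory.queryFor("restaurantsByEntreeName"), params), with the factory injected along with the repository engine.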

How does this compare with Greg Young's solution ?

Some of the niceties of the above Bridge based solution are ..

  • The architectural seam exposed to the domain layer is NOT opaque or narrow. The domain layer works with IRestaurantRepository, which is intention revealing enough. The actual implementation is injected using Dependency Injection.

  • The specific implementation engine is abstracted away and once again injected using DI. So, in the event of using alternative repository engines, the domain layer is NOT impacted.

  • Greg Young suggests using composition instead of inheritance. The above design also uses composition to encapsulate the implementation within the abstract base class Repository.


However in case you do not want to have the complexity or flexibility of allowing switching of implementations, one leg of the Bridge can be removed and the design simplified.

Monday, June 02, 2008

Java to Scala - Smaller Inheritance hierarchies with Structural Typing

I was going through a not-so-recent Java code base that contained the following structure for modeling the employee hierarchy of an organization. This looks quite representative of idiomatic Java being used to model a polymorphic hierarchy for designing a payroll generation application.


public interface Salaried {
  int salary();
}

public class Employee implements Salaried {
  //..
  //.. other methods

  @Override
  public int salary() {
    // implementation
  }
}

public class WageWorker implements Salaried {
  //..
  //.. other methods

  @Override
  public int salary() {
    // implementation
  }
}

public class Contractor implements Salaried {
  //..
  //.. other methods

  @Override
  public int salary() {
    // implementation
  }
}



And the payroll generation class (simplified for brevity ..) that actually needs the subtype polymorphism between the various concrete implementations of the Salaried interface.


public class Payroll {
  public int makeSalarySheet(List<Salaried> emps) {
    int total = 0;
    for(Salaried s : emps) {
      total += s.salary();
    }
    return total;
  }
}



While implementing in Java, have you ever wondered whether public inheritance is the best approach to model such a scenario ? After all, in the above class hierarchy, the classes Employee, WageWorker and Contractor do not have *anything* in common except the fact that all of them are salaried persons and that subtype polymorphism has to be modeled *only* for the purpose of generating paysheets for all of them through a single API. In other words, we are coupling the entire class hierarchy through a compile time static relationship only to unify a single commonality in behavior.

Public inheritance has frequently been under fire, mainly because of the coupling that it induces between the base and the derived classes. Experts say inheritance breaks encapsulation, and regard it as the second strongest relationship between classes (next only to the friend classes of C++). Interface driven programming has its advantages in promoting loose coupling between the contracts that it exposes and their concrete implementations. But interfaces in Java also pose problems when it comes to the evolution of an API - once an interface is published, it is not possible to make any changes without breaking client code. No wonder we find design patterns like the Extension Object, or strict guidelines for the evolution of abstractions being enforced in big projects like Eclipse.

Finer Grained Polymorphism

Structural typing offers the ability to reduce the scope of polymorphism to only the subset of behaviors that need to be common between the classes. Just as in duck typing, commonality in abstractions does not mean that they belong to one common type; only that they respond to a common set of messages. Scala offers the best of both worlds through its implementation of structural typing - a compile time checked duck typing. Hence we have a nice solution to unify certain behaviors of otherwise unrelated classes. The entire class hierarchy need not be related through a static compile time subtyping relationship in order to be processed polymorphically over a certain set of behaviors. As an example, I tried modeling the above application using Scala's structural typing ..


case class Employee(id: Int) { def salary: Int = //.. }
case class DailyWorker(id: Int) { def salary: Int = //.. }
case class Contractor(id: Int) { def salary: Int = //.. }

class Payroll {
  def makeSalarySheet(emps: List[{ def salary: Int }]) = {
    (0 /: emps)(_ + _.salary)
  }
}

val l = List[{ def salary: Int }](DailyWorker(1), Employee(2), Employee(1), Contractor(9))
val p = new Payroll
println(p.makeSalarySheet(l))



The commonality in behavior between the above classes is through the method salary, which is only used in the method makeSalarySheet for generating the payroll. We can generalize this commonality into an anonymous type that implements a method having the same signature. All classes that implement a method salary returning an Int are said to be structurally conformant to this anonymous type { def salary: Int }. And of course we can use this anonymous type as a generic parameter to a Scala List. In the above snippet we define makeSalarySheet to accept such a List as parameter, which will include all types of workers defined above.

The Smart Adapter

Actually it gets better than this with Scala. Suppose in the above model the name salary is not meaningful for DailyWorkers, and the standard business terminology for their earnings is wage. Hence let us assume that for the DailyWorker, the class is defined as ..

case class DailyWorker(id: Int) { def wage: Int = //.. }


Obviously the above scheme will not work now, and the unfortunate DailyWorker falls out of the closure of all types that qualify for payroll generation.

In Scala we can use implicit conversion - I call it the Smart Adapter Pattern .. we define a conversion function that automatically converts wage into salary and instructs the compiler to adapt the wage method to the salary method ..


case class Salaried(salary: Int)
implicit def wageToSalary(in: {def wage: Int}) = Salaried(in.wage)



The makeSalarySheet API now changes accordingly to process a List of objects that either implement an Int returning salary method or can be implicitly converted to one with the same contract. This is indicated by <% and is known as a view bound in Scala. Here is the implementation of the class Payroll that incorporates this modification ..


class Payroll {
  def makeSalarySheet[T <% { def salary: Int }](emps: List[T]) = {
    (0 /: emps)(_ + _.salary)
  }
}



Of course the rest of the program remains the same, since all the conversions and implicit magic take place in the compiler .. and we can still process all objects polymorphically even with a different method name for DailyWorker. Here is the complete source ..


case class Employee(id: Int) { def salary: Int = //.. }
case class DailyWorker(id: Int) { def wage: Int = //.. }
case class Contractor(id: Int) { def salary: Int = //.. }

case class Salaried(salary: Int)
implicit def wageToSalary(in: {def wage: Int}) = Salaried(in.wage)

class Payroll {
  def makeSalarySheet[T <% { def salary: Int }](emps: List[T]) = {
    (0 /: emps)(_ + _.salary)
  }
}

val l = List[{ def salary: Int }](DailyWorker(1), Employee(2), Employee(1), Contractor(9))
val p = new Payroll
println(p.makeSalarySheet(l))



With structural typing, we can afford to be more conservative with public inheritance. Inheritance should be used *only* to model a true subtype relationship between classes (aka LSP). Inheritance definitely has lots of uses; we only need to use our judgement not to misuse it. It is a strong relationship, and, as the experts say, always try to implement the weakest relationship that correctly models your problem domain.

Monday, May 12, 2008

Thinking in JVM languages

When I find a language expressive enough to implement programming idioms succinctly, I like to use it in my projects. But I constantly have to remind myself that project development is a team game and the language has to be usable by all members of the development team. Another fact that stares at me is the deadline for delivery, long committed to the client by a different team completely oblivious of all the constraints of the software development lifecycle that rock the project right from inception. Hence choosing the programming language is also a function of the number of available frameworks, libraries, refactoring-friendly IDEs and the community support that can perceivably add to the velocity of program development. Considering all such forces, it is a no brainer that both we and the client happily choose the Java programming language for most of our development projects.

Is there any value in learning new languages, according to the epigrams of Alan Perlis ? Michael Nygard recently wrote a very thoughtful essay on this, keeping in mind the recent trends of development planned for Java, the language and Java, the platform. Here are some of our thoughts from the trenches of the development team ..

Of late, a couple of new developments on the JVM platform have added a new dimension to our project infrastructure. One of them is the scripting support that comes bundled with Java 6 and the other is the emergence of languages like Scala, Groovy and JRuby that can coexist happily in a single project under the caring patronage of the Java virtual machine. As a result, things become a little more interesting nowadays, and we can enjoy some amount of polyglotism by sneaking in a few of these spices into the mass of Java objects. I had earlier blogged on some such (not all successful) attempts -

  • trying to use Scheme as an executable XML through SISC, an attempt that got shot down the moment I uttered the word Lisp

  • using Javascript through Rhino engine to externalize some of the customizable business rules in a big Java project

  • making Java objects smarter with nicely scaffolded APIs using Scala's lexically scoped open classes


Is there any value to such attempts in projects where the bulk of the code is churned out by new inexperienced developers and managed by delivery managers encouraging maintainable programming paradigms ?

Isn't Java enough ?

Java has been the most dominant programming language for the enterprise. Given the proper set of tools, I love programming in Java, however unhip that may sound today. Then why do I try to sneak in those weird features of Groovy, Scala or Rhino, when the ecosystem of the project thrives mainly on a Java diet ? Because I believe syntax matters, succinctness makes abstractions more resilient, and powerful idioms always reduce the solution space noise, making the code base a truer representation of the business problem that I am trying to solve. Design patterns encourage best practices in application design, but their implementations in Java often tend to generate lots of boilerplate code which, in other higher level languages, can be addressed through richer levels of abstraction.

I understand that implementing a whole new enterprise project in an evolving language like Scala or Groovy is not a feasible option today (yet). But I can certainly use some of their goodness to make my APIs more like the domain that I am modeling. And the best thing is that I can use this power on top of my existing Java objects. The key part is to identify the use cases that make this exercise non-invasive, risk free and maintainable by the profile of programmers on the job.

In a typical multi-tiered application stack, there are layers which do not need the safety net of static typing, e.g. the modules which interact with the external data sources and receive amorphous data streams that my application has to consume and forward to the domain layer underneath. Groovy is a dynamically typed language on the JVM and provides strong support for XML consumption and generation without the verbosity of Java. Hamlet D'Arcy demonstrates how effectively we can use Groovy in the service layer, acting as the perfect glue to my Java based domain layer. This makes the code base smaller through programming at a higher level of abstraction, and at the same time keeps dynamic modules decoupled from the core static domain layer. External interfaces usually need to be kept malleable enough that changes to them do not impact your core logic - and Groovy or JRuby provide ample scope for such decoupling.
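
As a hedged illustration, the Groovy glue for consuming such a feed can be as small as the following - the feed format, the Order properties and the domainService call are all made up for the example ..

def feed = new XmlSlurper().parseText(incomingXml)   // incomingXml: the amorphous external stream
feed.order.each { o ->
  // forward a strongly typed object to the Java domain layer underneath
  domainService.place(new Order(id: o.@id.toInteger(), instrument: o.@instrument.toString()))
}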

In one of our Java EE projects, I had used Rhino scripting to keep business rules externalized from the core model. The requirement was to have the rules configurable by the users without a full build of the application code, with hot deployment of those rules within the running application containers. The scripting engine support bundled with Java 6 is a great option here, providing dynamic loading capabilities for all such scripting tasks. With OSGi becoming mainstream, I am sure we will have better options for application packaging, versioning and deployment very soon.
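
A sketch of how this looks with the Java 6 scripting support - the rule file name and the function are hypothetical ..

ScriptEngine js = new ScriptEngineManager().getEngineByName("JavaScript");
js.eval(new FileReader("discount-rule.js"));   // externalized rule, reloadable without a build
Invocable invocable = (Invocable) js;
Object discount = invocable.invokeFunction("computeDiscount", order);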

And for the static typing aficionados .. (believe me, I am also one !)

And it's not only with dynamically typed languages - you can get the benefits of static typing along with nice, concise syntax on the JVM today. Scala is a multi-paradigm language for the JVM offering all the goodness of statically checked duck typing, type inference, rich functional features and some great library support for threadless concurrency and parser combinators. Scala supports XML literals as part of the language, which can very well be used to implement elegant XML crunching modules, much more concise than the DOM APIs or the JAXB frameworks that Java offers.

Recently in one of my programming assignments, I had to design a few adapters to existing Java classes not related through common parentage. The requirement was to define a set of uniform operations over a collection of the adapted classes, based on some common structural classifiers. Initially I came up with a Java solution. It was standard idiomatic Java that would have passed any careful review a couple of years ago. I tried the same problem in Scala and came up with a far more elegant and succinct solution. The three features of Scala that made the solution more precise are the support for structural typing, implicit adapters, and of course, functional programming. And since the entire development was additive and belonged to the service layer of the application, the core domain model was not impacted. The client was never bothered, as long as his investments and commitments on the JVM were protected. As David Pollak has recently stated in one of his posts, it is only an additional jar. So true.

Is the infrastructure ready ?

All the JVM languages are evolving - even Java is slated to undergo lots of evolution in the coming days (closures, JSR-308, modularity, ..). The most important thing, as I mentioned above, is to follow the evolutionary path and carefully choose the use cases where you plug in the goodness of these languages. To me, lots of risks are mitigated once you start using them as additive glue, rather than as invasive procedures. These languages are becoming more performant by the day, and innovations in hosting languages on a common runtime are now a reality. Groovy 1.6 has seen significant performance improvements in method dispatch by shortening the call path between the caller and the receiver through method handles and call site caching, a technique previously applied in optimizing JRuby performance and documented very well by Charles Nutter in one of his recent posts. This is one big JVM community force in action towards improving the daily life of all languages hosted there.

The best part of "polyglotism under a common runtime" is that I can very well use a uniform toolset for all the languages that I use. Unit testing frameworks like JUnit, TestNG are available to all developers working on multiple languages like Java, Groovy, Scala etc. Maven and Ant with all their eyesore XMLs are still available for any of them. And of course I can use my favorite IDE polymorphically over all languages, albeit with varying degrees of refactoring abilities. And if I am adventurous enough, I can also use additional power of JTestR, ScalaCheck and specs for doing all sorts of BDD and PDD stuff. Real fun huh!

Are you planning to use any of the non-Java, JVM friendly languages in your Java project ? What are the typical use cases that you think fits the bill for another JVM language ?

Tuesday, April 22, 2008

Syntactic Sugars - What makes them sweet ?

Stephen Colebourne talks about implementing for-each in Java maps. He has proposed changes to be made to javac and queued up his request for approval by the appropriate authorities. It is good to see Java community leads taking some serious steps towards syntactic sugars in the language. I am always for intention revealing syntactic sugars - after all, programs must be written for people to read, and only incidentally for machines to execute.

Syntactic sugars, when properly designed, reduce the semantic distance between the problem domain and the solution domain. Syntactic sugars do not add new features or capabilities to an existing language. Still we value them mainly for social reasons - they can make your abstractions much more explicit, thereby making your intentions much more direct. And syntactic sugars often lead to concise and succinct code much pleasing to your eyes.

Java is not a language that boasts of concise syntax. Yet the smart for loop introduced in Java 5 reduces a lot of accidental complexity and makes the programmer's intention much more explicit ..

for(String name : names) {
  // process name
}


is much more succinct than

for(Iterator<String> it = names.iterator(); it.hasNext(); ) {
  String name = it.next();
  // process name
}


judging from the fact that the latter snippet has its intention buried under the verbosity of structures not directly related to what the programmer wants to express.

names foreach println


is better, though not Java.

Many of the languages in use today offer lots of syntactic sugars abstracting rich capabilities of the underlying language. Take, for example, the Groovy builder syntax, which exploits the mechanics of meta-programming, closures and named arguments to implement elegant, concise, intuitive APIs. Java developers use binding frameworks to manipulate XML and bind it to the object model or the relational database schema. Not that there is anything wrong with that. But the developer has to go through all the hoops of mapping the object structure to the XML schema, and using an external framework like JAXB leads to a much longer version of the same solution than one using Groovy MarkupBuilder.
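
As an illustration, generating an XML document with Groovy's MarkupBuilder takes little more than writing down the shape of the document - the element names here are invented ..

def writer = new StringWriter()
def xml = new groovy.xml.MarkupBuilder(writer)
xml.order(id: 1) {
  instrument('IBM')
  quantity(100)
}
println writer.toString()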

Syntactic sugars are nothing new in the landscape of programming languages. It all started (possibly) with Lisp, offering macros as the means to design syntactic abstractions. To add a little sugar to the syntax the language offers, you need not wait till the next official release. In Lisp, the syntax of the program is a direct representation of the AST, and with macros you can manipulate the parse tree directly. Languages like Lisp are known to offer syntax extensibility, allowing developers to implement their own syntactic sugar.

Ruby offers runtime meta-programming, another technique to add your own syntactic sugars. Ruby does not have a macro system where you can play around with the abstract syntax tree, though we have had a ruby parser released by Ryan Davis that has been written entirely in Ruby. The standard meta object protocol offered by Ruby allows developer control over the language semantics (not the syntax) and has the capability to generate classes and methods dynamically at runtime. Meta-programming, method_missing, open classes, optional parentheses are some of the features that make Ruby a great language to build syntax abstractions for runtime processing.

A language built on the philosophy of bottom up programming offers extensible syntax (be it through the syntactic abstractions of Lisp or the semantic customizations of Ruby), on which syntactic sugars can be constructed by developers. Java believes in democratization of all syntax offered by the language, and it may take quite a few years to officialize the little sugar that you have been yearning for. Remember the explosion in the number of blog posts in celebration of the for-each loop when it came out with Java 5. In other languages, people build new syntax by the day and evolve new vocabulary within the same language that maps into the domain that they model. If you miss those features which you enjoyed in your earlier language, just build them over the new language. And it does not necessarily have to be a process of hooking onto the compilation cycle or plugging customized modules into your language parsers.

Many of today's languages offer strong enough capabilities to build structures that look like syntax extensions. Scala is an example that makes the cut in this category. The advanced type system of Scala enables developers to write control structures within the syntax of the language that look like syntactic abstractions. Max likes deterministic finalization in C# and its idiomatic usage with the "using" keyword. He has implemented the same syntax in Scala using closures, view bounds and implicit conversions. Besides eliminating lots of boilerplate, his extension looks charmingly useful for the domain he is using it in.

Syntax extensibility is a necessity if you want your language to support the evolution of DSLs. Extensible syntax scales much better than the framework based approach so popularized by Java. When you add a new syntactic structure to your language, it meshes so nicely with the rest of the language constructs that you never feel that it has been bolted on from outside. Although in reality it is nothing more than syntactic sugar presented in a form that makes more sense when coding for the particular problem at hand. When we talk about a language, we think in terms of parsers - and this is no different when we think about DSLs. Implementing an external DSL is hard, considering the enormous complexity that parser generators make you go through. Scala offers monadic parser combinators with which you can directly map your EBNF syntactic structures into implementations, as the sketch below shows. And all this is done through syntactic sugars on top of the closures and higher order functions that the language offers.
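
For instance, a small sketch with Scala's parser combinators - the grammar is invented, but each production maps almost one-to-one from its EBNF form ..

import scala.util.parsing.combinator._

object OrderDsl extends JavaTokenParsers {
  // order ::= ("buy" | "sell") number "shares" "of" identifier
  def order: Parser[Any] = ("buy" | "sell") ~ wholeNumber ~ "shares" ~ "of" ~ ident
}

// OrderDsl.parseAll(OrderDsl.order, "buy 100 shares of IBM")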

Higher Order Functions - The Secret Sauce ?

There have been lots of debates on whether object-oriented interfaces scale better than syntax extension capabilities in language design. While OO certainly has its place in modularizing components and abstracting away the relationships between them, there are situations when objects force us to fit a round peg in a square hole. How many times have you cursed Java for forcing you to define an unnecessary interface just to apply a function over a set of abstractions defining a specific set of contracts ? You can do the same in Scala using structural typing (aka anonymous types) and higher order functions. Higher order functions seem to be the secret sauce for offering syntax extensibility in programming languages.

Monday, March 17, 2008

Are you fully using your Static Typing ?

Back in 2005 in an LtU discussion on Dynamic vs. Static Typing, Anton van Straaten had this post ..

Here's a nice bit of Java code I came across (here):


if ((value != null) && !returnedClass().isAssignableFrom(value.getClass())) {
    throw new IllegalArgumentException("Received value is not a [" +
       returnedClass().getName() + "] but [" + value.getClass() + "]");
}



This is from a piece of code that's trying very, very hard to avoid the need for the definition of boilerplate classes when persisting classes representing enumeration types to a SQL database.

This code is actually doing a kind of dynamic typechecking, illustrating the following generalization of Greenspun's 10th Law: "any sufficiently complicated program in a statically-typechecked language contains an ad-hoc, informally-specified bug-ridden slow implementation of a dynamically-checked language." ;)

Today's good Java frameworks use reflection quite sparingly and responsibly. Using Java generics, these frameworks allow compile time type checking for cases which would earlier have to be implemented using a slow and bug ridden simulation of runtime type checking. Guice and EasyMock stand out as two frameworks I have been using that have used the power of generics to implement extraordinary typesafety.

I really like the way small interface-heavy APIs of Guice enforce compile time type-safety.

Have a look at this piece of code, which binds an implementation SpecialServiceImpl to the interface Service using Guice Binder.


public class MyModule implements Module {
    public void configure(Binder binder) {
        binder.bind(Service.class)
              .to(SpecialServiceImpl.class)
              .in(Scopes.SINGLETON);
    }
}



Given the fact that DI frameworks are in the business of injecting implementations into objects dynamically, it may seem that the "implements" relationship between Service and SpecialServiceImpl is verified at runtime. But thanks to Java generics, every bit of this type checking is done at compile time.

A peek at the source code of Guice reveals that BinderImpl.bind() returns BindingBuilderImpl<T> ..


public <T> BindingBuilderImpl<T> bind(Class<T> clazz) {
    return bind(Key.get(clazz));
}



and BindingBuilderImpl<T>.to() takes as input Class<? extends T> - the bound on the wild card enforces the above "implements" relationship as part of compile time type checking of the arguments ..


public ScopedBindingBuilder to(Class<? extends T> implementation) {
    return to(TypeLiteral.get(implementation));
}



In comparison to Guice, Spring makes much heavier use of reflection, which, I think, is kind of expected from a pre-generics framework. Spring's implementation has lots of code similar to


// Check if required type matches the type of the actual bean instance.
if (requiredType != null && bean != null &&
      !requiredType.isAssignableFrom(bean.getClass())) {
    throw new BeanNotOfRequiredTypeException(name, requiredType, bean.getClass());
}



that does quite a bit of juggling with dynamic type checking at runtime.

Coming back to the above post by Anton - yes, this kind of runtime type checking exists in lots of popular Java frameworks, even today. And this is where frameworks like Guice and EasyMock really shine, with their strongly typed API sets that make you feel more secure within the confines of your IDE and its refactoring abilities.

The Moral

When you are programming in a statically typed language, use appropriate language features to do as much of your type checking as possible at compile time. This way, before you hit the run button, you can be assured that your code is well-formed within the bounds of the type system. And you gain the power of easier refactoring and cleaner evolution of your codebase.
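
As a closing sketch - a hypothetical Registry, not from any framework, that uses the same bounded wildcard trick as Guice to move the subtype check to compile time ..

public class Registry<T> {
  private Class<? extends T> impl;

  public void bind(Class<? extends T> implementation) {
    this.impl = implementation;   // the compiler guarantees implementation IS-A T
  }

  public T newInstance() throws Exception {
    return impl.newInstance();    // no cast, no runtime isAssignableFrom() check
  }
}

Registry<Service> r = new Registry<Service>();
r.bind(SpecialServiceImpl.class);   // compiles
r.bind(String.class);               // rejected by the compiler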