Tuesday, October 03, 2006

Aspect Days are Here Again ..

and it's raining pointcuts in Spring. The most discussed topic in blogs and forums right now is that Spring 2.0 will support AspectJ 5. I have been tinkering with aspects for some time and have also applied AOP in production environments. I narrated that experience in InfoQ, where I used AOP to implement application-level failover over database and MOM infrastructures. This post is all about the afterthoughts of my continuing experiments with AOP as a first class modeling artifact in OO design.

First Tryst with Domain Aspects

I am tired of looking at aspects for logging, tracing, auditing and profiling an application. With the new AspectJ 5 and Spring integration, you can do all sorts of DI and wiring on aspects using the most popular IoC container. Spring's own @Transactional and @Configurable are great examples of AOP under the hood. However, I kept asking "Show me the Domain Aspects", since I always thought that in order to make aspects first class citizens in modeling enterprise applications, they have to participate in the domain model.

In one of our applications, we had a strategy for price calculation which worked with the usual model of injecting the implementation through the Spring container.


<beans>
  <bean id="defaultStrategy" class="org.dg.domain.DefaultPricing"/>

  <bean id="priceCalculation" class="org.dg.domain.PriceCalculation">
    <property name="strategy">
      <ref bean="defaultStrategy"/>
    </property>
  </bean>
</beans>
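
For reference, here is a minimal sketch of the contracts that this XML wires together. Only the class and property names come from the snippets in this post; the calculate() signature is an assumption for illustration.


package org.dg.domain;

import java.math.BigDecimal;

public interface ICalculationStrategy {
  // signature assumed - the actual contract is not shown in this post
  BigDecimal calculate(String instrumentId);
}

public class DefaultPricing implements ICalculationStrategy {
  public BigDecimal calculate(String instrumentId) {
    // the default pricing logic goes here
    return BigDecimal.ZERO;
  }
}

public class PriceCalculation {
  private ICalculationStrategy strategy;

  // setter injection target for the <property name="strategy"> element
  public void setStrategy(ICalculationStrategy strategy) {
    this.strategy = strategy;
  }

  // the join point that the selector aspect below advises
  public BigDecimal calculate(String instrumentId) {
    return strategy.calculate(instrumentId);
  }
}
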


Things worked like a charm, till in one of the deployments the client came back demanding strategy failover. The default implementation would continue to work as the base case, while in the event of failures we needed to iterate over a collection of strategies till one came back with a valid result. This being a one-off request, we decided NOT to change the base class and the base logic of strategy selection. Instead we chose a non-invasive way of handling the client request by implementing pricing strategy alternatives through a domain level aspect.


import java.util.List;

public aspect CalculationStrategySelector {

  private List<ICalculationStrategy> strategies;

  public void setStrategies(List<ICalculationStrategy> strategies) {
    this.strategies = strategies;
  }

  pointcut inCalculate(PriceCalculation calc)
    : execution(* PriceCalculation.calculate(..)) && this(calc);

  Object around(PriceCalculation calc) : inCalculate(calc) {
    int i = 0;
    int maxRetryCount = strategies.size();
    while (true) {
      try {
        return proceed(calc);
      } catch (Exception ex) {
        if (i < maxRetryCount) {
          // fail over to the next alternative strategy and retry
          calc.setStrategy(getAlternativeStrategy(i++));
        } else {
          // all alternatives exhausted - give up and report
          throw new IllegalStateException("All pricing strategies failed", ex);
        }
      }
    }
  }

  private ICalculationStrategy getAlternativeStrategy(int index) {
    return strategies.get(index);
  }
}



And the options for the selector were configured in Spring's configuration XML ..


<beans>
  <bean id="strategySelector"
    class="org.dg.domain.CalculationStrategySelector"
    factory-method="aspectOf">
    <property name="strategies">
      <list>
        <ref bean="customStrategy1"/>
        <ref bean="customStrategy2"/>
      </list>
    </property>
  </bean>

  <bean id="customStrategy1" class="org.dg.domain.CustomCalculationStrategy1"/>
  <bean id="customStrategy2" class="org.dg.domain.CustomCalculationStrategy2"/>
</beans>


The custom selectors kicked in only when the default strategy failed. Thanks to AOP, we could handle this problem completely non-invasively, without any impact on the existing codebase.

And you can hide your complexities too ..

Aspects provide a great vehicle for encapsulating complexity away from your development team. While going through Brian Goetz's Java Concurrency In Practice, I found a snippet that can be used to test your code for concurrency. My development team has just moved up to Java 5, and not all of them are familiar with the nuances of java.util.concurrent. The best way I could expose the services of this new utility was through an aspect.

The following snippet is a class TestHarness, replicated shamelessly from JCIP ..


package org.dg.domain.concurrent;

import java.util.concurrent.CountDownLatch;

public class TestHarness {
  public long timeTasks(int nThreads, final Runnable task)
    throws InterruptedException {
    final CountDownLatch startGate = new CountDownLatch(1);
    final CountDownLatch endGate = new CountDownLatch(nThreads);

    for(int i = 0; i < nThreads; ++i) {
      Thread t = new Thread() {
        public void run() {
          try {
            startGate.await();
            try {
              task.run();
            } finally {
              endGate.countDown();
            }
          } catch (InterruptedException ignored) {}
        }
      };
      t.start();
    }

    long start = System.nanoTime();
    startGate.countDown();
    endGate.await();
    long end = System.nanoTime();
    return end - start;
  }
}


My target was to allow my developers to write concurrency test code as follows ..


// assuming the Closure interface from Jakarta Commons Collections
import org.apache.commons.collections.Closure;

public class Task implements Closure {

  @Parallel(5) public void execute(Object arg0) {
    // logic
    // ..
  }
}



The annotation @Parallel(5) indicates that this method needs to be run concurrently in 5 threads. The implementation of the annotation is trivial ..


import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
public @interface Parallel {
  int value();
}


The interesting part is the main aspect, which implements the processing of the annotation in AspectJ 5. Note the join point matching based on annotations and the context exposure to get the number of threads to use for processing.


public aspect Concurrency {
  pointcut parallelExecutionJoinPoint(final Parallel par) :
    execution(@Parallel public void *.execute(..)) && @annotation(par);

  void around(final Parallel par) : parallelExecutionJoinPoint(par) {
    try {
      long elapsed =
        new TestHarness().timeTasks(par.value(),
          new Runnable() {
            public void run() {
              proceed(par);
            }
          });
      System.out.println("elapsed time = " + elapsed);
    } catch (InterruptedException ex) {
      // ...
    }
  }
}


The above example shows how you can build nifty tools which your developers will love to use. You can shield them from the complexities of the implementation and let them drive everything through annotations in their client code. Under the hood, of course, it is AspectJ doing all the heavy lifting.

In some future postings, I will bring out more of my encounters with aspects. I think we are in for an aspect awakening, and the fact that it is backed by Spring makes the case doubly strong!

Monday, September 25, 2006

Domain Driven Design : The Leaky Model AntiPattern

Lots of people have blogged about the importance of maintaining the sanctity of the domain model in a domain driven architecture. Everyone agrees in principle that the domain model should encapsulate the domain logic to the fullest, that the service layer should be lean, and that the Web layer should not intrude into the implementation of the domain model.

Well, that's great for a starter! Approaching the entree, we find things moving south when various forums in TSS and Spring, Yahoo groups and individual blogs start discussing domain models going anemic through rich service and presentation layers. Many people have suggested using DTOs to force a separation between the domain layer and the presentation layer and to get rid of the dreaded LazyInitializationException from a Hibernate-backed persistence layer. DTOs create an extra level of indirection to ensure the separation and form a valid architectural paradigm for use cases involving distribution and serialization of deeply nested object graphs. For others, where all layers are colocated and possibly replicated using clustering, we should always try to maximize the reachability of the domain model.

Domain Model is Crosscutting

The domain model is built upon the layers below it and serves all the layers above it. The domain layer itself encapsulates the business logic and provides fine grained behavioral POJOs to the service layer. The service layer builds coarse-grained services out of the POJOs exposed by the domain layer and serves the presentation layer above. Hence the POJOs of the domain layer crosscut all layers above it, although exposing different views to each of them. For a particular domain entity, the interface that it publishes within the domain layer may be different from the interface that it exposes to the service or web layer. Hence we need to ensure that the model enforces this multiplicity of interfaces and prevents any implementation specific artifact from leaking out of the domain layer.

Let us have a look at a real life example ..

A Domain POJO Exposing Implementation

A domain level POJO named Order, which contains a List of Items ..


package org.dg.domain;

import java.util.List;

public class Order {
  private int orderNo;
  private String description;
  private List<Item> lineItems;

  public Order(int orderNo, String description) {
    this.orderNo = orderNo;
    this.description = description;
  }

  /**
   * @return Returns the description.
   */
  public String getDescription() {
  return description;
  }

  /**
   * @param description The description to set.
   */
  public void setDescription(String description) {
    this.description = description;
  }

  /**
   * @return Returns the orderNo.
   */
  public int getOrderNo() {
    return orderNo;
  }

  /**
   * @param orderNo The orderNo to set.
   */
  public void setOrderNo(int orderNo) {
    this.orderNo = orderNo;
  }

  /**
   * @return Returns the lineItems.
   */
  public List<Item> getLineItems() {
    return lineItems;
  }

  /**
   * @param lineItems The lineItems to set.
   */
  public void setLineItems(List<Item> lineItems) {
    this.lineItems = lineItems;
  }

  public float evaluate( .. ) {
  }
}



This is a typical implementation of a domain POJO, containing all the getters and setters and business logic methods like evaluate(..). As it should be, it resides in the domain package; however, it happily exposes all its implementation through public methods.

Leaky Model Symptom #1 : Direct Instantiation

The first sign of a leaky domain model is unrestricted access to the class constructor, thereby encouraging uncontrolled instantiation all over. Plug the hole: make the constructor protected and have a factory class take care of all instantiation.


public class Order {
  // same as above

  protected Order(int orderNo, String description) {
    ..
  }
  ..
}

package org.dg.domain;

public class OrderFactory {
  public Order createOrder(int orderNo, String description) {
    return new Order(orderNo, description);
  }
}



If you feel the necessity, you can also sneak in a development aspect preventing direct instantiation even within the same package.


package org.dg.domain;

public aspect FlagNonFactoryCreation {
  declare error
    : call(Order+.new(..))
      && !within(OrderFactory+)
    : "Only OrderFactory can create Order";
}



Leaky Model Symptom #2 : Exposed Implementation

With the above implementation, we have the complete Order object exposed to all layers, even though we could control its instantiation through a factory. The service layer and the presentation layer can manipulate the domain model through the exposed setters - a definite smell in the design, and one of the major forces that has driven architects and designers towards the DTO paradigm.

Ideally we would like to have a restricted view of the Order abstraction within the web layer, where users can access only the limited set of contracts required to build coarse-grained service objects and presentation models. All implementation methods and anemic setters that directly manipulate the domain model without going through fluent interfaces need to be protected from exposure. This plugs the leak and keeps the implementation artifacts locked within the domain model only. Here is a technique that achieves this by defining the restricted interface and weaving it dynamically using the inter-type declarations of AOP ..

// Restricted interface for the web layer


package org.dg.domain;

import java.util.List;

public interface IOrder {
  int getOrderNo();
  String getDescription();
  List<Item> getLineItems();
  float evaluate( .. );
  void addLineItem(Item item); // new method
}



// aspect preventing leak of the domain within the web layer


package org.dg.domain;

public aspect FlagOrderInWebLayer {
  pointcut accessOrder(): call(public * Order.* (..))
          || call(public Order.new(..));

  pointcut withinWeb( ) : within(org.dg.web.*);

  declare error
    : withinWeb() && accessOrder()
    : "cannot access Order within web layer";
}



Note that all setters have been hidden in the IOrder interface, and an extra contract has been exposed to add an Item to an existing Order (addLineItem(..)). This makes the interface fluent and closer to the domain user, since he will typically be adding one item at a time to an Order.

Finally, weave in the IOrder interface dynamically through the inter-type declarations of AspectJ when exposing the Order abstraction to the layers above. In fact, every layer to which the full abstraction of Order has been restricted can make use of this new interface.


package org.dg.domain;

public aspect RestrictOrder {
  declare parents: Order implements IOrder;

  public void Order.addLineItem(Item item) {
    this.getLineItems().add(item);
  }
}



Hence, the web layer does not get access to the full implementation of Order and has to work with the restricted contract of IOrder, thereby preventing the leaky domain model antipattern.

Tailpiece

Now that we have agreed to plug in the holes of the leaky domain model, how should we decide upon the composition of the published restricted interface ?

I find some dichotomy here. The published interface should, on one hand, be restrictive, in the sense that it should hide the implementation contracts from the client (the web layer, in the above case). On the other hand, the published interface has to be humane and should follow the principles of Intention-Revealing Interfaces. The interface should add convenience methods targeted at the client, which will help him use the object more effectively and in conformance with the Ubiquitous Language.

Monday, September 18, 2006

Domain Driven Design : Managing Variability

The Spring guys have started talking about Domain Driven Design. And, logically enough, they are talking sense. At SpringOne 2006, there were three sessions on domain driven design. Ramnivas Laddad, known as the AOP guy (he recently joined Interface 21), talked about how DI and AOP help bring out the best in DDD - a minimalistic service layer, a rich domain layer and fluent interfaces. Steven Devijver, of Interface 21 and Grails fame, discussed rich domain modeling through the usage of design patterns for managing complexity in the service layer. He emphasized important concerns like separation of concerns through layering, AOP, IoC etc. Immutability is important to maintain the sanity of your model - make classes immutable as much as you can and focus on controlled exposure. Do not expose the internals of your implementation - public setters are evil. In a nutshell, the domain model should expose only the domain language, which the domain experts speak. Make your interfaces and public contracts speak the UBIQUITOUS LANGUAGE. In another session on The Art of Domain Modeling, Keith Donald of Interface 21 talked about techniques for distilling the domain and abstracting the acquired knowledge into the domain model. All in all, it looks like the Spring guys are up for a DDD funfeast !

Variability! Variability!

Making the domain model flexible and generative is all about managing the variabilities within the model. The better you manage the variable components of the model, the more configurable, customizable and flexible it becomes. In the last post I talked about the configuration knowledge, which acts as the generator and does the plumbing for you. It is this variability that makes the most of your configuration knowledge. Specific implementations, lifecycles, scoped instances, algorithms, strategies etc. are the aspects of your design that you would like NOT to be hardwired within your Java / C++ codebase.

Strategically Speaking ..

Over the years, I have found many variants of the Strategy pattern implementation being used to manage variabilities within the model. GOF discusses two variations in C++ - one using runtime polymorphism and virtual functions, while the other using templates and compile time polymorphism. Each has its own merits and can be applied to solve specific problems in a specific context.

The following snippet shows how the Strategy Design Pattern can be applied through runtime polymorphism to externalize the domain process of accrual calculation ..

class AccrualCalculation {

  private CalculationStrategy strategy;

  // setter injection
  public void setStrategy(CalculationStrategy strategy) {
    this.strategy = strategy;
  }

  // delegate to strategy
  public BigDecimal calculate(...) {
    return strategy.calculate(...);
  }
}

interface CalculationStrategy {
  BigDecimal calculate(...);
}

class DefaultCalculationStrategy implements CalculationStrategy {
  public BigDecimal calculate(...) {
    // implementation
  }
}

// externalize the variability through the configuration knowledge
<bean id="calcStrategy" class="com.x.y.DefaultCalculationStrategy"/>

<bean id="accrual" class="com.x.y.AccrualCalculation">
  <property name="strategy">
    <ref bean="calcStrategy"/>
  </property>
</bean>

Compare this with the same pattern implemented using generics, where the class AccrualCalculation is parameterized on the strategy and the concrete strategy is supplied at construction time.

class AccrualCalculation<S extends CalculationStrategy> {
  private S strategy;

  public AccrualCalculation(S strategy) {
    this.strategy = strategy;
  }

  // delegate to strategy
  public BigDecimal calculate(...) {
    return strategy.calculate(...);
  }
}

// usage
interest =
  new AccrualCalculation<DefaultCalculationStrategy>(
      new DefaultCalculationStrategy()).calculate(...);

Traits techniques, the else-if-then of types (ah! I love the name), have also been used extensively by the C++ community to manage variations at the type level. Traits classes and traits templates, along with killer metaprogramming capabilities, have been used to develop blazing applications in C++. Another variant of compile time strategy is the Policy Based Design of Alexandrescu. Policy classes, with their capability of having an unbounded number of implementations, provide an ideal vehicle for plugging compile time strategies into the domain model.

The Strategy Design Pattern provides an ideal mechanism to encapsulate the coarse-grained variabilities of your domain model. Spruce up the pluggability of your model by externalizing the strategy implementation into an IoC container or any other generator (as the configuration knowledge).

Fine Grained Variability with Template Method

The Template Method Design Pattern is meant for plugging fine-grained variabilities of your algorithm / strategy into the commonality framework. Use this pattern in your model if the macro level flow of your domain process is invariant, while there are customizable bits and pieces that need to be hooked in and externalized. I have used this pattern with great success in modeling big, rule based financial applications. The following is an example of implementing fine grained variability using the Template Method Design Pattern. Here, the overall accrual calculation algorithm is final, while there are some hooks that need to be plugged in by the derived classes. These hooks are kept as abstract methods in the base class.

abstract class AccrualCalculation {
  public final BigDecimal calculate(...) {
    // hook
    int days = calculateAccruedDays(...);
    if (days != 0) {
      // invariant logic
    } else {
      // hook
      boolean stat = isDebitAccrualAllowed(...);
      if (stat) {
        // invariant logic
      }
      // invariant logic
    }
  }

  protected abstract int calculateAccruedDays(...);
  protected abstract boolean isDebitAccrualAllowed(...);
}
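
The derived classes plug in only the hooks, keeping the calculation skeleton in the base class untouched. Here is an illustrative variant - the parameter lists (elided above) and the sample rules are assumptions, not part of the original model :

import java.util.Date;

class BondAccrualCalculation extends AccrualCalculation {

  // hook #1 : the day-count convention for this product
  protected int calculateAccruedDays(Date start, Date end) {
    // a simple actual-days count; real products would plug in
    // conventions like 30/360 here
    long millisPerDay = 24L * 60 * 60 * 1000;
    return (int) ((end.getTime() - start.getTime()) / millisPerDay);
  }

  // hook #2 : the product specific business rule
  protected boolean isDebitAccrualAllowed(String accountType) {
    return !"RESTRICTED".equals(accountType);
  }
}
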


Negative Variability

Jim Coplien first introduced this term in the design and analysis space with his work on Multiparadigm Design in C++. When some of the members of an otherwise common hierarchy violate a few of the base class assumptions, he calls this negative variability. Cope goes on to say
Negative variabilities violate rules of variation by attacking the underlying commonality - they are the exceptions to the rules.

There are quite a few techniques for implementing negative variability, viz. template specialization in C++, conditional compilation etc. The Bridge Design Pattern is also known to address this problem very well. Coplien's Multiparadigm Design book details a lot of them - have a look at the deep insights of this guru. In real life, a domain model often displays this behavior, and you need to take proper care through the usage of appropriate patterns and idioms.
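
In Java terms, here is a minimal sketch of what such an exception to commonality looks like (the Account hierarchy is hypothetical, just to illustrate the smell) :

abstract class Account {
  protected double balance;

  public void deposit(double amount) {
    balance += amount;
  }

  public void withdraw(double amount) {
    balance -= amount;
  }
}

class FixedDepositAccount extends Account {
  // negative variability : this member of the hierarchy attacks the
  // base class assumption that any account can be withdrawn from
  public void withdraw(double amount) {
    throw new UnsupportedOperationException("cannot withdraw before maturity");
  }
}
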

Wednesday, September 13, 2006

Is Your Domain Model Generative ?

Martin Fowler defines a domain model as an object model of the domain that incorporates both behavior and data. If the domain is complex, then the domain model has to be complex, and consequently the object model will also have to be rich and complex. Unfortunately some of the modern technologies (aka Entity Beans of J2EE) have encouraged a programming model where the domain model becomes anemic and the complexity of the domain sneaks into a procedural Service Layer. Martin Fowler's P of EAA and Eric Evans' DDD books talk enough about these anti-patterns and recommend all the virtues of having a behaviorally rich domain model in a layered architectural stack.

Generative Domain Models

Czarnecki and Eisenecker mention another important attribute regarding the richness of domain models - Generativity. A generative domain model is one that offers users the ability to order specific products or artifacts by specifying combinations of concrete implementations as configuration knowledge. More specifically
The key to automating the manufacture of systems is a generative domain model, which consists of a problem space, a solution space, and the configuration knowledge mapping between them.

What exactly do we mean by the model being generative ?

Apart from achieving high intentionality in modeling domain concepts and promoting reusability and adaptability, a generative domain model simplifies the process of manufacturing variants of a component / module through changes in externalized configuration knowledge. The model needs to have a unified architecture based on the commonalities of the domain, with appropriate declarative hooks for plugging in variabilities.

Czarnecki and Eisenecker specify the following implementation techniques for generative models :

  • Generic Programming

  • Metaprogramming

  • C++ Template based programming

  • Aspect Oriented Programming


Of course, this is from a 1999 setting and we have many more popular programming paradigms practiced today which can promote generative domain model designs. In this post I will try to ruminate on some of these paradigms and practices and emphasize the importance of making models generative.

IoC using Spring

Spring, as an IoC container, offers strong bean management, with control over object lifecycle and dependency resolution. As configuration knowledge, it offers a DSL in the form of XML, which the user can manipulate to wire specific implementations, inject new dependencies and weave in new aspects. The user can program to interfaces and can *order* specific implementations through the DSL.

<bean id="myInventoryManager" class="org.dg.wire.InventoryManagerDefaultImpl"/>

<bean id="myProductManager" class="org.dg.wire.ProductManagerDefaultImpl">
  <property name="inventoryManager">
    <ref bean="myInventoryManager"/>
  </property>
  <property name="retrieveCurrentStock">
    <value>true</value>
  </property>
</bean>


In the above fragment of the configuration knowledge (the configuration XML of Spring), the user can declaratively supply implementations for myInventoryManager and myProductManager and control instantiations and lifecycle for all the Spring managed beans. I think this is generative domain modeling, where I can create concrete instances of my managed beans by providing implementation specific information to the configuration knowledge.

The above example is a simplistic one; IoC containers generally provide many sophisticated features for complex bean management, along with lifecycle semantics, scoped instantiations, constructor and setter based injections and dependency management. Add to that the complete declarative semantics through some form of DSL (mostly XML based) - and you have generativity non-intrusively weaved into your domain model.
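
For completeness, here is a sketch of the bean classes that the above XML assumes. Only the class names and property names come from the XML; the contract on InventoryManager is illustrative.

package org.dg.wire;

public interface InventoryManager {
  // illustrative method - not part of the original example
  int getCurrentStock(String productId);
}

public class InventoryManagerDefaultImpl implements InventoryManager {
  public int getCurrentStock(String productId) {
    // talk to the backing store here
    return 0;
  }
}

public class ProductManagerDefaultImpl {
  private InventoryManager inventoryManager;
  private boolean retrieveCurrentStock;

  // setter injection targets for the two <property> elements above
  public void setInventoryManager(InventoryManager inventoryManager) {
    this.inventoryManager = inventoryManager;
  }

  public void setRetrieveCurrentStock(boolean retrieveCurrentStock) {
    this.retrieveCurrentStock = retrieveCurrentStock;
  }
}
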

Metaprogramming

One of the classic examples of a metaprogramming platform is Ruby on Rails, which offers amazing capabilities for generative modeling and programming. The ActiveRecord abstraction in RoR provides a full-blown (well, almost) ORM framework for wrapping your database tables and providing access to all data and metadata. You just create a subclass of ActiveRecord::Base and you get an entire machinery to access all details of your database table.

class Order < ActiveRecord::Base
end


With just these 2 lines of code, RoR creates for you the entire model for Order, which maps to your database table Orders. RoR values convention-over-configuration - all conventions are stored in the configuration knowledge. You want to change 'em .. you are welcome .. just set ActiveRecord::Base.pluralize_table_names to false in environment.rb of the config directory and you disable the pluralization of table names.

Traditional ORM frameworks take the mapping path, offering an object-oriented view of the world, while RoR employs a wrapping framework with a database-centric view of the world. Whatever be it, RoR offers strong code generation capabilities using its metaprogramming engine. And at the heart of it is, of course, the nice little DSL that it exposes to the world, allowing users a rich set of configurability. It's generativity at its best!

Aspect Oriented Programming

It is a well-established fact that AOP has been used extensively to address crosscutting concerns of domain models. Transactions, security, logging, tracing, auditing etc. are best implemented in the domain model as aspects. Aspects are maintained separately from the codebase and are weaved dynamically (load time / runtime / compile time) into the codebase. And because of this externalization, implementations of strategies can be changed declaratively within the domain model.

As an example, let us consider a model for a PaymentService, where the service needs to be backed by a failover strategy. The actual implementation of the strategy can be externalized in the form of an aspect and weaved lazily into the model implementation. The following implementation has been adapted from Ramnivas Laddad's presentation at SpringOne.

public aspect PaymentServiceFailover {
  pointcut process(PaymentProcessor processor)
    : execution(* PaymentProcessor.*(..)) && this(processor);

  Object around(PaymentProcessor processor)
    : process(processor) {
    int retry = 0;
    while (true) {
      try {
        return proceed(processor);
      } catch (RemoteException ex) {
        processor = getAlternativeProcessor(processor);
        ... give up logic ...
      }
    }
  }
  ...
}


Using Spring, we can inject the list of available processors declaratively and set up the complete configuration knowledge to use this specific failover strategy implementation.

public aspect PaymentServiceFailover {
  ...
  private List<PaymentProcessor> processors;

  public void setProcessors(List<PaymentProcessor> processors) {
    this.processors = processors;
  }
  ...
}

<bean id="paymentServiceFailover"
  class="PaymentServiceFailover" factory-method="aspectOf">
    <property name="processors">
    <list>
      <ref bean="aProcessor"/>
      <ref bean="anotherProcessor"/>
    </list>
  </property>
</bean>


So, in the above example, we have successfully kept the implementation specific mappings externalized from the domain model. This results in the model being able to generate specific instances of PaymentService components with different implementations of failover strategies. Generativity !!

Conclusion

In summary, generative domain models offer more flexibility in practice. The invariant part of the domain resides within the model proper, while the variations are plugged in using DSLs as configuration knowledge. In practice there are many patterns to manage the variabilities within the scheme of the common architecture. But that's for another day. In the next post on domain models, I will talk about the benefits of a generative domain model in a layered architecture stack and how the service layer continues to shrink with a powerful domain model in place.

Tuesday, September 12, 2006

Loads of Fun with StrongTalk

Gilad Bracha has announced the open sourcing of StrongTalk by Sun, including the virtual machine of the fastest Smalltalk implementation .. to all language fanatics, here is the chance to have a look at some of the best innovations in a dynamic language with optional typing. As Gilad says
Apart from speed, it had support for mixins inside the VM, flyweight glyph based UI, optional type system and mirror based reflection, to name a few of the major innovations.


Enjoy !!


Thursday, September 07, 2006

Which Web Framework Do You Want to Use Today ?

Yes, it has literally come to this point, where you need to rummage through hundreds of pages to decide upon the technology / framework to use in the Web tier. In designing an enterprise application, we had converged upon Struts way back in 2001, since the only available alternative was to write one ourselves. Things were so simple those days!

Struts - Reunited!

Struts is now like a broken family with a number of subprojects, each trying to carve its way to the mainstream - thanks to some decent heads, we may see the light of the Struts 2 (aka SAF 2) release very soon. Struts is one of the members in the milieu of action-oriented Web frameworks that still give HTTP the respect of being a transaction oriented protocol by offering *Action* as the principal abstraction. But from whatever I have seen of Struts 2 (and its adoption of the WebWork codebase), I think it is on its way up to where it belonged a couple of years back. And it also offers integration with all sorts of other frameworks in the world -

  • it uses Spring as the IoC container and has a proposed integration with Spring Web Flow

  • Don Brown has recently announced a JSF integration (the holy grail of unification of action-oriented and component-oriented paradigms) by a clever usage of the Interceptor stack



Spring MVC - Struts 1.x++

Among the other dominant players in the action-oriented world, I like Spring MVC a lot and have really found nothing to complain about. It offers definite improvements over Struts 1.x with respect to the programming model, in that it has -

  • a strong data binding and validation mechanism that uses Spring's data binder and validator

  • IoC for easy testing (that's bread and butter for any Spring offering)

  • seamless integration with multiple view technologies

  • the best integration with Spring on earth


I guess the main factor in choosing between the above 2 will be the developers' expertise you have on your team. But Struts has a stronger community and a much larger user base than Spring MVC.

Components on the Web - The Java Server Faces

This is the latest standard from Sun and offers the component model for Web frameworks. While the action-oriented frameworks use *actions* to push the request into the model and the view, JSF uses components throughout. The entire view is modeled as a tree of components, and it is the responsibility of the "ViewHandler" implementation of JSF to create and render the component tree using the underlying view technology (the default is JSP).

JSF is an attempt to align the Web UI tier with the rich client metaphor of desktop applications - I am not very sure this will scale. HTTP is stateless, and having a heavyweight stateful component model plugged in there implies an equally heavy-duty maintenance cycle (state saving and rendering). That's precisely the complaint we hear about JSF (see here, here and here), and no wonder we find people like Jacob Hookom and Ed Burns desperately trying to find ways out of this through partial statelessness of the component model, the JSF Avatar Proposal and the other extensions in jsf-extensions.

Till today, JSF has scalability problems for large enterprise applications - the specification is a neat component model with typed event handling, encapsulated markup, separate validators and converters and a well-documented lifecycle. The problem is the implementation, which is a serious roadblock to scalability of performance. In spite of all this, JSF may be the platform of the future because of tool friendliness, enhanced support for developer productivity and the backing of the big daddy.

I would not dare to push a full blown JSF based Web tier into my large scale, performance sensitive, clustered enterprise application.

Another close relative in this world of component oriented Web frameworks is Tapestry, hailed by many as technically much superior to JSF. But Tapestry has a long learning curve, is basically a single man show and has long release cycles with little concern for backwards compatibility.

My Stack

It is quite useful to use a mix-n-match solution stack, since most of the frameworks offer nice integration (a symbiosis of sorts).

  • For stateless pages that need raw speed, I would go for action-oriented frameworks. Whether it is Spring MVC or Struts 2 will depend upon the expertise my team has. Spring MVC offers more powerful Spring integration (obviously), but Struts has a stronger community presence - so it is really a matter of judgement on the spot.

  • For rich component oriented pages, I would go for JSF and Facelets, with backing beans having Spring services injected within, but with the action controller from Struts 2 or Spring MVC. So JSF usage is limited to only those specific cases which will really benefit from the rich component model.

  • Lastly, for dialogs and other flow intensive pages, I would pick Spring Web Flow, since it is the best in stateful flow management with rich support for continuations (BTW, what's the deal with Struts Flow - the latest download is a 2004 snapshot!). I know SWF integrates with JSF, but I am not very sure about the current state of integration between SWF and Struts 2.


The above stack is pretty heavy and uses multiple frameworks. I have not tried this out (maybe I should) - I hope the integration points work out as promised by each party. It will be great to hear from all of you about your recommendations for the Web-tier stack.

Thursday, August 31, 2006

Closures in Java and The Other Side of Backwards Compatibility

It's the power of community - the awesome force that makes Java evolve has once again started roaring at the news of a possible inclusion of closures in Dolphin. Not that all of the ruminations are favorable; in fact the functional gurus at LtU have once again started chanting about how Java contains a number of fundamental (technical) design flaws and how Sun should immediately start looking for an alternative to Java as the next programming language.

My current post has been triggered by the excellent article that Bill Venners has written in Artima, where he laments how the extreme effort to maintain backwards compatibility in Java is hindering the elegance of the language. Bruce Eckel had also raised this point many times in the past, when Sun pushed its broken generics implementation into Java 5 for the sake of maintaining backwards compatibility with the millions of lines of existing codebase. Bill has hit the nail right on the head -
There's a natural law in programming language and API design: as backwards compatibility increases, elegance decreases. Backwards compatibility is very important. There's a cost to breaking code, but there's also a cost to not breaking it—complexity in the developer's face.

In trying to compromise with some of the inadequacies of the language, Java is turning into a feature bloat. Large enterprise applications have started to accumulate blobs of codebase built upon contradictory features of the language, just because Java did not clean 'em up in subsequent releases and still continues to support the legacy, maybe with a couple of hundred deprecation warnings. I think this is a far worse situation than breaking backwards compatibility.

Microsoft has displayed much more sanity in this regard and has made a conscious effort to clean things up in the course of the evolution of C#. I know the C# codebase in the industry is in no way comparable in size to that of Java - but still I cannot support the path that Java has adopted, which, in a way, has encouraged the piling up of inadequate code bloat. Java has released version 5, Mustang is on its way later this year, we are looking and planning for Dolphin - yet in the most popular object oriented language of the industry, primitives are not objects. We cannot have the elegance of writing

200.times { |i|
  # do something
}


Look at the tons of code in any legacy application today and you will be stunned by the amount of effort people have put into special processing of primitives.

Closures with Backwards Compatibility ?

Closures are typically a functional programming artifact, though all modern scripting languages support them. C# has rolled out its implementation of closures through delegates and is bringing lambdas in 3.0. I suspect these have been the major triggers behind the sudden clairvoyance of Gilad Bracha and his team in announcing support for closures in Dolphin.

Closure Implementation

Sun has been thoroughly conservative about any change in the JVM; Gilad has talked about his years-long struggle to get "invokedynamic" in as the only change to the JVM. Closures, as this post suggests, can be implemented as syntactic sugar at the javac level, by creating closure objects on the heap and boxing mutated local variables. I have strong doubts whether Sun will go to the extent of changing the JVM to implement efficient, cheap-to-use closures. Reason - Backwards Compatibility !
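
To see what such desugaring might look like, here is a sketch in today's Java of how javac could translate a closure that mutates a local variable. This is purely illustrative - the draft proposal does not commit to any particular translation scheme.

public class DesugaredClosure {
  public static void main(String[] args) {
    // the captured local is boxed into a one-element array allocated
    // on the heap, so the closure object can read and write it
    final int[] count = { 0 };

    Runnable closure = new Runnable() {
      public void run() {
        count[0]++;  // mutation goes through the heap cell
      }
    };

    closure.run();
    System.out.println(count[0]);  // prints 1
  }
}
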

Closure Usage

In my earlier post on this subject, I had mentioned internal iterators, which I would like to see as part of the closures package. As Joe Walker has mentioned in his blog, and Bill has discussed based on his suggestion, we would like to see a .each() method in the Collection interface. Again, this cannot be done without breaking the existing codebase, since it adds to the Collection interface. The question is "Will Sun go for this ?", or will they make us eat the humble pie by offering the much less elegant workaround of statics in Collections.each(Collection, ..). Once again Backwards Compatibility hinders the added elegance !
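
Just to make the alternatives concrete, here is what the statics workaround looks like in today's Java - the Block interface and the Collections2 holder are hypothetical names, and the anonymous class stands in for the closure we wish we had.

interface Block<T> {
  void apply(T item);
}

class Collections2 {
  // each() has to live in a utility class, since adding it to
  // java.util.Collection would break every existing implementation
  static <T> void each(java.util.Collection<T> c, Block<T> block) {
    for (T item : c) {
      block.apply(item);
    }
  }
}

// usage, with the anonymous class idiom standing in for a closure :
// Collections2.each(names, new Block<String>() {
//   public void apply(String name) { System.out.println(name); }
// });
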

As a workaround to the above problem of maintaining backwards compatibility by adding more methods to existing interfaces, C# has come up with "extension methods", while Scala has introduced "implicits" and "views". Martin Odersky has had a very good discussion of these capabilities in the Weblog forum of Bill's article.

We need to wait till Java comes up with a strategy to address these issues.

Closures - Will it make Java a Functional Programming Language ?

Definitely not! I think adding closures will just be an attempt to reduce the awkwardness of the interface-anonymous class idiom now used to abstract an algorithm over a piece of code. Just by adding closures to Java, developers will never start thinking in terms of monads and combinators while composing their programs. But given an efficient implementation and proper library support, closures will help add elegance to programming in Java.

Here's the tailpiece from Karsten Wagner in a typical Java bashing in LtU ..
To get all those nice stuff you want Java needs much more than just a simply syntax for closures. So I think it's better to let Java stay the way it is and phase it out in the long term and use a real better language instead, because Java simply is beyond repair in too many points.

I am not that pessimistic; I still make my living on Java (though I seek the joy of programming in Scala) and hope that Gilad and his team bring out a killer offering with Closures.

PostScript

Just when I was rumbling through the formatting of this entry, I noticed this in InfoQ, where Sun has created JSR 270 to remove features from the Java SE platform. This is definitely a welcome step .. more details can be found in Mark Reinhold's blog.

Monday, August 28, 2006

Validation Logic : Wire up Using Spring Validators

Colin Yates has blogged about Validation Logic in his first post on the Interface 21 blog. He talks about the importance of a Validator abstraction, where you apply your business specific validation rules to your populated domain object. This is real pragmatic advice when you are designing a large enterprise application. But when you are talking about an abstraction, you need to discuss how it will collaborate with the other participants of the mix. Take the example of a Web application with multiple tiers - you need to be very clear about the role your validator is going to play in each of them. As a reader of the i21 blog, I started thinking of many use cases, scenarios and possible collaborations that our dear friend the validator can play when bouncing across the layers of the application. I thought of posting a reply to clarify some of my understanding, but ultimately ended up writing this blog entry, hoping that it might trigger some discussion to clarify the reigning confusion. In what follows, I will try to think aloud about my understanding and hope that the community will enrich this discussion with their collective thoughts.

Wiring the Validation Logic

Having the validation logic separated out in a validator abstraction is important, but I think the more important - and the most confusing - part of implementing validation is the wiring of the validation logic with the rest of the application. Follow the comments section of Colin's posting and my last statement will justify itself. Various readers have posted their observations on how to wire the logic within the scope of the various tiers of a typical Web based application.

One of the pioneers in this wiring has been Rails' in-model validation, which Cedric sees as an improvement over validation tied to the presentation framework. Rails offers validation logic as part of the model, which wires itself nicely within the framework and gets invoked automagically before persistence of the model object. Rails' validation engine offers some amount of context sensitivity as well, through protected hooks like validate, validate_on_create and validate_on_update, which application developers can override to plug in custom rules. There can be complicated use cases where this model is not a 100% fit, but as DHH has mentioned - this is one of those "most people, most of the time" things. We're not after the "all people, all of the time" kind of framework coverage. Quite in alignment with the Ruby philosophy :-).

In the Java community, we do not have the clairvoyance of the Ruby world; we do not have a DHH to give us the dictum. The result is inevitable: we have Commons Validator, Struts, WebWork (powered by the XWork validator), RIFE, Spring, Tapestry, Stripes, Wicket and many other variants implementing the same practice in their own different ways. It's time for a JSR to bring harmony to the chaos of implementing validation logic across all tiers of the application - enter JSR 303: Bean Validation!

Validators - Separate Abstraction ?

The main point in Colin's blog was to advise people to identify *all* the rules that define *valid* data, and the uniqueness of that data is just as much a validation rule as saying the username must not be null. He speaks about the necessity of a validator abstraction, which is provided so well by Spring. In one of his responses, he mentions
It simply boils down to this; identify *all* the validation logic and then apply it in a single coherent, atomic, explicit operation. Validators are powerful things, they can do more than just the *syntax* of the data, they can, and should also check the *semantics* of the data.

But, should we have the Validator as an abstraction separate from the domain object or follow the Rails philosophy and club 'em together with the domain object ? Working with Spring, I would like to have validators as a separate abstraction for the following reasons :

  • Validators are tier agnostic and can be used to collaborate with multiple layers -


    • Business Layer for enforcing business logic on "semantically valid" objects

    • Persistence layer to ensure valid objects get persisted

    • MVC layer to enforce data binding from request parameters


  • Validators in Spring have their lifecycles controlled through the IoC - hence these singletons can be nicely wired together through DI

  • Spring MVC requires validators as separate objects


Validators - Collaborating across Layers

In his blog, Colin goes on to say that he does not want to get into whether this validation should be applied at the web layer, or the middle layer, or both. But, as I mentioned above, the collaboration of the validators with the participants of the various tiers of the application is the area where people get confused the most.

Let me try to summarize my understanding of how to engineer the validator abstraction across the application layers, so that we can reuse the abstraction, avoid code duplication and have the validation logic tied to the domain model (since it is the domain objects that we are trying to validate). My understanding will be based on the implementation of Spring MVC, which does a nice job of engineering this glue.

Step 1: Form a Validator Abstraction

The domain class has an associated validator class. This is a deviation from the Rails implementation, but it allows a nice refactoring of the validation logic into a separate abstraction from the domain class - in fact the wiring of the validator with the domain class can be done through mixins (or, as we say in the Java world, interceptors). I think XWork engineers its validators based on this interceptor technology.
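
A minimal sketch of such a validator abstraction using Spring's own Validator contract - the Order class and the rules shown here are illustrative :

import org.springframework.validation.Errors;
import org.springframework.validation.ValidationUtils;
import org.springframework.validation.Validator;

public class OrderValidator implements Validator {

  public boolean supports(Class clazz) {
    return Order.class.isAssignableFrom(clazz);
  }

  public void validate(Object target, Errors errors) {
    // the *syntax* of the data
    ValidationUtils.rejectIfEmpty(errors, "description", "description.empty");

    // the *semantics* of the data - business specific rules go here
    Order order = (Order) target;
    if (order.getOrderNo() <= 0) {
      errors.rejectValue("orderNo", "orderNo.invalid");
    }
  }
}
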

Step 2: Wire Controllers to Populate Domain Object

In a typical Web application, we have Controllers - components which receive HttpServletRequest and HttpServletResponse instances just like an HttpServlet, and which are also able to participate in an MVC workflow. Spring offers a base interface Controller, while Struts has the notion of an Action. The controller intercepts the request and creates a Command object out of the request parameters. The framework takes care of this object creation and population through the usual JavaBeans engine of property setters and getters and additional property editors (as in Spring). This Command object can be the domain object itself, or it may have the domain object wrapped inside it - the application developer knows how to get the domain object out of the command object.

Step 3: Validate the Domain Object Before Submission

Once the controller has successfully populated the command object, it executes all the validators registered to validate the object. Instead of automatically triggering the validation as part of data binding, Spring also offers callbacks to do the same as post-processing of the binding phase. The following snippet is from the Spring samples (jpetstore):

import org.springframework.samples.jpetstore.domain.Account;

protected void onBindAndValidate(HttpServletRequest request,
  Object command, BindException errors)
    throws Exception {

  // command object
  AccountForm accountForm = (AccountForm) command;
  // domain object (POJO)
  Account account = accountForm.getAccount();

  // validate the object
  getValidator().validate(account, errors);
  ...
}


The above strategy describes how the validation logic is bubbled up from the domain layer and used by the controller to send error messages to the user from the web tier. All the validation logic is centralized in the validator abstraction, executed through validator.validate(..), with the errors propagated through the BindException structure.

Conclusion

  • Validators are domain level objects.

  • Validators are separate abstractions from the domain classes.

  • Validators can be "mixed-in" through interceptors if required.

  • Validators encapsulate "syntax" as well as "semantics" of the domain object.

  • In a typical Web application, validators can be wired together with Controllers, DAOs etc. to provide services at the other tiers of the application. Yet they are not coupled with any abstraction of the other layers of the application.

Sunday, August 20, 2006

Closures At Last !

There have been some significant movements amongst the Java leaders to introduce this much awaited feature into the Java programming language. The big team of Gilad Bracha, Neal Gafter, James Gosling and Peter von der Ahé has released a draft proposal for adding closures to Dolphin (JDK 7). I know Gilad has been a big proponent of having closures in Java, and he has expressed his frustration in his blog at Java being so late an entry to close this out.

Brevity

Thoughts about introducing closures in Java have definitely been triggered by the excellent support for closures provided by C# and the host of scripting languages like Ruby, Groovy, Python and JavaScript. The syntax, as proposed in the draft, looks a bit cumbersome to me, particularly after getting used to the elegance of Groovy, Ruby and even C#. I know that Java, being a statically typed language, does not help in making the closure syntax as elegant as in dynamic languages.

My Wishlist of Associated Features

If closures see the light of day in Java, then I would like to have the following associated features, which will make the set more complete :


  • Type aliasing / typedefs : Without type aliasing it will be extremely cumbersome to write the entire type signature every time. I am sure this will also make programming with generics much easier. The keyword is *syntax-brevity*, and type aliasing is a great way to achieve it.

  • Currying : Higher order functions and closures are definitive steps towards implementing full currying features (see the sketch just after this list).

  • Generic Closures : It will be interesting to find out how closures will mix with generics.

  • Internal Iterators : I would love to write code like the following:


    int[] list = ...
    int[] evens = Arrays.findAll(list, (int n) { return n % 2 == 0; });
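
On the currying point above, here is a sketch of what manual currying looks like in today's Java, using the anonymous class idiom - the F1 / F2 function interfaces are hypothetical names, and a closure syntax would shrink all of this dramatically.

interface F1<A, R> {
  R apply(A a);
}

interface F2<A, B, R> {
  R apply(A a, B b);
}

class Curry {
  // turn a two-argument function into a chain of one-argument functions
  static <A, B, R> F1<A, F1<B, R>> curry(final F2<A, B, R> f) {
    return new F1<A, F1<B, R>>() {
      public F1<B, R> apply(final A a) {
        return new F1<B, R>() {
          public R apply(B b) {
            return f.apply(a, b);
          }
        };
      }
    };
  }
}

// e.g. curry(plus).apply(2).apply(3) evaluates to plus(2, 3)
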

Sunday, August 13, 2006

Extend Your Type Transparently using Spring Introductions

One of the main intents of the Bridge design pattern is to allow decoupled dual hierarchies of interfaces and implementations to grow independently, giving users the flexibility to compose. The binding constraint is that all implementations have to abide by the base contract dictated by the abstract interface.

Readers of my blog must have been bored by now with my regular chanting about the necessity of a generic data access layer in Java based applications. I have designed one which we have been using in some of our Java projects - I have blogged extensively about the design of such an artifact here, here and here. The DAO layer has been designed as a Bridge, with a dual hierarchy of interfaces acting as client contracts backed up by the implementation hierarchies. So long, the clients had been using the JDBC implementation and never complained about the contracts. Only recently I thought that I would have to sneak in a JPA implementation as well, since Spring has also started supporting JPA.

Things fell into place like a charm, till I hit upon a roadblock in the design. If you need to provide some contracts which make sense for one specific implementation (not all), then what do you do ? The basic premise of using Bridge is to have a single set of interfaces (contracts) which all implementations need to support. We have the following options :

  • Throw exceptions for unsupported implementations and hope the user does not use 'em. Document extensively, warning users not to venture into these territories. But if my client is like me and does not have the habit of reading documentation carefully before coding, then he may be in for some surprises.


  • Use the Extension Object Design Pattern, which allows you to extend an object's interface and lets clients choose and access the interfaces they need. Cool - this is what I need to extend the contract of my generic DAO ! But hold on !! Look at the very first line of the pattern's intent, as described by Erich Gamma .. "Anticipate that an object's interface needs to be extended in the future.". What this means is that you will have to design your abstraction anticipating a priori that it may be extended. So if the necessity of providing extensions is an afterthought (which it is, in my case), then it doesn't fit the bill.


Extension of the Generic DAO Contract

One of the nifty features of EJB QL is that the user can specify a constructor within the SELECT clause that can allocate non-entity POJOs with the set of specified columns as constructor arguments. Let me illustrate through an example shamelessly copied from Richard Monson-Haefel and Bill Burke's Enterprise JavaBeans book.

public class Name {
  private String first;
  private String last;

  public Name(String first, String last) {
    this.first = first;
    this.last = last;
  }

  public String getFirst() { return first; }
  public String getLast() { return last; }
}


Note that Name is NOT an entity. Using EJB QL, we can actually write a query that will return a list of Name objects instead of a list of Strings.

SELECT new com.x.y.Name(c.firstName, c.lastName) FROM Customer c

I wanted to provide a contract which can return a collection of objects belonging to a different class than the Entity itself :

public <Context, Ret> List<Ret> read(Context ctx,
      String queryString,
      Object[] params);


And I wanted to have this contract specifically for the JPA implementation.

Dynamic Extension Objects using Inter-type Declarations in Aspects

Inter-type declarations in aspects provide a convenient way to declare additional methods or fields on behalf of a type. Since I have been using Spring 2.0 for the JPA implementation of the DAO, I went in for Spring Introductions, which allow me to introduce new interfaces (and a corresponding implementation) to any proxied object.

Quick on the heels, I came up with the following contract which will act as a mixin to the DAO layer:

public interface IJPAExtension {
  public <Context, Ret> List<Ret> read(Context ctx,
      String queryString,
      Object[] params);
}


and a default implementation ..

public class JPAExtension<T extends DTOBase> implements IJPAExtension {
  public <Context, Ret> List<Ret> read(Context ctx,
      String queryString,
      Object[] params) {
    // ...
  }
}
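
The elided body could be fleshed out along the following lines, assuming - as the client snippet later in this post suggests - that the context passed in is the JPA EntityManagerFactory :

import java.util.List;

import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Query;

public class JPAExtension<T extends DTOBase> implements IJPAExtension {
  @SuppressWarnings("unchecked")
  public <Context, Ret> List<Ret> read(Context ctx,
      String queryString,
      Object[] params) {
    EntityManager em = ((EntityManagerFactory) ctx).createEntityManager();
    try {
      Query query = em.createQuery(queryString);
      if (params != null) {
        // JPA positional parameters (?1, ?2, ..) are 1-based
        for (int i = 0; i < params.length; ++i) {
          query.setParameter(i + 1, params[i]);
        }
      }
      return (List<Ret>) query.getResultList();
    } finally {
      em.close();
    }
  }
}
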


And .. the Weaving in Spring 2.0

The client who wishes to use the new interface needs to define the extension object just to introduce the mixin - the rest is the AOP magic that weaves together all necessary pieces and makes everybody happy.

@Aspect
public class DAOExtension {

  @DeclareParents(value="com.x.y.dao.provider.spring.jpa.dao.*+",
    defaultImpl=JPAExtension.class)
  private IJPAExtension mixin;
}


The original contracts remain unpolluted, other implementations do not bloat, and still we have successfully introduced new functionality into the JPA implementation, without the client committing to any implementation class (we all know why we program to interfaces - right ?). The client code can write the following :

IJPAExtension mixin = (IJPAExtension)restaurantDao;
List<RName> res =
    mixin.read(factory,
      "select new com.x.y.dao.provider.spring.jpa.RName(r.id, r.name) from Restaurant r where r.name like ?1",
      params);


Inter-type declarations are not a very frequently used feature of aspect oriented programming. But they are a useful vehicle for implementing many patterns in a completely non-invasive way. I found them very useful while extending my JPA based DAO implementations without adding to the base contracts of the bridge.

Tuesday, August 08, 2006

XML Integration in Java and Scala

During my trip to JavaOne 2006, I missed the session by Mark Reinhold where he discussed the plan for integrating XML into the Java programming language. There have been lots of discussions in various forums about the possibility of this happening in Dolphin - Kirill Grouchnikov has blogged his thoughts on what he would like to see as part of native XML support in Java. The community, as usual, is divided on this subject - many people feel that integrating XML into the Java language will seriously compromise the simplicity of the language. Look at the comments section of this posting in JavaLobby. This feeling has gained more momentum in view of the upcoming integration of scripting languages like JavaScript/ECMAScript (Rhino) with Java (JSR 223).

Anyway, I think Java will get a first cut of XML integration in Dolphin. In the JavaOne session, Mark discussed some of the options they plan to offer in java.lang.XML, so as to make XML processing simpler in Java and liberate programmers from the hell of dealing with the DOM APIs. Microsoft has already published its implementation of XML integration into C# and VB in the form of XLinq. I tried my hand at it using the June CTP and found it to be quite elegant. In fact, the whole thing fits seamlessly with the entire LINQ family and Microsoft's plan of fixing the infamous ROX triangle. Java has been lagging behind in this respect and is making a last attempt to catch up - though expect nothing till Dolphin! I appreciate that, considering the millions of users Java has today and its commitments to the community as the default choice for the enterprise platform (unless you are Bruce Tate, of course!), it is not easy to veto a change in the language. Still, better late than never.

<scala/xml>

A few days ago, I was browsing through some of Mark's slides from JavaOne, when I thought that it would be a worthwhile exercise to find out how these could be implemented in Scala, which, in fact, offers the most complete XML integration as part of the language. I have repeatedly expressed my views about Scala in my blog (see here) and how positive I feel about saying Hello Scala. XML integration in Scala is no exception - in fact, the nicest part of this integration is that the designers did not have to do much extra to push XML as a first class citizen in the Scala world. The elements of Scala that make it a nice host to XML integration are some of the core features of the language itself:

  • Scala being a functional language supports higher order functions, which provide a natural medium to handle recursive XML trees

  • Scala supports pattern matching, which can model algebraic data types and be easily specialized for XML data

  • For-comprehensions in Scala act as a convenient front end syntax for queries


Go through this Burak Emir paper for more on how XML integration in Scala offers scalable abstractions for service based architectures.

For brevity, I am not repeating the snippets that Mark presented. They can be found on the JavaOne site under session TS-3441. I will try my hand at some of the equivalent Scala manifestations.

Disclaimer: I am no expert in Scala, hence any improvements / suggestions to make the following more Scala-ish are very much welcome. Also, I tested this code with the recent drop of 2.1.7-patch8283.

Construction : XML Literals

This example adds more literals to an existing XML block. Here's the corresponding snippet in Scala:


val mustang =
  <feature>
    <id>29</id>
    <name>Method to find free disk space</name>
    <engineer>iris.garcia</engineer>
    <state>approved</state>
  </feature>;

def addReviewer(feature: Node, user: String, time: String): Node =
  feature match {
    case <feature>{ cs @ _* }</feature> =>
      <feature>{ cs }<reviewed>
      <who>{ user }</who>
      <when>{ time }</when>
      </reviewed></feature>
  }

Console.println(addReviewer(mustang,
        "graham.hamilton",
        "2004-11-07T13:44:25.000-08:00"));


The highlights of the above implementation are the brevity of the language, the mixing of code and XML data in the method addReviewer() and the use of regular sequence patterns (the { cs @ _* } wildcard), which can be useful for non-XML data as well. In case you wish, you can throw in Scala expressions within the XML data as well.

Queries, Collections, Generics, Paths

This snippet demonstrates the capabilities of XML queries in various manifestations, including XPath-style queries. One major difference that I noticed is that the Scala representation of runtime XML is immutable, while the assumption in Mark's example was that java.lang.XML is mutable. I am not sure what the final Java offering will be, but immutable data structures have their own pros, and I guess the decision to make the runtime XML representation immutable was a very well thought out one by the Scala designers. This adds a little verbosity to the Scala code below compared to its Java counterpart.

val mustangFeatures =
  <feature-list>
    <release>Mustang</release>
    <feature>
      <id>29</id>
      <name>Method to find free disk space</name>
      <engineer>iris.garcia</engineer>
      <state>approved</state>
    </feature>
    <feature>
      <id>201</id>
      <name>Improve painting (fix gray boxes)</name>
      <engineer>scott.violet</engineer>
      <state>approved</state>
    </feature>
    <feature>
      <id>42</id>
      <name>Zombie references</name>
      <engineer>mark.reinhold</engineer>
      <state>rejected</state>
    </feature>
  </feature-list>;

def isOpen(ft: Node): Boolean = {
  // a feature is open unless its state is "approved"
  if ((ft \ "state").text.equals("approved"))
    false
  else
    true
}

def rejectOpen(doc: Node): Node = {

  def rejectOpenFeatures(features: Iterator[Node]): List[Node] = {
    for(val ft <- features) yield ft match {

      case x @ <feature>{ f @ _ * }</feature> if isOpen(x.elements.next) =>
        <feature>
        <id>{(x.elements.next \ "id").text}</id>
        <name>{(x.elements.next \ "name").text}</name>
        <engineer>{(x.elements.next \ "engineer").text}</engineer>
        <state>rejected</state>
      </feature>

      case _ => ft
    }
  }.toList;

  doc match {
    case <feature-list>{ fts @ _ * }</feature-list> =>
      <feature-list>{ rejectOpenFeatures(fts.elements) }</feature-list>
  }
}

val pp = new PrettyPrinter( 80, 5 );
Console.println(pp.format(rejectOpen(mustangFeatures)));


The observations on the XML querying support in Scala are :

  • Use of for-comprehensions (in rejectOpenFeatures()) adds to the brevity and clarity of the code

  • XPath-like methods (in isOpen() .. remember, in Scala ft \ "state" is just the method call ft.\("state")) allow XQuery-style programming.


Another example, which combines both of the above features into a concise gem, is the following from another Burak Emir presentation:

for (val z <- doc("books.xml") \ "bookstore" \ "book";
    z \ "price" > 30)
yield z \ "title"


Streaming In and Out

Mark showed an example of formatting XML output after summarizing all approved features from the input XML. We can have a similar implementation in Scala as follows :

def findApproved(doc: Node): Node = {

  def findApprovedFeatures(features: Iterator[Node]): List[Node] = {
    for(val ft <- features; (ft \ "state").text.equals("approved"))
      yield ft
    }.toList;

  doc match {
    case <feature-list>{ fts @ _ * }</feature-list> =>
      <feature-list>{ findApprovedFeatures(fts.elements) }</feature-list>
  }
}

Console.println(new PrettyPrinter(80, 5)
      .format(findApproved(XML.loadFile("mustang.xml"))));


Along with formatted output, the snippet above also demonstrates loading of XML from a stream.


On the whole, Scala's support for XML processing is very rich, more so because of the support that it gets from the underlying features of the language. Scala offers powerful abstractions for transformations (scala.xml.transform), parsing, validation, handling XML expressions, XPath projections, XSLT-style transformations and XQuery-style querying. The Scala XML library is fairly comprehensive - most importantly, it is alive and kicking. Till you have the same support in Java (Dolphin is still at least a year away), enjoy <scala/xml>.

Monday, July 31, 2006

Inside the New ConcurrentMap in Mustang

Tiger has offered a large number of killer goodies for Java developers. Some of them have enjoyed major focus in the community, like generics, the enhanced for-loop, autoboxing, varargs, type-safe enums etc. But none has had as sweeping an impact as the new java.util.concurrent. Thanks to Doug Lea, Java now boasts the best library support for concurrent programming in the industry. Martin Fowler relates an interesting anecdote in his report on OOPSLA 2005:
While I'm on the topic of concurrency I should mention my far too brief chat with Doug Lea. He commented that multi-threaded Java these days far outperforms C, due to the memory management and a garbage collector. If I recall correctly he said "only 12 times faster than C means you haven't started optimizing".

Indeed the concurrency model in Tiger has brought to mainstream programming the implementation of non-blocking algorithms and data structures, based on the very important concept of CAS. For a general introduction to CAS and nonblocking algorithms in Java 5, along with examples and implementations, refer to the Look Ma, no locks! article by Brian Goetz.

Lock-Free Data Structures

The most common way to synchronize concurrent access to shared objects is the use of mutual exclusion locks. While Java has so long offered locking at various levels as this synchronization primitive, with Tiger we have non-blocking data structures and algorithms based on the Compare and Set (CAS) primitive, available in all modern processors. CAS is a processor primitive which takes three arguments - the address of a memory location, an expected value and a new value. If the memory location holds the expected value, it is assigned the new value atomically. Unlike lock based approaches, where we may have performance degradation due to delay of the lock-holder thread, lock-free implementations guarantee that of all threads trying to perform some operations on a shared object, at least one will be able to complete within a finite number of steps, irrespective of the other threads' actions.

java.util.concurrent provides ample implementations of lock free algorithms and data structures in Tiger. All of these have been covered extensively in Brian Goetz's excellent book Java Concurrency in Practice, released at JavaOne this year - go get it, if you haven't yet.


I must admit that I am not a big fan of the management of Sun Microsystems, and the confused state of mind that Schwartz and his folks out there portray to the community. Innovation happens elsewhere - this has never been more true of the way the Java community has been moving. And this is what has kept Sun moving - the vibrant Java community has been the real lifeblood behind Java's undisputed leadership in the enterprise software market today (Ruby community, are you listening?). The entire Java community is still working tirelessly to improve Java as a computing platform. Lots of research is still going on to increase the performance of memory allocation and deallocation in the JVM (see this). Lots of heads are burning out over implementing generational garbage collection, thread-local allocation blocks and escape analysis in Java. Doug Lea is still working on how to make concurrent programming easier for the mere mortals. This, I think, is the main strength of the Java community - any other platform that promises more productivity has to walk (or rail) more miles to come up with something similar.

In this post, I will discuss one such innovation that has been bundled into Mustang. I discovered it only recently while grappling with the Mustang source drop, and thought that this exceptional piece of brilliance deserves a column of its own.

The New ConcurrentMap in Mustang

In Tiger we had ConcurrentHashMap as an implementation of ConcurrentMap. Mustang comes with another variant of the map - the ConcurrentNavigableMap contract and a brilliant implementation in ConcurrentSkipListMap. Have a look at the source code of this beast - you will be thankful that data structures are there to encapsulate the guts and provide easy-to-use interfaces to application developers.
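
Before diving into the innards, here is a minimal usage sketch (the key/value pairs are made up) showing what the new contract buys you - sorted traversal and navigable views, all without any external locking:

import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

public class SkipListDemo {
  public static void main(String[] args) {
    ConcurrentNavigableMap<Integer, String> features =
        new ConcurrentSkipListMap<Integer, String>();
    features.put(201, "Improve painting");
    features.put(29, "Find free disk space");
    features.put(42, "Zombie references");

    // iteration is always in ascending key order - the skip list keeps keys sorted
    System.out.println(features.firstKey());     // prints 29
    // navigable views remain safe under concurrent access
    System.out.println(features.headMap(100));   // prints {29=Find free disk space, 42=Zombie references}
    System.out.println(features.ceilingKey(43)); // prints 201
  }
}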

Concurrent programming has never been easy, and lock-free concurrency implementation is definitely not for lesser mortals. We are blessed to have people like Doug Lea take care of these innards and expose easy-to-use interfaces to us, the user community. Although research on lock-free data structures has been going on for more than a decade, the first efficient and correct lock-free list-based set algorithm (CAS based) that is compatible with lock-free memory management methods came out only in 2002. Lea's implementation of ConcurrentSkipListMap is based on this algorithm, although it uses a slightly different strategy for handling deletion of nodes.

Why SkipList ?

The most common data structures for implementing sorted collections are forms of balanced trees. The current implementation of ConcurrentSkipListMap departs from this route and uses a probabilistic alternative - skip lists. As Bill Pugh says

Skip lists are a data structure that can be used in place of balanced trees. Skip lists use probabilistic balancing rather than strictly enforced balancing and as a result the algorithms for insertion and deletion in skip lists are much simpler and significantly faster than equivalent algorithms for balanced trees.


The verdict is not as clear-cut as Bill makes it sound, but the main reason behind using skip lists in the current implementation is that there are no known efficient lock-free insertion and deletion algorithms for search trees (refer to the JavaDoc for the class). The class uses a two-dimensionally linked skip list implementation in which the base list nodes (holding key and data) form a separate level from the index nodes.
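
To visualize this two-dimensionally linked structure, here is a schematic sketch (field names simplified by me - this is not the actual source) of the two kinds of nodes involved:

// base level: singly linked nodes holding the real key/value data
class Node<K, V> {
  K key;
  Object value;         // volatile and CAS'ed in the real implementation
  Node<K, V> next;
}

// upper levels: index nodes forming the express lanes for faster search
class Index<K, V> {
  Node<K, V> node;      // the base node this index entry refers to
  Index<K, V> down;     // same entry, one level below
  Index<K, V> right;    // next index entry on the same level
}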

Lock-Free Using CAS Magic

Any non-blocking implementation has a core loop, since the compareAndSet() method relies on the fact that at least one of the threads trying to operate on the shared resource will complete. Here is the snippet (lightly adapted) from Brian Goetz's article - look at the increment() method of the counter:

import java.util.concurrent.atomic.AtomicInteger;

public class NonblockingCounter {
  private AtomicInteger value = new AtomicInteger(0);

  public int getValue() {
    return value.get();
  }

  public int increment() {
    int v;
    do {
      // read the current value; retry the CAS until no other thread
      // has sneaked in an update between our read and our write
      v = value.get();
    } while (!value.compareAndSet(v, v + 1));
    return v + 1;
  }
}


Similarly, the implementation methods in ConcurrentSkipListMap all have a basic loop in order to ensure a consistent snapshot of the three-node structure (node, predecessor and successor). Repeated traversal is required here because the three-node snapshot may have been rendered inconsistent by some other thread, either through deletion of the node itself or through removal of one of its adjacent nodes. This is typical CAS coding and can be found in the implementation methods doPut(), doRemove(), findNode() etc.

Handling Deletion

The original designers of these algorithms for list-based sets used mark bits and lazy removal for deletion of nodes. Doug Lea made a clever improvisation here to use a marker node (with a directly CAS'able next pointer) instead, which works faster in a garbage-collected environment. He retains, however, the key technique of marking the next pointer of a deleted node in order to prevent a concurrent insertion. Here's the sequence of actions that takes place in a delete:

  1. Locate the node (n)

  2. CAS n's value to null

  3. CAS n's next pointer to point to a marker node

  4. CAS n's predecessor's next pointer over n and the marker

  5. Adjust index nodes and head index level


Any failure can lead to either of the following consequences :

  1. A simple retry, when the current thread has lost a race to another competing thread

  2. Some other thread traversing the list hits upon the null value and helps out with the marking / unlinking part


The interesting point is that in either case we make progress, which is the basic claim of the CAS-based non-blocking approach. Harris and Maged Michael have documented all the gory details of this technique here and here.
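
To make the choreography concrete, here is a heavily simplified sketch of steps 2 through 4 above (hypothetical classes of mine, not Doug Lea's actual code), using plain AtomicReferences. The "helping" shows up as other threads noticing the null value or the marker and completing the unlink themselves:

import java.util.concurrent.atomic.AtomicReference;

// simplified list node; a null key marks a marker node
class LFNode<K, V> {
  final K key;
  final AtomicReference<Object> value;
  final AtomicReference<LFNode<K, V>> next;

  LFNode(K key, Object value, LFNode<K, V> next) {
    this.key = key;
    this.value = new AtomicReference<Object>(value);
    this.next = new AtomicReference<LFNode<K, V>>(next);
  }
}

class DeletionSketch {
  // attempt to delete node n whose current predecessor is pred
  static <K, V> boolean delete(LFNode<K, V> pred, LFNode<K, V> n) {
    Object v = n.value.get();
    if (v == null)
      return false;                        // someone else already deleted it
    // step 2: CAS the value to null - the point of logical deletion
    if (!n.value.compareAndSet(v, null))
      return false;                        // lost the race; caller re-traverses
    // step 3: CAS n's next pointer to a marker, fencing off concurrent inserts
    LFNode<K, V> succ = n.next.get();
    LFNode<K, V> marker = new LFNode<K, V>(null, null, succ);
    if (n.next.compareAndSet(succ, marker)) {
      // step 4: CAS the predecessor's next pointer over n and the marker
      pred.next.compareAndSet(n, succ);
    }
    // if either CAS failed, a traversing thread that hits the null value
    // or the marker helps out with the unlinking - progress either way
    return true;
  }
}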

Postscript

The code for the implementation of ConcurrentSkipListMap is indeed very complex. Firstly, it deals with a multilevel probabilistic data structure (the skip list), and secondly, it makes that structure concurrent using the lock-free techniques of CAS. But on the whole, for anyone who enjoys learning data structure implementations, this will definitely be a very good learning experience. The devil is in the details - never more true than in this exquisite piece from Doug Lea!

Monday, July 24, 2006

From Java to Ruby ? Now ? Naah ..

Bruce Tate has recently got his From Java to Ruby out. This is another in a series of publications which profess Ruby as the successor to Java in the enterprise. The book is targeted towards technical managers who can stand by their enlightened programmers' decision to make the royal switch and take it upwards within the organization (without getting fired!). I have not yet read Tate's book, but I thoroughly enjoyed reading his Beyond Java. This entry is the blueprint of my thoughts on the subject - will Java be replaced by Ruby in the enterprise today?

Cedric has already voiced his opinion on this subject in one of his usual opinionated (Rails is also opinionated - right ?) posts. I think I have a couple to add to his list ..

Scaling Up with Ruby

One of the major areas of my concern with Ruby going mainstream is the skillset scalability of the enterprise. The programming force, at large, is now baked in the realm of the supposedly (im)pure OO paradigms of Java. Call it the Perils of Java Schools, the lack of appreciation for the elegance of functional programming, or whatever - the fact is that zillions of developers today are used to programming with assignments and iterators, as they are idiomatic in Java and C++. It will take quite a beating to infuse into them Why FP Matters.

Ruby is elegant, Ruby blocks are cool, Ruby has continuations and Ruby offers a coroutine-based solution to the same-fringe problem. But, again, there ain't no such thing as a free lunch! You have to develop your workforce to take good care of this elegance in developing enterprise-scale applications. The following is an example of the paradigm shift, shamelessly ripped from Bruce Tate's Beyond Java:

The developers in my workforce are used to writing JDBC-style access in Spring using anonymous inner classes :

JdbcTemplate template = new JdbcTemplate(dataSource);
final List names = new LinkedList();

template.query("select user.name from user",
    new RowCallbackHandler() {
      public void processRow(ResultSet rs) throws SQLException {
        names.add(rs.getString(1));
      }
    }
);


Here is a Ruby snippet implementing similar functionality through "blocks" ..

dbh.select_all("select name, category from animal") do |row|
    names << row[0]
end


A real gem - but the developers have to get used to this entirely new paradigm. It is not only a syntactic change; it implies a new thought process on the part of the developer. Remember, one of the reasons why Java could smartly rip apart the C++ community was that it was a look-alike language with a cleaner memory model and a closer affiliation to the Internet. At one point in time, we all thought that Smalltalk had an equal chance of gobbling up the C++ programming fraternity. Smalltalk is a much purer OO language, but proved to be too elegant to be adopted en masse.

Martin Fowler and Bruce Tate have been evangelizing Ruby, and DHH has presented us with a masterfully elegant framework (ROR). But we need more resources to scale up - more books, more tutorials, more evangelism on the idioms of Ruby that have gone into the making of ROR.

The Art of Ruby Metaprogramming

Metaprogramming is the second habit of Ruby programmers (possibly after "Ruby blocks"). Many of the problems that we face today due to the lack of formal AOP in Ruby can be addressed by metaprogramming principles. In fact, metaprogramming offers much more "raw" power than AOP, as is very well illustrated by the following method from Rails validation ..



def validates_presence_of(*attr_names)
  configuration = { :message => ActiveRecord::Errors.default_error_messages[:blank], :on => :save }
  configuration.update(attr_names.pop) if attr_names.last.is_a?(Hash)

  # can't use validates_each here, because it cannot cope with nonexistent attributes,
  # while errors.add_on_empty can
  attr_names.each do |attr_name|
    send(validation_method(configuration[:on])) do |record|
      unless configuration[:if] and not evaluate_condition(configuration[:if], record)
        record.errors.add_on_blank(attr_name,configuration[:message])
      end
    end
  end
end



But this also reflects my earlier concern - programmers have to be groomed to cope with this kind of semantics in their programming. Many of the metaprogramming techniques have become idioms in Ruby - we need more preaching, professing their use and best practices to the programming community. Otherwise Ruby metaprogramming will remain black magic forever.

Final Thoughts

Rails may be the killer app, metaprogramming may be the killer technique, but we all need to be more pragmatic about Ruby's chances in the enterprise. There are performance concerns for Rails, and the model that it adopts for ORM is divergent from the one we use in Java and definitely not one that can back up a solid object-oriented domain model. It is debatable whether this will be a better fit for enterprise applications - but the community needs to tune the framework constantly if it is to compete with the age-old competencies of Java. With Java 5, we have a JVM which has been tuned for the last 10 years, we have killer libraries for concurrency (I hear it is capable of competing with raw C!) and we have oodles of goodies to make Java programming compete with the best-of-breed performant systems. We have Mustang and Dolphin ready to make their impact on the enterprise world. It is definitely worth watching whether the elegance of Ruby can scale up to these realities and give Sun (and the entire Java community) a run for their money.