Ruminations of a Programmer: jpa-gotcha-series


Monday, October 15, 2007

Domain Modeling with JPA - The Gotchas - Part 4 - JPA is a nice way to abstract Repository implementation

When we model a domain, we love to work at a higher level of abstraction through Intention Revealing Interfaces, where the focus is always on the domain artifacts. Whenever we start coding at the domain layer in terms of SQL and pulling resultsets out of the database through a JDBC connection, we lose the domain focus. When we start dealing with the persistent hierarchy directly instead of the domain hierarchy, we make the domain model more vulnerable, exposing it to the lower layers of abstraction. By decoupling the domain model from the underlying persistent relational model, JPA provides us with an ideal framework for building higher levels of abstraction towards data access. The domain model can now access data in terms of collections of domain entities and not in terms of the table structures where these entities are deconstructed. And the artifact that provides us a unified view of the underlying collection of entities is the Repository.

Why Repository ?

In the days of EJB 2.0, we had the DAO pattern. Data Access Objects also provided an abstraction of the underlying data model by defining queries and updates for every table of the model. However, the differences between DAOs and repositories are more semantic than technical. Repositories provide a higher level of abstraction and are a more natural inhabitant of the rich domain model. Repositories offer more controlled access to the domain model, only through Aggregate roots, implementing a set of APIs that follow the Ubiquitous Language of the domain. Programming at the JDBC level with the DAO pattern, we used to think in terms of individual tables and operations on them. However, in reality, the domain model is much more than that - it contains complex associations between entities and strong business rules that govern the integrity of the associations. Instead of having the domain model deal directly with individual DAOs, we had always felt the need for a higher level of abstraction that would shield the domain layer from the granularities of joins and projections. ORM frameworks like Hibernate gave us this ability and specifications like JPA standardized it. Build your repository to get this higher level of abstraction on top of DAOs.

You build a repository at the level of the Aggregate Root, and provide access to all entities underneath the root through the unified interface. Here is an example ..

@Entity
class Order {
  private String orderNo;
  private Date orderDate;
  private Collection<LineItem> lineItems;
  //..

  //.. methods
}

@Entity
class LineItem {
  private Product product;
  private long quantity;
  //..

  //.. methods
}

The above snippet shows the example of an Aggregate, with Order as the root. For an Order Management System, all that the user needs is to manipulate Orders through intention revealing interfaces. The user should not be given access to manipulate individual line items of an Order. Doing so may lead to inconsistency in the domain model if an operation on a LineItem invalidates an invariant of the Order aggregate. While the Order entity encapsulates all domain logic related to manipulation of an Order aggregate, the OrderRepository is responsible for giving the user a single point of interface with the underlying persistent store.

interface OrderRepository {
  //.. queries
  List<Order> getAll();
  List<Order> getByOrderNo(String orderNo);
  List<Order> getByOrderDate(Date orderDate);
  List<Order> getByProduct(Product product);

  //..
  void write(final Order order);
  //..
}

Now the domain services can use this repository to access orders from the underlying database. This is what Eric Evans calls reconstitution, as opposed to construction (which is typically the responsibility of the factory).

JPA to implement Repository

The nice thing about being able to program to a specification is the abstraction that you can enforce on your model. Repositories can be implemented using JPA and can nicely be abstracted away from the actual domain services. A Repository acts like a collection and gives the user the illusion of using memory based data structures without exposing the internals of the interactions with the persistent store. Let us see a sample implementation of a method of the above repository ..

class OrderRepositoryImpl implements OrderRepository {
  //..
  // assumed here: the container injects the EntityManager
  @PersistenceContext
  private EntityManager em;

  public List<Order> getByProduct(Product product) {
    String query = "select o from Order o, IN (o.lineItems) li where li.product.id = ?1";
    Query qry = em.createQuery(query);
    qry.setParameter(1, product.getId());

    List<Order> res = qry.getResultList();
    return res;
  }
  //..
  //..
}

The good part is that we have used JPA to implement the Repository, but the actual domain services will not contain a single line of JPA code in them. All of the JPA dependencies can be localized within the Repository implementations. Have a look at the following OrderManagementService api ..

class OrderManagementService {
  //..
  // to be dependency injected
  private OrderRepository orderRepository;

  // apply a discount to all orders for a product
  public List<Order> markDown(final Product product, float discount) {
    List<Order> orders = orderRepository.getByProduct(product);
    for(Order order : orders) {
      order.setPrice(order.getPrice() * discount);
    }
    return orders;
  }
  //..
  //..
}

Note that the repository is injected through DI containers like Spring or Guice, so that the domain service remains completely independent of the implementation of the repository.
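To make the point concrete, here is a minimal sketch in which the same service is wired by hand to an in-memory repository. It is not taken from the original code: InMemoryOrderRepository, the trimmed-down Order and Product classes, and the constructor injection are all assumptions made for illustration, and a container like Spring or Guice would do the wiring in a real application.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for the domain types used in the post.
class Product {
    private final String id;
    Product(String id) { this.id = id; }
    String getId() { return id; }
}

class Order {
    private final String orderNo;
    private final List<Product> products = new ArrayList<>();
    private double price;

    Order(String orderNo, double price) {
        this.orderNo = orderNo;
        this.price = price;
    }

    Order addProduct(Product product) { products.add(product); return this; }

    // true if any line of this order refers to the given product
    boolean contains(Product product) {
        for (Product p : products) {
            if (p.getId().equals(product.getId())) return true;
        }
        return false;
    }

    String getOrderNo() { return orderNo; }
    double getPrice() { return price; }
    void setPrice(double price) { this.price = price; }
}

// The domain-facing contract: no persistence technology leaks through it.
interface OrderRepository {
    List<Order> getByProduct(Product product);
    void write(Order order);
}

// Assumed stand-in for the JPA-backed implementation, e.g. for unit tests.
class InMemoryOrderRepository implements OrderRepository {
    private final List<Order> store = new ArrayList<>();

    public void write(Order order) { store.add(order); }

    public List<Order> getByProduct(Product product) {
        List<Order> result = new ArrayList<>();
        for (Order order : store) {
            if (order.contains(product)) result.add(order);
        }
        return result;
    }
}

class OrderManagementService {
    private final OrderRepository orderRepository;

    // constructor injection: the service only ever sees the interface
    OrderManagementService(OrderRepository orderRepository) {
        this.orderRepository = orderRepository;
    }

    // multiplies the price of every order containing the product by the factor
    public List<Order> markDown(Product product, double factor) {
        List<Order> orders = orderRepository.getByProduct(product);
        for (Order order : orders) {
            order.setPrice(order.getPrice() * factor);
        }
        return orders;
    }
}
```

Swapping in a JPA-backed OrderRepositoryImpl changes nothing in OrderManagementService, which is also what makes the service easy to test without a database.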

But OrderRepository is also a domain artifact !

Right .. and with proper encapsulation we can abstract away the JPA dependencies from OrderRepositoryImpl as well. I have blogged before on how to implement a generic repository service and make all domain repositories independent of the implementation.

Monday, October 08, 2007

Domain Modeling with JPA - The Gotchas - Part 3 - A Tip on Abstracting Relationships

JPA is all about POJOs and all relationships are managed as associations between POJOs. All JPA implementations are based on best practices that complement an ideal POJO based domain model. In the first part of this series, I had talked about immutable entities, which prevent clients from inadvertent mutation of the domain model. Mutation mainly affects associations between entities thereby making the aggregate inconsistent. In this post, I will discuss some of the gotchas and best practices of abstracting associations between entities from client code making your domain model much more robust.

Maintaining relationships between entities is done by the underlying JPA implementation. But it is the responsibility of the respective entities to set them up based on the cardinalities specified as part of the mapping. This setting up of relationships has to be explicit through appropriate message passing between the respective POJOs. Let us consider the following relationship :

@Entity
public class Employee implements Serializable {
  //..
  //.. properties

  @ManyToMany
  @JoinTable(
    name="ref_employee_skill",
    joinColumns=@JoinColumn(name="employee_pk", referencedColumnName="employee_pk"),
    inverseJoinColumns=@JoinColumn(name="skill_pk", referencedColumnName="skill_pk")
  )
  private Set<Skill> skills;
  //..
  //.. other properties
  //.. methods
}

There is a many-to-many relationship between Employee and Skill entities, which is set up appropriately using proper annotations. Here are the respective accessors and mutators that help us manage this relationship on the Employee side :

@Entity
public class Employee implements Serializable {
  //..
  //.. properties

  public Set<Skill> getSkills() {
    return skills;
  }

  public void setSkills(Set<Skill> skills) {
    this.skills = skills;
  }
  //..
}

Similarly on the Skill entity we will have the corresponding annotations and the respective accessors and mutators for the employees collection ..

@Entity
public class Skill implements Serializable {
  //..
  //.. properties

  @ManyToMany(
    mappedBy="skills"
  )
  private Set<Employee> employees;

  public Set<Employee> getEmployees() {
    return employees;
  }

  public void setEmployees(Set<Employee> employees) {
    this.employees = employees;
  }
  //..
}

Can you see the problem in the above model ?

The problem lies in the fact that the model is vulnerable to inadvertent mutation. Public setters are evil and expose the model to inconsistent changes by the client. How ? Let us look at the following snippet of client code trying to set up the domain relationships between an Employee and a collection of Skills ..

Employee emp = .. ; // managed
Set<Skill> skills = .. ; //managed
emp.setSkills(skills);

The public setter sets the skillset of the employee to the set skills. But what about the back reference ? Every skill should also point to the Employee emp. And this needs to be done explicitly by the client code.

for(Skill skill : skills) {
  skill.getEmployees().add(emp);
}

This completes the relationship management code on part of the client. But is this the best level of abstraction that we can offer from the domain model ?

Try this !

If the setter can make your model inconsistent, do not make it public. Hibernate does not mandate public setters for its own working. Replace public setters with domain friendly APIs, which make more sense to your client. How about addSkill(..) ?

@Entity
public class Employee implements Serializable {
  //..
  //.. properties

  public Employee addSkill(final Skill skill) {
    skills.add(skill);
    skill.addEmployee(this);
    return this;
  }
  //..
}

addSkill() adds a skill to an employee. Internally it updates the collection of skills and best of all, transparently manages both sides of the relationship. And returns the current Employee instance to make it a FluentInterface. Now your client can use your API as ..

Employee emp = .. ; // managed
emp.addSkill(skill_1)
   .addSkill(skill_2)
   .addSkill(skill_3);

Nice!

For clients holding a collection of skills (managed), add another helper method ..

@Entity
public class Employee implements Serializable {
  //..
  //.. properties

  public Employee addSkill(final Skill skill) {
    //.. as above
  }

  public Employee addSkills(final Set<Skill> skills) {
    // the parameter shadows the field, so qualify the field explicitly
    this.skills.addAll(skills);

    for(Skill skill : skills) {
      skill.addEmployee(this);
    }
    return this;
  }
  //..
}

Both of the above methods abstract away the mechanics of relationship management from the client and present fluent interfaces that are much friendlier for the domain. Remove the public setter, and don't forget to make the getter return an unmodifiable view of the collection ..

@Entity
public class Employee implements Serializable {
  //..
  //.. properties

  public Set<Skill> getSkills() {
    return Collections.unmodifiableSet(skills);
  }
  //..
}
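Putting the pieces of this post together, a self-contained sketch might look like the following. The JPA annotations are left out so that it runs as plain Java, and the names and constructors are assumptions for illustration. One detail goes beyond the snippets above: addSkill() only calls back into Skill when the skill was actually added, which keeps the two sides from calling each other forever should Skill ever get a symmetric addEmployee() that calls addSkill().

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class Skill {
    private final String name;
    private final Set<Employee> employees = new HashSet<>();

    Skill(String name) { this.name = name; }

    String getName() { return name; }

    // package-private on purpose: only Employee.addSkill() is meant to call this
    void addEmployee(Employee employee) { employees.add(employee); }

    // read-only view, so clients cannot bypass the domain API
    Set<Employee> getEmployees() { return Collections.unmodifiableSet(employees); }
}

class Employee {
    private final String name;
    private final Set<Skill> skills = new HashSet<>();

    Employee(String name) { this.name = name; }

    String getName() { return name; }

    // maintains both sides of the association and returns this for chaining
    Employee addSkill(Skill skill) {
        if (skills.add(skill)) {      // false if the skill was already present
            skill.addEmployee(this);
        }
        return this;
    }

    // parameter deliberately named differently from the field to avoid shadowing
    Employee addSkills(Set<Skill> newSkills) {
        for (Skill skill : newSkills) {
            addSkill(skill);
        }
        return this;
    }

    Set<Skill> getSkills() { return Collections.unmodifiableSet(skills); }
}
```

A client can now write new Employee("Ann").addSkill(a).addSkill(b), while a call such as emp.getSkills().add(c) fails with an UnsupportedOperationException instead of silently corrupting the back references.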

The above approach makes your model more robust and domain friendly by abstracting away the mechanics of relationship management from your client code. Now, as an astute reader, you must be wondering how you would use this domain entity as the command object of your controller layer in MVC frameworks - the setters are no longer there as public methods ! Well, that is the subject of another post, another day.

Friday, September 28, 2007

Domain Modeling with JPA - The Gotchas - Part 2 - The Invaluable Value Objects

In the first post of this gotcha series, I had discussed some of the issues around making entities publicly immutable, by not exposing direct setters to the layers above. This approach has its own set of advantages and offers a big safety net to the domain model. The domain model can then be manipulated only through the methods published by the domain contracts. While still on the subject of immutability of domain models, I thought I would discuss the other cousin of immutable entities, one that plays a very big role in making your domain driven design more supple.

Enter Value Objects.

While an object-oriented domain model focuses on the behavior of entities, the relational persistence model manages object identities. And a successful marriage of the two paradigms is the job of a good ORM framework. But not all entities need to maintain their identities - their behaviors depend only upon the values they carry. Eric Evans calls them Value Objects.

Value objects are an integral part of any object oriented model, while they are somewhat obscure in the relational persistence model. It is a real challenge to have a successful representation of value objects as reusable abstractions in the OO domain model, while transparently storing them in the relational model with minimum invasiveness on the part of the programmer. Value objects increase the reusability of the domain model and JPA offers a flexible programming model to make their persistence transparent to the developer. The big advantage with value objects is that you need not manage their identities or their lifetimes - both are the same as those of the entities which own them.

Modeling a Value Object with JPA

Consider a sample model snippet where an Employee has-an Address - both of them are designed as separate domain objects in the model. After a careful analysis of the domain, we find that addresses are never shared, i.e. each employee will have a unique address. Hence the relational model becomes the following monolithic table structure :

create table employee (
  -- ..employee specific columns
  -- ..
  -- ..address specific columns
)

In the relational model, we need not have any identity for an address - hence it can be seamlessly glued into the employee record. While in the OO model, we need to have a fine grained abstraction for Address, since the purpose of the OO model is to have the most faithful representation of how the domain behaves. The Address class will have its own behavior, e.g. the format in which an Address gets printed depends upon the country of residence, and it makes no sense to club these behaviors within the Employee domain entity. Hence we model the class Address as a separate POJO.

// immutable
class Address {
  private String houseNumber;
  private String street;
  private String city;
  private String zip;
  private String country;

  //.. getters
  //.. no setter
  //.. behaviors
}

and an Employee has-an Address ..

class Employee {
  private Address homeAddress;
  //.. other attributes
}

JPA makes it really easy to have a successful combination of the two models in the above relationship. Just add an @Embedded annotation to the Address property in Employee class. This will do all the magic to make all individual address attributes as separate columns in the Employee table. And of course we can use all sorts of annotations like @AttributeOverride to change column names between the class and the table.

@Entity
class Employee {
  @Embedded
  @AttributeOverrides( {
    @AttributeOverride(name = "street",
        column = @Column(name = "home_street")),
    @AttributeOverride(name = "city",
        column = @Column(name = "home_city")),
    @AttributeOverride(name = "zip",
        column = @Column(name = "home_zip"))})
  private Address homeAddress;
  //.. other attributes
}
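Independently of the mapping, the value semantics of Address can be sketched in plain Java. This is only an illustration under stated assumptions: the constructor, withCity() and the two formatting rules in format() are hypothetical and merely stand in for the country-specific behavior mentioned earlier.

```java
import java.util.Objects;

// Immutable value object: no identity, no setters, equality by value.
final class Address {
    private final String houseNumber;
    private final String street;
    private final String city;
    private final String zip;
    private final String country;

    Address(String houseNumber, String street, String city, String zip, String country) {
        this.houseNumber = Objects.requireNonNull(houseNumber);
        this.street = Objects.requireNonNull(street);
        this.city = Objects.requireNonNull(city);
        this.zip = Objects.requireNonNull(zip);
        this.country = Objects.requireNonNull(country);
    }

    String getCity() { return city; }

    // "Changing" a value object means creating a new one.
    Address withCity(String newCity) {
        return new Address(houseNumber, street, newCity, zip, country);
    }

    // Hypothetical rule, only to show behavior living on the value object.
    String format() {
        if ("US".equals(country)) {
            return houseNumber + " " + street + ", " + city + " " + zip;
        }
        return street + " " + houseNumber + ", " + zip + " " + city;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Address)) return false;
        Address other = (Address) o;
        return houseNumber.equals(other.houseNumber)
            && street.equals(other.street)
            && city.equals(other.city)
            && zip.equals(other.zip)
            && country.equals(other.country);
    }

    @Override
    public int hashCode() {
        return Objects.hash(houseNumber, street, city, zip, country);
    }
}
```

Two Address instances carrying the same values are interchangeable, which is exactly why the relational model can fold them into the employee row without giving them a key of their own.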

Modeling with JPA allows independent evolution of the OO domain model and relational persistence model. Don't ever try to enforce the relational paradigm into your domain - you are likely to end up in the swamps of the ActiveRecord modeling.

Collection of Value Objects

In the above example, the entity Employee has a one-to-one association with Address - hence it was easy to embed the address attributes as columns within the Employee table. How do we handle a one-to-many association between an entity and a value object ? Let us have a look at this scenario ..

A Project is an entity which abstracts an active project in a company. And the company raises Bills periodically to its clients for all the projects that it executes. The Bill object is a value object. We just have to raise bills and keep a record of all bills raised till date. A Bill does not have an identity; only the bill date and amount matter. But we need to associate each bill with the project for which it is raised. This clearly warrants a 1..n association in the relational model as well. And the lifecycle of all bills is coupled to the lifecycle of the owning project. Sharing of bills is not allowed and we do not need to manage identities of every bill.

Using Hibernate specific annotations, here's how we can manage a set of value objects owned by an entity.

@Entity
class Project {
  //.. attributes

  @CollectionOfElements
  @JoinTable(name="project_bill",
    joinColumns = @JoinColumn(name="project_pk")
  )
  @AttributeOverrides( {
    @AttributeOverride(name = "billNo",
        column = @Column(name = "project_bill_no")),
    @AttributeOverride(name = "billDate",
        column = @Column(name = "project_bill_date")),
    @AttributeOverride(name = "raisedOn",
        column = @Column(name = "raised_on")),
    @AttributeOverride(name = "amount",
        column = @Column(name = "project_bill_amount"))}
  )
  @CollectionId(
    columns = @Column(name = "project_bill_pk"),
    type = @Type(type = "long"),
    generator = "sequence"
  )
  private Set<Bill> bills = new HashSet<Bill>();

  //..
  //..
}

Bill is not an entity - it is a simple POJO, which can be reused along with other owning entities as well. And if we want an inverse association as well, we can maintain a reference to the owning project within the Bill class.

@Embeddable
public class Bill {
  @Parent
  private Project project;
  //..
  //..
}

The database contains a table project_bill, which keeps all bills associated with a project indexed by project_pk. In case we need a sequencing of all bills, we can have a sequence generated in the project_bill table itself through the @org.hibernate.annotations.CollectionId annotation.
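Seen from the domain side only, the same idea can be sketched without any mapping annotations. The raiseBill() and totalBilled() methods, the constructors and the choice of fields used for equality are assumptions made for this illustration, not part of the original model.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.util.Collections;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Value object: two bills with the same values are the same bill.
final class Bill {
    private final String billNo;
    private final LocalDate billDate;
    private final BigDecimal amount;

    Bill(String billNo, LocalDate billDate, BigDecimal amount) {
        this.billNo = Objects.requireNonNull(billNo);
        this.billDate = Objects.requireNonNull(billDate);
        this.amount = Objects.requireNonNull(amount);
    }

    BigDecimal getAmount() { return amount; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Bill)) return false;
        Bill other = (Bill) o;
        // note: BigDecimal.equals is scale-sensitive (100.0 is not equal to 100.00)
        return billNo.equals(other.billNo)
            && billDate.equals(other.billDate)
            && amount.equals(other.amount);
    }

    @Override
    public int hashCode() { return Objects.hash(billNo, billDate, amount); }
}

// Owning entity: the bills live and die with the project.
class Project {
    private final String name;
    private final Set<Bill> bills = new HashSet<>();

    Project(String name) { this.name = name; }

    String getName() { return name; }

    // hypothetical domain method; raising an equal bill twice has no effect
    Project raiseBill(Bill bill) {
        bills.add(bill);
        return this;
    }

    Set<Bill> getBills() { return Collections.unmodifiableSet(bills); }

    BigDecimal totalBilled() {
        BigDecimal total = BigDecimal.ZERO;
        for (Bill bill : bills) {
            total = total.add(bill.getAmount());
        }
        return total;
    }
}
```

Because Bill defines equals() and hashCode() over its values, the Set in Project behaves the way the domain describes it: a record of distinct bills with no identity of their own.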

Value objects are an immensely useful abstraction. Analyse your domain model and find as many value objects as you can. And use the power of JPA and your ORM implementation to map them into your persistent model. The more value objects you can dig out, the less effort you will spend managing identities and controlling lifetimes for each of them.

Decoupled Value Object Instantiation Models

There are some situations where value objects tend to be numerous. Here is an example :

Every employee has-a designation. Designation is a value object in our domain model and in a typical organization we have a limited number of designations. We make a separate abstraction for designation, since a designation has other behaviors associated with it e.g. perks, salary bracket etc. Here we go ..

@Embeddable
class Designation {
  //.. attributes
  //.. behavior
  //.. immutable
}

and the Employee entity ..

@Entity
class Employee {
  //.. attributes
  private Designation designation;
  //.. other attributes
  //..
}

What about the relational model ? We can employ a nice little trick here ..

Clearly many employees share a designation - hence, theoretically speaking, Designation is an entity (and not a value object) in the relational model, having a 1..n association with the Employee table. But, as Eric Evans has suggested in his discussion on Tuning a Database with Value Objects, there may be situations when it is better to apply denormalization techniques for the sake of storing collocated data. Making Designation an entity and wiring a relationship with Employee through its identity will store the Designation table in a far away physical location, leading to extra page fetches and additional access time. As an alternative, if access time is more critical than physical storage, we can store copies of Designation information with the Employee table itself. And, doing so, Designation turns into a Value Object for the relational model as well! In real world use cases, I have found this technique to be an extremely helpful one - hence thought of sharing the tip with all the readers of this blog.

However, we are not done yet - in fact, the subject of this paragraph is decoupled instantiation models for value objects, and we haven't yet started the tango. We first had to set the stage to make Designation a value object at both the levels - domain and persistence models. Now let us find out how we can optimize our object creation at the domain layer while leaving the persistence level to our JPA implementation.

In a typical use case of the application, we may have bulk creation of employees, which may lead to a bulk creation of value objects. One of the cool features of using JPA is that we can adopt a completely different instantiation strategy for our OO domain model and the relational persistent model. While persisting the value object Designation, we are embedding it within the Employee entity - hence there is always a copy of the value object associated with the persistent Employee model. And this is completely managed by the JPA implementation of the ORM. However, for the domain model, we can control the number of distinct instances of the value object created using the Flyweight design pattern. Have a look ..

@Embeddable
class Designation {
  //.. persistent attributes

  @Transient
  private static Map<String, Designation> designations
    = new HashMap<String, Designation>();

  // package scope
  Designation() {
    //.. need this for Hibernate
  }

  // factory method
  public static Designation create(..) {
    Designation d = null;
    if ((d = designations.get(..)) != null) {
      return d;
    }
    // create new designation
    // put it in the map
    // and return
  }
  //..
  //..equals(), hashCode() .. etc.
}

We have a flyweight that manages a local cache of the distinct designations created and controls the number of objects instantiated. And since value objects are immutable, they can be freely shared across entities in the domain model. Here is an example where, using JPA, we can decouple the instantiation strategy of the domain objects from the persistence layer. Although we are storing value objects by-value in the database, we need not have distinct in-memory instances in our domain model. And, if you are using Hibernate, you need not have a public constructor either. For proxy generation, Hibernate recommends at least package visibility, which works fine with our strategy of controlling instantiation at the domain layer using flyweights.
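The factory method above is only outlined, so here is one way it could be filled in, as a sketch under stated assumptions. The title and salaryBracket attributes and the composite cache key are hypothetical, and ConcurrentHashMap.computeIfAbsent is swapped in for the plain HashMap lookup so that concurrent callers cannot create duplicates. The persistence annotations and the package-scope no-arg constructor needed by Hibernate are left out so that the example runs as plain Java.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

final class Designation {
    // one shared instance per distinct (title, salaryBracket) combination
    private static final Map<String, Designation> CACHE = new ConcurrentHashMap<>();

    private final String title;          // hypothetical attribute
    private final int salaryBracket;     // hypothetical attribute

    // private: all domain-side creation goes through create()
    private Designation(String title, int salaryBracket) {
        this.title = title;
        this.salaryBracket = salaryBracket;
    }

    // factory method: returns the cached instance, creating it on first use
    static Designation create(String title, int salaryBracket) {
        Objects.requireNonNull(title);
        String key = title + "#" + salaryBracket;
        return CACHE.computeIfAbsent(key, k -> new Designation(title, salaryBracket));
    }

    String getTitle() { return title; }
    int getSalaryBracket() { return salaryBracket; }

    // value equality still matters for instances created outside the cache
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Designation)) return false;
        Designation other = (Designation) o;
        return salaryBracket == other.salaryBracket && title.equals(other.title);
    }

    @Override
    public int hashCode() { return Objects.hash(title, salaryBracket); }
}
```

One caveat worth keeping in mind: instances that the ORM itself materializes when loading an Employee would normally be created through the no-arg constructor rather than through create(), so equals() and hashCode() remain necessary even with the flyweight in place.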

Value objects are invaluable in making designs more manageable and flexible. And JPA provides great support in transparent handling of instantiation and persistence of value objects along with their owning entities. With a rich domain model, backed up by a great ORM like Hibernate that implements JPA, we can get the best of both worlds - powerful OO abstractions as well as transparent handling of their persistence in the relational database. I had earlier blogged about injecting ORM backed repositories for transparent data access in a domain model. In future installments of this series, I plan to cover more on this subject, describing real life use cases of applying domain driven design techniques using JPA and Hibernate.

Thursday, September 20, 2007

Domain Modeling with JPA - The Gotchas - Part 1 - Immutable Entities

Working on rich domain models, how many times have we wished we had a cookbook of best practices that discussed some of the common gotchas that we face every day ? With this post, I would like to start a mini series of such discussions, which will center around issues in rich domain modeling using JPA and Hibernate. In the course of this series, I will discuss many problems for which I have no satisfactory solution. I would love to have feedback from the community on the best practices and on what they feel should be done to address such issues.

Each installment of this series will discuss a single issue relevant only to designing rich domain models using JPA and Hibernate. With enough reader feedback, we should be able to have a healthy discussion on how to address it in the real world of domain modeling using the principles of Domain Driven Design. The topic of today's post is Immutability of Domain Entities and how to address this issue in the context of modeling persistent entities. While no one denies the fact that immutability is a virtue that a model should maximally possess, still there are practical concerns and reasons to act otherwise in many situations. The domain model is only one of the layers in the application architecture stack - we need to interact with other layers as well. And this is where things start getting interesting and often deviate from the ideal world.

Immutability (aka public setter methods)

There are two aspects to immutability of entities.

#1 : Making an Entity intrinsically Immutable

An entity is immutable in the sense that *no* update or delete is allowed for that entity. Once created the entity is truly *immutable*. The Hibernate annotation @Immutable can be used to indicate that the entity may not be updated or deleted by the application. This also allows Hibernate to make some minor performance optimizations.

@Entity
@Immutable
public class Bid implements Serializable {
  // ..
  // ..
}

Needless to say, we do not have any setter methods exposed for this entity.

Immutability can also be ensured for selective columns or properties. In that case the setter method is not exposed for this property and the ORM generated update statement also does not include any of these columns.

@Entity
public class Flight implements Serializable {
  // ..
  @Column(updatable = false, name = "flight_name", nullable = false, length=50)
  public String getName() { ... }
  // ..
}

For the above entity, the name property is mapped to the flight_name column, which is not nullable, has a length of 50 and is not updatable (making the property immutable).

#2 : Immutable in the domain layer

This is one of the most debated areas in domain modeling. Should you allow public setters to be exposed for all entities ? Or would you rather handle all mutations through domain methods only ? Either way there are some issues to consider :

If we expose public setters, then we risk exposing the domain model. Any component in the layers above (e.g. Web layer) can invoke the setter on the entity and make the domain model inconsistent. e.g. the Web layer may invoke account.setOpeningBalance(0), despite the fact that there is a minimum balance check associated with the domain logic. We can have that validation within the setter itself, but ideally that domain logic should be there in the domain method, which is named following the Ubiquitous Language. In the current case, we should have a method named account.open(..), which should encapsulate all the domain logic associated with the opening of an account. This is one of the fundamental tenets of rich domain models. From this point of view, smart setters are an anti-pattern in domain modeling.

If we do not have setters exposed, then how will the Web MVC framework transport data from the Form objects to the domain objects ? The usual way frameworks like Spring MVC work is to use the public setter methods to set the command object values. One solution is to use DTOs (or one of their variants), but that is again one of the dark corners which needs a separate post of its own. We would definitely like to make maximal reuse of our domain model and try to use it throughout the application architecture stack. One of the options that I have used is to have public setters exposed but restrict their usage to the Web MVC layer through development aspects. Another option may be to introduce public setters through Inter Type Declaration using aspects and use the introduced interface in the Web MVC layer only. Both of these options, while not very intuitive, deliver the goods in practice. You can do away with exposing setters *globally* from the domain model.
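The account.open(..) idea from the first point can be sketched as follows. This is a minimal illustration and not code from a real system: the minimum balance of 100, the field names and the exception types are all assumptions.

```java
import java.math.BigDecimal;

class Account {
    // hypothetical business rule, used only for this illustration
    static final BigDecimal MINIMUM_OPENING_BALANCE = new BigDecimal("100");

    private final String number;
    private BigDecimal balance = BigDecimal.ZERO;
    private boolean open;

    Account(String number) { this.number = number; }

    // Domain method named after the Ubiquitous Language; there is
    // deliberately no setOpeningBalance() that could bypass the rule.
    Account open(BigDecimal openingBalance) {
        if (open) {
            throw new IllegalStateException("account " + number + " is already open");
        }
        if (openingBalance.compareTo(MINIMUM_OPENING_BALANCE) < 0) {
            throw new IllegalArgumentException(
                "opening balance must be at least " + MINIMUM_OPENING_BALANCE);
        }
        this.balance = openingBalance;
        this.open = true;
        return this;
    }

    boolean isOpen() { return open; }
    BigDecimal getBalance() { return balance; }
}
```

With this shape, the Web layer has no way to put an account into the "opened with zero balance" state: the only route to an open account runs through the check.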

What about getters ? Should we have

Collection<Employee> getEmployees() {
    return employees;
}

or

Collection<Employee> getEmployees() {
    // employees is declared as a Collection, so wrap it as one
    return Collections.unmodifiableCollection(employees);
}

In the former case, the collection returned is not immutable and we have the convenience of doing the following ..

Employee emp = ...  // make an employee
office.getEmployees().add(emp);  // add him to the office

while all such logic will have to be routed through specific domain methods in case of the enforced immutability of the returned collection for the second option. While this looks more ideal, the first approach also has lots of valid use cases and pragmatic usage.
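For the second option, a small runnable sketch shows what the client actually experiences. The Office class and its assign() method are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;

class Employee {
    private final String name;
    Employee(String name) { this.name = name; }
    String getName() { return name; }
}

class Office {
    private final Collection<Employee> employees = new ArrayList<>();

    // hypothetical domain method: the only way to add an employee
    Office assign(Employee employee) {
        employees.add(employee);
        return this;
    }

    // read-only view: it reflects later changes made through assign(),
    // but rejects any attempt to modify it directly
    Collection<Employee> getEmployees() {
        return Collections.unmodifiableCollection(employees);
    }
}
```

Note that the wrapper is a view and not a copy: a client holding the result of getEmployees() sees employees added later through assign(), while office.getEmployees().add(emp) throws an UnsupportedOperationException.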

Use the comments section of the post to discuss what approach you take when designing rich domain models. There may not be one single golden rule to follow in all scenarios, but we can know about some of the best practices followed in the community.