Code Cop: object orientation

4 July 2024

Encapsulation vs Business Rules

Business Men (licensed CC BY-NC-ND by Andreina Schoeberlein)

No Naked Primitives is a Coderetreat constraint which trains our object orientation skills. No primitive values, e.g. booleans, numbers or strings, may be visible at object boundaries, i.e. public methods. Arrays and other containers like lists or hash-tables are primitives, too. I love this constraint, as it pushes people right out of their comfort zone. ;-) (I wrote about No Naked Primitives in combination with other constraints and included it in the expert level Brutal Coding Constraints.)

Value Objects
The usual designs to avoid naked primitives are Value Objects and First Class Collections. Value Objects, by design, expose the values they wrap with a getter because some other objects will want to use these values. What happens if I go extreme and do not allow any primitives at object boundaries? (Of course this is crazy, a clear case of Primitive Obsession Obsession. Still when Coderetreat facilitators get together to practice, things end up like that.) Let us take the Game of Life as an example. (If you do not know the Game of Life, read the description and implement it right away.) In the game, for evolving a generation, I need to count the living neighbours of each cell. The number of living neighbours is an integer and its value object in C# could look like

public class NeighbourCount {
    private int count;
    
    public NeighbourCount(int count) {
        this.count = count;
    }

    // ... code to manage the count

}

Now any code which depends on the data (i.e. the count) will have to be moved into the value object to be able to access the data. Following the rules of the game, if there are two or three living neighbours, a living cell lives on. The method ApplyRulesOnLivingCell implements this rule.

public class NeighbourCount {

    // ...

    public GridSpace ApplyRulesOnLivingCell() {
        if (this.count == 2 || this.count == 3) {
            return new AliveCell();   
        }
        return new EmptySpace();
    }
}

public interface GridSpace {}
public class EmptySpace : GridSpace {}
public class AliveCell : GridSpace {}

Grouping the data (count) and the logic which is based on the data, uses or modifies it (ApplyRulesOnLivingCell) together is a core principle of object orientation. Further all data is strongly encapsulated.

Polymorphism
The next method ApplyRulesOnEmptySpace is similar. The decision which of the two methods to call depends on the state of the cell, which is either alive or dead/non existing. This boolean state has to be encapsulated inside a class, e.g. class GridSpace. This class behaves differently for the values of the boolean state, which makes the boolean a simple type code. The object oriented way to work with type codes is to use polymorphism:

public interface GridSpace {
    public GridSpace ApplyRulesWith(NeighbourCount count);
}
public class AliveCell : GridSpace {
    public GridSpace ApplyRulesWith(NeighbourCount count) {
        return count.ApplyRulesOnLivingCell();
    }
}
public class EmptySpace : GridSpace {
    public GridSpace ApplyRulesWith(NeighbourCount count) {
        return count.ApplyRulesOnEmptySpace();
    }
}

The code looks weird and it is not my usual implementation of the game's rules. It has an issue: The rules of the game are distributed among three classes. This is Shotgun Surgery - when a single change is made to multiple classes simultaneously: If I need to change the rules, or even want to read and understand the logic of cell evolution, I have to go to three places.
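To make the scattering visible, here is a minimal, self-contained Python sketch of the same design. The body of ApplyRulesOnEmptySpace is not shown above, so the version here is an assumption: it applies the standard birth rule (exactly three living neighbours). The snake_case names are Python renderings of the C# names.

```python
class NeighbourCount:
    """Value object hiding the number of living neighbours."""

    def __init__(self, count):
        self._count = count

    def apply_rules_on_living_cell(self):
        # Rule location 1: survival with two or three neighbours.
        if self._count in (2, 3):
            return AliveCell()
        return EmptySpace()

    def apply_rules_on_empty_space(self):
        # Assumption: standard birth rule, exactly three neighbours.
        if self._count == 3:
            return AliveCell()
        return EmptySpace()


class AliveCell:
    def apply_rules_with(self, count):
        # Rule location 2: a living cell selects the survival rule.
        return count.apply_rules_on_living_cell()


class EmptySpace:
    def apply_rules_with(self, count):
        # Rule location 3: an empty space selects the birth rule.
        return count.apply_rules_on_empty_space()
```

Reading the complete rule set means visiting NeighbourCount, AliveCell and EmptySpace, which is exactly the three places mentioned above.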

Shot (licensed CC BY-NC-ND by Bart Maguire)

Business Rules
On the other hand, a basic implementation of the rules using primitives (e.g. in Ruby because polyglot programming is cool),

def alive_in_next_generation(alive, living_neighbours)
  (alive and living_neighbours == 2) or 
  living_neighbours == 3
end

is one line of code and easy to understand. The game's rules - the business rules - are boolean expressions describing certain situations, which "the business" needs to act on. Typical examples of such situations are when an item is out of stock or a client qualifies for a discount. Business related conditions are called policies. (And there are predicates, which are boolean expressions, too. These have their origins in formal logic.) Boolean expressions are functional in nature. So a functional design, i.e. functions operating on primitive data, could be more appropriate. Even in object oriented design there are use cases for objects containing only logic and no (mutable) data.

Conclusion
What is the point of my discussion? In the case of Game of Life, there is a tension between keeping data and its logic together versus keeping related logic together. This is particularly true for boolean expressions and code depending on them, as boolean values usually end up in conditionals. I like to keep "decisions" and the logic depending on them close together but I want to keep my business rules in one place even more. I am wondering if this is true for most design situations, besides Game of Life. Boolean logic is interesting because it allows variation in the automation. Code without any booleans is still useful, e.g. pure calculations or uniform transformations in a pipeline style of operations.

Taking it further?
While boolean is a primitive, it is different from other primitives. What happens if I do not allow any primitives besides boolean at object boundaries? The data of class NeighbourCount would still be encapsulated when I add relevant queries (in Python because I love programming languages):

class NeighbourCount:

  def __init__(self, count):
    self._count = count

  # ... code to manage the count

  def isTwoOrThree(self):
    return self._count == 2 or self._count == 3

  def isThree(self):
    return self._count == 3

Using these small methods, I get a concise implementation of the rules,

class Rules:
  def cellInNextGeneration(self, cell, count):
    if (cell.isAlive() and count.isTwoOrThree()) or count.isThree():
      return AliveCell()
    return EmptySpace()

Is this better? I am not sure. At least the (business) rules of the Game of Life are in one place now. They could be replaced with different rules if needed, making the design extensible. At the same time different rules would most likely require different queries in NeighbourCount. For example in Hex Life, I need a weighted sum of first and second tier living neighbours to decide the state of the next generation. This is not possible without adding new queries to NeighbourCount. The Open Closed Principle is not satisfied. (Then maybe Hex Life is too much of a change for any design to "survive".) My rules logic keeps calling into the encapsulated value object repeatedly, which looks much like Feature Envy. I feel like I am going in circles here ;-)
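The Open Closed issue already shows up with a much smaller rule change than Hex Life. The sketch below swaps in the HighLife variant (birth on three or six neighbours, survival on two or three) as a simpler stand-in; it is not the Hex Life case from the text. The query isThreeOrSix and the class HighLifeRules are hypothetical names introduced for this illustration, and the isAlive methods are assumed from the call cell.isAlive() above.

```python
class EmptySpace:
    def isAlive(self):
        return False


class AliveCell:
    def isAlive(self):
        return True


class NeighbourCount:
    def __init__(self, count):
        self._count = count

    def isTwoOrThree(self):
        return self._count == 2 or self._count == 3

    def isThree(self):
        return self._count == 3

    # Hypothetical new query: needed only because the rules changed.
    def isThreeOrSix(self):
        return self._count == 3 or self._count == 6


class HighLifeRules:
    def cellInNextGeneration(self, cell, count):
        if cell.isAlive():
            lives = count.isTwoOrThree()   # survival rule, unchanged
        else:
            lives = count.isThreeOrSix()   # birth rule, needs the new query
        return AliveCell() if lives else EmptySpace()
```

The rules stay in one class, but NeighbourCount had to be modified to support them, so the value object is not closed against rule changes.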

10 March 2024

Programming with Nothing

I like extreme coding constraints. A constraint, also known as an activity, is a challenge during a kata, coding dojo or code retreat designed to help participants think about writing code differently than they would otherwise. Every constraint has a specific learning goal in mind, for example Verbs instead of Nouns. After playing with basic constraints for a long time now, I need more challenging tasks. Combining existing constraints makes things harder: For example Object Calisthenics or my very own Brutal Coding Constraints are way harder than their parts applied individually.

Void (licensed CC BY-NC-ND by Jyotsna Sonawane)

Missing Feature Group of Constraints
There is another group of extreme constraints which I call the Programming With Nothing constraints. They are a subgroup of the Only Use <placeholder> constraints. All of these belong to the Missing Feature group. The well known No If and No Naked Primitives constraints are good examples of Missing Features because we take away a single element that we are so very much used to. Only Use <placeholder> constraints force you to use new constructs instead of something else. For example, Alexandru Bolboaca, the pioneer of Coderetreat in Europe, once mentioned the following constraints to me: Only Bit Operations replaces all arithmetic operations, like plus or multiply, with bit operations and Only Regular Expressions asks you to use Regular Expressions as much as possible. You can get pretty far with Regular Expressions in exercises like Balanced Brackets, Coin Change, Snake or Word Wrap. (Look for the Bonus Round at the bottom of the Word Wrap page.)

Programming With Nothing
But let us get back to Programming With Nothing. The first one of this group, which I came across ten years ago, was presented by Tom Stuart in his 2011 Ru3y Manor talk Programming With Nothing. Tom is taking functional programming to the extreme, only allowing the declaration of lambda expressions and calling them. The exact rules he is following are:

Create functions with one argument.
Call functions and return a result.
Assign functions to names (abbreviate them as constants).

Basically he is using the Lambda Calculus and this constraint is also referred to as Lambda Calculus. His talk is using Ruby, using only Proc.new, no booleans, numbers or strings, no assignments, control flow constructs or standard library. Clearly he is programming with nothing. (Here is the recording of the talk, his slides and the code.) Over the years I have seen similar presentations, even using Java.
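As a rough illustration of the three rules, here is a sketch using Python lambdas in place of Ruby's Proc.new. It uses the standard Church encodings of numbers and booleans and is not taken from the talk's code. The helpers to_int and to_bool deliberately step outside the constraint; they exist only to inspect results.

```python
# Church numerals: the number n means "apply f n times to x".
ZERO = lambda f: lambda x: x
ONE = lambda f: lambda x: f(x)
SUCC = lambda n: lambda f: lambda x: f(n(f)(x))
ADD = lambda m: lambda n: n(SUCC)(m)

# Church booleans: a boolean selects one of two arguments.
TRUE = lambda a: lambda b: a
FALSE = lambda a: lambda b: b
IF = lambda c: lambda a: lambda b: c(a)(b)
IS_ZERO = lambda n: n(lambda _: FALSE)(TRUE)

THREE = ADD(ONE)(SUCC(ONE))


# Inspection helpers only: these use real integers and booleans.
def to_int(n):
    return n(lambda k: k + 1)(0)


def to_bool(b):
    return b(True)(False)
```

For example, to_int(THREE) evaluates to 3 and to_bool(IS_ZERO(ZERO)) to True. Note that Python evaluates both branches passed to IF eagerly, which is harmless for plain values like these.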

The Fizz Buzz Kata
The goal is to implement the Fizz Buzz kata. While Fizz Buzz looks trivial, it needs looping over the integers up to 100, conditionals on integer comparison, integer division and strings. It is very small but not simple. Some people even use it during job interviews - which is controversial. The whole Fizz Buzz is:

for (i = 1; i <= 100; i++) {
  if (i % (3*5) == 0)
    print("FizzBuzz");
  else if (i % 3 == 0) 
    print("Fizz");
  else if (i % 5 == 0) 
    print("Buzz");
  else 
    print(i);
}

And this is quite a lot if all you have is a lambda. I maintain a starting point for TypeScript, to be used in my workshops. This kind of exercise is fun, at least for me ;-). If you follow the assignment, i.e. work on numbers, then booleans, then pairs etc., you can use Git branches to jump to the next milestone - or take a sneak peek at how it could be done.

Nothing Happened (licensed CC BY-SA by Henry Burrows)

Extreme Object-Orientation
In 2015 I watched John Cinnamond's Extreme Object-Oriented Ruby, which is like Tom Stuart's Programming with Nothing. This version only allowed you to define objects which contain other objects and call the nested object's methods or return them. In his starter repository he described how to simulate booleans, numbers and so on.

Nothing but NAND
Then I tried to write Fizz Buzz only using NAND. This is Programming With Nothing the hardware way. How so? A NAND gate is a logic gate which produces an output which is false only if all its inputs are true; thus its output is complement to that of an AND, says Wikipedia. More importantly, the NAND gate is significant because any Boolean function can be implemented by using a combination of NAND gates. This property is called functional completeness. Because of its functional completeness it should be possible to create arbitrary programs. I started out with a Bit class which had its nand() function implemented and all other code was built on top of this. Numbers, i.e. arrays of bits,

class Numbers {

  static final Byt ZERO = new Byt(OFF, OFF, OFF, OFF, OFF, OFF, OFF, OFF);

  // ...

  static final Byt FIFTEEN = new Byt(ON, ON, ON, ON, OFF, OFF, OFF, OFF);
  static final Byt HUNDRED = new Byt(OFF, OFF, ON, OFF, OFF, ON, ON, OFF);
}

and bitwise logic,

class Logic {

  static Bit eq(Bit a, Bit b) {
    return not(xor(a, b));
  }

  // ...

  static Byt and(Byt a, Byt b) {
    return not(nand(a, b));
  }

  // ...

  static Byt ifThenElse(Bit b, Byt theThen, Byt theElse) {
    Byt condition = Byt.from(b);
    return or(and(condition, theThen), 
              and(not(condition), theElse));
  }
}

were straightforward. Arithmetic was cumbersome due to possible over- and underflows.

class Arithmetic {

  static BitOverflow inc(Bit b) {
    return new BitOverflow(not(b), b);
  }

  static BitOverflow add(Bit a, Bit b) {
    return new BitOverflow(xor(a, b), and(a, b));
  }

  // ...

  static Byt inc(Byt a) {
    BitOverflow r0 = add(a.b0, Bit.ON);
    BitOverflow r1 = add(a.b1, r0.overflow);
    BitOverflow r2 = add(a.b2, r1.overflow);
    BitOverflow r3 = add(a.b3, r2.overflow);
    BitOverflow r4 = add(a.b4, r3.overflow);
    BitOverflow r5 = add(a.b5, r4.overflow);
    BitOverflow r6 = add(a.b6, r5.overflow);
    BitOverflow r7 = add(a.b7, r6.overflow);
    return new Byt(r0.b,r1.b,r2.b,r3.b,r4.b,r5.b,r6.b,r7.b);
  }
}
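The Java excerpts call not, and and xor without showing how they derive from NAND. The following Python sketch shows one common derivation, plus a ripple-carry increment in the spirit of inc(Byt). It is an illustration under assumptions, not the original code: bits are modelled as Python booleans, a byte as a list with the least significant bit first (matching the Byt constants above), and the loop in inc uses ordinary Python control flow rather than NAND-built constructs.

```python
def nand(a, b):
    # The only primitive gate; Python's own operators appear just here.
    return not (a and b)


def not_(a):
    return nand(a, a)


def and_(a, b):
    return not_(nand(a, b))


def or_(a, b):
    return nand(not_(a), not_(b))


def xor(a, b):
    # Classic four-gate construction.
    shared = nand(a, b)
    return nand(nand(a, shared), nand(b, shared))


def half_add(a, b):
    # Returns (sum bit, carry bit), like BitOverflow in the Java code.
    return xor(a, b), and_(a, b)


def inc(bits):
    # bits: least significant bit first. The carry ripples upwards.
    carry = True
    result = []
    for bit in bits:
        total, carry = half_add(bit, carry)
        result.append(total)
    return result  # the final carry is dropped, so 255 + 1 wraps to 0


def to_int(bits):
    # Inspection helper only.
    return sum(1 << i for i, bit in enumerate(bits) if bit)
```

With FIFTEEN written as four True values followed by four False values, to_int(inc(...)) gives 16, and incrementing eight True values wraps around to 0, which is the overflow case that makes the arithmetic cumbersome.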

For loops I added a sequence of bits which worked as the Instruction Pointer. Using the IP and the existing arithmetic operations I implemented goto which I used to jump back during loops. The final code did not look much different from your regular structural code, using functions and mutable data. The exact list of things I used was:

Data structures for a single bit, a byte (8 bits) and a series of bytes i.e. memory.
Bit nand(Bit other) as the only logic provided.
Getting and setting the values of the data structures.
Defining functions with multiple statements to create and modify data and call other functions.
A map to associate statements with memory addresses indexed by the IP. Was this cheating?
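The last item, a map from addresses to statements driven by the Instruction Pointer, can be sketched as follows. This is a simplified illustration, not the original program: it uses ordinary Python integers, dictionaries and conditionals instead of NAND-built bytes, and the names run, goto_if, increment_counter and record_counter are hypothetical.

```python
def run(program, state):
    # program maps an address to a statement; state["ip"] is the
    # instruction pointer. Execution stops at the first unmapped address.
    while state["ip"] in program:
        statement = program[state["ip"]]
        state["ip"] += 1      # default: continue with the next address
        statement(state)      # a statement may overwrite ip (goto)
    return state


def goto_if(condition, address):
    # Builds a statement that jumps to address while condition holds.
    def statement(state):
        if condition(state):
            state["ip"] = address
    return statement


def increment_counter(state):
    state["counter"] += 1


def record_counter(state):
    state["output"].append(state["counter"])


program = {
    0: increment_counter,
    1: record_counter,
    2: goto_if(lambda s: s["counter"] < 5, 0),  # jump back: the loop
}

final = run(program, {"ip": 0, "counter": 0, "output": []})
```

After the run, final["output"] holds the numbers 1 to 5. The loop exists only because the statement at address 2 resets the instruction pointer.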

I had played with assembly in the past, which helped me to build my program from NANDs alone. It is a great learning exercise to understand computers' logical components and CPUs. There is even an educational game based on the idea of NAND.

What is Next?
I cannot remember how I ended up there, but next I tried to write Fizz Buzz using a Turing Machine. But this is a story for another time...

21 November 2019

Promotion Service Kata

In September I attended a small, club-like unconference. The umbrella topic of the event was katas and their use in teaching and technical coaching. A kata, or code kata, is defined as an exercise in programming which helps hone your skills through practice and repetition. We spent two days creating, practising and reviewing different exercises. I came home with a load of new challenges for my clients.

Kata Factory
One session, run by Bastien David, a software crafter from Grenoble, was named Kata Factory. Bastien guided us to create a short exercise with a very small code base, focused on a single topic. In the first part of the session we created small tasks working in pairs. Then we solved a task from another pair in the second part. A total of four new coding exercises was created, tried and refined. It was awesome.

Promotion Service Kata
I worked with Dmitry Kandalov and we created the Promotion Service Kata. It is a small refactoring exercise, based on Feature Envy, a code smell listed in Martin Fowler's book. (Did you know that there is a second edition of this great book? No, so get it quickly.) The code base contains a single service, the promotion service, which calculates discounts for promoted items. It is a bit crazy because it also reduces the tax. The data is stored in a classic DTO and its fields are not encapsulated. The task is to make it a rich object and encapsulate its fields. There are existing unit tests to make sure things are still working.
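For readers who do not know Feature Envy, the following Python sketch shows the general shape of such a refactoring. It is not the kata's code: all class names, method names, fields and numbers (a 20% discount, halved tax, amounts as integer cents) are hypothetical and chosen only for illustration.

```python
class ItemData:
    # Before: a plain DTO with public fields.
    def __init__(self, price, tax):
        self.price = price
        self.tax = tax


def apply_promotion_envious(item):
    # Feature Envy: this code is only interested in the item's data.
    item.price = item.price - item.price * 20 // 100
    item.tax = item.tax // 2


class Item:
    # After: a rich object which hides its fields and owns the behaviour.
    def __init__(self, price, tax):
        self._price = price
        self._tax = tax

    def apply_discount(self, percent):
        self._price -= self._price * percent // 100

    def halve_tax(self):
        self._tax //= 2

    def total(self):
        return self._price + self._tax


class PromotionService:
    def apply_promotion_to(self, item):
        # The service tells the item what to do instead of
        # reading and writing its fields.
        item.apply_discount(20)
        item.halve_tax()
```

Both versions compute the same totals; the difference is where the knowledge about price and tax lives.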

After the Kata Factory, I spent some time on porting the kata to different languages. Currently the code is available in C#, Java, Kotlin, PHP and Python. Pull requests porting the code to other languages are very welcome. Check out the code here.

Promotion Service Retrospective

Notes from first run
I already facilitated the exercise with a small team of C# developers. Here is what they said about the kata:

It is a good exercise.
It is a short exercise. It is small, so there is no need for context.
Encapsulate all the things!
I learned to separate concerns.
I learned about string.Format (a C# specific function).
I did not know the goal of the exercise.
Maybe rename the Persist() method to Save().
The Item class should be in its own file.

Conclusion
Bastien's approach shows that it is possible to create a brand new and highly focused coding exercise in a short time. As with most development related things, pair work is superior and it is easy to come up with new code katas when working in pairs. Small exercises - I call them micro exercises - are easy to get started because there is little context to know. Context is part of what makes coding assignments difficult. I am very happy with this new exercise.

Give it a try!

11 January 2018

Compliance with Object Calisthenics

During my work as Code Cop I run many workshops. Sometimes I use constraints to make exercises more focused or more intense. Some constraints, like the Brutal Coding Constraints, are composite or aggregate constraints, which means that they are a combination of several simpler or low level constraints. (Here simple does not mean that they are easy to follow, rather that they focus on a single thing.) Today I want to discuss Object Calisthenics.

Object Calisthenics
Jeff Bay's Object Calisthenics is an aggregate constraint combined of the following nine rules:

Use only one level of indentation per method.
Don't use the else keyword.
Wrap all primitives and strings (in public API).
Use only one dot per line.
Don't abbreviate (long names).
Keep all entities small.
Don't use any classes with more than two instance variables.
Use first-class collections.
Don't use any getters/setters/properties.

If you are not familiar with these rules, I recommend you read Jeff Bay's original essay published in The ThoughtWorks Anthology in 2008. William Durand's post is an exact copy of that essay, so no need to buy the book for that. Several people have followed up and discussed their interpretation and experience with Object Calisthenics, e.g. Mark Needham, Jeff Pace, Vasiliki Vockin and Juan Antonio.

Object Calisthenics is an exercise in object orientation. How is that? One of the core concepts of OOP is Abstraction: A class should capture one and only one key abstraction. Obviously primitive and built-in types lack abstraction. Wrapping primitives (rule #3) and wrapping collections (rule #8) drive the code towards more abstractions. Small entities (rule #6) help to keep our abstractions focused.

Further objects are defined by what they do, not what they contain. All data must be hidden within its class. This is Encapsulation. Rule #9, No Properties, forces you to stay away from accessing the fields from outside of the class.

Next to Abstraction and Encapsulation, these nine rules help Loose Coupling and High Cohesion. Loose Coupling is achieved by minimizing the number of messages sent between a class and its collaborator. Rule #4, One Dot Per Line, reduces the coupling introduced by a single line. This rule is misleading, because what Jeff really meant was the Law Of Demeter. The law is not about counting dots per line, it is about dependencies: "Only talk to your immediate friends." Sometimes even a single dot in a line will violate the law.
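A small Python sketch may help to see why counting dots is not enough. It is modelled on the well known paperboy and wallet illustration of the Law Of Demeter; the classes Wallet, Customer and Cashier are hypothetical names for this example.

```python
class Wallet:
    def __init__(self, balance):
        self._balance = balance

    def withdraw(self, amount):
        if amount > self._balance:
            raise ValueError("insufficient funds")
        self._balance -= amount
        return amount


class Customer:
    def __init__(self, wallet):
        self._wallet = wallet

    def wallet(self):
        return self._wallet

    def pay(self, amount):
        # The customer talks to its own immediate part.
        return self._wallet.withdraw(amount)


class Cashier:
    def charge_reaching_through(self, customer, amount):
        wallet = customer.wallet()      # one dot on this line
        return wallet.withdraw(amount)  # one dot, still a violation:
                                        # the wallet is a stranger here

    def charge(self, customer, amount):
        # Compliant: only sends a message to the method argument.
        return customer.pay(amount)
```

Both methods return the same amount, and both use only one dot per line. Only charge respects the law, because the Cashier depends on Customer alone and knows nothing about Wallet.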

High Cohesion means that related data and behaviour should be in one place. This means that most of the methods defined on a class should use most of the data members most of the time. Rule #5, Don't Abbreviate, addresses this: When a name of a field or method gets long and we wish to shorten it, obviously the enclosing scope does not provide enough context for the name, which means that the element is not cohesive with the other elements of the class. We need another class to provide the missing context. Next to naming, small entities (rule #6) have a higher probability of being cohesive because there are fewer fields and methods. Limiting the number of instance variables (rule #7) also keeps cohesion high.

The remaining rules #1 and #2, One Level Of Indentation and No else aim to make the code simpler by avoiding nested code constructs. After all who wants to be a PHP Street Fighter. ;-)

Checking Code for Compliance with Object Calisthenics
When facilitating coding exercises with composite constraints, I noticed how easy it is to overlook certain violations. We are so used to conditionals or dereferencing pointers that we might not notice them when reading code. Some rules like the Law Of Demeter or a maximum size of classes need a detailed inspection of the code to verify. To check Java code for compliance with Object Calisthenics I use PMD. PMD contains several rules we can use:

Rule java/coupling.xml/LawOfDemeter for rule #4.

Rule #6 can be checked with NcssTypeCount. An NCSS count of 30 is usually around 50 lines of code.

<rule ref="rulesets/java/codesize.xml/NcssTypeCount">
    <properties>
        <property name="minimum" value="30" />
    </properties>
</rule>

And there is TooManyFields for rule #7.

<rule ref="rulesets/java/codesize.xml/TooManyFields">
    <properties>
        <property name="maxfields" value="2" />
    </properties>
</rule>

I work a lot with PMD and have created custom rules in the past. I added rules for Object Calisthenics. At the moment, my Custom PMD Rules contain a rule set file object-calisthenics.xml with these rules:

java/constraints.xml/NoElseKeyword is very simple. All else keywords are flagged by the XPath expression //IfStatement[@Else='true'].
java/codecop.xml/FirstClassCollections looks for fields of known collection types and then checks the number of fields.
java/codecop.xml/OneLevelOfIntention
java/constraints.xml/NoGetterAndSetter needs a more elaborate XPath expression. It is checking MethodDeclarator and its inner Block/ BlockStatement/ Statement/ StatementExpression/ Expression/ PrimaryExpressions.
java/codecop.xml/PrimitiveObsession is implemented in code. It checks PMD's ASTConstructorDeclaration and ASTMethodDeclaration for primitive parameters and return types.

For the nitty-gritty details of all the rules have a look at the rules defined in codecop.xml and constraints.xml.

Interpretation of Rules: Indentation
When I read Jeff Bay's original essay, the rules were clear. At least I thought so. Verifying them automatically showed some areas where different interpretations are possible. Different people see Object Calisthenics in different ways. In comparison, the Object Calisthenics rules for PHP_CodeSniffer implement One Level Of Indentation by allowing a nesting of one. For example there can be conditionals and there can be loops, but no conditional inside of a loop. So the code is either formatted at method level or indented one level deep. My PMD rule is more strict: Either there is no indentation - no conditional, no loop - or everything is indented once: for example, if there is a loop, then the whole method body must be inside this loop. This does not allow more than one conditional or loop per method. My rule follows Jeff's idea that each method does exactly one thing. Of course, I like my strict version, while my friend Aki Salmi said that I went too far as it is more like Zero Level Of Indentation. Probably he is right and I will recreate this rule and keep the Zero Level Of Indentation for the (upcoming) Brutal version of Object Calisthenics. ;-)
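The difference between the two readings is easier to see in code. The following Python sketch is one interpretation of the description above, not output from either tool; the function names are hypothetical. Indentation levels are counted relative to the method body.

```python
def count_alive_nested(cells):
    # Two levels: a conditional inside a loop. Rejected by both readings.
    alive = 0
    for cell in cells:
        if cell:
            alive += 1
    return alive


def count_alive_one_level(cells):
    # One level of nesting: accepted by the relaxed reading, rejected
    # by the strict one because statements surround the loop.
    alive = 0
    for cell in cells:
        alive += as_number(cell)
    return alive


def as_number(cell):
    # No indentation inside the body: accepted by both readings.
    return int(bool(cell))


def count_alive_strict(cells):
    # No block at all; the iteration is hidden in sum() and map().
    return sum(map(as_number, cells))
```

All three counting functions return the same result. The strict reading pushes towards many tiny methods or towards library functions which hide the loop.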

Wrap All Primitives
There is no PHP_CodeSniffer rule for that, as Tomas Votruba considers it "too strict, vague or annoying". Indeed, this rule is very annoying if you use primitives all the way and your only data structure is an associative array or hash map. All containers like java.util.List, Set or Map are considered primitive as well. Samir Talwar said that every type that was not written by yourself is primitive because it is not from your domain. This prohibits the direct usage of Files and URLs to name a few, but let's not go there. (Read more about the issue of primitives in one of my older posts.)

My rule allows primitive values in constructors as well as getters to implement the classic Value Object pattern. (The rule's implementation is simplistic and it is possible to cheat by passing primitives to constructors. And the getters will be flagged by rule #9, so no use for them in Object Calisthenics anyway.)

I agree with Tomas that this rule is too strict, because there is no point in wrapping primitive payloads, e.g. strings that are only displayed to the user and not acted on by the system. These will be false positives. There are certain methods with primitives in their signatures like equals and hashCode that are required by Java. Further we might have plain numbers in our domain or we use indexing of some sort, both will be false positives, too.

One Dot Per Line
As I said before, I use PMD's LawOfDemeter to verify rule #4. The law allows sending messages to objects that are

the immediate parts of this or
the arguments of the current method or
objects created inside the current method or
objects in global variables.

I did not look at PMD's source code to check the implementation of this rule - but it complains a lot. For me this is the most difficult rule of all nine rules. (I code according to #1, #3, #5 and #6 and can easily adapt to strictly follow #2, #7, #8 and #9.) Although it complains a lot, I found every violation correct. I learned much about Law Of Demeter by checking my code for violations. For example, calling methods on an element of an array is a violation. The indexed array access is similar to a pointer access. (In Ruby this is obvious because Array defines a method def [](index).) Another interesting fact is that (at least in PMD) the law flags calling methods on enums. The enum instances are not created locally, so we cannot send them messages. On the other hand, an enum is a global variable, so maybe it should be allowed to call methods on it.

The PHP_CodeSniffer rule follows the rule's name and checks that there is only one dot per line. This creates better code, because train wrecks will be split into explaining variables which make debugging easier. Also Tomas is checking for known fluent interfaces. Fluent interfaces - by definition - look like they are violating the Law Of Demeter. As long as the fluent interface returns the same instance, as for example basic builders do, there is no violation. When following a more relaxed version of the law, the Class Version of Law Of Demeter, then different implementations of the same type are still possible. The Java Stream API, where many calls return a new Stream instance of a different class - or the same class with a different generic type - is likely to violate the law. It does not matter. Fluent interfaces are designed to improve readability of code. Law Of Demeter violations in fluent interfaces are false positives.

Don't Abbreviate
I found it difficult to check for abbreviations, so rule #5 is not enforced. I thought of implementing this rule using a dictionary, but that is prone to false positives as the dictionary cannot contain all terms from all domains we create software for. The PHP_CodeSniffer rules check for names shorter than three characters and allow certain exceptions like id. This is a good start but is not catching all abbreviations, especially as the need to abbreviate arises from long names. Another option would be to analyse the name for its camel case patterns, requiring all names to contain lowercase characters between the uppercase ones. This would flag acronyms like ID or URL but no real abbreviations like usr or loc.
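The two heuristics mentioned above are simple enough to sketch. This is a rough Python approximation, not the PHP_CodeSniffer implementation; the function names and the exception list are assumptions.

```python
import re

ALLOWED_SHORT_NAMES = {"id"}  # assumed list of accepted short names


def is_too_short(name, minimum=3):
    # Length check with explicit exceptions.
    return len(name) < minimum and name not in ALLOWED_SHORT_NAMES


def has_acronym(name):
    # Two consecutive uppercase letters mean there is no lowercase
    # character between them, e.g. "userID" or "URLParser".
    return re.search(r"[A-Z]{2}", name) is not None
```

As predicted in the text, has_acronym flags userID but accepts usr and loc, and is_too_short does not catch them either because both have three characters.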

Small Entities
Small is relative. Different people use different limits depending on programming language. Jeff Bay's 50 lines work well for Java. Rafael Dohms proposes to use 200 lines for PHP. PHP_CodeSniffer checks function length and number of methods per class, too. Fabian Schwarz-Fritz limits packages to ten classes. All these additional rules follow Jeff Bay's original idea and I will add them to the rule set in the future.

Two Instance Variables
Allowing only two instance variables seems arbitrary - why not have three or five. Some people have changed the rules to allow five fields. I do not see how the choice of language makes a difference. Two is the smallest number that allows composition of object trees.

In PHP_CodeSniffer there is no rule for this because the number depends on the "individual domain of each project". When an entity or value object consists of three or more equal parts, the rule will flag the code but there is no problem. For example, a class BoundingBox might contain four fields top, left, bottom, right. Depending on the values, introducing a new wrapper class Coordinate to reduce these fields to topLeft and bottomRight might make sense.
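A sketch of the BoundingBox refactoring could look like this in Python. The method names and the assumption of screen coordinates (y grows downwards) are not from the text.

```python
class Coordinate:
    def __init__(self, x, y):
        self._x = x
        self._y = y

    def is_left_of_and_above(self, other):
        # Assumption: screen coordinates, y grows downwards.
        return self._x <= other._x and self._y <= other._y


class BoundingBox:
    # Two instance variables instead of top, left, bottom and right.
    def __init__(self, top_left, bottom_right):
        self._top_left = top_left
        self._bottom_right = bottom_right

    def contains(self, point):
        return (self._top_left.is_left_of_and_above(point)
                and point.is_left_of_and_above(self._bottom_right))
```

The comparison logic moves into Coordinate, so BoundingBox never sees a raw number. Whether that is worth the extra class depends on how much behaviour the coordinates attract.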

No Properties
My PMD rule finds methods that return an instance field (a getter) or update it (a setter). PHP_CodeSniffer checks for methods using the typical naming conventions. It further forbids the usage of public fields, which is a great idea. As we wrapped all primitives (rule #3) and we have no getters, we can never check their values. So how do we create state based tests? Mark Needham has discussed "whether we should implement equals and hashCode methods on objects just so that we can test their equality. My general feeling is that this is fine although it has been pointed out to me that doing this is actually adding production code just for a test and should be avoided unless we need to put the object into a HashMap or HashSet."
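In Python the counterparts of equals and hashCode are __eq__ and __hash__. The sketch below shows a state based test on a wrapped value without any getter. The incremented method and the test function are hypothetical, and the class reuses the NeighbourCount name only as a familiar example.

```python
class NeighbourCount:
    def __init__(self, count):
        self._count = count

    def incremented(self):
        return NeighbourCount(self._count + 1)

    def __eq__(self, other):
        # Equality on the hidden value, used by the test below.
        return (isinstance(other, NeighbourCount)
                and self._count == other._count)

    def __hash__(self):
        # Kept consistent with __eq__, as for equals and hashCode.
        return hash(self._count)

    def __repr__(self):
        # Makes assertion failures readable without exposing a getter.
        return "NeighbourCount(%d)" % self._count


def test_incremented_count():
    assert NeighbourCount(2).incremented() == NeighbourCount(3)
```

The test compares whole objects, so the wrapped integer never leaves the class.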

From what I have seen, most object oriented developers struggle with that constraint. Getters and setters are very ingrained. In fact some people have dropped that constraint from Object Calisthenics. There are several ways to live without accessors. Samir Talwar has written why avoiding Getters, Setters and Properties is such a powerful mind shift.

Java Project Setup
I created a repository containing the starting point for the LCD Numbers Kata:

lcd-numbers-object-calisthenics-java-setup.

Both are Apache Maven projects. The projects are set up to check the code using the Maven PMD Plugin on each test execution. Here is the relevant snippet from the pom.xml:

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-pmd-plugin</artifactId>
      <version>3.7</version>
      <configuration>
        <printFailingErrors>true</printFailingErrors>
        <linkXRef>false</linkXRef>
        <typeResolution>true</typeResolution>
        <targetJdk>1.8</targetJdk>
        <sourceEncoding>${encoding}</sourceEncoding>
        <includeTests>true</includeTests>
        <rulesets>
          <ruleset>/rulesets/java/object-calisthenics.xml</ruleset>
        </rulesets>
      </configuration>
      <executions>
        <execution>
          <phase>test</phase>
          <goals>
            <goal>check</goal>
          </goals>
        </execution>
      </executions>
      <dependencies>
        <dependency>
          <groupId>org.codecop</groupId>
          <artifactId>pmd-rules</artifactId>
          <version>1.2.3</version>
        </dependency>
      </dependencies>
    </plugin>
  </plugins>
</build>

You can add this snippet to any Maven project and enjoy Object Calisthenics. The Jar file of pmd-rules is available in my personal Maven repository.

To test your setup there is sample code in both projects and mvnw test will show two violations:

[INFO] PMD Failure: SampleClass.java:2 Rule:TooManyFields Priority:3 Too many fields.
[INFO] PMD Failure: SampleClass:9 Rule:NoElseKeyword Priority:3 No else keyword.

It is possible to check the rules alone with mvnw pmd:check. (Using the Maven Shell the time to run the checks is reduced by 50%.) There are two run_pmd scripts, one for Windows (.bat) and one for Linux (.sh).

Object Calisthenics Retrospective

Limitations of Checking Code
Obviously code analysis cannot find everything. On the other hand - as discussed earlier - some violations will be false positives, e.g. when using the Stream API. You can use // NOPMD comments and @SuppressWarnings("PMD") annotations to suppress false positives. I recommend using exact suppressions, e.g. @SuppressWarnings("PMD.TooManyFields") to skip violations because other violations at the same line will still be found. Use your good judgement. The goal of Object Calisthenics is to follow all nine rules, not to suppress them.
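As a minimal sketch of the two suppression styles, assume a hypothetical class Invoice that deliberately keeps three fields and one else branch. The class and its members are made up for illustration. Which line a rule reports depends on the rule implementation, so the position of the NOPMD marker is an assumption as well.

```java
// Hypothetical example class; all names are illustrative only.
// Exact suppression: only TooManyFields is silenced for this class,
// every other rule is still checked.
@SuppressWarnings("PMD.TooManyFields")
class Invoice {
    private final String customer;
    private final int netCents;
    private final int taxCents;

    Invoice(String customer, int netCents, int taxCents) {
        this.customer = customer;
        this.netCents = netCents;
        this.taxCents = taxCents;
    }

    int totalCents() {
        return netCents + taxCents;
    }

    String label() {
        if (totalCents() > 0) {
            return customer + " owes " + totalCents();
        } else { // NOPMD broad: hides every violation reported on this line (assumed to be the else line)
            return customer + " is settled";
        }
    }
}
```

The annotation names the rule it silences, whereas the comment would also hide any future violation on the same line.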

Learnings
Object Calisthenics is a great exercise. I used it in all of my workshops on Object Oriented Programming and in several exercises I did myself. The verification of the rules helped me and the participants to follow the constraints and made the exercise stricter. (If people were stuck, I sometimes recommended ignoring one PMD violation or another, at least for some time.) People liked it and had insights into object orientation: It is definitely a "different" and "challenging way to code". "It is good to have small classes. Now that I have many classes, I see more structure." You should give it a try, too. Jeff Bay even recommends running an exercise or prototype of at least 1000 lines for at least 20 hours.

The question whether Object Calisthenics is applicable to real working systems remains. While it is excellent as an exercise, it might be too strict to be used in production. On the other hand, in his final note, Jeff Bay talks about a system of 100,000 lines of code written in this style, where the "people working on it feel that its development is so much less tiresome when embracing these rules".

23 September 2017

Verbs instead of Nouns

Verbs instead of Nouns is a basic Coderetreat activity. It was used right from the beginning of Coderetreat. I tried it the first time during the GDCR 2012. The goal was to focus on verbs instead of nouns (obviously ;-). By searching for verb names, we did not think about what a class represented or contained, rather what it did.

Constraints in General
A constraint, also known as an activity, is an artificial challenge during an exercise, e.g. code kata, coding dojo or Coderetreat. It is designed to help participants think about writing code differently than they would otherwise. Every activity has a specific learning goal in mind.

Constraints are the primary tool to focus a coding exercise. For example, to improve my Object Orientation, I will practise Jeff Bay's Object Calisthenics or even Brutal Coding Constraints. Some constraints are an exaggeration of a fundamental rule of clean code or object oriented design and might be applicable during day to day work. More extreme ones will still help you understand the underlying concepts.

Learning Goal
Verbs instead of Nouns is listed as a stretch activity. Stretch activities are designed to push you out of your usual coding habits - your coding comfort zone - and broaden your horizon by showing you new ways to do things. By design, stretch activities might look awkward, ridiculous or even plain wrong.

The learning goal of Verbs instead of Nouns is to push you out of noun oriented thinking. Noun oriented thinking is a way of object orientation where the nouns of the problem description become classes and the verbs become methods. This is the classic definition of Object Oriented Analysis and Design. As with any technique, following it blindly is not healthy. According to Alan Kay, Object Oriented Programming is about messaging and encapsulation. He wanted "to get rid of data". His objects are defined by the messages they accept. Object oriented programming becomes verb based if we focus on behaviour.

In functional programming, verbs are natural. All activities are functions. For example, Steve Yegge describes functional programming as verb based in his humorous critique of 2006-style Java. Verbs instead of Nouns is an object oriented constraint.

Interpretation of the Constraint
Besides its name, there is no information about this constraint available on the Coderetreat site. There was a discussion on how to meet the constraint (which has been deleted to make space for the new GDCR organisation): Separate value objects from operations and build service objects for the operations, which would be named with a verb. Or do not consider what a class contains or represents, but what it does. This keeps the concerns separated and the classes small and simple.

Being a Value
Obviously not everything can be named with a verb. Values, at least primitive values, are things: 2, true, "Hello". The Oxford Dictionary explains value - the way we use it in code - as the numerical amount denoted by an algebraic term; a magnitude, quantity, or number. Now "Hello" is neither a quantity nor a number, it is a constant term. The entry about Value Object on Wikipedia defines a value object as a small object that represents a simple entity whose equality is not based on identity: i.e. two value objects are equal when they have the same value, not necessarily being the same object. The definition uses "having the same value"... I am not getting anywhere.

On the other hand, in the Lambda Calculus, even numbers are represented as functions. For example, the number two can be represented by the higher order function n₂(f,x) = f(f(x)), see Tom Stuart's Programming with Nothing. Being a function makes it verb based, but which verb would name n₂(f,x)?
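The function n₂ can be written down directly in Java. This is only a sketch: the class name ChurchNumeralDemo and the method name two are made up, and the uncurried two argument form follows the formula above rather than the curried form of the Lambda Calculus.

```java
import java.util.function.UnaryOperator;

class ChurchNumeralDemo {

    // n2(f, x) = f(f(x)): "two" means applying f twice to x.
    static <T> T two(UnaryOperator<T> f, T x) {
        return f.apply(f.apply(x));
    }

    public static void main(String[] args) {
        // Interpreting the numeral with "add one", starting at zero, yields the int 2.
        int asInt = two(n -> n + 1, 0);
        System.out.println(asInt);

        // The same numeral works on any type, here it appends twice.
        System.out.println(two(s -> s + "!", "Hello"));
    }
}
```

The only name that seems to fit is something like "apply twice", which describes what the function does rather than what the number is.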

Suitable Exercises
For a stretch exercise, a suitable exercise is challenging. There is no point if everything goes smoothly. We need an assignment that does not support the constraint. Everything that is functional in nature is not suitable, because functions are verb based. This rules out algorithmic exercises as algorithms are usually functional. We need a kata with some state - some "values" - and the need to mutate that. Let's try different problems.

Discussion of Game of Life
As I said, I did Verbs instead of Nouns first on the Game of Life. While Game of Life is a larger exercise, most of it can be implemented in a functional way, lending itself to the constraint. Here are some of its classes:

Classify has two implementations, ClassifyPopulation and ClassifyReproduction. Both classes check if a population is optimal for survival or not. There is one public method and its arguments are passed into the constructor. These classes are functors, function objects, the representation of functions in object oriented languages. The class names suit these classes and the verb oriented thinking helped in extracting and evolving them.
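Such a functor could look like the following sketch. The class names are taken from the text, but the method name isOptimal, the int argument and the mapping of the two classes to the survival and birth rules are assumptions, not the original code.

```java
// Hypothetical reconstruction; method name and argument type are assumptions.
interface Classify {
    boolean isOptimal();
}

// Assumed to cover survival: a living cell lives on with two or three living neighbours.
class ClassifyPopulation implements Classify {
    private final int livingNeighbours;

    ClassifyPopulation(int livingNeighbours) {
        this.livingNeighbours = livingNeighbours; // the argument goes into the constructor
    }

    @Override
    public boolean isOptimal() {
        return livingNeighbours == 2 || livingNeighbours == 3;
    }
}

// Assumed to cover birth: a dead cell comes alive with exactly three living neighbours.
class ClassifyReproduction implements Classify {
    private final int livingNeighbours;

    ClassifyReproduction(int livingNeighbours) {
        this.livingNeighbours = livingNeighbours;
    }

    @Override
    public boolean isOptimal() {
        return livingNeighbours == 3;
    }
}
```

Each object is a function call frozen in time: the constructor captures the arguments and the single public method evaluates it.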

LocateCell represents the position of the cells in the grid. It contains two integers x, y and an equals method to identify same positions. What is the verb of being a coordinate? A coordinate locates, as it discovers the exact place or position of something (Oxford Dictionary). Here Coordinate might be a more natural name.

I am unhappy with LookupLivingCells. It has two methods reproduce and isAlive. The verb Lookup only points to the second method. A proper class name should contain all functionality the class offers, so LookupAndTrackLivingCells is more appropriate. I do not like class names with And in them because they violate the Single Responsibility Principle. On the other hand - in an object oriented way - the class is fine as it encapsulates the collection of LocateCells and represents a Generation of cells.

Discussion of Trivia
Next I tried refactoring towards the constraint. Refactoring towards a constraint allows a more fine grained transition. Together with fellow craftsman Johan Martinsson we worked on the Trivia exercise and spent several hours extracting "verbs" from the legacy code base. (Many of the observations I describe later were made by Johan or found through discussion with him.) Let's look at some of the classes we created:

We extracted Ask. A noun oriented name might be Questions or QuestionsDeck. It is a closure over the list of questions and it does ask them.
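A rough sketch of what "a closure over the list of questions" can mean in Java follows. Only the class name Ask comes from the exercise. The constructor, the String questions and the method name nextQuestion are assumptions for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch; constructor and method name are assumptions.
class Ask {
    private final Deque<String> questions;

    Ask(List<String> questions) {
        // copy the list so the deck cannot be changed from outside
        this.questions = new ArrayDeque<>(questions);
    }

    // asks by handing out the next question and removing it from the deck
    String nextQuestion() {
        return questions.removeFirst();
    }
}
```

The questions are not reachable from outside. The only thing a caller can do with the object is ask.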

MovePlayerOnBoard contains the board of the Trivia game. We felt unable to escape our mental model of objects as state. On the other hand, the code for the class was chosen only by looking at the behaviour. It must be good. MovePlayerOnBoard has one public method but is not a functor because it contains mutable state, the positions of the players on the board.

Score is a similarly reasonable class by object oriented standards. A player scores by answering correctly, or does not score by answering wrongly. Like MovePlayerOnBoard and AllowToPlay, it is a real object with internal, encapsulated state and various methods manipulating its state. These classes are far away from functors and functional programming.

Conclusion
Verbs are abstractions, too. There are "small verbs" like increasePurse, and higher level ones like moveAndAsk. Smaller verbs are easier to identify and to create or extract. Most of our verbs encapsulate primitives. If the code is primarily state, finding a suitable verb is hard. These verb names feel even more "wrong" than other verb oriented names. Maybe, when the only behaviour of a class is mutating the subject, we should show the subject in its name.

Responsibilities
A method that does much is difficult to name with a single verb. In the refactoring exercise, we moved out logic to make the describing verb(s) simpler, clearer and "pure". During refactoring we had trouble finding concise verbs for convoluted legacy methods. I guess when creating verb based code from scratch, such methods would never exist. Naming classes as verbs helps to split logic into more classes containing different aspects of data.

Design
Many verb oriented classes are functors, objects with a single method. Some are closing over state. There are classes with different aspects of the same verb, e.g. answerCorrectly and answerWrongly in a class Answer. Despite some weird names, the resulting design was always good. The constraint drives towards nice, small, focused objects.

Usefulness as Exercise
The constraint is difficult. Especially when dealing with state, it is hard to find verb oriented names. It forces small, focused objects and discourages state oriented designs like Java Beans. Intermediate Object Oriented programmers will gain most from the constraint. They understand the basics of objects and usually create noun based classes. With more knowledge of object oriented design principles like SOLID, the constraint might have less impact on the design.

Example Code

30 November 2014

TDDaiymi vs. Naked Primitives

For our recent session Brutal Coding Constraints at the Agile Testing Days 2014, Martin Klose and I experimented with combinations of constraints for code katas and coding dojos. A constraint, also known as an activity, is a challenge during a kata, coding dojo or code retreat designed to help participants think about writing code differently than they would otherwise. Every activity has a specific learning goal in mind. Most coding constraints are an exaggeration of a fundamental rule of clean code or object oriented design. We tried to find particularly difficult combinations that would challenge expert developers and masters of object orientation. A brutal mixture of constraints is TDD as if you Meant it, No IFs and No naked primitives.

TDD as if you Meant it
Several years ago Keith Braithwaite noticed that many developers use some kind of pseudo-TDD by thinking of a solution and creating classes and functions that they just know they need to implement it. Then their tests keep failing and much time is spent debugging. Instead, they should just follow Kent Beck's original procedure of adding a little test, seeing it fail and making it pass by some small change. To help developers practise these original principles, Keith defined TDDaiymi as an exercise to really do TDD. The exercise uses a very strict interpretation of the practice of TDD, avoiding design up front and helping people get used to emergent design by refactoring. It is a difficult exercise and I still remember the first time I did it in a session run by Keith himself. For almost two hours my pair and I were struggling to create something useful. While TDDaiymi is a bit extreme, some people believe it to be the next step after TDD. German "Think Tank" Ralf Westphal even claims that following TDD as if you Meant it is the only way that TDD can give you good design results.

No naked primitives
No naked primitives is another nice constraint. All primitive values, e.g. booleans, numbers or strings, need to be wrapped and must not be visible at object boundaries. Arrays, all kinds of containers like lists or hash-tables and even Object (the root class of the language's class hierarchy) are considered primitive as well. Similar to Keith's TDDaiymi, this rule is designed to exercise our object orientation skills. A string representing a name is an under-engineered design because many strings are not valid names. In an object oriented system we would like to represent the concept of a name with a Name class. Usually Value Objects are used for this purpose. Also, a list of shopping items is not a shopping basket. A general purpose list implementation offers operations that do not make sense for a shopping basket. So containers need to be encapsulated. While it is exaggerated to wrap all primitives (see Primitive obsession obsession), I have seen so many cases of Primitive Obsession that I would rather see a few additional value objects than another map holding maps holding strings. (To avoid that I created a custom PMD rule PrimitiveObsession that flags all primitives in public signatures of Java classes.)
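The two examples from the paragraph can be sketched as follows. Name and ShoppingBasket are hypothetical classes, the validation rule (a name must not be blank) and the method names add and holds are assumptions. For brevity the boolean result of holds stays naked, which the strict constraint would forbid as well.

```java
import java.util.ArrayList;
import java.util.List;

// Value Object: not every string is a valid name (the blank check is an assumed rule).
final class Name {
    private final String value;

    Name(String value) {
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("a name must not be blank");
        }
        this.value = value.trim();
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof Name && value.equals(((Name) other).value);
    }

    @Override
    public int hashCode() {
        return value.hashCode();
    }
}

// First Class Collection: the list is hidden, only basket operations are offered.
final class ShoppingBasket {
    private final List<Name> items = new ArrayList<>();

    void add(Name item) {
        items.add(item);
    }

    boolean holds(Name item) {
        return items.contains(item);
    }
}
```

List operations that make no sense for a basket, like sorting or replacing an element by index, are simply not available to callers.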

No IFs
No IFs is a great constraint of the Missing Feature catalogue. All conditionals like if, while, switch and the ternary operator (e.g. _ ? _ : _) are forbidden. Like No naked primitives, this is exaggerated, but many code bases are full of abysmally deep nesting, like the PHP Streetfighter. This has been recognised as a problem and there are campaigns to reduce the usage of ifs. After all, excessive usage of if and switch statements is a code smell and can be replaced by polymorphism and other structures. In order to practise, the constraint is taken to the extreme. I have seen many developers struggle with the missing concept in the beginning, but after some thought each and every one of them found a way to express his or her concepts in another, less nested way.
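One way to replace conditionals by polymorphism and a lookup structure is sketched below, using the rules of the Game of Life. The enum Cell, its method evolve and the two lookup tables are made up for this illustration. Neighbour counts are assumed to be between 0 and 8, and the sketch ignores No naked primitives to stay short.

```java
// Hypothetical sketch: no if, switch, loop or ternary operator is used.
enum Cell {
    ALIVE {
        @Override
        Cell evolve(int livingNeighbours) {
            return SURVIVAL[livingNeighbours]; // lookup instead of a condition
        }
    },
    DEAD {
        @Override
        Cell evolve(int livingNeighbours) {
            return BIRTH[livingNeighbours];
        }
    };

    // Index is the number of living neighbours, 0 to 8.
    // A living cell lives on with two or three living neighbours.
    static final Cell[] SURVIVAL = {DEAD, DEAD, ALIVE, ALIVE, DEAD, DEAD, DEAD, DEAD, DEAD};
    // A dead cell comes alive with exactly three living neighbours.
    static final Cell[] BIRTH = {DEAD, DEAD, DEAD, ALIVE, DEAD, DEAD, DEAD, DEAD, DEAD};

    // Polymorphic dispatch decides which rule applies to the current state.
    abstract Cell evolve(int livingNeighbours);
}
```

The decision "is the cell alive?" is made by dynamic dispatch and the decision "how many neighbours?" by indexing, so the nesting disappears.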

Combining Constraints
Some constraints like TDDaiymi are already difficult on their own, but obeying several of them at the same time is hard. We stick to our (coding-) habits even if we planned to do otherwise. While working on the experiments, Martin and I would notice a constraint violation only after switching roles several times, several minutes after introducing the "forbidden" concept. This is especially true for conditionals because they are most fundamental to our (imperative styled) coding. It seems that tool support for constraints would help, but not all constraints can be enforced automatically and definitely not for all programming languages.

Refactoring Towards Constraints
It is difficult to find solutions with constraints - after all that is one of their primary purposes - so we decided to first make it green and then refactor towards the constraint. This seems natural as TDD itself has a similar rule: After creating the next failing test (red phase), try to get it green as fast as possible, creating dirty code and duplication as needed (green phase). Then clean up the mess relying on the safety net of the already existing tests (refactor phase). This results in more refactoring than usual, and we spent more than half of the time refactoring. As TDDaiymi puts much more focus on the refactoring step, this might have been due to the first constraint alone. 50-50 for a code kata with a refactoring focused constraint is not much, given that some people report three refactoring commits per feature commit in their production code.

Designing Up Front?
When refactoring towards the constraint, it is often hard to change the current ("green") solution because it is rooted in primitive data types or conditionals or whatever violates the constraints. Sometimes we had no clue how to refactor and left the violation in the code for several red-green-refactor cycles until it became clear how to proceed. Being hard is not a problem because this is an exercise after all, but allowing violations for extended periods of time made me unhappy as the code was not completely clean regarding our standards after each cycle. Also the code looked weird, but some constraints are weird and therefore force odd solutions. In following TDDaiymi we did not think about any design up front, but maybe we should have done so. Thinking about structures (i.e. design) that support the constraint, e.g. the absence of conditionals, greatly helps the code and such "designed" solutions look much better. While we should think about the problem, the technology-agnostic solution and its acceptance criteria (test cases) before starting to write code, classic TDD discourages creating a design up front, and TDDaiymi does even more so. I guess more practise is needed to resolve this issue.

Evolving Structures
Another, more direct contradiction is the combination of TDDaiymi and No naked primitives. While following the former constraint, we want to avoid premature design decisions and sometimes use Object as a place-holder for a concept as long as there is no structure yet. Maybe this can be avoided but I do not know how. (If you know how to TDDaiymi without plain objects, I am more than happy to pair with you on some exercises.) In the end, when the design is finished, the Object has been replaced, but as the final structure evolves late, it may stick around for some time. On the other hand Object is a primitive and therefore forbidden. Maybe it is special and should not count as primitive. I have never seen any under-engineered code base using plain objects, on the contrary they usually contain lots of strings with the occasional boolean or number, but no naked Object. So I believe it is safe to allow Object as primitive because it will be replaced eventually.

Conclusion
When I started to write this article I thought its conclusion would be that some coding constraints are just not compatible. As constraints are exaggerated principles, this would be no surprise. But now, after ~~writing~~ thinking about it, I am not sure it is a problem. On the contrary, the discussion helped me understand the constraints, their origins and intent. My real conclusion is that doing katas with constraints sparks interesting discussions and is a great way of learning about one's own coding habits and many aspects of coding in general. I strongly encourage you to do so.

Credits
Thanks to Martin Klose for the countless hours of pairing and discussing these constraints with me.

8 January 2012

Why Singletons Are Evil

Currently I'm working on enabling automated unit testing in a legacy code base but I am seriously hindered by the singleton design pattern which is very common in the code base in the form of managers. A quick sweep with a static code analysis tool reveals that 1207 out of 4787 classes depend on such (singleton) managers. See the nice graph generated with GSD to get an idea of these dependencies:

Singletons Are Evil
This means that 25% of all classes in the code base are virtually impossible, or at least very hard to unit test. Testing gets difficult as the singletons' global state has to be initialised for each test. This slows down test execution and makes the tests more brittle. That's one of many reasons I just hate the singleton pattern. It's an object oriented anti-pattern. In case you do not believe me, I am sure you will believe well known people like Miško Hevery, Uncle Bob or J.B. Rainsberger to name just a few. At least read Steve Yegge's take on the singleton which is my favourite article on the topic.
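The testing problem can be shown with a small sketch. All names (PriceManager, HiddenDependencyOrder, Prices, InjectedOrder) are hypothetical and not taken from the code base described above. The first order class pulls the manager out of thin air, the second one receives its collaborator through the constructor.

```java
// Hypothetical legacy style manager; all names are illustrative.
class PriceManager {
    private static final PriceManager INSTANCE = new PriceManager();
    private int basePrice = 100; // global, mutable state shared by all callers and all tests

    private PriceManager() {
    }

    static PriceManager getInstance() {
        return INSTANCE;
    }

    int basePrice() {
        return basePrice;
    }

    void setBasePrice(int basePrice) {
        this.basePrice = basePrice;
    }
}

// The signature hides the dependency; a test must set up the global state first.
class HiddenDependencyOrder {
    int total(int quantity) {
        return PriceManager.getInstance().basePrice() * quantity;
    }
}

// The dependency is visible and replaceable.
interface Prices {
    int basePrice();
}

class InjectedOrder {
    private final Prices prices;

    InjectedOrder(Prices prices) {
        this.prices = prices;
    }

    int total(int quantity) {
        return prices.basePrice() * quantity;
    }
}
```

A test of InjectedOrder can pass a lambda like () -> 7 as Prices and needs no shared state. A test of HiddenDependencyOrder has to call setBasePrice on the singleton and thereby influences every other test that runs after it.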

Pure Evilness
Some years ago I worked on a code base which was similar both in size and the number of singletons involved. In fact it was not that bad, only 15% of all classes were depending on a singleton, still it was nasty to work with.

So I took the liberty to educate my team mates on the singleton's evilness. The following list was taken from my Design Patterns reading group notes and contains all negative effects of singletons that I'm aware of. Singletons are evil because they ...

... introduce global state/global variables.

... hurt modularity and readability.
... make concurrent programming hard.
... encourage you to forget everything about object oriented design.

... break encapsulation/increase coupling.

... are a throwback to non object oriented programming.
... allow anyone to access them at any time. (They ignore scope.)
Finding the right balance of exposure and protection for an object is critical for maintaining flexibility.
They typically show up like star networks in a coupling diagram.
... make assumptions about the applications that will use them.
If nobody is going to use them they are basically just memory leaks.

... cause coding/design problems.

Signatures of methods (inputs) don't show their dependencies, because the method could pull a singleton out of thin air.
All singletons have to stick in boilerplate code.
Everyone who uses it has to stick in boilerplate code, too.

... make code hard to test.

When classes are coupled, it is possible only to test a group of classes together, making bugs more difficult to isolate.
... can prevent a developer from testing a class at all.
Two tests that actually depend on each other by modifying a shared resource (the singleton) can produce blinking tests.

... prevent you from using other code in place of production implementations.