22 March 2012

Prime Factors XSLT

Here is a short conversation we had at Hackergarten Vienna yesterday:

I did the Prime Factors Kata in XSLT. Would you like to review it?
You did WHAT?
You know, this small program. It's calculating the prime factors of a number.
But what? In XSLT? Why XSLT?
I know that it sounds a bit crazy. I had performed the kata in all programming languages which I know, e.g. Java, Ruby and even old BASIC or Turbo Pascal, but not using XSL transformations. So I just had to do it.

Initial Version
Using XSLTunit I created a test case that would apply the template to <value> to calculate the prime factors of the given source.
<xsltu:test id="test-template-factors-one">
  <xsl:variable name="source">
  <value>1</value>
  </xsl:variable>
  <xsl:call-template name="xsltu:assertEqual">
  <xsl:with-param name="id" select="'template factors 1'" />
  <xsl:with-param name="nodes1">
    <xsl:apply-templates select="exsl:node-set($source)/value" />
  </xsl:with-param>
  <xsl:with-param name="nodes2"></xsl:with-param>
  </xsl:call-template>
</xsltu:test>

...

<xsltu:test id="test-template-factors-four">
  <xsl:variable name="source">
  <value>9</value>
  </xsl:variable>
  <xsl:call-template name="xsltu:assertEqual">
  <xsl:with-param name="id" select="'template factors 4'" />
  <xsl:with-param name="nodes1">
    <xsl:apply-templates select="exsl:node-set($source)/value" />
  </xsl:with-param>
  <xsl:with-param name="nodes2">3 3</xsl:with-param>
  </xsl:call-template>
</xsltu:test>
and so on. The result looked like that:
<xsl:template match="value">
  <xsl:call-template name="generate">
  <xsl:with-param name="number" select="number(current())" />
  <xsl:with-param name="candidate" select="number(2)" />
  </xsl:call-template>
</xsl:template>

<xsl:template name="generate">
  <xsl:param name="number" />
  <xsl:param name="candidate" />
  <xsl:choose>
  <!-- one is no prime and does not have any factors -->
  <xsl:when test="$number = 1"></xsl:when>
  <!-- candidate == number, so number is a prime -->
  <xsl:when test="$candidate = $number">
    <xsl:value-of select="$number" />
  </xsl:when>
  <!-- number is factored by the candidate, factor it -->
  <xsl:when test="$number mod $candidate = 0">
    <xsl:value-of select="$candidate" />
    <xsl:text> </xsl:text>
    <xsl:call-template name="generate">
    <xsl:with-param name="number" select="$number div $candidate" />
      <xsl:with-param name="candidate" select="$candidate" />
      </xsl:call-template>
    </xsl:when>
    <!-- try again with the next factor -->
    <xsl:otherwise>
      <xsl:call-template name="generate">
        <xsl:with-param name="number" select="$number" />
        <xsl:with-param name="candidate" select="$candidate + 1" />
      </xsl:call-template>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
StackOverflowError
Using javax.xml.transform.TransformerFactory from Java 6 to run the XSLT transformation, the last prime that was processed successfully was 7351. Calling generate with the following prime (7369) caused a StackOverflowError inside Apache Xalan due to the recursion. As there was no need to try candidates larger than the square root of number, I replaced the test for $candidate = $number with $candidate * $candidate > $number. This brought me as far as 54184313.

Reducing the Stack Depth
XSLT is a functional language, there are no loops, the only way to iterate is recursion. So the size of the largest prime to process is limited by the size of the stack. Still half of the recursions are useless because no even number larger than two can be a prime.
<!-- try again with the next factor -->
  <xsl:otherwise>
    <xsl:choose>
      <xsl:when test="$candidate = 2">
        <xsl:call-template name="generate">
          <xsl:with-param name="number" select="$number" />
          <xsl:with-param name="candidate" select="$candidate + 1" />
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:call-template name="generate">
          <xsl:with-param name="number" select="$number" />
          <xsl:with-param name="candidate" select="$candidate + 2" />
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:otherwise>
This can be improved further by unrolling the (recursive) loop. If number is not factored by candidate + 2 then the next one is candidate + 4 and so on.
<!-- try again with the next factor -->
  <xsl:otherwise>
    <xsl:choose>
      <xsl:when test="$candidate = 2">
        <xsl:call-template name="generate">
          <xsl:with-param name="number" select="$number" />
          <xsl:with-param name="candidate" select="$candidate + 1" />
        </xsl:call-template>
      </xsl:when>
      <xsl:when test="$number mod ($candidate + 2) = 0">
        <xsl:call-template name="generate">
          <xsl:with-param name="number" select="$number" />
          <xsl:with-param name="candidate" select="$candidate + 2" />
        </xsl:call-template>
      </xsl:when>
      <xsl:when test="$number mod ($candidate + 4) = 0">
        <xsl:call-template name="generate">
          <xsl:with-param name="number" select="$number" />
          <xsl:with-param name="candidate" select="$candidate + 4" />
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:call-template name="generate">
          <xsl:with-param name="number" select="$number" />
          <xsl:with-param name="candidate" select="$candidate + 6" />
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:otherwise>

9 February 2012

Required Reading: Clean Code

Here is an email that I wrote to my team earlier this year. The team members are able to deliver new features on time but do not care for code quality (at least not as much as I do ;-). This is going to change.

Raising the bar. (1)

Clean Code AssetsDear team,
as mentioned in the kick-off, we need to get into the right attitude for the upcoming refactoring effort. Many items on our code clean-up list are ongoing changes, e.g.
  • use proper names for fields and methods
  • clean up magic numbers
  • split large classes
  • add JavaDoc to core classes
  • fix compiler warnings
  • remove duplicated code
  • remove dead code
  • add JUnit tests
All of these changes are in fact rules how to produce code that is easy to read and maintain. All these rules are part of a coding style sometimes called "clean code".

The Pragmatic Programmers once wrote that you should read at least four technical books a year to stay sharp and relevant in our fast paced industry. Remember that the half-life of our technical knowledge in only 18 months. (Heinz Kabutz) So as first technical book to read in 2012 I highly recommend Clean Code by Bob Martin. It's a great book and has raised a new wave of code-consciousness. Read one of its reviews if you do not believe me.

Sooner or later you will have to read it, so why not start now? Go ahead, buy it, read it! Or at least browse it and see what is inside. We cannot afford to have any more cryptic variable names or huge classes.

Regards,
Code Cop

Lowering the bar.

Of course I did not attach the PDF of the book. That would have been illegal but it might lower the bar to get my colleagues into reading it. I know that not many of the team will read it. I would consider my email successful if one of two read the table of contents or browse a chapter.

(1) "Raising the bar" is the subtitle of the Software Craftsmanship Manifesto.

15 January 2012

Literature for Java Developers

Jonny Andersson asked for a list of good books that one needs to read to learn how to create proper and useful Java code. He knew a few reasonable books, like the SCJP (Sun Certified Programmer) Study Guide, and I immediately came up with some recommendations. Still he encouraged me to share more literature tips in a discussion thread because as he said, there are so many books on Java development and software engineering available that it is impossible to know them all. We cannot afford to waste time on other books than the best ones and have to relay on recommendations. My last reading list is a bit outdated so I decided to repost my recommendations here.

Reading in the roundFirst I would like to differentiate the topic of Java into three phases: First learn how to program, second learn how to program in an object oriented way and only then learn whatever framework you like. Unfortunately I see that most junior developers have serious deficiencies in the first two areas, whereas frameworks are well known. I met developers who knew the Spring Framework pretty well, but had never heard of Test Driven Development and had no idea what the Java keywords volatile or transient are supposed to do.

For the first phase I refer to the most influential book every programmer should read. There you see that the most important book (as seen by the StackOverflow community) is Code Complete. The second important book is The Pragmatic Programmer. I think it's even more important than Code Complete. This book is the base of all development work. I'm sure you will like it. It has around 350 pages and for an experienced developer only a few chapters will be new. But on the other hand, I still meet people that do not use version control or build automation. That's one reason that everybody has to read it. Now go, read it! Even if you are an expert, just go and read it! ;-)

The SCJP Study Guide is a good start for Java. There is no way around knowing operator precedence and other low level details of the language. Another book on Java basics like TIJ (Thinking in Java) is necessary to deepen the knowledge after the SCJP guide, which is just focusing on the very basics. Then I highly recommend Java Generics and Collections to comprehend Java Generics. Of course no list of Java books is complete without Effective Java which is also very important. To round up the basic Java knowledge I recommend the Java Language Specification. It's a specification but it's not that bad to read.

After the basics I would definitely add Kent's Beck Test Driven Development by Example, which is a short and excellent introduction to JUnit and TDD. And of course you must read Design Patterns, Refactoring, Clean Code - argh, more and more books get piling up. Think about it, you will not become a professional (Java) developer by reading three books, you probably need more than that, maybe 30 ;-) So if you feel like more books, just check out my Goodreads reading list. All books I've read are rated and tagged accordingly. I encourage to have a look. I'm looking forward to see your recommendations.