In this post I discuss protocols, the new feature in Clojure that just made its way into the recently released 1.2. I am not going into what protocols are - there are quite a few nice articles that introduce Clojure protocols and the associated defrecord and deftype forms. This post will be some random rants about how protocols encourage non-intrusive extension of abstractions without muddling inheritance into polymorphism. I also discuss some of my realizations about what protocols aren't, which I felt was as important as understanding what they are.

Let's start with the familiar Show type class of Haskell ..

    > :t show
    show :: (Show a) => a -> String

show takes a value and renders a string for it. You get show for your type if you have made it an instance of the Show type class. The Show type class extends your abstraction transparently through an additional behavior set. We can do the same thing using protocols in Clojure ..

    (defprotocol SHOW
      (show [val]))

The protocol definition just declares the contract without any concrete implementation in it. Under the covers it generates a Java interface which you can use in your Java code as well. But a protocol is not an interface.
Adding behaviors non-invasively ..
I can extend an existing type with the behaviors of this protocol. And for this I need not have the source code for the type. This is one of the benefits that ad hoc polymorphism of type classes offers - type classes (and Clojure protocols) are open. Note how this is in contrast to the compile time coupling of Java interface and inheritance.
Extending java.lang.Integer with SHOW ..

    (extend-type Integer
      SHOW
      (show [i] (.toString i)))
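
One caveat when trying this on a newer release: as far as I know, from Clojure 1.3 onwards integer literals are read as java.lang.Long rather than java.lang.Integer, so an extension to Integer alone would not cover a literal like 42. A minimal sketch, assuming such a release, that extends the protocol to both types:

```clojure
(defprotocol SHOW
  (show [val]))

;; Covers values that really are java.lang.Integer, e.g. (int 42)
(extend-type Integer
  SHOW
  (show [i] (str i)))

;; Assumption: Clojure 1.3+, where a literal such as 42 is a java.lang.Long
(extend-type Long
  SHOW
  (show [i] (str i)))

(show (int 42)) ;; "42"
(show 42)       ;; "42"
```
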
We can extend an interface also, and get access to the added behavior from *any* of its implementations .. Here's extending clojure.lang.IPersistentVector ..

    (extend-type clojure.lang.IPersistentVector
      SHOW
      (show [v] (.toString v)))

    (show [12 1 4 15 2 4 67]) ;; "[12 1 4 15 2 4 67]"

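
To make the "any of its implementations" part concrete, here is a small sketch that calls show on two values of different concrete classes. The only assumption is that a literal vector and the result of subvec are distinct classes that both implement IPersistentVector:

```clojure
(defprotocol SHOW
  (show [val]))

(extend-type clojure.lang.IPersistentVector
  SHOW
  (show [v] (.toString v)))

;; A literal vector and a subvector are different concrete classes,
;; but both implement IPersistentVector
(show [1 2 3])                 ;; "[1 2 3]"
(show (subvec [1 2 3 4] 1 3))  ;; "[2 3]"

;; The two values do not share a concrete class
(= (class [1 2 3]) (class (subvec [1 2 3 4] 1 3))) ;; false
```
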
And of course I can extend my own abstractions with the new behavior ..

    (defrecord Name [last first])

    (defn name-desc [name]
      (str (:last name) " " (:first name)))

    (name-desc (Name. "ghosh" "debasish")) ;; "ghosh debasish"

    (extend-type Name
      SHOW
      (show [n]
        (name-desc n)))

    (show (Name. "ghosh" "debasish")) ;; "ghosh debasish"

No Inheritance
Protocols help you wire abstractions that are in no way related to each other, and they do this non-invasively. An object conforms to a protocol only if it implements the contract. As I mentioned before, there's no notion of hierarchy or inheritance related to this form of polymorphism.
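
Conformance can be checked at the REPL with satisfies?. The sketch below extends the protocol to a record and to String, two types where neither derives from the other, and then asks about a keyword, which was given no implementation. The choice of String and of the keyword is mine, purely for illustration:

```clojure
(defprotocol SHOW
  (show [val]))

(defrecord Name [last first])

(extend-type Name
  SHOW
  (show [n] (str (:last n) " " (:first n))))

(extend-type String
  SHOW
  (show [s] s))

;; Neither type derives from the other, yet both conform to SHOW
(satisfies? SHOW (Name. "ghosh" "debasish")) ;; true
(satisfies? SHOW "hello")                    ;; true

;; A type that was never given an implementation does not conform
(satisfies? SHOW :a-keyword)                 ;; false
```
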
No object bloat, no monkey patching
And there's no object bloat going on here. You can invoke show on any abstraction for which you implement the protocol, but show is never added as a method on that object. As an example, try the following after implementing SHOW for Integer ..

    (filter #(= "show" (.getName %)) (.getMethods Integer))

It will return an empty list. Hence there is no risk of *accidentally* overriding someone else's monkey patch on some shared class.
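
The same point can be seen from the protocol's side. This is a rough sketch using String, so it does not depend on how integer literals are read. Note that the :on-interface key is an implementation detail of how defprotocol stores the generated interface, which I am assuming here rather than quoting from documented API:

```clojure
(defprotocol SHOW
  (show [val]))

(extend-type String
  SHOW
  (show [s] s))

;; The implementation is registered with the protocol ...
(extends? SHOW String)                                 ;; true

;; ... while String itself gains no new method
(filter #(= "show" (.getName %)) (.getMethods String)) ;; ()

;; Assumed implementation detail: the generated Java interface
;; sits under :on-interface. String does not implement it,
;; although it satisfies the protocol.
(instance? (:on-interface SHOW) "hello")               ;; false
(satisfies? SHOW "hello")                              ;; true
```
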
Not really a type class
Clojure protocols dispatch on the first argument of the methods. This keeps them from offering the full power of Haskell / Scala type classes. Consider the counterpart of Show in Haskell, which is the Read type class ..

    > :t read
    read :: (Read a) => String -> a

If your abstraction implements Read, then the exact instance of the method invoked will depend on the return type, e.g.

    > [1,2,3] ++ read "[4,5,6]"
    => [1,2,3,4,5,6]

The specific instance of read that returns a list of integers is automatically invoked here. Haskell picks the matching instance at compile time from the inferred type.

We cannot do this with Clojure protocols, since they are unable to dispatch based on the return type. Protocols dispatch only on the first argument of the function.
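
The closest approximation I can think of is to pass the desired result type as an explicit argument, so that there is a value to dispatch on. The sketch below uses a multimethod rather than a protocol. read-as and its dispatch on a class are hypothetical names of my own, not part of Clojure:

```clojure
;; Hypothetical names: read-as and its dispatch values are illustrative only
(defmulti read-as
  "Parses the string s into a value of target-type."
  (fn [target-type s] target-type))

(defmethod read-as Long [_ s]
  (Long/parseLong s))

(defmethod read-as Double [_ s]
  (Double/parseDouble s))

;; The caller names the result type explicitly; nothing is inferred
(read-as Long "42")    ;; 42
(read-as Double "42")  ;; 42.0
```

This gets the right parser selected, but the burden moves to the caller, which is exactly what Haskell's read avoids.
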
6 comments:
Excellent article.
It's worth noting that Scala implicits and Haskell typeclass instances are dispatched statically - that is to say the compiler statically picks an implicit/typeclass instance based on static analysis of static types.
Clojure is dynamically typed so it must dispatch to a particular type extension based on a dynamic type tag associated with a value. Things like Haskell's read are out of the question because read creates a value of the target type rather than taking a value of that type.
Perhaps some day a dynamically typed language will be able to expose the full power of a typeclass-like mechanism, but I don't think anybody has figured out a way to do it efficiently.
Note that Clojure added protocols and types to make it possible to do Clojure-in-Clojure. If you want multiple dispatch you have to use multimethods, but those are too slow to implement Clojure in itself (meaning mainly the data structures).
"I can extend an existing type with the behaviors of this protocol."
I think this is a common misunderstanding. It's actually the other way around! The protocol is extended to the type. Seeing it this way also makes it easier to understand the behaviour, e.g. when a protocol is redefined, etc.
@meikel Good catch. I actually mentioned it in the section No Object Bloat. But the phrase that a protocol extends an existing type is strictly not true. The protocol methods are never installed as methods of the class / record / type on which it's applied.
Clojure protocols are just a poor twist on Lisp's generic functions. It seems that nobody uses defmulti, which, I suppose, was thought to be an extension and a replacement of Lisp's defgeneric; it proved too general-purpose. Yet defgeneric offers even more freedom than Haskell's typeclasses, because it also supports inheritance, before/after/around and custom decorations and eql-comparison. The only thing missing out-of-the-box is the ability to unite multiple generic functions under one roof. But it's just a macro away...
I think you may have misunderstood Meikel's comment: he did not say that the protocol extends a type but that it is extended to a type. It seems to me (based on your own post) that one can think of the protocol as a kind of switch statement that is initially empty and is then progressively (one declaration at a time) populated with cases of types for which the protocol's operation makes sense.