Optimizing CCCs
In the post Overloading lambda, I gave a translation from a typed lambda calculus into the vocabulary of cartesian closed categories (CCCs). This simple translation leads to unnecessarily complex expressions. For instance, the simple lambda term, “λ ds → (λ (a,b) → (b,a)) ds
”, translated to a rather complicated CCC term:
apply ∘ (curry (apply ∘ (apply ∘ (const (,) △ (id ∘ exr) ∘ exr) △ (id ∘ exl) ∘ exr)) △ id)
(Recall from the previous post that (∘)
binds more tightly than (△)
and (▽)
.)
However, we can do much better, translating to
exr △ exl
which says to pair the right and left halves of the argument pair, i.e., swap.
This post applies some equational properties to greatly simplify/optimize the result of translation to CCC form, including example above. First I’ll show the equational reasoning and then how it’s automated in the lambda-ccc library.
Equational reasoning on CCC terms
First, use the identity/composition laws:
f ∘ id ≡ f
id ∘ g ≡ g
Our example is now slightly simpler:
apply ∘ (curry (apply ∘ (apply ∘ (const (,) △ exr ∘ exr) △ exl ∘ exr)) △ id)
Next, consider the subterm apply ∘ (const (,) △ exr ∘ exr)
:
apply ∘ (const (,) △ exr ∘ exr)
≡ {- definition of (∘) -}
λ x → apply ((const (,) △ exr ∘ exr) x)
≡ {- definition of (△) -}
λ x → apply (const (,) x, (exr ∘ exr) x)
≡ {- definition of apply -}
λ x → const (,) x ((exr ∘ exr) x)
≡ {- definition of const -}
λ x → (,) ((exr ∘ exr) x)
≡ {- η-reduce -}
(,) ∘ (exr ∘ exr)
We didn’t use any properties of (,)
or of (exr ∘ exr)
, so let’s generalize:
apply ∘ (const g △ f)
≡ λ x → apply ((const g △ f) x)
≡ λ x → apply (const g x, f x)
≡ λ x → const g x (f x)
≡ λ x → g (f x)
≡ g ∘ f
(Note that I’ve cheated here by appealing to the function interpretations of apply
and const
. Question: Is there a purely algebraic proof, using only the CCC laws?)
With this equivalence, our example simplifies further:
apply ∘ (curry (apply ∘ ((,) ∘ exr ∘ exr △ exl ∘ exr)) △ id)
Next, lets focus on apply ∘ ((,) ∘ exr ∘ exr △ exl ∘ exr)
. Generalize to apply ∘ (h ∘ f △ g)
and fiddle about:
apply ∘ (h ∘ f △ g)
≡ λ x → apply (h (f x), g x)
≡ λ x → h (f x) (g x)
≡ λ x → uncurry h (f x, g x)
≡ uncurry h ∘ (λ x → (f x, g x))
≡ uncurry h ∘ (f △ g)
Apply to our example:
apply ∘ (curry (uncurry (,) ∘ (exr ∘ exr △ exl ∘ exr)) △ id)
We can simplify uncurry (,)
as follows:
uncurry (,)
≡ λ (x,y) → uncurry (,) (x,y)
≡ λ (x,y) → (,) x y
≡ λ (x,y) → (x,y)
≡ id
Together with the left identity law, our example now becomes
apply ∘ (curry (exr ∘ exr △ exl ∘ exr) △ id)
Next use the law that relates (∘)
and (△)
:
f ∘ r △ g ∘ r ≡ (f △ g) ∘ r
In our example, exr ∘ exr △ exl ∘ exr
becomes (exr △ exl) ∘ exr
, so we have
apply ∘ (curry ((exr △ exl) ∘ exr) △ id)
Let’s now look at how apply
, (△)
, and curry
interact:
apply ∘ (curry h △ g)
≡ λ p → apply ((curry h △ g) p)
≡ λ p → apply (curry h p, g p)
≡ λ p → curry h p (g p)
≡ λ p → h (p, g p)
≡ h ∘ (id △ g)
We can add more variety for other uses:
apply ∘ (curry h ∘ f △ g)
≡ λ p → apply ((curry h ∘ f △ g) p)
≡ λ p → apply (curry h (f p), g p)
≡ λ p → curry h (f p) (g p)
≡ λ p → h (f p, g p)
≡ h ∘ (f △ g)
With this rule (even in its more specialized form),
apply ∘ (curry ((exr △ exl) ∘ exr) △ id)
becomes
(exr △ exl) ∘ exr ∘ (id △ id)
Next use the universal property of (△)
, which is that it is the unique solution of the following two equations (universally quantified over f
and g
):
exl ∘ (f △ g) ≡ f
exr ∘ (f △ g) ≡ g
(See Calculating Functional Programs, Section 1.3.6.)
Applying the second rule to exr ∘ (id △ id)
gives id
, so our swap
example becomes
exr △ exl
Automation
By using a collection of equational properties, we’ve greatly simplified our CCC example. These properties and more are used in LambdaCCC.CCC
to simplify CCC terms during construction. As a general technique, whenever building terms, rather than applying the GADT constructors directly, we’ll use so-called “smart constructors” with built-in optimizations. I’ll show a few smart constructor definitions here. See the LambdaCCC.CCC
source code for others.
As a first simple example, consider the identity laws for composition:
f ∘ id ≡ f
id ∘ g ≡ g
Since the top-level operator on the LHSs (left-hand sides) is (∘)
, we can easily implement these laws in a “smart constructor” for (∘)
, which handles special cases and uses the plain (dumb) constructor if no simplifications apply:
infixr 9 @∘
(@∘) ∷ (b ↣ c) → (a ↣ b) → (a ↣ c)
⋯ -- simplifications go here
g @∘ f = g :∘ f
where ↣
is the GADT that represents biCCC terms, as shown in Overloading lambda.
The identity laws are easy to implement:
f @∘ Id = f
Id @∘ g = g
Next, the apply
/const
law derived above:
apply ∘ (const g △ f) ≡ g ∘ f
This rule translates fairly easily:
Apply @∘ (Const g :△ f) = prim g @∘ f
where prim
is a smart constructor for Prim
.
There are some details worth noting:
- The LHS uses only dumb constructors and variables except for the smart constructor being defined (here
(@∘)
). - Besides variables bound on the LHS, the RHS uses only smart constructors, so that the constructed combinations are optimized as well. For instance,
f
might beId
here.
Despite these details, this definition is inadequate in many cases. Consider the following example:
apply ∘ ((const u △ v) ∘ w)
Syntactically, the LHS of our rule does not match this term, because the two compositions are associated to the right instead of the left. Semantically, the rules does match, since composition is associative. In order to apply this rule, we can first left-associate and then apply the rule.
We could associate all compositions to the left during construction, in which case this rule will apply purely via syntactic matching. However, there will be other rewrites that require right-association in order to apply. Instead, for rules like this one, let’s explicitly left-decompose.
Suppose we have a smart constructor composeApply g
that constructs an optimized version of apply ∘ g
. This equivalence implies the following type:
composeApply ∷ (z ↣ (a ⇨ b) × a) → (z ↣ b)
Thus
apply ∘ (g ∘ f)
≡ (apply ∘ g) ∘ f
≡ composeApply g ∘ f
Now we can define a general rule for composing apply
:
Apply @∘ (decompL → g :∘ f) = composeApply g @∘ f
The function decompL
(defined below) does a left-decomposition and is conveniently used here in a view pattern. It decomposes a given term into g ∘ f
, where g
is as small as possible, but not Id
. Where decompL
finds such a decomposition, it yields a term with a top-level (:∘)
constructor, and composeApply
is used. Otherwise, the clause fails.
The implementation of decompL
:
decompL ∷ (a ↣ c) → (a ↣ c)
decompL Id = Id
decompL ((decompL → h :∘ g) :∘ f) = h :∘ (g @∘ f)
decompL comp@(_ :∘ _) = comp
decompL f = f :∘ Id
There’s also decompR
for right-factoring, similarly defined.
Note that I broke my rule of using only smart constructors on RHSs, since I specifically want to generate a (:∘)
term.
With this re-association trick in place, we can now look at compose/apply rules.
The equivalence
apply ∘ (const g △ f) ≡ g ∘ f
becomes
composeApply (Const p :△ f) = prim p @∘ f
Likewise, the equivalence
apply ∘ (h ∘ f △ g) ≡ uncurry h ∘ (f △ g)
becomes
composeApply (h :∘ f :△ g) = uncurryE h @∘ (f △ g)
where (△)
is the smart constructor for (:△)
, and uncurryE
is a smart constructor for Uncurry
:
uncurryE ∷ (a ↣ (b ⇨ c)) → (a × b ↣ c)
uncurryE (Curry f) = f
uncurryE (Prim PairP) = Id
uncurryE h = Uncurry h
Two more (∘)
/apply
properties:
apply ∘ (curry (g ∘ exr) △ f)
≡ λ x → curry (g ∘ exr) x (f x)
≡ λ x → (g ∘ exr) (x, f x)
≡ λ x → g (f x)
≡ g ∘ f
apply ∘ first f
≡ λ p → apply (first f p)
≡ λ (a,b) → apply (first f (a,b))
≡ λ (a,b) → apply (f a, b)
≡ λ (a,b) → f a b
≡ uncurry f
The first
combinator is not represented directly in our (↣)
data type, but rather is defined via simpler parts in LambdaCCC.CCC
:
first ∷ (a ↣ c) → (a × b ↣ c × b)
first f = f × Id
(×) ∷ (a ↣ c) → (b ↣ d) → (a × b ↣ c × d)
f × g = f @∘ Exl △ g @∘ Exr
Implementations of these two properties:
composeApply (Curry (decompR → g :∘ Exr) :△ f) = g @∘ f
composeApply (f :∘ Exl :△ Exr) = uncurryE f
These properties arose while examining CCC terms produced by translation from lambda terms. See the LambdaCCC.CCC
for more optimizations. I expect that others will arise with more experience.
Derek Elkins:
I think your difficulty with apply/const comes from the incoherence of const in this context; it adds a kind of “well-pointedness” that goes beyond a CCC and goes beyond what you need for at least this example. In a nutshell, it doesn’t make sense to say “const x” in general since there is no way to talk about elements, x, “in” some object. However, category theory does let us talk about arrows and that leads to what is sometimes called the “name” of an arrow (in a CCC). Specifically, const f = curry (f ∘ exr) which should look familiar. From there, it’s easy using the universal property of apply (as a counit of the relevant adjunction) which is: apply ∘ (curry f × id) = f. You get:
13 September 2013, 10:27 pmDan Krejsa:
A very slightly different take expanding a bit on Derek Elkin’s comment. Letting 1 (a.k.a. Unit) be the terminal object of the CCC, one can consider morphisms (1 ↣ b) as “elements of b” in some sense. Then one can have a function corresponding to ‘const’ applying to CCC morphisms:
where it :: a ↣ 1 is the unique morphism from a to 1.
Now given CCC morphisms g :: b ↣ c and f :: a ↣ b, we may view
and let
Then
and we have
using the defining property of ‘curry’ that for any f : c × a -> b,
and the result
conal:
Derek & Dan: Thanks very much for the suggestions to eliminate
16 September 2013, 10:45 amconst
, which felt shaky indeed!Conal Elliott » Blog Archive » Circuits as a bicartesian closed category:
[…] previous few posts have been about cartesian closed categories (CCCs). In From Haskell to hardware via cartesian […]
17 September 2013, 12:27 pm