Anybody can say you can’t write. Let no one say you don’t.
—Ken Rand, courtesy of Chalain
In computer programming, a hacker is a software designer and programmer who builds elegant, beautiful programs and systems… For some, “hacker” has a negative connotation and refers to a person who “hacks” or uses kludges to accomplish programming tasks that are ugly, inelegant, and inefficient.
—Wikipedia
The clustering algorithm requires a very large, fixed set of curves.1 I wrote the initial curve generation by building a gigantic list of parameter tuples, and then processing the list into records. Once the “search space” grew beyond a trivial size, the program began to eat enormous amounts of memory. The problem was that I was trying to write the generation code as clearly as possible, and that created a massive thicket of objects that all resided in memory simultaneously.

Consider `odd = LazyList.unfoldr(1) { |n| n + 2 }` and `even = LazyList.unfoldr(0) { |n| n + 2 }`. If you append `odd` onto `even`, the resulting list in theory has both even and odd numbers. But in practice you can never reach the odd numbers, because there is an infinite quantity of even numbers at the front of the list.

Instead, we can `merge` them together to produce a lazy union of the two lists. Merge interleaves alternating elements from each list, so the resulting lazy list contains all the elements of each list, and if you take a finite sample, such as the first 1,000 elements, you get 500 of each: `even.merge(odd) => (0 1 2 3 ...)`. (Other interesting operations that work with infinite lists include `cons`, `pairwise`, and `product`.)
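To make the mechanics concrete, here is a minimal sketch of how such a lazy list can work. `LazySeq` and its methods are illustrative stand-ins for the `LazyList` class discussed in this post, not its actual implementation:

```ruby
# A minimal lazy list: a head plus a thunk that produces the tail on demand.
# (LazySeq is a hypothetical stand-in for the LazyList class in the text.)
class LazySeq
  attr_reader :head

  def initialize(head, &tail_thunk)
    @head = head
    @tail_thunk = tail_thunk # evaluated only when the tail is requested
  end

  def tail
    @tail_thunk && @tail_thunk.call
  end

  # Build an infinite list from a seed and a successor block.
  def self.unfoldr(seed, &succ)
    new(seed) { unfoldr(succ.call(seed), &succ) }
  end

  # Interleave alternating elements from two lists.
  def merge(other)
    return self if other.nil?
    LazySeq.new(head) { other.merge(tail) }
  end

  # Realize the first n elements as an ordinary array.
  def take(n)
    node, result = self, []
    while node && result.size < n
      result << node.head
      node = node.tail
    end
    result
  end
end

odd  = LazySeq.unfoldr(1) { |n| n + 2 }
even = LazySeq.unfoldr(0) { |n| n + 2 }
even.merge(odd).take(6) # => [0, 1, 2, 3, 4, 5]
```

Nothing is computed until `take` pulls elements off the front, which is why both infinite lists can coexist in a merge without exhausting memory.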
Any sufficiently complicated Lisp or Ruby program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Haskell.
The `LazyList` class in Ruby is here. As soon as I switched from the “eager” code using arrays to the “lazy” code using lazy lists, memory use went down and performance went up. Not as much as a switch to imperative code (and not even close to writing a better clustering algorithm), but enough that I can move forward.

Here is the original eager implementation of `distribute`, which generates every way to distribute a number of choices across a range of spots (for example, `distribute(2, 1..4) -> [[1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4]]`):

```ruby
def distribute choices, range
  spots = (range.end - range.begin) + 1
  return [] if choices <= 0 || choices > spots
  remaining_choices = choices - 1
  if remaining_choices == 0
    return range.to_a.map { |spot| [spot] }
  else
    (range.begin..(range.end - remaining_choices)).to_a.inject([]) do |acc, first_spot|
      suffixes = distribute(remaining_choices, (first_spot + 1)..range.end)
      acc + suffixes.map { |remaining_distribution| remaining_distribution.unshift(first_spot) }
    end
  end
end
```
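As an aside, for the eager case Ruby’s standard library already provides this exact combinatorial operation, which makes a handy sanity check on the hand-rolled version:

```ruby
# Array#combination yields the same k-element subsets, in the same order,
# as the eager distribute above.
(1..4).to_a.combination(2).to_a
# => [[1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4]]
```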
And rewritten with lazy lists:

```ruby
def distribute choices, range
  spots = (range.end - range.begin) + 1
  return LazyList.new() if choices <= 0 || choices > spots
  remaining_choices = choices - 1
  if remaining_choices == 0
    LazyList.from_range(range).map { |spot| LazyList.from_element(spot) }
  else
    LazyList.from_range(range.begin..(range.end - remaining_choices)).mapr(:merge) do |first_spot|
      distribute(remaining_choices, (first_spot + 1)..range.end).map do |remaining_distribution|
        LazyList.cons(first_spot, remaining_distribution)
      end
    end
  end
end
```
It’s almost identical. `LazyList`s replace arrays, `mapr` replaces accumulating a list with `inject`, and `cons` replaces `unshift`, but otherwise it’s the same code. Mission accomplished: we’ve changed the behaviour without having to change the way we express our algorithms. Had we wanted to, we could have supported enough array syntax that lazy lists look exactly like arrays, but that isn’t necessary.
A language that doesn’t affect the way you think about programming, is not worth knowing.
—Alan Perlis
Now that we hold in our hands a tool for making infinitely long lists, the question to ask is, “how does that affect the way we think about generating sample curves?”

Consider the `y` value for one of the control points on a curve. Let’s say it’s a rational number between `0.0` and `1.0` inclusive. It’s relatively easy to generate a list of rationals in that range. Say we generate `0.0000...00001` (with jillions of zeroes elided: we want the smallest number our computer can represent), followed by `0.0000...00002` (the second smallest), and so on. Eventually, we would generate every number between `0.0` and `1.0` that our computer can represent. That shouldn’t be hard. But even taking into account the finite limitations on a computer’s representation, that list is, for all practical purposes, infinitely long.

The trouble is that if we use the list `(0.0000...00001 0.0000...00002 0.0000...00003 0.0000...00004 ... )` for control points, the first values are not useful at all. We’d just get a line along the x axis.

Instead of generating values for `y` in ascending order, we devise a list like `(0.0 1.0 0.5 0.75 0.25 ... )`. We’re successively bisecting the intervals (which is why the library method is called `binary_search`). We can stop at any time and we have a decent distribution. (And we can pipe dream that in the future, we can train our generator to favour points more likely to be useful to us.) That is much more useful!

```ruby
def binary_search seed1, seed2, &block
  bisection_block = block || lambda { |x, y| (x - y) / 2 + y }
  LazyList.cons(
    seed1,
    LazyList.new(
      seed2,
      delay {
        bisect(seed1, seed2, &bisection_block)
      }
    )
  )
end

# infinitely bisects a space, exclusive of the seed values
def bisect seed1, seed2, &block
  return EMPTY if seed1 == seed2
  half = block.call(seed1, seed2)
  LazyList.new(
    half,
    delay {
      merge(
        bisect(seed1, half, &block),
        bisect(half, seed2, &block)
      )
    }
  )
end
```
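If you don’t have a lazy list class handy, the same bisection ordering can be sketched with Ruby’s built-in `Enumerator`. This is an illustrative approximation using a breadth-first queue, so elements deep in the sequence may arrive in a slightly different order than the merge-based version above:

```ruby
# Sketch: yield lo, hi, then successive midpoints of the remaining
# intervals, breadth-first, so early elements cover the whole range.
def bisection_order(lo, hi)
  Enumerator.new do |out|
    out << lo
    out << hi
    queue = [[lo, hi]] # intervals still to bisect
    loop do
      a, b = queue.shift
      mid = (b - a) / 2.0 + a
      out << mid
      queue << [mid, b] << [a, mid] # upper half first, matching the text
    end
  end
end

bisection_order(0.0, 1.0).first(5) # => [0.0, 1.0, 0.5, 0.75, 0.25]
```

Because `Enumerator` is demand-driven, the infinite `loop` is harmless: only as many midpoints are computed as the caller pulls with `first` or `take`.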
And indeed, `LazyList.binary_search(0.0, 1.0) -> (0.0 1.0 0.5 0.75 0.25 ... )`.

What we need next is a list of `(x, y)` tuples, generated as the Cartesian Product of our binary search list with itself. If we use a typical breadth-first or depth-first algorithm, we’re sunk. We get `( (0.0 0.0) (1.0 0.0) (0.5 0.0) (0.75 0.0) (0.25 0.0) ... )`. That has a nice distribution along one axis but is still useless. What we need is a way of combining two infinite lists such that they give us a nice, useful sample when we take a finite number of elements off the front.

Breadth-first mapping from elements to integers:

| 1 | 2 | 3 |
|---|---|---|
| 4 | 5 | 6 |
| 7 | 8 | 9 |
Just as we solved the problem of generating useful `y` magnitudes by changing the order in which we selected rationals between `0.0` and `1.0`, what we need to do with Cartesian Products is select the products in an order that provides useful results.

Consider the diagonals of the table above. The first is 1, 5, 9. The second is 4, 8. The third is 2, 6. And so on. The advantage of this algorithm is that if given two infinite lists, it starts in the upper left-hand corner and supplies values, working its way right and down.2

The first four control points generated this way are `[{:x=>0.0, :y=>0.0}, {:x=>0.0, :y=>1.0}, {:x=>1.0, :y=>0.0}, {:x=>0.0, :y=>0.5}]`. And the next four: `[{:x=>1.0, :y=>1.0}, {:x=>1.0, :y=>0.5}, {:x=>0.5, :y=>0.0}, {:x=>0.0, :y=>0.25}]`. And the next eight: `[{:x=>0.5, :y=>0.5}, {:x=>0.5, :y=>0.25}, {:x=>0.5, :y=>1.0}, {:x=>1.0, :y=>0.25}, {:x=>0.25, :y=>0.25}, {:x=>0.25, :y=>0.75}, {:x=>0.25, :y=>0.0}, {:x=>0.0, :y=>0.75}]`.
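The diagonal traversal can be sketched in plain Ruby. This hypothetical `diagonal_product` takes indexable sequences (here, finite prefixes of the infinite lists), and beyond the first few elements its exact ordering diverges slightly from the `LazyList` implementation described in the text:

```ruby
# Sketch: walk a (conceptually infinite) grid diagonal by diagonal, so
# early elements of both inputs show up early in the output.
def diagonal_product(xs, ys)
  Enumerator.new do |out|
    (0..).each do |d|
      (0..d).each do |i|
        x, y = xs[i], ys[d - i]
        out << [x, y] if x && y # skip cells beyond a finite prefix
      end
    end
  end
end

magnitudes = [0.0, 1.0, 0.5, 0.75, 0.25] # first few binary-search values
diagonal_product(magnitudes, magnitudes).first(4)
# => [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 0.5]]
```

Those first four pairs line up with the first four control points shown above: early, well-spread values from both axes appear first.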
And I’m thinking about eternity
Some kind of ecstasy got a hold on me
—Bruce Cockburn, Wondering Where The Lions Are
Now that we can generate an infinite list of useful magnitudes, and we can combine infinite lists in a useful way, we can define an infinite list of sample curves. Here’s the exact definition of an infinite list of partial Béziers (there are only three control points because the origin of each Bézier is always the terminus of the previous Bézier in the curve we are generating):

```ruby
def sample_cubic_beziers
  magnitudes = LazyList.binary_search(0.0, 1.0)
  control_points = LazyList.cartesian_product(magnitudes, magnitudes) do |x, y|
    { :x => x, :y => y }
  end
  LazyList.cartesian_product(control_points, control_points, control_points) do |p1, p2, p3|
    {
      :p1 => p1,
      :p2 => p2,
      :p3 => p3
    }
  end
end
```
That’s really simple, and really easy to understand.3 The domain-specific code that defines a cubic Bézier is concise and readable. And the infrastructure code that handles combinatorics is shunted away out of sight, where it doesn’t clutter up our thinking.

To recap the problem: each data set is a collection of `(x, y)` tuples where `x` is time and `y` is a magnitude. The effort remaining in a sprint backlog is an example of this kind of data set. I need a function that computes the distance between any two data sets. I already have a function that computes the distance between a curve through the `xy` space and a data set, so if I build a lot of curves and then take the distance between each curve and each data set, I can find the distance between pairs of data sets by searching for the curve with the lowest total distance.

The curves and data sets are `ActiveRecord` models. So all I needed was some simple code that generates all of the possible combinations and then writes them to the database. It’s the kind of thing candidates write in job interviews where employers actually care about their stone cutting skills.
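That combination-generating chore might look something like the following sketch. The record shapes and ids are hypothetical, and the actual `ActiveRecord` persistence is elided:

```ruby
# Sketch: pair every data set with every other data set, and cross each
# pair with every candidate curve, producing rows ready to be saved.
# (Ids and hash shapes are illustrative; the real models differ.)
def distance_rows(data_set_ids, curve_ids)
  data_set_ids.combination(2).flat_map do |a, b|
    curve_ids.map do |curve_id|
      { :data_set_a => a, :data_set_b => b, :curve => curve_id }
    end
  end
end

distance_rows([1, 2, 3], [10, 20]).length # => 6
```

Each row could then be handed to something like a `create!` call on the appropriate model; that part is straightforward bookkeeping.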