Friday, May 24, 2013

Truthy and Falsy vs Explicit

I've been thinking about this for a bit, and I'm honestly not sure I can get it straight while it's still in my head. You know what time it is.

In general, having Falsy things makes it easier to use the if statement, and not having them means getting more explicit code. With that in mind, here's the state of play across the spectrum.

Haskell

doesn't take your shit. If you want to use if instead of pattern matching or guards, you're damn well going to pass it a Bool.

GHCi, version 7.4.1: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Prelude> let yayNay val = if val then "Yay!" else "Nay!"
Prelude> yayNay False
"Nay!"
Prelude> yayNay []

<interactive>:4:8:
    Couldn't match expected type `Bool' with actual type `[a0]'
    In the first argument of `yayNay', namely `[]'
    In the expression: yayNay []
    In an equation for `it': it = yayNay []
Prelude> yayNay ""

<interactive>:5:8:
    Couldn't match expected type `Bool' with actual type `[Char]'
    In the first argument of `yayNay', namely `""'
    In the expression: yayNay ""
    In an equation for `it': it = yayNay ""
Prelude> yayNay 0

<interactive>:6:8:
    No instance for (Num Bool)
      arising from the literal `0'
    Possible fix: add an instance declaration for (Num Bool)
    In the first argument of `yayNay', namely `0'
    In the expression: yayNay 0
    In an equation for `it': it = yayNay 0
Prelude> 

If you want to check whether something is empty, rather than whether it's False, check for that (the Prelude already provides this as null, but it's a one-liner anyway).

empty [] = True
empty _ = False

or, better yet, just use pattern matching in the function you're writing instead of explicit if

foo [] = handleEmpty
foo (a:b:rest) = handleTwo
foo list = handleN

Finally, if you want to express "this computation can fail": be explicit about it and use a Maybe. Then handle the Nothing case, whether with a >>= or a case or a do. Haskell likes being explicit.

Python

At the entirely other end of the spectrum is Python. It'll take your shit, interpreting if in a Pythonic™© way to mean the appropriate empty check in some contexts.

>>> def yayNay(val):
...     if val:
...             return "Yay!"
...     else:
...             return "Nay!"
... 
>>> yayNay(False)
'Nay!'
>>> yayNay([])
'Nay!'
>>> yayNay({})
'Nay!'
>>> yayNay("")
'Nay!'
>>> yayNay(()) ## the empty tuple
'Nay!'
>>> yayNay(None)
'Nay!'
>>> yayNay(0)
'Nay!'
>>> 

I'm not sure how to feel about that. On the one hand, yes, empty checks are easier. On the other, empty checks usually aren't what you actually want[1], and I can't see another benefit of doing things this way. I also don't really understand why [] and 0 should be Falsy, but [0] should be Truthy. And it's not quite as simple as "empty values should be Falsy", either, because there are classes in the standard library for which this behavior would also make sense[2], but which lack it.

>>> import Queue
>>> Queue.Queue()
<Queue.Queue instance at 0x2837518>
>>> yayNay(Queue.Queue())
'Yay!'
>>> 

Despite those built-ins, it is possible to define your own Falsies. You can either define a __nonzero__ method for your class (renamed __bool__ in Python 3), which should return the explicit boolean value you want instances mapped to, or a __len__ method, in which case your class will be treated as Falsy whenever its len is zero.

>>> class Falsy(object):
...     def __nonzero__(self):
...             return False
... 
>>> yayNay(Falsy())
'Nay!'
>>> class ShittyStack(object):
...     def __init__(self):
...             self.stack = []
...     def __len__(self):
...             return len(self.stack)
...     def push(self, thing):
...             self.stack.append(thing)
...     def pop(self):
...             return self.stack.pop()
... 
>>> s = ShittyStack()
>>> yayNay(s)
'Nay!'
>>> s.push(1)
>>> yayNay(s)
'Yay!'
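
One wrinkle worth noting, since the two methods can coexist: __nonzero__ wins if you define both. A quick check, using the same yayNay as above:

>>> class Weird(object):
...     def __len__(self):
...             return 0
...     def __nonzero__(self):
...             return True
... 
>>> yayNay(Weird())
'Yay!'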

JavaScript

does almost the same thing as Python, but adds null, undefined and NaN to the list of Falsy values, and considers empty sequences other than the empty string Truthy. That is,

> // from the Chromium prompt
undefined
> function yayNay(val) { if (val) return "Yay!"; else return "Nay!"; }
undefined
> yayNay(0)
"Nay!"
> yayNay([])
"Yay!"
> yayNay({})
"Yay!"
> yayNay(undefined)
"Nay!"
> yayNay(null)
"Nay!"
> yayNay("")
"Nay!"
> yayNay(NaN)
"Nay!"

There are no tuples in JS, so it can't weigh in there. I have no idea what the reasoning is otherwise, though. I especially have no idea what would possess someone to make the empty array Truthy but the empty string Falsy, unless strings are actually implemented as linked lists underneath.

Common Lisp

is middle-of-the-road in this respect. It has a canonical t and nil as True/False, but nil also pulls double-duty as the empty list.

; SLIME 2012-09-04
CL-USER> (defun yay-nay (val) (if val :yay! :nay!))
YAY-NAY
CL-USER> (yay-nay 0)
:YAY!
CL-USER> (yay-nay (vector))
:YAY!
CL-USER> (yay-nay (make-hash-table))
:YAY!
CL-USER> (yay-nay (list))
:NAY!
CL-USER> (yay-nay '())
:NAY!
CL-USER> (yay-nay nil)
:NAY!
CL-USER> 

To a first approximation, it seems that the rationale here is "sequences that you're likely to interact with through recursion should be Falsy when empty". Except that breaks down when you think about it a bit, because the Common Lisp standard doesn't guarantee tail call optimization either, the way that, say, Scheme's does (plenty of implementations do it anyway, but you can't count on it).

Speaking of which...

Scheme

CHICKEN
(c)2008-2011 The Chicken Team
(c)2000-2007 Felix L. Winkelmann
Version 4.7.0 
linux-unix-gnu-x86-64 [ 64bit manyargs dload ptables ]
compiled 2011-09-05 on gladstone.duckburg.org (Linux)

#;1> (define (yay-nay val) (if val 'yay! 'nay!))
#;2> (yay-nay 0)
yay!
#;3> (yay-nay nil) ;; wait, fuck, that doesn't exist here

Error: unbound variable: nil

        Call history:

        <syntax>          (yay-nay nil)
        <eval>    (yay-nay nil) <--
#;3> (yay-nay '())
yay!
#;4> (yay-nay (list))
yay!
#;5> (yay-nay #f)
nay!
#;6> (yay-nay (null? (list)))
yay!
#;7> (yay-nay (not (list)))
nay!
#;8>

Scheme, or at least Chicken Scheme, doesn't treat anything but an actual #f as false (and that's what the standard mandates, not a Chicken quirk). Which is mildly bizarre, because the recursion logic would actually make sense here.

Clojure

seems to do its usual and split the difference between Common Lisp and Scheme.

REPL started; server listening on localhost port 40977
user=> (defn yay-nay [val] (if val :yay! :nay!))
#'user/yay-nay
user=> (yay-nay 0)
:yay!
user=> (yay-nay [])
:yay!
user=> (yay-nay "")
:yay!
user=> (yay-nay (list))
:yay!
user=> (yay-nay '())
:yay!
user=> (yay-nay nil)
:nay!
user=> (yay-nay false)
:nay!
user=> 

That is, empty sequences are all Truthy, false is obviously false, and nil is a Falsy value that doesn't double as the empty list.

Conclusion

Hah! You thought I was going to conclude something?

Non-Conclusion

Why is nil Falsy in Clojure? Why is [] Truthy but "" Falsy in JavaScript? Why, if a language has already decided to support empty sequences and values as Falsy, is a sequence composed of nothing but Falsy values Truthy? Why give this treatment only to some sequences and container values?

I haven't a fucking clue. And I guess I'll be asking pointed questions at the various language groups I attend periodically. Each of these languages has gotten me mileage, and I'm used to doing explicit empty checks, so Falsy empty values don't give me any sort of concrete advantage. Anything other than the Haskell approach seems arbitrary. Though to be fair, purely functional languages don't have to deal with the idea of pointers[3], which makes things easier.

There definitely isn't any kind of consensus on the issue, and I really wasn't expecting any, but it bothers me mightily that I can't even get my head around what the parameters of the problem are. It's not that we haven't decided whether Falsy or Explicit is better, it's that I'm not sure what factors would play into picking one over the other.

So much for peeling that one, I guess. I'll keep you posted if I have any more thoughts.


Footnotes

1 - [back] - And don't make much sense anyway since Tail Recursion is unpythonic for some odd-sounding non- or semi-reasons.

2 - [back] - For some values of make and sense.

3 - [back] - Or "names", since Python insists on renaming things.

Wednesday, May 22, 2013

Code Retreat - Sudoku

Dammit, Dann, this is getting to be a habit.

I haven't gotten to the bottom of the squares problem yet, and yesterday was the monthly Toronto Code Retreat, which means my time is effectively up thanks to this new problem we're being handed. I'm sure there's a canonical, three-line solution out there somewhere, but I didn't want to read up on it until after the event, so I'm coming at this from first principles. As a result, it won't be pretty or optimized.

The Problem

Near as I can tell, Sudoku is a Set problem. This isn't how I first heard the rules described, but it fits. A solved Sudoku board is one where

  • each number is filled in
  • each row is a set
  • each column is a set
  • each disjoint nxn square is a set (where n is the square root of the board size)

We'll call this "The Sudoku Property", or TSP for short.
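
To make that concrete, here's the property as a quick Python check (my after-the-fact sketch; none of the event code below uses it):

def is_solved(board):
    # The Sudoku Property: every row, column and block is exactly {1..n}
    n = len(board)
    b = int(n ** 0.5)
    full = set(range(1, n + 1))
    rows = [set(row) for row in board]
    cols = [set(col) for col in zip(*board)]
    blocks = [set(board[y][x]
                  for y in range(oy, oy + b)
                  for x in range(ox, ox + b))
              for oy in range(0, n, b)
              for ox in range(0, n, b)]
    return all(group == full for group in rows + cols + blocks)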

Now there are actually two problems: one is generating boards and one is solving them. Solving seems to be pretty straightforward:

  1. take a board
  2. optionally, fill in all obvious squares until there are no more (an obvious square is one where only one legal move exists)
  3. for each blank square, try each possibility

If it can't be done, you've got an unsolvable board. If it can, return the first solution you find[1]. That seems to be that.
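
Sketched in Python, skipping the optional obvious-square pass, and assuming a possibilities(board, x, y) helper that returns the legal values for a blank square (a Python version of which shows up a little further down):

def solve(board):
    # find the first blank square, try each legal value, recurse;
    # backtrack by re-blanking the square if nothing pans out
    for y, row in enumerate(board):
        for x, val in enumerate(row):
            if val == 0:
                for n in possibilities(board, x, y):  # assumed helper
                    board[y][x] = n
                    if solve(board):
                        return board
                board[y][x] = 0
                return None  # a blank with no legal moves: unsolvable branch
    return board  # no blanks left; this is a solution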

Generating is a bit of a different story. The naive solution is to generate a bunch of numbers and place them onto a board such that they respect TSP. Essentially, we're solving an empty board by starting with some random seeds. It really seems like this should work, but I can't convince myself that placing numbers into empty squares while respecting TSP will reliably avoid generating impossible boards. That is, an otherwise valid board which contains one or more empty squares whose set of possibilities is the empty set. It might be a better idea to start with nine sets and perform some sort of transformation on them until we get something that respects TSP, rather than building the board up piecemeal.
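
For what it's worth, here's a quick 4x4 illustration of the failure mode I'm worried about. Every number below was legal at the moment it was placed, but the blank at row 0, column 2 is already dead: its row uses {1, 2} and its column uses {3, 4}, which exhausts the candidates.

doomed = [[1, 2, 0, 0],  # the 0 at (0, 2) has no legal values left
          [0, 0, 3, 0],
          [0, 0, 4, 0],
          [0, 0, 0, 0]]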

The Event

We did this pretty casually. The problem as presented was to solve boards generated by this site, and not bother with the generation side of things. In practice, no group ever got to the point of a general brute-force Sudoku solver. Three groups managed to solve obvious boards (ones where it's possible to find a complete solution without guessing), and one managed to hook together a bogosort-style solution that could, in principle, if we left it running for a few years, solve an arbitrary 9x9 Sudoku board.

We went the usual three sessions, and I did my usual and used a different language each time. First up, we had a mostly-ideation session in which we ostensibly worked in Python. The only snippet I've got left over from that is

def makeBoard(regionWidth):
    side = regionWidth * regionWidth
    return [[0 for x in xrange(side)] for y in xrange(side)]

sample = [[0, 7, 1, 4, 0, 0, 0, 0, 5],
          [0, 0, 0, 0, 5, 0, 0, 8, 0],
          [0, 0, 3, 9, 0, 7, 6, 0, 0],
          [0, 0, 0, 0, 0, 1, 0, 0, 0],
          [0, 9, 0, 8, 0, 6, 0, 0, 3],
          [0, 0, 0, 0, 0, 0, 8, 2, 0],
          [0, 6, 0, 0, 4, 0, 7, 0, 8],
          [3, 0, 0, 0, 0, 0, 0, 9, 0],
          [0, 0, 0, 0, 8, 5, 0, 0, 0]]

def placements(board, n):
    """Return list of coords where `n` is a legal placement."""
    emptyRows = [i for i, row in enumerate(board) if not n in row]
    res = []
    for i in emptyRows:
        for j, col in enumerate(board[i]):
            if col == 0: res.append((i, j))
    return res

The idea would be to get a function which takes a board and an n, and finds out what the possible placements of n are. I'm not entirely sure how we would have proceeded if this had ended up working; I think we'd have had to hope really hard that at least one of the digits had only one possible position, which doesn't sound terribly likely in general. The fully implemented version of that function would also check for column and block presences, rather than just filtering rows poorly.
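
For reference, here's roughly what the per-square version would look like; this is the possibilities helper the solve sketch above assumes (written after the fact, not at the event):

def possibilities(board, x, y):
    # legal values for the square at column x, row y:
    # 1..n minus everything already in its row, column and block
    n = len(board)
    b = int(n ** 0.5)
    ox, oy = b * (x // b), b * (y // b)
    used = set(board[y])                     # row
    used.update(row[x] for row in board)     # column
    used.update(v for row in board[oy:oy+b]  # block
                for v in row[ox:ox+b])
    return [v for v in range(1, n + 1) if v not in used]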

For round two, I got into a group of three with two people who wanted to see Haskell in action. So, after some type-system snafus, we did this:

module Sudoku where

import Data.Set (Set(..), fromList, union, difference)
import Data.List (genericIndex, genericTake)

type Board = [[Int]]

sample :: Board
sample = [[0, 7, 1, 4, 0, 0, 0, 0, 5],
          [0, 0, 0, 0, 5, 0, 0, 8, 0],
          [0, 0, 3, 9, 0, 7, 6, 0, 0],
          [0, 0, 0, 0, 0, 1, 0, 0, 0],
          [0, 9, 0, 8, 0, 6, 0, 0, 3],
          [0, 0, 0, 0, 0, 0, 8, 2, 0],
          [0, 6, 0, 0, 4, 0, 7, 0, 8],
          [3, 0, 0, 0, 0, 0, 0, 9, 0],
          [0, 0, 0, 0, 8, 5, 0, 0, 0]]

row :: Board -> (Int, Int) -> Set Int
row board (x, y) = fromList $ genericIndex board y

col :: Board -> (Int, Int) -> Set Int
col board (x, y) = fromList $ map (flip genericIndex x) board

origin :: Int -> Int
-- origin n = 3 * fromIntegral $ floor $ n / 3
origin n
  | 3 > n = 0
  | 6 > n = 3
  | otherwise = 6

block :: Board -> (Int, Int) -> Set Int
block board (x, y) = fromList . concat . map (genericTake blockSize . drop ox) $ genericTake blockSize $ drop oy board
  where blockSize = 3
        ox = origin x
        oy = origin y
        
possibilities :: Board -> (Int, Int) -> Set Int
possibilities board (x, y) = difference (fromList [1..9]) b
  where a = union (row board (x, y)) $ col board (x, y)
        b = union a $ block board (x, y)

...which is also ugly as fuck. Let me zoom into two particular pieces for you

origin :: Int -> Int
-- origin n = 3 * fromIntegral $ floor $ n / 3
origin n
  | 3 > n = 0
  | 6 > n = 3
  | otherwise = 6

That commented line is what we wanted to do initially. Actually, what we wanted was just blockSize * (floor $ n/blockSize), except that doesn't typecheck: / belongs to the Fractional class, which Int doesn't instantiate. The integral division we were groping for is div, as in blockSize * (n `div` blockSize), but at the time we went with the ugly, 9x9-specific, guard-based approach instead.

block :: Board -> (Int, Int) -> Set Int
block board (x, y) = fromList . concat . map (genericTake blockSize . drop ox) $ genericTake blockSize $ drop oy board
  where blockSize = 3
        ox = origin x
        oy = origin y

Note that blockSize is specified as 3. Which means that this function is also specific to 9x9 Sudoku boards. We went this route for the same reasons as before; the type system doesn't like fromIntegral . sqrt $ length board because the conversion is on the wrong side (sqrt wants a Floating argument, so it would need to be something like round . sqrt . fromIntegral $ length board), and we didn't have enough time left to puzzle that out and convince GHCi that yes, this is really what we mean.

We also didn't really have enough time in one 40-minute session to write solve, what with chatting about Haskell and forming theories about the errors we were seeing, but the above is a reasonable-if-rushed implementation of the approach I was assuming we'd use.

The last session, my partner and I decided to have some fun with JavaScript. Except neither of us had node installed standalone, so we ended up using the Chromium JS console. Once we loaded up underscore, things went much faster than expected. We managed to put together a solution that works on any valid board size, and can solve boards where it doesn't need to guess.

var sample4x4 = [[1, 0, 3, 0],
                 [0, 4, 0, 2],
                 [0, 3, 4 ,0],
                 [4, 0, 2, 3]];

var sample9x9 = [[0, 7, 1, 4, 0, 0, 0, 0, 5],
                 [0, 0, 0, 0, 5, 0, 0, 8, 0],
                 [0, 0, 3, 9, 0, 7, 6, 0, 0],
                 [0, 0, 0, 0, 0, 1, 0, 0, 0],
                 [0, 9, 0, 8, 0, 6, 0, 0, 3],
                 [0, 0, 0, 0, 0, 0, 8, 2, 0],
                 [0, 6, 0, 0, 4, 0, 7, 0, 8],
                 [3, 0, 0, 0, 0, 0, 0, 9, 0],
                 [0, 0, 0, 0, 8, 5, 0, 0, 0]];

function possibilities (board, x, y) { 
    return _.difference(_.range(1, _.size(board[0]) + 1), 
                        row(board, x, y),
                        col(board, x, y),
                        block(board, x, y))
}

function row(board, x, y) {
    return board[y];
}

function col(board, x, y) {
    return _.map(board, function (row) { return row[x]; })
}

function block(board, x, y) {
    var blockSize = Math.sqrt(_.size(board[0]))
    var origin = function (n) { 
        return blockSize * Math.floor(n / blockSize)
    }
    var relevantRows = _.take(_.drop(board, origin(y)), blockSize);
    return _.reduce(relevantRows, function (memo, row) { 
        return _.union(memo, _.take(_.drop(row, origin(x)), blockSize)) 
    }, []);
}

function isFilled (board) { 
    return _.every(board, function (row) { return _.every(row, function (val) { return val != 0; }); })
}

function solve(board) {
    if (isFilled(board)) {
        return board;
    } else {
        return solve(_.map(board, function (row, y) { 
            return _.map(row, function (val, x) { 
                var poss = possibilities(board, x, y);
                if (val != 0) {
                    return val;
                } else if ((val == 0) && (_.size(poss) == 1)) {
                    return poss[0];
                } else {
                    console.log("UNSOLVED");
                    return 0;
                }
            })
        }));
    }
}

Easily the most annoying thing about writing quickly in JS after having gotten so used to the functional paradigm is having to return things explicitly. That bugs the crap out of me over in Python-land too, but at least Python makes up for it by saving you from lines like this

                }
            })
        }));
    }
}

I could do the Lisp-esque thing and start writing

                }})}));}}

but doubt anyone would appreciate that.

Further Reading

The Sudoku situation isn't quite as solved as I assumed before reading up on it. The appropriate Rosetta Code page doesn't have any solutions that leave me gobsmacked by elegance the way that the Clojure Game of Life did. Though the Prolog approach looks quite interesting. Also, the Haskellers apparently look at Sudoku-solver-writing as a hobby, so it has its own page over at the Haskell wiki. "Elegant" isn't something I'd say about any of these, but that's probably because my Math level isn't high enough.

The last note I'll leave you on is that brute-forcing apparently isn't the best solution here. Sudoku is an instance of something called the Exact cover problem, and a Marlboro College student named Sam Auciello wrote an implementation of that solution in Python (links to code files at the bottom of that page). I haven't quite gotten my head around this one yet, and that's probably what I'll be doing with my next few chunklets of spare time.


Footnotes

1 - [back] - (If you're doing this functionally and recursively, it seems like it would be fairly straight-forward. Side-effects complicate things slightly since you have to copy your board out at each decision step, or devise some other method of backtracking)

Thursday, May 2, 2013

How to implement a lazy sort

You can't implement a lazy sort.

Not really.

I mean, ok, yes, you can implement a sort that defers as much work as possible via that destructured approach I touched on last time and which I will expand on in a minute, but that's not a lazy sort.

It has certain characteristics which seem intrinsic to sorts that prevent it from behaving the way you'd expect an actual lazy function to behave. For starters, it doesn't save you any memory. In fact, you need to pull some fairly fancy footwork to write a functional, deferring sort that doesn't waste much more memory than the in-place version[1]. Next, it can't consume infinite sequences. That's to do with the definition of a sort; I can't see a good way of sorting an infinite sequence without backtracking, which isn't practical in the wild. Finally, it doesn't seem to save you as much time in composition as a regular lazy function would. It has to consume the entire relevant input stream, do some preliminary processing on it[2], then hand you a stream of outputs.

The end result is a sort function that defers a lot of its work, and does save you a lot of time if you only happen to want the first bit of a list sorted, and it even saves you a little bit of time if you want to run several lazy functions in succession on its output, but it doesn't quite do enough to get the label "lazy" without a massive asterisk next to it.

How to implement a lazy* sort

First up, here's the Python code.

from copy import copy

def eager(aList, pred=lambda a, b: a > b):
    arr = copy(aList)
    count = len(aList)
    heapify(arr, count, pred)
    end = count - 1
    while end > 0:
        __swap(arr, end, 0)
        end -= 1
        sift(arr, 0, end, pred)
    arr.reverse()
    return arr

def lazy(aList, pred=lambda a, b: a > b):
    arr = copy(aList)
    count = len(arr)
    heapify(arr, count, pred)
    end = count - 1
    while end > 0:
        yield arr[0]
        __swap(arr, end, 0)
        end -= 1
        sift(arr, 0, end, pred)
    yield arr[0]

def heapify(arr, count, pred=lambda a, b: a > b):
    start = (count - 1)/2
    while start >= 0:
        sift(arr, start, count-1, pred)
        start -= 1
    return arr

def sift(arr, start, end, pred=lambda a, b: a > b):
    root = start
    while root * 2 + 1 <= end:
        child = root * 2 + 1
        target = root
        if pred(arr[target], arr[child]):
            target = child
        if child+1 <= end and pred(arr[target], arr[child+1]):
            target = child + 1
        if not target == root:
            __swap(arr, root, target)
            root = target
        else:
            return arr

def __swap(arr, a, b):
    arr[a], arr[b] = arr[b], arr[a]

This is a direct translation of the Wikipedia Heapsort pseudo-code. And yes, I used Python because it's close enough to pseudo-code that I could do a working, line-by-line translation. It was really, really tempting to just implement eager as list(lazy(aList, pred)), but that wouldn't have told me what I wanted to know for the next bit.

>>> import heapsort, cProfile
>>> from random import Random
>>> sample = [Random().randint(0, 1000) for i in xrange(0, 5000)]
>>> cProfile.run("heapsort.eager(sample)")
         172098 function calls in 0.103 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.103    0.103 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 copy.py:113(_copy_with_constructor)
        1    0.000    0.000    0.000    0.000 copy.py:66(copy)
        1    0.002    0.002    0.013    0.013 heapsort.py:27(heapify)
   107567    0.014    0.000    0.014    0.000 heapsort.py:3(<lambda>)
        1    0.004    0.004    0.103    0.103 heapsort.py:3(eager)
     7499    0.069    0.000    0.097    0.000 heapsort.py:34(sift)
    57023    0.015    0.000    0.015    0.000 heapsort.py:49(__swap)
        1    0.000    0.000    0.000    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {method 'get' of 'dict' objects}
        1    0.000    0.000    0.000    0.000 {method 'reverse' of 'list' objects}


>>> cProfile.run("heapsort.lazy(sample)")
         3 function calls in 0.000 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 heapsort.py:15(lazy)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


>>> cProfile.run("list(heapsort.lazy(sample))")
         177097 function calls in 0.107 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001    0.107    0.107 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 copy.py:113(_copy_with_constructor)
        1    0.000    0.000    0.000    0.000 copy.py:66(copy)
   107567    0.015    0.000    0.015    0.000 heapsort.py:15(<lambda>)
     5001    0.004    0.000    0.106    0.000 heapsort.py:15(lazy)
        1    0.002    0.002    0.014    0.014 heapsort.py:27(heapify)
     7499    0.070    0.000    0.098    0.000 heapsort.py:34(sift)
    57023    0.015    0.000    0.015    0.000 heapsort.py:49(__swap)
        1    0.000    0.000    0.000    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {method 'get' of 'dict' objects}


>>> 

This is that work-deferring thing happening. The call to heapsort.lazy doesn't give you a sorted list; it gives you a generator you can traverse to get that list. The call to heapsort.eager does give you the whole sorted list, and takes very slightly less time than the lazy version if you happen to need the whole thing. As I said before, though: if you're only after the first 10% or so of the elements, there's no contest in terms of execution time, even if you're trying to be semi-functional by copying out the input instead of destructively modifying it.
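
To put that in code rather than profiler output, using the same sample list bound above:

>>> from itertools import islice
>>> first_ten = list(islice(heapsort.lazy(sample), 10))

That pays for the one-time heapify, then does ten pops' worth of sifting instead of five thousand.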

Oh, before anyone gets the wrong idea

>>> cProfile.run("sorted(sample)")
         3 function calls in 0.002 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.002    0.002 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.002    0.002    0.002    0.002 {sorted}


>>> 

My stupid little heapsort implementation isn't meant to showcase how slow Python is or anything like that. It's a learning exercise to show how you'd go about implementing a deferred sort in principle, not a dive into the nuts-and-bolts tuning process that comes once you've got your algorithm and data structures down. In other words, see it as a direct comparison of a shitty sort to the equivalent-except-lazy shitty sort.

Second verse, same as the first

Except with more parentheses. And I actually try to think through the problem rather than mindlessly parroting back an algorithm outline pilfered from Wikipedia.

(defpackage :heapsort (:use :cl))
(in-package :heapsort)

(defun heapsort-lazy (seq &optional (predicate #'>))
  (let* ((len (length seq))
         (heap-vector (heapify seq len predicate)))
    (lambda ()
      (decf len)
      (heappop! heap-vector len predicate))))

(defun heapsort-eager (seq &optional (predicate #'>))
  (let* ((len (length seq))
         (heap-vector (heapify seq len predicate)))
    (loop for i from (- len 1) downto 0
       collect (heappop! heap-vector i predicate))))

(defun heapify (seq len predicate)
  (loop with buf = (make-array len :adjustable t :fill-pointer len)
     for ix from 0
     for elem in seq do (setf (aref buf ix) elem)
     do (loop with i = ix
           while (> i 0)
           for parent = (heap-parent i)
           until (compare buf parent i predicate)
           do (swap! buf parent i)
           do (setf i parent))
     finally (return buf)))

(defun sift! (heap-vector start end predicate)
  (loop with ix = start
     until (> (+ 1 (* 2 ix)) end)
     while (loop for child-ix in (heap-children-descending heap-vector ix end predicate)
              when child-ix
              do (unless (compare heap-vector ix child-ix predicate)
                   (swap! heap-vector ix child-ix)
                   (setf ix child-ix)
                   (return t))
              finally (return nil))))

(defun heappop! (heap-vector last predicate)
  (swap! heap-vector 0 last)
  (sift! heap-vector 0 (- last 1) predicate)
  (vector-pop heap-vector))

(defun heap-children-descending (heap-vector ix bounds predicate)
  (let ((child-l (+ 1 (* 2 ix)))
        (child-r (+ 2 (* 2 ix))))
    (cond ((> child-l bounds) nil)
          ((> child-r bounds) (list child-l nil))
          (t (if (compare heap-vector child-l child-r predicate) 
                 (list child-l child-r)
                 (list child-r child-l))))))

(defun heap-parent (n)
  (- (ceiling (/ n 2)) 1))

(defun compare (arr ix-a ix-b &optional (predicate #'>))
  (funcall predicate (aref arr ix-a) (aref arr ix-b)))

(defun swap! (arr ix-a ix-b)
  (rotatef (aref arr ix-a) (aref arr ix-b)))

I'm not going to bother showing you the profiling on this one. Rest assured that the results on this and the Python version were very similar; the eager version is marginally faster than the lazy version at sorting the entire list and handing it to you, but has a massive disadvantage if you only want some small chunklet of the complete list. Also, the built-in sort beats both by several orders of magnitude.

For those of you who, like me, have never worked with heaps before[3], here's some basic theory. A heap is actually two things:

  • A tree-based data structure in which each parent node is ordered with respect to its children. This is the easier-than-sorted-but-still-useful property mentioned earlier; children aren't ordered with respect to each other, and if you're watching the wiki illustration for the first time ever, you might be forgiven for thinking that step 1 involves randomly re-arranging your input. It's very easy to pull out the next element; it's the root. However, every time you pop the root, you need to do some re-juggling to maintain the heap property.
  • A way of packing said tree-based data-structure into a 1-d array. It's not painfully obvious, so I figured I'd make this part explicit: you pack a heap into a vector by designating (aref vector (+ 1 (* i 2))) and (aref vector (+ 2 (* i 2))) to be the children of (aref vector i). This is faster than navigating an actual pointer tree, but it makes the structure of the code a bit counter-intuitive to the uninitiated, since it's talking about indices in non-obvious ways rather than talking about parents and children.
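
Since the rest of the code talks in terms of that index arithmetic, here it is stripped down to Python (just the navigation rule, nothing heapsort-specific):

def children(i):
    # the packed positions of node i's children
    return 2 * i + 1, 2 * i + 2

def parent(i):
    # inverse of children: where node i's parent lives
    return (i - 1) // 2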

Now then, most of the actual heapsort.lisp code is implementing a heap. Again, this is just for educational purposes; I'm sure there's a variable-predicate heap implementation floating around somewhere, even though I haven't looked for it[4]. In fact, let's take a look at the top-level functions before diving into that code, just to get it out of the way.

(defun heapsort-eager (seq &optional (predicate #'>))
  (let* ((len (length seq))
         (heap-vector (heapify seq len predicate)))
    (loop for i from (- len 1) downto 0
       collect (heappop! heap-vector i predicate))))

We take a list, heapify it, then collect heappop!ed elements and return the result. Nothing to see here, it's exactly what you'd expect from a sort.

The lazy version is mildly more interesting

(defun heapsort-lazy (seq &optional (predicate #'>))
  (let* ((len (length seq))
         (heap-vector (heapify seq len predicate)))
    (lambda ()
      (decf len)
      (heappop! heap-vector len predicate))))

Common Lisp doesn't have the notion of a generator in the same sense as Python, but a lambda with a closure around it does just as well for our purposes. You keep calling it to get at the next element, and it eventually throws an invalid-array-index-error that you need to deal with in some way. This actually seems like the most reasonable solution here; the alternative is something like

(defun heapsort-lazy (seq &optional (predicate #'>))
  (let* ((len (length seq))
         (heap-vector (heapify seq len predicate)))
    (lambda ()
      (if (>= 0 len)
          (values nil nil)
          (progn (decf len)
                 (values (heappop! heap-vector len predicate) t))))))

It's tempting to just return NIL, but then there's no way for a caller to disambiguate between "the next element in your sequence is NIL" and "there are no more elements in the sequence, so NIL". My kingdom for a Maybe monad, as annoying as most people seem to consider them.
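
For contrast, Python generators sidestep the ambiguity by signalling exhaustion out-of-band, which is most of what I'd want from that Maybe (a sketch, not part of heapsort.py):

def drain(gen):
    # NIL-style ambiguity can't happen here: "no more elements"
    # arrives as a StopIteration, not as a sentinel value
    items = []
    while True:
        try:
            items.append(gen.next())  # next(gen) in Python 3
        except StopIteration:
            return items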

Anyhow, onward.

(defun heapify (seq len predicate)
  (loop with buf = (make-array len :adjustable t :fill-pointer len)
     for ix from 0
     for elem in seq do (setf (aref buf ix) elem)
     do (loop with i = ix
           while (> i 0)
           for parent = (heap-parent i)
           until (compare buf parent i predicate)
           do (swap! buf parent i)
           do (setf i parent))
     finally (return buf)))

My definition of heapify doesn't use a call to sift! anywhere, in blatant defiance of the standard implementation. Really, I should have factored that middle bit out into heappush!, because that's what it amounts to. You start with an empty heap, insert new elements, and compare each new element to its parent, calling swap! until you have something that respects the Heap Property.
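
For the record, here's roughly what that factored-out heappush! would look like, sketched in Python with the same parent-comparison logic (heap is a plain list):

def heappush(heap, item, pred=lambda a, b: a > b):
    heap.append(item)            # new element starts at the bottom
    i = len(heap) - 1
    while i > 0:
        parent = (i - 1) // 2
        if pred(heap[parent], heap[i]):
            break                # parent already satisfies pred; done
        heap[i], heap[parent] = heap[parent], heap[i]
        i = parent

The Lisp version leans on swap!, which is up next.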

(defun swap! (arr ix-a ix-b)
  (rotatef (aref arr ix-a) (aref arr ix-b)))

swap! is implemented in terms of rotatef; it takes an array and two indices, and swaps the appropriate array cells. heap-parent shouldn't surprise you at all if you were paying attention when I explained what a heap actually is

(defun heap-parent (n)
  (- (ceiling (/ n 2)) 1))

And heappop! swaps the first element with the last, calls sift! on everything but the last element, then runs vector-pop to return the last element and shorten the vector.

(defun heappop! (heap-vector last predicate)
  (swap! heap-vector 0 last)
  (sift! heap-vector 0 (- last 1) predicate)
  (vector-pop heap-vector))

Which just leaves the sift! procedure, and its utility functions.

(defun sift! (heap-vector start end predicate)
  (loop with ix = start
     until (> (+ 1 (* 2 ix)) end)
     while (loop for child-ix in (heap-children-descending heap-vector ix end predicate)
              when child-ix
              do (unless (compare heap-vector ix child-ix predicate)
                   (swap! heap-vector ix child-ix)
                   (setf ix child-ix)
                   (return t))
              finally (return nil))))

It takes a start parameter, since the pseudo-code did the same, but I didn't find myself calling it with anything other than 0, so maybe that was a bit of a waste. To be fair, that pseudo also uses sift! as part of insertion, rather than doing the more straight-forward parent comparison, which might explain the difference. This is literally the most impenetrable part of this program, and it's crucial, because unless you understand this you won't get how the entire system produces sorted output. I'll take it slow, just in case. Feel free to stop reading here if you know this already.

  ...
  (loop with ix = start
  ...

We're starting at the beginning

     ...
     until (> (+ 1 (* 2 ix)) end)
     while [something-huge]
     ...

and going either until we get to the end of the heap, or until we get to the point where no more calls to swap! are needed. That makes sense because if we don't need to swap! further, and we started with a heap, we know that the rest of it already satisfies the Heap Property and therefore doesn't need to be heaped again. If we've gotten to the end, then we know that we just tried to sift! the smallest element in the heap, which is why it's at the bottom.

     while (loop for child-ix in (heap-children-descending heap-vector ix end predicate)
     ...

That's going to do something to each child-ix in the result of the heap-children-descending function.

(defun heap-children-descending (heap-vector ix bounds predicate)
  (let ((child-l (+ 1 (* 2 ix)))
        (child-r (+ 2 (* 2 ix))))
    (cond ((> child-l bounds) nil)
          ((> child-r bounds) (list child-l nil))
          (t (if (compare heap-vector child-l child-r predicate) 
                 (list child-l child-r)
                 (list child-r child-l))))))

This isn't in the pseudocode either; I ended up deciding to compare the children for size, so that the parent gets compared against the greater one first, which makes the Heap Property easier to preserve. The return value is a list of children in descending order, and we also handle the case where this particular parent only has one child. Oh, also, compare is just a utility function that helps me compare two array elements by index

(defun compare (arr ix-a ix-b &optional (predicate #'>))
  (funcall predicate (aref arr ix-a) (aref arr ix-b)))

Now then.

              ...
              when child-ix
              do (unless (compare heap-vector ix child-ix predicate)
                   (swap! heap-vector ix child-ix)
                   (setf ix child-ix)
                   (return t))
              finally (return nil))))
              ...

If there are any children, compare the parent to them in descending order. If one of the children beats the parent under the predicate, swap! them, set ix to the index of that child (which now contains our new parent), and return t (this tells the upper loop to keep comparing, since we now need to compare that new ix to its children). Otherwise, return nil to signal that we've achieved a new heap.
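
If it helps to see the same dance without the loop-within-a-loop, here's sift! restated in a few lines of Python (a paraphrase, not a line-by-line translation):

def sift(heap, start, end, pred=lambda a, b: a > b):
    ix = start
    while 2 * ix + 1 <= end:
        # find whichever of parent and in-bounds children wins under pred
        best = ix
        for child in (2 * ix + 1, 2 * ix + 2):
            if child <= end and pred(heap[child], heap[best]):
                best = child
        if best == ix:
            return               # parent beats both children; we have a heap
        heap[ix], heap[best] = heap[best], heap[ix]
        ix = best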

And that's that. As long as the Heap Property is respected, the next item by whatever predicate was passed is always one quasi-efficient heappop! away, and as a result, a call to heapsort-lazy never does any more work than it absolutely needs to while still providing flexible sorted output.

I learned a lot implementing this. Hopefully the write-up does something for someone else out there too.


Footnotes

1 - [back] - I don't, incidentally; both the Python and Common Lisp code I'm going to show you use generators. That is, side-effect-dependent lazy sequences with the restriction that you can only traverse them once, mostly as a result of those side-effects.

2 - [back] - I've been saying "heapsort", though that's strictly speaking not the case. What you need for a lazy* sort is an intermediate structure that

  • is easier to create than a full sort in the general case
  • has some property which makes it more useful than an arbitrary collection when it comes to finding next
A heap does both of those, and it's fairly easy to understand, so I picked it, but it's not necessarily the only or best approach. I just don't know any other ones.

3 - [back] - Graphic Design degree, remember? They didn't offer Data Structures 101, as much as I would have enjoyed that much more than Art History for Consumers.

4 - [back] - As a note, Python doesn't seem to have one either. The standard heapq doesn't provide a way of pulling a key out of the sorted structures, or a way to change out predicates. The standard Python wisdom seems to be pulling keys out yourself, storing your actual values in a way indexable by said keys, then sorting the keys instead.
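
That decorate-with-your-own-keys trick looks something like this (my toy example, not from the heapq docs):

import heapq

words = ["sternum", "aa", "glib"]
decorated = [(len(w), w) for w in words]  # the key rides along in a tuple
heapq.heapify(decorated)                  # tuples compare on the key first
shortest = heapq.heappop(decorated)[1]    # "aa"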