Monday, April 30, 2012

Indentation Ventilation

I need to vent, so just ignore this article.

Your formatting makes me want to puke.

For fuck's sake, when I ask questions about JavaScript or C, I don't bust out

lorem ipsum(dolor, sit) { 
    amet = consectetur(); 
    adipiscing (elit) 
        { Nulla_quis_purus = ac;
          arcu = consequat(consectetur(vitae(convallis,lacus)));}
        Aliquam { erat volutpat;}}

So as a point of basic courtesy, please learn to indent things in a way that doesn't make it look like you just loaded your parens into a shotgun before firing them at your editor. It's not as though it's difficult. Lisp is made of s-expressions; the only places where indentation gets ambiguous are in extended loop forms, and maybe one or two edge cases with lets. Just take the three minutes required to read the standard style, and stop pretending that you're showing us paren-using savages the proper way to waste vertical space.


I need some sleep.

Sunday, April 29, 2012

Notes From the Borders of Erlang

This is going to be a pretty disjointed, Erlang-heavy article, since that's been the dominant piece of programming-related thought in my brain for the past week. It actually started a while back, when I got the unofficial heads-up that we'll soon be starting a new project at work which will call for super-massive transaction counts, require high reliability/uptime and be mostly server-based. That short-list tells me that the right tool for the job is probably a functional language that focuses on inter-process communication, and enforced isolation between components. Also, one of the big reasons I like my current company is that if I make a decision about what technology we're using, no one gets to tell me to fuck off.

Thus began the research...

There's the usual set of resources over there in the sidebar[1], but do also take the time to check out this Vimeo piece featuring Joe Armstrong. It won't really give you much insight into how to use the language, but it will show you a bit of the history and intent. Like I said, entirely worth it to hear the man talk, but here are the big points, as extracted by yours truly; he highlighted three things that were missing from Erlang[2], one big mistake, two not-too-bad ideas and three fairly nice ideas that the team had when developing the language. He noted that these are controversial, but I tend to agree with a pretty large number of his assessments. Then again, I'm the crazy motherfucker who regularly blogs about his experiences with Lisp, Smalltalk, Erlang and Ruby, so maybe I'm not the best person to gauge what a mainstream opinion is supposed to look like.

Three Missing Things

Hash Maps - JSON-style key/value data structures. Not just adding them to the system, but making them the fundamental data-type rather than tuples or arrays. I can see why, too; if you look at any tutorial or piece of Erlang code, you'll see things that fake key/value pairs using tuples. Things like {shopping_list, [{oranges, 3}, {apples, 4}, {bread, 1}]}, which would be better expressed as a JSON structure[3].

Higher Order Modules - code in Erlang is organized into modules, which is par for the course these days, but you can't programmatically introspect on them at runtime. Joe mentioned the example of being able to send a particular standardized message and getting back a list of messages supported by the target. I guess this could get built into the existing language piecemeal, by convention rather than specification. I'm imagining a situation where a given team agrees that they'll write all their modules to accept a help message which would return a list of the functions it provides and a specification of inputs they'd each accept. Thing is, 1. that wouldn't be a language-wide standard, and 2. it would take additional explicit work by the developers. If it was handled at the language level, everyone would have access to the same introspection facilities, and they'd be handled with no additional thought or deed on the developers' part.

The Ability to receive a fun - Erlang is a higher-order language, and you can send around function names whenever and wherever you damn well please, but apparently the built-in receive directive won't let you pass it an anonymous function. Ok, this isn't one you could solve with macros, but I'm not entirely sure it would be a good idea in the first place. The thing on the other end of the line isn't necessarily code you can trust, but it would certainly add more flexibility.

One Big Mistake

Lost Too Much Prolog - Joe's a big Prolog fan, which should come as no surprise to anyone who's read any Erlang tutorials, watched any Erlang talks, or indeed, written any Erlang code. I'm not qualified to comment, never having done anything approaching serious development in Prolog[4].

Two Not Too Bad Ideas

He gave this talk to an American audience, so he had to have a section with Good™ and Great™ ideas, though he would have preferred to be more modest about it. In deference to his preference, I'm keeping his intended titles.

Lightweight Processes Are Ok -

"... we've shown that you can do processes in the language, and we've shown there's no need for threads. Threads are intrinsically evil, and [shouldn't] be used. Threads were sort of this 'Oh my goodness, processes aren't efficient enough, so lets use this abomination to...' horrible things." - Joe Armstrong

For my part, I've got a half-written piece about cl-actors sitting in my drafts folder. It's a pretty good, lightweight implementation of the actor model built on top of bordeaux-threads. If you like Erlang-style message passing, do give it a shot, but it doesn't quite do what Erlang manages. The threading model means you can't expect to reliably spawn thousands of cl-actors on a typical machine. For comparison, the Pragmatic book has an example on pg 149/150 wherein Joe removes the built-in safety limit of 32 767 processes and has Erlang spawn 200 000 without breaking a sweat[5]. That seems like at least part of the story behind those mind-boggling benchmarks that you've all probably seen by now.

OTP Behaviours - The correct way to think of Behaviours, Joe says, is to consider them the process equivalent of higher-order functions. They formalize basic request patterns between processes letting individuals focus on the differences. I don't actually have enough experience with them yet, but if Joe's description is accurate, I can see them being very useful when constructing complex systems with a reliability requirement.

Three Fairly Nice Ideas

Bit Syntax - Is frequently useful when setting up low-level communications with non-Erlang processes, and reading files. Joe calls this out as the first of three very useful features, and it really is elegant. If you've never seen it, I encourage you to take a quick look. Short version: the notation they've set up gives you access to the same pattern matching facilities you can expect from the rest of the language, which in turn makes it very simple to decode and process binary data.

Formalized Inter-process Relationships - This is another feature that typical "Erlang-style" systems miss. They're useful as fuck when you're building multi-processing systems, but it seems like you could add them on later if you picked your primary primitives properly. The idea is that you can explicitly link various processes in certain ways. For instance, you can tell a group of processes to all fail if one of them fails, or you can tell a specific process to monitor another, restarting it in the event of an error.

Offensive Programming - He called it "non-defensive programming", but I like the negative name better. Offensive programming is the technique of programming only for the successful case, and letting any error take down the process involved (someone will be along to pick up the pieces and restart it shortly). That would sound crazy in your typical language, but starts looking like a good idea when your principal method of organization is a completely isolated process.


Aside from historical notes and tutorials, I've been looking at how I'd go about interfacing Erlang to other languages. The standard seems to be doing it the same way you'd interface different Erlang processes. Except that where Erlang nodes already know how to talk to each other, the protocol needs to be implemented manually for other languages. It works consistently whether you're talking to Python, Ruby, Common Lisp, Java or C[6]. All the languages I've taken a look at so far come with an established protocol to talk to Erlang in some way.

Here's a practical example that I'll actually end up refining for deployment later; a C-based interface to some very specific ImageMagick routines.

First, the Erlang communication functions[7]

/* erl_comm.c */

#include <unistd.h>

typedef unsigned char byte;

int read_cmd(byte *buff);
int write_cmd(byte *buff, int len);
int read_exact(byte *buff, int len);
int write_exact(byte *buff, int len);

/* Read one message: a two-byte big-endian length, then that many bytes. */
int read_cmd(byte *buff) {
  int len;
  if (read_exact(buff, 2) != 2) {
    return -1;
  }
  len = (buff[0] << 8) | buff[1];
  return read_exact(buff, len);
}

/* Write one message, prefixed with its two-byte big-endian length. */
int write_cmd(byte *buff, int len) {
  byte li;
  li = (len >> 8) & 0xff;
  write_exact(&li, 1);
  li = len & 0xff;
  write_exact(&li, 1);

  return write_exact(buff, len);
}

/* Read exactly len bytes from stdin. The caller's buffer needs room for
   len + 1 bytes, since the result is NUL-terminated. */
int read_exact(byte *buff, int len){
  int i, got=0;
  do {
    if ((i = read(0, buff+got, len-got)) <= 0) {
      return i;
    }
    got +=i;
  } while (got<len);
  buff[len] = '\0';
  return len;
}

/* Write exactly len bytes to stdout. */
int write_exact(byte *buff, int len) {
  int i, wrote = 0;
  do {
    if ((i = write(1, buff+wrote, len-wrote)) <= 0) {
      return i;
    }
    wrote += i;
  } while (wrote<len);
  return len;
}
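
If the framing isn't obvious from the read/write calls: each message on the wire is a two-byte, big-endian length followed by that many bytes of payload. Here's a minimal sketch of the same framing done on an in-memory buffer instead of stdin/stdout. The names frame_encode and frame_decode are mine, purely for illustration; they aren't part of erl_comm.c or of anything Erlang ships.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers (not part of erl_comm.c): the same two-byte
   big-endian length prefix, applied to an in-memory buffer. */

/* Writes header + payload into out. Returns the total bytes written,
   or 0 if the payload is too long for two bytes or out is too small. */
size_t frame_encode(const unsigned char *payload, size_t len,
                    unsigned char *out, size_t out_size) {
  assert(payload != NULL && out != NULL);
  if (len > 0xffff || out_size < len + 2) return 0;
  out[0] = (unsigned char)((len >> 8) & 0xff);  /* high byte first */
  out[1] = (unsigned char)(len & 0xff);
  memcpy(out + 2, payload, len);
  return len + 2;
}

/* Reads the header from in and copies the payload into payload.
   Returns the payload length, or -1 if the frame is truncated or the
   payload does not fit in the destination. */
long frame_decode(const unsigned char *in, size_t in_size,
                  unsigned char *payload, size_t payload_size) {
  size_t len;
  assert(in != NULL && payload != NULL);
  if (in_size < 2) return -1;
  len = ((size_t)in[0] << 8) | in[1];
  if (in_size < len + 2 || payload_size < len) return -1;
  memcpy(payload, in + 2, len);
  return (long)len;
}
```

Unlike read_cmd, the decode half takes the destination size and refuses frames that don't fit, which is probably worth stealing for the real thing.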

Next, the "driver". This is the bit that will actually end up being spawned and fed input by the Erlang process.

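/* driver.c */
#include <limits.h>
#include <libgen.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>

typedef unsigned char byte;

int read_cmd(byte *buff);
int write_cmd(byte *buff, int len);
int thumbnail(char *image_name, char *thumbnail_name);

/* Build "<directory of orig>/thumbnail.png", or return NULL if orig can't
   be resolved. The result lives in a static buffer so that it is still
   valid after the function returns. */
char *chop_path(char *orig) {
  static char thumb[PATH_MAX + 16];
  char buf[PATH_MAX + 1];
  char *dname;

  if (realpath(orig, buf) == NULL) {
    return NULL;
  }
  dname = dirname(buf);
  snprintf(thumb, sizeof thumb, "%s/thumbnail.png", dname);
  return thumb;
}

int main(){
  int result;
  byte buff[255]; /* incoming paths longer than 254 bytes would overflow this */
  char *thumb;

  while (read_cmd(buff) > 0) {
    thumb = chop_path((char *) buff);
    /* an unresolvable path is reported as 1 (could_not_read) */
    result = thumb ? thumbnail((char *) buff, thumb) : 1;
    buff[0] = result;
    write_cmd(buff, 1);
  }
  return 0;
}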

Then the actual function I'll be wanting to call[8]

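/* wand.c */
#include <stdio.h>
#include <stdlib.h>
#include <wand/MagickWand.h>

/* Report the error on stderr, clean up, and return ret from the caller. */
#define ThrowWandException(wand, ret) \
{ \
  char \
    *description; \
  ExceptionType \
    severity; \
  description=MagickGetException(wand,&severity); \
  (void) fprintf(stderr,"%s %s %lu %s\n",GetMagickModule(),description); \
  description=(char *) MagickRelinquishMemory(description); \
  wand=DestroyMagickWand(wand); \
  MagickWandTerminus(); \
  return ret; \
}

int thumbnail (char *image_name, char *thumbnail_name){

  MagickWand *magick_wand;
  MagickBooleanType status;

  /* Set up the library and allocate the wand before using it. */
  MagickWandGenesis();
  magick_wand = NewMagickWand();

  /* Read an image. */
  status=MagickReadImage(magick_wand, image_name);
  if (status == MagickFalse) ThrowWandException(magick_wand, 1);
  /* Turn the images into a thumbnail sequence. The resize call and its
     106x80 size follow the ImageMagick reference example (ImageMagick 6
     signature); treat the dimensions as an assumption. */
  MagickResetIterator(magick_wand);
  while (MagickNextImage(magick_wand) != MagickFalse)
    MagickResizeImage(magick_wand, 106, 80, LanczosFilter, 1.0);

  /* Write the image then destroy it. */
  status=MagickWriteImages(magick_wand, thumbnail_name, MagickTrue);
  if (status == MagickFalse) ThrowWandException(magick_wand, 2);
  magick_wand = DestroyMagickWand(magick_wand);
  MagickWandTerminus();
  return 0;
}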

and, finally, the actual calling Erlang module.

%% wand.erl
-module(wand).
-export([start/0, stop/0, restart/0]).
-export([thumbnail/1]).

start() ->
    spawn(fun() ->
                  register(wand, self()),
                  process_flag(trap_exit, true),
                  Port = open_port({spawn, "./wand"}, [{packet, 2}]),
                  loop(Port)
          end).

stop() -> wand ! stop.

restart() -> stop(), start().

thumbnail(Filename) -> call_port(Filename).

%% Send a request to the registered wand process and wait for its reply.
call_port(Msg) ->
    wand ! {call, self(), Msg},
    receive
        {wand, Result} ->
            Result
    end.

%% Relay calls to the port, one at a time, until stopped or the port dies.
loop(Port) ->
    receive
        {call, Caller, Msg} ->
            Port ! {self(), {command, Msg}},
            receive
                {Port, {data, Data}} ->
                    Caller ! {wand, decode(Data)}
            end,
            loop(Port);
        stop ->
            Port ! {self(), close},
            receive
                {Port, closed} ->
                    exit(normal)
            end;
        {'EXIT', Port, Reason} ->
            exit({port_terminated, Reason})
    end.

decode([0]) -> {ok, 0};
decode([1]) -> {error, could_not_read};
decode([2]) -> {error, could_not_write}.

Once all that is done, and compiled using gcc -o wand `pkg-config --cflags --libs MagickWand` wand.c erl_comm.c driver.c, I can call it from an Erlang process as if it were a native thumbnail generator.

Erlang R15B01 (erts-5.9.1) [source] [64-bit] [smp:4:4] [async-threads:0] [kernel-poll:false]
  Eshell V5.9.1  (abort with ^G)
1> c(wand).
2> wand:start().
3> wand:thumbnail("original.png").
{ok, 0}
4> wand:thumbnail("/home/inaimathi/Pictures/and-another.png").
{ok, 0}

You'll have to take my word for it, but those both generate the appropriate "thumbnail.png" file in the same directory as the specified images.

All of that looks pretty complicated, but it really isn't when you sit down and read it. If I had to break it out by time involved, it would look something like

  • 10% reading up on ImageMagick C interface
  • 5% reading up on Erlang C FFI (emphasis on the ports)
  • 5% writing it
  • 80% trying to figure out why the C component was segfaulting when assembling a thumbnail path (short answer: I didn't include everything I needed to)
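
Since the path assembly ate most of that time, here's a hedged sketch of one way to do it that avoids two easy mistakes: forgetting a header (so the compiler guesses at a function's return type) and handing back a pointer into a buffer that no longer exists. The helper name thumbnail_path is hypothetical, and this version is purely textual. It assumes you don't need realpath to resolve symlinks or relative segments, so the image doesn't have to exist for it to work.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "<directory of image>/thumbnail.png" into
   out, which the caller owns. Returns 0 on success, -1 if out is too
   small to hold the result. */
int thumbnail_path(const char *image, char *out, size_t out_size) {
  const char *slash;
  int n;
  assert(image != NULL && out != NULL);
  slash = strrchr(image, '/');
  if (slash == NULL) {
    /* no directory component: fall back to the current directory */
    n = snprintf(out, out_size, "./thumbnail.png");
  } else {
    /* copy everything before the last slash, then append the file name */
    n = snprintf(out, out_size, "%.*s/thumbnail.png",
                 (int)(slash - image), image);
  }
  /* snprintf reports the length it wanted; anything >= out_size was cut */
  return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Because the caller passes in the buffer and its size, there's no static state and no way to return a dangling pointer; truncation shows up as an error code instead of a segfault.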

As a parting note, having gone through the rat's nest that is pathname manipulation in C, I hereby promise to never again bitch about Lisp's pathname handling. Nothing like wading waist-deep in horse shit to remind you how good you've got it merely living within earshot of the stables.


1 - Though I'll admit, the Erlang section is pretty sparse compared to the rest of them

2 - That he'd put in if he had to do it again

3 - For the record, I'm trying really hard not to put on my Lisp hat and say something like "Mmmm, mmmm, this syntactic abstraction is fucking delicious! How's it working for you guys? Oh, you haven't had any?! That's a shame..." in an obnoxiously smug voice. It's difficult, and this footnote may count as a failure. Sorry.

4 - In fact the entirety of my related experience is the appropriate chapter from 7 Languages..., flipping through the Reasoned Schemer and the SICP lectures wherein Prolog is briefly implemented on top of Lisp. That link is to the playlist rather than the correct episode; it's been a while, and I no longer remember which it was specifically.

5 - This was reportedly on a 2.4GHz Celeron machine with a half-gig of RAM, so that was not a consequence of awesome hardware

6 - Actually, that's half true. There are three different ways to interface with a C program; you can do port-based communication with a custom protocol, you can call C natively at the risk of system collapse with errors, or you can implement the Erlang protocol and pretend to be an Erlang process for the purposes of interoperability. The other languages I've taken a look at do that last one, but you've got options if you're rolling your own

7 - Ripped bleeding from Programming Erlang. It won't change at all, regardless of what specific protocol I end up picking

8 - With thanks to the reference implementation from the ImageMagick team

Tuesday, April 17, 2012

cl-smtp vs Exchange server

For those of you just here for the easy, googlable answer, here's how to send an HTML email with cl-smtp:

(cl-smtp:send-email [server] [from] [to] [subject] 
                    [plaintext message, or possibly NIL] 
                    :html-message [HTML message])

Making sure to replace the things with square brackets, obviously. Passing nil instead of the mandatory message parameter causes all the clients I've tested with so far to automatically display your email as a standard HTML message.

Now then.

The documentation in the module itself follows the usual Common Lisp standards of being minimal, verging on nonexistent[1]. The best example I managed to find of sending an HTML-formatted email from cl-smtp can be seen here. The suggestion is to do

(cl-smtp:send-email
 +mail-server+ from to subject
 "<html><body>
      <h2>YES. THIS IS DOG.</h2>
      <img src=\"\" alt=\"A dog comically answering a phone\"/>
  </body></html>"
 :extra-headers '(("Content-type" "text/html; charset=\"iso-8859-1\"")))

And if you do that, it will seem to work unless you run into someone with a particularly configured Exchange server. You might be thinking[2] "Oh, fantastic, MS once again cocks up what should be a simple and straightforward task", but I'm not so sure. Let's take a look at the headers produced by using the :extra-headers approach above.

Subject: Serious Business
X-Mailer: cl-smtp(SBCL
Content-type: text/html; charset="iso-8859-1" ## the result of our option 
Mime-Version: 1.0
Content-type: text/plain; charset="UTF-8" ## the default cl-smtp header 

Now like I said, this seems to get interpreted as intended in most places. Notably, gmail, hotmail, yahoo mail, my company's Exchange server, and probably mailinator as well, all output the result of this multi-Content-type-headered email as text/html. The thing is, it seems fairly reasonable to parse this strictly and accept the last Content-type declaration rather than the most general. So I guess another way of saying it is "this won't work on a properly configured Exchange server".
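
To make the first-versus-last distinction concrete, here's a small sketch of a header lookup that can go either way. It's a toy, not a description of how Exchange or any other real client parses mail: find_header is a name I made up, and it assumes one header per line with no folding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>  /* strncasecmp (POSIX) */

/* Hypothetical helper: scans newline-separated headers for name
   (case-insensitive) and returns a pointer to the start of its value.
   With want_last == 0 the first match wins; otherwise the last one does.
   Returns NULL if the header never appears. */
const char *find_header(const char *headers, const char *name, int want_last) {
  size_t nlen;
  const char *line = headers, *found = NULL;
  assert(headers != NULL && name != NULL);
  nlen = strlen(name);
  while (*line != '\0') {
    if (strncasecmp(line, name, nlen) == 0 && line[nlen] == ':') {
      const char *value = line + nlen + 1;
      while (*value == ' ') value++;   /* skip padding after the colon */
      found = value;
      if (!want_last) return found;
    }
    line = strchr(line, '\n');         /* advance to the next line */
    if (line == NULL) break;
    line++;
  }
  return found;
}
```

Fed the headers above, the first-match policy lands on the text/html value and the last-match policy lands on text/plain, which is the whole disagreement in a nutshell.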

The actually working way of accomplishing this task is to use the built-in :html-message parameter

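(cl-smtp:send-email
 +mail-server+ from to subject
 "Ok, the HTML version of this email is totally impressive. Just trust me on this."
 :html-message "<html><body>
      <h2>YES. THIS IS DOG.</h2>
      <img src=\"\" alt=\"A dog comically answering a phone\"/>
  </body></html>")

If you don't want to send a plaintext message at all, it's possible[3] to pass nil as the message body

(cl-smtp:send-email
 +mail-server+ from to subject nil
 :html-message "<html><body>
      <h2>YES. THIS IS DOG.</h2>
      <img src=\"\" alt=\"A dog comically answering a phone\"/>
  </body></html>")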

Doing it this way causes cl-smtp to break your message up into a plaintext and HTML version. You then rely on a client showing its user the appropriate one depending on their context[4].

Subject: Serious Business
X-Mailer: cl-smtp(SBCL
Mime-Version: 1.0
Content-type: multipart/alternative;
Message-Id: blahblahblah

Content-type: text/plain; charset="UTF-8"
Content-Disposition: inline

Ok, the HTML version of this email is totally impressive. Just trust me on this.

Content-type: text/html; charset="UTF-8"
Content-Disposition: inline

<html><body><h2>YES. THIS IS DOG.</h2><img src="" alt="A dog comically answering a phone"/></body></html>



1 - Though it does show you a useful example of how to put an attachment in a sent email.

2 - As I did initially.

3 - Though probably not advisable in all cases.

4 - Which most seem to, but there are still one or two Exchange-server related hiccups for some users with particular versions of the software.

Thursday, April 12, 2012

Defusing Weapons of Mass Destruction

This is a complete non-sequitur thought that hit me recently. Ok, I guess what I mean is "non-sequitur assuming you're extrapolating from recent articles", but I promise it makes complete sense from my perspective. I've been keeping up with Google v Oracle recently, as well as keeping about a quarter of an eye on certain developments in Germany. I'm also dealing with some minor (thankfully non-software, so it at least makes sense) patent issues at my job, and to top it all off, the last Toronto Lisp User Group meeting included some discussion of patents for a novel way of doing natural language processing (no details, since they're still pending).

Oh, heads up I guess, I'm talking about software patents this time. Now you know. Also, let me preface by making the obvious statements:

  • I am not a lawyer
  • I don't play one on TV
  • I don't even play one on the internet
  • I'm a software developer, illustrator and regular Groklaw reader, and that's almost the entirety of my experience with the field

I'm also not even remotely the first person to think they're bad (and going to get a lot worse very shortly). Some people (mostly ones who own or are paid by extremely large software companies) think they're a good thing, but their arguments tend to boil down to "Patents exist, and we have them, so tough luck".

This is a thought exercise with the goal of defusing software patents. Of making them irrelevant to your actions as an inventive and commercial entity.

Let's imagine that this was your goal, and "you" are bigger than a lone, young lisper who likes to sit and write garbage at one fucking thirty AM for some bizarre reason. One way to do it would be to lobby governments for the abolition of patents. As various people point out, good luck with that. Legislative changes that slightly disadvantage existing powerful companies in order to fertilize the landscape for new competitors are bad at getting past any organization bureaucratic enough to call itself "government".

Another way would be to disregard patents utterly, and just do what you're doing. While that might do something good in large enough numbers, any individual entity has a pretty strong incentive to pay attention to legal concerns in the current landscape.

Another way is to make sure you never make enough money to be a juicy target and hope for the best. If there were enough inventors, a lot of them would probably be successful, by some metric, using this method.

I've been thinking of a particular way, and I'd like to pluck it out of my head to see what it looks like in the harsh-esque light of not-quite-day-yet. Essentially, what I've got in mind is a more aggressive take on the Open Invention Network. OIN works by pooling ownership of patents and granting them without fee to anyone that promises not to use their patents against Linux. As much as I like Linux, this is not a general software patent solution, but it looks like it might be tweaked into one.

Let's imagine a similar organization that wants to grab signatories. Except this organization's goal is not merely to protect a particular kernel project, but to kill software patents[1].

This organization wouldn't ask you not to use your software patents against Linux. It would establish a charter and ask that each signatory promise never to offensively use a software patent against any other signatory. It would ask that signatories promise not to engage in patent trolling tactics. It would also ask every signatory to promise that they would use their patents to defend any other signatory against outside patent claims. Because the point would be an obligation on each member, the organization itself wouldn't need to collect patents, it would merely need to ask for disclosures and document them. A company or individual with zero patents wouldn't need to be barred from entering, since the eventual goal is obsoleting them.

Now, let's step back for a second. Ignoring the fact that "don't troll", "don't use this patent offensively" and "come to the defense of other signatories" would be ridiculously difficult to express in legal terms[2], what we've got looks like a best-case scenario barring the invalidation of all software patents. This is an organization that lets small patent holders huddle for warmth against larger portfolios. It allows defensive use of these tools, but heavily discourages aggressive patent litigation against signatories[3]. As more companies sign, the defensive value of the collective gets larger, putting more pressure on new and existing companies alike to join in. If it got to the point where some large percentage of patents were in the hands of the collective, it would no longer be a viable strategy to threaten signatories with software patent litigation. Effectively, they'd become dead assets[4].

On the flipside, how would you fight such a collective? You could

  • train one of the big players on them before the defensive thicket was built up (at which point a victory would put the collective at square one, but probably invalidate a bunch of the attackers' patents too, if the Oracle case is any indication).
  • get a shell company to sign up and then get them to start using loopholes to troll outside companies and inventors (that would either precipitate the first situation, or undermine the reputation of the collective)
  • set up a bunch of smaller trolling companies and start pinging members of the collective to bleed them out. This seems like it would be the big problem to solve. You'd do it either by getting enough of the trolls on your side somehow, or by establishing some defensive mechanism to make it easier for signatories to shrug off attacks[5]

Of course, there are problems to deal with other than outright attacks. How do you fund something like this? The Graham angle would probably be to make it a start-up, but I don't think this sort of idea is a very good fit for the approach. You could charge dues, but every obstacle you place to membership means fewer signatories, means an over-all less effective defensive thicket of patents.

How do you avoid cannibalizing (in the best case) or the patent-equivalent of a WWI (in the worst case)? Remember, we're trying to defuse weapons of mass destruction here. If more than one of these collectives started up and started seeing success, it would be trivial for a malicious force to play them off against one another. There would need to be a policy mechanism in place to allow different collectives to cooperate with each other while still remaining defensible against such forces.

How do you ensure that we're not just setting up another recursion? Remember, patents and copyrights alike are ostensibly out there to encourage innovation. Or at least, they were. We're now pretty clearly getting to the point that these mechanisms themselves are bigger obstacles to innovation than any problem they supposedly solved. It's no good starting up a software patent collective, only to realize decades from now that the collective itself is strangling creative freedom amongst programmers.

It almost seems like we'd need a protocol rather than an organization. A formalized set of contracts that let two companies say "I won't patent-stab you, cross my heart/hope to fly. Also, I won't patent-stab anyone that implements a compatible agreement, and I promise to help you out". That seems to nail the funding/splintering issues and seems to allow motivation to play out naturally, but it raises new problems. How do you motivate disparate users of the protocol to stick up for each other? How do you prevent predatory companies from exploiting this system via trolling? How the hell do you define something like this in legal terms without opening enough loopholes for both Redmond and Cupertino to fit through?

I don't know the answers to any of these questions. It's thorny as fuck, and there would be big teething problems with any such arrangement, but it at least seems viable from this distance. I'll put some more thought and reading into it, I guess. After some sleep, obviously. Feel free to tell me how old/impossible/stupid/"ambitious" this idea is. Also feel free to share some ideas about how to get it rolling and restore sanity to the field.


1 - It could probably work on all patents, but I'm still working under the assumption that a certain portion of non-computational patents actually achieve the goal of protecting inventors. Feel free to convince me otherwise.

2 - Internationally or not

3 - As a side thought, it might be a good short-term idea to encourage patent trolls to join, since they have precisely the sort of just-legal-enough patents you'd need to start a defensive thicket. They're typically just after money, so paying members a certain amount per patent owned might not be an entirely stupid idea. The collective would just need to be careful not to set up a situation wherein they become the sole reason new patents are issued. That would be pretty freaking ironic.

4 - At which point the fight to dissolve them would get monumentally easier.

5 - Of course, then you get the additional takedown method, where an attacker can sneak an agent in and start provoking trolls in order to sap that defensive mechanism.