DeCode Da Code!

One of the things I have enjoyed over the past couple of years is playing with various small scale electronics projects.  I did the whole Arduino thing to get myself reengaged in the art of soldering, resistors, and blinking LEDs.  If you’ve ever looked at the ‘Duino’ world, you are sure to have come across all manner, size, and form factor of device that supports that development environment.  One of my favorites is the JeeNode by JeeLabs.

JeeLabs is exciting to me because the guy who runs it, Jean-Claude Wippler, is just a fantastic engineer.  He writes interesting application notes on an almost daily basis.  He does experiments to find out things like, can a capacitor power a node longer than a battery?  How long can a node be solar powered, and the like.

In general, JeeLabs is about “physical computing”.  Essentially, small ‘computers’ everywhere, as sensors, reporters, etc.

The other reason I like Jean-Claude is he goes off the beaten path.  For example, for the past few years, he’s been evolving a software platform based on the Tcl/Tk language!  He was using that language system because of its compactness, and some very good features he cared about.  Sounds very similar to my current Lua pursuit.  He ultimately gave up on Tcl because the language is stagnating, and not evolving as it should (long live LuaJIT!).

There was a recent post on JeeLabs entitled: Structured data  In said post, Jean-Claude outlines his desire to rely on ZeroMQ as his transport mechanism, and his agonizing over which data serialization mechanism to use.  He lays out the pros and cons of several, including XML, JSON, ASN.1, and netstrings, and a few others.  His source for comparison was the  wiki page: Comparison of data serialization formats.

Bencode is the serialization format supported by BitTorrent.  Seems good enough for that particular use case, so perhaps it’s good for others.  Considering Jean-Claude’s proclivity for creating fairly small energy efficient computing devices, I tend to believe he knows what he’s talking about, at least when it comes to code related to low power scenarios.

I thought I’d give Bencode a try myself, so I went casting about for a Lua implementation of the same.  I readily found some public domain code: https://bitbucket.org/wilhelmy/lua-bencode/src, authored by Moritz Whlhelmy.  The beauty of the bencode format is that it’s not very big at all.  Unlike JSON and XML, there’s just not a lot of parsing or validation to do there.  Perfect for cases where you’ve got some simple data structures, like Lua tables, and you want a wire representation without much fuss.

The public domain code works well enough.  There were some comments about performance, related to string concatenation, and the fact that it uses assert() and error(), so I improved the code, removing the usage of assert() and error(), and replacing the string concatenation with table inserts, with a final table.concat().  The code seems to work, and is as performant as the original.

But, there’s something in here, a pattern that I’ve seen over and over again, and have begun to implement in a different way.  The pattern is the basic switch/case that you typically find in C and other languages.  Lua does not have such a construct, so in cases where you would use this, you instead end up coding long chains of if/then/elseif/end blocks.  I thought, why not just associate the intended block with a key, in a table, and lookup the block using the key.  I’ve done this before when fiddling about with http parsing.

Here’s how I did the encoding in bencode:

-- recursively bencode x
local encode_funcs = {
  ["string"] = function(x)
    return #x .. ":" .. x
  end,

  ["number"] = function(x)
    if x % 1 ~= 0 then
      return nil, "number is not an integer: '" .. x .. "'"
    end

    return "i" .. x .. "e"
  end,

  ["table"] = function(x)
    local ret = {}
    if islist(x) then
      table.insert(ret, "l")
      for k, v in ipairs(x) do
        table.insert(ret, encode(v))
      end
      table.insert(ret,"e")
    else -- dictionary
      table.insert(ret,"d")
      -- bittorrent requires the keys to be sorted.
      local sortedkeys = {}
      for k, v in pairs(x) do
        if type(k) ~= "string" then
          return nil, "bencoding requires dictionary keys to be strings"
        end
        sortedkeys[#sortedkeys + 1] = k
      end
      sort(sortedkeys)

      for k, v in ipairs(sortedkeys) do
        table.insert(ret,encode(v))
        table.insert(ret,encode(x[v]))
      end
      table.insert(ret,"e")
    end
    return table.concat(ret)
  end,
}

function encode(x)
  local func = encode_funcs[type(x)]
  if not func then
    return nil, tx .. " cannot be converted to an acceptable type for bencoding"
  end
  return func(x)
end

What’s intended here is that based on the type of the argument passed in to the encode() function, the appropriate thing is supposed to happen. Well, the encode_funcs table simply contains the actions that are to occur, indexed by the name of the type. Given this table, just lookup the appropriate action, based on the type name. If the action is not found, then return nil. If it is found, then execute that function, passing in the value.

I think there’s an added benefit of gaining tail recursion here, which is handled nicely by Lua, so you don’t have to worry about overflowing a stack if the table structure is really really deep, but don’t quote me on that.

I really like this pattern. It’s much easier for my pea brain to understand than the long strands of logic found in those case statements. I can even collapse multiple states, by simply having table entries point to each other. At least for the cases where the number of states is fairly small, this seems like a nice sane way to program them. I like it because it leverages what Lua does really well (table lookups, functions as first class type), and offers up a simpler programming model for a scenario that has existed since the beginning of time.

I may or may not use bencode in my own work. It would be nicer if this encoded dealt with streams, rather than strings. Then it would become very interesting indeed.


One Comment on “DeCode Da Code!”

  1. Hexagon5un says:

    Yeah! I don’t really Lua, but something like what you describe here is my favorite way to make a case/switch statement in Python — make a dictionary with your cases as keys and the functions you’d like to call as values. Wrap it in a try/except if you need the default case. Instant state-machine.

    Thanks for the pointer to bencode.


Leave a comment