Isn’t the whole system just a database? – libdrm

Do enough programming, and everything looks like a database of one form or another. Case in point, when you want to get keyboard and mouse input, you first have to query the system to see which of the /dev/input/eventxxx devices you want to open for your particular needs. Yes, there are convenient shortcuts, but that’s beside the point.

Same goes with other devices in the system. This time around, I want to find the drm device which represents the graphics card in my system (from the libdrm perspective).

In LJIT2libudev there are already objects that make it convenient to enumerate all the devices in the system using a simple iterator:

local ctxt, err = require("UDVContext")()
for _, dev in ctxt:devices() do
  -- do something with each device
end

Well, that’s fine and all, but let’s get specific. To use the libdrm library, I very specifically need one of the active devices in the ‘drm’ subsystem. I could write this:

local ctxt, err = require("UDVContext")()

local function getActiveDrm()
  local function isActiveDrm(dev)
    if dev.IsInitialized and dev:getProperty("subsystem") == "drm" then
      return true;
    end

    return false;
  end

  for _, dev in ctxt:devices() do
    if isActiveDrm(dev) then
      return dev;
    end
  end

  return nil;
end

local device = getActiveDrm()

Yah, that would work. Then of course, when I want to change the criteria for finding the device I’m looking for, I would change up this code a bit. The core iterator is the key starting point at least. The ‘isActiveDrm()’ is a function which acts as a predicate to filter through the results, and only return the ones I want.

Since this is Lua though, and since there is a well thought out functional programming library already (luafun), this could be made even easier:

require("fun")()  -- import filter, head, each, etc. into the global namespace

local function isActiveDrm(dev)
  if dev.IsInitialized and dev:getProperty("subsystem") == "drm" then
    return true;
  end

  return false;
end

local device = head(filter(isActiveDrm, ctxt:devices()))
assert(device, "could not find active drm device")

In this case, we let the luafun ‘filter’ and ‘head’ functions do their job of dealing with the predicate, and taking the first one off the iterator that matches and returning it. Now, changing my criteria is fairly straightforward. Just change out the predicate, and done. This is kind of nice, particularly with Lua, because that predicate is just some code, and it could be generated at runtime because we’re in script, right?

So, how about this version:

-- File: IsActiveDrmDevice.lua
-- predicate to determine if a device is a DRM device and it's active
return function(dev)
  if dev.IsInitialized and dev:getProperty("subsystem") == "drm" then
    return true;
  end

  return false;
end

-- File: devices_where.lua
#!/usr/bin/env luajit

-- devices_where.lua
-- print devices in the system, filtered by a supplied predicate
-- generates output which is a valid lua table
package.path = package.path..";../?.lua"

local fun = require("fun")()
local utils = require("utils")

local ctxt, err = require("UDVContext")()
assert(ctxt ~= nil, "Error creating context")

if #arg < 1 then
  error("you must specify a predicate")
end

local predicate = require(arg[1])

each(utils.printDevice, filter(predicate, ctxt:devices()))


-- Actual usage from the command line
./devices_where.lua IsActiveDrmDevice

In this case, the ‘query’ has been generalized enough such that you can pass a predicate as a filename (minus the ‘.lua’). The code for the predicate will be compiled in, and used as the predicate for the filter() function. Well, that’s pretty nifty I think. And since the query itself again is just a bit of code, that can be changed on the fly as well. I can easily see a system where lua is the query language, and the entire machine is the database.
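Because a predicate is just a function, it can also be built from data at runtime. Here is a minimal sketch, assuming devices expose the `IsInitialized` field and `getProperty` method used above. The `where` helper, the `first` helper, and the mock devices are hypothetical, and are only here so the sketch runs without libudev:

```lua
-- Build a predicate from a table of required property values.
-- 'where' is a hypothetical helper, not part of LJIT2libudev or luafun.
local function where(criteria)
  return function(dev)
    if not dev.IsInitialized then
      return false
    end
    for name, expected in pairs(criteria) do
      if dev:getProperty(name) ~= expected then
        return false
      end
    end
    return true
  end
end

-- Mock device standing in for a real udev device object.
local function mockDevice(initialized, props)
  return {
    IsInitialized = initialized,
    getProperty = function(self, name) return props[name] end,
  }
end

local devices = {
  mockDevice(true,  { subsystem = "input", devname = "/dev/input/event0" }),
  mockDevice(false, { subsystem = "drm",   devname = "/dev/dri/card1" }),
  mockDevice(true,  { subsystem = "drm",   devname = "/dev/dri/card0" }),
}

-- Return the first device matching the predicate, or nil.
local function first(predicate, list)
  for _, dev in ipairs(list) do
    if predicate(dev) then
      return dev
    end
  end
  return nil
end

local isActiveDrm = where({ subsystem = "drm" })
local found = first(isActiveDrm, devices)
print(found and found:getProperty("devname"))
```

Changing the query then means changing a table, not a function body, which is about as close to a ‘where clause’ as it gets.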

The Tarantool database uses Lua (via LuaJIT) as its built-in application language, and I believe the luafun code is used there. Tarantool is not a system database, but the fact that it leans on Lua so heavily is interesting, and supports the case that Lua is a good language for doing some database work.

I have found that tackling the lowest level enumeration by putting a Lua iterator on top of it makes life a whole lot easier. With many of the libraries that you run across, they spend a fair amount of resources/code on trying to make things look like a database. In the case of libudev, there are functions for iterating their internal hash table of values, routines for creating ‘enumerators’ which are essentially queries, routines for getting properties, routines for turning properties into more accessible strings, routines for turning the ‘FLAGS’ property into individual values, and the like, and then there’s the memory management routines (ref, unref). A lot of that stuff either goes away, or is handled much more succinctly when you’re using a language such as Lua, or JavaScript, or Python, Ruby, whatever, as long as it’s modern, dynamic, and has decent enough higher level memory managed libraries.
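To make the ‘iterator on top of a low level enumeration’ idea concrete, here is a small sketch. It does not call libudev; the linked list is a plain Lua mock of the kind of first/next list a C library hands back, and the names `makeList`, `nextEntry`, and `entries` are made up for illustration:

```lua
-- A mock of a C-style linked list: each entry has a name, a value,
-- and a 'next' pointer (nil at the end of the list).
local function makeList(pairsArray)
  local head = nil
  for i = #pairsArray, 1, -1 do
    head = { name = pairsArray[i][1], value = pairsArray[i][2], next = head }
  end
  return head
end

-- Stateless iterator: the generic 'for' hands back the previous entry,
-- and we return the one that follows it.
local function nextEntry(first, current)
  local entry
  if current == nil then
    entry = first
  else
    entry = current.next
  end
  if entry == nil then
    return nil
  end
  return entry, entry.name, entry.value
end

local function entries(first)
  return nextEntry, first, nil
end

local props = makeList({
  { "SUBSYSTEM", "drm" },
  { "DEVNAME", "/dev/dri/card0" },
})

for _, name, value in entries(props) do
  print(name, value)
end
```

Once the enumeration is a regular Lua iterator, everything else (filtering, mapping, taking the first match) comes from the language or from a library like luafun, rather than from the C API.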

And thus, the whole system, from log files, to perf counters, to device lists, is a database, waiting to be harvested, and made readily available.

Functional Programming with libevdev

Previously, I wrote about the LuaJIT binding to libevdev: LJIT2libevdev – input device tracking on Linux

In that case, I went over the basics, and it all works out well, and you can get at keyboard and mouse events from Lua, with very little overhead.  Feels natural, lightweight iterators, life is good.

Recently, I wanted to go one step further though.  Once you go down the path of using iterators, you begin to think about functional programming.  At least that’s how I’m wired.  So, I’ve pushed it a bit further, and incorporated the fun.lua module to make things a bit more interesting.  fun.lua provides things like each, any, all, map, filter, and other stuff which makes dealing with streams of data really brainlessly simple.

Here’s an example of spewing out the stream of events for a device:


package.path = package.path..";../?.lua"

local EVEvent = require("EVEvent")
local dev = require("EVDevice")(arg[1])

local utils = require("utils")
local fun = require("fun")

-- print out the device particulars before 
-- printing out the stream of events
print("===== ===== =====")

-- perform the actual printing of the event
local function printEvent(ev)
    print(string.format("{'%s', '%s', %d};",ev:typeName(),ev:codeName(),ev:value()));
end

-- decide whether an event is interesting enough to 
-- print or not
local function isInteresting(ev)
	return ev:typeName() ~= "EV_SYN" and ev:typeName() ~= "EV_MSC"
end

-- convert from a raw 'struct input_event' to the EVEvent object
local function toEVEvent(rawev)
    return EVEvent(rawev)
end

fun.each(printEvent, 
    fun.filter(isInteresting, 
        fun.map(toEVEvent, dev:rawEvents())));

The last line there, where it starts with “fun.each”, can be read from the inside out.

dev:rawEvents() – is an iterator; it returns a steady stream of events coming off of the specified device.

fun.map(toEVEvent, …) – the map function will take some input, run it through the provided function, and send that as the output. In functional programming, this would be a transformation.

fun.filter(isInteresting, …) – filter takes an input, applies the specified predicate, and allows that item to pass through or not based on the predicate returning true or false.

fun.each(printEvent, …) – the ‘each’ function takes each of the items coming from the stream of items, and applies the specified function, in this case the printEvent function.

This is a typical pull chain. The events are pulled out of the lowest level iterator as they are needed by subsequent operations in the chain. If the chain stops, then nothing further is pulled.
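The pull behavior is easy to see with a toy chain. This sketch does not use luafun or libevdev; `map`, `filter`, `take`, and `each` below are simple closure-based stand-ins written for illustration (luafun’s own iterators are built differently), and the endless counter plays the role of the device:

```lua
-- Source: an endless counter that records how many items were pulled.
local pulled = 0
local function counter()
  local n = 0
  return function()
    n = n + 1
    pulled = pulled + 1
    return n
  end
end

-- Each stage wraps the previous one and only asks for an item
-- when it is asked for one itself.
local function map(fn, source)
  return function()
    local item = source()
    if item == nil then return nil end
    return fn(item)
  end
end

local function filter(predicate, source)
  return function()
    while true do
      local item = source()
      if item == nil then return nil end
      if predicate(item) then return item end
    end
  end
end

local function take(n, source)
  local taken = 0
  return function()
    if taken >= n then return nil end
    taken = taken + 1
    return source()
  end
end

local function each(fn, source)
  for item in source do
    fn(item)
  end
end

local results = {}
each(function(x) results[#results + 1] = x end,
  take(3,
    filter(function(x) return x % 2 == 0 end,
      map(function(x) return x * x end, counter()))))

-- results holds 4, 16, 36, and only 6 items were ever pulled from the source
print(table.concat(results, ", "), "pulled: " .. pulled)
```

The source is infinite, yet the program terminates, because nothing upstream runs unless something downstream asks for the next item.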

This is a great way to program because doing different things with the stream of events is simply a matter of replacing items in the chain. For example, if we just want the first 100 items, we could write

fun.each(printEvent,
    fun.take_n(100,
        fun.filter(isInteresting,
            fun.map(toEVEvent, dev:rawEvents()))));

You can create tees, send data elsewhere, construct new streams by combining streams. There are a few key operators, and with them you can do a ton of stuff.

At the very heart, the EVDevice:rawEvents() iterator looks like this:

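function EVDevice.rawEvents(self)
	local function iter_gen(params, flags)
		local flags = flags or ffi.C.LIBEVDEV_READ_FLAG_NORMAL;
		local ev = ffi.new("struct input_event");
		--local event = EVEvent(ev);
		local rc = 0;

		-- keep polling while the device has nothing to report
		repeat
			rc = evdev.libevdev_next_event(params.Handle, flags, ev);
		until rc ~= -libc.EAGAIN

		if rc == ffi.C.LIBEVDEV_READ_STATUS_SUCCESS or rc == ffi.C.LIBEVDEV_READ_STATUS_SYNC then
			return flags, ev;
		end

		return nil, rc;
	end

	-- no initial state; iter_gen falls back to LIBEVDEV_READ_FLAG_NORMAL
	return iter_gen, {Handle = self.Handle}, nil
end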

This is better than the previous version where we could pass in a predicate. In this case, we don’t even convert to the EVEvent object at this early low level stage, because we’re not sure what subsequent links in the chain will want to do, so we leave it out. This simplifies the code at the lowest levels, and makes it more composable, which is a desirable effect.

And so it goes. This binding started out as a simple ffi.cdef hack job, then objects were added to encapsulate some stuff, and now the iterators are being used in a functional programming style, which makes the whole thing that much more useful and integratable.

LJIT2pixman – Drawing on Linux

On the Linux OS, libpixman is at the heart of doing some low level drawing. It’s very simple stuff like compositing, rendering of gradients and the like. It takes up the slack where hardware acceleration doesn’t exist. So, graphics libraries such as Cairo leverage libpixman at the bottom to take care of the basics. LJIT2pixman is a little project that delivers a fairly decent LuaJIT binding to that library.

Here’s one demo of what it can do.


Of course, given the title of this post, you know LuaJIT is involved, so you can expect that there’s some way of doing this in LuaJIT.

package.path = package.path..";../?.lua"

local ffi = require("ffi")
local bit = require("bit")
local band = bit.band

local pixman = require("pixman")()
local pixlib = pixman.Lib_pixman;
local ENUM = ffi.C
local utils = require("utils")
local save_image = utils.save_image;

local function D2F(d) return (pixman_double_to_fixed(d)) end

local function main (argc, argv)

	local WIDTH = 400;
	local HEIGHT = 400;
	local TILE_SIZE = 25;

    local trans = ffi.new("pixman_transform_t", { {
	    { D2F (-1.96830), D2F (-1.82250), D2F (512.12250)},
	    { D2F (0.00000), D2F (-7.29000), D2F (1458.00000)},
	    { D2F (0.00000), D2F (-0.00911), D2F (0.59231)},
	}});

    local checkerboard = pixlib.pixman_image_create_bits (ENUM.PIXMAN_a8r8g8b8,
					     WIDTH, HEIGHT,
					     nil, 0);

    local destination = pixlib.pixman_image_create_bits (ENUM.PIXMAN_a8r8g8b8,
					    WIDTH, HEIGHT,
					    nil, 0);

    for i = 0, (HEIGHT / TILE_SIZE)-1  do
		for j = 0, (WIDTH / TILE_SIZE)-1 do
	    	local u = (j + 1) / (WIDTH / TILE_SIZE);
	    	local v = (i + 1) / (HEIGHT / TILE_SIZE);
	    	local black = ffi.new("pixman_color_t", { 0, 0, 0, 0xffff });
	    	local white = ffi.new("pixman_color_t", {
				v * 0xffff,
				u * 0xffff,
				(1 - u) * 0xffff,
				0xffff });
	    	local c = white;

	    	if (band(j, 1) ~= band(i, 1)) then
				c = black;
			end

	    	local fill = pixlib.pixman_image_create_solid_fill (c);

	    	pixlib.pixman_image_composite (ENUM.PIXMAN_OP_SRC, fill, nil, checkerboard,
				    0, 0, 0, 0, j * TILE_SIZE, i * TILE_SIZE,
				    TILE_SIZE, TILE_SIZE);
		end
	end

    pixlib.pixman_image_set_transform (checkerboard, trans);
    pixlib.pixman_image_set_filter (checkerboard, ENUM.PIXMAN_FILTER_BEST, nil, 0);
    pixlib.pixman_image_set_repeat (checkerboard, ENUM.PIXMAN_REPEAT_NONE);

    pixlib.pixman_image_composite (ENUM.PIXMAN_OP_SRC,
			    checkerboard, nil, destination,
			    0, 0, 0, 0, 0, 0,
			    WIDTH, HEIGHT);

	save_image (destination, "checkerboard.ppm");

    return true;
end

main(#arg, arg)

With a couple of exceptions, the code looks almost exactly like its C based counterpart. I actually think this is a very good thing, because you can rapidly prototype something from a C coding example, but have all the support and protection that a dynamic language such as lua provides.

And here’s another:


In this case, the conical-test.lua demo is doing the work.

package.path = package.path..";../?.lua"

local ffi = require("ffi")
local bit = require("bit")
local band = bit.band
local lshift, rshift = bit.lshift, bit.rshift

local pixman = require("pixman")()
local pixlib = pixman.Lib_pixman;
local ENUM = ffi.C
local utils = require("utils")
local save_image = utils.save_image;
local libc = require("libc")

local SIZE = 128
local NUM_GRADIENTS = 35
-- NOTE: the next four constants are missing from this listing; the values
-- here are assumed to match pixman's C demo (conical-test.c).
local GRADIENTS_PER_ROW = 7
local NUM_ROWS = math.floor((NUM_GRADIENTS + GRADIENTS_PER_ROW - 1) / GRADIENTS_PER_ROW)
local WIDTH = SIZE * GRADIENTS_PER_ROW
local HEIGHT = SIZE * NUM_ROWS

local function double_to_color(x)
    return (x*65536) - rshift( (x*65536), 16)
end

local function PIXMAN_STOP(offset,r,g,b,a)
   return ffi.new("pixman_gradient_stop_t", { pixman_double_to_fixed (offset),
	    {
		double_to_color (r),
		double_to_color (g),
		double_to_color (b),
		double_to_color (a)
	    }
	})
end

local stops = ffi.new("pixman_gradient_stop_t[4]",{
    PIXMAN_STOP (0.25,       1, 0, 0, 0.7),
    PIXMAN_STOP (0.5,        1, 1, 0, 0.7),
    PIXMAN_STOP (0.75,       0, 1, 0, 0.7),
    PIXMAN_STOP (1.0,        0, 0, 1, 0.7)
})

local  NUM_STOPS = (ffi.sizeof (stops) / ffi.sizeof (stops[0]))

local function create_conical (index)
    local c = ffi.new("pixman_point_fixed_t")
    c.x = pixman_double_to_fixed (0);
    c.y = pixman_double_to_fixed (0);

    local angle = (0.5 / NUM_GRADIENTS + index / NUM_GRADIENTS) * 720 - 180;

    return pixlib.pixman_image_create_conical_gradient (c, pixman_double_to_fixed (angle), stops, NUM_STOPS);
end

local function main (argc, argv)

    local transform = ffi.new("pixman_transform_t");

    local dest_img = pixlib.pixman_image_create_bits (ENUM.PIXMAN_a8r8g8b8,
					 WIDTH, HEIGHT,
					 nil, 0);
    utils.draw_checkerboard (dest_img, 25, 0xffaaaaaa, 0xff888888);

    pixlib.pixman_transform_init_identity (transform);

    pixlib.pixman_transform_translate (nil, transform,
				pixman_double_to_fixed (0.5),
				pixman_double_to_fixed (0.5));

    pixlib.pixman_transform_scale (nil, transform,
			    pixman_double_to_fixed (SIZE),
			    pixman_double_to_fixed (SIZE));
    pixlib.pixman_transform_translate (nil, transform,
				pixman_double_to_fixed (0.5),
				pixman_double_to_fixed (0.5));

    for i = 0, NUM_GRADIENTS-1 do
	   local column = i % GRADIENTS_PER_ROW;
	   -- integer division, as in the C original
	   local row = math.floor(i / GRADIENTS_PER_ROW);

	   local src_img = create_conical (i); 
	   pixlib.pixman_image_set_repeat (src_img, ENUM.PIXMAN_REPEAT_NORMAL);
	   pixlib.pixman_image_set_transform (src_img, transform);
	   pixlib.pixman_image_composite32 (
	       ENUM.PIXMAN_OP_OVER, src_img, nil,dest_img,
	       0, 0, 0, 0, column * SIZE, row * SIZE,
	       SIZE, SIZE);
	   pixlib.pixman_image_unref (src_img);
    end

    save_image (dest_img, "conical-test.ppm");

    pixlib.pixman_image_unref (dest_img);

    return true;
end

main(#arg, arg)


Linear Gradient demo


screen demo (transparency).
Perhaps this is the easiest one of all. All the interesting functions are placed into the global namespace, so they can be accessed easily, just like everything in C is globally available.

package.path = package.path..";../?.lua"

local ffi = require("ffi")
local bit = require("bit")
local band = bit.band
local lshift, rshift = bit.lshift, bit.rshift

local pixman = require("pixman")()
local pixlib = pixman.Lib_pixman;
local ENUM = ffi.C
local utils = require("utils")
local save_image = utils.save_image;
local libc = require("libc")

local function main (argc, argv)

    local WIDTH = 40
    local HEIGHT = 40
    local src1 = ffi.cast("uint32_t *", libc.malloc (WIDTH * HEIGHT * 4));
    local src2 = ffi.cast("uint32_t *", libc.malloc (WIDTH * HEIGHT * 4));
    local src3 = ffi.cast("uint32_t *", libc.malloc (WIDTH * HEIGHT * 4));
    local dest = ffi.cast("uint32_t *", libc.malloc (3 * WIDTH * 2 * HEIGHT * 4));

    -- three half-transparent solid color sources
    for i = 0, (WIDTH * HEIGHT)-1 do
	   src1[i] = 0x7ff00000;
	   src2[i] = 0x7f00ff00;
	   src3[i] = 0x7f0000ff;
    end

    -- clear the destination
    for i = 0, (3 * WIDTH * 2 * HEIGHT)-1 do
	   dest[i] = 0x0;
    end

    local simg1 = pixman_image_create_bits (ENUM.PIXMAN_a8r8g8b8, WIDTH, HEIGHT, src1, WIDTH * 4);
    local simg2 = pixman_image_create_bits (ENUM.PIXMAN_a8r8g8b8, WIDTH, HEIGHT, src2, WIDTH * 4);
    local simg3 = pixman_image_create_bits (ENUM.PIXMAN_a8r8g8b8, WIDTH, HEIGHT, src3, WIDTH * 4);
    local dimg  = pixman_image_create_bits (ENUM.PIXMAN_a8r8g8b8, 3 * WIDTH, 2 * HEIGHT, dest, 3 * WIDTH * 4);

    pixman_image_composite (ENUM.PIXMAN_OP_SCREEN, simg1, nil, dimg, 0, 0, 0, 0, WIDTH, HEIGHT / 4, WIDTH, HEIGHT);
    pixman_image_composite (ENUM.PIXMAN_OP_SCREEN, simg2, nil, dimg, 0, 0, 0, 0, (WIDTH/2), HEIGHT / 4 + HEIGHT / 2, WIDTH, HEIGHT);
    pixman_image_composite (ENUM.PIXMAN_OP_SCREEN, simg3, nil, dimg, 0, 0, 0, 0, (4 * WIDTH) / 3, HEIGHT, WIDTH, HEIGHT);

    save_image (dimg, "screen-test.ppm");
    return true;
end

main(#arg, arg)

I really like this style of rapid prototyping. The challenge I have otherwise is that it’s just too time consuming to consume things in their raw C form. Things like build systems, compiler versions, and other forms of magic always seem to get in the way. And if it’s not that stuff, it’s memory management, and figuring out the inevitable crashes.

Once you wrap a library up in a bit of lua goodness though, it becomes much more approachable. It may or may not be the most performant thing in the world, but you can worry about that later.

Having this style of rapid prototyping available saves tremendous amounts of time. Since you’re not wasting your time on the mundane (memory management, build system, compiler extensions and the like), you can spend much more time on doing mashups of the various pieces of technology at hand.

In this case it was libpixman. Previously I’ve tackled everything from hot plugging usb devices, to consuming keystrokes, and putting a window on the screen.

What’s next? Well, someone inquired as to whether I would be doing a Wayland binding. My response was essentially, “since I now have all the basics, there’s no need for wayland, I can just call what it was going to call directly.”

And so it goes. One more library in the toolbox, more interesting demos generated.

LJIT2libc – LuaJIT IS “batteries included”

I would say one of the most common complaints about any platform, framework, product is a paucity of available stuff to fiddle about with.  I’ve heard this criticism a few times leveled at the lua world in general.  Fatter frameworks are usually “batteries included”.  That’s great, when you get a whole ton of stuff that makes writing all sorts of apps relatively easy to do right off the bat.  The challenge of “batteries included” is that you get a whole ton of stuff, most of which you will never use, and some of which doesn’t have the best implementation.

Recently, I’ve been doing quite a lot of luajit bindings on the Linux platform:

If you were into anything from ASCII art graphics drivers to raw frame buffer management in a Linux system, these might start to look like batteries.  They’re not included with the luajit compiler, but they’re relatively easy to add.

But, there’s another one that I’ve been playing with recently which I think is even better:

Luajit is typically built and compiled against the libc and libm libraries.  As such, being able to access routines within those libraries comes for ‘free’ from within luajit, courtesy of the ffi capabilities.  This is very interesting because it means these routines, which are already on your system, are in fact just a stone’s throw away from being available within your app.

Let’s imagine you wanted to write some script like this, using the libc provided ‘puts()’ function:

puts("Hello, World!");

Well, the lua language has its own print routine, so this is a bit contrived, but let’s say you wanted to do it anyway. Well, to make this work, you need access to the function signature for the ‘puts’ routine and you need that to be accessible in the global namespace. So, it would actually look like this:

local ffi = require("ffi")

ffi.cdef("int puts(const char *);");
puts = ffi.C.puts;

puts("Hello, World!")

Great. A bit of work, and now I can program as recklessly as I could in C with all the batteries included in the libc and libm libraries. But, it gets tedious having to write that ffi stuff all over the place, so this is where the LJIT2libc thing comes in. It basically goes and implements a ton of the required ffi.cdef statements to make all those functions readily available to your script. An easy way to pull it into any script is to do the following:

local init = require("test_setup")()

puts("Hello, World!");

That first line ‘init = …’ in this case pulls in all of the definitions, and puts them into the global namespace, so that you can simply write your program without having to know anything about the ffi, or where functions come from or anything like that. Just start writing your code knowing that the batteries are already included.
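The ‘put it all in the global namespace’ step is not magic. Here is a rough sketch of the idea; this is not the actual test_setup code, and the `export` helper is a made-up name. The ffi part only runs when LuaJIT’s ffi module is available:

```lua
-- Copy every entry of 'lib' into 'target' (the global table by default).
local function export(lib, target)
  target = target or _G
  for name, value in pairs(lib) do
    target[name] = value
  end
  return target
end

-- Try it against a private table first, so nothing leaks into _G.
local sandbox = export({ greet = function(who) return "Hello, " .. who end }, {})
print(sandbox.greet("World"))

-- With LuaJIT available, the same helper can publish ffi functions.
local hasFFI, ffi = pcall(require, "ffi")
if hasFFI then
  ffi.cdef("int puts(const char *);")
  export({ puts = ffi.C.puts })
  puts("Hello, World!")
end
```

Passing an explicit target table is a handy escape hatch for when polluting the global namespace is not what you want.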

Now, this example might seem too trivial for words, but just think about what’s in a typical libc library. It’s all the file system, sockets, system calls, math, random numbers, everything you’re likely to need to create higher level application stuff. It’s a lot of stuff that other systems end up creating from scratch, as part of their ‘batteries’.

Of course this is trivializing what’s happening in those batteries, because what you’re getting here is the raw C implementations of things, with all the headaches and dangers that are typically associated with writing code at that level. But, for those who want to have access to that level of code, and apply the safety net of lua where they see fit, this is quite a useful tool I think.

In short, batteries ARE included with every luajit. Those batteries might be just anode, cathode, and electrolyte, without the shell, but there are the raw ingredients. Having these wrappers available just makes it that much easier to think about and deal with a lot of low level stuff where I might previously have had to resort to some sort of ‘package’ to achieve something as simple as traversing the file system.

So there you have it. luajit is a ‘batteries included’ system.

Fast Apps, Microsoft Style


That’s what I exclaimed at least a couple of times this morning as I sat at a table in a makeshift “team room” in building 43 at Microsoft’s Redmond campus. What was the exclamation for? Well, over the past 3 months, I’ve been working on a quick strike project with a new team, and today we finally announced our “Public Preview“.  Or, if you want to get right to the product: Cloud App Discovery

I’m not a PM or marketing type, so it’s best to go and read the announcement for yourself if you want to get the official spiel on the project.  Here, I want to write a bit about the experience of coming up with a project, in short order, in the new Microsoft.

It all started back in January for me.  I was just coming off another project, and casting about for the next hardest ‘mission impossible’ to jump on.  I had a brief conversation with a dev manager who posed the question; “Is it possible to reestablish the ‘perimeter’ for IT guys in this world of cloud computing”?  An intriguing question.  The basic problem was, if you go to a lot of IT guys, they can barely tell you how many of the people within their corporation are using, let alone DropBox from a cafe in Singapore.  Forget the notion of even trying to control such access.  The corporate ‘firewall’ is almost nothing more than a quartz space heater at this point, preventing very little, and knowing about even less.

So, with that question in mind, we laid out 3 phases of development.  Actually, they were already laid out before I joined the party (by a couple of weeks), so I just heard the pitch.  It was simple, the first phase of development is to see if we can capture network traffic, using various means, and project it up to the  cloud where we could use some machine learning to give an admin a view of what’s going on.

Conveniently sidestepping any objections actual employees might have with this notion, I got to thinking on how it could be done.

For my part, we wanted to have something sitting on the client machine (a windows machine that the user is using), which will inspect all network traffic coming and going, and generate some reports to be sent up to the cloud.  Keep in mind, this is all consented activity, the employee gets to opt in to being monitored in this way.  All in the open and up front.

At the lowest level, my first inclination was to use a raw socket to create a packet sniffer, but Windows has a much better solution these days, built for exactly this purpose.  The Windows Filtering Platform (WFP) allows you to create a ‘filter’ which you can configure to call out to a function whenever there is traffic.  My close teammate implemented that piece, and suddenly we had a handle on network packets.

We fairly quickly decided on an interface between that low level packet sniffing, and the higher level processor.  It’s as easy as this:


int WriteBytes(char *buff, int bufflen);
int ReadBytes(char *buff, int bufflen, int &bytesRead);

I’m paraphrasing a bit, but it really is that simple. What’s it do? Well, the fairly raw network packets are sent into ‘WriteBytes’, some processing is done, and a ‘report’ becomes available through ‘ReadBytes’. The reports are a JSON formatted string which then gets turned into the appropriate thing to be sent up to the cloud.
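The shape of that interface can be sketched in Lua (the real code was C++). Everything in this sketch is hypothetical: the `Processor` name, the ‘processing’ (which is just byte counting), and the report format. It only shows the write-raw-bytes-in, read-reports-out contract:

```lua
local Processor = {}
Processor.__index = Processor

function Processor.new()
  return setmetatable({ pending = 0, reports = {} }, Processor)
end

-- Accept a chunk of raw bytes; returns the number of bytes consumed.
function Processor:writeBytes(buff)
  self.pending = self.pending + #buff
  -- Stand-in for real parsing: emit a report for every 16 bytes seen.
  while self.pending >= 16 do
    self.pending = self.pending - 16
    self.reports[#self.reports + 1] = '{"bytes":16}'
  end
  return #buff
end

-- Return the next available report, or nil if none is ready.
function Processor:readBytes()
  return table.remove(self.reports, 1)
end

local p = Processor.new()
p:writeBytes(string.rep("x", 10))
print(p:readBytes())   -- no report is ready yet
p:writeBytes(string.rep("x", 10))
print(p:readBytes())
```

The nice property of such a narrow interface is that the packet capture side and the parsing side can be developed and tested independently.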

The time it took from hearing about the basic product idea, to a prototype of this thing was about 3 weeks.

What do I do once I get network packets? Well, the network packets represent a multiplexed stream of packets, as if I were a NIC. All incoming, outgoing, all TCP ports. Once I receive some bytes, I have to turn it back into individual streams, then start doing some ‘parsing’. Right now we handle http and TLS. For http, I do full http parsing, separating out headers, reading bodies, and the like. I did that by leveraging the http parsing work I had done for TINN already. I used C++ in this case, but it’s all relatively the same.

TLS is a different story. At this ‘discovery’ phase, it was more about simple parsing. So, reading the record layer, decoding client_hello and server_hello, certificate, and the like. This gave me a chance to implement TLS processing using C++ instead of Lua. One of the core components that I leveraged was the byte order aware streams that I had developed for TINN. That really is the crux of most network protocol handling. If you can make heads or tails of what the various RFCs are saying, it usually comes down to doing some simple serialization, but getting the byte ordering right is the hardest part. 24-bit big endian integers?
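To give a flavor of that byte order work, here is a small Lua sketch (not the C++ or TINN code) that reads big endian integers of any width and uses them on the start of a TLS handshake record: a 5 byte record header (content type, version, 16-bit length) followed by a handshake header (message type, 24-bit length). The helper names are made up for illustration:

```lua
-- Read an unsigned big-endian integer of 'nbytes' bytes from string 's',
-- starting at 1-based 'offset'. Returns the value and the next offset.
local function readUIntBE(s, offset, nbytes)
  assert(offset + nbytes - 1 <= #s, "not enough data")
  local value = 0
  for i = 0, nbytes - 1 do
    value = value * 256 + s:byte(offset + i)
  end
  return value, offset + nbytes
end

-- Parse the TLS record header followed by the handshake header.
local function parseHandshakeHeader(s)
  local rec, hs, pos = {}, {}, 1
  rec.contentType, pos = readUIntBE(s, pos, 1)
  rec.version, pos     = readUIntBE(s, pos, 2)
  rec.length, pos      = readUIntBE(s, pos, 2)
  hs.msgType, pos      = readUIntBE(s, pos, 1)
  hs.length, pos       = readUIntBE(s, pos, 3)   -- the 24-bit big endian integer
  return rec, hs
end

-- Record: handshake (22), version 0x0301, length 0x0105;
-- handshake: client_hello (1), length 0x000101.
local bytes = string.char(22, 0x03, 0x01, 0x01, 0x05, 1, 0x00, 0x01, 0x01)
local rec, hs = parseHandshakeHeader(bytes)
print(rec.contentType, rec.version, rec.length, hs.msgType, hs.length)
```

A single reader parameterized by width covers the 8, 16, 24, and 32-bit cases, which is most of what the record and handshake layers need.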

At any rate, http parsing, fairly quick. TLS client_hello, fast enough, although properly handling the extensions took a bit of time. At this point, we’d be a couple months in, and our first partners get to start kicking the tires.

For such a project, it’s very critical that real world customers are involved really early, almost sitting in our design meetings. They course corrected us, and told us what was truly important and annoying about what we were doing, right from day one.

From the feedback, it becomes clear that getting more information, like the amount of traffic flowing through the pipes is as interesting as the meta information, so getting the full support for flows becomes a higher priority. For the regular http traffic, no problem. The TLS becomes a bit more interesting. In order to deal with that correctly, it becomes necessary to suck in more of the TLS implementation. Read the server_hello, and the certificate information. Well, if you’re going to read in the cert, you might as well get the subject common name out so you can use that bit of meta information. Now comes ASN.1 (DER) parsing, and x509 parsing. That code took about 2 weeks, working “nights and weekends” while the other stuff was going on. It took a good couple of weeks not to integrate, but to write enough test cases, with real live data, to ensure that it was actually working correctly.
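The first step of any DER parser is reading a tag and a length. As a minimal sketch in Lua (again, not the C++ code from the project), assuming single-byte tags only: a length byte below 0x80 is the length itself, otherwise its low 7 bits say how many big endian length bytes follow. The `readDerHeader` name is made up for illustration:

```lua
-- Decode one DER tag/length header from string 's' at 1-based 'offset'.
-- Returns tag byte, content length, and the offset where the content starts.
-- Only single-byte tags are handled.
local function readDerHeader(s, offset)
  offset = offset or 1
  local tag = s:byte(offset)
  local first = s:byte(offset + 1)
  assert(tag and first, "truncated header")
  if first < 0x80 then
    return tag, first, offset + 2          -- short form
  end
  local count = first - 0x80               -- number of length bytes
  assert(count > 0, "indefinite length is not valid DER")
  assert(offset + 1 + count <= #s, "truncated length")
  local length = 0
  for i = 1, count do
    length = length * 256 + s:byte(offset + 1 + i)
  end
  return tag, length, offset + 2 + count
end

-- 0x30 = SEQUENCE, 0x82 = long form with two length bytes, 0x010A = 266
local tag, length, contentStart = readDerHeader(string.char(0x30, 0x82, 0x01, 0x0A))
print(string.format("tag=0x%02X length=%d content starts at %d", tag, length, contentStart))
```

Walking a certificate is then a matter of calling this repeatedly and recursing into constructed types such as SEQUENCE, which is where the real test cases earn their keep.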

The last month was largely a lot of testing, making sure corner cases were dealt with and the like. As the client code is actually deployed to a bunch of machines, it really needed to be rock solid, no memory leaks, no excessive resource utilization, no CPU spiking, just unobtrusive, quietly getting the job done.

So, that’s what it does.

Now, I’ve shipped at Microsoft for numerous years. The fastest cycles I’ve usually dealt with are on the order of 3 months. That’s usually for a product that’s fairly mature, has plenty of engineering system support, and a well laid out roadmap. Really you’re just turning the crank on an already laid out plan.

This AppDiscovery project has been a bit different. It did not start out with a plan that had a 6 month planning cycle in front of it. It was a hunch that we could deliver customer value by implementing something that was challenging enough, but achievable, in a short amount of time.

So, how is this different than Microsoft of yore? Well, yes, we’ve always been ‘customer focused’, but this is to the extreme. I’ve never had customers this involved in what I was doing this early in the development cycle. I mean literally, before the first prototypical bits are even dry, the PM team is pounding on the door asking “when can I give it to the customers?”. That’s a great feeling actually.

The second thing is how much process we allowed ourselves to use. Recognizing that it’s a first run, and recognizing that customers might actually say “mehh, not interested”, it doesn’t make sense to spin up the classic development cycle which is meant to maintain a product for 10-14 years. A much more streamlined lifecycle which favors delivering quality code and getting customer feedback, is what we employed. If it turns out that customers really like the product, then there’s room to move to a cycle that is more appropriate for longer term support.

The last thing that’s special is the amount of leveraging Open Source we are allowing ourselves these days. Microsoft has gone full tilt on OpenSource support. I didn’t personally end up using much myself, but we are free to use it elsewhere (with some legal guidelines). This is encouraging, because for crypto, I’m looking forward to using things like SipHash, and ChaCha20, which don’t come natively with the Microsoft platform.

Overall, as Microsoft continues to evolve and deliver ‘customer centric’ stuff, I’m pretty excited and encouraged that we’ll be able to use this same model time and again to great effect. Microsoft has a lot of smart engineers. Combined with some new directives about meeting customer expectations at the market, we will surely be cranking out some more interesting stuff.

I’ve implemented some interesting stuff while working on this project, some of it I’ll share here.

Device Iteration with Functional Programming

One of the great pleasures I have in life is learning something new. There’s nothing greater than those ‘light bulb goes on’ moments as you realize something and gain a much deeper understanding than you had before.

Well, a little while ago, there was an announcement of this thing called Lua Fun.  Lua Fun is a large set of functions which make functional programming in Lua really easy.  It has the usual suspects such as map, reduce, filter, each, etc.  If you read the documentation, you get a really good understanding of how iterators work in Lua, and more importantly, how LuaJIT is able to fold and manipulate things in hot loops such that the produced code is much tighter than anything I could possibly write in C, or any other language I so happen to use.

So, now I’m a fan of Lua Fun, and I would encourage anyone who’s both into Lua, and functional programming to take a look.

How to use it?  I’ve been enumerating various types of things in the Windows system of late.  Using the MMDevice subsystem, I was able to get an enumeration of the audio devices (that took a lot of COM work).  What about displays, and disk drives, and USB devices, and…  Yes, each one of those things has an attendant API which will facilitate monitoring said category.  But, is there one to rule them all?  Well yes, as it turns out, in most cases what these various APIs are doing is getting information out of the System Registry, and just presenting it reasonably.

There is a way to enumerate all the devices in the system.  You know, like when you bring up the Device Manager in Windows, and you see a tree of devices, and their various details.  The stage is set, how do you do that?  I created a simple object that does the grunt work of enumerating the devices in the system.

The DeviceRecordSet is essentially a query. Creating an instance of the object just gives you a handle onto making query requests. Here is the code:

local ffi = require("ffi")
local bit = require("bit")
local bor = bit.bor;
local band = bit.band;

local errorhandling = require("core_errorhandling_l1_1_1");
local SetupApi = require("SetupApi")
local WinNT = require("WinNT")

local DeviceRecordSet = {}
setmetatable(DeviceRecordSet, {
	-- Calling DeviceRecordSet(...) is shorthand for DeviceRecordSet:create(...)
	__call = function(self, ...)
		return self:create(...)
	end,
})

local DeviceRecordSet_mt = {
	__index = DeviceRecordSet,
}

function DeviceRecordSet.init(self, rawhandle)
	print("init: ", rawhandle)

	local obj = {
		Handle = rawhandle,
	}
	setmetatable(obj, DeviceRecordSet_mt)

	return obj;
end

function DeviceRecordSet.create(self, Flags)
	Flags = Flags or bor(ffi.C.DIGCF_PRESENT, ffi.C.DIGCF_ALLCLASSES)

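	-- NOTE: the argument list is missing from this listing. The values below
	-- are an assumption based on the Win32 signature
	-- SetupDiGetClassDevs(ClassGuid, Enumerator, hwndParent, Flags).
	local rawhandle = SetupApi.SetupDiGetClassDevs(
		nil,
		nil,
		nil,
		Flags);

	if rawhandle == nil then
		return nil, errorhandling.GetLastError();
	end

	return self:init(rawhandle)
end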

function DeviceRecordSet.getNativeHandle(self)
	return self.Handle;
end

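function DeviceRecordSet.getRegistryValue(self, key, idx)
	idx = idx or 0;

	-- Select the idx'th device in the set
	local did = ffi.new("SP_DEVINFO_DATA")
	did.cbSize = ffi.sizeof("SP_DEVINFO_DATA");

--print("HANDLE: ", self.Handle)
	local res = SetupApi.SetupDiEnumDeviceInfo(self.Handle,idx,did)

	if res == 0 then
		local err = errorhandling.GetLastError()
		--print("after SetupDiEnumDeviceInfo, ERROR: ", err)
		return nil, err;
	end

	local regDataType = ffi.new("DWORD[1]")
	local pbuffersize = ffi.new("DWORD[1]",260);
	local buffer = ffi.new("char[260]")

	-- NOTE: the argument list is missing from this listing. The values below
	-- are an assumption based on the Win32 signature
	-- SetupDiGetDeviceRegistryProperty(DeviceInfoSet, DeviceInfoData, Property,
	--   PropertyRegDataType, PropertyBuffer, PropertyBufferSize, RequiredSize).
	local res = SetupApi.SetupDiGetDeviceRegistryProperty(
		self.Handle,
		did,
		key,
		regDataType,
		buffer,
		pbuffersize[0],
		pbuffersize);

	if res == 0 then
		local err = errorhandling.GetLastError();
		--print("after GetDeviceRegistryProperty, ERROR: ", err)
		return nil, err;
	end

	--print("TYPE: ", regDataType[0])
	-- 1 is REG_SZ and 7 is REG_MULTI_SZ: return both as a Lua string,
	-- dropping the final NUL terminator
	if (regDataType[0] == 1) or (regDataType[0] == 7) then
		return ffi.string(buffer, pbuffersize[0]-1)
	elseif regDataType[0] == ffi.C.REG_DWORD_LITTLE_ENDIAN then
		return ffi.cast("DWORD *", buffer)[0]
	end

	-- Any other registry type is not decoded
	return nil;
end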

function DeviceRecordSet.devices(self, fields)
	fields = fields or {
		{ffi.C.SPDRP_DEVICEDESC, "description"},
		{ffi.C.SPDRP_MFG, "manufacturer"},
		{ffi.C.SPDRP_DEVTYPE, "devicetype"},
		{ffi.C.SPDRP_CLASS, "class"},
		{ffi.C.SPDRP_ENUMERATOR_NAME, "enumerator"},
		{ffi.C.SPDRP_FRIENDLYNAME, "friendlyname"},
		{ffi.C.SPDRP_LOCATION_INFORMATION , "locationinfo"},
		{ffi.C.SPDRP_LOCATION_PATHS, "locationpaths"},
		{ffi.C.SPDRP_SERVICE, "service"},
	}

	-- Generator: takes the field list and the current index, and returns
	-- the next index plus a record, or nil once no field can be read
	local function closure(fields, idx)
		local res = {}

		local count = 0;
		for _it, field in ipairs(fields) do
			local value, err = self:getRegistryValue(field[1], idx)
			if value then
				count = count + 1;
				res[field[2]] = value;
			end
		end

		if count == 0 then
			return nil;
		end

		return idx+1, res;
	end

	return closure, fields, 0
end

return DeviceRecordSet

The ‘getRegistryValue()’ function is the real workhorse of this object. That’s what gets your values out of the system registry. The other function of importance is ‘devices()’. This is an iterator.
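One detail is worth a quick sketch. The magic numbers 1 and 7 in ‘getRegistryValue()’ are the registry types REG_SZ and REG_MULTI_SZ. A REG_MULTI_SZ value is a run of NUL-terminated strings, so the Lua string that comes back can contain embedded NULs. Assuming you would rather have a list of strings, something like the following could do the splitting. The helper name ‘splitMultiSz’ is my own invention and is not part of DeviceRecordSet, and the sample value is made up.

```lua
-- Hypothetical helper: split a REG_MULTI_SZ style string (parts separated
-- by NUL bytes) into a Lua array. Empty parts are skipped.
local function splitMultiSz(raw)
	local parts = {}
	local pos = 1
	while pos <= #raw do
		-- plain find (4th argument true), so the NUL is matched literally
		local nul = string.find(raw, "\000", pos, true)
		local stop = nul or (#raw + 1)
		if stop > pos then
			parts[#parts + 1] = string.sub(raw, pos, stop - 1)
		end
		pos = stop + 1
	end
	return parts
end

-- Made-up sample that mimics the shape of a 'locationpaths' value
local sample = "PCIROOT(0)#PCI(1400)\000ACPI(_SB_)#ACPI(PCI0)\000"
local paths = splitMultiSz(sample)
for i, path in ipairs(paths) do
	print(i, path)
end
```

A plain loop over ‘string.find’ is used here instead of a pattern, because how patterns treat embedded NULs differs between Lua versions.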

There are a couple of things of note about this iterator. First of all, it does not keep its iteration state in ‘up values’ (the closure captures only ‘self’). Everything that changes from one step to the next is carried in the values the function returns and receives. The ‘state’, if you will, is handed in fresh every time the ‘closure()’ is called. This is the key to creating an iterator that will work well with Lua Fun.
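To make that concrete without any Windows machinery, here is a minimal sketch of the same shape of iterator in plain Lua. The names ‘nextSquare’ and ‘squares’ are just for illustration. The generic ‘for’ calls the generator with the invariant parameter and the previous control value, which is exactly how ‘closure(fields, idx)’ gets called.

```lua
-- Stateless generator: everything it needs arrives as arguments.
-- 'limit' is the invariant parameter, 'i' is the control variable.
local function nextSquare(limit, i)
	i = i + 1
	if i > limit then
		return nil        -- returning nil ends the for loop
	end
	return i, i * i       -- new control value first, then the payload
end

-- Returns the (generator, parameter, initial state) triple that
-- 'for ... in' expects, just like devices() returns closure, fields, 0
local function squares(limit)
	return nextSquare, limit, 0
end

local collected = {}
for i, sq in squares(4) do
	collected[#collected + 1] = sq
end

print(table.concat(collected, ","))   -- prints 1,4,9,16
```

Because nothing is stashed away between calls, the same triple can be handed to another function that wraps or restarts the iteration.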

By default, this iterator will return quite a few (but not all) fields related to each object, and it will return all the objects. This is OK, because there are typically fewer than 150 objects in any given system.

Now, I want to do various queries against this set without much fuss. This is where Lua Fun, and functional programming in general, really shines.

First, a little setup:

local ffi = require("ffi")
local DeviceRecordSet = require("DeviceRecordSet")
local serpent = require("serpent")
local Functor = require("Functor")

local fun = require("fun")()
local drs = DeviceRecordSet();

local function printIt(record)
	each(print, record)
end

This creates an instance of the DeviceRecordSet object, which will be used in the queries. Already the printIt() function is utilizing Lua Fun. The ‘each()’ function will take whatever it’s handed, and perform the function specified. In this case, the ‘record’ will be a table. So, each will iterate over the table entries and print each one of them out. This is the equivalent of doing:

for k,v in pairs(record) do
	print(k, v)
end

I think that simply typing ‘each’ is a lot simpler and pretty easy to understand.

How about a query then?

-- show everything for every device
each(printIt, drs:devices())

In this case, the ‘each’ is applied to the results of the ‘devices()’ iterator. For each record coming from the devices iterator, the printIt function will be called, which will in turn print out all the values in the record. That’s pretty nice.

What if I don’t want to see all the fields in the record, and just want the objectname and description fields? Well, this is a ‘map’ operation, or a projection in database parlance, so:

-- do a projection on the fields
local function projection(x)
  return {objectname = x.objectname, description = x.description}
end

each(printIt, map(projection, drs:devices()))

Working from the inside out, for each record coming from the devices() iterator, call the ‘projection’ function. The return value from the projection function becomes the new record for this iteration. For each of those records, call the printIt function.

Using ‘map’ is great as you can reshape data in any way you like without much fuss.
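If you want to see roughly what is going on under the covers, here is a hand-rolled stand-in for a ‘map’ adaptor. This is not Lua Fun’s implementation (theirs avoids closures like this one), just a sketch of the idea. The name ‘mapRecords’ and the sample records are invented, with the records standing in for what ‘drs:devices()’ would produce.

```lua
-- Hand-rolled stand-in for a 'map' adaptor; NOT Lua Fun's implementation.
-- Wraps a (gen, param, state) triple and applies fn to every record.
local function mapRecords(fn, gen, param, state)
	local function step(_, control)
		local nextControl, record = gen(param, control)
		if nextControl == nil then
			return nil
		end
		return nextControl, fn(record)
	end
	return step, nil, state
end

-- Invented sample records standing in for drs:devices() output
local records = {
	{objectname = "\\Device\\0000001a", description = "Disk drive", enumerator = "IDE"},
	{objectname = "\\Device\\0000002b", description = "Generic volume", enumerator = "STORAGE"},
}

local function projection(x)
	return {objectname = x.objectname, description = x.description}
end

local projected = {}
for _, rec in mapRecords(projection, ipairs(records)) do
	projected[#projected + 1] = rec
end

-- The enumerator field is gone from the projected record
print(projected[1].description, projected[1].enumerator)
```

The source records are untouched. Each step just builds a fresh, smaller table from the record it was handed.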

Lastly, I want to see only the records that are related to “STORAGE”, so…

-- show only certain records
local function enumeratorFilter(x)
	return x.enumerator == "STORAGE"
end

each(printIt, filter(enumeratorFilter, drs:devices()))

Here, the ‘filter’ iterator is used. So, again, for each of the records coming from the ‘devices()’ enumerator, call the ‘enumeratorFilter’ function. If this function returns ‘true’ for the record, then it is passed along as the next record for the ‘each’. If ‘false’, then it is skipped, and the next record is tried.
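The skipping behaviour can be sketched in plain Lua as well. Again, this is a simplified stand-in and not how Lua Fun actually implements ‘filter’. ‘filterRecords’ is a made-up name, and the records are invented sample data.

```lua
-- Hand-rolled stand-in for a 'filter' adaptor; NOT Lua Fun's implementation.
-- Keeps pulling from the wrapped iterator until the predicate accepts a record.
local function filterRecords(pred, gen, param, state)
	local function step(_, control)
		local record
		repeat
			control, record = gen(param, control)
			if control == nil then
				return nil        -- source exhausted, end the loop
			end
		until pred(record)
		return control, record
	end
	return step, nil, state
end

-- Invented sample records standing in for drs:devices() output
local records = {
	{description = "Generic volume", enumerator = "STORAGE"},
	{description = "USB Root Hub", enumerator = "USB"},
	{description = "Volume snapshot", enumerator = "STORAGE"},
}

local function enumeratorFilter(x)
	return x.enumerator == "STORAGE"
end

local kept = {}
for _, rec in filterRecords(enumeratorFilter, ipairs(records)) do
	kept[#kept + 1] = rec.description
end

print(table.concat(kept, ", "))   -- prints Generic volume, Volume snapshot
```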

This is pretty powerful, and yet simple stuff. The fact that iterators create new iterators, in tight loops, makes for some very dense and efficient code. If you’re interested in why this is so special in LuaJIT, and not many other languages, read up on the Lua Fun documentation.

I’ve killed two birds with one stone. I have finally gotten to the root of all device iterators. I have also learned how to best write iterators that can be used in a functional programming way. Judicious usage of this mechanism will surely make a lot of my code more compact and readable, as well as highly performant.


