Isn’t the whole system just a database? – libdrm

Do enough programming, and everything looks like a database of one form or another. Case in point, when you want to get keyboard and mouse input, you first have to query the system to see which of the /dev/input/eventxxx devices you want to open for your particular needs. Yes, there are convenient shortcuts, but that’s beside the point.

Same goes with other devices in the system. This time around, I want to find the drm device which represents the graphics card in my system (from the libdrm perspective).

In LJIT2libudev there are already objects that make it convenient to enumerate all the devices in the system using a simple iterator:

local ctxt, err = require("UDVContext")()
for _, dev in ctxt:devices() do
    print(dev)
end

Well, that’s fine and all, but let’s get specific. To use the libdrm library, I very specifically need one of the active devices in the ‘drm’ subsystem. I could write this:

local ctxt, err = require("UDVContext")()

local function getActiveDrm()
  local function isActiveDrm(dev)
    if dev.IsInitialized and dev:getProperty("subsystem") == "drm" then
      return true;
    end

    return false;
  end

  for _, dev in ctxt:devices() do
    if isActiveDrm(dev) then
      return dev;
    end
  end

  return nil;
end

local device = getActiveDrm()

Yah, that would work. Then of course, when I want to change the criteria for finding the device I’m looking for, I would change up this code a bit. The core iterator is the key starting point at least. The ‘isActiveDrm()’ is a function which acts as a predicate to filter through the results, and only return the ones I want.

Since this is Lua though, and since there is a well thought-out functional programming library already (luafun), this could be made even easier:

local function isActiveDrm(dev)
  if dev.IsInitialized and dev:getProperty("subsystem") == "drm" then
    return true;
  end

  return false;
end

-- filter() and head() are luafun functions, imported as globals via require("fun")()
local device = head(filter(isActiveDrm, ctxt:devices()))
assert(device, "could not find active drm device")

In this case, we let the luafun ‘filter’ and ‘head’ functions do their job of applying the predicate, taking the first matching device off the iterator, and returning it. Now, changing my criteria is fairly straightforward: just change out the predicate, and done. This is kind of nice, particularly with Lua, because that predicate is just some code. It could even be generated at runtime, because we’re in script, right?
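To make that division of labor concrete, here is a minimal sketch of what a filter and a head do, written in plain Lua with no dependencies. The mock device table and the ‘devices’, ‘filterDevices’, and ‘first’ names are all made up for illustration; this is not the luafun or LJIT2libudev implementation.

```lua
-- Hypothetical stand-ins for udev devices; real ones come from ctxt:devices()
local mockDevices = {
  {subsystem = "input", initialized = true,  name = "event0"},
  {subsystem = "drm",   initialized = false, name = "card1"},
  {subsystem = "drm",   initialized = true,  name = "card0"},
}

-- An iterator that yields (index, device) pairs, like the loops above expect
local function devices(list)
  return ipairs(list)
end

-- Wrap an iterator so it only yields values accepted by the predicate
local function filterDevices(predicate, gen, param, state)
  return function()
    local value
    repeat
      state, value = gen(param, state)
    until state == nil or predicate(value)
    return state, value
  end
end

-- Return the first value the iterator produces, or nil if there is none
local function first(iter)
  local _, value = iter()
  return value
end

local function isActiveDrm(dev)
  return dev.initialized and dev.subsystem == "drm"
end

local device = first(filterDevices(isActiveDrm, devices(mockDevices)))
print(device and device.name)  --> card0
```

Swapping in a different predicate changes the query without touching the iteration code, which is the whole point.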

So, how about this version:

-- File: IsActiveDrmDevice.lua
-- predicate to determine if a device is a DRM device and it's active
return function(dev)
  if dev.IsInitialized and dev:getProperty("subsystem") == "drm" then
    return true;
  end

  return false;
end

-- File: devices_where.lua
#!/usr/bin/env luajit

-- devices_where.lua
-- print devices in the system, filtered by a supplied predicate
-- generates output which is a valid lua table
package.path = package.path..";../?.lua"

local fun = require("fun")()
local utils = require("utils")

local ctxt, err = require("UDVContext")()
assert(ctxt ~= nil, "Error creating context")

if #arg < 1 then
  error("you must specify a predicate")
end

local predicate = require(arg[1])

print("{")
	
each(utils.printDevice, filter(predicate, ctxt:devices()))

print("}")


-- Actual usage from the command line
./devices_where.lua IsActiveDrmDevice

In this case, the ‘query’ has been generalized enough that you can pass a predicate as a filename (minus the ‘.lua’). The code for the predicate will be compiled in, and used as the predicate for the filter() function. Well, that’s pretty nifty I think. And since the query itself again is just a bit of code, that can be changed on the fly as well. I can easily see a system where Lua is the query language, and the entire machine is the database.
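Since the predicate is only code, it does not even need to live in a file. As a rough sketch of the "generated at runtime" idea, the snippet below compiles a predicate from a string. The ‘compilePredicate’ helper and the mock device records are hypothetical; ‘loadstring’ is the Lua 5.1/LuaJIT call, with ‘load’ as the fallback on newer Lua versions.

```lua
-- loadstring exists in Lua 5.1 and LuaJIT; Lua 5.2+ folds it into load
local compile = loadstring or load

-- Hypothetical helper: turn an expression over 'dev' into a predicate function
local function compilePredicate(expr)
  local chunk, err = compile("return function(dev) return " .. expr .. " end")
  if not chunk then
    return nil, err
  end
  return chunk()
end

-- Mock device records standing in for real udev devices
local mockDevices = {
  {subsystem = "input", initialized = true},
  {subsystem = "drm",   initialized = true},
}

local isDrm = assert(compilePredicate('dev.subsystem == "drm"'))

local matches = 0
for _, dev in ipairs(mockDevices) do
  if isDrm(dev) then
    matches = matches + 1
  end
end
print(matches)  --> 1
```

Compiling arbitrary strings is only reasonable when the text comes from someone you trust, such as yourself at the command line.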

The Tarantool database uses Lua (via LuaJIT) as its built-in application language, and I believe the luafun code is used there. Tarantool is not a system database, but the fact that Lua sits at its heart is interesting, and it supports the case that Lua is a good language for doing some database work.

I have found that tackling the lowest level enumeration by putting a Lua iterator on top of it makes life a whole lot easier. Many libraries spend a fair amount of code trying to make themselves look like a database. In the case of libudev, there are functions for iterating its internal hash table of values, routines for creating ‘enumerators’ which are essentially queries, routines for getting properties, routines for turning properties into more accessible strings, routines for turning the ‘FLAGS’ property into individual values, and the like, and then there are the memory management routines (ref, unref). A lot of that stuff either goes away or is handled much more succinctly when you’re using a language such as Lua, JavaScript, Python, or Ruby: anything modern, dynamic, and equipped with decent higher-level, memory-managed libraries.

And thus, the whole system, from log files, to perf counters, to device lists, is a database, waiting to be harvested, and made readily available.


Curating Data – Resene and Hollasch Colors

I found some more color data that I wanted to play with. Resene paints has a whole bunch of color swatches. I found one file full of them, and the data looks like this:

Acadia                     27   20    4
Acapulco                  124  176  161
Acorn                     106   93   27
Aero Blue                 201  255  229
Affair                    113   70  147
Afghan Tan                134   86   10

When I did the SGI Colors, I used the following code, relying on the string.find() function:

parseline = function(line)
	local starting, ending, n1, n2, n3, name = line:find("%s*(%d*)%s*(%d*)%s*(%d*)%s*([%a%d%s]*)")
	return tonumber(n1), tonumber(n2), tonumber(n3), name
end

In this case, it would shift around a bit because the color name is first, rather than last on the line, but otherwise, relatively the same.

One other problem with that mechanism, though, is that it doesn’t deal well with the spaces that can be found in a color’s name. So, now I have a chance to do it slightly differently. I do observe that the data is columnar. That is, all the names start at column 1, the red values start at column 27, green at 32, and blue at 37. With this knowledge, I can simply use substrings:

ConvertReseneFile = function(filename)
  local parseline = function(line)
    -- get name, strip whitespace
    -- make lowercase
    local name = line:sub(1,26)
    name = string.gsub(name, "%s",'')
    name = name:lower();

    -- get the numeric strings, convert to numbers
    local r = tonumber(line:sub(27,29))
    local g = tonumber(line:sub(32,34))
    local b = tonumber(line:sub(37,39))

    return name, r, g, b
  end

  io.write("local ReseneColors = {\n");
  for line in io.lines(filename) do
    local name, red,green,blue = parseline(line)
    io.write(string.format("%s = {%d, %d, %d},\n",name, red, green, blue))
  end
  io.write("}\n");
end

ConvertReseneFile("ReseneRGB.txt")

To get the name, I do the following:

    local name = line:sub(1,26)
    name = string.gsub(name, "%s",'')
    name = name:lower();

That will get 26 characters from the front of the line. Then, using ‘gsub()’, replace all the ‘space’ characters with nothing, effectively removing them. This has the effect of both trimming whitespace from the end, as well as removing intervening whitespace. Then I gratuitously convert to lowercase.

The numeric values are easy. Just take the three-character substring that begins where each number column starts. Turn it into a number explicitly with tonumber(), and you’re done.
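As a quick worked example of the column slicing, the sketch below builds a line in that layout and pulls the fields back out. The sample line is reconstructed with string.format() rather than read from the real file, so the exact padding is an assumption based on the column positions given above.

```lua
-- Build a line in the described layout: name in columns 1-26,
-- then red at 27-29, green at 32-34, blue at 37-39
local line = string.format("%-26s%3d  %3d  %3d", "Aero Blue", 201, 255, 229)

-- Strip all whitespace from the name field and lowercase it
local name = line:sub(1, 26):gsub("%s", ""):lower()

local r = tonumber(line:sub(27, 29))
local g = tonumber(line:sub(32, 34))
local b = tonumber(line:sub(37, 39))

print(name, r, g, b)

-- A short value such as 4 is right-aligned, so the slice is "  4";
-- tonumber() accepts the leading spaces
local short = string.format("%-26s%3d  %3d  %3d", "Acadia", 27, 20, 4)
local shortBlue = tonumber(short:sub(37, 39))
print(shortBlue)
```

The right-aligned case is the one worth checking, since a pattern-free approach only works if tonumber() tolerates the padding.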

Next up, Steve Hollasch colors. The raw data looks like this:

wheat             245   222   179   0.9608   0.8706   0.7020
white             255   255   255   1.0000   1.0000   1.0000
white_smoke       245   245   245   0.9608   0.9608   0.9608
zinc_white        253   248   255   0.9900   0.9700   1.0000

Greys
cold_grey         128   138   135   0.5000   0.5400   0.5300
dim_grey          105   105   105   0.4118   0.4118   0.4118

Quite similar to the Resene colors, in that it is columnar. A couple of differences though. The field names have the ‘_’ character, and the sections have a name, that I’d like to preserve as a comment. So, the conversion looks like this:

ConvertHollaschFile = function(filename)
  local parseline = function(line)
    -- get name, strip whitespace
    -- make lowercase
    local name = line:sub(1,18)
    name = string.gsub(name, "%s",'')
    name = name:lower();

    -- get the numeric strings, convert to numbers
    local r = tonumber(line:sub(19,21))
    local g = tonumber(line:sub(25,27))
    local b = tonumber(line:sub(31,33))

    return name, r, g, b
  end

  io.write("local HollaschColors = {\n");
  for line in io.lines(filename) do
    local name, red,green,blue = parseline(line)
    --print(name, red, green,blue)
    if name ~= "" then
      if red and green and blue then
        io.write(string.format("%s = {%d, %d, %d},\n",name, red, green, blue))
      else
        io.write(string.format("\n-- %s\n", name))
      end
    end
  end
  io.write("}\n");
end

ConvertHollaschFile("HollaschColors.txt")

It looks almost identical to the Resene code. There’s just a little bit more in the post processing to deal with the section titles as a comment in the output.
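To show how that three-way split behaves, here is a small sketch that classifies a line as blank, section title, or color. The ‘classify’ helper is hypothetical, and the sample color line is rebuilt with string.format() to match the column positions the conversion code assumes.

```lua
-- Hypothetical helper: decide what kind of line we are looking at
local function classify(line)
  local name = line:sub(1, 18):gsub("%s", ""):lower()
  local r = tonumber(line:sub(19, 21))
  local g = tonumber(line:sub(25, 27))
  local b = tonumber(line:sub(31, 33))

  if name == "" then
    return "blank"
  elseif r and g and b then
    return "color", name, r, g, b
  end
  -- a name with no numbers after it is treated as a section title
  return "section", name
end

-- Name in columns 1-18, then red at 19-21, green at 25-27, blue at 31-33
local colorLine = string.format("%-18s%3d   %3d   %3d", "cold_grey", 128, 138, 135)

print(classify(colorLine))
print(classify("Greys"))
print(classify(""))
```

A section title falls out naturally here: slicing past the end of a short line gives an empty string, and tonumber("") is nil.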

In the end, you have tables, which can easily be converted to JSON form, or used directly as databases, or what have you, but basically you’ve curated the data and converted it into a form which is much easier to deal with programmatically.

I find this to be a strong feature of the Lua language. The string library is fairly small, but it hits just the right set of features to make it useful. There are other environments in which this small task might be equally easy to deal with, but I like this one because the whole runtime is only 300K of code. In other environments, the regular expression library alone might be that big, and the rest of the runtime might run into the multi-megabyte size. So, not bad for a few minutes of work.


Curating Data – SGI Color names

Every once in a while I turn some low level bit of data into something useful for my programming.  I’ve done the ASCII table, and I’ve done MIME types, and I’ve done HTTP headers.  This time around, I’ve been playing with colors, pixels, graphics, and the like, and I found some useful color names that I wanted to use.

SGI X Windows Colors is a file that contains some information that is no doubt already available in many different forms.  Apparently, this is the source for names such as: azure, honeydew, navyblue, rosy brown, saddle brown, etc.  There are about 660 of them.  The data in the original form looks like this:

255 250 250 snow
248 248 255 ghost white
248 248 255 GhostWhite

So on and so forth.  Of particular note, the names have a space in them in some cases, which is then rectified with a duplicate entry with the space removed and camel casing used.  Furthermore, gray/grey values look like this:

163 163 163 gray64
163 163 163 grey64
166 166 166 gray65
166 166 166 grey65
168 168 168 gray66
168 168 168 grey66

Some people like ‘grey’, some like ‘gray’, and so it goes.

When I look at the data, I want to do different things with it.  In some cases, I might like to use the values as direct pixel RGB representations.  I might want to do this:

local ffi = require("ffi")

ffi.cdef[[
struct RGB {
   uint8_t r,g,b;
};
]]
RGB = ffi.typeof("struct RGB");

red = RGB(255, 0,0);
green = RGB(0, 255, 0);

But, then again, I might prefer something like:

colors = {
  {name = "red", red=255, green=0, blue=0},
  {name = "green", red=0, green=255, blue=0},
}

These forms are very application specific, and either too verbose, or too special. Instead, I’ll pick a middle ground which is more like this:

local SGIColors = {
snow = {255, 250, 250},
ghostwhite = {248, 248, 255},
whitesmoke = {245, 245, 245},
gainsboro = {220, 220, 220},
.
.
.
}

When it’s in this form, I can fairly easily transform it into any other form, including those above, using some simple code that just traverses the table and does the needful.
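A traversal along those lines might look like the sketch below, which turns the name-keyed table into the more verbose record form. The ‘toRecords’ name is hypothetical, and the table holds only a few of the entries shown above.

```lua
local SGIColors = {
  snow = {255, 250, 250},
  ghostwhite = {248, 248, 255},
  gainsboro = {220, 220, 220},
}

-- Hypothetical helper: convert {name = {r, g, b}} into a sorted list of records
local function toRecords(colors)
  local records = {}
  for name, rgb in pairs(colors) do
    records[#records + 1] = {name = name, red = rgb[1], green = rgb[2], blue = rgb[3]}
  end
  -- pairs() has no defined order, so sort by name for stable output
  table.sort(records, function(a, b) return a.name < b.name end)
  return records
end

for _, c in ipairs(toRecords(SGIColors)) do
  print(c.name, c.red, c.green, c.blue)
end
```

The same loop could just as easily build FFI structs or any other shape; the compact table is only the starting point.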

But, how to get it in this form in the first place?

parseline = function(line)
	local starting, ending, n1, n2, n3, name = line:find("%s*(%d*)%s*(%d*)%s*(%d*)%s*([%a%d%s]*)")
	return tonumber(n1), tonumber(n2), tonumber(n3), name
end

convertFile = function(filename)
	for line in io.lines(filename) do
		local red,green,blue,name = parseline(line)
		name = name:lower()
		if name:find("%s") == nil then
			io.write(string.format("%s = {%d, %d, %d},\n",name, red, green, blue))
		end
	end
end

convertFile("SGIColors.txt")

The parseline() function takes a single line, and parses out the three component numbers, and the name. The io.write() call simply outputs it in the format that I expect (could be changed to anything). With this, I’m done.
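To see what the conversion keeps and what it drops, here is a small sketch that runs the same pattern over the three sample lines from earlier. It collects the output in a table instead of writing it, purely so the result is easy to inspect; the ‘samples’ and ‘kept’ names are made up for the example.

```lua
local function parseline(line)
  local _, _, n1, n2, n3, name = line:find("%s*(%d*)%s*(%d*)%s*(%d*)%s*([%a%d%s]*)")
  return tonumber(n1), tonumber(n2), tonumber(n3), name
end

local samples = {
  "255 250 250 snow",
  "248 248 255 ghost white",
  "248 248 255 GhostWhite",
}

local kept = {}
for _, line in ipairs(samples) do
  local red, green, blue, name = parseline(line)
  name = name:lower()
  -- names containing whitespace are skipped; the run-together duplicate covers them
  if not name:find("%s") then
    kept[#kept + 1] = string.format("%s = {%d, %d, %d},", name, red, green, blue)
  end
end

print(table.concat(kept, "\n"))
```

The ‘ghost white’ line is dropped and ‘GhostWhite’ survives as ‘ghostwhite’, so each color ends up in the table once under a name that is a valid Lua identifier.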

So, there you have it. A web search, and about 10 minutes of coding, and suddenly I’ve got a little database of colors rather than simply a web page of stuff.