schedlua – refactor compactor

The subject of scheduling and async programming has been a long running theme in my blog.  From the very first entries related to LJIT2Win32, through the creation of TINN, and most recently (within the past year), the creation of schedlua, I have been exploring this subject.  It all started innocently enough.  When node.js was born, and libuv was ultimately released, I thought to myself, ‘what prevents anyone from doing this in LuaJIT without the usage of any external libraries whatsoever?’

It’s been a long road.  There’s really no reason for this code to continue to evolve.  It’s not at the center of some massively distributed system.  These are merely bread crumbs left behind, mainly for myself, as I explore and evolve a system that has proven itself to be useful at least as a teaching aid.

In the most recent incarnation of the schedlua kernel, I was able to clean up my act with the realization that you can implement all higher level semantics using a very basic ‘signal’ mechanism within the kernel.  That was pretty good as it allowed me to easily implement the predicate system (when, whenever, waitForTruth, signalOnPredicate).  In addition, it allowed me to reimplement the async io portion with the realization that a task waiting on IO to occur is no different than a task waiting on any other kind of signal, so I could simply build the async io atop the signaling.
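To make the "everything is a signal" idea a bit more concrete, here is a minimal sketch using plain Lua coroutines. It is not the schedlua kernel: the `waiting` table and this `waitForSignal` are names I made up for illustration, and this toy `signalOne` resumes the waiter directly, where a real kernel would more likely just mark it as ready to run.

```lua
-- Toy signal mechanism (illustration only, not schedlua's kernel).
-- Maps a signal name to a queue of suspended coroutines.
local waiting = {}

-- Suspend the calling coroutine until `name` is signaled.
-- Returns whatever values the signaler passes along.
local function waitForSignal(name)
	local queue = waiting[name] or {}
	waiting[name] = queue
	queue[#queue + 1] = coroutine.running()
	return coroutine.yield()
end

-- Wake the first task waiting on `name`, if there is one.
local function signalOne(name, ...)
	local queue = waiting[name]
	if not queue or #queue == 0 then
		return false
	end
	local task = table.remove(queue, 1)
	assert(coroutine.resume(task, ...))
	return true
end

-- A task waiting on IO looks like a task waiting on anything else.
local log = {}
local reader = coroutine.wrap(function()
	local data = waitForSignal("ioready")
	log[#log + 1] = "got " .. data
end)

reader()                       -- runs until it blocks on the signal
signalOne("ioready", "hello")  -- wakes it up, handing over a value
print(log[1])                  --> got hello
```

The point is only that the waiting side never needs to know what kind of event it is waiting for; it just names a signal.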

schedlua has largely been a Linux based project, until now.  The crux of the difference between Linux and Windows comes down to two things in schedlua.  The first thing is timing operations.  Basically, how do you get a microsecond accurate clock on the system?  On Linux, I use the ‘clock_gettime()’ system call.  On Windows, I use ‘QueryPerformanceCounter’ and ‘QueryPerformanceFrequency’.  In order to isolate these, I put them into their own platform specific timeticker.lua file, and they both just have to surface a ‘seconds()’ function.  The differences are abstracted away, and the common interface is that of a stopwatch class.
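As a rough sketch of that split, assume each platform file hands back a function that returns the current time in seconds. The `StopWatch` below is hypothetical, not the actual schedlua class, and `os.clock()` is only a stand-in for the real tick source.

```lua
-- Hypothetical stopwatch over a pluggable tick source.
-- `tick` must return the current time in seconds as a number.
local StopWatch = {}
StopWatch.__index = StopWatch

function StopWatch.new(tick)
	-- os.clock() is only a stand-in (it reports CPU time);
	-- a real timeticker would wrap clock_gettime() or
	-- QueryPerformanceCounter() here.
	local obj = setmetatable({ tick = tick or os.clock }, StopWatch)
	obj:reset()
	return obj
end

function StopWatch:reset()
	self.startTime = self.tick()
end

-- Seconds elapsed since creation or the last reset().
function StopWatch:seconds()
	return self.tick() - self.startTime
end

-- Usage with a fake tick source, so the result is predictable
local fakeNow = 100
local sw = StopWatch.new(function() return fakeNow end)
fakeNow = 102.5
print(sw:seconds())  --> 2.5
```

Everything above the tick source is platform neutral, which is the whole point of the exercise.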

That was good for time, but what about alarms?

The functions in schedlua related to alarms are: delay, periodic, runningTime, and sleep.  Together, these allow you to run things based on time, as well as delay the current task as long as you like.  My first implementation of these routines, going all the way back to the TINN implementation, was to run a separate ‘watchdog’ task, which in turn maintained its list of tasks that were waiting, and scheduled them.  Recently, I thought, “why can’t I just use the ‘whenever’ semantics to implement this?”.

Now, the implementation of the alarm routines comes down to this:

 

local function taskReadyToRun()
	local currentTime = SWatch:seconds();

	-- traverse through the fibers that are waiting
	-- on time
	local nAwaiting = #SignalsWaitingForTime;

	for i=1,nAwaiting do
		local task = SignalsWaitingForTime[1];
		if not task then
			return false;
		end

		if task.DueTime <= currentTime then
			return task
		else
			return false
		end
	end

	return false;
end

local function runTask(task)
    signalOne(task.SignalName);
    table.remove(SignalsWaitingForTime, 1);
end

Alarm = whenever(taskReadyToRun, runTask)

The Alarm module still keeps a list of tasks that are waiting for their time to execute, but instead of using a separate watchdog task to keep track of things, I simply use the schedlua built-in ‘whenever’ function. This basically says, “whenever the function ‘taskReadyToRun()’ returns a non-false value, call the function ‘runTask()’ passing the parameter from taskReadyToRun()”. Convenient, end of story, simple logic using words that almost feel like an English sentence to me.
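If you want to see the shape of ‘whenever’ without the rest of schedlua, here is a self-contained sketch. The tiny round-robin scheduler (`spawn`, `step`) and the fake clock are assumptions made for the example; the real kernel is signal based and looks nothing like this inside.

```lua
-- Toy round-robin scheduler, just enough to show the idea.
local tasks = {}

local function spawn(fn)
	tasks[#tasks + 1] = coroutine.create(fn)
end

-- Give every live task one turn.
local function step()
	for i = #tasks, 1, -1 do
		local co = tasks[i]
		if coroutine.status(co) == "dead" then
			table.remove(tasks, i)
		else
			assert(coroutine.resume(co))
		end
	end
end

-- whenever pred() returns a non-false value, call func with it
local function whenever(pred, func)
	spawn(function()
		while true do
			local res = pred()
			if res then
				func(res)
			end
			coroutine.yield()
		end
	end)
end

-- A fake clock and a time-sorted alarm list
local now = 0
local alarms = { {DueTime = 2, name = "a"}, {DueTime = 5, name = "b"} }
local fired = {}

whenever(
	function()
		local head = alarms[1]
		return head and head.DueTime <= now and head
	end,
	function(task)
		fired[#fired + 1] = task.name
		table.remove(alarms, 1)
	end)

for t = 1, 6 do
	now = t
	step()
end
print(table.concat(fired, ","))  --> a,b
```

Like the real code, this only ever looks at the head of the list, so it relies on the list being kept sorted by due time.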

I like this kind of construct for a few reasons. First of all, it reuses code. I don’t have to code up that specialized watchdog task time and time again. Second, it wraps up the async semantics of the thing. I don’t really have to worry about explicitly calling spawn, or anything else related to multi-tasking. It’s just all wrapped up in that one word ‘whenever’. It’s relatively easy for me to explain this code, without mentioning semaphores, threads, conditions, or whatever. I can tell a child “whenever this is true, do that other thing”, and they will understand it.

So, that’s it. First I used signals as the basis to implement higher order functions, such as the predicate based flow control. Now I’m using the predicate based flow control to implement yet other functions such as alarms. Next, I’ll take that final step and do the same to the async IO, and I’ll be back to where I was a few months back, but with a much smaller codebase, and cross platform to boot.


SVG And Me – Don’t tell me, just another database!

A picture is worth 175Kb…

[Image: grapes]

So, SVG right? Well, the original was, but this image was converted to a .png file for easy embedding in WordPress. The file size of the original grapes.svg is 75K. That savings in space is one of the reasons to use .svg files whenever you can.

But, I digress. The remotesvg project has been moving right along.

Last time around, I was able to use Lua syntax as a stand in for the raw .svg syntax.  That has some benefits because, since you’re in a programming language, you can use programming constructs such as loops, references, functions and the like to enhance the development of your svg.  That’s great when you’re creating something from scratch programmatically, rather than just using a graphical editing tool such as Inkscape to construct your .svg.  If you’re constructing a library of svg handling routines, you need a bit more though.

This time around, I’m adding in some parsing of svg files, as well as general manipulation of the same from within Lua.  Here’s a very simple example of how to read an svg file into a lua table:

 

local parser = require("remotesvg.parsesvg")

local doc = parser:parseFile("grapes.svg");

That’s it! You now have the file in a convenient lua table, ready to be manipulated. But wait, what do I have exactly? Let’s look at a section of that file and see what it gives us.

    <linearGradient
       inkscape:collect="always"
       id="linearGradient4892">
      <stop
         style="stop-color:#eeeeec;stop-opacity:1;"
         offset="0"
         id="stop4894" />
      <stop
         style="stop-color:#eeeeec;stop-opacity:0;"
         offset="1"
         id="stop4896" />
    </linearGradient>
    <linearGradient
       inkscape:collect="always"
       xlink:href="#linearGradient4892"
       id="linearGradient10460"
       gradientUnits="userSpaceOnUse"
       gradientTransform="translate(-208.29289,-394.63604)"
       x1="-238.25415"
       y1="1034.7042"
       x2="-157.4043"
       y2="1093.8906" />

This is part of the definitions, which later get used by the portions representing the grapes. A couple of things to notice. As a straight ‘parsing’, you’ll get a bunch of text values. For example, y2 = “1093.8906” will turn into a value in the lua table like this: {y2 = “1093.8906”}; the ‘1093.8906’ is still a string value. That’s useful, but a little less than perfect. Sometimes, depending on what I’m doing, retaining that value as a string might be just fine, but sometimes, I’ll want that value to be an actual lua number. So, there’s an additional step I can take to parse the actual attribute values and turn them into a more native form:

local parser = require("remotesvg.parsesvg")

local doc = parser:parseFile("grapes.svg");
doc:parseAttributes();

doc:write(ImageStream)

That line with doc:parseAttributes() tells the document to go through all its attributes and parse them, turning them into more useful values from the Lua perspective. In the case above, the representation of ‘y2’ would become: {y2 = 1093.8906}, which is a number value.

This gets very interesting when you have values where the string representation and the useful lua representation are different.

<svg>
<line x1="10" y1="20" width="10cm" height="12cm" />
</svg>

This will turn into:

{
  x1 = {value = 10},
  y1 = {value = 20},
  width = {value = 10, units = 'cm'},
  height = {value = 12, units = 'cm'}
}

Now, in my Lua code, I can access these values like so:

local doc = parser:parseFile("grapes.svg");
doc:parseAttributes();
print(doc.svg[1].x1.value);

When I subsequently want to write this value out as valid svg, it will turn back into the string representation with no loss of fidelity.
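For a feel of what that attribute parsing involves, here is a minimal sketch for the length case only. `parseLength` and `formatLength` are hypothetical helpers, not remotesvg functions, and the pattern deliberately ignores things like exponent notation.

```lua
-- Split a string such as "10cm" into a number and an optional unit.
-- Returns nil when the string does not look like a plain number.
local function parseLength(str)
	local num, units = str:match("^%s*([+-]?%d*%.?%d+)%s*([%a%%]*)%s*$")
	if not num then
		return nil
	end
	local res = { value = tonumber(num) }
	if units ~= "" then
		res.units = units
	end
	return res
end

-- Turn the parsed form back into attribute text.
local function formatLength(len)
	return tostring(len.value) .. (len.units or "")
end

local width = parseLength("10cm")
print(width.value, width.units)
print(formatLength(width))  --> 10cm
```

A parser that cares about exact round-trips for every float would probably keep the original string around as well; this sketch does not.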

Hidden in this example is a database query. How do I know that doc.svg[1] is going to give me the ‘line’ element that I’m looking for? In this particular case, it’s only because the svg is so simple that I know for a fact that the ‘line’ element is going to show up as the first child in the svg document. But, most of the time, that is not going to be the case.

In any svg that’s of substance, there is the usage of various ‘id’ fields, and that’s typically what is used to find an element. So, how to do that in remotesvg? If we look back at the example svg, we see this ‘id’ attribute on the first gradient: id="linearGradient4892".

How could I possibly find that gradient element based on the id field? Before that though, let’s look at how to enumerate elements in the document in the first place.

local function printElement(elem)
    if type(elem) == "string" then
        -- don't print content values
        return 
    end
    
    print(string.format("==== %s ====", elem._kind))

    -- print the attributes
    for name, value in elem:attributes() do
        print(name,value)
    end
end

local function test_selectAll()
    -- iterate through all the nodes in 
    -- document order, printing something interesting along
    -- the way

    for child in doc:selectAll() do
	   printElement(child)
    end
end

Here is a simple test case where you have a document already parsed, and you want to iterate through the elements, in document order, and just print them out. This is the first step in viewing the document as a database, rather than as an image. The working end of this example is the call to ‘doc:selectAll()’. This amounts to a call to an iterator that is on the BasicElem class, which looks like this:

--[[
	Traverse the elements in document order, returning
	the ones that match a given predicate.
	If no predicate is supplied, then return all the
	elements.
--]]
function BasicElem.selectElementMatches(self, pred)
	local function yieldMatches(parent, predicate)
		for idx, value in ipairs(parent) do
			if predicate then
				if predicate(value) then
					coroutine.yield(value)
				end
			else
				coroutine.yield(value)
			end

			if type(value) == "table" then
				yieldMatches(value, predicate)
			end
		end
	end

  	return coroutine.wrap(function() yieldMatches(self, pred) end)	
end

-- A convenient shorthand for selecting all the elements
-- in the document.  No predicate is specified.
function BasicElem.selectAll(self)
	return self:selectElementMatches()
end

As you can see, ‘selectAll()’ just turns around and calls ‘selectElementMatches()’, passing in no parameters. The selectElementMatches() function then does the actual work. In Lua, there are a few ways to create iterators. In this particular case, where we want to recursively traverse down a hierarchy of nodes (document order), it’s easiest to use this coroutine method. You could instead keep a stack of nodes, pushing as you go down the hierarchy, popping as you return back up, but this coroutine method is much more compact to code, if a bit harder to understand if you’re not used to coroutines. The end result is an iterator that will traverse down a document hierarchy, in document order.
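For comparison, here is roughly what the explicit stack version might look like. `selectAllWithStack` is a hypothetical name, and the sample tree is made up; the traversal order is meant to match the coroutine version (a node first, then its children).

```lua
-- Document-order iterator using an explicit stack instead of a coroutine.
-- Each stack frame remembers a node and the next child index to visit.
local function selectAllWithStack(root)
	local stack = { { node = root, idx = 1 } }

	return function()
		while #stack > 0 do
			local top = stack[#stack]
			local child = top.node[top.idx]
			if child == nil then
				-- finished with this node, go back up
				table.remove(stack)
			else
				top.idx = top.idx + 1
				if type(child) == "table" then
					-- descend into this child on the next call
					stack[#stack + 1] = { node = child, idx = 1 }
				end
				return child
			end
		end
		return nil
	end
end

local tree = {
	_kind = "svg",
	{ _kind = "defs", { _kind = "stop" } },
	"some text",
	{ _kind = "rect" },
}

local seen = {}
for elem in selectAllWithStack(tree) do
	seen[#seen + 1] = type(elem) == "table" and elem._kind or elem
end
print(table.concat(seen, ", "))  --> defs, stop, some text, rect
```

It works, but the bookkeeping that the coroutine version gets for free is now all spelled out by hand.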

Notice also that the ‘selectElementMatches’ function takes a predicate. A predicate is simply a function that takes a single parameter, and will return ‘true’ or ‘false’ depending on what it sees there. This will become useful.

So, how to retrieve an element with a particular ID? Well, when we look at our elements, we know that the ‘id’ field is one of the attributes, so essentially, what we want to do is traverse the document looking for elements that have an id attribute that matches what we’re looking for.

function BasicElem.getElementById(self, id)
    -- predicate: true only for entries whose 'id' matches
    local function filterById(entry)
        --print("filterById: ", entry.id, id)
        if entry.id == id then
            return true;
        end
    end

    -- return the first match, in document order
    for child in self:selectElementMatches(filterById) do
        return child;
    end
end

Here’s a convenient function to do just that. And to use it:

local elem = doc:getElementById("linearGradient10460")

That will retrieve the second linear gradient of the pair of gradients from our svg fragment. That’s great! And the syntax is looking very much like what I might write in JavaScript against the DOM. But, it’s just a database!

Given selectElementMatches(), you’re not just limited to querying against attribute values. You can get at anything, and form as complex queries as you like. For example, you could find all the elements that are deep green, and turn them purple with a simple query loop.
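That green-to-purple loop might look something like the sketch below. To keep it runnable on its own, it uses a standalone copy of the iterator rather than the BasicElem method, and the little document and the "darkgreen" fill value are made up for the example.

```lua
-- Standalone version of the document-order iterator, for illustration.
local function selectElementMatches(root, pred)
	local function visit(parent)
		for _, value in ipairs(parent) do
			if not pred or pred(value) then
				coroutine.yield(value)
			end
			if type(value) == "table" then
				visit(value)
			end
		end
	end

	return coroutine.wrap(function() visit(root) end)
end

-- A made-up document, in the same nested-table form
local doc = {
	_kind = "svg",
	{ _kind = "rect", fill = "darkgreen" },
	{ _kind = "g",
		{ _kind = "circle", fill = "darkgreen" },
		{ _kind = "circle", fill = "gold" },
	},
}

local function isDeepGreen(entry)
	return type(entry) == "table" and entry.fill == "darkgreen"
end

-- The query loop: find them, change them
for elem in selectElementMatches(doc, isDeepGreen) do
	elem.fill = "purple"
end

print(doc[1].fill, doc[2][1].fill, doc[2][2].fill)  --> purple purple gold
```

Changing an attribute while iterating is fine here because the shape of the tree is untouched; adding or removing children mid-traversal would need more care.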

Here’s an example of finding all the elements of a particular kind:

local function test_selectElementMatches()
    print("<==== selectElementMatches: entry._kind == 'g' ====>")
	for child in doc:selectElementMatches(function(entry) if entry._kind == "g" then return true end end) do
		print(child._kind)
	end
end

Or finding all the elements that have a ‘sodipodi’ attribute of some kind:

local function test_selectAttribute()
    -- select the elements that have an attribute
    -- with the name 'sodipodi' in them
    local function hasSodipodiAttribute(entry)
        if type(entry) ~= "table" then
            return false;
        end

        for name, value in entry:attributes() do
            --print("hasSodipodi: ", entry._kind, name, value, type(name))
            if name:find("sodipodi") then
                return true;
            end
        end

        return false
    end

    for child in doc:selectElementMatches(hasSodipodiAttribute) do
        if type(child) == "table" then
            printElement(child)
        end
    end
end

Of course, just finding these elements is one thing. Once found, you can use this to filter out those elements you don’t want. For example, eliminating the ones that are inkscape specific.
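As a sketch of that kind of filtering, the function below strips anything whose name starts with ‘inkscape:’ or ‘sodipodi:’, both attributes and whole child elements. `stripEditorData` and `isEditorSpecific` are hypothetical names, and it works directly on the nested tables rather than through the iterator, since it removes children as it goes.

```lua
-- True for names such as "inkscape:label" or "sodipodi:namedview".
local function isEditorSpecific(name)
	return type(name) == "string"
		and (name:find("^inkscape:") or name:find("^sodipodi:")) ~= nil
end

-- Remove editor-specific attributes and child elements, recursively.
local function stripEditorData(elem)
	-- Clearing existing fields during a pairs() traversal is allowed.
	for name in pairs(elem) do
		if isEditorSpecific(name) then
			elem[name] = nil
		end
	end

	-- Walk children backwards so table.remove does not skip any.
	for i = #elem, 1, -1 do
		local child = elem[i]
		if type(child) == "table" then
			if isEditorSpecific(child._kind) then
				table.remove(elem, i)
			else
				stripEditorData(child)
			end
		end
	end

	return elem
end

local doc = {
	_kind = "svg",
	["inkscape:version"] = "0.48",
	{ _kind = "sodipodi:namedview" },
	{ _kind = "g", id = "layer1", ["inkscape:label"] = "Layer 1" },
}

stripEditorData(doc)
print(#doc, doc[1]._kind, doc[1].id)  --> 1 g layer1
```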

Well, there you have it. First, you can construct your svg programmatically using Lua syntax. Alternatively, you can simply parse an svg file into a lua structure. Last, you can query your document, no matter how it was constructed, for fun and profit.

Of course, the real benefit of being able to parse, and find elements and the like, is it makes manipulating the svg that much easier. Find the node that represents the graph of values, for example, and change those values over time for some form of animation…


SVG And Me

[Image: lineargradient]

That’s a simple linear gradient, generated from an SVG document that looks like this:

 

<svg viewBox = '0 0 120 120' version = '1.1' xmlns = 'http://www.w3.org/2000/svg'   width = '120' height = '120' xmlns:xlink = 'http://www.w3.org/1999/xlink'>
  <defs>
    	<linearGradient id = 'MyGradient'>
      <stop stop-color = 'green' offset = '5%' />
      <stop stop-color = 'gold' offset = '95%' />
    </linearGradient>
  </defs>
  <rect x = '10' y = '10' height = '100' fill = 'url(#MyGradient)' width = '100' />
</svg>

 

Fair enough. And of course there are a thousand and one ways to generate .svg files. For various reasons, I am interested in generating .svg files on the fly in a Lua context. So, the code I used to generate this SVG document looks like this:

require("remotesvg.SVGElements")()
local FileStream = require("remotesvg.filestream")
local SVGStream = require("remotesvg.SVGStream")

local ImageStream = SVGStream(FileStream.open("test_lineargradient.svg"))

local doc = svg {
	width = "120",
	height = "120",
	viewBox = "0 0 120 120",
    ['xmlns:xlink'] ="http://www.w3.org/1999/xlink",

    defs{
        linearGradient {id="MyGradient",
            stop {offset="5%",  ['stop-color']="green"};
            stop {offset="95%", ['stop-color']="gold"};
        }
    },

    rect {
    	fill="url(#MyGradient)",
        x=10, y=10, width=100, height=100,
    },
}

doc:write(ImageStream);

This comes from my remotesvg project. If you squint your eyes, these look fairly similar I think. In the second case, it’s definitely valid Lua script. Mostly it’s nested tables with some well known types. But, where are all the parentheses, and how can you just put a name in front of ‘{‘ and have that do anything?

OK, so Lua has some nice syntactic tricks up its sleeve that make certain things a bit easier. For example, if a function is called with a single argument, and that argument is a table constructor or a string literal, you can leave off the ‘()’ combination. I’ve mentioned this before way long back when I was doing some Windows code, and supporting the “L” compiler thing for unicode literals.

In this case, it’s about tables, and later we’ll see about strings. The following two things are equivalent:

local function myFunc(tbl)
  for k,v in pairs(tbl) do
    print(k,v)
  end
end


myFunc({x=1, y=1, id="MyID"})

-- Or this slightly shorter form

myFunc {x=1, y=1, id="MyID"}

OK. So that’s how we get rid of those pesky ‘()’ characters, which don’t add to the conversation. In lua, since tables are a basic type, I can easily include tables in tables, nesting as deeply as I please. So, what’s the other trick here then? The fact that all those things before the ‘{‘ are simply the names of functions that take a table. This is one area where a bit of trickery goes a long way. I created a ‘base type’ if you will, which knows how to construct these tables from a function, and do the nesting, and ultimately print out SVG. It looks like this:

--[[
	BasicElem

	A base type for all other SVG Elements.
	This can do the basic writing
--]]
local BasicElem = {}
setmetatable(BasicElem, {
	__call = function(self, ...)
		return self:new(...);
	end,
})
local BasicElem_mt = {
	__index = BasicElem;
}

function BasicElem.new(self, kind, params)
	local obj = params or {}
	obj._kind = kind;

	setmetatable(obj, BasicElem_mt);

	return obj;
end

-- Add an attribute to ourself
function BasicElem.attr(self, name, value)
	self[name] = value;
	return self;
end

-- Add a new child element
function BasicElem.append(self, name)
	-- based on the obj, find the right object
	-- to represent it.
	local child = nil;

	if type(name) == "table" then
		child = name;
	elseif type(name) == "string" then
		child = BasicElem(name);
	else
		return nil;
	end

	table.insert(self, child);

	return child;
end

function BasicElem.write(self, strm)
	strm:openElement(self._kind);

	local childcount = 0;

	for name, value in pairs(self) do
		if type(name) == "number" then
			childcount = childcount + 1;
		else
			if name ~= "_kind" then
				strm:writeAttribute(name, tostring(value));
			end
		end
	end

	-- if we have some number of child nodes
	-- then write them out 
	if childcount > 0 then
		-- first close the starting tag
		strm:closeTag();

		-- write out child nodes
		for idx, value in ipairs(self) do
			if type(value) == "table" then
				value:write(strm);
			else
				-- write out pure text nodes
				strm:write(tostring(value));
			end
		end
		
		strm:closeElement(self._kind);
	else
		strm:closeElement();
	end
end

And further on in the library, I have things like this:

defs = function(params) return BasicElem('defs', params) end;

So, ‘defs’ is a function, which takes a single parameter (typically a table), and it constructs an instance of the BasicElem ‘class’, handing in the name of the element, and the specified ‘params’. And that’s that…
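Since every element follows the same pattern, the constructors can be stamped out in a loop. The sketch below shows the idea with a plain table standing in for BasicElem; `makeElement` and the `E` table are hypothetical, and I’m not claiming this is how remotesvg generates its own list.

```lua
-- Build a constructor function for one element kind.
-- The returned function tags the given table with its kind.
local function makeElement(kind)
	return function(params)
		params = params or {}
		params._kind = kind
		return params
	end
end

-- Stamp out constructors for a handful of element names.
local E = {}
for _, kind in ipairs { "svg", "defs", "rect", "line", "g" } do
	E[kind] = makeElement(kind)
end

-- Used with the no-parentheses call syntax, tables nest naturally.
local doc = E.svg {
	width = "120",
	E.defs {},
	E.rect { x = 10, y = 10 },
}

print(doc._kind, doc[1]._kind, doc[2]._kind, doc[2].x)  --> svg defs rect 10
```

Named attributes land in the hash part of the table and child elements land in the array part, which is exactly the split that write() relies on.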

BasicElem has a function ‘write(strm)’, which knows how to turn the various values and tables it contains into correct looking SVG elements and attributes. It’s all right there in the write() function. In addition, it adds a couple more tidbits, such as the attr() and append() functions.
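The write() function leans on a stream object with a handful of methods: openElement, writeAttribute, closeTag, closeElement, and write. Here is a minimal sketch of such a stream that just collects text in memory. `StringStream` is a made-up stand-in, not the real SVGStream, and it does no escaping of attribute values.

```lua
-- In-memory stand-in for the stream that write() expects.
local StringStream = {}
StringStream.__index = StringStream

function StringStream.new()
	return setmetatable({ parts = {} }, StringStream)
end

function StringStream:write(str)
	self.parts[#self.parts + 1] = str
end

function StringStream:openElement(kind)
	self:write("<" .. kind)
end

function StringStream:writeAttribute(name, value)
	self:write(string.format(" %s='%s'", name, tostring(value)))
end

-- Ends a start tag that will be followed by children.
function StringStream:closeTag()
	self:write(">")
end

-- With a kind: write the end tag. Without: self-close the element.
function StringStream:closeElement(kind)
	if kind then
		self:write("</" .. kind .. ">")
	else
		self:write(" />")
	end
end

function StringStream:tostring()
	return table.concat(self.parts)
end

-- The same sequence of calls write() would make for a 'g' with one 'rect'
local out = StringStream.new()
out:openElement("g")
out:writeAttribute("id", "layer1")
out:closeTag()
out:openElement("rect")
out:writeAttribute("x", 10)
out:closeElement()
out:closeElement("g")
print(out:tostring())  --> <g id='layer1'><rect x='10' /></g>
```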

Now that these basic constructs exist, what can be done? Well, first of all, every one of the SVG elements is covered with the simple construct we see with the ‘defs’ element. How might you use this?

	local doc = svg {
		width = "12cm", 
		height= "4cm", 
		viewBox="0 0 1200 400",
	}


	doc:append('rect')
		:attr("x", 1)
		:attr("y", 2)
		:attr("width", 1198)
		:attr("height", 398)
		:attr("fill", "none")
		:attr("stroke", "blue")
		:attr("stroke-width", 2);

   local l1 = line({x1=100, y1=300, x2=300, y2=100, stroke = "green", ["stroke-width"]=5});
   local l2 = line({x1=300, y1=300, x2=500, y2=100, stroke = "green", ["stroke-width"]=20});
   local l3 = line({x1=500, y1=300, x2=700, y2=100, stroke = "green", ["stroke-width"]=25});
   local l4 = line({x1=700, y1=300, x2=900, y2=100, stroke = "green", ["stroke-width"]=20});
   local l5 = line({x1=900, y1=300, x2=1100, y2=100, stroke = "green", ["stroke-width"]=25});


	--doc:append(r1);
	doc:append(l1);
	doc:append(l2);
	doc:append(l3);
	doc:append(l4);
	doc:append(l5);

In this case, instead of doing the ‘inlined table document’ style of the first example, I’m doing more of a ‘programmatic progressive document building’ style. I create the basic ‘svg’ element and save it in the doc variable. Then I use the ‘append()’ function to create a ‘rect’ element. On that same element, I can use a shorthand to add its attributes. Then, I can create separate ‘line’ elements, and append them onto the document as well. That’s pretty special if you need to construct the document based on some data you’re seeing, and you can’t use the embedded table style up front.

There are some special elements that get extra attention though. Aside from the basic table construction, and attribute setting, the ‘path’ element has a special retained mode graphics building capability.

	local p1 = path {
		fill="red", 
		stroke="blue", 
		["stroke-width"]=3
	};
	
	p1:moveTo(100, 100);
	p1:lineTo(300, 100);
	p1:lineTo(200, 300);
	p1:close();

	local doc = svg {
		width="4cm", 
		height="4cm", 
		viewBox="0 0 400 400",
		
		rect {
			x="1", y="1", 
			width="398", height="398",
        	fill="none", stroke="blue"};
	
		p1;
	}

In this case, I create my ‘path’ element, and then I use its various path construction functions such as ‘moveTo()’, and ‘lineTo()’. There’s the full set of arcs, Bézier curves, and the like, so you have all the available path construction commands. Again, this works out fairly well when you are trying to build something on the fly based on some previously unknown data.
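Under the hood, a builder like this mostly amounts to accumulating path commands and joining them into the ‘d’ attribute. Here is a small sketch of that idea. The `Path` table, the `_cmds` field, and `pathData()` are all assumptions for illustration, and unlike the calls above these methods return self so they can be chained.

```lua
-- Hypothetical path builder: collects commands, emits the 'd' string.
local Path = {}
Path.__index = Path

function Path.new(params)
	local obj = params or {}
	obj._kind = "path"
	obj._cmds = {}   -- accumulated path commands, in order
	return setmetatable(obj, Path)
end

function Path:moveTo(x, y)
	self._cmds[#self._cmds + 1] = "M " .. x .. " " .. y
	return self
end

function Path:lineTo(x, y)
	self._cmds[#self._cmds + 1] = "L " .. x .. " " .. y
	return self
end

function Path:close()
	self._cmds[#self._cmds + 1] = "Z"
	return self
end

-- The text that would end up in the 'd' attribute.
function Path:pathData()
	return table.concat(self._cmds, " ")
end

local p1 = Path.new { fill = "red", stroke = "blue" }
p1:moveTo(100, 100):lineTo(300, 100):lineTo(200, 300):close()
print(p1:pathData())  --> M 100 100 L 300 100 L 200 300 Z
```

A writer for this element would need to skip the bookkeeping field and emit the joined string as ‘d’ instead.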

There’s one more important construct, and that’s string literals. There are cases where you might want to do something that this easy library just doesn’t make simple. In those cases, you might just want to embed some literal text into the output document. Well, luckily, Lua has a fairly easy ability to indicate single or multi-line text, and the BasicElem object knows what to do if it sees it.

    g {
      ['font-family']="Arial",
      ['font-size']="36",

      [[
      <text x="48" y="48">Test a motion path</text> 
      <text x="48" y="95" fill="red">'values' attribute.</text> 
      <path d="M90,258 L240,180 L390,180" fill="none" stroke="black" stroke-width="6" /> 
      <rect x="60" y="198" width="60" height="60" fill="#FFCCCC" stroke="black" stroke-width="6" /> 
      <text x="90" y="300" text-anchor="middle">0 sec.</text> 
      <rect x="210" y="120" width="60" height="60" fill="#FFCCCC" stroke="black" stroke-width="6" /> 
      <text x="240" y="222" text-anchor="middle">3+</text> 
      <rect x="360" y="120" width="60" height="60" fill="#FFCCCC" stroke="black" stroke-width="6" /> 
      <text x="390" y="222" text-anchor="middle">6+</text> 
      ]];

      path {
        d="M-30,0 L0,-60 L30,0 z", 
        fill="blue", 
        stroke="red", 
        ['stroke-width']=6, 
        
        animateMotion {values="90,258;240,180;390,180", begin="0s", dur="6s", calcMode="linear", fill="freeze"} 
      } 
    }

Notice the portion after the ‘font-size’ attribute is a Lua multi-line string literal. This section will be included in the output document verbatim. Another thing to notice here is that ‘path’ element. Although path is specialized, it still has the ability to have attributes, and even have child nodes of its own, such as for animation.

Another case where the literals may come in handy is for CSS style sheets.

	defs {
		style {type="text/css",
[[
			.land
			{
				fill: #CCCCCC;
				fill-opacity: 1;
				stroke:white;
				stroke-opacity: 1;
				stroke-width:0.5;
			}
]]
		};
	};

The ‘style’ element is well known, but the format of the actual content is a bit too specific to translate into a Lua form, so it can simply be included as a literal.

Well, that’s the beginning of this journey. Ultimately I want to view some live graphics generated from data, and send some commands back to the server to perform some functions. At this point, I can use Lua to generate the SVG on the fly, and there isn’t an SVG parser, or Javascript interpreter in sight.


Spelunking Linux – procfs or is that sysctl?

Last time around, I introduced some simple things with lj2procfs.  Being able to simply access the contents of the various files within procfs is a bit of convenience.  Really what lj2procfs is doing is just giving you a common interface to the data in those files.  Everything shows up as simple lua values, typically tables, with strings, and numbers.  That’s great for most of what you’d be doing with procfs, just taking a look at things.

But, on Linux, procfs has another capability.  The /proc/sys directory contains a few interesting directories of its own:

 

abi/
debug/
dev/
fs/
kernel/
net/
vm/

And if you look into these directories, you find some more interesting files. For example, in the ‘kernel/’ directory, we can see a little bit of this:

hostname
hotplug
hung_task_check_count
hung_task_panic
hung_task_timeout_secs
hung_task_warnings
io_delay_type
kexec_load_disabled
keys
kptr_restrict
kstack_depth_to_print
max_lock_depth
modprobe
.
.
.

Now, these are looking kind of interesting. These files typically contain tunable portions of the kernel. On other unices, these values might be controlled through the sysctl() function call. On Linux, that function would just manipulate the contents of these files. So, why not just use lj2procfs to do the same?

Let’s take a look at a few relatively simple tasks. First, I want to get the version of the OS running on my machine. This can be obtained through the file /proc/sys/kernel/version

local procfs = require("lj2procfs.procfs")
print(procfs.sys.kernel.version)

$ #15-Ubuntu SMP Thu Apr 16 23:32:37 UTC 2015

This is the same string returned from the call ‘uname -v’

And, to get the hostname of the machine:

print(procfs.sys.kernel.hostname)
$ ubuntu

Which is what the ‘hostname’ command returns on my machine.

And what about setting the hostname? First of all, you’ll want to do this as root, but it’s equally simple:

procfs.sys.kernel.hostname = 'alfredo'

Keep in mind that setting the hostname in this way is transient, and it can seriously mess things up, like your ability to sudo afterwards. But, there you have it.

Any value under /proc/sys can be retrieved or set using this fairly simple mechanism. I find this to be very valuable for two reasons. First of all, spelunking these values makes for great discovery. More importantly, being able to capture and set the values makes for a fairly easily tunable system.
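If you’re wondering how table indexing can turn into file access, metatables are one way to do it. The sketch below is a guess at the general shape, not the lj2procfs implementation: `makeProxy` is a made-up name, and the read and write functions are passed in so the example can run against a fake in-memory file system instead of the real /proc/sys.

```lua
-- Map table indexing onto paths: proxy.kernel.hostname reads
-- <root>/kernel/hostname, and assignment writes to it.
-- read(path) returns a string, or nil when path is not a readable file;
-- write(path, value) returns true, or false plus an error message.
local function makeProxy(path, read, write)
	return setmetatable({}, {
		__index = function(_, key)
			local child = path .. "/" .. key
			local value = read(child)
			if value ~= nil then
				return value
			end
			-- not a file: assume a directory and keep descending
			return makeProxy(child, read, write)
		end,

		__newindex = function(_, key, value)
			local ok, err = write(path .. "/" .. key, value)
			if not ok then
				error(err, 2)
			end
		end,
	})
end

-- A fake file system, so this runs anywhere and needs no root
local files = { ["/proc/sys/kernel/hostname"] = "ubuntu\n" }

local function fakeRead(path)
	local data = files[path]
	return data and (data:gsub("%s+$", ""))  -- trim trailing newline
end

local function fakeWrite(path, value)
	files[path] = tostring(value)
	return true
end

local sys = makeProxy("/proc/sys", fakeRead, fakeWrite)
print(sys.kernel.hostname)  --> ubuntu
sys.kernel.hostname = "alfredo"
print(sys.kernel.hostname)  --> alfredo
```

Swapping in io.open based read and write functions would point the same proxy at the real files, with the usual caveat about needing root to write.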

As an example of how this can be used for system diagnostics and tuning: you can capture the kernel values using a simple command that just dumps what you want into a table, then send that table to anyone else for analysis. Similarly, if someone has come up with a system configuration that is great for a particular task, tuning the VM allocations, networking values, and the like, they can send you the configuration (just a string value that is a lua table) and you can apply it to your system.
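The "string value that is a lua table" part can be sketched in a few lines. `serialize` and `deserialize` are hypothetical helpers, the settings shown are just sample values, and this only handles a flat table of string keys and values.

```lua
-- Turn a flat table of settings into Lua source text.
-- Keys are sorted so the output is stable from run to run.
local function serialize(tbl)
	local keys = {}
	for k in pairs(tbl) do
		keys[#keys + 1] = k
	end
	table.sort(keys)

	local out = { "return {" }
	for _, k in ipairs(keys) do
		out[#out + 1] = string.format("  [%q] = %q;", k, tostring(tbl[k]))
	end
	out[#out + 1] = "}"
	return table.concat(out, "\n")
end

-- Turn that text back into a table.
-- This runs the text as Lua code, so only use it on input you trust.
local function deserialize(str)
	local chunk = assert((loadstring or load)(str))
	return chunk()
end

-- Sample values only; a real capture would read them from /proc/sys
local captured = {
	["kernel/hostname"] = "ubuntu",
	["vm/swappiness"] = "60",
}

local text = serialize(captured)
local restored = deserialize(text)
print(restored["kernel/hostname"], restored["vm/swappiness"])  --> ubuntu 60
```

Applying a received configuration is then just a loop over the restored table, writing each value back to its path. Because loading the string executes it, a configuration from a stranger deserves a read-through first.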

This is a tad better than simply trying to look at system logs to determine after the fact what might be going on with a system. Perhaps the combination of these live values, as well as correlation with system logs, makes it easier to automate the process of diagnosing and tuning a system.

Well, there you have it. The lj2procfs thing is getting more concise, as well as becoming more usable at the same time.


Spelunking Linux – what is this auxv thing anyway

While spelunking Linux, trying to find an easier way to do this or that, I ran across this vDSO thing (virtual ELF Dynamic Shared Object). What?

Ok, it’s like this. I was implementing direct calling of syscalls on Linux, and reading up on how the C libraries do things. On Linux, when you want to talk to the kernel, you typically go through syscalls, or ioctl calls, or netlinks. With syscall, it’s actually a fairly expensive process, switching from userspace to kernel space, issuing and responding to interrupts, etc. In some situations this could be a critical performance hit. So, to make things easier/faster, some of the system calls are implemented in this little ELF package (vDSO). This little elf package (a dynamic link library) is loaded into every application on Linux. Then, the C library can decide to make calls into that library, instead of syscalls, thus saving a lot of overhead and speeding things up. Not all systems have this capability, but many do.

Alright, so how does the C runtime know whether the capability is there or not, and where this little library is, and how to get at the functions therein? In steps our friend auxv. In the GNU C library, there is a lone call:

unsigned long getauxval(unsigned long);

What values can you get out of this thing? Well, the constants can be found in the elf.h file, and look like:

#define AT_PLATFORM 15
#define AT_PAGESZ 6

And about 30 others. How you use each of these depends on the type of the value that you are looking up. For instance, the AT_PLATFORM returns a pointer to a null terminated string. The AT_PAGESZ returns an integer which represents the memory page size of the machine you’re running on.

OK, so what’s the lua version?

local ffi = require("ffi")

ffi.cdef[[
static const int AT_NULL = 0;
static const int AT_IGNORE = 1;
static const int AT_EXECFD = 2;
static const int AT_PHDR = 3;
static const int AT_PHENT = 4;
static const int AT_PHNUM = 5;
static const int AT_PAGESZ = 6;
static const int AT_BASE = 7;
static const int AT_FLAGS = 8;
static const int AT_ENTRY = 9;
static const int AT_NOTELF = 10;
static const int AT_UID = 11;
static const int AT_EUID = 12;
static const int AT_GID = 13;
static const int AT_EGID = 14;
static const int AT_PLATFORM = 15;
static const int AT_HWCAP = 16;
static const int AT_CLKTCK = 17;
static const int AT_FPUCW = 18;
static const int AT_DCACHEBSIZE = 19;
static const int AT_ICACHEBSIZE = 20;
static const int AT_UCACHEBSIZE = 21;
static const int AT_IGNOREPPC = 22;
static const int AT_SECURE = 23;
static const int AT_BASE_PLATFORM = 24;
static const int AT_RANDOM = 25;
static const int AT_HWCAP2 = 26;
static const int AT_EXECFN = 31;
static const int AT_SYSINFO = 32;
static const int AT_SYSINFO_EHDR = 33;
static const int AT_L1I_CACHESHAPE = 34;
static const int AT_L1D_CACHESHAPE = 35;
static const int AT_L2_CACHESHAPE = 36;
]]

ffi.cdef[[
unsigned long getauxval(unsigned long);
]]

With this, I can then write code that looks like the following:

local function getStringAuxVal(atype)
	local res = libc.getauxval(atype)

	if res == 0 then 
		return false, "type not found"
	end

	local str = ffi.string(ffi.cast("char *", res));
	return str
end

local function getIntAuxValue(atype)
	local res = libc.getauxval(atype)

	if res == 0 then 
		return false, "type not found"
	end

	return tonumber(res);
end

local function getPtrAuxValue(atype)
	local res = libc.getauxval(atype)

	if res == 0 then 
		return false, "type not found"
	end

	return ffi.cast("intptr_t", res);
end


-- convenience functions
local function getExecPath()
	return getStringAuxVal(libc.AT_EXECFN);
end

local function getPlatform()
	return getStringAuxVal(libc.AT_PLATFORM);
end

local function getPageSize()
	return getIntAuxValue(libc.AT_PAGESZ);
end

local function getRandom()
	return getPtrAuxValue(libc.AT_RANDOM);
end

--[[
	Some test cases
--]]
print(" Platform: ", getPlatform());
print("Exec Path: ", getExecPath());
print("Page Size: ", getPageSize());
print("   Random: ", getRandom());


And so on and so forth, assuming you have the proper ‘libc’ luajit ffi binding, which gives you access to constants through the ‘libc.’ mechanism.

OK, fine, if I’m a C programmer, and I just want to port some code that’s already doing this sort of thing. By the way, the one value that we’re interested in is: AT_SYSINFO_EHDR. That contains a pointer to the beginning of our vDSO. Then you can call functions directly from there (there’s an API for that).

But, if I’m a lua programmer, I’ve come to expect more out of my environment, largely because I’m lazy and don’t like so much typing. Upon further examination, you can get this information yourself. If you’re hard core, you can look at the top of memory in your program, and map that location to a pointer you can fiddle with directly. Otherwise, you can actually get this information from a ‘file’ on Linux.

Turns out that you can get this info from ‘/proc/self/auxv’, if you’re running this command about your current process (which you most likely are). So, now what can I do with that? Well, the lua way would be the following:

-- auxv_iter.lua
local ffi = require("ffi")
local libc = require("libc")

local E = {}

-- This table maps the constant values for the various
-- AT_* types to their symbolic names.  This table is used
-- to both generate cdefs, as well as hand back symbolic names
-- for the keys.
local auxtbl = {
	[0] =  "AT_NULL";
	[1] =  "AT_IGNORE";
	[2] = "AT_EXECFD";
	[3] = "AT_PHDR";
	[4] = "AT_PHENT";
	[5] = "AT_PHNUM";
	[6] = "AT_PAGESZ";
	[7] = "AT_BASE";
	[8] = "AT_FLAGS";
	[9] = "AT_ENTRY";
	[10] = "AT_NOTELF";
	[11] = "AT_UID";
	[12] = "AT_EUID";
	[13] = "AT_GID";
	[14] = "AT_EGID";
	[17] = "AT_CLKTCK";
	[15] = "AT_PLATFORM";
	[16] = "AT_HWCAP";
	[18] = "AT_FPUCW";
	[19] = "AT_DCACHEBSIZE";
	[20] = "AT_ICACHEBSIZE";
	[21] = "AT_UCACHEBSIZE";
	[22] = "AT_IGNOREPPC";
	[23] = "AT_SECURE";
	[24] = "AT_BASE_PLATFORM";
	[25] = "AT_RANDOM";
	[26] = "AT_HWCAP2";
	[31] = "AT_EXECFN";
	[32] = "AT_SYSINFO";
	[33] = "AT_SYSINFO_EHDR";
	[34] = "AT_L1I_CACHESHAPE";
	[35] = "AT_L1D_CACHESHAPE";
	[36] = "AT_L2_CACHESHAPE";
}

-- Given an auxv key (type), and the value returned from reading
-- the file, turn the value into a lua specific type.
-- string pointers --> string
-- int values -> number
-- pointer values -> intptr_t

local function auxvaluefortype(atype, value)
	if atype == libc.AT_EXECFN or atype == libc.AT_PLATFORM then
		return ffi.string(ffi.cast("char *", value))
	end

	if atype == libc.AT_UID or atype == libc.AT_EUID or
		atype == libc.AT_GID or atype == libc.AT_EGID or 
		atype == libc.AT_FLAGS or atype == libc.AT_PAGESZ or
		atype == libc.AT_HWCAP or atype == libc.AT_CLKTCK or 
		atype == libc.AT_PHENT or atype == libc.AT_PHNUM then

		return tonumber(value)
	end

	if atype == libc.AT_SECURE then
		if value == 0 then 
			return false
		else
			return true;
		end
	end


	return ffi.cast("intptr_t", value);
end

-- iterate over the auxv values at the specified path
-- if no path is specified, use '/proc/self/auxv' to get
-- the values for the currently running program
local function auxviterator(path)
	path = path or "/proc/self/auxv"
	local fd = libc.open(path, libc.O_RDONLY);

	local params = {
		fd = fd;
		keybuff = ffi.new("intptr_t[1]");
		valuebuff = ffi.new("intptr_t[1]");
		buffsize = ffi.sizeof(ffi.typeof("intptr_t"));
	}


	local function gen_value(param, state)
		local res1 = libc.read(param.fd, param.keybuff, param.buffsize)
		local res2 = libc.read(param.fd, param.valuebuff, param.buffsize)
		if param.keybuff[0] == 0 then
			libc.close(param.fd);
			return nil;
		end

		local atype = tonumber(param.keybuff[0])
		return state, atype, auxvaluefortype(atype, param.valuebuff[0])
	end

	return gen_value, params, 0

end

-- generate ffi.cdef calls to turn the symbolic type names
-- into constant integer values
local cdefsGenerated = false;

local function gencdefs()
	for k,v in pairs(auxtbl) do		
		-- since we don't know if this is already defined, we wrap
		-- it in a pcall to catch the error
		pcall(function() ffi.cdef(string.format("static const int %s = %d;", v,k)) end)
	end
	cdefsGenerated = true;
end

-- get a single value for specified key.  A path can be specified
-- as well (default is '/proc/self/auxv')
-- this is most like the gnuC getauxval() function
local function getOne(key, path)
	-- iterate over the values, looking for the one we want
	for _, atype, value in auxviterator(path) do
		if atype == key then
			return value;
		end
	end

	return nil;
end

E.gencdefs = gencdefs;
E.keyvaluepairs = auxviterator;	
E.keynames = auxtbl;
E.getOne = getOne;

setmetatable(E, {
	-- we allow the user to specify one of the symbolic constants
	-- when doing a 'getOne()'.  This indexing allows for the creation
	-- and use of those constants if they haven't already been specified
	__index = function(self, key)
		if not cdefsGenerated then
			gencdefs();
		end

		local success, value = pcall(function() return ffi.C[key] end)
		if success then
			rawset(self, key, value);
			return value;
		end

		return nil;
	end,

})

return E

In a nutshell, this is all you need for all the lua based auxv goodness in your life. Here are a couple of examples of usage in action:

local init = require("test_setup")()
local auxv_util = require("auxv_iter")
local apairs = auxv_util.keyvaluepairs;
local keynames = auxv_util.keynames;
local auxvGetOne = auxv_util.getOne;


--auxv_util.gencdefs();
print("==== Iterate All ====")
local function printAll(path)
	for _, key, value in apairs(path) do
		io.write(string.format("%20s[%2d] : ", keynames[key], key))
		print(value);
	end
end

-- print all the entries
printAll();

-- try to get a specific one
print("==== Get Singles ====")
print(" Platform: ", auxvGetOne(auxv_util.AT_PLATFORM))
print("Page Size: ", auxvGetOne(auxv_util.AT_PAGESZ))

The output from printAll() might look like this:

==== Iterate All ====
     AT_SYSINFO_EHDR[33] : 140721446887424LL
            AT_HWCAP[16] : 3219913727
           AT_PAGESZ[ 6] : 4096
           AT_CLKTCK[17] : 100
             AT_PHDR[ 3] : 4194368LL
            AT_PHENT[ 4] : 56
            AT_PHNUM[ 5] : 10
             AT_BASE[ 7] : 140081410764800LL
            AT_FLAGS[ 8] : 0
            AT_ENTRY[ 9] : 4208720LL
              AT_UID[11] : 1000
             AT_EUID[12] : 1000
              AT_GID[13] : 1000
             AT_EGID[14] : 1000
           AT_SECURE[23] : false
           AT_RANDOM[25] : 140721446787113LL
           AT_EXECFN[31] : /usr/local/bin/luajit
         AT_PLATFORM[15] : x86_64

The printAll() function uses the auxv iteration function, which in turn reads the key/value pairs directly from the /proc/self/auxv file. No need for the GNU C lib function at all. It goes further and turns the raw ‘unsigned long’ values into the appropriate data type based on what kind of data the key represents. So, you get a lua string, and not just a pointer to a C string.
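The record layout is simple enough that the same loop can be sketched without any libc binding at all. What follows is a minimal illustration, not the code from auxv_iter.lua: it assumes a little-endian machine, and the names ‘readword’ and ‘parseauxv’ are made up for this example. It reads the whole file with plain io.open and decodes fixed-size key/value words until it hits AT_NULL.

```lua
-- Decode a little-endian unsigned integer of `size` bytes starting at `pos`.
local function readword(data, pos, size)
	local value = 0
	for i = size, 1, -1 do
		value = value * 256 + data:byte(pos + i - 1)
	end
	return value
end

-- Parse raw auxv bytes into a key -> value table.
-- wordsize is assumed to be 8 (64-bit); pass 4 for a 32-bit process.
local function parseauxv(data, wordsize)
	wordsize = wordsize or 8
	local entries = {}
	local pos = 1
	while pos + 2 * wordsize - 1 <= #data do
		local key = readword(data, pos, wordsize)
		local value = readword(data, pos + wordsize, wordsize)
		if key == 0 then break end	-- AT_NULL terminates the vector
		entries[key] = value
		pos = pos + 2 * wordsize
	end
	return entries
end

-- Usage, on a Linux system where the file exists
local f = io.open("/proc/self/auxv", "rb")
if f then
	local entries = parseauxv(f:read("*a"))
	f:close()
	print("Page Size: ", entries[6])	-- 6 is AT_PAGESZ
end
```

This only suits the integer entries. Values such as AT_PLATFORM are pointers into the process, and turning those into strings still needs the ffi cast shown earlier.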

In the second example, getting singles, the output is simply this:

 Platform: 	x86_64
Page Size: 	4096

The code for this goes into a bit of trickery that’s possible with luajit. First of all, notice the use of ‘auxv_util.AT_PAGESZ’. There is nothing in the auxv_iter.lua file that supports this value directly. There is the table of names, and then there’s that ‘setmetatable’ at the end of things. Here’s where the trickery happens. Basically, the ‘__index’ function is called whenever you put a ‘.’ after a table to try and access something, and that something isn’t in the table. You get a chance to make something up and return it. In this case, we first call ‘gencdefs()’ if that hasn’t already been called. This will generate ‘static const int XXX’ for all the values in the table of names, so that we can then do a lookup of the values in the ‘ffi.C.’ namespace, using the name. If we find a name, then we add it to the table, so next time the lookup will succeed, and we won’t end up calling the __index function.
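Stripped of the ffi parts, the lazy lookup pattern can be sketched as follows. Here a plain table named ‘known’ stands in for the ‘ffi.C’ namespace, and the ‘lookups’ counter exists only to show that the second access never reaches __index. Both are inventions for this example.

```lua
-- stand-in for the ffi.C namespace (an assumption for this sketch)
local known = { AT_PAGESZ = 6, AT_PLATFORM = 15 }
local lookups = 0

local E = setmetatable({}, {
	__index = function(self, key)
		lookups = lookups + 1
		local value = known[key]
		if value ~= nil then
			-- cache the value in the table itself, so the next
			-- access finds it directly and skips __index
			rawset(self, key, value)
		end
		return value
	end,
})

local first = E.AT_PAGESZ	-- goes through __index
local second = E.AT_PAGESZ	-- served straight from the table
print(first, second, lookups)
```

The counter stays at 1 after two reads, which is the whole point of the rawset() call.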

At any rate, we now have the requisite value to lookup. Then we just roll through the iterator, and return when we’ve got the value we were looking for. The conversion to the appropriate lua type is automatic.

And there you have it! From relative obscurity, to complete usability, in one iterator. Being able to actually get function pointers in the vDSO is the next step. That will require another API wrapper, or worst case, an all encompassing ELF parser…


cUrling up to the net – a LuaJIT binding

I have need to connect to some Azure services from Linux, using C++. There are a few C/C++ libraries around that will make life relatively easy in this regard, but I thought I’d go with an option that has me learn about something I don’t use that often, but will be fairly powerful and complete.  I chose to use cURL, or libcurl.so to be more precise.  Why?  cURL has been around for ages, has continued to evolve, and makes it fairly easy to do anything from ftp down/upload to https connections.  It has tons of options and knobs, including dealing with authentication, SSL, or just acting like an ordinary socket if you prefer that.

First I created a luajit binding to libcurl.  This just follows my usual learning pattern.  In order to conquer an API, you must first render it useful from Lua.  I did my typical two part binding, first a fairly faithful low level binding, then added some luajit idioms atop.  In this particular case, there’s not a big database to query, although there are quite a few options that get listed in a table:

 

CINIT(COOKIE, OBJECTPOINT, 22),
CINIT(HTTPHEADER, OBJECTPOINT, 23),
CINIT(HTTPPOST, OBJECTPOINT, 24),
CINIT(SSLCERT, OBJECTPOINT, 25),
CINIT(KEYPASSWD, OBJECTPOINT, 26),
CINIT(CRLF, LONG, 27),

This is an excerpt from the original curl.h header file. There is a listing of some 200 options which can be set on a curl connection (depending on the context of the connection). This CINIT is a macro that sets the value of an enum appropriately. Well, those macros don’t work in the luajit ffi.cdef call, so I needed a way to convert these. I could have just run the lot through the gcc pre-processor, and that would give me the values I needed, but I thought I’d take a different approach.

I wrote a bit of script to scan the curl.h file looking for the CINIT lines, and turn them into something interesting.

local function startswith(s, prefix)
    return string.find(s, prefix, 1, true) == 1
end

local function writeClean(filename)
	for line in io.lines(filename) do
		if startswith(line, "CINIT") then
			-- keep the captures local so they don't leak into the global table
			local name, tp, num = line:match("CINIT%((%g+),%s*(%g+),%s*(%d+)")
			if name then
				print(string.format("\t%-25s = {'%s', %s},", name, tp, num))
			end
		end
	end
end

generates…

	COOKIE                    = {'OBJECTPOINT', 22},
	HTTPHEADER                = {'OBJECTPOINT', 23},
	HTTPPOST                  = {'OBJECTPOINT', 24},
	SSLCERT                   = {'OBJECTPOINT', 25},
	KEYPASSWD                 = {'OBJECTPOINT', 26},
	CRLF                      = {'LONG', 27},

Well, that’s nice. Now I have it as a useful table. I can write another script to turn that into enums, or anything else.

local ffi = require("ffi")

local filename = arg[1] or "CurlOptions"

local options = require(filename)

ffi.cdef[[
typedef enum {
	CURLOPTTYPE_LONG          = 0,
	CURLOPTTYPE_OBJECTPOINT   = 10000,
	CURLOPTTYPE_FUNCTIONPOINT = 20000,
	CURLOPTTYPE_OFF_T         = 30000
};
]]


local function CINIT(na,t,nu) 
	return string.format("\tCURLOPT_%s = CURLOPTTYPE_%s+%d,", na, t, nu)
end

local tbl = {}
local function addenum(name, type, number)
	table.insert(tbl, CINIT(name, type, number));
end

table.insert(tbl, "local ffi = require('ffi')");
table.insert(tbl, "ffi.cdef[[\ntypedef enum {")

for k,v in pairs(options) do
	addenum(k, v[1], v[2]);
end

table.insert(tbl, "} CURLoption;]]");

local tblstr = table.concat(tbl,'\n')
print(tblstr)
-- now get the definitions as a giant string
-- and execute it
local defs = loadstring(tblstr);
defs();

It’s actually easier than this. I threw in the loadstring just to ensure the output was valid. Yah, ok basics.
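As a rough illustration of what the generated enum works out to, the same arithmetic can be done directly on the table in plain Lua. The ‘typebase’ table repeats the CURLOPTTYPE_* bases from the cdef above, the ‘options’ table is a three-entry excerpt, and ‘optionvalue’ is a name invented for this sketch rather than part of the binding.

```lua
-- base offsets, copied from the CURLOPTTYPE_* enum
local typebase = {
	LONG          = 0,
	OBJECTPOINT   = 10000,
	FUNCTIONPOINT = 20000,
	OFF_T         = 30000,
}

-- a small excerpt of the generated options table
local options = {
	COOKIE     = {'OBJECTPOINT', 22},
	HTTPHEADER = {'OBJECTPOINT', 23},
	CRLF       = {'LONG', 27},
}

-- compute the numeric CURLOPT_* value: type base plus option number
local function optionvalue(name)
	local opt = options[name]
	if not opt then
		return nil, "unknown option"
	end
	return typebase[opt[1]] + opt[2]
end

print("CURLOPT_COOKIE", optionvalue("COOKIE"))
print("CURLOPT_CRLF", optionvalue("CRLF"))
```

So CURLOPT_COOKIE comes out as 10000 + 22, and CURLOPT_CRLF as 0 + 27.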

libcurl is an extremely comprehensive library. As such, it can be a challenge to use. Fortunately, it has an ‘easy’ interface as well. Here, I chose to wrap the easy interface in an object-like wrapper to make it even easier. Here’s how you use it:

require("CRLEasyRequest")(url):perform();

That will retrieve the entirety of any given url, and dump the output to stdout. Well, that’s somewhat useful, and it’s only one line of code. This simple interface will become more sophisticated over time, including being the basis for REST calls, but for now, it suffices.

So, there you have it. libcurl seems to be a viable choice for web access from within lua. You could of course just do it all with pure lua code, but you probably won’t be wrong to leverage libcurl.


Spelunking Linux – Yes, the system truly is a database

In this article: Isn’t the whole system just a database? – libdrm, I explored a little bit of the database nature of Linux by using libudev to enumerate and open libdrm devices.  After that, I spent some time bringing up a USB module: LJIT2libusb.  libusb is a useful cross platform library that makes it relatively easy to gain access to the usb functions on multiple platforms.  It can enumerate devices, deal with hot plug notifications, open up, read, write, etc.

At its core, on Linux at least, libusb tries to leverage the udev capabilities of the target system, if those capabilities are there.  This means that device enumeration and hot plugging actually use the libudev stuff.  In fact, the code for enumerating those usb devices in libusb looks like this:

 

	udev_enumerate_add_match_subsystem(enumerator, "usb");
	udev_enumerate_add_match_property(enumerator, "DEVTYPE", "usb_device");
	udev_enumerate_scan_devices(enumerator);
	devices = udev_enumerate_get_list_entry(enumerator);

There’s more stuff of course, to turn that into data structures which are appropriate for use within the libusb view of the world. But, here’s the equivalent using LLUI and the previously developed UVDev stuff:

local function isUsbDevice(dev)
	if dev.IsInitialized and dev:getProperty("subsystem") == "usb" and
		dev:getProperty("devtype") == "usb_device" then
		return true;
	end

	return false;
end

each(print, filter(isUsbDevice, ctxt:devices()))

It’s just illustrative, but it’s fairly simple to understand I think. The ‘ctxt:devices()’ is an iterator over all the devices in the system. The ‘filter’ function is part of the luafun functional programming routines available to Lua. The ‘isUsbDevice’ is a predicate function, which returns ‘true’ when the device in question matches what it believes makes a device a ‘usb’ device. In this case, it’s the subsystem and devtype properties which are used.
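To make the shape of that one-liner concrete without pulling in luafun or udev, here is a small stand-alone sketch. The ‘fakeDevice’ constructor and the hand-written ‘filter’ are stand-ins invented for this example; they only mimic the ‘IsInitialized’ field and ‘getProperty()’ method used above, and the ‘name’ property is made up.

```lua
-- hypothetical stand-in for a udev device object
local function fakeDevice(props)
	return {
		IsInitialized = true,
		getProperty = function(self, name) return props[name] end,
	}
end

local devices = {
	fakeDevice({subsystem = "usb", devtype = "usb_device", name = "hub"}),
	fakeDevice({subsystem = "usb", devtype = "usb_interface", name = "hub:1.0"}),
	fakeDevice({subsystem = "block", devtype = "disk", name = "sda"}),
}

local function isUsbDevice(dev)
	return dev.IsInitialized and
		dev:getProperty("subsystem") == "usb" and
		dev:getProperty("devtype") == "usb_device"
end

-- returns an iterator over the items in list that satisfy pred
local function filter(pred, list)
	local function nextmatch(_, idx)
		for i = idx + 1, #list do
			if pred(list[i]) then
				return i, list[i]
			end
		end
		return nil
	end
	return nextmatch, list, 0
end

for _, dev in filter(isUsbDevice, devices) do
	print(dev:getProperty("name"))
end
```

Only the first fake device passes both tests, so only its name is printed.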

Being able to easily query devices like this makes life a heck of a lot easier. No funky code polluting my pure application. Just these simple query predicates written in Lua, and I’m all set. So, instead of relying on libusb to enumerate my usb devices, I can just enumerate them directly using udev, which is what the library does anyway. Enumeration and hotplug handling is part of the library. The other part is the actual sending and receiving of data. For that, the libusb library is still primarily important, as replacing that code will take some time.

Where else can this great query capability be applied? Well, libudev is just a nice wrapper atop sysfs, which is that virtual file system built into Linux for gaining access to device information and control of the same. There’s all sorts of stuff in there. So, let’s say you want to list all the block devices?

local function isBlockDevice(dev)
	if dev.IsInitialized and dev:getProperty("subsystem") == "block" then
		return true;
	end

	return false;
end

That will get all the devices which are in the subsystem “block”. That includes physical disks, virtual disks, partitions, and the like. If you’re after just the physical ones, then you might use something like this:

local function isPhysicalBlockDevice(dev)
	if dev.IsInitialized and dev:getProperty("subsystem") == "block" and
		dev:getProperty("devtype") == "disk" and
		dev:getProperty("ID_BUS") ~= nil then
		return true;
	end

	return false;
end

Here, a physical device is indicated by subsystem == ‘block’ and devtype == ‘disk’ and the ‘ID_BUS’ property exists, assuming any physical disk would show up on one of the system’s buses. This won’t catch an SD card though. For that, you’d use the first one, and then look for a property related to being an SD card. Same goes for ‘cd’ vs ramdisk, or whatever. You can make these queries as complex or simple as you want.
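Since these queries are plain functions, they also compose. A hedged sketch of that idea: the helpers ‘hasProperty’ and ‘allof’ below are hypothetical names, not part of libudev or the binding, and the devices are hand-made tables rather than real udev devices.

```lua
-- build a predicate that checks one property; with no expected value
-- it only checks that the property exists
local function hasProperty(name, expected)
	return function(dev)
		local value = dev:getProperty(name)
		if expected == nil then
			return value ~= nil
		end
		return value == expected
	end
end

-- build a predicate that is true only when every given predicate is true
local function allof(...)
	local preds = {...}
	return function(dev)
		for _, pred in ipairs(preds) do
			if not pred(dev) then
				return false
			end
		end
		return true
	end
end

local isPhysicalBlockDevice = allof(
	hasProperty("subsystem", "block"),
	hasProperty("devtype", "disk"),
	hasProperty("ID_BUS"))

-- hypothetical stand-in for a udev device object
local function fakeDevice(props)
	return { getProperty = function(self, name) return props[name] end }
end

local sda = fakeDevice({subsystem = "block", devtype = "disk", ID_BUS = "ata"})
local loop0 = fakeDevice({subsystem = "block", devtype = "disk"})

print(isPhysicalBlockDevice(sda), isPhysicalBlockDevice(loop0))
```

The second device has no ‘ID_BUS’ property, so it is rejected, just as in the hand-written predicate.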

Once you have a device, you can simply open it using the “SysName” parameter, handed to an fopen() call.

I find this to be a great way to program. It makes the creation of utilities such as ‘lsblk’ relatively easy. You would just look for all the block devices and their partitions, and put them into a table. Then separately, you would have a display routine, which would consume the table and generate whatever output you want. I find this much better than the typical Linux tools which try to do advanced display using the terminal window. That’s great as far as it goes, but not so great if what you really want is a nice html page generated for some remote viewing.

At any rate, this whole libudev exploration is a great thing. You can list all devices easily, getting every bit of information you care to examine. Since it’s all scriptable, it’s fairly easy to tailor your queries on the fly, looking at, discovering, and the like. I discovered that the thumb print reader in my old laptop was made by Broadcom, and my webcam by 3M? It’s just so much fun.

Well there you have it. The more you spelunk, the more you know, and the more you can fiddle about.


Note To Self – Enumerating bit flags

I’ve been trawling through the Linux V4L2 group of libraries of late as part of LLUI.  v4l2 is one of those sprawling libraries that does all things for all people in terms of video on Linux machines.  It’s roughly equivalent to oh so many similar things from the past on the Windows side.  This is one of the libraries you might utilize if you were to get into streaming from your webcam programmatically.  Of course, you could just read from it directly with libusb, but then you lose out on all the nifty format conversions, and I miss this chance to write another pointless reminder for my later coding self.

So, what’s got me so bothered this time around?  Well, let’s say I’m just perusing my system, turning everything into a database as I go along.  I’d like to get a hold of my webcam, and see what it’s capable of.  There’s a call for that of course.  Once you make the appropriate ioctl call, you end up with a struct that looks like this:

 
struct v4l2_capability {
	uint8_t driver[16];
	uint8_t card[32];
	uint8_t bus_info[32];
	uint32_t version;
	uint32_t capabilities;
	uint32_t device_caps;
	uint32_t reserved[3];
};

The driver, card, and bus_info fields are pretty straight forward as they are simple ‘null terminated’ strings, so you can print them out if you like. It’s that ‘capabilities’ field that gives me fits. This is one of those combined bit flags sort of things. The value can be a combination of any of the numerous ‘capability’ flags, which are these:

-- Values for 'capabilities' field
caps = {
	V4L2_CAP_VIDEO_CAPTURE		= 0x00000001 ; -- Is a video capture device */
	V4L2_CAP_VIDEO_OUTPUT		= 0x00000002; -- Is a video output device */
	V4L2_CAP_VIDEO_OVERLAY		= 0x00000004; -- Can do video overlay */
	V4L2_CAP_VBI_CAPTURE		= 0x00000010; -- Is a raw VBI capture device */
	V4L2_CAP_VBI_OUTPUT			= 0x00000020; -- Is a raw VBI output device */
	V4L2_CAP_SLICED_VBI_CAPTURE	= 0x00000040; -- Is a sliced VBI capture device */
	V4L2_CAP_SLICED_VBI_OUTPUT	= 0x00000080; -- Is a sliced VBI output device */
	V4L2_CAP_RDS_CAPTURE		= 0x00000100; -- RDS data capture */
	V4L2_CAP_VIDEO_OUTPUT_OVERLAY	= 0x00000200; -- Can do video output overlay */
	V4L2_CAP_HW_FREQ_SEEK		= 0x00000400; -- Can do hardware frequency seek  */
	V4L2_CAP_RDS_OUTPUT			= 0x00000800; -- Is an RDS encoder */

	V4L2_CAP_VIDEO_CAPTURE_MPLANE	= 0x00001000;
	V4L2_CAP_VIDEO_OUTPUT_MPLANE	= 0x00002000;
	V4L2_CAP_VIDEO_M2M_MPLANE		= 0x00004000;
	V4L2_CAP_VIDEO_M2M				= 0x00008000;

	V4L2_CAP_TUNER			= 0x00010000; -- has a tuner */
	V4L2_CAP_AUDIO			= 0x00020000; -- has audio support */
	V4L2_CAP_RADIO			= 0x00040000; -- is a radio device */
	V4L2_CAP_MODULATOR		= 0x00080000; -- has a modulator */

	V4L2_CAP_READWRITE              = 0x01000000; -- read/write systemcalls */
	V4L2_CAP_ASYNCIO                = 0x02000000; -- async I/O */
	V4L2_CAP_STREAMING              = 0x04000000; -- streaming I/O ioctls */
}

For the embedded webcam in my laptop, the reported value is: 0x04000001;

Of course, when you’re doing something programmatically, and you just want to check whether a particular flag is set or not, you can just do:

canStream = band(V4L2_CAP_STREAMING, 0x04000001) ~= 0

Very common, and probably some of the most common code you’ll see anywhere. But what else? For various reasons, I want to create the string values for those bit fields, and use those values as keys to tables, or just to print, or to send somewhere, or display, or what have you.

I’ve seen enough ‘C’ code deal with this to know there is a common pattern. First create the #define, or enum statement which encapsulates the values for all the flags. Then, to get the values as strings, create a completely separate string table, which does the mapping of the nice tight enum values and the string values. Then write a little lookup function which can go from the value to the string.

Well, here’s one of those things I love about Lua. In this case, the program IS the database. No need for those parallel representations. Here’s some code:

local pow = math.pow
local bit = require("bit")
local lshift, rshift, band, bor = bit.lshift, bit.rshift, bit.band, bit.bor

local function getValueName(value, tbl)
	for k,v in pairs(tbl) do
		if v == value then
			return k;
		end
	end

	return nil;
end

local function enumbits(bitsValue, tbl, bitsSize)
	local function name_gen(params, state)

		if state >= params.bitsSize then return nil; end

		while(true) do
			local mask = pow(2,state)
			local maskedValue = band(mask, params.bitsValue)
--print(string.format("(%2d) MASK [%x] - %#x", state, mask, maskedValue))			
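			if maskedValue ~= 0 then
				-- look up by mask rather than maskedValue: band() returns a signed
				-- 32-bit result, so bit 31 comes back negative and would never match
				return state + 1, getValueName(mask, params.tbl) or "UNKNOWN"
			end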

			state = state + 1;
			if state >= params.bitsSize then return nil; end
		end

		return nil;
	end

	return name_gen, {bitsValue = bitsValue, tbl = tbl, bitsSize = bitsSize or 32}, 0
end

return enumbits

The function “getValueName()” at the top there simply does a reverse lookup in the table. That is, given a value, return the string that represents that value (is the string key for that value).
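The linear scan is fine for small tables. If the same table is queried often, one alternative is to invert it once and index by value. This is a variation on the idea, not what the enumbits code does, and ‘invert’ is a name made up for the sketch. The two-entry ‘caps’ table is an excerpt of the capability flags listed earlier.

```lua
-- build a value -> name table from a name -> value table
local function invert(tbl)
	local names = {}
	for name, value in pairs(tbl) do
		names[value] = name
	end
	return names
end

local caps = {
	V4L2_CAP_VIDEO_CAPTURE = 0x00000001,
	V4L2_CAP_STREAMING     = 0x04000000,
}

local capnames = invert(caps)
print(capnames[0x04000000])
print(capnames[0x00000008] or "UNKNOWN")
```

One caveat: if two names share the same value, only one of them survives the inversion.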

Next, the “enumbits()” function is an enumerator. It will iterate over the bit flags, returning a string name for all the ones that are set to ‘1’, and nothing for any of the other bits. Here’s an example:

local bit = require("bit")
local lshift, rshift, band, bor = bit.lshift, bit.rshift, bit.band, bit.bor

local enumbits = require("enumbits")

local testtbl = {
	LOWEST 	= 0x0001;
	MEDIUM 	= 0x0002;
	HIGHEST = 0x0004;
	MIGHTY 	= 0x0008;
	SLUGGO 	= 0x0010;
	MUGGO 	= 0x0020;
	BUGGO 	= 0x0040;
	PUGGO 	= 0x0080;
}


local caps = {
	V4L2_CAP_VIDEO_CAPTURE		= 0x00000001 ; -- Is a video capture device */
	V4L2_CAP_VIDEO_OUTPUT		= 0x00000002; -- Is a video output device */
	V4L2_CAP_VIDEO_OVERLAY		= 0x00000004; -- Can do video overlay */
	V4L2_CAP_VBI_CAPTURE		= 0x00000010; -- Is a raw VBI capture device */
	V4L2_CAP_VBI_OUTPUT			= 0x00000020; -- Is a raw VBI output device */
	V4L2_CAP_SLICED_VBI_CAPTURE	= 0x00000040; -- Is a sliced VBI capture device */
	V4L2_CAP_SLICED_VBI_OUTPUT	= 0x00000080; -- Is a sliced VBI output device */
	V4L2_CAP_RDS_CAPTURE		= 0x00000100; -- RDS data capture */
	V4L2_CAP_VIDEO_OUTPUT_OVERLAY	= 0x00000200; -- Can do video output overlay */
	V4L2_CAP_HW_FREQ_SEEK		= 0x00000400; -- Can do hardware frequency seek  */
	V4L2_CAP_RDS_OUTPUT			= 0x00000800; -- Is an RDS encoder */

	V4L2_CAP_VIDEO_CAPTURE_MPLANE	= 0x00001000;
	V4L2_CAP_VIDEO_OUTPUT_MPLANE	= 0x00002000;
	V4L2_CAP_VIDEO_M2M_MPLANE		= 0x00004000;
	V4L2_CAP_VIDEO_M2M				= 0x00008000;

	V4L2_CAP_TUNER			= 0x00010000; -- has a tuner */
	V4L2_CAP_AUDIO			= 0x00020000; -- has audio support */
	V4L2_CAP_RADIO			= 0x00040000; -- is a radio device */
	V4L2_CAP_MODULATOR		= 0x00080000; -- has a modulator */

	V4L2_CAP_READWRITE              = 0x01000000; -- read/write systemcalls */
	V4L2_CAP_ASYNCIO                = 0x02000000; -- async I/O */
	V4L2_CAP_STREAMING              = 0x04000000; -- streaming I/O ioctls */
}

local function printBits(bitsValue, tbl)
	tbl = tbl or testtbl
	for _, name in enumbits(bitsValue, tbl) do
		io.write(string.format("%s, ",name))
	end
	print()
end

-- single bits
printBits(lshift(1,0))
printBits(lshift(1,1))
printBits(lshift(1,31))

-- combined bits
printBits(0x0045)
printBits(0x04000001, caps)

With that last test case, what you’ll get is the output:

V4L2_CAP_VIDEO_CAPTURE, V4L2_CAP_STREAMING

Well that’s handy, particularly when you’re doing some debugging. Just a simple 20 line iterator, and you’re in business, printing flag fields like a boss! That is, if you’re in the lua environment, or any dynamic programming environment that supports iteration of a dictionary.

So, this note to future self is about pointing out the fact that even bitflags are nothing more than a very compact form of database. Unpacking them into human readable, programmable form, requires just the right routine, and away you go, you never have to bother with dealing with this little item again. Great for debugging, great for sticking keys in tables, great for displaying on controls!
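For the ‘sticking keys in tables’ case, a set of names is sometimes handier than an iterator. The following sketch goes the other way round from enumbits: it walks the table and tests each flag against the value, using plain arithmetic so it does not depend on the bit library. It assumes every table entry is a single-bit flag, and ‘flagset’ is a name invented here.

```lua
-- return a set (name -> true) of the flags from tbl that are set in value
local function flagset(value, tbl)
	local set = {}
	for name, flag in pairs(tbl) do
		-- for a single-bit flag, this is 1 exactly when that bit is set
		if math.floor(value / flag) % 2 == 1 then
			set[name] = true
		end
	end
	return set
end

-- an excerpt of the capability flags
local caps = {
	V4L2_CAP_VIDEO_CAPTURE = 0x00000001,
	V4L2_CAP_VIDEO_OUTPUT  = 0x00000002,
	V4L2_CAP_STREAMING     = 0x04000000,
}

local set = flagset(0x04000001, caps)
print(set.V4L2_CAP_STREAMING, set.V4L2_CAP_VIDEO_OUTPUT)
```

With the webcam value from above, the capture and streaming keys are present in the set and the output key is absent.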


LJIT2libevdev – input device tracking on Linux

Once you start going down into the rabbit hole that is UI on Linux, there seems to be no end. I was wanting to get to the bottom of the stack as it were, because I just want to get raw keyboard and mouse events, and do stuff with them. There is a library that helps you do that called libevdev. Here is the luajit binding to it:

LJIT2libevdev

As it turns out, getting keyboard and mouse activity is highly dependent on what environment you’re in of course. Are you sitting at a Terminal? In which case ncurses or similar might be your best choice. If you’re looking at a graphics display, then something related to X, or the desktop manager might be appropriate. At the very bottom of it all though is the kernel, and its ability to read the keyboard and mouse, and report what it finds up the chain to interested parties. Down there at the very bottom is a userspace library libevdev, which takes care of making the ioctl calls into the kernel to get the raw values. Great! Only caveat is that you need to be set up with the proper permissions to do it because you’re getting ALL of the keyboard and mouse events on the system. Great for key loggers…

Alright, so what does this mean in the context of Lua? Well, libevdev is a straight up C interface which is a very thin veneer atop the ioctl calls. It would not be that hard to actually replace the ioctl calls with ioctl calls from luajit directly, but the maintainers of libevdev seem to have it covered quite nicely, so ffi calls to the library are sufficient. The library provides some conveniences like tables of strings to convert from the integer values of things to their string name equivalents. These could probably be replaced with the same within lua land, and save the round trips and string conversions. As a low level interface, it does not provide management of the various input devices. You can not ask it “give me the logitech mouse”. You have to know which device is the mouse in the first place before you can start asking for input. Similarly, it’s giving you a ton of raw data that you may not be interested in. Things like the sync signals, indicating the end of an event train. Or the skipped data events, so you can catch up if you prefer not to lose any. How to manage it all?

Let’s start at the beginning.

I have found it challenging to find appropriate discussions relating to UI on Linux. Linux has such a long history, and for most of it, the UI subsystems have been evolving and changing in fundamental ways. So, as soon as you find that juicy article dated from 2002, it’s been replaced by something in 2006, and then again in 2012. It also depends on whether you’re talking about X11, Wayland, Qt, Gnome, SDL, terminal, or some other context.

Recently, I was trying to track down the following scenario: I want to start reading input from the attached logitech mouse on my laptop. Not the track pad under my thumbs, and not the little red nubby stick in the middle of the keyboard, but that mouse specifically. How do I do that?

libevdev is the right library to use, but in order to use it, you need a file descriptor for the specific device. The interwebs tell me you simply open up the appropriate /dev/input/eventxxx file and start reading from it? Right. And how do I know which is the correct ‘eventxxx’ file I should be reading from? You can simply do:

$ cat /proc/bus/input/devices

Look at the output, find the device you’re interested in, look at which event it indicates it’s attached to, then go open up that event…

And how do I do that programmatically, and consistently such that it will be the same when I move the mouse to a different system? Ah yes, there’s a library for that, and why don’t you just use Python, and…

Or, how about this:

local EVContext = require("EVContext")

local function isLogitech(dev)
    return dev:name():lower():find("logitech") ~= nil
end

local dev = EVContext:getMouse(isLogitech);

assert(dev, "no mouse found")

print(string.format("Input device name: \"%s\"", dev:name()));
print(string.format("Input device ID: bus %#x vendor %#x product %#x\n",
        dev:busType(),
        dev:vendorId(),
        dev:productId()));

-- print out a constant stream of events
for _, ev in dev:events() do
    print(string.format("Event: %s %s %d",
        ev:typeName(),
        ev:codeName(),
        ev:value()));
end

How can I get to this state? First, how about that EVContext thing, and the ‘getMouse()’ call?

EVContext is a convenience class which wraps up all the things in libevdev which aren’t related to a specific instance of a device. So, doing things like iterating over devices, setting the logging level, getting a specific device, etc. Device iteration is a core piece of the puzzle. So, here it is.

function EVContext.devices(self)
    local function dev_iter(param, idx)
        local devname = "/dev/input/event"..tostring(idx);
        local dev, err = EVDevice(devname)

        if not dev then
            return nil; 
        end

        return idx+1, dev
    end

    return dev_iter, self, 0
end

That’s a quick and dirty iterator that will get the job done. Basically, just construct a string of the form ‘/dev/input/eventxxx’, and vary the ‘xxx’ with numbers until you can no longer open up devices. For each one, create an EVDevice object from that name. A bit wasteful, but highly beneficial. Once we can iterate all the input devices, we can leverage this for greater mischief.
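The probing idea on its own can be sketched in plain Lua, using io.open instead of EVDevice. This is only an illustration under a few assumptions: ‘eventpaths’ is a made-up name, and the optional ‘opener’ argument is an invention so the loop can be exercised without real devices or permissions.

```lua
-- iterate over "/dev/input/event0", "/dev/input/event1", ...
-- stopping at the first path that cannot be opened
local function eventpaths(opener)
	opener = opener or io.open
	local function nextpath(_, idx)
		local path = "/dev/input/event" .. idx
		local f = opener(path, "rb")
		if not f then
			return nil
		end
		f:close()
		return idx + 1, path
	end
	return nextpath, nil, 0
end

for _, path in eventpaths() do
	print(path)
end
```

Like the original, this stops at the first failure, so a permission problem or a gap in the numbering ends the iteration early.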

Looking back at our code, there was this bit to get the mouse:

local function isLogitech(dev)
    return dev:name():lower():find("logitech") ~= nil
end

local dev = EVContext:getMouse(isLogitech);

It looks like we could just call the ‘EVContext:getMouse()’ function and be done with it. What’s with the extra ‘isLogitech()’ part? Well, on its own, getMouse() will simply return the first device which reports capabilities like those of a mouse. That code looks like this:

function EVDevice.isLikeMouse(self)
    if (self:hasEventType(EV_REL) and
        self:hasEventCode(EV_REL, REL_X) and
        self:hasEventCode(EV_REL, REL_Y) and
        self:hasEventCode(EV_KEY, BTN_LEFT)) then

        return true;
    end

    return false;
end

It’s basically saying, a ‘mouse’ is something that has relative movements, at least an x and y axis, and a ‘left’ button. On my laptop, the little mouse nub on the keyboard qualifies, and since it has a lower /dev/input/event number (3), it will be reported before any other mouse on my laptop. So, I need a way to filter on anything that reports to be a mouse, as well as having “logitech” in its name. The code for that is the following from EVContext:
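
The same capability test can be tried out without any hardware attached. What follows is only a sketch: ‘mockDevice’ is a hypothetical stand-in that answers the same two questions as EVDevice, and ‘isLikeKeyboard’ is my own guess at what a matching keyboard check might look like, not something taken from the binding. The numeric constants are the values from linux/input-event-codes.h.

```lua
-- Event types and codes, values as in linux/input-event-codes.h
local EV_KEY, EV_REL = 0x01, 0x02
local REL_X, REL_Y = 0x00, 0x01
local BTN_LEFT = 0x110
local KEY_ENTER, KEY_A, KEY_SPACE = 28, 30, 57

-- Hypothetical mock: caps maps an event type to a set of supported codes.
local function mockDevice(caps)
    local dev = {}

    function dev:hasEventType(evtype)
        return caps[evtype] ~= nil
    end

    function dev:hasEventCode(evtype, code)
        return caps[evtype] ~= nil and caps[evtype][code] == true
    end

    return dev
end

-- Same test as in the article: relative x/y axes plus a left button.
local function isLikeMouse(dev)
    return dev:hasEventType(EV_REL)
        and dev:hasEventCode(EV_REL, REL_X)
        and dev:hasEventCode(EV_REL, REL_Y)
        and dev:hasEventCode(EV_KEY, BTN_LEFT)
end

-- Assumed heuristic: anything reporting a few ordinary typing keys.
local function isLikeKeyboard(dev)
    return dev:hasEventType(EV_KEY)
        and dev:hasEventCode(EV_KEY, KEY_A)
        and dev:hasEventCode(EV_KEY, KEY_ENTER)
        and dev:hasEventCode(EV_KEY, KEY_SPACE)
end

local mouse = mockDevice({
    [EV_REL] = { [REL_X] = true, [REL_Y] = true },
    [EV_KEY] = { [BTN_LEFT] = true },
})

local keyboard = mockDevice({
    [EV_KEY] = { [KEY_A] = true, [KEY_ENTER] = true, [KEY_SPACE] = true },
})

print(isLikeMouse(mouse), isLikeKeyboard(mouse))
print(isLikeMouse(keyboard), isLikeKeyboard(keyboard))
```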

function EVContext.getDevice(self, predicate)
	for _, dev in self:devices() do
		if predicate then
			if predicate(dev) then
				return dev
			end
		else
			return dev
		end
	end

	return nil;
end

function EVContext.getMouse(self, predicate)
	local function isMouse(dev)
		if dev:isLikeMouse() then
			if predicate then
				return predicate(dev);
			end
			
			return true;
		end

		return false;
	end

	return self:getDevice(isMouse);
end

As you can see, ‘EVContext:getDevice()’ takes a predicate (a function that returns true or false). It will iterate through all the devices, applying the predicate to each device in turn. When it finds a device matching the predicate, it will return that device. Of course, you could easily change this to return ALL the devices that match the predicate, but that’s a different story.
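
For the curious, the "return ALL the devices" variation might look something like the sketch below. It is not part of EVContext as shown here; ‘getDevices’ is a hypothetical name, and the fixed list of made-up device names stands in for ‘self:devices()’ so the snippet can run on its own.

```lua
-- Hypothetical stand-in for EVContext:devices(): a fixed list of fake devices.
local fakeDevices = {
    { name = "Example Laptop Keyboard" },
    { name = "Example Pointing Stick" },
    { name = "Logitech USB Optical Mouse" },
    { name = "Logitech USB Keyboard" },
}

local function devices()
    return ipairs(fakeDevices)
end

-- Collect every device that satisfies the predicate instead of stopping
-- at the first one. With no predicate, every device is returned.
local function getDevices(predicate)
    local found = {}

    for _, dev in devices() do
        if not predicate or predicate(dev) then
            found[#found + 1] = dev
        end
    end

    return found
end

local function isLogitech(dev)
    -- plain (non-pattern) substring search
    return dev.name:lower():find("logitech", 1, true) ~= nil
end

local matches = getDevices(isLogitech)
for _, dev in ipairs(matches) do
    print(dev.name)
end
```

Returning a table rather than a single device means the caller always gets a list back, even an empty one, and never has to check for nil.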

The ‘predicate’ in this case is the internal ‘isMouse’ function within ‘getMouse()’, which in turn applies two filters. The first is calling the ‘isLikeMouse()’ function on the device. If that’s satisfied, then it will call the predicate that was passed in, which in this case would be our ‘isLogitech()’ function. If that is satisfied, then the device is returned.

In the end, here’s some output:

Input device name: "Logitech USB Optical Mouse"
Input device ID: bus 0x3 vendor 0x46d product 0xc018

Event: EV_REL REL_Y -1
Event: EV_SYN SYN_REPORT 0
Event: EV_REL REL_Y -1
Event: EV_SYN SYN_REPORT 0
Event: EV_REL REL_X -1
Event: EV_REL REL_Y -2
Event: EV_SYN SYN_REPORT 0
Event: EV_REL REL_Y -1
Event: EV_SYN SYN_REPORT 0
Event: EV_MSC MSC_SCAN 589827
Event: EV_KEY BTN_MIDDLE 1
Event: EV_SYN SYN_REPORT 0
Event: EV_MSC MSC_SCAN 589827
Event: EV_KEY BTN_MIDDLE 0
Event: EV_SYN SYN_REPORT 0
Event: EV_REL REL_Y -2
Event: EV_SYN SYN_REPORT 0
Event: EV_REL REL_X 1
Event: EV_SYN SYN_REPORT 0
Event: EV_REL REL_Y -1

Some relative movements, a middle button press/release, some more movement.

The libevdev library represents some pretty low level stuff, and for the moment it seems to be the ‘correct’ way to deal with system level input device event handling. The LJIT2libevdev binding provides both the fundamental access to the library as well as the higher level device access which is sorely needed in this environment. I’m sure over time it will be beneficial to pull some of the conveniences that libevdev provides directly into the binding, further shrinking the required size of the library. For now though, I am simply happy that I can get my keyboard and mouse events into my application without too much fuss.


LJIT2libc – LuaJIT IS “batteries included”

I would say one of the most common complaints about any platform, framework, or product is a paucity of available stuff to fiddle about with.  I’ve heard this criticism a few times leveled at the lua world in general.  Fatter frameworks are usually “batteries included”.  That’s great, when you get a whole ton of stuff that makes writing all sorts of apps relatively easy to do right off the bat.  The challenge of “batteries included” is that you get a whole ton of stuff, most of which you will never use, and some of which doesn’t have the best implementation.

Recently, I’ve been doing quite a lot of luajit bindings on the Linux platform:

If you were into anything from ASCII art graphics drivers to raw frame buffer management in a Linux system, these might start to look like batteries.  They’re not included with the luajit compiler, but they’re relatively easy to add.

But, there’s another one that I’ve been playing with recently which I think is even better:

Luajit is typically built and compiled against the libc and libm libraries.  As such, being able to access routines within those libraries comes for ‘free’ from within luajit, courtesy of the ffi capabilities.  This is very interesting because it means these routines, which are already on your system, are in fact just a stone’s throw away from being available within your app.

Let’s imagine you wanted to write some script like this, using the libc provided ‘puts()’ function:

puts("Hello, World!");

Well, the lua language has its own print routine, so this is a bit contrived, but let’s say you wanted to do it anyway. Well, to make this work, you need access to the function signature for the ‘puts’ routine and you need that to be accessible in the global namespace. So, it would actually look like this:

local ffi = require("ffi")

ffi.cdef("int puts(const char *);");
puts = ffi.C.puts;

puts("Hello, World!")

Great. A bit of work, and now I can program as recklessly as I could in C with all the batteries included in the libc and libm libraries. But, it gets tedious having to write that ffi stuff all over the place, so this is where the LJIT2libc thing comes in. It basically goes and implements a ton of the required ffi.cdef statements to make all those functions readily available to your script. An easy way to pull it into any script is to do the following:

local init = require("test_setup")()

puts("Hello, World!");

That first line ‘init = …’ in this case pulls in all of the definitions, and puts them into the global namespace, so that you can simply write your program without having to know anything about the ffi, or where functions come from or anything like that. Just start writing your code knowing that the batteries are already included.
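
I’m not showing the insides of ‘test_setup’ here, but the general trick is easy to sketch: declare the functions once, gather them in a table, and copy that table into the global namespace. Everything in this snippet is an illustration under that assumption; ‘exportTo’ and the ‘libc’ table are hypothetical names, and only two functions are declared. The ffi module only exists under LuaJIT, so the sketch quietly does nothing under plain Lua.

```lua
-- The ffi module is only available under LuaJIT.
local ok, ffi = pcall(require, "ffi")

-- Gather the libc functions we want to expose in one table.
local libc = {}
if ok then
    ffi.cdef[[
        int puts(const char *s);
        int abs(int n);
    ]]
    libc.puts = ffi.C.puts
    libc.abs  = ffi.C.abs
end

-- Hypothetical helper: copy every entry of a module table into a namespace.
local function exportTo(target, module)
    for name, value in pairs(module) do
        target[name] = value
    end
    return target
end

-- Make the functions callable without any prefix.
exportTo(_G, libc)

if ok then
    puts("Hello, World!")
    print(abs(-5))
end
```

Writing into ‘_G’ is what lets the rest of the script call ‘puts()’ as if it were built in. The price is the usual one for globals: a name from libc can silently shadow something you already had.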

Now, this example might seem too trivial for words, but just think about what’s in a typical libc library. It’s all the file system, sockets, system calls, math, random numbers, everything you’re likely to need to create higher level application stuff. It’s a lot of stuff that other systems end up creating from scratch, as part of their ‘batteries’.

Of course this is trivializing what’s happening in those batteries, because what you’re getting here is the raw C implementations of things, with all the headaches and dangers that are typically associated with writing code at that level. But, for those who want to have access to that level of code, and apply the safety net of lua where they see fit, this is quite a useful tool I think.
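
As a small illustration of what a lua "safety net" over a raw C call might look like, here is a sketch that wraps libc’s ‘getenv()’. The wrapper name ‘safe_getenv’ is hypothetical. It checks the argument type before crossing into C and turns a NULL result into nil. Under plain Lua, where there is no ffi, it falls back to ‘os.getenv’ so the snippet still runs.

```lua
-- The ffi module is only available under LuaJIT.
local ok, ffi = pcall(require, "ffi")

local raw_getenv
if ok then
    ffi.cdef[[
        char *getenv(const char *name);
    ]]

    raw_getenv = function(name)
        local p = ffi.C.getenv(name)
        if p == nil then        -- a NULL pointer compares equal to nil
            return nil
        end
        return ffi.string(p)    -- copy the C string into a Lua string
    end
else
    raw_getenv = os.getenv      -- fallback when ffi is not present
end

-- Hypothetical wrapper: validate in lua before handing anything to C.
local function safe_getenv(name)
    if type(name) ~= "string" then
        return nil, "name must be a string"
    end
    return raw_getenv(name)
end

print(safe_getenv("HOME"))
print(safe_getenv(42))
```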

In short, batteries ARE included with every luajit. Those batteries might be just anode, cathode, and electrolyte, without the shell, but there are the raw ingredients. Having these wrappers available just makes it that much easier to think about and deal with a lot of low level stuff where I might previously have had to resort to some sort of ‘package’ to achieve something as simple as traversing the file system.

So there you have it. luajit is a ‘batteries included’ system.