Note To Self – Enumerating bit flags

I’ve been trawling through the Linux V4L2 group of libraries of late as part of LLUI.  v4l2 is one of those sprawling libraries that does all things for all people in terms of video on Linux machines.  It’s roughly equivalent to oh so many similar things from the past on the Windows side.  This is one of the libraries you might utilize if you were to get into streaming from your webcam programmatically.  Of course, you could just read from it directly with libusb, but then you lose out on all the nifty format conversions, and I miss this chance to write another pointless reminder for my later coding self.

So, what’s got me so bothered this time around?  Well, let’s say I’m just perusing my system, turning everything into a database as I go along.  I’d like to get a hold of my webcam, and see what it’s capable of.  There’s a call for that of course.  Once you make the appropriate ioctl call (VIDIOC_QUERYCAP), you end up with a struct that looks like this:

 
[sourcecode]
struct v4l2_capability {
uint8_t driver[16];
uint8_t card[32];
uint8_t bus_info[32];
uint32_t version;
uint32_t capabilities;
uint32_t device_caps;
uint32_t reserved[3];
};
[/sourcecode]

The driver, card, and bus_info fields are pretty straightforward as they are simple ‘null terminated’ strings, so you can print them out if you like. It’s that ‘capabilities’ field that gives me fits. This is one of those combined bit flags sort of things. The value can be a combination of any of the numerous ‘capability’ flags, which are these:

-- Values for 'capabilities' field
caps = {
	V4L2_CAP_VIDEO_CAPTURE		= 0x00000001; -- Is a video capture device
	V4L2_CAP_VIDEO_OUTPUT		= 0x00000002; -- Is a video output device
	V4L2_CAP_VIDEO_OVERLAY		= 0x00000004; -- Can do video overlay
	V4L2_CAP_VBI_CAPTURE		= 0x00000010; -- Is a raw VBI capture device
	V4L2_CAP_VBI_OUTPUT			= 0x00000020; -- Is a raw VBI output device
	V4L2_CAP_SLICED_VBI_CAPTURE	= 0x00000040; -- Is a sliced VBI capture device
	V4L2_CAP_SLICED_VBI_OUTPUT	= 0x00000080; -- Is a sliced VBI output device
	V4L2_CAP_RDS_CAPTURE		= 0x00000100; -- RDS data capture
	V4L2_CAP_VIDEO_OUTPUT_OVERLAY	= 0x00000200; -- Can do video output overlay
	V4L2_CAP_HW_FREQ_SEEK		= 0x00000400; -- Can do hardware frequency seek
	V4L2_CAP_RDS_OUTPUT			= 0x00000800; -- Is an RDS encoder

	V4L2_CAP_VIDEO_CAPTURE_MPLANE	= 0x00001000;
	V4L2_CAP_VIDEO_OUTPUT_MPLANE	= 0x00002000;
	V4L2_CAP_VIDEO_M2M_MPLANE		= 0x00004000;
	V4L2_CAP_VIDEO_M2M				= 0x00008000;

	V4L2_CAP_TUNER			= 0x00010000; -- has a tuner
	V4L2_CAP_AUDIO			= 0x00020000; -- has audio support
	V4L2_CAP_RADIO			= 0x00040000; -- is a radio device
	V4L2_CAP_MODULATOR		= 0x00080000; -- has a modulator

	V4L2_CAP_READWRITE              = 0x01000000; -- read/write systemcalls
	V4L2_CAP_ASYNCIO                = 0x02000000; -- async I/O
	V4L2_CAP_STREAMING              = 0x04000000; -- streaming I/O ioctls
}

For the embedded webcam in my laptop, the reported value is: 0x04000001;

Of course, when you’re doing something programmatically, and you just want to check whether a particular flag is set or not, you can just do:

canStream = band(V4L2_CAP_STREAMING, 0x04000001) ~= 0

Very common, and probably some of the most common code you’ll see anywhere. But what else? For various reasons, I want to create the string values for those bit fields, and use those values as keys to tables, or just to print, or to send somewhere, or display, or what have you.

I’ve seen enough ‘C’ code deal with this to know there is a common pattern. First create the #define, or enum statement which encapsulates the values for all the flags. Then, to get the values as strings, create a completely separate string table, which does the mapping between the nice tight enum values and the string values. Then write a little lookup function which can go from the value to the string.

Well, here’s one of those things I love about Lua. In this case, the program IS the database. No need for those parallel representations. Here’s some code:

local pow = math.pow
local bit = require("bit")
local lshift, rshift, band, bor = bit.lshift, bit.rshift, bit.band, bit.bor

local function getValueName(value, tbl)
	for k,v in pairs(tbl) do
		if v == value then
			return k;
		end
	end

	return nil;
end

local function enumbits(bitsValue, tbl, bitsSize)
	local function name_gen(params, state)

		if state >= params.bitsSize then return nil; end

		while(true) do
			local mask = pow(2,state)
			local maskedValue = band(mask, params.bitsValue)
--print(string.format("(%2d) MASK [%x] - %#x", state, mask, maskedValue))			
			if maskedValue ~= 0 then
				return state + 1, getValueName(maskedValue, params.tbl) or "UNKNOWN"
			end

			state = state + 1;
			if state >= params.bitsSize then return nil; end
		end

		return nil;
	end

	return name_gen, {bitsValue = bitsValue, tbl = tbl, bitsSize = bitsSize or 32}, 0
end

return enumbits

The function “getValueName()” at the top there simply does a reverse lookup in the table. That is, given a value, return the string that represents that value (is the string key for that value).
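
A side note for future me: getValueName() scans the whole table on every call. If that ever matters, one possible alternative is to invert the table once and index into it afterwards. This is only a sketch, not part of the code above. The helper name ‘invert’ is made up, and it assumes every value in the table is unique.

```lua
-- Hypothetical helper: build a value -> name table once, so each
-- reverse lookup becomes a single table index instead of a scan.
-- Assumes no two names share the same value.
local function invert(tbl)
	local names = {}
	for name, value in pairs(tbl) do
		names[value] = name
	end
	return names
end

-- A two-entry excerpt of the capability flags, just for illustration
local flags = {
	V4L2_CAP_VIDEO_CAPTURE = 0x00000001;
	V4L2_CAP_STREAMING     = 0x04000000;
}

local names = invert(flags)
print(names[0x04000000])	-- V4L2_CAP_STREAMING
print(names[0x00000002])	-- nil, this value is not in the excerpt
```

The trade-off is a second table in memory, so the ‘program IS the database’ idea still holds, it just gets an index.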

Next, the “enumbits()” function is an enumerator. It will iterate over the bit flags, returning a string name for all the ones that are set to ‘1’, and nothing for any of the other bits. Here’s an example:

local bit = require("bit")
local lshift, rshift, band, bor = bit.lshift, bit.rshift, bit.band, bit.bor

local enumbits = require("enumbits")

local testtbl = {
	LOWEST 	= 0x0001;
	MEDIUM 	= 0x0002;
	HIGHEST = 0x0004;
	MIGHTY 	= 0x0008;
	SLUGGO 	= 0x0010;
	MUGGO 	= 0x0020;
	BUGGO 	= 0x0040;
	PUGGO 	= 0x0080;
}


local caps = {
	V4L2_CAP_VIDEO_CAPTURE		= 0x00000001; -- Is a video capture device
	V4L2_CAP_VIDEO_OUTPUT		= 0x00000002; -- Is a video output device
	V4L2_CAP_VIDEO_OVERLAY		= 0x00000004; -- Can do video overlay
	V4L2_CAP_VBI_CAPTURE		= 0x00000010; -- Is a raw VBI capture device
	V4L2_CAP_VBI_OUTPUT			= 0x00000020; -- Is a raw VBI output device
	V4L2_CAP_SLICED_VBI_CAPTURE	= 0x00000040; -- Is a sliced VBI capture device
	V4L2_CAP_SLICED_VBI_OUTPUT	= 0x00000080; -- Is a sliced VBI output device
	V4L2_CAP_RDS_CAPTURE		= 0x00000100; -- RDS data capture
	V4L2_CAP_VIDEO_OUTPUT_OVERLAY	= 0x00000200; -- Can do video output overlay
	V4L2_CAP_HW_FREQ_SEEK		= 0x00000400; -- Can do hardware frequency seek
	V4L2_CAP_RDS_OUTPUT			= 0x00000800; -- Is an RDS encoder

	V4L2_CAP_VIDEO_CAPTURE_MPLANE	= 0x00001000;
	V4L2_CAP_VIDEO_OUTPUT_MPLANE	= 0x00002000;
	V4L2_CAP_VIDEO_M2M_MPLANE		= 0x00004000;
	V4L2_CAP_VIDEO_M2M				= 0x00008000;

	V4L2_CAP_TUNER			= 0x00010000; -- has a tuner
	V4L2_CAP_AUDIO			= 0x00020000; -- has audio support
	V4L2_CAP_RADIO			= 0x00040000; -- is a radio device
	V4L2_CAP_MODULATOR		= 0x00080000; -- has a modulator

	V4L2_CAP_READWRITE              = 0x01000000; -- read/write systemcalls
	V4L2_CAP_ASYNCIO                = 0x02000000; -- async I/O
	V4L2_CAP_STREAMING              = 0x04000000; -- streaming I/O ioctls
}

local function printBits(bitsValue, tbl)
	tbl = tbl or testtbl
	for _, name in enumbits(bitsValue, tbl) do
		io.write(string.format("%s, ",name))
	end
	print()
end

-- single bits
printBits(lshift(1,0))
printBits(lshift(1,1))
printBits(lshift(1,31))

-- combined bits
printBits(0x0045)
printBits(0x04000001, caps)

With that last test case, what you’ll get is the output:

V4L2_CAP_VIDEO_CAPTURE, V4L2_CAP_STREAMING

Well that’s handy, particularly when you’re doing some debugging. Just a simple 20 line iterator, and you’re in business, printing flag fields like a boss! That is, if you’re in the Lua environment, or any dynamic programming environment that supports iteration of a dictionary.

So, this note to future self is about pointing out the fact that even bitflags are nothing more than a very compact form of database. Unpacking them into human readable, programmable form requires just the right routine, and away you go, you never have to bother with dealing with this little item again. Great for debugging, great for sticking keys in tables, great for displaying on controls!
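
For the ‘sticking keys in tables’ case, here is a rough sketch of unpacking a flags value straight into a set. It is not the enumbits() iterator from above: the name ‘flagSet’ is made up, it uses plain arithmetic instead of the ‘bit’ library (so it should also run on stock Lua), and it assumes every mask in the table is a single power-of-two bit. Unlike enumbits(), it walks the table rather than the bits, so bits with no name are silently ignored instead of being reported as UNKNOWN.

```lua
-- Hypothetical helper: turn a flags value into a set keyed by flag name.
local function flagSet(bitsValue, tbl)
	local set = {}
	for name, mask in pairs(tbl) do
		-- Dividing by a single-bit mask shifts that bit down to position 0;
		-- the bit is set when the floored quotient is odd.
		if math.floor(bitsValue / mask) % 2 == 1 then
			set[name] = true
		end
	end
	return set
end

-- A three-entry excerpt of the capability flags, just for illustration
local caps = {
	V4L2_CAP_VIDEO_CAPTURE = 0x00000001;
	V4L2_CAP_VIDEO_OUTPUT  = 0x00000002;
	V4L2_CAP_STREAMING     = 0x04000000;
}

local set = flagSet(0x04000001, caps)
if set.V4L2_CAP_STREAMING then
	print("streaming supported")
end
```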


Iterating Over Oddities – of strings, arrays, and counting from 0

Could there possibly be anything more said about iterators and strings? Well, yah, actually tons. Last time around, I showed a simple iterator over a “string”. The focus was primarily on satisfying the job of parsing out null terminated strings from within a ‘null terminated’ string.

As I was doing that, I was also speculating as to whether I could use the exact same iterator to parse fixed sized records from an array of records. I actually wrote a giant iterator that does just that. Then I got to thinking, ‘this isn’t the way to do it’. Iterators, you see, can be broken down into constituent parts. The Lua documentation itself has quite a lot to say about iterators of various forms. One of the most promising bits of documentation is section 7.3 – Stateless Iterators, in Programming in Lua.  In order to pull this off, you split up the “iterator” into a few parts.

 

generator – The function that gets called every time you need a new value.  The parameters that are passed to it are the “fixedpart”, and the “control”.

invariant state – this is the part of the iterator that doesn’t change much.  For example, the source string that you might be iterating over.

index – this is the part that changes every time generator is called.

So, when you do the following:

local values = {'a', 'b', 'c'}
for _idx, value in ipairs(values) do
  print(value)
end

The ‘generator’ is some function returned from the ‘ipairs()’ function. It will be called again and again, until it returns nil.

The ‘invariant state’ is the ‘values’ array. This will be fed to the generator each time a new value is needed.

And last, the _idx, is the index value. It will also be fed to the generator, along with the ‘invariant state’.
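
To make the three parts concrete, here is a hand-rolled stand-in for ipairs(). It is only a sketch of the idea, not how ipairs() is actually implemented, and the names ‘array_next’ and ‘my_ipairs’ are made up for the illustration.

```lua
-- generator: receives the invariant state and the previous index
local function array_next(values, idx)
	idx = idx + 1
	local value = values[idx]
	if value == nil then
		return nil	-- a nil first result ends the 'for' loop
	end
	return idx, value
end

-- hypothetical stand-in for ipairs():
-- returns the generator, the invariant state, and the starting index
local function my_ipairs(values)
	return array_next, values, 0
end

local seen = {}
for idx, value in my_ipairs({'a', 'b', 'c'}) do
	seen[idx] = value
	print(idx, value)
end
```

The ‘for’ loop itself does the bookkeeping: it keeps calling array_next(values, idx) with whatever index came back last time, until the first returned value is nil.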

So, how about applying this to my previous multi string iterator?

local ffi = require("ffi")
local fun = require("fun")()	-- calling the module exports each, map, etc. as globals

local floor = math.floor;


-- a nil generator.  
-- good for cases when there's no data
local function nil_gen(param, state)
    return nil
end

local function delim_gen(param, idx)
	local len = 0;

	while ((idx+len) < param.nelems) do
		--print("wchar: ", string.char(ffi.cast(param.basetypeptr, param.data)[idx + len]))
		if ffi.cast(param.basetypeptr, param.data)[idx + len] ~= param.separator then
			len = len + 1;
		else
			break
		end
	end
	
	if len == 0 then
		return nil;
	end

	return idx + len + 1, ffi.cast(param.basetypeptr, param.data)+idx, len
end


local function array_gen(param, idx)
	if idx >= param.nelems then
		return nil;
	end

	return idx+1, ffi.cast(param.basetypeptr, param.data)+idx, 1
end


local function striter(params)
	if not params then
		return nil_gen, params, nil
	end

	if not params.data then
		return nil_gen, params, nil
	end

	params.datalength = params.datalength or #params.data
	if params.basetype then
		if type(params.basetype)== "string" then
			params.basetype = ffi.typeof(params.basetype)
		end
	end
	params.basetype = params.basetype or ffi.typeof("char")
	params.basetypeptr = ffi.typeof("const $ *", params.basetype)
	params.basetypesize = ffi.sizeof(params.basetype)
	params.nelems = math.floor(params.datalength / params.basetypesize)

	if params.separator ~= nil then
		return delim_gen, params, 0
	else
		return array_gen, params, 0
	end

	return nil_gen, nil, nil
end

How to apply it?

local src3 = "big,boy,baby,bear,bounces,basketballs,behind,the,barn,,"

local function printAnsi(ptr, len)
  print(ffi.string(ptr, len))
end

each(printAnsi, striter{data=src3, basetype="char", separator=string.byte(",")})

Here, I am using the Lua Fun ‘each’ function to drive my iterator. I could just as easily use a simple ‘for – in’ loop, but I’m getting all functional these days. What the last statement says is, “for each of the items coming out of the striter iterator, call the ‘printAnsi()’ function”.

The striter function is called with a table that contains the various parameters it will need. In this particular case, I’ve left off the parentheses, because in Lua, if the only argument is a single table constructor or a string literal, you can do that.

So, how about that striter() function? Looking back, it has the job of returning a ‘generator’, ‘invariant state’, and an ‘index’. Well, it cheats a bit because the ‘invariant’ also contains the ‘index’. The ‘invariant’ is the fact that the table value doesn’t change, even though the contents can. This is just a matter of convenience.

At any rate, the striter() function decides which generator to return based on what it sees in the parameters. For example, if it sees a separator, then it will return the ‘delim_gen’ generator. That generator functions pretty much the same way as the one I created last time for the multi string thing. In the case where it doesn’t see a separator, it will return the ‘array_gen’ generator. That generator will assume it is being handed a pointer to an array of values of a particular type.

One thing to note that is different this time around from the mstrziter: Lua string creation does not occur within the iterator itself. Rather than return a string value, the iterator will simply return a pointer to the start of the item and a length. It is up to the caller to determine what they want to do with the values.

This is kind of a key to an IEnumerable chain. Do the least amount of work possible, deferring really heavy work towards the end of your chain. This lazy evaluation makes for a more efficient chain. So, the ‘printAnsi’ function is at the end of the chain. It might have turned out that instead of creating strings at all, I might have wanted to send the values across a network to be stored in a database. In that case, the pointer and length are perfect to be consumed directly by the Socket:send(buff, len) function, so no copying would be necessary.
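
The same deferral can be sketched in plain Lua, without the ffi. This is not the striter() code, just an illustration of the idea under my own assumptions: the names ‘span_gen’ and ‘spans’ are made up, and the generator hands back a start offset and a length into an ordinary Lua string instead of a pointer.

```lua
-- generator: param = {data=..., sep=...}, pos = where to resume scanning
local function span_gen(param, pos)
	if pos > #param.data then
		return nil
	end
	-- plain find (4th argument true) so the separator is not a pattern
	local sepAt = param.data:find(param.sep, pos, true) or (#param.data + 1)
	local len = sepAt - pos
	if len == 0 then
		return nil	-- empty field: stop, the way delim_gen does
	end
	-- next control value, then start offset and length (no substring made)
	return sepAt + 1, pos, len
end

local function spans(data, sep)
	return span_gen, {data = data, sep = sep}, 1
end

local src = "big,boy,baby,,"
local words = {}
for _, start, len in spans(src, ",") do
	-- only the consumer builds a string; the iterator never did
	words[#words + 1] = src:sub(start, start + len - 1)
end
print(table.concat(words, " "))	-- big boy baby
```

A consumer that only wants to count fields, or forward the ranges somewhere else, never pays for a single substring.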

How about that array case?

Let’s imagine I wanted to print out every value of the string one by one.

each(printAnsi, striter{data=src3, basetype="char"})

In this case, I’m creating an iterator, not specifying the separator (so the array_gen will be used). I have also specified the ‘basetype’ of the elements of my array. That’s so it can calculate how many there are, and create a pointer of the appropriate type. And you’re done!

Of course, the ‘basetype’ could just as easily be ‘BGR32’, or ‘PersonRecord’, or whatever fixed size type you so happen to have stored in some array. Makes for some fairly easy ‘tokenizing’ of array values.

To go further, let’s say you have a multi string based on ‘wchar_t’, and delimited by ‘ ’ (space) characters?

How about a little convenience function?

local function wmstriter(data, separator, datalength)
  if type(separator) == "string" then
    separator = string.byte(separator)
  end

  datalength = datalength or ffi.sizeof(data)
  return map(core_string.toAnsi, striter{data=data, datalength = datalength, basetype="wchar_t", separator=separator})
end

and using it:

local whello, whellolen = core_string.toUnicode("Hello World");

each(print, wmstriter(whello, ' '))

In this case, I create a ‘wchar_t’ based string, using the ‘toUnicode()’ function. Then I feed that to the wmstriter iterator, and print out each of the words, delimited by a ‘(sp)’.

The core ‘striter()’ remains the same, and the generators don’t change. You just build up more useful ways of feeding them and consuming them.

I find this to be fairly useful and powerful. When you think about the generators, they have a fairly simple task. Do one small job, and do it well. The complexity of a larger system is gained through thoughtful composition of these simpler parts, rather than building hugely complex macro parts.

This is helpful for code maintenance. I find it hard to maintain largish single functions with hundreds of lines of code. I find it fairly easy to maintain a generator that has a few lines of code, and does a fairly simple job. I find the composition model easy as well. I can look back on it over time, read the functional chain, and understand what was intended.

And there you have it. More iterating over oddities.

 


Device Iteration with Functional Programming

One of the great pleasures I have in life is learning something new. There’s nothing greater than those ‘light bulb goes on’ moments as you realize something and gain a much deeper understanding than you had before.

Well, a little while ago, there was an announcement of this thing called Lua Fun.  Lua Fun is a large set of functions which make functional programming in Lua really easy.  It has the usual suspects such as map, reduce, filter, each, etc.  If you read the documentation, you get a really good understanding of how iterators work in Lua, and more importantly, how LuaJIT is able to fold and manipulate things in hot loops such that the produced code is much tighter than anything I could possibly write in C, or any other language I so happen to use.

So, now I’m a fan of Lua Fun, and I would encourage anyone who’s both into Lua, and functional programming to take a look.

How to use it?  I’ve been enumerating various types of things in the Windows system of late.  Using the MMDevice subsystem, I was able to get an enumeration of the audio devices (that took a lot of COM work).  What about displays, and disk drives, and USB devices, and…  Yes, each one of those things has an attendant API which will facilitate monitoring said category.  But, is there one to rule them all?  Well yes, as it turns out, in most cases what these various APIs are doing is getting information out of the System Registry, and just presenting it reasonably.

There is a way to enumerate all the devices in the system.  You know, like when you bring up the Device Manager in Windows, and you see a tree of devices, and their various details.  The stage is set, how do you do that?  I created a simple object that does the grunt work of enumerating the devices in the system.

The DeviceRecordSet is essentially a query. Creating an instance of the object just gives you a handle onto making query requests. Here is the code:

local ffi = require("ffi")
local bit = require("bit")
local bor = bit.bor;
local band = bit.band;

local errorhandling = require("core_errorhandling_l1_1_1");
local SetupApi = require("SetupApi")
local WinNT = require("WinNT")


local DeviceRecordSet = {}
setmetatable(DeviceRecordSet, {
	__call = function(self, ...)
		return self:create(...)
	end,
})

local DeviceRecordSet_mt = {
	__index = DeviceRecordSet,
}


function DeviceRecordSet.init(self, rawhandle)
	print("init: ", rawhandle)

	local obj = {
		Handle = rawhandle,
	}
	setmetatable(obj, DeviceRecordSet_mt)

	return obj;
end

function DeviceRecordSet.create(self, Flags)
	Flags = Flags or bor(ffi.C.DIGCF_PRESENT, ffi.C.DIGCF_ALLCLASSES)

	local rawhandle = SetupApi.SetupDiGetClassDevs(
		nil, 
        nil, 
        nil, 
        Flags);

	if rawhandle == nil then
		return nil, errorhandling.GetLastError();
	end

	return self:init(rawhandle)
end

function DeviceRecordSet.getNativeHandle(self)
	return self.Handle;
end

function DeviceRecordSet.getRegistryValue(self, key, idx)
	idx = idx or 0;

	local did = ffi.new("SP_DEVINFO_DATA")
	did.cbSize = ffi.sizeof("SP_DEVINFO_DATA");

--print("HANDLE: ", self.Handle)
	local res = SetupApi.SetupDiEnumDeviceInfo(self.Handle,idx,did)

	if res == 0 then
		local err = errorhandling.GetLastError()
		--print("after SetupDiEnumDeviceInfo, ERROR: ", err)
		return nil, err;
	end

	local regDataType = ffi.new("DWORD[1]")
	local pbuffersize = ffi.new("DWORD[1]",260);
	local buffer = ffi.new("char[260]")

	local res = SetupApi.SetupDiGetDeviceRegistryProperty(
            self:getNativeHandle(),
            did,
			key,
			regDataType,
            buffer,
            pbuffersize[0],
            pbuffersize);

	if res == 0 then
		local err = errorhandling.GetLastError();
		--print("after GetDeviceRegistryProperty, ERROR: ", err)
		return nil, err;
	end

	--print("TYPE: ", regDataType[0])
	-- 1 is REG_SZ and 7 is REG_MULTI_SZ: both come back as character data
	if (regDataType[0] == 1) or (regDataType[0] == 7) then
		return ffi.string(buffer, pbuffersize[0]-1)
	elseif regDataType[0] == ffi.C.REG_DWORD_LITTLE_ENDIAN then
		return ffi.cast("DWORD *", buffer)[0]
	end

	return nil;
end


function DeviceRecordSet.devices(self, fields)
	fields = fields or {
		{ffi.C.SPDRP_DEVICEDESC, "description"},
		{ffi.C.SPDRP_MFG, "manufacturer"},
		{ffi.C.SPDRP_DEVTYPE, "devicetype"},
		{ffi.C.SPDRP_CLASS, "class"},
		{ffi.C.SPDRP_ENUMERATOR_NAME, "enumerator"},
		{ffi.C.SPDRP_FRIENDLYNAME, "friendlyname"},
		{ffi.C.SPDRP_LOCATION_INFORMATION , "locationinfo"},
		{ffi.C.SPDRP_LOCATION_PATHS, "locationpaths"},
		{ffi.C.SPDRP_PHYSICAL_DEVICE_OBJECT_NAME, "objectname"},
		{ffi.C.SPDRP_SERVICE, "service"},
	}

	local function closure(fields, idx)
		local res = {}

		local count = 0;
		for _it, field in ipairs(fields) do
			local value, err = self:getRegistryValue(field[1], idx)
			if value then
				count = count + 1;
				res[field[2]] = value;
			end
		end

		if count == 0 then
			return nil;
		end
				
		return idx+1, res;
	end

	return closure, fields, 0
end

return DeviceRecordSet

The ‘getRegistryValue()’ function is the real workhorse of this object. That’s what gets your values out of the system registry. The other function of importance is ‘devices()’. This is an iterator.

There are a couple of things of note about this iterator. First of all, it does not keep its iteration state in ‘up values’ (the only thing ‘closure()’ captures is ‘self’, which never changes). Everything that changes from one step to the next is carried in the return values from the function. The ‘state’, if you will, is handed in fresh every time ‘closure()’ is called. This is the key to creating an iterator that will work well with Lua Fun.
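
Here is a toy contrast between the two styles, using a plain counter rather than devices. The function names are made up for the illustration, and this is only a sketch of the distinction, not anything taken from Lua Fun.

```lua
-- Closure style: the position lives in an upvalue, hidden from the caller.
local function counter_closure(limit)
	local i = 0
	return function()
		i = i + 1
		if i <= limit then return i end
	end
end

-- Stateless style: the position is passed in and handed back on every call.
local function counter_gen(limit, i)
	i = i + 1
	if i <= limit then return i end
end

local function counter(limit)
	return counter_gen, limit, 0
end

local total = 0
for i in counter(3) do
	total = total + i
end
print(total)	-- 6

-- Because nothing is hidden, any single step can be replayed by hand:
print(counter_gen(3, 1))	-- 2

-- The closure version works in a 'for' loop too, but once it has run,
-- its position cannot be inspected or rewound from the outside.
local nextValue = counter_closure(3)
print(nextValue())	-- 1
```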

By default, this iterator will return quite a few (but not all) fields related to each object, and it will return all the objects. This is ok, because there are typically fewer than 150 objects in any given system.

Now, I want to do various queries against this set without much fuss. This is where Lua Fun, and functional programming in general, really shines.

First, a little setup:

--test_enumdevices.lua
local ffi = require("ffi")
local DeviceRecordSet = require("DeviceRecordSet")
local serpent = require("serpent")
local Functor = require("Functor")

local fun = require("fun")()
local drs = DeviceRecordSet();

local function printIt(record)
	print("==========")
	each(print, record)
	print("----------")
end

This creates an instance of the DeviceRecordSet object, which will be used in the queries. Already the printIt() function is utilizing Lua Fun. The ‘each()’ function will take whatever it’s handed, and perform the function specified. In this case, the ‘record’ will be a table. So, each will iterate over the table entries and print each one of them out. This is the equivalent of doing:

for k,v in pairs(record) do
	print(k, v)
end

I think that simply typing ‘each’ is a lot simpler and pretty easy to understand.

How about a query then?

-- show everything for every device
each(printIt, drs:devices())

In this case, the ‘each’ is applied to the results of the ‘devices()’ iterator. For each record coming from the devices iterator, the printIt function will be called, which will in turn print out all the values in the record. That’s pretty nice.

What if I don’t want to see all the fields in the record, and just want to see the objectname and description fields? Well, this is a ‘map’ operation, or a projection in database parlance, so:

-- do a projection on the fields
local function projection(x)
  return {objectname = x.objectname, description = x.description}
end
each(printIt, map(projection, drs:devices()))

Working from the inside out, for each record coming from the devices() iterator, call the ‘projection’ function. The return value from the projection function becomes the new record for this iteration. For each of those records, call the printIt function.

Using ‘map’ is great as you can reshape data in any way you like without much fuss.
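
To see roughly what a ‘map’ does to an iterator, here is a stripped-down version that runs without Lua Fun or Windows. To be clear, this is NOT how Lua Fun implements it. ‘my_map’, ‘records_gen’ and the ‘fakeDevices’ records are all invented for the illustration, and it only handles generators that return one value per step.

```lua
-- Hypothetical, simplified 'map' over a (generator, param, state) triple
-- whose generator yields (control, value).
local function my_map(fn, gen, param, state)
	local function map_gen(_, st)
		local nextState, value = gen(param, st)
		if nextState == nil then
			return nil
		end
		return nextState, fn(value)	-- same control value, reshaped record
	end
	return map_gen, nil, state
end

-- Stand-in for drs:devices(): a stateless iterator over made-up records
local fakeDevices = {
	{objectname = "\\Device\\0001", description = "Disk", service = "disk"},
	{objectname = "\\Device\\0002", description = "Mouse", service = "mouhid"},
}
local function records_gen(list, idx)
	local rec = list[idx + 1]
	if rec == nil then return nil end
	return idx + 1, rec
end

local function projection(x)
	return {objectname = x.objectname, description = x.description}
end

local out = {}
for _, rec in my_map(projection, records_gen, fakeDevices, 0) do
	out[#out + 1] = rec
	print(rec.objectname, rec.description, rec.service)	-- service is nil now
end
```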

Lastly, I want to see only the records that are related to “STORAGE”, so…

-- show only certain records
local function enumeratorFilter(x)
	return x.enumerator == "STORAGE"
end

each(printIt, filter(enumeratorFilter, drs:devices()))

Here, the ‘filter’ iterator is used. So, again, for each of the records coming from the ‘devices()’ enumerator, call the ‘enumeratorFilter’ function. If this function returns ‘true’ for the record, then it is passed along as the next record for the ‘each’. If ‘false’, then it is skipped, and the next record is tried.
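
The skip-and-try-the-next-record behavior can be sketched the same way. Again, this is a simplified illustration and not Lua Fun’s implementation. ‘my_filter’, ‘records_gen’ and the ‘fakeDevices’ records are invented, and the generator is assumed to return one value per step.

```lua
-- Hypothetical, simplified 'filter' over a (generator, param, state) triple
-- whose generator yields (control, value).
local function my_filter(pred, gen, param, state)
	local function filter_gen(_, st)
		while true do
			local nextState, value = gen(param, st)
			if nextState == nil then
				return nil	-- source exhausted
			end
			if pred(value) then
				return nextState, value
			end
			st = nextState	-- rejected: keep pulling from the source
		end
	end
	return filter_gen, nil, state
end

-- Stand-in for drs:devices(): a stateless iterator over made-up records
local fakeDevices = {
	{enumerator = "PCI", description = "Ethernet"},
	{enumerator = "STORAGE", description = "Volume"},
	{enumerator = "USB", description = "Hub"},
	{enumerator = "STORAGE", description = "Volume Snapshot"},
}
local function records_gen(list, idx)
	local rec = list[idx + 1]
	if rec == nil then return nil end
	return idx + 1, rec
end

local function enumeratorFilter(x)
	return x.enumerator == "STORAGE"
end

local kept = {}
for _, rec in my_filter(enumeratorFilter, records_gen, fakeDevices, 0) do
	kept[#kept + 1] = rec.description
end
print(table.concat(kept, ", "))	-- Volume, Volume Snapshot
```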

This is pretty powerful, and yet simple stuff. The fact that iterators create new iterators, in tight loops, makes for some very dense and efficient code. If you’re interested in why this is so special in LuaJIT, and not many other languages, read up on the Lua Fun documentation.

I’ve killed two birds with one stone. I have finally gotten to the root of all device iterators. I have also learned how to best write iterators that can be used in a functional programming way. Judicious usage of this mechanism will surely make a lot of my code more compact and readable, as well as highly performant.