Revisiting C++

I was a C++ expert twice in the past. The first time around was because I was doing some work for Taligent, and their whole operating system was written in C++. With that system I got knee deep into the finer details of templates, and exceptions, to a degree that will likely never be seen on the planet earth.

The second time around, was because I was programming on the BeOS. Not quite as crazy as the Taligent experience, but C/C++ were all the rage.

Then I drifted into Microsoft, and C# was born. For the past 15 years, it’s been a slow rise to dominance with C# in certain quarters of Microsoft. It just so happens that this corresponds to the rise of the virus attacks on Windows, as well as the shift in programming skills of college graduates. In the early days of spectacular virus attacks, you could attribute most of them to buffer overruns, which allowed code to run on the stack. This was fairly easily plugged by C#, and security coding standards.

Today, I am working on a project where once again I am learning C++. This time around it’s C++ 11, which is decidedly more mature than the C++ I learned while working on Taligent. It’s not so dramatically different as say the difference between Lisp and Cobol, but it gained a lot of stuff over the years.

I thought I would jot down some of the surface differences I have noticed since I’ve been away.

First, to compare C++ to Lua, there are some surface differences. Most of the languages I program in today have their roots in Algol, so they largely look the same. But, there are some simple dialect differences. C++ is full of curly braces ‘{}’, semi-colons ‘;’, and parenthesis ‘()’. Oh my god with the parens and semis!! With Lua, parens are optional, semis are optional, and instead of curlies, there are ‘do’, ‘end’, or simply ‘end’. For loops are different, array indices are different (unless you’re doing interop with the FFI), and do/while is repeat/until.

These are all minor differences, like say the differences between Portuguese and Spanish. You can still understand the other if you speak one. Perhaps not perfectly, but there is a relatively easy translation path.

Often times in language wars, these are the superficial differences that people talk about. Meh, not interesting enough to drive me one way or another.

But then, there’s this other stuff, which is truly the essence of the differences. Strong typing/duck typing, managed memory, dynamic code execution. I say ‘Lua’ here, but really that could be a standin for C#, node.js, Python, or Ruby. Basically, there are a set of modern languages which exhibit a similar set of features which are different enough from C/C++ that there is a difference in the programming models.

To illustrate, here’s a bit of C++ code that I have written recently. The setup is this, I receive a packet of data, typically the beginning of a HTTP conversation. From that packet of data, I must be able to ‘parse’ the thing, determine whether it is http/https, pull out headers, etc. I need to build a series of in-place parsers, which keep the amount of memory allocated to a minimum, and work fairly quickly. So, the first piece is this thing called a AShard_t:

#pragma once

#include "anl_conf.h"

class  DllExport AShard_t  {
public:
	uint8_t *	m_Data;
	size_t	m_Length;
	size_t	m_Offset;

	// Constructors
	AShard_t();
	AShard_t(const char *);
	AShard_t(uint8_t *data, size_t length, size_t offset);

	// Virtual Destructor
	virtual ~AShard_t() {};

	// type cast
	operator uint8_t *() {return getData();}

	// Operator Overloads
	AShard_t & operator= (const AShard_t & rhs);

	// Properties
	uint8_t *	getData() {return &m_Data[m_Offset];};
	size_t		getLength() {return m_Length;};

	// Member functions
	AShard_t &	clear();
	AShard_t &	first(AShard_t &front, AShard_t &rest, uint8_t delim) const;
	bool		indexOfChar(const uint8_t achar, size_t &idx) const;
	bool		indexOfShard(const AShard_t &target, size_t &idx);
	bool 		isEmpty() const;
	void		print() const;
	bool		rebase();
	char *		tostringz() const;
	AShard_t &	trimfrontspace();

};

OK, so it’s actually a fairly simple data structure. Assuming you have a buffer of data, a shard is just a pointer into that buffer. It contains the pointer, an offset, and a length. You might say that the pointer/offset combo is redundant, you probably don’t need both. The offset could be eliminated, assuming the pointer is always at the base of the structure. But, there might be a design choice that makes this useful later.

At any rate, there’s a lot going on here for such a simple class. First of all, there’s that ‘#pragma once’ at the top. Ah yes, good ol’ C preprocessor, needs to be told not to load stuff it’s already loaded before. There’s there’s class vs struct, not to be confused with ‘typedef struct’. Public/Protected/Private, copy constructor or ‘operator=’. And heaven forbid you forget to make a public default constructor. You will not be able to create an array of these things without it!

These are not mere dialectual differences, these are the differences between Spanish and Hungarian. You MUST know about the default constructor thing, or things just won’t work.

As far as implementation is concerned, I did a mix of things here, primarily because the class is so small. I’ve inserted some simple “string” processing right into the class, because I found them to be constantly useful. ‘first’, ‘indexOfChar’, and ‘indexOfShard’ turn out to be fairly handy when you’re trying to parse through something in place. ‘first’ is like in Lisp, get the first element off the list of elements. In this case you can specify a single character delimiter. ‘indexOfChar’, is like strchr() function from C, except in this case it’s aware of the length, and it doesn’t assume a ‘null’ terminated string. ‘indexOfShard’ is like ‘strstr’, or ‘strpbrk’. With these in hand, you can do a lot of ‘tokenizing’.

Here’s an example of parsing a URL:

bool parseUrl(const AShard_t &uriShard)
{
  AShard_t shard = uriShard;
  AShard_t rest;
	
  AShard_t scheme;
  AShard_t url;
  AShard_t authority;
  AShard_t hostname;
  AShard_t port;
  AShard_t resquery;
  AShard_t resource;
  AShard_t query;

  // http:
  shard.first(scheme, rest, ':');

  // the 'rest' represents the resource, which 
  // includes the authority + query
  // so try and separate authority from query if the 
  // query part exists
  shard = rest;
  // skip past the '//'
  shard.m_Offset += 2;
  shard.m_Length -= 2;

  // Now we have the url separated from the scheme
  url = shard;

  // separate the authority from the resource based on '/'
  url.first(authority, rest, '/');
  resquery = rest;

  // Break the authority into host and port
  authority.first(hostname, rest, ':');
  port = rest;

  // Back to the resource.  Split it into resource/query
  parseResourceQuery(resquery, resource, query);


  // Print the shards
  printf("URI: "); uriShard.print();
  printf("  Scheme: "); scheme.print();
  printf("  URL: "); url.print();
  printf("    Authority: "); authority.print();
  printf("      Hostname: "); hostname.print();
  printf("      Port: "); port.print();
  printf("    Resquery: "); resquery.print();
  printf("      Resource: "); resource.print();
  printf("      Query: "); query.print();
  printf("\n");

  return true;
}

AShard_t url0("http://www.sharmin.com:8080/resources/gifs/bunny.gif?user=willynilly&password=funnybunny");
parseUrl(url0);

Of course, I’m leaving out error checking, but even for this simple tokenization, it’s fairly robust because in most cases, if a ‘first’ fails, you’ll just gen an empty ‘rest’, but definitely not a crash.

So, how does this fair against my beloved LuaJIT? Well, at this level things are about the same. In Lua, I could create exactly the same structure, using a table, and perform exactly the same operations. Only, if I wanted to do it without using the ffi, I’d have to stuff the data into a Lua string object (which causes a copy), then use the lua string.char, count from 1, etc. totally doable, and probably fairly optimized. There is a bit of a waste though because in Lua, everything interesting is represented by a table, so that’s a much bigger data structure than this simple AShard_t. It’s bigger in terms of memory footprint, and it’s probably slower in execution because it’s a generalized data structure that can serve many wonderful purposes.

For memory management, at this level of structure, things are relatively easy. Since the shard does not copy the data, it doesn’t actually do any allocations, so there’s relatively little to cleanup. The most common use case for shards is that they’ll either be stack based, or they’ll be stuffed into a data structure. In either case, their lifetime is fairly short and well managed, so memory management isn’t a big issue. If they are dynamically allocated, then of course there’s something to be concerned with.

Well, that touches the ice berg. I’ve re-attached to C++, and so far the gag reflex hasn’t driven me insane, so I guess it’s ok to continue.

Next, I’ll explore how insanely great the world becomes when shards roam the earth.


Device Iteration with Functional Programming

One of the great pleasures I have in life is learning something new. There’s nothing greater than those ‘light bulb goes on’ moments as you realize something and gain a much deeper understanding than you had before.

Well, a little while ago, there was an announcement of this thing called Lua Fun.  Lua Fun is a large set of functions which make functional programming in Lua really easy.  It has the usual suspects such as map, reduce, filter, each, etc.  If you read the documentation, you get a really good understanding of how iterators work in Lua, and more importantly, how LuaJIT is able to fold and manipulate things in hot loops such that the produced code is much tighter than anything I could possibly write in C, or any other language I so happen to use.

So, now I’m a fan of Lua Fun, and I would encourage anyone who’s both into Lua, and functional programming to take a look.

How to use it?  I’ve been enumerating various types of things in the Windows system of late.  Using the MMDevice subsystem, I was able to get an enumeration of the audio devices (that took a lot of COM work).  What about displays, and disk drives, and USB devices, and…  Yes, each one of those things has an attendant API which will facilitate monitoring said category.  But, is there one to rule them all?  Well yes, as it turns out, in most cases what these various APIs are doing is getting information out of the System Registry, and just presenting it reasonably.

There is a way to enumerate all the devices in the system.  You know, like when you bring up the Device Manager in Windows, and you see a tree of devices, and their various details.  The stage is set, how do you do that?  I created a simple object that does the grunt work of enumerating the devices in the system.

The DeviceRecordSet is essentially a query. Creating an instance of the object just gives you a handle onto making query requests. Here is the code:

local ffi = require("ffi")
local bit = require("bit")
local bor = bit.bor;
local band = bit.band;

local errorhandling = require("core_errorhandling_l1_1_1");
local SetupApi = require("SetupApi")
local WinNT = require("WinNT")


local DeviceRecordSet = {}
setmetatable(DeviceRecordSet, {
	__call = function(self, ...)
		return self:create(...)
	end,
})

local DeviceRecordSet_mt = {
	__index = DeviceRecordSet,
}


function DeviceRecordSet.init(self, rawhandle)
	print("init: ", rawhandle)

	local obj = {
		Handle = rawhandle,
	}
	setmetatable(obj, DeviceRecordSet_mt)

	return obj;
end

function DeviceRecordSet.create(self, Flags)
	Flags = Flags or bor(ffi.C.DIGCF_PRESENT, ffi.C.DIGCF_ALLCLASSES)

	local rawhandle = SetupApi.SetupDiGetClassDevs(
		nil, 
        nil, 
        nil, 
        Flags);

	if rawhandle == nil then
		return nil, errorhandling.GetLastError();
	end

	return self:init(rawhandle)
end

function DeviceRecordSet.getNativeHandle(self)
	return self.Handle;
end

function DeviceRecordSet.getRegistryValue(self, key, idx)
	idx = idx or 0;

	did = ffi.new("SP_DEVINFO_DATA")
	did.cbSize = ffi.sizeof("SP_DEVINFO_DATA");

--print("HANDLE: ", self.Handle)
	local res = SetupApi.SetupDiEnumDeviceInfo(self.Handle,idx,did)

	if res == 0 then
		local err = errorhandling.GetLastError()
		--print("after SetupDiEnumDeviceInfo, ERROR: ", err)
		return nil, err;
	end

	local regDataType = ffi.new("DWORD[1]")
	local pbuffersize = ffi.new("DWORD[1]",260);
	local buffer = ffi.new("char[260]")

	local res = SetupApi.SetupDiGetDeviceRegistryProperty(
            self:getNativeHandle(),
            did,
			key,
			regDataType,
            buffer,
            pbuffersize[0],
            pbuffersize);

	if res == 0 then
		local err = errorhandling.GetLastError();
		--print("after GetDeviceRegistryProperty, ERROR: ", err)
		return nil, err;
	end

	--print("TYPE: ", regDataType[0])
	if (regDataType[0] == 1) or (regDataType[0] == 7) then
		return ffi.string(buffer, pbuffersize[0]-1)
	elseif regDataType[0] == ffi.C.REG_DWORD_LITTLE_ENDIAN then
		return ffi.cast("DWORD *", buffer)[0]
	end

	return nil;
end


function DeviceRecordSet.devices(self, fields)
	fields = fields or {
		{ffi.C.SPDRP_DEVICEDESC, "description"},
		{ffi.C.SPDRP_MFG, "manufacturer"},
		{ffi.C.SPDRP_DEVTYPE, "devicetype"},
		{ffi.C.SPDRP_CLASS, "class"},
		{ffi.C.SPDRP_ENUMERATOR_NAME, "enumerator"},
		{ffi.C.SPDRP_FRIENDLYNAME, "friendlyname"},
		{ffi.C.SPDRP_LOCATION_INFORMATION , "locationinfo"},
		{ffi.C.SPDRP_LOCATION_PATHS, "locationpaths"},
		{ffi.C.SPDRP_PHYSICAL_DEVICE_OBJECT_NAME, "objectname"},
		{ffi.C.SPDRP_SERVICE, "service"},
	}

	local function closure(fields, idx)
		local res = {}

		local count = 0;
		for _it, field in ipairs(fields) do
			local value, err = self:getRegistryValue(field[1], idx)
			if value then
				count = count + 1;
				res[field[2]] = value;
			end
		end

		if count == 0 then
			return nil;
		end
				
		return idx+1, res;
	end

	return closure, fields, 0
end

return DeviceRecordSet

The ‘getRegistryValue()’ function is the real workhorse of this object. That’s what gets your values out of the system registry. The other function of importance is ‘devices()’. This is an iterator.

There are a couple of things of note about this iterator. First of all, it does not require ‘up values’ to be held onto. All that means is that everything the iterator needs to operate is carried in the return values from the function. The ‘state’ if you will, is handed in fresh every time the ‘closure()’ is called. This is the key to creating an iterator that will work well with Lua Fun.

By default, this iterator will return quite a few (but not all) fields related to each object, and it will return all the objects. This is ok, because there are typically less than 150 objects in any given system.

Now, I want to do various queries against this set without much fuss. This is where Lua Fun, and functional programming in general, really shines.

First, a little setup:

--test_enumdevices.lua
local ffi = require("ffi")
local DeviceRecordSet = require("DeviceRecordSet")
local serpent = require("serpent")
local Functor = require("Functor")

local fun = require("fun")()
local drs = DeviceRecordSet();

local function printIt(record)
	print("==========")
	each(print, record)
	print("----------")
end

This creates an instance of the DeviceRecordSet object, which will be used in the queries. Already the printIt() function is utilizing Lua Fun. The ‘each()’ function will take whatever it’s handed, and perform the function specified. In this case, the ‘record’ will be a table. So, each will iterate over the table entries and print each one of them out. This is the equivalent of doing:

for k,v in pairs(record)
print(k, v)
end

I think that simply typing ‘each’ is a lot simpler and pretty easy to understand.

How about a query then?

-- show everything for every device
each(printIt, drs:devices())

In this case, the ‘each’ is applied to the results of the ‘devices()’ iterator. For each record coming from the devices iterator, the printIt function will be called, which will in turn print out all the values in the record. That’s pretty nice.

What if I don’t want to see all the fields in the record, I just want to see the objectname, and description fields. Well, this is a ‘map’ operation, or a projection in database parlance, so:

-- do a projection on the fields
local function projection(x)
  return {objectname = x.objectname, description = x.description}
end
each(printIt, map(projection, drs:devices()))

Working from the inside out, for each record coming from the devices() iterator, call the ‘projection’ function. The return value from the projection function becomes the new record for this iteration. For each of those records, call the printIt function.

Using ‘map’ is great as you can reshape data in any way you like without much fuss.

Lastly, I want to see only the records that are related to “STORAGE”, so…

-- show only certain records
local function enumeratorFilter(x)
	return x.enumerator == "STORAGE"
end

each(printIt, filter(enumeratorFilter, drs:devices()))

Here, the ‘filter’ iterator is used. So, again, for each of the records coming from the ‘devices()’ enumerator, call the ‘enumeratorFilter’ function. If this function returns ‘true’ for the record, then it is passed along as the next record for the ‘each’. If ‘false’, then it is skipped, and the next record is tried.

This is pretty powerful, and yet simple stuff. The fact that iterators create new iterators, in tight loops, makes for some very dense and efficient code. If you’re interested in why this is so special in LuaJIT, and not many other languages, read up on the Lua Fun documentation.

I’ve killed two birds with one stone. I have finally gotten to the root of all device iterators. I have also learned how to best write iterators that can be used in a functional programming way. Judicious usage of this mechanism will surely make a lot of my code more compact and readable, as well as highly performant.

 


Asynchronous DNS lookups on Windows

I began this particular journey because I wanted to do DNS lookups asynchronously on Windows. There is of course a function for that:

DnsQueryEx

The problem I ran into is that unlike the various other Windows functions I’ve done with async, this one does not use an IO Completion Port. Instead it uses a mechanism called APC (Asynchronouse Procedure Call). With this little bit of magic, you pass in a pointer to a function which will be called in your thread context, kind of in between when other things are happening. Well, given the runtime environment I’m using, I don’t think this quite works out. Basically, I’d have a function being called leaving the VM in an unknown state.

So, I got to digging. I figured, how hard can it be to make calls to a DNS server directly? After all, it is nothing more than a network based service with a well known protocol. Once I could make a straight networking call, then I could go back to leveraging IO Completion Ports just like I do for all other IO.

You can view the DNS system as nothing more than a database to which you pose queries. You express your queries using some nice well defined protocol, which is ancient in origin, and fairly effective considering how frequently DNS queries are issued. Although I could man up and write the queries from scratch, Windows helps me here by providing functions that will format the query into a buffer for me.

But, before I get into that, what do the queries look like? What am I looking up? Well, a Domain Name Server serves up translations of names to other names and numbers. For example, I need to find the IP address of http://www.bing.com. I can look for CNAMES (an alias), or ‘A’ records (direct to an IP address. This gets esoteric and confusing, so a little code can help:

-- Prepare the DNS request
local dwBuffSize = ffi.new("DWORD[1]", 2048);
local buff = ffi.new("uint8_t[2048]")

local wID = clock:GetCurrentTicks() % 65536;
        
local res = windns_ffi.DnsWriteQuestionToBuffer_UTF8( 
  ffi.cast("DNS_MESSAGE_BUFFER*",buff), 
  dwBuffSize, 
  ffi.cast("char *",strToQuery), 
  wType, 
  wID, 
  true )

DnsWriteQuestionToBuffer_UTF8 is the Windows function which helps me to write a DNS query into a buffer, which will then be send to the actual dns server.

wType, represents the type of record you want to be returned. The values might be something like:

wType = ffi.C.DNS_TYPE_A
wType = ffi.C.DNS_TYPE_MX  - mail records
wType = ffi.C.DNS_TYPE_CNAME

There are about a hundred different types that you can query for. The vast majority of the time though, you either looking for ‘A’, or ‘CNAME’ records.

The wID is just a unique identifier for the particular query so that if you’re issuing several on the same channel, you can check the response to ensure they match up.

OK. Now I have a DNS query stuffed into a buffer, how do I make the query and get the results?

-- Send the request.
local IPPORT_DNS = 53;
local remoteAddr = sockaddr_in(IPPORT_DNS, AF_INET);
remoteAddr.sin_addr.S_addr = ws2_32.inet_addr( "209.244.0.3");

-- create the UDP socket
local socket, err = NativeSocket( AF_INET, SOCK_DGRAM, IPPROTO_UDP );

-- send the query
local iRes, err = socket:sendTo(
  ServerAddress, ffi.sizeof(ServerAddress), 
  buff, dwBuffSize[0]);

This little bit of code shows the socket creation, and the actual ‘sendTo’ call. Of note, the “209.244.0.3” represents the IP address of a well known public DNS server. In this case it is hosted by Level 3, which is a internet services provider. There are of course calls you can make to figure out which DNS server your machine is typically configured to use, but this way the query will always work, no matter which network you are on.

Notice the socket is a UDP socket.

At this point, we’re already running cooperatively due to the fact that within TINN, all IO is done cooperatively, without the programmer needing to do much special.

Now to receive the query response back:

   -- Try to receive the results
    local RecvFromAddr = sockaddr_in();
    local RecvFromAddrSize = ffi.sizeof(RecvFromAddr);
    local cbReceived, err = self.Socket:receiveFrom(RecvFromAddr, RecvFromAddrSize, buff, 2048);

Basically just wait for the server to send back a response. Of course, like the sendTo, the receiveFrom works cooperatively, so that if the developer issues several ‘spawn’ commands, each query could be running in its own task, working cooperatively.

Once you have the response, you can parse out the results. The results come back as a set of records. There are of course functions which will help you to parse these records out. The key here is that the record type is indicated, and its up to the developer to pull out the relevant information.

The complete DNSNameServer class is here:

local ffi = require("ffi")

local Application = require("Application")
local windns_ffi = require("windns_ffi")
local NativeSocket = require("NativeSocket")
local ws2_32 = require("ws2_32")
local Stopwatch = require("StopWatch")

local clock = Stopwatch();

-- DNS UDP port
local IPPORT_DNS = 53;

local DNSNameServer = {}
setmetatable(DNSNameServer, {
    __call = function(self, ...)
        return self:create(...)
    end,
})
local DNSNameServer_mt = {
    __index = DNSNameServer,
}

function DNSNameServer.init(self, serveraddress)
    local socket, err = NativeSocket( AF_INET, SOCK_DGRAM, IPPROTO_UDP );

    if not socket then
        return nil, err
    end

    local obj = {
        Socket = socket,
        ServerAddress = serveraddress,
    }
    setmetatable(obj, DNSNameServer_mt)

    return obj;
end

function DNSNameServer.create(self, servername)
    local remoteAddr = sockaddr_in(IPPORT_DNS, AF_INET);
    remoteAddr.sin_addr.S_addr = ws2_32.inet_addr( servername );

    return self:init(remoteAddr)
end

-- Construct DNS_TYPE_A request, send it to the specified DNS server, wait for the reply.
function DNSNameServer.Query(self, strToQuery, wType, msTimeout)
    wType = wType or ffi.C.DNS_TYPE_A
    msTimeout = msTimeout or 60 * 1000  -- 1 minute


    -- Prepare the DNS request
    local dwBuffSize = ffi.new("DWORD[1]", 2048);
    local buff = ffi.new("uint8_t[2048]")

    local wID = clock:GetCurrentTicks() % 65536;
        
    local res = windns_ffi.DnsWriteQuestionToBuffer_UTF8( ffi.cast("DNS_MESSAGE_BUFFER*",buff), dwBuffSize, ffi.cast("char *",strToQuery), wType, wID, true )

    if res == 0 then
        return false, "DnsWriteQuestionToBuffer_UTF8 failed."
    end

    -- Send the request.
    local iRes, err = self.Socket:sendTo(self.ServerAddress, ffi.sizeof(self.ServerAddress), buff, dwBuffSize[0]);
    

    if (not iRes) then
        print("Error sending data: ", err)
        return false, err
    end

    -- Try to receive the results
    local RecvFromAddr = sockaddr_in();
    local RecvFromAddrSize = ffi.sizeof(RecvFromAddr);
    local cbReceived, err = self.Socket:receiveFrom(RecvFromAddr, RecvFromAddrSize, buff, 2048);

    if not cbReceived then
        print("Error Receiving Data: ", err)
        return false, err;
    end

    if( 0 == cbReceived ) then
        return false, "Nothing received"
    end

    -- Parse the DNS response received with DNS API
    local pDnsResponseBuff = ffi.cast("DNS_MESSAGE_BUFFER*", buff);
    windns_ffi.DNS_BYTE_FLIP_HEADER_COUNTS ( pDnsResponseBuff.MessageHead );

    if pDnsResponseBuff.MessageHead.Xid ~= wID then        
        return false, "wrong transaction ID"
    end

    local pRecord = ffi.new("DNS_RECORD *[1]",nil);

    iRes = windns_ffi.DnsExtractRecordsFromMessage_W( pDnsResponseBuff, cbReceived, pRecord );
    
    pRecord = pRecord[0];
    local pRecordA = ffi.cast("DNS_RECORD *", pRecord);
    
    local function closure()
        if pRecordA == nil then
            return nil;
        end

        if pRecordA.wType == wType then
            local retVal = pRecordA
            pRecordA = pRecordA.pNext

            return retVal;
        end

        -- Find the next record of the specified type
        repeat
            pRecordA = pRecordA.pNext;
        until pRecordA == nil or pRecordA.wType == wType
    
        if pRecordA ~= nil then
            local retVal = pRecordA
            pRecordA = pRecordA.pNext
            
            return retVal;
        end

        -- Free the resources
        if pRecord ~= nil then
            windns_ffi.DnsRecordListFree( pRecord, ffi.C.DnsFreeRecordList );
        end 

        return nil
    end

    return closure
end

function DNSNameServer.A(self, domainToQuery) return self:Query(domainToQuery, ffi.C.DNS_TYPE_A) end
function DNSNameServer.MX(self, domainToQuery) return self:Query(domainToQuery, ffi.C.DNS_TYPE_MX) end
function DNSNameServer.CNAME(self, domainToQuery) return self:Query(domainToQuery, ffi.C.DNS_TYPE_CNAME) end
function DNSNameServer.SRV(self, domainToQuery) return self:Query(domainToQuery, ffi.C.DNS_TYPE_SRV) end

return DNSNameServer

Notice at the end there are some convenience functions for a few of the well known DNS record types. The ‘Query()’ function is generic, and will return records of any type. These convenience functions just make it easier.

And how to use it?

local ffi = require("ffi")
local DNSNameServer = require("DNSNameServer")
local core_string = require("core_string_l1_1_0")


--local serveraddress = "10.211.55.1"		-- xfinity
local serveraddress = "209.244.0.3" -- level 3

local domains = {
	"www.nanotechstyles.com",
	"www.adafruit.com",
	"adafruit.com",
	"adamation.com",
	"www.adamation.com",
	"microsoft.com",
	"google.com",
	"ibm.com",
	"oracle.com",
	"sparkfun.com",
	"apple.com",
	"netflix.com",
	"www.netflix.com",
	"www.us-west-2.netflix.com",
	"www.us-west-2.prodaa.netflix.com",
	"news.com",
	"hardkernel.org",
	"amazon.com",
	"walmart.com",
	"target.com",
	"godaddy.com",
	"luajit.org",
}



local function queryA()
	local function queryDomain(name)
		local dns = DNSNameServer(serveraddress) -- ms corporate
		print("==== DNS A ====> ", name)
		for record in dns:A(name) do
			local a = IN_ADDR();
    		a.S_addr = record.Data.A.IpAddress

    		print(string.format("name: %s\tIP: %s, TTL %d", name, a, record.dwTtl));
		end
	end

	for _, name in ipairs(domains) do 
		spawn(queryDomain, name)
		--queryDomain(name)
	end
end

local function queryCNAME()
	local dns = DNSNameServer(serveraddress) -- ms corporate
	local function queryDomain(name)
		print("==== DNS CNAME ====> ", name)
		for record in dns:CNAME(name) do
			print(core_string.toAnsi(record.pName), core_string.toAnsi(record.Data.CNAME.pNameHost))
		end
	end

	for _, name in ipairs(domains) do 
		queryDomain(name)
	end
end

local function queryMX()
	local function queryDomain(name)
		local dns = DNSNameServer(serveraddress) -- ms corporate
		print("==== DNS MX ====> ", name)
		for record in dns:MX(name) do
			print(core_string.toAnsi(record.pName), core_string.toAnsi(record.Data["MX"].pNameExchange))
		end
	end

	for _, name in ipairs(domains) do 
		spawn(queryDomain, name)
	end
end

local function querySRV()
	local dns = DNSNameServer(serveraddress) -- ms corporate
	for _, name in ipairs(domains) do 
		print("==== DNS SRV ====> ", name)
		for record in dns:SRV(name) do
			print(core_string.toAnsi(record.pName), core_string.toAnsi(record.Data.SRV.pNameTarget))
		end
	end
end

local function main()
  queryA();
  --queryCNAME();
  --queryMX();
  --querySRV();
end

run(main)

The function queryA() will query for the ‘A’ records, and print them out. Notice that it has knowledge of the giant union structure that contains the results, and it pulls out the specific information for ‘A’ records. It will create a new instance of the DNSNameServer for each query. That’s not as bad as it might seem. All it amounts to is creating a new UDP socket for each query. Since each query is spawned into its own task, they are all free to run and complete independently, which was the goal of this little exercise.

In the case of the CNAME query, there is only a single socket, and it is used repeatedly, serially, for each query.

The difference between the two styles is noticeable. For the serial case, the queries might ‘take a while’, because you have to wait for each result to come back before issuing the next query. In the cooperative case, you issue several queries in parallel, so the total time will only be as long as the longest query.

That’s a good outcome.

I like this style of programming. You go as low as you can to root out where the system might otherwise block, and you make that part cooperative. That way everything else above it is automatically cooperative. I also like the fact that it feels like I’m getting some parallelism, but I’m not using any of the typical primitives of parallelism, including actual threads, mutexes, and the like.

Well, that’s a hefty bit of code, and it serves the purpose I set out, so I’m a happy camper. Now, if I could just turn those unions into tables automatically…


All the pretty little asyncs…

I have gone on about various forms of async for quite some time now. So could there possibly be more? Well, yes of course!

Here’s the scenario I want to enable. I want to keep track of my file system activity, sending the various operations to a distant storage facility. I want to do this while a UI displays what’s going on, and I want to be able to configure things while its happening, like which events I really care to shadow, and the like.

I don’t want to use multiple OS level threads if I can at all avoid them as they will complicate my programming tremendously. So, what to do.

Well, first I’ll start with the file tracking business. I have talked about change journals in the past. This is a basic mechanism that Windows has to track changes to the file system. Every single open, close, delete, write, etc, has an entry in this journal. If you’re writing a backup program, you’ll be using change journals.

The essence of the change journal functionality is usage of the DeviceIoControl() function. Most of us are very familiar with the likes of CreateFile(), ReadFile(), WriteFile(), CloseHandle(), when it comes to dealing with files. But, for everything else, there is this DeviceIOControl() function.

What is a device? Well, you’d be surprised to learn that most things in the Windows OS are represented by ‘devices’ just like they are in UNIX systems. For example, ‘C:’, is a device. But, also “DISPLAY1” is also a device, as are “LCD” and “PhysicalDisk0”. When it comes to controlling devices, the Win32 level API calls will ultimately make DeviceIoControl() calls with various parameters. That’s great to know as it allows you to create whatever API you want, as long as you know the nuances of the device driver you’re trying to control.

But, I digress. The key point here is that I can open up a device, and I can make a DeviceIoControl() call, and true to form, I can use OVERLAPPED structures, and IO Completion Ports. That makes these calls “async”, or with TINN, cooperative.

To wrap it up in a tidy little bow, here is a Device class which does the grunt work for me:

--[[
References
http://msdn.microsoft.com/en-us/magazine/cc163415.aspx
--]]
local ffi = require("ffi")
local bit = require("bit")
local bor = bit.bor;

local core_file = require("core_file_l1_2_0");
local core_io = require("core_io_l1_1_1");
local Application = require("Application")
local IOOps = require("IOOps")
local FsHandles = require("FsHandles");
local errorhandling = require("core_errorhandling_l1_1_1");
local WinBase = require("WinBase");


local Device = {}
setmetatable(Device, {
	__call = function(self, ...)
		return self:open(...)
	end,
})
local Device_mt = {
	__index = Device,
}

function Device.init(self, rawhandle)
	local obj = {
		Handle = FsHandles.FsHandle(rawhandle)
	}
	setmetatable(obj, Device_mt)
	
	Application:watchForIO(rawhandle, rawhandle)

	return obj;
end


function Device.open(self, devicename, dwDesiredAccess, dwShareMode)
	local lpFileName = string.format("\\\\.\\%s", devicename);
	dwDesiredAccess = dwDesiredAccess or bor(ffi.C.GENERIC_READ, ffi.C.GENERIC_WRITE);
	dwShareMode = bor(FILE_SHARE_READ, FILE_SHARE_WRITE);
	local lpSecurityAttributes = nil;
	local dwCreationDisposition = OPEN_EXISTING;
	local dwFlagsAndAttributes = FILE_FLAG_OVERLAPPED;
	local hTemplateFile = nil;

	local handle = core_file.CreateFileA(
        lpFileName,
        dwDesiredAccess,
        dwShareMode,
     	lpSecurityAttributes,
        dwCreationDisposition,
        dwFlagsAndAttributes,
     	hTemplateFile);


	if handle == INVALID_HANDLE_VALUE then
		return nil, errorhandling.GetLastError();
	end

	return self:init(handle)
end

function Device.getNativeHandle(self)
	return self.Handle.Handle;
end

function Device.createOverlapped(self, buff, bufflen)
	local obj = ffi.new("FileOverlapped");
	
	obj.file = self:getNativeHandle();
	obj.OVL.Buffer = buff;
	obj.OVL.BufferLength = bufflen;

	return obj;
end

function Device.control(self, dwIoControlCode, lpInBuffer, nInBufferSize, lpOutBuffer, nOutBufferSize)
	local lpBytesReturned = nil;
	local lpOverlapped = self:createOverlapped(ffi.cast("void *", lpInBuffer), nInBufferSize);


	local status = core_io.DeviceIoControl(self:getNativeHandle(), 
          dwIoControlCode, 
          ffi.cast("void *", lpInBuffer),
          nInBufferSize,
          lpOutBuffer,
          nOutBufferSize,
          lpBytesReturned,
          ffi.cast("OVERLAPPED *",lpOverlapped));

	local err = errorhandling.GetLastError();

	-- Error conditions
	-- status == 1, err == WAIT_TIMEOUT (258)
	-- status == 0, err == ERROR_IO_PENDING (997)
	-- status == 0, err == something else

	if status == 0 then
		if err ~= ERROR_IO_PENDING then
			return false, err
		end
	end

    local key, bytes, ovl = Application:waitForIO(self, lpOverlapped);

    return bytes;
end

return Device

I’ve shown this kind of construct before with the NativeFile object. That object contains Read, and Write functions as well, but lacks the control() function. Of course the two could be combined for maximum benefit.

How to use this thing?

dev = Device("c:")
dev:control(...)

OK, that’s out of the way. Now, what about this change journal thing? Very simple now that the device is handled.
A change journal can look like this:

-- USNJournal.lua
-- References
-- http://msdn.microsoft.com/en-us/library/windows/desktop/aa364563(v=vs.85).aspx
-- http://www.microsoft.com/msj/0999/journal/journal.aspx
-- http://www.microsoft.com/msj/1099/journal2/journal2.aspx
-- 

local ffi = require("ffi");
local bit = require("bit");
local bor = bit.bor;
local band = bit.band;

local core_io = require("core_io_l1_1_1");
local core_file = require("core_file_l1_2_0");
local WinIoCtl = require("WinIoCtl");
local WinBase = require("WinBase");
local errorhandling = require("core_errorhandling_l1_1_1");
local FsHandles = require("FsHandles");
local Device = require("Device")

--[[
	ChangeJournal

	An abstraction for NTFS Change journal management
--]]
local ChangeJournal = {}
setmetatable(ChangeJournal, {
	__call = function(self, ...)
		return self:open(...);
	end,
});

local ChangeJournal_mt = {
	__index = ChangeJournal;
}

ChangeJournal.init = function(self, device)
	local obj = {
		Device = device;
	}
	setmetatable(obj, ChangeJournal_mt);

	local jinfo, err = obj:getJournalInfo();

	print("ChangeJournal.init, jinfo: ", jinfo, err)

	if jinfo then
		obj.JournalID = jinfo.UsnJournalID;
		obj.LowestUsn = jinfo.LowestValidUsn;
		obj.FirstUsn = jinfo.FirstUsn;
		obj.MaxSize = jinfo.MaximumSize;
		obj.MaxUsn = jinfo.MaxUsn;
		obj.AllocationSize = jinfo.AllocationDelta;
	end

	return obj;
end


ChangeJournal.open = function(self, driveLetter)
	local device, err = Device(driveLetter)

	if not device then
		print("ChangeJournal.open, ERROR: ", err)
		return nil, err
	end

	return self:init(device);
end


ChangeJournal.getNextUsn = function(self)
	local jinfo, err = self:getJournalInfo();

	if not jinfo then
		return false, err;
	end

	return jinfo.NextUsn;
end



ChangeJournal.getJournalInfo = function(self)
	local dwIoControlCode = FSCTL_QUERY_USN_JOURNAL;
	local lpInBuffer = nil;
	local nInBufferSize = 0;
	local lpOutBuffer = ffi.new("USN_JOURNAL_DATA");
	local nOutBufferSize = ffi.sizeof(lpOutBuffer);

	local success, err = self.Device:control(dwIoControlCode, 
          lpInBuffer,
          nInBufferSize,
          lpOutBuffer,
          nOutBufferSize);

	if not success then
		return false, errorhandling.GetLastError();
	end

	return lpOutBuffer;
end

function ChangeJournal.waitForNextEntry(self, usn, ReasonMask) 
 	usn = usn or self:getNextUsn();
 	local ReasonMask = ReasonMask or 0xFFFFFFFF;
 	local ReturnOnlyOnClose = false;
 	local Timeout = 0;
 	local BytesToWaitFor = 1;

    local ReadData = ffi.new("READ_USN_JOURNAL_DATA", {usn, ReasonMask, ReturnOnlyOnClose, Timeout, BytesToWaitFor, self.JournalID});

    local pusn = ffi.new("USN");
    
    -- This function does not return until the USN
    -- record exits
	local BUF_LEN = ffi.C.USN_PAGE_SIZE;
	local Buffer = ffi.new("uint8_t[?]", BUF_LEN);
    local dwBytes = ffi.new("DWORD[1]");

	local success, err = self.Device:control(FSCTL_READ_USN_JOURNAL, 
        ReadData,
        ffi.sizeof(ReadData),
        Buffer,
        BUF_LEN);

	if not success then 
		return false, err
	end

	local UsnRecord = ffi.cast("PUSN_RECORD", ffi.cast("PUCHAR",Buffer) + ffi.sizeof("USN")); 

    return UsnRecord;
end

return ChangeJournal;

This very much looks like the change journal I created a few months back. The primary difference is the device control stuff is abstracted out into the Device object, so it does not need to be repeated here.

When we want to track the changes to the device, we make repeated calls to ‘waitForNextEntry’.

local function test_waitForNextEntry(journal)
    local entry = journal:waitForNextEntry();

    while entry do
        printJournalEntry(entry);
        entry = journal:waitForNextEntry();
    end
end

This is your typical serially written code. There’s nothing that look special about it, no hint of async processing. Behind the covers though, way back in the Device:control() function, the actual sending of a command to the device happens using IO Completion Port, so if you’re running with TINN, this particular task will ‘waitForIO’, and other tasks can continue.

So, using it in context looks like this:

local function main()
    local journal, err = ChangeJournal("c:")

    spawn(test_waitForNextEntry, journal);
    periodic(function() print("tick") end, 1000)
end

run(main)

In this case, I spawn the journal waiting/printing thing as its own task. Then I setup a periodic timer to simply print “tick” every second, to show there is some activity.

Since the journaling is cooperative (mostly waiting for io control to complete), the periodic timer, or UI processing, or what have you, is free to run, without any hindrance.

Combine this with the already cooperative UI stuff, and you can imagine how easy it could be to construct the scenario I set out to construct. Since all networking and file system operations in TINN are automatically async, it would be easy to log these values, or send them across a network to be stored for later analysis or what have you.

And there you have it. Async everywhere makes for some very nice scenarios. Being able to do async on any device, whether with standard read/write operations, or io control, makes for very flexible programming.

Next time around, I’m going to show how to do async DNS queries for fun and profit.


Interface Obscura – DXGI, DXVA, MMDevice…

If you have programmed with Windows over the years, you have been subjected to a plethora of languages, paradigms, frameworks, APIs, tectonic shifts, and the like. Why is this so? Well, because Windows is fairly old, and stuff that worked 30 years ago still works today, mostly. Hardware has evolved, gained and changed capabilities, and along with these changes, so too have the frameworks that support it.

One interesting trend that has been occurring with Windows over the past few OS releases, is that things are becoming more modular, and more properly layered. For example, the API sets that now exist in the core of Windows properly layer and segregate APIs such that you don’t have to pull in the entire world just to do a conversion from an Ansi string to a multi-byte string.

There are some parts of the system that offer up more obscure interfaces that most programmers won’t experience directly. Recently I’ve been wanting to get at the core of audio and visual stuff. In Windows, you could be using the Media Foundation framework, or you would be using the older ‘Wave’ media framework. Both of these at their core are supported by the MMDevice library. That’s an obscure library which gives you things like the names of your audio endpoints, and various attributes about them. Ultimately you can open up an audio stream on a device and get bytes out of it.

Another one, a little less obscure is the Direct X Graphics Interface (DXGI). This one gives you access to low level information about the graphics adapters and capabilities of your machine. How is it that the control panel knows about your different monitors for example? It ultimately calls down to DXGI I would bet.

Yet another one is DXVA. This one, closely related to DXGI, gives you access to graphics ‘surfaces’. This is how you can actually do high speed rendering directly to the screen for example. You can also make a high speed screen capture program, without the use of mirror drivers, because you can get a handle on the actual graphics surface that’s being used to render the screen. This is how the application and desktop thumbnails are able to be rendered.

In most cases, these interfaces are steeped in the Component Object Model (COM). COM is a fairly mature part of the system which was created to ease the task of joining disparate programming languages and system components using a common runtime infrastructure. It is the dynamic type system of Windows if you will. COM served a world where there was no universal type system, and you had to be able to marshal data across disparate libraries and application domains. If I were to do it from scratch today, I’d probably use something like LuaJIT’s ffi and just write the interfaces in standard ‘C’ headers, but this came from a different time.

Recently I wanted to get a handle on the audio endpoints of my machine, and just print out the various attributes of each endpoint. Here’s the code:

local MMDevice = require("MMDevice")

for device in MMDevice:AudioEndpoints() do
  print("Device ID: ", device:getID())

  for property in device:properties() do
    print("  Property: ", property.vt, tostring(property))
  end
end

Audio endpoints are things like ‘microphone’ and ‘speaker’. They either render sound (speaker), or capture sound (microphone). The MMDevice:AudioEndpoints() function is a typical Lua iterator on the IMMDevice interfaces that represent the various endpoints in the system. Since this is an iterator, it can be used with the Lua Fun functional programming library to do all sorts of filtering, reductions, and the like. But, in this particular case, I just want to print out the various properties associated with the endpoint.

Each device has an iterator that will feed out the properties. That is the ‘properties()’ function. Again, a standard Lua iterator which will just spew out the properties one by one. In this case, the properties are represented by a very COM like VARIANT structure. When you look it up, the VARIANT, is nothing more than a giant union of all the base types that the COM system understands. In a modern language such as Lua, I can actually return the base type instead of the boxed up union, but for now it’s good enough.

Here, the VARIANT object has a simple __tostring() implementation, to turn some of the more common types into their string equivalents, it looks something like this:

local PROPVARIANT = ffi.typeof("PROPVARIANT")
local PROPVARIANT_mt = {
    __tostring = function(self)
        if self.vt == ffi.C.VT_BLOB then
            return tostring(self.blob)
        elseif self.vt == ffi.C.VT_BOOL then
            return not (self.boolVal == 0);
        elseif self.vt == ffi.C.VT_CLSID then
            return tostring(ffi.cast("CLSID *", self.puuid))
        elseif self.vt == ffi.C.VT_LPWSTR then
            return core_string.toAnsi(self.pwszVal)
        elseif self.vt == ffi.C.VT_UI4 then
            return tonumber(self.uintVal)
        end

        return tostring(self.pcVal)
    end,
}
ffi.metatype(PROPVARIANT, PROPVARIANT_mt)

There are various levels of “COM Support” within the APIs of the Windows system. Some are fairly basic, providing nothing more than a fairly thin veneer over an otherwise standard C interface. The Kinect interface is like this, as is the DXGI interface. There may not be any usage of things like VARIANT, and there is not much ‘inheritance’ of interfaces other than from ‘IUnknown’. Then, there are other interfaces that pull you further down into the COM rabbit hole. MMDevice is one such interface. You need VARIANT, and IDispatch (at least in a cursory way), IStorage, IPropertyStorage, and quite a few GUIDs along the way. Everything is done with GUIDs, and you have to worry about using “Release()” on your interfaces, CoMemoryFree() and a whole bunch of other stuff which, if not done properly, will lead to leaks, bugs, and worse.

And this is why the various frameworks exist. This knowledge is too obscure and esoteric for most developers to have to concern themselves with it. Better someone who really knows and understands the intricacies to deal with it, and provide a higher level abstraction that smooths over some of the rough spots.

Why not just make the lower level stuff more approachable in the first place? Well, yah, it’s that legacy thing isn’t it? In some cases you can make a complete break and just code something up from scratch. The Kinect API does this. I can get frames out of my Kinect without having to worry about any multimedia frameworks, and without pulling in half the COM world.

It took me a good couple of days of pulling on the COM thread to bring in just enough support to make these interfaces work. Now that the basics of COM are in the system though, it’s much easier to pull in more interfaces. Mostly it’s a tedious job of looking up all the interfaces, putting them into a ffi header file, coding up the stubs to make the viable calls, making sure the __gc method is implemented to call “Release()”, etc. At the end of it, you gain access to what is likely a fairly useful interface without all the fuss of dealing with the low level COM stuff.

I implemented the MMDevice interface, not just to gain access to the audio endpoints, but really as a practice run for implementing the DXVA interfaces. In Windows 8, the DXVA interfaces contain the desktop duplication API, which replaces the old mirror driver API for screen copying. Using this low level API, I will redo the plumbing on my screen capture application, and dramatically improve the performance browser based screen sharing. The basic screen capture portion should remain fairly small, at a couple hundred lines of code, but the capabilities will be dramatically improved, while reducing the bandwidth requirements, which will mean even faster frame rates. It’s all goodness.

And so it goes. Windows has a rich history of APIs, and with that history comes a lot of baggage that most developers probably don’t want to deal with. There are new, more modern APIs introduced all the time. LuaJIT forms a pretty easy bridge to some of the less trod parts of the system, and gives rapid scriptable access to the new parts. Once that bridge has been formed, you can do some pretty nice things within the Lua environment on Windows.


Multitasking single threaded UI – Gaming meets networking

There are two worlds that I want to collide. There is the active UI gaming world, then there’s the more passive networking server world. The difference between these two worlds is the kind of control loop and scheduler that they both typically use.

The scheduler for the networking server is roughly:

while true do
  waitFor(networkingEvent)
  
  doNetworkingStuff()
end

This is great, and works quite well to limit the amount of resources required to run the server because most of the time it’s sitting around idle. This allows you to serve up web pages with a much smaller machine than if you were running flat out, with say a “polling” api.

The game loop is a bit different. On Windows it looks something like this:

while true do
  ffi.fill(msg, ffi.sizeof("MSG"))
  local peeked = User32.PeekMessageA(msg, nil, 0, 0, User32.PM_REMOVE);
			
  if peeked ~= 0 then
    -- do regular Windows message processing
    local res = User32.TranslateMessage(msg)		
    User32.DispatchMessageA(msg)
  end

  doOtherStuffLikeRunningGamePhysics();
end

So, when I want to run a networking game, where I take some input from the internet, and incorporate that into the gameplay, I’ve got a basic problem. Who’s on top? Which loop is going to be THE loop?

The answer lies in the fact that the TINN scheduler is a basic infinite loop, and you can modify what occurs within that loop. Last time around, I showed how the waitFor() function can be used to insert a ‘predicate’ into the scheduler’s primary loop. So, perhaps I can recast the gaming loop as a predicate and insert it into the scheduler?

I have a ‘GameWindow’ class that takes care of creating a window, showing it on the screen and dealing with drawing. This window has a run() function which has the typical gaming infinite loop. I have modified this code to recast things in such a way that I can use the waitFor() as the primary looping mechanism. The modification make it look like this:

local appFinish = function(win)

  win.IsRunning = true
  local msg = ffi.new("MSG")

  local closure = function()
    ffi.fill(msg, ffi.sizeof("MSG"))
    local peeked = User32.PeekMessageA(msg, nil, 0, 0, User32.PM_REMOVE);
			
    if peeked ~= 0 then
      local res = User32.TranslateMessage(msg)
      User32.DispatchMessageA(msg)

      if msg.message == User32.WM_QUIT then
        return win:OnQuit()
      end
    end

    if not win.IsRunning then
      return true;
    end
  end

  return closure;
end

local runWindow = function(self)
	
  self:show()
  self:update()

  -- Start the FrameTimer
  local period = 1000/self.FrameRate;
  self.FrameTimer = Timer({Delay=period, Period=period, OnTime =self:handleFrameTick()})

  -- wait here until the application window is closed
  waitFor(appFinish(self))

  -- cancel the frame timer
  self.FrameTimer:cancel();
end

GameWindow.run = function(self)
  -- spawn the fiber that will wait
  -- for messages to finish
  Task:spawn(runWindow, self);

  -- set quanta to 0 so we don't waste time
  -- in i/o processing if there's nothing there
  Task:setMessageQuanta(0);
	
  Task:start()
end

Starting from the run() function. Only three things need to occur. First, spawn ‘runWindow’ in its own fiber. I want this routine to run in a fiber so that it is cooperative with other parts of the system that might be running.

Second, call ‘setMessageQuanta(0)’. This is a critical piece to get a ‘gaming loop’. This quanta is the amount of time the IO processing part of the scheduler will spend waiting for an IO event to occur. This time will be spent every time through the primary scheduler’s loop. If the value is set to 0, then effectively we have a nice runaway infinite loop for the scheduler. IO events will still be processed, but we won’t spend any time waiting for them to occur.

This has the effect of providing maximum CPU timeslice to various other waiting fibers. If this value is anything other than 0, let’s say ‘5’ for example, then the inner loop of the scheduler will slow to a crawl, providing no better than 60 iterations of the loop per second. Not enough time slice for a game. Setting it to 0 allows more like 3000 iteractions of the loop per second, which gives more time to other fibers.

That’s the trick of this integration right there. Just set the messageQuanta to 0, and away you go. To finish this out, take a look at the runWindow() function. Here just a couple of things are set in place. First, a timer is created. This timer will ultimately end up calling a ‘tick()’ function that the user can specify.

The other thing of note is the use of the waitFor(appCheck(self)). This fiber will block here until the “appCheck()” predicate returns true.

So, finally, the appFinish() predicate, what does that do?

Well, I’ll be darned if it isn’t the essence of the typical game window loop, at least the Windows message handling part of it. Remember that a predicate that is injected to the scheduler using “waitFor()” is executed every time through the scheduler’s loop, so the scheduler’s loop is essentially the same as the outer loop of a typical game.

With all this in place, you can finally do the following:


local GameWindow = require "GameWindow"
local StopWatch = require "StopWatch"

local sw = StopWatch();

-- The routine that gets called for any
-- mouse activity messages
function mouseinteraction(msg, wparam, lparam)
	print(string.format("Mouse: 0x%x", msg))
end

function keyboardinteraction(msg, wparam, lparam)
	print(string.format("Keyboard: 0x%x", msg))
end


function randomColor()
		local r = math.random(0,255)
		local g = math.random(0,255)
		local b = math.random(0,255)
		local color = RGB(r,g,b)

	return color
end

function randomline(win)
	local x1 = math.random() * win.Width
	local y1 = 40 + (math.random() * (win.Height - 40))
	local x2 = math.random() * win.Width
	local y2 = 40 + (math.random() * (win.Height - 40))

	local color = randomColor()
	local ctxt = win.GDIContext;

	ctxt:SetDCPenColor(color)

	ctxt:MoveTo(x1, y1)
	ctxt:LineTo(x2, y2)
end

function randomrect(win)
	local width = math.random(2,40)
	local height = math.random(2,40)
	local x = math.random(0,win.Width-1-width)
	local y = math.random(0, win.Height-1-height)
	local right = x + width
	local bottom = y + height
	local brushColor = randomColor()

	local ctxt = win.GDIContext;

	ctxt:SetDCBrushColor(brushColor)
	--ctxt:RoundRect(x, y, right, bottom, 0, 0)
	ctxt:Rectangle(x, y, right, bottom)
end


function ontick(win, tickCount)

	for i=1,30 do
		randomrect(win)
		randomline(win)
	end

	local stats = string.format("Seconds: %f  Frame: %d  FPS: %f", sw:Seconds(), tickCount, tickCount/sw:Seconds())
	win.GDIContext:Text(stats)
end



local appwin = GameWindow({
		Title = "Game Window",
		KeyboardInteractor = keyboardinteraction,
		MouseInteractor = mouseinteraction,
		FrameRate = 24,
		OnTickDelegate = ontick,
		Extent = {1024,768},
		})


appwin:run()

This will put a window on the screen, and draw some lines and rectangles at a frame rate of roughly 24 frames per second.

And thus, the two worlds have been melded together. I’m not doing any networking in this particular case, but adding it is no different than doing it for any other networking application. The same scheduler is still at play, and everything else will work as expected.

The cool thing about this is the integration of these two worlds does not require the introduction of multiple threads of execution. Everything is still within the same single threaded context that TINN normally runs with. If real threads are required, they can easily be added and communicated with using the computicles that are within TINN.


My Head In The Cloud – putting my code where my keyboard is

I have written a lot about the greatness of LuaJIT, coding for the internet, async programming, and the wonders of Windows. Now, I have finally reached a point where it’s time to put the code to the test.

I am running a service in Azure: http://nanotechstyles.cloudapp.net

This site is totally fueled by the work I have done with TINN. It is a static web page server with a couple of twists.

First of all, you can access the site through a pretty name:
http://www.nanotechstyles.com

If you just hit the site directly, you will get the static front page which has nothing more than an “about” link on it.

If you want to load up a threed model viewing thing, hit this:
http://www.nanotechstyles.com/threed.html

If you want to see what your browser is actually sending to the server, then hit this:
http://www.nanotechstyles.com/echo

I find the echo thing to be interesting, and I try hitting the site using different browsers to see what they produce.  This kind of feedback makes it relatively easy to do rapid turnarounds on the webpage content, challenging my assumptions and filling in the blanks.

The code for this web server is not very complex.  It’s the same standard ‘main’ that I’ve used in the past:

local resourceMap = require("ResourceMap");
local ResourceMapper = require("ResourceMapper");
local HttpServer = require("HttpServer")

local port = arg[1] or 8080

local Mapper = ResourceMapper(resourceMap);

local obj = {}

local OnRequest = function(param, request, response)
	local handler, err = Mapper:getHandler(request)


	-- recycle the socket, unless the handler explictly says
	-- it will do it, by returning 'true'
	if handler then
		if not handler(request, response) then
			param.Server:HandleRequestFinished(request);
		end
	else
		print("NO HANDLER: ", request.Url.path);
		-- send back content not found
		response:writeHead(404);
		response:writeEnd();

		-- recylce the request in case the socket
		-- is still open
		param.Server:HandleRequestFinished(request);
	end

end

obj.Server = HttpServer(port, OnRequest, obj);
obj.Server:run()

In this case, I’m dealing with the OnRequest() directly, rather than using the WebApp object.  I’m doing this because I want to do some more interactions at this level that the standard WebApp may not support.

Of course the ‘handlers’ are where all the fun is. I guess it makes sense to host the content of the site up on the site for all to see and poke fun at.

My little experiment here is to give my code real world exposure, with the intention of hardening it, and gaining practical experience on what a typical web server is likely to see out in the wild.

So, if you read this blog, go hit those links. Soon enough, perhaps I will be able to serve up my own blog using my own software. That’s got a certain circular reference to it.