Spelunking Windows – Tokens for fun and profit

I want to shutdown/restart my machine programmatically. There’s an API for that:

-- kernel32.dll
BOOL
InitiateSystemShutdownExW(
    LPWSTR lpMachineName,
    LPWSTR lpMessage,
    DWORD dwTimeout,
    BOOL bForceAppsClosed,
    BOOL bRebootAfterShutdown,
    DWORD dwReason);

Wow, it’s that easy?!!

OK. So, I need the name of the machine, some message to display in a dialog box, a timeout, force app closure, reboot or not, and some reason why the shutdown is occuring. That sounds easy enough. So, I’ll just give it a call…

local status = core_shutdown.InitiateSystemShutdownExW(
  nil,    -- nil, so local machine
  nil,    -- no special message
  10,     -- wait 10 seconds
  false,  -- don't force apps to close
  true,   -- reboot after shutdown
  ffi.C.SHTDN_REASON_MAJOR_APPLICATION);

And what do I get for my troubles?
> error: (5) ERROR_ACCESS_DENIED

Darn, now I’m going to have to read the documentation.

In the Remarks of the documentation, it plainly states:

To shut down the local computer, the calling thread must have the SE_SHUTDOWN_NAME privilege.

Yah, ok, right then. What’s a privilege? And thus Alice went down into the rabbit’s hole…

As it turns out, there are quite a few concepts in Windows that are related to identity, security, authorization, and the like. As soon as you log into your machine, even if done programmatically, you get this thing called a ‘Token’ attached to your process. The easiest way to think of the token is it’s your electronic proxy and passport. Just like your passport, this token contains some basic identity information about who you are (name, identifying marks…). Some things in the system, such as being able to access a file, can be handled simply by knowing your name. These are simple access rights. But, other things in the system require a ‘visa’, meaning, not only does the operation have to know who you are, but it also needs to know you have the proper permissions to perform the operation you’re about to perform. It’s just like getting a visa stamped into your passport. If I want to travel to India, my passport alone is not enough. I need to get a visa as well. The same is true of this token thing. It’s not enough that I simply have an identity, I must also have a “privilege” in order to perform certain operations.

In addition to having a privilege, I must actually ‘activate’ it. So, yes, the system may have granted me the privilege, but it’s like super powers, you don’t want them to always be active. It’s like when you’re walking down the street in that foreign country you’re visiting. You don’t walk down the street flashing your fancy passport showing everyone the neat visas you have stamped in there. If you do, you’ll likely get a crowd following you trying to relieve you of said passport. So, you generally keep it to yourself, and only flash it when the need arises. So too with token privilege. Yes, you might have the ability to reboot the machine, but you don’t always want to have that privilege enabled, in case some nefarious software so happens to come along to exploit that fact.

Alright, that’s enough analogizing. How about some code. Well, it can be daunting to get your head around the various APIs associated with tokens. To begin with, there is a token associated with the process you’re currently running in, and there is a token associated with every thread you may launch from within that process as well. Generally, you want the process token if you’re single threaded. That’s one API call:

BOOL
OpenProcessToken (
    HANDLE ProcessHandle,
    DWORD DesiredAccess,
    PHANDLE TokenHandle
    );

This is one of those standard API calls where you pass in a couple of parameters (ProcessHandle, DesiredAccess), and a ‘handle’ is returned (TokenHandle). You then use the ‘handle’ to make subsequent calls to the various API functions. This is ripe for wrapping up in some nice data structure to deal with it.

I’ve created the ‘Token’ object, as the convenience point. One of the functions in there is this one:

getProcessToken = function(self, DesiredAccess)
  DesiredAccess = DesiredAccess or ffi.C.TOKEN_QUERY;
	
  local ProcessHandle = core_process.GetCurrentProcess();
  local pTokenHandle = ffi.new("HANDLE [1]")
  local status  = core_process.OpenProcessToken (ProcessHandle, DesiredAccess, pTokenHandle);

  if status == 0 then
    return false, errorhandling.GetLastError();
  end

  return Token(pTokenHandle[0]);
end

One of the important things to take note of when you create a token is the DesiredAccess. What you can do with a token after it is created is somewhat determined by the access that you put into it when you create it. Here are the various options available:

static const int TOKEN_ASSIGN_PRIMARY    =(0x0001);
static const int TOKEN_DUPLICATE         =(0x0002);
static const int TOKEN_IMPERSONATE       =(0x0004);
static const int TOKEN_QUERY             =(0x0008);
static const int TOKEN_QUERY_SOURCE      =(0x0010);
static const int TOKEN_ADJUST_PRIVILEGES =(0x0020);
static const int TOKEN_ADJUST_GROUPS     =(0x0040);
static const int TOKEN_ADJUST_DEFAULT    =(0x0080);
static const int TOKEN_ADJUST_SESSIONID  =(0x0100);

For the case where we want to turn on a privilege that’s attached to the token, we will want to make sure the ‘TOKEN_ADJUST_PRIVILEGES’ access right is attached. It also does not hurt to add the ‘TOKEN_QUERY’ access as well. It’s probably best to use the least of these rights as is necessary to get the job done.

Setting a privilege on a token is another bit of work. It’s not hard, but it’s just one of those things where you have to read the docs, and look at a few samples on the internet in order to get it right. Assuming your token has the TOKEN_ADJUST_PRIVILEGES access right on it, you can do the following:

Token.enablePrivilege = function(self, privilege)
  local lpLuid, err = self:getLocalPrivilege(privilege);
  if not lpLuid then
    return false, err;
  end

  local tkp = ffi.new("TOKEN_PRIVILEGES");
  tkp.PrivilegeCount = 1;
  tkp.Privileges[0].Luid = lpLuid;
  tkp.Privileges[0].Attributes = ffi.C.SE_PRIVILEGE_ENABLED;

  local status = security_base.AdjustTokenPrivileges(self.Handle.Handle, false, tkp, 0, nil, nil);

  if status == 0 then
    return false, errorhandling.GetLastError();
  end

  return true;
end

Well, that gets into some data structures, and introduces this thing called a LUID, and that AdjustTokenPrivileges function, and… I get tired just thinking about it. Luckily, once you have this function, it’s a fairly easy task to turn a privilege on and off.

OK. So, with this little bit of code in hand, I can now do the following:

	local token = Token:getProcessToken(ffi.C.TOKEN_ADJUST_PRIVILEGES);
	token:enablePrivilege(Token.Privileges.SE_SHUTDOWN_NAME);

This just gets a token that is associated with the current process and turns on the privilege that allows us to successfully call the shutdown function.

In totality:

-- test_shutdown.lua
local ffi = require("ffi");

local core_shutdown = require("core_shutdown_l1_1_0");
local errorhandling = require("core_errorhandling_l1_1_1");
local Token = require("Token");

local function test_Shutdown()
  local token = Token:getProcessToken();
  token:enablePrivilege(Token.Privileges.SE_SHUTDOWN_NAME);
	
  local status = core_shutdown.InitiateSystemShutdownExW(nil, nil,
    10,false,true,ffi.C.SHTDN_REASON_MAJOR_APPLICATION);

  if status == 0 then
    return false, errorhandling.GetLastError();
  end

  return true;
end

print(test_Shutdown());

And finally we emerge back into the light! This will now actually work. It’s funny, when I got this to work correctly, I pointed out to my wife that my machine was rebooting without me touching it. She tried to muster a smile of support, but really, she wasn’t that impressed. But, knowing the amount of work that goes into such a simple task, I gave myself a pat on the back, and smiled inwardly at the greatness of my programming fu.

Tokens are a very powerful thing in Windows. Being able to master both the concepts, and the API calls themselves, gives you a lot of control over what happens with your machine.


Spelunking Windows – Exploring the file system

I’m working on programs that generally make your stuff available to you “from any device anywhere”. While some of the work is of the generic internet high performance server variety, some of it is much more esoteric. A lot of your “stuff” is in files located on your machine. Widows has a wild array of filesytem related APIs, and it can be a daunting task to try and wrap your head around it to achieve some given task.

I set out a task for myself to turn the file system into a relatively easy to query “database” to get lists of files that meet certain criteria based on their various attributes. Some queries that I’m interested in:

List all directories on my machine
List all files that contain “.lua”
List all files which are hidden
List all system files
List all compressed files

Of course, using the standard search capabilities that are built into Windows, you can perform some of these tasks. This is different from the generally useful ‘find’ command of the command shell as well, as that command is interested in searching the contents of the file.

Yes, there are numerous tools and ways to perform these tasks, so what I am exploring is how you would actually create such a tool from scratch if you were so inclined.

I will begin at the beginning. Exploring the file system doesn’t require that much. Just a few data structures, and 3 function calls. The key components are the following:
(Full Source Available here: https://github.com/Wiladams/TINN/tree/master/tests FileSystemItem.lua, test_filesystem.lua)

ffi.cdef[[
typedef struct _WIN32_FIND_DATAW {
    DWORD dwFileAttributes;
    FILETIME ftCreationTime;
    FILETIME ftLastAccessTime;
    FILETIME ftLastWriteTime;
    DWORD nFileSizeHigh;
    DWORD nFileSizeLow;
    DWORD dwReserved0;
    DWORD dwReserved1;
    WCHAR  cFileName[ MAX_PATH ];
    WCHAR  cAlternateFileName[ 14 ];
} WIN32_FIND_DATAW, *PWIN32_FIND_DATAW, *LPWIN32_FIND_DATAW;

HANDLE
FindFirstFileExW(
  LPCWSTR lpFileName,
  FINDEX_INFO_LEVELS fInfoLevelId,
  LPVOID lpFindFileData,
  FINDEX_SEARCH_OPS fSearchOp,
  LPVOID lpSearchFilter,
  DWORD dwAdditionalFlags);

BOOL
FindNextFileW(HANDLE hFindFile,
  LPWIN32_FIND_DATAW lpFindFileData);

BOOL
FindClose(HANDLE hFindFile);
]]

The three functions; FindFirstFileExW, FindNextFileW, FindClose; combine to form an ‘iteration’ set. The iteration begins with FindFirstFileExW, which also returns the first results, and continues with FindNextFileW. After all is done, you finish up with FindClose, to recover the system resources that were allocated for this little search. Although there are flags to be set to enhance the search capabilitie, I don’t actually want to use them as I can create much more interesting search filters from the Lua side.

First things first though. The ‘handle’ that is created when you call ‘FindFirstFileExW’ must be cleaned up with a matching ‘FindClose’. If you don’t do this, you’ll end up with a leaked handle, which is essentially a resource leak. You don’t want that, so a ‘smart pointer’ is created to deal with the lifetime of that thing.

ffi.cdef[[
typedef struct {
  HANDLE Handle;
} FsFindFileHandle;
]]
local FsFindFileHandle = ffi.typeof("FsFindFileHandle");
local FsFindFileHandle_mt = {
  __gc = function(self)
    core_file.FindClose(self.Handle);
  end,

  __index = {
    isValid = function(self)
      return self.Handle ~= INVALID_HANDLE_VALUE;
    end,
  },
};
ffi.metatype(FsFindFileHandle, FsFindFileHandle_mt);

This little wrapper ensures that whenever the handle is no longer being referenced, it will automatically get cleaned up because the __gc method will call ‘FindClose()’, which is exactly what we want. Here is how it can be used:

local rawHandle = core_file.FindFirstFileExW(lpFileName,
  fInfoLevelId,
  lpFindFileData,
  fSearchOp,
  lpSearchFilter,
  dwAdditionalFlags);

local handle = FsFindFileHandle(rawHandle);

Alright, so there’s a nicely wrapped handle to the beginning of the iterator. The full iterator looks like this:

-- Iterate over the subitems this item might contain
FileSystemItem.items = function(self, pattern)
	pattern = pattern or self:getFullPath().."\\*";
	local lpFileName = core_string.toUnicode(pattern);
	--local fInfoLevelId = ffi.C.FindExInfoStandard;
	local fInfoLevelId = ffi.C.FindExInfoBasic;
	local lpFindFileData = ffi.new("WIN32_FIND_DATAW");
	local fSearchOp = ffi.C.FindExSearchNameMatch;
	local lpSearchFilter = nil;
	local dwAdditionalFlags = 0;

	local rawHandle = core_file.FindFirstFileExW(lpFileName,
		fInfoLevelId,
		lpFindFileData,
		fSearchOp,
		lpSearchFilter,
		dwAdditionalFlags);

	local handle = FsFindFileHandle(rawHandle);
	local firstone = true;

	local closure = function()
		if not handle:isValid() then 
			return nil;
		end

		if firstone then
			firstone = false;
			return FileSystemItem({
				Parent = self;
				Attributes = lpFindFileData.dwFileAttributes;
				Name = core_string.toAnsi(lpFindFileData.cFileName);
				Size = (lpFindFileData.nFileSizeHigh * (MAXDWORD+1)) + lpFindFileData.nFileSizeLow;
				});
		end

		local status = core_file.FindNextFileW(handle.Handle, lpFindFileData);

		if status == 0 then
			return nil;
		end

		return FileSystemItem({
				Parent = self;
				Attributes = lpFindFileData.dwFileAttributes;
				Name = core_string.toAnsi(lpFindFileData.cFileName);
				});

	end
	
	return closure;
end

Refer to the full source to see it in context.

Here is how I would use it:

local depthQuery = {}

depthQuery.traverseItems = function(starting, indentation, filterfunc)
  indentation = indentation or "";

  starting = starting or FileSystemItem({Name="c:"});

  for item in starting:items() do
    if filterfunc then
      if filterfunc(item) then
        if item.Name ~= '.' and item.Name ~= ".." then
          io.write(indentation, item.Name, '\n');
        end
      end
    else
      if item.Name ~= '.' and item.Name ~= ".." then
        io.write(indentation, item.Name, '\n');
      end
    end
		
    if item:isDirectory() and item.Name ~= "." and item.Name ~= ".." then
      depthQuery.traverseItems(item, indentation.."  ", filterfunc);
    end
  end
end

-- Iterate all files/directories on 'c:' drive
depthQuery.traverseItems(FileSystemItem({Name="c:"}));

Scanning the entire c: drive on my recently new lenovo X1 carbon takes about 6 seconds, and there are a few hundred thousand files on it. Most of the time is spent on the string creation and io. On smaller directory searches, the scan is essentially instantaneous.

With these basics in hand, I can now do some more interesting queries.

local function passLua(item)
  return item.Name:find(".lua", 1, true);
end

depthQuery.traverseItems(FileSystemItem({Name="c:"}), "", passLua);

In this case, I want to find all the files on my system that have ‘.lua’ in their name. The ‘passLua()’ function is very simple, just doing a string compare. Of course, you have the full power of Lua and any libraries at your disposal. You could even open up the file if you like, and read the contents and decide whether you wanted to pass it along or not. Your filter just returns ‘true’ to pass it along, or ‘false’ to block it.

The FileSystemItem object has some of the file’s properties readily available, so they can be a part of the filtering as well. If I wanted a list of all the directories on my ‘c:’ drive, I would do the following:

local function passDirectory(item)
  return item:isDirectory();
end

depthQuery.traverseItems(FileSystemItem({Name="c:"}), "", passDirectory);

Of course, if you were using the .net frameworks, or Java, or Python, or any number of mature libraries in the world, you’d be thinking this was very simple work indeed, and what’s all the fuss? No fuss, really. The key here is showing how simple it really is to do these things, and create your own. I find it useful to do it in Lua because it’s easier than trying to do it with some more involved environments. Once the basic wrappers are in place, spelunking around becomes much easier.

Windows is a vast landscape of mature APIs which have evolved over time to meet the needs of diverse consumers over the years. The APIs are raw, and at times daunting. With a little bit of wrapping though, exploring them becomes much easier. In this particular case, by doing everything from the low level ffi to the higher level iterator, I’ve put in place some rope ladders, pitons, and other core exploration equipment. Now I can reap the benefits and do some more exploration with relative ease and safety.


When Is Software Engineering – Surely a database is required

So, I’ve gotten data, and presented it on a web page in JSON format.  If that’s not engineering, I’m not sure what is, but way, surely a database of sorts must be involved.

There are plenty of times in my code where I need to quickly filter some ‘records’ performing some activity only on those records that meet a particular criteria.  Given that Lua is table based, everything of interest becomes a ‘record’.  This applies to “classes” as well as the more garden variety of ‘records’ that might be streaming out of an actual database, or in my recent example, a simple iterator over the services on my machine.  It would be nice if I had some fairly straight forward way to deal with those records.   What I need is an iterator based query processor.

The requirements are fairly simple.  There are three things that are typical of record processors:

record source –  The source of data.  In my case, the source will be any iterator that feeds out simple key/value table structures.

projection – In database terminology, ‘projection’ is simply the list of fields that you want to actually present in the query results.  I might have a record that looks like this:

{name = "William", address="1313 Mockingbird Lane", occupation="enng"}

I might want to just retrieve the name though, so the projection would be simply:

{name = "William"}

filter – I want the ability to only retrieve the records that meet a particular criteria.

I will ignore aggregate functions, such as groupby, sort, and the like as those do not work particularly well with a streaming interface. What follows is a simple implementation of a query processor that satisfies the needs I listed above:

-- Query.lua
--

--[[
	the query function receives its parameters as a single table
	params.source - The data source.  It should be an iterator that returns
	table values

	params.filter - a function, that receives a single table value as input
	and returns a single table value as output.  If the record is 'passed' then
	it is returned as the return value.  If the record does not meet the filter
	criteria, then 'nil' will be returned.

	params.projection - a function to morph a single entry.  It receives a single
	table value as input, and returns a single table value as output.

	The 'filter' and 'projection' functions are very similar, and in fact, the
	filter can also be used to transform the input.  They are kept separate 
	so that each can remain fairly simple in terms of their implementations.
--]]

local query = function(params)
	if not params or not params.source then
		return false, "source not specified";
	end

	local nextRecord = params.source;
	local filter = params.filter;
	local projection = params.projection;


	local function closure()
		local record;

		if filter then
			while true do
				record = nextRecord();	
	
				if not record then
					return nil;
				end
				
				record = filter(self, record);

				if record then
					break;
				end
			end
		else
			record = nextRecord();
		end

		if not record then
			return nil;
		end

		if projection then
			return projection(self, record);
		end

		return record;
	end

	return closure;
end

-- A simple iterator over a table
-- returns the embedded table entries
-- individually.
local irecords = function(tbl)
	local i=0;

	local closure = function()
		i = i + 1;
		if i > #tbl then
			return nil;
		end

		return tbl[i];
	end

	return closure	
end

-- given a key/value record, and a filter table
-- pass the record if every field in the filtertable
-- matches a field in the record.
local recordfilter = function(record, filtertable)
	for key,value in pairs(filtertable) do
		if not record[key] then 
			print("record does not have field: ", key)
			return nil;
		end

		if tostring(record[key]) ~= tostring(value) then
			print(record[key], "~=", value);
			return nil;
		end
	end

	return record;
end

return {
  irecords = irecords,
  recordfilter = recordfilter,
  query = query,
}

The ‘query()’ function represents the bulk of the operation. The other two functions help in forming iterators and doing simple queries.

Here is one example of how it can be used:

-- test_query.lua
--

local JSON = require("dkjson");
local Query = require("Query");
local irecords = Query.irecords

local records = {
  {name = "William", address="1313 Mockingbird Lane", occupation = "eng"},
  {name = "Daughter", address="university", occupation="student"},
  {name = "Wife", address="home", occupation="changer"},
}

local test_query = function()
  local source = irecords(records);

  local res = {}

  for record in Query.query {
    source = source, 
	
    projection = function(self, record)
      return {name=record.name, address=record.address, };
    end,

    filter = function(self, record)
      if record.occupation == "eng" then
        return record;
      end
    end
  } do
    table.insert(res, record);
  end

  local jsonstr = JSON.encode(res, {indent=true});
  print(jsonstr);
end

test_query();

Which results in the following:

[{
    "name":"William",
    "address":"1313 Mockingbird Lane"
  }]

This uses the iterator, a specified filter, and projection. The query() function itself is an iterator, so it will iterate over the data source, and apply the filter and projection to each record, returning results. Nice and easy, very Lua like.

Now that I have a very rudimentary query processor, I can apply it to my web case. So, if I rewrite the web page that’s showing the services on my machine, and can deal with a little bit of query processing:

--[[
	Description: A very simple demonstration of one way a static web server
	can be built using TINN.

	In this case, the WebApp object is being used.  It is handed a routine to be
	run for every http request that comes in (HandleSingleRequest()).

	Either a file is fetched, or an error is returned.

	Usage:
	  tinn staticserver.lua 8080

	default port used is 8080
]]

local WebApp = require("WebApp")


local HttpRequest = require "HttpRequest"
local HttpResponse = require "HttpResponse"
local URL = require("url");
local StaticService = require("StaticService");
local SCManager = require("SCManager");
local JSON = require("dkjson");
local Query = require("Query");
local utils = require("utils");


local getRecords = function(query)
  local mgr, err = SCManager();
  local filter = nil;
  local queryparts;

  if query then
    queryparts = utils.parseparams(query);

    filter = function(self, record)
      return Query.recordfilter(record, queryparts);
    end
  end

  local res = {};

  for record in Query.query {
    source = mgr:services(), 
    filter = filter,
    } do
      table.insert(res, record);
  end
  return res;
end

local HandleSingleRequest = function(stream, pendingqueue)
	local request, err  = HttpRequest.Parse(stream);

	if not request then
		print("HandleSingleRequest, Dump stream: ", err)
		return 
	end

	local urlparts = URL.parse(request.Resource)
	local response = HttpResponse.Open(stream)

	if urlparts.path == "/system/services" then
		local res = getRecords(urlparts.query);
		local jsonstr = JSON.encode(res, {indent=true});

		--print("echo")
		response:writeHead("200")
		response:writeEnd(jsonstr);
	else
		response:writeHead("404");
		response:writeEnd();
	end

	-- recycle the stream in case a new request comes 
	-- in on it.
	return pendingqueue:Enqueue(stream)
end


--[[ Configure and start the service ]]
local port = tonumber(arg[1]) or 8080

Runtime = WebApp({port = port, backlog=100})
Runtime:Run(HandleSingleRequest);

Here I have introduced the ‘getRecords()’ function, which takes care of getting the raw records from the list of services, and running the query to filter for the ones that I might want to see. In this case, a filter is created if the user specifies something interesting in the url. Without a filter, the url is simply:


http://localhost:8080/system/services

In which case you’ll get the list of all services on the machine, regardless of their current running state.

If you wanted to filter for only the services that were currently running, you would specify a URL such as this:


http://localhost:8080/system/services?State=RUNNING

And if you want to look for a particular service, by name, you would do:


http://localhost:8080/system/services?ServiceName=ACPI

[{
    "ServiceType":"KERNEL_DRIVER",
    "ProcessId":0,
    "DisplayName":"Microsoft ACPI Driver",
    "ServiceName":"ACPI",
    "ServiceFlags":0,
    "State":"RUNNING"
  }]

Of course, you can also do simple combinations:


http://localhost:8080/system/services?State=RUNNING;ServiceType=KERNEL_DRIVER

This will return the list of all the kernel drivers that are currently running.

Of course, if you’re sitting on your local machine, you could bring up the TaskManager, export the list of services, import it into a real database/excel, and perform queries to your heart’s content…

This type of coding makes spelunking your system really easy. The fact that it’s available through a web interface opens up some possibilities in terms of display, interaction, and accessibility. Since the stream is just JSON, it could be fairly straight forward to present this information in a much more interesting form, perhaps by using d3 or webgl, or who knows what.

So, is this software engineering?

Having gone from a low level system call to a higher level web based interface with interactive query capabilities, I’d say it must be approaching the term. Perhaps the ‘engineering’ lies in the simplicity. Rather than this being a fairly large integrated system, it’s just a few lines of script code that ties together well.

I believe the “engineering”, and thus an “engineer” comes from being able to recognize the minimal amount of code necessary to get a job done. The “engineering” lies in the process of finding those minimal lines of code.


When Is Software Engineering – Hitting the Interwebs

So far, I’ve tamed an OS API that gives me the list of services running on my machine. I’ve been able to slap an iterator on it so that I could more easily deal with the information within my scripting language. Surely this is engineering?

My daughter, who knows Python, is not impressed. Next step, surely if I can make this information readily available through a web interface, it will be “Engineering”…

local WebApp = require("WebApp")

local HttpRequest = require "HttpRequest"
local HttpResponse = require "HttpResponse"
local URL = require("url");
local StaticService = require("StaticService");
local SCManager = require("SCManager");
local JSON = require("dkjson");

local HandleSingleRequest = function(stream, pendingqueue)
  local request, err  = HttpRequest.Parse(stream);

  if not request then
    print("HandleSingleRequest, Dump stream: ", err)
    return
  end

  local urlparts = URL.parse(request.Resource)

  if urlparts.path == "/system/services" then
    local mgr, err = SCManager();
    local res = {}

    for service in mgr:services() do
      if service.Status.State == "RUNNING" then
        table.insert(res, service);
      end
    end

    local jsonstr = JSON.encode(res, {indent=true});

    local response = HttpResponse.Open(stream)
    response:writeHead("200")
    response:writeEnd(jsonstr);
  else
    local filename = './wwwroot'..urlparts.path;
    local response = HttpResponse.Open(stream);
    StaticService.SendFile(filename, response)
  end

  -- recycle the stream in case a new request comes
  -- in on it.
  return pendingqueue:Enqueue(stream)
end

Runtime = WebApp({port = 8080, backlog=100})
Runtime:Run(HandleSingleRequest);

Surely this is engineering!! The business end of this code is right there with the familiar mgr:services() iterator. In this case, I take each of the records returned and stuff them into a table. Then, after all are returned, I turn that into a JSON string, and return it as the web result.

RunningServices

I only want to return the services that are in a “RUNNING” state, so I do that check before I actually stuff the record into the table. Well, there you have it. I’ve now gone from simply being able to make a simple system call, to being able to display the results of that call in a webpage accessible on my machine.  Is this “Software Engineering” yet?

Does it become engineering because I wrote the code that consumes a lower level framework?  Does it become engineering if I actually wrote the lower level framework?  Is it only engineering if I wrote the OS that supports that lower level framework?

As Doctor Evil would say; “Somebody throw me a frickin bone…”

I’ll try to impress my daughter with this code, then I’ll try it out on the cocktail circuit.  Perhaps someone will see this as software engineering…

But wait, there’s more!  That ‘query’ where I filtered for only the “RUNNING” services looked a bit enemic, and what about changing the state of those services?  Surely there’s room to do some engineering in there…

 


When Is Software Engineering – That’s just coding…

Over the past couple of weeks, I’ve been thinking the following; Am I a “Software Engineer”? “Am I just a hacker?”, “What is software ‘Engineering’ exactly?”.

When I was in high school, about ready to go to college, I told my math professor that I was studying “Electrical Engineering and Computer Science”. With a puzzled look, he said “what is computer science?”. This was no dullard of a man. He was actually a wealthy man who took up teaching as his way of paying back society for the public school education he had received. He was a Berkeley grad, who had worked on the Manhattan project. He was a true “Engineer”. His question was essentially along the lines of; “I know what engineers do, and we use machines to help, and sometimes you have to program those machines, but is there really a whole “science” behind it?”

Granted, this was back in 1982, so times were a bit different with respect to computing, but this stuck in my mind.

Then, recently, I was discussing the same question with a colleague at work.  The line of reasoning we were pursuing was, “How much of our time and engineering is truly “engineering”, and how much of it is simply “coding”.  Coding is that mindless thing we do when we’re banging at our keyboards, looking up documentation on obscure functions, just trying to use some library that someone has provided.  We didn’t really consider this “engineering” because it’s basically just a translation job.  I’ve spent a fair bit of time with the Windows headers of late, and I can tell you, 90% of that time I would consider to be “coding”, but not “engineering”.

Here is one example.  I want to get the list of services that are running on my local machine and project that out to the internet as a JSON structure, so I can easily display it in a web page.  There’s a core function in Windows to get things started:

 

BOOL
EnumServicesStatusExA(
  SC_HANDLE             hSCManager,
  SC_ENUM_TYPE          InfoLevel,
  DWORD                 dwServiceType,
  DWORD                 dwServiceState,
  LPBYTE                lpServices,
  DWORD                 cbBufSize,
  LPDWORD               pcbBytesNeeded,
  LPDWORD               lpServicesReturned,
  LPDWORD               lpResumeHandle,
  LPCSTR                pszGroupName
);

 

Well, that’s a mouthful. First of all you have to discover that this call actually exists, and figure out all the parameters. Luckily MSDN documentation exists, so at least you can read about it. You might even get lucky enough to find an example laying around on the internet to help show you how to use it.

Since I’m using a scripting language, my first task is to wrap this in an ffi.cdef[[]] so that I can call it easily. In order to do that, I have to put everything into ffi.cdef[[]], the structures, enums, function call prototypes, etc. After all is said and done, I can make a call that looks like this:

local InfoLevel = ffi.C.SC_ENUM_PROCESS_INFO;
local dwServiceType = dwServiceType or ffi.C.SERVICE_TYPE_ALL;
local dwServiceState = dwServiceState or ffi.C.SERVICE_STATE_ALL;
local lpServices = nil;
local cbBufSize = 0;
local pcbBytesNeeded = ffi.new("DWORD[1]");
local lpServicesReturned = ffi.new("DWORD[1]");
local lpResumeHandle = ffi.new("DWORD[1]");
local pszGroupName = nil;

local status = service_core.EnumServicesStatusExA(
  self.Handle,
  InfoLevel,
  dwServiceType,
  dwServiceState,
  lpServices,
  cbBufSize,
  pcbBytesNeeded,
  lpServicesReturned,
  lpResumeHandle,
  pszGroupName);

That’s only the beginning. I have to call it again after allocating some space to receive the results… Although there’s a lot of little bits and pieces here, is this ‘engineering’?

If 90% of our time is “coding”, I would say at least 75% of that time is spent on this type of activity. You’re trying to use some API in some library, and you’re doing a lot of scaffolding, and boilerplate work. What is the skill involved here? Well, an experienced coder will be more familiar with the various idioms involved, such as make this call to figure out how much space to allocate, then allocate, then make the call again. They’ll also be familiar with the fact that a return value of 0 == failure in this case, and that you need to call GetLastError() to figure out what could really be wrong, if anything. I might call myself an “engineering” for having gained the wisdom and years of experience necessary to perform this particular task efficiently, but I’m not sure I’d call it “software engineering”.

But, this is only the beginning of the journey on this particular task. Ultimately I need to get this information projected out to the internet. There will be iterators, queries, and other big words involved, and somewhere along the line, I might find the dividing line where software turns into engineering.


The Challenge of Writing Correct System Calls

If you are a veteran of using Windows APIs, you might be familiar with some patterns. One of them is the dual return value:

int somefunc();

Documentation: somefunc(), will return some value other than 0 upon success. On failure, it will return 0, and you can then call GetLastError() to find out which error actually occured…

In some cases, 0 means error. In some cases, 0 means success, and anything else is actually the error. In some cases they return BOOL, and some BOOLEAN (completely different!).

Another pattern is the “pass me a buffer, and a size…”

int somefunc(char *buff, int sizeofbuff)

Documentation: Pass in a buffer, and the size of the buffer. If the ‘sizebuff’ == 0, then the return value will indicate how many bytes need to be allocated in the buffer, so you can call the function again.

A slight variant is this one:

int somefunc(char *buff, int * sizeofbuff)

In this case, the return value of the function will indicate whether there was an error or not. If there was an error such as: ERROR_INSUFFICIENT_BUFFER, then the ‘sizeofbuff’ was stuffed with the actual size needed to fulfill the request.

This can be very confusing to say the least. What makes it more confusing, at least in the Windows world, is that there is no single way this is done. Windows APIs have existed since the beginning of time, so there is as much variety in the various APIs as there are programmers who have worked on them over the years.

How to bring sanity to this world? I’ll examine just one case where I use Lua to make a more sane picture. I want to get the current system time. In kernel32, there are a few date/time formatting functions, so I’ll use one of those. Here is the ffi to the one I want:

ffi.cdef[[
int
GetTimeFormatEx(
    LPCWSTR lpLocaleName,
    DWORD dwFlags,
    const SYSTEMTIME *lpTime,
    LPCWSTR lpFormat,
    LPWSTR lpTimeStr,
    int cchTime
);
]]

That’s one hefty function to get a time printed out in a nice way. There are pointers to unicode strings, pointers to structures that contain the system time, size of buffers, buffers in unicode…

In the end, I want to be able to do this: GetTimeFormat(), and have it print: “7:54 AM”

Alrighty then. Can’t be too hard…

local GetTimeFormat = function(lpFormat, dwFlags, lpTime, lpLocaleName)
  dwFlags = dwFlags or 0;
	
  --lpFormat = lpFormat or "hh':'mm':'ss tt";
  if lpFormat then
    lpFormat = k32.AnsiToUnicode16(lpFormat);
  end

  -- first call to figure out how big the string needs to be
  local buffsize = k32Lib.GetTimeFormatEx(
    lpLocaleName,
    dwFlags,
    lpTime,
    lpFormat,
    lpDataStr,
    0);

  -- buffsize should be the required size
  if buffsize < 1  then
    return false,  k32Lib.GetLastError();
  end

  local lpDataStr = ffi.new("WCHAR[?]", buffsize);
  local res = k32Lib.GetTimeFormatEx(
    lpLocaleName,
    dwFlags,
    lpTime,
    lpFormat,
    lpDataStr,
    buffsize);


  if res == 0 then
    return false, Lib.GetLastError();
  end

  -- We have a widechar, turn it into ASCII
  return k32.Unicode16ToAnsi(lpDataStr);
end

Not too bad, if a bit redundant. There are a couple of things of note, which are easy to miss if you’re not paying close attention.

First of all, I’m following the convention that any system function that succeeds should return the value, and if it fails, it should return false, and an error.

First thing to do is deal with default parameters. The dwFlags parameter is an integer, so if it has not been specified, a default value of ’0′ will be used. If you don’t do this, then a ‘nil’ will be passed to the system function, and that will surely not work.

The time value can be passed in. If it is, it will be used. If not, then the nil in this position will result in using current system time. Same goes for localeName, and lpFormat. If they are nil, then the system default values will be used, according to the function call documentation.

The next important thing is, turning the lpFormat string into a unicode string if it was specified. Lua, by default, deals in straight ANSII 7/8 bit strings, not unicode, so by default, I assume what’s been specified is a standard Lua ansii string, so I convert it to unicode.

And finally, the first function call to the system. In this first call, I want to get the size of the buffer needed, so I pass in ’0′ as the size of the buffer. The return value of the function will be the size of the buffer needed to fill in the string. Of course, if the return value is ’0′, then the ‘GetLastError()’ function must be called to figure out what was wrong. In this case, it could be that one of the parameters specified was wrong, or something else. But, bail out at any rate.

Now that I know how big the buffer needs to be (in unicode characters, not in bytes?), I allocate the appropriate buffer, and make the call again, this time passing in the specified buffer.

Last step, take the unicode string that was returned, and turn it back into an ansii string so the rest of Lua can be happy with it.

There are a couple more error conditions that could possibly be handled, like checking the types of the passed in parameters, or the size of the needed buffer might change between the two system calls, but this is a ‘good enough’ approach.

It’s 38 lines of boilerplate code to ensure the simplicity and relative correctness of a single system call. With literally hundreds of very interesting system calls in the Win32 system, you can imagine how challenging it can get to do these things right.

Of course, this is why libraries exist, because someone has actually gone through and done all the challenging work to get things right. I find that doing this work in Lua is pretty easy. The biggest challenge is reading and interpreting the documentation of the API. Sometimes it’s clear, sometimes it’s not. Once conquered though, it sure does make programming in Windows a lot easier. I suspect the same it true of any language/os binding.


Taming VTables with Aplomb

If you do enough interop work, you’ll eventually run across a VTable that you’re going to have to work with.  I have previously dealt with OpenGL, which doesn’t strictly have a vtable, but has a bunch of functions which you have to lookup in order to use.  In explored the topic in this article: HeadsUp OpenGL Extension Wrangling

Recently, I have been writing code to support TLS connections in TINN.  This ultimately involves using the sspi interfaces in Windows, which leads you to the sspi.h header file which contains the following:

 

typedef struct _SECURITY_FUNCTION_TABLE_A {
    unsigned long                       dwVersion;
    ENUMERATE_SECURITY_PACKAGES_FN_A    EnumerateSecurityPackagesA;
    QUERY_CREDENTIALS_ATTRIBUTES_FN_A   QueryCredentialsAttributesA;
    ACQUIRE_CREDENTIALS_HANDLE_FN_A     AcquireCredentialsHandleA;
    FREE_CREDENTIALS_HANDLE_FN          FreeCredentialHandle;
    void *                      Reserved2;
    INITIALIZE_SECURITY_CONTEXT_FN_A    InitializeSecurityContextA;
    ACCEPT_SECURITY_CONTEXT_FN          AcceptSecurityContext;
    COMPLETE_AUTH_TOKEN_FN              CompleteAuthToken;
    DELETE_SECURITY_CONTEXT_FN          DeleteSecurityContext;
    APPLY_CONTROL_TOKEN_FN              ApplyControlToken;
    QUERY_CONTEXT_ATTRIBUTES_FN_A       QueryContextAttributesA;
    IMPERSONATE_SECURITY_CONTEXT_FN     ImpersonateSecurityContext;
    REVERT_SECURITY_CONTEXT_FN          RevertSecurityContext;
    MAKE_SIGNATURE_FN                   MakeSignature;
    VERIFY_SIGNATURE_FN                 VerifySignature;
    FREE_CONTEXT_BUFFER_FN              FreeContextBuffer;
    QUERY_SECURITY_PACKAGE_INFO_FN_A    QuerySecurityPackageInfoA;
    void *                      Reserved3;
    void *                      Reserved4;
    EXPORT_SECURITY_CONTEXT_FN          ExportSecurityContext;
    IMPORT_SECURITY_CONTEXT_FN_A        ImportSecurityContextA;
    ADD_CREDENTIALS_FN_A                AddCredentialsA ;
    void *                      Reserved8;
    QUERY_SECURITY_CONTEXT_TOKEN_FN     QuerySecurityContextToken;
    ENCRYPT_MESSAGE_FN                  EncryptMessage;
    DECRYPT_MESSAGE_FN                  DecryptMessage;
    SET_CONTEXT_ATTRIBUTES_FN_A         SetContextAttributesA;
    SET_CREDENTIALS_ATTRIBUTES_FN_A     SetCredentialsAttributesA;
    CHANGE_PASSWORD_FN_A                ChangeAccountPasswordA;
} SecurityFunctionTableA, * PSecurityFunctionTableA;

You get at this function table by making the following call:

local sspilib = ffi.load("secur32");
local VTable = sspilib.InitSecurityInterfaceA();

And then, to execute one of the functions, you could do this:

local pcPackages = ffi.new("int[1]");
local ppPackageInfo = ffi.new("PSecPkgInfoA[1]");
local result = VTable["EnumerateSecurityPackagesA"](pcPackages, ppPackageInfo);

-- Print names of all security packages
for i=0,pcPackages[0] do
  print(ffi.string(ppPackageInfo[0][i].Name));
end

Tada!! What could be simpler…

Well, this is Lua of course, so things could be made a bit simpler.

First of all, why is there even a vtable in this case? All these functions are just in the .dll file directly aren’t they? Well, there’s a bit of trickery when it comes to security packages. It turns out, it’s best not to actually load the .dll that represents the security package into the address space of the program that’s using it, directly. By calling “IniSecurityInterface()”, the actual package is loaded into a different address space, and the vtable is then used to access the functions.

You can make multiple calls to InitSecurityInterface() to get that vtable pointer, or you could stuff it into a global variable, making it available to all modules within your program, or, you could stuff it into a bit of a table wrapping and make life much easier.

-- sspi.lua
local ffi = require("ffi");

local sspi_ffi = require("sspi_ffi");
local SecError = require ("SecError");
local sspilib = ffi.load("secur32");
local SecurityPackage = require("SecurityPackage");
local Credentials = require("CredHandle");
local schannel = require("schannel");

local SecurityInterface = {
  VTable = sspilib.InitSecurityInterfaceA();
}
setmetatable(SecurityInterface, {
  __index = function(self, key)
    return self.VTable[key]
  end,
});

return {
  schannel = schannel;
	
  SecurityInterface = SecurityInterface;
  SecurityPackage = SecurityPackage;
  Credentials = Credentials;
}

With this little bit, I can now do this in my program:

local sspi = require("sspi");
local SecurityInterface = sspi.SecurityInterface;

local pcPackages = ffi.new("int[1]");
local ppPackageInfo = ffi.new("PSecPkgInfoA[1]");

local result = SecurityInterface.EnumerateSecurityPackagesA(pcPackages, ppPackageInfo);

The SecurityInterface table takes care of loading the VTable as part of it’s construction. By doing the setmetatable, and implementing the ‘__index’ metamethod, whenever a ‘.functionname’ is asked for, as with ‘.EnumerateSecurityPackagesA’, the element within the vtable with that name will be returned. Those elements so happen to be function pointers, so they will then just be executed like regular functions!

I think that’s a pretty awesome trick. The SecurityInterface table looks like a static structure with function pointers, and you just get to call those functions directly, passing in the appropriate arguments. This looks pretty much exactly like what I would expect if I were writing this in C, but I don’t have to worry about type casts and the like.

This works in this particular case because there is a single table representing the function pointers. If you were instead doing something where there were instances of an object, and an attendant vtable, you’d have to do a little bit more work to preserve the instance data, and pass it into the individual functions. Not too hard, and I actually do this trick in my Kinect interface implementation.

At any rate, that’s a relatively easy way to tackle vtables without much work. It was actually a bit surprising to me that it worked so easily, and I’ve been able to refine a pattern that I somewhat understood before, and now truly appreciate.


Leap Motion Event Filtering

So, I’m always trying to make my code simpler. Easier to understand, easier to maintain. With the Leap, or any input device, you are faced with a continuous stream of data. In many situations, you’d like to just filter through stuff and only deal with certain types of data. In one sense, you need a simple stream based query processor.

I had the beginnings of such a thing, and I’ve now separated out the pieces. In the case of the Leap Motion, you are faced with streams of data that might look like this:

{
	"id":1237560,
	"r":[[0.444044,0.663489,-0.602169],[0.184129,-0.725287,-0.663367],[-0.876882,0.183687,-0.444227]],
	"s":762.482,
	"t":[5336.48,-24560.1,5768.29],
	"timestamp":13587071004,

	"hands":[{
		"id":4,
		"direction":[-0.0793992,0.899586,-0.427785],
		"palmNormal":[-0.16208,-0.432144,-0.886711],
		"palmPosition":[27.138,227.235,80.2504],
		"palmVelocity":[-136.716,-134.926,-359.534],
		"sphereCenter":[9.15823,202.468,9.29922],
		"sphereRadius":106.122,
		"r":[[0.989305,-0.132062,-0.0619254],[0.117032,0.97208,-0.203384],[0.0870557,0.193962,0.977139]],
		"s":1.45151,
		"t":[-18.2708,21.6366,-106.687]
	}],
	
	"pointables":[
		{
			"direction":[0.196259,0.670762,-0.715235],
			"handId":4,
			"id":7,
			"length":68.7964,
			"tipPosition":[61.3422,285.46,38.3742],
			"tipVelocity":[-184.398,-119.405,-322.679],
			"tool":false
		},
		{
			"direction":[0.0324904,0.792378,-0.609165],
			"handId":4,
			"id":3,
			"length":76.8893,
			"tipPosition":[14.7425,304.766,41.4163],
			"tipVelocity":[-229.246,-95.6285,-323.667],
			"tool":false
		}
}]
}

This data is representative of a single ‘frame’ coming off the device. As you can see, it’s hierarchical. That is, for every discreet frame, I get a grouping of hand(s) information as well as pointables and possibly gestures. This is how the Leap Motion software packages things up.

In some applications, what I really want is a stream of events, not presented hierarchically. Additionally, I want to read that stream of events and easily filter it looking for particular patterns. Basically, I’d like to be able to write a program like the following, which does nothing more than print the hand information. In particular, I’m looking for the radius of the sphere, as well as the sphere center.

package.path = package.path.."../?.lua"

local LeapInterface = require("LeapInterface");
local FrameEnumerator = require("FrameEnumerator");
local EventEnumerator = require("EventEnumerator");

local leap, err = LeapInterface();

assert(leap, "Error Loading Leap Interface: ", err);

local printHand = function(hand)
  local c = hand.sphereCenter;
  local r = hand.sphereRadius;
  local n = hand.palmNormal;
  print(string.format("HAND: %3.2f  [%3.2f %3.2f %3.2f]", r, c[1], c[3], c[3]));
end

-- Only allow the hand events to come through
local handfilter = function(event)
  return event.palmNormal ~= nil
end

local main = function()
  for event in EventEnumerator(FrameEnumerator(leap), handfilter) do
    printHand(event);
  end	
end

run(main);

Easy to digest. Looking at the main() function, there is an iterator chain. The core is the ‘FrameEnumerator(leap)’. This is the source. That is wrapped by the EventEnumerator() iterator, which consumes the frameiterator, as well as the ‘handfilter()’ function.

Looking at the FrameEnumerator:

-- FrameEnumerator.lua
local JSON = require("dkjson");
local ffi = require("ffi");

local FrameEnumerator = function(interface)
  local closure = function()
    for rawframe in interface:RawFrames() do
      local frame = JSON.decode(ffi.string(rawframe.Data, rawframe.DataLength));
      return frame;
    end
  end	

  return closure;
end

return FrameEnumerator

Not very much code at all, and borrowed from the FrameObserver object in a previous article. This enumerator has the sole purpose of turning the raw strings that arrive from the WebSocket interface into a stream of discreet table objects. It does that by doing the JSON.decode(), which takes a JSON formatted string and turns it into a Lua table object. It just hands those out until it can’t hand out any more.

The EventEnumerator is a little bit more complex, but not much. It’s sole purpose in life is to take Lua tables, which presumably have the format of these Leap Motion frames, and turn them into discreet events, basically flattening the hierarchical data structure. In addition to flattening the data structure, you can apply the filter, thus throwing away bits of the frame that are not going to be processed any further:

-- EventEnumerator.lua
local Collections = require("Collections");

local EventEnumerator = function(frameemitter, filterfunc)
  local eventqueue = Collections.Queue.new();

  local addevent = function(event)
    if filterfunc then
      if filterfunc(event) then
        eventqueue:Enqueue(event);
      end
    else
      eventqueue:Enqueue(event);
    end
  end


  local closure = function()
    -- If the event queue is empty, then grab the
    -- next frame from the emitter, and turn it into
    -- discreet events.
    while eventqueue:Len() < 1 do
      local frame = frameemitter();
      if frame == nil then
        return nil
      end

      if frame.hands ~= nil then
        for _,hand in ipairs(frame.hands) do
          addevent(hand);
        end
      end

      if frame.pointables ~= nil then
        for _,pointable in ipairs(frame.pointables) do
          addevent(pointable);
        end
      end

      if frame.gestures then
        for _,gesture in ipairs(frame.gestures) do
          addevent(gesture);
        end
      end						
    end

    return eventqueue:Dequeue();
  end	

  return closure;
end

return EventEnumerator

Stitch it all together and the first program actually works, you can get a stream of ‘hand’ events. The beauty of this system is that you can further string iterators together in this way. Of course, at the lowest level, you can easily filter to find hands, fingers, tools, gestures, and the like. You might also realize something else special about this. The ‘filter’ is just Lua code, so it can be as complex or simple as you want. In this particular case, I just wanted to check for the existance of a single field. If I thought the amount of data was too much, and I wanted to cut it in half, I could have easily kept a counter, and only returned every other event. You could go further, and not just return a true/false, but you could possibly alter the event as well in some way. But, the true power comes from composition. Rather than making a more complex single filter, I’d rather just make more filters and string them together.

Using this simple concept of Enumeration, I can imagine performing a whole bunch of simple, and even complex operations, simply by tying the proper enumerators, observers, queues and the like together.


Screen Capture for Fun and Profit

In Screen Sharing from a Browser I wrote about how relatively easy it is to display a continuous snapshot of a remote screen, and even send mouse and keyboard events back to it.  That was the essence of modern day browser based screen sharing.  Everything else is about compression for bandwidth management.

In this article, I’ll present the “server” side of the equation.  Since I’ve discovered the ‘sourcecode’ bracket in WordPress, I can even present the code with line numbers.  So, here in its entirety is the server side:

 


local ffi = require "ffi"

local WebApp = require("WebApp")

local HttpRequest = require "HttpRequest"
local HttpResponse = require "HTTPResponse"
local URL = require("url")
local StaticService = require("StaticService")

local GDI32 = require ("GDI32")
local User32 = require ("User32")
local BinaryStream = require("core.BinaryStream")
local MemoryStream = require("core.MemoryStream")
local WebSocketStream = require("WebSocketStream")
local Network = require("Network")

local utils = require("utils")
local zlib = require ("zlib")

local UIOSimulator = require("UIOSimulator")

--[[
	Application Variables
--]]
local ScreenWidth = User32.GetSystemMetrics(User32.FFI.CXSCREEN);
local ScreenHeight = User32.GetSystemMetrics(User32.FFI.CYSCREEN);

local captureWidth = ScreenWidth;
local captureHeight = ScreenHeight;

local ImageWidth = captureWidth;
local ImageHeight = captureHeight;
local ImageBitCount = 16;

local hbmScreen = GDIDIBSection(ImageWidth, ImageHeight, ImageBitCount);
local hdcScreen = GDI32.CreateDCForDefaultDisplay();

local net = Network();

--[[
	Application Functions
--]]
function captureScreen(nWidthSrc, nHeightSrc, nXOriginSrc, nYOriginSrc)
  nXOriginSrc = nXOriginSrc or 0;
  nYOriginSrc = nYOriginSrc or 0;

  -- Copy some of the screen into a
  -- bitmap that is selected into a compatible DC.
  local ROP = GDI32.FFI.SRCCOPY;

  local nXOriginDest = 0;
  local nYOriginDest = 0;
  local nWidthDest = ImageWidth;
  local nHeightDest = ImageHeight;
  local nWidthSrc = nWidthSrc;
  local nHeightSrc = nHeightSrc;

  GDI32.Lib.StretchBlt(hbmScreen.hDC.Handle,
    nXOriginDest,nYOriginDest,nWidthDest,nHeightDest,
    hdcScreen.Handle,
    nXOriginSrc,nYOriginSrc,nWidthSrc,nHeightSrc,
    ROP);

  hbmScreen.hDC:Flush();
end

-- Serve the screen up as a bitmap image (.bmp)
local getContentSize = function(width, height, bitcount, alignment)
  alignment = alignment or 4

  local rowsize = GDI32.GetAlignedByteCount(width, bitcount, alignment);
  local pixelarraysize = rowsize * math.abs(height);
  local filesize = 54+pixelarraysize;
  local pixeloffset = 54;

  return filesize;
end

local filesize = getContentSize(ImageWidth, ImageHeight, ImageBitCount);
local memstream = MemoryStream.new(filesize);
local zstream = MemoryStream.new(filesize);

local writeImage = function(dibsec, memstream)
  --print("printImage")
  local width = dibsec.Info.bmiHeader.biWidth;
  local height = dibsec.Info.bmiHeader.biHeight;
  local bitcount = dibsec.Info.bmiHeader.biBitCount;
  local rowsize = GDI32.GetAlignedByteCount(width, bitcount, 4);
  local pixelarraysize = rowsize * math.abs(height);
  local filesize = 54+pixelarraysize;
  local pixeloffset = 54;

  -- allocate a MemoryStream to fit the file size
  local streamsize = GDI32.GetAlignedByteCount(filesize, 8, 4);

  memstream:Seek(0);

  local bs = BinaryStream.new(memstream);

  -- Write File Header
  bs:WriteByte(string.byte('B'))
  bs:WriteByte(string.byte('M'))
  bs:WriteInt32(filesize);
  bs:WriteInt16(0);
  bs:WriteInt16(0);
  bs:WriteInt32(pixeloffset);

  -- Bitmap information header
  bs:WriteInt32(40);
  bs:WriteInt32(dibsec.Info.bmiHeader.biWidth);
  bs:WriteInt32(dibsec.Info.bmiHeader.biHeight);
  bs:WriteInt16(dibsec.Info.bmiHeader.biPlanes);
  bs:WriteInt16(dibsec.Info.bmiHeader.biBitCount);
  bs:WriteInt32(dibsec.Info.bmiHeader.biCompression);
  bs:WriteInt32(dibsec.Info.bmiHeader.biSizeImage);
  bs:WriteInt32(dibsec.Info.bmiHeader.biXPelsPerMeter);
  bs:WriteInt32(dibsec.Info.bmiHeader.biYPelsPerMeter);
  bs:WriteInt32(dibsec.Info.bmiHeader.biClrUsed);
  bs:WriteInt32(dibsec.Info.bmiHeader.biClrImportant);

  -- Write the actual pixel data
  memstream:WriteBytes(dibsec.Pixels, pixelarraysize, 0);
end

local getSingleShot = function(response, compressed)
  captureScreen(captureWidth, captureHeight);

  writeImage(hbmScreen, memstream);

  zstream:Seek(0);
  local compressedLen = ffi.new("int[1]", zstream.Length);
  local err = zlib.compress(zstream.Buffer,   compressedLen, memstream.Buffer, memstream:GetPosition() );

  zstream.BytesWritten = compressedLen[0];

  local contentlength = zstream.BytesWritten;
  local headers = {
    ["Content-Length"] = tostring(contentlength);
    ["Content-Type"] = "image/bmp";
    ["Content-Encoding"] = "deflate";
  }

  response:writeHead("200", headers);
  response:WritePreamble();
  return response.DataStream:WriteBytes(zstream.Buffer, zstream.BytesWritten);
end

local handleUIOCommand = function(command)

  local values = utils.parseparams(command)

  if values["action"] == "mousemove" then
    UIOSimulator.MouseMove(tonumber(values["x"]), tonumber(values["y"]))
  elseif values["action"] == "mousedown" then
    UIOSimulator.MouseDown(tonumber(values["x"]), tonumber(values["y"]))
  elseif values["action"] == "mouseup" then
    UIOSimulator.MouseUp(tonumber(values["x"]), tonumber(values["y"]))
  elseif values["action"] == "keydown" then
    UIOSimulator.KeyDown(tonumber(values["which"]))
  elseif values["action"] == "keyup" then
    UIOSimulator.KeyUp(tonumber(values["which"]))
  end
end

local startupContent = nil

local handleStartupRequest = function(request, response)
  -- read the entire contents
  if not startupContent then
    -- load the file into memory
    local fs, err = io.open("viewscreen2.htm")

    if not fs then
      response:writeHead("500")
      response:writeEnd();

      return true
    end

    local content = fs:read("*all")
    fs:close();

    -- perform the substitution of values
    -- assume content looks like this:
    -- <!--?hostip? -->:<!--?serviceport?-->
    local subs = {
      ["frameinterval"]	= 300,
      ["hostip"] 			= net:GetLocalAddress(),
      ["capturewidth"]	= captureWidth,
      ["captureheight"]	= captureHeight,
      ["imagewidth"]		= ImageWidth,
      ["imageheight"]		= ImageHeight,
      ["screenwidth"]		= ScreenWidth,
      ["screenheight"]	= ScreenHeight,
      ["serviceport"] 	= Runtime.config.port,
    }
    startupContent = string.gsub(content, "%<%?(%a+)%?%>", subs)
  end

  -- send the content back to the requester
  response:writeHead("200",{["Content-Type"]="text/html"})
  response:writeEnd(startupContent);

  return true
end

--[[
  Responding to remote user input
]]--
local handleUIOSocketData = function(ws)
  while true do
    local bytes, bytesread = ws:ReadFrame()

    if not bytes then
      print("handleUIOSocketData() - END: ", err);
      break
    end

    local command = ffi.string(bytes, bytesread);
    handleUIOCommand(command);
  end
end

local handleUIOSocket = function(request, response)
  local ws = WebSocketStream();
  ws:RespondWithServerHandshake(request, response);

  Runtime.Scheduler:Spawn(handleUIOSocketData, ws);

  return false;
end

--[[
  Primary Service Response routine
]]--
local HandleSingleRequest = function(stream, pendingqueue)
  local request, err  = HttpRequest.Parse(stream);

  if not request then
    -- dump the stream
    --print("HandleSingleRequest, Dump stream: ", err)
    return
  end

  local urlparts = URL.parse(request.Resource)
  local response = HttpResponse.Open(stream)
  local success = nil;

  if urlparts.path == "/uiosocket" then
    success, err = handleUIOSocket(request, response)
  elseif urlparts.path == "/screen.bmp" then
    success, err = getSingleShot(response, true);
  elseif urlparts.path == "/screen" then
    success, err = handleStartupRequest(request, response)
  elseif urlparts.path == "/favicon.ico" then
    success, err = StaticService.SendFile("favicon.ico", response)
  elseif urlparts.path == "/jquery.js" then
    success, err = StaticService.SendFile("jquery.js", response)
  else
    response:writeHead("404");
    success, err = response:writeEnd();
  end

  if success then
    return pendingqueue:Enqueue(stream)
  end
end

--[[
  Start running the service
--]]
local serviceport = tonumber(arg[1]) or 8080

Runtime = WebApp({port = serviceport, backlog=100})

Runtime:Run(HandleSingleRequest);

As a ‘server’ this code is responsible for handling a couple of things. First, it needs to act as a basic http server, serving up relatively static content to get things started. When the user specifies the url http://localhost/screen, the server will respond by sending back the browser code that I showed in the previous article. The function “handleStartupRequest()” performs this operation. The file ‘viewscreen2.htm’ is HTML, but it’s a bit of a template as well. You can delimit a piece to be replaced by enclosing it in a tag such as: . This tag can be replaced by any bit of code that you choose. In this case, I’m doing replacements for the size of the image, the size of the screen, the refreshinterval, and the hostid and port. This last is most important because without it, you won’t be able to setup the websocket.

The other parts are fairly straight forward. Of particular note is the ‘captureScreen()’ function. In Windows, since the dawn of man, there has been GDI for graphics. Good ol’ GDI still has the ability to capture the screen, or a single window, or a portion of the screen. this still works in Windows 8 as well. So, capturing the screen is nothing more that drawing into a DIBSection, and that’s that. Just one line of code.

The magic happens after that. Rather than handing the raw image back to the client, I want to send it out as a compressed BMP image. I could choose PNG, or JPG, or any other format browsers are capable of handling, but BMP is the absolute easiest to deal with, even if it is the most bulky. I figure that since I’m using zlib to deflate it before sending it out, that will be somewhat helpful, and it turns out this works just fine.

The rest of the machinery there is just to deal with being an http server. A lot is hidden behind the ‘WebApp’ and the ‘WebSocket’ classes. Those are good for another discussion.

So, all in, this is about 300 lines of code. Not too bad for a rudimentary screen sharing service. Of course, there’s a supporting cast that runs into the thousands of lines of code, but I’m assuming this as a given since frameworks such as Node and various others exist.

I could explain each and every line of code here, but I think it’s small enough and easy enough to read that won’t be necessary. I will point out that there’s not much difference between sending single snapshots one at a time vs having an open stream and presenting the screen as h.264 or WebM. For that scenario, you just need a library that can capture snapshots of the screen and turn them into the properly encoded video stream. Since you have the WebSocket, it could easily be put to use for that purpose, rather than just receiving the mouse and keyboard events.

Food for thought.


Follow

Get every new post delivered to your Inbox.

Join 31 other followers