schedlua – async io

And so, finally we come to the point. Thus far, we looked at the simple scheduler, the core signaling, and the predicate and alarm functions built atop that. That’s all great stuff for fairly straight forward apps that need some amount of concurrency. The best part though is when you can do concurrent networking stuff.

Here’s the setup; I want to issue 20 concurrent GET requests to 20 different sites, and get results back. I want the program to halt once all the tasks have been completed.

--test_linux_net.lua
package.path = package.path..";../?.lua"

local ffi = require("ffi")

local Kernel = require("kernel"){exportglobal = true}
local predicate = require("predicate")(Kernel, true)
local AsyncSocket = require("AsyncSocket")

local sites = require("sites");

-- list of tasks
local taskList = {}


local function httpRequest(s, sitename)
	local request = string.format("GET / HTTP/1.1\r\nUser-Agent: schedlua (linux-gnu)\r\nAccept: */*\r\nHost: %s\r\nConnection: close\r\n\r\n", sitename);
	return s:write(request, #request);
end

local function httpResponse(s)
	local BUFSIZ = 512;
	local buffer = ffi.new("char[512+1]");
	local bytesRead = 0
	local err = nil;
	local cumulative = 0

	repeat
		bytesRead, err = s:read(buffer, BUFSIZ);

		if bytesRead then
			cumulative = cumulative + bytesRead;
		else
			print("read, error: ", err)
			break;
		end
	until bytesRead < 1

	return cumulative;
end


local function siteGET(sitename)
	print("siteGET, BEGIN: ", sitename);

	local s = AsyncSocket();

	local success, err = s:connect(sitename, 80);  

	if success then
		httpRequest(s, sitename);
		httpResponse(s);
	else
		print("connect, error: ", err, sitename);
	end

	s:close();

	print("siteGET, FINISHED: ", sitename)
end


local function allProbesFinished()
	for idx, t in ipairs(taskList) do
		if t:getStatus() ~= "dead" then
			return false;
		end
	end

	return true;
end

local function main()
	for count=1,20 do
		table.insert(taskList, Kernel:spawn(siteGET, sites[math.random(#sites)]))
		Kernel:yield();
	end

	when(allProbesFinished, halt);
end

run(main)

Step by step. The httpRequest() function takes a socket, and does the most bare mimimal HTTP GET request, assuming the socket is already connected to the site.

Similarly, the httpResponse() function gets a response back from the server, and reads as much as it can until the socket is closed (because the Connection: close header was sent).

That’s about the most basic of HTTP request/response pairs you can have, ignoring doing any parsing of the returned data.

Alright, so let’s wrap those two up into a function called siteGET(). siteGET(sitename) takes the name of a site, creates a socket, connects it to the site, and then issues the httpRequest(), and then the httpResponse(). Very simple. What I like about this is that the httpRequest(); httpResponse() sequence is executed in serial as far as I’m concerned. I don’t have to be worried about the httpResponse() being issued before the request completes. Furthermore, if I didn’t use a spawn(), I could simply execute the code directly and be none the wiser.

I want to execute these siteGET()s concurrently though, so within main(), I start up 20 of these tasks, and let them go. Then comes the waiting part:

local function allProbesFinished()
	for idx, t in ipairs(taskList) do
		if t:getStatus() ~= "dead" then
			return false;
		end
	end

	return true;
end

	when(allProbesFinished, halt);

Going back to our knowledge of predicates, we know that the ‘when’ function takes a predicate (function that returns true/false), and will execute the second function when the predicate returns true.

OK, so we just need to come up with a predicate which tells us that all the tasks have completed. Easy enough as a list of the tasks is generated when they are spawned. So, we just go through that list and see if any of them are still running. If there is a single one that is still running, the predicate will return false, and ‘halt()’ will not be called. As soon as the last task finished, the predicate will return true, and the halt() function will be called.

Of course, most things in schedlua are convenient compositions of deeper things (with signals being at the core).

Instead of using the ‘when’ function, you could write the code more directly like this:

	while true
		if allProbesFinished() then
			halt();
			break;
		end
		yield();
	end

That doesn’t quite look as nice as just using the when() function I think. Also, you’re sitting in the main() function, which is no big deal as there’s nothing else trying to execute after this, but it just doesn’t seem as clean. Furthermore, the ‘when’ function might have some magic in its implementation, such as a better understanding of the state of tasks, or special knowledge of the scheduler, or who knows what. At any rate, either way essentially implements a barrier, and the technique can be used anywhere you want to perform an action after some set of tasks has completed. The allProbesFinished() function can be generalized to wait on any list of tasks, maybe call it “waitForTasks()” or some such thing.

At any rate, that completes the primitives that are baked into the core schedlua package. Everything from signals, to predicates, alarms, and finally async io. Of course this is Linux, so async io works with any file descriptor, not just network sockets, so file management or device communications in general can be thrown into the mix.

Now that the basics work, it’s a matter of cleaning up, writing more test cases, fixing bugs, reorganizing, and optimizing resource usage a bit. In general though, the constructs are there, and it’s ready to be used for real applications.


schedlua – the kernel

Previously, I talked about the scheduler within the new scedlua.  A scheduler is a fairly simple thing, it just decides which of the many ready tasks will run next.  The default scheduler follows a fairly simple FIFO strategy, so there are no priorities, favorites, or the like.  Of course this wouldn’t be any fun if you were stuck with just one scheduler, so naturally enough this is an easily pluggable part of the system.  But what does this plugging?

In steps the Kernel.  In general, the schedlua project is about creating a set of tools by which highly performant services can be constructed.  schedlua largely supports the concept that a single processor can be highly leveraged if programmed correctly.  It does not try to gain performance through the usage of multiple threads, but rather it just takes on the task of suspending various tasks which are blocked on IO or otherwise idle, and letting tasks which are ready to run do their thing.  The concurrency model is cooperative, not preemptive, so if any one task misbehaves, the process can become stuck.

So, let’s take a look at this code:

--kernel.lua
-- kernel is a singleton, so return
-- single instance if we've already been
-- through this code
print("== KERNEL INCLUDED ==")

local Scheduler = require("scheduler")
local Task = require("task")
local Queue = require("queue")
local Functor = require("functor")

local Kernel = {
	ContinueRunning = true;
	TaskID = 0;
	Scheduler = Scheduler();
	TasksSuspendedForSignal = {};
}

setmetatable(Kernel, {
    __call = function(self, params)
    	params = params or {}
    	params.Scheduler = params.Scheduler or self.Scheduler
    	
    	if params.exportglobal then
    		self:globalize();
    	end

    	self.Scheduler = params.Scheduler;

    	return self;
    end,
})

function Kernel.getNewTaskID(self)
	self.TaskID = self.TaskID + 1;
	return self.TaskID;
end

function Kernel.getCurrentTaskID(self)
	return self:getCurrentTask().TaskID;
end

function Kernel.getCurrentTask(self)
	return self.Scheduler:getCurrentTask();
end

function Kernel.spawn(self, func, ...)
	local task = Task(func, ...)
	task.TaskID = self:getNewTaskID();
	self.Scheduler:scheduleTask(task, {...});
	
	return task;
end

function Kernel.suspend(self, ...)
	self.Scheduler:suspendCurrentFiber();
	return self:yield(...)
end

function Kernel.yield(self, ...)
	return self.Scheduler:yield();
end


function Kernel.signalOne(self, eventName, ...)
	if not self.TasksSuspendedForSignal[eventName] then
		return false, "event not registered", eventName
	end

	local nTasks = #self.TasksSuspendedForSignal[eventName]
	if nTasks < 1 then
		return false, "no tasks waiting for event"
	end

	local suspended = self.TasksSuspendedForSignal[eventName][1];

	self.Scheduler:scheduleTask(suspended,{...});
	table.remove(self.TasksSuspendedForSignal[eventName], 1);

	return true;
end

function Kernel.signalAll(self, eventName, ...)
	if not self.TasksSuspendedForSignal[eventName] then
		return false, "event not registered"
	end

	local nTasks = #self.TasksSuspendedForSignal[eventName]
	if nTasks < 1 then
		return false, "no tasks waiting for event"
	end

	for i=1,nTasks do
		self.Scheduler:scheduleTask(self.TasksSuspendedForSignal[eventName][1],{...});
		table.remove(self.TasksSuspendedForSignal[eventName], 1);
	end

	return true;
end

function Kernel.waitForSignal(self, eventName)
	local currentFiber = self.Scheduler:getCurrentTask();

	if currentFiber == nil then
		return false, "not currently in a running task"
	end

	if not self.TasksSuspendedForSignal[eventName] then
		self.TasksSuspendedForSignal[eventName] = {}
	end

	table.insert(self.TasksSuspendedForSignal[eventName], currentFiber);

	return self:suspend()
end

function Kernel.onSignal(self, func, eventName)
	local function closure()
		self:waitForSignal(eventName)
		func();
	end

	return self:spawn(closure)
end



function Kernel.run(self, func, ...)

	if func ~= nil then
		self:spawn(func, ...)
	end

	while (self.ContinueRunning) do
		self.Scheduler:step();		
	end
end

function Kernel.halt(self)
	self.ContinueRunning = false;
end

function Kernel.globalize()
	halt = Functor(Kernel.halt, Kernel);
    onSignal = Functor(Kernel.onSignal, Kernel);

    run = Functor(Kernel.run, Kernel);

    signalAll = Functor(Kernel.signalAll, Kernel);
    signalOne = Functor(Kernel.signalOne, Kernel);

    spawn = Functor(Kernel.spawn, Kernel);
    suspend = Functor(Kernel.suspend, Kernel);

    waitForSignal = Functor(Kernel.waitForSignal, Kernel);

    yield = Functor(Kernel.yield, Kernel);
end

return Kernel;

 

From the top, you can see the Kernel requires the scheduler, task and functor. The scheduler has already been explained. The Kernel serves a couple of purposes. First of all, it manages the scheduler. The ‘run’ function at the bottom is the ‘loop’ of the application. It will run until ‘halt’ is called. Each time through the loop it’s telling the scheduler to take a step.

Also at the bottom, you can see usage of the Functor. A functor is just a simple convenience wrapper that helps you call a function on a table at a later point. Those functors are used to make the keywords global.

There are two primary things the kernel does. One is to spawn new tasks, the other is to provide a central point for signal handling.

First, let’s look at the ‘spawn’.

function Kernel.spawn(self, func, ...)
	local task = Task(func, ...)
	task.TaskID = self:getNewTaskID();
	self.Scheduler:scheduleTask(task, {...});
	
	return task;
end

This is actually the coolest part of the system in terms of how the programming model is expressed. Here’s an example of it in use.

local Kernel = require("kernel"){exportglobal = true}


local function numbers(ending)
	local idx = 0;
	local function closure()
		idx = idx + 1;
		if idx > ending then
			return nil;
		end
		return idx;
	end
	
	return closure;
end

local function task1()
	print("first task, first line")
	yield();
	print("first task, second line")
end

local function task2()
	print("second task, only line")
end

local function counter(name, nCount)
	for num in numbers(nCount) do
		print(name, num);
		yield();
	end
	halt();
end

local function main()
	local t0 = spawn(counter, "counter1", 5)
	local t1 = spawn(task1)
	local t2 = spawn(task2)
	local t3 = spawn(counter, "counter2", 7)
end

run(main)

Basically, any time you want something to happen concurrently, you just say ‘spawn(func, params) and that’s that.

What happens is a Task object is created which holds onto the function object as well as the initial set of parameters. This task is then sent to the scheduler to be run. From then on out you can forget about it. Of course, you’re handed the task when you say ‘spawn’, so you do have a chance of suspending and killing it off in the future if you like. Similarly, you can wait for a task to complete as well.

So, that’s spawning.

The other major feature in the kernel is signal handling.

function Kernel.waitForSignal(self, eventName)
	local currentFiber = self.Scheduler:getCurrentTask();

	if currentFiber == nil then
		return false, "not currently in a running task"
	end

	if not self.TasksSuspendedForSignal[eventName] then
		self.TasksSuspendedForSignal[eventName] = {}
	end

	table.insert(self.TasksSuspendedForSignal[eventName], currentFiber);

	return self:suspend()
end

This is probably THE most important routine in the whole system. Basically, take the current task, and put it onto the suspension list, connected with the signal name (signals are just a string value). Later on, when you want the task to resume, you would signal it using either signalOne(), or signalAll().

A little bit of code demonstrating this:

local Kernel = require("kernel"){exportglobal = true}
local Functor = require("functor")

local function numbers(ending)
	local idx = 0;
	local function closure()
		idx = idx + 1;
		if idx > ending then
			return nil;
		end
		return idx;
	end
	
	return closure;
end

local function waitingOnCount(name, ending)
	local eventName = name..tostring(ending)
	waitForSignal(eventName)

	print("REACHED COUNT: ", ending)
end

local function onCountFinished(name)
	print("Counter Finished: ", name)
end

local function counter(name, nCount)
	for num in numbers(nCount) do
		print(num)
		local eventName = name..tostring(num);
		--print(eventName)
		signalOne(eventName);

		yield();
	end

	signalAll(name..'-finished')
end

local function main()
	local t1 = spawn(counter, "counter", 50)
	local t2 = spawn(waitingOnCount, "counter", 20)
	local t3 = spawn(function() print("LAMDA"); waitForSignal("counter15") print("reached 15!!") end)
	
	-- test signalAll().  All three of these should trigger when
	-- counter finishes
	local t13 = onSignal(Functor(onCountFinished, "counter-1"), "counter-finished")
	local t14 = onSignal(Functor(onCountFinished, "counter-2"), "counter-finished")
	local t15 = onSignal(Functor(onCountFinished, "counter-3"), "counter-finished")
end

run(main)

Here, there is a counter(), which is just counting, and firing off a signal for each number. The various waitingOnCount(), and LAMBDA routines are going to respond to the appropriate signals.

Finally, the t13, t14, and t15 tasks are waiting for the “counter-finished” signal, and they will all fire off and print their little message.

Of course, at this point you could have something that would call ‘halt()’ so you don’t have to press Ctl-C to stop the process, but you get the idea.

And that’s pretty much it for the kernel. Absent from here are the async io, predicates, alarms and the like. They are available in schedlua, but they’re not a part of the kernel. Instead of being part of the kernel proper, these are essentially modules. They utilize the signal and spawn features built into the kernel, and they’re free to do their own thing.

I’ll get into the details of alarms and predicates next time around to demonstrate the concept of easy add-on modules.


Schedulers revisited

A ran quite a long series on the buildup of the TINN scheduler, including this gem on the primitives included in the application model: Parallel Conversations

The scheduler developed in that whole series served me very well, taking on timing, predicates, signals, and async IO.  It was all based on Windows primitives, and although fairly modular, was an evolved codebase, which had some lumps and inefficiencies.  So now, two years on, I decided to reimagine that scheduler.  Primary reasons?  I had some time on my hands, and I wanted a scheduler that works equally well on Linux as on Windows.

And so, the schedlua project was born.

What were the design criteria this time around?

  • Reduce Size and complexity
  • Increase composability
  • Do not force features where they’re not desired

In the previous incarnation, I went back and forth between having an “Application” object, and the “Scheduler”.  Things were primarily driven by the Application, and it had a central role in the instance of any running app.  I still favor having this central role, but I changed the meaning and composition of that central player this time around.  First though, there is the “Scheduler”.


local ffi = require("ffi");

local Queue = require("queue")
local Task = require("task");


--[[
	The Scheduler supports a collaborative processing
	environment.  As such, it manages multiple tasks which
	are represented by Lua coroutines.

	The scheduler works by being the arbitrator of which routine is running
	in the application.  When work is to be done, it is encapsulated in 
	the form of a task object.  The scheduler relies upon that task object
	to call a form of 'yield' at some point, at which time it will receive
	the main thread of execution, and will pick the next task out of the ready
	list to run.
--]]
local Scheduler = {}
setmetatable(Scheduler, {
	__call = function(self, ...)
		return self:create(...)
	end,
})
local Scheduler_mt = {
	__index = Scheduler,
}

function Scheduler.init(self, ...)
	local obj = {
		TasksReadyToRun = Queue();
	}
	setmetatable(obj, Scheduler_mt)
	
	return obj;
end

function Scheduler.create(self, ...)
	return self:init(...)
end


function Scheduler.tasksPending(self)
	return self.TasksReadyToRun:length();
end


-- put a task on the ready list
-- the 'task' should be something that can be executed,
-- whether it's a function, functor, or something that has a '__call'
-- metamethod implemented.
-- The 'params' is a table of parameters which will be passed to the function
-- when it's ready to run.
function Scheduler.scheduleTask(self, task, params)
	--print("Scheduler.scheduleTask: ", task, params)
	params = params or {}
	
	if not task then
		return false, "no task specified"
	end

	task:setParams(params);
	self.TasksReadyToRun:enqueue(task);	
	task.state = "readytorun"

	return task;
end

function Scheduler.removeFiber(self, fiber)
	--print("REMOVING DEAD FIBER: ", fiber);
	return true;
end

function Scheduler.inMainFiber(self)
	return coroutine.running() == nil; 
end

function Scheduler.getCurrentFiber(self)
	return self.CurrentFiber;
end

function Scheduler.suspendCurrentFiber(self, ...)
	self.CurrentFiber.state = "suspended"
end

function Scheduler.step(self)
	-- Now check the regular fibers
	local task = self.TasksReadyToRun:dequeue()

	-- If no fiber in ready queue, then just return
	if task == nil then
		print("Scheduler.step: NO TASK")
		return true
	end

	if task:getStatus() == "dead" then
		self:removeFiber(task)

		return true;
	end

	-- If the task we pulled off the active list is 
	-- not dead, then perhaps it is suspended.  If that's true
	-- then it needs to drop out of the active list.
	-- We assume that some other part of the system is responsible for
	-- keeping track of the task, and rescheduling it when appropriate.
	if task.state == "suspended" then
		--print("suspended task wants to run")
		return true;
	end

	-- If we have gotten this far, then the task truly is ready to 
	-- run, and it should be set as the currentFiber, and its coroutine
	-- is resumed.
	self.CurrentFiber = task;
	local results = {task:resume()};

	-- once we get results back from the resume, one
	-- of two things could have happened.
	-- 1) The routine exited normally
	-- 2) The routine yielded
	--
	-- In both cases, we parse out the results of the resume 
	-- into a success indicator and the rest of the values returned 
	-- from the routine
	--local pcallsuccess = results[1];
	--table.remove(results,1);

	local success = results[1];
	table.remove(results,1);

--print("PCALL, RESUME: ", pcallsuccess, success)

	-- no task is currently executing
	self.CurrentFiber = nil;


	if not success then
		print("RESUME ERROR")
		print(unpack(results));
	end

	-- Again, check to see if the task is dead after
	-- the most recent resume.  If it's dead, then don't
	-- bother putting it back into the readytorun queue
	-- just remove the task from the list of tasks
	if task:getStatus() == "dead" then
		self:removeFiber(task)

		return true;
	end

	-- The only way the task will get back onto the readylist
	-- is if it's state is 'readytorun', otherwise, it will
	-- stay out of the readytorun list.
	if task.state == "readytorun" then
		self:scheduleTask(task, results);
	end
end

function Scheduler.yield(self, ...)
	return coroutine.yield(...);
end


return Scheduler

I think the code is pretty straight forward to follow, except for that ‘step()’ function.  So, what is a scheduler anyway?  Well, we want to have some level of ‘concurrency’ in our applications.  That is, I would love to be able to have the appearance of running several tasks at once on my machine, without much work.  In modern day computing, this is fairly well covered by the OS as applications.  We can run several apps at the same time on our computers, and the kernel takes care of the details of how that gets done.  Modern kernels simply split  time between tasks, giving some amount of time to each task in turn, so they feel like they are running in parallel, or concurrently.  It’s a nice illusion because the machines are fast enough, and there are enough times when a task will simply be waiting around anyway that the amount of concurrency achieved can be very high.

So, what if you want some of that kernel goodness within the context of your app directly?  You want to handle mouse and keyboard, while doing some drawing and networking at the same time.  This is typically where ‘threads’ step in, and down the rabbit hole we go, getting involved in threading models, concurrency primitives and the like.  Well, in Lua, there’s a concurrency primitive built in, in the form of coroutines.  Coroutines are a fairly old style of concurrency whereby the application gets control of the CPU, and only gives it up willingly by calling a ‘yield()’ function at some point.  OK, great.  So, you start a routine, and at some point you call yield, and some other routine gets a chance to run.  But, which routine?  Aha, so this is where the ‘scheduler’ comes in.

The scheduler might be considered the ‘home’ routine.  That is, once a routine gives up a slide of time by calling ‘yield’, the scheduler is the place they will return to.  At that point, the scheduler can decide which routines is to be run next.  It can do this because it has a list of ready to run tasks (those placed on the list using ‘scheduleTask’).  In this scheduler, it simply takes the next one off the list, and tells it to resume.  That is the essence of the ‘step()’ function.

Great!  And what does it look like in action?

--test_scheduler.lua
package.path = package.path..";../?.lua"

Scheduler = require("scheduler")()
Task = require("task")
local taskID = 0;

local function getNewTaskID()
	taskID = taskID + 1;
	return taskID;
end

local function spawn(scheduler, func, ...)
	local task = Task(func, ...)
	task.TaskID = getNewTaskID();
	scheduler:scheduleTask(task, {...});
	
	return task;
end


local function task1()
	print("first task, first line")
	Scheduler:yield();
	print("first task, second line")
end

local function task2()
	print("second task, only line")
end

local function main()
	local t1 = spawn(Scheduler, task1)
	local t2 = spawn(Scheduler, task2)

	while (true) do
		--print("STATUS: ", t1:getStatus(), t2:getStatus())
		if t1:getStatus() == "dead" and t2:getStatus() == "dead" then
			break;
		end
		Scheduler:step()
	end
end

main()

That’s the scheduler in isolation.  In this test case, you can see that task creation, and the ‘spawn’ primitive aren’t part of the scheduler proper.  They don’t need to be.  The scheduler should only be concerned with deciding which task is going to be next.  In this way, you can imagine creating all sorts of different kinds of schedulers.  You cold introduce the concept of priorities, or aging, or realtime, or whatever.  The same general idea would apply, the scheduler just needs to decide which task is going to run next.  Also of not here, the ‘main()’ function operates as the “even loop” if you will.  This is the bit of code that drives the scheduler between steps.  Having this separate from the scheduler proper will become advantageous down the road when we want to compose the scheduler will a larger application framework.

So, this is the first step in revitalizing the concurrency core I built into the TINN project.  This time around, more composable, more feature rich, more easily cross platform, more bang for the buck.  Next time around, I’ll examine the “kernel”.


TINN Reboot

I always dread writing posts that start with “it’s been a long time since…”, but here it is.

It’s been a long time since I did anything with TINN.  I didn’t actually abandon it, I just put it on the back burner as I was writing a bunch of code in C/C++ over the past year.  I did do quite a lot of experimental stuff in TINN, adding new interfaces, trying out new classes, creating a better coroutine experience.

The thing with software is, a lot of testing is required to ensure things actually work as expected and fail gracefully when they don’t.  Some things I took from the ‘experimental’ category are:

fun.lua – A library of functional routines specifically built for LuaJIT and it’s great handling of tail recursion.

msiterators.lua – Some handy iterators that split out some very MS specific string types

Now that msiterators is part of the core, it makes it much easier to do things like query the system registry and get the list of devices, or batteries, or whatever, in a simple table form.  That opens up some of the other little experiments, like enumerating batteries, monitors, and whatnot, which I can add in later.

There are not earth shattering, and don’t represent a year’s worth of waiting, but soon enough I’ll create a new package with new goodness in it.  This begs the question, what is TINN useful for?  I originally created it for the purpose of doing network programming, like you could do with node.  Then it turned into a way of doing Windows programming in general.  Since TINN provides scripted access to almost all the interesting low level APIs that are in Windows, it’s very handy for trying out how an API works, and whether it is good for a particular need.

In addition to just giving ready access to low level Windows APIs, it serves as a form of documentation as well.  When I look at a Windows API, it’s not obvious how to handle all the parameters.  Which ones do I allocate, which ones come from the system, which special function do I call when I’m done.  Since I read the docs when I create the interface, the wrapper code encapsulates that reading of the documentation, and thus acts as an encapsulated source of knowledge that’s sitting right there with the code.  Quite handy.

At any rate, TINN is not dead, long live TINN!


When Scripts Roamed the Earth

Way way back in the day, I played with Tcl.  What a nice little compact thing that was.  Then along came this thing called Python.  Kind of funky with it’s indentation thing, but wow, what it has become!  I was cutting my CS chops when ‘p-code’ meant something.  Then along came this Javascript thing.  For the longest time, I think it kind of puttered along, until BAM!  The internet exploded, and more recently node.js happened.  Now suddenly it’s becoming a ‘de-facto’ go to language of the day.

But, another thing has happened recently as well.  With the V8 javascript compiler comes JIT compilation.  Then along comes Lua, and Go, and Python again, and suddenly ‘script’ is becoming as fast, if not faster, than statically compiled ‘C’, which has been the mainstay of computer programming for a few decades now.

And now, two other things are happening.  LuaJIT has this thing called dynasm.  This Dynamic Assembler quickly turns what looks like embedded assembly instructions into actual machine instructions at ‘runtime’.  This is kind of different than what nasm does.  Nasm is an assembler proper.  It takes assembly instructions, and turns that into machine specific code, as part of a typical ‘compile/link/run’ chain.  Dynasm just generates a function in memory, and then you can call it directly, while your program is running.

This concept of dynamic machine code generation seems to be a spreading trend, and all JIT runtimes do it.  I just came across another tool that helps you embed such a JIT thing into your C++ code.  Asmjit is the tool that does a thing similar to what luajit’s dynasm does.

These of course are not unique, and I’m sure there are countless projects that can be pointed to that do something somewhat similar.  And that’s kind of the point.  This dynamic code generation and execution thing is rapidly leaving the p-code phase, and entering the direct machine execution phase, which is making dynamic languages all the more usable and performant.

So, what’s next?

Well, that got me to thinking.  If really fast code can be delivered and executed at runtime, what kinds of problems can be solved?  Remote code execution is nothing new.  There are always challenges with marshaling, versioning, different architectures, security, and the like.  Some of the problems that exist are due to the typically static nature of the code that is being executed on both ends.  Might things change if both ends are more dynamic?

Take the case of TLS/SSL.  There’s all these certificate authorities, which is inherently fragile and error prone.  Then there’s the negotiation of the highest common denominator parameters for the exchange of data.  Well, what if this whole mess were given over to a dynamic piece?  Rather than negotiating the specifics of the encryption mechanism, the two parties could simply negotiate and possibly transfer a chunk of code to be executed.

How can that work?  The client connects to the server, using some mechanism to identify itself (possibly anonymous, possibly this is handled higher up in the stack).  The server then sends a bit of code that the client will then use to pass through every chunk of data that’s headed to the server.  Since the client has dynasm embedded, it can compile that code, and continue operating.  Whomever wrote the client doesn’t know anything about the particulars of communicating with the server.  They didn’t mess up the cryptography, they didn’t have to keep up to date with the latest heart bleed.  The server can change and customize the exchange however they see fit.

The worst case scenario is that the parties cannot agree on anything interesting, so they fall back to using plain old TLS.  This seems useful to me.  A lot of code, that has a high probability of being done wrong, is eliminated from the equation.  If certificate authorities are desired, then they can be used.  If something more interesting is desired, it can easily be encoded and shared.  If thing need to change instantly, it’s just a change on the server side, and move along.

Of course each side needs to provide an appropriate sandbox so the code doesn’t just execute something arbitrary.  Each side also needs to provide some primitives, like ability to grab certificates if needed, and access to crypto libraries if needed.

If the server wants to use a non-centralized form of identity, it can just code that up, and be on its way.  The potential is high for extremely dynamic communications, as well as mischief.

And what else?  Well, I guess just about anything that can benefit from being dynamic.  Learning new gestures, voice recognition, image recognition, learning to walk, learning new algorithms for searching, sorting, filtering, etc.  Just about anything.

Following this line of reasoning, I’d expect my various machines to start talking with each other using protocols of their own making.  Changing dynamically to fit whatever situation they encounter.  The communications algorithms go meta.  We need algorithms to create algorithms.  Threats and intrusions are perceived, and dealt with dynamically.  No waiting for a rev of the OS, no centrally distributed patches, no worrying about incompatible versions of this that and the other thing.  The machines, and their communications, become individual, dynamic, and non-static.

This could be interesting.

 


A Dictionary with a count

The most interesting type in Lua is the table object. The table serves dual purposes. It can act as an array, as well as a hash table. As a dictionary, you can do simple things like:

local tbl = {}
tbl["alpha"] = "alpha-value"

The problem with this construction is that you can not easily find out the number of items that are within the dictionary. The easiest way is to enumerate the whole thing, and keep a count:

local count = 0;
for k,v in pairs(tbl) do
  count = count + 1;
end

And so, what I really want is something that acts as a simple dictionary, but gives me the ability to easily find the count. I’ve created the “Bag”, which is simply a wrapper on a table, but it gives you a count.

local Bag = {}
setmetatable(Bag, {
	__call = function(self, ...)
		return self:_new(...);
	end,
})

local Bag_mt = {
	__index = function(self, key)
		--print("__index: ", key)
		return self.tbl[key]
	end,

	__newindex = function(self, key, value)		
		--print("__newindex: ", key, value)
		if value == nil then
			self.__Count = self.__Count - 1;
		else
			self.__Count = self.__Count + 1;
		end

		--rawset(self, key, value)
		self.tbl[key] = value;
	end,

	__len = function(self)
--		print("__len: ", self.__Count)
		return self.__Count;
	end,
}

function Bag._new(self, obj)
	local obj = {
		tbl = {},
		__Count = 0,
	}

	setmetatable(obj, Bag_mt);

	return obj;
end

Each Bag instance has a ‘tbl’ and a ‘__Count’. Within the metastable, both the ‘__newindex’ and the ‘__index’ are implemented. The ‘__newindex’ is used when you make an assignment. So, when I do:

tbl["alpha"] = "alpha-value"

The ‘__newindex’ is called. This will in turn simply put the value into the self.tbl table. While it does that, it also increments the count of values that are stored in the Bag. When you assign “nil” to an entry, this will remove it from the underlying table, and thus decrement the count.

Then there’s the ‘__len’ metamethod. This will normally return the length of an item. In the case of regular tables, as long as they are being used as arrays (contiguous indices), then it will return the number of items. In the case of a regular table being used as a dictionary, it will return 0. So, implementing it here gives the Bag the ability to use the convenient ‘#’ operator.

local Collections = require("Collections")

local names = {
	alpha = "alpha-value";
	beta = "beta-value";
	gamma = "gamma-value";
}

local bg = Collections.Bag();

for k,v in pairs(names) do
	print("adding: ", k, v)
	bg[k] = v;
end;

print("Count after add: ", #bg)


bg["gamma"] = nil;

print("Count, after 1 remove: ", #bg)

print("beta: ", bg["beta"])

Lua 5.1 will not utilize the ‘__len’ metamethod, only 5.2 will do that. So, with LuaJIT, which is half way between the two, you need to make sure you compile with the -DLUAJIT_ENABLE_LUA52COMPAT flag, or you won’t get the expected behavior.

This is great, and achieves what I wanted. But, I still want it to act as a dictionary in terms of being iterable, so I should implement a __pairs metamethod, but that can wait.

So, there you have it. A quick and dirty improvement upon tables which saves me the headache and wasted CPU cycles of counting my dictionary entries.

There are many situations where I am using a simple dictionary, but I want to quickly find the count of items in the dictionary


Iterating Over Oddities – of strings, arrays, and counting from 0

Could there possibly be anything more said about iterators and strings? Well, yah, actually tons. Last time around, I showed a simple iterator over a “string”. The focus was primarily on satisfying the job of parsing out null terminated strings from within a ‘null terminated’ string.

As I was doing that, I was also speculating as to whether I could use the exact same iterator to parse fixed sized records from an array of records. I actually wrote a giant iterator that does just that. Then I got to thinking, ‘this isn’t the way to do it’. Iterators, you see, can be broken down into constituent parts. The Lua documentation itself has quite a lot to say about iterators of various forms. One of the most promising bits of documentation is on 7.3 – Stateless Iterators.  In order to pull this off, you split up the “iterator” into a few parts.

 

generator – The function that gets called every time you need a new value.  The parameters that are passed to it are the “fixedpart”, and the “control”.

invariant state – this is the part of the iterator that doesn’t change much.  For example, the source string that you might be iterating over.

index – this is the part that changes every time generator is called.

so, when you do the following:

local values = {'a', 'b', 'c'}
for _idx, value in ipairs(values) do
  print(value)
end

The ‘generator’ is some function returned from the ‘ipairs()’ function. It will be called again and again, until it returns nil.

The ‘invariant state’ is the ‘values’ array. This will be fed to the generator each time a new value is needed.

And last, the _idx, is the index value. It will also be fed to the generator, along with the ‘invariant state’.

So, how about applying this to my previous multi string iterator?

local ffi = require("ffi")
local fun = require("fun")

local floor = math.floor;


-- a nil generator.  
-- good for cases when there's no data
local function nil_gen(param, state)
    return nil
end

local function delim_gen(param, idx)
	local len = 0;

	while ((idx+len) < param.nelems) do
		--print("wchar: ", string.char(ffi.cast(param.basetypeptr, param.data)[idx + len]))
		if ffi.cast(param.basetypeptr, param.data)[idx + len] ~= param.separator then
			len = len + 1;
		else
			break
		end
	end
	
	if len == 0 then
		return nil;
	end

	return idx + len + 1, ffi.cast(param.basetypeptr, param.data)+idx, len
end


local function array_gen(param, idx)
	if idx >= param.nelems then
		return nil;
	end

	return idx+1, ffi.cast(param.basetypeptr, param.data)+idx, 1
end


local function striter(params)
	if not params then
		return nil_gen, params, nil
	end

	if not params.data then
		return nil_gen, params, nil
	end

	params.datalength = params.datalength or #params.data
	if params.basetype then
		if type(params.basetype)== "string" then
			params.basetype = ffi.typeof(params.basetype)
		end
	end
	params.basetype = params.basetype or ffi.typeof("char")
	params.basetypeptr = ffi.typeof("const $ *", params.basetype)
	params.basetypesize = ffi.sizeof(params.basetype)
	params.nelems = math.floor(params.datalength / params.basetypesize)

	if params.separator ~= nil then
		return delim_gen, params, 0
	else
		return array_gen, params, 0
	end

	return nil_gen, nil, nil
end

How to apply it?

local src3 = "big,boy,baby,bear,bounces,basketballs,behind,the,barn,,"

local function printAnsi(ptr, len)
  print(ffi.string(ptr, len))
end

each(printAnsi, striter{data=src3, basetype="char"})

Here, I am using the Lua Fun ‘each’ function to drive my iterator. I could just as easily use a simple ‘for – in’ loop, but I’m getting all functional these days. What the last statement says is, “for each of the items coming out of the striter iterator, call the ‘printAnsi()’ function”.

The striter function is called with a table that contains the various parameters it will need. In this particular case, I’ve left off the parenthesis, because in Lua, if it’s just a single table value, or a string, you can do that.

So, how about that striter() function? Looking back, it has the job of returning a ‘generator’, ‘invariant state’, and a ‘index’. Well, it cheats a bit because the ‘invariant’ also contains the ‘index’. The ‘invariant’ is the fact that the table value doesn’t change, even though the contents can. This is just a matter of convenience.

At any rate, the striter() function decides which generator to return based on what it sees in the parameters. For example, if it sees a separator, then it will return the ‘delim_gen’ generator. That generator functions pretty much the same was as the one I created last time for the multi string thing. In the case where it doesn’t see a separator, it will return the ‘array_gen’ generator. That generator will assume it is being handed a pointer to an array of values of a particular type.

One thing to note that is different this time around from the mstrziter, Lua string creation does not occur within the iterator itself. Rather than return a string value, the iterator will simply return an offset and a length. It is up to the caller to determine what they want to do with the values.

This is kind of a key to an IEnumerable chain. Do the least amount of work as possibly, deferring really heavy work towards the end of your chain. This lazy evaluation makes for a more efficient chain. So, the ‘printAnsi’ function is at the end of the chain. It might have turned out that instead of creating strings at all, I might have wanted to send the values across a network to be stored in a database. In that case, the pointer, offset, length is perfect to be consumed directy by the Socket:send(buff, len) function, so no copying would be necessary.

How about that array case?

Let’s imagine I wanted to print out every value of the string one by one.

each(printAnsi, striter{data=src3, basetype="char"})

In this case, I’m creating an iterator, not specifying the separator (so the array_gen will be used). I have also specified the ‘basetype’ of the elements of my array. That’s so it can calculate how many there are, and create a pointer of the appropriate type. And you’re done!

Of course, the ‘basetype’ could just as easily be ‘BGR32′, or ‘PersonRecord’, or whatever fixed size type you so happen to have stored in some array. Makes for some fairly easy ‘tokenizing’ of array values.

To go further, what’s say you have a multi string based on ‘wchar_t’, and delimeted by ‘ ‘ (space) characters?

How about a little convenience function?

local function wmstriter(data, separator, datalength)
  if type(separator) == "string" then
    separator = string.byte(separator)
  end

  datalength = datalength or ffi.sizeof(data)
  return map(core_string.toAnsi, striter{data=data, datalength = datalength, basetype="wchar_t", separator=separator})
end

and using it:

local whello, whellolen = core_string.toUnicode("Hello World");

each(print, wmstriter(whello, ' '))

In this case, I create a ‘wchar_t’ based string, using the ‘toUnicode()’ function. Then I feed that to the wmstriter iterator, and print out each of the words, delimeted by a ‘(sp)’.

The core ‘striter()’ remains the same, and the generators don’t change. You just build up more useful ways of feeding them and consuming them.

I find this to be fairly useful and powerful. When you think about the generators, they have a fairly simple task. Do one small job, and do it well. The complexity of a larger system is gained through thoughtful composition of these simpler parts, rather than building hugely complex macro parts.

This is helpful for code maintenance. I find it hard to maintain largish single functions with hundreds of lines of code. I find it fairly easy to maintain a generator that has a few lines of code, and does a fairly simple job. I find the composition model easy as well. I can look back on it over time, read the functional chain, and understand what was intended.

And there you have it. More iterating over oddities.

 


Follow

Get every new post delivered to your Inbox.

Join 51 other followers