schedlua – the kernel

Previously, I talked about the scheduler within the new scedlua.  A scheduler is a fairly simple thing, it just decides which of the many ready tasks will run next.  The default scheduler follows a fairly simple FIFO strategy, so there are no priorities, favorites, or the like.  Of course this wouldn’t be any fun if you were stuck with just one scheduler, so naturally enough this is an easily pluggable part of the system.  But what does this plugging?

In steps the Kernel.  In general, the schedlua project is about creating a set of tools by which highly performant services can be constructed.  schedlua largely supports the concept that a single processor can be highly leveraged if programmed correctly.  It does not try to gain performance through the usage of multiple threads, but rather it just takes on the task of suspending various tasks which are blocked on IO or otherwise idle, and letting tasks which are ready to run do their thing.  The concurrency model is cooperative, not preemptive, so if any one task misbehaves, the process can become stuck.

So, let’s take a look at this code:

--kernel.lua
-- kernel is a singleton, so return
-- single instance if we've already been
-- through this code
print("== KERNEL INCLUDED ==")

local Scheduler = require("scheduler")
local Task = require("task")
local Queue = require("queue")
local Functor = require("functor")

local Kernel = {
	ContinueRunning = true;
	TaskID = 0;
	Scheduler = Scheduler();
	TasksSuspendedForSignal = {};
}

setmetatable(Kernel, {
    __call = function(self, params)
    	params = params or {}
    	params.Scheduler = params.Scheduler or self.Scheduler
    	
    	if params.exportglobal then
    		self:globalize();
    	end

    	self.Scheduler = params.Scheduler;

    	return self;
    end,
})

function Kernel.getNewTaskID(self)
	self.TaskID = self.TaskID + 1;
	return self.TaskID;
end

function Kernel.getCurrentTaskID(self)
	return self:getCurrentTask().TaskID;
end

function Kernel.getCurrentTask(self)
	return self.Scheduler:getCurrentTask();
end

function Kernel.spawn(self, func, ...)
	local task = Task(func, ...)
	task.TaskID = self:getNewTaskID();
	self.Scheduler:scheduleTask(task, {...});
	
	return task;
end

function Kernel.suspend(self, ...)
	self.Scheduler:suspendCurrentFiber();
	return self:yield(...)
end

function Kernel.yield(self, ...)
	return self.Scheduler:yield();
end


function Kernel.signalOne(self, eventName, ...)
	if not self.TasksSuspendedForSignal[eventName] then
		return false, "event not registered", eventName
	end

	local nTasks = #self.TasksSuspendedForSignal[eventName]
	if nTasks < 1 then
		return false, "no tasks waiting for event"
	end

	local suspended = self.TasksSuspendedForSignal[eventName][1];

	self.Scheduler:scheduleTask(suspended,{...});
	table.remove(self.TasksSuspendedForSignal[eventName], 1);

	return true;
end

function Kernel.signalAll(self, eventName, ...)
	if not self.TasksSuspendedForSignal[eventName] then
		return false, "event not registered"
	end

	local nTasks = #self.TasksSuspendedForSignal[eventName]
	if nTasks < 1 then
		return false, "no tasks waiting for event"
	end

	for i=1,nTasks do
		self.Scheduler:scheduleTask(self.TasksSuspendedForSignal[eventName][1],{...});
		table.remove(self.TasksSuspendedForSignal[eventName], 1);
	end

	return true;
end

function Kernel.waitForSignal(self, eventName)
	local currentFiber = self.Scheduler:getCurrentTask();

	if currentFiber == nil then
		return false, "not currently in a running task"
	end

	if not self.TasksSuspendedForSignal[eventName] then
		self.TasksSuspendedForSignal[eventName] = {}
	end

	table.insert(self.TasksSuspendedForSignal[eventName], currentFiber);

	return self:suspend()
end

function Kernel.onSignal(self, func, eventName)
	local function closure()
		self:waitForSignal(eventName)
		func();
	end

	return self:spawn(closure)
end



function Kernel.run(self, func, ...)

	if func ~= nil then
		self:spawn(func, ...)
	end

	while (self.ContinueRunning) do
		self.Scheduler:step();		
	end
end

function Kernel.halt(self)
	self.ContinueRunning = false;
end

function Kernel.globalize()
	halt = Functor(Kernel.halt, Kernel);
    onSignal = Functor(Kernel.onSignal, Kernel);

    run = Functor(Kernel.run, Kernel);

    signalAll = Functor(Kernel.signalAll, Kernel);
    signalOne = Functor(Kernel.signalOne, Kernel);

    spawn = Functor(Kernel.spawn, Kernel);
    suspend = Functor(Kernel.suspend, Kernel);

    waitForSignal = Functor(Kernel.waitForSignal, Kernel);

    yield = Functor(Kernel.yield, Kernel);
end

return Kernel;

 

From the top, you can see the Kernel requires the scheduler, task and functor. The scheduler has already been explained. The Kernel serves a couple of purposes. First of all, it manages the scheduler. The ‘run’ function at the bottom is the ‘loop’ of the application. It will run until ‘halt’ is called. Each time through the loop it’s telling the scheduler to take a step.

Also at the bottom, you can see usage of the Functor. A functor is just a simple convenience wrapper that helps you call a function on a table at a later point. Those functors are used to make the keywords global.

There are two primary things the kernel does. One is to spawn new tasks, the other is to provide a central point for signal handling.

First, let’s look at the ‘spawn’.

function Kernel.spawn(self, func, ...)
	local task = Task(func, ...)
	task.TaskID = self:getNewTaskID();
	self.Scheduler:scheduleTask(task, {...});
	
	return task;
end

This is actually the coolest part of the system in terms of how the programming model is expressed. Here’s an example of it in use.

local Kernel = require("kernel"){exportglobal = true}


local function numbers(ending)
	local idx = 0;
	local function closure()
		idx = idx + 1;
		if idx > ending then
			return nil;
		end
		return idx;
	end
	
	return closure;
end

local function task1()
	print("first task, first line")
	yield();
	print("first task, second line")
end

local function task2()
	print("second task, only line")
end

local function counter(name, nCount)
	for num in numbers(nCount) do
		print(name, num);
		yield();
	end
	halt();
end

local function main()
	local t0 = spawn(counter, "counter1", 5)
	local t1 = spawn(task1)
	local t2 = spawn(task2)
	local t3 = spawn(counter, "counter2", 7)
end

run(main)

Basically, any time you want something to happen concurrently, you just say ‘spawn(func, params) and that’s that.

What happens is a Task object is created which holds onto the function object as well as the initial set of parameters. This task is then sent to the scheduler to be run. From then on out you can forget about it. Of course, you’re handed the task when you say ‘spawn’, so you do have a chance of suspending and killing it off in the future if you like. Similarly, you can wait for a task to complete as well.

So, that’s spawning.

The other major feature in the kernel is signal handling.

function Kernel.waitForSignal(self, eventName)
	local currentFiber = self.Scheduler:getCurrentTask();

	if currentFiber == nil then
		return false, "not currently in a running task"
	end

	if not self.TasksSuspendedForSignal[eventName] then
		self.TasksSuspendedForSignal[eventName] = {}
	end

	table.insert(self.TasksSuspendedForSignal[eventName], currentFiber);

	return self:suspend()
end

This is probably THE most important routine in the whole system. Basically, take the current task, and put it onto the suspension list, connected with the signal name (signals are just a string value). Later on, when you want the task to resume, you would signal it using either signalOne(), or signalAll().

A little bit of code demonstrating this:

local Kernel = require("kernel"){exportglobal = true}
local Functor = require("functor")

local function numbers(ending)
	local idx = 0;
	local function closure()
		idx = idx + 1;
		if idx > ending then
			return nil;
		end
		return idx;
	end
	
	return closure;
end

local function waitingOnCount(name, ending)
	local eventName = name..tostring(ending)
	waitForSignal(eventName)

	print("REACHED COUNT: ", ending)
end

local function onCountFinished(name)
	print("Counter Finished: ", name)
end

local function counter(name, nCount)
	for num in numbers(nCount) do
		print(num)
		local eventName = name..tostring(num);
		--print(eventName)
		signalOne(eventName);

		yield();
	end

	signalAll(name..'-finished')
end

local function main()
	local t1 = spawn(counter, "counter", 50)
	local t2 = spawn(waitingOnCount, "counter", 20)
	local t3 = spawn(function() print("LAMDA"); waitForSignal("counter15") print("reached 15!!") end)
	
	-- test signalAll().  All three of these should trigger when
	-- counter finishes
	local t13 = onSignal(Functor(onCountFinished, "counter-1"), "counter-finished")
	local t14 = onSignal(Functor(onCountFinished, "counter-2"), "counter-finished")
	local t15 = onSignal(Functor(onCountFinished, "counter-3"), "counter-finished")
end

run(main)

Here, there is a counter(), which is just counting, and firing off a signal for each number. The various waitingOnCount(), and LAMBDA routines are going to respond to the appropriate signals.

Finally, the t13, t14, and t15 tasks are waiting for the “counter-finished” signal, and they will all fire off and print their little message.

Of course, at this point you could have something that would call ‘halt()’ so you don’t have to press Ctl-C to stop the process, but you get the idea.

And that’s pretty much it for the kernel. Absent from here are the async io, predicates, alarms and the like. They are available in schedlua, but they’re not a part of the kernel. Instead of being part of the kernel proper, these are essentially modules. They utilize the signal and spawn features built into the kernel, and they’re free to do their own thing.

I’ll get into the details of alarms and predicates next time around to demonstrate the concept of easy add-on modules.


Schedulers revisited

A ran quite a long series on the buildup of the TINN scheduler, including this gem on the primitives included in the application model: Parallel Conversations

The scheduler developed in that whole series served me very well, taking on timing, predicates, signals, and async IO.  It was all based on Windows primitives, and although fairly modular, was an evolved codebase, which had some lumps and inefficiencies.  So now, two years on, I decided to reimagine that scheduler.  Primary reasons?  I had some time on my hands, and I wanted a scheduler that works equally well on Linux as on Windows.

And so, the schedlua project was born.

What were the design criteria this time around?

  • Reduce Size and complexity
  • Increase composability
  • Do not force features where they’re not desired

In the previous incarnation, I went back and forth between having an “Application” object, and the “Scheduler”.  Things were primarily driven by the Application, and it had a central role in the instance of any running app.  I still favor having this central role, but I changed the meaning and composition of that central player this time around.  First though, there is the “Scheduler”.


local ffi = require("ffi");

local Queue = require("queue")
local Task = require("task");


--[[
	The Scheduler supports a collaborative processing
	environment.  As such, it manages multiple tasks which
	are represented by Lua coroutines.

	The scheduler works by being the arbitrator of which routine is running
	in the application.  When work is to be done, it is encapsulated in 
	the form of a task object.  The scheduler relies upon that task object
	to call a form of 'yield' at some point, at which time it will receive
	the main thread of execution, and will pick the next task out of the ready
	list to run.
--]]
local Scheduler = {}
setmetatable(Scheduler, {
	__call = function(self, ...)
		return self:create(...)
	end,
})
local Scheduler_mt = {
	__index = Scheduler,
}

function Scheduler.init(self, ...)
	local obj = {
		TasksReadyToRun = Queue();
	}
	setmetatable(obj, Scheduler_mt)
	
	return obj;
end

function Scheduler.create(self, ...)
	return self:init(...)
end


function Scheduler.tasksPending(self)
	return self.TasksReadyToRun:length();
end


-- put a task on the ready list
-- the 'task' should be something that can be executed,
-- whether it's a function, functor, or something that has a '__call'
-- metamethod implemented.
-- The 'params' is a table of parameters which will be passed to the function
-- when it's ready to run.
function Scheduler.scheduleTask(self, task, params)
	--print("Scheduler.scheduleTask: ", task, params)
	params = params or {}
	
	if not task then
		return false, "no task specified"
	end

	task:setParams(params);
	self.TasksReadyToRun:enqueue(task);	
	task.state = "readytorun"

	return task;
end

function Scheduler.removeFiber(self, fiber)
	--print("REMOVING DEAD FIBER: ", fiber);
	return true;
end

function Scheduler.inMainFiber(self)
	return coroutine.running() == nil; 
end

function Scheduler.getCurrentFiber(self)
	return self.CurrentFiber;
end

function Scheduler.suspendCurrentFiber(self, ...)
	self.CurrentFiber.state = "suspended"
end

function Scheduler.step(self)
	-- Now check the regular fibers
	local task = self.TasksReadyToRun:dequeue()

	-- If no fiber in ready queue, then just return
	if task == nil then
		print("Scheduler.step: NO TASK")
		return true
	end

	if task:getStatus() == "dead" then
		self:removeFiber(task)

		return true;
	end

	-- If the task we pulled off the active list is 
	-- not dead, then perhaps it is suspended.  If that's true
	-- then it needs to drop out of the active list.
	-- We assume that some other part of the system is responsible for
	-- keeping track of the task, and rescheduling it when appropriate.
	if task.state == "suspended" then
		--print("suspended task wants to run")
		return true;
	end

	-- If we have gotten this far, then the task truly is ready to 
	-- run, and it should be set as the currentFiber, and its coroutine
	-- is resumed.
	self.CurrentFiber = task;
	local results = {task:resume()};

	-- once we get results back from the resume, one
	-- of two things could have happened.
	-- 1) The routine exited normally
	-- 2) The routine yielded
	--
	-- In both cases, we parse out the results of the resume 
	-- into a success indicator and the rest of the values returned 
	-- from the routine
	--local pcallsuccess = results[1];
	--table.remove(results,1);

	local success = results[1];
	table.remove(results,1);

--print("PCALL, RESUME: ", pcallsuccess, success)

	-- no task is currently executing
	self.CurrentFiber = nil;


	if not success then
		print("RESUME ERROR")
		print(unpack(results));
	end

	-- Again, check to see if the task is dead after
	-- the most recent resume.  If it's dead, then don't
	-- bother putting it back into the readytorun queue
	-- just remove the task from the list of tasks
	if task:getStatus() == "dead" then
		self:removeFiber(task)

		return true;
	end

	-- The only way the task will get back onto the readylist
	-- is if it's state is 'readytorun', otherwise, it will
	-- stay out of the readytorun list.
	if task.state == "readytorun" then
		self:scheduleTask(task, results);
	end
end

function Scheduler.yield(self, ...)
	return coroutine.yield(...);
end


return Scheduler

I think the code is pretty straight forward to follow, except for that ‘step()’ function.  So, what is a scheduler anyway?  Well, we want to have some level of ‘concurrency’ in our applications.  That is, I would love to be able to have the appearance of running several tasks at once on my machine, without much work.  In modern day computing, this is fairly well covered by the OS as applications.  We can run several apps at the same time on our computers, and the kernel takes care of the details of how that gets done.  Modern kernels simply split  time between tasks, giving some amount of time to each task in turn, so they feel like they are running in parallel, or concurrently.  It’s a nice illusion because the machines are fast enough, and there are enough times when a task will simply be waiting around anyway that the amount of concurrency achieved can be very high.

So, what if you want some of that kernel goodness within the context of your app directly?  You want to handle mouse and keyboard, while doing some drawing and networking at the same time.  This is typically where ‘threads’ step in, and down the rabbit hole we go, getting involved in threading models, concurrency primitives and the like.  Well, in Lua, there’s a concurrency primitive built in, in the form of coroutines.  Coroutines are a fairly old style of concurrency whereby the application gets control of the CPU, and only gives it up willingly by calling a ‘yield()’ function at some point.  OK, great.  So, you start a routine, and at some point you call yield, and some other routine gets a chance to run.  But, which routine?  Aha, so this is where the ‘scheduler’ comes in.

The scheduler might be considered the ‘home’ routine.  That is, once a routine gives up a slide of time by calling ‘yield’, the scheduler is the place they will return to.  At that point, the scheduler can decide which routines is to be run next.  It can do this because it has a list of ready to run tasks (those placed on the list using ‘scheduleTask’).  In this scheduler, it simply takes the next one off the list, and tells it to resume.  That is the essence of the ‘step()’ function.

Great!  And what does it look like in action?

--test_scheduler.lua
package.path = package.path..";../?.lua"

Scheduler = require("scheduler")()
Task = require("task")
local taskID = 0;

local function getNewTaskID()
	taskID = taskID + 1;
	return taskID;
end

local function spawn(scheduler, func, ...)
	local task = Task(func, ...)
	task.TaskID = getNewTaskID();
	scheduler:scheduleTask(task, {...});
	
	return task;
end


local function task1()
	print("first task, first line")
	Scheduler:yield();
	print("first task, second line")
end

local function task2()
	print("second task, only line")
end

local function main()
	local t1 = spawn(Scheduler, task1)
	local t2 = spawn(Scheduler, task2)

	while (true) do
		--print("STATUS: ", t1:getStatus(), t2:getStatus())
		if t1:getStatus() == "dead" and t2:getStatus() == "dead" then
			break;
		end
		Scheduler:step()
	end
end

main()

That’s the scheduler in isolation.  In this test case, you can see that task creation, and the ‘spawn’ primitive aren’t part of the scheduler proper.  They don’t need to be.  The scheduler should only be concerned with deciding which task is going to be next.  In this way, you can imagine creating all sorts of different kinds of schedulers.  You cold introduce the concept of priorities, or aging, or realtime, or whatever.  The same general idea would apply, the scheduler just needs to decide which task is going to run next.  Also of not here, the ‘main()’ function operates as the “even loop” if you will.  This is the bit of code that drives the scheduler between steps.  Having this separate from the scheduler proper will become advantageous down the road when we want to compose the scheduler will a larger application framework.

So, this is the first step in revitalizing the concurrency core I built into the TINN project.  This time around, more composable, more feature rich, more easily cross platform, more bang for the buck.  Next time around, I’ll examine the “kernel”.


Composition at What Level?

barpoly

That picture’s got a lot going on.  In Render Text like a Boss, I was rendering text, which is a pretty exciting milestone.  The only thing more exciting to do with text is render based on vectors, and being able to scale and rotate, but there are other fish to fry in the meanwhile.  I find that polygons are fairly useful as a primitive.

The above picture shows a whole bunch of primitives rendering with varying degrees of transparency.  Those random rectangles in the background are just making things look interesting.  Next are some random bars, with varying degrees of transparency, again just to make things look interesting.  And then comes that orange tinted polygon thing that lays atop the bars.  The polygon is simply constructed by taking the midpoint of the top edge of the bar as a vertex.  The two bottom vertices (left and right) close out the polygon.

But wait a minute, I see a bunch of triangles in there?  Well, just for the purposes of this article, I’m showing how the polygon is broken down into triangles (triangulated).  Triangles?  what the heck?  Well, you see, it’s like this.  There is more than one way to skin a cat, or render a polygon.  You can search the interwebs and find as many ways to do it as there are graphics libraries.  The ‘direct’ method would have you go scan line by line, calculating even and odd line crossings to determine whether individual pixels were ‘inside’ or ‘outside’ the polygon, and thus color the pixel.  You could traverse down edges, creating horizontal line spans, and then render those spans.  Well, these both look fairly similar to the choices made when rendering a triangle.  So, maybe it’s easier to simply break the polygon down into triangles, and then render the resultant triangles.  I’m sure there’s some math theorem somewhere that can be used to prove that any polygon > 3 points can be broken up into nice sets of triangles.  Equivalently, there are numerous coding algorithms which will do the same.  For graphic, I found a nice public domain triangulator, and incorporated it into the library.  Pass it some points, it breaks them down into triangles, and the triangles are rendered.

Great.  This has a couple of nice implications.  First of all, triangles (and ultimately horizontal lines) remain a useful primitive in the core of the graphics library.  I did not actually add a polygon capability to the core, just keep the triangles there, and anyone who needs polygons can easily break them down into triangles.  That leaves me with a single complex primitive to optimize.  I do the same for quads.  Rectangles are a special case, in that they may be optimized separately to come up with faster behavior than the triangulation would deliver.

ellipses

What about the ellipse?  Hmmm.  There are faily fast ellipse drawing routines in the world.  The most interesting for drawing ellipse outlines looks like this:

 

void raster_rgba_ellipse_stroke(pb_rgba *pb, const uint32_t cx, const uint32_t cy, const size_t xradius, size_t yradius, const uint32_t color)
{
	int x, y;
	int xchange, ychange;
	int ellipseerror;
	int twoasquare, twobsquare;
	int stoppingx, stoppingy;

	twoasquare = 2 * xradius*xradius;
	twobsquare = 2 * yradius*yradius;

	x = xradius;
	y = 0;

	xchange = yradius*yradius*(1 - 2 * xradius);
	ychange = xradius*xradius;
	ellipseerror = 0;
	stoppingx = twobsquare*xradius;
	stoppingy = 0;

	// first set of points, sides
	while (stoppingx >= stoppingy)
	{
		Plot4EllipsePoints(pb, cx, cy, x, y, color);
		y++;
		stoppingy += twoasquare;
		ellipseerror += ychange;
		ychange += twoasquare;
		if ((2 * ellipseerror + xchange) > 0) {
			x--;
			stoppingx -= twobsquare;
			ellipseerror += xchange;
			xchange += twobsquare;
		}
	}

	// second set of points, top and bottom
	x = 0;
	y = yradius;
	xchange = yradius*yradius;
	ychange = xradius*xradius*(1 - 2 * yradius);
	ellipseerror = 0;
	stoppingx = 0;
	stoppingy = twoasquare*yradius;

	while (stoppingx <= stoppingy) {
		Plot4EllipsePoints(pb, cx, cy, x, y, color);
		x++;
		stoppingx += twobsquare;
		ellipseerror += xchange;
		xchange += twobsquare;
		if ((2 * ellipseerror + ychange) > 0) {
			y--;
			stoppingy -= twoasquare;
			ellipseerror += ychange;
			ychange += twoasquare;
		}
	}
}

This is one of those routines that takes advantage of the fact that axis aligned ellipses have a lot of symmetry, so 4 pointes are plotted at the same time. Well, it’s not a big leap to think that this technique can be extended. For filling the ellipse, you can simply draw horizontal lines between the symmetrical points, and you end up with a solid ellipse.

inline void fill2EllipseLines(pb_rgba *pb, const uint32_t cx, const uint32_t cy, const unsigned int x, const unsigned int y, const uint32_t color)
{
	raster_rgba_hline_blend(pb, cx - x, cy + y, 2*x, color);
	raster_rgba_hline_blend(pb, cx - x, cy - y, 2 * x, color);
}

void raster_rgba_ellipse_fill(pb_rgba *pb, const uint32_t cx, const uint32_t cy, const size_t xradius, size_t yradius, const uint32_t color)
{
	int x, y;
	int xchange, ychange;
	int ellipseerror;
	int twoasquare, twobsquare;
	int stoppingx, stoppingy;

	twoasquare = 2 * xradius*xradius;
	twobsquare = 2 * yradius*yradius;

	x = xradius;
	y = 0;

	xchange = yradius*yradius*(1 - 2 * xradius);
	ychange = xradius*xradius;
	ellipseerror = 0;
	stoppingx = twobsquare*xradius;
	stoppingy = 0;

	// first set of points, sides
	while (stoppingx >= stoppingy)
	{
		//Plot4EllipsePoints(pb, cx, cy, x, y, color);
		fill2EllipseLines(pb, cx, cy, x, y, color);
		y++;
		stoppingy += twoasquare;
		ellipseerror += ychange;
		ychange += twoasquare;
		if ((2 * ellipseerror + xchange) > 0) {
			x--;
			stoppingx -= twobsquare;
			ellipseerror += xchange;
			xchange += twobsquare;
		}
	}

	// second set of points, top and bottom
	x = 0;
	y = yradius;
	xchange = yradius*yradius;
	ychange = xradius*xradius*(1 - 2 * yradius);
	ellipseerror = 0;
	stoppingx = 0;
	stoppingy = twoasquare*yradius;

	while (stoppingx <= stoppingy) {
		fill2EllipseLines(pb, cx, cy, x, y, color);
		x++;
		stoppingx += twobsquare;
		ellipseerror += xchange;
		xchange += twobsquare;
		if ((2 * ellipseerror + ychange) > 0) {
			y--;
			stoppingy -= twoasquare;
			ellipseerror += ychange;
			ychange += twoasquare;
		}
	}
}

In this context, the ellipse is kind of like the rectangle. There is an easy optimization at hand, so the benefits of employing triangulation might not be worth the effort. Then again, if you wanted to maintain a smaller codebase, then you might want to triangulate even this ellipse. You’d still be stuck with using the straight up code to draw the ellipse outline, and with a little bit of refactoring, you could use the exact same code for the outline and the fill. The routine to be utilized for the symmetrical points could be passed in as a function pointer, and that would be that. That makes for easy composition, code reuse, and the like.

I’m not actually sure how much I tend to use ellipses in realistic drawing anyway. I suppose there are circles here and there, but eh, there you go.

The theme of this article was composition. There is a bit of code reuse, composition in how you build larger things from smaller things, and simply trying to maintain a constraint of a small codebase, requires that higher order elements are composed from smaller simpler elements. The triangulation code currently relies on some C++ (for vector), so it can’t go into the core library as yet (straight C only), but I’m sure it will get there.

Well, there you have it, the little library that could is gaining more capabilities all the time. Soon enough, these capabilities will allow the composition of some fairly complex UI elements.


Render Text like a Boss

And it looks like this:

verdana18

One of the most rewarding parts of bringing up a graphics system is when you’re able to draw some text.  I have finally reached that point with the graphicc library.  There are many twists and turns along the way, some of which are interesting, but I’ll just share a bit of the journey here.

First of all, the code that generates this image looks like this:

#include "test_common.h"
#include "animwin32.h"

struct font_entry {
	char *name;
	const uint8_t *data;
};

struct font_entry fontlist[] = {
	{"gse4x6", gse4x6 },
	{ "gse4x8", gse4x8 },
	{ "gse5x7", gse5x7 },
	{ "gse5x9", gse5x9 },
	{ "gse6x12", gse6x12 },
	{ "gse6x9", gse6x9 },
	{ "gse7x11", gse7x11 },
	{ "gse7x11_bold", gse7x11_bold },
	{ "gse7x15", gse7x15 },
	{ "gse7x15_bold", gse7x15_bold },
	{ "gse8x16", gse8x16 },
	{ "gse8x16_bold", gse8x16_bold },
	{ "mcs11_prop", mcs11_prop },
	{ "mcs11_prop_condensed", mcs11_prop_condensed },
	{ "mcs12_prop", mcs12_prop },
	{ "mcs13_prop", mcs13_prop },
	{ "mcs5x10_mono", mcs5x10_mono },
	{ "mcs5x11_mono", mcs5x11_mono },
	{ "mcs6x10_mono", mcs6x10_mono },
	{ "mcs6x11_mono", mcs6x11_mono },
	{ "mcs7x12_mono_high", mcs7x12_mono_high },
	{ "mcs7x12_mono_low", mcs7x12_mono_low },
	{ "verdana12", verdana12 },
	{ "verdana12_bold", verdana12_bold },
	{ "verdana13", verdana13 },
	{ "verdana13_bold", verdana13_bold },
	{ "verdana14", verdana14 },
	{ "verdana14_bold", verdana14_bold },
	{ "verdana16", verdana16 },
	{ "verdana16_bold", verdana16_bold },
	{ "verdana17", verdana17 },
	{ "verdana17_bold", verdana17_bold },
	{ "verdana18", verdana18 },
	{ "verdana18_bold", verdana18_bold }

};
static const int numfonts = sizeof(fontlist) / sizeof(fontlist[0]);
static int fontidx = 0;



LRESULT CALLBACK keyReleased(HWND hWnd, UINT message, WPARAM wParam, LPARAM lParam)
{
	switch (wParam)
	{
		case VK_UP:
			fontidx--;
			if (fontidx < 0) {
				fontidx = numfonts - 1;
			}
		break;

		case VK_DOWN:
			fontidx++;
			if (fontidx >= numfonts){
				fontidx = 0;
			}
		break;

		case VK_SPACE:
			write_PPM("test_agg_raster_fonts.ppm", gpb);
		break;
	}

	return 0;
}

void setup()
{
	size(640, 480);
	background(pWhite);

	setOnKeyReleasedHandler(keyReleased);
}


static char CAPS[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
static char LOWS[] = "abcdefghijklmnopqrstuvwxyz";
static char NUMS[] = "1234567890";
static char SYMS[] = "!@#$%^&*()_+-={}|[]\\:\"; '<>?,./~`";
static char SOME[] = "The quick brown fox jumped over the lazy dog.";

void draw()
{
	setFont(fontlist[fontidx].data);

	fill(pBlack);
	text(fontlist[fontidx].name, 0, 0);
	text(CAPS, 0, gfont.height * 1);

	fill(pRed);
	text(LOWS, 0, gfont.height * 2);

	fill(pGreen);
	text(NUMS, 0, gfont.height * 3);

	fill(pBlue);
	text(SYMS, 0, gfont.height * 4);

	fill(pBlack);
	text(SOME, 0, gfont.height * 5);
}

Many of the test programs in graphicc have this similar layout. If you’ve every done any programming using Processing you might be familiar with the ‘setup()’ and ‘draw()’ methods. The animwin32 file is responsible for this little construction. Basically, it provides a “Processing like” environment, which makes it really easy to play with various graphics routines. It also does mouse/keyboard handling, so you can setup to deal with key presses and mouse actions. In this case, I use the key up/down actions to change the font from a list of font data. Pressing ‘SPACE’ will take a dump of the screen.

That font data was borrowed from the Anti-Grain Geometry library, because it’s a nice compact bitmap representations of a few fonts.  Just a brief on the AGG library.  This is a fairly well written graphics library written by Maxim Shemanarev, who died way too young I think.  The library is an inspiration and source of knowledge to anyone doing 2D graphics.

The text() function will render the string using whatever is the currently selected font, fill color, and specified location. The location specified the upper left of the font box. There are various text rendering attributes which can be employed, such as alignment, a draw box, whether to use the upper left corner, or the baseline, etc. All in due time.

The guts of the text() routine look like this:

// Text Processing
void text(const char *str, const int x, const int y)
{
	scan_str(gpb, &gfont, x, y, str, fillColor);
}

void setFont(const uint8_t *fontdata)
{
	font_t_init(&gfont, fontdata);
}

Very simple yes?

And deeper into the rabbit hole…

int scan_str(pb_rgba *pb, font_t *font, const int x, const int y, const char *chars, const int color)
{
	glyph_t ginfo;

	int idx = 0;
	int dx = x;
	int dy = y;

	while (chars[idx])
	{
		glyph_t_init(font, &ginfo, chars[idx]);
		scan_glyph(pb, font, &ginfo, dx, dy, color);
		dx += ginfo.width;
		idx++;
	}

	return dx;
}

Basically, get the glyph information for each character, and scan that glyph into the pixel buffer. The scan_glyph() in turn takes the packed bitmap information, unpacks it into individual ‘cover spans’, then turns that into pixels to be placed into the pixel buffer.

Lots can be done to optimize this activity. For example, all the glyphs can be rendered into a bitmap, and then just do a blit to the pixel buffer when needed. This would essentially be a tradeoff between space and time. That’s a choice easily made at a higher level than the lowest graphics routines, so the low level graphic routines don’t make that assumption, they only deal with the packed form of the data. I find this kind of design choice to be most interesting and useful in terms of what kind of profile the overall system has, and what constraints it will adhere to in terms of what type of system it will run on.

So, there you have it.  At this point, the graphicc library can do primitives from single pixels to bezier curves, and text.  With the help of the animwin32 routines, you can do simple UI programming as if you were using the 2D portion of Processing.  I find this to be a good plateau in the development of the library.  At this point, it makes sense to reconsider the various design choices made, cleanup the API, throw in some ‘const’ here and there, consider error checking, and the like.  The next step after that is to consider higher level libraries, which might be purpose built.  I’ll want to incorporate 3D at some point, and there are hints of that already in the routines, but I don’t want to get bogged down in that just yet.  Good image scaling and rotation is probably a first priority before lighting models and 3D rendering.

I would love to be able to create graphic systems profiles, such as, can I mimic GDI, or the HTML Canvas routines.  I can already do Processing to a certain extent.  My first test of the library though is something much simpler.  I want to be able to render the state of a network of machines performing some tasks.  First pass will likely be simple 2D graphs of some sort, so time to get on it.


LJITColors, there’s a name for that

I have written about curating color data in the past: Curating Data – Resene and Hollasch Colors

Now, here it is, two years later, and I find a reason to revisit the topic.  Really I just wanted to create a new repository on GitHub which isolated my color related stuff because I was finding it hard to find it.  So, I wrote some code, and put it in this LJITColors repository.

What’s there?  Well, first of all, you must have some color databases.  The allcolors.lua file contains color tables for; Crayola, Hollasch, Resene, SGI.  These are just the tables that I’ve curated from around the web that I find to be interesting.  A color table is nothing more than a name/value pair that might look like this:

 

local CrayolaColors = {
	navyblue = {25, 116, 210},
	seagreen = {159, 226, 191},
	screamingreen = {118, 255, 122},
	greenblue = {17, 100, 180},
	royalpurple = {120, 81, 169},
	bluegray = {102, 153, 204},
	blizzardblue = {172, 229, 238},

And at the end of the allcolors.lua file we find:

return {
	resene = ReseneColors,
	crayola = CrayolaColors,
	hollasch = HollaschColors,
	sgi = SGIColors,
}

This in and of itself is interesting because right away you can already do something as simple as:

local colordb = require("allcolors")
local screaminggreen = colordb.crayola.screaminggreen

And you’ll get the RGB triplet that represents the ‘screaminggreen’ color. OK, that might be useful for some use cases. But, since you have the data in a form that is easily searchable and maliable, you can do even more.

Let’s say you want to lookup a color based on a fragment of the name. You want some form of green, but you’re not sure what exactly, so you just want to explore. Let’s say you want to know all the colors in the Crayola set that have the word “yellow” in them:

 	local found = colorman.matchColorByName("yellow", colorman.colordb["crayola"], "crayola")

This returns a set of values that looks like:

    {dbname ="Crayola", name="lemmonyellow", color={255,244,79}}

   crayola ultrayellow          255  164  116
   crayola yellowgreen          197  227  132
   crayola lemonyellow          255  244   79
   crayola orangeyellow         248  213  104
   crayola yellow               252  232  131
   crayola yelloworange         255  174   66
   crayola greenyellow          240  232  145
   crayola unmellowyellow       255  255  102

Well, that’s pretty spiffy I think. And if you want to search the whole database, and not just one set, you can use this form:

	local found = getColorLikeName(pattern)

The ‘pattern’ is used in a lua string.find() function, so it can actually be a complex expression if you like.

Although looking up color values by name is useful and interesting, looking up by component values might be even more interesting. This is where things get really interesting though. Let’s imagine we want to lookup colors that appear to be ‘white’ {255, 255, 255}.

My first naïve attempt at doing this was to make the observation that the triplet is just a vector. Well, if I can find the angle between two vectors, then a ‘match’ is simply when two vectors have an angle close to zero. Yah!! That’s the ticket.

local function normalize(A)
	local mag = math.sqrt(A[1]*A[1] + A[2]*A[2] + A[3]*A[3])
	return {A[1]/mag, A[2]/mag, A[3]/mag}
end

-- linear algebra dot product
local function dot(A,B)
	return A[1]*B[1]+A[2]*B[2]+A[3]*B[3];
end

local function angleBetweenColors(A, B)
	local a = normalize(A)
	local b = normalize(B)
	local angle = acos(dot(a,b))

	return angle
end

local function matchColorByValue(color, db, dbname)
	local colors = {}

	for name, candidate in pairs(db) do
		local angle = angleBetweenColors(color, candidate)
		if (angle < 0.05) then
			table.insert(colors, {dbname=dbname, name=name, color = candidate})
		end
	end

	return colors;
end

Then I can just do the following to find the “white” colors:

	local found = colorman.matchColorByValue({255,255,255}, colorman.colordb.sgi, "sgi")

This results in about 250 entries that look like this:

       sgi seashell1            255  245  238
       sgi gainsboro            220  220  220
       sgi grey20                51   51   51
       sgi oldlace              253  245  230
       sgi grey85               217  217  217
       sgi grey30                77   77   77
       sgi gray73               186  186  186
       sgi grey45               115  115  115
       sgi gray31                79   79   79
       sgi grey51               130  130  130
       sgi gray92               235  235  235
       sgi grey                 190  190  190
       sgi grey38                97   97   97
       sgi grey62               158  158  158
       sgi gray27                69   69   69
       sgi gray39                99   99   99
       sgi grey11                28   28   28
       .
       .
       .

Huh? What’s that about? Well, upon further inspection, it did exactly what I asked. It found all the colors that have the same normalized vector. In that context, there’s no difference between {28, 28, 28} and {255, 255, 255}. Hmmm, so what I need to also take account of is the luminance value, so I get a similar direction and magnitude if you will.

--
-- Convert to luminance using ITU-R Recommendation BT.709 (CCIR Rec. 709)
-- This is the one that matches modern HDTV and LCD monitors
local function lumaBT709(c)
	local gray = 0.2125 * c[1] + 0.7154 * c[2] + 0.0721 * c[3]

	return gray;
end

local function matchColorByValue(color, db, dbname)
	local colors = {}
	local colorluma = lumaBT709(color)


	for name, candidate in pairs(db) do
		local angle = angleBetweenColors(color, candidate)
		local candidateluma = lumaBT709(candidate)
		if (angle < 0.05) and (abs(candidateluma - colorluma) < 5) then
			table.insert(colors, {dbname=dbname, name=name, color = candidate})
		end
	end

	return colors;
end

This time, the set is more like what I was after:

       sgi gray100              255  255  255
       sgi snow1                255  250  250
       sgi azure                240  255  255
       sgi ivory1               255  255  240
       sgi grey99               252  252  252
       sgi gray99               252  252  252
       sgi ivory                255  255  240
       sgi grey100              255  255  255
       sgi mintcream            245  255  250
       sgi floralwhite          255  250  240
       sgi snow                 255  250  250
       sgi honeydew1            240  255  240
       sgi azure1               240  255  255
       sgi honeydew             240  255  240
       sgi white                255  255  255

And, again, if I want to do it over the entire database:

	local found = colorman.getColorByValue({255,255,255})
    resene ricecake             255  254  240
    resene blackwhite           255  254  246
    resene bianca               252  251  243
    resene quarterpearllusta    255  253  244
    resene soapstone            255  251  249
    resene orchidwhite          255  253  243
    resene halfpearllusta       255  252  234
    resene apricotwhite         255  254  236
    resene romance              255  254  253
    resene clearday             233  255  253
    resene whitenectar          252  255  231
    resene butterywhite         255  252  234
    resene promenade            252  255  231
    resene wanwhite             252  255  249
    resene sugarcane            249  255  246
    resene hintofyellow         250  253  228
    resene seafog               252  255  249
    resene rocksalt             255  255  255
    resene travertine           255  253  232
    resene ceramic              252  255  249
    resene alabaster            255  255  255
    resene chileanheath         255  253  230
    resene dew                  234  255  254
    resene bridalheath          255  250  244
    resene hintofgrey           252  255  249
    resene islandspice          255  252  238
    resene chinaivory           252  255  231
    resene orangewhite          254  252  237
   crayola white                255  255  255
  hollasch azure                240  255  255
  hollasch snow                 255  250  250
  hollasch ivory                255  255  240
  hollasch titanium_white       252  255  240
  hollasch mint_cream           245  255  250
  hollasch floral_white         255  250  240
  hollasch white                255  255  255
  hollasch honeydew             240  255  240
       sgi gray100              255  255  255
       sgi snow1                255  250  250
       sgi azure                240  255  255
       sgi ivory1               255  255  240
       sgi grey99               252  252  252
       sgi gray99               252  252  252
       sgi ivory                255  255  240
       sgi grey100              255  255  255
       sgi mintcream            245  255  250
       sgi floralwhite          255  250  240
       sgi snow                 255  250  250
       sgi honeydew1            240  255  240
       sgi azure1               240  255  255
       sgi honeydew             240  255  240
       sgi white                255  255  255

Now we’re cooking with gas!

So, originally, I was interested in curating a few sets of colors just so that I’d always have some color sets readily available. Now, I’ve turned those color sets into a quick and dirty database, and thrown in some data set specific search routines which makes them much more useful. For color matching, I can imagine picking up a few pixels from a bitmap image, and doing color matching to find ways to blend and extend.

It’s all fun, and realizations like taking into account the luminance value, makes the learning that much more interesting.

So, there you have it.


Drawing Curvy Lines with aplomb – Beziers

I have written plenty about Bezier and other curves and surfaces.  That was largely in the context of 3D printing using OpenScad, or BanateCad.  But now, I’m adding to a general low level graphics library.

test_bezier

Well well, what do we have here.  Bezier curves are one of those constructs where you lay down some ‘control points’ and then draw a line that meanders between them according to some mathematical formula.  In the picture, the green curve is represented by 4 control points, the red one is represented by 5 points, and the blue ones are each represented by 4 points.

How do you construct a Bezier curve?  Well, you don’t need much more than the following code:

void computeCoefficients(const int n, int * c)
{
	int k, i;

	for (k = 0; k <= n; k++)
	{
		// compute n!/(k!(n-k)!)
		c[k] = 1;
		for (i = n; i >= k + 1; i--)
		{
			c[k] *= i;
		}

		for (i = n - k; i >= 2; i--)
		{
			c[k] /= i;
		}
	}
}

void computePoint(const float u, Pt3 * pt, const int nControls, const Pt3 *controls, const int * c)
{
	int k;
	int n = nControls - 1;
	float blend;

	pt->x = 0.0;	// x
	pt->y = 0.0;	// y
	pt->z = 0.0;	// z
	
	// Add in influence of each control point
	for (k = 0; k < nControls; k++){
		blend = c[k] * powf(u, k) *powf(1 - u, n - k);
		pt->x += controls[k].x * blend;
		pt->y += controls[k].y * blend;
		pt->z += controls[k].z * blend;
	}
}

void bezier(const Pt3 *controls, const int nControls, const int m, Pt3 * curve)
{
	// create space for the coefficients
	int * c = (int *)malloc(nControls * sizeof(int));
	int i;

	computeCoefficients(nControls - 1, c);
	for (i = 0; i <= m; i++) {
		computePoint(i / (float)m, &curve[i], nControls, controls, c);
	}
	free(c);	
}

This is pretty much the same code you would get from any book or tutorial on fundamental computer graphics. It will allow you to calculate a Bezier curve using any number of control points.

Here’s the test case that generated the picture

typedef struct {
	REAL x;
	REAL y;
	REAL z;
} Pt3;

void polyline(pb_rgba *pb, Pt3 *curve, const int nPts, int color)
{
	for (int idx = 0; idx < nPts; idx++) {
		raster_rgba_line(pb, curve[idx].x, curve[idx].y, curve[idx + 1].x, curve[idx + 1].y, color);
	}
}

void test_bezier()
{

	size_t width = 400;
	size_t height = 400;
	int centerx = width / 2;
	int centery = height / 2;
	int xsize = (int)(centerx*0.9);
	int ysize = (int)(centery*0.9);

	pb_rgba pb;
	pb_rgba_init(&pb, width, height);

	// background color
	raster_rgba_rect_fill(&pb, 0, 0, width, height, pLightGray);

	// One curve drooping down
	Pt3 controls[4] = { { centerx - xsize, centery, 0 }, { centerx, centery + ysize, 0 }, { centerx, centery + ysize, 0 }, { centerx + xsize, centery, 0 } };
	int nControls = 4;
	int m = 60;
	Pt3 curve[100];
	bezier(controls, nControls, m, curve);
	polyline(&pb, curve, m, pGreen);

	// Several curves going up
	for (int offset = 0; offset < ysize; offset += 5) {
		Pt3 ctrls2[4] = { { centerx - xsize, centery, 0 }, { centerx, centery - offset, 0 }, { centerx, centery - offset, 0 }, { centerx + xsize, centery, 0 } };
		bezier(ctrls2, nControls, m, curve);
		polyline(&pb, curve, m, pBlue);
	}

	// one double peak through the middle
	Pt3 ctrls3[5] = { { centerx - xsize, centery, 0 }, { centerx-(xsize*0.3f), centery + ysize, 0 }, { centerx, centery - ysize, 0 }, { centerx+(xsize*0.3f), centery + ysize, 0 }, { centerx + xsize, centery, 0 } };
	int nctrls = 5;
	bezier(ctrls3, nctrls, m, curve);
	polyline(&pb, curve, m, pRed);

	// Now we have a simple image, so write it to a file
	int err = write_PPM("test_bezier.ppm", &pb);
}

From here, there are some interesting considerations. For example, you don’t want to calculate the coefficients every single time you draw a single curve. In terms of computer graphics, most Bezier curves consist of 3 or 4 points at most. Ideally, you would calculate the coefficients for those specific curves and store the values statically for later usage. This is what you’d do ideally for a small embedded library. The tradeoff in storage space is well worth the savings in compute time.

Additionally, instead of calculating all the line segments and then storing those values and using a polyline routine to draw things, you’d like want to simply have the bezier routine draw the lines directly. That would cut down on temporary allocations.

At the same time though, you want to retain the ‘computePoint’ function as independent because once you have a Bezier calculation function within the library, you’ll want to use it to do things other than just draw curved lines. Bezier, and it’s corollary Hermite, are good for calculating things like color ramps, motion, and other curvy stuff. This is all of course before you start using splines and nurbs, which are much more involved process.

There you have it, a couple of functions, and suddenly you have Bezier curves based on nothing more than the low level line drawing primitives. At the moment, this code sits in a test file, but soon enough I’ll move it into the graphicc library proper.


William Does Linux on Azure!

What?

You see, it’s like this.  As it turns out, a lot of people want to run code against a Linux kernel in the cloud.  Even though Windows might be a fine OS for cloud computing, the truth is, many customers are simply Linux savvy.  So, if we want to make those customers happy, then Linux needs to become a first class citizen in the Azure ecosystem.

Being a person to jump on technological and business related grenades, I thought I would join the effort within Microsoft to make Linux a fun place to be on Azure.  What does that mean?  Well, you can already get a Linux VM on Azure pretty easily, just like with everyone else.  But what added value is there coming from Microsoft so this isn’t just a simple commodity play?  Microsoft does in fact have a rich set of cloud assets, and not all of them are best accessed from a Linux environment.  This might mean anything from providing better access to Azure Active Directory, to creating new applications and frameworks altogether.

One thing is for sure.  As the Windows OS heads for the likes of the Raspberry Pi, and Linux heads for Azure, the world of computing is continuing to be a very interesting place.


Follow

Get every new post delivered to your Inbox.

Join 51 other followers