What The Functor? – Making them useful

Last time, I introduced the Functor in a rather convoluted way. I said that particular implementation may or may not be as performant as a straight up closure. Well, as it turns out, that’s going to be true most of the time. Why?

You see, when you use a table as a functor, every call forces the runtime to look up the table’s metatable and check whether the ‘__call’ metamethod is implemented. If it’s not implemented, an error is thrown; if it is, your operation is performed. That means an extra lookup and an extra function call on top of every single call you make through the functor. When this is on a hot path, it could be quite a bummer in terms of performance.

So, why bother with this particular convoluted construct for functor implementation? The beauty of the table technique is that your functor is actually a table that merely has the syntactic sugar of being callable like a function, so it has all the attributes of a table. For example, let’s imagine you want to associate some bit of information with your function, such as when it was created, or version information:

local f1 = Functor(somefunction)
f1.Creation = os.date()
f1.Version = "1.2"

That might be useful in a case where you want to refresh your functors if they’re getting to be too old, or you want to compare versions of functions to ensure they are what you expect. Very handy, yet very costly.
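
As a rough sketch of how that metadata might be put to use, the snippet below defines a pared-down stand-in for the table-based Functor from the previous post, then checks a version and an age. The helper `isStale`, its `maxAgeSeconds` parameter, and the use of a numeric `os.time()` timestamp instead of the `os.date()` string are assumptions made for the illustration.

```lua
-- Pared-down stand-in for the table-based Functor (illustration only).
local functor_mt = {
  __call = function(self, ...)
    if self.Target then
      return self.Func(self.Target, ...)
    end
    return self.Func(...)
  end,
}

local function Functor(func, target)
  return setmetatable({Func = func, Target = target}, functor_mt)
end

local function somefunction(x)
  return x * 2
end

local f1 = Functor(somefunction)
f1.Creation = os.time()   -- numeric timestamp, easier to compare than os.date()
f1.Version = "1.2"

-- Hypothetical helper: true when the functor is older than maxAgeSeconds.
-- `now` is optional and only exists to make the check easy to exercise.
local function isStale(f, maxAgeSeconds, now)
  return ((now or os.time()) - f.Creation) > maxAgeSeconds
end

-- The functor is still callable like a plain function...
local result = f1(21)

-- ...and its metadata can be inspected like any other table field.
local fresh = not isStale(f1, 60)
local staleLater = isStale(f1, 60, f1.Creation + 3600)
local versionOk = (f1.Version == "1.2")
```
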

Although this is useful, it does impose this overhead, so is there another way to implement the functor without incurring so much overhead?

Well, of course! If you’re willing to lose the table capabilities of your functor, but you want to retain the fact that you can associate some amount of state with your functor, then a closure is the way to go:

local Functor = function(func, target)
  if not target then return func end

  return function(...)
    return func(target,...)
  end
end

Yah, that’s right. After I bad-mouthed closures when I first introduced functors, here I am praising their value. And it’s true. If you’re willing to give up associating information with the functor beyond the passed-in state, this closure implementation is a lot faster when it comes time to actually execute the code.

Not only is this code faster, but it’s a lot easier to understand, easier to maintain, etc.
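
To get a feel for the difference on your own machine, a rough timing sketch along these lines can be used. The table-based version is a pared-down stand-in for the one from the previous post, and the names `TableFunctor`, `ClosureFunctor`, `timeCalls`, and `counter` are made up for this example. The numbers will vary with the Lua implementation (a JIT may shrink the gap), so treat this as a measurement aid rather than a benchmark.

```lua
-- Table-based functor: every call goes through the __call metamethod.
local functor_mt = {
  __call = function(self, ...)
    if self.Target then
      return self.Func(self.Target, ...)
    end
    return self.Func(...)
  end,
}

local function TableFunctor(func, target)
  return setmetatable({Func = func, Target = target}, functor_mt)
end

-- Closure-based functor, as shown above.
local function ClosureFunctor(func, target)
  if not target then return func end
  return function(...)
    return func(target, ...)
  end
end

-- Invented object whose method we bind both ways.
local counter = {count = 0}
function counter.bump(self, n)
  self.count = self.count + n
  return self.count
end

-- Call f(1) `iterations` times and return the elapsed CPU time in seconds.
local function timeCalls(f, iterations)
  local start = os.clock()
  for _ = 1, iterations do
    f(1)
  end
  return os.clock() - start
end

local N = 1000000
local viaTable = TableFunctor(counter.bump, counter)
local viaClosure = ClosureFunctor(counter.bump, counter)

local tableSeconds = timeCalls(viaTable, N)
local closureSeconds = timeCalls(viaClosure, N)

print(string.format("table functor:   %.3f s", tableSeconds))
print(string.format("closure functor: %.3f s", closureSeconds))
```
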

And how would I use this in real life? Well, at the core of TINN is the scheduler, as represented by the IOProcessor class. The scheduler is at the heart of what makes TINN tick, and thus it must be the most compact, performant bit of code that it can possibly be. At the same time, I want it to be flexible, because ultimately I need the ability to change the very core scheduling algorithms and other core features, without having to totally reengineer the codebase to make changes.

So, I’ve been tinkering.

I’ve recently rewritten parts of the scheduler to utilize functors and iterators. Here is the main loop as it stands today:

IOProcessor.start = function(self)
  self:addQuantaStep(Functor(self.stepIOEvents,self));
  self:addQuantaStep(Functor(self.stepTimeEvents,self));
  self:addQuantaStep(Functor(self.stepFibers,self));
  self:addQuantaStep(Functor(self.stepPredicates,self));


  self.ContinueRunning = true;

  when(Functor(self.noMoreTasks,self), Functor(self.stop, self))

  for astep in self:quantumSteps() do
    astep()
  end
end

IOProcessor.run = function(self, func, ...)
  if func ~= nil then
    self:spawn(func, ...);
  end

  self:start();
end

If you were writing a typical TINN based program where you wanted multi-tasking, you would be doing this:

local Task = require("IOProcessor")

local function main()
  -- after 5 seconds, stop the system
  delay(function(tick) Task:stop() end, 5 * 1000)

  -- print something every half second
  periodic(function(tick) print("Hello Again: ",tick) end, 500)
end

Task:run(main)

The IOProcessor.start() routine is littered with Functors! What are they all doing? OK, first with the ‘addQuantaStep’ calls. The quanta steps are the functions that are executed each time through the scheduler’s loop. You can see this in the iterator at the bottom:

  for astep in self:quantumSteps() do
    astep()
  end

The usual way you’d see this code written is as a while loop, or some other looping construct. In this case, I have used an iterator as the fundamental loop construct. That seems a bit novel, or at least slightly off the beaten path. Why bother with such a construct? Well, first of all, since it’s an iterator, the loop in ‘start()’ doesn’t actually know anything about the code that is being executed. It doesn’t know how long each step will take, and it doesn’t assume any side effects. This gives great flexibility, because changing the behavior of the main loop only requires changing the behavior of the iterator that is being used. You could easily modify the scheduler to utilize your own custom iterator if you like. The only assumption is that the items coming out of the iterator are functions that can be called, and that’s it.
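
As an illustration of that flexibility, here is a sketch of an alternative iterator that hands out each step for a fixed number of rounds and then ends, which might be handy for testing. `boundedSteps` and the sample steps are hypothetical names, not part of TINN; the loop at the bottom has the same shape as the scheduler’s.

```lua
-- Hypothetical alternative iterator: cycle through `steps` for a fixed
-- number of rounds, then return nil so the for loop terminates.
local function boundedSteps(steps, rounds)
  local index = 0
  local total = #steps * rounds

  return function()
    if index >= total then
      return nil
    end
    index = index + 1
    -- map 1, 2, 3, 4, ... onto 1..#steps, wrapping around
    return steps[((index - 1) % #steps) + 1]
  end
end

local log = {}
local steps = {
  function() table.insert(log, "io") end,
  function() table.insert(log, "timers") end,
}

-- Same shape as the scheduler's main loop: it only assumes that the
-- iterator yields callable things.
for astep in boundedSteps(steps, 2) do
  astep()
end

print(table.concat(log, ","))  -- io,timers,io,timers
```
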

The ‘addQuantaStep()’ function calls are simply a way to prepopulate the list of functions with a known set of things I want to do each time through the scheduler’s loop. I made this a function call because that way any bit of code could do the same thing before running the scheduler. Here the Functor is being used to ensure we capture the state that goes along with the function.

The last place the Functor is being used here is in the ‘when()’ call:

  when(Functor(self.noMoreTasks,self), Functor(self.stop, self))

“when there are no more tasks, stop”

The functors are used because the functions in question live within the IOProcessor table, and are called with state information, so it needs to be associated with them. As an aside, why not just make these functions flat? Why are they even associated with a table anyway? Well, because, even though at the moment you can run only a single scheduler in your process, there’s no reason in the world why this MUST be the case. With another couple of changes, you’ll actually be able to run multiple schedulers at the same time within a single process. Yep, that might be a tad bit useful. Kind of like running multiple processes at the OS level.
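
The following is not TINN code, just a sketch of why keeping the functions on a table matters, under the assumption that each scheduler keeps all of its state on its own instance. `MiniScheduler`, `tick`, and the field names are invented for the example; the Functor is the closure version from earlier in this post.

```lua
-- Closure-based Functor from earlier in this post.
local function Functor(func, target)
  if not target then return func end
  return function(...)
    return func(target, ...)
  end
end

-- Hypothetical miniature scheduler; only the parts needed to show that
-- state lives on the instance rather than in module-level variables.
local MiniScheduler = {}
MiniScheduler.__index = MiniScheduler

function MiniScheduler.new(name)
  return setmetatable({Name = name, Ticks = 0}, MiniScheduler)
end

function MiniScheduler.tick(self)
  self.Ticks = self.Ticks + 1
  return self.Name .. ":" .. self.Ticks
end

local a = MiniScheduler.new("a")
local b = MiniScheduler.new("b")

-- Each functor carries its own instance along with the shared method.
local tickA = Functor(a.tick, a)
local tickB = Functor(b.tick, b)

tickA()
tickA()
local lastB = tickB()

print(a.Ticks, b.Ticks, lastB)
```

Because the instance travels with the function, the two schedulers never see each other’s counters, which is the property you would need before running several of them side by side.
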

At any rate, just for kicks, here is what the iterator of those functors looks like:

IOProcessor.addQuantaStep = function(self, astep)
  table.insert(self.QuantaSteps,astep)
end

IOProcessor.quantumSteps = function(self)
  local index = 0;
  local listSize = #self.QuantaSteps;

  local closure = function()
    if not self.ContinueRunning then
      return nil;
    end

    index = index + 1;
		
    local astep = self.QuantaSteps[index]

    if (index % listSize) == 0 then
      index = 0
    end

    return astep
  end

  return closure	
end

Basically, just keep going through the list of steps over and over again until something sets ‘ContinueRunning’ to false.
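
To see the round-robin behavior in isolation, the following sketch copies the two functions above onto a bare stand-in table (not the real IOProcessor) and adds three steps, the last of which flips ‘ContinueRunning’ off after two passes. The step bodies, `sched`, `order`, and `passes` are invented for the demonstration.

```lua
-- Bare stand-in for IOProcessor with just the fields the iterator needs.
local sched = {QuantaSteps = {}, ContinueRunning = true}

sched.addQuantaStep = function(self, astep)
  table.insert(self.QuantaSteps, astep)
end

sched.quantumSteps = function(self)
  local index = 0
  local listSize = #self.QuantaSteps

  return function()
    if not self.ContinueRunning then
      return nil
    end

    index = index + 1
    local astep = self.QuantaSteps[index]

    -- wrap around once the last step has been handed out
    if (index % listSize) == 0 then
      index = 0
    end

    return astep
  end
end

local order = {}
local passes = 0

sched:addQuantaStep(function() table.insert(order, "A") end)
sched:addQuantaStep(function() table.insert(order, "B") end)
sched:addQuantaStep(function()
  passes = passes + 1
  if passes == 2 then
    sched.ContinueRunning = false  -- stop after the second full pass
  end
end)

for astep in sched:quantumSteps() do
  astep()
end

print(table.concat(order, ""))  -- ABAB
```

Note that the list size is captured when the iterator is created, so in this form any step added after the loop has started would not be picked up.
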

Having these steps represented as an iterator satisfies the needs that I have for running a fairly dynamically configurable main loop. While I was doing this, I was also considering what implications this might have for doing other things with the iterator. For example, is there something I can do with functional programming? The recent Lua Fun library is all about iterators, and I’m wondering if I can do something with that.
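
Without reaching for Lua Fun itself, the general idea can be sketched by hand. The snippet below is not that library’s API; `mapSteps`, `each`, and `counted` are made-up names. It shows a tiny combinator that wraps every function coming out of a step iterator, here to count how often steps run.

```lua
-- Hand-rolled combinator (not Lua Fun's API): returns a new iterator that
-- passes each value from `iter` through `fn` before handing it out.
local function mapSteps(fn, iter)
  return function()
    local astep = iter()
    if astep == nil then
      return nil
    end
    return fn(astep)
  end
end

-- Simple source iterator: hands out each element of `list` once.
local function each(list)
  local i = 0
  return function()
    i = i + 1
    return list[i]
  end
end

local calls = 0
local output = {}

-- Decorator: wrap a step so that every invocation is counted.
local function counted(astep)
  return function(...)
    calls = calls + 1
    return astep(...)
  end
end

local steps = {
  function() table.insert(output, "io") end,
  function() table.insert(output, "fibers") end,
}

-- The loop is unchanged; only the iterator feeding it is different.
for astep in mapSteps(counted, each(steps)) do
  astep()
end

print(calls, table.concat(output, ","))
```
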

Functor and iterator in hand, I am going to look to streamline the scheduler even further. I am imagining the core of the scheduler is not more than 100 lines of code. The quantum steps, which is where all the real action is, could live separately, and be easily pluggable modules. That’s where this is headed, because I don’t know how to deal with all that complexity otherwise.

So, there you have it. Functors can be useful in practice, and not just a software engineering curiosity.


What the Functor! Again?

No, I’m not talking about George Clinton (probably related to Bill), but rather the function that can wrap functions.

What’s the point of this? In this case, I’ll start with the Functor class itself.

local Functor = {}
setmetatable(Functor, {
  -- calling the class itself, Functor(func, target), constructs an instance
  __call = function(self, ...)
    return self:create(...);
  end,
})

local Functor_mt = {
  __index = Functor,
  -- calling an instance invokes the wrapped function
  __call = function(self, ...)
    if self.Target then
      return self.Func(self.Target, ...)
    end

    return self.Func(...)
  end,
}

Functor.init = function(self, func, target)
  local obj = {
    Func = func;
    Target = target;
  }
  setmetatable(obj, Functor_mt)

  return obj;
end

Functor.create = function(self, func, target)
  return self:init(func, target)
end

Before explaining how this little thing works, here’s how it is used.

local function printSomething(words)
  print(words)
end

local f1 = Functor(printSomething)

f1("words to say")

OK. So, what’s the big deal? This is the most basic case, and there is no big deal. Basically the Functor wraps the function you pass to it when you construct it: f1 = Functor(printSomething)

Then, later, when you use the function: f1("words to say"), it’s as if you were calling the function directly. Why on earth would you ever want to do this?

Let’s imagine that you have a list of functions that you want to call from within the body of some other function.

-- func1, func2 and func3 stand for ordinary functions defined elsewhere
local listOfFuncs = {
  Functor(func1),
  Functor(func2),
  Functor(func3)
}

local function every(funclist)
  for _, fun in ipairs(funclist) do
    fun()
  end
end

every(listOfFuncs)

For regular functions, there’s no real benefit here to using a functor, and there’s all this added overhead.

Now here’s another case. Instead of the functions just lying around without any associated state, they are methods of ‘objects’. The trick is to get the object instance associated with the function pointer, while maintaining the same relatively easy syntax.

local methods = {}
local methods_mt = {
  __index = methods,
}

methods.init = function(self)
  local obj = {
    name = "foo",
    name2 = "bar",
  }
  setmetatable(obj, methods_mt)

  return obj;
end

methods.method1 = function(self)
  print(self.name)
end

methods.method2 = function(self)
  print(self.name2)
end

local m1 = methods:init();

local f2 = Functor(methods.method1, m1)
local f3 = Functor(m1.method2, m1)

f2();
f3();

In this case, I need to associate the function pointer with an instance of the methods ‘object’, so when the Functor is constructed, the second parameter, which is the instance of the object, is also passed along.

Now, back to the Functor code, when it comes time to execute the function, it does it one way or another depending on whether it has a target or not:

  __call = function(self, ...)
    if self.Target then
      return self.Func(self.Target, ...)
    end

    return self.Func(...)
  end,

If it has a target, it will pass that as the first parameter when calling the function. If there is no specific target, then it will just call the function, passing along the supplied parameters.
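
A small sketch makes the two paths concrete. It uses a condensed stand-in for the Functor above, and the `greeter` object and `add` function are invented for the example. Note that any extra arguments are forwarded after the target in the first case.

```lua
-- Condensed stand-in for the Functor class shown earlier.
local functor_mt = {
  __call = function(self, ...)
    if self.Target then
      return self.Func(self.Target, ...)
    end
    return self.Func(...)
  end,
}

local function Functor(func, target)
  return setmetatable({Func = func, Target = target}, functor_mt)
end

-- A method that expects `self` first, then an ordinary argument.
local greeter = {greeting = "Hello"}
function greeter.greet(self, name)
  return self.greeting .. ", " .. name
end

-- A plain function with no associated state.
local function add(a, b)
  return a + b
end

local withTarget = Functor(greeter.greet, greeter)
local withoutTarget = Functor(add)

local greetingText = withTarget("world") -- greeter.greet(greeter, "world")
local sum = withoutTarget(2, 3)          -- add(2, 3)

print(greetingText, sum)
```
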

So, this solves a very basic problem. Without the Functor construct, I would have to use a ‘closure’ for each function. A closure is essentially the same thing, except that the language itself keeps the information associated with the function. In Lua it does this by capturing the local variables the function refers to, which is convenient but leaves that state hidden inside the function where other code can’t easily get at it. The Functor construct allows you to do essentially what you would be doing with a closure, but by using an object, you can more tightly control exactly what’s going on, what’s in the structure, lifetime, and all that.
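
To make the comparison concrete, here is a sketch that binds the same method both ways. The `account` object is invented for the example and the Functor is a condensed stand-in for the class above. Both calls behave the same; the difference is that the functor’s state is reachable as ordinary fields, while the closure’s captured variable is not visible from outside (short of the debug library).

```lua
-- Condensed stand-in for the Functor class above.
local functor_mt = {
  __call = function(self, ...)
    if self.Target then
      return self.Func(self.Target, ...)
    end
    return self.Func(...)
  end,
}

local function Functor(func, target)
  return setmetatable({Func = func, Target = target}, functor_mt)
end

local account = {balance = 10}
function account.deposit(self, amount)
  self.balance = self.balance + amount
  return self.balance
end

-- Closure: `account` is captured as an upvalue of the anonymous function.
local viaClosure = function(...)
  return account.deposit(account, ...)
end

-- Functor: the same binding, stored as ordinary table fields.
local viaFunctor = Functor(account.deposit, account)

local afterClosure = viaClosure(5)   -- 15
local afterFunctor = viaFunctor(5)   -- 20

-- Only the functor lets other code look at, or swap, what it is bound to.
local inspectable = (viaFunctor.Target == account)
local other = {balance = 100}
viaFunctor.Target = other
local afterRetarget = viaFunctor(1)  -- 101, `account` is untouched
```
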

To use a closure, or to use a functor? Closures are an easy and natural part of the Lua language. Functors are a construct that may or may not be efficient in comparison. The Functor does solve a particular problem when it comes to sticking a function in a list with its attendant state.