Leap Motion Event Filtering

So, I’m always trying to make my code simpler: easier to understand, easier to maintain. With the Leap, or any input device, you are faced with a continuous stream of data. In many situations, you’d like to filter that stream and deal only with certain types of data. In one sense, you need a simple stream-based query processor.

I had the beginnings of such a thing, and I’ve now separated out the pieces. In the case of the Leap Motion, you are faced with streams of data that might look like this:

{
	"id":1237560,
	"r":[[0.444044,0.663489,-0.602169],[0.184129,-0.725287,-0.663367],[-0.876882,0.183687,-0.444227]],
	"s":762.482,
	"t":[5336.48,-24560.1,5768.29],
	"timestamp":13587071004,

	"hands":[{
		"id":4,
		"direction":[-0.0793992,0.899586,-0.427785],
		"palmNormal":[-0.16208,-0.432144,-0.886711],
		"palmPosition":[27.138,227.235,80.2504],
		"palmVelocity":[-136.716,-134.926,-359.534],
		"sphereCenter":[9.15823,202.468,9.29922],
		"sphereRadius":106.122,
		"r":[[0.989305,-0.132062,-0.0619254],[0.117032,0.97208,-0.203384],[0.0870557,0.193962,0.977139]],
		"s":1.45151,
		"t":[-18.2708,21.6366,-106.687]
	}],
	
	"pointables":[
		{
			"direction":[0.196259,0.670762,-0.715235],
			"handId":4,
			"id":7,
			"length":68.7964,
			"tipPosition":[61.3422,285.46,38.3742],
			"tipVelocity":[-184.398,-119.405,-322.679],
			"tool":false
		},
		{
			"direction":[0.0324904,0.792378,-0.609165],
			"handId":4,
			"id":3,
			"length":76.8893,
			"tipPosition":[14.7425,304.766,41.4163],
			"tipVelocity":[-229.246,-95.6285,-323.667],
			"tool":false
		}
	]
}

This data is representative of a single ‘frame’ coming off the device. As you can see, it’s hierarchical. That is, for every discrete frame, I get a grouping of hand information as well as pointables and possibly gestures. This is how the Leap Motion software packages things up.
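
To make the hierarchy concrete, here is a minimal sketch of walking one decoded frame. The table below is hand-built to mirror the JSON above (most fields are left out), and pointablesByHand() is a hypothetical helper of my own naming, not part of any Leap library. It uses the "handId" field on each pointable to group pointables under the hand they belong to.

```lua
-- A hand-built table shaped like the decoded frame above (values shortened).
local frame = {
  id = 1237560,
  hands = {
    { id = 4, sphereRadius = 106.122 },
  },
  pointables = {
    { id = 7, handId = 4, length = 68.7964, tool = false },
    { id = 3, handId = 4, length = 76.8893, tool = false },
  },
}

-- Hypothetical helper: group pointables under their hand, keyed by hand id.
local function pointablesByHand(frame)
  local groups = {}
  for _, hand in ipairs(frame.hands or {}) do
    groups[hand.id] = {}
  end
  for _, p in ipairs(frame.pointables or {}) do
    local group = groups[p.handId]
    if group then
      group[#group + 1] = p
    end
  end
  return groups
end

local groups = pointablesByHand(frame)
for handId, list in pairs(groups) do
  print(string.format("hand %d has %d pointable(s)", handId, #list))
end
```

Pointables whose hand is not present in the frame are simply skipped in this sketch; a real application might want to keep them in a separate bucket.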

In some applications, what I really want is a stream of events, not presented hierarchically. Additionally, I want to read that stream of events and easily filter it looking for particular patterns. Basically, I’d like to be able to write a program like the following, which does nothing more than print the hand information. In particular, I’m looking for the radius of the sphere, as well as the sphere center.

-- Entries in package.path are separated by ';'
package.path = package.path..";../?.lua"

local LeapInterface = require("LeapInterface");
local FrameEnumerator = require("FrameEnumerator");
local EventEnumerator = require("EventEnumerator");

local leap, err = LeapInterface();

-- assert() only uses its second argument as the message, so fold err into it
assert(leap, "Error Loading Leap Interface: "..tostring(err));

-- Print the sphere radius followed by the x, y, z of the sphere center
local printHand = function(hand)
  local c = hand.sphereCenter;
  local r = hand.sphereRadius;
  print(string.format("HAND: %3.2f  [%3.2f %3.2f %3.2f]", r, c[1], c[2], c[3]));
end

-- Only allow the hand events to come through
local handfilter = function(event)
  return event.palmNormal ~= nil
end

local main = function()
  for event in EventEnumerator(FrameEnumerator(leap), handfilter) do
    printHand(event);
  end	
end

run(main);

Easy to digest. Looking at the main() function, there is an iterator chain. The core is ‘FrameEnumerator(leap)’, which is the source. It is wrapped by the EventEnumerator() iterator, which consumes the frame enumerator and applies the ‘handfilter()’ function.
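
If you want to see the shape of that chain without a device attached, here is a minimal sketch. fakeFrames() and flatten() are hypothetical stand-ins I made up for illustration; they are not the FrameEnumerator and EventEnumerator modules, they only imitate the same pattern: a source closure that returns nil when it runs dry, wrapped by a closure that flattens and filters.

```lua
-- Hypothetical stand-in for the frame source: hands out prebuilt frame
-- tables one at a time, then nil once the list is exhausted.
local function fakeFrames(frames)
  local i = 0
  return function()
    i = i + 1
    return frames[i]
  end
end

-- Hypothetical stand-in for the event stage: flattens each frame's hands
-- and pointables into single events, keeping only those the filter accepts.
local function flatten(source, filter)
  local pending, head = {}, 1
  return function()
    -- Refill from the source until at least one event is waiting
    while pending[head] == nil do
      local frame = source()
      if frame == nil then return nil end
      pending, head = {}, 1
      for _, key in ipairs({ "hands", "pointables" }) do
        for _, event in ipairs(frame[key] or {}) do
          if not filter or filter(event) then
            pending[#pending + 1] = event
          end
        end
      end
    end
    local event = pending[head]
    head = head + 1
    return event
  end
end

local frames = {
  { hands = { { id = 4, palmNormal = { 0, -1, 0 } } },
    pointables = { { id = 7, handId = 4 } } },
  { hands = {}, pointables = { { id = 3, handId = 4 } } },
  { hands = { { id = 5, palmNormal = { 0, -1, 0 } } } },
}

local isHand = function(event) return event.palmNormal ~= nil end

local seen = {}
for event in flatten(fakeFrames(frames), isHand) do
  seen[#seen + 1] = event.id
end
print(table.concat(seen, ","))  -- 4,5
```

The second frame contributes nothing because its only event is a pointable, so the loop quietly moves on to the third frame before handing anything back.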

Looking at the FrameEnumerator:

-- FrameEnumerator.lua
local JSON = require("dkjson");
local ffi = require("ffi");

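local FrameEnumerator = function(interface)
  -- Create the raw frame iterator once; each call pulls the next frame from it
  local nextraw = interface:RawFrames();

  local closure = function()
    local rawframe = nextraw();
    if not rawframe then
      return nil;
    end

    -- Decode the JSON text into a Lua table
    local frame = JSON.decode(ffi.string(rawframe.Data, rawframe.DataLength));
    return frame;
  end

  return closure;
end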

return FrameEnumerator

Not very much code at all, and borrowed from the FrameObserver object in a previous article. This enumerator has the sole purpose of turning the raw strings that arrive from the WebSocket interface into a stream of discrete table objects. It does that by calling JSON.decode(), which takes a JSON formatted string and turns it into a Lua table object. It just hands those out until it can’t hand out any more.

The EventEnumerator is a little bit more complex, but not much. Its sole purpose in life is to take Lua tables, which presumably have the format of these Leap Motion frames, and turn them into discrete events, basically flattening the hierarchical data structure. In addition to flattening the data structure, you can apply the filter, thus throwing away bits of the frame that are not going to be processed any further:

-- EventEnumerator.lua
local Collections = require("Collections");

local EventEnumerator = function(frameemitter, filterfunc)
  local eventqueue = Collections.Queue.new();

  local addevent = function(event)
    if filterfunc then
      if filterfunc(event) then
        eventqueue:Enqueue(event);
      end
    else
      eventqueue:Enqueue(event);
    end
  end


  local closure = function()
    -- If the event queue is empty, then grab the
    -- next frame from the emitter, and turn it into
    -- discrete events.
    while eventqueue:Len() < 1 do
      local frame = frameemitter();
      if frame == nil then
        return nil
      end

      if frame.hands ~= nil then
        for _,hand in ipairs(frame.hands) do
          addevent(hand);
        end
      end

      if frame.pointables ~= nil then
        for _,pointable in ipairs(frame.pointables) do
          addevent(pointable);
        end
      end

      if frame.gestures then
        for _,gesture in ipairs(frame.gestures) do
          addevent(gesture);
        end
      end						
    end

    return eventqueue:Dequeue();
  end	

  return closure;
end

return EventEnumerator

Stitch it all together and the first program actually works: you get a stream of ‘hand’ events. The beauty of this system is that you can string further iterators together in the same way. At the lowest level, you can easily filter to find hands, fingers, tools, gestures, and the like. You might also notice something else special about this. The ‘filter’ is just Lua code, so it can be as complex or as simple as you want. In this particular case, I just wanted to check for the existence of a single field. If I thought the amount of data was too much and wanted to cut it in half, I could easily have kept a counter and only returned every other event. You could go further and not just return true/false, but alter the event in some way as well. But the true power comes from composition. Rather than making a single, more complex filter, I’d rather make more filters and string them together.
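
As a rough illustration of that last point, here is a sketch of two small filter builders. allOf() and everyOther() are names I invented for this example; neither exists in the code above. The first combines several filters into one, the second is the "keep a counter" idea turned into a stateful filter.

```lua
-- Build a filter that passes only when every given filter passes.
local function allOf(...)
  local filters = { ... }
  return function(event)
    for _, f in ipairs(filters) do
      if not f(event) then return false end
    end
    return true
  end
end

-- Build a stateful filter that lets through every other event it sees.
local function everyOther()
  local count = 0
  return function(event)
    count = count + 1
    return count % 2 == 1
  end
end

local isHand = function(event) return event.palmNormal ~= nil end

-- Order matters: only hand events ever reach the counter.
local filter = allOf(isHand, everyOther())

local events = {
  { id = 1, palmNormal = { 0, -1, 0 } },
  { id = 2, tipPosition = { 0, 0, 0 } },
  { id = 3, palmNormal = { 0, -1, 0 } },
  { id = 4, palmNormal = { 0, -1, 0 } },
}

local kept = {}
for _, event in ipairs(events) do
  if filter(event) then kept[#kept + 1] = event.id end
end
print(table.concat(kept, ","))  -- 1,4
```

Because allOf() stops at the first filter that says no, the pointable (id 2) never advances the counter, so the halving applies to hands only. A combined filter like this could be passed wherever ‘handfilter’ is used.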

Using this simple concept of Enumeration, I can imagine performing a whole bunch of simple, and even complex operations, simply by tying the proper enumerators, observers, queues and the like together.


Handy Mouse – Using the Leap as a mouse controller

By using the Leap Motion, I have finally taken the step of getting into “Gesturing”.  By that I just mean using my hands and fingers in ways other than keyboard and mouse.  Of course the Leap Motion allows you to point, with a stick, your fingers, and what have you.

As a basic input device, it spews a stream of data to the developer, who must then make sense of it.  This is true of all input devices, even going down to the lowly mouse.  The difference with more traditional devices is their input stream has been well understood and mapped to models that work for existing applications.  Mouse spew: x,y,wheel, button and that’s about it.  Keyboard: keycode, up/down, led state.

The Leap Motion and other such input devices are a bit different.  Their data streams, and how they map to applications, are different.  What does waving your hand wildly do?  What does swiping in the form of an “S” curve do?  Unknown.  Furthermore, how do these motions map to a traditional application?  Should they?

Well, one mapping that I have chosen to take a look at is mouse movement. Seems simple enough.  Basically, pick up a pencil, or chopstick, and use it to point at the screen, move around, and ‘click’ by doing quick dips of the tip.  Seems natural enough.  Here’s a bit of code to achieve that:

 

-- pointertrack.lua
-- Entries in package.path are separated by ';'
package.path = package.path..";../?.lua"

local LeapScape = require ("LeapScape");
local MouseBehavior = require("MouseBehavior");
local UIOSimulator = require("UIOSimulator");

local scape, err = LeapScape();

if not scape then 
  print("No LeapScape: ", err)
  return false
end

local OnMouseMove = function(param, x, y)
  UIOSimulator.MouseMove(x, y);
end

local mousetrap = MouseBehavior(scape, UIOSimulator.ScreenWidth, UIOSimulator.ScreenHeight);
mousetrap:AddListener("mouseMove", OnMouseMove, nil);

scape:Start();

run();

Of course this uses the TINN runtime, and can be found in the Leap TINNSnip project.

What’s going on here? First of all, create the LeapScape object. Then create a function which will turn x,y, coordinates into simulated mouse movements. Lastly, create an instance of the ‘MouseBehavior’ object, and add the OnMouseMove() function as a listener of the “mouseMove” event. Start the LeapScape, and run() the program.
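
I haven’t shown the inside of MouseBehavior, so take the following only as a sketch of the listener pattern that AddListener() implies. The Emitter table and its Fire() method are hypothetical names of my own; the one thing borrowed from the code above is the calling convention, where the ‘param’ given at registration comes back as the first argument of the callback.

```lua
local Emitter = {}
Emitter.__index = Emitter

local function newEmitter()
  return setmetatable({ listeners = {} }, Emitter)
end

-- Register func under an event name; param is handed back as the
-- first argument on every call.
function Emitter:AddListener(name, func, param)
  local list = self.listeners[name]
  if not list then
    list = {}
    self.listeners[name] = list
  end
  list[#list + 1] = { func = func, param = param }
end

-- Call every listener registered for the named event.
function Emitter:Fire(name, ...)
  for _, entry in ipairs(self.listeners[name] or {}) do
    entry.func(entry.param, ...)
  end
end

local moves = {}
local emitter = newEmitter()
emitter:AddListener("mouseMove", function(param, x, y)
  moves[#moves + 1] = string.format("%s:%d,%d", param, x, y)
end, "log")

emitter:Fire("mouseMove", 10, 20)
emitter:Fire("mouseClick", 1, 1)  -- no listeners registered, nothing happens
print(moves[1])                   -- log:10,20
```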

You do this from a command line. Once the program is running, you can point the stick around the screen and see that the cursor will follow it.

But first, you have to create a calibration that allows you to constrain the movement. With the Leap Motion, you get a roughly 2’x2’x2’ volume in which it can detect various movements. What you need to do is create a mapping from a portion of that space into your screen coordinates. So, first you run the mousetrain.lua program, which looks like this:

--package.path = package.path.."../../?.lua"
package.path = package.path..";../?.lua"

local LeapScape = require ("LeapScape");
local FrameObserver = require("FrameObserver");
local UIOSimulator = require("UIOSimulator");
local StopWatch = require("StopWatch");
local GDI32 = require ("GDI32");
local FileStream = require("FileStream");


--[[
	Map a value from one range to another
--]]
local mapit = function(x, minx, maxx, rangemin, rangemax)
  return rangemin + (((x - minx)/(maxx - minx)) * (rangemax - rangemin))
end

--[[
	Clamp a value to a range
--]]
local clampit = function(x, minx, maxx)
  if x < minx then return minx end
  if x > maxx then return maxx end

  return x
end

local main = function()
  local scape, err = LeapScape();

  if not scape then 
    print("No LeapScape: ", err)
    return false
  end

  local fo = FrameObserver(scape);

  local sensemin = {math.huge, math.huge, math.huge}
  local sensemax = {-math.huge, -math.huge, -math.huge}

  -- We'll use this to do some drawing on the screen
  local hdcScreen = GDI32.CreateDCForDefaultDisplay();


  local busywait = function(millis)
    -- Keep the stopwatch local so it does not leak into the global table
    local sw = StopWatch.new();

    while true do
      if sw:Milliseconds() > millis then
        break
      end
      coroutine.yield();
    end
  end

  local drawTarget = function(originx, originy, width, height)
    local brushColor = RGB(255,0,0);

    -- Center the target on the requested origin
    local x = originx - width/2;
    local y = originy - height/2;

    x = clampit(x, 0, UIOSimulator.ScreenWidth-1 - width);
    y = clampit(y, 0, UIOSimulator.ScreenHeight-1 - height);

    local right = x + width
    local bottom = y + height
--print(x,y,width,height)
    hdcScreen:SetDCBrushColor(brushColor)
    hdcScreen:RoundRect(x, y, right, bottom, 4, 4)
  end

  local observerange = function(param, event)
    local newvalue = false
    local tp = event.tipPosition;

    sensemin[1] = math.min(tp[1], sensemin[1])
    sensemin[2] = math.min(tp[2], sensemin[2])
    sensemin[3] = math.min(tp[3], sensemin[3])

    sensemax[1] = math.max(tp[1], sensemax[1])
    sensemax[2] = math.max(tp[2], sensemax[2])
    sensemax[3] = math.max(tp[3], sensemax[3])
  end

  local dwellAtPosition = function(x, y)
    drawTarget(x,y, 32,32);
    busywait(500);
    fo:AddPointerObserver(observerange, nil)
    busywait(1000);
    fo:RemovePointerObserver(observerange, nil)
  end

  local matchTargets = function()
    dwellAtPosition(0, UIOSimulator.ScreenHeight-1);
    dwellAtPosition(0, 0);
    dwellAtPosition(UIOSimulator.ScreenWidth-1, 0);
    dwellAtPosition(UIOSimulator.ScreenWidth-1, UIOSimulator.ScreenHeight-1);
  end

  local writeConfig = function()
    local fs = FileStream.Open("sensor.cfg");

    local output = {
      string.format("do return {");
      string.format("sensemin = {%3.2f, %3.2f, %3.2f};", sensemin[1], sensemin[2], sensemin[3]);
      string.format("sensemax={%3.2f, %3.2f, %3.2f};", sensemax[1], sensemax[2], sensemax[3]);
      string.format("} end");
    }
    output = table.concat(output,"\n");
    fs:WriteString(output);
    fs:Close();
  end

  scape:Start();
  matchTargets();
  writeConfig();
  stop();
end

run(main);

This snippet will place a target in each of the 4 corners of the screen. You point at it for a bit, and then move on to the next one. Easy enough. Once you’ve done that, it will write this range information out to a configuration file. Then you can use the pointertrack.lua program to your heart’s content.
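
To show how that range information might be consumed, here is a sketch that loads text in the same format writeConfig() produces and maps a tip position onto the screen. The numbers in the config text, the screen size, and the tipToScreen() helper are all invented for the example, and I am assuming the sensor’s y axis points up while the screen’s y axis points down. This is not the actual MouseBehavior code.

```lua
-- Example config text in the format written by writeConfig();
-- the numbers are invented for illustration.
local configText = [[
do return {
sensemin = {-120.00, 80.00, -60.00};
sensemax={130.00, 330.00, 90.00};
} end
]]

-- loadstring exists in Lua 5.1 / LuaJIT, load accepts strings in 5.2+
local compile = loadstring or load
local config = assert(compile(configText))()

-- Map a value from one range to another
local function mapit(x, minx, maxx, rangemin, rangemax)
  return rangemin + ((x - minx) / (maxx - minx)) * (rangemax - rangemin)
end

-- Clamp a value to a range
local function clampit(x, minx, maxx)
  if x < minx then return minx end
  if x > maxx then return maxx end
  return x
end

-- Hypothetical screen size
local screenWidth, screenHeight = 1920, 1080

-- Hypothetical helper: turn a tipPosition into screen coordinates
local function tipToScreen(tip)
  local x = mapit(tip[1], config.sensemin[1], config.sensemax[1],
                  0, screenWidth - 1)
  -- Assumption: sensor y grows upward, screen y grows downward, so flip it
  local y = mapit(tip[2], config.sensemin[2], config.sensemax[2],
                  screenHeight - 1, 0)
  return clampit(x, 0, screenWidth - 1), clampit(y, 0, screenHeight - 1)
end

print(tipToScreen({ 5, 205, 0 }))      -- middle of the calibrated range
print(tipToScreen({ -500, 1000, 0 }))  -- outside the range, clamped
```

Clamping after the mapping means a pointer that strays outside the calibrated box pins the cursor to the nearest screen edge rather than producing off-screen coordinates.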

There is a ‘MouseBehavior’ object, which is a work in progress. It basically filters through the stream of events coming off the Leap device and determines what is ‘move’, what is ‘click’ and the like, and fires off the appropriate event to whomever may be observing. Oh yes, the IObservable/IEnumerable thing comes home to roost.

The Behaviors are an interesting place to be. It really makes you think. What is a “mouse click”? Is it a dip of the pointing device? By how much, and how long? Is there a ‘right click’? How can I simulate the wheel? Is it perhaps a circular motion? This is where being able to annotate a gesture, and then subsequently search for that pattern in the data stream becomes really interesting.
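
Just to make the "by how much, and how long" question tangible, here is one possible answer as a sketch. newDipDetector() is a hypothetical function, and both thresholds are guesses rather than tuned values. It treats a click as a drop of at least ‘depth’ below the highest recent tip height, followed by a return to within half that depth before ‘maxMillis’ has passed.

```lua
-- Hypothetical dip detector. depth is in the same units as the tip height,
-- maxMillis in milliseconds; both are guesses, not tuned values.
local function newDipDetector(depth, maxMillis)
  local state = "idle"
  local baseline, startTime

  -- Feed one sample; returns true when a completed dip is recognised.
  return function(y, timeMillis)
    if state == "idle" then
      if baseline == nil or y > baseline then
        baseline = y                 -- follow the tip upward
      elseif baseline - y >= depth then
        state = "dipped"             -- dropped far enough, start timing
        startTime = timeMillis
      end
      return false
    end

    -- state == "dipped": wait for the tip to come back up
    if timeMillis - startTime > maxMillis then
      state = "idle"                 -- too slow, treat it as a plain move
      baseline = y
      return false
    end
    if baseline - y <= depth / 2 then
      state = "idle"
      baseline = y
      return true
    end
    return false
  end
end

-- Each sample is { tip height, time in milliseconds }
local samples = {
  { 200, 0 }, { 198, 10 }, { 175, 20 }, { 178, 60 }, { 195, 100 }, -- quick dip
  { 200, 110 }, { 170, 200 }, { 172, 600 }, { 199, 700 },          -- slow dip
}

local detect = newDipDetector(20, 300)
local clicks = {}
for _, s in ipairs(samples) do
  if detect(s[1], s[2]) then clicks[#clicks + 1] = s[2] end
end
print(#clicks, clicks[1])  -- 1   100
```

Only the first dip counts; the second takes too long to come back up, so it is read as ordinary movement. A right click or a wheel gesture would need its own pattern along the same lines.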

For now, I’m happy enough to get basic mouse movement. Soon enough though, chording gestures should be close at hand.


Taming the Leap Motion

I have been playing with the Leap Motion input device of late.  If you haven’t seen one of these, it’s worth taking a look at the video demonstrations.  The Leap joins the likes of the Kinect in offering an alternative form of input to the standard mouse and keyboard.  But, that’s about where the similarities end.

Whereas the Kinect is good at large body movements of multiple players at a distance, it’s not so great at fine hand tracking at a close distance.  This is the specialty of the Leap Motion.  You can point at your screen, move your hands around, tap your fingers and the like.

It’s kind of like turning every monitor into a touch screen, except it provides a larger volume than the flat surface of the screen.

The Leap Motion comes with an SDK for developers to write whatever it is they’re going to write with this new input device.  The core of the SDK is a library written in C++.  That’s great for languages such as Python, Java, C# and the like which can use SWIG to turn the header file into something they can interop with.  This approach might work for Lua as well, but C++ is just a pain to deal with from these dynamic languages.

I want to play with the Leap Motion, but I don’t want to do it through the supplied SDK as C++ just doesn’t work for me.  Fortunately, the Leap Motion provides a different mechanism.  The Leap Motion offers up a WebSocket server interface.  So, you can ‘talk’ to it by simply opening up a WebSocket and streaming the results.

Alrighty then, how to get started?

Well, first of all, the URL for the device is: “ws://127.0.0.1:6437/”

I have a hacked together WebSocket object that’s good enough to establish a connection to this service endpoint.  So, I do that, and encapsulate the connection in a LeapInterface_t object.  This interface serves out a constant stream of JSON formatted data.  To make things simple for Lua consumption, I put an iterator interface on this:

LeapInterface_t.RawFrames = function(self)
  local closure = function()
    local frame, err = self.SocketStream:ReadFrame();
    if not frame then
      return nil, err
    end

    return frame;
  end

  return closure;
end

With this alone, you could do something like:

local leap = LeapInterface()
for rawframe in leap:RawFrames() do
  print(rawframe);
end

That’s pretty much all you need to see the stream of frames coming out of the Leap Motion. But, that’s not particularly useful. There’s a ton of information you can get out of this device. You can get hand tracking, finger and ‘tool’ tracking, and even ‘gestures’. It’s almost too much information to make sense of. So, a little bit of organization is in order.

First, I create an object that will be the central focus of all things Leap Motion. This is the LeapScape object. I chose this name because I conceive of the area the Leap Motion can watch and report on as the LeapScape. There can be hands, pointables, gestures, and the like. I need something to control and coordinate all these things. It looks like this:

local ffi = require("ffi");	-- needed for ffi.string() in ProcessRawFrames
local Collections = require("Collections");
local LeapInterface = require("LeapInterface");
local JSON = require("dkjson");

local LeapScape_t = {}
local LeapScape_mt = {
	__index = LeapScape_t,
}

local LeapScape = function()
	local interface, err = LeapInterface({enableGestures=true});

	if not interface then
		return nil, err;
	end

	local obj = {
		Interface = interface;

		Hands = {};
		Pointables = {};
		Frames = {};

		FrameQueue = Collections.Queue.new();
		ContinueRunning = false;
	};

	setmetatable(obj, LeapScape_mt);

	return obj;
end

LeapScape_t.Start = function(self)
	self.ContinueRunning = true;

	Runtime.Scheduler:Spawn(LeapScape_t.ProcessFrames, self)
	Runtime.Scheduler:Spawn(LeapScape_t.ProcessRawFrames, self);
end

LeapScape_t.ProcessRawFrames = function(self)

	for rawframe in self.Interface:RawFrames() do
		local frame = JSON.decode(ffi.string(rawframe.Data, rawframe.DataLength));
		self.FrameQueue:Enqueue(frame);

		coroutine.yield();
	end
end

LeapScape_t.ProcessFrames = function(self)
	while self.ContinueRunning do
		-- get a frame off the queue
		local frame = self.FrameQueue:Dequeue()
		if frame then
			if frame.gestures then
				for _, gesture in ipairs(frame.gestures) do
					self:OnGesture(gesture);
				end
			end
		end
		coroutine.yield();
	end
end

LeapScape_t.OnGesture = function(self, gesture)
  if self.GestureHandler then
    self.GestureHandler:OnGesture(gesture);
  end
end

-- Hand the constructor back to require("LeapScape")
return LeapScape

Another piece, which focuses specifically on gesture handling, is the “GestureHandler” class. This is fairly straightforward as well. It takes a “gesture”, which is just a table decoded from the JSON, and performs various actions based on what it reads there. One of the things it does is provide some order to the input. For example, it’s totally possible that while you’re processing a ‘swipe’, you get some intervening ‘screenTap’ and ‘keyTap’ gestures, which may not be desirable. So, the GestureHandler will create some order by only allowing swipe ‘update’ and ‘stop’ gestures to filter through, until the swipe is completed.

local GestureHandler_t = {}
local GestureHandler_mt = {
	__index = GestureHandler_t;
}

local GestureHandler = function()
	local obj = {
		CurrentGesture = "none";
	}

	setmetatable(obj, GestureHandler_mt);

	return obj;
end

GestureHandler_t.OnGesture = function(self, gesture)
	--print("==== GESTURE ====")
	--print("type: ", gesture.type, gesture.state);

	if gesture.type == "screenTap" then
		self:HandleScreenTap(gesture)
	elseif gesture.type == "keyTap" then
		self:HandleKeyTap(gesture)
	elseif gesture.type == "swipe" then
		self:HandleSwipe(gesture);
	elseif gesture.type == "circle" then
		self:HandleCircle(gesture);
	end
end

GestureHandler_t.HandleScreenTap = function(self, gesture)
	if self.CurrentGesture == "none" then
		if self.OnScreenTap then
			self.OnScreenTap(gesture);
		end
	end
end

GestureHandler_t.HandleKeyTap = function(self, gesture)
	if self.CurrentGesture == "none" then
		if self.OnKeyTap then
			self.OnKeyTap(gesture);
		end
	end
end

GestureHandler_t.HandleCircle = function(self, gesture)
	if not (self.OnCircleBegin or self.OnCircling or self.OnCircleEnd) then
		return
	end

	if self.CurrentGesture == "circle" then
		if gesture.state == "stop" then
			if self.OnCircleEnd then
				self.OnCircleEnd(gesture);
			end
			self.CurrentGesture = "none";
		elseif gesture.state == "update" then
			if self.OnCircling then
				self.OnCircling(gesture)
			end
		end
	elseif self.CurrentGesture == "none" then
		self.CurrentGesture = "circle";
		if self.OnCircleBegin then
			self.OnCircleBegin(gesture)
		end
	end
end

GestureHandler_t.HandleSwipe = function(self, gesture)
	if not (self.OnSwipeBegin or self.OnSwiping or self.OnSwipeEnd) then
		return
	end

	if self.CurrentGesture == "swipe" then
		if gesture.state == "stop" then
			if self.OnSwipeEnd then
				self.OnSwipeEnd(gesture);
			end
			self.CurrentGesture = "none";
		elseif gesture.state == "update" then
			if self.OnSwiping then
				self.OnSwiping(gesture)
			end
		end
	elseif self.CurrentGesture == "none" then
		self.CurrentGesture = "swipe";
		if self.OnSwipeBegin then
			self.OnSwipeBegin(gesture)
		end
	end
end

return GestureHandler

OK. So, there’s the lowest level WebSocket, the subsequent LeapInterface, the enclosing LeapScape, and finally the Gesture Handler. Pulling it all together in a little demo program, you get:

local Runtime = require("Runtime");
local LeapScape = require ("LeapScape");
local GestureHandler = require("GestureHandler");

local printDict = function(dict)
	for k,v in pairs(dict) do
		print(k,v)
	end
end

local OnSwipeBegin = function(gesture)
	local p = gesture.position;
	print("============")
	print("SWIPE BEGIN: ", p[1], p[2], p[3])
end

local OnSwipeEnd = function(gesture)
	local p = gesture.position;
	local d = gesture.direction;

	print("SWIPE END: ", p[1], p[2], p[3]);
	print("Direction: ", d[1], d[2], d[3]);
	print("Speed: ", gesture.speed);
end

local OnCircleBegin = function(gesture)
	local p = gesture.position;
	print("============")
	print("CIRCLE BEGIN: ")
end

local OnCircling = function(gesture)
	local n = gesture.normal;
	local direction = "ccw";
	if n[1] <0 and n[3] < 0 then
		direction = "cw"
	end

	print(string.format("CIRCLING: %f %s", gesture.progress, direction));
	--printDict(gesture);
end

local OnCircleEnd = function(gesture)
  local c = gesture.center;
  print(string.format("CIRCLE END: %f [%f, %f, %f]", gesture.radius, c[1], c[2], c[3]));
end

local OnScreenTap = function(gesture)
  local p = gesture.position;
  print("SCREEN TAP: ", p[1], p[2], p[3]);
end

local OnKeyTap = function(gesture)
  local p = gesture.position;
  print("KEY TAP: ", p[1], p[2], p[3]);
end

local main = function()
  local scape = LeapScape();
  local ghandler = GestureHandler();

  -- Swipes
  --ghandler.OnSwipeEnd = OnSwipeEnd;
  --ghandler.OnSwipeBegin = OnSwipeBegin;

  -- Circles
  ghandler.OnCircleBegin = OnCircleBegin;
  ghandler.OnCircling = OnCircling;
  ghandler.OnCircleEnd = OnCircleEnd;

  -- Taps
  --ghandler.OnKeyTap = OnKeyTap;
  --ghandler.OnScreenTap = OnScreenTap;

  scape.GestureHandler = ghandler;

  spawn(scape.Start, scape)
end

run(main);

In this particular case, I’m just tracking the ‘circle’ gesture. If you were doing a real application, you would probably tie this circle gesture to something meaningful in the application, like turning a knob perhaps, or rotating something one way or the other. Notice that within ‘main’, at the very end, the scape.Start routine is spawned. That means that it will run in parallel to any other spawned fibers running in the application, including things like network communications and UI.
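
For anyone unfamiliar with the idea, here is a sketch of what I mean by fibers taking turns, using nothing but plain Lua coroutines. I am assuming TINN’s spawned fibers behave like cooperatively scheduled coroutines, which is what the coroutine.yield() calls in LeapScape suggest. runAll() is a toy scheduler I made up for this example; it is not TINN’s scheduler.

```lua
local queue = {}
local log = {}

-- Plays the role of ProcessRawFrames: enqueue a frame, then give up the turn
local function producer()
  for i = 1, 3 do
    queue[#queue + 1] = "frame" .. i
    coroutine.yield()
  end
end

-- Plays the role of ProcessFrames: take a frame if there is one, then yield
local function consumer()
  local handled = 0
  while handled < 3 do
    local frame = table.remove(queue, 1)
    if frame then
      handled = handled + 1
      log[#log + 1] = frame
    end
    coroutine.yield()
  end
end

-- Toy scheduler: resume each live fiber in turn until all have finished
local function runAll(...)
  local fibers = {}
  for _, f in ipairs({ ... }) do
    fibers[#fibers + 1] = coroutine.create(f)
  end
  while #fibers > 0 do
    -- Walk backwards so finished fibers can be removed safely
    for i = #fibers, 1, -1 do
      local fiber = fibers[i]
      assert(coroutine.resume(fiber))
      if coroutine.status(fiber) == "dead" then
        table.remove(fibers, i)
      end
    end
  end
end

runAll(consumer, producer)
print(table.concat(log, " "))  -- frame1 frame2 frame3
```

Nothing here runs at the same instant. Each fiber does a little work and yields, which is why a fiber that never yields would starve everything else.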

The Leap Motion is a nifty device. It presents quite a lot of information to the programmer. Having a fairly simple websocket based interface available makes it relatively easy to do programming with it. I did not require a ‘C’ interface in the end, but just did whatever I wanted with the raw information provided.

I’m sure there will be some very interesting applications created with this device. Using TINN, this becomes a snap as the code here is a TINNSnip located here: leaper