When Is Software Engineering – Surely a database is required

So, I’ve gotten data, and presented it on a web page in JSON format.  If that’s not engineering, I’m not sure what is, but way, surely a database of sorts must be involved.

There are plenty of times in my code where I need to quickly filter some ‘records’ performing some activity only on those records that meet a particular criteria.  Given that Lua is table based, everything of interest becomes a ‘record’.  This applies to “classes” as well as the more garden variety of ‘records’ that might be streaming out of an actual database, or in my recent example, a simple iterator over the services on my machine.  It would be nice if I had some fairly straight forward way to deal with those records.   What I need is an iterator based query processor.

The requirements are fairly simple.  There are three things that are typical of record processors:

record source –  The source of data.  In my case, the source will be any iterator that feeds out simple key/value table structures.

projection – In database terminology, ‘projection’ is simply the list of fields that you want to actually present in the query results.  I might have a record that looks like this:

{name = "William", address="1313 Mockingbird Lane", occupation="enng"}

I might want to just retrieve the name though, so the projection would be simply:

{name = "William"}

filter – I want the ability to only retrieve the records that meet a particular criteria.

I will ignore aggregate functions, such as groupby, sort, and the like as those do not work particularly well with a streaming interface. What follows is a simple implementation of a query processor that satisfies the needs I listed above:

-- Query.lua
--

--[[
	the query function receives its parameters as a single table
	params.source - The data source.  It should be an iterator that returns
	table values

	params.filter - a function, that receives a single table value as input
	and returns a single table value as output.  If the record is 'passed' then
	it is returned as the return value.  If the record does not meet the filter
	criteria, then 'nil' will be returned.

	params.projection - a function to morph a single entry.  It receives a single
	table value as input, and returns a single table value as output.

	The 'filter' and 'projection' functions are very similar, and in fact, the
	filter can also be used to transform the input.  They are kept separate 
	so that each can remain fairly simple in terms of their implementations.
--]]

local query = function(params)
	if not params or not params.source then
		return false, "source not specified";
	end

	local nextRecord = params.source;
	local filter = params.filter;
	local projection = params.projection;


	local function closure()
		local record;

		if filter then
			while true do
				record = nextRecord();	
	
				if not record then
					return nil;
				end
				
				record = filter(self, record);

				if record then
					break;
				end
			end
		else
			record = nextRecord();
		end

		if not record then
			return nil;
		end

		if projection then
			return projection(self, record);
		end

		return record;
	end

	return closure;
end

-- A simple iterator over a table
-- returns the embedded table entries
-- individually.
local irecords = function(tbl)
	local i=0;

	local closure = function()
		i = i + 1;
		if i > #tbl then
			return nil;
		end

		return tbl[i];
	end

	return closure	
end

-- given a key/value record, and a filter table
-- pass the record if every field in the filtertable
-- matches a field in the record.
local recordfilter = function(record, filtertable)
	for key,value in pairs(filtertable) do
		if not record[key] then 
			print("record does not have field: ", key)
			return nil;
		end

		if tostring(record[key]) ~= tostring(value) then
			print(record[key], "~=", value);
			return nil;
		end
	end

	return record;
end

return {
  irecords = irecords,
  recordfilter = recordfilter,
  query = query,
}

The ‘query()’ function represents the bulk of the operation. The other two functions help in forming iterators and doing simple queries.

Here is one example of how it can be used:

-- test_query.lua
--

local JSON = require("dkjson");
local Query = require("Query");
local irecords = Query.irecords

local records = {
  {name = "William", address="1313 Mockingbird Lane", occupation = "eng"},
  {name = "Daughter", address="university", occupation="student"},
  {name = "Wife", address="home", occupation="changer"},
}

local test_query = function()
  local source = irecords(records);

  local res = {}

  for record in Query.query {
    source = source, 
	
    projection = function(self, record)
      return {name=record.name, address=record.address, };
    end,

    filter = function(self, record)
      if record.occupation == "eng" then
        return record;
      end
    end
  } do
    table.insert(res, record);
  end

  local jsonstr = JSON.encode(res, {indent=true});
  print(jsonstr);
end

test_query();

Which results in the following:

[{
    "name":"William",
    "address":"1313 Mockingbird Lane"
  }]

This uses the iterator, a specified filter, and projection. The query() function itself is an iterator, so it will iterate over the data source, and apply the filter and projection to each record, returning results. Nice and easy, very Lua like.

Now that I have a very rudimentary query processor, I can apply it to my web case. So, if I rewrite the web page that’s showing the services on my machine, and can deal with a little bit of query processing:

--[[
	Description: A very simple demonstration of one way a static web server
	can be built using TINN.

	In this case, the WebApp object is being used.  It is handed a routine to be
	run for every http request that comes in (HandleSingleRequest()).

	Either a file is fetched, or an error is returned.

	Usage:
	  tinn staticserver.lua 8080

	default port used is 8080
]]

local WebApp = require("WebApp")


local HttpRequest = require "HttpRequest"
local HttpResponse = require "HttpResponse"
local URL = require("url");
local StaticService = require("StaticService");
local SCManager = require("SCManager");
local JSON = require("dkjson");
local Query = require("Query");
local utils = require("utils");


local getRecords = function(query)
  local mgr, err = SCManager();
  local filter = nil;
  local queryparts;

  if query then
    queryparts = utils.parseparams(query);

    filter = function(self, record)
      return Query.recordfilter(record, queryparts);
    end
  end

  local res = {};

  for record in Query.query {
    source = mgr:services(), 
    filter = filter,
    } do
      table.insert(res, record);
  end
  return res;
end

local HandleSingleRequest = function(stream, pendingqueue)
	local request, err  = HttpRequest.Parse(stream);

	if not request then
		print("HandleSingleRequest, Dump stream: ", err)
		return 
	end

	local urlparts = URL.parse(request.Resource)
	local response = HttpResponse.Open(stream)

	if urlparts.path == "/system/services" then
		local res = getRecords(urlparts.query);
		local jsonstr = JSON.encode(res, {indent=true});

		--print("echo")
		response:writeHead("200")
		response:writeEnd(jsonstr);
	else
		response:writeHead("404");
		response:writeEnd();
	end

	-- recycle the stream in case a new request comes 
	-- in on it.
	return pendingqueue:Enqueue(stream)
end


--[[ Configure and start the service ]]
local port = tonumber(arg[1]) or 8080

Runtime = WebApp({port = port, backlog=100})
Runtime:Run(HandleSingleRequest);

Here I have introduced the ‘getRecords()’ function, which takes care of getting the raw records from the list of services, and running the query to filter for the ones that I might want to see. In this case, a filter is created if the user specifies something interesting in the url. Without a filter, the url is simply:

http://localhost:8080/system/services

In which case you’ll get the list of all services on the machine, regardless of their current running state.

If you wanted to filter for only the services that were currently running, you would specify a URL such as this:

http://localhost:8080/system/services?State=RUNNING

And if you want to look for a particular service, by name, you would do:

http://localhost:8080/system/services?ServiceName=ACPI

[{
    "ServiceType":"KERNEL_DRIVER",
    "ProcessId":0,
    "DisplayName":"Microsoft ACPI Driver",
    "ServiceName":"ACPI",
    "ServiceFlags":0,
    "State":"RUNNING"
  }]

Of course, you can also do simple combinations:

http://localhost:8080/system/services?State=RUNNING;ServiceType=KERNEL_DRIVER

This will return the list of all the kernel drivers that are currently running.

Of course, if you’re sitting on your local machine, you could bring up the TaskManager, export the list of services, import it into a real database/excel, and perform queries to your heart’s content…

This type of coding makes spelunking your system really easy. The fact that it’s available through a web interface opens up some possibilities in terms of display, interaction, and accessibility. Since the stream is just JSON, it could be fairly straight forward to present this information in a much more interesting form, perhaps by using d3 or webgl, or who knows what.

So, is this software engineering?

Having gone from a low level system call to a higher level web based interface with interactive query capabilities, I’d say it must be approaching the term. Perhaps the ‘engineering’ lies in the simplicity. Rather than this being a fairly large integrated system, it’s just a few lines of script code that ties together well.

I believe the “engineering”, and thus an “engineer” comes from being able to recognize the minimal amount of code necessary to get a job done. The “engineering” lies in the process of finding those minimal lines of code.



Leave a comment