When Is Software Engineering – Surely a database is required
Posted: May 14, 2013 Filed under: Lua, LuaJIT, Microsoft, Network Programming, System Programming, TINN | Tags: lua, query, tinn Leave a commentSo, I’ve gotten data, and presented it on a web page in JSON format. If that’s not engineering, I’m not sure what is, but way, surely a database of sorts must be involved.
There are plenty of times in my code where I need to quickly filter some ‘records’ performing some activity only on those records that meet a particular criteria. Given that Lua is table based, everything of interest becomes a ‘record’. This applies to “classes” as well as the more garden variety of ‘records’ that might be streaming out of an actual database, or in my recent example, a simple iterator over the services on my machine. It would be nice if I had some fairly straight forward way to deal with those records. What I need is an iterator based query processor.
The requirements are fairly simple. There are three things that are typical of record processors:
record source – The source of data. In my case, the source will be any iterator that feeds out simple key/value table structures.
projection – In database terminology, ‘projection’ is simply the list of fields that you want to actually present in the query results. I might have a record that looks like this:
{name = "William", address="1313 Mockingbird Lane", occupation="enng"}
I might want to just retrieve the name though, so the projection would be simply:
{name = "William"}
filter – I want the ability to only retrieve the records that meet a particular criteria.
I will ignore aggregate functions, such as groupby, sort, and the like as those do not work particularly well with a streaming interface. What follows is a simple implementation of a query processor that satisfies the needs I listed above:
-- Query.lua -- --[[ the query function receives its parameters as a single table params.source - The data source. It should be an iterator that returns table values params.filter - a function, that receives a single table value as input and returns a single table value as output. If the record is 'passed' then it is returned as the return value. If the record does not meet the filter criteria, then 'nil' will be returned. params.projection - a function to morph a single entry. It receives a single table value as input, and returns a single table value as output. The 'filter' and 'projection' functions are very similar, and in fact, the filter can also be used to transform the input. They are kept separate so that each can remain fairly simple in terms of their implementations. --]] local query = function(params) if not params or not params.source then return false, "source not specified"; end local nextRecord = params.source; local filter = params.filter; local projection = params.projection; local function closure() local record; if filter then while true do record = nextRecord(); if not record then return nil; end record = filter(self, record); if record then break; end end else record = nextRecord(); end if not record then return nil; end if projection then return projection(self, record); end return record; end return closure; end -- A simple iterator over a table -- returns the embedded table entries -- individually. local irecords = function(tbl) local i=0; local closure = function() i = i + 1; if i > #tbl then return nil; end return tbl[i]; end return closure end -- given a key/value record, and a filter table -- pass the record if every field in the filtertable -- matches a field in the record. local recordfilter = function(record, filtertable) for key,value in pairs(filtertable) do if not record[key] then print("record does not have field: ", key) return nil; end if tostring(record[key]) ~= tostring(value) then print(record[key], "~=", value); return nil; end end return record; end return { irecords = irecords, recordfilter = recordfilter, query = query, }
The ‘query()’ function represents the bulk of the operation. The other two functions help in forming iterators and doing simple queries.
Here is one example of how it can be used:
-- test_query.lua -- local JSON = require("dkjson"); local Query = require("Query"); local irecords = Query.irecords local records = { {name = "William", address="1313 Mockingbird Lane", occupation = "eng"}, {name = "Daughter", address="university", occupation="student"}, {name = "Wife", address="home", occupation="changer"}, } local test_query = function() local source = irecords(records); local res = {} for record in Query.query { source = source, projection = function(self, record) return {name=record.name, address=record.address, }; end, filter = function(self, record) if record.occupation == "eng" then return record; end end } do table.insert(res, record); end local jsonstr = JSON.encode(res, {indent=true}); print(jsonstr); end test_query();
Which results in the following:
[{ "name":"William", "address":"1313 Mockingbird Lane" }]
This uses the iterator, a specified filter, and projection. The query() function itself is an iterator, so it will iterate over the data source, and apply the filter and projection to each record, returning results. Nice and easy, very Lua like.
Now that I have a very rudimentary query processor, I can apply it to my web case. So, if I rewrite the web page that’s showing the services on my machine, and can deal with a little bit of query processing:
--[[ Description: A very simple demonstration of one way a static web server can be built using TINN. In this case, the WebApp object is being used. It is handed a routine to be run for every http request that comes in (HandleSingleRequest()). Either a file is fetched, or an error is returned. Usage: tinn staticserver.lua 8080 default port used is 8080 ]] local WebApp = require("WebApp") local HttpRequest = require "HttpRequest" local HttpResponse = require "HttpResponse" local URL = require("url"); local StaticService = require("StaticService"); local SCManager = require("SCManager"); local JSON = require("dkjson"); local Query = require("Query"); local utils = require("utils"); local getRecords = function(query) local mgr, err = SCManager(); local filter = nil; local queryparts; if query then queryparts = utils.parseparams(query); filter = function(self, record) return Query.recordfilter(record, queryparts); end end local res = {}; for record in Query.query { source = mgr:services(), filter = filter, } do table.insert(res, record); end return res; end local HandleSingleRequest = function(stream, pendingqueue) local request, err = HttpRequest.Parse(stream); if not request then print("HandleSingleRequest, Dump stream: ", err) return end local urlparts = URL.parse(request.Resource) local response = HttpResponse.Open(stream) if urlparts.path == "/system/services" then local res = getRecords(urlparts.query); local jsonstr = JSON.encode(res, {indent=true}); --print("echo") response:writeHead("200") response:writeEnd(jsonstr); else response:writeHead("404"); response:writeEnd(); end -- recycle the stream in case a new request comes -- in on it. return pendingqueue:Enqueue(stream) end --[[ Configure and start the service ]] local port = tonumber(arg[1]) or 8080 Runtime = WebApp({port = port, backlog=100}) Runtime:Run(HandleSingleRequest);
Here I have introduced the ‘getRecords()’ function, which takes care of getting the raw records from the list of services, and running the query to filter for the ones that I might want to see. In this case, a filter is created if the user specifies something interesting in the url. Without a filter, the url is simply:
http://localhost:8080/system/services
In which case you’ll get the list of all services on the machine, regardless of their current running state.
If you wanted to filter for only the services that were currently running, you would specify a URL such as this:
http://localhost:8080/system/services?State=RUNNING
And if you want to look for a particular service, by name, you would do:
http://localhost:8080/system/services?ServiceName=ACPI [{ "ServiceType":"KERNEL_DRIVER", "ProcessId":0, "DisplayName":"Microsoft ACPI Driver", "ServiceName":"ACPI", "ServiceFlags":0, "State":"RUNNING" }]
Of course, you can also do simple combinations:
http://localhost:8080/system/services?State=RUNNING;ServiceType=KERNEL_DRIVER
This will return the list of all the kernel drivers that are currently running.
Of course, if you’re sitting on your local machine, you could bring up the TaskManager, export the list of services, import it into a real database/excel, and perform queries to your heart’s content…
This type of coding makes spelunking your system really easy. The fact that it’s available through a web interface opens up some possibilities in terms of display, interaction, and accessibility. Since the stream is just JSON, it could be fairly straight forward to present this information in a much more interesting form, perhaps by using d3 or webgl, or who knows what.
So, is this software engineering?
Having gone from a low level system call to a higher level web based interface with interactive query capabilities, I’d say it must be approaching the term. Perhaps the ‘engineering’ lies in the simplicity. Rather than this being a fairly large integrated system, it’s just a few lines of script code that ties together well.
I believe the “engineering”, and thus an “engineer” comes from being able to recognize the minimal amount of code necessary to get a job done. The “engineering” lies in the process of finding those minimal lines of code.