Serialization 104 – Pixel Packing Pugilism

Do you ever wonder what’s really going on inside that computer? I mean, you lay out some data structure, all nice and neat. You pack it tightly using bit fields, thinking you’ve written the most compact, conservative bit of code known to man. And then you go and try to access the structure, to set or get a value.

Well, truth be told, the machine has no such thing as a 2-bit value, so somehow the machine/compiler coerces that into something it can deal with, either an int8_t, or int16_t, or int32_t, or what have you. Yah, it’s only a couple of instructions (a load, a shift, a mask), and it all happens transparently, and it’s superscalar (or not), and runs in parallel without stalling, and how many angels can dance on the head of a transistor anyway…
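To make that coercion concrete, here is a minimal sketch in plain Lua of what reading a small field amounts to: grab the whole byte, shift the field down, and mask off the rest. The names pow2 and extract_bits are made up for this illustration, and bit 0 is assumed to be the most significant bit of the byte, which is how network specs usually number them.

```lua
-- Hypothetical helper: 2 raised to the n, kept as a whole number.
function pow2(n)
    local p = 1
    for _ = 1, n do p = p * 2 end
    return p
end

-- Hypothetical helper: read `count` bits from a single byte, starting at
-- bit `start`, where bit 0 is the most significant bit.
function extract_bits(byte, start, count)
    -- "shift right" so the field lands in the low bits
    local shifted = math.floor(byte / pow2(8 - start - count))
    -- "mask" away everything above the field
    return shifted % pow2(count)
end

local byte = 0x93                 -- binary 1001 0011
print(extract_bits(byte, 0, 2))   -- the top two bits (10)
print(extract_bits(byte, 4, 4))   -- the low four bits (0011)
```

Plain arithmetic is used instead of a bit library only so the sketch runs on any Lua version; a real implementation would likely use native bit operations.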

Oftentimes, when you’re dealing with a protocol, you get something in the mail that describes a ‘header’ like this:

	The Rtp header has the following format:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |V=2|P|X|  CC   |M|     PT      |       sequence number         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                           timestamp                           |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |            synchronization source (SSRC) identifier           |
    +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
    |    contributing source (CSRC) identifiers  (if mixers used)   |
    |                             ....                              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

     V = Version
     P = Padding
     X = Extensions
     CC = Count of Contributing Sources
     M = Marker
     PT = Payload Type

And it looks just like that, complete with ASCII art and everything. Take a look at some of the internet standards if you don’t believe me.

What this is telling me, is that in order to communicate using the Rtp protocol (which I really want to do), I will receive a packet off the networking wire that can be described by this header. It gets a bit dicey after CSRC, but essentially the header is all I’d care about up front. In particular, I’m interested in the sequence number, timestamp, and Payload Type.

Since I have trained myself to describe my data structures in a more abstract form, it looks like this:

RTPHeader_Info = {
    name = "RTPHeader";
    fields = {
	{name = "Version", basetype = "uint8_t", subtype="bit", repeating = 2};
	{name = "Padding", basetype = "uint8_t", subtype="bit", repeating = 1};
	{name = "Extensions", basetype = "uint8_t", subtype="bit", repeating = 1};
	{name = "ContributingCount", basetype = "uint8_t", subtype="bit", repeating=4};
	{name = "Marker", basetype = "uint8_t", subtype="bit", repeating=1};
	{name = "PayloadType", basetype = "uint8_t", subtype="bit", repeating=7};
	{name = "SequenceNumber", basetype = "uint16_t"};
	{name = "TimeStamp", basetype = "uint32_t"};
	{name = "SSRC", basetype = "uint32_t"};
    };
};

Armed with the previously developed CStructFromTypeInfo(info) function, I can easily create the following structure:

typedef struct RTPHeader {
    uint8_t Version : 2;
    uint8_t Padding : 1;
    uint8_t Extensions : 1;
    uint8_t ContributingCount : 4;
    uint8_t Marker : 1;
    uint8_t PayloadType : 7;
    uint16_t SequenceNumber;
    uint32_t TimeStamp;
    uint32_t SSRC;
} RTPHeader;
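CStructFromTypeInfo comes from an earlier post and isn’t reproduced here. As a rough idea of what such a generator could look like, here is a sketch that assumes a field with subtype "bit" becomes a bit field of `repeating` bits and everything else becomes a plain member. The function name StructSourceSketch and the tiny descriptor are invented for the example; the real function may well differ.

```lua
-- Hypothetical generator: turn a type descriptor into C typedef source.
function StructSourceSketch(info)
    local lines = { string.format("typedef struct %s {", info.name) }
    for _, field in ipairs(info.fields) do
        if field.subtype == "bit" then
            -- assumed: `repeating` is the width of the bit field
            table.insert(lines, string.format("    %s %s : %d;",
                field.basetype, field.name, field.repeating))
        else
            table.insert(lines, string.format("    %s %s;",
                field.basetype, field.name))
        end
    end
    table.insert(lines, string.format("} %s;", info.name))
    return table.concat(lines, "\n")
end

-- A made-up descriptor, just to exercise the generator.
local Tiny_Info = {
    name = "Tiny";
    fields = {
        {name = "Flag", basetype = "uint8_t", subtype = "bit", repeating = 1};
        {name = "Kind", basetype = "uint8_t", subtype = "bit", repeating = 7};
        {name = "Length", basetype = "uint16_t"};
    };
}
print(StructSourceSketch(Tiny_Info))
```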

I can even easily create serializers/deserializers which can go from a stream to this structure and back. But, wait a second. Should I? For one thing, C leaves the order in which bit fields are packed within a byte up to the compiler, so this struct isn’t even guaranteed to match the layout on the wire.

The way networking traffic works, you are handed a chunk of data, and even if you are “streaming”, you get a certain amount of data in a fixed-size buffer. Depending on your application, you may choose to copy that data, or just look at values and take some actions based on what you see. If you’re doing something like playing audio, you’re probably going to copy the buffer into another part of the system that deals with playing audio. If you’re just doing some filtering, or some simple command where the buffer isn’t needed, you’ll take what you want and let it return back to the system unharmed.

So, let’s assume you don’t really want to copy the buffer at all, but you need to get values out of it, or perhaps set some values in it before passing it along. Or, perhaps you’re at the head end, and you want to construct a header package, right there inside the buffer that some other part of the system has handed you. You don’t want to cons up a structure, just for the convenience of being able to use a function to write a value into a memory location, just so you can copy that whole structure into the buffer. And what about the endianness of it all!!

OK. So, clearly there must be another mechanism that can be employed to achieve the following flow:

  • Create a buffer of a specific size
  • Write values into the buffer at specific bit locations
  • Send the buffer into the networking stack without any copying

Well, from the previous exposé on bit banging, two functions were created:

setbitstobytes(bytes, startbit, bitcount, value, bigendian)
getbitsfrombytes(bytes, startbit, bitcount)

These two functions are all you need to set or get a value at any bit location in a buffer. That is, any value up to 32 bits wide. Going beyond that is not impossible, as you can always just break the value up into smaller pieces.
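The two functions themselves live in the earlier post. For readers who don’t have it handy, here is a stand-in sketch, not the original code. It assumes bit 0 is the most significant bit of the first byte, always writes in network (big-endian) order rather than taking a bigendian flag, and indexes the buffer from 0. The names put_bits, take_bits and bit_weight are hypothetical. The demo uses a plain Lua table as the buffer; since only bytes[i] reads and writes are used, the same code should also work on an FFI uint8_t array.

```lua
-- Weight of absolute bit position `pos` within its byte:
-- 128 for the first bit of a byte, 1 for the last.
local function bit_weight(pos)
    local w = 1
    for _ = 1, 7 - pos % 8 do w = w * 2 end
    return w
end

-- Stand-in for setbitstobytes: write the low `bitcount` bits of `value`
-- into `bytes`, starting at `startbit`, most significant bit first.
function put_bits(bytes, startbit, bitcount, value)
    for i = bitcount - 1, 0, -1 do
        local bit = value % 2              -- lowest remaining bit of value
        value = math.floor(value / 2)
        local pos = startbit + i
        local index = math.floor(pos / 8)
        local w = bit_weight(pos)
        local current = math.floor(bytes[index] / w) % 2
        -- set or clear just this one bit, leaving its neighbors alone
        bytes[index] = bytes[index] + (bit - current) * w
    end
end

-- Stand-in for getbitsfrombytes: read `bitcount` bits starting at `startbit`.
function take_bits(bytes, startbit, bitcount)
    local value = 0
    for pos = startbit, startbit + bitcount - 1 do
        local index = math.floor(pos / 8)
        value = value * 2 + math.floor(bytes[index] / bit_weight(pos)) % 2
    end
    return value
end

-- A zeroed 12-byte buffer, indexed from 0 like a C array.
local buf = {}
for i = 0, 11 do buf[i] = 0 end

put_bits(buf, 0, 2, 2)        -- a 2-bit field at the very start
put_bits(buf, 16, 16, 127)    -- a 16-bit field at bit 16
print(take_bits(buf, 0, 2), take_bits(buf, 16, 16))
```

Going one bit at a time is slow but easy to check by hand; a production version would work in whole bytes or words.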

At any rate, this looks like another opportunity to do some codegen. So, given the same exact meta information that was used to create that typedef, and the serializers, I could create an offsets table. Basically, I need a table that contains the following information for each field:

name, offset, number of bits

Easy enough, and the BitOffsetsFromTypeInfo(desc) function can do the trick. Given the previous header description, the following table can be generated (bit offset, size in bits, field name):

0	2	Version
2	1	Padding
3	1	Extensions
4	4	ContributingCount
8	1	Marker
9	7	PayloadType
16	16	SequenceNumber
32	32	TimeStamp
64	32	SSRC

It is pretty printed here, but in reality, it’s just a standard Lua table where each entry is itself a table containing the essential information we need.
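BitOffsetsFromTypeInfo isn’t listed in this post, so here is a guess at its shape rather than the real thing. The sketch assumes bit fields take `repeating` bits, other fields take the full width of their base type, and fields are laid out back to back with no padding. BitOffsetsSketch and TYPE_BITS are invented names, and arrays of non-bit fields are not handled.

```lua
-- Assumed widths, in bits, of the base types used in the descriptor.
local TYPE_BITS = { uint8_t = 8, uint16_t = 16, uint32_t = 32 }

-- Hypothetical stand-in for BitOffsetsFromTypeInfo: returns a list of
-- { name, offset, size } entries, one per field, in declaration order.
function BitOffsetsSketch(info)
    local offsets = {}
    local offset = 0
    for _, field in ipairs(info.fields) do
        local size
        if field.subtype == "bit" then
            size = field.repeating
        else
            size = TYPE_BITS[field.basetype]
        end
        table.insert(offsets, { name = field.name, offset = offset, size = size })
        offset = offset + size      -- next field starts right after this one
    end
    return offsets
end

-- The first few fields of the RTP descriptor, repeated so this runs alone.
local Header_Info = {
    name = "RTPHeader";
    fields = {
        {name = "Version", basetype = "uint8_t", subtype = "bit", repeating = 2};
        {name = "Padding", basetype = "uint8_t", subtype = "bit", repeating = 1};
        {name = "Extensions", basetype = "uint8_t", subtype = "bit", repeating = 1};
        {name = "ContributingCount", basetype = "uint8_t", subtype = "bit", repeating = 4};
        {name = "SequenceNumber", basetype = "uint16_t"};
    };
}

for _, f in ipairs(BitOffsetsSketch(Header_Info)) do
    print(f.offset, f.size, f.name)
end
```

The entries carry the name, offset and size keys that the accessor generators expect.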

With that, it looks like everything is in place to auto gen up some field accessors. The generators simply assemble strings from a template, like this:

function CreateBufferFieldWriter(field)
    return string.format([[
function set_%s(bytes, value)
    setbitstobytes(bytes, %d, %d, value);
end
]], field.name, field.offset, field.size);
end

function CreateBufferFieldReader(field)
    return string.format([[
function get_%s(bytes)
    return getbitsfrombytes(bytes, %d, %d);
end
]], field.name, field.offset, field.size);
end

As a simple example, the generated code for the Marker field would look like this:

function set_Marker(bytes, value)
    setbitstobytes(bytes, 8, 1, value);
end

function get_Marker(bytes)
    return getbitsfrombytes(bytes, 8, 1);
end

There is another function that just kind of ties all this together. The CreateBufferAccessor(desc) function takes our structure descriptor, and autogens all those appropriate accessors.

function CreateBufferAccessor(desc)
    local funcs = {}

    -- first create the offsets structure
    local offsets = BitOffsetsFromTypeInfo(desc)

    -- go through field by field and create the
    -- bit of code that will write to the buffer
    for _,field in ipairs(offsets) do
        table.insert(funcs, CreateBufferFieldWriter(field))
        table.insert(funcs, CreateBufferFieldReader(field))
    end

    return funcs
end

At this point we have a table that is full of strings that represent the functions that are used to set and get values on our buffer. And now, finally, to put it to use.

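local ffi = require("ffi")

function test_BufferClass()
    local funcs = CreateBufferAccessor(RTPHeader_Info)
    local funcstr = table.concat(funcs)

    -- Now that we have the functions, compile them
    -- so we can try to use them. loadstring returns nil
    -- plus a message on a syntax error, so assert on it.
    local f = assert(loadstring(funcstr))
    f()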

    -- Finally, try to set some values
    -- Create a buffer first to act as the header storage
    local buff = ffi.new("uint8_t[2048]")

    set_Version(buff, 2)
    set_Padding(buff,0)
    set_Extensions(buff,1)
    set_ContributingCount(buff, 3)
    set_Marker(buff, 1)
    set_PayloadType(buff, 15)
    set_SequenceNumber(buff, 127)
    set_TimeStamp(buff, 523)
    set_SSRC(buff, 722)

    print("Version: ", get_Version(buff))
    print("Padding: ", get_Padding(buff))
    print("Extensions: ", get_Extensions(buff))
    print("Count: ", get_ContributingCount(buff))
    print("Marker: ", get_Marker(buff))
    print("Payload Type: ", get_PayloadType(buff))
    print("Sequence: ", get_SequenceNumber(buff))
    print("Timestamp: ", get_TimeStamp(buff))
    print("SSRC: ", get_SSRC(buff))
end

The output of running this is:

Version: 	2
Padding: 	0
Extensions: 	1
Count: 	3
Marker: 	1
Payload Type: 	15
Sequence: 	127
Timestamp: 	523
SSRC: 	722

And that’s exactly what I would have expected it to be. It appears that I can in fact round trip values into and out of the buffer. Now, if I were a true nutter, I’d benchmark this, and compare to the case where I’m consing up a structure and doing copying back and forth. But, my gut tells me I could better spend my time using this technique, and only optimize if it’s not meeting my needs. The ease of programming is just too tremendous to pass up.

I wrote Rtp code way long back, and it was a lot nastier looking with funny symbols such as ‘~^&|’ littered all over the place, with hard coded constants, and all manner of other bug inducing nasties lying about. With this version of things, I feel much more relaxed, and confident that the implementation is correct. Even when it is not correct, there are only a couple of places to look in order to change things. This also isolates all the endianness into one place, so I don’t have to deal with that anywhere in my higher level code.

There are a couple of nifty tricks here that come from Lua. Since I can execute new code from a string on the fly at any time, I can construct the serialization code on the fly, and just execute it, without ever having to actually have the code lying around anywhere in my environment. I know when using other technologies such as protocol buffers, it becomes a pain to change things because you do have to “compile” code, and distribute it. That makes for a very fragile system indeed. In the code here, the description can change, and can even come from another trusted source, and the serialization will just change automatically, as new serialization code can be constructed automatically. That’s a fairly powerful construct, and solves a very challenging maintenance problem. If you add versioning, you can deal with the versioning as well. Just use the proper serializer for the specific version of the header that comes in. It would only be brought in if a header of a new type is seen. Otherwise, you’re never bothered with it.
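One way the versioning idea might be wired up is a small cache that builds the accessors for a header version the first time that version shows up, and never before. This is only a sketch: AccessorsForVersion, the descriptions table, and the build callback are all hypothetical names. In practice build would be whatever compiles a descriptor into accessors; here it is a counter so the laziness is visible.

```lua
-- Accessors built so far, keyed by header version.
local accessor_cache = {}

-- Hypothetical lookup: return the accessors for `version`, building them
-- from descriptions[version] on first use. Returns nil plus a message
-- when no description is known for that version.
function AccessorsForVersion(version, descriptions, build)
    local accessors = accessor_cache[version]
    if accessors == nil then
        local desc = descriptions[version]
        if desc == nil then
            return nil, "unknown header version " .. tostring(version)
        end
        accessors = build(desc)
        accessor_cache[version] = accessors
    end
    return accessors
end

-- Made-up descriptions and a build function that just counts its calls.
local builds = 0
local descriptions = {
    [1] = { name = "HeaderV1" },
    [2] = { name = "HeaderV2" },
}
local function build(desc)
    builds = builds + 1
    return { name = desc.name }
end

AccessorsForVersion(2, descriptions, build)   -- first sighting: builds
AccessorsForVersion(2, descriptions, build)   -- second sighting: cached
print(builds)
```

Version 1 is never built in this run, which is the point: code for a header layout only exists once such a header has actually been seen.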

The last thing has to do with the convenience functions constructed. In the code here, I just created the functions in the global namespace. They could just as easily be stuffed into a list, or into a ‘class’. That’s the beauty of autogen’d code. It’s very small and easy to change to get a maximal effect.
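As a sketch of the “stuff them into a table” option, the accessors can be built as closures and collected in one table, so nothing lands in the global namespace. MakeAccessorTable is an invented name, and the bit-level read and write functions are passed in as parameters standing in for getbitsfrombytes and setbitstobytes. To keep the example short, the demo uses fake versions that store one value per bit offset in a plain table; they exercise the wiring, not the bit packing.

```lua
-- Hypothetical builder: one get_/set_ pair per field, gathered in a table.
-- `offsets` is a list of { name, offset, size } entries.
function MakeAccessorTable(offsets, read_bits, write_bits)
    local accessors = {}
    for _, field in ipairs(offsets) do
        accessors["get_" .. field.name] = function(bytes)
            return read_bits(bytes, field.offset, field.size)
        end
        accessors["set_" .. field.name] = function(bytes, value)
            write_bits(bytes, field.offset, field.size, value)
        end
    end
    return accessors
end

-- Fake bit functions: remember one value per offset. NOT real bit packing.
local function fake_read(store, offset, size) return store[offset] end
local function fake_write(store, offset, size, value) store[offset] = value end

local rtp = MakeAccessorTable({
    { name = "Marker", offset = 8, size = 1 },
    { name = "PayloadType", offset = 9, size = 7 },
}, fake_read, fake_write)

local store = {}
rtp.set_Marker(store, 1)
rtp.set_PayloadType(store, 15)
print(rtp.get_Marker(store), rtp.get_PayloadType(store))
```

The closure version skips the string-compilation step altogether; generating source text that defines the functions on a table would work just as well and keeps the code inspectable.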

Well, there you have it. I’m able to meet my goals. When I need to write out a new RtpHeader, I just take a buffer (probably from a circular queue), and write in the appropriate header values. Then, I can use the same buffer to fill in the payload information, and finally use the same buffer to write out to the network interface. Assuming the buffer was allocated from the system heap, and I have turned off send buffering in my networking code, this will go straight out without additional copying. And that’s very good indeed.

I like programming like this. I get to be fully lazy, assured that things will just work the way they are supposed to. I can read specs, and code them up fairly painlessly. The next step is to encode the actual protocol itself. It might be that Lua itself is the way to encapsulate the protocol description. No abstraction necessary. We’ll see.


3 Comments on “Serialization 104 – Pixel Packing Pugilism”

  2. Levi Pearson says:

    As an added level of convenience, you could use a metatable to turn the get/set functions you just built into methods on your cdata object.

    Consider an iterator interface that would return a datatype that would contain a pointer into your buffer coupled with a metatable that would change table lookup dispatch to call the method named by the key as looked up by that key in the metatable itself. The metatable could also store other relevant information, such as the type of the data you were reading from the buffer.

    Now you can write code like:

    -- Internally compiles accessors and assigns them to a table for use as a
    -- metatable, then returns an iterator that will return a pointer to the
    -- next frame in the buffer next time it’s called, with the metatable set
    -- to allow for lookups.
    rtpBuffer = CreateBufferAccessor(RTPHeader_Info, buff)

    for rtpFrame in rtpBuffer() do
        rtpFrame.Version = 2
        rtpFrame.Padding = 0
        ….
    end

    • Now that I know a bit more about working with metatables, this makes sense. You’re right in that the RTPStream suddenly becomes as easy to program as any other iterator. I’ll have to go back and revisit this code and do an update.
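A minimal sketch of the metatable idea from this thread, with one swap: instead of attaching the metatable to the cdata object, it wraps the buffer in an ordinary proxy table, so it runs without the FFI. WrapBuffer is a made-up name, the iterator part is left out, and the two hand-written accessors (Version in the top two bits of byte 0, SequenceNumber in bytes 2 and 3, big-endian) stand in for the generated ones.

```lua
-- Hypothetical wrapper: field reads and writes on the returned proxy are
-- routed to accessors.get_<Field> and accessors.set_<Field>.
function WrapBuffer(bytes, accessors)
    return setmetatable({}, {
        __index = function(_, key)
            local getter = accessors["get_" .. key]
            if not getter then error("no such field: " .. tostring(key), 2) end
            return getter(bytes)
        end,
        __newindex = function(_, key, value)
            local setter = accessors["set_" .. key]
            if not setter then error("no such field: " .. tostring(key), 2) end
            setter(bytes, value)
        end,
    })
end

-- Hand-written accessors for two fields, on a 0-indexed byte buffer.
local accessors = {
    get_Version = function(b) return math.floor(b[0] / 64) end,
    set_Version = function(b, v) b[0] = b[0] % 64 + v * 64 end,
    get_SequenceNumber = function(b) return b[2] * 256 + b[3] end,
    set_SequenceNumber = function(b, v)
        b[2] = math.floor(v / 256) % 256
        b[3] = v % 256
    end,
}

local buf = {}
for i = 0, 11 do buf[i] = 0 end

local frame = WrapBuffer(buf, accessors)
frame.Version = 2
frame.SequenceNumber = 127
print(frame.Version, frame.SequenceNumber)
```

Because the proxy itself stays empty, every field access goes through the metatable, and a misspelled field name raises an error instead of silently creating a new key.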

