Build vs Buy in software development – Part 1, the selection criteria

There is a common theme throughout life: “Should I build, or buy?” It doesn’t seem to matter which thing is under consideration; the process we use to make the decision should remain the same. This is a multi-part series on making the build vs buy decision for software projects.

This is WAAV Studio, a rapidly evolving application used for “Smart City” visualization and management. There are several components that go into it, and plenty of ‘build vs buy’ decisions to be made.

Here are the considerations I’ve had in deciding which way to go on several components:

  • What is the scope of the component – How broad, and fundamental is it
  • Do I have the expertise to build it
  • How long will it take to integrate a purchased component
  • How long will it take to build it myself
  • What is the cost associated with purchasing it
  • How will maintenance look over time
  • The solution must be easy
  • The solution must be portable to multiple platforms
  • The solution must be small, fast, and intuitive

There are perhaps a couple more things to consider, but these are the highlights. There are some market considerations as well, but they essentially boil down to:

How important is time to market?

and

What is your budget?

I’ll start with the WAAV Studio time to market and budget first. Time to market in this case is measured in months. A polished product needs to be available within a 12 month time period from when I started (January). The product must meet various milestones of usability along the way, but overall, the 1.0 version must be user friendly within 12 months.

Second, is the budget. This is the product of a very small team, using tools such as Copilot, and ChatGPT, as well as their core skills and programming tools. There is effectively no real budget to speak of, in terms of purchasing other bits of software.

With those hard constraints, I consider the list of ‘modules’ that need to be in this application.

  • Geospatial mapping
  • Visualize various geospatial data sets
    • KMZ
    • GeoJSON
    • Shapefile
    • .csv data sets
  • Visualize various graphics assets
    • .png images
    • .gif images
    • .jpeg images
    • .svg images

That’s the rough list, with lots of detail on each item. The biggest and first build/buy decision was around mapping visualization. The easy and typical answer would be “just use Google Earth”, or something like that. Even before that, it might be “Use ArcGIS”, or any number of GIS software packages. Going this route might be expedient, but will lead you down a path of being constrained by whatever that core package is capable of.

A few of the criteria are around ease of use, and size. This application is something a typical city administrator will use on occasion. Their key job function might not be related to this software, so they need to be able to come to it after 3 months of absence, and still be able to pick it up and use it effectively to achieve some immediate goal. This is a hard one to achieve, and has to do with the UI elements, their layout, and natural usage. Again, when you select a core package, you may not have enough control over these factors to have a satisfying outcome. ArcGIS, for example, has all the features a GIS professional could possibly want. The package size is measured in the 10s of megabytes, and the user’s manual would make an encyclopedia blush if it were printed on paper. This is not an app that can be picked up by your typical city clerk in a 10 minute session, let alone mastered six months later without constant usage.

First decision: Create a core mapping component from scratch, without reliance, or dependence on any existing mapping components.

This is such a core, fundamental decision, it drives decision making across the other components, so, it better be good.

I have never built a mapping platform before WAAV Studio, so I started with the naive notion that I could actually do it. I mean, how hard could it be, right? All software engineers have the urge to build from scratch, and just jump onto any coding challenge. My years of wisdom told me, I better have a way to evaluate my progress on the task, and determine if it was time to abandon my naive first choice for a better choice later down the line.

In the next part, I’ll look into what goes into the core mapping platform, and which other components were chosen for the build vs buy machine.


SVG From The Ground Up – Time to wrap it up

This time around, we’re in the final stretch, going from a file, through the parsing, creating an object model, and finally, rendering an image.

To recap:

We started with low level building blocks to scan byte streams: Parsing Fundamentals

We got into the guts of XML parsing: It’s XML, how hard could it be

We then looked at core data structures, and how to parse their content: Along the right path

Most recently, we went over several drawing primitives and data structures: Can you imaging that

This series is a reflection on the code that can be found in this repository: svg2b2d, so you can follow along, and freely use it to make your own creations.

Now let’s get back to the build…

Thus far in the series, we’ve been looking at the guts of things, essentially from the bottom up. For this final installment, I’m going to go the other way around, and start from the end result and work back towards the beginning.

The goal I have for a program is to turn a .svg file into a .png file. That is, take the .svg, parse it, render it into a bitmap image, save that image as a .png. We’ll call this program svg2png, and here it is:

#include <cstdio>

#include "blend2d.h"
#include "mmap.h"
#include "svg.h"

using namespace filemapper;


int main(int argc, char** argv)
{
    if (argc < 2)
    {
        printf("Usage: svg2png <svg file>  [output file]\n");
        return 1;
    }

    // create an mmap for the specified file
    const char *filename = argv[1];
    auto mapped = mmap::createShared(filename);

    if (mapped == nullptr)
        return 1;   // could not map the input file, so report failure

    // Create the BLImage we're going to draw into
    BLImage outImage(420, 340, BL_FORMAT_PRGB32);

    // parse the svg, and draw it into the image
    parseSVG(mapped->data(), mapped->size(), outImage);
    
    // save the image to a png file
    const char *output = argc > 2 ? argv[2] : "output.png";
    outImage.writeToFile(output);

    // close the mapped file
    mapped->close();


    return 0;
}

We’ve seen bits and pieces of this along the way. First, we get a filename from the command line, and create a memory mapped file from it. That allows us to deal with the contents as a pointer in memory, which makes the parsing really easy. Then we set up an initial BLImage object. The fact that it starts out as a certain size doesn’t matter. It will be changed later in the parsing process. Then there’s the call to parseSVG, which is the entry point to parsing SVG in this case. And finally, we output the image using a built-in capability of the blend2d library ‘outImage.writeToFile()’, and we’re done! What could be easier?
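To make the "contents as a pointer in memory" idea concrete, here is a minimal sketch of a read-only file mapping on POSIX systems. It is not the filemapper code that svg2png uses; MappedFileSketch and its methods are hypothetical names for illustration, and a Windows build would need a different set of system calls.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical read-only mapping helper (POSIX only).
// This is NOT the filemapper::mmap class used by svg2png.
class MappedFileSketch
{
    void* fData = nullptr;
    size_t fSize = 0;

public:
    explicit MappedFileSketch(const char* filename)
    {
        int fd = ::open(filename, O_RDONLY);
        if (fd < 0)
            return;     // missing or unreadable file: stay invalid

        struct stat st {};
        // An empty file cannot be mapped, so it is treated as invalid too
        if (::fstat(fd, &st) == 0 && st.st_size > 0)
        {
            void* p = ::mmap(nullptr, static_cast<size_t>(st.st_size),
                             PROT_READ, MAP_PRIVATE, fd, 0);
            if (p != MAP_FAILED)
            {
                fData = p;
                fSize = static_cast<size_t>(st.st_size);
            }
        }

        // The mapping remains valid after the descriptor is closed
        ::close(fd);
    }

    ~MappedFileSketch()
    {
        if (fData != nullptr)
            ::munmap(fData, fSize);
    }

    // The mapping is a unique resource, so copying is disabled
    MappedFileSketch(const MappedFileSketch&) = delete;
    MappedFileSketch& operator=(const MappedFileSketch&) = delete;

    bool isValid() const { return fData != nullptr; }
    const void* data() const { return fData; }
    size_t size() const { return fSize; }
};
```

With something like this, data() and size() are exactly the pointer and length a parser entry point such as parseSVG() wants, and the operating system pages the file in on demand rather than the program reading it into a buffer.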

Let’s take a look at parseSVG(), since that’s where the real action is.

#include "svg.h"
#include "svgshapes.h"
#include "bspanutil.h"

#include <vector>
#include <memory>

bool parseSVG(const void* bytes, const size_t sz, BLImage& outImage)
{
    svg2b2d::ByteSpan inChunk(bytes, sz);
    
    // Create a new document
    svg2b2d::SVGDocument doc;

    // Load the document from the data
    doc.readFromData(inChunk);
    
    
    // Draw the document into a IRender
    outImage.create(doc.width(), doc.height(), BL_FORMAT_PRGB32);
    SVGRenderer ctx(outImage);
    doc.draw(ctx);
    ctx.end();
    
    return true;
}

  • Create a ByteSpan for the pointer and size that we’ve been given (the memory mapped file)
  • Create an instance of an SVGDocument (a container to hold what we parse)
  • Read/parse the contents, filling in the SVGDocument
  • Resize the image to match the size of the document
  • Create a rendering context and connect it to the image
  • Draw the document into the rendering context
  • Done

Light and simple. So, let’s go one step further, and look at that ‘readFromData()’, by examining the whole SVGDocument object:

	struct SVGDocument : public IDrawable
	{

		// All the drawable nodes within this document
		std::shared_ptr<SVGRootNode> fRootNode{};
		std::vector<std::shared_ptr<IDrawable>> fShapes{};
		BLBox fExtent{};

		SVGDocument() = default;

		double width() const { 
			if (fRootNode == nullptr)
				return 100;
			return fRootNode->width();
        }

		double height() const {
			if (fRootNode == nullptr)
				return 100;
			return fRootNode->height();
		}

		void draw(IRender& ctx) override
		{
			for (auto& shape : fShapes)
			{
				shape->draw(ctx);
			}
		}

		// Add a node that can be drawn
		void addNode(std::shared_ptr<SVGObject> node)
		{
			fShapes.push_back(node);
		}


		// Load the document from an XML Iterator
		// Since this is the top level document, we just want to kick
		// off loading the root node 'svg', and we're done 
		void loadFromIterator(XmlElementIterator& iter)
		{

			// skip past any elements that come before the 'svg' element
			while (iter)
			{
				const XmlElement& elem = *iter;

				if (!elem)
					break;

				//printXmlElement(*iter);

				// Skip over these node types we don't know how to process
				if (elem.isComment() || elem.isContent() || elem.isProcessingInstruction())
				{
					iter++;
					continue;
				}


				if (elem.isStart() && (elem.name() == "svg"))
				{
                    // There should be only one root node in a document, so we should 
                    // break here, but, curiosity...
                    fRootNode = SVGRootNode::createFromIterator(iter);
                    if (fRootNode != nullptr)
                    {
                        addNode(fRootNode);
                    }
				}

				iter++;
			}
		}

		bool readFromData(const ByteSpan &inChunk)
		{
			ByteSpan s = inChunk;

			XmlElementIterator iter(s);

			loadFromIterator(iter);

			return true;
		}


	};

The SVGDocument has two primary things it achieves. The first is to parse the raw svg and turn it into a structured thing that can later be rendered, or some other processing can occur on it. The second thing is to provide a convenient entry point to render into a drawing context.

The readFromData() call should be familiar. It’s just XML after all, isn’t it? So, create an XmlElementIterator on the chunk of memory we were passed, and get to parsing. The ‘loadFromIterator()’ function above that is the one that’s taking the individual nodes, and doing something with them. In this case, we’re only interested in the top level ‘<svg>’ node, so when we see that, we tell it to load itself from the iterator, and we save it as our root node.
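Stripped of the real parser types, the "skip everything until the svg start tag" logic can be sketched like this. ElemKind, ElemSketch and findSvgRoot are stand-ins invented for illustration; the real XmlElement and XmlElementIterator have a richer interface, and unlike the real loop, this sketch stops at the first match.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for the parser's element type (hypothetical, for illustration only)
enum class ElemKind { Comment, ProcessingInstruction, Content, Start, End };

struct ElemSketch
{
    ElemKind kind;
    std::string name;
};

// Return the index of the first start tag named 'svg', or -1 if there is none
int findSvgRoot(const std::vector<ElemSketch>& elems)
{
    for (size_t i = 0; i < elems.size(); ++i)
    {
        const ElemSketch& e = elems[i];

        // Comments, text content, processing instructions and end tags
        // are all passed over
        if (e.kind != ElemKind::Start)
            continue;

        if (e.name == "svg")
            return static_cast<int>(i);
    }

    return -1;
}
```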

The SVGRootNode itself isn’t much more complicated, and does a similar job:

	struct SVGRootNode : public SVGGroup
	{
		std::shared_ptr<SVGPortal> fPortal;

		SVGRootNode() :SVGGroup(nullptr) { setRoot(this); }
		SVGRootNode(IMapSVGNodes *root)
			: SVGGroup(root)
		{
			setRoot(this);
		}
		
		double width()
		{
			if (fPortal != nullptr)
				return fPortal->width();
			return 100;
		}

		double height()
		{
			if (fPortal != nullptr)
				return fPortal->height();
			return 100;
		}

		void loadSelfFromXml(const XmlElement& elem) override
		{
			SVGGroup::loadSelfFromXml(elem);
			
			fPortal = SVGPortal::createFromXml(root(), elem, "portal");
		}

		static std::shared_ptr<SVGRootNode> createFromIterator(XmlElementIterator& iter)
		{
			auto node = std::make_shared<SVGRootNode>();
			node->loadFromIterator(iter);

			return node;
		}

		void draw(IRender& ctx) override
		{
			ctx.save();

			// Start with default state

			ctx.setFillStyle(BLRgba32(0, 0, 0));
			
			ctx.setStrokeStyle(BLRgba32(0));
			ctx.setStrokeWidth(1.0);
			ctx.textSize(16);
			
			// Apply attributes that have been gathered
			// in the case of the root node, it's mostly the viewport
			applyAttributes(ctx);

			// Draw the children
			drawSelf(ctx);

			ctx.restore();
		}

	};

The fact that it’s a subclass of SVGGroup takes care of some boilerplate code for loading self-enclosing elements, and grouped elements, and specialized elements and the like. The svgshapes.h file contains the nitty gritty details, so you should check there, so I don’t bore you with it all here. You can see that in the drawing routine, we set up the drawing context to have the default values the SVG environment expects. There are things such as having no stroke, but a black fill, to start. There are other items such as setting up the drawing coordinates, according to the ‘viewBox’ on the svg element, if it exists, and that happens in the ‘applyAttributes()’ function call.
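The coordinate setup driven by the ‘viewBox’ boils down to a scale and a translation. The following is a rough sketch of that arithmetic only, not the code in applyAttributes(); ViewBoxSketch, MappingSketch and mapViewBoxToImage are hypothetical names, and the sketch stretches each axis independently, ignoring SVG’s preserveAspectRatio rules.

```cpp
#include <cassert>

// Hypothetical types for illustration only
struct ViewBoxSketch { double x, y, width, height; };
struct MappingSketch { double sx, sy, tx, ty; };

// Compute the scale and translation that take viewBox (user) coordinates
// to image pixels:
//   px = userX * sx + tx
//   py = userY * sy + ty
MappingSketch mapViewBoxToImage(const ViewBoxSketch& vb, double imgW, double imgH)
{
    MappingSketch m{ 1.0, 1.0, 0.0, 0.0 };

    // A degenerate viewBox leaves the coordinates untouched
    if (vb.width <= 0.0 || vb.height <= 0.0)
        return m;

    m.sx = imgW / vb.width;
    m.sy = imgH / vb.height;

    // Move the viewBox origin to pixel (0,0)
    m.tx = -vb.x * m.sx;
    m.ty = -vb.y * m.sy;

    return m;
}
```

For a viewBox of ‘10 10 190 190’ drawn into a 380 x 380 image, this gives a scale of 2 and a translation of -20, so the user-space point (10,10) lands on pixel (0,0) and (200,200) lands on (380,380).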

Here’s another picture to keep you interested in the possibilities.

One last guidepost in the code, before we wrap this up. The SVGCompoundNode object is most important for the document structure. That’s where the ‘loadFromIterator()’ function lives. Classes such as SVGGroup, and SVGGradient, descend from there, and just implement a few calls to deal with further grouped things. So, that’s a piece of code to take a look at. It’s structured and organized to make it simple for sub-classes to just add a little bit here and there to specialize for given situations. Otherwise, it’s meant to be a relatively safe default to handle the processing of nodes, whether they be self contained, or compound.
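The shape of that arrangement is the classic "base class owns the loop, subclass overrides a hook" pattern. Here is a toy version of it, under the assumption that a gradient only cares about ‘stop’ children; NodeSketch, GradientSketch and acceptChild are invented names, not the actual SVGCompoundNode interface.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical compound node: the base class owns the traversal
struct NodeSketch
{
    std::string name;
    std::vector<std::shared_ptr<NodeSketch>> children;

    virtual ~NodeSketch() = default;

    // Shared loop: build each child, then ask the hook whether to keep it
    void loadChildren(const std::vector<std::string>& childNames)
    {
        for (const auto& childName : childNames)
        {
            auto child = std::make_shared<NodeSketch>();
            child->name = childName;

            if (acceptChild(*child))
                children.push_back(child);
        }
    }

    // Safe default: keep every child
    virtual bool acceptChild(const NodeSketch&) { return true; }
};

// A specialization only adds the little bit it cares about
struct GradientSketch : NodeSketch
{
    bool acceptChild(const NodeSketch& child) override
    {
        return child.name == "stop";
    }
};
```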

And that’s about it. We’ve gone from the beginnings of how to scan stuff at a byte level, all the way through a simple XML parser, and into the complexities of parsing details of SVG types, and constructing a tree to be rendered, and saved as an image. All of the images shared in these articles have been rendered using the code built here, so it’s capable of doing fairly complex SVGs, beyond the typical rendering of the Ghostscript tiger. From here, if you have need for SVG in your own code, you can pretty much just lift the svg2b2d directly, and start using it. I did not cover doing text in SVG in this article series, but it’s actually not as hard as it might seem, simply because the blend2d library already deals with text rendering as well. Normally you’d have to contemplate using freetype, or stb_xxx something or other to get text, which just increases your surface area. With blend2d, you don’t, it does all that as well.

So, there you have it. SVG From The Ground Up! A step by step guide on how to go from bytes in a file, to bits on the screen. I hope this helps those who are inspired to simply learn some of the details, if not those who actually want to implement their own. I am personally using SVG for visualization and UI elements. You can imagine the common refrain of “just use HTML and the browser”, but what’s the fun in that.

Until next time, go parse you some SVG!!


SVG From the Ground Up – Can you imaging that?

It’s time to put the pieces together and get to rendering some SVG!

In the last couple of installments, we were looking at how to do some very low level scanning and parsing. We got through the basics of XML, and looked at some geometry with parsing of the SVG <path> ‘d’ attribute. The next step is to decide on how we’re going to represent an entire document in memory so that we can render the whole image. This is a pretty big subject, so we’ll start with some design constraints and goals.

At the end of the day, I want to turn the .svg file into bits on the screen. The blend2d graphics library has a BLContext object, which does all the drawing that I need. It can draw everything from individual lines, to gradient shaded polygons, and bitmaps. SVG has some particular drawing requirements, in terms of which elements are drawn first, how they are styled, how they are grouped, etc. One example of this is the cascading of styles, in the manner of Cascading Style Sheets (CSS). What this means is that if I turn on one attribute on a grouping element, such as a fill color for polygons, that attribute will be applied to all the elements within that part of the tree, until something changes it.

Example:

<svg
 viewBox='10 10 190 190'
 xmlns="http://www.w3.org/2000/svg">
<g stroke="red" stroke-width='4'>
  <line x1='10' y1='10' x2='200' y2='200'/>
  <line stroke='green' x1='10' y1='100' x2='200' y2='200'/>
  <line stroke-width='8' stroke='blue' x1='10' y1='200' x2='200' y2='200'/>
  <rect x='100' y='10' width='50' height='50' />
</g>
</svg>

The ‘<g…’ serves as a grouping mechanism. It allows you to set some attributes which will apply to all the elements that are within that group. In this case, I set the stroke (the color used to draw lines) to ‘red’. Until something later changes this, the default will be red lines. I also set the stroke-width (number of pixels used to draw the lines). So, again, unless it is explicitly changed, all subsequent lines in the group will have this width.

The first line drawn, since it does not change the color or the width, uses red, and 4.

The second line drawn, changes the color to ‘green’, but does not change the width.

The third line drawn, changes the color to blue, and changes the width to 8

The rectangle does not explicitly change the color or the width, so it is outlined in red with a width of 4, and filled with the default fill color of black.

Of note, changing the attributes on a single element, such as the green line, does not change that attribute for sibling elements, it only applies to that single element. Only attributes applied at a group level will affect the elements within that group, from above.
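The rule in play here (an element’s own value wins, otherwise it falls back to the group) can be sketched in a few lines. This is only an illustration of the rule, assuming C++17 for std::optional; StyleSketch and resolveStyle are hypothetical names, and the library itself gets the same effect through the drawing context rather than by merging structs.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical style record: an empty optional means "not specified here"
struct StyleSketch
{
    std::optional<std::string> stroke;
    std::optional<double> strokeWidth;
};

// Values set on the element win; anything unset falls back to the parent.
// The parent is never modified, so siblings cannot affect each other.
StyleSketch resolveStyle(const StyleSketch& parent, const StyleSketch& own)
{
    StyleSketch out;
    out.stroke = own.stroke ? own.stroke : parent.stroke;
    out.strokeWidth = own.strokeWidth ? own.strokeWidth : parent.strokeWidth;
    return out;
}
```

Running the three lines from the example through this, with the group set to red and 4, gives red/4, green/4 and blue/8, which matches the walkthrough above.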

This imposes some of our first requirements. We need an object that can contain drawing attributes. In addition, there’s a difference between objects that contain the attributes, such as stroke-width, stroke, fill, etc, and actual geometry, such as line, polygon, path. I will drop SVGObject here, as it is a baseline. If you want to follow along, the code is in the svgtypes.h file.


struct IMapSVGNodes;    // forward declaration



struct SVGObject : public IDrawable
{
    XmlElement fSourceElement;

    IMapSVGNodes* fRoot{ nullptr };
    std::string fName{};    // The tag name of the element
    BLVar fVar{};
    bool fIsVisible{ false };
    BLBox fExtent{};



    SVGObject() = delete;
    SVGObject(const SVGObject& other) :fRoot(other.fRoot), fName(other.fName) {}
    SVGObject(IMapSVGNodes* root) :fRoot(root) {}
    virtual ~SVGObject() = default;

    SVGObject& operator=(const SVGObject& other) {
        fRoot = other.fRoot;
        fName = other.fName;
        fVar = other.fVar;  // assign the member, rather than declaring a local that shadows it

        return *this;
    }

    IMapSVGNodes* root() const { return fRoot; }
    virtual void setRoot(IMapSVGNodes* root) { fRoot = root; }

    const std::string& name() const { return fName; }
    void setName(const std::string& name) { fName = name; }

    const bool visible() const { return fIsVisible; }
    void setVisible(bool visible) { fIsVisible = visible; }

    const XmlElement& sourceElement() const { return fSourceElement; }

    // sub-classes should return something interesting as BLVar
    // This can be used for styling, so images, colors, patterns, gradients, etc
    virtual const BLVar& getVariant()
    {
        return fVar;
    }

    void draw(IRender& ctx) override
    {
        ;// draw the object
    }

    virtual void loadSelfFromXml(const XmlElement& elem)
    {
        ;
    }

    virtual void loadFromXmlElement(const svg2b2d::XmlElement& elem)
    {
        fSourceElement = elem;

        // load the common attributes
        setName(elem.name());

        // call to loadselffromxml
        // so sub-class can do its own loading
        loadSelfFromXml(elem);
    }
};

As a base object, it contains the bare minimum that is common across all subsequent objects. It also has a couple of extras which have proven to be convenient, if not strictly necessary.

The strictly necessary part is the ‘void draw(IRender &ctx)’. Almost all objects, whether they be attributes, or elements, will need to affect the drawing context. So, they all will need to be given a chance to do that. The ‘draw()’ routine is what gives them that chance.

All objects need to be able to construct themselves from the xml element stream, so the convenient ‘load..’ functions sit here. Whether it’s an element, or an attribute, it has a name, so we set the name as well. Attributes can set their name independently from being loaded from the XmlElement, so this is a bit of specialization, but it’s ok.

There is this bit of an oddity in the forward declaration of ‘struct IMapSVGNodes; // forward declaration’. As we’ll see much later, we need the ability to lookup nodes based on an ID, so we need an interface somewhere that allows us to do that. As every node constructed might need to do this, we need a way to pass this interface down the tree, without copying it, and without causing circular references, so the forward declaration, and use of the ‘root()’ method.

That’s got us started. We now have something of a base object.

Next up, we have SVGVisualProperty

// SVGVisualProperty
    // This is meant to be the base class for things that are optionally
    // used to alter the graphics context.
    // If isSet() is true, then the drawSelf() is called.
	// sub-classes should override drawSelf() to do the actual drawing
    //
    // This is used for things like; Paint, Transform, Miter, etc.
    //
    struct SVGVisualProperty :  public SVGObject
    {
        bool fIsSet{ false };

        //SVGVisualProperty() :SVGObject(),fIsSet(false){}
        SVGVisualProperty(IMapSVGNodes *root):SVGObject(root),fIsSet(false){}
        SVGVisualProperty(const SVGVisualProperty& other)
            :SVGObject(other)
            ,fIsSet(other.fIsSet)
        {}

        SVGVisualProperty& operator=(const SVGVisualProperty& rhs)
        {
            SVGObject::operator=(rhs);
            fIsSet = rhs.fIsSet;
            
            return *this;
        }

        void set(const bool value) { fIsSet = value; }
        bool isSet() const { return fIsSet; }

		virtual void loadSelfFromChunk(const ByteSpan& chunk)
        {
			;
        }

        virtual void loadFromChunk(const ByteSpan& chunk)
        {
			loadSelfFromChunk(chunk);
        }
        
        // Apply property to the context conditionally
        virtual void drawSelf(IRender& ctx)
        {
            ;
        }

        void draw(IRender& ctx) override
        {
            if (isSet())
                drawSelf(ctx);
        }

    };

It’s not much, and you might question whether it even needs to exist. Maybe its couple of routines can just be merged into the SVGObject itself. That is a simple design change to contemplate, as the only real attribute introduced here is the ‘isSet()’. This is essentially a way to say ‘the value is null’. If I had nullable types, I’d just use that mechanism. But, it also allows you to turn an attribute on and off programmatically, which might turn out to be useful.
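For comparison, here is roughly what the "nullable type" alternative could look like using std::optional from C++17. This is a sketch of the trade-off, not a proposal for the library; WidthContextSketch and StrokeWidthSketch are hypothetical stand-ins for the render context and the property.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for the drawing context
struct WidthContextSketch
{
    double strokeWidth = 1.0;
};

struct StrokeWidthSketch
{
    // An empty optional plays the role of isSet() == false
    std::optional<double> value;

    bool isSet() const { return value.has_value(); }

    // Only touch the context when a value was actually specified
    void apply(WidthContextSketch& ctx) const
    {
        if (value)
            ctx.strokeWidth = *value;
    }
};
```

The catch is that clearing the optional also throws the value away, whereas a separate flag lets you switch a property off and back on while keeping what was parsed.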

Now we can look at a single attribute, the stroke-width, and see how it goes from an xmlElement attribute, to a property in our tree.

    //=========================================================
    // SVGStrokeWidth
    //=========================================================
    
    struct SVGStrokeWidth : public SVGVisualProperty
    {
		double fWidth{ 1.0};

		//SVGStrokeWidth() : SVGVisualProperty() {}
		SVGStrokeWidth(IMapSVGNodes* iMap) : SVGVisualProperty(iMap) {}
		SVGStrokeWidth(const SVGStrokeWidth& other) :SVGVisualProperty(other) { fWidth = other.fWidth; }
        
		SVGStrokeWidth& operator=(const SVGStrokeWidth& rhs)
		{
			SVGVisualProperty::operator=(rhs);
			fWidth = rhs.fWidth;
			return *this;
		}

		void drawSelf(IRender& ctx) override
		{
			ctx.setStrokeWidth(fWidth);
		}

		void loadSelfFromChunk(const ByteSpan& inChunk) override
		{
			fWidth = toNumber(inChunk);
			set(true);
		}

		static std::shared_ptr<SVGStrokeWidth> createFromChunk(IMapSVGNodes* root, const std::string& name, const ByteSpan& inChunk)
		{
			std::shared_ptr<SVGStrokeWidth> sw = std::make_shared<SVGStrokeWidth>(root);

			// Only load when the chunk is not empty; otherwise the property stays unset
			if (inChunk)
				sw->loadFromChunk(inChunk);

			return sw;
		}

		static std::shared_ptr<SVGStrokeWidth> createFromXml(IMapSVGNodes* root, const std::string& name, const XmlElement& elem)
		{
			return createFromChunk(root, name, elem.getAttribute(name));
		}
    };

It starts from the ‘createFromXml…’. We can look at the main parsing loop later, but there is a point where we’re looking at the attributes of an element, and we’ll run across the ‘stroke-width’, and call this function. The ‘createFromChunk’ is then called, which then calls loadFromChunk after instantiating an object.

There are a couple more design choices being made here. First is the fact that we’re using ‘std::shared_ptr’. This implies heap allocation of memory, and this is the right place to finally make such a decision. We did not want the XML parser itself to do any allocations, but we’re finally at the point where we need to. It’s possible to not even do allocations here, just have the attributes allocated on the objects that use them. But, since attributes can be shared, it’s easier just to bite the bullet now, and use shared_ptr.

In the case of stroke-width, we want to save the width specified (call toNumber()), and when it comes time to apply that width, in the ‘drawSelf()’, we make the right call on the drawing context ‘setStrokeWidth()’. Since the same drawing context is used throughout the rendering process, setting an attribute at one point will make that attribute sticky, until something else changes it, which is the behavior that we want.

I would like to describe the ‘stroke’ and ‘fill’ attributes, but they are actually the largest portions of the library. Setting these attributes can occur in so many different ways, it’s worth taking a look at them. Here I will just show a few ways in which they can be used, so you get a feel for how involved they are:

<line stroke="blue" x1='0' y1='0'  x2='100'  y2='100'/>
<line stroke="rgb(0,0,255)" x1='0' y1='0'  x2='100'  y2='100'/> 
<line stroke="rgba(0,0,255,1.0)" x1='0' y1='0'  x2='100'  y2='100'/> 
<line stroke="rgba(0,0,100%,1.0)" x1='0' y1='0'  x2='100'  y2='100'/> 
<line stroke="rgba(0%,0%,100%,100%)" x1='0' y1='0'  x2='100'  y2='100'/> 
<line style = "stroke:blue" x1='0' y1='0'  x2='100'  y2='100'/> 
<line stroke= "url(#SVGID_1)" x1='0' y1='0'  x2='100'  y2='100'/> 

And more…

There is a bewildering assortment of ways in which you can set a stroke or fill, and they don’t all resolve to a single color value. It can be patterns, gradients, even other graphics. So, it can get pretty intense. The SVGPaint structure does a good job of representing all the possibilities, so take a look at that if you want to see the intimate details.
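To give a taste of just one of those forms, here is a small sketch that handles only ‘rgb(r,g,b)’, where each component is either a plain number or a percentage. It is nowhere near what SVGPaint does (no named colors, no rgba, no url references), and RgbSketch, parseComponent and parseRgb are hypothetical names for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <string>

struct RgbSketch
{
    std::uint8_t r, g, b;
    bool ok;    // false when the text was not a recognizable rgb(...)
};

// Parse one component: a number, optionally followed by '%'.
// On success, advances 'p' past the component and stores 0..255 in 'out'.
static bool parseComponent(const char*& p, int& out)
{
    char* end = nullptr;
    double v = std::strtod(p, &end);
    if (end == p)
        return false;               // no number found

    p = end;
    if (*p == '%')
    {
        v = v * 255.0 / 100.0;      // 100% maps to 255
        ++p;
    }

    // Clamp to the valid range (the first test also catches NaN)
    if (!(v >= 0.0)) v = 0.0;
    if (v > 255.0) v = 255.0;

    out = static_cast<int>(v + 0.5);
    return true;
}

RgbSketch parseRgb(const std::string& s)
{
    RgbSketch c{ 0, 0, 0, false };
    if (s.compare(0, 4, "rgb(") != 0)
        return c;

    const char* p = s.c_str() + 4;
    int comps[3] = { 0, 0, 0 };

    for (int i = 0; i < 3; ++i)
    {
        if (!parseComponent(p, comps[i]))
            return c;

        while (*p == ' ') ++p;

        // A comma must separate the first and second pairs of components
        if (i < 2)
        {
            if (*p != ',')
                return c;
            ++p;
        }
    }

    if (*p != ')')
        return c;

    c.r = static_cast<std::uint8_t>(comps[0]);
    c.g = static_cast<std::uint8_t>(comps[1]);
    c.b = static_cast<std::uint8_t>(comps[2]);
    c.ok = true;
    return c;
}
```

Even this one form needs separator checks, percent handling and clamping, which is a fair hint at why the paint handling ends up being such a large part of the library.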

We round out our basic object structures by looking at how shapes are represented.

//
	// SVGVisualNode
	// This is any object that will change the state of the rendering context
	// that's everything from paint that needs to be applied, to geometries
	// that need to be drawn, to line widths, text alignment, and the like.
	// Most things, other than basic attribute type, will be a sub-class of this
	struct SVGVisualNode : public SVGObject
	{

		std::string fId{};      // The id of the element
		std::map<std::string, std::shared_ptr<SVGVisualProperty>> fVisualProperties{};

		SVGVisualNode() = default;
		SVGVisualNode(IMapSVGNodes* root)
			: SVGObject(root)
		{
			setRoot(root);
		}
		SVGVisualNode(const SVGVisualNode& other) :SVGObject(other)
		{
			fId = other.fId;
			fVisualProperties = other.fVisualProperties;
		}


		SVGVisualNode & operator=(const SVGVisualNode& rhs)
		{
			SVGObject::operator=(rhs);  // copy the base class state as well
			fId = rhs.fId;
			fVisualProperties = rhs.fVisualProperties;
			
			return *this;
		}
		
		const std::string& id() const { return fId; }
		void setId(const std::string& id) { fId = id; }
		
		void loadVisualProperties(const XmlElement& elem)
		{
			// Run Through the property creation routines, generating
			// properties for the ones we find in the XmlElement
			for (auto& propconv : gSVGPropertyCreation)
			{
				// get the named attribute
				auto attrName = propconv.first;

				// We have a property and value, convert to SVGVisualProperty
				// and add it to our map of visual properties
				auto prop = propconv.second(root(), attrName, elem);
				if (prop->isSet())
					fVisualProperties[attrName] = prop;

			}
		}

		void setCommonVisualProperties(const XmlElement &elem)
		{
			// load the common stuff that doesn't require
			// any additional processing
			loadVisualProperties(elem);

			// Handle the style attribute separately by turning
			// it into a standalone XmlElement, and then loading
			// that like a normal element, by running through the properties again
			// It's ok if there were already styles in separate attributes of the
			// original elem, because anything in the 'style' attribute is supposed
			// to override whatever was there.
			auto styleChunk = elem.getAttribute("style");

			if (styleChunk) {
				// Create an XML Element to hang the style properties on as attributes
				XmlElement styleElement{};

				// use CSSInlineIterator to iterate through the key value pairs
				// creating a visual attribute, using the gSVGPropertyCreation map
				CSSInlineStyleIterator iter(styleChunk);

				while (iter.next())
				{
					std::string name = std::string((*iter).first.fStart, (*iter).first.fEnd);
					if (!name.empty() && (*iter).second)
					{
						styleElement.addAttribute(name, (*iter).second);
					}
				}

				loadVisualProperties(styleElement);
			}

			// Deal with any more attributes that need special handling
		}

		void loadSelfFromXml(const XmlElement& elem) override
		{
			SVGObject::loadSelfFromXml(elem);
			
			auto id = elem.getAttribute("id");
			if (id)
				setId(std::string(id.fStart, id.fEnd));

			
			setCommonVisualProperties(elem);
		}
		
		// Contains styling attributes
		void applyAttributes(IRender& ctx)
		{
			for (auto& prop : fVisualProperties) {
				prop.second->draw(ctx);
			}
		}
		
		virtual void drawSelf(IRender& ctx)
		{
			;

		}
		
		void draw(IRender& ctx) override
		{
			ctx.save();
			
			applyAttributes(ctx);

			drawSelf(ctx);

			ctx.restore();
		}
	};

We are building up nodes in a tree structure. The SVGVisualNode is essentially the primary node of that construction. At the end of all the tree construction, we want to end up with a root node where we can just call ‘draw(context)’, and have it render itself into the context. That node needs to deal with the Cascading Styles, children drawing in the proper order (painter’s algorithm), deal with all the attributes, and context state.
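One piece of the listing above that is worth isolating is the handling of the ‘style’ attribute in setCommonVisualProperties(), where a string such as "stroke:blue; stroke-width:4" is split into name and value pairs. The following is a standalone sketch of that splitting using std::string rather than ByteSpan; parseInlineStyle and trimSketch are hypothetical names and not the CSSInlineStyleIterator itself. It does not try to cope with quoted strings containing semicolons.

```cpp
#include <cassert>
#include <map>
#include <string>

// Remove leading and trailing whitespace
static std::string trimSketch(const std::string& s)
{
    const char* ws = " \t\r\n";
    size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos)
        return {};
    size_t e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Split "name:value;name:value" into a map.
// A later duplicate overrides an earlier one.
std::map<std::string, std::string> parseInlineStyle(const std::string& style)
{
    std::map<std::string, std::string> out;
    size_t pos = 0;

    while (pos < style.size())
    {
        // Each declaration runs up to the next ';' or the end of the string
        size_t semi = style.find(';', pos);
        if (semi == std::string::npos)
            semi = style.size();

        std::string decl = style.substr(pos, semi - pos);
        size_t colon = decl.find(':');
        if (colon != std::string::npos)
        {
            std::string name = trimSketch(decl.substr(0, colon));
            std::string value = trimSketch(decl.substr(colon + 1));

            // Skip declarations that are missing either half
            if (!name.empty() && !value.empty())
                out[name] = value;
        }

        pos = semi + 1;
    }

    return out;
}
```

Once the pairs are separated like this, they can be treated exactly like ordinary attributes, which is why the real code hangs them on a temporary XmlElement and runs the same property loading over it.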

Of particular note, right there at the end is the ‘draw()’ method. It starts with ‘ctx.save()’ and finishes with ‘ctx.restore()’. This is critical to maintaining the design constraint of ‘attributes are applied locally in the tree’. So, we save the state of the context coming in, make whatever changes we, or our children will make, then restore the state upon exit. This is the essential operation required to maintain proper application of drawing attributes. Luckily, or rather by design, the blend2d library makes saving and restoring state very fast and efficient. If the base library did not have this facility, it would be up to our code to maintain this state.
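If the graphics library did not offer save and restore, the bookkeeping we would have to do ourselves might look something like this sketch: a stack of state snapshots. DrawStateSketch, StateStackSketch and drawChildWithStroke are hypothetical names, and this says nothing about how blend2d implements it internally.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical subset of the drawing state
struct DrawStateSketch
{
    std::string stroke = "none";
    double strokeWidth = 1.0;
};

class StateStackSketch
{
    DrawStateSketch fCurrent;
    std::vector<DrawStateSketch> fStack;

public:
    // Push a snapshot of the current state
    void save() { fStack.push_back(fCurrent); }

    // Pop the most recent snapshot; an unbalanced restore() is ignored
    void restore()
    {
        if (fStack.empty())
            return;
        fCurrent = fStack.back();
        fStack.pop_back();
    }

    DrawStateSketch& state() { return fCurrent; }
};

// "Draw" a child that overrides the stroke, and report what it drew with.
// The caller's state is untouched once this returns.
std::string drawChildWithStroke(StateStackSketch& ctx, const std::string& stroke)
{
    ctx.save();
    ctx.state().stroke = stroke;
    std::string used = ctx.state().stroke;
    ctx.restore();
    return used;
}
```

Set the stroke to red on the context, draw one child with green, and the next sibling still sees red, which is the same local behavior the green line showed in the earlier example.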

Another note here is ‘applyAttributes’. This is what allows things such as the ‘<g>’ element to apply attributes at a high level in the tree, and subsequent elements don’t have to worry about it. They can just apply the attributes that they alter. And where do those common attributes come from?

	static std::map<std::string, std::function<std::shared_ptr<SVGVisualProperty> (IMapSVGNodes *root, const std::string& , const XmlElement& )>> gSVGPropertyCreation = {
		{"fill", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem) {return SVGPaint::createFromXml(root, "fill", elem ); } }
		,{"fill-rule", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem) {return SVGFillRule::createFromXml(root, "fill-rule", elem); } }
		,{"font-size", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem) {return SVGFontSize::createFromXml(root, "font-size", elem); } }
		,{"opacity", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem) {return SVGOpacity::createFromXml(root, "opacity", elem); } }
		,{"stroke", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem ) {return SVGPaint::createFromXml(root, "stroke", elem); } }
		,{"stroke-linejoin", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem) {return SVGStrokeLineJoin::createFromXml(root, "stroke-linejoin", elem); } }
		,{"stroke-linecap", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem ) {return SVGStrokeLineCap::createFromXml(root, "stroke-linecap", elem); } }
		,{"stroke-miterlimit", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem ) {return SVGStrokeMiterLimit::createFromXml(root, "stroke-miterlimit", elem); } }
		,{"stroke-width", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem ) {return SVGStrokeWidth::createFromXml(root, "stroke-width", elem); } }
		,{"text-anchor", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem) {return SVGTextAnchor::createFromXml(root, "text-anchor", elem); } }
		,{"transform", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem) {return SVGTransform::createFromXml(root, "transform", elem); } }
};

A nice dispatch table of the most common attributes. The ‘loadVisualProperties()’ method uses this dispatch table to load the common display properties. Individual geometry objects can load their own specific properties after this, but these are the common ones, so this is very convenient. This table can and should be expanded as more properties are supported.
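The real ‘loadVisualProperties()’ is not listed here, so the following is only a sketch of how such a loader might walk the table. ToyProperty, ToyAttributes, toyPropertyTable, and loadToyProperties are hypothetical stand-ins, and attributes are plain strings rather than ByteSpans.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical, simplified stand-ins for SVGVisualProperty and XmlElement attributes.
struct ToyProperty {
    std::string name;
    std::string value;
};
using ToyAttributes = std::map<std::string, std::string>;
using ToyFactory = std::function<std::shared_ptr<ToyProperty>(const std::string& value)>;

// A tiny version of the property creation table.
inline const std::map<std::string, ToyFactory>& toyPropertyTable()
{
    static const std::map<std::string, ToyFactory> table = {
        {"fill", [](const std::string& v) { return std::make_shared<ToyProperty>(ToyProperty{"fill", v}); }},
        {"stroke", [](const std::string& v) { return std::make_shared<ToyProperty>(ToyProperty{"stroke", v}); }},
        {"stroke-width", [](const std::string& v) { return std::make_shared<ToyProperty>(ToyProperty{"stroke-width", v}); }},
    };
    return table;
}

// For each known property, create it only if the element carries that attribute.
inline std::map<std::string, std::shared_ptr<ToyProperty>> loadToyProperties(const ToyAttributes& attrs)
{
    std::map<std::string, std::shared_ptr<ToyProperty>> props;
    for (const auto& entry : toyPropertyTable()) {
        auto it = attrs.find(entry.first);
        if (it == attrs.end())
            continue;                       // attribute not present on this element
        props[entry.first] = entry.second(it->second);
    }
    return props;
}
```

Attributes that are not in the table, such as ‘cx’ on a circle, are simply left for the geometry object to pick up on its own.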

Finally, let’s get to the meat of the geometry representation. This can be found in the svgshapes.h file.

	struct SVGPathBasedShape : public SVGShape
	{
		BLPath fPath{};
		
		SVGPathBasedShape() :SVGShape() {}
		SVGPathBasedShape(IMapSVGNodes* iMap) :SVGShape(iMap) {}
		
		
		void drawSelf(IRender &ctx) override
		{
			ctx.fillPath(fPath);
			ctx.strokePath(fPath);
		}
	};

Ignoring the SVGShape object (a small shim atop SVGObject), we have a BLPath, and a drawSelf(). What could be simpler? The general premise is that all geometry can be represented as a BLPath at the end of the day. Everything from single lines, to polygons, to complex paths, they all boil down to a BLPath. Making this object hugely simplifies the drawing task. All subsequent geometry classes just need to convert themselves into BLPath, which we’ll see is very easy.

Here is the SVGLine, as it’s fairly simple, and representative of the rest of the geometries.

	struct SVGLine : public SVGPathBasedShape
	{
		BLLine fGeometry{};
		
		SVGLine() :SVGPathBasedShape(){ reset(0, 0, 0, 0); }
		SVGLine(IMapSVGNodes* iMap) :SVGPathBasedShape(iMap) {}
		
		
		void loadSelfFromXml(const XmlElement& elem) override 
		{
			SVGPathBasedShape::loadSelfFromXml(elem);
			
			fGeometry.x0 = parseDimension(elem.getAttribute("x1")).calculatePixels();
			fGeometry.y0 = parseDimension(elem.getAttribute("y1")).calculatePixels();
			fGeometry.x1 = parseDimension(elem.getAttribute("x2")).calculatePixels();
			fGeometry.y1 = parseDimension(elem.getAttribute("y2")).calculatePixels();

			fPath.addLine(fGeometry);
		}

		static std::shared_ptr<SVGLine> createFromXml(IMapSVGNodes *iMap, const XmlElement& elem)
		{
			auto shape = std::make_shared<SVGLine>(iMap);
			shape->loadFromXmlElement(elem);

			return shape;
		}

		
	};

It’s fairly boilerplate. Just have to get the right attributes turned into values for the BLLine geometry, and add that to our path. That’s it. The rect, circle, ellipse, polyline, polygon, and path objects, all do pretty much the same thing, in as small a space. These are much simpler than having to deal with the ‘stroke’ or ‘fill’ attributes. There is some trickery here in terms of parsing the actual coordinate values, because they can be represented in different kinds of units, but the SVGDimension object deals with all those details.

That’s about enough code for this time around. We’ve looked at attributes, and VisualNodes, and we know that we need cascading styles, painter’s algorithm drawing order, and an ability to draw into a context. Now we have all the pieces we need to complete the final rendering task.

Next time around, I’ll wrap it up by bringing in the SVG ‘parser’, which will combine the XML scanning with our document tree, and render final images.


SVG From the Ground Up – Along the right path

<svg xmlns="http://www.w3.org/2000/svg" width="22" height="22" viewBox="0 0 22 22">
	<path d="M20.658,9.26l0.006-0.007l-9-8L11.658,1.26C11.481,1.103,11.255,1,11,1 
	c-0.255,0-0.481,0.103-0.658,0.26l-0.006-0.007l-9,8L1.342,9.26
	C1.136,9.443,1,9.703,1,10c0,0.552,0.448,1,1,1 c0.255,0,0.481-0.103,0.658-0.26l0.006,0.007
	L3,10.449V20c0,0.552,0.448,1,1,1h5v-8h4v8h5c0.552,0,1-0.448,1-1v-9.551l0.336,0.298 
	l0.006-0.007C19.519,10.897,19.745,11,20,11c0.552,0,1-0.448,1-1C21,9.703,20.864,9.443,20.658,9.26z 
	M7,16H5v-3h2V16z M17,16h-2 v-3h2V16z"/>
</svg>

In the last installment (It’s XML, How hard could it be?), we got as far as being able to scan XML, and generate a bunch of XmlElement objects. That’s a great first couple of steps, and now the really interesting parts begin. But, first, before we get knee deep into the seriousness of the rest of SVG, we need to deal with the graphics subsystem. It’s one thing to ‘parse’ SVG, and even build up a Document Object Model (DOM). It’s quite another to actually do the rendering of the same. To do both, in a compact form, with speed and agility, that’s what we’re after.

This time around I’m going to introduce blend2d, which is the graphics library that I’m using to do all my drawing. I stumbled across blend2d a few years ago, and I don’t even remember how I found it. There are a couple of key aspects to it that are of note. One is that the API is really simple to use, and the library is easy to build. The other part, is more esoteric, but perfect for our needs here. The library was built around support for SVG. So, it has all the functions we need to build the typical kinds of graphics that we’re concerned with. I won’t go into excruciating detail about the blend2d API here, as you can visit the project on github, but I will take a look at the BLPath object, because this is the true workhorse of most SVG graphics.

The little house graphic above is typical of the kinds of little vector based icons you find all over the place. In your apps as icons, on Etsy as laser cuttable images, etc. Besides the opening ‘<svg…’, you see the ‘<path…’. SVG images are comprised of various geometry elements such as rect, circle, polyline, polygon, and path. If you really want to get into the nitty gritty details, you can check out the full SVG Specification.

The path geometry is used to describe a series of movements a pen might make on a plotter. MoveTo, LineTo, CurveTo, etc. There are a total of 20 commands you can use to build up a path, and they can be used in almost any combination to create as complex a figure as you want.

    // Shaper contour Commands
    // Origin from SVG path commands
    // M - move       (M, m)
    // L - line       (L, l, H, h, V, v)
    // C - cubic      (C, c, S, s)
    // Q - quad       (Q, q, T, t)
    // A - ellipticArc  (A, a)
    // Z - close        (Z, z)
    enum class SegmentCommand : uint8_t
    {
        INVALID = 0
        , MoveTo = 'M'
        , MoveBy = 'm'
        , LineTo = 'L'
        , LineBy = 'l'
        , HLineTo = 'H'
        , HLineBy = 'h'
        , VLineTo = 'V'
        , VLineBy = 'v'
        , CubicTo = 'C'
        , CubicBy = 'c'
        , SCubicTo = 'S'
        , SCubicBy = 's'
        , QuadTo = 'Q'
        , QuadBy = 'q'
        , SQuadTo = 'T'
        , SQuadBy = 't'
        , ArcTo = 'A'
        , ArcBy = 'a'
        , CloseTo = 'Z'
        , CloseBy = 'z'
    };

A single path has a ‘d’ attribute, which contains a series of these commands strung together. It’s a very compact description for geometry. A single path can be used to generate something quite complex.

With the exception of the blue text, that entire image is generated with a single path element. Quite complex indeed.

Being able to parse the ‘d’ attribute of the path element is super critical to our success in ultimately rendering SVG. There are a few design goals we have in doing this.

  • Be as fast as possible
  • Be as memory efficient as possible
  • Do not introduce intermediate forms if possible

No big deal, right? Well, as luck would have it, or rather by design, the blend2d library has an object, BLPath, which was designed for exactly this task. You can check out the API documentation if you want to look at the details, but it essentially has all those ‘moveTo’, ‘lineTo’, etc, and a whole lot more. It only implements the ‘to’ forms, and not the ‘by’ forms, but it’s easy to get the last vertex and implement the ‘by’ forms ourselves, which we’ll do.

So, our implementation strategy will be to read a command, and read enough numbers to make a call to a BLPath object to actually create the geometry. The entirety of the code is roughly 500 lines, and most of it is boilerplate, so I won’t bother listing it all here, but you can check it out online in the parseblpath.h file.

Let’s look at a little snippet of our house path, and see what it’s doing.

M20.658,9.26l0.006-0.007l-9-8

It is hard to see in this way, so let me write it another way.

M 20.658, 9.26
l 0.006, -0.007
l -9, -8

Said as a series of instructions (and it’s hard to tell between ‘one’ and ‘el’), it would be:

Move to 20.658, 9.26
Line by 0.006, -0.007
Line by -9, -8

If I were to do it as code in blend2d, it would be

BLPath path{};
BLPoint lastPt{};

path.moveTo(20.658, 9.26);
path.getLastVertex(&lastPt);

path.lineTo(lastPt.x + 0.006, lastPt.y+ -0.007);
path.getLastVertex(&lastPt);

path.lineTo(lastPt.x + -9, lastPt.y + -8);

So, our coding task is to get from that cryptic ‘d’ attribute to the code connecting to the BLPath object. Let’s get started.

The first thing we’re going to need is a main routine that drives the process.

		static bool parsePath(const ByteSpan& inSpan, BLPath& apath)
		{
			// Use a ByteSpan as a cursor on the input
			ByteSpan s = inSpan;
			SegmentCommand currentCommand = SegmentCommand::INVALID;
			int iteration = 0;

			while (s)
			{
				// ignore leading whitespace
				s = chunk_ltrim(s, whitespaceChars);

				// If we've gotten to the end, we're done
				// so just return
				if (!s)
					break;

				if (commandChars[*s])
				{
					// we have a command
					currentCommand = SegmentCommand(*s);
					
					iteration = 0;
					s++;
				}

				// Use parseMap to dispatch to the appropriate
				// parse function
				if (!parseMap[currentCommand](s, apath, iteration))
					return false;


			}

			return true;
		}

Takes a ByteSpan and a reference to a BLPath object, and returns ‘true’ if successful, ‘false’ otherwise. There are design choices to be made at every step of course. Why did I pass in a reference to a BLPath, instead of just constructing one inside the routine, and handing it back? Well, because this way, I allow something else to decide where the memory is allocated. This way also allows you to build upon an existing path if you want.

Second choice is, why a const ByteSpan? That’s a harder one. This allows a greater number of choices in terms of where the ByteSpan is coming from, such as you might have been passed a const span to begin with. But mainly it’s a contract that says “this routine will not alter the span”.

OK, so following along, we make a copy of the input span, which does NOT copy any of the underlying data, it just sets up a couple of pointers. Then we use this ‘s’ span to do our movement. The main ‘while’ starts with a ‘trim’. XML, and thus SVG, are full of optional whitespace. I can say that for almost every routine, the first thing you want to do is eliminate whitespace. The ‘chunk_ltrim()’ function is very short and efficient, so liberal usage of that is a good thing.

Now we’re sitting at the ‘M’, so we first check to see if it’s one of our command characters. If it is, then we use it as our current command, and advance our pointer. The ‘iteration = 0’ is only useful for the Move commands, but we need that, as we’ll soon see.

Last, we have that cryptic function call thing

				if (!parseMap[currentCommand](s, apath, iteration))
					return false;

All set! Easy peasy, our task is done here…

That last little bit of function call trickery is using a dispatch table to make a call to a function. So let’s look at the dispatch table.

		// A dispatch std::map that matches the command character to the
		// appropriate parse function
		static std::map<SegmentCommand, std::function<bool(ByteSpan&, BLPath&, int&)>> parseMap = {
			{SegmentCommand::MoveTo, parseMoveTo},
			{SegmentCommand::MoveBy, parseMoveBy},
			{SegmentCommand::LineTo, parseLineTo},
			{SegmentCommand::LineBy, parseLineBy},
			{SegmentCommand::HLineTo, parseHLineTo},
			{SegmentCommand::HLineBy, parseHLineBy},
			{SegmentCommand::VLineTo, parseVLineTo},
			{SegmentCommand::VLineBy, parseVLineBy},
			{SegmentCommand::CubicTo, parseCubicTo},
			{SegmentCommand::CubicBy, parseCubicBy},
			{SegmentCommand::SCubicTo, parseSmoothCubicTo},
			{SegmentCommand::SCubicBy, parseSmoothCubyBy},
			{SegmentCommand::QuadTo, parseQuadTo},
			{SegmentCommand::QuadBy, parseQuadBy},
			{SegmentCommand::SQuadTo, parseSmoothQuadTo},
			{SegmentCommand::SQuadBy, parseSmoothQuadBy},
			{SegmentCommand::ArcTo, parseArcTo},
			{SegmentCommand::ArcBy, parseArcBy},
			{SegmentCommand::CloseTo, parseClose},
			{SegmentCommand::CloseBy, parseClose}
		};

Dispatch tables are the modern day C++ equivalent of the giant switch statement typically found in such programs. I actually started with the giant switch statement, then said to myself, “why don’t I just use a dispatch table”. They are functionally equivalent. In this case, we have a std::map, which uses a single SegmentCommand as the key. Each element is tied to a function, that takes the same set of parameters, namely a ByteSpan, a BLPath, and an int. As you can see, there is a function for each of our 20 commands.
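One difference from a switch is worth a sketch. std::map::operator[] default-inserts an empty std::function when the key is missing, and calling an empty std::function throws std::bad_function_call. As far as the parsePath listing shows, that is what would happen if the input did not begin with a command character, since the current command would still be INVALID. Looking the key up with find() avoids it. The handlers and names below (ToyHandler, dispatchCommand) are hypothetical toys, not the real parse functions.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical toy handlers; each one appends a word to a log.
using ToyHandler = std::function<bool(std::string& log)>;

// Look the command up with find() so an unknown command is reported
// as a failure instead of inserting and calling an empty function.
inline bool dispatchCommand(char cmd, std::string& log)
{
    static const std::map<char, ToyHandler> table = {
        {'M', [](std::string& out) { out += "moveTo "; return true; }},
        {'L', [](std::string& out) { out += "lineTo "; return true; }},
        {'Z', [](std::string& out) { out += "close "; return true; }},
    };

    auto it = table.find(cmd);
    if (it == table.end())
        return false;
    return it->second(log);
}
```

Making the table const also helps here, because operator[] is not available on a const std::map, so the accidental insert cannot compile.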

I won’t go into every single one of those 20 commands, but looking at a couple will be instructive. Let’s start with the MoveTo

		static bool parseMoveTo(ByteSpan& s, BLPath& apath, int& iteration)
		{
			double x{ 0 };
			double y{ 0 };

			if (!parseNextNumber(s, x))
				return false;
			if (!parseNextNumber(s, y))
				return false;

			if (iteration == 0)
				apath.moveTo(x, y);
			else
				apath.lineTo(x, y);

			iteration++;

			return true;
		}

This has a few objectives.

  • Parse a couple of numbers
  • Call the appropriate function on the BLPath object
  • Increment the ‘iteration’ parameter
  • Advance the pointer, indicating how much we’ve consumed
  • Return false on failure, true on success

This pattern is repeated for each of the other 19 functions. One thing to know about all the commands, and the reason the main loop is structured the way it is: you can have multiple sets of numbers after the initial set. In the case of MoveTo, the following is a valid input stream.

M 0,0 20,20 30,30 40,40

The way you treat it, in the case of MoveTo, is that the initial numbers set an origin (0,0), and all subsequent number pairs are implied LineTo commands. That’s why we need to know the iteration. If the iteration is ‘0’, then we need to call moveTo on the BLPath object. If the iteration is greater than 0, then we need to call lineTo on the BLPath. All the commands behave in a similar fashion, except they don’t change based on the iteration number.
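Here is a small self-contained sketch of that iteration rule, recording the calls instead of building a BLPath. ToyCall, toyNextNumber, and toyParseMove are hypothetical names, and std::strtod is used only to keep the sketch short; it is not how the real parser converts numbers.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical record of a call that would be made on a path object.
struct ToyCall {
    char op;      // 'M' for moveTo, 'L' for lineTo
    double x;
    double y;
};

// Read one number, skipping spaces and commas first.
inline bool toyNextNumber(const char*& p, double& out)
{
    while (*p == ' ' || *p == ',' || *p == '\t' || *p == '\n' || *p == '\r')
        ++p;
    char* end = nullptr;
    out = std::strtod(p, &end);
    if (end == p)
        return false;       // nothing numeric here
    p = end;
    return true;
}

// Handle a single absolute 'M' command followed by any number of pairs.
inline std::vector<ToyCall> toyParseMove(const std::string& d)
{
    std::vector<ToyCall> calls;
    const char* p = d.c_str();
    while (*p == ' ')
        ++p;
    if (*p != 'M')
        return calls;
    ++p;

    int iteration = 0;
    double x = 0;
    double y = 0;
    while (toyNextNumber(p, x) && toyNextNumber(p, y)) {
        // First pair is the move, every later pair is an implied line
        calls.push_back({iteration == 0 ? 'M' : 'L', x, y});
        ++iteration;
    }
    return calls;
}
```

Feeding it the stream above yields one move followed by three lines.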

Well gee whiz, that seems pretty simple and straightforward. Don’t know what all the fuss is about. Hidden within the parseMoveTo() is parseNextNumber(), so let’s take a look at that as this is where all the bugs can be found.

// Consume the next number off the front of the chunk
// modifying the input chunk to advance past the  number
// we removed.
// Return true if we found a number, false otherwise
		static inline bool parseNextNumber(ByteSpan& s, double& outNumber)
		{
			static charset whitespaceChars(",\t\n\f\r ");          // whitespace found in paths

			// clear up leading whitespace, including ','
			s = chunk_ltrim(s, whitespaceChars);

			ByteSpan numChunk{};
			s = scanNumber(s, numChunk);

			if (!numChunk)
				return false;

			outNumber = chunk_to_double(numChunk);

			return true;
		}

The comment gives you the flavor of it. Again, we start with trimming ‘whitespace’, before doing anything. This is very important. In the case of these numbers, ‘whitespace’ not only includes the typical SPACE (0x20), TAB, etc, but also the COMMA (‘,’) character. “M20,20” and “M20 20” and “M 20 20” and “M 20, 20” and even “M,20,20” are all treated as equivalent by this parser. So, if you’re going to be parsing numbers in a sequence, you’re going to have to deal with all those cases. The easiest thing to do is trim whitespace before you start. I will point out the convenience of the charset construction. Super easy.
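If you don’t have the earlier installment handy, the idea behind a charset and a left trim can be sketched in a few lines. This is not the actual charset or chunk_ltrim code; ToyCharset and toyLtrim are hypothetical stand-ins built on std::bitset and std::string_view (C++17).

```cpp
#include <bitset>
#include <string_view>

// Hypothetical minimal character set: one bit per possible byte value.
struct ToyCharset {
    std::bitset<256> bits{};

    explicit ToyCharset(std::string_view chars)
    {
        for (unsigned char c : chars)
            bits.set(c);
    }

    bool contains(char c) const { return bits.test(static_cast<unsigned char>(c)); }
};

// Drop leading characters that belong to the set.
// Nothing is copied; the view is just narrowed from the front.
inline std::string_view toyLtrim(std::string_view s, const ToyCharset& skip)
{
    while (!s.empty() && skip.contains(s.front()))
        s.remove_prefix(1);
    return s;
}
```

With the set built from ",\t\n\f\r " every one of the spellings above leaves you sitting on the first digit.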

We trim the whitespace off the front, then call ‘scanNumber()’. That’s another workhorse routine, which is worth looking into, but I won’t put the code here. You can find it in the bspanutil.h file. I will put the comment associated with the code here though, as it’s informative.

// Parse a number which may have units after it
//   1.2em
// -1.0E2em
// 2.34ex
// -2.34e3M10,20
// 
// By the end of this routine, the numchunk represents the range of the 
// captured number.
// 
// The returned chunk represents what comes next, and can be used
// to continue scanning the original inChunk
//
// Note:  We assume here that the inChunk is already positioned at the start
// of a number (including +/- sign), with no leading whitespace

This is probably the most singularly important routine in the whole library. It has the big task of figuring out numbers from a stream of characters. Those numbers, as you can see from the examples, come in many different forms, and things can get confusing. Here’s another example of a sequence of characters it needs to be able to figure out: “M-1.7-82L.92 27”. You save yourself a ton of time, headache, and heartburn by getting this right.
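To give a feel for what that routine has to decide, here is a minimal sketch of a number scanner over a std::string_view. It is not the scanNumber() from bspanutil.h; toyScanNumber is a hypothetical simplification that follows the same contract described in the comment: capture the number, return what comes next.

```cpp
#include <cstddef>
#include <string_view>

inline bool toyIsDigit(char c) { return c >= '0' && c <= '9'; }

// Capture the leading number in 's' into 'num' and return what follows it.
// If no number is found, 'num' is empty and 's' is returned unchanged.
// Assumes 's' is already positioned at the start of the number.
inline std::string_view toyScanNumber(std::string_view s, std::string_view& num)
{
    std::size_t i = 0;
    const std::size_t n = s.size();

    // optional sign
    if (i < n && (s[i] == '+' || s[i] == '-'))
        ++i;

    // integer and fraction digits; at least one digit is required overall
    std::size_t digits = 0;
    while (i < n && toyIsDigit(s[i])) { ++i; ++digits; }
    if (i < n && s[i] == '.') {
        ++i;
        while (i < n && toyIsDigit(s[i])) { ++i; ++digits; }
    }
    if (digits == 0) {
        num = {};
        return s;
    }

    // 'e' or 'E' only counts as an exponent when digits follow it,
    // otherwise it is the start of a unit such as "em" or "ex"
    if (i < n && (s[i] == 'e' || s[i] == 'E')) {
        std::size_t j = i + 1;
        if (j < n && (s[j] == '+' || s[j] == '-'))
            ++j;
        if (j < n && toyIsDigit(s[j])) {
            while (j < n && toyIsDigit(s[j]))
                ++j;
            i = j;
        }
    }

    num = s.substr(0, i);
    return s.substr(i);
}
```

Run against “-1.7-82L.92 27”, the first call captures “-1.7” and the second captures “-82”, because a sign in the middle of a number ends it.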

The next choice you make is how to convert from the number that we scanned (it’s still just a stream of ASCII characters) into an actual ‘double’. This is the point where most programmers might throw up their hands and reach for their trusty ‘strtod’ or ye olde ‘atof’, or even ‘sscanf’. There’s a whole science to this, just know that strtod() is not your friend, and for something you’ll be doing millions of times, it’s worth investigating some alternatives. I highly recommend reading the code for fast_double_parser. If you want to examine what I do, check out the chunk_to_double() routine within the bspanutil.h file.
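As a rough illustration of the hand-rolled route, here is a naive converter. It is not chunk_to_double() and not the fast_double_parser algorithm; toyToDouble is a hypothetical sketch that avoids allocation and locale lookups, but it is NOT guaranteed to be correctly rounded for every input, which is exactly the part the real libraries work hard on.

```cpp
#include <cmath>
#include <cstddef>
#include <string_view>

// Hypothetical helper: convert text already known to be a number
// (for example, the output of a scanner) into a double.
// Simple and allocation-free, but not correctly rounded in every case.
inline double toyToDouble(std::string_view s)
{
    std::size_t i = 0;
    const std::size_t n = s.size();
    double sign = 1.0;

    if (i < n && (s[i] == '+' || s[i] == '-')) {
        if (s[i] == '-')
            sign = -1.0;
        ++i;
    }

    // integer part
    double value = 0.0;
    while (i < n && s[i] >= '0' && s[i] <= '9')
        value = value * 10.0 + (s[i++] - '0');

    // fraction part
    if (i < n && s[i] == '.') {
        ++i;
        double scale = 0.1;
        while (i < n && s[i] >= '0' && s[i] <= '9') {
            value += (s[i++] - '0') * scale;
            scale *= 0.1;
        }
    }

    // optional exponent
    if (i < n && (s[i] == 'e' || s[i] == 'E')) {
        ++i;
        int esign = 1;
        if (i < n && (s[i] == '+' || s[i] == '-')) {
            if (s[i] == '-')
                esign = -1;
            ++i;
        }
        int exponent = 0;
        while (i < n && s[i] >= '0' && s[i] <= '9')
            exponent = exponent * 10 + (s[i++] - '0');
        value *= std::pow(10.0, esign * exponent);
    }

    return sign * value;
}
```

For icon-sized coordinates the small rounding error is usually invisible, but if you need exact round-tripping, a vetted parser is the safer choice.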

We’re getting pretty far into the weeds down here, so let’s look at one more function, the LineTo

		static bool parseLineTo(ByteSpan& s, BLPath& apath, int& iteration)
		{
			double x{ 0 };
			double y{ 0 };

			if (!parseNextNumber(s, x))
				return false;
			if (!parseNextNumber(s, y))
				return false;

			apath.lineTo(x, y);

			iteration++;

			return true;
		}

Same as MoveTo, parse a couple of numbers, apply them to the right function on the path object, return true or false. Just do the same thing 18 more times for the other functions, and you’ve got your path ‘parser’.
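The relative (‘by’) variants differ only in that they offset from the last vertex before calling the absolute function. Here is that idea as a sketch against a stand-in path. ToyPoint, ToyPath, and toyLineBy are hypothetical; the real code works against BLPath.

```cpp
#include <vector>

struct ToyPoint {
    double x;
    double y;
};

// Hypothetical stand-in for a path that only offers absolute operations.
struct ToyPath {
    std::vector<ToyPoint> vertices{};

    void moveTo(double x, double y) { vertices.push_back({x, y}); }
    void lineTo(double x, double y) { vertices.push_back({x, y}); }

    // Reports false when the path has no vertices yet
    bool getLastVertex(ToyPoint& out) const
    {
        if (vertices.empty())
            return false;
        out = vertices.back();
        return true;
    }
};

// Relative line: offset from the last vertex, then reuse the absolute call.
inline bool toyLineBy(ToyPath& path, double dx, double dy)
{
    ToyPoint last{};
    if (!path.getLastVertex(last))
        return false;       // nothing to be relative to
    path.lineTo(last.x + dx, last.y + dy);
    return true;
}
```

A relative line on an empty path has nothing to be relative to, so the sketch reports failure rather than guessing an origin.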

To recap, parsing the ‘d’ attribute is one of the most important parts of any SVG parser. In this case, we want to get from the text to an actual object we can render, as quickly as possible. A BLPath alone is not enough to create great images, we still have a long way to go until we start seeing pretty pictures on the screen. Parsing the path is critical to getting there though. This is where you could waste tons of time and memory, so it’s worth considering the options carefully. In this case, we’ve chosen to represent the path in memory using a data structure that can be a part of a graphic elements tree, as well as being handed to the drawing engine directly, without having to transform it once again before actually drawing.

There you have it. One step closer to our beautiful images.

Next time around, we need to look at what kind of Document Object Model (DOM) we want to construct, and how our SVG parser will construct it.


SVG From the Ground Up – It’s XML, How hard could it be?

Let’s take a look at the SVG (XML) code that generates that image.

<svg height="200" width="680" xmlns="http://www.w3.org/2000/svg">
    <circle cx="70" cy="70" r="50" />
    <circle cx="200" cy="70" r="50" fill="#79C99E" />
    <circle cx="330" cy="70" r="50" fill="#79C99E" stroke-width="10" stroke="#508484" />
    <circle cx="460" cy="70" r="50" fill="#79C99E" stroke-width="10" />
    <circle cx="590" cy="70" r="50" fill="none" stroke-width="10" stroke="#508484" />
</svg>

By the end of this post, we should be able to scan through the components of that, and generate the tokens necessary to begin rendering it as SVG. So, where to start?

Last time around (SVG From the Ground Up – Parsing Fundamentals), I introduced the ByteSpan and charset data structures, as a way to say “these are the only tools you’ll need…”. Well, at least they are certainly the core building blocks. Now we’re going to actually use those components to begin the process of breaking down the XML. XML can be a daunting sprawling beast. Its origins are in an even older document technology known as SGML. The first specification for the language can be found here: Extensible Markup Language (XML) 1.0 (Fifth Edition). When I joined the team at Microsoft in 1998 to work on this under Jean Paoli, one of the original authors, there were probably 30 people across dev, test, and pm. Of course we had people working on the standards body, and I was working on XSLT, and a couple on the parser, someone on DTD schema. It was quite a production. At that time, we had to deal with myriad encodings (utf-8 did not rule the world yet), conformance and compliance test suites, and that XSLT beast (CSS did not rule the world yet). It was a daunting endeavor, and at some point we tried to color everything with XML, much to the chagrin of most other people. But, some things did come out of that era, and SVG is one of them.

Today, our task is not to implement a fully compliant validating parser. That again would take a team of a few, and a ton of testing. What we’re after is something more modest. Something a hobby hacker could throw together in a weekend, but have a fair chance at it being able to consume most of the SVG you’re ever really interested in. To that end, there’s a much smaller, simpler XML spec out there. MicroXML. This describes a subset of XML that leaves out all the really hard parts. While that spec is far more readable, we’ll go even one step simpler. With our parser here, we won’t even be supporting utf-8. That might seem like a tragic simplification, but the reality is, not even that’s needed for most of what we’ll be doing with SVG. So, here’s the list of what we will be doing.

  • Decoding elements
  • Decoding attributes
  • Decoding element content (supporting text nodes)
  • Skipping Doctype
  • Skipping Comments
  • Skipping Processing Instructions
  • Not expanding character entities (although user can)

As you will soon see, “skipping” doesn’t mean you lose access to the data, it just means our SVG parser won’t do anything with it. This is a nice extensibility point. We start simple, and you can add as much complexity as you want over time, without changing the fundamental structure of what we’re about to build.

Now for some types and enums. I won’t put the entirety of the code in here, so if you want to follow along, you can look at the xmlscan.h file. We’ll start with the XML element types.

    enum XML_ELEMENT_TYPE {
        XML_ELEMENT_TYPE_INVALID = 0
		, XML_ELEMENT_TYPE_XMLDECL                  // An XML declaration, like <?xml version="1.0" encoding="UTF-8"?>
        , XML_ELEMENT_TYPE_CONTENT                  // Content, like <foo>bar</foo>, the 'bar' is content
        , XML_ELEMENT_TYPE_SELF_CLOSING             // A self-closing tag, like <foo/>
        , XML_ELEMENT_TYPE_START_TAG                // A start tag, like <foo>
        , XML_ELEMENT_TYPE_END_TAG                  // An end tag, like </foo>
        , XML_ELEMENT_TYPE_COMMENT                  // A comment, like <!-- foo -->
        , XML_ELEMENT_TYPE_PROCESSING_INSTRUCTION   // A processing instruction, like <?foo bar?>
        , XML_ELEMENT_TYPE_CDATA                    // A CDATA section, like <![CDATA[ foo ]]>
        , XML_ELEMENT_TYPE_DOCTYPE                  // A DOCTYPE section, like <!DOCTYPE foo>
    };

This is where we indicate what kinds of pieces of the XML file we will recognize. If something is not in this list, it will either be reported as invalid, or it will simply cause the scanner to stop processing. From the little bit of XML that opened this article, we see “START_TAG”, “SELF_CLOSING”, “END_TAG”. And that’s it!! Simple right?
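As a rough picture of how those kinds can be told apart, the first few characters of a markup item are nearly enough. The sketch below is not the scanner from xmlscan.h; ToyKind, toyStartsWith, and toyClassify are hypothetical, it works on a whole ‘<…>’ item at once, and it lumps the XML declaration in with processing instructions.

```cpp
#include <cstddef>
#include <string_view>

enum class ToyKind {
    Invalid, StartTag, EndTag, SelfClosing,
    Comment, ProcessingInstruction, CData, Doctype
};

inline bool toyStartsWith(std::string_view s, std::string_view prefix)
{
    return s.size() >= prefix.size() && s.substr(0, prefix.size()) == prefix;
}

// 'tag' is the full text of one markup item, from '<' through '>'.
inline ToyKind toyClassify(std::string_view tag)
{
    if (tag.size() < 3 || tag.front() != '<' || tag.back() != '>')
        return ToyKind::Invalid;

    // Longer, more specific prefixes are checked first
    if (toyStartsWith(tag, "<!--"))      return ToyKind::Comment;
    if (toyStartsWith(tag, "<![CDATA[")) return ToyKind::CData;
    if (toyStartsWith(tag, "<!DOCTYPE")) return ToyKind::Doctype;
    if (toyStartsWith(tag, "<?"))        return ToyKind::ProcessingInstruction;
    if (toyStartsWith(tag, "</"))        return ToyKind::EndTag;

    // A '/' right before the closing '>' marks a self-closing tag
    if (tag[tag.size() - 2] == '/')      return ToyKind::SelfClosing;
    return ToyKind::StartTag;
}
```

For the circles example, the ‘<svg …>’ line comes back as a start tag, each ‘<circle … />’ as self-closing, and ‘</svg>’ as an end tag.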

OK. Next up are a couple of data structures which are the guts of the XML itself. First is the XmlName. Although we’re not building a super conformant parser, there are some simple realities we need to be able to handle to make our future life easier. XML namespaces are one of those things. In XML, you can have a name with a ‘:’ in it, which puts the name into a namespace. Without too much detail, just know that “circle”, could have been “svg:circle”, or something, and possibly mean the same thing. We need a data structure that will capture this.

    struct XmlName {
        ByteSpan fNamespace{};
        ByteSpan fName{};

        XmlName() = default;
        
        XmlName(const ByteSpan& inChunk)
        {
            reset(inChunk);
        }

        XmlName(const XmlName &other):fNamespace(other.fNamespace), fName(other.fName){}
        
        XmlName& operator =(const XmlName& rhs)
        {
            fNamespace = rhs.fNamespace;
            fName = rhs.fName;
            return *this;
        }
        
        XmlName & operator=(const ByteSpan &inChunk)
        {
            reset(inChunk);
            return *this;
        }
        
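		// Implement for std::map, and ordering in general.
		// Compare the namespace first, then the name. When one span is a
		// prefix of the other, the shorter one sorts first.
		bool operator < (const XmlName& rhs) const
		{
			size_t nsbytes = std::min(fNamespace.size(), rhs.fNamespace.size());
			int nscmp = (nsbytes > 0) ? memcmp(fNamespace.begin(), rhs.fNamespace.begin(), nsbytes) : 0;
			if (nscmp != 0)
				return nscmp < 0;
			if (fNamespace.size() != rhs.fNamespace.size())
				return fNamespace.size() < rhs.fNamespace.size();

			size_t namebytes = std::min(fName.size(), rhs.fName.size());
			int namecmp = (namebytes > 0) ? memcmp(fName.begin(), rhs.fName.begin(), namebytes) : 0;
			if (namecmp != 0)
				return namecmp < 0;
			return fName.size() < rhs.fName.size();
		}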
        
        // Allows setting the name after it's been created
        XmlName& reset(const ByteSpan& inChunk)
        {
            fName = inChunk;
            fNamespace = chunk_token(fName, charset(':'));
            if (chunk_size(fName)<1)
            {
                fName = fNamespace;
                fNamespace = {};
            }
            return *this;
        }
        
		ByteSpan name() const { return fName; }
		ByteSpan ns() const { return fNamespace; }
	};

Given a ByteSpan, our universal data representation, split it out into the ‘namespace’ and ‘name’ parts, if they exist. Then we can get the name part by calling ‘name()’, and if there was a namespace part, we can get that from ‘ns()’. Why ‘ns’ instead of ‘namespace’? Because ‘namespace’ is a keyword in C++, and we don’t want any confusion or compiler errors.
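The split itself is small enough to sketch with std::string_view in place of ByteSpan. toySplitName is a hypothetical helper, not part of the library; it mirrors what reset() appears to do, including treating a trailing ‘:’ with nothing after it as a plain name.

```cpp
#include <cstddef>
#include <string_view>
#include <utility>

// Returns {namespace, name}. The namespace is empty when there is no ':'
// or when nothing follows the ':'.
inline std::pair<std::string_view, std::string_view> toySplitName(std::string_view qname)
{
    const std::size_t colon = qname.find(':');
    if (colon == std::string_view::npos)
        return {std::string_view{}, qname};

    std::string_view ns = qname.substr(0, colon);
    std::string_view name = qname.substr(colon + 1);
    if (name.empty())
        return {std::string_view{}, ns};   // "svg:" is treated as the name "svg"
    return {ns, name};
}
```

Both halves are views into the original text, so nothing is copied, which is the same trade the ByteSpan version makes.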

One thing to note here is the implementation of the ‘operator <‘. Why is that there? Because if you want to use this as the key in an associative container, such as std::map, you need some comparison operator, and by implementing ‘<‘, you get a quick and dirty comparison operator. This is a future enhancement we’ll use later.

Next up is the representation of an XML node itself, where we have XmlElement.

    // Representation of an xml element
    // The xml iterator will generate these
    struct XmlElement
    {
    private:
        int fElementKind{ XML_ELEMENT_TYPE_INVALID };
        ByteSpan fData{};

        XmlName fXmlName{};
        std::string fName{};
        std::map<std::string, ByteSpan> fAttributes{};

    public:
        XmlElement() {}
        XmlElement(int kind, const ByteSpan& data, bool autoScanAttr = false)
            :fElementKind(kind)
            , fData(data)
        {
            reset(kind, data, autoScanAttr);
        }

		void reset(int kind, const ByteSpan& data, bool autoScanAttr = false)
		{
            clear();

            fElementKind = kind;
            fData = data;

            if ((fElementKind == XML_ELEMENT_TYPE_START_TAG) ||
                (fElementKind == XML_ELEMENT_TYPE_SELF_CLOSING) ||
                (fElementKind == XML_ELEMENT_TYPE_END_TAG))
            {
                scanTagName();

                if (autoScanAttr) {
                    if (fElementKind != XML_ELEMENT_TYPE_END_TAG)
                        scanAttributes();
                }
            }
		}
        
		// Clear this element to a default state
        void clear() {
			fElementKind = XML_ELEMENT_TYPE_INVALID;
			fData = {};
			fName.clear();
			fAttributes.clear();
		}
        
        // determines whether the element is currently empty
        bool empty() const { return fElementKind == XML_ELEMENT_TYPE_INVALID; }

        explicit operator bool() const { return !empty(); }

        // Returning information about the element
        const std::map<std::string, ByteSpan>& attributes() const { return fAttributes; }
        
        const std::string& name() const { return fName; }
		void setName(const std::string& name) { fName = name; }
        
        int kind() const { return fElementKind; }
		void kind(int kind) { fElementKind = kind; }
        
        const ByteSpan& data() const { return fData; }

		// Convenience for what kind of tag it is
        bool isStart() const { return (fElementKind == XML_ELEMENT_TYPE_START_TAG); }
		bool isSelfClosing() const { return fElementKind == XML_ELEMENT_TYPE_SELF_CLOSING; }
		bool isEnd() const { return fElementKind == XML_ELEMENT_TYPE_END_TAG; }
		bool isComment() const { return fElementKind == XML_ELEMENT_TYPE_COMMENT; }
		bool isProcessingInstruction() const { return fElementKind == XML_ELEMENT_TYPE_PROCESSING_INSTRUCTION; }
        bool isContent() const { return fElementKind == XML_ELEMENT_TYPE_CONTENT; }
		bool isCData() const { return fElementKind == XML_ELEMENT_TYPE_CDATA; }
		bool isDoctype() const { return fElementKind == XML_ELEMENT_TYPE_DOCTYPE; }

        
        void addAttribute(const std::string& name, const ByteSpan& valueChunk)
        {
            fAttributes[name] = valueChunk;
        }

        ByteSpan getAttribute(const std::string &name) const
		{
			auto it = fAttributes.find(name);
			if (it != fAttributes.end())
				return it->second;
			else
                return ByteSpan{};
		}
        
    private:
        //
        // Parse an XML element
        // We should be sitting on the first character of the element tag after the '<'
        // There are several things that need to happen here
        // 1) Scan the element name
        // 2) Scan the attributes, creating key/value pairs
        // 3) Figure out if this is a self closing element

        // 
        // We do NOT scan the content of the element here, that happens
        // outside this routine.  We only deal with what comes up to the closing '>'
        //
        void setTagName(const ByteSpan& inChunk)
        {
            fXmlName.reset(inChunk);
            fName = toString(fXmlName.name());
        }
        
        void scanTagName()
        {
            ByteSpan s = fData;
            bool start = false;
            bool end = false;

            // If the chunk is empty, just return
            if (!s)
                return;

            // Check if the tag is end tag
            if (*s == '/')
            {
                s++;
                end = true;
            }
            else {
                start = true;
            }

            // Get tag name
            ByteSpan tagName = s;
            tagName.fEnd = s.fStart;

            while (s && !wspChars[*s])
                s++;

            tagName.fEnd = s.fStart;
            setTagName(tagName);


            fData = s;
        }

        public:
        //
        // scanAttributes
        // Scans the fData member looking for attribute key/value pairs
        // It will add to the member fAttributes these pairs, without further processing.
        // This should be called after scanTagName(), because we want to be positioned
        // on the first key/value pair. 
        //
        int scanAttributes()
        {

            int nattr = 0;
            bool start = false;
            bool end = false;
            uint8_t quote{};
            ByteSpan s = fData;


            // Get the attribute key/value pairs for the element
            while (s && !end)
            {
                uint8_t* beginattrValue = nullptr;
                uint8_t* endattrValue = nullptr;


                // Skip white space before the attrib name
                s = chunk_ltrim(s, wspChars);

                if (!s)
                    break;

                if (*s == '/') {
                    end = true;
                    break;
                }

                // Find end of the attrib name.
                //static charset equalChars("=");
                auto attrNameChunk = chunk_token(s, "=");
                attrNameChunk = chunk_trim(attrNameChunk, wspChars);    // trim whitespace on both ends

                std::string attrName = std::string(attrNameChunk.fStart, attrNameChunk.fEnd);

                // Skip stuff past '=' until the beginning of the value.
                while (s && (*s != '\"') && (*s != '\''))
                    s++;

                // If we've reached end of span, bail out
                if (!s)
                    break;

                // capture the quote character
                // Store value and find the end of it.
                quote = *s;

				s++;    // move past the quote character
                beginattrValue = (uint8_t*)s.fStart;    // Mark the beginning of the attribute content

                // Skip until we find the matching closing quote
                while (s && *s != quote)
                    s++;

                // An unterminated value is malformed; stop scanning
                if (!s)
                    break;

                endattrValue = (uint8_t*)s.fStart;  // Mark the ending of the attribute content
                s++;    // move past the closing quote

                // Store only well formed attributes
                ByteSpan attrValue = { beginattrValue, endattrValue };

                addAttribute(attrName, attrValue);

                nattr++;
            }

            return nattr;
        }
    };

That’s a bit of a brute, but actually pretty straightforward. We need a data structure that tells us what kind of XML element type we’re dealing with. We need the name, and the content of the element held onto for future processing. We hold onto the content as a ByteSpan, but have provision for making more convenient representations. For example, we turn the name into a std::string. In the future, we can eliminate even this, and just use the XmlName with its chunks directly.

Besides the element name, we also have the ability to split out the attribute key/value pairs, as seen in ‘scanAttributes()’. Let’s take a deeper look at the constructor.

        XmlElement(int kind, const ByteSpan& data, bool autoScanAttr = false)
            :fElementKind(kind)
            , fData(data)
        {
            reset(kind, data, autoScanAttr);
        }

		void reset(int kind, const ByteSpan& data, bool autoScanAttr = false)
		{
            clear();

            fElementKind = kind;
            fData = data;

            if ((fElementKind == XML_ELEMENT_TYPE_START_TAG) ||
                (fElementKind == XML_ELEMENT_TYPE_SELF_CLOSING) ||
                (fElementKind == XML_ELEMENT_TYPE_END_TAG))
            {
                scanTagName();

                if (autoScanAttr) {
                    if (fElementKind != XML_ELEMENT_TYPE_END_TAG)
                        scanAttributes();
                }
            }
		}

The constructor takes a ‘kind’, a ByteSpan, and a flag indicating whether we want to parse out the attributes or not. In ‘reset()’, we see that we hold onto the kind of element, and the ByteSpan. That ByteSpan contains everything between the ‘<’ of the tag and the closing ‘>’, exclusive of both. The first thing we do is scan the tag name, so we can at least hold onto that, leaving the fData representing the rest. This is relatively low impact so far.
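To make that first step concrete, here is a minimal sketch of the tag name split, written against std::string_view rather than the ByteSpan used in this series. The TagParts struct and splitTagName() helper are hypothetical names made up for illustration; they are not part of the actual code.

```cpp
#include <string_view>

// Hypothetical stand-in for what scanTagName() produces:
// the tag name, plus whatever follows it (the unparsed attributes).
struct TagParts {
    std::string_view name;
    std::string_view rest;
    bool isEnd;
};

// 'tag' is everything between '<' and '>', exclusive of both.
inline TagParts splitTagName(std::string_view tag)
{
    TagParts parts{ {}, {}, false };
    if (tag.empty())
        return parts;

    // A leading '/' marks an end tag
    if (tag.front() == '/') {
        parts.isEnd = true;
        tag.remove_prefix(1);
    }

    // The name runs up to the first whitespace character
    const auto pos = tag.find_first_of(" \t\r\n");
    parts.name = tag.substr(0, pos);
    if (pos != std::string_view::npos)
        parts.rest = tag.substr(pos);

    return parts;
}
```

Nothing is copied here; both views point back into the original tag text, which is the same idea as leaving fData pointing at the remainder.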

Why not just do this in the constructor itself, why have a “reset()”? As we’ll see later, we actually reuse XmlElement in some situations while parsing, so we want to be able to set, and reset, the same object multiple times. At least that’s one way of doing things.

Another item of note is whether you scan the attributes or not. If you do scan the attributes, you end up with a map of those attributes, and a way to get the value of individual attributes.

        std::map<std::string, ByteSpan> fAttributes{};

        ByteSpan getAttribute(const std::string &name) const
		{
			auto it = fAttributes.find(name);
			if (it != fAttributes.end())
				return it->second;
			else
                return ByteSpan{};
		}

The ‘getAttribute()’ method is a critical piece when we later start building our SVG model, so it needs to be fast and efficient. Of course, this does not have to be embedded in the core of the XmlElement. You could just as easily construct an attribute list outside of the element, but then you’d have to associate it back to the element anyway, and you end up in the same place. getAttribute() takes a name as a string, and returns the ByteSpan which is the raw, uninterpreted content of that attribute, without the enclosing quote marks. In the future, it would be nice to replace that std::string based name with an XmlName, which will save on some allocations, but we’ll stick with this convenience for now.
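Here is a rough, self-contained sketch of the same scan-then-lookup idea. It assumes std::string_view as a stand-in for ByteSpan, and the names AttrMap, scanAttrs() and getAttr() are invented for this example. Like the real thing, a missing attribute comes back as an empty span rather than an error.

```cpp
#include <map>
#include <string>
#include <string_view>

// Hypothetical stand-in for the fAttributes member
using AttrMap = std::map<std::string, std::string_view>;

// Scan name="value" pairs out of 'src', storing the raw, unquoted values
inline AttrMap scanAttrs(std::string_view src)
{
    AttrMap attrs;
    const std::string_view wsp = " \t\r\n";

    while (true) {
        // Skip whitespace before the attribute name; stop at '/' or end of input
        const auto nameStart = src.find_first_not_of(wsp);
        if (nameStart == std::string_view::npos || src[nameStart] == '/')
            break;
        src.remove_prefix(nameStart);

        // The name runs up to the '=', minus any trailing whitespace
        const auto eq = src.find('=');
        if (eq == std::string_view::npos)
            break;
        std::string_view name = src.substr(0, eq);
        name = name.substr(0, name.find_last_not_of(wsp) + 1);
        src.remove_prefix(eq + 1);

        // Find the opening quote, either ' or "
        const auto q = src.find_first_of("\"'");
        if (q == std::string_view::npos)
            break;
        const char quote = src[q];
        src.remove_prefix(q + 1);

        // Find the matching closing quote; an unterminated value is dropped
        const auto close = src.find(quote);
        if (close == std::string_view::npos)
            break;

        attrs[std::string(name)] = src.substr(0, close);
        src.remove_prefix(close + 1);
    }

    return attrs;
}

// Return the raw value, or an empty view when the attribute is absent
inline std::string_view getAttr(const AttrMap& attrs, const std::string& name)
{
    const auto it = attrs.find(name);
    return (it != attrs.end()) ? it->second : std::string_view{};
}
```

The values stay as views into the source text, so the only allocations are the std::string keys, which is exactly the cost the XmlName idea would remove.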

The stage is now set. We have our core components and data structures, and we’re ready for the main event of actually parsing some content. For that, we have to make some design decisions. The first one we already made in the very beginning. We will be consuming a chunk of memory as represented in a ByteSpan. The next decision is how we want to consume it. Do we want to build a Document Object Model (DOM), or some other structure? Do we just want to print out nodes as we see them? Do we want a ‘pull model’ parser, where we are in control of getting each node one by one, or a ‘push model’, where we have a callback function which is called every time a node is seen, but the primary driver is elsewhere?

My choice is to have a pull model parser, where I ask for each node, one by one, and do whatever I’m going to do with it. In terms of programming patterns, this is the ‘iterator’. So, I’m going to create an XML iterator. The fundamental structure of an iterator is this.

Iterator iter(content);
while (iter)
{
    doSomethingWithCurrentItem(*iter);
    iter++;
}
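Before applying the pattern to XML, it may help to see it on something trivial. The following is a small sketch of a pull-style iterator that hands out whitespace separated words from a std::string_view. WordIterator and countWords() are hypothetical names for this illustration only.

```cpp
#include <string_view>

// A minimal pull-style iterator over whitespace separated words.
// Hypothetical example; not part of the XML scanner itself.
struct WordIterator {
    std::string_view fSource;   // input not yet consumed
    std::string_view fCurrent;  // the current item; empty when finished

    explicit WordIterator(std::string_view src) : fSource(src) { next(); }

    // True while there is a current item to look at
    explicit operator bool() const { return !fCurrent.empty(); }

    std::string_view operator*() const { return fCurrent; }
    WordIterator& operator++() { next(); return *this; }

    // Consume input up to the end of the next word
    bool next()
    {
        const std::string_view wsp = " \t\r\n";
        const auto start = fSource.find_first_not_of(wsp);
        if (start == std::string_view::npos) {
            fSource = std::string_view{};
            fCurrent = std::string_view{};
            return false;
        }
        fSource.remove_prefix(start);

        fCurrent = fSource.substr(0, fSource.find_first_of(wsp));
        fSource.remove_prefix(fCurrent.size());
        return true;
    }
};

// The caller drives the iteration, one item at a time
inline int countWords(std::string_view text)
{
    int n = 0;
    for (WordIterator iter(text); iter; ++iter)
        n++;
    return n;
}
```

The shape is the important part: the constructor primes the first item, the bool conversion says whether there is a current item, and the increment consumes more input.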

So, that’s what we need to construct for our XML. Something that can scan its input, delivering XmlElement as the individual items that we can then do something with. So, here is XmlElementIterator.

   struct XmlElementIterator {
    private:
        // XML Iterator States
        enum XML_ITERATOR_STATE {
            XML_ITERATOR_STATE_CONTENT = 0
            , XML_ITERATOR_STATE_START_TAG

        };
        
        // What state the iterator is in
        int fState{ XML_ITERATOR_STATE_CONTENT };
        svg2b2d::ByteSpan fSource{};
        svg2b2d::ByteSpan mark{};

        XmlElement fCurrentElement{};
        
    public:
        XmlElementIterator(const svg2b2d::ByteSpan& inChunk)
        {
            fSource = inChunk;
            mark = inChunk;

            fState = XML_ITERATOR_STATE_CONTENT;
            
            next();
        }

		explicit operator bool() { return !fCurrentElement.empty(); }
        
        // These operators make it operate like an iterator
        const XmlElement& operator*() const { return fCurrentElement; }
        const XmlElement* operator->() const { return &fCurrentElement; }

        XmlElementIterator& operator++() { next(); return *this; }
        XmlElementIterator& operator++(int) { next(); return *this; }
        
        // Reset the iterator to a known state with data
        void reset(const svg2b2d::ByteSpan& inChunk, int st)
        {
            fSource = inChunk;
            mark = inChunk;

            fState = st;
        }

        ByteSpan readTag()
        {
            ByteSpan elementChunk = fSource;
            elementChunk.fEnd = fSource.fStart;
            
            while (fSource && *fSource != '>')
                fSource++;

            elementChunk.fEnd = fSource.fStart;
            elementChunk = chunk_rtrim(elementChunk, wspChars);
            
            // Get past the '>' if it was there
            fSource++;
            
            return elementChunk;
        }
        
        // readDoctype
		// Reads the doctype chunk, and returns it as a ByteSpan
        // fSource is currently sitting at the beginning of !DOCTYPE
        // Note: 
        
        ByteSpan readDoctype()
        {

            // skip past the !DOCTYPE to the first whitespace character
			while (fSource && !wspChars[*fSource])
				fSource++;
            
			// Skip past the whitespace
            // to get to the beginning of things
			fSource = chunk_ltrim(fSource, wspChars);

            
            // Mark the beginning of the "content" we might return
            ByteSpan elementChunk = fSource;
            elementChunk.fEnd = fSource.fStart;

            // To get to the end, we're looking for '[]' or just '>'
            auto foundChar = chunk_find_char(fSource, '[');
            if (foundChar)
            {
                fSource = foundChar;
                foundChar = chunk_find_char(foundChar, ']');
                if (foundChar)
                {
                    fSource = foundChar;
                    fSource++;
                }
                elementChunk.fEnd = fSource.fStart;
            }
            
            // skip whitespace?
            // search for closing '>'
            foundChar = chunk_find_char(fSource, '>');
            if (foundChar)
            {
                fSource = foundChar;
                elementChunk.fEnd = fSource.fStart;
                fSource++;
            }
            
            return elementChunk;
        }
        
        
        // Simple routine to scan XML content.
        // fSource is the chunk representing the XML to be scanned.
        // It is advanced in the process, so each subsequent call
        // continues scanning where the last one left off.
        bool next()
        {
            while (fSource)
            {
                switch (fState)
                {
                case XML_ITERATOR_STATE_CONTENT: {

                    if (*fSource == '<')
                    {
                        // Change state to beginning of start tag
                        // for next turn through iteration
                        fState = XML_ITERATOR_STATE_START_TAG;

                        if (fSource != mark)
                        {
                            // Encapsulate the content in a chunk
                            svg2b2d::ByteSpan content = { mark.fStart, fSource.fStart };

                            // collapse whitespace
							// if the content is all whitespace
                            // don't return anything
							content = chunk_trim(content, wspChars);
                            if (content)
                            {
                                // Set the state for next iteration
                                fSource++;
                                mark = fSource;
                                fCurrentElement.reset(XML_ELEMENT_TYPE_CONTENT, content);
                                
                                return true;
                            }
                        }

                        fSource++;
                        mark = fSource;
                    }
                    else {
                        fSource++;
                    }

                }
                break;

                case XML_ITERATOR_STATE_START_TAG: {
                    // Create a chunk that encapsulates the element tag 
                    // up to, but not including, the '>' character
                    ByteSpan elementChunk = fSource;
                    elementChunk.fEnd = fSource.fStart;
                    int kind = XML_ELEMENT_TYPE_START_TAG;
                    
                    if (chunk_starts_with_cstr(fSource, "?xml"))
                    {
						kind = XML_ELEMENT_TYPE_XMLDECL;
                        elementChunk = readTag();
                    } 
                    else if (chunk_starts_with_cstr(fSource, "?"))
                    {
                        kind = XML_ELEMENT_TYPE_PROCESSING_INSTRUCTION;
                        elementChunk = readTag();
                    }
                    else if (chunk_starts_with_cstr(fSource, "!DOCTYPE"))
                    {
                        kind = XML_ELEMENT_TYPE_DOCTYPE;
                        elementChunk = readDoctype();
                    }
                    else if (chunk_starts_with_cstr(fSource, "!--"))
                    {
						kind = XML_ELEMENT_TYPE_COMMENT;
                        elementChunk = readTag();
                    }
                    else if (chunk_starts_with_cstr(fSource, "![CDATA["))
                    {
                        kind = XML_ELEMENT_TYPE_CDATA;
                        elementChunk = readTag();
                    }
					else if (chunk_starts_with_cstr(fSource, "/"))
					{
						kind = XML_ELEMENT_TYPE_END_TAG;
						elementChunk = readTag();
					}
					else {
						elementChunk = readTag();
                        if (chunk_ends_with_char(elementChunk, '/'))
                            kind = XML_ELEMENT_TYPE_SELF_CLOSING;
					}
                    
                    fState = XML_ITERATOR_STATE_CONTENT;

                    mark = fSource;

					fCurrentElement.reset(kind, elementChunk, true);

                    return true;
                }
                break;

                default:
                    fSource++;
                    break;

                }
            }

            fCurrentElement.clear();
            return false;
        } // end of next()
    };

That code might have a face only a programmer could love, but it’s relatively simple to break down. The constructor takes a ByteSpan, and holds onto it as fSource. This ByteSpan is ‘consumed’, meaning, once you’ve iterated, you can’t go back. But, since ‘iteration’ is nothing more than moving a pointer in a ByteSpan, you can always take a ‘snapshot’ of where you’re at, and continue. That’s going to be useful for tracking down where an error occurred, but we won’t go into that right here.

The crux of the iterator is the ‘next()’ method. This is where we look for the ‘<’ character that indicates the start of some tag. The iterator runs between two states. You’re either in ‘XML_ITERATOR_STATE_CONTENT’ or ‘XML_ITERATOR_STATE_START_TAG’. Initially we start in the ‘CONTENT’ state, and flip to ‘START_TAG’ as soon as we see the ‘<’ character. Once in ‘START_TAG’, we try to further refine what kind of tag we’re dealing with. In most cases, we just capture the content, and that becomes the current element.
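Stripped of all the tag kinds, the two-state flip can be sketched in a few lines. This version assumes std::string_view input and, to stay short, collects every token into a vector instead of returning after each one as next() does. The ScanState enum, scanTokens() function, and the "C:" / "T:" prefixes are all made up for this illustration.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

enum class ScanState { Content, Tag };

// Tokens come back as strings with a prefix: "C:" for content, "T:" for a tag
inline std::vector<std::string> scanTokens(std::string_view src)
{
    const std::string_view wsp = " \t\r\n";
    std::vector<std::string> out;
    ScanState state = ScanState::Content;
    std::size_t mark = 0;   // where the current token started

    for (std::size_t pos = 0; pos < src.size(); pos++) {
        switch (state) {
        case ScanState::Content:
            if (src[pos] == '<') {
                // Emit the content seen since 'mark', trimmed,
                // unless it was nothing but whitespace
                const std::string_view content = src.substr(mark, pos - mark);
                const auto first = content.find_first_not_of(wsp);
                if (first != std::string_view::npos) {
                    const auto last = content.find_last_not_of(wsp);
                    out.push_back("C:" + std::string(content.substr(first, last - first + 1)));
                }
                state = ScanState::Tag;
                mark = pos + 1;
            }
            break;

        case ScanState::Tag:
            if (src[pos] == '>') {
                // Everything between '<' and '>' is the tag
                out.push_back("T:" + std::string(src.substr(mark, pos - mark)));
                state = ScanState::Content;
                mark = pos + 1;
            }
            break;
        }
    }

    return out;
}
```

Feeding it something like `<g> hello </g>` should give a tag, a piece of content, and another tag, in that order.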

The iteration terminates when the current XmlElement (fCurrentElement) is empty, which happens if we run out of input, or there’s some kind of error.

So, next() returns true or false. And our iterator does what it’s supposed to do, which is hold onto the current XmlElement that we have scanned. You can get to the contents of the element by using the dereference operator *, like this: *iter, or the arrow operator. In either case, they simply return the current element.

        const XmlElement& operator*() const { return fCurrentElement; }
        const XmlElement* operator->() const { return &fCurrentElement; }

Alright, in practice, it looks like this:

#include <cstdio>

#include "mmap.h"
#include "xmlscan.h"
#include "xmlutil.h"

using namespace filemapper;
using namespace svg2b2d;

int main(int argc, char** argv)
{
    if (argc < 2)
    {
        printf("Usage: pullxml <xml file>\n");
        return 1;
    }

    // create an mmap for the specified file
    const char* filename = argv[1];
    auto mapped = mmap::createShared(filename);

    if (mapped == nullptr)
    {
        printf("Could not map file: %s\n", filename);
        return 1;
    }


    // 
	// Parse the mapped file as XML
    // printing out the elements along the way
    ByteSpan s(mapped->data(), mapped->size());
    
    XmlElementIterator iter(s);

    while (iter)
    {
		ndt_debug::printXmlElement(*iter);

        iter++;
    }

    // close the mapped file
    mapped->close();

    return 0;
}

That will generate the following output, where the printXmlElement() function can be found in the file xmlutil.h. The individual attributes are indicated with their name followed by ‘:’, such as ‘height:’, followed by the value of the attribute, surrounded by ‘||’ markers. Each tag kind is indicated as well.

START_TAG: [svg]
    height: ||200||
    width: ||680||
    xmlns: ||http://www.w3.org/2000/svg||
SELF_CLOSING: [circle]
    cx: ||70||
    cy: ||70||
    r: ||50||
SELF_CLOSING: [circle]
    cx: ||200||
    cy: ||70||
    fill: ||#79C99E||
    r: ||50||
SELF_CLOSING: [circle]
    cx: ||330||
    cy: ||70||
    fill: ||#79C99E||
    r: ||50||
    stroke: ||#508484||
    stroke-width: ||10||
SELF_CLOSING: [circle]
    cx: ||460||
    cy: ||70||
    fill: ||#79C99E||
    r: ||50||
    stroke-width: ||10||
SELF_CLOSING: [circle]
    cx: ||590||
    cy: ||70||
    fill: ||none||
    r: ||50||
    stroke: ||#508484||
    stroke-width: ||10||
END_TAG: [svg]

At this point, we have our XML “parser”. It can scan/parse enough for us to continue on our journey to parse and display SVG. It’s not the most robust XML parser on the planet, but it’s a good performer, very small and hopefully understandable. Usage could not be easier, and it does not impose a lot of frameworks, or pull in a lot of dependencies. We’re at a good starting point, and if all you wanted was to be able to parse some XML to do something, you could stop here and call it a day.

Next time around, we’re going to look into the SVG side of things, and sink deep into that rabbit hole.


Creating A SVG Viewer from the ground up

In the series I did last summer (Hello Scene), I walked through the fundamentals of creating a simple graphics system, from the ground up. Putting a window on the screen, setting pixels, drawing, text, visual effects, screen captures, and all that. Along the way, I discussed various design choices and tradeoffs that I made while creating the code.

While capturing screenshots, and flipping some bits might make for a cool demo, at the end of the day, I need to create actual applications that are: robust, performant, functional, and a delight for the user to use. A lot of what we see today are “web apps”, that is, things that are created to be run in a web browser. Web Apps have a lot of HTML, CSS, Javascript, and are programmed with myriad frameworks, in multiple languages on the front end and backend. It’s a whole industry out there!

One question arises for me though, and perhaps a bit of envy. Why do those web apps get to look so great, with their fancy gradients, shadows, and animations, whereas my typical applications look like they’re stuck in a late 2000s computer geek movie? I’m talking about desktop apps, and why they haven’t changed much in the past 20 years. Maybe we get a splash here and there with some changes in icon styles (shadows, transparency, flat, ‘dark’), but really, the rest of the app looks and feels the same. No animations, no fancy pictures, everything is square, just no fun.

Well, to this point, I’ve been on a mission to create more engaging desktop app experiences, and it starts with the graphics. To that end, I looked out into the world and saw that SVG (Scalable Vector Graphics) would be a great place to start. Vector graphics are great. The other form of graphics is ‘bitmap’. Bitmap graphics are the realm of file formats such as ‘png’, ‘jpeg’, ‘gif’, ‘webp’, and the like. As the name implies, a ‘bitmap’ is just a bunch of dots of color in a rectangular grid. There are a couple of challenges with bitmap graphics. One is that when you scale them, they start to look “pixelated”. You know, they get the ‘jaggies’, and they just don’t look that great.

The second challenge you have is that the image is static. You don’t know where the keys on that keyboard are located, so being able to push them, or have them reflect music that’s playing, is quite a hard task.

In steps vector graphics. Vector Graphics contain the original drawing commands that are used to create a bitmap, at any size. With a vector graphics file, you can retain the information about colors, locations, geometry, everything that went into creating the image. This means that you can locate individual elements, name them, change them during the application, and so on.

Why don’t we just use vector graphics all the time then? Honestly, I really don’t know. I do know that one impediment to using them is being able to parse the format, and do something meaningful with it. To date, you mostly find support for SVG in web browsers, where they’re already parsing this kind of data. In that environment, you have full access to all those annotations, and furthermore, you can attach javascript code to the various actions, like mouse hovering, clicking, dragging and the like. But, for the most part, desktop applications don’t participate in that world. Instead, we’re typically stuck with bitmap graphics and clunky UI builders.

To change that, the first step is parsing the .svg file format. Lucky for me, SVG is based on XML, which is the first thing I worked on at Microsoft back in 1998. I never actually wrote the parser (worked on XSLT originally), but I’m super familiar with it. So, that’s where to start.

In this series, I’m going to write a functional SVG parser, which will be capable of generating SVG based bitmap images, as well as operate in an application development environment for desktop apps. I will be using the blend2d graphics library to do all the super heavy lifting of rendering the actual images, but I will focus on what goes into writing the parser, and seamlessly integrating the results into useful desktop applications.

So, follow along over the next few installments to see how it’s done.


Hello Scene – It’s all about the text

That’s a lot of fonts. But, it’s a relatively simple task to achieve once we’ve gained some understanding of how to deal with text. We’ll park this bit of code here (fontlist.cpp) while we gain some understanding.

#include <list>
#include <string>

#include "gui.h"
#include "fontmonger.h"

std::list<std::string> fontList;

void drawFonts()
{
	constexpr int rowHeight = 24;
	constexpr int colWidth = 213;

	int maxRows = canvasHeight / rowHeight;
	int maxCols = canvasWidth / colWidth;

	int col = 0;
	int row = 0;

	std::list<std::string>::iterator it;
	for (it = fontList.begin(); it != fontList.end(); ++it) 
	{
		int x = col * colWidth;
		int y = row * rowHeight;

		textFont(it->c_str(), 18);
		text(it->c_str(), x, y);

		col++;
		if (col >= maxCols)
		{
			col = 0;
			row++;
		}
	}
}

void setup()
{
	setCanvasSize(1280, 1024);

	FontMonger::collectFontFamilies(fontList);

	background(PixelRGBA(0xffdcdcdc));

	drawFonts();
}

I must say, dealing with fonts, and text rendering is one of the most challenging of the graphics disciplines. We could spend years and gigabytes of text explaining the intricacies of how fonts and text work. For our demo scene, we’re not going to get into all that though. We just want a little bit of text to be able to splash around here and there. So, I’m going to go the easy route, and explain how to use the system text rendering and incorporate it into the rest of our little demo framework.

First of all, some terminology. These words: Font, Font Face, OpenType, Points, etc, are all related to fonts, and all can cause confusion. So, let’s ignore all that for now, and just do something simple.

And the code to make it happen?

#include "gui.h"

void setup()
{
	setCanvasSize(320, 240);
	background(PixelRGBA (0xffffffff));		// A white background

	text("Hello Scene!", 24, 48);
}

Pretty simple, right? By default, the demo scene chooses the “Segoe UI” font at 18 pixels high to do text rendering. The single call to “text(…)” puts whatever text you want at the x,y coordinates specified afterward. So, what is “Segoe UI”? A Font describes the shape of a character. So, the letter ‘A’ looks one way in say “Times New Roman”, and probably slightly different in “Tahoma”. These are stylistic differences. We humans will just recognize it as ‘A’. Each font contains a bunch of descriptions of how to draw individual characters. These descriptions are essentially just polygons, with curves, and straight lines.

I’m grossly simplifying.

The basic description can be scaled, rotated, printed in ‘bold’, ‘italics’, or ‘underline’, depending on what you want to do when you’re displaying text. So, besides just saying where we want text to be located, we can specify the size (in pixels), and choose a specific font name other than the default.

Which was produced with a slight change in the code

#include "gui.h"

void setup()
{
	setCanvasSize(640, 280);
	background(PixelRGBA (0xffffffff));		// A white background

	textFont("Sitka Text", 100);
	text("Hello My Scene!", 24, 48);
}

And last, you can change the color of the text

How exciting is that?! For the simplest of demos, and maybe even some UI framework, this might be enough. But, let’s go a little bit further, and get some more functions that might be valuable.

First thing, we need to understand a little bit more about the font, like how tall and wide characters are, where’s the baseline, the ascent, and descent. Character width and height are easily understood. Ascent and descent might not be as well understood. Let’s start with a little display.

Some code to go with it

#include "gui.h"

constexpr int leftMargin = 24;
constexpr int topMargin = 24;


void drawTextDetail()
{
    // Showing font metrics
	const char* str2 = "My Scene!";
	PixelCoord sz;
	textMeasure(sz, str2);

	constexpr int myTop = 120;

	int baseline = myTop + fontHeight - fontDescent;
	int topline = myTop + fontLeading;

	strokeRectangle(*gAppSurface, leftMargin, myTop, sz.x(), sz.y(), PixelRGBA(0xffff0000));

	// Draw internalLeading - green
	copySpan(*gAppSurface, 
        leftMargin, topline, sz.x(), 
        PixelRGBA(0xff00ff00));

	// draw baseline
	copySpan(*gAppSurface, 
        leftMargin, baseline, sz.x(), 
        PixelRGBA(0xff0000ff));

	// Draw text in the box
    // Turquoise Text
	textColor(PixelRGBA(0xff00ffff));	
	text("My Scene!", leftMargin, myTop);
}

void setup()
{
	setCanvasSize(640, 280);
	background(PixelRGBA (0xffffffff));

	textFont("Sitka Text", 100);

	drawTextDetail();
}

In the setup, we do the usual to create a canvas of a desirable size. Then we select the font with a particular pixel height. Then wave our hands and call ‘drawTextDetail()’.

In ‘drawTextDetail()’, one of the first calls is to ‘textMeasure()’. We want the answer to: “How many pixels wide and high is this string?” The ‘textMeasure()’ function does this. It’s pretty straightforward, as the GDI API that we’re using for text rendering has a function call for this purpose.

void textMeasure(PixelCoord& pt, const char* txt)
{
    SIZE sz;
    ::GetTextExtentPoint32A(gAppSurface->getDC(), txt, (int)strlen(txt), &sz);

    pt[0] = sz.cx;
    pt[1] = sz.cy;
}

It’s that simple. Just pass in a structure to receive the size, and make the call to ‘GetTextExtentPoint32A()’. I choose to return the value in a PixelCoord object, because I don’t want the Windows specific data structures bleeding into my own demo API. This allows me to change the underlying text API without having to worry about changing dependent data structures.

The size that is returned incorporates a few pieces of information. It’s not a tight fit to the string. The size is derived from a combination of global font information (tallest character, lowest part of a character), as well as the cumulative widths of the actual characters specified. In the case of our little demo, the red rectangle represents the size that was returned.

There are a couple more bits of information that are set when you select a font of a particular size. The three most important bits are, the ascent, descent, and internal leading.

Let’s start with the descent. This is the maximum amount any given character of the font might fall below the ‘baseline’, which is drawn as the blue line. The baseline is implicitly defined by this descent, and it is essentially the fontHeight-fontDescent. This is the line that all the other characters have as their ‘bottom’. The ‘ascent’ is the amount of space above this baseline. So, the total fontHeight is the fontDescent+fontAscent. The ascent isn’t explicitly shown, because it essentially reaches the top line of the rectangle. The last bit is the internalLeading. This is the amount of space used by accent characters and the like. The fontLeading is this number, and is represented as the green line, drawn that many pixels below the top of the rectangle.
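The arithmetic is small enough to write down on its own. This is just a sketch: the FontMetrics struct and the helper names are hypothetical, standing in for the fontHeight, fontDescent and fontLeading values the demo uses.

```cpp
// Hypothetical container for the values the demo keeps in
// fontHeight, fontDescent, and fontLeading.
struct FontMetrics {
    int height;     // total cell height: ascent + descent
    int descent;    // maximum drop below the baseline
    int leading;    // internal leading, space reserved for accents
};

// y coordinate of the baseline for a text box whose top edge is at 'top'
constexpr int baselineY(const FontMetrics& fm, int top)
{
    return top + fm.height - fm.descent;
}

// y coordinate of the internal leading line for the same box
constexpr int toplineY(const FontMetrics& fm, int top)
{
    return top + fm.leading;
}

// The ascent is whatever part of the height is not descent
constexpr int ascent(const FontMetrics& fm)
{
    return fm.height - fm.descent;
}
```

With made-up metrics of height 100, descent 22 and leading 16, and the box top at 120 as in the demo, the baseline lands at y = 198 and the leading line at y = 136. Real values depend on the font that is selected.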

And there you have it. All the little bits and pieces of a font. When you specify a location for drawing the font in the ‘text()’ function, you’re essentially specifying the top left corner of this red rectangle. Of course, that leaves you a bit high and dry when it comes to precisely placing your text. More than likely, what you really want to do is place your text according to the baseline, so that you can be more assured of where your text is actually going to show up. Maybe you want that, maybe you don’t. What you really need is the flexibility to specify the ‘alignment’ of your text rendering.

This is actually a re-creation of something I did about 10 years ago, for another project. It’s a pretty simple matter once you have adequate font and character sizing information.

#include "gui.h"
#include "textlayout.h"

TextLayout tLayout;

void drawAlignedText()
{
	int midx = canvasWidth / 2;
	int midy = canvasHeight / 2;

	// draw vertical line down center of canvas
	line(*gAppSurface, midx, 0, midx, canvasHeight - 1, PixelRGBA(0xff000000));

	// draw horizontal line across canvas
	line(*gAppSurface, 0, midy, canvasWidth - 1, midy, PixelRGBA(0xff000000));

	tLayout.textFont("Consolas", 24);
	tLayout.textColor(PixelRGBA(0xff000000));

	tLayout.textAlign(ALIGNMENT::LEFT, ALIGNMENT::BASELINE);
	tLayout.text("LEFT", midx, 24);

	tLayout.textAlign(ALIGNMENT::CENTER, ALIGNMENT::BASELINE);
	tLayout.text("CENTER", midx, 48);

	tLayout.textAlign(ALIGNMENT::RIGHT, ALIGNMENT::BASELINE);
	tLayout.text("RIGHT", midx, 72);

	tLayout.textAlign(ALIGNMENT::RIGHT, ALIGNMENT::BASELINE);
	tLayout.text("SOUTH EAST", midx, midy);

	tLayout.textAlign(ALIGNMENT::LEFT, ALIGNMENT::BASELINE);
	tLayout.text("SOUTH WEST", midx, midy);

	tLayout.textAlign(ALIGNMENT::RIGHT, ALIGNMENT::TOP);
	tLayout.text("NORTH EAST", midx, midy);

	tLayout.textAlign(ALIGNMENT::LEFT, ALIGNMENT::TOP);
	tLayout.text("NORTH WEST", midx, midy);
}

void setup()
{
	setCanvasSize(320, 320);

	tLayout.init(gAppSurface);

	background(PixelRGBA(0xffDDDDDD));

	drawAlignedText();
}

Design-wise, I chose to put the various text measurement and rendering routines into a separate object. My other choice would have been to put them into the gui.h/cpp file, and I did do that initially. But that would force a particular strong opinion on how text should be dealt with, and I did not make that choice for drawing in general. So I thought better of it, and chose to encapsulate the text routines in this layout structure (textlayout.h).

Now that we have the ability to precisely place a string, we can get a little creative in playing with the displacement of all the characters in a string. With that ability, we can have text placed based on the evaluation of a function, with animation of course.

#include "gui.h"
#include "geotypes.hpp"
#include "textlayout.h"

using namespace alib;

constexpr int margin = 50;
constexpr int FRAMERATE = 20;

int dir = 1;				// direction
int currentIteration = 1;	// Changes during running
int iterations = 30;		// increase past frame rate to slow down
bool showCurve = true;
TextLayout tLayout;

void textOnBezier(const char* txt, GeoBezier<ptrdiff_t>& bez)
{
	double u = 0.0;		// position along the curve, 0.0 to 1.0
	int offset = 0;		// index of the current character

	while (txt[offset])
	{
		// Isolate the current character
		char buf[2];
		buf[0] = txt[offset];
		buf[1] = 0;

		// Figure out the x and y offset
		auto pt = bez.eval(u);

		// Display current character
		tLayout.text(buf, pt.x(), pt.y());

		// Calculate size of current character
		// so we can figure out where next one goes
		PixelCoord charSize;
		tLayout.textMeasure(charSize, buf);

		// Now get the next value of 'u' so we 
		// can evaluate where the next character will go
		u = bez.findUForX(pt.x() + charSize.x());

		offset++;
	}

}

void strokeCurve(PixelMap& pmap, GeoBezier<ptrdiff_t> &bez, int segments, const PixelRGBA c)
{
	// Get starting point
	auto lp = bez.eval(0.0);

	int i = 1;
	while (i <= segments) {
		double u = (double)i / segments;

		auto p = bez.eval(u);

		// draw line segment from last point to current point
		line(pmap, lp.x(), lp.y(), p.x(), p.y(), c);

		// Assign current to last
		lp = p;

		i = i + 1;
	}
}

void onFrame()
{
	background(PixelRGBA(0xffffffff));

	int y1 = maths::Map(currentIteration, 1, iterations, 0, canvasHeight);

	GeoCubicBezier<ptrdiff_t> bez(margin, canvasHeight / 2, 
        canvasWidth * 0.25, y1, 
        canvasWidth - (canvasWidth * 0.25), canvasHeight -y1, 
        canvasWidth - margin, canvasHeight / 2.0);
	
	if (showCurve)
		strokeCurve(*gAppSurface, bez, 50, PixelRGBA(0xffff0000));

	// honor the character spacing
	tLayout.textColor(PixelRGBA(0xff0000ff));
	textOnBezier("When Will The Quick Brown Fox Jump Over the Lazy Dogs Back", bez);


	currentIteration += dir;

	// reverse direction if needs be
	if ((currentIteration >= iterations) || (currentIteration <= 1))
		dir = dir < 1 ? 1 : -1;
}

void setup()
{
	setCanvasSize(800, 600);
	setFrameRate(FRAMERATE);

	tLayout.init(gAppSurface);
	tLayout.textFont("Consolas", 24);
	tLayout.textAlign(ALIGNMENT::CENTER, ALIGNMENT::CENTER);
}


void keyReleased(const KeyboardEvent& e) 
{
	switch (e.keyCode) {
	case VK_ESCAPE:
		halt();
		break;

	case VK_SPACE:
		showCurve = !showCurve;
		break;

	case 'R':
		recordingToggle();
		break;
	}
}

For once, I won’t go line by line. The key trick here is the ‘findUForX()’ function of the bezier object. Since textMeasure() tells us how wide a string is (in pixels), we know how much to advance in the x direction as we display characters. Our bezier curve has an eval() function, which takes a ‘u’ value between 0.0 and 1.0, and returns the point on the curve at that position. So, we want to match the x offset of the next character with its corresponding ‘u’ value along the curve, then we can evaluate the curve at that position, and find out the appropriate ‘y’ value.
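I haven’t shown how findUForX() works inside, so here is one way such a lookup could be done, as a sketch rather than the actual geotypes.hpp code. It assumes the x coordinate of the curve only increases as ‘u’ goes from 0 to 1 (true for the control points used in this demo), which lets a plain bisection search do the job. The function names are made up for the example.

```cpp
#include <cassert>

// Evaluate one coordinate of a cubic bezier at u in [0, 1]
inline double cubicBezier1D(double p0, double p1, double p2, double p3, double u)
{
    double v = 1.0 - u;
    return v * v * v * p0 + 3.0 * v * v * u * p1 + 3.0 * v * u * u * p2 + u * u * u * p3;
}

// Sketch of a 'find u for x' lookup using bisection.
// Assumption: x(u) increases monotonically from x0 to x3.
inline double findUForXSketch(double x0, double x1, double x2, double x3, double targetX)
{
    assert(x0 < x3);

    // Clamp requests that fall outside the curve
    if (targetX <= x0) return 0.0;
    if (targetX >= x3) return 1.0;

    double lo = 0.0;
    double hi = 1.0;

    // Each pass halves the interval; 40 passes is far more
    // precision than a pixel position needs
    for (int i = 0; i < 40; i++) {
        double mid = 0.5 * (lo + hi);
        if (cubicBezier1D(x0, x1, x2, x3, mid) < targetX)
            lo = mid;
        else
            hi = mid;
    }

    return 0.5 * (lo + hi);
}
```

If the curve doubles back on itself in x, there can be more than one ‘u’ for a given x, and this simple search would no longer be enough.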

Notice in the setup, the text alignment is set to CENTER, CENTER. This means that the coordinate positions being calculated should represent the center of the characters being printed. That roughly leaves the center of the character aligned with the evaluated values of the curve, which will match your expectations most closely. Another way to do it might be to do LEFT, BASELINE, to get the characters left aligned, and use the curve as the baseline. There are a few possibilities, and you can simply choose what suits your needs.

This is a very crude way to display text on a curve, but showing text along a path is a fairly common parlor trick in demo applications, and this is one way of doing it quick and dirty. Your curve doesn’t have to be a bezier, it could be anything you like. Just take it one character at a time, use the text alignment, and see what you can accomplish.

There is a design choice here. I am using simple GDI based interfaces to display the text. I can do this because at the core, the PixelArray that we’re drawing into does in fact have a “DeviceContext”, so GDI knows how to draw into it. This is a great convenience, because it means that we can do all the independent drawing that we’ve been doing, from random pixels to bezier curves, and when we get to something we can’t quite handle, we can fall back to what the system provides, in this case text rendering.

With that, we’re at the end of this series. We’ve gone from a basic window on the screen, to drawing text along an animating bezier curve, all while recording to a .mpg file. This is just the beginning. We’ve covered some design choices along the way, including the desire to keep the code small and composable. The only thing left to do is go out and create something of your own, by using this kind of toolkit, or better yet, have the confidence to create your own.

The Demo Scene is out there. Go create something.


Hello Scene – Screen Captures for Fun and Profit

Being able to capture the display screen opens up some interesting possibilities for our demo scenes.

In this particular case, my demo app is capturing a part of my screen, and using it as a ‘texture map’ on a trapezoid, and compositing that onto a perlin noise background. The capture is live, as we’ll see shortly, but first, the code that does this little demo (sampmania.cpp).


#include "gui.h"
#include "sampledraw2d.h"
#include "screensnapshot.h"
#include "perlintexture.h"

ScreenSnapshot screenSamp;

void onFrame()
{
	// Take current snapshot of screen
	screenSamp.next();

	// Trapezoid
	PixelCoord verts[] = { PixelCoord({600,100}),PixelCoord({1000,100}),PixelCoord({1700,800}),PixelCoord({510,800}) };
	int nverts = 4;
	sampleConvexPolygon(*gAppSurface, 
		verts, nverts, 0, 
		screenSamp, 
		{ 0,0,canvasWidth, canvasHeight });

}

void setup()
{
	setCanvasSize(1920, 1080);
	setFrameRate(15);

	// Draw noisy background only once
	NoiseSampler perlinSamp(4);
	sampleRectangle(*gAppSurface, gAppSurface->frame(), perlinSamp);

	// Setup the screen sampler
	// Capture left half of screen
	screenSamp.init(0, 0, displayWidth / 2, displayHeight);
}

Pretty standard fare for our demos. There are a couple of new concepts here though. One is a sampler, the other is the ScreenSnapshot object. Let’s first take a look at the ScreenSnapshot object. The idea here is we want to take a picture of what’s on the screen, and make it available to the program in a PixelArray, which is how we represent pixel images in general. If we can do that, we can further use the screen snapshot just like the canvas. We can draw on it, save it, whatever.

On the Windows platform, there are 2 or 3 ways to take a snapshot of the display screen. Each method comes from a different era of the evolution of the Windows APIs, and has various benefits or limitations. In this case, we use the most ancient method for taking a snapshot, relying on the good old GDI API to do the work, since it’s been reliable all the way back to Windows 3.0.

#pragma once
// ScreenSnapshot
//
// Take a snapshot of a portion of the screen and hold
// it in a PixelArray (User32PixelMap)
//
// When constructed, a single snapshot is taken.
// every time you want a new snapshot, just call 'next()'
// This is great for doing a live screen capture
//
//    ScreenSnapshot ss(x,y, width, height);
//
//    References:
//    https://www.codeproject.com/articles/5051/various-methods-for-capturing-the-screen
//    https://stackoverflow.com/questions/5069104/fastest-method-of-screen-capturing-on-windows
//  https://github.com/bmharper/WindowsDesktopDuplicationSample
//

#include "User32PixelMap.h"

class ScreenSnapshot : public User32PixelMap
{
    HDC fSourceDC;  // Device Context for the screen

    // which location on the screen are we capturing
    int fOriginX;   
    int fOriginY;


public:
    ScreenSnapshot()
        : fSourceDC(nullptr)
        , fOriginX(0)
        , fOriginY(0)
    {}

    ScreenSnapshot(int x, int y, int awidth, int aheight, HDC srcDC = NULL)
        : User32PixelMap(awidth, aheight),
        fOriginX(x),
        fOriginY(y)
    {
        init(x, y, awidth, aheight, srcDC);

        // take at least one snapshot
        next();
    }

    bool init(int x, int y, int awidth, int aheight, HDC srcDC=NULL)
    {
        User32PixelMap::init(awidth, aheight);

        if (NULL == srcDC)
            fSourceDC = GetDC(nullptr);
        else
            fSourceDC = srcDC;

        fOriginX = x;
        fOriginY = y;

        return true;
    }

    // take a snapshot of current screen
    bool next()
    {
        // copy the screendc into our backing buffer
        // getDC retrieves the device context of the backing buffer
        // which in this case is the 'destination'
        // the fSourceDC is the source
        // the width and height are dictated by the width() and height() 
        // and the source origin is given by fOriginX, fOriginY
        // We use the parameters (SRCCOPY, CAPTUREBLT) because that seems 
        // to be best practice in this case
        BitBlt(getDC(), 0, 0, width(), height(), fSourceDC, fOriginX, fOriginY, SRCCOPY | CAPTUREBLT);

        return true;
    }
};

There’s really not much to it. The real working end of it is the ‘next()’ function. That function call to ‘BitBlt()’ is where all the magic happens. That’s a Graphics Device Interface (GDI) system call, which will copy from one “DeviceContext” to another. A DeviceContext is a Windows construct that represents the interface for drawing into something. This interface exists for screens, printers, or bitmaps in memory. Very old, very basic, very functional.

So, the basics are, get a ‘DeviceContext’ for the screen, and another ‘DeviceContext’ for a bitmap in memory, and call BitBlt to copy pixels from one to the other.

Also, notice the ScreenSnapshot inherits from User32PixelMap. We first saw this early on in this series (What’s In a Window), when we were first exploring how to put pixels up on the screen. We’re just leveraging what was built there, which was essentially a Windows Bitmap.

OK, so bottom line, we can take a picture of the screen, and put it into a bitmap, that we can then use in various ways.

Here’s the movie

Well, isn’t that nifty. You might notice that if you query the internet for “screen capture”, you’ll find links to tons of products that do screen capture, and recording. Finding a library that does this for you programmatically is a bit more difficult. One method that seems to pop up a lot is to capture the screen to a file, or to the clipboard, but that’s not what you want, you just want it in a bitmap ready to go, which is what we do here.

On Windows, a more modern method is to use DirectX, because that’s the preferred interface of modern day Windows. The GDI calls under the covers probably call into DirectX. The benefit of using this simple BitBlt() method is that you don’t have to increase your dependencies, and you don’t need to learn a fairly complicated interface layer, just to capture the screen.

I’ve used a complex image here, mainly to draw attention to this subject, but really, the screen capturing and viewing can be much simpler.

Just a straight up view, without any geometric transformation, other than to fit the rectangle.

The code looks very similar, but uses a simple mapping to a rectangle, rather than a trapezoid. This is from screenview.cpp

//
// screenview
// Simplest application to do continuous screen capture
// and display in another window.
//
#include "gui.h"

#include "screensnapshot.h"

ScreenSnapshot screenSamp;

void onFrame()
{
    // Get current screen snapshot
    screenSamp.next();

    // Draw a rectangle with snapshot as texture
    sampleRectangle(*gAppSurface,gAppSurface->frame(),screenSamp);
}

// Do application setup before things get
// going
void setup()
{
    // Setup application window
	setCanvasSize(displayWidth/2, displayHeight);

    // setup the snapshot
    screenSamp.init(0, 0, displayWidth / 2, displayHeight);
}

void keyReleased(const KeyboardEvent& e) {
    switch (e.keyCode)
    {
    case VK_ESCAPE:
        halt();
        break;

    case 'R':
    {
        recordingToggle();
    }
    break;
    }
}

Capturing the screen has an additional benefit for our demo scenes. One little used feature of Windows is the fact that you can use translucency, and transparency. As such, you can display rather interesting things on the display. Using the recording technique where we just capture what’s on our canvas won’t really capture what the user will see. You’ll only capture what you’re drawing in your own buffer. In order to capture the fullness of the demo, you need to capture what’s on the screen.

And just to kick it up a notch, and show off some other things you can do with transparency…

In both these cases of the chasing balls, as well as the transparent keyboard, there is a function call within the demo scene ‘layered()’. If you call this in your setup, then your window won’t have any sort of border, and if you use transparency in your colors, they’ll be composited with whatever is on the desktop.
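If “composited with whatever is on the desktop” sounds vague, the arithmetic behind it is roughly the classic ‘source over’ blend. The sketch below is conceptual only: the real work is done by Windows, which may use premultiplied alpha internally, and the Rgba8 type here is a stand-in I made up for the example, not the framework’s PixelRGBA.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical pixel type for illustration; not the framework's PixelRGBA
struct Rgba8 {
    uint8_t r, g, b, a;
};

// Blend one 8-bit channel:
//   result = src * alpha + dst * (1 - alpha), with alpha in 0..255
inline uint8_t blendChannel(uint8_t src, uint8_t dst, uint8_t alpha)
{
    // The +127 rounds to the nearest integer instead of truncating
    return static_cast<uint8_t>((src * alpha + dst * (255 - alpha) + 127) / 255);
}

// Composite a non-premultiplied source pixel over an opaque
// destination pixel (think: your window pixel over the desktop pixel)
inline Rgba8 sourceOver(const Rgba8& src, const Rgba8& dst)
{
    assert(dst.a == 255);   // this sketch only handles an opaque background

    return { blendChannel(src.r, dst.r, src.a),
             blendChannel(src.g, dst.g, src.a),
             blendChannel(src.b, dst.b, src.a),
             255 };
}
```

With an alpha of 0 the desktop shows through untouched, with 255 your pixel wins outright, and everything in between is a mix of the two.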

You can go one step further (in the case of the chasing balls), and call ‘fullscreen()’, which will essentially do a: setCanvasSize(displayWidth, displayHeight); layered();

There is one additional call, which allows you to retain your window title bar (for moving around and closing), but sets a global transparency level for your window ‘windowOpacity(double)’, which takes a value between 0.0 (fully transparent), and 1.0 (fully opaque).

And of course the demo code for the disappearing rectangles trick.

#include "apphost.h"
#include "draw.h"
#include "maths.hpp"

using namespace maths;

bool outlineOnly = false;
double opacity = 1.0;

INLINE PixelRGBA randomColor(uint32_t alpha=255)
{
	uint32_t r = random_int(255);
	uint32_t g = random_int(255);
	uint32_t b = random_int(255);

	return { r,g,b,alpha };
}

void handleKeyboardEvent(const KeyboardEventTopic& p, const KeyboardEvent& e)
{
	if (e.keyCode == VK_ESCAPE)
		halt();

	if (e.keyCode == VK_SPACE)
		outlineOnly = !outlineOnly;

	if (e.keyCode == VK_UP)
		opacity = maths::Clamp(opacity + 0.05, 0.0, 1.0);

	if (e.keyCode == VK_DOWN)
		opacity = maths::Clamp(opacity - 0.05, 0.0, 1.0);

	windowOpacity(opacity);
}

void onLoop()
{
	PixelRGBA stroke;
	PixelRGBA fill;
	PixelRGBA c;

	gAppSurface->setAllPixels(PixelRGBA(0x0));

	for (int i = 1; i <= 2000; i++)
	{
		int x1 = random_int(canvasWidth - 1);
		int y1 = random_int(canvasHeight - 1);
		int lwidth = random_int(4, 60);
		int lheight = random_int(4, 60);

		c = randomColor(192);

		if (outlineOnly)
		{
			stroke = c;
			draw::rectangle_copy(*gAppSurface, x1, y1, lwidth, lheight, c);

		}
		else
		{
			fill = c;
			//draw::rectangle_copy(*gAppSurface, x1, y1, lwidth, lheight, c);
			draw::rectangle_blend(*gAppSurface, x1, y1, lwidth, lheight, c);
		}
	}

	refreshScreen();
}

void onLoad()
{
	subscribe(handleKeyboardEvent);

	setCanvasSize(800, 800);
}

Well, that’s a lot of stuff, but mostly we covered various forms of screen capture, what you can do with it, and why recording just your own drawing buffer doesn’t show the full fidelity of your work.

We also covered a little bit of Windows wizardry with transparent windows, a very little known or used feature, but we can use it to great advantage for certain kinds of apps.

From a design perspective, I chose to use an ancient, but still supported API call, because it has the least number of dependencies, is the easiest of all the screen capture methods to understand and implement, and it uses the smallest amount of code.

Another thing of note for this demo framework is the maximum usage of ‘.h’ files. In each demo sample, there are typically only 2 or 3 ‘.cpp’ files, and NO .dll files. This is again for simplicity and portability. You could easily put things in a library, and having appmain.cpp in a .exe file would even work, but that leads down a different path. Here, we just make every demo self contained, compiling all the code needed right then and there. This works out when your file count is relatively small (fewer than 10), and you’re working on a small team (fewer than 5). This probably does not scale as well beyond that.

But, there you have it. We’ve gone all the way from putting a single pixel on the screen, to displaying complex geometries with animation in transparent windows. The only thing left in this series is to draw some text, and call it a wrap.

So, next time.


Hello Scene – Events, Organization, more drawing

There are some design principles I’m after with my little demo scene library. Staring at that picture is enough to make your eyes hurt, but we’ll explore when it’s time to call it quits on your own home brew drawing library, and rely on the professionals. We’re also going to explore the whole eventing model, because this is where a lot of fun can come into the picture.

What is eventing then? Mouse, keyboard, touch, pen, all those ways the user can give input to a program. At times, the thing I’m trying to explore is the eventing model itself, so I need some flexibility in how the various mouse and keyboard events are percolated through the system. I don’t want to be forced into a single model designated by the operating system, so I build up a structure that gives me that flexibility.

First things first though. On Windows, and any other system, I need to actually capture the mouse and keyboard stuff, typically decode it, and then deal with it in my world. That code looks like this in the appmain.cpp file.

/*
    Generic Windows message handler
    This is used as the function to associate with a window class
    when it is registered.
*/
LRESULT CALLBACK MsgHandler(HWND hWnd, UINT msg, WPARAM wParam, LPARAM lParam)
{
    LRESULT res = 0;

    if ((msg >= WM_MOUSEFIRST) && (msg <= WM_MOUSELAST)) {
        // Handle all mouse messages
        HandleMouseMessage(hWnd, msg, wParam, lParam);
    }
    else if (msg == WM_INPUT) {
        res = HandleHIDMessage(hWnd, msg, wParam, lParam);
    }
    else if (msg == WM_DESTROY) {
        // By doing a PostQuitMessage(), a 
        // WM_QUIT message will eventually find its way into the
        // message queue.
        ::PostQuitMessage(0);
        return 0;
    }
    else if ((msg >= WM_KEYFIRST) && (msg <= WM_KEYLAST)) {
        // Handle all keyboard messages
        HandleKeyboardMessage(hWnd, msg, wParam, lParam);
    }
    else if ((msg >= MM_JOY1MOVE) && (msg <= MM_JOY2BUTTONUP)) 
    {
        // Legacy joystick messages
        HandleJoystickMessage(hWnd, msg, wParam, lParam);
    }
    else if (msg == WM_TOUCH) {
        // Handle touch specific messages
        //std::cout << "WM_TOUCH" << std::endl;
        HandleTouchMessage(hWnd, msg, wParam, lParam);
    }
    //else if (msg == WM_GESTURE) {
    // we will only receive WM_GESTURE if not receiving WM_TOUCH
    //}
    //else if ((msg >= WM_NCPOINTERUPDATE) && (msg <= WM_POINTERROUTEDRELEASED)) {
    //    HandlePointerMessage(hWnd, msg, wParam, lParam);
    //}
    else if (msg == WM_ERASEBKGND) {
        //loopCount = loopCount + 1;
        //printf("WM_ERASEBKGND: %d\n", loopCount);
        if (gPaintHandler != nullptr) {
            gPaintHandler(hWnd, msg, wParam, lParam);
        }

        // return non-zero indicating we dealt with erasing the background
        res = 1;
    }
    else if (msg == WM_PAINT) {
        if (gPaintHandler != nullptr) 
        {
                gPaintHandler(hWnd, msg, wParam, lParam);
        }
    }
    else if (msg == WM_WINDOWPOSCHANGING) {
        if (gPaintHandler != nullptr) 
        {
            gPaintHandler(hWnd, msg, wParam, lParam);
        }
    }
    else if (msg == WM_DROPFILES) {
        HandleFileDropMessage(hWnd, msg, wParam, lParam);
    }
    else {
        // Not a message we want to handle specifically
        res = ::DefWindowProcA(hWnd, msg, wParam, lParam);
    }

    return res;
}

Through the magic of the Windows API, this function ‘MsgHandler’ is going to be called every time there is a Windows Message of some sort. It is typical of all Windows applications, in one form or another. Windows messages are numerous, and very esoteric. There are a couple of parameters, and the values are typically packed in as bitfields of integers, or pointers to data structures that need to be further decoded. Plenty of opportunity to get things wrong.
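To give a feel for the packing, mouse messages carry the x and y position in the low and high 16 bits of lParam, as signed values. On Windows you would use the GET_X_LPARAM and GET_Y_LPARAM macros from windowsx.h; the functions below are portable stand-ins I wrote for illustration, so the bit twiddling can be seen without any Windows headers.

```cpp
#include <cassert>
#include <cstdint>

// Portable stand-ins for illustration; on Windows use
// GET_X_LPARAM / GET_Y_LPARAM from <windowsx.h> instead.

// x lives in the low-order 16 bits, as a signed value
inline int xFromLParam(std::intptr_t lParam)
{
    return static_cast<std::int16_t>(static_cast<std::uint16_t>(lParam & 0xFFFF));
}

// y lives in the next 16 bits up, also signed
inline int yFromLParam(std::intptr_t lParam)
{
    return static_cast<std::int16_t>(static_cast<std::uint16_t>((lParam >> 16) & 0xFFFF));
}

// Hypothetical helper that packs a point the same way, handy for testing
inline std::intptr_t packPoint(int x, int y)
{
    assert(x >= -32768 && x <= 32767);
    assert(y >= -32768 && y <= 32767);

    std::uint32_t lo = static_cast<std::uint16_t>(x);
    std::uint32_t hi = static_cast<std::uint16_t>(y);
    return static_cast<std::intptr_t>((hi << 16) | lo);
}
```

The signed part matters. With multiple monitors, coordinates can be negative, and treating the words as unsigned is a classic way to get things wrong.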

What we do here is capture whole sets of messages, and hand them off to another function to be processed further. In the case of mouse messages, we have this little bit of code:

    if ((msg >= WM_MOUSEFIRST) && (msg <= WM_MOUSELAST)) {
        // Handle all mouse messages
        HandleMouseMessage(hWnd, msg, wParam, lParam);
    }

So, first design choice here, is delegation. We don’t know how any application is going to want to handle the mouse messages, so we’re just going to capture them, and send them somewhere. In this case, the HandleMouseMessage() function.

/*
    Turn Windows mouse messages into mouse events which can
    be dispatched by the application.
*/
LRESULT HandleMouseMessage(HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam)
{   
    LRESULT res = 0;
    MouseEvent e{};   // value-initialize, so fields not set below (id, delta) are zero rather than garbage

    e.x = GET_X_LPARAM(lParam);
    e.y = GET_Y_LPARAM(lParam);

    auto fwKeys = GET_KEYSTATE_WPARAM(wParam);
    e.control = (fwKeys & MK_CONTROL) != 0;
    e.shift = (fwKeys & MK_SHIFT) != 0;

    e.lbutton = (fwKeys & MK_LBUTTON) != 0;
    e.rbutton = (fwKeys & MK_RBUTTON) != 0;
    e.mbutton = (fwKeys & MK_MBUTTON) != 0;
    e.xbutton1 = (fwKeys & MK_XBUTTON1) != 0;
    e.xbutton2 = (fwKeys & MK_XBUTTON2) != 0;
    bool isPressed = e.lbutton || e.rbutton || e.mbutton;

    // Based on the kind of message, there might be further
    // information to be decoded
    // mostly we're interested in setting the activity kind
    switch(msg) {
        case WM_LBUTTONDBLCLK:
	    case WM_MBUTTONDBLCLK:
	    case WM_RBUTTONDBLCLK:
            break;

        case WM_MOUSEMOVE:
            e.activity = MOUSEMOVED;
            break;

        case WM_LBUTTONDOWN:
        case WM_RBUTTONDOWN:
        case WM_MBUTTONDOWN:
        case WM_XBUTTONDOWN:
            e.activity = MOUSEPRESSED;
            break;
        case WM_LBUTTONUP:
        case WM_RBUTTONUP:
        case WM_MBUTTONUP:
        case WM_XBUTTONUP:
            e.activity = MOUSERELEASED;
            break;
        case WM_MOUSEWHEEL:
            e.activity = MOUSEWHEEL;
            e.delta = GET_WHEEL_DELTA_WPARAM(wParam);
            break;
        case WM_MOUSEHWHEEL:
            e.activity = MOUSEHWHEEL;
            e.delta = GET_WHEEL_DELTA_WPARAM(wParam);
            break;
    }

    gMouseEventTopic.notify(e);

    return res;
}

Here, I do introduce a strong opinion. I create a specific data structure to represent a MouseEvent. I do this because I want to make sure to decode all the mouse event has to offer, and present it in a very straightforward data structure that applications can access easily. So, the design choice is to trade off some memory for the sake of ease of consumption. The uievent.h file holds the data structures that represent the various events, for mouse, keyboard, joystick, touch, even file drops, and pointers in general. That’s not the only kind of message that can be decoded, but these are the ones used most for user interaction.

// Basic type to encapsulate a mouse event
enum {
    // These are based on regular events
    MOUSEMOVED,
    MOUSEPRESSED,
    MOUSERELEASED,
    MOUSEWHEEL,         // A vertical wheel
    MOUSEHWHEEL,        // A horizontal wheel

    // These are based on application semantics
    MOUSECLICKED,
    MOUSEDRAGGED,

    MOUSEENTERED,
    MOUSEHOVER,         // like move, when we don't have focus
    MOUSELEFT           // exited boundary
};

struct MouseEvent {
    int id;
    int activity;
    int x;
    int y;
    int delta;

    // derived attributes
    bool control;
    bool shift;
    bool lbutton;
    bool rbutton;
    bool mbutton;
    bool xbutton1;
    bool xbutton2;
};

From a strict performance perspective, this data structure should be a “cache line size” amount of data ideally, so the processor cache will handle it most efficiently. But, that kind of optimization can be tackled later if this is really beneficial. Initially, I’m just concerned with properly decoding the information and presenting it in an easy manner.
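If you do want to keep an eye on that, a compile-time check is a cheap way to do it. This is just a sketch: the 64 byte figure is an assumption that holds for most current desktop processors, not something the framework defines, and MouseEventSketch is a copy of the struct above so the example stands alone.

```cpp
#include <cassert>
#include <cstddef>

// Copy of the MouseEvent layout, renamed so the sketch is self-contained
struct MouseEventSketch {
    int id;
    int activity;
    int x;
    int y;
    int delta;

    bool control;
    bool shift;
    bool lbutton;
    bool rbutton;
    bool mbutton;
    bool xbutton1;
    bool xbutton2;
};

// Assumption: 64 bytes per cache line, typical of current desktop CPUs
constexpr std::size_t kAssumedCacheLineSize = 64;

inline bool fitsInCacheLine(std::size_t bytes)
{
    return bytes <= kAssumedCacheLineSize;
}

// Fail the build if someone grows the struct past one cache line
static_assert(sizeof(MouseEventSketch) <= kAssumedCacheLineSize,
    "MouseEventSketch no longer fits in a single (assumed) cache line");
```

With five ints and seven bools the struct is comfortably under the limit, so there is room to add fields before this becomes a concern.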

At the very end of HandleMouseMessage(), we see this interesting call before the return

    gMouseEventTopic.notify(e);

OK. This is where we depart from the norm of mouse handling and introduce a new concept, the publish/subscribe mechanism.

So far, we’ve got a tight coupling between the event coming in through MsgHandler, and being processed at HandleMouseMessage(). Within an application, the next logical step might be to have this explicitly call the mouse logic of the application. But, that’s not very flexible. What I’d really like to do is say “hey system, call this specific function I’m going to give you whenever a mouse event occurs”. But wait, didn’t we already do that with HandleMouseMessage()? Yes, in a sense, but that was primarily to turn the native system mouse message into something more palatable.

In general terms, I want to view the system as a publish/subscribe pattern. I want to look at the system as if it’s publishing various bits of information, and I want to ‘subscribe’ to various topics. What’s the difference? With the tightly coupled function calling thing, one function calls another, calls another, etc. With pub/sub, the originator of an event doesn’t know who’s interested in it, it just knows that several subscribers have said “tell me when an event occurs”, and that’s it.

OK, so how does this work?

I need to tell the application runtime that I’m interested in receiving mouse events. I need to implement a function that has a certain interface to it, and ‘subscribe’ to the mouse events.

// This routine will be called whenever there
// is a mouse event in the application window
void handleMouseEvent(const MouseEventTopic& p, const MouseEvent& e)
{
    mouseX = e.x;
    mouseY = e.y;

    switch (e.activity)
    {
    case MOUSERELEASED:
        // change the color for the cursor
        cColor = randomColor();
        break;
    }
}

void onLoad()
{
    subscribe(handleMouseEvent);
}

That’s pretty much it. In the ‘onLoad()’ implementation, I call ‘subscribe()’, passing in a pointer to the function that will receive the mouse events when they occur. If you’re content with this, jump over the following section, and continue at Back To Sanity. Otherwise, buckle in for some in-depth details.

There are several subscribe() functions. Each one of them is a convenience for registering a function to be called in response to information being available for a specific topic. You can see these in apphost.h

APP_EXPORT void subscribe(SignalEventTopic::Subscriber s);
APP_EXPORT void subscribe(MouseEventTopic::Subscriber s);
APP_EXPORT void subscribe(KeyboardEventTopic::Subscriber s);
APP_EXPORT void subscribe(JoystickEventTopic::Subscriber s);
APP_EXPORT void subscribe(FileDropEventTopic::Subscriber s);
APP_EXPORT void subscribe(TouchEventTopic::Subscriber s);
APP_EXPORT void subscribe(PointerEventTopic::Subscriber s);

The construction ‘EventTopic::Subscriber’ is a manifestation of how these Topics are constructed. Let’s take a look at the Topic template to understand a little more deeply. The comments in the code below give a fair explanation of the Topic template. Essentially, you just want to have a way to identify a topic, and construct a function signature to match. The topic contains two functions of interest. ‘subscribe()’, allows you to register a function to be called when the topic wants to publish information, and the ‘notify()’ function, which is the way in which the information is actually published.

/*
	Publish/Subscribe is that typical pattern where a 
	publisher generates interesting data, and a subscriber
	consumes that data.

	The Topic class contains both the publish and subscribe
	interfaces.


	Whatever is responsible for indicating the thing happened
	will call the notify() function of the topic, and the
	subscribed function will be called.

	The Topic does not incorporate any threading model
	A single topic is not a whole pub/sub system
	Multiple topics are meant to be managed together to create
	a pub/sub system.

	Doing it this way allows for different forms of composition and
	general topic management.

	T - The event payload, this is the type of data that will be
	sent when a subscriber is notified.

	The subscriber is a functor, that is, anything that has the 
	function signature.  It can be an object, or a function pointer,
	essentially anything a std::function with the Subscriber signature can hold

	This is a very nice pure type with no dependencies outside
	the standard template library
*/
template <typename T>
class Topic
{
public:
	// This is the form of subscriber
	using Subscriber = std::function<void(const Topic<T>& p, const T m)>;

private:
	std::deque<Subscriber> fSubscribers;

public:
	// Notify subscribers that an event has occurred
	// Just do a simple round robin serial invocation
	void notify(const T m)
	{
		for (auto & it : fSubscribers) {
			it(*this, m);
		}
	}

	// Add a subscriber to the list of subscribers
	void subscribe(Subscriber s)
	{
		fSubscribers.push_back(s);
	}
};

So, it’s a template. Let’s look at some instantiations of the template that are made within the runtime.

// Within apphost.h
// Make Topic publishers available
using SignalEventTopic = Topic<intptr_t>;

using MouseEventTopic = Topic<MouseEvent&>;
using KeyboardEventTopic = Topic<KeyboardEvent&>;
using JoystickEventTopic = Topic<JoystickEvent&>;
using FileDropEventTopic = Topic<FileDropEvent&>;
using TouchEventTopic = Topic<TouchEvent&>;
using PointerEventTopic = Topic<PointerEvent&>;


// Within appmain.cpp
// Topics applications can subscribe to
SignalEventTopic gSignalEventTopic;
KeyboardEventTopic gKeyboardEventTopic;
MouseEventTopic gMouseEventTopic;
JoystickEventTopic gJoystickEventTopic;
FileDropEventTopic gFileDropEventTopic;
TouchEventTopic gTouchEventTopic;
PointerEventTopic gPointerEventTopic;

The application runtime, as we saw in the HandleMouseMessage() function, will then call the appropriate topic’s ‘notify()’ function, to let the subscribers know there’s some interesting information being published. Perhaps this function should be renamed to ‘publish()’.

And that’s it. All this pub/sub machinery makes it so that we can be more flexible about when and how we handle various events within the system. You can go further and create whatever other constructs you want from here. You can add queues, multiple threads, duplicates. You can decide you want to have two places react to mouse events, completely unbeknownst to each other.
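As a small illustration of that last point, here is a sketch with two subscribers on the same topic that know nothing about each other. The Topic is a trimmed copy of the template above so the example stands alone; the cut-down MouseEvent, ClickCounter, LastPosition and wireUp() are made up for the example and are not part of the framework.

```cpp
#include <cassert>
#include <deque>
#include <functional>

// Trimmed copy of the Topic template shown earlier
template <typename T>
class Topic
{
public:
    using Subscriber = std::function<void(const Topic<T>& p, const T m)>;

    void notify(const T m)
    {
        for (auto& it : fSubscribers) {
            it(*this, m);
        }
    }

    void subscribe(Subscriber s) { fSubscribers.push_back(s); }

private:
    std::deque<Subscriber> fSubscribers;
};

// Hypothetical, cut-down event type for the sketch
struct MouseEvent { int x; int y; };
using MouseEventTopic = Topic<MouseEvent&>;

// Two independent bits of application state
struct ClickCounter { int count = 0; };
struct LastPosition { int x = 0; int y = 0; };

// Each subscriber is a lambda capturing only its own state.
// Neither one knows the other exists.
inline void wireUp(MouseEventTopic& topic, ClickCounter& counter, LastPosition& last)
{
    topic.subscribe([&counter](const MouseEventTopic&, const MouseEvent&) {
        counter.count++;
    });

    topic.subscribe([&last](const MouseEventTopic&, const MouseEvent& e) {
        last.x = e.x;
        last.y = e.y;
    });
}
```

Calling notify() once on the topic runs both lambdas in the order they subscribed. Because the lambdas capture by reference, the counter and position objects need to outlive the topic.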

Back to Sanity

Alright, let’s see how an application can actually use all this. This is the mousetrack application.

mousetrack

The app is simple. The red square follows the mouse around while it’s within the boundary of the window. All the lines from the top and bottom edges terminate at the mouse position as well.

In this case, we want to track where the mouse is, make note of that location, and use it in our drawing routines. In addition, of course, we want to draw the lines, circles, square, and background.

/*
    Demonstration of how to subscribe
    to keyboard and mouse events.

    Using encapsulated drawing and PixelArray
*/
#include "apphost.h"
#include "draw.h"


// Some easy pixel color values
#define black	PixelRGBA(0xff000000)
#define white	PixelRGBA(0xffffffff)
#define red		PixelRGBA(0xffff0000)
#define green	PixelRGBA(0xff00ff00)
#define blue	PixelRGBA(0xff0000ff)
#define yellow	PixelRGBA(0xffffff00)


// Some variables to track mouse and keyboard info
int mouseX = 0;
int mouseY = 0;
int keyCode = -1;

// global pixel array (gpa)
// The array of pixels we draw into
// This will just wrap what's already created
// for the canvas, for convenience
PixelArray gpa;

PixelPolygon gellipse1;
PixelPolygon gellipse2;

// For the application, we define the size of 
// the square we'll be drawing wherever the mouse is
constexpr size_t iconSize = 64;
constexpr size_t halfIconSize = 32;

// Define the initial color of the square we'll draw
// clicking on mouse, or space bar, will change color
PixelRGBA cColor(255, 0, 0);

// Simple routine to create a random color
PixelRGBA randomColor(uint32_t alpha = 255)
{
    return { 
        (uint32_t)maths::random(255), 
        (uint32_t)maths::random(255), 
        (uint32_t)maths::random(255), alpha };
}

// This routine will be called whenever there
// is a mouse event in the application window
void handleMouseEvent(const MouseEventTopic& p, const MouseEvent& e)
{
    // Keep track of the current mouse location
    // Use this in the drawing routine
    mouseX = e.x;
    mouseY = e.y;

    switch (e.activity)
    {
    case MOUSERELEASED:
        // change the color for the cursor
        cColor = randomColor();
        break;
    }
}

// Draw some lines from the top and bottom edges of
// the canvas, converging on the 
// mouse location
void drawLines(PixelArray &pa)
{
    // Draw some lines from the edge to where
    // the mouse is
    for (size_t x = 0; x < pa.width; x += 4)
    {
        draw::copyLine(pa, x, 0, mouseX, mouseY, white);
    }

    for (size_t x = 0; x < pa.width; x += 16)
    {
        draw::copyLine(pa, x, pa.height-1, mouseX, mouseY, white, 1);
    }

}

// Simple routine to create an ellipse
// based on a polygon.  Very crude, but
// useful enough 
INLINE void createEllipse(PixelPolygon &poly, ptrdiff_t centerx, ptrdiff_t centery, ptrdiff_t xRadius, ptrdiff_t yRadius)
{
    static const int nverts = 72;
    int steps = nverts;

    ptrdiff_t awidth = xRadius * 2;
    ptrdiff_t aheight = yRadius * 2;

    for (int i = 0; i < steps; i++) {
        auto u = (double)i / steps;
        auto angle = u * (2 * maths::Pi);

        ptrdiff_t x = (int)std::floor((awidth / 2.0) * cos(angle));
        ptrdiff_t y = (int)std::floor((aheight / 2.0) * sin(angle));
        poly.addPoint(PixelCoord({ x + centerx, y + centery }));
    }
    poly.findTopmost();
}

// Each time through the main application 
// loop, do some drawing
void onLoop()
{
    // clear screen to black to start
    draw::copyAll(gpa, black);

    drawLines(gpa);

    // draw a rectangle wherever the mouse is
    draw::copyRectangle(gpa, 
        mouseX-halfIconSize, mouseY-halfIconSize, 
        iconSize, iconSize, 
        cColor);

    // Draw a couple of green ellipses
    draw::copyPolygon(gpa, gellipse1, green);
    draw::copyPolygon(gpa, gellipse2, green);

    // force the canvas to be drawn
    refreshScreen();
}

// Called as the application is starting up, and
// before the main loop has begun
void onLoad()
{
    setTitle("mousetrack");

    // initialize the pixel array
    gpa.init(canvasPixels, canvasWidth, canvasHeight, canvasBytesPerRow);


    createEllipse(gellipse1, 120, 120, 30, 30);
    createEllipse(gellipse2, (ptrdiff_t)gpa.width - 120, 120, 30, 30);

    // setup to receive mouse events
    subscribe(handleMouseEvent);
}

At the end of the ‘onLoad()’, we see the call to subscribe for mouse events. Within handleMouseEvent(), we simply keep track of the mouse location. Also, if the user clicks a button, we will change the color of the rectangle to be drawn.
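How does a bare subscribe(handleMouseEvent) find the mouse topic? One plausible mechanism is overload resolution on the handler's signature. The sketch below shows that idea only; it does not claim to be how apphost.h actually does it. Every Demo-prefixed name, and the two subscribe() overloads, are assumptions made for the illustration.

```cpp
#include <cassert>
#include <deque>
#include <functional>

// Condensed copy of the Topic template from earlier in the article
template <typename T>
class Topic
{
public:
	using Subscriber = std::function<void(const Topic<T>& p, const T m)>;
	void notify(const T m) { for (auto& it : fSubscribers) { it(*this, m); } }
	void subscribe(Subscriber s) { fSubscribers.push_back(s); }
private:
	std::deque<Subscriber> fSubscribers;
};

// Hypothetical event types; the real ones carry more fields
struct DemoMouseEvent { int x; int y; };
struct DemoKeyEvent { int keyCode; };

using DemoMouseTopic = Topic<DemoMouseEvent&>;
using DemoKeyTopic = Topic<DemoKeyEvent&>;

// Stand-ins for the runtime's global topics
DemoMouseTopic gDemoMouseTopic;
DemoKeyTopic gDemoKeyTopic;

// One subscribe() overload per topic (assumed, for illustration).
// The handler's signature decides which overload the compiler picks.
void subscribe(DemoMouseTopic::Subscriber s) { gDemoMouseTopic.subscribe(s); }
void subscribe(DemoKeyTopic::Subscriber s) { gDemoKeyTopic.subscribe(s); }

int gLastX = 0;
int gLastKey = 0;

void onDemoMouse(const DemoMouseTopic&, const DemoMouseEvent& e) { gLastX = e.x; }
void onDemoKey(const DemoKeyTopic&, const DemoKeyEvent& e) { gLastKey = e.keyCode; }

// Register both handlers, then publish one event on each topic
void runSubscribeDemo()
{
	subscribe(onDemoMouse);	// resolves to the mouse overload
	subscribe(onDemoKey);	// resolves to the keyboard overload

	DemoMouseEvent me{ 42, 7 };
	DemoKeyEvent ke{ 32 };
	gDemoMouseTopic.notify(me);
	gDemoKeyTopic.notify(ke);

	assert(gLastX == 42);
	assert(gLastKey == 32);
}
```

With this arrangement the application never names a topic. It writes a handler with the right parameter types and calls subscribe().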

Well, that’s pretty much it. We’ve wandered through the pub/sub mechanism for event dispatch, and looked specifically at how this applies to the mouse messages coming from the Windows operating system. The design principle here is to be loosely coupled, and allow the application developer to create the constructs that best suit their needs, without imposing too much of an opinion on how that must go.

I snuck in a bit more drawing. Now there are lines, in any direction, and thickness, as well as rudimentary polygons.

In the next installment, I’ll look a bit deeper into the drawing, and we’ll look at things like screen capture, and how to record our activities to turn into demo movies.


SVG And Me – Don’t tell me, just another database!

A picture is worth 175Kb…

grapes

So, SVG right? Well, the original was, but this image was converted to a .png file for easy embedding in WordPress. The file size of the original grapes.svg is 75K. That savings in space is one of the reasons to use .svg files whenever you can.

But, I digress. The remotesvg project has been moving right along.

Last time around, I was able to use Lua syntax as a stand-in for the raw .svg syntax.  That has some benefits: since you’re in a programming language, you can use programming constructs such as loops, references, functions and the like to enhance the development of your svg.  That’s great when you’re creating something from scratch programmatically, rather than just using a graphical editing tool such as Inkscape to construct your .svg.  If you’re constructing a library of svg handling routines, you need a bit more though.

This time around, I’m adding in some parsing of svg files, as well as general manipulation of the same from within Lua.  Here’s a very simple example of how to read an svg file into a lua table:

 

local parser = require("remotesvg.parsesvg")

local doc = parser:parseFile("grapes.svg");

That’s it! You now have the file in a convenient lua table, ready to be manipulated. But wait, what do I have exactly? Let’s look at a section of that file and see what it gives us.

    <linearGradient
       inkscape:collect="always"
       id="linearGradient4892">
      <stop
         style="stop-color:#eeeeec;stop-opacity:1;"
         offset="0"
         id="stop4894" />
      <stop
         style="stop-color:#eeeeec;stop-opacity:0;"
         offset="1"
         id="stop4896" />
    </linearGradient>
    <linearGradient
       inkscape:collect="always"
       xlink:href="#linearGradient4892"
       id="linearGradient10460"
       gradientUnits="userSpaceOnUse"
       gradientTransform="translate(-208.29289,-394.63604)"
       x1="-238.25415"
       y1="1034.7042"
       x2="-157.4043"
       y2="1093.8906" />

This is part of the definitions, which later get used when drawing portions of the grapes. A couple of things to notice. As a straight ‘parsing’, you’ll get a bunch of text values. For example, y2 = “1093.8906” will turn into a value in the lua table like this: {y2 = “1093.8906”}, where the ‘1093.8906’ is still a string value. That’s useful, but a little less than perfect. Sometimes, depending on what I’m doing, retaining that value as a string might be just fine, but sometimes I’ll want that value to be an actual lua number. So, there’s an additional step I can take to parse the actual attribute values and turn them into a more native form:

local parser = require("remotesvg.parsesvg")

local doc = parser:parseFile("grapes.svg");
doc:parseAttributes();

doc:write(ImageStream)

That line with doc:parseAttributes() tells the document to go through all its attributes and parse them, turning them into more useful values from the Lua perspective. In the case above, the representation of ‘y2’ would become: {y2 = 1093.8906}, which is a number value rather than a string.

This gets very interesting when you have values where the string representation and the useful lua representation are different.

<svg>
<line x1="10" y1="20" width="10cm" height="12cm" />
</svg>

This will turn into:

{
  x1 = {value = 10},
  y1 = {value = 20},
  width = {value = 10, units = 'cm'},
  height = {value = 12, units = 'cm'}
}

Now, in my Lua code, I can access these values like so:

local doc = parser:parseFile("grapes.svg");
doc:parseAttributes();
print(doc.svg[1].x1.value);

When I subsequently want to write this value out as valid svg, it will turn back into the string representation with no loss of fidelity.

Hidden in this example is a database query. How do I know that doc.svg[1] is going to give me the ‘<line>’ element that I’m looking for? In this particular case, it’s only because the svg is so simple that I know for a fact that the ‘<line>’ element is going to show up as the first child in the svg document. But, most of the time, that is not going to be the case.

In any svg that’s of substance, there is the usage of various ‘id’ fields, and that’s typically what is used to find an element. So, how to do that in remotesvg? If we look back at the example svg, we see this ‘id’ attribute on the first gradient: id="linearGradient4892".

How could I possibly find that gradient element based on the id field? Before that though, let’s look at how to enumerate elements in the document in the first place.

local function printElement(elem)
    if type(elem) == "string" then
        -- don't print content values
        return 
    end
    
    print(string.format("==== %s ====", elem._kind))

    -- print the attributes
    for name, value in elem:attributes() do
        print(name,value)
    end
end

local function test_selectAll()
    -- iterate through all the nodes in 
    -- document order, printing something interesting along
    -- the way

    for child in doc:selectAll() do
        printElement(child)
    end
end

Here is a simple test case where you have a document already parsed, and you want to iterate through the elements, in document order, and just print them out. This is the first step in viewing the document as a database, rather than as an image. The working end of this example is the call to ‘doc:selectAll()’. This amounts to a call to an iterator that is on the BasicElem class, which looks like this:

--[[
	Traverse the elements in document order, returning
	the ones that match a given predicate.
	If no predicate is supplied, then return all the
	elements.
--]]
function BasicElem.selectElementMatches(self, pred)
	local function yieldMatches(parent, predicate)
		for idx, value in ipairs(parent) do
			if predicate then
				if predicate(value) then
					coroutine.yield(value)
				end
			else
				coroutine.yield(value)
			end

			if type(value) == "table" then
				yieldMatches(value, predicate)
			end
		end
	end

  	return coroutine.wrap(function() yieldMatches(self, pred) end)	
end

-- A convenient shorthand for selecting all the elements
-- in the document.  No predicate is specified.
function BasicElem.selectAll(self)
	return self:selectElementMatches()
end

As you can see, ‘selectAll()’ just turns around and calls ‘selectElementMatches()’, passing in no parameters. The selectElementMatches() function then does the actual work. In Lua, there are a few ways to create iterators. In this particular case, where we want to recursively traverse down a hierarchy of nodes (document order), it’s easiest to use this coroutine method. You could instead keep a stack of nodes, pushing as you go down the hierarchy, popping as you return back up, but this coroutine method is much more compact to code, if a bit harder to understand if you’re not used to coroutines. The end result is an iterator that will traverse down a document hierarchy, in document order.

Notice also that the ‘selectElementMatches’ function takes a predicate. A predicate is simply a function that takes a single parameter, and will return ‘true’ or ‘false’ depending on what it sees there. This will become useful.

So, how to retrieve an element with a particular ID? Well, when we look at our elements, we know that the ‘id’ field is one of the attributes, so essentially, what we want to do is traverse the document looking for elements that have an id attribute that matches what we’re looking for.

function BasicElem.getElementById(self, id)
    -- predicate: true only when the entry's id matches
    local function filterById(entry)
        --print("filterById: ", entry.id, id)
        if entry.id == id then
            return true;
        end
    end

    -- return the first match in document order
    for child in self:selectElementMatches(filterById) do
        return child;
    end
end

Here’s a convenient function to do just that. And to use it:

local elem = doc:getElementById("linearGradient10460")

That will retrieve the second linear gradient of the pair of gradients from our svg fragment. That’s great! And the syntax is looking very much like what I might write in JavaScript against the DOM. But, it’s just a database!

Given selectElementMatches(), you’re not limited to querying against attribute values. You can get at anything, and form queries as complex as you like. For example, you could find all the elements that are deep green, and turn them purple with a simple query loop.

Here’s an example of finding all the elements of a particular kind:

local function test_selectElementMatches()
    print("<==== selectElementMatches: entry._kind == 'g' ====>")
	for child in doc:selectElementMatches(function(entry) if entry._kind == "g" then return true end end) do
		print(child._kind)
	end
end

Or finding all the elements that have a ‘sodipodi’ attribute of some kind:

local function test_selectAttribute()
    -- select the elements that have an attribute
    -- with the name 'sodipodi' in them
    local function hasSodipodiAttribute(entry)
        if type(entry) ~= "table" then
            return false;
        end

        for name, value in entry:attributes() do
            --print("hasSodipodi: ", entry._kind, name, value, type(name))
            if name:find("sodipodi") then
                return true;
            end
        end

        return false
    end

    for child in doc:selectElementMatches(hasSodipodiAttribute) do
        if type(child) == "table" then
            printElement(child)
        end
    end
end

Of course, just finding these elements is one thing. Once found, you can use the result to filter out the elements you don’t want, for example, eliminating the ones that are Inkscape specific.

Well, there you have it. First, you can construct your svg programmatically using Lua syntax. Alternatively, you can simply parse an svg file into a lua structure. Last, you can query your document, no matter how it was constructed, for fun and profit.

Of course, the real benefit of being able to parse, and find elements and the like, is it makes manipulating the svg that much easier. Find the node that represents the graph of values, for example, and change those values over time for some form of animation…