Build vs Buy in software development – Part 1, the selection criteria

There is a common theme throughout life: “Should I build, or buy?” It doesn’t seem to matter which thing is under consideration; the process we use to make the decision should remain the same. This is a multi-part series on making the build vs buy decision for software projects.

This is WAAV Studio, a rapidly evolving application used for “Smart City” visualization and management. There are several components that go into it, and plenty of ‘build vs buy’ decisions to be made.

Here are the considerations I’ve had in deciding which way to go on several components:

  • What is the scope of the component – How broad, and fundamental is it
  • Do I have the expertise to build it
  • How long will it take to integrate a purchased component
  • How long will it take to build it myself
  • What is the cost associated with purchasing it
  • How will maintenance look over time
  • The solution must be easy
  • The solution must be portable to multiple platforms
  • The solution must be small, fast, and intuitive

There are perhaps a couple more things to consider, but these are the highlights. There are some market considerations as well, but they essentially boil down to:

How important is time to market?

and

What is your budget?

I’ll start with the WAAV Studio time to market and budget first. Time to market in this case is measured in months. A polished product needs to be available within a 12 month time period from when I started (January). The product must meet various milestones of usability along the way, but overall, the 1.0 version must be user friendly within 12 months.

Second is the budget. This is the product of a very small team, using tools such as Copilot and ChatGPT, as well as their core skills and programming tools. There is effectively no budget to speak of for purchasing other bits of software.

With those hard constraints, I consider the list of ‘modules’ that need to be in this application.

  • Geospatial mapping
  • Visualize various geospatial data sets
    • KMZ
    • GeoJSON
    • Shapefile
    • .csv data sets
  • Visualize various graphics assets
    • .png images
    • .gif images
    • .jpeg images
    • .svg images

That’s the rough list, with lots of detail on each item. The biggest and first build/buy decision was around mapping visualization. The easy and typical answer would be “just use Google Earth”, or something like that. Even before that, it might be “Use ArcGIS”, or any number of GIS software packages. Going this route might be expedient, but will lead you down a path of being constrained by whatever that core package is capable of.

A few of the criteria are around ease of use, and size. This application is something a typical city administrator will use on occasion. Their key job function might not be related to this software, so they need to be able to come to it after 3 months of absence, and still be able to pick it up and use it effectively to achieve some immediate goal. This is a hard one to achieve, and has to do with the UI elements, their layout, and natural usage. Again, when you select a core package, you may not have enough control over these factors to have a satisfying outcome. ArcGIS, for example, has all the features a GIS professional could possibly want. The package size is measured in the 10s of megabytes, and the user’s manual would make an encyclopedia blush if it were printed on paper. This is not an app that can be picked up by your typical city clerk in a 10 minute session, let alone mastered six months later, without constant usage.

First decision: Create a core mapping component from scratch, without reliance, or dependence on any existing mapping components.

This is such a core, fundamental decision, it drives decision making across the other components, so, it better be good.

I have never built a mapping platform before WAAV Studio, so I started with the naive notion that I could actually do it. I mean, how hard could it be, right? All software engineers have the urge to build from scratch, and just jump onto any coding challenge. My years of wisdom told me, I better have a way to evaluate my progress on the task, and determine if it was time to abandon my naive first choice for a better choice later down the line.

In the next part, I’ll look into what goes into the core mapping platform, and which other components were chosen for the build vs buy machine.


SVG From The Ground Up – Time to wrap it up

This time around, we’re in the final stretch, going from a file, through the parsing, creating an object model, and finally, rendering an image.

To recap:

We started with low level building blocks to scan byte streams: Parsing Fundamentals

We got into the guts of XML parsing: It’s XML, how hard could it be

We then looked at core data structures, and how to parse their content: Along the right path

Most recently, we went over several drawing primitives and data structures: Can you imaging that

This series is a reflection on the code that can be found in this repository: svg2b2d, so you can follow along, and freely use it to make your own creations.

Now let’s get back to the build…

Thus far in the series, we’ve been looking at the guts of things, essentially from the bottom up. For this final installment, I’m going to go the other way around, and start from the end result and work back towards the beginning.

The goal I have for a program is to turn a .svg file into a .png file. That is, take the .svg, parse it, render it into a bitmap image, and save that image as a .png. We’ll call this program svg2png, and here it is:

#include <cstdio>

#include "blend2d.h"
#include "mmap.h"
#include "svg.h"

using namespace filemapper;


int main(int argc, char** argv)
{
    if (argc < 2)
    {
        printf("Usage: svg2png <svg file>  [output file]\n");
        return 1;
    }

    // create an mmap for the specified file
    const char *filename = argv[1];
    auto mapped = mmap::createShared(filename);

    // bail out with a non-zero exit code if the file could not be mapped
    if (mapped == nullptr)
    {
        printf("Could not map file: %s\n", filename);
        return 1;
    }

    // Create the BLImage we're going to draw into
    BLImage outImage(420, 340, BL_FORMAT_PRGB32);

    // parse the svg, and draw it into the image
    parseSVG(mapped->data(), mapped->size(), outImage);
    
    // save the image to a png file
    const char *output = argc > 2 ? argv[2] : "output.png";
    outImage.writeToFile(output);

    // close the mapped file
    mapped->close();


    return 0;
}

We’ve seen bits and pieces of this along the way. First, we get a filename from the command line, and create a memory mapped file from it. That allows us to deal with the contents as a pointer in memory, which makes the parsing really easy. Then we set up an initial BLImage object. The fact that it starts out as a certain size doesn’t matter. It will be changed later in the parsing process. Then there’s the call to parseSVG, which is the entry point to parsing SVG in this case. And finally, we output the image using an inbuilt capability of the blend2d library ‘outImage.writeToFile()’, and we’re done! What could be easier?

Let’s take a look at parseSVG(), since that’s where the real action is.

#include "svg.h"
#include "svgshapes.h"
#include "bspanutil.h"

#include <vector>
#include <memory>

bool parseSVG(const void* bytes, const size_t sz, BLImage& outImage)
{
    svg2b2d::ByteSpan inChunk(bytes, sz);
    
    // Create a new document
    svg2b2d::SVGDocument doc;

    // Load the document from the data
    doc.readFromData(inChunk);
    
    
    // Draw the document into a IRender
    outImage.create(doc.width(), doc.height(), BL_FORMAT_PRGB32);
    SVGRenderer ctx(outImage);
    doc.draw(ctx);
    ctx.end();
    
    return true;
}

  • Create a ByteSpan for the pointer and size that we’ve been given (the memory mapped file)
  • Create an instance of an SVGDocument (a container to hold what we parse)
  • Read/parse the contents, filling in the SVGDocument
  • Resize the image to match the size of the document
  • Create a rendering context and connect it to the image
  • Draw the document into the rendering context
  • Done
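
The first step in that list leans on a lightweight view type. Here is a minimal sketch of what such a span might look like, assuming it holds nothing more than a begin/end pointer pair. ‘SpanSketch’, ‘size()’ and ‘toString()’ are names I made up for illustration; the real svg2b2d::ByteSpan has more to it, so treat this as a picture of the idea rather than the library’s code.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a ByteSpan: a non-owning view over bytes.
// It never allocates and never frees; the mapped file owns the memory.
struct SpanSketch
{
    const unsigned char* fStart{ nullptr };
    const unsigned char* fEnd{ nullptr };

    SpanSketch() = default;
    SpanSketch(const void* data, std::size_t sz)
        : fStart(static_cast<const unsigned char*>(data))
        , fEnd(static_cast<const unsigned char*>(data) + sz)
    {}

    // Number of bytes in the view
    std::size_t size() const { return static_cast<std::size_t>(fEnd - fStart); }

    // A span is 'true' when it has at least one byte
    explicit operator bool() const { return fStart != nullptr && fStart < fEnd; }
};

// Copy the viewed bytes into an owning string (the only allocation here)
inline std::string toString(const SpanSketch& s)
{
    if (!s)
        return std::string();
    return std::string(reinterpret_cast<const char*>(s.fStart),
                       reinterpret_cast<const char*>(s.fEnd));
}
```

Because the view is just two pointers, passing it around by value is cheap, which is why the parser can hand chunks down the call chain without copying the file contents.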

Light and simple. So, let’s go one step further, and look at that ‘readFromData()’, by examining the whole SVGDocument object:

	struct SVGDocument : public IDrawable
	{

		// All the drawable nodes within this document
		std::shared_ptr<SVGRootNode> fRootNode{};
		std::vector<std::shared_ptr<IDrawable>> fShapes{};
		BLBox fExtent{};

		SVGDocument() = default;

		double width() const { 
			if (fRootNode == nullptr)
				return 100;
			return fRootNode->width();
        }

		double height() const {
			if (fRootNode == nullptr)
				return 100;
			return fRootNode->height();
		}

		void draw(IRender& ctx) override
		{
			for (auto& shape : fShapes)
			{
				shape->draw(ctx);
			}
		}

		// Add a node that can be drawn
		void addNode(std::shared_ptr<SVGObject> node)
		{
			fShapes.push_back(node);
		}


		// Load the document from an XML Iterator
		// Since this is the top level document, we just want to kick
		// off loading the root node 'svg', and we're done 
		void loadFromIterator(XmlElementIterator& iter)
		{

			// skip past any elements that come before the 'svg' element
			while (iter)
			{
				const XmlElement& elem = *iter;

				if (!elem)
					break;

				//printXmlElement(*iter);

				// Skip over these node types we don't know how to process
				if (elem.isComment() || elem.isContent() || elem.isProcessingInstruction())
				{
					iter++;
					continue;
				}


				if (elem.isStart() && (elem.name() == "svg"))
				{
                    // There should be only one root node in a document, so we should 
                    // break here, but, curiosity...
                    fRootNode = SVGRootNode::createFromIterator(iter);
                    if (fRootNode != nullptr)
                    {
                        addNode(fRootNode);
                    }
				}

				iter++;
			}
		}

		bool readFromData(const ByteSpan &inChunk)
		{
			ByteSpan s = inChunk;

			XmlElementIterator iter(s);

			loadFromIterator(iter);

			return true;
		}


	};

The SVGDocument has two primary things it achieves. The first is to parse the raw svg and turn it into a structured thing that can later be rendered, or some other processing can occur on it. The second thing is to provide a convenient entry point to render into a drawing context.

The readFromData() call should be familiar. It’s just XML after all, isn’t it? So, create an XmlElementIterator on the chunk of memory we were passed, and get to parsing. The ‘loadFromIterator()’ function above that is the one that’s taking the individual nodes, and doing something with them. In this case, we’re only interested in the top level ‘<svg>’ node, so when we see that, we tell it to load itself from the iterator, and we save it as our root node.
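
To see the skipping logic on its own, here is a small sketch that runs the same decision over a plain vector of elements instead of a live iterator. ‘ElemKind’, ‘ElemSketch’ and ‘findSvgRoot()’ are hypothetical names used only for this illustration; the real XmlElement carries a lot more information.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified element record
enum class ElemKind { Comment, Content, ProcessingInstruction, Start, End, SelfClosing };

struct ElemSketch
{
    ElemKind kind;
    std::string name;
};

// Return the index of the first 'svg' start tag, or -1 if there is none
inline int findSvgRoot(const std::vector<ElemSketch>& elems)
{
    for (std::size_t i = 0; i < elems.size(); ++i)
    {
        const ElemSketch& e = elems[i];

        // Comments, text content, and processing instructions
        // (such as <?xml ...?>) are not interesting at this level
        if (e.kind == ElemKind::Comment ||
            e.kind == ElemKind::Content ||
            e.kind == ElemKind::ProcessingInstruction)
            continue;

        if (e.kind == ElemKind::Start && e.name == "svg")
            return static_cast<int>(i);
    }

    return -1;
}
```

Unlike the loop in SVGDocument, this sketch stops at the first ‘svg’ element it finds, which is the "we should break here" behavior the comment in the real code hints at.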

The SVGRootNode itself isn’t much more complicated, and does a similar act:

	struct SVGRootNode : public SVGGroup
	{
		std::shared_ptr<SVGPortal> fPortal;

		SVGRootNode() :SVGGroup(nullptr) { setRoot(this); }
		SVGRootNode(IMapSVGNodes *root)
			: SVGGroup(root)
		{
			setRoot(this);
		}
		
		double width()
		{
			if (fPortal != nullptr)
				return fPortal->width();
			return 100;
		}

		double height()
		{
			if (fPortal != nullptr)
				return fPortal->height();
			return 100;
		}

		void loadSelfFromXml(const XmlElement& elem) override
		{
			SVGGroup::loadSelfFromXml(elem);
			
			fPortal = SVGPortal::createFromXml(root(), elem, "portal");
		}

		static std::shared_ptr<SVGRootNode> createFromIterator(XmlElementIterator& iter)
		{
			auto node = std::make_shared<SVGRootNode>();
			node->loadFromIterator(iter);

			return node;
		}

		void draw(IRender& ctx) override
		{
			ctx.save();

			// Start with default state

			ctx.setFillStyle(BLRgba32(0, 0, 0));
			
			ctx.setStrokeStyle(BLRgba32(0));
			ctx.setStrokeWidth(1.0);
			ctx.textSize(16);
			
			// Apply attributes that have been gathered
			// in the case of the root node, it's mostly the viewport
			applyAttributes(ctx);

			// Draw the children
			drawSelf(ctx);

			ctx.restore();
		}

	};

The fact that it’s a subclass of SVGGroup takes care of some boilerplate code for loading self-enclosing elements, and grouped elements, and specialized elements and the like. The svgshapes.h file contains the nitty gritty details, so you should check there, so I don’t bore you with it all here. You can see that in the drawing routine, we set up the drawing context to have the default values the SVG environment expects. There are things such as having no stroke, but a black fill, to start. There are other items such as setting up the drawing coordinates, according to the ‘viewBox’ on the svg element, if it exists, and that happens in the ‘applyAttributes()’ function call.
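
For a feel of what "setting up the drawing coordinates" from a viewBox involves, here is a rough sketch of the arithmetic. It assumes the simplest possible mapping, where the viewBox is stretched to fill the image and ‘preserveAspectRatio’ is ignored. The names ‘ViewBoxSketch’, ‘MappingSketch’, ‘parseViewBox()’ and ‘mapToImage()’ are mine, and this is not necessarily how applyAttributes() does it.

```cpp
#include <sstream>
#include <string>

struct ViewBoxSketch { double minX{ 0 }, minY{ 0 }, width{ 0 }, height{ 0 }; };

// A user-space point (x, y) lands on pixel
//   ((x + translateX) * scaleX, (y + translateY) * scaleY)
struct MappingSketch { double scaleX{ 1 }, scaleY{ 1 }, translateX{ 0 }, translateY{ 0 }; };

// Parse "min-x min-y width height"; commas are accepted as separators too
inline bool parseViewBox(std::string text, ViewBoxSketch& vb)
{
    for (char& c : text)
        if (c == ',')
            c = ' ';

    std::istringstream in(text);
    if (!(in >> vb.minX >> vb.minY >> vb.width >> vb.height))
        return false;

    // A viewBox with no area cannot be mapped
    return vb.width > 0 && vb.height > 0;
}

// Stretch the viewBox so that it exactly covers the image
inline MappingSketch mapToImage(const ViewBoxSketch& vb, double imageW, double imageH)
{
    MappingSketch m;
    m.scaleX = imageW / vb.width;
    m.scaleY = imageH / vb.height;
    m.translateX = -vb.minX;
    m.translateY = -vb.minY;
    return m;
}
```

With a viewBox of "10 10 190 190" and a 380 x 380 image, the scale comes out as 2 in both directions and the user-space point (10, 10) lands on pixel (0, 0).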

Here’s another picture to keep you interested in the possibilities.

One last guidepost in the code, before we wrap this up. The SVGCompoundNode object is most important for the document structure. That’s where the ‘loadFromIterator()’ function lives. Classes such as SVGGroup, and SVGGradient, descend from there, and just implement a few calls to deal with further grouped things. So, that’s a piece of code to take a look at. It’s structured and organized to make it simple for sub-classes to just add a little bit here and there to specialize for given situations. Otherwise, it’s meant to be a relatively safe default to handle the processing of nodes, whether they be self contained, or compound.
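
To make the "self contained, or compound" distinction concrete, here is a stripped down sketch of how a compound node can build its children from a flat stream of start, end, and self-closing elements. Everything in it (‘TokKind’, ‘TokSketch’, ‘NodeSketch’, ‘loadCompound()’) is a hypothetical stand-in; the real SVGCompoundNode also creates specialized node types and loads attributes along the way.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

enum class TokKind { Start, End, SelfClosing };

struct TokSketch
{
    TokKind kind;
    std::string name;
};

struct NodeSketch
{
    std::string name;
    std::vector<std::shared_ptr<NodeSketch>> children;
};

// Precondition: toks[pos] is the Start token of the node to build.
// On return, pos is one past the matching End token.
inline std::shared_ptr<NodeSketch> loadCompound(const std::vector<TokSketch>& toks, std::size_t& pos)
{
    auto node = std::make_shared<NodeSketch>();
    node->name = toks[pos].name;
    ++pos;  // consume our own start tag

    while (pos < toks.size())
    {
        const TokSketch& t = toks[pos];

        if (t.kind == TokKind::End)
        {
            ++pos;  // consume our end tag, and we're done
            break;
        }

        if (t.kind == TokKind::SelfClosing)
        {
            // Self contained element: a leaf with no children
            auto leaf = std::make_shared<NodeSketch>();
            leaf->name = t.name;
            node->children.push_back(leaf);
            ++pos;
        }
        else
        {
            // A nested start tag: recurse, which advances pos past its end tag
            node->children.push_back(loadCompound(toks, pos));
        }
    }

    return node;
}
```

The recursion is what lets a sub-class add only a small amount of its own behavior: the base routine walks the stream, and each nested compound element takes over until its matching end tag.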

And that’s about it. We’ve gone from the beginnings of how to scan stuff at a byte level, all the way through a simple XML parser, and into the complexities of parsing details of SVG types, and constructing a tree to be rendered, and saved as an image. All of the images shared in these articles have been rendered using the code built here, so it’s capable of doing fairly complex SVGs, beyond the typical rendering of the Ghostscript tiger. From here, if you have need for SVG in your own code, you can pretty much just lift the svg2b2d directly, and start using it. I did not cover doing text in SVG in this article series, but it’s actually not as hard as it might seem, simply because the blend2d library already deals with text rendering as well. Normally you’d have to contemplate using freetype, or stb_xxx something or other to get text, which just increases your surface area. With blend2d, you don’t, it does all that as well.

So, there you have it. SVG From The Ground Up! A step by step guide on how to go from bytes in a file, to bits on the screen. I hope this helps those who are inspired to simply learn some of the details, if not those who actually want to implement their own. I am personally using SVG for visualization and UI elements. You can imagine the common refrain of “just use HTML and the browser”, but what’s the fun in that?

Until next time, go parse you some SVG!!


SVG From the Ground Up – Can you imaging that?

It’s time to put the pieces together and get to rendering some SVG!

In the last couple of installments, we were looking at how to do some very low level scanning and parsing. We got through the basics of XML, and looked at some geometry with parsing of the SVG <path> ‘d’ attribute. The next step is to decide on how we’re going to represent an entire document in memory so that we can render the whole image. This is a pretty big subject, so we’ll start with some design constraints and goals.

At the end of the day, I want to turn the .svg file into bits on the screen. The blend2d graphics library has a BLContext object, which does all the drawing that I need. It can draw everything from individual lines, to gradient shaded polygons, and bitmaps. SVG has some particular drawing requirements, in terms of which elements are drawn first, how they are styled, how they are grouped, etc. One example of this is the usage of Cascading Style Sheets (CSS). What this means is that if I set one attribute, such as a fill color for polygons, on an element, that attribute applies to all the elements nested beneath it in the tree, until something changes it.

Example:

<svg
 viewBox='10 10 190 190'
 xmlns="http://www.w3.org/2000/svg">
<g stroke="red" stroke-width='4'>
  <line x1='10' y1='10' x2='200' y2='200'/>
  <line stroke='green' x1='10' y1='100' x2='200' y2='200'/>
  <line stroke-width='8' stroke='blue' x1='10' y1='200' x2='200' y2='200'/>
  <rect x='100' y='10' width='50' height='50' />
</g>
</svg>

The ‘<g…’ serves as a grouping mechanism. It allows you to set some attributes which will apply to all the elements that are within that group. In this case, I set the stroke (the color used to draw lines) to ‘red’. Until something later changes this, the default will be red lines. I also set the stroke-width (number of pixels used to draw the lines). So, again, unless it is explicitly changed, all subsequent lines in the group will have this width.

The first line drawn, since it does not change the color or the width, uses red, and 4.

The second line drawn changes the color to ‘green’, but does not change the width.

The third line drawn changes the color to blue, and changes the width to 8.

The rectangle does not explicitly change the color, so it gets a red outline with a width of 4, and the default fill color of black.

Of note, changing the attributes on a single element, such as the green line, does not change that attribute for sibling elements, it only applies to that single element. Only attributes applied at a group level will affect the elements within that group, from above.
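
The rule can be boiled down to a few lines: the effective attributes of an element are the group’s attributes with the element’s own laid on top, and nothing leaks sideways to siblings. Here is a toy sketch of that, using plain string maps. ‘AttrMap’ and ‘resolve()’ are names invented for this illustration, and the library itself does not resolve attributes this way (it applies them to a drawing context, as we’ll see).

```cpp
#include <map>
#include <string>

using AttrMap = std::map<std::string, std::string>;

// Effective attributes for one element:
// start with what the group provides, then let the element override
inline AttrMap resolve(const AttrMap& inherited, const AttrMap& own)
{
    AttrMap result = inherited;   // a copy, so siblings never see our changes

    for (const auto& kv : own)
        result[kv.first] = kv.second;

    return result;
}
```

Running the example document through it: the group supplies stroke=red and stroke-width=4, the second line overrides only the stroke, the third overrides both, and the rectangle overrides nothing, so it ends up with the group’s values again.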

This imposes some of our first requirements. We need an object that can contain drawing attributes. In addition, there’s a difference between objects that contain the attributes, such as stroke-width, stroke, fill, etc, and actual geometry, such as line, polygon, path. I will drop SVGObject here, as it is a baseline. If you want to follow along, the code is in the svgtypes.h file.


struct IMapSVGNodes;    // forward declaration



struct SVGObject : public IDrawable
{
    XmlElement fSourceElement;

    IMapSVGNodes* fRoot{ nullptr };
    std::string fName{};    // The tag name of the element
    BLVar fVar{};
    bool fIsVisible{ false };
    BLBox fExtent{};



    SVGObject() = delete;
    // copy the same fields the assignment operator copies
    SVGObject(const SVGObject& other) :fRoot(other.fRoot), fName(other.fName), fVar(other.fVar) {}
    SVGObject(IMapSVGNodes* root) :fRoot(root) {}
    virtual ~SVGObject() = default;

    SVGObject& operator=(const SVGObject& other) {
        fRoot = other.fRoot;
        fName = other.fName;
        fVar = other.fVar;    // assign the member, rather than declaring a local

        return *this;
    }

    IMapSVGNodes* root() const { return fRoot; }
    virtual void setRoot(IMapSVGNodes* root) { fRoot = root; }

    const std::string& name() const { return fName; }
    void setName(const std::string& name) { fName = name; }

    const bool visible() const { return fIsVisible; }
    void setVisible(bool visible) { fIsVisible = visible; }

    const XmlElement& sourceElement() const { return fSourceElement; }

    // sub-classes should return something interesting as BLVar
    // This can be used for styling, so images, colors, patterns, gradients, etc
    virtual const BLVar& getVariant()
    {
        return fVar;
    }

    void draw(IRender& ctx) override
    {
        ;// draw the object
    }

    virtual void loadSelfFromXml(const XmlElement& elem)
    {
        ;
    }

    virtual void loadFromXmlElement(const svg2b2d::XmlElement& elem)
    {
        fSourceElement = elem;

        // load the common attributes
        setName(elem.name());

        // call to loadselffromxml
        // so sub-class can do its own loading
        loadSelfFromXml(elem);
    }
};

As a base object, it contains the bare minimum that is common across all subsequent objects. It also has a couple of extras which have proven to be convenient, if not strictly necessary.

The strictly necessary is the ‘void draw(IRender &ctx)’. Almost all objects, whether they be attributes, or elements, will need to affect the drawing context. So, they all will need to be given a chance to do that. The ‘draw()’ routine is what gives them that chance.

All objects need to be able to construct themselves from the xml element stream, so the convenient ‘load..’ functions sit here. Whether it’s an element, or an attribute, it has a name, so we set the name as well. Attributes can set their name independently from being loaded from the XmlElement, so this is a bit of specialization, but it’s ok.

There is this bit of an oddity in the forward declaration of ‘struct IMapSVGNodes; // forward declaration’. As we’ll see much later, we need the ability to lookup nodes based on an ID, so we need an interface somewhere that allows us to do that. As every node constructed might need to do this, we need a way to pass this interface down the tree, without copying it, and without causing circular references, so the forward declaration, and use of the ‘root()’ method.

That’s got us started. We now have something of a base object.

Next up, we have SVGVisualProperty

// SVGVisualProperty
    // This is meant to be the base class for things that are optionally
    // used to alter the graphics context.
    // If isSet() is true, then the drawSelf() is called.
	// sub-classes should override drawSelf() to do the actual drawing
    //
    // This is used for things like; Paint, Transform, Miter, etc.
    //
    struct SVGVisualProperty :  public SVGObject
    {
        bool fIsSet{ false };

        //SVGVisualProperty() :SVGObject(),fIsSet(false){}
        SVGVisualProperty(IMapSVGNodes *root):SVGObject(root),fIsSet(false){}
        SVGVisualProperty(const SVGVisualProperty& other)
            :SVGObject(other)
            ,fIsSet(other.fIsSet)
        {}

        SVGVisualProperty& operator=(const SVGVisualProperty& rhs)
        {
            SVGObject::operator=(rhs);
            fIsSet = rhs.fIsSet;
            
            return *this;
        }

        void set(const bool value) { fIsSet = value; }
        bool isSet() const { return fIsSet; }

		virtual void loadSelfFromChunk(const ByteSpan& chunk)
        {
			;
        }

        virtual void loadFromChunk(const ByteSpan& chunk)
        {
			loadSelfFromChunk(chunk);
        }
        
        // Apply property to the context conditionally
        virtual void drawSelf(IRender& ctx)
        {
            ;
        }

        void draw(IRender& ctx) override
        {
            if (isSet())
                drawSelf(ctx);
        }

    };

It’s not much, and you might question whether it needs to even exist. Maybe its couple of routines can just be merged into the SVGObject itself. That is a simple design change to contemplate, as the only real attribute introduced here is the ‘isSet()’. This is essentially a way to say ‘the value is null’. If I had nullable types, I’d just use that mechanism. But, it also allows you to turn an attribute on and off programmatically, which might turn out to be useful.
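
For comparison, here is roughly what the ‘nullable’ alternative could look like with C++17’s std::optional. This is only a sketch of the road not taken, with made up names (‘PaintStateSketch’, ‘StrokeWidthSketch’), not something the library does.

```cpp
#include <optional>

// Hypothetical stand-in for the bit of drawing state we care about
struct PaintStateSketch
{
    double strokeWidth{ 1.0 };
};

struct StrokeWidthSketch
{
    // An empty optional means "this attribute was never specified"
    std::optional<double> fWidth{};

    // Only touch the drawing state when there is a value to apply
    void apply(PaintStateSketch& state) const
    {
        if (fWidth)
            state.strokeWidth = *fWidth;
    }
};
```

The trade-off is the one mentioned above: with an optional, switching the attribute off means throwing the value away, whereas a separate ‘isSet’ flag lets you toggle the attribute while keeping the value around.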

Now we can look at a single attribute, the stroke-width, and see how it goes from an xmlElement attribute, to a property in our tree.

    //=========================================================
    // SVGStrokeWidth
    //=========================================================
    
    struct SVGStrokeWidth : public SVGVisualProperty
    {
		double fWidth{ 1.0};

		//SVGStrokeWidth() : SVGVisualProperty() {}
		SVGStrokeWidth(IMapSVGNodes* iMap) : SVGVisualProperty(iMap) {}
		SVGStrokeWidth(const SVGStrokeWidth& other) :SVGVisualProperty(other) { fWidth = other.fWidth; }
        
		SVGStrokeWidth& operator=(const SVGStrokeWidth& rhs)
		{
			SVGVisualProperty::operator=(rhs);
			fWidth = rhs.fWidth;
			return *this;
		}

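		// apply the saved width to the drawing context
		void drawSelf(IRender& ctx) override
		{
			ctx.setStrokeWidth(fWidth);
		}

		// parse the attribute text, and mark the property as set
		void loadSelfFromChunk(const ByteSpan& inChunk) override
		{
			fWidth = toNumber(inChunk);
			set(true);
		}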

		static std::shared_ptr<SVGStrokeWidth> createFromChunk(IMapSVGNodes* root, const std::string& name, const ByteSpan& inChunk)
		{
			std::shared_ptr<SVGStrokeWidth> sw = std::make_shared<SVGStrokeWidth>(root);

			// Only load when the chunk is not empty; otherwise the property stays unset
			if (inChunk)
				sw->loadFromChunk(inChunk);

			return sw;
		}

		static std::shared_ptr<SVGStrokeWidth> createFromXml(IMapSVGNodes* root, const std::string& name, const XmlElement& elem)
		{
			return createFromChunk(root, name, elem.getAttribute(name));
		}
    };

It starts from the ‘createFromXml…’. We can look at the main parsing loop later, but there is a point where we’re looking at the attributes of an element, and we’ll run across the ‘stroke-width’, and call this function. The ‘createFromChunk’ is then called, which then calls loadFromChunk after instantiating an object.

There are a couple more design choices being made here. First is the fact that we’re using ‘std::shared_ptr’. This implies heap allocation of memory, and this is the right place to finally make such a decision. We did not want the XML parser itself to do any allocations, but we’re finally at the point where we need to. It’s possible to not even do allocations here, just have the attributes allocated on the objects that use them. But, since attributes can be shared, it’s easier just to bite the bullet now, and use shared_ptr.

In the case of stroke-width, we want to save the width specified (call toNumber()), and when it comes time to apply that width, in the ‘drawSelf()’, we make the right call on the drawing context ‘setStrokeWidth()’. Since the same drawing context is used throughout the rendering process, setting an attribute at one point will make that attribute sticky, until something else changes it, which is the behavior that we want.
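
If you’re wondering what a toNumber() style conversion boils down to, here is a simple sketch built on std::strtod. It is a stand-in, not the library’s routine: ‘toNumberSketch()’ and its ‘fallback’ parameter are my own inventions, and unlike the rest of the parser it makes a temporary copy of the text, because strtod wants a null terminated string.

```cpp
#include <cstdlib>
#include <string>

// Convert the characters in [start, end) to a double.
// Returns 'fallback' when the range is empty or does not begin with a number.
inline double toNumberSketch(const char* start, const char* end, double fallback = 0.0)
{
    if (start == nullptr || start >= end)
        return fallback;

    // strtod needs a null terminated string, so copy the span first
    const std::string text(start, end);

    char* stop = nullptr;
    const double value = std::strtod(text.c_str(), &stop);

    // If no characters were consumed, there was no number to read
    return (stop == text.c_str()) ? fallback : value;
}
```

A value such as "4px" converts to 4 here, since strtod stops at the first character it can’t use; handling the unit itself would be a separate step.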

I would like to describe the ‘stroke’ and ‘fill’ attributes, but they are actually the largest portions of the library. Setting these attributes can occur in so many different ways, it’s worth taking a look at them. Here I will just show a few ways in which they can be used, so you get a feel for how involved they are:

<line stroke="blue" x1='0' y1='0'  x2='100'  y2='100'/>
<line stroke="rgb(0,0,255)" x1='0' y1='0'  x2='100'  y2='100'/> 
<line stroke="rgba(0,0,255,1.0)" x1='0' y1='0'  x2='100'  y2='100'/> 
<line stroke="rgba(0,0,100%,1.0)" x1='0' y1='0'  x2='100'  y2='100'/> 
<line stroke="rgba(0%,0%,100%,100%)" x1='0' y1='0'  x2='100'  y2='100'/> 
<line style = "stroke:blue" x1='0' y1='0'  x2='100'  y2='100'/> 
<line stroke= "url(#SVGID_1)" x1='0' y1='0'  x2='100'  y2='100'/> 

And more…

There is a bewildering assortment of ways in which you can set a stroke or fill, and they don’t all resolve to a single color value. It can be patterns, gradients, even other graphics. So, it can get pretty intense. The SVGPaint structure does a good job of representing all the possibilities, so take a look at that if you want to see the intimate details.
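
Just to give a taste of one corner of that, here is a sketch that handles only the ‘rgb(...)’ and ‘rgba(...)’ forms from the list above, with plain numbers or percentages. Named colors, ‘url(#...)’ references, and everything else are left out. The names ‘RgbaSketch’, ‘toChannel()’ and ‘parseRgbFunction()’ are hypothetical, and SVGPaint does considerably more than this.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <string>
#include <vector>

struct RgbaSketch { std::uint8_t r{ 0 }, g{ 0 }, b{ 0 }, a{ 255 }; };

// Convert one argument to 0..255.
// A trailing '%' means a percentage; otherwise the number is divided by
// 'fullScale' (255 for color channels, 1 for alpha).
inline std::uint8_t toChannel(const std::string& token, double fullScale)
{
    char* stop = nullptr;
    const double v = std::strtod(token.c_str(), &stop);
    while (*stop == ' ')
        ++stop;

    double unit = (*stop == '%') ? (v / 100.0) : (v / fullScale);
    unit = std::min(1.0, std::max(0.0, unit));   // clamp to 0..1

    return static_cast<std::uint8_t>(std::lround(unit * 255.0));
}

// Parse "rgb(r,g,b)" or "rgba(r,g,b,a)"; returns false for anything else
inline bool parseRgbFunction(const std::string& text, RgbaSketch& out)
{
    const std::size_t open = text.find('(');
    const std::size_t close = text.rfind(')');
    if (open == std::string::npos || close == std::string::npos || close < open)
        return false;

    const std::string fn = text.substr(0, open);
    if (fn != "rgb" && fn != "rgba")
        return false;

    // Split the argument list on commas
    const std::string args = text.substr(open + 1, close - open - 1);
    std::vector<std::string> parts;
    std::size_t pos = 0;
    while (true)
    {
        const std::size_t comma = args.find(',', pos);
        if (comma == std::string::npos)
        {
            parts.push_back(args.substr(pos));
            break;
        }
        parts.push_back(args.substr(pos, comma - pos));
        pos = comma + 1;
    }

    if (parts.size() != 3 && parts.size() != 4)
        return false;

    out.r = toChannel(parts[0], 255.0);
    out.g = toChannel(parts[1], 255.0);
    out.b = toChannel(parts[2], 255.0);
    out.a = 255;
    if (parts.size() == 4)
        out.a = toChannel(parts[3], 1.0);

    return true;
}
```

Even this narrow slice needs two scales (0 to 255 for the color channels, 0 to 1 for alpha) plus percentages, which hints at why the paint code ends up being one of the largest parts of the library.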

We round out our basic object structures by looking at how shapes are represented.

	//
	// SVGVisualNode
	// This is any object that will change the state of the rendering context
	// that's everything from paint that needs to be applied, to geometries
	// that need to be drawn, to line widths, text alignment, and the like.
	// Most things, other than basic attribute type, will be a sub-class of this
	struct SVGVisualNode : public SVGObject
	{

		std::string fId{};      // The id of the element
		std::map<std::string, std::shared_ptr<SVGVisualProperty>> fVisualProperties{};

		SVGVisualNode() = default;
		SVGVisualNode(IMapSVGNodes* root)
			: SVGObject(root)
		{
			setRoot(root);
		}
		SVGVisualNode(const SVGVisualNode& other) :SVGObject(other)
		{
			fId = other.fId;
			fVisualProperties = other.fVisualProperties;
		}


		SVGVisualNode & operator=(const SVGVisualNode& rhs)
		{
			// copy the base class portion as well
			SVGObject::operator=(rhs);
			fId = rhs.fId;
			fVisualProperties = rhs.fVisualProperties;
			
			return *this;
		}
		
		const std::string& id() const { return fId; }
		void setId(const std::string& id) { fId = id; }
		
		void loadVisualProperties(const XmlElement& elem)
		{
			// Run Through the property creation routines, generating
			// properties for the ones we find in the XmlElement
			for (auto& propconv : gSVGPropertyCreation)
			{
				// get the named attribute
				auto attrName = propconv.first;

				// We have a property and value, convert to SVGVisualProperty
				// and add it to our map of visual properties
				auto prop = propconv.second(root(), attrName, elem);
				if (prop != nullptr && prop->isSet())
					fVisualProperties[attrName] = prop;

			}
		}

		void setCommonVisualProperties(const XmlElement &elem)
		{
			// load the common stuff that doesn't require
			// any additional processing
			loadVisualProperties(elem);

			// Handle the style attribute separately by turning
			// it into a standalone XmlElement, and then loading
			// that like a normal element, by running through the properties again
			// It's ok if there were already styles in separate attributes of the
			// original elem, because anything in the 'style' attribute is supposed
			// to override whatever was there.
			auto styleChunk = elem.getAttribute("style");

			if (styleChunk) {
				// Create an XML Element to hang the style properties on as attributes
				XmlElement styleElement{};

				// use CSSInlineIterator to iterate through the key value pairs
				// creating a visual attribute, using the gSVGPropertyCreation map
				CSSInlineStyleIterator iter(styleChunk);

				while (iter.next())
				{
					std::string name = std::string((*iter).first.fStart, (*iter).first.fEnd);
					if (!name.empty() && (*iter).second)
					{
						styleElement.addAttribute(name, (*iter).second);
					}
				}

				loadVisualProperties(styleElement);
			}

			// Deal with any more attributes that need special handling
		}

		void loadSelfFromXml(const XmlElement& elem) override
		{
			SVGObject::loadSelfFromXml(elem);
			
			auto id = elem.getAttribute("id");
			if (id)
				setId(std::string(id.fStart, id.fEnd));

			
			setCommonVisualProperties(elem);
		}
		
		// Contains styling attributes
		void applyAttributes(IRender& ctx)
		{
			for (auto& prop : fVisualProperties) {
				prop.second->draw(ctx);
			}
		}
		
		virtual void drawSelf(IRender& ctx)
		{
			;

		}
		
		void draw(IRender& ctx) override
		{
			ctx.save();
			
			applyAttributes(ctx);

			drawSelf(ctx);

			ctx.restore();
		}
	};

We are building up nodes in a tree structure. The SVGVisualNode is essentially the primary node of that construction. At the end of all the tree construction, we want to end up with a root node where we can just call ‘draw(context)’, and have it render itself into the context. That node needs to deal with the Cascading Styles, children drawing in the proper order (painter’s algorithm), deal with all the attributes, and context state.
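
One piece of that style handling is the ‘style’ attribute, which setCommonVisualProperties() breaks into name/value pairs before running them through the same property loading. Here is a simplified sketch of that splitting step. ‘trim()’ and ‘parseInlineStyle()’ are hypothetical names, this version allocates strings freely where the real CSSInlineStyleIterator works on spans, and it does not cope with quoted strings or ‘url(...)’ values that contain a ‘;’ or ‘:’.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Strip leading and trailing whitespace
inline std::string trim(const std::string& s)
{
    const char* ws = " \t\r\n";
    const std::size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos)
        return std::string();
    const std::size_t e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Split "name: value; name: value" into (name, value) pairs
inline std::vector<std::pair<std::string, std::string>> parseInlineStyle(const std::string& style)
{
    std::vector<std::pair<std::string, std::string>> result;

    std::size_t pos = 0;
    while (pos < style.size())
    {
        // Each declaration runs up to the next ';' (or the end of the text)
        std::size_t semi = style.find(';', pos);
        if (semi == std::string::npos)
            semi = style.size();

        const std::string decl = style.substr(pos, semi - pos);
        const std::size_t colon = decl.find(':');
        if (colon != std::string::npos)
        {
            const std::string name = trim(decl.substr(0, colon));
            const std::string value = trim(decl.substr(colon + 1));

            // Skip anything missing a name or a value
            if (!name.empty() && !value.empty())
                result.emplace_back(name, value);
        }

        pos = semi + 1;
    }

    return result;
}
```

Once the pairs are in hand, they can be treated exactly like ordinary attributes, which is why the real code hangs them on a temporary XmlElement and calls loadVisualProperties() a second time.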

Of particular note, right there at the end is the ‘draw()’ method. It starts with ‘ctx.save()’ and finishes with ‘ctx.restore()’. This is critical to maintaining the design constraint of ‘attributes are applied locally in the tree’. So, we save the state of the context coming in, make whatever changes we, or our children will make, then restore the state upon exit. This is the essential operation required to maintain proper application of drawing attributes. Luckily, or rather by design, the blend2d library makes saving and restoring state very fast and efficient. If the base library did not have this facility, it would be up to our code to maintain this state.
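
If the base library did not provide it, the bookkeeping we’d have to do ourselves might look something like this sketch: a stack of state snapshots, pushed on save and popped on restore. ‘StateSketch’ and ‘StackContextSketch’ are made up for illustration and say nothing about how blend2d implements it internally.

```cpp
#include <string>
#include <vector>

// The slice of drawing state we want to protect
struct StateSketch
{
    std::string stroke{ "none" };
    double strokeWidth{ 1.0 };
};

class StackContextSketch
{
public:
    // Remember the current state so it can be brought back later
    void save() { fStack.push_back(fCurrent); }

    // Bring back the most recently saved state, if there is one
    void restore()
    {
        if (fStack.empty())
            return;
        fCurrent = fStack.back();
        fStack.pop_back();
    }

    StateSketch& state() { return fCurrent; }

private:
    StateSketch fCurrent{};
    std::vector<StateSketch> fStack{};
};
```

A node’s draw() then brackets its work: save, change the stroke to green for this one element, draw, restore, and the next sibling sees the group’s red again.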

Another note here is ‘applyAttributes’. This is what allows things such as the ‘<g>’ element to apply attributes at a high level in the tree, and subsequent elements don’t have to worry about it. They can just apply the attributes that they alter. And where do those common attributes come from?

	static std::map<std::string, std::function<std::shared_ptr<SVGVisualProperty> (IMapSVGNodes *root, const std::string& , const XmlElement& )>> gSVGPropertyCreation = {
		{"fill", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem) {return SVGPaint::createFromXml(root, "fill", elem ); } }
		,{"fill-rule", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem) {return SVGFillRule::createFromXml(root, "fill-rule", elem); } }
		,{"font-size", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem) {return SVGFontSize::createFromXml(root, "font-size", elem); } }
		,{"opacity", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem) {return SVGOpacity::createFromXml(root, "opacity", elem); } }
		,{"stroke", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem ) {return SVGPaint::createFromXml(root, "stroke", elem); } }
		,{"stroke-linejoin", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem) {return SVGStrokeLineJoin::createFromXml(root, "stroke-linejoin", elem); } }
		,{"stroke-linecap", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem ) {return SVGStrokeLineCap::createFromXml(root, "stroke-linecap", elem); } }
		,{"stroke-miterlimit", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem ) {return SVGStrokeMiterLimit::createFromXml(root, "stroke-miterlimit", elem); } }
		,{"stroke-width", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem ) {return SVGStrokeWidth::createFromXml(root, "stroke-width", elem); } }
		,{"text-anchor", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem) {return SVGTextAnchor::createFromXml(root, "text-anchor", elem); } }
		,{"transform", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem) {return SVGTransform::createFromXml(root, "transform", elem); } }
};

A nice dispatch table of the most common of attributes. The ‘loadVisualProperties()’ method uses this dispatch table to load the common display properties. Individual geometry objects can load their own specific properties after this, but these are the common ones, so this is very convenient. This table can and should be expanded as even more properties can be supported.
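I have not shown ‘loadVisualProperties()’ itself, so here is a rough sketch of how a loop over such a table might look. Everything in it is a stand-in: MockProperty, AttrMap, and loadVisualPropertiesSketch() are hypothetical names, and attributes are plain strings rather than ByteSpans.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-ins; the real code uses SVGVisualProperty and XmlElement.
struct MockProperty { std::string kind; std::string raw; };
using AttrMap = std::map<std::string, std::string>;
using Factory = std::function<std::shared_ptr<MockProperty>(const std::string& value)>;

static std::shared_ptr<MockProperty> makeProp(const char* kind, const std::string& raw) {
    return std::make_shared<MockProperty>(MockProperty{kind, raw});
}

// Attribute name -> factory, in the same spirit as gSVGPropertyCreation.
static const std::map<std::string, Factory> gFactories = {
    {"fill",         [](const std::string& v) { return makeProp("paint", v); }},
    {"stroke",       [](const std::string& v) { return makeProp("paint", v); }},
    {"stroke-width", [](const std::string& v) { return makeProp("stroke-width", v); }},
};

// For each attribute on the element, create a property if we know how.
// Attributes without a factory (cx, cy, r, ...) are left for the shape itself.
inline std::map<std::string, std::shared_ptr<MockProperty>>
loadVisualPropertiesSketch(const AttrMap& attrs) {
    std::map<std::string, std::shared_ptr<MockProperty>> props;
    for (const auto& attr : attrs) {
        auto it = gFactories.find(attr.first);
        if (it != gFactories.end())
            props[attr.first] = it->second(attr.second);
    }
    return props;
}
```

Supporting a new property is then a one line addition to the table, with no changes to the loop.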

Finally, let’s get to the meat of the geometry representation. This can be found in the svgshapes.h file.

	struct SVGPathBasedShape : public SVGShape
	{
		BLPath fPath{};
		
		SVGPathBasedShape() :SVGShape() {}
		SVGPathBasedShape(IMapSVGNodes* iMap) :SVGShape(iMap) {}
		
		
		void drawSelf(IRender &ctx) override
		{
			ctx.fillPath(fPath);
			ctx.strokePath(fPath);
		}
	};

Ignoring the SVGShape object (a small shim atop SVGObject), we have a BLPath, and a drawSelf(). What could be simpler? The general premise is that all geometry can be represented as a BLPath at the end of the day. Everything from single lines, to polygons, to complex paths, they all boil down to a BLPath. Making this object hugely simplifies the drawing task. All subsequent geometry classes just need to convert themselves into BLPath, which we’ll see is very easy.

Here is the SVGLine, as it’s fairly simple, and representative of the rest of the geometries.

	struct SVGLine : public SVGPathBasedShape
	{
		BLLine fGeometry{};
		
		SVGLine() :SVGPathBasedShape(){ reset(0, 0, 0, 0); }
		SVGLine(IMapSVGNodes* iMap) :SVGPathBasedShape(iMap) {}
		
		
		void loadSelfFromXml(const XmlElement& elem) override 
		{
			SVGPathBasedShape::loadSelfFromXml(elem);
			
			fGeometry.x0 = parseDimension(elem.getAttribute("x1")).calculatePixels();
			fGeometry.y0 = parseDimension(elem.getAttribute("y1")).calculatePixels();
			fGeometry.x1 = parseDimension(elem.getAttribute("x2")).calculatePixels();
			fGeometry.y1 = parseDimension(elem.getAttribute("y2")).calculatePixels();

			fPath.addLine(fGeometry);
		}

		static std::shared_ptr<SVGLine> createFromXml(IMapSVGNodes *iMap, const XmlElement& elem)
		{
			auto shape = std::make_shared<SVGLine>(iMap);
			shape->loadFromXmlElement(elem);

			return shape;
		}

		
	};

It’s fairly boilerplate. Just have to get the right attributes turned into values for the BLLine geometry, and add that to our path. That’s it. The rect, circle, ellipse, polyline, polygon, and path objects, all do pretty much the same thing, in as small a space. These are much simpler than having to deal with the ‘stroke’ or ‘fill’ attributes. There is some trickery here in terms of parsing the actual coordinate values, because they can be represented in different kinds of units, but the SVGDimension object deals with all those details.
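To give a flavor of the unit handling, here is a small sketch of turning a dimension string into pixels. This is not the SVGDimension code: toPixelsSketch() is a hypothetical helper, it assumes 96 pixels per inch, it only handles absolute units, and it leans on strtod purely for brevity.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper, not the article's SVGDimension.
// Assumes 96 pixels per inch and handles absolute units only;
// relative units (em, ex, %) need context this sketch does not have.
inline double toPixelsSketch(const std::string& text, double dpi = 96.0) {
    char* end = nullptr;
    const double value = std::strtod(text.c_str(), &end);
    const std::string unit(end);              // whatever follows the number

    if (unit.empty() || unit == "px") return value;
    if (unit == "in") return value * dpi;
    if (unit == "cm") return value * dpi / 2.54;
    if (unit == "mm") return value * dpi / 25.4;
    if (unit == "pt") return value * dpi / 72.0;
    if (unit == "pc") return value * dpi / 6.0;
    return value;                             // unknown unit: treat as user units
}
```

With these assumptions, "1in", "72pt", and "25.4mm" all come out at roughly 96 pixels, while a bare "10" stays 10.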

That’s about enough code for this time around. We’ve looked at attributes, and VisualNodes, and we know that we need cascading styles, painter’s algorithm drawing order, and an ability to draw into a context. Now we have all the pieces we need to complete the final rendering task.

Next time around, I’ll wrap it up by bringing in the SVG ‘parser’, which will combine the XML scanning with our document tree, and render final images.


SVG From the Ground Up – Along the right path

<svg xmlns="http://www.w3.org/2000/svg" width="22" height="22" viewBox="0 0 22 22">
	<path d="M20.658,9.26l0.006-0.007l-9-8L11.658,1.26C11.481,1.103,11.255,1,11,1 
	c-0.255,0-0.481,0.103-0.658,0.26l-0.006-0.007l-9,8L1.342,9.26
	C1.136,9.443,1,9.703,1,10c0,0.552,0.448,1,1,1 c0.255,0,0.481-0.103,0.658-0.26l0.006,0.007
	L3,10.449V20c0,0.552,0.448,1,1,1h5v-8h4v8h5c0.552,0,1-0.448,1-1v-9.551l0.336,0.298 
	l0.006-0.007C19.519,10.897,19.745,11,20,11c0.552,0,1-0.448,1-1C21,9.703,20.864,9.443,20.658,9.26z 
	M7,16H5v-3h2V16z M17,16h-2 v-3h2V16z"/>
</svg>

In the last installment (It’s XML, How hard could it be?), we got as far as being able to scan XML, and generate a bunch of XmlElement objects. That’s a great first couple of steps, and now the really interesting parts begin. But, first, before we get knee deep into the seriousness of the rest of SVG, we need to deal with the graphics subsystem. It’s one thing to ‘parse’ SVG, and even build up a Document Object Model (DOM). It’s quite another to actually do the rendering of the same. To do both, in a compact form, with speed and agility, that’s what we’re after.

This time around I’m going to introduce blend2d, which is the graphics library that I’m using to do all my drawing. I stumbled across blend2d a few years ago, and I don’t even remember how I found it. There are a couple of key aspects to it that are of note. One is that the API is really simple to use, and the library is easy to build. The other part, is more esoteric, but perfect for our needs here. The library was built around support for SVG. So, it has all the functions we need to build the typical kinds of graphics that we’re concerned with. I won’t go into excruciating detail about the blend2d API here, as you can visit the project on github, but I will take a look at the BLPath object, because this is the true workhorse of most SVG graphics.

The little house graphic above is typical of the kinds of little vector based icons you find all over the place. In your apps as icons, on Etsy as laser cuttable images, etc. Besides the opening ‘<svg…’, you see the ‘<path…’. SVG images are comprised of various geometry elements such as rect, circle, polyline, polygon, and path. If you really want to get into the nitty gritty details, you can check out the full SVG Specification.

The path geometry is used to describe a series of movements a pen might make on a plotter. MoveTo, LineTo, CurveTo, etc. There are a total of 20 commands you can use to build up a path, and they can be used in almost any combination to create as complex a figure as you want.

    // Shaper contour Commands
    // Origin from SVG path commands
    // M - move       (M, m)
    // L - line       (L, l, H, h, V, v)
    // C - cubic      (C, c, S, s)
    // Q - quad       (Q, q, T, t)
    // A - ellipticArc  (A, a)
    // Z - close        (Z, z)
    enum class SegmentCommand : uint8_t
    {
        INVALID = 0
        , MoveTo = 'M'
        , MoveBy = 'm'
        , LineTo = 'L'
        , LineBy = 'l'
        , HLineTo = 'H'
        , HLineBy = 'h'
        , VLineTo = 'V'
        , VLineBy = 'v'
        , CubicTo = 'C'
        , CubicBy = 'c'
        , SCubicTo = 'S'
        , SCubicBy = 's'
        , QuadTo = 'Q'
        , QuadBy = 'q'
        , SQuadTo = 'T'
        , SQuadBy = 't'
        , ArcTo = 'A'
        , ArcBy = 'a'
        , CloseTo = 'Z'
        , CloseBy = 'z'
    };

A single path has a ‘d’ attribute, which contains a series of these commands strung together. It’s a very compact description for geometry. A single path can be used to generate something quite complex.

With the exception of the blue text, that entire image is generated with a single path element. Quite complex indeed.

Being able to parse the ‘d’ attribute of the path element is super critical to our success in ultimately rendering SVG. There are a few design goals we have in doing this.

  • Be as fast as possible
  • Be as memory efficient as possible
  • Do not introduce intermediate forms if possible

No big deal right? Well, as luck would have it, or rather by design, the blend2d library has an object, BLPath, which was designed for exactly this task. You can check out the API documentation if you want to look at the details, but it essentially has all those ‘moveTo’, ‘lineTo’, etc, and a whole lot more. It only implements the ‘to’ forms, and not the ‘by’ forms, but it’s easy to get the last vertex and implement the ‘by’ forms ourselves, which we’ll do.

So, our implementation strategy will be to read a command, and read enough numbers to make a call to a BLPath object to actually create the geometry. The entirety of the code is roughly 500 lines, and most of it is boilerplate, so I won’t bother listing it all here, but you can check it out online in the parseblpath.h file.

Let’s look at a little snippet of our house path, and see what it’s doing.

M20.658,9.26l0.006-0.007l-9-8

It is hard to see in this way, so let me write it another way.

M 20.658, 9.26
l 0.006, -0.007
l -9, -8

Said as a series of instructions (and it’s hard to tell between ‘one’ and ‘el’), it would be:

Move to 20.658, 9.26
Line by 0.006, -0.007
Line by -9, -8

If I were to do it as code in blend2d, it would be

BLPath path{};
BLPoint lastPt{};

path.moveTo(20.658, 9.26);
path.getLastVertex(&lastPt);

path.lineTo(lastPt.x + 0.006, lastPt.y+ -0.007);
path.getLastVertex(&lastPt);

path.lineTo(lastPt.x + -9, lastPt.y + -8);

So, our coding task is to get from that cryptic ‘d’ attribute to the code connecting to the BLPath object. Let’s get started.

The first thing we’re going to need is a main routine that drives the process.

		static bool parsePath(const ByteSpan& inSpan, BLPath& apath)
		{
			// Use a ByteSpan as a cursor on the input
			ByteSpan s = inSpan;
			SegmentCommand currentCommand = SegmentCommand::INVALID;
			int iteration = 0;

			while (s)
			{
				// ignore leading whitespace
				s = chunk_ltrim(s, whitespaceChars);

				// If we've gotten to the end, we're done
				// so just return
				if (!s)
					break;

				if (commandChars[*s])
				{
					// we have a command
					currentCommand = SegmentCommand(*s);
					
					iteration = 0;
					s++;
				}

				// Use parseMap to dispatch to the appropriate
				// parse function
				if (!parseMap[currentCommand](s, apath, iteration))
					return false;


			}

			return true;
		}

Takes a ByteSpan and a reference to a BLPath object, and returns ‘true’ if successful, ‘false’ otherwise. There are design choices to be made at every step of course. Why did I pass in a reference to a BLPath, instead of just constructing one inside the routine, and handing it back? Well, because this way, I allow something else to decide where the memory is allocated. This way also allows you to build upon an existing path if you want.

Second choice is, why a const ByteSpan? That’s a harder one. This allows a greater number of choices in terms of where the ByteSpan is coming from, such as you might have been passed a const span to begin with. But mainly it’s a contract that says “this routine will not alter the span”.

OK, so following along, we make a copy of the input span, which does NOT copy any of the underlying data, just sets up a couple of pointers. Then we use this ‘s’ span to do our movement. The main ‘while’ starts with a ‘trim’. XML, and thus SVG, are full of optional whitespace. I can say that for almost every routine, the first thing you want to do is eliminate whitespace. The ‘chunk_ltrim()’ function is very short and efficient, so liberal usage of that is a good thing.

Now we’re sitting at the ‘M’, so we first check to see if it’s one of our command characters. If it is, then we use it as our current command, and advance our pointer. The ‘iteration = 0’ is only useful for the Move commands, but we need that, as we’ll soon see.

Last, we have that cryptic function call thing

				if (!parseMap[currentCommand](s, apath, iteration))
					return false;

All set! Easy peasy, our task is done here…

That last little bit of function call trickery is using a dispatch table to make a call to a function. So let’s look at the dispatch table.

		// A dispatch std::map that matches the command character to the
		// appropriate parse function
		static std::map<SegmentCommand, std::function<bool(ByteSpan&, BLPath&, int&)>> parseMap = {
			{SegmentCommand::MoveTo, parseMoveTo},
			{SegmentCommand::MoveBy, parseMoveBy},
			{SegmentCommand::LineTo, parseLineTo},
			{SegmentCommand::LineBy, parseLineBy},
			{SegmentCommand::HLineTo, parseHLineTo},
			{SegmentCommand::HLineBy, parseHLineBy},
			{SegmentCommand::VLineTo, parseVLineTo},
			{SegmentCommand::VLineBy, parseVLineBy},
			{SegmentCommand::CubicTo, parseCubicTo},
			{SegmentCommand::CubicBy, parseCubicBy},
			{SegmentCommand::SCubicTo, parseSmoothCubicTo},
			{SegmentCommand::SCubicBy, parseSmoothCubyBy},
			{SegmentCommand::QuadTo, parseQuadTo},
			{SegmentCommand::QuadBy, parseQuadBy},
			{SegmentCommand::SQuadTo, parseSmoothQuadTo},
			{SegmentCommand::SQuadBy, parseSmoothQuadBy},
			{SegmentCommand::ArcTo, parseArcTo},
			{SegmentCommand::ArcBy, parseArcBy},
			{SegmentCommand::CloseTo, parseClose},
			{SegmentCommand::CloseBy, parseClose}
		};

Dispatch tables are the modern day C++ equivalent of the giant switch statement typically found in such programs. I actually started with the giant switch statement, then said to myself, “why don’t I just use a dispatch table”. They are functionally equivalent. In this case, we have a std::map, which uses a single SegmentCommand as the key. Each element is tied to a function, that takes the same set of parameters, namely a ByteSpan, a BLPath, and an int. As you can see, there is a function for each of our 20 commands.
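One difference from a switch is worth a small sketch: what happens when the key is not in the table. A switch falls through to ‘default’, whereas operator[] on a non-const std::map quietly inserts an empty std::function, and calling that throws std::bad_function_call. The miniature below uses hypothetical names (gHandlers, dispatchSketch) and handlers that only write to a log, and looks the command up with find() instead.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical miniature of the command dispatch. The handlers just record
// which command ran instead of touching a path.
using Handler = std::function<bool(std::string& log)>;

static const std::map<char, Handler> gHandlers = {
    {'M', [](std::string& log) { log += "moveTo;"; return true; }},
    {'L', [](std::string& log) { log += "lineTo;"; return true; }},
    {'Z', [](std::string& log) { log += "close;";  return true; }},
};

// Look the command up with find() so an unknown key is reported as a
// failure, which plays the role of the 'default' case in a switch.
inline bool dispatchSketch(char cmd, std::string& log) {
    auto it = gHandlers.find(cmd);
    if (it == gHandlers.end())
        return false;
    return it->second(log);
}
```

Whether that guard is needed depends on the caller. If something else already guarantees that only known commands reach the table, a plain lookup is enough.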

I won’t go into every single one of those 20 commands, but looking at a couple will be instructive. Let’s start with the MoveTo

		static bool parseMoveTo(ByteSpan& s, BLPath& apath, int& iteration)
		{
			double x{ 0 };
			double y{ 0 };

			if (!parseNextNumber(s, x))
				return false;
			if (!parseNextNumber(s, y))
				return false;

			if (iteration == 0)
				apath.moveTo(x, y);
			else
				apath.lineTo(x, y);

			iteration++;

			return true;
		}

This has a few objectives.

  • Parse a couple of numbers
  • Call the appropriate function on the BLPath object
  • Increment the ‘iteration’ parameter
  • Advance the pointer, indicating how much we’ve consumed
  • Return false on failure, true on success

This pattern is repeated for each of the other 19 functions. One thing to know about all the commands, and why the main loop is structured the way it is: you can have multiple sets of numbers after the initial set. In the case of MoveTo, the following is a valid input stream.

M 0,0 20,20 30,30 40,40

The way you treat it, in the case of MoveTo, is the initial numbers set an origin (0,0), all subsequent number pairs are implied LineTo commands. That’s why we need to know the iteration. If the iteration is ‘0’, then we need to call moveTo on the BLPath object. If the iteration is greater than 0, then we need to call lineTo on the BLPath. All the commands behave in a similar fashion, except they don’t change based on the iteration number.
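Here is that iteration rule in isolation, as a sketch that only understands a leading ‘M’ followed by number pairs. Segment and parseMoveSketch() are hypothetical names, the output is a list of records rather than a BLPath, and the numbers are read with strtod just to keep the example short.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical record of what would have been called on the path.
struct Segment { char op; double x; double y; };

// Parses only "M x,y x,y ..." to show the iteration rule: the first pair
// is a moveTo ('M'), every following pair is an implied lineTo ('L').
inline std::vector<Segment> parseMoveSketch(const std::string& d) {
    std::vector<Segment> out;
    const char* p = d.c_str();
    auto skipSeparators = [&p]() {
        while (*p == ' ' || *p == ',' || *p == '\t' || *p == '\n' || *p == '\r')
            ++p;
    };

    skipSeparators();
    if (*p != 'M')
        return out;
    ++p;

    int iteration = 0;
    for (;;) {
        char* end = nullptr;

        skipSeparators();
        const double x = std::strtod(p, &end);
        if (end == p) break;              // no more numbers
        p = end;

        skipSeparators();
        const double y = std::strtod(p, &end);
        if (end == p) break;              // dangling x without a y
        p = end;

        out.push_back({iteration == 0 ? 'M' : 'L', x, y});
        ++iteration;
    }
    return out;
}
```

Feeding it the stream above gives one ‘M’ record for 0,0 followed by three ‘L’ records.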

Well gee whiz, that seems pretty simple and straightforward. Don’t know what all the fuss is about. Hidden within the parseMoveTo() is parseNextNumber(), so let’s take a look at that as this is where all the bugs can be found.

// Consume the next number off the front of the chunk
// modifying the input chunk to advance past the  number
// we removed.
// Return true if we found a number, false otherwise
		static inline bool parseNextNumber(ByteSpan& s, double& outNumber)
		{
			static charset whitespaceChars(",\t\n\f\r ");          // whitespace found in paths

			// clear up leading whitespace, including ','
			s = chunk_ltrim(s, whitespaceChars);

			ByteSpan numChunk{};
			s = scanNumber(s, numChunk);

			if (!numChunk)
				return false;

			outNumber = chunk_to_double(numChunk);

			return true;
		}

The comment gives you the flavor of it. Again, we start with trimming ‘whitespace’, before doing anything. This is very important. In the case of these numbers, ‘whitespace’ not only includes the typical SPACE (0x20), TAB, etc, but also the COMMA (‘,’) character. “M20,20” and “M20 20” and “M 20 20” and “M 20, 20” and even “M,20,20” are all treated as equivalent. So, if you’re going to be parsing numbers in a sequence, you’re going to have to deal with all those cases. The easiest thing to do is trim whitespace before you start. I will point out the convenience of the charset construction. Super easy.
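For anyone who skipped the earlier installment, a charset can be as simple as one bit per byte value. The sketch below is a hypothetical miniature (MiniCharset and ltrimSketch() are my names for this example, not the library’s), and it returns a new std::string, whereas the real chunk_ltrim() just moves a pointer within the span.

```cpp
#include <bitset>
#include <cstddef>
#include <string>

// Hypothetical miniature of a charset: one bit per possible byte value.
struct MiniCharset {
    std::bitset<256> bits;

    explicit MiniCharset(const char* chars) {
        for (; *chars; ++chars)
            bits.set(static_cast<unsigned char>(*chars));
    }

    bool operator[](char c) const {
        return bits[static_cast<unsigned char>(c)];
    }
};

// Drop leading characters that belong to the set.
// Copies for simplicity; a span-based version would only advance a pointer.
inline std::string ltrimSketch(const std::string& s, const MiniCharset& skip) {
    std::size_t i = 0;
    while (i < s.size() && skip[s[i]])
        ++i;
    return s.substr(i);
}
```

With the comma included in the set, " , 20,20" and "20,20" trim to the same thing, which is why all those spellings of a MoveTo end up parsing identically.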

We trim the whitespace off the front, then call ‘scanNumber()’. That’s another workhorse routine, which is worth looking into, but I won’t put the code here. You can find it in the bspanutil.h file. I will put the comment associated with the code here though, as it’s informative.

// Parse a number which may have units after it
//   1.2em
// -1.0E2em
// 2.34ex
// -2.34e3M10,20
// 
// By the end of this routine, the numchunk represents the range of the 
// captured number.
// 
// The returned chunk represents what comes next, and can be used
// to continue scanning the original inChunk
//
// Note:  We assume here that the inChunk is already positioned at the start
// of a number (including +/- sign), with no leading whitespace

This is probably the most singularly important routine in the whole library. It has the big task of figuring out numbers from a stream of characters. Those numbers, as you can see from the examples, come in many different forms, and things can get confusing. Here’s another example of a sequence of characters it needs to be able to figure out: “M-1.7-82L.92 27”. You save yourself a ton of time, headache, and heartburn by getting this right.
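Since the real routine is not listed here, the following is only a sketch of the idea, written against std::string and an index rather than a ByteSpan. scanNumberSketch() is a hypothetical name, and the grammar it accepts (optional sign, digits, optional fraction, optional exponent) is my simplification.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical scanner, not the article's scanNumber(). Starting at 'pos',
// it returns the text of one number and advances 'pos' past it.
// On failure it returns an empty string and leaves 'pos' unchanged.
// Grammar assumed: [+-] digits [. digits] [(e|E) [+-] digits]
inline std::string scanNumberSketch(const std::string& s, std::size_t& pos) {
    const std::size_t start = pos;
    std::size_t i = pos;
    auto isDigit = [&s](std::size_t k) {
        return k < s.size() && std::isdigit(static_cast<unsigned char>(s[k])) != 0;
    };

    if (i < s.size() && (s[i] == '+' || s[i] == '-'))
        ++i;

    std::size_t digitCount = 0;
    while (isDigit(i)) { ++i; ++digitCount; }
    if (i < s.size() && s[i] == '.') {
        ++i;
        while (isDigit(i)) { ++i; ++digitCount; }
    }
    if (digitCount == 0)
        return {};                        // a lone sign or '.' is not a number

    // An exponent only counts if at least one digit follows it, so the 'e'
    // in "2em" or "2.34ex" is left alone as the start of a unit.
    if (i < s.size() && (s[i] == 'e' || s[i] == 'E')) {
        std::size_t j = i + 1;
        if (j < s.size() && (s[j] == '+' || s[j] == '-'))
            ++j;
        if (isDigit(j)) {
            while (isDigit(j)) ++j;
            i = j;
        }
    }

    pos = i;
    return s.substr(start, i - start);
}
```

Run over “-1.7-82L.92 27” it yields “-1.7”, then “-82”, then refuses at the ‘L’. Restarted after the ‘L’ it yields “.92”, and after the space, “27”. The sign doubles as a separator, which is exactly what makes these streams so compact and so easy to get wrong.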

The next choice you make is how to convert from the number that we scanned (it’s still just a stream of ASCII characters) into an actual ‘double’. This is the point where most programmers might throw up their hands and reach for their trusty ‘strtod’ or ye olde ‘atof’, or even ‘sscanf’. There’s a whole science to this, just know that strtod() is not your friend, and for something you’ll be doing millions of times, it’s worth investigating some alternatives. I highly recommend reading the code for fast_double_parser. If you want to examine what I do, check out the chunk_to_double() routine within the bspanutil.h file.
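To show the general shape of a hand-rolled conversion, here is a naive sketch. It is not chunk_to_double() and it is not how fast_double_parser works: toDoubleSketch() is a hypothetical name, it simply accumulates digits and scales by a power of ten, and it makes no promise of correctly rounded results for every input.

```cpp
#include <cmath>
#include <cstddef>
#include <string>

// Hypothetical converter for text a scanner has already validated.
// Simple digit accumulation; fine for typical SVG coordinates, but not
// guaranteed to be correctly rounded for long or extreme inputs.
inline double toDoubleSketch(const std::string& s) {
    std::size_t i = 0;
    double sign = 1.0;
    if (i < s.size() && (s[i] == '+' || s[i] == '-')) {
        if (s[i] == '-') sign = -1.0;
        ++i;
    }

    double mantissa = 0.0;
    int scale = 0;                        // number of digits after the '.'
    for (; i < s.size() && s[i] >= '0' && s[i] <= '9'; ++i)
        mantissa = mantissa * 10.0 + (s[i] - '0');
    if (i < s.size() && s[i] == '.') {
        for (++i; i < s.size() && s[i] >= '0' && s[i] <= '9'; ++i) {
            mantissa = mantissa * 10.0 + (s[i] - '0');
            ++scale;
        }
    }

    int exponent = 0;
    if (i < s.size() && (s[i] == 'e' || s[i] == 'E')) {
        ++i;
        int expSign = 1;
        if (i < s.size() && (s[i] == '+' || s[i] == '-')) {
            if (s[i] == '-') expSign = -1;
            ++i;
        }
        for (; i < s.size() && s[i] >= '0' && s[i] <= '9'; ++i)
            exponent = exponent * 10 + (s[i] - '0');
        exponent *= expSign;
    }

    return sign * mantissa * std::pow(10.0, exponent - scale);
}
```

The point of the sketch is only that no locale lookup, no null terminator, and no errno handling are involved. The hard part, which the dedicated libraries solve, is getting the last bit right without giving up that speed.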

We’re getting pretty far into the weeds down here, so let’s look at one more function, the LineTo

		static bool parseLineTo(ByteSpan& s, BLPath& apath, int& iteration)
		{
			double x{ 0 };
			double y{ 0 };

			if (!parseNextNumber(s, x))
				return false;
			if (!parseNextNumber(s, y))
				return false;

			apath.lineTo(x, y);

			iteration++;

			return true;
		}

Same as MoveTo, parse a couple of numbers, apply them to the right function on the path object, return true or false. Just do the same thing 18 more times for the other functions, and you’ve got your path ‘parser’.
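The ‘By’ variants differ only in that they offset from the last vertex before calling the absolute form. As a sketch with a stand-in path type (MiniPath and lineBySketch() are hypothetical, and MiniPath only records vertices), the core of a relative line looks something like this:

```cpp
#include <vector>

// Hypothetical stand-in for BLPath: it only records absolute vertices.
struct MiniPath {
    struct Pt { double x; double y; };
    std::vector<Pt> pts;

    void moveTo(double x, double y) { pts.push_back({x, y}); }
    void lineTo(double x, double y) { pts.push_back({x, y}); }

    bool getLastVertex(Pt& out) const {
        if (pts.empty())
            return false;
        out = pts.back();
        return true;
    }
};

// Relative line: offset from the last vertex, then call the absolute form.
inline bool lineBySketch(MiniPath& path, double dx, double dy) {
    MiniPath::Pt last{};
    if (!path.getLastVertex(last))
        return false;                     // nothing to be relative to
    path.lineTo(last.x + dx, last.y + dy);
    return true;
}
```

Applied to the start of the house path (move to 20.658, 9.26, then line by 0.006, -0.007, then line by -9, -8), the final vertex lands at approximately 11.664, 1.253.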

To recap, parsing the ‘d’ attribute is one of the most important parts of any SVG parser. In this case, we want to get from the text to an actual object we can render, as quickly as possible. A BLPath alone is not enough to create great images, we still have a long way to go until we start seeing pretty pictures on the screen. Parsing the path is critical to getting there though. This is where you could waste tons of time and memory, so it’s worth considering the options carefully. In this case, we’ve chosen to represent the path in memory using a data structure that can be a part of a graphic elements tree, as well as being handed to the drawing engine directly, without having to transform it once again before actually drawing.

There you have it. One step closer to our beautiful images.

Next time around, we need to look at what kind of Document Object Model (DOM) we want to construct, and how our SVG parser will construct it.


SVG From the Ground Up – It’s XML, How hard could it be?

Let’s take a look at the SVG (XML) code that generates that image.

<svg height="200" width="680" xmlns="http://www.w3.org/2000/svg">
    <circle cx="70" cy="70" r="50" />
    <circle cx="200" cy="70" r="50" fill="#79C99E" />
    <circle cx="330" cy="70" r="50" fill="#79C99E" stroke-width="10" stroke="#508484" />
    <circle cx="460" cy="70" r="50" fill="#79C99E" stroke-width="10" />
    <circle cx="590" cy="70" r="50" fill="none" stroke-width="10" stroke="#508484" />
</svg>

By the end of this post, we should be able to scan through the components of that, and generate the tokens necessary to begin rendering it as SVG. So, where to start?

Last time around (SVG From the Ground Up – Parsing Fundamentals), I introduced the ByteSpan and charset data structures, as a way to say “these are the only tools you’ll need…”. Well, at least they are certainly the core building blocks. Now we’re going to actually use those components to begin the process of breaking down the XML. XML can be a daunting sprawling beast. Its origins are in an even older document technology known as SGML. The first specification for the language can be found here: Extensible Markup Language (XML) 1.0 (Fifth Edition). When I joined the team at Microsoft in 1998 to work on this under Jean Paoli, one of the original authors, there were probably 30 people across dev, test, and pm. Of course we had people working on the standards body, and I was working on XSLT, and a couple on the parser, someone on DTD schema. It was quite a production. At that time, we had to deal with myriad encodings (utf-8 did not rule the world yet), conformance and compliance test suites, and that XSLT beast (CSS did not rule the world yet). It was a daunting endeavor, and at some point we tried to color everything with XML, much to the chagrin of most other people. But, some things did come out of that era, and SVG is one of them.

Today, our task is not to implement a fully compliant validating parser. That again would take a team of a few, and a ton of testing. What we’re after is something more modest. Something a hobby hacker could throw together in a weekend, but have a fair chance at it being able to consume most of the SVG you’re ever really interested in. To that end, there’s a much smaller, simpler XML spec out there. MicroXML. This describes a subset of XML that leaves out all the really hard parts. While that spec is far more readable, we’ll go even one step simpler. With our parser here, we won’t even be supporting utf-8. That might seem like a tragic simplification, but the reality is, not even that’s needed for most of what we’ll be doing with SVG. So, here’s the list of what we will be doing.

  • Decoding elements
  • Decoding attributes
  • Decoding element content (supporting text nodes)
  • Skipping Doctype
  • Skipping Comments
  • Skipping Processing Instructions
  • Not expanding character entities (although user can)

As you will soon see, “skipping” doesn’t mean you lose access to the data, it just means our SVG parser won’t do anything with it. This is a nice extensibility point. We start simple, and you can add as much complexity as you want over time, without changing the fundamental structure of what we’re about to build.

Now for some types and enums. I won’t put the entirety of the code in here, so if you want to follow along, you can look at the xmlscan.h file. We’ll start with the XML element types.

    enum XML_ELEMENT_TYPE {
        XML_ELEMENT_TYPE_INVALID = 0
		, XML_ELEMENT_TYPE_XMLDECL                  // An XML declaration, like <?xml version="1.0" encoding="UTF-8"?>
        , XML_ELEMENT_TYPE_CONTENT                  // Content, like <foo>bar</foo>, the 'bar' is content
        , XML_ELEMENT_TYPE_SELF_CLOSING             // A self-closing tag, like <foo/>
        , XML_ELEMENT_TYPE_START_TAG                // A start tag, like <foo>
        , XML_ELEMENT_TYPE_END_TAG                  // An end tag, like </foo>
        , XML_ELEMENT_TYPE_COMMENT                  // A comment, like <!-- foo -->
        , XML_ELEMENT_TYPE_PROCESSING_INSTRUCTION   // A processing instruction, like <?foo bar?>
        , XML_ELEMENT_TYPE_CDATA                    // A CDATA section, like <![CDATA[ foo ]]>
        , XML_ELEMENT_TYPE_DOCTYPE                  // A DOCTYPE section, like <!DOCTYPE foo>
    };

This is where we indicate what kinds of pieces of the XML file we will recognize. If something is not in this list, it will either be reported as invalid, or it will simply cause the scanner to stop processing. From the little bit of XML that opened this article, we see “START_TAG”, “SELF_CLOSING”, “END_TAG”. And that’s it!! Simple right?
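As a rough illustration of how those kinds can be told apart, here is a sketch that classifies the text found between a ‘<’ and its matching ‘>’. TagKind and classifySketch() are hypothetical names for this example, and the sketch assumes the caller has already found the right closing ‘>’, which takes more care for comments and CDATA since they can contain ‘>’ themselves.

```cpp
#include <string>

// Hypothetical, trimmed-down version of the element kinds above.
enum class TagKind {
    Invalid, StartTag, EndTag, SelfClosing,
    Comment, ProcessingInstruction, Doctype, CData
};

// 'inner' is the text between '<' and '>', e.g.  circle cx="70" /
inline TagKind classifySketch(const std::string& inner) {
    if (inner.empty())                         return TagKind::Invalid;
    if (inner.compare(0, 3, "!--") == 0)       return TagKind::Comment;
    if (inner.compare(0, 8, "![CDATA[") == 0)  return TagKind::CData;
    if (inner.compare(0, 8, "!DOCTYPE") == 0)  return TagKind::Doctype;
    if (inner[0] == '?')                       return TagKind::ProcessingInstruction;
    if (inner[0] == '/')                       return TagKind::EndTag;
    if (inner.back() == '/')                   return TagKind::SelfClosing;
    return TagKind::StartTag;
}
```

For the circles example, the opening svg tag classifies as a start tag, each circle as self closing, and the final /svg as an end tag.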

OK. Next up are a couple of data structures which are the guts of the XML itself. First is the XmlName. Although we’re not building a super conformant parser, there are some simple realities we need to be able to handle to make our future life easier. XML namespaces are one of those things. In XML, you can have a name with a ‘:’ in it, which puts the name into a namespace. Without too much detail, just know that “circle”, could have been “svg:circle”, or something, and possibly mean the same thing. We need a data structure that will capture this.

struct XmlName {
        ByteSpan fNamespace{};
        ByteSpan fName{};

        XmlName() = default;
        
        XmlName(const ByteSpan& inChunk)
        {
            reset(inChunk);
        }

        XmlName(const XmlName &other):fNamespace(other.fNamespace), fName(other.fName){}
        
        XmlName& operator =(const XmlName& rhs)
        {
            fNamespace = rhs.fNamespace;
            fName = rhs.fName;
            return *this;
        }
        
        XmlName & operator=(const ByteSpan &inChunk)
        {
            reset(inChunk);
            return *this;
        }
        
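		// Implement for std::map, and ordering in general.
		// Order by namespace first, then by name. When one span is a
		// prefix of the other, the shorter one sorts first.
		bool operator < (const XmlName& rhs) const
		{
			size_t maxnsbytes = std::min(fNamespace.size(), rhs.fNamespace.size());
			int nscmp = memcmp(fNamespace.begin(), rhs.fNamespace.begin(), maxnsbytes);
			if (nscmp != 0)
				return nscmp < 0;
			if (fNamespace.size() != rhs.fNamespace.size())
				return fNamespace.size() < rhs.fNamespace.size();

			size_t maxnamebytes = std::min(fName.size(), rhs.fName.size());
			int namecmp = memcmp(fName.begin(), rhs.fName.begin(), maxnamebytes);
			if (namecmp != 0)
				return namecmp < 0;
			return fName.size() < rhs.fName.size();
		}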
        
        // Allows setting the name after it's been created
        XmlName& reset(const ByteSpan& inChunk)
        {
            fName = inChunk;
            fNamespace = chunk_token(fName, charset(':'));
            if (chunk_size(fName)<1)
            {
                fName = fNamespace;
                fNamespace = {};
            }
            return *this;
        }
        
		ByteSpan name() const { return fName; }
		ByteSpan ns() const { return fNamespace; }
	};

Given a ByteSpan, our universal data representation, split it out into the ‘namespace’ and ‘name’ parts, if they exist. Then we can get the name part by calling ‘name()’, and if there was a namespace part, we can get that from ‘ns()’. Why ‘ns’ instead of ‘namespace’? Because ‘namespace’ is a keyword in C++, and we don’t want any confusion or compiler errors.
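Stripped of the ByteSpan machinery, the split itself is tiny. The sketch below uses std::string and a hypothetical splitQNameSketch() function, so unlike XmlName it copies the two pieces instead of pointing into the original buffer.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: returns {namespace, name}.
// The namespace part is empty when there is no ':' in the input.
inline std::pair<std::string, std::string> splitQNameSketch(const std::string& qname) {
    const std::size_t colon = qname.find(':');
    if (colon == std::string::npos)
        return {std::string(), qname};
    return {qname.substr(0, colon), qname.substr(colon + 1)};
}
```

So "svg:circle" splits into "svg" and "circle", while a plain "circle" comes back with an empty namespace.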

One thing to note here is the implementation of the ‘operator <‘. Why is that there? Because if you want to use this as a key in an associative container, such as std::map, you need some comparison operator, and by implementing ‘<‘, you get a quick and dirty comparison operator. This is a future enhancement we’ll use later.
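The one requirement std::map places on that operator is that it be a strict weak ordering, and the usual way to get one for a two part key is to compare the first part, and only look at the second when the first parts are equal. Here is a sketch of that with std::string members and std::tie. QNameSketch and buildIndexSketch() are hypothetical names for this example.

```cpp
#include <map>
#include <string>
#include <tuple>

// Hypothetical two-part name usable as a std::map key.
struct QNameSketch {
    std::string ns;
    std::string name;

    // Lexicographic: namespace first, then name. std::tie builds tuples of
    // references, and tuple comparison gives a strict weak ordering.
    bool operator<(const QNameSketch& rhs) const {
        return std::tie(ns, name) < std::tie(rhs.ns, rhs.name);
    }
};

// Three distinct keys, two of which share the local name "circle".
inline std::map<QNameSketch, int> buildIndexSketch() {
    std::map<QNameSketch, int> index;
    index[QNameSketch{"", "circle"}] = 1;
    index[QNameSketch{"svg", "circle"}] = 2;
    index[QNameSketch{"", "rect"}] = 3;
    return index;
}
```

All three entries survive as separate keys, which is the property to check for whenever a comparison operator is written by hand.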

Next up is the representation of an XML node itself, where we have XmlElement.

    // Representation of an xml element
    // The xml iterator will generate these
    struct XmlElement
    {
    private:
        int fElementKind{ XML_ELEMENT_TYPE_INVALID };
        ByteSpan fData{};

        XmlName fXmlName{};
        std::string fName{};
        std::map<std::string, ByteSpan> fAttributes{};

    public:
        XmlElement() {}
        XmlElement(int kind, const ByteSpan& data, bool autoScanAttr = false)
            :fElementKind(kind)
            , fData(data)
        {
            reset(kind, data, autoScanAttr);
        }

		void reset(int kind, const ByteSpan& data, bool autoScanAttr = false)
		{
            clear();

            fElementKind = kind;
            fData = data;

            if ((fElementKind == XML_ELEMENT_TYPE_START_TAG) ||
                (fElementKind == XML_ELEMENT_TYPE_SELF_CLOSING) ||
                (fElementKind == XML_ELEMENT_TYPE_END_TAG))
            {
                scanTagName();

                if (autoScanAttr) {
                    if (fElementKind != XML_ELEMENT_TYPE_END_TAG)
                        scanAttributes();
                }
            }
		}
        
		// Clear this element to a default state
        void clear() {
			fElementKind = XML_ELEMENT_TYPE_INVALID;
			fData = {};
			fName.clear();
			fAttributes.clear();
		}
        
        // determines whether the element is currently empty
        bool empty() const { return fElementKind == XML_ELEMENT_TYPE_INVALID; }

        explicit operator bool() const { return !empty(); }

        // Returning information about the element
        const std::map<std::string, ByteSpan>& attributes() const { return fAttributes; }
        
        const std::string& name() const { return fName; }
		void setName(const std::string& name) { fName = name; }
        
        int kind() const { return fElementKind; }
		void kind(int kind) { fElementKind = kind; }
        
        const ByteSpan& data() const { return fData; }

		// Convenience for what kind of tag it is
        bool isStart() const { return (fElementKind == XML_ELEMENT_TYPE_START_TAG); }
		bool isSelfClosing() const { return fElementKind == XML_ELEMENT_TYPE_SELF_CLOSING; }
		bool isEnd() const { return fElementKind == XML_ELEMENT_TYPE_END_TAG; }
		bool isComment() const { return fElementKind == XML_ELEMENT_TYPE_COMMENT; }
		bool isProcessingInstruction() const { return fElementKind == XML_ELEMENT_TYPE_PROCESSING_INSTRUCTION; }
        bool isContent() const { return fElementKind == XML_ELEMENT_TYPE_CONTENT; }
		bool isCData() const { return fElementKind == XML_ELEMENT_TYPE_CDATA; }
		bool isDoctype() const { return fElementKind == XML_ELEMENT_TYPE_DOCTYPE; }

        
        void addAttribute(const std::string& name, const ByteSpan& valueChunk)
        {
            fAttributes[name] = valueChunk;
        }

        ByteSpan getAttribute(const std::string &name) const
		{
			auto it = fAttributes.find(name);
			if (it != fAttributes.end())
				return it->second;
			else
                return ByteSpan{};
		}
        
    private:
        //
        // Parse an XML element
        // We should be sitting on the first character of the element tag after the '<'
        // There are several things that need to happen here
        // 1) Scan the element name
        // 2) Scan the attributes, creating key/value pairs
        // 3) Figure out if this is a self closing element

        // 
        // We do NOT scan the content of the element here, that happens
        // outside this routine.  We only deal with what comes up to the closing '>'
        //
        void setTagName(const ByteSpan& inChunk)
        {
            fXmlName.reset(inChunk);
            fName = toString(fXmlName.name());
        }
        
        void scanTagName()
        {
            ByteSpan s = fData;
            bool start = false;
            bool end = false;

            // If the chunk is empty, just return
            if (!s)
                return;

            // Check if the tag is end tag
            if (*s == '/')
            {
                s++;
                end = true;
            }
            else {
                start = true;
            }

            // Get tag name
            ByteSpan tagName = s;
            tagName.fEnd = s.fStart;

            // Stop at whitespace, or at the '/' of a self closing tag such as <br/>
            while (s && !wspChars[*s] && (*s != '/'))
                s++;

            tagName.fEnd = s.fStart;
            setTagName(tagName);


            fData = s;
        }

        public:
        //
        // scanAttributes
        // Scans the fData member looking for attribute key/value pairs
        // It will add to the member fAttributes these pairs, without further processing.
        // This should be called after scanTagName(), because we want to be positioned
        // on the first key/value pair. 
        //
        int scanAttributes()
        {

            int nattr = 0;
            bool start = false;
            bool end = false;
            uint8_t quote{};
            ByteSpan s = fData;


            // Get the attribute key/value pairs for the element
            while (s && !end)
            {
                uint8_t* beginattrValue = nullptr;
                uint8_t* endattrValue = nullptr;


                // Skip white space before the attrib name
                s = chunk_ltrim(s, wspChars);

                if (!s)
                    break;

                if (*s == '/') {
                    end = true;
                    break;
                }

                // Find end of the attrib name.
                //static charset equalChars("=");
                auto attrNameChunk = chunk_token(s, "=");
                attrNameChunk = chunk_trim(attrNameChunk, wspChars);    // trim whitespace on both ends

                std::string attrName = std::string(attrNameChunk.fStart, attrNameChunk.fEnd);

                // Skip stuff past '=' until the beginning of the value.
                while (s && (*s != '\"') && (*s != '\''))
                    s++;

                // If we've reached end of span, bail out
                if (!s)
                    break;

                // capture the quote character
                // Store value and find the end of it.
                quote = *s;

				s++;    // move past the quote character
                beginattrValue = (uint8_t*)s.fStart;    // Mark the beginning of the attribute content

                // Skip until we find the matching closing quote
                while (s && *s != quote)
                    s++;

                // No closing quote was found, so the attribute is not well formed
                if (!s)
                    break;

                endattrValue = (uint8_t*)s.fStart;  // Mark the ending of the attribute content
                s++;

                // Store only well formed attributes
                ByteSpan attrValue = { beginattrValue, endattrValue };

                addAttribute(attrName, attrValue);

                nattr++;
            }

            return nattr;
        }
    };

That’s a bit of a brute, but actually pretty straightforward. We need a data structure that tells us what kind of XML element type we’re dealing with. We need the name, as well as the content of the element, held onto for future processing. We hold onto the content as a ByteSpan, but have provision for making more convenient representations. For example, we turn the name into a std::string. In the future, we can eliminate even this, and just use the XmlName with its chunks directly.

Besides the element name, we also have the ability to split out the attribute key/value pairs, as seen in ‘scanAttributes()’. Let’s take a deeper look at the constructor.

        XmlElement(int kind, const ByteSpan& data, bool autoScanAttr = false)
            :fElementKind(kind)
            , fData(data)
        {
            reset(kind, data, autoScanAttr);
        }

		void reset(int kind, const ByteSpan& data, bool autoScanAttr = false)
		{
            clear();

            fElementKind = kind;
            fData = data;

            if ((fElementKind == XML_ELEMENT_TYPE_START_TAG) ||
                (fElementKind == XML_ELEMENT_TYPE_SELF_CLOSING) ||
                (fElementKind == XML_ELEMENT_TYPE_END_TAG))
            {
                scanTagName();

                if (autoScanAttr) {
                    if (fElementKind != XML_ELEMENT_TYPE_END_TAG)
                        scanAttributes();
                }
            }
		}

The constructor takes a ‘kind’, a ByteSpan, and a flag indicating whether we want to parse out the attributes or not. In ‘reset()’, we see that we hold onto the kind of element, and the ByteSpan. That ByteSpan contains everything between the ‘<’ of the tag to the closing ‘>’, non-inclusive. The first thing we do is scan the tag name, so we can at least hold onto that, leaving the fData representing the rest. This is relatively low impact so far.
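To make that split concrete, here is a minimal sketch of the same idea outside of XmlElement. It uses std::string_view (C++17) as a stand-in for ByteSpan, and the names TagParts and splitTag are made up for this illustration; the real work happens inside scanTagName().

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical result type: the tag name, the unscanned remainder
// (where the attributes live), and whether this was an end tag.
struct TagParts {
    std::string_view name;
    std::string_view rest;
    bool isEnd;
};

inline bool isWsp(char c)
{
    return c == ' ' || c == '\t' || c == '\r' || c == '\n' || c == '\v' || c == '\f';
}

// 'tag' is the text between '<' and '>', non-inclusive, e.g. "circle cx='70'"
inline TagParts splitTag(std::string_view tag)
{
    TagParts parts{ {}, {}, false };

    // A leading '/' marks an end tag
    if (!tag.empty() && tag.front() == '/') {
        parts.isEnd = true;
        tag.remove_prefix(1);
    }

    // The name runs up to the first whitespace, or the '/' of a self closing tag
    std::size_t n = 0;
    while (n < tag.size() && !isWsp(tag[n]) && tag[n] != '/')
        n++;

    parts.name = tag.substr(0, n);
    parts.rest = tag.substr(n);     // attributes, still unparsed

    // Nothing is copied or lost: the two views cover the whole tag
    assert(parts.name.size() + parts.rest.size() == tag.size());
    return parts;
}
```

Calling splitTag("circle cx='70'") gives the name "circle", with " cx='70'" left over for attribute scanning. Both are just views into the original text.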

Why not just do this in the constructor itself, why have a “reset()”? As we’ll see later, we actually reuse XmlElement in some situations while parsing, so we want to be able to set, and reset, the same object multiple times. At least that’s one way of doing things.

Another item of note is whether you scan the attributes or not. If you do scan the attributes, you end up with a map from attribute names to their values, and a way to get the value of individual attributes.

        std::map<std::string, ByteSpan> fAttributes{};

        ByteSpan getAttribute(const std::string &name) const
		{
			auto it = fAttributes.find(name);
			if (it != fAttributes.end())
				return it->second;
			else
                return ByteSpan{};
		}

The ‘getAttribute()’ method is a critical piece when we later start building our SVG model, so it needs to be fast and efficient. Of course, this does not have to be embedded in the core of the XmlElement, you could just as easily construct an attribute list outside of the element, but then you’d have to associate it back to the element anyway, and you end up in the same place. getAttribute() takes a name as a string, and returns the ByteSpan which is the raw, uninterpreted content of that attribute, without the enclosing quote marks. In the future, it would be nice to replace that std::string based name with an XmlName, which will save on some allocations, but we’ll stick with this convenience for now.
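As an aside on those allocations: short of switching to XmlName, one option is to keep the std::string keys but look them up with a std::string_view. The sketch below is only an illustration of that idea, not what svg2b2d does. AttributeTable is a made-up name, std::string_view stands in for ByteSpan, and it assumes C++17 is available.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <string_view>

// Sketch only: an attribute table keyed by std::string, but searchable
// with a std::string_view, so a lookup does not build a temporary string.
// std::less<> is what enables this "heterogeneous" lookup.
struct AttributeTable {
    std::map<std::string, std::string_view, std::less<>> fAttributes;

    // The value is stored as a view; the text it points at must outlive the table
    void add(std::string_view name, std::string_view value)
    {
        assert(!name.empty());      // an attribute must have a name
        fAttributes[std::string(name)] = value;
    }

    // Returns an empty view when the attribute is absent,
    // mirroring getAttribute() returning an empty ByteSpan.
    std::string_view get(std::string_view name) const
    {
        auto it = fAttributes.find(name);
        return (it != fAttributes.end()) ? it->second : std::string_view{};
    }
};
```

The one allocation left is the key itself when the attribute is added. Replacing that is where something like XmlName would come in.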

The stage is now set. We have our core components and data structures, we’re ready for the main event of actually parsing some content. For that, we have to make some design decisions. The first one we already made in the very beginning. We will be consuming a chunk of memory as represented in a ByteSpan. The next decision is how we want to consume it. Do we want to build a Document Object Model (DOM), or some other structure? Do we just want to print out nodes as we see them? Do we want a ‘pull model’ parser, where we are in control of getting each node one by one, or a ‘push model’, where we have a callback function which is called every time a node is seen, but the primary driver is elsewhere?
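To make the difference between ‘push’ and ‘pull’ concrete, here is a small sketch that splits text into words both ways. None of it is from svg2b2d: nextWord, pushWords and WordPuller are names invented for this example, and std::string_view (C++17) stands in for ByteSpan.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string_view>

// Split the next space-delimited word off the front of 'src',
// advancing 'src' past it. Returns an empty view when nothing is left.
inline std::string_view nextWord(std::string_view& src)
{
    while (!src.empty() && src.front() == ' ')
        src.remove_prefix(1);

    std::size_t n = 0;
    while (n < src.size() && src[n] != ' ')
        n++;

    std::string_view word = src.substr(0, n);
    src.remove_prefix(n);
    return word;
}

// Push model: the scanner owns the loop, and calls back for each item
inline void pushWords(std::string_view src,
                      const std::function<void(std::string_view)>& onWord)
{
    assert(onWord);     // a callback is required
    for (auto w = nextWord(src); !w.empty(); w = nextWord(src))
        onWord(w);
}

// Pull model: the caller owns the loop, and asks for one item at a time
struct WordPuller {
    std::string_view fSource;
    std::string_view fCurrent;

    explicit WordPuller(std::string_view src) : fSource(src) { next(); }

    explicit operator bool() const { return !fCurrent.empty(); }
    std::string_view operator*() const { return fCurrent; }
    void next() { fCurrent = nextWord(fSource); }
};
```

With pushWords, your code lives inside the callback. With WordPuller, you write an ordinary loop, `for (WordPuller p(text); p; p.next())`, and you can stop, skip, or hand the puller to another function whenever you like.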

My choice is to have a pull model parser, where I ask for each node, one by one, and do whatever I’m going to do with it. In terms of programming patterns, this is the ‘iterator’. So, I’m going to create an XML iterator. The fundamental structure of an iterator is this.

Iterator iter(content);
while (iter)
{
    doSomethingWithCurrentItem(*iter);
    iter++;
}

So, that’s what we need to construct for our XML. Something that can scan its input, delivering XmlElement as the individual items that we can then do something with. So, here is XmlElementIterator.

   struct XmlElementIterator {
    private:
        // XML Iterator States
        enum XML_ITERATOR_STATE {
            XML_ITERATOR_STATE_CONTENT = 0
            , XML_ITERATOR_STATE_START_TAG

        };
        
        // What state the iterator is in
        int fState{ XML_ITERATOR_STATE_CONTENT };
        svg2b2d::ByteSpan fSource{};
        svg2b2d::ByteSpan mark{};

        XmlElement fCurrentElement{};
        
    public:
        XmlElementIterator(const svg2b2d::ByteSpan& inChunk)
        {
            fSource = inChunk;
            mark = inChunk;

            fState = XML_ITERATOR_STATE_CONTENT;
            
            next();
        }

		explicit operator bool() { return !fCurrentElement.empty(); }
        
        // These operators make it operate like an iterator
        const XmlElement& operator*() const { return fCurrentElement; }
        const XmlElement* operator->() const { return &fCurrentElement; }

        XmlElementIterator& operator++() { next(); return *this; }
        XmlElementIterator& operator++(int) { next(); return *this; }
        
        // Reset the iterator to a known state with data
        void reset(const svg2b2d::ByteSpan& inChunk, int st)
        {
            fSource = inChunk;
            mark = inChunk;

            fState = st;
        }

        ByteSpan readTag()
        {
            ByteSpan elementChunk = fSource;
            elementChunk.fEnd = fSource.fStart;
            
            while (fSource && *fSource != '>')
                fSource++;

            elementChunk.fEnd = fSource.fStart;
            elementChunk = chunk_rtrim(elementChunk, wspChars);
            
            // Get past the '>' if it was there
            fSource++;
            
            return elementChunk;
        }
        
        // readDoctype
		// Reads the doctype chunk, and returns it as a ByteSpan
        // fSource is currently sitting at the beginning of !DOCTYPE
        // Note: 
        
        ByteSpan readDoctype()
        {

            // skip past the !DOCTYPE to the first whitespace character
			while (fSource && !wspChars[*fSource])
				fSource++;
            
			// Skip past the whitespace
            // to get to the beginning of things
			fSource = chunk_ltrim(fSource, wspChars);

            
            // Mark the beginning of the "content" we might return
            ByteSpan elementChunk = fSource;
            elementChunk.fEnd = fSource.fStart;

            // To get to the end, we're looking for '[]' or just '>'
            auto foundChar = chunk_find_char(fSource, '[');
            if (foundChar)
            {
                fSource = foundChar;
                foundChar = chunk_find_char(foundChar, ']');
                if (foundChar)
                {
                    fSource = foundChar;
                    fSource++;
                }
                elementChunk.fEnd = fSource.fStart;
            }
            
            // skip whitespace?
            // search for closing '>'
            foundChar = chunk_find_char(fSource, '>');
            if (foundChar)
            {
                fSource = foundChar;
                elementChunk.fEnd = fSource.fStart;
                fSource++;
            }
            
            return elementChunk;
        }
        
        
        // Simple routine to scan XML content
        // the input 's' is a chunk representing the xml to 
        // be scanned.
        // The input chunk will be altered in the process so it
        // can be used in a subsequent call to continue scanning where
        // it left off.
        bool next()
        {
            while (fSource)
            {
                switch (fState)
                {
                case XML_ITERATOR_STATE_CONTENT: {

                    if (*fSource == '<')
                    {
                        // Change state to beginning of start tag
                        // for next turn through iteration
                        fState = XML_ITERATOR_STATE_START_TAG;

                        if (fSource != mark)
                        {
                            // Encapsulate the content in a chunk
                            svg2b2d::ByteSpan content = { mark.fStart, fSource.fStart };

                            // collapse whitespace
							// if the content is all whitespace
                            // don't return anything
							content = chunk_trim(content, wspChars);
                            if (content)
                            {
                                // Set the state for next iteration
                                fSource++;
                                mark = fSource;
                                fCurrentElement.reset(XML_ELEMENT_TYPE_CONTENT, content);
                                
                                return true;
                            }
                        }

                        fSource++;
                        mark = fSource;
                    }
                    else {
                        fSource++;
                    }

                }
                break;

                case XML_ITERATOR_STATE_START_TAG: {
                    // Create a chunk that encapsulates the element tag 
                    // up to, but not including, the '>' character
                    ByteSpan elementChunk = fSource;
                    elementChunk.fEnd = fSource.fStart;
                    int kind = XML_ELEMENT_TYPE_START_TAG;
                    
                    if (chunk_starts_with_cstr(fSource, "?xml"))
                    {
						kind = XML_ELEMENT_TYPE_XMLDECL;
                        elementChunk = readTag();
                    } 
                    else if (chunk_starts_with_cstr(fSource, "?"))
                    {
                        kind = XML_ELEMENT_TYPE_PROCESSING_INSTRUCTION;
                        elementChunk = readTag();
                    }
                    else if (chunk_starts_with_cstr(fSource, "!DOCTYPE"))
                    {
                        kind = XML_ELEMENT_TYPE_DOCTYPE;
                        elementChunk = readDoctype();
                    }
                    else if (chunk_starts_with_cstr(fSource, "!--"))
                    {
						kind = XML_ELEMENT_TYPE_COMMENT;
                        elementChunk = readTag();
                    }
                    else if (chunk_starts_with_cstr(fSource, "![CDATA["))
                    {
                        kind = XML_ELEMENT_TYPE_CDATA;
                        elementChunk = readTag();
                    }
					else if (chunk_starts_with_cstr(fSource, "/"))
					{
						kind = XML_ELEMENT_TYPE_END_TAG;
						elementChunk = readTag();
					}
					else {
						elementChunk = readTag();
                        if (chunk_ends_with_char(elementChunk, '/'))
                            kind = XML_ELEMENT_TYPE_SELF_CLOSING;
					}
                    
                    fState = XML_ITERATOR_STATE_CONTENT;

                    mark = fSource;

					fCurrentElement.reset(kind, elementChunk, true);

                    return true;
                }
                break;

                default:
                    fSource++;
                    break;

                }
            }

            fCurrentElement.clear();
            return false;
        } // end of next()
    };

That code might have a face only a programmer could love, but it’s relatively simple to break down. The constructor takes a ByteSpan, and holds onto it as fSource. This ByteSpan is ‘consumed’, meaning, once you’ve iterated, you can’t go back. But, since ‘iteration’ is nothing more than moving a pointer in a ByteSpan, you can always take a ‘snapshot’ of where you’re at, and continue. That’s going to be useful for tracking down where an error occurred, but we won’t go into that right here.
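Just to give a flavor of why a snapshot is handy, here is a rough sketch of turning one into a byte offset and line number. It is an illustration only: Span is a stripped down stand-in for ByteSpan, and Location and locate are hypothetical names that do not exist in the repository.

```cpp
#include <cassert>
#include <cstddef>

// Minimal stand-in for ByteSpan: two pointers into someone else's memory
struct Span {
    const unsigned char* fStart;
    const unsigned char* fEnd;
};

struct Location {
    std::size_t offset;     // bytes from the start of the document
    std::size_t line;       // 1-based line number
};

// Given the whole document, and a snapshot taken somewhere inside it,
// work out where the snapshot sits. No copying, just pointer arithmetic.
inline Location locate(const Span& whole, const Span& snapshot)
{
    // The snapshot must point into the document it was taken from
    assert(snapshot.fStart >= whole.fStart && snapshot.fStart <= whole.fEnd);

    Location loc{ static_cast<std::size_t>(snapshot.fStart - whole.fStart), 1 };

    // Count the newlines that come before the snapshot position
    for (const unsigned char* p = whole.fStart; p < snapshot.fStart; ++p)
        if (*p == '\n')
            loc.line++;

    return loc;
}
```

Taking the snapshot is nothing more than copying the span before you keep scanning. If something goes wrong later, you hand the copy to a function like locate() to report where you were.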

The crux of the iterator is the ‘next()’ method. This is where we look for the ‘<’ character that indicates the start of some tag. The iterator runs between two states. You’re either in ‘XML_ITERATOR_STATE_CONTENT’ or ‘XML_ITERATOR_STATE_START_TAG’. Initially we start in the ‘CONTENT’ state, and flip to ‘START_TAG’ as soon as we see the ‘<’ character. Once in ‘START_TAG’, we try to further refine what kind of tag we’re dealing with. In most cases, we just capture the content, and that becomes the current element.
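Stripped of all the XML details, the two-state flip looks something like the following sketch. This is a simplified illustration, not the code above: TwoStateScanner and Token are made-up names, std::string_view (C++17) stands in for ByteSpan, and there is no attempt to classify comments, CDATA, and the like.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

enum class ScanState { Content, Tag };

struct Token {
    bool isTag;             // true: text between '<' and '>'; false: text between tags
    std::string_view text;
};

struct TwoStateScanner {
    std::string_view fSource;
    ScanState fState = ScanState::Content;

    explicit TwoStateScanner(std::string_view src) : fSource(src) {}

    // Fetch the next token; returns false once the input is used up
    bool next(Token& tok)
    {
        while (!fSource.empty())
        {
            if (fState == ScanState::Content)
            {
                // Everything up to the next '<' is content
                std::size_t n = fSource.find('<');
                if (n == std::string_view::npos)
                    n = fSource.size();

                std::string_view content = fSource.substr(0, n);
                fSource.remove_prefix(n);

                if (!fSource.empty()) {
                    fSource.remove_prefix(1);   // step over the '<'
                    fState = ScanState::Tag;
                }

                // Only report content that is more than whitespace
                if (content.find_first_not_of(" \t\r\n") != std::string_view::npos) {
                    tok = { false, content };
                    return true;
                }
            }
            else
            {
                assert(fState == ScanState::Tag);

                // Everything up to the next '>' belongs to the tag
                std::size_t n = fSource.find('>');
                if (n == std::string_view::npos)
                    n = fSource.size();

                tok = { true, fSource.substr(0, n) };
                fSource.remove_prefix(n);
                if (!fSource.empty())
                    fSource.remove_prefix(1);   // step over the '>'

                fState = ScanState::Content;
                return true;
            }
        }

        return false;
    }
};
```

Feeding it "<a>hi</a>" yields the tag "a", the content "hi", and the tag "/a", in that order. The real iterator does the same dance, with the extra step of deciding what kind of tag it has found.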

The iteration terminates when the current XmlElement (fCurrentElement) is empty, which happens if we run out of input, or there’s some kind of error.

So, next() returns true or false. And our iterator does what it’s supposed to do, which is hold onto the current XmlElement that we have scanned. You can get to the contents of the element by using the dereference operator *, like this: *iter, or the arrow operator. In either case, they simply return the current element.

        const XmlElement& operator*() const { return fCurrentElement; }
        const XmlElement* operator->() const { return &fCurrentElement; }

Alright, in practice, it looks like this:

#include <cstdio>   // printf

#include "mmap.h"
#include "xmlscan.h"
#include "xmlutil.h"

using namespace filemapper;
using namespace svg2b2d;

int main(int argc, char** argv)
{
    if (argc < 2)
    {
        printf("Usage: pullxml <xml file>\n");
        return 1;
    }

    // create an mmap for the specified file
    const char* filename = argv[1];
    auto mapped = mmap::createShared(filename);

    if (mapped == nullptr)
    {
        // Could not map the file, so report failure
        printf("Could not open file: %s\n", filename);
        return 1;
    }


    // 
	// Parse the mapped file as XML
    // printing out the elements along the way
    ByteSpan s(mapped->data(), mapped->size());
    
    XmlElementIterator iter(s);

    while (iter)
    {
		ndt_debug::printXmlElement(*iter);

        iter++;
    }

    // close the mapped file
    mapped->close();

    return 0;
}

That will generate the following output, where the printXmlElement() function can be found in the file xmlutil.h. The individual attributes are indicated with their name followed by ‘:’, such as ‘height:’, followed by the value of the attribute, surrounded by ‘||’ markers. Each tag kind is indicated as well.

START_TAG: [svg]
    height: ||200||
    width: ||680||
    xmlns: ||http://www.w3.org/2000/svg||
SELF_CLOSING: [circle]
    cx: ||70||
    cy: ||70||
    r: ||50||
SELF_CLOSING: [circle]
    cx: ||200||
    cy: ||70||
    fill: ||#79C99E||
    r: ||50||
SELF_CLOSING: [circle]
    cx: ||330||
    cy: ||70||
    fill: ||#79C99E||
    r: ||50||
    stroke: ||#508484||
    stroke-width: ||10||
SELF_CLOSING: [circle]
    cx: ||460||
    cy: ||70||
    fill: ||#79C99E||
    r: ||50||
    stroke-width: ||10||
SELF_CLOSING: [circle]
    cx: ||590||
    cy: ||70||
    fill: ||none||
    r: ||50||
    stroke: ||#508484||
    stroke-width: ||10||
END_TAG: [svg]

At this point, we have our XML “parser”. It can scan/parse enough for us to continue on our journey to parse and display SVG. It’s not the most robust XML parser on the planet, but it’s a good performer, very small and hopefully understandable. Usage could not be easier, and it does not impose a lot of frameworks, or pull in a lot of dependencies. We’re at a good starting point, and if all you wanted was to be able to parse some XML to do something, you could stop here and call it a day.

Next time around, we’re going to look into the SVG side of things, and sink deep into that rabbit hole.


SVG From the Ground Up – Parsing Fundamentals

Scalable Vector Graphics (SVG) is an XML based format. So, the first thing we want to do is create an XML ‘parser’. I put that in quotes because we don’t really need to create a full fledged conformant, validating, XML parser. This is the first design principle I’m going to be following. Here I want to create just enough to make things work, with an eye towards future proofing and extensibility, but not go so far as to make it absolutely bullet proof. So, I’ll be writing just enough code to scan some typical .svg files, while leaving room to swap out our quick and dirty parser for something that is more substantial in the future.

If you want to follow along the code I am constructing, you can find it in this GitHub repository: svg2b2d

Scanning text begins with how the text is represented in the first place. This is a fundamental design decision. Will we be reading from files on the local machine? Will we be reading from a stream of bytes coming from the network? Will we be reading from a chunk of memory handed to us through some API within the program? I’ve decided on this last choice. These days, it’s very common to be able to read a file into memory, and operate on it from there. Similarly with networks, speeds are fast enough that you can read the entirety of the content into memory before processing. SVG is not a format where you can easily progressively render, like with raster based formats. You really do need the whole document before you can render it.

So, we’re going to be reading from memory, assuming something else has already taken care of getting the image into memory. I am writing in C++, so I’m ok with struct, and class, but I don’t necessarily want to use the iostream facilities, nor get too far down the track with templates and the like. The latest C++ (20) has a std::span object, which is very useful, and does exactly what I want, but I want the code to be a bit more portable than C++ 20, so I’m not going to use that facility, instead I’m going to somewhat replicate it.

How do we represent a chunk of memory? There are two choices. You can either use a starting pointer and length, or you can use a starting and ending pointer. After much deliberation, I chose to do the latter, and use two pointers.

struct ByteSpan
{
    const unsigned char* fStart;
    const unsigned char* fEnd;
};

Throughout the code, I will use the ‘struct’ instead of ‘class’ because I’m ok with the data structure defaulting to everything being publicly accessible. There’s not a lot of sub-classing that’s going to occur here either, so I’m not as concerned about data hiding and encapsulation. This also makes the code easier to understand, without a lot of extraneous decorations.

There we have it. You have a chunk of memory, now what? Well, the most common things you do when scanning are advance the pointer, and check the character you’re currently sitting on. So, let’s add these conveniences, as well as a couple of constructors, then we can do a sample.


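#include <cstddef>   // size_t
#include <cstdint>   // uint8_t
#include <cstring>   // strlen

struct ByteSpan
{
    const unsigned char * fStart{};
    const unsigned char * fEnd{};

    ByteSpan():fStart(nullptr), fEnd(nullptr){}
    ByteSpan(const char *cstr):fStart((const unsigned char *)cstr), fEnd((const unsigned char *)cstr+strlen(cstr)){}
    explicit ByteSpan(const void* data, size_t sz) 
        :fStart((const unsigned char*)data)
        , fEnd((const unsigned char*)fStart + sz) 
        {}

    // Return false when start and end are the same
    explicit operator bool() const { return (fEnd - fStart) > 0; }

    // get current value from fStart, like a 'peek' operation
    // An exhausted span gives back a zero byte instead of reading past the end
    unsigned char& operator*() { 
        static unsigned char zero = 0;  
        if (fStart < fEnd) 
            return *(unsigned char*)fStart; 
        return  zero; 
    }
    
    const uint8_t& operator*() const { 
        static const unsigned char zero = 0;  
        if (fStart < fEnd) 
            return *fStart; 
        return  zero; 
    }

    // Advance the start of the span by n bytes, never moving past the end.
    // This operator is not part of the listing as first shown; it is filled
    // in here, as an assumption, because operator++ depends on it.
    ByteSpan& operator+=(size_t n) {
        size_t remaining = (size_t)(fEnd - fStart);
        fStart += (n < remaining) ? n : remaining;
        return *this;
    }

    ByteSpan& operator++() { return operator+=(1); }	// prefix notation ++y
    ByteSpan& operator++(int) { return operator+=(1); }   // postfix notation y++
};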


With all that boilerplate code added to the structure, you can now do the following operations:

ByteSpan b("Here is some text");
while (b)
{
    printf("%c",*b);
    b++;
}

And that little loop will essentially print a copy of the string you used to create the ByteSpan ‘b’. At this point, it might hardly seem worth the effort. I mean, you could just as simply use a starting pointer and ending pointer, without the intervening ByteSpan structure. Well, yes, and a lot of code out there in the world does exactly that, and it’s just fine. But, we have some future design goals which will make this little encapsulation of the two pointers very convenient. One of the design goals is worth introducing now, and that is the concept of zero, or minimal allocation. We want the scanner to be lightweight and fast, with minimal impact on the memory of the system. We want it to be able to parse data that is megabytes in size without having any problems. To this end, the scanner itself does no allocations, and does not alter the original memory it’s operating on, even though the ByteSpan would allow you to.

Alright. With this little tool in hand, what else can we do? Well, soon enough we’re going to need to compare characters, and make decisions. Is that a ‘<’ opening to an XmlElement? Is this whitespace? Does this string end with “/>”? We have need for something that can represent a set of characters. Here is charset.

struct charset {
		std::bitset<256> bits;

		explicit charset(const char achar) { addChar(achar); }
		charset(const char* chars) { addChars(chars); }

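		charset& addChar(const char achar)
		{
			// Index as unsigned, so characters above 127 don't turn into
			// negative values, and out of range bit positions
			bits.set((unsigned char)achar);
			return *this;
		}

		charset& addChars(const char* chars)
		{
			size_t len = strlen(chars);
			for (size_t i = 0; i < len; i++)
				bits.set((unsigned char)chars[i]);

			return *this;
		}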

		charset& operator+=(const char achar) { return addChar(achar); }
		charset& operator+=(const char* chars) { return addChars(chars); }

		charset operator+(const char achar) const
		{
			charset result(*this);
			return result.addChar(achar);
		}

		charset operator+(const char* chars) const
		{
			charset result(*this);
			return result.addChars(chars);
		}

		// This one makes it look like an array
		bool operator [](const size_t idx) const { return bits[idx]; }

		// This way makes it look like a function
		bool operator ()(const size_t idx) const { return bits[idx]; }

		bool contains(const uint8_t idx) const { return bits[idx]; }
		
	};

All of that, so that we can write the following

charset wspChars(" \t\r\n\v\f");

ByteSpan b("  Now is the time for all humans to come to the aid of animals  ");

while (b)
{
    // skip whitespace
    while (wspChars.contains(*b))
        b++;

    // If only trailing whitespace was left, there is no word to report
    if (!b)
        break;

    // Create a span that will represent a word
    // start it being empty
    ByteSpan aWord = b;
    aWord.fEnd = aWord.fStart;

    // Advance while there are still characters and they are not a whitespace character
    while (b && !wspChars.contains(*b))
        b++;

    // Now we're sitting at the end of the whole span, or at the beginning of the next
    // whitespace character.  In either case, it's the end of our word
    aWord.fEnd = b.fStart;

    // Now we can do something with the word that we found
    printWord(aWord);

    // And continue around the loop until we've exhausted the byte span
}


And that’s how we start. If you want to get ahead, you can look at the code in the repository, in particular bspan.h and bspanutil.h. With these two classes alone, we will build up the XML scanning capability, and ultimately the SVG building capability on top of that. So, these are very core, and important to get right, because they will maintain the promise of “no allocations” and “be super fast”.
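As a taste of the kind of helper that gets layered on top of a span, here is a sketch of whitespace trimming. The real helpers live in the repository and may look quite different. Span, isWsp, trimLeft, trimRight and trim are names chosen just for this example, and a plain function stands in for charset.

```cpp
#include <cassert>
#include <cstddef>

// Minimal stand-in for ByteSpan: two pointers into someone else's memory
struct Span {
    const unsigned char* fStart;
    const unsigned char* fEnd;
};

inline bool isWsp(unsigned char c)
{
    return c == ' ' || c == '\t' || c == '\r' || c == '\n' || c == '\v' || c == '\f';
}

inline std::size_t spanSize(const Span& s)
{
    return static_cast<std::size_t>(s.fEnd - s.fStart);
}

// Move the start forward past any leading whitespace
inline Span trimLeft(Span s)
{
    assert(s.fStart <= s.fEnd);
    while (s.fStart < s.fEnd && isWsp(*s.fStart))
        ++s.fStart;
    return s;
}

// Move the end backward past any trailing whitespace
inline Span trimRight(Span s)
{
    assert(s.fStart <= s.fEnd);
    while (s.fStart < s.fEnd && isWsp(*(s.fEnd - 1)))
        --s.fEnd;
    return s;
}

inline Span trim(Span s)
{
    return trimRight(trimLeft(s));
}
```

Notice that trimming never copies or modifies a single byte. The result is a narrower pair of pointers into the same memory, which is the “no allocations” promise in action.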

One question that came up in my mind was “why not just use regex and be done with it?”. Well, yes, C/C++ have regular expression capabilities, either built-in, or as some side library. There were a few reasons I chose not to go that route. One is about speed, another is about allocations. It’s super easy to just store your text into a std::string object, then use regex on that. But, when you do, you’ll find that std::string objects are allocated all over the place, and you don’t have tight control of your memory consumption, which breaks one of the design tenets I’m after. A third is just the size of such code. A good regex library can easily be as big, if not bigger, than the entirety of the SVG parser we’re trying to build. I am somewhat concerned with code size, so I’d rather not have the extra bloat. Besides all that, trying to construct regex patterns that I or anyone can maintain in the future can be quite challenging. We’ll essentially be building bits and pieces of what would typically go into regex libraries, but we’ll only be building as much as we need, so it will stay small and tight.
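For a sense of what “bits and pieces” means, here is a hand-written stand-in for the regex pattern "[0-9]+", plus a conversion to a number. This is a sketch for illustration only. scanDigits and toUnsigned are made-up names, std::string_view (C++17) stands in for ByteSpan, and overflow is not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hand-written equivalent of matching "[0-9]+" at the front of 's'.
// On success, 'digits' views the matched run, 's' is advanced past it,
// and the function returns true. Nothing is allocated or copied.
inline bool scanDigits(std::string_view& s, std::string_view& digits)
{
    std::size_t n = 0;
    while (n < s.size() && s[n] >= '0' && s[n] <= '9')
        n++;

    if (n == 0)
        return false;

    digits = s.substr(0, n);
    s.remove_prefix(n);
    return true;
}

// Convert a run of decimal digits to a number, without going through
// std::string. Overflow is ignored to keep the sketch short.
inline unsigned long toUnsigned(std::string_view digits)
{
    unsigned long value = 0;
    for (char c : digits) {
        assert(c >= '0' && c <= '9');
        value = value * 10 + static_cast<unsigned long>(c - '0');
    }
    return value;
}
```

Given the text "680px", scanDigits() hands back "680" and leaves "px" behind for whatever comes next. A dozen lines, no pattern compilation, no allocations.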

And there you have it. We have begun our journey with these two first small steps, the ByteSpan, and the charset.

Next time, we’ll see how easy it is to ‘parse’ some xml, as we introduce the XmlElement and XmlElementIterator.


Creating A SVG Viewer from the ground up

In the series I did last summer (Hello Scene), I walked through the fundamentals of creating a simple graphics system, from the ground up. Putting a window on the screen, setting pixels, drawing, text, visual effects, screen captures, and all that. Along the way, I discussed various design choices and tradeoffs that I made while creating the code.

While capturing screenshots, and flipping some bits might make for a cool demo, at the end of the day, I need to create actual applications that are: robust, performant, functional, and a delight for the user to use. A lot of what we see today are “web apps”, that is, things that are created to be run in a web browser. Web Apps have a lot of HTML, CSS, JavaScript, and are programmed with myriad frameworks, in multiple languages on the front end and backend. It’s a whole industry out there!

One question arises for me though, and perhaps a bit of envy. Why do those web apps have to look so great, with their fancy gradients, shadows, and animations, whereas my typical applications look like they’re stuck in a late 2000 computer geek movie? I’m talking about desktop apps, and why they haven’t changed much in the past 20 years. Maybe we get a splash here and there with some changes in icon styles (shadows, transparency, flat, ‘dark’), but really, the rest of the app looks and feels the same. No animations, no fancy pictures, everything is square, just no fun.

Well, to this point, I’ve been on a mission to create more engaging desktop app experiences, and it starts with the graphics. To that end, I looked out into the world and saw that SVG (Scalable Vector Graphics) would be a great place to start. Vector graphics are great. The other form of graphics is ‘bitmap’. Bitmap graphics are the realm of file formats such as ‘png’, ‘jpeg’, ‘gif’, ‘webp’, and the like. As the name implies, a ‘bitmap’ is just a bunch of dots of color in a square. There are a couple of challenges with bitmap graphics. One is that when you scale them, the thing starts to look “pixelated”. You know, they get the ‘jaggies’, and they just don’t look that great.

The second challenge you have is that the image is static. You don’t know where the keys on that keyboard are located, so being able to push them, or have them reflect music that’s playing, is quite a hard task.

In steps vector graphics. Vector Graphics contain the original drawing commands that are used to create a bitmap, at any size. With a vector graphics file, you can retain the information about colors, locations, geometry, everything that went into creating the image. This means that you can locate individual elements, name them, change them during the application, and so on.

Why don’t we just use vector graphics all the time then? Honestly, I really don’t know. I do know that one impediment to using them is being able to parse the format, and do something meaningful with it. To date, you mostly find support for SVG in web browsers, where they’re already parsing this kind of data. In that environment, you have full access to all those annotations, and furthermore, you can attach JavaScript code to the various actions, like mouse hovering, clicking, dragging and the like. But, for the most part, desktop applications don’t participate in that world. Instead, we’re typically stuck with bitmap graphics and clunky UI builders.

To change that, the first step is parsing the .svg file format. Lucky for me, SVG is based on XML, which is the first thing I worked on at Microsoft back in 1998. I never actually wrote the parser (worked on XSLT originally), but I’m super familiar with it. So, that’s where to start.

In this series, I’m going to write a functional SVG parser, which will be capable of generating SVG based bitmap images, as well as operate in an application development environment for desktop apps. I will be using the blend2d graphics library to do all the super heavy lifting of rendering the actual images, but I will focus on what goes into writing the parser, and seamlessly integrating the results into useful desktop applications.

So, follow along over the next few installments to see how it’s done.


Hello Scene – Conclusion

demo scene

I have been writing code since about 1978, when I first had access to a Commodore PET computer. For me, it’s always been about having fun, doing things with the machine that might not be obvious, and certainly not achievable on my own. Over the years, I’ve picked up some tips and tricks to help me get to the interesting parts sooner rather than later. During the 1980s-1990s, there was a ‘demo scene’ wherein coders such as myself, were engaged in trying to push our ‘personal computers’ to the limit in terms of what they could do visually, and with audio. This demo scene was often centered around computers such as the Commodore 64, or Apple II, or the venerable Commodore Amiga. The demo scene days are largely gone, and computers are a few orders of magnitude more powerful than those early personal computers.

And yet…

I still get excited to create quick and dirty programs that really push the limits of what you can do with the modern personal computer. Modern day programming is super heavy with frameworks, operating systems, SDKs and libraries. We are several levels removed from the core of the machine, which the demo scene of yore leveraged to great effect. But, the machine is still down there, waiting to be unlocked. With some esoteric knowledge, and some good habits and insights, we can begin to unlock all that the computer has to offer, and make creations quickly and easily, for fun and profit.

At the end of June of 2022, I decided I wanted to share some of this low level esoteric knowledge, because why should I have all the fun. So, I began a series of tutorials to show how the average programmer can start to conquer some typically low level stuff. In brief, the series is about how to create quick and dirty programs using the C/C++ programming language on the Windows platform. Included in the series is everything from how to put a window up on the screen, to how to display text along an animated bezier curve. I avoid the typical frameworks and libraries, and use a fairly minimal amount of OS features. Without much work, the tutorials can apply to just about any platform where you have access to the graphics screen, mouse, and keyboard.

Along the way, I share various design decisions that I’ve made, as well as the reasoning behind doing things simple and cheap, rather than relying on giant frameworks. In the end, you could pick up where this series left off and create your own demos, or simply use it as inspiration for generating your own things that are small and fun to play with.

Here are the links to the various tutorials. They rely on my minwe github repository, so it’s pretty easy to follow along if you want to look at the code in full.

Have You Scene My Demo?

Hello Scene – Win32 Wrangling

Hello Scene – What’s in a Window?

Hello Scene – Events, Organization, more drawing

Hello Scene – All the pretty little things

Hello Scene – Screen Captures for Fun and Profit

Hello Scene – It’s all about the text

I can only hope these tutorials give someone a fresh new perspective on one aspect or another of the coding process, and if they’re like my younger self, give them some tools so they can create their own wild creations.


Hello Scene – It’s all about the text

That’s a lot of fonts. But, it’s a relatively simple task to achieve once we’ve gained some understanding of how to deal with text. We’ll park this bit of code here (fontlist.cpp) while we gain some understanding.

#include <list>
#include <string>

#include "gui.h"
#include "fontmonger.h"

std::list<std::string> fontList;

void drawFonts()
{
	constexpr int rowHeight = 24;
	constexpr int colWidth = 213;

	int maxRows = canvasHeight / rowHeight;
	int maxCols = canvasWidth / colWidth;

	int col = 0;
	int row = 0;

	std::list<std::string>::iterator it;
	for (it = fontList.begin(); it != fontList.end(); ++it) 
	{
		int x = col * colWidth;
		int y = row * rowHeight;

		textFont(it->c_str(), 18);
		text(it->c_str(), x, y);

		col++;
		if (col >= maxCols)
		{
			col = 0;
			row++;
		}
	}
}

void setup()
{
	setCanvasSize(1280, 1024);

	FontMonger::collectFontFamilies(fontList);

	background(PixelRGBA(0xffdcdcdc));

	drawFonts();
}

I must say, dealing with fonts, and text rendering is one of the most challenging of the graphics disciplines. We could spend years and gigabytes of text explaining the intricacies of how fonts and text work. For our demo scene, we’re not going to get into all that though. We just want a little bit of text to be able to splash around here and there. So, I’m going to go the easy route, and explain how to use the system text rendering and incorporate it into the rest of our little demo framework.

First of all, some terminology. These words; Font, Font Face, OpenType, Points, etc, are all related to fonts, and all can cause confusion. So, let’s ignore all that for now, and just do something simple.

And the code to make it happen?

#include "gui.h"

void setup()
{
	setCanvasSize(320, 240);
	background(PixelRGBA (0xffffffff));		// A white background

	text("Hello Scene!", 24, 48);
}

Pretty simple right? By default, the demo scene chooses the “Segoe UI” font at 18 pixels high to do text rendering. The single call to “text(…)”, puts whatever text you want at the x,y coordinates specified afterward. So, what is “Segoe UI”? A Font describes the shape of a character. So, the letter ‘A’ in one font looks one way in say “Times New Roman”, and probably slightly different in “Tahoma”. These are stylistic differences. Us humans will just recognize it as ‘A’. Each font contains a bunch of descriptions of how to draw individual characters. These descriptions are essentially just polygons, with curves, and straight lines.

I’m grossly simplifying.

The basic description can be scaled, rotated, printed in ‘bold’, ‘italics’, or ‘underline’, depending on what you want to do when you’re displaying text. So, besides just saying where we want text to be located, we can specify the size (in pixels), and choose a specific font name other than the default.

Which was produced with a slight change in the code

#include "gui.h"

void setup()
{
	setCanvasSize(640, 280);
	background(PixelRGBA (0xffffffff));		// A white background

	textFont("Sitka Text", 100);
	text("Hello My Scene!", 24, 48);
}

And last, you can change the color of the text

How exciting is that?! For the simplest of demos, and maybe even some UI framework, this might be enough. But, let’s go a little bit further, and get some more functions that might be valuable.

First thing, we need to understand a little bit more about the font, like how tall and wide characters are, where’s the baseline, the ascent, and descent. Character width and height are easily understood. Ascent and descent might not be as well understood. Let’s start with a little display.

Some code to go with it

#include "gui.h"

constexpr int leftMargin = 24;
constexpr int topMargin = 24;


void drawTextDetail()
{
    // Showing font metrics
	const char* str2 = "My Scene!";
	PixelCoord sz;
	textMeasure(sz, str2);

	constexpr int myTop = 120;

	int baseline = myTop + fontHeight - fontDescent;
	int topline = myTop + fontLeading;

	strokeRectangle(*gAppSurface, leftMargin, myTop, sz.x(), sz.y(), PixelRGBA(0xffff0000));

	// Draw internalLeading - green
	copySpan(*gAppSurface, 
        leftMargin, topline, sz.x(), 
        PixelRGBA(0xff00ff00));

	// draw baseline
	copySpan(*gAppSurface, 
        leftMargin, baseline, sz.x(), 
        PixelRGBA(0xff0000ff));

	// Draw text in the box
    // Turquoise Text
	textColor(PixelRGBA(0xff00ffff));	
	text("My Scene!", leftMargin, myTop);
}

void setup()
{
	setCanvasSize(640, 280);
	background(PixelRGBA (0xffffffff));

	textFont("Sitka Text", 100);

	drawTextDetail();
}

In the setup, we do the usual to create a canvas of a desirable size. Then we select the font with a particular pixel height. Then wave our hands and call ‘drawTextDetail()’.

In ‘drawTextDetail()’, one of the first calls is to ‘textMeasure()’. We want the answer to; “How many pixels wide and high is this string?” The ‘textMeasure()’ function does this. It’s pretty straightforward as the GDI API that we’re using for text rendering has a function call for this purpose.

void textMeasure(PixelCoord& pt, const char* txt)
{
    SIZE sz;
    ::GetTextExtentPoint32A(gAppSurface->getDC(), txt, static_cast<int>(strlen(txt)), &sz);

    pt[0] = sz.cx;
    pt[1] = sz.cy;
}

It’s that simple. Just pass in a structure to receive the size, and make the call to ‘GetTextExtentPoint32A()’. I choose to return the value in a PixelCoord object, because I don’t want the Windows specific data structures bleeding into my own demo API. This allows me to change the underlying text API without having to worry about changing dependent data structures.

The size that is returned incorporates a few pieces of information. It’s not a tight fit to the string. The size is derived from a combination of global font information (tallest character, lowest part of a character), as well as the cumulative widths of the actual characters specified. In the case of our little demo, the red rectangle represents the size that was returned.

There are a couple more bits of information that are set when you select a font of a particular size. The three most important bits are, the ascent, descent, and internal leading.

Let’s start from the descent. This is the maximum amount any given character of the font might fall below the ‘baseline’, which is drawn as the blue line. The baseline is implicitly defined by this descent: measured from the top of the rectangle, it sits at fontHeight-fontDescent. This is the line that characters without descenders sit on as their ‘bottom’. The ‘ascent’ is the amount of space above this baseline. So, the total fontHeight is the fontDescent+fontAscent. The ascent isn’t explicitly shown, because it reaches up to the top line of the rectangle. The last bit is the internalLeading. This is the amount of space used by accent characters and the like. The fontLeading is this number, and it is represented by the green line, which is drawn that many pixels below the top of the rectangle.
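The arithmetic is easier to see in isolation. Here is a small sketch that mirrors the calculations in drawTextDetail(). The FontMetrics and MetricLines structs are hypothetical stand-ins for the fontHeight, fontDescent, and fontLeading values the demo exposes, and any numbers you plug in are your own.

```cpp
// Hypothetical container for the values the demo exposes as
// fontHeight, fontDescent and fontLeading
struct FontMetrics {
    int height;           // total cell height (ascent + descent)
    int descent;          // maximum drop below the baseline
    int internalLeading;  // space reserved for accents and the like

    int ascent() const { return height - descent; }
};

// y coordinates of the interesting horizontal lines of a text box
struct MetricLines {
    int top;          // top of the measured rectangle
    int leadingLine;  // the green line in the demo
    int baseline;     // the blue line in the demo
    int bottom;       // bottom of the measured rectangle
};

// Compute the lines for a text box whose top edge is at 'top'
inline MetricLines metricLines(const FontMetrics& fm, int top)
{
    return {
        top,
        top + fm.internalLeading,
        top + fm.height - fm.descent,
        top + fm.height
    };
}
```

As an example with made-up numbers, a font with a height of 100, a descent of 22, and an internal leading of 20, placed with its top at y = 120, has its baseline at 198, its leading line at 140, and its bottom edge at 220.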

And there you have it. All the little bits and pieces of a font. When you specify a location for drawing the font in the ‘text()’ function, you’re essentially specifying the top left corner of this red rectangle. Of course, that leaves you a bit high and dry when it comes to precisely placing your text. More than likely, what you really want to do is place your text according to the baseline, so that you can be more assured of where your text is actually going to show up. Maybe you want that, maybe you don’t. What you really need is the flexibility to specify the ‘alignment’ of your text rendering.

This is actually a re-creation of something I did about 10 years ago, for another project. It’s a pretty simple matter once you have adequate font and character sizing information.

#include "gui.h"
#include "textlayout.h"

TextLayout tLayout;

void drawAlignedText()
{
	int midx = canvasWidth / 2;
	int midy = canvasHeight / 2;

	// draw vertical line down center of canvas
	line(*gAppSurface, midx, 0, midx, canvasHeight - 1, PixelRGBA(0xff000000));

	// draw horizontal line across canvas
	line(*gAppSurface, 0, midy, canvasWidth - 1, midy, PixelRGBA(0xff000000));

	tLayout.textFont("Consolas", 24);
	tLayout.textColor(PixelRGBA(0xff000000));

	tLayout.textAlign(ALIGNMENT::LEFT, ALIGNMENT::BASELINE);
	tLayout.text("LEFT", midx, 24);

	tLayout.textAlign(ALIGNMENT::CENTER, ALIGNMENT::BASELINE);
	tLayout.text("CENTER", midx, 48);

	tLayout.textAlign(ALIGNMENT::RIGHT, ALIGNMENT::BASELINE);
	tLayout.text("RIGHT", midx, 72);

	tLayout.textAlign(ALIGNMENT::RIGHT, ALIGNMENT::BASELINE);
	tLayout.text("SOUTH EAST", midx, midy);

	tLayout.textAlign(ALIGNMENT::LEFT, ALIGNMENT::BASELINE);
	tLayout.text("SOUTH WEST", midx, midy);

	tLayout.textAlign(ALIGNMENT::RIGHT, ALIGNMENT::TOP);
	tLayout.text("NORTH EAST", midx, midy);

	tLayout.textAlign(ALIGNMENT::LEFT, ALIGNMENT::TOP);
	tLayout.text("NORTH WEST", midx, midy);
}

void setup()
{
	setCanvasSize(320, 320);

	tLayout.init(gAppSurface);

	background(PixelRGBA(0xffDDDDDD));

	drawAlignedText();
}

Design-wise, I chose to stuff the various text measurement and rendering routines into a separate object. My other choice would have been to put them into the gui.h/cpp file, and I did do that initially. But that would be forcing a particular strong opinion on how text should be dealt with, and I did not make that choice for drawing in general. So, I thought better of it and chose to encapsulate the text routines in this layout structure (textlayout.h).
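The heart of alignment is just turning the point you were given into the top left corner that the underlying text call expects. Here is one way that calculation might look. This is a sketch, not the code in textlayout.h; the HAlign and VAlign enums and the alignedTopLeft() function are names invented for this example.

```cpp
// Hypothetical alignment enums (the demo itself uses ALIGNMENT::...)
enum class HAlign { Left, Center, Right };
enum class VAlign { Top, Center, Baseline, Bottom };

struct Point { int x; int y; };

// Given the anchor point (x, y), the measured size of the string,
// and the font ascent, return the top left corner to draw at.
inline Point alignedTopLeft(int x, int y,
    int textWidth, int textHeight, int ascent,
    HAlign h, VAlign v)
{
    Point p{ x, y };

    switch (h) {
    case HAlign::Left:   break;                        // anchor is the left edge
    case HAlign::Center: p.x = x - textWidth / 2; break;
    case HAlign::Right:  p.x = x - textWidth; break;   // anchor is the right edge
    }

    switch (v) {
    case VAlign::Top:      break;                      // anchor is the top edge
    case VAlign::Center:   p.y = y - textHeight / 2; break;
    case VAlign::Baseline: p.y = y - ascent; break;    // anchor sits on the baseline
    case VAlign::Bottom:   p.y = y - textHeight; break;
    }

    return p;
}
```

So a string measuring 60 by 24 pixels, with an ascent of 19, anchored at (160, 160) with RIGHT and BASELINE alignment, gets drawn with its top left corner at (100, 141).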

Now that we have the ability to precisely place a string, we can get a little creative in playing with the displacement of all the characters in a string. With that ability, we can have text placed based on the evaluation of a function, with animation of course.

#include "gui.h"
#include "geotypes.hpp"
#include "textlayout.h"

using namespace alib;

constexpr int margin = 50;
constexpr int FRAMERATE = 20;

int dir = 1;				// direction
int currentIteration = 1;	// Changes during running
int iterations = 30;		// increase past frame rate to slow down
bool showCurve = true;
TextLayout tLayout;

void textOnBezier(const char* txt, GeoBezier<ptrdiff_t>& bez)
{
	int len = strlen(txt);

	double u = 0.0;
	int offset = 0;
	int xoffset = 0;

	while (txt[offset])
	{
		// Isolate the current character
		char buf[2];
		buf[0] = txt[offset];
		buf[1] = 0;

		// Figure out the x and y offset
		auto pt = bez.eval(u);

		// Display current character
		tLayout.text(buf, pt.x(), pt.y());

		// Calculate size of current character
		// so we can figure out where next one goes
		PixelCoord charSize;
		tLayout.textMeasure(charSize, buf);

		// Now get the next value of 'u' so we 
		// can evaluate where the next character will go
		u = bez.findUForX(pt.x() + charSize.x());

		offset++;
	}

}

void strokeCurve(PixelMap& pmap, GeoBezier<ptrdiff_t> &bez, int segments, const PixelRGBA c)
{
	// Get starting point
	auto lp = bez.eval(0.0);

	int i = 1;
	while (i <= segments) {
		double u = (double)i / segments;

		auto p = bez.eval(u);

		// draw line segment from last point to current point
		line(pmap, lp.x(), lp.y(), p.x(), p.y(), c);

		// Assign current to last
		lp = p;

		i = i + 1;
	}
}

void onFrame()
{
	background(PixelRGBA(0xffffffff));

	int y1 = maths::Map(currentIteration, 1, iterations, 0, canvasHeight);

	GeoCubicBezier<ptrdiff_t> bez(margin, canvasHeight / 2, 
        canvasWidth * 0.25, y1, 
        canvasWidth - (canvasWidth * 0.25), canvasHeight -y1, 
        canvasWidth - margin, canvasHeight / 2.0);
	
	if (showCurve)
		strokeCurve(*gAppSurface, bez, 50, PixelRGBA(0xffff0000));

	// honor the character spacing
	tLayout.textColor(PixelRGBA(0xff0000ff));
	textOnBezier("When Will The Quick Brown Fox Jump Over the Lazy Dogs Back", bez);


	currentIteration += dir;

	// reverse direction if needs be
	if ((currentIteration >= iterations) || (currentIteration <= 1))
		dir = dir < 1 ? 1 : -1;
}

void setup()
{
	setCanvasSize(800, 600);
	setFrameRate(FRAMERATE);

	tLayout.init(gAppSurface);
	tLayout.textFont("Consolas", 24);
	tLayout.textAlign(ALIGNMENT::CENTER, ALIGNMENT::CENTER);
}


void keyReleased(const KeyboardEvent& e) 
{
	switch (e.keyCode) {
	case VK_ESCAPE:
		halt();
		break;

	case VK_SPACE:
		showCurve = !showCurve;
		break;

	case 'R':
		recordingToggle();
		break;
	}
}

For once, I won’t go line by line. The key trick here is the ‘findUForX()’ function of the bezier object. Since textMeasure() tells us how wide a string is (in pixels), we know how much to advance in the x direction as we display characters. Our bezier curve has an eval() function, which takes a ‘u’ value from 0.0 to 1.0 and returns the point on the curve at that position. So, we want to match the x offset of the next character with its corresponding ‘u’ value along the curve, then we can evaluate the curve at that position, and find out the appropriate ‘y’ value.

Notice in the setup, the text alignment is set to CENTER, CENTER. This means that the coordinate positions being calculated should represent the center of the characters being printed. That roughly leaves the center of the character aligned with the evaluated values of the curve, which will match your expectations most closely. Another way to do it might be to do LEFT, BASELINE, to get the characters left aligned, and use the curve as the baseline. There are a few possibilities, and you can simply choose what suits your needs.

This is a very crude way to display text on a curve, but showing text along a path is a fairly common parlor trick in demo applications, and this is one way of doing it quick and dirty. Your curve doesn’t have to be a bezier, it could be anything you like. Just take it one character at a time, use the text alignment, and see what you can accomplish.
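To show that the curve really can be anything, here is a small sketch that lays characters out along an arbitrary y = f(x) function. The names GlyphPos and layoutOnCurve are invented for this example, and to keep it self contained it assumes a fixed character width (reasonable for a monospaced font like Consolas) rather than calling textMeasure() for each character.

```cpp
#include <functional>
#include <string>
#include <vector>

// Where one character should be drawn
struct GlyphPos {
    char ch;
    double x;
    double y;
};

// Walk the string one character at a time, advancing x by a fixed
// character width (an assumption; a real version would measure each
// character), and asking the curve for the matching y.
inline std::vector<GlyphPos> layoutOnCurve(const std::string& txt,
    double startX, double charWidth,
    const std::function<double(double)>& curve)
{
    std::vector<GlyphPos> out;
    out.reserve(txt.size());

    double x = startX;
    for (char ch : txt) {
        out.push_back({ ch, x, curve(x) });
        x += charWidth;
    }
    return out;
}
```

Pass in a lambda such as [](double x) { return 300.0 + 80.0 * std::sin(x * 0.02); } (with <cmath> included) and you get text riding a sine wave. Each resulting position can then be handed to a text call with whatever alignment suits you.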

There is a design choice here. I am using simple GDI based interfaces to display the text. I can do this because at the core, the PixelArray that we’re drawing into does in fact have a “DeviceContext”, so GDI knows how to draw into it. This is a great convenience, because it means that we can do all the independent drawing that we’ve been doing, from random pixels to bezier curves, and when we get to something we can’t quite handle, we can fall back to what the system provides, in this case text rendering.

With that, we’re at the end of this series. We’ve gone from a basic window on the screen, to drawing text along an animating bezier curve, all while recording to a .mpg file. This is just the beginning. We’ve covered some design choices along the way, including the desire to keep the code small and composable. The only thing left to do is go out and create something of your own, by using this kind of toolkit, or better yet, have the confidence to create your own.

The Demo Scene is out there. Go create something.


Hello Scene – Screen Captures for Fun and Profit

Being able to capture the display screen opens up some interesting possibilities for our demo scenes.

In this particular case, my demo app is capturing a part of my screen, and using it as a ‘texture map’ on a trapezoid, and compositing that onto a perlin noise background. The capture is live, as we’ll see shortly, but first, the code that does this little demo (sampmania.cpp).


#include "gui.h"
#include "sampledraw2d.h"
#include "screensnapshot.h"
#include "perlintexture.h"

ScreenSnapshot screenSamp;

void onFrame()
{
	// Take current snapshot of screen
	screenSamp.next();

	// Trapezoid
	PixelCoord verts[] = { PixelCoord({600,100}),PixelCoord({1000,100}),PixelCoord({1700,800}),PixelCoord({510,800}) };
	int nverts = 4;
	sampleConvexPolygon(*gAppSurface, 
		verts, nverts, 0, 
		screenSamp, 
		{ 0,0,canvasWidth, canvasHeight });

}

void setup()
{
	setCanvasSize(1920, 1080);
	setFrameRate(15);

	// Draw noisy background only once
	NoiseSampler perlinSamp(4);
	sampleRectangle(*gAppSurface, gAppSurface->frame(), perlinSamp);

	// Setup the screen sampler
	// Capture left half of screen
	screenSamp.init(0, 0, displayWidth / 2, displayHeight);
}

Pretty standard fare for our demos. There are a couple of new concepts here though. One is a sampler, the other is the ScreenSnapshot object. Let’s first take a look at the ScreenSnapshot object. The idea here is we want to take a picture of what’s on the screen, and make it available to the program in a PixelArray, which is how we represent pixel images in general. If we can do that, we can further use the screen snapshot just like the canvas. We can draw on it, save it, whatever.
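As for the sampler idea, the gist is that anything which can answer "what color is at this normalized (u,v) position?" can be used to fill a shape. The following is a simplified sketch of that concept, not the code in sampledraw2d.h. The Sampler, Image, ImageSampler, and fillRect names are assumptions made for this example, and fillRect does no clipping.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sampler interface: (u, v) in [0, 1] maps to a color
struct Sampler {
    virtual ~Sampler() = default;
    virtual std::uint32_t sample(double u, double v) const = 0;
};

// A plain 32-bit image, standing in for the demo's PixelArray
struct Image {
    int width;
    int height;
    std::vector<std::uint32_t> pixels;
};

// A sampler that reads from an image (nearest pixel)
struct ImageSampler : Sampler {
    const Image& img;
    explicit ImageSampler(const Image& i) : img(i) {}

    std::uint32_t sample(double u, double v) const override
    {
        int x = static_cast<int>(u * (img.width - 1) + 0.5);
        int y = static_cast<int>(v * (img.height - 1) + 0.5);
        return img.pixels[static_cast<std::size_t>(y) * img.width + x];
    }
};

// Fill a rectangle of dst, asking the sampler for every pixel.
// Assumes the rectangle lies entirely inside dst.
inline void fillRect(Image& dst, int x0, int y0, int w, int h, const Sampler& s)
{
    for (int row = 0; row < h; ++row) {
        double v = (h > 1) ? static_cast<double>(row) / (h - 1) : 0.0;
        for (int col = 0; col < w; ++col) {
            double u = (w > 1) ? static_cast<double>(col) / (w - 1) : 0.0;
            dst.pixels[static_cast<std::size_t>(y0 + row) * dst.width + (x0 + col)] = s.sample(u, v);
        }
    }
}
```

Because the destination only ever talks to the sampler in normalized coordinates, the source can be a screen capture, a noise function, or a solid color, and it will stretch to fit whatever shape is being filled.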

On the Windows platform, there are 2 or 3 ways to take a snapshot of the display screen. Each method comes from a different era of the evolution of the Windows APIs, and has various benefits or limitations. In this case, we use the most ancient method for taking a snapshot, relying on the good old GDI API to do the work, since it’s been reliable all the way back to Windows 3.0.

#pragma once
// ScreenSnapshot
//
// Take a snapshot of a portion of the screen and hold
// it in a PixelArray (User32PixelMap)
//
// When constructed, a single snapshot is taken.
// every time you want a new snapshot, just call 'next()'
// This is great for doing a live screen capture
//
//    ScreenSnapshot ss(x,y, width, height);
//
//    References:
//    https://www.codeproject.com/articles/5051/various-methods-for-capturing-the-screen
//    https://stackoverflow.com/questions/5069104/fastest-method-of-screen-capturing-on-windows
//  https://github.com/bmharper/WindowsDesktopDuplicationSample
//

#include "User32PixelMap.h"

class ScreenSnapshot : public User32PixelMap
{
    HDC fSourceDC;  // Device Context for the screen

    // which location on the screen are we capturing
    int fOriginX;   
    int fOriginY;


public:
    ScreenSnapshot()
        : fSourceDC(nullptr)
        , fOriginX(0)
        , fOriginY(0)
    {}

    ScreenSnapshot(int x, int y, int awidth, int aheight, HDC srcDC = NULL)
        : User32PixelMap(awidth, aheight),
        fOriginX(x),
        fOriginY(y)
    {
        init(x, y, awidth, aheight, srcDC);

        // take at least one snapshot
        next();
    }

    bool init(int x, int y, int awidth, int aheight, HDC srcDC=NULL)
    {
        User32PixelMap::init(awidth, aheight);

        if (NULL == srcDC)
            fSourceDC = GetDC(nullptr);
        else
            fSourceDC = srcDC;

        fOriginX = x;
        fOriginY = y;

        return true;
    }

    // take a snapshot of current screen
    bool next()
    {
        // copy the screendc into our backing buffer
        // getDC retrieves the device context of the backing buffer
        // which in this case is the 'destination'
        // the fSourceDC is the source
        // the width and height are dictated by the width() and height() 
        // and the source origin is given by fOriginX, fOriginY
        // We use the parameters (SRCCOPY, CAPTUREBLT) because that seems 
        // to be best practice in this case
        BitBlt(getDC(), 0, 0, width(), height(), fSourceDC, fOriginX, fOriginY, SRCCOPY | CAPTUREBLT);

        return true;
    }
};

There’s really not much to it. The real working end of it is the ‘next()’ function. That function call to ‘BitBlt()’ is where all the magic happens. That’s a Graphics Device Interface (GDI) system call, which will copy from one “DeviceContext” to another. A DeviceContext is a Windows construct that represents the interface for drawing into something. This interface exists for screens, printers, or bitmaps in memory. Very old, very basic, very functional.

So, the basics are, get a ‘DeviceContext’ for the screen, and another ‘DeviceContext’ for a bitmap in memory, and call BitBlt to copy pixels from one to the other.
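If you strip away the Windows specifics, what a call like this boils down to is a rectangular copy from one pixel buffer to another. Here is a platform neutral sketch of that idea. It is not how GDI implements it; the Bitmap struct and blit() function are made up for this example, with the parameter order loosely following BitBlt (destination first, then size, then source).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// A plain 32-bit pixel buffer, standing in for a bitmap in memory
struct Bitmap {
    int width;
    int height;
    std::vector<std::uint32_t> pixels;

    Bitmap(int w, int h)
        : width(w), height(h), pixels(static_cast<std::size_t>(w) * h, 0) {}
};

// Copy a w x h block from src at (sx, sy) into dst at (dx, dy).
// Pixels that fall outside either bitmap are simply skipped.
inline void blit(Bitmap& dst, int dx, int dy, int w, int h,
    const Bitmap& src, int sx, int sy)
{
    for (int row = 0; row < h; ++row) {
        int srow = sy + row;
        int drow = dy + row;
        if (srow < 0 || srow >= src.height || drow < 0 || drow >= dst.height)
            continue;

        for (int col = 0; col < w; ++col) {
            int scol = sx + col;
            int dcol = dx + col;
            if (scol < 0 || scol >= src.width || dcol < 0 || dcol >= dst.width)
                continue;

            dst.pixels[static_cast<std::size_t>(drow) * dst.width + dcol] =
                src.pixels[static_cast<std::size_t>(srow) * src.width + scol];
        }
    }
}
```

In the ScreenSnapshot case, the "source" is the screen and the "destination" is our own bitmap, with fOriginX and fOriginY playing the role of (sx, sy).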

Also, notice the ScreenSnapshot inherits from User32PixelMap. We first saw this early on in this series (What’s In a Window), when we were first exploring how to put pixels up on the screen. We’re just leveraging what was built there, which was essentially a Windows Bitmap.

OK, so bottom line, we can take a picture of the screen, and put it into a bitmap, that we can then use in various ways.

Here’s the movie

Well, isn’t that nifty. You might notice that if you query the internet for “screen capture”, you’ll find links to tons of products that do screen capture, and recording. Finding a library that does this for you programmatically is a bit more difficult. One method that seems to pop up a lot is to capture the screen to a file, or to the clipboard, but that’s not what you want, you just want it in a bitmap ready to go, which is what we do here.

On Windows, a more modern method is to use DirectX, because that’s the preferred interface of modern day Windows. The GDI calls under the covers probably call into DirectX. The benefit of using this simple BitBlt() method is that you don’t have to increase your dependencies, and you don’t need to learn a fairly complicated interface layer, just to capture the screen.

I’ve used a complex image here, mainly to draw attention to this subject, but really, the screen capturing and viewing can be much simpler.

Just a straight up view, without any geometric transformation, other than to fit the rectangle.

Code that looks very similar, but just using a simple mapping to a rectangle, rather than a trapezoid. This is from screenview.cpp

//
// screenview
// Simplest application to do continuous screen capture
// and display in another window.
//
#include "gui.h"

#include "screensnapshot.h"

ScreenSnapshot screenSamp;

void onFrame()
{
    // Get current screen snapshot
    screenSamp.next();

    // Draw a rectangle with snapshot as texture
    sampleRectangle(*gAppSurface,gAppSurface->frame(),screenSamp);
}

// Do application setup before things get
// going
void setup()
{
    // Setup application window
	setCanvasSize(displayWidth/2, displayHeight);

    // setup the snapshot
    screenSamp.init(0, 0, displayWidth / 2, displayHeight);
}

void keyReleased(const KeyboardEvent& e) {
    switch (e.keyCode)
    {
    case VK_ESCAPE:
        halt();
        break;

    case 'R':
    {
        recordingToggle();
    }
    break;
    }
}

Capturing the screen has an additional benefit for our demo scenes. One little used feature of Windows is the fact you can use translucency, and transparency. As such, you can display rather interesting things on the display. Using the recording technique where we just capture what’s on our canvas won’t really capture what the user will see. You’ll only capture what you’re drawing in your own buffer. In order to capture the fullness of the demo, you need to capture what’s on the screen.

And just to kick it up a notch, and show off some other things you can do with transparency…

In both these cases of the chasing balls, as well as the transparent keyboard, there is a function call within the demo scene ‘layered()’. If you call this in your setup, then your window won’t have any sort of border, and if you use transparency in your colors, they’ll be composited with whatever is on the desktop.
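"Composited" here means the usual source-over blend, where the alpha of your pixel decides how much of the desktop shows through. The system does this for you, but the per pixel math is worth seeing once. This is a generic sketch of source-over blending, assuming non-premultiplied colors and an opaque background. RGBA8, blendChannel, and blendOver are names made up for this example, not the demo's PixelRGBA API.

```cpp
#include <cstdint>

// Hypothetical 8-bit per channel color
struct RGBA8 {
    std::uint8_t r, g, b, a;
};

// Blend a single channel:
//   result = (src * alpha + dst * (255 - alpha)) / 255, rounded
inline std::uint8_t blendChannel(std::uint8_t src, std::uint8_t dst, std::uint8_t alpha)
{
    return static_cast<std::uint8_t>((src * alpha + dst * (255 - alpha) + 127) / 255);
}

// Source-over: lay 'src' on top of an opaque 'dst'
inline RGBA8 blendOver(RGBA8 src, RGBA8 dst)
{
    return {
        blendChannel(src.r, dst.r, src.a),
        blendChannel(src.g, dst.g, src.a),
        blendChannel(src.b, dst.b, src.a),
        255
    };
}
```

With an alpha of 255 you get your own color back, with an alpha of 0 you get the desktop, and everything in between is a mix of the two.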

You can go one step further (in the case of the chasing balls), and call ‘fullscreen()’, which will essentially do a: setCanvasSize(displayWidth, displayHeight); layered();

There is one additional call, which allows you to retain your window title bar (for moving around and closing), but sets a global transparency level for your window ‘windowOpacity(double)’, which takes a value between 0.0 (fully transparent), and 1.0 (fully opaque).

And of course the demo code for the disappearing rectangles trick.

#include "apphost.h"
#include "draw.h"
#include "maths.hpp"

using namespace maths;

bool outlineOnly = false;
double opacity = 1.0;

INLINE PixelRGBA randomColor(uint32_t alpha=255)
{
	uint32_t r = random_int(255);
	uint32_t g = random_int(255);
	uint32_t b = random_int(255);

	return { r,g,b,alpha };
}

void handleKeyboardEvent(const KeyboardEventTopic& p, const KeyboardEvent& e)
{
	if (e.keyCode == VK_ESCAPE)
		halt();

	if (e.keyCode == VK_SPACE)
		outlineOnly = !outlineOnly;

	if (e.keyCode == VK_UP)
		opacity = maths::Clamp(opacity + 0.05, 0.0, 1.0);

	if (e.keyCode == VK_DOWN)
		opacity = maths::Clamp(opacity - 0.05, 0.0, 1.0);

	windowOpacity(opacity);
}

void onLoop()
{
	PixelRGBA stroke;
	PixelRGBA fill;
	PixelRGBA c;

	gAppSurface->setAllPixels(PixelRGBA(0x0));

	for (int i = 1; i <= 2000; i++)
	{
		int x1 = random_int(canvasWidth - 1);
		int y1 = random_int(canvasHeight - 1);
		int lwidth = random_int(4, 60);
		int lheight = random_int(4, 60);

		c = randomColor(192);

		if (outlineOnly)
		{
			stroke = c;
			draw::rectangle_copy(*gAppSurface, x1, y1, lwidth, lheight, c);

		}
		else
		{
			fill = c;
			//draw::rectangle_copy(*gAppSurface, x1, y1, lwidth, lheight, c);
			draw::rectangle_blend(*gAppSurface, x1, y1, lwidth, lheight, c);
		}
	}

	refreshScreen();
}

void onLoad()
{
	subscribe(handleKeyboardEvent);

	setCanvasSize(800, 800);
}

Well, that’s a lot of stuff, but mostly we covered various forms of screen capture, what you can do with it, and why recording just your own drawing buffer doesn’t show the full fidelity of your work.

We also covered a little bit of Windows wizardry with transparent windows, a very little known or used feature, but we can use it to great advantage for certain kinds of apps.

From a design perspective, I chose to use an ancient, but still supported API call, because it has the least number of dependencies, is the easiest of all the screen capture methods to understand and implement, and it uses the smallest amount of code.

Another thing of note for this demo framework is the maximum usage of ‘.h’ files. In each demo sample, there’s typically only 2 or 3 ‘.cpp’ files, and NO .dll files. This is again for simplicity and portability. You could easily put things in a library, and having appmain.cpp in a .exe file would even work, but that leads down a different path. Here, we just make every demo self contained, compiling all the code needed right then and there. This works out when your file count is relatively small (fewer than 10), and you’re working on a small team (fewer than 5). This probably does not scale as well beyond that.

But, there you have it. We’ve gone all the way from putting a single pixel on the screen, to displaying complex geometries with animation in transparent windows. The only thing left in this series is to draw some text, and call it a wrap.

So, next time.