Tinkerer’s Closet – Selecting hardware

Well, there’s no better reason to clean out your closet or workspace than to fill it with new stuff!! I’ve been spending a fair amount of time clearing out the old, and I’ve gotten far enough along that I feel it’s safe to poke my head up and see what’s new and exciting in the world of hardware components today.

What to look at, though, so I don’t just fill up with a bunch more stuff that doesn’t get used for the next 10 years? Well, this time around I’m going project based. That means I will limit my searching to stuff that can help a project at hand. Yes, it’s useful to get some items just for the learning of it, but for a hoarder, it’s better to have an actual project in mind before making purchases.

On the compute front, I’ve been standardizing the low end around ESP32 modules. I’ve mentioned this in the past, but it’s worth a bit more detail. The company, Espressif, came along within the past decade and just kind of took the maker community by storm. Low cost, communications built in (WiFi, Bluetooth), capable 32-bit processors. They are a decent replacement at the low end of things, taking the place of the venerable Arduino, which itself was a watershed in its day.

The keen thing about the Espressif modules is how programmable they are. You can use the Arduino IDE, PlatformIO (a Visual Studio Code extension), or Espressif’s own standalone IDE. You can program it like a single CPU with full control of everything, or you can run a real-time OS (FreeRTOS) on it. This makes it super easy to integrate into anything from simple servo motor control to full-on robotics.
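To make that concrete, here is a minimal sketch of what that programmability looks like in practice, assuming the ESP32 Arduino core is installed. The LED pin number is an assumption for illustration; the same chip could just as easily be programmed through ESP-IDF directly.

// Minimal ESP32 sketch (Arduino core assumed). The Arduino core itself runs on
// top of FreeRTOS, so we can mix the familiar setup()/loop() style with a
// FreeRTOS task. The pin number below is an assumption; adjust for your board.
#include <Arduino.h>

static const int kLedPin = 13;

// A FreeRTOS task that blinks an LED forever
void blinkTask(void* /*params*/)
{
    pinMode(kLedPin, OUTPUT);
    for (;;)
    {
        digitalWrite(kLedPin, HIGH);
        vTaskDelay(pdMS_TO_TICKS(500));
        digitalWrite(kLedPin, LOW);
        vTaskDelay(pdMS_TO_TICKS(500));
    }
}

void setup()
{
    // Hand the blink work to the scheduler; loop() stays free for other work
    xTaskCreate(blinkTask, "blink", 2048, nullptr, 1, nullptr);
}

void loop()
{
    // Room here for WiFi, sensors, servo control, etc.
    delay(1000);
}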

As for form factor, I’m currently favoring the Adafruit ‘feather’ forms. The ‘feather’ form factor is a board specification, which puts pins in certain locations, regardless of which actual processor is on the board. This makes it a module that can be easily integrated into designs, because you have known patterns to build around. I have been using the ESP32 Feather V2 primarily.

It’s just enough. USB-C connector for power and programming. Battery connector for easy deployment (the battery charges when USB-C is plugged in). STEMMA QT connector (a tiny 4-pin connector) for easy I2C connection of things like joysticks, sensors, anything on the I2C bus. Antenna built in (WiFi/Bluetooth radio on the right, with the black PCB antenna next to it).

It’s just a handy little package, and my current best “computicle”. You can go even smaller and get ESP32 modules in different packages, but this is the best for prototyping in my lab.

As an aside, I want to mention Adafruit, the company, as a good source for electronic components. You can check out their about page to get their history. Basically, they were created in 2005, and have been cranking out the hits in the maker space ever since. What attracted me to them initially was their tutorials on the components they sell. They have kits and tutorials on how to solder, as well as how to integrate motors into an ESP32 design. Step by step, detailed specs, they’re just good people. They also pursue quality components. I mean, every USB cable is the same, right? Nope, and they go through the myriad options and only sell the best ones. So, if you’re in the market, check them out, at least for their tutorials.

Going up the scale from here, you have “Single Board Computers”. The mindshare leader in this space is definitely the Raspberry Pi. When it sprang onto the scene, there really wasn’t any other option in the sub-$50 range. Since then (2012 or so), there has been an entire renaissance and explosion of single board computers. They are typically characterized by: an ARM-based processor, 4-8GB of RAM, USB power, HDMI output, a couple of rows of IO pins, and running Linux (typically Ubuntu).

I’ve certainly purchased my share of Raspberry Pi boards and variants. I tend to favor those coming from Hard Kernel. I find their board innovations over the years to be better than what the Pi Foundation is typically doing. Also, they are more readily available. Hard Kernel has commercial customers that use their boards in embedded applications, so they tend to have long-term support for them. Their boards are typically ARM-based and meant to run Linux, but they also offer boards that can run Windows.

Here’s a typical offering: the Odroid M1S.

The one thing that’s critical to have in a single board computer is software support. There are as many single board computers available in the world as there are grains of sand on a beach. What differentiates them is typically the software support, and the community around it. This is why the Raspberry Pi has been so popular. They have core OS support, and a super active community that’s always making contributions.

I find the Odroid boards to be similar, albeit with a much smaller community. They do have core OS support, and typically get whatever changes they make integrated into the mainline Linux development tree.

I am considering this M1S as a brain for machines that need more than what the ESP32 can handle. A typical situation might be a CNC machine, where I want to have a camera to watch how things are going, and make adjustments if things are out of whack. For example, the camera sees that the cutting bit has broken, and will automatically stop the machine. Or, it can see how the material is burring or burning, and make adjustments to feeds and speeds automatically.

For such usage, it’s nice to have the IO pins available, and communication over I2C, CAN bus, or other means should be readily available.

This is reason enough for me to purchase one of these boards. I will look specifically for pieces I can run on it, like OpenCV or some other visual module for the vision stuff. I have another CNC router that I am about to retrofit with new brains, and this could be the main brain, while the ESP32 can be used for the motor control side of things.

Last is the dreamy stuff.

The BeagleV-Fire

This is the latest creation of BeagleBoard.org. This organization is awesome because they are dedicated to creating “open source” hardware designs. That’s useful to the community because it means various people will create variants of the board for different uses, making the whole ecosystem more robust.

There are two special things about this board. One is that it uses a RISC-V chip instead of ARM. RISC-V is an instruction set architecture which is itself open and royalty free. It is a counter to the ARM chips, which come with license fees and various restrictions. RISC-V will likely take over the low end of the CPU market in all sorts of applications that have typically used ARM-based chips.

The other feature of these boards is an onboard, integrated FPGA (Field Programmable Gate Array). An FPGA is a chip whose logic, and therefore its IO pins, can be reprogrammed in the field. If you did not have a USB port, or you wanted another one, you could program some pins on the chip to be that kind of port. You can even program an FPGA to emulate a CPU, or other kinds of logic chips. Very flexible stuff, although challenging to program.

I’ve had various FPGA boards in the past, even ones integrated with a CPU. This particular board is another iteration of the theme, done by an organization that has been a strong contributor in the maker community for quite some time.

Why won’t I buy this board, as much as I want to? I don’t have an immediate need for it. I want to explore FPGA programming, and this may or may not be the best way to learn that. But, I don’t have an immediate need. Getting an Odroid for creating a smarter CNC makes sense right now, so one of those boards is a more likely purchase in the near term. It might be that in my explorations of CNC, I find myself saying “I need the programmability the BeagleV-Fire has to offer”, but it will be a discovery based on usage, rather than a raw “I want one!”, which is a departure from my past tinkerings.

At this point, I don’t need to explore above single board computers. They are more than powerful enough for the kinds of things I am exploring, so nothing about rack-mountable servers and Kubernetes clusters.

At the low end, ESP32 as my computicles. At the high end, Hard Kernel based Single Board Computers for brains.


Tinkerer’s Closet – Hardware Refresh

I am a tinkerer by birth. I’ve been fiddling about with stuff since I was old enough to take a screwdriver off my dad’s workbench. I’ve done mechanical things, home repair, woodworking, gardening, 3D printing, lasering, just a little bit of everything over the years. While my main profession for about 40 years has been software development, I make the rounds through my various hobbies on a fairly regular basis.

Back around 2010, it was the age of 3D printers and IoT devices. I should say, it was the genesis of those movements, so things were a little rough. 3D printers, for example, are almost formulaic at this point. Kits are easily obtained, and finished products can be had for $300-$400 for something that would totally blow away what we had in 2010.

At that time, I was playing around with tiny devices as well. How to make a light turn on from the internet. How to turn anything on, from a simple radio controller. As such, I was into Arduino microcontrollers, which were making the rounds of popularity, and much later, the Raspberry Pi and other “Single Board Computers”. There were also tons of sensor modules (temperature, accelerometers, light, moisture, motion, etc), and little radio transmitters and receivers. The protocols were things like Zigbee, and just raw radio waves that could be decoded into ASCII streams.

As such, I accumulated quite a lot of kit to cover all the bases. My general motto was: “Buy two of everything, because if one breaks…”

The purchasing playground for all this kit was limited to a few choice vendors. In the past it would have been Radio Shack and Heathkit, but in 2010, it was:

Adafruit

SeeedStudio

SparkFun

There were myriad other creators coming up with various dev boards, like the low-power JeeLabs, or Dangerous Prototypes and their Bus Pirate product (still going today). But mostly, their stuff would end up at one of these reliable vendors, alongside the vendors’ own creations.

Lately, and why I’m writing this missive, I’ve been looking at the landscape of my workshop, wanting to regain some space, and make space for new projects. As such, I started looking through those hidey holes, where electronics components tend to hide, and hang out for generations. I’ve started going through the plastic bins, looking for things that are truly out of date, no longer needed, never going to find their way into a project, no longer supported by anyone, and generally, just taking up space.

To wit, I’ve got a growing list of things headed for the scrap heap:

433MHz RF link kit, 915MHz RF link kit, various versions of Arduinos, various versions of Raspberry Pi, TV-B-Gone kit (built one, tossing the other, maybe save for soldering practice for the kids), various Zigbee modules, Parallax Propeller (real neat stuff), SIM card reader, Gadget Factory FPGA boards and wings, trinkets, wearables, and myriad other things as kits, boards, and what have you.

I’m sad to see it go, knowing how lovingly I put it all together over the years. But most of that stuff is from 13 years ago. Things have advanced since then.

It used to be that the “Arduino” was the dominant microcontroller and form factor for very small projects. Those boards could run $30, and weren’t much compared to what we have today. Nowadays, the new kids in town are the ESP32 line of compute modules, along with form factors such as the Adafruit supported “Feather”. A lot of the modules you used to buy separately, like WiFi, are just a part of the chip package, along with Bluetooth. Even the battery charging circuitry, which used to be a whole separate board, is just a part of the module now. I can buy a feather board for $15, and it will have LiPo charging circuitry, USB-C connectivity for power and programming, WiFi (802.11 b/g/n), and Bluetooth LE. The same board will have 8 or 16MB of flash, and possibly even dual cores! That’s about $100 worth of components from 2010, all shrunk down to a board about the size of my big thumb. Yes, it’s definitely time to hit refresh.

So, I’m getting rid of all this old stuff, with a tear in my eye, but a smile on my face, because there’s new stuff to be purchased!! The hobby will continue.

I’m happily building new machines, so my purchases are more targeted than the general education I was pursuing back then. New CPUs, new instruction sets, new data sheets, new capabilities, dreams, and possibilities. It’s both a sad and joyous day, because some of the new stuff at the low end even has the words “AI Enabled” on it, so let’s see.


Embodied AI – Software seeking hardware

The “AI” space, writ large, covers an array of different topics. At this moment in time, the Large Language Models (LLMs) have captured everyone’s imagination, due to their uncanny ability to give seemingly good answers to a number of run-of-the-mill questions. I have been using ChatGPT specifically for the past year or so, and have found it to be a useful companion for certain tasks. The combination I use is GitHub Copilot, in my Visual Studio development environment, and ChatGPT on the side. Copilot is great for doing very sophisticated copy and paste based on comments I type in my code. ChatGPT is good for exploring new areas I’m not familiar with, and making suggestions as to things I can try.

That’s great stuff, and Microsoft isn’t the only game in town now. Google with their Bard/Gemini is coming right along the same path, and Facebook isn’t far behind with their various Llama-based offerings. I am currently exploring beyond what the LLM models provide.

One of the great benefits I see of AI is the ability to help automate various tasks. Earlier in the 20th century, we electrified and motorized a lot of tasks, carrying the industrial revolution forward and giving us everything from cars to tractors, trains, airplanes, and rockets. Now we sit at a similar nexus. We have the means to not just motorize everything, but to give everything a little bit of intelligence as well. What I’m really after in this is the ability to create more complex machines, without having to spend the months and years to develop the software to run them. I want them to ‘learn’. I believe this can make the means of production of goods accessible to a much broader base of the population than ever before.

What I’m talking about is manufacturing at the speed of thought. A facility where this is done is a manufactory.

In my idealized manufactory, I have various semi-intelligent machines that are capable of learning how to perform various tasks. At a high level, I want to simply think about, and perhaps visualize a piece of furniture, turn to my manufactory and say “I need a queen sized bed, with four posts, that I can assemble using a screwdriver”. What ensues is what you might expect from a session with ChatGPT, a suggestion of options, some visualization with some sort of Dall-E piece, and ultimately an actual plan that shows the various pieces that need to be cut, and how to assemble them. I would then turn these plans over to the manufactory and simply say “make it so”, and the machinery would spring into life, cutting, shaping, printing, all the necessary pieces, and delivering them to me. Bonus if there is an assembly robot that I can hire to actually put it together in my bedroom.

Well, this is pure fantasy at this moment in time, but I have no doubt it is achievable. To that end, I’ve been exploring various kinds of machines from first principles to determine where the intelligence needs to be placed in order to speed up the process.

I am interested in three kinds of machines:

CNC Router – Essentially a router, or spindle, which has a spinning cutting bit. Typically rides on a gantry across a flat surface, and is capable of carving pieces.

3D Printer – Automated hot glue gun. The workhorse of plastic part generation. Basically a hot glue gun mounted to a tool head that can be moved in a 3D space to additively create a workpiece.

Robotic Arm – Typically with 5 or 6 joints, can have various tools mounted to the end. Good for many different kinds of tasks from welding, to picking stuff up, to packing items into a box.

There are plenty of other base machines, including laser cutters, milling machines, lathes, and presses, but I’ve chosen these three because they represent different enough capabilities, but they’re all relatively easy to build using standard tools that I have on hand. So, what’s interesting, and what does AI have to do with it?

Let’s look at the 3D Printer.

the100 – This is a relatively small 3D printer where most of the parts are 3D printed. The other distinction it holds is that it’s super fast when it prints, rivaling anything in the consumer or commercial realm. The printability is what drew me to this one, because it means all I need to get started is another relatively inexpensive ($300) 3D printer. And of course once the100 is built, it can 3D print the next version, even faster, and so on and so forth.

The thing about this, and all tools, is they have a kinematic model. That is, they have some motors, belts, pulleys, etc. Combined, these guts determine that this is a machine capable of moving a toolhead in a 3D space in a certain way. I can raise and lower the print bed in the Z direction. I can move the tool head in the XY direction. The model also has some constraints, such as speed limits based on the motors and other components I’m using. There are also constraints as to the size of the area within which it can move.
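As a concrete illustration, here is a minimal sketch of what capturing such a kinematic description in code might look like. The axis names, travel limits, and speeds are hypothetical numbers for a small Cartesian machine, not values from any real firmware.

// A hypothetical kinematic description for a small Cartesian machine.
// Real firmware (Klipper, Marlin, etc.) carries much more than this.
#include <cstdio>

struct AxisLimits {
    double minPos;       // mm
    double maxPos;       // mm
    double maxVelocity;  // mm/s
    double maxAccel;     // mm/s^2
};

struct CartesianKinematics {
    AxisLimits x, y, z;
    double stepsPerMm;   // motor steps per millimeter of travel

    // Is a requested position inside the machine's working volume?
    bool withinBounds(double px, double py, double pz) const {
        return px >= x.minPos && px <= x.maxPos &&
               py >= y.minPos && py <= y.maxPos &&
               pz >= z.minPos && pz <= z.maxPos;
    }
};

int main() {
    CartesianKinematics printer{
        {0.0, 220.0, 300.0, 3000.0},   // X
        {0.0, 220.0, 300.0, 3000.0},   // Y
        {0.0, 250.0,  10.0,  100.0},   // Z
        80.0
    };
    std::printf("move (110,110,50) allowed: %s\n",
                printer.withinBounds(110, 110, 50) ? "yes" : "no");
    return 0;
}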

The way this is all handled today is clever people come up with the programs that tie all this stuff together. We hard code the kinematic model into the software, and run something like Klipper, or Marlin, or various others, which take all that information, are fed a stream of commands (gcode), and know how to make the motors move in the right way to execute the commands.

There is typically a motherboard in these machines that has a combination of motor control and motion control, all wrapped up in a tight package.

I want to separate these things. I want motor control to be explicit, and here I want to inject a bit of AI. In order to ’embody’ AI, I need to teach a model about its kinematics. From there, I want to train it on how to move based on those kinematics. I don’t want to write the code telling it every step of how to move from point A to B, which is what we do now. I want to let it flop around, giving it positive reinforcement when it does the right thing, and negative when it doesn’t. Just like we do with cars, just like we do with characters in video games. This is the first step of embodiment. Let the machine know its senses and actuators, and encourage it to learn how to use itself to perform a task.

Basic motor control is something the model needs to be told, as part of the kinematic model. Motion control is the next level up. Given a task, such as ‘draw a curved line from here to there’, which motors to engage, for how long, in which sequence, when to accelerate, how fast, how long, best deceleration curve, that’s all part of the motion control, and something a second level of intelligence needs to learn.
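To make the acceleration and deceleration part concrete, here is a small self-contained sketch of one classic motion-control calculation: a trapezoidal velocity profile for a single straight move. The distance, speed, and acceleration numbers are hypothetical; real planners like Klipper or Marlin also handle lookahead across many moves, per-axis limits, cornering, and more.

// Trapezoidal velocity profile for one straight move (illustrative only).
// Accelerate to a cruise speed, cruise, then decelerate; if the move is too
// short to reach cruise speed, fall back to a triangular profile.
#include <cmath>
#include <cstdio>

int main()
{
    const double distance = 120.0;   // mm to travel
    const double vMax     = 200.0;   // mm/s cruise speed
    const double accel    = 1500.0;  // mm/s^2

    // Distance consumed by each ramp (accelerate up, decelerate down)
    const double rampDist = (vMax * vMax) / (2.0 * accel);

    if (2.0 * rampDist <= distance) {
        // Full trapezoid: ramp up, cruise, ramp down
        const double tRamp   = vMax / accel;
        const double tCruise = (distance - 2.0 * rampDist) / vMax;
        std::printf("accelerate %.3fs, cruise %.3fs, decelerate %.3fs\n",
                    tRamp, tCruise, tRamp);
    } else {
        // Triangle: the move is too short to ever reach vMax
        const double vPeak = std::sqrt(distance * accel);
        std::printf("accelerate to %.1f mm/s, then decelerate (%.3fs each way)\n",
                    vPeak, vPeak / accel);
    }
    return 0;
}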

On top of all that, you want to layer an ability to translate from one domain to another. As a human, or perhaps another entity in the manufacturing process, I’m going to hand you a ‘.stl’ or ‘.step’ or various other kinds of design files. You will then need to translate that into the series of commands you understand you can give to your embodied self to carry out the task of creating the item.

But, it all starts down in motor control, and kinematic modeling.

Next up is the CNC Router

This is the Lowrider 3 by V1 Engineering. What’s special here, again, is the ease of creating the machine. It has mostly 3D printed parts, and uses standard components that can be found at a local hardware store. At its core is a motor controller, which is very similar to the ones used in the 3D printer case. Here again, the machine is running in a pretty constrained 3D space, and the motor control is very similar to that of the 3D printer. These two devices run off different motherboards, but I will be changing that so they essentially run with the same brain when it comes to their basic motor control and kinematic understanding.

Whereas the 3D printer is good for small parts (like the ones used to construct this larger machine), the CNC router, in this case, is good for cutting and shaping sheet goods, like large 4ft x 8ft sheets of plywood for cabinet and furniture making. Giving this platform intelligence gives us the ability to send it a cut list for a piece of furniture and have it figure that out and just do it.

Of course, these capabilities exist in larger industrial machines, that have typically been programmed, and are tied to CAD/CAM software. Here though, I’m after something different. I don’t want to “program” it, I want to teach it, starting from the base principles of its own kinematics.

Last is the venerable Robot Arm

Here, I am building a version of the AR4 MK2 robot arm from Annin Robotics.

This machine represents a departure from the other two, with 6 degrees of freedom (shoulder, elbow, wrist, etc). The motors are larger than those found in the 3D printer or CNC router, but how to control and sense them is relatively the same. So, again, ultimately I want to separate sense and motor control from motion control. I will describe a kinematic model, and have the bot learn how to move itself based on reinforcement learning on that model.

All of this is possible now because of the state of the technology. Microcontrollers, or very small computers, are more than capable of handling the complex instructions to control a set of motors. This is a departure from just 10 years ago, when I needed a complete real-time Linux PC with a parallel port just to control the motors. Now I can do it with an ESP32-based device that costs less than $20, and can run off a hobby battery. Similarly, the cost of ‘intelligence’ keeps dropping. There are runtimes such as llama.cpp which can run LLMs on a Raspberry Pi class machine, which can be easily incorporated into these robot frames.

So, my general approach to creating the manufactory is to create these robot frames from first principles, and embody them with AI as low as we can go, then build up intelligence from there.

At this time, I have completed the AR4 arm, and the Lowrider CNC. the100 printer is in progress, and should complete in a couple of weeks. Then begins the task of creating the software to animate them all, run simulations, train models, and see where we get to.


Build vs Buy in software development – Part 1, the selection criteria

There is a common theme throughout life: “Should I build, or buy?” It doesn’t seem to matter which thing is under consideration; the process we use to make the decision should remain the same. This is a multi-part series on making the build vs buy decision for software projects.

This is WAAV Studio, a rapidly evolving application used for “Smart City” visualization and management. There are several components that go into it, and plenty of ‘build vs buy’ decisions to be made.

Here are the considerations I’ve had in deciding which way to go on several components:

  • What is the scope of the component – how broad and fundamental is it?
  • Do I have the expertise to build it?
  • How long will it take to integrate a purchased component?
  • How long will it take to build it myself?
  • What is the cost associated with purchasing it?
  • How will maintenance look over time?
  • The solution must be easy to use
  • The solution must be portable to multiple platforms
  • The solution must be small, fast, and intuitive

There are perhaps a couple more things to consider, but these are the highlights. There are some market considerations as well, but they essentially boil down to:

How important is time to market?

and

What is your budget?

I’ll start with WAAV Studio’s time to market and budget. Time to market in this case is measured in months. A polished product needs to be available within a 12 month period from when I started (January). The product must meet various milestones of usability along the way, but overall, the 1.0 version must be user friendly within 12 months.

Second, is the budget. This is the product of a very small team, using tools such as Copilot, and ChatGPT, as well as their core skills and programming tools. There is effectively no real budget to speak of, in terms of purchasing other bits of software.

With those hard constraints, I consider the list of ‘modules’ that need to be in this application.

  • Geospatial mapping
  • Visualize various geospatial data sets
    • KMZ
    • GeoJSON
    • Shapefile
    • .csv data sets
  • Visualize various graphics assets
    • .png images
    • .gif images
    • .jpeg images
    • .svg images

That’s the rough list, with lots of detail on each item. The biggest and first build/buy decision was around mapping visualization. The easy and typical answer would be “just use Google Earth”, or something like that. Even before that, it might be “use ArcGIS”, or any number of GIS software packages. Going this route might be expedient, but it will lead you down a path of being constrained by whatever that core package is capable of.

A few of the criteria are around ease of use, and size. This application is something a typical city administrator will use on occasion. Their key job function might not be related to this software, so they need to be able to come back to it after 3 months of absence and still pick it up and use it effectively to achieve some immediate goal. This is a hard one to achieve, and has to do with the UI elements, their layout, and natural usage. Again, when you select a core package, you may not have enough control over these factors to have a satisfying outcome. ArcGIS, for example, has all the features a GIS professional could possibly want. The package size is measured in the tens of megabytes, and the user’s manual would make an encyclopedia blush if it were printed on paper. This is not an app that can be picked up by your typical city clerk in a 10 minute session, let alone mastered six months later, without constant usage.

First decision: Create a core mapping component from scratch, without reliance, or dependence on any existing mapping components.

This is such a core, fundamental decision that it drives decision making across the other components, so it had better be good.

I have never built a mapping platform before WAAV Studio, so I started with the naive notion that I could actually do it. I mean, how hard could it be, right? All software engineers have the urge to build from scratch, and to just jump onto any coding challenge. My years of wisdom told me I had better have a way to evaluate my progress on the task, and determine if it was time to abandon my naive first choice for a better choice later down the line.

In the next part, I’ll look into what goes into the core mapping platform, and which other components went through the build vs buy decision.


SVG From The Ground Up – Time to wrap it up

This time around, we’re in the final stretch, going from a file, through the parsing, creating an object model, and finally, rendering an image.

To recap:

We started with low level building blocks to scan byte streams: Parsing Fundamentals

We got into the guts of XML parsing: It’s XML, how hard could it be

We then looked at core data structures, and how to parse their content: Along the right path

Most recently, we went over several drawing primitives and data structures: Can you imagine that

This series is a reflection on the code that can be found in this repository: svg2b2d, so you can follow along, and freely use it to make your own creations.

Now let’s get back to the build…

Thus far in the series, we’ve been looking at the guts of things, essentially from the bottom up. For this final installment, I’m going to go the other way around, and start from the end result and work back towards the beginning.

The goal I have for a program is to turn a .svg file into a .png file. That is, take the .svg, parse it, render it into a bitmap image, and save that image as a .png. We’ll call this program svg2png, and here it is:

#include "blend2d.h"
#include "mmap.h"
#include "svg.h"

using namespace filemapper;


int main(int argc, char** argv)
{
    if (argc < 2)
    {
        printf("Usage: genimage <svg file>  [output file]\n");
        return 1;
    }

    // create an mmap for the specified file
    const char *filename = argv[1];
    auto mapped = mmap::createShared(filename);

    if (mapped == nullptr)
        return 0;

    // Create the BLImage we're going to draw into
    BLImage outImage(420, 340, BL_FORMAT_PRGB32);

    // parse the svg, and draw it into the image
    parseSVG(mapped->data(), mapped->size(), outImage);
    
    // save the image to a png file
    const char *output = argc > 2 ? argv[2] : "output.png";
    outImage.writeToFile(output);

    // close the mapped file
    mapped->close();


    return 0;
}

We’ve seen bits and pieces of this along the way. First, we get a filename from the command line, and create a memory mapped file from it. That allows us to deal with the contents as a pointer in memory, which makes the parsing really easy. Then we set up an initial BLImage object. The fact that it starts out at a certain size doesn’t matter; it will be changed later in the parsing process. Then there’s the call to parseSVG, which is the entry point to parsing SVG in this case. And finally, we output the image using an inbuilt capability of the blend2d library, ‘outImage.writeToFile()’, and we’re done! What could be easier?
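As a usage example (the file names here are hypothetical), running ‘svg2png drawing.svg drawing.png’ maps drawing.svg into memory, renders it, and writes drawing.png; leave off the second argument and, as the code above shows, it falls back to writing output.png.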

Let’s take a look at parseSVG(), since that’s where the real action is.

#include "svg.h"
#include "svgshapes.h"
#include "bspanutil.h"

#include <vector>
#include <memory>

bool parseSVG(const void* bytes, const size_t sz, BLImage& outImage)
{
    svg2b2d::ByteSpan inChunk(bytes, sz);
    
    // Create a new document
    svg2b2d::SVGDocument doc;

    // Load the document from the data
    doc.readFromData(inChunk);
    
    
    // Draw the document into a IRender
    outImage.create(doc.width(), doc.height(), BL_FORMAT_PRGB32);
    SVGRenderer ctx(outImage);
    doc.draw(ctx);
    ctx.end();
    
    return true;
}
  • Create a ByteSpan for the pointer and size that we’ve been given (the memory mapped file)
  • Create an instance of an SVGDocument (a container to hold what we parse)
  • Read/parse the contents, filling in the SVGDocument
  • Resize the image to match the size of the document
  • Create a rendering context and connect it to the image
  • Draw the document into the rendering context
  • Done

Light and simple. So, let’s go one step further, and look at that ‘readFromData()’, by examining the whole SVGDocument object

	struct SVGDocument : public IDrawable
	{

		// All the drawable nodes within this document
		std::shared_ptr<SVGRootNode> fRootNode{};
		std::vector<std::shared_ptr<IDrawable>> fShapes{};
		BLBox fExtent{};

		SVGDocument() = default;

		double width() const { 
			if (fRootNode == nullptr)
				return 100;
			return fRootNode->width();
        }

		double height() const {
			if (fRootNode == nullptr)
				return 100;
			return fRootNode->height();
		}

		void draw(IRender& ctx) override
		{
			for (auto& shape : fShapes)
			{
				shape->draw(ctx);
			}
		}

		// Add a node that can be drawn
		void addNode(std::shared_ptr<SVGObject> node)
		{
			fShapes.push_back(node);
		}


		// Load the document from an XML Iterator
		// Since this is the top level document, we just want to kick
		// off loading the root node 'svg', and we're done 
		void loadFromIterator(XmlElementIterator& iter)
		{

			// skip past any elements that come before the 'svg' element
			while (iter)
			{
				const XmlElement& elem = *iter;

				if (!elem)
					break;

				//printXmlElement(*iter);

				// Skip over these node types we don't know how to process
				if (elem.isComment() || elem.isContent() || elem.isProcessingInstruction())
				{
					iter++;
					continue;
				}


				if (elem.isStart() && (elem.name() == "svg"))
				{
                    // There should be only one root node in a document, so we should 
                    // break here, but, curiosity...
                    fRootNode = SVGRootNode::createFromIterator(iter);
                    if (fRootNode != nullptr)
                    {
                        addNode(fRootNode);
                    }
				}

				iter++;
			}
		}

		bool readFromData(const ByteSpan &inChunk)
		{
			ByteSpan s = inChunk;

			XmlElementIterator iter(s);

			loadFromIterator(iter);

			return true;
		}


	};

The SVGDocument achieves two primary things. The first is to parse the raw SVG and turn it into a structured thing that can later be rendered, or have some other processing performed on it. The second is to provide a convenient entry point for rendering into a drawing context.

The readFromData() call should be familiar. It’s just XML after all, isn’t it? So, create an XmlElementIterator on the chunk of memory we were passed, and get to parsing. The ‘loadFromIterator()’ function above it is the one that takes the individual nodes and does something with them. In this case, we’re only interested in the top level ‘<svg>’ node, so when we see that, we tell it to load itself from the iterator, and we save it as our root node.

The SVGRootNode itself isn’t much more complicated, and does a similar act

	struct SVGRootNode : public SVGGroup
	{
		std::shared_ptr<SVGPortal> fPortal;

		SVGRootNode() :SVGGroup(nullptr) { setRoot(this); }
		SVGRootNode(IMapSVGNodes *root)
			: SVGGroup(root)
		{
			setRoot(this);
		}
		
		double width()
		{
			if (fPortal != nullptr)
				return fPortal->width();
			return 100;
		}

		double height()
		{
			if (fPortal != nullptr)
				return fPortal->height();
			return 100;
		}

		void loadSelfFromXml(const XmlElement& elem) override
		{
			SVGGroup::loadSelfFromXml(elem);
			
			fPortal = SVGPortal::createFromXml(root(), elem, "portal");
		}

		static std::shared_ptr<SVGRootNode> createFromIterator(XmlElementIterator& iter)
		{
			auto node = std::make_shared<SVGRootNode>();
			node->loadFromIterator(iter);

			return node;
		}

		void draw(IRender& ctx) override
		{
			ctx.save();

			// Start with default state

			ctx.setFillStyle(BLRgba32(0, 0, 0));
			
			ctx.setStrokeStyle(BLRgba32(0));
			ctx.setStrokeWidth(1.0);
			ctx.textSize(16);
			
			// Apply attributes that have been gathered
			// in the case of the root node, it's mostly the viewport
			applyAttributes(ctx);

			// Draw the children
			drawSelf(ctx);

			ctx.restore();
		}

	};

The fact that it’s a subclass of SVGGroup takes care of some boilerplate code for loading self-enclosing elements, grouped elements, specialized elements, and the like. The svgshapes.h file contains the nitty-gritty details, so check there rather than have me bore you with it all here. You can see that in the drawing routine, we set up the drawing context to have the default values the SVG environment expects. There are things such as having no stroke, but a black fill, to start. There are other items such as setting up the drawing coordinates according to the ‘viewBox’ on the svg element, if it exists, and that happens in the ‘applyAttributes()’ function call.

Here’s another picture to keep you interested in the possibilities.

One last guidepost in the code, before we wrap this up. The SVGCompoundNode object is most important for the document structure. That’s where the ‘loadFromIterator()’ function lives. Classes such as SVGGroup, and SVGGradient, descend from there, and just implement a few calls to deal with further grouped things. So, that’s a piece of code to take a look at. It’s structured and organized to make it simple for sub-classes to just add a little bit here and there to specialize for given situations. Otherwise, it’s meant to be a relatively safe default to handle the processing of nodes, whether they be self contained, or compound.

And that’s about it. We’ve gone from the beginnings of how to scan stuff at a byte level, all the way through a simple XML parser, and into the complexities of parsing SVG types, constructing a tree to be rendered, and saving it as an image. All of the images shared in these articles have been rendered using the code built here, so it’s capable of doing fairly complex SVGs, beyond the typical rendering of the Ghostscript tiger. From here, if you have a need for SVG in your own code, you can pretty much just lift the svg2b2d code directly, and start using it. I did not cover doing text in SVG in this article series, but it’s actually not as hard as it might seem, simply because the blend2d library already deals with text rendering as well. Normally you’d have to contemplate using freetype, or stb_xxx something or other to get text, which just increases your surface area. With blend2d, you don’t; it does all that as well.

So, there you have it. SVG From The Ground Up! A step by step guide on how to go from bytes in a file to bits on the screen. I hope this helps those who are inspired to simply learn some of the details, if not those who actually want to implement their own. I am personally using SVG for visualization and UI elements. You can imagine the common refrain of “just use HTML and the browser”, but what’s the fun in that?

Until next time, go parse you some SVG!!


SVG From the Ground Up – Can you imagine that?

It’s time to put the pieces together and get to rendering some SVG!

In the last couple of installments, we were looking at how to do some very low level scanning and parsing. We got through the basics of XML, and looked at some geometry with the parsing of the SVG <path> ‘d’ attribute. The next step is to decide how we’re going to represent an entire document in memory so that we can render the whole image. This is a pretty big subject, so we’ll start with some design constraints and goals.

At the end of the day, I want to turn the .svg file into bits on the screen. The blend2d graphics library has a BLContext object, which does all the drawing that I need. It can draw everything from individual lines, to gradient shaded polygons, and bitmaps. SVG has some particular drawing requirements, in terms of which elements are drawn first, how they are styled, how they are grouped, etc. One example of this is the usage of Cascading Style Sheets (CSS). What this means is that if I turn on one attribute, such as a fill color for polygons, that attribute will be applied to all subsequent elements in a tree drawn after it, until something changes.

Example:

<svg
 viewBox='10 10 190 10'
 xmlns="http://www.w3.org/2000/svg">
<g stroke="red" stroke-width='4'>
  <line x1='10' y1='10' x2='200' y2='200'/>
  <line stroke='green' x1='10' y1='100' x2='200' y2='200'/>
  <line stroke-width='8' stroke='blue' x1='10' y1='200' x2='200' y2='200'/>
  <rect x='100' y='10' width='50' height='50' />
</g>
</svg>

The ‘<g…’ serves as a grouping mechanism. It allows you to set some attributes which will apply to all the elements that are within that group. In this case, I set the stroke (the color used to draw lines) to ‘red’. Until something later changes this, the default will be red lines. I also set the stroke-width (number of pixels used to draw the lines). So, again, unless it is explicitly changed, all subsequent lines in the group will have this width.

The first line drawn, since it does not change the color or the width, uses red, and 4.

The second line drawn, changes the color to ‘green’, but does not change the width.

The third line drawn, changes the color to blue, and changes the width to 8

The rectangle does not explicitly change the color, so it gets red line drawing with a width of 4, and a default fill color of black.

Of note, changing the attributes on a single element, such as the green line, does not change that attribute for sibling elements; it only applies to that single element. Only attributes applied at the group level, from above, will affect the elements within that group.

This imposes some of our first requirements. We need an object that can contain drawing attributes. In addition, there’s a difference between objects that contain the attributes, such as stroke-width, stroke, fill, etc, and actual geometry, such as line, polygon, path. I will drop the SVGObject definition here, as it is the baseline. If you want to follow along, the code is in the svgtypes.h file.


struct IMapSVGNodes;    // forward declaration



struct SVGObject : public IDrawable
{
    XmlElement fSourceElement;

    IMapSVGNodes* fRoot{ nullptr };
    std::string fName{};    // The tag name of the element
    BLVar fVar{};
    bool fIsVisible{ false };
    BLBox fExtent{};



    SVGObject() = delete;
    SVGObject(const SVGObject& other) :fName(other.fName) {}
    SVGObject(IMapSVGNodes* root) :fRoot(root) {}
    virtual ~SVGObject() = default;

    SVGObject& operator=(const SVGObject& other) {
        fRoot = other.fRoot;
        fName = other.fName;
        fVar = other.fVar;

        return *this;
    }

    IMapSVGNodes* root() const { return fRoot; }
    virtual void setRoot(IMapSVGNodes* root) { fRoot = root; }

    const std::string& name() const { return fName; }
    void setName(const std::string& name) { fName = name; }

    const bool visible() const { return fIsVisible; }
    void setVisible(bool visible) { fIsVisible = visible; }

    const XmlElement& sourceElement() const { return fSourceElement; }

    // sub-classes should return something interesting as BLVar
    // This can be used for styling, so images, colors, patterns, gradients, etc
    virtual const BLVar& getVariant()
    {
        return fVar;
    }

    void draw(IRender& ctx) override
    {
        ;// draw the object
    }

    virtual void loadSelfFromXml(const XmlElement& elem)
    {
        ;
    }

    virtual void loadFromXmlElement(const svg2b2d::XmlElement& elem)
    {
        fSourceElement = elem;

        // load the common attributes
        setName(elem.name());

        // call to loadselffromxml
        // so sub-class can do its own loading
        loadSelfFromXml(elem);
    }
};

As a base object, it contains the bare minimum that is common across all subsequent objects. It also has a couple of extras which have proven to be convenient, if not strictly necessary.

The strictly necessary part is the ‘void draw(IRender &ctx)’ method. Almost all objects, whether they be attributes or elements, will need to affect the drawing context. So, they all need to be given a chance to do that. The ‘draw()’ routine is what gives them that chance.

All objects need to be able to construct themselves from the XML element stream, so the convenient ‘load..’ functions sit here. Whether it’s an element or an attribute, it has a name, so we set the name as well. Attributes can set their name independently from being loaded from the XmlElement, so this is a bit of specialization, but it’s ok.

There is a bit of an oddity in the forward declaration of ‘struct IMapSVGNodes; // forward declaration’. As we’ll see much later, we need the ability to look up nodes based on an ID, so we need an interface somewhere that allows us to do that. As every node constructed might need to do this, we need a way to pass this interface down the tree, without copying it, and without causing circular references; hence the forward declaration, and the use of the ‘root()’ method.
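For a sense of what that lookup interface might look like, here is a rough sketch. The actual definition lives in the repository’s svgtypes.h; the method names below are assumptions for illustration, not the library’s API.

// A rough, hypothetical sketch of an id-lookup interface. Nodes get a pointer
// to this (via root()) so that things like url(#SVGID_1) references can be
// resolved later, without copying the tree or creating circular references.
#include <memory>
#include <string>

struct SVGObject;   // forward declaration, as in the library

struct IMapSVGNodes
{
    virtual ~IMapSVGNodes() = default;

    // Remember a node under its 'id' attribute
    virtual void addDefinition(const std::string& id,
                               std::shared_ptr<SVGObject> node) = 0;

    // Find a previously remembered node, or nullptr if it doesn't exist
    virtual std::shared_ptr<SVGObject> findNodeById(const std::string& id) = 0;
};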

That’s got us started. We now have something of a base object.

Next up, we have SVGVisualProperty

// SVGVisualProperty
    // This is meant to be the base class for things that are optionally
    // used to alter the graphics context.
    // If isSet() is true, then the drawSelf() is called.
	// sub-classes should override drawSelf() to do the actual drawing
    //
    // This is used for things like; Paint, Transform, Miter, etc.
    //
    struct SVGVisualProperty :  public SVGObject
    {
        bool fIsSet{ false };

        //SVGVisualProperty() :SVGObject(),fIsSet(false){}
        SVGVisualProperty(IMapSVGNodes *root):SVGObject(root),fIsSet(false){}
        SVGVisualProperty(const SVGVisualProperty& other)
            :SVGObject(other)
            ,fIsSet(other.fIsSet)
        {}

        SVGVisualProperty& operator=(const SVGVisualProperty& rhs)
        {
            SVGObject::operator=(rhs);
            fIsSet = rhs.fIsSet;
            
            return *this;
        }

        void set(const bool value) { fIsSet = value; }
        bool isSet() const { return fIsSet; }

		virtual void loadSelfFromChunk(const ByteSpan& chunk)
        {
			;
        }

        virtual void loadFromChunk(const ByteSpan& chunk)
        {
			loadSelfFromChunk(chunk);
        }
        
        // Apply property to the context conditionally
        virtual void drawSelf(IRender& ctx)
        {
            ;
        }

        void draw(IRender& ctx) override
        {
            if (isSet())
                drawSelf(ctx);
        }

    };

It’s not much, and you might question whether it needs to exist at all. Maybe its couple of routines could just be merged into SVGObject itself. That is a simple design change to contemplate, as the only real attribute introduced here is ‘isSet()’. This is essentially a way to say ‘the value is null’. If I had nullable types, I’d just use that mechanism. But, it also allows you to turn an attribute on and off programmatically, which might turn out to be useful.

Now we can look at a single attribute, the stroke-width, and see how it goes from an xmlElement attribute, to a property in our tree.

    //=========================================================
    // SVGStrokeWidth
    //=========================================================
    
    struct SVGStrokeWidth : public SVGVisualProperty
    {
		double fWidth{ 1.0};

		//SVGStrokeWidth() : SVGVisualProperty() {}
		SVGStrokeWidth(IMapSVGNodes* iMap) : SVGVisualProperty(iMap) {}
		SVGStrokeWidth(const SVGStrokeWidth& other) :SVGVisualProperty(other) { fWidth = other.fWidth; }
        
		SVGStrokeWidth& operator=(const SVGStrokeWidth& rhs)
		{
			SVGVisualProperty::operator=(rhs);
			fWidth = rhs.fWidth;
			return *this;
		}

		void drawSelf(IRender& ctx)
		{
			ctx.setStrokeWidth(fWidth);
		}

		void loadSelfFromChunk(const ByteSpan& inChunk)
		{
			fWidth = toNumber(inChunk);
			set(true);
		}

		static std::shared_ptr<SVGStrokeWidth> createFromChunk(IMapSVGNodes* root, const std::string& name, const ByteSpan& inChunk)
		{
			std::shared_ptr<SVGStrokeWidth> sw = std::make_shared<SVGStrokeWidth>(root);

			// Only load if the chunk is not empty
			if (inChunk)
				sw->loadFromChunk(inChunk);

			return sw;
		}

		static std::shared_ptr<SVGStrokeWidth> createFromXml(IMapSVGNodes* root, const std::string& name, const XmlElement& elem)
		{
			return createFromChunk(root, name, elem.getAttribute(name));
		}
    };

It starts from the ‘createFromXml…’. We can look at the main parsing loop later, but there is a point where we’re looking at the attributes of an element, and we’ll run across the ‘stroke-width’, and call this function. The ‘createFromChunk’ is then called, which then calls loadFromChunk after instantiating an object.

There are a couple more design choices being made here. First is the fact that we’re using ‘std::shared_ptr’. This implies heap allocation of memory, and this is the right place to finally make such a decision. We did not want the XML parser itself to do any allocations, but we’re finally at the point where we need to. It’s possible to not even do allocations here, just have the attributes allocated on the objects that use them. But, since attributes can be shared, it’s easier just to bite the bullet now, and use shared_ptr.

In the case of stroke-width, we want to save the width specified (the call to toNumber()), and when it comes time to apply that width, in ‘drawSelf()’, we make the right call on the drawing context, ‘setStrokeWidth()’. Since the same drawing context is used throughout the rendering process, setting an attribute at one point will make that attribute sticky, until something else changes it, which is the behavior that we want.

I would like to describe the ‘stroke’ and ‘fill’ attributes, but they are actually the largest portions of the library. Setting these attributes can occur in so many different ways, it’s worth taking a look at them. Here I will just show a few ways in which they can be used, so you get a feel for how involved they are:

<line stroke="blue" x1='0' y1='0'  x2='100'  y2='100'/>
<line stroke="rgb(0,0,255)" x1='0' y1='0'  x2='100'  y2='100'/> 
<line stroke="rgba(0,0,255,1.0)" x1='0' y1='0'  x2='100'  y2='100'/> 
<line stroke="rgba(0,0,100%,1.0)" x1='0' y1='0'  x2='100'  y2='100'/> 
<line stroke="rgba(0%,0%,100%,100%)" x1='0' y1='0'  x2='100'  y2='100'/> 
<line style = "stroke:blue" x1='0' y1='0'  x2='100'  y2='100'/> 
<line stroke= "url(#SVGID_1)" x1='0' y1='0'  x2='100'  y2='100'/> 

And more…

There is a bewildering assortment of ways in which you can set a stroke or fill, and they don’t all resolve to a single color value. It can be patterns, gradients, even other graphics. So, it can get pretty intense. The SVGPaint structure does a good job of representing all the possibilities, so take a look at that if you want to see the intimate details.
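To give a flavor of the parsing involved, here is a small self-contained sketch that handles just one of those forms, the plain ‘rgb(r,g,b)’ case. This is not the library’s SVGPaint code; percentages, named colors, gradients, and url(#id) references are deliberately left out.

// Parse only the "rgb(r,g,b)" form with 0-255 components into packed 0xAARRGGBB.
// Everything else a real SVGPaint must handle is intentionally omitted here.
#include <cstdio>

static bool parseRgb(const char* s, unsigned& argb)
{
    int r = 0, g = 0, b = 0;
    if (std::sscanf(s, " rgb ( %d , %d , %d )", &r, &g, &b) != 3)
        return false;

    argb = 0xFF000000u | (unsigned(r) << 16) | (unsigned(g) << 8) | unsigned(b);
    return true;
}

int main()
{
    unsigned c = 0;
    if (parseRgb("rgb(0,0,255)", c))
        std::printf("parsed: 0x%08X\n", c);   // prints 0xFF0000FF
    return 0;
}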

We round out our basic object structures by looking at how shapes are represented.

//
	// SVGVisualNode
	// This is any object that will change the state of the rendering context
	// that's everything from paint that needs to be applied, to geometries
	// that need to be drawn, to line widths, text alignment, and the like.
	// Most things, other than basic attribute type, will be a sub-class of this
	struct SVGVisualNode : public SVGObject
	{

		std::string fId{};      // The id of the element
		std::map<std::string, std::shared_ptr<SVGVisualProperty>> fVisualProperties{};

		SVGVisualNode() = default;
		SVGVisualNode(IMapSVGNodes* root)
			: SVGObject(root)
		{
			setRoot(root);
		}
		SVGVisualNode(const SVGVisualNode& other) :SVGObject(other)
		{
			fId = other.fId;
			fVisualProperties = other.fVisualProperties;
		}


		SVGVisualNode & operator=(const SVGVisualNode& rhs)
		{
			fId = rhs.fId;
			fVisualProperties = rhs.fVisualProperties;
			
			return *this;
		}
		
		const std::string& id() const { return fId; }
		void setId(const std::string& id) { fId = id; }
		
		void loadVisualProperties(const XmlElement& elem)
		{
			// Run Through the property creation routines, generating
			// properties for the ones we find in the XmlElement
			for (auto& propconv : gSVGPropertyCreation)
			{
				// get the named attribute
				auto attrName = propconv.first;

				// We have a property and value, convert to SVGVisualProperty
				// and add it to our map of visual properties
				auto prop = propconv.second(root(), attrName, elem);
				if (prop->isSet())
					fVisualProperties[attrName] = prop;

			}
		}

		void setCommonVisualProperties(const XmlElement &elem)
		{
			// load the common stuff that doesn't require
			// any additional processing
			loadVisualProperties(elem);

			// Handle the style attribute separately by turning
			// it into a standalone XmlElement, and then loading
			// that like a normal element, by running through the properties again
			// It's ok if there were already styles in separate attributes of the
			// original elem, because anything in the 'style' attribute is supposed
			// to override whatever was there.
			auto styleChunk = elem.getAttribute("style");

			if (styleChunk) {
				// Create an XML Element to hang the style properties on as attributes
				XmlElement styleElement{};

				// use CSSInlineIterator to iterate through the key value pairs
				// creating a visual attribute, using the gSVGPropertyCreation map
				CSSInlineStyleIterator iter(styleChunk);

				while (iter.next())
				{
					std::string name = std::string((*iter).first.fStart, (*iter).first.fEnd);
					if (!name.empty() && (*iter).second)
					{
						styleElement.addAttribute(name, (*iter).second);
					}
				}

				loadVisualProperties(styleElement);
			}

			// Deal with any more attributes that need special handling
		}

		void loadSelfFromXml(const XmlElement& elem) override
		{
			SVGObject::loadSelfFromXml(elem);
			
			auto id = elem.getAttribute("id");
			if (id)
				setId(std::string(id.fStart, id.fEnd));

			
			setCommonVisualProperties(elem);
		}
		
		// Contains styling attributes
		void applyAttributes(IRender& ctx)
		{
			for (auto& prop : fVisualProperties) {
				prop.second->draw(ctx);
			}
		}
		
		virtual void drawSelf(IRender& ctx)
		{
			;

		}
		
		void draw(IRender& ctx) override
		{
			ctx.save();
			
			applyAttributes(ctx);

			drawSelf(ctx);

			ctx.restore();
		}
	};

We are building up nodes in a tree structure. The SVGVisualNode is essentially the primary node of that construction. At the end of all the tree construction, we want to end up with a root node where we can just call ‘draw(context)’, and have it render itself into the context. That node needs to deal with the Cascading Styles, children drawing in the proper order (painter’s algorithm), deal with all the attributes, and context state.

Of particular note, right there at the end, is the ‘draw()’ method. It starts with ‘ctx.save()’ and finishes with ‘ctx.restore()’. This is critical to maintaining the design constraint of ‘attributes are applied locally in the tree’. So, we save the state of the context coming in, make whatever changes we, or our children, will make, then restore the state upon exit. This is the essential operation required to maintain proper application of drawing attributes. Luckily, or rather by design, the blend2d library makes saving and restoring state very fast and efficient. If the base library did not have this facility, it would be up to our code to maintain this state.

Another note here is ‘applyAttributes’. This is what allows things such as the ‘<g>’ element to apply attributes at a high level in the tree, and subsequent elements don’t have to worry about it. They can just apply the attributes that they alter. And where do those common attributes come from?

	static std::map<std::string, std::function<std::shared_ptr<SVGVisualProperty> (IMapSVGNodes *root, const std::string& , const XmlElement& )>> gSVGPropertyCreation = {
		{"fill", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem) {return SVGPaint::createFromXml(root, "fill", elem ); } }
		,{"fill-rule", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem) {return SVGFillRule::createFromXml(root, "fill-rule", elem); } }
		,{"font-size", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem) {return SVGFontSize::createFromXml(root, "font-size", elem); } }
		,{"opacity", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem) {return SVGOpacity::createFromXml(root, "opacity", elem); } }
		,{"stroke", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem ) {return SVGPaint::createFromXml(root, "stroke", elem); } }
		,{"stroke-linejoin", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem) {return SVGStrokeLineJoin::createFromXml(root, "stroke-linejoin", elem); } }
		,{"stroke-linecap", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem ) {return SVGStrokeLineCap::createFromXml(root, "stroke-linecap", elem); } }
		,{"stroke-miterlimit", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem ) {return SVGStrokeMiterLimit::createFromXml(root, "stroke-miterlimit", elem); } }
		,{"stroke-width", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem ) {return SVGStrokeWidth::createFromXml(root, "stroke-width", elem); } }
		,{"text-anchor", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem) {return SVGTextAnchor::createFromXml(root, "text-anchor", elem); } }
		,{"transform", [](IMapSVGNodes* root, const std::string& name, const XmlElement& elem) {return SVGTransform::createFromXml(root, "transform", elem); } }
};

A nice dispatch table of the most common of attributes. The ‘loadVisualProperties()’ method uses this dispatch table to load the common display properties. Individual geometry objects can load their own specific properties after this, but these are the common ones, so this is very convenient. This table can and should be expanded as even more properties can be supported.

Finally, let’s get to the meat of the geometry representation. This can be found in the svgshapes.h file.

	struct SVGPathBasedShape : public SVGShape
	{
		BLPath fPath{};
		
		SVGPathBasedShape() :SVGShape() {}
		SVGPathBasedShape(IMapSVGNodes* iMap) :SVGShape(iMap) {}
		
		
		void drawSelf(IRender &ctx) override
		{
			ctx.fillPath(fPath);
			ctx.strokePath(fPath);
		}
	};

Ignoring the SVGShape object (a small shim atop SVGObject), we have a BLPath, and a drawSelf(). What could be simpler? The general premise is that all geometry can be represented as a BLPath at the end of the day. Everything from single lines, to polygons, to complex paths, they all boil down to a BLPath. This one object hugely simplifies the drawing task. All subsequent geometry classes just need to convert themselves into a BLPath, which we’ll see is very easy.

Here is the SVGLine, as it’s fairly simple, and representative of the rest of the geometries.

	struct SVGLine : public SVGPathBasedShape
	{
		BLLine fGeometry{};
		
		SVGLine() :SVGPathBasedShape(){ reset(0, 0, 0, 0); }
		SVGLine(IMapSVGNodes* iMap) :SVGPathBasedShape(iMap) {}
		
		
		void loadSelfFromXml(const XmlElement& elem) override 
		{
			SVGPathBasedShape::loadSelfFromXml(elem);
			
			fGeometry.x0 = parseDimension(elem.getAttribute("x1")).calculatePixels();
			fGeometry.y0 = parseDimension(elem.getAttribute("y1")).calculatePixels();
			fGeometry.x1 = parseDimension(elem.getAttribute("x2")).calculatePixels();
			fGeometry.y1 = parseDimension(elem.getAttribute("y2")).calculatePixels();

			fPath.addLine(fGeometry);
		}

		static std::shared_ptr<SVGLine> createFromXml(IMapSVGNodes *iMap, const XmlElement& elem)
		{
			auto shape = std::make_shared<SVGLine>(iMap);
			shape->loadFromXmlElement(elem);

			return shape;
		}

		
	};

It’s fairly boilerplate. We just have to get the right attributes turned into values for the BLLine geometry, and add that to our path. That’s it. The rect, circle, ellipse, polyline, polygon, and path objects all do pretty much the same thing, in about as little space. These are much simpler than having to deal with the ‘stroke’ or ‘fill’ attributes. There is some trickery here in terms of parsing the actual coordinate values, because they can be represented in different kinds of units, but the SVGDimension object deals with all those details.
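
To see just how little varies between them, here is a sketch of what an SVGRect might look like, modeled directly on SVGLine. It is a hedged illustration; the real class in svgshapes.h may differ, for example in how it handles the ‘rx’/‘ry’ rounded-corner attributes.

	struct SVGRect : public SVGPathBasedShape
	{
		BLRect fGeometry{};

		SVGRect() : SVGPathBasedShape() {}
		SVGRect(IMapSVGNodes* iMap) : SVGPathBasedShape(iMap) {}

		void loadSelfFromXml(const XmlElement& elem) override
		{
			SVGPathBasedShape::loadSelfFromXml(elem);

			fGeometry.x = parseDimension(elem.getAttribute("x")).calculatePixels();
			fGeometry.y = parseDimension(elem.getAttribute("y")).calculatePixels();
			fGeometry.w = parseDimension(elem.getAttribute("width")).calculatePixels();
			fGeometry.h = parseDimension(elem.getAttribute("height")).calculatePixels();

			fPath.addRect(fGeometry);	// again, everything funnels into the BLPath
		}

		static std::shared_ptr<SVGRect> createFromXml(IMapSVGNodes* iMap, const XmlElement& elem)
		{
			auto shape = std::make_shared<SVGRect>(iMap);
			shape->loadFromXmlElement(elem);

			return shape;
		}
	};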

That’s about enough code for this time around. We’ve looked at attributes, and VisualNodes, and we know that we need cascading styles, painter’s algorithm drawing order, and an ability to draw into a context. Now we have all the pieces we need to complete the final rendering task.

Next time around, I’ll wrap it up by bringing in the SVG ‘parser’, which will combine the XML scanning with our document tree, and render final images.


SVG From the Ground Up – Along the right path

<svg xmlns="http://www.w3.org/2000/svg" width="22" height="22" viewBox="0 0 22 22">
	<path d="M20.658,9.26l0.006-0.007l-9-8L11.658,1.26C11.481,1.103,11.255,1,11,1 
	c-0.255,0-0.481,0.103-0.658,0.26l-0.006-0.007l-9,8L1.342,9.26
	C1.136,9.443,1,9.703,1,10c0,0.552,0.448,1,1,1 c0.255,0,0.481-0.103,0.658-0.26l0.006,0.007
	L3,10.449V20c0,0.552,0.448,1,1,1h5v-8h4v8h5c0.552,0,1-0.448,1-1v-9.551l0.336,0.298 
	l0.006-0.007C19.519,10.897,19.745,11,20,11c0.552,0,1-0.448,1-1C21,9.703,20.864,9.443,20.658,9.26z 
	M7,16H5v-3h2V16z M17,16h-2 v-3h2V16z"/>
</svg>

In the last installment (It’s XML, How hard could it be?), we got as far as being able to scan XML, and generate a bunch of XmlElement objects. That’s a great first couple of steps, and now the really interesting parts begin. But, first, before we get knee deep into the seriousness of the rest of SVG, we need to deal with the graphics subsystem. It’s one thing to ‘parse’ SVG, and even build up a Document Object Model (DOM). It’s quite another to actually do the rendering of the same. To do both, in a compact form, with speed and agility, that’s what we’re after.

This time around I’m going to introduce blend2d, which is the graphics library that I’m using to do all my drawing. I stumbled across blend2d a few years ago, and I don’t even remember how I found it. There are a couple of key aspects to it that are of note. One is that the API is really simple to use, and the library is easy to build. The other part is more esoteric, but perfect for our needs here. The library was built around support for SVG. So, it has all the functions we need to build the typical kinds of graphics that we’re concerned with. I won’t go into excruciating detail about the blend2d API here, as you can visit the project on github, but I will take a look at the BLPath object, because this is the true workhorse of most SVG graphics.

The little house graphic above is typical of the kinds of little vector based icons you find all over the place. In your apps as icons, on Etsy as laser cuttable images, etc. Besides the opening ‘<svg…’, you see the ‘<path…’. SVG images are comprised of various geometry elements such as rect, circle, polyline, polygon, and path. If you really want to get into the nitty gritty details, you can check out the full SVG Specification.

The path geometry is used to describe a series of movements a pen might make on a plotter: MoveTo, LineTo, CurveTo, etc. There are a total of 20 commands you can use to build up a path, and they can be used in almost any combination to create as complex a figure as you want.

    // Shaper contour Commands
    // Origin from SVG path commands
    // M - move       (M, m)
    // L - line       (L, l, H, h, V, v)
    // C - cubic      (C, c, S, s)
    // Q - quad       (Q, q, T, t)
    // A - ellipticArc  (A, a)
    // Z - close        (Z, z)
    enum class SegmentCommand : uint8_t
    {
        INVALID = 0
        , MoveTo = 'M'
        , MoveBy = 'm'
        , LineTo = 'L'
        , LineBy = 'l'
        , HLineTo = 'H'
        , HLineBy = 'h'
        , VLineTo = 'V'
        , VLineBy = 'v'
        , CubicTo = 'C'
        , CubicBy = 'c'
        , SCubicTo = 'S'
        , SCubicBy = 's'
        , QuadTo = 'Q'
        , QuadBy = 'q'
        , SQuadTo = 'T'
        , SQuadBy = 't'
        , ArcTo = 'A'
        , ArcBy = 'a'
        , CloseTo = 'Z'
        , CloseBy = 'z'
    };

A single path has a ‘d’ attribute, which contains a series of these commands strung together. It’s a very compact description for geometry. A single path can be used to generate something quite complex.

With the exception of the blue text, that entire image is generated with a single path element. Quite complex indeed.

Being able to parse the ‘d’ attribute of the path element is super critical to our success in ultimately rendering SVG. There are a few design goals we have in doing this.

  • Be as fast as possible
  • Be as memory efficient as possible
  • Do not introduce intermediate forms if possible

No big deal right? Well, as luck would have it, or rather by design, the blend2d library has an object, BLPath, which was designed for exactly this task. You can checkout the API documentation if you want to look at the details, but it essentially has all those ‘moveTo’, ‘lineTo’, etc, and a whole lot more. It only implements the ‘to’ forms, and not the ‘by’ forms, but it’s easy to get the last vertex and implement the ‘by’ forms ourselves, which we’ll do.

So, our implementation strategy will be to read a command, and read enough numbers to make a call to a BLPath object to actually create the geometry. The entirety of the code is roughly 500 lines, and most of it is boilerplate, so I won’t bother listing it all here, but you can check it out online in the parseblpath.h file.

Let’s look at a little snippet of our house path, and see what it’s doing.

M20.658,9.26l0.006-0.007l-9-8

It is hard to see in this way, so let me write it another way.

M 20.658, 9.26
l 0.006, -0.007
l -9, -8

Said as a series of instructions (and in the raw form it’s hard to tell the digit ‘1’ from the letter ‘l’), it would be:

Move to 20.658, 9.26
Line by 0.006, -0.007
Line by -9, -8

If I were to do it as code in blend2d, it would be

BLPath path{};
BLPoint lastPt{};

path.moveTo(20.658, 9.26);
path.getLastVertex(&lastPt);

path.lineTo(lastPt.x + 0.006, lastPt.y+ -0.007);
path.getLastVertex(&lastPt);

path.lineTo(lastPt.x + -9, lastPt.y + -8);

So, our coding task is to get from that cryptic ‘d’ attribute to the code connecting to the BLPath object. Let’s get started.

The first thing we’re going to need is a main routine that drives the process.

		static bool parsePath(const ByteSpan& inSpan, BLPath& apath)
		{
			// Use a ByteSpan as a cursor on the input
			ByteSpan s = inSpan;
			SegmentCommand currentCommand = SegmentCommand::INVALID;
			int iteration = 0;

			while (s)
			{
				// ignore leading whitespace
				s = chunk_ltrim(s, whitespaceChars);

				// If we've gotten to the end, we're done
				// so just return
				if (!s)
					break;

				if (commandChars[*s])
				{
					// we have a command
					currentCommand = SegmentCommand(*s);
					
					iteration = 0;
					s++;
				}

				// Use parseMap to dispatch to the appropriate
				// parse function
				if (!parseMap[currentCommand](s, apath, iteration))
					return false;


			}

			return true;
		}

Takes a ByteSpan and a reference to a BLPath object, and returns ‘true’ if successful, ‘false’ otherwise. There are design choices to be made at every step of course. Why did I pass in a reference to a BLPath, instead of just constructing one inside the routine, and handing it back? Well, because this way, I allow something else to decide where the memory is allocated. This way also allows you to build upon an existing path if you want.

The second choice is: why a const ByteSpan? That’s a harder one. This allows a greater number of choices in terms of where the ByteSpan is coming from; for example, you might have been passed a const span to begin with. But mainly it’s a contract that says “this routine will not alter the span”.

OK, so following along, we make a local copy of the input span, which does NOT copy any of the data, just sets up a couple of pointers. Then we use this ‘s’ span to do our movement. The main ‘while’ starts with a ‘trim’. XML, and thus SVG, are full of optional whitespace. I can say that for almost every routine, the first thing you want to do is eliminate whitespace. The ‘chunk_ltrim()’ function is very short and efficient, so liberal usage of it is a good thing.

Now we’re sitting at the ‘M’, so we first check to see if it’s one of our command characters. If it is, then we use it as our current command, and advance our pointer. The ‘iteration = 0’ is only useful for the Move commands, but we need that, as we’ll soon see.

Last, we have that cryptic function call thing

				if (!parseMap[currentCommand](s, apath, iteration))
					return false;

All set! Easy peasy, our task is done here…

That last little bit of function call trickery is using a dispatch table to make a call to a function. So let’s look at the dispatch table.

		// A dispatch std::map that matches the command character to the
		// appropriate parse function
		static std::map<SegmentCommand, std::function<bool(ByteSpan&, BLPath&, int&)>> parseMap = {
			{SegmentCommand::MoveTo, parseMoveTo},
			{SegmentCommand::MoveBy, parseMoveBy},
			{SegmentCommand::LineTo, parseLineTo},
			{SegmentCommand::LineBy, parseLineBy},
			{SegmentCommand::HLineTo, parseHLineTo},
			{SegmentCommand::HLineBy, parseHLineBy},
			{SegmentCommand::VLineTo, parseVLineTo},
			{SegmentCommand::VLineBy, parseVLineBy},
			{SegmentCommand::CubicTo, parseCubicTo},
			{SegmentCommand::CubicBy, parseCubicBy},
			{SegmentCommand::SCubicTo, parseSmoothCubicTo},
			{SegmentCommand::SCubicBy, parseSmoothCubyBy},
			{SegmentCommand::QuadTo, parseQuadTo},
			{SegmentCommand::QuadBy, parseQuadBy},
			{SegmentCommand::SQuadTo, parseSmoothQuadTo},
			{SegmentCommand::SQuadBy, parseSmoothQuadBy},
			{SegmentCommand::ArcTo, parseArcTo},
			{SegmentCommand::ArcBy, parseArcBy},
			{SegmentCommand::CloseTo, parseClose},
			{SegmentCommand::CloseBy, parseClose}
		};

Dispatch tables are the modern-day C++ equivalent of the giant switch statement typically found in such programs. I actually started with the giant switch statement, then said to myself, “why don’t I just use a dispatch table?”. They are functionally equivalent. In this case, we have a std::map, which uses a single SegmentCommand as the key. Each entry is tied to a function that takes the same set of parameters, namely a ByteSpan, a BLPath, and an int. As you can see, there is a function for each of our 20 commands.

I won’t go into every single one of those 20 commands, but looking at a couple will be instructive. Let’s start with the MoveTo.

		static bool parseMoveTo(ByteSpan& s, BLPath& apath, int& iteration)
		{
			double x{ 0 };
			double y{ 0 };

			if (!parseNextNumber(s, x))
				return false;
			if (!parseNextNumber(s, y))
				return false;

			if (iteration == 0)
				apath.moveTo(x, y);
			else
				apath.lineTo(x, y);

			iteration++;

			return true;
		}

This has a few objectives.

  • Parse a couple of numbers
  • Call the appropriate function on the BLPath object
  • Increment the ‘iteration’ parameter
  • Advance the pointer, indicating how much we’ve consumed
  • Return false on failure, true on success

This pattern is repeated in each of the other 19 functions. One thing to know about all the commands, and the reason the main loop is structured the way it is: you can have multiple sets of numbers after the initial set. In the case of MoveTo, the following is a valid input stream.

M 0,0 20,20 30,30 40,40

The way you treat it, in the case of MoveTo, is that the initial pair of numbers sets an origin (0,0), and all subsequent number pairs are implied LineTo commands. That’s why we need to know the iteration. If the iteration is ‘0’, then we need to call moveTo on the BLPath object. If the iteration is greater than 0, then we need to call lineTo on the BLPath. All the commands behave in a similar fashion, except they don’t change based on the iteration number.

Well gee whiz, that seems pretty simple and straightforward. Don’t know what all the fuss is about. Hidden within parseMoveTo() is parseNextNumber(), so let’s take a look at that, as this is where all the bugs can be found.

// Consume the next number off the front of the chunk
// modifying the input chunk to advance past the  number
// we removed.
// Return true if we found a number, false otherwise
		static inline bool parseNextNumber(ByteSpan& s, double& outNumber)
		{
			static charset whitespaceChars(",\t\n\f\r ");          // whitespace found in paths

			// clear up leading whitespace, including ','
			s = chunk_ltrim(s, whitespaceChars);

			ByteSpan numChunk{};
			s = scanNumber(s, numChunk);

			if (!numChunk)
				return false;

			outNumber = chunk_to_double(numChunk);

			return true;
		}

The comment gives you the flavor of it. Again, we start with trimming ‘whitespace’, before doing anything. This is very important. In the case of these numbers, ‘whitespace’ not only includes the typical space (0x20), TAB, etc., but also the COMMA (‘,’) character. “M20,20” and “M20 20” and “M 20 20” and “M 20, 20” and even “M,20,20” are all equivalent. So, if you’re going to be parsing numbers in a sequence, you’re going to have to deal with all those cases. The easiest thing to do is trim whitespace before you start. I will point out the convenience of the charset construction. Super easy.

We trim the whitespace off the front, then call ‘scanNumber()’. That’s another workhorse routine, which is worth looking into, but I won’t put the code here. You can find it in the bspanutil.h file. I will put the comment associated with the code here though, as it’s informative.

// Parse a number which may have units after it
//   1.2em
// -1.0E2em
// 2.34ex
// -2.34e3M10,20
// 
// By the end of this routine, the numchunk represents the range of the 
// captured number.
// 
// The returned chunk represents what comes next, and can be used
// to continue scanning the original inChunk
//
// Note:  We assume here that the inChunk is already positioned at the start
// of a number (including +/- sign), with no leading whitespace

This is probably the most singularly important routine in the whole library. It has the big task of figuring out numbers from a stream of characters. Those numbers, as you can see from the examples, come in many different forms, and things can get confusing. Here’s another example of a sequence of characters it needs to be able to figure out: “M-1.7-82L.92 27”. You save yourself a ton of time, headache, and heartburn by getting this right.
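
Written out the way we did the house path earlier, that example breaks apart as:

M -1.7, -82     (the second '-' ends the first number and starts the next)
L .92, 27       (a number can start with '.', with no leading zero)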

The next choice you make is how to convert from the number that we scanned (it’s still just a stream of ASCII characters) into an actual ‘double’. This is the point where most programmers might throw up their hands and reach for their trusty ‘strtod’ or ye olde ‘atof’, or even ‘sscanf’. There’s a whole science to this; just know that strtod() is not your friend, and for something you’ll be doing millions of times, it’s worth investigating some alternatives. I highly recommend reading the code for fast_double_parser. If you want to examine what I do, checkout the chunk_to_double() routine within the bspanutil.h file.
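
As just one example of an alternative, and purely as a sketch rather than what chunk_to_double() actually does, C++17’s std::from_chars works directly on a character range, which fits the ByteSpan model nicely. The function name here is just for illustration, and floating-point from_chars needs a reasonably recent compiler.

	#include <charconv>		// std::from_chars

	static inline double spanToDouble(const ByteSpan& numChunk)
	{
		double value = 0.0;

		// Note: from_chars does not accept a leading '+', which SVG numbers may carry,
		// so a real implementation would need to skip that itself.
		std::from_chars((const char*)numChunk.fStart, (const char*)numChunk.fEnd, value);

		return value;	// on a parse failure, this simplified sketch just returns 0.0
	}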

We’re getting pretty far into the weeds down here, so let’s look at one more function, the LineTo.

		static bool parseLineTo(ByteSpan& s, BLPath& apath, int& iteration)
		{
			double x{ 0 };
			double y{ 0 };

			if (!parseNextNumber(s, x))
				return false;
			if (!parseNextNumber(s, y))
				return false;

			apath.lineTo(x, y);

			iteration++;

			return true;
		}

Same as MoveTo, parse a couple of numbers, apply them to the right function on the path object, return true or false. Just do the same thing 18 more times for the other functions, and you’ve got your path ‘parser’.
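
The relative (‘by’) forms follow the same pattern, with the extra step of asking the path for its last vertex, just like the hand-written house example earlier. Here is a sketch of what parseLineBy looks like; the real one in parseblpath.h may differ in small details.

	static bool parseLineBy(ByteSpan& s, BLPath& apath, int& iteration)
	{
		double x{ 0 };
		double y{ 0 };

		if (!parseNextNumber(s, x))
			return false;
		if (!parseNextNumber(s, y))
			return false;

		// blend2d hands us the current pen position, so 'by' becomes 'to'
		BLPoint lastPt{};
		apath.getLastVertex(&lastPt);
		apath.lineTo(lastPt.x + x, lastPt.y + y);

		iteration++;

		return true;
	}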

To recap, parsing the ‘d’ parameter is one of the most important parts of any SVG parser. In this case, we want to get from the text to an actual object we can render, as quickly as possible. A BLPath alone is not enough to create great images, we still have a long way to go until we start seeing pretty pictures on the screen. Parsing the path is critical to getting there though. This is where you could waste tons of time and memory, so it’s worth considering the options carefully. In this case, we’ve chosen to represent the path in memory using a data structure that can be a part of a graphic elements tree, as well as being handed to the drawing engine directly, without having to transform it once again before actually drawing.

There you have it. One step closer to our beautiful images.

Next time around, we need to look at what kind of Document Object Model (DOM) we want to construct, and how our SVG parser will construct it.


SVG From the Ground Up – It’s XML, How hard could it be?

Let’s take a look at the SVG (XML) code that generates that image.

<svg height="200" width="680" xmlns="http://www.w3.org/2000/svg">
    <circle cx="70" cy="70" r="50" />
    <circle cx="200" cy="70" r="50" fill="#79C99E" />
    <circle cx="330" cy="70" r="50" fill="#79C99E" stroke-width="10" stroke="#508484" />
    <circle cx="460" cy="70" r="50" fill="#79C99E" stroke-width="10" />
    <circle cx="590" cy="70" r="50" fill="none" stroke-width="10" stroke="#508484" />
</svg>

By the end of this post, we should be able to scan through the components of that, and generate the tokens necessary to begin rendering it as SVG. So, where to start?

Last time around (SVG From the Ground Up – Parsing Fundamentals), I introduced the ByteSpan and charset data structures, as a way to say “these are the only tools you’ll need…”. Well, at least they are certainly the core building blocks. Now we’re going to actually use those components to begin the process of breaking down the XML. XML can be a daunting sprawling beast. Its origins are in an even older document technology known as SGML. The first specification for the language can be found here: Extensible Markup Language (XML) 1.0 (Fifth Edition). When I joined the team at Microsoft in 1998 to work on this under Jean Paoli, one of the original authors, there were probably 30 people across dev, test, and pm. Of course we had people working on the standards body, and I was working on XSLT, and a couple on the parser, someone on DTD schema. It was quite a production. At that time, we had to deal with myriad encodings (utf-8 did not rule the world yet), conformance and compliance test suites, and that XSLT beast (CSS did not rule the world yet). It was a daunting endeavor, and at some point we tried to color everything with XML, much to the chagrin of most other people. But, some things did come out of that era, and SVG is one of them.

Today, our task is not to implement a fully compliant validating parser. That again would take a team of a few, and a ton of testing. What we’re after is something more modest. Something a hobby hacker could throw together in a weekend, but have a fair chance at it being able to consume most of the SVG you’re ever really interested in. To that end, there’s a much smaller, simpler XML spec out there. MicroXML. This describes a subset of XML that leaves out all the really hard parts. While that spec is far more readable, we’ll go even one step simpler. With our parser here, we won’t even be supporting utf-8. That might seem like a tragic simplification, but the reality is, not even that’s needed for most of what we’ll be doing with SVG. So, here’s the list of what we will be doing.

  • Decoding elements
  • Decoding attributes
  • Decoding element content (supporting text nodes)
  • Skipping Doctype
  • Skipping Comments
  • Skipping Processing Instructions
  • Not expanding character entities (although user can)

As you will soon see, “skipping” doesn’t mean you lose access to the data; it just means our SVG parser won’t do anything with it. This is a nice extensibility point. We start simple, and you can add as much complexity as you want over time, without changing the fundamental structure of what we’re about to build.

Now for some types and enums. I won’t put the entirety of the code in here, so if you want to follow along, you can look at the xmlscan.h file. We’ll start with the XML element types.

    enum XML_ELEMENT_TYPE {
        XML_ELEMENT_TYPE_INVALID = 0
		, XML_ELEMENT_TYPE_XMLDECL                  // An XML declaration, like <?xml version="1.0" encoding="UTF-8"?>
        , XML_ELEMENT_TYPE_CONTENT                  // Content, like <foo>bar</foo>, the 'bar' is content
        , XML_ELEMENT_TYPE_SELF_CLOSING             // A self-closing tag, like <foo/>
        , XML_ELEMENT_TYPE_START_TAG                // A start tag, like <foo>
        , XML_ELEMENT_TYPE_END_TAG                  // An end tag, like </foo>
        , XML_ELEMENT_TYPE_COMMENT                  // A comment, like <!-- foo -->
        , XML_ELEMENT_TYPE_PROCESSING_INSTRUCTION   // A processing instruction, like <?foo bar?>
        , XML_ELEMENT_TYPE_CDATA                    // A CDATA section, like <![CDATA[ foo ]]>
        , XML_ELEMENT_TYPE_DOCTYPE                  // A DOCTYPE section, like <!DOCTYPE foo>
    };

This is where we indicate what kinds of pieces of the XML file we will recognize. If something is not in this list, it will either be reported as invalid, or it will simply cause the scanner to stop processing. From the little bit of XML that opened this article, we see “START_TAG”, “SELF_CLOSING”, “END_TAG”. And that’s it!! Simple right?

OK. Next up are a couple of data structures which are the guts of the XML itself. First is the XmlName. Although we’re not building a super conformant parser, there are some simple realities we need to be able to handle to make our future life easier. XML namespaces are one of those things. In XML, you can have a name with a ‘:’ in it, which puts the name into a namespace. Without too much detail, just know that “circle” could have been “svg:circle”, or something similar, and possibly mean the same thing. We need a data structure that will capture this.

struct XmlName {
        ByteSpan fNamespace{};
        ByteSpan fName{};

        XmlName() = default;
        
        XmlName(const ByteSpan& inChunk)
        {
            reset(inChunk);
        }

        XmlName(const XmlName &other):fNamespace(other.fNamespace), fName(other.fName){}
        
        XmlName& operator =(const XmlName& rhs)
        {
            fNamespace = rhs.fNamespace;
            fName = rhs.fName;
            return *this;
        }
        
        XmlName & operator=(const ByteSpan &inChunk)
        {
            reset(inChunk);
            return *this;
        }
        
		// Implement for std::map, and ordering in general
		// Compare namespaces first, then names (lexicographic, shorter sorts first on ties)
		bool operator < (const XmlName& rhs) const
		{
			size_t maxnsbytes = std::min(fNamespace.size(), rhs.fNamespace.size());
			int nscmp = memcmp(fNamespace.begin(), rhs.fNamespace.begin(), maxnsbytes);
			if (nscmp != 0 || fNamespace.size() != rhs.fNamespace.size())
				return nscmp < 0 || (nscmp == 0 && fNamespace.size() < rhs.fNamespace.size());

			size_t maxnamebytes = std::min(fName.size(), rhs.fName.size());
			int namecmp = memcmp(fName.begin(), rhs.fName.begin(), maxnamebytes);
			return namecmp < 0 || (namecmp == 0 && fName.size() < rhs.fName.size());
		}
        
        // Allows setting the name after it's been created
        XmlName& reset(const ByteSpan& inChunk)
        {
            fName = inChunk;
            fNamespace = chunk_token(fName, charset(':'));
            if (chunk_size(fName)<1)
            {
                fName = fNamespace;
                fNamespace = {};
            }
            return *this;
        }
        
		ByteSpan name() const { return fName; }
		ByteSpan ns() const { return fNamespace; }
	};

Given a ByteSpan, our universal data representation, split it out into the ‘namespace’ and ‘name’ parts, if they exist. Then we can get the name part by calling ‘name()’, and if there was a namespace part, we can get that from ‘ns()’. Why ‘ns’ instead of ‘namespace’? Because ‘namespace’ is a keyword in C++, and we don’t want any confusion or compiler errors.
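
In use, it behaves something like this:

	XmlName plain("circle");
	// plain.ns()   -> empty span (no namespace)
	// plain.name() -> "circle"

	XmlName qualified("svg:circle");
	// qualified.ns()   -> "svg"
	// qualified.name() -> "circle"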

One thing to note here is the implementation of ‘operator <‘. Why is that there? Because if you want to use this as a key in an associative container, such as std::map, you need some comparison operator, and by implementing ‘<‘, you get one cheaply. This is a capability we’ll put to use later.

Next up is the representation of an XML node itself, where we have XmlElement.

    // Representation of an xml element
    // The xml iterator will generate these
    struct XmlElement
    {
    private:
        int fElementKind{ XML_ELEMENT_TYPE_INVALID };
        ByteSpan fData{};

        XmlName fXmlName{};
        std::string fName{};
        std::map<std::string, ByteSpan> fAttributes{};

    public:
        XmlElement() {}
        XmlElement(int kind, const ByteSpan& data, bool autoScanAttr = false)
            :fElementKind(kind)
            , fData(data)
        {
            reset(kind, data, autoScanAttr);
        }

		void reset(int kind, const ByteSpan& data, bool autoScanAttr = false)
		{
            clear();

            fElementKind = kind;
            fData = data;

            if ((fElementKind == XML_ELEMENT_TYPE_START_TAG) ||
                (fElementKind == XML_ELEMENT_TYPE_SELF_CLOSING) ||
                (fElementKind == XML_ELEMENT_TYPE_END_TAG))
            {
                scanTagName();

                if (autoScanAttr) {
                    if (fElementKind != XML_ELEMENT_TYPE_END_TAG)
                        scanAttributes();
                }
            }
		}
        
		// Clear this element to a default state
        void clear() {
			fElementKind = XML_ELEMENT_TYPE_INVALID;
			fData = {};
			fName.clear();
			fAttributes.clear();
		}
        
        // determines whether the element is currently empty
        bool empty() const { return fElementKind == XML_ELEMENT_TYPE_INVALID; }

        explicit operator bool() const { return !empty(); }

        // Returning information about the element
        const std::map<std::string, ByteSpan>& attributes() const { return fAttributes; }
        
        const std::string& name() const { return fName; }
		void setName(const std::string& name) { fName = name; }
        
        int kind() const { return fElementKind; }
		void kind(int kind) { fElementKind = kind; }
        
        const ByteSpan& data() const { return fData; }

		// Convenience for what kind of tag it is
        bool isStart() const { return (fElementKind == XML_ELEMENT_TYPE_START_TAG); }
		bool isSelfClosing() const { return fElementKind == XML_ELEMENT_TYPE_SELF_CLOSING; }
		bool isEnd() const { return fElementKind == XML_ELEMENT_TYPE_END_TAG; }
		bool isComment() const { return fElementKind == XML_ELEMENT_TYPE_COMMENT; }
		bool isProcessingInstruction() const { return fElementKind == XML_ELEMENT_TYPE_PROCESSING_INSTRUCTION; }
        bool isContent() const { return fElementKind == XML_ELEMENT_TYPE_CONTENT; }
		bool isCData() const { return fElementKind == XML_ELEMENT_TYPE_CDATA; }
		bool isDoctype() const { return fElementKind == XML_ELEMENT_TYPE_DOCTYPE; }

        
        void addAttribute(std::string& name, const ByteSpan& valueChunk)
        {
            fAttributes[name] = valueChunk;
        }

        ByteSpan getAttribute(const std::string &name) const
		{
			auto it = fAttributes.find(name);
			if (it != fAttributes.end())
				return it->second;
			else
                return ByteSpan{};
		}
        
    private:
        //
        // Parse an XML element
        // We should be sitting on the first character of the element tag after the '<'
        // There are several things that need to happen here
        // 1) Scan the element name
        // 2) Scan the attributes, creating key/value pairs
        // 3) Figure out if this is a self closing element

        // 
        // We do NOT scan the content of the element here, that happens
        // outside this routine.  We only deal with what comes up to the closing '>'
        //
        void setTagName(const ByteSpan& inChunk)
        {
            fXmlName.reset(inChunk);
            fName = toString(fXmlName.name());
        }
        
        void scanTagName()
        {
            ByteSpan s = fData;
            bool start = false;
            bool end = false;

            // If the chunk is empty, just return
            if (!s)
                return;

            // Check if the tag is end tag
            if (*s == '/')
            {
                s++;
                end = true;
            }
            else {
                start = true;
            }

            // Get tag name
            ByteSpan tagName = s;
            tagName.fEnd = s.fStart;

            while (s && !wspChars[*s])
                s++;

            tagName.fEnd = s.fStart;
            setTagName(tagName);


            fData = s;
        }

        public:
        //
        // scanAttributes
        // Scans the fData member looking for attribute key/value pairs
        // It will add to the member fAttributes these pairs, without further processing.
        // This should be called after scanTagName(), because we want to be positioned
        // on the first key/value pair. 
        //
        int scanAttributes()
        {

            int nattr = 0;
            bool start = false;
            bool end = false;
            uint8_t quote{};
            ByteSpan s = fData;


            // Get the attribute key/value pairs for the element
            while (s && !end)
            {
                uint8_t* beginattrValue = nullptr;
                uint8_t* endattrValue = nullptr;


                // Skip white space before the attrib name
                s = chunk_ltrim(s, wspChars);

                if (!s)
                    break;

                if (*s == '/') {
                    end = true;
                    break;
                }

                // Find end of the attrib name.
                //static charset equalChars("=");
                auto attrNameChunk = chunk_token(s, "=");
                attrNameChunk = chunk_trim(attrNameChunk, wspChars);    // trim whitespace on both ends

                std::string attrName = std::string(attrNameChunk.fStart, attrNameChunk.fEnd);

                // Skip stuff past '=' until the beginning of the value.
                while (s && (*s != '\"') && (*s != '\''))
                    s++;

                // If we've reached end of span, bail out
                if (!s)
                    break;

                // capture the quote character
                // Store value and find the end of it.
                quote = *s;

				s++;    // move past the quote character
                beginattrValue = (uint8_t*)s.fStart;    // Mark the beginning of the attribute content

                // Skip until we find the matching closing quote
                while (s && *s != quote)
                    s++;

                if (s)
                {
                    endattrValue = (uint8_t*)s.fStart;  // Mark the ending of the attribute content
                    s++;
                }

                // Store only well formed attributes
                ByteSpan attrValue = { beginattrValue, endattrValue };

                addAttribute(attrName, attrValue);

                nattr++;
            }

            return nattr;
        }
    };

That’s a bit of a brute, but actually pretty straightforward. We need a data structure that tells us what kind of XML element type we’re dealing with. We need the name, as well as the content of the element, held onto for future processing. We hold onto the content as a ByteSpan, but have provision for making more convenient representations. For example, we turn the name into a std::string. In the future, we can eliminate even this, and just use the XmlName with its chunks directly.

Besides the element name, we also have the ability to split out the attribute key/value pairs, as seen in ‘scanAttributes()’. Let’s take a deeper look at the constructor.

        XmlElement(int kind, const ByteSpan& data, bool autoScanAttr = false)
            :fElementKind(kind)
            , fData(data)
        {
            reset(kind, data, autoScanAttr);
        }

		void reset(int kind, const ByteSpan& data, bool autoScanAttr = false)
		{
            clear();

            fElementKind = kind;
            fData = data;

            if ((fElementKind == XML_ELEMENT_TYPE_START_TAG) ||
                (fElementKind == XML_ELEMENT_TYPE_SELF_CLOSING) ||
                (fElementKind == XML_ELEMENT_TYPE_END_TAG))
            {
                scanTagName();

                if (autoScanAttr) {
                    if (fElementKind != XML_ELEMENT_TYPE_END_TAG)
                        scanAttributes();
                }
            }
		}

The constructor takes a ‘kind’, a ByteSpan, and a flag indicating whether we want to parse out the attributes or not. In ‘reset()’, we see that we hold onto the kind of element, and the ByteSpan. That ByteSpan contains everything between the ‘<‘ of the tag and the closing ‘>’, exclusive. The first thing we do is scan the tag name, so we can at least hold onto that, leaving fData representing the rest. This is relatively low impact so far.

Why not just do this in the constructor itself? Why have a “reset()”? As we’ll see later, we actually reuse XmlElement in some situations while parsing, so we want to be able to set, and reset, the same object multiple times. At least that’s one way of doing things.
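
So the same object can be recycled over and over, something like this (the spans here are just placeholders):

	XmlElement elem{};		// starts out empty (INVALID)

	elem.reset(XML_ELEMENT_TYPE_START_TAG, firstTagSpan, true);	// scan name and attributes
	// ... do something with it ...

	elem.reset(XML_ELEMENT_TYPE_CONTENT, contentSpan);		// same object, fresh data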

Another item of note is whether you scan the attributes or not. If you do scan the attributes, you end up with a map of those attributes, and a way to get the value of individual ones.

        std::map<std::string, ByteSpan> fAttributes{};

        ByteSpan getAttribute(const std::string &name) const
		{
			auto it = fAttributes.find(name);
			if (it != fAttributes.end())
				return it->second;
			else
                return ByteSpan{};
		}

The ‘getAttribute()’ method is one of the most critical pieces when we later start building our SVG model, so it needs to be fast and efficient. Of course, this does not have to be embedded in the core of the XmlElement; you could just as easily construct an attribute list outside of the element, but then you’d have to associate it back to the element anyway, and you end up in the same place. getAttribute() takes a name as a string, and returns the ByteSpan which is the raw, uninterpreted content of that attribute, without the enclosing quote marks. In the future, it would be nice to replace that std::string based name with an XmlName, which will save on some allocations, but we’ll stick with this convenience for now.
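
As a quick example, given the first circle element from the sample at the top of this post, and assuming ‘elem’ is the XmlElement we scanned for it, usage looks like this:

	// <circle cx="70" cy="70" r="50" />
	ByteSpan cx = elem.getAttribute("cx");		// the span covering: 70   (no quote marks)
	ByteSpan id = elem.getAttribute("id");		// not present on this element

	if (!id)
		printf("no 'id' attribute\n");		// an empty ByteSpan converts to false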

The stage is now set. We have our core components and data structures; we’re ready for the main event of actually parsing some content. For that, we have to make some design decisions. The first one we already made in the very beginning: we will be consuming a chunk of memory as represented in a ByteSpan. The next decision is how we want to consume it. Do we want to build a Document Object Model (DOM), or some other structure? Do we just want to print out nodes as we see them? Do we want a ‘pull model’ parser, where we are in control of getting each node one by one, or a ‘push model’, where we have a callback function which is called every time a node is seen, but the primary driver is elsewhere?

My choice is to have a pull model parser, where I ask for each node, one by one, and do whatever I’m going to do with it. In terms of programming patterns, this is the ‘iterator’. So, I’m going to create an XML iterator. The fundamental structure of an iterator is this.

Iterator iter(content)
while (iter)
{
   doSomethingWithCurrentItem(*iter);
  iter++;
}

So, that’s what we need to construct for our XML. Something that can scan its input, delivering XmlElement as the individual items that we can then do something with. So, here is XmlElementIterator.

   struct XmlElementIterator {
    private:
        // XML Iterator States
        enum XML_ITERATOR_STATE {
            XML_ITERATOR_STATE_CONTENT = 0
            , XML_ITERATOR_STATE_START_TAG

        };
        
        // What state the iterator is in
        int fState{ XML_ITERATOR_STATE_CONTENT };
        svg2b2d::ByteSpan fSource{};
        svg2b2d::ByteSpan mark{};

        XmlElement fCurrentElement{};
        
    public:
        XmlElementIterator(const svg2b2d::ByteSpan& inChunk)
        {
            fSource = inChunk;
            mark = inChunk;

            fState = XML_ITERATOR_STATE_CONTENT;
            
            next();
        }

		explicit operator bool() { return !fCurrentElement.empty(); }
        
        // These operators make it operate like an iterator
        const XmlElement& operator*() const { return fCurrentElement; }
        const XmlElement* operator->() const { return &fCurrentElement; }

        XmlElementIterator& operator++() { next(); return *this; }
        XmlElementIterator& operator++(int) { next(); return *this; }
        
        // Reset the iterator to a known state with data
        void reset(const svg2b2d::ByteSpan& inChunk, int st)
        {
            fSource = inChunk;
            mark = inChunk;

            fState = st;
        }

        ByteSpan readTag()
        {
            ByteSpan elementChunk = fSource;
            elementChunk.fEnd = fSource.fStart;
            
            while (fSource && *fSource != '>')
                fSource++;

            elementChunk.fEnd = fSource.fStart;
            elementChunk = chunk_rtrim(elementChunk, wspChars);
            
            // Get past the '>' if it was there
            fSource++;
            
            return elementChunk;
        }
        
        // readDoctype
		// Reads the doctype chunk, and returns it as a ByteSpan
        // fSource is currently sitting at the beginning of !DOCTYPE
        // Note: 
        
        ByteSpan readDoctype()
        {

            // skip past the !DOCTYPE to the first whitespace character
			while (fSource && !wspChars[*fSource])
				fSource++;
            
			// Skip past the whitespace
            // to get to the beginning of things
			fSource = chunk_ltrim(fSource, wspChars);

            
            // Mark the beginning of the "content" we might return
            ByteSpan elementChunk = fSource;
            elementChunk.fEnd = fSource.fStart;

            // To get to the end, we're looking for '[]' or just '>'
            auto foundChar = chunk_find_char(fSource, '[');
            if (foundChar)
            {
                fSource = foundChar;
                foundChar = chunk_find_char(foundChar, ']');
                if (foundChar)
                {
                    fSource = foundChar;
                    fSource++;
                }
                elementChunk.fEnd = fSource.fStart;
            }
            
            // skip whitespace?
            // search for closing '>'
            foundChar = chunk_find_char(fSource, '>');
            if (foundChar)
            {
                fSource = foundChar;
                elementChunk.fEnd = fSource.fStart;
                fSource++;
            }
            
            return elementChunk;
        }
        
        
        // Simple routine to scan XML content
        // the input 's' is a chunk representing the xml to 
        // be scanned.
        // The input chunk will be altered in the process so it
        // can be used in a subsequent call to continue scanning where
        // it left off.
        bool next()
        {
            while (fSource)
            {
                switch (fState)
                {
                case XML_ITERATOR_STATE_CONTENT: {

                    if (*fSource == '<')
                    {
                        // Change state to beginning of start tag
                        // for next turn through iteration
                        fState = XML_ITERATOR_STATE_START_TAG;

                        if (fSource != mark)
                        {
                            // Encapsulate the content in a chunk
                            svg2b2d::ByteSpan content = { mark.fStart, fSource.fStart };

                            // collapse whitespace
							// if the content is all whitespace
                            // don't return anything
							content = chunk_trim(content, wspChars);
                            if (content)
                            {
                                // Set the state for next iteration
                                fSource++;
                                mark = fSource;
                                fCurrentElement.reset(XML_ELEMENT_TYPE_CONTENT, content);
                                
                                return true;
                            }
                        }

                        fSource++;
                        mark = fSource;
                    }
                    else {
                        fSource++;
                    }

                }
                break;

                case XML_ITERATOR_STATE_START_TAG: {
                    // Create a chunk that encapsulates the element tag 
                    // up to, but not including, the '>' character
                    ByteSpan elementChunk = fSource;
                    elementChunk.fEnd = fSource.fStart;
                    int kind = XML_ELEMENT_TYPE_START_TAG;
                    
                    if (chunk_starts_with_cstr(fSource, "?xml"))
                    {
						kind = XML_ELEMENT_TYPE_XMLDECL;
                        elementChunk = readTag();
                    } 
                    else if (chunk_starts_with_cstr(fSource, "?"))
                    {
                        kind = XML_ELEMENT_TYPE_PROCESSING_INSTRUCTION;
                        elementChunk = readTag();
                    }
                    else if (chunk_starts_with_cstr(fSource, "!DOCTYPE"))
                    {
                        kind = XML_ELEMENT_TYPE_DOCTYPE;
                        elementChunk = readDoctype();
                    }
                    else if (chunk_starts_with_cstr(fSource, "!--"))
                    {
						kind = XML_ELEMENT_TYPE_COMMENT;
                        elementChunk = readTag();
                    }
                    else if (chunk_starts_with_cstr(fSource, "![CDATA["))
                    {
                        kind = XML_ELEMENT_TYPE_CDATA;
                        elementChunk = readTag();
                    }
					else if (chunk_starts_with_cstr(fSource, "/"))
					{
						kind = XML_ELEMENT_TYPE_END_TAG;
						elementChunk = readTag();
					}
					else {
						elementChunk = readTag();
                        if (chunk_ends_with_char(elementChunk, '/'))
                            kind = XML_ELEMENT_TYPE_SELF_CLOSING;
					}
                    
                    fState = XML_ITERATOR_STATE_CONTENT;

                    mark = fSource;

					fCurrentElement.reset(kind, elementChunk, true);

                    return true;
                }
                break;

                default:
                    fSource++;
                    break;

                }
            }

            fCurrentElement.clear();
            return false;
        } // end of next()
    };

That code might have a face only a programmer could love, but it’s relatively simple to break down. The constructor takes a ByteSpan, and holds onto it as fSource. This ByteSpan is ‘consumed’, meaning, once you’ve iterated, you can’t go back. But, since ‘iteration’ is nothing more than moving a pointer in a ByteSpan, you can always take a ‘snapshot’ of where you’re at, and continue, but we won’t go into that right here. That’s going to be useful for tracking down where an error occurred.

The crux of the iterator is the ‘next()’ method. This is where we look for the ‘<‘ character that indicates the start of some tag. The iterator runs between two states. You’re either in ‘XML_ITERATOR_STATE_CONTENT’ or ‘XML_ITERATOR_STATE_START_TAG’. Initially we start in the ‘CONTENT’ state, and flip to ‘START_TAG’ as soon as we see the ‘<‘ character. Once in ‘START_TAG’, we try to further refine what kind of tag we’re dealing with. In most cases, we just capture the content, and that becomes the current element.

The iteration terminates when the current XmlElement (fCurrentElement) is empty, which happens if we run out of input, or there’s some kind of error.

So, next() returns true or false. And our iterator does what it’s supposed to do, which is hold onto the current XmlElement that we have scanned. You can get to the contents of the element by using the dereference operator *, like this: *iter, or the arrow operator. In either case, they simply return the current element.

        const XmlElement& operator*() const { return fCurrentElement; }
        const XmlElement* operator->() const { return &fCurrentElement; }

Alright, in practice, it looks like this:

#include "mmap.h"
#include "xmlscan.h"
#include "xmlutil.h"

using namespace filemapper;
using namespace svg2b2d;

int main(int argc, char** argv)
{
    if (argc < 2)
    {
        printf("Usage: pullxml <xml file>\n");
        return 1;
    }

    // create an mmap for the specified file
    const char* filename = argv[1];
    auto mapped = mmap::createShared(filename);

    if (mapped == nullptr)
        return 0;


    // 
	// Parse the mapped file as XML
    // printing out the elements along the way
    ByteSpan s(mapped->data(), mapped->size());
    
    XmlElementIterator iter(s);

    while (iter)
    {
		ndt_debug::printXmlElement(*iter);

        iter++;
    }

    // close the mapped file
    mapped->close();

    return 0;
}

That will generate the following output, where the printXmlElement() function can be found in the file xmlutil.h. The individual attributes are indicated with their name followed by ‘:’, such as ‘height:’, followed by the value of the attribute, surrounded by ‘||’ markers. Each tag kind is indicated as well.

START_TAG: [svg]
    height: ||200||
    width: ||680||
    xmlns: ||http://www.w3.org/2000/svg||
SELF_CLOSING: [circle]
    cx: ||70||
    cy: ||70||
    r: ||50||
SELF_CLOSING: [circle]
    cx: ||200||
    cy: ||70||
    fill: ||#79C99E||
    r: ||50||
SELF_CLOSING: [circle]
    cx: ||330||
    cy: ||70||
    fill: ||#79C99E||
    r: ||50||
    stroke: ||#508484||
    stroke-width: ||10||
SELF_CLOSING: [circle]
    cx: ||460||
    cy: ||70||
    fill: ||#79C99E||
    r: ||50||
    stroke-width: ||10||
SELF_CLOSING: [circle]
    cx: ||590||
    cy: ||70||
    fill: ||none||
    r: ||50||
    stroke: ||#508484||
    stroke-width: ||10||
END_TAG: [svg]

At this point, we have our XML “parser”. It can scan/parse enough for us to continue on our journey to parse and display SVG. It’s not the most robust XML parser on the planet, but it’s a good performer, very small and hopefully understandable. Usage could not be easier, and it does not impose a lot of frameworks, or pull in a lot of dependencies. We’re at a good starting point, and if all you wanted was to be able to parse some XML to do something, you could stop here and call it a day.

Next time around, we’re going to look into the SVG side of things, and sink deep into that rabbit hole.


SVG From the Ground Up – Parsing Fundamentals

Scalable Vector Graphics (SVG) is an XML based format. So, the first thing we want to do is create an XML ‘parser’. I put that in quotes because we don’t really need to create a full fledged, conformant, validating XML parser. This is the first design principle I’m going to be following. Here I want to create just enough to make things work, with an eye towards future proofing and extensibility, but not go so far as to make it absolutely bullet proof. So, I’ll be writing just enough code to scan some typical .svg files, while leaving room to swap out our quick and dirty parser for something more substantial in the future.

If you want to follow along the code I am constructing, you can find it in this GitHub repository: svg2b2d

Scanning text begins with how the text is represented in the first place. This is a very core, fundamental design decision. Will we be reading from files on the local machine? Will we be reading from a stream of bytes coming from the network? Will we be reading from a chunk of memory handed to us through some API within the program? I’ve decided on this last choice. These days, it’s very common to be able to read a file into memory, and operate on it from there. Similarly with networks, speeds are fast enough that you can read the entirety of the content into memory before processing. SVG is not a format where you can easily render progressively, like with raster based formats. You really do need the whole document before you can render it.

So, we’re going to be reading from memory, assuming something else has already taken care of getting the image into memory. I am writing in C++, so I’m ok with struct and class, but I don’t necessarily want to use the iostream facilities, nor get too far down the track with templates and the like. The latest C++ (C++20) has a std::span object, which is very useful, and does exactly what I want, but I want the code to be a bit more portable than C++20, so I’m not going to use that facility; instead, I’m going to somewhat replicate it.

How do we represent a chunk of memory? There are two choices. You can either use a starting pointer and a length, or you can use a starting and an ending pointer. After much deliberation, I chose the latter, and use two pointers.

struct ByteSpan
{
    const unsigned char* fStart;
    const unsigned char* fEnd;
};

Throughout the code, I will use the ‘struct’ instead of ‘class’ because I’m ok with the data structure defaulting to everything being publicly accessible. There’s not a lot of sub-classing that’s going to occur here either, so I’m not as concerned about data hiding and encapsulation. This also makes the code easier to understand, without a lot of extraneous decorations.

There we have it. You have a chunk of memory, now what? Well, the most common things you do when scanning are advancing the pointer and checking the character you’re currently scanning. So, let’s add these conveniences, as well as a couple of constructors, and then we can do a sample.


struct ByteSpan
{
    const unsigned char * fStart{};
    const unsigned char * fEnd{};

    ByteSpan():fStart(nullptr), fEnd(nullptr){}
    ByteSpan(const char *cstr):fStart((const unsigned char *)cstr), fEnd((const unsigned char *)cstr+strlen(cstr)){}
    explicit ByteSpan(const void* data, size_t sz) 
        :fStart((const unsigned char*)data)
        , fEnd((const unsigned char*)fStart + sz) 
        {}

    // Return false when start and end are the same
    explicit operator bool() const { return (fEnd - fStart) > 0; };

    // get current value from fStart, like a 'peek' operation
    unsigned char& operator*() { 
        static unsigned char zero = 0;  
        if (fStart < fEnd) 
            return *(unsigned char*)fStart; 
        return  zero; 
    }
    
    const uint8_t& operator*() const { 
        static unsigned char zero = 0;  
        if (fStart < fEnd) 
            return *(unsigned char*)fStart; 
        return  zero; 
    }

    ByteSpan& operator++() { return operator+=(1); }	// prefix notation ++y
    ByteSpan& operator++(int i) { return operator+=(1); }   // postfix notation y++
};
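
One small note: the two operator++ overloads above lean on an operator+= that isn’t shown in this listing. So the snippet stands on its own, here is one minimal way to write it (the actual version in bspan.h may differ):

    // Lives inside the ByteSpan struct: advance fStart by 'n' bytes, never moving past fEnd
    ByteSpan& operator+=(size_t n) {
        if (n > size_t(fEnd - fStart))
            n = size_t(fEnd - fStart);
        fStart += n;
        return *this;
    }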


With all that boilerplate code added to the structure, you can now do the following operations

ByteSpan b("Here is some text");
while (b)
{
    printf("%c",*b);
    b++;
}

And that little loop will essentially print a copy of the string you used to create the ByteSpan ‘b’. At this point, it might hardly seem worth the effort. I mean, you could just as simply use a starting pointer and ending pointer, without the intervening ByteSpan structure. Well, yes, and a lot of code out there in the world does exactly that, and it’s just fine. But, we have some future design goals which will make this little encapsulation of the two pointers very convenient. One of the design goals is worth introducing now, and that is the concept of zero, or minimal, allocation. We want the scanner to be lightweight, fast, and have minimal impact on the memory of the system. We want it to be able to parse data that is megabytes in size without having any problems. To this end, the scanner itself does no allocations, and does not alter the original memory it’s operating on, even though the ByteSpan would allow you to.
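
A quick way to convince yourself how light this is:

// A ByteSpan is nothing more than two pointers; creating, copying, or
// advancing one never touches the heap.
static_assert(sizeof(ByteSpan) == 2 * sizeof(const unsigned char*),
    "ByteSpan is just a pair of pointers");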

Alright. With this little tool in hand, what else can we do? Well, soon enough we’re going to need to compare characters, and make decisions. Is that a ‘<‘ opening an XmlElement? Is this whitespace? Does this string end with “/>”? We need something that can represent a set of characters. Here is charset.

struct charset {
		std::bitset<256> bits;

		explicit charset(const char achar) { addChar(achar); }
		charset(const char* chars) { addChars(chars); }

		charset& addChar(const char achar)
		{
			bits.set(achar);
			return *this;
		}

		charset& addChars(const char* chars)
		{
			size_t len = strlen(chars);
			for (size_t i = 0; i < len; i++)
				bits.set(chars[i]);

			return *this;
		}

		charset& operator+=(const char achar) { return addChar(achar); }
		charset& operator+=(const char* chars) { return addChars(chars); }

		charset operator+(const char achar) const
		{
			charset result(*this);
			return result.addChar(achar);
		}

		charset operator+(const char* chars) const
		{
			charset result(*this);
			return result.addChars(chars);
		}

		// This one makes it look like an array
		bool operator [](const size_t idx) const { return bits[idx]; }

		// This way makes it look like a function
		bool operator ()(const size_t idx) const { return bits[idx]; }

		bool contains(const uint8_t idx) const { return bits[idx]; }
		
	};

All of that, so that we can write the following

charset wspChars(" \t\r\n\v\f");

ByteSpan b("  Now is the time for all humans to come to the aid of animals  ");

while (b)
{
    // skip whitespace
   while (wspChars.contains(*b))
        b++;

    // Create a span that will represent a word
    // start it being empty
    ByteSpan aWord = b;
    aWord.fEnd = aWord.fStart;

    // Advance while there are still characters and they are not a whitespace character
    while (b && !wspChars.contains(*b))
        b++;

    // Now we're sitting at the end of the whole span, or at the beginning of the next
    // whitespace character.  In either case, it's the end of our word
    aWord.fEnd = b.fStart;

    // Now we can do something with the word that we found
    printWord(aWord);

    // And continue around the loop until we've exhausted the byte span
}


And that’s how we start. If you want to get ahead, you can look at the code in the repository, in particular bspan.h and bspanutil.h. With these two classes alone, we will build up the XML scanning capability, and ultimately the SVG building capability on top of that. So, these are very core, and important to get right, because they will maintain the promise of “no allocations” and “be super fast”.

One question that came up in my mind was “why not just use regex and be done with it?”. Well, yes, C/C++ have regular expression capabilities, either built in or available as a side library. There were a couple of reasons I chose not to go that route. One is about speed, the other is about allocations. It’s super easy to just store your text in a std::string object, then use regex on that. But, when you do, you’ll find that std::string objects are allocated all over the place, and you don’t have tight control of your memory consumption, which breaks one of the design tenets I’m after. The other is just the size of such code. A good regex library can easily be as big, if not bigger, than the entirety of the SVG parser we’re trying to build. I am somewhat concerned with code size, so I’d rather not have the extra bloat. Besides all that, trying to construct regex patterns that I or anyone else can maintain in the future can be quite challenging. We’ll essentially be building bits and pieces of what would typically go into regex libraries, but we’ll only be building as much as we need, so it will stay small and tight.
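
For contrast, here’s roughly what the same word-splitting exercise looks like using std::regex. This is purely to show where the allocations creep in, not code from the project:

#include <regex>
#include <string>
#include <cstdio>

// Same word splitting, the std::regex way.  The std::string copy,
// the compiled pattern, and each match result all allocate on the heap.
void splitWithRegex(const char* text)
{
    std::string s(text);                        // copies the input
    std::regex wordPattern("[^ \t\r\n\v\f]+");  // compiled pattern

    for (auto it = std::sregex_iterator(s.begin(), s.end(), wordPattern);
         it != std::sregex_iterator(); ++it)
    {
        printf("%s\n", it->str().c_str());      // str() allocates yet another string
    }
}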

And there you have it. We have begun our journey with these two first small steps, the ByteSpan, and the charset.

Next time, we’ll see how easy it is to ‘parse’ some xml, as we introduce the XmlElement and XmlElementIterator.


Creating An SVG Viewer from the ground up

In the series I did last summer (Hello Scene), I walked through the fundamentals of creating a simple graphics system, from the ground up. Putting a window on the screen, setting pixels, drawing, text, visual effects, screen captures, and all that. Along the way, I discussed various design choices and tradeoffs that I made while creating the code.

While capturing screenshots and flipping some bits might make for a cool demo, at the end of the day, I need to create actual applications that are robust, performant, functional, and a delight for the user to use. A lot of what we see today are “web apps”, that is, things that are created to be run in a web browser. Web apps have a lot of HTML, CSS, and JavaScript, and are programmed with myriad frameworks, in multiple languages, on the front end and the back end. It’s a whole industry out there!

One question arises for me though, and perhaps a bit of envy. Why do those web apps get to look so great, with their fancy gradients, shadows, and animations, whereas my typical applications look like they’re stuck in a computer geek movie from the late 2000s? I’m talking about desktop apps, and why they haven’t changed much in the past 20 years. Maybe we get a splash here and there with some changes in icon styles (shadows, transparency, flat, ‘dark’), but really, the rest of the app looks and feels the same. No animations, no fancy pictures, everything is square, just no fun.

Well, to this point, I’ve been on a mission to create more engaging desktop app experiences, and it starts with the graphics. To that end, I looked out into the world and saw that SVG (Scalable Vector Graphics) would be a great place to start. Vector graphics are great. The other form of graphics is ‘bitmap’. Bitmap graphics are the realm of file formats such as ‘png’, ‘jpeg’, ‘gif’, ‘webp’, and the like. As the name implies, a ‘bitmap’ is just a bunch of dots of color arranged in a rectangular grid. There are a couple of challenges with bitmap graphics. One is that when you scale them, the image starts to look “pixelated”. You know, it gets the ‘jaggies’, and it just doesn’t look that great.

The second challenge is that the image is static. Take a bitmap of a keyboard, for example: you don’t know where the keys are located within the image, so being able to push them, or have them reflect music that’s playing, is quite a hard task.

In steps vector graphics. Vector graphics contain the original drawing commands that were used to create the image, so it can be rendered as a bitmap at any size. With a vector graphics file, you retain the information about colors, locations, geometry, everything that went into creating the image. This means that you can locate individual elements, name them, change them while the application is running, and so on.

Why don’t we just use vector graphics all the time then? Honestly, I really don’t know. I do know that one impediment to using them is being able to parse the format and do something meaningful with it. To date, you mostly find support for SVG in web browsers, where they’re already parsing this kind of data. In that environment, you have full access to all those annotations, and furthermore, you can attach JavaScript code to various actions, like mouse hovering, clicking, dragging, and the like. But, for the most part, desktop applications don’t participate in that world. Instead, we’re typically stuck with bitmap graphics and clunky UI builders.

To change that, the first step is parsing the .svg file format. Lucky for me, SVG is based on XML, which is the first thing I worked on at Microsoft back in 1998. I never actually wrote the parser (worked on XSLT originally), but I’m super familiar with it. So, that’s where to start.

In this series, I’m going to write a functional SVG parser, capable of generating bitmap images from SVG, as well as operating in an application development environment for desktop apps. I will be using the blend2d graphics library to do all the super heavy lifting of rendering the actual images, but I will focus on what goes into writing the parser, and seamlessly integrating the results into useful desktop applications.

So, follow along over the next few installments to see how it’s done.