Data-Driven Programming



Back in the days when real men programmed assembler, and the interwebs were nothing but a wet dream, data-driven programming (DDP) was a big thing. It was one of the most important ways of improving the capabilities of crappy assemblers and compilers and enhancing the performance of slow computers. Imagine doing a sine calculation on a 1 MHz 8-bit Motorola MPU without floating point.

Then you would know what one flop is. 

The solution was to replace the insanely costly math with a sine lookup table, thus replacing code with data. If you wanted an arctan function instead, it just required a new table.
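To make the table idea concrete, here is a minimal sketch in C#. It is not period-accurate assembler and not code from any of my projects; the class name SineTable, the 256-step angle, and the scale of 127 are all assumptions picked for illustration.

```csharp
using System;

// Hypothetical sketch: sine as a 256-entry table of signed bytes.
public static class SineTable
{
    private const int Size = 256;   // one full turn split into 256 steps
    private const int Scale = 127;  // results fit in a signed byte: -127..127

    private static readonly sbyte[] Table = BuildTable();

    private static sbyte[] BuildTable()
    {
        var table = new sbyte[Size];
        for (int i = 0; i < Size; i++)
        {
            // The costly math runs once, up front. On an old 8-bit machine
            // the table would simply be shipped as data.
            table[i] = (sbyte)Math.Round(Math.Sin(2.0 * Math.PI * i / Size) * Scale);
        }
        return table;
    }

    // angle is in 1/256ths of a full turn, so a byte wraps around for free.
    public static sbyte Sin(byte angle) => Table[angle];

    // Cosine is the same table read a quarter turn (64 steps) ahead.
    public static sbyte Cos(byte angle) => Table[(angle + 64) & 0xFF];
}
```

At runtime, a call like SineTable.Sin(64) is a single array read. Swapping in another function only means generating a different table; the lookup code stays the same.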


Today, DDP is less often used just to speed up processing. Instead, you use it to add flexibility and vastly improve the functionality of your code. Let me give you a modern example.


If you look at the beta of the video editor, you will undoubtedly notice that it looks like crap. Compare this to the latest iteration of the design, made by the brilliant designer Amalia Plana. Apart from looking absolutely stunning, the takeaway here is that it is the exact same code running. 



A new DDP bottom menu was added, but apart from that, the code is identical. This was made possible because the design, layout, coloring, and large parts of the functionality were all controlled by data. Not code. 

I did this because I knew early on that we would have to change this many times for landscape, portrait, tablets, phones, etc. To avoid having to change code an endless number of times, possibly introducing bugs and what-have-you, it was all implemented using DDP. Because of that, the design and the core video editing functionality became the least of our problems. Thanks, DDP.



I would even argue that there is stuff that would be very hard, if not impossible, to do in code alone. If you look at the snow globe video, the goal was to make a “realistic” snow globe demonstration. I wanted it to look and feel like a real snow globe, with “fluid currents” affecting the flow and momentum of the flakes. 

As I am not a rocket surgeon working for NASA, I knew that it would be a challenge to do real calculations for this. Instead, I set up a very simple DDP physics application, where the number of snowflakes and their physics properties were loaded from setup files. I then added a simple check in the game loop that reloaded the physics simulation if the setup files were updated. This way I could have the snow globe running continuously, and each time I changed the setup files and saved them, the simulation would automatically update. The rest was just a matter of “cut and try”.
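The reload check can be sketched roughly like this. It is not the code from the snow globe; the class SetupFileWatcher and the idea of comparing the file's last write time are assumptions made for illustration.

```csharp
using System;
using System.IO;

// Hypothetical helper: reports when a setup file was saved since the last check.
public sealed class SetupFileWatcher
{
    private readonly string _path;
    private DateTime _lastSeenWriteUtc;

    public SetupFileWatcher(string path)
    {
        _path = path;
        _lastSeenWriteUtc = File.GetLastWriteTimeUtc(path);
    }

    // Call once per frame (or every N frames) from the game loop.
    // Returns true exactly once for each new save of the file.
    public bool HasChanged()
    {
        DateTime current = File.GetLastWriteTimeUtc(_path);
        if (current == _lastSeenWriteUtc)
        {
            return false;
        }
        _lastSeenWriteUtc = current;
        return true;
    }
}
```

In the game loop this would be used as something like "if (watcher.HasChanged()) ReloadSimulation();", where ReloadSimulation is a placeholder for whatever rebuilds the physics from the setup files. Polling the timestamp is crude, but for a tweak-and-watch workflow it is usually enough.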

This way, I found out that:

  • Most of the snowflakes don't even have to have collision enabled.
  • Some snowflakes have collision enabled to simulate stacking.
  • A few snowflakes are extremely heavy, plowing through everything, simulating fluids.

And that is it. 

I have no idea what the settings were, but that is also almost irrelevant. It is not about the numbers, but only about trying out stuff, and seeing how it works.


Unfortunately, DDP comes at a price. While using DDP is like having an insanely powerful hammer when hammering in nails, the downside is that if you don't hit the nail, you will hit your thumb. Each time. Before I get into that, let me go over the basics of DDP.


At its lowest level, DDP means getting rid of all hard-coding.



_physics.AddDynamicNode("solid", 15.0f, 5.0f);



The above line is plain old hard-coding, meaning that string literals and numbers are inserted directly in the code. Obviously, readability suffers greatly, as no one knows what 15.0f or 5.0f is, so the first step will be to fix this.



const float BALL_BREAK_FORCE = 15.0f;
const float BALL_DENSITY = 5.0f;

_physics.AddDynamicNode("solid", BALL_BREAK_FORCE, BALL_DENSITY);



String literals are obviously not as bad for readability, so in cases where they are “just” used for displaying messages, you could argue that they are fine. However, if they are used programmatically, for example as keys, a simple spelling mistake could break the application. Because of that, I most often use a static class defining all the keywords in use, and the final line would end up as:



public static class RSKeys
{
    // const members are implicitly static in C#, so "static const" will not compile
    public const string SOLID = "solid";
    public const …
    public const …
}

const float BALL_BREAK_FORCE = 15.0f;
const float BALL_DENSITY = 5.0f;

_physics.AddDynamicNode(RSKeys.SOLID, BALL_BREAK_FORCE, BALL_DENSITY);



This is actually the first (small) step in DDP. You will end up having a section in your code defining all the data used, and by changing this data, you can to some extent change the behavior of the application. However, it doesn't take much of a game for the “data” section to grow into a long, boring list of unrelated constants and readonlys. We need some kind of grouping and a way of organizing all this data.


The rather obvious answer is to organize them into structs (classes), and make these available to the application. You might even want to split the data into a setup file of its own. If we then also added a version of AddDynamicNode that took a setup class, the code would look like this:



// somewhere in a setup class

public static RSBallSetup WEAK_BALL = new RSBallSetup
{
    BreakForce = 15.0f,
    Density = 5.0f
};

public static RSBallSetup NORMAL_BALL = new RSBallSetup
{
    BreakForce = 25.0f,
    Density = 5.0f
};

// and somewhere in the actual code

_physics.AddDynamicNode(RSKeys.SOLID, RSPhysicsSetup.WEAK_BALL);



At this point you have well-organized data, high readability, and little chance of things accidentally breaking. This is good code. But still, you need a struct for each and every setup, which again will lead to repeated settings and not much flexibility. If you, for example, had 10 physics setups with a default density of 5.0, and you wanted to change that, you would have to edit it in all 10 places. Especially for larger applications, we need more flexibility than that in DDP.


This brings us to the final step in implementing DDP, namely data stored in separate data files and loaded at runtime. There are many ways of doing this, but I prefer loading JSON files into dictionaries and arrays. JSON is nice because it is easily readable, flexible, and will suit almost all needs. On top of that, it maps well onto dictionaries and arrays. Because of that, I often start by writing the RSDictionary and RSArray classes, which will be the safe and crash-proof way for the application to get its data.

Using this approach, the code would end up like this:



// the JSON file

"balls":
{
    "fallback":
    {
        "breakforce": 20.0,
        "density": 5.0
    },
    "weak":
    {
        "breakforce": 15.0
    },
    "normal":
    {
        "breakforce": 25.0
    }
}

// and somewhere in the actual code

_physics.AddDynamicNode(RSKeys.SOLID, setup.GetDictionary(RSKeys.WEAK_BALL));


 

The final cool thing about this approach is that you can apply a fallback paradigm that ensures that meaningful data is always returned. This is mind-blowingly useful, but will be for my next post. All in all, you can see it all in action in the latest release (Version_0.0.4) on the GitHub repo.
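Until that post, here is a rough sketch of what a fallback lookup could look like. It is not the RSDictionary from the repo: the class BallSetups, the method GetNumber, and the use of System.Text.Json are all assumptions, and the sketch expects the "balls" fragment above to be wrapped in an outer { } so it parses as a complete JSON document.

```csharp
using System.Text.Json;

// Hypothetical sketch of a fallback lookup; not the RSDictionary from the post.
public sealed class BallSetups
{
    private const string FallbackKey = "fallback";
    private readonly JsonElement _balls;

    public BallSetups(string json)
    {
        // Clone() keeps the element usable after the JsonDocument is disposed.
        // GetProperty throws if "balls" is missing; a real loader would handle that.
        using JsonDocument doc = JsonDocument.Parse(json);
        _balls = doc.RootElement.GetProperty("balls").Clone();
    }

    // Looks in the named setup first, then in "fallback", and finally
    // returns hardDefault, so the caller always gets a usable number.
    public double GetNumber(string setupName, string key, double hardDefault)
    {
        if (TryRead(setupName, key, out double value)) return value;
        if (TryRead(FallbackKey, key, out value)) return value;
        return hardDefault;
    }

    private bool TryRead(string setupName, string key, out double value)
    {
        value = 0.0;
        // Every step is checked, so a missing or mistyped entry never throws.
        return _balls.ValueKind == JsonValueKind.Object
            && _balls.TryGetProperty(setupName, out JsonElement setup)
            && setup.ValueKind == JsonValueKind.Object
            && setup.TryGetProperty(key, out JsonElement number)
            && number.ValueKind == JsonValueKind.Number
            && number.TryGetDouble(out value);
    }
}
```

With the JSON above, asking for the density of "weak" would come back as 5.0 from the "fallback" entry, while its breakforce stays at 15.0. Changing the default density then means editing one line in the data file.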


—oOo—


So. Why is all this goodness dangerous? What is it with that hammer and my thumb?


The main issue with full-fledged DDP is that it can be notoriously hard to debug. A few lines of code will often be responsible for loading many complicated objects, and finding out why object 127 suddenly has the wrong color can sometimes be very hard. This is amplified by the fact that you get no compiler checks or warnings for the data. You might have seen games where what seem like simple, minor bugs are not fixed for years, and while it, of course, could be because game programmers are lazy bastards, it could just as easily be because the root cause is buried very, very deep in a humongous data stack.

It is also very easy for DDP to grow into very complex data monsters, where data suddenly starts to depend on other data. Imagine building a data-driven setup for a manual watch. If you changed the tooth size of a single gear, you would most likely have to change a large number of other values, and you would have little way of knowing if you changed them all, and if you changed them correctly. This means that changing basic stuff in larger DDP systems is not easy and should not be taken lightly.

Finally, loading data at runtime might in some situations have an impact on performance. Because of this, large amounts of data, like textures in a game, are either pre-loaded or loaded in the background. Plus, having data files with human-readable data could in some applications lead to security risks.


Takeaway


Data-Driven Programming is a mighty powerful tool, which, used correctly, will greatly increase the flexibility and scalability of your application. However, it is also a tool that will suddenly bite you in the ass and provide you with a lot of hard-to-debug headaches. So especially if you are not DDP savvy, my advice would be to start out small and climb the ladder slowly.

With great power comes great responsibility.


/Lars




