What's the difference between data and code?

Fundamentally, there is of course no difference between data and code, but for real software infrastructures, there can be a big difference. Apart from obvious things like, as you mentioned, compilation, the biggest issue is this:

Most sufficiently large projects are designed to produce "releases" that are one big bundle, produced in 3-month (or longer) cycles, tested extensively and cannot be changed afterwards except in tightly controlled ways. "Code" most definitely cannot be changed, so anything that does need to be changed has to be factored out and made "configuration data" so that changing it becomes palatable those whose job it is to ensure that a release works.

Of course, in most cases bad configuration data can break a release just as thoroughly as bad code, so the whole thing is largely an illusion - in reality it doesn't matter whether it's code or "configuration data" that changes, what matters is that the interface between the main system and the parts that change is narrow and well-defined enough to give you a good chance that the person who does the change understands all consequences of what he's doing.

This is already harder than most people think when it's really just a few strings and numbers that are configured (I've personally witnessed a production mainframe system crash because it had one boolean value set differently than another system it was talking to). When your "configuration data" contains complex logic, it's almost impossible to achieve. But the situation isn't going to be any better ust because you use a badly-designed ad hoc "rules configuration" language instead of "real" code.


This is a rather philosophical question (which I like) so I'll answer it in a philosophical way: with nothing much to back it up. ;)

Data is the part of a system that can change. Code defines behavior; the way in which data can change into new data.

To put it more accurately: Data can be described by two components: a description of what the datum is supposed to represent (for instance, a variable with a name and a type) and a value. The value of the variable can change according to rules defined in code. The description does not change, of course, because if it does, we have a whole new piece of information. The code itself does not change, unless requirements (what we expect of the system) change.

To a compiler (or a VM), code is actually the data on which it performs its operations. However, the to-be-compiled code does not specify behavior for the compiler, the compiler's own code does that.


It all depends on the requirement. If the data is like lookup data and changes frequently you dont really want to do it in code, but things like Day of the Week, should not chnage for the next 200 years or so, so code that.

You might consider changing your topic, as the first thing I thought of when I saw it, was the age old LISP discussion of code vs data. Lucky in Scheme code and data looks the same, but thats about it, you can never accidentally mix code with data as is very possible in LISP with unhygienic macros.