Explain concatenative languages to me like I'm an 8-year-old

In normal programming languages, you have variables which can be defined freely and you call methods using these variables as arguments. These are simple to understand but somewhat limited. Often, it is hard to reuse an existing method because you simply can't map the existing variables into the parameters the method needs or the method A calls another method B and A would be perfect for you if you could only replace the call to B with a call to C.

Concatenative language use a fixed data structure to save values (usually a stack or a list). There are no variables. This means that many methods and functions have the same "API": They work on something which someone else left on the stack. Plus code itself is thought to be "data", i.e. it is common to write code which can modify itself or which accepts other code as a "parameter" (i.e. as an element on the stack).

These attributes make this languages perfect for chaining existing code to create something new. Reuse is built in. You can write a function which accepts a list and a piece of code and calls the code for each item in the list. This will now work on any kind of data as long it's behaves like a list: results from a database, a row of pixels from an image, characters in a string, etc.

The biggest problem is that you have no hint what's going on. There are only a couple of data types (list, string, number), so everything gets mapped to that. When you get a piece of data, you usually don't care what it is or where it comes from. But that makes it hard to follow data through the code to see what is happening to it.

I believe it takes a certain set of mind to use the languages successfully. They are not for everyone.

[EDIT] Forth has some penetration but not that much. You can find PostScript in any modern laser printer. So they are niche languages.

From a functional level, they are at par with LISP, C-like languages and SQL: All of them are Turing Complete, so you can compute anything. It's just a matter of how much code you have to write. Some things are more simple in LISP, some are more simple in C, some are more simple in query languages. The question which is "better" is futile unless you have a context.


First I'm going to make a rebuttal to Norman Ramsey's assertion that there is no theory.

Theory of Concatenative Languages

A concatenative language is a functional programming language, where the default operation (what happens when two terms are side by side) is function composition instead of function application. It is as simple as that.

So for example in the SKI Combinator Calculus (one of the simplest functional languages) two terms side by side are equivalent to applying the first term to the second term. For example: S K K is equivalent to S(K)(K).

In a concatenative language S K K would be equivalent to S . K . K in Haskell.

So what's the big deal

A pure concatenative language has the interesting property that the order of evaluation of terms does not matter. In a concatenative language (S K) K is the same as S (K K). This does not apply to the SKI Calculus or any other functional programming language based on function application.

One reason this observation is interesting because it reveals opportunities for parallelization in the evaluation of code expressed in terms of function composition instead of application.

Now for the real world

The semantics of stack-based languages which support higher-order functions can be explained using a concatenative calculus. You simply map each term (command/expression/sub-program) to be a function that takes a function as input and returns a function as output. The entire program is effectively a single stack transformation function.

The reality is that things are always distorted in the real world (e.g. FORTH has a global dictionary, PostScript does weird things where the evaluation order matters). Most practical programming languages don't adhere perfectly to a theoretical model.

Final Words

I don't think a typical programmer or 8 year old should ever worry about what a concatenative language is. I also don't find it particularly useful to pigeon-hole programming languages as being type X or type Y.


After reading http://concatenative.org/wiki/view/Concatenative%20language and drawing on what little I remember of fiddling around with Forth as a teenager, I believe that the key thing about concatenative programming has to do with:

  • viewing data in terms of values on a specific data stack
  • and functions manipulating stuff in terms of popping/pushing values on the same the data stack

Check out these quotes from the above webpage:

There are two terms that get thrown around, stack language and concatenative language. Both define similar but not equal classes of languages. For the most part though, they are identical.

Most languages in widespread use today are applicative languages: the central construct in the language is some form of function call, where a function is applied to a set of parameters, where each parameter is itself the result of a function call, the name of a variable, or a constant. In stack languages, a function call is made by simply writing the name of the function; the parameters are implicit, and they have to already be on the stack when the call is made. The result of the function call (if any) is then left on the stack after the function returns, for the next function to consume, and so on. Because functions are invoked simply by mentioning their name without any additional syntax, Forth and Factor refer to functions as "words", because in the syntax they really are just words.

This is in contrast to applicative languages that apply their functions directly to specific variables.

Example: adding two numbers.

Applicative language:

int foo(int a, int b)
{
    return a + b;
}

var c = 4;
var d = 3;
var g = foo(c,d);

Concatenative language (I made it up, supposed to be similar to Forth... ;) )

push 4
push 3
+
pop

While I don't think concatenative language = stack language, as the authors point out above, it seems similar.