Is there a programming language with built-in state machine construct?
There's a new W3C XML-based state machine language called SCXML, based on David Harel's StateChart formalism (which supports hierarchical and parallel state machines).
Apache Commons has a Java based implementation of SCXML:
Commons SCXML is an implementation aimed at creating and maintaining a Java SCXML engine capable of executing a state machine defined using a SCXML document, while abstracting out the environment interfaces.
Ragel is a state machine language. IOW, it's not a language that also supports state machines, it's a language that only supports state machines. Which obviously means that it's not Turing-complete, but who needs that?
More precisely, Ragel is a state machine compiler, which takes a description of a state machine in a regexp-like language and generates an implementation of that state machine in C, C++, Objective-C, D, Java or Ruby. (Think yacc
but for state machines instead of LALR(1) table parsers.) The primary purpose of Ragel is parsing binary protocols (such as networking protocols or also on-disk file formats), but it can just as well be used for text.
One famous example of using Ragel is the Mongrel webserver for Ruby: its HTTP kernel is written in Ragel, which makes it extremely fast and secure. The HTTP kernel is so good in fact, that it has been reused a number of times in different applications: Thin, Unicorn and Rainbows are also webservers, and in fact direct competitors to Mongrel. Ebb is a reverse HTTP proxy. RFuzz is a fuzz testing tool for web applications. Also, some security tools use it.
Ragel also allows embedding code in the host language into the state machine, thus making it Turing-complete, and able to not only recognize but also interpret protocols.
In general, every language with support for advanced user-defined control-flow via either coroutines (e.g. Lua) or continuations (e.g. Scala) or GOTO
(e.g. PHP) or proper tail calls (e.g. Scheme) can be used to easily implement state machines. (Generators (Python) aka iterators (C#), which are basically "crappy coroutines" might or might not work, depending on your definition of "work".) And any language which has flexible syntax (e.g. Ruby) or supports metasyntactic abstraction (e.g. Clojure) can be used to describe state machines. (Support for non-ASCII identifiers helps, too, so that you can use actual arrows for your state machine.)
Which means that if you combine the two, and use a language that supports both tail calls and metasyntactic abstraction, you get very nice state machines, without requiring native language support. Shriram Krishnamurthi gave a now (in)famous talk titled "The Swine before Perl" at the inaugural Lightweight Languages Conference, in which he demonstrated an FSM implementation in Scheme. (Here are the slides, an audio recording and a paper explaining the code). The code itself is a 26 line (very short lines, actually) macro, that allows you to write code like this:
(define my-regex
(automaton init
[init : (c → more)]
[more : (a → more)
(d → more)
(r → end)]
[end : accept]))
This is a specification of the state machine corresponding to the regular expression c(a|d)*r
. And it is not only a specification, but also a runnable program implementing that state machine.
I can call it like this:
(my-regex '(c a d a d d r))
And in this case get the result #t
(which is Scheme-speak for true
).