Write an interpreter for 99
Python 3, 421 414 410 404 388 395 401 bytes
Golfed:
import sys,re
v,i,c,g,L={},0,[re.sub('( ?)[^9]+','\\1',l).rstrip().split(' ')for l in open(sys.argv[1])],lambda i:v.get(i,int(i)//9),len
while-1<i<L(c):
d=c[i];l=L(d);e,*f=d;i+=1
if l>1:
x,*y=f
if e:w=list(map(g,f));v[e]=sum(w[::2])-sum(w[1::2])
elif l==2:j=input();v[x]=int(j)if L(x)%2 else ord(j)
elif~-any(g(j)for j in y):i=g(x)*9
elif e:w=g(e);print(w if L(e)%2 else chr(w%128),end='')
Ungolfed:
import sys, re
# Intialise variable table.
vars_ = {}
get_var = lambda i: vars_.get(i, int(i)//9)
# Parse commands.
commands=[re.sub('( ?)[^9]+','\\1',l).rstrip().split(' ') for l in open(sys.argv[1])]
# Run until the current instruction index is out of bounds.
index=0
while 0 <= index < len(commands):
# Get the current command and increment the index.
command = commands[index]
l = len(command)
first = command[0]
index += 1
if l > 1:
# Handle the "assignment" command.
if first:
operands = [get_var(i) for i in command[1:]]
vars_[first] = sum(operands[0::2]) - sum(operands[1::2])
# Handle the "input" command.
elif l==2:
inp = input()
vars_[command[1]] = int(inp) if len(command[1]) % 2 else ord(inp)
# Handle the "goto" command.
elif not any(get_var(i) for i in command[2:]):
index = get_var(command[1]) * 9
# Handle the "output" command.
elif first:
val = get_var(first)
print(val if len(first) % 2 else chr(val % 128),end='')
Pretty much just a literal implementation of the spec, golfed down as far as I can get it.
Run from the command line by supplying a 99 source-code file as the sole argument (e.g. the last example from the OP):
> python3 ninetynine.py countdown.txt
G11G10G9G8G7G6G5G4G3G2G1G
>
As an added bonus, here's a (rather poor) implementation of "99 bottles" in 99: http://pastebin.com/nczmzkFs
Common Lisp, 1180 857 837 836 bytes
I know this is not going to win, but I had fun golfing this one. I managed to remove 343 bytes, which is more than two 99 interpreters written in CJam.
Also, quite amusingly, the more I try to compress it, the more I am persuaded that for Common Lisp, it is shorter to compile the code than to try interpreting it on the fly.
(defmacro g(g &aux a(~ -1)> d x q(m 0)r v(n t)c(w 0)? u z)(flet((w(n p)(intern(format()"~a~a"p n))))(#1=tagbody %(case(setf c(ignore-errors(elt g(incf ~))))(#\ #2=(when(> w 0)(pushnew w v)(if u()(setq ?(oddp w)))(#5=push(w w'V)u)(setf w 0))(setf z t))(#\9(incf w)(setf >(or >(and n z))z()n()))((#\Newline())#2#(#5#(when u(setf u(reverse u)a(pop u))(if >(if u`(when(every'zerop(list,@u))(setf @,a)(go ^))`(setf,a,(if ?'(read)'(char-code(read-char)))))(if u`(setf,a,(do(p m)((not u)`(-(+,@p),@m))(#5#(pop u)p)(#5#(if u(pop u)0)m)))`(princ,(if ? a`(code-char(mod,a 128)))))))r)(incf m)(setf ?()u()z()>()n t)))(if c(go %))$(decf m)(setq d(pop r))(if d(#5# d x))(when(=(mod m 9)0)(#5#(w #3=(/ m 9)'L)x)(#5#`(,#3#(go,(w #3#'L)))q))(if(>= m 0)(go $)))`(let(@,@(mapcar(lambda(n)`(,(w n'V),(/(1-(expt 10 n))9)))v))(#1#,@x(go >)^(case @,@q)>))))
lexical analysis and code generation are interleaved: I do not store the internal representation, but process directly each line.
there is a single
tagbody
to perform 2 loops:(... (tagbody % ... (go %) $ ... (go $)) result)
local variables are declared in
&aux
don't generate a closure, but directly the interpreted code
etc.
Ungolfed, commented
(defmacro parse-99
(string &aux
(~ -1) ; current position in string
a ; first variable in a line
> ; does current line starts with a leading space?
d ; holds a statement during code generation
x ; all statements (labels + expressions)
q ; all generated case statements
(m 0) ; count program lines (first increases, then decreases)
r ; list of parsed expressions (without labels)
v ; set of variables in program, as integers: 999 is 3
(n t) ; are we in a new line without having read a variable?
c ; current char in string
(w 0) ; currently parsed variable, as integer
? ; is first variable odd?
u ; list of variables in current line, as integers
z) ; is the last read token a space?
(flet((w(n p)
;; produce symbols for 99 variables
;; e.g. (10 'V) => 'V10
;; (4 'L) => 'L4
(intern(format()"~a~a"p n))))
(tagbody
parse
(case (setf c
;; read current char in string,
;; which can be NIL if out-of-bounds
(ignore-errors(aref string (incf ~))))
;; Space character
(#\Space
#2=(when(> w 0)
(pushnew w v) ; we were parsing a variable, add it to "v"
(if u()(setq ?(oddp w))) ; if stack is empty, this is the first variable, determine if odd
(push(w w'V)u) ; add to stack of statement variable
(setf w 0)) ; reset w for next variable
;; Space can either be significant (beginning of line,
;; preceding a variable), or not. We don't know yet.
(setf z t))
;; Nine
(#\9
(incf w) ; increment count of nines
(setf >(or >(and n z)) ; there is an indent if we were
; starting a newline and reading a
; space up to this variable (or if we
; already know that there is an
; indent in current line).
;; reset z and n
z()n()))
;; Newline, or end of string
((#\Newline())
#2# ;; COPY-PASTE the above (when(> w 0)...) statement,
;; which adds previously read variable if necessary.
;; We can now convert the currently read line.
;; We push either NIL or a statement into variable R.
(push(when u
(setf u (reverse u) ; we pushed, we must reverse
a (pop u)) ; a is the first element, u is popped
(if >
;; STARTS WITH LEADING SPACE
(if u
;; JUMP
`(when(every'zerop(list,@u))(setf @,a)(go ^))
;; READ
`(setf,a,(if ?'(read)'(char-code(read-char)))))
;; STARTS WITH VARIABLE
(if u
;; ARITHMETIC
`(setf,a,(do(p m) ; declare p (plus) and m (minus) lists
;; stopping condition: u is empty
((not u)
;; returned value: (- (+ ....) ....)
`(-(+,@p),@m))
;; alternatively push
;; variables in p and m, while
;; popping u
(push(pop u)p)
;; first pop must succeed, but
;; not necessarly the second
;; one. using a zero when u
;; is empty covers a lot of
;; corner cases.
(push(if u (pop u) 0) m)))
;; PRINT
`(princ,(if ? a`(code-char(mod,a 128)))))))
r)
;; increase line count
(incf m)
;; reset intermediate variables
(setf ?()u()z()>()n t)))
;; loop until end of string
(if c (go parse))
build
;;; Now, we can add labels in generated code, for jumps
;; decrease line count M, which guards our second loop
(decf m)
;; Take generated statement from R
(setq d(pop r))
;; we pop from R and push in X, which means X will eventually
;; be in the correct sequence order. Here, we can safely
;; discard NIL statements.
;; We first push the expression, and THEN the label, so that
;; the label ends up being BEFORE the corresponding statement.
(if d(push d x))
;; We can only jump into lines multiple of 9
(when (=(mod m 9)0)
;; Push label
(push(w #3=(/ m 9)'L)x)
;; Also, build a case statement for the jump table (e.g. 2(go L2))
(push`(,#3#(go,(w #3#'L)))q))
;; loop
(if(>= m 0)(go build)))
;; Finally, return the code
`(let(@ ; target of a jump instruction
;; other variables: V3 represents 999 and has a default value of 111
,@(mapcar(lambda(n)`(,(w n'V),(/(1-(expt 10 n))9)))v))
;; build a tagbody, inject statements from X and case statements from Q
;; label ^ points to jump table : we go to ^ each time there is a JUMP
;; label > is the end of program
;; note that if the case does not match any authorized target
;; address, we simply end the programs.
(tagbody,@x(go >)^(case @,@q)>))))
We use standard input/output during evaluation, meaning we use standard read
and princ
functions. Hence, the resulting code can be made executable on the command-line, as show below.
Inputs are not completely well sanitized when running 99 programs: it is assumed that the user knows what kind of values are expected.
The only possible runtime overhead can occur when jumping, since we must evaluate the value of a variable and match that value to a label. Except that, the interpreter shall be quite efficient.
Based on the clever obseravtion from Mac that we don't need to divide and multiply by 9 each time, the current version manages to never divides nor multiply by 9 during execution.
Example
If we replace defmacro
by defun
, we see the generated code. For example:
(g
"99999999 Print G.
999 99 Set triple-nine to ninety-nine.
9999999999 9999999999 9999999999 99 99 9 9 999 999 Set 10-nine to zero.
99999999999 9999999999 Set 11-nine to zero.
999 Print triple-nine's value divided by nine. (This is the ninth line.)
99999999 Print G.
999 999 9 Subtract nine from triple-nine.
99999 999 Jump to line 5-nines if triple-nine is zero (endsprogram).
9 99999999999 9999999999 Jump to line nine if 10-nine and 11-nine are zero (alwa
")
Here is the resulting code:
(LET (@
(V5 11111)
(V11 11111111111)
(V1 1)
(V10 1111111111)
(V2 11)
(V3 111)
(V8 11111111))
(TAGBODY
L0
(PRINC (CODE-CHAR (MOD V8 128)))
(SETF V3 (- (+ V2) 0))
(SETF V10 (- (+ V3 V1 V2 V10) V3 V1 V2 V10))
(SETF V11 (- (+ V10) 0))
L1
(PRINC V3)
(PRINC (CODE-CHAR (MOD V8 128)))
(SETF V3 (- (+ V3) V1))
(WHEN (EVERY 'ZEROP (LIST V3)) (SETF @ V5) (GO ^))
(WHEN (EVERY 'ZEROP (LIST V11 V10)) (SETF @ V1) (GO ^))
(GO >)
^
(CASE @ (0 (GO L0)) (1 (GO L1)))
>))
When executed, prints "G11G10G9G8G7G6G5G4G3G2G1G"
Command-line
We can build an executable by dumping a core and specifying the toplevel
function. Define a file named boot.lisp
where you put the defmacro
, and then write the following:
(defun main()(parse-99 <PROGRAM>))
(save-lisp-and-die "test-99" :executable t :toplevel #'main)
Running sbcl --load boot.lisp
gives the following output:
$ sbcl --load boot.lisp
This is SBCL 1.2.8.32-18c2392, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.
SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses. See the CREDITS and COPYING files in the
distribution for more information.
[undoing binding stack and other enclosing state... done]
[saving current Lisp image into test-99:
writing 5824 bytes from the read-only space at 0x20000000
writing 3120 bytes from the static space at 0x20100000
writing 55771136 bytes from the dynamic space at 0x1000000000
done]
Then, running the compiled 99 program:
$ time ./test-99
G11G10G9G8G7G6G5G4G3G2G1G
real 0m0.009s
user 0m0.008s
sys 0m0.000s
99 bottles
If you are interested, here is the compiled code for the 99 bottles program written in Mac's answer: http://pastebin.com/ZXe839CZ (this is the old version where we have jmp
and end
labels, a surrounding lambda and prettier arithmetic).
Here is an execution with the new version, to prove it still works: http://pastebin.com/raw.php?i=h73q58FN
CJam, 157 bytes
{:I;_N" 9"+--N/:P:,$W=){1a*Ab}%:V;{PT):T(=:LS%_{LS#\:,_,({(\{V=}%@{V-1@{2$*+0@-\}*\;t:V;}{:|T@V=9*?:T;}?}{~\{_V=\1&!{128%c}*o}{VIW):W=it:V;}?}?}R?Tg)TP,<*}g}
Try it online:
- I/O Test
- G11G10G9G8G7G6G5G4G3G2G1G
- 99 bottles of beer (this code is too long to fit in a URL, so copy the code and paste it into the interpreter manually)
Explanation
Trying to format this with proper indentation and comments would probably take forever, so I'll just give an algorithmic summary.
The code is a block, CJam's analog to anonymous functions. The block expects the program string and list of inputs on the stack when executed.
Initialization consists of three steps. First, the input list is saved. Then, every character in the program that isn't meaningful is removed and the result is split into a list of lines and saved. Finally, the variable list is initialized. This list maps each variable, indexed by name length, to its value divided by 9 (a variable can never hold a value that's not a multiple of 9, and all operations except goto benefit from this change). The list is initialized up to the length of the longest line, which is an upper bound on the longest varaible name present. There's also a bit of implicit initialization due to initial variable values: the line number is 0 and the input index is -1.
The interpreter is implemented as one would expect: a loop that reads the next line, increments the line number, and executes the line while the line number points to an existing line. Line parsing first checks that the line isn't empty, then branches based on whether the arity is 1 or >1, then branches based on whether there was a leading space. These four branches emulate the four (excluding no-op) operations in a mostly straightforward manner, albeit agressively golfed like everything else. Perhaps one optimization of note is that, since a valid input sequence should always produce an element of type expected by the program, I omitted making separate cases for input based on the length of the variable name. It is simply assumed that the element read from the input list is of the expected type.