How does an assembler work/how is it written?

A long time ago, in a galaxy far away, the very first assemblers were written directly in machine code. Once you have an assembler, though, you can use it to assemble new assemblers, and you can also use it to assemble a compiler.

Once you have a compiler, you can use it to compile new assemblers.

So in practice, assemblers are nowadays written in higher-level languages, often C or C++. As to how they work: a very, very simple assembler is essentially just a big switch statement that recognizes each opcode mnemonic and translates it to the corresponding machine encoding.
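
To make the "big switch statement" idea concrete, here is a minimal sketch in Python, with a dictionary lookup standing in for the switch. The instruction set is entirely made up for illustration (an 8-bit format with the opcode in the high nibble and a 4-bit operand in the low nibble), and the names `OPCODES`, `assemble_line` and `assemble` are hypothetical. A real assembler also has to deal with labels, multiple operand formats, directives and so on.

```python
# Hypothetical 8-bit toy ISA (an assumption, not a real CPU):
# high nibble = opcode, low nibble = 4-bit operand.
OPCODES = {"NOP": 0x0, "LOAD": 0x1, "ADD": 0x2, "STORE": 0x3, "JMP": 0x4, "HALT": 0xF}
NO_OPERAND = {"NOP", "HALT"}  # mnemonics that take no operand


def assemble_line(line):
    """Translate one line of toy assembly into one byte, or None if blank."""
    text = line.split(";", 1)[0].strip()  # drop ';' comments and whitespace
    if not text:
        return None
    parts = text.split()
    mnemonic = parts[0].upper()
    if mnemonic not in OPCODES:  # the "default:" branch of the switch
        raise ValueError(f"unknown mnemonic: {mnemonic}")
    if mnemonic in NO_OPERAND:
        if len(parts) != 1:
            raise ValueError(f"{mnemonic} takes no operand")
        operand = 0
    else:
        if len(parts) != 2:
            raise ValueError(f"{mnemonic} takes exactly one operand")
        operand = int(parts[1], 0)  # base 0 accepts 5, 0x5, 0b101
        if not 0 <= operand <= 0xF:
            raise ValueError(f"operand out of range: {operand}")
    # Recognize the opcode, then pack it with the operand into one byte.
    return (OPCODES[mnemonic] << 4) | operand


def assemble(source):
    """Assemble a multi-line program into machine-code bytes."""
    out = bytearray()
    for line in source.splitlines():
        byte = assemble_line(line)
        if byte is not None:
            out.append(byte)
    return bytes(out)
```

For example, `assemble("LOAD 5\nADD 3\nHALT").hex()` gives `"1523f0"` under this made-up encoding.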


I am actually doing this exact problem: assembling an assembler without having an assembler. The way I am doing it is with an Excel spreadsheet. The spreadsheet has formulas in it to look up opcodes, etc., and calculate the binary output. In the old days they did the same thing, except they used paper spreadsheets instead of computer-based spreadsheets. Here is a screenshot of my manual assembler; the highlighted area is the machine code:

Screenshot of Manual Assembler

The area on the right is the assembly code. The area on the left is the intermediate calculations. Here is a closeup:

Closeup of manual assembler

So, the short answer to the question is: in the old days, manual assembly was done on a paper spreadsheet. Nowadays, we do it on a computer spreadsheet (I use Excel).

In my spreadsheet the column labeled "Encode" is the actual binary code (machine code). The assembly instructions are on the right along with a description of what they are doing.
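
For anyone who would rather see the idea in code than in cells, here is a rough Python sketch of what a spreadsheet like this does row by row. It is not a copy of the actual spreadsheet: the opcode table, the 8-bit instruction format, the function name `worksheet` and every column name other than "Encode" are assumptions made for illustration.

```python
# Hypothetical opcode lookup table, playing the role of a lookup range
# in the spreadsheet. These encodings are invented for this example.
OPCODE_TABLE = {"LOAD": "0001", "ADD": "0010", "STORE": "0011", "HALT": "1111"}


def worksheet(program):
    """Build one row per instruction, with intermediate and final columns.

    `program` is a list of (mnemonic, operand, description) tuples.
    """
    rows = []
    for address, (mnemonic, operand, description) in enumerate(program):
        if not 0 <= operand <= 15:
            raise ValueError(f"operand does not fit in 4 bits: {operand}")
        opcode_bits = OPCODE_TABLE[mnemonic]   # comparable to a VLOOKUP on the mnemonic
        operand_bits = format(operand, "04b")  # comparable to DEC2BIN(operand, 4)
        rows.append({
            # Left side: intermediate calculations.
            "Address": address,
            "Opcode bits": opcode_bits,
            "Operand bits": operand_bits,
            # The machine code itself.
            "Encode": opcode_bits + operand_bits,
            # Right side: the assembly and what it does.
            "Assembly": f"{mnemonic} {operand}",
            "Description": description,
        })
    return rows


program = [
    ("LOAD", 5, "load the value at address 5"),
    ("ADD", 3, "add the value at address 3"),
    ("HALT", 0, "stop"),
]
for row in worksheet(program):
    print(row["Address"], row["Encode"], row["Assembly"], "-", row["Description"])
```

Each dictionary corresponds to one spreadsheet row, and the "Encode" entries read top to bottom are the machine code for the program.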