Why do we need the "NOP" (i.e. no-operation) instruction in the 8085 microprocessor?

One use of the NOP (or NOOP, no-operation) instruction in CPUs and MCUs is to insert a small, predictable delay in your code. Although a NOP performs no operation, it still takes some time to process: the CPU has to fetch and decode the opcode, which takes a little time. As little as 1 CPU cycle is "wasted" executing a NOP (the exact number can usually be found in the CPU/MCU datasheet), so putting N NOPs in sequence is an easy way to insert a predictable delay:

\$ t_{delay} = N \cdot T_{clock} \cdot K\$

where \$K\$ is the number of clock cycles needed to process a NOP instruction (most often 1), and \$T_{clock}\$ is the clock period.

Why would you do that? It may be useful to force the CPU to wait briefly for external (possibly slower) devices to complete their work and report data back, i.e. NOP is useful for synchronization purposes.

See also the related Wikipedia page on NOP.

Another use is to align code at certain addresses in memory, among other "assembly tricks", as also explained in this thread on Programmers.SE and in this other thread on StackOverflow.

Another interesting article on the subject.

This link to a Google book page especially refers to 8085 CPU. Excerpt:

Each NOP instruction uses four clocks for fetching, decoding and executing.
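To make the delay formula concrete, here's a minimal Python sketch of the arithmetic. The 4-cycles-per-NOP figure comes from the excerpt above; the 3 MHz clock is just an assumed example value:

```python
# Sketch: delay produced by a run of NOPs, t_delay = N * K * T_clock
# (K = clocks per NOP; 4 on the 8085, per the excerpt above).

def nop_delay_seconds(n_nops, clock_hz, clocks_per_nop=4):
    """Return the total delay in seconds for n_nops consecutive NOPs."""
    t_clock = 1.0 / clock_hz  # clock period T_clock
    return n_nops * clocks_per_nop * t_clock

# Example: an (assumed) 3 MHz 8085 executing 75 NOPs:
# 75 * 4 cycles / 3,000,000 Hz = 100 microseconds
print(nop_delay_seconds(75, 3_000_000))  # 0.0001
```

This is of course just the model; on real hardware you'd also account for interrupts and the instructions surrounding the NOP run.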

EDIT (to address a concern expressed in a comment)

If you are worried about speed, keep in mind that (time) efficiency is only one parameter to consider. It all depends on the application: if you want to compute the 10-billionth digit of \$\pi\$, then perhaps your only concern is speed. On the other hand, if you want to log data from temperature sensors connected to an MCU through an ADC, speed is usually not so important, but waiting the right amount of time to allow the ADC to complete each reading correctly is essential. In this case, if the MCU doesn't wait long enough it risks getting completely unreliable data (I concede it would get that data faster, though :o).


The other answers only consider a NOP that actually executes at some point - that's quite common, but it's not the only use of NOP.

The non-executing NOP is also pretty useful when writing code that can be patched - basically, you pad the function with a few NOPs after the RET (or similar instruction). When you have to patch the executable, you can easily add more code to the function starting from the original RET, using as many of those NOPs as you need (e.g. for long jumps or even inline code) and finishing with another RET.

In this use case, no one ever expects the NOP to execute. The only point is to allow patching the executable - in a theoretical non-padded executable, you'd have to actually change the code of the function itself (sometimes the patch might fit the original boundaries, but quite often you'll need a jump anyway) - that's a lot more complicated, especially with manually written assembly or an optimizing compiler; you have to respect jumps and similar constructs that might have pointed at some important piece of code. All in all, pretty tricky.
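As a byte-level illustration of the padding idea - not real patch tooling (instruction boundaries, relocations and so on are ignored), and with the x86 byte values 0xC3 for RET and 0x90 for NOP assumed:

```python
# Sketch: a function padded with NOPs after its RET leaves room for a
# patch; patching overwrites the old RET with new code and a new RET.
RET, NOP = 0xC3, 0x90

def patch_after_ret(code, extra):
    """Overwrite the original RET and its NOP pad with extra code + RET."""
    start = code.index(RET)      # position of the original RET
    room = len(code) - start     # the RET byte itself plus the NOP pad
    assert len(extra) + 1 <= room, "patch does not fit in the NOP pad"
    patched = bytearray(code)
    patched[start:start + len(extra)] = extra
    patched[start + len(extra)] = RET
    return bytes(patched)

# hypothetical 2-byte function body, RET, then a 5-NOP pad
original = bytes([0x01, 0x02, RET, NOP, NOP, NOP, NOP, NOP])
patched = patch_after_ret(original, bytes([0xAA, 0xBB]))  # made-up patch bytes
print(patched.hex())  # 0102aabbc3909090
```

Note that the function's total footprint never changes - that's exactly why the pad is there.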

Of course, this was much more heavily used in the olden days, when it was useful to keep patches like these small and apply them online. Today, you'll just distribute a recompiled binary and be done with it. There are still some uses of patching NOPs (executing or not, and not always literal NOPs - for example, Windows uses MOV EDI, EDI for online patching - the kind where you can update a system library while the system is actually running, without needing restarts).
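For the curious, the MOV EDI, EDI scheme can be sketched at the byte level too. This is a simplified, hypothetical model of the hot-patch layout as commonly described: the compiler leaves 5 padding bytes just before the function and a 2-byte MOV EDI, EDI (8B FF) at its entry; patching writes a long jump into the pad and a 2-byte short jump back onto it (EB F9, i.e. -7) over the entry:

```python
# Simplified byte-level sketch of MOV EDI, EDI hot-patching.
# Assumed encodings: MOV EDI,EDI = 8B FF; short JMP -7 = EB F9;
# near JMP rel32 = E9 xx xx xx xx.
import struct

PAD = 5  # bytes of padding assumed to sit just before the function entry

def hot_patch(image, func_off, target_off):
    """Redirect the function at func_off to the code at target_off."""
    assert image[func_off:func_off + 2] == b"\x8B\xFF", "no hot-patch point"
    patched = bytearray(image)
    # 1. long jump to the replacement, written into the 5-byte pad
    jmp_src = func_off - PAD
    rel32 = target_off - (jmp_src + 5)  # relative to the end of the JMP
    patched[jmp_src:func_off] = b"\xE9" + struct.pack("<i", rel32)
    # 2. 2-byte short jump from the entry back onto that long jump
    patched[func_off:func_off + 2] = b"\xEB\xF9"
    return bytes(patched)
```

The neat part: both writes are tiny, and the 2-byte entry overwrite replaces a 2-byte instruction, so a thread executing the function at that moment never sees a torn instruction.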

So the last question is, why have a dedicated instruction for something that doesn't really do anything?

  • It is an actual instruction - important when debugging or handcoding assembly. Instructions like MOV AX, AX will do exactly the same, but do not signal the intent quite so clearly.
  • Padding - "code" that's there just to improve overall performance of code that depends on alignment. It's never meant to execute. Some debuggers simply hide padding NOPs in their disassembly.
  • It gives more space to optimizing compilers - a still-used pattern is to have two compilation steps, the first one rather simple and producing lots of unnecessary assembly code, while the second cleans up, rewires the address references and removes extraneous instructions. This is often seen in JIT-compiled languages as well - both .NET's IL and the JVM's byte-code use NOPs quite a lot; the actual compiled assembly code doesn't have those anymore. It should be noted that those are not x86 NOPs, though.
  • It makes online debugging easier, both for reading (pre-zeroed memory will be all NOPs, making disassembly a lot easier to read) and for hot-patching (though I by far prefer Edit and Continue in Visual Studio :P).

For executing NOPs, there are of course a few more points:

  • Performance, of course - this is not why it was in the 8085, but even the 80486 already had pipelined instruction execution, which makes "doing nothing" a bit trickier.
  • As seen with MOV EDI, EDI, there are other effective NOPs besides the literal NOP. MOV EDI, EDI has the best performance as a 2-byte NOP on x86. If you used two NOPs instead, that would be two instructions to execute.

EDIT:

Actually, the discussion with @DmitryGrigoryev forced me to think about this a bit more, and I think it's a valuable addition to this question / answer, so let me add some extra bits:

First point, obviously - why would there even be an instruction that does something like mov ax, ax? For example, let's look at the case of 8086 machine code (older even than 386 machine code):

  • There's a dedicated NOP instruction with opcode 0x90. This is still the time when many people wrote in assembly, mind you. So even if there wasn't a dedicated NOP instruction, the NOP keyword (alias/mnemonic) would still be useful and would map to that.
  • Instructions like MOV actually map to many different opcodes, because that saves time and space - for example, mov al, 42 is "move immediate byte to the al register", which translates to 0xB0 0x2A (0xB0 being the opcode, 0x2A the "immediate" argument). So that takes two bytes.
  • There's no shortcut opcode for mov al, al (since that's basically a pointless thing to do), so you'd have to use the mov al, rmb (rmb being "register or memory") overload. That actually takes three bytes (although it would probably use the less-specific mov rb, rmb instead, which should only take two bytes for mov al, al - the argument byte specifies both the source and the target register; now you know why the 8086 only had 8 registers :D). Compare that to NOP, a single-byte instruction! This saves on memory and time, since reading memory on the 8086 was still quite expensive - not to mention loading that program from a tape or floppy or something, of course.
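To see the size difference in bytes, here's a small Python sketch assembling the encodings mentioned above (opcode values as in the 8086 instruction set; the helper names are mine; al is register number 0):

```python
# Sketch: 8086 encodings for the instructions discussed above.

def mov_al_imm8(value):
    """mov al, imm8 -> B0 ib (two bytes)."""
    return bytes([0xB0, value])

def mov_r8_r8(dst, src):
    """mov r8, r8 via opcode 8A + ModR/M (two bytes):
    ModR/M = 11 reg r/m, i.e. 0xC0 | dst << 3 | src."""
    return bytes([0x8A, 0xC0 | (dst << 3) | src])

NOP = bytes([0x90])  # a single byte

print(mov_al_imm8(42).hex())  # b02a  (mov al, 42)
print(mov_al_imm8.__doc__ and mov_r8_r8(0, 0).hex())  # 8ac0  (mov al, al)
print(len(NOP))  # 1
```

So even in the best case, the "fake NOP" mov al, al costs twice the bytes of the real one.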

So where does the xchg ax, ax come from? You just have to look at the opcodes of the other xchg instructions. You'll see 0x86, 0x87 and finally 0x91 - 0x97. So nop with its 0x90 seems like a pretty good fit for xchg ax, ax (which, again, isn't an xchg "overload" - you'd have to use xchg rb, rmb, at two bytes). And in fact, I'm pretty sure this was a nice side-effect of the micro-architecture of the time - if I recall correctly, it was easy to map the whole range 0x90-0x97 to "xchg ax with each of ax-di" (the operation being symmetric, this gave you the full range, including the nop xchg ax, ax; note that the order is ax, cx, dx, bx, sp, bp, si, di - bx comes after dx, not ax; remember, the register names are mnemonics, not ordered names - accumulator, counter, data, base, stack pointer, base pointer, source index, destination index). The same approach was also used for other operands, for example the mov someRegister, immediate set. In a way, you could think of it as if the opcode wasn't a full byte - the last few bits are "an argument" to the "real" opcode.
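A tiny decoder sketch makes the 0x90-0x97 mapping explicit (register order as listed above; the function name is just for illustration):

```python
# Sketch: the 0x90-0x97 opcode block decodes as "xchg ax, <reg>",
# with the register index in the low 3 bits; 0x90 is xchg ax, ax = nop.
REGS = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]

def decode_90_block(opcode):
    assert 0x90 <= opcode <= 0x97
    return f"xchg ax, {REGS[opcode - 0x90]}"

print(decode_90_block(0x90))  # xchg ax, ax  (= nop)
print(decode_90_block(0x93))  # xchg ax, bx
```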

All this said, on x86, nop might or might not be considered a real instruction. The original micro-architecture did treat it as a variant of xchg, if I recall correctly, but it was actually named nop in the specification. And since xchg ax, ax doesn't really make sense as an instruction, you can see how the designers of the 8086 saved transistors and pathways in instruction decoding by exploiting the fact that 0x90 maps naturally onto something entirely "noppy".

On the other hand, the i8051 has an entirely designed-in opcode for nop - 0x00. Kinda practical. The instruction design basically uses the high nibble for the operation and the low nibble for selecting the operands - for example, add a is 0x2Y, and a low nibble of 0x8 means "register 0", so 0x28 is add a, r0. Saves a lot of silicon :)
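The nibble scheme can be sketched the same way (encoding per the 8051 instruction set; the helper name is mine):

```python
# Sketch: 8051 nibble encoding - high nibble = operation, low nibble =
# operand; registers r0-r7 occupy low nibbles 0x8-0xF, so
# "add a, rN" is 0x28 | N. nop gets its own opcode, 0x00.

def encode_add_a_rn(n):
    """add a, rN -> 0x28 | n (a single byte)."""
    assert 0 <= n <= 7
    return 0x28 | n

NOP_8051 = 0x00

print(hex(encode_add_a_rn(0)))  # 0x28  (add a, r0)
print(hex(encode_add_a_rn(7)))  # 0x2f  (add a, r7)
```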

I could still go on, since CPU design (not to mention compiler design and language design) is quite a broad topic, but I think I've shown many different viewpoints that went into the design quite nicely as is.


Back in the late '70s, we (I was a young research student then) had a little dev system (8080, if memory serves) that ran in 1024 bytes of code (i.e. a single UV-EPROM) - it only had four commands: load (L), save (S), print (P), and something else I can't remember. It was driven by a real teletype and punched tape. It was tightly coded!

One example of NOOP use was in an interrupt service routine (ISR); these were spaced at 8-byte intervals. One routine ended up being 9 bytes long, ending with a (long) jump to an address slightly further up the address space. Given the little-endian byte order, this meant that the high address byte was 00h and slotted into the first byte of the next ISR, so that the next ISR started with a NOOP - just so 'we' could fit the code in the limited space!
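Here's a byte-level reconstruction of the trick, using real 8080/8085 opcodes (NOP is 0x00; JMP is 0xC3 followed by a little-endian address); the ISR body and jump target are made up for illustration:

```python
# Sketch: a 9-byte ISR in an 8-byte slot. Its trailing JMP to a
# page-zero address spills its high address byte (0x00) into the next
# slot, where that byte executes harmlessly as a NOP.
JMP, NOP = 0xC3, 0x00
SLOT = 8  # ISR slots are 8 bytes apart

def jmp_bytes(target):
    """JMP a16 -> C3, low byte, high byte (little-endian)."""
    return bytes([JMP, target & 0xFF, (target >> 8) & 0xFF])

# hypothetical 6-byte body (MVI A,1; STA 2000h; EI), then JMP 00F0h
isr = bytes([0x3E, 0x01, 0x32, 0x00, 0x20, 0xFB]) + jmp_bytes(0x00F0)

print(isr.hex())       # 3e01320020fbc3f000
print(isr[SLOT] == NOP)  # True - the spill-over byte doubles as a NOP
```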

So the NOOP is useful. Plus, I suspect it was easiest for Intel to code it that way - they probably had a list of instructions they wanted to implement, and it started at '1', like all lists did (this was the days of FORTRAN), so the zero NOOP code fell out naturally. (I've never seen an article arguing that a NOOP is an essential part of computing science theory - the same question as: do mathematicians have a null op, as distinct from the zero of group theory?)