What's the difference between using int 0x20 and int 0x21 / ah=0x4C to exit a 16-bit assembly program?
First, some background. DOS uses interrupt 21h for its system calls. AH is used to demultiplex the various functions INT 21h provides. When a program is executed, DOS puts before it 256 bytes known as the PSP (Program Segment Prefix), which contain information about the process.
The original exit function in DOS is INT 21h/AH=00h. Now, apparently DOS developers decided that returning from a program should be a way to exit the program (did this come from CP/M?). RET (near) pops a word from the stack and jumps to it. So, when a program is started, DOS pushes a word 0000h onto its stack. A near RET then jumps to offset 0000h of the code segment, which is the start of the PSP. So, at the start of the PSP there is code to terminate the program. To keep that code small, INT 20h acts as an alias to MOV AH,00h ; INT 21h.
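A minimal sketch of the RET exit path described above, assuming NASM and a COM target (the file name and message are made up for illustration):

```nasm
; exit_ret.asm - assemble with: nasm -f bin exit_ret.asm -o exit_ret.com
        org     0x100           ; a COM program is loaded right after the 256-byte PSP

start:
        mov     ah, 0x09        ; DOS function 09h: print the '$'-terminated string at DS:DX
        mov     dx, msg
        int     0x21

        ret                     ; pops the 0000h word DOS pushed, so IP becomes 0000h;
                                ; CS:0000 is the start of the PSP, which holds INT 20h

msg     db      'Exiting via RET', 13, 10, '$'
```

This only works while CS still points to the PSP and the stack is balanced, which is the normal situation in a COM program.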
[Edit: This can be seen in the screenshot below.]
DOS 2.0 took many things from Unix, including return codes. So, a new INT 21h function appeared, INT 21h/AH=4Ch, which takes a return code and passes it back to the OS. This function is also designed to work with EXE files (AFAIR also new in DOS 2.0), which can have several segments. The previous exit functions (INT 20h and INT 21h/AH=00h) assume CS is the same as it was at program startup on a COM program, that is, it points to the PSP 256 bytes before the program.
[Edit:
Historical note: on CP/M, there were 3 ways to exit a program:
- Calling BDOS function 0 (the equivalent of INT 21 AH=00h, DOS function 0)
- Jumping to the WBOOTF location at 0000h (the equivalent of PSP offset 000h)
- RETurning
The WBOOTF location consisted of 3 bytes: 1 byte for a jump, 2 bytes for the jump target (the WBOOT function in BDOS).
In early versions of CP/M, calling BDOS function 0 or jumping to WBOOT caused parts of CP/M to be reloaded from disk (warm boot), and some OS initialization to be subsequently run; while RETurning directly returned to the CCP (Console Command Processor, the equivalent of COMMAND.COM), which then prompted for the next command line. AFAIU, CP/M 3 was usually loaded in ROM, and returning returned to the WBOOT location, causing a reload of parts of the OS from ROM.]
The latter, int 0x21 with ah=0x4C, allows you to specify a return code.
The return code is placed in the AL register.
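As a rough illustration, a COM program that exits with a return code could look like this (NASM syntax; the value 3 and the file name are arbitrary choices for the example):

```nasm
; exit_rc.asm - assemble with: nasm -f bin exit_rc.asm -o exit_rc.com
        org     0x100

start:
        mov     al, 3           ; return code handed back to the parent (arbitrary example value)
        mov     ah, 0x4C        ; DOS function 4Ch: terminate with return code
        int     0x21            ; does not return
```

From a batch file the code can then be checked with IF ERRORLEVEL, which is true when the return code is greater than or equal to the given number, for example `IF ERRORLEVEL 3 ECHO failed`.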
http://spike.scu.edu.au/~barry/interrupts.html#ah4c
This is what I know.
MOV AH, 0x4C
INT 0x21
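Ends the running EXE file. An EXE file must be ended this way because of the code segment register CS: INT 0x20 expects CS to point to the PSP, and in an EXE it usually does not. Correct me if I'm wrong, but it is ending an EXE file with INT 0x20 that gives unexpected results (crashes, hangs, reboots etc.); ending a COM file with AH=0x4C works fine on DOS 2.0 and later. For COM files there is also
INT 0x20
which ends a COM file.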
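For comparison, here is a sketch of an EXE skeleton with separate code, data and stack segments, loosely following the layout used in the NASM manual for its obj output format. The segment names, the message and the linker step are assumptions for the example. At entry DS and ES point to the PSP, while CS points to the code segment, so INT 0x20 is not a safe way out here:

```nasm
; exe_exit.asm - assemble with: nasm -f obj exe_exit.asm
; then link the resulting .obj with a 16-bit DOS linker
segment code

..start:                        ; special NASM symbol: program entry point
        mov     ax, data
        mov     ds, ax          ; DS = our data segment (it pointed to the PSP at entry)
        mov     ax, stack
        mov     ss, ax
        mov     sp, stacktop    ; SS:SP = top of our own stack segment

        mov     dx, msg
        mov     ah, 0x09        ; DOS function 09h: print '$'-terminated string at DS:DX
        int     0x21

        mov     ax, 0x4C00      ; AH = 4Ch (terminate), AL = 00h (return code)
        int     0x21

segment data
msg     db      'Exiting an EXE', 13, 10, '$'

segment stack stack
        resb    64
stacktop:
```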
In a COM program, CS is the same as it was at program startup, that is, it points to the PSP, which sits 256 bytes before the program. (CS:IP is the code segment and instruction pointer pair; CS holds the code segment.) Yes, we are talking about registers. Like variables, they work like a cupboard: each one is a drawer you can put a value into. AX=0000 BX=0000 CX=0000 (CX is composed of CL and CH) etc.
I thought COM files were always limited to 64K. The second reason, I thought, was that COM files don't have a separate data segment: they do have data, but it resides in the same segment as the code. All code and data in a COM file fit within 64K. EXE files do have segments, and an EXE file can have several code and data segments when using the appropriate memory model (see Intel Memory Model).
- Tiny: CS=DS=SS
- Small: DS=SS
- Medium: DS=SS, multiple code segments
- Compact: single code segment, multiple data segments
- Large: multiple code and data segments
- Huge: multiple code and data segments; single array may be >64 KB
With the SMALL memory model, an EXE file is limited to 64K of code and 64K of data. When using larger memory models and 32-bit far pointers, you can address more than 64K (still limited). I thought that was the 'trick'.
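To make the far pointer idea concrete: a far pointer is a 16-bit segment plus a 16-bit offset, and the physical address is segment * 16 + offset. The sketch below (NASM syntax, COM target) loads such a pointer with LES and writes through it. The target B800:0000, colour text-mode video memory, and the label name are just example choices:

```nasm
; farptr.asm - assemble with: nasm -f bin farptr.asm -o farptr.com
        org     0x100

start:
        les     bx, [screen]            ; load ES:BX from the 4-byte far pointer
        mov     byte [es:bx], 'A'       ; write to B800:0000, physical address B8000h

        mov     ax, 0x4C00              ; terminate with return code 0
        int     0x21

screen  dw      0x0000                  ; offset (stored first)
        dw      0xB800                  ; segment (example target: text-mode video memory)
```

A near pointer would only carry the offset and could never leave the current 64K segment.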
Now everyone is complaining to me: why should you use EXE files? Above are the reasons. The memory model list comes from Wikipedia. For those who don't care: I learned this the hard way from professionals, Peter Norton and John Socha, of Norton Utilities (and the person behind The Norton Commander for DOS). He had a book about assembly, something like "Assembly for the IBM-PC". You should read it, he explains it best. He was like a good teacher for me. Microsoft CodeView clarified a lot for me. Programming in C on DOS? Turbo C 2.0 is the best you can get. Oh, and I don't program anymore.