Problem switching to v8086 mode from 32-bit protected mode by setting EFLAGS.VM to 1

TL;DR :

Question #1:

POPF doesn't actually allow you to change the VM flag per the Instruction Set Architecture reference:

When operating in protected, compatibility, or 64-bit mode at privilege level 0 (or in real-address mode, the equivalent to privilege level 0), all non-reserved flags in the EFLAGS register except RF1, VIP, VIF, and VM may be modified. VIP, VIF and VM remain unaffected.

There are two general mechanisms that can be used to set EFLAGS.VM and enter v8086 mode:

  • A task switch to an 80386 task loads the image of EFLAGS from the new TSS. The TSS of the new task must be an 80386 TSS, not an 80286 TSS, because the 80286 TSS does not store the high-order word of EFLAGS, which contains the VM flag. A value of one in the VM bit of the new EFLAGS indicates that the new task is executing 8086 instructions; therefore, while loading the segment registers from the TSS, - the processor forms base addresses as the 8086 would.

  • An IRET from a procedure of an 80386 task loads the image of EFLAGS from the stack. A value of one in VM in this case indicates that the procedure to which control is being returned is an 8086 procedure. The CPL at the time the IRET is executed must be zero, else the processor does not change VM.

Question #2:

v8086 mode is only available on an x86-64 processor in 32-bit protected mode (legacy mode). You can not use it in 64-bit mode or 32-bit (or 16-bit) compatibility modes. You would have to switch the processor out of long mode and enter 32-bit protected mode (legacy mode) running at CPL=0 and perform one of the two methods noted above. This is an expensive (performance wise) undertaking and is fraught with problems. You would then have to switch back to long mode when finished.

If there is some use case for doing this and you are on a system with multiple cores - You can bring up one of the cores in 32-bit protected mode while the Bootstrap Processor (BSP) runs in long mode.


Method 1: use IRET to enter v8086 mode

This is the easiest solution. If you do an IRET from 32-bit protected mode (in CPL=0) and the EFLAGS.VM register on the stack is set, the CPU will attempt to return to v8086 mode and assumes the stack frame contains the required information to make that transition:

PROTECTED-MODE:
[snip]
    EIP ← Pop();
    CS ← Pop(); (* 32-bit pop, high-order 16 bits discarded *)
    tempEFLAGS ← Pop();
[snip]

 RETURN-TO-VIRTUAL-8086-MODE:
    (* Interrupted procedure was in virtual-8086 mode: PE = 1, CPL=0, VM = 1 in flag image *)
    IF EIP not within CS limit
        THEN #GP(0); FI;
    EFLAGS ← tempEFLAGS;
    ESP ← Pop();
    SS ← Pop(); (* Pop 2 words; throw away high-order word *)
    ES ← Pop(); (* Pop 2 words; throw away high-order word *)
    DS ← Pop(); (* Pop 2 words; throw away high-order word *)
    FS ← Pop(); (* Pop 2 words; throw away high-order word *)
    GS ← Pop(); (* Pop 2 words; throw away high-order word *)
    CPL ← 3;
    (* Resume execution in Virtual-8086 mode *)
END;

If you push these items on the stack in reverse order and do the iret you should be able to enter v8086 mode.

V86_STACK_SEG          EQU 0x0000  ; v8086 stack SS
V86_STACK_OFS          EQU 0x0000  ; v8086 stack SP
V86_CS_SEG             EQU 0x0000  ; v8086 code segment CS

EFLAGS_VM_BIT          EQU 17      ; EFLAGS VM bit
EFLAGS_BIT1            EQU 1       ; EFLAGS bit 1 (reserved , always 1)

[snip]

    xor ebx, ebx                ; EBX=0
    push ebx                    ; Real mode GS=0
    push ebx                    ; Real mode FS=0
    push ebx                    ; Real mode DS=0
    push ebx                    ; Real mode ES=0
    push V86_STACK_SEG
    push V86_STACK_OFS          ; v8086 stack SS:SP (grows down from SS:SP)
    push dword 1<<EFLAGS_VM_BIT | 1<<EFLAGS_BIT1
                                ; Set VM Bit, IF bit is off, DF=0(forward direction),
                                ; IOPL=0, Reserved bit (bit 1) always 1. Everything
                                ; else 0. These flags will be loaded in the v8086 mode
                                ; during the IRET. We don't want interrupts enabled
                                ; because we have no v86 monitor via protected mode
                                ; GPF handler
    push V86_CS_SEG             ; Real Mode CS (segment)
    push v86_mode_entry         ; Entry point (offset)
    iret                        ; Transfer control to v8086 mode and our real mode code

I have set ES=DS=CS=FS=GS=0 and a real mode stack at V86_STACK_SEG:V86_STACK_OFS (define these as you see fit). IP is set to the offset of the v86_mode_entry label. In the code snippet above I only set 2 bits to 1 (bit 1 and VM). Bit 1 is a reserved bit in EFLAGS that is always suppose to be set to 1. All other flags in EFLAGS are 0, thus IOPL=0.

All other registers will contain the same values they had before entering v8086 mode. You may wish to zero them out to avoid leaking information into the v8086 task from 32-bit protected mode (ie: a kernel).

A minimal complete verifiable example of using this code is:

VIDEO_TEXT_ADDR        EQU 0xb8000 ; Hard code beginning of text video memory
ATTR_BWHITE_ON_GREEN   EQU 0x2f    ; Bright white on green attribute
ATTR_BWHITE_ON_MAGENTA EQU 0x5f    ; Bright White on magenta attribute

PM_MODE_STACK          EQU 0x80000 ; Protected mode stack below EBDA
V86_STACK_SEG          EQU 0x0000  ; v8086 stack SS
V86_STACK_OFS          EQU 0x0000  ; v8086 stack SP
V86_CS_SEG             EQU 0x0000  ; v8086 code segment CS

EFLAGS_VM_BIT          EQU 17      ; EFLAGS VM bit
EFLAGS_BIT1            EQU 1       ; EFLAGS bit 1 (reserved, always 1)
EFLAGS_IF_BIT          EQU 9       ; EFLAGS IF bit

; Macro to build a GDT descriptor entry
%define MAKE_GDT_DESC(base, limit, access, flags) \
    (((base & 0x00FFFFFF) << 16) | \
    ((base & 0xFF000000) << 32) | \
    (limit & 0x0000FFFF) | \
    ((limit & 0x000F0000) << 32) | \
    ((access & 0xFF) << 40) | \
    ((flags & 0x0F) << 52))

bits 16
ORG 0x7c00

; Include a BPB (1.44MB floppy with FAT12) to be more compatible with USB floppy media
; %include "bpb.inc"

boot_start:
    xor ax, ax                  ; DS=SS=ES=0
    mov ds, ax
    mov ss, ax                  ; Stack at 0x0000:0x7c00
    mov sp, 0x7c00
    cld                         ; Set string instructions to use forward movement

    ; Fast method of enabling A20 may not work on all x86 BIOSes
    ; It is good enough for emulators and most modern BIOSes
    ; See: https://wiki.osdev.org/A20_Line
    cli                         ; Disable interrupts for rest of code as we don't
                                ; want A20 code to be interrupted. In protected mode
                                ; we have no IDT so any interrupt that does occur will
                                ; double fault and reboot.

    in al, 0x92
    or al, 2
    out 0x92, al                ; Enable A20 using Fast Method

    lgdt [gdtr]                 ; Load our GDT

    mov eax, cr0
    or eax, 1
    mov cr0, eax                ; Set protected mode flag
    jmp CODE32_SEL:start32      ; FAR JMP to set CS

; v8086 code entry point
v86_mode_entry:
    sub dword [vidmem_ptr], VIDEO_TEXT_ADDR
                                ; Adjust video pointer to be relative to beginning of
                                ;     segment 0xb800

    mov si, in_v86_msg          ; Print in v86 message
    mov ah, ATTR_BWHITE_ON_MAGENTA
                                ; Attribute to print with
    call print_string_rm_nobios

.endloop:
    jmp $                       ; Infinite loop since we did code a solution to exit VM

; Function: print_string_rm_nobios
;           Display a string to the console on display page 0 in real/v8086 mode
;           without using the BIOS. We don't have a proper v8086 monitor so can't
;           use BIOS to display.
;
;           Very basic. Doesn't update hardware cursor, doesn't handle scrolling,
;           LF, CR, TAB.
;
; Inputs:   SI  = Offset of address to print
;           AH  = Attribute of string to print
; Clobbers: None
; Returns:  None

print_string_rm_nobios:
    push di
    push si
    push ax
    push es

    mov di, VIDEO_TEXT_ADDR>>4  ; ES=0xb800 (text video mode segment)
    mov es, di

    mov di, [vidmem_ptr]        ; Start from video address stored at vidmem_ptr
    jmp .getchar
.outchar:
    stosw                       ; Output character to display
.getchar:
    lodsb                       ; Load next character from string
    test al, al                 ; Is character NUL?
    jne .outchar                ; If not, go output character

    mov [vidmem_ptr], di        ; Update global video pointer

    pop es
    pop ax
    pop si
    pop di
    ret

; 32-bit protected mode entry point
bits 32
start32:
    mov ax, DATA32_SEL          ; Setup the segment registers with data selector
    mov ds, ax
    mov es, ax
    mov ss, ax
    mov esp, PM_MODE_STACK      ; Set protected mode stack pointer

    mov fs, ax                  ; Not currently using FS and GS
    mov gs, ax

    mov ah, ATTR_BWHITE_ON_GREEN; Attribute to print with
    mov al, ah                  ; Attribute to clear last line when scrolling
    mov esi, in_pm_msg          ; Print message that we are in protected mode
    call print_string_pm

    xor ebx, ebx                ; EBX=0
    push ebx                    ; Real mode GS=0
    push ebx                    ; Real mode FS=0
    push ebx                    ; Real mode DS=0
    push ebx                    ; Real mode ES=0
    push V86_STACK_SEG
    push V86_STACK_OFS          ; v8086 stack SS:SP (grows down from SS:SP)
    push dword 1<<EFLAGS_VM_BIT | 1<<EFLAGS_BIT1
                                ; Set VM Bit, IF bit is off, DF=0(forward direction),
                                ; IOPL=0, Reserved bit (bit 1) always 1. Everything
                                ; else 0. These flags will be loaded in the v8086 mode
                                ; during the IRET. We don't want interrupts enabled
                                ; because we have no v86 monitor via protected mode
                                ; GPF handler
    push V86_CS_SEG             ; Real Mode CS (segment)
    push v86_mode_entry         ; Entry point (offset)
    iret                        ; Transfer control to v8086 mode and our real mode code

; Function: print_string_pm
;           Display a string to the console on display page 0 in protected mode.
;           Very basic. Doesn't update hardware cursor, doesn't handle scrolling,
;           LF, CR, TAB.
;
; Inputs:   ESI = Offset of address to print
;           AH  = Attribute of string to print
; Clobbers: None
; Returns:  None

print_string_pm:
    push edi
    push esi
    push eax

    mov edi, [vidmem_ptr]       ; Start from video address stored at vidmem_ptr
    jmp .getchar
.outchar:
    stosw                       ; Output character to video display
.getchar:
    lodsb                       ; Load next character from string
    test al, al                 ; Is character NUL?
    jne .outchar                ;     If not, go back and output character

    mov [vidmem_ptr], edi       ; Update global video pointer
    pop eax
    pop esi
    pop edi
    ret

align 4
vidmem_ptr: dd VIDEO_TEXT_ADDR  ; Start console output in upper left of display

in_pm_msg:
    db "In 32-bit protected mode!", 0
in_v86_msg:
    db "In v8086 mode!", 0

align 4
gdt_start:
    dq MAKE_GDT_DESC(0, 0, 0, 0)   ; null descriptor
gdt32_code:
    dq MAKE_GDT_DESC(0, 0x000fffff, 10011010b, 1100b)
                                ; 32-bit code, 4kb gran, limit 0xffffffff bytes, base=0
gdt32_data:
    dq MAKE_GDT_DESC(0, 0x000fffff, 10010010b, 1100b)
                                ; 32-bit data, 4kb gran, limit 0xffffffff bytes, base=0
end_of_gdt:

gdtr:
    dw end_of_gdt - gdt_start - 1
                                ; limit (Size of GDT - 1)
    dd gdt_start                ; base of GDT

CODE32_SEL equ gdt32_code - gdt_start
DATA32_SEL equ gdt32_data - gdt_start

; Pad boot sector to 510 bytes and add 2 byte boot signature
TIMES 510-($-$$) db  0
dw 0xaa55

This example code can be modified to do the hlt and it will double fault. It does properly enter v8086 mode. I print a string while it is in 32-bit protected mode and a string after it enters v8086 mode. Since IOPL=0 the real mode code doesn't use any privileged instructions nor does it use any instructions that are Interrupt Flag (IF) sensitive, nor does it do port IO. Without a VM Monitor (GPF handler that is v8086 mode aware) you are limited to non-privileged and non interrupt flag sensitive instructions. Since the INT instruction is IF sensitive, the BIOS can not be used. I write the characters directly to the display.


Method 2: Use a hardware task switch to enter v8086 mode

If you aren't using hardware task switching in your OS, I don't recommend using this mechanism. If you have made the choice to use hardware task switching then using this method makes sense.1

If using hardware task switching to enter v8086 mode a TSS structure and a TSS entry in the GDT are needed. The TSS entry in the GDT is to specify the base and limits of the segment containing the TSS. A GDT entry generally defined as:

enter image description here

A 32-bit TSS descriptor that is initially marked available has a type of 0x09; the S bit (system segment) set to 0; a P bit of 1; a G bit set to 0 (byte granularity); and remaining flag bits set to 0. For a v8086 task we want a Descriptor Privilege Level (DPL) of 0. This results in an access byte of 0x89 and a flags byte of 0x00.

The TSS structure itself can follow the type of structure that is suggested in this related Stackoverflow answer. For the example below we will not be using an IO Port Bitmap so I've set the TSS_IO_BITMAP_SIZE to 0.

Once the appropriate structures are created, the TSS can be filled in with the state of registers needed by the v8086 task. This will include the CS:IP where execution will start in the v8086 task. To enter the v8086 task all that is needed is a FAR JMP through the TSS selector:

jmp TSS32_SEL:0             ; Transfer control to v8086 mode and our real mode code

The offset is ignored when jumping via a TSS selector. I use a value of 0 for the offset, but it can be set to any value. This FAR JMP will load the Task Register with the TSS selector and mark the task as busy; setup the CPU state per the TSS structure; transfer control to the task. A minimal complete example is as follows:

VIDEO_TEXT_ADDR        EQU 0xb8000 ; Hard code beginning of text video memory
ATTR_BWHITE_ON_GREEN   EQU 0x2f    ; Bright white on green attribute
ATTR_BWHITE_ON_MAGENTA EQU 0x5f    ; Bright White on magenta attribute

PM_MODE_STACK          EQU 0x80000 ; Protected mode stack below EBDA

V86_STACK_SEG          EQU 0x0000  ; v8086 stack SS
V86_STACK_OFS          EQU 0x0000  ; v8086 stack SP
V86_CS_SEG             EQU 0x0000  ; v8086 code segment CS

EFLAGS_VM_BIT          EQU 17      ; EFLAGS VM bit
EFLAGS_BIT1            EQU 1       ; EFLAGS bit 1 (reserved, always 1)
EFLAGS_IF_BIT          EQU 9       ; EFLAGS IF bit

TSS_IO_BITMAP_SIZE     EQU 0       ; Size 0 disables IO port bitmap (no permission)

; Macro to build a GDT descriptor entry
%define MAKE_GDT_DESC(base, limit, access, flags) \
    (((base & 0x00FFFFFF) << 16) | \
    ((base & 0xFF000000) << 32) | \
    (limit & 0x0000FFFF) | \
    ((limit & 0x000F0000) << 32) | \
    ((access & 0xFF) << 40) | \
    ((flags & 0x0F) << 52))

bits 16
ORG 0x7c00

; Include a BPB (1.44MB floppy with FAT12) to be more compatible with USB floppy media
; %include "bpb.inc"

boot_start:
    xor ax, ax                  ; DS=SS=ES=0
    mov ds, ax
    mov ss, ax                  ; Stack at 0x0000:0x7c00
    mov sp, 0x7c00
    cld                         ; Set string instructions to use forward movement

    ; Fast method of enabling A20 may not work on all x86 BIOSes
    ; It is good enough for emulators and most modern BIOSes
    ; See: https://wiki.osdev.org/A20_Line
    cli                         ; Disable interrupts for rest of code as we don't
                                ; want A20 code to be interrupted. In protected mode
                                ; we have no IDT so any interrupt that does occur will
                                ; double fault and reboot.

    in al, 0x92
    or al, 2
    out 0x92, al                ; Enable A20 using Fast Method

    lgdt [gdtr]                 ; Load our GDT

    mov eax, cr0
    or eax, 1
    mov cr0, eax                ; Set protected mode flag
    jmp CODE32_SEL:start32      ; FAR JMP to set CS

; v8086 code entry point
v86_mode_entry:
    sub dword [vidmem_ptr], VIDEO_TEXT_ADDR
                                ; Adjust video pointer to be relative to beginning of
                                ;     segment 0xb800

    mov si, in_v86_msg          ; Print in v86 message
    mov ah, ATTR_BWHITE_ON_MAGENTA
                                ; Attribute to print with
    call print_string_rm_nobios

.endloop:
    jmp $                       ; Infinite loop since we did code a solution to exit VM

; Function: print_string_rm_nobios
;           Display a string to the console on display page 0 in real/v8086 mode
;           without using the BIOS. We don't have a proper v8086 monitor so can't
;           use BIOS to display.
;
;           Very basic. Doesn't update hardware cursor, doesn't handle scrolling,
;           LF, CR, TAB.
;
; Inputs:   SI  = Offset of address to print
;           AH  = Attribute of string to print
; Clobbers: None
; Returns:  None

print_string_rm_nobios:
    push di
    push si
    push ax
    push es

    mov di, VIDEO_TEXT_ADDR>>4  ; ES=0xb800 (text video mode segment)
    mov es, di

    mov di, [vidmem_ptr]        ; Start from video address stored at vidmem_ptr
    jmp .getchar
.outchar:
    stosw                       ; Output character to display
.getchar:
    lodsb                       ; Load next character from string
    test al, al                 ; Is character NUL?
    jne .outchar                ; If not, go output character

    mov [vidmem_ptr], di        ; Update global video pointer

    pop es
    pop ax
    pop si
    pop di
    ret

; 32-bit protected mode entry point
bits 32
start32:
    mov ax, DATA32_SEL          ; Setup the segment registers with data selector
    mov ds, ax
    mov es, ax
    mov ss, ax
    mov esp, PM_MODE_STACK      ; Set protected mode stack pointer

    mov fs, ax                  ; Not currently using FS and GS
    mov gs, ax

    mov ah, ATTR_BWHITE_ON_GREEN; Attribute to print with
    mov al, ah                  ; Attribute to clear last line when scrolling
    mov esi, in_pm_msg          ; Print message that we are in protected mode
    call print_string_pm

    mov ecx, TSS_SIZE           ; Zero out entire TSS structure
    mov edi, tss_entry
    xor eax, eax
    rep stosb

    ; v8086 stack SS:SP (grows down from SS:SP)
    mov dword [tss_entry.ss], V86_STACK_SEG
    mov dword [tss_entry.esp], V86_STACK_OFS

    mov dword [tss_entry.eflags], 1<<EFLAGS_VM_BIT | 1<<EFLAGS_BIT1
                                ; Set VM Bit, IF bit is off, DF=0(forward direction),
                                ; IOPL=0, Reserved bit (bit 1) always 1. Everything
                                ; else 0. We don't want interrupts enabled upon entry to
                                ; v8086 because we have no v8086 monitor (a protected mode
                                ; GPF handler)

    ; Set Real Mode CS:EIP to start execution at
    mov dword [tss_entry.cs], V86_CS_SEG
    mov dword [tss_entry.eip], v86_mode_entry

    ; Set iomap_base in tss with the offset of the iomap relative to beginning of the tss
    mov word [tss_entry.iomap_base], tss_entry.iomap-tss_entry
%if TSS_IO_BITMAP_SIZE > 0
    ; If using an IO Bitmap then a padding byte has to be set to 0xff at end of bitmap
    mov byte [tss_entry.iomap_pad], 0xff
%endif

    jmp TSS32_SEL:0             ; Transfer control to v8086 mode and our real mode code

; Function: print_string_pm
;           Display a string to the console on display page 0 in protected mode.
;           Very basic. Doesn't update hardware cursor, doesn't handle scrolling,
;           LF, CR, TAB.
;
; Inputs:   ESI = Offset of address to print
;           AH  = Attribute of string to print
; Clobbers: None
; Returns:  None

print_string_pm:
    push edi
    push esi
    push eax

    mov edi, [vidmem_ptr]       ; Start from video address stored at vidmem_ptr
    jmp .getchar
.outchar:
    stosw                       ; Output character to video display
.getchar:
    lodsb                       ; Load next character from string
    test al, al                 ; Is character NUL?
    jne .outchar                ;     If not, go back and output character

    mov [vidmem_ptr], edi       ; Update global video pointer
    pop eax
    pop esi
    pop edi
    ret

align 4
vidmem_ptr: dd VIDEO_TEXT_ADDR  ; Start console output in upper left of display

in_pm_msg:
    db "In 32-bit protected mode!", 0
in_v86_msg:
    db "In v8086 mode!", 0

align 4
gdt_start:
    dq MAKE_GDT_DESC(0, 0, 0, 0)   ; null descriptor
gdt32_code:
    dq MAKE_GDT_DESC(0, 0x000fffff, 10011010b, 1100b)
                                ; 32-bit code, 4kb gran, limit 0xffffffff bytes, base=0
gdt32_data:
    dq MAKE_GDT_DESC(0, 0x000fffff, 10010010b, 1100b)
                                ; 32-bit data, 4kb gran, limit 0xffffffff bytes, base=0
gdt32_tss:
    dq MAKE_GDT_DESC(tss_entry, TSS_SIZE-1, 10001001b, 0000b)
                                ; 32-bit TSS, 1b gran, available, IOPL=0
end_of_gdt:

CODE32_SEL equ gdt32_code - gdt_start
DATA32_SEL equ gdt32_data - gdt_start
TSS32_SEL  equ gdt32_tss  - gdt_start

gdtr:
    dw end_of_gdt - gdt_start - 1
                                ; limit (Size of GDT - 1)
    dd gdt_start                ; base of GDT

; Pad boot sector to 510 bytes and add 2 byte boot signature
TIMES 510-($-$$) db  0
dw 0xaa55

; Data section above bootloader @ 0x7c00. Acts like a BSS section
ABSOLUTE 0x7e00

; Store the TSS just beyond the boot signature read into memory
; at 0x0000:0x7e00
tss_entry:
.back_link: resd 1
.esp0:      resd 1              ; Kernel stack pointer used on ring transitions
.ss0:       resd 1              ; Kernel stack segment used on ring transitions
.esp1:      resd 1
.ss1:       resd 1
.esp2:      resd 1
.ss2:       resd 1
.cr3:       resd 1
.eip:       resd 1
.eflags:    resd 1
.eax:       resd 1
.ecx:       resd 1
.edx:       resd 1
.ebx:       resd 1
.esp:       resd 1
.ebp:       resd 1
.esi:       resd 1
.edi:       resd 1
.es:        resd 1
.cs:        resd 1
.ss:        resd 1
.ds:        resd 1
.fs:        resd 1
.gs:        resd 1
.ldt:       resd 1
.trap:      resw 1
.iomap_base:resw 1              ; IOPB offset

;.cetssp:    resd 1             ; Need this if CET is enabled

; Insert any kernel defined task instance data here
; ...

; If using VME (Virtual Mode extensions) there need to bean additional 32 bytes
; available immediately preceding iomap. If using VME uncomment next 2 lines
;.vmeintmap:                     ; If VME enabled uncomment this line and the next
;    resb 32                     ;     32*8 bits = 256 bits (one bit for each interrupt)

.iomap: resb TSS_IO_BITMAP_SIZE ; IO bitmap (IOPB) size 8192 (8*8192=65536) representing
                                ; all ports. An IO bitmap size of 0 would fault all IO
                                ; port access if IOPL < CPL (CPL=3 with v8086)
%if TSS_IO_BITMAP_SIZE > 0
.iomap_pad: resb 1              ; Padding byte that has to be filled with 0xff
                                ; To deal with issues on some CPUs when using an IOPB
%endif
TSS_SIZE EQU $-tss_entry

Notes

  • 1Relying on hardware task switching is hard to port to other CPUs; the x86 CPUs aren't optimized for hardware task switches; FPU and SIMD state aren't preserved; the performance can be slower than writing the task switching via software. Long mode on the x86-64 processors doesn't even support hardware task switching. Modern OSes running on x86 generally don't use the CPU's hardware task switching.

This answer had to be split from the first one because the post limit was exceeded.


Method 3: Use IRET and a TSS Structure

This method is actually the same as Method #1. Use an IRET to enter into v8086 mode, but we create a TSS structure and a 32-bit TSS entry in the GDT like Method #2. Creating a TSS in the absence of hardware task switching allows us to specify an IO Port Bitmap when running unprivileged (CPL=1,2,3) code where IOPL < CPL. On multi core systems a kernel generally creates a TSS for each processor.

The CPU will use the .esp0 and .ss0 fields as the kernel stack when an interrupt/call/trap gate transfers control to CPL=0 from CPL=1,2,3. You can't process interrupts when running code at CPL>0 without a TSS. The LTR instruction is used to specify the initial TSS without doing an actual task switch. The TSS is marked busy by LTR.

The following minimal complete example demonstrates this concept. For this example the IOPB is set to allow port access to the first 0x400 ports and deny it for the rest:

VIDEO_TEXT_ADDR        EQU 0xb8000 ; Hard code beginning of text video memory
ATTR_BWHITE_ON_GREEN   EQU 0x2f    ; Bright white on green attribute
ATTR_BWHITE_ON_MAGENTA EQU 0x5f    ; Bright White on magenta attribute

PM_MODE_STACK          EQU 0x80000 ; Protected mode stack below EBDA

V86_STACK_SEG          EQU 0x0000  ; v8086 stack SS
V86_STACK_OFS          EQU 0x0000  ; v8086 stack SP
V86_CS_SEG             EQU 0x0000  ; v8086 code segment CS

EFLAGS_VM_BIT          EQU 17      ; EFLAGS VM bit
EFLAGS_BIT1            EQU 1       ; EFLAGS bit 1 (reserved, always 1)
EFLAGS_IF_BIT          EQU 9       ; EFLAGS IF bit

TSS_IO_BITMAP_SIZE     EQU 0x400/8 ; IO Bitmap for 0x400 IO ports
                                   ; Size 0 disables IO port bitmap (no permission)

; Macro to build a GDT descriptor entry
%define MAKE_GDT_DESC(base, limit, access, flags) \
    (((base & 0x00FFFFFF) << 16) | \
    ((base & 0xFF000000) << 32) | \
    (limit & 0x0000FFFF) | \
    ((limit & 0x000F0000) << 32) | \
    ((access & 0xFF) << 40) | \
    ((flags & 0x0F) << 52))

bits 16
ORG 0x7c00

; Include a BPB (1.44MB floppy with FAT12) to be more compatible with USB floppy media
; %include "bpb.inc"

boot_start:
    xor ax, ax                  ; DS=SS=ES=0
    mov ds, ax
    mov ss, ax                  ; Stack at 0x0000:0x7c00
    mov sp, 0x7c00
    cld                         ; Set string instructions to use forward movement

    ; Fast method of enabling A20 may not work on all x86 BIOSes
    ; It is good enough for emulators and most modern BIOSes
    ; See: https://wiki.osdev.org/A20_Line
    cli                         ; Disable interrupts for rest of code as we don't
                                ; want A20 code to be interrupted. In protected mode
                                ; we have no IDT so any interrupt that does occur will
                                ; double fault and reboot.

    in al, 0x92
    or al, 2
    out 0x92, al                ; Enable A20 using Fast Method

    lgdt [gdtr]                 ; Load our GDT

    mov eax, cr0
    or eax, 1
    mov cr0, eax                ; Set protected mode flag
    jmp CODE32_SEL:start32      ; FAR JMP to set CS

; v8086 code entry point
v86_mode_entry:
    sub dword [vidmem_ptr], VIDEO_TEXT_ADDR
                                ; Adjust video pointer to be relative to beginning of
                                ;     segment 0xb800

    mov si, in_v86_msg          ; Print in v86 message
    mov ah, ATTR_BWHITE_ON_MAGENTA
                                ; Attribute to print with
    call print_string_rm_nobios

.endloop:
    jmp $                       ; Infinite loop since we did code a solution to exit VM

; Function: print_string_rm_nobios
;           Display a string to the console on display page 0 in real/v8086 mode
;           without using the BIOS. We don't have a proper v8086 monitor so can't
;           use BIOS to display.
;
;           Very basic. Doesn't update hardware cursor, doesn't handle scrolling,
;           LF, CR, TAB.
;
; Inputs:   SI  = Offset of address to print
;           AH  = Attribute of string to print
; Clobbers: None
; Returns:  None

print_string_rm_nobios:
    push di
    push si
    push ax
    push es

    mov di, VIDEO_TEXT_ADDR>>4  ; ES=0xb800 (text video mode segment)
    mov es, di

    mov di, [vidmem_ptr]        ; Start from video address stored at vidmem_ptr
    jmp .getchar
.outchar:
    stosw                       ; Output character to display
.getchar:
    lodsb                       ; Load next character from string
    test al, al                 ; Is character NUL?
    jne .outchar                ; If not, go output character

    mov [vidmem_ptr], di        ; Update global video pointer

    pop es
    pop ax
    pop si
    pop di
    ret

; 32-bit protected mode entry point
bits 32
start32:
    mov ax, DATA32_SEL          ; Setup the segment registers with data selector
    mov ds, ax
    mov es, ax
    mov ss, ax
    mov esp, PM_MODE_STACK      ; Set protected mode stack pointer

    mov fs, ax                  ; Not currently using FS and GS
    mov gs, ax

    mov ah, ATTR_BWHITE_ON_GREEN; Attribute to print with
    mov al, ah                  ; Attribute to clear last line when scrolling
    mov esi, in_pm_msg          ; Print message that we are in protected mode
    call print_string_pm

    mov ecx, TSS_SIZE           ; Zero out entire TSS structure
    mov edi, tss_entry
    xor eax, eax
    rep stosb

    ; Set iomap_base in tss with the offset of the iomap relative to beginning of the tss
    mov word [tss_entry.iomap_base], tss_entry.iomap-tss_entry

    mov eax, TSS32_SEL
    ltr ax                      ; Load default TSS (used for exceptions, interrupts, etc)

    xor ebx, ebx                ; EBX=0
    push ebx                    ; Real mode GS=0
    push ebx                    ; Real mode FS=0
    push ebx                    ; Real mode DS=0
    push ebx                    ; Real mode ES=0
    push V86_STACK_SEG
    push V86_STACK_OFS          ; v8086 stack SS:SP (grows down from SS:SP)
    push dword 1<<EFLAGS_VM_BIT | 1<<EFLAGS_BIT1
                                ; Set VM Bit, IF bit is off, DF=0(forward direction),
                                ; IOPL=0, Reserved bit (bit 1) always 1. Everything
                                ; else 0. These flags will be loaded in the v8086 mode
                                ; during the IRET. We don't want interrupts enabled
                                ; because we have no v86 monitor via protected mode
                                ; GPF handler
    push V86_CS_SEG             ; Real Mode CS (segment)
    push v86_mode_entry         ; Entry point (offset)
    iret                        ; Transfer control to v8086 mode and our real mode code

; Function: print_string_pm
;           Display a string to the console on display page 0 in protected mode.
;           Very basic. Doesn't update hardware cursor, doesn't handle scrolling,
;           LF, CR, TAB.
;
; Inputs:   ESI = Offset of address to print
;           AH  = Attribute of string to print
; Clobbers: None
; Returns:  None

print_string_pm:
    push edi
    push esi
    push eax

    mov edi, [vidmem_ptr]       ; Start from video address stored at vidmem_ptr
    jmp .getchar
.outchar:
    stosw                       ; Output character to video display
.getchar:
    lodsb                       ; Load next character from string
    test al, al                 ; Is character NUL?
    jne .outchar                ;     If not, go back and output character

    mov [vidmem_ptr], edi       ; Update global video pointer
    pop eax
    pop esi
    pop edi
    ret

align 4
vidmem_ptr: dd VIDEO_TEXT_ADDR  ; Start console output in upper left of display

in_pm_msg:
    db "In 32-bit protected mode!", 0
in_v86_msg:
    db "In v8086 mode!", 0

align 4
gdt_start:
    dq MAKE_GDT_DESC(0, 0, 0, 0)   ; null descriptor
gdt32_code:
    dq MAKE_GDT_DESC(0, 0x000fffff, 10011010b, 1100b)
                                ; 32-bit code, 4kb gran, limit 0xffffffff bytes, base=0
gdt32_data:
    dq MAKE_GDT_DESC(0, 0x000fffff, 10010010b, 1100b)
                                ; 32-bit data, 4kb gran, limit 0xffffffff bytes, base=0
gdt32_tss:
    dq MAKE_GDT_DESC(tss_entry, TSS_SIZE-1, 10001001b, 0000b)
                                ; 32-bit TSS, 1b gran, available, IOPL=0
end_of_gdt:

CODE32_SEL equ gdt32_code - gdt_start
DATA32_SEL equ gdt32_data - gdt_start
TSS32_SEL  equ gdt32_tss  - gdt_start

gdtr:
    dw end_of_gdt - gdt_start - 1
                                ; limit (Size of GDT - 1)
    dd gdt_start                ; base of GDT

; Pad boot sector to 510 bytes and add 2 byte boot signature
TIMES 510-($-$$) db  0
dw 0xaa55

; Data section above bootloader @ 0x7c00. Acts like a BSS section
ABSOLUTE 0x7e00

; Store the TSS just beyond the boot signature read into memory
; at 0x0000:0x7e00
tss_entry:
.back_link: resd 1
.esp0:      resd 1              ; Kernel stack pointer used on ring transitions
.ss0:       resd 1              ; Kernel stack segment used on ring transitions
.esp1:      resd 1
.ss1:       resd 1
.esp2:      resd 1
.ss2:       resd 1
.cr3:       resd 1
.eip:       resd 1
.eflags:    resd 1
.eax:       resd 1
.ecx:       resd 1
.edx:       resd 1
.ebx:       resd 1
.esp:       resd 1
.ebp:       resd 1
.esi:       resd 1
.edi:       resd 1
.es:        resd 1
.cs:        resd 1
.ss:        resd 1
.ds:        resd 1
.fs:        resd 1
.gs:        resd 1
.ldt:       resd 1
.trap:      resw 1
.iomap_base:resw 1              ; IOPB offset

;.cetssp:    resd 1             ; Need this if CET is enabled

; Insert any kernel defined task instance data here
; ...

; If using VME (Virtual Mode extensions) there need to bean additional 32 bytes
; available immediately preceding iomap. If using VME uncomment next 2 lines
;.vmeintmap:                     ; If VME enabled uncomment this line and the next
;    resb 32                     ;     32*8 bits = 256 bits (one bit for each interrupt)

.iomap: resb TSS_IO_BITMAP_SIZE ; IO bitmap (IOPB) size 8192 (8*8192=65536) representing
                                ; all ports. An IO bitmap size of 0 would fault all IO
                                ; port access if IOPL < CPL (CPL=3 with v8086)
%if TSS_IO_BITMAP_SIZE > 0
.iomap_pad: resb 1              ; Padding byte that has to be filled with 0xff
                                ; To deal with issues on some CPUs when using an IOPB
%endif
TSS_SIZE EQU $-tss_entry