Who Decides Between I/O Mapped and Memory Mapped I/O (x86)
As Michael Petch said in his comment, it's the manufacturer, though it doesn't always have full freedom.
Standards and specifications can mandate which address space to use. Some standards are generic (e.g. the OHCI, USB 1.0, refers to an "uncachable address space", which on x86 can be either IO or MMIO); others are not (e.g. the PC Client TPM spec maps the TPM registers by locality, based on the MMIO area used).
As far as I know, and as far as we are concerned in this answer, MMIO adoption went mainstream with the advent of PCI[1].
PCI BARs (Base Address Registers) have a special format that allows the software to know which address space the card is using (and how much of it is needed):
Bit 0 is read-only (set by the manufacturer) and tells which address space is being used by the card: 0 for memory, 1 for IO.
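To make the BAR format concrete, here is a minimal sketch of how software might decode a 32-bit BAR value. The names `bar_info`, `decode_bar` and `bar_size` are hypothetical helpers made up for illustration, and the sketch assumes the raw BAR value has already been read from configuration space. The size helper relies on the usual probing convention: write all ones to the BAR, read it back, mask off the flag bits, then invert and add one. It only handles the low 32 bits, so a 64-bit memory BAR would need both halves combined.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the decoded fields of one 32-bit BAR. */
typedef struct {
    bool     is_io;        /* bit 0: 1 = IO space, 0 = memory space      */
    bool     is_64bit;     /* memory BARs only: type field (bits 2:1) = 2 */
    bool     prefetchable; /* memory BARs only: bit 3                     */
    uint32_t base;         /* address with the flag bits masked off       */
} bar_info;

/* Decode the flag bits and the base address of a raw BAR value. */
static inline bar_info decode_bar(uint32_t bar)
{
    bar_info info = { false, false, false, 0 };

    info.is_io = (bar & 0x1u) != 0;
    if (info.is_io) {
        /* IO BAR: bits 1:0 are flags, the rest is the port base. */
        info.base = bar & ~0x3u;
    } else {
        /* Memory BAR: bits 3:0 are flags, the rest is the base. */
        info.is_64bit     = ((bar >> 1) & 0x3u) == 0x2u;
        info.prefetchable = (bar & 0x8u) != 0;
        info.base         = bar & ~0xFu;
    }
    return info;
}

/*
 * Compute the size requested by a BAR from the value read back after
 * writing 0xFFFFFFFF to it. Returns 0 for an unimplemented BAR.
 */
static inline uint32_t bar_size(uint32_t readback)
{
    if (readback & 0x1u) {
        /* IO BARs may hardwire the upper 16 bits to zero, so only
           the low 16 bits of the result are meaningful. */
        uint32_t mask = readback & ~0x3u;
        return (uint32_t)(~mask + 1u) & 0xFFFFu;
    }
    uint32_t mask = readback & ~0xFu;
    return (uint32_t)(~mask + 1u);
}
```

For example, a readback of `0xFFFFF000` from a memory BAR corresponds to a 4KiB region, and a readback of `0x0000FF01` from an IO BAR corresponds to 256 ports.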
The IO space has the advantage over MMIO of not requiring any setup, whereas MMIO needs a virtual-to-physical mapping and the correct caching type.
However, the IO space is only 64KiB + 3B, which is very small.
In fact PCI 2.2 limits the max IO space used by a single BAR to 256 bytes.
Furthermore, pointers don't work in the IO space, and some devices work with pointers (e.g. USB controllers, GbE and so on).
IO is surely used for legacy devices (before MMIO was a thing).
I used to think that IO was used for devices that have a small number of registers, but that's not always true: for example, the Power Manager Control registers of the PCH (chipset) are IO mapped and occupy 128B.
Sometimes the device supports both IO and MMIO. This requires two BARs; an example is the SMBus controller of the PCH:
It has two BARs (note the default values: one is for IO, the other for MMIO) that control the same set of registers.
The documentation specifies that both can be used.
I cannot give an exact rule of when IO vs MMIO is used.
I don't think there is a difference in performance at the bus level; the distinction is just a bit in the TLP (Transaction Layer Packet) sent over the PCIe link.
However, I've never investigated the matter. The IO instructions are serialising, so there is a performance penalty at the software level.
My rule of thumb is that IO is/can be used if any of the following is true:
- The device is a legacy one (there is really no freedom of choice here).
- Your device is not using pointers (because IO has no pointers) and the register set is small.
- The registers are mostly used to control the device and report its status, or that of the whole system (because IO instructions are serialising).
These are just rules of thumb, based on my readings and memories; there are many exceptions and counterexamples to them.
Today the tendency is to use MMIO. This may require more decode logic (more address lines to decode), but the PCI spec simplifies it by allowing a device to round its decoding to 4KiB.
One example is the PCI configuration space: in conventional PCI it was accessed through IO ports (an address/data register pair at ports 0xCF8 and 0xCFC, a technique similar to the stacking of registers used in, e.g., the VGA controller), but in PCIe it is memory mapped.
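The difference between the two access methods can be sketched by computing the address each one uses for the same register. This is only an illustration: the helper names `pci_cf8_address` and `pcie_ecam_address` are made up, and the ECAM base is an input because its real value is platform specific (firmware reports it, e.g. through the ACPI MCFG table). The legacy method writes the computed value to port 0xCF8 and then reads or writes the data through port 0xCFC; the memory mapped method simply dereferences the computed physical address once it has been mapped.

```c
#include <stdint.h>

/* IO ports of the legacy configuration mechanism. */
#define PCI_CONFIG_ADDRESS_PORT 0xCF8u
#define PCI_CONFIG_DATA_PORT    0xCFCu

/*
 * Value to write to PCI_CONFIG_ADDRESS_PORT to select a register:
 * bit 31 = enable, bits 23:16 = bus, 15:11 = device, 10:8 = function,
 * bits 7:2 = dword-aligned register offset (only 256 bytes reachable).
 */
static inline uint32_t pci_cf8_address(uint8_t bus, uint8_t dev,
                                       uint8_t fn, uint8_t reg)
{
    return 0x80000000u
         | ((uint32_t)bus << 16)
         | ((uint32_t)(dev & 0x1Fu) << 11)
         | ((uint32_t)(fn & 0x07u) << 8)
         | ((uint32_t)reg & 0xFCu);
}

/*
 * Physical address of the same register in the memory mapped
 * configuration space: each function gets a 4KiB window.
 * ecam_base is an assumption supplied by the caller.
 */
static inline uint64_t pcie_ecam_address(uint64_t ecam_base, uint8_t bus,
                                         uint8_t dev, uint8_t fn,
                                         uint16_t reg)
{
    return ecam_base
         + (((uint64_t)bus << 20)
          | ((uint64_t)(dev & 0x1Fu) << 15)
          | ((uint64_t)(fn & 0x07u) << 12)
          | ((uint64_t)reg & 0xFFFu));
}
```

With a hypothetical ECAM base of `0xE0000000`, register 0x10 (the first BAR) of bus 0, device 31, function 3 is selected by writing `0x8000FB10` to port 0xCF8 in the legacy scheme, and lives at physical address `0xE00FB010` in the memory mapped one.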
There is no need to consider other buses, as PCIe is the main bus on modern PCs; everything else goes through a PCIe device (e.g. USB uses xHCI PCI devices).
The only exceptions to this are the off-core devices (e.g. the LAPICs, the TXT registers). These are accessed through memory mapped IO because, I think, it's more performant: these accesses won't make it to the system agent (the devices are close to their core and inside the CPU package anyway), so using a (serialising) IO instruction would impact them significantly.
Plus, there is a nice spot at the top of the 4GiB where Intel can reclaim memory without too much pressure on the other devices.
Fun fact: ports 0xf8-0xff are reserved, dating from the times when the FPU was a coprocessor (x87) and these ports were used by the CPU for communicating with it.
[1] Before that, other PnP buses were already available (e.g. PnP ISA and MCA), but decoding memory accesses was mostly done for giving access to ROMs and on-card RAM. Mapping registers to memory was not yet a thing, I guess.
If I want to build my own device (a peripheral) can I choose freely whether I use I/O mapped or memory mapped I/O to communicate with PC?
What kind of device?
If it's a legacy device (e.g. an ancient "PS/2 controller" or serial port or parallel port or ..) or a standardized device (e.g. implementing AHCI or NVMe or xHCI), then it has to comply with an existing (formal or de facto) specification.
Otherwise (no existing specification that has to be complied with): if it's a USB device then you can't use IO ports or MMIO (it's responding to requests on a serial bus); if it's a PCI device and needs high performance it should use MMIO (because IO ports are a performance problem); and if it's a PCI device that doesn't need high performance it shouldn't be a PCI device at all (it should be USB).