What's the difference between a commercial JTAG debugger and an open source FT2232H OpenOCD debugger?
JTAG cables can be built around all sorts of hardware. Xilinx JTAG cables, for example, contain a Cypress chip and an FPGA, while Atmel cables generally contain an AVR microcontroller with USB support. Most also contain some interface, level-translation, protection, or isolation components. It really depends on the manufacturer; they're all proprietary and mutually incompatible. Generally you need the cable that works with whatever software you need to use. If all you need is OpenOCD, then an FTDI-based cable is fine. But if you want to use, say, Xilinx ChipScope, then you need to pay up for either the real thing from Xilinx or a Chinese knockoff.
The links you have are not for simple JTAG cables; they are far more specialized. I would personally consider these to be full-on pieces of test equipment. They are basically specialized protocol analyzers, designed to interface with specialized trace hardware that is incorporated into the device under test.
Trace hardware is distinct from JTAG. Its purpose is to record the complete execution trace of the running software (i.e. all branches taken) across all execution cores and pass it to the external trace collection system (the box in question) over a high-speed bus. The trace is then analyzed offline. This is NOT the same as the debugging that can be done over JTAG by setting breakpoints and stepping through the code. Trace collection is supposed to be completely transparent to the running program (no breakpoints or added code).
Since the processor under test can be executing several hundred million instructions per second, storing the trace as it is produced requires a lot of bandwidth and fast memory. The linked devices support the Aurora protocol (probably among others), which is an 8b/10b-encoded high-speed serial protocol, somewhat similar to USB 3, Serial ATA, serial gigabit/10G Ethernet, and PCIe. It's capable of transferring data at 6.25 Gbps, significantly more than the USB link back to the PC can handle, so the captured data must be stored in onboard RAM for offline analysis. These devices will contain rather high-end FPGAs with internal high-speed deserializers to capture the data, along with quite a bit (several GB) of fast DRAM, probably DDR2 or perhaps even DDR3.
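To put rough numbers on that bandwidth gap, here is a back-of-the-envelope sketch. The 6.25 Gbps line rate and the 8b/10b encoding come from the text above; the USB 2.0 rate and the 4 GiB buffer size are my own assumptions for illustration, not specifications of the linked devices, and protocol framing overhead is ignored.

```python
# Rough bandwidth arithmetic for one Aurora trace lane.
LINE_RATE_BPS = 6_250_000_000   # line rate quoted above (bits/s on the wire)
USB2_BPS = 480_000_000          # ASSUMPTION: USB 2.0 high-speed signalling rate
BUFFER_BYTES = 4 * 2**30        # ASSUMPTION: 4 GiB of on-board trace RAM


def payload_bytes_per_second(line_rate_bps: int) -> int:
    """8b/10b carries 8 data bits in every 10 line bits."""
    payload_bits = line_rate_bps * 8 // 10
    return payload_bits // 8


trace_bytes_per_s = payload_bytes_per_second(LINE_RATE_BPS)
usb_bytes_per_s = USB2_BPS // 8            # upper bound, before USB overhead
fill_seconds = BUFFER_BYTES / trace_bytes_per_s

if __name__ == "__main__":
    print(f"trace payload : {trace_bytes_per_s / 1e6:.0f} MB/s")
    print(f"USB 2.0 bound : {usb_bytes_per_s / 1e6:.0f} MB/s")
    print(f"buffer fills in about {fill_seconds:.1f} s at full rate")
```

Under these assumptions a single lane produces roughly ten times what the USB link could carry, and the buffer fills in a matter of seconds, which is why capture and analysis are separate steps.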
The difference is in software and functionality, which in turn greatly affects the hardware.
The FTDI JTAG cables use a command set to produce JTAG signals. These are very low-level commands, often going into the exact details of how the JTAG state machine works and is operated. The logic of sending the correct commands for your setup is handled by the debug host software on your PC.
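To give an idea of what "the exact details of the JTAG state machine" means for the host software, here is a minimal sketch of the standard IEEE 1149.1 TAP controller and a helper that finds the TMS bit sequence needed to walk from one state to another. This is host-side logic only: it does not talk to an FTDI chip, and the names (`TAP`, `step`, `tms_path`) are my own, not taken from OpenOCD or any FTDI library.

```python
from collections import deque

# IEEE 1149.1 TAP controller: state -> (next state if TMS=0, next state if TMS=1)
TAP = {
    "TEST_LOGIC_RESET": ("RUN_TEST_IDLE", "TEST_LOGIC_RESET"),
    "RUN_TEST_IDLE":    ("RUN_TEST_IDLE", "SELECT_DR_SCAN"),
    "SELECT_DR_SCAN":   ("CAPTURE_DR", "SELECT_IR_SCAN"),
    "CAPTURE_DR":       ("SHIFT_DR", "EXIT1_DR"),
    "SHIFT_DR":         ("SHIFT_DR", "EXIT1_DR"),
    "EXIT1_DR":         ("PAUSE_DR", "UPDATE_DR"),
    "PAUSE_DR":         ("PAUSE_DR", "EXIT2_DR"),
    "EXIT2_DR":         ("SHIFT_DR", "UPDATE_DR"),
    "UPDATE_DR":        ("RUN_TEST_IDLE", "SELECT_DR_SCAN"),
    "SELECT_IR_SCAN":   ("CAPTURE_IR", "TEST_LOGIC_RESET"),
    "CAPTURE_IR":       ("SHIFT_IR", "EXIT1_IR"),
    "SHIFT_IR":         ("SHIFT_IR", "EXIT1_IR"),
    "EXIT1_IR":         ("PAUSE_IR", "UPDATE_IR"),
    "PAUSE_IR":         ("PAUSE_IR", "EXIT2_IR"),
    "EXIT2_IR":         ("SHIFT_IR", "UPDATE_IR"),
    "UPDATE_IR":        ("RUN_TEST_IDLE", "SELECT_DR_SCAN"),
}


def step(state: str, tms: int) -> str:
    """Advance the TAP by one TCK cycle with the given TMS level."""
    return TAP[state][tms]


def tms_path(start: str, goal: str) -> list[int]:
    """Shortest TMS sequence from start to goal (breadth-first search)."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, bits = queue.popleft()
        if state == goal:
            return bits
        for tms in (0, 1):
            nxt = TAP[state][tms]
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, bits + [tms]))
    raise ValueError(f"no path from {start} to {goal}")


if __name__ == "__main__":
    print(tms_path("RUN_TEST_IDLE", "SHIFT_DR"))
    print(tms_path("RUN_TEST_IDLE", "SHIFT_IR"))
```

With a simple FTDI cable, everything at this level (and the register scans built on top of it) lives in the PC software; the cable just clocks out the bits it is given.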
This gives you functional, cheap hardware that works with free software (GCC + GDB + OpenOCD). It is flexible enough (because of the low-level command set) that there are ports for ARM debugging, FPGA programming, and generic JTAG chain scanning.
The commercial cables are much more specific to a platform and often contain logic within the cable. This allows the PC program to talk to the device in a more abstract way which can be faster.
For example, look at the J-Link USB protocol. It contains commands like EMU_CMD_WRITE_MEM_ARM79. An FTDI cable can carry out the same operation, but the command is translated on the PC side into the low-level JTAG commands the FTDI cable understands. This means the high-level command (write some memory) is broken down into many more sub-commands on the PC, whereas the J-Link can do that breakdown on the cable itself. This can result in better latency (taking into account the limitations of USB) and/or higher speed.
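A toy model shows why moving the breakdown into the cable can help. Every number here is a made-up assumption for illustration (the USB round-trip time, the TCK rate, the bits per word, and the idea that the host must wait for a result after every word); real tools batch operations and will behave differently.

```python
# Toy latency model: time = USB round trips + time spent clocking JTAG bits.
ROUND_TRIP_S = 1e-3     # ASSUMPTION: about 1 ms per USB request/response
TCK_HZ = 10e6           # ASSUMPTION: 10 MHz JTAG clock
BITS_PER_WORD = 100     # HYPOTHETICAL: scan bits needed per 32-bit word written
WORDS = 64              # size of the memory write in this example


def transfer_time(round_trips: int, jtag_bits: int) -> float:
    """Total time in seconds for a given number of USB round trips and JTAG bits."""
    return round_trips * ROUND_TRIP_S + jtag_bits / TCK_HZ


total_bits = WORDS * BITS_PER_WORD

# Worst case for a dumb cable: the PC waits for a reply after every word.
host_driven = transfer_time(round_trips=WORDS, jtag_bits=total_bits)

# Smart cable: one high-level "write memory" command, sequenced on the probe.
probe_driven = transfer_time(round_trips=1, jtag_bits=total_bits)

if __name__ == "__main__":
    print(f"host-driven : {host_driven * 1e3:.2f} ms")
    print(f"probe-driven: {probe_driven * 1e3:.2f} ms")
```

In this model the JTAG clocking time is identical in both cases; the whole difference comes from how many times the PC has to wait on USB.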
It is also up to the commercial IDE vendors which cables they support, and a commercial cable is more likely to be supported. On the other hand, the free IDEs are more likely to support the cheap FTDI debug cables.
Some commercial software also contains support for software code breakpoints, which let you set more code breakpoints than the hardware allows for.
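One common way to implement software breakpoints is to overwrite the instruction at the target address with a breakpoint instruction and put the original back when the breakpoint is removed. The sketch below simulates that on a plain `bytearray` standing in for target RAM. It assumes Thumb code (where `BKPT #0` encodes as 0xBE00); the `SoftBreakpoints` class is hypothetical and is not the API of any particular debugger.

```python
THUMB_BKPT = (0xBE00).to_bytes(2, "little")  # Thumb "BKPT #0", little-endian


class SoftBreakpoints:
    """Hypothetical helper: patch BKPT into writable memory, restore on clear."""

    def __init__(self, memory: bytearray):
        self.memory = memory   # stands in for target RAM reached via the debug link
        self.saved = {}        # address -> original 2-byte instruction

    def set(self, addr: int) -> None:
        if addr in self.saved:
            return  # already set; keep the original instruction we saved earlier
        self.saved[addr] = bytes(self.memory[addr:addr + 2])
        self.memory[addr:addr + 2] = THUMB_BKPT

    def clear(self, addr: int) -> None:
        self.memory[addr:addr + 2] = self.saved.pop(addr)


if __name__ == "__main__":
    # Three Thumb instructions: movs r0,#0 ; adds r0,#1 ; bx lr
    ram = bytearray(b"\x00\x20\x01\x30\x70\x47")
    bps = SoftBreakpoints(ram)
    bps.set(2)
    print(ram.hex())
    bps.clear(2)
    print(ram.hex())
```

Because the limit is memory for bookkeeping rather than a fixed number of comparator registers, the debugger can offer far more breakpoints this way, as long as the code lives in memory it can rewrite.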
Using the trace functionality of some microcontrollers requires very fast hardware to capture a 4-bit parallel bus. Hardware capable of this often contains an FPGA to do so.