Python string join performance
The advice is about concatenating a lot of strings.
To compute s = s1 + s2 + ... + sn, there are two options:
1. Using +. A new string s1+s2 is created, then a new string s1+s2+s3 is created, and so on, so a lot of memory allocation and copying is involved. In fact, s1 and s2 are each copied n-1 times, s3 is copied n-2 times, and so on down to sn, which is copied once.
2. Using "".join([s1, s2, ..., sn]). The concatenation is done in one pass, and each character in the strings is copied only once.
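To make the difference concrete, here is a rough sketch that counts characters copied under a simple model in which every + allocates a fresh string and copies both operands. The function names are made up for illustration, and the model is an assumption: CPython can sometimes extend a string in place for s += t, so real timings may look better than this count suggests.

```python
def chars_copied_with_plus(lengths):
    """Characters copied when building s1 + s2 + ... + sn left to right,
    assuming each + allocates a new string and copies both operands."""
    copied = 0
    running = 0  # length of the string built so far
    for i, n in enumerate(lengths):
        running += n
        if i > 0:
            # The i-th concatenation writes a new string of length `running`.
            copied += running
    return copied


def chars_copied_with_join(lengths):
    """Characters copied by a single join: each character is copied once."""
    return sum(lengths)


lengths = [10] * 1000  # 1000 strings of 10 characters each
plus_cost = chars_copied_with_plus(lengths)
join_cost = chars_copied_with_join(lengths)
print(plus_cost, join_cost)
```

Under this model the cost of + grows roughly with the square of the number of strings, while join grows linearly.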
In your code, join is called on each iteration, so it behaves just like using +. The correct way is to collect the items in a list, then call join on it once at the end.
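A minimal sketch of the two patterns, since the asker's original code is not shown here. The function names and sample data are hypothetical.

```python
def build_with_repeated_join(parts):
    # Anti-pattern: join is called on every iteration, so the growing
    # result is copied each time, just as with +.
    result = ""
    for part in parts:
        result = "".join([result, part])
    return result


def build_with_single_join(parts):
    # Preferred: collect the pieces in a list, then join once at the end.
    collected = []
    for part in parts:
        collected.append(part)
    return "".join(collected)


parts = [str(i) for i in range(5)]
print(build_with_repeated_join(parts))
print(build_with_single_join(parts))
```

Both functions return the same string. Only the second copies each character a single time.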
Most of the performance issues with string concatenation are ones of asymptotic performance, so the differences become most significant when you are concatenating many long strings.
In your sample, you are performing the same concatenation many times. You aren't building up any long string, and it may be that the Python interpreter is optimizing your loops. This would explain why the time increases when you move to str.join and os.path.join: they are more complex functions that are not as easily reduced. (os.path.join checks each component to see whether it needs to be rewritten before the pieces are concatenated. This sacrifices some performance for the sake of portability.)
By the way, since file paths are not usually very long, you almost certainly want to use os.path.join for the sake of portability. If the performance of the concatenation is a problem, you're doing something very odd with your filesystem.
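As an illustration of what that portability buys, the sketch below calls the two standard-library implementations behind os.path directly (posixpath and ntpath), so the result does not depend on the machine running it. The path components are made up.

```python
import ntpath      # the Windows implementation of os.path
import posixpath   # the POSIX implementation of os.path

parts = ("data", "logs", "app.txt")

# Each implementation inserts the separator its platform expects.
posix_path = posixpath.join(*parts)
windows_path = ntpath.join(*parts)

# join also inspects each component: an absolute component discards
# everything that came before it.
reset_path = posixpath.join("data", "/etc", "hosts")

print(posix_path)
print(windows_path)
print(reset_path)
```

In normal code you would simply call os.path.join, which resolves to whichever of these implementations matches the current platform.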