Why are passwords limited to 16 characters?
If you are to abide by CWE-521: Weak Password Requirements, then all passwords must have both a minimum and a maximum length.
There are two reasons for limiting the password size. For one, hashing a large amount of data can consume significant server resources, which makes it an easy target for Denial of Service, especially if the server is using key stretching such as PBKDF2.
The other concern is hash length-extension attacks, or the chosen-prefix attack against MD5. However, if you are using a hash function that isn't broken, such as bcrypt or SHA-256, then this shouldn't be a concern for passwords.
IMHO 16 bytes is far too small. bcrypt has a built-in cap of 72 bytes, which is probably a reasonable size for a heavy hash function. The key stretching used by these functions creates the possibility of an Algorithmic Complexity Attack (ACA).
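As a rough illustration of the first point, a server can reject out-of-range input before it spends any time on key stretching. This is only a sketch: the function name `hash_password`, the 8 and 128 byte bounds, and the iteration count are assumptions made for the example, not values taken from CWE-521 or any particular site.

```python
import hashlib
import os

MIN_PASSWORD_BYTES = 8        # assumed policy value
MAX_PASSWORD_BYTES = 128      # assumed policy value, far above 16
DEFAULT_ITERATIONS = 600_000  # assumed PBKDF2 work factor


def hash_password(password: str, iterations: int = DEFAULT_ITERATIONS) -> tuple[bytes, bytes]:
    """Return (salt, digest); refuse over-long input before the expensive step."""
    encoded = password.encode("utf-8")
    # Cheap length check first, so a multi-megabyte "password" never reaches PBKDF2.
    if not MIN_PASSWORD_BYTES <= len(encoded) <= MAX_PASSWORD_BYTES:
        raise ValueError("password length out of range")
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", encoded, salt, iterations)
    return salt, digest
```

The point of the ordering is that the length check costs almost nothing, while the PBKDF2 call is deliberately slow.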
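Note that bcrypt's cap is counted in bytes of input rather than in characters, so a password with non-ASCII characters reaches it sooner. A minimal sketch of that check, assuming UTF-8 encoding (the helper name `fits_bcrypt` is made up for this example):

```python
BCRYPT_MAX_BYTES = 72  # bcrypt only uses the first 72 bytes of its input


def fits_bcrypt(password: str) -> bool:
    """True if the UTF-8 encoding of the password fits within bcrypt's input cap."""
    return len(password.encode("utf-8")) <= BCRYPT_MAX_BYTES


print(fits_bcrypt("a" * 72))       # True: 72 ASCII characters are 72 bytes
print(fits_bcrypt("\u00e9" * 40))  # False: 40 characters, but 80 bytes in UTF-8
```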
What is the reason that most websites limit to 16 characters?
Arbitrary implementation limit.
Maybe they only want to allocate a 17-octet buffer (16 ASCII/1-octet characters + terminating NUL).
Maybe they believe that having a password with more than 16 characters is useless or silly, because they have no understanding of passwords.
I would have thought the longer the password the more difficult it makes it for someone to crack it?
Indeed. A password with 16 random independent alphabetic characters with a uniform distribution has enough entropy. But human beings are bad at choosing 16 independent alphabetic characters with a uniform distribution, and terribly bad at remembering such meaningless character sequences, so they choose passwords that they can remember, but with less entropy per character.
What matters is only the total entropy, not the entropy per character. For example, a sequence of random dictionary words, generated with a word-die (word-die: open a random page in a dictionary, etc.) is easier to remember than a sequence of letters obtained with a letter-die.
Such passwords will be longer than purely random letter sequences, but they will be a lot easier to remember for equal entropy; or, if you prefer, they will have a lot more entropy for equal memorization effort.
For adequate strength, these passwords made of dictionary words will probably have more than 16 characters.
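To put rough numbers on this, here is a small sketch that compares the two schemes. It assumes a 26-letter alphabet and a 7776-word list (the size of a typical five-dice word list); both sizes, the tiny `demo_words` list, and the function names are assumptions chosen for illustration.

```python
import math
import secrets


def entropy_bits(choices: int, picks: int) -> float:
    """Entropy in bits of `picks` independent, uniform draws from `choices` options."""
    return picks * math.log2(choices)


def random_passphrase(words: list[str], count: int) -> str:
    """A software word-die: draw `count` words uniformly with a CSPRNG."""
    return " ".join(secrets.choice(words) for _ in range(count))


letters = entropy_bits(26, 16)   # 16 random letters: about 75 bits
phrase = entropy_bits(7776, 6)   # 6 random words: about 78 bits

# Hypothetical stand-in list; a real word list would have thousands of entries.
demo_words = ["correct", "horse", "battery", "staple", "window", "orange"]
print(round(letters, 1), round(phrase, 1))
print(random_passphrase(demo_words, 6))
```

Six words give slightly more entropy than 16 random letters, yet with ordinary word lengths and separators the result is well over 16 characters, so a 16-character cap rules it out.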
In other words, this limit is stupid.
Is it something to do with hash collisions?
No.
There is no formal guarantee with respect to collisions with short passwords, but the practical impact of hash collisions is non-existent.
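A back-of-the-envelope sketch of why: if the hash is modeled as a random function with a `hash_bits`-bit output (an idealizing assumption), the chance that any of `guesses` distinct inputs lands on one fixed hash value is at most `guesses / 2**hash_bits`. The function name and the guess count below are illustrative.

```python
def match_probability(guesses: int, hash_bits: int) -> float:
    """Upper bound on the chance that any of `guesses` inputs hits one fixed hash value."""
    return min(1.0, guesses / 2**hash_bits)


# A trillion guesses against a 128-bit and a 256-bit output.
print(match_probability(10**12, 128))
print(match_probability(10**12, 256))
```

Both values are so small that an attacker would find the real password by guessing long before stumbling on an accidental collision.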