Why is XOR used in cryptography?
I can see 2 reasons:
1) (Main reason) XOR does not leak information about the original plaintext.
2) (Nice-to-have reason) XOR is an involutory function, i.e., if you apply XOR with the same key twice, you get the original plaintext back: XOR(k, XOR(k, x)) = x, where x is your plaintext and k is your key. The inner XOR is the encryption and the outer XOR is the decryption, i.e., the exact same XOR function can be used for both encryption and decryption.
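As a minimal sketch of the involution property, the snippet below encrypts and decrypts with one and the same function. The helper name xor_bytes, the sample message and the key are made up for illustration; the key here is just an arbitrary byte sequence, not a securely generated one.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR two equal-length byte strings, byte by byte."""
    if len(data) != len(key):
        raise ValueError("data and key must have the same length")
    return bytes(d ^ k for d, k in zip(data, key))


plaintext = b"attack at dawn"
# Hypothetical key for illustration only: the bytes 1, 2, ..., len(plaintext).
key = bytes(range(1, len(plaintext) + 1))

ciphertext = xor_bytes(plaintext, key)   # inner XOR: encryption
recovered = xor_bytes(ciphertext, key)   # outer XOR: decryption

print(recovered == plaintext)  # True
```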
To exemplify the first point, consider the truth-tables of AND, OR and XOR:
AND
0 AND 0 = 0
0 AND 1 = 0
1 AND 0 = 0
1 AND 1 = 1 (Leak!)
OR
0 OR 0 = 0 (Leak!)
0 OR 1 = 1
1 OR 0 = 1
1 OR 1 = 1
XOR
0 XOR 0 = 0
0 XOR 1 = 1
1 XOR 0 = 1
1 XOR 1 = 0
The first column is our input (i.e., the plaintext), the second column is our key, and the last column is the result of the input "mixed" (encrypted) with the key using the specific operation (i.e., the ciphertext).
Now, imagine an attacker got access to some encrypted byte, say: 10010111, and he wants to get the original plaintext byte.
Let's say the AND operator was used to generate this encrypted byte from the original plaintext byte. If AND was used, then every time we see a '1' bit in the encrypted byte, we know for certain that the input (i.e., the first column, the plaintext) MUST also be '1', as per the truth table of AND. If the encrypted bit is a '0' instead, we do not know whether the input is a '0' or a '1'. Therefore, we can conclude that the original plaintext is: 1 _ _ 1 _ 111. So 5 bits of the original plaintext were leaked (i.e., could be recovered without the key).
Applying the same idea to OR, we see that every time we find a '0' in the encrypted byte, we know that the input (i.e., the plaintext) must also be a '0'. If we find a '1', then we do not know whether the input is a '0' or a '1'. Therefore, we can conclude that the input plaintext is: _ 00 _ 0 _ _ _. This time, 3 bits of the original plaintext byte were leaked without us knowing anything about the key.
Finally, with XOR, we cannot recover any bit of the original plaintext byte. Every time we see a '1' in the encrypted byte, that '1' could have been generated from a '0' or from a '1', depending on the key bit. The same holds for a '0' (it could come from either a '0' or a '1'). Therefore, as long as the key is unknown, not a single bit of the original plaintext byte is leaked.
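The reasoning above can be checked mechanically. The following is a small sketch, not a cryptanalysis tool: for each ciphertext bit it enumerates every (plaintext bit, key bit) pair consistent with the truth table and reports the plaintext bit only when it is forced. The function name leaked_bits is made up for this example.

```python
from operator import and_, or_, xor


def leaked_bits(cipher_bits: str, op) -> str:
    """Return the plaintext bits that are forced by the ciphertext alone.

    A position is shown as '_' when both plaintext values are still possible.
    """
    out = []
    for ch in cipher_bits:
        c = int(ch)
        # All plaintext bits p for which some key bit k yields op(p, k) == c.
        candidates = {p for p in (0, 1) for k in (0, 1) if op(p, k) == c}
        out.append(str(next(iter(candidates))) if len(candidates) == 1 else "_")
    return "".join(out)


cipher = "10010111"
print(leaked_bits(cipher, and_))  # 1__1_111  (5 bits leaked)
print(leaked_bits(cipher, or_))   # _00_0___  (3 bits leaked)
print(leaked_bits(cipher, xor))   # ________  (nothing leaked)
```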
It isn't exactly true to say that XOR is the only logical operation used in cryptography; however, it is the only one that can serve as a two-way (reversible) encryption operation entirely on its own.
Here is an example:
Imagine you have a string of binary digits, 10101, and you XOR it with the string 10111. You get 00010. Your original string is now encoded, and the second string becomes your key: if you XOR your key with your encoded string, you get your original string back.
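The same example, written as a short Python sketch using integer literals and the built-in ^ operator (the variable names are just for illustration):

```python
plain = 0b10101
key = 0b10111

encoded = plain ^ key    # encode
decoded = encoded ^ key  # decode with the same key

print(format(encoded, "05b"))  # 00010
print(format(decoded, "05b"))  # 10101
```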
XOR allows you to easily encrypt and decrypt a string; the other logic operations don't.
If you have a longer string, you can repeat your key until it's long enough. For example, if your string was 1010010011, you'd simply write your key twice, so that it becomes 1011110111, and XOR it with the new string.
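A rough sketch of this key-repetition idea, working on strings of '0' and '1' characters. The function name repeating_key_xor is hypothetical; itertools.cycle simply repeats the key for as long as the message needs.

```python
from itertools import cycle


def repeating_key_xor(bits: str, key: str) -> str:
    """XOR a bit string with a key that is repeated to match its length."""
    return "".join(str(int(b) ^ int(k)) for b, k in zip(bits, cycle(key)))


message = "1010010011"
key = "10111"  # effectively used as 1011110111

encoded = repeating_key_xor(message, key)
decoded = repeating_key_xor(encoded, key)

print(encoded)  # 0001100100
print(decoded)  # 1010010011
```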
Here's a Wikipedia link on the XOR cipher.